Amino acid dipeptide frequency for Bacillus phage Harambe

Apart from single amino acid frequencies, one can also calculate the so-called amino acid dipeptide frequency, i.e. how often each ordered pair of adjacent amino acids occurs.
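A minimal sketch of how such a frequency could be computed is shown here. It assumes one-letter sequences, overlapping pairs counted within each protein, and normalisation by the total number of pairs; the function name and the toy sequences are hypothetical, and the exact normalisation used by Proteome-pI is not stated on this page.

```python
from collections import Counter


def dipeptide_permille(sequences):
    """Return {dipeptide: frequency in per mille} for overlapping residue pairs."""
    counts = Counter()
    for seq in sequences:
        # Pairs are counted inside each protein; a pair never spans two proteins.
        for i in range(len(seq) - 1):
            counts[seq[i:i + 2]] += 1
    total = sum(counts.values())
    if total == 0:
        return {}
    # Assumption: normalise by the number of counted pairs.
    return {pair: 1000.0 * n / total for pair, n in counts.items()}


# Toy input (hypothetical sequences, not taken from the Harambe proteome).
toy = ["MKLK", "KLA"]
freqs = dipeptide_permille(toy)
```

In the toy input the pair KL (LysLeu in the three-letter notation of the table) occurs twice among five pairs, so its frequency comes out as 400‰.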

There are 400 possible dipeptides (441 if Xaa is counted as an additional, 21st amino acid representing all non-standard or unknown amino acids). Thus, if all dipeptides were equally likely in proteins, each would be present at a frequency of 0.25% (2.5‰). As this is not the case in nature, for better visibility all dipeptides that occur more often than expected are marked in red in the table, and those that are underrepresented are marked in blue.
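The arithmetic behind the expected value can be sketched as follows. The function names are hypothetical, and treating the uniform expectation as the exact red/blue threshold is an assumption; the page does not say how the colouring is decided.

```python
def expected_permille(alphabet_size=20):
    """Uniform expectation per dipeptide, in per mille."""
    return 1000.0 / alphabet_size ** 2


def classify(observed_permille, expected):
    """Label a value relative to the uniform expectation (assumed threshold)."""
    if observed_permille > expected:
        return "over"    # would be marked red
    if observed_permille < expected:
        return "under"   # would be marked blue
    return "expected"


uniform_20 = expected_permille(20)   # 400 dipeptides
uniform_21 = expected_permille(21)   # 441 dipeptides when Xaa is included

# Values copied from the table on this page.
lys_leu = classify(9.09, uniform_20)
cys_cys = classify(0.0, uniform_20)
ala_asp = classify(2.157, uniform_20)
```

With 20 amino acids the expectation is 2.5‰; with Xaa included it drops to about 2.27‰. Under this assumed threshold LysLeu (9.09‰) is overrepresented, while CysCys (0.0‰) and AlaAsp (2.157‰) are underrepresented.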

All values are given in per mille (‰) and therefore need to be multiplied by 10⁻³ to obtain fractions.
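As a small worked example of this conversion, the snippet below turns a per mille value into a fraction and a percentage. The helper name is hypothetical; the input value is the LysLeu entry from the table.

```python
def permille_to_fraction(value_permille):
    """Convert a per mille value to a plain fraction."""
    return value_permille * 1e-3


lys_leu_fraction = permille_to_fraction(9.09)   # about 0.00909
lys_leu_percent = lys_leu_fraction * 100.0      # about 0.909 %
```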

For more information, see the sequence space article on Wikipedia.

Amino acids in table order: Ala Cys Asp Glu Phe Gly His Ile Lys Leu Met Asn Pro Gln Arg Ser Thr Val Trp Tyr Xaa

Each entry gives the dipeptide and its frequency ± error in ‰, grouped by the first amino acid.

Ala
AlaAla: 0.308 ± 0.276
AlaCys: 0.308 ± 0.179
AlaAsp: 2.157 ± 0.607
AlaGlu: 4.16 ± 0.905
AlaPhe: 2.927 ± 0.607
AlaGly: 3.543 ± 0.914
AlaHis: 0.77 ± 0.372
AlaIle: 5.084 ± 0.628
AlaLys: 5.7 ± 0.916
AlaLeu: 3.697 ± 0.643
AlaMet: 2.003 ± 0.594
AlaAsn: 2.311 ± 0.62
AlaPro: 2.003 ± 0.562
AlaGln: 1.078 ± 0.341
AlaArg: 2.003 ± 0.598
AlaSer: 2.311 ± 0.481
AlaThr: 3.851 ± 0.683
AlaVal: 2.619 ± 0.719
AlaTrp: 0.308 ± 0.202
AlaTyr: 2.465 ± 0.688
AlaXaa: 0.0 ± 0.0

Cys
CysAla: 0.616 ± 0.338
CysCys: 0.0 ± 0.0
CysAsp: 1.078 ± 0.628
CysGlu: 1.232 ± 0.384
CysPhe: 0.154 ± 0.132
CysGly: 0.77 ± 0.306
CysHis: 0.308 ± 0.225
CysIle: 0.924 ± 0.413
CysLys: 0.0 ± 0.0
CysLeu: 0.154 ± 0.165
CysMet: 0.154 ± 0.125
CysAsn: 0.462 ± 0.348
CysPro: 0.308 ± 0.203
CysGln: 0.154 ± 0.151
CysArg: 0.308 ± 0.183
CysSer: 0.462 ± 0.247
CysThr: 0.154 ± 0.132
CysVal: 0.616 ± 0.318
CysTrp: 0.0 ± 0.0
CysTyr: 0.462 ± 0.185
CysXaa: 0.0 ± 0.0

Asp
AspAla: 3.697 ± 0.597
AspCys: 1.078 ± 0.473
AspAsp: 3.081 ± 0.773
AspGlu: 4.776 ± 0.633
AspPhe: 2.773 ± 0.413
AspGly: 5.546 ± 0.86
AspHis: 0.77 ± 0.288
AspIle: 3.697 ± 0.768
AspLys: 5.546 ± 0.757
AspLeu: 4.776 ± 0.947
AspMet: 2.157 ± 0.581
AspAsn: 3.697 ± 0.628
AspPro: 1.695 ± 0.585
AspGln: 1.232 ± 0.578
AspArg: 2.465 ± 0.531
AspSer: 3.697 ± 0.486
AspThr: 4.006 ± 0.933
AspVal: 5.084 ± 0.833
AspTrp: 0.308 ± 0.195
AspTyr: 3.389 ± 0.78
AspXaa: 0.0 ± 0.0

Glu
GluAla: 2.927 ± 0.737
GluCys: 0.308 ± 0.176
GluAsp: 6.47 ± 0.943
GluGlu: 7.549 ± 1.701
GluPhe: 3.697 ± 0.789
GluGly: 5.7 ± 0.972
GluHis: 1.387 ± 0.603
GluIle: 7.087 ± 1.264
GluLys: 4.776 ± 0.939
GluLeu: 7.087 ± 1.164
GluMet: 2.465 ± 0.596
GluAsn: 4.468 ± 0.953
GluPro: 1.695 ± 0.373
GluGln: 3.389 ± 0.702
GluArg: 2.927 ± 0.854
GluSer: 4.314 ± 0.871
GluThr: 6.162 ± 0.935
GluVal: 4.16 ± 0.692
GluTrp: 1.695 ± 0.546
GluTyr: 3.081 ± 0.608
GluXaa: 0.0 ± 0.0

Phe
PheAla: 2.465 ± 0.596
PheCys: 0.308 ± 0.211
PheAsp: 3.389 ± 0.603
PheGlu: 3.851 ± 0.758
PhePhe: 1.387 ± 0.495
PheGly: 2.927 ± 0.607
PheHis: 0.924 ± 0.207
PheIle: 2.927 ± 0.76
PheLys: 5.854 ± 1.144
PheLeu: 2.003 ± 0.592
PheMet: 0.924 ± 0.3
PheAsn: 3.081 ± 0.553
PhePro: 1.387 ± 0.451
PheGln: 0.924 ± 0.291
PheArg: 1.387 ± 0.474
PheSer: 2.003 ± 0.497
PheThr: 3.697 ± 0.815
PheVal: 1.849 ± 0.547
PheTrp: 0.308 ± 0.18
PheTyr: 2.157 ± 0.649
PheXaa: 0.0 ± 0.0

Gly
GlyAla: 3.389 ± 0.778
GlyCys: 0.616 ± 0.266
GlyAsp: 3.389 ± 0.824
GlyGlu: 5.854 ± 0.811
GlyPhe: 2.619 ± 0.529
GlyGly: 4.314 ± 0.931
GlyHis: 0.616 ± 0.289
GlyIle: 4.468 ± 1.172
GlyLys: 6.625 ± 0.92
GlyLeu: 3.389 ± 0.479
GlyMet: 0.77 ± 0.257
GlyAsn: 6.008 ± 1.028
GlyPro: 0.77 ± 0.355
GlyGln: 1.232 ± 0.381
GlyArg: 1.541 ± 0.387
GlySer: 4.16 ± 1.12
GlyThr: 4.622 ± 0.929
GlyVal: 5.238 ± 0.993
GlyTrp: 0.462 ± 0.315
GlyTyr: 3.697 ± 0.953
GlyXaa: 0.0 ± 0.0

His
HisAla: 1.078 ± 0.427
HisCys: 0.154 ± 0.125
HisAsp: 1.232 ± 0.51
HisGlu: 1.541 ± 0.387
HisPhe: 0.924 ± 0.366
HisGly: 0.462 ± 0.258
HisHis: 0.462 ± 0.193
HisIle: 1.232 ± 0.377
HisLys: 1.078 ± 0.357
HisLeu: 1.232 ± 0.353
HisMet: 1.078 ± 0.513
HisAsn: 1.232 ± 0.32
HisPro: 0.462 ± 0.239
HisGln: 0.462 ± 0.238
HisArg: 0.924 ± 0.335
HisSer: 1.078 ± 0.379
HisThr: 1.387 ± 0.495
HisVal: 0.924 ± 0.272
HisTrp: 0.154 ± 0.165
HisTyr: 1.387 ± 0.525
HisXaa: 0.0 ± 0.0

Ile
IleAla: 4.16 ± 0.67
IleCys: 0.154 ± 0.125
IleAsp: 5.7 ± 0.897
IleGlu: 7.395 ± 1.163
IlePhe: 2.465 ± 0.58
IleGly: 3.235 ± 0.619
IleHis: 2.311 ± 0.598
IleIle: 3.851 ± 1.03
IleLys: 6.008 ± 0.984
IleLeu: 3.081 ± 0.847
IleMet: 2.465 ± 0.48
IleAsn: 5.392 ± 0.973
IlePro: 1.541 ± 0.459
IleGln: 2.927 ± 0.556
IleArg: 3.389 ± 0.961
IleSer: 3.697 ± 0.708
IleThr: 5.238 ± 0.631
IleVal: 3.543 ± 0.747
IleTrp: 0.616 ± 0.355
IleTyr: 2.311 ± 0.608
IleXaa: 0.0 ± 0.0

Lys
LysAla: 4.776 ± 1.042
LysCys: 1.232 ± 0.545
LysAsp: 5.238 ± 0.935
LysGlu: 7.395 ± 1.161
LysPhe: 3.081 ± 0.871
LysGly: 5.854 ± 0.984
LysHis: 1.695 ± 0.54
LysIle: 4.006 ± 0.828
LysLys: 7.703 ± 1.153
LysLeu: 9.09 ± 0.881
LysMet: 3.389 ± 0.846
LysAsn: 4.006 ± 0.622
LysPro: 2.157 ± 0.602
LysGln: 3.851 ± 0.998
LysArg: 6.008 ± 0.945
LysSer: 3.389 ± 0.539
LysThr: 4.622 ± 0.719
LysVal: 4.622 ± 0.648
LysTrp: 0.77 ± 0.392
LysTyr: 3.543 ± 0.779
LysXaa: 0.0 ± 0.0

Leu
LeuAla: 3.697 ± 0.559
LeuCys: 0.616 ± 0.31
LeuAsp: 4.314 ± 0.524
LeuGlu: 5.392 ± 1.007
LeuPhe: 2.773 ± 0.653
LeuGly: 3.081 ± 0.654
LeuHis: 1.541 ± 0.575
LeuIle: 3.851 ± 1.031
LeuLys: 5.7 ± 1.167
LeuLeu: 4.93 ± 0.678
LeuMet: 2.311 ± 0.543
LeuAsn: 5.854 ± 0.635
LeuPro: 3.081 ± 0.674
LeuGln: 2.311 ± 0.739
LeuArg: 2.773 ± 0.636
LeuSer: 3.851 ± 0.631
LeuThr: 4.622 ± 0.808
LeuVal: 3.235 ± 0.79
LeuTrp: 0.308 ± 0.202
LeuTyr: 4.16 ± 0.83
LeuXaa: 0.0 ± 0.0

Met
MetAla: 1.232 ± 0.367
MetCys: 0.462 ± 0.311
MetAsp: 1.232 ± 0.368
MetGlu: 2.773 ± 0.814
MetPhe: 0.924 ± 0.356
MetGly: 2.157 ± 0.578
MetHis: 0.616 ± 0.369
MetIle: 0.924 ± 0.379
MetLys: 3.081 ± 0.778
MetLeu: 1.695 ± 0.482
MetMet: 1.387 ± 0.48
MetAsn: 2.619 ± 0.588
MetPro: 0.616 ± 0.281
MetGln: 1.232 ± 0.377
MetArg: 1.232 ± 0.441
MetSer: 1.541 ± 0.461
MetThr: 1.849 ± 0.429
MetVal: 3.389 ± 0.691
MetTrp: 0.0 ± 0.0
MetTyr: 2.311 ± 0.385
MetXaa: 0.0 ± 0.0

Asn
AsnAla: 3.697 ± 0.776
AsnCys: 0.616 ± 0.299
AsnAsp: 3.851 ± 0.702
AsnGlu: 6.008 ± 0.721
AsnPhe: 2.311 ± 0.571
AsnGly: 5.7 ± 1.157
AsnHis: 1.232 ± 0.385
AsnIle: 5.084 ± 0.915
AsnLys: 5.7 ± 0.994
AsnLeu: 4.314 ± 0.875
AsnMet: 1.541 ± 0.633
AsnAsn: 4.006 ± 0.695
AsnPro: 1.541 ± 0.446
AsnGln: 2.619 ± 0.623
AsnArg: 2.311 ± 0.696
AsnSer: 4.16 ± 0.68
AsnThr: 4.006 ± 0.758
AsnVal: 4.16 ± 0.636
AsnTrp: 1.232 ± 0.32
AsnTyr: 3.697 ± 0.691
AsnXaa: 0.0 ± 0.0

Pro
ProAla: 1.232 ± 0.364
ProCys: 0.308 ± 0.176
ProAsp: 1.387 ± 0.479
ProGlu: 2.157 ± 0.567
ProPhe: 1.695 ± 0.56
ProGly: 0.308 ± 0.195
ProHis: 0.154 ± 0.161
ProIle: 1.232 ± 0.357
ProLys: 2.003 ± 0.407
ProLeu: 1.849 ± 0.507
ProMet: 1.232 ± 0.532
ProAsn: 2.311 ± 0.5
ProPro: 1.078 ± 0.43
ProGln: 1.387 ± 0.484
ProArg: 1.078 ± 0.329
ProSer: 1.849 ± 0.472
ProThr: 2.619 ± 0.523
ProVal: 2.619 ± 0.617
ProTrp: 0.308 ± 0.218
ProTyr: 1.695 ± 0.405
ProXaa: 0.0 ± 0.0

Gln
GlnAla: 2.619 ± 0.747
GlnCys: 0.154 ± 0.172
GlnAsp: 1.849 ± 0.593
GlnGlu: 1.695 ± 0.472
GlnPhe: 1.849 ± 0.632
GlnGly: 2.311 ± 0.494
GlnHis: 0.924 ± 0.326
GlnIle: 3.389 ± 0.647
GlnLys: 2.927 ± 0.515
GlnLeu: 2.311 ± 0.618
GlnMet: 1.387 ± 0.684
GlnAsn: 1.849 ± 0.523
GlnPro: 1.078 ± 0.45
GlnGln: 1.387 ± 0.43
GlnArg: 1.387 ± 0.381
GlnSer: 2.003 ± 0.862
GlnThr: 2.773 ± 0.838
GlnVal: 2.157 ± 0.626
GlnTrp: 0.462 ± 0.172
GlnTyr: 1.541 ± 0.419
GlnXaa: 0.0 ± 0.0

Arg
ArgAla: 1.849 ± 0.502
ArgCys: 0.462 ± 0.247
ArgAsp: 2.311 ± 0.675
ArgGlu: 2.465 ± 0.518
ArgPhe: 3.235 ± 0.702
ArgGly: 2.311 ± 0.649
ArgHis: 0.462 ± 0.25
ArgIle: 2.619 ± 0.532
ArgLys: 4.468 ± 0.684
ArgLeu: 2.927 ± 1.044
ArgMet: 1.695 ± 0.476
ArgAsn: 3.389 ± 0.689
ArgPro: 0.924 ± 0.344
ArgGln: 1.695 ± 0.435
ArgArg: 1.541 ± 0.444
ArgSer: 1.541 ± 0.429
ArgThr: 2.619 ± 0.435
ArgVal: 2.465 ± 0.547
ArgTrp: 0.616 ± 0.281
ArgTyr: 1.541 ± 0.506
ArgXaa: 0.0 ± 0.0

Ser
SerAla: 2.619 ± 0.796
SerCys: 0.616 ± 0.216
SerAsp: 3.851 ± 0.732
SerGlu: 4.314 ± 0.924
SerPhe: 1.849 ± 0.394
SerGly: 2.157 ± 0.453
SerHis: 1.078 ± 0.475
SerIle: 3.543 ± 0.802
SerLys: 5.238 ± 0.782
SerLeu: 5.084 ± 0.66
SerMet: 1.541 ± 0.372
SerAsn: 2.773 ± 0.598
SerPro: 1.695 ± 0.406
SerGln: 2.311 ± 0.542
SerArg: 1.541 ± 0.438
SerSer: 3.235 ± 0.65
SerThr: 2.927 ± 0.774
SerVal: 4.16 ± 0.568
SerTrp: 0.462 ± 0.22
SerTyr: 2.157 ± 0.581
SerXaa: 0.0 ± 0.0

Thr
ThrAla: 2.773 ± 0.644
ThrCys: 0.462 ± 0.232
ThrAsp: 4.776 ± 0.839
ThrGlu: 3.851 ± 0.823
ThrPhe: 4.006 ± 0.683
ThrGly: 5.7 ± 1.361
ThrHis: 1.387 ± 0.358
ThrIle: 5.546 ± 1.011
ThrLys: 6.008 ± 1.092
ThrLeu: 4.006 ± 0.808
ThrMet: 0.924 ± 0.284
ThrAsn: 4.006 ± 0.919
ThrPro: 2.927 ± 0.39
ThrGln: 2.927 ± 0.706
ThrArg: 3.081 ± 0.612
ThrSer: 4.776 ± 0.822
ThrThr: 5.546 ± 1.215
ThrVal: 3.389 ± 0.901
ThrTrp: 1.078 ± 0.406
ThrTyr: 2.003 ± 0.468
ThrXaa: 0.0 ± 0.0

Val
ValAla: 3.081 ± 1.009
ValCys: 0.0 ± 0.0
ValAsp: 3.851 ± 0.641
ValGlu: 4.93 ± 0.824
ValPhe: 2.465 ± 0.607
ValGly: 2.465 ± 0.567
ValHis: 0.924 ± 0.284
ValIle: 5.7 ± 0.949
ValLys: 4.468 ± 0.759
ValLeu: 2.773 ± 0.713
ValMet: 1.849 ± 0.453
ValAsn: 6.316 ± 0.686
ValPro: 2.157 ± 0.438
ValGln: 2.003 ± 0.719
ValArg: 2.311 ± 0.581
ValSer: 3.081 ± 0.741
ValThr: 5.392 ± 0.893
ValVal: 3.851 ± 0.845
ValTrp: 1.078 ± 0.418
ValTyr: 2.773 ± 0.462
ValXaa: 0.0 ± 0.0

Trp
TrpAla: 0.77 ± 0.459
TrpCys: 0.0 ± 0.0
TrpAsp: 0.616 ± 0.309
TrpGlu: 0.77 ± 0.403
TrpPhe: 0.77 ± 0.344
TrpGly: 0.462 ± 0.228
TrpHis: 0.308 ± 0.19
TrpIle: 0.616 ± 0.313
TrpLys: 1.078 ± 0.488
TrpLeu: 1.232 ± 0.367
TrpMet: 0.154 ± 0.144
TrpAsn: 0.924 ± 0.383
TrpPro: 0.0 ± 0.0
TrpGln: 0.616 ± 0.29
TrpArg: 0.308 ± 0.148
TrpSer: 0.154 ± 0.125
TrpThr: 1.078 ± 0.329
TrpVal: 0.616 ± 0.307
TrpTrp: 0.154 ± 0.176
TrpTyr: 0.77 ± 0.351
TrpXaa: 0.0 ± 0.0

Tyr
TyrAla: 2.619 ± 0.615
TyrCys: 0.462 ± 0.251
TyrAsp: 3.697 ± 0.874
TyrGlu: 3.235 ± 0.675
TyrPhe: 1.849 ± 0.426
TyrGly: 4.776 ± 0.915
TyrHis: 0.616 ± 0.292
TyrIle: 3.697 ± 0.606
TyrLys: 2.465 ± 0.518
TyrLeu: 2.773 ± 0.726
TyrMet: 1.695 ± 0.503
TyrAsn: 3.081 ± 0.727
TyrPro: 1.387 ± 0.39
TyrGln: 2.465 ± 0.574
TyrArg: 2.619 ± 0.64
TyrSer: 1.849 ± 0.464
TyrThr: 2.003 ± 0.443
TyrVal: 2.773 ± 0.545
TyrTrp: 1.078 ± 0.351
TyrTyr: 1.849 ± 0.516
TyrXaa: 0.0 ± 0.0

Xaa
XaaAla: 0.0 ± 0.0
XaaCys: 0.0 ± 0.0
XaaAsp: 0.0 ± 0.0
XaaGlu: 0.0 ± 0.0
XaaPhe: 0.0 ± 0.0
XaaGly: 0.0 ± 0.0
XaaHis: 0.0 ± 0.0
XaaIle: 0.0 ± 0.0
XaaLys: 0.0 ± 0.0
XaaLeu: 0.0 ± 0.0
XaaMet: 0.0 ± 0.0
XaaAsn: 0.0 ± 0.0
XaaPro: 0.0 ± 0.0
XaaGln: 0.0 ± 0.0
XaaArg: 0.0 ± 0.0
XaaSer: 0.0 ± 0.0
XaaThr: 0.0 ± 0.0
XaaVal: 0.0 ± 0.0
XaaTrp: 0.0 ± 0.0
XaaTyr: 0.0 ± 0.0
XaaXaa: 0.0 ± 0.0

Statistics based on 33 proteins (6492 amino acids)

Note: The error was estimated by bootstrapping (x100) at the protein level.
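A protein-level bootstrap of this kind can be sketched as below. This is only an illustration under stated assumptions: whole proteins are resampled with replacement 100 times, the frequency is recomputed for each resample, and the error is taken as the sample standard deviation of those estimates. The function names, the toy sequences, and the choice of standard deviation as the error measure are hypothetical and not confirmed by the source.

```python
import random
import statistics


def pair_permille(sequences, pair):
    """Frequency of one dipeptide in per mille over overlapping pairs."""
    hits = 0
    total = 0
    for seq in sequences:
        for i in range(len(seq) - 1):
            total += 1
            if seq[i:i + 2] == pair:
                hits += 1
    return 1000.0 * hits / total if total else 0.0


def bootstrap_error(sequences, pair, replicates=100, seed=0):
    """Spread of the frequency across protein-level resamples (assumed: stdev)."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    estimates = []
    for _ in range(replicates):
        # Resample whole proteins with replacement, keeping the proteome size.
        sample = rng.choices(sequences, k=len(sequences))
        estimates.append(pair_permille(sample, pair))
    return statistics.stdev(estimates)


# Toy proteome (hypothetical sequences).
proteins = ["MKLK", "KLA", "AKLM", "MAAK"]
kl_error = bootstrap_error(proteins, "KL")
ww_error = bootstrap_error(proteins, "WW")
```

A dipeptide that never occurs has the same zero frequency in every resample, so its error is exactly zero, which matches entries such as CysCys: 0.0 ± 0.0 in the table.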

The dipeptide statistics above (among other statistics for this proteome) can be downloaded from this CSV file
See this proteome in: uniprot_link
Proteome-pI is available under the Creative Commons Attribution-NoDerivs license; for more details see here

Reference: Kozlowski LP. Proteome-pI 2.0: Proteome Isoelectric Point Database Update. Nucleic Acids Res. 2021, doi: 10.1093/nar/gkab944
Contact: Lukasz P. Kozlowski