Amino acid dipeptide frequency for Streptococcus satellite phage Javan368

Apart from single amino acid frequencies, one can also calculate the so-called amino acid dipeptide frequency, i.e. how often each ordered pair of adjacent amino acids occurs.
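As an illustration of what such a calculation involves, here is a minimal Python sketch that counts overlapping adjacent residue pairs and reports them in per mille. It is not the Proteome-pI implementation: the function name `dipeptide_frequencies`, the use of one-letter codes, and the choice to count pairs only within each protein are assumptions made for this example, and the input sequences are toy data.

```python
from collections import Counter


def dipeptide_frequencies(proteins):
    """Return dipeptide frequencies in per mille for a list of sequences.

    Assumption: overlapping adjacent pairs are counted within each protein
    only, so no pair spans the boundary between two proteins.
    """
    counts = Counter()
    for seq in proteins:
        for i in range(len(seq) - 1):
            counts[seq[i:i + 2]] += 1
    total = sum(counts.values())
    if total == 0:
        return {}
    return {pair: 1000.0 * n / total for pair, n in counts.items()}


# Toy sequences in one-letter code, not taken from the Javan368 proteome.
freqs = dipeptide_frequencies(["MKKLA", "MKA"])
for pair in sorted(freqs):
    print(pair, round(freqs[pair], 3))
```

The table on this page uses three-letter codes (for example LysLys rather than KK); mapping between the two notations is straightforward and is left out to keep the sketch short.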

There are 400 possible dipeptides (441 if Xaa is counted as an additional 21st amino acid representing all non-standard or unknown amino acids). Thus, if dipeptides occurred randomly in proteins, each amino acid dipeptide would be present with a frequency of 0.25% (2.5‰). As this is not the case in nature, for better visibility all dipeptides that occur more often than expected are marked in red in the table, and those that are underrepresented are marked in blue.
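The arithmetic behind the expected value can be sketched as follows. The helper names `expected_per_mille` and `classify` are hypothetical, and the sketch assumes the baseline is the uniform expectation described above (1/400, or 1/441 when Xaa is included); the exact threshold the page uses for its red and blue marking is not stated here.

```python
def expected_per_mille(alphabet_size=20):
    """Uniform expectation for one dipeptide, in per mille."""
    return 1000.0 / alphabet_size ** 2


def classify(observed_per_mille, alphabet_size=20):
    """Compare an observed frequency with the uniform expectation."""
    expected = expected_per_mille(alphabet_size)
    if observed_per_mille > expected:
        return "over"
    if observed_per_mille < expected:
        return "under"
    return "as expected"


print(expected_per_mille(20))            # 2.5 (i.e. 0.25%)
print(round(expected_per_mille(21), 3))  # 2.268 when Xaa is counted
print(classify(9.646))                   # GluLys from the table: over
print(classify(0.0))                     # GlyPro from the table: under
```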

All values are given in per mille (‰), so they need to be multiplied by 10⁻³ to obtain fractions.
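For readers who want to work with the values numerically, the unit conversion can be written as a short Python sketch. The function names are hypothetical; the example value is the GluLys entry from the table.

```python
def per_mille_to_fraction(value_per_mille):
    """Convert a per mille value to a plain fraction (multiply by 10^-3)."""
    return value_per_mille * 1e-3


def per_mille_to_percent(value_per_mille):
    """Convert a per mille value to percent (1% equals 10 per mille)."""
    return value_per_mille / 10.0


glu_lys = 9.646  # GluLys value from the table, in per mille
print(f"{per_mille_to_fraction(glu_lys):.6f}")  # 0.009646
print(f"{per_mille_to_percent(glu_lys):.4f}%")  # 0.9646%
```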

For more information, see the sequence space article on Wikipedia.

Rows give the first residue of the dipeptide and columns the second; each cell is the frequency ± error in ‰.

| First / Second | Ala | Cys | Asp | Glu | Phe | Gly | His | Ile | Lys | Leu | Met | Asn | Pro | Gln | Arg | Ser | Thr | Val | Trp | Tyr | Xaa |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Ala | 3.8 ± 0.936 | 0.585 ± 0.321 | 3.508 ± 0.812 | 7.308 ± 1.677 | 2.923 ± 0.559 | 3.215 ± 0.867 | 0.877 ± 0.603 | 6.723 ± 1.353 | 4.677 ± 0.936 | 7.6 ± 1.51 | 1.169 ± 0.861 | 4.385 ± 1.329 | 0.585 ± 0.419 | 1.462 ± 0.694 | 2.046 ± 0.772 | 4.385 ± 1.118 | 2.923 ± 1.044 | 3.8 ± 1.182 | 0.292 ± 0.257 | 2.338 ± 0.745 | 0.0 ± 0.0 |
| Cys | 0.0 ± 0.0 | 0.292 ± 0.296 | 0.0 ± 0.0 | 0.292 ± 0.27 | 0.292 ± 0.276 | 0.292 ± 0.234 | 0.0 ± 0.0 | 0.585 ± 0.469 | 0.292 ± 0.234 | 0.877 ± 0.515 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.292 ± 0.27 | 0.0 ± 0.0 | 0.877 ± 0.416 | 0.0 ± 0.0 | 0.292 ± 0.27 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.585 ± 0.41 | 0.0 ± 0.0 |
| Asp | 3.215 ± 1.22 | 0.292 ± 0.296 | 4.385 ± 0.88 | 7.015 ± 1.882 | 3.508 ± 1.214 | 2.923 ± 0.785 | 0.585 ± 0.45 | 5.262 ± 0.776 | 4.677 ± 1.314 | 8.185 ± 1.249 | 1.462 ± 0.618 | 4.092 ± 1.246 | 1.169 ± 0.733 | 0.877 ± 0.427 | 2.631 ± 0.738 | 2.338 ± 0.542 | 1.462 ± 0.863 | 1.754 ± 0.563 | 0.877 ± 0.361 | 2.338 ± 0.889 | 0.0 ± 0.0 |
| Glu | 7.308 ± 1.498 | 0.585 ± 0.314 | 2.923 ± 0.766 | 6.723 ± 1.269 | 2.923 ± 0.879 | 5.554 ± 1.08 | 1.462 ± 0.627 | 7.308 ± 1.68 | 9.646 ± 1.745 | 9.646 ± 1.745 | 1.462 ± 0.553 | 6.139 ± 1.083 | 1.462 ± 0.657 | 7.6 ± 2.249 | 3.215 ± 1.251 | 4.092 ± 0.968 | 3.8 ± 0.998 | 4.092 ± 1.081 | 1.754 ± 0.574 | 1.462 ± 0.579 | 0.0 ± 0.0 |
| Phe | 2.631 ± 0.844 | 0.292 ± 0.27 | 1.754 ± 0.509 | 4.677 ± 0.953 | 2.338 ± 0.757 | 2.046 ± 0.602 | 1.462 ± 0.633 | 2.631 ± 0.85 | 1.754 ± 0.57 | 3.8 ± 0.781 | 0.292 ± 0.269 | 2.046 ± 0.641 | 0.877 ± 0.404 | 0.877 ± 0.495 | 0.585 ± 0.321 | 2.631 ± 0.897 | 2.631 ± 0.618 | 2.631 ± 0.681 | 0.292 ± 0.234 | 0.877 ± 0.671 | 0.0 ± 0.0 |
| Gly | 4.677 ± 1.212 | 0.585 ± 0.329 | 1.462 ± 0.389 | 2.631 ± 0.677 | 1.462 ± 0.421 | 2.923 ± 1.332 | 1.462 ± 0.505 | 4.677 ± 1.08 | 5.554 ± 1.55 | 6.431 ± 1.37 | 1.169 ± 0.521 | 1.754 ± 0.889 | 0.0 ± 0.0 | 2.338 ± 1.048 | 1.169 ± 0.559 | 2.046 ± 0.739 | 0.877 ± 0.467 | 5.554 ± 1.264 | 0.0 ± 0.0 | 2.338 ± 0.583 | 0.0 ± 0.0 |
| His | 1.169 ± 0.674 | 0.0 ± 0.0 | 0.0 ± 0.0 | 1.754 ± 0.968 | 0.877 ± 0.461 | 0.877 ± 0.436 | 0.0 ± 0.0 | 0.877 ± 0.446 | 2.046 ± 0.558 | 1.754 ± 0.628 | 0.292 ± 0.278 | 1.169 ± 0.474 | 0.292 ± 0.334 | 0.877 ± 0.603 | 0.0 ± 0.0 | 0.877 ± 0.365 | 0.877 ± 0.392 | 0.292 ± 0.296 | 0.0 ± 0.0 | 1.462 ± 0.591 | 0.0 ± 0.0 |
| Ile | 3.8 ± 1.09 | 0.292 ± 0.257 | 5.262 ± 1.115 | 8.769 ± 2.091 | 3.215 ± 0.823 | 2.631 ± 0.655 | 1.462 ± 0.632 | 5.262 ± 1.026 | 4.677 ± 1.185 | 7.308 ± 0.885 | 1.169 ± 0.386 | 4.385 ± 1.068 | 2.046 ± 0.524 | 3.215 ± 0.761 | 3.8 ± 1.169 | 9.062 ± 1.643 | 2.631 ± 0.97 | 4.677 ± 1.049 | 0.585 ± 0.427 | 1.169 ± 0.738 | 0.0 ± 0.0 |
| Lys | 6.431 ± 1.835 | 0.585 ± 0.353 | 4.677 ± 0.823 | 9.062 ± 1.108 | 3.215 ± 1.092 | 3.508 ± 0.956 | 0.877 ± 0.504 | 5.846 ± 1.159 | 8.769 ± 1.75 | 4.677 ± 1.253 | 1.754 ± 0.792 | 3.8 ± 0.875 | 2.631 ± 0.712 | 5.554 ± 1.056 | 4.092 ± 1.1 | 8.769 ± 1.376 | 3.508 ± 0.844 | 3.8 ± 1.13 | 1.169 ± 0.388 | 2.046 ± 0.502 | 0.0 ± 0.0 |
| Leu | 8.185 ± 1.675 | 0.0 ± 0.0 | 7.892 ± 1.812 | 7.892 ± 1.437 | 3.8 ± 0.97 | 4.969 ± 1.075 | 1.169 ± 0.471 | 5.554 ± 1.416 | 7.6 ± 1.216 | 8.185 ± 1.252 | 2.338 ± 0.838 | 6.723 ± 1.458 | 3.215 ± 1.113 | 3.215 ± 0.745 | 6.139 ± 0.652 | 8.477 ± 1.812 | 5.262 ± 0.919 | 4.969 ± 0.876 | 1.169 ± 0.519 | 3.215 ± 0.552 | 0.0 ± 0.0 |
| Met | 0.585 ± 0.419 | 0.0 ± 0.0 | 1.754 ± 0.692 | 1.754 ± 0.657 | 0.0 ± 0.0 | 0.292 ± 0.272 | 0.0 ± 0.0 | 1.169 ± 0.645 | 1.169 ± 0.646 | 2.631 ± 0.829 | 0.585 ± 0.342 | 2.631 ± 0.702 | 0.292 ± 0.296 | 1.169 ± 0.548 | 0.877 ± 0.449 | 1.169 ± 0.49 | 2.338 ± 0.716 | 1.462 ± 0.624 | 0.0 ± 0.0 | 0.292 ± 0.234 | 0.0 ± 0.0 |
| Asn | 2.338 ± 0.432 | 0.0 ± 0.0 | 4.677 ± 0.932 | 4.677 ± 1.318 | 2.046 ± 0.73 | 3.508 ± 0.949 | 0.292 ± 0.246 | 3.508 ± 0.954 | 5.262 ± 0.934 | 7.015 ± 1.358 | 0.877 ± 0.446 | 3.508 ± 0.785 | 2.046 ± 0.778 | 1.462 ± 0.459 | 3.508 ± 1.473 | 5.262 ± 0.755 | 3.215 ± 0.922 | 3.215 ± 1.183 | 0.877 ± 0.483 | 3.215 ± 0.855 | 0.0 ± 0.0 |
| Pro | 1.169 ± 0.558 | 0.877 ± 0.407 | 2.338 ± 0.692 | 2.923 ± 0.638 | 1.169 ± 0.559 | 1.754 ± 0.667 | 0.0 ± 0.0 | 1.462 ± 0.721 | 2.046 ± 0.84 | 2.046 ± 0.97 | 0.292 ± 0.296 | 1.462 ± 0.39 | 0.877 ± 0.405 | 0.877 ± 0.54 | 1.754 ± 0.611 | 1.462 ± 0.638 | 1.754 ± 0.739 | 1.169 ± 0.45 | 0.0 ± 0.0 | 1.169 ± 0.704 | 0.0 ± 0.0 |
| Gln | 4.385 ± 1.447 | 0.0 ± 0.0 | 2.046 ± 0.673 | 3.8 ± 0.617 | 1.169 ± 0.353 | 1.754 ± 0.769 | 1.169 ± 0.477 | 2.923 ± 1.208 | 2.046 ± 0.564 | 3.508 ± 0.905 | 1.169 ± 0.374 | 2.631 ± 1.071 | 1.169 ± 0.38 | 3.215 ± 1.267 | 2.923 ± 0.994 | 2.923 ± 1.189 | 3.215 ± 1.012 | 2.631 ± 0.945 | 0.877 ± 0.8 | 2.923 ± 0.802 | 0.0 ± 0.0 |
| Arg | 3.8 ± 0.855 | 0.0 ± 0.0 | 2.631 ± 0.738 | 3.8 ± 1.141 | 1.462 ± 0.625 | 1.462 ± 0.644 | 0.292 ± 0.27 | 2.631 ± 0.656 | 6.139 ± 1.089 | 4.677 ± 1.024 | 1.754 ± 0.481 | 1.754 ± 0.57 | 1.169 ± 0.561 | 3.215 ± 0.748 | 1.754 ± 0.684 | 2.631 ± 0.705 | 2.923 ± 0.625 | 3.215 ± 1.115 | 0.292 ± 0.276 | 1.754 ± 0.454 | 0.0 ± 0.0 |
| Ser | 2.046 ± 0.771 | 0.292 ± 0.276 | 3.508 ± 1.393 | 4.969 ± 0.633 | 1.754 ± 0.99 | 3.215 ± 1.093 | 2.046 ± 0.634 | 6.723 ± 0.877 | 4.969 ± 1.235 | 5.554 ± 0.73 | 1.169 ± 0.751 | 4.969 ± 1.006 | 2.338 ± 0.813 | 3.8 ± 1.014 | 3.8 ± 1.027 | 4.092 ± 1.085 | 5.262 ± 1.338 | 4.677 ± 0.973 | 1.169 ± 0.475 | 0.877 ± 0.554 | 0.0 ± 0.0 |
| Thr | 3.508 ± 0.732 | 0.0 ± 0.0 | 2.631 ± 1.05 | 5.262 ± 1.389 | 1.169 ± 0.545 | 3.215 ± 1.046 | 0.585 ± 0.321 | 5.846 ± 1.254 | 4.385 ± 1.168 | 4.677 ± 0.722 | 0.877 ± 0.519 | 2.046 ± 0.525 | 3.215 ± 0.782 | 2.338 ± 1.039 | 2.631 ± 0.986 | 1.462 ± 0.497 | 2.923 ± 0.922 | 2.923 ± 1.029 | 0.0 ± 0.0 | 1.754 ± 0.81 | 0.0 ± 0.0 |
| Val | 3.215 ± 0.797 | 0.292 ± 0.27 | 4.969 ± 0.957 | 2.631 ± 1.547 | 2.046 ± 0.958 | 2.046 ± 0.967 | 1.754 ± 0.646 | 4.385 ± 1.103 | 5.554 ± 0.966 | 4.677 ± 1.039 | 0.877 ± 0.438 | 5.554 ± 0.733 | 2.338 ± 0.947 | 1.462 ± 0.741 | 2.923 ± 0.913 | 2.923 ± 0.912 | 3.508 ± 0.995 | 2.923 ± 1.199 | 0.0 ± 0.0 | 2.631 ± 0.686 | 0.0 ± 0.0 |
| Trp | 0.292 ± 0.27 | 0.0 ± 0.0 | 0.292 ± 0.234 | 0.877 ± 0.442 | 0.0 ± 0.0 | 0.877 ± 0.512 | 0.0 ± 0.0 | 1.169 ± 0.519 | 0.292 ± 0.234 | 1.462 ± 0.536 | 0.292 ± 0.234 | 0.585 ± 0.407 | 0.292 ± 0.257 | 1.169 ± 0.523 | 0.0 ± 0.0 | 0.585 ± 0.321 | 0.585 ± 0.427 | 0.877 ± 0.483 | 0.292 ± 0.27 | 0.292 ± 0.308 | 0.0 ± 0.0 |
| Tyr | 2.046 ± 1.284 | 0.0 ± 0.0 | 2.338 ± 0.856 | 2.338 ± 0.766 | 1.754 ± 0.615 | 2.046 ± 0.532 | 0.0 ± 0.0 | 0.877 ± 0.48 | 3.215 ± 1.468 | 5.262 ± 1.031 | 0.877 ± 0.494 | 0.877 ± 0.581 | 0.292 ± 0.257 | 1.754 ± 0.612 | 2.631 ± 0.997 | 2.338 ± 0.847 | 1.462 ± 0.648 | 2.046 ± 0.673 | 0.585 ± 0.359 | 0.877 ± 0.519 | 0.0 ± 0.0 |
| Xaa | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 |

Statistics based on 18 proteins (3422 amino acids)

Note: The error has been estimated by bootstrapping (x100) at the protein level.
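One way such a protein-level bootstrap could be carried out is sketched below in Python. This is an assumption about the general technique, not a description of the Proteome-pI code: whole proteins are resampled with replacement, the dipeptide frequency is recomputed for each replicate, and the standard deviation across replicates is taken as the error. The function names, the fixed seed, and the toy sequences are all hypothetical.

```python
import random
from collections import Counter
from statistics import stdev


def pair_counts(seq):
    """Count overlapping adjacent residue pairs in one sequence."""
    return Counter(seq[i:i + 2] for i in range(len(seq) - 1))


def bootstrap_error(proteins, pair, n_boot=100, seed=0):
    """Estimate the error (in per mille) of one dipeptide frequency.

    Proteins, not individual residues, are the resampling unit.
    """
    rng = random.Random(seed)
    per_protein = [pair_counts(s) for s in proteins]
    estimates = []
    for _ in range(n_boot):
        # Draw as many proteins as in the original set, with replacement.
        sample = rng.choices(per_protein, k=len(per_protein))
        total = sum(sum(c.values()) for c in sample)
        hits = sum(c[pair] for c in sample)
        estimates.append(1000.0 * hits / total if total else 0.0)
    return stdev(estimates)


# Toy sequences in one-letter code, not the 18 Javan368 proteins.
toy = ["MKKLA", "MKA", "AAKKE", "MLLKK"]
print(bootstrap_error(toy, "KK"))
```

Resampling at the protein level keeps each protein's internal composition intact, so the spread reflects variation between proteins rather than between individual positions.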

The dipeptide statistics above (along with other statistics for this proteome) can be downloaded from this CSV file
See this proteome in: uniprot_link
Proteome-pI is available under a Creative Commons Attribution-NoDerivs license; for more details see here

Reference: Kozlowski LP. Proteome-pI 2.0: Proteome Isoelectric Point Database Update. Nucleic Acids Res. 2021, doi: 10.1093/nar/gkab944
Contact: Lukasz P. Kozlowski