Amino acid dipeptide frequency for uncultured marine virus

Apart from single amino acid frequencies, one can also calculate the so-called amino acid dipeptide frequency, i.e. how often each ordered pair of adjacent amino acids occurs in the proteome.
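As a rough illustration of what such a calculation involves, the sketch below counts overlapping residue pairs in a list of protein sequences and normalises by the total number of pairs. The function name `dipeptide_frequencies`, the one-letter sequence encoding, and the toy sequences are assumptions made for this example; the page does not state how its own values were computed (for instance, whether pairs spanning protein boundaries are counted).

```python
from collections import Counter


def dipeptide_frequencies(proteins):
    """Return {dipeptide: fraction} over all overlapping adjacent residue pairs.

    Assumption: pairs are counted within each protein only, never across
    the boundary between two proteins.
    """
    counts = Counter()
    for seq in proteins:
        # A protein of length n contributes n - 1 overlapping pairs.
        counts.update(seq[i:i + 2] for i in range(len(seq) - 1))
    total = sum(counts.values())
    if total == 0:
        return {}
    return {pair: n / total for pair, n in counts.items()}


# Hypothetical toy sequences in one-letter code (not taken from this proteome).
toy_proteins = ["MKVLA", "MKA"]
freqs = dipeptide_frequencies(toy_proteins)
```

Here `MKVLA` contributes the pairs MK, KV, VL and LA, and `MKA` contributes MK and KA, so MK accounts for 2 of the 6 pairs.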

There are 400 possible dipeptides (441 if Xaa is counted as an additional, 21st amino acid representing all non-standard or unknown amino acids). Thus, if dipeptides occurred randomly in proteins, each one would be present at a frequency of 0.25%. As this is not the case in nature, for better visibility all overrepresented dipeptides are marked in red, and those that are underrepresented are marked in blue in the table.

All values are given in per mille (‰), so they must be multiplied by 10⁻³ to obtain fractions.
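The arithmetic behind the uniform expectation and the per mille scale can be sketched as follows. The helper names are hypothetical, and the simple "above or below the uniform expectation" rule is an assumption: the page does not state the exact criterion used for the red and blue marking.

```python
def expected_permille(alphabet_size=20):
    """Uniform expectation per dipeptide, in per mille (1000 / alphabet_size**2)."""
    return 1000.0 / alphabet_size ** 2


def permille_to_fraction(value):
    """Convert a per mille value to a plain fraction (multiply by 10^-3)."""
    return value * 1e-3


def classify(observed_permille, alphabet_size=20):
    """Label a dipeptide relative to the uniform expectation (assumed rule)."""
    expected = expected_permille(alphabet_size)
    if observed_permille > expected:
        return "over"
    if observed_permille < expected:
        return "under"
    return "as expected"


uniform_20 = expected_permille(20)  # 400 dipeptides: 2.5 per mille, i.e. 0.25%
uniform_21 = expected_permille(21)  # 441 dipeptides when Xaa is included
ala_arg = classify(12.629)          # AlaArg value from the table
ala_val = classify(0.574)           # AlaVal value from the table
```

Note that this baseline treats all amino acids as equally frequent; a baseline built from the observed single amino acid frequencies would give different expected values.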

For more information, see the sequence space article on Wikipedia.

Ala Cys Asp Glu Phe Gly His Ile Lys Leu Met Asn Pro Gln Arg Ser Thr Val Trp Tyr Xaa
Ala
AlaAla: 4.592 ± 1.779
AlaCys: 1.148 ± 0.751
AlaAsp: 8.611 ± 0.954
AlaGlu: 1.722 ± 0.885
AlaPhe: 2.87 ± 0.964
AlaGly: 5.741 ± 1.198
AlaHis: 2.87 ± 0.894
AlaIle: 1.722 ± 1.104
AlaLys: 1.722 ± 1.012
AlaLeu: 6.889 ± 0.902
AlaMet: 1.722 ± 0.674
AlaAsn: 2.87 ± 0.966
AlaPro: 5.166 ± 0.839
AlaGln: 1.722 ± 0.873
AlaArg: 12.629 ± 3.251
AlaSer: 4.592 ± 1.365
AlaThr: 7.463 ± 2.14
AlaVal: 0.574 ± 0.412
AlaTrp: 2.296 ± 0.998
AlaTyr: 4.018 ± 0.657
AlaXaa: 0.0 ± 0.0
Cys
CysAla: 0.0 ± 0.0
CysCys: 0.0 ± 0.0
CysAsp: 2.296 ± 1.004
CysGlu: 1.148 ± 0.511
CysPhe: 0.0 ± 0.0
CysGly: 0.0 ± 0.0
CysHis: 0.574 ± 0.504
CysIle: 1.722 ± 0.674
CysLys: 0.0 ± 0.0
CysLeu: 1.148 ± 0.746
CysMet: 0.0 ± 0.0
CysAsn: 1.722 ± 0.749
CysPro: 1.722 ± 0.674
CysGln: 0.0 ± 0.0
CysArg: 2.87 ± 0.898
CysSer: 1.722 ± 0.719
CysThr: 2.296 ± 0.459
CysVal: 0.0 ± 0.0
CysTrp: 0.0 ± 0.0
CysTyr: 0.0 ± 0.0
CysXaa: 0.0 ± 0.0
Asp
AspAla: 6.315 ± 1.806
AspCys: 0.0 ± 0.0
AspAsp: 2.296 ± 0.459
AspGlu: 1.148 ± 0.825
AspPhe: 7.463 ± 0.638
AspGly: 4.018 ± 1.392
AspHis: 1.722 ± 0.611
AspIle: 8.037 ± 1.15
AspLys: 1.722 ± 0.682
AspLeu: 3.444 ± 1.259
AspMet: 2.87 ± 1.187
AspAsn: 1.722 ± 0.856
AspPro: 2.296 ± 0.905
AspGln: 3.444 ± 1.343
AspArg: 0.574 ± 0.412
AspSer: 1.722 ± 0.772
AspThr: 6.889 ± 1.013
AspVal: 2.296 ± 0.767
AspTrp: 2.296 ± 0.459
AspTyr: 2.87 ± 0.758
AspXaa: 0.574 ± 0.504
Glu
GluAla: 2.296 ± 0.814
GluCys: 1.148 ± 0.568
GluAsp: 1.722 ± 0.732
GluGlu: 3.444 ± 1.27
GluPhe: 1.148 ± 0.751
GluGly: 1.722 ± 0.674
GluHis: 1.722 ± 0.611
GluIle: 2.296 ± 1.921
GluLys: 1.722 ± 0.562
GluLeu: 2.296 ± 0.459
GluMet: 2.296 ± 0.953
GluAsn: 1.722 ± 0.719
GluPro: 0.0 ± 0.0
GluGln: 0.574 ± 0.503
GluArg: 5.166 ± 0.995
GluSer: 1.722 ± 0.733
GluThr: 1.722 ± 0.856
GluVal: 1.148 ± 0.745
GluTrp: 0.574 ± 0.505
GluTyr: 2.87 ± 0.757
GluXaa: 0.0 ± 0.0
Phe
PheAla: 2.87 ± 0.922
PheCys: 0.574 ± 0.653
PheAsp: 3.444 ± 1.195
PheGlu: 2.296 ± 1.031
PhePhe: 2.87 ± 2.062
PheGly: 0.574 ± 0.412
PheHis: 0.574 ± 0.412
PheIle: 2.296 ± 0.923
PheLys: 0.574 ± 0.503
PheLeu: 3.444 ± 0.758
PheMet: 1.148 ± 0.825
PheAsn: 2.87 ± 1.006
PhePro: 1.722 ± 1.283
PheGln: 1.722 ± 0.719
PheArg: 5.166 ± 1.216
PheSer: 2.296 ± 0.69
PheThr: 1.722 ± 0.772
PheVal: 4.018 ± 0.905
PheTrp: 2.296 ± 0.716
PheTyr: 0.0 ± 0.0
PheXaa: 0.0 ± 0.0
Gly
GlyAla: 6.889 ± 0.944
GlyCys: 1.722 ± 0.766
GlyAsp: 6.889 ± 1.481
GlyGlu: 1.148 ± 0.723
GlyPhe: 1.722 ± 0.856
GlyGly: 6.889 ± 1.793
GlyHis: 1.722 ± 0.745
GlyIle: 3.444 ± 1.428
GlyLys: 7.463 ± 1.526
GlyLeu: 2.296 ± 0.767
GlyMet: 6.315 ± 1.513
GlyAsn: 4.592 ± 1.603
GlyPro: 1.148 ± 0.568
GlyGln: 1.148 ± 0.536
GlyArg: 2.87 ± 0.8
GlySer: 4.592 ± 0.892
GlyThr: 7.463 ± 1.512
GlyVal: 7.463 ± 1.782
GlyTrp: 0.0 ± 0.0
GlyTyr: 2.296 ± 1.022
GlyXaa: 0.0 ± 0.0
His
HisAla: 0.0 ± 0.0
HisCys: 0.0 ± 0.0
HisAsp: 0.574 ± 0.547
HisGlu: 0.0 ± 0.0
HisPhe: 2.296 ± 0.902
HisGly: 4.018 ± 1.375
HisHis: 2.87 ± 0.918
HisIle: 2.87 ± 1.202
HisLys: 0.574 ± 0.503
HisLeu: 1.148 ± 0.746
HisMet: 1.722 ± 0.794
HisAsn: 0.574 ± 0.503
HisPro: 2.87 ± 0.555
HisGln: 1.148 ± 0.511
HisArg: 6.315 ± 1.55
HisSer: 0.574 ± 0.503
HisThr: 1.722 ± 0.844
HisVal: 2.87 ± 1.187
HisTrp: 1.148 ± 0.536
HisTyr: 0.574 ± 0.503
HisXaa: 0.0 ± 0.0
Ile
IleAla: 7.463 ± 0.976
IleCys: 1.148 ± 0.511
IleAsp: 2.87 ± 0.555
IleGlu: 5.741 ± 1.252
IlePhe: 4.018 ± 1.186
IleGly: 2.296 ± 0.645
IleHis: 2.296 ± 0.68
IleIle: 6.889 ± 1.304
IleLys: 0.574 ± 0.412
IleLeu: 2.87 ± 1.129
IleMet: 0.574 ± 0.503
IleAsn: 5.166 ± 0.814
IlePro: 1.722 ± 1.237
IleGln: 1.722 ± 1.009
IleArg: 2.296 ± 0.809
IleSer: 2.296 ± 1.163
IleThr: 4.592 ± 1.197
IleVal: 2.87 ± 0.977
IleTrp: 1.148 ± 0.825
IleTyr: 1.722 ± 0.766
IleXaa: 0.0 ± 0.0
Lys
LysAla: 1.722 ± 0.873
LysCys: 0.574 ± 0.412
LysAsp: 2.296 ± 0.996
LysGlu: 1.148 ± 0.723
LysPhe: 1.722 ± 0.611
LysGly: 2.87 ± 0.805
LysHis: 1.722 ± 0.682
LysIle: 5.166 ± 1.01
LysLys: 0.574 ± 0.503
LysLeu: 5.166 ± 0.997
LysMet: 0.0 ± 0.0
LysAsn: 3.444 ± 1.236
LysPro: 1.148 ± 1.005
LysGln: 0.0 ± 0.0
LysArg: 3.444 ± 1.125
LysSer: 6.889 ± 0.512
LysThr: 5.741 ± 1.874
LysVal: 1.148 ± 0.536
LysTrp: 1.722 ± 1.061
LysTyr: 0.574 ± 0.505
LysXaa: 0.0 ± 0.0
Leu
LeuAla: 7.463 ± 2.177
LeuCys: 1.722 ± 0.674
LeuAsp: 6.315 ± 0.117
LeuGlu: 4.018 ± 1.694
LeuPhe: 0.574 ± 0.412
LeuGly: 6.315 ± 1.391
LeuHis: 2.296 ± 0.969
LeuIle: 3.444 ± 1.077
LeuLys: 5.741 ± 1.484
LeuLeu: 2.296 ± 0.459
LeuMet: 1.148 ± 0.511
LeuAsn: 1.722 ± 1.959
LeuPro: 7.463 ± 0.919
LeuGln: 1.722 ± 0.873
LeuArg: 1.148 ± 1.009
LeuSer: 3.444 ± 2.018
LeuThr: 6.889 ± 1.226
LeuVal: 2.87 ± 0.555
LeuTrp: 4.018 ± 1.186
LeuTyr: 0.0 ± 0.0
LeuXaa: 0.0 ± 0.0
Met
MetAla: 1.148 ± 0.568
MetCys: 0.574 ± 0.505
MetAsp: 2.296 ± 1.022
MetGlu: 2.87 ± 1.431
MetPhe: 1.722 ± 0.674
MetGly: 1.148 ± 0.825
MetHis: 1.148 ± 0.723
MetIle: 0.574 ± 0.517
MetLys: 0.0 ± 0.0
MetLeu: 5.741 ± 1.738
MetMet: 0.574 ± 0.653
MetAsn: 1.148 ± 0.62
MetPro: 3.444 ± 1.101
MetGln: 1.148 ± 0.649
MetArg: 1.722 ± 1.237
MetSer: 1.722 ± 1.043
MetThr: 0.574 ± 0.503
MetVal: 4.018 ± 1.077
MetTrp: 0.574 ± 0.412
MetTyr: 1.148 ± 0.751
MetXaa: 0.0 ± 0.0
Asn
AsnAla: 4.592 ± 1.135
AsnCys: 0.0 ± 0.0
AsnAsp: 4.592 ± 0.711
AsnGlu: 0.574 ± 0.503
AsnPhe: 2.296 ± 0.998
AsnGly: 7.463 ± 1.385
AsnHis: 3.444 ± 1.343
AsnIle: 2.296 ± 1.275
AsnLys: 1.722 ± 0.745
AsnLeu: 3.444 ± 1.468
AsnMet: 4.018 ± 0.853
AsnAsn: 1.722 ± 0.812
AsnPro: 0.574 ± 0.505
AsnGln: 1.148 ± 0.511
AsnArg: 0.574 ± 0.653
AsnSer: 0.574 ± 0.412
AsnThr: 3.444 ± 1.245
AsnVal: 3.444 ± 0.971
AsnTrp: 0.574 ± 0.503
AsnTyr: 0.574 ± 0.505
AsnXaa: 0.0 ± 0.0
Pro
ProAla: 1.722 ± 0.873
ProCys: 2.296 ± 0.996
ProAsp: 5.166 ± 1.243
ProGlu: 1.722 ± 0.674
ProPhe: 0.574 ± 0.412
ProGly: 5.166 ± 0.981
ProHis: 1.722 ± 0.674
ProIle: 0.574 ± 0.412
ProLys: 3.444 ± 1.066
ProLeu: 3.444 ± 1.907
ProMet: 0.0 ± 0.0
ProAsn: 1.148 ± 0.856
ProPro: 2.296 ± 0.996
ProGln: 1.148 ± 0.568
ProArg: 8.037 ± 2.236
ProSer: 0.574 ± 0.505
ProThr: 1.148 ± 0.625
ProVal: 4.592 ± 0.815
ProTrp: 0.574 ± 0.505
ProTyr: 0.574 ± 0.503
ProXaa: 0.0 ± 0.0
Gln
GlnAla: 0.574 ± 0.517
GlnCys: 0.574 ± 0.517
GlnAsp: 2.296 ± 0.809
GlnGlu: 1.722 ± 0.766
GlnPhe: 0.574 ± 0.412
GlnGly: 5.741 ± 1.003
GlnHis: 1.148 ± 0.581
GlnIle: 0.0 ± 0.0
GlnLys: 1.722 ± 0.856
GlnLeu: 0.574 ± 0.412
GlnMet: 0.0 ± 0.0
GlnAsn: 2.87 ± 0.428
GlnPro: 0.574 ± 0.505
GlnGln: 0.574 ± 0.504
GlnArg: 1.148 ± 0.581
GlnSer: 2.296 ± 1.475
GlnThr: 3.444 ± 1.218
GlnVal: 0.574 ± 0.412
GlnTrp: 1.148 ± 0.649
GlnTyr: 0.0 ± 0.0
GlnXaa: 0.0 ± 0.0
Arg
ArgAla: 10.907 ± 2.411
ArgCys: 0.574 ± 0.503
ArgAsp: 3.444 ± 0.44
ArgGlu: 2.87 ± 0.894
ArgPhe: 2.87 ± 0.705
ArgGly: 5.741 ± 0.812
ArgHis: 4.592 ± 1.541
ArgIle: 6.315 ± 0.949
ArgLys: 4.592 ± 0.993
ArgLeu: 5.741 ± 1.145
ArgMet: 2.296 ± 0.433
ArgAsn: 3.444 ± 1.439
ArgPro: 1.148 ± 0.745
ArgGln: 1.148 ± 0.65
ArgArg: 7.463 ± 1.604
ArgSer: 6.315 ± 1.662
ArgThr: 2.87 ± 0.428
ArgVal: 2.87 ± 1.164
ArgTrp: 0.574 ± 0.504
ArgTyr: 4.018 ± 0.896
ArgXaa: 0.0 ± 0.0
Ser
SerAla: 6.315 ± 1.138
SerCys: 0.0 ± 0.0
SerAsp: 1.722 ± 0.913
SerGlu: 0.0 ± 0.0
SerPhe: 4.018 ± 1.345
SerGly: 4.018 ± 1.854
SerHis: 1.148 ± 0.568
SerIle: 2.296 ± 1.649
SerLys: 3.444 ± 1.217
SerLeu: 4.018 ± 1.048
SerMet: 3.444 ± 1.541
SerAsn: 2.296 ± 0.996
SerPro: 4.592 ± 1.557
SerGln: 2.296 ± 1.235
SerArg: 6.889 ± 1.264
SerSer: 3.444 ± 1.954
SerThr: 2.296 ± 1.163
SerVal: 1.722 ± 0.812
SerTrp: 0.574 ± 0.505
SerTyr: 1.148 ± 0.706
SerXaa: 0.0 ± 0.0
Thr
ThrAla: 4.018 ± 1.647
ThrCys: 0.574 ± 0.504
ThrAsp: 1.722 ± 0.563
ThrGlu: 1.722 ± 1.061
ThrPhe: 1.722 ± 1.237
ThrGly: 4.592 ± 1.197
ThrHis: 1.148 ± 0.581
ThrIle: 5.166 ± 1.147
ThrLys: 4.592 ± 1.453
ThrLeu: 9.759 ± 2.663
ThrMet: 2.87 ± 0.805
ThrAsn: 3.444 ± 1.468
ThrPro: 4.592 ± 1.067
ThrGln: 2.296 ± 0.835
ThrArg: 5.166 ± 1.573
ThrSer: 5.166 ± 1.292
ThrThr: 2.87 ± 0.69
ThrVal: 3.444 ± 0.5
ThrTrp: 4.018 ± 1.413
ThrTyr: 0.574 ± 0.412
ThrXaa: 0.0 ± 0.0
Val
ValAla: 5.741 ± 1.048
ValCys: 1.148 ± 0.536
ValAsp: 2.296 ± 1.313
ValGlu: 4.018 ± 1.164
ValPhe: 0.0 ± 0.0
ValGly: 2.296 ± 0.827
ValHis: 0.574 ± 0.503
ValIle: 4.018 ± 1.472
ValLys: 5.741 ± 0.686
ValLeu: 5.166 ± 0.606
ValMet: 1.148 ± 0.768
ValAsn: 0.574 ± 0.504
ValPro: 2.87 ± 1.465
ValGln: 2.296 ± 1.131
ValArg: 4.592 ± 1.035
ValSer: 0.574 ± 0.412
ValThr: 1.722 ± 0.873
ValVal: 0.574 ± 0.505
ValTrp: 2.296 ± 0.996
ValTyr: 3.444 ± 1.116
ValXaa: 0.0 ± 0.0
Trp
TrpAla: 3.444 ± 1.348
TrpCys: 1.148 ± 0.723
TrpAsp: 2.296 ± 0.923
TrpGlu: 0.0 ± 0.0
TrpPhe: 2.296 ± 0.459
TrpGly: 4.592 ± 1.296
TrpHis: 0.0 ± 0.0
TrpIle: 0.0 ± 0.0
TrpLys: 0.574 ± 0.503
TrpLeu: 0.574 ± 0.517
TrpMet: 0.0 ± 0.0
TrpAsn: 4.018 ± 0.811
TrpPro: 0.0 ± 0.0
TrpGln: 0.0 ± 0.0
TrpArg: 0.0 ± 0.0
TrpSer: 2.87 ± 1.466
TrpThr: 4.018 ± 1.186
TrpVal: 0.0 ± 0.0
TrpTrp: 0.574 ± 0.517
TrpTyr: 1.722 ± 0.674
TrpXaa: 0.0 ± 0.0
Tyr
TyrAla: 2.87 ± 1.187
TyrCys: 2.296 ± 0.68
TyrAsp: 0.0 ± 0.0
TyrGlu: 0.0 ± 0.0
TyrPhe: 1.722 ± 0.719
TyrGly: 2.87 ± 1.187
TyrHis: 0.574 ± 0.505
TyrIle: 1.148 ± 0.745
TyrLys: 0.574 ± 0.653
TyrLeu: 1.148 ± 0.706
TyrMet: 1.148 ± 0.511
TyrAsn: 0.0 ± 0.0
TyrPro: 0.574 ± 0.517
TyrGln: 2.296 ± 0.68
TyrArg: 2.296 ± 0.946
TyrSer: 2.296 ± 0.809
TyrThr: 0.574 ± 0.505
TyrVal: 4.592 ± 0.502
TyrTrp: 1.722 ± 0.844
TyrTyr: 1.148 ± 0.62
TyrXaa: 0.0 ± 0.0
Xaa
XaaAla: 0.0 ± 0.0
XaaCys: 0.0 ± 0.0
XaaAsp: 0.0 ± 0.0
XaaGlu: 0.0 ± 0.0
XaaPhe: 0.0 ± 0.0
XaaGly: 0.0 ± 0.0
XaaHis: 0.0 ± 0.0
XaaIle: 0.0 ± 0.0
XaaLys: 0.0 ± 0.0
XaaLeu: 0.0 ± 0.0
XaaMet: 0.0 ± 0.0
XaaAsn: 0.0 ± 0.0
XaaPro: 0.0 ± 0.0
XaaGln: 0.0 ± 0.0
XaaArg: 0.0 ± 0.0
XaaSer: 0.0 ± 0.0
XaaThr: 0.0 ± 0.0
XaaVal: 0.0 ± 0.0
XaaTrp: 0.0 ± 0.0
XaaTyr: 0.574 ± 0.504
XaaXaa: 0.0 ± 0.0
Statistics based on 6 proteins (1743 amino acids)

Note: The error was estimated by bootstrapping (×100) at the protein level.
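One way such a protein-level bootstrap could look is sketched below: whole proteins are resampled with replacement, the dipeptide frequency is recomputed for each resample, and the spread of the resampled values is reported. Only "bootstrapping (×100) at the protein level" comes from the page; using the sample standard deviation as the error, the fixed random seed, the helper names, and the toy one-letter sequences are all assumptions of this example.

```python
import random
import statistics
from collections import Counter


def dipeptide_permille(proteins):
    """Return {dipeptide: frequency in per mille} over overlapping pairs."""
    counts = Counter()
    for seq in proteins:
        counts.update(seq[i:i + 2] for i in range(len(seq) - 1))
    total = sum(counts.values())
    if total == 0:
        return {}
    return {pair: 1000.0 * n / total for pair, n in counts.items()}


def bootstrap_error(proteins, pair, rounds=100, seed=0):
    """Estimate the error of one dipeptide frequency by resampling proteins.

    Assumption: the error is the sample standard deviation of the
    frequency over `rounds` resamples drawn with replacement.
    """
    rng = random.Random(seed)
    estimates = []
    for _ in range(rounds):
        # Resample whole proteins, keeping the proteome size unchanged.
        sample = rng.choices(proteins, k=len(proteins))
        estimates.append(dipeptide_permille(sample).get(pair, 0.0))
    return statistics.stdev(estimates)


# Hypothetical toy proteome (not the sequences behind the table).
toy_proteins = ["MKVLA", "MKA", "AKVMK", "LLAKV"]
mk_error = bootstrap_error(toy_proteins, "MK")
identical_error = bootstrap_error(["MKA", "MKA", "MKA"], "MK")
```

With only a handful of proteins, as in the 6-protein proteome here, each resample can differ a lot from the original set, which is consistent with the relatively large errors seen in the table.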

The dipeptide statistics above (along with other statistics for this proteome) can be downloaded from this CSV file.
See this proteome in: uniprot_link
Proteome-pI is available under the Creative Commons Attribution-NoDerivs license; for more details, see here.

Reference: Kozlowski LP. Proteome-pI 2.0: Proteome Isoelectric Point Database Update. Nucleic Acids Res. 2021, doi: 10.1093/nar/gkab944
Contact: Lukasz P. Kozlowski