Amino acid dipeptide frequency for Genomoviridae sp.

Apart from single amino acid frequencies, one can also calculate the so-called amino acid dipeptide frequency.

There are 400 possible dipeptides (441 if Xaa is counted as an additional 21st amino acid representing all non-standard or unknown amino acids). Thus, if dipeptides occurred randomly in proteins, each of the 400 standard dipeptides would be present at a frequency of 0.25%. As this is not the case in nature, for better visibility all overrepresented dipeptides are marked in red and all underrepresented dipeptides are marked in blue in the table.
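As a rough illustration of the calculation described above, the following Python sketch counts overlapping dipeptides, converts the counts to per mille, and compares each value with the uniform expectation of 1/400. The names `dipeptide_frequencies` and `classify`, the toy sequences, and the choice to count pairs within each protein separately are assumptions made for this example; the database's actual implementation is not described on this page.

```python
from collections import Counter

STANDARD = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues, one-letter codes
EXPECTED = 1000.0 / 400            # uniform expectation: 2.5 per mille


def dipeptide_frequencies(proteins):
    """Return {dipeptide: frequency in per mille} pooled over all proteins."""
    counts = Counter()
    for seq in proteins:
        # Map anything outside the 20 standard residues to X (Xaa).
        seq = "".join(c if c in STANDARD else "X" for c in seq.upper())
        # Overlapping pairs; pairs never span two proteins (an assumption).
        counts.update(seq[i:i + 2] for i in range(len(seq) - 1))
    total = sum(counts.values())
    if total == 0:
        return {}
    return {dp: 1000.0 * n / total for dp, n in counts.items()}


def classify(freq_per_mille):
    """Label a frequency relative to the uniform expectation."""
    if freq_per_mille > EXPECTED:
        return "over"
    if freq_per_mille < EXPECTED:
        return "under"
    return "expected"


# Hypothetical toy sequences, not taken from the proteome.
toy = ["MKRRSA", "GSLLA"]
freqs = dipeptide_frequencies(toy)
for dp, f in sorted(freqs.items()):
    print(dp, round(f, 3), classify(f))
```

With so few residues every observed dipeptide is far above 2.5‰, which also shows why small proteomes such as this one give coarse, strongly quantized values.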

All values are given in per mille (‰) and therefore need to be multiplied by 10⁻³ to obtain fractions. On this scale, the random expectation of 0.25% corresponds to 2.5‰.
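The unit conversion can be sketched in a few lines of Python. The helper names below are hypothetical; the AlaGly value is taken from the table, and the fold change relative to the uniform expectation is only one possible way to read the numbers.

```python
def per_mille_to_fraction(value):
    """Convert a per mille value to a plain fraction (multiply by 10^-3)."""
    return value * 1e-3


def per_mille_to_percent(value):
    """Convert a per mille value to percent (1 per mille = 0.1 %)."""
    return value * 0.1


expected_per_mille = 1000.0 / 400   # 0.25 % expressed in per mille
ala_gly = 8.141                     # AlaGly value from the table, in per mille

fraction = per_mille_to_fraction(ala_gly)
percent = per_mille_to_percent(ala_gly)
fold = ala_gly / expected_per_mille  # ratio to the uniform expectation

print(fraction, percent, fold)
```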

For more information, see the sequence space article on Wikipedia.

Ala Cys Asp Glu Phe Gly His Ile Lys Leu Met Asn Pro Gln Arg Ser Thr Val Trp Tyr Xaa
Ala
AlaAla: 1.357 ± 0.939
AlaCys: 0.0 ± 0.0
AlaAsp: 8.141 ± 5.355
AlaGlu: 4.071 ± 1.86
AlaPhe: 0.0 ± 0.0
AlaGly: 8.141 ± 0.795
AlaHis: 2.714 ± 0.974
AlaIle: 1.357 ± 1.061
AlaLys: 2.714 ± 0.974
AlaLeu: 0.0 ± 0.0
AlaMet: 2.714 ± 2.123
AlaAsn: 4.071 ± 1.419
AlaPro: 2.714 ± 0.974
AlaGln: 4.071 ± 1.658
AlaArg: 5.427 ± 2.386
AlaSer: 6.784 ± 5.307
AlaThr: 8.141 ± 0.795
AlaVal: 2.714 ± 1.323
AlaTrp: 0.0 ± 0.0
AlaTyr: 0.0 ± 0.0
AlaXaa: 0.0 ± 0.0
Cys
CysAla: 0.0 ± 0.0
CysCys: 0.0 ± 0.0
CysAsp: 1.357 ± 1.105
CysGlu: 1.357 ± 1.061
CysPhe: 0.0 ± 0.0
CysGly: 0.0 ± 0.0
CysHis: 1.357 ± 1.061
CysIle: 1.357 ± 1.105
CysLys: 0.0 ± 0.0
CysLeu: 0.0 ± 0.0
CysMet: 0.0 ± 0.0
CysAsn: 1.357 ± 1.105
CysPro: 0.0 ± 0.0
CysGln: 0.0 ± 0.0
CysArg: 2.714 ± 1.323
CysSer: 0.0 ± 0.0
CysThr: 2.714 ± 0.974
CysVal: 0.0 ± 0.0
CysTrp: 0.0 ± 0.0
CysTyr: 1.357 ± 0.939
CysXaa: 0.0 ± 0.0
Asp
AspAla: 4.071 ± 2.129
AspCys: 0.0 ± 0.0
AspAsp: 5.427 ± 3.756
AspGlu: 5.427 ± 3.756
AspPhe: 4.071 ± 1.86
AspGly: 5.427 ± 1.949
AspHis: 1.357 ± 1.105
AspIle: 4.071 ± 3.316
AspLys: 4.071 ± 1.562
AspLeu: 10.855 ± 2.953
AspMet: 1.357 ± 0.929
AspAsn: 4.071 ± 0.398
AspPro: 5.427 ± 0.574
AspGln: 1.357 ± 0.939
AspArg: 2.714 ± 0.83
AspSer: 4.071 ± 0.398
AspThr: 0.0 ± 0.0
AspVal: 2.714 ± 0.974
AspTrp: 4.071 ± 2.195
AspTyr: 2.714 ± 0.974
AspXaa: 0.0 ± 0.0
Glu
GluAla: 1.357 ± 1.105
GluCys: 1.357 ± 1.105
GluAsp: 4.071 ± 1.658
GluGlu: 0.0 ± 0.0
GluPhe: 0.0 ± 0.0
GluGly: 2.714 ± 1.323
GluHis: 0.0 ± 0.0
GluIle: 2.714 ± 1.323
GluLys: 1.357 ± 0.939
GluLeu: 5.427 ± 1.439
GluMet: 1.357 ± 0.939
GluAsn: 1.357 ± 1.061
GluPro: 1.357 ± 1.105
GluGln: 4.071 ± 2.195
GluArg: 8.141 ± 1.043
GluSer: 4.071 ± 1.419
GluThr: 0.0 ± 0.0
GluVal: 1.357 ± 0.939
GluTrp: 0.0 ± 0.0
GluTyr: 0.0 ± 0.0
GluXaa: 0.0 ± 0.0
Phe
PheAla: 4.071 ± 2.195
PheCys: 1.357 ± 1.061
PheAsp: 4.071 ± 2.195
PheGlu: 1.357 ± 1.105
PhePhe: 2.714 ± 0.974
PheGly: 2.714 ± 0.974
PheHis: 2.714 ± 2.211
PheIle: 0.0 ± 0.0
PheLys: 2.714 ± 1.878
PheLeu: 5.427 ± 2.658
PheMet: 1.357 ± 1.061
PheAsn: 1.357 ± 0.939
PhePro: 1.357 ± 1.105
PheGln: 0.0 ± 0.0
PheArg: 1.357 ± 1.105
PheSer: 4.071 ± 1.562
PheThr: 4.071 ± 1.562
PheVal: 4.071 ± 1.419
PheTrp: 0.0 ± 0.0
PheTyr: 0.0 ± 0.0
PheXaa: 0.0 ± 0.0
Gly
GlyAla: 6.784 ± 2.31
GlyCys: 0.0 ± 0.0
GlyAsp: 8.141 ± 3.124
GlyGlu: 2.714 ± 2.123
GlyPhe: 4.071 ± 2.817
GlyGly: 9.498 ± 3.035
GlyHis: 0.0 ± 0.0
GlyIle: 1.357 ± 1.061
GlyLys: 0.0 ± 0.0
GlyLeu: 9.498 ± 3.152
GlyMet: 2.714 ± 2.123
GlyAsn: 2.714 ± 1.878
GlyPro: 4.071 ± 1.562
GlyGln: 4.071 ± 1.562
GlyArg: 2.714 ± 0.974
GlySer: 10.855 ± 0.757
GlyThr: 10.855 ± 2.303
GlyVal: 2.714 ± 1.878
GlyTrp: 1.357 ± 1.105
GlyTyr: 1.357 ± 0.939
GlyXaa: 0.0 ± 0.0
His
HisAla: 1.357 ± 1.105
HisCys: 2.714 ± 0.974
HisAsp: 1.357 ± 1.105
HisGlu: 1.357 ± 0.939
HisPhe: 0.0 ± 0.0
HisGly: 2.714 ± 0.83
HisHis: 1.357 ± 1.061
HisIle: 5.427 ± 1.273
HisLys: 0.0 ± 0.0
HisLeu: 4.071 ± 2.195
HisMet: 1.357 ± 1.061
HisAsn: 1.357 ± 1.061
HisPro: 4.071 ± 0.398
HisGln: 0.0 ± 0.0
HisArg: 0.0 ± 0.0
HisSer: 1.357 ± 1.061
HisThr: 1.357 ± 1.061
HisVal: 2.714 ± 2.211
HisTrp: 0.0 ± 0.0
HisTyr: 0.0 ± 0.0
HisXaa: 0.0 ± 0.0
Ile
IleAla: 1.357 ± 1.105
IleCys: 0.0 ± 0.0
IleAsp: 4.071 ± 1.419
IleGlu: 1.357 ± 0.939
IlePhe: 4.071 ± 1.658
IleGly: 5.427 ± 0.574
IleHis: 0.0 ± 0.0
IleIle: 5.427 ± 1.273
IleLys: 2.714 ± 1.878
IleLeu: 9.498 ± 3.183
IleMet: 2.714 ± 1.323
IleAsn: 0.0 ± 0.0
IlePro: 0.0 ± 0.0
IleGln: 0.0 ± 0.0
IleArg: 5.427 ± 1.949
IleSer: 4.071 ± 3.184
IleThr: 1.357 ± 0.939
IleVal: 4.071 ± 1.86
IleTrp: 2.714 ± 0.974
IleTyr: 1.357 ± 0.939
IleXaa: 0.0 ± 0.0
Lys
LysAla: 1.357 ± 0.939
LysCys: 0.0 ± 0.0
LysAsp: 5.427 ± 1.949
LysGlu: 1.357 ± 0.939
LysPhe: 1.357 ± 1.105
LysGly: 2.714 ± 1.878
LysHis: 0.0 ± 0.0
LysIle: 1.357 ± 0.939
LysLys: 1.357 ± 0.939
LysLeu: 2.714 ± 2.123
LysMet: 0.0 ± 0.0
LysAsn: 0.0 ± 0.0
LysPro: 0.0 ± 0.0
LysGln: 1.357 ± 0.939
LysArg: 8.141 ± 2.442
LysSer: 4.071 ± 1.562
LysThr: 4.071 ± 2.817
LysVal: 0.0 ± 0.0
LysTrp: 0.0 ± 0.0
LysTyr: 1.357 ± 1.105
LysXaa: 0.0 ± 0.0
Leu
LeuAla: 6.784 ± 2.535
LeuCys: 1.357 ± 1.061
LeuAsp: 5.427 ± 1.439
LeuGlu: 6.784 ± 3.382
LeuPhe: 4.071 ± 1.562
LeuGly: 5.427 ± 0.574
LeuHis: 4.071 ± 2.129
LeuIle: 6.784 ± 2.398
LeuLys: 2.714 ± 0.83
LeuLeu: 10.855 ± 2.344
LeuMet: 1.357 ± 1.061
LeuAsn: 2.714 ± 0.83
LeuPro: 2.714 ± 0.83
LeuGln: 2.714 ± 2.123
LeuArg: 5.427 ± 0.574
LeuSer: 9.498 ± 3.38
LeuThr: 5.427 ± 4.246
LeuVal: 4.071 ± 2.129
LeuTrp: 4.071 ± 1.658
LeuTyr: 2.714 ± 1.878
LeuXaa: 0.0 ± 0.0
Met
MetAla: 5.427 ± 2.658
MetCys: 0.0 ± 0.0
MetAsp: 4.071 ± 0.398
MetGlu: 0.0 ± 0.0
MetPhe: 2.714 ± 0.83
MetGly: 1.357 ± 0.939
MetHis: 0.0 ± 0.0
MetIle: 0.0 ± 0.0
MetLys: 1.357 ± 0.939
MetLeu: 2.714 ± 2.123
MetMet: 0.0 ± 0.0
MetAsn: 0.0 ± 0.0
MetPro: 0.0 ± 0.0
MetGln: 5.427 ± 2.658
MetArg: 2.714 ± 1.878
MetSer: 4.071 ± 1.658
MetThr: 2.714 ± 1.323
MetVal: 0.0 ± 0.0
MetTrp: 0.0 ± 0.0
MetTyr: 0.0 ± 0.0
MetXaa: 0.0 ± 0.0
Asn
AsnAla: 4.071 ± 3.184
AsnCys: 0.0 ± 0.0
AsnAsp: 2.714 ± 0.974
AsnGlu: 0.0 ± 0.0
AsnPhe: 0.0 ± 0.0
AsnGly: 2.714 ± 0.974
AsnHis: 0.0 ± 0.0
AsnIle: 4.071 ± 1.562
AsnLys: 0.0 ± 0.0
AsnLeu: 4.071 ± 2.129
AsnMet: 1.357 ± 0.939
AsnAsn: 0.0 ± 0.0
AsnPro: 0.0 ± 0.0
AsnGln: 1.357 ± 1.105
AsnArg: 4.071 ± 2.817
AsnSer: 2.714 ± 2.123
AsnThr: 0.0 ± 0.0
AsnVal: 0.0 ± 0.0
AsnTrp: 0.0 ± 0.0
AsnTyr: 0.0 ± 0.0
AsnXaa: 0.0 ± 0.0
Pro
ProAla: 5.427 ± 0.574
ProCys: 0.0 ± 0.0
ProAsp: 0.0 ± 0.0
ProGlu: 1.357 ± 1.061
ProPhe: 1.357 ± 0.939
ProGly: 5.427 ± 1.949
ProHis: 2.714 ± 1.323
ProIle: 2.714 ± 1.878
ProLys: 0.0 ± 0.0
ProLeu: 1.357 ± 1.105
ProMet: 2.714 ± 0.83
ProAsn: 1.357 ± 1.105
ProPro: 1.357 ± 1.061
ProGln: 0.0 ± 0.0
ProArg: 2.714 ± 1.878
ProSer: 5.427 ± 1.439
ProThr: 4.071 ± 1.562
ProVal: 1.357 ± 1.061
ProTrp: 2.714 ± 0.974
ProTyr: 1.357 ± 1.105
ProXaa: 0.0 ± 0.0
Gln
GlnAla: 0.0 ± 0.0
GlnCys: 1.357 ± 1.105
GlnAsp: 1.357 ± 0.939
GlnGlu: 4.071 ± 1.658
GlnPhe: 1.357 ± 1.105
GlnGly: 4.071 ± 1.658
GlnHis: 2.714 ± 0.83
GlnIle: 2.714 ± 1.323
GlnLys: 0.0 ± 0.0
GlnLeu: 1.357 ± 0.939
GlnMet: 0.0 ± 0.0
GlnAsn: 0.0 ± 0.0
GlnPro: 0.0 ± 0.0
GlnGln: 0.0 ± 0.0
GlnArg: 1.357 ± 1.061
GlnSer: 6.784 ± 2.31
GlnThr: 1.357 ± 0.939
GlnVal: 1.357 ± 1.061
GlnTrp: 1.357 ± 1.105
GlnTyr: 0.0 ± 0.0
GlnXaa: 0.0 ± 0.0
Arg
ArgAla: 1.357 ± 1.105
ArgCys: 0.0 ± 0.0
ArgAsp: 4.071 ± 1.419
ArgGlu: 4.071 ± 3.316
ArgPhe: 4.071 ± 1.86
ArgGly: 8.141 ± 2.442
ArgHis: 4.071 ± 2.129
ArgIle: 8.141 ± 2.839
ArgLys: 6.784 ± 2.428
ArgLeu: 4.071 ± 1.419
ArgMet: 2.714 ± 0.945
ArgAsn: 0.0 ± 0.0
ArgPro: 5.427 ± 2.386
ArgGln: 0.0 ± 0.0
ArgArg: 12.212 ± 3.846
ArgSer: 5.427 ± 2.259
ArgThr: 2.714 ± 1.878
ArgVal: 2.714 ± 1.878
ArgTrp: 1.357 ± 0.939
ArgTyr: 4.071 ± 1.86
ArgXaa: 0.0 ± 0.0
Ser
SerAla: 8.141 ± 2.49
SerCys: 1.357 ± 1.105
SerAsp: 4.071 ± 1.419
SerGlu: 2.714 ± 1.323
SerPhe: 8.141 ± 2.46
SerGly: 6.784 ± 2.31
SerHis: 0.0 ± 0.0
SerIle: 1.357 ± 1.061
SerLys: 2.714 ± 1.323
SerLeu: 12.212 ± 8.279
SerMet: 4.071 ± 1.419
SerAsn: 2.714 ± 1.323
SerPro: 8.141 ± 1.043
SerGln: 4.071 ± 0.398
SerArg: 6.784 ± 3.273
SerSer: 9.498 ± 2.901
SerThr: 5.427 ± 2.658
SerVal: 2.714 ± 0.83
SerTrp: 2.714 ± 0.83
SerTyr: 1.357 ± 0.939
SerXaa: 0.0 ± 0.0
Thr
ThrAla: 5.427 ± 3.756
ThrCys: 1.357 ± 1.061
ThrAsp: 4.071 ± 2.817
ThrGlu: 0.0 ± 0.0
ThrPhe: 5.427 ± 2.646
ThrGly: 5.427 ± 0.574
ThrHis: 4.071 ± 0.398
ThrIle: 1.357 ± 0.939
ThrLys: 4.071 ± 0.398
ThrLeu: 8.141 ± 3.317
ThrMet: 0.0 ± 0.0
ThrAsn: 1.357 ± 1.061
ThrPro: 2.714 ± 0.974
ThrGln: 0.0 ± 0.0
ThrArg: 2.714 ± 1.323
ThrSer: 5.427 ± 3.756
ThrThr: 1.357 ± 1.061
ThrVal: 1.357 ± 0.939
ThrTrp: 0.0 ± 0.0
ThrTyr: 6.784 ± 2.756
ThrXaa: 0.0 ± 0.0
Val
ValAla: 1.357 ± 1.105
ValCys: 1.357 ± 0.939
ValAsp: 1.357 ± 1.105
ValGlu: 1.357 ± 0.939
ValPhe: 1.357 ± 1.105
ValGly: 2.714 ± 1.878
ValHis: 0.0 ± 0.0
ValIle: 4.071 ± 2.817
ValLys: 2.714 ± 1.878
ValLeu: 2.714 ± 1.323
ValMet: 1.357 ± 0.939
ValAsn: 0.0 ± 0.0
ValPro: 2.714 ± 1.323
ValGln: 1.357 ± 1.061
ValArg: 2.714 ± 1.323
ValSer: 4.071 ± 3.184
ValThr: 4.071 ± 1.562
ValVal: 1.357 ± 1.061
ValTrp: 0.0 ± 0.0
ValTyr: 1.357 ± 0.939
ValXaa: 0.0 ± 0.0
Trp
TrpAla: 0.0 ± 0.0
TrpCys: 1.357 ± 1.105
TrpAsp: 1.357 ± 1.105
TrpGlu: 0.0 ± 0.0
TrpPhe: 0.0 ± 0.0
TrpGly: 1.357 ± 1.105
TrpHis: 5.427 ± 0.574
TrpIle: 1.357 ± 0.939
TrpLys: 0.0 ± 0.0
TrpLeu: 0.0 ± 0.0
TrpMet: 1.357 ± 1.061
TrpAsn: 2.714 ± 0.974
TrpPro: 0.0 ± 0.0
TrpGln: 1.357 ± 1.061
TrpArg: 2.714 ± 0.974
TrpSer: 1.357 ± 1.061
TrpThr: 0.0 ± 0.0
TrpVal: 1.357 ± 0.939
TrpTrp: 0.0 ± 0.0
TrpTyr: 0.0 ± 0.0
TrpXaa: 0.0 ± 0.0
Tyr
TyrAla: 5.427 ± 2.901
TyrCys: 0.0 ± 0.0
TyrAsp: 2.714 ± 1.878
TyrGlu: 1.357 ± 1.105
TyrPhe: 0.0 ± 0.0
TyrGly: 2.714 ± 0.974
TyrHis: 0.0 ± 0.0
TyrIle: 0.0 ± 0.0
TyrLys: 1.357 ± 0.939
TyrLeu: 0.0 ± 0.0
TyrMet: 2.714 ± 1.743
TyrAsn: 0.0 ± 0.0
TyrPro: 1.357 ± 1.105
TyrGln: 0.0 ± 0.0
TyrArg: 1.357 ± 0.939
TyrSer: 1.357 ± 1.105
TyrThr: 1.357 ± 0.939
TyrVal: 1.357 ± 0.939
TyrTrp: 1.357 ± 0.939
TyrTyr: 1.357 ± 0.939
TyrXaa: 0.0 ± 0.0
Xaa
XaaAla: 0.0 ± 0.0
XaaCys: 0.0 ± 0.0
XaaAsp: 0.0 ± 0.0
XaaGlu: 0.0 ± 0.0
XaaPhe: 0.0 ± 0.0
XaaGly: 0.0 ± 0.0
XaaHis: 0.0 ± 0.0
XaaIle: 0.0 ± 0.0
XaaLys: 0.0 ± 0.0
XaaLeu: 0.0 ± 0.0
XaaMet: 0.0 ± 0.0
XaaAsn: 0.0 ± 0.0
XaaPro: 0.0 ± 0.0
XaaGln: 0.0 ± 0.0
XaaArg: 0.0 ± 0.0
XaaSer: 0.0 ± 0.0
XaaThr: 0.0 ± 0.0
XaaVal: 0.0 ± 0.0
XaaTrp: 0.0 ± 0.0
XaaTyr: 0.0 ± 0.0
XaaXaa: 0.0 ± 0.0
Statistics based on 3 proteins (738 amino acids)
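For readers who want to work with the table entries directly from this page, the sketch below parses lines of the form `AlaGly: 8.141 ± 0.795` into a dictionary. The function name `parse_entries` is hypothetical, and the pattern is written only against the entry layout visible on this page, not against the downloadable CSV file, whose column layout is not shown here.

```python
import re

# Two three-letter residue codes, a colon, the value and its error.
ENTRY = re.compile(
    r"([A-Z][a-z]{2})([A-Z][a-z]{2}):\s*([0-9.]+)\s*±\s*([0-9.]+)"
)


def parse_entries(lines):
    """Return {(first, second): (per_mille, error)} for every matching line."""
    table = {}
    for line in lines:
        match = ENTRY.search(line)  # search() tolerates a leading cell value
        if match is None:
            continue                # e.g. row labels such as "Ala"
        first, second, value, error = match.groups()
        table[(first, second)] = (float(value), float(error))
    return table


sample = ["Ala", "8.141AlaGly: 8.141 ± 0.795", "AlaCys: 0.0 ± 0.0"]
parsed = parse_entries(sample)
print(parsed[("Ala", "Gly")])
```

Using `search` rather than `match` means an entry is still recognized when the cell value is repeated in front of the dipeptide name.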

Note: The error has been estimated by bootstrapping (×100) at the protein level.
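A minimal sketch of what protein-level bootstrapping could look like is given below: whole proteins are resampled with replacement 100 times, the dipeptide frequency is recomputed for each resample, and the spread of the estimates is reported. The function names, the toy sequences, and the use of the standard deviation as the error measure are assumptions; the page does not state which statistic the database reports.

```python
import random
import statistics
from collections import Counter


def dipeptide_per_mille(proteins, dipeptide):
    """Frequency of one dipeptide (one-letter codes) in per mille."""
    counts = Counter()
    for seq in proteins:
        counts.update(seq[i:i + 2] for i in range(len(seq) - 1))
    total = sum(counts.values())
    return 1000.0 * counts[dipeptide] / total if total else 0.0


def bootstrap_error(proteins, dipeptide, n_rounds=100, seed=0):
    """Standard deviation of the frequency over protein-level resamples."""
    rng = random.Random(seed)  # fixed seed keeps the sketch reproducible
    estimates = []
    for _ in range(n_rounds):
        # Draw as many proteins as in the proteome, with replacement.
        sample = rng.choices(proteins, k=len(proteins))
        estimates.append(dipeptide_per_mille(sample, dipeptide))
    return statistics.pstdev(estimates)


# Hypothetical toy proteome, not the three proteins behind the table.
toy = ["MKRRSA", "GSLLA", "ARRG"]
print(dipeptide_per_mille(toy, "RR"), bootstrap_error(toy, "RR"))
```

With only three proteins each resample can differ substantially from the original set, which is consistent with the large relative errors seen for many entries in the table.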

The above dipeptide statistics (among other statistics for this proteome) can be downloaded from this CSV file.
See this proteome in: uniprot_link
Proteome-pI is available under the Creative Commons Attribution-NoDerivs license; for more details see here.

Reference: Kozlowski LP. Proteome-pI 2.0: Proteome Isoelectric Point Database Update. Nucleic Acids Res. 2021, doi: 10.1093/nar/gkab944
Contact: Lukasz P. Kozlowski