Amino acid dipeptide frequency for Sanxia sobemo-like virus 4

Apart from single amino acid frequencies, one can also calculate the so-called amino acid dipeptide frequency.

There are 400 possibilities (441 if we consider Xaa as an additional 21st amino acid representing all non-standard or unknown amino acids). Thus, if all dipeptides were equally likely to occur in proteins, each of the 400 standard dipeptides would be present at a frequency of 0.25% (2.5‰). As this is not the case in nature, for better visibility all dipeptides that are more frequent than expected are marked in red, and those that are underrepresented are marked in blue in the table.
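The counting described above can be sketched in a few lines of Python. This is only a minimal illustration, not the code used to build this page: the function name `dipeptide_frequencies`, the choice to count overlapping pairs within each protein separately, and the mapping of every non-standard letter to `X` (Xaa) are all assumptions.

```python
from collections import Counter
from itertools import product

STANDARD = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard amino acids (one-letter codes)


def dipeptide_frequencies(proteins, per_mille=True):
    """Return the frequency of all 441 dipeptides (20 standard residues + Xaa)."""
    counts = Counter()
    for seq in proteins:
        # Assumption: any letter outside the 20 standard residues becomes X (Xaa).
        seq = "".join(aa if aa in STANDARD else "X" for aa in seq.upper())
        # Overlapping pairs; a pair never spans two different proteins.
        counts.update(a + b for a, b in zip(seq, seq[1:]))
    total = sum(counts.values())
    scale = 1000.0 if per_mille else 1.0
    alphabet = STANDARD + "X"
    return {
        a + b: (scale * counts[a + b] / total if total else 0.0)
        for a, b in product(alphabet, repeat=2)
    }


# Two short, made-up sequences (not the proteins of this proteome).
freqs = dipeptide_frequencies(["MKVLAA", "AAGX"])
print(len(freqs), freqs["AA"])
```

The two toy sequences contain 8 dipeptides in total, two of which are `AA`, so `AA` comes out far above the 2.5‰ expected under a uniform distribution.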

All values are given in per mille (‰), so they need to be multiplied by 10⁻³ to obtain fractions.
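As a small worked example of the unit conversion and of the over/under-representation idea, the sketch below converts per mille values to fractions and compares them with the uniform expectation of 1/400. The helper names `to_fraction` and `classify` are hypothetical, and using exactly 2.5‰ as the cut-off for colouring is an assumption; the page does not state the threshold it applies.

```python
# Uniform expectation for 400 equally likely dipeptides, in per mille.
EXPECTED_PER_MILLE = 1000.0 / 400  # 2.5


def to_fraction(per_mille):
    """Convert a per mille value to a plain fraction (multiply by 10^-3)."""
    return per_mille * 1e-3


def classify(per_mille, expected=EXPECTED_PER_MILLE):
    """Label a dipeptide relative to the assumed uniform expectation."""
    if per_mille > expected:
        return "over"   # would be marked red
    if per_mille < expected:
        return "under"  # would be marked blue
    return "expected"


# Three values taken from the table on this page.
sample = {"AlaAla": 3.222, "AlaCys": 1.074, "ValLeu": 11.815}
labels = {name: classify(value) for name, value in sample.items()}
print(labels)
print(to_fraction(sample["AlaAla"]))
```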

For more information, see the sequence space article on Wikipedia.

1st \ 2nd | Ala | Cys | Asp | Glu | Phe | Gly | His | Ile | Lys | Leu | Met | Asn | Pro | Gln | Arg | Ser | Thr | Val | Trp | Tyr | Xaa
Ala | 3.222 ± 1.985 | 1.074 ± 0.662 | 4.296 ± 0.468 | 3.222 ± 1.985 | 3.222 ± 0.427 | 3.222 ± 0.427 | 0.0 ± 0.0 | 3.222 ± 0.427 | 3.222 ± 0.427 | 4.296 ± 0.468 | 1.074 ± 0.662 | 4.296 ± 2.647 | 2.148 ± 1.323 | 3.222 ± 0.427 | 1.074 ± 0.662 | 4.296 ± 1.089 | 3.222 ± 0.427 | 7.519 ± 3.074 | 0.0 ± 0.0 | 3.222 ± 1.13 | 0.0 ± 0.0
Cys | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 2.148 ± 0.234 | 1.074 ± 0.662 | 0.0 ± 0.0 | 0.0 ± 0.0 | 3.222 ± 1.13 | 0.0 ± 0.0 | 2.148 ± 0.681 | 1.074 ± 0.662 | 1.074 ± 0.896 | 0.0 ± 0.0 | 0.0 ± 0.0 | 1.074 ± 0.662 | 1.074 ± 0.662 | 1.074 ± 0.896 | 1.074 ± 0.896 | 0.0 ± 0.0 | 0.0 ± 0.0
Asp | 3.222 ± 1.13 | 0.0 ± 0.0 | 1.074 ± 0.896 | 2.148 ± 0.234 | 2.148 ± 0.234 | 5.371 ± 2.922 | 1.074 ± 0.896 | 1.074 ± 0.896 | 1.074 ± 0.896 | 5.371 ± 0.193 | 2.148 ± 0.234 | 3.222 ± 0.427 | 4.296 ± 3.584 | 3.222 ± 0.427 | 5.371 ± 1.751 | 1.074 ± 0.896 | 6.445 ± 0.703 | 1.074 ± 0.896 | 3.222 ± 1.13 | 1.074 ± 0.662 | 0.0 ± 0.0
Glu | 0.0 ± 0.0 | 0.0 ± 0.0 | 3.222 ± 1.13 | 0.0 ± 0.0 | 4.296 ± 3.584 | 3.222 ± 1.985 | 2.148 ± 0.234 | 2.148 ± 0.234 | 5.371 ± 1.751 | 4.296 ± 2.647 | 5.371 ± 0.193 | 5.371 ± 0.193 | 3.222 ± 1.13 | 1.074 ± 0.662 | 6.445 ± 0.703 | 5.371 ± 1.751 | 0.0 ± 0.0 | 3.222 ± 0.427 | 2.148 ± 0.234 | 0.0 ± 0.0 | 0.0 ± 0.0
Phe | 4.296 ± 1.089 | 1.074 ± 0.896 | 1.074 ± 0.662 | 4.296 ± 0.468 | 1.074 ± 0.896 | 3.222 ± 0.427 | 0.0 ± 0.0 | 3.222 ± 2.688 | 0.0 ± 0.0 | 1.074 ± 0.662 | 2.148 ± 2.271 | 1.074 ± 0.662 | 0.0 ± 0.0 | 0.0 ± 0.0 | 1.074 ± 0.896 | 2.148 ± 1.323 | 2.148 ± 0.234 | 1.074 ± 0.896 | 1.074 ± 0.662 | 2.148 ± 0.234 | 0.0 ± 0.0
Gly | 6.445 ± 3.97 | 2.148 ± 0.234 | 3.222 ± 1.13 | 2.148 ± 0.234 | 3.222 ± 0.427 | 4.296 ± 2.026 | 0.0 ± 0.0 | 1.074 ± 0.662 | 7.519 ± 3.074 | 1.074 ± 0.896 | 3.222 ± 0.427 | 1.074 ± 0.896 | 1.074 ± 0.896 | 6.445 ± 0.703 | 5.371 ± 0.193 | 5.371 ± 3.309 | 2.148 ± 0.234 | 8.593 ± 2.178 | 2.148 ± 1.792 | 2.148 ± 0.234 | 0.0 ± 0.0
His | 1.074 ± 0.896 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 1.074 ± 0.896 | 3.222 ± 1.985 | 2.148 ± 0.234 | 0.0 ± 0.0 | 1.074 ± 0.896 | 1.074 ± 0.662 | 0.0 ± 0.0 | 2.148 ± 1.323 | 2.148 ± 1.792 | 2.148 ± 0.234 | 0.0 ± 0.0 | 2.148 ± 0.234 | 0.0 ± 0.0 | 1.074 ± 0.896 | 0.0 ± 0.0
Ile | 3.222 ± 1.985 | 0.0 ± 0.0 | 3.222 ± 2.688 | 1.074 ± 0.662 | 1.074 ± 0.896 | 4.296 ± 2.026 | 0.0 ± 0.0 | 2.148 ± 0.234 | 3.222 ± 0.427 | 3.222 ± 1.13 | 1.074 ± 0.896 | 1.074 ± 0.662 | 6.445 ± 0.703 | 0.0 ± 0.0 | 3.222 ± 0.427 | 2.148 ± 0.234 | 2.148 ± 0.234 | 2.148 ± 1.323 | 1.074 ± 0.896 | 2.148 ± 1.792 | 0.0 ± 0.0
Lys | 0.0 ± 0.0 | 2.148 ± 0.234 | 2.148 ± 1.792 | 2.148 ± 0.234 | 0.0 ± 0.0 | 1.074 ± 0.896 | 1.074 ± 0.896 | 3.222 ± 1.13 | 5.371 ± 1.751 | 6.445 ± 0.703 | 2.148 ± 0.234 | 6.445 ± 0.855 | 3.222 ± 0.427 | 1.074 ± 0.662 | 1.074 ± 0.662 | 4.296 ± 0.468 | 5.371 ± 1.751 | 4.296 ± 2.026 | 1.074 ± 0.662 | 2.148 ± 1.323 | 0.0 ± 0.0
Leu | 6.445 ± 0.703 | 2.148 ± 0.234 | 7.519 ± 1.517 | 8.593 ± 2.178 | 2.148 ± 1.792 | 1.074 ± 0.662 | 1.074 ± 0.896 | 3.222 ± 0.427 | 4.296 ± 1.089 | 9.667 ± 1.833 | 2.148 ± 1.792 | 2.148 ± 1.792 | 3.222 ± 0.427 | 2.148 ± 1.792 | 7.519 ± 1.599 | 5.371 ± 1.751 | 2.148 ± 1.792 | 8.593 ± 3.736 | 0.0 ± 0.0 | 2.148 ± 1.792 | 0.0 ± 0.0
Met | 2.148 ± 1.792 | 1.074 ± 0.896 | 1.074 ± 0.896 | 4.296 ± 1.089 | 1.074 ± 0.896 | 5.371 ± 0.193 | 1.074 ± 0.896 | 2.148 ± 0.234 | 1.074 ± 0.896 | 2.148 ± 0.234 | 1.074 ± 0.662 | 2.148 ± 0.234 | 2.148 ± 1.323 | 0.0 ± 0.0 | 6.445 ± 2.26 | 2.148 ± 1.323 | 3.222 ± 0.427 | 3.222 ± 0.427 | 1.074 ± 0.662 | 0.0 ± 0.0 | 0.0 ± 0.0
Asn | 3.222 ± 0.427 | 0.0 ± 0.0 | 2.148 ± 0.234 | 5.371 ± 0.193 | 1.074 ± 0.662 | 4.296 ± 0.468 | 4.296 ± 2.647 | 1.074 ± 0.662 | 2.148 ± 0.234 | 4.296 ± 1.089 | 1.074 ± 0.662 | 0.0 ± 0.0 | 1.074 ± 0.896 | 4.296 ± 1.089 | 0.0 ± 0.0 | 2.148 ± 0.234 | 1.074 ± 0.896 | 4.296 ± 2.647 | 1.074 ± 0.896 | 2.148 ± 0.234 | 0.0 ± 0.0
Pro | 0.0 ± 0.0 | 0.0 ± 0.0 | 1.074 ± 0.662 | 3.222 ± 1.13 | 0.0 ± 0.0 | 8.593 ± 0.937 | 2.148 ± 0.234 | 2.148 ± 1.792 | 3.222 ± 1.13 | 3.222 ± 0.427 | 0.0 ± 0.0 | 0.0 ± 0.0 | 3.222 ± 1.13 | 1.074 ± 0.662 | 5.371 ± 3.309 | 6.445 ± 0.703 | 4.296 ± 2.026 | 3.222 ± 0.427 | 0.0 ± 0.0 | 1.074 ± 0.896 | 0.0 ± 0.0
Gln | 1.074 ± 0.662 | 1.074 ± 0.662 | 2.148 ± 1.323 | 1.074 ± 0.662 | 1.074 ± 0.662 | 3.222 ± 1.985 | 1.074 ± 0.662 | 2.148 ± 1.792 | 3.222 ± 1.13 | 1.074 ± 0.662 | 3.222 ± 0.427 | 1.074 ± 0.896 | 3.222 ± 0.427 | 2.148 ± 1.323 | 4.296 ± 0.468 | 8.593 ± 2.178 | 1.074 ± 0.662 | 5.371 ± 0.193 | 1.074 ± 0.896 | 1.074 ± 0.896 | 0.0 ± 0.0
Arg | 6.445 ± 0.855 | 1.074 ± 0.896 | 5.371 ± 2.922 | 5.371 ± 0.193 | 4.296 ± 1.089 | 2.148 ± 1.323 | 1.074 ± 0.662 | 3.222 ± 1.985 | 3.222 ± 0.427 | 8.593 ± 0.937 | 3.222 ± 0.427 | 3.222 ± 0.427 | 3.222 ± 1.985 | 3.222 ± 1.13 | 4.296 ± 2.026 | 7.519 ± 0.041 | 3.222 ± 0.427 | 5.371 ± 1.364 | 0.0 ± 0.0 | 3.222 ± 1.13 | 0.0 ± 0.0
Ser | 6.445 ± 3.97 | 0.0 ± 0.0 | 3.222 ± 1.985 | 5.371 ± 1.751 | 3.222 ± 0.427 | 5.371 ± 1.364 | 0.0 ± 0.0 | 4.296 ± 0.468 | 2.148 ± 0.234 | 6.445 ± 2.26 | 0.0 ± 0.0 | 4.296 ± 2.647 | 4.296 ± 1.089 | 2.148 ± 0.234 | 8.593 ± 3.736 | 10.741 ± 1.944 | 1.074 ± 0.662 | 7.519 ± 1.599 | 0.0 ± 0.0 | 4.296 ± 0.468 | 0.0 ± 0.0
Thr | 3.222 ± 0.427 | 1.074 ± 0.662 | 4.296 ± 2.026 | 1.074 ± 0.896 | 2.148 ± 1.323 | 7.519 ± 4.632 | 0.0 ± 0.0 | 3.222 ± 1.13 | 0.0 ± 0.0 | 4.296 ± 0.468 | 1.074 ± 0.896 | 2.148 ± 0.234 | 0.0 ± 0.0 | 4.296 ± 2.647 | 4.296 ± 0.468 | 2.148 ± 1.792 | 2.148 ± 1.323 | 2.148 ± 0.234 | 2.148 ± 0.234 | 1.074 ± 0.896 | 0.0 ± 0.0
Val | 4.296 ± 2.647 | 1.074 ± 0.896 | 4.296 ± 3.584 | 2.148 ± 0.234 | 1.074 ± 0.896 | 3.222 ± 1.985 | 1.074 ± 0.662 | 2.148 ± 1.792 | 1.074 ± 0.896 | 11.815 ± 2.067 | 6.445 ± 0.855 | 2.148 ± 1.323 | 4.296 ± 0.468 | 10.741 ± 1.944 | 6.445 ± 2.413 | 4.296 ± 1.089 | 5.371 ± 3.309 | 3.222 ± 0.427 | 1.074 ± 0.896 | 1.074 ± 0.662 | 0.0 ± 0.0
Trp | 1.074 ± 0.662 | 1.074 ± 0.662 | 1.074 ± 0.896 | 2.148 ± 1.792 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 1.074 ± 0.896 | 1.074 ± 0.662 | 1.074 ± 0.662 | 2.148 ± 0.234 | 0.0 ± 0.0 | 0.0 ± 0.0 | 2.148 ± 1.792 | 2.148 ± 0.234 | 2.148 ± 1.792 | 1.074 ± 0.896 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0
Tyr | 3.222 ± 0.427 | 0.0 ± 0.0 | 3.222 ± 0.427 | 3.222 ± 1.13 | 0.0 ± 0.0 | 2.148 ± 1.323 | 2.148 ± 1.792 | 0.0 ± 0.0 | 2.148 ± 1.792 | 3.222 ± 0.427 | 2.148 ± 0.234 | 1.074 ± 0.896 | 2.148 ± 1.792 | 1.074 ± 0.896 | 2.148 ± 1.792 | 0.0 ± 0.0 | 0.0 ± 0.0 | 2.148 ± 0.234 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0
Xaa | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0
Statistics based on 2 proteins (932 amino acids)

Note: The error has been estimated by bootstrapping (100 replicates) at the protein level.
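A protein-level bootstrap of this kind can be sketched as follows. This is a hedged illustration rather than the procedure actually used for this page: the function names, the use of the standard deviation across replicates as the reported error, and the fixed random seed are all assumptions.

```python
import random
import statistics


def dipeptide_per_mille(proteins, dipeptide):
    """Frequency (in per mille) of one dipeptide over a list of sequences."""
    total = 0
    hits = 0
    for seq in proteins:
        pairs = [seq[i:i + 2] for i in range(len(seq) - 1)]  # overlapping pairs
        total += len(pairs)
        hits += pairs.count(dipeptide)
    return 1000.0 * hits / total if total else 0.0


def bootstrap_error(proteins, dipeptide, n_rounds=100, seed=0):
    """Resample whole proteins with replacement and return the spread.

    Assumption: the error is the standard deviation of the frequency
    across the bootstrap replicates.
    """
    rng = random.Random(seed)
    estimates = []
    for _ in range(n_rounds):
        # Draw as many proteins as the proteome has, with replacement.
        sample = rng.choices(proteins, k=len(proteins))
        estimates.append(dipeptide_per_mille(sample, dipeptide))
    return statistics.pstdev(estimates)


# Two made-up sequences standing in for a two-protein proteome.
proteins = ["AAAA", "CCCC"]
print(bootstrap_error(proteins, "AA"))
print(bootstrap_error(proteins, "GG"))  # a dipeptide that never occurs
```

A dipeptide that is absent from every protein has zero frequency in every replicate, so its estimated error is also zero, which matches the `0.0 ± 0.0` entries in the table. With only two proteins in the proteome, the replicates can take just a few distinct values, so the error estimates are coarse.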

The dipeptide statistics above (among other statistics for this proteome) can be downloaded from this CSV file
See this proteome in: uniprot_link
Proteome-pI is available under the Creative Commons Attribution-NoDerivs license; for more details see here

Reference: Kozlowski LP. Proteome-pI 2.0: Proteome Isoelectric Point Database Update. Nucleic Acids Res. 2021, doi: 10.1093/nar/gkab944
Contact: Lukasz P. Kozlowski