Amino acid dipeptide frequency for Beihai sobemo-like virus 22

Apart from single amino acid frequencies, one can also calculate the so-called amino acid dipeptide frequency.

There are 400 possible dipeptides (441 if Xaa is counted as an additional, 21st amino acid representing all non-standard or unknown amino acids). Thus, if dipeptides occurred randomly in proteins, each of the 400 standard dipeptides would be present at a frequency of 1/400 = 0.25% (2.5‰). As this is not the case in nature, for better visibility all dipeptides that occur more often than expected are marked in red in the table, and those that are underrepresented are marked in blue.
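
The counting described above can be sketched in a few lines of Python. This is a minimal illustration, not the code used by Proteome-pI: the function names (`dipeptide_permille`, `classify`) and the choice to count overlapping pairs within each protein separately are assumptions made for the example.

```python
from collections import Counter

# One-letter codes of the 20 standard amino acids; anything else becomes X (Xaa).
STANDARD = "ACDEFGHIKLMNPQRSTVWY"

# Uniform expectation for the 400 standard dipeptides: 1/400 = 0.25 % = 2.5 per mille.
UNIFORM_PERMILLE = 1000.0 / 400


def dipeptide_permille(proteins):
    """Return {dipeptide: frequency in per mille} for a list of sequences.

    Hypothetical helper: counts overlapping residue pairs inside each
    protein, so no pair spans the boundary between two proteins.
    """
    counts = Counter()
    for seq in proteins:
        # Map non-standard or unknown residues to X.
        seq = "".join(c if c in STANDARD else "X" for c in seq.upper())
        counts.update(seq[i:i + 2] for i in range(len(seq) - 1))
    total = sum(counts.values())
    if total == 0:
        return {}
    return {pair: 1000.0 * n / total for pair, n in counts.items()}


def classify(freq_permille):
    """Label a frequency relative to the uniform expectation."""
    if freq_permille > UNIFORM_PERMILLE:
        return "over"   # would be marked in red
    if freq_permille < UNIFORM_PERMILLE:
        return "under"  # would be marked in blue
    return "expected"


if __name__ == "__main__":
    # Toy sequences, not taken from the proteome described here.
    freqs = dipeptide_permille(["ACAC", "AC"])
    for pair, value in sorted(freqs.items()):
        print(pair, value, classify(value))
```

The sketch compares against the flat 2.5‰ baseline named in the text; a baseline built from the observed single amino acid frequencies would be a different, stricter notion of "expected".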

All values are given in per mille (‰), so they need to be multiplied by 10⁻³ to obtain fractions.
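
As a small worked example of the unit conversion, the sketch below turns a per mille value into a fraction and a percentage. The helper names are hypothetical; the value 8.574 is the AlaAla entry from the table.

```python
def permille_to_fraction(value_permille):
    """Per mille -> fraction (multiply by 10^-3)."""
    return value_permille * 1e-3


def permille_to_percent(value_permille):
    """Per mille -> percent (divide by 10)."""
    return value_permille / 10.0


if __name__ == "__main__":
    ala_ala = 8.574  # AlaAla entry from the table, in per mille
    print("fraction:", permille_to_fraction(ala_ala))
    print("percent: ", permille_to_percent(ala_ala))
    # The uniform expectation of 2.5 per mille is the 0.25 % quoted in the text.
    print("uniform: ", permille_to_percent(2.5), "%")
```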

For more information, see the sequence space article on Wikipedia.

Ala Cys Asp Glu Phe Gly His Ile Lys Leu Met Asn Pro Gln Arg Ser Thr Val Trp Tyr Xaa
Ala
AlaAla: 8.574 ± 4.539
AlaCys: 4.287 ± 2.27
AlaAsp: 3.215 ± 2.862
AlaGlu: 3.215 ± 0.231
AlaPhe: 2.144 ± 1.185
AlaGly: 9.646 ± 3.787
AlaHis: 2.144 ± 0.361
AlaIle: 4.287 ± 0.824
AlaLys: 4.287 ± 2.371
AlaLeu: 8.574 ± 1.648
AlaMet: 3.215 ± 0.231
AlaAsn: 2.144 ± 0.361
AlaPro: 2.144 ± 0.361
AlaGln: 4.287 ± 0.824
AlaArg: 3.215 ± 0.231
AlaSer: 2.144 ± 0.361
AlaThr: 4.287 ± 0.824
AlaVal: 5.359 ± 1.417
AlaTrp: 3.215 ± 1.315
AlaTyr: 2.144 ± 1.185
AlaXaa: 0.0 ± 0.0
Cys
CysAla: 1.072 ± 0.593
CysCys: 0.0 ± 0.0
CysAsp: 1.072 ± 0.954
CysGlu: 1.072 ± 0.954
CysPhe: 0.0 ± 0.0
CysGly: 1.072 ± 0.593
CysHis: 1.072 ± 0.593
CysIle: 1.072 ± 0.954
CysLys: 0.0 ± 0.0
CysLeu: 0.0 ± 0.0
CysMet: 0.0 ± 0.0
CysAsn: 0.0 ± 0.0
CysPro: 2.144 ± 1.185
CysGln: 1.072 ± 0.954
CysArg: 0.0 ± 0.0
CysSer: 0.0 ± 0.0
CysThr: 0.0 ± 0.0
CysVal: 1.072 ± 0.954
CysTrp: 0.0 ± 0.0
CysTyr: 1.072 ± 0.593
CysXaa: 0.0 ± 0.0
Asp
AspAla: 3.215 ± 0.231
AspCys: 0.0 ± 0.0
AspAsp: 2.144 ± 0.361
AspGlu: 5.359 ± 1.677
AspPhe: 4.287 ± 0.824
AspGly: 2.144 ± 0.361
AspHis: 0.0 ± 0.0
AspIle: 1.072 ± 0.954
AspLys: 1.072 ± 0.954
AspLeu: 4.287 ± 0.824
AspMet: 0.0 ± 0.0
AspAsn: 4.287 ± 0.723
AspPro: 3.215 ± 0.231
AspGln: 0.0 ± 0.0
AspArg: 2.144 ± 0.361
AspSer: 2.144 ± 1.185
AspThr: 3.215 ± 0.231
AspVal: 3.215 ± 0.231
AspTrp: 2.144 ± 1.908
AspTyr: 1.072 ± 0.593
AspXaa: 0.0 ± 0.0
Glu
GluAla: 7.503 ± 2.602
GluCys: 0.0 ± 0.0
GluAsp: 5.359 ± 0.13
GluGlu: 3.215 ± 1.778
GluPhe: 3.215 ± 1.315
GluGly: 1.072 ± 0.593
GluHis: 0.0 ± 0.0
GluIle: 8.574 ± 1.446
GluLys: 6.431 ± 1.084
GluLeu: 5.359 ± 0.13
GluMet: 1.072 ± 0.593
GluAsn: 5.359 ± 0.13
GluPro: 3.215 ± 2.862
GluGln: 2.144 ± 0.361
GluArg: 4.287 ± 0.723
GluSer: 2.144 ± 0.361
GluThr: 2.144 ± 0.361
GluVal: 8.574 ± 3.195
GluTrp: 1.072 ± 0.954
GluTyr: 3.215 ± 0.231
GluXaa: 0.0 ± 0.0
Phe
PheAla: 3.215 ± 0.231
PheCys: 0.0 ± 0.0
PheAsp: 0.0 ± 0.0
PheGlu: 2.144 ± 1.908
PhePhe: 1.072 ± 0.954
PheGly: 4.287 ± 2.27
PheHis: 0.0 ± 0.0
PheIle: 0.0 ± 0.0
PheLys: 1.072 ± 0.593
PheLeu: 1.072 ± 0.954
PheMet: 1.072 ± 0.593
PheAsn: 4.287 ± 0.824
PhePro: 2.144 ± 0.361
PheGln: 1.072 ± 0.954
PheArg: 1.072 ± 0.954
PheSer: 5.359 ± 0.13
PheThr: 3.215 ± 2.862
PheVal: 3.215 ± 1.778
PheTrp: 1.072 ± 0.593
PheTyr: 0.0 ± 0.0
PheXaa: 0.0 ± 0.0
Gly
GlyAla: 6.431 ± 0.463
GlyCys: 0.0 ± 0.0
GlyAsp: 5.359 ± 0.13
GlyGlu: 2.144 ± 1.185
GlyPhe: 3.215 ± 0.231
GlyGly: 5.359 ± 1.417
GlyHis: 3.215 ± 1.315
GlyIle: 1.072 ± 0.593
GlyLys: 6.431 ± 0.463
GlyLeu: 1.072 ± 0.593
GlyMet: 2.144 ± 1.185
GlyAsn: 0.0 ± 0.0
GlyPro: 4.287 ± 0.723
GlyGln: 1.072 ± 0.593
GlyArg: 3.215 ± 0.231
GlySer: 7.503 ± 1.055
GlyThr: 6.431 ± 0.463
GlyVal: 5.359 ± 3.224
GlyTrp: 1.072 ± 0.954
GlyTyr: 4.287 ± 2.27
GlyXaa: 0.0 ± 0.0
His
HisAla: 0.0 ± 0.0
HisCys: 0.0 ± 0.0
HisAsp: 0.0 ± 0.0
HisGlu: 1.072 ± 0.954
HisPhe: 3.215 ± 0.231
HisGly: 1.072 ± 0.954
HisHis: 0.0 ± 0.0
HisIle: 0.0 ± 0.0
HisLys: 3.215 ± 1.315
HisLeu: 3.215 ± 0.231
HisMet: 0.0 ± 0.0
HisAsn: 0.0 ± 0.0
HisPro: 1.072 ± 0.593
HisGln: 0.0 ± 0.0
HisArg: 3.215 ± 1.315
HisSer: 0.0 ± 0.0
HisThr: 3.215 ± 0.231
HisVal: 1.072 ± 0.593
HisTrp: 0.0 ± 0.0
HisTyr: 0.0 ± 0.0
HisXaa: 0.0 ± 0.0
Ile
IleAla: 5.359 ± 0.13
IleCys: 0.0 ± 0.0
IleAsp: 0.0 ± 0.0
IleGlu: 2.144 ± 1.908
IlePhe: 1.072 ± 0.954
IleGly: 3.215 ± 1.315
IleHis: 1.072 ± 0.593
IleIle: 3.215 ± 2.862
IleLys: 2.144 ± 1.908
IleLeu: 3.215 ± 0.231
IleMet: 2.144 ± 1.185
IleAsn: 0.0 ± 0.0
IlePro: 2.144 ± 0.361
IleGln: 1.072 ± 0.954
IleArg: 7.503 ± 3.585
IleSer: 2.144 ± 0.361
IleThr: 3.215 ± 1.778
IleVal: 3.215 ± 0.231
IleTrp: 0.0 ± 0.0
IleTyr: 0.0 ± 0.0
IleXaa: 0.0 ± 0.0
Lys
LysAla: 4.287 ± 0.824
LysCys: 0.0 ± 0.0
LysAsp: 2.144 ± 0.361
LysGlu: 4.287 ± 0.824
LysPhe: 2.144 ± 0.361
LysGly: 3.215 ± 1.315
LysHis: 1.072 ± 0.954
LysIle: 3.215 ± 2.862
LysLys: 3.215 ± 0.231
LysLeu: 8.574 ± 0.101
LysMet: 2.144 ± 1.185
LysAsn: 4.287 ± 2.371
LysPro: 0.0 ± 0.0
LysGln: 1.072 ± 0.593
LysArg: 1.072 ± 0.593
LysSer: 2.144 ± 0.361
LysThr: 6.431 ± 2.009
LysVal: 6.431 ± 2.009
LysTrp: 1.072 ± 0.954
LysTyr: 3.215 ± 1.315
LysXaa: 0.0 ± 0.0
Leu
LeuAla: 4.287 ± 0.824
LeuCys: 1.072 ± 0.593
LeuAsp: 1.072 ± 0.593
LeuGlu: 7.503 ± 2.602
LeuPhe: 4.287 ± 2.27
LeuGly: 5.359 ± 3.224
LeuHis: 2.144 ± 0.361
LeuIle: 3.215 ± 1.315
LeuLys: 8.574 ± 1.648
LeuLeu: 5.359 ± 1.417
LeuMet: 3.215 ± 0.231
LeuAsn: 1.072 ± 0.593
LeuPro: 2.144 ± 1.185
LeuGln: 6.431 ± 1.084
LeuArg: 7.503 ± 2.602
LeuSer: 3.215 ± 1.778
LeuThr: 7.503 ± 0.492
LeuVal: 7.503 ± 1.055
LeuTrp: 0.0 ± 0.0
LeuTyr: 2.144 ± 0.361
LeuXaa: 0.0 ± 0.0
Met
MetAla: 3.215 ± 0.231
MetCys: 0.0 ± 0.0
MetAsp: 1.072 ± 0.954
MetGlu: 1.072 ± 0.593
MetPhe: 2.144 ± 1.908
MetGly: 1.072 ± 0.593
MetHis: 2.144 ± 1.185
MetIle: 2.144 ± 0.361
MetLys: 3.215 ± 1.778
MetLeu: 0.0 ± 0.0
MetMet: 1.072 ± 0.954
MetAsn: 4.287 ± 2.371
MetPro: 0.0 ± 0.0
MetGln: 1.072 ± 0.593
MetArg: 2.144 ± 1.185
MetSer: 1.072 ± 0.593
MetThr: 2.144 ± 0.361
MetVal: 0.0 ± 0.0
MetTrp: 1.072 ± 0.954
MetTyr: 1.072 ± 0.593
MetXaa: 0.0 ± 0.0
Asn
AsnAla: 5.359 ± 1.417
AsnCys: 0.0 ± 0.0
AsnAsp: 2.144 ± 1.185
AsnGlu: 1.072 ± 0.593
AsnPhe: 0.0 ± 0.0
AsnGly: 4.287 ± 0.824
AsnHis: 0.0 ± 0.0
AsnIle: 4.287 ± 0.723
AsnLys: 3.215 ± 1.315
AsnLeu: 1.072 ± 0.593
AsnMet: 1.072 ± 0.593
AsnAsn: 1.072 ± 0.593
AsnPro: 1.072 ± 0.593
AsnGln: 2.144 ± 0.361
AsnArg: 4.287 ± 0.723
AsnSer: 3.215 ± 1.315
AsnThr: 2.144 ± 1.185
AsnVal: 4.287 ± 2.371
AsnTrp: 2.144 ± 1.908
AsnTyr: 1.072 ± 0.593
AsnXaa: 0.0 ± 0.0
Pro
ProAla: 4.287 ± 0.723
ProCys: 0.0 ± 0.0
ProAsp: 3.215 ± 0.231
ProGlu: 6.431 ± 2.009
ProPhe: 0.0 ± 0.0
ProGly: 3.215 ± 0.231
ProHis: 2.144 ± 0.361
ProIle: 2.144 ± 0.361
ProLys: 2.144 ± 0.361
ProLeu: 4.287 ± 2.371
ProMet: 0.0 ± 0.0
ProAsn: 1.072 ± 0.593
ProPro: 1.072 ± 0.593
ProGln: 2.144 ± 0.361
ProArg: 1.072 ± 0.954
ProSer: 3.215 ± 0.231
ProThr: 4.287 ± 0.824
ProVal: 3.215 ± 1.315
ProTrp: 1.072 ± 0.593
ProTyr: 1.072 ± 0.593
ProXaa: 0.0 ± 0.0
Gln
GlnAla: 2.144 ± 1.185
GlnCys: 1.072 ± 0.593
GlnAsp: 0.0 ± 0.0
GlnGlu: 6.431 ± 2.631
GlnPhe: 1.072 ± 0.954
GlnGly: 2.144 ± 0.361
GlnHis: 1.072 ± 0.954
GlnIle: 2.144 ± 0.361
GlnLys: 2.144 ± 0.361
GlnLeu: 5.359 ± 0.13
GlnMet: 1.072 ± 0.593
GlnAsn: 0.0 ± 0.0
GlnPro: 2.144 ± 0.361
GlnGln: 1.072 ± 0.954
GlnArg: 3.215 ± 0.231
GlnSer: 0.0 ± 0.0
GlnThr: 0.0 ± 0.0
GlnVal: 4.287 ± 2.27
GlnTrp: 1.072 ± 0.954
GlnTyr: 5.359 ± 0.13
GlnXaa: 0.0 ± 0.0
Arg
ArgAla: 5.359 ± 0.13
ArgCys: 3.215 ± 0.231
ArgAsp: 2.144 ± 1.185
ArgGlu: 8.574 ± 1.446
ArgPhe: 0.0 ± 0.0
ArgGly: 3.215 ± 1.315
ArgHis: 1.072 ± 0.593
ArgIle: 1.072 ± 0.954
ArgLys: 2.144 ± 1.185
ArgLeu: 8.574 ± 1.446
ArgMet: 1.072 ± 0.954
ArgAsn: 6.431 ± 0.463
ArgPro: 1.072 ± 0.593
ArgGln: 3.215 ± 1.315
ArgArg: 4.287 ± 2.371
ArgSer: 3.215 ± 1.778
ArgThr: 1.072 ± 0.593
ArgVal: 3.215 ± 0.231
ArgTrp: 0.0 ± 0.0
ArgTyr: 1.072 ± 0.593
ArgXaa: 0.0 ± 0.0
Ser
SerAla: 4.287 ± 0.824
SerCys: 1.072 ± 0.954
SerAsp: 1.072 ± 0.593
SerGlu: 2.144 ± 1.185
SerPhe: 0.0 ± 0.0
SerGly: 3.215 ± 0.231
SerHis: 0.0 ± 0.0
SerIle: 1.072 ± 0.593
SerLys: 3.215 ± 0.231
SerLeu: 6.431 ± 2.009
SerMet: 2.144 ± 1.53
SerAsn: 1.072 ± 0.593
SerPro: 5.359 ± 0.13
SerGln: 6.431 ± 2.631
SerArg: 2.144 ± 0.361
SerSer: 4.287 ± 2.27
SerThr: 4.287 ± 0.824
SerVal: 2.144 ± 1.185
SerTrp: 1.072 ± 0.593
SerTyr: 2.144 ± 1.908
SerXaa: 0.0 ± 0.0
Thr
ThrAla: 6.431 ± 1.084
ThrCys: 0.0 ± 0.0
ThrAsp: 7.503 ± 1.055
ThrGlu: 4.287 ± 2.371
ThrPhe: 4.287 ± 0.723
ThrGly: 1.072 ± 0.593
ThrHis: 0.0 ± 0.0
ThrIle: 3.215 ± 0.231
ThrLys: 1.072 ± 0.593
ThrLeu: 8.574 ± 1.446
ThrMet: 2.144 ± 0.361
ThrAsn: 4.287 ± 2.27
ThrPro: 4.287 ± 2.371
ThrGln: 1.072 ± 0.593
ThrArg: 0.0 ± 0.0
ThrSer: 3.215 ± 1.778
ThrThr: 4.287 ± 0.824
ThrVal: 6.431 ± 2.009
ThrTrp: 0.0 ± 0.0
ThrTyr: 1.072 ± 0.954
ThrXaa: 0.0 ± 0.0
Val
ValAla: 4.287 ± 0.824
ValCys: 1.072 ± 0.593
ValAsp: 4.287 ± 0.723
ValGlu: 5.359 ± 1.677
ValPhe: 2.144 ± 1.185
ValGly: 9.646 ± 2.24
ValHis: 1.072 ± 0.954
ValIle: 1.072 ± 0.593
ValLys: 4.287 ± 0.824
ValLeu: 5.359 ± 0.13
ValMet: 2.144 ± 1.908
ValAsn: 2.144 ± 1.185
ValPro: 7.503 ± 2.602
ValGln: 2.144 ± 0.361
ValArg: 7.503 ± 4.149
ValSer: 2.144 ± 0.361
ValThr: 3.215 ± 0.231
ValVal: 8.574 ± 4.741
ValTrp: 2.144 ± 0.361
ValTyr: 3.215 ± 0.231
ValXaa: 0.0 ± 0.0
Trp
TrpAla: 0.0 ± 0.0
TrpCys: 0.0 ± 0.0
TrpAsp: 1.072 ± 0.954
TrpGlu: 2.144 ± 1.908
TrpPhe: 0.0 ± 0.0
TrpGly: 0.0 ± 0.0
TrpHis: 0.0 ± 0.0
TrpIle: 0.0 ± 0.0
TrpLys: 2.144 ± 0.361
TrpLeu: 0.0 ± 0.0
TrpMet: 2.144 ± 0.399
TrpAsn: 2.144 ± 1.908
TrpPro: 1.072 ± 0.954
TrpGln: 2.144 ± 0.361
TrpArg: 1.072 ± 0.954
TrpSer: 2.144 ± 0.361
TrpThr: 1.072 ± 0.954
TrpVal: 1.072 ± 0.593
TrpTrp: 0.0 ± 0.0
TrpTyr: 1.072 ± 0.954
TrpXaa: 0.0 ± 0.0
Tyr
TyrAla: 3.215 ± 1.315
TyrCys: 0.0 ± 0.0
TyrAsp: 2.144 ± 0.361
TyrGlu: 4.287 ± 0.824
TyrPhe: 1.072 ± 0.954
TyrGly: 5.359 ± 0.13
TyrHis: 2.144 ± 0.361
TyrIle: 0.0 ± 0.0
TyrLys: 0.0 ± 0.0
TyrLeu: 3.215 ± 0.231
TyrMet: 1.072 ± 0.593
TyrAsn: 0.0 ± 0.0
TyrPro: 0.0 ± 0.0
TyrGln: 2.144 ± 0.361
TyrArg: 1.072 ± 0.593
TyrSer: 5.359 ± 0.13
TyrThr: 1.072 ± 0.593
TyrVal: 1.072 ± 0.954
TyrTrp: 1.072 ± 0.954
TyrTyr: 3.215 ± 1.315
TyrXaa: 0.0 ± 0.0
Xaa
XaaAla: 0.0 ± 0.0
XaaCys: 0.0 ± 0.0
XaaAsp: 0.0 ± 0.0
XaaGlu: 0.0 ± 0.0
XaaPhe: 0.0 ± 0.0
XaaGly: 0.0 ± 0.0
XaaHis: 0.0 ± 0.0
XaaIle: 0.0 ± 0.0
XaaLys: 0.0 ± 0.0
XaaLeu: 0.0 ± 0.0
XaaMet: 0.0 ± 0.0
XaaAsn: 0.0 ± 0.0
XaaPro: 0.0 ± 0.0
XaaGln: 0.0 ± 0.0
XaaArg: 0.0 ± 0.0
XaaSer: 0.0 ± 0.0
XaaThr: 0.0 ± 0.0
XaaVal: 0.0 ± 0.0
XaaTrp: 0.0 ± 0.0
XaaTyr: 0.0 ± 0.0
XaaXaa: 0.0 ± 0.0
Statistics based on 2 proteins (934 amino acids)

Note: The error has been estimated by bootstrapping (×100) at the protein level.
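
A protein-level bootstrap of this kind might look like the sketch below: whole proteins are resampled with replacement, the dipeptide frequency is recomputed for each resample, and the spread of the 100 estimates is reported. The function names are hypothetical, and taking the standard deviation as the ± value is an assumption; the page does not say which spread measure is used.

```python
import random
import statistics
from collections import Counter


def pair_permille(proteins, pair):
    """Frequency (per mille) of one dipeptide over a list of sequences."""
    counts = Counter()
    for seq in proteins:
        # Overlapping pairs, counted within each protein.
        counts.update(seq[i:i + 2] for i in range(len(seq) - 1))
    total = sum(counts.values())
    return 1000.0 * counts[pair] / total if total else 0.0


def bootstrap_error(proteins, pair, n_rounds=100, seed=0):
    """Spread of the frequency under protein-level resampling.

    Assumption: the error is the standard deviation of the
    bootstrap estimates; the source does not specify this.
    """
    rng = random.Random(seed)
    estimates = []
    for _ in range(n_rounds):
        # Draw as many proteins as the proteome has, with replacement.
        sample = rng.choices(proteins, k=len(proteins))
        estimates.append(pair_permille(sample, pair))
    return statistics.pstdev(estimates)


if __name__ == "__main__":
    # Toy sequences, not taken from the proteome described here.
    toy = ["ACACGG", "GGGGAC"]
    print(pair_permille(toy, "AC"), "±", bootstrap_error(toy, "AC"))
```

With only two proteins, as here, a resample can take just three distinct compositions (the first protein twice, the second twice, or one of each), so the estimated errors are necessarily coarse.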

The above dipeptide statistics (among other statistics for this proteome) can be downloaded from this CSV file
See this proteome in: uniprot_link
Proteome-pI is available under Creative Commons Attribution-NoDerivs license, for more details see here

Reference: Kozlowski LP. Proteome-pI 2.0: Proteome Isoelectric Point Database Update. Nucleic Acids Res. 2021, doi: 10.1093/nar/gkab944
Contact: Lukasz P. Kozlowski