Amino acid dipeptide frequency for Beihai sobemo-like virus 20

Apart from single amino acid frequencies, one can also calculate the so-called amino acid dipeptide frequency, that is, how often each ordered pair of adjacent residues occurs.
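As a minimal sketch of such a calculation (not the Proteome-pI implementation, which is not described on this page), the Python snippet below counts overlapping pairs of adjacent residues within each protein and normalises by the total number of pairs. The function name `dipeptide_frequencies`, the one-letter input format, and the decision not to count pairs across protein boundaries are assumptions made for illustration only.

```python
from collections import Counter

# The 20 standard amino acids in one-letter code.
STANDARD = set("ACDEFGHIKLMNPQRSTVWY")

def dipeptide_frequencies(proteins):
    """Return {dipeptide: frequency in per mille} for one-letter sequences.

    Any residue outside the 20 standard amino acids is mapped to 'X' (Xaa).
    Pairs are counted within each protein, never across two proteins
    (an assumption of this sketch).
    """
    counts = Counter()
    for seq in proteins:
        cleaned = "".join(aa if aa in STANDARD else "X" for aa in seq.upper())
        # Overlapping window of length 2: positions (0,1), (1,2), ...
        for i in range(len(cleaned) - 1):
            counts[cleaned[i:i + 2]] += 1
    total = sum(counts.values())
    if total == 0:
        return {}
    return {pair: 1000.0 * n / total for pair, n in counts.items()}

# Toy input (hypothetical sequences, not the proteins of this proteome).
freqs = dipeptide_frequencies(["MKKLA", "AKKB"])
```

In the toy input there are seven pairs in total, two of which are `KK`; the non-standard `B` is folded into `X`, so the last pair of the second sequence is reported as `KX`.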

There are 400 possible dipeptides (441 if Xaa is counted as an additional 21st amino acid representing all non-standard or unknown amino acids). Thus, if dipeptides occurred uniformly at random in proteins, each amino acid dipeptide would be present at a frequency of 0.25% (2.5‰). As this is not the case in nature, for better visibility dipeptides that occur more often than expected are marked in red in the table, and those that are underrepresented are marked in blue.
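The uniform expectation can be checked with a few lines of Python. The sketch below is illustrative: the helper names are hypothetical, and the simple greater-than or less-than comparison is an assumption, since the page does not state the exact rule used to colour the cells.

```python
def expected_per_mille(alphabet_size=20):
    """Uniform expectation for one dipeptide, in per mille."""
    return 1000.0 / alphabet_size ** 2

def classify(observed_per_mille, alphabet_size=20):
    """Label a dipeptide relative to the uniform expectation (assumed rule)."""
    expected = expected_per_mille(alphabet_size)
    if observed_per_mille > expected:
        return "over"
    if observed_per_mille < expected:
        return "under"
    return "as expected"

uniform_20 = expected_per_mille(20)   # 1000 / 400
uniform_21 = expected_per_mille(21)   # 1000 / 441, with Xaa as a 21st letter

# Two values taken from the table on this page.
glu_pro = classify(11.377)   # GluPro
ala_cys = classify(1.138)    # AlaCys
```

With 20 amino acids the expectation is 2.5‰, so under this assumed rule GluPro (11.377‰) counts as overrepresented and AlaCys (1.138‰) as underrepresented.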

All values are given in per mille (‰) and therefore need to be multiplied by 10^-3 to obtain fractions.

For more information, see the sequence space article on Wikipedia.

Ala Cys Asp Glu Phe Gly His Ile Lys Leu Met Asn Pro Gln Arg Ser Thr Val Trp Tyr Xaa
Ala
AlaAla: 6.826 ± 4.27
AlaCys: 1.138 ± 0.712
AlaAsp: 6.826 ± 2.511
AlaGlu: 3.413 ± 3.144
AlaPhe: 1.138 ± 1.048
AlaGly: 0.0 ± 0.0
AlaHis: 1.138 ± 1.048
AlaIle: 3.413 ± 0.375
AlaLys: 3.413 ± 2.135
AlaLeu: 5.688 ± 0.039
AlaMet: 1.138 ± 1.131
AlaAsn: 5.688 ± 1.799
AlaPro: 2.275 ± 1.423
AlaGln: 6.826 ± 2.511
AlaArg: 1.138 ± 0.712
AlaSer: 4.551 ± 2.847
AlaThr: 7.964 ± 0.297
AlaVal: 0.0 ± 0.0
AlaTrp: 2.275 ± 0.336
AlaTyr: 1.138 ± 1.048
AlaXaa: 0.0 ± 0.0
Cys
CysAla: 1.138 ± 0.712
CysCys: 0.0 ± 0.0
CysAsp: 0.0 ± 0.0
CysGlu: 1.138 ± 1.048
CysPhe: 1.138 ± 0.712
CysGly: 1.138 ± 1.048
CysHis: 1.138 ± 0.712
CysIle: 1.138 ± 1.048
CysLys: 1.138 ± 1.048
CysLeu: 0.0 ± 0.0
CysMet: 2.275 ± 1.423
CysAsn: 0.0 ± 0.0
CysPro: 2.275 ± 0.336
CysGln: 0.0 ± 0.0
CysArg: 0.0 ± 0.0
CysSer: 1.138 ± 1.048
CysThr: 0.0 ± 0.0
CysVal: 0.0 ± 0.0
CysTrp: 0.0 ± 0.0
CysTyr: 2.275 ± 2.096
CysXaa: 0.0 ± 0.0
Asp
AspAla: 0.0 ± 0.0
AspCys: 1.138 ± 0.712
AspAsp: 3.413 ± 1.384
AspGlu: 9.101 ± 3.934
AspPhe: 2.275 ± 1.423
AspGly: 1.138 ± 1.048
AspHis: 1.138 ± 0.712
AspIle: 2.275 ± 0.336
AspLys: 1.138 ± 1.048
AspLeu: 4.551 ± 2.433
AspMet: 2.275 ± 0.336
AspAsn: 2.275 ± 1.423
AspPro: 1.138 ± 1.048
AspGln: 1.138 ± 1.048
AspArg: 5.688 ± 0.039
AspSer: 3.413 ± 2.135
AspThr: 3.413 ± 0.375
AspVal: 2.275 ± 0.336
AspTrp: 2.275 ± 2.096
AspTyr: 3.413 ± 2.135
AspXaa: 0.0 ± 0.0
Glu
GluAla: 6.826 ± 0.751
GluCys: 0.0 ± 0.0
GluAsp: 5.688 ± 3.559
GluGlu: 7.964 ± 4.982
GluPhe: 2.275 ± 0.336
GluGly: 9.101 ± 1.345
GluHis: 0.0 ± 0.0
GluIle: 5.688 ± 1.721
GluLys: 4.551 ± 1.087
GluLeu: 4.551 ± 0.673
GluMet: 2.275 ± 0.421
GluAsn: 3.413 ± 0.375
GluPro: 11.377 ± 1.838
GluGln: 3.413 ± 0.375
GluArg: 3.413 ± 1.384
GluSer: 4.551 ± 2.847
GluThr: 2.275 ± 0.336
GluVal: 6.826 ± 1.009
GluTrp: 1.138 ± 0.712
GluTyr: 1.138 ± 1.048
GluXaa: 0.0 ± 0.0
Phe
PheAla: 4.551 ± 0.673
PheCys: 1.138 ± 1.048
PheAsp: 2.275 ± 0.336
PheGlu: 2.275 ± 0.336
PhePhe: 1.138 ± 1.048
PheGly: 3.413 ± 1.384
PheHis: 1.138 ± 1.048
PheIle: 3.413 ± 1.384
PheLys: 0.0 ± 0.0
PheLeu: 4.551 ± 1.087
PheMet: 0.0 ± 0.0
PheAsn: 1.138 ± 1.048
PhePro: 1.138 ± 0.712
PheGln: 1.138 ± 0.712
PheArg: 3.413 ± 0.375
PheSer: 0.0 ± 0.0
PheThr: 2.275 ± 2.096
PheVal: 3.413 ± 0.375
PheTrp: 1.138 ± 0.712
PheTyr: 0.0 ± 0.0
PheXaa: 0.0 ± 0.0
Gly
GlyAla: 2.275 ± 0.336
GlyCys: 2.275 ± 2.096
GlyAsp: 2.275 ± 0.336
GlyGlu: 1.138 ± 1.048
GlyPhe: 5.688 ± 1.721
GlyGly: 1.138 ± 1.048
GlyHis: 1.138 ± 0.712
GlyIle: 2.275 ± 2.096
GlyLys: 2.275 ± 2.096
GlyLeu: 5.688 ± 0.039
GlyMet: 0.0 ± 0.0
GlyAsn: 1.138 ± 1.048
GlyPro: 4.551 ± 2.433
GlyGln: 2.275 ± 2.096
GlyArg: 4.551 ± 0.673
GlySer: 2.275 ± 0.336
GlyThr: 2.275 ± 2.096
GlyVal: 3.413 ± 0.375
GlyTrp: 2.275 ± 2.096
GlyTyr: 2.275 ± 0.336
GlyXaa: 0.0 ± 0.0
His
HisAla: 1.138 ± 1.048
HisCys: 1.138 ± 1.048
HisAsp: 0.0 ± 0.0
HisGlu: 0.0 ± 0.0
HisPhe: 2.275 ± 0.336
HisGly: 1.138 ± 1.048
HisHis: 1.138 ± 1.048
HisIle: 1.138 ± 1.048
HisLys: 4.551 ± 1.087
HisLeu: 0.0 ± 0.0
HisMet: 1.138 ± 1.048
HisAsn: 0.0 ± 0.0
HisPro: 0.0 ± 0.0
HisGln: 1.138 ± 1.048
HisArg: 0.0 ± 0.0
HisSer: 1.138 ± 0.712
HisThr: 0.0 ± 0.0
HisVal: 2.275 ± 0.336
HisTrp: 0.0 ± 0.0
HisTyr: 0.0 ± 0.0
HisXaa: 0.0 ± 0.0
Ile
IleAla: 1.138 ± 0.712
IleCys: 1.138 ± 1.048
IleAsp: 2.275 ± 2.096
IleGlu: 5.688 ± 1.799
IlePhe: 2.275 ± 0.336
IleGly: 3.413 ± 3.144
IleHis: 2.275 ± 2.096
IleIle: 1.138 ± 1.048
IleLys: 1.138 ± 1.048
IleLeu: 9.101 ± 3.105
IleMet: 0.0 ± 0.0
IleAsn: 3.413 ± 0.375
IlePro: 4.551 ± 1.087
IleGln: 4.551 ± 0.673
IleArg: 1.138 ± 0.712
IleSer: 0.0 ± 0.0
IleThr: 3.413 ± 1.384
IleVal: 3.413 ± 1.384
IleTrp: 0.0 ± 0.0
IleTyr: 0.0 ± 0.0
IleXaa: 0.0 ± 0.0
Lys
LysAla: 4.551 ± 1.087
LysCys: 2.275 ± 0.336
LysAsp: 3.413 ± 2.135
LysGlu: 5.688 ± 0.039
LysPhe: 0.0 ± 0.0
LysGly: 3.413 ± 0.375
LysHis: 1.138 ± 1.048
LysIle: 3.413 ± 0.375
LysLys: 12.514 ± 6.069
LysLeu: 4.551 ± 2.847
LysMet: 4.551 ± 1.087
LysAsn: 2.275 ± 1.423
LysPro: 6.826 ± 1.009
LysGln: 6.826 ± 0.751
LysArg: 2.275 ± 0.336
LysSer: 2.275 ± 0.336
LysThr: 2.275 ± 1.423
LysVal: 5.688 ± 0.039
LysTrp: 0.0 ± 0.0
LysTyr: 2.275 ± 0.336
LysXaa: 0.0 ± 0.0
Leu
LeuAla: 5.688 ± 0.039
LeuCys: 0.0 ± 0.0
LeuAsp: 1.138 ± 0.712
LeuGlu: 3.413 ± 0.375
LeuPhe: 2.275 ± 2.096
LeuGly: 2.275 ± 0.336
LeuHis: 0.0 ± 0.0
LeuIle: 3.413 ± 1.384
LeuLys: 2.275 ± 0.336
LeuLeu: 1.138 ± 0.712
LeuMet: 2.275 ± 0.336
LeuAsn: 7.964 ± 1.463
LeuPro: 4.551 ± 0.673
LeuGln: 7.964 ± 1.463
LeuArg: 7.964 ± 2.057
LeuSer: 9.101 ± 5.694
LeuThr: 3.413 ± 2.135
LeuVal: 3.413 ± 1.384
LeuTrp: 0.0 ± 0.0
LeuTyr: 4.551 ± 0.673
LeuXaa: 0.0 ± 0.0
Met
MetAla: 1.138 ± 0.712
MetCys: 0.0 ± 0.0
MetAsp: 3.413 ± 3.144
MetGlu: 0.0 ± 0.0
MetPhe: 0.0 ± 0.0
MetGly: 3.413 ± 1.384
MetHis: 0.0 ± 0.0
MetIle: 2.275 ± 0.336
MetLys: 1.138 ± 0.712
MetLeu: 2.275 ± 1.423
MetMet: 0.0 ± 0.0
MetAsn: 1.138 ± 1.048
MetPro: 2.275 ± 0.336
MetGln: 2.275 ± 0.336
MetArg: 1.138 ± 1.048
MetSer: 4.551 ± 2.847
MetThr: 1.138 ± 0.712
MetVal: 2.275 ± 2.096
MetTrp: 0.0 ± 0.0
MetTyr: 0.0 ± 0.0
MetXaa: 0.0 ± 0.0
Asn
AsnAla: 3.413 ± 1.384
AsnCys: 0.0 ± 0.0
AsnAsp: 1.138 ± 1.048
AsnGlu: 4.551 ± 1.087
AsnPhe: 4.551 ± 0.673
AsnGly: 2.275 ± 0.336
AsnHis: 1.138 ± 1.048
AsnIle: 2.275 ± 1.423
AsnLys: 4.551 ± 2.847
AsnLeu: 1.138 ± 1.048
AsnMet: 1.138 ± 0.712
AsnAsn: 0.0 ± 0.0
AsnPro: 1.138 ± 0.712
AsnGln: 2.275 ± 1.423
AsnArg: 1.138 ± 0.712
AsnSer: 1.138 ± 1.048
AsnThr: 1.138 ± 0.712
AsnVal: 3.413 ± 2.135
AsnTrp: 0.0 ± 0.0
AsnTyr: 0.0 ± 0.0
AsnXaa: 0.0 ± 0.0
Pro
ProAla: 5.688 ± 1.799
ProCys: 1.138 ± 0.712
ProAsp: 2.275 ± 1.423
ProGlu: 3.413 ± 1.384
ProPhe: 1.138 ± 1.048
ProGly: 3.413 ± 1.384
ProHis: 1.138 ± 1.048
ProIle: 1.138 ± 1.048
ProLys: 7.964 ± 0.297
ProLeu: 1.138 ± 0.712
ProMet: 3.413 ± 1.384
ProAsn: 1.138 ± 0.712
ProPro: 5.688 ± 0.039
ProGln: 6.826 ± 4.27
ProArg: 2.275 ± 1.423
ProSer: 4.551 ± 0.673
ProThr: 5.688 ± 1.799
ProVal: 4.551 ± 2.847
ProTrp: 1.138 ± 1.048
ProTyr: 2.275 ± 2.096
ProXaa: 0.0 ± 0.0
Gln
GlnAla: 2.275 ± 1.423
GlnCys: 0.0 ± 0.0
GlnAsp: 4.551 ± 2.433
GlnGlu: 6.826 ± 1.009
GlnPhe: 3.413 ± 2.135
GlnGly: 2.275 ± 0.336
GlnHis: 0.0 ± 0.0
GlnIle: 5.688 ± 1.799
GlnLys: 5.688 ± 1.799
GlnLeu: 11.377 ± 3.598
GlnMet: 2.275 ± 0.336
GlnAsn: 1.138 ± 0.712
GlnPro: 3.413 ± 2.135
GlnGln: 7.964 ± 1.463
GlnArg: 4.551 ± 1.087
GlnSer: 5.688 ± 0.039
GlnThr: 2.275 ± 1.423
GlnVal: 6.826 ± 0.751
GlnTrp: 0.0 ± 0.0
GlnTyr: 0.0 ± 0.0
GlnXaa: 0.0 ± 0.0
Arg
ArgAla: 3.413 ± 0.375
ArgCys: 0.0 ± 0.0
ArgAsp: 1.138 ± 0.712
ArgGlu: 6.826 ± 1.009
ArgPhe: 2.275 ± 0.336
ArgGly: 2.275 ± 2.096
ArgHis: 1.138 ± 1.048
ArgIle: 1.138 ± 0.712
ArgLys: 5.688 ± 0.039
ArgLeu: 4.551 ± 2.433
ArgMet: 0.0 ± 0.0
ArgAsn: 2.275 ± 0.336
ArgPro: 4.551 ± 1.087
ArgGln: 1.138 ± 0.712
ArgArg: 2.275 ± 2.096
ArgSer: 2.275 ± 0.336
ArgThr: 2.275 ± 1.423
ArgVal: 2.275 ± 1.423
ArgTrp: 1.138 ± 1.048
ArgTyr: 1.138 ± 1.048
ArgXaa: 0.0 ± 0.0
Ser
SerAla: 5.688 ± 1.799
SerCys: 0.0 ± 0.0
SerAsp: 7.964 ± 3.222
SerGlu: 6.826 ± 4.27
SerPhe: 1.138 ± 0.712
SerGly: 4.551 ± 0.673
SerHis: 1.138 ± 0.712
SerIle: 2.275 ± 2.096
SerLys: 4.551 ± 2.847
SerLeu: 4.551 ± 2.847
SerMet: 0.0 ± 0.0
SerAsn: 1.138 ± 0.712
SerPro: 4.551 ± 0.673
SerGln: 3.413 ± 2.135
SerArg: 2.275 ± 0.336
SerSer: 2.275 ± 0.336
SerThr: 2.275 ± 0.336
SerVal: 3.413 ± 0.375
SerTrp: 1.138 ± 1.048
SerTyr: 3.413 ± 1.384
SerXaa: 0.0 ± 0.0
Thr
ThrAla: 3.413 ± 2.135
ThrCys: 1.138 ± 1.048
ThrAsp: 2.275 ± 0.336
ThrGlu: 3.413 ± 0.375
ThrPhe: 0.0 ± 0.0
ThrGly: 1.138 ± 1.048
ThrHis: 3.413 ± 0.375
ThrIle: 4.551 ± 0.673
ThrLys: 4.551 ± 2.847
ThrLeu: 4.551 ± 1.087
ThrMet: 2.275 ± 2.096
ThrAsn: 2.275 ± 0.336
ThrPro: 3.413 ± 0.375
ThrGln: 4.551 ± 1.087
ThrArg: 1.138 ± 0.712
ThrSer: 4.551 ± 0.673
ThrThr: 6.826 ± 2.769
ThrVal: 1.138 ± 1.048
ThrTrp: 1.138 ± 0.712
ThrTyr: 0.0 ± 0.0
ThrXaa: 0.0 ± 0.0
Val
ValAla: 4.551 ± 0.673
ValCys: 2.275 ± 1.423
ValAsp: 2.275 ± 0.336
ValGlu: 10.239 ± 2.886
ValPhe: 3.413 ± 1.384
ValGly: 3.413 ± 1.384
ValHis: 0.0 ± 0.0
ValIle: 2.275 ± 2.096
ValLys: 4.551 ± 2.847
ValLeu: 1.138 ± 0.712
ValMet: 2.275 ± 2.096
ValAsn: 0.0 ± 0.0
ValPro: 1.138 ± 0.712
ValGln: 7.964 ± 3.222
ValArg: 2.275 ± 2.096
ValSer: 3.413 ± 1.384
ValThr: 3.413 ± 0.375
ValVal: 3.413 ± 3.144
ValTrp: 0.0 ± 0.0
ValTyr: 1.138 ± 1.048
ValXaa: 0.0 ± 0.0
Trp
TrpAla: 0.0 ± 0.0
TrpCys: 1.138 ± 1.048
TrpAsp: 1.138 ± 1.048
TrpGlu: 2.275 ± 0.336
TrpPhe: 0.0 ± 0.0
TrpGly: 0.0 ± 0.0
TrpHis: 0.0 ± 0.0
TrpIle: 0.0 ± 0.0
TrpLys: 1.138 ± 1.048
TrpLeu: 1.138 ± 1.048
TrpMet: 0.0 ± 0.0
TrpAsn: 0.0 ± 0.0
TrpPro: 0.0 ± 0.0
TrpGln: 0.0 ± 0.0
TrpArg: 1.138 ± 1.048
TrpSer: 4.551 ± 1.087
TrpThr: 2.275 ± 2.096
TrpVal: 0.0 ± 0.0
TrpTrp: 0.0 ± 0.0
TrpTyr: 0.0 ± 0.0
TrpXaa: 0.0 ± 0.0
Tyr
TyrAla: 3.413 ± 2.135
TyrCys: 0.0 ± 0.0
TyrAsp: 0.0 ± 0.0
TyrGlu: 2.275 ± 2.096
TyrPhe: 1.138 ± 1.048
TyrGly: 2.275 ± 2.096
TyrHis: 0.0 ± 0.0
TyrIle: 1.138 ± 1.048
TyrLys: 3.413 ± 3.144
TyrLeu: 1.138 ± 1.048
TyrMet: 0.0 ± 0.0
TyrAsn: 0.0 ± 0.0
TyrPro: 1.138 ± 0.712
TyrGln: 4.551 ± 0.673
TyrArg: 0.0 ± 0.0
TyrSer: 1.138 ± 0.712
TyrThr: 1.138 ± 0.712
TyrVal: 1.138 ± 1.048
TyrTrp: 1.138 ± 1.048
TyrTyr: 0.0 ± 0.0
TyrXaa: 0.0 ± 0.0
Xaa
XaaAla: 0.0 ± 0.0
XaaCys: 0.0 ± 0.0
XaaAsp: 0.0 ± 0.0
XaaGlu: 0.0 ± 0.0
XaaPhe: 0.0 ± 0.0
XaaGly: 0.0 ± 0.0
XaaHis: 0.0 ± 0.0
XaaIle: 0.0 ± 0.0
XaaLys: 0.0 ± 0.0
XaaLeu: 0.0 ± 0.0
XaaMet: 0.0 ± 0.0
XaaAsn: 0.0 ± 0.0
XaaPro: 0.0 ± 0.0
XaaGln: 0.0 ± 0.0
XaaArg: 0.0 ± 0.0
XaaSer: 0.0 ± 0.0
XaaThr: 0.0 ± 0.0
XaaVal: 0.0 ± 0.0
XaaTrp: 0.0 ± 0.0
XaaTyr: 0.0 ± 0.0
XaaXaa: 0.0 ± 0.0
Statistics based on 2 proteins (880 amino acids)

Note: The error was estimated by bootstrapping (×100) at the protein level.
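A protein-level bootstrap of this kind could be sketched as follows. This is an illustration under stated assumptions, not the database's code: the page only says that 100 bootstrap rounds were run at the protein level, so the use of the standard deviation as the error measure, the fixed random seed, and the helper names `pair_frequencies` and `bootstrap_error` are all hypothetical choices.

```python
import random
import statistics
from collections import Counter

def pair_frequencies(proteins):
    """Per-mille frequency of each adjacent residue pair, counted within proteins."""
    counts = Counter(seq[i:i + 2] for seq in proteins for i in range(len(seq) - 1))
    total = sum(counts.values())
    return {p: 1000.0 * n / total for p, n in counts.items()} if total else {}

def bootstrap_error(proteins, pair, rounds=100, seed=0):
    """Spread of one dipeptide frequency over protein-level bootstrap resamples.

    Each round draws len(proteins) whole proteins with replacement,
    recomputes the frequency of `pair`, and the population standard
    deviation over all rounds is returned (an assumed error measure).
    """
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    samples = []
    for _ in range(rounds):
        resampled = [rng.choice(proteins) for _ in proteins]
        samples.append(pair_frequencies(resampled).get(pair, 0.0))
    return statistics.pstdev(samples)

# Toy sequences (hypothetical, not the proteins of this proteome).
proteins = ["MKKLAEP", "AKKQLSD"]
err_kk = bootstrap_error(proteins, "KK")      # present once in both proteins
err_mk = bootstrap_error(proteins, "MK")      # present in only one protein
err_absent = bootstrap_error(proteins, "WW")  # absent everywhere
```

In this toy case a dipeptide that is equally frequent in both proteins, or absent from both, gets an error of zero, while one found in only a single protein gets a non-zero error, because its frequency depends on which proteins are drawn in each round.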

The above dipeptide statistics (among other statistics for this proteome) can be downloaded from this CSV file
See this proteome in: uniprot_link
Proteome-pI is available under a Creative Commons Attribution-NoDerivs license; for more details see here

Reference: Kozlowski LP. Proteome-pI 2.0: Proteome Isoelectric Point Database Update. Nucleic Acids Res. 2021, doi: 10.1093/nar/gkab944
Contact: Lukasz P. Kozlowski