Amino acid dipeptide frequency for Sanxia sobemo-like virus 5

Apart from single amino acid frequencies, one can also calculate the so-called amino acid dipeptide frequency, that is, how often each ordered pair of adjacent amino acids occurs.

There are 400 possible dipeptides (441 if Xaa is counted as an additional, 21st amino acid representing all non-standard or unknown amino acids). Thus, if all dipeptides occurred in proteins with equal probability, each one would be present at a frequency of 0.25% (2.5‰). As this is not the case in nature, for better visibility all dipeptides that are more frequent than expected are marked in red in the table, and those that are underrepresented are marked in blue.
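The counting described above can be sketched in Python as follows. This is only an illustration under stated assumptions: the function name `dipeptide_permille`, the one-letter input format, and the use of overlapping two-residue windows within each protein are assumptions, and the toy sequences are hypothetical rather than the actual proteins of this proteome. The exact counting convention used by the database is not stated on this page.

```python
from collections import Counter
from itertools import product

# The 20 standard residues in one-letter code; anything else is mapped to "X" (Xaa).
STANDARD = "ACDEFGHIKLMNPQRSTVWY"

def dipeptide_permille(proteins):
    """Return {dipeptide: frequency in per mille} over all given sequences."""
    counts = Counter()
    for seq in proteins:
        # Map non-standard or unknown residues to X (assumed handling of Xaa).
        cleaned = "".join(aa if aa in STANDARD else "X" for aa in seq.upper())
        # Overlapping windows of length 2 inside one protein (assumed convention).
        for i in range(len(cleaned) - 1):
            counts[cleaned[i:i + 2]] += 1
    total = sum(counts.values())
    alphabet = STANDARD + "X"
    # Report all 441 ordered pairs, including those never observed.
    return {
        a + b: (1000.0 * counts[a + b] / total if total else 0.0)
        for a, b in product(alphabet, repeat=2)
    }

# Hypothetical toy sequences, not the proteins of this proteome.
toy = ["MAAVVL", "MVVAZ"]
freqs = dipeptide_permille(toy)
print(round(freqs["VV"], 3))
```

Reporting all 441 pairs, including the unobserved ones, mirrors the layout of the table, where absent dipeptides appear as 0.0.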

All values are given in per mille (‰), so they need to be multiplied by 10⁻³ to obtain fractions.
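As a small worked example of the unit conversion, the sketch below turns a per mille value into a fraction and compares it with the uniform expectation. The function names and the choice of a 20-letter alphabet as the default are assumptions; the page does not state whether the colouring threshold is based on 400 or 441 dipeptides.

```python
def expected_permille(alphabet_size=20):
    """Uniform expectation for a single dipeptide, in per mille."""
    return 1000.0 / alphabet_size ** 2

def classify(value_permille, alphabet_size=20):
    """Label a table value relative to the uniform expectation (assumed rule)."""
    fraction = value_permille * 1e-3       # per mille -> plain fraction
    expected = 1.0 / alphabet_size ** 2    # 0.0025 for 400 dipeptides
    if fraction > expected:
        return "over"    # would be marked in red
    if fraction < expected:
        return "under"   # would be marked in blue
    return "expected"

print(expected_permille())   # uniform expectation for 400 dipeptides
print(classify(14.33))       # ValVal value from the table
print(classify(1.024))       # AlaHis value from the table
```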

For more information see sequence space article on Wikipedia.

Ala Cys Asp Glu Phe Gly His Ile Lys Leu Met Asn Pro Gln Arg Ser Thr Val Trp Tyr Xaa
Ala
3.071AlaAla: 3.071 ± 1.661
2.047AlaCys: 2.047 ± 1.107
5.118AlaAsp: 5.118 ± 1.848
2.047AlaGlu: 2.047 ± 1.107
2.047AlaPhe: 2.047 ± 1.107
4.094AlaGly: 4.094 ± 0.863
1.024AlaHis: 1.024 ± 0.554
1.024AlaIle: 1.024 ± 0.985
4.094AlaLys: 4.094 ± 0.863
6.141AlaLeu: 6.141 ± 0.244
2.047AlaMet: 2.047 ± 1.107
1.024AlaAsn: 1.024 ± 0.554
3.071AlaPro: 3.071 ± 0.122
5.118AlaGln: 5.118 ± 0.309
4.094AlaArg: 4.094 ± 2.214
6.141AlaSer: 6.141 ± 1.294
2.047AlaThr: 2.047 ± 1.107
7.165AlaVal: 7.165 ± 0.798
2.047AlaTrp: 2.047 ± 1.107
3.071AlaTyr: 3.071 ± 1.661
0.0AlaXaa: 0.0 ± 0.0
Cys
0.0CysAla: 0.0 ± 0.0
0.0CysCys: 0.0 ± 0.0
1.024CysAsp: 1.024 ± 0.985
2.047CysGlu: 2.047 ± 1.107
1.024CysPhe: 1.024 ± 0.985
1.024CysGly: 1.024 ± 0.554
0.0CysHis: 0.0 ± 0.0
3.071CysIle: 3.071 ± 0.122
1.024CysLys: 1.024 ± 0.554
5.118CysLeu: 5.118 ± 1.848
1.024CysMet: 1.024 ± 0.554
0.0CysAsn: 0.0 ± 0.0
0.0CysPro: 0.0 ± 0.0
0.0CysGln: 0.0 ± 0.0
1.024CysArg: 1.024 ± 0.554
0.0CysSer: 0.0 ± 0.0
0.0CysThr: 0.0 ± 0.0
1.024CysVal: 1.024 ± 0.985
0.0CysTrp: 0.0 ± 0.0
0.0CysTyr: 0.0 ± 0.0
0.0CysXaa: 0.0 ± 0.0
Asp
3.071AspAla: 3.071 ± 1.661
2.047AspCys: 2.047 ± 1.107
7.165AspAsp: 7.165 ± 2.336
5.118AspGlu: 5.118 ± 1.229
2.047AspPhe: 2.047 ± 0.431
3.071AspGly: 3.071 ± 0.122
0.0AspHis: 0.0 ± 0.0
0.0AspIle: 0.0 ± 0.0
2.047AspLys: 2.047 ± 0.431
6.141AspLeu: 6.141 ± 0.244
0.0AspMet: 0.0 ± 0.0
1.024AspAsn: 1.024 ± 0.985
1.024AspPro: 1.024 ± 0.985
1.024AspGln: 1.024 ± 0.985
3.071AspArg: 3.071 ± 0.122
4.094AspSer: 4.094 ± 2.402
1.024AspThr: 1.024 ± 0.985
2.047AspVal: 2.047 ± 0.431
3.071AspTrp: 3.071 ± 1.417
1.024AspTyr: 1.024 ± 0.554
0.0AspXaa: 0.0 ± 0.0
Glu
5.118GluAla: 5.118 ± 0.309
1.024GluCys: 1.024 ± 0.554
5.118GluAsp: 5.118 ± 2.768
3.071GluGlu: 3.071 ± 0.122
2.047GluPhe: 2.047 ± 1.97
5.118GluGly: 5.118 ± 1.229
1.024GluHis: 1.024 ± 0.985
3.071GluIle: 3.071 ± 0.122
6.141GluLys: 6.141 ± 3.321
6.141GluLeu: 6.141 ± 1.294
2.047GluMet: 2.047 ± 1.97
1.024GluAsn: 1.024 ± 0.554
7.165GluPro: 7.165 ± 0.741
3.071GluGln: 3.071 ± 1.661
1.024GluArg: 1.024 ± 0.985
7.165GluSer: 7.165 ± 2.336
2.047GluThr: 2.047 ± 0.431
5.118GluVal: 5.118 ± 2.768
1.024GluTrp: 1.024 ± 0.554
0.0GluTyr: 0.0 ± 0.0
0.0GluXaa: 0.0 ± 0.0
Phe
2.047PheAla: 2.047 ± 0.431
0.0PheCys: 0.0 ± 0.0
1.024PheAsp: 1.024 ± 0.554
1.024PheGlu: 1.024 ± 0.554
1.024PhePhe: 1.024 ± 0.985
5.118PheGly: 5.118 ± 1.848
0.0PheHis: 0.0 ± 0.0
4.094PheIle: 4.094 ± 2.402
1.024PheLys: 1.024 ± 0.554
3.071PheLeu: 3.071 ± 1.661
0.0PheMet: 0.0 ± 0.0
2.047PheAsn: 2.047 ± 0.431
1.024PhePro: 1.024 ± 0.554
5.118PheGln: 5.118 ± 1.848
3.071PheArg: 3.071 ± 1.417
1.024PheSer: 1.024 ± 0.554
0.0PheThr: 0.0 ± 0.0
3.071PheVal: 3.071 ± 1.661
1.024PheTrp: 1.024 ± 0.554
0.0PheTyr: 0.0 ± 0.0
0.0PheXaa: 0.0 ± 0.0
Gly
4.094GlyAla: 4.094 ± 2.214
1.024GlyCys: 1.024 ± 0.985
3.071GlyAsp: 3.071 ± 0.122
5.118GlyGlu: 5.118 ± 1.229
1.024GlyPhe: 1.024 ± 0.554
4.094GlyGly: 4.094 ± 0.863
2.047GlyHis: 2.047 ± 0.431
11.259GlyIle: 11.259 ± 1.473
3.071GlyLys: 3.071 ± 0.122
4.094GlyLeu: 4.094 ± 0.863
2.047GlyMet: 2.047 ± 1.107
2.047GlyAsn: 2.047 ± 1.107
7.165GlyPro: 7.165 ± 2.336
1.024GlyGln: 1.024 ± 0.554
3.071GlyArg: 3.071 ± 0.122
7.165GlySer: 7.165 ± 0.741
1.024GlyThr: 1.024 ± 0.554
5.118GlyVal: 5.118 ± 1.229
4.094GlyTrp: 4.094 ± 3.94
4.094GlyTyr: 4.094 ± 0.676
0.0GlyXaa: 0.0 ± 0.0
His
2.047HisAla: 2.047 ± 0.431
2.047HisCys: 2.047 ± 1.97
0.0HisAsp: 0.0 ± 0.0
1.024HisGlu: 1.024 ± 0.554
0.0HisPhe: 0.0 ± 0.0
2.047HisGly: 2.047 ± 1.107
2.047HisHis: 2.047 ± 0.431
0.0HisIle: 0.0 ± 0.0
3.071HisLys: 3.071 ± 2.955
3.071HisLeu: 3.071 ± 1.417
0.0HisMet: 0.0 ± 0.0
1.024HisAsn: 1.024 ± 0.554
0.0HisPro: 0.0 ± 0.0
2.047HisGln: 2.047 ± 0.431
2.047HisArg: 2.047 ± 0.431
1.024HisSer: 1.024 ± 0.554
0.0HisThr: 0.0 ± 0.0
3.071HisVal: 3.071 ± 0.122
0.0HisTrp: 0.0 ± 0.0
1.024HisTyr: 1.024 ± 0.985
0.0HisXaa: 0.0 ± 0.0
Ile
4.094IleAla: 4.094 ± 2.402
1.024IleCys: 1.024 ± 0.554
2.047IleAsp: 2.047 ± 1.97
5.118IleGlu: 5.118 ± 1.848
0.0IlePhe: 0.0 ± 0.0
1.024IleGly: 1.024 ± 0.554
2.047IleHis: 2.047 ± 0.431
5.118IleIle: 5.118 ± 0.309
2.047IleLys: 2.047 ± 0.431
4.094IleLeu: 4.094 ± 0.863
2.047IleMet: 2.047 ± 1.107
0.0IleAsn: 0.0 ± 0.0
6.141IlePro: 6.141 ± 2.833
1.024IleGln: 1.024 ± 0.554
7.165IleArg: 7.165 ± 0.798
2.047IleSer: 2.047 ± 0.431
0.0IleThr: 0.0 ± 0.0
4.094IleVal: 4.094 ± 0.676
1.024IleTrp: 1.024 ± 0.985
1.024IleTyr: 1.024 ± 0.985
0.0IleXaa: 0.0 ± 0.0
Lys
6.141LysAla: 6.141 ± 3.321
0.0LysCys: 0.0 ± 0.0
1.024LysAsp: 1.024 ± 0.985
5.118LysGlu: 5.118 ± 1.229
2.047LysPhe: 2.047 ± 0.431
3.071LysGly: 3.071 ± 0.122
4.094LysHis: 4.094 ± 0.863
1.024LysIle: 1.024 ± 0.985
4.094LysLys: 4.094 ± 0.676
6.141LysLeu: 6.141 ± 1.783
1.024LysMet: 1.024 ± 0.554
2.047LysAsn: 2.047 ± 0.431
3.071LysPro: 3.071 ± 0.122
0.0LysGln: 0.0 ± 0.0
6.141LysArg: 6.141 ± 0.244
5.118LysSer: 5.118 ± 1.848
2.047LysThr: 2.047 ± 1.107
7.165LysVal: 7.165 ± 0.798
0.0LysTrp: 0.0 ± 0.0
0.0LysTyr: 0.0 ± 0.0
0.0LysXaa: 0.0 ± 0.0
Leu
10.235LeuAla: 10.235 ± 0.619
2.047LeuCys: 2.047 ± 0.431
4.094LeuAsp: 4.094 ± 0.863
10.235LeuGlu: 10.235 ± 0.92
7.165LeuPhe: 7.165 ± 2.28
8.188LeuGly: 8.188 ± 2.89
2.047LeuHis: 2.047 ± 1.97
4.094LeuIle: 4.094 ± 0.863
2.047LeuLys: 2.047 ± 1.107
9.212LeuLeu: 9.212 ± 2.711
0.0LeuMet: 0.0 ± 0.0
2.047LeuAsn: 2.047 ± 0.431
2.047LeuPro: 2.047 ± 1.107
2.047LeuGln: 2.047 ± 1.97
5.118LeuArg: 5.118 ± 1.848
6.141LeuSer: 6.141 ± 0.244
3.071LeuThr: 3.071 ± 1.417
8.188LeuVal: 8.188 ± 2.89
2.047LeuTrp: 2.047 ± 0.431
1.024LeuTyr: 1.024 ± 0.985
0.0LeuXaa: 0.0 ± 0.0
Met
2.047MetAla: 2.047 ± 1.107
0.0MetCys: 0.0 ± 0.0
1.024MetAsp: 1.024 ± 0.554
1.024MetGlu: 1.024 ± 0.554
1.024MetPhe: 1.024 ± 0.554
4.094MetGly: 4.094 ± 0.863
0.0MetHis: 0.0 ± 0.0
1.024MetIle: 1.024 ± 0.985
1.024MetLys: 1.024 ± 0.985
4.094MetLeu: 4.094 ± 0.676
2.047MetMet: 2.047 ± 1.107
1.024MetAsn: 1.024 ± 0.985
1.024MetPro: 1.024 ± 0.554
2.047MetGln: 2.047 ± 0.431
1.024MetArg: 1.024 ± 0.554
5.118MetSer: 5.118 ± 1.229
0.0MetThr: 0.0 ± 0.0
2.047MetVal: 2.047 ± 0.431
0.0MetTrp: 0.0 ± 0.0
0.0MetTyr: 0.0 ± 0.0
0.0MetXaa: 0.0 ± 0.0
Asn
2.047AsnAla: 2.047 ± 0.431
0.0AsnCys: 0.0 ± 0.0
1.024AsnAsp: 1.024 ± 0.554
2.047AsnGlu: 2.047 ± 1.107
0.0AsnPhe: 0.0 ± 0.0
3.071AsnGly: 3.071 ± 2.955
2.047AsnHis: 2.047 ± 1.107
0.0AsnIle: 0.0 ± 0.0
1.024AsnLys: 1.024 ± 0.985
1.024AsnLeu: 1.024 ± 0.985
1.024AsnMet: 1.024 ± 0.796
1.024AsnAsn: 1.024 ± 0.554
1.024AsnPro: 1.024 ± 0.554
0.0AsnGln: 0.0 ± 0.0
0.0AsnArg: 0.0 ± 0.0
4.094AsnSer: 4.094 ± 0.676
1.024AsnThr: 1.024 ± 0.985
1.024AsnVal: 1.024 ± 0.554
0.0AsnTrp: 0.0 ± 0.0
0.0AsnTyr: 0.0 ± 0.0
0.0AsnXaa: 0.0 ± 0.0
Pro
4.094ProAla: 4.094 ± 0.863
0.0ProCys: 0.0 ± 0.0
1.024ProAsp: 1.024 ± 0.554
6.141ProGlu: 6.141 ± 0.244
0.0ProPhe: 0.0 ± 0.0
7.165ProGly: 7.165 ± 0.798
2.047ProHis: 2.047 ± 0.431
2.047ProIle: 2.047 ± 0.431
3.071ProLys: 3.071 ± 1.661
5.118ProLeu: 5.118 ± 1.229
0.0ProMet: 0.0 ± 0.0
0.0ProAsn: 0.0 ± 0.0
4.094ProPro: 4.094 ± 2.214
2.047ProGln: 2.047 ± 0.431
8.188ProArg: 8.188 ± 2.89
5.118ProSer: 5.118 ± 1.848
2.047ProThr: 2.047 ± 0.431
1.024ProVal: 1.024 ± 0.554
1.024ProTrp: 1.024 ± 0.985
3.071ProTyr: 3.071 ± 1.417
0.0ProXaa: 0.0 ± 0.0
Gln
2.047GlnAla: 2.047 ± 1.107
0.0GlnCys: 0.0 ± 0.0
1.024GlnAsp: 1.024 ± 0.985
3.071GlnGlu: 3.071 ± 1.417
2.047GlnPhe: 2.047 ± 0.431
3.071GlnGly: 3.071 ± 0.122
0.0GlnHis: 0.0 ± 0.0
3.071GlnIle: 3.071 ± 2.955
3.071GlnLys: 3.071 ± 1.417
3.071GlnLeu: 3.071 ± 0.122
0.0GlnMet: 0.0 ± 0.0
1.024GlnAsn: 1.024 ± 0.985
3.071GlnPro: 3.071 ± 0.122
1.024GlnGln: 1.024 ± 0.985
2.047GlnArg: 2.047 ± 0.431
5.118GlnSer: 5.118 ± 1.229
0.0GlnThr: 0.0 ± 0.0
3.071GlnVal: 3.071 ± 1.417
1.024GlnTrp: 1.024 ± 0.554
0.0GlnTyr: 0.0 ± 0.0
0.0GlnXaa: 0.0 ± 0.0
Arg
2.047ArgAla: 2.047 ± 1.107
0.0ArgCys: 0.0 ± 0.0
3.071ArgAsp: 3.071 ± 1.417
2.047ArgGlu: 2.047 ± 0.431
5.118ArgPhe: 5.118 ± 1.229
5.118ArgGly: 5.118 ± 2.768
3.071ArgHis: 3.071 ± 1.417
1.024ArgIle: 1.024 ± 0.554
5.118ArgLys: 5.118 ± 1.229
11.259ArgLeu: 11.259 ± 0.065
4.094ArgMet: 4.094 ± 0.676
1.024ArgAsn: 1.024 ± 0.554
6.141ArgPro: 6.141 ± 3.321
1.024ArgGln: 1.024 ± 0.554
8.188ArgArg: 8.188 ± 4.428
4.094ArgSer: 4.094 ± 0.863
1.024ArgThr: 1.024 ± 0.554
8.188ArgVal: 8.188 ± 0.187
1.024ArgTrp: 1.024 ± 0.554
2.047ArgTyr: 2.047 ± 1.97
0.0ArgXaa: 0.0 ± 0.0
Ser
5.118SerAla: 5.118 ± 1.229
2.047SerCys: 2.047 ± 0.431
3.071SerAsp: 3.071 ± 0.122
4.094SerGlu: 4.094 ± 0.676
4.094SerPhe: 4.094 ± 0.676
5.118SerGly: 5.118 ± 1.229
1.024SerHis: 1.024 ± 0.554
4.094SerIle: 4.094 ± 0.863
7.165SerLys: 7.165 ± 0.798
3.071SerLeu: 3.071 ± 0.122
4.094SerMet: 4.094 ± 0.863
1.024SerAsn: 1.024 ± 0.985
8.188SerPro: 8.188 ± 1.726
3.071SerGln: 3.071 ± 0.122
8.188SerArg: 8.188 ± 1.351
9.212SerSer: 9.212 ± 1.905
2.047SerThr: 2.047 ± 0.431
3.071SerVal: 3.071 ± 0.122
2.047SerTrp: 2.047 ± 1.97
8.188SerTyr: 8.188 ± 1.726
0.0SerXaa: 0.0 ± 0.0
Thr
0.0ThrAla: 0.0 ± 0.0
1.024ThrCys: 1.024 ± 0.554
0.0ThrAsp: 0.0 ± 0.0
0.0ThrGlu: 0.0 ± 0.0
0.0ThrPhe: 0.0 ± 0.0
2.047ThrGly: 2.047 ± 0.431
1.024ThrHis: 1.024 ± 0.985
1.024ThrIle: 1.024 ± 0.985
0.0ThrLys: 0.0 ± 0.0
1.024ThrLeu: 1.024 ± 0.985
2.047ThrMet: 2.047 ± 0.431
1.024ThrAsn: 1.024 ± 0.985
0.0ThrPro: 0.0 ± 0.0
2.047ThrGln: 2.047 ± 0.431
3.071ThrArg: 3.071 ± 1.661
4.094ThrSer: 4.094 ± 0.676
1.024ThrThr: 1.024 ± 0.554
3.071ThrVal: 3.071 ± 0.122
0.0ThrTrp: 0.0 ± 0.0
0.0ThrTyr: 0.0 ± 0.0
0.0ThrXaa: 0.0 ± 0.0
Val
6.141ValAla: 6.141 ± 0.244
2.047ValCys: 2.047 ± 0.431
3.071ValAsp: 3.071 ± 0.122
3.071ValGlu: 3.071 ± 1.661
3.071ValPhe: 3.071 ± 1.661
7.165ValGly: 7.165 ± 0.798
1.024ValHis: 1.024 ± 0.554
1.024ValIle: 1.024 ± 0.985
6.141ValLys: 6.141 ± 1.783
5.118ValLeu: 5.118 ± 0.309
4.094ValMet: 4.094 ± 0.676
3.071ValAsn: 3.071 ± 0.122
2.047ValPro: 2.047 ± 0.431
2.047ValGln: 2.047 ± 1.97
7.165ValArg: 7.165 ± 2.336
8.188ValSer: 8.188 ± 1.351
1.024ValThr: 1.024 ± 0.554
14.33ValVal: 14.33 ± 0.057
2.047ValTrp: 2.047 ± 0.431
2.047ValTyr: 2.047 ± 1.107
0.0ValXaa: 0.0 ± 0.0
Trp
2.047TrpAla: 2.047 ± 1.107
0.0TrpCys: 0.0 ± 0.0
2.047TrpAsp: 2.047 ± 0.431
2.047TrpGlu: 2.047 ± 1.107
0.0TrpPhe: 0.0 ± 0.0
0.0TrpGly: 0.0 ± 0.0
0.0TrpHis: 0.0 ± 0.0
2.047TrpIle: 2.047 ± 0.431
1.024TrpLys: 1.024 ± 0.985
2.047TrpLeu: 2.047 ± 0.431
1.024TrpMet: 1.024 ± 0.623
0.0TrpAsn: 0.0 ± 0.0
0.0TrpPro: 0.0 ± 0.0
2.047TrpGln: 2.047 ± 1.97
1.024TrpArg: 1.024 ± 0.985
3.071TrpSer: 3.071 ± 1.417
2.047TrpThr: 2.047 ± 1.97
1.024TrpVal: 1.024 ± 0.554
0.0TrpTrp: 0.0 ± 0.0
0.0TrpTyr: 0.0 ± 0.0
0.0TrpXaa: 0.0 ± 0.0
Tyr
1.024TyrAla: 1.024 ± 0.985
1.024TyrCys: 1.024 ± 0.985
3.071TyrAsp: 3.071 ± 0.122
4.094TyrGlu: 4.094 ± 2.402
1.024TyrPhe: 1.024 ± 0.985
2.047TyrGly: 2.047 ± 1.107
1.024TyrHis: 1.024 ± 0.985
3.071TyrIle: 3.071 ± 1.661
3.071TyrLys: 3.071 ± 1.661
2.047TyrLeu: 2.047 ± 1.97
0.0TyrMet: 0.0 ± 0.0
1.024TyrAsn: 1.024 ± 0.554
1.024TyrPro: 1.024 ± 0.985
1.024TyrGln: 1.024 ± 0.554
0.0TyrArg: 0.0 ± 0.0
0.0TyrSer: 0.0 ± 0.0
1.024TyrThr: 1.024 ± 0.554
1.024TyrVal: 1.024 ± 0.985
0.0TyrTrp: 0.0 ± 0.0
0.0TyrTyr: 0.0 ± 0.0
0.0TyrXaa: 0.0 ± 0.0
Xaa
0.0XaaAla: 0.0 ± 0.0
0.0XaaCys: 0.0 ± 0.0
0.0XaaAsp: 0.0 ± 0.0
0.0XaaGlu: 0.0 ± 0.0
0.0XaaPhe: 0.0 ± 0.0
0.0XaaGly: 0.0 ± 0.0
0.0XaaHis: 0.0 ± 0.0
0.0XaaIle: 0.0 ± 0.0
0.0XaaLys: 0.0 ± 0.0
0.0XaaLeu: 0.0 ± 0.0
0.0XaaMet: 0.0 ± 0.0
0.0XaaAsn: 0.0 ± 0.0
0.0XaaPro: 0.0 ± 0.0
0.0XaaGln: 0.0 ± 0.0
0.0XaaArg: 0.0 ± 0.0
0.0XaaSer: 0.0 ± 0.0
0.0XaaThr: 0.0 ± 0.0
0.0XaaVal: 0.0 ± 0.0
0.0XaaTrp: 0.0 ± 0.0
0.0XaaTyr: 0.0 ± 0.0
0.0XaaXaa: 0.0 ± 0.0
Statistics based on 2 proteins (978 amino acids)

Note: The error has been estimated by bootstrapping (100 replicates) at the protein level.
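A protein-level bootstrap of this kind can be sketched as below. This is not the database's own code: the function names, the use of the standard deviation across replicates as the reported error, and the toy sequences are all assumptions made for illustration.

```python
import random
import statistics

def vv_permille(proteins):
    """Frequency of the ValVal dipeptide (one-letter 'VV') in per mille."""
    pairs = [seq[i:i + 2] for seq in proteins for i in range(len(seq) - 1)]
    return 1000.0 * pairs.count("VV") / len(pairs) if pairs else 0.0

def bootstrap_error(proteins, statistic, n_boot=100, seed=0):
    """Resample whole proteins with replacement and return the spread of the statistic."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    estimates = []
    for _ in range(n_boot):
        # Draw as many proteins as in the original set, with replacement.
        sample = [rng.choice(proteins) for _ in proteins]
        estimates.append(statistic(sample))
    # Assumed error definition: population standard deviation over replicates.
    return statistics.pstdev(estimates)

# Hypothetical toy proteome of two short sequences.
toy = ["AVVA", "AAAA"]
print(bootstrap_error(toy, vv_permille))
```

With only two proteins, as in this proteome, each replicate can contain just three distinct combinations of proteins, so the estimated errors should be read with caution.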

The dipeptide statistics above (among other statistics for this proteome) can be downloaded from this CSV file
See this proteome in: uniprot_link
Proteome-pI is available under the Creative Commons Attribution-NoDerivs license; for more details see here

Reference: Kozlowski LP. Proteome-pI 2.0: Proteome Isoelectric Point Database Update. Nucleic Acids Res. 2021, doi: 10.1093/nar/gkab944
Contact: Lukasz P. Kozlowski