Amino acid dipeptide frequency for Virus sp. ctgoX21

Apart from single amino acid frequencies, one can also calculate the so-called amino acid dipeptide frequency.

There are 400 possibilities (441 if we consider Xaa as an additional 21st amino acid representing all non-standard or unknown amino acids). Thus, if dipeptides occurred randomly in proteins, each amino acid dipeptide would be present at a frequency of 0.25% (2.5‰). As this is not the case in nature, for better visibility all dipeptides that occur more often than expected are marked in red, and those that are underrepresented are marked in blue in the table.
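The counting described above can be sketched in a few lines of Python. This is a minimal illustration, not the database's actual pipeline: the function name `dipeptide_permille`, the use of one-letter codes, and the choice to count overlapping pairs within each protein separately (never across protein boundaries) are all assumptions made here.

```python
from collections import Counter
from itertools import product

STANDARD = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues, one-letter codes

def dipeptide_permille(proteins, alphabet=STANDARD):
    """Return {dipeptide: frequency in per mille} for all 441 residue pairs.

    Assumption: overlapping pairs are counted inside each protein only.
    """
    counts = Counter()
    for seq in proteins:
        # Any symbol outside the standard alphabet is treated as X (Xaa).
        cleaned = "".join(c if c in alphabet else "X" for c in seq.upper())
        counts.update(a + b for a, b in zip(cleaned, cleaned[1:]))
    total = sum(counts.values())
    letters = alphabet + "X"
    return {
        a + b: (1000.0 * counts[a + b] / total if total else 0.0)
        for a, b in product(letters, repeat=2)
    }

# Uniform expectation over the 400 standard dipeptides: 2.5 per mille (0.25 %).
EXPECTED_PERMILLE = 1000.0 / len(STANDARD) ** 2
```

For example, `dipeptide_permille(["AAC", "CA"])` sees three pairs (AA, AC, CA), so each of them gets one third of 1000‰ and every other pair gets 0.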

All values are presented in per mille (‰) and therefore need to be multiplied by 10⁻³ to obtain fractions.
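As a small worked illustration of the unit conversion, the sketch below turns a per mille value into a plain fraction and compares it with the uniform expectation of 2.5‰. The helper names are hypothetical, and using exactly 2.5‰ as the cut-off between over- and underrepresented dipeptides is an assumption; the page only says "more than expected".

```python
EXPECTED_PERMILLE = 1000.0 / 400  # uniform expectation over 400 dipeptides: 2.5 per mille

def permille_to_fraction(value_permille):
    """Convert a per mille value to a plain fraction (multiply by 10^-3)."""
    return value_permille * 1e-3

def representation(value_permille, expected=EXPECTED_PERMILLE):
    """Classify a frequency relative to the assumed uniform expectation."""
    if value_permille > expected:
        return "over"
    if value_permille < expected:
        return "under"
    return "expected"
```

With the values listed for this proteome, SerSer at 13.051‰ corresponds to a fraction of 0.013051 and would be classified as "over", while AlaSer at 0.0‰ would be classified as "under".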

For more information, see the sequence space article on Wikipedia.

Each entry below is a dipeptide (first residue + second residue) followed by its frequency ± error in ‰.
Residue order: Ala Cys Asp Glu Phe Gly His Ile Lys Leu Met Asn Pro Gln Arg Ser Thr Val Trp Tyr Xaa
Ala
AlaAla: 3.263 ± 2.603
AlaCys: 3.263 ± 2.196
AlaAsp: 3.263 ± 2.603
AlaGlu: 4.894 ± 0.895
AlaPhe: 1.631 ± 1.098
AlaGly: 4.894 ± 0.895
AlaHis: 1.631 ± 1.098
AlaIle: 1.631 ± 1.098
AlaLys: 1.631 ± 1.302
AlaLeu: 6.525 ± 1.993
AlaMet: 1.631 ± 1.098
AlaAsn: 6.525 ± 2.807
AlaPro: 1.631 ± 1.302
AlaGln: 1.631 ± 1.098
AlaArg: 1.631 ± 1.302
AlaSer: 0.0 ± 0.0
AlaThr: 4.894 ± 3.905
AlaVal: 1.631 ± 1.302
AlaTrp: 0.0 ± 0.0
AlaTyr: 0.0 ± 0.0
AlaXaa: 0.0 ± 0.0
Cys
CysAla: 0.0 ± 0.0
CysCys: 0.0 ± 0.0
CysAsp: 0.0 ± 0.0
CysGlu: 1.631 ± 1.302
CysPhe: 3.263 ± 0.204
CysGly: 1.631 ± 1.302
CysHis: 1.631 ± 1.302
CysIle: 1.631 ± 1.302
CysLys: 1.631 ± 1.302
CysLeu: 3.263 ± 0.204
CysMet: 0.0 ± 0.0
CysAsn: 0.0 ± 0.0
CysPro: 0.0 ± 0.0
CysGln: 0.0 ± 0.0
CysArg: 3.263 ± 2.196
CysSer: 0.0 ± 0.0
CysThr: 4.894 ± 3.294
CysVal: 0.0 ± 0.0
CysTrp: 0.0 ± 0.0
CysTyr: 0.0 ± 0.0
CysXaa: 0.0 ± 0.0
Asp
AspAla: 3.263 ± 0.204
AspCys: 0.0 ± 0.0
AspAsp: 6.525 ± 1.993
AspGlu: 4.894 ± 0.895
AspPhe: 6.525 ± 1.993
AspGly: 1.631 ± 1.302
AspHis: 1.631 ± 1.098
AspIle: 1.631 ± 1.098
AspLys: 4.894 ± 3.294
AspLeu: 4.894 ± 0.895
AspMet: 0.0 ± 0.0
AspAsn: 3.263 ± 2.603
AspPro: 3.263 ± 2.196
AspGln: 0.0 ± 0.0
AspArg: 3.263 ± 2.196
AspSer: 1.631 ± 1.098
AspThr: 4.894 ± 0.895
AspVal: 1.631 ± 1.302
AspTrp: 1.631 ± 1.098
AspTyr: 1.631 ± 1.098
AspXaa: 0.0 ± 0.0
Glu
GluAla: 1.631 ± 1.098
GluCys: 0.0 ± 0.0
GluAsp: 3.263 ± 2.196
GluGlu: 3.263 ± 2.196
GluPhe: 3.263 ± 2.196
GluGly: 3.263 ± 2.196
GluHis: 1.631 ± 1.098
GluIle: 3.263 ± 2.196
GluLys: 3.263 ± 0.204
GluLeu: 4.894 ± 3.294
GluMet: 3.263 ± 0.718
GluAsn: 1.631 ± 1.098
GluPro: 0.0 ± 0.0
GluGln: 1.631 ± 1.098
GluArg: 1.631 ± 1.098
GluSer: 3.263 ± 2.196
GluThr: 0.0 ± 0.0
GluVal: 1.631 ± 1.098
GluTrp: 3.263 ± 2.196
GluTyr: 1.631 ± 1.302
GluXaa: 0.0 ± 0.0
Phe
PheAla: 3.263 ± 2.196
PheCys: 1.631 ± 1.302
PheAsp: 4.894 ± 0.895
PheGlu: 1.631 ± 1.098
PhePhe: 1.631 ± 1.302
PheGly: 4.894 ± 0.895
PheHis: 0.0 ± 0.0
PheIle: 3.263 ± 0.204
PheLys: 4.894 ± 3.905
PheLeu: 1.631 ± 1.098
PheMet: 0.0 ± 0.0
PheAsn: 6.525 ± 1.993
PhePro: 1.631 ± 1.098
PheGln: 0.0 ± 0.0
PheArg: 1.631 ± 1.098
PheSer: 3.263 ± 0.204
PheThr: 3.263 ± 0.204
PheVal: 4.894 ± 1.505
PheTrp: 1.631 ± 1.098
PheTyr: 1.631 ± 1.302
PheXaa: 0.0 ± 0.0
Gly
GlyAla: 3.263 ± 0.204
GlyCys: 0.0 ± 0.0
GlyAsp: 3.263 ± 2.196
GlyGlu: 0.0 ± 0.0
GlyPhe: 1.631 ± 1.302
GlyGly: 3.263 ± 2.196
GlyHis: 0.0 ± 0.0
GlyIle: 1.631 ± 1.098
GlyLys: 6.525 ± 1.993
GlyLeu: 0.0 ± 0.0
GlyMet: 3.263 ± 0.651
GlyAsn: 6.525 ± 5.207
GlyPro: 0.0 ± 0.0
GlyGln: 0.0 ± 0.0
GlyArg: 0.0 ± 0.0
GlySer: 1.631 ± 1.302
GlyThr: 4.894 ± 1.505
GlyVal: 3.263 ± 0.204
GlyTrp: 1.631 ± 1.098
GlyTyr: 1.631 ± 1.302
GlyXaa: 0.0 ± 0.0
His
HisAla: 0.0 ± 0.0
HisCys: 1.631 ± 1.302
HisAsp: 0.0 ± 0.0
HisGlu: 1.631 ± 1.098
HisPhe: 0.0 ± 0.0
HisGly: 0.0 ± 0.0
HisHis: 0.0 ± 0.0
HisIle: 3.263 ± 2.196
HisLys: 3.263 ± 2.196
HisLeu: 1.631 ± 1.098
HisMet: 0.0 ± 0.0
HisAsn: 0.0 ± 0.0
HisPro: 1.631 ± 1.098
HisGln: 0.0 ± 0.0
HisArg: 1.631 ± 1.302
HisSer: 0.0 ± 0.0
HisThr: 3.263 ± 0.204
HisVal: 3.263 ± 2.196
HisTrp: 0.0 ± 0.0
HisTyr: 0.0 ± 0.0
HisXaa: 0.0 ± 0.0
Ile
IleAla: 4.894 ± 0.895
IleCys: 0.0 ± 0.0
IleAsp: 3.263 ± 2.196
IleGlu: 3.263 ± 2.196
IlePhe: 3.263 ± 2.196
IleGly: 6.525 ± 2.807
IleHis: 3.263 ± 0.204
IleIle: 0.0 ± 0.0
IleLys: 6.525 ± 0.407
IleLeu: 0.0 ± 0.0
IleMet: 0.0 ± 0.0
IleAsn: 3.263 ± 0.204
IlePro: 0.0 ± 0.0
IleGln: 6.525 ± 2.807
IleArg: 3.263 ± 0.204
IleSer: 4.894 ± 0.895
IleThr: 3.263 ± 0.204
IleVal: 3.263 ± 0.204
IleTrp: 4.894 ± 3.294
IleTyr: 4.894 ± 0.895
IleXaa: 0.0 ± 0.0
Lys
LysAla: 0.0 ± 0.0
LysCys: 1.631 ± 1.098
LysAsp: 6.525 ± 4.393
LysGlu: 1.631 ± 1.098
LysPhe: 4.894 ± 0.895
LysGly: 1.631 ± 1.302
LysHis: 1.631 ± 1.098
LysIle: 6.525 ± 0.407
LysLys: 6.525 ± 2.807
LysLeu: 1.631 ± 1.098
LysMet: 1.631 ± 1.302
LysAsn: 1.631 ± 1.098
LysPro: 6.525 ± 0.407
LysGln: 0.0 ± 0.0
LysArg: 3.263 ± 0.204
LysSer: 4.894 ± 3.905
LysThr: 6.525 ± 1.993
LysVal: 4.894 ± 1.505
LysTrp: 1.631 ± 1.302
LysTyr: 4.894 ± 1.505
LysXaa: 0.0 ± 0.0
Leu
LeuAla: 6.525 ± 1.993
LeuCys: 1.631 ± 1.302
LeuAsp: 3.263 ± 2.196
LeuGlu: 4.894 ± 3.294
LeuPhe: 0.0 ± 0.0
LeuGly: 1.631 ± 1.302
LeuHis: 0.0 ± 0.0
LeuIle: 1.631 ± 1.098
LeuLys: 0.0 ± 0.0
LeuLeu: 1.631 ± 1.098
LeuMet: 4.894 ± 1.505
LeuAsn: 0.0 ± 0.0
LeuPro: 6.525 ± 4.393
LeuGln: 3.263 ± 0.204
LeuArg: 3.263 ± 0.204
LeuSer: 6.525 ± 1.993
LeuThr: 6.525 ± 1.993
LeuVal: 1.631 ± 1.302
LeuTrp: 0.0 ± 0.0
LeuTyr: 1.631 ± 1.098
LeuXaa: 0.0 ± 0.0
Met
MetAla: 3.263 ± 2.603
MetCys: 0.0 ± 0.0
MetAsp: 1.631 ± 1.302
MetGlu: 0.0 ± 0.0
MetPhe: 1.631 ± 1.098
MetGly: 0.0 ± 0.0
MetHis: 1.631 ± 1.098
MetIle: 3.263 ± 0.204
MetLys: 1.631 ± 1.098
MetLeu: 1.631 ± 1.098
MetMet: 0.0 ± 0.0
MetAsn: 3.263 ± 0.204
MetPro: 0.0 ± 0.0
MetGln: 1.631 ± 1.302
MetArg: 0.0 ± 0.0
MetSer: 4.894 ± 0.895
MetThr: 0.0 ± 0.0
MetVal: 1.631 ± 1.302
MetTrp: 0.0 ± 0.0
MetTyr: 1.631 ± 1.302
MetXaa: 0.0 ± 0.0
Asn
AsnAla: 3.263 ± 2.603
AsnCys: 1.631 ± 1.098
AsnAsp: 3.263 ± 2.603
AsnGlu: 0.0 ± 0.0
AsnPhe: 3.263 ± 0.204
AsnGly: 1.631 ± 1.098
AsnHis: 1.631 ± 1.098
AsnIle: 4.894 ± 0.895
AsnLys: 1.631 ± 1.098
AsnLeu: 4.894 ± 0.895
AsnMet: 0.0 ± 0.0
AsnAsn: 3.263 ± 2.196
AsnPro: 3.263 ± 2.603
AsnGln: 8.157 ± 4.109
AsnArg: 3.263 ± 2.196
AsnSer: 4.894 ± 1.505
AsnThr: 1.631 ± 1.302
AsnVal: 1.631 ± 1.302
AsnTrp: 0.0 ± 0.0
AsnTyr: 3.263 ± 2.603
AsnXaa: 0.0 ± 0.0
Pro
ProAla: 1.631 ± 1.302
ProCys: 1.631 ± 1.302
ProAsp: 0.0 ± 0.0
ProGlu: 3.263 ± 2.196
ProPhe: 1.631 ± 1.302
ProGly: 0.0 ± 0.0
ProHis: 1.631 ± 1.098
ProIle: 3.263 ± 2.196
ProLys: 3.263 ± 0.204
ProLeu: 0.0 ± 0.0
ProMet: 1.631 ± 1.098
ProAsn: 1.631 ± 1.098
ProPro: 3.263 ± 2.196
ProGln: 1.631 ± 1.098
ProArg: 4.894 ± 3.905
ProSer: 1.631 ± 1.302
ProThr: 8.157 ± 3.091
ProVal: 1.631 ± 1.302
ProTrp: 1.631 ± 1.302
ProTyr: 3.263 ± 0.204
ProXaa: 0.0 ± 0.0
Gln
GlnAla: 6.525 ± 0.407
GlnCys: 0.0 ± 0.0
GlnAsp: 0.0 ± 0.0
GlnGlu: 1.631 ± 1.098
GlnPhe: 1.631 ± 1.098
GlnGly: 0.0 ± 0.0
GlnHis: 0.0 ± 0.0
GlnIle: 4.894 ± 1.505
GlnLys: 1.631 ± 1.302
GlnLeu: 1.631 ± 1.098
GlnMet: 0.0 ± 0.0
GlnAsn: 3.263 ± 0.204
GlnPro: 6.525 ± 0.407
GlnGln: 1.631 ± 1.302
GlnArg: 4.894 ± 1.505
GlnSer: 6.525 ± 5.207
GlnThr: 1.631 ± 1.302
GlnVal: 4.894 ± 3.905
GlnTrp: 0.0 ± 0.0
GlnTyr: 0.0 ± 0.0
GlnXaa: 0.0 ± 0.0
Arg
ArgAla: 3.263 ± 0.204
ArgCys: 0.0 ± 0.0
ArgAsp: 4.894 ± 3.294
ArgGlu: 0.0 ± 0.0
ArgPhe: 6.525 ± 0.407
ArgGly: 1.631 ± 1.098
ArgHis: 1.631 ± 1.302
ArgIle: 3.263 ± 0.204
ArgLys: 4.894 ± 3.905
ArgLeu: 3.263 ± 2.196
ArgMet: 0.0 ± 0.0
ArgAsn: 3.263 ± 2.196
ArgPro: 1.631 ± 1.302
ArgGln: 1.631 ± 1.302
ArgArg: 4.894 ± 3.905
ArgSer: 8.157 ± 3.091
ArgThr: 3.263 ± 2.603
ArgVal: 3.263 ± 0.204
ArgTrp: 1.631 ± 1.098
ArgTyr: 0.0 ± 0.0
ArgXaa: 0.0 ± 0.0
Ser
SerAla: 0.0 ± 0.0
SerCys: 1.631 ± 1.098
SerAsp: 6.525 ± 1.993
SerGlu: 4.894 ± 3.294
SerPhe: 3.263 ± 0.204
SerGly: 3.263 ± 0.204
SerHis: 0.0 ± 0.0
SerIle: 3.263 ± 2.603
SerLys: 8.157 ± 1.709
SerLeu: 4.894 ± 1.505
SerMet: 1.631 ± 1.302
SerAsn: 1.631 ± 1.302
SerPro: 1.631 ± 1.302
SerGln: 11.419 ± 6.712
SerArg: 6.525 ± 1.993
SerSer: 13.051 ± 5.614
SerThr: 3.263 ± 2.603
SerVal: 6.525 ± 0.407
SerTrp: 1.631 ± 1.098
SerTyr: 0.0 ± 0.0
SerXaa: 0.0 ± 0.0
Thr
ThrAla: 3.263 ± 0.204
ThrCys: 3.263 ± 2.196
ThrAsp: 3.263 ± 2.603
ThrGlu: 1.631 ± 1.098
ThrPhe: 4.894 ± 3.905
ThrGly: 4.894 ± 1.505
ThrHis: 3.263 ± 2.196
ThrIle: 3.263 ± 0.204
ThrLys: 3.263 ± 2.196
ThrLeu: 6.525 ± 2.807
ThrMet: 3.263 ± 2.196
ThrAsn: 3.263 ± 0.204
ThrPro: 4.894 ± 1.505
ThrGln: 4.894 ± 1.505
ThrArg: 3.263 ± 0.204
ThrSer: 6.525 ± 5.207
ThrThr: 8.157 ± 1.709
ThrVal: 3.263 ± 2.196
ThrTrp: 0.0 ± 0.0
ThrTyr: 6.525 ± 1.993
ThrXaa: 0.0 ± 0.0
Val
ValAla: 3.263 ± 2.603
ValCys: 3.263 ± 2.603
ValAsp: 1.631 ± 1.098
ValGlu: 1.631 ± 1.098
ValPhe: 1.631 ± 1.098
ValGly: 3.263 ± 0.204
ValHis: 0.0 ± 0.0
ValIle: 9.788 ± 0.611
ValLys: 1.631 ± 1.098
ValLeu: 3.263 ± 2.196
ValMet: 3.263 ± 2.603
ValAsn: 4.894 ± 3.905
ValPro: 0.0 ± 0.0
ValGln: 0.0 ± 0.0
ValArg: 3.263 ± 2.603
ValSer: 4.894 ± 1.505
ValThr: 3.263 ± 2.196
ValVal: 6.525 ± 2.807
ValTrp: 0.0 ± 0.0
ValTyr: 3.263 ± 0.204
ValXaa: 0.0 ± 0.0
Trp
TrpAla: 1.631 ± 1.098
TrpCys: 0.0 ± 0.0
TrpAsp: 0.0 ± 0.0
TrpGlu: 1.631 ± 1.098
TrpPhe: 0.0 ± 0.0
TrpGly: 0.0 ± 0.0
TrpHis: 0.0 ± 0.0
TrpIle: 3.263 ± 0.204
TrpLys: 1.631 ± 1.098
TrpLeu: 1.631 ± 1.098
TrpMet: 0.0 ± 0.0
TrpAsn: 0.0 ± 0.0
TrpPro: 0.0 ± 0.0
TrpGln: 1.631 ± 1.098
TrpArg: 3.263 ± 2.196
TrpSer: 3.263 ± 0.204
TrpThr: 0.0 ± 0.0
TrpVal: 1.631 ± 1.098
TrpTrp: 0.0 ± 0.0
TrpTyr: 1.631 ± 1.098
TrpXaa: 0.0 ± 0.0
Tyr
TyrAla: 0.0 ± 0.0
TyrCys: 1.631 ± 1.302
TyrAsp: 1.631 ± 1.098
TyrGlu: 3.263 ± 2.196
TyrPhe: 1.631 ± 1.302
TyrGly: 0.0 ± 0.0
TyrHis: 0.0 ± 0.0
TyrIle: 1.631 ± 1.098
TyrLys: 1.631 ± 1.098
TyrLeu: 1.631 ± 1.302
TyrMet: 1.631 ± 1.302
TyrAsn: 1.631 ± 1.302
TyrPro: 3.263 ± 0.204
TyrGln: 1.631 ± 1.098
TyrArg: 0.0 ± 0.0
TyrSer: 3.263 ± 0.204
TyrThr: 9.788 ± 5.41
TyrVal: 1.631 ± 1.098
TyrTrp: 1.631 ± 1.098
TyrTyr: 1.631 ± 1.302
TyrXaa: 0.0 ± 0.0
Xaa
XaaAla: 0.0 ± 0.0
XaaCys: 0.0 ± 0.0
XaaAsp: 0.0 ± 0.0
XaaGlu: 0.0 ± 0.0
XaaPhe: 0.0 ± 0.0
XaaGly: 0.0 ± 0.0
XaaHis: 0.0 ± 0.0
XaaIle: 0.0 ± 0.0
XaaLys: 0.0 ± 0.0
XaaLeu: 0.0 ± 0.0
XaaMet: 0.0 ± 0.0
XaaAsn: 0.0 ± 0.0
XaaPro: 0.0 ± 0.0
XaaGln: 0.0 ± 0.0
XaaArg: 0.0 ± 0.0
XaaSer: 0.0 ± 0.0
XaaThr: 0.0 ± 0.0
XaaVal: 0.0 ± 0.0
XaaTrp: 0.0 ± 0.0
XaaTyr: 0.0 ± 0.0
XaaXaa: 0.0 ± 0.0
Statistics based on 2 proteins (614 amino acids)

Note: The error has been estimated by bootstrapping (×100) at the protein level.
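A protein-level bootstrap of this kind could look roughly like the sketch below. It is an illustration only: the function names are hypothetical, and taking the error to be the standard deviation of the frequency across the resampled replicates is an assumption, since the note does not say which statistic is reported.

```python
import random
import statistics
from collections import Counter

def pair_permille(proteins, pair):
    """Frequency (per mille) of one dipeptide over all overlapping pairs."""
    counts = Counter()
    for seq in proteins:
        counts.update(a + b for a, b in zip(seq, seq[1:]))
    total = sum(counts.values())
    return 1000.0 * counts[pair] / total if total else 0.0

def bootstrap_error(proteins, pair, n_boot=100, seed=0):
    """Resample whole proteins with replacement n_boot times.

    Assumption: the reported error is the standard deviation of the
    per-replicate frequencies.
    """
    rng = random.Random(seed)
    replicates = []
    for _ in range(n_boot):
        sample = rng.choices(proteins, k=len(proteins))  # protein-level resampling
        replicates.append(pair_permille(sample, pair))
    return statistics.pstdev(replicates)
```

With only two proteins, as in this proteome, each replicate can contain just one of three compositions (the first protein twice, the second twice, or one of each), which helps explain why the listed errors are large relative to the frequencies.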

The dipeptide statistics above (among other statistics for this proteome) can be downloaded from this CSV file.
See this proteome in: uniprot_link
Proteome-pI is available under a Creative Commons Attribution-NoDerivs license; for more details see here.

Reference: Kozlowski LP. Proteome-pI 2.0: Proteome Isoelectric Point Database Update. Nucleic Acids Res. 2021, doi: 10.1093/nar/gkab944
Contact: Lukasz P. Kozlowski