Amino acid dipeptide frequency for HCBI9.212 virus

Apart from single amino acid frequencies, one can also calculate the so-called amino acid dipeptide frequency.

There are 400 possibilities (441 if we consider Xaa as an additional 21st amino acid representing all non-standard or unknown amino acids). Thus, if dipeptides occurred randomly in proteins, each of the 400 amino acid dipeptides would be present with a frequency of 0.25% (2.5‰). As this is not the case in nature, for better visibility all dipeptides that occur more often than expected are marked in red, and those that are underrepresented are marked in blue in the table.

All values are presented in per mille (‰) and therefore need to be multiplied by 10⁻³ to obtain fractions.
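The calculation described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the code used by Proteome-pI: it assumes one-letter amino acid codes, counts overlapping dipeptides within each protein, normalizes by the total number of dipeptides, and maps every non-standard letter to X (Xaa). The function names `dipeptide_permille` and `classify` and the toy sequences are hypothetical.

```python
from collections import Counter
from itertools import product

STANDARD = "ACDEFGHIKLMNPQRSTVWY"   # the 20 standard amino acids
ALPHABET = STANDARD + "X"           # X stands for Xaa (non-standard or unknown)

# Expected frequency if all 400 standard dipeptides were equally likely.
EXPECTED_PERMILLE = 1000.0 / 400    # 2.5 per mille, i.e. 0.25%


def dipeptide_permille(sequences):
    """Return {dipeptide: frequency in per mille} for all 441 dipeptides.

    Assumption: dipeptides are overlapping pairs counted inside each
    sequence, and the denominator is the total number of such pairs.
    """
    counts = Counter()
    for seq in sequences:
        # Map anything outside the standard alphabet to X.
        seq = "".join(c if c in STANDARD else "X" for c in seq.upper())
        counts.update(seq[i:i + 2] for i in range(len(seq) - 1))
    total = sum(counts.values())
    return {
        a + b: (1000.0 * counts[a + b] / total if total else 0.0)
        for a, b in product(ALPHABET, repeat=2)
    }


def classify(freq_permille, expected=EXPECTED_PERMILLE):
    """Label a frequency relative to the uniform expectation."""
    if freq_permille > expected:
        return "over"
    if freq_permille < expected:
        return "under"
    return "expected"


if __name__ == "__main__":
    toy = ["MKRRA", "RRGW"]          # hypothetical sequences, not the real proteome
    freqs = dipeptide_permille(toy)
    print(freqs["RR"], classify(freqs["RR"]))
```

With this convention a value from the table such as ArgArg at 26.562‰ would be labeled "over", and any dipeptide at 0.0‰ would be labeled "under".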

For more information, see the sequence space article on Wikipedia.

Rows give the first residue of the dipeptide, columns the second residue. Each cell is the frequency in ‰ ± the estimated error.

1st \ 2nd | Ala | Cys | Asp | Glu | Phe | Gly | His | Ile | Lys | Leu | Met | Asn | Pro | Gln | Arg | Ser | Thr | Val | Trp | Tyr | Xaa
Ala | 1.562 ± 1.159 | 0.0 ± 0.0 | 3.125 ± 2.318 | 1.562 ± 1.061 | 3.125 ± 0.839 | 4.688 ± 1.522 | 3.125 ± 0.839 | 1.562 ± 1.159 | 1.562 ± 2.19 | 0.0 ± 0.0 | 1.562 ± 1.159 | 3.125 ± 2.318 | 9.375 ± 3.044 | 1.562 ± 2.19 | 4.688 ± 1.723 | 0.0 ± 0.0 | 3.125 ± 0.839 | 4.688 ± 1.386 | 0.0 ± 0.0 | 3.125 ± 2.122 | 0.0 ± 0.0
Cys | 1.562 ± 1.159 | 1.562 ± 1.159 | 3.125 ± 2.318 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 1.562 ± 2.19 | 1.562 ± 2.19 | 1.562 ± 1.159 | 0.0 ± 0.0 | 1.562 ± 2.19 | 0.0 ± 0.0 | 1.562 ± 1.061 | 1.562 ± 2.19 | 3.125 ± 2.318 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0
Asp | 3.125 ± 2.318 | 3.125 ± 2.193 | 6.25 ± 3.831 | 6.25 ± 1.132 | 0.0 ± 0.0 | 6.25 ± 2.814 | 1.562 ± 1.061 | 4.688 ± 2.741 | 3.125 ± 2.193 | 0.0 ± 0.0 | 0.0 ± 0.0 | 4.688 ± 1.723 | 3.125 ± 4.379 | 1.562 ± 1.061 | 1.562 ± 1.061 | 1.562 ± 1.061 | 3.125 ± 0.839 | 3.125 ± 2.122 | 3.125 ± 2.193 | 6.25 ± 1.69 | 0.0 ± 0.0
Glu | 3.125 ± 0.839 | 0.0 ± 0.0 | 3.125 ± 1.915 | 0.0 ± 0.0 | 3.125 ± 2.193 | 3.125 ± 2.122 | 3.125 ± 2.318 | 1.562 ± 1.159 | 1.562 ± 1.159 | 0.0 ± 0.0 | 0.0 ± 0.0 | 4.688 ± 1.522 | 1.562 ± 1.159 | 1.562 ± 2.19 | 12.5 ± 7.388 | 4.688 ± 1.386 | 0.0 ± 0.0 | 0.0 ± 0.0 | 1.562 ± 1.159 | 6.25 ± 1.69 | 0.0 ± 0.0
Phe | 4.688 ± 2.19 | 1.562 ± 1.061 | 9.375 ± 3.576 | 1.562 ± 2.19 | 1.562 ± 1.061 | 6.25 ± 1.678 | 1.562 ± 2.19 | 0.0 ± 0.0 | 0.0 ± 0.0 | 4.688 ± 1.723 | 0.0 ± 0.0 | 1.562 ± 1.061 | 1.562 ± 1.159 | 1.562 ± 1.159 | 3.125 ± 2.122 | 4.688 ± 2.19 | 1.562 ± 1.061 | 1.562 ± 2.19 | 1.562 ± 1.061 | 1.562 ± 1.159 | 0.0 ± 0.0
Gly | 6.25 ± 1.132 | 0.0 ± 0.0 | 7.812 ± 2.45 | 1.562 ± 1.159 | 7.812 ± 1.701 | 6.25 ± 1.132 | 3.125 ± 2.122 | 4.688 ± 1.723 | 3.125 ± 2.193 | 1.562 ± 1.061 | 1.562 ± 0.904 | 4.688 ± 1.723 | 3.125 ± 2.122 | 0.0 ± 0.0 | 9.375 ± 4.545 | 0.0 ± 0.0 | 6.25 ± 2.486 | 3.125 ± 1.915 | 3.125 ± 1.915 | 6.25 ± 4.244 | 0.0 ± 0.0
His | 0.0 ± 0.0 | 0.0 ± 0.0 | 3.125 ± 2.122 | 3.125 ± 0.839 | 3.125 ± 1.915 | 3.125 ± 2.318 | 0.0 ± 0.0 | 6.25 ± 1.132 | 1.562 ± 1.159 | 1.562 ± 1.159 | 0.0 ± 0.0 | 1.562 ± 1.061 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 3.125 ± 2.193 | 1.562 ± 1.159 | 0.0 ± 0.0 | 0.0 ± 0.0 | 1.562 ± 1.159 | 0.0 ± 0.0
Ile | 3.125 ± 2.122 | 1.562 ± 2.19 | 1.562 ± 1.061 | 3.125 ± 2.318 | 3.125 ± 2.122 | 3.125 ± 0.839 | 1.562 ± 1.159 | 9.375 ± 3.576 | 3.125 ± 0.839 | 0.0 ± 0.0 | 1.562 ± 3.776 | 1.562 ± 2.19 | 0.0 ± 0.0 | 1.562 ± 2.19 | 3.125 ± 2.318 | 6.25 ± 2.814 | 3.125 ± 2.193 | 4.688 ± 2.741 | 3.125 ± 2.122 | 6.25 ± 1.69 | 0.0 ± 0.0
Lys | 0.0 ± 0.0 | 0.0 ± 0.0 | 1.562 ± 1.159 | 1.562 ± 1.159 | 1.562 ± 1.061 | 7.812 ± 3.975 | 1.562 ± 1.061 | 1.562 ± 1.159 | 3.125 ± 1.915 | 1.562 ± 1.159 | 1.562 ± 1.159 | 1.562 ± 1.061 | 3.125 ± 2.193 | 1.562 ± 2.19 | 1.562 ± 2.19 | 1.562 ± 1.159 | 1.562 ± 1.061 | 3.125 ± 0.839 | 1.562 ± 1.159 | 1.562 ± 2.19 | 0.0 ± 0.0
Leu | 0.0 ± 0.0 | 1.562 ± 1.159 | 0.0 ± 0.0 | 4.688 ± 1.723 | 1.562 ± 1.061 | 6.25 ± 1.69 | 3.125 ± 0.839 | 3.125 ± 0.839 | 1.562 ± 1.159 | 3.125 ± 2.318 | 1.562 ± 2.19 | 3.125 ± 2.122 | 0.0 ± 0.0 | 7.812 ± 2.217 | 0.0 ± 0.0 | 4.688 ± 1.723 | 1.562 ± 1.061 | 1.562 ± 1.061 | 0.0 ± 0.0 | 1.562 ± 1.159 | 0.0 ± 0.0
Met | 1.562 ± 1.061 | 3.125 ± 2.193 | 1.562 ± 2.19 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 1.562 ± 1.061 | 0.0 ± 0.0 | 0.0 ± 0.0 | 3.125 ± 4.379 | 3.125 ± 0.839 | 0.0 ± 0.0 | 1.562 ± 1.159 | 0.0 ± 0.0 | 0.0 ± 0.0 | 1.562 ± 1.061 | 0.0 ± 0.0 | 1.562 ± 1.061 | 0.0 ± 0.0
Asn | 1.562 ± 1.061 | 1.562 ± 2.19 | 1.562 ± 1.061 | 4.688 ± 3.975 | 3.125 ± 2.318 | 1.562 ± 1.061 | 0.0 ± 0.0 | 3.125 ± 2.318 | 3.125 ± 0.839 | 1.562 ± 1.159 | 3.125 ± 2.122 | 1.562 ± 2.19 | 6.25 ± 1.132 | 3.125 ± 2.318 | 3.125 ± 2.122 | 6.25 ± 1.678 | 6.25 ± 4.244 | 3.125 ± 1.915 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0
Pro | 1.562 ± 1.159 | 1.562 ± 1.159 | 1.562 ± 1.159 | 6.25 ± 4.636 | 1.562 ± 1.159 | 1.562 ± 1.061 | 3.125 ± 2.193 | 4.688 ± 1.386 | 0.0 ± 0.0 | 6.25 ± 2.486 | 0.0 ± 0.0 | 3.125 ± 2.122 | 4.688 ± 2.741 | 3.125 ± 0.839 | 4.688 ± 1.723 | 4.688 ± 3.975 | 6.25 ± 4.244 | 1.562 ± 1.061 | 0.0 ± 0.0 | 1.562 ± 1.061 | 0.0 ± 0.0
Gln | 3.125 ± 2.318 | 0.0 ± 0.0 | 0.0 ± 0.0 | 1.562 ± 1.061 | 1.562 ± 2.19 | 4.688 ± 2.19 | 1.562 ± 1.159 | 3.125 ± 2.193 | 4.688 ± 2.19 | 3.125 ± 2.122 | 0.0 ± 0.0 | 0.0 ± 0.0 | 3.125 ± 0.839 | 1.562 ± 1.159 | 1.562 ± 1.159 | 1.562 ± 2.19 | 0.0 ± 0.0 | 4.688 ± 3.477 | 0.0 ± 0.0 | 1.562 ± 2.19 | 0.0 ± 0.0
Arg | 3.125 ± 0.839 | 0.0 ± 0.0 | 3.125 ± 2.318 | 4.688 ± 3.477 | 4.688 ± 1.522 | 6.25 ± 2.486 | 1.562 ± 1.159 | 3.125 ± 0.839 | 1.562 ± 2.19 | 6.25 ± 2.814 | 3.125 ± 1.047 | 1.562 ± 1.159 | 4.688 ± 1.522 | 4.688 ± 1.723 | 26.562 ± 14.342 | 7.812 ± 3.506 | 6.25 ± 1.132 | 6.25 ± 2.486 | 0.0 ± 0.0 | 6.25 ± 2.486 | 0.0 ± 0.0
Ser | 3.125 ± 2.318 | 3.125 ± 2.318 | 6.25 ± 1.132 | 3.125 ± 2.193 | 3.125 ± 0.839 | 3.125 ± 2.122 | 0.0 ± 0.0 | 4.688 ± 1.386 | 3.125 ± 2.122 | 6.25 ± 2.486 | 0.0 ± 0.0 | 9.375 ± 2.771 | 1.562 ± 1.061 | 3.125 ± 2.193 | 4.688 ± 1.522 | 4.688 ± 2.19 | 1.562 ± 1.061 | 0.0 ± 0.0 | 1.562 ± 1.061 | 6.25 ± 2.859 | 0.0 ± 0.0
Thr | 6.25 ± 1.678 | 0.0 ± 0.0 | 3.125 ± 1.915 | 3.125 ± 2.122 | 3.125 ± 0.839 | 6.25 ± 4.244 | 0.0 ± 0.0 | 1.562 ± 2.19 | 3.125 ± 0.839 | 1.562 ± 1.061 | 0.0 ± 0.0 | 0.0 ± 0.0 | 6.25 ± 1.678 | 1.562 ± 1.061 | 4.688 ± 3.183 | 4.688 ± 1.522 | 1.562 ± 1.061 | 3.125 ± 0.839 | 0.0 ± 0.0 | 1.562 ± 1.061 | 0.0 ± 0.0
Val | 3.125 ± 2.122 | 0.0 ± 0.0 | 3.125 ± 4.379 | 1.562 ± 1.159 | 4.688 ± 1.386 | 3.125 ± 0.839 | 1.562 ± 1.061 | 4.688 ± 2.19 | 0.0 ± 0.0 | 1.562 ± 1.061 | 0.0 ± 0.0 | 3.125 ± 0.839 | 1.562 ± 1.159 | 0.0 ± 0.0 | 7.812 ± 2.217 | 4.688 ± 1.386 | 3.125 ± 2.318 | 1.562 ± 2.19 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0
Trp | 1.562 ± 1.159 | 0.0 ± 0.0 | 0.0 ± 0.0 | 1.562 ± 1.159 | 0.0 ± 0.0 | 1.562 ± 2.19 | 0.0 ± 0.0 | 1.562 ± 1.061 | 0.0 ± 0.0 | 4.688 ± 4.227 | 0.0 ± 0.0 | 1.562 ± 1.061 | 0.0 ± 0.0 | 0.0 ± 0.0 | 1.562 ± 1.061 | 3.125 ± 2.122 | 1.562 ± 1.061 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0
Tyr | 3.125 ± 2.193 | 0.0 ± 0.0 | 3.125 ± 1.915 | 1.562 ± 1.159 | 3.125 ± 2.193 | 4.688 ± 3.183 | 1.562 ± 1.159 | 1.562 ± 1.061 | 1.562 ± 2.19 | 3.125 ± 2.318 | 1.562 ± 1.061 | 4.688 ± 3.183 | 3.125 ± 2.318 | 1.562 ± 1.061 | 6.25 ± 1.132 | 4.688 ± 3.183 | 3.125 ± 2.122 | 1.562 ± 1.159 | 3.125 ± 1.915 | 1.562 ± 1.159 | 0.0 ± 0.0
Xaa | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0

Statistics based on 3 proteins (641 amino acids)

Note: The error has been estimated by bootstrapping (x100) at the protein level.
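Bootstrapping at the protein level can be illustrated with the sketch below. The page does not describe the exact procedure, so this is a generic version under assumptions: whole proteins are resampled with replacement, the dipeptide frequency is recomputed for each of 100 resamples, and the standard deviation of those values is taken as the error. The names `dipeptide_counts` and `bootstrap_error`, the fixed seed, and the toy sequences are hypothetical.

```python
import random
import statistics
from collections import Counter


def dipeptide_counts(seq):
    """Count overlapping dipeptides in one protein sequence."""
    return Counter(seq[i:i + 2] for i in range(len(seq) - 1))


def bootstrap_error(proteins, dipeptide, n_rounds=100, seed=0):
    """Return (mean, standard deviation) of a dipeptide frequency in per mille.

    Assumption: each round draws len(proteins) proteins with replacement,
    so the protein, not the single residue, is the resampling unit.
    """
    rng = random.Random(seed)
    per_protein = [dipeptide_counts(p) for p in proteins]
    estimates = []
    for _ in range(n_rounds):
        sample = rng.choices(per_protein, k=len(per_protein))
        total = sum(sum(c.values()) for c in sample)
        hits = sum(c[dipeptide] for c in sample)
        estimates.append(1000.0 * hits / total if total else 0.0)
    return statistics.mean(estimates), statistics.stdev(estimates)


if __name__ == "__main__":
    toy = ["MKRRA", "RRGWRR", "AGTS"]   # hypothetical sequences, not the real proteome
    mean, sd = bootstrap_error(toy, "RR")
    print(f"RR: {mean:.3f} ± {sd:.3f} per mille")
```

With only 3 proteins, as in this proteome, each resample can differ considerably from the original set, which is consistent with the large errors reported relative to the frequencies.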

The above dipeptide statistics (among other statistics for this proteome) can be downloaded from this CSV file
See this proteome in: uniprot_link
Proteome-pI is available under the Creative Commons Attribution-NoDerivs license; for more details see here

Reference: Kozlowski LP. Proteome-pI 2.0: Proteome Isoelectric Point Database Update. Nucleic Acids Res. 2021, doi: 10.1093/nar/gkab944
Contact: Lukasz P. Kozlowski