Amino acid dipeptide frequency for CRESS virus sp. ctf7a5

Apart from single amino acid frequencies, one can also calculate the so-called amino acid dipeptide frequency, i.e. how often each ordered pair of adjacent residues occurs.
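As an illustration, dipeptide frequencies can be computed by sliding a window of length 2 along each protein sequence. The sketch below is only one plausible way to do it: the function name dipeptide_per_mille, the mapping of non-standard residues to X (Xaa), and the choice not to count pairs that span two proteins are assumptions, not a description of how the database computes its values.

```python
from collections import Counter

# The 20 standard amino acids in one-letter code.
STANDARD = set("ACDEFGHIKLMNPQRSTVWY")


def dipeptide_per_mille(proteins):
    """Return {dipeptide: frequency in per mille} for a list of sequences.

    Hypothetical helper: pairs are counted with overlapping windows and
    never across protein boundaries (an assumption of this sketch).
    """
    counts = Counter()
    for seq in proteins:
        # Map anything outside the 20 standard residues to 'X' (Xaa).
        seq = "".join(c if c in STANDARD else "X" for c in seq.upper())
        # Overlapping windows of length 2 within a single protein.
        counts.update(seq[i:i + 2] for i in range(len(seq) - 1))
    total = sum(counts.values())
    if total == 0:
        return {}
    return {pair: 1000.0 * n / total for pair, n in counts.items()}


# Toy input, not taken from the proteome described here.
freqs = dipeptide_per_mille(["MKKA", "AKU"])
for pair in sorted(freqs):
    print(pair, freqs[pair])
```

In the toy input, U (selenocysteine) is not one of the 20 standard residues, so the pair KU is counted as KX.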

There are 400 possible dipeptides (441 if Xaa is counted as an additional, 21st amino acid representing all non-standard or unknown amino acids). Thus, if dipeptides occurred randomly in proteins, each one would be present at a frequency of 0.25% (2.5‰). As this is not the case in nature, for better visibility all dipeptides that occur more often than expected are marked in red in the table, and those that are underrepresented are marked in blue.

All values are given in per mille (‰); multiply them by 10⁻³ to obtain fractions.
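The arithmetic behind the expected value and the unit conversion can be sketched as follows. The helper names are hypothetical, and comparing each value directly with the uniform expectation of 2.5‰ is an assumption about how over- and underrepresentation are decided; the page does not state the exact rule used for colouring.

```python
N_STANDARD = 20

# Uniform expectation over 20 x 20 ordered pairs, expressed in per mille.
EXPECTED_PER_MILLE = 1000.0 / N_STANDARD ** 2  # 2.5


def per_mille_to_fraction(value):
    """Convert a per mille value to a plain fraction (multiply by 10^-3)."""
    return value * 1e-3


def classify(value_per_mille, expected=EXPECTED_PER_MILLE):
    """Label a value relative to the uniform expectation (assumed rule)."""
    if value_per_mille > expected:
        return "over"
    if value_per_mille < expected:
        return "under"
    return "expected"


# Two values from the table: AlaAla (4.651) and AlaPhe (1.163).
print(EXPECTED_PER_MILLE)
print(classify(4.651), classify(1.163))
print(per_mille_to_fraction(4.651))
```

If Xaa is included as a 21st residue, the uniform expectation becomes 1000/441, roughly 2.27‰.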

For more information, see the sequence space article on Wikipedia.

Rows give the first residue of the dipeptide and columns the second; each cell is frequency ± error, in ‰.

1st \ 2nd | Ala | Cys | Asp | Glu | Phe | Gly | His | Ile | Lys | Leu | Met | Asn | Pro | Gln | Arg | Ser | Thr | Val | Trp | Tyr | Xaa
Ala | 4.651 ± 2.202 | 0.0 ± 0.0 | 3.488 ± 0.504 | 3.488 ± 0.504 | 1.163 ± 0.605 | 6.977 ± 2.092 | 0.0 ± 0.0 | 6.977 ± 3.265 | 4.651 ± 2.707 | 3.488 ± 0.504 | 0.0 ± 0.0 | 1.163 ± 0.605 | 1.163 ± 1.082 | 3.488 ± 1.734 | 0.0 ± 0.0 | 5.814 ± 1.506 | 4.651 ± 0.946 | 3.488 ± 1.816 | 1.163 ± 1.082 | 4.651 ± 1.176 | 0.0 ± 0.0
Cys | 0.0 ± 0.0 | 0.0 ± 0.0 | 1.163 ± 1.082 | 0.0 ± 0.0 | 2.326 ± 0.588 | 1.163 ± 0.605 | 0.0 ± 0.0 | 1.163 ± 0.605 | 1.163 ± 0.605 | 0.0 ± 0.0 | 1.163 ± 0.605 | 0.0 ± 0.0 | 0.0 ± 0.0 | 1.163 ± 0.605 | 0.0 ± 0.0 | 0.0 ± 0.0 | 1.163 ± 1.082 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0
Asp | 1.163 ± 1.082 | 0.0 ± 0.0 | 1.163 ± 1.803 | 4.651 ± 4.327 | 1.163 ± 0.605 | 3.488 ± 0.504 | 1.163 ± 1.803 | 2.326 ± 1.211 | 0.0 ± 0.0 | 4.651 ± 2.421 | 0.0 ± 0.0 | 0.0 ± 0.0 | 4.651 ± 1.176 | 0.0 ± 0.0 | 2.326 ± 0.588 | 0.0 ± 0.0 | 4.651 ± 2.202 | 5.814 ± 1.506 | 2.326 ± 0.588 | 1.163 ± 1.082 | 0.0 ± 0.0
Glu | 2.326 ± 2.163 | 1.163 ± 0.605 | 2.326 ± 2.163 | 3.488 ± 1.734 | 2.326 ± 2.163 | 3.488 ± 1.633 | 0.0 ± 0.0 | 2.326 ± 1.211 | 3.488 ± 1.816 | 4.651 ± 2.202 | 3.488 ± 3.254 | 4.651 ± 0.946 | 0.0 ± 0.0 | 1.163 ± 0.605 | 5.814 ± 2.203 | 1.163 ± 0.605 | 3.488 ± 1.623 | 3.488 ± 0.504 | 0.0 ± 0.0 | 4.651 ± 2.202 | 0.0 ± 0.0
Phe | 2.326 ± 1.211 | 1.163 ± 0.605 | 0.0 ± 0.0 | 3.488 ± 1.633 | 2.326 ± 1.211 | 1.163 ± 1.803 | 0.0 ± 0.0 | 0.0 ± 0.0 | 3.488 ± 1.816 | 0.0 ± 0.0 | 2.326 ± 1.211 | 1.163 ± 0.605 | 1.163 ± 1.803 | 0.0 ± 0.0 | 2.326 ± 0.588 | 4.651 ± 0.946 | 4.651 ± 2.421 | 4.651 ± 0.946 | 2.326 ± 0.588 | 0.0 ± 0.0 | 0.0 ± 0.0
Gly | 10.465 ± 2.017 | 0.0 ± 0.0 | 3.488 ± 1.623 | 1.163 ± 1.082 | 5.814 ± 3.026 | 8.14 ± 3.363 | 1.163 ± 1.082 | 2.326 ± 0.588 | 3.488 ± 0.504 | 3.488 ± 0.504 | 2.326 ± 0.588 | 3.488 ± 3.414 | 3.488 ± 1.816 | 4.651 ± 1.176 | 5.814 ± 3.026 | 6.977 ± 1.413 | 2.326 ± 2.163 | 8.14 ± 2.686 | 1.163 ± 0.605 | 4.651 ± 1.176 | 0.0 ± 0.0
His | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 2.326 ± 2.077 | 1.163 ± 0.605 | 0.0 ± 0.0 | 1.163 ± 1.803 | 1.163 ± 0.605 | 0.0 ± 0.0 | 1.163 ± 1.082 | 1.163 ± 0.605 | 0.0 ± 0.0 | 1.163 ± 0.605 | 0.0 ± 0.0 | 0.0 ± 0.0 | 1.163 ± 0.605 | 0.0 ± 0.0 | 1.163 ± 1.082 | 3.488 ± 3.245 | 0.0 ± 0.0
Ile | 2.326 ± 0.588 | 0.0 ± 0.0 | 2.326 ± 2.163 | 4.651 ± 3.38 | 2.326 ± 1.211 | 2.326 ± 1.211 | 0.0 ± 0.0 | 1.163 ± 0.605 | 2.326 ± 1.662 | 1.163 ± 0.605 | 1.163 ± 0.605 | 2.326 ± 1.662 | 4.651 ± 1.176 | 2.326 ± 1.211 | 3.488 ± 1.633 | 5.814 ± 3.026 | 2.326 ± 2.163 | 4.651 ± 1.996 | 2.326 ± 0.588 | 1.163 ± 1.082 | 0.0 ± 0.0
Lys | 3.488 ± 1.816 | 1.163 ± 1.082 | 1.163 ± 1.082 | 1.163 ± 1.082 | 1.163 ± 0.605 | 5.814 ± 0.913 | 2.326 ± 1.662 | 6.977 ± 2.092 | 4.651 ± 0.946 | 2.326 ± 1.211 | 3.488 ± 1.816 | 3.488 ± 0.504 | 3.488 ± 0.504 | 4.651 ± 2.421 | 4.651 ± 0.946 | 3.488 ± 3.245 | 1.163 ± 0.605 | 3.488 ± 1.816 | 4.651 ± 1.3 | 3.488 ± 1.816 | 0.0 ± 0.0
Leu | 4.651 ± 0.946 | 1.163 ± 1.082 | 5.814 ± 3.067 | 4.651 ± 1.3 | 1.163 ± 0.605 | 8.14 ± 1.866 | 1.163 ± 1.082 | 2.326 ± 2.077 | 2.326 ± 1.211 | 5.814 ± 3.785 | 0.0 ± 0.86 | 1.163 ± 1.082 | 1.163 ± 0.605 | 3.488 ± 1.816 | 10.465 ± 3.853 | 3.488 ± 1.623 | 2.326 ± 1.662 | 2.326 ± 2.163 | 4.651 ± 1.3 | 1.163 ± 0.605 | 0.0 ± 0.0
Met | 0.0 ± 0.0 | 0.0 ± 0.0 | 1.163 ± 0.605 | 4.651 ± 1.3 | 2.326 ± 1.211 | 1.163 ± 0.605 | 0.0 ± 0.0 | 0.0 ± 0.0 | 3.488 ± 1.816 | 1.163 ± 1.082 | 0.0 ± 0.0 | 2.326 ± 1.662 | 3.488 ± 1.623 | 0.0 ± 0.0 | 2.326 ± 0.588 | 1.163 ± 1.082 | 0.0 ± 0.0 | 0.0 ± 0.0 | 1.163 ± 0.605 | 2.326 ± 1.211 | 0.0 ± 0.0
Asn | 2.326 ± 0.588 | 0.0 ± 0.0 | 1.163 ± 0.605 | 4.651 ± 1.3 | 1.163 ± 0.605 | 1.163 ± 1.082 | 0.0 ± 0.0 | 2.326 ± 0.588 | 3.488 ± 1.623 | 4.651 ± 1.3 | 0.0 ± 0.0 | 3.488 ± 1.816 | 4.651 ± 1.996 | 1.163 ± 0.605 | 1.163 ± 0.605 | 3.488 ± 1.816 | 3.488 ± 1.816 | 4.651 ± 1.176 | 0.0 ± 0.0 | 2.326 ± 3.605 | 0.0 ± 0.0
Pro | 3.488 ± 0.504 | 0.0 ± 0.0 | 1.163 ± 0.605 | 5.814 ± 0.913 | 0.0 ± 0.0 | 3.488 ± 1.734 | 1.163 ± 1.082 | 3.488 ± 1.623 | 3.488 ± 0.504 | 4.651 ± 1.3 | 0.0 ± 0.0 | 2.326 ± 0.588 | 0.0 ± 0.0 | 1.163 ± 1.082 | 4.651 ± 2.707 | 3.488 ± 1.734 | 0.0 ± 0.0 | 4.651 ± 0.946 | 1.163 ± 0.605 | 1.163 ± 1.082 | 0.0 ± 0.0
Gln | 2.326 ± 0.588 | 2.326 ± 1.211 | 2.326 ± 1.211 | 0.0 ± 0.0 | 1.163 ± 1.082 | 0.0 ± 0.0 | 2.326 ± 0.588 | 1.163 ± 0.605 | 2.326 ± 1.211 | 3.488 ± 2.778 | 3.488 ± 1.706 | 3.488 ± 1.734 | 1.163 ± 1.082 | 0.0 ± 0.0 | 1.163 ± 0.605 | 4.651 ± 2.421 | 4.651 ± 2.421 | 2.326 ± 1.662 | 0.0 ± 0.0 | 1.163 ± 0.605 | 0.0 ± 0.0
Arg | 4.651 ± 1.176 | 0.0 ± 0.0 | 1.163 ± 0.605 | 2.326 ± 1.211 | 0.0 ± 0.0 | 11.628 ± 1.62 | 0.0 ± 0.0 | 2.326 ± 1.211 | 3.488 ± 1.816 | 4.651 ± 1.176 | 2.326 ± 1.211 | 1.163 ± 1.082 | 3.488 ± 1.734 | 2.326 ± 1.662 | 8.14 ± 1.39 | 6.977 ± 1.144 | 8.14 ± 4.337 | 5.814 ± 2.203 | 3.488 ± 0.504 | 5.814 ± 1.215 | 0.0 ± 0.0
Ser | 3.488 ± 0.504 | 0.0 ± 0.0 | 2.326 ± 0.588 | 1.163 ± 0.605 | 2.326 ± 1.211 | 4.651 ± 0.946 | 1.163 ± 1.082 | 1.163 ± 0.605 | 8.14 ± 1.39 | 3.488 ± 1.633 | 0.0 ± 0.0 | 4.651 ± 0.946 | 1.163 ± 0.605 | 2.326 ± 1.211 | 11.628 ± 3.408 | 3.488 ± 1.633 | 6.977 ± 2.853 | 4.651 ± 1.176 | 2.326 ± 1.211 | 2.326 ± 1.662 | 0.0 ± 0.0
Thr | 4.651 ± 1.996 | 1.163 ± 1.082 | 2.326 ± 1.211 | 0.0 ± 0.0 | 3.488 ± 1.734 | 6.977 ± 1.009 | 2.326 ± 0.588 | 1.163 ± 1.082 | 2.326 ± 2.163 | 6.977 ± 1.764 | 2.326 ± 0.588 | 2.326 ± 1.662 | 2.326 ± 0.588 | 4.651 ± 1.3 | 2.326 ± 0.588 | 5.814 ± 1.506 | 1.163 ± 0.605 | 3.488 ± 1.816 | 1.163 ± 0.605 | 2.326 ± 1.211 | 0.0 ± 0.0
Val | 3.488 ± 1.816 | 1.163 ± 0.605 | 4.651 ± 0.946 | 4.651 ± 1.176 | 4.651 ± 0.946 | 6.977 ± 2.092 | 1.163 ± 0.605 | 5.814 ± 1.215 | 6.977 ± 1.009 | 4.651 ± 1.176 | 1.163 ± 0.605 | 0.0 ± 0.0 | 3.488 ± 1.633 | 2.326 ± 0.588 | 8.14 ± 1.802 | 3.488 ± 0.504 | 3.488 ± 1.816 | 5.814 ± 0.913 | 2.326 ± 2.163 | 1.163 ± 0.605 | 0.0 ± 0.0
Trp | 0.0 ± 0.0 | 0.0 ± 0.0 | 1.163 ± 1.082 | 2.326 ± 0.588 | 0.0 ± 0.0 | 4.651 ± 0.946 | 0.0 ± 0.0 | 3.488 ± 1.623 | 3.488 ± 0.504 | 4.651 ± 3.669 | 0.0 ± 0.0 | 4.651 ± 0.946 | 0.0 ± 0.0 | 2.326 ± 0.588 | 1.163 ± 0.605 | 1.163 ± 0.605 | 1.163 ± 0.605 | 1.163 ± 0.605 | 2.326 ± 0.588 | 1.163 ± 1.082 | 0.0 ± 0.0
Tyr | 4.651 ± 1.176 | 1.163 ± 0.605 | 2.326 ± 1.211 | 1.163 ± 1.082 | 0.0 ± 0.0 | 0.0 ± 0.0 | 1.163 ± 0.605 | 1.163 ± 0.605 | 2.326 ± 1.211 | 4.651 ± 5.486 | 1.163 ± 1.803 | 2.326 ± 0.588 | 5.814 ± 2.203 | 1.163 ± 0.605 | 2.326 ± 2.077 | 2.326 ± 1.211 | 2.326 ± 0.588 | 6.977 ± 1.764 | 1.163 ± 1.082 | 1.163 ± 0.605 | 0.0 ± 0.0
Xaa | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0

Statistics based on 3 proteins (861 amino acids)

Note: The error was estimated by bootstrapping (×100) at the protein level.
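A protein-level bootstrap of this kind can be sketched as below. This is a minimal illustration under stated assumptions, not the database's actual procedure: the function names, the use of the standard deviation across replicates as the error, the fixed random seed, and counting pairs only within single proteins are all choices made for this sketch.

```python
import random
import statistics
from collections import Counter


def pair_counts(seq):
    """Count overlapping dipeptides within one protein sequence."""
    return Counter(seq[i:i + 2] for i in range(len(seq) - 1))


def bootstrap_error(proteins, pair, n_rep=100, seed=0):
    """Spread (in per mille) of one dipeptide's frequency across replicates.

    Hypothetical helper: each replicate resamples whole proteins with
    replacement, which is what "at the protein level" is assumed to mean.
    """
    rng = random.Random(seed)
    per_protein = [pair_counts(p) for p in proteins]
    estimates = []
    for _ in range(n_rep):
        # Draw as many proteins as the proteome has, with replacement.
        sample = rng.choices(per_protein, k=len(per_protein))
        total = sum(sum(c.values()) for c in sample)
        hits = sum(c[pair] for c in sample)
        estimates.append(1000.0 * hits / total if total else 0.0)
    # Population standard deviation over the replicate estimates.
    return statistics.pstdev(estimates)


# Toy sequences, not the three proteins of this proteome.
toy = ["MKKAG", "AGGKR", "RGAKK"]
print(bootstrap_error(toy, "KK"))
```

With only a handful of proteins, as in a small viral proteome, the replicates draw from very few distinct combinations, so errors estimated this way are coarse.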

The dipeptide statistics above (among other statistics for this proteome) can be downloaded from this CSV file
See this proteome in: uniprot_link
Proteome-pI is available under the Creative Commons Attribution-NoDerivs license; for more details see here

Reference: Kozlowski LP. Proteome-pI 2.0: Proteome Isoelectric Point Database Update. Nucleic Acids Res. 2021, doi: 10.1093/nar/gkab944
Contact: Lukasz P. Kozlowski