Amino acid dipeptide frequency for Gokushovirus MK-2017

Apart from single amino acid frequencies, one can also calculate the so-called amino acid dipeptide frequency, i.e. how often each ordered pair of adjacent amino acids occurs.

There are 400 possible dipeptides (441 if Xaa is counted as an additional 21st amino acid representing all non-standard or unknown amino acids). Thus, if dipeptides occurred randomly in proteins, each of the 400 dipeptides would be expected at a frequency of 0.25% (2.5‰). As this is not the case in nature, for better visibility all overrepresented dipeptides are marked in red and all underrepresented ones in blue in the table.
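The counting described above can be sketched in a few lines of Python. This is only an illustration, not the code used by Proteome-pI: the function names `dipeptide_permille` and `classify` are hypothetical, the input is a toy list of one-letter sequences rather than the Gokushovirus proteome, and it assumes that pairs are counted as overlapping windows within each protein and never across protein boundaries.

```python
from collections import Counter
from itertools import product

# The 20 standard amino acids (one-letter codes); anything else is mapped to "X" (Xaa).
STANDARD = "ACDEFGHIKLMNPQRSTVWY"

def dipeptide_permille(proteins):
    """Return {dipeptide: frequency in per mille} for all 441 ordered pairs."""
    counts = Counter()
    for seq in proteins:
        seq = "".join(aa if aa in STANDARD else "X" for aa in seq.upper())
        # Overlapping windows of length 2 (assumption: pairs never span two proteins).
        counts.update(seq[i:i + 2] for i in range(len(seq) - 1))
    total = sum(counts.values())
    alphabet = STANDARD + "X"
    return {a + b: (1000.0 * counts[a + b] / total if total else 0.0)
            for a, b in product(alphabet, repeat=2)}

def classify(freq_permille, expected_permille=1000.0 / 400):
    """Compare a frequency with the uniform expectation (2.5 per mille for 400 pairs)."""
    if freq_permille > expected_permille:
        return "over"
    if freq_permille < expected_permille:
        return "under"
    return "expected"

# Toy input, not the real proteome.
toy = ["MAAAK", "MKAA"]
freqs = dipeptide_permille(toy)
print(round(freqs["AA"], 1), classify(freqs["AA"]))
```

With real data, the "over" and "under" labels would correspond to the red and blue markings mentioned above, provided the site uses the same uniform 2.5‰ threshold, which is an assumption here.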

All values are given in per mille (‰) and therefore need to be multiplied by 10⁻³ to obtain fractions.
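As a small worked example of the unit conversion, the snippet below turns a per mille value from the table into a fraction and a percentage. The helper name `permille_to_fraction` is hypothetical and exists only for illustration.

```python
def permille_to_fraction(value_permille):
    """Convert a per mille value to a plain fraction (multiply by 10^-3)."""
    return value_permille * 1e-3

# AlaAla is listed as 5.063 per mille in the table.
ala_ala = permille_to_fraction(5.063)
print(f"fraction: {ala_ala:.6f}")
print(f"percent:  {ala_ala * 100:.4f}")

# The uniform expectation of 0.25% corresponds to 2.5 per mille.
uniform = permille_to_fraction(2.5)
print(f"uniform expectation as fraction: {uniform:.4f}")
```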

For more information, see the sequence space article on Wikipedia.

Ala Cys Asp Glu Phe Gly His Ile Lys Leu Met Asn Pro Gln Arg Ser Thr Val Trp Tyr Xaa
Ala
5.063AlaAla: 5.063 ± 1.795
0.844AlaCys: 0.844 ± 0.526
7.595AlaAsp: 7.595 ± 2.574
3.376AlaGlu: 3.376 ± 1.169
2.532AlaPhe: 2.532 ± 0.893
5.063AlaGly: 5.063 ± 1.413
0.0AlaHis: 0.0 ± 0.0
0.844AlaIle: 0.844 ± 0.763
2.532AlaLys: 2.532 ± 0.714
5.907AlaLeu: 5.907 ± 1.547
1.688AlaMet: 1.688 ± 1.053
5.907AlaAsn: 5.907 ± 1.152
3.376AlaPro: 3.376 ± 1.285
0.844AlaGln: 0.844 ± 0.526
3.376AlaArg: 3.376 ± 0.623
3.376AlaSer: 3.376 ± 1.428
5.063AlaThr: 5.063 ± 3.159
4.219AlaVal: 4.219 ± 3.398
2.532AlaTrp: 2.532 ± 0.893
3.376AlaTyr: 3.376 ± 0.623
0.0AlaXaa: 0.0 ± 0.0
Cys
1.688CysAla: 1.688 ± 1.529
0.0CysCys: 0.0 ± 0.0
1.688CysAsp: 1.688 ± 0.904
0.0CysGlu: 0.0 ± 0.0
0.0CysPhe: 0.0 ± 0.0
0.844CysGly: 0.844 ± 0.763
0.0CysHis: 0.0 ± 0.0
0.844CysIle: 0.844 ± 1.471
0.844CysLys: 0.844 ± 0.526
3.376CysLeu: 3.376 ± 2.043
0.0CysMet: 0.0 ± 0.0
0.844CysAsn: 0.844 ± 0.526
0.0CysPro: 0.0 ± 0.0
1.688CysGln: 1.688 ± 0.642
1.688CysArg: 1.688 ± 1.529
0.0CysSer: 0.0 ± 0.0
0.0CysThr: 0.0 ± 0.0
0.0CysVal: 0.0 ± 0.0
0.0CysTrp: 0.0 ± 0.0
0.0CysTyr: 0.0 ± 0.0
0.0CysXaa: 0.0 ± 0.0
Asp
4.219AspAla: 4.219 ± 3.547
0.0AspCys: 0.0 ± 0.0
3.376AspAsp: 3.376 ± 1.08
1.688AspGlu: 1.688 ± 1.321
3.376AspPhe: 3.376 ± 1.285
2.532AspGly: 2.532 ± 1.372
3.376AspHis: 3.376 ± 1.318
5.063AspIle: 5.063 ± 0.886
3.376AspLys: 3.376 ± 1.169
7.595AspLeu: 7.595 ± 2.013
0.844AspMet: 0.844 ± 0.526
4.219AspAsn: 4.219 ± 1.439
3.376AspPro: 3.376 ± 1.643
1.688AspGln: 1.688 ± 0.642
0.0AspArg: 0.0 ± 0.0
7.595AspSer: 7.595 ± 4.172
2.532AspThr: 2.532 ± 1.068
7.595AspVal: 7.595 ± 3.101
0.0AspTrp: 0.0 ± 0.0
6.751AspTyr: 6.751 ± 1.545
0.0AspXaa: 0.0 ± 0.0
Glu
3.376GluAla: 3.376 ± 1.318
0.844GluCys: 0.844 ± 0.763
0.844GluAsp: 0.844 ± 0.526
1.688GluGlu: 1.688 ± 1.529
0.844GluPhe: 0.844 ± 1.023
0.844GluGly: 0.844 ± 1.023
3.376GluHis: 3.376 ± 1.169
3.376GluIle: 3.376 ± 1.169
1.688GluLys: 1.688 ± 0.642
0.0GluLeu: 0.0 ± 0.0
2.532GluMet: 2.532 ± 1.517
5.063GluAsn: 5.063 ± 1.413
0.844GluPro: 0.844 ± 0.763
3.376GluGln: 3.376 ± 1.421
5.063GluArg: 5.063 ± 2.166
3.376GluSer: 3.376 ± 1.428
5.063GluThr: 5.063 ± 1.927
5.063GluVal: 5.063 ± 0.98
0.844GluTrp: 0.844 ± 0.526
5.063GluTyr: 5.063 ± 2.617
0.0GluXaa: 0.0 ± 0.0
Phe
5.063PheAla: 5.063 ± 1.927
1.688PheCys: 1.688 ± 1.321
7.595PheAsp: 7.595 ± 2.37
1.688PheGlu: 1.688 ± 1.917
0.844PhePhe: 0.844 ± 0.526
5.063PheGly: 5.063 ± 1.486
0.0PheHis: 0.0 ± 0.0
1.688PheIle: 1.688 ± 1.525
2.532PheLys: 2.532 ± 1.068
0.0PheLeu: 0.0 ± 0.0
1.688PheMet: 1.688 ± 0.642
0.844PheAsn: 0.844 ± 0.526
2.532PhePro: 2.532 ± 1.579
1.688PheGln: 1.688 ± 0.642
4.219PheArg: 4.219 ± 1.546
0.844PheSer: 0.844 ± 1.023
2.532PheThr: 2.532 ± 1.068
3.376PheVal: 3.376 ± 1.285
0.844PheTrp: 0.844 ± 0.526
2.532PheTyr: 2.532 ± 0.893
0.0PheXaa: 0.0 ± 0.0
Gly
3.376GlyAla: 3.376 ± 1.421
0.844GlyCys: 0.844 ± 0.763
3.376GlyAsp: 3.376 ± 1.421
6.751GlyGlu: 6.751 ± 0.604
2.532GlyPhe: 2.532 ± 1.372
6.751GlyGly: 6.751 ± 1.981
0.844GlyHis: 0.844 ± 1.023
2.532GlyIle: 2.532 ± 1.305
4.219GlyLys: 4.219 ± 1.464
5.063GlyLeu: 5.063 ± 1.345
1.688GlyMet: 1.688 ± 1.053
3.376GlyAsn: 3.376 ± 1.285
1.688GlyPro: 1.688 ± 1.321
3.376GlyGln: 3.376 ± 1.539
1.688GlyArg: 1.688 ± 0.904
6.751GlySer: 6.751 ± 1.311
5.907GlyThr: 5.907 ± 1.232
4.219GlyVal: 4.219 ± 1.907
0.844GlyTrp: 0.844 ± 0.526
1.688GlyTyr: 1.688 ± 0.642
0.0GlyXaa: 0.0 ± 0.0
His
3.376HisAla: 3.376 ± 2.226
0.0HisCys: 0.0 ± 0.0
1.688HisAsp: 1.688 ± 1.053
1.688HisGlu: 1.688 ± 1.089
1.688HisPhe: 1.688 ± 1.053
0.844HisGly: 0.844 ± 0.526
1.688HisHis: 1.688 ± 1.053
0.844HisIle: 0.844 ± 1.471
1.688HisLys: 1.688 ± 0.642
0.844HisLeu: 0.844 ± 0.526
1.688HisMet: 1.688 ± 0.801
0.0HisAsn: 0.0 ± 0.0
0.0HisPro: 0.0 ± 0.0
0.0HisGln: 0.0 ± 0.0
0.0HisArg: 0.0 ± 0.0
0.0HisSer: 0.0 ± 0.0
1.688HisThr: 1.688 ± 0.642
0.844HisVal: 0.844 ± 0.763
1.688HisTrp: 1.688 ± 1.321
0.844HisTyr: 0.844 ± 0.763
0.0HisXaa: 0.0 ± 0.0
Ile
3.376IleAla: 3.376 ± 2.419
0.844IleCys: 0.844 ± 0.526
2.532IleAsp: 2.532 ± 1.068
0.844IleGlu: 0.844 ± 0.526
0.844IlePhe: 0.844 ± 0.763
3.376IleGly: 3.376 ± 1.08
0.0IleHis: 0.0 ± 0.0
2.532IleIle: 2.532 ± 1.577
0.844IleLys: 0.844 ± 1.023
1.688IleLeu: 1.688 ± 1.529
0.0IleMet: 0.0 ± 0.0
6.751IleAsn: 6.751 ± 2.223
5.063IlePro: 5.063 ± 2.298
3.376IleGln: 3.376 ± 2.419
3.376IleArg: 3.376 ± 1.728
2.532IleSer: 2.532 ± 1.308
1.688IleThr: 1.688 ± 1.053
3.376IleVal: 3.376 ± 1.285
0.0IleTrp: 0.0 ± 0.0
3.376IleTyr: 3.376 ± 1.246
0.0IleXaa: 0.0 ± 0.0
Lys
4.219LysAla: 4.219 ± 0.906
0.844LysCys: 0.844 ± 0.763
5.063LysAsp: 5.063 ± 2.026
1.688LysGlu: 1.688 ± 0.904
3.376LysPhe: 3.376 ± 1.285
1.688LysGly: 1.688 ± 1.053
0.844LysHis: 0.844 ± 0.763
0.844LysIle: 0.844 ± 1.356
4.219LysLys: 4.219 ± 2.331
2.532LysLeu: 2.532 ± 1.308
0.0LysMet: 0.0 ± 0.0
0.844LysAsn: 0.844 ± 0.763
0.844LysPro: 0.844 ± 0.526
2.532LysGln: 2.532 ± 1.579
3.376LysArg: 3.376 ± 2.084
4.219LysSer: 4.219 ± 1.464
5.907LysThr: 5.907 ± 1.119
1.688LysVal: 1.688 ± 1.712
0.0LysTrp: 0.0 ± 0.0
2.532LysTyr: 2.532 ± 1.308
0.0LysXaa: 0.0 ± 0.0
Leu
4.219LeuAla: 4.219 ± 2.632
0.0LeuCys: 0.0 ± 0.0
6.751LeuAsp: 6.751 ± 1.913
6.751LeuGlu: 6.751 ± 4.086
3.376LeuPhe: 3.376 ± 1.539
4.219LeuGly: 4.219 ± 1.915
0.0LeuHis: 0.0 ± 0.0
4.219LeuIle: 4.219 ± 1.858
2.532LeuLys: 2.532 ± 1.308
4.219LeuLeu: 4.219 ± 2.793
4.219LeuMet: 4.219 ± 3.794
3.376LeuAsn: 3.376 ± 1.08
5.907LeuPro: 5.907 ± 2.459
3.376LeuGln: 3.376 ± 1.08
5.063LeuArg: 5.063 ± 2.765
3.376LeuSer: 3.376 ± 1.421
5.063LeuThr: 5.063 ± 1.927
2.532LeuVal: 2.532 ± 1.857
0.0LeuTrp: 0.0 ± 0.0
4.219LeuTyr: 4.219 ± 1.907
0.0LeuXaa: 0.0 ± 0.0
Met
0.0MetAla: 0.0 ± 0.0
0.0MetCys: 0.0 ± 0.0
2.532MetAsp: 2.532 ± 0.714
0.0MetGlu: 0.0 ± 0.0
0.844MetPhe: 0.844 ± 0.526
1.688MetGly: 1.688 ± 0.904
0.844MetHis: 0.844 ± 1.471
1.688MetIle: 1.688 ± 2.942
0.844MetLys: 0.844 ± 0.763
0.844MetLeu: 0.844 ± 1.023
0.844MetMet: 0.844 ± 0.526
1.688MetAsn: 1.688 ± 0.642
2.532MetPro: 2.532 ± 1.37
0.844MetGln: 0.844 ± 0.526
1.688MetArg: 1.688 ± 0.642
4.219MetSer: 4.219 ± 2.02
0.0MetThr: 0.0 ± 0.0
0.844MetVal: 0.844 ± 0.526
0.0MetTrp: 0.0 ± 0.0
1.688MetTyr: 1.688 ± 0.904
0.0MetXaa: 0.0 ± 0.0
Asn
1.688AsnAla: 1.688 ± 1.053
0.0AsnCys: 0.0 ± 0.0
0.844AsnAsp: 0.844 ± 1.471
3.376AsnGlu: 3.376 ± 0.623
0.844AsnPhe: 0.844 ± 1.023
5.063AsnGly: 5.063 ± 1.428
0.0AsnHis: 0.0 ± 0.0
4.219AsnIle: 4.219 ± 1.398
0.844AsnLys: 0.844 ± 0.526
5.063AsnLeu: 5.063 ± 0.886
0.844AsnMet: 0.844 ± 0.763
0.844AsnAsn: 0.844 ± 1.023
4.219AsnPro: 4.219 ± 1.954
4.219AsnGln: 4.219 ± 0.906
3.376AsnArg: 3.376 ± 1.285
3.376AsnSer: 3.376 ± 1.421
2.532AsnThr: 2.532 ± 0.714
3.376AsnVal: 3.376 ± 1.543
2.532AsnTrp: 2.532 ± 1.577
1.688AsnTyr: 1.688 ± 1.053
0.0AsnXaa: 0.0 ± 0.0
Pro
0.844ProAla: 0.844 ± 0.526
0.844ProCys: 0.844 ± 0.763
4.219ProAsp: 4.219 ± 0.906
2.532ProGlu: 2.532 ± 0.714
3.376ProPhe: 3.376 ± 1.318
3.376ProGly: 3.376 ± 1.318
1.688ProHis: 1.688 ± 1.089
0.844ProIle: 0.844 ± 0.526
2.532ProLys: 2.532 ± 2.034
5.063ProLeu: 5.063 ± 2.429
0.844ProMet: 0.844 ± 0.526
0.0ProAsn: 0.0 ± 0.0
2.532ProPro: 2.532 ± 1.577
2.532ProGln: 2.532 ± 1.579
0.844ProArg: 0.844 ± 1.471
5.907ProSer: 5.907 ± 1.83
4.219ProThr: 4.219 ± 1.516
8.439ProVal: 8.439 ± 2.617
0.0ProTrp: 0.0 ± 0.0
1.688ProTyr: 1.688 ± 0.642
0.0ProXaa: 0.0 ± 0.0
Gln
3.376GlnAla: 3.376 ± 1.421
0.0GlnCys: 0.0 ± 0.0
0.0GlnAsp: 0.0 ± 0.0
3.376GlnGlu: 3.376 ± 1.285
1.688GlnPhe: 1.688 ± 1.053
5.063GlnGly: 5.063 ± 0.886
0.0GlnHis: 0.0 ± 0.0
2.532GlnIle: 2.532 ± 0.893
2.532GlnLys: 2.532 ± 1.579
5.063GlnLeu: 5.063 ± 1.486
0.0GlnMet: 0.0 ± 0.0
3.376GlnAsn: 3.376 ± 3.059
0.844GlnPro: 0.844 ± 1.023
3.376GlnGln: 3.376 ± 1.285
1.688GlnArg: 1.688 ± 0.904
5.063GlnSer: 5.063 ± 2.156
0.844GlnThr: 0.844 ± 0.763
1.688GlnVal: 1.688 ± 1.053
1.688GlnTrp: 1.688 ± 1.525
0.844GlnTyr: 0.844 ± 0.526
0.0GlnXaa: 0.0 ± 0.0
Arg
3.376ArgAla: 3.376 ± 2.043
0.844ArgCys: 0.844 ± 0.763
4.219ArgAsp: 4.219 ± 1.954
1.688ArgGlu: 1.688 ± 0.904
3.376ArgPhe: 3.376 ± 1.421
1.688ArgGly: 1.688 ± 1.257
1.688ArgHis: 1.688 ± 1.321
1.688ArgIle: 1.688 ± 1.053
3.376ArgLys: 3.376 ± 2.043
5.063ArgLeu: 5.063 ± 2.096
0.844ArgMet: 0.844 ± 0.526
1.688ArgAsn: 1.688 ± 0.642
4.219ArgPro: 4.219 ± 1.915
2.532ArgGln: 2.532 ± 1.846
4.219ArgArg: 4.219 ± 4.336
4.219ArgSer: 4.219 ± 3.3
1.688ArgThr: 1.688 ± 1.551
4.219ArgVal: 4.219 ± 0.906
1.688ArgTrp: 1.688 ± 2.045
5.063ArgTyr: 5.063 ± 0.886
0.0ArgXaa: 0.0 ± 0.0
Ser
3.376SerAla: 3.376 ± 1.08
2.532SerCys: 2.532 ± 1.659
4.219SerAsp: 4.219 ± 1.602
2.532SerGlu: 2.532 ± 0.714
4.219SerPhe: 4.219 ± 3.547
5.063SerGly: 5.063 ± 3.107
1.688SerHis: 1.688 ± 1.053
2.532SerIle: 2.532 ± 1.579
3.376SerLys: 3.376 ± 3.424
8.439SerLeu: 8.439 ± 2.504
1.688SerMet: 1.688 ± 1.551
3.376SerAsn: 3.376 ± 1.807
4.219SerPro: 4.219 ± 2.642
2.532SerGln: 2.532 ± 0.893
9.283SerArg: 9.283 ± 3.233
2.532SerSer: 2.532 ± 0.893
4.219SerThr: 4.219 ± 0.894
5.907SerVal: 5.907 ± 1.119
0.844SerTrp: 0.844 ± 0.526
2.532SerTyr: 2.532 ± 1.577
0.0SerXaa: 0.0 ± 0.0
Thr
4.219ThrAla: 4.219 ± 1.546
0.844ThrCys: 0.844 ± 0.763
5.907ThrAsp: 5.907 ± 1.383
5.063ThrGlu: 5.063 ± 1.345
3.376ThrPhe: 3.376 ± 1.318
7.595ThrGly: 7.595 ± 1.715
0.844ThrHis: 0.844 ± 0.526
3.376ThrIle: 3.376 ± 1.728
2.532ThrLys: 2.532 ± 1.308
4.219ThrLeu: 4.219 ± 1.764
1.688ThrMet: 1.688 ± 2.045
0.844ThrAsn: 0.844 ± 0.526
2.532ThrPro: 2.532 ± 1.579
1.688ThrGln: 1.688 ± 0.642
1.688ThrArg: 1.688 ± 0.904
5.063ThrSer: 5.063 ± 1.345
2.532ThrThr: 2.532 ± 1.579
2.532ThrVal: 2.532 ± 1.068
1.688ThrTrp: 1.688 ± 1.525
1.688ThrTyr: 1.688 ± 0.642
0.0ThrXaa: 0.0 ± 0.0
Val
6.751ValAla: 6.751 ± 1.449
0.844ValCys: 0.844 ± 0.763
3.376ValAsp: 3.376 ± 1.807
5.907ValGlu: 5.907 ± 1.78
4.219ValPhe: 4.219 ± 2.031
5.063ValGly: 5.063 ± 2.136
0.0ValHis: 0.0 ± 0.0
2.532ValIle: 2.532 ± 1.857
3.376ValLys: 3.376 ± 2.62
5.907ValLeu: 5.907 ± 1.83
0.844ValMet: 0.844 ± 0.526
4.219ValAsn: 4.219 ± 1.858
5.063ValPro: 5.063 ± 1.486
0.844ValGln: 0.844 ± 0.763
4.219ValArg: 4.219 ± 2.168
3.376ValSer: 3.376 ± 1.543
4.219ValThr: 4.219 ± 0.894
0.844ValVal: 0.844 ± 0.526
1.688ValTrp: 1.688 ± 1.053
0.844ValTyr: 0.844 ± 1.471
0.0ValXaa: 0.0 ± 0.0
Trp
1.688TrpAla: 1.688 ± 0.642
0.844TrpCys: 0.844 ± 1.471
0.0TrpAsp: 0.0 ± 0.0
0.844TrpGlu: 0.844 ± 0.763
0.844TrpPhe: 0.844 ± 0.526
0.0TrpGly: 0.0 ± 0.0
0.844TrpHis: 0.844 ± 0.526
0.844TrpIle: 0.844 ± 0.526
1.688TrpLys: 1.688 ± 1.053
0.844TrpLeu: 0.844 ± 0.526
0.844TrpMet: 0.844 ± 1.023
0.844TrpAsn: 0.844 ± 1.023
0.844TrpPro: 0.844 ± 0.526
0.0TrpGln: 0.0 ± 0.0
0.0TrpArg: 0.0 ± 0.0
3.376TrpSer: 3.376 ± 0.623
0.844TrpThr: 0.844 ± 0.763
0.844TrpVal: 0.844 ± 0.763
0.0TrpTrp: 0.0 ± 0.0
1.688TrpTyr: 1.688 ± 1.525
0.0TrpXaa: 0.0 ± 0.0
Tyr
4.219TyrAla: 4.219 ± 1.439
0.844TyrCys: 0.844 ± 0.526
3.376TyrAsp: 3.376 ± 1.643
1.688TyrGlu: 1.688 ± 1.053
5.063TyrPhe: 5.063 ± 2.617
1.688TyrGly: 1.688 ± 0.642
3.376TyrHis: 3.376 ± 2.043
3.376TyrIle: 3.376 ± 1.807
1.688TyrLys: 1.688 ± 1.053
2.532TyrLeu: 2.532 ± 0.893
0.0TyrMet: 0.0 ± 0.0
0.844TyrAsn: 0.844 ± 0.763
1.688TyrPro: 1.688 ± 1.525
2.532TyrGln: 2.532 ± 1.068
2.532TyrArg: 2.532 ± 1.37
5.907TyrSer: 5.907 ± 2.33
3.376TyrThr: 3.376 ± 3.051
2.532TyrVal: 2.532 ± 0.893
0.844TyrTrp: 0.844 ± 0.526
3.376TyrTyr: 3.376 ± 1.318
0.0TyrXaa: 0.0 ± 0.0
Xaa
0.0XaaAla: 0.0 ± 0.0
0.0XaaCys: 0.0 ± 0.0
0.0XaaAsp: 0.0 ± 0.0
0.0XaaGlu: 0.0 ± 0.0
0.0XaaPhe: 0.0 ± 0.0
0.0XaaGly: 0.0 ± 0.0
0.0XaaHis: 0.0 ± 0.0
0.0XaaIle: 0.0 ± 0.0
0.0XaaLys: 0.0 ± 0.0
0.0XaaLeu: 0.0 ± 0.0
0.0XaaMet: 0.0 ± 0.0
0.0XaaAsn: 0.0 ± 0.0
0.0XaaPro: 0.0 ± 0.0
0.0XaaGln: 0.0 ± 0.0
0.0XaaArg: 0.0 ± 0.0
0.0XaaSer: 0.0 ± 0.0
0.0XaaThr: 0.0 ± 0.0
0.0XaaVal: 0.0 ± 0.0
0.0XaaTrp: 0.0 ± 0.0
0.0XaaTyr: 0.0 ± 0.0
0.0XaaXaa: 0.0 ± 0.0
Statistics based on 5 proteins (1186 amino acids)

Note: The error has been estimated by bootstrapping (×100) at the protein level.
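A protein-level bootstrap of this kind can be sketched as follows. The note only states that 100 bootstrap rounds were run at the protein level; everything else here is an assumption made for illustration: the function names `pair_permille` and `bootstrap_error` are hypothetical, the reported ± value is taken to be the standard deviation across resamples, and the sequences are toy data.

```python
import random
import statistics

def pair_permille(proteins, pair):
    """Frequency of one dipeptide, in per mille, over overlapping pairs within each protein."""
    hits = total = 0
    for seq in proteins:
        for i in range(len(seq) - 1):
            total += 1
            hits += seq[i:i + 2] == pair  # True counts as 1
    return 1000.0 * hits / total if total else 0.0

def bootstrap_error(proteins, pair, n_resamples=100, seed=0):
    """Resample whole proteins with replacement and return the spread of the statistic."""
    rng = random.Random(seed)
    estimates = []
    for _ in range(n_resamples):
        # Draw as many proteins as the proteome has; the same protein may appear repeatedly.
        sample = rng.choices(proteins, k=len(proteins))
        estimates.append(pair_permille(sample, pair))
    # Assumption: the error is the sample standard deviation of the resampled estimates.
    return statistics.stdev(estimates)

# Toy proteome of 5 short sequences, not the real data.
toy = ["MAAAK", "MKAA", "MKKLV", "MAKAL", "MLLAA"]
print(pair_permille(toy, "AA"), "±", bootstrap_error(toy, "AA"))
```

Resampling whole proteins rather than individual residues keeps each sequence intact, so with only a handful of proteins a single unusual protein can produce a large error relative to the mean.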

The above dipeptide statistics (among other statistics for this proteome) can be downloaded from this CSV file
See this proteome in: uniprot_link
Proteome-pI is available under Creative Commons Attribution-NoDerivs license, for more details see here

Reference: Kozlowski LP. Proteome-pI 2.0: Proteome Isoelectric Point Database Update. Nucleic Acids Res. 2021, doi: 10.1093/nar/gkab944
Contact: Lukasz P. Kozlowski