Amino acid dipeptide frequency for Human gut gokushovirus

Apart from single amino acid frequencies, one can also calculate the so-called amino acid dipeptide frequency, i.e. how often each ordered pair of adjacent amino acids occurs.
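As an illustration of what such a calculation can look like, here is a minimal sketch that counts overlapping residue pairs in a set of protein sequences and reports them in per mille. It is not the database's own code: the function name, the one-letter input format, the mapping of non-standard residues to X (Xaa), and the choice not to count pairs that span two proteins are all assumptions made for this example.

```python
from collections import Counter

# The 20 standard amino acids in one-letter code.
STANDARD = set("ACDEFGHIKLMNPQRSTVWY")

def dipeptide_per_mille(proteins):
    """Return {dipeptide: frequency in per mille} over overlapping residue pairs."""
    counts = Counter()
    for seq in proteins:
        # Assumption: anything outside the 20 standard residues becomes X (Xaa).
        seq = "".join(c if c in STANDARD else "X" for c in seq.upper())
        # Overlapping pairs within one protein; pairs never span two proteins.
        counts.update(seq[i:i + 2] for i in range(len(seq) - 1))
    total = sum(counts.values())
    if total == 0:
        return {}
    return {pair: 1000.0 * n / total for pair, n in counts.items()}

# Hypothetical toy sequences, not taken from the proteome.
freqs = dipeptide_per_mille(["MKKA", "AKK"])
for pair, value in sorted(freqs.items()):
    print(pair, round(value, 3))
```

The two toy sequences contain five overlapping pairs in total, two of which are KK, so KK accounts for two fifths of all pairs.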

There are 400 possible dipeptides (441 if Xaa is counted as an additional 21st amino acid representing all non-standard or unknown amino acids). Thus, if dipeptides occurred randomly and with equal probability in proteins, each dipeptide would be present at a frequency of 0.25% (2.5‰), or about 0.23% (2.27‰) when Xaa is included. As this is not the case in nature, for better visibility all dipeptides that occur more often than expected are marked in red in the table, and those that are underrepresented are marked in blue.
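The arithmetic behind the expected value can be sketched as follows. The page does not state the exact threshold used for the red and blue marking, so treating the uniform expectation as the cut-off is an assumption, and the function names are hypothetical.

```python
def uniform_expectation_per_mille(alphabet_size=20):
    """Frequency (per mille) of each dipeptide if all pairs were equally likely."""
    return 1000.0 / (alphabet_size ** 2)

def classify(value_per_mille, expected_per_mille):
    """Label a dipeptide relative to the expectation (assumed cut-off)."""
    if value_per_mille > expected_per_mille:
        return "over"
    if value_per_mille < expected_per_mille:
        return "under"
    return "as expected"

expected = uniform_expectation_per_mille(20)      # 400 possible dipeptides
expected_xaa = uniform_expectation_per_mille(21)  # 441 with Xaa included

# Two values copied from the table (AlaAla and AlaPro), in per mille.
labels = {name: classify(v, expected)
          for name, v in {"AlaAla": 7.676, "AlaPro": 0.698}.items()}
print(expected, round(expected_xaa, 3), labels)
```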

All values are given in per mille (‰) and therefore need to be multiplied by 10⁻³ to obtain fractions.
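For readers who want the values as plain fractions or percentages, the conversion is a single multiplication. A small sketch (the helper names are hypothetical):

```python
def per_mille_to_fraction(value):
    """Convert a per mille value to a plain fraction (multiply by 10^-3)."""
    return value * 1e-3

def per_mille_to_percent(value):
    """Convert a per mille value to percent (divide by 10)."""
    return value / 10.0

ala_ala = 7.676  # AlaAla value from the table, in per mille
print(per_mille_to_fraction(ala_ala))
print(per_mille_to_percent(ala_ala))
```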

For more information, see the sequence space article on Wikipedia.

Ala Cys Asp Glu Phe Gly His Ile Lys Leu Met Asn Pro Gln Arg Ser Thr Val Trp Tyr Xaa
Dipeptide: frequency ± error (‰), grouped by first residue
Ala
AlaAla: 7.676 ± 0.698
AlaCys: 1.396 ± 0.944
AlaAsp: 2.791 ± 1.429
AlaGlu: 4.885 ± 3.092
AlaPhe: 3.489 ± 1.566
AlaGly: 8.374 ± 4.021
AlaHis: 0.0 ± 0.0
AlaIle: 4.885 ± 0.815
AlaLys: 2.791 ± 1.72
AlaLeu: 4.187 ± 1.63
AlaMet: 4.187 ± 1.848
AlaAsn: 5.583 ± 1.338
AlaPro: 0.698 ± 0.896
AlaGln: 2.791 ± 1.348
AlaArg: 3.489 ± 1.08
AlaSer: 4.187 ± 1.657
AlaThr: 2.791 ± 1.0
AlaVal: 5.583 ± 2.263
AlaTrp: 0.698 ± 0.475
AlaTyr: 6.978 ± 1.616
AlaXaa: 0.0 ± 0.0
Cys
CysAla: 0.0 ± 0.0
CysCys: 0.698 ± 0.475
CysAsp: 0.698 ± 0.896
CysGlu: 0.0 ± 0.0
CysPhe: 0.0 ± 0.0
CysGly: 0.698 ± 0.589
CysHis: 0.0 ± 0.0
CysIle: 0.698 ± 0.589
CysLys: 1.396 ± 0.944
CysLeu: 1.396 ± 0.951
CysMet: 0.698 ± 0.589
CysAsn: 0.0 ± 0.0
CysPro: 0.698 ± 0.475
CysGln: 0.0 ± 0.0
CysArg: 0.698 ± 0.589
CysSer: 0.0 ± 0.0
CysThr: 0.698 ± 0.589
CysVal: 0.698 ± 0.475
CysTrp: 0.0 ± 0.0
CysTyr: 0.698 ± 0.589
CysXaa: 0.0 ± 0.0
Asp
AspAla: 2.094 ± 0.815
AspCys: 0.0 ± 0.0
AspAsp: 4.885 ± 0.91
AspGlu: 2.791 ± 1.472
AspPhe: 2.094 ± 0.809
AspGly: 2.094 ± 0.785
AspHis: 2.094 ± 1.363
AspIle: 3.489 ± 2.73
AspLys: 2.094 ± 0.496
AspLeu: 3.489 ± 1.495
AspMet: 0.698 ± 0.615
AspAsn: 2.791 ± 1.429
AspPro: 0.698 ± 0.896
AspGln: 1.396 ± 0.951
AspArg: 2.094 ± 0.815
AspSer: 1.396 ± 0.951
AspThr: 3.489 ± 1.176
AspVal: 0.698 ± 0.475
AspTrp: 2.094 ± 1.289
AspTyr: 4.885 ± 1.2
AspXaa: 0.0 ± 0.0
Glu
GluAla: 6.281 ± 1.272
GluCys: 0.698 ± 0.896
GluAsp: 3.489 ± 1.492
GluGlu: 7.676 ± 4.636
GluPhe: 2.094 ± 1.126
GluGly: 4.187 ± 1.82
GluHis: 2.791 ± 0.943
GluIle: 7.676 ± 1.898
GluLys: 4.885 ± 1.673
GluLeu: 3.489 ± 1.175
GluMet: 0.698 ± 0.475
GluAsn: 4.885 ± 1.276
GluPro: 1.396 ± 0.897
GluGln: 5.583 ± 1.764
GluArg: 2.094 ± 0.809
GluSer: 4.187 ± 2.435
GluThr: 4.187 ± 1.68
GluVal: 3.489 ± 1.714
GluTrp: 0.698 ± 0.615
GluTyr: 2.791 ± 0.862
GluXaa: 0.0 ± 0.0
Phe
PheAla: 4.187 ± 1.252
PheCys: 0.0 ± 0.0
PheAsp: 2.094 ± 1.126
PheGlu: 0.698 ± 0.475
PhePhe: 2.094 ± 1.122
PheGly: 2.094 ± 0.815
PheHis: 0.698 ± 0.475
PheIle: 2.094 ± 0.785
PheLys: 2.094 ± 0.773
PheLeu: 2.094 ± 1.068
PheMet: 0.698 ± 0.475
PheAsn: 2.791 ± 1.902
PhePro: 0.698 ± 0.896
PheGln: 0.0 ± 0.0
PheArg: 2.094 ± 1.122
PheSer: 1.396 ± 0.939
PheThr: 2.094 ± 0.809
PheVal: 2.094 ± 0.496
PheTrp: 1.396 ± 0.951
PheTyr: 2.094 ± 1.426
PheXaa: 0.0 ± 0.0
Gly
GlyAla: 6.281 ± 1.638
GlyCys: 0.698 ± 0.589
GlyAsp: 2.791 ± 0.986
GlyGlu: 8.374 ± 1.377
GlyPhe: 2.094 ± 1.058
GlyGly: 10.468 ± 3.901
GlyHis: 0.698 ± 0.615
GlyIle: 0.698 ± 0.784
GlyLys: 8.374 ± 1.73
GlyLeu: 4.885 ± 1.959
GlyMet: 2.094 ± 1.363
GlyAsn: 2.791 ± 0.874
GlyPro: 0.698 ± 0.475
GlyGln: 6.281 ± 2.308
GlyArg: 2.791 ± 0.862
GlySer: 5.583 ± 2.617
GlyThr: 4.885 ± 1.31
GlyVal: 4.187 ± 1.546
GlyTrp: 2.094 ± 0.496
GlyTyr: 4.885 ± 1.26
GlyXaa: 0.0 ± 0.0
His
HisAla: 0.698 ± 0.589
HisCys: 0.0 ± 0.0
HisAsp: 0.698 ± 0.475
HisGlu: 0.698 ± 0.896
HisPhe: 2.094 ± 1.511
HisGly: 0.698 ± 0.475
HisHis: 0.0 ± 0.0
HisIle: 0.698 ± 0.615
HisLys: 0.698 ± 0.589
HisLeu: 0.698 ± 0.589
HisMet: 2.094 ± 1.289
HisAsn: 2.094 ± 0.785
HisPro: 0.0 ± 0.0
HisGln: 0.0 ± 0.0
HisArg: 0.698 ± 0.896
HisSer: 1.396 ± 0.543
HisThr: 0.698 ± 0.784
HisVal: 1.396 ± 0.524
HisTrp: 0.0 ± 0.0
HisTyr: 2.094 ± 1.008
HisXaa: 0.0 ± 0.0
Ile
IleAla: 5.583 ± 0.811
IleCys: 0.0 ± 0.0
IleAsp: 4.187 ± 1.157
IleGlu: 2.094 ± 1.068
IlePhe: 0.698 ± 0.475
IleGly: 2.094 ± 0.809
IleHis: 0.698 ± 0.896
IleIle: 4.885 ± 1.863
IleLys: 4.187 ± 1.018
IleLeu: 4.187 ± 0.816
IleMet: 2.094 ± 0.496
IleAsn: 3.489 ± 1.193
IlePro: 4.187 ± 1.618
IleGln: 4.187 ± 2.023
IleArg: 3.489 ± 1.08
IleSer: 2.094 ± 0.496
IleThr: 2.791 ± 1.274
IleVal: 4.187 ± 3.168
IleTrp: 1.396 ± 0.524
IleTyr: 0.0 ± 0.0
IleXaa: 0.0 ± 0.0
Lys
LysAla: 9.072 ± 1.785
LysCys: 0.0 ± 0.0
LysAsp: 2.094 ± 1.309
LysGlu: 6.281 ± 1.649
LysPhe: 2.094 ± 0.766
LysGly: 4.187 ± 0.816
LysHis: 0.698 ± 0.589
LysIle: 3.489 ± 1.08
LysLys: 8.374 ± 1.773
LysLeu: 4.187 ± 1.553
LysMet: 4.187 ± 1.163
LysAsn: 0.698 ± 0.589
LysPro: 4.885 ± 1.654
LysGln: 4.187 ± 2.001
LysArg: 2.791 ± 1.348
LysSer: 2.791 ± 1.737
LysThr: 4.885 ± 1.199
LysVal: 4.885 ± 2.233
LysTrp: 0.698 ± 0.589
LysTyr: 4.187 ± 1.561
LysXaa: 0.0 ± 0.0
Leu
LeuAla: 4.885 ± 2.241
LeuCys: 0.0 ± 0.0
LeuAsp: 2.094 ± 1.426
LeuGlu: 4.187 ± 1.574
LeuPhe: 0.0 ± 0.0
LeuGly: 4.187 ± 0.942
LeuHis: 0.0 ± 0.0
LeuIle: 2.791 ± 1.048
LeuLys: 5.583 ± 2.728
LeuLeu: 0.698 ± 0.589
LeuMet: 1.396 ± 0.913
LeuAsn: 2.094 ± 1.38
LeuPro: 4.187 ± 2.853
LeuGln: 2.791 ± 2.153
LeuArg: 2.791 ± 1.048
LeuSer: 4.885 ± 0.879
LeuThr: 4.187 ± 0.974
LeuVal: 0.0 ± 0.0
LeuTrp: 2.791 ± 1.048
LeuTyr: 4.885 ± 2.001
LeuXaa: 0.0 ± 0.0
Met
MetAla: 1.396 ± 0.939
MetCys: 0.0 ± 0.0
MetAsp: 0.698 ± 0.475
MetGlu: 2.791 ± 0.891
MetPhe: 0.0 ± 0.0
MetGly: 4.885 ± 1.347
MetHis: 0.0 ± 0.0
MetIle: 2.094 ± 1.286
MetLys: 3.489 ± 1.115
MetLeu: 2.791 ± 0.986
MetMet: 0.698 ± 0.589
MetAsn: 2.094 ± 0.924
MetPro: 1.396 ± 0.951
MetGln: 1.396 ± 1.229
MetArg: 2.094 ± 0.809
MetSer: 2.791 ± 0.943
MetThr: 2.094 ± 0.924
MetVal: 0.0 ± 0.0
MetTrp: 0.0 ± 0.0
MetTyr: 1.396 ± 1.178
MetXaa: 0.0 ± 0.0
Asn
AsnAla: 6.978 ± 1.529
AsnCys: 0.0 ± 0.0
AsnAsp: 1.396 ± 0.524
AsnGlu: 4.885 ± 2.468
AsnPhe: 0.698 ± 0.475
AsnGly: 3.489 ± 1.024
AsnHis: 1.396 ± 0.543
AsnIle: 2.791 ± 1.0
AsnLys: 2.791 ± 1.233
AsnLeu: 0.698 ± 0.475
AsnMet: 0.0 ± 0.0
AsnAsn: 1.396 ± 0.951
AsnPro: 2.094 ± 1.252
AsnGln: 2.094 ± 1.056
AsnArg: 4.187 ± 0.942
AsnSer: 4.187 ± 2.023
AsnThr: 4.885 ± 1.434
AsnVal: 4.187 ± 0.974
AsnTrp: 0.698 ± 0.784
AsnTyr: 1.396 ± 0.543
AsnXaa: 0.0 ± 0.0
Pro
ProAla: 0.698 ± 0.475
ProCys: 0.698 ± 0.589
ProAsp: 0.698 ± 0.475
ProGlu: 4.187 ± 1.618
ProPhe: 2.094 ± 0.766
ProGly: 4.885 ± 1.387
ProHis: 0.698 ± 0.589
ProIle: 4.187 ± 1.59
ProLys: 2.094 ± 0.785
ProLeu: 2.094 ± 0.809
ProMet: 0.698 ± 0.475
ProAsn: 2.791 ± 0.69
ProPro: 0.0 ± 0.0
ProGln: 4.885 ± 1.461
ProArg: 0.698 ± 0.589
ProSer: 0.0 ± 0.0
ProThr: 2.791 ± 1.219
ProVal: 2.791 ± 1.902
ProTrp: 0.698 ± 0.475
ProTyr: 2.094 ± 0.815
ProXaa: 0.0 ± 0.0
Gln
GlnAla: 4.187 ± 1.234
GlnCys: 2.094 ± 1.767
GlnAsp: 0.698 ± 0.615
GlnGlu: 5.583 ± 2.831
GlnPhe: 2.791 ± 1.472
GlnGly: 6.978 ± 2.478
GlnHis: 0.698 ± 0.589
GlnIle: 4.187 ± 1.305
GlnLys: 3.489 ± 1.492
GlnLeu: 1.396 ± 1.178
GlnMet: 2.094 ± 1.289
GlnAsn: 2.094 ± 1.068
GlnPro: 1.396 ± 0.524
GlnGln: 4.187 ± 1.336
GlnArg: 2.791 ± 0.943
GlnSer: 3.489 ± 3.073
GlnThr: 4.187 ± 1.572
GlnVal: 4.187 ± 0.816
GlnTrp: 0.0 ± 0.0
GlnTyr: 2.094 ± 0.766
GlnXaa: 0.0 ± 0.0
Arg
ArgAla: 2.791 ± 0.874
ArgCys: 0.698 ± 0.475
ArgAsp: 2.094 ± 0.809
ArgGlu: 2.791 ± 0.733
ArgPhe: 2.094 ± 0.809
ArgGly: 2.094 ± 1.056
ArgHis: 1.396 ± 0.951
ArgIle: 2.791 ± 1.048
ArgLys: 5.583 ± 2.07
ArgLeu: 3.489 ± 2.14
ArgMet: 2.791 ± 1.472
ArgAsn: 1.396 ± 0.799
ArgPro: 2.094 ± 1.767
ArgGln: 2.791 ± 1.566
ArgArg: 0.698 ± 0.475
ArgSer: 2.094 ± 1.058
ArgThr: 2.791 ± 1.429
ArgVal: 2.791 ± 1.048
ArgTrp: 0.0 ± 0.0
ArgTyr: 4.187 ± 1.338
ArgXaa: 0.0 ± 0.0
Ser
SerAla: 2.094 ± 1.058
SerCys: 1.396 ± 0.524
SerAsp: 2.791 ± 1.219
SerGlu: 2.791 ± 2.745
SerPhe: 2.094 ± 1.122
SerGly: 7.676 ± 1.88
SerHis: 0.698 ± 0.784
SerIle: 2.094 ± 1.126
SerLys: 2.094 ± 1.289
SerLeu: 2.094 ± 1.286
SerMet: 2.094 ± 1.844
SerAsn: 2.094 ± 0.815
SerPro: 0.0 ± 0.0
SerGln: 6.281 ± 3.993
SerArg: 3.489 ± 0.917
SerSer: 5.583 ± 3.286
SerThr: 4.885 ± 2.091
SerVal: 3.489 ± 0.917
SerTrp: 1.396 ± 1.229
SerTyr: 2.791 ± 1.098
SerXaa: 0.0 ± 0.0
Thr
ThrAla: 4.187 ± 1.766
ThrCys: 0.0 ± 0.0
ThrAsp: 6.281 ± 1.22
ThrGlu: 3.489 ± 1.949
ThrPhe: 2.791 ± 1.219
ThrGly: 4.885 ± 0.879
ThrHis: 0.698 ± 0.615
ThrIle: 2.791 ± 1.219
ThrLys: 6.281 ± 1.726
ThrLeu: 5.583 ± 1.417
ThrMet: 0.0 ± 0.0
ThrAsn: 2.791 ± 0.563
ThrPro: 5.583 ± 1.749
ThrGln: 0.0 ± 0.0
ThrArg: 3.489 ± 1.471
ThrSer: 4.885 ± 0.815
ThrThr: 2.791 ± 1.219
ThrVal: 3.489 ± 1.176
ThrTrp: 0.0 ± 0.0
ThrTyr: 2.094 ± 1.008
ThrXaa: 0.0 ± 0.0
Val
ValAla: 5.583 ± 1.126
ValCys: 0.698 ± 0.475
ValAsp: 2.791 ± 1.0
ValGlu: 2.791 ± 1.219
ValPhe: 0.698 ± 0.784
ValGly: 2.791 ± 1.145
ValHis: 0.698 ± 0.784
ValIle: 3.489 ± 1.193
ValLys: 3.489 ± 1.85
ValLeu: 2.791 ± 1.233
ValMet: 2.094 ± 0.695
ValAsn: 2.094 ± 1.426
ValPro: 6.978 ± 1.85
ValGln: 4.187 ± 0.942
ValArg: 2.094 ± 1.122
ValSer: 2.094 ± 0.773
ValThr: 4.187 ± 1.61
ValVal: 0.698 ± 0.475
ValTrp: 2.094 ± 1.286
ValTyr: 2.094 ± 1.029
ValXaa: 0.0 ± 0.0
Trp
TrpAla: 0.698 ± 0.475
TrpCys: 0.0 ± 0.0
TrpAsp: 0.698 ± 0.475
TrpGlu: 0.0 ± 0.0
TrpPhe: 0.698 ± 0.589
TrpGly: 1.396 ± 0.792
TrpHis: 0.698 ± 0.475
TrpIle: 0.0 ± 0.0
TrpLys: 0.698 ± 0.589
TrpLeu: 1.396 ± 0.792
TrpMet: 1.396 ± 0.951
TrpAsn: 1.396 ± 0.951
TrpPro: 0.0 ± 0.0
TrpGln: 2.094 ± 1.289
TrpArg: 1.396 ± 1.178
TrpSer: 1.396 ± 0.543
TrpThr: 1.396 ± 0.939
TrpVal: 0.698 ± 0.784
TrpTrp: 0.0 ± 0.0
TrpTyr: 0.698 ± 0.615
TrpXaa: 0.0 ± 0.0
Tyr
TyrAla: 2.094 ± 0.924
TyrCys: 0.698 ± 0.475
TyrAsp: 1.396 ± 0.944
TyrGlu: 6.281 ± 2.242
TyrPhe: 2.791 ± 1.048
TyrGly: 3.489 ± 2.011
TyrHis: 2.791 ± 0.988
TyrIle: 0.698 ± 0.589
TyrLys: 4.187 ± 1.572
TyrLeu: 3.489 ± 1.447
TyrMet: 0.698 ± 0.475
TyrAsn: 4.885 ± 2.772
TyrPro: 2.094 ± 0.809
TyrGln: 3.489 ± 0.857
TyrArg: 3.489 ± 1.471
TyrSer: 3.489 ± 0.588
TyrThr: 1.396 ± 0.524
TyrVal: 4.885 ± 1.343
TyrTrp: 0.0 ± 0.0
TyrTyr: 4.187 ± 1.572
TyrXaa: 0.0 ± 0.0
Xaa
XaaAla: 0.0 ± 0.0
XaaCys: 0.0 ± 0.0
XaaAsp: 0.0 ± 0.0
XaaGlu: 0.0 ± 0.0
XaaPhe: 0.0 ± 0.0
XaaGly: 0.0 ± 0.0
XaaHis: 0.0 ± 0.0
XaaIle: 0.0 ± 0.0
XaaLys: 0.0 ± 0.0
XaaLeu: 0.0 ± 0.0
XaaMet: 0.0 ± 0.0
XaaAsn: 0.0 ± 0.0
XaaPro: 0.0 ± 0.0
XaaGln: 0.0 ± 0.0
XaaArg: 0.0 ± 0.0
XaaSer: 0.0 ± 0.0
XaaThr: 0.0 ± 0.0
XaaVal: 0.0 ± 0.0
XaaTrp: 0.0 ± 0.0
XaaTyr: 0.0 ± 0.0
XaaXaa: 0.0 ± 0.0
Statistics based on 5 proteins (1434 amino acids)

Note: The error has been estimated by bootstrapping (100 replicates) at the protein level.
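One way to read "bootstrapping at the protein level" is to resample whole proteins with replacement, recompute the dipeptide frequency for each resample, and report the spread of the resulting estimates. The sketch below follows that reading. The use of the sample standard deviation, the fixed seed, the helper names and the toy sequences are all assumptions; the page does not describe the procedure in more detail.

```python
import random
from collections import Counter
from statistics import stdev

def per_mille(proteins, pair):
    """Frequency of one dipeptide (per mille) over overlapping pairs."""
    counts = Counter()
    for seq in proteins:
        counts.update(seq[i:i + 2] for i in range(len(seq) - 1))
    total = sum(counts.values())
    return 1000.0 * counts[pair] / total if total else 0.0

def bootstrap_error(proteins, pair, replicates=100, seed=0):
    """Spread of the frequency estimate over protein-level resamples."""
    rng = random.Random(seed)
    estimates = []
    for _ in range(replicates):
        # Draw as many proteins as in the original set, with replacement.
        resampled = [rng.choice(proteins) for _ in proteins]
        estimates.append(per_mille(resampled, pair))
    # Assumption: the error is the sample standard deviation of the estimates.
    return stdev(estimates)

# Hypothetical toy sequences in one-letter code, not taken from the proteome.
toy = ["MKKAG", "AKKGA", "GGAKM"]
err = bootstrap_error(toy, "KK")
print(round(per_mille(toy, "KK"), 3), "±", round(err, 3))
```

With only a handful of proteins, as in this proteome, each resample can differ markedly from the original set, which is consistent with errors that are large relative to the values themselves.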

The dipeptide statistics above (among other statistics for this proteome) can be downloaded from this CSV file
See this proteome in: uniprot_link
Proteome-pI is available under Creative Commons Attribution-NoDerivs license, for more details see here

Reference: Kozlowski LP. Proteome-pI 2.0: Proteome Isoelectric Point Database Update. Nucleic Acids Res. 2021, doi: 10.1093/nar/gkab944
Contact: Lukasz P. Kozlowski