Amino acid dipeptide frequency for Human gut microviridae SH-CHD12

Apart from single amino acid frequencies, one can also calculate the so-called amino acid dipeptide frequency, i.e. how often each ordered pair of adjacent amino acids occurs.
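The counting step can be sketched as follows. This is a minimal illustration, not the database's own code: the function name `dipeptide_per_mille` is hypothetical, sequences are assumed to be given in one-letter codes, and pairs are assumed not to span two different proteins.

```python
from collections import Counter


def dipeptide_per_mille(sequences):
    """Return {dipeptide: frequency in per mille} for a list of protein strings."""
    counts = Counter()
    for seq in sequences:
        seq = seq.upper()
        # Overlapping windows of length 2; a pair never spans two proteins
        # (an assumption of this sketch).
        for i in range(len(seq) - 1):
            counts[seq[i:i + 2]] += 1
    total = sum(counts.values())
    if total == 0:
        return {}
    return {pair: 1000.0 * n / total for pair, n in counts.items()}


# Toy input, not taken from the proteome described here.
freqs = dipeptide_per_mille(["MAAL", "AAK"])
```

The toy input contains five dipeptides in total, two of which are "AA", so `freqs["AA"]` is 2/5 of the total expressed in per mille.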

There are 400 possible dipeptides (441 if Xaa is counted as an additional 21st amino acid representing all non-standard or unknown amino acids). Thus, if dipeptides occurred uniformly at random in proteins, each would be present at a frequency of 0.25% (1/400, i.e. 2.5‰). As this is not the case in nature, for better visibility all dipeptides that are more frequent than expected are marked in red, and those that are underrepresented are marked in blue in the table.
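A small sketch of how a value can be compared with the uniform expectation. The threshold logic is an assumption: the page does not say whether its colouring uses exactly this baseline or also takes the error into account, and the names `UNIFORM_PER_MILLE` and `classify` are hypothetical.

```python
# Uniform expectation for 400 dipeptides: 1/400 = 0.25% = 2.5 per mille.
UNIFORM_PER_MILLE = 1000.0 / 400


def classify(value_per_mille, baseline=UNIFORM_PER_MILLE):
    """Label a dipeptide frequency relative to the uniform baseline."""
    if value_per_mille > baseline:
        return "over"    # would correspond to red
    if value_per_mille < baseline:
        return "under"   # would correspond to blue
    return "expected"


# Values copied from the table (per mille).
examples = {"AlaAla": 7.638, "AlaCys": 0.0, "LeuSer": 9.401, "AlaPro": 2.35}
labels = {name: classify(value) for name, value in examples.items()}
```

With this baseline, AlaAla and LeuSer come out as overrepresented, while AlaCys and AlaPro fall below 2.5‰.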

All values are given in per mille (‰), so they must be multiplied by 10⁻³ to obtain fractions; for example, 7.638‰ corresponds to 0.007638.

For more information, see the sequence space article on Wikipedia.

Ala Cys Asp Glu Phe Gly His Ile Lys Leu Met Asn Pro Gln Arg Ser Thr Val Trp Tyr Xaa
Ala
AlaAla: 7.638 ± 2.906
AlaCys: 0.0 ± 0.0
AlaAsp: 2.938 ± 0.509
AlaGlu: 4.7 ± 1.04
AlaPhe: 4.7 ± 1.491
AlaGly: 2.938 ± 0.747
AlaHis: 0.588 ± 0.484
AlaIle: 7.051 ± 1.514
AlaLys: 2.938 ± 0.963
AlaLeu: 7.638 ± 2.375
AlaMet: 0.588 ± 0.342
AlaAsn: 5.875 ± 0.985
AlaPro: 2.35 ± 1.369
AlaGln: 4.113 ± 0.881
AlaArg: 1.763 ± 0.766
AlaSer: 6.463 ± 3.026
AlaThr: 3.525 ± 0.404
AlaVal: 1.175 ± 0.383
AlaTrp: 1.175 ± 0.383
AlaTyr: 5.288 ± 2.104
AlaXaa: 0.0 ± 0.0
Cys
CysAla: 0.588 ± 0.484
CysCys: 0.0 ± 0.0
CysAsp: 0.0 ± 0.0
CysGlu: 0.588 ± 0.484
CysPhe: 0.0 ± 0.0
CysGly: 0.0 ± 0.0
CysHis: 0.588 ± 0.484
CysIle: 0.588 ± 0.484
CysLys: 0.0 ± 0.0
CysLeu: 0.588 ± 0.342
CysMet: 0.588 ± 0.484
CysAsn: 0.588 ± 0.484
CysPro: 3.525 ± 1.046
CysGln: 0.0 ± 0.0
CysArg: 2.35 ± 1.626
CysSer: 1.175 ± 0.859
CysThr: 0.0 ± 0.0
CysVal: 1.175 ± 0.968
CysTrp: 0.0 ± 0.0
CysTyr: 0.588 ± 0.484
CysXaa: 0.0 ± 0.0
Asp
AspAla: 8.813 ± 2.242
AspCys: 0.0 ± 0.0
AspAsp: 2.35 ± 1.38
AspGlu: 2.35 ± 1.369
AspPhe: 2.35 ± 0.908
AspGly: 3.525 ± 1.084
AspHis: 0.0 ± 0.0
AspIle: 4.7 ± 1.417
AspLys: 2.938 ± 0.509
AspLeu: 3.525 ± 1.323
AspMet: 1.175 ± 0.439
AspAsn: 1.763 ± 0.542
AspPro: 0.588 ± 0.342
AspGln: 1.763 ± 1.069
AspArg: 2.35 ± 0.895
AspSer: 4.113 ± 0.653
AspThr: 4.7 ± 0.812
AspVal: 4.113 ± 0.616
AspTrp: 2.35 ± 0.269
AspTyr: 2.938 ± 0.425
AspXaa: 0.0 ± 0.0
Glu
GluAla: 2.938 ± 1.14
GluCys: 0.588 ± 0.484
GluAsp: 1.763 ± 0.347
GluGlu: 1.763 ± 1.069
GluPhe: 1.763 ± 1.027
GluGly: 1.763 ± 0.658
GluHis: 1.763 ± 0.959
GluIle: 4.113 ± 2.009
GluLys: 1.175 ± 0.634
GluLeu: 3.525 ± 0.404
GluMet: 0.0 ± 0.331
GluAsn: 1.763 ± 0.542
GluPro: 0.588 ± 0.342
GluGln: 2.35 ± 0.58
GluArg: 2.35 ± 1.691
GluSer: 2.938 ± 0.425
GluThr: 2.938 ± 0.425
GluVal: 2.938 ± 0.991
GluTrp: 1.763 ± 0.347
GluTyr: 1.763 ± 0.803
GluXaa: 0.0 ± 0.0
Phe
PheAla: 2.35 ± 0.822
PheCys: 1.175 ± 0.968
PheAsp: 3.525 ± 1.519
PheGlu: 0.588 ± 0.595
PhePhe: 1.763 ± 0.542
PheGly: 2.938 ± 1.204
PheHis: 0.0 ± 0.0
PheIle: 3.525 ± 0.591
PheLys: 1.175 ± 0.634
PheLeu: 4.7 ± 1.638
PheMet: 1.175 ± 0.685
PheAsn: 4.113 ± 1.349
PhePro: 2.35 ± 0.767
PheGln: 1.175 ± 0.684
PheArg: 5.288 ± 0.78
PheSer: 2.938 ± 1.136
PheThr: 2.938 ± 0.475
PheVal: 3.525 ± 1.046
PheTrp: 0.588 ± 0.697
PheTyr: 3.525 ± 1.57
PheXaa: 0.0 ± 0.0
Gly
GlyAla: 2.35 ± 0.895
GlyCys: 0.0 ± 0.0
GlyAsp: 5.875 ± 1.56
GlyGlu: 2.938 ± 0.425
GlyPhe: 3.525 ± 0.825
GlyGly: 3.525 ± 1.323
GlyHis: 1.175 ± 1.189
GlyIle: 5.875 ± 0.851
GlyLys: 2.938 ± 0.509
GlyLeu: 1.763 ± 0.658
GlyMet: 1.175 ± 0.684
GlyAsn: 1.763 ± 0.658
GlyPro: 0.0 ± 0.0
GlyGln: 2.938 ± 0.509
GlyArg: 1.763 ± 0.542
GlySer: 5.288 ± 1.719
GlyThr: 1.175 ± 0.525
GlyVal: 4.113 ± 0.518
GlyTrp: 0.0 ± 0.0
GlyTyr: 2.35 ± 0.269
GlyXaa: 0.0 ± 0.0
His
HisAla: 0.0 ± 0.0
HisCys: 0.0 ± 0.0
HisAsp: 0.0 ± 0.0
HisGlu: 0.0 ± 0.0
HisPhe: 1.175 ± 0.684
HisGly: 1.175 ± 0.525
HisHis: 0.0 ± 0.0
HisIle: 1.175 ± 0.383
HisLys: 0.0 ± 0.0
HisLeu: 3.525 ± 1.58
HisMet: 0.0 ± 0.0
HisAsn: 0.0 ± 0.0
HisPro: 1.175 ± 0.968
HisGln: 0.0 ± 0.0
HisArg: 1.763 ± 0.803
HisSer: 0.588 ± 0.484
HisThr: 1.175 ± 1.189
HisVal: 1.763 ± 0.347
HisTrp: 0.588 ± 0.342
HisTyr: 1.175 ± 0.968
HisXaa: 0.0 ± 0.0
Ile
IleAla: 5.875 ± 1.56
IleCys: 0.0 ± 0.0
IleAsp: 4.113 ± 0.732
IleGlu: 1.763 ± 0.91
IlePhe: 1.763 ± 0.766
IleGly: 2.938 ± 0.963
IleHis: 0.0 ± 0.0
IleIle: 1.763 ± 0.959
IleLys: 2.35 ± 1.626
IleLeu: 7.051 ± 2.216
IleMet: 2.35 ± 1.27
IleAsn: 3.525 ± 1.072
IlePro: 4.113 ± 0.616
IleGln: 4.7 ± 1.301
IleArg: 2.938 ± 1.204
IleSer: 9.401 ± 2.208
IleThr: 2.35 ± 0.822
IleVal: 1.175 ± 0.525
IleTrp: 2.35 ± 0.658
IleTyr: 1.763 ± 0.658
IleXaa: 0.0 ± 0.0
Lys
LysAla: 2.938 ± 1.479
LysCys: 1.763 ± 0.803
LysAsp: 2.35 ± 0.908
LysGlu: 3.525 ± 1.917
LysPhe: 3.525 ± 1.327
LysGly: 1.763 ± 0.803
LysHis: 1.763 ± 0.347
LysIle: 1.763 ± 0.347
LysLys: 2.35 ± 0.822
LysLeu: 5.288 ± 1.38
LysMet: 1.175 ± 0.785
LysAsn: 3.525 ± 0.698
LysPro: 1.763 ± 0.959
LysGln: 3.525 ± 1.58
LysArg: 1.763 ± 1.452
LysSer: 2.35 ± 1.057
LysThr: 1.763 ± 0.658
LysVal: 2.938 ± 0.475
LysTrp: 0.588 ± 0.484
LysTyr: 3.525 ± 1.15
LysXaa: 0.0 ± 0.0
Leu
LeuAla: 4.113 ± 1.22
LeuCys: 2.35 ± 1.414
LeuAsp: 6.463 ± 0.709
LeuGlu: 1.763 ± 0.542
LeuPhe: 2.938 ± 0.816
LeuGly: 5.875 ± 1.691
LeuHis: 2.938 ± 1.162
LeuIle: 2.938 ± 1.14
LeuLys: 4.113 ± 2.707
LeuLeu: 8.226 ± 4.635
LeuMet: 2.938 ± 0.898
LeuAsn: 7.051 ± 1.648
LeuPro: 4.113 ± 0.519
LeuGln: 4.113 ± 1.349
LeuArg: 2.35 ± 1.05
LeuSer: 9.401 ± 1.624
LeuThr: 4.7 ± 2.636
LeuVal: 4.113 ± 1.237
LeuTrp: 0.588 ± 0.484
LeuTyr: 1.763 ± 0.766
LeuXaa: 0.0 ± 0.0
Met
MetAla: 4.113 ± 1.911
MetCys: 0.588 ± 0.484
MetAsp: 0.0 ± 0.0
MetGlu: 0.588 ± 0.484
MetPhe: 1.763 ± 1.207
MetGly: 0.0 ± 0.0
MetHis: 0.0 ± 0.0
MetIle: 1.763 ± 0.542
MetLys: 2.35 ± 0.269
MetLeu: 1.763 ± 0.766
MetMet: 0.588 ± 0.342
MetAsn: 0.588 ± 0.697
MetPro: 1.763 ± 0.542
MetGln: 0.588 ± 0.484
MetArg: 0.588 ± 0.484
MetSer: 1.763 ± 0.689
MetThr: 0.588 ± 0.697
MetVal: 1.175 ± 0.525
MetTrp: 0.588 ± 0.595
MetTyr: 1.763 ± 0.803
MetXaa: 0.0 ± 0.0
Asn
AsnAla: 4.113 ± 0.616
AsnCys: 0.0 ± 0.0
AsnAsp: 3.525 ± 2.137
AsnGlu: 2.35 ± 1.38
AsnPhe: 2.938 ± 1.452
AsnGly: 3.525 ± 1.463
AsnHis: 0.0 ± 0.0
AsnIle: 4.113 ± 1.349
AsnLys: 1.175 ± 0.634
AsnLeu: 7.051 ± 0.918
AsnMet: 1.175 ± 0.634
AsnAsn: 5.288 ± 2.104
AsnPro: 5.288 ± 0.332
AsnGln: 2.35 ± 1.057
AsnArg: 3.525 ± 0.592
AsnSer: 5.875 ± 1.691
AsnThr: 2.938 ± 1.479
AsnVal: 7.051 ± 2.002
AsnTrp: 1.763 ± 0.658
AsnTyr: 2.35 ± 0.658
AsnXaa: 0.0 ± 0.0
Pro
ProAla: 5.288 ± 1.231
ProCys: 2.35 ± 1.936
ProAsp: 1.763 ± 0.689
ProGlu: 2.35 ± 0.767
ProPhe: 2.938 ± 1.204
ProGly: 1.175 ± 0.634
ProHis: 1.175 ± 0.684
ProIle: 1.763 ± 0.803
ProLys: 2.938 ± 1.102
ProLeu: 4.113 ± 1.232
ProMet: 3.525 ± 0.968
ProAsn: 0.588 ± 0.342
ProPro: 1.763 ± 0.91
ProGln: 2.35 ± 0.908
ProArg: 2.35 ± 1.369
ProSer: 4.113 ± 0.616
ProThr: 2.35 ± 1.369
ProVal: 3.525 ± 1.241
ProTrp: 1.175 ± 0.684
ProTyr: 1.763 ± 1.452
ProXaa: 0.0 ± 0.0
Gln
GlnAla: 2.938 ± 0.747
GlnCys: 0.0 ± 0.0
GlnAsp: 1.175 ± 0.968
GlnGlu: 2.938 ± 2.235
GlnPhe: 1.763 ± 0.542
GlnGly: 2.35 ± 0.754
GlnHis: 1.175 ± 0.785
GlnIle: 3.525 ± 2.069
GlnLys: 1.763 ± 1.069
GlnLeu: 4.113 ± 1.237
GlnMet: 2.35 ± 0.269
GlnAsn: 4.7 ± 1.849
GlnPro: 1.175 ± 0.684
GlnGln: 1.763 ± 1.784
GlnArg: 4.113 ± 1.207
GlnSer: 4.113 ± 1.232
GlnThr: 3.525 ± 0.825
GlnVal: 1.763 ± 0.803
GlnTrp: 0.0 ± 0.0
GlnTyr: 2.938 ± 0.425
GlnXaa: 0.0 ± 0.0
Arg
ArgAla: 2.35 ± 0.767
ArgCys: 0.588 ± 0.484
ArgAsp: 4.113 ± 0.934
ArgGlu: 1.175 ± 0.525
ArgPhe: 3.525 ± 1.21
ArgGly: 1.175 ± 0.525
ArgHis: 1.175 ± 0.634
ArgIle: 3.525 ± 1.323
ArgLys: 2.938 ± 1.746
ArgLeu: 2.938 ± 1.444
ArgMet: 0.588 ± 0.697
ArgAsn: 4.113 ± 1.349
ArgPro: 1.175 ± 0.634
ArgGln: 1.763 ± 0.803
ArgArg: 1.763 ± 0.959
ArgSer: 4.113 ± 0.84
ArgThr: 4.7 ± 1.856
ArgVal: 4.7 ± 1.091
ArgTrp: 0.0 ± 0.0
ArgTyr: 4.113 ± 0.934
ArgXaa: 0.0 ± 0.0
Ser
SerAla: 5.288 ± 1.677
SerCys: 0.588 ± 0.697
SerAsp: 5.288 ± 0.739
SerGlu: 3.525 ± 0.694
SerPhe: 5.875 ± 1.299
SerGly: 7.051 ± 3.153
SerHis: 0.0 ± 0.0
SerIle: 4.7 ± 1.347
SerLys: 4.7 ± 1.257
SerLeu: 7.051 ± 2.457
SerMet: 0.588 ± 0.697
SerAsn: 2.938 ± 0.425
SerPro: 4.7 ± 1.139
SerGln: 6.463 ± 3.608
SerArg: 4.113 ± 2.009
SerSer: 5.288 ± 0.988
SerThr: 4.7 ± 1.091
SerVal: 4.113 ± 1.201
SerTrp: 1.175 ± 0.383
SerTyr: 2.35 ± 0.767
SerXaa: 0.0 ± 0.0
Thr
ThrAla: 5.288 ± 1.677
ThrCys: 0.588 ± 0.484
ThrAsp: 1.175 ± 0.383
ThrGlu: 4.7 ± 0.389
ThrPhe: 1.763 ± 0.542
ThrGly: 2.35 ± 0.767
ThrHis: 0.588 ± 0.484
ThrIle: 2.35 ± 0.908
ThrLys: 1.763 ± 0.766
ThrLeu: 2.35 ± 0.819
ThrMet: 0.588 ± 0.484
ThrAsn: 4.113 ± 0.653
ThrPro: 6.463 ± 1.449
ThrGln: 2.35 ± 0.269
ThrArg: 2.35 ± 0.819
ThrSer: 4.7 ± 1.511
ThrThr: 5.875 ± 2.807
ThrVal: 3.525 ± 0.591
ThrTrp: 0.588 ± 0.342
ThrTyr: 4.113 ± 1.343
ThrXaa: 0.0 ± 0.0
Val
ValAla: 3.525 ± 1.399
ValCys: 1.175 ± 0.383
ValAsp: 4.7 ± 0.914
ValGlu: 0.588 ± 0.484
ValPhe: 1.763 ± 1.027
ValGly: 2.938 ± 0.509
ValHis: 0.588 ± 0.342
ValIle: 5.288 ± 1.685
ValLys: 5.875 ± 1.926
ValLeu: 4.7 ± 2.1
ValMet: 0.588 ± 0.484
ValAsn: 5.875 ± 1.691
ValPro: 4.113 ± 1.349
ValGln: 1.175 ± 0.525
ValArg: 5.288 ± 0.78
ValSer: 2.35 ± 0.58
ValThr: 3.525 ± 1.084
ValVal: 4.7 ± 1.456
ValTrp: 0.0 ± 0.0
ValTyr: 3.525 ± 1.084
ValXaa: 0.0 ± 0.0
Trp
TrpAla: 0.588 ± 0.342
TrpCys: 0.588 ± 0.484
TrpAsp: 1.175 ± 0.525
TrpGlu: 1.175 ± 0.525
TrpPhe: 0.588 ± 0.484
TrpGly: 0.0 ± 0.0
TrpHis: 0.588 ± 0.342
TrpIle: 0.0 ± 0.0
TrpLys: 1.175 ± 0.735
TrpLeu: 0.588 ± 0.595
TrpMet: 0.588 ± 0.484
TrpAsn: 1.763 ± 0.347
TrpPro: 0.0 ± 0.0
TrpGln: 1.763 ± 0.347
TrpArg: 0.588 ± 0.484
TrpSer: 1.175 ± 0.383
TrpThr: 1.763 ± 0.542
TrpVal: 0.0 ± 0.0
TrpTrp: 0.588 ± 0.342
TrpTyr: 1.175 ± 0.684
TrpXaa: 0.0 ± 0.0
Tyr
TyrAla: 2.35 ± 1.369
TyrCys: 0.588 ± 0.342
TyrAsp: 4.113 ± 0.934
TyrGlu: 1.763 ± 1.027
TyrPhe: 2.938 ± 2.42
TyrGly: 3.525 ± 1.15
TyrHis: 1.175 ± 0.968
TyrIle: 1.763 ± 0.542
TyrLys: 5.288 ± 1.881
TyrLeu: 2.35 ± 0.767
TyrMet: 0.588 ± 0.484
TyrAsn: 6.463 ± 2.277
TyrPro: 2.35 ± 0.822
TyrGln: 2.938 ± 1.102
TyrArg: 1.175 ± 0.634
TyrSer: 2.35 ± 0.658
TyrThr: 2.35 ± 0.822
TyrVal: 4.7 ± 1.961
TyrTrp: 0.0 ± 0.0
TyrTyr: 4.113 ± 1.608
TyrXaa: 0.0 ± 0.0
Xaa
XaaAla: 0.0 ± 0.0
XaaCys: 0.0 ± 0.0
XaaAsp: 0.0 ± 0.0
XaaGlu: 0.0 ± 0.0
XaaPhe: 0.0 ± 0.0
XaaGly: 0.0 ± 0.0
XaaHis: 0.0 ± 0.0
XaaIle: 0.0 ± 0.0
XaaLys: 0.0 ± 0.0
XaaLeu: 0.0 ± 0.0
XaaMet: 0.0 ± 0.0
XaaAsn: 0.0 ± 0.0
XaaPro: 0.0 ± 0.0
XaaGln: 0.0 ± 0.0
XaaArg: 0.0 ± 0.0
XaaSer: 0.0 ± 0.0
XaaThr: 0.0 ± 0.0
XaaVal: 0.0 ± 0.0
XaaTrp: 0.0 ± 0.0
XaaTyr: 0.0 ± 0.0
XaaXaa: 0.0 ± 0.0
Statistics based on 4 proteins (1703 amino acids)

Note: The error was estimated by bootstrapping (×100) at the protein level.
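Protein-level bootstrapping can be sketched as below. This is only an illustration under stated assumptions: whole proteins are resampled with replacement, the statistic is recomputed for each resample, and the error is taken as the standard deviation over 100 resamples. The page does not specify which spread measure is reported, and the function names `pair_per_mille` and `bootstrap_error` are hypothetical.

```python
import random
import statistics
from collections import Counter


def pair_per_mille(sequences, pair):
    """Frequency (per mille) of one dipeptide over a list of one-letter sequences."""
    counts = Counter()
    for seq in sequences:
        for i in range(len(seq) - 1):
            counts[seq[i:i + 2]] += 1
    total = sum(counts.values())
    return 1000.0 * counts[pair] / total if total else 0.0


def bootstrap_error(sequences, pair, n_resamples=100, seed=0):
    """Standard deviation of the frequency over protein-level resamples."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    estimates = []
    for _ in range(n_resamples):
        # Draw as many proteins as in the original set, with replacement.
        sample = rng.choices(sequences, k=len(sequences))
        estimates.append(pair_per_mille(sample, pair))
    return statistics.stdev(estimates)


# Toy sequences, not taken from this proteome.
toy = ["MAAL", "AAK", "GGSAA", "MKLV"]
err = bootstrap_error(toy, "AA")
```

With only a handful of proteins, as here, each resample can differ strongly from the original set, which is consistent with the relatively large errors seen for some dipeptides.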

The dipeptide statistics above (among other statistics for this proteome) can be downloaded from this CSV file
See this proteome in: uniprot_link
Proteome-pI is available under Creative Commons Attribution-NoDerivs license, for more details see here

Reference: Kozlowski LP. Proteome-pI 2.0: Proteome Isoelectric Point Database Update. Nucleic Acids Res. 2021, doi: 10.1093/nar/gkab944
Contact: Lukasz P. Kozlowski