Amino acid dipeptide frequency for Human gut microviridae SH-CHD8

Apart from single amino acid frequencies, one can also calculate the so-called amino acid dipeptide frequency, i.e. how often each ordered pair of adjacent amino acids occurs.
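As a rough illustration of what such a calculation involves, the following Python sketch counts overlapping adjacent residue pairs and reports them in per mille. It is only a sketch under assumptions: the function name `dipeptide_frequencies` is hypothetical, it uses one-letter residue codes rather than the three-letter codes shown on this page, and it assumes pairs are not counted across protein boundaries, which may differ from how the database computes its values.

```python
from collections import Counter


def dipeptide_frequencies(proteins):
    """Return dipeptide frequencies in per mille for a list of sequences.

    Hypothetical helper: sequences use one-letter amino acid codes.
    """
    counts = Counter()
    for seq in proteins:
        seq = seq.upper()
        # Overlapping adjacent pairs; a pair never spans two proteins
        # (an assumption, not something this page states).
        for i in range(len(seq) - 1):
            counts[seq[i:i + 2]] += 1
    total = sum(counts.values())
    if total == 0:
        return {}
    return {pair: 1000.0 * n / total for pair, n in counts.items()}


# Two toy sequences (made up for illustration, not from this proteome).
freqs = dipeptide_frequencies(["MKLLA", "ALLK"])
for pair, value in sorted(freqs.items()):
    print(f"{pair}: {value:.3f}")
```

The two toy sequences contain 4 + 3 = 7 dipeptides in total, and "LL" occurs twice, so its frequency is 2/7, or about 285.7‰.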

There are 400 possible dipeptides (441 if Xaa is counted as an additional, 21st amino acid representing all non-standard or unknown amino acids). Thus, if all dipeptides were equally likely to occur in proteins, each would be present with a frequency of 0.25% (2.5‰). As this is not the case in nature, for better visibility all dipeptides that are more frequent than expected are marked in red in the table, and those that are underrepresented are marked in blue.

All values are given in per mille (‰), so they need to be multiplied by 10^-3 to obtain fractions.
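The arithmetic behind the expected value and the per mille unit can be sketched in a few lines of Python. The function names below are hypothetical, and the over/under rule is an assumption: the page does not say exactly which threshold decides the red and blue marking, so the sketch simply compares each value with the uniform expectation.

```python
def uniform_expectation_permille(alphabet_size=20):
    """Expected per mille frequency if all dipeptides were equally likely."""
    return 1000.0 / alphabet_size ** 2


def permille_to_fraction(value):
    """Convert a per mille value to a plain fraction (multiply by 10^-3)."""
    return value * 1e-3


def classify(value, expected=None):
    """Label a per mille value relative to the uniform expectation.

    Assumed rule: strictly above is 'over', strictly below is 'under'.
    """
    if expected is None:
        expected = uniform_expectation_permille()
    if value > expected:
        return "over"
    if value < expected:
        return "under"
    return "expected"


print(uniform_expectation_permille(20))   # 400 possible dipeptides
print(uniform_expectation_permille(21))   # 441 possible dipeptides with Xaa
print(classify(10.74))                    # LeuSer value from the table
print(classify(0.597))                    # CysAsp value from the table
print(permille_to_fraction(5.37))         # AlaAla value as a fraction
```

With 20 amino acids the expectation is 2.5‰; with Xaa included it drops to roughly 2.27‰.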

For more information, see the sequence space article on Wikipedia.

Ala Cys Asp Glu Phe Gly His Ile Lys Leu Met Asn Pro Gln Arg Ser Thr Val Trp Tyr Xaa
Ala
AlaAla: 5.37 ± 3.017
AlaCys: 1.79 ± 0.857
AlaAsp: 4.773 ± 1.984
AlaGlu: 4.177 ± 2.78
AlaPhe: 2.387 ± 1.357
AlaGly: 7.16 ± 4.462
AlaHis: 1.79 ± 1.342
AlaIle: 2.983 ± 1.446
AlaLys: 3.58 ± 1.871
AlaLeu: 8.353 ± 1.528
AlaMet: 2.387 ± 0.329
AlaAsn: 2.387 ± 0.329
AlaPro: 3.58 ± 0.735
AlaGln: 1.193 ± 1.128
AlaArg: 3.58 ± 2.706
AlaSer: 7.757 ± 2.994
AlaThr: 3.58 ± 1.433
AlaVal: 4.177 ± 1.489
AlaTrp: 1.193 ± 0.506
AlaTyr: 4.177 ± 0.729
AlaXaa: 0.0 ± 0.0
Cys
CysAla: 1.193 ± 0.433
CysCys: 1.193 ± 0.433
CysAsp: 0.597 ± 0.529
CysGlu: 0.597 ± 0.529
CysPhe: 1.193 ± 0.433
CysGly: 0.597 ± 0.529
CysHis: 0.0 ± 0.0
CysIle: 0.597 ± 0.564
CysLys: 0.597 ± 0.529
CysLeu: 2.983 ± 1.822
CysMet: 0.597 ± 0.452
CysAsn: 2.387 ± 0.653
CysPro: 0.597 ± 0.447
CysGln: 1.193 ± 0.895
CysArg: 1.79 ± 0.857
CysSer: 0.597 ± 0.447
CysThr: 1.79 ± 0.854
CysVal: 0.0 ± 0.0
CysTrp: 0.0 ± 0.0
CysTyr: 1.193 ± 1.057
CysXaa: 0.0 ± 0.0
Asp
AspAla: 2.387 ± 1.789
AspCys: 1.193 ± 1.057
AspAsp: 4.177 ± 1.912
AspGlu: 1.193 ± 0.506
AspPhe: 3.58 ± 1.3
AspGly: 4.177 ± 0.373
AspHis: 0.0 ± 0.0
AspIle: 2.983 ± 0.868
AspLys: 7.757 ± 1.439
AspLeu: 5.967 ± 2.323
AspMet: 2.983 ± 0.745
AspAsn: 5.37 ± 1.061
AspPro: 1.193 ± 0.895
AspGln: 0.0 ± 0.0
AspArg: 2.387 ± 1.357
AspSer: 7.16 ± 2.604
AspThr: 3.58 ± 0.743
AspVal: 5.967 ± 2.298
AspTrp: 0.0 ± 0.0
AspTyr: 1.79 ± 1.342
AspXaa: 0.0 ± 0.0
Glu
GluAla: 4.773 ± 2.413
GluCys: 0.597 ± 0.529
GluAsp: 1.79 ± 0.974
GluGlu: 1.79 ± 1.09
GluPhe: 0.597 ± 0.529
GluGly: 0.597 ± 0.814
GluHis: 0.597 ± 0.564
GluIle: 4.177 ± 1.661
GluLys: 2.387 ± 0.756
GluLeu: 2.983 ± 2.015
GluMet: 1.79 ± 0.408
GluAsn: 2.387 ± 0.756
GluPro: 0.597 ± 0.529
GluGln: 3.58 ± 1.871
GluArg: 0.597 ± 0.447
GluSer: 0.597 ± 0.529
GluThr: 0.597 ± 0.447
GluVal: 2.983 ± 1.153
GluTrp: 0.597 ± 0.529
GluTyr: 2.387 ± 1.512
GluXaa: 0.0 ± 0.0
Phe
PheAla: 1.79 ± 0.771
PheCys: 1.79 ± 0.854
PheAsp: 4.773 ± 0.658
PheGlu: 1.193 ± 0.433
PhePhe: 2.387 ± 0.867
PheGly: 8.95 ± 2.668
PheHis: 0.0 ± 0.0
PheIle: 3.58 ± 0.731
PheLys: 2.387 ± 1.789
PheLeu: 5.37 ± 2.327
PheMet: 1.193 ± 1.015
PheAsn: 4.177 ± 1.672
PhePro: 2.387 ± 0.653
PheGln: 1.79 ± 0.705
PheArg: 1.193 ± 0.433
PheSer: 2.983 ± 1.788
PheThr: 3.58 ± 1.409
PheVal: 4.177 ± 1.887
PheTrp: 0.0 ± 0.0
PheTyr: 3.58 ± 0.901
PheXaa: 0.0 ± 0.0
Gly
GlyAla: 6.563 ± 3.899
GlyCys: 0.0 ± 0.0
GlyAsp: 4.773 ± 0.545
GlyGlu: 1.193 ± 0.506
GlyPhe: 5.37 ± 2.1
GlyGly: 2.387 ± 0.756
GlyHis: 0.597 ± 0.529
GlyIle: 1.79 ± 0.974
GlyLys: 4.177 ± 0.822
GlyLeu: 4.773 ± 0.465
GlyMet: 0.0 ± 0.0
GlyAsn: 5.37 ± 1.486
GlyPro: 0.0 ± 0.0
GlyGln: 1.193 ± 0.506
GlyArg: 3.58 ± 1.414
GlySer: 5.37 ± 1.87
GlyThr: 0.597 ± 0.447
GlyVal: 5.37 ± 1.525
GlyTrp: 0.597 ± 0.564
GlyTyr: 2.387 ± 0.867
GlyXaa: 0.0 ± 0.0
His
HisAla: 1.193 ± 0.644
HisCys: 1.193 ± 1.057
HisAsp: 1.79 ± 0.857
HisGlu: 0.0 ± 0.0
HisPhe: 1.79 ± 0.857
HisGly: 0.597 ± 0.447
HisHis: 0.0 ± 0.0
HisIle: 0.0 ± 0.0
HisLys: 0.0 ± 0.0
HisLeu: 1.193 ± 0.433
HisMet: 0.0 ± 0.0
HisAsn: 0.597 ± 0.447
HisPro: 0.0 ± 0.0
HisGln: 1.79 ± 1.341
HisArg: 1.193 ± 0.506
HisSer: 0.597 ± 0.529
HisThr: 1.193 ± 1.057
HisVal: 0.597 ± 0.529
HisTrp: 0.597 ± 0.529
HisTyr: 0.597 ± 0.529
HisXaa: 0.0 ± 0.0
Ile
IleAla: 3.58 ± 1.46
IleCys: 1.79 ± 1.022
IleAsp: 2.983 ± 0.597
IleGlu: 1.193 ± 1.128
IlePhe: 1.79 ± 0.248
IleGly: 2.983 ± 1.311
IleHis: 0.0 ± 0.0
IleIle: 1.79 ± 2.442
IleLys: 4.773 ± 1.996
IleLeu: 2.983 ± 0.611
IleMet: 0.597 ± 0.564
IleAsn: 2.387 ± 0.329
IlePro: 1.79 ± 0.705
IleGln: 1.193 ± 0.644
IleArg: 1.79 ± 0.857
IleSer: 7.16 ± 1.719
IleThr: 3.58 ± 0.782
IleVal: 3.58 ± 1.441
IleTrp: 0.0 ± 0.0
IleTyr: 0.597 ± 0.814
IleXaa: 0.0 ± 0.0
Lys
LysAla: 4.177 ± 1.078
LysCys: 0.0 ± 0.0
LysAsp: 5.37 ± 0.546
LysGlu: 3.58 ± 1.117
LysPhe: 1.193 ± 0.809
LysGly: 1.79 ± 0.705
LysHis: 1.193 ± 1.057
LysIle: 4.773 ± 0.465
LysLys: 2.387 ± 0.756
LysLeu: 5.37 ± 1.741
LysMet: 1.79 ± 1.26
LysAsn: 4.773 ± 1.922
LysPro: 2.387 ± 0.867
LysGln: 2.983 ± 2.82
LysArg: 3.58 ± 0.496
LysSer: 1.79 ± 0.974
LysThr: 2.983 ± 0.611
LysVal: 7.16 ± 0.716
LysTrp: 0.0 ± 0.0
LysTyr: 5.37 ± 2.571
LysXaa: 0.0 ± 0.0
Leu
LeuAla: 7.757 ± 1.439
LeuCys: 2.387 ± 0.653
LeuAsp: 4.177 ± 0.822
LeuGlu: 2.983 ± 1.699
LeuPhe: 5.967 ± 1.126
LeuGly: 7.757 ± 1.401
LeuHis: 1.193 ± 1.057
LeuIle: 3.58 ± 1.667
LeuLys: 4.177 ± 0.729
LeuLeu: 2.983 ± 1.589
LeuMet: 1.193 ± 0.433
LeuAsn: 7.16 ± 0.336
LeuPro: 7.757 ± 0.985
LeuGln: 4.177 ± 1.078
LeuArg: 2.983 ± 0.425
LeuSer: 10.74 ± 2.71
LeuThr: 5.967 ± 2.298
LeuVal: 4.773 ± 1.773
LeuTrp: 0.597 ± 0.529
LeuTyr: 2.387 ± 1.512
LeuXaa: 0.0 ± 0.0
Met
MetAla: 1.193 ± 0.644
MetCys: 1.79 ± 1.586
MetAsp: 0.597 ± 0.564
MetGlu: 0.0 ± 0.0
MetPhe: 0.597 ± 0.529
MetGly: 1.193 ± 0.955
MetHis: 0.0 ± 0.0
MetIle: 0.0 ± 0.0
MetLys: 0.597 ± 0.447
MetLeu: 1.193 ± 0.433
MetMet: 1.193 ± 0.955
MetAsn: 2.387 ± 2.465
MetPro: 0.597 ± 0.529
MetGln: 1.79 ± 1.692
MetArg: 2.387 ± 0.867
MetSer: 4.177 ± 0.944
MetThr: 0.597 ± 0.447
MetVal: 0.597 ± 0.447
MetTrp: 0.0 ± 0.0
MetTyr: 1.79 ± 0.857
MetXaa: 0.0 ± 0.0
Asn
AsnAla: 7.757 ± 1.345
AsnCys: 0.597 ± 0.564
AsnAsp: 4.177 ± 1.912
AsnGlu: 1.79 ± 0.248
AsnPhe: 4.773 ± 0.953
AsnGly: 2.983 ± 1.521
AsnHis: 0.597 ± 0.529
AsnIle: 1.79 ± 1.342
AsnLys: 5.967 ± 0.807
AsnLeu: 3.58 ± 1.223
AsnMet: 0.597 ± 0.529
AsnAsn: 5.967 ± 1.043
AsnPro: 1.79 ± 0.248
AsnGln: 4.177 ± 1.509
AsnArg: 1.193 ± 0.433
AsnSer: 4.177 ± 0.373
AsnThr: 6.563 ± 1.951
AsnVal: 3.58 ± 1.184
AsnTrp: 0.597 ± 0.814
AsnTyr: 2.387 ± 0.878
AsnXaa: 0.0 ± 0.0
Pro
ProAla: 0.0 ± 0.0
ProCys: 1.193 ± 0.433
ProAsp: 2.983 ± 1.572
ProGlu: 1.193 ± 0.433
ProPhe: 0.597 ± 0.564
ProGly: 0.597 ± 0.564
ProHis: 2.387 ± 1.357
ProIle: 2.387 ± 0.911
ProLys: 3.58 ± 1.3
ProLeu: 5.967 ± 1.191
ProMet: 0.597 ± 0.564
ProAsn: 1.79 ± 0.248
ProPro: 0.0 ± 0.0
ProGln: 0.0 ± 0.0
ProArg: 1.193 ± 0.433
ProSer: 5.37 ± 1.486
ProThr: 1.79 ± 0.705
ProVal: 1.193 ± 0.433
ProTrp: 0.0 ± 0.0
ProTyr: 1.193 ± 0.895
ProXaa: 0.0 ± 0.0
Gln
GlnAla: 2.983 ± 1.699
GlnCys: 0.0 ± 0.0
GlnAsp: 0.597 ± 0.564
GlnGlu: 1.79 ± 0.974
GlnPhe: 2.983 ± 2.154
GlnGly: 3.58 ± 1.46
GlnHis: 1.79 ± 1.09
GlnIle: 0.597 ± 0.447
GlnLys: 2.387 ± 1.155
GlnLeu: 4.177 ± 0.974
GlnMet: 2.387 ± 1.012
GlnAsn: 2.983 ± 1.225
GlnPro: 1.193 ± 0.433
GlnGln: 1.79 ± 1.692
GlnArg: 0.597 ± 0.529
GlnSer: 3.58 ± 0.901
GlnThr: 1.79 ± 0.974
GlnVal: 1.193 ± 1.057
GlnTrp: 0.597 ± 0.447
GlnTyr: 2.387 ± 0.756
GlnXaa: 0.0 ± 0.0
Arg
ArgAla: 4.773 ± 2.441
ArgCys: 0.0 ± 0.0
ArgAsp: 2.387 ± 0.653
ArgGlu: 2.387 ± 0.653
ArgPhe: 2.983 ± 0.745
ArgGly: 1.193 ± 0.895
ArgHis: 0.597 ± 0.529
ArgIle: 2.387 ± 0.329
ArgLys: 3.58 ± 2.07
ArgLeu: 6.563 ± 0.658
ArgMet: 1.79 ± 0.857
ArgAsn: 2.387 ± 1.012
ArgPro: 0.597 ± 0.529
ArgGln: 1.79 ± 0.248
ArgArg: 1.79 ± 0.771
ArgSer: 2.387 ± 1.168
ArgThr: 0.0 ± 0.0
ArgVal: 0.0 ± 0.0
ArgTrp: 0.0 ± 0.0
ArgTyr: 2.983 ± 1.081
ArgXaa: 0.0 ± 0.0
Ser
SerAla: 7.16 ± 2.514
SerCys: 2.983 ± 1.153
SerAsp: 4.177 ± 2.395
SerGlu: 4.177 ± 0.974
SerPhe: 8.353 ± 3.363
SerGly: 3.58 ± 2.684
SerHis: 1.193 ± 1.015
SerIle: 1.79 ± 0.248
SerLys: 6.563 ± 0.876
SerLeu: 10.74 ± 1.755
SerMet: 1.193 ± 0.809
SerAsn: 2.387 ± 0.867
SerPro: 2.387 ± 0.329
SerGln: 1.79 ± 1.035
SerArg: 4.177 ± 1.17
SerSer: 8.95 ± 2.381
SerThr: 7.16 ± 2.93
SerVal: 6.563 ± 1.35
SerTrp: 0.597 ± 0.447
SerTyr: 7.757 ± 0.795
SerXaa: 0.0 ± 0.0
Thr
ThrAla: 5.37 ± 2.432
ThrCys: 0.0 ± 0.0
ThrAsp: 2.983 ± 1.741
ThrGlu: 3.58 ± 0.58
ThrPhe: 2.983 ± 1.577
ThrGly: 1.79 ± 1.035
ThrHis: 0.597 ± 0.447
ThrIle: 4.177 ± 0.693
ThrLys: 2.983 ± 0.994
ThrLeu: 4.177 ± 1.709
ThrMet: 0.0 ± 0.0
ThrAsn: 1.79 ± 0.705
ThrPro: 2.983 ± 1.572
ThrGln: 2.983 ± 1.572
ThrArg: 2.983 ± 0.597
ThrSer: 5.37 ± 0.548
ThrThr: 4.773 ± 1.556
ThrVal: 1.193 ± 0.895
ThrTrp: 1.79 ± 0.771
ThrTyr: 2.387 ± 1.289
ThrXaa: 0.0 ± 0.0
Val
ValAla: 3.58 ± 1.175
ValCys: 1.193 ± 0.433
ValAsp: 4.773 ± 0.545
ValGlu: 0.597 ± 0.529
ValPhe: 3.58 ± 1.184
ValGly: 1.79 ± 0.248
ValHis: 0.0 ± 0.0
ValIle: 4.773 ± 2.284
ValLys: 1.79 ± 0.248
ValLeu: 6.563 ± 2.379
ValMet: 0.597 ± 0.814
ValAsn: 5.37 ± 1.87
ValPro: 3.58 ± 1.184
ValGln: 3.58 ± 1.409
ValArg: 2.387 ± 0.867
ValSer: 8.353 ± 2.512
ValThr: 2.387 ± 0.732
ValVal: 4.773 ± 2.3
ValTrp: 1.193 ± 0.433
ValTyr: 3.58 ± 1.714
ValXaa: 0.0 ± 0.0
Trp
TrpAla: 0.0 ± 0.0
TrpCys: 0.0 ± 0.0
TrpAsp: 0.0 ± 0.0
TrpGlu: 0.597 ± 0.564
TrpPhe: 0.0 ± 0.0
TrpGly: 0.0 ± 0.0
TrpHis: 0.0 ± 0.0
TrpIle: 0.0 ± 0.0
TrpLys: 1.193 ± 0.644
TrpLeu: 1.79 ± 0.771
TrpMet: 0.0 ± 0.0
TrpAsn: 0.597 ± 0.447
TrpPro: 0.0 ± 0.0
TrpGln: 1.193 ± 1.057
TrpArg: 0.597 ± 0.447
TrpSer: 1.79 ± 1.048
TrpThr: 0.0 ± 0.0
TrpVal: 0.0 ± 0.0
TrpTrp: 0.0 ± 0.0
TrpTyr: 1.193 ± 0.895
TrpXaa: 0.0 ± 0.0
Tyr
TyrAla: 4.773 ± 1.873
TyrCys: 0.0 ± 0.0
TyrAsp: 5.967 ± 3.061
TyrGlu: 2.387 ± 0.867
TyrPhe: 4.177 ± 1.489
TyrGly: 2.387 ± 1.512
TyrHis: 2.387 ± 0.653
TyrIle: 2.387 ± 1.512
TyrLys: 2.387 ± 1.512
TyrLeu: 2.983 ± 1.741
TyrMet: 1.193 ± 1.057
TyrAsn: 2.387 ± 0.867
TyrPro: 0.597 ± 0.814
TyrGln: 1.193 ± 0.506
TyrArg: 0.597 ± 0.529
TyrSer: 4.773 ± 0.658
TyrThr: 2.387 ± 0.867
TyrVal: 6.563 ± 1.671
TyrTrp: 0.597 ± 0.564
TyrTyr: 1.79 ± 0.771
TyrXaa: 0.0 ± 0.0
Xaa
XaaAla: 0.0 ± 0.0
XaaCys: 0.0 ± 0.0
XaaAsp: 0.0 ± 0.0
XaaGlu: 0.0 ± 0.0
XaaPhe: 0.0 ± 0.0
XaaGly: 0.0 ± 0.0
XaaHis: 0.0 ± 0.0
XaaIle: 0.0 ± 0.0
XaaLys: 0.0 ± 0.0
XaaLeu: 0.0 ± 0.0
XaaMet: 0.0 ± 0.0
XaaAsn: 0.0 ± 0.0
XaaPro: 0.0 ± 0.0
XaaGln: 0.0 ± 0.0
XaaArg: 0.0 ± 0.0
XaaSer: 0.0 ± 0.0
XaaThr: 0.0 ± 0.0
XaaVal: 0.0 ± 0.0
XaaTrp: 0.0 ± 0.0
XaaTyr: 0.0 ± 0.0
XaaXaa: 0.0 ± 0.0
Statistics based on 4 proteins (1677 amino acids)

Note: The error was estimated by bootstrapping (x100) at the protein level.
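One way to read this note is that whole proteins are resampled with replacement, the frequency is recomputed for each resample, and the spread of those values is reported as the error. The Python sketch below follows that reading. It is an assumption-laden illustration rather than the database's actual procedure: the function names, the use of the sample standard deviation, the fixed seed, and the one-letter toy sequences are all hypothetical.

```python
import random
import statistics
from collections import Counter


def pair_permille(proteins, pair):
    """Per mille frequency of one dipeptide over a list of sequences."""
    counts = Counter()
    for seq in proteins:
        for i in range(len(seq) - 1):
            counts[seq[i:i + 2]] += 1
    total = sum(counts.values())
    return 1000.0 * counts[pair] / total if total else 0.0


def bootstrap_error(proteins, pair, n_rounds=100, seed=0):
    """Standard deviation of the frequency over protein-level resamples."""
    rng = random.Random(seed)
    estimates = []
    for _ in range(n_rounds):
        # Draw as many proteins as in the original set, with replacement.
        sample = [rng.choice(proteins) for _ in proteins]
        estimates.append(pair_permille(sample, pair))
    return statistics.stdev(estimates)


# Toy sequences, made up for illustration (not from this proteome).
toy = ["MKLLA", "ALLK", "GGSLL", "KLAKL"]
print(pair_permille(toy, "LL"), "±", bootstrap_error(toy, "LL"))
```

Under this reading, a dipeptide that never occurs in any protein gets 0.0 in every resample, which would explain entries such as "0.0 ± 0.0" in the table. With only 4 proteins in this proteome, the resamples differ little from each other in composition, so the error estimates should be treated as rough.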

The dipeptide statistics above (along with other statistics for this proteome) can be downloaded from this CSV file
See this proteome in: uniprot_link
Proteome-pI is available under the Creative Commons Attribution-NoDerivs license; for more details see here

Reference: Kozlowski LP. Proteome-pI 2.0: Proteome Isoelectric Point Database Update. Nucleic Acids Res. 2021, doi: 10.1093/nar/gkab944
Contact: Lukasz P. Kozlowski