Amino acid dipeptide frequency for Enterobacteria phage Hgal1

Apart from single amino acid frequencies, one can also calculate the so-called amino acid dipeptide frequency, i.e. how often each ordered pair of adjacent amino acids occurs.

There are 400 possibilities (441 if we consider Xaa as an additional 21st amino acid representing all non-standard or unknown amino acids). Thus, if dipeptides occurred uniformly at random in proteins, each amino acid dipeptide would be present at a frequency of 0.25% (1/400, i.e. 2.5‰). As this is not the case in nature, for better visibility all dipeptides that are more frequent than expected are marked in red, and those that are underrepresented are marked in blue in the table.
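As a minimal sketch of how such frequencies could be computed (not necessarily how Proteome-pI does it), the Python snippet below counts overlapping pairs of adjacent residues within each protein and reports them in per mille. The function name `dipeptide_frequencies`, the one-letter input sequences, and the mapping of any non-standard letter to X (Xaa) are assumptions made for illustration.

```python
from collections import Counter

STANDARD = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard amino acids (one-letter codes)


def dipeptide_frequencies(proteins):
    """Return {dipeptide: frequency in per mille} for a list of sequences.

    Assumptions: pairs are counted within each protein only (no pair spans
    two proteins), and any letter outside the 20 standard ones becomes X (Xaa).
    """
    counts = Counter()
    for seq in proteins:
        cleaned = "".join(c if c in STANDARD else "X" for c in seq.upper())
        # Overlapping windows of length 2: positions (0,1), (1,2), ...
        counts.update(cleaned[i:i + 2] for i in range(len(cleaned) - 1))
    total = sum(counts.values())
    if total == 0:
        return {}
    return {pair: 1000.0 * n / total for pair, n in counts.items()}


# Toy input, not the phage proteome.
freqs = dipeptide_frequencies(["MKLLA", "ALLK"])
n_standard = len(STANDARD) ** 2            # 400 possible dipeptides
n_with_xaa = (len(STANDARD) + 1) ** 2      # 441 when Xaa is included
uniform_expectation = 1000.0 / n_standard  # 2.5 per mille, i.e. 0.25%
```

In the toy input there are 7 dipeptides in total and LL occurs twice, so `freqs["LL"]` is 2/7 expressed in per mille.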

All values are given in per mille (‰), so they need to be multiplied by 10^-3 to obtain fractions.
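For example, the AlaLeu entry of 11.545‰ corresponds to a fraction of about 0.0115, or roughly 1.15%. The sketch below shows the conversion and a comparison against the uniform expectation of 2.5‰. The helper names are hypothetical, and using 2.5‰ as the threshold for over- and underrepresentation is an assumption based on the 0.25% figure quoted earlier, not a documented rule of the page.

```python
UNIFORM_PER_MILLE = 1000.0 / 400  # 2.5 per mille, the 0.25% uniform expectation


def per_mille_to_fraction(value):
    """Convert a per mille value to a plain fraction (multiply by 10^-3)."""
    return value * 1e-3


def representation(value, expected=UNIFORM_PER_MILLE):
    """Label a per mille value relative to the expected frequency."""
    if value > expected:
        return "over"
    if value < expected:
        return "under"
    return "as expected"


ala_leu = 11.545  # AlaLeu value from the table, in per mille
print(per_mille_to_fraction(ala_leu))
print(representation(ala_leu))  # over
print(representation(0.888))    # under (e.g. AlaMet)
```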

For more information, see the sequence space article on Wikipedia.

Ala Cys Asp Glu Phe Gly His Ile Lys Leu Met Asn Pro Gln Arg Ser Thr Val Trp Tyr Xaa
Ala
AlaAla: 5.329 ± 1.596
AlaCys: 1.776 ± 1.239
AlaAsp: 2.664 ± 2.843
AlaGlu: 4.44 ± 2.484
AlaPhe: 2.664 ± 1.858
AlaGly: 5.329 ± 1.352
AlaHis: 0.0 ± 0.0
AlaIle: 5.329 ± 1.723
AlaLys: 3.552 ± 2.28
AlaLeu: 11.545 ± 5.133
AlaMet: 0.888 ± 0.752
AlaAsn: 2.664 ± 2.041
AlaPro: 3.552 ± 1.923
AlaGln: 2.664 ± 2.041
AlaArg: 5.329 ± 2.681
AlaSer: 4.44 ± 1.141
AlaThr: 6.217 ± 3.676
AlaVal: 3.552 ± 0.907
AlaTrp: 2.664 ± 1.208
AlaTyr: 4.44 ± 1.645
AlaXaa: 0.0 ± 0.0
Cys
CysAla: 0.888 ± 0.619
CysCys: 0.0 ± 0.0
CysAsp: 1.776 ± 1.239
CysGlu: 0.0 ± 0.0
CysPhe: 0.0 ± 0.0
CysGly: 0.888 ± 0.619
CysHis: 0.0 ± 0.0
CysIle: 0.888 ± 1.868
CysLys: 0.888 ± 0.619
CysLeu: 0.888 ± 0.619
CysMet: 0.0 ± 0.0
CysAsn: 0.0 ± 0.0
CysPro: 0.0 ± 0.0
CysGln: 0.0 ± 0.0
CysArg: 0.888 ± 0.619
CysSer: 1.776 ± 0.596
CysThr: 0.888 ± 0.752
CysVal: 1.776 ± 1.373
CysTrp: 0.0 ± 0.0
CysTyr: 0.0 ± 0.0
CysXaa: 0.0 ± 0.0
Asp
AspAla: 4.44 ± 1.141
AspCys: 0.888 ± 1.532
AspAsp: 3.552 ± 2.478
AspGlu: 0.888 ± 0.619
AspPhe: 1.776 ± 0.596
AspGly: 1.776 ± 1.239
AspHis: 0.0 ± 0.0
AspIle: 4.44 ± 0.764
AspLys: 2.664 ± 2.613
AspLeu: 6.217 ± 3.288
AspMet: 0.888 ± 0.752
AspAsn: 2.664 ± 1.054
AspPro: 1.776 ± 1.505
AspGln: 1.776 ± 1.373
AspArg: 2.664 ± 1.054
AspSer: 2.664 ± 0.955
AspThr: 0.888 ± 0.619
AspVal: 7.105 ± 2.991
AspTrp: 1.776 ± 1.505
AspTyr: 0.888 ± 0.752
AspXaa: 0.0 ± 0.0
Glu
GluAla: 3.552 ± 1.211
GluCys: 1.776 ± 0.596
GluAsp: 0.0 ± 0.0
GluGlu: 2.664 ± 1.858
GluPhe: 1.776 ± 1.239
GluGly: 4.44 ± 2.08
GluHis: 0.888 ± 0.752
GluIle: 0.888 ± 0.752
GluLys: 2.664 ± 1.858
GluLeu: 7.105 ± 1.484
GluMet: 0.888 ± 0.619
GluAsn: 0.0 ± 0.0
GluPro: 2.664 ± 1.054
GluGln: 2.664 ± 1.054
GluArg: 3.552 ± 1.211
GluSer: 1.776 ± 1.239
GluThr: 2.664 ± 2.613
GluVal: 1.776 ± 0.596
GluTrp: 0.0 ± 0.0
GluTyr: 0.0 ± 0.0
GluXaa: 0.0 ± 0.0
Phe
PheAla: 6.217 ± 2.337
PheCys: 0.888 ± 0.619
PheAsp: 0.888 ± 0.752
PheGlu: 5.329 ± 2.416
PhePhe: 2.664 ± 1.858
PheGly: 0.888 ± 0.752
PheHis: 0.888 ± 0.619
PheIle: 1.776 ± 3.736
PheLys: 0.888 ± 0.619
PheLeu: 5.329 ± 1.352
PheMet: 0.0 ± 0.0
PheAsn: 1.776 ± 0.596
PhePro: 2.664 ± 1.208
PheGln: 0.888 ± 0.619
PheArg: 1.776 ± 1.239
PheSer: 5.329 ± 1.723
PheThr: 4.44 ± 2.788
PheVal: 0.888 ± 0.619
PheTrp: 0.0 ± 0.0
PheTyr: 2.664 ± 1.858
PheXaa: 0.0 ± 0.0
Gly
GlyAla: 1.776 ± 0.596
GlyCys: 0.0 ± 0.0
GlyAsp: 3.552 ± 1.192
GlyGlu: 2.664 ± 1.766
GlyPhe: 5.329 ± 1.91
GlyGly: 5.329 ± 1.216
GlyHis: 0.0 ± 0.0
GlyIle: 4.44 ± 2.26
GlyLys: 5.329 ± 2.681
GlyLeu: 4.44 ± 1.75
GlyMet: 2.664 ± 3.758
GlyAsn: 4.44 ± 1.75
GlyPro: 1.776 ± 0.596
GlyGln: 1.776 ± 0.596
GlyArg: 3.552 ± 1.192
GlySer: 3.552 ± 1.495
GlyThr: 5.329 ± 2.362
GlyVal: 4.44 ± 2.26
GlyTrp: 0.888 ± 0.752
GlyTyr: 0.0 ± 0.0
GlyXaa: 0.0 ± 0.0
His
HisAla: 0.0 ± 0.0
HisCys: 0.0 ± 0.0
HisAsp: 0.0 ± 0.0
HisGlu: 0.0 ± 0.0
HisPhe: 0.888 ± 0.619
HisGly: 3.552 ± 1.192
HisHis: 0.0 ± 0.0
HisIle: 0.888 ± 0.619
HisLys: 0.888 ± 0.619
HisLeu: 1.776 ± 0.596
HisMet: 0.0 ± 0.0
HisAsn: 0.0 ± 0.0
HisPro: 0.888 ± 0.619
HisGln: 0.0 ± 0.0
HisArg: 1.776 ± 1.77
HisSer: 1.776 ± 0.596
HisThr: 0.888 ± 1.532
HisVal: 0.888 ± 0.619
HisTrp: 0.0 ± 0.0
HisTyr: 0.888 ± 0.619
HisXaa: 0.0 ± 0.0
Ile
IleAla: 1.776 ± 2.793
IleCys: 0.0 ± 0.0
IleAsp: 5.329 ± 2.681
IleGlu: 1.776 ± 0.596
IlePhe: 1.776 ± 1.239
IleGly: 1.776 ± 1.505
IleHis: 1.776 ± 1.239
IleIle: 2.664 ± 2.569
IleLys: 0.888 ± 0.752
IleLeu: 6.217 ± 2.432
IleMet: 0.888 ± 0.619
IleAsn: 3.552 ± 2.28
IlePro: 1.776 ± 0.596
IleGln: 0.888 ± 1.868
IleArg: 7.993 ± 4.559
IleSer: 7.105 ± 1.484
IleThr: 2.664 ± 1.208
IleVal: 3.552 ± 2.243
IleTrp: 1.776 ± 1.239
IleTyr: 2.664 ± 4.498
IleXaa: 0.0 ± 0.0
Lys
LysAla: 1.776 ± 1.373
LysCys: 0.0 ± 0.0
LysAsp: 2.664 ± 1.054
LysGlu: 0.888 ± 0.619
LysPhe: 1.776 ± 1.806
LysGly: 2.664 ± 1.208
LysHis: 2.664 ± 1.858
LysIle: 1.776 ± 0.596
LysLys: 6.217 ± 2.432
LysLeu: 6.217 ± 0.877
LysMet: 0.888 ± 0.619
LysAsn: 1.776 ± 0.596
LysPro: 0.888 ± 0.619
LysGln: 0.0 ± 0.0
LysArg: 4.44 ± 1.467
LysSer: 7.993 ± 1.502
LysThr: 7.993 ± 3.991
LysVal: 1.776 ± 1.505
LysTrp: 1.776 ± 1.505
LysTyr: 1.776 ± 0.596
LysXaa: 0.0 ± 0.0
Leu
LeuAla: 12.433 ± 2.369
LeuCys: 0.888 ± 0.619
LeuAsp: 2.664 ± 1.882
LeuGlu: 4.44 ± 1.467
LeuPhe: 1.776 ± 1.806
LeuGly: 5.329 ± 1.216
LeuHis: 1.776 ± 0.596
LeuIle: 5.329 ± 1.91
LeuLys: 5.329 ± 2.362
LeuLeu: 7.993 ± 3.046
LeuMet: 2.664 ± 1.377
LeuAsn: 7.105 ± 1.497
LeuPro: 4.44 ± 2.08
LeuGln: 3.552 ± 0.907
LeuArg: 4.44 ± 2.788
LeuSer: 10.657 ± 3.197
LeuThr: 2.664 ± 1.208
LeuVal: 7.993 ± 1.499
LeuTrp: 0.0 ± 0.0
LeuTyr: 1.776 ± 0.596
LeuXaa: 0.0 ± 0.0
Met
MetAla: 1.776 ± 1.77
MetCys: 0.0 ± 0.0
MetAsp: 0.888 ± 0.752
MetGlu: 0.0 ± 0.0
MetPhe: 2.664 ± 1.481
MetGly: 0.888 ± 0.619
MetHis: 0.888 ± 0.619
MetIle: 0.0 ± 0.0
MetLys: 0.0 ± 0.0
MetLeu: 0.888 ± 1.868
MetMet: 0.0 ± 0.0
MetAsn: 0.0 ± 0.0
MetPro: 1.776 ± 1.472
MetGln: 2.664 ± 0.955
MetArg: 0.888 ± 0.752
MetSer: 0.0 ± 0.0
MetThr: 0.888 ± 1.868
MetVal: 1.776 ± 1.472
MetTrp: 0.888 ± 0.752
MetTyr: 0.0 ± 0.0
MetXaa: 0.0 ± 0.0
Asn
AsnAla: 5.329 ± 1.64
AsnCys: 0.888 ± 0.619
AsnAsp: 3.552 ± 2.561
AsnGlu: 1.776 ± 0.596
AsnPhe: 0.888 ± 0.752
AsnGly: 3.552 ± 1.192
AsnHis: 0.0 ± 0.0
AsnIle: 1.776 ± 1.505
AsnLys: 0.888 ± 0.619
AsnLeu: 4.44 ± 2.143
AsnMet: 0.888 ± 0.619
AsnAsn: 1.776 ± 1.373
AsnPro: 2.664 ± 2.041
AsnGln: 2.664 ± 0.955
AsnArg: 1.776 ± 0.596
AsnSer: 7.105 ± 2.402
AsnThr: 2.664 ± 2.257
AsnVal: 6.217 ± 3.676
AsnTrp: 1.776 ± 0.596
AsnTyr: 1.776 ± 1.472
AsnXaa: 0.0 ± 0.0
Pro
ProAla: 2.664 ± 2.257
ProCys: 0.888 ± 0.619
ProAsp: 1.776 ± 1.239
ProGlu: 0.888 ± 0.619
ProPhe: 1.776 ± 0.596
ProGly: 0.888 ± 0.619
ProHis: 0.0 ± 0.0
ProIle: 7.105 ± 1.862
ProLys: 3.552 ± 0.907
ProLeu: 5.329 ± 1.789
ProMet: 0.888 ± 1.868
ProAsn: 2.664 ± 0.955
ProPro: 0.888 ± 0.752
ProGln: 1.776 ± 1.472
ProArg: 3.552 ± 3.884
ProSer: 3.552 ± 2.478
ProThr: 0.888 ± 0.619
ProVal: 3.552 ± 4.41
ProTrp: 0.888 ± 0.752
ProTyr: 2.664 ± 1.208
ProXaa: 0.0 ± 0.0
Gln
GlnAla: 4.44 ± 2.0
GlnCys: 0.0 ± 0.0
GlnAsp: 2.664 ± 1.054
GlnGlu: 0.888 ± 0.619
GlnPhe: 1.776 ± 1.806
GlnGly: 2.664 ± 0.955
GlnHis: 0.0 ± 0.0
GlnIle: 3.552 ± 2.243
GlnLys: 0.0 ± 0.0
GlnLeu: 3.552 ± 0.907
GlnMet: 0.0 ± 0.0
GlnAsn: 3.552 ± 2.561
GlnPro: 0.888 ± 0.619
GlnGln: 0.888 ± 0.752
GlnArg: 0.888 ± 0.619
GlnSer: 1.776 ± 0.596
GlnThr: 3.552 ± 1.211
GlnVal: 1.776 ± 1.505
GlnTrp: 1.776 ± 0.596
GlnTyr: 0.888 ± 0.619
GlnXaa: 0.0 ± 0.0
Arg
ArgAla: 7.105 ± 0.909
ArgCys: 0.888 ± 1.868
ArgAsp: 1.776 ± 1.505
ArgGlu: 4.44 ± 1.645
ArgPhe: 2.664 ± 2.569
ArgGly: 2.664 ± 1.208
ArgHis: 2.664 ± 0.955
ArgIle: 2.664 ± 1.882
ArgLys: 2.664 ± 2.257
ArgLeu: 7.993 ± 4.488
ArgMet: 0.888 ± 0.692
ArgAsn: 7.105 ± 2.854
ArgPro: 2.664 ± 1.208
ArgGln: 3.552 ± 2.561
ArgArg: 7.105 ± 2.026
ArgSer: 3.552 ± 2.478
ArgThr: 3.552 ± 3.01
ArgVal: 0.888 ± 0.619
ArgTrp: 1.776 ± 0.596
ArgTyr: 2.664 ± 0.955
ArgXaa: 0.0 ± 0.0
Ser
SerAla: 7.993 ± 1.285
SerCys: 0.888 ± 0.619
SerAsp: 8.881 ± 3.378
SerGlu: 1.776 ± 1.239
SerPhe: 4.44 ± 1.467
SerGly: 5.329 ± 2.416
SerHis: 2.664 ± 1.054
SerIle: 5.329 ± 1.723
SerLys: 3.552 ± 2.478
SerLeu: 3.552 ± 0.907
SerMet: 0.888 ± 0.619
SerAsn: 2.664 ± 0.955
SerPro: 3.552 ± 2.243
SerGln: 4.44 ± 2.016
SerArg: 5.329 ± 1.91
SerSer: 8.881 ± 1.801
SerThr: 6.217 ± 1.872
SerVal: 5.329 ± 2.014
SerTrp: 0.888 ± 0.619
SerTyr: 1.776 ± 1.239
SerXaa: 0.0 ± 0.0
Thr
ThrAla: 2.664 ± 1.766
ThrCys: 0.0 ± 0.0
ThrAsp: 1.776 ± 1.373
ThrGlu: 2.664 ± 1.054
ThrPhe: 6.217 ± 3.605
ThrGly: 1.776 ± 1.373
ThrHis: 0.0 ± 0.0
ThrIle: 2.664 ± 1.208
ThrLys: 7.105 ± 3.897
ThrLeu: 4.44 ± 2.143
ThrMet: 0.888 ± 1.495
ThrAsn: 1.776 ± 0.596
ThrPro: 5.329 ± 2.25
ThrGln: 2.664 ± 0.955
ThrArg: 4.44 ± 1.75
ThrSer: 5.329 ± 2.362
ThrThr: 5.329 ± 2.362
ThrVal: 8.881 ± 2.909
ThrTrp: 0.0 ± 0.0
ThrTyr: 0.888 ± 0.752
ThrXaa: 0.0 ± 0.0
Val
ValAla: 6.217 ± 1.872
ValCys: 0.0 ± 0.0
ValAsp: 3.552 ± 1.192
ValGlu: 2.664 ± 2.909
ValPhe: 1.776 ± 1.239
ValGly: 7.105 ± 3.345
ValHis: 0.888 ± 1.868
ValIle: 3.552 ± 2.478
ValLys: 4.44 ± 1.719
ValLeu: 1.776 ± 1.472
ValMet: 1.776 ± 0.596
ValAsn: 6.217 ± 1.059
ValPro: 6.217 ± 3.076
ValGln: 1.776 ± 1.373
ValArg: 4.44 ± 0.764
ValSer: 2.664 ± 1.481
ValThr: 5.329 ± 2.014
ValVal: 9.769 ± 3.737
ValTrp: 0.888 ± 0.752
ValTyr: 1.776 ± 0.596
ValXaa: 0.0 ± 0.0
Trp
TrpAla: 0.888 ± 0.752
TrpCys: 0.888 ± 0.619
TrpAsp: 1.776 ± 1.505
TrpGlu: 0.888 ± 0.619
TrpPhe: 2.664 ± 0.955
TrpGly: 0.0 ± 0.0
TrpHis: 0.0 ± 0.0
TrpIle: 0.888 ± 0.752
TrpLys: 1.776 ± 0.596
TrpLeu: 0.888 ± 0.752
TrpMet: 0.0 ± 0.0
TrpAsn: 0.888 ± 0.752
TrpPro: 0.0 ± 0.0
TrpGln: 0.0 ± 0.0
TrpArg: 1.776 ± 1.505
TrpSer: 1.776 ± 1.505
TrpThr: 1.776 ± 0.596
TrpVal: 0.0 ± 0.0
TrpTrp: 0.888 ± 0.619
TrpTyr: 0.888 ± 0.619
TrpXaa: 0.0 ± 0.0
Tyr
TyrAla: 1.776 ± 0.596
TyrCys: 0.888 ± 0.752
TyrAsp: 0.0 ± 0.0
TyrGlu: 2.664 ± 0.955
TyrPhe: 0.888 ± 0.752
TyrGly: 4.44 ± 2.26
TyrHis: 0.0 ± 0.0
TyrIle: 0.888 ± 0.752
TyrLys: 2.664 ± 1.766
TyrLeu: 3.552 ± 3.539
TyrMet: 0.0 ± 0.0
TyrAsn: 1.776 ± 0.596
TyrPro: 1.776 ± 1.239
TyrGln: 0.888 ± 0.619
TyrArg: 2.664 ± 1.208
TyrSer: 2.664 ± 1.858
TyrThr: 0.0 ± 0.0
TyrVal: 0.888 ± 1.868
TyrTrp: 0.0 ± 0.0
TyrTyr: 0.0 ± 0.0
TyrXaa: 0.0 ± 0.0
Xaa
XaaAla: 0.0 ± 0.0
XaaCys: 0.0 ± 0.0
XaaAsp: 0.0 ± 0.0
XaaGlu: 0.0 ± 0.0
XaaPhe: 0.0 ± 0.0
XaaGly: 0.0 ± 0.0
XaaHis: 0.0 ± 0.0
XaaIle: 0.0 ± 0.0
XaaLys: 0.0 ± 0.0
XaaLeu: 0.0 ± 0.0
XaaMet: 0.0 ± 0.0
XaaAsn: 0.0 ± 0.0
XaaPro: 0.0 ± 0.0
XaaGln: 0.0 ± 0.0
XaaArg: 0.0 ± 0.0
XaaSer: 0.0 ± 0.0
XaaThr: 0.0 ± 0.0
XaaVal: 0.0 ± 0.0
XaaTrp: 0.0 ± 0.0
XaaTyr: 0.0 ± 0.0
XaaXaa: 0.0 ± 0.0
Statistics based on 4 proteins (1127 amino acids)
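
To work with the listing programmatically, each cell can be parsed into its two residues, the value, and the error. The sketch below is one way to do that in Python. The regular expression and the name `parse_cell` are illustrative assumptions based only on the `AlaCys: 1.776 ± 1.239` pattern visible in the table, not on any documented format of the page or of its CSV download.

```python
import re

# Two three-letter residue codes, a colon, the value, "±", and the error.
CELL = re.compile(r"([A-Z][a-z]{2})([A-Z][a-z]{2}):\s*([\d.]+)\s*±\s*([\d.]+)")


def parse_cell(line):
    """Return (first, second, value, error), or None if the line is not a cell."""
    # search() rather than match(), so any text before the residue pair is ignored.
    match = CELL.search(line)
    if match is None:
        return None
    first, second, value, error = match.groups()
    return first, second, float(value), float(error)


sample = ["AlaLeu: 11.545 ± 5.133", "LeuAla: 12.433 ± 2.369", "Ala"]
cells = [c for c in map(parse_cell, sample) if c is not None]
most_frequent = max(cells, key=lambda c: c[2])
print(most_frequent)  # ('Leu', 'Ala', 12.433, 2.369)
```

Row labels such as `Ala` do not match the pattern and are skipped.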

Note: The error has been estimated by bootstrapping (x100) at the protein level.
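
The page does not spell out the procedure beyond this note, so the following is only a sketch of what a protein-level bootstrap with 100 replicates might look like: whole proteins are resampled with replacement, the dipeptide frequency is recomputed for each resample, and the standard deviation over the replicates is taken as the error. The function names, the fixed seed, and the use of the sample standard deviation are all assumptions.

```python
import random
import statistics
from collections import Counter


def pair_per_mille(proteins, pair):
    """Frequency (per mille) of one dipeptide, counted within each protein."""
    counts = Counter()
    for seq in proteins:
        counts.update(seq[i:i + 2] for i in range(len(seq) - 1))
    total = sum(counts.values())
    return 1000.0 * counts[pair] / total if total else 0.0


def bootstrap_error(proteins, pair, replicates=100, seed=0):
    """Standard deviation of the frequency over protein-level resamples."""
    rng = random.Random(seed)
    estimates = []
    for _ in range(replicates):
        # Draw as many proteins as in the original set, with replacement.
        resample = rng.choices(proteins, k=len(proteins))
        estimates.append(pair_per_mille(resample, pair))
    return statistics.stdev(estimates)


# Toy sequences (one-letter codes), not the phage proteome.
toy = ["MKLLALLK", "MALAVV", "MSSLLA", "MKKTT"]
print(pair_per_mille(toy, "LL"), bootstrap_error(toy, "LL"))
```

With only a handful of proteins, individual resamples can differ a lot from one another, which is one plausible reason why many errors in the table are of the same order as the values themselves.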

The dipeptide statistics above (among other statistics for this proteome) can be downloaded from this CSV file
See this proteome in: uniprot_link
Proteome-pI is available under the Creative Commons Attribution-NoDerivs license; for more details see here

Reference: Kozlowski LP. Proteome-pI 2.0: Proteome Isoelectric Point Database Update. Nucleic Acids Res. 2021, doi: 10.1093/nar/gkab944
Contact: Lukasz P. Kozlowski