Amino acid dipeptide frequency for Escherichia phage Lilleput

Apart from single amino acid frequencies, one can also calculate the so-called amino acid dipeptide frequency, that is, how often each ordered pair of adjacent residues occurs.

There are 400 possible dipeptides (441 if Xaa is counted as an additional, 21st amino acid representing all non-standard or unknown amino acids). Thus, if dipeptides occurred randomly and with equal probability in proteins, each of the 400 dipeptides would be present at a frequency of 0.25% (2.5‰). As this is not the case in nature, for better visibility all dipeptides that are more frequent than expected are marked in red in the table, and those that are underrepresented are marked in blue.
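The counting described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the database's own implementation: the function names, the toy sequences, and the choice to count overlapping pairs within each protein (never across protein boundaries) are all hypothetical, and the expected value used is the uniform 1/400 from the text.

```python
from collections import Counter

STANDARD = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues, one-letter codes
UNIFORM_PERMILLE = 1000.0 / 400    # uniform expectation: 2.5 per mille (0.25%)


def dipeptide_permille(proteins):
    """Return {dipeptide: frequency in per mille} over overlapping residue pairs.

    Assumption: pairs are counted inside each protein only.
    """
    counts = Counter()
    for seq in proteins:
        for a, b in zip(seq, seq[1:]):
            # Anything outside the 20 standard residues is treated as X (Xaa).
            a = a if a in STANDARD else "X"
            b = b if b in STANDARD else "X"
            counts[a + b] += 1
    total = sum(counts.values())
    if total == 0:
        return {}
    return {dp: 1000.0 * n / total for dp, n in counts.items()}


def classify(freq_permille, expected=UNIFORM_PERMILLE):
    """Label a frequency relative to the uniform expectation."""
    if freq_permille > expected:
        return "over"
    if freq_permille < expected:
        return "under"
    return "as expected"


# Hypothetical toy input, not the phage proteome.
toy = ["MKTAYIAK", "AAGA"]
freqs = dipeptide_permille(toy)
for dp in sorted(freqs):
    print(dp, round(freqs[dp], 3), classify(freqs[dp]))
```

In this sketch "over" and "under" play the role of the red and blue markings; a real analysis would more likely compare against an expectation derived from the single amino acid frequencies rather than the uniform value.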

All values are presented in per mille (‰); to obtain fractions, multiply them by 10⁻³.
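As a small worked example of the unit conversion, the snippet below turns a per mille value into a fraction and a percentage and compares it with the uniform expectation of 2.5‰. The helper names are hypothetical; the AlaGly value is taken from the table.

```python
def permille_to_fraction(value):
    """Convert a per mille value to a plain fraction (multiply by 10^-3)."""
    return value * 1e-3


def permille_to_percent(value):
    """Convert a per mille value to percent (divide by 10)."""
    return value / 10.0


ala_gly = 11.002   # AlaGly frequency from the table, in per mille
uniform = 2.5      # uniform expectation for 400 dipeptides, in per mille

fraction = permille_to_fraction(ala_gly)   # about 0.011
percent = permille_to_percent(ala_gly)     # about 1.1 %
fold = ala_gly / uniform                   # about 4.4 times the uniform value

print(fraction, percent, fold)
```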

For more information, see the sequence space article on Wikipedia.

Residue order (first residue = row, second residue = column): Ala Cys Asp Glu Phe Gly His Ile Lys Leu Met Asn Pro Gln Arg Ser Thr Val Trp Tyr Xaa
Each entry: dipeptide: frequency ± error (‰)

Ala
AlaAla: 8.686 ± 1.651
AlaCys: 1.737 ± 1.313
AlaAsp: 7.528 ± 1.609
AlaGlu: 2.895 ± 1.228
AlaPhe: 2.895 ± 1.347
AlaGly: 11.002 ± 4.136
AlaHis: 2.895 ± 2.105
AlaIle: 5.79 ± 1.565
AlaLys: 6.369 ± 1.771
AlaLeu: 6.948 ± 3.503
AlaMet: 1.158 ± 0.499
AlaAsn: 2.316 ± 0.768
AlaPro: 4.053 ± 0.81
AlaGln: 4.053 ± 1.279
AlaArg: 1.158 ± 1.001
AlaSer: 10.423 ± 1.689
AlaThr: 6.948 ± 1.619
AlaVal: 8.107 ± 2.545
AlaTrp: 0.579 ± 0.507
AlaTyr: 1.737 ± 0.647
AlaXaa: 0.0 ± 0.0

Cys
CysAla: 0.579 ± 0.675
CysCys: 0.579 ± 0.507
CysAsp: 0.0 ± 0.0
CysGlu: 0.0 ± 0.0
CysPhe: 0.579 ± 0.507
CysGly: 0.0 ± 0.0
CysHis: 0.579 ± 0.507
CysIle: 0.0 ± 0.0
CysLys: 0.579 ± 0.697
CysLeu: 1.158 ± 0.682
CysMet: 0.0 ± 0.0
CysAsn: 0.0 ± 0.0
CysPro: 0.579 ± 0.421
CysGln: 0.0 ± 0.0
CysArg: 0.579 ± 0.675
CysSer: 0.579 ± 0.421
CysThr: 0.579 ± 0.507
CysVal: 1.158 ± 0.682
CysTrp: 0.0 ± 0.0
CysTyr: 1.158 ± 0.842
CysXaa: 0.0 ± 0.0

Asp
AspAla: 5.79 ± 1.908
AspCys: 1.158 ± 0.677
AspAsp: 2.895 ± 0.921
AspGlu: 3.474 ± 1.441
AspPhe: 2.316 ± 1.458
AspGly: 4.053 ± 1.492
AspHis: 1.158 ± 0.499
AspIle: 5.79 ± 1.388
AspLys: 1.737 ± 1.035
AspLeu: 2.316 ± 0.941
AspMet: 0.579 ± 0.421
AspAsn: 4.632 ± 1.851
AspPro: 1.158 ± 0.499
AspGln: 2.316 ± 0.966
AspArg: 3.474 ± 1.246
AspSer: 4.632 ± 1.617
AspThr: 4.053 ± 1.656
AspVal: 2.316 ± 0.571
AspTrp: 0.579 ± 0.769
AspTyr: 4.053 ± 1.077
AspXaa: 0.0 ± 0.0

Glu
GluAla: 2.316 ± 0.752
GluCys: 1.737 ± 0.772
GluAsp: 1.737 ± 1.313
GluGlu: 2.895 ± 1.433
GluPhe: 1.737 ± 2.091
GluGly: 2.316 ± 1.249
GluHis: 1.158 ± 0.898
GluIle: 3.474 ± 0.588
GluLys: 1.158 ± 0.898
GluLeu: 4.053 ± 1.64
GluMet: 1.158 ± 0.569
GluAsn: 2.316 ± 1.094
GluPro: 0.579 ± 0.421
GluGln: 0.0 ± 0.0
GluArg: 1.737 ± 0.823
GluSer: 3.474 ± 1.604
GluThr: 2.895 ± 0.931
GluVal: 0.579 ± 0.525
GluTrp: 0.579 ± 0.421
GluTyr: 1.158 ± 0.499
GluXaa: 0.0 ± 0.0

Phe
PheAla: 2.316 ± 1.445
PheCys: 0.579 ± 0.421
PheAsp: 0.579 ± 0.421
PheGlu: 1.737 ± 1.053
PhePhe: 0.579 ± 0.507
PheGly: 2.316 ± 0.675
PheHis: 0.579 ± 0.421
PheIle: 1.737 ± 0.647
PheLys: 0.579 ± 0.507
PheLeu: 1.737 ± 1.19
PheMet: 1.737 ± 0.913
PheAsn: 1.737 ± 0.77
PhePro: 2.316 ± 1.039
PheGln: 2.316 ± 1.223
PheArg: 4.053 ± 1.101
PheSer: 1.158 ± 0.499
PheThr: 4.632 ± 1.108
PheVal: 3.474 ± 2.334
PheTrp: 0.579 ± 0.507
PheTyr: 2.316 ± 1.138
PheXaa: 0.0 ± 0.0

Gly
GlyAla: 8.107 ± 2.514
GlyCys: 0.579 ± 0.697
GlyAsp: 2.316 ± 1.563
GlyGlu: 1.737 ± 0.772
GlyPhe: 2.895 ± 0.805
GlyGly: 4.632 ± 2.268
GlyHis: 0.579 ± 0.421
GlyIle: 5.79 ± 2.105
GlyLys: 6.369 ± 1.926
GlyLeu: 4.632 ± 1.637
GlyMet: 1.737 ± 0.882
GlyAsn: 3.474 ± 0.766
GlyPro: 1.737 ± 0.97
GlyGln: 3.474 ± 1.257
GlyArg: 2.895 ± 1.229
GlySer: 4.053 ± 1.447
GlyThr: 2.316 ± 0.571
GlyVal: 3.474 ± 1.478
GlyTrp: 1.158 ± 0.842
GlyTyr: 2.895 ± 0.921
GlyXaa: 0.0 ± 0.0

His
HisAla: 0.579 ± 0.421
HisCys: 0.0 ± 0.0
HisAsp: 0.0 ± 0.0
HisGlu: 0.0 ± 0.0
HisPhe: 1.737 ± 0.734
HisGly: 2.316 ± 1.066
HisHis: 0.0 ± 0.0
HisIle: 1.158 ± 1.014
HisLys: 0.579 ± 0.507
HisLeu: 2.316 ± 0.998
HisMet: 0.0 ± 0.0
HisAsn: 1.737 ± 0.617
HisPro: 1.158 ± 0.79
HisGln: 1.737 ± 0.77
HisArg: 0.579 ± 0.507
HisSer: 1.158 ± 0.499
HisThr: 2.316 ± 0.744
HisVal: 0.0 ± 0.0
HisTrp: 2.316 ± 1.138
HisTyr: 1.158 ± 1.014
HisXaa: 0.0 ± 0.0

Ile
IleAla: 9.844 ± 2.023
IleCys: 0.0 ± 0.0
IleAsp: 5.211 ± 1.886
IleGlu: 1.158 ± 0.898
IlePhe: 0.0 ± 0.0
IleGly: 2.895 ± 1.026
IleHis: 0.0 ± 0.0
IleIle: 1.158 ± 0.677
IleLys: 2.895 ± 1.034
IleLeu: 2.316 ± 1.382
IleMet: 3.474 ± 1.757
IleAsn: 2.316 ± 1.355
IlePro: 3.474 ± 1.798
IleGln: 5.211 ± 1.549
IleArg: 2.895 ± 1.59
IleSer: 5.211 ± 2.476
IleThr: 2.316 ± 1.194
IleVal: 1.158 ± 0.84
IleTrp: 1.158 ± 0.499
IleTyr: 1.158 ± 1.014
IleXaa: 0.0 ± 0.0

Lys
LysAla: 5.211 ± 0.876
LysCys: 0.0 ± 0.0
LysAsp: 6.948 ± 3.35
LysGlu: 1.737 ± 0.882
LysPhe: 1.737 ± 0.77
LysGly: 5.211 ± 1.594
LysHis: 0.579 ± 0.675
LysIle: 2.895 ± 0.947
LysLys: 2.316 ± 0.856
LysLeu: 4.053 ± 1.905
LysMet: 4.632 ± 1.518
LysAsn: 1.737 ± 0.77
LysPro: 1.737 ± 1.263
LysGln: 4.053 ± 1.919
LysArg: 0.579 ± 0.421
LysSer: 3.474 ± 1.532
LysThr: 2.895 ± 1.217
LysVal: 1.737 ± 0.942
LysTrp: 1.158 ± 0.75
LysTyr: 1.737 ± 0.913
LysXaa: 0.0 ± 0.0

Leu
LeuAla: 8.107 ± 1.967
LeuCys: 0.579 ± 0.675
LeuAsp: 5.211 ± 1.408
LeuGlu: 2.316 ± 0.9
LeuPhe: 2.895 ± 0.638
LeuGly: 4.632 ± 1.146
LeuHis: 1.737 ± 0.913
LeuIle: 4.053 ± 1.195
LeuLys: 6.948 ± 1.856
LeuLeu: 5.211 ± 1.157
LeuMet: 3.474 ± 1.167
LeuAsn: 4.053 ± 1.892
LeuPro: 2.316 ± 0.998
LeuGln: 5.79 ± 1.189
LeuArg: 4.632 ± 1.505
LeuSer: 4.632 ± 2.066
LeuThr: 9.265 ± 2.289
LeuVal: 4.632 ± 1.384
LeuTrp: 0.579 ± 0.421
LeuTyr: 1.737 ± 0.77
LeuXaa: 0.0 ± 0.0

Met
MetAla: 2.316 ± 1.138
MetCys: 0.0 ± 0.0
MetAsp: 0.579 ± 0.421
MetGlu: 1.158 ± 0.79
MetPhe: 1.158 ± 0.787
MetGly: 1.158 ± 0.569
MetHis: 0.0 ± 0.0
MetIle: 0.579 ± 0.697
MetLys: 2.895 ± 1.044
MetLeu: 3.474 ± 1.287
MetMet: 0.579 ± 0.661
MetAsn: 2.316 ± 0.9
MetPro: 0.579 ± 0.421
MetGln: 2.316 ± 1.452
MetArg: 2.895 ± 1.026
MetSer: 4.053 ± 1.49
MetThr: 2.316 ± 1.39
MetVal: 1.158 ± 0.591
MetTrp: 0.0 ± 0.0
MetTyr: 0.0 ± 0.0
MetXaa: 0.0 ± 0.0

Asn
AsnAla: 4.632 ± 1.848
AsnCys: 0.0 ± 0.0
AsnAsp: 1.737 ± 0.461
AsnGlu: 1.737 ± 0.762
AsnPhe: 2.895 ± 1.039
AsnGly: 1.737 ± 0.647
AsnHis: 0.0 ± 0.0
AsnIle: 3.474 ± 2.243
AsnLys: 2.895 ± 1.088
AsnLeu: 5.211 ± 1.63
AsnMet: 1.737 ± 0.823
AsnAsn: 2.895 ± 0.63
AsnPro: 4.632 ± 0.756
AsnGln: 2.895 ± 1.905
AsnArg: 1.737 ± 0.913
AsnSer: 4.053 ± 1.657
AsnThr: 5.79 ± 1.312
AsnVal: 2.895 ± 0.947
AsnTrp: 0.0 ± 0.0
AsnTyr: 2.895 ± 0.805
AsnXaa: 0.0 ± 0.0

Pro
ProAla: 1.737 ± 1.558
ProCys: 0.0 ± 0.0
ProAsp: 1.158 ± 0.591
ProGlu: 2.895 ± 0.931
ProPhe: 1.158 ± 0.499
ProGly: 1.158 ± 1.049
ProHis: 0.579 ± 0.507
ProIle: 1.737 ± 0.904
ProLys: 1.737 ± 0.772
ProLeu: 5.211 ± 1.568
ProMet: 0.0 ± 0.0
ProAsn: 4.053 ± 1.82
ProPro: 2.316 ± 1.401
ProGln: 1.737 ± 0.617
ProArg: 1.737 ± 1.13
ProSer: 2.316 ± 1.401
ProThr: 5.211 ± 1.911
ProVal: 5.79 ± 1.983
ProTrp: 1.158 ± 0.569
ProTyr: 0.579 ± 0.421
ProXaa: 0.0 ± 0.0

Gln
GlnAla: 5.79 ± 2.082
GlnCys: 0.0 ± 0.0
GlnAsp: 1.158 ± 0.842
GlnGlu: 2.895 ± 0.798
GlnPhe: 1.737 ± 1.089
GlnGly: 2.895 ± 1.197
GlnHis: 1.158 ± 0.591
GlnIle: 1.158 ± 0.591
GlnLys: 2.895 ± 1.602
GlnLeu: 6.948 ± 1.497
GlnMet: 0.579 ± 0.525
GlnAsn: 4.053 ± 1.656
GlnPro: 1.158 ± 0.677
GlnGln: 2.316 ± 1.452
GlnArg: 1.737 ± 0.461
GlnSer: 2.895 ± 0.824
GlnThr: 5.79 ± 1.861
GlnVal: 2.895 ± 1.449
GlnTrp: 0.579 ± 0.507
GlnTyr: 2.316 ± 0.675
GlnXaa: 0.0 ± 0.0

Arg
ArgAla: 6.369 ± 3.063
ArgCys: 1.158 ± 0.499
ArgAsp: 5.211 ± 1.773
ArgGlu: 0.579 ± 0.507
ArgPhe: 2.316 ± 1.223
ArgGly: 3.474 ± 1.409
ArgHis: 2.895 ± 2.129
ArgIle: 2.316 ± 1.684
ArgLys: 1.737 ± 1.035
ArgLeu: 4.053 ± 1.899
ArgMet: 1.737 ± 0.882
ArgAsn: 1.158 ± 0.75
ArgPro: 1.158 ± 0.499
ArgGln: 1.737 ± 0.882
ArgArg: 3.474 ± 1.08
ArgSer: 2.895 ± 0.638
ArgThr: 2.895 ± 1.026
ArgVal: 3.474 ± 1.246
ArgTrp: 0.0 ± 0.0
ArgTyr: 2.316 ± 0.768
ArgXaa: 0.0 ± 0.0

Ser
SerAla: 11.002 ± 3.813
SerCys: 0.0 ± 0.0
SerAsp: 2.895 ± 1.643
SerGlu: 1.737 ± 0.797
SerPhe: 1.737 ± 1.089
SerGly: 2.895 ± 1.109
SerHis: 1.737 ± 0.617
SerIle: 3.474 ± 2.16
SerLys: 4.053 ± 1.577
SerLeu: 6.369 ± 1.026
SerMet: 3.474 ± 1.452
SerAsn: 4.053 ± 1.183
SerPro: 2.895 ± 1.24
SerGln: 2.316 ± 1.249
SerArg: 7.528 ± 1.67
SerSer: 5.211 ± 2.061
SerThr: 4.632 ± 1.181
SerVal: 4.053 ± 0.722
SerTrp: 0.579 ± 0.507
SerTyr: 2.316 ± 1.066
SerXaa: 0.0 ± 0.0

Thr
ThrAla: 5.79 ± 2.231
ThrCys: 0.579 ± 0.507
ThrAsp: 4.053 ± 1.068
ThrGlu: 3.474 ± 1.757
ThrPhe: 2.895 ± 0.834
ThrGly: 4.053 ± 2.467
ThrHis: 2.316 ± 1.39
ThrIle: 2.895 ± 1.176
ThrLys: 5.79 ± 2.126
ThrLeu: 7.528 ± 1.265
ThrMet: 0.579 ± 0.421
ThrAsn: 3.474 ± 1.337
ThrPro: 4.053 ± 1.118
ThrGln: 5.211 ± 2.081
ThrArg: 3.474 ± 0.74
ThrSer: 7.528 ± 1.854
ThrThr: 7.528 ± 2.989
ThrVal: 5.211 ± 2.349
ThrTrp: 1.158 ± 0.591
ThrTyr: 0.579 ± 0.675
ThrXaa: 0.0 ± 0.0

Val
ValAla: 4.632 ± 0.934
ValCys: 0.0 ± 0.0
ValAsp: 5.211 ± 1.32
ValGlu: 3.474 ± 1.947
ValPhe: 1.158 ± 0.499
ValGly: 5.211 ± 1.608
ValHis: 1.737 ± 1.045
ValIle: 4.632 ± 1.51
ValLys: 2.316 ± 0.9
ValLeu: 5.211 ± 1.944
ValMet: 0.579 ± 0.525
ValAsn: 3.474 ± 1.55
ValPro: 3.474 ± 1.585
ValGln: 2.895 ± 0.947
ValArg: 2.895 ± 0.609
ValSer: 2.316 ± 1.063
ValThr: 3.474 ± 1.876
ValVal: 1.158 ± 0.677
ValTrp: 0.579 ± 0.697
ValTyr: 2.895 ± 1.434
ValXaa: 0.0 ± 0.0

Trp
TrpAla: 0.579 ± 0.507
TrpCys: 0.0 ± 0.0
TrpAsp: 0.0 ± 0.0
TrpGlu: 0.579 ± 0.525
TrpPhe: 0.579 ± 0.769
TrpGly: 0.0 ± 0.0
TrpHis: 0.0 ± 0.0
TrpIle: 1.158 ± 0.79
TrpLys: 1.158 ± 0.591
TrpLeu: 1.158 ± 0.75
TrpMet: 0.579 ± 0.507
TrpAsn: 1.737 ± 0.461
TrpPro: 1.158 ± 0.842
TrpGln: 0.0 ± 0.0
TrpArg: 0.0 ± 0.0
TrpSer: 1.737 ± 0.772
TrpThr: 1.737 ± 0.772
TrpVal: 0.0 ± 0.0
TrpTrp: 0.0 ± 0.0
TrpTyr: 1.158 ± 0.499
TrpXaa: 0.0 ± 0.0

Tyr
TyrAla: 2.895 ± 0.947
TyrCys: 0.0 ± 0.0
TyrAsp: 4.053 ± 1.71
TyrGlu: 0.579 ± 0.769
TyrPhe: 4.053 ± 1.101
TyrGly: 2.895 ± 0.765
TyrHis: 1.158 ± 1.014
TyrIle: 0.579 ± 0.507
TyrLys: 0.0 ± 0.0
TyrLeu: 2.895 ± 1.217
TyrMet: 1.158 ± 0.842
TyrAsn: 2.316 ± 0.768
TyrPro: 1.737 ± 0.898
TyrGln: 0.579 ± 0.421
TyrArg: 2.895 ± 1.382
TyrSer: 1.158 ± 0.842
TyrThr: 0.579 ± 0.507
TyrVal: 4.053 ± 0.928
TyrTrp: 0.579 ± 0.525
TyrTyr: 1.158 ± 1.001
TyrXaa: 0.0 ± 0.0

Xaa
XaaAla: 0.0 ± 0.0
XaaCys: 0.0 ± 0.0
XaaAsp: 0.0 ± 0.0
XaaGlu: 0.0 ± 0.0
XaaPhe: 0.0 ± 0.0
XaaGly: 0.0 ± 0.0
XaaHis: 0.0 ± 0.0
XaaIle: 0.0 ± 0.0
XaaLys: 0.0 ± 0.0
XaaLeu: 0.0 ± 0.0
XaaMet: 0.0 ± 0.0
XaaAsn: 0.0 ± 0.0
XaaPro: 0.0 ± 0.0
XaaGln: 0.0 ± 0.0
XaaArg: 0.0 ± 0.0
XaaSer: 0.0 ± 0.0
XaaThr: 0.0 ± 0.0
XaaVal: 0.0 ± 0.0
XaaTrp: 0.0 ± 0.0
XaaTyr: 0.0 ± 0.0
XaaXaa: 0.0 ± 0.0

Statistics based on 6 proteins (1728 amino acids)
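
With so few residues, each tabulated value corresponds to only a handful of occurrences. The sketch below converts per mille values back to approximate counts. The denominator of 1727 pairs (1728 residues minus one) is an assumption inferred from the tabulated values, which come out very close to whole numbers with it; the source does not state how the pairs were counted. The helper name is hypothetical.

```python
N_RESIDUES = 1728          # from the statistics line
N_PAIRS = N_RESIDUES - 1   # ASSUMPTION: inferred from the values, not documented


def approx_count(permille, n_pairs=N_PAIRS):
    """Approximate number of occurrences behind a per mille frequency."""
    return permille * n_pairs / 1000.0


# Values copied from the table (per mille).
examples = {"AlaGly": 11.002, "AlaAla": 8.686, "CysCys": 0.579}
counts = {name: approx_count(value) for name, value in examples.items()}

for name, count in counts.items():
    print(name, round(count, 2))
```

Under this assumption the smallest non-zero value in the table, 0.579‰, corresponds to a single occurrence, which is consistent with the large relative errors reported for rare dipeptides.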

Note: The error has been estimated by bootstrapping (100 resamples) at the protein level.
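
A protein-level bootstrap of this kind can be sketched as follows. This is an illustration only: it assumes the reported error is the standard deviation of the dipeptide frequency across resampled protein sets, which the source does not specify, and the function names, the seed, and the toy sequences are hypothetical.

```python
import random
import statistics
from collections import Counter


def dipeptide_permille(proteins):
    """Per mille frequency of each overlapping residue pair (within proteins)."""
    counts = Counter()
    for seq in proteins:
        counts.update(a + b for a, b in zip(seq, seq[1:]))
    total = sum(counts.values())
    if total == 0:
        return {}
    return {dp: 1000.0 * n / total for dp, n in counts.items()}


def bootstrap_error(proteins, dipeptide, n_resamples=100, seed=0):
    """Standard deviation of a dipeptide frequency over protein-level resamples.

    Each resample draws len(proteins) whole proteins with replacement,
    so the unit of resampling is the protein, not the residue.
    """
    rng = random.Random(seed)
    estimates = []
    for _ in range(n_resamples):
        sample = rng.choices(proteins, k=len(proteins))
        estimates.append(dipeptide_permille(sample).get(dipeptide, 0.0))
    return statistics.pstdev(estimates)


# Hypothetical toy proteome, not the phage sequences.
toy = ["MKTAYIAKAA", "AAGAAS", "MSTLLKA", "GGAAK"]
point = dipeptide_permille(toy).get("AA", 0.0)
error = bootstrap_error(toy, "AA")
print(f"AA: {point:.3f} ± {error:.3f}")
```

Resampling whole proteins keeps the within-protein composition intact, so a dipeptide that is concentrated in one protein gets a larger error than one spread evenly over the proteome.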

The dipeptide statistics above (among other statistics for this proteome) can be downloaded from this CSV file
See this proteome in: uniprot_link
Proteome-pI is available under Creative Commons Attribution-NoDerivs license, for more details see here

Reference: Kozlowski LP. Proteome-pI 2.0: Proteome Isoelectric Point Database Update. Nucleic Acids Res. 2021, doi: 10.1093/nar/gkab944
Contact: Lukasz P. Kozlowski