Amino acid dipeptide frequency for Thermus phage phiOH3

Apart from single amino acid frequencies, one can also calculate the so-called amino acid dipeptide frequency, that is, how often each ordered pair of adjacent residues occurs.

There are 400 possible dipeptides (441 if Xaa is counted as an additional, 21st amino acid representing all non-standard or unknown amino acids). Thus, if dipeptides occurred randomly in proteins, each amino acid dipeptide would be present at a frequency of 0.25% (2.5‰). As this is not the case in nature, for better visibility all dipeptides that occur more often than expected are marked in red in the table, and those that are underrepresented are marked in blue.
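The counting described above can be sketched in a few lines of Python. This is a minimal illustration, not the code used by Proteome-pI: the function names, the toy sequences, and the choice to count overlapping pairs only within each protein (never across protein boundaries) are assumptions made for the example.

```python
from collections import Counter
from itertools import product

# The 20 standard amino acids (one-letter codes); anything else is mapped to "X" (Xaa).
STANDARD = "ACDEFGHIKLMNPQRSTVWY"
ALPHABET = STANDARD + "X"

# Uniform expectation over the 400 standard dipeptides: 0.25%, i.e. 2.5 per mille.
UNIFORM_EXPECTATION = 1000.0 / 400


def dipeptide_permille(proteins):
    """Return {dipeptide: frequency in per mille} over all 441 possible pairs."""
    counts = Counter()
    for seq in proteins:
        seq = "".join(aa if aa in STANDARD else "X" for aa in seq.upper())
        # Overlapping windows of length 2; pairs never span two proteins (assumption).
        counts.update(seq[i:i + 2] for i in range(len(seq) - 1))
    total = sum(counts.values())
    return {
        a + b: (1000.0 * counts[a + b] / total if total else 0.0)
        for a, b in product(ALPHABET, repeat=2)
    }


def classify(value, expected=UNIFORM_EXPECTATION):
    """Label a per mille value relative to the uniform expectation."""
    if value > expected:
        return "over"
    if value < expected:
        return "under"
    return "expected"


# Made-up sequences, for illustration only.
toy = ["MAALLV", "AAXL"]
freqs = dipeptide_permille(toy)
print(freqs["AA"], classify(freqs["AA"]))
```

With the values from the table, `classify(8.035)` (AlaAla) falls above the 2.5‰ uniform expectation and `classify(0.618)` (CysAla) falls below it.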

All values are presented in per mille (‰) and therefore need to be multiplied by 10^-3 to obtain fractions; for example, 8.035‰ corresponds to a fraction of 0.008035.

For more information, see the sequence space article on Wikipedia.

Ala Cys Asp Glu Phe Gly His Ile Lys Leu Met Asn Pro Gln Arg Ser Thr Val Trp Tyr Xaa
Ala
AlaAla: 8.035 ± 2.359
AlaCys: 0.0 ± 0.0
AlaAsp: 3.708 ± 1.776
AlaGlu: 4.944 ± 1.393
AlaPhe: 5.562 ± 1.507
AlaGly: 6.18 ± 3.693
AlaHis: 1.854 ± 0.797
AlaIle: 6.799 ± 2.12
AlaLys: 2.472 ± 1.336
AlaLeu: 10.507 ± 2.245
AlaMet: 0.0 ± 0.724
AlaAsn: 1.854 ± 0.769
AlaPro: 3.708 ± 1.082
AlaGln: 3.708 ± 0.854
AlaArg: 8.653 ± 3.644
AlaSer: 4.326 ± 1.69
AlaThr: 5.562 ± 0.992
AlaVal: 11.743 ± 2.624
AlaTrp: 4.326 ± 1.714
AlaTyr: 6.18 ± 1.95
AlaXaa: 0.0 ± 0.0
Cys
CysAla: 0.618 ± 0.455
CysCys: 0.0 ± 0.0
CysAsp: 0.618 ± 0.455
CysGlu: 0.0 ± 0.0
CysPhe: 0.618 ± 0.635
CysGly: 0.618 ± 0.635
CysHis: 0.0 ± 0.0
CysIle: 1.236 ± 0.909
CysLys: 0.0 ± 0.0
CysLeu: 0.0 ± 0.0
CysMet: 0.0 ± 0.0
CysAsn: 0.0 ± 0.0
CysPro: 0.618 ± 0.455
CysGln: 0.0 ± 0.0
CysArg: 0.0 ± 0.0
CysSer: 0.618 ± 0.455
CysThr: 0.0 ± 0.0
CysVal: 0.0 ± 0.0
CysTrp: 0.0 ± 0.0
CysTyr: 0.618 ± 0.635
CysXaa: 0.0 ± 0.0
Asp
AspAla: 8.653 ± 1.764
AspCys: 0.0 ± 0.0
AspAsp: 0.0 ± 0.0
AspGlu: 2.472 ± 1.27
AspPhe: 1.236 ± 0.492
AspGly: 7.417 ± 2.227
AspHis: 0.0 ± 0.0
AspIle: 1.236 ± 0.909
AspLys: 1.236 ± 0.492
AspLeu: 4.944 ± 1.392
AspMet: 0.618 ± 0.455
AspAsn: 0.618 ± 0.455
AspPro: 6.799 ± 2.764
AspGln: 1.236 ± 0.874
AspArg: 5.562 ± 2.138
AspSer: 2.472 ± 1.012
AspThr: 4.326 ± 1.156
AspVal: 4.944 ± 0.841
AspTrp: 2.472 ± 0.854
AspTyr: 1.854 ± 0.48
AspXaa: 0.0 ± 0.0
Glu
GluAla: 10.507 ± 2.819
GluCys: 0.0 ± 0.0
GluAsp: 0.618 ± 0.506
GluGlu: 4.944 ± 1.384
GluPhe: 0.618 ± 0.93
GluGly: 5.562 ± 3.012
GluHis: 1.236 ± 0.492
GluIle: 1.236 ± 0.635
GluLys: 1.236 ± 0.492
GluLeu: 4.326 ± 2.949
GluMet: 0.618 ± 0.506
GluAsn: 1.236 ± 0.635
GluPro: 2.472 ± 0.651
GluGln: 1.236 ± 0.909
GluArg: 2.472 ± 1.033
GluSer: 1.854 ± 1.204
GluThr: 0.0 ± 0.0
GluVal: 7.417 ± 2.167
GluTrp: 2.472 ± 0.651
GluTyr: 1.854 ± 1.203
GluXaa: 0.0 ± 0.0
Phe
PheAla: 3.09 ± 1.559
PheCys: 0.0 ± 0.0
PheAsp: 3.09 ± 2.566
PheGlu: 1.854 ± 1.344
PhePhe: 0.0 ± 0.0
PheGly: 6.18 ± 2.361
PheHis: 0.0 ± 0.0
PheIle: 1.854 ± 0.769
PheLys: 1.236 ± 0.874
PheLeu: 1.854 ± 1.276
PheMet: 1.236 ± 1.011
PheAsn: 0.618 ± 0.719
PhePro: 1.854 ± 0.48
PheGln: 0.0 ± 0.0
PheArg: 0.618 ± 0.455
PheSer: 1.854 ± 1.1
PheThr: 2.472 ± 0.984
PheVal: 2.472 ± 1.341
PheTrp: 3.708 ± 0.92
PheTyr: 2.472 ± 0.818
PheXaa: 0.0 ± 0.0
Gly
GlyAla: 4.944 ± 2.441
GlyCys: 0.0 ± 0.0
GlyAsp: 3.708 ± 1.871
GlyGlu: 2.472 ± 2.023
GlyPhe: 3.09 ± 1.074
GlyGly: 6.18 ± 2.435
GlyHis: 1.236 ± 0.78
GlyIle: 2.472 ± 1.012
GlyLys: 1.854 ± 0.807
GlyLeu: 11.125 ± 4.983
GlyMet: 2.472 ± 1.17
GlyAsn: 0.618 ± 0.506
GlyPro: 4.326 ± 2.24
GlyGln: 0.0 ± 0.0
GlyArg: 8.653 ± 1.368
GlySer: 7.417 ± 2.176
GlyThr: 4.326 ± 2.025
GlyVal: 9.889 ± 1.106
GlyTrp: 2.472 ± 1.375
GlyTyr: 1.236 ± 0.909
GlyXaa: 0.0 ± 0.0
His
HisAla: 0.0 ± 0.0
HisCys: 0.0 ± 0.0
HisAsp: 1.236 ± 0.868
HisGlu: 1.236 ± 1.042
HisPhe: 0.0 ± 0.0
HisGly: 1.854 ± 1.517
HisHis: 0.0 ± 0.0
HisIle: 0.618 ± 0.506
HisLys: 0.0 ± 0.0
HisLeu: 1.236 ± 0.749
HisMet: 0.618 ± 0.635
HisAsn: 0.0 ± 0.0
HisPro: 0.618 ± 0.455
HisGln: 0.0 ± 0.0
HisArg: 1.236 ± 1.27
HisSer: 0.0 ± 0.0
HisThr: 0.0 ± 0.0
HisVal: 0.618 ± 0.506
HisTrp: 1.236 ± 0.749
HisTyr: 0.618 ± 0.506
HisXaa: 0.0 ± 0.0
Ile
IleAla: 3.09 ± 1.411
IleCys: 0.618 ± 0.455
IleAsp: 2.472 ± 1.691
IleGlu: 0.618 ± 0.635
IlePhe: 3.09 ± 1.764
IleGly: 2.472 ± 2.345
IleHis: 0.0 ± 0.0
IleIle: 1.854 ± 1.395
IleLys: 0.0 ± 0.0
IleLeu: 3.09 ± 1.332
IleMet: 0.618 ± 0.635
IleAsn: 0.618 ± 0.455
IlePro: 3.708 ± 1.331
IleGln: 0.618 ± 0.455
IleArg: 6.799 ± 1.465
IleSer: 2.472 ± 1.559
IleThr: 1.236 ± 1.437
IleVal: 4.326 ± 2.12
IleTrp: 0.618 ± 0.506
IleTyr: 0.0 ± 0.0
IleXaa: 0.0 ± 0.0
Lys
LysAla: 3.09 ± 2.211
LysCys: 0.618 ± 0.455
LysAsp: 2.472 ± 1.033
LysGlu: 0.618 ± 0.77
LysPhe: 1.236 ± 0.979
LysGly: 3.708 ± 1.632
LysHis: 0.0 ± 0.0
LysIle: 0.0 ± 0.0
LysLys: 1.854 ± 0.983
LysLeu: 3.708 ± 1.082
LysMet: 0.618 ± 0.417
LysAsn: 0.0 ± 0.0
LysPro: 0.618 ± 0.506
LysGln: 1.236 ± 0.492
LysArg: 1.854 ± 1.204
LysSer: 2.472 ± 0.754
LysThr: 0.0 ± 0.0
LysVal: 4.944 ± 2.08
LysTrp: 0.618 ± 0.635
LysTyr: 0.618 ± 0.506
LysXaa: 0.0 ± 0.0
Leu
LeuAla: 11.743 ± 3.16
LeuCys: 0.618 ± 0.635
LeuAsp: 7.417 ± 1.203
LeuGlu: 2.472 ± 0.926
LeuPhe: 1.236 ± 0.671
LeuGly: 8.035 ± 2.228
LeuHis: 0.618 ± 0.947
LeuIle: 3.09 ± 2.05
LeuLys: 4.944 ± 1.292
LeuLeu: 14.833 ± 4.464
LeuMet: 2.472 ± 1.342
LeuAsn: 1.854 ± 0.48
LeuPro: 8.653 ± 1.112
LeuGln: 4.944 ± 0.685
LeuArg: 4.326 ± 3.001
LeuSer: 5.562 ± 1.394
LeuThr: 3.708 ± 1.61
LeuVal: 11.125 ± 3.403
LeuTrp: 3.09 ± 1.781
LeuTyr: 3.09 ± 1.259
LeuXaa: 0.0 ± 0.0
Met
MetAla: 1.236 ± 0.78
MetCys: 0.0 ± 0.0
MetAsp: 0.618 ± 0.635
MetGlu: 0.618 ± 0.93
MetPhe: 0.0 ± 0.0
MetGly: 1.236 ± 0.868
MetHis: 0.0 ± 0.0
MetIle: 1.854 ± 1.317
MetLys: 1.236 ± 0.874
MetLeu: 0.0 ± 0.0
MetMet: 1.236 ± 0.671
MetAsn: 0.618 ± 0.753
MetPro: 0.618 ± 0.506
MetGln: 1.236 ± 1.27
MetArg: 0.618 ± 0.506
MetSer: 1.236 ± 0.492
MetThr: 1.236 ± 0.829
MetVal: 1.236 ± 0.492
MetTrp: 0.618 ± 0.635
MetTyr: 0.618 ± 0.506
MetXaa: 0.0 ± 0.0
Asn
AsnAla: 0.0 ± 0.0
AsnCys: 0.0 ± 0.0
AsnAsp: 0.618 ± 0.455
AsnGlu: 0.0 ± 0.0
AsnPhe: 1.854 ± 0.48
AsnGly: 1.854 ± 1.009
AsnHis: 0.0 ± 0.0
AsnIle: 0.618 ± 0.455
AsnLys: 0.0 ± 0.0
AsnLeu: 1.854 ± 0.657
AsnMet: 0.0 ± 0.0
AsnAsn: 0.618 ± 0.455
AsnPro: 7.417 ± 3.109
AsnGln: 1.236 ± 0.749
AsnArg: 0.618 ± 0.455
AsnSer: 0.618 ± 0.635
AsnThr: 1.236 ± 0.874
AsnVal: 2.472 ± 1.195
AsnTrp: 0.618 ± 0.455
AsnTyr: 1.854 ± 0.91
AsnXaa: 0.0 ± 0.0
Pro
ProAla: 6.799 ± 1.445
ProCys: 0.0 ± 0.0
ProAsp: 8.035 ± 4.089
ProGlu: 7.417 ± 2.723
ProPhe: 1.854 ± 0.904
ProGly: 4.326 ± 0.845
ProHis: 1.854 ± 1.004
ProIle: 4.944 ± 1.015
ProLys: 0.618 ± 0.635
ProLeu: 3.708 ± 0.963
ProMet: 0.618 ± 0.506
ProAsn: 3.708 ± 1.614
ProPro: 7.417 ± 1.708
ProGln: 2.472 ± 1.819
ProArg: 1.236 ± 0.893
ProSer: 3.09 ± 1.015
ProThr: 1.854 ± 0.801
ProVal: 6.18 ± 1.413
ProTrp: 3.708 ± 0.959
ProTyr: 3.09 ± 1.843
ProXaa: 0.0 ± 0.0
Gln
GlnAla: 4.326 ± 0.671
GlnCys: 0.618 ± 0.455
GlnAsp: 1.236 ± 0.492
GlnGlu: 1.236 ± 1.27
GlnPhe: 1.236 ± 0.893
GlnGly: 1.854 ± 0.801
GlnHis: 0.618 ± 0.635
GlnIle: 0.618 ± 0.719
GlnLys: 1.236 ± 0.909
GlnLeu: 4.944 ± 2.413
GlnMet: 0.0 ± 0.0
GlnAsn: 1.854 ± 0.91
GlnPro: 1.236 ± 0.909
GlnGln: 1.854 ± 1.364
GlnArg: 2.472 ± 1.793
GlnSer: 2.472 ± 0.651
GlnThr: 0.618 ± 0.455
GlnVal: 3.708 ± 1.123
GlnTrp: 2.472 ± 1.497
GlnTyr: 1.236 ± 0.492
GlnXaa: 0.0 ± 0.0
Arg
ArgAla: 4.326 ± 2.177
ArgCys: 0.618 ± 0.455
ArgAsp: 4.944 ± 1.507
ArgGlu: 5.562 ± 3.379
ArgPhe: 4.326 ± 2.35
ArgGly: 5.562 ± 1.298
ArgHis: 0.618 ± 0.506
ArgIle: 1.236 ± 0.492
ArgLys: 4.944 ± 1.681
ArgLeu: 4.326 ± 1.16
ArgMet: 1.236 ± 0.843
ArgAsn: 2.472 ± 0.958
ArgPro: 5.562 ± 1.211
ArgGln: 3.708 ± 1.188
ArgArg: 4.944 ± 1.502
ArgSer: 2.472 ± 0.651
ArgThr: 1.854 ± 1.004
ArgVal: 6.799 ± 2.237
ArgTrp: 1.854 ± 1.364
ArgTyr: 3.09 ± 1.556
ArgXaa: 0.0 ± 0.0
Ser
SerAla: 8.035 ± 0.796
SerCys: 1.236 ± 1.27
SerAsp: 1.236 ± 0.492
SerGlu: 2.472 ± 0.754
SerPhe: 3.708 ± 0.963
SerGly: 3.708 ± 0.488
SerHis: 1.236 ± 0.635
SerIle: 0.618 ± 0.635
SerLys: 3.09 ± 1.631
SerLeu: 6.799 ± 1.055
SerMet: 0.618 ± 0.609
SerAsn: 1.236 ± 0.492
SerPro: 3.708 ± 1.808
SerGln: 1.236 ± 0.635
SerArg: 3.708 ± 1.667
SerSer: 1.854 ± 0.807
SerThr: 3.708 ± 2.198
SerVal: 3.09 ± 1.35
SerTrp: 1.854 ± 1.517
SerTyr: 3.708 ± 1.641
SerXaa: 0.0 ± 0.0
Thr
ThrAla: 3.09 ± 1.383
ThrCys: 1.236 ± 0.909
ThrAsp: 4.326 ± 1.675
ThrGlu: 1.854 ± 0.778
ThrPhe: 3.09 ± 1.839
ThrGly: 2.472 ± 1.359
ThrHis: 0.0 ± 0.0
ThrIle: 1.236 ± 1.361
ThrLys: 0.618 ± 0.77
ThrLeu: 3.708 ± 1.824
ThrMet: 0.618 ± 0.635
ThrAsn: 1.236 ± 0.909
ThrPro: 2.472 ± 0.926
ThrGln: 1.236 ± 0.635
ThrArg: 3.708 ± 1.878
ThrSer: 3.708 ± 1.432
ThrThr: 0.618 ± 0.455
ThrVal: 1.854 ± 1.336
ThrTrp: 0.618 ± 0.753
ThrTyr: 1.236 ± 0.635
ThrXaa: 0.0 ± 0.0
Val
ValAla: 8.653 ± 2.637
ValCys: 0.0 ± 0.0
ValAsp: 7.417 ± 1.649
ValGlu: 8.035 ± 1.952
ValPhe: 1.236 ± 1.308
ValGly: 4.944 ± 1.212
ValHis: 1.236 ± 0.868
ValIle: 6.18 ± 2.34
ValLys: 3.09 ± 0.804
ValLeu: 12.361 ± 3.216
ValMet: 0.0 ± 0.0
ValAsn: 2.472 ± 0.818
ValPro: 5.562 ± 1.388
ValGln: 7.417 ± 2.355
ValArg: 6.799 ± 1.78
ValSer: 6.799 ± 2.614
ValThr: 1.854 ± 0.778
ValVal: 6.799 ± 2.049
ValTrp: 1.236 ± 0.909
ValTyr: 3.708 ± 2.243
ValXaa: 0.0 ± 0.0
Trp
TrpAla: 3.09 ± 1.631
TrpCys: 0.0 ± 0.0
TrpAsp: 2.472 ± 0.754
TrpGlu: 1.236 ± 1.011
TrpPhe: 0.618 ± 0.753
TrpGly: 2.472 ± 0.651
TrpHis: 0.618 ± 0.455
TrpIle: 0.0 ± 0.0
TrpLys: 0.0 ± 0.0
TrpLeu: 7.417 ± 2.819
TrpMet: 1.236 ± 1.286
TrpAsn: 0.618 ± 0.719
TrpPro: 2.472 ± 1.819
TrpGln: 1.854 ± 0.48
TrpArg: 2.472 ± 1.46
TrpSer: 2.472 ± 1.012
TrpThr: 1.236 ± 0.909
TrpVal: 3.708 ± 1.625
TrpTrp: 0.0 ± 0.0
TrpTyr: 1.854 ± 0.801
TrpXaa: 0.0 ± 0.0
Tyr
TyrAla: 7.417 ± 2.216
TyrCys: 0.0 ± 0.0
TyrAsp: 1.854 ± 0.807
TyrGlu: 1.854 ± 1.517
TyrPhe: 1.854 ± 0.801
TyrGly: 1.854 ± 0.905
TyrHis: 0.0 ± 0.0
TyrIle: 0.618 ± 0.93
TyrLys: 0.618 ± 0.506
TyrLeu: 3.09 ± 0.922
TyrMet: 0.618 ± 0.506
TyrAsn: 1.236 ± 0.909
TyrPro: 3.09 ± 1.638
TyrGln: 0.618 ± 0.506
TyrArg: 3.708 ± 0.754
TyrSer: 3.09 ± 1.015
TyrThr: 3.09 ± 0.765
TyrVal: 2.472 ± 1.819
TyrTrp: 1.854 ± 1.204
TyrTyr: 0.618 ± 0.635
TyrXaa: 0.0 ± 0.0
Xaa
XaaAla: 0.0 ± 0.0
XaaCys: 0.0 ± 0.0
XaaAsp: 0.0 ± 0.0
XaaGlu: 0.0 ± 0.0
XaaPhe: 0.0 ± 0.0
XaaGly: 0.0 ± 0.0
XaaHis: 0.0 ± 0.0
XaaIle: 0.0 ± 0.0
XaaLys: 0.0 ± 0.0
XaaLeu: 0.0 ± 0.0
XaaMet: 0.0 ± 0.0
XaaAsn: 0.0 ± 0.0
XaaPro: 0.0 ± 0.0
XaaGln: 0.0 ± 0.0
XaaArg: 0.0 ± 0.0
XaaSer: 0.0 ± 0.0
XaaThr: 0.0 ± 0.0
XaaVal: 0.0 ± 0.0
XaaTrp: 0.0 ± 0.0
XaaTyr: 0.0 ± 0.0
XaaXaa: 0.0 ± 0.0
Statistics based on 8 proteins (1619 amino acids)
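
With a proteome this small, every table value corresponds to only a handful of observed dipeptides. As a rough sanity check, the sketch below converts per mille values back to approximate counts. The denominator of 1618 is an assumption inferred from the numbers (the smallest non-zero entry, 0.618‰, is about 1/1618); the source does not state how the total is defined.

```python
# Assumption: total number of counted dipeptides, inferred from 1 / 0.000618.
TOTAL_DIPEPTIDES = 1618


def approx_count(permille, total=TOTAL_DIPEPTIDES):
    """Convert a per mille frequency back to the nearest whole-number count."""
    return round(permille * 1e-3 * total)


# Values taken from the table above.
for name, value in [("CysAla", 0.618), ("AlaAla", 8.035), ("LeuLeu", 14.833)]:
    print(name, approx_count(value))
# prints:
# CysAla 1
# AlaAla 13
# LeuLeu 24
```

Under this assumption, the most frequent dipeptide in the table (LeuLeu) was observed only about 24 times, which is why the bootstrap errors are large relative to the values.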

Note: The error has been estimated by bootstrapping (100 replicates) at the protein level.
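
Protein-level bootstrapping can be sketched as follows: whole proteins are resampled with replacement, the dipeptide frequency is recomputed for each resample, and the spread of those estimates is reported as the error. This is only an illustration under stated assumptions; the function name, the use of the sample standard deviation as the error measure, and the toy sequences are hypothetical and not taken from Proteome-pI.

```python
import random
from collections import Counter
from statistics import stdev


def bootstrap_error(proteins, dipeptide, n_boot=100, seed=0):
    """Estimate the error (in per mille) of one dipeptide frequency.

    Proteins, not individual residues, are the resampling unit.
    """
    rng = random.Random(seed)
    # Count overlapping dipeptides once per protein.
    per_protein = [
        Counter(seq[i:i + 2] for i in range(len(seq) - 1)) for seq in proteins
    ]
    estimates = []
    for _ in range(n_boot):
        # Draw as many proteins as in the original set, with replacement.
        sample = rng.choices(per_protein, k=len(per_protein))
        total = sum(sum(c.values()) for c in sample)
        hits = sum(c[dipeptide] for c in sample)
        estimates.append(1000.0 * hits / total if total else 0.0)
    # Assumption: the error is the standard deviation across replicates.
    return stdev(estimates)


# Made-up sequences, for illustration only.
toy = ["AAAA", "LLLL", "AALL"]
print(bootstrap_error(toy, "AA"))
```

Because the resampling unit is the whole protein, a dipeptide concentrated in one or two proteins gets a large error even if its overall frequency is high.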

The dipeptide statistics above (among other statistics for this proteome) can be downloaded from this CSV file
See this proteome in: uniprot_link
Proteome-pI is available under a Creative Commons Attribution-NoDerivs license; for more details see here

Reference: Kozlowski LP. Proteome-pI 2.0: Proteome Isoelectric Point Database Update. Nucleic Acids Res. 2021, doi: 10.1093/nar/gkab944
Contact: Lukasz P. Kozlowski