Amino acid dipeptide frequency for Entamoeba histolytica

Apart from single amino acid frequencies, one can also calculate the so-called amino acid dipeptide frequency, i.e. how often each ordered pair of adjacent residues occurs.

There are 400 possible dipeptides (441 if Xaa is counted as an additional 21st amino acid representing all non-standard or unknown amino acids). Thus, if all 400 dipeptides occurred in proteins with equal probability, each would be present with a frequency of 0.25% (2.5‰). As this is not the case in nature, for better visibility all dipeptides that are more frequent than expected are marked in red, and those that are underrepresented are marked in blue in the table.
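The counting described above can be sketched in a few lines of Python. This is a minimal illustration, not the Proteome-pI implementation: the function name `dipeptide_permille`, the use of one-letter codes, and the mapping of every non-standard letter to `X` (Xaa) are assumptions made for the example.

```python
from collections import Counter

STANDARD = set("ACDEFGHIKLMNPQRSTVWY")  # the 20 standard amino acids

def dipeptide_permille(sequences):
    """Return {dipeptide: frequency in per mille} over overlapping residue pairs.

    Assumption for this sketch: any letter outside the 20 standard
    amino acids is mapped to 'X' (Xaa).
    """
    counts = Counter()
    for seq in sequences:
        cleaned = "".join(c if c in STANDARD else "X" for c in seq.upper())
        # Overlapping pairs: positions (0,1), (1,2), ...
        counts.update(cleaned[i:i + 2] for i in range(len(cleaned) - 1))
    total = sum(counts.values())
    if total == 0:
        return {}
    return {pair: 1000.0 * n / total for pair, n in counts.items()}

# Toy input, not real E. histolytica proteins.
freqs = dipeptide_permille(["MKKLE", "MKUE"])
uniform_expectation = 1000.0 / 400  # per mille if all 400 dipeptides were equally likely
```

Pairs are counted within each protein only, so no dipeptide spans the boundary between two sequences.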

All values are given in per mille (‰); multiply them by 10^-3 to obtain fractions.
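As a worked example of the unit conversion and of the comparison with the uniform expectation, the sketch below uses three values taken from the table. The helper names `permille_to_fraction` and `classify` are hypothetical, and whether the page's red/blue marking uses exactly this 2.5‰ threshold is an assumption.

```python
EXPECTED_PERMILLE = 1000.0 / 400  # 2.5 per mille, i.e. 0.25 %

def permille_to_fraction(value):
    """Convert a per-mille value to a plain fraction."""
    return value * 1e-3

def classify(value_permille, expected=EXPECTED_PERMILLE):
    """Label a dipeptide relative to the uniform expectation (assumed threshold)."""
    if value_permille > expected:
        return "over"
    if value_permille < expected:
        return "under"
    return "as expected"

# Values copied from the table (per mille).
table_values = {"AlaAla": 1.868, "GluGlu": 9.126, "TrpTrp": 0.115}
labels = {name: classify(v) for name, v in table_values.items()}
```

With these inputs, GluGlu falls above the 2.5‰ baseline, while AlaAla and TrpTrp fall below it.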

For more information, see the sequence space article on Wikipedia.

Ala Cys Asp Glu Phe Gly His Ile Lys Leu Met Asn Pro Gln Arg Ser Thr Val Trp Tyr Xaa
Ala
AlaAla: 1.868 ± 0.038
AlaCys: 0.683 ± 0.016
AlaAsp: 1.519 ± 0.023
AlaGlu: 2.333 ± 0.037
AlaPhe: 1.853 ± 0.027
AlaGly: 1.539 ± 0.028
AlaHis: 0.649 ± 0.013
AlaIle: 3.276 ± 0.033
AlaLys: 2.906 ± 0.049
AlaLeu: 3.808 ± 0.039
AlaMet: 0.952 ± 0.016
AlaAsn: 1.732 ± 0.028
AlaPro: 1.249 ± 0.025
AlaGln: 1.514 ± 0.026
AlaArg: 1.262 ± 0.028
AlaSer: 2.449 ± 0.029
AlaThr: 1.955 ± 0.03
AlaVal: 2.257 ± 0.036
AlaTrp: 0.254 ± 0.009
AlaTyr: 1.182 ± 0.021
AlaXaa: 0.001 ± 0.001
Cys
CysAla: 0.701 ± 0.018
CysCys: 0.659 ± 0.017
CysAsp: 1.236 ± 0.023
CysGlu: 1.369 ± 0.024
CysPhe: 1.367 ± 0.032
CysGly: 1.356 ± 0.022
CysHis: 0.451 ± 0.011
CysIle: 2.03 ± 0.032
CysLys: 1.622 ± 0.029
CysLeu: 1.951 ± 0.03
CysMet: 0.409 ± 0.01
CysAsn: 1.293 ± 0.024
CysPro: 0.799 ± 0.019
CysGln: 0.816 ± 0.018
CysArg: 0.591 ± 0.013
CysSer: 1.945 ± 0.037
CysThr: 1.126 ± 0.028
CysVal: 1.396 ± 0.025
CysTrp: 0.212 ± 0.008
CysTyr: 1.025 ± 0.018
CysXaa: 0.001 ± 0.0
Asp
AspAla: 1.773 ± 0.026
AspCys: 1.111 ± 0.018
AspAsp: 3.303 ± 0.078
AspGlu: 4.454 ± 0.057
AspPhe: 2.498 ± 0.028
AspGly: 2.754 ± 0.035
AspHis: 0.706 ± 0.017
AspIle: 5.193 ± 0.042
AspLys: 4.065 ± 0.038
AspLeu: 4.139 ± 0.041
AspMet: 1.102 ± 0.02
AspAsn: 3.167 ± 0.034
AspPro: 1.588 ± 0.027
AspGln: 1.559 ± 0.02
AspArg: 1.372 ± 0.029
AspSer: 3.583 ± 0.04
AspThr: 2.332 ± 0.027
AspVal: 3.071 ± 0.036
AspTrp: 0.459 ± 0.012
AspTyr: 2.107 ± 0.028
AspXaa: 0.001 ± 0.0
Glu
GluAla: 2.628 ± 0.04
GluCys: 1.737 ± 0.035
GluAsp: 3.979 ± 0.065
GluGlu: 9.126 ± 0.136
GluPhe: 3.129 ± 0.031
GluGly: 3.17 ± 0.039
GluHis: 1.278 ± 0.022
GluIle: 7.316 ± 0.069
GluLys: 7.619 ± 0.086
GluLeu: 6.774 ± 0.059
GluMet: 2.226 ± 0.024
GluAsn: 5.029 ± 0.05
GluPro: 1.764 ± 0.028
GluGln: 3.334 ± 0.042
GluArg: 3.128 ± 0.044
GluSer: 4.475 ± 0.046
GluThr: 3.944 ± 0.038
GluVal: 4.151 ± 0.035
GluTrp: 0.705 ± 0.016
GluTyr: 2.983 ± 0.034
GluXaa: 0.002 ± 0.001
Phe
PheAla: 1.747 ± 0.026
PheCys: 1.068 ± 0.018
PheAsp: 3.016 ± 0.033
PheGlu: 3.25 ± 0.034
PhePhe: 2.303 ± 0.036
PheGly: 2.607 ± 0.031
PheHis: 0.933 ± 0.02
PheIle: 5.072 ± 0.05
PheLys: 3.567 ± 0.03
PheLeu: 3.785 ± 0.04
PheMet: 1.069 ± 0.017
PheAsn: 3.507 ± 0.035
PhePro: 1.672 ± 0.024
PheGln: 1.548 ± 0.025
PheArg: 1.315 ± 0.018
PheSer: 3.636 ± 0.041
PheThr: 2.613 ± 0.036
PheVal: 2.968 ± 0.031
PheTrp: 0.354 ± 0.01
PheTyr: 1.94 ± 0.028
PheXaa: 0.001 ± 0.001
Gly
GlyAla: 1.636 ± 0.025
GlyCys: 1.297 ± 0.03
GlyAsp: 2.335 ± 0.031
GlyGlu: 3.072 ± 0.034
GlyPhe: 2.227 ± 0.034
GlyGly: 2.305 ± 0.037
GlyHis: 0.726 ± 0.017
GlyIle: 4.565 ± 0.042
GlyLys: 4.089 ± 0.042
GlyLeu: 3.374 ± 0.039
GlyMet: 1.231 ± 0.027
GlyAsn: 2.93 ± 0.031
GlyPro: 0.768 ± 0.017
GlyGln: 1.236 ± 0.025
GlyArg: 1.517 ± 0.028
GlySer: 2.957 ± 0.037
GlyThr: 2.518 ± 0.034
GlyVal: 2.805 ± 0.034
GlyTrp: 0.431 ± 0.012
GlyTyr: 2.171 ± 0.033
GlyXaa: 0.003 ± 0.001
His
HisAla: 0.57 ± 0.014
HisCys: 0.653 ± 0.018
HisAsp: 0.773 ± 0.017
HisGlu: 1.021 ± 0.018
HisPhe: 1.165 ± 0.021
HisGly: 0.753 ± 0.016
HisHis: 0.509 ± 0.013
HisIle: 1.523 ± 0.023
HisLys: 1.23 ± 0.02
HisLeu: 1.932 ± 0.026
HisMet: 0.37 ± 0.012
HisAsn: 0.943 ± 0.016
HisPro: 0.987 ± 0.021
HisGln: 0.901 ± 0.015
HisArg: 0.671 ± 0.015
HisSer: 1.68 ± 0.025
HisThr: 0.848 ± 0.017
HisVal: 0.852 ± 0.018
HisTrp: 0.182 ± 0.008
HisTyr: 0.854 ± 0.016
HisXaa: 0.0 ± 0.0
Ile
IleAla: 3.339 ± 0.039
IleCys: 1.913 ± 0.029
IleAsp: 5.237 ± 0.041
IleGlu: 7.537 ± 0.067
IlePhe: 3.798 ± 0.045
IleGly: 4.259 ± 0.042
IleHis: 1.896 ± 0.026
IleIle: 8.66 ± 0.086
IleLys: 8.171 ± 0.073
IleLeu: 7.24 ± 0.06
IleMet: 1.65 ± 0.022
IleAsn: 6.724 ± 0.06
IlePro: 4.228 ± 0.041
IleGln: 4.349 ± 0.044
IleArg: 2.903 ± 0.032
IleSer: 6.743 ± 0.056
IleThr: 5.371 ± 0.053
IleVal: 5.086 ± 0.045
IleTrp: 0.589 ± 0.013
IleTyr: 3.035 ± 0.035
IleXaa: 0.004 ± 0.001
Lys
LysAla: 3.185 ± 0.047
LysCys: 1.827 ± 0.035
LysAsp: 4.268 ± 0.039
LysGlu: 10.075 ± 0.105
LysPhe: 2.863 ± 0.028
LysGly: 3.747 ± 0.046
LysHis: 1.401 ± 0.022
LysIle: 7.069 ± 0.054
LysLys: 9.608 ± 0.123
LysLeu: 6.524 ± 0.057
LysMet: 2.262 ± 0.027
LysAsn: 5.012 ± 0.04
LysPro: 2.728 ± 0.036
LysGln: 4.2 ± 0.05
LysArg: 3.953 ± 0.045
LysSer: 5.192 ± 0.044
LysThr: 5.186 ± 0.047
LysVal: 4.452 ± 0.042
LysTrp: 0.68 ± 0.014
LysTyr: 3.679 ± 0.033
LysXaa: 0.003 ± 0.001
Leu
LeuAla: 3.009 ± 0.036
LeuCys: 1.888 ± 0.025
LeuAsp: 3.91 ± 0.039
LeuGlu: 5.442 ± 0.048
LeuPhe: 5.023 ± 0.047
LeuGly: 3.163 ± 0.035
LeuHis: 1.874 ± 0.027
LeuIle: 7.798 ± 0.071
LeuLys: 7.937 ± 0.051
LeuLeu: 8.894 ± 0.081
LeuMet: 2.141 ± 0.028
LeuAsn: 6.055 ± 0.052
LeuPro: 3.739 ± 0.041
LeuGln: 3.812 ± 0.041
LeuArg: 3.093 ± 0.034
LeuSer: 6.885 ± 0.057
LeuThr: 5.18 ± 0.05
LeuVal: 4.491 ± 0.04
LeuTrp: 0.639 ± 0.015
LeuTyr: 3.258 ± 0.035
LeuXaa: 0.002 ± 0.001
Met
MetAla: 0.938 ± 0.018
MetCys: 0.387 ± 0.011
MetAsp: 1.028 ± 0.019
MetGlu: 1.732 ± 0.022
MetPhe: 1.057 ± 0.02
MetGly: 0.909 ± 0.018
MetHis: 0.273 ± 0.009
MetIle: 2.033 ± 0.027
MetLys: 2.916 ± 0.037
MetLeu: 1.747 ± 0.025
MetMet: 0.662 ± 0.018
MetAsn: 1.907 ± 0.025
MetPro: 0.618 ± 0.022
MetGln: 0.696 ± 0.017
MetArg: 0.811 ± 0.016
MetSer: 1.902 ± 0.024
MetThr: 1.424 ± 0.021
MetVal: 1.188 ± 0.02
MetTrp: 0.169 ± 0.007
MetTyr: 0.811 ± 0.016
MetXaa: 0.0 ± 0.0
Asn
AsnAla: 2.232 ± 0.031
AsnCys: 1.447 ± 0.031
AsnAsp: 3.792 ± 0.035
AsnGlu: 6.547 ± 0.058
AsnPhe: 2.252 ± 0.027
AsnGly: 3.64 ± 0.041
AsnHis: 1.109 ± 0.018
AsnIle: 5.751 ± 0.054
AsnLys: 6.079 ± 0.05
AsnLeu: 4.397 ± 0.043
AsnMet: 1.293 ± 0.02
AsnAsn: 4.786 ± 0.05
AsnPro: 2.289 ± 0.028
AsnGln: 2.966 ± 0.033
AsnArg: 1.726 ± 0.023
AsnSer: 4.672 ± 0.045
AsnThr: 3.745 ± 0.039
AsnVal: 3.368 ± 0.031
AsnTrp: 0.485 ± 0.013
AsnTyr: 2.484 ± 0.03
AsnXaa: 0.001 ± 0.001
Pro
ProAla: 1.048 ± 0.025
ProCys: 0.527 ± 0.014
ProAsp: 1.447 ± 0.024
ProGlu: 2.389 ± 0.026
ProPhe: 2.062 ± 0.027
ProGly: 1.079 ± 0.028
ProHis: 0.716 ± 0.016
ProIle: 3.355 ± 0.035
ProLys: 2.943 ± 0.034
ProLeu: 3.526 ± 0.04
ProMet: 0.735 ± 0.019
ProAsn: 2.282 ± 0.032
ProPro: 1.757 ± 0.062
ProGln: 1.761 ± 0.033
ProArg: 1.178 ± 0.026
ProSer: 3.203 ± 0.039
ProThr: 2.501 ± 0.03
ProVal: 1.801 ± 0.026
ProTrp: 0.234 ± 0.008
ProTyr: 1.318 ± 0.024
ProXaa: 0.001 ± 0.0
Gln
GlnAla: 1.277 ± 0.026
GlnCys: 0.939 ± 0.022
GlnAsp: 1.339 ± 0.02
GlnGlu: 3.084 ± 0.048
GlnPhe: 2.117 ± 0.029
GlnGly: 1.27 ± 0.022
GlnHis: 0.838 ± 0.015
GlnIle: 3.771 ± 0.035
GlnLys: 3.641 ± 0.04
GlnLeu: 4.278 ± 0.044
GlnMet: 1.19 ± 0.021
GlnAsn: 2.771 ± 0.032
GlnPro: 1.878 ± 0.041
GlnGln: 2.84 ± 0.054
GlnArg: 1.749 ± 0.024
GlnSer: 2.903 ± 0.035
GlnThr: 2.612 ± 0.032
GlnVal: 1.866 ± 0.024
GlnTrp: 0.408 ± 0.01
GlnTyr: 1.721 ± 0.024
GlnXaa: 0.0 ± 0.0
Arg
ArgAla: 1.268 ± 0.024
ArgCys: 0.787 ± 0.018
ArgAsp: 1.597 ± 0.025
ArgGlu: 2.381 ± 0.028
ArgPhe: 1.559 ± 0.022
ArgGly: 1.546 ± 0.032
ArgHis: 0.585 ± 0.015
ArgIle: 3.229 ± 0.037
ArgLys: 3.452 ± 0.047
ArgLeu: 2.995 ± 0.03
ArgMet: 0.98 ± 0.018
ArgAsn: 2.145 ± 0.025
ArgPro: 0.925 ± 0.02
ArgGln: 1.217 ± 0.028
ArgArg: 1.882 ± 0.036
ArgSer: 2.197 ± 0.03
ArgThr: 1.735 ± 0.027
ArgVal: 1.931 ± 0.025
ArgTrp: 0.295 ± 0.01
ArgTyr: 1.453 ± 0.02
ArgXaa: 0.001 ± 0.0
Ser
SerAla: 2.303 ± 0.031
SerCys: 1.574 ± 0.035
SerAsp: 3.44 ± 0.043
SerGlu: 4.293 ± 0.039
SerPhe: 4.336 ± 0.043
SerGly: 2.922 ± 0.032
SerHis: 1.436 ± 0.023
SerIle: 7.287 ± 0.058
SerLys: 5.686 ± 0.044
SerLeu: 7.855 ± 0.066
SerMet: 1.493 ± 0.021
SerAsn: 4.674 ± 0.044
SerPro: 2.63 ± 0.041
SerGln: 3.177 ± 0.031
SerArg: 2.111 ± 0.029
SerSer: 7.026 ± 0.096
SerThr: 4.425 ± 0.041
SerVal: 4.093 ± 0.04
SerTrp: 0.514 ± 0.013
SerTyr: 2.732 ± 0.033
SerXaa: 0.001 ± 0.001
Thr
ThrAla: 1.956 ± 0.028
ThrCys: 1.136 ± 0.028
ThrAsp: 2.455 ± 0.029
ThrGlu: 3.746 ± 0.037
ThrPhe: 2.803 ± 0.034
ThrGly: 2.239 ± 0.035
ThrHis: 1.056 ± 0.017
ThrIle: 5.461 ± 0.044
ThrLys: 4.677 ± 0.04
ThrLeu: 5.414 ± 0.046
ThrMet: 1.163 ± 0.023
ThrAsn: 3.703 ± 0.043
ThrPro: 2.621 ± 0.035
ThrGln: 2.564 ± 0.035
ThrArg: 1.681 ± 0.025
ThrSer: 4.628 ± 0.05
ThrThr: 4.11 ± 0.052
ThrVal: 2.965 ± 0.033
ThrTrp: 0.352 ± 0.009
ThrTyr: 1.88 ± 0.028
ThrXaa: 0.002 ± 0.001
Val
ValAla: 2.221 ± 0.032
ValCys: 1.272 ± 0.023
ValAsp: 3.056 ± 0.03
ValGlu: 3.719 ± 0.044
ValPhe: 2.9 ± 0.032
ValGly: 2.601 ± 0.033
ValHis: 0.996 ± 0.02
ValIle: 5.388 ± 0.048
ValLys: 4.13 ± 0.041
ValLeu: 5.088 ± 0.042
ValMet: 1.33 ± 0.02
ValAsn: 3.324 ± 0.034
ValPro: 1.946 ± 0.028
ValGln: 2.045 ± 0.028
ValArg: 1.681 ± 0.024
ValSer: 4.193 ± 0.041
ValThr: 2.826 ± 0.032
ValVal: 3.646 ± 0.037
ValTrp: 0.424 ± 0.011
ValTyr: 2.013 ± 0.029
ValXaa: 0.002 ± 0.001
Trp
TrpAla: 0.246 ± 0.01
TrpCys: 0.21 ± 0.009
TrpAsp: 0.49 ± 0.013
TrpGlu: 0.495 ± 0.014
TrpPhe: 0.396 ± 0.012
TrpGly: 0.395 ± 0.014
TrpHis: 0.104 ± 0.006
TrpIle: 0.702 ± 0.014
TrpLys: 0.879 ± 0.017
TrpLeu: 0.604 ± 0.014
TrpMet: 0.233 ± 0.009
TrpAsn: 0.615 ± 0.016
TrpPro: 0.146 ± 0.007
TrpGln: 0.194 ± 0.008
TrpArg: 0.316 ± 0.009
TrpSer: 0.496 ± 0.015
TrpThr: 0.374 ± 0.012
TrpVal: 0.426 ± 0.011
TrpTrp: 0.115 ± 0.006
TrpTyr: 0.403 ± 0.012
TrpXaa: 0.0 ± 0.0
Tyr
TyrAla: 1.295 ± 0.02
TyrCys: 1.238 ± 0.02
TyrAsp: 2.117 ± 0.026
TyrGlu: 2.432 ± 0.028
TyrPhe: 2.263 ± 0.029
TyrGly: 1.937 ± 0.025
TyrHis: 0.88 ± 0.017
TyrIle: 3.247 ± 0.036
TyrLys: 2.691 ± 0.028
TyrLeu: 3.909 ± 0.038
TyrMet: 0.714 ± 0.013
TyrAsn: 2.462 ± 0.026
TyrPro: 1.527 ± 0.026
TyrGln: 1.763 ± 0.026
TyrArg: 1.219 ± 0.02
TyrSer: 3.243 ± 0.036
TyrThr: 1.786 ± 0.026
TyrVal: 1.987 ± 0.026
TyrTrp: 0.341 ± 0.009
TyrTyr: 2.076 ± 0.031
TyrXaa: 0.001 ± 0.0
Xaa
XaaAla: 0.0 ± 0.0
XaaCys: 0.001 ± 0.0
XaaAsp: 0.001 ± 0.001
XaaGlu: 0.001 ± 0.0
XaaPhe: 0.002 ± 0.001
XaaGly: 0.001 ± 0.0
XaaHis: 0.0 ± 0.0
XaaIle: 0.003 ± 0.001
XaaLys: 0.003 ± 0.001
XaaLeu: 0.003 ± 0.001
XaaMet: 0.002 ± 0.001
XaaAsn: 0.001 ± 0.001
XaaPro: 0.0 ± 0.0
XaaGln: 0.001 ± 0.001
XaaArg: 0.0 ± 0.0
XaaSer: 0.003 ± 0.001
XaaThr: 0.001 ± 0.001
XaaVal: 0.001 ± 0.001
XaaTrp: 0.0 ± 0.0
XaaTyr: 0.003 ± 0.001
XaaXaa: 0.906 ± 0.099
Statistics based on 7959 proteins (3380165 amino acids)

Note: The error was estimated by bootstrapping (100 replicates) at the protein level.
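Bootstrapping at the protein level can be sketched as follows: whole proteins are resampled with replacement, the dipeptide frequency is recomputed for each replicate, and the spread across replicates is taken as the error. This is a minimal sketch under stated assumptions, not the database's actual code: the function names, the fixed seed, and the use of the standard deviation as the error measure are hypothetical choices.

```python
import random
import statistics
from collections import Counter

def pair_permille(sequences, pair):
    """Frequency (per mille) of one dipeptide over overlapping pairs."""
    counts = Counter()
    for seq in sequences:
        counts.update(seq[i:i + 2] for i in range(len(seq) - 1))
    total = sum(counts.values())
    return 1000.0 * counts[pair] / total if total else 0.0

def bootstrap_error(sequences, pair, replicates=100, seed=0):
    """Standard deviation of the frequency across bootstrap replicates.

    Each replicate draws len(sequences) whole proteins with replacement.
    Using the standard deviation as the reported error is an assumption.
    """
    rng = random.Random(seed)
    estimates = []
    for _ in range(replicates):
        sample = rng.choices(sequences, k=len(sequences))
        estimates.append(pair_permille(sample, pair))
    return statistics.stdev(estimates)

# Toy input, not real proteome data.
toy = ["MKKLEE", "MEEKLA", "MAKKEE", "MLEEKK"]
err = bootstrap_error(toy, "EE")
```

Resampling whole proteins rather than individual residues keeps each sequence intact, so the estimate reflects variation between proteins.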

The dipeptide statistics above (along with other statistics for this proteome) can be downloaded from this CSV file
See this proteome in: uniprot_link
Proteome-pI is available under a Creative Commons Attribution-NoDerivs license; for more details see here

Reference: Kozlowski LP. Proteome-pI 2.0: Proteome Isoelectric Point Database Update. Nucleic Acids Res. 2021, doi: 10.1093/nar/gkab944
Contact: Lukasz P. Kozlowski