Amino acid dipeptide frequency for Clostridium sp. CAG:226

Apart from single amino acid frequencies, one can also calculate the so-called amino acid dipeptide frequency, i.e. how often each ordered pair of adjacent amino acids occurs.
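
As a rough illustration of how such dipeptide frequencies could be computed, here is a minimal Python sketch. It is not the code used by Proteome-pI. The function name `dipeptide_permille` is hypothetical, and the sketch assumes that dipeptides are counted as overlapping windows, that pairs never span two proteins, and that any letter outside the 20 standard amino acids is treated as X (Xaa).

```python
from collections import Counter

# One-letter codes of the 20 standard amino acids.
STANDARD = set("ACDEFGHIKLMNPQRSTVWY")


def dipeptide_permille(sequences):
    """Return dipeptide frequencies in per mille for a list of protein sequences.

    Assumptions (not confirmed by the source): overlapping windows,
    no pairs across protein boundaries, non-standard letters mapped to 'X'.
    """
    counts = Counter()
    for seq in sequences:
        # Map every non-standard residue to 'X' (Xaa).
        seq = "".join(c if c in STANDARD else "X" for c in seq.upper())
        # Count every overlapping pair of adjacent residues.
        counts.update(seq[i:i + 2] for i in range(len(seq) - 1))
    total = sum(counts.values())
    if total == 0:
        return {}
    return {pair: 1000.0 * n / total for pair, n in counts.items()}


# Toy input: "MAAK" gives MA, AA, AK; "AAB" becomes "AAX" and gives AA, AX.
freqs = dipeptide_permille(["MAAK", "AAB"])
```

In this toy example five dipeptides are counted in total, two of which are AA, so `freqs["AA"]` is 400‰.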

There are 400 possible dipeptides (441 if Xaa is counted as an additional 21st amino acid representing all non-standard or unknown amino acids). Thus, if all dipeptides occurred randomly with equal probability, each of the 400 dipeptides would be present at a frequency of 0.25% (2.5‰). As this is not the case in nature, for better visibility the dipeptides that occur more often than expected are marked in red, and those that are underrepresented are marked in blue in the table.
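
The expectation above is simple arithmetic, and it can be sketched in Python as follows. The helper names `expected_permille` and `classify` are hypothetical, and the sketch assumes that the uniform expectation is the threshold for the red and blue marking; the exact rule used for the colouring is not stated here.

```python
def expected_permille(n_residue_types=20):
    """Uniform expectation per dipeptide, in per mille.

    With 20 residue types there are 20 * 20 = 400 dipeptides,
    so each is expected at 1000 / 400 = 2.5 per mille (0.25%).
    """
    return 1000.0 / (n_residue_types ** 2)


def classify(observed_permille, n_residue_types=20):
    """Label a dipeptide as over- or underrepresented.

    Assumption: the threshold is the uniform expectation.
    """
    expected = expected_permille(n_residue_types)
    if observed_permille > expected:
        return "over"
    if observed_permille < expected:
        return "under"
    return "as expected"


# Values taken from the table: AlaAla = 13.356, CysCys = 0.409 (per mille).
ala_ala = classify(13.356)
cys_cys = classify(0.409)
```

Under this assumption AlaAla (13.356‰) is overrepresented and CysCys (0.409‰) is underrepresented. Passing `n_residue_types=21` gives the expectation for the 441-dipeptide case that includes Xaa.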

All values are presented in per mille (‰), so they need to be multiplied by 10^-3 to obtain fractions (for example, 13.356‰ = 0.013356 = 1.3356%).

For more information see sequence space article on Wikipedia.

Ala Cys Asp Glu Phe Gly His Ile Lys Leu Met Asn Pro Gln Arg Ser Thr Val Trp Tyr Xaa
Ala
13.356AlaAla: 13.356 ± 0.227
1.486AlaCys: 1.486 ± 0.053
6.255AlaAsp: 6.255 ± 0.1
7.497AlaGlu: 7.497 ± 0.144
3.579AlaPhe: 3.579 ± 0.092
7.314AlaGly: 7.314 ± 0.122
1.722AlaHis: 1.722 ± 0.056
6.423AlaIle: 6.423 ± 0.107
5.795AlaLys: 5.795 ± 0.103
10.282AlaLeu: 10.282 ± 0.143
3.526AlaMet: 3.526 ± 0.072
3.435AlaAsn: 3.435 ± 0.077
3.283AlaPro: 3.283 ± 0.066
3.051AlaGln: 3.051 ± 0.073
5.101AlaArg: 5.101 ± 0.098
4.667AlaSer: 4.667 ± 0.084
3.786AlaThr: 3.786 ± 0.079
7.551AlaVal: 7.551 ± 0.122
0.762AlaTrp: 0.762 ± 0.039
3.442AlaTyr: 3.442 ± 0.085
0.0AlaXaa: 0.0 ± 0.0
Cys
1.982CysAla: 1.982 ± 0.059
0.409CysCys: 0.409 ± 0.023
0.892CysAsp: 0.892 ± 0.038
0.915CysGlu: 0.915 ± 0.041
0.561CysPhe: 0.561 ± 0.031
1.822CysGly: 1.822 ± 0.059
0.317CysHis: 0.317 ± 0.024
1.229CysIle: 1.229 ± 0.042
0.676CysLys: 0.676 ± 0.033
1.107CysLeu: 1.107 ± 0.042
0.562CysMet: 0.562 ± 0.043
0.531CysAsn: 0.531 ± 0.028
0.767CysPro: 0.767 ± 0.039
0.341CysGln: 0.341 ± 0.024
0.919CysArg: 0.919 ± 0.041
0.902CysSer: 0.902 ± 0.044
0.981CysThr: 0.981 ± 0.044
1.138CysVal: 1.138 ± 0.04
0.091CysTrp: 0.091 ± 0.012
0.584CysTyr: 0.584 ± 0.034
0.0CysXaa: 0.0 ± 0.0
Asp
6.146AspAla: 6.146 ± 0.117
0.892AspCys: 0.892 ± 0.044
3.104AspAsp: 3.104 ± 0.087
4.811AspGlu: 4.811 ± 0.096
2.159AspPhe: 2.159 ± 0.059
4.85AspGly: 4.85 ± 0.099
0.731AspHis: 0.731 ± 0.034
4.41AspIle: 4.41 ± 0.094
3.348AspLys: 3.348 ± 0.07
3.348AspLeu: 3.348 ± 0.08
2.398AspMet: 2.398 ± 0.057
1.936AspAsn: 1.936 ± 0.058
2.127AspPro: 2.127 ± 0.059
0.77AspGln: 0.77 ± 0.04
2.569AspArg: 2.569 ± 0.071
2.716AspSer: 2.716 ± 0.065
3.165AspThr: 3.165 ± 0.08
3.818AspVal: 3.818 ± 0.083
0.531AspTrp: 0.531 ± 0.031
2.429AspTyr: 2.429 ± 0.067
0.0AspXaa: 0.0 ± 0.0
Glu
6.169GluAla: 6.169 ± 0.131
0.876GluCys: 0.876 ± 0.038
3.194GluAsp: 3.194 ± 0.08
5.025GluGlu: 5.025 ± 0.119
2.202GluPhe: 2.202 ± 0.064
4.616GluGly: 4.616 ± 0.105
1.488GluHis: 1.488 ± 0.05
4.324GluIle: 4.324 ± 0.096
4.593GluLys: 4.593 ± 0.092
7.129GluLeu: 7.129 ± 0.131
2.352GluMet: 2.352 ± 0.061
3.389GluAsn: 3.389 ± 0.071
2.424GluPro: 2.424 ± 0.073
2.632GluGln: 2.632 ± 0.067
4.35GluArg: 4.35 ± 0.096
3.232GluSer: 3.232 ± 0.064
3.354GluThr: 3.354 ± 0.083
3.513GluVal: 3.513 ± 0.097
0.551GluTrp: 0.551 ± 0.027
2.95GluTyr: 2.95 ± 0.066
0.0GluXaa: 0.0 ± 0.0
Phe
3.955PheAla: 3.955 ± 0.091
0.64PheCys: 0.64 ± 0.034
2.441PheAsp: 2.441 ± 0.061
2.226PheGlu: 2.226 ± 0.062
1.527PhePhe: 1.527 ± 0.061
3.241PheGly: 3.241 ± 0.082
0.567PheHis: 0.567 ± 0.034
2.645PheIle: 2.645 ± 0.074
1.601PheLys: 1.601 ± 0.042
2.805PheLeu: 2.805 ± 0.076
1.151PheMet: 1.151 ± 0.046
1.512PheAsn: 1.512 ± 0.053
1.192PhePro: 1.192 ± 0.048
0.655PheGln: 0.655 ± 0.039
1.661PheArg: 1.661 ± 0.058
2.574PheSer: 2.574 ± 0.074
2.347PheThr: 2.347 ± 0.063
2.639PheVal: 2.639 ± 0.08
0.269PheTrp: 0.269 ± 0.024
1.309PheTyr: 1.309 ± 0.048
0.0PheXaa: 0.0 ± 0.0
Gly
7.33GlyAla: 7.33 ± 0.121
1.61GlyCys: 1.61 ± 0.066
3.978GlyAsp: 3.978 ± 0.082
5.826GlyGlu: 5.826 ± 0.105
3.086GlyPhe: 3.086 ± 0.078
6.148GlyGly: 6.148 ± 0.14
1.332GlyHis: 1.332 ± 0.056
5.995GlyIle: 5.995 ± 0.118
4.815GlyLys: 4.815 ± 0.096
6.346GlyLeu: 6.346 ± 0.112
2.916GlyMet: 2.916 ± 0.073
2.611GlyAsn: 2.611 ± 0.07
1.489GlyPro: 1.489 ± 0.049
1.855GlyGln: 1.855 ± 0.059
4.027GlyArg: 4.027 ± 0.079
4.484GlySer: 4.484 ± 0.093
4.487GlyThr: 4.487 ± 0.092
5.779GlyVal: 5.779 ± 0.104
0.707GlyTrp: 0.707 ± 0.035
3.316GlyTyr: 3.316 ± 0.081
0.0GlyXaa: 0.0 ± 0.0
His
1.496HisAla: 1.496 ± 0.055
0.313HisCys: 0.313 ± 0.022
0.993HisAsp: 0.993 ± 0.046
1.116HisGlu: 1.116 ± 0.05
0.643HisPhe: 0.643 ± 0.031
1.59HisGly: 1.59 ± 0.064
0.322HisHis: 0.322 ± 0.026
1.408HisIle: 1.408 ± 0.054
0.859HisLys: 0.859 ± 0.039
1.153HisLeu: 1.153 ± 0.048
0.594HisMet: 0.594 ± 0.032
0.658HisAsn: 0.658 ± 0.036
0.879HisPro: 0.879 ± 0.036
0.36HisGln: 0.36 ± 0.023
0.83HisArg: 0.83 ± 0.037
0.952HisSer: 0.952 ± 0.041
1.046HisThr: 1.046 ± 0.041
1.14HisVal: 1.14 ± 0.043
0.158HisTrp: 0.158 ± 0.017
0.683HisTyr: 0.683 ± 0.035
0.0HisXaa: 0.0 ± 0.0
Ile
7.936IleAla: 7.936 ± 0.115
1.304IleCys: 1.304 ± 0.046
4.392IleAsp: 4.392 ± 0.081
4.596IleGlu: 4.596 ± 0.09
2.195IlePhe: 2.195 ± 0.078
5.386IleGly: 5.386 ± 0.107
1.044IleHis: 1.044 ± 0.04
4.858IleIle: 4.858 ± 0.111
3.537IleLys: 3.537 ± 0.079
5.37IleLeu: 5.37 ± 0.106
2.122IleMet: 2.122 ± 0.061
2.698IleAsn: 2.698 ± 0.066
2.879IlePro: 2.879 ± 0.07
1.486IleGln: 1.486 ± 0.053
3.466IleArg: 3.466 ± 0.079
4.395IleSer: 4.395 ± 0.09
4.141IleThr: 4.141 ± 0.087
4.84IleVal: 4.84 ± 0.096
0.528IleTrp: 0.528 ± 0.029
2.228IleTyr: 2.228 ± 0.065
0.002IleXaa: 0.002 ± 0.002
Lys
5.566LysAla: 5.566 ± 0.115
0.655LysCys: 0.655 ± 0.035
2.896LysAsp: 2.896 ± 0.069
4.029LysGlu: 4.029 ± 0.095
1.344LysPhe: 1.344 ± 0.049
3.666LysGly: 3.666 ± 0.079
0.971LysHis: 0.971 ± 0.041
2.861LysIle: 2.861 ± 0.072
3.712LysLys: 3.712 ± 0.086
5.901LysLeu: 5.901 ± 0.105
1.641LysMet: 1.641 ± 0.053
2.39LysAsn: 2.39 ± 0.061
2.207LysPro: 2.207 ± 0.055
1.883LysGln: 1.883 ± 0.057
3.463LysArg: 3.463 ± 0.071
2.714LysSer: 2.714 ± 0.064
3.082LysThr: 3.082 ± 0.086
3.199LysVal: 3.199 ± 0.078
0.454LysTrp: 0.454 ± 0.029
2.078LysTyr: 2.078 ± 0.068
0.0LysXaa: 0.0 ± 0.0
Leu
7.749LeuAla: 7.749 ± 0.116
2.078LeuCys: 2.078 ± 0.057
4.878LeuAsp: 4.878 ± 0.109
4.914LeuGlu: 4.914 ± 0.095
3.755LeuPhe: 3.755 ± 0.093
6.714LeuGly: 6.714 ± 0.118
1.524LeuHis: 1.524 ± 0.053
6.483LeuIle: 6.483 ± 0.107
5.084LeuLys: 5.084 ± 0.087
7.787LeuLeu: 7.787 ± 0.143
3.189LeuMet: 3.189 ± 0.075
3.96LeuAsn: 3.96 ± 0.091
3.814LeuPro: 3.814 ± 0.084
1.844LeuGln: 1.844 ± 0.055
4.529LeuArg: 4.529 ± 0.086
6.776LeuSer: 6.776 ± 0.12
5.079LeuThr: 5.079 ± 0.093
5.068LeuVal: 5.068 ± 0.095
0.684LeuTrp: 0.684 ± 0.039
3.221LeuTyr: 3.221 ± 0.089
0.002LeuXaa: 0.002 ± 0.002
Met
3.447MetAla: 3.447 ± 0.083
0.478MetCys: 0.478 ± 0.029
1.989MetAsp: 1.989 ± 0.054
2.15MetGlu: 2.15 ± 0.06
1.014MetPhe: 1.014 ± 0.041
2.477MetGly: 2.477 ± 0.062
0.643MetHis: 0.643 ± 0.034
1.88MetIle: 1.88 ± 0.057
2.015MetLys: 2.015 ± 0.051
3.772MetLeu: 3.772 ± 0.077
0.897MetMet: 0.897 ± 0.038
1.456MetAsn: 1.456 ± 0.047
1.459MetPro: 1.459 ± 0.049
1.194MetGln: 1.194 ± 0.038
2.015MetArg: 2.015 ± 0.054
1.925MetSer: 1.925 ± 0.063
1.588MetThr: 1.588 ± 0.046
1.954MetVal: 1.954 ± 0.058
0.211MetTrp: 0.211 ± 0.021
0.978MetTyr: 0.978 ± 0.037
0.0MetXaa: 0.0 ± 0.0
Asn
4.443AsnAla: 4.443 ± 0.092
0.597AsnCys: 0.597 ± 0.036
2.093AsnAsp: 2.093 ± 0.058
2.813AsnGlu: 2.813 ± 0.058
1.118AsnPhe: 1.118 ± 0.041
3.879AsnGly: 3.879 ± 0.091
0.528AsnHis: 0.528 ± 0.027
2.894AsnIle: 2.894 ± 0.075
1.892AsnLys: 1.892 ± 0.054
2.734AsnLeu: 2.734 ± 0.079
1.423AsnMet: 1.423 ± 0.049
1.428AsnAsn: 1.428 ± 0.06
1.699AsnPro: 1.699 ± 0.056
0.815AsnGln: 0.815 ± 0.038
1.799AsnArg: 1.799 ± 0.059
1.971AsnSer: 1.971 ± 0.064
2.197AsnThr: 2.197 ± 0.057
2.683AsnVal: 2.683 ± 0.067
0.338AsnTrp: 0.338 ± 0.028
1.403AsnTyr: 1.403 ± 0.05
0.0AsnXaa: 0.0 ± 0.0
Pro
3.391ProAla: 3.391 ± 0.085
0.546ProCys: 0.546 ± 0.031
2.282ProAsp: 2.282 ± 0.066
3.09ProGlu: 3.09 ± 0.077
1.558ProPhe: 1.558 ± 0.053
2.465ProGly: 2.465 ± 0.075
0.712ProHis: 0.712 ± 0.039
2.274ProIle: 2.274 ± 0.062
1.939ProLys: 1.939 ± 0.055
3.302ProLeu: 3.302 ± 0.074
1.179ProMet: 1.179 ± 0.044
1.483ProAsn: 1.483 ± 0.048
1.032ProPro: 1.032 ± 0.044
1.186ProGln: 1.186 ± 0.051
1.644ProArg: 1.644 ± 0.069
1.984ProSer: 1.984 ± 0.054
2.142ProThr: 2.142 ± 0.087
2.792ProVal: 2.792 ± 0.069
0.312ProTrp: 0.312 ± 0.023
1.57ProTyr: 1.57 ± 0.054
0.0ProXaa: 0.0 ± 0.0
Gln
2.416GlnAla: 2.416 ± 0.061
0.412GlnCys: 0.412 ± 0.027
1.184GlnAsp: 1.184 ± 0.043
1.412GlnGlu: 1.412 ± 0.051
0.904GlnPhe: 0.904 ± 0.04
1.814GlnGly: 1.814 ± 0.059
0.462GlnHis: 0.462 ± 0.029
1.61GlnIle: 1.61 ± 0.049
1.634GlnLys: 1.634 ± 0.045
2.695GlnLeu: 2.695 ± 0.072
0.876GlnMet: 0.876 ± 0.032
1.158GlnAsn: 1.158 ± 0.047
0.986GlnPro: 0.986 ± 0.039
1.08GlnGln: 1.08 ± 0.051
1.834GlnArg: 1.834 ± 0.06
1.674GlnSer: 1.674 ± 0.052
1.384GlnThr: 1.384 ± 0.052
1.456GlnVal: 1.456 ± 0.048
0.226GlnTrp: 0.226 ± 0.02
1.181GlnTyr: 1.181 ± 0.048
0.0GlnXaa: 0.0 ± 0.0
Arg
5.058ArgAla: 5.058 ± 0.106
0.836ArgCys: 0.836 ± 0.039
2.662ArgAsp: 2.662 ± 0.076
3.89ArgGlu: 3.89 ± 0.087
2.319ArgPhe: 2.319 ± 0.059
3.666ArgGly: 3.666 ± 0.084
0.975ArgHis: 0.975 ± 0.042
4.314ArgIle: 4.314 ± 0.093
2.531ArgLys: 2.531 ± 0.07
5.007ArgLeu: 5.007 ± 0.103
1.862ArgMet: 1.862 ± 0.06
1.857ArgAsn: 1.857 ± 0.059
1.707ArgPro: 1.707 ± 0.064
1.745ArgGln: 1.745 ± 0.058
3.648ArgArg: 3.648 ± 0.091
2.625ArgSer: 2.625 ± 0.075
2.663ArgThr: 2.663 ± 0.066
3.353ArgVal: 3.353 ± 0.08
0.388ArgTrp: 0.388 ± 0.027
2.009ArgTyr: 2.009 ± 0.057
0.002ArgXaa: 0.002 ± 0.002
Ser
6.042SerAla: 6.042 ± 0.112
0.866SerCys: 0.866 ± 0.034
3.0SerAsp: 3.0 ± 0.081
3.656SerGlu: 3.656 ± 0.079
2.399SerPhe: 2.399 ± 0.061
5.655SerGly: 5.655 ± 0.101
0.912SerHis: 0.912 ± 0.037
4.115SerIle: 4.115 ± 0.074
2.363SerLys: 2.363 ± 0.066
5.081SerLeu: 5.081 ± 0.095
1.732SerMet: 1.732 ± 0.061
1.732SerAsn: 1.732 ± 0.064
1.859SerPro: 1.859 ± 0.056
1.286SerGln: 1.286 ± 0.05
2.884SerArg: 2.884 ± 0.081
3.053SerSer: 3.053 ± 0.09
2.832SerThr: 2.832 ± 0.067
4.411SerVal: 4.411 ± 0.086
0.427SerTrp: 0.427 ± 0.025
1.89SerTyr: 1.89 ± 0.053
0.0SerXaa: 0.0 ± 0.0
Thr
5.797ThrAla: 5.797 ± 0.106
0.574ThrCys: 0.574 ± 0.032
3.316ThrAsp: 3.316 ± 0.075
3.48ThrGlu: 3.48 ± 0.078
1.971ThrPhe: 1.971 ± 0.056
4.568ThrGly: 4.568 ± 0.084
0.976ThrHis: 0.976 ± 0.046
3.704ThrIle: 3.704 ± 0.083
2.304ThrLys: 2.304 ± 0.062
5.454ThrLeu: 5.454 ± 0.099
1.458ThrMet: 1.458 ± 0.046
1.697ThrAsn: 1.697 ± 0.072
2.759ThrPro: 2.759 ± 0.091
1.224ThrGln: 1.224 ± 0.047
2.437ThrArg: 2.437 ± 0.073
2.404ThrSer: 2.404 ± 0.065
2.586ThrThr: 2.586 ± 0.068
4.618ThrVal: 4.618 ± 0.103
0.434ThrTrp: 0.434 ± 0.03
1.74ThrTyr: 1.74 ± 0.056
0.0ThrXaa: 0.0 ± 0.0
Val
5.569ValAla: 5.569 ± 0.106
1.385ValCys: 1.385 ± 0.051
3.719ValAsp: 3.719 ± 0.08
4.141ValGlu: 4.141 ± 0.091
2.789ValPhe: 2.789 ± 0.068
4.497ValGly: 4.497 ± 0.1
1.181ValHis: 1.181 ± 0.044
4.623ValIle: 4.623 ± 0.088
3.839ValLys: 3.839 ± 0.084
6.338ValLeu: 6.338 ± 0.12
2.207ValMet: 2.207 ± 0.06
2.898ValAsn: 2.898 ± 0.08
2.596ValPro: 2.596 ± 0.068
1.783ValGln: 1.783 ± 0.056
3.593ValArg: 3.593 ± 0.089
4.438ValSer: 4.438 ± 0.091
3.991ValThr: 3.991 ± 0.098
4.161ValVal: 4.161 ± 0.092
0.571ValTrp: 0.571 ± 0.031
2.563ValTyr: 2.563 ± 0.069
0.002ValXaa: 0.002 ± 0.002
Trp
0.615TrpAla: 0.615 ± 0.035
0.119TrpCys: 0.119 ± 0.014
0.482TrpAsp: 0.482 ± 0.031
0.475TrpGlu: 0.475 ± 0.028
0.383TrpPhe: 0.383 ± 0.024
0.673TrpGly: 0.673 ± 0.036
0.172TrpHis: 0.172 ± 0.017
0.521TrpIle: 0.521 ± 0.033
0.394TrpLys: 0.394 ± 0.026
0.801TrpLeu: 0.801 ± 0.039
0.274TrpMet: 0.274 ± 0.023
0.376TrpAsn: 0.376 ± 0.028
0.269TrpPro: 0.269 ± 0.022
0.353TrpGln: 0.353 ± 0.025
0.434TrpArg: 0.434 ± 0.029
0.429TrpSer: 0.429 ± 0.026
0.389TrpThr: 0.389 ± 0.031
0.45TrpVal: 0.45 ± 0.029
0.102TrpTrp: 0.102 ± 0.013
0.338TrpTyr: 0.338 ± 0.026
0.0TrpXaa: 0.0 ± 0.0
Tyr
3.851TyrAla: 3.851 ± 0.095
0.65TyrCys: 0.65 ± 0.034
2.508TyrAsp: 2.508 ± 0.081
2.297TyrGlu: 2.297 ± 0.072
1.453TyrPhe: 1.453 ± 0.048
3.097TyrGly: 3.097 ± 0.082
0.613TyrHis: 0.613 ± 0.034
2.835TyrIle: 2.835 ± 0.07
1.778TyrLys: 1.778 ± 0.055
2.721TyrLeu: 2.721 ± 0.08
1.184TyrMet: 1.184 ± 0.044
1.544TyrAsn: 1.544 ± 0.056
1.455TyrPro: 1.455 ± 0.049
0.853TyrGln: 0.853 ± 0.039
1.911TyrArg: 1.911 ± 0.052
2.279TyrSer: 2.279 ± 0.063
2.259TyrThr: 2.259 ± 0.058
2.319TyrVal: 2.319 ± 0.069
0.305TyrTrp: 0.305 ± 0.023
1.573TyrTyr: 1.573 ± 0.053
0.0TyrXaa: 0.0 ± 0.0
Xaa
0.0XaaAla: 0.0 ± 0.0
0.0XaaCys: 0.0 ± 0.0
0.0XaaAsp: 0.0 ± 0.0
0.002XaaGlu: 0.002 ± 0.002
0.0XaaPhe: 0.0 ± 0.0
0.0XaaGly: 0.0 ± 0.0
0.0XaaHis: 0.0 ± 0.0
0.0XaaIle: 0.0 ± 0.0
0.0XaaLys: 0.0 ± 0.0
0.002XaaLeu: 0.002 ± 0.002
0.0XaaMet: 0.0 ± 0.0
0.0XaaAsn: 0.0 ± 0.0
0.002XaaPro: 0.002 ± 0.002
0.002XaaGln: 0.002 ± 0.002
0.0XaaArg: 0.0 ± 0.0
0.0XaaSer: 0.0 ± 0.0
0.0XaaThr: 0.0 ± 0.0
0.0XaaVal: 0.0 ± 0.0
0.0XaaTrp: 0.0 ± 0.0
0.0XaaTyr: 0.0 ± 0.0
0.01XaaXaa: 0.01 ± 0.006
Statistics based on 2034 proteins (606381 amino acids)

Note: The error was estimated by bootstrapping (100 replicates) at the protein level.
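
To make the idea of bootstrapping at the protein level concrete, here is a minimal Python sketch. It is not the Proteome-pI implementation. The function names `count_pair` and `bootstrap_error` are hypothetical, and the sketch assumes that whole proteins are resampled with replacement and that the reported error is the standard deviation across replicates.

```python
import random
import statistics


def count_pair(seq, pair):
    """Count overlapping occurrences of a two-letter dipeptide in a sequence."""
    return sum(1 for i in range(len(seq) - 1) if seq[i:i + 2] == pair)


def bootstrap_error(sequences, pair, n_replicates=100, seed=0):
    """Estimate the error (in per mille) of one dipeptide frequency.

    Assumption: the error is the standard deviation of the frequency
    over replicates in which whole proteins are resampled with replacement.
    """
    rng = random.Random(seed)
    # Per protein: (occurrences of the pair, total number of dipeptides).
    per_protein = [(count_pair(s, pair), max(len(s) - 1, 0)) for s in sequences]
    estimates = []
    for _ in range(n_replicates):
        # Draw as many proteins as in the original set, with replacement.
        sample = rng.choices(per_protein, k=len(per_protein))
        hits = sum(h for h, _ in sample)
        total = sum(t for _, t in sample)
        estimates.append(1000.0 * hits / total if total else 0.0)
    return statistics.stdev(estimates)


# Toy data: identical proteins give no variation between replicates.
err_identical = bootstrap_error(["AAAA", "AAAA", "AAAA"], "AA")
err_mixed = bootstrap_error(["AAAA", "MKLV", "AAKL", "GGGG"], "AA")
```

Resampling whole proteins rather than individual dipeptides keeps the pairs within one protein together, which is one plausible reading of "at the protein level".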

The dipeptide statistics above (along with other statistics for this proteome) can be downloaded from this CSV file
See this proteome in: uniprot_link
Proteome-pI is available under the Creative Commons Attribution-NoDerivs license; for more details see here

Reference: Kozlowski LP. Proteome-pI 2.0: Proteome Isoelectric Point Database Update. Nucleic Acids Res. 2021, doi: 10.1093/nar/gkab944
Contact: Lukasz P. Kozlowski