Amino acid dipeptide frequency for Clostridium sp. CAG:964

Apart from single amino acid frequencies, one can also calculate the so-called amino acid dipeptide frequency.

There are 400 possible dipeptides (441 if Xaa is counted as an additional 21st amino acid representing all non-standard or unknown amino acids). Thus, if dipeptides were distributed uniformly at random in proteins, each of the 400 dipeptides would be present with a frequency of 0.25% (2.5‰). As this is not the case in nature, for better visibility all dipeptides that are more frequent than expected are marked in red in the table, and those that are underrepresented are marked in blue.
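As an illustration of the calculation described above, here is a minimal Python sketch. It assumes that overlapping residue pairs are counted within each protein, that any one-letter code outside the 20 standard residues is mapped to X (Xaa), and that "expected" means the uniform value of 2.5‰. The function names and these counting choices are illustrative assumptions, not the documented procedure of the database.

```python
from collections import Counter
from itertools import product

STANDARD = "ACDEFGHIKLMNPQRSTVWY"   # 20 standard residues, one-letter codes
ALPHABET = STANDARD + "X"           # X stands for Xaa (non-standard or unknown)
UNIFORM_PERMILLE = 1000.0 / 400     # 2.5 per mille, i.e. 0.25%


def dipeptide_permille(proteins):
    """Return {dipeptide: frequency in per mille} for all 441 pairs.

    Assumption: pairs overlap (positions i and i+1) and never span
    two different proteins.
    """
    counts = Counter()
    for seq in proteins:
        # Map anything outside the standard alphabet to X.
        seq = "".join(aa if aa in STANDARD else "X" for aa in seq.upper())
        counts.update(a + b for a, b in zip(seq, seq[1:]))
    total = sum(counts.values())
    if total == 0:
        return {}
    return {
        a + b: 1000.0 * counts[a + b] / total
        for a, b in product(ALPHABET, repeat=2)
    }


def classify(freq_permille, expected=UNIFORM_PERMILLE):
    """Label a frequency relative to the uniform expectation."""
    if freq_permille > expected:
        return "over"    # would be marked in red
    if freq_permille < expected:
        return "under"   # would be marked in blue
    return "as expected"


if __name__ == "__main__":
    toy = ["MKKLLA", "MAAVKX"]  # toy sequences, not real proteome data
    freqs = dipeptide_permille(toy)
    for pair in ("KK", "AA", "WW"):
        print(pair, round(freqs[pair], 3), classify(freqs[pair]))
```

Under this convention, a table value such as AlaAla at 7.107‰ is above 2.5‰ and would be classed as overrepresented, while TrpTrp at 0.082‰ would be classed as underrepresented.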

All values are given in per mille (‰), so they need to be multiplied by 10⁻³ to obtain fractions; for example, 7.107‰ corresponds to 0.007107 (0.7107%).

For more information, see the sequence space article on Wikipedia.

Ala Cys Asp Glu Phe Gly His Ile Lys Leu Met Asn Pro Gln Arg Ser Thr Val Trp Tyr Xaa
Ala
AlaAla: 7.107 ± 0.158
AlaCys: 1.102 ± 0.044
AlaAsp: 4.838 ± 0.104
AlaGlu: 5.367 ± 0.109
AlaPhe: 3.114 ± 0.087
AlaGly: 5.202 ± 0.1
AlaHis: 0.923 ± 0.039
AlaIle: 5.087 ± 0.096
AlaLys: 5.33 ± 0.108
AlaLeu: 6.735 ± 0.127
AlaMet: 2.176 ± 0.072
AlaAsn: 3.037 ± 0.082
AlaPro: 2.234 ± 0.09
AlaGln: 2.689 ± 0.074
AlaArg: 2.606 ± 0.083
AlaSer: 3.928 ± 0.085
AlaThr: 3.755 ± 0.102
AlaVal: 7.572 ± 0.138
AlaTrp: 0.391 ± 0.026
AlaTyr: 2.725 ± 0.064
AlaXaa: 0.0 ± 0.0
Cys
CysAla: 1.211 ± 0.043
CysCys: 0.396 ± 0.03
CysAsp: 1.023 ± 0.043
CysGlu: 0.928 ± 0.041
CysPhe: 0.679 ± 0.036
CysGly: 1.772 ± 0.065
CysHis: 0.304 ± 0.023
CysIle: 1.216 ± 0.046
CysLys: 1.213 ± 0.042
CysLeu: 1.136 ± 0.041
CysMet: 0.372 ± 0.024
CysAsn: 0.783 ± 0.038
CysPro: 0.622 ± 0.042
CysGln: 0.38 ± 0.028
CysArg: 0.692 ± 0.039
CysSer: 1.24 ± 0.044
CysThr: 0.986 ± 0.048
CysVal: 1.172 ± 0.044
CysTrp: 0.089 ± 0.012
CysTyr: 0.651 ± 0.033
CysXaa: 0.0 ± 0.0
Asp
AspAla: 3.714 ± 0.088
AspCys: 0.935 ± 0.041
AspAsp: 3.276 ± 0.086
AspGlu: 4.037 ± 0.11
AspPhe: 2.858 ± 0.073
AspGly: 4.298 ± 0.105
AspHis: 0.641 ± 0.032
AspIle: 5.149 ± 0.099
AspLys: 4.707 ± 0.092
AspLeu: 4.011 ± 0.084
AspMet: 1.572 ± 0.051
AspAsn: 3.443 ± 0.07
AspPro: 1.332 ± 0.041
AspGln: 0.94 ± 0.044
AspArg: 2.028 ± 0.059
AspSer: 3.914 ± 0.09
AspThr: 3.472 ± 0.079
AspVal: 3.764 ± 0.083
AspTrp: 0.442 ± 0.03
AspTyr: 2.898 ± 0.072
AspXaa: 0.0 ± 0.0
Glu
GluAla: 4.692 ± 0.099
GluCys: 0.953 ± 0.039
GluAsp: 3.344 ± 0.082
GluGlu: 4.722 ± 0.104
GluPhe: 2.381 ± 0.063
GluGly: 3.465 ± 0.079
GluHis: 1.005 ± 0.036
GluIle: 5.405 ± 0.117
GluLys: 6.124 ± 0.113
GluLeu: 5.967 ± 0.116
GluMet: 1.936 ± 0.067
GluAsn: 4.603 ± 0.088
GluPro: 1.784 ± 0.064
GluGln: 2.497 ± 0.074
GluArg: 2.708 ± 0.075
GluSer: 3.373 ± 0.073
GluThr: 2.889 ± 0.073
GluVal: 3.798 ± 0.099
GluTrp: 0.418 ± 0.03
GluTyr: 3.239 ± 0.072
GluXaa: 0.0 ± 0.0
Phe
PheAla: 3.051 ± 0.081
PheCys: 0.756 ± 0.032
PheAsp: 2.502 ± 0.076
PheGlu: 2.546 ± 0.063
PhePhe: 1.801 ± 0.065
PheGly: 2.85 ± 0.083
PheHis: 0.524 ± 0.031
PheIle: 3.097 ± 0.087
PheLys: 2.669 ± 0.065
PheLeu: 3.215 ± 0.09
PheMet: 1.095 ± 0.045
PheAsn: 2.277 ± 0.063
PhePro: 1.148 ± 0.041
PheGln: 0.907 ± 0.04
PheArg: 1.354 ± 0.051
PheSer: 3.216 ± 0.094
PheThr: 2.548 ± 0.067
PheVal: 2.923 ± 0.087
PheTrp: 0.348 ± 0.025
PheTyr: 1.7 ± 0.057
PheXaa: 0.0 ± 0.0
Gly
GlyAla: 4.763 ± 0.107
GlyCys: 1.474 ± 0.055
GlyAsp: 3.699 ± 0.082
GlyGlu: 4.252 ± 0.099
GlyPhe: 2.879 ± 0.079
GlyGly: 4.705 ± 0.1
GlyHis: 1.016 ± 0.047
GlyIle: 5.73 ± 0.1
GlyLys: 5.514 ± 0.105
GlyLeu: 5.224 ± 0.09
GlyMet: 1.915 ± 0.058
GlyAsn: 3.24 ± 0.087
GlyPro: 0.947 ± 0.047
GlyGln: 1.726 ± 0.052
GlyArg: 2.698 ± 0.073
GlySer: 4.55 ± 0.099
GlyThr: 4.156 ± 0.08
GlyVal: 5.261 ± 0.11
GlyTrp: 0.556 ± 0.034
GlyTyr: 3.21 ± 0.068
GlyXaa: 0.002 ± 0.002
His
HisAla: 0.825 ± 0.039
HisCys: 0.339 ± 0.027
HisAsp: 0.677 ± 0.034
HisGlu: 0.621 ± 0.03
HisPhe: 0.617 ± 0.03
HisGly: 1.052 ± 0.044
HisHis: 0.304 ± 0.022
HisIle: 1.354 ± 0.051
HisLys: 0.974 ± 0.046
HisLeu: 1.211 ± 0.048
HisMet: 0.345 ± 0.023
HisAsn: 0.95 ± 0.044
HisPro: 0.68 ± 0.031
HisGln: 0.413 ± 0.027
HisArg: 0.704 ± 0.031
HisSer: 1.119 ± 0.046
HisThr: 1.011 ± 0.042
HisVal: 0.51 ± 0.033
HisTrp: 0.155 ± 0.017
HisTyr: 0.66 ± 0.035
HisXaa: 0.0 ± 0.0
Ile
IleAla: 6.036 ± 0.113
IleCys: 1.311 ± 0.051
IleAsp: 4.826 ± 0.088
IleGlu: 4.767 ± 0.116
IlePhe: 3.003 ± 0.084
IleGly: 4.661 ± 0.106
IleHis: 0.993 ± 0.046
IleIle: 6.157 ± 0.123
IleLys: 5.771 ± 0.115
IleLeu: 5.979 ± 0.116
IleMet: 2.04 ± 0.054
IleAsn: 4.311 ± 0.086
IlePro: 3.01 ± 0.073
IleGln: 1.828 ± 0.055
IleArg: 2.606 ± 0.071
IleSer: 5.326 ± 0.109
IleThr: 4.999 ± 0.104
IleVal: 5.077 ± 0.108
IleTrp: 0.476 ± 0.029
IleTyr: 2.739 ± 0.068
IleXaa: 0.0 ± 0.0
Lys
LysAla: 5.841 ± 0.126
LysCys: 0.933 ± 0.037
LysAsp: 4.127 ± 0.084
LysGlu: 6.068 ± 0.114
LysPhe: 2.096 ± 0.064
LysGly: 4.659 ± 0.093
LysHis: 1.098 ± 0.045
LysIle: 5.862 ± 0.114
LysLys: 6.549 ± 0.115
LysLeu: 6.297 ± 0.112
LysMet: 2.116 ± 0.061
LysAsn: 4.726 ± 0.091
LysPro: 2.372 ± 0.068
LysGln: 2.585 ± 0.074
LysArg: 3.072 ± 0.071
LysSer: 4.606 ± 0.09
LysThr: 4.349 ± 0.097
LysVal: 4.211 ± 0.098
LysTrp: 0.534 ± 0.029
LysTyr: 3.37 ± 0.087
LysXaa: 0.0 ± 0.0
Leu
LeuAla: 6.245 ± 0.122
LeuCys: 1.661 ± 0.059
LeuAsp: 4.658 ± 0.087
LeuGlu: 5.188 ± 0.105
LeuPhe: 3.648 ± 0.105
LeuGly: 5.273 ± 0.101
LeuHis: 1.207 ± 0.054
LeuIle: 5.645 ± 0.111
LeuLys: 6.326 ± 0.102
LeuLeu: 7.443 ± 0.151
LeuMet: 2.35 ± 0.065
LeuAsn: 4.441 ± 0.09
LeuPro: 3.28 ± 0.076
LeuGln: 2.531 ± 0.088
LeuArg: 3.406 ± 0.097
LeuSer: 6.334 ± 0.122
LeuThr: 5.137 ± 0.083
LeuVal: 5.382 ± 0.115
LeuTrp: 0.595 ± 0.035
LeuTyr: 3.182 ± 0.077
LeuXaa: 0.0 ± 0.0
Met
MetAla: 2.219 ± 0.07
MetCys: 0.476 ± 0.026
MetAsp: 1.463 ± 0.05
MetGlu: 1.625 ± 0.054
MetPhe: 1.034 ± 0.042
MetGly: 1.862 ± 0.066
MetHis: 0.385 ± 0.024
MetIle: 1.6 ± 0.048
MetLys: 1.97 ± 0.054
MetLeu: 2.73 ± 0.07
MetMet: 0.64 ± 0.038
MetAsn: 1.531 ± 0.056
MetPro: 1.071 ± 0.043
MetGln: 0.849 ± 0.035
MetArg: 1.037 ± 0.043
MetSer: 1.956 ± 0.065
MetThr: 1.455 ± 0.055
MetVal: 1.746 ± 0.057
MetTrp: 0.194 ± 0.019
MetTyr: 0.945 ± 0.039
MetXaa: 0.0 ± 0.0
Asn
AsnAla: 3.928 ± 0.093
AsnCys: 0.82 ± 0.036
AsnAsp: 2.906 ± 0.084
AsnGlu: 3.203 ± 0.08
AsnPhe: 1.809 ± 0.059
AsnGly: 4.465 ± 0.116
AsnHis: 0.773 ± 0.038
AsnIle: 4.687 ± 0.097
AsnLys: 4.224 ± 0.097
AsnLeu: 3.658 ± 0.079
AsnMet: 1.497 ± 0.052
AsnAsn: 3.419 ± 0.136
AsnPro: 2.197 ± 0.066
AsnGln: 1.373 ± 0.052
AsnArg: 1.944 ± 0.06
AsnSer: 3.917 ± 0.098
AsnThr: 3.181 ± 0.087
AsnVal: 3.486 ± 0.079
AsnTrp: 0.404 ± 0.026
AsnTyr: 2.173 ± 0.076
AsnXaa: 0.0 ± 0.0
Pro
ProAla: 2.336 ± 0.062
ProCys: 0.462 ± 0.027
ProAsp: 1.919 ± 0.065
ProGlu: 2.497 ± 0.073
ProPhe: 1.453 ± 0.054
ProGly: 1.717 ± 0.059
ProHis: 0.539 ± 0.033
ProIle: 2.065 ± 0.054
ProLys: 2.249 ± 0.065
ProLeu: 2.51 ± 0.062
ProMet: 0.767 ± 0.032
ProAsn: 1.571 ± 0.061
ProPro: 0.885 ± 0.047
ProGln: 1.526 ± 0.065
ProArg: 0.936 ± 0.041
ProSer: 1.924 ± 0.062
ProThr: 2.198 ± 0.103
ProVal: 2.865 ± 0.066
ProTrp: 0.263 ± 0.022
ProTyr: 1.446 ± 0.049
ProXaa: 0.0 ± 0.0
Gln
GlnAla: 2.403 ± 0.067
GlnCys: 0.455 ± 0.03
GlnAsp: 1.385 ± 0.046
GlnGlu: 1.905 ± 0.071
GlnPhe: 0.965 ± 0.038
GlnGly: 1.924 ± 0.061
GlnHis: 0.464 ± 0.032
GlnIle: 2.094 ± 0.07
GlnLys: 2.393 ± 0.069
GlnLeu: 3.174 ± 0.084
GlnMet: 0.945 ± 0.036
GlnAsn: 1.661 ± 0.071
GlnPro: 1.039 ± 0.046
GlnGln: 1.75 ± 0.079
GlnArg: 1.511 ± 0.061
GlnSer: 1.833 ± 0.058
GlnThr: 1.547 ± 0.057
GlnVal: 1.784 ± 0.062
GlnTrp: 0.317 ± 0.024
GlnTyr: 1.32 ± 0.05
GlnXaa: 0.0 ± 0.0
Arg
ArgAla: 2.456 ± 0.07
ArgCys: 0.665 ± 0.039
ArgAsp: 1.891 ± 0.063
ArgGlu: 2.703 ± 0.075
ArgPhe: 1.736 ± 0.061
ArgGly: 2.265 ± 0.067
ArgHis: 0.704 ± 0.038
ArgIle: 2.833 ± 0.079
ArgLys: 2.788 ± 0.076
ArgLeu: 4.033 ± 0.097
ArgMet: 1.081 ± 0.04
ArgAsn: 1.702 ± 0.063
ArgPro: 1.139 ± 0.054
ArgGln: 1.456 ± 0.057
ArgArg: 1.866 ± 0.066
ArgSer: 2.14 ± 0.063
ArgThr: 1.775 ± 0.05
ArgVal: 2.836 ± 0.07
ArgTrp: 0.281 ± 0.025
ArgTyr: 1.668 ± 0.065
ArgXaa: 0.0 ± 0.0
Ser
SerAla: 4.818 ± 0.104
SerCys: 1.136 ± 0.045
SerAsp: 3.972 ± 0.097
SerGlu: 3.847 ± 0.083
SerPhe: 3.041 ± 0.079
SerGly: 5.029 ± 0.094
SerHis: 1.028 ± 0.04
SerIle: 4.519 ± 0.089
SerLys: 4.675 ± 0.092
SerLeu: 5.505 ± 0.107
SerMet: 1.566 ± 0.05
SerAsn: 3.307 ± 0.094
SerPro: 2.002 ± 0.068
SerGln: 2.222 ± 0.062
SerArg: 2.55 ± 0.072
SerSer: 4.91 ± 0.172
SerThr: 3.761 ± 0.086
SerVal: 5.2 ± 0.107
SerTrp: 0.423 ± 0.025
SerTyr: 2.785 ± 0.079
SerXaa: 0.0 ± 0.0
Thr
ThrAla: 5.715 ± 0.121
ThrCys: 0.723 ± 0.038
ThrAsp: 3.761 ± 0.089
ThrGlu: 4.098 ± 0.106
ThrPhe: 2.084 ± 0.055
ThrGly: 4.652 ± 0.095
ThrHis: 0.887 ± 0.043
ThrIle: 3.934 ± 0.098
ThrLys: 3.382 ± 0.082
ThrLeu: 4.804 ± 0.09
ThrMet: 1.226 ± 0.044
ThrAsn: 2.492 ± 0.07
ThrPro: 2.391 ± 0.065
ThrGln: 1.78 ± 0.059
ThrArg: 1.751 ± 0.053
ThrSer: 3.344 ± 0.087
ThrThr: 3.135 ± 0.085
ThrVal: 5.49 ± 0.128
ThrTrp: 0.304 ± 0.023
ThrTyr: 2.118 ± 0.069
ThrXaa: 0.002 ± 0.002
Val
ValAla: 5.364 ± 0.105
ValCys: 1.347 ± 0.052
ValAsp: 4.011 ± 0.083
ValGlu: 4.228 ± 0.103
ValPhe: 3.158 ± 0.083
ValGly: 4.214 ± 0.095
ValHis: 0.95 ± 0.042
ValIle: 5.676 ± 0.099
ValLys: 5.116 ± 0.1
ValLeu: 6.424 ± 0.111
ValMet: 1.924 ± 0.056
ValAsn: 3.513 ± 0.092
ValPro: 2.539 ± 0.062
ValGln: 1.879 ± 0.06
ValArg: 2.637 ± 0.068
ValSer: 5.019 ± 0.093
ValThr: 4.738 ± 0.109
ValVal: 5.379 ± 0.112
ValTrp: 0.536 ± 0.029
ValTyr: 2.708 ± 0.076
ValXaa: 0.0 ± 0.0
Trp
TrpAla: 0.462 ± 0.027
TrpCys: 0.138 ± 0.016
TrpAsp: 0.45 ± 0.029
TrpGlu: 0.379 ± 0.026
TrpPhe: 0.329 ± 0.026
TrpGly: 0.496 ± 0.03
TrpHis: 0.155 ± 0.016
TrpIle: 0.43 ± 0.033
TrpLys: 0.428 ± 0.027
TrpLeu: 0.764 ± 0.035
TrpMet: 0.213 ± 0.016
TrpAsn: 0.421 ± 0.029
TrpPro: 0.116 ± 0.015
TrpGln: 0.396 ± 0.028
TrpArg: 0.278 ± 0.019
TrpSer: 0.466 ± 0.038
TrpThr: 0.348 ± 0.027
TrpVal: 0.421 ± 0.028
TrpTrp: 0.082 ± 0.012
TrpTyr: 0.333 ± 0.026
TrpXaa: 0.0 ± 0.0
Tyr
TyrAla: 2.732 ± 0.069
TyrCys: 0.781 ± 0.041
TyrAsp: 2.701 ± 0.074
TyrGlu: 2.325 ± 0.066
TyrPhe: 1.84 ± 0.06
TyrGly: 2.952 ± 0.079
TyrHis: 0.622 ± 0.03
TyrIle: 3.377 ± 0.079
TyrLys: 3.07 ± 0.083
TyrLeu: 3.111 ± 0.086
TyrMet: 1.011 ± 0.043
TyrAsn: 2.635 ± 0.084
TyrPro: 1.342 ± 0.058
TyrGln: 1.138 ± 0.043
TyrArg: 1.637 ± 0.057
TyrSer: 3.179 ± 0.089
TyrThr: 2.635 ± 0.075
TyrVal: 2.483 ± 0.071
TyrTrp: 0.298 ± 0.024
TyrTyr: 2.006 ± 0.084
TyrXaa: 0.0 ± 0.0
Xaa
XaaAla: 0.0 ± 0.0
XaaCys: 0.0 ± 0.0
XaaAsp: 0.0 ± 0.0
XaaGlu: 0.0 ± 0.0
XaaPhe: 0.0 ± 0.0
XaaGly: 0.002 ± 0.002
XaaHis: 0.0 ± 0.0
XaaIle: 0.0 ± 0.0
XaaLys: 0.0 ± 0.0
XaaLeu: 0.0 ± 0.0
XaaMet: 0.0 ± 0.0
XaaAsn: 0.0 ± 0.0
XaaPro: 0.0 ± 0.0
XaaGln: 0.002 ± 0.002
XaaArg: 0.0 ± 0.0
XaaSer: 0.0 ± 0.0
XaaThr: 0.0 ± 0.0
XaaVal: 0.0 ± 0.0
XaaTrp: 0.0 ± 0.0
XaaTyr: 0.0 ± 0.0
XaaXaa: 0.009 ± 0.006
Statistics based on 1860 proteins (586356 amino acids)

Note: The error was estimated by bootstrapping (×100) at the protein level.
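A protein-level bootstrap of this kind can be sketched as follows. This is a minimal illustration under stated assumptions, not the database's actual code: whole proteins are resampled with replacement, "×100" is read as 100 replicates, the reported error is taken to be the standard deviation across replicates, and the names `pair_counts` and `bootstrap_permille` are hypothetical.

```python
import random
import statistics
from collections import Counter


def pair_counts(seq):
    """Count overlapping residue pairs within one protein sequence."""
    return Counter(a + b for a, b in zip(seq, seq[1:]))


def bootstrap_permille(proteins, dipeptide, n_boot=100, seed=0):
    """Return (mean, standard deviation) of a dipeptide frequency in per mille.

    Assumption: each replicate draws len(proteins) whole proteins with
    replacement, so residues from one protein always stay together.
    """
    rng = random.Random(seed)
    # Precompute (hits, total pairs) per protein so replicates are cheap.
    per_protein = []
    for seq in proteins:
        counts = pair_counts(seq)
        per_protein.append((counts[dipeptide], sum(counts.values())))

    estimates = []
    for _ in range(n_boot):
        sample = rng.choices(per_protein, k=len(per_protein))
        hits = sum(h for h, _ in sample)
        total = sum(t for _, t in sample)
        estimates.append(1000.0 * hits / total if total else 0.0)
    return statistics.mean(estimates), statistics.stdev(estimates)


if __name__ == "__main__":
    toy = ["MKKLLAKK", "MAAVKLL", "MKLAAKK"]  # toy sequences, not real data
    mean, err = bootstrap_permille(toy, "KK")
    print(f"KK: {mean:.3f} ± {err:.3f}")
```

Resampling whole proteins rather than individual dipeptides keeps the dependence between neighbouring residues in the same protein intact, which is presumably why the error is estimated at the protein level.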

The dipeptide statistics above (among other statistics for this proteome) can be downloaded from this CSV file
See this proteome in: uniprot_link
Proteome-pI is available under Creative Commons Attribution-NoDerivs license, for more details see here

Reference: Kozlowski LP. Proteome-pI 2.0: Proteome Isoelectric Point Database Update. Nucleic Acids Res. 2021, doi: 10.1093/nar/gkab944
Contact: Lukasz P. Kozlowski