Amino acid dipeptide frequency for Clostridium sp. CAG:343

Apart from single amino acid frequencies, one can also calculate the so-called amino acid dipeptide frequency, i.e. how often each ordered pair of adjacent residues occurs.

There are 400 possible dipeptides (441 if Xaa is counted as an additional 21st amino acid representing all non-standard or unknown amino acids). Thus, if dipeptides occurred randomly in proteins with equal probability, each amino acid dipeptide would be present at a frequency of 0.25% (1/400, i.e. 2.5‰). As this is not the case in nature, for better visibility all dipeptides that are more frequent than expected are marked in red, and those that are underrepresented are marked in blue in the table.
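The counting described above can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions: the function name `dipeptide_per_mille`, the one-letter alphabet, the mapping of every non-standard letter to `X` (Xaa), and the use of overlapping pairs within each protein are choices made for this sketch, not a description of how Proteome-pI computes its values. The toy sequences are hypothetical and do not come from this proteome.

```python
from collections import Counter

# The 20 standard residues in one-letter code (assumption of this sketch);
# any other letter is mapped to "X", standing in for Xaa.
STANDARD = "ACDEFGHIKLMNPQRSTVWY"

def dipeptide_per_mille(proteins):
    """Return {dipeptide: frequency in per mille} over adjacent residue pairs."""
    counts = Counter()
    for seq in proteins:
        seq = "".join(c if c in STANDARD else "X" for c in seq.upper())
        # Overlapping pairs inside one protein; pairs never span two proteins.
        counts.update(a + b for a, b in zip(seq, seq[1:]))
    total = sum(counts.values())
    if total == 0:
        return {}
    return {dp: 1000.0 * n / total for dp, n in counts.items()}

# Uniform expectation: 1 of 20 * 20 = 400 dipeptides, i.e. 2.5 per mille (0.25%).
UNIFORM_PER_MILLE = 1000.0 / (len(STANDARD) ** 2)

# Hypothetical toy input, not taken from the proteome.
toy_proteins = ["MKKLEE", "MEEK"]
freqs = dipeptide_per_mille(toy_proteins)

for dp, value in sorted(freqs.items()):
    print(f"{dp}: {value:.1f} per mille (uniform expectation {UNIFORM_PER_MILLE})")
```

With only eight pairs in the toy input, every observed dipeptide lies far above 2.5‰; the comparison becomes meaningful only for a whole proteome.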

All values are given in per mille (‰) and therefore need to be multiplied by 10^-3 to obtain fractions.
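As a small worked example of the unit conversion, the following Python sketch converts two values from the table (GluLys and CysCys) from per mille to fractions and percent, and compares them with the uniform expectation of 2.5‰. The helper names are hypothetical, and the simple greater/less comparison is an assumption of this sketch; the exact rule behind the red and blue marking is not stated here.

```python
# Uniform expectation for 400 equally likely dipeptides, in per mille.
UNIFORM_EXPECTATION = 1000.0 / 400  # 2.5

def per_mille_to_fraction(value):
    """Convert a per mille value to a plain fraction (multiply by 10^-3)."""
    return value * 1e-3

def per_mille_to_percent(value):
    """Convert a per mille value to percent (divide by 10)."""
    return value / 10.0

def representation(value, expected=UNIFORM_EXPECTATION):
    """Label a value relative to the expectation (illustrative rule only)."""
    if value > expected:
        return "over"
    if value < expected:
        return "under"
    return "as expected"

# Two values copied from the table, in per mille.
table_values = {"GluLys": 10.856, "CysCys": 0.15}

for name, value in table_values.items():
    print(name,
          per_mille_to_fraction(value),
          per_mille_to_percent(value),
          representation(value))
```

GluLys lies well above 2.5‰ and CysCys well below it, so under this simple rule they fall on opposite sides of the expectation.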

For more information, see the sequence space article on Wikipedia.

Rows: first amino acid of the dipeptide; columns: second amino acid. Values in ‰ (mean ± error).

| | Ala | Cys | Asp | Glu | Phe | Gly | His | Ile | Lys | Leu | Met | Asn | Pro | Gln | Arg | Ser | Thr | Val | Trp | Tyr | Xaa |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Ala | 2.295 ± 0.073 | 0.598 ± 0.036 | 2.371 ± 0.084 | 3.416 ± 0.105 | 1.916 ± 0.08 | 3.19 ± 0.104 | 0.637 ± 0.04 | 5.573 ± 0.125 | 5.32 ± 0.126 | 3.711 ± 0.105 | 1.338 ± 0.064 | 2.959 ± 0.089 | 1.099 ± 0.056 | 1.279 ± 0.065 | 1.877 ± 0.081 | 2.575 ± 0.08 | 2.737 ± 0.095 | 3.03 ± 0.105 | 0.305 ± 0.025 | 1.97 ± 0.072 | 0.0 ± 0.0 |
| Cys | 0.507 ± 0.035 | 0.15 ± 0.02 | 0.588 ± 0.037 | 0.75 ± 0.042 | 0.465 ± 0.037 | 0.895 ± 0.056 | 0.189 ± 0.024 | 1.092 ± 0.061 | 1.144 ± 0.057 | 0.861 ± 0.045 | 0.298 ± 0.026 | 0.713 ± 0.049 | 0.386 ± 0.03 | 0.236 ± 0.023 | 0.349 ± 0.027 | 0.669 ± 0.046 | 0.548 ± 0.04 | 0.639 ± 0.04 | 0.069 ± 0.015 | 0.507 ± 0.039 | 0.0 ± 0.0 |
| Asp | 2.073 ± 0.072 | 0.529 ± 0.039 | 2.622 ± 0.09 | 4.828 ± 0.118 | 2.585 ± 0.078 | 3.347 ± 0.102 | 0.465 ± 0.03 | 6.296 ± 0.135 | 5.672 ± 0.14 | 4.629 ± 0.115 | 1.372 ± 0.059 | 3.625 ± 0.111 | 1.08 ± 0.061 | 0.863 ± 0.046 | 1.64 ± 0.07 | 3.001 ± 0.106 | 2.873 ± 0.081 | 3.288 ± 0.091 | 0.376 ± 0.03 | 2.619 ± 0.091 | 0.0 ± 0.0 |
| Glu | 3.783 ± 0.119 | 0.711 ± 0.041 | 4.592 ± 0.111 | 9.046 ± 0.174 | 3.097 ± 0.1 | 3.662 ± 0.114 | 0.912 ± 0.044 | 8.985 ± 0.155 | 10.856 ± 0.174 | 7.344 ± 0.144 | 2.061 ± 0.082 | 7.246 ± 0.166 | 1.463 ± 0.069 | 2.934 ± 0.094 | 2.622 ± 0.079 | 3.458 ± 0.097 | 3.891 ± 0.098 | 4.508 ± 0.114 | 0.46 ± 0.037 | 3.839 ± 0.105 | 0.0 ± 0.0 |
| Phe | 1.99 ± 0.079 | 0.514 ± 0.034 | 2.418 ± 0.085 | 3.217 ± 0.095 | 1.643 ± 0.077 | 2.506 ± 0.086 | 0.453 ± 0.041 | 3.994 ± 0.129 | 3.655 ± 0.116 | 3.529 ± 0.109 | 0.984 ± 0.051 | 2.585 ± 0.086 | 0.969 ± 0.044 | 0.858 ± 0.046 | 1.107 ± 0.052 | 2.661 ± 0.101 | 1.948 ± 0.063 | 2.327 ± 0.067 | 0.315 ± 0.027 | 1.741 ± 0.063 | 0.0 ± 0.0 |
| Gly | 3.003 ± 0.099 | 0.671 ± 0.051 | 2.553 ± 0.084 | 4.051 ± 0.11 | 2.398 ± 0.085 | 3.151 ± 0.109 | 0.905 ± 0.052 | 6.481 ± 0.121 | 6.073 ± 0.131 | 4.415 ± 0.114 | 1.54 ± 0.065 | 3.392 ± 0.106 | 0.927 ± 0.052 | 1.436 ± 0.068 | 1.965 ± 0.086 | 2.816 ± 0.09 | 3.475 ± 0.096 | 3.603 ± 0.107 | 0.379 ± 0.032 | 2.841 ± 0.091 | 0.0 ± 0.0 |
| His | 0.544 ± 0.04 | 0.199 ± 0.022 | 0.568 ± 0.036 | 0.733 ± 0.039 | 0.553 ± 0.036 | 0.812 ± 0.047 | 0.239 ± 0.025 | 1.289 ± 0.063 | 1.058 ± 0.053 | 1.075 ± 0.057 | 0.322 ± 0.029 | 0.748 ± 0.042 | 0.521 ± 0.038 | 0.347 ± 0.03 | 0.497 ± 0.038 | 0.789 ± 0.047 | 0.654 ± 0.045 | 0.615 ± 0.044 | 0.103 ± 0.013 | 0.526 ± 0.035 | 0.0 ± 0.0 |
| Ile | 5.812 ± 0.128 | 1.363 ± 0.06 | 6.082 ± 0.128 | 8.911 ± 0.165 | 4.213 ± 0.141 | 5.654 ± 0.139 | 1.077 ± 0.054 | 10.851 ± 0.241 | 10.197 ± 0.179 | 9.432 ± 0.188 | 2.199 ± 0.08 | 6.766 ± 0.156 | 3.323 ± 0.098 | 2.782 ± 0.088 | 3.195 ± 0.114 | 6.835 ± 0.143 | 5.391 ± 0.121 | 6.205 ± 0.127 | 0.48 ± 0.033 | 4.375 ± 0.105 | 0.0 ± 0.0 |
| Lys | 4.801 ± 0.118 | 0.947 ± 0.05 | 6.124 ± 0.146 | 11.574 ± 0.231 | 3.458 ± 0.1 | 4.739 ± 0.111 | 1.134 ± 0.056 | 11.301 ± 0.2 | 10.802 ± 0.198 | 7.949 ± 0.131 | 2.964 ± 0.077 | 8.136 ± 0.168 | 2.073 ± 0.076 | 3.593 ± 0.109 | 3.532 ± 0.097 | 4.953 ± 0.109 | 5.482 ± 0.126 | 6.213 ± 0.137 | 0.632 ± 0.04 | 5.143 ± 0.118 | 0.0 ± 0.0 |
| Leu | 4.393 ± 0.105 | 0.844 ± 0.058 | 4.599 ± 0.1 | 7.315 ± 0.144 | 3.2 ± 0.093 | 5.01 ± 0.125 | 1.013 ± 0.05 | 7.489 ± 0.166 | 9.159 ± 0.143 | 6.805 ± 0.153 | 1.781 ± 0.063 | 5.701 ± 0.116 | 2.41 ± 0.084 | 2.078 ± 0.079 | 2.644 ± 0.102 | 5.266 ± 0.126 | 4.213 ± 0.108 | 4.939 ± 0.122 | 0.485 ± 0.037 | 3.406 ± 0.092 | 0.0 ± 0.0 |
| Met | 1.412 ± 0.067 | 0.3 ± 0.033 | 1.173 ± 0.06 | 1.862 ± 0.073 | 0.972 ± 0.047 | 1.294 ± 0.062 | 0.347 ± 0.028 | 2.135 ± 0.075 | 2.681 ± 0.076 | 2.238 ± 0.067 | 0.553 ± 0.035 | 1.577 ± 0.055 | 0.935 ± 0.046 | 0.979 ± 0.053 | 0.706 ± 0.044 | 1.429 ± 0.053 | 1.19 ± 0.054 | 1.215 ± 0.063 | 0.15 ± 0.017 | 0.969 ± 0.053 | 0.0 ± 0.0 |
| Asn | 2.752 ± 0.098 | 0.817 ± 0.053 | 3.207 ± 0.091 | 5.071 ± 0.127 | 2.723 ± 0.107 | 3.839 ± 0.101 | 0.792 ± 0.044 | 8.146 ± 0.169 | 7.961 ± 0.174 | 5.886 ± 0.122 | 1.682 ± 0.056 | 5.499 ± 0.167 | 2.152 ± 0.076 | 2.216 ± 0.079 | 2.199 ± 0.08 | 4.459 ± 0.125 | 3.756 ± 0.122 | 3.596 ± 0.099 | 0.435 ± 0.033 | 3.224 ± 0.091 | 0.0 ± 0.0 |
| Pro | 1.163 ± 0.064 | 0.342 ± 0.029 | 1.51 ± 0.069 | 2.44 ± 0.092 | 1.033 ± 0.053 | 1.399 ± 0.068 | 0.394 ± 0.033 | 2.654 ± 0.082 | 2.287 ± 0.087 | 1.795 ± 0.067 | 0.544 ± 0.03 | 1.741 ± 0.057 | 0.423 ± 0.039 | 0.679 ± 0.037 | 0.777 ± 0.048 | 1.316 ± 0.055 | 1.365 ± 0.06 | 1.695 ± 0.067 | 0.202 ± 0.024 | 1.195 ± 0.055 | 0.0 ± 0.0 |
| Gln | 1.419 ± 0.055 | 0.214 ± 0.025 | 1.5 ± 0.057 | 2.966 ± 0.096 | 0.917 ± 0.05 | 1.505 ± 0.067 | 0.263 ± 0.027 | 3.072 ± 0.102 | 3.183 ± 0.087 | 2.204 ± 0.084 | 0.787 ± 0.05 | 2.25 ± 0.074 | 0.509 ± 0.037 | 0.777 ± 0.053 | 1.065 ± 0.055 | 1.289 ± 0.055 | 1.458 ± 0.062 | 1.66 ± 0.062 | 0.143 ± 0.018 | 1.326 ± 0.057 | 0.0 ± 0.0 |
| Arg | 1.424 ± 0.059 | 0.389 ± 0.039 | 1.741 ± 0.071 | 2.701 ± 0.089 | 1.392 ± 0.063 | 1.665 ± 0.076 | 0.467 ± 0.032 | 3.124 ± 0.104 | 3.891 ± 0.092 | 2.639 ± 0.091 | 0.849 ± 0.045 | 2.307 ± 0.084 | 0.853 ± 0.046 | 1.05 ± 0.051 | 1.348 ± 0.066 | 1.291 ± 0.049 | 1.751 ± 0.075 | 1.896 ± 0.078 | 0.253 ± 0.028 | 1.537 ± 0.058 | 0.0 ± 0.0 |
| Ser | 2.378 ± 0.08 | 0.595 ± 0.046 | 3.025 ± 0.096 | 4.393 ± 0.116 | 2.339 ± 0.074 | 3.552 ± 0.108 | 0.674 ± 0.043 | 5.89 ± 0.133 | 6.016 ± 0.121 | 4.447 ± 0.116 | 1.267 ± 0.06 | 4.334 ± 0.138 | 1.276 ± 0.058 | 1.734 ± 0.071 | 1.783 ± 0.085 | 3.925 ± 0.135 | 3.207 ± 0.107 | 3.015 ± 0.086 | 0.369 ± 0.033 | 2.656 ± 0.093 | 0.0 ± 0.0 |
| Thr | 2.836 ± 0.095 | 0.516 ± 0.038 | 2.927 ± 0.093 | 3.719 ± 0.094 | 2.12 ± 0.071 | 3.591 ± 0.098 | 0.745 ± 0.041 | 5.423 ± 0.125 | 5.025 ± 0.11 | 4.42 ± 0.101 | 1.102 ± 0.052 | 3.517 ± 0.114 | 1.581 ± 0.064 | 1.493 ± 0.064 | 1.697 ± 0.073 | 3.365 ± 0.101 | 3.136 ± 0.128 | 3.315 ± 0.106 | 0.366 ± 0.032 | 2.425 ± 0.093 | 0.0 ± 0.0 |
| Val | 3.357 ± 0.093 | 0.689 ± 0.048 | 3.411 ± 0.105 | 4.621 ± 0.111 | 2.159 ± 0.07 | 3.293 ± 0.101 | 0.772 ± 0.042 | 5.952 ± 0.132 | 5.48 ± 0.136 | 5.057 ± 0.114 | 1.237 ± 0.05 | 3.539 ± 0.103 | 1.672 ± 0.065 | 1.611 ± 0.059 | 1.936 ± 0.082 | 3.571 ± 0.091 | 3.32 ± 0.1 | 3.82 ± 0.106 | 0.396 ± 0.036 | 2.327 ± 0.073 | 0.0 ± 0.0 |
| Trp | 0.271 ± 0.024 | 0.093 ± 0.015 | 0.332 ± 0.027 | 0.445 ± 0.035 | 0.322 ± 0.03 | 0.408 ± 0.035 | 0.118 ± 0.018 | 0.649 ± 0.045 | 0.598 ± 0.04 | 0.578 ± 0.044 | 0.157 ± 0.022 | 0.48 ± 0.037 | 0.125 ± 0.019 | 0.234 ± 0.022 | 0.202 ± 0.021 | 0.342 ± 0.032 | 0.295 ± 0.03 | 0.288 ± 0.031 | 0.071 ± 0.014 | 0.303 ± 0.027 | 0.0 ± 0.0 |
| Tyr | 1.985 ± 0.077 | 0.576 ± 0.038 | 2.44 ± 0.078 | 3.411 ± 0.098 | 1.909 ± 0.07 | 2.56 ± 0.087 | 0.595 ± 0.039 | 4.747 ± 0.103 | 4.498 ± 0.122 | 3.736 ± 0.102 | 1.077 ± 0.047 | 3.298 ± 0.104 | 1.173 ± 0.061 | 1.328 ± 0.048 | 1.429 ± 0.069 | 2.865 ± 0.101 | 2.627 ± 0.088 | 2.354 ± 0.071 | 0.322 ± 0.031 | 2.152 ± 0.083 | 0.0 ± 0.0 |
| Xaa | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 |

Statistics based on 1443 proteins (406588 amino acids)

Note: The error has been estimated by bootstrapping (×100) at the protein level.
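One way to read "bootstrapping at the protein level" is that whole proteins are resampled with replacement, the dipeptide frequency is recomputed for each resampled set, and the spread of those estimates is reported as the error. The Python sketch below illustrates that idea under stated assumptions: the function names, the fixed random seed, the use of the sample standard deviation as the error, and the toy sequences are all hypothetical and are not taken from the Proteome-pI implementation.

```python
import random
import statistics
from collections import Counter

def dipeptide_counts(seq):
    """Count overlapping adjacent residue pairs in one protein sequence."""
    return Counter(a + b for a, b in zip(seq, seq[1:]))

def bootstrap_error(proteins, dipeptide, n_boot=100, seed=0):
    """Return (mean, spread) of a dipeptide frequency in per mille.

    Whole proteins are resampled with replacement n_boot times; the spread
    is the sample standard deviation of the resampled estimates.
    """
    rng = random.Random(seed)
    per_protein = [dipeptide_counts(p) for p in proteins]
    estimates = []
    for _ in range(n_boot):
        sample = rng.choices(per_protein, k=len(per_protein))
        total = sum(sum(c.values()) for c in sample)
        hits = sum(c[dipeptide] for c in sample)  # missing keys count as 0
        estimates.append(1000.0 * hits / total if total else 0.0)
    return statistics.mean(estimates), statistics.stdev(estimates)

# Hypothetical toy input, not taken from the proteome.
toy_proteins = ["MKKLEE", "MEEK", "AKAKEE"]
mean_ee, err_ee = bootstrap_error(toy_proteins, "EE")
print(f"EE: {mean_ee:.1f} ± {err_ee:.1f} per mille")
```

Resampling proteins rather than individual dipeptides keeps the pairs from one protein together, so the estimated error reflects variation between proteins.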

The above dipeptide statistics (among other stats for this proteome) can be downloaded from this CSV file
See this proteome in: uniprot_link
Proteome-pI is available under Creative Commons Attribution-NoDerivs license, for more details see here

Reference: Kozlowski LP. Proteome-pI 2.0: Proteome Isoelectric Point Database Update. Nucleic Acids Res. 2021, doi: 10.1093/nar/gkab944
Contact: Lukasz P. Kozlowski