Amino acid dipeptide frequency for Clostridium sp. CAG:440

Apart from single amino acid frequencies, one can also calculate the so-called amino acid dipeptide frequency, i.e. how often each ordered pair of adjacent amino acids occurs.
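As a minimal sketch of how such frequencies could be computed (not necessarily the procedure used for this page), the Python snippet below counts overlapping adjacent pairs within each protein and reports them in per mille. The function name dipeptide_per_mille, the use of one-letter codes, and the mapping of every non-standard residue to 'X' (Xaa) are assumptions made for illustration.

```python
from collections import Counter

# The 20 standard amino acids in one-letter code (the table uses three-letter codes).
STANDARD = set("ACDEFGHIKLMNPQRSTVWY")


def dipeptide_per_mille(proteins):
    """Return {dipeptide: frequency in per mille} over all adjacent residue pairs."""
    counts = Counter()
    for seq in proteins:
        # Assumption: anything outside the 20 standard residues becomes 'X' (Xaa).
        cleaned = "".join(aa if aa in STANDARD else "X" for aa in seq.upper())
        # Overlapping pairs (i, i+1); pairs never span two different proteins.
        counts.update(cleaned[i:i + 2] for i in range(len(cleaned) - 1))
    total = sum(counts.values())
    if total == 0:
        return {}
    return {pair: 1000.0 * n / total for pair, n in counts.items()}


# Toy input, not real data: two short made-up sequences.
freqs = dipeptide_per_mille(["MKKIE", "MKU"])
```

Because the pairs are ordered, "MK" and "KM" are counted separately, which matches the table, where for example AlaCys and CysAla have different values.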

There are 400 possibilities (441 if Xaa is counted as an additional 21st amino acid representing all non-standard or unknown amino acids). Thus, if dipeptides occurred randomly with equal probability in proteins, each of the 400 dipeptides would be present at a frequency of 0.25% (2.5‰). As this is not the case in nature, for better visibility all dipeptides that occur more often than expected are marked in red in the table, and those that are underrepresented are marked in blue.
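The comparison against the uniform expectation can be sketched as follows. This is only an illustration: the exact rule the page uses for colouring cells is not stated, so the simple greater-than or less-than test, the function name classify, and the label strings are assumptions. The three example values are taken from the table.

```python
# Uniform expectation over the 400 standard dipeptides: 1/400 = 0.25% = 2.5 per mille.
EXPECTED_PER_MILLE = 1000.0 / 400


def classify(freq_per_mille, expected=EXPECTED_PER_MILLE):
    """Label a dipeptide frequency relative to the uniform expectation."""
    if freq_per_mille > expected:
        return "over"   # would correspond to a red cell
    if freq_per_mille < expected:
        return "under"  # would correspond to a blue cell
    return "expected"


# Values (in per mille) copied from the table for this proteome.
examples = {"LysIle": 11.128, "AlaAla": 2.66, "TrpTrp": 0.084}
labels = {name: classify(value) for name, value in examples.items()}
# Fold difference relative to the uniform expectation.
ratios = {name: value / EXPECTED_PER_MILLE for name, value in examples.items()}
```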

All values are given in per mille (‰) and therefore need to be multiplied by 10⁻³ to obtain fractions.

For more information, see the sequence space article on Wikipedia.

Ala Cys Asp Glu Phe Gly His Ile Lys Leu Met Asn Pro Gln Arg Ser Thr Val Trp Tyr Xaa

Ala
AlaAla: 2.66 ± 0.135
AlaCys: 0.554 ± 0.039
AlaAsp: 2.266 ± 0.084
AlaGlu: 2.988 ± 0.087
AlaPhe: 1.809 ± 0.073
AlaGly: 3.368 ± 0.116
AlaHis: 0.672 ± 0.037
AlaIle: 5.504 ± 0.136
AlaLys: 5.903 ± 0.127
AlaLeu: 3.922 ± 0.115
AlaMet: 1.283 ± 0.057
AlaAsn: 3.413 ± 0.119
AlaPro: 1.054 ± 0.056
AlaGln: 1.551 ± 0.063
AlaArg: 1.671 ± 0.067
AlaSer: 2.772 ± 0.084
AlaThr: 2.882 ± 0.126
AlaVal: 3.1 ± 0.108
AlaTrp: 0.301 ± 0.03
AlaTyr: 1.867 ± 0.066
AlaXaa: 0.0 ± 0.0

Cys
CysAla: 0.5 ± 0.038
CysCys: 0.141 ± 0.022
CysAsp: 0.521 ± 0.039
CysGlu: 0.681 ± 0.042
CysPhe: 0.485 ± 0.036
CysGly: 0.88 ± 0.049
CysHis: 0.184 ± 0.022
CysIle: 1.037 ± 0.047
CysLys: 1.181 ± 0.051
CysLeu: 0.848 ± 0.045
CysMet: 0.251 ± 0.025
CysAsn: 0.772 ± 0.047
CysPro: 0.402 ± 0.034
CysGln: 0.256 ± 0.027
CysArg: 0.284 ± 0.029
CysSer: 0.698 ± 0.05
CysThr: 0.631 ± 0.036
CysVal: 0.636 ± 0.043
CysTrp: 0.062 ± 0.012
CysTyr: 0.44 ± 0.03
CysXaa: 0.0 ± 0.0

Asp
AspAla: 2.38 ± 0.092
AspCys: 0.531 ± 0.038
AspAsp: 2.887 ± 0.099
AspGlu: 5.084 ± 0.113
AspPhe: 2.404 ± 0.089
AspGly: 3.482 ± 0.151
AspHis: 0.495 ± 0.037
AspIle: 6.106 ± 0.127
AspLys: 5.332 ± 0.119
AspLeu: 4.601 ± 0.123
AspMet: 1.312 ± 0.058
AspAsn: 3.791 ± 0.109
AspPro: 0.999 ± 0.05
AspGln: 0.906 ± 0.048
AspArg: 1.379 ± 0.062
AspSer: 2.933 ± 0.086
AspThr: 2.725 ± 0.099
AspVal: 3.587 ± 0.087
AspTrp: 0.33 ± 0.029
AspTyr: 2.772 ± 0.091
AspXaa: 0.0 ± 0.0

Glu
GluAla: 3.719 ± 0.105
GluCys: 0.688 ± 0.043
GluAsp: 4.374 ± 0.125
GluGlu: 8.226 ± 0.193
GluPhe: 2.933 ± 0.101
GluGly: 3.63 ± 0.113
GluHis: 0.801 ± 0.051
GluIle: 8.207 ± 0.135
GluLys: 9.95 ± 0.196
GluLeu: 6.599 ± 0.15
GluMet: 1.781 ± 0.083
GluAsn: 7.725 ± 0.164
GluPro: 1.444 ± 0.062
GluGln: 2.808 ± 0.094
GluArg: 2.309 ± 0.095
GluSer: 3.411 ± 0.089
GluThr: 3.695 ± 0.091
GluVal: 4.512 ± 0.127
GluTrp: 0.428 ± 0.034
GluTyr: 3.566 ± 0.116
GluXaa: 0.0 ± 0.0

Phe
PheAla: 2.029 ± 0.068
PheCys: 0.485 ± 0.037
PheAsp: 2.44 ± 0.085
PheGlu: 3.076 ± 0.096
PhePhe: 1.503 ± 0.062
PheGly: 2.421 ± 0.082
PheHis: 0.404 ± 0.035
PheIle: 3.85 ± 0.124
PheLys: 3.566 ± 0.101
PheLeu: 3.368 ± 0.109
PheMet: 0.951 ± 0.049
PheAsn: 2.588 ± 0.089
PhePro: 1.011 ± 0.055
PheGln: 0.798 ± 0.043
PheArg: 1.023 ± 0.05
PheSer: 2.536 ± 0.087
PheThr: 2.103 ± 0.071
PheVal: 2.457 ± 0.079
PheTrp: 0.277 ± 0.028
PheTyr: 1.64 ± 0.064
PheXaa: 0.0 ± 0.0

Gly
GlyAla: 3.009 ± 0.105
GlyCys: 0.619 ± 0.049
GlyAsp: 2.703 ± 0.103
GlyGlu: 3.829 ± 0.109
GlyPhe: 2.533 ± 0.083
GlyGly: 3.286 ± 0.127
GlyHis: 0.858 ± 0.049
GlyIle: 6.281 ± 0.136
GlyLys: 6.006 ± 0.133
GlyLeu: 4.445 ± 0.099
GlyMet: 1.453 ± 0.071
GlyAsn: 3.757 ± 0.15
GlyPro: 0.896 ± 0.054
GlyGln: 1.635 ± 0.084
GlyArg: 1.879 ± 0.073
GlySer: 3.04 ± 0.105
GlyThr: 3.97 ± 0.174
GlyVal: 3.716 ± 0.115
GlyTrp: 0.423 ± 0.051
GlyTyr: 2.782 ± 0.12
GlyXaa: 0.0 ± 0.0

His
HisAla: 0.535 ± 0.038
HisCys: 0.174 ± 0.018
HisAsp: 0.543 ± 0.032
HisGlu: 0.717 ± 0.041
HisPhe: 0.523 ± 0.038
HisGly: 0.779 ± 0.042
HisHis: 0.229 ± 0.028
HisIle: 1.221 ± 0.056
HisLys: 1.059 ± 0.053
HisLeu: 0.966 ± 0.055
HisMet: 0.275 ± 0.027
HisAsn: 0.748 ± 0.041
HisPro: 0.528 ± 0.036
HisGln: 0.284 ± 0.026
HisArg: 0.428 ± 0.034
HisSer: 0.784 ± 0.045
HisThr: 0.614 ± 0.036
HisVal: 0.645 ± 0.042
HisTrp: 0.069 ± 0.013
HisTyr: 0.468 ± 0.034
HisXaa: 0.0 ± 0.0

Ile
IleAla: 5.81 ± 0.156
IleCys: 1.343 ± 0.057
IleAsp: 5.765 ± 0.133
IleGlu: 8.425 ± 0.162
IlePhe: 3.946 ± 0.12
IleGly: 5.487 ± 0.13
IleHis: 1.133 ± 0.055
IleIle: 9.95 ± 0.224
IleLys: 9.945 ± 0.181
IleLeu: 8.943 ± 0.194
IleMet: 2.125 ± 0.074
IleAsn: 7.019 ± 0.15
IlePro: 3.026 ± 0.094
IleGln: 2.531 ± 0.091
IleArg: 2.686 ± 0.085
IleSer: 6.79 ± 0.145
IleThr: 6.097 ± 0.135
IleVal: 6.269 ± 0.112
IleTrp: 0.502 ± 0.033
IleTyr: 4.295 ± 0.118
IleXaa: 0.0 ± 0.0

Lys
LysAla: 4.732 ± 0.119
LysCys: 0.98 ± 0.049
LysAsp: 6.226 ± 0.154
LysGlu: 10.619 ± 0.195
LysPhe: 3.444 ± 0.096
LysGly: 4.364 ± 0.099
LysHis: 1.159 ± 0.056
LysIle: 11.128 ± 0.19
LysLys: 10.526 ± 0.212
LysLeu: 8.344 ± 0.165
LysMet: 2.956 ± 0.086
LysAsn: 8.556 ± 0.178
LysPro: 2.098 ± 0.078
LysGln: 4.135 ± 0.127
LysArg: 3.162 ± 0.11
LysSer: 4.954 ± 0.105
LysThr: 5.887 ± 0.114
LysVal: 6.145 ± 0.154
LysTrp: 0.664 ± 0.037
LysTyr: 5.478 ± 0.131
LysXaa: 0.0 ± 0.0

Leu
LeuAla: 4.62 ± 0.113
LeuCys: 0.87 ± 0.047
LeuAsp: 4.854 ± 0.122
LeuGlu: 7.039 ± 0.181
LeuPhe: 3.047 ± 0.106
LeuGly: 4.909 ± 0.131
LeuHis: 0.956 ± 0.049
LeuIle: 7.438 ± 0.165
LeuLys: 8.661 ± 0.165
LeuLeu: 6.62 ± 0.186
LeuMet: 1.718 ± 0.063
LeuAsn: 6.061 ± 0.154
LeuPro: 2.433 ± 0.078
LeuGln: 2.194 ± 0.086
LeuArg: 2.409 ± 0.09
LeuSer: 5.337 ± 0.132
LeuThr: 4.271 ± 0.119
LeuVal: 4.711 ± 0.115
LeuTrp: 0.485 ± 0.035
LeuTyr: 3.363 ± 0.103
LeuXaa: 0.0 ± 0.0

Met
MetAla: 1.513 ± 0.065
MetCys: 0.304 ± 0.026
MetAsp: 1.147 ± 0.056
MetGlu: 1.89 ± 0.07
MetPhe: 0.87 ± 0.051
MetGly: 1.24 ± 0.061
MetHis: 0.296 ± 0.029
MetIle: 1.972 ± 0.073
MetLys: 2.598 ± 0.07
MetLeu: 2.163 ± 0.075
MetMet: 0.562 ± 0.042
MetAsn: 1.439 ± 0.06
MetPro: 0.994 ± 0.052
MetGln: 1.021 ± 0.052
MetArg: 0.598 ± 0.041
MetSer: 1.326 ± 0.06
MetThr: 1.066 ± 0.057
MetVal: 1.389 ± 0.059
MetTrp: 0.155 ± 0.021
MetTyr: 0.927 ± 0.051
MetXaa: 0.0 ± 0.0

Asn
AsnAla: 3.112 ± 0.092
AsnCys: 0.863 ± 0.055
AsnAsp: 3.575 ± 0.104
AsnGlu: 5.528 ± 0.137
AsnPhe: 2.694 ± 0.083
AsnGly: 4.529 ± 0.179
AsnHis: 0.741 ± 0.047
AsnIle: 8.207 ± 0.15
AsnLys: 8.504 ± 0.188
AsnLeu: 6.353 ± 0.152
AsnMet: 1.845 ± 0.059
AsnAsn: 6.307 ± 0.174
AsnPro: 2.012 ± 0.073
AsnGln: 2.06 ± 0.085
AsnArg: 1.855 ± 0.077
AsnSer: 4.586 ± 0.129
AsnThr: 4.147 ± 0.136
AsnVal: 4.445 ± 0.111
AsnTrp: 0.576 ± 0.045
AsnTyr: 3.403 ± 0.102
AsnXaa: 0.0 ± 0.0

Pro
ProAla: 1.13 ± 0.055
ProCys: 0.332 ± 0.029
ProAsp: 1.412 ± 0.064
ProGlu: 2.278 ± 0.083
ProPhe: 1.049 ± 0.059
ProGly: 1.336 ± 0.059
ProHis: 0.337 ± 0.028
ProIle: 2.409 ± 0.08
ProLys: 2.254 ± 0.072
ProLeu: 1.781 ± 0.072
ProMet: 0.543 ± 0.036
ProAsn: 1.766 ± 0.063
ProPro: 0.454 ± 0.045
ProGln: 0.731 ± 0.04
ProArg: 0.695 ± 0.051
ProSer: 1.389 ± 0.057
ProThr: 1.46 ± 0.064
ProVal: 1.726 ± 0.068
ProTrp: 0.182 ± 0.023
ProTyr: 1.092 ± 0.052
ProXaa: 0.0 ± 0.0

Gln
GlnAla: 1.374 ± 0.06
GlnCys: 0.198 ± 0.023
GlnAsp: 1.745 ± 0.077
GlnGlu: 2.916 ± 0.107
GlnPhe: 0.894 ± 0.043
GlnGly: 1.637 ± 0.058
GlnHis: 0.203 ± 0.02
GlnIle: 3.346 ± 0.1
GlnLys: 3.344 ± 0.101
GlnLeu: 1.996 ± 0.095
GlnMet: 0.779 ± 0.035
GlnAsn: 2.662 ± 0.083
GlnPro: 0.413 ± 0.032
GlnGln: 0.562 ± 0.034
GlnArg: 0.989 ± 0.049
GlnSer: 1.439 ± 0.066
GlnThr: 1.757 ± 0.068
GlnVal: 1.506 ± 0.059
GlnTrp: 0.198 ± 0.024
GlnTyr: 1.264 ± 0.059
GlnXaa: 0.0 ± 0.0

Arg
ArgAla: 1.391 ± 0.069
ArgCys: 0.342 ± 0.028
ArgAsp: 1.508 ± 0.065
ArgGlu: 2.106 ± 0.079
ArgPhe: 1.217 ± 0.055
ArgGly: 1.513 ± 0.073
ArgHis: 0.437 ± 0.034
ArgIle: 2.921 ± 0.097
ArgLys: 3.296 ± 0.094
ArgLeu: 2.457 ± 0.087
ArgMet: 0.772 ± 0.047
ArgAsn: 1.941 ± 0.074
ArgPro: 0.827 ± 0.048
ArgGln: 1.011 ± 0.051
ArgArg: 1.219 ± 0.068
ArgSer: 1.178 ± 0.056
ArgThr: 1.616 ± 0.069
ArgVal: 1.783 ± 0.063
ArgTrp: 0.213 ± 0.024
ArgTyr: 1.434 ± 0.058
ArgXaa: 0.0 ± 0.0

Ser
SerAla: 2.376 ± 0.076
SerCys: 0.595 ± 0.041
SerAsp: 2.866 ± 0.088
SerGlu: 3.776 ± 0.098
SerPhe: 2.471 ± 0.082
SerGly: 3.58 ± 0.118
SerHis: 0.655 ± 0.042
SerIle: 5.91 ± 0.14
SerLys: 6.472 ± 0.135
SerLeu: 4.813 ± 0.112
SerMet: 1.245 ± 0.054
SerAsn: 4.704 ± 0.133
SerPro: 1.245 ± 0.053
SerGln: 1.905 ± 0.074
SerArg: 1.632 ± 0.065
SerSer: 4.171 ± 0.155
SerThr: 3.329 ± 0.114
SerVal: 3.131 ± 0.102
SerTrp: 0.413 ± 0.034
SerTyr: 2.787 ± 0.099
SerXaa: 0.0 ± 0.0

Thr
ThrAla: 2.83 ± 0.147
ThrCys: 0.617 ± 0.043
ThrAsp: 2.918 ± 0.09
ThrGlu: 3.205 ± 0.087
ThrPhe: 2.232 ± 0.076
ThrGly: 3.97 ± 0.148
ThrHis: 0.746 ± 0.036
ThrIle: 5.779 ± 0.15
ThrLys: 5.652 ± 0.128
ThrLeu: 4.639 ± 0.119
ThrMet: 1.13 ± 0.051
ThrAsn: 4.068 ± 0.14
ThrPro: 1.668 ± 0.064
ThrGln: 1.745 ± 0.08
ThrArg: 1.628 ± 0.065
ThrSer: 3.695 ± 0.133
ThrThr: 3.408 ± 0.156
ThrVal: 3.681 ± 0.149
ThrTrp: 0.435 ± 0.038
ThrTyr: 2.536 ± 0.113
ThrXaa: 0.0 ± 0.0

Val
ValAla: 3.411 ± 0.104
ValCys: 0.676 ± 0.043
ValAsp: 3.401 ± 0.118
ValGlu: 4.632 ± 0.121
ValPhe: 2.259 ± 0.081
ValGly: 3.585 ± 0.109
ValHis: 0.643 ± 0.039
ValIle: 5.796 ± 0.133
ValLys: 6.001 ± 0.138
ValLeu: 5.0 ± 0.117
ValMet: 1.369 ± 0.057
ValAsn: 3.879 ± 0.106
ValPro: 1.675 ± 0.069
ValGln: 1.659 ± 0.064
ValArg: 1.785 ± 0.091
ValSer: 3.869 ± 0.116
ValThr: 3.853 ± 0.176
ValVal: 3.81 ± 0.101
ValTrp: 0.378 ± 0.047
ValTyr: 2.481 ± 0.083
ValXaa: 0.0 ± 0.0

Trp
TrpAla: 0.327 ± 0.027
TrpCys: 0.069 ± 0.014
TrpAsp: 0.315 ± 0.028
TrpGlu: 0.442 ± 0.036
TrpPhe: 0.287 ± 0.03
TrpGly: 0.435 ± 0.063
TrpHis: 0.112 ± 0.017
TrpIle: 0.509 ± 0.033
TrpLys: 0.578 ± 0.04
TrpLeu: 0.526 ± 0.037
TrpMet: 0.172 ± 0.024
TrpAsn: 0.492 ± 0.035
TrpPro: 0.122 ± 0.02
TrpGln: 0.272 ± 0.024
TrpArg: 0.206 ± 0.021
TrpSer: 0.363 ± 0.027
TrpThr: 0.368 ± 0.035
TrpVal: 0.378 ± 0.032
TrpTrp: 0.084 ± 0.014
TrpTyr: 0.37 ± 0.028
TrpXaa: 0.0 ± 0.0

Tyr
TyrAla: 2.082 ± 0.082
TyrCys: 0.507 ± 0.037
TyrAsp: 2.569 ± 0.086
TyrGlu: 3.348 ± 0.092
TyrPhe: 1.926 ± 0.083
TyrGly: 2.689 ± 0.098
TyrHis: 0.531 ± 0.037
TyrIle: 4.515 ± 0.117
TyrLys: 4.732 ± 0.113
TyrLeu: 3.614 ± 0.106
TyrMet: 0.944 ± 0.054
TyrAsn: 3.654 ± 0.108
TyrPro: 1.054 ± 0.056
TyrGln: 1.159 ± 0.056
TyrArg: 1.343 ± 0.057
TyrSer: 2.806 ± 0.094
TyrThr: 2.703 ± 0.107
TyrVal: 2.536 ± 0.087
TyrTrp: 0.253 ± 0.026
TyrTyr: 2.065 ± 0.091
TyrXaa: 0.0 ± 0.0

Xaa
XaaAla: 0.0 ± 0.0
XaaCys: 0.0 ± 0.0
XaaAsp: 0.0 ± 0.0
XaaGlu: 0.0 ± 0.0
XaaPhe: 0.0 ± 0.0
XaaGly: 0.0 ± 0.0
XaaHis: 0.0 ± 0.0
XaaIle: 0.0 ± 0.0
XaaLys: 0.0 ± 0.0
XaaLeu: 0.0 ± 0.0
XaaMet: 0.0 ± 0.0
XaaAsn: 0.0 ± 0.0
XaaPro: 0.0 ± 0.0
XaaGln: 0.0 ± 0.0
XaaArg: 0.0 ± 0.0
XaaSer: 0.0 ± 0.0
XaaThr: 0.0 ± 0.0
XaaVal: 0.0 ± 0.0
XaaTrp: 0.0 ± 0.0
XaaTyr: 0.0 ± 0.0
XaaXaa: 0.0 ± 0.0

Statistics based on 1420 proteins (418409 amino acids)
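
Each entry in the listing above ends with a dipeptide label, a mean in per mille, and an error, for example "AlaAla: 2.66 ± 0.135". A possible way to read such entries programmatically is sketched below; the function name parse_entry and the returned field names are hypothetical, and the regular expression only assumes the label-colon-mean-±-error pattern visible in the entries.

```python
import re

# Two three-letter residue codes, a colon, then "mean ± error".
ENTRY = re.compile(r"([A-Z][a-z]{2})([A-Z][a-z]{2}):\s*([0-9.]+)\s*±\s*([0-9.]+)")


def parse_entry(line):
    """Parse one table entry; return None for lines that are not entries."""
    match = ENTRY.search(line)
    if match is None:
        return None
    first, second, mean, error = match.groups()
    return {
        "first": first,                  # N-terminal residue of the pair
        "second": second,                # C-terminal residue of the pair
        "per_mille": float(mean),
        "error": float(error),
        "fraction": float(mean) * 1e-3,  # per mille -> fraction
    }


entry = parse_entry("AlaAla: 2.66 ± 0.135")
row_label = parse_entry("Ala")  # row labels carry no value, so this is None
```

Using search rather than a full-line match keeps the parser tolerant of extra text before the label.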

Note: The error has been estimated by bootstrapping (100 replicates) at the protein level.
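
A protein-level bootstrap can be sketched as follows. The page only states that bootstrapping was done 100 times at the protein level, so the details here are assumptions: whole proteins are resampled with replacement, the frequency is recomputed for each replicate, and the standard deviation across replicates is reported as the error. The names pair_counts and bootstrap_error are hypothetical.

```python
import random
import statistics
from collections import Counter


def pair_counts(seq):
    """Count overlapping adjacent residue pairs in one protein."""
    return Counter(seq[i:i + 2] for i in range(len(seq) - 1))


def bootstrap_error(proteins, pair, n_boot=100, seed=0):
    """Return (mean, standard deviation) of a dipeptide frequency in per mille."""
    rng = random.Random(seed)
    per_protein = [pair_counts(s) for s in proteins]
    estimates = []
    for _ in range(n_boot):
        # Resample whole proteins with replacement, keeping the proteome size.
        sample = rng.choices(per_protein, k=len(per_protein))
        total = sum(sum(c.values()) for c in sample)
        hits = sum(c[pair] for c in sample)
        estimates.append(1000.0 * hits / total if total else 0.0)
    return statistics.mean(estimates), statistics.stdev(estimates)


# Toy input, not real data.
toy = ["MKKIE", "MKIEK", "AAKK", "GKKL"]
mean_kk, err_kk = bootstrap_error(toy, "KK")
```

Resampling proteins rather than individual dipeptides keeps the pairs from one protein together, so the error reflects variation between proteins.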

The dipeptide statistics above (along with other statistics for this proteome) can be downloaded from this CSV file
See this proteome in: uniprot_link
Proteome-pI is available under the Creative Commons Attribution-NoDerivs license; for more details see here

Reference: Kozlowski LP. Proteome-pI 2.0: Proteome Isoelectric Point Database Update. Nucleic Acids Res. 2021, doi: 10.1093/nar/gkab944
Contact: Lukasz P. Kozlowski