Amino acid dipeptide frequency for Clostridium sp. CAG:524

Apart from single amino acid frequencies, one can also calculate so-called amino acid dipeptide frequencies, i.e. the frequencies of pairs of adjacent residues.
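As an illustration only, dipeptide frequencies could be computed along the following lines. This is a minimal sketch, not the method used to build the table: it assumes one-letter sequences, counts overlapping adjacent pairs within each protein, and maps every non-standard letter to X (Xaa). The function names are hypothetical.

```python
from collections import Counter

# The 20 standard amino acids in one-letter code.
STANDARD = set("ACDEFGHIKLMNPQRSTVWY")


def dipeptide_counts(proteins):
    """Count overlapping adjacent residue pairs, protein by protein."""
    counts = Counter()
    for seq in proteins:
        # Assumption: anything outside the 20 standard letters becomes X (Xaa).
        seq = "".join(c if c in STANDARD else "X" for c in seq.upper())
        # Pairs are taken within a protein and never span two proteins.
        counts.update(seq[i:i + 2] for i in range(len(seq) - 1))
    return counts


def dipeptide_per_mille(proteins):
    """Return dipeptide frequencies in per mille of all counted pairs."""
    counts = dipeptide_counts(proteins)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {pair: 1000.0 * n / total for pair, n in counts.items()}
```

For example, `dipeptide_per_mille(["AAK", "AKB"])` counts the pairs AA, AK, AK and KX, so AK accounts for half of all pairs.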

There are 400 possible dipeptides (441 if Xaa is counted as an additional, 21st amino acid representing all non-standard or unknown amino acids). Thus, if dipeptides occurred randomly with equal probability, each dipeptide would be present at a frequency of 0.25% (1/400). As this is not the case in nature, for better visibility all dipeptides that are more frequent than expected are marked in red, and those that are underrepresented are marked in blue in the table.
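The arithmetic behind the expected value can be sketched as follows. The uniform baseline (1/400, or 1/441 with Xaa) comes from the text above; the exact rule used for the red and blue marking is not stated here, so the `classify` helper is a hypothetical illustration that simply compares a value with that baseline.

```python
def uniform_expectation_per_mille(alphabet_size=20):
    """Expected per-mille frequency if all dipeptides were equally likely."""
    return 1000.0 / alphabet_size ** 2


def classify(value_per_mille, expected_per_mille):
    """Label a dipeptide relative to the expectation (assumed rule)."""
    if value_per_mille > expected_per_mille:
        return "over"   # would be marked red
    if value_per_mille < expected_per_mille:
        return "under"  # would be marked blue
    return "as expected"


expected = uniform_expectation_per_mille()  # 20 x 20 = 400 dipeptides
print(expected)                    # 2.5 per mille, i.e. 0.25%
print(classify(11.13, expected))   # IleIle from the table: "over"
print(classify(0.079, expected))   # TrpTrp from the table: "under"
```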

All values are given in per mille (‰); multiply them by 10⁻³ to obtain fractions.
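A short sketch of the unit conversion, with hypothetical helper names: a per-mille value times 10⁻³ gives a fraction, and times 0.1 gives a percentage.

```python
def per_mille_to_fraction(value):
    """Convert a per-mille value to a plain fraction (multiply by 10^-3)."""
    return value * 1e-3


def per_mille_to_percent(value):
    """Convert a per-mille value to a percentage (1 per mille = 0.1%)."""
    return value * 0.1


# IleIle is listed as 11.13 per mille, i.e. a fraction of about 0.01113.
print(per_mille_to_fraction(11.13))
print(per_mille_to_percent(11.13))
```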

For more information, see the sequence space article on Wikipedia.

Residue order (first residue by row, second residue by column): Ala Cys Asp Glu Phe Gly His Ile Lys Leu Met Asn Pro Gln Arg Ser Thr Val Trp Tyr Xaa
Ala
AlaAla: 1.822 ± 0.111
AlaCys: 0.602 ± 0.045
AlaAsp: 1.903 ± 0.09
AlaGlu: 2.121 ± 0.091
AlaPhe: 1.943 ± 0.093
AlaGly: 2.233 ± 0.09
AlaHis: 0.652 ± 0.042
AlaIle: 3.875 ± 0.116
AlaLys: 3.561 ± 0.128
AlaLeu: 4.223 ± 0.136
AlaMet: 0.877 ± 0.045
AlaAsn: 2.131 ± 0.08
AlaPro: 0.932 ± 0.059
AlaGln: 0.615 ± 0.039
AlaArg: 1.576 ± 0.07
AlaSer: 3.082 ± 0.097
AlaThr: 2.54 ± 0.098
AlaVal: 2.393 ± 0.089
AlaTrp: 0.196 ± 0.021
AlaTyr: 1.993 ± 0.076
AlaXaa: 0.0 ± 0.0
Cys
CysAla: 0.508 ± 0.041
CysCys: 0.173 ± 0.024
CysAsp: 0.932 ± 0.146
CysGlu: 0.694 ± 0.045
CysPhe: 0.432 ± 0.034
CysGly: 0.974 ± 0.067
CysHis: 0.157 ± 0.02
CysIle: 1.079 ± 0.058
CysLys: 1.039 ± 0.06
CysLeu: 0.909 ± 0.049
CysMet: 0.228 ± 0.023
CysAsn: 0.772 ± 0.05
CysPro: 0.419 ± 0.04
CysGln: 0.157 ± 0.021
CysArg: 0.28 ± 0.028
CysSer: 0.825 ± 0.052
CysThr: 0.634 ± 0.044
CysVal: 0.576 ± 0.037
CysTrp: 0.037 ± 0.01
CysTyr: 0.597 ± 0.045
CysXaa: 0.0 ± 0.0
Asp
AspAla: 2.519 ± 0.09
AspCys: 0.456 ± 0.034
AspAsp: 3.179 ± 0.114
AspGlu: 4.773 ± 0.119
AspPhe: 2.773 ± 0.096
AspGly: 3.419 ± 0.162
AspHis: 0.484 ± 0.038
AspIle: 6.886 ± 0.169
AspLys: 7.169 ± 0.153
AspLeu: 5.184 ± 0.123
AspMet: 1.563 ± 0.07
AspAsn: 4.956 ± 0.111
AspPro: 1.155 ± 0.061
AspGln: 0.594 ± 0.036
AspArg: 1.83 ± 0.072
AspSer: 3.757 ± 0.099
AspThr: 3.857 ± 0.117
AspVal: 3.548 ± 0.099
AspTrp: 0.217 ± 0.031
AspTyr: 3.867 ± 0.106
AspXaa: 0.0 ± 0.0
Glu
GluAla: 2.985 ± 0.113
GluCys: 0.791 ± 0.043
GluAsp: 3.886 ± 0.114
GluGlu: 6.208 ± 0.17
GluPhe: 3.082 ± 0.104
GluGly: 2.927 ± 0.096
GluHis: 0.877 ± 0.048
GluIle: 7.216 ± 0.163
GluLys: 7.284 ± 0.179
GluLeu: 7.122 ± 0.156
GluMet: 1.903 ± 0.076
GluAsn: 5.142 ± 0.115
GluPro: 1.288 ± 0.061
GluGln: 1.367 ± 0.053
GluArg: 2.401 ± 0.09
GluSer: 4.071 ± 0.111
GluThr: 3.524 ± 0.094
GluVal: 4.485 ± 0.123
GluTrp: 0.401 ± 0.03
GluTyr: 4.092 ± 0.108
GluXaa: 0.0 ± 0.0
Phe
PheAla: 1.634 ± 0.067
PheCys: 0.492 ± 0.034
PheAsp: 3.15 ± 0.105
PheGlu: 2.527 ± 0.08
PhePhe: 1.712 ± 0.076
PheGly: 2.254 ± 0.083
PheHis: 0.474 ± 0.036
PheIle: 4.983 ± 0.168
PheLys: 4.367 ± 0.107
PheLeu: 3.642 ± 0.112
PheMet: 1.123 ± 0.057
PheAsn: 3.579 ± 0.106
PhePro: 1.0 ± 0.053
PheGln: 0.681 ± 0.039
PheArg: 1.262 ± 0.059
PheSer: 2.843 ± 0.092
PheThr: 2.592 ± 0.094
PheVal: 2.396 ± 0.081
PheTrp: 0.204 ± 0.029
PheTyr: 2.184 ± 0.073
PheXaa: 0.0 ± 0.0
Gly
GlyAla: 2.281 ± 0.103
GlyCys: 0.731 ± 0.049
GlyAsp: 2.702 ± 0.104
GlyGlu: 3.456 ± 0.108
GlyPhe: 2.529 ± 0.086
GlyGly: 3.003 ± 0.153
GlyHis: 0.846 ± 0.045
GlyIle: 5.307 ± 0.126
GlyLys: 5.14 ± 0.152
GlyLeu: 4.037 ± 0.111
GlyMet: 1.314 ± 0.064
GlyAsn: 3.419 ± 0.107
GlyPro: 0.848 ± 0.05
GlyGln: 0.843 ± 0.044
GlyArg: 1.615 ± 0.071
GlySer: 3.626 ± 0.131
GlyThr: 3.299 ± 0.127
GlyVal: 3.378 ± 0.1
GlyTrp: 0.411 ± 0.044
GlyTyr: 3.275 ± 0.103
GlyXaa: 0.0 ± 0.0
His
HisAla: 0.558 ± 0.035
HisCys: 0.126 ± 0.018
HisAsp: 0.806 ± 0.045
HisGlu: 1.05 ± 0.047
HisPhe: 0.534 ± 0.045
HisGly: 0.846 ± 0.044
HisHis: 0.212 ± 0.023
HisIle: 1.194 ± 0.056
HisLys: 1.142 ± 0.052
HisLeu: 1.034 ± 0.056
HisMet: 0.304 ± 0.031
HisAsn: 0.901 ± 0.054
HisPro: 0.526 ± 0.038
HisGln: 0.238 ± 0.022
HisArg: 0.456 ± 0.036
HisSer: 0.702 ± 0.043
HisThr: 0.681 ± 0.041
HisVal: 0.652 ± 0.041
HisTrp: 0.055 ± 0.011
HisTyr: 0.615 ± 0.037
HisXaa: 0.0 ± 0.0
Ile
IleAla: 4.009 ± 0.107
IleCys: 1.215 ± 0.058
IleAsp: 7.187 ± 0.15
IleGlu: 6.836 ± 0.174
IlePhe: 4.336 ± 0.131
IleGly: 4.975 ± 0.122
IleHis: 1.157 ± 0.06
IleIle: 11.13 ± 0.238
IleLys: 10.091 ± 0.192
IleLeu: 9.476 ± 0.215
IleMet: 2.262 ± 0.073
IleAsn: 8.321 ± 0.181
IlePro: 2.728 ± 0.101
IleGln: 1.532 ± 0.07
IleArg: 2.875 ± 0.095
IleSer: 7.242 ± 0.167
IleThr: 5.684 ± 0.121
IleVal: 6.525 ± 0.149
IleTrp: 0.44 ± 0.039
IleTyr: 4.773 ± 0.157
IleXaa: 0.0 ± 0.0
Lys
LysAla: 3.367 ± 0.101
LysCys: 1.084 ± 0.093
LysAsp: 7.402 ± 0.143
LysGlu: 9.57 ± 0.19
LysPhe: 3.44 ± 0.098
LysGly: 4.383 ± 0.129
LysHis: 1.097 ± 0.054
LysIle: 9.69 ± 0.177
LysLys: 10.421 ± 0.227
LysLeu: 8.119 ± 0.144
LysMet: 2.733 ± 0.094
LysAsn: 8.166 ± 0.172
LysPro: 1.88 ± 0.081
LysGln: 1.854 ± 0.083
LysArg: 3.367 ± 0.096
LysSer: 5.577 ± 0.124
LysThr: 5.47 ± 0.137
LysVal: 6.313 ± 0.135
LysTrp: 0.524 ± 0.045
LysTyr: 6.912 ± 0.147
LysXaa: 0.0 ± 0.0
Leu
LeuAla: 3.302 ± 0.099
LeuCys: 1.003 ± 0.052
LeuAsp: 5.449 ± 0.115
LeuGlu: 5.92 ± 0.123
LeuPhe: 3.993 ± 0.12
LeuGly: 4.509 ± 0.124
LeuHis: 1.152 ± 0.057
LeuIle: 8.632 ± 0.198
LeuLys: 8.879 ± 0.161
LeuLeu: 8.371 ± 0.183
LeuMet: 2.121 ± 0.081
LeuAsn: 6.933 ± 0.147
LeuPro: 2.346 ± 0.081
LeuGln: 1.474 ± 0.066
LeuArg: 2.676 ± 0.098
LeuSer: 6.601 ± 0.144
LeuThr: 4.841 ± 0.124
LeuVal: 5.153 ± 0.126
LeuTrp: 0.39 ± 0.031
LeuTyr: 4.281 ± 0.124
LeuXaa: 0.0 ± 0.0
Met
MetAla: 1.026 ± 0.054
MetCys: 0.259 ± 0.028
MetAsp: 1.372 ± 0.064
MetGlu: 1.487 ± 0.063
MetPhe: 1.346 ± 0.087
MetGly: 1.191 ± 0.068
MetHis: 0.306 ± 0.028
MetIle: 2.322 ± 0.086
MetLys: 2.896 ± 0.09
MetLeu: 2.176 ± 0.078
MetMet: 0.665 ± 0.046
MetAsn: 2.181 ± 0.082
MetPro: 0.67 ± 0.045
MetGln: 0.516 ± 0.034
MetArg: 0.689 ± 0.041
MetSer: 1.72 ± 0.066
MetThr: 1.121 ± 0.062
MetVal: 1.042 ± 0.055
MetTrp: 0.115 ± 0.017
MetTyr: 1.191 ± 0.053
MetXaa: 0.0 ± 0.0
Asn
AsnAla: 2.773 ± 0.097
AsnCys: 0.738 ± 0.084
AsnAsp: 5.587 ± 0.145
AsnGlu: 5.718 ± 0.133
AsnPhe: 2.663 ± 0.091
AsnGly: 4.147 ± 0.114
AsnHis: 0.827 ± 0.049
AsnIle: 8.842 ± 0.178
AsnLys: 8.855 ± 0.181
AsnLeu: 5.53 ± 0.13
AsnMet: 1.998 ± 0.085
AsnAsn: 7.549 ± 0.223
AsnPro: 1.828 ± 0.075
AsnGln: 1.249 ± 0.072
AsnArg: 2.084 ± 0.078
AsnSer: 4.577 ± 0.153
AsnThr: 4.383 ± 0.146
AsnVal: 4.59 ± 0.115
AsnTrp: 0.264 ± 0.029
AsnTyr: 4.307 ± 0.119
AsnXaa: 0.0 ± 0.0
Pro
ProAla: 1.013 ± 0.059
ProCys: 0.304 ± 0.032
ProAsp: 1.272 ± 0.065
ProGlu: 1.744 ± 0.081
ProPhe: 1.163 ± 0.07
ProGly: 1.228 ± 0.053
ProHis: 0.372 ± 0.034
ProIle: 2.171 ± 0.076
ProLys: 2.068 ± 0.075
ProLeu: 1.841 ± 0.069
ProMet: 0.497 ± 0.035
ProAsn: 1.715 ± 0.07
ProPro: 0.403 ± 0.037
ProGln: 0.419 ± 0.037
ProArg: 0.657 ± 0.04
ProSer: 1.754 ± 0.069
ProThr: 1.453 ± 0.061
ProVal: 1.707 ± 0.08
ProTrp: 0.165 ± 0.022
ProTyr: 1.317 ± 0.059
ProXaa: 0.0 ± 0.0
Gln
GlnAla: 0.882 ± 0.068
GlnCys: 0.147 ± 0.022
GlnAsp: 0.929 ± 0.055
GlnGlu: 1.317 ± 0.056
GlnPhe: 0.668 ± 0.047
GlnGly: 0.778 ± 0.05
GlnHis: 0.178 ± 0.02
GlnIle: 1.744 ± 0.069
GlnLys: 1.676 ± 0.072
GlnLeu: 1.244 ± 0.057
GlnMet: 0.505 ± 0.031
GlnAsn: 1.338 ± 0.065
GlnPro: 0.309 ± 0.033
GlnGln: 0.322 ± 0.039
GlnArg: 0.6 ± 0.042
GlnSer: 0.927 ± 0.052
GlnThr: 0.971 ± 0.05
GlnVal: 1.1 ± 0.051
GlnTrp: 0.071 ± 0.016
GlnTyr: 0.746 ± 0.045
GlnXaa: 0.0 ± 0.0
Arg
ArgAla: 1.118 ± 0.058
ArgCys: 0.411 ± 0.032
ArgAsp: 1.849 ± 0.081
ArgGlu: 2.404 ± 0.088
ArgPhe: 1.202 ± 0.055
ArgGly: 1.466 ± 0.067
ArgHis: 0.463 ± 0.033
ArgIle: 3.163 ± 0.096
ArgLys: 3.147 ± 0.098
ArgLeu: 2.582 ± 0.085
ArgMet: 0.83 ± 0.048
ArgAsn: 2.532 ± 0.086
ArgPro: 0.623 ± 0.042
ArgGln: 0.652 ± 0.041
ArgArg: 1.218 ± 0.062
ArgSer: 1.65 ± 0.066
ArgThr: 1.689 ± 0.071
ArgVal: 1.814 ± 0.076
ArgTrp: 0.168 ± 0.021
ArgTyr: 1.623 ± 0.07
ArgXaa: 0.0 ± 0.0
Ser
SerAla: 2.393 ± 0.083
SerCys: 0.704 ± 0.053
SerAsp: 3.673 ± 0.096
SerGlu: 4.226 ± 0.115
SerPhe: 3.239 ± 0.104
SerGly: 3.87 ± 0.157
SerHis: 0.979 ± 0.066
SerIle: 6.784 ± 0.161
SerLys: 6.938 ± 0.152
SerLeu: 6.218 ± 0.148
SerMet: 1.503 ± 0.06
SerAsn: 5.307 ± 0.189
SerPro: 1.356 ± 0.06
SerGln: 1.147 ± 0.065
SerArg: 1.869 ± 0.069
SerSer: 5.106 ± 0.177
SerThr: 3.527 ± 0.114
SerVal: 3.93 ± 0.123
SerTrp: 0.377 ± 0.033
SerTyr: 3.477 ± 0.117
SerXaa: 0.0 ± 0.0
Thr
ThrAla: 2.223 ± 0.096
ThrCys: 0.723 ± 0.061
ThrAsp: 3.147 ± 0.095
ThrGlu: 3.312 ± 0.112
ThrPhe: 2.718 ± 0.079
ThrGly: 3.241 ± 0.129
ThrHis: 0.838 ± 0.053
ThrIle: 6.208 ± 0.149
ThrLys: 5.129 ± 0.136
ThrLeu: 5.48 ± 0.117
ThrMet: 1.173 ± 0.065
ThrAsn: 4.244 ± 0.142
ThrPro: 1.807 ± 0.086
ThrGln: 0.772 ± 0.05
ThrArg: 1.652 ± 0.069
ThrSer: 4.043 ± 0.156
ThrThr: 3.647 ± 0.144
ThrVal: 2.969 ± 0.09
ThrTrp: 0.314 ± 0.03
ThrTyr: 3.192 ± 0.128
ThrXaa: 0.0 ± 0.0
Val
ValAla: 2.498 ± 0.113
ValCys: 0.788 ± 0.053
ValAsp: 3.608 ± 0.097
ValGlu: 3.715 ± 0.116
ValPhe: 2.589 ± 0.088
ValGly: 3.448 ± 0.102
ValHis: 0.749 ± 0.046
ValIle: 5.985 ± 0.132
ValLys: 5.491 ± 0.14
ValLeu: 5.422 ± 0.128
ValMet: 1.259 ± 0.062
ValAsn: 4.367 ± 0.121
ValPro: 1.757 ± 0.076
ValGln: 0.937 ± 0.05
ValArg: 1.796 ± 0.074
ValSer: 4.59 ± 0.121
ValThr: 3.598 ± 0.143
ValVal: 3.941 ± 0.113
ValTrp: 0.343 ± 0.034
ValTyr: 2.726 ± 0.093
ValXaa: 0.0 ± 0.0
Trp
TrpAla: 0.173 ± 0.02
TrpCys: 0.079 ± 0.014
TrpAsp: 0.264 ± 0.026
TrpGlu: 0.246 ± 0.028
TrpPhe: 0.23 ± 0.022
TrpGly: 0.293 ± 0.029
TrpHis: 0.071 ± 0.014
TrpIle: 0.477 ± 0.036
TrpLys: 0.403 ± 0.037
TrpLeu: 0.484 ± 0.039
TrpMet: 0.173 ± 0.025
TrpAsn: 0.44 ± 0.044
TrpPro: 0.144 ± 0.022
TrpGln: 0.097 ± 0.017
TrpArg: 0.12 ± 0.017
TrpSer: 0.304 ± 0.03
TrpThr: 0.338 ± 0.032
TrpVal: 0.23 ± 0.027
TrpTrp: 0.079 ± 0.016
TrpTyr: 0.359 ± 0.036
TrpXaa: 0.0 ± 0.0
Tyr
TyrAla: 2.189 ± 0.091
TyrCys: 0.597 ± 0.041
TyrAsp: 3.888 ± 0.112
TyrGlu: 3.728 ± 0.122
TyrPhe: 2.508 ± 0.1
TyrGly: 2.875 ± 0.073
TyrHis: 0.741 ± 0.04
TyrIle: 5.106 ± 0.131
TyrLys: 5.674 ± 0.145
TyrLeu: 4.93 ± 0.121
TyrMet: 1.257 ± 0.064
TyrAsn: 4.514 ± 0.119
TyrPro: 1.244 ± 0.06
TyrGln: 0.984 ± 0.056
TyrArg: 1.579 ± 0.071
TyrSer: 3.629 ± 0.12
TyrThr: 2.985 ± 0.104
TyrVal: 2.867 ± 0.086
TyrTrp: 0.233 ± 0.025
TyrTyr: 3.372 ± 0.123
TyrXaa: 0.0 ± 0.0
Xaa
XaaAla: 0.0 ± 0.0
XaaCys: 0.0 ± 0.0
XaaAsp: 0.0 ± 0.0
XaaGlu: 0.0 ± 0.0
XaaPhe: 0.0 ± 0.0
XaaGly: 0.0 ± 0.0
XaaHis: 0.0 ± 0.0
XaaIle: 0.0 ± 0.0
XaaLys: 0.0 ± 0.0
XaaLeu: 0.0 ± 0.0
XaaMet: 0.0 ± 0.0
XaaAsn: 0.0 ± 0.0
XaaPro: 0.0 ± 0.0
XaaGln: 0.0 ± 0.0
XaaArg: 0.0 ± 0.0
XaaSer: 0.0 ± 0.0
XaaThr: 0.0 ± 0.0
XaaVal: 0.0 ± 0.0
XaaTrp: 0.0 ± 0.0
XaaTyr: 0.0 ± 0.0
XaaXaa: 0.0 ± 0.0
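
The entries above follow the pattern "FirstSecond: value ± error", in some copies preceded by a repeated value. As a rough sketch for working with such a listing as text (the regular expression and the function name are assumptions, not part of Proteome-pI), the entries could be read into a dictionary like this:

```python
import re

# Optional leading number, two three-letter residue codes, value, error.
ENTRY = re.compile(
    r"^\s*[\d.]*([A-Z][a-z]{2})([A-Z][a-z]{2}):\s*([\d.]+)\s*±\s*([\d.]+)\s*$"
)


def parse_entries(lines):
    """Map (first, second) residue codes to (value, error) in per mille."""
    table = {}
    for line in lines:
        match = ENTRY.match(line)
        if match is None:
            continue  # row labels such as "Ala" and other text are skipped
        first, second, value, error = match.groups()
        table[(first, second)] = (float(value), float(error))
    return table


sample = ["Ala", "1.822AlaAla: 1.822 ± 0.111", "AlaCys: 0.602 ± 0.045"]
print(parse_entries(sample))
```

With the full listing parsed this way, summing the values should come out close to 1000‰, up to rounding.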
Statistics based on 1265 proteins (381930 amino acids)

Note: The error has been estimated by bootstrapping (100 replicates) at the protein level.
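
To make the note concrete, here is a minimal sketch of a protein-level bootstrap. It assumes that whole proteins are resampled with replacement and that the reported error is the standard deviation across replicates; neither detail is confirmed here, and the function names are hypothetical.

```python
import random
import statistics


def pair_frequency(proteins, pair):
    """Per-mille frequency of one dipeptide over a list of sequences."""
    hits = total = 0
    for seq in proteins:
        for i in range(len(seq) - 1):
            total += 1
            hits += seq[i:i + 2] == pair
    return 1000.0 * hits / total if total else 0.0


def bootstrap_error(proteins, pair, replicates=100, seed=0):
    """Spread of the frequency when whole proteins are resampled."""
    rng = random.Random(seed)
    estimates = []
    for _ in range(replicates):
        # Draw as many proteins as in the original set, with replacement.
        sample = rng.choices(proteins, k=len(proteins))
        estimates.append(pair_frequency(sample, pair))
    # Assumption: the error is the standard deviation of the replicates.
    return statistics.stdev(estimates)


toy = ["AKAK", "AAAA", "KKKK", "AKKA"]
print(bootstrap_error(toy, "AK"))
```

Resampling proteins rather than individual residues keeps each sequence intact, so the estimate reflects variation between proteins.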

The dipeptide statistics above (along with other statistics for this proteome) can be downloaded from this CSV file.
See this proteome in: uniprot_link
Proteome-pI is available under the Creative Commons Attribution-NoDerivs license; for more details see here

Reference: Kozlowski LP. Proteome-pI 2.0: Proteome Isoelectric Point Database Update. Nucleic Acids Res. 2021, doi: 10.1093/nar/gkab944
Contact: Lukasz P. Kozlowski