Amino acid dipeptide frequency for Clostridium sp. CAG:798

Apart from single amino acid frequencies, one can also calculate so-called amino acid dipeptide frequencies.

There are 400 possible dipeptides (441 if Xaa is counted as an additional 21st amino acid representing all non-standard or unknown amino acids). Thus, if dipeptides occurred randomly in proteins with equal probability, each amino acid dipeptide would be present at a frequency of 0.25% (1/400). As this is not the case in nature, for better visibility all dipeptides that occur more often than expected are marked in red, and those that are underrepresented are marked in blue in the table.
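As an illustration of the calculation described above, the following minimal sketch counts overlapping adjacent residue pairs within each protein and reports them in per mille. The function name `dipeptide_per_mille`, the use of one-letter codes, and the mapping of every non-standard letter to `X` (Xaa) are assumptions made for this example; the page does not describe how the database itself performs the count.

```python
from collections import Counter

# The 20 standard amino acids in one-letter code.
STANDARD = "ACDEFGHIKLMNPQRSTVWY"

# Uniform expectation for 400 equally likely dipeptides: 2.5 per mille (0.25%).
UNIFORM_EXPECTATION = 1000.0 / 400


def dipeptide_per_mille(proteins):
    """Return {dipeptide: frequency in per mille} for a list of sequences."""
    counts = Counter()
    for seq in proteins:
        # Assumption: anything outside the 20 standard residues becomes X (Xaa).
        seq = "".join(c if c in STANDARD else "X" for c in seq.upper())
        # Overlapping adjacent pairs; a pair never spans two proteins.
        counts.update(seq[i:i + 2] for i in range(len(seq) - 1))
    total = sum(counts.values())
    if total == 0:
        return {}
    return {pair: 1000.0 * n / total for pair, n in counts.items()}


if __name__ == "__main__":
    # Toy input, not taken from the proteome on this page.
    freqs = dipeptide_per_mille(["MKKLE", "MKE"])
    for pair, value in sorted(freqs.items()):
        print(pair, round(value, 3), "expected", UNIFORM_EXPECTATION)
```

Counting per protein rather than over one concatenated string keeps the last residue of one protein from pairing with the first residue of the next.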

All values are given in per mille (‰) and therefore need to be multiplied by 10⁻³ to obtain fractions.
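The unit conversion can be checked against a few entries of the table. The sketch below converts a per mille value to a fraction and compares it with the uniform expectation of 1/400; the helper names are hypothetical, and using the plain uniform expectation as the threshold for "over" and "under" is an assumption based on the text above, since the page does not state the exact rule used for colouring.

```python
# Uniform expectation for 400 equally likely dipeptides (0.25%).
UNIFORM_FRACTION = 1 / 400


def per_mille_to_fraction(value):
    """Convert a per mille value (as listed in the table) to a fraction."""
    return value * 1e-3


def representation(value_per_mille, expected=UNIFORM_FRACTION):
    """Classify a table value relative to the assumed uniform expectation."""
    fraction = per_mille_to_fraction(value_per_mille)
    if fraction > expected:
        return "over"
    if fraction < expected:
        return "under"
    return "as expected"


if __name__ == "__main__":
    # Values copied from the table on this page.
    for name, value in [("IleIle", 11.35), ("AlaAla", 2.54), ("TrpTrp", 0.078)]:
        print(name, per_mille_to_fraction(value), representation(value))
```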

For more information, see the sequence space article on Wikipedia.

Ala Cys Asp Glu Phe Gly His Ile Lys Leu Met Asn Pro Gln Arg Ser Thr Val Trp Tyr Xaa
Ala
AlaAla: 2.54 ± 0.114
AlaCys: 0.54 ± 0.038
AlaAsp: 2.54 ± 0.087
AlaGlu: 3.838 ± 0.127
AlaPhe: 1.838 ± 0.077
AlaGly: 3.298 ± 0.12
AlaHis: 0.665 ± 0.043
AlaIle: 5.972 ± 0.14
AlaLys: 5.316 ± 0.132
AlaLeu: 4.074 ± 0.119
AlaMet: 1.231 ± 0.054
AlaAsn: 3.119 ± 0.105
AlaPro: 1.202 ± 0.061
AlaGln: 1.48 ± 0.062
AlaArg: 1.95 ± 0.072
AlaSer: 2.508 ± 0.083
AlaThr: 3.178 ± 0.13
AlaVal: 2.965 ± 0.104
AlaTrp: 0.319 ± 0.027
AlaTyr: 1.942 ± 0.077
AlaXaa: 0.0 ± 0.0
Cys
CysAla: 0.517 ± 0.038
CysCys: 0.151 ± 0.02
CysAsp: 0.574 ± 0.043
CysGlu: 0.831 ± 0.049
CysPhe: 0.47 ± 0.038
CysGly: 0.911 ± 0.059
CysHis: 0.179 ± 0.023
CysIle: 1.272 ± 0.068
CysLys: 1.135 ± 0.06
CysLeu: 0.756 ± 0.05
CysMet: 0.293 ± 0.026
CysAsn: 0.753 ± 0.048
CysPro: 0.431 ± 0.035
CysGln: 0.273 ± 0.026
CysArg: 0.348 ± 0.029
CysSer: 0.719 ± 0.063
CysThr: 0.605 ± 0.048
CysVal: 0.626 ± 0.041
CysTrp: 0.055 ± 0.013
CysTyr: 0.428 ± 0.03
CysXaa: 0.0 ± 0.0
Asp
AspAla: 2.347 ± 0.084
AspCys: 0.571 ± 0.041
AspAsp: 2.493 ± 0.093
AspGlu: 4.905 ± 0.132
AspPhe: 2.418 ± 0.075
AspGly: 3.186 ± 0.105
AspHis: 0.423 ± 0.032
AspIle: 6.307 ± 0.153
AspLys: 5.388 ± 0.121
AspLeu: 4.562 ± 0.117
AspMet: 1.379 ± 0.062
AspAsn: 3.583 ± 0.103
AspPro: 1.078 ± 0.065
AspGln: 0.701 ± 0.044
AspArg: 1.615 ± 0.069
AspSer: 2.682 ± 0.095
AspThr: 2.807 ± 0.093
AspVal: 3.384 ± 0.111
AspTrp: 0.374 ± 0.032
AspTyr: 2.605 ± 0.095
AspXaa: 0.0 ± 0.0
Glu
GluAla: 3.703 ± 0.108
GluCys: 0.665 ± 0.042
GluAsp: 4.401 ± 0.117
GluGlu: 8.702 ± 0.238
GluPhe: 3.01 ± 0.096
GluGly: 3.651 ± 0.121
GluHis: 0.917 ± 0.042
GluIle: 9.05 ± 0.17
GluLys: 10.171 ± 0.209
GluLeu: 7.39 ± 0.156
GluMet: 2.155 ± 0.085
GluAsn: 7.312 ± 0.149
GluPro: 1.657 ± 0.067
GluGln: 3.233 ± 0.096
GluArg: 2.724 ± 0.09
GluSer: 3.046 ± 0.088
GluThr: 4.02 ± 0.096
GluVal: 4.568 ± 0.124
GluTrp: 0.493 ± 0.038
GluTyr: 3.994 ± 0.111
GluXaa: 0.0 ± 0.0
Phe
PheAla: 2.067 ± 0.094
PheCys: 0.532 ± 0.041
PheAsp: 2.332 ± 0.077
PheGlu: 3.168 ± 0.081
PhePhe: 1.522 ± 0.077
PheGly: 2.223 ± 0.075
PheHis: 0.431 ± 0.033
PheIle: 3.911 ± 0.124
PheLys: 3.435 ± 0.109
PheLeu: 3.178 ± 0.118
PheMet: 0.922 ± 0.05
PheAsn: 2.766 ± 0.104
PhePro: 0.945 ± 0.052
PheGln: 0.74 ± 0.051
PheArg: 1.106 ± 0.052
PheSer: 2.405 ± 0.094
PheThr: 2.127 ± 0.089
PheVal: 2.212 ± 0.084
PheTrp: 0.327 ± 0.032
PheTyr: 1.558 ± 0.073
PheXaa: 0.0 ± 0.0
Gly
GlyAla: 3.067 ± 0.12
GlyCys: 0.657 ± 0.055
GlyAsp: 2.737 ± 0.095
GlyGlu: 4.124 ± 0.116
GlyPhe: 2.334 ± 0.074
GlyGly: 3.225 ± 0.163
GlyHis: 0.852 ± 0.05
GlyIle: 6.258 ± 0.152
GlyLys: 5.744 ± 0.143
GlyLeu: 4.555 ± 0.14
GlyMet: 1.415 ± 0.06
GlyAsn: 3.49 ± 0.116
GlyPro: 0.909 ± 0.055
GlyGln: 1.41 ± 0.068
GlyArg: 2.083 ± 0.09
GlySer: 2.768 ± 0.114
GlyThr: 3.794 ± 0.174
GlyVal: 3.482 ± 0.091
GlyTrp: 0.421 ± 0.034
GlyTyr: 2.934 ± 0.092
GlyXaa: 0.0 ± 0.0
His
HisAla: 0.532 ± 0.042
HisCys: 0.195 ± 0.028
HisAsp: 0.501 ± 0.042
HisGlu: 0.706 ± 0.044
HisPhe: 0.522 ± 0.034
HisGly: 0.745 ± 0.048
HisHis: 0.257 ± 0.026
HisIle: 1.361 ± 0.069
HisLys: 0.966 ± 0.048
HisLeu: 0.963 ± 0.054
HisMet: 0.33 ± 0.027
HisAsn: 0.779 ± 0.046
HisPro: 0.499 ± 0.037
HisGln: 0.304 ± 0.028
HisArg: 0.491 ± 0.039
HisSer: 0.735 ± 0.052
HisThr: 0.647 ± 0.038
HisVal: 0.574 ± 0.037
HisTrp: 0.088 ± 0.016
HisTyr: 0.488 ± 0.033
HisXaa: 0.0 ± 0.0
Ile
IleAla: 5.959 ± 0.14
IleCys: 1.514 ± 0.07
IleAsp: 5.934 ± 0.134
IleGlu: 9.089 ± 0.179
IlePhe: 4.025 ± 0.138
IleGly: 6.058 ± 0.138
IleHis: 1.137 ± 0.057
IleIle: 11.35 ± 0.241
IleLys: 10.239 ± 0.189
IleLeu: 9.862 ± 0.24
IleMet: 2.241 ± 0.085
IleAsn: 7.273 ± 0.161
IlePro: 3.417 ± 0.086
IleGln: 2.737 ± 0.107
IleArg: 3.285 ± 0.102
IleSer: 6.772 ± 0.143
IleThr: 5.786 ± 0.154
IleVal: 6.424 ± 0.149
IleTrp: 0.576 ± 0.038
IleTyr: 4.594 ± 0.124
IleXaa: 0.0 ± 0.0
Lys
LysAla: 4.736 ± 0.121
LysCys: 1.039 ± 0.045
LysAsp: 5.482 ± 0.136
LysGlu: 10.332 ± 0.209
LysPhe: 3.217 ± 0.097
LysGly: 4.622 ± 0.117
LysHis: 1.106 ± 0.064
LysIle: 10.94 ± 0.199
LysLys: 10.062 ± 0.211
LysLeu: 8.463 ± 0.17
LysMet: 2.708 ± 0.087
LysAsn: 7.52 ± 0.165
LysPro: 2.155 ± 0.081
LysGln: 3.581 ± 0.107
LysArg: 3.342 ± 0.117
LysSer: 4.446 ± 0.118
LysThr: 5.375 ± 0.116
LysVal: 5.9 ± 0.146
LysTrp: 0.618 ± 0.051
LysTyr: 5.349 ± 0.129
LysXaa: 0.0 ± 0.0
Leu
LeuAla: 4.744 ± 0.129
LeuCys: 0.945 ± 0.046
LeuAsp: 4.856 ± 0.118
LeuGlu: 7.364 ± 0.142
LeuPhe: 3.077 ± 0.129
LeuGly: 4.936 ± 0.134
LeuHis: 1.026 ± 0.051
LeuIle: 8.065 ± 0.17
LeuLys: 8.785 ± 0.16
LeuLeu: 7.209 ± 0.201
LeuMet: 1.771 ± 0.065
LeuAsn: 5.723 ± 0.141
LeuPro: 2.605 ± 0.093
LeuGln: 2.324 ± 0.081
LeuArg: 2.727 ± 0.085
LeuSer: 5.279 ± 0.135
LeuThr: 4.438 ± 0.126
LeuVal: 4.713 ± 0.122
LeuTrp: 0.514 ± 0.043
LeuTyr: 3.303 ± 0.113
LeuXaa: 0.0 ± 0.0
Met
MetAla: 1.345 ± 0.061
MetCys: 0.296 ± 0.031
MetAsp: 1.223 ± 0.057
MetGlu: 1.898 ± 0.064
MetPhe: 1.049 ± 0.074
MetGly: 1.309 ± 0.066
MetHis: 0.28 ± 0.024
MetIle: 1.971 ± 0.08
MetLys: 2.493 ± 0.076
MetLeu: 2.238 ± 0.081
MetMet: 0.506 ± 0.036
MetAsn: 1.579 ± 0.068
MetPro: 0.953 ± 0.044
MetGln: 0.982 ± 0.05
MetArg: 0.706 ± 0.046
MetSer: 1.4 ± 0.062
MetThr: 1.07 ± 0.056
MetVal: 1.28 ± 0.053
MetTrp: 0.158 ± 0.021
MetTyr: 1.018 ± 0.05
MetXaa: 0.0 ± 0.0
Asn
AsnAla: 3.046 ± 0.076
AsnCys: 0.857 ± 0.051
AsnAsp: 3.054 ± 0.101
AsnGlu: 5.245 ± 0.136
AsnPhe: 2.667 ± 0.093
AsnGly: 3.942 ± 0.13
AsnHis: 0.649 ± 0.038
AsnIle: 9.05 ± 0.221
AsnLys: 7.575 ± 0.156
AsnLeu: 5.757 ± 0.132
AsnMet: 1.776 ± 0.071
AsnAsn: 5.783 ± 0.158
AsnPro: 2.093 ± 0.078
AsnGln: 1.714 ± 0.071
AsnArg: 2.046 ± 0.083
AsnSer: 4.113 ± 0.129
AsnThr: 3.778 ± 0.121
AsnVal: 4.033 ± 0.122
AsnTrp: 0.452 ± 0.034
AsnTyr: 3.264 ± 0.101
AsnXaa: 0.0 ± 0.0
Pro
ProAla: 1.298 ± 0.06
ProCys: 0.351 ± 0.028
ProAsp: 1.493 ± 0.071
ProGlu: 2.636 ± 0.085
ProPhe: 1.098 ± 0.06
ProGly: 1.358 ± 0.065
ProHis: 0.34 ± 0.028
ProIle: 2.737 ± 0.089
ProLys: 2.259 ± 0.079
ProLeu: 1.903 ± 0.081
ProMet: 0.613 ± 0.043
ProAsn: 1.794 ± 0.073
ProPro: 0.501 ± 0.042
ProGln: 0.771 ± 0.045
ProArg: 0.73 ± 0.045
ProSer: 1.363 ± 0.06
ProThr: 1.633 ± 0.076
ProVal: 1.665 ± 0.066
ProTrp: 0.179 ± 0.028
ProTyr: 1.226 ± 0.057
ProXaa: 0.0 ± 0.0
Gln
GlnAla: 1.566 ± 0.069
GlnCys: 0.247 ± 0.026
GlnAsp: 1.511 ± 0.061
GlnGlu: 2.623 ± 0.087
GlnPhe: 0.984 ± 0.057
GlnGly: 1.343 ± 0.069
GlnHis: 0.262 ± 0.029
GlnIle: 3.181 ± 0.097
GlnLys: 3.03 ± 0.09
GlnLeu: 2.329 ± 0.085
GlnMet: 0.815 ± 0.049
GlnAsn: 2.023 ± 0.082
GlnPro: 0.504 ± 0.04
GlnGln: 0.862 ± 0.056
GlnArg: 1.028 ± 0.054
GlnSer: 1.213 ± 0.055
GlnThr: 1.511 ± 0.07
GlnVal: 1.475 ± 0.066
GlnTrp: 0.169 ± 0.019
GlnTyr: 1.363 ± 0.071
GlnXaa: 0.0 ± 0.0
Arg
ArgAla: 1.584 ± 0.071
ArgCys: 0.397 ± 0.039
ArgAsp: 1.711 ± 0.073
ArgGlu: 2.771 ± 0.093
ArgPhe: 1.262 ± 0.063
ArgGly: 1.729 ± 0.073
ArgHis: 0.405 ± 0.033
ArgIle: 3.267 ± 0.095
ArgLys: 3.635 ± 0.113
ArgLeu: 2.753 ± 0.084
ArgMet: 0.867 ± 0.053
ArgAsn: 2.231 ± 0.081
ArgPro: 0.94 ± 0.056
ArgGln: 1.002 ± 0.048
ArgArg: 1.475 ± 0.068
ArgSer: 1.213 ± 0.058
ArgThr: 1.805 ± 0.072
ArgVal: 1.958 ± 0.075
ArgTrp: 0.226 ± 0.026
ArgTyr: 1.426 ± 0.067
ArgXaa: 0.0 ± 0.0
Ser
SerAla: 2.498 ± 0.077
SerCys: 0.491 ± 0.042
SerAsp: 2.846 ± 0.091
SerGlu: 3.937 ± 0.088
SerPhe: 2.114 ± 0.088
SerGly: 3.456 ± 0.109
SerHis: 0.712 ± 0.04
SerIle: 5.819 ± 0.131
SerLys: 5.552 ± 0.127
SerLeu: 4.334 ± 0.115
SerMet: 1.182 ± 0.056
SerAsn: 3.947 ± 0.137
SerPro: 1.288 ± 0.064
SerGln: 1.67 ± 0.07
SerArg: 1.893 ± 0.078
SerSer: 3.506 ± 0.144
SerThr: 3.095 ± 0.125
SerVal: 2.929 ± 0.09
SerTrp: 0.356 ± 0.029
SerTyr: 2.306 ± 0.082
SerXaa: 0.0 ± 0.0
Thr
ThrAla: 3.01 ± 0.112
ThrCys: 0.457 ± 0.032
ThrAsp: 3.051 ± 0.096
ThrGlu: 3.989 ± 0.137
ThrPhe: 2.085 ± 0.08
ThrGly: 3.76 ± 0.129
ThrHis: 0.766 ± 0.041
ThrIle: 5.993 ± 0.149
ThrLys: 4.768 ± 0.121
ThrLeu: 4.716 ± 0.123
ThrMet: 1.057 ± 0.058
ThrAsn: 3.773 ± 0.138
ThrPro: 1.784 ± 0.074
ThrGln: 1.493 ± 0.066
ThrArg: 1.639 ± 0.078
ThrSer: 3.29 ± 0.141
ThrThr: 3.594 ± 0.202
ThrVal: 3.433 ± 0.119
ThrTrp: 0.338 ± 0.031
ThrTyr: 2.625 ± 0.093
ThrXaa: 0.0 ± 0.0
Val
ValAla: 3.456 ± 0.109
ValCys: 0.743 ± 0.049
ValAsp: 3.287 ± 0.104
ValGlu: 4.731 ± 0.128
ValPhe: 2.166 ± 0.075
ValGly: 3.443 ± 0.104
ValHis: 0.662 ± 0.048
ValIle: 6.001 ± 0.125
ValLys: 5.443 ± 0.116
ValLeu: 5.045 ± 0.124
ValMet: 1.27 ± 0.059
ValAsn: 3.563 ± 0.095
ValPro: 1.675 ± 0.072
ValGln: 1.514 ± 0.064
ValArg: 1.823 ± 0.068
ValSer: 3.422 ± 0.099
ValThr: 3.43 ± 0.122
ValVal: 3.752 ± 0.11
ValTrp: 0.332 ± 0.033
ValTyr: 2.418 ± 0.072
ValXaa: 0.0 ± 0.0
Trp
TrpAla: 0.314 ± 0.027
TrpCys: 0.086 ± 0.016
TrpAsp: 0.374 ± 0.031
TrpGlu: 0.486 ± 0.036
TrpPhe: 0.283 ± 0.029
TrpGly: 0.408 ± 0.034
TrpHis: 0.093 ± 0.016
TrpIle: 0.618 ± 0.035
TrpLys: 0.574 ± 0.041
TrpLeu: 0.522 ± 0.04
TrpMet: 0.166 ± 0.024
TrpAsn: 0.475 ± 0.039
TrpPro: 0.13 ± 0.021
TrpGln: 0.241 ± 0.024
TrpArg: 0.208 ± 0.023
TrpSer: 0.348 ± 0.029
TrpThr: 0.286 ± 0.031
TrpVal: 0.343 ± 0.032
TrpTrp: 0.078 ± 0.015
TrpTyr: 0.366 ± 0.031
TrpXaa: 0.0 ± 0.0
Tyr
TyrAla: 2.186 ± 0.072
TyrCys: 0.595 ± 0.039
TyrAsp: 2.41 ± 0.071
TyrGlu: 3.49 ± 0.105
TyrPhe: 1.766 ± 0.078
TyrGly: 2.654 ± 0.096
TyrHis: 0.519 ± 0.037
TyrIle: 5.149 ± 0.126
TyrLys: 4.425 ± 0.126
TyrLeu: 3.794 ± 0.106
TyrMet: 1.054 ± 0.05
TyrAsn: 3.407 ± 0.103
TyrPro: 1.184 ± 0.064
TyrGln: 1.008 ± 0.056
TyrArg: 1.439 ± 0.062
TyrSer: 2.701 ± 0.086
TyrThr: 2.641 ± 0.104
TyrVal: 2.462 ± 0.091
TyrTrp: 0.325 ± 0.024
TyrTyr: 2.342 ± 0.099
TyrXaa: 0.0 ± 0.0
Xaa
XaaAla: 0.0 ± 0.0
XaaCys: 0.0 ± 0.0
XaaAsp: 0.0 ± 0.0
XaaGlu: 0.0 ± 0.0
XaaPhe: 0.0 ± 0.0
XaaGly: 0.0 ± 0.0
XaaHis: 0.0 ± 0.0
XaaIle: 0.0 ± 0.0
XaaLys: 0.0 ± 0.0
XaaLeu: 0.0 ± 0.0
XaaMet: 0.0 ± 0.0
XaaAsn: 0.0 ± 0.0
XaaPro: 0.0 ± 0.0
XaaGln: 0.0 ± 0.0
XaaArg: 0.0 ± 0.0
XaaSer: 0.0 ± 0.0
XaaThr: 0.0 ± 0.0
XaaVal: 0.0 ± 0.0
XaaTrp: 0.0 ± 0.0
XaaTyr: 0.0 ± 0.0
XaaXaa: 0.0 ± 0.0
Statistics based on 1270 proteins (385101 amino acids)

Note: The error was estimated by bootstrapping (x100) at the protein level.
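A protein-level bootstrap of this kind can be sketched as follows: whole proteins are resampled with replacement, the dipeptide frequency is recomputed for each replicate, and the spread of the replicates is reported as the error. The function names, the fixed random seed, and the choice of the standard deviation as the error measure are assumptions for illustration; the note above specifies only the number of replicates (100) and the resampling level (proteins).

```python
import random
import statistics
from collections import Counter


def pair_counts(seq):
    """Count overlapping adjacent residue pairs in one protein sequence."""
    return Counter(seq[i:i + 2] for i in range(len(seq) - 1))


def bootstrap_error(proteins, pair, replicates=100, seed=0):
    """Return (mean, standard deviation) of a dipeptide's per mille frequency.

    Each replicate draws len(proteins) whole proteins with replacement.
    """
    rng = random.Random(seed)
    per_protein = [pair_counts(s) for s in proteins]
    estimates = []
    for _ in range(replicates):
        sample = rng.choices(per_protein, k=len(per_protein))
        total = sum(sum(c.values()) for c in sample)
        hits = sum(c[pair] for c in sample)
        estimates.append(1000.0 * hits / total if total else 0.0)
    return statistics.mean(estimates), statistics.stdev(estimates)


if __name__ == "__main__":
    # Toy sequences, not taken from the proteome on this page.
    mean, err = bootstrap_error(["MKKLE", "MKE", "AKKA", "MLLK"], "KK")
    print(f"KK: {mean:.3f} ± {err:.3f}")
```

Resampling proteins rather than individual dipeptides keeps pairs from the same protein together, so the error reflects variation between proteins.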

The dipeptide statistics above (among other statistics for this proteome) can be downloaded from this CSV file
See this proteome in: uniprot_link
Proteome-pI is available under the Creative Commons Attribution-NoDerivs license; for more details see here

Reference: Kozlowski LP. Proteome-pI 2.0: Proteome Isoelectric Point Database Update. Nucleic Acids Res. 2021, doi: 10.1093/nar/gkab944
Contact: Lukasz P. Kozlowski