Amino acid dipeptide frequency for candidate division MSBL1 archaeon SCGC-AAA382M17

Apart from single amino acid frequencies, one can also calculate the so-called amino acid dipeptide frequency, i.e. how often each ordered pair of adjacent amino acids occurs.
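As a minimal sketch of how such frequencies could be computed (not necessarily the procedure used by Proteome-pI), the Python snippet below counts overlapping pairs of adjacent residues within each protein and reports them in per mille. The function names, the one-letter input alphabet, the mapping of unrecognised letters to Xaa, and the normalisation by the total number of dipeptides are all assumptions made for illustration.

```python
from collections import Counter
from itertools import product

# One-letter to three-letter codes, in the order used by the table.
# "X" (Xaa) stands for any non-standard or unknown residue.
THREE_LETTER = {
    "A": "Ala", "C": "Cys", "D": "Asp", "E": "Glu", "F": "Phe", "G": "Gly",
    "H": "His", "I": "Ile", "K": "Lys", "L": "Leu", "M": "Met", "N": "Asn",
    "P": "Pro", "Q": "Gln", "R": "Arg", "S": "Ser", "T": "Thr", "V": "Val",
    "W": "Trp", "Y": "Tyr", "X": "Xaa",
}


def dipeptide_counts(proteins):
    """Count overlapping adjacent residue pairs, never spanning two proteins."""
    counts = Counter()
    for seq in proteins:
        # Assumption: anything outside the 20 standard letters becomes X (Xaa).
        seq = "".join(c if c in THREE_LETTER else "X" for c in seq.upper())
        counts.update(a + b for a, b in zip(seq, seq[1:]))
    return counts


def dipeptide_permille(proteins):
    """Return all 441 dipeptide frequencies in per mille, keyed like 'AlaCys'."""
    counts = dipeptide_counts(proteins)
    total = sum(counts.values())
    return {
        THREE_LETTER[a] + THREE_LETTER[b]:
            (1000.0 * counts[a + b] / total) if total else 0.0
        for a, b in product(THREE_LETTER, repeat=2)
    }


# Two toy sequences give the pairs MK, KK, KL, MK, KA (5 in total).
print(dipeptide_permille(["MKKL", "MKA"])["MetLys"])  # 400.0
```

The result always has 441 keys, so dipeptides that never occur (such as every Xaa pair in this proteome) show up as 0.0 rather than being missing.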

There are 400 possible dipeptides (441 if Xaa is counted as an additional, 21st amino acid representing all non-standard or unknown amino acids). Thus, if dipeptides occurred randomly with equal probability, each one would be present at a frequency of 0.25% (1/400). As this is not the case in nature, for better visibility all dipeptides that are more frequent than expected are marked in red, and those that are underrepresented are marked in blue in the table.

All values are given in per mille (‰), so they need to be multiplied by 10⁻³ to obtain fractions; the 0.25% expectation corresponds to 2.5‰.
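The unit conversion and the comparison against the uniform expectation can be sketched in a few lines of Python. The helper names are hypothetical, and a plain comparison with 2.5‰ is an assumption: the exact rule the page uses for its red and blue marking (for instance, whether the error estimate is taken into account) is not stated here.

```python
N_STANDARD_DIPEPTIDES = 20 * 20                     # 400 ordered pairs
EXPECTED_PERMILLE = 1000.0 / N_STANDARD_DIPEPTIDES  # 2.5 per mille = 0.25 %


def permille_to_fraction(value):
    """Convert a per mille value from the table into a plain fraction."""
    return value * 1e-3


def classify(value, expected=EXPECTED_PERMILLE):
    """Label a per mille value relative to the uniform expectation."""
    if value > expected:
        return "over"    # assumed to correspond to the red marking
    if value < expected:
        return "under"   # assumed to correspond to the blue marking
    return "expected"


# A few values copied from the table (per mille).
examples = {"GluGlu": 10.582, "AlaAla": 4.592, "TrpTrp": 0.171}
for name, value in examples.items():
    print(name, f"{permille_to_fraction(value):.6f}", classify(value))
```

With these inputs, GluGlu and AlaAla fall above the 2.5‰ expectation and TrpTrp falls below it.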

For more information, see the sequence space article on Wikipedia.

Ala Cys Asp Glu Phe Gly His Ile Lys Leu Met Asn Pro Gln Arg Ser Thr Val Trp Tyr Xaa
Ala
AlaAla: 4.592 ± 0.275
AlaCys: 0.619 ± 0.088
AlaAsp: 2.926 ± 0.18
AlaGlu: 5.531 ± 0.282
AlaPhe: 2.83 ± 0.193
AlaGly: 4.399 ± 0.306
AlaHis: 1.025 ± 0.114
AlaIle: 4.431 ± 0.237
AlaLys: 4.143 ± 0.212
AlaLeu: 6.204 ± 0.321
AlaMet: 1.559 ± 0.151
AlaAsn: 2.349 ± 0.149
AlaPro: 2.061 ± 0.163
AlaGln: 1.335 ± 0.143
AlaArg: 3.438 ± 0.212
AlaSer: 4.143 ± 0.223
AlaThr: 3.001 ± 0.184
AlaVal: 4.688 ± 0.267
AlaTrp: 0.673 ± 0.089
AlaTyr: 1.687 ± 0.134
AlaXaa: 0.0 ± 0.0
Cys
CysAla: 0.47 ± 0.067
CysCys: 0.096 ± 0.033
CysAsp: 0.416 ± 0.064
CysGlu: 0.683 ± 0.098
CysPhe: 0.342 ± 0.069
CysGly: 0.961 ± 0.134
CysHis: 0.203 ± 0.043
CysIle: 0.395 ± 0.066
CysLys: 0.566 ± 0.087
CysLeu: 0.651 ± 0.082
CysMet: 0.182 ± 0.045
CysAsn: 0.32 ± 0.056
CysPro: 0.587 ± 0.085
CysGln: 0.246 ± 0.053
CysArg: 0.523 ± 0.066
CysSer: 0.555 ± 0.081
CysThr: 0.459 ± 0.08
CysVal: 0.683 ± 0.102
CysTrp: 0.075 ± 0.031
CysTyr: 0.278 ± 0.064
CysXaa: 0.0 ± 0.0
Asp
AspAla: 2.755 ± 0.197
AspCys: 0.513 ± 0.08
AspAsp: 2.232 ± 0.202
AspGlu: 5.232 ± 0.247
AspPhe: 2.531 ± 0.18
AspGly: 3.257 ± 0.191
AspHis: 0.886 ± 0.108
AspIle: 4.57 ± 0.223
AspLys: 3.887 ± 0.2
AspLeu: 6.364 ± 0.255
AspMet: 1.559 ± 0.142
AspAsn: 1.837 ± 0.165
AspPro: 2.381 ± 0.174
AspGln: 1.442 ± 0.102
AspArg: 3.054 ± 0.189
AspSer: 3.598 ± 0.185
AspThr: 2.349 ± 0.145
AspVal: 3.737 ± 0.233
AspTrp: 0.897 ± 0.098
AspTyr: 2.509 ± 0.202
AspXaa: 0.0 ± 0.0
Glu
GluAla: 5.958 ± 0.295
GluCys: 0.694 ± 0.092
GluAsp: 6.321 ± 0.273
GluGlu: 10.582 ± 0.461
GluPhe: 3.203 ± 0.197
GluGly: 6.14 ± 0.268
GluHis: 1.175 ± 0.113
GluIle: 7.443 ± 0.291
GluLys: 9.343 ± 0.476
GluLeu: 7.314 ± 0.349
GluMet: 2.2 ± 0.171
GluAsn: 4.581 ± 0.249
GluPro: 2.627 ± 0.204
GluGln: 1.687 ± 0.148
GluArg: 5.061 ± 0.27
GluSer: 4.752 ± 0.226
GluThr: 4.175 ± 0.219
GluVal: 6.524 ± 0.3
GluTrp: 0.961 ± 0.1
GluTyr: 2.371 ± 0.155
GluXaa: 0.0 ± 0.0
Phe
PheAla: 2.381 ± 0.192
PheCys: 0.438 ± 0.067
PheAsp: 2.413 ± 0.15
PheGlu: 2.936 ± 0.162
PhePhe: 1.773 ± 0.17
PheGly: 3.225 ± 0.179
PheHis: 0.876 ± 0.101
PheIle: 2.627 ± 0.198
PheLys: 2.093 ± 0.183
PheLeu: 4.613 ± 0.296
PheMet: 0.801 ± 0.106
PheAsn: 1.644 ± 0.155
PhePro: 1.42 ± 0.135
PheGln: 1.068 ± 0.114
PheArg: 1.986 ± 0.16
PheSer: 3.332 ± 0.23
PheThr: 1.741 ± 0.145
PheVal: 2.712 ± 0.173
PheTrp: 0.448 ± 0.068
PheTyr: 1.559 ± 0.144
PheXaa: 0.0 ± 0.0
Gly
GlyAla: 4.517 ± 0.297
GlyCys: 0.662 ± 0.102
GlyAsp: 3.631 ± 0.216
GlyGlu: 6.14 ± 0.255
GlyPhe: 3.086 ± 0.189
GlyGly: 5.99 ± 0.308
GlyHis: 1.036 ± 0.115
GlyIle: 6.012 ± 0.273
GlyLys: 5.958 ± 0.258
GlyLeu: 6.385 ± 0.329
GlyMet: 2.05 ± 0.16
GlyAsn: 2.648 ± 0.178
GlyPro: 2.392 ± 0.207
GlyGln: 1.57 ± 0.142
GlyArg: 4.015 ± 0.225
GlySer: 4.517 ± 0.242
GlyThr: 3.833 ± 0.208
GlyVal: 5.478 ± 0.259
GlyTrp: 0.886 ± 0.094
GlyTyr: 2.381 ± 0.191
GlyXaa: 0.0 ± 0.0
His
HisAla: 1.014 ± 0.112
HisCys: 0.214 ± 0.049
HisAsp: 0.961 ± 0.096
HisGlu: 1.281 ± 0.113
HisPhe: 0.833 ± 0.086
HisGly: 1.527 ± 0.14
HisHis: 0.427 ± 0.079
HisIle: 1.303 ± 0.108
HisLys: 0.705 ± 0.084
HisLeu: 1.708 ± 0.131
HisMet: 0.288 ± 0.06
HisAsn: 0.641 ± 0.073
HisPro: 1.068 ± 0.118
HisGln: 0.555 ± 0.078
HisArg: 0.929 ± 0.107
HisSer: 1.025 ± 0.12
HisThr: 0.683 ± 0.087
HisVal: 1.324 ± 0.119
HisTrp: 0.192 ± 0.042
HisTyr: 0.715 ± 0.076
HisXaa: 0.0 ± 0.0
Ile
IleAla: 4.762 ± 0.274
IleCys: 0.737 ± 0.097
IleAsp: 4.506 ± 0.227
IleGlu: 6.375 ± 0.259
IlePhe: 2.979 ± 0.205
IleGly: 5.307 ± 0.284
IleHis: 1.527 ± 0.14
IleIle: 4.698 ± 0.246
IleLys: 4.389 ± 0.248
IleLeu: 7.133 ± 0.371
IleMet: 1.292 ± 0.094
IleAsn: 2.819 ± 0.202
IlePro: 3.929 ± 0.222
IleGln: 2.018 ± 0.161
IleArg: 4.09 ± 0.204
IleSer: 6.3 ± 0.298
IleThr: 3.513 ± 0.187
IleVal: 4.25 ± 0.212
IleTrp: 0.566 ± 0.075
IleTyr: 2.189 ± 0.167
IleXaa: 0.0 ± 0.0
Lys
LysAla: 4.506 ± 0.246
LysCys: 0.609 ± 0.086
LysAsp: 3.919 ± 0.199
LysGlu: 7.891 ± 0.321
LysPhe: 2.563 ± 0.176
LysGly: 5.051 ± 0.255
LysHis: 1.292 ± 0.115
LysIle: 6.375 ± 0.269
LysLys: 7.282 ± 0.378
LysLeu: 6.257 ± 0.312
LysMet: 1.794 ± 0.137
LysAsn: 4.004 ± 0.266
LysPro: 2.435 ± 0.155
LysGln: 1.837 ± 0.157
LysArg: 4.132 ± 0.258
LysSer: 4.57 ± 0.243
LysThr: 3.524 ± 0.213
LysVal: 5.179 ± 0.267
LysTrp: 0.822 ± 0.103
LysTyr: 1.986 ± 0.157
LysXaa: 0.0 ± 0.0
Leu
LeuAla: 6.161 ± 0.297
LeuCys: 0.641 ± 0.074
LeuAsp: 5.392 ± 0.251
LeuGlu: 8.884 ± 0.38
LeuPhe: 3.449 ± 0.262
LeuGly: 6.78 ± 0.324
LeuHis: 1.559 ± 0.115
LeuIle: 6.14 ± 0.315
LeuLys: 7.571 ± 0.367
LeuLeu: 8.542 ± 0.414
LeuMet: 1.911 ± 0.141
LeuAsn: 3.791 ± 0.262
LeuPro: 3.684 ± 0.203
LeuGln: 2.573 ± 0.161
LeuArg: 4.816 ± 0.233
LeuSer: 6.738 ± 0.327
LeuThr: 4.495 ± 0.21
LeuVal: 5.553 ± 0.25
LeuTrp: 0.908 ± 0.091
LeuTyr: 2.403 ± 0.196
LeuXaa: 0.0 ± 0.0
Met
MetAla: 1.431 ± 0.108
MetCys: 0.171 ± 0.036
MetAsp: 1.324 ± 0.105
MetGlu: 2.029 ± 0.136
MetPhe: 0.577 ± 0.074
MetGly: 1.431 ± 0.121
MetHis: 0.246 ± 0.058
MetIle: 1.484 ± 0.13
MetLys: 1.965 ± 0.146
MetLeu: 1.943 ± 0.145
MetMet: 0.438 ± 0.073
MetAsn: 1.014 ± 0.096
MetPro: 1.121 ± 0.117
MetGln: 0.523 ± 0.082
MetArg: 1.42 ± 0.127
MetSer: 1.516 ± 0.121
MetThr: 1.249 ± 0.135
MetVal: 1.666 ± 0.148
MetTrp: 0.171 ± 0.039
MetTyr: 0.577 ± 0.069
MetXaa: 0.0 ± 0.0
Asn
AsnAla: 2.531 ± 0.158
AsnCys: 0.395 ± 0.076
AsnAsp: 1.634 ± 0.127
AsnGlu: 3.033 ± 0.221
AsnPhe: 1.922 ± 0.159
AsnGly: 2.285 ± 0.174
AsnHis: 0.833 ± 0.099
AsnIle: 3.267 ± 0.208
AsnLys: 2.509 ± 0.187
AsnLeu: 4.869 ± 0.223
AsnMet: 0.961 ± 0.124
AsnAsn: 1.591 ± 0.137
AsnPro: 2.584 ± 0.169
AsnGln: 1.527 ± 0.168
AsnArg: 2.306 ± 0.174
AsnSer: 2.413 ± 0.195
AsnThr: 1.708 ± 0.163
AsnVal: 2.691 ± 0.192
AsnTrp: 0.555 ± 0.083
AsnTyr: 1.58 ± 0.174
AsnXaa: 0.0 ± 0.0
Pro
ProAla: 2.189 ± 0.159
ProCys: 0.384 ± 0.071
ProAsp: 2.883 ± 0.176
ProGlu: 4.602 ± 0.262
ProPhe: 1.548 ± 0.127
ProGly: 3.235 ± 0.241
ProHis: 0.812 ± 0.093
ProIle: 2.627 ± 0.186
ProLys: 2.819 ± 0.177
ProLeu: 3.428 ± 0.207
ProMet: 0.79 ± 0.098
ProAsn: 1.655 ± 0.138
ProPro: 1.58 ± 0.14
ProGln: 1.217 ± 0.122
ProArg: 1.815 ± 0.133
ProSer: 2.605 ± 0.191
ProThr: 1.965 ± 0.179
ProVal: 2.851 ± 0.195
ProTrp: 0.342 ± 0.06
ProTyr: 1.239 ± 0.115
ProXaa: 0.0 ± 0.0
Gln
GlnAla: 1.73 ± 0.14
GlnCys: 0.075 ± 0.026
GlnAsp: 1.431 ± 0.106
GlnGlu: 1.975 ± 0.156
GlnPhe: 0.769 ± 0.089
GlnGly: 1.623 ± 0.16
GlnHis: 0.523 ± 0.075
GlnIle: 2.21 ± 0.173
GlnLys: 2.338 ± 0.175
GlnLeu: 1.954 ± 0.162
GlnMet: 0.801 ± 0.095
GlnAsn: 1.207 ± 0.121
GlnPro: 1.121 ± 0.104
GlnGln: 0.737 ± 0.102
GlnArg: 1.687 ± 0.13
GlnSer: 1.452 ± 0.123
GlnThr: 1.324 ± 0.111
GlnVal: 1.634 ± 0.114
GlnTrp: 0.203 ± 0.053
GlnTyr: 0.854 ± 0.102
GlnXaa: 0.0 ± 0.0
Arg
ArgAla: 3.631 ± 0.183
ArgCys: 0.459 ± 0.081
ArgAsp: 2.851 ± 0.18
ArgGlu: 5.787 ± 0.273
ArgPhe: 2.104 ± 0.148
ArgGly: 4.218 ± 0.232
ArgHis: 0.79 ± 0.11
ArgIle: 4.346 ± 0.208
ArgLys: 5.147 ± 0.264
ArgLeu: 4.485 ± 0.201
ArgMet: 1.303 ± 0.126
ArgAsn: 2.253 ± 0.147
ArgPro: 1.794 ± 0.16
ArgGln: 1.239 ± 0.119
ArgArg: 3.417 ± 0.227
ArgSer: 3.086 ± 0.194
ArgThr: 2.317 ± 0.146
ArgVal: 3.962 ± 0.209
ArgTrp: 0.641 ± 0.093
ArgTyr: 1.58 ± 0.122
ArgXaa: 0.0 ± 0.0
Ser
SerAla: 3.748 ± 0.224
SerCys: 0.513 ± 0.076
SerAsp: 4.026 ± 0.197
SerGlu: 6.129 ± 0.283
SerPhe: 2.915 ± 0.209
SerGly: 5.83 ± 0.318
SerHis: 1.185 ± 0.112
SerIle: 4.656 ± 0.263
SerLys: 4.463 ± 0.224
SerLeu: 5.787 ± 0.273
SerMet: 1.271 ± 0.115
SerAsn: 2.616 ± 0.17
SerPro: 2.787 ± 0.152
SerGln: 2.039 ± 0.126
SerArg: 3.833 ± 0.222
SerSer: 4.773 ± 0.228
SerThr: 2.99 ± 0.192
SerVal: 4.303 ± 0.205
SerTrp: 0.673 ± 0.092
SerTyr: 2.21 ± 0.177
SerXaa: 0.0 ± 0.0
Thr
ThrAla: 2.968 ± 0.207
ThrCys: 0.363 ± 0.066
ThrAsp: 2.509 ± 0.17
ThrGlu: 3.908 ± 0.211
ThrPhe: 2.007 ± 0.157
ThrGly: 4.143 ± 0.236
ThrHis: 0.812 ± 0.104
ThrIle: 3.396 ± 0.189
ThrLys: 3.182 ± 0.197
ThrLeu: 4.367 ± 0.214
ThrMet: 0.833 ± 0.108
ThrAsn: 1.869 ± 0.159
ThrPro: 2.317 ± 0.148
ThrGln: 1.143 ± 0.111
ThrArg: 2.168 ± 0.139
ThrSer: 3.321 ± 0.177
ThrThr: 2.573 ± 0.174
ThrVal: 3.374 ± 0.189
ThrTrp: 0.374 ± 0.066
ThrTyr: 1.42 ± 0.124
ThrXaa: 0.0 ± 0.0
Val
ValAla: 3.994 ± 0.223
ValCys: 0.758 ± 0.084
ValAsp: 3.823 ± 0.227
ValGlu: 6.556 ± 0.303
ValPhe: 2.766 ± 0.202
ValGly: 4.912 ± 0.277
ValHis: 1.313 ± 0.115
ValIle: 4.901 ± 0.274
ValLys: 4.794 ± 0.238
ValLeu: 6.086 ± 0.249
ValMet: 1.431 ± 0.125
ValAsn: 2.36 ± 0.147
ValPro: 3.022 ± 0.156
ValGln: 1.623 ± 0.137
ValArg: 4.068 ± 0.237
ValSer: 5.051 ± 0.242
ValThr: 3.278 ± 0.186
ValVal: 5.019 ± 0.283
ValTrp: 0.726 ± 0.1
ValTyr: 1.815 ± 0.133
ValXaa: 0.0 ± 0.0
Trp
TrpAla: 0.598 ± 0.085
TrpCys: 0.107 ± 0.035
TrpAsp: 0.662 ± 0.089
TrpGlu: 0.886 ± 0.099
TrpPhe: 0.352 ± 0.063
TrpGly: 0.63 ± 0.081
TrpHis: 0.139 ± 0.04
TrpIle: 0.854 ± 0.102
TrpLys: 0.993 ± 0.098
TrpLeu: 0.844 ± 0.105
TrpMet: 0.182 ± 0.047
TrpAsn: 0.513 ± 0.069
TrpPro: 0.267 ± 0.057
TrpGln: 0.374 ± 0.058
TrpArg: 0.726 ± 0.09
TrpSer: 0.694 ± 0.088
TrpThr: 0.438 ± 0.065
TrpVal: 0.673 ± 0.079
TrpTrp: 0.171 ± 0.042
TrpTyr: 0.513 ± 0.081
TrpXaa: 0.0 ± 0.0
Tyr
TyrAla: 1.698 ± 0.141
TyrCys: 0.246 ± 0.053
TyrAsp: 1.73 ± 0.144
TyrGlu: 2.669 ± 0.173
TyrPhe: 1.538 ± 0.137
TyrGly: 2.242 ± 0.155
TyrHis: 0.737 ± 0.094
TyrIle: 1.644 ± 0.151
TyrLys: 1.89 ± 0.175
TyrLeu: 3.054 ± 0.214
TyrMet: 0.641 ± 0.093
TyrAsn: 1.484 ± 0.164
TyrPro: 1.409 ± 0.145
TyrGln: 0.897 ± 0.104
TyrArg: 1.922 ± 0.153
TyrSer: 2.306 ± 0.217
TyrThr: 1.431 ± 0.142
TyrVal: 1.997 ± 0.163
TyrTrp: 0.331 ± 0.057
TyrTyr: 1.207 ± 0.157
TyrXaa: 0.0 ± 0.0
Xaa
XaaAla: 0.0 ± 0.0
XaaCys: 0.0 ± 0.0
XaaAsp: 0.0 ± 0.0
XaaGlu: 0.0 ± 0.0
XaaPhe: 0.0 ± 0.0
XaaGly: 0.0 ± 0.0
XaaHis: 0.0 ± 0.0
XaaIle: 0.0 ± 0.0
XaaLys: 0.0 ± 0.0
XaaLeu: 0.0 ± 0.0
XaaMet: 0.0 ± 0.0
XaaAsn: 0.0 ± 0.0
XaaPro: 0.0 ± 0.0
XaaGln: 0.0 ± 0.0
XaaArg: 0.0 ± 0.0
XaaSer: 0.0 ± 0.0
XaaThr: 0.0 ± 0.0
XaaVal: 0.0 ± 0.0
XaaTrp: 0.0 ± 0.0
XaaTyr: 0.0 ± 0.0
XaaXaa: 0.0 ± 0.0
Statistics based on 356 proteins (93652 amino acids)

Note: The error was estimated by bootstrapping (100 resamples) at the protein level.
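A protein-level bootstrap of this kind could look roughly like the Python sketch below: whole proteins are resampled with replacement, the dipeptide frequency is recomputed for each resample, and the spread of the resampled values is reported. The function names, the fixed seed, and the use of the sample standard deviation as the reported error are assumptions for illustration; the exact procedure behind the ± values is not described here.

```python
import random
import statistics
from collections import Counter


def pair_permille(proteins, pair):
    """Frequency (per mille) of one dipeptide, e.g. 'AA', over all proteins."""
    counts = Counter()
    for seq in proteins:
        counts.update(a + b for a, b in zip(seq, seq[1:]))
    total = sum(counts.values())
    return 1000.0 * counts[pair] / total if total else 0.0


def bootstrap_error(proteins, pair, n_resamples=100, seed=0):
    """Spread of the frequency across protein-level bootstrap resamples."""
    rng = random.Random(seed)  # fixed seed only to make the sketch repeatable
    estimates = []
    for _ in range(n_resamples):
        # Draw as many proteins as in the original set, with replacement.
        sample = rng.choices(proteins, k=len(proteins))
        estimates.append(pair_permille(sample, pair))
    # Assumption: the error is taken as the sample standard deviation.
    return statistics.stdev(estimates)


toy_proteome = ["AAAA", "CCCC", "ACAC", "AACC"]  # hypothetical sequences
print(bootstrap_error(toy_proteome, "AA"))
```

Resampling whole proteins, rather than individual dipeptides, keeps the pairs from one protein together, so the estimate reflects variation between proteins.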

The dipeptide statistics above (along with other statistics for this proteome) can be downloaded from this CSV file
See this proteome in: uniprot_link
Proteome-pI is available under a Creative Commons Attribution-NoDerivs license; for more details see here

Reference: Kozlowski LP. Proteome-pI 2.0: Proteome Isoelectric Point Database Update. Nucleic Acids Res. 2021, doi: 10.1093/nar/gkab944
Contact: Lukasz P. Kozlowski