Amino acid dipeptide frequency for candidate division MSBL1 archaeon SCGC-AAA259D18

Apart from single amino acid frequencies, one can also calculate the so-called amino acid dipeptide frequency.

There are 400 possible dipeptides (441 if Xaa is counted as an additional 21st amino acid representing all non-standard or unknown amino acids). Thus, if dipeptides occurred randomly in proteins, each amino acid dipeptide would be present at a frequency of 0.25% (2.5‰). As this is not the case in nature, for better visibility all overrepresented dipeptides are marked in red and all underrepresented ones in blue in the table.
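The counting described above can be sketched in Python. This is a minimal illustration, not the code used by Proteome-pI: the function name `dipeptide_frequencies`, the one-letter alphabet, the mapping of every non-standard residue to `X`, and the normalization by the number of overlapping pairs are all assumptions, and the toy sequences are made up.

```python
from collections import Counter
from itertools import product

# The 20 standard amino acids in one-letter code; anything else becomes "X" (Xaa).
STANDARD = "ACDEFGHIKLMNPQRSTVWY"


def dipeptide_frequencies(proteins):
    """Return {dipeptide: frequency in per mille} for all 441 possible pairs.

    Assumption: frequencies are normalized by the total number of
    overlapping pairs; the database may normalize differently.
    """
    counts = Counter()
    for seq in proteins:
        # Map non-standard or unknown residues to X.
        cleaned = "".join(aa if aa in STANDARD else "X" for aa in seq.upper())
        # Count overlapping pairs inside one protein; pairs never span two proteins.
        counts.update(cleaned[i:i + 2] for i in range(len(cleaned) - 1))
    total = sum(counts.values())
    alphabet = STANDARD + "X"
    return {
        a + b: (1000.0 * counts[a + b] / total if total else 0.0)
        for a, b in product(alphabet, repeat=2)
    }


# Hypothetical toy input, not taken from this proteome.
toy = ["MKEEL", "MEEK"]
freqs = dipeptide_frequencies(toy)

# Expected frequency if all 400 standard dipeptides were equally likely.
uniform = 1000.0 / 400
```

Here `freqs` has 441 entries that sum to 1000‰, and `uniform` is the 2.5‰ baseline against which over- and underrepresentation can be judged.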

All values are given in per mille (‰), so they must be multiplied by 10⁻³ to obtain fractions.
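As a small worked example of the unit conversion, the sketch below turns two table values into fractions and compares them with the uniform expectation of 1/400. The helper names `per_mille_to_fraction` and `classify` are hypothetical, and using the uniform 2.5‰ baseline as the threshold for over- and underrepresentation is an assumption; the page may use a different expectation for its colouring.

```python
UNIFORM_PER_MILLE = 1000.0 / 400  # 2.5 per mille for 400 equally likely dipeptides


def per_mille_to_fraction(value):
    """Convert a per mille value to a plain fraction (multiply by 10^-3)."""
    return value * 1e-3


def classify(value, expected=UNIFORM_PER_MILLE):
    """Label a per mille value relative to the assumed uniform expectation."""
    if value > expected:
        return "over"
    if value < expected:
        return "under"
    return "as expected"


glu_glu = 12.536  # GluGlu value from the table, in per mille
cys_trp = 0.102   # CysTrp value from the table, in per mille

glu_glu_fraction = per_mille_to_fraction(glu_glu)  # about 0.012536
glu_glu_ratio = glu_glu / UNIFORM_PER_MILLE        # about 5 times the uniform value
labels = (classify(glu_glu), classify(cys_trp))
```

Under this assumption GluGlu is labelled "over" and CysTrp "under".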

For more information, see the sequence space article on Wikipedia.

Rows: first residue of the dipeptide; columns: second residue. Values in ‰ (frequency ± error).

| | Ala | Cys | Asp | Glu | Phe | Gly | His | Ile | Lys | Leu | Met | Asn | Pro | Gln | Arg | Ser | Thr | Val | Trp | Tyr | Xaa |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Ala | 4.408 ± 0.296 | 0.69 ± 0.103 | 3.368 ± 0.201 | 5.855 ± 0.298 | 2.306 ± 0.141 | 5.143 ± 0.257 | 1.198 ± 0.118 | 4.567 ± 0.246 | 3.787 ± 0.204 | 6.206 ± 0.311 | 1.39 ± 0.175 | 1.707 ± 0.144 | 2.34 ± 0.166 | 1.515 ± 0.124 | 4.047 ± 0.241 | 4.137 ± 0.244 | 2.792 ± 0.163 | 4.894 ± 0.244 | 0.52 ± 0.082 | 1.741 ± 0.167 | 0.0 ± 0.0 |
| Cys | 0.61 ± 0.078 | 0.192 ± 0.06 | 0.61 ± 0.1 | 1.017 ± 0.098 | 0.373 ± 0.063 | 1.153 ± 0.142 | 0.158 ± 0.036 | 0.463 ± 0.085 | 0.486 ± 0.092 | 0.769 ± 0.1 | 0.192 ± 0.045 | 0.249 ± 0.055 | 0.656 ± 0.092 | 0.203 ± 0.047 | 0.599 ± 0.093 | 0.531 ± 0.079 | 0.418 ± 0.06 | 0.543 ± 0.08 | 0.102 ± 0.033 | 0.237 ± 0.047 | 0.0 ± 0.0 |
| Asp | 2.781 ± 0.169 | 0.576 ± 0.087 | 2.589 ± 0.165 | 5.369 ± 0.253 | 2.555 ± 0.165 | 3.832 ± 0.187 | 0.983 ± 0.118 | 3.775 ± 0.21 | 3.075 ± 0.238 | 6.76 ± 0.269 | 1.221 ± 0.122 | 1.628 ± 0.157 | 2.849 ± 0.176 | 1.402 ± 0.127 | 3.357 ± 0.192 | 3.504 ± 0.244 | 1.876 ± 0.175 | 4.634 ± 0.207 | 0.927 ± 0.123 | 2.035 ± 0.179 | 0.0 ± 0.0 |
| Glu | 6.138 ± 0.297 | 0.712 ± 0.093 | 5.55 ± 0.291 | 12.536 ± 0.563 | 3.154 ± 0.184 | 7.121 ± 0.33 | 1.187 ± 0.112 | 7.359 ± 0.324 | 9.574 ± 0.419 | 8.353 ± 0.331 | 2.17 ± 0.163 | 5.607 ± 0.371 | 3.018 ± 0.191 | 1.176 ± 0.109 | 5.641 ± 0.3 | 5.143 ± 0.226 | 4.295 ± 0.236 | 7.065 ± 0.277 | 1.164 ± 0.129 | 2.43 ± 0.18 | 0.0 ± 0.0 |
| Phe | 2.385 ± 0.186 | 0.441 ± 0.072 | 2.475 ± 0.171 | 3.459 ± 0.209 | 1.876 ± 0.173 | 3.685 ± 0.195 | 0.701 ± 0.097 | 2.057 ± 0.14 | 2.306 ± 0.174 | 4.035 ± 0.278 | 0.859 ± 0.102 | 0.916 ± 0.103 | 1.583 ± 0.147 | 1.108 ± 0.112 | 2.046 ± 0.172 | 3.244 ± 0.189 | 1.763 ± 0.135 | 2.928 ± 0.225 | 0.565 ± 0.081 | 1.176 ± 0.115 | 0.0 ± 0.0 |
| Gly | 5.335 ± 0.29 | 0.803 ± 0.101 | 4.137 ± 0.211 | 7.686 ± 0.274 | 3.255 ± 0.225 | 6.353 ± 0.343 | 1.39 ± 0.129 | 5.516 ± 0.298 | 5.697 ± 0.267 | 6.714 ± 0.356 | 2.091 ± 0.151 | 2.566 ± 0.183 | 2.882 ± 0.204 | 1.413 ± 0.12 | 4.668 ± 0.24 | 5.652 ± 0.303 | 3.775 ± 0.198 | 5.821 ± 0.313 | 1.006 ± 0.124 | 2.408 ± 0.164 | 0.0 ± 0.0 |
| His | 0.995 ± 0.097 | 0.35 ± 0.064 | 0.78 ± 0.09 | 1.469 ± 0.13 | 0.927 ± 0.114 | 1.469 ± 0.125 | 0.339 ± 0.067 | 1.04 ± 0.098 | 0.554 ± 0.081 | 1.82 ± 0.152 | 0.362 ± 0.054 | 0.441 ± 0.078 | 1.243 ± 0.117 | 0.328 ± 0.068 | 1.04 ± 0.104 | 1.198 ± 0.137 | 0.825 ± 0.104 | 1.255 ± 0.114 | 0.215 ± 0.045 | 0.509 ± 0.08 | 0.0 ± 0.0 |
| Ile | 4.725 ± 0.262 | 0.633 ± 0.075 | 4.103 ± 0.197 | 6.183 ± 0.276 | 2.882 ± 0.226 | 5.652 ± 0.324 | 1.39 ± 0.141 | 4.013 ± 0.301 | 4.013 ± 0.226 | 6.296 ± 0.306 | 1.176 ± 0.116 | 2.159 ± 0.165 | 3.402 ± 0.186 | 1.605 ± 0.134 | 3.12 ± 0.216 | 5.403 ± 0.275 | 3.244 ± 0.194 | 4.827 ± 0.231 | 0.599 ± 0.087 | 1.842 ± 0.157 | 0.0 ± 0.0 |
| Lys | 4.759 ± 0.315 | 0.543 ± 0.08 | 3.662 ± 0.272 | 6.94 ± 0.34 | 2.758 ± 0.197 | 4.578 ± 0.251 | 0.983 ± 0.086 | 5.72 ± 0.292 | 6.036 ± 0.298 | 6.014 ± 0.277 | 1.583 ± 0.126 | 3.007 ± 0.211 | 2.306 ± 0.176 | 1.243 ± 0.123 | 3.628 ± 0.232 | 4.228 ± 0.217 | 3.719 ± 0.217 | 5.008 ± 0.242 | 0.791 ± 0.096 | 2.08 ± 0.141 | 0.0 ± 0.0 |
| Leu | 6.579 ± 0.318 | 0.746 ± 0.105 | 5.81 ± 0.278 | 9.585 ± 0.401 | 3.38 ± 0.27 | 7.754 ± 0.398 | 1.492 ± 0.113 | 5.584 ± 0.311 | 6.104 ± 0.292 | 8.014 ± 0.425 | 1.967 ± 0.156 | 2.973 ± 0.165 | 3.855 ± 0.204 | 1.956 ± 0.155 | 5.245 ± 0.259 | 6.997 ± 0.304 | 4.431 ± 0.224 | 6.228 ± 0.255 | 1.063 ± 0.114 | 2.204 ± 0.151 | 0.0 ± 0.0 |
| Met | 1.526 ± 0.137 | 0.102 ± 0.031 | 1.436 ± 0.127 | 2.114 ± 0.16 | 0.644 ± 0.09 | 1.899 ± 0.148 | 0.373 ± 0.068 | 1.696 ± 0.132 | 1.831 ± 0.135 | 1.662 ± 0.152 | 0.452 ± 0.075 | 0.995 ± 0.096 | 0.859 ± 0.108 | 0.554 ± 0.081 | 1.142 ± 0.108 | 1.3 ± 0.112 | 1.345 ± 0.118 | 1.865 ± 0.15 | 0.158 ± 0.038 | 0.486 ± 0.077 | 0.0 ± 0.0 |
| Asn | 1.696 ± 0.14 | 0.486 ± 0.053 | 1.515 ± 0.147 | 2.362 ± 0.159 | 2.102 ± 0.147 | 2.442 ± 0.166 | 0.678 ± 0.089 | 2.317 ± 0.19 | 1.775 ± 0.161 | 4.115 ± 0.241 | 0.893 ± 0.121 | 1.063 ± 0.125 | 2.238 ± 0.169 | 0.916 ± 0.11 | 2.216 ± 0.183 | 2.295 ± 0.181 | 1.628 ± 0.152 | 2.295 ± 0.169 | 0.61 ± 0.081 | 1.424 ± 0.148 | 0.0 ± 0.0 |
| Pro | 2.464 ± 0.174 | 0.384 ± 0.07 | 2.803 ± 0.199 | 4.781 ± 0.257 | 1.549 ± 0.141 | 3.357 ± 0.202 | 0.87 ± 0.091 | 2.419 ± 0.166 | 2.69 ± 0.172 | 3.357 ± 0.214 | 0.87 ± 0.102 | 1.549 ± 0.168 | 2.08 ± 0.16 | 1.243 ± 0.113 | 1.865 ± 0.148 | 3.041 ± 0.17 | 2.204 ± 0.158 | 2.849 ± 0.182 | 0.576 ± 0.08 | 1.368 ± 0.152 | 0.0 ± 0.0 |
| Gln | 1.673 ± 0.13 | 0.192 ± 0.049 | 1.187 ± 0.117 | 2.012 ± 0.174 | 0.791 ± 0.084 | 1.356 ± 0.121 | 0.373 ± 0.059 | 2.012 ± 0.146 | 1.922 ± 0.151 | 1.831 ± 0.142 | 0.543 ± 0.075 | 0.972 ± 0.107 | 0.735 ± 0.096 | 0.407 ± 0.07 | 1.458 ± 0.116 | 1.017 ± 0.118 | 1.096 ± 0.101 | 1.763 ± 0.146 | 0.147 ± 0.042 | 0.509 ± 0.068 | 0.0 ± 0.0 |
| Arg | 3.391 ± 0.21 | 0.384 ± 0.072 | 2.882 ± 0.189 | 6.567 ± 0.274 | 1.752 ± 0.153 | 4.386 ± 0.233 | 0.791 ± 0.092 | 3.979 ± 0.22 | 5.358 ± 0.306 | 4.465 ± 0.242 | 1.549 ± 0.112 | 2.227 ± 0.17 | 2.046 ± 0.164 | 1.187 ± 0.113 | 4.001 ± 0.265 | 3.493 ± 0.207 | 2.871 ± 0.18 | 3.866 ± 0.238 | 0.656 ± 0.087 | 1.673 ± 0.127 | 0.0 ± 0.0 |
| Ser | 3.9 ± 0.229 | 0.803 ± 0.108 | 3.493 ± 0.164 | 6.387 ± 0.289 | 2.984 ± 0.206 | 5.573 ± 0.267 | 1.243 ± 0.123 | 4.499 ± 0.224 | 4.261 ± 0.257 | 6.375 ± 0.273 | 1.65 ± 0.155 | 2.046 ± 0.155 | 3.041 ± 0.197 | 1.876 ± 0.152 | 4.047 ± 0.237 | 4.951 ± 0.248 | 2.939 ± 0.18 | 4.634 ± 0.278 | 0.893 ± 0.122 | 1.978 ± 0.166 | 0.0 ± 0.0 |
| Thr | 3.097 ± 0.176 | 0.565 ± 0.078 | 2.509 ± 0.188 | 4.024 ± 0.197 | 1.82 ± 0.14 | 4.239 ± 0.219 | 0.859 ± 0.088 | 3.165 ± 0.189 | 2.385 ± 0.157 | 4.42 ± 0.246 | 1.029 ± 0.125 | 1.503 ± 0.153 | 2.238 ± 0.152 | 1.108 ± 0.119 | 2.566 ± 0.204 | 3.346 ± 0.173 | 2.498 ± 0.231 | 4.069 ± 0.277 | 0.735 ± 0.099 | 1.594 ± 0.137 | 0.0 ± 0.0 |
| Val | 4.284 ± 0.236 | 0.52 ± 0.068 | 4.578 ± 0.233 | 7.155 ± 0.331 | 2.679 ± 0.203 | 5.821 ± 0.307 | 1.198 ± 0.113 | 4.465 ± 0.23 | 5.222 ± 0.318 | 6.974 ± 0.265 | 1.503 ± 0.126 | 2.295 ± 0.154 | 3.086 ± 0.181 | 1.684 ± 0.124 | 3.9 ± 0.197 | 5.166 ± 0.244 | 3.809 ± 0.217 | 5.177 ± 0.273 | 0.723 ± 0.086 | 1.842 ± 0.149 | 0.0 ± 0.0 |
| Trp | 0.52 ± 0.075 | 0.192 ± 0.045 | 0.441 ± 0.079 | 0.961 ± 0.114 | 0.588 ± 0.108 | 0.904 ± 0.098 | 0.237 ± 0.054 | 1.209 ± 0.119 | 0.995 ± 0.09 | 0.938 ± 0.098 | 0.407 ± 0.07 | 0.441 ± 0.069 | 0.305 ± 0.06 | 0.317 ± 0.069 | 0.893 ± 0.125 | 0.825 ± 0.098 | 0.803 ± 0.136 | 0.746 ± 0.084 | 0.181 ± 0.044 | 0.283 ± 0.056 | 0.0 ± 0.0 |
| Tyr | 1.345 ± 0.123 | 0.249 ± 0.058 | 1.797 ± 0.135 | 2.928 ± 0.203 | 1.232 ± 0.132 | 2.453 ± 0.151 | 0.712 ± 0.091 | 1.413 ± 0.118 | 1.515 ± 0.124 | 2.849 ± 0.18 | 0.531 ± 0.079 | 0.723 ± 0.1 | 1.56 ± 0.108 | 0.757 ± 0.095 | 2.046 ± 0.183 | 2.102 ± 0.165 | 1.436 ± 0.108 | 1.616 ± 0.11 | 0.554 ± 0.07 | 1.176 ± 0.117 | 0.0 ± 0.0 |
| Xaa | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 |
Statistics based on 389 proteins (88468 amino acids)

Note: The error was estimated by bootstrapping (×100) at the protein level.
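One way to read "bootstrapping at the protein level" is to resample whole proteins with replacement 100 times and take the spread of the resulting frequencies. The sketch below illustrates that reading only; the exact procedure used for this page is not stated, so the function names, the use of the sample standard deviation as the error, the fixed seed, and the toy sequences are all assumptions.

```python
import random
import statistics


def dipeptide_per_mille(proteins, dipeptide):
    """Frequency of one dipeptide in per mille over all overlapping pairs."""
    hits = 0
    total = 0
    for seq in proteins:
        for i in range(len(seq) - 1):
            total += 1
            if seq[i:i + 2] == dipeptide:
                hits += 1
    return 1000.0 * hits / total if total else 0.0


def bootstrap_error(proteins, dipeptide, n_resamples=100, seed=0):
    """Standard deviation of the frequency over protein-level resamples."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    estimates = []
    for _ in range(n_resamples):
        # Draw as many proteins as in the original set, with replacement.
        sample = rng.choices(proteins, k=len(proteins))
        estimates.append(dipeptide_per_mille(sample, dipeptide))
    return statistics.stdev(estimates)


# Hypothetical toy proteome in one-letter code, not taken from this page.
toy = ["MKEEL", "MEEK", "MAAL", "MKKEE"]
err = bootstrap_error(toy, "EE")
```

Resampling proteins rather than individual residues keeps each sequence intact, so dipeptides are never formed across protein boundaries.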

The dipeptide statistics above (among other statistics for this proteome) can be downloaded from this CSV file
See this proteome in: uniprot_link
Proteome-pI is available under the Creative Commons Attribution-NoDerivs license; for more details see here

Reference: Kozlowski LP. Proteome-pI 2.0: Proteome Isoelectric Point Database Update. Nucleic Acids Res. 2021, doi: 10.1093/nar/gkab944
Contact: Lukasz P. Kozlowski