Amino acid dipeptide frequency for candidate division MSBL1 archaeon SCGC-AAA261F17

Apart from single amino acid frequencies, one can also calculate the so-called amino acid dipeptide frequency.

There are 400 possibilities (441 if we consider Xaa as an additional 21st amino acid representing all non-standard or unknown amino acids). Thus, if dipeptides occurred randomly in proteins, each amino acid dipeptide would be present with a frequency of 0.25% (2.5‰). As this is not the case in nature, for better visibility all dipeptides that are more frequent than expected are marked in red, and those that are underrepresented are marked in blue in the table.
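
As a rough illustration of the calculation described above, the following Python sketch counts overlapping residue pairs within each protein and reports them in per mille. The function name `dipeptide_per_mille`, the toy sequences, and the choice to normalise by the total number of dipeptides are assumptions made for this example; the exact normalisation used by the database is not stated on this page.

```python
from collections import Counter

# The 20 standard amino acids in one-letter code; anything else is treated as Xaa ("X").
STANDARD = set("ACDEFGHIKLMNPQRSTVWY")


def dipeptide_per_mille(proteins):
    """Return {dipeptide: frequency in per mille} over overlapping residue pairs."""
    counts = Counter()
    for seq in proteins:
        # Map non-standard or unknown residues to "X" (Xaa).
        residues = [aa if aa in STANDARD else "X" for aa in seq.upper()]
        # Overlapping windows of length 2; pairs never span two proteins.
        counts.update(a + b for a, b in zip(residues, residues[1:]))
    total = sum(counts.values())
    if total == 0:
        return {}
    # Assumption: normalise by the total number of dipeptides observed.
    return {pair: 1000.0 * n / total for pair, n in counts.items()}


# Made-up sequences for illustration only (not taken from this proteome).
toy_proteome = ["MEELLKK", "MAEEL"]
freqs = dipeptide_per_mille(toy_proteome)

# Uniform expectation over the 400 standard dipeptides: 2.5 per mille (0.25%).
uniform = 1000.0 / 400
```

With real data, `freqs` would be compared against `uniform` to see which dipeptides are over- or underrepresented.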

All values are presented in per mille (‰) and therefore need to be multiplied by 10⁻³ to obtain fractions.
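
The unit conversion and the comparison with the uniform expectation can be sketched in a few lines of Python. The two values used (GluGlu and CysCys) are taken from the table on this page; the helper names and the assumption that the red/blue marking uses exactly the uniform 2.5‰ baseline are illustrative only.

```python
# Uniform expectation over 400 dipeptides, in per mille (equals 0.25%).
UNIFORM_PER_MILLE = 1000.0 / 400


def per_mille_to_fraction(value):
    """Convert a per mille value to a plain fraction (multiply by 10^-3)."""
    return value * 1e-3


def fold_vs_uniform(value, baseline=UNIFORM_PER_MILLE):
    """Ratio of an observed per mille value to the uniform baseline."""
    return value / baseline


def classify(value, baseline=UNIFORM_PER_MILLE):
    """Label a value relative to the baseline (assumed colouring rule)."""
    if value > baseline:
        return "over"   # would be marked in red
    if value < baseline:
        return "under"  # would be marked in blue
    return "expected"


glu_glu = 11.242  # GluGlu, per mille, from the table
cys_cys = 0.089   # CysCys, per mille, from the table

glu_glu_fraction = per_mille_to_fraction(glu_glu)
glu_glu_fold = fold_vs_uniform(glu_glu)
labels = (classify(glu_glu), classify(cys_cys))
```

Under these assumptions GluGlu is roughly 4.5 times more frequent than the uniform expectation, while CysCys falls far below it.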

For more information, see the sequence space article on Wikipedia.

Ala Cys Asp Glu Phe Gly His Ile Lys Leu Met Asn Pro Gln Arg Ser Thr Val Trp Tyr Xaa
Ala
AlaAla: 5.361 ± 0.288
AlaCys: 0.678 ± 0.065
AlaAsp: 3.671 ± 0.209
AlaGlu: 6.023 ± 0.251
AlaPhe: 2.569 ± 0.158
AlaGly: 5.502 ± 0.296
AlaHis: 1.452 ± 0.107
AlaIle: 5.189 ± 0.227
AlaLys: 4.973 ± 0.21
AlaLeu: 6.879 ± 0.281
AlaMet: 1.444 ± 0.116
AlaAsn: 2.033 ± 0.109
AlaPro: 2.368 ± 0.149
AlaGln: 1.571 ± 0.094
AlaArg: 4.75 ± 0.182
AlaSer: 4.132 ± 0.174
AlaThr: 3.246 ± 0.212
AlaVal: 4.556 ± 0.222
AlaTrp: 0.812 ± 0.072
AlaTyr: 1.824 ± 0.116
AlaXaa: 0.0 ± 0.0
Cys
CysAla: 0.491 ± 0.071
CysCys: 0.089 ± 0.028
CysAsp: 0.558 ± 0.057
CysGlu: 0.782 ± 0.081
CysPhe: 0.29 ± 0.046
CysGly: 1.102 ± 0.106
CysHis: 0.201 ± 0.037
CysIle: 0.596 ± 0.076
CysLys: 0.581 ± 0.065
CysLeu: 0.737 ± 0.069
CysMet: 0.179 ± 0.039
CysAsn: 0.268 ± 0.041
CysPro: 0.663 ± 0.087
CysGln: 0.275 ± 0.048
CysArg: 0.536 ± 0.072
CysSer: 0.558 ± 0.078
CysThr: 0.484 ± 0.077
CysVal: 0.573 ± 0.063
CysTrp: 0.149 ± 0.028
CysTyr: 0.275 ± 0.049
CysXaa: 0.0 ± 0.0
Asp
AspAla: 3.626 ± 0.199
AspCys: 0.581 ± 0.066
AspAsp: 2.658 ± 0.15
AspGlu: 5.316 ± 0.205
AspPhe: 2.502 ± 0.133
AspGly: 3.73 ± 0.164
AspHis: 0.968 ± 0.095
AspIle: 3.886 ± 0.179
AspLys: 3.291 ± 0.173
AspLeu: 6.291 ± 0.238
AspMet: 1.236 ± 0.094
AspAsn: 1.586 ± 0.187
AspPro: 2.889 ± 0.159
AspGln: 1.392 ± 0.104
AspArg: 2.919 ± 0.133
AspSer: 2.971 ± 0.144
AspThr: 2.375 ± 0.16
AspVal: 4.571 ± 0.181
AspTrp: 0.812 ± 0.079
AspTyr: 2.234 ± 0.15
AspXaa: 0.0 ± 0.0
Glu
GluAla: 6.254 ± 0.258
GluCys: 0.767 ± 0.077
GluAsp: 5.882 ± 0.192
GluGlu: 11.242 ± 0.462
GluPhe: 3.276 ± 0.164
GluGly: 5.785 ± 0.209
GluHis: 1.251 ± 0.093
GluIle: 7.244 ± 0.254
GluLys: 8.48 ± 0.286
GluLeu: 9.299 ± 0.349
GluMet: 2.114 ± 0.115
GluAsn: 4.594 ± 0.268
GluPro: 2.725 ± 0.163
GluGln: 1.809 ± 0.121
GluArg: 5.83 ± 0.22
GluSer: 4.653 ± 0.19
GluThr: 3.968 ± 0.157
GluVal: 7.177 ± 0.265
GluTrp: 0.849 ± 0.087
GluTyr: 2.39 ± 0.153
GluXaa: 0.0 ± 0.0
Phe
PheAla: 2.39 ± 0.127
PheCys: 0.372 ± 0.053
PheAsp: 2.114 ± 0.12
PheGlu: 2.926 ± 0.158
PhePhe: 1.504 ± 0.13
PheGly: 3.157 ± 0.166
PheHis: 0.64 ± 0.064
PheIle: 2.196 ± 0.137
PheLys: 2.219 ± 0.111
PheLeu: 3.477 ± 0.199
PheMet: 0.879 ± 0.085
PheAsn: 1.117 ± 0.098
PhePro: 1.266 ± 0.102
PheGln: 0.99 ± 0.072
PheArg: 1.921 ± 0.113
PheSer: 2.457 ± 0.147
PheThr: 2.025 ± 0.142
PheVal: 2.07 ± 0.125
PheTrp: 0.462 ± 0.06
PheTyr: 1.147 ± 0.091
PheXaa: 0.0 ± 0.0
Gly
GlyAla: 5.361 ± 0.242
GlyCys: 0.67 ± 0.071
GlyAsp: 3.872 ± 0.178
GlyGlu: 7.43 ± 0.267
GlyPhe: 2.993 ± 0.155
GlyGly: 5.785 ± 0.296
GlyHis: 1.392 ± 0.113
GlyIle: 5.294 ± 0.197
GlyLys: 5.904 ± 0.192
GlyLeu: 6.567 ± 0.258
GlyMet: 1.735 ± 0.14
GlyAsn: 2.375 ± 0.15
GlyPro: 2.234 ± 0.148
GlyGln: 1.631 ± 0.112
GlyArg: 4.14 ± 0.187
GlySer: 4.348 ± 0.181
GlyThr: 3.886 ± 0.185
GlyVal: 6.135 ± 0.202
GlyTrp: 0.923 ± 0.076
GlyTyr: 2.315 ± 0.138
GlyXaa: 0.0 ± 0.0
His
HisAla: 1.295 ± 0.113
HisCys: 0.261 ± 0.042
HisAsp: 0.826 ± 0.084
HisGlu: 1.683 ± 0.124
HisPhe: 0.804 ± 0.075
HisGly: 1.429 ± 0.101
HisHis: 0.462 ± 0.069
HisIle: 1.147 ± 0.1
HisLys: 0.886 ± 0.081
HisLeu: 1.899 ± 0.129
HisMet: 0.439 ± 0.055
HisAsn: 0.476 ± 0.059
HisPro: 1.266 ± 0.087
HisGln: 0.618 ± 0.071
HisArg: 0.879 ± 0.099
HisSer: 1.042 ± 0.087
HisThr: 0.856 ± 0.079
HisVal: 1.392 ± 0.106
HisTrp: 0.261 ± 0.044
HisTyr: 0.544 ± 0.057
HisXaa: 0.0 ± 0.0
Ile
IleAla: 5.122 ± 0.25
IleCys: 0.581 ± 0.066
IleAsp: 4.274 ± 0.187
IleGlu: 6.247 ± 0.199
IlePhe: 2.472 ± 0.141
IleGly: 5.599 ± 0.215
IleHis: 1.437 ± 0.104
IleIle: 4.698 ± 0.224
IleLys: 4.4 ± 0.205
IleLeu: 6.798 ± 0.284
IleMet: 1.318 ± 0.107
IleAsn: 2.129 ± 0.13
IlePro: 3.291 ± 0.161
IleGln: 1.995 ± 0.129
IleArg: 3.946 ± 0.197
IleSer: 4.758 ± 0.187
IleThr: 3.849 ± 0.168
IleVal: 4.355 ± 0.182
IleTrp: 0.692 ± 0.075
IleTyr: 1.943 ± 0.13
IleXaa: 0.0 ± 0.0
Lys
LysAla: 5.107 ± 0.235
LysCys: 0.499 ± 0.07
LysAsp: 3.797 ± 0.182
LysGlu: 6.865 ± 0.276
LysPhe: 2.591 ± 0.142
LysGly: 4.564 ± 0.204
LysHis: 1.318 ± 0.111
LysIle: 6.462 ± 0.235
LysLys: 6.418 ± 0.316
LysLeu: 7.468 ± 0.251
LysMet: 1.705 ± 0.122
LysAsn: 2.688 ± 0.146
LysPro: 2.613 ± 0.146
LysGln: 1.586 ± 0.098
LysArg: 4.46 ± 0.228
LysSer: 4.221 ± 0.192
LysThr: 3.983 ± 0.185
LysVal: 4.966 ± 0.196
LysTrp: 0.849 ± 0.07
LysTyr: 2.055 ± 0.111
LysXaa: 0.0 ± 0.0
Leu
LeuAla: 7.267 ± 0.239
LeuCys: 0.812 ± 0.084
LeuAsp: 6.06 ± 0.204
LeuGlu: 9.671 ± 0.324
LeuPhe: 2.636 ± 0.154
LeuGly: 7.214 ± 0.287
LeuHis: 1.668 ± 0.124
LeuIle: 5.718 ± 0.236
LeuLys: 7.549 ± 0.259
LeuLeu: 7.49 ± 0.269
LeuMet: 2.003 ± 0.118
LeuAsn: 3.298 ± 0.163
LeuPro: 4.006 ± 0.157
LeuGln: 2.256 ± 0.149
LeuArg: 6.053 ± 0.24
LeuSer: 6.47 ± 0.223
LeuThr: 4.847 ± 0.184
LeuVal: 6.09 ± 0.253
LeuTrp: 0.908 ± 0.087
LeuTyr: 2.181 ± 0.124
LeuXaa: 0.0 ± 0.0
Met
MetAla: 1.437 ± 0.129
MetCys: 0.149 ± 0.033
MetAsp: 1.251 ± 0.099
MetGlu: 1.765 ± 0.113
MetPhe: 0.536 ± 0.063
MetGly: 1.608 ± 0.126
MetHis: 0.313 ± 0.044
MetIle: 1.608 ± 0.127
MetLys: 1.794 ± 0.118
MetLeu: 1.869 ± 0.145
MetMet: 0.7 ± 0.068
MetAsn: 0.826 ± 0.072
MetPro: 1.154 ± 0.097
MetGln: 0.469 ± 0.068
MetArg: 1.452 ± 0.104
MetSer: 1.474 ± 0.115
MetThr: 1.31 ± 0.09
MetVal: 1.422 ± 0.104
MetTrp: 0.201 ± 0.045
MetTyr: 0.447 ± 0.055
MetXaa: 0.0 ± 0.0
Asn
AsnAla: 2.323 ± 0.147
AsnCys: 0.462 ± 0.057
AsnAsp: 1.325 ± 0.106
AsnGlu: 2.494 ± 0.124
AsnPhe: 1.325 ± 0.11
AsnGly: 2.1 ± 0.153
AsnHis: 0.633 ± 0.067
AsnIle: 2.114 ± 0.145
AsnLys: 1.765 ± 0.116
AsnLeu: 4.154 ± 0.213
AsnMet: 0.678 ± 0.072
AsnAsn: 0.841 ± 0.077
AsnPro: 2.397 ± 0.148
AsnGln: 1.273 ± 0.094
AsnArg: 1.75 ± 0.116
AsnSer: 2.129 ± 0.138
AsnThr: 1.645 ± 0.145
AsnVal: 2.673 ± 0.149
AsnTrp: 0.588 ± 0.064
AsnTyr: 1.184 ± 0.12
AsnXaa: 0.0 ± 0.0
Pro
ProAla: 2.323 ± 0.127
ProCys: 0.402 ± 0.046
ProAsp: 2.569 ± 0.144
ProGlu: 4.661 ± 0.221
ProPhe: 1.407 ± 0.115
ProGly: 3.03 ± 0.153
ProHis: 0.938 ± 0.09
ProIle: 2.844 ± 0.154
ProLys: 2.636 ± 0.136
ProLeu: 3.522 ± 0.16
ProMet: 0.692 ± 0.064
ProAsn: 1.437 ± 0.105
ProPro: 1.995 ± 0.126
ProGln: 1.31 ± 0.092
ProArg: 2.092 ± 0.115
ProSer: 2.881 ± 0.146
ProThr: 2.472 ± 0.126
ProVal: 2.762 ± 0.161
ProTrp: 0.469 ± 0.061
ProTyr: 1.645 ± 0.112
ProXaa: 0.0 ± 0.0
Gln
GlnAla: 2.092 ± 0.13
GlnCys: 0.186 ± 0.042
GlnAsp: 1.377 ± 0.098
GlnGlu: 2.546 ± 0.149
GlnPhe: 0.752 ± 0.079
GlnGly: 1.742 ± 0.134
GlnHis: 0.454 ± 0.057
GlnIle: 2.003 ± 0.125
GlnLys: 1.98 ± 0.114
GlnLeu: 2.338 ± 0.127
GlnMet: 0.67 ± 0.065
GlnAsn: 0.759 ± 0.083
GlnPro: 0.856 ± 0.074
GlnGln: 0.692 ± 0.082
GlnArg: 1.415 ± 0.114
GlnSer: 1.266 ± 0.091
GlnThr: 1.139 ± 0.093
GlnVal: 2.01 ± 0.124
GlnTrp: 0.246 ± 0.043
GlnTyr: 0.745 ± 0.076
GlnXaa: 0.0 ± 0.0
Arg
ArgAla: 4.065 ± 0.181
ArgCys: 0.491 ± 0.064
ArgAsp: 3.127 ± 0.156
ArgGlu: 6.954 ± 0.27
ArgPhe: 1.794 ± 0.134
ArgGly: 4.661 ± 0.214
ArgHis: 0.871 ± 0.077
ArgIle: 3.976 ± 0.173
ArgLys: 5.405 ± 0.271
ArgLeu: 4.936 ± 0.206
ArgMet: 1.616 ± 0.1
ArgAsn: 1.809 ± 0.124
ArgPro: 1.906 ± 0.111
ArgGln: 1.385 ± 0.12
ArgArg: 3.715 ± 0.181
ArgSer: 2.718 ± 0.144
ArgThr: 2.725 ± 0.148
ArgVal: 4.326 ± 0.216
ArgTrp: 0.692 ± 0.077
ArgTyr: 1.541 ± 0.111
ArgXaa: 0.0 ± 0.0
Ser
SerAla: 4.02 ± 0.185
SerCys: 0.655 ± 0.074
SerAsp: 3.328 ± 0.189
SerGlu: 5.919 ± 0.208
SerPhe: 2.39 ± 0.148
SerGly: 5.063 ± 0.204
SerHis: 1.251 ± 0.102
SerIle: 3.797 ± 0.17
SerLys: 4.4 ± 0.175
SerLeu: 5.361 ± 0.181
SerMet: 1.191 ± 0.092
SerAsn: 1.854 ± 0.124
SerPro: 2.844 ± 0.147
SerGln: 1.943 ± 0.138
SerArg: 3.373 ± 0.171
SerSer: 3.946 ± 0.216
SerThr: 3.03 ± 0.156
SerVal: 3.782 ± 0.191
SerTrp: 0.812 ± 0.084
SerTyr: 1.593 ± 0.118
SerXaa: 0.0 ± 0.0
Thr
ThrAla: 3.574 ± 0.19
ThrCys: 0.491 ± 0.064
ThrAsp: 2.77 ± 0.169
ThrGlu: 3.432 ± 0.161
ThrPhe: 1.69 ± 0.127
ThrGly: 4.303 ± 0.193
ThrHis: 1.065 ± 0.087
ThrIle: 3.909 ± 0.183
ThrLys: 2.963 ± 0.126
ThrLeu: 4.884 ± 0.16
ThrMet: 0.99 ± 0.08
ThrAsn: 1.817 ± 0.109
ThrPro: 2.658 ± 0.131
ThrGln: 1.288 ± 0.099
ThrArg: 3.082 ± 0.169
ThrSer: 2.986 ± 0.158
ThrThr: 2.71 ± 0.151
ThrVal: 3.678 ± 0.215
ThrTrp: 0.558 ± 0.06
ThrTyr: 1.578 ± 0.109
ThrXaa: 0.0 ± 0.0
Val
ValAla: 4.452 ± 0.223
ValCys: 0.745 ± 0.073
ValAsp: 4.058 ± 0.178
ValGlu: 6.373 ± 0.192
ValPhe: 2.211 ± 0.142
ValGly: 5.45 ± 0.247
ValHis: 1.295 ± 0.107
ValIle: 4.899 ± 0.21
ValLys: 6.083 ± 0.192
ValLeu: 6.254 ± 0.209
ValMet: 1.392 ± 0.118
ValAsn: 2.42 ± 0.148
ValPro: 3.134 ± 0.154
ValGln: 1.459 ± 0.101
ValArg: 3.983 ± 0.182
ValSer: 4.534 ± 0.195
ValThr: 3.827 ± 0.175
ValVal: 4.542 ± 0.21
ValTrp: 0.663 ± 0.071
ValTyr: 1.958 ± 0.125
ValXaa: 0.0 ± 0.0
Trp
TrpAla: 0.715 ± 0.064
TrpCys: 0.141 ± 0.038
TrpAsp: 0.506 ± 0.069
TrpGlu: 0.893 ± 0.098
TrpPhe: 0.558 ± 0.065
TrpGly: 0.789 ± 0.085
TrpHis: 0.179 ± 0.035
TrpIle: 0.819 ± 0.073
TrpLys: 0.923 ± 0.084
TrpLeu: 1.027 ± 0.088
TrpMet: 0.335 ± 0.048
TrpAsn: 0.514 ± 0.063
TrpPro: 0.387 ± 0.056
TrpGln: 0.305 ± 0.042
TrpArg: 0.767 ± 0.076
TrpSer: 0.901 ± 0.086
TrpThr: 0.648 ± 0.079
TrpVal: 0.618 ± 0.065
TrpTrp: 0.223 ± 0.04
TrpTyr: 0.417 ± 0.052
TrpXaa: 0.0 ± 0.0
Tyr
TyrAla: 1.757 ± 0.101
TyrCys: 0.357 ± 0.052
TyrAsp: 1.809 ± 0.138
TyrGlu: 2.368 ± 0.131
TyrPhe: 1.027 ± 0.089
TyrGly: 2.375 ± 0.15
TyrHis: 0.678 ± 0.065
TyrIle: 1.504 ± 0.104
TyrLys: 1.564 ± 0.11
TyrLeu: 2.956 ± 0.143
TyrMet: 0.462 ± 0.061
TyrAsn: 1.013 ± 0.084
TyrPro: 1.437 ± 0.123
TyrGln: 1.02 ± 0.077
TyrArg: 1.653 ± 0.122
TyrSer: 2.107 ± 0.131
TyrThr: 1.4 ± 0.106
TyrVal: 2.033 ± 0.125
TyrTrp: 0.499 ± 0.063
TyrTyr: 0.96 ± 0.087
TyrXaa: 0.0 ± 0.0
Xaa
XaaAla: 0.0 ± 0.0
XaaCys: 0.0 ± 0.0
XaaAsp: 0.0 ± 0.0
XaaGlu: 0.0 ± 0.0
XaaPhe: 0.0 ± 0.0
XaaGly: 0.0 ± 0.0
XaaHis: 0.0 ± 0.0
XaaIle: 0.0 ± 0.0
XaaLys: 0.0 ± 0.0
XaaLeu: 0.0 ± 0.0
XaaMet: 0.0 ± 0.0
XaaAsn: 0.0 ± 0.0
XaaPro: 0.0 ± 0.0
XaaGln: 0.0 ± 0.0
XaaArg: 0.0 ± 0.0
XaaSer: 0.0 ± 0.0
XaaThr: 0.0 ± 0.0
XaaVal: 0.0 ± 0.0
XaaTrp: 0.0 ± 0.0
XaaTyr: 0.0 ± 0.0
XaaXaa: 0.0 ± 0.0
Statistics based on 559 proteins (134315 amino acids)

Note: The error has been estimated by bootstrapping (x100) at the protein level.
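
A protein-level bootstrap of this kind can be sketched as follows: whole proteins are resampled with replacement, the dipeptide frequency is recomputed for each resample, and the spread of the 100 estimates is reported as the error. The function names, the fixed seed, and the use of the sample standard deviation as the reported error are assumptions for illustration; the page does not state which spread measure the database uses.

```python
import random
import statistics
from collections import Counter


def pair_counts(seq):
    """Count overlapping dipeptides within a single protein sequence."""
    return Counter(a + b for a, b in zip(seq, seq[1:]))


def bootstrap_error(proteins, pair, n_resamples=100, seed=0):
    """Estimate the error of a dipeptide frequency (per mille) by
    resampling whole proteins with replacement."""
    rng = random.Random(seed)
    # Count once per protein so that each resample only sums counters.
    per_protein = [pair_counts(p) for p in proteins]
    estimates = []
    for _ in range(n_resamples):
        sample = rng.choices(per_protein, k=len(per_protein))
        total = sum(sum(c.values()) for c in sample)
        hits = sum(c[pair] for c in sample)
        estimates.append(1000.0 * hits / total if total else 0.0)
    # Assumption: report the standard deviation of the bootstrap estimates.
    return statistics.stdev(estimates)


# Made-up sequences for illustration only (not taken from this proteome).
toy_proteome = ["MEELLKK", "MAEEL", "GGGG"]
ee_error = bootstrap_error(toy_proteome, "EE")
```

Resampling at the protein level rather than the residue level keeps each protein's dipeptides together, so the estimate reflects variation between proteins.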

The dipeptide statistics above (among other statistics for this proteome) can be downloaded from this CSV file
See this proteome in: uniprot_link
Proteome-pI is available under the Creative Commons Attribution-NoDerivs license; for more details see here

Reference: Kozlowski LP. Proteome-pI 2.0: Proteome Isoelectric Point Database Update. Nucleic Acids Res. 2021, doi: 10.1093/nar/gkab944
Contact: Lukasz P. Kozlowski