Amino acid dipeptide frequency for candidate division MSBL1 archaeon SCGC-AAA259J03

Apart from single amino acid frequencies, one can also calculate the so-called amino acid dipeptide frequency.

There are 400 possible dipeptides (441 if Xaa is counted as an additional, 21st amino acid representing all non-standard or unknown amino acids). Thus, if dipeptides occurred randomly in proteins, each of the 400 would be present at a frequency of 0.25% (2.5‰). As this is not the case in nature, for better visibility all dipeptides that are more frequent than expected are marked in red in the table, and those that are underrepresented are marked in blue.
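The counting described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `dipeptide_permille`, the toy sequences, the use of overlapping pairs within each protein, and the normalization by the total number of pairs are all choices made for this example and are not taken from the database's own pipeline.

```python
from collections import Counter
from itertools import product

STANDARD = "ACDEFGHIKLMNPQRSTVWY"  # 20 standard amino acids, one-letter codes


def dipeptide_permille(sequences):
    """Return {dipeptide: frequency in per mille} for all 441 pairs.

    Assumption: pairs are overlapping windows of length 2 inside each
    protein, and frequencies are normalized by the total number of pairs.
    """
    counts = Counter()
    for seq in sequences:
        # Anything outside the 20 standard residues is treated as 'X' (Xaa).
        cleaned = [aa if aa in STANDARD else "X" for aa in seq.upper()]
        # Pairs never span two different proteins.
        for first, second in zip(cleaned, cleaned[1:]):
            counts[first + second] += 1
    total = sum(counts.values())
    alphabet = STANDARD + "X"
    return {
        a + b: (1000.0 * counts[a + b] / total if total else 0.0)
        for a, b in product(alphabet, repeat=2)
    }


# Hypothetical toy input, not real proteome data.
toy = ["MAAEEK", "MKEEL"]
freqs = dipeptide_permille(toy)
uniform = 1000.0 / 400  # expected per mille if all 400 pairs were equally likely
print(f"EE: {freqs['EE']:.3f} per mille, uniform expectation: {uniform} per mille")
```

On a real proteome the input would be the list of protein sequences, and each resulting value could be compared with the 2.5‰ uniform expectation.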

All values are given in per mille (‰) and therefore need to be multiplied by 10^-3 to obtain fractions.
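As a small worked example of the unit conversion, the sketch below turns two values from the table (GluGlu and CysCys) into plain fractions and compares them with the uniform expectation of 1/400. The helper names `permille_to_fraction` and `fold_vs_uniform` are hypothetical and introduced only for this illustration.

```python
UNIFORM_PERMILLE = 1000.0 / 400  # 2.5 per mille if all 400 dipeptides were equally likely


def permille_to_fraction(value):
    """Convert a per mille value to a plain fraction (multiply by 10^-3)."""
    return value * 1e-3


def fold_vs_uniform(value):
    """Ratio of an observed per mille value to the uniform expectation."""
    return value / UNIFORM_PERMILLE


# Two values copied from the table on this page.
examples = {"GluGlu": 12.86, "CysCys": 0.082}
for name, value in examples.items():
    print(name, permille_to_fraction(value), round(fold_vs_uniform(value), 3))
```

A ratio above 1 corresponds to a dipeptide that is more frequent than the uniform expectation, and a ratio below 1 to one that is underrepresented.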

For more information see sequence space article on Wikipedia.

Dipeptide frequencies in ‰ (value ± error), grouped by first residue. Second-residue order: Ala Cys Asp Glu Phe Gly His Ile Lys Leu Met Asn Pro Gln Arg Ser Thr Val Trp Tyr Xaa
Ala
AlaAla: 4.399 ± 0.18
AlaCys: 0.692 ± 0.057
AlaAsp: 3.424 ± 0.13
AlaGlu: 6.156 ± 0.19
AlaPhe: 2.406 ± 0.113
AlaGly: 5.138 ± 0.171
AlaHis: 0.984 ± 0.078
AlaIle: 4.298 ± 0.157
AlaLys: 4.168 ± 0.174
AlaLeu: 5.931 ± 0.154
AlaMet: 1.551 ± 0.095
AlaAsn: 1.69 ± 0.09
AlaPro: 2.084 ± 0.116
AlaGln: 1.426 ± 0.087
AlaArg: 3.558 ± 0.151
AlaSer: 3.976 ± 0.141
AlaThr: 2.651 ± 0.126
AlaVal: 4.682 ± 0.155
AlaTrp: 0.605 ± 0.063
AlaTyr: 1.662 ± 0.09
AlaXaa: 0.0 ± 0.0
Cys
CysAla: 0.519 ± 0.046
CysCys: 0.082 ± 0.023
CysAsp: 0.581 ± 0.053
CysGlu: 0.946 ± 0.071
CysPhe: 0.403 ± 0.048
CysGly: 1.153 ± 0.09
CysHis: 0.245 ± 0.036
CysIle: 0.447 ± 0.044
CysLys: 0.557 ± 0.055
CysLeu: 0.783 ± 0.066
CysMet: 0.182 ± 0.03
CysAsn: 0.298 ± 0.039
CysPro: 0.773 ± 0.069
CysGln: 0.24 ± 0.036
CysArg: 0.634 ± 0.06
CysSer: 0.648 ± 0.059
CysThr: 0.447 ± 0.047
CysVal: 0.591 ± 0.058
CysTrp: 0.125 ± 0.025
CysTyr: 0.341 ± 0.046
CysXaa: 0.0 ± 0.0
Asp
AspAla: 3.294 ± 0.135
AspCys: 0.538 ± 0.047
AspAsp: 2.819 ± 0.126
AspGlu: 5.715 ± 0.174
AspPhe: 2.651 ± 0.116
AspGly: 3.799 ± 0.135
AspHis: 1.032 ± 0.067
AspIle: 4.197 ± 0.123
AspLys: 3.53 ± 0.144
AspLeu: 6.762 ± 0.191
AspMet: 1.354 ± 0.088
AspAsn: 1.753 ± 0.12
AspPro: 3.145 ± 0.118
AspGln: 1.047 ± 0.074
AspArg: 3.899 ± 0.148
AspSer: 3.266 ± 0.136
AspThr: 2.291 ± 0.119
AspVal: 4.096 ± 0.143
AspTrp: 0.836 ± 0.068
AspTyr: 2.075 ± 0.094
AspXaa: 0.0 ± 0.0
Glu
GluAla: 5.883 ± 0.205
GluCys: 0.687 ± 0.058
GluAsp: 5.974 ± 0.187
GluGlu: 12.86 ± 0.314
GluPhe: 3.434 ± 0.126
GluGly: 7.304 ± 0.207
GluHis: 1.522 ± 0.088
GluIle: 8.414 ± 0.237
GluLys: 11.136 ± 0.286
GluLeu: 8.317 ± 0.262
GluMet: 2.42 ± 0.106
GluAsn: 5.162 ± 0.173
GluPro: 3.064 ± 0.108
GluGln: 1.738 ± 0.091
GluArg: 6.574 ± 0.22
GluSer: 4.581 ± 0.168
GluThr: 4.845 ± 0.129
GluVal: 6.752 ± 0.199
GluTrp: 0.994 ± 0.068
GluTyr: 2.55 ± 0.119
GluXaa: 0.0 ± 0.0
Phe
PheAla: 2.319 ± 0.114
PheCys: 0.432 ± 0.047
PheAsp: 2.56 ± 0.122
PheGlu: 3.703 ± 0.152
PhePhe: 1.815 ± 0.111
PheGly: 3.107 ± 0.144
PheHis: 0.778 ± 0.066
PheIle: 2.171 ± 0.116
PheLys: 2.142 ± 0.108
PheLeu: 4.192 ± 0.155
PheMet: 0.773 ± 0.059
PheAsn: 1.081 ± 0.082
PhePro: 1.806 ± 0.099
PheGln: 1.018 ± 0.064
PheArg: 2.036 ± 0.099
PheSer: 2.977 ± 0.135
PheThr: 1.801 ± 0.096
PheVal: 2.646 ± 0.109
PheTrp: 0.485 ± 0.056
PheTyr: 1.273 ± 0.089
PheXaa: 0.0 ± 0.0
Gly
GlyAla: 4.817 ± 0.215
GlyCys: 0.912 ± 0.071
GlyAsp: 3.775 ± 0.142
GlyGlu: 7.52 ± 0.211
GlyPhe: 3.04 ± 0.125
GlyGly: 5.844 ± 0.248
GlyHis: 1.181 ± 0.075
GlyIle: 5.575 ± 0.174
GlyLys: 5.984 ± 0.194
GlyLeu: 6.545 ± 0.191
GlyMet: 1.887 ± 0.091
GlyAsn: 2.54 ± 0.106
GlyPro: 2.483 ± 0.129
GlyGln: 1.522 ± 0.085
GlyArg: 4.49 ± 0.148
GlySer: 4.269 ± 0.159
GlyThr: 3.597 ± 0.118
GlyVal: 5.513 ± 0.192
GlyTrp: 0.898 ± 0.063
GlyTyr: 2.766 ± 0.115
GlyXaa: 0.0 ± 0.0
His
HisAla: 1.234 ± 0.075
HisCys: 0.298 ± 0.044
HisAsp: 0.898 ± 0.065
HisGlu: 1.45 ± 0.082
HisPhe: 0.692 ± 0.059
HisGly: 1.378 ± 0.078
HisHis: 0.317 ± 0.039
HisIle: 1.071 ± 0.072
HisLys: 0.749 ± 0.063
HisLeu: 1.786 ± 0.096
HisMet: 0.312 ± 0.035
HisAsn: 0.562 ± 0.053
HisPro: 1.076 ± 0.072
HisGln: 0.389 ± 0.042
HisArg: 1.066 ± 0.071
HisSer: 1.052 ± 0.08
HisThr: 0.749 ± 0.059
HisVal: 1.22 ± 0.07
HisTrp: 0.202 ± 0.028
HisTyr: 0.576 ± 0.055
HisXaa: 0.0 ± 0.0
Ile
IleAla: 4.725 ± 0.17
IleCys: 0.73 ± 0.062
IleAsp: 4.447 ± 0.154
IleGlu: 7.155 ± 0.187
IlePhe: 2.737 ± 0.141
IleGly: 5.273 ± 0.211
IleHis: 1.359 ± 0.076
IleIle: 4.586 ± 0.191
IleLys: 4.188 ± 0.164
IleLeu: 6.31 ± 0.206
IleMet: 1.465 ± 0.086
IleAsn: 2.473 ± 0.111
IlePro: 3.208 ± 0.13
IleGln: 1.58 ± 0.075
IleArg: 3.731 ± 0.123
IleSer: 5.028 ± 0.174
IleThr: 3.29 ± 0.147
IleVal: 4.946 ± 0.155
IleTrp: 0.562 ± 0.056
IleTyr: 1.931 ± 0.118
IleXaa: 0.0 ± 0.0
Lys
LysAla: 4.466 ± 0.161
LysCys: 0.692 ± 0.064
LysAsp: 4.183 ± 0.139
LysGlu: 7.914 ± 0.219
LysPhe: 2.449 ± 0.105
LysGly: 5.018 ± 0.147
LysHis: 1.186 ± 0.077
LysIle: 6.416 ± 0.21
LysLys: 7.026 ± 0.221
LysLeu: 6.329 ± 0.173
LysMet: 1.767 ± 0.096
LysAsn: 3.496 ± 0.141
LysPro: 2.42 ± 0.112
LysGln: 1.489 ± 0.087
LysArg: 4.288 ± 0.143
LysSer: 3.895 ± 0.158
LysThr: 3.976 ± 0.135
LysVal: 5.09 ± 0.158
LysTrp: 0.706 ± 0.057
LysTyr: 2.127 ± 0.094
LysXaa: 0.0 ± 0.0
Leu
LeuAla: 5.897 ± 0.189
LeuCys: 0.735 ± 0.06
LeuAsp: 6.104 ± 0.202
LeuGlu: 9.801 ± 0.274
LeuPhe: 3.155 ± 0.171
LeuGly: 6.958 ± 0.201
LeuHis: 1.469 ± 0.086
LeuIle: 5.475 ± 0.18
LeuLys: 6.867 ± 0.174
LeuLeu: 7.482 ± 0.218
LeuMet: 2.055 ± 0.1
LeuAsn: 3.448 ± 0.134
LeuPro: 3.587 ± 0.137
LeuGln: 1.863 ± 0.092
LeuArg: 5.273 ± 0.183
LeuSer: 6.358 ± 0.151
LeuThr: 4.62 ± 0.149
LeuVal: 5.667 ± 0.154
LeuTrp: 0.917 ± 0.074
LeuTyr: 2.3 ± 0.104
LeuXaa: 0.0 ± 0.0
Met
MetAla: 1.402 ± 0.084
MetCys: 0.178 ± 0.028
MetAsp: 1.585 ± 0.088
MetGlu: 2.348 ± 0.119
MetPhe: 0.682 ± 0.06
MetGly: 1.705 ± 0.094
MetHis: 0.274 ± 0.037
MetIle: 1.566 ± 0.087
MetLys: 2.103 ± 0.091
MetLeu: 1.566 ± 0.092
MetMet: 0.499 ± 0.052
MetAsn: 1.042 ± 0.07
MetPro: 1.018 ± 0.075
MetGln: 0.403 ± 0.043
MetArg: 1.436 ± 0.083
MetSer: 1.532 ± 0.086
MetThr: 1.196 ± 0.078
MetVal: 1.623 ± 0.091
MetTrp: 0.154 ± 0.031
MetTyr: 0.408 ± 0.041
MetXaa: 0.0 ± 0.0
Asn
AsnAla: 2.214 ± 0.094
AsnCys: 0.514 ± 0.054
AsnAsp: 1.58 ± 0.082
AsnGlu: 2.833 ± 0.118
AsnPhe: 1.676 ± 0.106
AsnGly: 2.392 ± 0.114
AsnHis: 0.802 ± 0.061
AsnIle: 2.829 ± 0.113
AsnLys: 2.036 ± 0.096
AsnLeu: 4.24 ± 0.144
AsnMet: 0.831 ± 0.054
AsnAsn: 1.201 ± 0.078
AsnPro: 2.295 ± 0.107
AsnGln: 0.941 ± 0.061
AsnArg: 2.185 ± 0.096
AsnSer: 2.19 ± 0.114
AsnThr: 1.71 ± 0.1
AsnVal: 2.737 ± 0.138
AsnTrp: 0.528 ± 0.048
AsnTyr: 1.373 ± 0.077
AsnXaa: 0.0 ± 0.0
Pro
ProAla: 2.315 ± 0.111
ProCys: 0.442 ± 0.043
ProAsp: 2.756 ± 0.106
ProGlu: 4.994 ± 0.168
ProPhe: 1.465 ± 0.093
ProGly: 3.136 ± 0.163
ProHis: 0.884 ± 0.063
ProIle: 2.497 ± 0.109
ProLys: 2.612 ± 0.105
ProLeu: 3.386 ± 0.134
ProMet: 0.884 ± 0.067
ProAsn: 1.455 ± 0.091
ProPro: 2.118 ± 0.089
ProGln: 1.066 ± 0.075
ProArg: 1.916 ± 0.103
ProSer: 2.987 ± 0.118
ProThr: 1.983 ± 0.092
ProVal: 3.126 ± 0.139
ProTrp: 0.437 ± 0.042
ProTyr: 1.229 ± 0.078
ProXaa: 0.0 ± 0.0
Gln
GlnAla: 1.561 ± 0.086
GlnCys: 0.168 ± 0.029
GlnAsp: 1.143 ± 0.075
GlnGlu: 2.07 ± 0.095
GlnPhe: 0.764 ± 0.06
GlnGly: 1.455 ± 0.087
GlnHis: 0.307 ± 0.035
GlnIle: 1.734 ± 0.084
GlnLys: 2.003 ± 0.106
GlnLeu: 1.791 ± 0.092
GlnMet: 0.562 ± 0.047
GlnAsn: 1.042 ± 0.067
GlnPro: 0.696 ± 0.066
GlnGln: 0.519 ± 0.058
GlnArg: 1.383 ± 0.097
GlnSer: 1.133 ± 0.077
GlnThr: 1.061 ± 0.068
GlnVal: 1.431 ± 0.075
GlnTrp: 0.211 ± 0.028
GlnTyr: 0.61 ± 0.056
GlnXaa: 0.0 ± 0.0
Arg
ArgAla: 3.602 ± 0.141
ArgCys: 0.571 ± 0.053
ArgAsp: 3.294 ± 0.136
ArgGlu: 7.03 ± 0.208
ArgPhe: 2.396 ± 0.11
ArgGly: 4.236 ± 0.152
ArgHis: 0.735 ± 0.058
ArgIle: 4.053 ± 0.117
ArgLys: 5.306 ± 0.186
ArgLeu: 4.898 ± 0.2
ArgMet: 1.455 ± 0.084
ArgAsn: 2.372 ± 0.095
ArgPro: 1.926 ± 0.09
ArgGln: 1.124 ± 0.079
ArgArg: 3.899 ± 0.178
ArgSer: 3.261 ± 0.122
ArgThr: 2.684 ± 0.106
ArgVal: 4.077 ± 0.161
ArgTrp: 0.6 ± 0.058
ArgTyr: 1.599 ± 0.085
ArgXaa: 0.0 ± 0.0
Ser
SerAla: 3.203 ± 0.121
SerCys: 0.552 ± 0.052
SerAsp: 3.765 ± 0.117
SerGlu: 6.589 ± 0.191
SerPhe: 2.877 ± 0.118
SerGly: 5.038 ± 0.167
SerHis: 0.994 ± 0.069
SerIle: 3.967 ± 0.144
SerLys: 4.365 ± 0.135
SerLeu: 5.619 ± 0.156
SerMet: 1.426 ± 0.088
SerAsn: 2.118 ± 0.113
SerPro: 2.872 ± 0.121
SerGln: 1.537 ± 0.101
SerArg: 3.63 ± 0.15
SerSer: 4.092 ± 0.158
SerThr: 2.901 ± 0.115
SerVal: 4.082 ± 0.13
SerTrp: 0.706 ± 0.058
SerTyr: 1.599 ± 0.1
SerXaa: 0.0 ± 0.0
Thr
ThrAla: 3.232 ± 0.146
ThrCys: 0.447 ± 0.049
ThrAsp: 2.526 ± 0.102
ThrGlu: 4.168 ± 0.151
ThrPhe: 1.887 ± 0.088
ThrGly: 4.13 ± 0.161
ThrHis: 0.965 ± 0.067
ThrIle: 3.371 ± 0.152
ThrLys: 2.814 ± 0.11
ThrLeu: 4.12 ± 0.174
ThrMet: 1.081 ± 0.072
ThrAsn: 1.604 ± 0.089
ThrPro: 2.406 ± 0.112
ThrGln: 1.071 ± 0.07
ThrArg: 2.319 ± 0.108
ThrSer: 3.064 ± 0.134
ThrThr: 2.473 ± 0.109
ThrVal: 3.967 ± 0.158
ThrTrp: 0.509 ± 0.047
ThrTyr: 1.369 ± 0.073
ThrXaa: 0.0 ± 0.0
Val
ValAla: 4.245 ± 0.159
ValCys: 0.812 ± 0.064
ValAsp: 4.533 ± 0.16
ValGlu: 7.333 ± 0.208
ValPhe: 2.881 ± 0.142
ValGly: 4.908 ± 0.165
ValHis: 1.263 ± 0.078
ValIle: 4.562 ± 0.168
ValLys: 4.879 ± 0.149
ValLeu: 6.329 ± 0.178
ValMet: 1.378 ± 0.081
ValAsn: 2.358 ± 0.112
ValPro: 3.001 ± 0.105
ValGln: 1.561 ± 0.083
ValArg: 3.808 ± 0.143
ValSer: 4.898 ± 0.187
ValThr: 3.467 ± 0.161
ValVal: 4.605 ± 0.185
ValTrp: 0.706 ± 0.068
ValTyr: 1.599 ± 0.077
ValXaa: 0.0 ± 0.0
Trp
TrpAla: 0.624 ± 0.051
TrpCys: 0.134 ± 0.025
TrpAsp: 0.557 ± 0.063
TrpGlu: 0.96 ± 0.072
TrpPhe: 0.48 ± 0.052
TrpGly: 0.749 ± 0.061
TrpHis: 0.134 ± 0.026
TrpIle: 0.773 ± 0.067
TrpLys: 0.927 ± 0.069
TrpLeu: 0.869 ± 0.078
TrpMet: 0.298 ± 0.038
TrpAsn: 0.466 ± 0.049
TrpPro: 0.351 ± 0.045
TrpGln: 0.25 ± 0.038
TrpArg: 0.96 ± 0.073
TrpSer: 0.677 ± 0.056
TrpThr: 0.48 ± 0.05
TrpVal: 0.619 ± 0.056
TrpTrp: 0.187 ± 0.027
TrpTyr: 0.293 ± 0.046
TrpXaa: 0.0 ± 0.0
Tyr
TyrAla: 1.532 ± 0.077
TyrCys: 0.379 ± 0.048
TyrAsp: 1.594 ± 0.095
TyrGlu: 2.665 ± 0.11
TyrPhe: 1.225 ± 0.079
TyrGly: 2.444 ± 0.115
TyrHis: 0.644 ± 0.055
TyrIle: 1.724 ± 0.088
TyrLys: 1.542 ± 0.085
TyrLeu: 2.776 ± 0.131
TyrMet: 0.533 ± 0.049
TyrAsn: 0.946 ± 0.071
TyrPro: 1.359 ± 0.09
TyrGln: 0.86 ± 0.063
TyrArg: 2.031 ± 0.104
TyrSer: 2.046 ± 0.095
TyrThr: 1.301 ± 0.079
TyrVal: 1.657 ± 0.085
TyrTrp: 0.423 ± 0.045
TyrTyr: 0.84 ± 0.06
TyrXaa: 0.0 ± 0.0
Xaa
XaaAla: 0.0 ± 0.0
XaaCys: 0.0 ± 0.0
XaaAsp: 0.0 ± 0.0
XaaGlu: 0.0 ± 0.0
XaaPhe: 0.0 ± 0.0
XaaGly: 0.0 ± 0.0
XaaHis: 0.0 ± 0.0
XaaIle: 0.0 ± 0.0
XaaLys: 0.0 ± 0.0
XaaLeu: 0.0 ± 0.0
XaaMet: 0.0 ± 0.0
XaaAsn: 0.0 ± 0.0
XaaPro: 0.0 ± 0.0
XaaGln: 0.0 ± 0.0
XaaArg: 0.0 ± 0.0
XaaSer: 0.0 ± 0.0
XaaThr: 0.0 ± 0.0
XaaVal: 0.0 ± 0.0
XaaTrp: 0.0 ± 0.0
XaaTyr: 0.0 ± 0.0
XaaXaa: 0.0 ± 0.0
Statistics based on 964 proteins (208237 amino acids)

Note: The error has been estimated by bootstrapping (x100) at the protein level.
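A protein-level bootstrap of this kind can be sketched as follows. This is a minimal illustration, not the database's actual procedure: the function names `pair_counts` and `bootstrap_error`, the toy sequences, the fixed seed, the reading of "x100" as 100 resampling rounds, and the use of the standard deviation across replicates as the error are all assumptions made for the example.

```python
import random
import statistics
from collections import Counter


def pair_counts(seq):
    """Count overlapping dipeptides within a single protein sequence."""
    return Counter(a + b for a, b in zip(seq, seq[1:]))


def bootstrap_error(sequences, dipeptide, replicates=100, seed=0):
    """Return (mean, standard deviation) of a dipeptide frequency in per mille.

    Whole proteins are resampled with replacement, so the resampling unit
    is the protein, not the individual dipeptide.
    """
    rng = random.Random(seed)
    per_protein = [pair_counts(s) for s in sequences]
    estimates = []
    for _ in range(replicates):
        # Draw as many proteins as in the original set, with replacement.
        sample = rng.choices(per_protein, k=len(per_protein))
        total = sum(sum(c.values()) for c in sample)
        hits = sum(c[dipeptide] for c in sample)
        estimates.append(1000.0 * hits / total if total else 0.0)
    return statistics.mean(estimates), statistics.stdev(estimates)


# Hypothetical toy input, not real proteome data.
toy = ["MAAEEK", "MKEEL", "AEEA", "MKKLV"]
mean, err = bootstrap_error(toy, "EE")
print(f"EE: {mean:.3f} ± {err:.3f} per mille")
```

Resampling whole proteins rather than individual dipeptides keeps pairs from the same protein together, which is one way to read "at the protein level".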

The dipeptide statistics above (along with other statistics for this proteome) can be downloaded from this CSV file
See this proteome in: uniprot_link
Proteome-pI is available under Creative Commons Attribution-NoDerivs license, for more details see here

Reference: Kozlowski LP. Proteome-pI 2.0: Proteome Isoelectric Point Database Update. Nucleic Acids Res. 2021, doi: 10.1093/nar/gkab944
Contact: Lukasz P. Kozlowski