Amino acid dipeptide frequency for candidate division MSBL1 archaeon SCGC-AAA382A20

Apart from single amino acid frequencies, one can also calculate the so-called amino acid dipeptide frequency, i.e. how often each ordered pair of adjacent amino acids occurs.
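
As a rough illustration of what such a calculation could look like, the sketch below counts overlapping pairs of adjacent residues within each protein and reports them in per mille. The counting convention (pairs never span two proteins, counts are pooled over the whole proteome, any non-standard letter is mapped to X for Xaa) is an assumption made for this example and is not stated on this page. The names `dipeptide_counts`, `dipeptide_per_mille` and `toy_proteome` are hypothetical.

```python
from collections import Counter
from itertools import product

# One-letter codes in the same order as the table columns (Ala, Cys, Asp, ...).
STANDARD = "ACDEFGHIKLMNPQRSTVWY"
ALPHABET = STANDARD + "X"  # X stands for Xaa (non-standard or unknown)


def dipeptide_counts(proteins):
    """Count overlapping adjacent residue pairs, protein by protein."""
    counts = Counter()
    for seq in proteins:
        # Assumption: anything outside the 20 standard residues becomes X.
        cleaned = "".join(aa if aa in STANDARD else "X" for aa in seq.upper())
        # Windows of length 2; a pair never spans two different proteins.
        counts.update(cleaned[i:i + 2] for i in range(len(cleaned) - 1))
    return counts


def dipeptide_per_mille(proteins):
    """Return all 441 dipeptide frequencies in per mille."""
    counts = dipeptide_counts(proteins)
    total = sum(counts.values())
    return {
        a + b: (1000.0 * counts[a + b] / total if total else 0.0)
        for a, b in product(ALPHABET, repeat=2)
    }


toy_proteome = ["MEEKLA", "MKEELX"]  # made-up sequences, not real data
freqs = dipeptide_per_mille(toy_proteome)
print(len(freqs), round(freqs["EE"], 1))  # 441 200.0
```

In the toy input, EE occurs twice among ten pairs, hence 200‰. Real proteomes give much smaller values, as in the table below.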

There are 400 possible dipeptides (441 if Xaa is counted as an additional 21st amino acid representing all non-standard or unknown amino acids). Thus, if dipeptides occurred randomly and with equal probability in proteins, each of the 400 dipeptides would be present at a frequency of 0.25% (2.5‰). As this is not the case in nature, for better visibility all dipeptides that are more frequent than expected are marked in red, and those that are underrepresented are marked in blue in the table.
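
The comparison against the uniform expectation can be sketched as follows. The baseline of 1/400 comes from the paragraph above. The exact rule the page uses for colouring (for instance, whether a tolerance band is applied) is not stated, so the strict greater-than and less-than comparison in the hypothetical `classify` function is an assumption. The four example values are taken from the table.

```python
N_DIPEPTIDES = 20 * 20  # standard amino acids only, Xaa excluded
EXPECTED_PER_MILLE = 1000.0 / N_DIPEPTIDES  # 2.5 per mille, i.e. 0.25%


def classify(value_per_mille, expected=EXPECTED_PER_MILLE):
    """Label a dipeptide relative to the uniform expectation."""
    if value_per_mille > expected:
        return "over"   # would be marked red
    if value_per_mille < expected:
        return "under"  # would be marked blue
    return "as expected"


# Values copied from the table (per mille).
examples = {"GluGlu": 12.446, "AlaAla": 3.495, "CysCys": 0.203, "TrpTrp": 0.24}
for name, value in examples.items():
    ratio = value / EXPECTED_PER_MILLE  # fold difference versus 2.5 per mille
    print(name, classify(value), round(ratio, 2))
```

With this baseline, GluGlu is roughly five times more frequent than a uniform distribution would predict, while CysCys and TrpTrp fall far below it.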

All values are given in per mille (‰) and therefore need to be multiplied by 10⁻³ to obtain fractions.
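
A minimal sketch of the unit conversion, using the GluGlu value from the table; the helper names are hypothetical.

```python
def per_mille_to_fraction(value):
    """Convert a per mille value to a plain fraction (multiply by 10^-3)."""
    return value * 1e-3


def per_mille_to_percent(value):
    """Convert a per mille value to percent (divide by 10)."""
    return value / 10.0


glu_glu = 12.446  # GluGlu value from the table, in per mille
print(per_mille_to_fraction(glu_glu))  # about 0.012446
print(per_mille_to_percent(glu_glu))   # about 1.2446 (percent)
```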

For more information, see the sequence space article on Wikipedia.

Ala Cys Asp Glu Phe Gly His Ile Lys Leu Met Asn Pro Gln Arg Ser Thr Val Trp Tyr Xaa
Ala
3.495AlaAla: 3.495 ± 0.153
0.74AlaCys: 0.74 ± 0.07
3.204AlaAsp: 3.204 ± 0.138
5.46AlaGlu: 5.46 ± 0.172
2.08AlaPhe: 2.08 ± 0.106
4.105AlaGly: 4.105 ± 0.145
0.952AlaHis: 0.952 ± 0.06
3.768AlaIle: 3.768 ± 0.112
4.045AlaLys: 4.045 ± 0.148
5.275AlaLeu: 5.275 ± 0.17
1.327AlaMet: 1.327 ± 0.078
1.877AlaAsn: 1.877 ± 0.089
1.687AlaPro: 1.687 ± 0.096
1.47AlaGln: 1.47 ± 0.082
3.333AlaArg: 3.333 ± 0.127
3.569AlaSer: 3.569 ± 0.133
2.57AlaThr: 2.57 ± 0.1
4.087AlaVal: 4.087 ± 0.146
0.555AlaTrp: 0.555 ± 0.048
1.641AlaTyr: 1.641 ± 0.087
0.0AlaXaa: 0.0 ± 0.0
Cys
0.555CysAla: 0.555 ± 0.056
0.203CysCys: 0.203 ± 0.029
0.541CysAsp: 0.541 ± 0.051
1.073CysGlu: 1.073 ± 0.072
0.416CysPhe: 0.416 ± 0.046
1.341CysGly: 1.341 ± 0.08
0.245CysHis: 0.245 ± 0.035
0.596CysIle: 0.596 ± 0.052
0.8CysLys: 0.8 ± 0.063
0.855CysLeu: 0.855 ± 0.069
0.259CysMet: 0.259 ± 0.035
0.388CysAsn: 0.388 ± 0.05
0.902CysPro: 0.902 ± 0.065
0.379CysGln: 0.379 ± 0.05
0.615CysArg: 0.615 ± 0.055
0.749CysSer: 0.749 ± 0.069
0.573CysThr: 0.573 ± 0.046
0.629CysVal: 0.629 ± 0.048
0.176CysTrp: 0.176 ± 0.025
0.365CysTyr: 0.365 ± 0.043
0.0CysXaa: 0.0 ± 0.0
Asp
3.074AspAla: 3.074 ± 0.131
0.781AspCys: 0.781 ± 0.065
3.232AspAsp: 3.232 ± 0.175
5.913AspGlu: 5.913 ± 0.174
3.033AspPhe: 3.033 ± 0.111
4.036AspGly: 4.036 ± 0.136
0.952AspHis: 0.952 ± 0.064
4.66AspIle: 4.66 ± 0.166
4.286AspLys: 4.286 ± 0.157
6.306AspLeu: 6.306 ± 0.177
1.479AspMet: 1.479 ± 0.082
2.409AspAsn: 2.409 ± 0.163
2.612AspPro: 2.612 ± 0.129
1.248AspGln: 1.248 ± 0.072
3.204AspArg: 3.204 ± 0.151
3.578AspSer: 3.578 ± 0.137
2.705AspThr: 2.705 ± 0.142
4.364AspVal: 4.364 ± 0.151
0.929AspTrp: 0.929 ± 0.075
2.33AspTyr: 2.33 ± 0.106
0.0AspXaa: 0.0 ± 0.0
Glu
5.224GluAla: 5.224 ± 0.183
0.92GluCys: 0.92 ± 0.062
6.528GluAsp: 6.528 ± 0.2
12.446GluGlu: 12.446 ± 0.34
3.282GluPhe: 3.282 ± 0.13
6.514GluGly: 6.514 ± 0.197
1.322GluHis: 1.322 ± 0.076
8.077GluIle: 8.077 ± 0.203
10.994GluLys: 10.994 ± 0.351
8.007GluLeu: 8.007 ± 0.202
2.45GluMet: 2.45 ± 0.097
5.539GluAsn: 5.539 ± 0.184
2.792GluPro: 2.792 ± 0.108
1.748GluGln: 1.748 ± 0.09
5.437GluArg: 5.437 ± 0.209
5.076GluSer: 5.076 ± 0.18
4.475GluThr: 4.475 ± 0.159
6.5GluVal: 6.5 ± 0.166
1.086GluTrp: 1.086 ± 0.075
2.584GluTyr: 2.584 ± 0.105
0.0GluXaa: 0.0 ± 0.0
Phe
2.016PheAla: 2.016 ± 0.095
0.541PheCys: 0.541 ± 0.051
2.714PheAsp: 2.714 ± 0.125
3.745PheGlu: 3.745 ± 0.136
1.738PhePhe: 1.738 ± 0.088
2.987PheGly: 2.987 ± 0.122
0.781PheHis: 0.781 ± 0.059
2.233PheIle: 2.233 ± 0.096
2.464PheLys: 2.464 ± 0.109
4.147PheLeu: 4.147 ± 0.168
0.735PheMet: 0.735 ± 0.06
1.475PheAsn: 1.475 ± 0.095
1.438PhePro: 1.438 ± 0.098
1.049PheGln: 1.049 ± 0.068
2.067PheArg: 2.067 ± 0.096
3.093PheSer: 3.093 ± 0.129
2.002PheThr: 2.002 ± 0.104
2.362PheVal: 2.362 ± 0.114
0.527PheTrp: 0.527 ± 0.05
1.285PheTyr: 1.285 ± 0.083
0.0PheXaa: 0.0 ± 0.0
Gly
3.805GlyAla: 3.805 ± 0.149
0.902GlyCys: 0.902 ± 0.07
4.105GlyAsp: 4.105 ± 0.14
6.791GlyGlu: 6.791 ± 0.216
3.042GlyPhe: 3.042 ± 0.11
5.497GlyGly: 5.497 ± 0.181
1.211GlyHis: 1.211 ± 0.079
5.465GlyIle: 5.465 ± 0.181
5.802GlyLys: 5.802 ± 0.164
5.576GlyLeu: 5.576 ± 0.178
1.812GlyMet: 1.812 ± 0.114
3.024GlyAsn: 3.024 ± 0.176
2.279GlyPro: 2.279 ± 0.101
1.715GlyGln: 1.715 ± 0.097
3.93GlyArg: 3.93 ± 0.143
4.651GlySer: 4.651 ± 0.182
3.477GlyThr: 3.477 ± 0.145
5.136GlyVal: 5.136 ± 0.153
1.054GlyTrp: 1.054 ± 0.083
2.473GlyTyr: 2.473 ± 0.115
0.0GlyXaa: 0.0 ± 0.0
His
1.003HisAla: 1.003 ± 0.068
0.259HisCys: 0.259 ± 0.035
0.943HisAsp: 0.943 ± 0.067
1.382HisGlu: 1.382 ± 0.085
0.828HisPhe: 0.828 ± 0.062
1.368HisGly: 1.368 ± 0.092
0.467HisHis: 0.467 ± 0.052
1.147HisIle: 1.147 ± 0.079
1.188HisLys: 1.188 ± 0.078
1.623HisLeu: 1.623 ± 0.092
0.301HisMet: 0.301 ± 0.034
0.652HisAsn: 0.652 ± 0.059
0.999HisPro: 0.999 ± 0.069
0.499HisGln: 0.499 ± 0.048
0.957HisArg: 0.957 ± 0.065
1.184HisSer: 1.184 ± 0.073
0.772HisThr: 0.772 ± 0.056
1.026HisVal: 1.026 ± 0.071
0.231HisTrp: 0.231 ± 0.034
0.652HisTyr: 0.652 ± 0.063
0.0HisXaa: 0.0 ± 0.0
Ile
4.115IleAla: 4.115 ± 0.151
0.832IleCys: 0.832 ± 0.062
4.706IleAsp: 4.706 ± 0.159
6.995IleGlu: 6.995 ± 0.18
2.705IlePhe: 2.705 ± 0.111
5.085IleGly: 5.085 ± 0.179
1.151IleHis: 1.151 ± 0.077
4.198IleIle: 4.198 ± 0.143
4.924IleLys: 4.924 ± 0.157
6.098IleLeu: 6.098 ± 0.189
1.285IleMet: 1.285 ± 0.082
2.681IleAsn: 2.681 ± 0.128
3.25IlePro: 3.25 ± 0.13
1.919IleGln: 1.919 ± 0.096
3.717IleArg: 3.717 ± 0.151
4.961IleSer: 4.961 ± 0.159
3.504IleThr: 3.504 ± 0.159
4.6IleVal: 4.6 ± 0.15
0.68IleTrp: 0.68 ± 0.054
1.96IleTyr: 1.96 ± 0.09
0.0IleXaa: 0.0 ± 0.0
Lys
4.651LysAla: 4.651 ± 0.152
0.804LysCys: 0.804 ± 0.072
4.84LysAsp: 4.84 ± 0.148
8.497LysGlu: 8.497 ± 0.267
2.686LysPhe: 2.686 ± 0.129
5.173LysGly: 5.173 ± 0.176
1.285LysHis: 1.285 ± 0.076
6.662LysIle: 6.662 ± 0.203
7.758LysLys: 7.758 ± 0.275
6.791LysLeu: 6.791 ± 0.215
1.766LysMet: 1.766 ± 0.087
3.944LysAsn: 3.944 ± 0.156
2.612LysPro: 2.612 ± 0.108
1.835LysGln: 1.835 ± 0.092
4.397LysArg: 4.397 ± 0.186
4.693LysSer: 4.693 ± 0.134
3.939LysThr: 3.939 ± 0.141
5.317LysVal: 5.317 ± 0.163
0.985LysTrp: 0.985 ± 0.06
2.469LysTyr: 2.469 ± 0.102
0.0LysXaa: 0.0 ± 0.0
Leu
5.409LeuAla: 5.409 ± 0.177
0.837LeuCys: 0.837 ± 0.068
5.77LeuAsp: 5.77 ± 0.176
9.45LeuGlu: 9.45 ± 0.284
3.037LeuPhe: 3.037 ± 0.126
6.329LeuGly: 6.329 ± 0.199
1.512LeuHis: 1.512 ± 0.077
5.28LeuIle: 5.28 ± 0.209
7.226LeuLys: 7.226 ± 0.228
7.12LeuLeu: 7.12 ± 0.24
1.771LeuMet: 1.771 ± 0.087
3.842LeuAsn: 3.842 ± 0.138
3.514LeuPro: 3.514 ± 0.114
2.386LeuGln: 2.386 ± 0.114
4.619LeuArg: 4.619 ± 0.173
6.343LeuSer: 6.343 ± 0.194
4.202LeuThr: 4.202 ± 0.134
5.192LeuVal: 5.192 ± 0.188
0.86LeuTrp: 0.86 ± 0.066
2.561LeuTyr: 2.561 ± 0.126
0.0LeuXaa: 0.0 ± 0.0
Met
1.466MetAla: 1.466 ± 0.075
0.199MetCys: 0.199 ± 0.029
1.466MetAsp: 1.466 ± 0.08
2.21MetGlu: 2.21 ± 0.122
0.693MetPhe: 0.693 ± 0.06
1.738MetGly: 1.738 ± 0.098
0.337MetHis: 0.337 ± 0.04
1.345MetIle: 1.345 ± 0.083
2.113MetLys: 2.113 ± 0.095
1.655MetLeu: 1.655 ± 0.097
0.546MetMet: 0.546 ± 0.064
1.197MetAsn: 1.197 ± 0.077
0.915MetPro: 0.915 ± 0.066
0.55MetGln: 0.55 ± 0.053
1.16MetArg: 1.16 ± 0.075
1.493MetSer: 1.493 ± 0.072
1.045MetThr: 1.045 ± 0.072
1.438MetVal: 1.438 ± 0.078
0.176MetTrp: 0.176 ± 0.026
0.462MetTyr: 0.462 ± 0.041
0.0MetXaa: 0.0 ± 0.0
Asn
2.141AsnAla: 2.141 ± 0.12
0.638AsnCys: 0.638 ± 0.064
1.965AsnAsp: 1.965 ± 0.125
3.319AsnGlu: 3.319 ± 0.147
1.951AsnPhe: 1.951 ± 0.1
2.506AsnGly: 2.506 ± 0.12
0.707AsnHis: 0.707 ± 0.063
3.467AsnIle: 3.467 ± 0.148
2.931AsnLys: 2.931 ± 0.123
4.642AsnLeu: 4.642 ± 0.147
0.818AsnMet: 0.818 ± 0.06
2.057AsnAsn: 2.057 ± 0.261
2.52AsnPro: 2.52 ± 0.123
1.322AsnGln: 1.322 ± 0.076
2.288AsnArg: 2.288 ± 0.111
2.834AsnSer: 2.834 ± 0.131
2.043AsnThr: 2.043 ± 0.197
2.922AsnVal: 2.922 ± 0.142
0.647AsnTrp: 0.647 ± 0.051
1.711AsnTyr: 1.711 ± 0.106
0.0AsnXaa: 0.0 ± 0.0
Pro
2.048ProAla: 2.048 ± 0.098
0.439ProCys: 0.439 ± 0.048
2.631ProAsp: 2.631 ± 0.116
4.498ProGlu: 4.498 ± 0.141
1.674ProPhe: 1.674 ± 0.097
2.631ProGly: 2.631 ± 0.126
0.888ProHis: 0.888 ± 0.064
2.312ProIle: 2.312 ± 0.085
2.709ProLys: 2.709 ± 0.127
2.889ProLeu: 2.889 ± 0.134
0.8ProMet: 0.8 ± 0.07
1.526ProAsn: 1.526 ± 0.09
1.766ProPro: 1.766 ± 0.101
0.999ProGln: 0.999 ± 0.065
1.766ProArg: 1.766 ± 0.092
2.806ProSer: 2.806 ± 0.115
1.988ProThr: 1.988 ± 0.131
2.945ProVal: 2.945 ± 0.125
0.462ProTrp: 0.462 ± 0.05
1.498ProTyr: 1.498 ± 0.087
0.0ProXaa: 0.0 ± 0.0
Gln
1.433GlnAla: 1.433 ± 0.079
0.236GlnCys: 0.236 ± 0.035
1.47GlnAsp: 1.47 ± 0.08
2.409GlnGlu: 2.409 ± 0.106
0.86GlnPhe: 0.86 ± 0.057
1.729GlnGly: 1.729 ± 0.101
0.393GlnHis: 0.393 ± 0.041
1.789GlnIle: 1.789 ± 0.093
2.566GlnLys: 2.566 ± 0.116
2.048GlnLeu: 2.048 ± 0.098
0.689GlnMet: 0.689 ± 0.056
1.197GlnAsn: 1.197 ± 0.077
0.781GlnPro: 0.781 ± 0.054
0.73GlnGln: 0.73 ± 0.068
1.405GlnArg: 1.405 ± 0.074
1.364GlnSer: 1.364 ± 0.079
1.17GlnThr: 1.17 ± 0.075
1.604GlnVal: 1.604 ± 0.099
0.259GlnTrp: 0.259 ± 0.034
0.749GlnTyr: 0.749 ± 0.064
0.0GlnXaa: 0.0 ± 0.0
Arg
2.834ArgAla: 2.834 ± 0.136
0.555ArgCys: 0.555 ± 0.05
3.098ArgAsp: 3.098 ± 0.125
6.079ArgGlu: 6.079 ± 0.199
2.071ArgPhe: 2.071 ± 0.104
3.809ArgGly: 3.809 ± 0.159
0.952ArgHis: 0.952 ± 0.069
3.874ArgIle: 3.874 ± 0.148
5.312ArgLys: 5.312 ± 0.189
4.466ArgLeu: 4.466 ± 0.166
1.456ArgMet: 1.456 ± 0.095
2.649ArgAsn: 2.649 ± 0.12
1.678ArgPro: 1.678 ± 0.084
1.234ArgGln: 1.234 ± 0.084
3.412ArgArg: 3.412 ± 0.179
2.765ArgSer: 2.765 ± 0.118
2.288ArgThr: 2.288 ± 0.114
3.056ArgVal: 3.056 ± 0.136
0.606ArgTrp: 0.606 ± 0.062
1.604ArgTyr: 1.604 ± 0.096
0.0ArgXaa: 0.0 ± 0.0
Ser
3.333SerAla: 3.333 ± 0.134
0.689SerCys: 0.689 ± 0.064
4.133SerAsp: 4.133 ± 0.154
6.149SerGlu: 6.149 ± 0.157
2.626SerPhe: 2.626 ± 0.103
5.243SerGly: 5.243 ± 0.206
1.285SerHis: 1.285 ± 0.065
4.239SerIle: 4.239 ± 0.135
5.012SerLys: 5.012 ± 0.175
5.691SerLeu: 5.691 ± 0.156
1.336SerMet: 1.336 ± 0.077
2.594SerAsn: 2.594 ± 0.17
2.728SerPro: 2.728 ± 0.13
1.683SerGln: 1.683 ± 0.098
3.343SerArg: 3.343 ± 0.131
4.693SerSer: 4.693 ± 0.18
2.852SerThr: 2.852 ± 0.138
4.092SerVal: 4.092 ± 0.137
0.675SerTrp: 0.675 ± 0.066
2.187SerTyr: 2.187 ± 0.111
0.0SerXaa: 0.0 ± 0.0
Thr
2.94ThrAla: 2.94 ± 0.135
0.532ThrCys: 0.532 ± 0.052
2.746ThrAsp: 2.746 ± 0.127
4.059ThrGlu: 4.059 ± 0.141
1.932ThrPhe: 1.932 ± 0.103
3.92ThrGly: 3.92 ± 0.142
0.883ThrHis: 0.883 ± 0.063
3.287ThrIle: 3.287 ± 0.152
2.792ThrLys: 2.792 ± 0.122
4.471ThrLeu: 4.471 ± 0.152
0.925ThrMet: 0.925 ± 0.081
1.738ThrAsn: 1.738 ± 0.116
2.196ThrPro: 2.196 ± 0.116
1.207ThrGln: 1.207 ± 0.083
2.043ThrArg: 2.043 ± 0.107
3.093ThrSer: 3.093 ± 0.15
2.603ThrThr: 2.603 ± 0.146
3.759ThrVal: 3.759 ± 0.147
0.518ThrTrp: 0.518 ± 0.059
1.674ThrTyr: 1.674 ± 0.11
0.0ThrXaa: 0.0 ± 0.0
Val
3.726ValAla: 3.726 ± 0.152
0.892ValCys: 0.892 ± 0.067
4.346ValAsp: 4.346 ± 0.155
6.708ValGlu: 6.708 ± 0.2
2.751ValPhe: 2.751 ± 0.12
4.582ValGly: 4.582 ± 0.163
1.262ValHis: 1.262 ± 0.066
4.189ValIle: 4.189 ± 0.154
5.155ValLys: 5.155 ± 0.172
5.631ValLeu: 5.631 ± 0.153
1.41ValMet: 1.41 ± 0.082
2.654ValAsn: 2.654 ± 0.109
2.825ValPro: 2.825 ± 0.104
1.646ValGln: 1.646 ± 0.09
3.292ValArg: 3.292 ± 0.115
4.776ValSer: 4.776 ± 0.16
3.111ValThr: 3.111 ± 0.136
4.304ValVal: 4.304 ± 0.144
0.652ValTrp: 0.652 ± 0.054
1.761ValTyr: 1.761 ± 0.09
0.0ValXaa: 0.0 ± 0.0
Trp
0.522TrpAla: 0.522 ± 0.058
0.19TrpCys: 0.19 ± 0.028
0.629TrpAsp: 0.629 ± 0.055
1.045TrpGlu: 1.045 ± 0.071
0.578TrpPhe: 0.578 ± 0.047
0.818TrpGly: 0.818 ± 0.068
0.277TrpHis: 0.277 ± 0.035
0.902TrpIle: 0.902 ± 0.067
0.975TrpLys: 0.975 ± 0.068
1.04TrpLeu: 1.04 ± 0.069
0.305TrpMet: 0.305 ± 0.04
0.656TrpAsn: 0.656 ± 0.056
0.328TrpPro: 0.328 ± 0.039
0.268TrpGln: 0.268 ± 0.04
0.758TrpArg: 0.758 ± 0.065
0.772TrpSer: 0.772 ± 0.084
0.527TrpThr: 0.527 ± 0.06
0.615TrpVal: 0.615 ± 0.055
0.24TrpTrp: 0.24 ± 0.037
0.342TrpTyr: 0.342 ± 0.041
0.0TrpXaa: 0.0 ± 0.0
Tyr
1.452TyrAla: 1.452 ± 0.077
0.472TyrCys: 0.472 ± 0.051
2.071TyrAsp: 2.071 ± 0.104
2.774TyrGlu: 2.774 ± 0.118
1.415TyrPhe: 1.415 ± 0.071
2.335TyrGly: 2.335 ± 0.123
0.615TyrHis: 0.615 ± 0.058
1.877TyrIle: 1.877 ± 0.095
2.053TyrLys: 2.053 ± 0.098
2.926TyrLeu: 2.926 ± 0.133
0.73TyrMet: 0.73 ± 0.063
1.331TyrAsn: 1.331 ± 0.09
1.452TyrPro: 1.452 ± 0.088
0.994TyrGln: 0.994 ± 0.063
2.057TyrArg: 2.057 ± 0.098
2.09TyrSer: 2.09 ± 0.111
1.47TyrThr: 1.47 ± 0.092
1.724TyrVal: 1.724 ± 0.093
0.472TyrTrp: 0.472 ± 0.055
1.35TyrTyr: 1.35 ± 0.088
0.0TyrXaa: 0.0 ± 0.0
Xaa
0.0XaaAla: 0.0 ± 0.0
0.0XaaCys: 0.0 ± 0.0
0.0XaaAsp: 0.0 ± 0.0
0.0XaaGlu: 0.0 ± 0.0
0.0XaaPhe: 0.0 ± 0.0
0.0XaaGly: 0.0 ± 0.0
0.0XaaHis: 0.0 ± 0.0
0.0XaaIle: 0.0 ± 0.0
0.0XaaLys: 0.0 ± 0.0
0.0XaaLeu: 0.0 ± 0.0
0.0XaaMet: 0.0 ± 0.0
0.0XaaAsn: 0.0 ± 0.0
0.0XaaPro: 0.0 ± 0.0
0.0XaaGln: 0.0 ± 0.0
0.0XaaArg: 0.0 ± 0.0
0.0XaaSer: 0.0 ± 0.0
0.0XaaThr: 0.0 ± 0.0
0.0XaaVal: 0.0 ± 0.0
0.0XaaTrp: 0.0 ± 0.0
0.0XaaTyr: 0.0 ± 0.0
0.0XaaXaa: 0.0 ± 0.0
Statistics based on 947 proteins (216303 amino acids)

Note: The error was estimated by bootstrapping (x100) at the protein level.
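
One way such an error estimate could be obtained is sketched below. The page only states that bootstrapping was done 100 times at the protein level; resampling whole proteins with replacement, recomputing the pooled frequency for each replicate, and reporting the standard deviation across replicates are assumptions of this example. The function names and the toy sequences are hypothetical.

```python
import random
import statistics
from collections import Counter


def pair_counts(seq):
    """Count overlapping adjacent residue pairs in one protein."""
    return Counter(seq[i:i + 2] for i in range(len(seq) - 1))


def bootstrap_error(proteins, dipeptide, n_boot=100, seed=0):
    """Return (mean, standard deviation) of a dipeptide frequency in per mille.

    Each replicate resamples whole proteins with replacement, so all pairs
    from one protein stay together (protein-level bootstrap).
    """
    rng = random.Random(seed)
    per_protein = [pair_counts(p) for p in proteins]
    estimates = []
    for _ in range(n_boot):
        sample = rng.choices(per_protein, k=len(per_protein))
        total = sum(sum(c.values()) for c in sample)
        hits = sum(c[dipeptide] for c in sample)
        estimates.append(1000.0 * hits / total if total else 0.0)
    return statistics.mean(estimates), statistics.stdev(estimates)


toy = ["MEEKLA", "MKEELX", "MAEEK", "MLLKEE"]  # made-up sequences
mean, err = bootstrap_error(toy, "EE")
print(f"EE: {mean:.1f} ± {err:.1f} per mille")
```

A dipeptide that never occurs yields 0.0 ± 0.0 under this scheme, which matches how the Xaa rows and columns appear in the table.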

The dipeptide statistics above (among other statistics for this proteome) can be downloaded as a CSV file.
See this proteome in: uniprot_link
Proteome-pI is available under a Creative Commons Attribution-NoDerivs license.

Reference: Kozlowski LP. Proteome-pI 2.0: Proteome Isoelectric Point Database Update. Nucleic Acids Res. 2021, doi: 10.1093/nar/gkab944
Contact: Lukasz P. Kozlowski