Amino acid dipeptide frequency for candidate division MSBL1 archaeon SCGC-AAA259D14

Apart from single amino acid frequencies, one can also calculate the so-called amino acid dipeptide frequency.

There are 400 possible dipeptides (441 if we consider Xaa as an additional 21st amino acid representing all non-standard or unknown amino acids). Thus, if dipeptides occurred randomly in proteins, each amino acid dipeptide should be present with a frequency of 0.25% (1/400). As this is not the case in nature, for better visibility all overrepresented dipeptides are marked in red, and underrepresented ones are marked in blue in the table.
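The counting described above can be illustrated with a minimal sketch. It assumes that dipeptides are counted as overlapping pairs of adjacent residues and that any non-standard residue is mapped to Xaa (written here as X); the page does not state how the database actually counts them. The function name `dipeptide_per_mille` and the toy sequences are hypothetical and are not taken from this proteome.

```python
from collections import Counter
from itertools import product

# 20 standard amino acids in one-letter code, in the same order as the table;
# anything else is mapped to X (Xaa). This mapping is an assumption.
STANDARD = "ACDEFGHIKLMNPQRSTVWY"

def dipeptide_per_mille(sequences):
    """Return {dipeptide: frequency in per mille} over overlapping pairs."""
    counts = Counter()
    for seq in sequences:
        # Map non-standard or unknown residues to X.
        cleaned = "".join(c if c in STANDARD else "X" for c in seq.upper())
        # Overlapping windows: a protein of length n contributes n - 1 dipeptides.
        counts.update(cleaned[i:i + 2] for i in range(len(cleaned) - 1))
    total = sum(counts.values())
    alphabet = STANDARD + "X"
    return {
        a + b: (1000.0 * counts[a + b] / total if total else 0.0)
        for a, b in product(alphabet, repeat=2)
    }

# Toy input (hypothetical sequences).
freqs = dipeptide_per_mille(["MKEEL", "MEEK"])
print(len(freqs))             # 441 possible dipeptides, including Xaa
print(round(freqs["EE"], 1))  # 285.7 (2 of the 7 dipeptides are EE)
```

Returning all 441 keys, including those with zero counts, mirrors the layout of the table, where the Xaa row and column are listed even though they are empty for this proteome.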

All values are presented in per mille (‰) and therefore need to be multiplied by 10^-3 to obtain fractions.
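As a small worked example of the unit conversion, the sketch below converts a few values copied from the table into fractions and percentages and compares them with the uniform expectation of 0.25% (2.5‰). Treating that uniform value as the threshold for over- and underrepresentation is an assumption based on the text above; the database may use a different reference. The helper `classify` is hypothetical.

```python
# Uniform expectation over the 400 standard dipeptides, in per mille.
EXPECTED_PER_MILLE = 1000.0 / 400  # 2.5 per mille, i.e. 0.25%

# A few values copied from the table (per mille).
table_values = {"GluGlu": 12.054, "GluLys": 10.748, "AlaPro": 1.917, "CysTrp": 0.104}

def classify(value_per_mille, expected=EXPECTED_PER_MILLE):
    """Label a dipeptide relative to the (assumed) uniform expectation."""
    if value_per_mille > expected:
        return "over"
    if value_per_mille < expected:
        return "under"
    return "as expected"

for name, value in table_values.items():
    fraction = value * 1e-3  # per mille -> fraction
    percent = value / 10.0   # per mille -> percent
    print(f"{name}: {fraction:.6f} = {percent:.4f}% ({classify(value)})")
```

For instance, GluGlu at 12.054‰ corresponds to a fraction of 0.012054, or about 1.2%, which is well above 0.25%.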

For more information, see the sequence space article on Wikipedia.

Ala Cys Asp Glu Phe Gly His Ile Lys Leu Met Asn Pro Gln Arg Ser Thr Val Trp Tyr Xaa
Ala
AlaAla: 4.265 ± 0.204
AlaCys: 0.7 ± 0.079
AlaAsp: 3.109 ± 0.129
AlaGlu: 5.669 ± 0.183
AlaPhe: 2.441 ± 0.123
AlaGly: 5.12 ± 0.194
AlaHis: 1.016 ± 0.085
AlaIle: 4.384 ± 0.184
AlaLys: 4.11 ± 0.166
AlaLeu: 5.633 ± 0.176
AlaMet: 1.498 ± 0.093
AlaAsn: 1.746 ± 0.101
AlaPro: 1.917 ± 0.114
AlaGln: 1.461 ± 0.081
AlaArg: 3.565 ± 0.158
AlaSer: 3.664 ± 0.153
AlaThr: 2.752 ± 0.144
AlaVal: 4.664 ± 0.182
AlaTrp: 0.772 ± 0.063
AlaTyr: 1.617 ± 0.082
AlaXaa: 0.0 ± 0.0
Cys
CysAla: 0.508 ± 0.046
CysCys: 0.155 ± 0.034
CysAsp: 0.575 ± 0.053
CysGlu: 1.057 ± 0.083
CysPhe: 0.415 ± 0.045
CysGly: 1.332 ± 0.099
CysHis: 0.311 ± 0.047
CysIle: 0.575 ± 0.062
CysLys: 0.513 ± 0.058
CysLeu: 0.751 ± 0.056
CysMet: 0.223 ± 0.032
CysAsn: 0.28 ± 0.042
CysPro: 0.824 ± 0.08
CysGln: 0.275 ± 0.036
CysArg: 0.643 ± 0.061
CysSer: 0.85 ± 0.091
CysThr: 0.472 ± 0.052
CysVal: 0.648 ± 0.062
CysTrp: 0.104 ± 0.02
CysTyr: 0.311 ± 0.049
CysXaa: 0.0 ± 0.0
Asp
AspAla: 2.85 ± 0.143
AspCys: 0.72 ± 0.065
AspAsp: 2.715 ± 0.132
AspGlu: 5.498 ± 0.17
AspPhe: 2.679 ± 0.141
AspGly: 3.762 ± 0.142
AspHis: 0.86 ± 0.071
AspIle: 4.457 ± 0.174
AspLys: 3.643 ± 0.162
AspLeu: 6.483 ± 0.218
AspMet: 1.389 ± 0.094
AspAsn: 1.726 ± 0.113
AspPro: 3.026 ± 0.136
AspGln: 1.031 ± 0.08
AspArg: 3.659 ± 0.159
AspSer: 2.995 ± 0.136
AspThr: 2.047 ± 0.093
AspVal: 4.156 ± 0.155
AspTrp: 0.788 ± 0.071
AspTyr: 2.021 ± 0.102
AspXaa: 0.0 ± 0.0
Glu
GluAla: 5.457 ± 0.19
GluCys: 0.803 ± 0.07
GluAsp: 5.825 ± 0.177
GluGlu: 12.054 ± 0.369
GluPhe: 3.482 ± 0.141
GluGly: 6.473 ± 0.211
GluHis: 1.363 ± 0.086
GluIle: 8.38 ± 0.226
GluLys: 10.748 ± 0.338
GluLeu: 8.121 ± 0.226
GluMet: 2.451 ± 0.112
GluAsn: 5.317 ± 0.191
GluPro: 2.923 ± 0.118
GluGln: 1.721 ± 0.098
GluArg: 6.053 ± 0.218
GluSer: 4.835 ± 0.189
GluThr: 4.56 ± 0.173
GluVal: 6.882 ± 0.226
GluTrp: 0.928 ± 0.077
GluTyr: 2.358 ± 0.113
GluXaa: 0.0 ± 0.0
Phe
PheAla: 2.234 ± 0.121
PheCys: 0.529 ± 0.054
PheAsp: 2.757 ± 0.118
PheGlu: 3.617 ± 0.137
PhePhe: 1.933 ± 0.139
PheGly: 3.311 ± 0.158
PheHis: 0.803 ± 0.066
PheIle: 2.285 ± 0.131
PheLys: 2.213 ± 0.102
PheLeu: 4.477 ± 0.2
PheMet: 0.86 ± 0.075
PheAsn: 1.052 ± 0.082
PhePro: 1.591 ± 0.096
PheGln: 1.114 ± 0.074
PheArg: 2.011 ± 0.098
PheSer: 3.358 ± 0.154
PheThr: 1.632 ± 0.099
PheVal: 2.467 ± 0.122
PheTrp: 0.534 ± 0.064
PheTyr: 1.311 ± 0.088
PheXaa: 0.0 ± 0.0
Gly
GlyAla: 4.742 ± 0.227
GlyCys: 0.886 ± 0.083
GlyAsp: 3.653 ± 0.151
GlyGlu: 7.644 ± 0.208
GlyPhe: 3.286 ± 0.155
GlyGly: 6.203 ± 0.216
GlyHis: 1.161 ± 0.083
GlyIle: 5.763 ± 0.208
GlyLys: 5.882 ± 0.186
GlyLeu: 6.675 ± 0.233
GlyMet: 1.943 ± 0.106
GlyAsn: 2.529 ± 0.096
GlyPro: 2.498 ± 0.122
GlyGln: 1.482 ± 0.096
GlyArg: 4.296 ± 0.155
GlySer: 4.799 ± 0.187
GlyThr: 3.534 ± 0.162
GlyVal: 5.545 ± 0.189
GlyTrp: 0.922 ± 0.076
GlyTyr: 2.441 ± 0.108
GlyXaa: 0.0 ± 0.0
His
HisAla: 0.948 ± 0.078
HisCys: 0.275 ± 0.044
HisAsp: 0.943 ± 0.065
HisGlu: 1.254 ± 0.094
HisPhe: 0.694 ± 0.059
HisGly: 1.321 ± 0.079
HisHis: 0.306 ± 0.042
HisIle: 1.052 ± 0.078
HisLys: 0.684 ± 0.058
HisLeu: 1.746 ± 0.11
HisMet: 0.326 ± 0.044
HisAsn: 0.58 ± 0.05
HisPro: 1.104 ± 0.073
HisGln: 0.389 ± 0.054
HisArg: 0.84 ± 0.07
HisSer: 1.073 ± 0.077
HisThr: 0.627 ± 0.058
HisVal: 1.093 ± 0.082
HisTrp: 0.207 ± 0.038
HisTyr: 0.57 ± 0.06
HisXaa: 0.0 ± 0.0
Ile
IleAla: 4.954 ± 0.178
IleCys: 0.865 ± 0.077
IleAsp: 4.514 ± 0.164
IleGlu: 7.069 ± 0.234
IlePhe: 2.923 ± 0.175
IleGly: 5.773 ± 0.235
IleHis: 1.27 ± 0.082
IleIle: 4.887 ± 0.22
IleLys: 4.369 ± 0.164
IleLeu: 6.644 ± 0.219
IleMet: 1.306 ± 0.094
IleAsn: 2.544 ± 0.13
IlePro: 3.436 ± 0.17
IleGln: 1.663 ± 0.091
IleArg: 3.913 ± 0.135
IleSer: 4.944 ± 0.188
IleThr: 3.28 ± 0.144
IleVal: 5.032 ± 0.21
IleTrp: 0.658 ± 0.066
IleTyr: 1.886 ± 0.101
IleXaa: 0.0 ± 0.0
Lys
LysAla: 4.343 ± 0.163
LysCys: 0.741 ± 0.072
LysAsp: 4.073 ± 0.156
LysGlu: 7.96 ± 0.251
LysPhe: 2.472 ± 0.112
LysGly: 4.98 ± 0.2
LysHis: 1.021 ± 0.086
LysIle: 6.721 ± 0.236
LysLys: 6.944 ± 0.263
LysLeu: 6.167 ± 0.199
LysMet: 1.814 ± 0.09
LysAsn: 3.524 ± 0.146
LysPro: 2.555 ± 0.125
LysGln: 1.549 ± 0.102
LysArg: 4.167 ± 0.18
LysSer: 4.317 ± 0.17
LysThr: 3.975 ± 0.147
LysVal: 5.358 ± 0.187
LysTrp: 0.783 ± 0.068
LysTyr: 2.296 ± 0.099
LysXaa: 0.0 ± 0.0
Leu
LeuAla: 6.312 ± 0.196
LeuCys: 0.793 ± 0.069
LeuAsp: 5.768 ± 0.203
LeuGlu: 9.105 ± 0.302
LeuPhe: 3.327 ± 0.17
LeuGly: 6.913 ± 0.229
LeuHis: 1.477 ± 0.101
LeuIle: 5.721 ± 0.202
LeuLys: 6.882 ± 0.221
LeuLeu: 7.478 ± 0.23
LeuMet: 1.752 ± 0.106
LeuAsn: 3.617 ± 0.148
LeuPro: 3.705 ± 0.145
LeuGln: 2.228 ± 0.114
LeuArg: 4.954 ± 0.186
LeuSer: 6.592 ± 0.208
LeuThr: 4.514 ± 0.177
LeuVal: 5.742 ± 0.188
LeuTrp: 0.871 ± 0.088
LeuTyr: 2.306 ± 0.108
LeuXaa: 0.0 ± 0.0
Met
MetAla: 1.358 ± 0.082
MetCys: 0.285 ± 0.033
MetAsp: 1.404 ± 0.08
MetGlu: 2.12 ± 0.12
MetPhe: 0.622 ± 0.058
MetGly: 1.793 ± 0.116
MetHis: 0.301 ± 0.038
MetIle: 1.57 ± 0.086
MetLys: 1.985 ± 0.102
MetLeu: 1.684 ± 0.095
MetMet: 0.44 ± 0.046
MetAsn: 1.135 ± 0.074
MetPro: 0.922 ± 0.073
MetGln: 0.43 ± 0.052
MetArg: 1.202 ± 0.075
MetSer: 1.632 ± 0.093
MetThr: 1.031 ± 0.074
MetVal: 1.721 ± 0.095
MetTrp: 0.176 ± 0.025
MetTyr: 0.415 ± 0.05
MetXaa: 0.0 ± 0.0
Asn
AsnAla: 2.171 ± 0.132
AsnCys: 0.544 ± 0.056
AsnAsp: 1.425 ± 0.09
AsnGlu: 2.783 ± 0.124
AsnPhe: 1.855 ± 0.099
AsnGly: 2.275 ± 0.111
AsnHis: 0.637 ± 0.06
AsnIle: 2.861 ± 0.134
AsnLys: 2.275 ± 0.112
AsnLeu: 4.208 ± 0.161
AsnMet: 0.777 ± 0.063
AsnAsn: 1.104 ± 0.073
AsnPro: 2.218 ± 0.112
AsnGln: 1.047 ± 0.08
AsnArg: 2.135 ± 0.128
AsnSer: 2.415 ± 0.12
AsnThr: 1.638 ± 0.102
AsnVal: 2.881 ± 0.124
AsnTrp: 0.544 ± 0.049
AsnTyr: 1.591 ± 0.103
AsnXaa: 0.0 ± 0.0
Pro
ProAla: 2.27 ± 0.119
ProCys: 0.347 ± 0.044
ProAsp: 2.586 ± 0.102
ProGlu: 4.773 ± 0.151
ProPhe: 1.529 ± 0.089
ProGly: 3.037 ± 0.163
ProHis: 0.783 ± 0.059
ProIle: 2.638 ± 0.127
ProLys: 2.684 ± 0.123
ProLeu: 3.125 ± 0.14
ProMet: 0.669 ± 0.06
ProAsn: 1.415 ± 0.083
ProPro: 1.995 ± 0.119
ProGln: 1.062 ± 0.076
ProArg: 1.897 ± 0.114
ProSer: 3.12 ± 0.13
ProThr: 2.016 ± 0.1
ProVal: 2.964 ± 0.123
ProTrp: 0.456 ± 0.052
ProTyr: 1.384 ± 0.082
ProXaa: 0.0 ± 0.0
Gln
GlnAla: 1.394 ± 0.086
GlnCys: 0.176 ± 0.034
GlnAsp: 1.192 ± 0.089
GlnGlu: 2.052 ± 0.107
GlnPhe: 0.736 ± 0.064
GlnGly: 1.549 ± 0.097
GlnHis: 0.316 ± 0.039
GlnIle: 1.695 ± 0.1
GlnLys: 2.187 ± 0.132
GlnLeu: 1.726 ± 0.102
GlnMet: 0.513 ± 0.051
GlnAsn: 1.13 ± 0.082
GlnPro: 0.829 ± 0.068
GlnGln: 0.586 ± 0.066
GlnArg: 1.394 ± 0.093
GlnSer: 1.145 ± 0.077
GlnThr: 1.176 ± 0.09
GlnVal: 1.591 ± 0.095
GlnTrp: 0.238 ± 0.033
GlnTyr: 0.653 ± 0.063
GlnXaa: 0.0 ± 0.0
Arg
ArgAla: 3.322 ± 0.139
ArgCys: 0.44 ± 0.052
ArgAsp: 3.161 ± 0.14
ArgGlu: 6.613 ± 0.186
ArgPhe: 2.373 ± 0.109
ArgGly: 4.053 ± 0.182
ArgHis: 0.705 ± 0.056
ArgIle: 4.021 ± 0.136
ArgLys: 5.327 ± 0.191
ArgLeu: 4.488 ± 0.182
ArgMet: 1.461 ± 0.095
ArgAsn: 2.42 ± 0.117
ArgPro: 1.829 ± 0.095
ArgGln: 1.088 ± 0.074
ArgArg: 3.814 ± 0.181
ArgSer: 3.125 ± 0.124
ArgThr: 2.767 ± 0.14
ArgVal: 3.949 ± 0.159
ArgTrp: 0.71 ± 0.067
ArgTyr: 1.752 ± 0.079
ArgXaa: 0.0 ± 0.0
Ser
SerAla: 3.353 ± 0.132
SerCys: 0.632 ± 0.063
SerAsp: 3.42 ± 0.135
SerGlu: 6.794 ± 0.191
SerPhe: 2.829 ± 0.133
SerGly: 5.405 ± 0.166
SerHis: 0.938 ± 0.073
SerIle: 4.353 ± 0.177
SerLys: 4.742 ± 0.171
SerLeu: 5.934 ± 0.174
SerMet: 1.565 ± 0.089
SerAsn: 1.99 ± 0.105
SerPro: 2.596 ± 0.128
SerGln: 1.638 ± 0.089
SerArg: 3.612 ± 0.136
SerSer: 4.281 ± 0.173
SerThr: 2.726 ± 0.114
SerVal: 4.591 ± 0.146
SerTrp: 0.777 ± 0.071
SerTyr: 1.845 ± 0.08
SerXaa: 0.0 ± 0.0
Thr
ThrAla: 3.011 ± 0.131
ThrCys: 0.492 ± 0.049
ThrAsp: 2.591 ± 0.12
ThrGlu: 3.97 ± 0.163
ThrPhe: 1.866 ± 0.112
ThrGly: 4.172 ± 0.167
ThrHis: 0.819 ± 0.068
ThrIle: 3.394 ± 0.146
ThrLys: 2.643 ± 0.111
ThrLeu: 4.369 ± 0.16
ThrMet: 0.954 ± 0.072
ThrAsn: 1.581 ± 0.092
ThrPro: 2.322 ± 0.114
ThrGln: 0.943 ± 0.063
ThrArg: 2.322 ± 0.102
ThrSer: 2.897 ± 0.136
ThrThr: 2.342 ± 0.107
ThrVal: 3.933 ± 0.165
ThrTrp: 0.482 ± 0.047
ThrTyr: 1.456 ± 0.082
ThrXaa: 0.0 ± 0.0
Val
ValAla: 4.213 ± 0.16
ValCys: 0.922 ± 0.079
ValAsp: 4.389 ± 0.145
ValGlu: 7.079 ± 0.195
ValPhe: 2.99 ± 0.14
ValGly: 5.229 ± 0.202
ValHis: 1.187 ± 0.072
ValIle: 4.545 ± 0.198
ValLys: 5.25 ± 0.184
ValLeu: 6.374 ± 0.227
ValMet: 1.451 ± 0.093
ValAsn: 2.332 ± 0.111
ValPro: 2.747 ± 0.109
ValGln: 1.549 ± 0.092
ValArg: 3.887 ± 0.144
ValSer: 5.213 ± 0.173
ValThr: 3.69 ± 0.148
ValVal: 4.897 ± 0.202
ValTrp: 0.757 ± 0.064
ValTyr: 1.969 ± 0.09
ValXaa: 0.0 ± 0.0
Trp
TrpAla: 0.679 ± 0.066
TrpCys: 0.13 ± 0.029
TrpAsp: 0.601 ± 0.053
TrpGlu: 0.954 ± 0.082
TrpPhe: 0.332 ± 0.049
TrpGly: 0.86 ± 0.069
TrpHis: 0.13 ± 0.025
TrpIle: 0.84 ± 0.075
TrpLys: 0.943 ± 0.074
TrpLeu: 0.954 ± 0.074
TrpMet: 0.29 ± 0.037
TrpAsn: 0.477 ± 0.048
TrpPro: 0.326 ± 0.042
TrpGln: 0.228 ± 0.035
TrpArg: 0.985 ± 0.08
TrpSer: 0.726 ± 0.072
TrpThr: 0.554 ± 0.079
TrpVal: 0.814 ± 0.067
TrpTrp: 0.181 ± 0.034
TrpTyr: 0.285 ± 0.041
TrpXaa: 0.0 ± 0.0
Tyr
TyrAla: 1.721 ± 0.09
TyrCys: 0.383 ± 0.05
TyrAsp: 1.809 ± 0.101
TyrGlu: 2.669 ± 0.117
TyrPhe: 1.306 ± 0.08
TyrGly: 2.322 ± 0.106
TyrHis: 0.632 ± 0.049
TyrIle: 1.788 ± 0.085
TyrLys: 1.736 ± 0.094
TyrLeu: 3.021 ± 0.124
TyrMet: 0.539 ± 0.052
TyrAsn: 0.886 ± 0.068
TyrPro: 1.384 ± 0.092
TyrGln: 0.793 ± 0.064
TyrArg: 2.073 ± 0.114
TyrSer: 2.042 ± 0.112
TyrThr: 1.223 ± 0.076
TyrVal: 1.741 ± 0.104
TyrTrp: 0.399 ± 0.049
TyrTyr: 1.026 ± 0.082
TyrXaa: 0.0 ± 0.0
Xaa
XaaAla: 0.0 ± 0.0
XaaCys: 0.0 ± 0.0
XaaAsp: 0.0 ± 0.0
XaaGlu: 0.0 ± 0.0
XaaPhe: 0.0 ± 0.0
XaaGly: 0.0 ± 0.0
XaaHis: 0.0 ± 0.0
XaaIle: 0.0 ± 0.0
XaaLys: 0.0 ± 0.0
XaaLeu: 0.0 ± 0.0
XaaMet: 0.0 ± 0.0
XaaAsn: 0.0 ± 0.0
XaaPro: 0.0 ± 0.0
XaaGln: 0.0 ± 0.0
XaaArg: 0.0 ± 0.0
XaaSer: 0.0 ± 0.0
XaaThr: 0.0 ± 0.0
XaaVal: 0.0 ± 0.0
XaaTrp: 0.0 ± 0.0
XaaTyr: 0.0 ± 0.0
XaaXaa: 0.0 ± 0.0
Statistics based on 885 proteins (192968 amino acids)

Note: The error was estimated by bootstrapping (x100) at the protein level.
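A protein-level bootstrap of this kind can be sketched as follows. This is only an illustration under stated assumptions: whole proteins are resampled with replacement 100 times, the frequency is recomputed for each resample, and the spread is reported as the sample standard deviation. The page does not say which spread measure the database uses, and the names `dipeptide_per_mille` and `bootstrap_error`, as well as the toy sequences, are hypothetical.

```python
import random
import statistics

def dipeptide_per_mille(sequences, dipeptide):
    """Frequency (per mille) of one dipeptide over overlapping pairs."""
    hits = total = 0
    for seq in sequences:
        for i in range(len(seq) - 1):
            total += 1
            hits += seq[i:i + 2] == dipeptide
    return 1000.0 * hits / total if total else 0.0

def bootstrap_error(sequences, dipeptide, n_resamples=100, seed=0):
    """Standard deviation of the frequency across protein-level resamples."""
    rng = random.Random(seed)
    estimates = []
    for _ in range(n_resamples):
        # Resample whole proteins with replacement, keeping the proteome size.
        sample = rng.choices(sequences, k=len(sequences))
        estimates.append(dipeptide_per_mille(sample, dipeptide))
    return statistics.stdev(estimates)

# Toy proteome (hypothetical sequences, one-letter code).
proteins = ["MKEEL", "MEEK", "MAGLL", "MEEEE"]
print(dipeptide_per_mille(proteins, "EE"), bootstrap_error(proteins, "EE"))
```

Resampling proteins rather than individual dipeptides keeps the residues of each protein together, so the estimate reflects variation between proteins.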

The dipeptide statistics above (among other statistics for this proteome) can be downloaded from this CSV file
See this proteome in: uniprot_link
Proteome-pI is available under Creative Commons Attribution-NoDerivs license, for more details see here

Reference: Kozlowski LP. Proteome-pI 2.0: Proteome Isoelectric Point Database Update. Nucleic Acids Res. 2021, doi: 10.1093/nar/gkab944
Contact: Lukasz P. Kozlowski