Amino acid dipeptide frequency for Marine Group I thaumarchaeote SCGC AAA799-O18

Apart from single amino acid frequencies, one can also calculate the so-called amino acid dipeptide frequency.

There are 400 possible dipeptides (441 if Xaa is counted as an additional 21st amino acid representing all non-standard or unknown amino acids). Thus, if dipeptides occurred in proteins at random with equal probability, each of the 400 dipeptides would be present at a frequency of 0.25% (2.5‰). Since this is not the case in nature, for better visibility all dipeptides that are more frequent than expected are marked in red, and those that are underrepresented are marked in blue in the table.
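The counting described above can be sketched in Python. This is a minimal illustration, not the Proteome-pI implementation: it assumes that dipeptides are counted as overlapping adjacent pairs within each protein, that pairs never span two proteins, and that any residue outside the 20 standard one-letter codes is mapped to X (Xaa). The names dipeptide_per_mille, label and UNIFORM_PER_MILLE are hypothetical.

```python
from collections import Counter
from itertools import product

STANDARD = "ACDEFGHIKLMNPQRSTVWY"  # 20 standard amino acids, one-letter codes
ALPHABET = STANDARD + "X"          # X stands for Xaa (assumption: all other residues)

# Uniform expectation over the 400 standard dipeptides: 0.25% = 2.5 per mille.
UNIFORM_PER_MILLE = 1000.0 / 400


def dipeptide_per_mille(proteins):
    """Return {dipeptide: frequency in per mille} for all 441 combinations."""
    counts = Counter()
    for seq in proteins:
        # Map anything outside the 20 standard residues to X.
        cleaned = "".join(c if c in STANDARD else "X" for c in seq.upper())
        # Overlapping adjacent pairs; pairs never span two proteins.
        counts.update(a + b for a, b in zip(cleaned, cleaned[1:]))
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no dipeptides found (all sequences shorter than 2)")
    return {
        a + b: 1000.0 * counts[a + b] / total
        for a, b in product(ALPHABET, repeat=2)
    }


def label(freq_per_mille):
    """Classify a frequency against the uniform 2.5 per mille expectation."""
    if freq_per_mille > UNIFORM_PER_MILLE:
        return "over"
    if freq_per_mille < UNIFORM_PER_MILLE:
        return "under"
    return "as expected"


if __name__ == "__main__":
    freqs = dipeptide_per_mille(["MKKLLA", "MAKI"])
    for pair in ("KK", "MA", "WW"):
        print(pair, round(freqs[pair], 3), label(freqs[pair]))
```

With this convention, a proteome of N residues in P proteins (each at least two residues long) yields N - P dipeptides in total. Whether the red and blue marking on the page uses exactly the 2.5‰ threshold is an assumption here.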

All values are given in per mille (‰), so they need to be multiplied by 10^-3 to obtain fractions.
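As a small worked example of the unit conversion, the following sketch turns a per mille value into a fraction and into a percentage. The function names are hypothetical, and the sample value is the AlaAla entry from the table.

```python
def per_mille_to_fraction(value):
    """Convert a per mille value to a plain fraction (multiply by 10^-3)."""
    return value * 1e-3


def per_mille_to_percent(value):
    """Convert a per mille value to percent (10 per mille = 1%)."""
    return value / 10.0


if __name__ == "__main__":
    ala_ala = 4.426  # AlaAla frequency in per mille, taken from the table
    print(per_mille_to_fraction(ala_ala))  # fraction of all dipeptides
    print(per_mille_to_percent(ala_ala))   # the same value in percent
```

On this scale the uniform expectation of 0.25% corresponds to 2.5‰, which makes the table values directly comparable to it.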

For more information, see the sequence space article on Wikipedia.

Ala Cys Asp Glu Phe Gly His Ile Lys Leu Met Asn Pro Gln Arg Ser Thr Val Trp Tyr Xaa
Ala
AlaAla: 4.426 ± 0.22
AlaCys: 0.806 ± 0.078
AlaAsp: 2.909 ± 0.137
AlaGlu: 3.992 ± 0.173
AlaPhe: 2.474 ± 0.116
AlaGly: 4.369 ± 0.191
AlaHis: 1.045 ± 0.076
AlaIle: 5.15 ± 0.205
AlaLys: 5.843 ± 0.232
AlaLeu: 5.301 ± 0.22
AlaMet: 1.517 ± 0.107
AlaAsn: 2.581 ± 0.13
AlaPro: 1.53 ± 0.098
AlaGln: 1.643 ± 0.107
AlaArg: 2.399 ± 0.143
AlaSer: 4.036 ± 0.194
AlaThr: 3.236 ± 0.125
AlaVal: 4.162 ± 0.206
AlaTrp: 0.441 ± 0.042
AlaTyr: 1.807 ± 0.113
AlaXaa: 0.006 ± 0.006
Cys
CysAla: 0.611 ± 0.067
CysCys: 0.183 ± 0.032
CysAsp: 0.686 ± 0.075
CysGlu: 0.623 ± 0.064
CysPhe: 0.466 ± 0.054
CysGly: 1.133 ± 0.089
CysHis: 0.258 ± 0.041
CysIle: 0.944 ± 0.07
CysLys: 0.944 ± 0.076
CysLeu: 0.844 ± 0.075
CysMet: 0.239 ± 0.037
CysAsn: 0.478 ± 0.061
CysPro: 0.516 ± 0.056
CysGln: 0.258 ± 0.037
CysArg: 0.422 ± 0.047
CysSer: 0.756 ± 0.069
CysThr: 0.598 ± 0.061
CysVal: 0.711 ± 0.073
CysTrp: 0.076 ± 0.022
CysTyr: 0.441 ± 0.057
CysXaa: 0.0 ± 0.0
Asp
AspAla: 3.715 ± 0.143
AspCys: 0.661 ± 0.08
AspAsp: 3.375 ± 0.167
AspGlu: 4.029 ± 0.161
AspPhe: 2.714 ± 0.128
AspGly: 3.652 ± 0.196
AspHis: 0.957 ± 0.071
AspIle: 5.081 ± 0.173
AspLys: 4.136 ± 0.173
AspLeu: 4.886 ± 0.159
AspMet: 1.486 ± 0.12
AspAsn: 2.588 ± 0.127
AspPro: 2.311 ± 0.104
AspGln: 1.454 ± 0.096
AspArg: 1.794 ± 0.112
AspSer: 4.092 ± 0.156
AspThr: 2.726 ± 0.149
AspVal: 3.992 ± 0.164
AspTrp: 0.554 ± 0.074
AspTyr: 2.071 ± 0.121
AspXaa: 0.0 ± 0.0
Glu
GluAla: 3.469 ± 0.142
GluCys: 0.661 ± 0.075
GluAsp: 3.009 ± 0.173
GluGlu: 4.483 ± 0.209
GluPhe: 3.161 ± 0.146
GluGly: 3.161 ± 0.138
GluHis: 1.19 ± 0.09
GluIle: 6.831 ± 0.197
GluLys: 7.605 ± 0.217
GluLeu: 6.031 ± 0.206
GluMet: 1.851 ± 0.115
GluAsn: 3.998 ± 0.171
GluPro: 2.178 ± 0.165
GluGln: 2.109 ± 0.109
GluArg: 2.449 ± 0.139
GluSer: 4.269 ± 0.158
GluThr: 3.186 ± 0.163
GluVal: 3.752 ± 0.142
GluTrp: 0.623 ± 0.066
GluTyr: 2.342 ± 0.141
GluXaa: 0.0 ± 0.0
Phe
PheAla: 2.695 ± 0.137
PheCys: 0.724 ± 0.068
PheAsp: 3.016 ± 0.146
PheGlu: 3.072 ± 0.166
PhePhe: 1.958 ± 0.12
PheGly: 3.255 ± 0.16
PheHis: 0.806 ± 0.082
PheIle: 2.695 ± 0.134
PheLys: 2.814 ± 0.135
PheLeu: 4.105 ± 0.166
PheMet: 0.888 ± 0.076
PheAsn: 1.731 ± 0.109
PhePro: 1.58 ± 0.099
PheGln: 1.322 ± 0.092
PheArg: 1.542 ± 0.104
PheSer: 3.834 ± 0.174
PheThr: 2.619 ± 0.131
PheVal: 3.438 ± 0.137
PheTrp: 0.485 ± 0.049
PheTyr: 1.391 ± 0.086
PheXaa: 0.006 ± 0.006
Gly
GlyAla: 3.948 ± 0.207
GlyCys: 0.781 ± 0.073
GlyAsp: 3.054 ± 0.141
GlyGlu: 3.551 ± 0.145
GlyPhe: 3.305 ± 0.158
GlyGly: 4.653 ± 0.207
GlyHis: 1.316 ± 0.084
GlyIle: 6.825 ± 0.217
GlyLys: 6.05 ± 0.202
GlyLeu: 5.383 ± 0.181
GlyMet: 2.166 ± 0.135
GlyAsn: 2.827 ± 0.139
GlyPro: 1.82 ± 0.129
GlyGln: 1.7 ± 0.126
GlyArg: 2.619 ± 0.112
GlySer: 4.231 ± 0.176
GlyThr: 3.885 ± 0.154
GlyVal: 4.577 ± 0.204
GlyTrp: 0.68 ± 0.059
GlyTyr: 2.298 ± 0.121
GlyXaa: 0.0 ± 0.0
His
HisAla: 1.24 ± 0.094
HisCys: 0.214 ± 0.038
HisAsp: 1.07 ± 0.08
HisGlu: 1.077 ± 0.092
HisPhe: 0.762 ± 0.068
HisGly: 1.417 ± 0.092
HisHis: 0.447 ± 0.057
HisIle: 1.498 ± 0.093
HisLys: 1.171 ± 0.102
HisLeu: 1.536 ± 0.104
HisMet: 0.504 ± 0.058
HisAsn: 0.85 ± 0.073
HisPro: 1.083 ± 0.095
HisGln: 0.535 ± 0.084
HisArg: 0.724 ± 0.059
HisSer: 1.272 ± 0.093
HisThr: 1.001 ± 0.082
HisVal: 1.265 ± 0.085
HisTrp: 0.164 ± 0.033
HisTyr: 0.667 ± 0.064
HisXaa: 0.0 ± 0.0
Ile
IleAla: 5.666 ± 0.191
IleCys: 1.007 ± 0.079
IleAsp: 5.2 ± 0.193
IleGlu: 5.773 ± 0.203
IlePhe: 3.652 ± 0.159
IleGly: 6.214 ± 0.224
IleHis: 1.536 ± 0.094
IleIle: 7.7 ± 0.244
IleLys: 7.486 ± 0.205
IleLeu: 8.046 ± 0.27
IleMet: 2.166 ± 0.112
IleAsn: 4.042 ± 0.148
IlePro: 4.092 ± 0.16
IleGln: 2.732 ± 0.13
IleArg: 3.538 ± 0.13
IleSer: 6.755 ± 0.205
IleThr: 5.364 ± 0.177
IleVal: 5.616 ± 0.179
IleTrp: 0.579 ± 0.065
IleTyr: 1.958 ± 0.107
IleXaa: 0.0 ± 0.0
Lys
LysAla: 4.621 ± 0.193
LysCys: 0.951 ± 0.08
LysAsp: 4.313 ± 0.168
LysGlu: 6.013 ± 0.181
LysPhe: 4.055 ± 0.172
LysGly: 4.413 ± 0.19
LysHis: 1.631 ± 0.121
LysIle: 9.809 ± 0.272
LysLys: 10.407 ± 0.34
LysLeu: 7.782 ± 0.231
LysMet: 2.6 ± 0.123
LysAsn: 6.604 ± 0.224
LysPro: 3.179 ± 0.188
LysGln: 2.877 ± 0.132
LysArg: 3.702 ± 0.17
LysSer: 5.824 ± 0.194
LysThr: 5.125 ± 0.214
LysVal: 4.476 ± 0.188
LysTrp: 0.68 ± 0.08
LysTyr: 3.066 ± 0.149
LysXaa: 0.0 ± 0.0
Leu
LeuAla: 5.446 ± 0.183
LeuCys: 0.73 ± 0.074
LeuAsp: 5.912 ± 0.207
LeuGlu: 6.529 ± 0.23
LeuPhe: 3.242 ± 0.171
LeuGly: 5.962 ± 0.203
LeuHis: 1.637 ± 0.111
LeuIle: 6.541 ± 0.224
LeuLys: 7.971 ± 0.257
LeuLeu: 7.316 ± 0.244
LeuMet: 1.92 ± 0.116
LeuAsn: 4.149 ± 0.155
LeuPro: 2.997 ± 0.142
LeuGln: 2.682 ± 0.129
LeuArg: 3.778 ± 0.148
LeuSer: 6.346 ± 0.178
LeuThr: 4.52 ± 0.181
LeuVal: 6.069 ± 0.209
LeuTrp: 0.63 ± 0.06
LeuTyr: 2.267 ± 0.145
LeuXaa: 0.006 ± 0.006
Met
MetAla: 1.587 ± 0.116
MetCys: 0.271 ± 0.039
MetAsp: 1.335 ± 0.096
MetGlu: 1.398 ± 0.099
MetPhe: 0.919 ± 0.073
MetGly: 1.826 ± 0.112
MetHis: 0.56 ± 0.07
MetIle: 2.311 ± 0.122
MetLys: 2.695 ± 0.125
MetLeu: 2.468 ± 0.128
MetMet: 0.743 ± 0.074
MetAsn: 1.373 ± 0.088
MetPro: 1.14 ± 0.085
MetGln: 0.73 ± 0.061
MetArg: 0.938 ± 0.076
MetSer: 1.744 ± 0.103
MetThr: 1.568 ± 0.108
MetVal: 1.536 ± 0.092
MetTrp: 0.189 ± 0.035
MetTyr: 0.73 ± 0.073
MetXaa: 0.0 ± 0.0
Asn
AsnAla: 3.047 ± 0.138
AsnCys: 0.623 ± 0.059
AsnAsp: 2.965 ± 0.145
AsnGlu: 3.677 ± 0.165
AsnPhe: 2.437 ± 0.129
AsnGly: 2.997 ± 0.138
AsnHis: 0.913 ± 0.081
AsnIle: 4.306 ± 0.175
AsnLys: 3.828 ± 0.178
AsnLeu: 4.653 ± 0.23
AsnMet: 1.435 ± 0.094
AsnAsn: 2.512 ± 0.126
AsnPro: 2.285 ± 0.125
AsnGln: 1.656 ± 0.097
AsnArg: 1.775 ± 0.095
AsnSer: 3.601 ± 0.164
AsnThr: 2.701 ± 0.134
AsnVal: 3.381 ± 0.138
AsnTrp: 0.453 ± 0.054
AsnTyr: 1.687 ± 0.103
AsnXaa: 0.006 ± 0.005
Pro
ProAla: 1.945 ± 0.121
ProCys: 0.214 ± 0.034
ProAsp: 2.38 ± 0.137
ProGlu: 3.066 ± 0.205
ProPhe: 1.53 ± 0.088
ProGly: 2.248 ± 0.139
ProHis: 0.793 ± 0.072
ProIle: 3.236 ± 0.12
ProLys: 3.223 ± 0.142
ProLeu: 2.714 ± 0.137
ProMet: 0.888 ± 0.069
ProAsn: 1.895 ± 0.105
ProPro: 1.221 ± 0.133
ProGln: 1.165 ± 0.083
ProArg: 1.203 ± 0.091
ProSer: 2.172 ± 0.11
ProThr: 2.197 ± 0.148
ProVal: 2.487 ± 0.133
ProTrp: 0.334 ± 0.048
ProTyr: 1.196 ± 0.088
ProXaa: 0.0 ± 0.0
Gln
GlnAla: 1.763 ± 0.114
GlnCys: 0.258 ± 0.044
GlnAsp: 1.398 ± 0.102
GlnGlu: 1.92 ± 0.126
GlnPhe: 1.404 ± 0.092
GlnGly: 1.53 ± 0.094
GlnHis: 0.567 ± 0.068
GlnIle: 3.116 ± 0.142
GlnLys: 3.148 ± 0.132
GlnLeu: 2.512 ± 0.121
GlnMet: 0.781 ± 0.068
GlnAsn: 1.801 ± 0.122
GlnPro: 0.756 ± 0.069
GlnGln: 1.033 ± 0.075
GlnArg: 1.133 ± 0.084
GlnSer: 1.738 ± 0.129
GlnThr: 1.631 ± 0.1
GlnVal: 1.857 ± 0.127
GlnTrp: 0.264 ± 0.045
GlnTyr: 0.951 ± 0.069
GlnXaa: 0.006 ± 0.006
Arg
ArgAla: 2.021 ± 0.104
ArgCys: 0.422 ± 0.052
ArgAsp: 2.115 ± 0.117
ArgGlu: 2.304 ± 0.134
ArgPhe: 1.82 ± 0.105
ArgGly: 2.374 ± 0.131
ArgHis: 0.567 ± 0.059
ArgIle: 3.809 ± 0.162
ArgLys: 4.243 ± 0.184
ArgLeu: 3.349 ± 0.151
ArgMet: 1.095 ± 0.074
ArgAsn: 2.078 ± 0.107
ArgPro: 1.24 ± 0.094
ArgGln: 0.982 ± 0.087
ArgArg: 1.801 ± 0.132
ArgSer: 2.323 ± 0.121
ArgThr: 2.008 ± 0.118
ArgVal: 2.443 ± 0.14
ArgTrp: 0.384 ± 0.051
ArgTyr: 1.303 ± 0.092
ArgXaa: 0.0 ± 0.0
Ser
SerAla: 3.796 ± 0.149
SerCys: 0.762 ± 0.072
SerAsp: 4.174 ± 0.171
SerGlu: 4.76 ± 0.187
SerPhe: 2.915 ± 0.141
SerGly: 5.005 ± 0.189
SerHis: 1.328 ± 0.091
SerIle: 5.893 ± 0.19
SerLys: 6.774 ± 0.205
SerLeu: 6.069 ± 0.232
SerMet: 1.794 ± 0.106
SerAsn: 3.387 ± 0.189
SerPro: 2.254 ± 0.109
SerGln: 2.222 ± 0.12
SerArg: 2.537 ± 0.114
SerSer: 4.854 ± 0.185
SerThr: 3.538 ± 0.161
SerVal: 4.539 ± 0.165
SerTrp: 0.523 ± 0.056
SerTyr: 2.052 ± 0.131
SerXaa: 0.019 ± 0.011
Thr
ThrAla: 3.305 ± 0.161
ThrCys: 0.705 ± 0.06
ThrAsp: 2.795 ± 0.136
ThrGlu: 3.406 ± 0.13
ThrPhe: 2.361 ± 0.13
ThrGly: 4.155 ± 0.178
ThrHis: 1.045 ± 0.083
ThrIle: 4.892 ± 0.18
ThrLys: 5.257 ± 0.212
ThrLeu: 4.508 ± 0.168
ThrMet: 1.36 ± 0.098
ThrAsn: 2.814 ± 0.128
ThrPro: 2.046 ± 0.152
ThrGln: 1.587 ± 0.105
ThrArg: 2.153 ± 0.123
ThrSer: 3.652 ± 0.158
ThrThr: 3.135 ± 0.142
ThrVal: 3.809 ± 0.183
ThrTrp: 0.416 ± 0.053
ThrTyr: 1.517 ± 0.093
ThrXaa: 0.0 ± 0.0
Val
ValAla: 3.966 ± 0.161
ValCys: 0.705 ± 0.07
ValAsp: 3.966 ± 0.156
ValGlu: 4.275 ± 0.174
ValPhe: 2.833 ± 0.155
ValGly: 4.565 ± 0.187
ValHis: 1.108 ± 0.089
ValIle: 5.673 ± 0.202
ValLys: 5.471 ± 0.171
ValLeu: 5.471 ± 0.237
ValMet: 1.681 ± 0.094
ValAsn: 3.161 ± 0.125
ValPro: 2.336 ± 0.117
ValGln: 1.687 ± 0.119
ValArg: 2.424 ± 0.128
ValSer: 4.967 ± 0.188
ValThr: 3.796 ± 0.167
ValVal: 4.376 ± 0.173
ValTrp: 0.497 ± 0.052
ValTyr: 2.027 ± 0.113
ValXaa: 0.019 ± 0.011
Trp
TrpAla: 0.535 ± 0.058
TrpCys: 0.082 ± 0.021
TrpAsp: 0.434 ± 0.045
TrpGlu: 0.497 ± 0.052
TrpPhe: 0.491 ± 0.047
TrpGly: 0.523 ± 0.062
TrpHis: 0.176 ± 0.032
TrpIle: 0.781 ± 0.079
TrpLys: 0.756 ± 0.078
TrpLeu: 0.749 ± 0.066
TrpMet: 0.246 ± 0.04
TrpAsn: 0.535 ± 0.052
TrpPro: 0.239 ± 0.04
TrpGln: 0.264 ± 0.039
TrpArg: 0.422 ± 0.052
TrpSer: 0.466 ± 0.062
TrpThr: 0.39 ± 0.049
TrpVal: 0.51 ± 0.062
TrpTrp: 0.164 ± 0.036
TrpTyr: 0.22 ± 0.037
TrpXaa: 0.006 ± 0.006
Tyr
TyrAla: 1.864 ± 0.112
TyrCys: 0.422 ± 0.056
TyrAsp: 2.159 ± 0.131
TyrGlu: 1.908 ± 0.112
TyrPhe: 1.48 ± 0.099
TyrGly: 2.21 ± 0.104
TyrHis: 0.617 ± 0.057
TyrIle: 2.027 ± 0.123
TyrLys: 2.304 ± 0.121
TyrLeu: 2.928 ± 0.166
TyrMet: 0.68 ± 0.067
TyrAsn: 1.536 ± 0.085
TyrPro: 1.297 ± 0.093
TyrGln: 0.925 ± 0.071
TyrArg: 1.328 ± 0.097
TyrSer: 2.267 ± 0.129
TyrThr: 1.675 ± 0.101
TyrVal: 2.015 ± 0.102
TyrTrp: 0.353 ± 0.053
TyrTyr: 1.083 ± 0.078
TyrXaa: 0.0 ± 0.0
Xaa
XaaAla: 0.006 ± 0.005
XaaCys: 0.006 ± 0.006
XaaAsp: 0.006 ± 0.006
XaaGlu: 0.006 ± 0.006
XaaPhe: 0.0 ± 0.0
XaaGly: 0.006 ± 0.006
XaaHis: 0.0 ± 0.0
XaaIle: 0.006 ± 0.006
XaaLys: 0.0 ± 0.0
XaaLeu: 0.0 ± 0.0
XaaMet: 0.013 ± 0.009
XaaAsn: 0.0 ± 0.001
XaaPro: 0.006 ± 0.006
XaaGln: 0.0 ± 0.001
XaaArg: 0.0 ± 0.0
XaaSer: 0.0 ± 0.0
XaaThr: 0.019 ± 0.01
XaaVal: 0.0 ± 0.0
XaaTrp: 0.0 ± 0.0
XaaTyr: 0.0 ± 0.0
XaaXaa: 0.353 ± 0.096
Statistics based on 697 proteins (158835 amino acids)

Note: The error has been estimated by bootstrapping (x100) at the protein level.
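Bootstrapping at the protein level can be sketched as follows. This is a minimal illustration under stated assumptions, not the Proteome-pI code: it assumes that whole proteins are resampled with replacement, that the frequency is recomputed for each of the 100 replicates, and that the reported error is the sample standard deviation of those replicate frequencies. The names pair_counts and bootstrap_error are hypothetical.

```python
import random
import statistics
from collections import Counter


def pair_counts(seq):
    """Count overlapping adjacent residue pairs in one protein."""
    return Counter(a + b for a, b in zip(seq, seq[1:]))


def bootstrap_error(proteins, dipeptide, replicates=100, seed=0):
    """Estimate the error (in per mille) of one dipeptide frequency.

    Assumption: the error is the standard deviation of the frequency
    across replicates, each built by resampling proteins with replacement.
    """
    rng = random.Random(seed)
    per_protein = [pair_counts(s) for s in proteins]
    estimates = []
    for _ in range(replicates):
        # Resample whole proteins, keeping the proteome size fixed.
        sample = rng.choices(per_protein, k=len(per_protein))
        total = sum(sum(c.values()) for c in sample)
        hits = sum(c[dipeptide] for c in sample)
        estimates.append(1000.0 * hits / total if total else 0.0)
    return statistics.stdev(estimates)


if __name__ == "__main__":
    toy_proteome = ["MKKLLA", "MAKIKK", "MSTKKV", "MLLLA"]
    print(bootstrap_error(toy_proteome, "KK"))
```

Resampling proteins rather than individual dipeptides keeps the pairs from one protein together, so the estimate reflects variation between proteins.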

The above dipeptide statistics (among other statistics for this proteome) can be downloaded from this CSV file
See this proteome in: uniprot_link
Proteome-pI is available under Creative Commons Attribution-NoDerivs license, for more details see here

Reference: Kozlowski LP. Proteome-pI 2.0: Proteome Isoelectric Point Database Update. Nucleic Acids Res. 2021, doi: 10.1093/nar/gkab944
Contact: Lukasz P. Kozlowski