Amino acid dipeptide frequency for Acidobacteria bacterium SCGC AG-212-P17

Apart from single amino acid frequencies, one can also calculate the so-called amino acid dipeptide frequency, i.e. how often each ordered pair of adjacent amino acids occurs.
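As an illustration of the idea, here is a minimal sketch of how dipeptide frequencies could be counted from a list of protein sequences. It is not the Proteome-pI implementation: the function name `dipeptide_per_mille`, the use of one-letter codes, the mapping of non-standard residues to `X`, and the normalisation by the total number of overlapping pairs are all assumptions made for this example.

```python
from collections import Counter

# The 20 standard amino acids in one-letter code (the table uses three-letter codes).
STANDARD = set("ACDEFGHIKLMNPQRSTVWY")

def dipeptide_per_mille(proteins):
    """Return {dipeptide: frequency in per mille} over all overlapping pairs.

    Hypothetical helper: assumes frequencies are normalised by the total
    number of dipeptides counted across all proteins.
    """
    counts = Counter()
    for seq in proteins:
        # Map anything outside the 20 standard residues to X (Xaa).
        seq = "".join(aa if aa in STANDARD else "X" for aa in seq.upper())
        # Overlapping windows of length 2; pairs never span two proteins.
        counts.update(seq[i:i + 2] for i in range(len(seq) - 1))
    total = sum(counts.values())
    if total == 0:
        return {}
    return {pair: 1000.0 * n / total for pair, n in counts.items()}

# Toy input: two short made-up sequences, six dipeptides in total.
freqs = dipeptide_per_mille(["MAAL", "AALK"])
```

In this toy input the pair `AA` occurs twice among six dipeptides, so its frequency is one third of the total.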

There are 400 possibilities (441 if we consider Xaa as an additional 21st amino acid representing all non-standard or unknown amino acids). Thus, if dipeptides occurred randomly in proteins, each amino acid dipeptide would be present with a frequency of 0.25% (2.5‰). As this is not the case in nature, for better visibility all dipeptides that are more frequent than expected are marked in red, and those that are underrepresented are marked in blue in the table.
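The comparison against the uniform expectation can be sketched as follows. The helper names are hypothetical, and the strict greater-than / less-than rule is an assumption: the page does not state the exact threshold used for the red and blue marking. The three example values are taken from the table below.

```python
def expected_per_mille(alphabet_size=20):
    """Uniform expectation in per mille: 1000 / (number of possible dipeptides)."""
    return 1000.0 / alphabet_size ** 2

def classify(observed_per_mille, alphabet_size=20):
    """Label a dipeptide relative to the uniform expectation (assumed rule)."""
    expected = expected_per_mille(alphabet_size)
    if observed_per_mille > expected:
        return "over"    # would be marked red
    if observed_per_mille < expected:
        return "under"   # would be marked blue
    return "as expected"

# Values in per mille, copied from the table.
examples = {"AlaAla": 12.327, "CysCys": 0.192, "AspAsp": 2.578}
labels = {name: classify(value) for name, value in examples.items()}
```

With 20 amino acids the expectation is 2.5‰; with Xaa included (21 symbols) it drops to roughly 2.27‰.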

All values are presented in per mille (‰) and therefore need to be multiplied by 10⁻³ to obtain fractions.
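A small sketch of the unit conversion, using the AlaAla value from the table; the function names are hypothetical and exist only for this example.

```python
def per_mille_to_fraction(value):
    """Convert a per mille value to a plain fraction (multiply by 10^-3)."""
    return value * 1e-3

def per_mille_to_percent(value):
    """Convert a per mille value to percent (divide by 10)."""
    return value / 10.0

ala_ala = 12.327                            # AlaAla frequency in per mille, from the table
fraction = per_mille_to_fraction(ala_ala)   # about 0.012327
percent = per_mille_to_percent(ala_ala)     # about 1.2327
```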

For more information, see the sequence space article on Wikipedia.

Ala Cys Asp Glu Phe Gly His Ile Lys Leu Met Asn Pro Gln Arg Ser Thr Val Trp Tyr Xaa
Ala
12.327AlaAla: 12.327 ± 0.438
1.041AlaCys: 1.041 ± 0.086
5.13AlaAsp: 5.13 ± 0.203
5.945AlaGlu: 5.945 ± 0.282
3.983AlaPhe: 3.983 ± 0.164
8.145AlaGly: 8.145 ± 0.308
2.154AlaHis: 2.154 ± 0.112
5.11AlaIle: 5.11 ± 0.189
4.507AlaLys: 4.507 ± 0.197
10.2AlaLeu: 10.2 ± 0.362
2.552AlaMet: 2.552 ± 0.135
3.592AlaAsn: 3.592 ± 0.173
4.812AlaPro: 4.812 ± 0.253
4.348AlaGln: 4.348 ± 0.179
6.137AlaArg: 6.137 ± 0.245
6.21AlaSer: 6.21 ± 0.224
5.415AlaThr: 5.415 ± 0.238
7.277AlaVal: 7.277 ± 0.245
1.233AlaTrp: 1.233 ± 0.099
2.339AlaTyr: 2.339 ± 0.145
0.0AlaXaa: 0.0 ± 0.0
Cys
0.868CysAla: 0.868 ± 0.081
0.192CysCys: 0.192 ± 0.041
0.391CysAsp: 0.391 ± 0.053
0.471CysGlu: 0.471 ± 0.057
0.477CysPhe: 0.477 ± 0.062
0.868CysGly: 0.868 ± 0.092
0.365CysHis: 0.365 ± 0.076
0.378CysIle: 0.378 ± 0.059
0.278CysLys: 0.278 ± 0.04
1.041CysLeu: 1.041 ± 0.093
0.186CysMet: 0.186 ± 0.034
0.358CysAsn: 0.358 ± 0.055
0.497CysPro: 0.497 ± 0.063
0.192CysGln: 0.192 ± 0.033
0.656CysArg: 0.656 ± 0.071
0.703CysSer: 0.703 ± 0.073
0.61CysThr: 0.61 ± 0.076
0.703CysVal: 0.703 ± 0.068
0.113CysTrp: 0.113 ± 0.027
0.219CysTyr: 0.219 ± 0.039
0.0CysXaa: 0.0 ± 0.0
Asp
4.865AspAla: 4.865 ± 0.191
0.457AspCys: 0.457 ± 0.049
2.578AspAsp: 2.578 ± 0.16
3.015AspGlu: 3.015 ± 0.144
2.3AspPhe: 2.3 ± 0.121
3.917AspGly: 3.917 ± 0.191
1.001AspHis: 1.001 ± 0.074
2.412AspIle: 2.412 ± 0.124
2.068AspLys: 2.068 ± 0.149
5.236AspLeu: 5.236 ± 0.235
1.001AspMet: 1.001 ± 0.078
1.73AspAsn: 1.73 ± 0.12
3.479AspPro: 3.479 ± 0.175
2.167AspGln: 2.167 ± 0.13
2.883AspArg: 2.883 ± 0.155
2.757AspSer: 2.757 ± 0.14
2.101AspThr: 2.101 ± 0.134
3.857AspVal: 3.857 ± 0.145
0.775AspTrp: 0.775 ± 0.074
1.604AspTyr: 1.604 ± 0.115
0.0AspXaa: 0.0 ± 0.0
Glu
5.249GluAla: 5.249 ± 0.238
0.378GluCys: 0.378 ± 0.06
2.571GluAsp: 2.571 ± 0.159
3.632GluGlu: 3.632 ± 0.236
2.333GluPhe: 2.333 ± 0.131
3.307GluGly: 3.307 ± 0.167
1.339GluHis: 1.339 ± 0.092
3.559GluIle: 3.559 ± 0.188
3.466GluLys: 3.466 ± 0.19
5.945GluLeu: 5.945 ± 0.325
1.577GluMet: 1.577 ± 0.105
2.068GluAsn: 2.068 ± 0.121
2.313GluPro: 2.313 ± 0.127
2.704GluGln: 2.704 ± 0.151
4.089GluArg: 4.089 ± 0.21
3.36GluSer: 3.36 ± 0.142
3.194GluThr: 3.194 ± 0.14
3.387GluVal: 3.387 ± 0.182
0.59GluTrp: 0.59 ± 0.067
1.584GluTyr: 1.584 ± 0.114
0.0GluXaa: 0.0 ± 0.0
Phe
3.963PheAla: 3.963 ± 0.193
0.517PheCys: 0.517 ± 0.062
2.532PheAsp: 2.532 ± 0.124
2.313PheGlu: 2.313 ± 0.131
1.803PhePhe: 1.803 ± 0.115
3.135PheGly: 3.135 ± 0.194
0.974PheHis: 0.974 ± 0.076
1.935PheIle: 1.935 ± 0.128
1.418PheLys: 1.418 ± 0.112
3.764PheLeu: 3.764 ± 0.166
0.875PheMet: 0.875 ± 0.077
1.518PheAsn: 1.518 ± 0.126
2.048PhePro: 2.048 ± 0.121
1.597PheGln: 1.597 ± 0.102
2.465PheArg: 2.465 ± 0.133
3.002PheSer: 3.002 ± 0.137
2.817PheThr: 2.817 ± 0.173
2.711PheVal: 2.711 ± 0.125
0.63PheTrp: 0.63 ± 0.073
1.213PheTyr: 1.213 ± 0.101
0.0PheXaa: 0.0 ± 0.0
Gly
6.8GlyAla: 6.8 ± 0.252
0.789GlyCys: 0.789 ± 0.078
3.307GlyAsp: 3.307 ± 0.152
3.778GlyGlu: 3.778 ± 0.185
3.433GlyPhe: 3.433 ± 0.175
5.799GlyGly: 5.799 ± 0.253
1.796GlyHis: 1.796 ± 0.119
4.434GlyIle: 4.434 ± 0.172
4.182GlyLys: 4.182 ± 0.205
6.946GlyLeu: 6.946 ± 0.211
1.776GlyMet: 1.776 ± 0.119
2.923GlyAsn: 2.923 ± 0.158
3.234GlyPro: 3.234 ± 0.151
2.724GlyGln: 2.724 ± 0.137
4.082GlyArg: 4.082 ± 0.149
5.448GlySer: 5.448 ± 0.226
4.719GlyThr: 4.719 ± 0.238
5.898GlyVal: 5.898 ± 0.213
1.18GlyTrp: 1.18 ± 0.099
2.134GlyTyr: 2.134 ± 0.128
0.0GlyXaa: 0.0 ± 0.0
His
2.207HisAla: 2.207 ± 0.137
0.272HisCys: 0.272 ± 0.048
1.113HisAsp: 1.113 ± 0.08
1.246HisGlu: 1.246 ± 0.103
0.941HisPhe: 0.941 ± 0.081
1.982HisGly: 1.982 ± 0.113
0.649HisHis: 0.649 ± 0.069
1.14HisIle: 1.14 ± 0.084
0.57HisLys: 0.57 ± 0.061
1.948HisLeu: 1.948 ± 0.1
0.497HisMet: 0.497 ± 0.053
0.881HisAsn: 0.881 ± 0.073
1.425HisPro: 1.425 ± 0.101
0.842HisGln: 0.842 ± 0.077
1.312HisArg: 1.312 ± 0.1
1.379HisSer: 1.379 ± 0.083
1.246HisThr: 1.246 ± 0.082
1.504HisVal: 1.504 ± 0.1
0.371HisTrp: 0.371 ± 0.05
0.577HisTyr: 0.577 ± 0.06
0.0HisXaa: 0.0 ± 0.0
Ile
5.905IleAla: 5.905 ± 0.206
0.53IleCys: 0.53 ± 0.057
2.949IleAsp: 2.949 ± 0.143
3.373IleGlu: 3.373 ± 0.163
2.412IlePhe: 2.412 ± 0.143
4.242IleGly: 4.242 ± 0.198
1.292IleHis: 1.292 ± 0.102
2.419IleIle: 2.419 ± 0.139
2.041IleLys: 2.041 ± 0.134
4.361IleLeu: 4.361 ± 0.177
0.895IleMet: 0.895 ± 0.088
1.849IleAsn: 1.849 ± 0.12
2.459IlePro: 2.459 ± 0.138
1.948IleGln: 1.948 ± 0.106
2.75IleArg: 2.75 ± 0.146
3.758IleSer: 3.758 ± 0.172
3.228IleThr: 3.228 ± 0.172
3.817IleVal: 3.817 ± 0.182
0.656IleTrp: 0.656 ± 0.076
1.538IleTyr: 1.538 ± 0.112
0.0IleXaa: 0.0 ± 0.0
Lys
4.063LysAla: 4.063 ± 0.191
0.365LysCys: 0.365 ± 0.048
2.273LysAsp: 2.273 ± 0.147
2.525LysGlu: 2.525 ± 0.157
1.504LysPhe: 1.504 ± 0.098
3.055LysGly: 3.055 ± 0.156
0.961LysHis: 0.961 ± 0.086
2.876LysIle: 2.876 ± 0.179
2.598LysLys: 2.598 ± 0.181
4.454LysLeu: 4.454 ± 0.192
1.12LysMet: 1.12 ± 0.093
1.617LysAsn: 1.617 ± 0.097
2.611LysPro: 2.611 ± 0.155
2.134LysGln: 2.134 ± 0.124
2.565LysArg: 2.565 ± 0.151
2.929LysSer: 2.929 ± 0.159
2.89LysThr: 2.89 ± 0.162
2.863LysVal: 2.863 ± 0.134
0.557LysTrp: 0.557 ± 0.057
1.1LysTyr: 1.1 ± 0.102
0.0LysXaa: 0.0 ± 0.0
Leu
10.803LeuAla: 10.803 ± 0.279
0.987LeuCys: 0.987 ± 0.082
4.984LeuAsp: 4.984 ± 0.206
5.56LeuGlu: 5.56 ± 0.292
3.937LeuPhe: 3.937 ± 0.149
7.456LeuGly: 7.456 ± 0.239
2.154LeuHis: 2.154 ± 0.116
4.785LeuIle: 4.785 ± 0.169
4.328LeuLys: 4.328 ± 0.195
10.2LeuLeu: 10.2 ± 0.328
1.995LeuMet: 1.995 ± 0.107
3.214LeuAsn: 3.214 ± 0.145
5.136LeuPro: 5.136 ± 0.201
4.01LeuGln: 4.01 ± 0.184
6.376LeuArg: 6.376 ± 0.224
6.468LeuSer: 6.468 ± 0.226
5.381LeuThr: 5.381 ± 0.207
6.999LeuVal: 6.999 ± 0.233
1.312LeuTrp: 1.312 ± 0.104
2.373LeuTyr: 2.373 ± 0.152
0.0LeuXaa: 0.0 ± 0.0
Met
2.717MetAla: 2.717 ± 0.131
0.159MetCys: 0.159 ± 0.032
1.087MetAsp: 1.087 ± 0.078
1.266MetGlu: 1.266 ± 0.109
0.749MetPhe: 0.749 ± 0.069
1.359MetGly: 1.359 ± 0.099
0.543MetHis: 0.543 ± 0.065
1.06MetIle: 1.06 ± 0.082
1.292MetLys: 1.292 ± 0.099
2.227MetLeu: 2.227 ± 0.146
0.51MetMet: 0.51 ± 0.067
0.934MetAsn: 0.934 ± 0.082
1.352MetPro: 1.352 ± 0.105
0.815MetGln: 0.815 ± 0.076
1.511MetArg: 1.511 ± 0.105
1.465MetSer: 1.465 ± 0.094
1.352MetThr: 1.352 ± 0.088
1.63MetVal: 1.63 ± 0.117
0.258MetTrp: 0.258 ± 0.041
0.391MetTyr: 0.391 ± 0.054
0.0MetXaa: 0.0 ± 0.0
Asn
3.552AsnAla: 3.552 ± 0.186
0.318AsnCys: 0.318 ± 0.056
1.61AsnAsp: 1.61 ± 0.117
1.73AsnGlu: 1.73 ± 0.12
1.564AsnPhe: 1.564 ± 0.109
3.214AsnGly: 3.214 ± 0.23
0.888AsnHis: 0.888 ± 0.081
1.856AsnIle: 1.856 ± 0.117
1.345AsnLys: 1.345 ± 0.116
3.526AsnLeu: 3.526 ± 0.148
0.749AsnMet: 0.749 ± 0.066
1.027AsnAsn: 1.027 ± 0.1
2.552AsnPro: 2.552 ± 0.198
1.485AsnGln: 1.485 ± 0.102
1.962AsnArg: 1.962 ± 0.123
2.2AsnSer: 2.2 ± 0.121
1.935AsnThr: 1.935 ± 0.134
2.764AsnVal: 2.764 ± 0.157
0.477AsnTrp: 0.477 ± 0.051
1.094AsnTyr: 1.094 ± 0.093
0.0AsnXaa: 0.0 ± 0.0
Pro
6.018ProAla: 6.018 ± 0.305
0.338ProCys: 0.338 ± 0.055
2.989ProAsp: 2.989 ± 0.148
3.91ProGlu: 3.91 ± 0.167
2.108ProPhe: 2.108 ± 0.119
4.427ProGly: 4.427 ± 0.219
0.915ProHis: 0.915 ± 0.077
2.412ProIle: 2.412 ± 0.135
2.141ProLys: 2.141 ± 0.14
4.772ProLeu: 4.772 ± 0.179
1.054ProMet: 1.054 ± 0.082
1.942ProAsn: 1.942 ± 0.137
2.883ProPro: 2.883 ± 0.2
2.366ProGln: 2.366 ± 0.121
2.286ProArg: 2.286 ± 0.119
3.387ProSer: 3.387 ± 0.189
2.943ProThr: 2.943 ± 0.161
4.162ProVal: 4.162 ± 0.17
0.656ProTrp: 0.656 ± 0.058
1.166ProTyr: 1.166 ± 0.084
0.0ProXaa: 0.0 ± 0.0
Gln
4.122GlnAla: 4.122 ± 0.225
0.477GlnCys: 0.477 ± 0.059
1.71GlnAsp: 1.71 ± 0.118
2.426GlnGlu: 2.426 ± 0.158
1.458GlnPhe: 1.458 ± 0.109
2.571GlnGly: 2.571 ± 0.146
0.948GlnHis: 0.948 ± 0.071
2.306GlnIle: 2.306 ± 0.129
1.935GlnLys: 1.935 ± 0.137
4.122GlnLeu: 4.122 ± 0.202
1.16GlnMet: 1.16 ± 0.084
1.538GlnAsn: 1.538 ± 0.094
2.359GlnPro: 2.359 ± 0.127
2.439GlnGln: 2.439 ± 0.175
2.565GlnArg: 2.565 ± 0.139
2.499GlnSer: 2.499 ± 0.151
2.346GlnThr: 2.346 ± 0.121
2.75GlnVal: 2.75 ± 0.143
0.484GlnTrp: 0.484 ± 0.061
1.027GlnTyr: 1.027 ± 0.073
0.0GlnXaa: 0.0 ± 0.0
Arg
5.242ArgAla: 5.242 ± 0.238
0.524ArgCys: 0.524 ± 0.063
2.996ArgAsp: 2.996 ± 0.16
3.685ArgGlu: 3.685 ± 0.207
2.393ArgPhe: 2.393 ± 0.134
3.473ArgGly: 3.473 ± 0.16
1.279ArgHis: 1.279 ± 0.082
3.32ArgIle: 3.32 ± 0.156
3.175ArgLys: 3.175 ± 0.157
6.15ArgLeu: 6.15 ± 0.233
1.823ArgMet: 1.823 ± 0.126
2.048ArgAsn: 2.048 ± 0.108
2.611ArgPro: 2.611 ± 0.131
2.578ArgGln: 2.578 ± 0.134
4.235ArgArg: 4.235 ± 0.202
3.579ArgSer: 3.579 ± 0.187
3.175ArgThr: 3.175 ± 0.156
4.228ArgVal: 4.228 ± 0.154
1.034ArgTrp: 1.034 ± 0.077
1.723ArgTyr: 1.723 ± 0.105
0.0ArgXaa: 0.0 ± 0.0
Ser
6.893SerAla: 6.893 ± 0.257
0.643SerCys: 0.643 ± 0.083
3.234SerAsp: 3.234 ± 0.164
3.029SerGlu: 3.029 ± 0.156
2.691SerPhe: 2.691 ± 0.145
6.077SerGly: 6.077 ± 0.291
1.306SerHis: 1.306 ± 0.119
3.228SerIle: 3.228 ± 0.129
2.784SerLys: 2.784 ± 0.16
6.243SerLeu: 6.243 ± 0.215
1.392SerMet: 1.392 ± 0.099
2.373SerAsn: 2.373 ± 0.169
3.407SerPro: 3.407 ± 0.153
2.492SerGln: 2.492 ± 0.131
3.526SerArg: 3.526 ± 0.153
4.633SerSer: 4.633 ± 0.224
4.136SerThr: 4.136 ± 0.286
4.626SerVal: 4.626 ± 0.168
0.729SerTrp: 0.729 ± 0.082
1.683SerTyr: 1.683 ± 0.12
0.0SerXaa: 0.0 ± 0.0
Thr
6.031ThrAla: 6.031 ± 0.257
0.517ThrCys: 0.517 ± 0.054
2.697ThrAsp: 2.697 ± 0.151
2.731ThrGlu: 2.731 ± 0.162
2.326ThrPhe: 2.326 ± 0.148
4.719ThrGly: 4.719 ± 0.185
1.2ThrHis: 1.2 ± 0.097
3.201ThrIle: 3.201 ± 0.188
2.194ThrLys: 2.194 ± 0.127
5.56ThrLeu: 5.56 ± 0.218
1.074ThrMet: 1.074 ± 0.087
2.048ThrAsn: 2.048 ± 0.141
3.479ThrPro: 3.479 ± 0.17
2.101ThrGln: 2.101 ± 0.112
3.108ThrArg: 3.108 ± 0.126
3.89ThrSer: 3.89 ± 0.266
3.287ThrThr: 3.287 ± 0.209
4.851ThrVal: 4.851 ± 0.212
0.716ThrTrp: 0.716 ± 0.085
1.511ThrTyr: 1.511 ± 0.15
0.0ThrXaa: 0.0 ± 0.0
Val
7.284ValAla: 7.284 ± 0.218
0.649ValCys: 0.649 ± 0.081
3.738ValAsp: 3.738 ± 0.19
4.122ValGlu: 4.122 ± 0.194
3.009ValPhe: 3.009 ± 0.151
4.719ValGly: 4.719 ± 0.208
1.339ValHis: 1.339 ± 0.102
4.116ValIle: 4.116 ± 0.171
2.896ValLys: 2.896 ± 0.186
7.469ValLeu: 7.469 ± 0.276
1.624ValMet: 1.624 ± 0.108
2.704ValAsn: 2.704 ± 0.169
4.182ValPro: 4.182 ± 0.187
2.591ValGln: 2.591 ± 0.15
4.348ValArg: 4.348 ± 0.183
4.792ValSer: 4.792 ± 0.18
4.467ValThr: 4.467 ± 0.213
6.024ValVal: 6.024 ± 0.233
0.981ValTrp: 0.981 ± 0.085
1.637ValTyr: 1.637 ± 0.111
0.0ValXaa: 0.0 ± 0.0
Trp
1.087TrpAla: 1.087 ± 0.09
0.093TrpCys: 0.093 ± 0.027
0.722TrpAsp: 0.722 ± 0.078
0.603TrpGlu: 0.603 ± 0.07
0.616TrpPhe: 0.616 ± 0.068
0.848TrpGly: 0.848 ± 0.086
0.311TrpHis: 0.311 ± 0.039
0.729TrpIle: 0.729 ± 0.072
0.888TrpLys: 0.888 ± 0.083
1.657TrpLeu: 1.657 ± 0.131
0.338TrpMet: 0.338 ± 0.047
0.55TrpAsn: 0.55 ± 0.075
0.53TrpPro: 0.53 ± 0.06
0.809TrpGln: 0.809 ± 0.088
0.782TrpArg: 0.782 ± 0.084
0.835TrpSer: 0.835 ± 0.085
0.696TrpThr: 0.696 ± 0.068
0.862TrpVal: 0.862 ± 0.08
0.239TrpTrp: 0.239 ± 0.039
0.305TrpTyr: 0.305 ± 0.044
0.0TrpXaa: 0.0 ± 0.0
Tyr
2.459TyrAla: 2.459 ± 0.136
0.318TyrCys: 0.318 ± 0.051
1.591TyrAsp: 1.591 ± 0.142
1.286TyrGlu: 1.286 ± 0.104
1.193TyrPhe: 1.193 ± 0.098
2.068TyrGly: 2.068 ± 0.129
0.689TyrHis: 0.689 ± 0.064
1.147TyrIle: 1.147 ± 0.088
0.954TyrLys: 0.954 ± 0.081
2.638TyrLeu: 2.638 ± 0.145
0.477TyrMet: 0.477 ± 0.056
0.934TyrAsn: 0.934 ± 0.099
1.504TyrPro: 1.504 ± 0.089
0.901TyrGln: 0.901 ± 0.079
1.637TyrArg: 1.637 ± 0.109
1.829TyrSer: 1.829 ± 0.135
1.306TyrThr: 1.306 ± 0.098
1.776TyrVal: 1.776 ± 0.112
0.51TyrTrp: 0.51 ± 0.056
0.802TyrTyr: 0.802 ± 0.082
0.0TyrXaa: 0.0 ± 0.0
Xaa
0.0XaaAla: 0.0 ± 0.0
0.0XaaCys: 0.0 ± 0.0
0.0XaaAsp: 0.0 ± 0.0
0.0XaaGlu: 0.0 ± 0.0
0.0XaaPhe: 0.0 ± 0.0
0.0XaaGly: 0.0 ± 0.0
0.0XaaHis: 0.0 ± 0.0
0.0XaaIle: 0.0 ± 0.0
0.0XaaLys: 0.0 ± 0.0
0.0XaaLeu: 0.0 ± 0.0
0.0XaaMet: 0.0 ± 0.0
0.0XaaAsn: 0.0 ± 0.0
0.0XaaPro: 0.0 ± 0.0
0.0XaaGln: 0.0 ± 0.0
0.0XaaArg: 0.0 ± 0.0
0.0XaaSer: 0.0 ± 0.0
0.0XaaThr: 0.0 ± 0.0
0.0XaaVal: 0.0 ± 0.0
0.0XaaTrp: 0.0 ± 0.0
0.0XaaTyr: 0.0 ± 0.0
0.0XaaXaa: 0.0 ± 0.0
Statistics based on 481 proteins (150889 amino acids)

Note: The error has been estimated by bootstrapping (×100) at the protein level
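One possible reading of this note is sketched below: whole proteins are resampled with replacement, the frequency is recomputed for each replicate, and the spread of the replicates is reported. This is an assumption-laden sketch rather than the database's actual procedure; in particular, the use of the standard deviation as the error, the fixed seed, and the helper names `pair_frequency` and `bootstrap_error` are illustrative choices.

```python
import random
import statistics
from collections import Counter

def pair_frequency(proteins, pair):
    """Frequency of one dipeptide in per mille over all overlapping pairs."""
    counts = Counter()
    for seq in proteins:
        counts.update(seq[i:i + 2] for i in range(len(seq) - 1))
    total = sum(counts.values())
    return 1000.0 * counts[pair] / total if total else 0.0

def bootstrap_error(proteins, pair, replicates=100, seed=0):
    """Standard deviation of the frequency over protein-level bootstrap replicates."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    estimates = []
    for _ in range(replicates):
        # Resample whole proteins with replacement, keeping the proteome size.
        sample = rng.choices(proteins, k=len(proteins))
        estimates.append(pair_frequency(sample, pair))
    return statistics.stdev(estimates)

# Toy proteome of made-up sequences (one-letter codes).
toy = ["MAALAAK", "MKKLLAA", "MAAGGAA", "MLLKKAL"]
err = bootstrap_error(toy, "AA")
```

Resampling at the protein level, rather than at the level of individual dipeptides, keeps the pairs from one protein together, so the error reflects variation between proteins.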

The dipeptide statistics above (among other statistics for this proteome) can be downloaded from this CSV file
See this proteome in: uniprot_link
Proteome-pI is available under the Creative Commons Attribution-NoDerivs license; for more details see here

Reference: Kozlowski LP. Proteome-pI 2.0: Proteome Isoelectric Point Database Update. Nucleic Acids Res. 2021, doi: 10.1093/nar/gkab944
Contact: Lukasz P. Kozlowski