Amino acid dipeptide frequency for Klebsiella phage Marfa

Apart from single amino acid frequencies, one can also calculate the so-called amino acid dipeptide frequency, i.e. how often each ordered pair of adjacent amino acids occurs.
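As a rough illustration of what a dipeptide frequency is, the following minimal sketch counts overlapping residue pairs in a set of protein sequences. The function name `dipeptide_frequencies`, the use of one-letter codes, and the normalization by the total number of counted pairs are assumptions made for this example; the database may normalize differently.

```python
from collections import Counter


def dipeptide_frequencies(proteins):
    """Return {dipeptide: fraction} over all overlapping residue pairs.

    Assumption: pairs are counted within each protein only (never across
    protein boundaries) and normalized by the total number of pairs.
    """
    counts = Counter()
    for seq in proteins:
        # Slide a window of length 2 along the sequence.
        for i in range(len(seq) - 1):
            counts[seq[i:i + 2]] += 1
    total = sum(counts.values())
    if total == 0:
        return {}
    return {pair: n / total for pair, n in counts.items()}


# Toy input, not real phage proteins.
freqs = dipeptide_frequencies(["MKKA", "AKK"])
print(freqs["KK"])
```

The pair is ordered, so "AK" and "KA" are counted separately, matching the table layout where AlaLys and LysAla are distinct entries.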

There are 400 possible dipeptides (441 if Xaa is counted as an additional, 21st amino acid representing all non-standard or unknown amino acids). Thus, if dipeptides occurred randomly in proteins, each of the 400 would be present at a frequency of 0.25% (2.5‰). As this is not the case in nature, for better visibility dipeptides that occur more often than expected are marked in red in the table, and those that are underrepresented are marked in blue.
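The expected value under a uniform model follows directly from the number of possible pairs. The sketch below computes it in per mille and compares an observed value against it. The helper names and the simple greater-than / less-than rule are assumptions for illustration; the exact threshold the page uses for its red and blue marking is not stated.

```python
def expected_uniform_permille(alphabet_size=20):
    """Expected per-mille frequency of each dipeptide if all were equally likely."""
    return 1000.0 / alphabet_size ** 2


def classify(observed_permille, alphabet_size=20):
    """Label a value relative to the uniform expectation (assumed rule)."""
    expected = expected_uniform_permille(alphabet_size)
    if observed_permille > expected:
        return "over"
    if observed_permille < expected:
        return "under"
    return "as expected"


print(expected_uniform_permille(20))  # 400 possible dipeptides
print(expected_uniform_permille(21))  # 441 possible dipeptides, with Xaa
print(classify(6.255))  # AlaAla value from the table
print(classify(0.192))  # CysCys value from the table
```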

All values are given in per mille (‰); multiply them by 10⁻³ to obtain fractions.
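For readers converting the table values, a minimal sketch of the unit conversion is given here. The function names are hypothetical.

```python
def permille_to_fraction(value_permille):
    """Convert a per-mille value to a plain fraction (multiply by 10^-3)."""
    return value_permille * 1e-3


def permille_to_percent(value_permille):
    """Convert a per-mille value to percent (divide by 10)."""
    return value_permille / 10.0


print(permille_to_fraction(6.255))
print(permille_to_percent(2.5))
```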

For more information, see the sequence space article on Wikipedia.

Ala Cys Asp Glu Phe Gly His Ile Lys Leu Met Asn Pro Gln Arg Ser Thr Val Trp Tyr Xaa
Ala
AlaAla: 6.255 ± 0.419
AlaCys: 0.518 ± 0.105
AlaAsp: 4.375 ± 0.276
AlaGlu: 5.661 ± 0.393
AlaPhe: 2.706 ± 0.247
AlaGly: 4.701 ± 0.428
AlaHis: 1.19 ± 0.133
AlaIle: 5.238 ± 0.327
AlaLys: 5.776 ± 0.404
AlaLeu: 6.236 ± 0.33
AlaMet: 1.727 ± 0.194
AlaAsn: 3.723 ± 0.325
AlaPro: 2.955 ± 0.272
AlaGln: 2.782 ± 0.271
AlaArg: 3.588 ± 0.29
AlaSer: 4.663 ± 0.297
AlaThr: 4.087 ± 0.43
AlaVal: 4.893 ± 0.316
AlaTrp: 1.151 ± 0.149
AlaTyr: 2.706 ± 0.258
AlaXaa: 0.0 ± 0.0
Cys
CysAla: 0.94 ± 0.149
CysCys: 0.192 ± 0.062
CysAsp: 0.576 ± 0.118
CysGlu: 0.806 ± 0.114
CysPhe: 0.48 ± 0.098
CysGly: 0.844 ± 0.126
CysHis: 0.249 ± 0.069
CysIle: 0.768 ± 0.129
CysLys: 0.633 ± 0.109
CysLeu: 0.576 ± 0.122
CysMet: 0.461 ± 0.102
CysAsn: 0.518 ± 0.097
CysPro: 0.518 ± 0.102
CysGln: 0.288 ± 0.069
CysArg: 0.537 ± 0.091
CysSer: 0.921 ± 0.132
CysThr: 0.652 ± 0.104
CysVal: 0.672 ± 0.138
CysTrp: 0.077 ± 0.034
CysTyr: 0.595 ± 0.112
CysXaa: 0.0 ± 0.0
Asp
AspAla: 4.164 ± 0.294
AspCys: 0.71 ± 0.116
AspAsp: 3.569 ± 0.369
AspGlu: 4.567 ± 0.361
AspPhe: 2.686 ± 0.236
AspGly: 4.433 ± 0.292
AspHis: 0.863 ± 0.125
AspIle: 4.586 ± 0.288
AspLys: 4.221 ± 0.29
AspLeu: 4.72 ± 0.315
AspMet: 1.689 ± 0.161
AspAsn: 2.782 ± 0.209
AspPro: 2.283 ± 0.243
AspGln: 1.765 ± 0.16
AspArg: 2.341 ± 0.22
AspSer: 3.818 ± 0.228
AspThr: 3.569 ± 0.324
AspVal: 4.01 ± 0.246
AspTrp: 1.17 ± 0.151
AspTyr: 2.821 ± 0.272
AspXaa: 0.0 ± 0.0
Glu
GluAla: 6.562 ± 0.408
GluCys: 0.902 ± 0.133
GluAsp: 3.454 ± 0.263
GluGlu: 4.816 ± 0.371
GluPhe: 3.281 ± 0.265
GluGly: 4.317 ± 0.275
GluHis: 1.19 ± 0.177
GluIle: 5.354 ± 0.339
GluLys: 4.49 ± 0.375
GluLeu: 6.927 ± 0.435
GluMet: 2.168 ± 0.22
GluAsn: 3.166 ± 0.256
GluPro: 2.111 ± 0.217
GluGln: 2.59 ± 0.241
GluArg: 3.032 ± 0.242
GluSer: 3.492 ± 0.282
GluThr: 4.087 ± 0.264
GluVal: 5.277 ± 0.351
GluTrp: 0.921 ± 0.122
GluTyr: 3.454 ± 0.284
GluXaa: 0.0 ± 0.0
Phe
PheAla: 2.801 ± 0.266
PheCys: 0.307 ± 0.084
PheAsp: 2.533 ± 0.222
PheGlu: 3.703 ± 0.317
PhePhe: 1.458 ± 0.178
PheGly: 2.686 ± 0.192
PheHis: 0.595 ± 0.1
PheIle: 2.706 ± 0.209
PheLys: 3.761 ± 0.29
PheLeu: 2.744 ± 0.22
PheMet: 1.401 ± 0.144
PheAsn: 2.667 ± 0.232
PhePro: 1.228 ± 0.146
PheGln: 1.228 ± 0.152
PheArg: 2.13 ± 0.184
PheSer: 2.571 ± 0.203
PheThr: 2.782 ± 0.226
PheVal: 2.744 ± 0.209
PheTrp: 0.729 ± 0.141
PheTyr: 1.343 ± 0.187
PheXaa: 0.0 ± 0.0
Gly
GlyAla: 3.396 ± 0.313
GlyCys: 0.71 ± 0.104
GlyAsp: 3.857 ± 0.338
GlyGlu: 4.01 ± 0.266
GlyPhe: 3.07 ± 0.273
GlyGly: 3.665 ± 0.382
GlyHis: 0.844 ± 0.115
GlyIle: 4.701 ± 0.305
GlyLys: 4.241 ± 0.314
GlyLeu: 4.989 ± 0.352
GlyMet: 1.746 ± 0.182
GlyAsn: 3.166 ± 0.483
GlyPro: 1.593 ± 0.153
GlyGln: 2.168 ± 0.211
GlyArg: 2.955 ± 0.216
GlySer: 4.682 ± 0.335
GlyThr: 4.644 ± 0.477
GlyVal: 4.221 ± 0.329
GlyTrp: 1.266 ± 0.184
GlyTyr: 3.051 ± 0.233
GlyXaa: 0.0 ± 0.0
His
HisAla: 1.17 ± 0.143
HisCys: 0.269 ± 0.081
HisAsp: 1.094 ± 0.151
HisGlu: 1.19 ± 0.154
HisPhe: 0.94 ± 0.13
HisGly: 1.075 ± 0.135
HisHis: 0.556 ± 0.111
HisIle: 1.343 ± 0.157
HisLys: 1.324 ± 0.196
HisLeu: 1.286 ± 0.149
HisMet: 0.403 ± 0.098
HisAsn: 0.633 ± 0.109
HisPro: 0.768 ± 0.112
HisGln: 0.633 ± 0.103
HisArg: 0.672 ± 0.129
HisSer: 1.075 ± 0.125
HisThr: 0.825 ± 0.127
HisVal: 1.17 ± 0.161
HisTrp: 0.249 ± 0.067
HisTyr: 0.902 ± 0.114
HisXaa: 0.0 ± 0.0
Ile
IleAla: 5.008 ± 0.287
IleCys: 0.748 ± 0.125
IleAsp: 4.701 ± 0.299
IleGlu: 5.718 ± 0.386
IlePhe: 2.283 ± 0.199
IleGly: 3.607 ± 0.312
IleHis: 1.151 ± 0.145
IleIle: 4.03 ± 0.262
IleLys: 6.102 ± 0.379
IleLeu: 4.241 ± 0.296
IleMet: 1.823 ± 0.193
IleAsn: 3.953 ± 0.261
IlePro: 2.763 ± 0.237
IleGln: 2.418 ± 0.223
IleArg: 3.32 ± 0.239
IleSer: 4.874 ± 0.289
IleThr: 4.356 ± 0.274
IleVal: 4.548 ± 0.303
IleTrp: 0.729 ± 0.121
IleTyr: 2.303 ± 0.239
IleXaa: 0.0 ± 0.0
Lys
LysAla: 6.351 ± 0.39
LysCys: 0.806 ± 0.129
LysAsp: 4.452 ± 0.258
LysGlu: 5.142 ± 0.402
LysPhe: 2.993 ± 0.287
LysGly: 4.01 ± 0.263
LysHis: 1.65 ± 0.174
LysIle: 4.989 ± 0.241
LysLys: 4.049 ± 0.305
LysLeu: 5.872 ± 0.364
LysMet: 2.59 ± 0.212
LysAsn: 3.665 ± 0.215
LysPro: 2.475 ± 0.227
LysGln: 2.341 ± 0.238
LysArg: 3.703 ± 0.306
LysSer: 4.183 ± 0.313
LysThr: 3.723 ± 0.236
LysVal: 5.027 ± 0.311
LysTrp: 0.902 ± 0.147
LysTyr: 3.051 ± 0.238
LysXaa: 0.0 ± 0.0
Leu
LeuAla: 5.91 ± 0.386
LeuCys: 0.921 ± 0.137
LeuAsp: 5.354 ± 0.3
LeuGlu: 5.565 ± 0.328
LeuPhe: 3.185 ± 0.293
LeuGly: 4.298 ± 0.295
LeuHis: 1.19 ± 0.161
LeuIle: 4.855 ± 0.329
LeuLys: 6.006 ± 0.358
LeuLeu: 4.509 ± 0.308
LeuMet: 2.053 ± 0.19
LeuAsn: 4.356 ± 0.267
LeuPro: 3.032 ± 0.203
LeuGln: 2.629 ± 0.203
LeuArg: 3.684 ± 0.291
LeuSer: 4.835 ± 0.301
LeuThr: 4.26 ± 0.353
LeuVal: 4.644 ± 0.284
LeuTrp: 0.71 ± 0.112
LeuTyr: 3.07 ± 0.279
LeuXaa: 0.0 ± 0.0
Met
MetAla: 2.648 ± 0.216
MetCys: 0.326 ± 0.076
MetAsp: 1.382 ± 0.136
MetGlu: 1.573 ± 0.202
MetPhe: 1.286 ± 0.152
MetGly: 1.497 ± 0.174
MetHis: 0.518 ± 0.109
MetIle: 2.245 ± 0.212
MetLys: 2.341 ± 0.224
MetLeu: 1.996 ± 0.173
MetMet: 0.652 ± 0.113
MetAsn: 1.746 ± 0.167
MetPro: 0.863 ± 0.122
MetGln: 1.075 ± 0.136
MetArg: 1.132 ± 0.162
MetSer: 1.957 ± 0.199
MetThr: 1.842 ± 0.188
MetVal: 1.631 ± 0.182
MetTrp: 0.211 ± 0.064
MetTyr: 1.113 ± 0.141
MetXaa: 0.0 ± 0.0
Asn
AsnAla: 3.646 ± 0.267
AsnCys: 0.633 ± 0.12
AsnAsp: 2.667 ± 0.256
AsnGlu: 3.627 ± 0.272
AsnPhe: 2.533 ± 0.217
AsnGly: 4.471 ± 0.406
AsnHis: 0.902 ± 0.136
AsnIle: 3.435 ± 0.285
AsnLys: 3.588 ± 0.233
AsnLeu: 3.876 ± 0.309
AsnMet: 1.228 ± 0.134
AsnAsn: 2.61 ± 0.259
AsnPro: 2.283 ± 0.19
AsnGln: 1.785 ± 0.217
AsnArg: 2.053 ± 0.158
AsnSer: 3.435 ± 0.281
AsnThr: 2.725 ± 0.225
AsnVal: 3.396 ± 0.233
AsnTrp: 0.825 ± 0.136
AsnTyr: 2.034 ± 0.171
AsnXaa: 0.0 ± 0.0
Pro
ProAla: 2.955 ± 0.237
ProCys: 0.384 ± 0.071
ProAsp: 2.149 ± 0.214
ProGlu: 3.377 ± 0.27
ProPhe: 1.516 ± 0.144
ProGly: 2.859 ± 0.263
ProHis: 0.576 ± 0.106
ProIle: 2.053 ± 0.205
ProLys: 2.437 ± 0.244
ProLeu: 2.264 ± 0.205
ProMet: 0.768 ± 0.097
ProAsn: 1.938 ± 0.204
ProPro: 0.902 ± 0.163
ProGln: 0.979 ± 0.14
ProArg: 1.382 ± 0.155
ProSer: 2.686 ± 0.219
ProThr: 1.861 ± 0.217
ProVal: 2.821 ± 0.231
ProTrp: 0.691 ± 0.127
ProTyr: 1.209 ± 0.151
ProXaa: 0.0 ± 0.0
Gln
GlnAla: 3.013 ± 0.252
GlnCys: 0.345 ± 0.079
GlnAsp: 1.957 ± 0.235
GlnGlu: 2.149 ± 0.189
GlnPhe: 1.573 ± 0.183
GlnGly: 1.689 ± 0.206
GlnHis: 0.748 ± 0.129
GlnIle: 2.456 ± 0.25
GlnLys: 2.015 ± 0.221
GlnLeu: 2.936 ± 0.222
GlnMet: 1.055 ± 0.162
GlnAsn: 1.458 ± 0.17
GlnPro: 0.959 ± 0.144
GlnGln: 1.017 ± 0.147
GlnArg: 1.861 ± 0.178
GlnSer: 2.092 ± 0.167
GlnThr: 1.861 ± 0.188
GlnVal: 2.456 ± 0.221
GlnTrp: 0.768 ± 0.12
GlnTyr: 1.976 ± 0.205
GlnXaa: 0.0 ± 0.0
Arg
ArgAla: 3.147 ± 0.261
ArgCys: 0.691 ± 0.106
ArgAsp: 2.974 ± 0.238
ArgGlu: 3.377 ± 0.222
ArgPhe: 1.996 ± 0.219
ArgGly: 2.821 ± 0.204
ArgHis: 0.94 ± 0.132
ArgIle: 3.435 ± 0.264
ArgLys: 3.3 ± 0.256
ArgLeu: 3.703 ± 0.261
ArgMet: 1.324 ± 0.145
ArgAsn: 2.36 ± 0.2
ArgPro: 1.228 ± 0.145
ArgGln: 1.996 ± 0.214
ArgArg: 2.245 ± 0.208
ArgSer: 2.264 ± 0.219
ArgThr: 2.379 ± 0.22
ArgVal: 3.07 ± 0.273
ArgTrp: 0.883 ± 0.129
ArgTyr: 1.861 ± 0.203
ArgXaa: 0.0 ± 0.0
Ser
SerAla: 4.72 ± 0.362
SerCys: 0.71 ± 0.121
SerAsp: 4.375 ± 0.303
SerGlu: 4.356 ± 0.259
SerPhe: 2.801 ± 0.256
SerGly: 4.701 ± 0.366
SerHis: 1.094 ± 0.133
SerIle: 4.145 ± 0.279
SerLys: 4.413 ± 0.285
SerLeu: 4.433 ± 0.283
SerMet: 1.746 ± 0.158
SerAsn: 3.128 ± 0.317
SerPro: 2.149 ± 0.194
SerGln: 2.149 ± 0.189
SerArg: 2.974 ± 0.233
SerSer: 4.874 ± 0.36
SerThr: 3.876 ± 0.279
SerVal: 3.857 ± 0.304
SerTrp: 1.036 ± 0.129
SerTyr: 2.725 ± 0.251
SerXaa: 0.0 ± 0.0
Thr
ThrAla: 3.895 ± 0.316
ThrCys: 0.345 ± 0.081
ThrAsp: 3.051 ± 0.26
ThrGlu: 3.588 ± 0.206
ThrPhe: 2.61 ± 0.219
ThrGly: 4.337 ± 0.338
ThrHis: 1.19 ± 0.136
ThrIle: 4.164 ± 0.258
ThrLys: 3.684 ± 0.285
ThrLeu: 4.452 ± 0.383
ThrMet: 1.573 ± 0.168
ThrAsn: 2.801 ± 0.288
ThrPro: 2.667 ± 0.266
ThrGln: 1.861 ± 0.229
ThrArg: 2.667 ± 0.209
ThrSer: 3.934 ± 0.376
ThrThr: 3.339 ± 0.322
ThrVal: 4.74 ± 0.38
ThrTrp: 0.652 ± 0.099
ThrTyr: 2.303 ± 0.187
ThrXaa: 0.0 ± 0.0
Val
ValAla: 4.509 ± 0.291
ValCys: 0.768 ± 0.126
ValAsp: 4.586 ± 0.27
ValGlu: 5.354 ± 0.341
ValPhe: 2.571 ± 0.249
ValGly: 3.761 ± 0.295
ValHis: 1.036 ± 0.148
ValIle: 4.855 ± 0.31
ValLys: 5.526 ± 0.317
ValLeu: 4.893 ± 0.311
ValMet: 2.015 ± 0.206
ValAsn: 3.435 ± 0.248
ValPro: 2.552 ± 0.228
ValGln: 2.686 ± 0.209
ValArg: 3.185 ± 0.289
ValSer: 4.413 ± 0.352
ValThr: 3.665 ± 0.341
ValVal: 5.258 ± 0.37
ValTrp: 0.768 ± 0.126
ValTyr: 2.744 ± 0.212
ValXaa: 0.0 ± 0.0
Trp
TrpAla: 0.94 ± 0.145
TrpCys: 0.249 ± 0.067
TrpAsp: 0.768 ± 0.113
TrpGlu: 0.883 ± 0.13
TrpPhe: 0.691 ± 0.117
TrpGly: 0.614 ± 0.105
TrpHis: 0.365 ± 0.093
TrpIle: 0.883 ± 0.143
TrpLys: 1.19 ± 0.159
TrpLeu: 1.132 ± 0.142
TrpMet: 0.518 ± 0.095
TrpAsn: 0.748 ± 0.121
TrpPro: 0.518 ± 0.107
TrpGln: 0.48 ± 0.092
TrpArg: 0.633 ± 0.088
TrpSer: 0.768 ± 0.126
TrpThr: 1.036 ± 0.189
TrpVal: 1.266 ± 0.154
TrpTrp: 0.288 ± 0.063
TrpTyr: 0.729 ± 0.113
TrpXaa: 0.0 ± 0.0
Tyr
TyrAla: 2.897 ± 0.263
TyrCys: 0.768 ± 0.135
TyrAsp: 2.897 ± 0.228
TyrGlu: 2.13 ± 0.222
TyrPhe: 1.458 ± 0.157
TyrGly: 2.59 ± 0.24
TyrHis: 0.729 ± 0.114
TyrIle: 2.437 ± 0.217
TyrLys: 2.897 ± 0.262
TyrLeu: 3.281 ± 0.252
TyrMet: 1.075 ± 0.141
TyrAsn: 2.897 ± 0.26
TyrPro: 1.957 ± 0.201
TyrGln: 1.516 ± 0.2
TyrArg: 1.9 ± 0.226
TyrSer: 2.725 ± 0.209
TyrThr: 2.283 ± 0.217
TyrVal: 2.878 ± 0.318
TyrTrp: 0.672 ± 0.118
TyrTyr: 1.669 ± 0.219
TyrXaa: 0.0 ± 0.0
Xaa
XaaAla: 0.0 ± 0.0
XaaCys: 0.0 ± 0.0
XaaAsp: 0.0 ± 0.0
XaaGlu: 0.0 ± 0.0
XaaPhe: 0.0 ± 0.0
XaaGly: 0.0 ± 0.0
XaaHis: 0.0 ± 0.0
XaaIle: 0.0 ± 0.0
XaaLys: 0.0 ± 0.0
XaaLeu: 0.0 ± 0.0
XaaMet: 0.0 ± 0.0
XaaAsn: 0.0 ± 0.0
XaaPro: 0.0 ± 0.0
XaaGln: 0.0 ± 0.0
XaaArg: 0.0 ± 0.0
XaaSer: 0.0 ± 0.0
XaaThr: 0.0 ± 0.0
XaaVal: 0.0 ± 0.0
XaaTrp: 0.0 ± 0.0
XaaTyr: 0.0 ± 0.0
XaaXaa: 0.0 ± 0.0
Statistics based on 279 proteins (52116 amino acids)

Note: The error was estimated by bootstrapping (100 replicates) at the protein level.
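A protein-level bootstrap resamples whole proteins with replacement and recomputes the statistic for each resampled set. The sketch below illustrates the idea for a single dipeptide. The function names, the use of the standard deviation of the replicate estimates as the reported error, and the normalization by the number of pairs are assumptions for this example; the source states only that bootstrapping with 100 replicates was done at the protein level.

```python
import random
from statistics import stdev


def dipeptide_permille(proteins, pair):
    """Per-mille frequency of one dipeptide among all overlapping pairs."""
    count = 0
    total = 0
    for seq in proteins:
        for i in range(len(seq) - 1):
            total += 1
            if seq[i:i + 2] == pair:
                count += 1
    return 1000.0 * count / total if total else 0.0


def bootstrap_error(proteins, pair, n_replicates=100, seed=0):
    """Spread of the statistic over protein-level bootstrap replicates.

    Assumption: the error is taken as the sample standard deviation.
    """
    rng = random.Random(seed)
    estimates = []
    for _ in range(n_replicates):
        # Resample whole proteins with replacement, keeping the set size.
        sample = rng.choices(proteins, k=len(proteins))
        estimates.append(dipeptide_permille(sample, pair))
    return stdev(estimates)


# Toy sequences in one-letter code, not real phage proteins.
toy_proteins = ["MKKA", "AKKL", "MAAL", "GKKG"]
print(bootstrap_error(toy_proteins, "KK"))
```

Resampling proteins rather than individual residues keeps each sequence intact, so the estimate reflects variation between proteins.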

The dipeptide statistics above (among other statistics for this proteome) can be downloaded from this CSV file
See this proteome in: uniprot_link
Proteome-pI is available under the Creative Commons Attribution-NoDerivs license; for more details see here

Reference: Kozlowski LP. Proteome-pI 2.0: Proteome Isoelectric Point Database Update. Nucleic Acids Res. 2021, doi: 10.1093/nar/gkab944
Contact: Lukasz P. Kozlowski