Amino acid dipeptide frequency for Klebsiella phage Matisse

Apart from single amino acid frequencies, one can also calculate the so-called amino acid dipeptide frequency.

There are 400 possible dipeptides (441 if we consider Xaa as an additional 21st amino acid representing all non-standard or unknown amino acids). Thus, if dipeptides occurred randomly in proteins, each of the 400 dipeptides would be present with a frequency of 0.25% (2.5‰). As this is not the case in nature, for better visibility all dipeptides that occur more often than expected are marked in red in the table, and those that are underrepresented are marked in blue.
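
The counting described above can be sketched in Python as follows. This is a minimal illustration and not the Proteome-pI implementation: the function name `dipeptide_frequencies`, the use of overlapping windows that never span two proteins, and the mapping of every non-standard letter to `X` (Xaa) are assumptions made for the example.

```python
from collections import Counter
from itertools import product

STANDARD = "ACDEFGHIKLMNPQRSTVWY"  # 20 standard amino acids, one-letter codes


def dipeptide_frequencies(proteins, per_mille=True):
    """Return the frequency of every overlapping dipeptide in `proteins`.

    Assumption: residues outside the 20 standard letters are mapped to 'X' (Xaa).
    """
    alphabet = STANDARD + "X"
    counts = Counter()
    for seq in proteins:
        seq = "".join(c if c in STANDARD else "X" for c in seq.upper())
        # Overlapping windows of length 2; pairs never span two proteins.
        counts.update(seq[i:i + 2] for i in range(len(seq) - 1))
    total = sum(counts.values())
    scale = 1000.0 if per_mille else 1.0
    return {a + b: (scale * counts[a + b] / total if total else 0.0)
            for a, b in product(alphabet, repeat=2)}


# Uniform expectation: 1/400 of all dipeptides, i.e. 0.25% or 2.5 per mille.
EXPECTED_PER_MILLE = 1000.0 / len(STANDARD) ** 2

freqs = dipeptide_frequencies(["MKKLA", "AKKU"])  # toy sequences, not phage data
```

The returned dictionary always has 441 keys (21 x 21), so dipeptides that never occur show up with a frequency of 0.0, as in the Xaa row and column of the table.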

All values are given in per mille (‰), so they need to be multiplied by 10^-3 to obtain fractions.
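
To make the unit conversion concrete, the short sketch below converts a few values from the table to fractions and compares them with the uniform expectation of 1/400 (2.5‰). The helper names are hypothetical, and the simple greater-than/less-than rule is an assumption: the exact rule used for colouring the table (for example, whether the error is taken into account) is not stated on this page.

```python
EXPECTED_PER_MILLE = 1000.0 / 400  # uniform expectation over 20 x 20 dipeptides


def per_mille_to_fraction(value):
    """Convert a per mille value to a plain fraction (multiply by 10^-3)."""
    return value * 1e-3


def classify(value_per_mille, expected=EXPECTED_PER_MILLE):
    """Label a dipeptide relative to the uniform expectation (assumed rule)."""
    if value_per_mille > expected:
        return "over"
    if value_per_mille < expected:
        return "under"
    return "expected"


# Values copied from the table for this proteome.
table = {"AlaLeu": 5.67, "TrpTrp": 0.125, "CysCys": 0.215}
labels = {name: classify(v) for name, v in table.items()}
fraction_ala_leu = per_mille_to_fraction(table["AlaLeu"])
```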

For more information, see the sequence space article on Wikipedia.

Residue order: Ala Cys Asp Glu Phe Gly His Ile Lys Leu Met Asn Pro Gln Arg Ser Thr Val Trp Tyr Xaa

Ala
AlaAla: 4.078 ± 0.319
AlaCys: 0.501 ± 0.103
AlaAsp: 3.72 ± 0.256
AlaGlu: 4.793 ± 0.378
AlaPhe: 2.826 ± 0.258
AlaGly: 4.06 ± 0.388
AlaHis: 1.198 ± 0.139
AlaIle: 4.972 ± 0.304
AlaLys: 5.312 ± 0.397
AlaLeu: 5.67 ± 0.328
AlaMet: 2.093 ± 0.236
AlaAsn: 3.434 ± 0.286
AlaPro: 2.039 ± 0.192
AlaGln: 2.361 ± 0.224
AlaArg: 3.416 ± 0.293
AlaSer: 3.595 ± 0.269
AlaThr: 4.221 ± 0.455
AlaVal: 4.722 ± 0.306
AlaTrp: 0.823 ± 0.137
AlaTyr: 3.076 ± 0.233
AlaXaa: 0.0 ± 0.0

Cys
CysAla: 0.894 ± 0.117
CysCys: 0.215 ± 0.063
CysAsp: 0.787 ± 0.118
CysGlu: 0.787 ± 0.119
CysPhe: 0.554 ± 0.103
CysGly: 1.037 ± 0.148
CysHis: 0.286 ± 0.073
CysIle: 0.519 ± 0.091
CysLys: 1.019 ± 0.142
CysLeu: 0.859 ± 0.11
CysMet: 0.411 ± 0.085
CysAsn: 0.644 ± 0.134
CysPro: 0.537 ± 0.111
CysGln: 0.286 ± 0.082
CysArg: 0.59 ± 0.104
CysSer: 0.715 ± 0.114
CysThr: 0.733 ± 0.108
CysVal: 0.805 ± 0.162
CysTrp: 0.143 ± 0.05
CysTyr: 0.429 ± 0.085
CysXaa: 0.0 ± 0.0

Asp
AspAla: 4.293 ± 0.325
AspCys: 0.751 ± 0.108
AspAsp: 3.989 ± 0.297
AspGlu: 4.954 ± 0.329
AspPhe: 3.058 ± 0.252
AspGly: 4.615 ± 0.406
AspHis: 0.966 ± 0.141
AspIle: 4.901 ± 0.324
AspLys: 4.4 ± 0.346
AspLeu: 5.097 ± 0.362
AspMet: 1.878 ± 0.18
AspAsn: 3.273 ± 0.265
AspPro: 2.969 ± 0.2
AspGln: 1.681 ± 0.165
AspArg: 2.933 ± 0.194
AspSer: 3.702 ± 0.279
AspThr: 3.863 ± 0.275
AspVal: 4.132 ± 0.285
AspTrp: 1.252 ± 0.153
AspTyr: 3.398 ± 0.305
AspXaa: 0.0 ± 0.0

Glu
GluAla: 4.972 ± 0.312
GluCys: 1.002 ± 0.166
GluAsp: 3.631 ± 0.266
GluGlu: 5.276 ± 0.391
GluPhe: 2.808 ± 0.243
GluGly: 3.667 ± 0.282
GluHis: 1.538 ± 0.172
GluIle: 5.634 ± 0.336
GluLys: 5.258 ± 0.335
GluLeu: 5.902 ± 0.316
GluMet: 2.254 ± 0.246
GluAsn: 3.452 ± 0.24
GluPro: 1.431 ± 0.171
GluGln: 2.325 ± 0.217
GluArg: 3.13 ± 0.249
GluSer: 4.203 ± 0.351
GluThr: 4.167 ± 0.318
GluVal: 4.418 ± 0.284
GluTrp: 1.091 ± 0.13
GluTyr: 2.987 ± 0.23
GluXaa: 0.0 ± 0.0

Phe
PheAla: 2.361 ± 0.225
PheCys: 0.483 ± 0.083
PheAsp: 3.631 ± 0.286
PheGlu: 3.058 ± 0.271
PhePhe: 1.377 ± 0.201
PheGly: 3.058 ± 0.252
PheHis: 0.537 ± 0.119
PheIle: 2.54 ± 0.245
PheLys: 2.987 ± 0.253
PheLeu: 2.486 ± 0.198
PheMet: 1.27 ± 0.147
PheAsn: 2.45 ± 0.218
PhePro: 1.377 ± 0.179
PheGln: 1.198 ± 0.163
PheArg: 1.735 ± 0.158
PheSer: 2.558 ± 0.21
PheThr: 2.772 ± 0.225
PheVal: 3.023 ± 0.253
PheTrp: 0.715 ± 0.101
PheTyr: 1.574 ± 0.187
PheXaa: 0.0 ± 0.0

Gly
GlyAla: 4.024 ± 0.411
GlyCys: 1.037 ± 0.151
GlyAsp: 4.364 ± 0.319
GlyGlu: 3.917 ± 0.248
GlyPhe: 3.058 ± 0.282
GlyGly: 3.649 ± 0.374
GlyHis: 1.109 ± 0.146
GlyIle: 3.953 ± 0.236
GlyLys: 4.954 ± 0.257
GlyLeu: 4.436 ± 0.258
GlyMet: 1.717 ± 0.184
GlyAsn: 3.47 ± 0.339
GlyPro: 0.859 ± 0.107
GlyGln: 1.628 ± 0.183
GlyArg: 2.915 ± 0.23
GlySer: 4.561 ± 0.304
GlyThr: 3.881 ± 0.429
GlyVal: 4.901 ± 0.299
GlyTrp: 0.948 ± 0.129
GlyTyr: 3.452 ± 0.248
GlyXaa: 0.0 ± 0.0

His
HisAla: 1.234 ± 0.157
HisCys: 0.268 ± 0.076
HisAsp: 1.109 ± 0.175
HisGlu: 1.27 ± 0.17
HisPhe: 0.751 ± 0.111
HisGly: 0.948 ± 0.149
HisHis: 0.376 ± 0.083
HisIle: 1.306 ± 0.164
HisLys: 1.216 ± 0.149
HisLeu: 1.467 ± 0.169
HisMet: 0.411 ± 0.099
HisAsn: 0.876 ± 0.135
HisPro: 0.966 ± 0.113
HisGln: 0.626 ± 0.115
HisArg: 0.859 ± 0.13
HisSer: 1.055 ± 0.148
HisThr: 0.715 ± 0.101
HisVal: 1.252 ± 0.174
HisTrp: 0.25 ± 0.065
HisTyr: 0.751 ± 0.114
HisXaa: 0.0 ± 0.0

Ile
IleAla: 4.865 ± 0.384
IleCys: 0.608 ± 0.098
IleAsp: 5.419 ± 0.321
IleGlu: 4.561 ± 0.31
IlePhe: 2.325 ± 0.244
IleGly: 3.756 ± 0.297
IleHis: 1.288 ± 0.152
IleIle: 3.81 ± 0.263
IleLys: 5.097 ± 0.324
IleLeu: 4.114 ± 0.28
IleMet: 2.057 ± 0.204
IleAsn: 3.72 ± 0.258
IlePro: 2.754 ± 0.22
IleGln: 2.325 ± 0.205
IleArg: 3.863 ± 0.283
IleSer: 3.845 ± 0.261
IleThr: 4.436 ± 0.256
IleVal: 5.026 ± 0.315
IleTrp: 0.698 ± 0.104
IleTyr: 2.79 ± 0.246
IleXaa: 0.0 ± 0.0

Lys
LysAla: 5.741 ± 0.354
LysCys: 0.841 ± 0.159
LysAsp: 4.221 ± 0.248
LysGlu: 5.634 ± 0.388
LysPhe: 3.041 ± 0.24
LysGly: 4.31 ± 0.314
LysHis: 1.485 ± 0.173
LysIle: 4.811 ± 0.291
LysLys: 4.972 ± 0.425
LysLeu: 5.527 ± 0.374
LysMet: 2.468 ± 0.2
LysAsn: 4.221 ± 0.279
LysPro: 2.647 ± 0.231
LysGln: 2.898 ± 0.252
LysArg: 3.559 ± 0.284
LysSer: 3.738 ± 0.291
LysThr: 4.579 ± 0.249
LysVal: 4.722 ± 0.296
LysTrp: 1.019 ± 0.128
LysTyr: 3.613 ± 0.288
LysXaa: 0.0 ± 0.0

Leu
LeuAla: 4.99 ± 0.297
LeuCys: 1.002 ± 0.144
LeuAsp: 5.473 ± 0.375
LeuGlu: 4.99 ± 0.311
LeuPhe: 3.094 ± 0.273
LeuGly: 4.096 ± 0.247
LeuHis: 1.431 ± 0.157
LeuIle: 4.471 ± 0.288
LeuLys: 5.509 ± 0.327
LeuLeu: 4.597 ± 0.295
LeuMet: 2.665 ± 0.25
LeuAsn: 4.203 ± 0.263
LeuPro: 3.219 ± 0.253
LeuGln: 2.236 ± 0.258
LeuArg: 4.078 ± 0.222
LeuSer: 5.133 ± 0.344
LeuThr: 4.31 ± 0.352
LeuVal: 4.686 ± 0.326
LeuTrp: 0.751 ± 0.122
LeuTyr: 3.595 ± 0.283
LeuXaa: 0.0 ± 0.0

Met
MetAla: 1.932 ± 0.182
MetCys: 0.304 ± 0.085
MetAsp: 1.824 ± 0.185
MetGlu: 1.717 ± 0.192
MetPhe: 1.055 ± 0.164
MetGly: 1.574 ± 0.177
MetHis: 0.501 ± 0.099
MetIle: 2.254 ± 0.189
MetLys: 2.915 ± 0.239
MetLeu: 2.128 ± 0.192
MetMet: 0.984 ± 0.123
MetAsn: 2.289 ± 0.218
MetPro: 0.823 ± 0.119
MetGln: 1.198 ± 0.165
MetArg: 1.127 ± 0.155
MetSer: 1.95 ± 0.181
MetThr: 1.753 ± 0.163
MetVal: 1.699 ± 0.15
MetTrp: 0.447 ± 0.1
MetTyr: 1.27 ± 0.145
MetXaa: 0.0 ± 0.0

Asn
AsnAla: 4.06 ± 0.288
AsnCys: 0.644 ± 0.105
AsnAsp: 3.13 ± 0.276
AsnGlu: 3.935 ± 0.256
AsnPhe: 1.967 ± 0.239
AsnGly: 4.65 ± 0.375
AsnHis: 1.163 ± 0.158
AsnIle: 3.255 ± 0.276
AsnLys: 3.416 ± 0.295
AsnLeu: 4.239 ± 0.291
AsnMet: 1.502 ± 0.163
AsnAsn: 3.058 ± 0.3
AsnPro: 2.754 ± 0.274
AsnGln: 1.914 ± 0.211
AsnArg: 2.683 ± 0.229
AsnSer: 3.416 ± 0.258
AsnThr: 3.13 ± 0.281
AsnVal: 3.774 ± 0.311
AsnTrp: 0.322 ± 0.076
AsnTyr: 1.789 ± 0.175
AsnXaa: 0.0 ± 0.0

Pro
ProAla: 2.057 ± 0.228
ProCys: 0.465 ± 0.081
ProAsp: 3.291 ± 0.225
ProGlu: 2.719 ± 0.235
ProPhe: 1.574 ± 0.167
ProGly: 2.272 ± 0.238
ProHis: 0.501 ± 0.093
ProIle: 2.057 ± 0.18
ProLys: 2.415 ± 0.247
ProLeu: 2.522 ± 0.186
ProMet: 0.662 ± 0.103
ProAsn: 1.717 ± 0.194
ProPro: 1.127 ± 0.157
ProGln: 0.948 ± 0.108
ProArg: 1.502 ± 0.192
ProSer: 2.218 ± 0.198
ProThr: 2.289 ± 0.233
ProVal: 3.076 ± 0.256
ProTrp: 0.429 ± 0.086
ProTyr: 1.538 ± 0.18
ProXaa: 0.0 ± 0.0

Gln
GlnAla: 2.397 ± 0.275
GlnCys: 0.376 ± 0.088
GlnAsp: 1.61 ± 0.169
GlnGlu: 2.021 ± 0.186
GlnPhe: 1.27 ± 0.131
GlnGly: 1.61 ± 0.173
GlnHis: 0.447 ± 0.079
GlnIle: 2.719 ± 0.24
GlnLys: 2.593 ± 0.299
GlnLeu: 2.862 ± 0.243
GlnMet: 0.93 ± 0.124
GlnAsn: 1.61 ± 0.175
GlnPro: 0.859 ± 0.128
GlnGln: 1.359 ± 0.15
GlnArg: 1.753 ± 0.165
GlnSer: 1.824 ± 0.182
GlnThr: 1.878 ± 0.18
GlnVal: 2.325 ± 0.183
GlnTrp: 0.411 ± 0.095
GlnTyr: 1.699 ± 0.168
GlnXaa: 0.0 ± 0.0

Arg
ArgAla: 2.719 ± 0.228
ArgCys: 0.59 ± 0.111
ArgAsp: 3.184 ± 0.241
ArgGlu: 3.416 ± 0.255
ArgPhe: 1.914 ± 0.226
ArgGly: 3.166 ± 0.223
ArgHis: 0.68 ± 0.13
ArgIle: 3.577 ± 0.291
ArgLys: 3.595 ± 0.289
ArgLeu: 3.774 ± 0.232
ArgMet: 1.288 ± 0.155
ArgAsn: 2.915 ± 0.201
ArgPro: 1.502 ± 0.16
ArgGln: 1.556 ± 0.179
ArgArg: 2.075 ± 0.207
ArgSer: 2.933 ± 0.226
ArgThr: 2.415 ± 0.199
ArgVal: 3.219 ± 0.237
ArgTrp: 0.876 ± 0.124
ArgTyr: 2.057 ± 0.186
ArgXaa: 0.0 ± 0.0

Ser
SerAla: 3.738 ± 0.274
SerCys: 0.662 ± 0.107
SerAsp: 4.006 ± 0.277
SerGlu: 3.72 ± 0.291
SerPhe: 2.719 ± 0.211
SerGly: 4.632 ± 0.357
SerHis: 1.091 ± 0.144
SerIle: 4.185 ± 0.312
SerLys: 3.989 ± 0.23
SerLeu: 4.937 ± 0.337
SerMet: 1.878 ± 0.193
SerAsn: 2.933 ± 0.255
SerPro: 2.468 ± 0.225
SerGln: 1.985 ± 0.175
SerArg: 2.772 ± 0.235
SerSer: 3.792 ± 0.314
SerThr: 3.291 ± 0.283
SerVal: 4.418 ± 0.26
SerTrp: 0.787 ± 0.117
SerTyr: 2.504 ± 0.214
SerXaa: 0.0 ± 0.0

Thr
ThrAla: 3.953 ± 0.379
ThrCys: 0.59 ± 0.116
ThrAsp: 3.47 ± 0.29
ThrGlu: 3.72 ± 0.315
ThrPhe: 2.558 ± 0.22
ThrGly: 4.239 ± 0.297
ThrHis: 1.109 ± 0.117
ThrIle: 4.042 ± 0.279
ThrLys: 3.935 ± 0.275
ThrLeu: 5.276 ± 0.301
ThrMet: 1.413 ± 0.139
ThrAsn: 3.094 ± 0.235
ThrPro: 2.772 ± 0.245
ThrGln: 1.699 ± 0.209
ThrArg: 2.754 ± 0.219
ThrSer: 3.363 ± 0.293
ThrThr: 3.237 ± 0.326
ThrVal: 4.901 ± 0.427
ThrTrp: 0.662 ± 0.091
ThrTyr: 2.54 ± 0.227
ThrXaa: 0.0 ± 0.0

Val
ValAla: 4.704 ± 0.352
ValCys: 1.055 ± 0.151
ValAsp: 4.99 ± 0.321
ValGlu: 5.044 ± 0.334
ValPhe: 2.951 ± 0.205
ValGly: 4.006 ± 0.335
ValHis: 0.93 ± 0.136
ValIle: 4.543 ± 0.286
ValLys: 5.849 ± 0.341
ValLeu: 4.776 ± 0.285
ValMet: 1.932 ± 0.171
ValAsn: 3.756 ± 0.292
ValPro: 2.289 ± 0.212
ValGln: 2.2 ± 0.216
ValArg: 3.005 ± 0.223
ValSer: 4.579 ± 0.293
ValThr: 4.114 ± 0.327
ValVal: 4.4 ± 0.298
ValTrp: 1.002 ± 0.148
ValTyr: 3.541 ± 0.25
ValXaa: 0.0 ± 0.0

Trp
TrpAla: 0.948 ± 0.129
TrpCys: 0.197 ± 0.058
TrpAsp: 1.002 ± 0.149
TrpGlu: 0.662 ± 0.116
TrpPhe: 0.68 ± 0.107
TrpGly: 0.662 ± 0.12
TrpHis: 0.215 ± 0.07
TrpIle: 0.841 ± 0.124
TrpLys: 1.27 ± 0.154
TrpLeu: 0.966 ± 0.124
TrpMet: 0.644 ± 0.128
TrpAsn: 0.733 ± 0.123
TrpPro: 0.197 ± 0.065
TrpGln: 0.554 ± 0.088
TrpArg: 0.68 ± 0.124
TrpSer: 0.572 ± 0.096
TrpThr: 0.751 ± 0.123
TrpVal: 0.93 ± 0.15
TrpTrp: 0.125 ± 0.044
TrpTyr: 0.698 ± 0.096
TrpXaa: 0.0 ± 0.0

Tyr
TyrAla: 2.951 ± 0.224
TyrCys: 0.662 ± 0.095
TyrAsp: 3.255 ± 0.227
TyrGlu: 2.951 ± 0.207
TyrPhe: 1.628 ± 0.15
TyrGly: 2.826 ± 0.287
TyrHis: 0.859 ± 0.149
TyrIle: 2.969 ± 0.238
TyrLys: 3.345 ± 0.268
TyrLeu: 2.898 ± 0.215
TyrMet: 1.27 ± 0.156
TyrAsn: 2.951 ± 0.218
TyrPro: 1.789 ± 0.204
TyrGln: 1.574 ± 0.15
TyrArg: 2.021 ± 0.207
TyrSer: 2.754 ± 0.233
TyrThr: 2.79 ± 0.254
TyrVal: 3.237 ± 0.243
TyrTrp: 0.572 ± 0.1
TyrTyr: 1.86 ± 0.198
TyrXaa: 0.0 ± 0.0

Xaa
XaaAla: 0.0 ± 0.0
XaaCys: 0.0 ± 0.0
XaaAsp: 0.0 ± 0.0
XaaGlu: 0.0 ± 0.0
XaaPhe: 0.0 ± 0.0
XaaGly: 0.0 ± 0.0
XaaHis: 0.0 ± 0.0
XaaIle: 0.0 ± 0.0
XaaLys: 0.0 ± 0.0
XaaLeu: 0.0 ± 0.0
XaaMet: 0.0 ± 0.0
XaaAsn: 0.0 ± 0.0
XaaPro: 0.0 ± 0.0
XaaGln: 0.0 ± 0.0
XaaArg: 0.0 ± 0.0
XaaSer: 0.0 ± 0.0
XaaThr: 0.0 ± 0.0
XaaVal: 0.0 ± 0.0
XaaTrp: 0.0 ± 0.0
XaaTyr: 0.0 ± 0.0
XaaXaa: 0.0 ± 0.0

Statistics based on 280 proteins (55911 amino acids)

Note: The error was estimated by bootstrapping (x100) at the protein level.
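
A protein-level bootstrap of this kind can be sketched as follows. This is only an illustration under stated assumptions: whole proteins are resampled with replacement, the frequency is recomputed for each of 100 replicates, and the reported error is taken to be the sample standard deviation of those replicate values. The page does not say which spread measure is actually used, and the function names `dipeptide_per_mille` and `bootstrap_error` are hypothetical.

```python
import random
from collections import Counter
from statistics import stdev


def dipeptide_per_mille(proteins, dipeptide):
    """Frequency of one dipeptide (one-letter codes) in per mille."""
    counts = Counter()
    for seq in proteins:
        counts.update(seq[i:i + 2] for i in range(len(seq) - 1))
    total = sum(counts.values())
    return 1000.0 * counts[dipeptide] / total if total else 0.0


def bootstrap_error(proteins, dipeptide, n_replicates=100, seed=0):
    """Spread of the frequency over protein-level bootstrap replicates.

    Assumption: the error is the standard deviation across replicates.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    estimates = []
    for _ in range(n_replicates):
        # Resample whole proteins with replacement, keeping the sample size.
        sample = rng.choices(proteins, k=len(proteins))
        estimates.append(dipeptide_per_mille(sample, dipeptide))
    return stdev(estimates)


toy = ["MKKLA", "AKKLV", "MALKK", "GGSGG"]  # toy sequences, not phage data
err_kk = bootstrap_error(toy, "KK")
err_ww = bootstrap_error(toy, "WW")  # never occurs, so every replicate gives 0.0
```

Resampling whole proteins rather than individual residues keeps the dipeptides of one protein together, so the estimate reflects variation between proteins.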

The dipeptide statistics above (among other statistics for this proteome) can be downloaded from this CSV file
See this proteome in: uniprot_link
Proteome-pI is available under a Creative Commons Attribution-NoDerivs license; for more details see here

Reference: Kozlowski LP. Proteome-pI 2.0: Proteome Isoelectric Point Database Update. Nucleic Acids Res. 2021, doi: 10.1093/nar/gkab944
Contact: Lukasz P. Kozlowski