Amino acid dipeptide frequency for Salmonella phage smaug

Apart from single amino acid frequencies, one can also calculate the so-called amino acid dipeptide frequency.

There are 400 possibilities (441 if we consider Xaa as an additional 21st amino acid representing all non-standard or unknown amino acids). Thus, if dipeptides occurred randomly in proteins (with all 20 standard amino acids equally likely), each amino acid dipeptide would be present with a frequency of 0.25% (2.5‰). As this is not the case in nature, for better visibility all overrepresented dipeptides are marked in red, and underrepresented ones in blue, in the table.
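As an illustration of the counting described above, here is a minimal Python sketch. It assumes overlapping pairs counted within each protein and normalization by the total number of pairs; the database's exact normalization is not stated here and may differ. The function names and the toy sequences are hypothetical.

```python
from collections import Counter
from itertools import product

# One-letter codes of the 20 standard amino acids (Xaa/X is left out here).
STANDARD = "ACDEFGHIKLMNPQRSTVWY"

# Uniform expectation: 1 of 400 dipeptides, i.e. 0.25 % or 2.5 per mille.
UNIFORM_EXPECTATION = 1000.0 / 400


def dipeptide_per_mille(proteins):
    """Return {dipeptide: frequency in per mille} for all 400 standard pairs."""
    counts = Counter()
    for seq in proteins:
        # Overlapping windows of length 2; pairs never span two proteins.
        counts.update(seq[i:i + 2] for i in range(len(seq) - 1))
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no dipeptides found in the input sequences")
    return {a + b: 1000.0 * counts[a + b] / total
            for a, b in product(STANDARD, repeat=2)}


def classify(freq):
    """Label a per-mille frequency relative to the uniform expectation."""
    if freq > UNIFORM_EXPECTATION:
        return "over"
    if freq < UNIFORM_EXPECTATION:
        return "under"
    return "expected"


# Made-up sequences for demonstration, not taken from the phage proteome.
toy = ["MKALAALK", "MAAKL"]
freqs = dipeptide_per_mille(toy)
```

Applied to the values in the table, `classify(7.719)` (AlaAla) gives "over" and `classify(0.093)` (CysTrp) gives "under".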

All values are given in per mille (‰) and therefore need to be multiplied by 10^-3 to obtain fractions.
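The unit conversion can be checked with a few lines of Python. This is only a small sketch; the helper name is hypothetical, and the AlaAla value is the one listed in the table.

```python
def per_mille_to_fraction(value):
    """Convert a per-mille value to a plain fraction (multiply by 10^-3)."""
    return value * 1e-3


ala_ala = 7.719                      # AlaAla frequency in per mille
fraction = per_mille_to_fraction(ala_ala)
percent = 100 * fraction

print(f"{fraction:.6f}")             # 0.007719
print(f"{percent:.4f}%")             # 0.7719%
```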

For more information, see the sequence space article on Wikipedia.

Entries are listed as dipeptide: frequency ± error (‰), grouped by first residue. Residue order: Ala Cys Asp Glu Phe Gly His Ile Lys Leu Met Asn Pro Gln Arg Ser Thr Val Trp Tyr Xaa
Ala
AlaAla: 7.719 ± 1.114
AlaCys: 0.654 ± 0.132
AlaAsp: 4.295 ± 0.462
AlaGlu: 5.385 ± 0.54
AlaPhe: 3.361 ± 0.305
AlaGly: 5.416 ± 0.553
AlaHis: 1.494 ± 0.273
AlaIle: 4.98 ± 0.388
AlaLys: 6.163 ± 0.54
AlaLeu: 6.754 ± 0.56
AlaMet: 1.805 ± 0.289
AlaAsn: 3.33 ± 0.369
AlaPro: 2.303 ± 0.297
AlaGln: 3.33 ± 0.41
AlaArg: 3.237 ± 0.256
AlaSer: 4.357 ± 0.511
AlaThr: 4.015 ± 0.477
AlaVal: 4.389 ± 0.437
AlaTrp: 0.965 ± 0.145
AlaTyr: 2.677 ± 0.291
AlaXaa: 0.0 ± 0.0
Cys
CysAla: 0.591 ± 0.153
CysCys: 0.28 ± 0.09
CysAsp: 0.529 ± 0.133
CysGlu: 0.747 ± 0.142
CysPhe: 0.342 ± 0.115
CysGly: 0.778 ± 0.165
CysHis: 0.218 ± 0.08
CysIle: 0.747 ± 0.17
CysLys: 0.84 ± 0.19
CysLeu: 0.778 ± 0.162
CysMet: 0.218 ± 0.088
CysAsn: 0.56 ± 0.136
CysPro: 0.373 ± 0.133
CysGln: 0.311 ± 0.107
CysArg: 0.56 ± 0.149
CysSer: 0.84 ± 0.172
CysThr: 0.498 ± 0.13
CysVal: 0.716 ± 0.147
CysTrp: 0.093 ± 0.051
CysTyr: 0.249 ± 0.089
CysXaa: 0.0 ± 0.0
Asp
AspAla: 4.575 ± 0.41
AspCys: 0.56 ± 0.165
AspAsp: 2.926 ± 0.404
AspGlu: 4.326 ± 0.324
AspPhe: 2.926 ± 0.31
AspGly: 3.922 ± 0.338
AspHis: 1.307 ± 0.224
AspIle: 4.108 ± 0.402
AspLys: 4.389 ± 0.366
AspLeu: 5.26 ± 0.413
AspMet: 2.023 ± 0.219
AspAsn: 2.365 ± 0.261
AspPro: 2.708 ± 0.238
AspGln: 1.432 ± 0.196
AspArg: 2.708 ± 0.291
AspSer: 3.735 ± 0.383
AspThr: 3.859 ± 0.368
AspVal: 3.61 ± 0.337
AspTrp: 0.934 ± 0.184
AspTyr: 2.614 ± 0.297
AspXaa: 0.0 ± 0.0
Glu
GluAla: 5.914 ± 0.465
GluCys: 0.716 ± 0.155
GluAsp: 3.486 ± 0.347
GluGlu: 5.602 ± 0.492
GluPhe: 2.614 ± 0.279
GluGly: 3.766 ± 0.361
GluHis: 1.463 ± 0.217
GluIle: 5.011 ± 0.367
GluLys: 5.011 ± 0.387
GluLeu: 7.283 ± 0.466
GluMet: 1.992 ± 0.228
GluAsn: 2.614 ± 0.29
GluPro: 1.681 ± 0.239
GluGln: 2.988 ± 0.389
GluArg: 3.019 ± 0.362
GluSer: 3.61 ± 0.277
GluThr: 3.828 ± 0.307
GluVal: 4.357 ± 0.364
GluTrp: 1.214 ± 0.198
GluTyr: 3.019 ± 0.347
GluXaa: 0.0 ± 0.0
Phe
PheAla: 2.708 ± 0.267
PheCys: 0.436 ± 0.116
PheAsp: 2.614 ± 0.289
PheGlu: 2.739 ± 0.33
PhePhe: 1.743 ± 0.201
PheGly: 2.957 ± 0.333
PheHis: 1.12 ± 0.159
PheIle: 2.895 ± 0.299
PheLys: 3.175 ± 0.256
PheLeu: 3.05 ± 0.315
PheMet: 0.84 ± 0.171
PheAsn: 3.144 ± 0.296
PhePro: 1.961 ± 0.279
PheGln: 0.903 ± 0.169
PheArg: 2.085 ± 0.309
PheSer: 2.646 ± 0.307
PheThr: 2.49 ± 0.25
PheVal: 2.521 ± 0.248
PheTrp: 0.436 ± 0.109
PheTyr: 1.463 ± 0.209
PheXaa: 0.0 ± 0.0
Gly
GlyAla: 4.638 ± 0.598
GlyCys: 0.934 ± 0.193
GlyAsp: 3.735 ± 0.383
GlyGlu: 4.824 ± 0.451
GlyPhe: 2.77 ± 0.238
GlyGly: 3.859 ± 0.531
GlyHis: 1.245 ± 0.235
GlyIle: 4.42 ± 0.351
GlyLys: 4.949 ± 0.408
GlyLeu: 4.544 ± 0.393
GlyMet: 1.805 ± 0.263
GlyAsn: 3.704 ± 0.429
GlyPro: 0.965 ± 0.184
GlyGln: 2.521 ± 0.257
GlyArg: 2.677 ± 0.255
GlySer: 4.233 ± 0.46
GlyThr: 4.575 ± 0.462
GlyVal: 5.198 ± 0.396
GlyTrp: 0.903 ± 0.185
GlyTyr: 3.299 ± 0.303
GlyXaa: 0.0 ± 0.0
His
HisAla: 1.276 ± 0.198
HisCys: 0.311 ± 0.105
HisAsp: 1.152 ± 0.192
HisGlu: 0.996 ± 0.17
HisPhe: 0.778 ± 0.154
HisGly: 1.276 ± 0.164
HisHis: 0.747 ± 0.186
HisIle: 1.65 ± 0.222
HisLys: 1.432 ± 0.213
HisLeu: 1.587 ± 0.248
HisMet: 0.28 ± 0.125
HisAsn: 0.934 ± 0.231
HisPro: 0.871 ± 0.146
HisGln: 0.56 ± 0.148
HisArg: 1.183 ± 0.18
HisSer: 1.276 ± 0.206
HisThr: 0.716 ± 0.166
HisVal: 0.622 ± 0.143
HisTrp: 0.218 ± 0.09
HisTyr: 0.747 ± 0.169
HisXaa: 0.0 ± 0.0
Ile
IleAla: 4.855 ± 0.379
IleCys: 0.716 ± 0.159
IleAsp: 4.824 ± 0.44
IleGlu: 4.389 ± 0.336
IlePhe: 2.397 ± 0.328
IleGly: 4.14 ± 0.355
IleHis: 1.152 ± 0.171
IleIle: 4.077 ± 0.347
IleLys: 4.638 ± 0.425
IleLeu: 5.011 ± 0.431
IleMet: 2.023 ± 0.311
IleAsn: 4.202 ± 0.405
IlePro: 3.05 ± 0.311
IleGln: 2.365 ± 0.264
IleArg: 2.863 ± 0.278
IleSer: 4.42 ± 0.323
IleThr: 4.7 ± 0.376
IleVal: 3.517 ± 0.323
IleTrp: 0.871 ± 0.172
IleTyr: 2.459 ± 0.282
IleXaa: 0.0 ± 0.0
Lys
LysAla: 6.474 ± 0.543
LysCys: 0.56 ± 0.141
LysAsp: 4.824 ± 0.315
LysGlu: 5.322 ± 0.402
LysPhe: 3.361 ± 0.303
LysGly: 3.486 ± 0.296
LysHis: 1.058 ± 0.196
LysIle: 4.357 ± 0.361
LysLys: 4.482 ± 0.44
LysLeu: 6.381 ± 0.574
LysMet: 2.334 ± 0.281
LysAsn: 3.393 ± 0.432
LysPro: 2.365 ± 0.32
LysGln: 3.299 ± 0.338
LysArg: 3.455 ± 0.273
LysSer: 4.046 ± 0.374
LysThr: 3.859 ± 0.387
LysVal: 4.793 ± 0.341
LysTrp: 0.747 ± 0.171
LysTyr: 3.268 ± 0.292
LysXaa: 0.0 ± 0.0
Leu
LeuAla: 7.003 ± 0.472
LeuCys: 0.685 ± 0.129
LeuAsp: 6.163 ± 0.37
LeuGlu: 6.692 ± 0.455
LeuPhe: 2.926 ± 0.324
LeuGly: 5.602 ± 0.481
LeuHis: 1.556 ± 0.212
LeuIle: 5.136 ± 0.423
LeuLys: 5.758 ± 0.365
LeuLeu: 5.914 ± 0.47
LeuMet: 2.085 ± 0.308
LeuAsn: 4.793 ± 0.411
LeuPro: 3.486 ± 0.306
LeuGln: 2.863 ± 0.307
LeuArg: 3.673 ± 0.315
LeuSer: 4.513 ± 0.41
LeuThr: 4.669 ± 0.347
LeuVal: 5.665 ± 0.413
LeuTrp: 0.747 ± 0.15
LeuTyr: 2.614 ± 0.281
LeuXaa: 0.0 ± 0.0
Met
MetAla: 2.148 ± 0.205
MetCys: 0.218 ± 0.071
MetAsp: 1.214 ± 0.211
MetGlu: 1.93 ± 0.268
MetPhe: 0.934 ± 0.198
MetGly: 1.307 ± 0.179
MetHis: 0.529 ± 0.123
MetIle: 2.148 ± 0.316
MetLys: 2.334 ± 0.275
MetLeu: 2.365 ± 0.285
MetMet: 0.405 ± 0.099
MetAsn: 1.089 ± 0.144
MetPro: 0.871 ± 0.155
MetGln: 1.245 ± 0.227
MetArg: 1.12 ± 0.189
MetSer: 1.992 ± 0.326
MetThr: 1.836 ± 0.231
MetVal: 1.307 ± 0.203
MetTrp: 0.311 ± 0.107
MetTyr: 0.965 ± 0.165
MetXaa: 0.0 ± 0.0
Asn
AsnAla: 3.486 ± 0.412
AsnCys: 0.56 ± 0.146
AsnAsp: 2.801 ± 0.294
AsnGlu: 2.832 ± 0.292
AsnPhe: 2.085 ± 0.231
AsnGly: 4.731 ± 0.526
AsnHis: 0.84 ± 0.184
AsnIle: 3.486 ± 0.337
AsnLys: 3.859 ± 0.371
AsnLeu: 4.606 ± 0.431
AsnMet: 1.183 ± 0.19
AsnAsn: 2.895 ± 0.356
AsnPro: 2.614 ± 0.294
AsnGln: 1.463 ± 0.194
AsnArg: 2.646 ± 0.303
AsnSer: 3.797 ± 0.275
AsnThr: 2.926 ± 0.267
AsnVal: 3.828 ± 0.367
AsnTrp: 0.809 ± 0.17
AsnTyr: 2.023 ± 0.275
AsnXaa: 0.0 ± 0.0
Pro
ProAla: 2.334 ± 0.297
ProCys: 0.342 ± 0.112
ProAsp: 2.646 ± 0.326
ProGlu: 3.517 ± 0.331
ProPhe: 1.743 ± 0.237
ProGly: 1.93 ± 0.239
ProHis: 0.436 ± 0.12
ProIle: 2.303 ± 0.321
ProLys: 2.054 ± 0.277
ProLeu: 2.241 ± 0.263
ProMet: 0.685 ± 0.151
ProAsn: 2.272 ± 0.343
ProPro: 1.681 ± 0.331
ProGln: 0.965 ± 0.152
ProArg: 1.556 ± 0.242
ProSer: 2.272 ± 0.31
ProThr: 1.93 ± 0.243
ProVal: 2.583 ± 0.286
ProTrp: 0.436 ± 0.09
ProTyr: 1.743 ± 0.194
ProXaa: 0.0 ± 0.0
Gln
GlnAla: 2.863 ± 0.491
GlnCys: 0.436 ± 0.112
GlnAsp: 1.93 ± 0.235
GlnGlu: 2.801 ± 0.383
GlnPhe: 2.023 ± 0.241
GlnGly: 1.899 ± 0.206
GlnHis: 0.436 ± 0.127
GlnIle: 2.023 ± 0.217
GlnLys: 2.677 ± 0.355
GlnLeu: 3.486 ± 0.382
GlnMet: 0.903 ± 0.174
GlnAsn: 1.681 ± 0.214
GlnPro: 0.591 ± 0.147
GlnGln: 2.023 ± 0.362
GlnArg: 1.899 ± 0.238
GlnSer: 2.397 ± 0.295
GlnThr: 1.867 ± 0.252
GlnVal: 2.832 ± 0.29
GlnTrp: 0.622 ± 0.147
GlnTyr: 1.432 ± 0.201
GlnXaa: 0.0 ± 0.0
Arg
ArgAla: 3.081 ± 0.296
ArgCys: 0.187 ± 0.076
ArgAsp: 2.863 ± 0.244
ArgGlu: 2.926 ± 0.345
ArgPhe: 1.836 ± 0.241
ArgGly: 3.424 ± 0.313
ArgHis: 0.747 ± 0.189
ArgIle: 3.081 ± 0.351
ArgLys: 3.019 ± 0.416
ArgLeu: 4.108 ± 0.377
ArgMet: 1.618 ± 0.238
ArgAsn: 2.926 ± 0.287
ArgPro: 1.401 ± 0.23
ArgGln: 1.681 ± 0.195
ArgArg: 2.646 ± 0.322
ArgSer: 2.552 ± 0.259
ArgThr: 2.583 ± 0.267
ArgVal: 3.112 ± 0.312
ArgTrp: 0.436 ± 0.132
ArgTyr: 1.712 ± 0.258
ArgXaa: 0.0 ± 0.0
Ser
SerAla: 4.638 ± 0.48
SerCys: 0.654 ± 0.143
SerAsp: 2.863 ± 0.293
SerGlu: 3.393 ± 0.367
SerPhe: 2.895 ± 0.227
SerGly: 5.011 ± 0.492
SerHis: 0.84 ± 0.152
SerIle: 4.575 ± 0.388
SerLys: 4.544 ± 0.362
SerLeu: 5.291 ± 0.332
SerMet: 1.681 ± 0.259
SerAsn: 3.579 ± 0.435
SerPro: 2.054 ± 0.264
SerGln: 2.241 ± 0.311
SerArg: 2.77 ± 0.267
SerSer: 3.859 ± 0.365
SerThr: 3.33 ± 0.299
SerVal: 4.171 ± 0.334
SerTrp: 1.307 ± 0.195
SerTyr: 2.397 ± 0.266
SerXaa: 0.0 ± 0.0
Thr
ThrAla: 4.42 ± 0.532
ThrCys: 0.467 ± 0.096
ThrAsp: 3.299 ± 0.305
ThrGlu: 3.206 ± 0.268
ThrPhe: 2.272 ± 0.24
ThrGly: 5.073 ± 0.543
ThrHis: 0.809 ± 0.169
ThrIle: 3.891 ± 0.377
ThrLys: 4.108 ± 0.396
ThrLeu: 4.606 ± 0.357
ThrMet: 1.338 ± 0.215
ThrAsn: 3.735 ± 0.318
ThrPro: 2.552 ± 0.289
ThrGln: 2.023 ± 0.217
ThrArg: 2.241 ± 0.346
ThrSer: 3.704 ± 0.306
ThrThr: 2.957 ± 0.308
ThrVal: 4.202 ± 0.344
ThrTrp: 0.871 ± 0.204
ThrTyr: 1.867 ± 0.236
ThrXaa: 0.0 ± 0.0
Val
ValAla: 4.98 ± 0.429
ValCys: 0.685 ± 0.148
ValAsp: 4.295 ± 0.395
ValGlu: 4.42 ± 0.382
ValPhe: 2.521 ± 0.27
ValGly: 4.171 ± 0.343
ValHis: 1.245 ± 0.225
ValIle: 4.295 ± 0.374
ValLys: 4.575 ± 0.347
ValLeu: 4.887 ± 0.381
ValMet: 1.65 ± 0.223
ValAsn: 3.299 ± 0.376
ValPro: 2.614 ± 0.286
ValGln: 2.365 ± 0.262
ValArg: 3.112 ± 0.315
ValSer: 4.295 ± 0.312
ValThr: 3.891 ± 0.331
ValVal: 4.731 ± 0.379
ValTrp: 0.747 ± 0.156
ValTyr: 2.677 ± 0.301
ValXaa: 0.0 ± 0.0
Trp
TrpAla: 0.56 ± 0.132
TrpCys: 0.28 ± 0.09
TrpAsp: 1.027 ± 0.18
TrpGlu: 0.903 ± 0.167
TrpPhe: 0.467 ± 0.113
TrpGly: 0.809 ± 0.185
TrpHis: 0.187 ± 0.088
TrpIle: 0.84 ± 0.17
TrpLys: 1.183 ± 0.158
TrpLeu: 1.338 ± 0.25
TrpMet: 0.373 ± 0.106
TrpAsn: 0.654 ± 0.172
TrpPro: 0.311 ± 0.098
TrpGln: 0.778 ± 0.158
TrpArg: 0.622 ± 0.152
TrpSer: 0.809 ± 0.187
TrpThr: 0.654 ± 0.147
TrpVal: 0.871 ± 0.151
TrpTrp: 0.249 ± 0.119
TrpTyr: 0.498 ± 0.162
TrpXaa: 0.0 ± 0.0
Tyr
TyrAla: 2.365 ± 0.279
TyrCys: 0.591 ± 0.14
TyrAsp: 2.801 ± 0.274
TyrGlu: 2.241 ± 0.285
TyrPhe: 1.93 ± 0.27
TyrGly: 2.241 ± 0.313
TyrHis: 1.183 ± 0.165
TyrIle: 2.677 ± 0.248
TyrLys: 2.677 ± 0.261
TyrLeu: 3.05 ± 0.321
TyrMet: 0.996 ± 0.164
TyrAsn: 2.303 ± 0.244
TyrPro: 1.214 ± 0.168
TyrGln: 1.463 ± 0.239
TyrArg: 1.867 ± 0.232
TyrSer: 2.739 ± 0.309
TyrThr: 2.459 ± 0.243
TyrVal: 2.428 ± 0.246
TyrTrp: 0.498 ± 0.115
TyrTyr: 1.712 ± 0.241
TyrXaa: 0.0 ± 0.0
Xaa
XaaAla: 0.0 ± 0.0
XaaCys: 0.0 ± 0.0
XaaAsp: 0.0 ± 0.0
XaaGlu: 0.0 ± 0.0
XaaPhe: 0.0 ± 0.0
XaaGly: 0.0 ± 0.0
XaaHis: 0.0 ± 0.0
XaaIle: 0.0 ± 0.0
XaaLys: 0.0 ± 0.0
XaaLeu: 0.0 ± 0.0
XaaMet: 0.0 ± 0.0
XaaAsn: 0.0 ± 0.0
XaaPro: 0.0 ± 0.0
XaaGln: 0.0 ± 0.0
XaaArg: 0.0 ± 0.0
XaaSer: 0.0 ± 0.0
XaaThr: 0.0 ± 0.0
XaaVal: 0.0 ± 0.0
XaaTrp: 0.0 ± 0.0
XaaTyr: 0.0 ± 0.0
XaaXaa: 0.0 ± 0.0
Statistics based on 158 proteins (32130 amino acids)

Note: The error was estimated by bootstrapping (x100) at the protein level.
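A protein-level bootstrap of this kind could look roughly like the sketch below. It assumes that whole proteins are resampled with replacement, that the statistic is recomputed for each of 100 rounds, and that the reported error is the standard deviation across rounds; the database's actual procedure may differ in these details. All function and parameter names are hypothetical.

```python
import random
import statistics
from collections import Counter


def pair_counts(seq):
    """Count overlapping dipeptides within a single protein sequence."""
    return Counter(seq[i:i + 2] for i in range(len(seq) - 1))


def bootstrap_error(proteins, dipeptide, n_rounds=100, seed=0):
    """Return (mean, standard deviation) of a dipeptide frequency in per mille.

    Each round draws len(proteins) proteins with replacement, so the
    resampling unit is the protein, not the individual dipeptide.
    """
    rng = random.Random(seed)
    per_protein = [pair_counts(s) for s in proteins]
    estimates = []
    for _ in range(n_rounds):
        sample = rng.choices(per_protein, k=len(per_protein))
        total = sum(sum(c.values()) for c in sample)
        hits = sum(c[dipeptide] for c in sample)
        estimates.append(1000.0 * hits / total if total else 0.0)
    return statistics.mean(estimates), statistics.stdev(estimates)


# Made-up sequences for demonstration, not taken from the phage proteome.
toy = ["MKALAALK", "MAAKL", "MKKLLAAG"]
mean, err = bootstrap_error(toy, "AA")
```

Resampling whole proteins keeps the dependence between dipeptides of the same protein intact, which is presumably why the error is estimated at the protein level rather than per residue.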

The dipeptide statistics above (among other statistics for this proteome) can be downloaded from this CSV file
See this proteome in: uniprot_link
Proteome-pI is available under Creative Commons Attribution-NoDerivs license, for more details see here

Reference: Kozlowski LP. Proteome-pI 2.0: Proteome Isoelectric Point Database Update. Nucleic Acids Res. 2021, doi: 10.1093/nar/gkab944
Contact: Lukasz P. Kozlowski