Amino acid dipeptide frequency for Mycobacterium phage Cracklewink

Apart from single amino acid frequencies, one can also calculate the so-called amino acid dipeptide frequency, i.e. how often each ordered pair of adjacent amino acids occurs.
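As an illustration, the calculation can be sketched in a few lines of Python. This is a minimal sketch, not the code used by Proteome-pI: the function name `dipeptide_per_mille`, the use of one-letter codes (the table uses three-letter codes), the overlapping two-residue windows, and the mapping of every non-standard letter to X (Xaa) are all assumptions made for the example.

```python
from collections import Counter

# The 20 standard amino acids in one-letter code (assumed input alphabet).
STANDARD = set("ACDEFGHIKLMNPQRSTVWY")


def dipeptide_per_mille(proteins):
    """Return {dipeptide: frequency in per mille} over all given proteins."""
    counts = Counter()
    for seq in proteins:
        # Assumption: anything outside the 20 standard residues becomes X (Xaa).
        cleaned = "".join(aa if aa in STANDARD else "X" for aa in seq.upper())
        # Overlapping windows of length 2; a window never spans two proteins.
        counts.update(cleaned[i:i + 2] for i in range(len(cleaned) - 1))
    total = sum(counts.values())
    if total == 0:
        return {}
    return {pair: 1000.0 * n / total for pair, n in counts.items()}


# Toy input: "MAAK" gives MA, AA, AK and "AAG" gives AA, AG (5 dipeptides).
freqs = dipeptide_per_mille(["MAAK", "AAG"])
print(freqs["AA"])  # 2 of 5 dipeptides
```

With real data, `proteins` would hold the sequences of the proteome, and the resulting values would be comparable to the table only if the counting conventions match those of the database.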

There are 400 possible dipeptides (441 if Xaa is counted as an additional, 21st amino acid representing all non-standard or unknown amino acids). Thus, if dipeptides occurred randomly in proteins, with all amino acids equally frequent, each dipeptide would be present with a frequency of 0.25% (1/400, i.e. 2.5‰). As this is not the case in nature, for better visibility all dipeptides that are more frequent than expected are marked in red in the table, and those that are underrepresented are marked in blue.
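The expected value and the over/under-representation rule can be written out as a short Python sketch. The helper names `expected_per_mille` and `classify` are hypothetical, and the strict comparison against the uniform expectation is an assumption; the page does not state the exact threshold used for colouring. The two observed values are taken from the table (AlaAla and CysCys).

```python
def expected_per_mille(alphabet_size=20):
    """Uniform expectation for one dipeptide, in per mille."""
    return 1000.0 / alphabet_size ** 2


def classify(observed_per_mille, alphabet_size=20):
    """Label a dipeptide relative to the uniform expectation."""
    expected = expected_per_mille(alphabet_size)
    if observed_per_mille > expected:
        return "over"    # would be marked red
    if observed_per_mille < expected:
        return "under"   # would be marked blue
    return "as expected"


print(expected_per_mille(20))  # 400 dipeptides: 2.5 per mille (0.25%)
print(expected_per_mille(21))  # 441 dipeptides: roughly 2.27 per mille
print(classify(16.564))        # AlaAla
print(classify(0.203))         # CysCys
```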

All values are given in per mille (‰) and must therefore be multiplied by 10^-3 to obtain fractions.
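For example, the unit conversion can be expressed as follows. This is only an illustrative sketch; the function names are hypothetical, and the input value is the AlaAla entry from the table.

```python
def per_mille_to_fraction(value):
    """Convert a per mille value to a plain fraction (multiply by 10^-3)."""
    return value * 1e-3


def per_mille_to_percent(value):
    """Convert a per mille value to percent (divide by 10)."""
    return value / 10.0


ala_ala = 16.564  # AlaAla frequency from the table, in per mille
fraction = per_mille_to_fraction(ala_ala)  # about 0.016564
percent = per_mille_to_percent(ala_ala)    # about 1.6564
```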

For more information, see the sequence space article on Wikipedia.

Entries are grouped by the first residue; within each group the second residue follows the order Ala Cys Asp Glu Phe Gly His Ile Lys Leu Met Asn Pro Gln Arg Ser Thr Val Trp Tyr Xaa. Each entry reads dipeptide: frequency ± error (‰).

Ala
AlaAla: 16.564 ± 1.226
AlaCys: 1.099 ± 0.217
AlaAsp: 8.831 ± 0.64
AlaGlu: 6.715 ± 0.629
AlaPhe: 1.994 ± 0.283
AlaGly: 8.18 ± 0.687
AlaHis: 2.076 ± 0.265
AlaIle: 4.599 ± 0.419
AlaLys: 5.291 ± 0.593
AlaLeu: 8.139 ± 0.595
AlaMet: 3.459 ± 0.421
AlaAsn: 3.785 ± 0.459
AlaPro: 4.802 ± 0.517
AlaGln: 5.006 ± 0.631
AlaArg: 7.529 ± 0.619
AlaSer: 4.68 ± 0.43
AlaThr: 6.878 ± 0.672
AlaVal: 7.692 ± 0.59
AlaTrp: 1.953 ± 0.288
AlaTyr: 2.767 ± 0.309
AlaXaa: 0.0 ± 0.0

Cys
CysAla: 1.099 ± 0.194
CysCys: 0.203 ± 0.092
CysAsp: 1.017 ± 0.223
CysGlu: 0.488 ± 0.144
CysPhe: 0.163 ± 0.075
CysGly: 2.157 ± 0.416
CysHis: 0.448 ± 0.164
CysIle: 0.407 ± 0.117
CysLys: 0.122 ± 0.061
CysLeu: 0.773 ± 0.19
CysMet: 0.163 ± 0.083
CysAsn: 0.488 ± 0.127
CysPro: 1.628 ± 0.302
CysGln: 0.57 ± 0.168
CysArg: 1.221 ± 0.289
CysSer: 0.814 ± 0.151
CysThr: 0.651 ± 0.175
CysVal: 0.692 ± 0.178
CysTrp: 0.366 ± 0.128
CysTyr: 0.163 ± 0.075
CysXaa: 0.0 ± 0.0

Asp
AspAla: 6.105 ± 0.482
AspCys: 0.855 ± 0.23
AspAsp: 6.511 ± 0.655
AspGlu: 4.884 ± 0.564
AspPhe: 1.221 ± 0.217
AspGly: 6.389 ± 0.516
AspHis: 1.465 ± 0.237
AspIle: 2.442 ± 0.306
AspLys: 1.831 ± 0.283
AspLeu: 4.884 ± 0.515
AspMet: 1.302 ± 0.226
AspAsn: 2.198 ± 0.31
AspPro: 5.006 ± 0.428
AspGln: 3.459 ± 0.348
AspArg: 5.046 ± 0.466
AspSer: 3.256 ± 0.361
AspThr: 3.907 ± 0.339
AspVal: 4.436 ± 0.438
AspTrp: 1.18 ± 0.268
AspTyr: 1.831 ± 0.284
AspXaa: 0.0 ± 0.0

Glu
GluAla: 5.494 ± 0.529
GluCys: 0.855 ± 0.178
GluAsp: 3.703 ± 0.369
GluGlu: 3.093 ± 0.481
GluPhe: 1.75 ± 0.253
GluGly: 3.663 ± 0.374
GluHis: 1.18 ± 0.214
GluIle: 1.424 ± 0.205
GluLys: 1.669 ± 0.301
GluLeu: 6.389 ± 0.601
GluMet: 1.465 ± 0.237
GluAsn: 1.343 ± 0.322
GluPro: 3.581 ± 0.435
GluGln: 2.767 ± 0.334
GluArg: 4.273 ± 0.388
GluSer: 3.256 ± 0.44
GluThr: 2.605 ± 0.296
GluVal: 4.762 ± 0.551
GluTrp: 1.506 ± 0.24
GluTyr: 1.221 ± 0.199
GluXaa: 0.0 ± 0.0

Phe
PheAla: 2.442 ± 0.289
PheCys: 0.122 ± 0.064
PheAsp: 2.157 ± 0.27
PheGlu: 1.302 ± 0.215
PhePhe: 0.488 ± 0.15
PheGly: 2.076 ± 0.269
PheHis: 0.651 ± 0.162
PheIle: 0.692 ± 0.161
PheLys: 0.651 ± 0.159
PheLeu: 1.302 ± 0.189
PheMet: 0.57 ± 0.139
PheAsn: 0.936 ± 0.236
PhePro: 1.709 ± 0.266
PheGln: 1.221 ± 0.199
PheArg: 1.953 ± 0.233
PheSer: 1.017 ± 0.192
PheThr: 1.628 ± 0.221
PheVal: 1.628 ± 0.237
PheTrp: 0.407 ± 0.134
PheTyr: 0.651 ± 0.154
PheXaa: 0.0 ± 0.0

Gly
GlyAla: 7.448 ± 0.866
GlyCys: 1.424 ± 0.256
GlyAsp: 6.064 ± 0.548
GlyGlu: 3.825 ± 0.335
GlyPhe: 2.36 ± 0.38
GlyGly: 8.506 ± 1.294
GlyHis: 2.808 ± 0.277
GlyIle: 2.889 ± 0.375
GlyLys: 2.686 ± 0.274
GlyLeu: 7.285 ± 0.517
GlyMet: 1.791 ± 0.252
GlyAsn: 2.767 ± 0.313
GlyPro: 4.762 ± 0.525
GlyGln: 4.192 ± 0.463
GlyArg: 5.86 ± 0.458
GlySer: 4.07 ± 0.421
GlyThr: 6.552 ± 0.57
GlyVal: 6.634 ± 0.504
GlyTrp: 2.442 ± 0.283
GlyTyr: 2.971 ± 0.365
GlyXaa: 0.0 ± 0.0

His
HisAla: 2.564 ± 0.339
HisCys: 0.448 ± 0.125
HisAsp: 1.75 ± 0.292
HisGlu: 0.733 ± 0.17
HisPhe: 0.529 ± 0.135
HisGly: 2.645 ± 0.307
HisHis: 1.302 ± 0.239
HisIle: 1.017 ± 0.197
HisLys: 0.814 ± 0.169
HisLeu: 2.279 ± 0.306
HisMet: 0.529 ± 0.131
HisAsn: 0.692 ± 0.163
HisPro: 2.035 ± 0.292
HisGln: 1.262 ± 0.237
HisArg: 1.872 ± 0.3
HisSer: 0.773 ± 0.163
HisThr: 1.506 ± 0.267
HisVal: 1.384 ± 0.24
HisTrp: 0.61 ± 0.149
HisTyr: 0.936 ± 0.186
HisXaa: 0.0 ± 0.0

Ile
IleAla: 5.575 ± 0.386
IleCys: 0.407 ± 0.134
IleAsp: 2.238 ± 0.257
IleGlu: 2.93 ± 0.373
IlePhe: 0.407 ± 0.193
IleGly: 3.459 ± 0.335
IleHis: 0.895 ± 0.177
IleIle: 1.017 ± 0.245
IleLys: 1.058 ± 0.171
IleLeu: 1.628 ± 0.254
IleMet: 0.57 ± 0.136
IleAsn: 1.14 ± 0.217
IlePro: 2.32 ± 0.246
IleGln: 1.099 ± 0.227
IleArg: 2.605 ± 0.262
IleSer: 1.669 ± 0.27
IleThr: 3.093 ± 0.365
IleVal: 2.523 ± 0.282
IleTrp: 0.692 ± 0.141
IleTyr: 0.488 ± 0.141
IleXaa: 0.0 ± 0.0

Lys
LysAla: 4.192 ± 0.554
LysCys: 0.407 ± 0.132
LysAsp: 1.628 ± 0.232
LysGlu: 1.506 ± 0.244
LysPhe: 0.855 ± 0.18
LysGly: 2.32 ± 0.319
LysHis: 0.773 ± 0.151
LysIle: 1.18 ± 0.214
LysLys: 0.61 ± 0.146
LysLeu: 2.32 ± 0.29
LysMet: 0.855 ± 0.173
LysAsn: 0.448 ± 0.14
LysPro: 2.279 ± 0.305
LysGln: 0.977 ± 0.206
LysArg: 3.012 ± 0.345
LysSer: 1.343 ± 0.252
LysThr: 1.669 ± 0.285
LysVal: 2.93 ± 0.353
LysTrp: 0.651 ± 0.148
LysTyr: 0.692 ± 0.143
LysXaa: 0.0 ± 0.0

Leu
LeuAla: 9.686 ± 0.55
LeuCys: 1.058 ± 0.239
LeuAsp: 5.535 ± 0.494
LeuGlu: 4.802 ± 0.413
LeuPhe: 2.035 ± 0.316
LeuGly: 6.878 ± 0.571
LeuHis: 2.076 ± 0.327
LeuIle: 2.808 ± 0.362
LeuLys: 2.442 ± 0.333
LeuLeu: 6.878 ± 0.666
LeuMet: 1.587 ± 0.222
LeuAsn: 2.198 ± 0.345
LeuPro: 4.924 ± 0.562
LeuGln: 2.808 ± 0.323
LeuArg: 4.558 ± 0.406
LeuSer: 3.5 ± 0.391
LeuThr: 6.349 ± 0.443
LeuVal: 5.575 ± 0.57
LeuTrp: 1.262 ± 0.279
LeuTyr: 1.994 ± 0.261
LeuXaa: 0.0 ± 0.0

Met
MetAla: 3.012 ± 0.321
MetCys: 0.244 ± 0.114
MetAsp: 0.977 ± 0.176
MetGlu: 0.977 ± 0.225
MetPhe: 0.407 ± 0.13
MetGly: 1.709 ± 0.285
MetHis: 0.448 ± 0.143
MetIle: 0.936 ± 0.209
MetLys: 0.692 ± 0.16
MetLeu: 2.035 ± 0.346
MetMet: 0.895 ± 0.17
MetAsn: 1.017 ± 0.188
MetPro: 1.099 ± 0.165
MetGln: 0.448 ± 0.13
MetArg: 1.546 ± 0.28
MetSer: 2.564 ± 0.312
MetThr: 3.174 ± 0.326
MetVal: 1.506 ± 0.226
MetTrp: 0.366 ± 0.136
MetTyr: 0.488 ± 0.135
MetXaa: 0.0 ± 0.0

Asn
AsnAla: 3.948 ± 0.541
AsnCys: 0.529 ± 0.15
AsnAsp: 1.709 ± 0.256
AsnGlu: 1.709 ± 0.326
AsnPhe: 0.529 ± 0.141
AsnGly: 3.174 ± 0.377
AsnHis: 1.099 ± 0.221
AsnIle: 0.895 ± 0.164
AsnLys: 0.488 ± 0.168
AsnLeu: 2.198 ± 0.318
AsnMet: 0.651 ± 0.151
AsnAsn: 0.814 ± 0.193
AsnPro: 2.767 ± 0.342
AsnGln: 1.017 ± 0.268
AsnArg: 2.605 ± 0.379
AsnSer: 1.099 ± 0.207
AsnThr: 1.506 ± 0.251
AsnVal: 2.32 ± 0.283
AsnTrp: 0.529 ± 0.144
AsnTyr: 0.488 ± 0.114
AsnXaa: 0.0 ± 0.0

Pro
ProAla: 6.552 ± 0.603
ProCys: 0.895 ± 0.199
ProAsp: 5.25 ± 0.464
ProGlu: 4.029 ± 0.408
ProPhe: 1.628 ± 0.192
ProGly: 5.331 ± 0.475
ProHis: 1.465 ± 0.255
ProIle: 3.093 ± 0.427
ProLys: 1.709 ± 0.253
ProLeu: 4.07 ± 0.385
ProMet: 1.872 ± 0.279
ProAsn: 1.75 ± 0.279
ProPro: 4.599 ± 0.707
ProGln: 2.442 ± 0.328
ProArg: 3.581 ± 0.509
ProSer: 3.215 ± 0.35
ProThr: 4.517 ± 0.43
ProVal: 5.25 ± 0.578
ProTrp: 1.587 ± 0.242
ProTyr: 1.384 ± 0.212
ProXaa: 0.0 ± 0.0

Gln
GlnAla: 4.639 ± 0.569
GlnCys: 0.448 ± 0.215
GlnAsp: 1.302 ± 0.204
GlnGlu: 1.506 ± 0.239
GlnPhe: 1.14 ± 0.202
GlnGly: 3.5 ± 0.349
GlnHis: 0.895 ± 0.191
GlnIle: 1.709 ± 0.279
GlnLys: 1.18 ± 0.207
GlnLeu: 5.087 ± 0.441
GlnMet: 1.465 ± 0.233
GlnAsn: 0.855 ± 0.204
GlnPro: 3.215 ± 0.42
GlnGln: 2.157 ± 0.288
GlnArg: 3.948 ± 0.513
GlnSer: 1.628 ± 0.215
GlnThr: 1.75 ± 0.289
GlnVal: 2.727 ± 0.323
GlnTrp: 0.733 ± 0.196
GlnTyr: 0.936 ± 0.211
GlnXaa: 0.0 ± 0.0

Arg
ArgAla: 7.163 ± 0.601
ArgCys: 1.424 ± 0.335
ArgAsp: 4.029 ± 0.403
ArgGlu: 3.988 ± 0.446
ArgPhe: 1.669 ± 0.24
ArgGly: 5.698 ± 0.529
ArgHis: 1.913 ± 0.299
ArgIle: 2.279 ± 0.312
ArgLys: 2.767 ± 0.426
ArgLeu: 6.634 ± 0.498
ArgMet: 2.035 ± 0.267
ArgAsn: 1.709 ± 0.249
ArgPro: 4.232 ± 0.563
ArgGln: 2.767 ± 0.342
ArgArg: 6.145 ± 0.698
ArgSer: 3.988 ± 0.387
ArgThr: 3.093 ± 0.321
ArgVal: 5.779 ± 0.502
ArgTrp: 1.953 ± 0.276
ArgTyr: 3.012 ± 0.339
ArgXaa: 0.0 ± 0.0

Ser
SerAla: 4.314 ± 0.351
SerCys: 0.366 ± 0.12
SerAsp: 3.703 ± 0.448
SerGlu: 2.564 ± 0.306
SerPhe: 1.099 ± 0.189
SerGly: 5.372 ± 0.625
SerHis: 1.343 ± 0.219
SerIle: 1.587 ± 0.365
SerLys: 1.302 ± 0.241
SerLeu: 3.012 ± 0.345
SerMet: 1.18 ± 0.195
SerAsn: 1.709 ± 0.237
SerPro: 2.645 ± 0.369
SerGln: 2.36 ± 0.306
SerArg: 3.866 ± 0.458
SerSer: 2.564 ± 0.315
SerThr: 2.808 ± 0.351
SerVal: 3.907 ± 0.406
SerTrp: 1.058 ± 0.203
SerTyr: 1.302 ± 0.238
SerXaa: 0.0 ± 0.0

Thr
ThrAla: 7.977 ± 0.643
ThrCys: 0.855 ± 0.214
ThrAsp: 4.355 ± 0.354
ThrGlu: 3.703 ± 0.451
ThrPhe: 1.831 ± 0.309
ThrGly: 5.698 ± 0.501
ThrHis: 1.343 ± 0.22
ThrIle: 2.442 ± 0.34
ThrLys: 2.035 ± 0.318
ThrLeu: 4.232 ± 0.468
ThrMet: 1.221 ± 0.205
ThrAsn: 2.279 ± 0.305
ThrPro: 5.575 ± 0.445
ThrGln: 1.669 ± 0.249
ThrArg: 3.907 ± 0.464
ThrSer: 2.727 ± 0.293
ThrThr: 5.25 ± 0.565
ThrVal: 5.616 ± 0.477
ThrTrp: 1.302 ± 0.233
ThrTyr: 1.384 ± 0.216
ThrXaa: 0.0 ± 0.0

Val
ValAla: 9.238 ± 0.817
ValCys: 1.14 ± 0.206
ValAsp: 4.314 ± 0.426
ValGlu: 5.209 ± 0.557
ValPhe: 2.442 ± 0.295
ValGly: 6.389 ± 0.592
ValHis: 1.994 ± 0.297
ValIle: 3.052 ± 0.301
ValLys: 2.279 ± 0.312
ValLeu: 5.657 ± 0.422
ValMet: 1.953 ± 0.272
ValAsn: 2.523 ± 0.294
ValPro: 4.762 ± 0.432
ValGln: 2.523 ± 0.358
ValArg: 4.11 ± 0.466
ValSer: 3.052 ± 0.415
ValThr: 5.331 ± 0.432
ValVal: 6.959 ± 0.632
ValTrp: 1.628 ± 0.273
ValTyr: 1.669 ± 0.252
ValXaa: 0.0 ± 0.0

Trp
TrpAla: 1.546 ± 0.248
TrpCys: 0.326 ± 0.117
TrpAsp: 1.709 ± 0.258
TrpGlu: 0.855 ± 0.158
TrpPhe: 0.733 ± 0.146
TrpGly: 1.424 ± 0.279
TrpHis: 0.936 ± 0.223
TrpIle: 0.529 ± 0.146
TrpLys: 0.529 ± 0.159
TrpLeu: 2.076 ± 0.285
TrpMet: 0.61 ± 0.165
TrpAsn: 0.651 ± 0.193
TrpPro: 0.977 ± 0.174
TrpGln: 0.936 ± 0.195
TrpArg: 1.913 ± 0.254
TrpSer: 1.384 ± 0.23
TrpThr: 1.384 ± 0.238
TrpVal: 2.076 ± 0.317
TrpTrp: 0.529 ± 0.142
TrpTyr: 0.285 ± 0.092
TrpXaa: 0.0 ± 0.0

Tyr
TyrAla: 2.483 ± 0.271
TyrCys: 0.529 ± 0.154
TyrAsp: 1.465 ± 0.247
TyrGlu: 1.18 ± 0.241
TyrPhe: 0.326 ± 0.12
TyrGly: 2.767 ± 0.393
TyrHis: 0.57 ± 0.173
TyrIle: 0.814 ± 0.199
TyrLys: 0.488 ± 0.126
TyrLeu: 2.076 ± 0.256
TyrMet: 0.326 ± 0.103
TyrAsn: 0.773 ± 0.189
TyrPro: 1.302 ± 0.22
TyrGln: 1.14 ± 0.179
TyrArg: 2.727 ± 0.329
TyrSer: 1.465 ± 0.196
TyrThr: 1.75 ± 0.285
TyrVal: 1.831 ± 0.243
TyrTrp: 0.651 ± 0.161
TyrTyr: 0.407 ± 0.112
TyrXaa: 0.0 ± 0.0

Xaa
XaaAla: 0.0 ± 0.0
XaaCys: 0.0 ± 0.0
XaaAsp: 0.0 ± 0.0
XaaGlu: 0.0 ± 0.0
XaaPhe: 0.0 ± 0.0
XaaGly: 0.0 ± 0.0
XaaHis: 0.0 ± 0.0
XaaIle: 0.0 ± 0.0
XaaLys: 0.0 ± 0.0
XaaLeu: 0.0 ± 0.0
XaaMet: 0.0 ± 0.0
XaaAsn: 0.0 ± 0.0
XaaPro: 0.0 ± 0.0
XaaGln: 0.0 ± 0.0
XaaArg: 0.0 ± 0.0
XaaSer: 0.0 ± 0.0
XaaThr: 0.0 ± 0.0
XaaVal: 0.0 ± 0.0
XaaTrp: 0.0 ± 0.0
XaaTyr: 0.0 ± 0.0
XaaXaa: 0.0 ± 0.0

Statistics based on 135 proteins (24573 amino acids)

Note: The error was estimated by bootstrapping (100 replicates) at the protein level.
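A protein-level bootstrap of this kind can be sketched as follows. This is a hedged illustration rather than the database's actual procedure: the function names, the use of one-letter sequences, the fixed random seed, and the choice of the sample standard deviation of the replicates as the reported error are all assumptions.

```python
import random
import statistics


def dipeptide_freq(proteins, pair):
    """Per mille frequency of one dipeptide, counted in overlapping windows."""
    hits = total = 0
    for seq in proteins:
        for i in range(len(seq) - 1):
            total += 1
            hits += seq[i:i + 2] == pair
    return 1000.0 * hits / total if total else 0.0


def bootstrap_error(proteins, pair, replicates=100, seed=0):
    """Spread of the frequency when whole proteins are resampled."""
    rng = random.Random(seed)  # fixed seed only to make the sketch repeatable
    estimates = []
    for _ in range(replicates):
        # Draw as many proteins as in the original set, with replacement.
        sample = rng.choices(proteins, k=len(proteins))
        estimates.append(dipeptide_freq(sample, pair))
    # Assumption: the error is the standard deviation across replicates.
    return statistics.stdev(estimates)


toy_proteome = ["MAAK", "AAG", "MKKL"]
error = bootstrap_error(toy_proteome, "AA")
```

Resampling whole proteins rather than individual dipeptides keeps each protein's internal composition intact, so the estimated error reflects variation between proteins.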

The dipeptide statistics above (among other statistics for this proteome) can be downloaded from this CSV file
See this proteome in: uniprot_link
Proteome-pI is available under the Creative Commons Attribution-NoDerivs license; for more details see here

Reference: Kozlowski LP. Proteome-pI 2.0: Proteome Isoelectric Point Database Update. Nucleic Acids Res. 2021, doi: 10.1093/nar/gkab944

Contact: Lukasz P. Kozlowski