Amino acid dipeptide frequency for Mycobacterium phage Joselito

Apart from single amino acid frequencies, one can also calculate the so-called amino acid dipeptide frequency.

There are 400 possible dipeptides (441 if we consider Xaa as an additional 21st amino acid representing all non-standard or unknown amino acids). Thus, if dipeptides occurred randomly in proteins and all amino acids were equally likely, each of the 400 dipeptides would be present with a frequency of 0.25% (2.5‰). As this is not the case in nature, for better visibility all dipeptides that are more frequent than expected are marked in red, and those that are underrepresented are marked in blue in the table.
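The calculation described above can be sketched as follows. This is a minimal illustration, not the Proteome-pI implementation: the function names, the toy sequences, the use of one-letter codes, and the choice to count overlapping pairs within each protein are all assumptions made for the example.

```python
from collections import Counter

STANDARD = "ACDEFGHIKLMNPQRSTVWY"  # 20 standard amino acids (one-letter codes)
UNIFORM_EXPECTATION = 1000.0 / 400  # 2.5 per mille if all 400 dipeptides were equally likely


def dipeptide_permille(proteins):
    """Return {dipeptide: frequency in per mille} over all given sequences.

    Assumption: overlapping pairs are counted inside each protein, and a pair
    never spans two proteins.
    """
    counts = Counter()
    for seq in proteins:
        for a, b in zip(seq, seq[1:]):
            # Any letter outside the 20 standard ones is treated as Xaa ("X").
            a = a if a in STANDARD else "X"
            b = b if b in STANDARD else "X"
            counts[a + b] += 1
    total = sum(counts.values())
    if total == 0:
        return {}
    return {dp: 1000.0 * n / total for dp, n in counts.items()}


def classify(freq_permille):
    """Label a frequency relative to the uniform 2.5 per mille expectation."""
    if freq_permille > UNIFORM_EXPECTATION:
        return "over"
    if freq_permille < UNIFORM_EXPECTATION:
        return "under"
    return "as expected"


# Hypothetical toy input, not taken from the phage proteome.
toy_proteins = ["MAAGL", "MAAK"]
freqs = dipeptide_permille(toy_proteins)
```

With values from the table, `classify(11.712)` (AlaAla) gives "over" and `classify(0.854)` (AlaCys) gives "under", which corresponds to the red and blue marking described above.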

All values are given in per mille (‰), so they need to be multiplied by 10^-3 to obtain fractions.
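As a small worked example of the unit conversion, the sketch below turns the AlaAla value from the table into a fraction and a percentage. The helper name is hypothetical and only serves the illustration.

```python
def permille_to_fraction(value_permille):
    """Convert a per mille value to a plain fraction (multiply by 10^-3)."""
    return value_permille * 1e-3


ala_ala = 11.712  # AlaAla value from the table, in per mille
fraction = permille_to_fraction(ala_ala)
percent = fraction * 100
print(f"{ala_ala} per mille = {fraction:.6f} = {percent:.4f}%")
```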

For more information, see the sequence space article on Wikipedia.

Ala Cys Asp Glu Phe Gly His Ile Lys Leu Met Asn Pro Gln Arg Ser Thr Val Trp Tyr Xaa

Each entry: dipeptide (first residue, second residue): frequency ± error (‰)

Ala
AlaAla: 11.712 ± 1.178
AlaCys: 0.854 ± 0.227
AlaAsp: 6.222 ± 0.579
AlaGlu: 7.686 ± 0.711
AlaPhe: 3.416 ± 0.39
AlaGly: 7.381 ± 0.916
AlaHis: 1.403 ± 0.234
AlaIle: 5.246 ± 0.532
AlaLys: 3.904 ± 0.604
AlaLeu: 9.089 ± 1.095
AlaMet: 2.501 ± 0.446
AlaAsn: 3.233 ± 0.488
AlaPro: 4.697 ± 0.579
AlaGln: 3.05 ± 0.489
AlaArg: 6.039 ± 0.612
AlaSer: 4.209 ± 0.427
AlaThr: 5.307 ± 0.648
AlaVal: 6.832 ± 0.537
AlaTrp: 1.83 ± 0.281
AlaTyr: 2.562 ± 0.444
AlaXaa: 0.0 ± 0.0
Cys
CysAla: 0.61 ± 0.186
CysCys: 0.0 ± 0.0
CysAsp: 0.61 ± 0.18
CysGlu: 0.549 ± 0.23
CysPhe: 0.305 ± 0.15
CysGly: 1.159 ± 0.268
CysHis: 0.183 ± 0.089
CysIle: 0.427 ± 0.155
CysLys: 0.183 ± 0.094
CysLeu: 0.793 ± 0.228
CysMet: 0.427 ± 0.174
CysAsn: 0.305 ± 0.132
CysPro: 0.488 ± 0.188
CysGln: 0.366 ± 0.133
CysArg: 0.854 ± 0.224
CysSer: 0.427 ± 0.132
CysThr: 0.366 ± 0.129
CysVal: 0.61 ± 0.187
CysTrp: 0.305 ± 0.156
CysTyr: 0.305 ± 0.137
CysXaa: 0.0 ± 0.0
Asp
AspAla: 5.978 ± 0.647
AspCys: 0.61 ± 0.221
AspAsp: 3.66 ± 0.497
AspGlu: 4.758 ± 0.556
AspPhe: 2.623 ± 0.484
AspGly: 5.124 ± 0.647
AspHis: 1.952 ± 0.438
AspIle: 3.294 ± 0.477
AspLys: 3.05 ± 0.381
AspLeu: 5.246 ± 0.58
AspMet: 1.586 ± 0.261
AspAsn: 1.647 ± 0.259
AspPro: 4.697 ± 0.533
AspGln: 2.013 ± 0.286
AspArg: 4.331 ± 0.484
AspSer: 2.806 ± 0.467
AspThr: 3.782 ± 0.553
AspVal: 4.88 ± 0.602
AspTrp: 0.915 ± 0.23
AspTyr: 2.196 ± 0.39
AspXaa: 0.0 ± 0.0
Glu
GluAla: 7.686 ± 0.698
GluCys: 0.427 ± 0.174
GluAsp: 4.087 ± 0.484
GluGlu: 6.344 ± 0.703
GluPhe: 2.684 ± 0.379
GluGly: 6.283 ± 0.59
GluHis: 1.891 ± 0.352
GluIle: 4.148 ± 0.501
GluLys: 2.318 ± 0.346
GluLeu: 7.625 ± 0.669
GluMet: 2.135 ± 0.332
GluAsn: 2.074 ± 0.292
GluPro: 2.44 ± 0.381
GluGln: 2.135 ± 0.294
GluArg: 4.575 ± 0.512
GluSer: 2.623 ± 0.449
GluThr: 3.782 ± 0.412
GluVal: 5.063 ± 0.573
GluTrp: 1.525 ± 0.25
GluTyr: 2.44 ± 0.396
GluXaa: 0.0 ± 0.0
Phe
PheAla: 2.501 ± 0.385
PheCys: 0.366 ± 0.144
PheAsp: 2.745 ± 0.431
PheGlu: 2.684 ± 0.397
PhePhe: 0.732 ± 0.168
PheGly: 2.623 ± 0.403
PheHis: 0.549 ± 0.183
PheIle: 1.525 ± 0.29
PheLys: 1.342 ± 0.283
PheLeu: 2.562 ± 0.426
PheMet: 0.488 ± 0.155
PheAsn: 1.708 ± 0.312
PhePro: 2.562 ± 0.336
PheGln: 1.037 ± 0.237
PheArg: 2.318 ± 0.412
PheSer: 2.318 ± 0.382
PheThr: 1.525 ± 0.305
PheVal: 2.562 ± 0.336
PheTrp: 0.427 ± 0.183
PheTyr: 0.671 ± 0.187
PheXaa: 0.0 ± 0.0
Gly
GlyAla: 6.344 ± 0.804
GlyCys: 0.61 ± 0.19
GlyAsp: 5.734 ± 0.653
GlyGlu: 5.063 ± 0.519
GlyPhe: 2.623 ± 0.347
GlyGly: 8.54 ± 1.542
GlyHis: 2.074 ± 0.331
GlyIle: 4.209 ± 0.526
GlyLys: 3.782 ± 0.56
GlyLeu: 6.161 ± 0.734
GlyMet: 1.647 ± 0.282
GlyAsn: 2.867 ± 0.411
GlyPro: 3.843 ± 0.603
GlyGln: 3.233 ± 0.53
GlyArg: 4.087 ± 0.562
GlySer: 4.148 ± 0.468
GlyThr: 4.697 ± 0.519
GlyVal: 5.917 ± 0.678
GlyTrp: 2.013 ± 0.342
GlyTyr: 2.562 ± 0.385
GlyXaa: 0.0 ± 0.0
His
HisAla: 1.22 ± 0.261
HisCys: 0.305 ± 0.126
HisAsp: 1.586 ± 0.397
HisGlu: 1.708 ± 0.334
HisPhe: 0.488 ± 0.206
HisGly: 1.83 ± 0.297
HisHis: 0.793 ± 0.204
HisIle: 1.098 ± 0.285
HisLys: 1.037 ± 0.273
HisLeu: 1.708 ± 0.33
HisMet: 0.549 ± 0.177
HisAsn: 0.549 ± 0.154
HisPro: 1.037 ± 0.256
HisGln: 1.22 ± 0.292
HisArg: 1.83 ± 0.391
HisSer: 0.793 ± 0.195
HisThr: 0.976 ± 0.216
HisVal: 1.647 ± 0.343
HisTrp: 0.427 ± 0.144
HisTyr: 0.549 ± 0.239
HisXaa: 0.0 ± 0.0
Ile
IleAla: 4.453 ± 0.468
IleCys: 0.549 ± 0.152
IleAsp: 3.66 ± 0.405
IleGlu: 5.429 ± 0.548
IlePhe: 1.22 ± 0.221
IleGly: 3.721 ± 0.468
IleHis: 0.976 ± 0.218
IleIle: 1.952 ± 0.424
IleLys: 2.257 ± 0.36
IleLeu: 3.416 ± 0.447
IleMet: 0.915 ± 0.232
IleAsn: 2.44 ± 0.389
IlePro: 3.355 ± 0.412
IleGln: 1.464 ± 0.303
IleArg: 3.843 ± 0.472
IleSer: 2.745 ± 0.507
IleThr: 3.477 ± 0.447
IleVal: 3.538 ± 0.386
IleTrp: 0.854 ± 0.184
IleTyr: 0.854 ± 0.24
IleXaa: 0.0 ± 0.0
Lys
LysAla: 4.453 ± 0.742
LysCys: 0.366 ± 0.148
LysAsp: 2.928 ± 0.358
LysGlu: 2.745 ± 0.411
LysPhe: 1.159 ± 0.252
LysGly: 3.294 ± 0.461
LysHis: 0.976 ± 0.246
LysIle: 1.83 ± 0.265
LysLys: 2.745 ± 0.516
LysLeu: 3.05 ± 0.436
LysMet: 1.037 ± 0.186
LysAsn: 1.769 ± 0.366
LysPro: 2.806 ± 0.582
LysGln: 1.83 ± 0.26
LysArg: 3.233 ± 0.455
LysSer: 2.501 ± 0.398
LysThr: 2.135 ± 0.348
LysVal: 4.331 ± 0.516
LysTrp: 0.732 ± 0.174
LysTyr: 1.159 ± 0.274
LysXaa: 0.0 ± 0.0
Leu
LeuAla: 8.052 ± 0.587
LeuCys: 0.793 ± 0.237
LeuAsp: 4.941 ± 0.478
LeuGlu: 6.161 ± 0.706
LeuPhe: 2.257 ± 0.389
LeuGly: 5.612 ± 0.532
LeuHis: 2.623 ± 0.446
LeuIle: 3.782 ± 0.449
LeuLys: 4.392 ± 0.761
LeuLeu: 4.758 ± 0.639
LeuMet: 2.257 ± 0.311
LeuAsn: 2.379 ± 0.409
LeuPro: 4.453 ± 0.458
LeuGln: 2.379 ± 0.311
LeuArg: 5.368 ± 0.579
LeuSer: 5.124 ± 0.732
LeuThr: 4.941 ± 0.481
LeuVal: 5.185 ± 0.495
LeuTrp: 1.952 ± 0.288
LeuTyr: 2.684 ± 0.393
LeuXaa: 0.0 ± 0.0
Met
MetAla: 2.257 ± 0.32
MetCys: 0.122 ± 0.092
MetAsp: 0.732 ± 0.193
MetGlu: 1.159 ± 0.198
MetPhe: 0.671 ± 0.184
MetGly: 1.586 ± 0.28
MetHis: 0.427 ± 0.144
MetIle: 1.525 ± 0.284
MetLys: 1.769 ± 0.343
MetLeu: 2.013 ± 0.332
MetMet: 0.366 ± 0.149
MetAsn: 0.671 ± 0.188
MetPro: 1.22 ± 0.322
MetGln: 0.732 ± 0.208
MetArg: 1.403 ± 0.285
MetSer: 2.501 ± 0.333
MetThr: 2.135 ± 0.273
MetVal: 1.098 ± 0.294
MetTrp: 0.427 ± 0.157
MetTyr: 0.488 ± 0.182
MetXaa: 0.0 ± 0.0
Asn
AsnAla: 3.05 ± 0.475
AsnCys: 0.427 ± 0.167
AsnAsp: 1.83 ± 0.283
AsnGlu: 2.196 ± 0.409
AsnPhe: 1.22 ± 0.287
AsnGly: 3.355 ± 0.354
AsnHis: 0.915 ± 0.218
AsnIle: 1.586 ± 0.251
AsnLys: 0.976 ± 0.244
AsnLeu: 3.416 ± 0.387
AsnMet: 0.427 ± 0.169
AsnAsn: 0.854 ± 0.241
AsnPro: 2.562 ± 0.38
AsnGln: 1.22 ± 0.239
AsnArg: 1.952 ± 0.284
AsnSer: 1.586 ± 0.239
AsnThr: 2.013 ± 0.372
AsnVal: 2.562 ± 0.351
AsnTrp: 0.793 ± 0.199
AsnTyr: 0.793 ± 0.175
AsnXaa: 0.0 ± 0.0
Pro
ProAla: 5.002 ± 0.542
ProCys: 0.549 ± 0.174
ProAsp: 4.148 ± 0.453
ProGlu: 4.636 ± 0.578
ProPhe: 1.952 ± 0.357
ProGly: 4.026 ± 0.498
ProHis: 0.793 ± 0.187
ProIle: 2.074 ± 0.359
ProLys: 2.745 ± 0.551
ProLeu: 3.538 ± 0.476
ProMet: 1.098 ± 0.24
ProAsn: 2.623 ± 0.371
ProPro: 2.196 ± 0.373
ProGln: 2.379 ± 0.474
ProArg: 3.294 ± 0.508
ProSer: 3.111 ± 0.381
ProThr: 3.599 ± 0.482
ProVal: 3.965 ± 0.45
ProTrp: 1.281 ± 0.357
ProTyr: 1.403 ± 0.257
ProXaa: 0.0 ± 0.0
Gln
GlnAla: 3.904 ± 0.554
GlnCys: 0.244 ± 0.13
GlnAsp: 2.074 ± 0.329
GlnGlu: 2.074 ± 0.317
GlnPhe: 1.281 ± 0.224
GlnGly: 3.416 ± 0.483
GlnHis: 0.671 ± 0.202
GlnIle: 2.684 ± 0.393
GlnLys: 1.647 ± 0.269
GlnLeu: 3.599 ± 0.423
GlnMet: 0.732 ± 0.194
GlnAsn: 0.366 ± 0.156
GlnPro: 1.891 ± 0.434
GlnGln: 1.22 ± 0.362
GlnArg: 2.684 ± 0.37
GlnSer: 1.708 ± 0.299
GlnThr: 1.769 ± 0.269
GlnVal: 2.196 ± 0.325
GlnTrp: 0.793 ± 0.285
GlnTyr: 1.22 ± 0.251
GlnXaa: 0.0 ± 0.0
Arg
ArgAla: 6.466 ± 0.782
ArgCys: 0.915 ± 0.283
ArgAsp: 4.209 ± 0.533
ArgGlu: 4.209 ± 0.53
ArgPhe: 2.257 ± 0.434
ArgGly: 4.575 ± 0.614
ArgHis: 1.403 ± 0.283
ArgIle: 3.721 ± 0.464
ArgLys: 3.111 ± 0.523
ArgLeu: 5.49 ± 0.551
ArgMet: 1.891 ± 0.293
ArgAsn: 1.769 ± 0.393
ArgPro: 3.172 ± 0.48
ArgGln: 2.379 ± 0.355
ArgArg: 6.283 ± 0.826
ArgSer: 2.928 ± 0.368
ArgThr: 2.684 ± 0.344
ArgVal: 5.307 ± 0.548
ArgTrp: 1.403 ± 0.294
ArgTyr: 2.074 ± 0.282
ArgXaa: 0.0 ± 0.0
Ser
SerAla: 5.124 ± 0.673
SerCys: 0.549 ± 0.141
SerAsp: 3.05 ± 0.397
SerGlu: 3.233 ± 0.478
SerPhe: 1.769 ± 0.318
SerGly: 4.331 ± 0.569
SerHis: 0.854 ± 0.234
SerIle: 2.257 ± 0.347
SerLys: 1.891 ± 0.418
SerLeu: 3.782 ± 0.567
SerMet: 1.586 ± 0.276
SerAsn: 1.647 ± 0.318
SerPro: 3.477 ± 0.487
SerGln: 2.44 ± 0.413
SerArg: 3.477 ± 0.49
SerSer: 2.562 ± 0.51
SerThr: 2.867 ± 0.507
SerVal: 3.843 ± 0.524
SerTrp: 1.342 ± 0.297
SerTyr: 1.403 ± 0.247
SerXaa: 0.0 ± 0.0
Thr
ThrAla: 6.039 ± 0.515
ThrCys: 0.427 ± 0.125
ThrAsp: 3.965 ± 0.49
ThrGlu: 3.233 ± 0.368
ThrPhe: 2.196 ± 0.334
ThrGly: 5.49 ± 0.661
ThrHis: 0.854 ± 0.221
ThrIle: 2.562 ± 0.373
ThrLys: 2.44 ± 0.391
ThrLeu: 4.453 ± 0.467
ThrMet: 1.22 ± 0.267
ThrAsn: 1.891 ± 0.299
ThrPro: 3.904 ± 0.574
ThrGln: 2.257 ± 0.327
ThrArg: 3.05 ± 0.355
ThrSer: 2.745 ± 0.514
ThrThr: 3.538 ± 0.52
ThrVal: 3.843 ± 0.444
ThrTrp: 0.671 ± 0.2
ThrTyr: 2.074 ± 0.341
ThrXaa: 0.0 ± 0.0
Val
ValAla: 7.564 ± 0.694
ValCys: 0.549 ± 0.193
ValAsp: 5.246 ± 0.616
ValGlu: 5.002 ± 0.569
ValPhe: 3.05 ± 0.459
ValGly: 4.453 ± 0.548
ValHis: 0.915 ± 0.202
ValIle: 4.148 ± 0.437
ValLys: 3.355 ± 0.438
ValLeu: 5.124 ± 0.594
ValMet: 1.525 ± 0.304
ValAsn: 3.355 ± 0.425
ValPro: 3.294 ± 0.48
ValGln: 2.562 ± 0.323
ValArg: 4.514 ± 0.56
ValSer: 3.782 ± 0.531
ValThr: 4.697 ± 0.627
ValVal: 5.673 ± 0.603
ValTrp: 1.464 ± 0.263
ValTyr: 2.806 ± 0.47
ValXaa: 0.0 ± 0.0
Trp
TrpAla: 1.891 ± 0.314
TrpCys: 0.366 ± 0.145
TrpAsp: 1.403 ± 0.225
TrpGlu: 1.586 ± 0.241
TrpPhe: 0.671 ± 0.184
TrpGly: 1.22 ± 0.291
TrpHis: 0.61 ± 0.189
TrpIle: 1.098 ± 0.237
TrpLys: 0.732 ± 0.218
TrpLeu: 1.464 ± 0.323
TrpMet: 0.488 ± 0.202
TrpAsn: 0.671 ± 0.238
TrpPro: 1.037 ± 0.273
TrpGln: 1.037 ± 0.196
TrpArg: 1.159 ± 0.227
TrpSer: 1.342 ± 0.215
TrpThr: 0.976 ± 0.237
TrpVal: 1.22 ± 0.291
TrpTrp: 0.488 ± 0.167
TrpTyr: 0.61 ± 0.169
TrpXaa: 0.0 ± 0.0
Tyr
TyrAla: 2.867 ± 0.367
TyrCys: 0.244 ± 0.124
TyrAsp: 2.318 ± 0.328
TyrGlu: 1.83 ± 0.322
TyrPhe: 0.671 ± 0.207
TyrGly: 2.135 ± 0.34
TyrHis: 0.488 ± 0.185
TyrIle: 1.952 ± 0.277
TyrLys: 1.403 ± 0.25
TyrLeu: 2.806 ± 0.393
TyrMet: 0.366 ± 0.143
TyrAsn: 0.915 ± 0.215
TyrPro: 1.281 ± 0.268
TyrGln: 1.22 ± 0.283
TyrArg: 1.952 ± 0.381
TyrSer: 1.586 ± 0.314
TyrThr: 1.586 ± 0.264
TyrVal: 2.806 ± 0.448
TyrTrp: 0.427 ± 0.137
TyrTyr: 0.915 ± 0.266
TyrXaa: 0.0 ± 0.0
Xaa
XaaAla: 0.0 ± 0.0
XaaCys: 0.0 ± 0.0
XaaAsp: 0.0 ± 0.0
XaaGlu: 0.0 ± 0.0
XaaPhe: 0.0 ± 0.0
XaaGly: 0.0 ± 0.0
XaaHis: 0.0 ± 0.0
XaaIle: 0.0 ± 0.0
XaaLys: 0.0 ± 0.0
XaaLeu: 0.0 ± 0.0
XaaMet: 0.0 ± 0.0
XaaAsn: 0.0 ± 0.0
XaaPro: 0.0 ± 0.0
XaaGln: 0.0 ± 0.0
XaaArg: 0.0 ± 0.0
XaaSer: 0.0 ± 0.0
XaaThr: 0.0 ± 0.0
XaaVal: 0.0 ± 0.0
XaaTrp: 0.0 ± 0.0
XaaTyr: 0.0 ± 0.0
XaaXaa: 0.0 ± 0.0
Statistics based on 97 proteins (16395 amino acids)

Note: The error was estimated by bootstrapping (x100) at the protein level.
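A protein-level bootstrap of this kind can be sketched as follows. This is only an illustration under stated assumptions: the note gives the number of replicates (100) and the resampling unit (proteins), while the use of the standard deviation across replicates as the error, the function names, the fixed seed, and the toy sequences are all assumptions introduced here.

```python
import random
import statistics


def frequency_permille(proteins, dipeptide):
    """Frequency of one dipeptide (one-letter codes) in per mille.

    Assumption: overlapping pairs are counted within each protein.
    """
    hits = 0
    total = 0
    for seq in proteins:
        for a, b in zip(seq, seq[1:]):
            total += 1
            if a + b == dipeptide:
                hits += 1
    return 1000.0 * hits / total if total else 0.0


def bootstrap_error(proteins, dipeptide, n_replicates=100, seed=0):
    """Resample whole proteins with replacement and return the standard
    deviation of the dipeptide frequency across replicates."""
    rng = random.Random(seed)
    estimates = []
    for _ in range(n_replicates):
        # Draw as many proteins as in the original set, with replacement.
        sample = rng.choices(proteins, k=len(proteins))
        estimates.append(frequency_permille(sample, dipeptide))
    return statistics.stdev(estimates)


# Hypothetical toy input, not taken from the phage proteome.
toy_proteins = ["MAAGL", "MAAK", "MKLV", "AAAA"]
error = bootstrap_error(toy_proteins, "AA")
```

Resampling whole proteins rather than individual residues keeps each sequence intact, so the estimate reflects variation between proteins in the proteome.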

You can download the dipeptide statistics above (along with other statistics for this proteome) from this CSV file
See this proteome in: uniprot_link
Proteome-pI is available under the Creative Commons Attribution-NoDerivs license; for more details see here

Reference: Kozlowski LP. Proteome-pI 2.0: Proteome Isoelectric Point Database Update. Nucleic Acids Res. 2021, doi: 10.1093/nar/gkab944
Contact: Lukasz P. Kozlowski