Amino acid dipeptide frequency for Mycobacterium phage Trouble

Apart from single amino acid frequencies, one can also calculate the so-called amino acid dipeptide frequency, i.e. how often each ordered pair of adjacent amino acids occurs.
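A minimal sketch of how such frequencies could be computed is given below. It assumes one-letter protein sequences, overlapping windows of length two that never span two proteins, and normalization by the total number of dipeptides counted; the page does not state the exact procedure, so these are assumptions. The function name and the toy sequences are hypothetical.

```python
from collections import Counter


def dipeptide_per_mille(proteins):
    """Return {dipeptide: frequency in per mille} over all proteins.

    Assumption: overlapping pairs are counted within each protein and
    the counts are divided by the total number of pairs.
    """
    counts = Counter()
    for seq in proteins:
        # Slide a window of length 2 along the sequence.
        counts.update(seq[i:i + 2] for i in range(len(seq) - 1))
    total = sum(counts.values())
    if total == 0:
        return {}
    return {pair: 1000.0 * n / total for pair, n in counts.items()}


# Made-up toy sequences, not taken from the proteome.
toy = ["MAAG", "MAG"]
freqs = dipeptide_per_mille(toy)
print(freqs)
```

With real data, the input would be the list of protein sequences of the proteome, and three-letter codes such as AlaAla correspond to one-letter pairs such as AA.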

There are 400 possible dipeptides (441 if Xaa is counted as an additional, 21st amino acid representing all non-standard or unknown amino acids). Thus, if dipeptides were distributed uniformly at random in proteins, each amino acid dipeptide would be present with a frequency of 0.25% (2.5‰). As this is not the case in nature, for better visibility all dipeptides that are more frequent than expected are marked in red in the table, and those that are underrepresented are marked in blue.
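The expected value and the over/under split described above can be sketched as follows. This assumes the uniform expectation of 1/400 (or 1/441 with Xaa) is the only threshold and that the reported error is not taken into account; the page does not say how the colouring is decided in detail. Function names are hypothetical, and the two example values are taken from the table (AlaAla and CysCys).

```python
def expected_per_mille(alphabet_size=20):
    """Uniform expectation for one dipeptide, in per mille."""
    return 1000.0 / alphabet_size ** 2


def classify(freq_per_mille, alphabet_size=20):
    """Label a frequency relative to the uniform expectation."""
    expected = expected_per_mille(alphabet_size)
    if freq_per_mille > expected:
        return "over"    # shown in red on the page
    if freq_per_mille < expected:
        return "under"   # shown in blue on the page
    return "as expected"


print(expected_per_mille(20))   # 400 possible dipeptides
print(expected_per_mille(21))   # 441 possible dipeptides, with Xaa
print(classify(11.85))          # AlaAla
print(classify(0.06))           # CysCys
```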

All values are given in per mille (‰) and therefore need to be multiplied by 10^-3 to obtain fractions.
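As a small illustration of the unit conversion, the sketch below turns per mille values into fractions and percentages. The helper names are hypothetical; the example value is the AlaAla entry from the table.

```python
def per_mille_to_fraction(value):
    """Convert a per mille value to a plain fraction (multiply by 10^-3)."""
    return value / 1000.0


def per_mille_to_percent(value):
    """Convert a per mille value to percent (divide by 10)."""
    return value / 10.0


ala_ala = 11.85  # AlaAla frequency in per mille, from the table
print(round(per_mille_to_fraction(ala_ala), 5))
print(round(per_mille_to_percent(ala_ala), 3))
```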

For more information, see the article on sequence space on Wikipedia.

Ala Cys Asp Glu Phe Gly His Ile Lys Leu Met Asn Pro Gln Arg Ser Thr Val Trp Tyr Xaa
Ala
AlaAla: 11.85 ± 1.43
AlaCys: 0.846 ± 0.261
AlaAsp: 6.651 ± 0.689
AlaGlu: 6.288 ± 0.731
AlaPhe: 2.66 ± 0.455
AlaGly: 7.376 ± 0.89
AlaHis: 1.693 ± 0.407
AlaIle: 4.172 ± 0.691
AlaLys: 3.688 ± 0.522
AlaLeu: 8.585 ± 0.82
AlaMet: 3.023 ± 0.477
AlaAsn: 2.418 ± 0.408
AlaPro: 5.26 ± 0.612
AlaGln: 2.963 ± 0.473
AlaArg: 6.288 ± 0.616
AlaSer: 4.958 ± 0.604
AlaThr: 5.683 ± 0.634
AlaVal: 8.162 ± 0.708
AlaTrp: 1.693 ± 0.403
AlaTyr: 2.66 ± 0.383
AlaXaa: 0.0 ± 0.0
Cys
CysAla: 0.846 ± 0.268
CysCys: 0.06 ± 0.072
CysAsp: 0.484 ± 0.186
CysGlu: 0.726 ± 0.203
CysPhe: 0.242 ± 0.129
CysGly: 0.423 ± 0.194
CysHis: 0.121 ± 0.086
CysIle: 0.181 ± 0.118
CysLys: 0.242 ± 0.17
CysLeu: 0.302 ± 0.167
CysMet: 0.121 ± 0.086
CysAsn: 0.242 ± 0.138
CysPro: 0.242 ± 0.115
CysGln: 0.181 ± 0.115
CysArg: 0.665 ± 0.256
CysSer: 0.484 ± 0.161
CysThr: 0.484 ± 0.209
CysVal: 0.423 ± 0.243
CysTrp: 0.363 ± 0.155
CysTyr: 0.242 ± 0.119
CysXaa: 0.0 ± 0.0
Asp
AspAla: 5.562 ± 0.666
AspCys: 0.423 ± 0.173
AspAsp: 4.232 ± 0.466
AspGlu: 3.748 ± 0.449
AspPhe: 2.721 ± 0.368
AspGly: 6.227 ± 0.682
AspHis: 0.967 ± 0.26
AspIle: 2.539 ± 0.418
AspLys: 2.842 ± 0.413
AspLeu: 6.892 ± 0.614
AspMet: 1.27 ± 0.258
AspAsn: 1.995 ± 0.307
AspPro: 4.958 ± 0.625
AspGln: 1.874 ± 0.442
AspArg: 3.567 ± 0.43
AspSer: 3.507 ± 0.429
AspThr: 4.111 ± 0.494
AspVal: 4.776 ± 0.503
AspTrp: 1.693 ± 0.325
AspTyr: 1.935 ± 0.317
AspXaa: 0.0 ± 0.0
Glu
GluAla: 5.865 ± 0.675
GluCys: 0.423 ± 0.186
GluAsp: 4.716 ± 0.617
GluGlu: 4.776 ± 0.594
GluPhe: 2.056 ± 0.368
GluGly: 3.809 ± 0.435
GluHis: 1.391 ± 0.295
GluIle: 3.083 ± 0.43
GluLys: 2.781 ± 0.492
GluLeu: 6.469 ± 0.6
GluMet: 1.753 ± 0.304
GluAsn: 1.753 ± 0.333
GluPro: 2.297 ± 0.391
GluGln: 2.116 ± 0.373
GluArg: 4.051 ± 0.562
GluSer: 3.507 ± 0.403
GluThr: 3.688 ± 0.503
GluVal: 5.26 ± 0.6
GluTrp: 1.33 ± 0.336
GluTyr: 2.237 ± 0.42
GluXaa: 0.0 ± 0.0
Phe
PheAla: 2.479 ± 0.321
PheCys: 0.242 ± 0.146
PheAsp: 2.721 ± 0.308
PheGlu: 1.814 ± 0.342
PhePhe: 0.484 ± 0.161
PheGly: 3.567 ± 0.533
PheHis: 0.846 ± 0.323
PheIle: 1.27 ± 0.284
PheLys: 1.27 ± 0.272
PheLeu: 2.539 ± 0.401
PheMet: 0.726 ± 0.191
PheAsn: 1.391 ± 0.288
PhePro: 1.451 ± 0.294
PheGln: 1.088 ± 0.259
PheArg: 1.995 ± 0.419
PheSer: 2.116 ± 0.526
PheThr: 2.237 ± 0.345
PheVal: 2.116 ± 0.344
PheTrp: 0.605 ± 0.174
PheTyr: 0.786 ± 0.198
PheXaa: 0.0 ± 0.0
Gly
GlyAla: 7.557 ± 1.417
GlyCys: 0.726 ± 0.177
GlyAsp: 5.683 ± 0.492
GlyGlu: 4.293 ± 0.495
GlyPhe: 3.144 ± 0.61
GlyGly: 10.157 ± 3.227
GlyHis: 1.995 ± 0.388
GlyIle: 4.776 ± 0.689
GlyLys: 4.051 ± 0.546
GlyLeu: 7.86 ± 0.832
GlyMet: 1.753 ± 0.377
GlyAsn: 3.144 ± 0.48
GlyPro: 3.809 ± 0.512
GlyGln: 2.358 ± 0.438
GlyArg: 5.018 ± 0.584
GlySer: 6.288 ± 0.806
GlyThr: 4.655 ± 0.556
GlyVal: 5.683 ± 0.621
GlyTrp: 2.6 ± 0.38
GlyTyr: 2.539 ± 0.348
GlyXaa: 0.0 ± 0.0
His
HisAla: 1.572 ± 0.351
HisCys: 0.121 ± 0.115
HisAsp: 1.209 ± 0.261
HisGlu: 1.451 ± 0.314
HisPhe: 0.665 ± 0.184
HisGly: 1.632 ± 0.332
HisHis: 0.726 ± 0.23
HisIle: 1.028 ± 0.213
HisLys: 1.149 ± 0.283
HisLeu: 1.753 ± 0.413
HisMet: 0.302 ± 0.18
HisAsn: 0.423 ± 0.176
HisPro: 1.33 ± 0.267
HisGln: 1.088 ± 0.247
HisArg: 1.632 ± 0.369
HisSer: 0.786 ± 0.215
HisThr: 0.846 ± 0.231
HisVal: 1.511 ± 0.316
HisTrp: 0.605 ± 0.174
HisTyr: 0.544 ± 0.183
HisXaa: 0.0 ± 0.0
Ile
IleAla: 6.409 ± 0.811
IleCys: 0.302 ± 0.127
IleAsp: 3.446 ± 0.358
IleGlu: 3.748 ± 0.477
IlePhe: 0.846 ± 0.261
IleGly: 3.869 ± 0.466
IleHis: 1.088 ± 0.278
IleIle: 1.753 ± 0.334
IleLys: 1.874 ± 0.328
IleLeu: 3.204 ± 0.422
IleMet: 0.665 ± 0.167
IleAsn: 1.511 ± 0.31
IlePro: 3.325 ± 0.36
IleGln: 1.693 ± 0.419
IleArg: 3.265 ± 0.428
IleSer: 3.325 ± 0.432
IleThr: 2.902 ± 0.482
IleVal: 2.963 ± 0.492
IleTrp: 0.846 ± 0.196
IleTyr: 1.451 ± 0.258
IleXaa: 0.0 ± 0.0
Lys
LysAla: 3.99 ± 0.513
LysCys: 0.242 ± 0.107
LysAsp: 2.358 ± 0.433
LysGlu: 2.056 ± 0.419
LysPhe: 1.511 ± 0.307
LysGly: 2.6 ± 0.423
LysHis: 1.27 ± 0.315
LysIle: 2.66 ± 0.436
LysLys: 2.177 ± 0.52
LysLeu: 3.023 ± 0.49
LysMet: 0.846 ± 0.163
LysAsn: 1.511 ± 0.321
LysPro: 2.721 ± 0.414
LysGln: 1.995 ± 0.371
LysArg: 2.902 ± 0.484
LysSer: 2.66 ± 0.424
LysThr: 2.177 ± 0.321
LysVal: 3.204 ± 0.529
LysTrp: 0.846 ± 0.236
LysTyr: 1.149 ± 0.299
LysXaa: 0.0 ± 0.0
Leu
LeuAla: 9.734 ± 0.866
LeuCys: 0.363 ± 0.171
LeuAsp: 6.227 ± 0.591
LeuGlu: 5.26 ± 0.562
LeuPhe: 2.177 ± 0.396
LeuGly: 7.981 ± 0.836
LeuHis: 1.33 ± 0.3
LeuIle: 4.655 ± 0.627
LeuLys: 3.628 ± 0.485
LeuLeu: 5.623 ± 0.48
LeuMet: 1.693 ± 0.31
LeuAsn: 3.023 ± 0.409
LeuPro: 5.381 ± 0.631
LeuGln: 2.66 ± 0.414
LeuArg: 6.227 ± 0.792
LeuSer: 5.562 ± 0.624
LeuThr: 5.925 ± 0.53
LeuVal: 4.897 ± 0.613
LeuTrp: 1.028 ± 0.318
LeuTyr: 2.358 ± 0.416
LeuXaa: 0.0 ± 0.0
Met
MetAla: 2.66 ± 0.333
MetCys: 0.06 ± 0.063
MetAsp: 1.27 ± 0.301
MetGlu: 1.391 ± 0.261
MetPhe: 0.605 ± 0.165
MetGly: 1.33 ± 0.28
MetHis: 0.423 ± 0.198
MetIle: 0.726 ± 0.237
MetLys: 1.028 ± 0.264
MetLeu: 1.451 ± 0.33
MetMet: 0.242 ± 0.125
MetAsn: 1.028 ± 0.215
MetPro: 1.209 ± 0.269
MetGln: 0.605 ± 0.152
MetArg: 1.209 ± 0.353
MetSer: 2.479 ± 0.504
MetThr: 1.995 ± 0.327
MetVal: 1.028 ± 0.282
MetTrp: 0.181 ± 0.111
MetTyr: 0.423 ± 0.166
MetXaa: 0.0 ± 0.0
Asn
AsnAla: 3.144 ± 0.466
AsnCys: 0.121 ± 0.092
AsnAsp: 1.995 ± 0.382
AsnGlu: 1.511 ± 0.296
AsnPhe: 0.786 ± 0.238
AsnGly: 3.507 ± 0.445
AsnHis: 0.726 ± 0.181
AsnIle: 1.572 ± 0.271
AsnLys: 0.423 ± 0.134
AsnLeu: 2.66 ± 0.379
AsnMet: 0.665 ± 0.185
AsnAsn: 0.786 ± 0.218
AsnPro: 2.902 ± 0.421
AsnGln: 1.028 ± 0.236
AsnArg: 1.33 ± 0.364
AsnSer: 2.116 ± 0.417
AsnThr: 1.935 ± 0.331
AsnVal: 2.539 ± 0.41
AsnTrp: 0.665 ± 0.185
AsnTyr: 1.27 ± 0.313
AsnXaa: 0.0 ± 0.0
Pro
ProAla: 5.139 ± 0.577
ProCys: 0.484 ± 0.225
ProAsp: 4.655 ± 0.524
ProGlu: 4.474 ± 0.551
ProPhe: 2.177 ± 0.456
ProGly: 4.776 ± 0.665
ProHis: 1.028 ± 0.25
ProIle: 2.237 ± 0.405
ProLys: 2.056 ± 0.272
ProLeu: 4.474 ± 0.558
ProMet: 0.967 ± 0.264
ProAsn: 1.33 ± 0.3
ProPro: 3.023 ± 0.477
ProGln: 1.27 ± 0.307
ProArg: 3.023 ± 0.482
ProSer: 3.809 ± 0.506
ProThr: 3.93 ± 0.566
ProVal: 3.688 ± 0.444
ProTrp: 0.846 ± 0.272
ProTyr: 1.511 ± 0.373
ProXaa: 0.0 ± 0.0
Gln
GlnAla: 2.781 ± 0.471
GlnCys: 0.121 ± 0.088
GlnAsp: 1.391 ± 0.411
GlnGlu: 1.572 ± 0.313
GlnPhe: 1.149 ± 0.247
GlnGly: 2.6 ± 0.452
GlnHis: 0.726 ± 0.217
GlnIle: 3.023 ± 0.551
GlnLys: 1.27 ± 0.29
GlnLeu: 3.567 ± 0.498
GlnMet: 0.846 ± 0.232
GlnAsn: 0.544 ± 0.164
GlnPro: 1.572 ± 0.301
GlnGln: 1.753 ± 0.383
GlnArg: 1.995 ± 0.399
GlnSer: 1.874 ± 0.322
GlnThr: 1.693 ± 0.315
GlnVal: 2.781 ± 0.402
GlnTrp: 0.605 ± 0.16
GlnTyr: 0.665 ± 0.206
GlnXaa: 0.0 ± 0.0
Arg
ArgAla: 5.441 ± 0.689
ArgCys: 1.028 ± 0.35
ArgAsp: 3.144 ± 0.411
ArgGlu: 4.595 ± 0.65
ArgPhe: 1.935 ± 0.351
ArgGly: 5.26 ± 0.725
ArgHis: 1.209 ± 0.303
ArgIle: 3.083 ± 0.435
ArgLys: 3.265 ± 0.584
ArgLeu: 6.348 ± 0.676
ArgMet: 1.874 ± 0.379
ArgAsn: 2.66 ± 0.557
ArgPro: 2.358 ± 0.392
ArgGln: 1.874 ± 0.31
ArgArg: 5.381 ± 0.749
ArgSer: 4.232 ± 0.6
ArgThr: 3.083 ± 0.4
ArgVal: 5.018 ± 0.631
ArgTrp: 1.209 ± 0.29
ArgTyr: 1.632 ± 0.312
ArgXaa: 0.0 ± 0.0
Ser
SerAla: 6.227 ± 0.806
SerCys: 0.484 ± 0.182
SerAsp: 3.869 ± 0.381
SerGlu: 3.809 ± 0.485
SerPhe: 2.056 ± 0.375
SerGly: 7.134 ± 0.779
SerHis: 1.451 ± 0.257
SerIle: 2.842 ± 0.432
SerLys: 2.297 ± 0.424
SerLeu: 5.562 ± 0.612
SerMet: 1.511 ± 0.326
SerAsn: 2.297 ± 0.424
SerPro: 3.023 ± 0.487
SerGln: 2.056 ± 0.301
SerArg: 3.325 ± 0.464
SerSer: 4.051 ± 0.602
SerThr: 3.446 ± 0.482
SerVal: 3.748 ± 0.398
SerTrp: 1.33 ± 0.295
SerTyr: 1.753 ± 0.382
SerXaa: 0.0 ± 0.0
Thr
ThrAla: 5.562 ± 0.721
ThrCys: 0.302 ± 0.184
ThrAsp: 3.628 ± 0.547
ThrGlu: 3.748 ± 0.436
ThrPhe: 2.479 ± 0.363
ThrGly: 6.953 ± 0.68
ThrHis: 0.846 ± 0.243
ThrIle: 2.479 ± 0.532
ThrLys: 2.781 ± 0.369
ThrLeu: 5.441 ± 0.588
ThrMet: 0.907 ± 0.22
ThrAsn: 1.814 ± 0.347
ThrPro: 3.93 ± 0.519
ThrGln: 1.572 ± 0.32
ThrArg: 3.628 ± 0.584
ThrSer: 3.809 ± 0.498
ThrThr: 4.414 ± 0.556
ThrVal: 5.079 ± 0.532
ThrTrp: 1.209 ± 0.245
ThrTyr: 1.935 ± 0.324
ThrXaa: 0.0 ± 0.0
Val
ValAla: 6.348 ± 0.634
ValCys: 0.302 ± 0.121
ValAsp: 5.623 ± 0.532
ValGlu: 4.837 ± 0.526
ValPhe: 2.539 ± 0.365
ValGly: 4.716 ± 0.638
ValHis: 1.451 ± 0.259
ValIle: 3.748 ± 0.44
ValLys: 3.386 ± 0.477
ValLeu: 5.441 ± 0.586
ValMet: 1.149 ± 0.275
ValAsn: 2.177 ± 0.326
ValPro: 3.93 ± 0.44
ValGln: 2.237 ± 0.407
ValArg: 5.26 ± 0.809
ValSer: 4.534 ± 0.482
ValThr: 5.623 ± 0.589
ValVal: 5.502 ± 0.644
ValTrp: 1.209 ± 0.228
ValTyr: 2.297 ± 0.349
ValXaa: 0.0 ± 0.0
Trp
TrpAla: 1.511 ± 0.359
TrpCys: 0.242 ± 0.117
TrpAsp: 1.391 ± 0.282
TrpGlu: 0.967 ± 0.223
TrpPhe: 0.786 ± 0.216
TrpGly: 1.572 ± 0.315
TrpHis: 0.484 ± 0.161
TrpIle: 1.088 ± 0.219
TrpLys: 0.363 ± 0.166
TrpLeu: 1.995 ± 0.311
TrpMet: 0.363 ± 0.147
TrpAsn: 0.484 ± 0.156
TrpPro: 0.907 ± 0.248
TrpGln: 0.967 ± 0.219
TrpArg: 1.27 ± 0.3
TrpSer: 0.846 ± 0.231
TrpThr: 1.814 ± 0.331
TrpVal: 1.935 ± 0.303
TrpTrp: 0.726 ± 0.266
TrpTyr: 0.242 ± 0.108
TrpXaa: 0.0 ± 0.0
Tyr
TyrAla: 2.237 ± 0.372
TyrCys: 0.181 ± 0.099
TyrAsp: 1.149 ± 0.265
TyrGlu: 2.297 ± 0.328
TyrPhe: 0.726 ± 0.206
TyrGly: 2.66 ± 0.449
TyrHis: 0.605 ± 0.178
TyrIle: 1.572 ± 0.339
TyrLys: 1.33 ± 0.267
TyrLeu: 2.539 ± 0.421
TyrMet: 0.605 ± 0.153
TyrAsn: 1.149 ± 0.296
TyrPro: 1.27 ± 0.251
TyrGln: 1.149 ± 0.324
TyrArg: 2.6 ± 0.357
TyrSer: 1.33 ± 0.27
TyrThr: 1.874 ± 0.358
TyrVal: 1.995 ± 0.367
TyrTrp: 0.363 ± 0.145
TyrTyr: 0.726 ± 0.23
TyrXaa: 0.0 ± 0.0
Xaa
XaaAla: 0.0 ± 0.0
XaaCys: 0.0 ± 0.0
XaaAsp: 0.0 ± 0.0
XaaGlu: 0.0 ± 0.0
XaaPhe: 0.0 ± 0.0
XaaGly: 0.0 ± 0.0
XaaHis: 0.0 ± 0.0
XaaIle: 0.0 ± 0.0
XaaLys: 0.0 ± 0.0
XaaLeu: 0.0 ± 0.0
XaaMet: 0.0 ± 0.0
XaaAsn: 0.0 ± 0.0
XaaPro: 0.0 ± 0.0
XaaGln: 0.0 ± 0.0
XaaArg: 0.0 ± 0.0
XaaSer: 0.0 ± 0.0
XaaThr: 0.0 ± 0.0
XaaVal: 0.0 ± 0.0
XaaTrp: 0.0 ± 0.0
XaaTyr: 0.0 ± 0.0
XaaXaa: 0.0 ± 0.0
Statistics based on 94 proteins (16541 amino acids)

Note: The error has been estimated by bootstrapping (100 replicates) at the protein level.
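A protein-level bootstrap of this kind could look roughly like the sketch below: whole proteins are resampled with replacement, the frequency is recomputed for each replicate, and the spread of the replicates is reported. Taking the standard deviation as the error, normalizing by the number of dipeptides, and the function names and toy sequences are all assumptions; the page only states that 100 bootstrap rounds were run at the protein level.

```python
import random
import statistics
from collections import Counter


def pair_per_mille(proteins, pair):
    """Frequency of one dipeptide, in per mille, over a list of proteins."""
    counts = Counter()
    for seq in proteins:
        counts.update(seq[i:i + 2] for i in range(len(seq) - 1))
    total = sum(counts.values())
    return 1000.0 * counts[pair] / total if total else 0.0


def bootstrap_error(proteins, pair, n_rounds=100, seed=0):
    """Standard deviation of the frequency over protein-level resamples."""
    rng = random.Random(seed)  # fixed seed for reproducibility
    estimates = []
    for _ in range(n_rounds):
        # Resample whole proteins with replacement, not single residues.
        sample = rng.choices(proteins, k=len(proteins))
        estimates.append(pair_per_mille(sample, pair))
    return statistics.stdev(estimates)


# Made-up toy sequences, not taken from the proteome.
toy = ["MAAG", "MAG", "MKKA", "MAAAK"]
print(pair_per_mille(toy, "AA"), bootstrap_error(toy, "AA"))
print(bootstrap_error(toy, "WW"))  # a pair that never occurs
```

A dipeptide that never occurs gets a frequency and an error of zero in every replicate, which matches the 0.0 ± 0.0 entries for the Xaa rows and columns.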

The dipeptide statistics above (along with other statistics for this proteome) can be downloaded as a CSV file.
See this proteome in UniProt.
Proteome-pI is available under a Creative Commons Attribution-NoDerivs license.

Reference: Kozlowski LP. Proteome-pI 2.0: Proteome Isoelectric Point Database Update. Nucleic Acids Res. 2021, doi: 10.1093/nar/gkab944
Contact: Lukasz P. Kozlowski