Amino acid dipeptide frequency for Mycobacterium phage BigMau

Apart from single amino acid frequencies, one can also calculate the so-called amino acid dipeptide frequency, that is, how often each ordered pair of adjacent residues occurs.

There are 400 possible dipeptides (441 if Xaa is counted as an additional 21st amino acid representing all non-standard or unknown amino acids). Thus, if dipeptides occurred uniformly at random in proteins, each of the 400 would be present at a frequency of 0.25% (2.5‰). As this is not the case in nature, for better visibility dipeptides that are more frequent than expected are marked in red in the table, and underrepresented ones are marked in blue.
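As a rough illustration of the calculation described above, the following Python sketch counts overlapping adjacent residue pairs and reports them in per mille, alongside the uniform expectation. The function name `dipeptide_permille`, the toy sequences, and the choices to count pairs only within a protein and to normalise by the number of pairs counted are assumptions made for this example; the page does not state how Proteome-pI normalises its values.

```python
from collections import Counter

STANDARD = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard amino acids (one-letter codes)


def dipeptide_permille(proteins):
    """Return {dipeptide: frequency in per mille} for a list of sequences.

    Assumptions of this sketch: pairs overlap (MAAG -> MA, AA, AG), are
    counted within each protein only, and are normalised by the total
    number of pairs counted.
    """
    counts = Counter()
    for seq in proteins:
        # Map anything outside the 20 standard residues to X (Xaa).
        seq = "".join(aa if aa in STANDARD else "X" for aa in seq.upper())
        counts.update(seq[i:i + 2] for i in range(len(seq) - 1))
    total = sum(counts.values())
    if total == 0:
        return {}
    return {pair: 1000.0 * n / total for pair, n in counts.items()}


# Uniform expectation: every dipeptide equally likely.
uniform_20 = 1000.0 / 20 ** 2  # 2.5 per mille, i.e. 0.25%
uniform_21 = 1000.0 / 21 ** 2  # with Xaa as a 21st residue

# Hypothetical toy input, not data from this proteome.
toy = ["MAAG", "AAXA"]
freqs = dipeptide_permille(toy)
for pair, value in sorted(freqs.items()):
    label = "over" if value > uniform_20 else "under"
    print(f"{pair}: {value:.1f} per mille ({label}represented vs. uniform)")
```

With only six pairs in the toy input every observed dipeptide is far above 2.5‰; on a real proteome the same comparison separates over- from underrepresented pairs.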

All values are given in per mille (‰), so they need to be multiplied by 10⁻³ to obtain fractions; for example, 13.0‰ corresponds to 0.0130, or 1.30%.

For more information, see the sequence space article on Wikipedia.

Rows: first residue; columns: second residue. Each cell gives frequency ± error in ‰.

    | Ala | Cys | Asp | Glu | Phe | Gly | His | Ile | Lys | Leu | Met | Asn | Pro | Gln | Arg | Ser | Thr | Val | Trp | Tyr | Xaa
Ala | 13.0 ± 1.194 | 0.739 ± 0.239 | 6.469 ± 0.701 | 6.223 ± 0.701 | 3.019 ± 0.518 | 7.393 ± 0.812 | 1.417 ± 0.315 | 4.744 ± 0.624 | 4.682 ± 0.542 | 8.749 ± 0.889 | 2.464 ± 0.407 | 2.526 ± 0.351 | 4.806 ± 0.681 | 2.834 ± 0.464 | 6.223 ± 0.631 | 4.929 ± 0.586 | 6.161 ± 0.53 | 8.502 ± 0.759 | 1.787 ± 0.365 | 2.588 ± 0.465 | 0.0 ± 0.0
Cys | 0.678 ± 0.181 | 0.062 ± 0.058 | 0.554 ± 0.162 | 0.863 ± 0.249 | 0.185 ± 0.117 | 0.678 ± 0.217 | 0.185 ± 0.113 | 0.37 ± 0.153 | 0.431 ± 0.183 | 0.554 ± 0.234 | 0.062 ± 0.063 | 0.431 ± 0.169 | 0.185 ± 0.092 | 0.246 ± 0.13 | 0.493 ± 0.18 | 0.308 ± 0.134 | 0.308 ± 0.141 | 0.246 ± 0.11 | 0.185 ± 0.101 | 0.123 ± 0.087 | 0.0 ± 0.0
Asp | 5.915 ± 0.629 | 0.616 ± 0.198 | 4.374 ± 0.512 | 3.265 ± 0.512 | 2.341 ± 0.299 | 5.853 ± 0.571 | 1.171 ± 0.267 | 2.834 ± 0.474 | 3.019 ± 0.506 | 6.531 ± 0.758 | 1.355 ± 0.264 | 2.033 ± 0.326 | 4.744 ± 0.587 | 1.479 ± 0.292 | 3.758 ± 0.387 | 3.019 ± 0.499 | 4.066 ± 0.419 | 5.052 ± 0.626 | 1.602 ± 0.318 | 2.095 ± 0.288 | 0.0 ± 0.0
Glu | 5.853 ± 0.605 | 0.493 ± 0.206 | 5.114 ± 0.547 | 5.114 ± 0.751 | 2.095 ± 0.375 | 4.374 ± 0.507 | 1.355 ± 0.317 | 3.265 ± 0.439 | 2.28 ± 0.377 | 6.469 ± 0.547 | 1.663 ± 0.293 | 1.355 ± 0.298 | 2.526 ± 0.475 | 2.957 ± 0.408 | 4.066 ± 0.681 | 3.881 ± 0.413 | 3.82 ± 0.511 | 5.483 ± 0.714 | 1.663 ± 0.335 | 2.464 ± 0.457 | 0.0 ± 0.0
Phe | 2.834 ± 0.453 | 0.37 ± 0.178 | 2.649 ± 0.368 | 2.095 ± 0.275 | 0.678 ± 0.192 | 3.758 ± 0.488 | 0.554 ± 0.208 | 1.355 ± 0.25 | 1.355 ± 0.29 | 2.341 ± 0.361 | 0.801 ± 0.191 | 1.417 ± 0.336 | 1.602 ± 0.322 | 0.739 ± 0.177 | 1.725 ± 0.324 | 1.602 ± 0.274 | 2.28 ± 0.387 | 1.725 ± 0.374 | 0.616 ± 0.184 | 0.986 ± 0.234 | 0.0 ± 0.0
Gly | 7.886 ± 0.982 | 0.616 ± 0.195 | 5.791 ± 0.497 | 4.128 ± 0.541 | 3.204 ± 0.611 | 9.118 ± 1.743 | 1.725 ± 0.32 | 4.251 ± 0.66 | 3.697 ± 0.554 | 7.332 ± 0.763 | 1.848 ± 0.389 | 3.081 ± 0.373 | 3.881 ± 0.473 | 2.588 ± 0.362 | 4.806 ± 0.56 | 6.469 ± 0.83 | 5.052 ± 0.563 | 6.038 ± 0.681 | 2.588 ± 0.373 | 2.711 ± 0.364 | 0.0 ± 0.0
His | 2.095 ± 0.441 | 0.185 ± 0.12 | 1.232 ± 0.228 | 1.294 ± 0.329 | 0.554 ± 0.176 | 1.54 ± 0.388 | 0.678 ± 0.191 | 1.047 ± 0.232 | 0.924 ± 0.305 | 1.232 ± 0.332 | 0.123 ± 0.096 | 0.308 ± 0.128 | 1.232 ± 0.261 | 1.109 ± 0.253 | 1.417 ± 0.237 | 0.554 ± 0.159 | 1.171 ± 0.255 | 1.232 ± 0.297 | 0.493 ± 0.178 | 0.924 ± 0.278 | 0.0 ± 0.0
Ile | 5.668 ± 0.58 | 0.246 ± 0.115 | 3.635 ± 0.433 | 3.758 ± 0.479 | 0.863 ± 0.233 | 4.128 ± 0.503 | 0.801 ± 0.232 | 1.848 ± 0.322 | 1.972 ± 0.375 | 3.327 ± 0.445 | 0.863 ± 0.209 | 1.972 ± 0.347 | 2.957 ± 0.458 | 1.417 ± 0.366 | 3.697 ± 0.528 | 3.389 ± 0.51 | 3.635 ± 0.49 | 3.512 ± 0.508 | 0.739 ± 0.18 | 1.787 ± 0.305 | 0.0 ± 0.0
Lys | 4.066 ± 0.589 | 0.308 ± 0.15 | 2.464 ± 0.519 | 2.218 ± 0.338 | 1.417 ± 0.303 | 2.711 ± 0.422 | 1.047 ± 0.251 | 2.28 ± 0.404 | 2.033 ± 0.465 | 3.389 ± 0.436 | 1.047 ± 0.234 | 1.602 ± 0.363 | 2.711 ± 0.399 | 1.972 ± 0.406 | 3.573 ± 0.694 | 2.772 ± 0.415 | 2.464 ± 0.406 | 3.265 ± 0.439 | 0.678 ± 0.218 | 1.047 ± 0.276 | 0.0 ± 0.0
Leu | 8.872 ± 0.692 | 0.308 ± 0.125 | 6.099 ± 0.598 | 5.607 ± 0.608 | 2.095 ± 0.338 | 7.024 ± 0.673 | 1.663 ± 0.414 | 4.498 ± 0.581 | 4.313 ± 0.475 | 5.73 ± 0.569 | 1.663 ± 0.291 | 2.834 ± 0.39 | 5.607 ± 0.6 | 2.156 ± 0.437 | 6.346 ± 0.683 | 5.299 ± 0.523 | 6.099 ± 0.478 | 4.682 ± 0.689 | 1.047 ± 0.328 | 2.218 ± 0.361 | 0.0 ± 0.0
Met | 2.526 ± 0.324 | 0.0 ± 0.0 | 0.986 ± 0.251 | 1.602 ± 0.321 | 0.678 ± 0.187 | 1.355 ± 0.3 | 0.308 ± 0.13 | 0.678 ± 0.211 | 1.171 ± 0.256 | 1.171 ± 0.282 | 0.062 ± 0.064 | 1.047 ± 0.229 | 1.109 ± 0.279 | 0.616 ± 0.172 | 1.355 ± 0.282 | 2.218 ± 0.352 | 2.033 ± 0.374 | 0.924 ± 0.222 | 0.308 ± 0.128 | 0.431 ± 0.146 | 0.0 ± 0.0
Asn | 3.142 ± 0.467 | 0.0 ± 0.0 | 1.972 ± 0.4 | 1.972 ± 0.32 | 1.109 ± 0.271 | 3.512 ± 0.48 | 0.801 ± 0.21 | 1.787 ± 0.385 | 0.493 ± 0.173 | 2.649 ± 0.356 | 0.739 ± 0.194 | 0.863 ± 0.214 | 2.834 ± 0.386 | 0.863 ± 0.213 | 1.479 ± 0.315 | 1.972 ± 0.39 | 1.725 ± 0.315 | 2.649 ± 0.446 | 0.678 ± 0.173 | 0.986 ± 0.258 | 0.0 ± 0.0
Pro | 5.052 ± 0.59 | 0.308 ± 0.147 | 4.313 ± 0.539 | 4.498 ± 0.521 | 1.91 ± 0.411 | 5.299 ± 0.586 | 0.801 ± 0.241 | 2.526 ± 0.405 | 1.972 ± 0.292 | 4.251 ± 0.576 | 0.863 ± 0.251 | 1.91 ± 0.307 | 2.403 ± 0.396 | 1.355 ± 0.291 | 2.588 ± 0.422 | 3.512 ± 0.463 | 3.943 ± 0.539 | 3.758 ± 0.36 | 0.801 ± 0.292 | 1.479 ± 0.304 | 0.0 ± 0.0
Gln | 3.142 ± 0.492 | 0.062 ± 0.065 | 1.047 ± 0.293 | 1.787 ± 0.291 | 1.232 ± 0.228 | 2.403 ± 0.391 | 0.616 ± 0.167 | 2.464 ± 0.467 | 1.417 ± 0.286 | 3.327 ± 0.516 | 0.863 ± 0.237 | 0.431 ± 0.125 | 1.848 ± 0.271 | 1.848 ± 0.379 | 1.663 ± 0.433 | 1.972 ± 0.277 | 1.417 ± 0.276 | 2.526 ± 0.345 | 0.801 ± 0.237 | 0.431 ± 0.132 | 0.0 ± 0.0
Arg | 5.73 ± 0.688 | 0.863 ± 0.223 | 3.327 ± 0.375 | 4.929 ± 0.722 | 2.033 ± 0.314 | 4.559 ± 0.643 | 1.171 ± 0.267 | 3.758 ± 0.486 | 3.758 ± 0.543 | 6.284 ± 0.814 | 1.848 ± 0.33 | 2.28 ± 0.412 | 2.341 ± 0.416 | 1.787 ± 0.304 | 5.73 ± 0.819 | 3.45 ± 0.554 | 3.142 ± 0.532 | 4.867 ± 0.659 | 1.417 ± 0.292 | 1.787 ± 0.305 | 0.0 ± 0.0
Ser | 6.223 ± 0.608 | 0.678 ± 0.196 | 2.896 ± 0.421 | 4.251 ± 0.497 | 2.095 ± 0.458 | 6.9 ± 0.727 | 1.602 ± 0.299 | 2.772 ± 0.351 | 2.526 ± 0.429 | 5.052 ± 0.559 | 1.479 ± 0.351 | 2.218 ± 0.508 | 3.019 ± 0.471 | 1.602 ± 0.239 | 3.204 ± 0.432 | 3.635 ± 0.7 | 3.142 ± 0.443 | 4.066 ± 0.586 | 1.232 ± 0.304 | 1.355 ± 0.259 | 0.0 ± 0.0
Thr | 5.422 ± 0.638 | 0.37 ± 0.162 | 4.251 ± 0.598 | 4.436 ± 0.604 | 2.218 ± 0.398 | 6.407 ± 0.666 | 0.986 ± 0.312 | 3.45 ± 0.498 | 2.218 ± 0.359 | 5.545 ± 0.571 | 0.924 ± 0.232 | 1.725 ± 0.285 | 3.697 ± 0.513 | 1.972 ± 0.301 | 3.45 ± 0.527 | 3.943 ± 0.681 | 4.374 ± 0.588 | 5.483 ± 0.528 | 1.047 ± 0.243 | 1.787 ± 0.334 | 0.0 ± 0.0
Val | 7.455 ± 0.763 | 0.431 ± 0.142 | 5.299 ± 0.514 | 4.929 ± 0.514 | 2.403 ± 0.391 | 5.175 ± 0.672 | 1.355 ± 0.27 | 3.265 ± 0.434 | 3.265 ± 0.431 | 5.607 ± 0.551 | 0.924 ± 0.289 | 2.588 ± 0.322 | 3.943 ± 0.488 | 1.972 ± 0.423 | 5.422 ± 0.64 | 4.682 ± 0.551 | 5.607 ± 0.677 | 4.929 ± 0.695 | 1.171 ± 0.264 | 2.156 ± 0.4 | 0.0 ± 0.0
Trp | 1.355 ± 0.296 | 0.185 ± 0.095 | 1.294 ± 0.267 | 0.986 ± 0.232 | 0.801 ± 0.204 | 1.91 ± 0.304 | 0.493 ± 0.19 | 1.355 ± 0.26 | 0.308 ± 0.161 | 1.91 ± 0.34 | 0.37 ± 0.161 | 0.431 ± 0.171 | 0.801 ± 0.225 | 0.863 ± 0.227 | 1.417 ± 0.349 | 1.047 ± 0.261 | 1.417 ± 0.352 | 1.91 ± 0.323 | 0.554 ± 0.207 | 0.431 ± 0.175 | 0.0 ± 0.0
Tyr | 2.341 ± 0.38 | 0.308 ± 0.153 | 1.355 ± 0.333 | 2.341 ± 0.364 | 0.554 ± 0.166 | 2.711 ± 0.414 | 0.616 ± 0.187 | 1.663 ± 0.344 | 1.232 ± 0.249 | 2.711 ± 0.383 | 0.554 ± 0.179 | 1.171 ± 0.283 | 1.294 ± 0.284 | 0.924 ± 0.199 | 2.711 ± 0.452 | 1.294 ± 0.269 | 1.848 ± 0.374 | 1.725 ± 0.305 | 0.431 ± 0.186 | 0.554 ± 0.197 | 0.0 ± 0.0
Xaa | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0

Statistics based on 91 proteins (16232 amino acids)

Note: The error was estimated by bootstrapping (×100) at the protein level.
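Protein-level bootstrapping can be sketched as follows: whole proteins are resampled with replacement, the dipeptide frequency is recomputed for each resample, and the spread across resamples serves as the error. This is a minimal sketch, not the database's actual code. The function names, the toy sequences, the fixed seed, and the use of the sample standard deviation as the reported error are all assumptions; the page only states that 100 bootstrap rounds were run at the protein level.

```python
import random
import statistics
from collections import Counter


def pair_permille(proteins, pair):
    """Frequency of one dipeptide, in per mille, over a list of sequences."""
    counts = Counter()
    for seq in proteins:
        counts.update(seq[i:i + 2] for i in range(len(seq) - 1))
    total = sum(counts.values())
    return 1000.0 * counts[pair] / total if total else 0.0


def bootstrap_error(proteins, pair, n_boot=100, seed=0):
    """Standard deviation of the dipeptide frequency across bootstrap resamples.

    Each resample draws len(proteins) whole proteins with replacement,
    so the protein (not the residue) is the resampling unit.
    """
    rng = random.Random(seed)  # fixed seed only to make the sketch reproducible
    replicates = []
    for _ in range(n_boot):
        sample = rng.choices(proteins, k=len(proteins))
        replicates.append(pair_permille(sample, pair))
    return statistics.stdev(replicates)


# Hypothetical toy proteome, not data from this page.
toy = ["MAAG", "AAXA", "MKLV"]
print(pair_permille(toy, "AA"), "+/-", bootstrap_error(toy, "AA"))
```

Resampling whole proteins rather than individual residues keeps each sequence intact, so the estimate reflects how much the frequency depends on which proteins happen to be in the proteome.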

The dipeptide statistics above (among other statistics for this proteome) can be downloaded from this CSV file
See this proteome in: uniprot_link
Proteome-pI is available under a Creative Commons Attribution-NoDerivs license; for more details see here

Reference: Kozlowski LP. Proteome-pI 2.0: Proteome Isoelectric Point Database Update. Nucleic Acids Res. 2021, doi: 10.1093/nar/gkab944
Contact: Lukasz P. Kozlowski