Amino acid dipeptide frequency for Mycobacterium phage Toaka

Apart from single amino acid frequencies, one can also calculate the so-called amino acid dipeptide frequency.

There are 400 possible dipeptides (441 if Xaa is counted as an additional 21st amino acid representing all non-standard or unknown amino acids). Thus, if dipeptides occurred randomly in proteins, each amino acid dipeptide would be present with a frequency of 0.25%. As this is not the case in nature, for better visibility all dipeptides that occur more often than expected are marked in red in the table, and those that are underrepresented are marked in blue.
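As a minimal sketch of how such frequencies could be computed, the snippet below counts overlapping residue pairs within each protein and expresses them in per mille. It assumes one-letter sequence strings (the table uses three-letter codes) and normalization by the total number of dipeptides; the function name `dipeptide_permille` and the toy sequences are hypothetical, and the database's own procedure may differ.

```python
from collections import Counter


def dipeptide_permille(proteins):
    """Return {dipeptide: frequency in per mille} over all overlapping pairs."""
    counts = Counter()
    for seq in proteins:
        # Overlapping pairs (0,1), (1,2), ...; a pair never spans two proteins.
        counts.update(seq[i:i + 2] for i in range(len(seq) - 1))
    total = sum(counts.values())
    if total == 0:
        return {}
    return {pair: 1000.0 * n / total for pair, n in counts.items()}


# Toy input for illustration only, not Mycobacterium phage Toaka data.
toy = ["MAAG", "AAL"]
freqs = dipeptide_permille(toy)

# Uniform expectation for 400 dipeptides: 0.25 %, i.e. 2.5 per mille.
uniform = 1000.0 / 400
```

Comparing each value in `freqs` with `uniform` gives the over- or underrepresentation that the table marks in red and blue.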

All values are given in per mille (‰) and therefore need to be multiplied by 10⁻³ to obtain fractions.
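The unit conversion and the comparison against the uniform expectation can be sketched as follows. The three values are taken from the table; the threshold of 2.5 ‰ (0.25 %) assumes the 400-dipeptide case, and the helper names are hypothetical.

```python
UNIFORM_PERMILLE = 1000.0 / 400  # 0.25 % expressed in per mille


def permille_to_fraction(value):
    """Convert a per mille value to a plain fraction (multiply by 10^-3)."""
    return value * 1e-3


def classify(value, expected=UNIFORM_PERMILLE):
    """Label a per mille value relative to the uniform expectation."""
    if value > expected:
        return "over"
    if value < expected:
        return "under"
    return "as expected"


# Values copied from the table (per mille).
examples = {"AlaAla": 11.557, "CysCys": 0.0, "GluLys": 2.588}
labels = {name: classify(v) for name, v in examples.items()}
```

For example, AlaAla at 11.557 ‰ corresponds to a fraction of about 0.0116, well above the uniform 0.0025.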

For more information, see the sequence space article on Wikipedia.

Ala Cys Asp Glu Phe Gly His Ile Lys Leu Met Asn Pro Gln Arg Ser Thr Val Trp Tyr Xaa
Ala
AlaAla: 11.557 ± 1.073
AlaCys: 0.482 ± 0.186
AlaAsp: 5.297 ± 0.585
AlaGlu: 7.344 ± 0.758
AlaPhe: 3.792 ± 0.521
AlaGly: 8.367 ± 0.715
AlaHis: 1.926 ± 0.325
AlaIle: 4.936 ± 0.544
AlaLys: 5.598 ± 0.635
AlaLeu: 9.45 ± 1.026
AlaMet: 3.13 ± 0.49
AlaAsn: 3.612 ± 0.545
AlaPro: 5.357 ± 0.552
AlaGln: 3.732 ± 0.525
AlaArg: 6.14 ± 0.54
AlaSer: 5.357 ± 0.448
AlaThr: 5.357 ± 0.685
AlaVal: 6.982 ± 0.669
AlaTrp: 1.324 ± 0.309
AlaTyr: 1.986 ± 0.333
AlaXaa: 0.0 ± 0.0
Cys
CysAla: 0.542 ± 0.16
CysCys: 0.0 ± 0.0
CysAsp: 0.662 ± 0.19
CysGlu: 0.482 ± 0.17
CysPhe: 0.241 ± 0.107
CysGly: 0.662 ± 0.191
CysHis: 0.181 ± 0.104
CysIle: 0.361 ± 0.147
CysLys: 0.301 ± 0.125
CysLeu: 0.421 ± 0.17
CysMet: 0.241 ± 0.119
CysAsn: 0.542 ± 0.175
CysPro: 0.482 ± 0.196
CysGln: 0.181 ± 0.099
CysArg: 0.542 ± 0.184
CysSer: 0.482 ± 0.144
CysThr: 0.602 ± 0.179
CysVal: 0.542 ± 0.164
CysTrp: 0.301 ± 0.128
CysTyr: 0.301 ± 0.136
CysXaa: 0.0 ± 0.0
Asp
AspAla: 5.598 ± 0.728
AspCys: 0.421 ± 0.174
AspAsp: 3.13 ± 0.439
AspGlu: 3.913 ± 0.515
AspPhe: 2.287 ± 0.43
AspGly: 5.297 ± 0.518
AspHis: 1.384 ± 0.327
AspIle: 3.612 ± 0.499
AspLys: 2.227 ± 0.399
AspLeu: 6.32 ± 0.767
AspMet: 1.505 ± 0.29
AspAsn: 2.047 ± 0.384
AspPro: 3.612 ± 0.505
AspGln: 1.866 ± 0.315
AspArg: 3.13 ± 0.446
AspSer: 3.852 ± 0.489
AspThr: 3.913 ± 0.429
AspVal: 4.214 ± 0.387
AspTrp: 1.565 ± 0.348
AspTyr: 2.528 ± 0.422
AspXaa: 0.0 ± 0.0
Glu
GluAla: 7.283 ± 0.659
GluCys: 0.12 ± 0.082
GluAsp: 3.973 ± 0.534
GluGlu: 4.454 ± 0.497
GluPhe: 2.287 ± 0.378
GluGly: 5.116 ± 0.521
GluHis: 1.505 ± 0.362
GluIle: 3.792 ± 0.399
GluLys: 2.588 ± 0.49
GluLeu: 6.501 ± 0.707
GluMet: 1.505 ± 0.275
GluAsn: 2.047 ± 0.325
GluPro: 2.709 ± 0.468
GluGln: 1.986 ± 0.3
GluArg: 4.033 ± 0.527
GluSer: 3.792 ± 0.436
GluThr: 3.371 ± 0.438
GluVal: 4.394 ± 0.531
GluTrp: 1.384 ± 0.305
GluTyr: 2.167 ± 0.35
GluXaa: 0.0 ± 0.0
Phe
PheAla: 3.25 ± 0.544
PheCys: 0.301 ± 0.126
PheAsp: 2.468 ± 0.425
PheGlu: 2.889 ± 0.462
PhePhe: 0.602 ± 0.198
PheGly: 3.19 ± 0.444
PheHis: 0.662 ± 0.255
PheIle: 1.986 ± 0.346
PheLys: 1.204 ± 0.241
PheLeu: 1.866 ± 0.339
PheMet: 0.602 ± 0.187
PheAsn: 1.866 ± 0.307
PhePro: 1.625 ± 0.265
PheGln: 1.083 ± 0.242
PheArg: 1.746 ± 0.378
PheSer: 2.528 ± 0.383
PheThr: 2.348 ± 0.286
PheVal: 1.746 ± 0.349
PheTrp: 0.662 ± 0.213
PheTyr: 1.023 ± 0.264
PheXaa: 0.0 ± 0.0
Gly
GlyAla: 7.223 ± 0.846
GlyCys: 0.843 ± 0.26
GlyAsp: 6.32 ± 0.637
GlyGlu: 4.635 ± 0.486
GlyPhe: 3.01 ± 0.421
GlyGly: 9.932 ± 1.635
GlyHis: 1.986 ± 0.377
GlyIle: 3.672 ± 0.631
GlyLys: 4.274 ± 0.516
GlyLeu: 6.561 ± 0.791
GlyMet: 2.588 ± 0.514
GlyAsn: 3.371 ± 0.578
GlyPro: 3.19 ± 0.415
GlyGln: 3.01 ± 0.442
GlyArg: 3.672 ± 0.415
GlySer: 4.635 ± 0.804
GlyThr: 5.478 ± 0.676
GlyVal: 5.779 ± 0.655
GlyTrp: 1.625 ± 0.242
GlyTyr: 2.348 ± 0.423
GlyXaa: 0.0 ± 0.0
His
HisAla: 1.625 ± 0.278
HisCys: 0.301 ± 0.124
HisAsp: 1.324 ± 0.328
HisGlu: 1.083 ± 0.261
HisPhe: 0.542 ± 0.166
HisGly: 1.505 ± 0.356
HisHis: 0.602 ± 0.191
HisIle: 0.722 ± 0.208
HisLys: 0.963 ± 0.261
HisLeu: 1.445 ± 0.436
HisMet: 0.301 ± 0.148
HisAsn: 0.722 ± 0.207
HisPro: 1.204 ± 0.236
HisGln: 0.662 ± 0.224
HisArg: 1.204 ± 0.296
HisSer: 1.023 ± 0.21
HisThr: 1.565 ± 0.264
HisVal: 1.324 ± 0.307
HisTrp: 0.361 ± 0.149
HisTyr: 0.602 ± 0.265
HisXaa: 0.0 ± 0.0
Ile
IleAla: 5.779 ± 0.553
IleCys: 0.482 ± 0.176
IleAsp: 3.431 ± 0.486
IleGlu: 4.394 ± 0.637
IlePhe: 1.384 ± 0.274
IleGly: 3.491 ± 0.504
IleHis: 1.023 ± 0.271
IleIle: 1.746 ± 0.313
IleLys: 2.528 ± 0.346
IleLeu: 3.491 ± 0.436
IleMet: 0.662 ± 0.189
IleAsn: 2.167 ± 0.332
IlePro: 3.792 ± 0.373
IleGln: 1.324 ± 0.263
IleArg: 3.672 ± 0.422
IleSer: 2.709 ± 0.364
IleThr: 3.371 ± 0.529
IleVal: 2.829 ± 0.434
IleTrp: 0.542 ± 0.204
IleTyr: 1.204 ± 0.275
IleXaa: 0.0 ± 0.0
Lys
LysAla: 5.538 ± 0.63
LysCys: 0.241 ± 0.119
LysAsp: 1.866 ± 0.293
LysGlu: 3.311 ± 0.45
LysPhe: 1.264 ± 0.236
LysGly: 3.491 ± 0.513
LysHis: 0.421 ± 0.136
LysIle: 2.167 ± 0.412
LysLys: 2.348 ± 0.329
LysLeu: 3.852 ± 0.468
LysMet: 1.384 ± 0.298
LysAsn: 1.264 ± 0.309
LysPro: 2.348 ± 0.404
LysGln: 1.685 ± 0.326
LysArg: 3.913 ± 0.523
LysSer: 2.528 ± 0.419
LysThr: 2.468 ± 0.406
LysVal: 3.792 ± 0.624
LysTrp: 1.083 ± 0.345
LysTyr: 1.324 ± 0.227
LysXaa: 0.0 ± 0.0
Leu
LeuAla: 9.089 ± 0.901
LeuCys: 1.023 ± 0.229
LeuAsp: 5.056 ± 0.564
LeuGlu: 4.936 ± 0.523
LeuPhe: 2.588 ± 0.357
LeuGly: 6.982 ± 0.824
LeuHis: 1.023 ± 0.297
LeuIle: 4.153 ± 0.519
LeuLys: 3.973 ± 0.448
LeuLeu: 5.417 ± 0.561
LeuMet: 2.528 ± 0.393
LeuAsn: 2.769 ± 0.43
LeuPro: 4.996 ± 0.591
LeuGln: 2.408 ± 0.407
LeuArg: 5.478 ± 0.543
LeuSer: 5.056 ± 0.481
LeuThr: 6.441 ± 0.649
LeuVal: 4.515 ± 0.565
LeuTrp: 1.625 ± 0.261
LeuTyr: 1.625 ± 0.353
LeuXaa: 0.0 ± 0.0
Met
MetAla: 2.528 ± 0.351
MetCys: 0.12 ± 0.073
MetAsp: 0.783 ± 0.21
MetGlu: 1.685 ± 0.418
MetPhe: 0.783 ± 0.227
MetGly: 1.746 ± 0.306
MetHis: 0.301 ± 0.132
MetIle: 1.445 ± 0.342
MetLys: 1.264 ± 0.241
MetLeu: 1.565 ± 0.381
MetMet: 0.421 ± 0.166
MetAsn: 0.662 ± 0.189
MetPro: 1.445 ± 0.318
MetGln: 0.963 ± 0.217
MetArg: 1.746 ± 0.324
MetSer: 2.348 ± 0.369
MetThr: 2.287 ± 0.38
MetVal: 1.445 ± 0.332
MetTrp: 0.421 ± 0.155
MetTyr: 0.662 ± 0.215
MetXaa: 0.0 ± 0.0
Asn
AsnAla: 4.214 ± 0.519
AsnCys: 0.361 ± 0.14
AsnAsp: 1.746 ± 0.316
AsnGlu: 1.926 ± 0.373
AsnPhe: 1.083 ± 0.323
AsnGly: 3.732 ± 0.526
AsnHis: 1.083 ± 0.285
AsnIle: 1.445 ± 0.285
AsnLys: 1.144 ± 0.27
AsnLeu: 3.371 ± 0.541
AsnMet: 0.542 ± 0.187
AsnAsn: 0.843 ± 0.266
AsnPro: 2.588 ± 0.305
AsnGln: 0.963 ± 0.204
AsnArg: 2.047 ± 0.375
AsnSer: 1.986 ± 0.337
AsnThr: 1.986 ± 0.41
AsnVal: 2.588 ± 0.416
AsnTrp: 0.963 ± 0.224
AsnTyr: 0.783 ± 0.204
AsnXaa: 0.0 ± 0.0
Pro
ProAla: 5.417 ± 0.715
ProCys: 0.241 ± 0.135
ProAsp: 3.431 ± 0.492
ProGlu: 4.214 ± 0.572
ProPhe: 1.746 ± 0.279
ProGly: 4.755 ± 0.591
ProHis: 0.662 ± 0.217
ProIle: 2.709 ± 0.439
ProLys: 1.986 ± 0.373
ProLeu: 3.431 ± 0.502
ProMet: 1.023 ± 0.259
ProAsn: 2.528 ± 0.431
ProPro: 2.167 ± 0.398
ProGln: 1.866 ± 0.399
ProArg: 2.769 ± 0.462
ProSer: 2.769 ± 0.37
ProThr: 3.913 ± 0.498
ProVal: 4.093 ± 0.511
ProTrp: 0.722 ± 0.192
ProTyr: 1.685 ± 0.306
ProXaa: 0.0 ± 0.0
Gln
GlnAla: 4.334 ± 0.628
GlnCys: 0.181 ± 0.098
GlnAsp: 1.445 ± 0.353
GlnGlu: 1.625 ± 0.327
GlnPhe: 1.204 ± 0.205
GlnGly: 2.468 ± 0.299
GlnHis: 0.903 ± 0.218
GlnIle: 2.348 ± 0.332
GlnLys: 1.384 ± 0.334
GlnLeu: 2.949 ± 0.554
GlnMet: 1.023 ± 0.321
GlnAsn: 0.963 ± 0.219
GlnPro: 1.565 ± 0.303
GlnGln: 1.806 ± 0.517
GlnArg: 2.588 ± 0.304
GlnSer: 1.806 ± 0.296
GlnThr: 2.167 ± 0.249
GlnVal: 2.709 ± 0.345
GlnTrp: 0.542 ± 0.18
GlnTyr: 1.083 ± 0.227
GlnXaa: 0.0 ± 0.0
Arg
ArgAla: 5.357 ± 0.555
ArgCys: 1.023 ± 0.289
ArgAsp: 4.454 ± 0.614
ArgGlu: 4.093 ± 0.424
ArgPhe: 2.649 ± 0.383
ArgGly: 3.732 ± 0.422
ArgHis: 1.023 ± 0.239
ArgIle: 3.732 ± 0.478
ArgLys: 3.13 ± 0.452
ArgLeu: 5.237 ± 0.62
ArgMet: 1.806 ± 0.332
ArgAsn: 1.986 ± 0.344
ArgPro: 2.769 ± 0.359
ArgGln: 2.709 ± 0.333
ArgArg: 5.357 ± 0.773
ArgSer: 3.07 ± 0.474
ArgThr: 2.889 ± 0.361
ArgVal: 3.973 ± 0.53
ArgTrp: 1.324 ± 0.302
ArgTyr: 1.986 ± 0.359
ArgXaa: 0.0 ± 0.0
Ser
SerAla: 5.116 ± 0.607
SerCys: 0.542 ± 0.171
SerAsp: 3.732 ± 0.434
SerGlu: 3.792 ± 0.45
SerPhe: 2.227 ± 0.374
SerGly: 5.899 ± 0.715
SerHis: 0.843 ± 0.205
SerIle: 1.926 ± 0.357
SerLys: 2.709 ± 0.407
SerLeu: 4.093 ± 0.619
SerMet: 1.384 ± 0.281
SerAsn: 1.685 ± 0.334
SerPro: 2.468 ± 0.396
SerGln: 2.287 ± 0.357
SerArg: 4.274 ± 0.564
SerSer: 3.612 ± 0.551
SerThr: 2.949 ± 0.41
SerVal: 4.515 ± 0.453
SerTrp: 1.324 ± 0.223
SerTyr: 1.986 ± 0.396
SerXaa: 0.0 ± 0.0
Thr
ThrAla: 6.561 ± 0.624
ThrCys: 0.361 ± 0.15
ThrAsp: 4.274 ± 0.496
ThrGlu: 2.829 ± 0.351
ThrPhe: 1.986 ± 0.348
ThrGly: 4.936 ± 0.521
ThrHis: 1.505 ± 0.332
ThrIle: 3.19 ± 0.392
ThrLys: 3.551 ± 0.366
ThrLeu: 6.381 ± 0.692
ThrMet: 1.505 ± 0.311
ThrAsn: 1.866 ± 0.391
ThrPro: 4.033 ± 0.492
ThrGln: 2.348 ± 0.46
ThrArg: 2.468 ± 0.311
ThrSer: 3.491 ± 0.434
ThrThr: 3.371 ± 0.535
ThrVal: 5.538 ± 0.688
ThrTrp: 0.903 ± 0.253
ThrTyr: 1.926 ± 0.311
ThrXaa: 0.0 ± 0.0
Val
ValAla: 5.839 ± 0.597
ValCys: 0.602 ± 0.179
ValAsp: 5.959 ± 0.644
ValGlu: 4.454 ± 0.686
ValPhe: 2.408 ± 0.466
ValGly: 5.177 ± 0.748
ValHis: 1.144 ± 0.23
ValIle: 3.732 ± 0.534
ValLys: 3.311 ± 0.374
ValLeu: 4.755 ± 0.561
ValMet: 1.083 ± 0.267
ValAsn: 3.371 ± 0.497
ValPro: 2.889 ± 0.485
ValGln: 1.866 ± 0.314
ValArg: 4.876 ± 0.513
ValSer: 3.973 ± 0.45
ValThr: 5.357 ± 0.487
ValVal: 5.959 ± 0.802
ValTrp: 1.806 ± 0.344
ValTyr: 1.746 ± 0.418
ValXaa: 0.0 ± 0.0
Trp
TrpAla: 1.926 ± 0.383
TrpCys: 0.301 ± 0.163
TrpAsp: 1.384 ± 0.274
TrpGlu: 1.264 ± 0.233
TrpPhe: 0.542 ± 0.217
TrpGly: 1.685 ± 0.359
TrpHis: 0.602 ± 0.203
TrpIle: 1.264 ± 0.283
TrpLys: 0.602 ± 0.157
TrpLeu: 1.384 ± 0.312
TrpMet: 0.843 ± 0.217
TrpAsn: 0.421 ± 0.143
TrpPro: 1.023 ± 0.255
TrpGln: 1.144 ± 0.247
TrpArg: 1.023 ± 0.234
TrpSer: 0.963 ± 0.23
TrpThr: 1.324 ± 0.244
TrpVal: 0.963 ± 0.216
TrpTrp: 0.361 ± 0.154
TrpTyr: 0.421 ± 0.14
TrpXaa: 0.0 ± 0.0
Tyr
TyrAla: 2.949 ± 0.424
TyrCys: 0.12 ± 0.096
TyrAsp: 1.685 ± 0.311
TyrGlu: 1.685 ± 0.354
TyrPhe: 1.023 ± 0.213
TyrGly: 2.047 ± 0.333
TyrHis: 0.301 ± 0.141
TyrIle: 1.083 ± 0.239
TyrLys: 1.204 ± 0.326
TyrLeu: 3.19 ± 0.482
TyrMet: 0.482 ± 0.18
TyrAsn: 0.662 ± 0.158
TyrPro: 1.866 ± 0.406
TyrGln: 1.204 ± 0.301
TyrArg: 1.625 ± 0.346
TyrSer: 1.384 ± 0.295
TyrThr: 1.866 ± 0.346
TyrVal: 2.408 ± 0.335
TyrTrp: 0.602 ± 0.201
TyrTyr: 0.843 ± 0.233
TyrXaa: 0.0 ± 0.0
Xaa
XaaAla: 0.0 ± 0.0
XaaCys: 0.0 ± 0.0
XaaAsp: 0.0 ± 0.0
XaaGlu: 0.0 ± 0.0
XaaPhe: 0.0 ± 0.0
XaaGly: 0.0 ± 0.0
XaaHis: 0.0 ± 0.0
XaaIle: 0.0 ± 0.0
XaaLys: 0.0 ± 0.0
XaaLeu: 0.0 ± 0.0
XaaMet: 0.0 ± 0.0
XaaAsn: 0.0 ± 0.0
XaaPro: 0.0 ± 0.0
XaaGln: 0.0 ± 0.0
XaaArg: 0.0 ± 0.0
XaaSer: 0.0 ± 0.0
XaaThr: 0.0 ± 0.0
XaaVal: 0.0 ± 0.0
XaaTrp: 0.0 ± 0.0
XaaTyr: 0.0 ± 0.0
XaaXaa: 0.0 ± 0.0
Statistics based on 95 proteins (16614 amino acids)

Note: The error was estimated by bootstrapping (×100) at the protein level.
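A protein-level bootstrap of this kind could look roughly like the sketch below: whole proteins are resampled with replacement, the dipeptide frequency is recomputed for each resample, and the spread of the 100 estimates is reported. Using the sample standard deviation as the error, one-letter sequences, and the names `pair_permille` and `bootstrap_error` are all assumptions; the database's exact procedure is not described here.

```python
import random
import statistics
from collections import Counter


def pair_permille(proteins, pair):
    """Per mille frequency of one dipeptide over all overlapping pairs."""
    counts = Counter()
    for seq in proteins:
        counts.update(seq[i:i + 2] for i in range(len(seq) - 1))
    total = sum(counts.values())
    return 1000.0 * counts[pair] / total if total else 0.0


def bootstrap_error(proteins, pair, rounds=100, seed=0):
    """Standard deviation of the frequency across protein-level resamples."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    estimates = []
    for _ in range(rounds):
        # Draw as many proteins as in the original set, with replacement.
        sample = rng.choices(proteins, k=len(proteins))
        estimates.append(pair_permille(sample, pair))
    return statistics.stdev(estimates)


# Toy input for illustration only, not Mycobacterium phage Toaka data.
toy = ["MAAG", "AAL", "GLLA", "MKAA"]
error_aa = bootstrap_error(toy, "AA")
```

Resampling whole proteins rather than individual residues keeps each protein's internal composition intact, so the error reflects variation between proteins.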

The dipeptide statistics above (among other statistics for this proteome) can be downloaded from this CSV file
See this proteome in: uniprot_link
Proteome-pI is available under the Creative Commons Attribution-NoDerivs license; for more details see here

Reference: Kozlowski LP. Proteome-pI 2.0: Proteome Isoelectric Point Database Update. Nucleic Acids Res. 2021, doi: 10.1093/nar/gkab944
Contact: Lukasz P. Kozlowski