Amino acid dipeptide frequency for Mycobacterium phage Myxus

Apart from single amino acid frequencies, one can also calculate the so-called amino acid dipeptide frequency, that is, how often each ordered pair of adjacent amino acids occurs.

There are 400 possible dipeptides (441 if Xaa is counted as an additional 21st amino acid representing all non-standard or unknown amino acids). Thus, if dipeptides occurred at random in proteins, each of the 400 dipeptides would be present at a frequency of 0.25% (1/400). Since this is not the case in nature, for better visibility, dipeptides that are more frequent than expected are marked in red in the table, and those that are underrepresented are marked in blue.
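As a rough illustration of the calculation described above, the sketch below counts adjacent residue pairs and converts the counts to per mille. It is not the Proteome-pI implementation: the function names are hypothetical, sequences are assumed to use one-letter codes (the table uses three-letter codes), pairs are assumed to be counted as overlapping windows within each protein, and any letter outside the 20 standard ones is mapped to X (Xaa).

```python
from collections import Counter

# The 20 standard amino acids in one-letter code.
STANDARD = set("ACDEFGHIKLMNPQRSTVWY")

# Expected value if all 400 standard dipeptides were equally likely: 2.5 per mille.
UNIFORM_EXPECTATION_PER_MILLE = 1000.0 / 400


def dipeptide_counts(proteins):
    """Count overlapping adjacent residue pairs within each protein.

    Pairs are never formed across protein boundaries (an assumption).
    """
    counts = Counter()
    for seq in proteins:
        # Map anything outside the 20 standard letters to X (Xaa).
        seq = "".join(aa if aa in STANDARD else "X" for aa in seq.upper())
        for i in range(len(seq) - 1):
            counts[seq[i:i + 2]] += 1
    return counts


def dipeptide_per_mille(proteins):
    """Return {dipeptide: frequency in per mille} for the observed pairs."""
    counts = dipeptide_counts(proteins)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {pair: 1000.0 * n / total for pair, n in counts.items()}


if __name__ == "__main__":
    toy_proteome = ["AAC", "CA"]  # toy data, not the phage proteome
    for pair, freq in sorted(dipeptide_per_mille(toy_proteome).items()):
        print(pair, round(freq, 3))
```

With real data, the input would be the list of protein sequences of the proteome, and each resulting value could be compared with the uniform expectation of 2.5 per mille.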

All values are given in per mille (‰), so they need to be multiplied by 10⁻³ to obtain fractions.
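To make the units concrete, the sketch below converts a per mille value to a fraction and compares it with the uniform expectation of 0.25% (2.5‰). The function names are hypothetical, and the assumption that the red/blue marking uses exactly this threshold is taken from the description above rather than from the site's code. The example values are copied from the table.

```python
# Uniform expectation for 400 equally likely dipeptides, in per mille.
EXPECTED_PER_MILLE = 1000.0 / 400  # 2.5 per mille, i.e. 0.25 %


def per_mille_to_fraction(value):
    """Convert a per mille value to a plain fraction (multiply by 10^-3)."""
    return value * 1e-3


def classify(value_per_mille, expected=EXPECTED_PER_MILLE):
    """Label a dipeptide relative to the uniform expectation."""
    if value_per_mille > expected:
        return "over"      # would be marked red
    if value_per_mille < expected:
        return "under"     # would be marked blue
    return "expected"


if __name__ == "__main__":
    # Values taken from the table: AlaAla, PhePhe, CysCys.
    for name, value in [("AlaAla", 10.47), ("PhePhe", 0.718), ("CysCys", 0.0)]:
        print(name, per_mille_to_fraction(value), classify(value))
```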

For more information, see the sequence space article on Wikipedia.

Ala Cys Asp Glu Phe Gly His Ile Lys Leu Met Asn Pro Gln Arg Ser Thr Val Trp Tyr Xaa
Ala
AlaAla: 10.47 ± 0.827
AlaCys: 0.718 ± 0.228
AlaAsp: 5.145 ± 0.514
AlaGlu: 6.641 ± 0.796
AlaPhe: 3.65 ± 0.486
AlaGly: 9.154 ± 0.951
AlaHis: 1.496 ± 0.285
AlaIle: 4.846 ± 0.557
AlaLys: 5.804 ± 0.601
AlaLeu: 9.034 ± 0.694
AlaMet: 2.752 ± 0.375
AlaAsn: 3.65 ± 0.495
AlaPro: 4.547 ± 0.599
AlaGln: 3.949 ± 0.601
AlaArg: 5.863 ± 0.606
AlaSer: 6.162 ± 0.507
AlaThr: 5.863 ± 0.784
AlaVal: 7.0 ± 0.647
AlaTrp: 1.316 ± 0.287
AlaTyr: 2.513 ± 0.392
AlaXaa: 0.0 ± 0.0
Cys
CysAla: 0.479 ± 0.162
CysCys: 0.0 ± 0.0
CysAsp: 0.598 ± 0.179
CysGlu: 0.538 ± 0.196
CysPhe: 0.359 ± 0.14
CysGly: 0.718 ± 0.197
CysHis: 0.359 ± 0.159
CysIle: 0.299 ± 0.138
CysLys: 0.299 ± 0.14
CysLeu: 0.538 ± 0.174
CysMet: 0.239 ± 0.133
CysAsn: 0.538 ± 0.162
CysPro: 0.538 ± 0.22
CysGln: 0.239 ± 0.108
CysArg: 0.658 ± 0.188
CysSer: 0.778 ± 0.213
CysThr: 0.359 ± 0.151
CysVal: 0.538 ± 0.167
CysTrp: 0.419 ± 0.163
CysTyr: 0.179 ± 0.103
CysXaa: 0.0 ± 0.0
Asp
AspAla: 6.342 ± 0.553
AspCys: 0.419 ± 0.174
AspAsp: 3.35 ± 0.421
AspGlu: 4.308 ± 0.621
AspPhe: 2.274 ± 0.361
AspGly: 5.445 ± 0.592
AspHis: 1.316 ± 0.328
AspIle: 3.051 ± 0.42
AspLys: 2.932 ± 0.427
AspLeu: 5.804 ± 0.704
AspMet: 1.017 ± 0.219
AspAsn: 1.436 ± 0.285
AspPro: 4.308 ± 0.638
AspGln: 2.154 ± 0.35
AspArg: 3.35 ± 0.421
AspSer: 3.65 ± 0.388
AspThr: 4.128 ± 0.562
AspVal: 3.949 ± 0.461
AspTrp: 1.197 ± 0.328
AspTyr: 2.453 ± 0.342
AspXaa: 0.0 ± 0.0
Glu
GluAla: 6.761 ± 0.682
GluCys: 0.12 ± 0.081
GluAsp: 4.128 ± 0.588
GluGlu: 4.966 ± 0.626
GluPhe: 2.692 ± 0.333
GluGly: 4.786 ± 0.548
GluHis: 1.615 ± 0.317
GluIle: 3.47 ± 0.467
GluLys: 2.513 ± 0.424
GluLeu: 5.983 ± 0.692
GluMet: 2.573 ± 0.428
GluAsn: 2.034 ± 0.393
GluPro: 2.872 ± 0.396
GluGln: 2.513 ± 0.309
GluArg: 3.889 ± 0.499
GluSer: 3.051 ± 0.449
GluThr: 3.35 ± 0.42
GluVal: 4.727 ± 0.437
GluTrp: 1.376 ± 0.34
GluTyr: 1.974 ± 0.283
GluXaa: 0.0 ± 0.0
Phe
PheAla: 3.35 ± 0.436
PheCys: 0.359 ± 0.132
PheAsp: 1.735 ± 0.329
PheGlu: 2.274 ± 0.387
PhePhe: 0.718 ± 0.209
PheGly: 3.171 ± 0.397
PheHis: 0.598 ± 0.189
PheIle: 1.974 ± 0.331
PheLys: 1.436 ± 0.279
PheLeu: 2.393 ± 0.429
PheMet: 0.598 ± 0.177
PheAsn: 1.855 ± 0.284
PhePro: 1.795 ± 0.281
PheGln: 0.957 ± 0.229
PheArg: 1.436 ± 0.309
PheSer: 2.274 ± 0.377
PheThr: 2.034 ± 0.338
PheVal: 1.974 ± 0.372
PheTrp: 0.359 ± 0.143
PheTyr: 1.017 ± 0.247
PheXaa: 0.0 ± 0.0
Gly
GlyAla: 8.077 ± 1.103
GlyCys: 1.017 ± 0.294
GlyAsp: 6.94 ± 0.88
GlyGlu: 4.966 ± 0.528
GlyPhe: 2.992 ± 0.448
GlyGly: 8.795 ± 1.215
GlyHis: 1.675 ± 0.328
GlyIle: 4.547 ± 0.592
GlyLys: 4.248 ± 0.54
GlyLeu: 7.239 ± 0.827
GlyMet: 2.692 ± 0.374
GlyAsn: 3.59 ± 0.565
GlyPro: 5.445 ± 1.585
GlyGln: 2.692 ± 0.473
GlyArg: 3.709 ± 0.438
GlySer: 4.667 ± 0.698
GlyThr: 5.385 ± 0.664
GlyVal: 6.162 ± 0.554
GlyTrp: 1.556 ± 0.284
GlyTyr: 2.333 ± 0.375
GlyXaa: 0.0 ± 0.0
His
HisAla: 1.615 ± 0.259
HisCys: 0.179 ± 0.091
HisAsp: 1.316 ± 0.332
HisGlu: 0.778 ± 0.224
HisPhe: 0.658 ± 0.176
HisGly: 1.615 ± 0.35
HisHis: 0.538 ± 0.197
HisIle: 0.838 ± 0.212
HisLys: 0.838 ± 0.237
HisLeu: 1.316 ± 0.329
HisMet: 0.359 ± 0.144
HisAsn: 0.718 ± 0.201
HisPro: 1.316 ± 0.267
HisGln: 1.017 ± 0.276
HisArg: 1.316 ± 0.306
HisSer: 0.359 ± 0.14
HisThr: 1.376 ± 0.26
HisVal: 1.137 ± 0.256
HisTrp: 0.479 ± 0.17
HisTyr: 0.778 ± 0.327
HisXaa: 0.0 ± 0.0
Ile
IleAla: 5.145 ± 0.503
IleCys: 0.419 ± 0.156
IleAsp: 4.068 ± 0.471
IleGlu: 3.949 ± 0.511
IlePhe: 1.735 ± 0.326
IleGly: 3.41 ± 0.429
IleHis: 0.957 ± 0.225
IleIle: 1.795 ± 0.242
IleLys: 2.633 ± 0.378
IleLeu: 3.769 ± 0.439
IleMet: 0.598 ± 0.171
IleAsn: 2.094 ± 0.334
IlePro: 3.829 ± 0.412
IleGln: 1.376 ± 0.265
IleArg: 3.889 ± 0.485
IleSer: 2.573 ± 0.334
IleThr: 3.111 ± 0.398
IleVal: 3.231 ± 0.414
IleTrp: 0.718 ± 0.25
IleTyr: 0.897 ± 0.264
IleXaa: 0.0 ± 0.0
Lys
LysAla: 4.906 ± 0.609
LysCys: 0.179 ± 0.119
LysAsp: 2.214 ± 0.225
LysGlu: 2.692 ± 0.46
LysPhe: 1.316 ± 0.302
LysGly: 4.547 ± 0.866
LysHis: 0.718 ± 0.19
LysIle: 2.274 ± 0.371
LysLys: 2.573 ± 0.421
LysLeu: 4.188 ± 0.454
LysMet: 1.017 ± 0.22
LysAsn: 1.077 ± 0.251
LysPro: 2.692 ± 0.458
LysGln: 1.915 ± 0.345
LysArg: 3.889 ± 0.59
LysSer: 2.214 ± 0.357
LysThr: 2.034 ± 0.315
LysVal: 4.368 ± 0.529
LysTrp: 0.897 ± 0.27
LysTyr: 1.316 ± 0.281
LysXaa: 0.0 ± 0.0
Leu
LeuAla: 9.573 ± 0.669
LeuCys: 1.077 ± 0.279
LeuAsp: 4.966 ± 0.529
LeuGlu: 5.325 ± 0.543
LeuPhe: 2.274 ± 0.292
LeuGly: 6.701 ± 0.843
LeuHis: 1.197 ± 0.312
LeuIle: 4.188 ± 0.562
LeuLys: 3.231 ± 0.496
LeuLeu: 4.607 ± 0.528
LeuMet: 2.573 ± 0.35
LeuAsn: 2.633 ± 0.408
LeuPro: 4.906 ± 0.541
LeuGln: 2.692 ± 0.493
LeuArg: 5.325 ± 0.636
LeuSer: 4.667 ± 0.47
LeuThr: 5.504 ± 0.661
LeuVal: 4.547 ± 0.553
LeuTrp: 1.795 ± 0.282
LeuTyr: 1.735 ± 0.367
LeuXaa: 0.0 ± 0.0
Met
MetAla: 2.932 ± 0.374
MetCys: 0.12 ± 0.092
MetAsp: 0.897 ± 0.214
MetGlu: 2.154 ± 0.407
MetPhe: 0.538 ± 0.156
MetGly: 2.214 ± 0.406
MetHis: 0.299 ± 0.13
MetIle: 1.316 ± 0.264
MetLys: 1.436 ± 0.341
MetLeu: 1.675 ± 0.302
MetMet: 0.838 ± 0.205
MetAsn: 0.419 ± 0.174
MetPro: 1.137 ± 0.235
MetGln: 1.197 ± 0.215
MetArg: 1.735 ± 0.366
MetSer: 2.154 ± 0.355
MetThr: 2.214 ± 0.351
MetVal: 1.316 ± 0.293
MetTrp: 0.299 ± 0.124
MetTyr: 0.658 ± 0.217
MetXaa: 0.0 ± 0.0
Asn
AsnAla: 3.889 ± 0.505
AsnCys: 0.419 ± 0.171
AsnAsp: 2.034 ± 0.302
AsnGlu: 1.675 ± 0.362
AsnPhe: 1.137 ± 0.311
AsnGly: 3.41 ± 0.655
AsnHis: 0.897 ± 0.223
AsnIle: 1.556 ± 0.275
AsnLys: 0.957 ± 0.222
AsnLeu: 2.752 ± 0.357
AsnMet: 0.658 ± 0.163
AsnAsn: 0.598 ± 0.208
AsnPro: 2.333 ± 0.354
AsnGln: 1.077 ± 0.305
AsnArg: 1.915 ± 0.353
AsnSer: 1.735 ± 0.34
AsnThr: 1.496 ± 0.238
AsnVal: 2.573 ± 0.359
AsnTrp: 0.778 ± 0.214
AsnTyr: 0.778 ± 0.193
AsnXaa: 0.0 ± 0.0
Pro
ProAla: 5.684 ± 0.524
ProCys: 0.419 ± 0.175
ProAsp: 3.769 ± 0.571
ProGlu: 4.248 ± 0.564
ProPhe: 1.795 ± 0.37
ProGly: 5.385 ± 0.537
ProHis: 0.897 ± 0.243
ProIle: 2.573 ± 0.354
ProLys: 2.692 ± 0.717
ProLeu: 3.35 ± 0.428
ProMet: 1.256 ± 0.263
ProAsn: 2.094 ± 0.337
ProPro: 2.214 ± 0.366
ProGln: 2.633 ± 0.527
ProArg: 2.812 ± 0.519
ProSer: 2.812 ± 0.36
ProThr: 3.889 ± 0.465
ProVal: 4.427 ± 0.541
ProTrp: 0.838 ± 0.318
ProTyr: 1.436 ± 0.258
ProXaa: 0.0 ± 0.0
Gln
GlnAla: 4.068 ± 0.614
GlnCys: 0.179 ± 0.106
GlnAsp: 1.256 ± 0.249
GlnGlu: 2.393 ± 0.449
GlnPhe: 1.137 ± 0.252
GlnGly: 4.009 ± 0.88
GlnHis: 0.838 ± 0.234
GlnIle: 2.333 ± 0.366
GlnLys: 1.256 ± 0.323
GlnLeu: 3.231 ± 0.619
GlnMet: 0.957 ± 0.317
GlnAsn: 1.197 ± 0.272
GlnPro: 1.496 ± 0.348
GlnGln: 2.094 ± 0.334
GlnArg: 2.393 ± 0.354
GlnSer: 1.974 ± 0.344
GlnThr: 1.974 ± 0.383
GlnVal: 2.333 ± 0.401
GlnTrp: 0.718 ± 0.191
GlnTyr: 1.376 ± 0.281
GlnXaa: 0.0 ± 0.0
Arg
ArgAla: 5.205 ± 0.543
ArgCys: 0.718 ± 0.24
ArgAsp: 3.889 ± 0.494
ArgGlu: 4.846 ± 0.603
ArgPhe: 2.333 ± 0.401
ArgGly: 4.248 ± 0.54
ArgHis: 1.496 ± 0.409
ArgIle: 3.709 ± 0.435
ArgLys: 3.231 ± 0.474
ArgLeu: 5.385 ± 0.583
ArgMet: 1.615 ± 0.313
ArgAsn: 1.735 ± 0.311
ArgPro: 2.872 ± 0.313
ArgGln: 2.633 ± 0.387
ArgArg: 5.026 ± 0.71
ArgSer: 3.41 ± 0.423
ArgThr: 2.752 ± 0.448
ArgVal: 4.308 ± 0.476
ArgTrp: 1.077 ± 0.228
ArgTyr: 1.496 ± 0.32
ArgXaa: 0.0 ± 0.0
Ser
SerAla: 5.145 ± 0.561
SerCys: 0.479 ± 0.176
SerAsp: 3.709 ± 0.396
SerGlu: 3.65 ± 0.512
SerPhe: 1.615 ± 0.371
SerGly: 5.684 ± 0.776
SerHis: 1.077 ± 0.26
SerIle: 2.633 ± 0.403
SerLys: 2.333 ± 0.436
SerLeu: 4.009 ± 0.556
SerMet: 1.436 ± 0.234
SerAsn: 1.017 ± 0.21
SerPro: 2.513 ± 0.448
SerGln: 2.034 ± 0.364
SerArg: 4.487 ± 0.534
SerSer: 3.59 ± 0.515
SerThr: 3.171 ± 0.399
SerVal: 3.65 ± 0.415
SerTrp: 1.197 ± 0.281
SerTyr: 1.915 ± 0.323
SerXaa: 0.0 ± 0.0
Thr
ThrAla: 6.162 ± 0.514
ThrCys: 0.479 ± 0.138
ThrAsp: 3.949 ± 0.485
ThrGlu: 2.573 ± 0.385
ThrPhe: 1.735 ± 0.287
ThrGly: 6.88 ± 1.079
ThrHis: 1.137 ± 0.242
ThrIle: 2.453 ± 0.375
ThrLys: 3.291 ± 0.504
ThrLeu: 5.325 ± 0.574
ThrMet: 1.675 ± 0.4
ThrAsn: 1.615 ± 0.293
ThrPro: 3.829 ± 0.451
ThrGln: 1.735 ± 0.278
ThrArg: 3.231 ± 0.38
ThrSer: 2.752 ± 0.482
ThrThr: 3.231 ± 0.435
ThrVal: 4.846 ± 0.44
ThrTrp: 1.077 ± 0.284
ThrTyr: 2.274 ± 0.3
ThrXaa: 0.0 ± 0.0
Val
ValAla: 6.641 ± 0.679
ValCys: 0.658 ± 0.2
ValAsp: 5.624 ± 0.646
ValGlu: 4.427 ± 0.574
ValPhe: 2.154 ± 0.377
ValGly: 5.325 ± 0.522
ValHis: 0.838 ± 0.242
ValIle: 3.829 ± 0.524
ValLys: 3.949 ± 0.498
ValLeu: 5.145 ± 0.559
ValMet: 1.197 ± 0.254
ValAsn: 2.752 ± 0.401
ValPro: 3.47 ± 0.498
ValGln: 2.214 ± 0.311
ValArg: 4.128 ± 0.441
ValSer: 3.59 ± 0.467
ValThr: 5.205 ± 0.52
ValVal: 5.385 ± 0.541
ValTrp: 1.735 ± 0.375
ValTyr: 2.034 ± 0.341
ValXaa: 0.0 ± 0.0
Trp
TrpAla: 1.496 ± 0.395
TrpCys: 0.299 ± 0.155
TrpAsp: 1.376 ± 0.259
TrpGlu: 1.316 ± 0.265
TrpPhe: 0.359 ± 0.146
TrpGly: 1.316 ± 0.33
TrpHis: 0.419 ± 0.171
TrpIle: 1.137 ± 0.228
TrpLys: 0.658 ± 0.211
TrpLeu: 1.017 ± 0.266
TrpMet: 0.778 ± 0.205
TrpAsn: 0.658 ± 0.198
TrpPro: 1.017 ± 0.245
TrpGln: 1.017 ± 0.217
TrpArg: 0.838 ± 0.211
TrpSer: 1.256 ± 0.254
TrpThr: 1.556 ± 0.258
TrpVal: 1.496 ± 0.238
TrpTrp: 0.479 ± 0.207
TrpTyr: 0.419 ± 0.151
TrpXaa: 0.0 ± 0.0
Tyr
TyrAla: 2.692 ± 0.368
TyrCys: 0.419 ± 0.164
TyrAsp: 1.915 ± 0.336
TyrGlu: 1.675 ± 0.364
TyrPhe: 0.838 ± 0.201
TyrGly: 1.915 ± 0.352
TyrHis: 0.179 ± 0.098
TyrIle: 1.376 ± 0.237
TyrLys: 0.718 ± 0.234
TyrLeu: 2.752 ± 0.328
TyrMet: 0.658 ± 0.192
TyrAsn: 0.957 ± 0.248
TyrPro: 2.094 ± 0.335
TyrGln: 1.077 ± 0.343
TyrArg: 2.094 ± 0.394
TyrSer: 1.675 ± 0.266
TyrThr: 1.615 ± 0.28
TyrVal: 2.333 ± 0.365
TyrTrp: 0.598 ± 0.183
TyrTyr: 1.137 ± 0.306
TyrXaa: 0.0 ± 0.0
Xaa
XaaAla: 0.0 ± 0.0
XaaCys: 0.0 ± 0.0
XaaAsp: 0.0 ± 0.0
XaaGlu: 0.0 ± 0.0
XaaPhe: 0.0 ± 0.0
XaaGly: 0.0 ± 0.0
XaaHis: 0.0 ± 0.0
XaaIle: 0.0 ± 0.0
XaaLys: 0.0 ± 0.0
XaaLeu: 0.0 ± 0.0
XaaMet: 0.0 ± 0.0
XaaAsn: 0.0 ± 0.0
XaaPro: 0.0 ± 0.0
XaaGln: 0.0 ± 0.0
XaaArg: 0.0 ± 0.0
XaaSer: 0.0 ± 0.0
XaaThr: 0.0 ± 0.0
XaaVal: 0.0 ± 0.0
XaaTrp: 0.0 ± 0.0
XaaTyr: 0.0 ± 0.0
XaaXaa: 0.0 ± 0.0
Statistics based on 99 proteins (16715 amino acids)

Note: The error has been estimated by bootstrapping (100 replicates) at the protein level.
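A protein-level bootstrap of this kind can be sketched as follows. This is only an illustration under stated assumptions, not the Proteome-pI code: the function names are hypothetical, sequences use one-letter codes, whole proteins are resampled with replacement, and the reported error is assumed to be the sample standard deviation of the per mille frequency across replicates.

```python
import random
import statistics
from collections import Counter


def pair_per_mille(proteins, pair):
    """Per mille frequency of one dipeptide among all adjacent pairs."""
    counts = Counter()
    for seq in proteins:
        for i in range(len(seq) - 1):
            counts[seq[i:i + 2]] += 1
    total = sum(counts.values())
    return 1000.0 * counts[pair] / total if total else 0.0


def bootstrap_error(proteins, pair, replicates=100, seed=0):
    """Standard deviation of the frequency over protein-level resamples.

    Each replicate draws len(proteins) whole proteins with replacement,
    so the pairs inside a protein always stay together.
    """
    rng = random.Random(seed)  # fixed seed only to make the sketch repeatable
    estimates = []
    for _ in range(replicates):
        sample = rng.choices(proteins, k=len(proteins))
        estimates.append(pair_per_mille(sample, pair))
    return statistics.stdev(estimates)


if __name__ == "__main__":
    toy_proteome = ["AAGL", "GAAV", "LLGA", "AVAA"]  # toy data only
    print(pair_per_mille(toy_proteome, "AA"), bootstrap_error(toy_proteome, "AA"))
```

Resampling whole proteins rather than individual dipeptides keeps the dependence between pairs within a protein, which is one plausible reading of "at the protein level".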

The dipeptide statistics above (among other statistics for this proteome) can be downloaded from this CSV file
See this proteome in: uniprot_link
Proteome-pI is available under Creative Commons Attribution-NoDerivs license, for more details see here

Reference: Kozlowski LP. Proteome-pI 2.0: Proteome Isoelectric Point Database Update. Nucleic Acids Res. 2021, doi: 10.1093/nar/gkab944
Contact: Lukasz P. Kozlowski