Amino acid dipeptide frequency for Mycobacterium phage Heffalump

Apart from single amino acid frequencies, one can also calculate the so-called amino acid dipeptide frequency.

There are 400 possibilities (441 if we consider Xaa as an additional 21st amino acid representing all non-standard or unknown amino acids). Thus, if all 400 dipeptides were equally likely in proteins, each amino acid dipeptide would be present with a frequency of 0.25% (2.5‰). As this is not the case in nature, for better visibility all dipeptides that are more frequent than expected are marked in red, and those that are underrepresented are marked in blue in the table.
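The counting behind such a table can be sketched as follows. This is a minimal illustration, not the Proteome-pI implementation: the function name `dipeptide_per_mille`, the pooling of overlapping residue pairs across all proteins, and the normalization by the total number of dipeptides are assumptions made for the example.

```python
from collections import Counter
from itertools import product

STANDARD = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard amino acids (one-letter codes)

def dipeptide_per_mille(proteins):
    """Return the frequency (in per mille) of all 441 dipeptides, Xaa included.

    Assumption: overlapping pairs are counted within each protein and pooled
    over the whole proteome; pairs never span two proteins.
    """
    counts = Counter()
    for seq in proteins:
        # Map every non-standard or unknown residue to X (Xaa).
        seq = "".join(aa if aa in STANDARD else "X" for aa in seq.upper())
        counts.update(seq[i:i + 2] for i in range(len(seq) - 1))
    total = sum(counts.values())
    alphabet = STANDARD + "X"
    return {
        a + b: (1000.0 * counts[a + b] / total if total else 0.0)
        for a, b in product(alphabet, repeat=2)
    }

# Expectation if all 400 standard dipeptides were equally likely: 0.25% = 2.5 per mille.
UNIFORM_EXPECTATION = 1000.0 / 400

toy = ["MAAGL", "MAAK"]  # hypothetical toy sequences, not taken from the phage proteome
freqs = dipeptide_per_mille(toy)
```

On the toy input there are seven dipeptides in total, two of which are AA, so `freqs["AA"]` is 2/7 expressed in per mille.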

All values are given in per mille (‰), so they need to be multiplied by 10^-3 to obtain fractions.
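As a small worked example, the sketch below converts table values from per mille to fractions and compares them with the uniform expectation of 2.5‰. The helper names and the plain greater-than or less-than rule are assumptions for illustration; the page does not state the exact rule used for the red and blue marking.

```python
def per_mille_to_fraction(value):
    """Convert a per mille value to a plain fraction (multiply by 10^-3)."""
    return value * 1e-3

def classify(value, expected=2.5):
    """Label a per mille value relative to the uniform expectation (assumed rule)."""
    if value > expected:
        return "over"
    if value < expected:
        return "under"
    return "expected"

# Values copied from the table on this page.
examples = {"AlaAla": 11.174, "CysCys": 0.06, "ProPro": 2.718}
labels = {name: classify(v) for name, v in examples.items()}
fraction_alaala = per_mille_to_fraction(examples["AlaAla"])  # about 0.011174
```

With this rule AlaAla and ProPro come out as overrepresented and CysCys as underrepresented.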

For more information, see the sequence space article on Wikipedia.

Ala Cys Asp Glu Phe Gly His Ile Lys Leu Met Asn Pro Gln Arg Ser Thr Val Trp Tyr Xaa
Ala
AlaAla: 11.174 ± 1.106
AlaCys: 0.483 ± 0.165
AlaAsp: 5.678 ± 0.604
AlaGlu: 7.671 ± 0.738
AlaPhe: 3.805 ± 0.56
AlaGly: 7.49 ± 0.818
AlaHis: 1.51 ± 0.27
AlaIle: 4.47 ± 0.482
AlaLys: 5.496 ± 0.518
AlaLeu: 8.819 ± 0.977
AlaMet: 2.174 ± 0.355
AlaAsn: 3.02 ± 0.473
AlaPro: 4.47 ± 0.642
AlaGln: 4.107 ± 0.609
AlaArg: 6.403 ± 0.727
AlaSer: 4.832 ± 0.417
AlaThr: 5.255 ± 0.521
AlaVal: 8.154 ± 0.575
AlaTrp: 1.812 ± 0.369
AlaTyr: 2.778 ± 0.383
AlaXaa: 0.0 ± 0.0
Cys
CysAla: 0.664 ± 0.207
CysCys: 0.06 ± 0.07
CysAsp: 0.785 ± 0.191
CysGlu: 0.423 ± 0.16
CysPhe: 0.302 ± 0.14
CysGly: 0.785 ± 0.206
CysHis: 0.242 ± 0.126
CysIle: 0.362 ± 0.155
CysLys: 0.362 ± 0.142
CysLeu: 0.966 ± 0.225
CysMet: 0.121 ± 0.081
CysAsn: 0.483 ± 0.176
CysPro: 0.544 ± 0.224
CysGln: 0.181 ± 0.102
CysArg: 0.604 ± 0.193
CysSer: 0.483 ± 0.174
CysThr: 0.604 ± 0.152
CysVal: 0.604 ± 0.211
CysTrp: 0.302 ± 0.155
CysTyr: 0.423 ± 0.158
CysXaa: 0.0 ± 0.0
Asp
AspAla: 6.886 ± 0.846
AspCys: 0.906 ± 0.25
AspAsp: 4.047 ± 0.469
AspGlu: 4.711 ± 0.629
AspPhe: 2.597 ± 0.52
AspGly: 5.496 ± 0.54
AspHis: 1.51 ± 0.355
AspIle: 3.805 ± 0.532
AspLys: 2.114 ± 0.324
AspLeu: 5.496 ± 0.612
AspMet: 1.51 ± 0.296
AspAsn: 1.389 ± 0.323
AspPro: 4.53 ± 0.561
AspGln: 1.45 ± 0.282
AspArg: 2.899 ± 0.389
AspSer: 2.839 ± 0.435
AspThr: 4.168 ± 0.568
AspVal: 3.926 ± 0.47
AspTrp: 1.268 ± 0.313
AspTyr: 2.174 ± 0.308
AspXaa: 0.0 ± 0.0
Glu
GluAla: 7.067 ± 0.802
GluCys: 0.302 ± 0.12
GluAsp: 4.651 ± 0.627
GluGlu: 4.772 ± 0.589
GluPhe: 2.416 ± 0.428
GluGly: 4.711 ± 0.769
GluHis: 1.872 ± 0.404
GluIle: 3.866 ± 0.436
GluLys: 2.235 ± 0.315
GluLeu: 7.55 ± 0.717
GluMet: 2.537 ± 0.344
GluAsn: 1.993 ± 0.369
GluPro: 2.96 ± 0.47
GluGln: 1.993 ± 0.306
GluArg: 4.47 ± 0.554
GluSer: 3.322 ± 0.47
GluThr: 3.986 ± 0.462
GluVal: 4.168 ± 0.518
GluTrp: 1.329 ± 0.251
GluTyr: 2.235 ± 0.358
GluXaa: 0.0 ± 0.0
Phe
PheAla: 2.839 ± 0.487
PheCys: 0.302 ± 0.149
PheAsp: 2.114 ± 0.481
PheGlu: 2.174 ± 0.28
PhePhe: 0.725 ± 0.27
PheGly: 3.443 ± 0.469
PheHis: 0.966 ± 0.325
PheIle: 1.57 ± 0.338
PheLys: 1.57 ± 0.323
PheLeu: 2.356 ± 0.528
PheMet: 0.362 ± 0.136
PheAsn: 1.631 ± 0.289
PhePro: 1.752 ± 0.322
PheGln: 1.208 ± 0.264
PheArg: 2.235 ± 0.366
PheSer: 1.631 ± 0.296
PheThr: 2.295 ± 0.319
PheVal: 2.054 ± 0.383
PheTrp: 0.423 ± 0.167
PheTyr: 0.725 ± 0.175
PheXaa: 0.0 ± 0.0
Gly
GlyAla: 6.584 ± 0.696
GlyCys: 0.725 ± 0.193
GlyAsp: 5.98 ± 0.808
GlyGlu: 4.651 ± 0.564
GlyPhe: 3.141 ± 0.415
GlyGly: 9.785 ± 2.25
GlyHis: 1.389 ± 0.262
GlyIle: 4.168 ± 0.521
GlyLys: 4.168 ± 0.433
GlyLeu: 6.765 ± 0.733
GlyMet: 2.054 ± 0.302
GlyAsn: 3.684 ± 0.455
GlyPro: 5.436 ± 1.938
GlyGln: 3.624 ± 0.531
GlyArg: 3.684 ± 0.445
GlySer: 4.47 ± 0.828
GlyThr: 4.832 ± 0.486
GlyVal: 6.463 ± 0.701
GlyTrp: 1.812 ± 0.339
GlyTyr: 2.597 ± 0.395
GlyXaa: 0.0 ± 0.0
His
HisAla: 1.268 ± 0.297
HisCys: 0.302 ± 0.134
HisAsp: 1.51 ± 0.339
HisGlu: 1.208 ± 0.272
HisPhe: 0.362 ± 0.151
HisGly: 2.114 ± 0.391
HisHis: 0.423 ± 0.162
HisIle: 1.087 ± 0.268
HisLys: 1.027 ± 0.258
HisLeu: 1.691 ± 0.314
HisMet: 0.242 ± 0.123
HisAsn: 0.423 ± 0.148
HisPro: 1.208 ± 0.231
HisGln: 0.725 ± 0.185
HisArg: 1.268 ± 0.338
HisSer: 0.966 ± 0.245
HisThr: 0.906 ± 0.22
HisVal: 1.389 ± 0.33
HisTrp: 0.362 ± 0.142
HisTyr: 0.846 ± 0.268
HisXaa: 0.0 ± 0.0
Ile
IleAla: 5.376 ± 0.591
IleCys: 0.604 ± 0.182
IleAsp: 3.503 ± 0.52
IleGlu: 4.53 ± 0.568
IlePhe: 1.087 ± 0.221
IleGly: 4.53 ± 0.559
IleHis: 0.846 ± 0.173
IleIle: 1.51 ± 0.288
IleLys: 2.174 ± 0.327
IleLeu: 3.624 ± 0.47
IleMet: 0.604 ± 0.169
IleAsn: 2.537 ± 0.484
IlePro: 3.322 ± 0.438
IleGln: 1.45 ± 0.373
IleArg: 3.322 ± 0.507
IleSer: 2.658 ± 0.421
IleThr: 2.658 ± 0.365
IleVal: 3.201 ± 0.451
IleTrp: 0.544 ± 0.163
IleTyr: 1.148 ± 0.23
IleXaa: 0.0 ± 0.0
Lys
LysAla: 5.617 ± 0.681
LysCys: 0.302 ± 0.17
LysAsp: 2.597 ± 0.439
LysGlu: 2.416 ± 0.35
LysPhe: 0.664 ± 0.173
LysGly: 3.564 ± 0.515
LysHis: 0.966 ± 0.23
LysIle: 2.114 ± 0.444
LysLys: 3.02 ± 0.525
LysLeu: 3.986 ± 0.449
LysMet: 1.268 ± 0.366
LysAsn: 1.57 ± 0.36
LysPro: 2.416 ± 0.454
LysGln: 1.45 ± 0.276
LysArg: 3.262 ± 0.538
LysSer: 2.416 ± 0.418
LysThr: 2.476 ± 0.348
LysVal: 3.986 ± 0.493
LysTrp: 0.906 ± 0.243
LysTyr: 1.45 ± 0.32
LysXaa: 0.0 ± 0.0
Leu
LeuAla: 8.517 ± 0.79
LeuCys: 0.785 ± 0.218
LeuAsp: 5.194 ± 0.568
LeuGlu: 5.376 ± 0.673
LeuPhe: 2.356 ± 0.368
LeuGly: 5.799 ± 0.731
LeuHis: 1.631 ± 0.313
LeuIle: 3.805 ± 0.475
LeuLys: 3.564 ± 0.514
LeuLeu: 5.376 ± 0.703
LeuMet: 3.02 ± 0.401
LeuAsn: 2.537 ± 0.374
LeuPro: 5.013 ± 0.559
LeuGln: 2.658 ± 0.509
LeuArg: 6.282 ± 0.601
LeuSer: 5.919 ± 0.926
LeuThr: 5.919 ± 0.668
LeuVal: 4.953 ± 0.725
LeuTrp: 1.389 ± 0.299
LeuTyr: 1.993 ± 0.393
LeuXaa: 0.0 ± 0.0
Met
MetAla: 3.141 ± 0.404
MetCys: 0.121 ± 0.087
MetAsp: 0.966 ± 0.202
MetGlu: 1.087 ± 0.241
MetPhe: 0.664 ± 0.181
MetGly: 1.691 ± 0.337
MetHis: 0.423 ± 0.158
MetIle: 1.268 ± 0.345
MetLys: 1.45 ± 0.328
MetLeu: 1.812 ± 0.292
MetMet: 0.664 ± 0.182
MetAsn: 0.544 ± 0.2
MetPro: 1.329 ± 0.27
MetGln: 1.148 ± 0.256
MetArg: 1.631 ± 0.27
MetSer: 1.812 ± 0.316
MetThr: 2.356 ± 0.349
MetVal: 1.027 ± 0.301
MetTrp: 0.242 ± 0.127
MetTyr: 0.604 ± 0.152
MetXaa: 0.0 ± 0.0
Asn
AsnAla: 3.201 ± 0.426
AsnCys: 0.544 ± 0.188
AsnAsp: 1.933 ± 0.326
AsnGlu: 2.114 ± 0.353
AsnPhe: 0.725 ± 0.183
AsnGly: 3.564 ± 0.54
AsnHis: 1.148 ± 0.215
AsnIle: 1.57 ± 0.286
AsnLys: 1.329 ± 0.352
AsnLeu: 3.262 ± 0.407
AsnMet: 0.544 ± 0.172
AsnAsn: 0.544 ± 0.16
AsnPro: 2.356 ± 0.421
AsnGln: 0.785 ± 0.24
AsnArg: 2.114 ± 0.353
AsnSer: 1.51 ± 0.301
AsnThr: 1.51 ± 0.281
AsnVal: 2.295 ± 0.381
AsnTrp: 0.725 ± 0.174
AsnTyr: 0.846 ± 0.223
AsnXaa: 0.0 ± 0.0
Pro
ProAla: 5.436 ± 0.58
ProCys: 0.362 ± 0.159
ProAsp: 3.443 ± 0.509
ProGlu: 4.953 ± 0.625
ProPhe: 1.933 ± 0.37
ProGly: 4.651 ± 0.505
ProHis: 1.027 ± 0.222
ProIle: 2.597 ± 0.394
ProLys: 2.235 ± 0.403
ProLeu: 3.201 ± 0.541
ProMet: 1.45 ± 0.242
ProAsn: 1.993 ± 0.379
ProPro: 2.718 ± 0.446
ProGln: 2.476 ± 0.961
ProArg: 3.805 ± 0.519
ProSer: 2.416 ± 0.375
ProThr: 3.684 ± 0.4
ProVal: 4.228 ± 0.557
ProTrp: 1.268 ± 0.408
ProTyr: 1.57 ± 0.342
ProXaa: 0.0 ± 0.0
Gln
GlnAla: 4.228 ± 0.454
GlnCys: 0.242 ± 0.12
GlnAsp: 1.208 ± 0.301
GlnGlu: 1.389 ± 0.358
GlnPhe: 1.148 ± 0.229
GlnGly: 4.107 ± 1.395
GlnHis: 0.604 ± 0.214
GlnIle: 2.476 ± 0.423
GlnLys: 1.57 ± 0.403
GlnLeu: 2.899 ± 0.668
GlnMet: 0.664 ± 0.19
GlnAsn: 0.544 ± 0.169
GlnPro: 1.268 ± 0.386
GlnGln: 1.993 ± 0.386
GlnArg: 2.356 ± 0.459
GlnSer: 2.235 ± 0.344
GlnThr: 2.235 ± 0.398
GlnVal: 2.537 ± 0.361
GlnTrp: 1.027 ± 0.325
GlnTyr: 0.785 ± 0.188
GlnXaa: 0.0 ± 0.0
Arg
ArgAla: 5.134 ± 0.667
ArgCys: 0.785 ± 0.214
ArgAsp: 4.409 ± 0.526
ArgGlu: 4.953 ± 0.676
ArgPhe: 2.416 ± 0.504
ArgGly: 4.651 ± 0.695
ArgHis: 1.389 ± 0.284
ArgIle: 3.564 ± 0.548
ArgLys: 3.262 ± 0.53
ArgLeu: 5.255 ± 0.571
ArgMet: 1.631 ± 0.288
ArgAsn: 2.718 ± 0.436
ArgPro: 2.537 ± 0.36
ArgGln: 1.872 ± 0.35
ArgArg: 5.134 ± 0.641
ArgSer: 3.262 ± 0.496
ArgThr: 3.503 ± 0.361
ArgVal: 4.409 ± 0.613
ArgTrp: 1.148 ± 0.278
ArgTyr: 1.51 ± 0.31
ArgXaa: 0.0 ± 0.0
Ser
SerAla: 4.953 ± 0.577
SerCys: 0.362 ± 0.172
SerAsp: 3.262 ± 0.492
SerGlu: 3.624 ± 0.466
SerPhe: 2.295 ± 0.382
SerGly: 5.074 ± 0.79
SerHis: 0.846 ± 0.211
SerIle: 2.537 ± 0.367
SerLys: 2.537 ± 0.466
SerLeu: 4.651 ± 0.635
SerMet: 1.329 ± 0.29
SerAsn: 0.966 ± 0.238
SerPro: 3.443 ± 0.385
SerGln: 1.993 ± 0.245
SerArg: 3.503 ± 0.382
SerSer: 2.295 ± 0.441
SerThr: 3.201 ± 0.445
SerVal: 3.382 ± 0.459
SerTrp: 0.966 ± 0.227
SerTyr: 1.45 ± 0.29
SerXaa: 0.0 ± 0.0
Thr
ThrAla: 5.98 ± 0.52
ThrCys: 0.544 ± 0.164
ThrAsp: 3.745 ± 0.558
ThrGlu: 4.228 ± 0.525
ThrPhe: 1.933 ± 0.356
ThrGly: 6.04 ± 0.914
ThrHis: 0.906 ± 0.246
ThrIle: 2.597 ± 0.404
ThrLys: 3.08 ± 0.494
ThrLeu: 5.255 ± 0.671
ThrMet: 1.208 ± 0.314
ThrAsn: 1.57 ± 0.301
ThrPro: 4.47 ± 0.554
ThrGln: 2.114 ± 0.341
ThrArg: 2.537 ± 0.426
ThrSer: 3.141 ± 0.525
ThrThr: 3.382 ± 0.528
ThrVal: 5.194 ± 0.672
ThrTrp: 1.389 ± 0.249
ThrTyr: 2.054 ± 0.365
ThrXaa: 0.0 ± 0.0
Val
ValAla: 6.523 ± 0.733
ValCys: 0.846 ± 0.186
ValAsp: 5.315 ± 0.586
ValGlu: 4.47 ± 0.587
ValPhe: 2.476 ± 0.409
ValGly: 5.255 ± 0.527
ValHis: 0.906 ± 0.238
ValIle: 3.624 ± 0.54
ValLys: 3.805 ± 0.42
ValLeu: 5.134 ± 0.684
ValMet: 1.329 ± 0.323
ValAsn: 2.718 ± 0.445
ValPro: 3.02 ± 0.567
ValGln: 2.476 ± 0.389
ValArg: 4.59 ± 0.524
ValSer: 4.047 ± 0.615
ValThr: 5.134 ± 0.531
ValVal: 4.772 ± 0.477
ValTrp: 1.389 ± 0.312
ValTyr: 2.356 ± 0.373
ValXaa: 0.0 ± 0.0
Trp
TrpAla: 1.812 ± 0.379
TrpCys: 0.362 ± 0.15
TrpAsp: 1.389 ± 0.315
TrpGlu: 1.389 ± 0.249
TrpPhe: 0.483 ± 0.202
TrpGly: 1.148 ± 0.244
TrpHis: 0.423 ± 0.164
TrpIle: 1.087 ± 0.234
TrpLys: 0.725 ± 0.224
TrpLeu: 1.208 ± 0.273
TrpMet: 0.423 ± 0.157
TrpAsn: 0.846 ± 0.255
TrpPro: 0.966 ± 0.27
TrpGln: 0.906 ± 0.229
TrpArg: 1.268 ± 0.233
TrpSer: 1.027 ± 0.25
TrpThr: 1.45 ± 0.292
TrpVal: 1.268 ± 0.264
TrpTrp: 0.483 ± 0.166
TrpTyr: 0.604 ± 0.197
TrpXaa: 0.0 ± 0.0
Tyr
TyrAla: 3.201 ± 0.443
TyrCys: 0.362 ± 0.161
TyrAsp: 1.993 ± 0.3
TyrGlu: 2.295 ± 0.46
TyrPhe: 0.846 ± 0.205
TyrGly: 2.114 ± 0.308
TyrHis: 0.302 ± 0.123
TyrIle: 1.148 ± 0.227
TyrLys: 0.785 ± 0.215
TyrLeu: 2.778 ± 0.421
TyrMet: 0.785 ± 0.227
TyrAsn: 0.906 ± 0.256
TyrPro: 1.691 ± 0.385
TyrGln: 0.906 ± 0.225
TyrArg: 2.174 ± 0.379
TyrSer: 1.389 ± 0.281
TyrThr: 1.812 ± 0.291
TyrVal: 2.174 ± 0.389
TyrTrp: 0.483 ± 0.202
TyrTyr: 0.725 ± 0.214
TyrXaa: 0.0 ± 0.0
Xaa
XaaAla: 0.0 ± 0.0
XaaCys: 0.0 ± 0.0
XaaAsp: 0.0 ± 0.0
XaaGlu: 0.0 ± 0.0
XaaPhe: 0.0 ± 0.0
XaaGly: 0.0 ± 0.0
XaaHis: 0.0 ± 0.0
XaaIle: 0.0 ± 0.0
XaaLys: 0.0 ± 0.0
XaaLeu: 0.0 ± 0.0
XaaMet: 0.0 ± 0.0
XaaAsn: 0.0 ± 0.0
XaaPro: 0.0 ± 0.0
XaaGln: 0.0 ± 0.0
XaaArg: 0.0 ± 0.0
XaaSer: 0.0 ± 0.0
XaaThr: 0.0 ± 0.0
XaaVal: 0.0 ± 0.0
XaaTrp: 0.0 ± 0.0
XaaTyr: 0.0 ± 0.0
XaaXaa: 0.0 ± 0.0
Statistics based on 92 proteins (16557 amino acids)
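For readers who want to work with the table as plain text, each entry pairs a dipeptide label with a value and an error, so it can be read with a regular expression. The pattern and the `parse_row` helper below are illustrative assumptions; they also tolerate a repeated numeric prefix before the label, which can appear when the page is copied as text.

```python
import re

# Optional numeric prefix, two three-letter residue codes, then "value ± error".
ROW = re.compile(
    r"^\s*[\d.]*\s*([A-Z][a-z]{2})([A-Z][a-z]{2}):\s*([\d.]+)\s*±\s*([\d.]+)\s*$"
)

def parse_row(line):
    """Return (first, second, value, error) or None if the line is not a data row."""
    match = ROW.match(line)
    if match is None:
        return None
    first, second, value, error = match.groups()
    return first, second, float(value), float(error)

lines = ["Ala", "AlaAla: 11.174 ± 1.106", "11.174AlaAla: 11.174 ± 1.106", "CysCys: 0.06 ± 0.07"]
rows = [r for r in map(parse_row, lines) if r is not None]
```

Row labels such as `Ala` do not match the pattern and are skipped, so only the dipeptide entries end up in `rows`.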

Note: The error has been estimated by bootstrapping (x100) at the protein level.
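Bootstrapping at the protein level can be sketched as follows: whole proteins are resampled with replacement, the dipeptide frequency is recomputed for each resample, and the spread of the resulting estimates is reported. The function names, the use of the standard deviation as the error measure, and the fixed seed are assumptions for this illustration; the page only states that 100 bootstrap rounds at the protein level were used.

```python
import random
import statistics

def dipeptide_freq(proteins, dipeptide):
    """Frequency (per mille) of one dipeptide among all overlapping pairs."""
    count = total = 0
    for seq in proteins:
        for i in range(len(seq) - 1):
            total += 1
            if seq[i:i + 2] == dipeptide:
                count += 1
    return 1000.0 * count / total if total else 0.0

def bootstrap_error(proteins, dipeptide, n_boot=100, seed=0):
    """Standard deviation of the frequency over protein-level resamples (assumed measure)."""
    rng = random.Random(seed)  # fixed seed only to make the sketch reproducible
    estimates = []
    for _ in range(n_boot):
        # Resample whole proteins with replacement, keeping the proteome size.
        sample = rng.choices(proteins, k=len(proteins))
        estimates.append(dipeptide_freq(sample, dipeptide))
    return statistics.stdev(estimates)

toy = ["MAAGL", "MAAK", "MGGAL", "MKLAA"]  # hypothetical toy proteome
error_aa = bootstrap_error(toy, "AA")
```

Resampling whole proteins rather than individual dipeptides keeps the pairs from one protein together, so the estimate reflects variation between proteins.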

The dipeptide statistics above (among other statistics for this proteome) can be downloaded from this CSV file.
See this proteome in: uniprot_link
Proteome-pI is available under a Creative Commons Attribution-NoDerivs license; for more details see here.

Reference: Kozlowski LP. Proteome-pI 2.0: Proteome Isoelectric Point Database Update. Nucleic Acids Res. 2021, doi: 10.1093/nar/gkab944
Contact: Lukasz P. Kozlowski