Amino acid dipeptide frequency for Mycobacterium phage Mercurio

Apart from single amino acid frequencies, one can also calculate the so-called amino acid dipeptide frequency.

There are 400 possible dipeptides (441 if we count Xaa as an additional 21st amino acid representing all non-standard or unknown amino acids). Thus, if dipeptides occurred randomly in proteins, each of the 400 dipeptides would be present at a frequency of 0.25% (2.5‰). As this is not the case in nature, for better visibility all overrepresented dipeptides are marked in red in the table, and underrepresented ones are marked in blue.
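The comparison against the uniform expectation can be sketched in Python as follows. This is a minimal illustration, not the code used to build this table: the function names `dipeptide_per_mille` and `classify` are hypothetical, and it assumes that dipeptides are counted as overlapping pairs within each protein, that sequences use one-letter codes, and that any non-standard letter is pooled as X (Xaa).

```python
from collections import Counter
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # 20 standard residues, one-letter codes
ALPHABET = AMINO_ACIDS + "X"          # X stands for Xaa (assumption of this sketch)


def dipeptide_per_mille(proteins):
    """Return {dipeptide: frequency in per mille} for all 441 pairs."""
    counts = Counter()
    for seq in proteins:
        # Overlapping windows of length 2; pairs never span two proteins.
        for a, b in zip(seq, seq[1:]):
            a = a if a in AMINO_ACIDS else "X"
            b = b if b in AMINO_ACIDS else "X"
            counts[a + b] += 1
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no dipeptides found in the input sequences")
    return {a + b: 1000.0 * counts[a + b] / total
            for a, b in product(ALPHABET, repeat=2)}


def classify(per_mille, expected=2.5):
    """Label a frequency relative to the uniform expectation of 1/400 (2.5 per mille)."""
    if per_mille > expected:
        return "over"
    if per_mille < expected:
        return "under"
    return "expected"
```

With the values from the table, `classify(18.524)` (AlaAla) gives "over" and `classify(0.206)` (CysCys) gives "under".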

All values are presented in per mille (‰) and therefore need to be multiplied by 10⁻³ to obtain fractions.
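As a small worked example of the unit conversion, the sketch below parses one table entry and returns the value and its error as fractions. The helper name `parse_cell` and the regular expression are assumptions made for illustration; they only rely on entries having the form `AlaAla: 18.524 ± 1.524`, as they appear in the table.

```python
import re

# Two three-letter residue names, a colon, then "value ± error" in per mille.
CELL = re.compile(r"([A-Z][a-z]{2})([A-Z][a-z]{2}):\s*([\d.]+)\s*±\s*([\d.]+)")


def parse_cell(text):
    """Return (first, second, fraction, fraction_error) for one table entry."""
    match = CELL.search(text)
    if match is None:
        raise ValueError(f"not a dipeptide entry: {text!r}")
    first, second, value, error = match.groups()
    # Per mille to fraction: multiply by 10^-3.
    return first, second, float(value) * 1e-3, float(error) * 1e-3
```

For the AlaAla entry this yields a fraction of about 0.018524, that is, roughly 1.85% of all dipeptides.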

For more information, see the sequence space article on Wikipedia.

Ala Cys Asp Glu Phe Gly His Ile Lys Leu Met Asn Pro Gln Arg Ser Thr Val Trp Tyr Xaa
Ala
18.524AlaAla: 18.524 ± 1.524
1.098AlaCys: 1.098 ± 0.335
9.056AlaAsp: 9.056 ± 0.916
9.33AlaGlu: 9.33 ± 1.111
4.254AlaPhe: 4.254 ± 0.585
14.201AlaGly: 14.201 ± 1.179
2.058AlaHis: 2.058 ± 0.389
7.89AlaIle: 7.89 ± 0.801
5.832AlaLys: 5.832 ± 0.709
11.663AlaLeu: 11.663 ± 1.175
2.538AlaMet: 2.538 ± 0.436
4.254AlaAsn: 4.254 ± 0.617
7.89AlaPro: 7.89 ± 1.115
4.116AlaGln: 4.116 ± 0.469
7.066AlaArg: 7.066 ± 0.813
5.351AlaSer: 5.351 ± 0.854
7.752AlaThr: 7.752 ± 0.703
7.547AlaVal: 7.547 ± 0.96
1.99AlaTrp: 1.99 ± 0.255
1.784AlaTyr: 1.784 ± 0.361
0.0AlaXaa: 0.0 ± 0.0
Cys
1.098CysAla: 1.098 ± 0.324
0.206CysCys: 0.206 ± 0.124
0.617CysAsp: 0.617 ± 0.227
0.412CysGlu: 0.412 ± 0.154
0.274CysPhe: 0.274 ± 0.174
1.441CysGly: 1.441 ± 0.44
0.069CysHis: 0.069 ± 0.088
0.412CysIle: 0.412 ± 0.154
0.0CysLys: 0.0 ± 0.0
0.206CysLeu: 0.206 ± 0.107
0.0CysMet: 0.0 ± 0.0
0.412CysAsn: 0.412 ± 0.142
0.96CysPro: 0.96 ± 0.287
0.549CysGln: 0.549 ± 0.252
1.029CysArg: 1.029 ± 0.323
0.617CysSer: 0.617 ± 0.282
0.549CysThr: 0.549 ± 0.203
0.48CysVal: 0.48 ± 0.284
0.549CysTrp: 0.549 ± 0.223
0.206CysTyr: 0.206 ± 0.099
0.0CysXaa: 0.0 ± 0.0
Asp
8.919AspAla: 8.919 ± 0.951
0.892AspCys: 0.892 ± 0.335
6.655AspAsp: 6.655 ± 0.774
5.214AspGlu: 5.214 ± 0.782
1.647AspPhe: 1.647 ± 0.489
8.507AspGly: 8.507 ± 0.782
1.372AspHis: 1.372 ± 0.304
0.96AspIle: 0.96 ± 0.271
1.647AspLys: 1.647 ± 0.451
3.43AspLeu: 3.43 ± 0.488
1.235AspMet: 1.235 ± 0.285
0.892AspAsn: 0.892 ± 0.258
5.969AspPro: 5.969 ± 0.802
3.499AspGln: 3.499 ± 0.632
4.802AspArg: 4.802 ± 0.68
1.647AspSer: 1.647 ± 0.334
3.979AspThr: 3.979 ± 0.586
5.694AspVal: 5.694 ± 0.589
1.647AspTrp: 1.647 ± 0.354
2.333AspTyr: 2.333 ± 0.426
0.0AspXaa: 0.0 ± 0.0
Glu
7.341GluAla: 7.341 ± 1.067
0.48GluCys: 0.48 ± 0.161
3.842GluAsp: 3.842 ± 0.633
2.401GluGlu: 2.401 ± 0.421
1.784GluPhe: 1.784 ± 0.312
3.636GluGly: 3.636 ± 0.601
2.058GluHis: 2.058 ± 0.379
1.441GluIle: 1.441 ± 0.387
1.99GluLys: 1.99 ± 0.392
4.254GluLeu: 4.254 ± 0.51
0.823GluMet: 0.823 ± 0.315
2.127GluAsn: 2.127 ± 0.467
4.871GluPro: 4.871 ± 0.715
1.921GluGln: 1.921 ± 0.295
5.008GluArg: 5.008 ± 0.831
1.578GluSer: 1.578 ± 0.279
2.401GluThr: 2.401 ± 0.323
3.636GluVal: 3.636 ± 0.429
1.372GluTrp: 1.372 ± 0.29
0.892GluTyr: 0.892 ± 0.263
0.0GluXaa: 0.0 ± 0.0
Phe
5.008PheAla: 5.008 ± 0.689
0.412PheCys: 0.412 ± 0.159
2.401PheAsp: 2.401 ± 0.386
2.538PheGlu: 2.538 ± 0.48
0.755PhePhe: 0.755 ± 0.19
4.322PheGly: 4.322 ± 0.537
0.274PheHis: 0.274 ± 0.117
0.755PheIle: 0.755 ± 0.175
0.755PheLys: 0.755 ± 0.228
1.509PheLeu: 1.509 ± 0.319
0.48PheMet: 0.48 ± 0.186
0.823PheAsn: 0.823 ± 0.247
1.578PhePro: 1.578 ± 0.377
0.686PheGln: 0.686 ± 0.195
1.304PheArg: 1.304 ± 0.262
1.029PheSer: 1.029 ± 0.279
1.852PheThr: 1.852 ± 0.354
1.99PheVal: 1.99 ± 0.357
0.412PheTrp: 0.412 ± 0.155
0.755PheTyr: 0.755 ± 0.278
0.0PheXaa: 0.0 ± 0.0
Gly
11.8GlyAla: 11.8 ± 1.212
0.96GlyCys: 0.96 ± 0.274
8.439GlyAsp: 8.439 ± 0.745
3.362GlyGlu: 3.362 ± 0.44
3.087GlyPhe: 3.087 ± 0.474
11.594GlyGly: 11.594 ± 3.171
1.715GlyHis: 1.715 ± 0.4
4.528GlyIle: 4.528 ± 0.623
3.705GlyLys: 3.705 ± 0.622
6.792GlyLeu: 6.792 ± 0.818
1.509GlyMet: 1.509 ± 0.303
2.881GlyAsn: 2.881 ± 0.414
4.734GlyPro: 4.734 ± 0.663
2.676GlyGln: 2.676 ± 0.482
6.861GlyArg: 6.861 ± 0.768
5.763GlySer: 5.763 ± 0.815
6.792GlyThr: 6.792 ± 0.812
6.312GlyVal: 6.312 ± 0.808
2.538GlyTrp: 2.538 ± 0.301
2.195GlyTyr: 2.195 ± 0.39
0.0GlyXaa: 0.0 ± 0.0
His
2.401HisAla: 2.401 ± 0.519
0.206HisCys: 0.206 ± 0.107
1.372HisAsp: 1.372 ± 0.373
0.412HisGlu: 0.412 ± 0.166
0.549HisPhe: 0.549 ± 0.211
1.647HisGly: 1.647 ± 0.31
0.412HisHis: 0.412 ± 0.187
0.686HisIle: 0.686 ± 0.2
0.549HisLys: 0.549 ± 0.176
1.166HisLeu: 1.166 ± 0.288
0.48HisMet: 0.48 ± 0.183
0.412HisAsn: 0.412 ± 0.164
1.784HisPro: 1.784 ± 0.406
1.029HisGln: 1.029 ± 0.233
2.401HisArg: 2.401 ± 0.481
0.549HisSer: 0.549 ± 0.182
1.304HisThr: 1.304 ± 0.325
1.578HisVal: 1.578 ± 0.316
0.48HisTrp: 0.48 ± 0.173
0.412HisTyr: 0.412 ± 0.17
0.0HisXaa: 0.0 ± 0.0
Ile
6.792IleAla: 6.792 ± 0.751
0.274IleCys: 0.274 ± 0.15
3.156IleAsp: 3.156 ± 0.476
2.95IleGlu: 2.95 ± 0.445
0.96IlePhe: 0.96 ± 0.281
5.283IleGly: 5.283 ± 0.936
0.823IleHis: 0.823 ± 0.214
0.686IleIle: 0.686 ± 0.248
0.892IleLys: 0.892 ± 0.292
2.127IleLeu: 2.127 ± 0.459
0.823IleMet: 0.823 ± 0.27
0.686IleAsn: 0.686 ± 0.212
3.499IlePro: 3.499 ± 0.47
0.96IleGln: 0.96 ± 0.197
2.607IleArg: 2.607 ± 0.364
2.127IleSer: 2.127 ± 0.425
3.362IleThr: 3.362 ± 0.478
3.362IleVal: 3.362 ± 0.442
0.823IleTrp: 0.823 ± 0.237
0.892IleTyr: 0.892 ± 0.224
0.0IleXaa: 0.0 ± 0.0
Lys
5.214LysAla: 5.214 ± 0.717
0.206LysCys: 0.206 ± 0.107
1.784LysAsp: 1.784 ± 0.406
0.892LysGlu: 0.892 ± 0.237
1.166LysPhe: 1.166 ± 0.234
1.647LysGly: 1.647 ± 0.364
1.166LysHis: 1.166 ± 0.312
1.647LysIle: 1.647 ± 0.394
0.755LysLys: 0.755 ± 0.255
2.813LysLeu: 2.813 ± 0.429
1.098LysMet: 1.098 ± 0.296
1.715LysAsn: 1.715 ± 0.295
2.195LysPro: 2.195 ± 0.285
0.755LysGln: 0.755 ± 0.276
3.568LysArg: 3.568 ± 0.594
1.99LysSer: 1.99 ± 0.475
2.264LysThr: 2.264 ± 0.404
2.058LysVal: 2.058 ± 0.342
0.48LysTrp: 0.48 ± 0.172
0.617LysTyr: 0.617 ± 0.239
0.0LysXaa: 0.0 ± 0.0
Leu
9.879LeuAla: 9.879 ± 0.86
0.343LeuCys: 0.343 ± 0.183
6.655LeuAsp: 6.655 ± 0.648
3.636LeuGlu: 3.636 ± 0.464
2.264LeuPhe: 2.264 ± 0.316
6.312LeuGly: 6.312 ± 1.153
1.372LeuHis: 1.372 ± 0.277
3.911LeuIle: 3.911 ± 0.541
1.99LeuLys: 1.99 ± 0.367
6.175LeuLeu: 6.175 ± 0.59
1.235LeuMet: 1.235 ± 0.284
2.127LeuAsn: 2.127 ± 0.368
4.528LeuPro: 4.528 ± 0.716
1.098LeuGln: 1.098 ± 0.261
5.626LeuArg: 5.626 ± 0.588
4.322LeuSer: 4.322 ± 0.536
5.077LeuThr: 5.077 ± 0.713
4.734LeuVal: 4.734 ± 0.539
1.304LeuTrp: 1.304 ± 0.312
2.195LeuTyr: 2.195 ± 0.408
0.0LeuXaa: 0.0 ± 0.0
Met
3.019MetAla: 3.019 ± 0.449
0.206MetCys: 0.206 ± 0.113
0.96MetAsp: 0.96 ± 0.266
0.686MetGlu: 0.686 ± 0.169
0.343MetPhe: 0.343 ± 0.151
1.029MetGly: 1.029 ± 0.258
0.274MetHis: 0.274 ± 0.18
1.509MetIle: 1.509 ± 0.338
0.755MetLys: 0.755 ± 0.259
1.166MetLeu: 1.166 ± 0.221
0.48MetMet: 0.48 ± 0.183
0.412MetAsn: 0.412 ± 0.151
0.892MetPro: 0.892 ± 0.229
0.823MetGln: 0.823 ± 0.273
1.029MetArg: 1.029 ± 0.232
1.715MetSer: 1.715 ± 0.307
1.715MetThr: 1.715 ± 0.295
1.441MetVal: 1.441 ± 0.275
0.206MetTrp: 0.206 ± 0.106
0.549MetTyr: 0.549 ± 0.187
0.0MetXaa: 0.0 ± 0.0
Asn
3.156AsnAla: 3.156 ± 0.445
0.549AsnCys: 0.549 ± 0.21
1.578AsnAsp: 1.578 ± 0.313
1.852AsnGlu: 1.852 ± 0.332
0.617AsnPhe: 0.617 ± 0.186
3.636AsnGly: 3.636 ± 0.667
0.48AsnHis: 0.48 ± 0.185
0.892AsnIle: 0.892 ± 0.291
1.098AsnLys: 1.098 ± 0.269
1.852AsnLeu: 1.852 ± 0.447
0.343AsnMet: 0.343 ± 0.173
0.343AsnAsn: 0.343 ± 0.132
2.95AsnPro: 2.95 ± 0.443
0.892AsnGln: 0.892 ± 0.217
2.744AsnArg: 2.744 ± 0.36
0.617AsnSer: 0.617 ± 0.183
1.578AsnThr: 1.578 ± 0.357
1.715AsnVal: 1.715 ± 0.529
0.274AsnTrp: 0.274 ± 0.126
0.48AsnTyr: 0.48 ± 0.151
0.0AsnXaa: 0.0 ± 0.0
Pro
12.349ProAla: 12.349 ± 1.279
0.343ProCys: 0.343 ± 0.161
5.214ProAsp: 5.214 ± 0.689
3.705ProGlu: 3.705 ± 0.673
1.647ProPhe: 1.647 ± 0.34
5.283ProGly: 5.283 ± 0.722
1.441ProHis: 1.441 ± 0.316
2.95ProIle: 2.95 ± 0.402
2.607ProLys: 2.607 ± 0.442
4.185ProLeu: 4.185 ± 0.537
0.96ProMet: 0.96 ± 0.328
1.372ProAsn: 1.372 ± 0.379
3.224ProPro: 3.224 ± 0.47
1.372ProGln: 1.372 ± 0.378
3.43ProArg: 3.43 ± 0.567
3.499ProSer: 3.499 ± 0.525
4.254ProThr: 4.254 ± 0.542
3.773ProVal: 3.773 ± 0.467
0.96ProTrp: 0.96 ± 0.299
1.372ProTyr: 1.372 ± 0.337
0.0ProXaa: 0.0 ± 0.0
Gln
4.665GlnAla: 4.665 ± 0.546
0.137GlnCys: 0.137 ± 0.099
1.578GlnAsp: 1.578 ± 0.355
0.755GlnGlu: 0.755 ± 0.223
1.098GlnPhe: 1.098 ± 0.263
3.019GlnGly: 3.019 ± 0.495
0.755GlnHis: 0.755 ± 0.226
1.784GlnIle: 1.784 ± 0.235
0.892GlnLys: 0.892 ± 0.288
3.224GlnLeu: 3.224 ± 0.419
0.823GlnMet: 0.823 ± 0.206
1.098GlnAsn: 1.098 ± 0.229
1.852GlnPro: 1.852 ± 0.362
0.48GlnGln: 0.48 ± 0.178
2.676GlnArg: 2.676 ± 0.402
1.029GlnSer: 1.029 ± 0.384
1.647GlnThr: 1.647 ± 0.323
2.195GlnVal: 2.195 ± 0.425
0.617GlnTrp: 0.617 ± 0.221
0.617GlnTyr: 0.617 ± 0.177
0.0GlnXaa: 0.0 ± 0.0
Arg
8.987ArgAla: 8.987 ± 0.938
1.372ArgCys: 1.372 ± 0.363
4.322ArgAsp: 4.322 ± 0.52
3.911ArgGlu: 3.911 ± 0.553
2.127ArgPhe: 2.127 ± 0.496
5.626ArgGly: 5.626 ± 0.596
1.921ArgHis: 1.921 ± 0.413
2.881ArgIle: 2.881 ± 0.439
3.156ArgLys: 3.156 ± 0.549
6.723ArgLeu: 6.723 ± 0.776
2.127ArgMet: 2.127 ± 0.382
1.784ArgAsn: 1.784 ± 0.345
4.94ArgPro: 4.94 ± 0.648
2.058ArgGln: 2.058 ± 0.445
7.204ArgArg: 7.204 ± 0.763
2.744ArgSer: 2.744 ± 0.494
4.116ArgThr: 4.116 ± 0.419
3.362ArgVal: 3.362 ± 0.486
1.166ArgTrp: 1.166 ± 0.303
1.098ArgTyr: 1.098 ± 0.264
0.0ArgXaa: 0.0 ± 0.0
Ser
4.391SerAla: 4.391 ± 0.492
0.412SerCys: 0.412 ± 0.182
2.95SerAsp: 2.95 ± 0.412
2.401SerGlu: 2.401 ± 0.424
1.372SerPhe: 1.372 ± 0.292
5.42SerGly: 5.42 ± 0.829
0.617SerHis: 0.617 ± 0.244
3.019SerIle: 3.019 ± 0.483
1.921SerLys: 1.921 ± 0.334
3.636SerLeu: 3.636 ± 0.765
1.166SerMet: 1.166 ± 0.238
1.852SerAsn: 1.852 ± 0.288
2.538SerPro: 2.538 ± 0.406
1.304SerGln: 1.304 ± 0.26
3.224SerArg: 3.224 ± 0.387
2.676SerSer: 2.676 ± 0.469
4.185SerThr: 4.185 ± 0.734
2.264SerVal: 2.264 ± 0.35
0.823SerTrp: 0.823 ± 0.256
1.098SerTyr: 1.098 ± 0.354
0.0SerXaa: 0.0 ± 0.0
Thr
8.507ThrAla: 8.507 ± 0.837
0.48ThrCys: 0.48 ± 0.193
3.362ThrAsp: 3.362 ± 0.494
3.293ThrGlu: 3.293 ± 0.502
1.852ThrPhe: 1.852 ± 0.289
6.586ThrGly: 6.586 ± 0.863
1.372ThrHis: 1.372 ± 0.285
3.019ThrIle: 3.019 ± 0.431
2.058ThrLys: 2.058 ± 0.406
3.979ThrLeu: 3.979 ± 0.443
1.509ThrMet: 1.509 ± 0.337
1.784ThrAsn: 1.784 ± 0.357
4.94ThrPro: 4.94 ± 0.683
1.921ThrGln: 1.921 ± 0.344
3.499ThrArg: 3.499 ± 0.496
3.636ThrSer: 3.636 ± 0.478
3.842ThrThr: 3.842 ± 0.483
5.283ThrVal: 5.283 ± 0.504
1.029ThrTrp: 1.029 ± 0.245
1.235ThrTyr: 1.235 ± 0.221
0.0ThrXaa: 0.0 ± 0.0
Val
8.301ValAla: 8.301 ± 0.998
1.166ValCys: 1.166 ± 0.375
4.254ValAsp: 4.254 ± 0.593
4.734ValGlu: 4.734 ± 0.493
2.676ValPhe: 2.676 ± 0.409
5.557ValGly: 5.557 ± 0.813
0.549ValHis: 0.549 ± 0.164
2.95ValIle: 2.95 ± 0.398
2.264ValLys: 2.264 ± 0.41
5.626ValLeu: 5.626 ± 0.551
1.098ValMet: 1.098 ± 0.32
1.304ValAsn: 1.304 ± 0.257
2.881ValPro: 2.881 ± 0.432
2.881ValGln: 2.881 ± 0.536
4.665ValArg: 4.665 ± 0.586
2.676ValSer: 2.676 ± 0.41
3.842ValThr: 3.842 ± 0.513
5.557ValVal: 5.557 ± 0.644
1.852ValTrp: 1.852 ± 0.354
1.852ValTyr: 1.852 ± 0.394
0.0ValXaa: 0.0 ± 0.0
Trp
1.921TrpAla: 1.921 ± 0.374
0.206TrpCys: 0.206 ± 0.108
1.098TrpAsp: 1.098 ± 0.229
0.617TrpGlu: 0.617 ± 0.197
0.823TrpPhe: 0.823 ± 0.277
1.441TrpGly: 1.441 ± 0.355
0.48TrpHis: 0.48 ± 0.22
0.617TrpIle: 0.617 ± 0.23
0.686TrpLys: 0.686 ± 0.232
2.127TrpLeu: 2.127 ± 0.315
0.274TrpMet: 0.274 ± 0.135
0.96TrpAsn: 0.96 ± 0.28
0.96TrpPro: 0.96 ± 0.237
0.892TrpGln: 0.892 ± 0.19
1.647TrpArg: 1.647 ± 0.385
1.372TrpSer: 1.372 ± 0.312
1.509TrpThr: 1.509 ± 0.28
1.029TrpVal: 1.029 ± 0.289
0.412TrpTrp: 0.412 ± 0.153
0.48TrpTyr: 0.48 ± 0.193
0.0TrpXaa: 0.0 ± 0.0
Tyr
1.921TyrAla: 1.921 ± 0.335
0.137TyrCys: 0.137 ± 0.103
1.647TyrAsp: 1.647 ± 0.348
0.755TyrGlu: 0.755 ± 0.257
0.343TyrPhe: 0.343 ± 0.167
2.058TyrGly: 2.058 ± 0.515
0.617TyrHis: 0.617 ± 0.222
0.549TyrIle: 0.549 ± 0.165
0.48TyrLys: 0.48 ± 0.188
2.127TyrLeu: 2.127 ± 0.442
0.137TyrMet: 0.137 ± 0.098
0.412TyrAsn: 0.412 ± 0.157
0.823TyrPro: 0.823 ± 0.199
1.029TyrGln: 1.029 ± 0.276
1.235TyrArg: 1.235 ± 0.264
2.333TyrSer: 2.333 ± 0.466
1.166TyrThr: 1.166 ± 0.315
2.676TyrVal: 2.676 ± 0.419
0.617TyrTrp: 0.617 ± 0.235
0.206TyrTyr: 0.206 ± 0.169
0.0TyrXaa: 0.0 ± 0.0
Xaa
0.0XaaAla: 0.0 ± 0.0
0.0XaaCys: 0.0 ± 0.0
0.0XaaAsp: 0.0 ± 0.0
0.0XaaGlu: 0.0 ± 0.0
0.0XaaPhe: 0.0 ± 0.0
0.0XaaGly: 0.0 ± 0.0
0.0XaaHis: 0.0 ± 0.0
0.0XaaIle: 0.0 ± 0.0
0.0XaaLys: 0.0 ± 0.0
0.0XaaLeu: 0.0 ± 0.0
0.0XaaMet: 0.0 ± 0.0
0.0XaaAsn: 0.0 ± 0.0
0.0XaaPro: 0.0 ± 0.0
0.0XaaGln: 0.0 ± 0.0
0.0XaaArg: 0.0 ± 0.0
0.0XaaSer: 0.0 ± 0.0
0.0XaaThr: 0.0 ± 0.0
0.0XaaVal: 0.0 ± 0.0
0.0XaaTrp: 0.0 ± 0.0
0.0XaaTyr: 0.0 ± 0.0
0.0XaaXaa: 0.0 ± 0.0
Statistics based on 64 proteins (14577 amino acids)

Note: The error has been estimated by bootstrapping (100 resamples) at the protein level.
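Bootstrapping at the protein level means resampling whole proteins with replacement and recomputing the statistic for each resample. The sketch below illustrates the idea under stated assumptions: the function names are hypothetical, dipeptides are counted as overlapping pairs of one-letter codes, and the error is taken as the standard deviation of the resampled estimates. The page does not specify which spread measure was actually used.

```python
import random
from collections import Counter
from statistics import stdev


def per_mille(proteins, dipeptide):
    """Frequency of one dipeptide (in per mille) over overlapping pairs."""
    counts = Counter()
    for seq in proteins:
        counts.update(a + b for a, b in zip(seq, seq[1:]))
    total = sum(counts.values())
    return 1000.0 * counts[dipeptide] / total if total else 0.0


def bootstrap_error(proteins, dipeptide, n_resamples=100, seed=0):
    """Spread of the frequency across resampled proteomes.

    Each resample draws len(proteins) whole proteins with replacement,
    so the resampling unit is the protein, not the individual residue.
    """
    rng = random.Random(seed)
    estimates = []
    for _ in range(n_resamples):
        sample = rng.choices(proteins, k=len(proteins))
        estimates.append(per_mille(sample, dipeptide))
    return stdev(estimates)
```

Resampling proteins rather than individual dipeptides keeps the pairs within each protein together, so the estimated error also reflects how much composition varies from one protein to another.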

The above dipeptide statistics (among other statistics for this proteome) can be downloaded from this CSV file
See this proteome in: uniprot_link
Proteome-pI is available under a Creative Commons Attribution-NoDerivs license; for more details see here

Reference: Kozlowski LP. Proteome-pI 2.0: Proteome Isoelectric Point Database Update. Nucleic Acids Res. 2021, doi: 10.1093/nar/gkab944
Contact: Lukasz P. Kozlowski