Amino acid dipeptide frequency for Mycobacterium phage Smeagol

Apart from single amino acid frequencies, one can also calculate the so-called amino acid dipeptide frequency.

There are 400 possible dipeptides (441 if Xaa is counted as an additional 21st amino acid representing all non-standard or unknown amino acids). Thus, if dipeptides occurred at random with all residues equally likely, each of the 400 dipeptides would be present at a frequency of 0.25% (2.5‰). As this is not the case in nature, for better visibility dipeptides that are more frequent than expected are marked in red in the table, and those that are under-represented are marked in blue.
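The counting described above can be sketched in Python as follows. This is only an illustration under stated assumptions, not the Proteome-pI implementation: the function names are hypothetical, overlapping pairs are counted within each protein, and frequencies are normalised by the total number of pairs counted. How the database handles protein boundaries and normalisation is not stated on this page.

```python
from collections import Counter
from itertools import product

# The 20 standard residues in one-letter code; anything else is mapped to "X" (Xaa).
STANDARD = "ACDEFGHIKLMNPQRSTVWY"


def dipeptide_per_mille(proteins):
    """Return {dipeptide: frequency in per mille} for a list of protein sequences.

    Assumption: overlapping pairs are counted within each protein and the
    result is normalised by the total number of pairs counted.
    """
    counts = Counter()
    for seq in proteins:
        seq = "".join(aa if aa in STANDARD else "X" for aa in seq.upper())
        counts.update(seq[i:i + 2] for i in range(len(seq) - 1))
    total = sum(counts.values())
    alphabet = STANDARD + "X"
    return {
        a + b: (1000.0 * counts[a + b] / total) if total else 0.0
        for a, b in product(alphabet, repeat=2)
    }


def classify(freq_per_mille, expected=1000.0 / 400):
    """Compare a frequency with the uniform expectation (2.5 per mille by default)."""
    if freq_per_mille > expected:
        return "over"
    if freq_per_mille < expected:
        return "under"
    return "as expected"


# Toy input for illustration only; this is not the phage proteome.
toy = ["MAAAGL", "MAGX"]
freqs = dipeptide_per_mille(toy)
```

With the values from the table, `classify(12.442)` labels AlaAla as over-represented and `classify(0.671)` labels AlaCys as under-represented relative to the uniform 2.5‰ baseline.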

All values are given in per mille (‰), so they must be multiplied by 10⁻³ to obtain fractions.
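As a worked example of the unit conversion, take the AlaAla value from the table (12.442‰) and compare it with the uniform expectation for 400 dipeptides. The symbols f_AlaAla and f_uniform are notation introduced here for illustration only.

```latex
\[
f_{\mathrm{AlaAla}} = 12.442 \times 10^{-3} = 0.012442 \approx 1.24\,\%,
\qquad
f_{\mathrm{uniform}} = \frac{1}{400} = 2.5 \times 10^{-3} = 0.25\,\%.
\]
```

AlaAla is therefore roughly five times more frequent than the uniform baseline would suggest.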

For more information, see the sequence space article on Wikipedia.

Dipeptide frequencies (‰), grouped by the first residue. Each entry reads FirstSecond: frequency ± error. Within each group the second residue runs in the order:
Ala Cys Asp Glu Phe Gly His Ile Lys Leu Met Asn Pro Gln Arg Ser Thr Val Trp Tyr Xaa

Ala
AlaAla: 12.442 ± 1.146
AlaCys: 0.671 ± 0.213
AlaAsp: 6.892 ± 0.669
AlaGlu: 6.526 ± 0.781
AlaPhe: 3.293 ± 0.461
AlaGly: 8.417 ± 0.85
AlaHis: 1.525 ± 0.353
AlaIle: 4.757 ± 0.526
AlaLys: 4.147 ± 0.496
AlaLeu: 8.844 ± 0.941
AlaMet: 2.623 ± 0.351
AlaAsn: 2.623 ± 0.369
AlaPro: 5.062 ± 0.711
AlaGln: 3.111 ± 0.463
AlaArg: 6.16 ± 0.565
AlaSer: 6.221 ± 0.831
AlaThr: 5.611 ± 0.644
AlaVal: 7.685 ± 0.726
AlaTrp: 2.074 ± 0.316
AlaTyr: 3.05 ± 0.42
AlaXaa: 0.0 ± 0.0

Cys
CysAla: 0.549 ± 0.206
CysCys: 0.0 ± 0.0
CysAsp: 0.305 ± 0.14
CysGlu: 0.549 ± 0.174
CysPhe: 0.122 ± 0.088
CysGly: 0.671 ± 0.255
CysHis: 0.183 ± 0.119
CysIle: 0.183 ± 0.095
CysLys: 0.183 ± 0.096
CysLeu: 0.427 ± 0.185
CysMet: 0.061 ± 0.07
CysAsn: 0.244 ± 0.129
CysPro: 0.244 ± 0.108
CysGln: 0.183 ± 0.104
CysArg: 0.61 ± 0.199
CysSer: 0.244 ± 0.129
CysThr: 0.305 ± 0.151
CysVal: 0.122 ± 0.077
CysTrp: 0.305 ± 0.126
CysTyr: 0.122 ± 0.092
CysXaa: 0.0 ± 0.0

Asp
AspAla: 6.526 ± 0.67
AspCys: 0.549 ± 0.2
AspAsp: 4.635 ± 0.61
AspGlu: 4.208 ± 0.548
AspPhe: 2.501 ± 0.38
AspGly: 5.855 ± 0.661
AspHis: 1.037 ± 0.25
AspIle: 2.806 ± 0.51
AspLys: 2.989 ± 0.572
AspLeu: 7.197 ± 0.721
AspMet: 1.098 ± 0.229
AspAsn: 1.891 ± 0.327
AspPro: 4.391 ± 0.541
AspGln: 1.464 ± 0.326
AspArg: 3.842 ± 0.518
AspSer: 3.476 ± 0.433
AspThr: 4.025 ± 0.508
AspVal: 5.123 ± 0.591
AspTrp: 1.525 ± 0.363
AspTyr: 1.769 ± 0.336
AspXaa: 0.0 ± 0.0

Glu
GluAla: 6.099 ± 0.662
GluCys: 0.244 ± 0.146
GluAsp: 4.94 ± 0.472
GluGlu: 5.001 ± 0.567
GluPhe: 2.257 ± 0.363
GluGly: 4.269 ± 0.514
GluHis: 1.464 ± 0.27
GluIle: 3.537 ± 0.452
GluLys: 2.44 ± 0.42
GluLeu: 6.587 ± 0.615
GluMet: 1.647 ± 0.299
GluAsn: 1.586 ± 0.336
GluPro: 2.928 ± 0.492
GluGln: 2.562 ± 0.42
GluArg: 3.842 ± 0.544
GluSer: 3.111 ± 0.476
GluThr: 4.147 ± 0.453
GluVal: 5.611 ± 0.583
GluTrp: 1.769 ± 0.356
GluTyr: 2.379 ± 0.448
GluXaa: 0.0 ± 0.0

Phe
PheAla: 2.745 ± 0.373
PheCys: 0.183 ± 0.095
PheAsp: 2.623 ± 0.428
PheGlu: 1.952 ± 0.305
PhePhe: 0.427 ± 0.207
PheGly: 3.842 ± 0.474
PheHis: 0.732 ± 0.237
PheIle: 1.403 ± 0.282
PheLys: 1.525 ± 0.306
PheLeu: 2.379 ± 0.457
PheMet: 0.732 ± 0.235
PheAsn: 1.159 ± 0.281
PhePro: 1.708 ± 0.333
PheGln: 0.793 ± 0.208
PheArg: 2.074 ± 0.423
PheSer: 1.586 ± 0.288
PheThr: 2.013 ± 0.363
PheVal: 2.196 ± 0.365
PheTrp: 0.61 ± 0.185
PheTyr: 1.098 ± 0.248
PheXaa: 0.0 ± 0.0

Gly
GlyAla: 7.319 ± 1.03
GlyCys: 0.427 ± 0.16
GlyAsp: 5.977 ± 0.681
GlyGlu: 4.33 ± 0.506
GlyPhe: 2.501 ± 0.442
GlyGly: 8.295 ± 1.904
GlyHis: 2.074 ± 0.402
GlyIle: 4.574 ± 0.646
GlyLys: 3.415 ± 0.462
GlyLeu: 7.685 ± 0.94
GlyMet: 1.708 ± 0.335
GlyAsn: 2.928 ± 0.368
GlyPro: 3.964 ± 0.549
GlyGln: 2.257 ± 0.427
GlyArg: 5.184 ± 0.587
GlySer: 5.855 ± 0.715
GlyThr: 4.94 ± 0.671
GlyVal: 5.062 ± 0.593
GlyTrp: 2.318 ± 0.367
GlyTyr: 3.172 ± 0.406
GlyXaa: 0.0 ± 0.0

His
HisAla: 1.83 ± 0.371
HisCys: 0.244 ± 0.154
HisAsp: 1.037 ± 0.26
HisGlu: 1.342 ± 0.272
HisPhe: 0.671 ± 0.225
HisGly: 1.891 ± 0.344
HisHis: 0.671 ± 0.221
HisIle: 0.854 ± 0.215
HisLys: 1.037 ± 0.295
HisLeu: 1.403 ± 0.345
HisMet: 0.183 ± 0.143
HisAsn: 0.305 ± 0.135
HisPro: 1.281 ± 0.278
HisGln: 0.671 ± 0.199
HisArg: 1.464 ± 0.319
HisSer: 0.732 ± 0.196
HisThr: 1.159 ± 0.315
HisVal: 1.647 ± 0.327
HisTrp: 0.366 ± 0.129
HisTyr: 0.732 ± 0.245
HisXaa: 0.0 ± 0.0

Ile
IleAla: 5.611 ± 0.614
IleCys: 0.183 ± 0.104
IleAsp: 3.598 ± 0.439
IleGlu: 3.903 ± 0.478
IlePhe: 1.037 ± 0.206
IleGly: 3.964 ± 0.56
IleHis: 0.732 ± 0.212
IleIle: 1.891 ± 0.318
IleLys: 1.83 ± 0.329
IleLeu: 3.354 ± 0.442
IleMet: 0.671 ± 0.177
IleAsn: 1.952 ± 0.391
IlePro: 3.232 ± 0.389
IleGln: 1.769 ± 0.315
IleArg: 3.415 ± 0.56
IleSer: 3.354 ± 0.622
IleThr: 3.72 ± 0.442
IleVal: 2.806 ± 0.45
IleTrp: 0.61 ± 0.178
IleTyr: 1.464 ± 0.249
IleXaa: 0.0 ± 0.0

Lys
LysAla: 4.147 ± 0.515
LysCys: 0.244 ± 0.116
LysAsp: 3.111 ± 0.494
LysGlu: 2.135 ± 0.318
LysPhe: 1.525 ± 0.319
LysGly: 2.684 ± 0.368
LysHis: 1.098 ± 0.284
LysIle: 2.562 ± 0.398
LysLys: 1.83 ± 0.397
LysLeu: 3.111 ± 0.402
LysMet: 1.037 ± 0.237
LysAsn: 1.342 ± 0.273
LysPro: 2.806 ± 0.508
LysGln: 1.83 ± 0.37
LysArg: 3.172 ± 0.491
LysSer: 2.745 ± 0.406
LysThr: 2.257 ± 0.452
LysVal: 2.928 ± 0.452
LysTrp: 0.793 ± 0.223
LysTyr: 1.159 ± 0.294
LysXaa: 0.0 ± 0.0

Leu
LeuAla: 10.246 ± 0.817
LeuCys: 0.305 ± 0.12
LeuAsp: 6.526 ± 0.631
LeuGlu: 4.757 ± 0.561
LeuPhe: 1.952 ± 0.332
LeuGly: 7.014 ± 0.696
LeuHis: 1.403 ± 0.336
LeuIle: 4.635 ± 0.524
LeuLys: 3.903 ± 0.518
LeuLeu: 6.038 ± 0.62
LeuMet: 1.83 ± 0.317
LeuAsn: 2.806 ± 0.342
LeuPro: 5.733 ± 0.588
LeuGln: 2.989 ± 0.537
LeuArg: 6.038 ± 0.697
LeuSer: 5.428 ± 0.562
LeuThr: 6.526 ± 0.638
LeuVal: 4.513 ± 0.542
LeuTrp: 1.403 ± 0.371
LeuTyr: 2.379 ± 0.402
LeuXaa: 0.0 ± 0.0

Met
MetAla: 2.501 ± 0.404
MetCys: 0.0 ± 0.0
MetAsp: 1.037 ± 0.267
MetGlu: 1.708 ± 0.341
MetPhe: 0.793 ± 0.245
MetGly: 1.342 ± 0.296
MetHis: 0.305 ± 0.132
MetIle: 0.549 ± 0.216
MetLys: 1.159 ± 0.265
MetLeu: 0.976 ± 0.251
MetMet: 0.183 ± 0.1
MetAsn: 1.037 ± 0.214
MetPro: 1.403 ± 0.289
MetGln: 0.488 ± 0.192
MetArg: 1.098 ± 0.261
MetSer: 1.769 ± 0.356
MetThr: 1.952 ± 0.352
MetVal: 1.22 ± 0.386
MetTrp: 0.305 ± 0.12
MetTyr: 0.427 ± 0.168
MetXaa: 0.0 ± 0.0

Asn
AsnAla: 3.232 ± 0.498
AsnCys: 0.061 ± 0.057
AsnAsp: 1.83 ± 0.385
AsnGlu: 2.501 ± 0.407
AsnPhe: 0.915 ± 0.245
AsnGly: 2.928 ± 0.44
AsnHis: 0.854 ± 0.218
AsnIle: 1.952 ± 0.305
AsnLys: 0.488 ± 0.21
AsnLeu: 2.135 ± 0.388
AsnMet: 0.488 ± 0.151
AsnAsn: 0.61 ± 0.154
AsnPro: 2.562 ± 0.425
AsnGln: 0.915 ± 0.228
AsnArg: 1.403 ± 0.273
AsnSer: 1.708 ± 0.398
AsnThr: 2.013 ± 0.32
AsnVal: 2.501 ± 0.412
AsnTrp: 0.793 ± 0.191
AsnTyr: 1.281 ± 0.257
AsnXaa: 0.0 ± 0.0

Pro
ProAla: 5.184 ± 0.505
ProCys: 0.305 ± 0.139
ProAsp: 4.208 ± 0.478
ProGlu: 4.879 ± 0.558
ProPhe: 2.318 ± 0.378
ProGly: 4.696 ± 0.678
ProHis: 0.915 ± 0.254
ProIle: 2.501 ± 0.383
ProLys: 2.318 ± 0.323
ProLeu: 4.818 ± 0.544
ProMet: 0.732 ± 0.205
ProAsn: 1.525 ± 0.294
ProPro: 2.623 ± 0.486
ProGln: 1.464 ± 0.317
ProArg: 2.867 ± 0.483
ProSer: 3.903 ± 0.438
ProThr: 3.476 ± 0.473
ProVal: 3.781 ± 0.522
ProTrp: 0.793 ± 0.284
ProTyr: 1.586 ± 0.318
ProXaa: 0.0 ± 0.0

Gln
GlnAla: 2.989 ± 0.509
GlnCys: 0.183 ± 0.104
GlnAsp: 1.22 ± 0.316
GlnGlu: 1.586 ± 0.326
GlnPhe: 1.342 ± 0.257
GlnGly: 2.318 ± 0.399
GlnHis: 0.488 ± 0.15
GlnIle: 2.867 ± 0.479
GlnLys: 1.586 ± 0.354
GlnLeu: 3.659 ± 0.481
GlnMet: 0.732 ± 0.23
GlnAsn: 0.549 ± 0.158
GlnPro: 1.647 ± 0.315
GlnGln: 1.769 ± 0.351
GlnArg: 1.464 ± 0.34
GlnSer: 2.135 ± 0.321
GlnThr: 1.769 ± 0.327
GlnVal: 2.074 ± 0.355
GlnTrp: 0.671 ± 0.17
GlnTyr: 0.671 ± 0.16
GlnXaa: 0.0 ± 0.0

Arg
ArgAla: 5.367 ± 0.664
ArgCys: 0.671 ± 0.222
ArgAsp: 3.476 ± 0.558
ArgGlu: 4.391 ± 0.575
ArgPhe: 1.952 ± 0.383
ArgGly: 4.513 ± 0.651
ArgHis: 1.22 ± 0.283
ArgIle: 2.684 ± 0.43
ArgLys: 3.72 ± 0.598
ArgLeu: 6.648 ± 0.704
ArgMet: 1.952 ± 0.361
ArgAsn: 2.013 ± 0.452
ArgPro: 2.379 ± 0.397
ArgGln: 1.891 ± 0.34
ArgArg: 5.123 ± 0.732
ArgSer: 3.781 ± 0.501
ArgThr: 3.172 ± 0.469
ArgVal: 5.001 ± 0.504
ArgTrp: 1.403 ± 0.327
ArgTyr: 1.83 ± 0.317
ArgXaa: 0.0 ± 0.0

Ser
SerAla: 7.38 ± 1.171
SerCys: 0.61 ± 0.201
SerAsp: 3.232 ± 0.384
SerGlu: 3.598 ± 0.596
SerPhe: 2.074 ± 0.361
SerGly: 6.16 ± 0.746
SerHis: 1.586 ± 0.342
SerIle: 2.684 ± 0.459
SerLys: 2.379 ± 0.418
SerLeu: 5.062 ± 0.6
SerMet: 1.464 ± 0.35
SerAsn: 2.257 ± 0.442
SerPro: 3.05 ± 0.406
SerGln: 1.891 ± 0.32
SerArg: 3.111 ± 0.455
SerSer: 3.354 ± 0.822
SerThr: 3.781 ± 0.652
SerVal: 3.781 ± 0.497
SerTrp: 1.159 ± 0.251
SerTyr: 1.403 ± 0.321
SerXaa: 0.0 ± 0.0

Thr
ThrAla: 6.465 ± 0.739
ThrCys: 0.183 ± 0.123
ThrAsp: 4.208 ± 0.454
ThrGlu: 4.513 ± 0.56
ThrPhe: 2.257 ± 0.364
ThrGly: 6.343 ± 0.684
ThrHis: 0.854 ± 0.309
ThrIle: 2.562 ± 0.522
ThrLys: 2.867 ± 0.388
ThrLeu: 5.672 ± 0.606
ThrMet: 0.854 ± 0.217
ThrAsn: 1.647 ± 0.3
ThrPro: 3.842 ± 0.478
ThrGln: 1.891 ± 0.355
ThrArg: 3.354 ± 0.513
ThrSer: 3.598 ± 0.635
ThrThr: 4.635 ± 0.64
ThrVal: 5.977 ± 0.685
ThrTrp: 1.342 ± 0.309
ThrTyr: 1.525 ± 0.302
ThrXaa: 0.0 ± 0.0

Val
ValAla: 6.648 ± 0.679
ValCys: 0.183 ± 0.1
ValAsp: 5.428 ± 0.542
ValGlu: 4.696 ± 0.597
ValPhe: 2.379 ± 0.393
ValGly: 4.94 ± 0.626
ValHis: 1.281 ± 0.23
ValIle: 3.415 ± 0.535
ValLys: 2.989 ± 0.396
ValLeu: 5.672 ± 0.676
ValMet: 1.22 ± 0.29
ValAsn: 2.989 ± 0.39
ValPro: 3.964 ± 0.466
ValGln: 1.708 ± 0.373
ValArg: 4.757 ± 0.694
ValSer: 4.818 ± 0.485
ValThr: 5.55 ± 0.543
ValVal: 5.184 ± 0.744
ValTrp: 1.159 ± 0.262
ValTyr: 1.83 ± 0.388
ValXaa: 0.0 ± 0.0

Trp
TrpAla: 1.586 ± 0.317
TrpCys: 0.122 ± 0.092
TrpAsp: 1.464 ± 0.27
TrpGlu: 1.098 ± 0.242
TrpPhe: 0.915 ± 0.241
TrpGly: 1.708 ± 0.26
TrpHis: 0.671 ± 0.211
TrpIle: 1.037 ± 0.219
TrpLys: 0.427 ± 0.24
TrpLeu: 2.318 ± 0.353
TrpMet: 0.427 ± 0.153
TrpAsn: 0.61 ± 0.21
TrpPro: 0.854 ± 0.26
TrpGln: 0.793 ± 0.206
TrpArg: 1.647 ± 0.341
TrpSer: 0.915 ± 0.186
TrpThr: 1.586 ± 0.348
TrpVal: 1.647 ± 0.277
TrpTrp: 0.549 ± 0.179
TrpTyr: 0.305 ± 0.121
TrpXaa: 0.0 ± 0.0

Tyr
TyrAla: 2.867 ± 0.427
TyrCys: 0.244 ± 0.18
TyrAsp: 1.159 ± 0.273
TyrGlu: 2.745 ± 0.376
TyrPhe: 0.549 ± 0.149
TyrGly: 2.196 ± 0.364
TyrHis: 0.427 ± 0.158
TyrIle: 1.647 ± 0.31
TyrLys: 1.281 ± 0.273
TyrLeu: 2.44 ± 0.419
TyrMet: 0.549 ± 0.147
TyrAsn: 1.098 ± 0.285
TyrPro: 1.22 ± 0.258
TyrGln: 1.281 ± 0.295
TyrArg: 2.562 ± 0.37
TyrSer: 1.281 ± 0.239
TyrThr: 1.952 ± 0.428
TyrVal: 1.952 ± 0.367
TyrTrp: 0.732 ± 0.227
TyrTyr: 0.549 ± 0.207
TyrXaa: 0.0 ± 0.0

Xaa
XaaAla: 0.0 ± 0.0
XaaCys: 0.0 ± 0.0
XaaAsp: 0.0 ± 0.0
XaaGlu: 0.0 ± 0.0
XaaPhe: 0.0 ± 0.0
XaaGly: 0.0 ± 0.0
XaaHis: 0.0 ± 0.0
XaaIle: 0.0 ± 0.0
XaaLys: 0.0 ± 0.0
XaaLeu: 0.0 ± 0.0
XaaMet: 0.0 ± 0.0
XaaAsn: 0.0 ± 0.0
XaaPro: 0.0 ± 0.0
XaaGln: 0.0 ± 0.0
XaaArg: 0.0 ± 0.0
XaaSer: 0.0 ± 0.0
XaaThr: 0.0 ± 0.0
XaaVal: 0.0 ± 0.0
XaaTrp: 0.0 ± 0.0
XaaTyr: 0.0 ± 0.0
XaaXaa: 0.0 ± 0.0

Statistics based on 90 proteins (16397 amino acids)

Note: The error has been estimated by bootstrapping (×100) at the protein level.
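A protein-level bootstrap of this kind can be sketched as follows. This is a minimal illustration under assumptions, not the database's code: the function names are hypothetical, whole proteins are resampled with replacement while keeping the number of proteins fixed, and the error is taken to be the standard deviation over the resamples. The page does not state which spread measure is actually reported.

```python
import random
from collections import Counter
from statistics import stdev


def pair_per_mille(proteins, pair):
    """Frequency (per mille) of one dipeptide over a list of sequences."""
    counts = Counter()
    for seq in proteins:
        counts.update(seq[i:i + 2] for i in range(len(seq) - 1))
    total = sum(counts.values())
    return (1000.0 * counts[pair] / total) if total else 0.0


def bootstrap_error(proteins, pair, n_boot=100, seed=0):
    """Standard deviation of the frequency over n_boot protein-level resamples."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    estimates = []
    for _ in range(n_boot):
        # Resample whole proteins with replacement, keeping the proteome size.
        sample = rng.choices(proteins, k=len(proteins))
        estimates.append(pair_per_mille(sample, pair))
    return stdev(estimates)


# Toy input for illustration only; this is not the phage proteome.
toy = ["MAAAGL", "MAGX", "MKLAAV"]
err = bootstrap_error(toy, "AA")
```

Resampling proteins rather than individual residues keeps each sequence intact, so the estimate reflects variation between proteins in the proteome.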

The dipeptide statistics above (among other statistics for this proteome) can be downloaded from this CSV file.
See this proteome in: uniprot_link
Proteome-pI is available under the Creative Commons Attribution-NoDerivs license; for more details see here.

Reference: Kozlowski LP. Proteome-pI 2.0: Proteome Isoelectric Point Database Update. Nucleic Acids Res. 2021, doi: 10.1093/nar/gkab944
Contact: Lukasz P. Kozlowski