Amino acid dipeptide frequency for Mycobacterium phage TheloniousMonk

Apart from single amino acid frequencies, one can also calculate the so-called amino acid dipeptide frequency, i.e. how often each ordered pair of adjacent amino acids occurs in the proteome.
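As a minimal sketch of how such a frequency could be computed, the snippet below counts overlapping pairs of adjacent residues within each protein and normalises by the total number of pairs. The function name, the one-letter sequence input and the normalisation are assumptions made for illustration; the page does not state how the database computes its values.

```python
from collections import Counter


def dipeptide_frequencies(proteins):
    """Return dipeptide frequencies in per mille for a list of sequences.

    Assumption: sequences are one-letter amino acid strings, and pairs are
    counted in overlapping windows that never span two proteins.
    """
    counts = Counter()
    for seq in proteins:
        # Every position except the last one starts a dipeptide.
        counts.update(seq[i:i + 2] for i in range(len(seq) - 1))
    total = sum(counts.values())
    if total == 0:
        return {}
    return {pair: 1000.0 * n / total for pair, n in counts.items()}


# Two toy sequences (hypothetical, not taken from this proteome).
freqs = dipeptide_frequencies(["MAAGL", "MAL"])
print(freqs["MA"])
```

Here "MA" occurs twice among six pairs in total, so its frequency is one third of 1000‰.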

There are 400 possibilities (441 if we consider Xaa as an additional 21st amino acid representing all non-standard or unknown amino acids). Thus, if dipeptides occurred randomly and with equal probability in proteins, each of the 400 amino acid dipeptides would be present with a frequency of 0.25% (2.5‰). As this is not the case in nature, for better visibility all dipeptides that are more frequent than expected are marked in red, and those that are underrepresented are marked in blue in the table.
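The expected value and the colour marking can be sketched as follows. The simple greater-than or less-than rule is an assumption; the page does not say whether the marking also takes the error estimate into account. The two example values are taken from the table (AlaAla and CysCys).

```python
# Uniform expectation over the 20 x 20 standard dipeptides, in per mille.
EXPECTED_PERMILLE = 1000.0 / (20 * 20)  # 2.5 per mille = 0.25%


def classify(freq_permille, expected=EXPECTED_PERMILLE):
    """Return the assumed colour for a frequency given in per mille."""
    if freq_permille > expected:
        return "red"   # more frequent than expected
    if freq_permille < expected:
        return "blue"  # underrepresented
    return "none"


print(classify(12.605))  # AlaAla
print(classify(0.122))   # CysCys
```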

All values are given in per mille (‰) and therefore need to be multiplied by 10⁻³ to obtain fractions.
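The unit conversion is simple arithmetic; a short sketch with hypothetical helper names, using the AlaAla value from the table as input:

```python
def permille_to_fraction(value_permille):
    """Convert a per mille value to a plain fraction (multiply by 10^-3)."""
    return value_permille * 1e-3


def permille_to_percent(value_permille):
    """Convert a per mille value to percent (divide by 10)."""
    return value_permille / 10.0


ala_ala = 12.605  # AlaAla, in per mille
print(permille_to_fraction(ala_ala))
print(permille_to_percent(ala_ala))
```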

For more information, see the sequence space article on Wikipedia.

Ala Cys Asp Glu Phe Gly His Ile Lys Leu Met Asn Pro Gln Arg Ser Thr Val Trp Tyr Xaa
Ala
12.605AlaAla: 12.605 ± 1.126
0.612AlaCys: 0.612 ± 0.184
6.18AlaAsp: 6.18 ± 0.624
6.119AlaGlu: 6.119 ± 0.851
2.631AlaPhe: 2.631 ± 0.373
7.649AlaGly: 7.649 ± 0.936
1.407AlaHis: 1.407 ± 0.316
4.406AlaIle: 4.406 ± 0.645
3.732AlaLys: 3.732 ± 0.485
9.729AlaLeu: 9.729 ± 0.869
2.509AlaMet: 2.509 ± 0.425
2.937AlaAsn: 2.937 ± 0.394
5.201AlaPro: 5.201 ± 0.727
3.059AlaGln: 3.059 ± 0.486
5.935AlaArg: 5.935 ± 0.463
4.956AlaSer: 4.956 ± 0.574
6.058AlaThr: 6.058 ± 0.734
7.893AlaVal: 7.893 ± 0.661
2.142AlaTrp: 2.142 ± 0.384
3.121AlaTyr: 3.121 ± 0.496
0.0AlaXaa: 0.0 ± 0.0
Cys
0.734CysAla: 0.734 ± 0.241
0.122CysCys: 0.122 ± 0.087
0.551CysAsp: 0.551 ± 0.175
0.734CysGlu: 0.734 ± 0.239
0.184CysPhe: 0.184 ± 0.108
0.428CysGly: 0.428 ± 0.17
0.184CysHis: 0.184 ± 0.099
0.245CysIle: 0.245 ± 0.14
0.49CysLys: 0.49 ± 0.164
0.428CysLeu: 0.428 ± 0.151
0.184CysMet: 0.184 ± 0.089
0.428CysAsn: 0.428 ± 0.158
0.367CysPro: 0.367 ± 0.137
0.184CysGln: 0.184 ± 0.115
0.551CysArg: 0.551 ± 0.182
0.428CysSer: 0.428 ± 0.166
0.184CysThr: 0.184 ± 0.091
0.122CysVal: 0.122 ± 0.08
0.184CysTrp: 0.184 ± 0.099
0.122CysTyr: 0.122 ± 0.078
0.0CysXaa: 0.0 ± 0.0
Asp
6.302AspAla: 6.302 ± 0.51
0.49AspCys: 0.49 ± 0.161
4.467AspAsp: 4.467 ± 0.52
4.161AspGlu: 4.161 ± 0.471
2.203AspPhe: 2.203 ± 0.316
6.364AspGly: 6.364 ± 0.683
1.224AspHis: 1.224 ± 0.313
2.509AspIle: 2.509 ± 0.418
2.57AspLys: 2.57 ± 0.428
6.18AspLeu: 6.18 ± 0.73
1.163AspMet: 1.163 ± 0.203
1.958AspAsn: 1.958 ± 0.372
4.283AspPro: 4.283 ± 0.609
1.652AspGln: 1.652 ± 0.377
3.549AspArg: 3.549 ± 0.4
3.243AspSer: 3.243 ± 0.466
3.916AspThr: 3.916 ± 0.371
4.344AspVal: 4.344 ± 0.577
1.652AspTrp: 1.652 ± 0.303
1.958AspTyr: 1.958 ± 0.322
0.0AspXaa: 0.0 ± 0.0
Glu
5.691GluAla: 5.691 ± 0.685
0.245GluCys: 0.245 ± 0.136
4.773GluAsp: 4.773 ± 0.606
4.895GluGlu: 4.895 ± 0.536
2.08GluPhe: 2.08 ± 0.357
4.038GluGly: 4.038 ± 0.492
1.285GluHis: 1.285 ± 0.283
3.671GluIle: 3.671 ± 0.482
3.059GluLys: 3.059 ± 0.458
6.731GluLeu: 6.731 ± 0.584
1.652GluMet: 1.652 ± 0.262
1.897GluAsn: 1.897 ± 0.369
2.264GluPro: 2.264 ± 0.429
2.692GluGln: 2.692 ± 0.378
3.732GluArg: 3.732 ± 0.559
3.488GluSer: 3.488 ± 0.504
3.121GluThr: 3.121 ± 0.481
6.119GluVal: 6.119 ± 0.589
1.53GluTrp: 1.53 ± 0.406
2.937GluTyr: 2.937 ± 0.44
0.0GluXaa: 0.0 ± 0.0
Phe
2.142PheAla: 2.142 ± 0.341
0.306PheCys: 0.306 ± 0.156
2.386PheAsp: 2.386 ± 0.366
2.264PheGlu: 2.264 ± 0.32
0.734PhePhe: 0.734 ± 0.19
3.488PheGly: 3.488 ± 0.465
0.673PheHis: 0.673 ± 0.22
1.469PheIle: 1.469 ± 0.279
0.857PheLys: 0.857 ± 0.261
2.692PheLeu: 2.692 ± 0.555
0.612PheMet: 0.612 ± 0.208
1.285PheAsn: 1.285 ± 0.29
1.774PhePro: 1.774 ± 0.348
0.979PheGln: 0.979 ± 0.203
1.591PheArg: 1.591 ± 0.317
1.774PheSer: 1.774 ± 0.291
1.958PheThr: 1.958 ± 0.41
2.019PheVal: 2.019 ± 0.368
0.551PheTrp: 0.551 ± 0.208
0.979PheTyr: 0.979 ± 0.203
0.0PheXaa: 0.0 ± 0.0
Gly
7.281GlyAla: 7.281 ± 1.038
0.734GlyCys: 0.734 ± 0.213
6.119GlyAsp: 6.119 ± 0.568
5.079GlyGlu: 5.079 ± 0.604
2.998GlyPhe: 2.998 ± 0.507
8.75GlyGly: 8.75 ± 2.248
2.142GlyHis: 2.142 ± 0.404
4.589GlyIle: 4.589 ± 0.865
4.1GlyLys: 4.1 ± 0.536
7.22GlyLeu: 7.22 ± 0.788
1.774GlyMet: 1.774 ± 0.275
3.182GlyAsn: 3.182 ± 0.39
3.671GlyPro: 3.671 ± 0.551
2.57GlyGln: 2.57 ± 0.405
5.323GlyArg: 5.323 ± 0.586
5.874GlySer: 5.874 ± 0.788
5.874GlyThr: 5.874 ± 0.685
5.262GlyVal: 5.262 ± 0.61
2.57GlyTrp: 2.57 ± 0.412
2.998GlyTyr: 2.998 ± 0.436
0.0GlyXaa: 0.0 ± 0.0
His
1.713HisAla: 1.713 ± 0.293
0.245HisCys: 0.245 ± 0.128
1.224HisAsp: 1.224 ± 0.247
1.53HisGlu: 1.53 ± 0.312
0.673HisPhe: 0.673 ± 0.171
1.836HisGly: 1.836 ± 0.421
0.612HisHis: 0.612 ± 0.215
0.979HisIle: 0.979 ± 0.247
1.04HisLys: 1.04 ± 0.291
1.713HisLeu: 1.713 ± 0.42
0.061HisMet: 0.061 ± 0.053
0.184HisAsn: 0.184 ± 0.108
1.101HisPro: 1.101 ± 0.24
1.04HisGln: 1.04 ± 0.229
1.285HisArg: 1.285 ± 0.289
0.795HisSer: 0.795 ± 0.218
1.285HisThr: 1.285 ± 0.277
1.958HisVal: 1.958 ± 0.365
0.428HisTrp: 0.428 ± 0.151
0.612HisTyr: 0.612 ± 0.234
0.0HisXaa: 0.0 ± 0.0
Ile
5.874IleAla: 5.874 ± 0.604
0.367IleCys: 0.367 ± 0.153
3.121IleAsp: 3.121 ± 0.387
3.304IleGlu: 3.304 ± 0.392
0.979IlePhe: 0.979 ± 0.242
4.038IleGly: 4.038 ± 0.532
1.04IleHis: 1.04 ± 0.21
1.53IleIle: 1.53 ± 0.329
1.591IleLys: 1.591 ± 0.294
3.121IleLeu: 3.121 ± 0.333
0.734IleMet: 0.734 ± 0.191
1.713IleAsn: 1.713 ± 0.293
2.815IlePro: 2.815 ± 0.399
1.652IleGln: 1.652 ± 0.349
3.427IleArg: 3.427 ± 0.477
3.059IleSer: 3.059 ± 0.444
3.549IleThr: 3.549 ± 0.386
2.937IleVal: 2.937 ± 0.456
0.857IleTrp: 0.857 ± 0.209
1.652IleTyr: 1.652 ± 0.306
0.0IleXaa: 0.0 ± 0.0
Lys
3.671LysAla: 3.671 ± 0.457
0.122LysCys: 0.122 ± 0.08
2.386LysAsp: 2.386 ± 0.374
1.836LysGlu: 1.836 ± 0.316
1.346LysPhe: 1.346 ± 0.237
2.692LysGly: 2.692 ± 0.376
1.224LysHis: 1.224 ± 0.302
1.652LysIle: 1.652 ± 0.351
2.264LysLys: 2.264 ± 0.429
3.427LysLeu: 3.427 ± 0.436
0.979LysMet: 0.979 ± 0.236
1.836LysAsn: 1.836 ± 0.32
2.692LysPro: 2.692 ± 0.403
1.346LysGln: 1.346 ± 0.285
3.365LysArg: 3.365 ± 0.541
2.264LysSer: 2.264 ± 0.383
2.448LysThr: 2.448 ± 0.416
3.243LysVal: 3.243 ± 0.416
0.734LysTrp: 0.734 ± 0.206
1.163LysTyr: 1.163 ± 0.311
0.0LysXaa: 0.0 ± 0.0
Leu
10.218LeuAla: 10.218 ± 0.958
0.367LeuCys: 0.367 ± 0.178
5.874LeuAsp: 5.874 ± 0.55
5.14LeuGlu: 5.14 ± 0.534
2.264LeuPhe: 2.264 ± 0.441
7.159LeuGly: 7.159 ± 0.707
1.346LeuHis: 1.346 ± 0.296
4.467LeuIle: 4.467 ± 0.563
3.427LeuLys: 3.427 ± 0.456
6.486LeuLeu: 6.486 ± 0.635
1.469LeuMet: 1.469 ± 0.251
2.876LeuAsn: 2.876 ± 0.411
5.385LeuPro: 5.385 ± 0.602
2.386LeuGln: 2.386 ± 0.418
6.425LeuArg: 6.425 ± 0.508
5.691LeuSer: 5.691 ± 0.596
6.547LeuThr: 6.547 ± 0.669
5.079LeuVal: 5.079 ± 0.591
1.04LeuTrp: 1.04 ± 0.305
2.325LeuTyr: 2.325 ± 0.346
0.0LeuXaa: 0.0 ± 0.0
Met
2.019MetAla: 2.019 ± 0.312
0.0MetCys: 0.0 ± 0.0
0.979MetAsp: 0.979 ± 0.245
1.53MetGlu: 1.53 ± 0.29
0.551MetPhe: 0.551 ± 0.161
1.591MetGly: 1.591 ± 0.284
0.428MetHis: 0.428 ± 0.205
0.367MetIle: 0.367 ± 0.127
1.04MetLys: 1.04 ± 0.26
1.101MetLeu: 1.101 ± 0.284
0.306MetMet: 0.306 ± 0.125
0.979MetAsn: 0.979 ± 0.236
1.04MetPro: 1.04 ± 0.285
0.795MetGln: 0.795 ± 0.247
1.224MetArg: 1.224 ± 0.268
1.958MetSer: 1.958 ± 0.421
2.386MetThr: 2.386 ± 0.326
0.918MetVal: 0.918 ± 0.243
0.306MetTrp: 0.306 ± 0.118
0.367MetTyr: 0.367 ± 0.155
0.0MetXaa: 0.0 ± 0.0
Asn
3.794AsnAla: 3.794 ± 0.445
0.122AsnCys: 0.122 ± 0.091
1.836AsnAsp: 1.836 ± 0.369
1.774AsnGlu: 1.774 ± 0.331
0.918AsnPhe: 0.918 ± 0.223
3.916AsnGly: 3.916 ± 0.539
0.734AsnHis: 0.734 ± 0.202
1.53AsnIle: 1.53 ± 0.3
0.734AsnLys: 0.734 ± 0.213
2.509AsnLeu: 2.509 ± 0.315
0.673AsnMet: 0.673 ± 0.177
1.163AsnAsn: 1.163 ± 0.289
2.509AsnPro: 2.509 ± 0.386
1.04AsnGln: 1.04 ± 0.228
1.774AsnArg: 1.774 ± 0.336
1.591AsnSer: 1.591 ± 0.346
2.142AsnThr: 2.142 ± 0.36
2.631AsnVal: 2.631 ± 0.356
0.612AsnTrp: 0.612 ± 0.177
1.163AsnTyr: 1.163 ± 0.267
0.0AsnXaa: 0.0 ± 0.0
Pro
4.895ProAla: 4.895 ± 0.581
0.49ProCys: 0.49 ± 0.167
3.855ProAsp: 3.855 ± 0.46
4.773ProGlu: 4.773 ± 0.506
2.08ProPhe: 2.08 ± 0.397
4.773ProGly: 4.773 ± 0.544
0.795ProHis: 0.795 ± 0.228
2.386ProIle: 2.386 ± 0.369
2.019ProLys: 2.019 ± 0.261
4.1ProLeu: 4.1 ± 0.509
0.857ProMet: 0.857 ± 0.265
1.591ProAsn: 1.591 ± 0.293
2.998ProPro: 2.998 ± 0.412
1.836ProGln: 1.836 ± 0.394
2.753ProArg: 2.753 ± 0.495
3.61ProSer: 3.61 ± 0.473
4.344ProThr: 4.344 ± 0.605
3.488ProVal: 3.488 ± 0.478
0.734ProTrp: 0.734 ± 0.24
1.53ProTyr: 1.53 ± 0.328
0.0ProXaa: 0.0 ± 0.0
Gln
3.365GlnAla: 3.365 ± 0.783
0.122GlnCys: 0.122 ± 0.093
1.591GlnAsp: 1.591 ± 0.324
1.53GlnGlu: 1.53 ± 0.297
1.285GlnPhe: 1.285 ± 0.355
2.876GlnGly: 2.876 ± 0.379
0.551GlnHis: 0.551 ± 0.155
2.509GlnIle: 2.509 ± 0.438
1.04GlnLys: 1.04 ± 0.196
3.671GlnLeu: 3.671 ± 0.543
0.857GlnMet: 0.857 ± 0.2
0.551GlnAsn: 0.551 ± 0.148
1.836GlnPro: 1.836 ± 0.37
2.019GlnGln: 2.019 ± 0.401
1.836GlnArg: 1.836 ± 0.351
1.897GlnSer: 1.897 ± 0.272
1.836GlnThr: 1.836 ± 0.392
2.325GlnVal: 2.325 ± 0.306
0.734GlnTrp: 0.734 ± 0.172
0.612GlnTyr: 0.612 ± 0.153
0.0GlnXaa: 0.0 ± 0.0
Arg
5.201ArgAla: 5.201 ± 0.503
0.857ArgCys: 0.857 ± 0.244
2.998ArgAsp: 2.998 ± 0.381
5.017ArgGlu: 5.017 ± 0.592
1.897ArgPhe: 1.897 ± 0.383
5.813ArgGly: 5.813 ± 0.688
1.101ArgHis: 1.101 ± 0.274
3.243ArgIle: 3.243 ± 0.477
3.121ArgLys: 3.121 ± 0.484
5.752ArgLeu: 5.752 ± 0.616
1.836ArgMet: 1.836 ± 0.326
2.325ArgAsn: 2.325 ± 0.468
2.631ArgPro: 2.631 ± 0.428
1.713ArgGln: 1.713 ± 0.37
5.14ArgArg: 5.14 ± 0.662
3.488ArgSer: 3.488 ± 0.485
3.243ArgThr: 3.243 ± 0.604
4.956ArgVal: 4.956 ± 0.596
1.407ArgTrp: 1.407 ± 0.296
2.203ArgTyr: 2.203 ± 0.328
0.0ArgXaa: 0.0 ± 0.0
Ser
5.813SerAla: 5.813 ± 0.579
0.49SerCys: 0.49 ± 0.168
3.427SerAsp: 3.427 ± 0.534
3.488SerGlu: 3.488 ± 0.469
1.836SerPhe: 1.836 ± 0.382
6.792SerGly: 6.792 ± 1.129
1.591SerHis: 1.591 ± 0.358
2.386SerIle: 2.386 ± 0.327
2.142SerLys: 2.142 ± 0.357
5.323SerLeu: 5.323 ± 0.544
1.285SerMet: 1.285 ± 0.218
2.203SerAsn: 2.203 ± 0.41
2.937SerPro: 2.937 ± 0.378
2.08SerGln: 2.08 ± 0.313
2.815SerArg: 2.815 ± 0.43
3.121SerSer: 3.121 ± 0.676
3.488SerThr: 3.488 ± 0.486
4.038SerVal: 4.038 ± 0.507
1.346SerTrp: 1.346 ± 0.315
1.407SerTyr: 1.407 ± 0.269
0.0SerXaa: 0.0 ± 0.0
Thr
6.67ThrAla: 6.67 ± 0.742
0.306ThrCys: 0.306 ± 0.12
3.977ThrAsp: 3.977 ± 0.452
4.65ThrGlu: 4.65 ± 0.515
2.325ThrPhe: 2.325 ± 0.413
6.425ThrGly: 6.425 ± 0.575
1.469ThrHis: 1.469 ± 0.338
2.631ThrIle: 2.631 ± 0.486
2.386ThrLys: 2.386 ± 0.327
6.058ThrLeu: 6.058 ± 0.715
0.673ThrMet: 0.673 ± 0.197
2.019ThrAsn: 2.019 ± 0.294
4.161ThrPro: 4.161 ± 0.496
1.774ThrGln: 1.774 ± 0.375
3.855ThrArg: 3.855 ± 0.545
3.671ThrSer: 3.671 ± 0.597
4.528ThrThr: 4.528 ± 0.545
5.629ThrVal: 5.629 ± 0.557
1.346ThrTrp: 1.346 ± 0.267
1.958ThrTyr: 1.958 ± 0.351
0.0ThrXaa: 0.0 ± 0.0
Val
6.731ValAla: 6.731 ± 0.691
0.367ValCys: 0.367 ± 0.143
5.323ValAsp: 5.323 ± 0.526
4.956ValGlu: 4.956 ± 0.58
2.142ValPhe: 2.142 ± 0.361
4.773ValGly: 4.773 ± 0.711
1.469ValHis: 1.469 ± 0.274
3.977ValIle: 3.977 ± 0.617
3.243ValLys: 3.243 ± 0.451
5.446ValLeu: 5.446 ± 0.519
1.285ValMet: 1.285 ± 0.303
2.509ValAsn: 2.509 ± 0.399
3.916ValPro: 3.916 ± 0.407
2.264ValGln: 2.264 ± 0.378
5.017ValArg: 5.017 ± 0.64
4.65ValSer: 4.65 ± 0.478
5.752ValThr: 5.752 ± 0.582
4.895ValVal: 4.895 ± 0.631
1.285ValTrp: 1.285 ± 0.255
2.08ValTyr: 2.08 ± 0.384
0.0ValXaa: 0.0 ± 0.0
Trp
1.591TrpAla: 1.591 ± 0.321
0.245TrpCys: 0.245 ± 0.094
1.774TrpAsp: 1.774 ± 0.33
0.795TrpGlu: 0.795 ± 0.188
0.795TrpPhe: 0.795 ± 0.255
1.836TrpGly: 1.836 ± 0.329
0.367TrpHis: 0.367 ± 0.158
1.163TrpIle: 1.163 ± 0.239
0.551TrpLys: 0.551 ± 0.23
1.713TrpLeu: 1.713 ± 0.358
0.367TrpMet: 0.367 ± 0.171
0.49TrpAsn: 0.49 ± 0.188
0.612TrpPro: 0.612 ± 0.177
0.979TrpGln: 0.979 ± 0.229
1.285TrpArg: 1.285 ± 0.259
0.979TrpSer: 0.979 ± 0.264
1.713TrpThr: 1.713 ± 0.393
2.203TrpVal: 2.203 ± 0.377
0.49TrpTrp: 0.49 ± 0.183
0.428TrpTyr: 0.428 ± 0.145
0.0TrpXaa: 0.0 ± 0.0
Tyr
2.57TyrAla: 2.57 ± 0.481
0.245TyrCys: 0.245 ± 0.158
1.346TyrAsp: 1.346 ± 0.29
2.325TyrGlu: 2.325 ± 0.351
0.612TyrPhe: 0.612 ± 0.189
2.876TyrGly: 2.876 ± 0.397
0.673TyrHis: 0.673 ± 0.188
1.652TyrIle: 1.652 ± 0.331
1.285TyrLys: 1.285 ± 0.242
2.57TyrLeu: 2.57 ± 0.382
0.551TyrMet: 0.551 ± 0.163
1.04TyrAsn: 1.04 ± 0.285
1.774TyrPro: 1.774 ± 0.356
0.979TyrGln: 0.979 ± 0.309
3.059TyrArg: 3.059 ± 0.511
1.469TyrSer: 1.469 ± 0.265
2.142TyrThr: 2.142 ± 0.382
2.019TyrVal: 2.019 ± 0.31
0.428TyrTrp: 0.428 ± 0.176
0.673TyrTyr: 0.673 ± 0.202
0.0TyrXaa: 0.0 ± 0.0
Xaa
0.0XaaAla: 0.0 ± 0.0
0.0XaaCys: 0.0 ± 0.0
0.0XaaAsp: 0.0 ± 0.0
0.0XaaGlu: 0.0 ± 0.0
0.0XaaPhe: 0.0 ± 0.0
0.0XaaGly: 0.0 ± 0.0
0.0XaaHis: 0.0 ± 0.0
0.0XaaIle: 0.0 ± 0.0
0.0XaaLys: 0.0 ± 0.0
0.0XaaLeu: 0.0 ± 0.0
0.0XaaMet: 0.0 ± 0.0
0.0XaaAsn: 0.0 ± 0.0
0.0XaaPro: 0.0 ± 0.0
0.0XaaGln: 0.0 ± 0.0
0.0XaaArg: 0.0 ± 0.0
0.0XaaSer: 0.0 ± 0.0
0.0XaaThr: 0.0 ± 0.0
0.0XaaVal: 0.0 ± 0.0
0.0XaaTrp: 0.0 ± 0.0
0.0XaaTyr: 0.0 ± 0.0
0.0XaaXaa: 0.0 ± 0.0
Statistics based on 90 proteins (16344 amino acids)
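For readers who want to work with the listing above programmatically, here is a hedged sketch of a parser for lines of the form "12.605AlaAla: 12.605 ± 1.126". The regular expression and function name are illustrative assumptions about this text layout only, not part of any Proteome-pI tool; the CSV file mentioned below is likely the more convenient source.

```python
import re

# Displayed value, two three-letter residue codes, then "mean ± error".
CELL_RE = re.compile(
    r"^(?P<shown>\d+(?:\.\d+)?)"
    r"(?P<first>[A-Z][a-z]{2})(?P<second>[A-Z][a-z]{2}): "
    r"(?P<mean>\d+(?:\.\d+)?) ± (?P<err>\d+(?:\.\d+)?)$"
)


def parse_cell(line):
    """Return (first, second, mean, error) or None for non-cell lines."""
    m = CELL_RE.match(line.strip())
    if m is None:
        return None  # e.g. a row label such as "Ala"
    return (m["first"], m["second"], float(m["mean"]), float(m["err"]))


print(parse_cell("12.605AlaAla: 12.605 ± 1.126"))
```

Row labels such as "Ala" do not match the pattern and return None, so they can be skipped while iterating over the lines.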

Note: The error has been estimated by bootstrapping (100 replicates) at the protein level.
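A protein-level bootstrap can be sketched as below: whole proteins are resampled with replacement, the frequency is recomputed for each replicate, and the spread of the replicates serves as the error. Reporting the standard deviation, the fixed seed and all names are assumptions for illustration; the page does not describe the exact procedure.

```python
import random
import statistics


def count_pair(seq, pair):
    """Count overlapping occurrences of a two-letter pair in a sequence."""
    return sum(1 for i in range(len(seq) - 1) if seq[i:i + 2] == pair)


def bootstrap_error(proteins, pair, n_boot=100, seed=0):
    """Return (mean, standard deviation) of the per mille frequency of
    `pair` over n_boot resamples of whole proteins."""
    rng = random.Random(seed)
    estimates = []
    for _ in range(n_boot):
        # Resample proteins, not residues, so each protein stays intact.
        sample = rng.choices(proteins, k=len(proteins))
        total = sum(max(len(s) - 1, 0) for s in sample)
        hits = sum(count_pair(s, pair) for s in sample)
        estimates.append(1000.0 * hits / total if total else 0.0)
    return statistics.mean(estimates), statistics.stdev(estimates)


# Toy input (hypothetical sequences, not from this proteome).
mean, err = bootstrap_error(["MAAGL", "MAL", "MKAAV"], "AA")
print(mean, err)
```

Resampling at the protein level keeps residues from the same protein together, so the estimate reflects variation between proteins rather than between individual positions.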

The dipeptide statistics above (among other statistics for this proteome) can be downloaded from this CSV file
See this proteome in: uniprot_link
Proteome-pI is available under a Creative Commons Attribution-NoDerivs license; for more details see here

Reference: Kozlowski LP. Proteome-pI 2.0: Proteome Isoelectric Point Database Update. Nucleic Acids Res. 2021, doi: 10.1093/nar/gkab944
Contact: Lukasz P. Kozlowski