Amino acid dipeptide frequency for Mycobacterium phage IdentityCrisis

Apart from single amino acid frequencies, one can also calculate the so-called amino acid dipeptide frequency, i.e. how often each ordered pair of adjacent amino acids occurs.

There are 400 possible dipeptides (441 if Xaa is counted as an additional 21st amino acid representing all non-standard or unknown amino acids). Thus, if dipeptides occurred randomly and with equal probability in proteins, each of the 400 dipeptides would be present at a frequency of 0.25%. As this is not the case in nature, for better visibility dipeptides that occur more often than expected are marked in red, and those that are underrepresented are marked in blue in the table.
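The counting described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the database's own code: the function name `dipeptide_permille`, the toy sequences, the use of overlapping pairs, and the normalization by the total number of pairs are all assumptions made here, and the database may normalize differently.

```python
from collections import Counter
from itertools import product

STANDARD = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard amino acids, one-letter codes

def dipeptide_permille(proteins):
    """Return {dipeptide: frequency in per mille}, pooled over all proteins.

    Assumption: overlapping pairs are counted and the total number of
    pairs is used as the denominator.
    """
    counts = Counter()
    for seq in proteins:
        # Map anything outside the 20 standard residues to X (Xaa).
        seq = "".join(c if c in STANDARD else "X" for c in seq.upper())
        # Pairs are taken within one protein and never span two proteins.
        counts.update(seq[i:i + 2] for i in range(len(seq) - 1))
    total = sum(counts.values())
    alphabet = STANDARD + "X"  # 21 symbols, hence 441 possible pairs
    return {a + b: (1000.0 * counts[a + b] / total if total else 0.0)
            for a, b in product(alphabet, repeat=2)}

# Expected value if all 400 standard dipeptides were equally likely.
UNIFORM_PERMILLE = 1000.0 / 400  # 2.5 per mille, i.e. 0.25 %

toy = ["MAAGL", "AAXK"]  # hypothetical sequences, not from this proteome
freqs = dipeptide_permille(toy)
print(len(freqs), freqs["AA"] > UNIFORM_PERMILLE)
```

With real data, the input would be the protein sequences of the proteome, for example read from a FASTA file.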

All values are given in per mille (‰), so they need to be multiplied by 10⁻³ to obtain fractions.
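As a small worked example of the unit conversion, the sketch below converts two values copied from the table into fractions and compares them with the uniform expectation of 0.25% (2.5‰). The helper names are hypothetical, and using the uniform expectation as the threshold between over- and underrepresented is an assumption based on the description above.

```python
# Assumption: the reference value is the uniform expectation over 400 dipeptides.
UNIFORM_PERMILLE = 1000.0 / 400  # 2.5 per mille = 0.25 %

def permille_to_fraction(value_permille):
    """Convert a per mille value to a plain fraction (multiply by 10^-3)."""
    return value_permille * 1e-3

def representation(value_permille, expected=UNIFORM_PERMILLE):
    """Label a value relative to the expected frequency."""
    if value_permille > expected:
        return "over"
    if value_permille < expected:
        return "under"
    return "as expected"

# Two values copied from the table (in per mille).
ala_ala = 17.537
cys_cys = 0.163
print(permille_to_fraction(ala_ala), representation(ala_ala))
print(permille_to_fraction(cys_cys), representation(cys_cys))
```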

For more information, see the sequence space article on Wikipedia.

Ala Cys Asp Glu Phe Gly His Ile Lys Leu Met Asn Pro Gln Arg Ser Thr Val Trp Tyr Xaa
Ala
AlaAla: 17.537 ± 1.742
AlaCys: 1.305 ± 0.289
AlaAsp: 8.401 ± 0.825
AlaGlu: 8.157 ± 1.008
AlaPhe: 2.692 ± 0.46
AlaGly: 10.767 ± 1.284
AlaHis: 1.958 ± 0.392
AlaIle: 7.504 ± 0.831
AlaLys: 3.915 ± 0.539
AlaLeu: 10.767 ± 0.912
AlaMet: 3.426 ± 0.471
AlaAsn: 3.344 ± 0.439
AlaPro: 5.22 ± 0.596
AlaGln: 4.486 ± 0.632
AlaArg: 7.83 ± 1.099
AlaSer: 6.199 ± 0.951
AlaThr: 7.259 ± 0.731
AlaVal: 9.788 ± 0.943
AlaTrp: 1.876 ± 0.449
AlaTyr: 2.039 ± 0.391
AlaXaa: 0.0 ± 0.0
Cys
CysAla: 0.816 ± 0.272
CysCys: 0.163 ± 0.119
CysAsp: 0.653 ± 0.205
CysGlu: 0.653 ± 0.229
CysPhe: 0.0 ± 0.0
CysGly: 1.223 ± 0.292
CysHis: 0.408 ± 0.159
CysIle: 0.082 ± 0.084
CysLys: 0.163 ± 0.113
CysLeu: 0.816 ± 0.244
CysMet: 0.0 ± 0.0
CysAsn: 0.408 ± 0.171
CysPro: 1.142 ± 0.272
CysGln: 0.163 ± 0.11
CysArg: 1.305 ± 0.285
CysSer: 0.653 ± 0.239
CysThr: 0.897 ± 0.247
CysVal: 0.408 ± 0.165
CysTrp: 0.163 ± 0.119
CysTyr: 0.326 ± 0.176
CysXaa: 0.0 ± 0.0
Asp
AspAla: 7.096 ± 0.611
AspCys: 0.897 ± 0.29
AspAsp: 4.894 ± 0.631
AspGlu: 4.323 ± 0.589
AspPhe: 2.284 ± 0.359
AspGly: 6.688 ± 0.672
AspHis: 2.039 ± 0.434
AspIle: 3.1 ± 0.396
AspLys: 1.713 ± 0.416
AspLeu: 5.383 ± 0.557
AspMet: 1.55 ± 0.339
AspAsn: 2.121 ± 0.417
AspPro: 4.323 ± 0.547
AspGln: 1.794 ± 0.351
AspArg: 4.16 ± 0.559
AspSer: 2.692 ± 0.572
AspThr: 4.078 ± 0.579
AspVal: 3.915 ± 0.486
AspTrp: 1.631 ± 0.438
AspTyr: 1.223 ± 0.291
AspXaa: 0.0 ± 0.0
Glu
GluAla: 5.873 ± 0.688
GluCys: 0.979 ± 0.313
GluAsp: 3.507 ± 0.528
GluGlu: 1.876 ± 0.354
GluPhe: 1.713 ± 0.403
GluGly: 3.018 ± 0.484
GluHis: 1.06 ± 0.331
GluIle: 3.263 ± 0.524
GluLys: 1.794 ± 0.392
GluLeu: 5.628 ± 0.663
GluMet: 1.06 ± 0.272
GluAsn: 1.55 ± 0.317
GluPro: 2.529 ± 0.41
GluGln: 2.365 ± 0.597
GluArg: 3.915 ± 0.795
GluSer: 2.529 ± 0.484
GluThr: 2.936 ± 0.606
GluVal: 4.568 ± 0.618
GluTrp: 1.223 ± 0.291
GluTyr: 1.142 ± 0.274
GluXaa: 0.0 ± 0.0
Phe
PheAla: 2.529 ± 0.405
PheCys: 0.163 ± 0.126
PheAsp: 2.284 ± 0.384
PheGlu: 0.897 ± 0.313
PhePhe: 0.408 ± 0.144
PheGly: 3.589 ± 0.444
PheHis: 0.489 ± 0.179
PheIle: 1.06 ± 0.303
PheLys: 0.897 ± 0.235
PheLeu: 1.794 ± 0.309
PheMet: 0.571 ± 0.193
PheAsn: 0.816 ± 0.254
PhePro: 0.897 ± 0.269
PheGln: 0.979 ± 0.249
PheArg: 0.979 ± 0.273
PheSer: 1.305 ± 0.389
PheThr: 2.202 ± 0.493
PheVal: 2.284 ± 0.448
PheTrp: 0.653 ± 0.231
PheTyr: 0.489 ± 0.24
PheXaa: 0.0 ± 0.0
Gly
GlyAla: 8.891 ± 1.193
GlyCys: 0.489 ± 0.192
GlyAsp: 6.036 ± 0.819
GlyGlu: 5.22 ± 0.572
GlyPhe: 2.529 ± 0.464
GlyGly: 13.214 ± 2.24
GlyHis: 1.713 ± 0.36
GlyIle: 4.976 ± 0.636
GlyLys: 2.692 ± 0.437
GlyLeu: 7.341 ± 0.707
GlyMet: 0.979 ± 0.276
GlyAsn: 3.1 ± 0.52
GlyPro: 3.915 ± 0.736
GlyGln: 2.936 ± 0.571
GlyArg: 5.954 ± 0.529
GlySer: 5.22 ± 0.611
GlyThr: 7.912 ± 1.066
GlyVal: 7.015 ± 0.693
GlyTrp: 1.631 ± 0.246
GlyTyr: 2.365 ± 0.438
GlyXaa: 0.0 ± 0.0
His
HisAla: 2.202 ± 0.41
HisCys: 0.408 ± 0.22
HisAsp: 0.979 ± 0.239
HisGlu: 0.816 ± 0.255
HisPhe: 0.163 ± 0.103
HisGly: 1.631 ± 0.291
HisHis: 0.489 ± 0.22
HisIle: 1.142 ± 0.259
HisLys: 0.653 ± 0.222
HisLeu: 1.55 ± 0.391
HisMet: 0.326 ± 0.141
HisAsn: 0.979 ± 0.288
HisPro: 1.55 ± 0.351
HisGln: 0.653 ± 0.281
HisArg: 1.387 ± 0.352
HisSer: 0.653 ± 0.258
HisThr: 1.631 ± 0.385
HisVal: 1.55 ± 0.266
HisTrp: 0.489 ± 0.186
HisTyr: 0.489 ± 0.169
HisXaa: 0.0 ± 0.0
Ile
IleAla: 7.178 ± 0.897
IleCys: 0.489 ± 0.188
IleAsp: 4.976 ± 0.685
IleGlu: 3.67 ± 0.427
IlePhe: 1.305 ± 0.359
IleGly: 4.731 ± 0.733
IleHis: 0.653 ± 0.289
IleIle: 1.223 ± 0.254
IleLys: 0.897 ± 0.264
IleLeu: 2.692 ± 0.454
IleMet: 0.408 ± 0.152
IleAsn: 1.876 ± 0.366
IlePro: 3.344 ± 0.669
IleGln: 1.794 ± 0.419
IleArg: 3.426 ± 0.428
IleSer: 3.018 ± 0.43
IleThr: 2.936 ± 0.582
IleVal: 3.67 ± 0.62
IleTrp: 0.897 ± 0.276
IleTyr: 0.979 ± 0.222
IleXaa: 0.0 ± 0.0
Lys
LysAla: 4.731 ± 0.779
LysCys: 0.408 ± 0.187
LysAsp: 1.142 ± 0.279
LysGlu: 1.142 ± 0.301
LysPhe: 0.489 ± 0.167
LysGly: 2.284 ± 0.511
LysHis: 0.408 ± 0.186
LysIle: 1.958 ± 0.357
LysLys: 1.06 ± 0.301
LysLeu: 3.181 ± 0.375
LysMet: 0.408 ± 0.147
LysAsn: 0.816 ± 0.239
LysPro: 2.284 ± 0.383
LysGln: 1.55 ± 0.411
LysArg: 2.121 ± 0.43
LysSer: 2.284 ± 0.49
LysThr: 0.897 ± 0.266
LysVal: 2.692 ± 0.567
LysTrp: 0.489 ± 0.147
LysTyr: 0.734 ± 0.265
LysXaa: 0.0 ± 0.0
Leu
LeuAla: 12.724 ± 1.345
LeuCys: 0.816 ± 0.282
LeuAsp: 5.465 ± 0.688
LeuGlu: 3.834 ± 0.473
LeuPhe: 2.202 ± 0.397
LeuGly: 6.933 ± 0.779
LeuHis: 1.223 ± 0.329
LeuIle: 4.241 ± 0.783
LeuLys: 3.181 ± 0.416
LeuLeu: 7.504 ± 0.757
LeuMet: 1.06 ± 0.294
LeuAsn: 1.713 ± 0.36
LeuPro: 4.16 ± 0.44
LeuGln: 2.936 ± 0.509
LeuArg: 5.628 ± 0.72
LeuSer: 6.117 ± 0.681
LeuThr: 6.199 ± 0.639
LeuVal: 4.486 ± 0.616
LeuTrp: 1.958 ± 0.44
LeuTyr: 1.468 ± 0.275
LeuXaa: 0.0 ± 0.0
Met
MetAla: 1.713 ± 0.315
MetCys: 0.163 ± 0.094
MetAsp: 1.06 ± 0.242
MetGlu: 0.571 ± 0.207
MetPhe: 0.163 ± 0.121
MetGly: 1.387 ± 0.486
MetHis: 0.245 ± 0.13
MetIle: 1.468 ± 0.326
MetLys: 0.408 ± 0.18
MetLeu: 1.305 ± 0.334
MetMet: 0.326 ± 0.156
MetAsn: 0.979 ± 0.315
MetPro: 2.039 ± 0.383
MetGln: 0.245 ± 0.105
MetArg: 0.979 ± 0.317
MetSer: 2.447 ± 0.377
MetThr: 2.284 ± 0.364
MetVal: 0.816 ± 0.295
MetTrp: 0.408 ± 0.155
MetTyr: 0.326 ± 0.158
MetXaa: 0.0 ± 0.0
Asn
AsnAla: 3.67 ± 0.658
AsnCys: 0.245 ± 0.127
AsnAsp: 1.142 ± 0.27
AsnGlu: 1.387 ± 0.348
AsnPhe: 0.571 ± 0.179
AsnGly: 3.426 ± 0.567
AsnHis: 0.653 ± 0.2
AsnIle: 1.468 ± 0.456
AsnLys: 0.734 ± 0.266
AsnLeu: 2.61 ± 0.408
AsnMet: 0.571 ± 0.222
AsnAsn: 0.734 ± 0.332
AsnPro: 3.344 ± 0.618
AsnGln: 0.897 ± 0.276
AsnArg: 1.876 ± 0.391
AsnSer: 1.305 ± 0.315
AsnThr: 1.468 ± 0.397
AsnVal: 1.876 ± 0.35
AsnTrp: 0.489 ± 0.218
AsnTyr: 0.653 ± 0.239
AsnXaa: 0.0 ± 0.0
Pro
ProAla: 6.362 ± 0.737
ProCys: 0.408 ± 0.177
ProAsp: 3.752 ± 0.397
ProGlu: 3.426 ± 0.59
ProPhe: 1.55 ± 0.317
ProGly: 5.628 ± 0.706
ProHis: 1.305 ± 0.306
ProIle: 2.692 ± 0.462
ProLys: 2.284 ± 0.44
ProLeu: 4.16 ± 0.533
ProMet: 1.142 ± 0.304
ProAsn: 1.223 ± 0.259
ProPro: 3.1 ± 0.554
ProGln: 2.202 ± 0.381
ProArg: 3.834 ± 0.578
ProSer: 3.181 ± 0.506
ProThr: 3.426 ± 0.454
ProVal: 4.649 ± 0.5
ProTrp: 1.223 ± 0.323
ProTyr: 0.979 ± 0.205
ProXaa: 0.0 ± 0.0
Gln
GlnAla: 4.241 ± 0.602
GlnCys: 0.571 ± 0.223
GlnAsp: 1.468 ± 0.362
GlnGlu: 1.387 ± 0.328
GlnPhe: 0.979 ± 0.295
GlnGly: 2.121 ± 0.378
GlnHis: 0.979 ± 0.299
GlnIle: 1.305 ± 0.324
GlnLys: 1.223 ± 0.35
GlnLeu: 5.057 ± 0.583
GlnMet: 0.897 ± 0.252
GlnAsn: 0.571 ± 0.169
GlnPro: 1.794 ± 0.354
GlnGln: 1.06 ± 0.291
GlnArg: 2.855 ± 0.578
GlnSer: 1.631 ± 0.433
GlnThr: 2.284 ± 0.495
GlnVal: 1.794 ± 0.331
GlnTrp: 1.06 ± 0.236
GlnTyr: 0.408 ± 0.159
GlnXaa: 0.0 ± 0.0
Arg
ArgAla: 9.38 ± 1.007
ArgCys: 0.734 ± 0.249
ArgAsp: 4.078 ± 0.624
ArgGlu: 3.589 ± 0.584
ArgPhe: 2.039 ± 0.581
ArgGly: 5.139 ± 0.589
ArgHis: 1.55 ± 0.382
ArgIle: 3.018 ± 0.457
ArgLys: 3.018 ± 0.515
ArgLeu: 4.241 ± 0.655
ArgMet: 1.958 ± 0.393
ArgAsn: 2.121 ± 0.337
ArgPro: 3.263 ± 0.476
ArgGln: 3.1 ± 0.44
ArgArg: 7.096 ± 1.022
ArgSer: 2.365 ± 0.411
ArgThr: 4.812 ± 0.557
ArgVal: 3.997 ± 0.557
ArgTrp: 1.468 ± 0.33
ArgTyr: 1.876 ± 0.39
ArgXaa: 0.0 ± 0.0
Ser
SerAla: 6.933 ± 0.801
SerCys: 0.326 ± 0.202
SerAsp: 4.405 ± 0.591
SerGlu: 2.936 ± 0.451
SerPhe: 1.958 ± 0.285
SerGly: 5.302 ± 0.582
SerHis: 0.979 ± 0.303
SerIle: 2.692 ± 0.405
SerLys: 0.979 ± 0.303
SerLeu: 3.915 ± 0.54
SerMet: 1.142 ± 0.282
SerAsn: 1.142 ± 0.249
SerPro: 3.1 ± 0.48
SerGln: 1.223 ± 0.34
SerArg: 3.426 ± 0.473
SerSer: 3.507 ± 0.734
SerThr: 3.589 ± 0.593
SerVal: 4.812 ± 0.623
SerTrp: 0.653 ± 0.236
SerTyr: 0.979 ± 0.312
SerXaa: 0.0 ± 0.0
Thr
ThrAla: 9.462 ± 1.014
ThrCys: 0.571 ± 0.208
ThrAsp: 4.731 ± 0.678
ThrGlu: 3.589 ± 0.516
ThrPhe: 1.55 ± 0.343
ThrGly: 7.423 ± 0.768
ThrHis: 1.06 ± 0.273
ThrIle: 3.915 ± 0.686
ThrLys: 1.142 ± 0.234
ThrLeu: 5.139 ± 0.495
ThrMet: 1.06 ± 0.285
ThrAsn: 1.794 ± 0.369
ThrPro: 4.731 ± 0.62
ThrGln: 1.794 ± 0.292
ThrArg: 3.426 ± 0.572
ThrSer: 3.181 ± 0.523
ThrThr: 4.405 ± 0.649
ThrVal: 6.199 ± 0.882
ThrTrp: 1.55 ± 0.331
ThrTyr: 1.387 ± 0.35
ThrXaa: 0.0 ± 0.0
Val
ValAla: 9.054 ± 0.905
ValCys: 0.489 ± 0.241
ValAsp: 3.507 ± 0.575
ValGlu: 3.752 ± 0.504
ValPhe: 1.55 ± 0.401
ValGly: 6.036 ± 0.615
ValHis: 1.794 ± 0.492
ValIle: 3.344 ± 0.473
ValLys: 3.018 ± 0.537
ValLeu: 7.096 ± 0.905
ValMet: 0.979 ± 0.243
ValAsn: 2.365 ± 0.593
ValPro: 3.915 ± 0.492
ValGln: 2.121 ± 0.369
ValArg: 5.302 ± 0.54
ValSer: 3.752 ± 0.555
ValThr: 5.954 ± 0.807
ValVal: 4.731 ± 0.594
ValTrp: 1.387 ± 0.244
ValTyr: 1.713 ± 0.405
ValXaa: 0.0 ± 0.0
Trp
TrpAla: 1.794 ± 0.337
TrpCys: 0.571 ± 0.239
TrpAsp: 1.387 ± 0.326
TrpGlu: 0.734 ± 0.19
TrpPhe: 0.734 ± 0.269
TrpGly: 1.631 ± 0.321
TrpHis: 0.408 ± 0.202
TrpIle: 1.06 ± 0.248
TrpLys: 0.571 ± 0.18
TrpLeu: 1.55 ± 0.293
TrpMet: 0.653 ± 0.265
TrpAsn: 0.653 ± 0.229
TrpPro: 1.305 ± 0.426
TrpGln: 0.897 ± 0.28
TrpArg: 1.713 ± 0.348
TrpSer: 1.142 ± 0.3
TrpThr: 1.387 ± 0.336
TrpVal: 1.305 ± 0.267
TrpTrp: 0.653 ± 0.245
TrpTyr: 0.245 ± 0.137
TrpXaa: 0.0 ± 0.0
Tyr
TyrAla: 2.365 ± 0.475
TyrCys: 0.245 ± 0.126
TyrAsp: 2.039 ± 0.359
TyrGlu: 0.571 ± 0.166
TyrPhe: 0.408 ± 0.172
TyrGly: 1.958 ± 0.42
TyrHis: 0.326 ± 0.16
TyrIle: 0.653 ± 0.216
TyrLys: 0.734 ± 0.284
TyrLeu: 2.121 ± 0.447
TyrMet: 0.734 ± 0.203
TyrAsn: 0.816 ± 0.182
TyrPro: 0.571 ± 0.188
TyrGln: 0.489 ± 0.186
TyrArg: 1.631 ± 0.473
TyrSer: 0.816 ± 0.24
TyrThr: 1.55 ± 0.307
TyrVal: 1.305 ± 0.276
TyrTrp: 0.489 ± 0.251
TyrTyr: 0.734 ± 0.338
TyrXaa: 0.0 ± 0.0
Xaa
XaaAla: 0.0 ± 0.0
XaaCys: 0.0 ± 0.0
XaaAsp: 0.0 ± 0.0
XaaGlu: 0.0 ± 0.0
XaaPhe: 0.0 ± 0.0
XaaGly: 0.0 ± 0.0
XaaHis: 0.0 ± 0.0
XaaIle: 0.0 ± 0.0
XaaLys: 0.0 ± 0.0
XaaLeu: 0.0 ± 0.0
XaaMet: 0.0 ± 0.0
XaaAsn: 0.0 ± 0.0
XaaPro: 0.0 ± 0.0
XaaGln: 0.0 ± 0.0
XaaArg: 0.0 ± 0.0
XaaSer: 0.0 ± 0.0
XaaThr: 0.0 ± 0.0
XaaVal: 0.0 ± 0.0
XaaTrp: 0.0 ± 0.0
XaaTyr: 0.0 ± 0.0
XaaXaa: 0.0 ± 0.0
Statistics based on 60 proteins (12261 amino acids)

Note: The error has been estimated by bootstrapping (100 replicates) at the protein level.
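A protein-level bootstrap of this kind can be sketched as follows. This is a minimal illustration, not the database's implementation: the function names, the fixed random seed, the toy sequences, and the choice of the sample standard deviation as the error measure are assumptions made here. Whole proteins are resampled with replacement, the per mille frequency is recomputed for each replicate, and the spread across replicates is reported.

```python
import random
from collections import Counter
from statistics import stdev

def pair_counts(seq):
    """Count overlapping dipeptides within a single protein sequence."""
    return Counter(seq[i:i + 2] for i in range(len(seq) - 1))

def bootstrap_error(proteins, dipeptide, n_rounds=100, seed=0):
    """Standard deviation of the per mille frequency over bootstrap replicates."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    per_protein = [pair_counts(seq) for seq in proteins]
    estimates = []
    for _ in range(n_rounds):
        # Resample whole proteins with replacement (protein-level bootstrap).
        sample = rng.choices(per_protein, k=len(per_protein))
        total = sum(sum(c.values()) for c in sample)
        hits = sum(c[dipeptide] for c in sample)
        estimates.append(1000.0 * hits / total if total else 0.0)
    return stdev(estimates)

toy = ["AAAA", "GGGG", "AAGG"]  # hypothetical sequences, not from this proteome
err = bootstrap_error(toy, "AA")
print(err)
```

A dipeptide that never occurs has the same frequency in every replicate, so its estimated error is zero, which is consistent with the 0.0 ± 0.0 entries in the table.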

The dipeptide statistics above (among other statistics for this proteome) can be downloaded from this CSV file
See this proteome in: UniProt
Proteome-pI is available under Creative Commons Attribution-NoDerivs license, for more details see here

Reference: Kozlowski LP. Proteome-pI 2.0: Proteome Isoelectric Point Database Update. Nucleic Acids Res. 2021, doi: 10.1093/nar/gkab944
Contact: Lukasz P. Kozlowski