Amino acid dipeptide frequency for Mycobacterium phage Petruchio

Apart from single amino acid frequencies, one can also calculate the so-called amino acid dipeptide frequency, i.e. how often each ordered pair of adjacent residues occurs.

There are 400 possible dipeptides (441 if Xaa is counted as an additional 21st amino acid representing all non-standard or unknown amino acids). Thus, if dipeptides occurred randomly and with equal probability in proteins, each of the 400 dipeptides would be present at a frequency of 0.25%. As this is not the case in nature, for better visibility all dipeptides that are more frequent than expected are marked in red, and those that are underrepresented are marked in blue in the table.
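The counting described above can be sketched in a few lines of Python. This is only an illustration, not the code used by Proteome-pI: the function names `dipeptide_per_mille` and `classify` are hypothetical, sequences are assumed to be in one-letter code, and frequencies are normalized by the total number of overlapping residue pairs, which the page does not state explicitly.

```python
from collections import Counter
from itertools import product

# The 20 standard amino acids in one-letter code; any other symbol is
# mapped to "X" (Xaa), giving the 21-letter alphabet mentioned in the text.
STANDARD = "ACDEFGHIKLMNPQRSTVWY"


def dipeptide_per_mille(proteins):
    """Return {dipeptide: frequency in per mille} for all 441 ordered pairs.

    Assumption: pairs are overlapping windows of length 2 and never span
    two different proteins.
    """
    counts = Counter()
    for seq in proteins:
        # Replace non-standard or unknown residues with X.
        seq = "".join(aa if aa in STANDARD else "X" for aa in seq.upper())
        counts.update(seq[i:i + 2] for i in range(len(seq) - 1))
    total = sum(counts.values())
    if total == 0:
        return {}
    alphabet = STANDARD + "X"
    return {
        a + b: 1000.0 * counts[a + b] / total
        for a, b in product(alphabet, repeat=2)
    }


def classify(freq, expected=1000.0 / 400):
    """Compare a per mille frequency with the uniform expectation (2.5 per mille)."""
    if freq > expected:
        return "over"
    if freq < expected:
        return "under"
    return "as expected"
```

With the uniform baseline of 2.5‰ (0.25%), `classify(13.284)` reports the AlaAla value from the table as overrepresented, and `classify(0.755)` reports AlaCys as underrepresented.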

All values are given in per mille (‰), so they need to be multiplied by 10^-3 to obtain fractions.
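As a worked conversion, taking the AlaAla entry from the table and the uniform expectation of 1/400 as examples:

```latex
\[
13.284\,\text{‰} = 13.284 \times 10^{-3} = 0.013284 = 1.3284\,\%
\]
\[
\frac{1}{400} = 0.0025 = 0.25\,\% = 2.5\,\text{‰}
\]
```

In the units of the table, the uniform expectation therefore corresponds to a value of 2.5.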

For more information, see the sequence space article on Wikipedia.

Ala Cys Asp Glu Phe Gly His Ile Lys Leu Met Asn Pro Gln Arg Ser Thr Val Trp Tyr Xaa
Ala
AlaAla: 13.284 ± 1.602
AlaCys: 0.755 ± 0.199
AlaAsp: 7.24 ± 0.74
AlaGlu: 6.547 ± 0.638
AlaPhe: 2.959 ± 0.454
AlaGly: 7.744 ± 0.704
AlaHis: 1.763 ± 0.362
AlaIle: 4.029 ± 0.607
AlaLys: 3.903 ± 0.487
AlaLeu: 8.562 ± 0.959
AlaMet: 2.518 ± 0.435
AlaAsn: 2.581 ± 0.358
AlaPro: 5.099 ± 0.749
AlaGln: 2.959 ± 0.524
AlaArg: 6.17 ± 0.638
AlaSer: 4.911 ± 0.63
AlaThr: 5.855 ± 0.611
AlaVal: 8.562 ± 0.719
AlaTrp: 1.763 ± 0.343
AlaTyr: 3.274 ± 0.414
AlaXaa: 0.0 ± 0.0
Cys
CysAla: 0.881 ± 0.32
CysCys: 0.0 ± 0.0
CysAsp: 0.693 ± 0.212
CysGlu: 0.63 ± 0.204
CysPhe: 0.189 ± 0.097
CysGly: 0.441 ± 0.204
CysHis: 0.189 ± 0.109
CysIle: 0.126 ± 0.095
CysLys: 0.315 ± 0.184
CysLeu: 0.504 ± 0.262
CysMet: 0.063 ± 0.065
CysAsn: 0.252 ± 0.123
CysPro: 0.504 ± 0.199
CysGln: 0.252 ± 0.131
CysArg: 0.567 ± 0.209
CysSer: 0.378 ± 0.156
CysThr: 0.315 ± 0.133
CysVal: 0.315 ± 0.138
CysTrp: 0.315 ± 0.151
CysTyr: 0.252 ± 0.129
CysXaa: 0.0 ± 0.0
Asp
AspAla: 6.61 ± 0.706
AspCys: 0.567 ± 0.212
AspAsp: 4.407 ± 0.466
AspGlu: 3.463 ± 0.476
AspPhe: 2.455 ± 0.404
AspGly: 6.233 ± 0.645
AspHis: 1.259 ± 0.268
AspIle: 2.77 ± 0.389
AspLys: 2.707 ± 0.453
AspLeu: 6.862 ± 0.688
AspMet: 1.322 ± 0.246
AspAsn: 1.826 ± 0.353
AspPro: 4.974 ± 0.606
AspGln: 1.511 ± 0.322
AspArg: 3.589 ± 0.409
AspSer: 3.148 ± 0.563
AspThr: 3.589 ± 0.383
AspVal: 4.848 ± 0.566
AspTrp: 1.637 ± 0.3
AspTyr: 2.078 ± 0.327
AspXaa: 0.0 ± 0.0
Glu
GluAla: 6.17 ± 0.693
GluCys: 0.315 ± 0.178
GluAsp: 4.911 ± 0.498
GluGlu: 4.785 ± 0.606
GluPhe: 2.141 ± 0.345
GluGly: 3.589 ± 0.463
GluHis: 1.322 ± 0.289
GluIle: 3.526 ± 0.51
GluLys: 2.896 ± 0.436
GluLeu: 6.925 ± 0.61
GluMet: 1.511 ± 0.281
GluAsn: 1.637 ± 0.366
GluPro: 2.833 ± 0.434
GluGln: 2.392 ± 0.353
GluArg: 3.903 ± 0.577
GluSer: 3.777 ± 0.462
GluThr: 3.651 ± 0.5
GluVal: 5.603 ± 0.653
GluTrp: 1.385 ± 0.314
GluTyr: 2.266 ± 0.38
GluXaa: 0.0 ± 0.0
Phe
PheAla: 2.266 ± 0.354
PheCys: 0.189 ± 0.13
PheAsp: 2.833 ± 0.342
PheGlu: 1.763 ± 0.29
PhePhe: 0.567 ± 0.196
PheGly: 3.463 ± 0.532
PheHis: 0.818 ± 0.303
PheIle: 1.574 ± 0.294
PheLys: 1.133 ± 0.258
PheLeu: 2.266 ± 0.451
PheMet: 0.504 ± 0.145
PheAsn: 0.944 ± 0.215
PhePro: 1.637 ± 0.325
PheGln: 0.944 ± 0.243
PheArg: 2.015 ± 0.368
PheSer: 1.889 ± 0.413
PheThr: 2.141 ± 0.366
PheVal: 2.203 ± 0.34
PheTrp: 0.567 ± 0.163
PheTyr: 0.818 ± 0.24
PheXaa: 0.0 ± 0.0
Gly
GlyAla: 6.799 ± 0.909
GlyCys: 0.63 ± 0.21
GlyAsp: 6.107 ± 0.568
GlyGlu: 4.848 ± 0.55
GlyPhe: 2.707 ± 0.504
GlyGly: 8.31 ± 1.709
GlyHis: 1.763 ± 0.324
GlyIle: 4.281 ± 0.684
GlyLys: 3.714 ± 0.555
GlyLeu: 7.744 ± 0.795
GlyMet: 2.141 ± 0.437
GlyAsn: 3.4 ± 0.392
GlyPro: 3.903 ± 0.577
GlyGln: 2.266 ± 0.351
GlyArg: 4.722 ± 0.573
GlySer: 5.981 ± 0.748
GlyThr: 5.351 ± 0.581
GlyVal: 5.351 ± 0.614
GlyTrp: 2.392 ± 0.435
GlyTyr: 2.518 ± 0.32
GlyXaa: 0.0 ± 0.0
His
HisAla: 1.826 ± 0.423
HisCys: 0.189 ± 0.143
HisAsp: 1.259 ± 0.231
HisGlu: 1.7 ± 0.316
HisPhe: 0.693 ± 0.239
HisGly: 1.511 ± 0.335
HisHis: 0.818 ± 0.282
HisIle: 0.944 ± 0.214
HisLys: 1.07 ± 0.225
HisLeu: 1.448 ± 0.379
HisMet: 0.189 ± 0.106
HisAsn: 0.441 ± 0.179
HisPro: 1.385 ± 0.309
HisGln: 0.944 ± 0.264
HisArg: 1.637 ± 0.372
HisSer: 0.818 ± 0.19
HisThr: 1.322 ± 0.384
HisVal: 1.7 ± 0.357
HisTrp: 0.504 ± 0.158
HisTyr: 0.567 ± 0.222
HisXaa: 0.0 ± 0.0
Ile
IleAla: 6.296 ± 0.708
IleCys: 0.252 ± 0.114
IleAsp: 3.337 ± 0.379
IleGlu: 3.651 ± 0.475
IlePhe: 0.818 ± 0.245
IleGly: 4.092 ± 0.479
IleHis: 0.944 ± 0.233
IleIle: 1.889 ± 0.336
IleLys: 1.952 ± 0.395
IleLeu: 3.463 ± 0.465
IleMet: 0.881 ± 0.216
IleAsn: 1.385 ± 0.274
IlePro: 3.274 ± 0.403
IleGln: 1.574 ± 0.349
IleArg: 3.463 ± 0.53
IleSer: 3.274 ± 0.475
IleThr: 3.274 ± 0.447
IleVal: 2.77 ± 0.475
IleTrp: 0.693 ± 0.202
IleTyr: 1.7 ± 0.307
IleXaa: 0.0 ± 0.0
Lys
LysAla: 3.903 ± 0.597
LysCys: 0.252 ± 0.119
LysAsp: 2.392 ± 0.448
LysGlu: 2.015 ± 0.353
LysPhe: 1.637 ± 0.3
LysGly: 2.644 ± 0.467
LysHis: 1.259 ± 0.282
LysIle: 2.833 ± 0.505
LysLys: 1.952 ± 0.37
LysLeu: 3.022 ± 0.454
LysMet: 0.944 ± 0.208
LysAsn: 1.511 ± 0.247
LysPro: 2.77 ± 0.447
LysGln: 1.511 ± 0.351
LysArg: 3.148 ± 0.474
LysSer: 2.581 ± 0.447
LysThr: 2.078 ± 0.459
LysVal: 3.085 ± 0.462
LysTrp: 0.755 ± 0.206
LysTyr: 0.818 ± 0.241
LysXaa: 0.0 ± 0.0
Leu
LeuAla: 9.443 ± 0.893
LeuCys: 0.378 ± 0.154
LeuAsp: 6.044 ± 0.585
LeuGlu: 5.603 ± 0.551
LeuPhe: 1.952 ± 0.33
LeuGly: 7.303 ± 0.914
LeuHis: 1.511 ± 0.332
LeuIle: 4.659 ± 0.557
LeuLys: 3.84 ± 0.465
LeuLeu: 5.666 ± 0.597
LeuMet: 1.763 ± 0.332
LeuAsn: 3.022 ± 0.386
LeuPro: 5.792 ± 0.611
LeuGln: 2.329 ± 0.409
LeuArg: 5.603 ± 0.573
LeuSer: 5.603 ± 0.499
LeuThr: 6.044 ± 0.464
LeuVal: 4.911 ± 0.781
LeuTrp: 1.007 ± 0.306
LeuTyr: 2.455 ± 0.416
LeuXaa: 0.0 ± 0.0
Met
MetAla: 2.329 ± 0.295
MetCys: 0.0 ± 0.0
MetAsp: 1.259 ± 0.237
MetGlu: 1.385 ± 0.326
MetPhe: 0.504 ± 0.166
MetGly: 1.385 ± 0.287
MetHis: 0.378 ± 0.194
MetIle: 0.441 ± 0.167
MetLys: 1.133 ± 0.223
MetLeu: 1.259 ± 0.313
MetMet: 0.126 ± 0.095
MetAsn: 0.944 ± 0.201
MetPro: 1.196 ± 0.265
MetGln: 0.567 ± 0.166
MetArg: 1.574 ± 0.361
MetSer: 2.203 ± 0.399
MetThr: 2.455 ± 0.36
MetVal: 0.755 ± 0.216
MetTrp: 0.252 ± 0.116
MetTyr: 0.567 ± 0.184
MetXaa: 0.0 ± 0.0
Asn
AsnAla: 2.833 ± 0.471
AsnCys: 0.063 ± 0.067
AsnAsp: 1.952 ± 0.363
AsnGlu: 1.826 ± 0.35
AsnPhe: 0.881 ± 0.243
AsnGly: 3.4 ± 0.474
AsnHis: 0.63 ± 0.186
AsnIle: 1.7 ± 0.301
AsnLys: 0.378 ± 0.142
AsnLeu: 2.392 ± 0.321
AsnMet: 0.63 ± 0.158
AsnAsn: 0.818 ± 0.214
AsnPro: 2.455 ± 0.374
AsnGln: 1.133 ± 0.235
AsnArg: 1.637 ± 0.352
AsnSer: 1.637 ± 0.339
AsnThr: 1.826 ± 0.314
AsnVal: 2.896 ± 0.451
AsnTrp: 0.755 ± 0.242
AsnTyr: 1.448 ± 0.276
AsnXaa: 0.0 ± 0.0
Pro
ProAla: 5.351 ± 0.642
ProCys: 0.63 ± 0.312
ProAsp: 4.218 ± 0.512
ProGlu: 4.533 ± 0.584
ProPhe: 2.015 ± 0.394
ProGly: 5.225 ± 0.691
ProHis: 1.007 ± 0.284
ProIle: 2.329 ± 0.359
ProLys: 2.015 ± 0.316
ProLeu: 4.596 ± 0.634
ProMet: 0.944 ± 0.253
ProAsn: 1.637 ± 0.313
ProPro: 2.959 ± 0.481
ProGln: 1.763 ± 0.355
ProArg: 2.959 ± 0.581
ProSer: 3.714 ± 0.433
ProThr: 4.155 ± 0.54
ProVal: 4.155 ± 0.497
ProTrp: 0.881 ± 0.311
ProTyr: 1.448 ± 0.374
ProXaa: 0.0 ± 0.0
Gln
GlnAla: 3.022 ± 0.531
GlnCys: 0.126 ± 0.098
GlnAsp: 1.196 ± 0.334
GlnGlu: 1.574 ± 0.247
GlnPhe: 1.133 ± 0.222
GlnGly: 2.141 ± 0.323
GlnHis: 0.755 ± 0.195
GlnIle: 2.833 ± 0.442
GlnLys: 1.196 ± 0.258
GlnLeu: 3.777 ± 0.45
GlnMet: 0.881 ± 0.224
GlnAsn: 0.504 ± 0.175
GlnPro: 1.826 ± 0.385
GlnGln: 1.7 ± 0.4
GlnArg: 1.889 ± 0.412
GlnSer: 1.448 ± 0.277
GlnThr: 1.826 ± 0.345
GlnVal: 2.518 ± 0.407
GlnTrp: 0.63 ± 0.17
GlnTyr: 0.567 ± 0.173
GlnXaa: 0.0 ± 0.0
Arg
ArgAla: 5.729 ± 0.636
ArgCys: 0.944 ± 0.338
ArgAsp: 2.959 ± 0.453
ArgGlu: 4.596 ± 0.686
ArgPhe: 1.889 ± 0.357
ArgGly: 5.225 ± 0.79
ArgHis: 1.448 ± 0.306
ArgIle: 3.148 ± 0.46
ArgLys: 3.526 ± 0.536
ArgLeu: 5.729 ± 0.705
ArgMet: 1.637 ± 0.3
ArgAsn: 1.952 ± 0.424
ArgPro: 2.329 ± 0.382
ArgGln: 1.763 ± 0.31
ArgArg: 5.603 ± 0.776
ArgSer: 3.903 ± 0.571
ArgThr: 2.959 ± 0.423
ArgVal: 4.974 ± 0.657
ArgTrp: 1.448 ± 0.33
ArgTyr: 1.826 ± 0.298
ArgXaa: 0.0 ± 0.0
Ser
SerAla: 6.736 ± 0.924
SerCys: 0.504 ± 0.179
SerAsp: 3.337 ± 0.369
SerGlu: 3.84 ± 0.541
SerPhe: 1.763 ± 0.405
SerGly: 6.359 ± 0.658
SerHis: 1.385 ± 0.282
SerIle: 2.833 ± 0.373
SerLys: 2.078 ± 0.325
SerLeu: 5.037 ± 0.542
SerMet: 1.511 ± 0.289
SerAsn: 2.392 ± 0.468
SerPro: 3.211 ± 0.442
SerGln: 2.078 ± 0.323
SerArg: 2.833 ± 0.381
SerSer: 3.903 ± 0.713
SerThr: 3.337 ± 0.442
SerVal: 3.651 ± 0.404
SerTrp: 1.448 ± 0.304
SerTyr: 1.196 ± 0.281
SerXaa: 0.0 ± 0.0
Thr
ThrAla: 6.422 ± 0.698
ThrCys: 0.441 ± 0.193
ThrAsp: 3.966 ± 0.573
ThrGlu: 4.344 ± 0.593
ThrPhe: 2.329 ± 0.371
ThrGly: 6.673 ± 0.688
ThrHis: 1.196 ± 0.381
ThrIle: 2.518 ± 0.54
ThrLys: 2.581 ± 0.355
ThrLeu: 6.107 ± 0.597
ThrMet: 0.944 ± 0.224
ThrAsn: 1.385 ± 0.341
ThrPro: 4.029 ± 0.496
ThrGln: 1.889 ± 0.346
ThrArg: 3.651 ± 0.6
ThrSer: 3.211 ± 0.428
ThrThr: 4.47 ± 0.581
ThrVal: 5.099 ± 0.584
ThrTrp: 1.322 ± 0.344
ThrTyr: 1.952 ± 0.351
ThrXaa: 0.0 ± 0.0
Val
ValAla: 7.114 ± 0.774
ValCys: 0.504 ± 0.171
ValAsp: 5.162 ± 0.614
ValGlu: 5.037 ± 0.543
ValPhe: 2.518 ± 0.375
ValGly: 4.722 ± 0.752
ValHis: 1.385 ± 0.219
ValIle: 3.777 ± 0.468
ValLys: 3.211 ± 0.401
ValLeu: 5.225 ± 0.558
ValMet: 1.196 ± 0.282
ValAsn: 2.896 ± 0.397
ValPro: 3.903 ± 0.443
ValGln: 2.078 ± 0.432
ValArg: 4.785 ± 0.596
ValSer: 4.533 ± 0.43
ValThr: 5.981 ± 0.627
ValVal: 5.162 ± 0.668
ValTrp: 1.07 ± 0.245
ValTyr: 2.392 ± 0.339
ValXaa: 0.0 ± 0.0
Trp
TrpAla: 1.385 ± 0.322
TrpCys: 0.189 ± 0.117
TrpAsp: 1.385 ± 0.271
TrpGlu: 0.881 ± 0.213
TrpPhe: 0.755 ± 0.229
TrpGly: 1.889 ± 0.281
TrpHis: 0.441 ± 0.167
TrpIle: 1.322 ± 0.294
TrpLys: 0.315 ± 0.203
TrpLeu: 1.826 ± 0.31
TrpMet: 0.378 ± 0.171
TrpAsn: 0.504 ± 0.185
TrpPro: 0.944 ± 0.259
TrpGln: 0.818 ± 0.209
TrpArg: 1.133 ± 0.32
TrpSer: 1.007 ± 0.269
TrpThr: 1.637 ± 0.334
TrpVal: 2.203 ± 0.326
TrpTrp: 0.567 ± 0.184
TrpTyr: 0.378 ± 0.136
TrpXaa: 0.0 ± 0.0
Tyr
TyrAla: 2.078 ± 0.393
TyrCys: 0.252 ± 0.164
TyrAsp: 1.196 ± 0.328
TyrGlu: 2.518 ± 0.364
TyrPhe: 0.63 ± 0.147
TyrGly: 2.581 ± 0.441
TyrHis: 0.693 ± 0.165
TyrIle: 1.448 ± 0.358
TyrLys: 1.196 ± 0.275
TyrLeu: 2.518 ± 0.385
TyrMet: 0.567 ± 0.166
TyrAsn: 1.259 ± 0.265
TyrPro: 1.511 ± 0.314
TyrGln: 1.07 ± 0.252
TyrArg: 2.77 ± 0.394
TyrSer: 1.511 ± 0.331
TyrThr: 2.266 ± 0.376
TyrVal: 1.889 ± 0.377
TyrTrp: 0.567 ± 0.189
TyrTyr: 0.755 ± 0.219
TyrXaa: 0.0 ± 0.0
Xaa
XaaAla: 0.0 ± 0.0
XaaCys: 0.0 ± 0.0
XaaAsp: 0.0 ± 0.0
XaaGlu: 0.0 ± 0.0
XaaPhe: 0.0 ± 0.0
XaaGly: 0.0 ± 0.0
XaaHis: 0.0 ± 0.0
XaaIle: 0.0 ± 0.0
XaaLys: 0.0 ± 0.0
XaaLeu: 0.0 ± 0.0
XaaMet: 0.0 ± 0.0
XaaAsn: 0.0 ± 0.0
XaaPro: 0.0 ± 0.0
XaaGln: 0.0 ± 0.0
XaaArg: 0.0 ± 0.0
XaaSer: 0.0 ± 0.0
XaaThr: 0.0 ± 0.0
XaaVal: 0.0 ± 0.0
XaaTrp: 0.0 ± 0.0
XaaTyr: 0.0 ± 0.0
XaaXaa: 0.0 ± 0.0
Statistics based on 90 proteins (15885 amino acids)

Note: The error was estimated by bootstrapping (100 replicates) at the protein level.
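A protein-level bootstrap of this kind could look roughly like the sketch below. It is an illustration under stated assumptions rather than the procedure actually used for the table: the function name `bootstrap_error` is hypothetical, whole proteins are resampled with replacement, and the error is taken to be the sample standard deviation of the per-replicate frequencies, which the page does not specify.

```python
import random
from collections import Counter
from statistics import stdev


def bootstrap_error(proteins, dipeptide, n_boot=100, seed=0):
    """Estimate the spread (in per mille) of one dipeptide frequency.

    Assumptions: sequences use one-letter codes, each bootstrap replicate
    draws len(proteins) whole proteins with replacement, and the reported
    error is the standard deviation across replicates.
    """
    rng = random.Random(seed)
    # Count overlapping pairs once per protein so replicates are cheap.
    per_protein = [
        Counter(seq[i:i + 2] for i in range(len(seq) - 1)) for seq in proteins
    ]
    estimates = []
    for _ in range(n_boot):
        sample = rng.choices(per_protein, k=len(per_protein))
        total = sum(sum(c.values()) for c in sample)
        hits = sum(c[dipeptide] for c in sample)
        estimates.append(1000.0 * hits / total if total else 0.0)
    return stdev(estimates)
```

Resampling whole proteins, rather than individual residue pairs, keeps the pairs within a protein together, so the estimate reflects variation between proteins in the proteome.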

The dipeptide statistics above (along with other statistics for this proteome) can be downloaded from this CSV file
See this proteome in: uniprot_link
Proteome-pI is available under the Creative Commons Attribution-NoDerivs license; for more details see here

Reference: Kozlowski LP. Proteome-pI 2.0: Proteome Isoelectric Point Database Update. Nucleic Acids Res. 2021, doi: 10.1093/nar/gkab944
Contact: Lukasz P. Kozlowski