Amino acid dipeptide frequency for Mycobacterium phage ShedlockHolmes

Apart from single amino acid frequencies, one can also calculate the so-called amino acid dipeptide frequency, i.e. how often each ordered pair of adjacent amino acids occurs.
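As a rough illustration of what such a calculation could look like, the sketch below counts overlapping residue pairs in a list of protein sequences and reports them in per mille. The function name, the one-letter input format, and the choice of denominator (the total number of dipeptide windows) are assumptions made for this example; the database may normalize differently.

```python
from collections import Counter

def dipeptide_frequencies(proteins):
    """Per-mille frequency of each dipeptide over a list of sequences.

    Assumption: sequences use one-letter codes and dipeptides are read as
    overlapping windows of length 2 that never span two proteins.
    """
    counts = Counter()
    for seq in proteins:
        for i in range(len(seq) - 1):
            counts[seq[i:i + 2]] += 1
    total = sum(counts.values())
    if total == 0:
        return {}
    # Assumption: normalize by the number of dipeptide windows.
    return {pair: 1000.0 * n / total for pair, n in counts.items()}

# Toy input, not taken from the phage proteome.
toy = ["MAAL", "AAG"]
freqs = dipeptide_frequencies(toy)
for pair in sorted(freqs):
    print(pair, round(freqs[pair], 3))
```

In the toy input there are five windows (MA, AA, AL, AA, AG), so AA accounts for two of the five.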

There are 400 possibilities (441 if we consider Xaa as an additional 21st amino acid representing all non-standard or unknown amino acids). Thus, if dipeptides occurred randomly in proteins, each amino acid dipeptide would be present with a frequency of 0.25% (2.5‰). As this is not the case in nature, for better visibility all overrepresented dipeptides are marked in red, and those which are underrepresented are marked in blue in the table.
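The arithmetic behind the 0.25% baseline, and a simple over/under comparison against it, can be sketched as follows. The function names are hypothetical, and the plain greater-than/less-than rule is an assumption: the exact criterion used for the red and blue marking is not stated here.

```python
def uniform_expectation(alphabet_size=20):
    """Expected per-mille frequency of one dipeptide if all were equally likely."""
    return 1000.0 / alphabet_size ** 2

def classify(observed_permille, alphabet_size=20):
    """Compare an observed per-mille value with the uniform expectation.

    Assumption: a plain comparison, with no tolerance or significance test.
    """
    expected = uniform_expectation(alphabet_size)
    if observed_permille > expected:
        return "over"
    if observed_permille < expected:
        return "under"
    return "as expected"

print(uniform_expectation())   # 2.5 per mille, i.e. 0.25%
print(classify(23.341))        # AlaAla value from the table: over
print(classify(0.21))          # CysCys value from the table: under
```

With Xaa counted as a 21st residue, `uniform_expectation(21)` gives 1000/441, about 2.27‰.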

All values are given in per mille (‰), so they need to be multiplied by 10^-3 to obtain fractions.

For more information, see the sequence space article on Wikipedia.

Ala Cys Asp Glu Phe Gly His Ile Lys Leu Met Asn Pro Gln Arg Ser Thr Val Trp Tyr Xaa
Ala
AlaAla: 23.341 ± 1.63
AlaCys: 1.262 ± 0.27
AlaAsp: 8.779 ± 0.704
AlaGlu: 9.62 ± 1.159
AlaPhe: 3.522 ± 0.523
AlaGly: 9.568 ± 0.927
AlaHis: 2.681 ± 0.46
AlaIle: 4.416 ± 0.564
AlaLys: 4.101 ± 0.44
AlaLeu: 11.355 ± 0.903
AlaMet: 3.627 ± 0.377
AlaAsn: 2.576 ± 0.42
AlaPro: 6.466 ± 1.072
AlaGln: 5.415 ± 0.818
AlaArg: 8.622 ± 0.766
AlaSer: 4.994 ± 0.547
AlaThr: 7.307 ± 0.7
AlaVal: 10.567 ± 0.73
AlaTrp: 2.734 ± 0.335
AlaTyr: 2.629 ± 0.362
AlaXaa: 0.0 ± 0.0
Cys
CysAla: 0.999 ± 0.216
CysCys: 0.21 ± 0.121
CysAsp: 0.736 ± 0.216
CysGlu: 0.894 ± 0.237
CysPhe: 0.526 ± 0.157
CysGly: 1.577 ± 0.348
CysHis: 0.105 ± 0.073
CysIle: 0.315 ± 0.117
CysLys: 0.473 ± 0.192
CysLeu: 0.683 ± 0.192
CysMet: 0.21 ± 0.108
CysAsn: 0.263 ± 0.117
CysPro: 0.736 ± 0.19
CysGln: 0.421 ± 0.141
CysArg: 0.841 ± 0.301
CysSer: 0.946 ± 0.213
CysThr: 0.578 ± 0.17
CysVal: 0.683 ± 0.196
CysTrp: 0.263 ± 0.133
CysTyr: 0.263 ± 0.112
CysXaa: 0.0 ± 0.0
Asp
AspAla: 8.306 ± 0.677
AspCys: 0.789 ± 0.219
AspAsp: 5.362 ± 0.574
AspGlu: 6.098 ± 0.649
AspPhe: 1.104 ± 0.179
AspGly: 6.203 ± 0.539
AspHis: 0.736 ± 0.2
AspIle: 1.209 ± 0.289
AspLys: 2.313 ± 0.364
AspLeu: 5.625 ± 0.666
AspMet: 1.472 ± 0.282
AspAsn: 1.63 ± 0.288
AspPro: 3.943 ± 0.503
AspGln: 1.945 ± 0.313
AspArg: 4.731 ± 0.6
AspSer: 2.681 ± 0.391
AspThr: 3.102 ± 0.412
AspVal: 5.52 ± 0.477
AspTrp: 0.631 ± 0.225
AspTyr: 1.472 ± 0.32
AspXaa: 0.0 ± 0.0
Glu
GluAla: 8.937 ± 0.963
GluCys: 1.051 ± 0.301
GluAsp: 2.629 ± 0.439
GluGlu: 1.945 ± 0.321
GluPhe: 1.735 ± 0.292
GluGly: 4.048 ± 0.492
GluHis: 1.787 ± 0.347
GluIle: 3.154 ± 0.448
GluLys: 1.682 ± 0.438
GluLeu: 3.943 ± 0.509
GluMet: 1.419 ± 0.222
GluAsn: 1.051 ± 0.232
GluPro: 3.154 ± 0.502
GluGln: 2.839 ± 0.398
GluArg: 5.362 ± 0.912
GluSer: 2.734 ± 0.318
GluThr: 2.418 ± 0.39
GluVal: 5.099 ± 0.654
GluTrp: 1.157 ± 0.24
GluTyr: 1.525 ± 0.372
GluXaa: 0.0 ± 0.0
Phe
PheAla: 3.102 ± 0.358
PheCys: 0.105 ± 0.073
PheAsp: 3.259 ± 0.356
PheGlu: 0.999 ± 0.21
PhePhe: 0.683 ± 0.185
PheGly: 3.312 ± 0.445
PheHis: 0.158 ± 0.076
PheIle: 0.946 ± 0.212
PheLys: 1.419 ± 0.331
PheLeu: 1.998 ± 0.31
PheMet: 0.421 ± 0.166
PheAsn: 0.841 ± 0.212
PhePro: 1.157 ± 0.223
PheGln: 0.894 ± 0.217
PheArg: 1.893 ± 0.395
PheSer: 1.209 ± 0.261
PheThr: 2.155 ± 0.398
PheVal: 2.103 ± 0.317
PheTrp: 0.473 ± 0.201
PheTyr: 0.315 ± 0.129
PheXaa: 0.0 ± 0.0
Gly
GlyAla: 9.883 ± 1.248
GlyCys: 1.104 ± 0.216
GlyAsp: 5.993 ± 0.464
GlyGlu: 4.994 ± 0.524
GlyPhe: 2.155 ± 0.352
GlyGly: 10.724 ± 1.604
GlyHis: 1.945 ± 0.354
GlyIle: 2.944 ± 0.697
GlyLys: 4.521 ± 0.506
GlyLeu: 7.097 ± 0.964
GlyMet: 1.367 ± 0.261
GlyAsn: 2.891 ± 0.316
GlyPro: 3.733 ± 0.499
GlyGln: 2.155 ± 0.599
GlyArg: 5.993 ± 0.704
GlySer: 4.574 ± 0.636
GlyThr: 6.151 ± 0.752
GlyVal: 6.834 ± 0.554
GlyTrp: 2.681 ± 0.299
GlyTyr: 2.366 ± 0.354
GlyXaa: 0.0 ± 0.0
His
HisAla: 2.471 ± 0.427
HisCys: 0.368 ± 0.153
HisAsp: 1.577 ± 0.316
HisGlu: 0.999 ± 0.23
HisPhe: 0.158 ± 0.077
HisGly: 2.05 ± 0.391
HisHis: 0.473 ± 0.146
HisIle: 0.946 ± 0.285
HisLys: 0.368 ± 0.145
HisLeu: 1.63 ± 0.28
HisMet: 0.631 ± 0.18
HisAsn: 0.473 ± 0.149
HisPro: 1.314 ± 0.285
HisGln: 0.631 ± 0.158
HisArg: 1.63 ± 0.325
HisSer: 0.946 ± 0.184
HisThr: 0.946 ± 0.265
HisVal: 1.787 ± 0.35
HisTrp: 0.421 ± 0.205
HisTyr: 0.315 ± 0.134
HisXaa: 0.0 ± 0.0
Ile
IleAla: 5.678 ± 0.538
IleCys: 0.21 ± 0.108
IleAsp: 3.522 ± 0.387
IleGlu: 3.47 ± 0.498
IlePhe: 0.526 ± 0.195
IleGly: 3.68 ± 0.794
IleHis: 0.473 ± 0.177
IleIle: 0.999 ± 0.236
IleLys: 1.367 ± 0.39
IleLeu: 2.103 ± 0.418
IleMet: 0.263 ± 0.107
IleAsn: 1.472 ± 0.266
IlePro: 1.525 ± 0.341
IleGln: 0.736 ± 0.249
IleArg: 1.893 ± 0.34
IleSer: 2.05 ± 0.357
IleThr: 2.418 ± 0.343
IleVal: 3.627 ± 0.409
IleTrp: 0.736 ± 0.215
IleTyr: 0.578 ± 0.183
IleXaa: 0.0 ± 0.0
Lys
LysAla: 4.942 ± 0.576
LysCys: 0.526 ± 0.138
LysAsp: 1.157 ± 0.325
LysGlu: 0.736 ± 0.184
LysPhe: 1.314 ± 0.338
LysGly: 3.049 ± 0.449
LysHis: 0.631 ± 0.166
LysIle: 1.525 ± 0.488
LysLys: 0.526 ± 0.2
LysLeu: 3.785 ± 0.404
LysMet: 1.051 ± 0.237
LysAsn: 0.631 ± 0.203
LysPro: 2.734 ± 0.439
LysGln: 1.157 ± 0.262
LysArg: 2.944 ± 0.511
LysSer: 1.104 ± 0.228
LysThr: 1.525 ± 0.312
LysVal: 3.207 ± 0.432
LysTrp: 0.789 ± 0.191
LysTyr: 0.999 ± 0.241
LysXaa: 0.0 ± 0.0
Leu
LeuAla: 11.723 ± 0.865
LeuCys: 1.051 ± 0.274
LeuAsp: 6.203 ± 0.468
LeuGlu: 2.155 ± 0.258
LeuPhe: 2.261 ± 0.279
LeuGly: 6.571 ± 0.617
LeuHis: 1.787 ± 0.304
LeuIle: 2.891 ± 0.459
LeuLys: 2.786 ± 0.369
LeuLeu: 6.256 ± 0.717
LeuMet: 1.577 ± 0.29
LeuAsn: 2.366 ± 0.414
LeuPro: 5.152 ± 0.679
LeuGln: 3.575 ± 0.416
LeuArg: 5.572 ± 0.516
LeuSer: 5.205 ± 0.557
LeuThr: 5.52 ± 0.519
LeuVal: 5.047 ± 0.486
LeuTrp: 1.682 ± 0.327
LeuTyr: 1.998 ± 0.367
LeuXaa: 0.0 ± 0.0
Met
MetAla: 2.418 ± 0.337
MetCys: 0.158 ± 0.09
MetAsp: 0.736 ± 0.233
MetGlu: 0.631 ± 0.163
MetPhe: 0.631 ± 0.185
MetGly: 1.472 ± 0.288
MetHis: 0.421 ± 0.141
MetIle: 0.946 ± 0.236
MetLys: 0.315 ± 0.139
MetLeu: 1.367 ± 0.27
MetMet: 0.158 ± 0.099
MetAsn: 0.736 ± 0.174
MetPro: 1.314 ± 0.289
MetGln: 0.946 ± 0.219
MetArg: 1.577 ± 0.254
MetSer: 2.523 ± 0.369
MetThr: 1.525 ± 0.27
MetVal: 1.682 ± 0.245
MetTrp: 0.526 ± 0.157
MetTyr: 0.473 ± 0.17
MetXaa: 0.0 ± 0.0
Asn
AsnAla: 3.47 ± 0.533
AsnCys: 0.21 ± 0.101
AsnAsp: 0.999 ± 0.226
AsnGlu: 1.525 ± 0.233
AsnPhe: 0.736 ± 0.186
AsnGly: 3.049 ± 0.527
AsnHis: 0.21 ± 0.099
AsnIle: 0.894 ± 0.275
AsnLys: 0.789 ± 0.248
AsnLeu: 2.418 ± 0.344
AsnMet: 0.578 ± 0.155
AsnAsn: 0.473 ± 0.159
AsnPro: 2.471 ± 0.383
AsnGln: 0.526 ± 0.173
AsnArg: 1.157 ± 0.261
AsnSer: 1.262 ± 0.268
AsnThr: 1.472 ± 0.25
AsnVal: 2.208 ± 0.385
AsnTrp: 0.789 ± 0.214
AsnTyr: 0.526 ± 0.124
AsnXaa: 0.0 ± 0.0
Pro
ProAla: 7.623 ± 0.752
ProCys: 0.473 ± 0.173
ProAsp: 3.733 ± 0.46
ProGlu: 4.048 ± 0.522
ProPhe: 1.63 ± 0.321
ProGly: 5.783 ± 0.588
ProHis: 1.314 ± 0.308
ProIle: 1.998 ± 0.202
ProLys: 2.103 ± 0.367
ProLeu: 3.627 ± 0.472
ProMet: 0.894 ± 0.251
ProAsn: 1.157 ± 0.322
ProPro: 2.997 ± 0.557
ProGln: 2.366 ± 0.497
ProArg: 2.839 ± 0.454
ProSer: 3.312 ± 0.408
ProThr: 3.207 ± 0.435
ProVal: 5.678 ± 0.913
ProTrp: 0.999 ± 0.243
ProTyr: 1.209 ± 0.268
ProXaa: 0.0 ± 0.0
Gln
GlnAla: 4.679 ± 0.627
GlnCys: 0.263 ± 0.111
GlnAsp: 1.472 ± 0.244
GlnGlu: 0.999 ± 0.347
GlnPhe: 1.104 ± 0.325
GlnGly: 2.629 ± 0.374
GlnHis: 1.157 ± 0.223
GlnIle: 2.366 ± 0.367
GlnLys: 1.314 ± 0.25
GlnLeu: 2.839 ± 0.412
GlnMet: 0.526 ± 0.145
GlnAsn: 0.999 ± 0.237
GlnPro: 2.629 ± 0.304
GlnGln: 1.945 ± 0.36
GlnArg: 2.786 ± 0.461
GlnSer: 1.314 ± 0.264
GlnThr: 2.576 ± 0.321
GlnVal: 2.997 ± 0.33
GlnTrp: 0.736 ± 0.234
GlnTyr: 1.157 ± 0.215
GlnXaa: 0.0 ± 0.0
Arg
ArgAla: 7.15 ± 0.962
ArgCys: 1.051 ± 0.272
ArgAsp: 3.89 ± 0.489
ArgGlu: 4.574 ± 0.561
ArgPhe: 1.682 ± 0.328
ArgGly: 4.731 ± 0.475
ArgHis: 0.946 ± 0.283
ArgIle: 2.576 ± 0.486
ArgLys: 2.997 ± 0.367
ArgLeu: 6.466 ± 0.721
ArgMet: 2.208 ± 0.335
ArgAsn: 2.208 ± 0.26
ArgPro: 3.522 ± 0.48
ArgGln: 2.208 ± 0.345
ArgArg: 5.835 ± 0.891
ArgSer: 3.522 ± 0.446
ArgThr: 3.417 ± 0.444
ArgVal: 5.415 ± 0.469
ArgTrp: 2.05 ± 0.398
ArgTyr: 1.525 ± 0.254
ArgXaa: 0.0 ± 0.0
Ser
SerAla: 6.834 ± 0.806
SerCys: 0.158 ± 0.087
SerAsp: 2.944 ± 0.332
SerGlu: 2.576 ± 0.347
SerPhe: 1.419 ± 0.303
SerGly: 5.257 ± 0.74
SerHis: 1.157 ± 0.239
SerIle: 1.945 ± 0.391
SerLys: 1.525 ± 0.343
SerLeu: 4.206 ± 0.478
SerMet: 1.367 ± 0.227
SerAsn: 1.367 ± 0.24
SerPro: 3.312 ± 0.472
SerGln: 1.63 ± 0.327
SerArg: 2.681 ± 0.312
SerSer: 3.365 ± 0.574
SerThr: 3.627 ± 0.49
SerVal: 3.47 ± 0.44
SerTrp: 0.841 ± 0.27
SerTyr: 1.262 ± 0.224
SerXaa: 0.0 ± 0.0
Thr
ThrAla: 7.307 ± 0.691
ThrCys: 0.683 ± 0.209
ThrAsp: 3.943 ± 0.468
ThrGlu: 3.259 ± 0.337
ThrPhe: 2.786 ± 0.322
ThrGly: 6.361 ± 0.626
ThrHis: 1.157 ± 0.259
ThrIle: 2.734 ± 0.402
ThrLys: 1.945 ± 0.393
ThrLeu: 4.153 ± 0.49
ThrMet: 0.789 ± 0.189
ThrAsn: 1.367 ± 0.324
ThrPro: 4.206 ± 0.468
ThrGln: 1.63 ± 0.314
ThrArg: 3.102 ± 0.628
ThrSer: 3.47 ± 0.471
ThrThr: 2.891 ± 0.503
ThrVal: 5.257 ± 0.556
ThrTrp: 1.157 ± 0.225
ThrTyr: 1.525 ± 0.263
ThrXaa: 0.0 ± 0.0
Val
ValAla: 9.831 ± 0.779
ValCys: 1.314 ± 0.263
ValAsp: 5.467 ± 0.596
ValGlu: 6.414 ± 0.717
ValPhe: 2.05 ± 0.375
ValGly: 6.046 ± 0.543
ValHis: 2.103 ± 0.316
ValIle: 3.417 ± 0.542
ValLys: 2.839 ± 0.529
ValLeu: 7.044 ± 0.612
ValMet: 0.999 ± 0.214
ValAsn: 1.84 ± 0.353
ValPro: 4.626 ± 0.472
ValGln: 3.575 ± 0.691
ValArg: 4.574 ± 0.584
ValSer: 3.312 ± 0.493
ValThr: 5.52 ± 0.548
ValVal: 5.783 ± 0.557
ValTrp: 1.577 ± 0.311
ValTyr: 1.314 ± 0.233
ValXaa: 0.0 ± 0.0
Trp
TrpAla: 1.998 ± 0.341
TrpCys: 0.368 ± 0.133
TrpAsp: 1.525 ± 0.303
TrpGlu: 0.631 ± 0.14
TrpPhe: 0.631 ± 0.189
TrpGly: 1.472 ± 0.296
TrpHis: 0.578 ± 0.17
TrpIle: 0.789 ± 0.175
TrpLys: 0.368 ± 0.141
TrpLeu: 2.681 ± 0.314
TrpMet: 0.263 ± 0.107
TrpAsn: 0.683 ± 0.209
TrpPro: 0.999 ± 0.222
TrpGln: 1.209 ± 0.252
TrpArg: 2.155 ± 0.373
TrpSer: 0.999 ± 0.207
TrpThr: 1.314 ± 0.218
TrpVal: 1.209 ± 0.251
TrpTrp: 0.421 ± 0.142
TrpTyr: 0.631 ± 0.2
TrpXaa: 0.0 ± 0.0
Tyr
TyrAla: 2.891 ± 0.362
TyrCys: 0.368 ± 0.141
TyrAsp: 0.946 ± 0.281
TyrGlu: 1.104 ± 0.274
TyrPhe: 0.841 ± 0.235
TyrGly: 2.523 ± 0.421
TyrHis: 0.315 ± 0.112
TyrIle: 0.315 ± 0.134
TyrLys: 0.578 ± 0.171
TyrLeu: 2.103 ± 0.484
TyrMet: 0.683 ± 0.213
TyrAsn: 0.736 ± 0.177
TyrPro: 0.999 ± 0.241
TyrGln: 0.683 ± 0.187
TyrArg: 1.787 ± 0.327
TyrSer: 1.419 ± 0.265
TyrThr: 2.05 ± 0.331
TyrVal: 1.472 ± 0.249
TyrTrp: 0.263 ± 0.104
TyrTyr: 0.368 ± 0.117
TyrXaa: 0.0 ± 0.0
Xaa
XaaAla: 0.0 ± 0.0
XaaCys: 0.0 ± 0.0
XaaAsp: 0.0 ± 0.0
XaaGlu: 0.0 ± 0.0
XaaPhe: 0.0 ± 0.0
XaaGly: 0.0 ± 0.0
XaaHis: 0.0 ± 0.0
XaaIle: 0.0 ± 0.0
XaaLys: 0.0 ± 0.0
XaaLeu: 0.0 ± 0.0
XaaMet: 0.0 ± 0.0
XaaAsn: 0.0 ± 0.0
XaaPro: 0.0 ± 0.0
XaaGln: 0.0 ± 0.0
XaaArg: 0.0 ± 0.0
XaaSer: 0.0 ± 0.0
XaaThr: 0.0 ± 0.0
XaaVal: 0.0 ± 0.0
XaaTrp: 0.0 ± 0.0
XaaTyr: 0.0 ± 0.0
XaaXaa: 0.0 ± 0.0
Statistics based on 100 proteins (19023 amino acids)

Note: The error has been estimated by bootstrapping (x100) at the protein level.
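To make the idea of a protein-level bootstrap concrete, here is a minimal sketch under stated assumptions: whole proteins are resampled with replacement 100 times, the per-mille frequency of one dipeptide is recomputed for each replicate, and the standard deviation across replicates is taken as the error. The function names, the use of the standard deviation, and the normalization by the number of dipeptide windows are assumptions for illustration, not the database's documented procedure.

```python
import random
import statistics

def pair_permille(proteins, pair):
    """Per-mille frequency of one dipeptide (one-letter codes, overlapping windows)."""
    hits = 0
    total = 0
    for seq in proteins:
        for i in range(len(seq) - 1):
            total += 1
            if seq[i:i + 2] == pair:
                hits += 1
    return 1000.0 * hits / total if total else 0.0

def bootstrap_error(proteins, pair, n_replicates=100, seed=0):
    """Standard deviation of the frequency over protein-level bootstrap replicates."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    estimates = []
    for _ in range(n_replicates):
        # Resample whole proteins with replacement, keeping the sample size.
        sample = rng.choices(proteins, k=len(proteins))
        estimates.append(pair_permille(sample, pair))
    return statistics.stdev(estimates)

# Toy input, not taken from the phage proteome.
toy = ["MAAL", "AAG", "GGAV", "MKAA"]
print(pair_permille(toy, "AA"), bootstrap_error(toy, "AA"))
```

Resampling at the protein level, rather than at the level of individual residues, keeps each sequence intact, so the spread reflects how much the estimate depends on which proteins happen to be in the set.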

The dipeptide statistics above (along with other statistics for this proteome) can be downloaded from this CSV file
See this proteome in: uniprot_link
Proteome-pI is available under Creative Commons Attribution-NoDerivs license, for more details see here

Reference: Kozlowski LP. Proteome-pI 2.0: Proteome Isoelectric Point Database Update. Nucleic Acids Res. 2021, doi: 10.1093/nar/gkab944
Contact: Lukasz P. Kozlowski