Amino acid dipeptide frequency for Mycobacterium phage Bubbles123

Apart from single amino acid frequencies, one can also calculate the so-called amino acid dipeptide frequency, that is, how often each ordered pair of adjacent amino acids occurs.
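As an illustration of what a dipeptide frequency is, the following minimal sketch counts overlapping adjacent residue pairs within each protein and reports them in per mille. It uses one-letter codes for brevity, whereas the table uses three-letter codes. The function name `dipeptide_per_mille` and the choice to normalise by the total number of counted pairs are assumptions made for this example, not necessarily the procedure used by the database.

```python
from collections import Counter


def dipeptide_per_mille(proteins):
    """Return {dipeptide: frequency in per mille} for a list of sequences.

    Assumption: frequencies are normalised by the total number of
    adjacent pairs counted across all proteins.
    """
    counts = Counter()
    for seq in proteins:
        # Overlapping adjacent pairs; a pair never spans two proteins.
        for first, second in zip(seq, seq[1:]):
            counts[first + second] += 1
    total = sum(counts.values())
    if total == 0:
        return {}
    return {pair: 1000.0 * n / total for pair, n in counts.items()}


# Toy input: "AAGA" gives AA, AG, GA and "GA" gives GA (4 pairs in total).
freqs = dipeptide_per_mille(["AAGA", "GA"])
```

In this toy input, GA accounts for two of the four pairs (500‰), and AA and AG for one each (250‰).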

There are 400 possible dipeptides (441 if Xaa is counted as an additional 21st amino acid representing all non-standard or unknown amino acids). Thus, if dipeptides occurred uniformly at random in proteins, each of the 400 dipeptides would be present with a frequency of 0.25% (2.5‰). As this is not the case in nature, dipeptides that are more frequent than expected are marked in red in the table, and underrepresented ones are marked in blue, for better visibility.
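The arithmetic behind the expected frequency can be sketched as follows. The helper `colour` and its plain above/below comparison are assumptions for illustration; the exact rule the page uses to colour cells is not stated here.

```python
N_STANDARD = 20  # standard amino acids, Xaa excluded

# Uniform expectation: one of 20 * 20 = 400 equally likely dipeptides.
expected_fraction = 1 / N_STANDARD ** 2          # 0.0025, i.e. 0.25%
expected_per_mille = 1000 * expected_fraction    # 2.5 per mille


def colour(value_per_mille, expected=expected_per_mille):
    """Hypothetical marking rule: red above the expectation, blue below."""
    if value_per_mille > expected:
        return "red"
    if value_per_mille < expected:
        return "blue"
    return "none"


# Two values taken from the table: AlaAla and CysCys.
ala_ala = colour(14.35)
cys_cys = colour(0.11)
```

Under this rule AlaAla (14.35‰) would be marked red and CysCys (0.11‰) blue.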

All values are given in per mille (‰) and therefore need to be multiplied by 10⁻³ to obtain fractions.

For more information see sequence space article on Wikipedia.

Rows give the first amino acid of the dipeptide, columns the second. Each cell is the frequency in ‰ ± the estimated error.

First \ Second | Ala | Cys | Asp | Glu | Phe | Gly | His | Ile | Lys | Leu | Met | Asn | Pro | Gln | Arg | Ser | Thr | Val | Trp | Tyr | Xaa
Ala | 14.35 ± 1.705 | 1.26 ± 0.251 | 7.12 ± 0.603 | 7.613 ± 0.837 | 3.177 ± 0.428 | 9.53 ± 1.231 | 2.519 ± 0.384 | 4.382 ± 0.569 | 4.108 ± 0.465 | 7.613 ± 0.82 | 2.793 ± 0.451 | 2.739 ± 0.439 | 5.094 ± 0.64 | 3.779 ± 0.503 | 7.23 ± 0.698 | 5.806 ± 0.685 | 6.572 ± 0.514 | 6.792 ± 0.6 | 2.793 ± 0.374 | 2.355 ± 0.341 | 0.0 ± 0.0
Cys | 1.041 ± 0.307 | 0.11 ± 0.069 | 1.424 ± 0.322 | 0.822 ± 0.233 | 0.329 ± 0.12 | 1.479 ± 0.313 | 0.329 ± 0.131 | 0.219 ± 0.105 | 0.383 ± 0.139 | 0.548 ± 0.203 | 0.219 ± 0.102 | 0.274 ± 0.117 | 1.205 ± 0.256 | 0.383 ± 0.145 | 0.822 ± 0.227 | 0.602 ± 0.151 | 0.822 ± 0.223 | 0.767 ± 0.188 | 0.329 ± 0.13 | 0.219 ± 0.111 | 0.0 ± 0.0
Asp | 7.011 ± 0.626 | 0.931 ± 0.236 | 4.546 ± 0.55 | 3.724 ± 0.444 | 1.698 ± 0.256 | 6.792 ± 0.66 | 1.643 ± 0.324 | 2.3 ± 0.365 | 1.588 ± 0.308 | 5.313 ± 0.592 | 1.15 ± 0.303 | 1.588 ± 0.321 | 4.875 ± 0.552 | 2.191 ± 0.37 | 4.765 ± 0.652 | 3.615 ± 0.479 | 4.108 ± 0.429 | 4.436 ± 0.47 | 1.807 ± 0.315 | 1.807 ± 0.303 | 0.0 ± 0.0
Glu | 6.025 ± 0.704 | 0.657 ± 0.198 | 2.684 ± 0.331 | 3.067 ± 0.564 | 2.191 ± 0.345 | 3.231 ± 0.432 | 2.191 ± 0.416 | 2.629 ± 0.338 | 1.588 ± 0.284 | 5.696 ± 0.69 | 1.588 ± 0.318 | 2.246 ± 0.306 | 3.122 ± 0.46 | 2.684 ± 0.4 | 4.765 ± 0.62 | 3.122 ± 0.418 | 3.779 ± 0.643 | 3.615 ± 0.427 | 1.698 ± 0.339 | 1.753 ± 0.29 | 0.0 ± 0.0
Phe | 3.451 ± 0.419 | 0.219 ± 0.105 | 1.862 ± 0.449 | 2.027 ± 0.299 | 0.986 ± 0.252 | 2.739 ± 0.572 | 0.329 ± 0.136 | 1.534 ± 0.328 | 0.931 ± 0.241 | 1.972 ± 0.297 | 0.657 ± 0.193 | 1.369 ± 0.311 | 1.534 ± 0.356 | 1.314 ± 0.309 | 1.807 ± 0.273 | 1.807 ± 0.24 | 2.903 ± 0.426 | 1.917 ± 0.333 | 0.602 ± 0.158 | 0.931 ± 0.257 | 0.0 ± 0.0
Gly | 8.654 ± 1.115 | 1.15 ± 0.285 | 6.299 ± 0.527 | 3.56 ± 0.507 | 2.684 ± 0.426 | 10.845 ± 1.918 | 1.917 ± 0.34 | 4.765 ± 0.64 | 2.574 ± 0.391 | 5.806 ± 0.569 | 2.629 ± 0.483 | 3.067 ± 0.396 | 4.546 ± 0.646 | 2.684 ± 0.479 | 5.313 ± 0.713 | 5.422 ± 0.854 | 6.408 ± 0.835 | 5.587 ± 0.555 | 2.246 ± 0.355 | 2.3 ± 0.38 | 0.0 ± 0.0
His | 2.027 ± 0.368 | 0.329 ± 0.165 | 1.479 ± 0.262 | 1.095 ± 0.276 | 0.548 ± 0.159 | 1.698 ± 0.344 | 0.822 ± 0.248 | 1.698 ± 0.27 | 0.876 ± 0.262 | 1.588 ± 0.307 | 0.438 ± 0.139 | 0.986 ± 0.24 | 1.643 ± 0.329 | 0.931 ± 0.23 | 1.917 ± 0.37 | 0.986 ± 0.207 | 1.588 ± 0.308 | 1.588 ± 0.404 | 0.493 ± 0.152 | 0.767 ± 0.163 | 0.0 ± 0.0
Ile | 4.984 ± 0.475 | 0.602 ± 0.22 | 3.505 ± 0.452 | 3.943 ± 0.358 | 0.822 ± 0.248 | 4.436 ± 0.475 | 1.534 ± 0.245 | 1.479 ± 0.278 | 1.095 ± 0.254 | 2.355 ± 0.431 | 0.329 ± 0.122 | 1.862 ± 0.26 | 2.903 ± 0.371 | 1.424 ± 0.29 | 2.246 ± 0.389 | 2.41 ± 0.446 | 3.67 ± 0.38 | 2.958 ± 0.425 | 0.822 ± 0.175 | 0.876 ± 0.202 | 0.0 ± 0.0
Lys | 3.451 ± 0.487 | 0.493 ± 0.185 | 1.588 ± 0.307 | 1.26 ± 0.244 | 1.205 ± 0.211 | 2.465 ± 0.395 | 0.986 ± 0.249 | 0.876 ± 0.173 | 1.26 ± 0.341 | 2.629 ± 0.504 | 0.876 ± 0.223 | 1.041 ± 0.272 | 2.465 ± 0.35 | 1.424 ± 0.227 | 2.027 ± 0.343 | 2.027 ± 0.395 | 2.136 ± 0.358 | 2.3 ± 0.405 | 0.822 ± 0.328 | 0.657 ± 0.193 | 0.0 ± 0.0
Leu | 7.504 ± 0.826 | 0.876 ± 0.293 | 5.313 ± 0.669 | 3.779 ± 0.529 | 2.848 ± 0.313 | 5.313 ± 0.476 | 1.15 ± 0.267 | 2.629 ± 0.43 | 2.027 ± 0.319 | 4.984 ± 0.512 | 1.479 ± 0.29 | 2.684 ± 0.374 | 5.203 ± 0.738 | 2.684 ± 0.422 | 5.368 ± 0.643 | 4.71 ± 0.426 | 5.477 ± 0.568 | 4.765 ± 0.62 | 1.424 ± 0.302 | 2.246 ± 0.415 | 0.0 ± 0.0
Met | 2.739 ± 0.405 | 0.329 ± 0.172 | 1.15 ± 0.26 | 1.095 ± 0.205 | 0.712 ± 0.218 | 1.862 ± 0.28 | 0.164 ± 0.091 | 0.931 ± 0.263 | 0.657 ± 0.23 | 1.807 ± 0.237 | 0.548 ± 0.231 | 0.767 ± 0.197 | 1.205 ± 0.277 | 0.548 ± 0.163 | 1.369 ± 0.31 | 2.629 ± 0.386 | 2.3 ± 0.325 | 1.041 ± 0.297 | 0.274 ± 0.133 | 0.383 ± 0.134 | 0.0 ± 0.0
Asn | 3.889 ± 0.427 | 0.438 ± 0.154 | 1.862 ± 0.289 | 1.314 ± 0.266 | 0.931 ± 0.3 | 3.998 ± 0.635 | 0.822 ± 0.183 | 1.753 ± 0.468 | 0.986 ± 0.241 | 2.3 ± 0.358 | 0.712 ± 0.193 | 1.972 ± 0.311 | 2.848 ± 0.343 | 1.314 ± 0.318 | 2.081 ± 0.407 | 1.643 ± 0.324 | 2.246 ± 0.326 | 2.027 ± 0.369 | 0.712 ± 0.18 | 1.041 ± 0.22 | 0.0 ± 0.0
Pro | 5.641 ± 0.698 | 0.712 ± 0.188 | 4.436 ± 0.454 | 4.327 ± 0.512 | 1.972 ± 0.397 | 6.627 ± 0.657 | 1.369 ± 0.292 | 2.191 ± 0.346 | 2.191 ± 0.353 | 4.601 ± 0.518 | 1.588 ± 0.319 | 2.191 ± 0.354 | 4.053 ± 0.664 | 2.629 ± 0.393 | 3.396 ± 0.598 | 3.451 ± 0.433 | 3.341 ± 0.438 | 4.382 ± 0.513 | 0.986 ± 0.192 | 1.369 ± 0.275 | 0.0 ± 0.0
Gln | 4.436 ± 0.533 | 0.219 ± 0.144 | 1.534 ± 0.237 | 2.027 ± 0.406 | 1.041 ± 0.22 | 2.629 ± 0.494 | 1.26 ± 0.282 | 1.917 ± 0.343 | 1.15 ± 0.197 | 3.177 ± 0.431 | 0.548 ± 0.157 | 0.931 ± 0.248 | 2.903 ± 0.407 | 1.314 ± 0.29 | 2.41 ± 0.31 | 2.3 ± 0.387 | 1.862 ± 0.347 | 2.465 ± 0.391 | 0.712 ± 0.169 | 0.822 ± 0.264 | 0.0 ± 0.0
Arg | 7.504 ± 0.816 | 1.205 ± 0.341 | 4.053 ± 0.518 | 4.71 ± 0.609 | 2.246 ± 0.418 | 4.053 ± 0.434 | 1.26 ± 0.26 | 3.56 ± 0.445 | 2.739 ± 0.415 | 4.655 ± 0.639 | 2.191 ± 0.355 | 2.684 ± 0.435 | 3.231 ± 0.412 | 2.081 ± 0.346 | 5.587 ± 0.764 | 3.56 ± 0.404 | 3.177 ± 0.474 | 4.929 ± 0.644 | 1.862 ± 0.311 | 1.807 ± 0.268 | 0.0 ± 0.0
Ser | 5.97 ± 0.781 | 0.383 ± 0.118 | 4.327 ± 0.516 | 3.177 ± 0.44 | 1.917 ± 0.406 | 6.299 ± 0.837 | 1.369 ± 0.22 | 2.519 ± 0.436 | 2.136 ± 0.346 | 4.053 ± 0.419 | 0.931 ± 0.2 | 2.574 ± 0.469 | 3.451 ± 0.388 | 1.753 ± 0.266 | 3.177 ± 0.415 | 4.163 ± 0.688 | 3.396 ± 0.397 | 4.382 ± 0.51 | 1.15 ± 0.255 | 1.314 ± 0.239 | 0.0 ± 0.0
Thr | 7.011 ± 0.679 | 0.657 ± 0.211 | 4.546 ± 0.64 | 3.231 ± 0.404 | 1.698 ± 0.345 | 6.463 ± 0.594 | 1.479 ± 0.302 | 3.231 ± 0.493 | 2.081 ± 0.334 | 4.655 ± 0.483 | 1.314 ± 0.235 | 2.355 ± 0.343 | 4.82 ± 0.598 | 1.807 ± 0.3 | 4.272 ± 0.478 | 3.505 ± 0.466 | 4.984 ± 0.794 | 6.025 ± 0.652 | 1.15 ± 0.285 | 1.698 ± 0.341 | 0.0 ± 0.0
Val | 7.394 ± 0.549 | 0.986 ± 0.191 | 5.094 ± 0.561 | 4.436 ± 0.495 | 1.972 ± 0.347 | 5.148 ± 0.505 | 1.369 ± 0.297 | 3.067 ± 0.401 | 2.3 ± 0.416 | 4.984 ± 0.566 | 1.424 ± 0.26 | 2.3 ± 0.34 | 4.053 ± 0.376 | 2.958 ± 0.357 | 4.382 ± 0.554 | 4.053 ± 0.511 | 5.148 ± 0.597 | 5.806 ± 0.62 | 1.698 ± 0.381 | 1.095 ± 0.269 | 0.0 ± 0.0
Trp | 2.3 ± 0.39 | 0.383 ± 0.147 | 1.479 ± 0.283 | 1.314 ± 0.352 | 0.876 ± 0.23 | 1.205 ± 0.238 | 0.602 ± 0.205 | 1.369 ± 0.259 | 0.602 ± 0.15 | 1.807 ± 0.335 | 0.876 ± 0.223 | 0.438 ± 0.209 | 1.041 ± 0.256 | 0.931 ± 0.228 | 2.081 ± 0.388 | 1.369 ± 0.249 | 1.588 ± 0.295 | 1.588 ± 0.452 | 1.041 ± 0.222 | 0.329 ± 0.14 | 0.0 ± 0.0
Tyr | 2.246 ± 0.434 | 0.383 ± 0.14 | 1.588 ± 0.359 | 1.807 ± 0.304 | 1.041 ± 0.227 | 1.643 ± 0.37 | 0.383 ± 0.125 | 1.205 ± 0.238 | 0.712 ± 0.183 | 1.917 ± 0.341 | 0.164 ± 0.087 | 0.822 ± 0.218 | 1.314 ± 0.25 | 0.822 ± 0.212 | 2.081 ± 0.425 | 1.26 ± 0.286 | 1.479 ± 0.351 | 2.3 ± 0.359 | 0.602 ± 0.137 | 0.602 ± 0.158 | 0.0 ± 0.0
Xaa | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0

Statistics based on 100 proteins (18259 amino acids)

Note: The error has been estimated by bootstrapping (100 replicates) at the protein level.
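A protein-level bootstrap can be sketched roughly as below: whole proteins are resampled with replacement, the dipeptide frequency is recomputed for each resample, and the spread of the replicates is taken as the error. The function names, the fixed seed, and the use of the sample standard deviation as the reported error are assumptions for this example. The note above only states that bootstrapping with 100 replicates at the protein level was used.

```python
import random
import statistics


def pair_per_mille(proteins, pair):
    """Frequency (per mille) of one dipeptide among all adjacent pairs."""
    hits = total = 0
    for seq in proteins:
        for first, second in zip(seq, seq[1:]):
            total += 1
            hits += (first + second == pair)
    return 1000.0 * hits / total if total else 0.0


def bootstrap_error(proteins, pair, n_rounds=100, seed=0):
    """Spread of the frequency over resampled protein sets.

    Assumption: the error is the sample standard deviation of the
    replicate estimates; the seed only makes the sketch reproducible.
    """
    rng = random.Random(seed)
    estimates = []
    for _ in range(n_rounds):
        # Resample whole proteins, not individual residues.
        sample = rng.choices(proteins, k=len(proteins))
        estimates.append(pair_per_mille(sample, pair))
    return statistics.stdev(estimates)


toy_proteome = ["AAAA", "GGGG", "AAGG"]
error = bootstrap_error(toy_proteome, "AA")
```

Resampling proteins rather than residues keeps each sequence intact, so pairs are never formed across protein boundaries.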

The dipeptide statistics above (among other statistics for this proteome) can be downloaded from this CSV file
See this proteome in: uniprot_link
Proteome-pI is available under the Creative Commons Attribution-NoDerivs license; for more details see here

Reference: Kozlowski LP. Proteome-pI 2.0: Proteome Isoelectric Point Database Update. Nucleic Acids Res. 2021, doi: 10.1093/nar/gkab944
Contact: Lukasz P. Kozlowski