Amino acid dipeptide frequency for Mycobacterium phage Andies

Apart from single amino acid frequencies, one can also calculate the so-called amino acid dipeptide frequency, that is, how often each ordered pair of adjacent amino acids occurs in the proteome.
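As a rough illustration of what such a calculation involves, the sketch below counts overlapping pairs of adjacent residues in a list of protein sequences. The function name, the use of one-letter codes, the mapping of non-standard residues to X (Xaa), and the normalisation by the total number of pairs are all assumptions made for this example; the database may use a different denominator or procedure.

```python
from collections import Counter

# The 20 standard amino acids in one-letter code (the table uses three-letter codes).
STANDARD = set("ACDEFGHIKLMNPQRSTVWY")


def dipeptide_frequencies(proteins):
    """Return {dipeptide: frequency in per mille} pooled over all proteins.

    Hypothetical helper: overlapping windows of width 2 are counted,
    and pairs never span the boundary between two proteins.
    """
    counts = Counter()
    for seq in proteins:
        # Assumption: anything outside the 20 standard residues becomes 'X' (Xaa).
        seq = "".join(aa if aa in STANDARD else "X" for aa in seq.upper())
        counts.update(seq[i:i + 2] for i in range(len(seq) - 1))
    total = sum(counts.values())
    if total == 0:
        return {}
    # Assumption: normalise by the total number of counted pairs.
    return {pair: 1000.0 * n / total for pair, n in counts.items()}


# Toy input, not real phage sequences.
freqs = dipeptide_frequencies(["MAAG", "AAK"])
for pair, value in sorted(freqs.items()):
    print(pair, round(value, 3))
```

Counting within each protein separately avoids creating artificial pairs from the last residue of one protein and the first residue of the next.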

There are 400 possible dipeptides (441 if Xaa is counted as an additional, 21st amino acid representing all non-standard or unknown amino acids). Thus, if all dipeptides were equally likely to occur in proteins, each one would be present at a frequency of 0.25% (2.5‰). As this is not the case in nature, for better visibility the dipeptides that occur more often than expected are marked in red and those that are underrepresented are marked in blue in the table.

All values are given in per mille (‰), so they need to be multiplied by 10⁻³ to obtain fractions.
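The uniform baseline and the per mille scale can be combined in a few lines. The sketch below converts three values taken from the table to fractions and compares them with the 1/400 expectation. The function names and the plain greater-than or less-than rule are assumptions for illustration; the page does not state whether the colouring also takes the error estimate into account.

```python
def uniform_expectation_per_mille(n_residues=20):
    """Expected frequency (in per mille) if all dipeptides were equally likely."""
    return 1000.0 / (n_residues ** 2)


def classify(value_per_mille, expected_per_mille):
    """Hypothetical rule: compare a table value with the uniform baseline."""
    if value_per_mille > expected_per_mille:
        return "over-represented"
    if value_per_mille < expected_per_mille:
        return "under-represented"
    return "as expected"


expected = uniform_expectation_per_mille()  # 20 x 20 = 400 dipeptides

# Three values copied from the table (in per mille).
table_values = {"AlaAla": 18.554, "CysCys": 0.211, "GluLys": 2.6}

for name, value in table_values.items():
    fraction = value * 1e-3  # per mille -> fraction
    print(name, fraction, classify(value, expected))
```

Passing n_residues=21 gives the baseline for the 441-dipeptide case that includes Xaa.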

For more information, see the sequence space article on Wikipedia.

Second residue (columns): Ala Cys Asp Glu Phe Gly His Ile Lys Leu Met Asn Pro Gln Arg Ser Thr Val Trp Tyr Xaa
Entries are grouped by first residue (rows); each entry gives frequency ± error in ‰.
Ala
AlaAla: 18.554 ± 2.498
AlaCys: 0.984 ± 0.347
AlaAsp: 8.223 ± 0.658
AlaGlu: 7.52 ± 0.747
AlaPhe: 3.303 ± 0.468
AlaGly: 9.066 ± 1.176
AlaHis: 2.671 ± 0.472
AlaIle: 5.341 ± 0.577
AlaLys: 3.163 ± 0.487
AlaLeu: 9.136 ± 0.713
AlaMet: 2.038 ± 0.419
AlaAsn: 3.444 ± 0.633
AlaPro: 6.677 ± 0.668
AlaGln: 5.552 ± 0.754
AlaArg: 7.52 ± 0.795
AlaSer: 5.903 ± 0.545
AlaThr: 7.871 ± 0.755
AlaVal: 6.958 ± 0.757
AlaTrp: 1.757 ± 0.38
AlaTyr: 2.108 ± 0.339
AlaXaa: 0.0 ± 0.0
Cys
CysAla: 1.757 ± 0.427
CysCys: 0.211 ± 0.115
CysAsp: 0.914 ± 0.195
CysGlu: 0.914 ± 0.301
CysPhe: 0.07 ± 0.072
CysGly: 1.476 ± 0.393
CysHis: 0.351 ± 0.15
CysIle: 0.141 ± 0.094
CysLys: 0.281 ± 0.155
CysLeu: 0.492 ± 0.194
CysMet: 0.211 ± 0.113
CysAsn: 0.351 ± 0.223
CysPro: 1.124 ± 0.375
CysGln: 0.492 ± 0.16
CysArg: 0.984 ± 0.243
CysSer: 0.492 ± 0.19
CysThr: 0.773 ± 0.194
CysVal: 0.703 ± 0.229
CysTrp: 0.211 ± 0.114
CysTyr: 0.281 ± 0.122
CysXaa: 0.0 ± 0.0
Asp
AspAla: 7.942 ± 0.749
AspCys: 1.195 ± 0.275
AspAsp: 5.06 ± 0.734
AspGlu: 5.341 ± 0.678
AspPhe: 1.898 ± 0.371
AspGly: 6.536 ± 0.662
AspHis: 1.476 ± 0.321
AspIle: 2.671 ± 0.421
AspLys: 1.616 ± 0.399
AspLeu: 5.482 ± 0.625
AspMet: 1.335 ± 0.22
AspAsn: 2.249 ± 0.368
AspPro: 4.779 ± 0.55
AspGln: 2.038 ± 0.357
AspArg: 3.936 ± 0.622
AspSer: 3.865 ± 0.582
AspThr: 3.444 ± 0.46
AspVal: 3.795 ± 0.502
AspTrp: 1.195 ± 0.308
AspTyr: 1.124 ± 0.269
AspXaa: 0.0 ± 0.0
Glu
GluAla: 5.552 ± 0.598
GluCys: 0.843 ± 0.237
GluAsp: 3.936 ± 0.641
GluGlu: 1.968 ± 0.329
GluPhe: 2.319 ± 0.443
GluGly: 2.952 ± 0.43
GluHis: 1.265 ± 0.326
GluIle: 2.741 ± 0.469
GluLys: 2.6 ± 0.313
GluLeu: 5.903 ± 0.623
GluMet: 1.616 ± 0.518
GluAsn: 1.968 ± 0.286
GluPro: 2.811 ± 0.501
GluGln: 2.881 ± 0.468
GluArg: 4.498 ± 0.672
GluSer: 2.881 ± 0.505
GluThr: 3.725 ± 0.455
GluVal: 4.99 ± 0.727
GluTrp: 0.984 ± 0.263
GluTyr: 1.616 ± 0.31
GluXaa: 0.0 ± 0.0
Phe
PheAla: 3.514 ± 0.546
PheCys: 0.281 ± 0.174
PheAsp: 2.319 ± 0.526
PheGlu: 1.476 ± 0.323
PhePhe: 0.703 ± 0.288
PheGly: 3.514 ± 0.545
PheHis: 1.124 ± 0.287
PheIle: 1.195 ± 0.277
PheLys: 0.843 ± 0.236
PheLeu: 1.476 ± 0.308
PheMet: 0.562 ± 0.199
PheAsn: 1.124 ± 0.25
PhePro: 1.054 ± 0.284
PheGln: 0.492 ± 0.142
PheArg: 1.827 ± 0.317
PheSer: 1.546 ± 0.307
PheThr: 2.038 ± 0.373
PheVal: 2.53 ± 0.47
PheTrp: 0.562 ± 0.245
PheTyr: 0.984 ± 0.265
PheXaa: 0.0 ± 0.0
Gly
GlyAla: 8.223 ± 1.227
GlyCys: 1.124 ± 0.283
GlyAsp: 5.552 ± 0.642
GlyGlu: 4.357 ± 0.651
GlyPhe: 2.46 ± 0.469
GlyGly: 9.98 ± 1.497
GlyHis: 1.546 ± 0.32
GlyIle: 4.568 ± 0.688
GlyLys: 1.898 ± 0.283
GlyLeu: 6.817 ± 0.554
GlyMet: 1.968 ± 0.368
GlyAsn: 2.671 ± 0.448
GlyPro: 4.146 ± 0.577
GlyGln: 3.655 ± 0.405
GlyArg: 6.114 ± 0.615
GlySer: 6.044 ± 0.802
GlyThr: 5.622 ± 0.572
GlyVal: 6.747 ± 0.792
GlyTrp: 2.179 ± 0.394
GlyTyr: 3.233 ± 0.624
GlyXaa: 0.0 ± 0.0
His
HisAla: 1.757 ± 0.352
HisCys: 0.422 ± 0.171
HisAsp: 1.335 ± 0.294
HisGlu: 2.108 ± 0.472
HisPhe: 0.351 ± 0.154
HisGly: 1.687 ± 0.38
HisHis: 0.843 ± 0.247
HisIle: 0.703 ± 0.185
HisLys: 0.843 ± 0.258
HisLeu: 1.616 ± 0.457
HisMet: 0.211 ± 0.1
HisAsn: 0.351 ± 0.144
HisPro: 1.265 ± 0.347
HisGln: 1.054 ± 0.249
HisArg: 1.968 ± 0.31
HisSer: 0.703 ± 0.249
HisThr: 1.827 ± 0.333
HisVal: 1.124 ± 0.298
HisTrp: 0.422 ± 0.152
HisTyr: 0.984 ± 0.234
HisXaa: 0.0 ± 0.0
Ile
IleAla: 5.13 ± 0.483
IleCys: 0.211 ± 0.125
IleAsp: 3.233 ± 0.559
IleGlu: 4.287 ± 0.461
IlePhe: 0.984 ± 0.247
IleGly: 4.709 ± 0.529
IleHis: 1.124 ± 0.301
IleIle: 1.195 ± 0.258
IleLys: 0.984 ± 0.266
IleLeu: 2.249 ± 0.396
IleMet: 0.351 ± 0.146
IleAsn: 2.038 ± 0.304
IlePro: 2.53 ± 0.438
IleGln: 1.406 ± 0.303
IleArg: 3.233 ± 0.505
IleSer: 2.179 ± 0.412
IleThr: 3.584 ± 0.435
IleVal: 3.514 ± 0.55
IleTrp: 0.914 ± 0.198
IleTyr: 0.843 ± 0.19
IleXaa: 0.0 ± 0.0
Lys
LysAla: 4.287 ± 0.989
LysCys: 0.281 ± 0.123
LysAsp: 0.984 ± 0.268
LysGlu: 0.984 ± 0.223
LysPhe: 0.773 ± 0.179
LysGly: 2.249 ± 0.411
LysHis: 0.633 ± 0.188
LysIle: 1.335 ± 0.29
LysLys: 0.351 ± 0.161
LysLeu: 2.249 ± 0.365
LysMet: 0.703 ± 0.229
LysAsn: 0.843 ± 0.188
LysPro: 2.53 ± 0.365
LysGln: 1.195 ± 0.271
LysArg: 2.389 ± 0.376
LysSer: 1.546 ± 0.284
LysThr: 2.108 ± 0.333
LysVal: 1.968 ± 0.337
LysTrp: 0.492 ± 0.148
LysTyr: 1.124 ± 0.235
LysXaa: 0.0 ± 0.0
Leu
LeuAla: 10.05 ± 0.868
LeuCys: 0.773 ± 0.24
LeuAsp: 4.99 ± 0.447
LeuGlu: 3.795 ± 0.33
LeuPhe: 3.022 ± 0.508
LeuGly: 8.082 ± 0.887
LeuHis: 1.406 ± 0.265
LeuIle: 3.655 ± 0.523
LeuLys: 2.179 ± 0.404
LeuLeu: 6.677 ± 0.722
LeuMet: 1.195 ± 0.314
LeuAsn: 2.389 ± 0.459
LeuPro: 4.006 ± 0.729
LeuGln: 2.671 ± 0.373
LeuArg: 5.693 ± 0.779
LeuSer: 3.514 ± 0.527
LeuThr: 5.13 ± 0.593
LeuVal: 5.903 ± 0.48
LeuTrp: 1.195 ± 0.23
LeuTyr: 1.195 ± 0.242
LeuXaa: 0.0 ± 0.0
Met
MetAla: 2.881 ± 0.479
MetCys: 0.07 ± 0.073
MetAsp: 0.843 ± 0.27
MetGlu: 0.562 ± 0.229
MetPhe: 0.843 ± 0.286
MetGly: 0.633 ± 0.174
MetHis: 0.281 ± 0.112
MetIle: 1.335 ± 0.332
MetLys: 0.633 ± 0.232
MetLeu: 1.757 ± 0.325
MetMet: 0.492 ± 0.181
MetAsn: 0.843 ± 0.248
MetPro: 1.687 ± 0.297
MetGln: 0.773 ± 0.219
MetArg: 0.843 ± 0.24
MetSer: 2.741 ± 0.37
MetThr: 1.898 ± 0.36
MetVal: 0.703 ± 0.206
MetTrp: 0.703 ± 0.272
MetTyr: 0.351 ± 0.129
MetXaa: 0.0 ± 0.0
Asn
AsnAla: 3.655 ± 0.605
AsnCys: 0.281 ± 0.146
AsnAsp: 1.968 ± 0.382
AsnGlu: 0.843 ± 0.236
AsnPhe: 1.124 ± 0.259
AsnGly: 3.725 ± 0.701
AsnHis: 0.562 ± 0.186
AsnIle: 1.546 ± 0.317
AsnLys: 0.633 ± 0.223
AsnLeu: 2.6 ± 0.43
AsnMet: 0.492 ± 0.163
AsnAsn: 1.546 ± 0.296
AsnPro: 2.46 ± 0.32
AsnGln: 1.124 ± 0.28
AsnArg: 1.898 ± 0.298
AsnSer: 1.054 ± 0.285
AsnThr: 1.968 ± 0.389
AsnVal: 2.038 ± 0.34
AsnTrp: 0.492 ± 0.152
AsnTyr: 0.562 ± 0.243
AsnXaa: 0.0 ± 0.0
Pro
ProAla: 7.098 ± 0.764
ProCys: 0.914 ± 0.249
ProAsp: 5.06 ± 0.614
ProGlu: 4.568 ± 0.638
ProPhe: 2.038 ± 0.283
ProGly: 6.114 ± 0.991
ProHis: 1.124 ± 0.271
ProIle: 1.616 ± 0.323
ProLys: 1.124 ± 0.327
ProLeu: 3.795 ± 0.507
ProMet: 1.687 ± 0.416
ProAsn: 1.476 ± 0.332
ProPro: 3.655 ± 0.517
ProGln: 2.038 ± 0.348
ProArg: 3.303 ± 0.57
ProSer: 2.741 ± 0.4
ProThr: 4.076 ± 0.558
ProVal: 4.428 ± 0.497
ProTrp: 1.616 ± 0.436
ProTyr: 1.195 ± 0.361
ProXaa: 0.0 ± 0.0
Gln
GlnAla: 4.99 ± 0.976
GlnCys: 0.422 ± 0.147
GlnAsp: 1.687 ± 0.413
GlnGlu: 1.265 ± 0.251
GlnPhe: 0.914 ± 0.205
GlnGly: 1.616 ± 0.269
GlnHis: 1.054 ± 0.278
GlnIle: 2.6 ± 0.404
GlnLys: 1.335 ± 0.267
GlnLeu: 4.498 ± 0.606
GlnMet: 1.054 ± 0.247
GlnAsn: 0.422 ± 0.18
GlnPro: 2.389 ± 0.452
GlnGln: 1.616 ± 0.423
GlnArg: 3.022 ± 0.452
GlnSer: 1.898 ± 0.386
GlnThr: 2.319 ± 0.336
GlnVal: 2.811 ± 0.4
GlnTrp: 0.562 ± 0.165
GlnTyr: 0.773 ± 0.275
GlnXaa: 0.0 ± 0.0
Arg
ArgAla: 6.606 ± 0.745
ArgCys: 1.124 ± 0.336
ArgAsp: 4.779 ± 0.582
ArgGlu: 4.638 ± 0.609
ArgPhe: 1.546 ± 0.378
ArgGly: 4.99 ± 0.549
ArgHis: 1.757 ± 0.401
ArgIle: 3.584 ± 0.557
ArgLys: 2.249 ± 0.452
ArgLeu: 5.271 ± 0.528
ArgMet: 2.108 ± 0.501
ArgAsn: 1.757 ± 0.263
ArgPro: 4.076 ± 0.654
ArgGln: 2.6 ± 0.519
ArgArg: 7.028 ± 0.879
ArgSer: 3.163 ± 0.438
ArgThr: 4.006 ± 0.63
ArgVal: 4.99 ± 0.586
ArgTrp: 1.757 ± 0.303
ArgTyr: 1.898 ± 0.34
ArgXaa: 0.0 ± 0.0
Ser
SerAla: 5.903 ± 0.812
SerCys: 0.422 ± 0.163
SerAsp: 3.725 ± 0.522
SerGlu: 1.968 ± 0.379
SerPhe: 1.687 ± 0.302
SerGly: 5.763 ± 0.959
SerHis: 0.633 ± 0.208
SerIle: 2.038 ± 0.376
SerLys: 1.546 ± 0.343
SerLeu: 3.584 ± 0.524
SerMet: 2.179 ± 0.393
SerAsn: 1.265 ± 0.277
SerPro: 3.584 ± 0.405
SerGln: 1.687 ± 0.396
SerArg: 2.881 ± 0.425
SerSer: 3.022 ± 0.386
SerThr: 3.514 ± 0.427
SerVal: 3.655 ± 0.46
SerTrp: 1.124 ± 0.235
SerTyr: 1.687 ± 0.322
SerXaa: 0.0 ± 0.0
Thr
ThrAla: 8.012 ± 0.711
ThrCys: 0.633 ± 0.222
ThrAsp: 4.287 ± 0.527
ThrGlu: 3.514 ± 0.392
ThrPhe: 1.898 ± 0.376
ThrGly: 6.677 ± 0.783
ThrHis: 1.195 ± 0.305
ThrIle: 3.092 ± 0.497
ThrLys: 2.811 ± 0.404
ThrLeu: 5.411 ± 0.468
ThrMet: 1.054 ± 0.219
ThrAsn: 1.687 ± 0.285
ThrPro: 5.06 ± 0.594
ThrGln: 1.827 ± 0.332
ThrArg: 4.709 ± 0.658
ThrSer: 2.741 ± 0.574
ThrThr: 4.498 ± 0.525
ThrVal: 6.185 ± 0.819
ThrTrp: 0.914 ± 0.252
ThrTyr: 1.687 ± 0.309
ThrXaa: 0.0 ± 0.0
Val
ValAla: 8.223 ± 0.924
ValCys: 0.914 ± 0.252
ValAsp: 5.201 ± 0.584
ValGlu: 5.833 ± 0.67
ValPhe: 1.898 ± 0.37
ValGly: 5.552 ± 0.733
ValHis: 1.335 ± 0.292
ValIle: 3.584 ± 0.534
ValLys: 2.881 ± 0.596
ValLeu: 4.498 ± 0.493
ValMet: 1.124 ± 0.33
ValAsn: 2.53 ± 0.34
ValPro: 4.217 ± 0.521
ValGln: 2.249 ± 0.388
ValArg: 4.99 ± 0.642
ValSer: 3.584 ± 0.506
ValThr: 5.833 ± 0.6
ValVal: 5.482 ± 0.694
ValTrp: 1.054 ± 0.232
ValTyr: 1.476 ± 0.269
ValXaa: 0.0 ± 0.0
Trp
TrpAla: 1.476 ± 0.296
TrpCys: 0.703 ± 0.271
TrpAsp: 1.054 ± 0.222
TrpGlu: 0.562 ± 0.243
TrpPhe: 0.773 ± 0.241
TrpGly: 1.335 ± 0.268
TrpHis: 0.703 ± 0.212
TrpIle: 0.703 ± 0.232
TrpLys: 0.492 ± 0.162
TrpLeu: 2.038 ± 0.417
TrpMet: 0.141 ± 0.096
TrpAsn: 0.562 ± 0.154
TrpPro: 0.843 ± 0.238
TrpGln: 0.843 ± 0.236
TrpArg: 1.406 ± 0.301
TrpSer: 0.914 ± 0.166
TrpThr: 1.827 ± 0.363
TrpVal: 1.898 ± 0.298
TrpTrp: 0.492 ± 0.18
TrpTyr: 0.422 ± 0.156
TrpXaa: 0.0 ± 0.0
Tyr
TyrAla: 2.179 ± 0.376
TyrCys: 0.422 ± 0.167
TyrAsp: 1.827 ± 0.285
TyrGlu: 1.546 ± 0.252
TyrPhe: 0.562 ± 0.216
TyrGly: 2.249 ± 0.377
TyrHis: 0.492 ± 0.17
TyrIle: 1.054 ± 0.255
TyrLys: 0.562 ± 0.185
TyrLeu: 1.968 ± 0.341
TyrMet: 0.281 ± 0.133
TyrAsn: 1.054 ± 0.216
TyrPro: 0.773 ± 0.21
TyrGln: 1.054 ± 0.225
TyrArg: 1.687 ± 0.278
TyrSer: 1.335 ± 0.361
TyrThr: 1.898 ± 0.321
TyrVal: 2.038 ± 0.463
TyrTrp: 0.562 ± 0.227
TyrTyr: 0.843 ± 0.215
TyrXaa: 0.0 ± 0.0
Xaa
XaaAla: 0.0 ± 0.0
XaaCys: 0.0 ± 0.0
XaaAsp: 0.0 ± 0.0
XaaGlu: 0.0 ± 0.0
XaaPhe: 0.0 ± 0.0
XaaGly: 0.0 ± 0.0
XaaHis: 0.0 ± 0.0
XaaIle: 0.0 ± 0.0
XaaLys: 0.0 ± 0.0
XaaLeu: 0.0 ± 0.0
XaaMet: 0.0 ± 0.0
XaaAsn: 0.0 ± 0.0
XaaPro: 0.0 ± 0.0
XaaGln: 0.0 ± 0.0
XaaArg: 0.0 ± 0.0
XaaSer: 0.0 ± 0.0
XaaThr: 0.0 ± 0.0
XaaVal: 0.0 ± 0.0
XaaTrp: 0.0 ± 0.0
XaaTyr: 0.0 ± 0.0
XaaXaa: 0.0 ± 0.0
Statistics based on 65 proteins (14230 amino acids)

Note: The error has been estimated by bootstrapping (x100) at the protein level.
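One way to read "bootstrapping at the protein level" is that whole proteins, rather than individual residues, are resampled with replacement, and the statistic is recomputed for each resample. The sketch below follows that reading for a single dipeptide. The function name, the fixed random seed, one-letter codes, and the choice of the standard deviation of the resampled estimates as the reported error are assumptions; the page states only the number of resamples (100) and the resampling unit.

```python
import random
import statistics


def bootstrap_error(proteins, pair, n_resamples=100, seed=0):
    """Hypothetical helper: spread (in per mille) of one dipeptide's
    frequency across resampled sets of whole proteins."""
    rng = random.Random(seed)
    estimates = []
    for _ in range(n_resamples):
        # Draw as many proteins as in the original set, with replacement.
        sample = rng.choices(proteins, k=len(proteins))
        windows = 0
        hits = 0
        for seq in sample:
            for i in range(len(seq) - 1):
                windows += 1
                if seq[i:i + 2] == pair:
                    hits += 1
        estimates.append(1000.0 * hits / windows if windows else 0.0)
    # Assumption: report the sample standard deviation of the estimates.
    return statistics.stdev(estimates)


# Toy input, not real phage sequences.
toy_proteins = ["MAAG", "AAK", "GGG", "MKAAL"]
print(bootstrap_error(toy_proteins, "AA"))
```

Resampling whole proteins keeps the residues within each protein together, so the estimate reflects variation between proteins rather than treating every dipeptide as an independent observation.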

The dipeptide statistics above (among other statistics for this proteome) can be downloaded from this CSV file
See this proteome in: uniprot_link
Proteome-pI is available under the Creative Commons Attribution-NoDerivs license; for more details see here

Reference: Kozlowski LP. Proteome-pI 2.0: Proteome Isoelectric Point Database Update. Nucleic Acids Res. 2021, doi: 10.1093/nar/gkab944
Contact: Lukasz P. Kozlowski