Amino acid dipeptide frequency for Mycobacterium phage SpongeBob

Apart from single amino acid frequencies, one can also calculate the so-called amino acid dipeptide frequency, i.e. how often each ordered pair of adjacent amino acids occurs.
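As an illustration, the counting can be sketched in a few lines of Python. This is a minimal sketch and not the code used to build this page: the function name `dipeptide_permille`, the one-letter input alphabet, and the choice to count overlapping pairs within each protein (never across protein boundaries) are assumptions. That choice is at least consistent with the smallest non-zero value in the table, since 65 proteins with 13372 residues give 13307 adjacent pairs and 1/13307 is about 0.075‰.

```python
from collections import Counter
from itertools import product

# Row/column order used in the table, in one-letter codes; "X" stands for Xaa.
ALPHABET = "ACDEFGHIKLMNPQRSTVWYX"


def dipeptide_permille(proteins):
    """Return {dipeptide: frequency in per mille} for a list of sequences.

    Assumption: overlapping adjacent pairs are counted within each protein,
    and pairs never span two proteins.
    """
    counts = Counter()
    for seq in proteins:
        # Map anything outside the 20 standard residues to X (Xaa).
        seq = "".join(c if c in ALPHABET else "X" for c in seq.upper())
        counts.update(a + b for a, b in zip(seq, seq[1:]))
    total = sum(counts.values())
    return {
        a + b: (1000.0 * counts[a + b] / total if total else 0.0)
        for a, b in product(ALPHABET, repeat=2)
    }


# Toy input (hypothetical sequences, not taken from this proteome).
freqs = dipeptide_permille(["MAAGL", "AAX"])
```

The result always has 441 entries, so dipeptides that never occur show up as 0.0, as in the Xaa row and column of the table.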

There are 400 possible dipeptides (441 if Xaa is counted as an additional 21st amino acid representing all non-standard or unknown amino acids). Thus, if dipeptides occurred randomly in proteins, with every amino acid equally frequent, each dipeptide would be present at a frequency of 0.25% (2.5‰). As this is not the case in nature, for better visibility all dipeptides that are more frequent than expected are marked in red, and those that are underrepresented are marked in blue in the table.

All values are given in per mille (‰); multiply them by 10⁻³ to obtain fractions.
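The uniform baseline and the unit conversion described above can be sketched as follows. The helper names are hypothetical, and whether the page compares against the 400-dipeptide or the 441-dipeptide baseline for its colouring is an assumption; the 400-dipeptide value (0.25%, i.e. 2.5‰) is used by default here.

```python
def expected_permille(n_symbols=20):
    """Uniform expectation in per mille: 1000 / n_symbols**2."""
    return 1000.0 / (n_symbols ** 2)


def permille_to_fraction(value):
    """Convert a per mille value to a plain fraction (multiply by 10^-3)."""
    return value * 1e-3


def classify(freq_permille, n_symbols=20):
    """Label a dipeptide relative to the uniform expectation.

    Assumption: 'over' corresponds to the red cells and 'under' to the blue ones.
    """
    expected = expected_permille(n_symbols)
    if freq_permille > expected:
        return "over"
    if freq_permille < expected:
        return "under"
    return "as expected"


# Three values taken from the table below.
labels = {
    "AlaAla": classify(19.894),
    "CysCys": classify(0.224),
    "LeuPro": classify(3.59),
}
```

With the 441-dipeptide baseline (`n_symbols=21`) the expectation drops to about 2.27‰, which changes the label only for values lying between the two baselines.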

For more information see sequence space article on Wikipedia.

Ala Cys Asp Glu Phe Gly His Ile Lys Leu Met Asn Pro Gln Arg Ser Thr Val Trp Tyr Xaa
Ala
AlaAla: 19.894 ± 2.887
AlaCys: 0.972 ± 0.366
AlaAsp: 8.376 ± 0.978
AlaGlu: 8.302 ± 1.017
AlaPhe: 3.216 ± 0.556
AlaGly: 9.423 ± 1.636
AlaHis: 2.318 ± 0.459
AlaIle: 5.684 ± 0.719
AlaLys: 3.739 ± 0.623
AlaLeu: 9.947 ± 0.967
AlaMet: 2.244 ± 0.453
AlaAsn: 3.365 ± 0.65
AlaPro: 7.255 ± 0.697
AlaGln: 5.16 ± 0.648
AlaArg: 7.703 ± 0.738
AlaSer: 5.684 ± 0.715
AlaThr: 7.703 ± 0.951
AlaVal: 7.404 ± 0.808
AlaTrp: 2.169 ± 0.403
AlaTyr: 2.318 ± 0.423
AlaXaa: 0.0 ± 0.0
Cys
CysAla: 1.795 ± 0.475
CysCys: 0.224 ± 0.135
CysAsp: 0.897 ± 0.254
CysGlu: 0.748 ± 0.304
CysPhe: 0.0 ± 0.0
CysGly: 1.421 ± 0.419
CysHis: 0.075 ± 0.077
CysIle: 0.15 ± 0.098
CysLys: 0.374 ± 0.174
CysLeu: 0.449 ± 0.177
CysMet: 0.15 ± 0.099
CysAsn: 0.299 ± 0.138
CysPro: 0.823 ± 0.324
CysGln: 0.224 ± 0.137
CysArg: 1.047 ± 0.286
CysSer: 0.673 ± 0.199
CysThr: 0.897 ± 0.272
CysVal: 0.748 ± 0.235
CysTrp: 0.15 ± 0.097
CysTyr: 0.299 ± 0.135
CysXaa: 0.0 ± 0.0
Asp
AspAla: 7.703 ± 0.839
AspCys: 0.897 ± 0.245
AspAsp: 5.235 ± 0.782
AspGlu: 4.936 ± 0.607
AspPhe: 1.945 ± 0.385
AspGly: 6.581 ± 0.698
AspHis: 1.645 ± 0.329
AspIle: 2.244 ± 0.302
AspLys: 1.645 ± 0.355
AspLeu: 6.133 ± 0.593
AspMet: 1.421 ± 0.305
AspAsn: 1.72 ± 0.381
AspPro: 4.637 ± 0.644
AspGln: 1.72 ± 0.296
AspArg: 4.413 ± 0.787
AspSer: 3.665 ± 0.54
AspThr: 3.066 ± 0.557
AspVal: 4.188 ± 0.556
AspTrp: 1.122 ± 0.428
AspTyr: 1.795 ± 0.337
AspXaa: 0.0 ± 0.0
Glu
GluAla: 6.133 ± 0.722
GluCys: 0.748 ± 0.242
GluAsp: 3.889 ± 0.53
GluGlu: 2.393 ± 0.461
GluPhe: 2.618 ± 0.45
GluGly: 2.992 ± 0.599
GluHis: 1.122 ± 0.318
GluIle: 3.291 ± 0.403
GluLys: 2.468 ± 0.378
GluLeu: 6.058 ± 0.683
GluMet: 1.421 ± 0.38
GluAsn: 2.019 ± 0.316
GluPro: 2.917 ± 0.59
GluGln: 3.066 ± 0.484
GluArg: 4.263 ± 0.605
GluSer: 2.692 ± 0.478
GluThr: 4.113 ± 0.627
GluVal: 4.712 ± 0.69
GluTrp: 1.047 ± 0.328
GluTyr: 1.271 ± 0.278
GluXaa: 0.0 ± 0.0
Phe
PheAla: 3.739 ± 0.557
PheCys: 0.374 ± 0.21
PheAsp: 2.318 ± 0.489
PheGlu: 1.571 ± 0.374
PhePhe: 1.122 ± 0.323
PheGly: 2.842 ± 0.484
PheHis: 0.972 ± 0.29
PheIle: 0.897 ± 0.267
PheLys: 0.897 ± 0.208
PheLeu: 2.094 ± 0.452
PheMet: 0.299 ± 0.175
PheAsn: 0.897 ± 0.274
PhePro: 1.122 ± 0.361
PheGln: 0.374 ± 0.143
PheArg: 1.795 ± 0.284
PheSer: 1.346 ± 0.26
PheThr: 2.393 ± 0.427
PheVal: 2.692 ± 0.488
PheTrp: 0.449 ± 0.19
PheTyr: 0.823 ± 0.233
PheXaa: 0.0 ± 0.0
Gly
GlyAla: 7.703 ± 1.291
GlyCys: 1.047 ± 0.341
GlyAsp: 6.282 ± 0.632
GlyGlu: 3.889 ± 0.671
GlyPhe: 2.468 ± 0.423
GlyGly: 9.947 ± 1.526
GlyHis: 1.496 ± 0.28
GlyIle: 4.936 ± 0.801
GlyLys: 1.87 ± 0.337
GlyLeu: 6.357 ± 0.65
GlyMet: 1.645 ± 0.393
GlyAsn: 2.992 ± 0.463
GlyPro: 5.011 ± 0.679
GlyGln: 3.365 ± 0.481
GlyArg: 5.684 ± 0.605
GlySer: 5.684 ± 0.703
GlyThr: 5.908 ± 0.741
GlyVal: 6.133 ± 0.521
GlyTrp: 1.795 ± 0.406
GlyTyr: 2.917 ± 0.62
GlyXaa: 0.0 ± 0.0
His
HisAla: 1.421 ± 0.408
HisCys: 0.374 ± 0.187
HisAsp: 1.047 ± 0.2
HisGlu: 1.945 ± 0.485
HisPhe: 0.374 ± 0.137
HisGly: 1.795 ± 0.362
HisHis: 0.598 ± 0.202
HisIle: 0.673 ± 0.197
HisLys: 0.748 ± 0.234
HisLeu: 1.645 ± 0.343
HisMet: 0.224 ± 0.165
HisAsn: 0.224 ± 0.14
HisPro: 1.047 ± 0.295
HisGln: 0.897 ± 0.241
HisArg: 2.393 ± 0.415
HisSer: 0.524 ± 0.201
HisThr: 1.271 ± 0.299
HisVal: 1.197 ± 0.297
HisTrp: 0.299 ± 0.141
HisTyr: 0.823 ± 0.232
HisXaa: 0.0 ± 0.0
Ile
IleAla: 5.609 ± 0.529
IleCys: 0.449 ± 0.174
IleAsp: 3.44 ± 0.53
IleGlu: 3.814 ± 0.424
IlePhe: 0.748 ± 0.209
IleGly: 5.011 ± 0.72
IleHis: 0.897 ± 0.253
IleIle: 1.346 ± 0.325
IleLys: 1.047 ± 0.244
IleLeu: 2.692 ± 0.379
IleMet: 0.374 ± 0.163
IleAsn: 1.795 ± 0.356
IlePro: 2.543 ± 0.462
IleGln: 1.346 ± 0.253
IleArg: 3.44 ± 0.476
IleSer: 2.244 ± 0.398
IleThr: 3.59 ± 0.492
IleVal: 3.59 ± 0.537
IleTrp: 0.748 ± 0.189
IleTyr: 0.823 ± 0.192
IleXaa: 0.0 ± 0.0
Lys
LysAla: 4.861 ± 1.054
LysCys: 0.15 ± 0.102
LysAsp: 1.197 ± 0.339
LysGlu: 1.197 ± 0.285
LysPhe: 0.972 ± 0.215
LysGly: 2.842 ± 0.415
LysHis: 0.449 ± 0.181
LysIle: 1.87 ± 0.338
LysLys: 0.374 ± 0.156
LysLeu: 1.945 ± 0.408
LysMet: 0.598 ± 0.2
LysAsn: 0.823 ± 0.252
LysPro: 2.244 ± 0.443
LysGln: 1.421 ± 0.427
LysArg: 2.468 ± 0.521
LysSer: 1.421 ± 0.301
LysThr: 1.571 ± 0.32
LysVal: 2.318 ± 0.34
LysTrp: 0.374 ± 0.162
LysTyr: 0.897 ± 0.215
LysXaa: 0.0 ± 0.0
Leu
LeuAla: 10.844 ± 1.009
LeuCys: 0.673 ± 0.254
LeuAsp: 5.385 ± 0.669
LeuGlu: 3.814 ± 0.387
LeuPhe: 2.692 ± 0.479
LeuGly: 8.077 ± 0.829
LeuHis: 1.645 ± 0.299
LeuIle: 3.44 ± 0.449
LeuLys: 2.767 ± 0.457
LeuLeu: 6.207 ± 0.83
LeuMet: 1.346 ± 0.289
LeuAsn: 2.842 ± 0.544
LeuPro: 3.59 ± 0.546
LeuGln: 2.543 ± 0.437
LeuArg: 5.534 ± 0.819
LeuSer: 4.113 ± 0.497
LeuThr: 5.086 ± 0.653
LeuVal: 5.16 ± 0.594
LeuTrp: 1.122 ± 0.261
LeuTyr: 1.122 ± 0.253
LeuXaa: 0.0 ± 0.0
Met
MetAla: 3.066 ± 0.489
MetCys: 0.0 ± 0.0
MetAsp: 0.524 ± 0.219
MetGlu: 0.598 ± 0.226
MetPhe: 0.972 ± 0.256
MetGly: 0.673 ± 0.177
MetHis: 0.299 ± 0.15
MetIle: 0.972 ± 0.238
MetLys: 0.673 ± 0.227
MetLeu: 1.571 ± 0.323
MetMet: 0.374 ± 0.173
MetAsn: 0.673 ± 0.218
MetPro: 1.421 ± 0.296
MetGln: 0.673 ± 0.215
MetArg: 0.823 ± 0.31
MetSer: 2.543 ± 0.36
MetThr: 2.169 ± 0.442
MetVal: 0.972 ± 0.248
MetTrp: 0.748 ± 0.279
MetTyr: 0.15 ± 0.089
MetXaa: 0.0 ± 0.0
Asn
AsnAla: 3.291 ± 0.584
AsnCys: 0.449 ± 0.172
AsnAsp: 1.571 ± 0.308
AsnGlu: 0.823 ± 0.219
AsnPhe: 0.897 ± 0.272
AsnGly: 4.188 ± 0.644
AsnHis: 0.374 ± 0.158
AsnIle: 1.795 ± 0.362
AsnLys: 0.524 ± 0.268
AsnLeu: 2.543 ± 0.419
AsnMet: 0.374 ± 0.149
AsnAsn: 1.122 ± 0.226
AsnPro: 2.692 ± 0.364
AsnGln: 1.421 ± 0.297
AsnArg: 1.87 ± 0.357
AsnSer: 1.571 ± 0.377
AsnThr: 1.87 ± 0.404
AsnVal: 2.019 ± 0.334
AsnTrp: 0.598 ± 0.191
AsnTyr: 0.598 ± 0.228
AsnXaa: 0.0 ± 0.0
Pro
ProAla: 7.928 ± 0.933
ProCys: 0.748 ± 0.238
ProAsp: 4.637 ± 0.548
ProGlu: 4.263 ± 0.636
ProPhe: 2.019 ± 0.392
ProGly: 5.534 ± 0.702
ProHis: 0.823 ± 0.213
ProIle: 2.019 ± 0.374
ProLys: 1.346 ± 0.348
ProLeu: 4.338 ± 0.717
ProMet: 1.571 ± 0.412
ProAsn: 1.87 ± 0.387
ProPro: 3.365 ± 0.431
ProGln: 2.169 ± 0.444
ProArg: 2.767 ± 0.417
ProSer: 2.767 ± 0.42
ProThr: 4.188 ± 0.593
ProVal: 4.338 ± 0.549
ProTrp: 1.72 ± 0.406
ProTyr: 1.271 ± 0.416
ProXaa: 0.0 ± 0.0
Gln
GlnAla: 5.086 ± 1.062
GlnCys: 0.524 ± 0.209
GlnAsp: 1.645 ± 0.428
GlnGlu: 1.496 ± 0.317
GlnPhe: 0.748 ± 0.209
GlnGly: 1.571 ± 0.304
GlnHis: 1.047 ± 0.303
GlnIle: 2.543 ± 0.321
GlnLys: 1.571 ± 0.352
GlnLeu: 3.814 ± 0.465
GlnMet: 1.122 ± 0.284
GlnAsn: 0.673 ± 0.268
GlnPro: 2.618 ± 0.498
GlnGln: 1.795 ± 0.482
GlnArg: 2.767 ± 0.524
GlnSer: 1.645 ± 0.323
GlnThr: 2.767 ± 0.447
GlnVal: 2.692 ± 0.43
GlnTrp: 0.748 ± 0.222
GlnTyr: 0.823 ± 0.256
GlnXaa: 0.0 ± 0.0
Arg
ArgAla: 6.507 ± 0.68
ArgCys: 0.972 ± 0.276
ArgAsp: 4.637 ± 0.672
ArgGlu: 4.637 ± 0.706
ArgPhe: 1.72 ± 0.4
ArgGly: 3.964 ± 0.562
ArgHis: 1.945 ± 0.453
ArgIle: 3.066 ± 0.461
ArgLys: 2.468 ± 0.468
ArgLeu: 5.385 ± 0.644
ArgMet: 2.169 ± 0.648
ArgAsn: 1.945 ± 0.367
ArgPro: 4.113 ± 0.542
ArgGln: 2.992 ± 0.582
ArgArg: 6.357 ± 0.837
ArgSer: 2.842 ± 0.483
ArgThr: 4.637 ± 0.648
ArgVal: 4.861 ± 0.622
ArgTrp: 1.645 ± 0.295
ArgTyr: 1.72 ± 0.31
ArgXaa: 0.0 ± 0.0
Ser
SerAla: 5.908 ± 0.742
SerCys: 0.299 ± 0.189
SerAsp: 3.515 ± 0.448
SerGlu: 2.019 ± 0.324
SerPhe: 1.571 ± 0.398
SerGly: 5.086 ± 0.632
SerHis: 0.449 ± 0.181
SerIle: 2.244 ± 0.449
SerLys: 1.571 ± 0.322
SerLeu: 3.665 ± 0.479
SerMet: 2.094 ± 0.472
SerAsn: 1.421 ± 0.397
SerPro: 3.216 ± 0.49
SerGln: 2.094 ± 0.401
SerArg: 3.141 ± 0.416
SerSer: 2.767 ± 0.372
SerThr: 3.515 ± 0.433
SerVal: 3.59 ± 0.527
SerTrp: 1.122 ± 0.24
SerTyr: 1.496 ± 0.266
SerXaa: 0.0 ± 0.0
Thr
ThrAla: 8.675 ± 0.899
ThrCys: 0.897 ± 0.297
ThrAsp: 4.188 ± 0.623
ThrGlu: 3.814 ± 0.504
ThrPhe: 1.87 ± 0.404
ThrGly: 7.03 ± 0.919
ThrHis: 0.897 ± 0.246
ThrIle: 2.767 ± 0.477
ThrLys: 2.169 ± 0.385
ThrLeu: 5.834 ± 0.577
ThrMet: 0.748 ± 0.185
ThrAsn: 1.795 ± 0.301
ThrPro: 5.011 ± 0.67
ThrGln: 2.094 ± 0.466
ThrArg: 4.712 ± 0.553
ThrSer: 2.767 ± 0.546
ThrThr: 3.59 ± 0.586
ThrVal: 6.282 ± 0.689
ThrTrp: 0.673 ± 0.258
ThrTyr: 1.496 ± 0.32
ThrXaa: 0.0 ± 0.0
Val
ValAla: 8.9 ± 0.816
ValCys: 0.748 ± 0.247
ValAsp: 5.086 ± 0.886
ValGlu: 6.507 ± 0.774
ValPhe: 1.72 ± 0.396
ValGly: 4.936 ± 0.525
ValHis: 1.421 ± 0.367
ValIle: 3.814 ± 0.516
ValLys: 2.767 ± 0.563
ValLeu: 3.964 ± 0.621
ValMet: 0.897 ± 0.208
ValAsn: 2.917 ± 0.511
ValPro: 4.413 ± 0.605
ValGln: 2.692 ± 0.514
ValArg: 4.637 ± 0.555
ValSer: 3.665 ± 0.527
ValThr: 5.609 ± 0.649
ValVal: 6.881 ± 0.781
ValTrp: 0.972 ± 0.235
ValTyr: 1.346 ± 0.299
ValXaa: 0.0 ± 0.0
Trp
TrpAla: 1.795 ± 0.332
TrpCys: 0.598 ± 0.249
TrpAsp: 1.122 ± 0.238
TrpGlu: 0.449 ± 0.203
TrpPhe: 0.823 ± 0.323
TrpGly: 0.972 ± 0.221
TrpHis: 0.524 ± 0.171
TrpIle: 0.972 ± 0.288
TrpLys: 0.449 ± 0.183
TrpLeu: 1.795 ± 0.438
TrpMet: 0.299 ± 0.135
TrpAsn: 0.374 ± 0.14
TrpPro: 0.823 ± 0.244
TrpGln: 0.823 ± 0.248
TrpArg: 1.346 ± 0.322
TrpSer: 1.122 ± 0.239
TrpThr: 1.421 ± 0.393
TrpVal: 1.87 ± 0.339
TrpTrp: 0.673 ± 0.199
TrpTyr: 0.524 ± 0.181
TrpXaa: 0.0 ± 0.0
Tyr
TyrAla: 2.019 ± 0.378
TyrCys: 0.299 ± 0.16
TyrAsp: 1.72 ± 0.368
TyrGlu: 2.019 ± 0.355
TyrPhe: 0.524 ± 0.197
TyrGly: 2.244 ± 0.344
TyrHis: 0.524 ± 0.169
TyrIle: 0.823 ± 0.295
TyrLys: 0.598 ± 0.178
TyrLeu: 1.571 ± 0.297
TyrMet: 0.449 ± 0.192
TyrAsn: 0.823 ± 0.232
TyrPro: 0.823 ± 0.231
TyrGln: 0.823 ± 0.234
TyrArg: 1.421 ± 0.363
TyrSer: 1.047 ± 0.339
TyrThr: 1.945 ± 0.445
TyrVal: 2.244 ± 0.604
TyrTrp: 0.598 ± 0.269
TyrTyr: 0.823 ± 0.239
TyrXaa: 0.0 ± 0.0
Xaa
XaaAla: 0.0 ± 0.0
XaaCys: 0.0 ± 0.0
XaaAsp: 0.0 ± 0.0
XaaGlu: 0.0 ± 0.0
XaaPhe: 0.0 ± 0.0
XaaGly: 0.0 ± 0.0
XaaHis: 0.0 ± 0.0
XaaIle: 0.0 ± 0.0
XaaLys: 0.0 ± 0.0
XaaLeu: 0.0 ± 0.0
XaaMet: 0.0 ± 0.0
XaaAsn: 0.0 ± 0.0
XaaPro: 0.0 ± 0.0
XaaGln: 0.0 ± 0.0
XaaArg: 0.0 ± 0.0
XaaSer: 0.0 ± 0.0
XaaThr: 0.0 ± 0.0
XaaVal: 0.0 ± 0.0
XaaTrp: 0.0 ± 0.0
XaaTyr: 0.0 ± 0.0
XaaXaa: 0.0 ± 0.0
Statistics based on 65 proteins (13372 amino acids)

Note: The error has been estimated by bootstrapping (x100) at the protein level
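A protein-level bootstrap of this kind can be sketched as below. This is a rough illustration under stated assumptions, not the procedure used for this page: it resamples whole proteins with replacement 100 times, recomputes the per mille frequency of one dipeptide in each resample, and reports the sample standard deviation as the error. The function name, the fixed seed, the one-letter input codes, and the use of the standard deviation as the error measure are all assumptions.

```python
import random
import statistics


def bootstrap_error(proteins, dipeptide, n_boot=100, seed=0):
    """Estimate the error of one dipeptide frequency (in per mille).

    Whole proteins are resampled with replacement, so residues within a
    protein always stay together (bootstrap at the protein level).
    """
    rng = random.Random(seed)
    estimates = []
    for _ in range(n_boot):
        sample = rng.choices(proteins, k=len(proteins))
        # Overlapping adjacent pairs within each resampled protein.
        pairs = [a + b for seq in sample for a, b in zip(seq, seq[1:])]
        freq = 1000.0 * pairs.count(dipeptide) / len(pairs) if pairs else 0.0
        estimates.append(freq)
    return statistics.stdev(estimates)


# Toy input (hypothetical sequences, not taken from this proteome).
toy = ["MAAGL", "AAAK", "GLGL", "MKAA"]
err_aa = bootstrap_error(toy, "AA")
err_ww = bootstrap_error(toy, "WW")
```

A dipeptide that never occurs in any protein gets an error of exactly 0.0 under this scheme, which matches the `0.0 ± 0.0` entries in the table.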

The dipeptide statistics above (among other statistics for this proteome) can be downloaded from this CSV file
See this proteome in: uniprot_link
Proteome-pI is available under the Creative Commons Attribution-NoDerivs license; for more details see here

Reference: Kozlowski LP. Proteome-pI 2.0: Proteome Isoelectric Point Database Update. Nucleic Acids Res. 2021, doi: 10.1093/nar/gkab944
Contact: Lukasz P. Kozlowski