Amino acid dipeptide frequency for Mycobacterium phage Steamy

Apart from single amino acid frequencies, one can also calculate the so-called amino acid dipeptide frequency, i.e. how often each ordered pair of adjacent amino acids occurs.

There are 400 possible dipeptides (441 if we consider Xaa as an additional 21st amino acid representing all non-standard or unknown amino acids). Thus, if all 400 dipeptides were equally likely to occur in proteins, each would be present with a frequency of 0.25% (2.5‰). As this is not the case in nature, for better visibility all dipeptides that are more frequent than expected are marked in red, and those that are underrepresented are marked in blue in the table.
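As a minimal sketch of the calculation described above, the following Python snippet counts overlapping adjacent pairs within each protein and compares a frequency with the uniform expectation. The function names, the toy sequences, and the choice to normalise by the total number of dipeptides are assumptions made for illustration; the database may normalise differently.

```python
from collections import Counter


def dipeptide_per_mille(proteins):
    """Return {dipeptide: frequency in per mille} over all overlapping pairs.

    Assumption: pairs are counted within each protein only (never across
    protein boundaries) and normalised by the total number of pairs.
    """
    counts = Counter()
    for seq in proteins:
        counts.update(seq[i:i + 2] for i in range(len(seq) - 1))
    total = sum(counts.values())
    return {pair: 1000.0 * n / total for pair, n in counts.items()}


def classify(freq_per_mille, n_letters=20):
    """Label a frequency relative to the uniform expectation."""
    expected = 1000.0 / n_letters ** 2  # 2.5 per mille for 20 amino acids
    if freq_per_mille > expected:
        return "over"   # would be marked red in the table
    if freq_per_mille < expected:
        return "under"  # would be marked blue in the table
    return "expected"


# Hypothetical one-letter sequences, used only to exercise the functions.
toy = ["MAAGL", "MAGA"]
freqs = dipeptide_per_mille(toy)
print(sorted(freqs.items()))
print(classify(11.056), classify(0.478))
```

Passing `n_letters=21` gives the expectation for the 441-dipeptide case that includes Xaa.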

All values are given in per mille (‰), so they must be multiplied by 10^-3 to obtain fractions.
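The unit conversion can be sketched in a few lines of Python. The helper names below are hypothetical, and the comparison baseline assumes the uniform expectation of 2.5‰ for 400 equally likely dipeptides.

```python
UNIFORM_PER_MILLE = 1000.0 / 400  # 2.5 per mille if all 400 dipeptides were equally likely


def per_mille_to_fraction(value):
    """Convert a per-mille value to a plain fraction (multiply by 10^-3)."""
    return value * 1e-3


def fold_over_uniform(value):
    """Ratio of an observed per-mille value to the uniform expectation."""
    return value / UNIFORM_PER_MILLE


ala_ala = 11.056  # AlaAla value taken from the table, in per mille
print(round(per_mille_to_fraction(ala_ala), 6))
print(round(fold_over_uniform(ala_ala), 2))
```

For AlaAla this gives a fraction of about 0.011, roughly 4.4 times the uniform expectation.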

For more information, see the sequence space article on Wikipedia.

Ala Cys Asp Glu Phe Gly His Ile Lys Leu Met Asn Pro Gln Arg Ser Thr Val Trp Tyr Xaa
Ala
AlaAla: 11.056 ± 0.895
AlaCys: 0.478 ± 0.168
AlaAsp: 5.976 ± 0.647
AlaGlu: 7.411 ± 0.748
AlaPhe: 4.183 ± 0.559
AlaGly: 9.801 ± 0.946
AlaHis: 1.912 ± 0.315
AlaIle: 5.199 ± 0.504
AlaLys: 5.737 ± 0.575
AlaLeu: 8.426 ± 0.857
AlaMet: 2.809 ± 0.399
AlaAsn: 3.526 ± 0.508
AlaPro: 5.319 ± 0.822
AlaGln: 3.227 ± 0.471
AlaArg: 6.275 ± 0.695
AlaSer: 5.677 ± 0.716
AlaThr: 4.602 ± 0.542
AlaVal: 7.889 ± 0.649
AlaTrp: 1.614 ± 0.378
AlaTyr: 2.211 ± 0.409
AlaXaa: 0.0 ± 0.0
Cys
CysAla: 0.538 ± 0.191
CysCys: 0.0 ± 0.0
CysAsp: 0.478 ± 0.174
CysGlu: 0.478 ± 0.168
CysPhe: 0.359 ± 0.164
CysGly: 0.837 ± 0.301
CysHis: 0.299 ± 0.158
CysIle: 0.179 ± 0.093
CysLys: 0.418 ± 0.164
CysLeu: 0.418 ± 0.169
CysMet: 0.06 ± 0.059
CysAsn: 0.299 ± 0.115
CysPro: 0.598 ± 0.185
CysGln: 0.179 ± 0.098
CysArg: 0.598 ± 0.211
CysSer: 0.538 ± 0.204
CysThr: 0.598 ± 0.185
CysVal: 0.717 ± 0.226
CysTrp: 0.418 ± 0.165
CysTyr: 0.239 ± 0.114
CysXaa: 0.0 ± 0.0
Asp
AspAla: 5.797 ± 0.66
AspCys: 0.478 ± 0.183
AspAsp: 3.227 ± 0.519
AspGlu: 3.287 ± 0.497
AspPhe: 2.331 ± 0.381
AspGly: 4.721 ± 0.532
AspHis: 1.733 ± 0.427
AspIle: 3.167 ± 0.511
AspLys: 2.092 ± 0.335
AspLeu: 5.618 ± 0.6
AspMet: 1.255 ± 0.262
AspAsn: 1.733 ± 0.379
AspPro: 4.303 ± 0.56
AspGln: 1.673 ± 0.359
AspArg: 2.869 ± 0.468
AspSer: 3.466 ± 0.426
AspThr: 4.064 ± 0.558
AspVal: 4.243 ± 0.526
AspTrp: 1.135 ± 0.284
AspTyr: 2.45 ± 0.33
AspXaa: 0.0 ± 0.0
Glu
GluAla: 8.187 ± 0.796
GluCys: 0.179 ± 0.1
GluAsp: 3.466 ± 0.59
GluGlu: 4.004 ± 0.612
GluPhe: 2.032 ± 0.333
GluGly: 5.14 ± 0.496
GluHis: 0.956 ± 0.228
GluIle: 3.287 ± 0.46
GluLys: 2.689 ± 0.446
GluLeu: 5.08 ± 0.589
GluMet: 1.912 ± 0.347
GluAsn: 2.092 ± 0.331
GluPro: 2.331 ± 0.432
GluGln: 2.63 ± 0.351
GluArg: 4.661 ± 0.587
GluSer: 3.108 ± 0.34
GluThr: 4.183 ± 0.476
GluVal: 4.602 ± 0.547
GluTrp: 1.614 ± 0.297
GluTyr: 2.092 ± 0.337
GluXaa: 0.0 ± 0.0
Phe
PheAla: 2.809 ± 0.425
PheCys: 0.239 ± 0.118
PheAsp: 1.853 ± 0.375
PheGlu: 2.869 ± 0.431
PhePhe: 0.717 ± 0.207
PheGly: 3.645 ± 0.5
PheHis: 0.239 ± 0.121
PheIle: 2.092 ± 0.331
PheLys: 1.375 ± 0.316
PheLeu: 2.271 ± 0.478
PheMet: 0.598 ± 0.171
PheAsn: 1.434 ± 0.218
PhePro: 1.733 ± 0.349
PheGln: 0.777 ± 0.203
PheArg: 1.912 ± 0.28
PheSer: 2.271 ± 0.325
PheThr: 2.63 ± 0.357
PheVal: 2.39 ± 0.381
PheTrp: 0.657 ± 0.197
PheTyr: 0.896 ± 0.197
PheXaa: 0.0 ± 0.0
Gly
GlyAla: 7.59 ± 1.015
GlyCys: 1.076 ± 0.272
GlyAsp: 5.976 ± 0.849
GlyGlu: 4.363 ± 0.516
GlyPhe: 2.39 ± 0.305
GlyGly: 9.084 ± 1.619
GlyHis: 1.912 ± 0.369
GlyIle: 4.124 ± 0.593
GlyLys: 4.243 ± 0.501
GlyLeu: 5.916 ± 0.793
GlyMet: 2.032 ± 0.385
GlyAsn: 3.048 ± 0.468
GlyPro: 4.124 ± 0.947
GlyGln: 3.466 ± 0.505
GlyArg: 4.363 ± 0.428
GlySer: 5.498 ± 0.77
GlyThr: 5.558 ± 0.657
GlyVal: 7.171 ± 0.688
GlyTrp: 2.092 ± 0.333
GlyTyr: 2.928 ± 0.385
GlyXaa: 0.0 ± 0.0
His
HisAla: 1.076 ± 0.293
HisCys: 0.179 ± 0.099
HisAsp: 0.896 ± 0.253
HisGlu: 1.375 ± 0.293
HisPhe: 0.896 ± 0.197
HisGly: 1.434 ± 0.358
HisHis: 0.598 ± 0.209
HisIle: 1.315 ± 0.267
HisLys: 1.016 ± 0.257
HisLeu: 1.255 ± 0.354
HisMet: 0.359 ± 0.189
HisAsn: 0.359 ± 0.149
HisPro: 1.195 ± 0.269
HisGln: 0.657 ± 0.194
HisArg: 1.554 ± 0.354
HisSer: 1.076 ± 0.228
HisThr: 1.315 ± 0.266
HisVal: 1.195 ± 0.33
HisTrp: 0.359 ± 0.178
HisTyr: 0.598 ± 0.241
HisXaa: 0.0 ± 0.0
Ile
IleAla: 4.661 ± 0.546
IleCys: 0.598 ± 0.21
IleAsp: 3.287 ± 0.415
IleGlu: 4.96 ± 0.515
IlePhe: 1.195 ± 0.251
IleGly: 3.526 ± 0.425
IleHis: 0.777 ± 0.196
IleIle: 2.151 ± 0.349
IleLys: 2.749 ± 0.382
IleLeu: 3.526 ± 0.436
IleMet: 0.896 ± 0.225
IleAsn: 1.673 ± 0.279
IlePro: 3.885 ± 0.411
IleGln: 1.195 ± 0.314
IleArg: 3.466 ± 0.429
IleSer: 2.57 ± 0.434
IleThr: 3.347 ± 0.391
IleVal: 2.809 ± 0.434
IleTrp: 0.777 ± 0.218
IleTyr: 1.076 ± 0.249
IleXaa: 0.0 ± 0.0
Lys
LysAla: 5.916 ± 0.58
LysCys: 0.418 ± 0.167
LysAsp: 2.51 ± 0.427
LysGlu: 2.39 ± 0.378
LysPhe: 1.195 ± 0.263
LysGly: 3.526 ± 0.493
LysHis: 0.777 ± 0.241
LysIle: 2.271 ± 0.324
LysLys: 2.331 ± 0.433
LysLeu: 4.243 ± 0.484
LysMet: 1.016 ± 0.239
LysAsn: 1.494 ± 0.273
LysPro: 2.928 ± 0.491
LysGln: 1.076 ± 0.2
LysArg: 3.227 ± 0.454
LysSer: 2.39 ± 0.336
LysThr: 2.988 ± 0.522
LysVal: 3.526 ± 0.443
LysTrp: 0.837 ± 0.241
LysTyr: 1.195 ± 0.258
LysXaa: 0.0 ± 0.0
Leu
LeuAla: 9.801 ± 0.869
LeuCys: 0.478 ± 0.151
LeuAsp: 4.841 ± 0.469
LeuGlu: 3.765 ± 0.498
LeuPhe: 2.331 ± 0.326
LeuGly: 5.558 ± 0.623
LeuHis: 1.614 ± 0.343
LeuIle: 3.526 ± 0.453
LeuLys: 3.227 ± 0.481
LeuLeu: 5.199 ± 0.466
LeuMet: 1.972 ± 0.353
LeuAsn: 2.689 ± 0.386
LeuPro: 4.841 ± 0.513
LeuGln: 2.689 ± 0.471
LeuArg: 4.661 ± 0.546
LeuSer: 5.976 ± 0.612
LeuThr: 5.916 ± 0.691
LeuVal: 4.721 ± 0.522
LeuTrp: 1.733 ± 0.311
LeuTyr: 2.331 ± 0.443
LeuXaa: 0.0 ± 0.0
Met
MetAla: 3.108 ± 0.403
MetCys: 0.239 ± 0.127
MetAsp: 1.494 ± 0.266
MetGlu: 1.135 ± 0.284
MetPhe: 0.657 ± 0.197
MetGly: 1.434 ± 0.317
MetHis: 0.239 ± 0.113
MetIle: 1.195 ± 0.261
MetLys: 1.434 ± 0.276
MetLeu: 1.375 ± 0.288
MetMet: 0.717 ± 0.179
MetAsn: 0.896 ± 0.189
MetPro: 1.614 ± 0.352
MetGln: 1.195 ± 0.254
MetArg: 1.554 ± 0.291
MetSer: 2.211 ± 0.363
MetThr: 1.733 ± 0.325
MetVal: 1.793 ± 0.296
MetTrp: 0.478 ± 0.159
MetTyr: 0.538 ± 0.169
MetXaa: 0.0 ± 0.0
Asn
AsnAla: 3.825 ± 0.511
AsnCys: 0.359 ± 0.182
AsnAsp: 1.733 ± 0.314
AsnGlu: 1.614 ± 0.321
AsnPhe: 0.478 ± 0.232
AsnGly: 4.303 ± 0.589
AsnHis: 0.896 ± 0.205
AsnIle: 1.375 ± 0.266
AsnLys: 1.434 ± 0.269
AsnLeu: 3.227 ± 0.478
AsnMet: 0.837 ± 0.243
AsnAsn: 0.896 ± 0.325
AsnPro: 2.271 ± 0.346
AsnGln: 1.195 ± 0.267
AsnArg: 1.853 ± 0.331
AsnSer: 1.614 ± 0.281
AsnThr: 1.853 ± 0.365
AsnVal: 2.57 ± 0.371
AsnTrp: 0.598 ± 0.186
AsnTyr: 0.777 ± 0.244
AsnXaa: 0.0 ± 0.0
Pro
ProAla: 5.438 ± 0.638
ProCys: 0.299 ± 0.138
ProAsp: 3.825 ± 0.406
ProGlu: 5.14 ± 0.614
ProPhe: 1.972 ± 0.398
ProGly: 4.482 ± 0.738
ProHis: 1.016 ± 0.231
ProIle: 2.988 ± 0.388
ProLys: 2.45 ± 0.505
ProLeu: 2.928 ± 0.461
ProMet: 1.195 ± 0.279
ProAsn: 2.51 ± 0.342
ProPro: 2.092 ± 0.365
ProGln: 1.912 ± 0.362
ProArg: 3.227 ± 0.512
ProSer: 2.57 ± 0.435
ProThr: 4.602 ± 0.618
ProVal: 3.825 ± 0.501
ProTrp: 0.538 ± 0.214
ProTyr: 1.614 ± 0.306
ProXaa: 0.0 ± 0.0
Gln
GlnAla: 4.721 ± 0.542
GlnCys: 0.299 ± 0.134
GlnAsp: 1.255 ± 0.256
GlnGlu: 1.554 ± 0.309
GlnPhe: 1.315 ± 0.245
GlnGly: 4.243 ± 1.482
GlnHis: 0.538 ± 0.182
GlnIle: 2.39 ± 0.322
GlnLys: 1.315 ± 0.238
GlnLeu: 3.466 ± 0.604
GlnMet: 0.837 ± 0.204
GlnAsn: 0.777 ± 0.187
GlnPro: 1.195 ± 0.264
GlnGln: 2.032 ± 0.505
GlnArg: 2.331 ± 0.337
GlnSer: 1.733 ± 0.26
GlnThr: 1.972 ± 0.269
GlnVal: 2.271 ± 0.362
GlnTrp: 0.538 ± 0.186
GlnTyr: 1.016 ± 0.228
GlnXaa: 0.0 ± 0.0
Arg
ArgAla: 5.618 ± 0.642
ArgCys: 0.717 ± 0.244
ArgAsp: 4.183 ± 0.478
ArgGlu: 3.825 ± 0.519
ArgPhe: 2.032 ± 0.376
ArgGly: 4.721 ± 0.686
ArgHis: 1.434 ± 0.356
ArgIle: 3.347 ± 0.352
ArgLys: 2.689 ± 0.385
ArgLeu: 5.379 ± 0.634
ArgMet: 2.211 ± 0.354
ArgAsn: 2.092 ± 0.339
ArgPro: 2.988 ± 0.477
ArgGln: 2.271 ± 0.354
ArgArg: 4.602 ± 0.698
ArgSer: 3.347 ± 0.422
ArgThr: 2.869 ± 0.348
ArgVal: 4.363 ± 0.564
ArgTrp: 1.614 ± 0.338
ArgTyr: 2.032 ± 0.351
ArgXaa: 0.0 ± 0.0
Ser
SerAla: 5.498 ± 0.411
SerCys: 0.538 ± 0.186
SerAsp: 3.586 ± 0.481
SerGlu: 4.422 ± 0.65
SerPhe: 2.749 ± 0.521
SerGly: 5.857 ± 0.72
SerHis: 0.837 ± 0.191
SerIle: 2.211 ± 0.349
SerLys: 2.45 ± 0.46
SerLeu: 4.781 ± 0.64
SerMet: 1.494 ± 0.285
SerAsn: 1.135 ± 0.256
SerPro: 3.586 ± 0.438
SerGln: 2.39 ± 0.342
SerArg: 4.183 ± 0.539
SerSer: 4.124 ± 0.639
SerThr: 2.928 ± 0.32
SerVal: 3.227 ± 0.385
SerTrp: 1.255 ± 0.24
SerTyr: 1.733 ± 0.277
SerXaa: 0.0 ± 0.0
Thr
ThrAla: 6.693 ± 0.516
ThrCys: 0.418 ± 0.15
ThrAsp: 3.227 ± 0.515
ThrGlu: 3.347 ± 0.51
ThrPhe: 2.51 ± 0.295
ThrGly: 6.096 ± 0.621
ThrHis: 1.255 ± 0.298
ThrIle: 2.809 ± 0.445
ThrLys: 3.227 ± 0.528
ThrLeu: 4.781 ± 0.451
ThrMet: 1.614 ± 0.316
ThrAsn: 1.673 ± 0.308
ThrPro: 4.183 ± 0.487
ThrGln: 2.39 ± 0.358
ThrArg: 3.167 ± 0.422
ThrSer: 3.944 ± 0.625
ThrThr: 3.825 ± 0.604
ThrVal: 4.482 ± 0.468
ThrTrp: 1.135 ± 0.328
ThrTyr: 1.912 ± 0.268
ThrXaa: 0.0 ± 0.0
Val
ValAla: 6.215 ± 0.531
ValCys: 0.717 ± 0.215
ValAsp: 5.14 ± 0.487
ValGlu: 5.02 ± 0.504
ValPhe: 2.689 ± 0.574
ValGly: 4.602 ± 0.568
ValHis: 0.777 ± 0.188
ValIle: 2.809 ± 0.392
ValLys: 3.466 ± 0.434
ValLeu: 6.275 ± 0.575
ValMet: 1.793 ± 0.333
ValAsn: 3.287 ± 0.44
ValPro: 3.227 ± 0.377
ValGln: 2.57 ± 0.492
ValArg: 4.542 ± 0.523
ValSer: 4.183 ± 0.486
ValThr: 4.422 ± 0.462
ValVal: 5.08 ± 0.601
ValTrp: 1.673 ± 0.263
ValTyr: 2.032 ± 0.38
ValXaa: 0.0 ± 0.0
Trp
TrpAla: 2.331 ± 0.383
TrpCys: 0.299 ± 0.136
TrpAsp: 0.956 ± 0.217
TrpGlu: 1.255 ± 0.297
TrpPhe: 0.478 ± 0.183
TrpGly: 1.315 ± 0.296
TrpHis: 0.657 ± 0.215
TrpIle: 1.375 ± 0.259
TrpLys: 0.956 ± 0.218
TrpLeu: 1.076 ± 0.212
TrpMet: 0.657 ± 0.19
TrpAsn: 0.837 ± 0.223
TrpPro: 0.837 ± 0.196
TrpGln: 1.255 ± 0.296
TrpArg: 1.076 ± 0.22
TrpSer: 1.195 ± 0.299
TrpThr: 1.315 ± 0.347
TrpVal: 1.255 ± 0.239
TrpTrp: 0.359 ± 0.177
TrpTyr: 0.359 ± 0.124
TrpXaa: 0.0 ± 0.0
Tyr
TyrAla: 2.45 ± 0.336
TyrCys: 0.239 ± 0.146
TyrAsp: 1.733 ± 0.271
TyrGlu: 2.032 ± 0.398
TyrPhe: 0.896 ± 0.191
TyrGly: 2.63 ± 0.356
TyrHis: 0.239 ± 0.123
TyrIle: 1.315 ± 0.258
TyrLys: 0.956 ± 0.267
TyrLeu: 2.63 ± 0.361
TyrMet: 0.717 ± 0.203
TyrAsn: 1.255 ± 0.332
TyrPro: 1.375 ± 0.278
TyrGln: 1.016 ± 0.256
TyrArg: 2.211 ± 0.428
TyrSer: 1.494 ± 0.267
TyrThr: 2.032 ± 0.37
TyrVal: 2.39 ± 0.411
TyrTrp: 0.418 ± 0.142
TyrTyr: 0.657 ± 0.194
TyrXaa: 0.0 ± 0.0
Xaa
XaaAla: 0.0 ± 0.0
XaaCys: 0.0 ± 0.0
XaaAsp: 0.0 ± 0.0
XaaGlu: 0.0 ± 0.0
XaaPhe: 0.0 ± 0.0
XaaGly: 0.0 ± 0.0
XaaHis: 0.0 ± 0.0
XaaIle: 0.0 ± 0.0
XaaLys: 0.0 ± 0.0
XaaLeu: 0.0 ± 0.0
XaaMet: 0.0 ± 0.0
XaaAsn: 0.0 ± 0.0
XaaPro: 0.0 ± 0.0
XaaGln: 0.0 ± 0.0
XaaArg: 0.0 ± 0.0
XaaSer: 0.0 ± 0.0
XaaThr: 0.0 ± 0.0
XaaVal: 0.0 ± 0.0
XaaTrp: 0.0 ± 0.0
XaaTyr: 0.0 ± 0.0
XaaXaa: 0.0 ± 0.0
Statistics based on 90 proteins (16734 amino acids)
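Each cell of the table above carries a label of the form `AlaCys: 0.478 ± 0.168`. The following Python sketch shows one possible way to read such lines into a dictionary. The regular expression and the function name are assumptions made for illustration, not part of Proteome-pI; the pattern searches inside the line, so any text preceding the label is ignored.

```python
import re

# Two three-letter residue codes, a colon, a value and an error in per mille.
CELL = re.compile(
    r"([A-Z][a-z]{2})([A-Z][a-z]{2}): (\d+(?:\.\d+)?) ± (\d+(?:\.\d+)?)"
)


def parse_cells(lines):
    """Map (first residue, second residue) to (value, error), both in per mille."""
    table = {}
    for line in lines:
        match = CELL.search(line)
        if match:  # row labels such as "Ala" do not match and are skipped
            first, second, value, error = match.groups()
            table[(first, second)] = (float(value), float(error))
    return table


# A few lines copied from the table, plus a bare row label.
sample = [
    "Ala",
    "AlaAla: 11.056 ± 0.895",
    "AlaCys: 0.478 ± 0.168",
    "CysCys: 0.0 ± 0.0",
]
cells = parse_cells(sample)
most_frequent = max(cells, key=lambda pair: cells[pair][0])
print(most_frequent, cells[most_frequent])
```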

Note: The error was estimated by bootstrapping (100 replicates) at the protein level.
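A protein-level bootstrap of this kind could be sketched as follows. This is an illustration under stated assumptions, not the database's actual procedure: whole proteins are resampled with replacement, the frequency is recomputed for each replicate, and the error is taken as the standard deviation across replicates. The function names, the toy sequences, and the fixed seed are hypothetical.

```python
import random
from statistics import stdev


def pair_per_mille(proteins, pair):
    """Frequency of one dipeptide in per mille, counted within each protein."""
    count = sum(
        1
        for seq in proteins
        for i in range(len(seq) - 1)
        if seq[i:i + 2] == pair
    )
    total = sum(max(len(seq) - 1, 0) for seq in proteins)
    return 1000.0 * count / total if total else 0.0


def bootstrap_error(proteins, pair, n_rounds=100, seed=0):
    """Standard deviation of the frequency over protein-level resamples."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    estimates = []
    for _ in range(n_rounds):
        # Resample whole proteins with replacement, keeping the same count.
        resample = rng.choices(proteins, k=len(proteins))
        estimates.append(pair_per_mille(resample, pair))
    return stdev(estimates)


# Hypothetical one-letter sequences, used only to exercise the functions.
toy = ["MAAGL", "MAGA", "GGGG"]
print(pair_per_mille(toy, "AG"), bootstrap_error(toy, "AG"))
```

Resampling whole proteins rather than individual residues keeps the pairs within each sequence intact, which is presumably the point of bootstrapping at the protein level.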

The dipeptide statistics above (along with other statistics for this proteome) can be downloaded as a CSV file.
See this proteome in UniProt.
Proteome-pI is available under a Creative Commons Attribution-NoDerivs license.

Reference: Kozlowski LP. Proteome-pI 2.0: Proteome Isoelectric Point Database Update. Nucleic Acids Res. 2021, doi: 10.1093/nar/gkab944
Contact: Lukasz P. Kozlowski