Amino acid dipeptide frequency for Salmonella phage Shemara

Apart from single amino acid frequencies, one can also calculate the so-called amino acid dipeptide frequency.

There are 400 possible dipeptides (441 if we consider Xaa as an additional 21st amino acid representing all non-standard or unknown amino acids). Thus, if dipeptides occurred randomly in proteins, each of the 400 standard dipeptides would be present with a frequency of 0.25% (2.5‰). As this is not the case in nature, for better visibility all dipeptides that occur more often than expected are marked in red, and those that are underrepresented are marked in blue in the table.
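The counting described above can be sketched in Python as follows. This is a minimal illustration, not the database's own code: the function names (`dipeptide_per_mille`, `label`), the use of overlapping two-residue windows within each protein, the normalisation by the total number of windows, and the mapping of any non-standard letter to `X` (Xaa) are all assumptions made for the example.

```python
from collections import Counter
from itertools import product

# One-letter codes in the same order as the table columns (Ala, Cys, ..., Tyr).
ALPHABET = "ACDEFGHIKLMNPQRSTVWY"

# Uniform expectation for the 400 standard dipeptides: 2.5 per mille (0.25%).
UNIFORM_PER_MILLE = 1000.0 / 400


def dipeptide_per_mille(proteins):
    """Return {dipeptide: frequency in per mille} for all 441 combinations.

    Assumption: dipeptides are overlapping windows inside each protein and
    frequencies are normalised by the total number of such windows.
    """
    counts = Counter()
    for seq in proteins:
        # Assumption: anything outside the 20 standard letters becomes X (Xaa).
        seq = "".join(c if c in ALPHABET else "X" for c in seq.upper())
        counts.update(seq[i:i + 2] for i in range(len(seq) - 1))
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no dipeptides found (all sequences shorter than 2)")
    return {a + b: 1000.0 * counts[a + b] / total
            for a, b in product(ALPHABET + "X", repeat=2)}


def label(freq_per_mille):
    """Classify a frequency relative to the uniform 2.5 per mille expectation."""
    if freq_per_mille > UNIFORM_PER_MILLE:
        return "over"
    if freq_per_mille < UNIFORM_PER_MILLE:
        return "under"
    return "expected"


if __name__ == "__main__":
    freqs = dipeptide_per_mille(["AAAC", "AC"])  # toy sequences, not real data
    print(freqs["AA"], label(freqs["AA"]))
```

The comparison against a flat 2.5‰ baseline mirrors the reasoning in the text; a baseline built from the observed single amino acid frequencies would be a different (and stricter) choice.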

All values are presented in per mille (‰) and therefore need to be multiplied by 10⁻³ to obtain fractions.
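As a worked example of the unit conversion, the short Python sketch below turns a per mille value from the table into a fraction and compares it with the uniform expectation of 1/400. The helper names are hypothetical and chosen only for this illustration.

```python
def per_mille_to_fraction(value_per_mille):
    """Convert a per mille value into a plain fraction (multiply by 10^-3)."""
    return value_per_mille * 1e-3


def fold_over_uniform(value_per_mille, n_dipeptides=400):
    """Ratio of the observed fraction to the uniform expectation 1/n_dipeptides."""
    return per_mille_to_fraction(value_per_mille) * n_dipeptides


if __name__ == "__main__":
    ala_ala = 11.836  # AlaAla value from the table, in per mille
    print(per_mille_to_fraction(ala_ala))  # fraction of all dipeptides
    print(fold_over_uniform(ala_ala))      # multiple of the 0.25% baseline
```

For AlaAla this gives a fraction of about 0.0118 (roughly 1.18%), which is about 4.7 times the 0.25% expected under a uniform distribution.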

For more information, see the sequence space article on Wikipedia.

Ala Cys Asp Glu Phe Gly His Ile Lys Leu Met Asn Pro Gln Arg Ser Thr Val Trp Tyr Xaa
Ala
AlaAla: 11.836 ± 1.226
AlaCys: 0.921 ± 0.294
AlaAsp: 6.804 ± 0.73
AlaGlu: 6.662 ± 0.712
AlaPhe: 3.898 ± 0.581
AlaGly: 8.009 ± 0.672
AlaHis: 1.772 ± 0.383
AlaIle: 5.174 ± 0.678
AlaLys: 5.599 ± 0.704
AlaLeu: 8.151 ± 0.936
AlaMet: 2.197 ± 0.449
AlaAsn: 3.827 ± 0.572
AlaPro: 3.048 ± 0.425
AlaGln: 3.473 ± 0.569
AlaArg: 5.032 ± 0.595
AlaSer: 6.237 ± 0.597
AlaThr: 5.103 ± 0.763
AlaVal: 6.592 ± 0.723
AlaTrp: 1.205 ± 0.24
AlaTyr: 2.977 ± 0.482
AlaXaa: 0.0 ± 0.0
Cys
CysAla: 1.063 ± 0.314
CysCys: 0.354 ± 0.165
CysAsp: 0.638 ± 0.221
CysGlu: 0.992 ± 0.315
CysPhe: 0.284 ± 0.139
CysGly: 1.347 ± 0.414
CysHis: 0.213 ± 0.127
CysIle: 0.567 ± 0.223
CysLys: 1.205 ± 0.278
CysLeu: 0.78 ± 0.252
CysMet: 0.142 ± 0.12
CysAsn: 0.709 ± 0.222
CysPro: 0.425 ± 0.165
CysGln: 0.638 ± 0.263
CysArg: 1.347 ± 0.426
CysSer: 0.284 ± 0.149
CysThr: 0.496 ± 0.188
CysVal: 0.638 ± 0.276
CysTrp: 0.284 ± 0.129
CysTyr: 0.567 ± 0.195
CysXaa: 0.0 ± 0.0
Asp
AspAla: 6.662 ± 0.625
AspCys: 0.709 ± 0.228
AspAsp: 3.686 ± 0.727
AspGlu: 5.599 ± 0.801
AspPhe: 2.552 ± 0.409
AspGly: 5.174 ± 0.619
AspHis: 0.921 ± 0.264
AspIle: 3.544 ± 0.415
AspLys: 3.756 ± 0.582
AspLeu: 5.245 ± 0.752
AspMet: 1.701 ± 0.351
AspAsn: 2.764 ± 0.456
AspPro: 1.63 ± 0.378
AspGln: 0.921 ± 0.244
AspArg: 2.126 ± 0.455
AspSer: 3.686 ± 0.501
AspThr: 3.969 ± 0.458
AspVal: 4.182 ± 0.492
AspTrp: 1.205 ± 0.314
AspTyr: 1.985 ± 0.364
AspXaa: 0.0 ± 0.0
Glu
GluAla: 5.954 ± 0.88
GluCys: 0.638 ± 0.25
GluAsp: 4.678 ± 0.567
GluGlu: 4.323 ± 0.823
GluPhe: 2.268 ± 0.314
GluGly: 4.749 ± 0.548
GluHis: 1.347 ± 0.375
GluIle: 2.977 ± 0.451
GluLys: 4.678 ± 0.684
GluLeu: 6.308 ± 0.917
GluMet: 2.693 ± 0.495
GluAsn: 2.835 ± 0.519
GluPro: 2.268 ± 0.43
GluGln: 2.977 ± 0.531
GluArg: 3.473 ± 0.565
GluSer: 2.906 ± 0.382
GluThr: 3.189 ± 0.488
GluVal: 3.686 ± 0.551
GluTrp: 1.205 ± 0.299
GluTyr: 2.693 ± 0.508
GluXaa: 0.0 ± 0.0
Phe
PheAla: 2.552 ± 0.43
PheCys: 0.921 ± 0.26
PheAsp: 2.481 ± 0.481
PheGlu: 2.977 ± 0.401
PhePhe: 0.921 ± 0.266
PheGly: 2.764 ± 0.44
PheHis: 0.709 ± 0.254
PheIle: 2.055 ± 0.352
PheLys: 1.418 ± 0.305
PheLeu: 2.339 ± 0.471
PheMet: 0.638 ± 0.202
PheAsn: 2.339 ± 0.541
PhePro: 1.134 ± 0.377
PheGln: 0.78 ± 0.259
PheArg: 1.914 ± 0.292
PheSer: 3.331 ± 0.617
PheThr: 2.552 ± 0.434
PheVal: 2.693 ± 0.486
PheTrp: 0.992 ± 0.285
PheTyr: 0.992 ± 0.245
PheXaa: 0.0 ± 0.0
Gly
GlyAla: 6.662 ± 0.74
GlyCys: 1.205 ± 0.373
GlyAsp: 4.607 ± 0.655
GlyGlu: 4.678 ± 0.489
GlyPhe: 2.764 ± 0.343
GlyGly: 6.025 ± 0.783
GlyHis: 1.347 ± 0.365
GlyIle: 3.473 ± 0.478
GlyLys: 4.394 ± 0.708
GlyLeu: 6.521 ± 0.73
GlyMet: 1.985 ± 0.413
GlyAsn: 3.969 ± 0.813
GlyPro: 2.197 ± 0.333
GlyGln: 2.552 ± 0.443
GlyArg: 4.536 ± 0.599
GlySer: 4.82 ± 0.577
GlyThr: 4.678 ± 0.594
GlyVal: 5.741 ± 0.677
GlyTrp: 1.205 ± 0.339
GlyTyr: 2.693 ± 0.36
GlyXaa: 0.0 ± 0.0
His
HisAla: 1.488 ± 0.291
HisCys: 0.354 ± 0.166
HisAsp: 0.851 ± 0.2
HisGlu: 0.921 ± 0.243
HisPhe: 0.921 ± 0.277
HisGly: 1.063 ± 0.354
HisHis: 1.063 ± 0.44
HisIle: 1.063 ± 0.319
HisLys: 1.205 ± 0.331
HisLeu: 1.134 ± 0.295
HisMet: 0.354 ± 0.165
HisAsn: 0.709 ± 0.191
HisPro: 1.418 ± 0.294
HisGln: 0.851 ± 0.212
HisArg: 0.851 ± 0.264
HisSer: 0.921 ± 0.222
HisThr: 0.567 ± 0.183
HisVal: 1.134 ± 0.278
HisTrp: 0.213 ± 0.132
HisTyr: 0.638 ± 0.215
HisXaa: 0.0 ± 0.0
Ile
IleAla: 4.82 ± 0.666
IleCys: 0.851 ± 0.218
IleAsp: 3.402 ± 0.586
IleGlu: 3.331 ± 0.41
IlePhe: 1.205 ± 0.325
IleGly: 4.111 ± 0.518
IleHis: 0.425 ± 0.177
IleIle: 3.189 ± 0.503
IleLys: 2.552 ± 0.35
IleLeu: 2.693 ± 0.439
IleMet: 1.488 ± 0.29
IleAsn: 2.835 ± 0.501
IlePro: 2.693 ± 0.426
IleGln: 2.126 ± 0.387
IleArg: 2.41 ± 0.389
IleSer: 3.615 ± 0.641
IleThr: 3.473 ± 0.505
IleVal: 2.977 ± 0.44
IleTrp: 0.921 ± 0.292
IleTyr: 1.347 ± 0.289
IleXaa: 0.0 ± 0.0
Lys
LysAla: 6.237 ± 0.744
LysCys: 1.347 ± 0.268
LysAsp: 2.693 ± 0.427
LysGlu: 3.615 ± 0.53
LysPhe: 2.126 ± 0.363
LysGly: 3.898 ± 0.654
LysHis: 1.559 ± 0.322
LysIle: 2.197 ± 0.362
LysLys: 2.835 ± 0.392
LysLeu: 4.961 ± 0.651
LysMet: 2.552 ± 0.509
LysAsn: 2.622 ± 0.477
LysPro: 2.835 ± 0.476
LysGln: 2.622 ± 0.441
LysArg: 3.331 ± 0.583
LysSer: 3.331 ± 0.54
LysThr: 3.827 ± 0.53
LysVal: 3.119 ± 0.486
LysTrp: 0.851 ± 0.24
LysTyr: 2.764 ± 0.43
LysXaa: 0.0 ± 0.0
Leu
LeuAla: 8.434 ± 0.832
LeuCys: 1.134 ± 0.282
LeuAsp: 4.323 ± 0.475
LeuGlu: 4.323 ± 0.58
LeuPhe: 2.977 ± 0.465
LeuGly: 4.465 ± 0.458
LeuHis: 1.205 ± 0.297
LeuIle: 3.615 ± 0.447
LeuLys: 5.174 ± 0.576
LeuLeu: 6.45 ± 0.907
LeuMet: 2.339 ± 0.4
LeuAsn: 4.323 ± 0.592
LeuPro: 3.119 ± 0.532
LeuGln: 3.402 ± 0.598
LeuArg: 4.678 ± 0.509
LeuSer: 4.323 ± 0.553
LeuThr: 4.182 ± 0.584
LeuVal: 5.954 ± 0.633
LeuTrp: 0.567 ± 0.171
LeuTyr: 2.197 ± 0.335
LeuXaa: 0.0 ± 0.0
Met
MetAla: 2.977 ± 0.454
MetCys: 0.284 ± 0.126
MetAsp: 1.488 ± 0.463
MetGlu: 1.914 ± 0.367
MetPhe: 0.992 ± 0.243
MetGly: 1.347 ± 0.275
MetHis: 0.638 ± 0.233
MetIle: 1.276 ± 0.318
MetLys: 1.701 ± 0.353
MetLeu: 2.197 ± 0.416
MetMet: 0.567 ± 0.223
MetAsn: 1.134 ± 0.348
MetPro: 0.851 ± 0.224
MetGln: 1.063 ± 0.258
MetArg: 1.914 ± 0.398
MetSer: 1.772 ± 0.303
MetThr: 2.126 ± 0.505
MetVal: 1.418 ± 0.292
MetTrp: 0.142 ± 0.098
MetTyr: 0.78 ± 0.271
MetXaa: 0.0 ± 0.0
Asn
AsnAla: 4.111 ± 0.503
AsnCys: 0.425 ± 0.171
AsnAsp: 3.827 ± 0.389
AsnGlu: 2.41 ± 0.41
AsnPhe: 1.276 ± 0.289
AsnGly: 5.032 ± 0.575
AsnHis: 1.134 ± 0.34
AsnIle: 3.473 ± 0.548
AsnLys: 2.339 ± 0.468
AsnLeu: 3.048 ± 0.452
AsnMet: 1.063 ± 0.303
AsnAsn: 2.906 ± 0.593
AsnPro: 1.418 ± 0.368
AsnGln: 1.488 ± 0.32
AsnArg: 3.048 ± 0.541
AsnSer: 2.481 ± 0.615
AsnThr: 3.402 ± 0.739
AsnVal: 2.693 ± 0.428
AsnTrp: 0.709 ± 0.247
AsnTyr: 1.772 ± 0.381
AsnXaa: 0.0 ± 0.0
Pro
ProAla: 3.686 ± 0.616
ProCys: 0.284 ± 0.151
ProAsp: 2.764 ± 0.429
ProGlu: 4.04 ± 0.5
ProPhe: 1.488 ± 0.331
ProGly: 2.197 ± 0.613
ProHis: 0.142 ± 0.091
ProIle: 1.772 ± 0.383
ProLys: 2.055 ± 0.426
ProLeu: 2.977 ± 0.446
ProMet: 0.851 ± 0.264
ProAsn: 1.772 ± 0.433
ProPro: 0.851 ± 0.305
ProGln: 1.418 ± 0.349
ProArg: 1.772 ± 0.312
ProSer: 2.339 ± 0.357
ProThr: 1.843 ± 0.371
ProVal: 3.473 ± 0.52
ProTrp: 0.142 ± 0.112
ProTyr: 1.063 ± 0.284
ProXaa: 0.0 ± 0.0
Gln
GlnAla: 4.465 ± 0.609
GlnCys: 0.496 ± 0.229
GlnAsp: 1.701 ± 0.437
GlnGlu: 2.268 ± 0.449
GlnPhe: 1.488 ± 0.356
GlnGly: 2.055 ± 0.393
GlnHis: 0.709 ± 0.264
GlnIle: 2.552 ± 0.427
GlnLys: 2.197 ± 0.482
GlnLeu: 2.41 ± 0.483
GlnMet: 1.418 ± 0.355
GlnAsn: 1.772 ± 0.381
GlnPro: 1.772 ± 0.361
GlnGln: 2.764 ± 0.774
GlnArg: 2.339 ± 0.454
GlnSer: 1.772 ± 0.324
GlnThr: 2.41 ± 0.402
GlnVal: 2.693 ± 0.377
GlnTrp: 0.851 ± 0.241
GlnTyr: 1.559 ± 0.335
GlnXaa: 0.0 ± 0.0
Arg
ArgAla: 4.182 ± 0.541
ArgCys: 0.425 ± 0.204
ArgAsp: 3.402 ± 0.49
ArgGlu: 2.906 ± 0.481
ArgPhe: 1.914 ± 0.322
ArgGly: 3.898 ± 0.612
ArgHis: 0.921 ± 0.221
ArgIle: 2.906 ± 0.425
ArgLys: 4.749 ± 0.651
ArgLeu: 4.04 ± 0.508
ArgMet: 1.134 ± 0.275
ArgAsn: 2.977 ± 0.484
ArgPro: 1.701 ± 0.519
ArgGln: 2.906 ± 0.49
ArgArg: 3.898 ± 0.612
ArgSer: 2.481 ± 0.37
ArgThr: 3.544 ± 0.518
ArgVal: 4.607 ± 0.536
ArgTrp: 0.851 ± 0.254
ArgTyr: 1.701 ± 0.382
ArgXaa: 0.0 ± 0.0
Ser
SerAla: 6.166 ± 0.764
SerCys: 0.425 ± 0.211
SerAsp: 3.189 ± 0.421
SerGlu: 3.331 ± 0.522
SerPhe: 2.906 ± 0.468
SerGly: 5.528 ± 0.909
SerHis: 0.709 ± 0.245
SerIle: 2.764 ± 0.421
SerLys: 2.906 ± 0.525
SerLeu: 4.182 ± 0.463
SerMet: 2.055 ± 0.389
SerAsn: 2.552 ± 0.561
SerPro: 2.268 ± 0.438
SerGln: 2.126 ± 0.469
SerArg: 3.331 ± 0.43
SerSer: 2.622 ± 0.466
SerThr: 3.827 ± 0.611
SerVal: 3.827 ± 0.774
SerTrp: 0.851 ± 0.241
SerTyr: 2.197 ± 0.361
SerXaa: 0.0 ± 0.0
Thr
ThrAla: 5.954 ± 0.766
ThrCys: 0.567 ± 0.212
ThrAsp: 2.977 ± 0.443
ThrGlu: 3.402 ± 0.49
ThrPhe: 2.552 ± 0.49
ThrGly: 6.45 ± 0.916
ThrHis: 0.638 ± 0.216
ThrIle: 3.048 ± 0.38
ThrLys: 3.048 ± 0.482
ThrLeu: 5.174 ± 0.663
ThrMet: 1.205 ± 0.329
ThrAsn: 2.268 ± 0.576
ThrPro: 3.544 ± 0.574
ThrGln: 2.268 ± 0.39
ThrArg: 2.339 ± 0.392
ThrSer: 3.827 ± 0.5
ThrThr: 3.686 ± 0.696
ThrVal: 4.82 ± 0.598
ThrTrp: 0.921 ± 0.265
ThrTyr: 2.268 ± 0.427
ThrXaa: 0.0 ± 0.0
Val
ValAla: 7.229 ± 0.635
ValCys: 0.709 ± 0.249
ValAsp: 4.111 ± 0.429
ValGlu: 4.82 ± 0.621
ValPhe: 1.701 ± 0.437
ValGly: 4.182 ± 0.621
ValHis: 1.134 ± 0.251
ValIle: 3.544 ± 0.429
ValLys: 4.323 ± 0.594
ValLeu: 4.607 ± 0.467
ValMet: 1.205 ± 0.361
ValAsn: 3.544 ± 0.641
ValPro: 2.339 ± 0.408
ValGln: 2.693 ± 0.341
ValArg: 3.756 ± 0.521
ValSer: 4.182 ± 0.761
ValThr: 5.528 ± 0.716
ValVal: 5.458 ± 0.94
ValTrp: 0.992 ± 0.247
ValTyr: 2.622 ± 0.379
ValXaa: 0.0 ± 0.0
Trp
TrpAla: 0.921 ± 0.377
TrpCys: 0.213 ± 0.117
TrpAsp: 1.063 ± 0.249
TrpGlu: 0.78 ± 0.296
TrpPhe: 0.709 ± 0.222
TrpGly: 0.921 ± 0.312
TrpHis: 0.567 ± 0.265
TrpIle: 0.284 ± 0.16
TrpLys: 0.425 ± 0.21
TrpLeu: 1.772 ± 0.341
TrpMet: 0.284 ± 0.149
TrpAsn: 0.638 ± 0.273
TrpPro: 0.496 ± 0.194
TrpGln: 0.78 ± 0.25
TrpArg: 1.205 ± 0.26
TrpSer: 0.992 ± 0.284
TrpThr: 0.851 ± 0.207
TrpVal: 1.276 ± 0.285
TrpTrp: 0.425 ± 0.167
TrpTyr: 0.567 ± 0.218
TrpXaa: 0.0 ± 0.0
Tyr
TyrAla: 3.119 ± 0.576
TyrCys: 0.709 ± 0.208
TyrAsp: 3.331 ± 0.502
TyrGlu: 2.552 ± 0.476
TyrPhe: 1.559 ± 0.333
TyrGly: 2.835 ± 0.509
TyrHis: 0.567 ± 0.208
TyrIle: 0.992 ± 0.25
TyrLys: 2.764 ± 0.445
TyrLeu: 2.197 ± 0.425
TyrMet: 0.638 ± 0.233
TyrAsn: 1.559 ± 0.251
TyrPro: 0.921 ± 0.285
TyrGln: 1.843 ± 0.477
TyrArg: 1.843 ± 0.35
TyrSer: 1.843 ± 0.376
TyrThr: 1.701 ± 0.371
TyrVal: 1.772 ± 0.381
TyrTrp: 0.638 ± 0.238
TyrTyr: 1.063 ± 0.241
TyrXaa: 0.0 ± 0.0
Xaa
XaaAla: 0.0 ± 0.0
XaaCys: 0.0 ± 0.0
XaaAsp: 0.0 ± 0.0
XaaGlu: 0.0 ± 0.0
XaaPhe: 0.0 ± 0.0
XaaGly: 0.0 ± 0.0
XaaHis: 0.0 ± 0.0
XaaIle: 0.0 ± 0.0
XaaLys: 0.0 ± 0.0
XaaLeu: 0.0 ± 0.0
XaaMet: 0.0 ± 0.0
XaaAsn: 0.0 ± 0.0
XaaPro: 0.0 ± 0.0
XaaGln: 0.0 ± 0.0
XaaArg: 0.0 ± 0.0
XaaSer: 0.0 ± 0.0
XaaThr: 0.0 ± 0.0
XaaVal: 0.0 ± 0.0
XaaTrp: 0.0 ± 0.0
XaaTyr: 0.0 ± 0.0
XaaXaa: 0.0 ± 0.0
Statistics based on 83 proteins (14110 amino acids)

Note: The error has been estimated by bootstrapping (×100) at the protein level.
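A protein-level bootstrap of this kind could look roughly like the Python sketch below. The source only states that 100 bootstrap rounds were run at the protein level; resampling whole proteins with replacement, normalising by the number of overlapping dipeptide windows, and reporting the standard deviation of the resampled estimates as the error are assumptions of this example, and the function names are hypothetical.

```python
import random
from collections import Counter
from statistics import stdev


def count_dipeptides(seq):
    """Count overlapping two-residue windows in one protein sequence."""
    return Counter(seq[i:i + 2] for i in range(len(seq) - 1))


def bootstrap_error(proteins, dipeptide, n_resamples=100, seed=0):
    """Estimate the error (in per mille) of one dipeptide frequency.

    Assumption: whole proteins are resampled with replacement, and the error
    is the standard deviation of the frequency over all resamples.
    """
    rng = random.Random(seed)
    per_protein = [count_dipeptides(seq) for seq in proteins]
    estimates = []
    for _ in range(n_resamples):
        # Draw as many proteins as in the original set, with replacement.
        sample = rng.choices(per_protein, k=len(per_protein))
        total = sum(sum(c.values()) for c in sample)
        hits = sum(c[dipeptide] for c in sample)
        estimates.append(1000.0 * hits / total if total else 0.0)
    return stdev(estimates)


if __name__ == "__main__":
    toy = ["MAAAK", "MKLV", "AAGA", "MLLAA"]  # toy sequences, not real data
    print(bootstrap_error(toy, "AA"))
```

Resampling proteins rather than individual dipeptides keeps residues from the same protein together, so the estimate reflects variation between proteins.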

The above dipeptide statistics (among other statistics for this proteome) can be downloaded from this CSV file
See this proteome in: uniprot_link
Proteome-pI is available under the Creative Commons Attribution-NoDerivs license; for more details see here

Reference: Kozlowski LP. Proteome-pI 2.0: Proteome Isoelectric Point Database Update. Nucleic Acids Res. 2021, doi: 10.1093/nar/gkab944
Contact: Lukasz P. Kozlowski