Amino acid dipeptide frequency for Enterobacteria phage SfI

Apart from single amino acid frequencies, one can also calculate the so-called amino acid dipeptide frequency, i.e. the frequency of each ordered pair of adjacent amino acids.

There are 400 possible dipeptides (441 if Xaa is counted as an additional, 21st amino acid representing all non-standard or unknown amino acids). Thus, if dipeptides occurred randomly and with equal probability in proteins, each of the 400 dipeptides would be present at a frequency of 0.25% (2.5‰). As this is not the case in nature, for better visibility all dipeptides that are more frequent than expected are marked in red in the table, and those that are underrepresented are marked in blue.
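The following is a minimal sketch of how such frequencies could be computed and compared with the uniform expectation. It is not the Proteome-pI implementation: the function names, the use of one-letter codes, and the choice to normalise by the total number of adjacent pairs are assumptions made for illustration only.

```python
from collections import Counter
from itertools import product

# The 20 standard amino acids in one-letter code. The page itself uses
# three-letter codes; one-letter codes are an assumption for brevity.
STANDARD = "ACDEFGHIKLMNPQRSTVWY"
ALPHABET = STANDARD + "X"  # 'X' plays the role of Xaa


def dipeptide_per_mille(proteins):
    """Return {dipeptide: frequency in per mille} for all 441 dipeptides.

    Pairs are counted within each protein only, never across protein
    boundaries. Any non-standard letter is mapped to 'X'.
    Normalising by the total number of pairs is an assumption.
    """
    counts = Counter()
    for seq in proteins:
        seq = "".join(c if c in STANDARD else "X" for c in seq.upper())
        counts.update(seq[i:i + 2] for i in range(len(seq) - 1))
    total = sum(counts.values())
    return {
        a + b: (1000.0 * counts[a + b] / total if total else 0.0)
        for a, b in product(ALPHABET, repeat=2)
    }


def classify(freq_per_mille, n_dipeptides=400):
    """Compare a frequency with the uniform expectation (2.5 per mille for 400)."""
    expected = 1000.0 / n_dipeptides
    if freq_per_mille > expected:
        return "over"
    if freq_per_mille < expected:
        return "under"
    return "expected"


if __name__ == "__main__":
    freqs = dipeptide_per_mille(["AAAC", "ACB"])  # toy input, not real data
    print(freqs["AA"], classify(freqs["AA"]))
```

With the values from the table, `classify(7.733)` (AlaAla) reports an overrepresented dipeptide and `classify(0.336)` (CysCys) an underrepresented one.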

All values are given in per mille (‰) and therefore need to be multiplied by 10^-3 to obtain fractions.
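As a small worked illustration of the unit conversion, the sketch below turns a per-mille value into a fraction and a percentage, and relates it to the uniform expectation of 1/400. The helper names are hypothetical and only serve this example.

```python
def per_mille_to_fraction(value):
    """Convert a per-mille value to a plain fraction (multiply by 10^-3)."""
    return value * 1e-3


def per_mille_to_percent(value):
    """Convert a per-mille value to percent (1 per mille = 0.1 %)."""
    return value * 0.1


def fold_over_uniform(value, n_dipeptides=400):
    """Ratio of an observed per-mille frequency to the uniform expectation."""
    return per_mille_to_fraction(value) * n_dipeptides


if __name__ == "__main__":
    ala_ala = 7.733  # AlaAla value taken from the table, in per mille
    print(per_mille_to_fraction(ala_ala))
    print(per_mille_to_percent(ala_ala))
    print(fold_over_uniform(ala_ala))
```

For AlaAla, 7.733‰ corresponds to a fraction of about 0.0077, or roughly three times the uniform expectation of 2.5‰.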

For more information, see the sequence space article on Wikipedia.

Ala Cys Asp Glu Phe Gly His Ile Lys Leu Met Asn Pro Gln Arg Ser Thr Val Trp Tyr Xaa
Ala
AlaAla: 7.733 ± 0.952
AlaCys: 1.429 ± 0.375
AlaAsp: 4.371 ± 0.651
AlaGlu: 6.052 ± 0.657
AlaPhe: 3.362 ± 0.741
AlaGly: 6.808 ± 0.673
AlaHis: 1.765 ± 0.389
AlaIle: 5.38 ± 0.562
AlaLys: 4.035 ± 0.477
AlaLeu: 8.826 ± 0.959
AlaMet: 3.026 ± 0.477
AlaAsn: 2.858 ± 0.511
AlaPro: 2.942 ± 0.526
AlaGln: 2.69 ± 0.474
AlaArg: 7.229 ± 1.046
AlaSer: 5.38 ± 0.727
AlaThr: 6.22 ± 0.757
AlaVal: 5.043 ± 0.71
AlaTrp: 2.017 ± 0.426
AlaTyr: 2.774 ± 0.483
AlaXaa: 0.0 ± 0.0
Cys
CysAla: 0.925 ± 0.356
CysCys: 0.336 ± 0.182
CysAsp: 0.588 ± 0.176
CysGlu: 0.672 ± 0.261
CysPhe: 0.42 ± 0.203
CysGly: 1.429 ± 0.392
CysHis: 0.252 ± 0.126
CysIle: 1.513 ± 0.315
CysLys: 0.588 ± 0.209
CysLeu: 0.925 ± 0.356
CysMet: 0.252 ± 0.155
CysAsn: 0.588 ± 0.214
CysPro: 0.756 ± 0.24
CysGln: 0.672 ± 0.235
CysArg: 1.765 ± 0.482
CysSer: 1.009 ± 0.273
CysThr: 0.42 ± 0.199
CysVal: 0.925 ± 0.285
CysTrp: 0.168 ± 0.123
CysTyr: 0.504 ± 0.205
CysXaa: 0.0 ± 0.0
Asp
AspAla: 5.211 ± 0.721
AspCys: 0.504 ± 0.195
AspAsp: 4.875 ± 0.705
AspGlu: 4.371 ± 0.527
AspPhe: 1.849 ± 0.394
AspGly: 4.959 ± 0.632
AspHis: 0.672 ± 0.232
AspIle: 3.782 ± 1.077
AspLys: 3.53 ± 0.585
AspLeu: 4.959 ± 0.901
AspMet: 2.269 ± 0.39
AspAsn: 1.765 ± 0.366
AspPro: 2.606 ± 0.448
AspGln: 1.597 ± 0.314
AspArg: 2.185 ± 0.363
AspSer: 2.774 ± 0.473
AspThr: 2.354 ± 0.452
AspVal: 3.867 ± 0.694
AspTrp: 0.841 ± 0.264
AspTyr: 1.765 ± 0.438
AspXaa: 0.0 ± 0.0
Glu
GluAla: 5.295 ± 0.464
GluCys: 1.261 ± 0.461
GluAsp: 2.269 ± 0.502
GluGlu: 4.035 ± 0.695
GluPhe: 2.185 ± 0.445
GluGly: 2.606 ± 0.399
GluHis: 1.009 ± 0.253
GluIle: 3.614 ± 0.431
GluLys: 3.026 ± 0.543
GluLeu: 6.64 ± 0.885
GluMet: 1.933 ± 0.296
GluAsn: 2.522 ± 0.421
GluPro: 2.438 ± 0.345
GluGln: 3.026 ± 0.561
GluArg: 4.623 ± 0.62
GluSer: 4.035 ± 0.452
GluThr: 2.774 ± 0.488
GluVal: 3.362 ± 0.528
GluTrp: 1.681 ± 0.388
GluTyr: 1.513 ± 0.33
GluXaa: 0.0 ± 0.0
Phe
PheAla: 2.606 ± 0.386
PheCys: 0.252 ± 0.162
PheAsp: 2.354 ± 0.459
PheGlu: 1.849 ± 0.377
PhePhe: 0.672 ± 0.297
PheGly: 3.11 ± 0.506
PheHis: 0.756 ± 0.189
PheIle: 2.017 ± 0.459
PheLys: 2.269 ± 0.412
PheLeu: 3.11 ± 0.655
PheMet: 1.261 ± 0.344
PheAsn: 1.681 ± 0.433
PhePro: 1.597 ± 0.314
PheGln: 0.841 ± 0.3
PheArg: 2.269 ± 0.396
PheSer: 2.606 ± 0.761
PheThr: 2.942 ± 0.472
PheVal: 2.354 ± 0.537
PheTrp: 1.009 ± 0.203
PheTyr: 1.177 ± 0.344
PheXaa: 0.0 ± 0.0
Gly
GlyAla: 5.968 ± 0.88
GlyCys: 0.588 ± 0.228
GlyAsp: 3.951 ± 0.492
GlyGlu: 4.119 ± 0.552
GlyPhe: 3.11 ± 0.529
GlyGly: 4.203 ± 0.671
GlyHis: 0.588 ± 0.246
GlyIle: 3.698 ± 0.503
GlyLys: 4.203 ± 0.568
GlyLeu: 5.548 ± 1.21
GlyMet: 2.185 ± 0.465
GlyAsn: 3.53 ± 0.438
GlyPro: 1.429 ± 0.334
GlyGln: 2.858 ± 0.51
GlyArg: 4.035 ± 0.527
GlySer: 3.614 ± 0.529
GlyThr: 4.539 ± 0.679
GlyVal: 5.8 ± 0.773
GlyTrp: 1.933 ± 0.446
GlyTyr: 2.774 ± 0.578
GlyXaa: 0.0 ± 0.0
His
HisAla: 1.009 ± 0.251
HisCys: 0.336 ± 0.196
HisAsp: 1.177 ± 0.273
HisGlu: 0.841 ± 0.265
HisPhe: 0.252 ± 0.14
HisGly: 1.849 ± 0.396
HisHis: 0.672 ± 0.233
HisIle: 0.841 ± 0.255
HisLys: 1.261 ± 0.327
HisLeu: 1.429 ± 0.438
HisMet: 0.504 ± 0.194
HisAsn: 0.672 ± 0.24
HisPro: 1.009 ± 0.385
HisGln: 0.588 ± 0.21
HisArg: 1.345 ± 0.384
HisSer: 0.504 ± 0.291
HisThr: 1.345 ± 0.296
HisVal: 1.009 ± 0.298
HisTrp: 0.252 ± 0.139
HisTyr: 1.009 ± 0.259
HisXaa: 0.0 ± 0.0
Ile
IleAla: 5.884 ± 0.822
IleCys: 0.925 ± 0.26
IleAsp: 3.362 ± 0.667
IleGlu: 3.53 ± 0.592
IlePhe: 0.756 ± 0.219
IleGly: 4.371 ± 0.63
IleHis: 0.925 ± 0.266
IleIle: 2.69 ± 0.441
IleLys: 2.942 ± 0.481
IleLeu: 4.623 ± 0.791
IleMet: 0.672 ± 0.253
IleAsn: 3.614 ± 0.515
IlePro: 2.69 ± 0.453
IleGln: 1.849 ± 0.401
IleArg: 3.782 ± 0.54
IleSer: 4.035 ± 0.687
IleThr: 5.548 ± 0.638
IleVal: 2.774 ± 0.613
IleTrp: 0.588 ± 0.236
IleTyr: 1.597 ± 0.42
IleXaa: 0.0 ± 0.0
Lys
LysAla: 5.968 ± 0.757
LysCys: 0.588 ± 0.24
LysAsp: 3.362 ± 0.621
LysGlu: 3.026 ± 0.513
LysPhe: 1.681 ± 0.349
LysGly: 3.026 ± 0.5
LysHis: 0.841 ± 0.294
LysIle: 3.446 ± 0.445
LysLys: 4.371 ± 0.682
LysLeu: 4.707 ± 0.803
LysMet: 1.933 ± 0.522
LysAsn: 3.446 ± 0.447
LysPro: 3.614 ± 0.487
LysGln: 2.354 ± 0.492
LysArg: 4.539 ± 0.674
LysSer: 4.119 ± 0.494
LysThr: 2.774 ± 0.633
LysVal: 3.11 ± 0.648
LysTrp: 1.177 ± 0.29
LysTyr: 0.841 ± 0.279
LysXaa: 0.0 ± 0.0
Leu
LeuAla: 9.078 ± 0.776
LeuCys: 1.765 ± 0.386
LeuAsp: 3.446 ± 0.464
LeuGlu: 4.623 ± 0.765
LeuPhe: 3.951 ± 0.742
LeuGly: 4.539 ± 0.716
LeuHis: 1.345 ± 0.324
LeuIle: 5.8 ± 0.741
LeuLys: 5.548 ± 0.708
LeuLeu: 6.977 ± 1.022
LeuMet: 2.017 ± 0.472
LeuAsn: 5.548 ± 0.822
LeuPro: 3.782 ± 0.612
LeuGln: 3.362 ± 0.503
LeuArg: 5.548 ± 0.707
LeuSer: 5.464 ± 0.909
LeuThr: 4.791 ± 0.54
LeuVal: 4.623 ± 0.751
LeuTrp: 1.429 ± 0.391
LeuTyr: 1.849 ± 0.458
LeuXaa: 0.0 ± 0.0
Met
MetAla: 2.185 ± 0.421
MetCys: 0.252 ± 0.142
MetAsp: 1.429 ± 0.349
MetGlu: 1.093 ± 0.339
MetPhe: 0.841 ± 0.29
MetGly: 1.849 ± 0.321
MetHis: 0.336 ± 0.222
MetIle: 1.681 ± 0.37
MetLys: 1.849 ± 0.408
MetLeu: 3.026 ± 0.41
MetMet: 0.42 ± 0.309
MetAsn: 1.429 ± 0.353
MetPro: 0.925 ± 0.276
MetGln: 1.093 ± 0.344
MetArg: 2.438 ± 0.504
MetSer: 2.017 ± 0.342
MetThr: 2.858 ± 0.38
MetVal: 1.429 ± 0.371
MetTrp: 0.42 ± 0.154
MetTyr: 0.336 ± 0.16
MetXaa: 0.0 ± 0.0
Asn
AsnAla: 5.043 ± 0.842
AsnCys: 0.252 ± 0.155
AsnAsp: 2.606 ± 0.393
AsnGlu: 1.849 ± 0.378
AsnPhe: 1.765 ± 0.444
AsnGly: 4.035 ± 0.573
AsnHis: 1.009 ± 0.346
AsnIle: 2.185 ± 0.383
AsnLys: 3.614 ± 0.619
AsnLeu: 2.606 ± 0.525
AsnMet: 1.177 ± 0.352
AsnAsn: 1.681 ± 0.362
AsnPro: 2.606 ± 0.538
AsnGln: 1.849 ± 0.457
AsnArg: 2.101 ± 0.517
AsnSer: 2.269 ± 0.404
AsnThr: 2.522 ± 0.549
AsnVal: 2.269 ± 0.472
AsnTrp: 0.756 ± 0.236
AsnTyr: 1.093 ± 0.29
AsnXaa: 0.0 ± 0.0
Pro
ProAla: 3.951 ± 0.624
ProCys: 0.672 ± 0.222
ProAsp: 3.362 ± 0.547
ProGlu: 3.194 ± 0.526
ProPhe: 1.681 ± 0.416
ProGly: 2.774 ± 0.503
ProHis: 0.756 ± 0.214
ProIle: 2.354 ± 0.457
ProLys: 2.017 ± 0.375
ProLeu: 3.026 ± 0.379
ProMet: 1.093 ± 0.305
ProAsn: 1.765 ± 0.354
ProPro: 1.177 ± 0.307
ProGln: 1.597 ± 0.318
ProArg: 1.849 ± 0.379
ProSer: 2.858 ± 0.48
ProThr: 1.849 ± 0.435
ProVal: 4.203 ± 0.491
ProTrp: 0.504 ± 0.219
ProTyr: 1.765 ± 0.364
ProXaa: 0.0 ± 0.0
Gln
GlnAla: 3.278 ± 0.548
GlnCys: 0.504 ± 0.217
GlnAsp: 1.681 ± 0.366
GlnGlu: 2.522 ± 0.519
GlnPhe: 1.681 ± 0.387
GlnGly: 1.765 ± 0.396
GlnHis: 0.841 ± 0.245
GlnIle: 1.681 ± 0.382
GlnLys: 2.185 ± 0.475
GlnLeu: 3.194 ± 0.531
GlnMet: 1.093 ± 0.331
GlnAsn: 1.429 ± 0.325
GlnPro: 1.261 ± 0.344
GlnGln: 2.438 ± 0.374
GlnArg: 3.53 ± 0.655
GlnSer: 2.354 ± 0.467
GlnThr: 2.438 ± 0.501
GlnVal: 2.101 ± 0.425
GlnTrp: 1.345 ± 0.411
GlnTyr: 1.345 ± 0.302
GlnXaa: 0.0 ± 0.0
Arg
ArgAla: 4.959 ± 0.726
ArgCys: 0.588 ± 0.219
ArgAsp: 4.035 ± 0.586
ArgGlu: 4.203 ± 0.693
ArgPhe: 2.438 ± 0.533
ArgGly: 3.11 ± 0.544
ArgHis: 1.933 ± 0.371
ArgIle: 3.698 ± 0.504
ArgLys: 4.203 ± 0.56
ArgLeu: 5.38 ± 0.657
ArgMet: 1.765 ± 0.393
ArgAsn: 2.101 ± 0.455
ArgPro: 2.185 ± 0.46
ArgGln: 4.119 ± 0.621
ArgArg: 5.211 ± 1.281
ArgSer: 3.698 ± 0.649
ArgThr: 3.53 ± 0.514
ArgVal: 4.287 ± 0.617
ArgTrp: 1.345 ± 0.367
ArgTyr: 1.765 ± 0.44
ArgXaa: 0.0 ± 0.0
Ser
SerAla: 5.548 ± 0.713
SerCys: 0.925 ± 0.243
SerAsp: 3.53 ± 0.698
SerGlu: 3.362 ± 0.497
SerPhe: 2.69 ± 0.566
SerGly: 4.875 ± 0.601
SerHis: 1.345 ± 0.263
SerIle: 3.11 ± 0.734
SerLys: 3.446 ± 0.512
SerLeu: 5.716 ± 0.8
SerMet: 1.849 ± 0.425
SerAsn: 2.017 ± 0.313
SerPro: 2.774 ± 0.466
SerGln: 1.933 ± 0.389
SerArg: 2.858 ± 0.353
SerSer: 3.951 ± 0.592
SerThr: 3.11 ± 0.578
SerVal: 5.295 ± 0.554
SerTrp: 1.093 ± 0.242
SerTyr: 2.185 ± 0.541
SerXaa: 0.0 ± 0.0
Thr
ThrAla: 6.977 ± 1.027
ThrCys: 1.345 ± 0.312
ThrAsp: 3.026 ± 0.508
ThrGlu: 3.782 ± 0.592
ThrPhe: 2.185 ± 0.462
ThrGly: 5.295 ± 0.823
ThrHis: 1.177 ± 0.368
ThrIle: 2.858 ± 0.497
ThrLys: 3.614 ± 0.593
ThrLeu: 5.716 ± 0.94
ThrMet: 0.841 ± 0.292
ThrAsn: 2.522 ± 0.387
ThrPro: 3.026 ± 0.472
ThrGln: 1.261 ± 0.309
ThrArg: 3.278 ± 0.372
ThrSer: 3.53 ± 0.573
ThrThr: 3.867 ± 0.631
ThrVal: 3.782 ± 0.584
ThrTrp: 1.261 ± 0.355
ThrTyr: 1.765 ± 0.347
ThrXaa: 0.0 ± 0.0
Val
ValAla: 4.707 ± 0.662
ValCys: 1.177 ± 0.389
ValAsp: 4.707 ± 0.612
ValGlu: 4.287 ± 0.598
ValPhe: 2.858 ± 0.485
ValGly: 3.782 ± 0.591
ValHis: 0.672 ± 0.23
ValIle: 3.951 ± 0.712
ValLys: 3.446 ± 0.546
ValLeu: 4.791 ± 0.578
ValMet: 1.849 ± 0.404
ValAsn: 2.858 ± 0.568
ValPro: 3.026 ± 0.49
ValGln: 2.438 ± 0.507
ValArg: 2.942 ± 0.561
ValSer: 4.623 ± 0.655
ValThr: 4.287 ± 0.762
ValVal: 5.38 ± 0.612
ValTrp: 0.672 ± 0.233
ValTyr: 2.522 ± 0.454
ValXaa: 0.0 ± 0.0
Trp
TrpAla: 1.177 ± 0.282
TrpCys: 0.672 ± 0.24
TrpAsp: 0.841 ± 0.282
TrpGlu: 1.093 ± 0.327
TrpPhe: 0.841 ± 0.32
TrpGly: 1.177 ± 0.315
TrpHis: 0.504 ± 0.21
TrpIle: 0.756 ± 0.232
TrpLys: 1.681 ± 0.427
TrpLeu: 2.185 ± 0.516
TrpMet: 0.504 ± 0.199
TrpAsn: 0.756 ± 0.246
TrpPro: 0.841 ± 0.278
TrpGln: 1.009 ± 0.243
TrpArg: 1.177 ± 0.295
TrpSer: 1.009 ± 0.25
TrpThr: 1.093 ± 0.329
TrpVal: 1.345 ± 0.345
TrpTrp: 0.252 ± 0.151
TrpTyr: 0.42 ± 0.185
TrpXaa: 0.0 ± 0.0
Tyr
TyrAla: 2.438 ± 0.434
TyrCys: 0.336 ± 0.163
TyrAsp: 1.849 ± 0.382
TyrGlu: 1.345 ± 0.404
TyrPhe: 1.513 ± 0.423
TyrGly: 3.11 ± 0.457
TyrHis: 0.504 ± 0.219
TyrIle: 2.101 ± 0.39
TyrLys: 1.093 ± 0.329
TyrLeu: 2.101 ± 0.382
TyrMet: 1.177 ± 0.317
TyrAsn: 0.504 ± 0.191
TyrPro: 1.765 ± 0.348
TyrGln: 1.093 ± 0.256
TyrArg: 1.849 ± 0.399
TyrSer: 1.765 ± 0.394
TyrThr: 1.849 ± 0.429
TyrVal: 2.017 ± 0.563
TyrTrp: 0.588 ± 0.216
TyrTyr: 1.009 ± 0.271
TyrXaa: 0.0 ± 0.0
Xaa
XaaAla: 0.0 ± 0.0
XaaCys: 0.0 ± 0.0
XaaAsp: 0.0 ± 0.0
XaaGlu: 0.0 ± 0.0
XaaPhe: 0.0 ± 0.0
XaaGly: 0.0 ± 0.0
XaaHis: 0.0 ± 0.0
XaaIle: 0.0 ± 0.0
XaaLys: 0.0 ± 0.0
XaaLeu: 0.0 ± 0.0
XaaMet: 0.0 ± 0.0
XaaAsn: 0.0 ± 0.0
XaaPro: 0.0 ± 0.0
XaaGln: 0.0 ± 0.0
XaaArg: 0.0 ± 0.0
XaaSer: 0.0 ± 0.0
XaaThr: 0.0 ± 0.0
XaaVal: 0.0 ± 0.0
XaaTrp: 0.0 ± 0.0
XaaTyr: 0.0 ± 0.0
XaaXaa: 0.0 ± 0.0
Statistics based on 65 proteins (11898 amino acids)

Note: The error has been estimated by bootstrapping (x100) at the protein level.
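One way to read "bootstrapping at the protein level" is that whole proteins are resampled with replacement and the statistic is recomputed for each resample. The sketch below follows that reading. It is an assumption about the procedure, not the code used by Proteome-pI; the function names, the fixed seed, and the use of the sample standard deviation as the error are all illustrative choices.

```python
import random
import statistics


def dipeptide_frequency(proteins, dipeptide):
    """Per-mille frequency of one dipeptide over all adjacent pairs."""
    total = 0
    hits = 0
    for seq in proteins:
        for i in range(len(seq) - 1):
            total += 1
            if seq[i:i + 2] == dipeptide:
                hits += 1
    return 1000.0 * hits / total if total else 0.0


def bootstrap_error(proteins, dipeptide, n_resamples=100, seed=0):
    """Estimate the error of a dipeptide frequency by resampling proteins.

    Each resample draws len(proteins) proteins with replacement, so whole
    proteins (not individual residues) are the resampling unit.
    """
    rng = random.Random(seed)  # fixed seed only to make the sketch repeatable
    estimates = []
    for _ in range(n_resamples):
        sample = rng.choices(proteins, k=len(proteins))
        estimates.append(dipeptide_frequency(sample, dipeptide))
    return statistics.stdev(estimates)


if __name__ == "__main__":
    toy = ["AAAA", "CCCC", "AACC"]  # toy sequences, not real phage proteins
    print(dipeptide_frequency(toy, "AA"), bootstrap_error(toy, "AA"))
```

Resampling whole proteins keeps the dependence between neighbouring residues within a protein intact, which resampling individual dipeptides would not.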

The dipeptide statistics above (among other statistics for this proteome) can be downloaded from this CSV file
See this proteome in: uniprot_link
Proteome-pI is available under a Creative Commons Attribution-NoDerivs license; for more details see here

Reference: Kozlowski LP. Proteome-pI 2.0: Proteome Isoelectric Point Database Update. Nucleic Acids Res. 2021, doi: 10.1093/nar/gkab944
Contact: Lukasz P. Kozlowski