Amino acid dipeptide frequency for Streptococcus satellite phage Javan595

Apart from single amino acid frequencies, one can also calculate the so-called amino acid dipeptide frequency, i.e. how often each ordered pair of adjacent amino acids occurs.
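The calculation can be illustrated with a minimal Python sketch. It is not the code used by Proteome-pI: the function names are hypothetical, sequences are assumed to be given in one-letter code, pairs are counted with overlap within each protein, and the normalisation by the total number of dipeptides is an assumption, since the page does not state which denominator it uses.

```python
from collections import Counter

def dipeptide_counts(proteins):
    """Count overlapping pairs of adjacent residues within each protein."""
    counts = Counter()
    for seq in proteins:
        # Each sequence is scanned separately, so no pair spans two proteins.
        for i in range(len(seq) - 1):
            counts[seq[i:i + 2]] += 1
    return counts

def dipeptide_per_mille(proteins):
    """Return dipeptide frequencies in per mille of all counted dipeptides.

    Assumption: the denominator is the total number of dipeptides.
    """
    counts = dipeptide_counts(proteins)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {pair: 1000.0 * n / total for pair, n in counts.items()}
```

For the toy input `["MKKL", "AK"]` there are four dipeptides (MK, KK, KL, AK), so each has a frequency of 250‰.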

There are 400 possible dipeptides (441 if Xaa is counted as an additional 21st amino acid representing all non-standard or unknown amino acids). Thus, if dipeptides occurred randomly in proteins, with all amino acids equally likely, each dipeptide would be present at a frequency of 0.25% (2.5‰). As this is not the case in nature, for better visibility all overrepresented dipeptides are marked in red, and underrepresented ones in blue, in the table.
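The expected value and the over/under comparison can be sketched as follows. This is an illustration only: the function names are hypothetical, and using the uniform expectation as the exact colouring threshold is an assumption, since the page does not spell out its rule.

```python
def expected_uniform_per_mille(alphabet_size=20):
    """Expected per-mille frequency of each dipeptide if all were equally likely."""
    return 1000.0 / alphabet_size ** 2

def classify(observed_per_mille, alphabet_size=20):
    """Label a value as 'over', 'under' or 'expected' against the uniform expectation.

    Assumption: the threshold is the uniform expectation itself.
    """
    expected = expected_uniform_per_mille(alphabet_size)
    if observed_per_mille > expected:
        return "over"
    if observed_per_mille < expected:
        return "under"
    return "expected"
```

With 20 amino acids the expectation is 1000/400 = 2.5‰; with Xaa included it is 1000/441, roughly 2.27‰. Under this rule GluLeu (10.771‰) would be labelled "over" and AlaCys (0.829‰) "under".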

All values are given in per mille (‰), so they need to be multiplied by 10⁻³ to obtain fractions.
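As a small worked example of the unit conversion (the helper names are hypothetical):

```python
def per_mille_to_fraction(value):
    """Convert a per-mille value to a plain fraction (multiply by 10^-3)."""
    return value * 1e-3

def per_mille_to_percent(value):
    """Convert a per-mille value to percent (divide by 10)."""
    return value * 0.1
```

For instance, the GluLeu value of 10.771‰ corresponds to a fraction of about 0.010771, or about 1.08%.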

For more information, see the sequence space article on Wikipedia.

Ala Cys Asp Glu Phe Gly His Ile Lys Leu Met Asn Pro Gln Arg Ser Thr Val Trp Tyr Xaa
Ala
AlaAla: 3.314 ± 0.938
AlaCys: 0.829 ± 0.564
AlaAsp: 4.143 ± 2.086
AlaGlu: 2.486 ± 0.99
AlaPhe: 1.243 ± 0.843
AlaGly: 3.314 ± 1.361
AlaHis: 0.829 ± 0.843
AlaIle: 3.314 ± 0.974
AlaLys: 4.143 ± 1.302
AlaLeu: 2.9 ± 1.185
AlaMet: 1.657 ± 1.334
AlaAsn: 4.557 ± 1.018
AlaPro: 0.829 ± 0.486
AlaGln: 1.243 ± 0.733
AlaArg: 2.9 ± 0.858
AlaSer: 2.9 ± 1.145
AlaThr: 3.314 ± 0.805
AlaVal: 2.071 ± 0.898
AlaTrp: 1.243 ± 0.682
AlaTyr: 2.486 ± 0.937
AlaXaa: 0.0 ± 0.0
Cys
CysAla: 0.0 ± 0.0
CysCys: 0.0 ± 0.0
CysAsp: 0.414 ± 0.402
CysGlu: 0.414 ± 0.373
CysPhe: 0.0 ± 0.0
CysGly: 0.0 ± 0.0
CysHis: 0.0 ± 0.0
CysIle: 0.414 ± 0.429
CysLys: 0.414 ± 0.402
CysLeu: 0.0 ± 0.0
CysMet: 0.0 ± 0.0
CysAsn: 0.414 ± 0.421
CysPro: 0.829 ± 0.858
CysGln: 0.0 ± 0.0
CysArg: 0.0 ± 0.0
CysSer: 0.0 ± 0.0
CysThr: 0.0 ± 0.0
CysVal: 0.414 ± 0.376
CysTrp: 0.0 ± 0.0
CysTyr: 0.0 ± 0.0
CysXaa: 0.0 ± 0.0
Asp
AspAla: 2.486 ± 0.87
AspCys: 0.0 ± 0.0
AspAsp: 4.971 ± 1.788
AspGlu: 3.728 ± 0.987
AspPhe: 2.071 ± 0.72
AspGly: 5.385 ± 1.151
AspHis: 0.0 ± 0.0
AspIle: 4.971 ± 1.62
AspLys: 6.628 ± 1.218
AspLeu: 4.557 ± 1.014
AspMet: 3.314 ± 1.09
AspAsn: 5.8 ± 1.649
AspPro: 0.414 ± 0.431
AspGln: 0.414 ± 0.382
AspArg: 2.9 ± 1.235
AspSer: 4.143 ± 1.18
AspThr: 2.9 ± 1.159
AspVal: 2.9 ± 0.68
AspTrp: 0.829 ± 0.536
AspTyr: 3.314 ± 1.295
AspXaa: 0.0 ± 0.0
Glu
GluAla: 2.486 ± 1.458
GluCys: 0.414 ± 0.429
GluAsp: 4.557 ± 1.242
GluGlu: 8.285 ± 2.957
GluPhe: 2.486 ± 1.124
GluGly: 2.071 ± 0.702
GluHis: 1.657 ± 0.734
GluIle: 8.285 ± 2.85
GluLys: 7.042 ± 1.989
GluLeu: 10.771 ± 2.222
GluMet: 2.486 ± 1.097
GluAsn: 4.143 ± 1.058
GluPro: 3.314 ± 0.779
GluGln: 3.728 ± 0.738
GluArg: 5.385 ± 1.622
GluSer: 2.9 ± 1.116
GluThr: 3.728 ± 1.124
GluVal: 4.971 ± 1.383
GluTrp: 1.243 ± 0.838
GluTyr: 2.071 ± 0.708
GluXaa: 0.0 ± 0.0
Phe
PheAla: 1.243 ± 0.642
PheCys: 0.0 ± 0.0
PheAsp: 3.314 ± 1.307
PheGlu: 3.728 ± 1.615
PhePhe: 1.657 ± 0.733
PheGly: 2.9 ± 1.655
PheHis: 0.414 ± 0.431
PheIle: 4.557 ± 1.871
PheLys: 2.486 ± 0.981
PheLeu: 2.071 ± 0.887
PheMet: 1.243 ± 0.686
PheAsn: 1.657 ± 0.782
PhePro: 1.657 ± 0.832
PheGln: 0.829 ± 0.58
PheArg: 0.414 ± 0.473
PheSer: 3.728 ± 0.812
PheThr: 1.657 ± 0.574
PheVal: 1.657 ± 0.818
PheTrp: 0.414 ± 0.376
PheTyr: 1.243 ± 0.562
PheXaa: 0.0 ± 0.0
Gly
GlyAla: 2.071 ± 0.898
GlyCys: 0.414 ± 0.376
GlyAsp: 1.657 ± 0.663
GlyGlu: 5.385 ± 1.491
GlyPhe: 1.657 ± 0.831
GlyGly: 1.243 ± 0.649
GlyHis: 0.829 ± 0.533
GlyIle: 4.557 ± 1.293
GlyLys: 6.214 ± 1.892
GlyLeu: 4.971 ± 1.596
GlyMet: 0.414 ± 0.429
GlyAsn: 4.557 ± 1.062
GlyPro: 0.0 ± 0.0
GlyGln: 2.071 ± 0.887
GlyArg: 1.657 ± 1.017
GlySer: 1.657 ± 0.771
GlyThr: 4.143 ± 1.208
GlyVal: 4.143 ± 1.591
GlyTrp: 0.414 ± 0.486
GlyTyr: 2.486 ± 1.045
GlyXaa: 0.0 ± 0.0
His
HisAla: 0.829 ± 0.605
HisCys: 0.0 ± 0.0
HisAsp: 1.243 ± 0.565
HisGlu: 1.657 ± 0.885
HisPhe: 0.0 ± 0.0
HisGly: 1.243 ± 0.734
HisHis: 0.414 ± 0.402
HisIle: 1.243 ± 0.75
HisLys: 0.414 ± 0.373
HisLeu: 2.486 ± 0.989
HisMet: 0.0 ± 0.0
HisAsn: 2.071 ± 0.895
HisPro: 0.414 ± 0.426
HisGln: 0.414 ± 0.402
HisArg: 1.243 ± 0.633
HisSer: 0.414 ± 0.431
HisThr: 0.829 ± 0.58
HisVal: 0.414 ± 0.486
HisTrp: 0.0 ± 0.0
HisTyr: 0.829 ± 0.588
HisXaa: 0.0 ± 0.0
Ile
IleAla: 5.8 ± 1.939
IleCys: 0.414 ± 0.429
IleAsp: 5.385 ± 2.027
IleGlu: 7.457 ± 2.042
IlePhe: 3.314 ± 1.081
IleGly: 2.071 ± 0.751
IleHis: 2.486 ± 0.893
IleIle: 6.214 ± 1.443
IleLys: 7.457 ± 1.557
IleLeu: 6.214 ± 1.521
IleMet: 0.829 ± 0.472
IleAsn: 6.628 ± 2.442
IlePro: 3.728 ± 1.373
IleGln: 2.9 ± 0.875
IleArg: 4.557 ± 1.013
IleSer: 5.8 ± 1.044
IleThr: 5.385 ± 1.002
IleVal: 2.486 ± 0.902
IleTrp: 0.829 ± 0.617
IleTyr: 2.071 ± 0.846
IleXaa: 0.0 ± 0.0
Lys
LysAla: 6.214 ± 1.771
LysCys: 0.0 ± 0.0
LysAsp: 5.385 ± 1.447
LysGlu: 9.942 ± 1.691
LysPhe: 2.486 ± 1.142
LysGly: 5.385 ± 1.594
LysHis: 1.243 ± 0.682
LysIle: 7.457 ± 1.808
LysLys: 8.699 ± 2.153
LysLeu: 9.942 ± 1.976
LysMet: 0.414 ± 0.388
LysAsn: 4.557 ± 1.082
LysPro: 2.9 ± 0.953
LysGln: 7.457 ± 1.262
LysArg: 2.486 ± 0.708
LysSer: 6.628 ± 1.667
LysThr: 5.8 ± 1.717
LysVal: 4.143 ± 1.319
LysTrp: 2.071 ± 1.068
LysTyr: 3.728 ± 1.058
LysXaa: 0.0 ± 0.0
Leu
LeuAla: 6.628 ± 1.319
LeuCys: 0.0 ± 0.0
LeuAsp: 5.385 ± 1.648
LeuGlu: 10.356 ± 1.921
LeuPhe: 2.9 ± 0.786
LeuGly: 6.214 ± 2.311
LeuHis: 2.486 ± 0.937
LeuIle: 6.214 ± 1.607
LeuLys: 6.214 ± 1.31
LeuLeu: 10.356 ± 2.224
LeuMet: 1.657 ± 1.063
LeuAsn: 3.728 ± 1.023
LeuPro: 2.486 ± 0.822
LeuGln: 2.486 ± 1.021
LeuArg: 2.486 ± 0.85
LeuSer: 6.628 ± 1.275
LeuThr: 7.871 ± 1.481
LeuVal: 2.9 ± 1.025
LeuTrp: 0.829 ± 0.573
LeuTyr: 4.971 ± 1.159
LeuXaa: 0.0 ± 0.0
Met
MetAla: 3.314 ± 1.545
MetCys: 0.0 ± 0.0
MetAsp: 2.486 ± 0.971
MetGlu: 0.829 ± 0.675
MetPhe: 0.829 ± 0.583
MetGly: 0.829 ± 0.587
MetHis: 0.0 ± 0.0
MetIle: 0.829 ± 0.491
MetLys: 1.657 ± 0.752
MetLeu: 5.385 ± 1.26
MetMet: 1.657 ± 0.837
MetAsn: 1.243 ± 0.679
MetPro: 1.243 ± 0.671
MetGln: 1.657 ± 0.797
MetArg: 1.243 ± 0.589
MetSer: 0.414 ± 0.436
MetThr: 2.071 ± 0.814
MetVal: 1.243 ± 0.943
MetTrp: 0.0 ± 0.0
MetTyr: 0.0 ± 0.0
MetXaa: 0.0 ± 0.0
Asn
AsnAla: 2.486 ± 0.783
AsnCys: 0.414 ± 0.402
AsnAsp: 4.557 ± 1.007
AsnGlu: 4.557 ± 1.416
AsnPhe: 2.486 ± 0.842
AsnGly: 2.9 ± 0.908
AsnHis: 0.414 ± 0.388
AsnIle: 4.971 ± 1.817
AsnLys: 7.457 ± 1.698
AsnLeu: 4.557 ± 1.82
AsnMet: 2.071 ± 0.794
AsnAsn: 2.486 ± 0.944
AsnPro: 2.9 ± 0.949
AsnGln: 3.728 ± 0.991
AsnArg: 1.657 ± 1.0
AsnSer: 3.314 ± 0.895
AsnThr: 4.971 ± 1.353
AsnVal: 1.657 ± 0.921
AsnTrp: 1.243 ± 0.625
AsnTyr: 1.657 ± 0.637
AsnXaa: 0.0 ± 0.0
Pro
ProAla: 1.243 ± 0.788
ProCys: 0.0 ± 0.0
ProAsp: 1.243 ± 0.685
ProGlu: 2.486 ± 1.242
ProPhe: 2.9 ± 0.924
ProGly: 0.414 ± 0.421
ProHis: 0.414 ± 0.486
ProIle: 3.728 ± 1.205
ProLys: 3.728 ± 1.324
ProLeu: 1.243 ± 0.59
ProMet: 1.243 ± 0.82
ProAsn: 2.9 ± 1.064
ProPro: 2.071 ± 1.186
ProGln: 0.414 ± 0.376
ProArg: 2.9 ± 1.16
ProSer: 0.414 ± 0.431
ProThr: 3.314 ± 1.119
ProVal: 1.243 ± 0.683
ProTrp: 0.0 ± 0.0
ProTyr: 1.243 ± 0.631
ProXaa: 0.0 ± 0.0
Gln
GlnAla: 2.486 ± 0.752
GlnCys: 0.0 ± 0.0
GlnAsp: 2.486 ± 1.274
GlnGlu: 1.657 ± 0.802
GlnPhe: 0.414 ± 0.432
GlnGly: 3.314 ± 0.964
GlnHis: 0.0 ± 0.0
GlnIle: 2.486 ± 0.73
GlnLys: 5.385 ± 1.696
GlnLeu: 2.9 ± 0.95
GlnMet: 2.071 ± 0.655
GlnAsn: 2.9 ± 0.8
GlnPro: 0.829 ± 0.541
GlnGln: 3.314 ± 1.578
GlnArg: 2.071 ± 0.962
GlnSer: 4.971 ± 0.924
GlnThr: 2.071 ± 0.616
GlnVal: 2.486 ± 0.954
GlnTrp: 0.0 ± 0.0
GlnTyr: 1.243 ± 0.605
GlnXaa: 0.0 ± 0.0
Arg
ArgAla: 0.829 ± 0.526
ArgCys: 0.0 ± 0.0
ArgAsp: 1.243 ± 0.637
ArgGlu: 4.143 ± 1.311
ArgPhe: 2.071 ± 0.828
ArgGly: 2.071 ± 0.776
ArgHis: 0.0 ± 0.0
ArgIle: 4.143 ± 1.284
ArgLys: 5.8 ± 1.326
ArgLeu: 2.071 ± 0.869
ArgMet: 1.657 ± 0.751
ArgAsn: 2.071 ± 0.807
ArgPro: 0.414 ± 0.486
ArgGln: 1.657 ± 0.642
ArgArg: 0.829 ± 0.751
ArgSer: 1.243 ± 0.694
ArgThr: 2.9 ± 1.108
ArgVal: 4.557 ± 1.269
ArgTrp: 1.243 ± 0.71
ArgTyr: 1.243 ± 0.753
ArgXaa: 0.0 ± 0.0
Ser
SerAla: 2.486 ± 1.146
SerCys: 0.414 ± 0.421
SerAsp: 4.143 ± 1.088
SerGlu: 5.8 ± 2.16
SerPhe: 2.071 ± 0.823
SerGly: 1.657 ± 0.752
SerHis: 0.829 ± 0.635
SerIle: 3.728 ± 1.154
SerLys: 8.285 ± 1.536
SerLeu: 3.728 ± 1.189
SerMet: 2.9 ± 1.226
SerAsn: 2.071 ± 0.731
SerPro: 2.486 ± 1.007
SerGln: 4.143 ± 1.007
SerArg: 0.829 ± 0.858
SerSer: 4.557 ± 1.417
SerThr: 2.071 ± 0.885
SerVal: 4.557 ± 1.347
SerTrp: 0.0 ± 0.0
SerTyr: 2.071 ± 0.917
SerXaa: 0.0 ± 0.0
Thr
ThrAla: 2.071 ± 0.832
ThrCys: 0.414 ± 0.429
ThrAsp: 2.9 ± 1.096
ThrGlu: 4.557 ± 1.15
ThrPhe: 1.243 ± 0.689
ThrGly: 5.385 ± 1.485
ThrHis: 2.071 ± 0.818
ThrIle: 7.457 ± 1.231
ThrLys: 5.385 ± 1.699
ThrLeu: 5.385 ± 1.783
ThrMet: 0.0 ± 0.0
ThrAsn: 1.657 ± 0.637
ThrPro: 4.143 ± 1.281
ThrGln: 4.143 ± 1.366
ThrArg: 3.314 ± 1.376
ThrSer: 2.486 ± 1.102
ThrThr: 5.385 ± 1.585
ThrVal: 4.143 ± 1.312
ThrTrp: 0.0 ± 0.0
ThrTyr: 3.314 ± 0.979
ThrXaa: 0.0 ± 0.0
Val
ValAla: 1.657 ± 0.668
ValCys: 0.0 ± 0.0
ValAsp: 2.9 ± 0.988
ValGlu: 2.9 ± 0.861
ValPhe: 2.071 ± 1.063
ValGly: 2.486 ± 1.008
ValHis: 0.414 ± 0.382
ValIle: 4.143 ± 1.343
ValLys: 5.385 ± 1.644
ValLeu: 4.971 ± 1.087
ValMet: 2.071 ± 0.856
ValAsn: 4.143 ± 1.351
ValPro: 1.657 ± 0.808
ValGln: 0.829 ± 0.531
ValArg: 2.486 ± 1.019
ValSer: 2.9 ± 1.393
ValThr: 2.486 ± 1.257
ValVal: 2.9 ± 1.522
ValTrp: 0.414 ± 0.382
ValTyr: 2.9 ± 1.556
ValXaa: 0.0 ± 0.0
Trp
TrpAla: 1.243 ± 0.676
TrpCys: 0.0 ± 0.0
TrpAsp: 0.829 ± 0.52
TrpGlu: 1.243 ± 0.633
TrpPhe: 0.414 ± 0.388
TrpGly: 0.0 ± 0.0
TrpHis: 0.414 ± 0.359
TrpIle: 0.0 ± 0.0
TrpLys: 0.414 ± 0.376
TrpLeu: 2.071 ± 0.958
TrpMet: 0.414 ± 0.446
TrpAsn: 0.0 ± 0.0
TrpPro: 0.0 ± 0.0
TrpGln: 0.0 ± 0.0
TrpArg: 0.829 ± 0.843
TrpSer: 0.414 ± 0.376
TrpThr: 1.657 ± 0.777
TrpVal: 0.0 ± 0.0
TrpTrp: 0.414 ± 0.382
TrpTyr: 0.829 ± 0.52
TrpXaa: 0.0 ± 0.0
Tyr
TyrAla: 0.829 ± 0.538
TyrCys: 0.0 ± 0.0
TyrAsp: 2.486 ± 1.105
TyrGlu: 0.414 ± 0.436
TyrPhe: 4.557 ± 1.086
TyrGly: 1.657 ± 0.73
TyrHis: 1.243 ± 0.681
TyrIle: 3.314 ± 1.057
TyrLys: 3.728 ± 1.336
TyrLeu: 6.214 ± 2.144
TyrMet: 0.414 ± 0.376
TyrAsn: 2.9 ± 1.032
TyrPro: 0.829 ± 0.67
TyrGln: 1.657 ± 0.644
TyrArg: 0.414 ± 0.429
TyrSer: 3.314 ± 1.406
TyrThr: 2.9 ± 1.03
TyrVal: 0.829 ± 0.541
TyrTrp: 0.0 ± 0.0
TyrTyr: 1.243 ± 0.681
TyrXaa: 0.0 ± 0.0
Xaa
XaaAla: 0.0 ± 0.0
XaaCys: 0.0 ± 0.0
XaaAsp: 0.0 ± 0.0
XaaGlu: 0.0 ± 0.0
XaaPhe: 0.0 ± 0.0
XaaGly: 0.0 ± 0.0
XaaHis: 0.0 ± 0.0
XaaIle: 0.0 ± 0.0
XaaLys: 0.0 ± 0.0
XaaLeu: 0.0 ± 0.0
XaaMet: 0.0 ± 0.0
XaaAsn: 0.0 ± 0.0
XaaPro: 0.0 ± 0.0
XaaGln: 0.0 ± 0.0
XaaArg: 0.0 ± 0.0
XaaSer: 0.0 ± 0.0
XaaThr: 0.0 ± 0.0
XaaVal: 0.0 ± 0.0
XaaTrp: 0.0 ± 0.0
XaaTyr: 0.0 ± 0.0
XaaXaa: 0.0 ± 0.0
Statistics based on 19 proteins (2415 amino acids)
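
The table values look like multiples of roughly 0.414‰, which is what a single occurrence out of 2415 would give (1/2415 ≈ 0.414‰). Assuming, and this is only an inference from the numbers and not something the page states, that the values are normalised by the 2415 amino acids, a per-mille value can be turned back into an approximate raw count. The function name is hypothetical.

```python
TOTAL_AMINO_ACIDS = 2415  # taken from the "Statistics based on ..." line

def implied_count(per_mille, total=TOTAL_AMINO_ACIDS):
    """Nearest whole count for a per-mille value.

    Assumption: the table is normalised by the total number of amino acids.
    """
    return round(per_mille * total / 1000.0)
```

Under this assumption, 0.414‰ corresponds to 1 occurrence and the GluLeu value of 10.771‰ to 26 occurrences.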

Note: The error was estimated by bootstrapping (×100) at the protein level
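
A protein-level bootstrap of this kind can be sketched as below. This is a minimal illustration, not the Proteome-pI implementation: the function names are hypothetical, sequences are assumed to be in one-letter code, the frequency is normalised by the number of dipeptides, and the reported error is assumed to be the sample standard deviation over the resamples.

```python
import random
import statistics
from collections import Counter

def per_mille(proteins, pair):
    """Per-mille frequency of one dipeptide over a list of sequences."""
    counts = Counter(
        seq[i:i + 2] for seq in proteins for i in range(len(seq) - 1)
    )
    total = sum(counts.values())
    return 1000.0 * counts[pair] / total if total else 0.0

def bootstrap_error(proteins, pair, n_resamples=100, seed=0):
    """Spread of the frequency over resampled proteomes.

    Whole proteins are drawn with replacement, so the resampling unit is
    the protein, not the individual residue or dipeptide.
    """
    rng = random.Random(seed)  # fixed seed keeps the sketch reproducible
    estimates = []
    for _ in range(n_resamples):
        sample = rng.choices(proteins, k=len(proteins))
        estimates.append(per_mille(sample, pair))
    return statistics.stdev(estimates)
```

With only 19 proteins, each resample can differ noticeably from the original set, which is consistent with the fairly large errors relative to the values in the table.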

The dipeptide statistics above (along with other statistics for this proteome) can be downloaded from this CSV file
See this proteome in: uniprot_link
Proteome-pI is available under a Creative Commons Attribution-NoDerivs license; for more details see here

Reference: Kozlowski LP. Proteome-pI 2.0: Proteome Isoelectric Point Database Update. Nucleic Acids Res. 2021, doi: 10.1093/nar/gkab944
Contact: Lukasz P. Kozlowski