Amino acid dipeptide frequency for Streptococcus satellite phage Javan219

Apart from single amino acid frequencies, one can also calculate the so-called amino acid dipeptide frequency.

There are 400 possible dipeptides (441 if Xaa is counted as an additional, 21st amino acid representing all non-standard or unknown amino acids). Thus, if dipeptides occurred randomly in proteins, each amino acid dipeptide would be present at a frequency of 0.25% (1/400). As this is not the case in nature, for better visibility all dipeptides that are more frequent than expected are marked in red, and those that are underrepresented are marked in blue in the table.
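The counting described above can be sketched in a few lines of Python. This is only an illustration, not the database's own pipeline: the function name `dipeptide_frequencies`, the one-letter alphabet, the toy sequences, and the choice to count overlapping pairs within each protein (never across protein boundaries) are all assumptions made for the example.

```python
from collections import Counter
from itertools import product

# 20 standard residues in one-letter code; anything else is pooled into "X" (Xaa).
STANDARD = "ACDEFGHIKLMNPQRSTVWY"


def dipeptide_frequencies(proteins):
    """Return {dipeptide: frequency in per mille} for all 441 possible pairs."""
    counts = Counter()
    for seq in proteins:
        # Map non-standard or unknown residues to X (assumed convention).
        seq = "".join(aa if aa in STANDARD else "X" for aa in seq.upper())
        # Count overlapping pairs inside this protein only (assumption).
        counts.update(seq[i:i + 2] for i in range(len(seq) - 1))
    total = sum(counts.values())
    if total == 0:
        raise ValueError("need at least one protein with two or more residues")
    alphabet = STANDARD + "X"
    return {a + b: 1000.0 * counts[a + b] / total
            for a, b in product(alphabet, repeat=2)}


proteins = ["MKLLEK", "MAXK"]      # made-up toy sequences, not phage data
freqs = dipeptide_frequencies(proteins)
uniform = 1000.0 / 400             # 0.25% expressed in per mille

for pair in ("MK", "LL", "WW"):
    label = "over" if freqs[pair] > uniform else "under"
    print(pair, freqs[pair], label)
```

With real data, `proteins` would hold the sequences of the proteome, and each frequency would be compared with the uniform expectation to decide whether a dipeptide counts as over- or underrepresented.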

All values are given in per mille (‰), so they need to be multiplied by 10⁻³ to obtain fractions.
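As a small worked example of this unit conversion, the sketch below turns two values from the table (LeuLys and AlaAla) into fractions and compares them with the uniform expectation of 1/400. The helper names are hypothetical, and using 1/400 as the reference for over- or underrepresentation is an assumption based on the description above.

```python
UNIFORM_FRACTION = 1 / 400  # 0.25%, expected if all 400 dipeptides were equally likely


def per_mille_to_fraction(value_per_mille):
    """Convert a per mille value to a plain fraction (multiply by 10^-3)."""
    return value_per_mille * 1e-3


def enrichment(value_per_mille):
    """Ratio of the observed frequency to the uniform expectation."""
    return per_mille_to_fraction(value_per_mille) / UNIFORM_FRACTION


# Two values taken from the table: LeuLys and AlaAla.
for name, value in [("LeuLys", 11.781), ("AlaAla", 0.347)]:
    ratio = enrichment(value)
    status = "overrepresented" if ratio > 1 else "underrepresented"
    print(f"{name}: fraction={per_mille_to_fraction(value):.6f}, "
          f"ratio={ratio:.2f}, {status}")
```

A ratio above 1 means the dipeptide occurs more often than the uniform expectation, and a ratio below 1 means it occurs less often.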

For more information, see the sequence space article on Wikipedia.

Ala Cys Asp Glu Phe Gly His Ile Lys Leu Met Asn Pro Gln Arg Ser Thr Val Trp Tyr Xaa
Ala
AlaAla: 0.347 ± 0.346
AlaCys: 0.0 ± 0.0
AlaAsp: 3.465 ± 1.085
AlaGlu: 2.426 ± 0.871
AlaPhe: 2.772 ± 0.921
AlaGly: 2.772 ± 1.004
AlaHis: 0.693 ± 0.431
AlaIle: 2.079 ± 0.669
AlaLys: 4.158 ± 1.89
AlaLeu: 3.465 ± 1.049
AlaMet: 1.04 ± 0.674
AlaAsn: 2.426 ± 0.77
AlaPro: 1.386 ± 0.569
AlaGln: 1.386 ± 0.62
AlaArg: 2.772 ± 0.903
AlaSer: 3.812 ± 0.905
AlaThr: 2.079 ± 1.025
AlaVal: 3.812 ± 1.341
AlaTrp: 0.693 ± 0.52
AlaTyr: 2.426 ± 0.881
AlaXaa: 0.0 ± 0.0
Cys
CysAla: 0.0 ± 0.0
CysCys: 0.0 ± 0.0
CysAsp: 0.693 ± 0.451
CysGlu: 0.0 ± 0.0
CysPhe: 0.347 ± 0.298
CysGly: 0.0 ± 0.0
CysHis: 0.347 ± 0.391
CysIle: 0.693 ± 0.634
CysLys: 0.693 ± 0.802
CysLeu: 1.386 ± 0.57
CysMet: 0.347 ± 0.298
CysAsn: 0.347 ± 0.391
CysPro: 0.693 ± 0.547
CysGln: 0.693 ± 0.484
CysArg: 0.0 ± 0.0
CysSer: 0.693 ± 0.545
CysThr: 0.0 ± 0.0
CysVal: 0.0 ± 0.0
CysTrp: 0.0 ± 0.0
CysTyr: 0.0 ± 0.0
CysXaa: 0.0 ± 0.0
Asp
AspAla: 0.693 ± 0.404
AspCys: 1.386 ± 0.71
AspAsp: 4.158 ± 1.168
AspGlu: 3.812 ± 1.631
AspPhe: 3.465 ± 0.956
AspGly: 2.772 ± 1.292
AspHis: 1.04 ± 0.697
AspIle: 4.505 ± 1.196
AspLys: 3.465 ± 1.033
AspLeu: 5.198 ± 1.274
AspMet: 1.386 ± 0.477
AspAsn: 7.623 ± 1.408
AspPro: 1.386 ± 0.744
AspGln: 1.386 ± 0.794
AspArg: 2.079 ± 0.978
AspSer: 3.119 ± 0.853
AspThr: 2.079 ± 1.173
AspVal: 4.158 ± 1.053
AspTrp: 0.347 ± 0.336
AspTyr: 4.158 ± 1.03
AspXaa: 0.0 ± 0.0
Glu
GluAla: 6.237 ± 1.629
GluCys: 0.347 ± 0.324
GluAsp: 3.465 ± 1.306
GluGlu: 5.198 ± 1.561
GluPhe: 2.079 ± 0.79
GluGly: 1.386 ± 0.74
GluHis: 1.04 ± 0.455
GluIle: 6.584 ± 1.872
GluLys: 9.356 ± 2.323
GluLeu: 10.742 ± 3.074
GluMet: 1.386 ± 0.538
GluAsn: 5.544 ± 1.171
GluPro: 1.386 ± 0.7
GluGln: 4.851 ± 1.06
GluArg: 4.158 ± 1.145
GluSer: 3.119 ± 1.509
GluThr: 5.544 ± 1.086
GluVal: 4.505 ± 1.56
GluTrp: 1.386 ± 0.694
GluTyr: 2.772 ± 1.136
GluXaa: 0.0 ± 0.0
Phe
PheAla: 0.347 ± 0.346
PheCys: 0.693 ± 0.547
PheAsp: 2.079 ± 0.668
PheGlu: 3.812 ± 1.017
PhePhe: 2.079 ± 0.909
PheGly: 2.426 ± 0.923
PheHis: 0.693 ± 0.44
PheIle: 1.733 ± 0.634
PheLys: 5.544 ± 1.736
PheLeu: 2.772 ± 0.944
PheMet: 1.04 ± 0.407
PheAsn: 1.04 ± 0.485
PhePro: 1.386 ± 0.661
PheGln: 0.693 ± 0.5
PheArg: 2.426 ± 0.827
PheSer: 4.158 ± 1.176
PheThr: 1.733 ± 0.657
PheVal: 1.733 ± 0.658
PheTrp: 0.0 ± 0.0
PheTyr: 2.426 ± 0.814
PheXaa: 0.0 ± 0.0
Gly
GlyAla: 1.733 ± 0.903
GlyCys: 0.0 ± 0.0
GlyAsp: 2.426 ± 0.714
GlyGlu: 2.426 ± 1.255
GlyPhe: 3.812 ± 0.865
GlyGly: 2.079 ± 0.747
GlyHis: 1.386 ± 0.47
GlyIle: 3.465 ± 1.308
GlyLys: 5.198 ± 1.282
GlyLeu: 5.891 ± 1.207
GlyMet: 0.347 ± 0.351
GlyAsn: 3.812 ± 1.125
GlyPro: 0.693 ± 0.545
GlyGln: 0.693 ± 0.423
GlyArg: 0.347 ± 0.298
GlySer: 1.733 ± 0.678
GlyThr: 3.465 ± 0.836
GlyVal: 3.812 ± 1.258
GlyTrp: 0.347 ± 0.324
GlyTyr: 3.465 ± 1.036
GlyXaa: 0.0 ± 0.0
His
HisAla: 2.079 ± 0.807
HisCys: 0.693 ± 0.547
HisAsp: 0.0 ± 0.0
HisGlu: 0.693 ± 0.468
HisPhe: 1.04 ± 0.825
HisGly: 2.079 ± 0.767
HisHis: 0.693 ± 0.456
HisIle: 0.0 ± 0.0
HisLys: 2.426 ± 0.899
HisLeu: 2.426 ± 0.907
HisMet: 0.0 ± 0.0
HisAsn: 1.04 ± 0.825
HisPro: 0.347 ± 0.336
HisGln: 0.347 ± 0.391
HisArg: 0.347 ± 0.298
HisSer: 0.347 ± 0.298
HisThr: 1.04 ± 0.58
HisVal: 0.693 ± 0.498
HisTrp: 0.0 ± 0.0
HisTyr: 0.347 ± 0.401
HisXaa: 0.0 ± 0.0
Ile
IleAla: 3.119 ± 0.836
IleCys: 0.347 ± 0.391
IleAsp: 6.93 ± 2.036
IleGlu: 8.663 ± 1.728
IlePhe: 2.772 ± 0.992
IleGly: 2.772 ± 0.834
IleHis: 0.0 ± 0.0
IleIle: 5.198 ± 1.22
IleLys: 5.198 ± 1.29
IleLeu: 4.505 ± 1.015
IleMet: 1.386 ± 0.622
IleAsn: 3.465 ± 0.772
IlePro: 2.772 ± 1.206
IleGln: 2.772 ± 0.914
IleArg: 3.119 ± 1.668
IleSer: 6.93 ± 1.378
IleThr: 4.505 ± 0.92
IleVal: 3.465 ± 0.875
IleTrp: 0.347 ± 0.3
IleTyr: 4.158 ± 1.069
IleXaa: 0.0 ± 0.0
Lys
LysAla: 5.198 ± 1.243
LysCys: 0.0 ± 0.0
LysAsp: 7.623 ± 1.125
LysGlu: 7.277 ± 1.973
LysPhe: 2.079 ± 1.181
LysGly: 5.544 ± 1.275
LysHis: 2.079 ± 1.068
LysIle: 7.97 ± 1.136
LysLys: 10.395 ± 2.028
LysLeu: 10.049 ± 1.488
LysMet: 2.079 ± 0.764
LysAsn: 7.277 ± 2.251
LysPro: 1.386 ± 0.515
LysGln: 4.505 ± 1.045
LysArg: 6.237 ± 1.622
LysSer: 4.851 ± 0.922
LysThr: 5.544 ± 1.025
LysVal: 5.544 ± 1.081
LysTrp: 1.04 ± 0.836
LysTyr: 5.891 ± 1.407
LysXaa: 0.0 ± 0.0
Leu
LeuAla: 4.158 ± 1.254
LeuCys: 0.0 ± 0.0
LeuAsp: 4.158 ± 1.058
LeuGlu: 10.395 ± 2.594
LeuPhe: 2.079 ± 0.816
LeuGly: 5.198 ± 1.339
LeuHis: 0.347 ± 0.391
LeuIle: 5.544 ± 1.761
LeuLys: 11.781 ± 1.863
LeuLeu: 8.663 ± 1.838
LeuMet: 3.119 ± 0.736
LeuAsn: 6.93 ± 1.579
LeuPro: 3.119 ± 1.185
LeuGln: 3.119 ± 1.244
LeuArg: 4.505 ± 0.897
LeuSer: 9.356 ± 1.859
LeuThr: 6.584 ± 1.48
LeuVal: 4.158 ± 1.175
LeuTrp: 1.386 ± 0.566
LeuTyr: 3.465 ± 0.98
LeuXaa: 0.0 ± 0.0
Met
MetAla: 3.119 ± 0.726
MetCys: 0.0 ± 0.0
MetAsp: 1.386 ± 0.47
MetGlu: 3.119 ± 0.775
MetPhe: 0.693 ± 0.422
MetGly: 0.347 ± 0.346
MetHis: 0.0 ± 0.0
MetIle: 2.079 ± 0.934
MetLys: 1.386 ± 0.578
MetLeu: 1.733 ± 0.688
MetMet: 0.0 ± 0.0
MetAsn: 1.04 ± 0.609
MetPro: 0.0 ± 0.0
MetGln: 0.347 ± 0.368
MetArg: 0.0 ± 0.0
MetSer: 0.693 ± 0.422
MetThr: 1.733 ± 0.773
MetVal: 1.386 ± 0.562
MetTrp: 0.0 ± 0.0
MetTyr: 1.04 ± 0.63
MetXaa: 0.0 ± 0.0
Asn
AsnAla: 3.812 ± 1.31
AsnCys: 1.386 ± 0.797
AsnAsp: 4.505 ± 1.259
AsnGlu: 5.198 ± 1.359
AsnPhe: 3.119 ± 0.986
AsnGly: 3.465 ± 1.145
AsnHis: 1.04 ± 0.468
AsnIle: 4.851 ± 1.363
AsnLys: 6.584 ± 1.841
AsnLeu: 4.505 ± 1.05
AsnMet: 1.386 ± 0.714
AsnAsn: 3.119 ± 1.217
AsnPro: 2.079 ± 0.571
AsnGln: 2.426 ± 1.019
AsnArg: 3.119 ± 1.186
AsnSer: 3.119 ± 1.026
AsnThr: 2.772 ± 1.129
AsnVal: 2.079 ± 0.686
AsnTrp: 1.04 ± 0.65
AsnTyr: 4.158 ± 1.533
AsnXaa: 0.0 ± 0.0
Pro
ProAla: 1.04 ± 0.786
ProCys: 0.0 ± 0.0
ProAsp: 0.347 ± 0.3
ProGlu: 1.733 ± 0.771
ProPhe: 0.693 ± 0.414
ProGly: 0.0 ± 0.0
ProHis: 0.0 ± 0.0
ProIle: 1.04 ± 0.536
ProLys: 3.812 ± 1.264
ProLeu: 2.772 ± 1.001
ProMet: 0.347 ± 0.298
ProAsn: 1.386 ± 0.727
ProPro: 0.693 ± 0.456
ProGln: 2.079 ± 0.867
ProArg: 2.426 ± 0.863
ProSer: 2.426 ± 1.308
ProThr: 2.426 ± 1.017
ProVal: 1.733 ± 0.857
ProTrp: 0.693 ± 0.546
ProTyr: 1.04 ± 0.67
ProXaa: 0.0 ± 0.0
Gln
GlnAla: 1.386 ± 0.541
GlnCys: 0.693 ± 0.377
GlnAsp: 1.386 ± 0.807
GlnGlu: 3.465 ± 0.955
GlnPhe: 1.386 ± 0.839
GlnGly: 1.733 ± 0.658
GlnHis: 1.04 ± 0.506
GlnIle: 2.079 ± 0.96
GlnLys: 4.505 ± 1.071
GlnLeu: 5.891 ± 1.31
GlnMet: 1.733 ± 1.001
GlnAsn: 1.386 ± 0.502
GlnPro: 1.386 ± 0.734
GlnGln: 1.733 ± 1.157
GlnArg: 1.04 ± 0.572
GlnSer: 3.119 ± 1.17
GlnThr: 1.04 ± 0.704
GlnVal: 1.386 ± 0.602
GlnTrp: 0.347 ± 0.3
GlnTyr: 2.079 ± 0.581
GlnXaa: 0.0 ± 0.0
Arg
ArgAla: 2.079 ± 1.556
ArgCys: 0.0 ± 0.0
ArgAsp: 3.465 ± 1.071
ArgGlu: 5.198 ± 1.009
ArgPhe: 2.079 ± 0.807
ArgGly: 2.079 ± 0.824
ArgHis: 1.04 ± 0.631
ArgIle: 4.505 ± 1.337
ArgLys: 2.772 ± 0.865
ArgLeu: 4.851 ± 0.969
ArgMet: 0.347 ± 0.298
ArgAsn: 2.079 ± 1.023
ArgPro: 0.347 ± 0.3
ArgGln: 1.733 ± 0.833
ArgArg: 1.386 ± 0.917
ArgSer: 1.04 ± 0.623
ArgThr: 2.426 ± 0.906
ArgVal: 1.386 ± 0.677
ArgTrp: 0.347 ± 0.309
ArgTyr: 2.426 ± 1.083
ArgXaa: 0.0 ± 0.0
Ser
SerAla: 1.04 ± 0.455
SerCys: 0.693 ± 0.547
SerAsp: 2.772 ± 0.653
SerGlu: 4.158 ± 1.255
SerPhe: 2.079 ± 0.788
SerGly: 2.772 ± 1.197
SerHis: 1.386 ± 0.772
SerIle: 4.158 ± 1.156
SerLys: 6.93 ± 1.432
SerLeu: 6.237 ± 1.633
SerMet: 1.04 ± 0.662
SerAsn: 4.505 ± 1.097
SerPro: 3.119 ± 1.105
SerGln: 3.812 ± 0.899
SerArg: 2.426 ± 0.657
SerSer: 6.584 ± 3.703
SerThr: 6.237 ± 1.636
SerVal: 4.851 ± 1.252
SerTrp: 0.693 ± 0.422
SerTyr: 3.119 ± 0.98
SerXaa: 0.0 ± 0.0
Thr
ThrAla: 2.772 ± 1.052
ThrCys: 0.0 ± 0.0
ThrAsp: 4.851 ± 1.178
ThrGlu: 5.891 ± 1.727
ThrPhe: 2.772 ± 0.767
ThrGly: 3.465 ± 0.996
ThrHis: 1.733 ± 0.789
ThrIle: 4.851 ± 1.236
ThrLys: 6.237 ± 1.21
ThrLeu: 5.891 ± 1.539
ThrMet: 0.693 ± 0.439
ThrAsn: 2.772 ± 0.939
ThrPro: 1.04 ± 0.62
ThrGln: 2.079 ± 0.945
ThrArg: 1.733 ± 0.552
ThrSer: 3.119 ± 0.996
ThrThr: 3.812 ± 2.746
ThrVal: 4.851 ± 1.276
ThrTrp: 0.347 ± 0.346
ThrTyr: 1.386 ± 0.642
ThrXaa: 0.0 ± 0.0
Val
ValAla: 3.119 ± 0.804
ValCys: 0.347 ± 0.401
ValAsp: 2.772 ± 0.927
ValGlu: 3.465 ± 0.838
ValPhe: 2.426 ± 0.73
ValGly: 2.426 ± 1.088
ValHis: 0.347 ± 0.369
ValIle: 6.584 ± 1.907
ValLys: 5.544 ± 1.003
ValLeu: 5.198 ± 1.214
ValMet: 1.386 ± 0.678
ValAsn: 3.119 ± 1.38
ValPro: 2.079 ± 0.545
ValGln: 1.386 ± 0.483
ValArg: 1.04 ± 0.585
ValSer: 5.198 ± 1.946
ValThr: 3.465 ± 0.975
ValVal: 1.04 ± 0.522
ValTrp: 0.347 ± 0.409
ValTyr: 3.119 ± 0.781
ValXaa: 0.0 ± 0.0
Trp
TrpAla: 0.693 ± 0.513
TrpCys: 0.0 ± 0.0
TrpAsp: 0.347 ± 0.351
TrpGlu: 1.04 ± 0.557
TrpPhe: 0.0 ± 0.0
TrpGly: 0.347 ± 0.3
TrpHis: 0.347 ± 0.369
TrpIle: 0.347 ± 0.351
TrpLys: 0.693 ± 0.431
TrpLeu: 1.386 ± 0.57
TrpMet: 0.0 ± 0.0
TrpAsn: 0.0 ± 0.0
TrpPro: 0.0 ± 0.0
TrpGln: 1.04 ± 0.586
TrpArg: 0.0 ± 0.0
TrpSer: 1.386 ± 0.556
TrpThr: 0.0 ± 0.0
TrpVal: 1.733 ± 0.668
TrpTrp: 0.0 ± 0.0
TrpTyr: 0.347 ± 0.3
TrpXaa: 0.0 ± 0.0
Tyr
TyrAla: 1.04 ± 0.572
TyrCys: 0.347 ± 0.3
TyrAsp: 1.386 ± 0.58
TyrGlu: 3.119 ± 0.808
TyrPhe: 1.04 ± 0.578
TyrGly: 3.465 ± 1.233
TyrHis: 1.386 ± 0.482
TyrIle: 4.158 ± 0.991
TyrLys: 5.891 ± 1.121
TyrLeu: 4.851 ± 1.237
TyrMet: 0.693 ± 0.469
TyrAsn: 4.851 ± 1.225
TyrPro: 1.386 ± 0.595
TyrGln: 1.733 ± 0.747
TyrArg: 2.426 ± 0.67
TyrSer: 3.812 ± 1.199
TyrThr: 3.812 ± 0.742
TyrVal: 2.079 ± 0.462
TyrTrp: 0.347 ± 0.401
TyrTyr: 2.426 ± 0.632
TyrXaa: 0.0 ± 0.0
Xaa
XaaAla: 0.0 ± 0.0
XaaCys: 0.0 ± 0.0
XaaAsp: 0.0 ± 0.0
XaaGlu: 0.0 ± 0.0
XaaPhe: 0.0 ± 0.0
XaaGly: 0.0 ± 0.0
XaaHis: 0.0 ± 0.0
XaaIle: 0.0 ± 0.0
XaaLys: 0.0 ± 0.0
XaaLeu: 0.0 ± 0.0
XaaMet: 0.0 ± 0.0
XaaAsn: 0.0 ± 0.0
XaaPro: 0.0 ± 0.0
XaaGln: 0.0 ± 0.0
XaaArg: 0.0 ± 0.0
XaaSer: 0.0 ± 0.0
XaaThr: 0.0 ± 0.0
XaaVal: 0.0 ± 0.0
XaaTrp: 0.0 ± 0.0
XaaTyr: 0.0 ± 0.0
XaaXaa: 0.0 ± 0.0
Statistics based on 16 proteins (2887 amino acids)

Note: The error was estimated by bootstrapping (×100) at the protein level.

The dipeptide statistics above (among other statistics for this proteome) can be downloaded as a CSV file.
See this proteome in: uniprot_link
Proteome-pI is available under a Creative Commons Attribution-NoDerivs license.

Reference: Kozlowski LP. Proteome-pI 2.0: Proteome Isoelectric Point Database Update. Nucleic Acids Res. 2021, doi: 10.1093/nar/gkab944
Contact: Lukasz P. Kozlowski