Amino acid dipeptide frequency for Harrison Dam virus

Apart from single amino acid frequencies, one can also calculate the so-called amino acid dipeptide frequency, that is, how often each ordered pair of adjacent amino acids occurs.
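As a rough illustration of how such frequencies can be computed, the sketch below counts overlapping adjacent residue pairs and reports them in per mille. It is not the code used by the database. The function name `dipeptide_permille`, the toy sequences, the use of one-letter codes (X standing for Xaa), and the choice to count pairs only within each protein are all assumptions made for this example.

```python
from collections import Counter
from itertools import product

STANDARD = "ACDEFGHIKLMNPQRSTVWY"  # 20 standard amino acids, one-letter codes


def dipeptide_permille(proteins):
    """Return {dipeptide: frequency in per mille} for all 441 possible pairs.

    Assumption: pairs are overlapping and never span two proteins.
    """
    alphabet = STANDARD + "X"
    counts = Counter()
    for seq in proteins:
        # Map anything outside the 20 standard residues to X (Xaa).
        cleaned = "".join(c if c in STANDARD else "X" for c in seq.upper())
        # zip pairs each residue with its successor: "MKL" -> MK, KL.
        counts.update(a + b for a, b in zip(cleaned, cleaned[1:]))
    total = sum(counts.values())
    pairs = [a + b for a, b in product(alphabet, repeat=2)]
    if total == 0:
        return {pair: 0.0 for pair in pairs}
    return {pair: 1000.0 * counts[pair] / total for pair in pairs}


# Hypothetical toy proteome, not taken from the Harrison Dam virus data.
toy = ["MKLLA", "MLLK"]
freqs = dipeptide_permille(toy)
print(len(freqs))               # 441 dipeptides, including those with Xaa
print(round(freqs["LL"], 1))    # 2 of 7 pairs are LeuLeu -> 285.7
```

Dipeptides that never occur still appear in the result with a frequency of 0.0, which mirrors the zero entries in the table.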

There are 400 possible dipeptides (441 if Xaa is counted as an additional, 21st amino acid representing all non-standard or unknown amino acids). If dipeptides occurred randomly with equal probability, each of the 400 would therefore be present at a frequency of 0.25% (2.5‰). This is not the case in nature, so for better visibility all overrepresented dipeptides are marked in red and all underrepresented ones in blue in the table.
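The comparison against the uniform expectation can be sketched as follows. The exact rule the database uses for its colouring is not stated here, so the simple threshold at 2.5‰ (ignoring the error estimates) and the helper name `classify` are assumptions; the four example values are copied from the table.

```python
# Uniform expectation: 1 of 400 dipeptides = 0.25 % = 2.5 per mille.
EXPECTED_PERMILLE = 1000.0 / 400


def classify(freq_permille, expected=EXPECTED_PERMILLE):
    """Label a frequency relative to the uniform expectation."""
    if freq_permille > expected:
        return "over"
    if freq_permille < expected:
        return "under"
    return "expected"


# Values taken from the table (per mille).
examples = {"LeuLeu": 10.864, "LysIle": 10.334, "AlaCys": 0.53, "ArgArg": 2.65}
for name, value in examples.items():
    print(name, value, classify(value))
```

A stricter rule could also require the difference from 2.5‰ to exceed the reported error, but that would be a further assumption.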

All values are given in per mille (‰) and therefore need to be multiplied by 10^-3 to obtain fractions.
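For instance, the unit conversion works out as in this small sketch; the helper name `permille_to_fraction` is made up for the example, and the value is the LeuLeu entry from the table.

```python
def permille_to_fraction(value_permille):
    """Convert a per mille value to a plain fraction."""
    return value_permille * 1e-3


leu_leu = 10.864  # LeuLeu frequency from the table, in per mille
fraction = permille_to_fraction(leu_leu)
print(f"{fraction:.6f}")          # 0.010864
print(f"{fraction * 100:.4f} %")  # 1.0864 %
```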

For more information, see the sequence space article on Wikipedia.

Residue order: Ala Cys Asp Glu Phe Gly His Ile Lys Leu Met Asn Pro Gln Arg Ser Thr Val Trp Tyr Xaa
Entries are grouped by the first residue and listed as dipeptide: frequency ± error (‰).

Ala
AlaAla: 2.915 ± 2.61
AlaCys: 0.53 ± 0.49
AlaAsp: 2.915 ± 1.176
AlaGlu: 2.385 ± 0.791
AlaPhe: 2.12 ± 0.575
AlaGly: 2.12 ± 1.164
AlaHis: 0.795 ± 0.312
AlaIle: 1.855 ± 1.07
AlaLys: 3.975 ± 1.347
AlaLeu: 3.975 ± 0.656
AlaMet: 1.06 ± 0.838
AlaAsn: 3.18 ± 0.703
AlaPro: 0.53 ± 0.543
AlaGln: 1.325 ± 0.546
AlaArg: 2.385 ± 0.76
AlaSer: 2.12 ± 1.119
AlaThr: 1.855 ± 0.654
AlaVal: 1.59 ± 0.895
AlaTrp: 0.0 ± 0.0
AlaTyr: 0.795 ± 0.311
AlaXaa: 0.0 ± 0.0

Cys
CysAla: 0.53 ± 0.304
CysCys: 0.0 ± 0.0
CysAsp: 1.59 ± 0.895
CysGlu: 1.59 ± 0.53
CysPhe: 0.795 ± 0.346
CysGly: 2.65 ± 0.896
CysHis: 0.53 ± 0.428
CysIle: 1.59 ± 0.625
CysLys: 1.325 ± 0.971
CysLeu: 2.12 ± 0.67
CysMet: 0.0 ± 0.0
CysAsn: 0.53 ± 0.35
CysPro: 1.06 ± 0.754
CysGln: 1.06 ± 0.497
CysArg: 1.325 ± 0.621
CysSer: 1.06 ± 0.609
CysThr: 0.53 ± 0.291
CysVal: 1.325 ± 0.536
CysTrp: 0.265 ± 0.152
CysTyr: 0.53 ± 0.429
CysXaa: 0.0 ± 0.0

Asp
AspAla: 1.325 ± 0.633
AspCys: 1.855 ± 0.399
AspAsp: 3.975 ± 1.213
AspGlu: 3.18 ± 0.327
AspPhe: 2.385 ± 0.45
AspGly: 4.24 ± 1.879
AspHis: 1.59 ± 0.594
AspIle: 3.18 ± 0.662
AspLys: 2.65 ± 0.744
AspLeu: 8.744 ± 0.691
AspMet: 2.12 ± 0.397
AspAsn: 2.12 ± 0.823
AspPro: 2.385 ± 1.081
AspGln: 2.915 ± 0.564
AspArg: 1.855 ± 0.692
AspSer: 2.915 ± 0.62
AspThr: 1.855 ± 0.757
AspVal: 2.65 ± 0.881
AspTrp: 1.06 ± 0.922
AspTyr: 3.71 ± 0.537
AspXaa: 0.0 ± 0.0

Glu
GluAla: 2.65 ± 1.497
GluCys: 1.59 ± 0.668
GluAsp: 2.65 ± 0.409
GluGlu: 4.769 ± 1.922
GluPhe: 4.24 ± 1.163
GluGly: 3.975 ± 1.444
GluHis: 1.06 ± 0.353
GluIle: 6.889 ± 1.721
GluLys: 7.154 ± 0.95
GluLeu: 4.505 ± 0.941
GluMet: 0.795 ± 0.438
GluAsn: 4.24 ± 0.957
GluPro: 1.855 ± 0.622
GluGln: 1.325 ± 0.686
GluArg: 3.975 ± 0.552
GluSer: 5.564 ± 0.941
GluThr: 4.769 ± 1.333
GluVal: 2.65 ± 1.045
GluTrp: 1.855 ± 0.524
GluTyr: 3.975 ± 0.922
GluXaa: 0.0 ± 0.0

Phe
PheAla: 0.795 ± 0.312
PheCys: 1.325 ± 1.02
PheAsp: 3.975 ± 0.675
PheGlu: 3.975 ± 1.321
PhePhe: 2.385 ± 0.899
PheGly: 2.65 ± 1.32
PheHis: 0.53 ± 0.35
PheIle: 2.65 ± 0.693
PheLys: 4.769 ± 0.821
PheLeu: 5.034 ± 1.042
PheMet: 1.325 ± 0.388
PheAsn: 1.855 ± 0.548
PhePro: 2.12 ± 0.579
PheGln: 2.65 ± 0.647
PheArg: 2.12 ± 0.708
PheSer: 3.71 ± 1.216
PheThr: 1.59 ± 0.495
PheVal: 3.445 ± 0.492
PheTrp: 0.53 ± 0.304
PheTyr: 0.53 ± 0.304
PheXaa: 0.0 ± 0.0

Gly
GlyAla: 2.65 ± 1.203
GlyCys: 0.53 ± 0.304
GlyAsp: 3.975 ± 1.182
GlyGlu: 2.915 ± 1.689
GlyPhe: 3.445 ± 1.567
GlyGly: 3.18 ± 1.496
GlyHis: 0.53 ± 0.268
GlyIle: 5.829 ± 1.212
GlyLys: 3.975 ± 1.101
GlyLeu: 5.829 ± 0.964
GlyMet: 0.53 ± 0.304
GlyAsn: 2.65 ± 0.751
GlyPro: 1.325 ± 0.431
GlyGln: 3.18 ± 0.979
GlyArg: 2.915 ± 1.615
GlySer: 4.24 ± 1.193
GlyThr: 2.385 ± 0.976
GlyVal: 3.18 ± 0.787
GlyTrp: 1.06 ± 0.437
GlyTyr: 1.59 ± 0.597
GlyXaa: 0.0 ± 0.0

His
HisAla: 0.53 ± 0.291
HisCys: 0.53 ± 0.828
HisAsp: 0.795 ± 0.912
HisGlu: 1.855 ± 0.342
HisPhe: 1.59 ± 0.42
HisGly: 1.855 ± 0.626
HisHis: 0.0 ± 0.0
HisIle: 0.795 ± 0.575
HisLys: 1.59 ± 0.672
HisLeu: 1.855 ± 0.813
HisMet: 1.325 ± 0.495
HisAsn: 1.59 ± 0.499
HisPro: 1.06 ± 0.535
HisGln: 0.53 ± 0.363
HisArg: 1.06 ± 0.686
HisSer: 1.59 ± 0.653
HisThr: 0.0 ± 0.0
HisVal: 0.53 ± 0.304
HisTrp: 0.53 ± 0.583
HisTyr: 1.325 ± 0.761
HisXaa: 0.0 ± 0.0

Ile
IleAla: 2.65 ± 0.701
IleCys: 1.855 ± 0.476
IleAsp: 5.564 ± 0.969
IleGlu: 4.769 ± 1.26
IlePhe: 3.71 ± 0.723
IleGly: 5.564 ± 1.308
IleHis: 3.18 ± 0.688
IleIle: 7.949 ± 1.562
IleLys: 8.214 ± 2.084
IleLeu: 5.564 ± 1.036
IleMet: 1.855 ± 1.034
IleAsn: 5.829 ± 0.97
IlePro: 3.71 ± 1.517
IleGln: 2.385 ± 0.685
IleArg: 4.769 ± 0.818
IleSer: 5.299 ± 1.585
IleThr: 5.299 ± 0.999
IleVal: 2.65 ± 0.544
IleTrp: 0.53 ± 0.608
IleTyr: 3.445 ± 0.943
IleXaa: 0.0 ± 0.0

Lys
LysAla: 3.975 ± 0.839
LysCys: 1.59 ± 0.53
LysAsp: 5.564 ± 0.811
LysGlu: 5.034 ± 0.836
LysPhe: 4.24 ± 0.913
LysGly: 3.71 ± 0.943
LysHis: 0.795 ± 0.312
LysIle: 10.334 ± 1.278
LysLys: 4.769 ± 0.893
LysLeu: 5.564 ± 0.645
LysMet: 2.12 ± 0.443
LysAsn: 4.24 ± 0.52
LysPro: 3.18 ± 1.36
LysGln: 1.855 ± 0.673
LysArg: 3.445 ± 0.728
LysSer: 6.889 ± 0.676
LysThr: 4.769 ± 1.396
LysVal: 5.564 ± 1.044
LysTrp: 1.855 ± 0.472
LysTyr: 2.385 ± 1.347
LysXaa: 0.0 ± 0.0

Leu
LeuAla: 5.564 ± 0.732
LeuCys: 0.795 ± 0.312
LeuAsp: 5.299 ± 0.882
LeuGlu: 7.154 ± 1.918
LeuPhe: 3.18 ± 0.74
LeuGly: 4.769 ± 1.044
LeuHis: 1.325 ± 0.796
LeuIle: 10.069 ± 1.06
LeuLys: 7.154 ± 1.536
LeuLeu: 10.864 ± 1.479
LeuMet: 2.385 ± 0.555
LeuAsn: 6.889 ± 1.718
LeuPro: 3.18 ± 0.672
LeuGln: 2.385 ± 1.237
LeuArg: 6.889 ± 2.343
LeuSer: 7.154 ± 1.291
LeuThr: 5.829 ± 1.663
LeuVal: 1.855 ± 0.831
LeuTrp: 1.06 ± 0.421
LeuTyr: 2.385 ± 0.344
LeuXaa: 0.0 ± 0.0

Met
MetAla: 1.325 ± 0.387
MetCys: 0.53 ± 0.363
MetAsp: 1.325 ± 0.431
MetGlu: 1.06 ± 0.726
MetPhe: 1.325 ± 0.485
MetGly: 0.795 ± 0.311
MetHis: 0.0 ± 0.0
MetIle: 2.12 ± 0.885
MetLys: 1.59 ± 0.594
MetLeu: 1.59 ± 0.746
MetMet: 0.795 ± 0.438
MetAsn: 1.325 ± 0.57
MetPro: 0.795 ± 0.375
MetGln: 0.795 ± 0.76
MetArg: 1.325 ± 0.558
MetSer: 2.385 ± 1.277
MetThr: 1.59 ± 0.672
MetVal: 1.325 ± 0.59
MetTrp: 0.53 ± 0.304
MetTyr: 1.06 ± 0.979
MetXaa: 0.0 ± 0.0

Asn
AsnAla: 1.59 ± 0.68
AsnCys: 1.59 ± 0.704
AsnAsp: 2.65 ± 0.409
AsnGlu: 2.915 ± 1.086
AsnPhe: 3.71 ± 0.915
AsnGly: 2.915 ± 1.789
AsnHis: 2.915 ± 0.811
AsnIle: 3.18 ± 0.874
AsnLys: 2.385 ± 0.483
AsnLeu: 7.949 ± 1.723
AsnMet: 1.06 ± 0.411
AsnAsn: 3.445 ± 1.202
AsnPro: 3.18 ± 1.302
AsnGln: 2.65 ± 0.434
AsnArg: 2.65 ± 0.779
AsnSer: 3.975 ± 0.955
AsnThr: 2.65 ± 0.913
AsnVal: 1.59 ± 0.672
AsnTrp: 1.855 ± 0.527
AsnTyr: 2.12 ± 0.663
AsnXaa: 0.0 ± 0.0

Pro
ProAla: 0.53 ± 0.291
ProCys: 1.325 ± 0.504
ProAsp: 2.915 ± 0.837
ProGlu: 2.65 ± 1.16
ProPhe: 1.855 ± 0.673
ProGly: 0.53 ± 0.291
ProHis: 0.53 ± 0.304
ProIle: 2.12 ± 0.959
ProLys: 3.18 ± 0.695
ProLeu: 3.445 ± 0.951
ProMet: 0.53 ± 0.304
ProAsn: 1.325 ± 0.612
ProPro: 1.325 ± 0.561
ProGln: 1.06 ± 0.726
ProArg: 1.06 ± 0.421
ProSer: 3.975 ± 0.403
ProThr: 2.12 ± 0.579
ProVal: 2.12 ± 0.856
ProTrp: 0.795 ± 0.375
ProTyr: 3.18 ± 0.674
ProXaa: 0.0 ± 0.0

Gln
GlnAla: 1.06 ± 0.961
GlnCys: 0.265 ± 0.47
GlnAsp: 0.795 ± 0.664
GlnGlu: 3.975 ± 1.281
GlnPhe: 1.855 ± 0.524
GlnGly: 2.65 ± 0.913
GlnHis: 0.795 ± 0.403
GlnIle: 2.915 ± 0.803
GlnLys: 3.975 ± 1.433
GlnLeu: 2.385 ± 1.314
GlnMet: 0.795 ± 0.312
GlnAsn: 2.65 ± 0.917
GlnPro: 1.06 ± 1.037
GlnGln: 0.795 ± 0.563
GlnArg: 1.59 ± 0.701
GlnSer: 2.385 ± 1.37
GlnThr: 2.65 ± 0.646
GlnVal: 1.06 ± 0.497
GlnTrp: 1.06 ± 0.45
GlnTyr: 1.06 ± 0.411
GlnXaa: 0.0 ± 0.0

Arg
ArgAla: 2.385 ± 0.913
ArgCys: 0.795 ± 0.42
ArgAsp: 2.65 ± 0.697
ArgGlu: 3.71 ± 1.193
ArgPhe: 1.325 ± 0.476
ArgGly: 3.975 ± 1.108
ArgHis: 1.59 ± 0.792
ArgIle: 4.505 ± 1.272
ArgLys: 5.299 ± 0.891
ArgLeu: 3.71 ± 0.619
ArgMet: 1.06 ± 0.682
ArgAsn: 3.18 ± 1.097
ArgPro: 1.06 ± 0.411
ArgGln: 2.12 ± 0.705
ArgArg: 2.65 ± 0.816
ArgSer: 4.505 ± 1.314
ArgThr: 1.59 ± 0.913
ArgVal: 2.65 ± 0.842
ArgTrp: 1.325 ± 0.812
ArgTyr: 1.06 ± 0.538
ArgXaa: 0.0 ± 0.0

Ser
SerAla: 3.18 ± 0.813
SerCys: 2.65 ± 0.59
SerAsp: 4.769 ± 1.018
SerGlu: 6.624 ± 1.59
SerPhe: 3.18 ± 0.635
SerGly: 2.12 ± 0.797
SerHis: 1.855 ± 0.831
SerIle: 6.359 ± 1.38
SerLys: 8.479 ± 1.326
SerLeu: 6.359 ± 1.769
SerMet: 0.795 ± 0.76
SerAsn: 4.24 ± 1.296
SerPro: 2.12 ± 0.449
SerGln: 2.915 ± 1.06
SerArg: 4.505 ± 0.922
SerSer: 5.564 ± 1.007
SerThr: 3.445 ± 0.875
SerVal: 2.385 ± 0.822
SerTrp: 2.12 ± 0.873
SerTyr: 1.59 ± 0.42
SerXaa: 0.0 ± 0.0

Thr
ThrAla: 1.06 ± 0.39
ThrCys: 1.06 ± 0.443
ThrAsp: 1.06 ± 0.919
ThrGlu: 3.18 ± 1.218
ThrPhe: 2.12 ± 0.729
ThrGly: 2.915 ± 0.85
ThrHis: 1.06 ± 0.609
ThrIle: 4.505 ± 0.939
ThrLys: 3.71 ± 0.922
ThrLeu: 5.829 ± 0.826
ThrMet: 2.12 ± 0.757
ThrAsn: 2.12 ± 0.579
ThrPro: 2.12 ± 0.543
ThrGln: 2.12 ± 0.939
ThrArg: 2.385 ± 1.313
ThrSer: 3.445 ± 0.744
ThrThr: 2.385 ± 0.807
ThrVal: 2.65 ± 0.584
ThrTrp: 1.59 ± 0.527
ThrTyr: 1.325 ± 0.44
ThrXaa: 0.0 ± 0.0

Val
ValAla: 2.12 ± 0.966
ValCys: 0.53 ± 0.304
ValAsp: 1.325 ± 0.881
ValGlu: 5.299 ± 1.167
ValPhe: 1.59 ± 0.623
ValGly: 2.385 ± 0.952
ValHis: 0.53 ± 0.291
ValIle: 3.18 ± 0.716
ValLys: 3.18 ± 1.125
ValLeu: 3.975 ± 1.143
ValMet: 1.325 ± 0.59
ValAsn: 1.59 ± 1.191
ValPro: 2.385 ± 0.778
ValGln: 1.325 ± 0.369
ValArg: 1.59 ± 0.52
ValSer: 4.769 ± 0.733
ValThr: 2.12 ± 0.569
ValVal: 1.855 ± 0.498
ValTrp: 0.265 ± 0.152
ValTyr: 2.385 ± 0.548
ValXaa: 0.0 ± 0.0

Trp
TrpAla: 0.795 ± 0.312
TrpCys: 0.265 ± 0.304
TrpAsp: 0.795 ± 0.457
TrpGlu: 1.06 ± 0.411
TrpPhe: 1.06 ± 0.609
TrpGly: 1.325 ± 0.423
TrpHis: 0.0 ± 0.0
TrpIle: 1.855 ± 0.861
TrpLys: 1.59 ± 0.385
TrpLeu: 2.385 ± 0.497
TrpMet: 0.795 ± 0.317
TrpAsn: 1.59 ± 0.876
TrpPro: 0.265 ± 0.152
TrpGln: 0.53 ± 0.268
TrpArg: 0.265 ± 0.414
TrpSer: 0.795 ± 0.652
TrpThr: 0.795 ± 0.575
TrpVal: 1.59 ± 1.321
TrpTrp: 0.265 ± 0.47
TrpTyr: 0.53 ± 0.634
TrpXaa: 0.0 ± 0.0

Tyr
TyrAla: 0.795 ± 0.459
TyrCys: 0.795 ± 0.62
TyrAsp: 2.12 ± 0.823
TyrGlu: 2.385 ± 0.859
TyrPhe: 1.855 ± 0.713
TyrGly: 1.59 ± 0.385
TyrHis: 1.325 ± 0.916
TyrIle: 2.915 ± 0.588
TyrLys: 2.915 ± 0.533
TyrLeu: 4.769 ± 0.95
TyrMet: 0.53 ± 0.751
TyrAsn: 2.385 ± 0.531
TyrPro: 1.855 ± 0.64
TyrGln: 1.855 ± 1.091
TyrArg: 2.385 ± 0.828
TyrSer: 2.915 ± 0.85
TyrThr: 0.53 ± 0.412
TyrVal: 1.06 ± 0.443
TyrTrp: 0.0 ± 0.0
TyrTyr: 1.06 ± 0.437
TyrXaa: 0.0 ± 0.0

Xaa
XaaAla: 0.0 ± 0.0
XaaCys: 0.0 ± 0.0
XaaAsp: 0.0 ± 0.0
XaaGlu: 0.0 ± 0.0
XaaPhe: 0.0 ± 0.0
XaaGly: 0.0 ± 0.0
XaaHis: 0.0 ± 0.0
XaaIle: 0.0 ± 0.0
XaaLys: 0.0 ± 0.0
XaaLeu: 0.0 ± 0.0
XaaMet: 0.0 ± 0.0
XaaAsn: 0.0 ± 0.0
XaaPro: 0.0 ± 0.0
XaaGln: 0.0 ± 0.0
XaaArg: 0.0 ± 0.0
XaaSer: 0.0 ± 0.0
XaaThr: 0.0 ± 0.0
XaaVal: 0.0 ± 0.0
XaaTrp: 0.0 ± 0.0
XaaTyr: 0.0 ± 0.0
XaaXaa: 0.0 ± 0.0

Statistics based on 9 proteins (3775 amino acids)

Note: The error has been estimated by bootstrapping (x100) at the protein level.

The dipeptide statistics above (among other statistics for this proteome) can be downloaded from this CSV file
See this proteome in: uniprot_link
Proteome-pI is available under Creative Commons Attribution-NoDerivs license, for more details see here

Reference: Kozlowski LP. Proteome-pI 2.0: Proteome Isoelectric Point Database Update. Nucleic Acids Res. 2021, doi: 10.1093/nar/gkab944
Contact: Lukasz P. Kozlowski