Amino acid dipeptide frequency for Human papillomavirus type 5b

Apart from single amino acid frequencies, one can also calculate the so-called amino acid dipeptide frequency, i.e. how often each ordered pair of adjacent residues occurs.
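A minimal sketch of how such frequencies could be computed is shown here. It is not the code used by Proteome-pI: the function name `dipeptide_frequencies` is hypothetical, sequences are assumed to be given in one-letter code (the table uses three-letter codes), and counting pairs only within each protein is an assumption of this sketch.

```python
from collections import Counter

def dipeptide_frequencies(proteins):
    """Return {dipeptide: frequency in per mille} over all overlapping residue pairs.

    Assumption: pairs are counted within each protein only, never across
    the boundary between two proteins.
    """
    counts = Counter()
    for seq in proteins:
        seq = seq.upper()
        # Slide a window of width 2 along the sequence.
        for i in range(len(seq) - 1):
            counts[seq[i:i + 2]] += 1
    total = sum(counts.values())
    if total == 0:
        return {}
    return {pair: 1000.0 * n / total for pair, n in counts.items()}

# Toy input, not real viral sequences.
freqs = dipeptide_frequencies(["MKKLA", "AKKL"])
print(freqs["KK"])
```

The toy input contains seven overlapping pairs in total, two of which are KK.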

There are 400 possible dipeptides (441 if Xaa is counted as an additional 21st amino acid representing all non-standard or unknown amino acids). Thus, if dipeptides occurred randomly with equal probability in proteins, each would be present at a frequency of 0.25% (2.5‰), or about 0.227% when Xaa is included. As this is not the case in nature, for better visibility all dipeptides that are more frequent than expected are marked in red, and those that are underrepresented are marked in blue in the table.
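The uniform expectation and the over/under-representation labels can be illustrated with a short sketch. The function names are hypothetical, and whether the page colours cells against the 400-pair or the 441-pair expectation is not stated, so the default of 20 amino acids is an assumption.

```python
def expected_uniform_permille(alphabet_size=20):
    """Expected per mille frequency if every dipeptide were equally likely."""
    return 1000.0 / alphabet_size ** 2

def classify(observed_permille, alphabet_size=20):
    """Label a dipeptide relative to the uniform expectation."""
    expected = expected_uniform_permille(alphabet_size)
    if observed_permille > expected:
        return "over"      # would be shown in red
    if observed_permille < expected:
        return "under"     # would be shown in blue
    return "expected"

print(expected_uniform_permille(20))  # 2.5
print(classify(9.524))                # LeuLeu value from the table: "over"
print(classify(0.34))                 # CysAsn value from the table: "under"
```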

All values are given in per mille (‰), so they need to be multiplied by 10^-3 to obtain fractions (for example, 9.524‰ for LeuLeu corresponds to 0.009524, or 0.9524%).

For more information, see the sequence space article on Wikipedia.

Second residue (columns): Ala Cys Asp Glu Phe Gly His Ile Lys Leu Met Asn Pro Gln Arg Ser Thr Val Trp Tyr Xaa
Ala
AlaAla: 2.721 ± 1.356
AlaCys: 1.361 ± 0.945
AlaAsp: 4.422 ± 0.914
AlaGlu: 4.762 ± 1.243
AlaPhe: 2.721 ± 0.91
AlaGly: 3.061 ± 0.808
AlaHis: 1.361 ± 0.648
AlaIle: 2.041 ± 1.258
AlaLys: 3.401 ± 1.287
AlaLeu: 3.061 ± 0.89
AlaMet: 2.381 ± 1.161
AlaAsn: 3.061 ± 0.862
AlaPro: 3.401 ± 1.138
AlaGln: 3.401 ± 1.205
AlaArg: 4.082 ± 1.23
AlaSer: 1.701 ± 0.827
AlaThr: 3.741 ± 1.475
AlaVal: 4.082 ± 0.924
AlaTrp: 0.68 ± 0.41
AlaTyr: 1.701 ± 0.504
AlaXaa: 0.0 ± 0.0
Cys
CysAla: 0.68 ± 0.362
CysCys: 1.361 ± 1.042
CysAsp: 0.68 ± 0.515
CysGlu: 1.361 ± 1.14
CysPhe: 0.68 ± 0.362
CysGly: 2.041 ± 1.39
CysHis: 0.0 ± 0.0
CysIle: 1.02 ± 0.461
CysLys: 2.721 ± 0.91
CysLeu: 0.68 ± 0.477
CysMet: 1.02 ± 0.529
CysAsn: 0.34 ± 0.36
CysPro: 1.361 ± 0.763
CysGln: 0.34 ± 0.36
CysArg: 2.381 ± 1.333
CysSer: 0.68 ± 0.627
CysThr: 0.68 ± 0.369
CysVal: 0.34 ± 0.36
CysTrp: 0.34 ± 0.314
CysTyr: 0.34 ± 0.286
CysXaa: 0.0 ± 0.0
Asp
AspAla: 2.041 ± 0.603
AspCys: 2.381 ± 1.099
AspAsp: 3.061 ± 1.101
AspGlu: 2.041 ± 0.431
AspPhe: 2.381 ± 0.846
AspGly: 2.721 ± 0.899
AspHis: 0.34 ± 0.304
AspIle: 6.803 ± 1.86
AspLys: 2.721 ± 1.183
AspLeu: 5.102 ± 1.301
AspMet: 1.361 ± 0.53
AspAsn: 3.741 ± 1.053
AspPro: 3.401 ± 0.721
AspGln: 2.721 ± 1.089
AspArg: 1.701 ± 0.842
AspSer: 3.401 ± 0.533
AspThr: 4.422 ± 0.947
AspVal: 3.401 ± 1.183
AspTrp: 1.701 ± 0.883
AspTyr: 1.701 ± 0.846
AspXaa: 0.0 ± 0.0
Glu
GluAla: 4.422 ± 1.426
GluCys: 1.02 ± 0.941
GluAsp: 3.401 ± 0.984
GluGlu: 6.122 ± 2.434
GluPhe: 1.701 ± 0.966
GluGly: 6.803 ± 4.051
GluHis: 1.02 ± 0.461
GluIle: 2.721 ± 1.418
GluLys: 1.361 ± 0.786
GluLeu: 4.422 ± 1.539
GluMet: 0.0 ± 0.0
GluAsn: 2.041 ± 0.586
GluPro: 3.061 ± 1.133
GluGln: 4.082 ± 1.218
GluArg: 2.381 ± 0.759
GluSer: 4.422 ± 1.675
GluThr: 3.061 ± 1.116
GluVal: 5.442 ± 1.847
GluTrp: 0.68 ± 0.386
GluTyr: 1.02 ± 0.858
GluXaa: 0.0 ± 0.0
Phe
PheAla: 3.061 ± 0.808
PheCys: 0.68 ± 0.664
PheAsp: 2.721 ± 0.678
PheGlu: 3.061 ± 1.571
PhePhe: 1.361 ± 0.53
PheGly: 1.701 ± 0.68
PheHis: 1.02 ± 0.564
PheIle: 2.381 ± 0.996
PheLys: 2.381 ± 0.969
PheLeu: 3.061 ± 0.861
PheMet: 0.0 ± 0.0
PheAsn: 2.041 ± 0.831
PhePro: 1.361 ± 0.906
PheGln: 1.361 ± 0.567
PheArg: 1.701 ± 0.969
PheSer: 3.401 ± 0.954
PheThr: 0.34 ± 0.314
PheVal: 1.701 ± 0.857
PheTrp: 1.361 ± 0.725
PheTyr: 2.381 ± 0.88
PheXaa: 0.0 ± 0.0
Gly
GlyAla: 4.422 ± 0.724
GlyCys: 1.02 ± 0.617
GlyAsp: 5.442 ± 1.712
GlyGlu: 4.422 ± 0.891
GlyPhe: 2.041 ± 0.855
GlyGly: 5.782 ± 1.941
GlyHis: 5.102 ± 2.872
GlyIle: 2.381 ± 0.767
GlyLys: 2.721 ± 0.81
GlyLeu: 4.082 ± 1.165
GlyMet: 0.0 ± 0.0
GlyAsn: 2.381 ± 0.793
GlyPro: 7.143 ± 4.055
GlyGln: 2.381 ± 0.847
GlyArg: 7.823 ± 2.734
GlySer: 6.463 ± 1.755
GlyThr: 5.102 ± 1.215
GlyVal: 3.061 ± 1.07
GlyTrp: 0.0 ± 0.0
GlyTyr: 1.361 ± 0.924
GlyXaa: 0.0 ± 0.0
His
HisAla: 0.68 ± 0.605
HisCys: 1.02 ± 0.696
HisAsp: 1.701 ± 0.73
HisGlu: 1.361 ± 0.734
HisPhe: 2.041 ± 0.672
HisGly: 1.02 ± 0.695
HisHis: 1.701 ± 1.833
HisIle: 1.361 ± 0.694
HisLys: 2.041 ± 0.941
HisLeu: 2.041 ± 1.284
HisMet: 0.0 ± 0.0
HisAsn: 1.701 ± 0.7
HisPro: 2.381 ± 1.184
HisGln: 1.361 ± 0.743
HisArg: 1.361 ± 0.946
HisSer: 1.701 ± 0.556
HisThr: 1.701 ± 0.65
HisVal: 1.701 ± 0.989
HisTrp: 1.02 ± 0.376
HisTyr: 1.02 ± 0.587
HisXaa: 0.0 ± 0.0
Ile
IleAla: 3.401 ± 1.56
IleCys: 0.68 ± 0.556
IleAsp: 2.381 ± 0.88
IleGlu: 3.061 ± 1.432
IlePhe: 1.02 ± 0.681
IleGly: 3.401 ± 1.12
IleHis: 1.02 ± 0.571
IleIle: 3.061 ± 1.347
IleLys: 1.701 ± 1.352
IleLeu: 3.401 ± 0.897
IleMet: 1.02 ± 1.329
IleAsn: 2.041 ± 0.835
IlePro: 3.061 ± 1.377
IleGln: 2.041 ± 1.024
IleArg: 2.721 ± 1.032
IleSer: 3.401 ± 0.745
IleThr: 2.041 ± 0.843
IleVal: 2.041 ± 0.556
IleTrp: 0.68 ± 0.41
IleTyr: 3.401 ± 0.846
IleXaa: 0.0 ± 0.0
Lys
LysAla: 5.442 ± 1.172
LysCys: 0.0 ± 0.0
LysAsp: 2.721 ± 1.237
LysGlu: 4.082 ± 1.191
LysPhe: 2.041 ± 1.166
LysGly: 4.762 ± 0.905
LysHis: 1.361 ± 0.779
LysIle: 1.02 ± 0.632
LysLys: 2.381 ± 0.921
LysLeu: 4.082 ± 1.131
LysMet: 0.34 ± 0.286
LysAsn: 2.721 ± 1.325
LysPro: 1.02 ± 1.137
LysGln: 2.041 ± 0.389
LysArg: 3.741 ± 0.861
LysSer: 3.401 ± 1.762
LysThr: 2.381 ± 0.724
LysVal: 5.102 ± 2.807
LysTrp: 0.68 ± 0.416
LysTyr: 3.061 ± 0.621
LysXaa: 0.0 ± 0.0
Leu
LeuAla: 3.401 ± 1.263
LeuCys: 1.361 ± 0.756
LeuAsp: 5.102 ± 1.397
LeuGlu: 5.102 ± 0.958
LeuPhe: 3.741 ± 1.265
LeuGly: 5.782 ± 1.116
LeuHis: 3.061 ± 1.464
LeuIle: 2.721 ± 0.96
LeuLys: 3.401 ± 1.282
LeuLeu: 9.524 ± 2.956
LeuMet: 2.381 ± 1.046
LeuAsn: 1.361 ± 0.631
LeuPro: 3.401 ± 1.645
LeuGln: 6.463 ± 1.309
LeuArg: 3.401 ± 1.188
LeuSer: 5.102 ± 1.706
LeuThr: 5.442 ± 0.828
LeuVal: 4.082 ± 1.499
LeuTrp: 1.02 ± 0.542
LeuTyr: 1.361 ± 0.734
LeuXaa: 0.0 ± 0.0
Met
MetAla: 1.701 ± 0.593
MetCys: 0.0 ± 0.0
MetAsp: 1.02 ± 0.383
MetGlu: 0.34 ± 0.308
MetPhe: 0.68 ± 0.572
MetGly: 0.34 ± 0.314
MetHis: 0.34 ± 0.443
MetIle: 0.34 ± 0.539
MetLys: 0.68 ± 0.627
MetLeu: 1.02 ± 0.554
MetMet: 0.0 ± 0.0
MetAsn: 1.361 ± 0.721
MetPro: 0.0 ± 0.0
MetGln: 0.68 ± 0.523
MetArg: 0.34 ± 0.314
MetSer: 1.361 ± 0.906
MetThr: 1.02 ± 0.62
MetVal: 1.701 ± 1.024
MetTrp: 0.34 ± 0.308
MetTyr: 0.34 ± 0.286
MetXaa: 0.0 ± 0.0
Asn
AsnAla: 3.401 ± 0.93
AsnCys: 0.68 ± 0.595
AsnAsp: 1.701 ± 0.477
AsnGlu: 2.041 ± 1.267
AsnPhe: 1.361 ± 0.648
AsnGly: 2.721 ± 1.126
AsnHis: 0.68 ± 0.386
AsnIle: 2.721 ± 0.639
AsnLys: 1.701 ± 0.286
AsnLeu: 1.02 ± 0.911
AsnMet: 0.68 ± 0.536
AsnAsn: 1.701 ± 0.532
AsnPro: 3.741 ± 1.33
AsnGln: 3.061 ± 1.181
AsnArg: 2.721 ± 0.686
AsnSer: 2.721 ± 0.963
AsnThr: 3.741 ± 1.223
AsnVal: 1.361 ± 0.714
AsnTrp: 0.0 ± 0.0
AsnTyr: 1.02 ± 0.433
AsnXaa: 0.0 ± 0.0
Pro
ProAla: 2.381 ± 1.047
ProCys: 1.701 ± 0.828
ProAsp: 5.442 ± 1.364
ProGlu: 4.082 ± 1.296
ProPhe: 1.02 ± 0.461
ProGly: 4.422 ± 2.031
ProHis: 1.02 ± 0.941
ProIle: 1.701 ± 0.977
ProLys: 4.422 ± 1.394
ProLeu: 4.762 ± 1.478
ProMet: 0.68 ± 0.515
ProAsn: 2.381 ± 1.082
ProPro: 10.544 ± 6.622
ProGln: 2.721 ± 1.15
ProArg: 4.082 ± 1.712
ProSer: 4.082 ± 1.768
ProThr: 4.762 ± 1.374
ProVal: 6.803 ± 2.841
ProTrp: 0.68 ± 0.49
ProTyr: 1.701 ± 0.789
ProXaa: 0.0 ± 0.0
Gln
GlnAla: 3.061 ± 0.59
GlnCys: 1.361 ± 0.598
GlnAsp: 2.721 ± 0.673
GlnGlu: 1.361 ± 0.859
GlnPhe: 2.041 ± 0.944
GlnGly: 3.741 ± 1.443
GlnHis: 1.361 ± 0.53
GlnIle: 3.401 ± 0.896
GlnLys: 2.041 ± 0.757
GlnLeu: 5.782 ± 1.174
GlnMet: 1.361 ± 0.789
GlnAsn: 1.361 ± 0.615
GlnPro: 2.721 ± 0.712
GlnGln: 4.762 ± 1.008
GlnArg: 3.741 ± 1.26
GlnSer: 1.701 ± 0.556
GlnThr: 5.442 ± 1.696
GlnVal: 2.381 ± 1.649
GlnTrp: 0.34 ± 0.314
GlnTyr: 1.02 ± 0.345
GlnXaa: 0.0 ± 0.0
Arg
ArgAla: 3.741 ± 0.877
ArgCys: 0.68 ± 0.473
ArgAsp: 3.741 ± 0.887
ArgGlu: 2.381 ± 0.42
ArgPhe: 1.701 ± 0.504
ArgGly: 8.163 ± 2.646
ArgHis: 1.701 ± 0.829
ArgIle: 1.02 ± 1.09
ArgLys: 4.082 ± 1.279
ArgLeu: 6.463 ± 1.226
ArgMet: 0.34 ± 0.308
ArgAsn: 3.061 ± 0.689
ArgPro: 3.061 ± 0.714
ArgGln: 2.041 ± 0.746
ArgArg: 7.483 ± 3.264
ArgSer: 10.884 ± 6.576
ArgThr: 1.361 ± 0.709
ArgVal: 4.082 ± 1.411
ArgTrp: 1.02 ± 1.011
ArgTyr: 2.381 ± 0.83
ArgXaa: 0.0 ± 0.0
Ser
SerAla: 3.401 ± 0.913
SerCys: 0.34 ± 0.308
SerAsp: 3.061 ± 1.488
SerGlu: 3.061 ± 1.1
SerPhe: 3.061 ± 1.075
SerGly: 5.102 ± 1.055
SerHis: 1.701 ± 0.708
SerIle: 2.381 ± 0.62
SerLys: 4.762 ± 1.127
SerLeu: 7.823 ± 1.384
SerMet: 0.34 ± 0.514
SerAsn: 1.701 ± 0.674
SerPro: 4.082 ± 2.025
SerGln: 3.061 ± 1.094
SerArg: 8.163 ± 4.975
SerSer: 7.483 ± 2.558
SerThr: 8.163 ± 2.87
SerVal: 4.082 ± 0.786
SerTrp: 0.68 ± 0.386
SerTyr: 1.701 ± 0.71
SerXaa: 0.0 ± 0.0
Thr
ThrAla: 3.401 ± 1.129
ThrCys: 2.041 ± 0.509
ThrAsp: 3.401 ± 1.098
ThrGlu: 4.422 ± 1.201
ThrPhe: 3.401 ± 1.756
ThrGly: 5.442 ± 1.75
ThrHis: 1.361 ± 0.924
ThrIle: 2.721 ± 1.074
ThrLys: 1.701 ± 0.477
ThrLeu: 3.061 ± 1.209
ThrMet: 0.34 ± 0.308
ThrAsn: 2.381 ± 1.008
ThrPro: 6.803 ± 2.008
ThrGln: 3.061 ± 0.747
ThrArg: 5.782 ± 1.557
ThrSer: 5.102 ± 2.268
ThrThr: 5.782 ± 3.037
ThrVal: 5.102 ± 1.577
ThrTrp: 0.68 ± 0.616
ThrTyr: 2.381 ± 0.883
ThrXaa: 0.0 ± 0.0
Val
ValAla: 3.061 ± 0.705
ValCys: 0.68 ± 0.605
ValAsp: 3.401 ± 0.938
ValGlu: 4.082 ± 1.171
ValPhe: 2.721 ± 0.782
ValGly: 3.741 ± 0.744
ValHis: 3.061 ± 1.416
ValIle: 2.721 ± 1.158
ValLys: 3.741 ± 1.085
ValLeu: 3.401 ± 0.718
ValMet: 0.0 ± 0.0
ValAsn: 2.041 ± 0.603
ValPro: 6.803 ± 2.394
ValGln: 3.401 ± 1.48
ValArg: 4.422 ± 1.461
ValSer: 4.422 ± 1.061
ValThr: 5.102 ± 2.334
ValVal: 2.041 ± 1.091
ValTrp: 0.68 ± 0.572
ValTyr: 2.381 ± 1.076
ValXaa: 0.0 ± 0.0
Trp
TrpAla: 1.02 ± 0.573
TrpCys: 0.34 ± 0.314
TrpAsp: 0.68 ± 0.572
TrpGlu: 0.68 ± 0.416
TrpPhe: 0.0 ± 0.0
TrpGly: 0.34 ± 0.438
TrpHis: 0.68 ± 0.369
TrpIle: 0.68 ± 0.627
TrpLys: 1.701 ± 0.982
TrpLeu: 1.02 ± 0.603
TrpMet: 0.34 ± 0.347
TrpAsn: 0.0 ± 0.0
TrpPro: 0.34 ± 0.443
TrpGln: 1.02 ± 0.565
TrpArg: 0.0 ± 0.0
TrpSer: 1.361 ± 0.734
TrpThr: 1.02 ± 0.624
TrpVal: 1.361 ± 0.92
TrpTrp: 0.0 ± 0.0
TrpTyr: 0.34 ± 0.314
TrpXaa: 0.0 ± 0.0
Tyr
TyrAla: 1.701 ± 0.672
TyrCys: 0.34 ± 0.36
TyrAsp: 0.34 ± 0.308
TyrGlu: 1.361 ± 0.786
TyrPhe: 1.361 ± 0.843
TyrGly: 2.041 ± 0.805
TyrHis: 1.02 ± 0.509
TyrIle: 2.381 ± 1.43
TyrLys: 2.721 ± 1.067
TyrLeu: 3.401 ± 0.833
TyrMet: 0.68 ± 0.386
TyrAsn: 1.361 ± 1.144
TyrPro: 1.701 ± 0.504
TyrGln: 1.701 ± 0.486
TyrArg: 1.361 ± 0.542
TyrSer: 1.701 ± 0.766
TyrThr: 3.061 ± 1.14
TyrVal: 2.041 ± 0.93
TyrTrp: 0.34 ± 0.379
TyrTyr: 2.381 ± 1.245
TyrXaa: 0.0 ± 0.0
Xaa
XaaAla: 0.0 ± 0.0
XaaCys: 0.0 ± 0.0
XaaAsp: 0.0 ± 0.0
XaaGlu: 0.0 ± 0.0
XaaPhe: 0.0 ± 0.0
XaaGly: 0.0 ± 0.0
XaaHis: 0.0 ± 0.0
XaaIle: 0.0 ± 0.0
XaaLys: 0.0 ± 0.0
XaaLeu: 0.0 ± 0.0
XaaMet: 0.0 ± 0.0
XaaAsn: 0.0 ± 0.0
XaaPro: 0.0 ± 0.0
XaaGln: 0.0 ± 0.0
XaaArg: 0.0 ± 0.0
XaaSer: 0.0 ± 0.0
XaaThr: 0.0 ± 0.0
XaaVal: 0.0 ± 0.0
XaaTrp: 0.0 ± 0.0
XaaTyr: 0.0 ± 0.0
XaaXaa: 0.0 ± 0.0
Statistics based on 9 proteins (2941 amino acids)

Note: The error was estimated by bootstrapping (100 replicates) at the protein level.
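One way a protein-level bootstrap could work is sketched here. This is an illustration under stated assumptions, not the Proteome-pI procedure: the function names are hypothetical, whole proteins are resampled with replacement, the error is taken as the standard deviation across replicates, and sequences are in one-letter code.

```python
import random
import statistics
from collections import Counter

def pair_counts(seq):
    """Count overlapping dipeptides in one protein sequence."""
    return Counter(seq[i:i + 2] for i in range(len(seq) - 1))

def bootstrap_error(proteins, pair, n_boot=100, seed=0):
    """Return (mean, standard deviation) of the per mille frequency of `pair`.

    Assumption: each replicate draws len(proteins) whole proteins with
    replacement, so the resampling unit is the protein, not the residue.
    """
    rng = random.Random(seed)
    per_protein = [pair_counts(p) for p in proteins]
    estimates = []
    for _ in range(n_boot):
        sample = rng.choices(per_protein, k=len(per_protein))
        total = sum(sum(c.values()) for c in sample)
        hits = sum(c[pair] for c in sample)
        estimates.append(1000.0 * hits / total if total else 0.0)
    return statistics.mean(estimates), statistics.stdev(estimates)

# Toy input, not real viral sequences.
mean, err = bootstrap_error(["MKKLA", "AKKL", "GGGG"], "KK")
print(f"KK: {mean:.3f} ± {err:.3f}")
```

With only a handful of proteins, as in this proteome, replicates differ mainly in which proteins are drawn more than once, which is why the errors in the table can be large relative to the values.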

The dipeptide statistics above (among other statistics for this proteome) can be downloaded from this CSV file
See this proteome in: uniprot_link
Proteome-pI is available under the Creative Commons Attribution-NoDerivs license; for more details see here

Reference: Kozlowski LP. Proteome-pI 2.0: Proteome Isoelectric Point Database Update. Nucleic Acids Res. 2021, doi: 10.1093/nar/gkab944
Contact: Lukasz P. Kozlowski