Amino acid dipeptide frequency for Human papillomavirus

Apart from single amino acid frequencies, one can also calculate the so-called amino acid dipeptide frequency, i.e. how often each ordered pair of adjacent amino acids occurs.
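
As a rough illustration of what such a calculation involves, the sketch below counts overlapping residue pairs in a set of protein sequences and reports them in per mille. It is a minimal sketch, not the Proteome-pI implementation: the function name, the toy sequences, the use of one-letter codes, and the choice to normalise by the total number of dipeptides are all assumptions made for the example.

```python
from collections import Counter


def dipeptide_frequencies(sequences):
    """Return {dipeptide: frequency in per mille} for a list of sequences.

    Assumption: frequencies are normalised by the total number of
    overlapping dipeptides; the source page does not state its normalisation.
    """
    counts = Counter()
    for seq in sequences:
        # Slide a window of length 2 along each protein separately,
        # so a pair never spans two different proteins.
        for i in range(len(seq) - 1):
            counts[seq[i:i + 2]] += 1
    total = sum(counts.values())
    if total == 0:
        return {}
    return {pair: 1000.0 * n / total for pair, n in counts.items()}


# Hypothetical toy sequences in one-letter code (not taken from the proteome).
toy_proteins = ["MKLLA", "MKA"]
freqs = dipeptide_frequencies(toy_proteins)
for pair, value in sorted(freqs.items()):
    print(f"{pair}: {value:.3f}")
```

Counting per protein rather than on one concatenated string avoids creating artificial dipeptides at protein boundaries.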

There are 400 possible dipeptides (441 if Xaa is counted as an additional, 21st amino acid representing all non-standard or unknown amino acids). Thus, if dipeptides occurred randomly and with equal probability in proteins, each amino acid dipeptide would be present at a frequency of 0.25% (2.5‰). This is not the case in nature, so for better visibility, dipeptides that are more frequent than expected are marked in red in the table, and those that are underrepresented are marked in blue.
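
The expected value and the red/blue marking can be sketched as follows. This is only an illustration under stated assumptions: the function names are hypothetical, and the page does not say whether the colouring uses the 400- or 441-dipeptide baseline or any tolerance band, so a plain comparison against the uniform expectation is assumed here.

```python
def expected_uniform_permille(alphabet_size=20):
    """Frequency (per mille) of each dipeptide if all pairs were equally likely."""
    return 1000.0 / alphabet_size ** 2


def classify(observed_permille, alphabet_size=20):
    """Label a dipeptide relative to the uniform expectation.

    Assumption: strict comparison with no tolerance band.
    """
    expected = expected_uniform_permille(alphabet_size)
    if observed_permille > expected:
        return "red"      # more frequent than expected
    if observed_permille < expected:
        return "blue"     # underrepresented
    return "neutral"


print(expected_uniform_permille(20))  # 2.5 per mille, i.e. 0.25%
print(classify(10.832))               # SerSer value from the table: red
print(classify(0.433))                # AlaTrp value from the table: blue
```

With Xaa included, `expected_uniform_permille(21)` gives roughly 2.27‰ instead of 2.5‰.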

All values are given in per mille (‰) and therefore need to be multiplied by 10^-3 to obtain fractions. For example, 8.232‰ corresponds to a frequency of 0.008232 (0.8232%).

For more information, see the sequence space article on Wikipedia.

Ala Cys Asp Glu Phe Gly His Ile Lys Leu Met Asn Pro Gln Arg Ser Thr Val Trp Tyr Xaa
Ala
AlaAla: 3.033 ± 0.764
AlaCys: 1.3 ± 0.999
AlaAsp: 4.766 ± 1.966
AlaGlu: 4.766 ± 1.867
AlaPhe: 4.333 ± 1.428
AlaGly: 1.3 ± 0.443
AlaHis: 1.3 ± 0.343
AlaIle: 4.333 ± 1.362
AlaLys: 3.466 ± 1.372
AlaLeu: 5.199 ± 0.344
AlaMet: 1.3 ± 0.679
AlaAsn: 1.733 ± 1.016
AlaPro: 2.166 ± 1.386
AlaGln: 1.3 ± 0.941
AlaArg: 3.033 ± 0.724
AlaSer: 6.499 ± 0.794
AlaThr: 3.033 ± 1.627
AlaVal: 4.333 ± 1.048
AlaTrp: 0.433 ± 0.407
AlaTyr: 1.733 ± 1.066
AlaXaa: 0.0 ± 0.0
Cys
CysAla: 0.433 ± 0.388
CysCys: 1.733 ± 1.706
CysAsp: 0.867 ± 0.385
CysGlu: 0.433 ± 0.397
CysPhe: 1.733 ± 1.134
CysGly: 0.433 ± 0.673
CysHis: 0.0 ± 0.0
CysIle: 1.3 ± 1.114
CysLys: 1.733 ± 0.953
CysLeu: 1.733 ± 2.025
CysMet: 0.433 ± 0.388
CysAsn: 0.867 ± 0.776
CysPro: 1.3 ± 0.789
CysGln: 0.0 ± 0.0
CysArg: 1.3 ± 2.079
CysSer: 3.466 ± 2.448
CysThr: 1.3 ± 1.114
CysVal: 1.3 ± 0.656
CysTrp: 1.733 ± 0.64
CysTyr: 0.433 ± 0.693
CysXaa: 0.0 ± 0.0
Asp
AspAla: 3.033 ± 0.749
AspCys: 2.166 ± 0.843
AspAsp: 3.466 ± 1.34
AspGlu: 4.333 ± 1.219
AspPhe: 1.733 ± 0.592
AspGly: 3.033 ± 1.176
AspHis: 0.867 ± 0.776
AspIle: 3.033 ± 1.531
AspLys: 1.3 ± 0.847
AspLeu: 7.799 ± 1.79
AspMet: 1.3 ± 0.778
AspAsn: 3.033 ± 0.777
AspPro: 5.633 ± 2.329
AspGln: 0.433 ± 0.363
AspArg: 2.166 ± 1.628
AspSer: 5.633 ± 1.306
AspThr: 4.333 ± 1.32
AspVal: 3.899 ± 1.21
AspTrp: 0.867 ± 0.776
AspTyr: 2.166 ± 0.676
AspXaa: 0.0 ± 0.0
Glu
GluAla: 3.899 ± 0.752
GluCys: 0.867 ± 0.767
GluAsp: 4.333 ± 1.353
GluGlu: 7.799 ± 3.038
GluPhe: 1.733 ± 0.932
GluGly: 3.033 ± 1.237
GluHis: 1.3 ± 0.76
GluIle: 2.6 ± 1.266
GluLys: 2.166 ± 1.288
GluLeu: 6.499 ± 0.769
GluMet: 0.433 ± 0.388
GluAsn: 4.766 ± 1.092
GluPro: 1.3 ± 0.789
GluGln: 4.766 ± 1.561
GluArg: 3.899 ± 2.089
GluSer: 4.766 ± 1.48
GluThr: 5.199 ± 1.906
GluVal: 3.466 ± 1.155
GluTrp: 0.867 ± 0.814
GluTyr: 1.733 ± 0.69
GluXaa: 0.0 ± 0.0
Phe
PheAla: 3.033 ± 0.529
PheCys: 0.867 ± 0.767
PheAsp: 1.3 ± 0.785
PheGlu: 4.766 ± 2.057
PhePhe: 2.6 ± 0.948
PheGly: 2.166 ± 0.998
PheHis: 0.867 ± 0.715
PheIle: 0.867 ± 0.458
PheLys: 3.033 ± 1.69
PheLeu: 4.333 ± 1.851
PheMet: 1.3 ± 0.782
PheAsn: 3.466 ± 0.685
PhePro: 2.6 ± 0.976
PheGln: 2.6 ± 0.861
PheArg: 2.166 ± 1.07
PheSer: 4.766 ± 1.396
PheThr: 2.6 ± 1.271
PheVal: 1.733 ± 0.733
PheTrp: 1.3 ± 0.679
PheTyr: 1.733 ± 0.563
PheXaa: 0.0 ± 0.0
Gly
GlyAla: 4.766 ± 1.063
GlyCys: 0.433 ± 0.397
GlyAsp: 3.033 ± 1.784
GlyGlu: 3.033 ± 1.269
GlyPhe: 0.867 ± 0.533
GlyGly: 3.466 ± 1.341
GlyHis: 1.733 ± 0.789
GlyIle: 1.3 ± 0.343
GlyLys: 3.033 ± 1.16
GlyLeu: 4.766 ± 1.011
GlyMet: 0.0 ± 0.0
GlyAsn: 2.6 ± 1.392
GlyPro: 2.6 ± 0.896
GlyGln: 1.733 ± 0.962
GlyArg: 2.6 ± 1.373
GlySer: 3.899 ± 0.743
GlyThr: 5.633 ± 1.481
GlyVal: 3.033 ± 0.834
GlyTrp: 0.0 ± 0.0
GlyTyr: 1.733 ± 1.167
GlyXaa: 0.0 ± 0.0
His
HisAla: 0.433 ± 0.397
HisCys: 0.0 ± 0.0
HisAsp: 0.867 ± 0.488
HisGlu: 0.0 ± 0.0
HisPhe: 1.3 ± 0.741
HisGly: 0.867 ± 0.785
HisHis: 0.0 ± 0.0
HisIle: 0.867 ± 0.715
HisLys: 0.433 ± 0.388
HisLeu: 2.166 ± 1.244
HisMet: 0.0 ± 0.0
HisAsn: 0.433 ± 0.407
HisPro: 2.166 ± 0.975
HisGln: 0.433 ± 0.388
HisArg: 1.3 ± 0.436
HisSer: 0.867 ± 0.385
HisThr: 0.433 ± 0.407
HisVal: 2.166 ± 1.134
HisTrp: 0.433 ± 0.397
HisTyr: 1.733 ± 0.669
HisXaa: 0.0 ± 0.0
Ile
IleAla: 1.733 ± 1.1
IleCys: 0.867 ± 0.794
IleAsp: 3.899 ± 1.911
IleGlu: 5.633 ± 0.807
IlePhe: 2.166 ± 0.713
IleGly: 2.166 ± 1.238
IleHis: 0.0 ± 0.0
IleIle: 2.6 ± 1.274
IleLys: 1.733 ± 0.657
IleLeu: 4.333 ± 0.766
IleMet: 0.433 ± 0.388
IleAsn: 3.033 ± 1.621
IlePro: 2.6 ± 1.475
IleGln: 2.166 ± 0.998
IleArg: 0.867 ± 0.533
IleSer: 6.066 ± 2.409
IleThr: 0.867 ± 0.776
IleVal: 4.766 ± 1.224
IleTrp: 0.433 ± 0.388
IleTyr: 2.6 ± 1.105
IleXaa: 0.0 ± 0.0
Lys
LysAla: 3.466 ± 0.988
LysCys: 0.867 ± 0.488
LysAsp: 3.033 ± 0.826
LysGlu: 2.166 ± 1.577
LysPhe: 2.6 ± 0.686
LysGly: 2.6 ± 0.977
LysHis: 2.166 ± 1.257
LysIle: 2.6 ± 1.307
LysLys: 0.867 ± 0.488
LysLeu: 2.6 ± 1.359
LysMet: 1.3 ± 0.628
LysAsn: 1.3 ± 0.443
LysPro: 2.166 ± 0.724
LysGln: 3.033 ± 1.597
LysArg: 6.499 ± 1.06
LysSer: 3.899 ± 1.587
LysThr: 1.3 ± 0.782
LysVal: 3.899 ± 1.059
LysTrp: 0.433 ± 0.397
LysTyr: 3.033 ± 1.453
LysXaa: 0.0 ± 0.0
Leu
LeuAla: 5.199 ± 1.496
LeuCys: 1.733 ± 1.006
LeuAsp: 4.766 ± 1.95
LeuGlu: 3.033 ± 0.557
LeuPhe: 6.066 ± 1.857
LeuGly: 6.066 ± 2.489
LeuHis: 1.733 ± 0.882
LeuIle: 6.499 ± 0.941
LeuLys: 5.633 ± 1.762
LeuLeu: 8.232 ± 2.114
LeuMet: 1.733 ± 0.588
LeuAsn: 5.199 ± 1.154
LeuPro: 4.333 ± 0.833
LeuGln: 6.499 ± 1.736
LeuArg: 3.033 ± 1.239
LeuSer: 7.366 ± 1.438
LeuThr: 5.199 ± 0.69
LeuVal: 4.766 ± 1.262
LeuTrp: 0.433 ± 0.397
LeuTyr: 4.333 ± 1.47
LeuXaa: 0.0 ± 0.0
Met
MetAla: 1.733 ± 0.842
MetCys: 0.433 ± 0.397
MetAsp: 1.3 ± 0.679
MetGlu: 0.867 ± 0.533
MetPhe: 0.433 ± 0.397
MetGly: 0.433 ± 0.397
MetHis: 0.0 ± 0.0
MetIle: 0.433 ± 0.673
MetLys: 0.433 ± 0.363
MetLeu: 1.733 ± 0.759
MetMet: 0.433 ± 0.407
MetAsn: 0.433 ± 0.388
MetPro: 0.867 ± 0.385
MetGln: 1.3 ± 0.443
MetArg: 0.867 ± 0.488
MetSer: 2.166 ± 1.474
MetThr: 0.433 ± 0.397
MetVal: 1.3 ± 1.164
MetTrp: 0.0 ± 0.0
MetTyr: 0.433 ± 0.363
MetXaa: 0.0 ± 0.0
Asn
AsnAla: 4.333 ± 1.896
AsnCys: 2.166 ± 2.039
AsnAsp: 0.867 ± 0.458
AsnGlu: 1.733 ± 1.028
AsnPhe: 2.6 ± 1.327
AsnGly: 3.033 ± 1.008
AsnHis: 0.867 ± 0.766
AsnIle: 3.033 ± 1.354
AsnLys: 4.333 ± 1.525
AsnLeu: 4.333 ± 1.401
AsnMet: 0.433 ± 0.539
AsnAsn: 2.6 ± 1.401
AsnPro: 3.033 ± 1.383
AsnGln: 2.166 ± 0.633
AsnArg: 3.899 ± 0.877
AsnSer: 3.033 ± 1.766
AsnThr: 2.6 ± 1.515
AsnVal: 4.766 ± 1.269
AsnTrp: 1.3 ± 0.443
AsnTyr: 1.3 ± 0.678
AsnXaa: 0.0 ± 0.0
Pro
ProAla: 3.899 ± 1.135
ProCys: 0.433 ± 0.397
ProAsp: 5.199 ± 1.803
ProGlu: 3.899 ± 2.032
ProPhe: 1.3 ± 0.782
ProGly: 0.867 ± 0.794
ProHis: 0.0 ± 0.0
ProIle: 2.6 ± 1.759
ProLys: 3.466 ± 0.962
ProLeu: 4.766 ± 1.442
ProMet: 0.433 ± 0.363
ProAsn: 2.6 ± 1.421
ProPro: 5.199 ± 2.172
ProGln: 2.166 ± 0.759
ProArg: 2.6 ± 1.561
ProSer: 6.499 ± 2.904
ProThr: 4.333 ± 1.813
ProVal: 3.899 ± 0.833
ProTrp: 0.0 ± 0.0
ProTyr: 2.6 ± 1.515
ProXaa: 0.0 ± 0.0
Gln
GlnAla: 3.899 ± 1.328
GlnCys: 1.733 ± 0.932
GlnAsp: 3.899 ± 0.848
GlnGlu: 3.899 ± 1.603
GlnPhe: 3.033 ± 0.842
GlnGly: 2.6 ± 0.883
GlnHis: 0.433 ± 0.397
GlnIle: 2.6 ± 1.307
GlnLys: 1.733 ± 0.592
GlnLeu: 6.066 ± 1.058
GlnMet: 1.3 ± 0.443
GlnAsn: 2.166 ± 1.083
GlnPro: 2.166 ± 0.848
GlnGln: 2.166 ± 0.747
GlnArg: 0.867 ± 0.533
GlnSer: 2.6 ± 1.061
GlnThr: 0.867 ± 0.468
GlnVal: 1.3 ± 0.663
GlnTrp: 0.867 ± 0.767
GlnTyr: 3.033 ± 1.759
GlnXaa: 0.0 ± 0.0
Arg
ArgAla: 2.166 ± 0.889
ArgCys: 2.6 ± 1.061
ArgAsp: 2.166 ± 1.024
ArgGlu: 2.6 ± 0.575
ArgPhe: 3.466 ± 1.024
ArgGly: 2.166 ± 0.798
ArgHis: 1.733 ± 0.93
ArgIle: 2.166 ± 0.94
ArgLys: 3.466 ± 0.795
ArgLeu: 5.199 ± 1.085
ArgMet: 1.3 ± 0.971
ArgAsn: 3.033 ± 1.025
ArgPro: 3.466 ± 1.75
ArgGln: 2.166 ± 0.447
ArgArg: 4.766 ± 1.796
ArgSer: 3.466 ± 0.694
ArgThr: 2.166 ± 1.121
ArgVal: 3.899 ± 1.102
ArgTrp: 0.433 ± 0.407
ArgTyr: 1.3 ± 0.443
ArgXaa: 0.0 ± 0.0
Ser
SerAla: 4.333 ± 1.564
SerCys: 0.867 ± 0.767
SerAsp: 4.333 ± 1.081
SerGlu: 6.066 ± 2.216
SerPhe: 3.033 ± 1.198
SerGly: 6.932 ± 1.687
SerHis: 2.166 ± 0.447
SerIle: 4.333 ± 1.494
SerLys: 3.899 ± 1.528
SerLeu: 6.932 ± 2.014
SerMet: 1.733 ± 0.77
SerAsn: 5.633 ± 1.764
SerPro: 4.333 ± 0.982
SerGln: 5.633 ± 1.842
SerArg: 4.333 ± 1.53
SerSer: 10.832 ± 4.574
SerThr: 4.766 ± 1.417
SerVal: 7.366 ± 1.108
SerTrp: 0.867 ± 0.776
SerTyr: 0.867 ± 0.385
SerXaa: 0.0 ± 0.0
Thr
ThrAla: 2.6 ± 1.776
ThrCys: 1.3 ± 1.348
ThrAsp: 3.466 ± 0.631
ThrGlu: 4.333 ± 0.555
ThrPhe: 3.033 ± 1.435
ThrGly: 2.166 ± 0.473
ThrHis: 0.0 ± 0.0
ThrIle: 3.033 ± 1.141
ThrLys: 2.6 ± 0.494
ThrLeu: 5.633 ± 1.847
ThrMet: 0.0 ± 0.0
ThrAsn: 3.466 ± 0.688
ThrPro: 3.033 ± 0.684
ThrGln: 1.733 ± 0.722
ThrArg: 4.766 ± 1.506
ThrSer: 3.033 ± 1.258
ThrThr: 3.899 ± 0.782
ThrVal: 5.633 ± 1.349
ThrTrp: 0.433 ± 0.407
ThrTyr: 1.3 ± 0.861
ThrXaa: 0.0 ± 0.0
Val
ValAla: 3.899 ± 0.534
ValCys: 1.733 ± 2.316
ValAsp: 5.633 ± 0.897
ValGlu: 4.333 ± 1.324
ValPhe: 3.466 ± 1.094
ValGly: 3.899 ± 1.506
ValHis: 1.733 ± 0.722
ValIle: 3.033 ± 1.085
ValLys: 2.6 ± 0.789
ValLeu: 4.333 ± 1.277
ValMet: 0.867 ± 0.385
ValAsn: 3.033 ± 0.809
ValPro: 5.199 ± 1.185
ValGln: 3.466 ± 0.888
ValArg: 1.733 ± 1.711
ValSer: 6.499 ± 1.562
ValThr: 4.333 ± 1.879
ValVal: 1.733 ± 0.601
ValTrp: 1.3 ± 0.861
ValTyr: 1.733 ± 0.686
ValXaa: 0.0 ± 0.0
Trp
TrpAla: 0.867 ± 0.776
TrpCys: 0.0 ± 0.0
TrpAsp: 0.867 ± 0.533
TrpGlu: 0.433 ± 0.407
TrpPhe: 0.0 ± 0.0
TrpGly: 0.433 ± 0.388
TrpHis: 0.0 ± 0.0
TrpIle: 0.433 ± 0.388
TrpLys: 1.3 ± 0.443
TrpLeu: 1.733 ± 0.64
TrpMet: 0.433 ± 0.397
TrpAsn: 0.433 ± 0.397
TrpPro: 1.3 ± 1.191
TrpGln: 1.3 ± 0.443
TrpArg: 1.3 ± 1.012
TrpSer: 0.0 ± 0.0
TrpThr: 0.867 ± 0.814
TrpVal: 0.867 ± 0.729
TrpTrp: 0.0 ± 0.0
TrpTyr: 0.867 ± 0.488
TrpXaa: 0.0 ± 0.0
Tyr
TyrAla: 2.6 ± 0.927
TyrCys: 0.433 ± 0.693
TyrAsp: 2.6 ± 0.928
TyrGlu: 1.3 ± 0.741
TyrPhe: 2.6 ± 0.896
TyrGly: 2.6 ± 0.643
TyrHis: 0.0 ± 0.0
TyrIle: 1.3 ± 0.663
TyrLys: 2.6 ± 1.359
TyrLeu: 3.466 ± 1.127
TyrMet: 0.433 ± 0.388
TyrAsn: 2.6 ± 0.577
TyrPro: 1.3 ± 0.508
TyrGln: 2.166 ± 1.325
TyrArg: 1.733 ± 0.789
TyrSer: 3.899 ± 1.709
TyrThr: 1.3 ± 1.221
TyrVal: 0.433 ± 0.397
TyrTrp: 1.3 ± 0.729
TyrTyr: 3.033 ± 1.19
TyrXaa: 0.0 ± 0.0
Xaa
XaaAla: 0.0 ± 0.0
XaaCys: 0.0 ± 0.0
XaaAsp: 0.0 ± 0.0
XaaGlu: 0.0 ± 0.0
XaaPhe: 0.0 ± 0.0
XaaGly: 0.0 ± 0.0
XaaHis: 0.0 ± 0.0
XaaIle: 0.0 ± 0.0
XaaLys: 0.0 ± 0.0
XaaLeu: 0.0 ± 0.0
XaaMet: 0.0 ± 0.0
XaaAsn: 0.0 ± 0.0
XaaPro: 0.0 ± 0.0
XaaGln: 0.0 ± 0.0
XaaArg: 0.0 ± 0.0
XaaSer: 0.0 ± 0.0
XaaThr: 0.0 ± 0.0
XaaVal: 0.0 ± 0.0
XaaTrp: 0.0 ± 0.0
XaaTyr: 0.0 ± 0.0
XaaXaa: 0.0 ± 0.0
Statistics based on 6 proteins (2309 amino acids)

Note: The error was estimated by bootstrapping (100 resamples) at the protein level.
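
A protein-level bootstrap of this kind might look like the sketch below: whole proteins are resampled with replacement, the dipeptide frequency is recomputed for each resample, and the spread of those estimates is reported. This is a hedged illustration only. The function names, the toy sequences, the fixed seed, and the use of the sample standard deviation as the reported error are assumptions; the page states only that bootstrapping (x100) was done at the protein level.

```python
import random
import statistics
from collections import Counter


def dipeptide_permille(sequences, pair):
    """Frequency of one dipeptide (per mille) over all overlapping pairs."""
    counts = Counter()
    for seq in sequences:
        for i in range(len(seq) - 1):
            counts[seq[i:i + 2]] += 1
    total = sum(counts.values())
    return 1000.0 * counts[pair] / total if total else 0.0


def bootstrap_error(sequences, pair, n_resamples=100, seed=0):
    """Spread of the frequency estimate under protein-level resampling.

    Assumption: the error is the sample standard deviation of the resampled
    estimates; the source does not say which statistic it reports.
    """
    rng = random.Random(seed)
    estimates = []
    for _ in range(n_resamples):
        # Draw as many proteins as in the original set, with replacement.
        sample = rng.choices(sequences, k=len(sequences))
        estimates.append(dipeptide_permille(sample, pair))
    return statistics.stdev(estimates)


# Hypothetical toy sequences in one-letter code (not taken from the proteome).
toy_proteins = ["MKLLA", "MKA", "LLLL"]
value = dipeptide_permille(toy_proteins, "LL")
error = bootstrap_error(toy_proteins, "LL")
print(f"LL: {value:.3f} ± {error:.3f}")
```

Resampling proteins rather than individual residues keeps each sequence intact, which matters for a small proteome such as this one, where a single protein can dominate a given dipeptide count.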

The dipeptide statistics above (along with other statistics for this proteome) can be downloaded from this CSV file
See this proteome in: uniprot_link
Proteome-pI is available under the Creative Commons Attribution-NoDerivs license; for more details see here

Reference: Kozlowski LP. Proteome-pI 2.0: Proteome Isoelectric Point Database Update. Nucleic Acids Res. 2021, doi: 10.1093/nar/gkab944
Contact: Lukasz P. Kozlowski