Amino acid dipeptide frequency for Long-fingered bat hepatitis B virus

Apart from single amino acid frequencies, one can also calculate the so-called amino acid dipeptide frequency.

There are 400 possibilities (441 if we consider Xaa as an additional 21st amino acid representing all non-standard or unknown amino acids). Thus, if dipeptides occurred randomly in proteins, each amino acid dipeptide would be present with a frequency of 0.25% (2.5‰). As this is not the case in nature, for better visibility all dipeptides that are more frequent than expected are marked in red, and those that are underrepresented are marked in blue in the table.
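The counting described above can be sketched in Python as follows. This is a minimal illustration, not the database's actual code: the function names `dipeptide_per_mille` and `classify` are hypothetical, sequences are assumed to use one-letter codes, pairs are assumed not to span two proteins, and the comparison against a flat 2.5‰ expectation is an assumption about how over- and underrepresentation is decided.

```python
from collections import Counter
from itertools import product

STANDARD = "ACDEFGHIKLMNPQRSTVWY"   # 20 standard residues, one-letter codes
ALPHABET = STANDARD + "X"           # X stands for Xaa (non-standard or unknown)
UNIFORM_PER_MILLE = 1000.0 / 400    # 2.5 per mille, i.e. 0.25%


def dipeptide_per_mille(sequences):
    """Return {dipeptide: frequency in per mille} for all 441 residue pairs."""
    counts = Counter()
    for seq in sequences:
        # Overlapping windows of length 2 inside each protein
        # (assumption: pairs never span two proteins).
        for first, second in zip(seq, seq[1:]):
            first = first if first in STANDARD else "X"
            second = second if second in STANDARD else "X"
            counts[first + second] += 1
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no dipeptides found in the input sequences")
    return {a + b: 1000.0 * counts[a + b] / total
            for a, b in product(ALPHABET, repeat=2)}


def classify(per_mille):
    """Label a frequency relative to the uniform expectation (assumed threshold)."""
    if per_mille > UNIFORM_PER_MILLE:
        return "over"
    if per_mille < UNIFORM_PER_MILLE:
        return "under"
    return "expected"


if __name__ == "__main__":
    toy = ["AAAL", "LLA"]  # toy sequences, not the viral proteome
    freqs = dipeptide_per_mille(toy)
    print(freqs["AA"], classify(freqs["AA"]))
```

With real data, `sequences` would hold the protein sequences of the proteome, and the resulting dictionary corresponds to the cells of the table (before error estimation).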

All values are given in per mille (‰) and therefore need to be multiplied by 10⁻³ to obtain fractions.

For more information, see the sequence space article on Wikipedia.
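As a small worked example of the unit conversion, the sketch below turns a per-mille value into a fraction and a percentage. The helper names are hypothetical; the 16.129‰ value is the LeuLeu entry from the table.

```python
def per_mille_to_fraction(value):
    """Convert a per-mille value to a plain fraction (multiply by 10^-3)."""
    return value * 1e-3


def per_mille_to_percent(value):
    """Convert a per-mille value to percent (divide by 10)."""
    return value / 10.0


if __name__ == "__main__":
    leu_leu = 16.129  # LeuLeu frequency in per mille, taken from the table
    print(per_mille_to_fraction(leu_leu))  # fraction of all dipeptides
    print(per_mille_to_percent(leu_leu))   # the same value in percent
    print(per_mille_to_percent(2.5))       # uniform expectation in percent
```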

Ala Cys Asp Glu Phe Gly His Ile Lys Leu Met Asn Pro Gln Arg Ser Thr Val Trp Tyr Xaa
Ala
AlaAla: 6.203 ± 3.681
AlaCys: 0.62 ± 0.398
AlaAsp: 3.722 ± 2.389
AlaGlu: 1.241 ± 0.433
AlaPhe: 1.241 ± 0.796
AlaGly: 3.722 ± 1.862
AlaHis: 1.241 ± 0.796
AlaIle: 3.102 ± 1.106
AlaLys: 1.861 ± 0.919
AlaLeu: 9.926 ± 2.27
AlaMet: 0.62 ± 0.398
AlaAsn: 1.241 ± 0.988
AlaPro: 4.342 ± 0.737
AlaGln: 1.861 ± 0.688
AlaArg: 6.203 ± 2.483
AlaSer: 8.685 ± 2.68
AlaThr: 3.722 ± 1.494
AlaVal: 1.861 ± 0.524
AlaTrp: 1.241 ± 0.796
AlaTyr: 1.861 ± 1.618
AlaXaa: 0.0 ± 0.0
Cys
CysAla: 3.722 ± 3.071
CysCys: 1.241 ± 1.293
CysAsp: 0.0 ± 0.0
CysGlu: 0.0 ± 0.0
CysPhe: 1.241 ± 0.433
CysGly: 1.241 ± 0.796
CysHis: 0.62 ± 0.398
CysIle: 1.241 ± 1.186
CysLys: 0.0 ± 0.0
CysLeu: 4.342 ± 1.814
CysMet: 1.861 ± 1.1
CysAsn: 0.62 ± 0.398
CysPro: 2.481 ± 0.866
CysGln: 1.241 ± 0.988
CysArg: 1.241 ± 0.988
CysSer: 3.102 ± 0.907
CysThr: 3.102 ± 2.302
CysVal: 0.62 ± 0.398
CysTrp: 1.861 ± 0.688
CysTyr: 0.62 ± 0.398
CysXaa: 0.0 ± 0.0
Asp
AspAla: 3.722 ± 1.298
AspCys: 0.0 ± 0.0
AspAsp: 1.241 ± 0.796
AspGlu: 1.241 ± 0.433
AspPhe: 1.241 ± 0.589
AspGly: 0.62 ± 0.647
AspHis: 0.62 ± 0.398
AspIle: 0.62 ± 0.708
AspLys: 1.861 ± 0.524
AspLeu: 3.722 ± 0.864
AspMet: 1.241 ± 1.186
AspAsn: 1.241 ± 0.796
AspPro: 3.722 ± 0.724
AspGln: 1.241 ± 1.186
AspArg: 0.62 ± 0.398
AspSer: 0.62 ± 0.398
AspThr: 1.241 ± 0.589
AspVal: 1.241 ± 0.796
AspTrp: 3.102 ± 1.436
AspTyr: 0.0 ± 0.0
AspXaa: 0.0 ± 0.0
Glu
GluAla: 1.241 ± 0.796
GluCys: 0.0 ± 0.0
GluAsp: 1.241 ± 0.589
GluGlu: 3.102 ± 2.183
GluPhe: 1.241 ± 1.416
GluGly: 3.102 ± 0.582
GluHis: 3.102 ± 1.106
GluIle: 0.62 ± 0.647
GluLys: 3.102 ± 0.821
GluLeu: 3.102 ± 1.68
GluMet: 0.0 ± 0.0
GluAsn: 0.0 ± 0.0
GluPro: 1.241 ± 0.433
GluGln: 1.241 ± 0.796
GluArg: 0.0 ± 0.0
GluSer: 3.102 ± 1.795
GluThr: 1.861 ± 1.442
GluVal: 1.241 ± 0.433
GluTrp: 1.241 ± 0.988
GluTyr: 0.0 ± 0.0
GluXaa: 0.0 ± 0.0
Phe
PheAla: 3.722 ± 1.048
PheCys: 1.861 ± 0.714
PheAsp: 0.62 ± 0.398
PheGlu: 1.241 ± 0.433
PhePhe: 1.861 ± 0.688
PheGly: 1.861 ± 2.124
PheHis: 2.481 ± 1.178
PheIle: 1.241 ± 1.293
PheLys: 1.861 ± 1.195
PheLeu: 7.444 ± 2.988
PheMet: 0.62 ± 0.398
PheAsn: 1.241 ± 0.796
PhePro: 3.722 ± 0.864
PheGln: 0.0 ± 0.0
PheArg: 3.102 ± 0.821
PheSer: 3.102 ± 0.821
PheThr: 2.481 ± 1.258
PheVal: 2.481 ± 1.245
PheTrp: 0.62 ± 0.647
PheTyr: 1.241 ± 0.796
PheXaa: 0.0 ± 0.0
Gly
GlyAla: 3.722 ± 0.638
GlyCys: 2.481 ± 1.949
GlyAsp: 3.102 ± 1.183
GlyGlu: 1.241 ± 0.796
GlyPhe: 4.342 ± 2.682
GlyGly: 6.203 ± 1.519
GlyHis: 1.241 ± 1.186
GlyIle: 4.342 ± 1.322
GlyLys: 2.481 ± 0.866
GlyLeu: 11.166 ± 3.864
GlyMet: 0.62 ± 0.708
GlyAsn: 3.722 ± 2.393
GlyPro: 3.102 ± 1.795
GlyGln: 4.342 ± 0.737
GlyArg: 4.963 ± 0.953
GlySer: 2.481 ± 0.824
GlyThr: 4.342 ± 1.417
GlyVal: 1.241 ± 0.796
GlyTrp: 1.241 ± 0.433
GlyTyr: 0.62 ± 0.398
GlyXaa: 0.0 ± 0.0
His
HisAla: 1.241 ± 0.796
HisCys: 3.102 ± 1.337
HisAsp: 0.0 ± 0.0
HisGlu: 0.0 ± 0.0
HisPhe: 1.861 ± 1.195
HisGly: 3.102 ± 0.821
HisHis: 2.481 ± 0.995
HisIle: 0.62 ± 0.398
HisLys: 1.241 ± 0.988
HisLeu: 6.203 ± 1.164
HisMet: 0.62 ± 0.398
HisAsn: 1.241 ± 0.796
HisPro: 1.861 ± 0.524
HisGln: 1.861 ± 0.524
HisArg: 1.241 ± 0.796
HisSer: 1.861 ± 0.919
HisThr: 4.342 ± 1.685
HisVal: 1.241 ± 0.988
HisTrp: 1.861 ± 1.059
HisTyr: 0.62 ± 0.398
HisXaa: 0.0 ± 0.0
Ile
IleAla: 0.62 ± 0.398
IleCys: 1.241 ± 0.988
IleAsp: 1.241 ± 0.589
IleGlu: 1.241 ± 1.312
IlePhe: 1.241 ± 1.293
IleGly: 3.102 ± 1.799
IleHis: 1.241 ± 0.796
IleIle: 1.241 ± 0.433
IleLys: 0.62 ± 0.398
IleLeu: 2.481 ± 0.498
IleMet: 0.62 ± 0.398
IleAsn: 0.0 ± 0.0
IlePro: 7.444 ± 2.621
IleGln: 1.241 ± 0.433
IleArg: 2.481 ± 1.178
IleSer: 4.342 ± 1.685
IleThr: 4.963 ± 1.376
IleVal: 2.481 ± 0.498
IleTrp: 1.241 ± 1.293
IleTyr: 0.62 ± 0.647
IleXaa: 0.0 ± 0.0
Lys
LysAla: 1.241 ± 0.433
LysCys: 0.0 ± 0.0
LysAsp: 1.241 ± 1.293
LysGlu: 1.861 ± 1.112
LysPhe: 1.241 ± 0.433
LysGly: 3.102 ± 0.821
LysHis: 1.241 ± 0.796
LysIle: 2.481 ± 0.824
LysLys: 0.62 ± 0.647
LysLeu: 3.722 ± 1.722
LysMet: 0.0 ± 0.0
LysAsn: 1.861 ± 1.195
LysPro: 3.722 ± 1.562
LysGln: 2.481 ± 1.593
LysArg: 1.241 ± 0.796
LysSer: 3.722 ± 1.562
LysThr: 2.481 ± 1.593
LysVal: 0.62 ± 0.398
LysTrp: 0.62 ± 0.398
LysTyr: 1.241 ± 0.433
LysXaa: 0.0 ± 0.0
Leu
LeuAla: 7.444 ± 2.516
LeuCys: 6.203 ± 1.609
LeuAsp: 3.102 ± 0.958
LeuGlu: 1.861 ± 0.524
LeuPhe: 3.722 ± 0.864
LeuGly: 8.065 ± 1.949
LeuHis: 2.481 ± 1.593
LeuIle: 6.203 ± 1.164
LeuLys: 2.481 ± 1.593
LeuLeu: 16.129 ± 3.631
LeuMet: 1.241 ± 1.791
LeuAsn: 6.203 ± 1.102
LeuPro: 12.407 ± 1.761
LeuGln: 3.722 ± 1.048
LeuArg: 8.685 ± 3.512
LeuSer: 7.444 ± 1.729
LeuThr: 4.963 ± 1.845
LeuVal: 7.444 ± 1.494
LeuTrp: 4.342 ± 1.845
LeuTyr: 4.342 ± 1.571
LeuXaa: 0.0 ± 0.0
Met
MetAla: 1.861 ± 1.059
MetCys: 0.62 ± 0.398
MetAsp: 1.241 ± 0.589
MetGlu: 0.62 ± 1.071
MetPhe: 0.62 ± 0.398
MetGly: 3.102 ± 0.875
MetHis: 1.241 ± 0.796
MetIle: 0.0 ± 0.0
MetLys: 0.62 ± 1.071
MetLeu: 1.241 ± 0.796
MetMet: 0.62 ± 0.647
MetAsn: 0.62 ± 1.071
MetPro: 1.241 ± 0.433
MetGln: 0.62 ± 0.708
MetArg: 0.62 ± 0.708
MetSer: 0.0 ± 0.0
MetThr: 0.0 ± 0.0
MetVal: 0.0 ± 0.0
MetTrp: 1.241 ± 1.293
MetTyr: 0.62 ± 1.071
MetXaa: 0.0 ± 0.0
Asn
AsnAla: 0.62 ± 0.708
AsnCys: 2.481 ± 1.328
AsnAsp: 1.241 ± 1.312
AsnGlu: 0.62 ± 0.398
AsnPhe: 1.861 ± 2.022
AsnGly: 0.0 ± 0.0
AsnHis: 1.861 ± 1.195
AsnIle: 1.241 ± 0.796
AsnLys: 1.241 ± 0.796
AsnLeu: 3.102 ± 0.875
AsnMet: 1.241 ± 0.433
AsnAsn: 2.481 ± 0.498
AsnPro: 5.583 ± 1.572
AsnGln: 1.861 ± 0.524
AsnArg: 2.481 ± 0.824
AsnSer: 3.102 ± 0.958
AsnThr: 0.0 ± 0.0
AsnVal: 0.62 ± 1.071
AsnTrp: 1.241 ± 0.433
AsnTyr: 1.861 ± 1.195
AsnXaa: 0.0 ± 0.0
Pro
ProAla: 5.583 ± 0.953
ProCys: 1.861 ± 1.059
ProAsp: 1.861 ± 1.195
ProGlu: 4.342 ± 1.134
ProPhe: 4.963 ± 1.648
ProGly: 4.342 ± 1.322
ProHis: 5.583 ± 0.357
ProIle: 3.102 ± 1.417
ProLys: 2.481 ± 0.824
ProLeu: 10.546 ± 1.777
ProMet: 1.241 ± 0.719
ProAsn: 3.722 ± 0.724
ProPro: 8.065 ± 2.445
ProGln: 1.861 ± 0.688
ProArg: 6.203 ± 1.722
ProSer: 7.444 ± 1.729
ProThr: 9.305 ± 3.629
ProVal: 6.824 ± 1.697
ProTrp: 2.481 ± 0.824
ProTyr: 2.481 ± 1.199
ProXaa: 0.0 ± 0.0
Gln
GlnAla: 2.481 ± 1.328
GlnCys: 0.0 ± 0.0
GlnAsp: 2.481 ± 1.199
GlnGlu: 1.241 ± 0.433
GlnPhe: 1.241 ± 0.796
GlnGly: 1.861 ± 1.195
GlnHis: 0.0 ± 0.0
GlnIle: 1.241 ± 1.293
GlnLys: 1.861 ± 1.195
GlnLeu: 6.824 ± 1.243
GlnMet: 0.62 ± 0.398
GlnAsn: 1.861 ± 1.026
GlnPro: 3.102 ± 1.183
GlnGln: 0.62 ± 0.708
GlnArg: 1.861 ± 1.059
GlnSer: 4.342 ± 1.633
GlnThr: 1.241 ± 1.293
GlnVal: 2.481 ± 0.866
GlnTrp: 0.62 ± 0.647
GlnTyr: 0.62 ± 0.398
GlnXaa: 0.0 ± 0.0
Arg
ArgAla: 3.722 ± 1.993
ArgCys: 1.241 ± 1.186
ArgAsp: 1.241 ± 0.796
ArgGlu: 3.102 ± 1.417
ArgPhe: 3.722 ± 0.776
ArgGly: 2.481 ± 0.824
ArgHis: 2.481 ± 1.976
ArgIle: 1.861 ± 1.195
ArgLys: 2.481 ± 0.824
ArgLeu: 4.963 ± 2.893
ArgMet: 0.62 ± 0.708
ArgAsn: 0.62 ± 0.398
ArgPro: 4.963 ± 2.068
ArgGln: 4.342 ± 1.214
ArgArg: 12.407 ± 6.387
ArgSer: 5.583 ± 2.38
ArgThr: 5.583 ± 0.995
ArgVal: 3.722 ± 1.562
ArgTrp: 1.861 ± 0.688
ArgTyr: 0.0 ± 0.0
ArgXaa: 0.0 ± 0.0
Ser
SerAla: 5.583 ± 0.357
SerCys: 1.861 ± 0.688
SerAsp: 0.0 ± 0.0
SerGlu: 1.861 ± 0.714
SerPhe: 4.963 ± 1.229
SerGly: 5.583 ± 4.86
SerHis: 2.481 ± 0.824
SerIle: 3.102 ± 0.582
SerLys: 3.722 ± 1.048
SerLeu: 8.685 ± 1.751
SerMet: 0.62 ± 0.398
SerAsn: 3.722 ± 0.864
SerPro: 9.305 ± 0.892
SerGln: 3.722 ± 1.494
SerArg: 4.963 ± 1.575
SerSer: 9.926 ± 2.862
SerThr: 7.444 ± 2.198
SerVal: 3.722 ± 0.864
SerTrp: 3.102 ± 1.436
SerTyr: 1.241 ± 0.796
SerXaa: 0.0 ± 0.0
Thr
ThrAla: 3.102 ± 2.753
ThrCys: 3.102 ± 1.758
ThrAsp: 2.481 ± 1.659
ThrGlu: 0.0 ± 0.0
ThrPhe: 1.241 ± 0.589
ThrGly: 5.583 ± 1.248
ThrHis: 1.241 ± 0.433
ThrIle: 3.102 ± 2.521
ThrLys: 3.102 ± 1.183
ThrLeu: 4.342 ± 2.11
ThrMet: 0.62 ± 1.071
ThrAsn: 2.481 ± 0.824
ThrPro: 6.824 ± 2.072
ThrGln: 1.241 ± 0.433
ThrArg: 3.102 ± 0.805
ThrSer: 12.407 ± 3.501
ThrThr: 6.824 ± 3.348
ThrVal: 3.722 ± 2.221
ThrTrp: 2.481 ± 1.937
ThrTyr: 2.481 ± 1.258
ThrXaa: 0.0 ± 0.0
Val
ValAla: 5.583 ± 1.808
ValCys: 1.861 ± 1.026
ValAsp: 1.861 ± 0.714
ValGlu: 1.241 ± 0.589
ValPhe: 1.861 ± 0.714
ValGly: 3.722 ± 1.562
ValHis: 3.722 ± 1.86
ValIle: 3.102 ± 0.907
ValLys: 0.0 ± 0.0
ValLeu: 2.481 ± 1.245
ValMet: 0.0 ± 0.0
ValAsn: 1.861 ± 1.195
ValPro: 5.583 ± 1.248
ValGln: 1.861 ± 2.124
ValArg: 2.481 ± 0.498
ValSer: 1.861 ± 0.919
ValThr: 3.102 ± 2.719
ValVal: 2.481 ± 1.258
ValTrp: 0.62 ± 0.647
ValTyr: 1.241 ± 1.293
ValXaa: 0.0 ± 0.0
Trp
TrpAla: 1.861 ± 1.539
TrpCys: 0.0 ± 0.0
TrpAsp: 1.241 ± 1.186
TrpGlu: 3.102 ± 0.805
TrpPhe: 1.861 ± 1.539
TrpGly: 4.342 ± 0.737
TrpHis: 0.0 ± 0.0
TrpIle: 0.0 ± 0.0
TrpLys: 1.861 ± 1.195
TrpLeu: 5.583 ± 1.73
TrpMet: 2.481 ± 1.519
TrpAsn: 0.0 ± 0.0
TrpPro: 3.102 ± 1.436
TrpGln: 0.0 ± 0.0
TrpArg: 1.861 ± 1.026
TrpSer: 0.62 ± 0.647
TrpThr: 1.861 ± 0.524
TrpVal: 1.241 ± 0.589
TrpTrp: 2.481 ± 0.866
TrpTyr: 0.0 ± 0.0
TrpXaa: 0.0 ± 0.0
Tyr
TyrAla: 1.241 ± 0.433
TyrCys: 0.62 ± 0.647
TyrAsp: 0.0 ± 0.0
TyrGlu: 0.62 ± 0.708
TyrPhe: 1.241 ± 0.988
TyrGly: 1.861 ± 1.026
TyrHis: 1.241 ± 0.796
TyrIle: 1.241 ± 0.796
TyrLys: 1.241 ± 0.589
TyrLeu: 3.102 ± 1.183
TyrMet: 0.62 ± 0.398
TyrAsn: 0.0 ± 0.0
TyrPro: 2.481 ± 1.593
TyrGln: 1.241 ± 1.293
TyrArg: 1.241 ± 0.589
TyrSer: 1.861 ± 1.195
TyrThr: 0.62 ± 1.071
TyrVal: 1.241 ± 1.312
TyrTrp: 0.0 ± 0.0
TyrTyr: 0.62 ± 0.398
TyrXaa: 0.0 ± 0.0
Xaa
XaaAla: 0.0 ± 0.0
XaaCys: 0.0 ± 0.0
XaaAsp: 0.0 ± 0.0
XaaGlu: 0.0 ± 0.0
XaaPhe: 0.0 ± 0.0
XaaGly: 0.0 ± 0.0
XaaHis: 0.0 ± 0.0
XaaIle: 0.0 ± 0.0
XaaLys: 0.0 ± 0.0
XaaLeu: 0.0 ± 0.0
XaaMet: 0.0 ± 0.0
XaaAsn: 0.0 ± 0.0
XaaPro: 0.0 ± 0.0
XaaGln: 0.0 ± 0.0
XaaArg: 0.0 ± 0.0
XaaSer: 0.0 ± 0.0
XaaThr: 0.0 ± 0.0
XaaVal: 0.0 ± 0.0
XaaTrp: 0.0 ± 0.0
XaaTyr: 0.0 ± 0.0
XaaXaa: 0.0 ± 0.0
Statistics based on 4 proteins (1613 amino acids)

Note: The error has been estimated by bootstrapping (×100) at the protein level.
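The note above does not spell out the exact procedure, so the following is only a sketch of one common reading of a protein-level bootstrap: whole proteins are resampled with replacement 100 times, the frequency is recomputed for each resample, and the standard deviation of those estimates is reported as the error. The function names, the one-letter sequence encoding, the fixed seed, and the use of the sample standard deviation are all assumptions.

```python
import random
import statistics


def dipeptide_frequency(proteins, dipeptide):
    """Per-mille frequency of one dipeptide over overlapping pairs within each protein."""
    hits = 0
    total = 0
    for seq in proteins:
        for i in range(len(seq) - 1):
            total += 1
            if seq[i:i + 2] == dipeptide:
                hits += 1
    return 1000.0 * hits / total if total else 0.0


def bootstrap_error(proteins, dipeptide, n_resamples=100, seed=0):
    """Standard deviation of the frequency across protein-level resamples."""
    rng = random.Random(seed)  # fixed seed only to make the sketch reproducible
    estimates = []
    for _ in range(n_resamples):
        # Draw as many proteins as in the original set, with replacement.
        resample = rng.choices(proteins, k=len(proteins))
        estimates.append(dipeptide_frequency(resample, dipeptide))
    return statistics.stdev(estimates)


if __name__ == "__main__":
    toy = ["AAAA", "LLLL", "ALAL", "LAAL"]  # toy sequences, not the viral proteome
    print(dipeptide_frequency(toy, "AA"), "±", bootstrap_error(toy, "AA"))
```

Under this reading, a dipeptide that never occurs yields 0.0 ± 0.0, as in the table, and with only a few proteins in the proteome the resampled estimates can vary widely, which would explain the relatively large errors.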

The dipeptide statistics above (among other statistics for this proteome) can be downloaded from this CSV file
See this proteome in: uniprot_link
Proteome-pI is available under the Creative Commons Attribution-NoDerivs license; for more details see here

Reference: Kozlowski LP. Proteome-pI 2.0: Proteome Isoelectric Point Database Update. Nucleic Acids Res. 2021, doi: 10.1093/nar/gkab944
Contact: Lukasz P. Kozlowski