Amino acid dipeptide frequency for Tai Forest hepadnavirus

Apart from single amino acid frequencies, one can also calculate the so-called amino acid dipeptide frequency.

There are 400 possible dipeptides (441 if Xaa is counted as an additional 21st amino acid representing all non-standard or unknown amino acids). Thus, if dipeptides occurred randomly and with equal probability in proteins, each dipeptide would be expected at a frequency of 0.25% (2.5‰). As this is not the case in nature, for better visibility all dipeptides that are more frequent than expected are marked in red, and those that are underrepresented are marked in blue in the table.
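The counting described above can be sketched in a few lines of Python. This is a minimal illustration, not the code used by the database: the function names, the one-letter input sequences, and the choice to count overlapping pairs within each protein (never across protein boundaries) are all assumptions, and the database's exact normalization may differ.

```python
from collections import Counter

STANDARD = "ACDEFGHIKLMNPQRSTVWY"   # the 20 standard amino acids, one-letter codes
EXPECTED_PERMILLE = 1000.0 / 400    # uniform expectation: 2.5 per mille = 0.25 %


def dipeptide_permille(proteins):
    """Return {dipeptide: frequency in per mille} over overlapping residue pairs.

    Assumption: pairs are counted inside each protein only, and any letter
    outside the 20 standard ones is mapped to 'X' (Xaa).
    """
    counts = Counter()
    for seq in proteins:
        seq = "".join(c if c in STANDARD else "X" for c in seq.upper())
        counts.update(seq[i:i + 2] for i in range(len(seq) - 1))
    total = sum(counts.values())
    if total == 0:
        return {}
    return {pair: 1000.0 * n / total for pair, n in counts.items()}


def classify(freq_permille):
    """Label a frequency relative to the uniform expectation (red/blue in the table)."""
    if freq_permille > EXPECTED_PERMILLE:
        return "over"    # would be marked in red
    if freq_permille < EXPECTED_PERMILLE:
        return "under"   # would be marked in blue
    return "expected"


if __name__ == "__main__":
    toy = ["AAAC", "LL"]  # hypothetical toy sequences, not real proteins
    for pair, freq in sorted(dipeptide_permille(toy).items()):
        print(pair, round(freq, 3), classify(freq))
```

With this classification, a value such as 20.611‰ would count as overrepresented and 0.711‰ as underrepresented.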

All values are presented in per mille (‰); multiply them by 10^-3 to obtain fractions.
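As a small worked example of the unit conversion, the following sketch turns a per mille value into a fraction and into a percentage. The helper names are hypothetical and only serve the illustration.

```python
def permille_to_fraction(value):
    """Convert a per mille value to a plain fraction (multiply by 10^-3)."""
    return value * 1e-3


def permille_to_percent(value):
    """Convert a per mille value to percent (divide by 10)."""
    return value / 10.0


if __name__ == "__main__":
    # 2.5 per mille is the uniform expectation for 400 dipeptides (0.25 %).
    print(permille_to_fraction(2.5), permille_to_percent(2.5))
```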

For more information, see the sequence space article on Wikipedia.

Ala Cys Asp Glu Phe Gly His Ile Lys Leu Met Asn Pro Gln Arg Ser Thr Val Trp Tyr Xaa
Ala
AlaAla: 7.107 ± 3.019
AlaCys: 0.711 ± 0.428
AlaAsp: 4.264 ± 1.841
AlaGlu: 2.843 ± 1.711
AlaPhe: 3.554 ± 2.139
AlaGly: 2.843 ± 3.271
AlaHis: 0.0 ± 0.0
AlaIle: 1.421 ± 1.366
AlaLys: 0.711 ± 0.428
AlaLeu: 7.818 ± 0.987
AlaMet: 0.0 ± 0.0
AlaAsn: 0.711 ± 0.987
AlaPro: 4.975 ± 0.285
AlaGln: 1.421 ± 0.776
AlaArg: 6.397 ± 1.822
AlaSer: 6.397 ± 2.094
AlaThr: 3.554 ± 1.576
AlaVal: 1.421 ± 0.856
AlaTrp: 2.132 ± 0.886
AlaTyr: 0.711 ± 0.987
AlaXaa: 0.0 ± 0.0
Cys
CysAla: 3.554 ± 1.747
CysCys: 1.421 ± 1.99
CysAsp: 1.421 ± 0.776
CysGlu: 0.0 ± 0.0
CysPhe: 0.711 ± 0.428
CysGly: 1.421 ± 0.856
CysHis: 0.0 ± 0.0
CysIle: 0.0 ± 0.0
CysLys: 1.421 ± 0.776
CysLeu: 6.397 ± 2.953
CysMet: 0.711 ± 1.303
CysAsn: 0.0 ± 0.0
CysPro: 2.132 ± 2.177
CysGln: 1.421 ± 0.856
CysArg: 2.843 ± 1.974
CysSer: 3.554 ± 1.497
CysThr: 3.554 ± 2.959
CysVal: 0.711 ± 0.937
CysTrp: 1.421 ± 0.804
CysTyr: 0.0 ± 0.0
CysXaa: 0.0 ± 0.0
Asp
AspAla: 4.264 ± 1.732
AspCys: 0.711 ± 0.995
AspAsp: 1.421 ± 0.856
AspGlu: 0.711 ± 0.995
AspPhe: 2.843 ± 0.904
AspGly: 0.711 ± 0.428
AspHis: 1.421 ± 0.776
AspIle: 0.711 ± 0.428
AspLys: 0.711 ± 0.428
AspLeu: 7.107 ± 2.502
AspMet: 0.711 ± 0.987
AspAsn: 0.711 ± 0.428
AspPro: 2.843 ± 0.904
AspGln: 0.711 ± 0.937
AspArg: 1.421 ± 0.856
AspSer: 1.421 ± 0.776
AspThr: 0.711 ± 0.428
AspVal: 2.843 ± 1.045
AspTrp: 1.421 ± 0.856
AspTyr: 0.711 ± 0.428
AspXaa: 0.0 ± 0.0
Glu
GluAla: 1.421 ± 0.804
GluCys: 0.711 ± 0.428
GluAsp: 2.843 ± 1.134
GluGlu: 2.843 ± 2.723
GluPhe: 1.421 ± 1.973
GluGly: 2.132 ± 2.177
GluHis: 2.843 ± 1.608
GluIle: 0.0 ± 0.0
GluLys: 1.421 ± 0.856
GluLeu: 2.132 ± 0.828
GluMet: 0.0 ± 0.0
GluAsn: 0.0 ± 0.0
GluPro: 2.132 ± 1.283
GluGln: 1.421 ± 0.856
GluArg: 0.711 ± 0.937
GluSer: 2.132 ± 0.762
GluThr: 2.843 ± 1.045
GluVal: 0.0 ± 0.0
GluTrp: 0.711 ± 0.428
GluTyr: 0.711 ± 0.428
GluXaa: 0.0 ± 0.0
Phe
PheAla: 4.264 ± 1.08
PheCys: 1.421 ± 0.856
PheAsp: 0.0 ± 0.0
PheGlu: 0.0 ± 0.0
PhePhe: 2.132 ± 0.984
PheGly: 4.975 ± 2.193
PheHis: 2.132 ± 2.22
PheIle: 2.843 ± 1.552
PheLys: 2.132 ± 1.283
PheLeu: 5.686 ± 1.323
PheMet: 1.421 ± 0.805
PheAsn: 1.421 ± 0.856
PhePro: 4.975 ± 1.031
PheGln: 0.0 ± 0.0
PheArg: 2.843 ± 1.711
PheSer: 4.264 ± 1.646
PheThr: 3.554 ± 1.468
PheVal: 2.843 ± 1.711
PheTrp: 0.711 ± 0.995
PheTyr: 1.421 ± 0.805
PheXaa: 0.0 ± 0.0
Gly
GlyAla: 2.843 ± 0.904
GlyCys: 1.421 ± 0.805
GlyAsp: 0.711 ± 0.428
GlyGlu: 2.132 ± 1.283
GlyPhe: 2.132 ± 1.283
GlyGly: 3.554 ± 0.934
GlyHis: 0.0 ± 0.0
GlyIle: 4.264 ± 1.523
GlyLys: 2.132 ± 0.828
GlyLeu: 12.793 ± 3.914
GlyMet: 0.711 ± 0.428
GlyAsn: 4.264 ± 1.197
GlyPro: 4.975 ± 2.325
GlyGln: 1.421 ± 0.804
GlyArg: 4.264 ± 1.603
GlySer: 4.264 ± 1.732
GlyThr: 5.686 ± 0.485
GlyVal: 2.843 ± 0.961
GlyTrp: 1.421 ± 0.805
GlyTyr: 2.843 ± 1.552
GlyXaa: 0.0 ± 0.0
His
HisAla: 0.0 ± 0.0
HisCys: 2.132 ± 0.828
HisAsp: 0.711 ± 0.428
HisGlu: 0.0 ± 0.0
HisPhe: 0.711 ± 0.428
HisGly: 1.421 ± 1.875
HisHis: 1.421 ± 0.804
HisIle: 0.711 ± 0.428
HisLys: 3.554 ± 1.468
HisLeu: 6.397 ± 2.177
HisMet: 0.0 ± 0.0
HisAsn: 0.711 ± 0.428
HisPro: 2.843 ± 1.711
HisGln: 0.0 ± 0.0
HisArg: 2.132 ± 0.886
HisSer: 2.132 ± 0.762
HisThr: 2.843 ± 2.723
HisVal: 1.421 ± 0.805
HisTrp: 0.0 ± 0.0
HisTyr: 0.711 ± 0.428
HisXaa: 0.0 ± 0.0
Ile
IleAla: 1.421 ± 1.99
IleCys: 0.711 ± 0.428
IleAsp: 1.421 ± 0.856
IleGlu: 0.0 ± 0.0
IlePhe: 0.711 ± 0.995
IleGly: 1.421 ± 1.366
IleHis: 2.132 ± 1.283
IleIle: 1.421 ± 0.776
IleLys: 0.711 ± 0.428
IleLeu: 4.975 ± 2.166
IleMet: 0.0 ± 0.0
IleAsn: 0.711 ± 0.428
IlePro: 5.686 ± 4.223
IleGln: 0.711 ± 0.428
IleArg: 1.421 ± 1.973
IleSer: 2.843 ± 1.348
IleThr: 6.397 ± 2.393
IleVal: 1.421 ± 1.366
IleTrp: 1.421 ± 1.99
IleTyr: 1.421 ± 0.776
IleXaa: 0.0 ± 0.0
Lys
LysAla: 1.421 ± 0.856
LysCys: 0.711 ± 0.428
LysAsp: 0.0 ± 0.0
LysGlu: 2.843 ± 1.608
LysPhe: 1.421 ± 0.856
LysGly: 2.132 ± 0.762
LysHis: 0.711 ± 0.428
LysIle: 2.132 ± 0.762
LysLys: 1.421 ± 0.856
LysLeu: 4.264 ± 1.081
LysMet: 0.0 ± 0.0
LysAsn: 2.132 ± 1.283
LysPro: 1.421 ± 0.856
LysGln: 0.711 ± 0.428
LysArg: 0.711 ± 0.428
LysSer: 2.132 ± 0.762
LysThr: 2.843 ± 1.711
LysVal: 0.711 ± 0.428
LysTrp: 1.421 ± 0.856
LysTyr: 2.132 ± 1.283
LysXaa: 0.0 ± 0.0
Leu
LeuAla: 6.397 ± 2.285
LeuCys: 4.264 ± 2.924
LeuAsp: 4.975 ± 1.468
LeuGlu: 3.554 ± 1.497
LeuPhe: 5.686 ± 0.987
LeuGly: 8.529 ± 3.261
LeuHis: 4.975 ± 2.234
LeuIle: 4.264 ± 3.11
LeuLys: 2.843 ± 1.711
LeuLeu: 20.611 ± 5.55
LeuMet: 2.132 ± 0.917
LeuAsn: 7.818 ± 0.987
LeuPro: 7.818 ± 1.942
LeuGln: 4.264 ± 2.328
LeuArg: 10.661 ± 5.1
LeuSer: 11.372 ± 2.765
LeuThr: 7.107 ± 1.281
LeuVal: 8.529 ± 2.264
LeuTrp: 6.397 ± 1.319
LeuTyr: 4.264 ± 2.567
LeuXaa: 0.0 ± 0.0
Met
MetAla: 1.421 ± 1.389
MetCys: 0.711 ± 0.995
MetAsp: 1.421 ± 0.804
MetGlu: 0.0 ± 0.0
MetPhe: 0.0 ± 0.0
MetGly: 3.554 ± 0.932
MetHis: 2.132 ± 0.828
MetIle: 0.0 ± 0.0
MetLys: 0.0 ± 0.0
MetLeu: 0.711 ± 0.428
MetMet: 0.711 ± 0.995
MetAsn: 0.711 ± 0.937
MetPro: 1.421 ± 0.856
MetGln: 1.421 ± 1.389
MetArg: 0.711 ± 0.937
MetSer: 0.711 ± 0.428
MetThr: 0.0 ± 0.0
MetVal: 0.0 ± 0.0
MetTrp: 0.711 ± 0.995
MetTyr: 0.0 ± 0.0
MetXaa: 0.0 ± 0.0
Asn
AsnAla: 1.421 ± 0.804
AsnCys: 2.132 ± 2.166
AsnAsp: 0.711 ± 0.987
AsnGlu: 0.0 ± 0.0
AsnPhe: 1.421 ± 0.805
AsnGly: 1.421 ± 0.856
AsnHis: 0.711 ± 0.987
AsnIle: 2.843 ± 0.968
AsnLys: 0.711 ± 0.428
AsnLeu: 4.975 ± 2.995
AsnMet: 0.0 ± 0.0
AsnAsn: 0.0 ± 0.0
AsnPro: 2.843 ± 1.711
AsnGln: 2.132 ± 0.828
AsnArg: 2.843 ± 1.134
AsnSer: 2.132 ± 0.762
AsnThr: 0.711 ± 0.428
AsnVal: 2.132 ± 0.828
AsnTrp: 0.0 ± 0.0
AsnTyr: 2.843 ± 1.711
AsnXaa: 0.0 ± 0.0
Pro
ProAla: 4.975 ± 1.958
ProCys: 1.421 ± 0.805
ProAsp: 0.0 ± 0.0
ProGlu: 3.554 ± 1.365
ProPhe: 6.397 ± 2.222
ProGly: 4.975 ± 1.681
ProHis: 2.132 ± 0.828
ProIle: 4.975 ± 1.031
ProLys: 0.711 ± 0.428
ProLeu: 9.24 ± 2.409
ProMet: 2.132 ± 0.854
ProAsn: 2.843 ± 1.045
ProPro: 4.264 ± 1.732
ProGln: 1.421 ± 0.805
ProArg: 5.686 ± 1.997
ProSer: 11.372 ± 1.333
ProThr: 4.975 ± 2.166
ProVal: 7.107 ± 2.129
ProTrp: 1.421 ± 0.776
ProTyr: 1.421 ± 0.804
ProXaa: 0.0 ± 0.0
Gln
GlnAla: 2.132 ± 0.828
GlnCys: 0.0 ± 0.0
GlnAsp: 4.264 ± 1.081
GlnGlu: 0.0 ± 0.0
GlnPhe: 1.421 ± 0.856
GlnGly: 1.421 ± 0.805
GlnHis: 0.0 ± 0.0
GlnIle: 0.0 ± 0.0
GlnLys: 2.132 ± 1.283
GlnLeu: 3.554 ± 1.672
GlnMet: 0.0 ± 0.0
GlnAsn: 0.0 ± 0.0
GlnPro: 2.843 ± 1.552
GlnGln: 0.711 ± 0.987
GlnArg: 2.843 ± 1.711
GlnSer: 4.264 ± 2.924
GlnThr: 0.0 ± 0.0
GlnVal: 2.843 ± 0.661
GlnTrp: 1.421 ± 0.776
GlnTyr: 0.0 ± 0.0
GlnXaa: 0.0 ± 0.0
Arg
ArgAla: 2.132 ± 0.828
ArgCys: 2.843 ± 1.974
ArgAsp: 0.0 ± 0.0
ArgGlu: 0.711 ± 0.428
ArgPhe: 5.686 ± 1.499
ArgGly: 5.686 ± 0.485
ArgHis: 1.421 ± 0.805
ArgIle: 1.421 ± 0.856
ArgLys: 2.843 ± 1.711
ArgLeu: 7.818 ± 3.386
ArgMet: 1.421 ± 1.389
ArgAsn: 1.421 ± 0.805
ArgPro: 2.132 ± 1.091
ArgGln: 2.132 ± 0.828
ArgArg: 13.504 ± 8.407
ArgSer: 9.95 ± 4.114
ArgThr: 5.686 ± 1.997
ArgVal: 4.264 ± 1.657
ArgTrp: 4.264 ± 1.751
ArgTyr: 0.711 ± 0.995
ArgXaa: 0.0 ± 0.0
Ser
SerAla: 3.554 ± 1.477
SerCys: 4.975 ± 3.348
SerAsp: 3.554 ± 1.365
SerGlu: 2.843 ± 0.968
SerPhe: 7.107 ± 0.773
SerGly: 3.554 ± 0.9
SerHis: 2.843 ± 1.134
SerIle: 2.132 ± 2.334
SerLys: 1.421 ± 1.366
SerLeu: 11.372 ± 4.315
SerMet: 0.711 ± 0.428
SerAsn: 2.843 ± 1.608
SerPro: 13.504 ± 2.637
SerGln: 4.975 ± 1.836
SerArg: 6.397 ± 2.225
SerSer: 6.397 ± 1.851
SerThr: 4.264 ± 1.657
SerVal: 2.843 ± 0.961
SerTrp: 2.843 ± 2.716
SerTyr: 1.421 ± 0.856
SerXaa: 0.0 ± 0.0
Thr
ThrAla: 4.975 ± 1.836
ThrCys: 4.264 ± 3.927
ThrAsp: 1.421 ± 0.856
ThrGlu: 1.421 ± 0.856
ThrPhe: 2.132 ± 0.828
ThrGly: 4.975 ± 2.038
ThrHis: 1.421 ± 0.856
ThrIle: 3.554 ± 2.537
ThrLys: 2.843 ± 0.961
ThrLeu: 4.975 ± 1.958
ThrMet: 0.711 ± 0.428
ThrAsn: 2.843 ± 1.711
ThrPro: 5.686 ± 1.146
ThrGln: 2.843 ± 0.961
ThrArg: 4.264 ± 1.252
ThrSer: 4.975 ± 1.681
ThrThr: 2.132 ± 1.283
ThrVal: 2.843 ± 3.142
ThrTrp: 2.132 ± 1.749
ThrTyr: 2.132 ± 1.091
ThrXaa: 0.0 ± 0.0
Val
ValAla: 2.132 ± 1.283
ValCys: 1.421 ± 0.776
ValAsp: 4.264 ± 0.687
ValGlu: 1.421 ± 1.973
ValPhe: 1.421 ± 0.856
ValGly: 6.397 ± 1.09
ValHis: 1.421 ± 0.856
ValIle: 1.421 ± 0.804
ValLys: 0.711 ± 0.428
ValLeu: 6.397 ± 3.39
ValMet: 2.132 ± 0.828
ValAsn: 1.421 ± 0.856
ValPro: 5.686 ± 2.269
ValGln: 1.421 ± 1.973
ValArg: 4.264 ± 1.657
ValSer: 4.975 ± 1.031
ValThr: 1.421 ± 1.366
ValVal: 2.843 ± 1.711
ValTrp: 0.711 ± 0.995
ValTyr: 0.0 ± 0.0
ValXaa: 0.0 ± 0.0
Trp
TrpAla: 1.421 ± 0.776
TrpCys: 0.0 ± 0.0
TrpAsp: 0.711 ± 0.987
TrpGlu: 3.554 ± 0.9
TrpPhe: 1.421 ± 1.366
TrpGly: 3.554 ± 1.477
TrpHis: 0.711 ± 0.937
TrpIle: 2.843 ± 1.809
TrpLys: 2.132 ± 1.283
TrpLeu: 4.975 ± 2.138
TrpMet: 1.421 ± 1.99
TrpAsn: 0.0 ± 0.0
TrpPro: 1.421 ± 0.856
TrpGln: 0.711 ± 0.937
TrpArg: 0.0 ± 0.0
TrpSer: 2.132 ± 1.695
TrpThr: 2.843 ± 1.552
TrpVal: 1.421 ± 0.804
TrpTrp: 1.421 ± 0.776
TrpTyr: 0.711 ± 0.995
TrpXaa: 0.0 ± 0.0
Tyr
TyrAla: 2.132 ± 1.283
TyrCys: 1.421 ± 0.856
TyrAsp: 0.0 ± 0.0
TyrGlu: 0.0 ± 0.0
TyrPhe: 0.711 ± 0.995
TyrGly: 0.0 ± 0.0
TyrHis: 1.421 ± 0.805
TyrIle: 0.0 ± 0.0
TyrLys: 0.711 ± 0.987
TyrLeu: 4.264 ± 1.523
TyrMet: 0.711 ± 0.428
TyrAsn: 1.421 ± 0.856
TyrPro: 1.421 ± 0.856
TyrGln: 0.0 ± 0.0
TyrArg: 2.132 ± 0.984
TyrSer: 2.132 ± 1.283
TyrThr: 1.421 ± 0.856
TyrVal: 2.843 ± 0.904
TyrTrp: 1.421 ± 0.776
TyrTyr: 0.711 ± 0.428
TyrXaa: 0.0 ± 0.0
Xaa
XaaAla: 0.0 ± 0.0
XaaCys: 0.0 ± 0.0
XaaAsp: 0.0 ± 0.0
XaaGlu: 0.0 ± 0.0
XaaPhe: 0.0 ± 0.0
XaaGly: 0.0 ± 0.0
XaaHis: 0.0 ± 0.0
XaaIle: 0.0 ± 0.0
XaaLys: 0.0 ± 0.0
XaaLeu: 0.0 ± 0.0
XaaMet: 0.0 ± 0.0
XaaAsn: 0.0 ± 0.0
XaaPro: 0.0 ± 0.0
XaaGln: 0.0 ± 0.0
XaaArg: 0.0 ± 0.0
XaaSer: 0.0 ± 0.0
XaaThr: 0.0 ± 0.0
XaaVal: 0.0 ± 0.0
XaaTrp: 0.0 ± 0.0
XaaTyr: 0.0 ± 0.0
XaaXaa: 0.0 ± 0.0
Statistics based on 4 proteins (1408 amino acids)

Note: The error has been estimated by bootstrapping (x100) at the protein level.
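One way to read the note above is as follows: whole proteins are resampled with replacement 100 times, the dipeptide frequency is recomputed for each resample, and the spread of those values is reported as the error. The sketch below illustrates that idea only. The function names, the use of the sample standard deviation as the error, and the fixed random seed are assumptions, and the database's actual procedure may differ in detail.

```python
import random
import statistics
from collections import Counter


def pair_permille(proteins, pair):
    """Frequency of one dipeptide (one-letter codes, e.g. 'LL') in per mille."""
    counts = Counter()
    for seq in proteins:
        counts.update(seq[i:i + 2] for i in range(len(seq) - 1))
    total = sum(counts.values())
    return 1000.0 * counts[pair] / total if total else 0.0


def bootstrap_error(proteins, pair, n_boot=100, seed=0):
    """Standard deviation of the frequency over protein-level bootstrap resamples."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    estimates = []
    for _ in range(n_boot):
        # Resample whole proteins (not residues) with replacement.
        sample = rng.choices(proteins, k=len(proteins))
        estimates.append(pair_permille(sample, pair))
    return statistics.stdev(estimates)


if __name__ == "__main__":
    toy = ["AAAA", "LLLL", "ALAL"]  # hypothetical toy sequences
    print(pair_permille(toy, "AA"), bootstrap_error(toy, "AA"))
```

Under this reading, a dipeptide that never occurs in any protein has a frequency of 0 in every resample, which is consistent with the "0.0 ± 0.0" entries in the table. With only 4 proteins, the resamples differ strongly from one another, so large errors relative to the values are to be expected.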

The dipeptide statistics above (among other statistics for this proteome) can be downloaded from this CSV file
See this proteome in: uniprot_link
Proteome-pI is available under the Creative Commons Attribution-NoDerivs license; for more details see here

Reference: Kozlowski LP. Proteome-pI 2.0: Proteome Isoelectric Point Database Update. Nucleic Acids Res. 2021, doi: 10.1093/nar/gkab944
Contact: Lukasz P. Kozlowski