Amino acid dipeptide frequency for Pomona bat hepatitis B virus

Apart from single amino acid frequencies, one can also calculate the so-called amino acid dipeptide frequency.

There are 400 possible dipeptides (441 if Xaa is counted as an additional, 21st amino acid representing all non-standard or unknown amino acids). Thus, if dipeptides occurred randomly in proteins, each of the 400 dipeptides would be present at a frequency of 0.25% (2.5‰). As this is not the case in nature, for better visibility all dipeptides that are more frequent than expected are marked in red in the table, and those that are underrepresented are marked in blue.
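The counting described above can be sketched in Python as follows. This is a minimal illustration, not the code used to build this page: the function names, the one-letter alphabet, the mapping of every non-standard letter to X (Xaa), and the choice to count overlapping pairs within each protein only are all assumptions, and the toy sequences are hypothetical.

```python
from collections import Counter
from itertools import product

STANDARD = "ACDEFGHIKLMNPQRSTVWY"  # 20 standard amino acids, one-letter codes


def dipeptide_per_mille(proteins):
    """Return {dipeptide: frequency in per mille} for all 441 pairs."""
    counts = Counter()
    for seq in proteins:
        # Assumption: anything outside the standard alphabet becomes X (Xaa).
        seq = "".join(aa if aa in STANDARD else "X" for aa in seq.upper())
        # Assumption: overlapping pairs are counted within a protein only,
        # never across the boundary between two proteins.
        counts.update(seq[i:i + 2] for i in range(len(seq) - 1))
    total = sum(counts.values())
    alphabet = STANDARD + "X"
    return {
        a + b: (1000.0 * counts[a + b] / total if total else 0.0)
        for a, b in product(alphabet, repeat=2)
    }


def classify(freq_per_mille, expected=1000.0 / 400):
    """Compare a frequency with the uniform expectation (2.5 per mille)."""
    if freq_per_mille > expected:
        return "over"
    if freq_per_mille < expected:
        return "under"
    return "expected"


toy = ["MLLPA", "LLGX"]  # hypothetical toy sequences, not the viral proteome
freqs = dipeptide_per_mille(toy)
print(len(freqs), classify(freqs["LL"]), classify(freqs["AA"]))
```

With real data, the list passed to `dipeptide_per_mille` would hold the protein sequences of the proteome, and `classify` would give the label corresponding to the red (over) or blue (under) marking.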

All values are given in per mille (‰) and therefore need to be multiplied by 10^-3 to obtain fractions.
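As a short worked example of the unit conversion, the snippet below takes the LeuLeu value from the table (22.547‰) and converts it to a fraction and a percentage, then compares it with the uniform expectation of 2.5‰. The helper names are illustrative only.

```python
def per_mille_to_fraction(value):
    """Convert a per mille value to a plain fraction (multiply by 10^-3)."""
    return value * 1e-3


def per_mille_to_percent(value):
    """Convert a per mille value to a percentage (divide by 10)."""
    return value / 10.0


leu_leu = 22.547          # LeuLeu frequency from the table, in per mille
uniform = 1000.0 / 400    # uniform expectation: 2.5 per mille (0.25%)

fraction = per_mille_to_fraction(leu_leu)
percent = per_mille_to_percent(leu_leu)
ratio = leu_leu / uniform  # how many times more frequent than expected

print(fraction, percent, ratio)
```

LeuLeu is thus roughly nine times more frequent than it would be if all 400 dipeptides were equally likely.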

For more information, see the sequence space article on Wikipedia.

Ala Cys Asp Glu Phe Gly His Ile Lys Leu Met Asn Pro Gln Arg Ser Thr Val Trp Tyr Xaa
Ala
AlaAla: 3.656 ± 1.822
AlaCys: 2.438 ± 1.096
AlaAsp: 2.438 ± 0.859
AlaGlu: 3.047 ± 0.737
AlaPhe: 1.219 ± 0.761
AlaGly: 1.828 ± 0.889
AlaHis: 0.609 ± 0.38
AlaIle: 4.266 ± 0.648
AlaLys: 1.828 ± 0.579
AlaLeu: 7.922 ± 1.251
AlaMet: 0.609 ± 0.38
AlaAsn: 1.828 ± 1.141
AlaPro: 6.703 ± 1.411
AlaGln: 0.609 ± 0.986
AlaArg: 2.438 ± 1.716
AlaSer: 7.313 ± 1.703
AlaThr: 4.266 ± 1.064
AlaVal: 1.219 ± 0.858
AlaTrp: 1.219 ± 1.26
AlaTyr: 1.828 ± 1.809
AlaXaa: 0.0 ± 0.0
Cys
CysAla: 1.219 ± 1.973
CysCys: 1.219 ± 1.26
CysAsp: 0.0 ± 0.0
CysGlu: 1.219 ± 0.858
CysPhe: 0.0 ± 0.0
CysGly: 1.828 ± 1.141
CysHis: 0.609 ± 0.38
CysIle: 0.0 ± 0.0
CysLys: 0.609 ± 0.63
CysLeu: 6.094 ± 1.69
CysMet: 0.609 ± 0.687
CysAsn: 1.219 ± 1.26
CysPro: 2.438 ± 1.66
CysGln: 0.609 ± 0.38
CysArg: 1.219 ± 0.858
CysSer: 3.047 ± 1.238
CysThr: 1.219 ± 0.47
CysVal: 0.609 ± 0.986
CysTrp: 1.828 ± 0.641
CysTyr: 1.219 ± 0.761
CysXaa: 0.0 ± 0.0
Asp
AspAla: 4.266 ± 1.202
AspCys: 0.0 ± 0.0
AspAsp: 1.828 ± 0.889
AspGlu: 0.0 ± 0.0
AspPhe: 1.828 ± 1.202
AspGly: 0.0 ± 0.0
AspHis: 1.828 ± 0.579
AspIle: 1.219 ± 0.745
AspLys: 1.219 ± 0.761
AspLeu: 2.438 ± 1.059
AspMet: 0.609 ± 0.63
AspAsn: 1.219 ± 0.761
AspPro: 2.438 ± 1.755
AspGln: 1.219 ± 0.47
AspArg: 1.219 ± 0.858
AspSer: 3.047 ± 1.024
AspThr: 0.609 ± 0.63
AspVal: 1.828 ± 0.888
AspTrp: 1.828 ± 0.579
AspTyr: 0.0 ± 0.0
AspXaa: 0.0 ± 0.0
Glu
GluAla: 0.609 ± 0.986
GluCys: 0.609 ± 0.63
GluAsp: 1.219 ± 0.47
GluGlu: 3.047 ± 1.74
GluPhe: 2.438 ± 1.49
GluGly: 1.828 ± 0.833
GluHis: 1.828 ± 1.542
GluIle: 0.609 ± 0.38
GluLys: 1.219 ± 0.761
GluLeu: 3.656 ± 0.866
GluMet: 0.609 ± 0.38
GluAsn: 1.219 ± 0.939
GluPro: 0.609 ± 0.63
GluGln: 1.828 ± 1.141
GluArg: 0.0 ± 0.0
GluSer: 2.438 ± 1.059
GluThr: 3.656 ± 1.774
GluVal: 1.828 ± 0.579
GluTrp: 0.609 ± 0.38
GluTyr: 0.0 ± 0.0
GluXaa: 0.0 ± 0.0
Phe
PheAla: 1.828 ± 0.888
PheCys: 1.219 ± 0.745
PheAsp: 0.609 ± 0.38
PheGlu: 0.609 ± 0.38
PhePhe: 3.656 ± 0.856
PheGly: 4.266 ± 4.129
PheHis: 1.828 ± 1.542
PheIle: 0.609 ± 0.63
PheLys: 1.828 ± 1.141
PheLeu: 6.094 ± 1.492
PheMet: 1.828 ± 0.922
PheAsn: 1.828 ± 1.141
PhePro: 5.484 ± 2.08
PheGln: 1.219 ± 0.47
PheArg: 2.438 ± 0.859
PheSer: 4.875 ± 1.718
PheThr: 2.438 ± 1.063
PheVal: 3.656 ± 1.01
PheTrp: 0.0 ± 0.0
PheTyr: 1.219 ± 0.47
PheXaa: 0.0 ± 0.0
Gly
GlyAla: 3.047 ± 0.737
GlyCys: 1.219 ± 0.858
GlyAsp: 2.438 ± 1.13
GlyGlu: 2.438 ± 0.941
GlyPhe: 4.875 ± 1.544
GlyGly: 4.875 ± 2.123
GlyHis: 0.609 ± 0.38
GlyIle: 2.438 ± 0.941
GlyLys: 3.047 ± 0.983
GlyLeu: 9.75 ± 2.516
GlyMet: 0.609 ± 0.84
GlyAsn: 1.828 ± 1.045
GlyPro: 3.047 ± 0.737
GlyGln: 1.828 ± 0.889
GlyArg: 6.094 ± 1.072
GlySer: 4.875 ± 1.01
GlyThr: 6.094 ± 1.765
GlyVal: 2.438 ± 0.859
GlyTrp: 1.828 ± 1.202
GlyTyr: 0.609 ± 0.38
GlyXaa: 0.0 ± 0.0
His
HisAla: 1.828 ± 1.141
HisCys: 1.219 ± 0.745
HisAsp: 0.609 ± 0.38
HisGlu: 0.0 ± 0.0
HisPhe: 1.219 ± 0.761
HisGly: 1.828 ± 0.579
HisHis: 1.828 ± 0.833
HisIle: 1.219 ± 0.761
HisLys: 1.828 ± 0.888
HisLeu: 5.484 ± 1.586
HisMet: 0.0 ± 0.0
HisAsn: 0.609 ± 0.38
HisPro: 1.219 ± 0.47
HisGln: 1.828 ± 0.579
HisArg: 1.828 ± 0.888
HisSer: 3.047 ± 0.983
HisThr: 4.875 ± 1.211
HisVal: 1.219 ± 0.745
HisTrp: 0.0 ± 0.0
HisTyr: 0.609 ± 0.38
HisXaa: 0.0 ± 0.0
Ile
IleAla: 2.438 ± 1.13
IleCys: 0.609 ± 0.63
IleAsp: 1.828 ± 0.641
IleGlu: 0.0 ± 0.0
IlePhe: 1.219 ± 0.761
IleGly: 3.047 ± 1.493
IleHis: 1.219 ± 0.761
IleIle: 3.047 ± 2.195
IleLys: 1.219 ± 0.47
IleLeu: 4.875 ± 0.886
IleMet: 0.0 ± 0.0
IleAsn: 1.828 ± 1.667
IlePro: 1.828 ± 0.579
IleGln: 2.438 ± 1.059
IleArg: 1.828 ± 0.833
IleSer: 1.219 ± 0.761
IleThr: 4.875 ± 1.881
IleVal: 1.828 ± 0.833
IleTrp: 1.828 ± 1.045
IleTyr: 1.219 ± 0.47
IleXaa: 0.0 ± 0.0
Lys
LysAla: 1.219 ± 0.761
LysCys: 0.0 ± 0.0
LysAsp: 0.609 ± 0.63
LysGlu: 0.609 ± 0.84
LysPhe: 2.438 ± 0.859
LysGly: 3.656 ± 1.553
LysHis: 1.828 ± 0.579
LysIle: 3.047 ± 1.196
LysLys: 0.0 ± 0.0
LysLeu: 3.047 ± 1.74
LysMet: 0.0 ± 0.0
LysAsn: 3.047 ± 1.196
LysPro: 1.828 ± 1.045
LysGln: 3.656 ± 1.157
LysArg: 1.828 ± 0.579
LysSer: 4.266 ± 1.202
LysThr: 6.703 ± 2.114
LysVal: 0.0 ± 0.0
LysTrp: 1.219 ± 0.761
LysTyr: 1.219 ± 0.761
LysXaa: 0.0 ± 0.0
Leu
LeuAla: 7.313 ± 2.577
LeuCys: 3.656 ± 1.16
LeuAsp: 2.438 ± 1.28
LeuGlu: 3.656 ± 0.856
LeuPhe: 5.484 ± 0.991
LeuGly: 10.969 ± 2.163
LeuHis: 3.656 ± 1.157
LeuIle: 6.094 ± 2.293
LeuLys: 1.219 ± 0.745
LeuLeu: 22.547 ± 4.104
LeuMet: 1.219 ± 0.858
LeuAsn: 4.266 ± 1.323
LeuPro: 7.922 ± 0.346
LeuGln: 3.656 ± 1.157
LeuArg: 9.75 ± 2.879
LeuSer: 9.141 ± 1.399
LeuThr: 5.484 ± 0.991
LeuVal: 6.094 ± 1.492
LeuTrp: 3.047 ± 1.238
LeuTyr: 3.656 ± 1.258
LeuXaa: 0.0 ± 0.0
Met
MetAla: 0.609 ± 0.986
MetCys: 0.609 ± 0.63
MetAsp: 1.219 ± 0.745
MetGlu: 0.609 ± 0.986
MetPhe: 0.0 ± 0.0
MetGly: 1.828 ± 1.045
MetHis: 0.609 ± 0.38
MetIle: 0.0 ± 0.0
MetLys: 0.0 ± 0.0
MetLeu: 0.0 ± 0.0
MetMet: 0.609 ± 0.63
MetAsn: 1.828 ± 1.46
MetPro: 2.438 ± 0.859
MetGln: 0.609 ± 0.38
MetArg: 0.0 ± 0.0
MetSer: 0.609 ± 0.38
MetThr: 0.609 ± 0.986
MetVal: 0.0 ± 0.0
MetTrp: 0.609 ± 0.63
MetTyr: 0.609 ± 0.84
MetXaa: 0.0 ± 0.0
Asn
AsnAla: 3.047 ± 1.117
AsnCys: 0.609 ± 0.84
AsnAsp: 0.0 ± 0.0
AsnGlu: 0.0 ± 0.0
AsnPhe: 1.219 ± 1.406
AsnGly: 2.438 ± 0.722
AsnHis: 2.438 ± 1.522
AsnIle: 1.828 ± 0.579
AsnLys: 2.438 ± 0.859
AsnLeu: 4.266 ± 1.191
AsnMet: 1.219 ± 1.161
AsnAsn: 4.266 ± 1.92
AsnPro: 4.266 ± 1.429
AsnGln: 1.828 ± 0.889
AsnArg: 3.047 ± 1.493
AsnSer: 3.047 ± 1.024
AsnThr: 3.047 ± 0.746
AsnVal: 0.0 ± 0.0
AsnTrp: 0.609 ± 0.38
AsnTyr: 1.828 ± 1.141
AsnXaa: 0.0 ± 0.0
Pro
ProAla: 9.75 ± 1.806
ProCys: 2.438 ± 0.722
ProAsp: 3.656 ± 1.553
ProGlu: 2.438 ± 0.478
ProPhe: 3.656 ± 0.866
ProGly: 3.047 ± 2.032
ProHis: 3.656 ± 0.586
ProIle: 3.047 ± 1.238
ProLys: 3.656 ± 2.283
ProLeu: 7.313 ± 2.315
ProMet: 0.609 ± 0.354
ProAsn: 4.266 ± 0.648
ProPro: 7.313 ± 0.661
ProGln: 2.438 ± 1.029
ProArg: 4.266 ± 2.263
ProSer: 9.75 ± 0.826
ProThr: 7.313 ± 3.461
ProVal: 3.656 ± 2.167
ProTrp: 3.047 ± 0.983
ProTyr: 1.828 ± 1.542
ProXaa: 0.0 ± 0.0
Gln
GlnAla: 1.828 ± 0.641
GlnCys: 0.609 ± 0.38
GlnAsp: 2.438 ± 1.096
GlnGlu: 0.609 ± 0.38
GlnPhe: 1.219 ± 0.761
GlnGly: 0.609 ± 0.63
GlnHis: 0.609 ± 0.38
GlnIle: 1.219 ± 0.47
GlnLys: 4.875 ± 1.718
GlnLeu: 5.484 ± 1.215
GlnMet: 0.0 ± 0.0
GlnAsn: 1.219 ± 0.47
GlnPro: 2.438 ± 1.13
GlnGln: 3.047 ± 1.238
GlnArg: 2.438 ± 0.722
GlnSer: 5.484 ± 2.664
GlnThr: 3.047 ± 1.811
GlnVal: 1.219 ± 0.47
GlnTrp: 2.438 ± 1.818
GlnTyr: 1.219 ± 0.761
GlnXaa: 0.0 ± 0.0
Arg
ArgAla: 1.828 ± 1.667
ArgCys: 1.219 ± 0.47
ArgAsp: 0.609 ± 0.986
ArgGlu: 1.828 ± 0.833
ArgPhe: 4.875 ± 1.445
ArgGly: 4.266 ± 0.652
ArgHis: 1.828 ± 0.888
ArgIle: 1.828 ± 0.888
ArgLys: 2.438 ± 0.478
ArgLeu: 4.875 ± 1.563
ArgMet: 1.219 ± 0.647
ArgAsn: 1.219 ± 0.761
ArgPro: 5.484 ± 1.329
ArgGln: 6.094 ± 1.484
ArgArg: 12.188 ± 6.119
ArgSer: 3.047 ± 1.238
ArgThr: 6.094 ± 2.69
ArgVal: 1.828 ± 1.141
ArgTrp: 0.609 ± 0.63
ArgTyr: 1.219 ± 0.47
ArgXaa: 0.0 ± 0.0
Ser
SerAla: 3.047 ± 1.196
SerCys: 3.047 ± 0.746
SerAsp: 2.438 ± 0.478
SerGlu: 2.438 ± 0.941
SerPhe: 5.484 ± 0.991
SerGly: 4.266 ± 2.663
SerHis: 1.219 ± 0.761
SerIle: 2.438 ± 1.522
SerLys: 3.656 ± 1.281
SerLeu: 9.75 ± 2.749
SerMet: 0.0 ± 0.0
SerAsn: 2.438 ± 1.818
SerPro: 14.625 ± 2.353
SerGln: 4.266 ± 1.191
SerArg: 5.484 ± 1.409
SerSer: 8.531 ± 1.399
SerThr: 7.313 ± 1.2
SerVal: 3.656 ± 0.586
SerTrp: 4.266 ± 2.118
SerTyr: 1.219 ± 0.745
SerXaa: 0.0 ± 0.0
Thr
ThrAla: 6.094 ± 1.69
ThrCys: 3.656 ± 1.54
ThrAsp: 1.219 ± 0.761
ThrGlu: 1.219 ± 0.761
ThrPhe: 1.828 ± 1.542
ThrGly: 6.094 ± 0.492
ThrHis: 3.047 ± 0.579
ThrIle: 4.266 ± 1.896
ThrLys: 5.484 ± 2.414
ThrLeu: 5.484 ± 0.321
ThrMet: 0.609 ± 0.986
ThrAsn: 4.875 ± 1.531
ThrPro: 9.141 ± 2.301
ThrGln: 1.219 ± 0.858
ThrArg: 1.828 ± 0.833
ThrSer: 8.531 ± 2.286
ThrThr: 4.266 ± 2.118
ThrVal: 3.656 ± 3.425
ThrTrp: 3.656 ± 1.523
ThrTyr: 1.219 ± 0.761
ThrXaa: 0.0 ± 0.0
Val
ValAla: 0.609 ± 0.38
ValCys: 1.828 ± 0.579
ValAsp: 1.828 ± 0.833
ValGlu: 2.438 ± 1.418
ValPhe: 2.438 ± 0.941
ValGly: 1.828 ± 0.579
ValHis: 1.219 ± 0.858
ValIle: 0.0 ± 0.0
ValLys: 0.0 ± 0.0
ValLeu: 5.484 ± 2.973
ValMet: 0.609 ± 0.84
ValAsn: 1.219 ± 0.858
ValPro: 4.266 ± 2.118
ValGln: 1.828 ± 0.833
ValArg: 3.047 ± 1.534
ValSer: 4.266 ± 0.674
ValThr: 3.047 ± 2.589
ValVal: 4.266 ± 1.191
ValTrp: 0.609 ± 0.63
ValTyr: 0.609 ± 0.38
ValXaa: 0.0 ± 0.0
Trp
TrpAla: 1.828 ± 1.045
TrpCys: 0.0 ± 0.0
TrpAsp: 1.219 ± 1.161
TrpGlu: 2.438 ± 1.096
TrpPhe: 1.219 ± 0.939
TrpGly: 5.484 ± 1.475
TrpHis: 0.609 ± 0.986
TrpIle: 1.219 ± 0.745
TrpLys: 1.219 ± 0.761
TrpLeu: 4.266 ± 1.07
TrpMet: 1.219 ± 1.26
TrpAsn: 1.219 ± 0.761
TrpPro: 1.219 ± 0.47
TrpGln: 0.609 ± 0.986
TrpArg: 1.828 ± 1.045
TrpSer: 0.609 ± 0.38
TrpThr: 1.828 ± 1.045
TrpVal: 1.219 ± 0.745
TrpTrp: 1.219 ± 0.47
TrpTyr: 1.219 ± 1.161
TrpXaa: 0.0 ± 0.0
Tyr
TyrAla: 1.219 ± 0.47
TyrCys: 1.219 ± 0.858
TyrAsp: 0.0 ± 0.0
TyrGlu: 1.219 ± 0.745
TyrPhe: 1.219 ± 0.858
TyrGly: 0.0 ± 0.0
TyrHis: 1.219 ± 0.761
TyrIle: 0.0 ± 0.0
TyrLys: 2.438 ± 1.059
TyrLeu: 2.438 ± 0.478
TyrMet: 0.609 ± 0.38
TyrAsn: 0.0 ± 0.0
TyrPro: 2.438 ± 1.059
TyrGln: 1.219 ± 0.761
TyrArg: 1.828 ± 0.833
TyrSer: 2.438 ± 0.722
TyrThr: 0.609 ± 0.38
TyrVal: 1.219 ± 0.858
TyrTrp: 1.219 ± 0.47
TyrTyr: 2.438 ± 0.859
TyrXaa: 0.0 ± 0.0
Xaa
XaaAla: 0.0 ± 0.0
XaaCys: 0.0 ± 0.0
XaaAsp: 0.0 ± 0.0
XaaGlu: 0.0 ± 0.0
XaaPhe: 0.0 ± 0.0
XaaGly: 0.0 ± 0.0
XaaHis: 0.0 ± 0.0
XaaIle: 0.0 ± 0.0
XaaLys: 0.0 ± 0.0
XaaLeu: 0.0 ± 0.0
XaaMet: 0.0 ± 0.0
XaaAsn: 0.0 ± 0.0
XaaPro: 0.0 ± 0.0
XaaGln: 0.0 ± 0.0
XaaArg: 0.0 ± 0.0
XaaSer: 0.0 ± 0.0
XaaThr: 0.0 ± 0.0
XaaVal: 0.0 ± 0.0
XaaTrp: 0.0 ± 0.0
XaaTyr: 0.0 ± 0.0
XaaXaa: 0.0 ± 0.0
Statistics based on 4 proteins (1642 amino acids)

Note: The error has been estimated by bootstrapping (x100) at the protein level.
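A protein-level bootstrap of this kind can be sketched as follows. This is only an illustration under stated assumptions, not the procedure actually used for this page: whole proteins are resampled with replacement, dipeptide frequencies are recomputed for each of the 100 resamples, and the population standard deviation across resamples is reported as the error. The function names, the fixed seed, the use of `pstdev`, and the toy sequences are all hypothetical choices.

```python
import random
import statistics
from collections import Counter


def pair_frequencies(proteins):
    """Per mille frequency of each overlapping residue pair."""
    counts = Counter()
    for seq in proteins:
        counts.update(seq[i:i + 2] for i in range(len(seq) - 1))
    total = sum(counts.values())
    if total == 0:
        return {}
    return {pair: 1000.0 * n / total for pair, n in counts.items()}


def bootstrap_error(proteins, n_rounds=100, seed=0):
    """Spread of pair frequencies over protein-level bootstrap resamples."""
    rng = random.Random(seed)
    samples = []
    for _ in range(n_rounds):
        # Draw as many proteins as in the original set, with replacement.
        resampled = rng.choices(proteins, k=len(proteins))
        samples.append(pair_frequencies(resampled))
    # A pair missing from a resample has frequency 0 in that resample.
    pairs = set().union(*samples)
    return {
        pair: statistics.pstdev([s.get(pair, 0.0) for s in samples])
        for pair in pairs
    }


toy = ["MLLPA", "LLGX", "AAAA", "MLLG"]  # hypothetical toy proteome
errors = bootstrap_error(toy, n_rounds=100, seed=1)
print(round(errors["LL"], 3))
```

With only a handful of proteins, as in this proteome, each resample can differ strongly from the original set, which is consistent with errors that are large relative to the frequencies themselves.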

The dipeptide statistics above (among other statistics for this proteome) can be downloaded from this CSV file
See this proteome in: uniprot_link
Proteome-pI is available under the Creative Commons Attribution-NoDerivs license; for more details see here

Reference: Kozlowski LP. Proteome-pI 2.0: Proteome Isoelectric Point Database Update. Nucleic Acids Res. 2021, doi: 10.1093/nar/gkab944
Contact: Lukasz P. Kozlowski