Amino acid dipeptide frequency for Parabacteroides phage YZ-2015b

Apart from single amino acid frequencies, one can also calculate the so-called amino acid dipeptide frequencies.

There are 400 possible dipeptides (441 if we consider Xaa as an additional 21st amino acid representing all non-standard or unknown amino acids). Thus, if dipeptides occurred randomly in proteins, each of the 400 dipeptides would be present at a frequency of 0.25% (2.5‰). As this is not the case in nature, for better visibility all dipeptides that are more frequent than expected are marked in red, and those that are underrepresented are marked in blue in the table.
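The counting described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the code used to build this page: the function name `dipeptide_per_mille`, the one-letter alphabet, the mapping of every non-standard residue to `X` (Xaa), and the choice to count overlapping pairs within each protein (never across protein boundaries) are all assumptions made for the example. The sequences are toy data, not the phage proteome.

```python
from collections import Counter
from itertools import product

# The 20 standard amino acids in one-letter code (assumed alphabet);
# any other symbol is mapped to X, standing in for Xaa.
STANDARD = "ACDEFGHIKLMNPQRSTVWY"


def dipeptide_per_mille(proteins):
    """Return {dipeptide: frequency in per mille} over overlapping residue pairs."""
    counts = Counter()
    for seq in proteins:
        # Map non-standard or unknown residues to X.
        cleaned = "".join(aa if aa in STANDARD else "X" for aa in seq.upper())
        # Overlapping pairs inside one protein; pairs do not span two proteins
        # (an assumption of this sketch).
        counts.update(cleaned[i:i + 2] for i in range(len(cleaned) - 1))
    total = sum(counts.values())
    alphabet = STANDARD + "X"
    # Report all 21 x 21 = 441 combinations, including those never observed.
    return {
        a + b: (1000.0 * counts[a + b] / total if total else 0.0)
        for a, b in product(alphabet, repeat=2)
    }


# Toy sequences, hypothetical and unrelated to the proteome on this page.
freqs = dipeptide_per_mille(["MKKLA", "MKXA"])

# Uniform expectation for the 400 standard dipeptides: 0.25% = 2.5 per mille.
expected_uniform = 1000.0 / 400

print(len(freqs), expected_uniform, round(freqs["MK"], 3))
```

Returning all 441 keys, including zero counts, mirrors the layout of the table, where unobserved dipeptides are listed with a value of 0.0.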

All values are given in per mille (‰) and therefore need to be multiplied by 10⁻³ to obtain fractions.
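As a small worked example of the unit conversion, the sketch below turns per mille values into fractions and compares them with the uniform expectation of 2.5‰ (0.25%). The three values are taken from the table on this page. The helper names and the simple above/below rule are assumptions for illustration; the exact rule used for the red and blue marking is not stated here.

```python
# Uniform expectation for 400 dipeptides, expressed in per mille.
UNIFORM_PER_MILLE = 1000.0 / 400  # 2.5


def per_mille_to_fraction(value):
    """Convert a per mille value to a plain fraction (multiply by 10^-3)."""
    return value * 1e-3


def classify(value, baseline=UNIFORM_PER_MILLE):
    """Label a per mille value relative to the baseline (assumed rule)."""
    if value > baseline:
        return "over"
    if value < baseline:
        return "under"
    return "expected"


# Values copied from the table: GluLys, AlaAla and CysCys.
examples = {"GluLys": 9.053, "AlaAla": 2.469, "CysCys": 0.0}

for name, value in examples.items():
    print(name, per_mille_to_fraction(value), classify(value))
```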

For more information, see the sequence space article on Wikipedia.

Ala Cys Asp Glu Phe Gly His Ile Lys Leu Met Asn Pro Gln Arg Ser Thr Val Trp Tyr Xaa
Ala
2.469AlaAla: 2.469 ± 1.627
0.823AlaCys: 0.823 ± 0.488
3.292AlaAsp: 3.292 ± 2.829
4.115AlaGlu: 4.115 ± 1.452
2.469AlaPhe: 2.469 ± 0.571
6.584AlaGly: 6.584 ± 1.592
2.469AlaHis: 2.469 ± 0.944
0.823AlaIle: 0.823 ± 0.893
5.761AlaLys: 5.761 ± 3.239
4.938AlaLeu: 4.938 ± 2.928
4.115AlaMet: 4.115 ± 2.613
3.292AlaAsn: 3.292 ± 1.332
4.115AlaPro: 4.115 ± 1.288
4.115AlaGln: 4.115 ± 3.691
6.584AlaArg: 6.584 ± 2.698
4.115AlaSer: 4.115 ± 2.16
3.292AlaThr: 3.292 ± 1.829
1.646AlaVal: 1.646 ± 0.74
2.469AlaTrp: 2.469 ± 0.571
5.761AlaTyr: 5.761 ± 1.768
0.0AlaXaa: 0.0 ± 0.0
Cys
0.823CysAla: 0.823 ± 0.824
0.0CysCys: 0.0 ± 0.0
0.0CysAsp: 0.0 ± 0.0
0.0CysGlu: 0.0 ± 0.0
0.0CysPhe: 0.0 ± 0.0
1.646CysGly: 1.646 ± 0.74
0.0CysHis: 0.0 ± 0.0
0.0CysIle: 0.0 ± 0.0
2.469CysLys: 2.469 ± 1.488
0.823CysLeu: 0.823 ± 0.824
0.823CysMet: 0.823 ± 0.488
0.0CysAsn: 0.0 ± 0.0
0.823CysPro: 0.823 ± 0.824
0.0CysGln: 0.0 ± 0.0
0.823CysArg: 0.823 ± 0.824
0.823CysSer: 0.823 ± 0.824
0.0CysThr: 0.0 ± 0.0
0.823CysVal: 0.823 ± 0.824
0.823CysTrp: 0.823 ± 0.824
0.0CysTyr: 0.0 ± 0.0
0.0CysXaa: 0.0 ± 0.0
Asp
3.292AspAla: 3.292 ± 1.608
0.0AspCys: 0.0 ± 0.0
4.115AspAsp: 4.115 ± 2.399
4.115AspGlu: 4.115 ± 1.544
5.761AspPhe: 5.761 ± 2.659
0.823AspGly: 0.823 ± 0.488
0.823AspHis: 0.823 ± 1.219
2.469AspIle: 2.469 ± 0.571
3.292AspLys: 3.292 ± 1.309
5.761AspLeu: 5.761 ± 2.729
2.469AspMet: 2.469 ± 1.178
2.469AspAsn: 2.469 ± 1.627
4.115AspPro: 4.115 ± 3.38
3.292AspGln: 3.292 ± 1.157
2.469AspArg: 2.469 ± 2.089
4.938AspSer: 4.938 ± 1.971
0.0AspThr: 0.0 ± 0.0
4.115AspVal: 4.115 ± 1.26
0.0AspTrp: 0.0 ± 0.0
5.761AspTyr: 5.761 ± 2.806
0.0AspXaa: 0.0 ± 0.0
Glu
4.115GluAla: 4.115 ± 1.452
1.646GluCys: 1.646 ± 0.74
5.761GluAsp: 5.761 ± 1.922
2.469GluGlu: 2.469 ± 1.488
3.292GluPhe: 3.292 ± 1.479
1.646GluGly: 1.646 ± 1.785
1.646GluHis: 1.646 ± 0.74
2.469GluIle: 2.469 ± 1.488
9.053GluLys: 9.053 ± 5.883
7.407GluLeu: 7.407 ± 3.471
2.469GluMet: 2.469 ± 0.571
3.292GluAsn: 3.292 ± 1.303
2.469GluPro: 2.469 ± 1.223
3.292GluGln: 3.292 ± 0.53
4.938GluArg: 4.938 ± 0.995
2.469GluSer: 2.469 ± 1.214
1.646GluThr: 1.646 ± 0.976
2.469GluVal: 2.469 ± 1.214
1.646GluTrp: 1.646 ± 0.74
4.938GluTyr: 4.938 ± 2.219
0.0GluXaa: 0.0 ± 0.0
Phe
1.646PheAla: 1.646 ± 0.976
0.0PheCys: 0.0 ± 0.0
4.115PheAsp: 4.115 ± 2.647
3.292PheGlu: 3.292 ± 1.952
1.646PhePhe: 1.646 ± 0.74
4.938PheGly: 4.938 ± 1.005
0.0PheHis: 0.0 ± 0.0
0.823PheIle: 0.823 ± 0.488
1.646PheLys: 1.646 ± 0.804
1.646PheLeu: 1.646 ± 0.74
2.469PheMet: 2.469 ± 2.472
3.292PheAsn: 3.292 ± 0.53
2.469PhePro: 2.469 ± 0.944
1.646PheGln: 1.646 ± 0.74
4.938PheArg: 4.938 ± 1.143
0.823PheSer: 0.823 ± 1.219
4.938PheThr: 4.938 ± 1.55
5.761PheVal: 5.761 ± 2.115
1.646PheTrp: 1.646 ± 0.976
1.646PheTyr: 1.646 ± 0.74
0.0PheXaa: 0.0 ± 0.0
Gly
3.292GlyAla: 3.292 ± 1.303
0.0GlyCys: 0.0 ± 0.0
5.761GlyAsp: 5.761 ± 0.928
1.646GlyGlu: 1.646 ± 0.976
5.761GlyPhe: 5.761 ± 2.522
4.938GlyGly: 4.938 ± 2.928
2.469GlyHis: 2.469 ± 0.944
5.761GlyIle: 5.761 ± 1.352
0.823GlyLys: 0.823 ± 0.824
2.469GlyLeu: 2.469 ± 1.627
0.823GlyMet: 0.823 ± 1.219
3.292GlyAsn: 3.292 ± 1.608
0.823GlyPro: 0.823 ± 0.488
0.0GlyGln: 0.0 ± 0.0
2.469GlyArg: 2.469 ± 0.944
4.938GlySer: 4.938 ± 1.143
4.115GlyThr: 4.115 ± 2.44
4.938GlyVal: 4.938 ± 1.567
1.646GlyTrp: 1.646 ± 0.92
2.469GlyTyr: 2.469 ± 1.464
0.0GlyXaa: 0.0 ± 0.0
His
0.823HisAla: 0.823 ± 0.488
0.0HisCys: 0.0 ± 0.0
0.823HisAsp: 0.823 ± 1.219
0.0HisGlu: 0.0 ± 0.0
0.823HisPhe: 0.823 ± 0.488
2.469HisGly: 2.469 ± 1.464
0.0HisHis: 0.0 ± 0.0
0.0HisIle: 0.0 ± 0.0
1.646HisLys: 1.646 ± 1.648
3.292HisLeu: 3.292 ± 1.309
0.823HisMet: 0.823 ± 0.824
0.0HisAsn: 0.0 ± 0.0
3.292HisPro: 3.292 ± 2.289
0.0HisGln: 0.0 ± 0.0
0.823HisArg: 0.823 ± 0.488
0.0HisSer: 0.0 ± 0.0
0.0HisThr: 0.0 ± 0.0
0.0HisVal: 0.0 ± 0.0
0.0HisTrp: 0.0 ± 0.0
2.469HisTyr: 2.469 ± 0.944
0.0HisXaa: 0.0 ± 0.0
Ile
2.469IleAla: 2.469 ± 1.615
0.0IleCys: 0.0 ± 0.0
3.292IleAsp: 3.292 ± 1.006
2.469IleGlu: 2.469 ± 1.488
1.646IlePhe: 1.646 ± 0.976
2.469IleGly: 2.469 ± 1.488
0.0IleHis: 0.0 ± 0.0
1.646IleIle: 1.646 ± 0.74
4.115IleLys: 4.115 ± 1.125
3.292IleLeu: 3.292 ± 2.289
1.646IleMet: 1.646 ± 0.976
4.115IleAsn: 4.115 ± 1.747
0.0IlePro: 0.0 ± 0.0
2.469IleGln: 2.469 ± 1.223
2.469IleArg: 2.469 ± 1.627
1.646IleSer: 1.646 ± 0.976
2.469IleThr: 2.469 ± 1.464
0.0IleVal: 0.0 ± 0.0
0.0IleTrp: 0.0 ± 0.0
1.646IleTyr: 1.646 ± 1.114
0.0IleXaa: 0.0 ± 0.0
Lys
4.938LysAla: 4.938 ± 1.574
0.823LysCys: 0.823 ± 0.824
0.823LysAsp: 0.823 ± 0.488
5.761LysGlu: 5.761 ± 2.729
2.469LysPhe: 2.469 ± 1.464
0.823LysGly: 0.823 ± 0.488
0.823LysHis: 0.823 ± 0.488
3.292LysIle: 3.292 ± 1.829
1.646LysLys: 1.646 ± 1.648
4.938LysLeu: 4.938 ± 2.246
4.938LysMet: 4.938 ± 3.889
4.115LysAsn: 4.115 ± 1.494
2.469LysPro: 2.469 ± 0.944
0.0LysGln: 0.0 ± 0.0
4.938LysArg: 4.938 ± 2.741
1.646LysSer: 1.646 ± 0.74
1.646LysThr: 1.646 ± 1.114
4.115LysVal: 4.115 ± 0.697
0.823LysTrp: 0.823 ± 0.488
4.938LysTyr: 4.938 ± 1.725
0.0LysXaa: 0.0 ± 0.0
Leu
6.584LeuAla: 6.584 ± 3.482
0.823LeuCys: 0.823 ± 0.824
2.469LeuAsp: 2.469 ± 1.627
1.646LeuGlu: 1.646 ± 1.415
6.584LeuPhe: 6.584 ± 2.842
3.292LeuGly: 3.292 ± 1.332
0.823LeuHis: 0.823 ± 0.824
4.115LeuIle: 4.115 ± 2.162
2.469LeuLys: 2.469 ± 1.178
0.823LeuLeu: 0.823 ± 0.824
4.938LeuMet: 4.938 ± 2.085
4.115LeuAsn: 4.115 ± 1.731
4.938LeuPro: 4.938 ± 2.105
5.761LeuGln: 5.761 ± 1.352
6.584LeuArg: 6.584 ± 1.415
7.407LeuSer: 7.407 ± 2.057
4.115LeuThr: 4.115 ± 2.201
5.761LeuVal: 5.761 ± 1.931
0.823LeuTrp: 0.823 ± 0.488
0.823LeuTyr: 0.823 ± 0.824
0.0LeuXaa: 0.0 ± 0.0
Met
6.584MetAla: 6.584 ± 1.684
1.646MetCys: 1.646 ± 0.74
1.646MetAsp: 1.646 ± 0.976
4.115MetGlu: 4.115 ± 3.691
4.115MetPhe: 4.115 ± 1.494
2.469MetGly: 2.469 ± 1.214
0.823MetHis: 0.823 ± 0.488
0.823MetIle: 0.823 ± 0.824
3.292MetLys: 3.292 ± 2.289
1.646MetLeu: 1.646 ± 0.74
2.469MetMet: 2.469 ± 1.382
0.823MetAsn: 0.823 ± 0.824
4.938MetPro: 4.938 ± 1.55
2.469MetGln: 2.469 ± 1.615
2.469MetArg: 2.469 ± 1.502
3.292MetSer: 3.292 ± 1.608
0.0MetThr: 0.0 ± 0.0
0.823MetVal: 0.823 ± 0.488
0.823MetTrp: 0.823 ± 0.893
4.115MetTyr: 4.115 ± 1.452
0.0MetXaa: 0.0 ± 0.0
Asn
4.115AsnAla: 4.115 ± 1.731
0.823AsnCys: 0.823 ± 0.824
3.292AsnAsp: 3.292 ± 1.157
4.115AsnGlu: 4.115 ± 0.974
0.823AsnPhe: 0.823 ± 1.219
6.584AsnGly: 6.584 ± 2.617
0.0AsnHis: 0.0 ± 0.0
2.469AsnIle: 2.469 ± 0.986
4.115AsnLys: 4.115 ± 2.16
7.407AsnLeu: 7.407 ± 3.046
4.115AsnMet: 4.115 ± 1.731
3.292AsnAsn: 3.292 ± 1.332
3.292AsnPro: 3.292 ± 1.841
3.292AsnGln: 3.292 ± 1.608
4.115AsnArg: 4.115 ± 1.747
3.292AsnSer: 3.292 ± 1.608
0.823AsnThr: 0.823 ± 0.488
4.115AsnVal: 4.115 ± 1.125
0.823AsnTrp: 0.823 ± 0.893
0.0AsnTyr: 0.0 ± 0.0
0.0AsnXaa: 0.0 ± 0.0
Pro
2.469ProAla: 2.469 ± 2.536
0.823ProCys: 0.823 ± 0.824
3.292ProAsp: 3.292 ± 4.874
3.292ProGlu: 3.292 ± 1.309
2.469ProPhe: 2.469 ± 1.488
0.823ProGly: 0.823 ± 0.824
0.823ProHis: 0.823 ± 0.824
4.115ProIle: 4.115 ± 1.625
2.469ProLys: 2.469 ± 0.986
2.469ProLeu: 2.469 ± 1.488
3.292ProMet: 3.292 ± 1.309
4.115ProAsn: 4.115 ± 1.747
0.0ProPro: 0.0 ± 0.0
2.469ProGln: 2.469 ± 0.986
4.938ProArg: 4.938 ± 1.888
4.115ProSer: 4.115 ± 2.278
2.469ProThr: 2.469 ± 1.223
2.469ProVal: 2.469 ± 1.214
0.823ProTrp: 0.823 ± 0.488
0.823ProTyr: 0.823 ± 0.893
0.0ProXaa: 0.0 ± 0.0
Gln
4.115GlnAla: 4.115 ± 1.452
0.0GlnCys: 0.0 ± 0.0
3.292GlnAsp: 3.292 ± 1.157
2.469GlnGlu: 2.469 ± 1.214
0.823GlnPhe: 0.823 ± 0.488
2.469GlnGly: 2.469 ± 0.986
0.0GlnHis: 0.0 ± 0.0
1.646GlnIle: 1.646 ± 0.976
1.646GlnLys: 1.646 ± 0.804
4.115GlnLeu: 4.115 ± 0.974
3.292GlnMet: 3.292 ± 1.66
5.761GlnAsn: 5.761 ± 2.894
0.823GlnPro: 0.823 ± 0.893
1.646GlnGln: 1.646 ± 0.804
4.938GlnArg: 4.938 ± 0.659
1.646GlnSer: 1.646 ± 0.74
2.469GlnThr: 2.469 ± 0.571
1.646GlnVal: 1.646 ± 0.804
0.0GlnTrp: 0.0 ± 0.0
3.292GlnTyr: 3.292 ± 3.571
0.0GlnXaa: 0.0 ± 0.0
Arg
5.761ArgAla: 5.761 ± 1.352
0.823ArgCys: 0.823 ± 0.824
4.938ArgAsp: 4.938 ± 1.725
6.584ArgGlu: 6.584 ± 2.694
0.0ArgPhe: 0.0 ± 0.0
3.292ArgGly: 3.292 ± 1.332
1.646ArgHis: 1.646 ± 1.648
1.646ArgIle: 1.646 ± 0.74
2.469ArgLys: 2.469 ± 1.42
8.23ArgLeu: 8.23 ± 0.736
4.115ArgMet: 4.115 ± 1.125
4.115ArgAsn: 4.115 ± 1.125
2.469ArgPro: 2.469 ± 0.944
5.761ArgGln: 5.761 ± 2.159
3.292ArgArg: 3.292 ± 1.309
4.115ArgSer: 4.115 ± 1.673
2.469ArgThr: 2.469 ± 1.627
2.469ArgVal: 2.469 ± 1.464
0.823ArgTrp: 0.823 ± 0.488
3.292ArgTyr: 3.292 ± 1.479
0.0ArgXaa: 0.0 ± 0.0
Ser
7.407SerAla: 7.407 ± 2.957
0.0SerCys: 0.0 ± 0.0
4.938SerAsp: 4.938 ± 1.147
8.23SerGlu: 8.23 ± 1.732
1.646SerPhe: 1.646 ± 1.114
7.407SerGly: 7.407 ± 2.399
1.646SerHis: 1.646 ± 0.976
2.469SerIle: 2.469 ± 0.986
1.646SerLys: 1.646 ± 0.74
4.115SerLeu: 4.115 ± 1.735
2.469SerMet: 2.469 ± 0.571
3.292SerAsn: 3.292 ± 1.157
1.646SerPro: 1.646 ± 0.804
0.823SerGln: 0.823 ± 0.488
3.292SerArg: 3.292 ± 1.303
5.761SerSer: 5.761 ± 1.931
1.646SerThr: 1.646 ± 0.804
2.469SerVal: 2.469 ± 0.571
0.823SerTrp: 0.823 ± 0.824
1.646SerTyr: 1.646 ± 0.976
0.0SerXaa: 0.0 ± 0.0
Thr
4.115ThrAla: 4.115 ± 3.691
0.0ThrCys: 0.0 ± 0.0
1.646ThrAsp: 1.646 ± 0.74
4.115ThrGlu: 4.115 ± 1.288
1.646ThrPhe: 1.646 ± 0.976
2.469ThrGly: 2.469 ± 1.615
0.823ThrHis: 0.823 ± 0.488
0.823ThrIle: 0.823 ± 0.824
1.646ThrLys: 1.646 ± 0.976
5.761ThrLeu: 5.761 ± 0.928
1.646ThrMet: 1.646 ± 0.804
1.646ThrAsn: 1.646 ± 0.74
3.292ThrPro: 3.292 ± 1.477
0.0ThrGln: 0.0 ± 0.0
0.0ThrArg: 0.0 ± 0.0
4.938ThrSer: 4.938 ± 2.187
1.646ThrThr: 1.646 ± 0.976
2.469ThrVal: 2.469 ± 1.214
0.0ThrTrp: 0.0 ± 0.0
0.823ThrTyr: 0.823 ± 0.824
0.0ThrXaa: 0.0 ± 0.0
Val
3.292ValAla: 3.292 ± 0.53
0.823ValCys: 0.823 ± 0.824
2.469ValAsp: 2.469 ± 1.464
4.115ValGlu: 4.115 ± 1.494
1.646ValPhe: 1.646 ± 0.976
0.823ValGly: 0.823 ± 0.488
0.0ValHis: 0.0 ± 0.0
0.0ValIle: 0.0 ± 0.0
2.469ValLys: 2.469 ± 1.464
3.292ValLeu: 3.292 ± 2.305
0.0ValMet: 0.0 ± 0.0
4.115ValAsn: 4.115 ± 1.735
4.938ValPro: 4.938 ± 2.187
3.292ValGln: 3.292 ± 1.479
4.115ValArg: 4.115 ± 2.139
7.407ValSer: 7.407 ± 1.868
4.115ValThr: 4.115 ± 1.374
3.292ValVal: 3.292 ± 1.309
0.823ValTrp: 0.823 ± 0.488
2.469ValTyr: 2.469 ± 0.571
0.0ValXaa: 0.0 ± 0.0
Trp
0.823TrpAla: 0.823 ± 0.824
0.823TrpCys: 0.823 ± 0.824
0.823TrpAsp: 0.823 ± 0.488
1.646TrpGlu: 1.646 ± 0.976
0.823TrpPhe: 0.823 ± 0.488
0.0TrpGly: 0.0 ± 0.0
0.823TrpHis: 0.823 ± 0.488
0.823TrpIle: 0.823 ± 0.824
0.0TrpLys: 0.0 ± 0.0
0.823TrpLeu: 0.823 ± 0.488
0.0TrpMet: 0.0 ± 0.0
2.469TrpAsn: 2.469 ± 1.627
0.0TrpPro: 0.0 ± 0.0
1.646TrpGln: 1.646 ± 0.92
0.0TrpArg: 0.0 ± 0.0
0.0TrpSer: 0.0 ± 0.0
0.823TrpThr: 0.823 ± 0.893
2.469TrpVal: 2.469 ± 0.944
0.0TrpTrp: 0.0 ± 0.0
0.0TrpTyr: 0.0 ± 0.0
0.0TrpXaa: 0.0 ± 0.0
Tyr
4.115TyrAla: 4.115 ± 1.494
0.823TyrCys: 0.823 ± 0.824
3.292TyrAsp: 3.292 ± 1.479
4.938TyrGlu: 4.938 ± 1.725
4.938TyrPhe: 4.938 ± 2.187
2.469TyrGly: 2.469 ± 0.571
1.646TyrHis: 1.646 ± 0.74
1.646TyrIle: 1.646 ± 0.976
3.292TyrLys: 3.292 ± 1.7
1.646TyrLeu: 1.646 ± 0.74
2.469TyrMet: 2.469 ± 1.615
3.292TyrAsn: 3.292 ± 1.157
1.646TyrPro: 1.646 ± 1.415
4.115TyrGln: 4.115 ± 1.731
4.115TyrArg: 4.115 ± 1.374
0.0TyrSer: 0.0 ± 0.0
0.823TyrThr: 0.823 ± 0.488
1.646TyrVal: 1.646 ± 1.648
0.0TyrTrp: 0.0 ± 0.0
1.646TyrTyr: 1.646 ± 0.74
0.0TyrXaa: 0.0 ± 0.0
Xaa
0.0XaaAla: 0.0 ± 0.0
0.0XaaCys: 0.0 ± 0.0
0.0XaaAsp: 0.0 ± 0.0
0.0XaaGlu: 0.0 ± 0.0
0.0XaaPhe: 0.0 ± 0.0
0.0XaaGly: 0.0 ± 0.0
0.0XaaHis: 0.0 ± 0.0
0.0XaaIle: 0.0 ± 0.0
0.0XaaLys: 0.0 ± 0.0
0.0XaaLeu: 0.0 ± 0.0
0.0XaaMet: 0.0 ± 0.0
0.0XaaAsn: 0.0 ± 0.0
0.0XaaPro: 0.0 ± 0.0
0.0XaaGln: 0.0 ± 0.0
0.0XaaArg: 0.0 ± 0.0
0.0XaaSer: 0.0 ± 0.0
0.0XaaThr: 0.0 ± 0.0
0.0XaaVal: 0.0 ± 0.0
0.0XaaTrp: 0.0 ± 0.0
0.0XaaTyr: 0.0 ± 0.0
0.0XaaXaa: 0.0 ± 0.0
Statistics based on 4 proteins (1216 amino acids)

Note: The error has been estimated by bootstrapping (100 replicates) at the protein level.
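Bootstrapping at the protein level can be read as resampling whole proteins with replacement and recomputing the frequency for each replicate. The sketch below shows that idea on toy data. It is not the database's implementation: the function names, the use of the sample standard deviation as the reported error, the fixed seed, and the toy sequences are all assumptions.

```python
import random
import statistics
from collections import Counter


def dipeptide_per_mille(proteins, dipeptide):
    """Frequency of one dipeptide (per mille) over overlapping pairs."""
    counts = Counter()
    for seq in proteins:
        counts.update(seq[i:i + 2] for i in range(len(seq) - 1))
    total = sum(counts.values())
    return 1000.0 * counts[dipeptide] / total if total else 0.0


def bootstrap_error(proteins, dipeptide, replicates=100, seed=0):
    """Standard deviation of the frequency across protein-level resamples."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    estimates = []
    for _ in range(replicates):
        # Draw as many proteins as in the original set, with replacement.
        sample = rng.choices(proteins, k=len(proteins))
        estimates.append(dipeptide_per_mille(sample, dipeptide))
    return statistics.stdev(estimates)


# Toy one-letter sequences, hypothetical and unrelated to this proteome.
proteins = ["MKKLA", "MKAA", "AAKL", "LLMK"]
value = dipeptide_per_mille(proteins, "MK")
error = bootstrap_error(proteins, "MK")
print(f"MK: {value:.3f} ± {error:.3f}")
```

With only a handful of proteins, as in the 4-protein proteome here, each replicate can differ a lot from the original set, which is consistent with the large errors relative to the values in the table.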

The dipeptide statistics above (among other statistics for this proteome) can be downloaded from this CSV file
See this proteome in: uniprot_link
Proteome-pI is available under Creative Commons Attribution-NoDerivs license, for more details see here

Reference: Kozlowski LP. Proteome-pI 2.0: Proteome Isoelectric Point Database Update. Nucleic Acids Res. 2021, doi: 10.1093/nar/gkab944
Contact: Lukasz P. Kozlowski