Amino acid dipeptide frequency for Xanthomonas phage XaF13

Apart from single amino acid frequencies, one can also calculate the so-called amino acid dipeptide frequency.

There are 400 possible dipeptides (441 if Xaa is counted as an additional 21st amino acid representing all non-standard or unknown amino acids). Thus, if dipeptides occurred randomly in proteins, each one would be present at a frequency of 0.25% (2.5‰). As this is not the case in nature, for better visibility dipeptides that are more frequent than expected are marked in red, and those that are underrepresented are marked in blue in the table.
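The baseline arithmetic above can be checked with a short Python sketch. The page does not state the exact rule used for colouring, so the `classify` helper below is a hypothetical illustration that simply compares a value with the uniform 2.5‰ expectation; the three example values are taken from the table on this page.

```python
# Uniform expectation for dipeptide frequencies, expressed in per mille.
N_STANDARD = 20   # standard amino acids
N_WITH_XAA = 21   # plus Xaa for non-standard or unknown residues

expected_400 = 1000 / N_STANDARD**2   # 2.5 per mille, i.e. 0.25%
expected_441 = 1000 / N_WITH_XAA**2   # roughly 2.27 per mille


def classify(freq_permille, expected=expected_400):
    """Hypothetical helper: compare a frequency with the uniform baseline.

    Assumption: 'over' corresponds to the red cells and 'under' to the
    blue cells; the page does not spell out the exact threshold.
    """
    if freq_permille > expected:
        return "over"
    if freq_permille < expected:
        return "under"
    return "expected"


# Example values copied from the table on this page.
for name, value in [("AlaAla", 13.852), ("CysCys", 0.447), ("GluPro", 2.681)]:
    print(name, value, classify(value))
```

With the 441-dipeptide alphabet the baseline drops slightly, so a value between the two baselines would be classified differently depending on which alphabet is assumed.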

All values are given in per mille (‰) and therefore need to be multiplied by 10⁻³ to obtain fractions.
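As an illustration of how such per mille values can be obtained, here is a minimal Python sketch. It is not the database's own code: it assumes that overlapping neighbouring residue pairs are counted within each protein and normalised by the total number of pairs, and it uses one-letter residue codes and made-up toy sequences. The function name `dipeptide_permille` is hypothetical.

```python
from collections import Counter


def dipeptide_permille(proteins):
    """Return dipeptide frequencies in per mille.

    Assumptions (not confirmed by the source): pairs are overlapping,
    counted within each protein only, and normalised by the total
    number of pairs.
    """
    counts = Counter()
    for seq in proteins:
        # zip(seq, seq[1:]) yields every overlapping pair of neighbours
        counts.update(a + b for a, b in zip(seq, seq[1:]))
    total = sum(counts.values())
    if total == 0:
        return {}
    return {pair: 1000 * n / total for pair, n in counts.items()}


toy = ["MAAL", "AAG"]  # made-up sequences, not from the phage proteome
freqs = dipeptide_permille(toy)

# Pairs are MA, AA, AL, AA, AG: five in total, two of them AA.
print(freqs["AA"])          # 400.0 (per mille)
print(freqs["AA"] / 1000)   # 0.4 (fraction)
```

Dividing by 1000 is the same conversion as multiplying a table value by 10⁻³.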

For more information, see the sequence space article on Wikipedia.

Ala Cys Asp Glu Phe Gly His Ile Lys Leu Met Asn Pro Gln Arg Ser Thr Val Trp Tyr Xaa
Ala
13.852AlaAla: 13.852 ± 1.559
1.34AlaCys: 1.34 ± 0.66
7.149AlaAsp: 7.149 ± 1.716
5.362AlaGlu: 5.362 ± 1.897
4.021AlaPhe: 4.021 ± 2.24
5.809AlaGly: 5.809 ± 1.799
0.894AlaHis: 0.894 ± 0.467
4.915AlaIle: 4.915 ± 1.731
6.702AlaLys: 6.702 ± 2.029
11.171AlaLeu: 11.171 ± 3.472
5.362AlaMet: 5.362 ± 1.503
2.681AlaAsn: 2.681 ± 0.891
5.362AlaPro: 5.362 ± 2.224
5.809AlaGln: 5.809 ± 1.76
6.702AlaArg: 6.702 ± 1.454
5.809AlaSer: 5.809 ± 1.292
4.468AlaThr: 4.468 ± 0.945
7.596AlaVal: 7.596 ± 2.152
4.021AlaTrp: 4.021 ± 1.193
2.681AlaTyr: 2.681 ± 0.893
0.0AlaXaa: 0.0 ± 0.0
Cys
1.787CysAla: 1.787 ± 1.03
0.447CysCys: 0.447 ± 0.45
2.234CysAsp: 2.234 ± 1.335
1.34CysGlu: 1.34 ± 0.712
0.447CysPhe: 0.447 ± 0.457
0.894CysGly: 0.894 ± 0.69
0.447CysHis: 0.447 ± 0.457
0.447CysIle: 0.447 ± 0.396
1.34CysLys: 1.34 ± 1.035
1.34CysLeu: 1.34 ± 0.771
1.787CysMet: 1.787 ± 1.16
0.447CysAsn: 0.447 ± 0.345
3.128CysPro: 3.128 ± 1.567
0.447CysGln: 0.447 ± 0.345
2.681CysArg: 2.681 ± 0.918
3.128CysSer: 3.128 ± 1.186
1.34CysThr: 1.34 ± 0.719
0.894CysVal: 0.894 ± 0.501
0.894CysTrp: 0.894 ± 0.901
0.0CysTyr: 0.0 ± 0.0
0.0CysXaa: 0.0 ± 0.0
Asp
7.596AspAla: 7.596 ± 1.846
1.34AspCys: 1.34 ± 1.035
1.787AspAsp: 1.787 ± 0.835
1.787AspGlu: 1.787 ± 1.146
0.447AspPhe: 0.447 ± 0.43
9.383AspGly: 9.383 ± 3.391
0.894AspHis: 0.894 ± 0.654
0.894AspIle: 0.894 ± 0.461
3.575AspLys: 3.575 ± 1.546
4.915AspLeu: 4.915 ± 1.852
0.447AspMet: 0.447 ± 0.345
0.0AspAsn: 0.0 ± 0.0
2.234AspPro: 2.234 ± 0.903
1.787AspGln: 1.787 ± 0.903
1.787AspArg: 1.787 ± 1.03
2.234AspSer: 2.234 ± 1.43
2.234AspThr: 2.234 ± 1.017
4.021AspVal: 4.021 ± 1.241
0.0AspTrp: 0.0 ± 0.0
1.787AspTyr: 1.787 ± 1.06
0.0AspXaa: 0.0 ± 0.0
Glu
4.915GluAla: 4.915 ± 1.424
0.894GluCys: 0.894 ± 0.514
1.34GluAsp: 1.34 ± 0.673
2.234GluGlu: 2.234 ± 0.942
3.575GluPhe: 3.575 ± 1.98
2.681GluGly: 2.681 ± 2.069
0.894GluHis: 0.894 ± 0.508
1.787GluIle: 1.787 ± 0.996
1.787GluLys: 1.787 ± 1.019
4.915GluLeu: 4.915 ± 1.477
0.447GluMet: 0.447 ± 0.505
0.894GluAsn: 0.894 ± 0.694
2.681GluPro: 2.681 ± 1.061
2.234GluGln: 2.234 ± 1.168
3.128GluArg: 3.128 ± 1.548
3.128GluSer: 3.128 ± 0.919
2.234GluThr: 2.234 ± 1.426
2.234GluVal: 2.234 ± 0.539
1.34GluTrp: 1.34 ± 0.815
0.894GluTyr: 0.894 ± 0.467
0.0GluXaa: 0.0 ± 0.0
Phe
3.128PheAla: 3.128 ± 0.966
0.447PheCys: 0.447 ± 0.345
1.787PheAsp: 1.787 ± 0.628
0.447PheGlu: 0.447 ± 0.397
0.894PhePhe: 0.894 ± 0.585
6.256PheGly: 6.256 ± 1.775
1.34PheHis: 1.34 ± 0.892
0.0PheIle: 0.0 ± 0.0
1.34PheLys: 1.34 ± 1.057
2.234PheLeu: 2.234 ± 0.827
0.0PheMet: 0.0 ± 0.0
0.447PheAsn: 0.447 ± 0.457
0.894PhePro: 0.894 ± 0.599
0.894PheGln: 0.894 ± 0.665
2.681PheArg: 2.681 ± 0.832
1.787PheSer: 1.787 ± 0.662
0.447PheThr: 0.447 ± 0.457
1.787PheVal: 1.787 ± 1.253
1.34PheTrp: 1.34 ± 0.934
0.447PheTyr: 0.447 ± 0.45
0.0PheXaa: 0.0 ± 0.0
Gly
5.809GlyAla: 5.809 ± 1.252
1.787GlyCys: 1.787 ± 1.019
6.256GlyAsp: 6.256 ± 2.719
3.575GlyGlu: 3.575 ± 2.327
1.787GlyPhe: 1.787 ± 0.799
10.277GlyGly: 10.277 ± 3.527
1.34GlyHis: 1.34 ± 0.753
1.787GlyIle: 1.787 ± 0.724
4.021GlyLys: 4.021 ± 1.577
5.362GlyLeu: 5.362 ± 2.313
3.128GlyMet: 3.128 ± 1.092
5.362GlyAsn: 5.362 ± 1.625
4.021GlyPro: 4.021 ± 1.234
4.468GlyGln: 4.468 ± 1.549
5.362GlyArg: 5.362 ± 2.091
8.49GlySer: 8.49 ± 1.953
7.149GlyThr: 7.149 ± 2.002
7.149GlyVal: 7.149 ± 1.63
2.234GlyTrp: 2.234 ± 0.722
6.256GlyTyr: 6.256 ± 2.263
0.0GlyXaa: 0.0 ± 0.0
His
3.128HisAla: 3.128 ± 1.118
0.0HisCys: 0.0 ± 0.0
0.447HisAsp: 0.447 ± 0.43
0.894HisGlu: 0.894 ± 0.791
1.34HisPhe: 1.34 ± 0.536
1.34HisGly: 1.34 ± 0.728
0.0HisHis: 0.0 ± 0.0
0.0HisIle: 0.0 ± 0.0
1.34HisLys: 1.34 ± 0.793
1.34HisLeu: 1.34 ± 0.892
0.0HisMet: 0.0 ± 0.0
0.894HisAsn: 0.894 ± 0.617
0.447HisPro: 0.447 ± 0.396
0.0HisGln: 0.0 ± 0.0
2.234HisArg: 2.234 ± 1.375
0.894HisSer: 0.894 ± 0.672
0.894HisThr: 0.894 ± 0.86
2.234HisVal: 2.234 ± 1.13
0.447HisTrp: 0.447 ± 0.45
0.0HisTyr: 0.0 ± 0.0
0.0HisXaa: 0.0 ± 0.0
Ile
6.256IleAla: 6.256 ± 2.142
0.447IleCys: 0.447 ± 0.505
2.234IleAsp: 2.234 ± 0.864
3.575IleGlu: 3.575 ± 1.481
1.34IlePhe: 1.34 ± 1.29
1.787IleGly: 1.787 ± 0.752
0.0IleHis: 0.0 ± 0.0
0.894IleIle: 0.894 ± 0.672
2.681IleLys: 2.681 ± 0.962
0.894IleLeu: 0.894 ± 0.508
2.234IleMet: 2.234 ± 1.102
1.787IleAsn: 1.787 ± 0.915
1.787IlePro: 1.787 ± 1.014
2.234IleGln: 2.234 ± 0.82
3.128IleArg: 3.128 ± 1.126
0.894IleSer: 0.894 ± 0.62
1.787IleThr: 1.787 ± 0.682
0.447IleVal: 0.447 ± 0.345
0.447IleTrp: 0.447 ± 0.504
0.894IleTyr: 0.894 ± 0.541
0.0IleXaa: 0.0 ± 0.0
Lys
4.468LysAla: 4.468 ± 1.326
0.894LysCys: 0.894 ± 0.501
3.128LysAsp: 3.128 ± 0.962
2.234LysGlu: 2.234 ± 1.141
0.894LysPhe: 0.894 ± 0.583
4.021LysGly: 4.021 ± 0.999
1.787LysHis: 1.787 ± 0.838
1.34LysIle: 1.34 ± 0.866
2.681LysLys: 2.681 ± 1.316
2.234LysLeu: 2.234 ± 0.859
0.894LysMet: 0.894 ± 0.604
1.34LysAsn: 1.34 ± 0.712
4.468LysPro: 4.468 ± 1.812
0.894LysGln: 0.894 ± 0.541
3.128LysArg: 3.128 ± 1.116
3.575LysSer: 3.575 ± 0.992
1.787LysThr: 1.787 ± 0.677
1.34LysVal: 1.34 ± 0.635
1.787LysTrp: 1.787 ± 1.015
0.894LysTyr: 0.894 ± 0.461
0.0LysXaa: 0.0 ± 0.0
Leu
12.958LeuAla: 12.958 ± 2.954
0.447LeuCys: 0.447 ± 0.45
4.915LeuAsp: 4.915 ± 1.403
3.128LeuGlu: 3.128 ± 0.811
1.787LeuPhe: 1.787 ± 1.471
6.256LeuGly: 6.256 ± 1.869
2.681LeuHis: 2.681 ± 0.937
4.468LeuIle: 4.468 ± 1.831
2.681LeuLys: 2.681 ± 1.391
7.149LeuLeu: 7.149 ± 2.137
2.234LeuMet: 2.234 ± 0.695
1.787LeuAsn: 1.787 ± 0.735
8.043LeuPro: 8.043 ± 2.49
3.128LeuGln: 3.128 ± 0.899
4.468LeuArg: 4.468 ± 1.248
2.681LeuSer: 2.681 ± 1.08
2.681LeuThr: 2.681 ± 1.036
9.83LeuVal: 9.83 ± 3.059
2.681LeuTrp: 2.681 ± 1.241
1.787LeuTyr: 1.787 ± 0.894
0.0LeuXaa: 0.0 ± 0.0
Met
3.128MetAla: 3.128 ± 0.946
0.894MetCys: 0.894 ± 0.69
0.447MetAsp: 0.447 ± 0.397
0.447MetGlu: 0.447 ± 0.43
0.0MetPhe: 0.0 ± 0.0
2.234MetGly: 2.234 ± 1.069
0.447MetHis: 0.447 ± 0.397
1.34MetIle: 1.34 ± 0.5
0.447MetLys: 0.447 ± 0.397
1.34MetLeu: 1.34 ± 0.615
0.894MetMet: 0.894 ± 0.583
0.447MetAsn: 0.447 ± 0.551
2.681MetPro: 2.681 ± 1.316
2.234MetGln: 2.234 ± 0.836
1.34MetArg: 1.34 ± 0.844
2.681MetSer: 2.681 ± 0.927
4.021MetThr: 4.021 ± 1.088
2.234MetVal: 2.234 ± 1.153
0.447MetTrp: 0.447 ± 0.522
0.0MetTyr: 0.0 ± 0.0
0.0MetXaa: 0.0 ± 0.0
Asn
4.021AsnAla: 4.021 ± 1.644
0.894AsnCys: 0.894 ± 0.69
1.787AsnAsp: 1.787 ± 0.666
1.787AsnGlu: 1.787 ± 0.874
1.34AsnPhe: 1.34 ± 0.736
4.915AsnGly: 4.915 ± 1.958
0.0AsnHis: 0.0 ± 0.0
1.34AsnIle: 1.34 ± 0.713
0.447AsnLys: 0.447 ± 0.345
1.787AsnLeu: 1.787 ± 1.01
0.447AsnMet: 0.447 ± 0.43
1.34AsnAsn: 1.34 ± 1.035
0.447AsnPro: 0.447 ± 0.505
0.447AsnGln: 0.447 ± 0.43
2.681AsnArg: 2.681 ± 1.626
2.234AsnSer: 2.234 ± 0.709
1.787AsnThr: 1.787 ± 0.852
0.447AsnVal: 0.447 ± 0.45
0.0AsnTrp: 0.0 ± 0.0
0.447AsnTyr: 0.447 ± 0.43
0.0AsnXaa: 0.0 ± 0.0
Pro
5.809ProAla: 5.809 ± 1.96
0.894ProCys: 0.894 ± 0.658
2.681ProAsp: 2.681 ± 1.663
4.021ProGlu: 4.021 ± 1.214
1.34ProPhe: 1.34 ± 0.554
6.256ProGly: 6.256 ± 1.544
0.447ProHis: 0.447 ± 0.396
2.234ProIle: 2.234 ± 1.226
2.681ProLys: 2.681 ± 1.297
4.468ProLeu: 4.468 ± 2.59
0.894ProMet: 0.894 ± 0.708
2.234ProAsn: 2.234 ± 1.17
4.915ProPro: 4.915 ± 2.412
2.681ProGln: 2.681 ± 1.621
4.021ProArg: 4.021 ± 1.97
5.809ProSer: 5.809 ± 2.271
3.128ProThr: 3.128 ± 1.583
5.362ProVal: 5.362 ± 1.848
2.234ProTrp: 2.234 ± 0.944
0.447ProTyr: 0.447 ± 0.505
0.0ProXaa: 0.0 ± 0.0
Gln
4.915GlnAla: 4.915 ± 1.41
2.234GlnCys: 2.234 ± 1.16
0.447GlnAsp: 0.447 ± 0.43
1.34GlnGlu: 1.34 ± 0.673
0.894GlnPhe: 0.894 ± 0.638
5.362GlnGly: 5.362 ± 2.234
0.894GlnHis: 0.894 ± 0.561
1.34GlnIle: 1.34 ± 0.51
0.0GlnLys: 0.0 ± 0.0
3.575GlnLeu: 3.575 ± 1.186
0.0GlnMet: 0.0 ± 0.0
0.447GlnAsn: 0.447 ± 0.497
5.362GlnPro: 5.362 ± 1.732
5.362GlnGln: 5.362 ± 2.254
4.021GlnArg: 4.021 ± 1.368
2.234GlnSer: 2.234 ± 1.124
1.34GlnThr: 1.34 ± 0.495
0.894GlnVal: 0.894 ± 0.535
2.681GlnTrp: 2.681 ± 1.168
0.0GlnTyr: 0.0 ± 0.0
0.0GlnXaa: 0.0 ± 0.0
Arg
6.702ArgAla: 6.702 ± 2.34
1.787ArgCys: 1.787 ± 1.083
3.128ArgAsp: 3.128 ± 1.12
2.681ArgGlu: 2.681 ± 1.316
1.787ArgPhe: 1.787 ± 0.936
5.809ArgGly: 5.809 ± 1.587
1.34ArgHis: 1.34 ± 0.753
4.915ArgIle: 4.915 ± 0.838
1.787ArgLys: 1.787 ± 0.81
8.043ArgLeu: 8.043 ± 1.584
2.234ArgMet: 2.234 ± 1.029
2.234ArgAsn: 2.234 ± 0.918
4.021ArgPro: 4.021 ± 2.862
3.128ArgGln: 3.128 ± 1.195
4.468ArgArg: 4.468 ± 1.1
3.575ArgSer: 3.575 ± 2.146
4.021ArgThr: 4.021 ± 1.186
2.681ArgVal: 2.681 ± 1.05
2.681ArgTrp: 2.681 ± 1.107
0.447ArgTyr: 0.447 ± 0.396
0.0ArgXaa: 0.0 ± 0.0
Ser
8.937SerAla: 8.937 ± 2.548
3.575SerCys: 3.575 ± 2.003
3.575SerAsp: 3.575 ± 1.276
1.787SerGlu: 1.787 ± 0.802
0.0SerPhe: 0.0 ± 0.0
4.915SerGly: 4.915 ± 1.525
2.234SerHis: 2.234 ± 1.358
1.787SerIle: 1.787 ± 0.486
4.468SerLys: 4.468 ± 1.133
7.149SerLeu: 7.149 ± 1.922
2.234SerMet: 2.234 ± 1.069
1.34SerAsn: 1.34 ± 0.594
4.468SerPro: 4.468 ± 1.182
2.234SerGln: 2.234 ± 0.828
5.362SerArg: 5.362 ± 2.014
5.362SerSer: 5.362 ± 1.96
4.021SerThr: 4.021 ± 1.083
1.787SerVal: 1.787 ± 0.842
0.0SerTrp: 0.0 ± 0.0
0.447SerTyr: 0.447 ± 0.345
0.0SerXaa: 0.0 ± 0.0
Thr
6.256ThrAla: 6.256 ± 1.882
4.021ThrCys: 4.021 ± 1.487
2.234ThrAsp: 2.234 ± 0.917
2.681ThrGlu: 2.681 ± 1.002
1.34ThrPhe: 1.34 ± 0.714
7.149ThrGly: 7.149 ± 2.381
0.447ThrHis: 0.447 ± 0.43
2.681ThrIle: 2.681 ± 0.876
1.787ThrLys: 1.787 ± 0.645
3.575ThrLeu: 3.575 ± 1.403
2.234ThrMet: 2.234 ± 0.946
0.894ThrAsn: 0.894 ± 0.463
2.681ThrPro: 2.681 ± 1.215
1.787ThrGln: 1.787 ± 0.897
3.575ThrArg: 3.575 ± 1.15
4.468ThrSer: 4.468 ± 1.679
2.681ThrThr: 2.681 ± 1.304
2.234ThrVal: 2.234 ± 1.155
0.894ThrTrp: 0.894 ± 0.545
0.447ThrTyr: 0.447 ± 0.345
0.0ThrXaa: 0.0 ± 0.0
Val
3.128ValAla: 3.128 ± 1.138
1.787ValCys: 1.787 ± 1.071
2.681ValAsp: 2.681 ± 0.974
3.575ValGlu: 3.575 ± 1.661
3.575ValPhe: 3.575 ± 1.013
5.809ValGly: 5.809 ± 1.041
1.34ValHis: 1.34 ± 0.623
1.787ValIle: 1.787 ± 0.85
2.234ValLys: 2.234 ± 0.877
8.49ValLeu: 8.49 ± 3.717
0.894ValMet: 0.894 ± 0.529
0.894ValAsn: 0.894 ± 0.654
2.234ValPro: 2.234 ± 0.661
2.234ValGln: 2.234 ± 1.429
5.362ValArg: 5.362 ± 1.412
4.915ValSer: 4.915 ± 1.592
3.575ValThr: 3.575 ± 1.475
4.915ValVal: 4.915 ± 1.572
3.575ValTrp: 3.575 ± 1.303
0.447ValTyr: 0.447 ± 0.505
0.0ValXaa: 0.0 ± 0.0
Trp
2.681TrpAla: 2.681 ± 1.498
1.34TrpCys: 1.34 ± 0.617
0.447TrpAsp: 0.447 ± 0.484
0.894TrpGlu: 0.894 ± 0.735
1.787TrpPhe: 1.787 ± 0.725
2.234TrpGly: 2.234 ± 0.823
0.447TrpHis: 0.447 ± 0.396
1.34TrpIle: 1.34 ± 0.762
1.34TrpLys: 1.34 ± 0.998
5.362TrpLeu: 5.362 ± 2.293
0.894TrpMet: 0.894 ± 0.545
1.34TrpAsn: 1.34 ± 0.7
1.34TrpPro: 1.34 ± 0.833
0.894TrpGln: 0.894 ± 0.657
0.894TrpArg: 0.894 ± 0.69
0.894TrpSer: 0.894 ± 0.585
2.234TrpThr: 2.234 ± 1.059
0.894TrpVal: 0.894 ± 0.88
0.894TrpTrp: 0.894 ± 0.654
0.894TrpTyr: 0.894 ± 0.561
0.0TrpXaa: 0.0 ± 0.0
Tyr
2.234TyrAla: 2.234 ± 1.023
0.0TyrCys: 0.0 ± 0.0
0.447TyrAsp: 0.447 ± 0.505
0.447TyrGlu: 0.447 ± 0.396
0.0TyrPhe: 0.0 ± 0.0
2.234TyrGly: 2.234 ± 0.695
0.0TyrHis: 0.0 ± 0.0
0.447TyrIle: 0.447 ± 0.397
0.447TyrLys: 0.447 ± 0.43
0.447TyrLeu: 0.447 ± 0.345
0.0TyrMet: 0.0 ± 0.0
1.787TyrAsn: 1.787 ± 1.198
1.34TyrPro: 1.34 ± 0.594
0.447TyrGln: 0.447 ± 0.45
0.447TyrArg: 0.447 ± 0.505
1.34TyrSer: 1.34 ± 0.73
2.234TyrThr: 2.234 ± 0.878
4.468TyrVal: 4.468 ± 1.39
0.447TyrTrp: 0.447 ± 0.45
0.894TyrTyr: 0.894 ± 0.529
0.0TyrXaa: 0.0 ± 0.0
Xaa
0.0XaaAla: 0.0 ± 0.0
0.0XaaCys: 0.0 ± 0.0
0.0XaaAsp: 0.0 ± 0.0
0.0XaaGlu: 0.0 ± 0.0
0.0XaaPhe: 0.0 ± 0.0
0.0XaaGly: 0.0 ± 0.0
0.0XaaHis: 0.0 ± 0.0
0.0XaaIle: 0.0 ± 0.0
0.0XaaLys: 0.0 ± 0.0
0.0XaaLeu: 0.0 ± 0.0
0.0XaaMet: 0.0 ± 0.0
0.0XaaAsn: 0.0 ± 0.0
0.0XaaPro: 0.0 ± 0.0
0.0XaaGln: 0.0 ± 0.0
0.0XaaArg: 0.0 ± 0.0
0.0XaaSer: 0.0 ± 0.0
0.0XaaThr: 0.0 ± 0.0
0.0XaaVal: 0.0 ± 0.0
0.0XaaTrp: 0.0 ± 0.0
0.0XaaTyr: 0.0 ± 0.0
0.0XaaXaa: 0.0 ± 0.0
Statistics based on 14 proteins (2239 amino acids)

Note: The error has been estimated by bootstrapping (100 replicates) at the protein level.
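A protein-level bootstrap of this kind might look like the Python sketch below. The source only states that 100 bootstrap rounds were run at the protein level; the choice of the standard deviation across replicates as the reported error, the function names, and the toy sequences are all assumptions made for illustration.

```python
import random
import statistics
from collections import Counter


def pair_permille(proteins, pair):
    """Frequency of one dipeptide in per mille (pairs counted within proteins)."""
    counts = Counter()
    for seq in proteins:
        counts.update(a + b for a, b in zip(seq, seq[1:]))
    total = sum(counts.values())
    return 1000 * counts[pair] / total if total else 0.0


def bootstrap_error(proteins, pair, n_rounds=100, seed=0):
    """Resample whole proteins with replacement and return the standard
    deviation of the dipeptide frequency across the replicates.

    Assumption: the standard deviation is the error estimate; the source
    does not say which statistic was actually used.
    """
    rng = random.Random(seed)
    estimates = []
    for _ in range(n_rounds):
        # Draw as many proteins as in the original set, with replacement.
        sample = rng.choices(proteins, k=len(proteins))
        estimates.append(pair_permille(sample, pair))
    return statistics.stdev(estimates)


toy = ["MAAL", "AAG", "MKAAV", "GLLA"]  # made-up sequences
print(bootstrap_error(toy, "AA"))
```

Resampling whole proteins rather than individual residues keeps each sequence intact, so the estimate reflects variation between proteins in the proteome.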

The dipeptide statistics above (along with other statistics for this proteome) can be downloaded as a CSV file.
See this proteome in: uniprot_link
Proteome-pI is available under Creative Commons Attribution-NoDerivs license, for more details see here

Reference: Kozlowski LP. Proteome-pI 2.0: Proteome Isoelectric Point Database Update. Nucleic Acids Res. 2021, doi: 10.1093/nar/gkab944
Contact: Lukasz P. Kozlowski