Amino acid dipeptide frequency for Caulobacter phage phiCb5

Apart from single amino acid frequencies, one can also calculate the so-called amino acid dipeptide frequency.

There are 400 possible dipeptides (441 if Xaa is counted as an additional, 21st amino acid representing all non-standard or unknown amino acids). Thus, if dipeptides occurred randomly in proteins, each of the 400 would be present at a frequency of 0.25%. As this is not the case in nature, for better visibility all overrepresented dipeptides are marked in red in the table, and underrepresented ones are marked in blue.

All values are given in per mille (‰) and therefore need to be multiplied by 10^-3 to obtain fractions; on this scale, the 0.25% random expectation corresponds to 2.5‰.
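The calculation described above can be sketched in a few lines of Python. This is only a minimal illustration, not the code used to build this page: the function names (`dipeptide_permille`, `classify`) are hypothetical, sequences are given in one-letter codes rather than the three-letter codes of the table, and it is an assumption that overlapping pairs are counted within each protein and that every non-standard residue is mapped to X (Xaa).

```python
from collections import Counter
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # 20 standard residues, one-letter codes
EXPECTED_PERMILLE = 1000.0 / 400      # 0.25% expressed in per mille (2.5)


def dipeptide_permille(proteins):
    """Return {dipeptide: frequency in per mille} for all 441 pairs."""
    counts = Counter()
    for seq in proteins:
        # Assumption: anything outside the 20 standard residues becomes X (Xaa).
        seq = "".join(aa if aa in AMINO_ACIDS else "X" for aa in seq.upper())
        # Assumption: overlapping adjacent pairs, counted within each protein.
        counts.update(seq[i:i + 2] for i in range(len(seq) - 1))
    total = sum(counts.values())
    alphabet = AMINO_ACIDS + "X"
    return {
        a + b: (1000.0 * counts[a + b] / total if total else 0.0)
        for a, b in product(alphabet, repeat=2)
    }


def classify(freq_permille):
    """Compare a frequency with the uniform expectation of 2.5 per mille."""
    if freq_permille > EXPECTED_PERMILLE:
        return "over"
    if freq_permille < EXPECTED_PERMILLE:
        return "under"
    return "expected"


# Toy sequences for illustration only; these are not phiCb5 proteins.
freqs = dipeptide_permille(["MKKLLA", "ALLSX"])
```

With this convention, a table value such as 3.113‰ would be classified as "over" and 0.778‰ as "under".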

For more information, see the sequence space article on Wikipedia.

Ala Cys Asp Glu Phe Gly His Ile Lys Leu Met Asn Pro Gln Arg Ser Thr Val Trp Tyr Xaa
Ala
AlaAla: 3.113 ± 1.961
AlaCys: 0.0 ± 0.0
AlaAsp: 1.556 ± 0.98
AlaGlu: 1.556 ± 0.98
AlaPhe: 2.335 ± 2.318
AlaGly: 2.335 ± 1.286
AlaHis: 1.556 ± 0.98
AlaIle: 1.556 ± 0.98
AlaLys: 3.113 ± 2.464
AlaLeu: 10.117 ± 2.782
AlaMet: 0.0 ± 0.0
AlaAsn: 0.778 ± 0.773
AlaPro: 0.778 ± 0.49
AlaGln: 0.0 ± 0.0
AlaArg: 3.113 ± 3.091
AlaSer: 10.895 ± 2.157
AlaThr: 3.891 ± 2.077
AlaVal: 5.447 ± 3.431
AlaTrp: 1.556 ± 0.592
AlaTyr: 2.335 ± 1.294
AlaXaa: 0.0 ± 0.0
Cys
CysAla: 0.0 ± 0.0
CysCys: 0.0 ± 0.0
CysAsp: 0.778 ± 0.49
CysGlu: 0.0 ± 0.0
CysPhe: 0.0 ± 0.0
CysGly: 2.335 ± 1.47
CysHis: 0.0 ± 0.0
CysIle: 0.0 ± 0.0
CysLys: 0.0 ± 0.0
CysLeu: 3.113 ± 2.472
CysMet: 0.0 ± 0.0
CysAsn: 0.0 ± 0.0
CysPro: 0.778 ± 0.773
CysGln: 0.778 ± 0.49
CysArg: 1.556 ± 0.592
CysSer: 3.113 ± 1.331
CysThr: 0.0 ± 0.0
CysVal: 0.778 ± 0.49
CysTrp: 0.0 ± 0.0
CysTyr: 0.778 ± 0.49
CysXaa: 0.0 ± 0.0
Asp
AspAla: 1.556 ± 0.592
AspCys: 3.113 ± 1.139
AspAsp: 1.556 ± 0.98
AspGlu: 1.556 ± 0.592
AspPhe: 2.335 ± 0.764
AspGly: 3.113 ± 0.963
AspHis: 1.556 ± 1.236
AspIle: 4.669 ± 1.775
AspLys: 1.556 ± 0.98
AspLeu: 7.782 ± 1.03
AspMet: 0.778 ± 0.49
AspAsn: 1.556 ± 0.592
AspPro: 2.335 ± 1.47
AspGln: 1.556 ± 0.98
AspArg: 2.335 ± 1.47
AspSer: 5.447 ± 2.75
AspThr: 3.891 ± 1.038
AspVal: 3.891 ± 1.171
AspTrp: 1.556 ± 0.592
AspTyr: 0.778 ± 0.49
AspXaa: 0.0 ± 0.0
Glu
GluAla: 1.556 ± 2.713
GluCys: 0.0 ± 0.0
GluAsp: 2.335 ± 1.47
GluGlu: 0.778 ± 0.773
GluPhe: 2.335 ± 2.545
GluGly: 3.113 ± 1.139
GluHis: 0.778 ± 0.49
GluIle: 0.778 ± 0.773
GluLys: 2.335 ± 1.026
GluLeu: 0.778 ± 0.773
GluMet: 0.0 ± 1.082
GluAsn: 2.335 ± 0.764
GluPro: 2.335 ± 1.47
GluGln: 0.778 ± 0.773
GluArg: 3.113 ± 1.521
GluSer: 3.113 ± 2.464
GluThr: 1.556 ± 2.713
GluVal: 3.891 ± 1.925
GluTrp: 3.113 ± 2.038
GluTyr: 1.556 ± 1.437
GluXaa: 0.0 ± 0.0
Phe
PheAla: 0.778 ± 0.49
PheCys: 0.0 ± 0.0
PheAsp: 0.778 ± 0.49
PheGlu: 0.778 ± 0.49
PhePhe: 1.556 ± 0.98
PheGly: 6.226 ± 2.367
PheHis: 0.778 ± 0.49
PheIle: 0.778 ± 0.773
PheLys: 2.335 ± 0.764
PheLeu: 4.669 ± 0.893
PheMet: 0.0 ± 0.0
PheAsn: 0.778 ± 0.773
PhePro: 2.335 ± 1.47
PheGln: 3.113 ± 2.472
PheArg: 3.891 ± 1.748
PheSer: 3.113 ± 1.331
PheThr: 2.335 ± 1.294
PheVal: 2.335 ± 1.121
PheTrp: 2.335 ± 1.47
PheTyr: 2.335 ± 1.121
PheXaa: 0.0 ± 0.0
Gly
GlyAla: 6.226 ± 3.128
GlyCys: 0.0 ± 0.0
GlyAsp: 4.669 ± 2.856
GlyGlu: 3.113 ± 0.963
GlyPhe: 6.226 ± 2.157
GlyGly: 3.891 ± 3.241
GlyHis: 1.556 ± 0.98
GlyIle: 1.556 ± 1.332
GlyLys: 3.113 ± 1.521
GlyLeu: 8.56 ± 3.55
GlyMet: 1.556 ± 1.332
GlyAsn: 2.335 ± 1.286
GlyPro: 3.113 ± 1.139
GlyGln: 0.778 ± 0.773
GlyArg: 3.113 ± 1.139
GlySer: 7.004 ± 4.118
GlyThr: 6.226 ± 2.129
GlyVal: 6.226 ± 1.32
GlyTrp: 3.113 ± 1.479
GlyTyr: 3.113 ± 1.521
GlyXaa: 0.0 ± 0.0
His
HisAla: 0.778 ± 0.773
HisCys: 0.0 ± 0.0
HisAsp: 0.0 ± 0.0
HisGlu: 0.778 ± 0.49
HisPhe: 0.0 ± 0.0
HisGly: 2.335 ± 2.518
HisHis: 0.0 ± 0.0
HisIle: 0.0 ± 0.0
HisLys: 2.335 ± 1.121
HisLeu: 3.113 ± 1.184
HisMet: 0.778 ± 0.49
HisAsn: 1.556 ± 1.232
HisPro: 0.0 ± 0.0
HisGln: 1.556 ± 0.592
HisArg: 1.556 ± 0.592
HisSer: 3.113 ± 1.139
HisThr: 2.335 ± 1.121
HisVal: 0.778 ± 0.49
HisTrp: 0.778 ± 0.49
HisTyr: 0.778 ± 0.49
HisXaa: 0.0 ± 0.0
Ile
IleAla: 3.113 ± 1.184
IleCys: 1.556 ± 1.236
IleAsp: 3.113 ± 1.961
IleGlu: 3.113 ± 1.184
IlePhe: 2.335 ± 0.764
IleGly: 3.891 ± 2.077
IleHis: 0.778 ± 0.773
IleIle: 2.335 ± 1.286
IleLys: 3.113 ± 1.331
IleLeu: 2.335 ± 1.026
IleMet: 1.556 ± 0.592
IleAsn: 2.335 ± 1.121
IlePro: 2.335 ± 1.026
IleGln: 0.778 ± 0.773
IleArg: 2.335 ± 1.121
IleSer: 5.447 ± 3.302
IleThr: 6.226 ± 1.085
IleVal: 3.891 ± 2.327
IleTrp: 0.0 ± 0.0
IleTyr: 0.778 ± 0.49
IleXaa: 0.0 ± 0.0
Lys
LysAla: 4.669 ± 0.893
LysCys: 0.0 ± 0.0
LysAsp: 3.113 ± 1.961
LysGlu: 1.556 ± 1.437
LysPhe: 0.778 ± 0.49
LysGly: 2.335 ± 1.286
LysHis: 2.335 ± 0.764
LysIle: 5.447 ± 3.661
LysLys: 2.335 ± 0.764
LysLeu: 3.113 ± 1.521
LysMet: 0.0 ± 0.0
LysAsn: 0.778 ± 0.49
LysPro: 2.335 ± 1.867
LysGln: 0.778 ± 0.773
LysArg: 3.113 ± 1.184
LysSer: 5.447 ± 1.724
LysThr: 3.891 ± 1.025
LysVal: 4.669 ± 3.696
LysTrp: 0.778 ± 0.49
LysTyr: 0.778 ± 0.49
LysXaa: 0.0 ± 0.0
Leu
LeuAla: 2.335 ± 1.026
LeuCys: 2.335 ± 1.332
LeuAsp: 7.004 ± 2.71
LeuGlu: 3.113 ± 1.581
LeuPhe: 6.226 ± 2.157
LeuGly: 14.786 ± 2.172
LeuHis: 2.335 ± 0.764
LeuIle: 6.226 ± 1.926
LeuLys: 5.447 ± 1.724
LeuLeu: 10.117 ± 4.192
LeuMet: 2.335 ± 1.026
LeuAsn: 0.778 ± 1.357
LeuPro: 6.226 ± 2.173
LeuGln: 4.669 ± 0.893
LeuArg: 10.117 ± 1.186
LeuSer: 14.008 ± 3.577
LeuThr: 3.113 ± 2.903
LeuVal: 3.113 ± 0.901
LeuTrp: 0.0 ± 0.0
LeuTyr: 2.335 ± 1.91
LeuXaa: 0.0 ± 0.0
Met
MetAla: 2.335 ± 1.121
MetCys: 0.0 ± 0.0
MetAsp: 0.0 ± 0.0
MetGlu: 0.0 ± 0.0
MetPhe: 0.778 ± 0.773
MetGly: 0.0 ± 0.0
MetHis: 0.0 ± 0.0
MetIle: 0.778 ± 0.49
MetLys: 1.556 ± 1.236
MetLeu: 1.556 ± 0.592
MetMet: 0.0 ± 0.0
MetAsn: 0.778 ± 0.49
MetPro: 2.335 ± 0.764
MetGln: 0.0 ± 0.0
MetArg: 0.0 ± 0.0
MetSer: 3.113 ± 2.298
MetThr: 1.556 ± 0.592
MetVal: 0.778 ± 0.773
MetTrp: 0.0 ± 0.0
MetTyr: 0.0 ± 0.0
MetXaa: 0.0 ± 0.0
Asn
AsnAla: 1.556 ± 0.592
AsnCys: 0.0 ± 0.0
AsnAsp: 3.113 ± 1.139
AsnGlu: 2.335 ± 1.121
AsnPhe: 0.778 ± 0.49
AsnGly: 1.556 ± 1.545
AsnHis: 1.556 ± 0.592
AsnIle: 1.556 ± 0.592
AsnLys: 0.778 ± 1.328
AsnLeu: 6.226 ± 2.018
AsnMet: 0.0 ± 0.0
AsnAsn: 1.556 ± 0.592
AsnPro: 3.891 ± 1.847
AsnGln: 1.556 ± 2.713
AsnArg: 2.335 ± 0.764
AsnSer: 0.778 ± 0.49
AsnThr: 1.556 ± 1.545
AsnVal: 1.556 ± 1.437
AsnTrp: 0.0 ± 0.0
AsnTyr: 0.778 ± 0.49
AsnXaa: 0.0 ± 0.0
Pro
ProAla: 7.004 ± 2.71
ProCys: 0.778 ± 0.49
ProAsp: 2.335 ± 0.764
ProGlu: 3.113 ± 2.417
ProPhe: 0.778 ± 1.328
ProGly: 3.113 ± 1.331
ProHis: 0.0 ± 0.0
ProIle: 3.891 ± 1.022
ProLys: 1.556 ± 0.98
ProLeu: 5.447 ± 1.836
ProMet: 2.335 ± 1.47
ProAsn: 1.556 ± 1.437
ProPro: 3.891 ± 2.327
ProGln: 0.778 ± 1.328
ProArg: 5.447 ± 1.877
ProSer: 4.669 ± 0.893
ProThr: 4.669 ± 1.331
ProVal: 0.778 ± 0.49
ProTrp: 0.778 ± 0.773
ProTyr: 0.0 ± 0.0
ProXaa: 0.0 ± 0.0
Gln
GlnAla: 2.335 ± 1.121
GlnCys: 0.778 ± 0.773
GlnAsp: 0.778 ± 1.357
GlnGlu: 2.335 ± 1.026
GlnPhe: 1.556 ± 1.545
GlnGly: 1.556 ± 0.98
GlnHis: 0.0 ± 0.0
GlnIle: 0.778 ± 0.773
GlnLys: 1.556 ± 0.592
GlnLeu: 3.891 ± 1.022
GlnMet: 0.0 ± 0.0
GlnAsn: 0.778 ± 0.49
GlnPro: 0.0 ± 0.0
GlnGln: 1.556 ± 0.98
GlnArg: 3.113 ± 1.139
GlnSer: 1.556 ± 1.545
GlnThr: 3.113 ± 1.581
GlnVal: 1.556 ± 2.002
GlnTrp: 0.0 ± 0.0
GlnTyr: 2.335 ± 1.294
GlnXaa: 0.0 ± 0.0
Arg
ArgAla: 3.113 ± 0.963
ArgCys: 0.778 ± 0.49
ArgAsp: 6.226 ± 1.32
ArgGlu: 2.335 ± 1.47
ArgPhe: 0.0 ± 0.0
ArgGly: 3.891 ± 1.025
ArgHis: 3.891 ± 4.294
ArgIle: 3.113 ± 1.184
ArgLys: 5.447 ± 1.087
ArgLeu: 7.004 ± 3.475
ArgMet: 1.556 ± 1.393
ArgAsn: 3.113 ± 1.184
ArgPro: 3.113 ± 1.331
ArgGln: 2.335 ± 1.121
ArgArg: 6.226 ± 3.302
ArgSer: 11.673 ± 2.303
ArgThr: 3.891 ± 1.276
ArgVal: 4.669 ± 1.331
ArgTrp: 2.335 ± 0.764
ArgTyr: 0.0 ± 0.0
ArgXaa: 0.0 ± 0.0
Ser
SerAla: 5.447 ± 1.087
SerCys: 1.556 ± 0.98
SerAsp: 7.782 ± 1.279
SerGlu: 2.335 ± 1.121
SerPhe: 4.669 ± 2.701
SerGly: 11.673 ± 3.757
SerHis: 1.556 ± 1.545
SerIle: 6.226 ± 1.961
SerLys: 6.226 ± 1.32
SerLeu: 14.008 ± 4.741
SerMet: 1.556 ± 1.012
SerAsn: 6.226 ± 2.018
SerPro: 3.113 ± 0.901
SerGln: 2.335 ± 1.286
SerArg: 7.782 ± 2.862
SerSer: 12.451 ± 4.675
SerThr: 5.447 ± 3.661
SerVal: 5.447 ± 1.18
SerTrp: 1.556 ± 0.592
SerTyr: 3.891 ± 1.579
SerXaa: 0.0 ± 0.0
Thr
ThrAla: 4.669 ± 1.018
ThrCys: 0.778 ± 1.328
ThrAsp: 2.335 ± 1.332
ThrGlu: 4.669 ± 1.112
ThrPhe: 3.891 ± 1.276
ThrGly: 1.556 ± 0.98
ThrHis: 0.778 ± 0.49
ThrIle: 4.669 ± 1.735
ThrLys: 3.113 ± 0.963
ThrLeu: 3.891 ± 3.768
ThrMet: 0.778 ± 0.773
ThrAsn: 1.556 ± 1.545
ThrPro: 6.226 ± 1.425
ThrGln: 1.556 ± 0.98
ThrArg: 3.113 ± 0.901
ThrSer: 4.669 ± 2.447
ThrThr: 4.669 ± 3.71
ThrVal: 8.56 ± 2.876
ThrTrp: 2.335 ± 0.764
ThrTyr: 0.778 ± 0.773
ThrXaa: 0.0 ± 0.0
Val
ValAla: 3.113 ± 1.139
ValCys: 0.0 ± 0.0
ValAsp: 2.335 ± 1.286
ValGlu: 2.335 ± 1.294
ValPhe: 3.113 ± 1.581
ValGly: 5.447 ± 0.781
ValHis: 0.778 ± 0.49
ValIle: 3.891 ± 1.647
ValLys: 1.556 ± 1.437
ValLeu: 4.669 ± 1.112
ValMet: 0.778 ± 0.49
ValAsn: 3.891 ± 1.022
ValPro: 4.669 ± 1.331
ValGln: 2.335 ± 1.294
ValArg: 8.56 ± 2.139
ValSer: 8.56 ± 4.359
ValThr: 2.335 ± 0.764
ValVal: 3.113 ± 1.479
ValTrp: 0.0 ± 0.0
ValTyr: 3.113 ± 1.521
ValXaa: 0.0 ± 0.0
Trp
TrpAla: 0.778 ± 0.773
TrpCys: 0.778 ± 0.773
TrpAsp: 0.778 ± 0.49
TrpGlu: 0.778 ± 1.357
TrpPhe: 0.778 ± 0.773
TrpGly: 1.556 ± 0.98
TrpHis: 1.556 ± 0.592
TrpIle: 3.113 ± 1.139
TrpLys: 0.778 ± 0.773
TrpLeu: 0.778 ± 0.49
TrpMet: 0.778 ± 0.773
TrpAsn: 0.778 ± 0.49
TrpPro: 0.778 ± 0.49
TrpGln: 0.778 ± 0.773
TrpArg: 0.778 ± 0.773
TrpSer: 1.556 ± 0.592
TrpThr: 2.335 ± 0.764
TrpVal: 0.778 ± 0.49
TrpTrp: 0.0 ± 0.0
TrpTyr: 0.0 ± 0.0
TrpXaa: 0.0 ± 0.0
Tyr
TyrAla: 1.556 ± 1.232
TyrCys: 1.556 ± 0.592
TyrAsp: 2.335 ± 1.47
TyrGlu: 0.778 ± 1.357
TyrPhe: 0.778 ± 0.49
TyrGly: 0.778 ± 1.357
TyrHis: 0.778 ± 0.49
TyrIle: 0.0 ± 0.0
TyrLys: 0.0 ± 0.0
TyrLeu: 5.447 ± 2.146
TyrMet: 0.0 ± 0.0
TyrAsn: 0.0 ± 0.0
TyrPro: 2.335 ± 1.47
TyrGln: 1.556 ± 0.592
TyrArg: 3.113 ± 0.901
TyrSer: 1.556 ± 0.592
TyrThr: 1.556 ± 1.232
TyrVal: 2.335 ± 1.294
TyrTrp: 0.0 ± 0.0
TyrTyr: 1.556 ± 1.232
TyrXaa: 0.0 ± 0.0
Xaa
XaaAla: 0.0 ± 0.0
XaaCys: 0.0 ± 0.0
XaaAsp: 0.0 ± 0.0
XaaGlu: 0.0 ± 0.0
XaaPhe: 0.0 ± 0.0
XaaGly: 0.0 ± 0.0
XaaHis: 0.0 ± 0.0
XaaIle: 0.0 ± 0.0
XaaLys: 0.0 ± 0.0
XaaLeu: 0.0 ± 0.0
XaaMet: 0.0 ± 0.0
XaaAsn: 0.0 ± 0.0
XaaPro: 0.0 ± 0.0
XaaGln: 0.0 ± 0.0
XaaArg: 0.0 ± 0.0
XaaSer: 0.0 ± 0.0
XaaThr: 0.0 ± 0.0
XaaVal: 0.0 ± 0.0
XaaTrp: 0.0 ± 0.0
XaaTyr: 0.0 ± 0.0
XaaXaa: 0.0 ± 0.0
Statistics based on 4 proteins (1286 amino acids)

Note: The error was estimated by bootstrapping (100 replicates) at the protein level.
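A protein-level bootstrap of this kind could look roughly like the sketch below. It is an illustration under stated assumptions rather than the procedure actually used here: whole proteins are resampled with replacement, the frequency is recomputed for each replicate, and the reported error is taken to be the standard deviation across replicates. The function names (`pair_permille`, `bootstrap_error`) and the toy sequences are hypothetical.

```python
import random
import statistics


def pair_permille(proteins, pair):
    """Frequency (per mille) of one dipeptide among all overlapping pairs."""
    hits = total = 0
    for seq in proteins:
        for i in range(len(seq) - 1):
            total += 1
            hits += seq[i:i + 2] == pair  # True counts as 1
    return 1000.0 * hits / total if total else 0.0


def bootstrap_error(proteins, pair, n_replicates=100, seed=0):
    """Std. dev. of the frequency over protein-level bootstrap replicates."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    estimates = []
    for _ in range(n_replicates):
        # Resample whole proteins with replacement, keeping the set size.
        sample = rng.choices(proteins, k=len(proteins))
        estimates.append(pair_permille(sample, pair))
    # Assumption: the "±" value is the spread of the replicate estimates.
    return statistics.pstdev(estimates)


# Toy sequences in one-letter code; these are not phiCb5 proteins.
toy = ["MKKLLA", "ALLSA", "GGSLLK", "MSTVV"]
print(pair_permille(toy, "LL"), bootstrap_error(toy, "LL"))
```

With only 4 proteins in the proteome, each replicate draws from very few units, which is consistent with errors that are large relative to the values themselves.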

The dipeptide statistics above (along with other statistics for this proteome) can be downloaded from this CSV file.
See this proteome in: uniprot_link
Proteome-pI is available under the Creative Commons Attribution-NoDerivs license; for more details see here.

Reference: Kozlowski LP. Proteome-pI 2.0: Proteome Isoelectric Point Database Update. Nucleic Acids Res. 2021, doi: 10.1093/nar/gkab944
Contact: Lukasz P. Kozlowski