Amino acid dipeptide frequency for Streptococcus satellite phage Javan431

Apart from single amino acid frequencies, one can also calculate the so-called amino acid dipeptide frequency, i.e. how often each ordered pair of adjacent amino acids occurs.

There are 400 possible dipeptides (441 if Xaa is counted as an additional 21st amino acid representing all non-standard or unknown amino acids). Thus, if dipeptides occurred at random in proteins, with all amino acids equally likely, each of the 400 dipeptides would be present at a frequency of 0.25% (2.5‰). As this is not the case in nature, for better visibility all over-represented dipeptides are marked in red and all under-represented ones in blue in the table.
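
The calculation described above can be sketched in a few lines of Python. This is a minimal illustration, not the code used by the database: the function name `dipeptide_permille`, the one-letter residue codes, and the toy sequences are assumptions made for the example, as is the choice to count overlapping pairs within each protein only (the page does not state how pairs at protein boundaries are handled).

```python
from collections import Counter
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # 20 standard residues, one-letter codes
ALPHABET = AMINO_ACIDS + "X"          # X plays the role of Xaa (non-standard/unknown)

# Expected value if all 400 standard dipeptides were equally likely: 2.5 per mille
UNIFORM_PERMILLE = 1000.0 / 400


def dipeptide_permille(proteins):
    """Return {dipeptide: frequency in per mille} for all 441 ordered pairs.

    Assumption: overlapping pairs are counted within each protein,
    never across the boundary between two proteins.
    """
    counts = Counter()
    for seq in proteins:
        for a, b in zip(seq, seq[1:]):
            # Map any residue outside the 20 standard ones to X
            a = a if a in AMINO_ACIDS else "X"
            b = b if b in AMINO_ACIDS else "X"
            counts[a + b] += 1
    total = sum(counts.values())
    if total == 0:
        return {a + b: 0.0 for a, b in product(ALPHABET, repeat=2)}
    return {
        a + b: 1000.0 * counts[a + b] / total
        for a, b in product(ALPHABET, repeat=2)
    }


# Toy sequences for illustration only (not the Javan431 proteome)
proteins = ["MKLLEEK", "MAEKLL"]
freqs = dipeptide_permille(proteins)

# Dipeptides above / below the uniform expectation (red / blue in the table)
over = sorted(dp for dp, f in freqs.items() if f > UNIFORM_PERMILLE)
under = sorted(dp for dp, f in freqs.items() if f < UNIFORM_PERMILLE)
```

With real data, `proteins` would hold the sequences of the proteome, and `over` and `under` would correspond to the red and blue cells.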

All values are given in per mille (‰), so they need to be multiplied by 10⁻³ to obtain fractions.
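
As a small worked example of this unit conversion, the sketch below turns a per mille value from the table into a fraction and compares it with the uniform expectation of 1/400. The helper names `permille_to_fraction` and `enrichment` are hypothetical and introduced only for illustration.

```python
UNIFORM_FRACTION = 1 / 400  # 0.0025, i.e. 0.25% or 2.5 per mille


def permille_to_fraction(value_permille):
    """Convert a per mille value to a plain fraction (multiply by 10^-3)."""
    return value_permille * 1e-3


def enrichment(value_permille):
    """Ratio of an observed frequency to the uniform expectation of 1/400."""
    return permille_to_fraction(value_permille) / UNIFORM_FRACTION


# GluLys is listed as 11.477 per mille in the table
glu_lys_fraction = permille_to_fraction(11.477)  # about 0.0115
glu_lys_ratio = enrichment(11.477)               # roughly 4.6 times the uniform value
```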

For more information, see the sequence space article on Wikipedia.

Ala Cys Asp Glu Phe Gly His Ile Lys Leu Met Asn Pro Gln Arg Ser Thr Val Trp Tyr Xaa
Ala
AlaAla: 1.111 ± 0.529
AlaCys: 0.0 ± 0.0
AlaAsp: 2.592 ± 0.859
AlaGlu: 3.332 ± 1.803
AlaPhe: 3.702 ± 1.345
AlaGly: 1.851 ± 0.733
AlaHis: 0.74 ± 0.491
AlaIle: 5.183 ± 1.056
AlaLys: 5.183 ± 0.905
AlaLeu: 5.924 ± 1.983
AlaMet: 0.74 ± 0.494
AlaAsn: 1.111 ± 0.658
AlaPro: 1.851 ± 0.731
AlaGln: 1.851 ± 0.819
AlaArg: 3.702 ± 1.143
AlaSer: 3.702 ± 1.112
AlaThr: 4.443 ± 1.114
AlaVal: 5.183 ± 1.391
AlaTrp: 0.37 ± 0.302
AlaTyr: 2.221 ± 0.802
AlaXaa: 0.0 ± 0.0
Cys
CysAla: 0.0 ± 0.0
CysCys: 0.0 ± 0.0
CysAsp: 0.0 ± 0.0
CysGlu: 0.74 ± 0.348
CysPhe: 0.0 ± 0.0
CysGly: 0.0 ± 0.0
CysHis: 0.37 ± 0.3
CysIle: 0.0 ± 0.0
CysLys: 0.0 ± 0.0
CysLeu: 0.74 ± 0.43
CysMet: 0.37 ± 0.423
CysAsn: 0.37 ± 0.345
CysPro: 0.37 ± 0.3
CysGln: 0.37 ± 0.365
CysArg: 0.37 ± 0.478
CysSer: 0.0 ± 0.0
CysThr: 0.0 ± 0.0
CysVal: 0.37 ± 0.302
CysTrp: 0.0 ± 0.0
CysTyr: 0.37 ± 0.302
CysXaa: 0.0 ± 0.0
Asp
AspAla: 1.481 ± 0.794
AspCys: 0.74 ± 0.43
AspAsp: 3.702 ± 1.041
AspGlu: 6.664 ± 2.008
AspPhe: 3.332 ± 0.766
AspGly: 4.073 ± 1.109
AspHis: 0.37 ± 0.302
AspIle: 5.924 ± 1.303
AspLys: 6.294 ± 1.457
AspLeu: 7.775 ± 1.991
AspMet: 1.851 ± 0.694
AspAsn: 2.592 ± 0.821
AspPro: 0.74 ± 0.603
AspGln: 0.37 ± 0.302
AspArg: 3.702 ± 0.708
AspSer: 1.481 ± 0.619
AspThr: 2.592 ± 0.922
AspVal: 1.481 ± 0.854
AspTrp: 0.74 ± 0.411
AspTyr: 3.332 ± 0.952
AspXaa: 0.0 ± 0.0
Glu
GluAla: 3.702 ± 1.275
GluCys: 0.37 ± 0.3
GluAsp: 3.332 ± 1.23
GluGlu: 9.256 ± 2.714
GluPhe: 4.073 ± 1.594
GluGly: 1.481 ± 0.536
GluHis: 2.221 ± 0.738
GluIle: 7.405 ± 1.113
GluLys: 11.477 ± 2.0
GluLeu: 9.626 ± 2.261
GluMet: 3.332 ± 1.007
GluAsn: 4.443 ± 1.541
GluPro: 1.111 ± 0.669
GluGln: 2.962 ± 0.972
GluArg: 3.702 ± 1.228
GluSer: 3.702 ± 1.087
GluThr: 6.294 ± 1.527
GluVal: 6.294 ± 1.759
GluTrp: 0.37 ± 0.384
GluTyr: 5.183 ± 1.209
GluXaa: 0.0 ± 0.0
Phe
PheAla: 1.481 ± 0.808
PheCys: 0.0 ± 0.0
PheAsp: 4.073 ± 1.155
PheGlu: 3.702 ± 1.394
PhePhe: 2.592 ± 0.994
PheGly: 2.962 ± 0.827
PheHis: 1.481 ± 0.506
PheIle: 2.962 ± 1.02
PheLys: 4.813 ± 1.418
PheLeu: 7.034 ± 1.603
PheMet: 0.74 ± 0.597
PheAsn: 2.592 ± 0.84
PhePro: 0.74 ± 0.451
PheGln: 0.0 ± 0.0
PheArg: 1.481 ± 0.638
PheSer: 3.702 ± 0.914
PheThr: 4.443 ± 1.35
PheVal: 3.332 ± 1.219
PheTrp: 0.37 ± 0.302
PheTyr: 2.592 ± 0.994
PheXaa: 0.0 ± 0.0
Gly
GlyAla: 2.592 ± 1.489
GlyCys: 0.0 ± 0.0
GlyAsp: 2.592 ± 1.025
GlyGlu: 3.332 ± 1.38
GlyPhe: 2.221 ± 0.658
GlyGly: 2.221 ± 1.294
GlyHis: 1.481 ± 0.709
GlyIle: 2.962 ± 1.168
GlyLys: 4.813 ± 1.053
GlyLeu: 5.183 ± 1.255
GlyMet: 1.111 ± 0.654
GlyAsn: 0.74 ± 0.44
GlyPro: 0.37 ± 0.326
GlyGln: 2.221 ± 0.789
GlyArg: 2.962 ± 0.909
GlySer: 1.481 ± 0.645
GlyThr: 1.111 ± 0.518
GlyVal: 7.034 ± 1.473
GlyTrp: 1.111 ± 0.823
GlyTyr: 2.221 ± 0.696
GlyXaa: 0.0 ± 0.0
His
HisAla: 1.481 ± 0.687
HisCys: 0.0 ± 0.0
HisAsp: 1.111 ± 0.484
HisGlu: 1.851 ± 1.275
HisPhe: 1.481 ± 0.783
HisGly: 1.851 ± 1.014
HisHis: 0.0 ± 0.0
HisIle: 0.37 ± 0.478
HisLys: 1.481 ± 0.727
HisLeu: 0.74 ± 0.599
HisMet: 0.74 ± 0.404
HisAsn: 0.37 ± 0.302
HisPro: 0.37 ± 0.326
HisGln: 0.74 ± 0.584
HisArg: 0.74 ± 0.67
HisSer: 1.481 ± 0.384
HisThr: 0.74 ± 0.599
HisVal: 0.0 ± 0.0
HisTrp: 0.0 ± 0.0
HisTyr: 1.481 ± 0.788
HisXaa: 0.0 ± 0.0
Ile
IleAla: 2.962 ± 1.394
IleCys: 0.37 ± 0.345
IleAsp: 4.073 ± 1.08
IleGlu: 5.183 ± 1.07
IlePhe: 3.332 ± 1.066
IleGly: 2.221 ± 0.906
IleHis: 0.0 ± 0.0
IleIle: 4.073 ± 1.15
IleLys: 5.924 ± 1.579
IleLeu: 6.664 ± 1.604
IleMet: 0.74 ± 0.603
IleAsn: 3.702 ± 0.799
IlePro: 2.221 ± 0.679
IleGln: 2.592 ± 0.743
IleArg: 2.962 ± 0.783
IleSer: 4.073 ± 1.389
IleThr: 4.813 ± 0.992
IleVal: 4.073 ± 0.922
IleTrp: 1.481 ± 1.001
IleTyr: 2.962 ± 1.195
IleXaa: 0.0 ± 0.0
Lys
LysAla: 7.405 ± 1.923
LysCys: 0.0 ± 0.0
LysAsp: 6.294 ± 1.67
LysGlu: 10.367 ± 1.708
LysPhe: 4.813 ± 1.144
LysGly: 5.183 ± 1.341
LysHis: 1.851 ± 1.429
LysIle: 7.405 ± 1.602
LysLys: 9.626 ± 1.48
LysLeu: 12.218 ± 1.316
LysMet: 1.851 ± 0.765
LysAsn: 7.405 ± 1.543
LysPro: 1.851 ± 1.011
LysGln: 5.183 ± 1.833
LysArg: 4.073 ± 0.924
LysSer: 5.924 ± 0.868
LysThr: 5.183 ± 1.551
LysVal: 2.221 ± 0.75
LysTrp: 1.111 ± 0.412
LysTyr: 2.221 ± 0.852
LysXaa: 0.0 ± 0.0
Leu
LeuAla: 5.553 ± 1.608
LeuCys: 0.37 ± 0.365
LeuAsp: 7.775 ± 2.033
LeuGlu: 12.588 ± 1.914
LeuPhe: 5.183 ± 1.794
LeuGly: 5.924 ± 1.798
LeuHis: 1.851 ± 0.903
LeuIle: 5.183 ± 1.691
LeuLys: 10.367 ± 1.902
LeuLeu: 7.775 ± 2.291
LeuMet: 1.111 ± 0.558
LeuAsn: 7.775 ± 1.473
LeuPro: 3.702 ± 0.898
LeuGln: 3.702 ± 1.168
LeuArg: 4.073 ± 0.988
LeuSer: 4.073 ± 1.209
LeuThr: 9.256 ± 1.719
LeuVal: 5.183 ± 1.301
LeuTrp: 0.37 ± 0.358
LeuTyr: 4.073 ± 1.034
LeuXaa: 0.0 ± 0.0
Met
MetAla: 1.111 ± 0.74
MetCys: 0.0 ± 0.0
MetAsp: 1.481 ± 0.465
MetGlu: 1.111 ± 0.895
MetPhe: 0.37 ± 0.422
MetGly: 1.111 ± 0.617
MetHis: 0.37 ± 0.326
MetIle: 1.851 ± 0.596
MetLys: 2.592 ± 0.828
MetLeu: 1.481 ± 0.51
MetMet: 0.37 ± 0.478
MetAsn: 2.221 ± 0.808
MetPro: 0.0 ± 0.0
MetGln: 0.74 ± 0.404
MetArg: 0.74 ± 0.44
MetSer: 0.74 ± 0.579
MetThr: 0.74 ± 0.533
MetVal: 3.332 ± 1.013
MetTrp: 0.0 ± 0.0
MetTyr: 0.74 ± 0.451
MetXaa: 0.0 ± 0.0
Asn
AsnAla: 3.332 ± 1.092
AsnCys: 0.37 ± 0.3
AsnAsp: 2.962 ± 1.081
AsnGlu: 3.332 ± 1.077
AsnPhe: 1.851 ± 0.978
AsnGly: 4.073 ± 1.228
AsnHis: 1.481 ± 0.575
AsnIle: 2.592 ± 0.955
AsnLys: 6.294 ± 0.844
AsnLeu: 8.145 ± 1.795
AsnMet: 0.74 ± 0.535
AsnAsn: 0.74 ± 0.588
AsnPro: 2.221 ± 0.601
AsnGln: 4.443 ± 1.092
AsnArg: 2.962 ± 1.126
AsnSer: 1.851 ± 0.664
AsnThr: 3.332 ± 1.461
AsnVal: 1.111 ± 0.625
AsnTrp: 0.74 ± 0.348
AsnTyr: 1.481 ± 0.832
AsnXaa: 0.0 ± 0.0
Pro
ProAla: 0.74 ± 0.599
ProCys: 0.0 ± 0.0
ProAsp: 2.221 ± 0.992
ProGlu: 2.221 ± 0.887
ProPhe: 2.221 ± 0.654
ProGly: 0.74 ± 0.571
ProHis: 0.0 ± 0.0
ProIle: 1.111 ± 0.548
ProLys: 2.962 ± 1.273
ProLeu: 2.221 ± 0.921
ProMet: 0.37 ± 0.302
ProAsn: 1.481 ± 0.5
ProPro: 1.851 ± 0.806
ProGln: 1.111 ± 0.489
ProArg: 0.74 ± 0.599
ProSer: 1.481 ± 0.757
ProThr: 1.481 ± 0.753
ProVal: 0.37 ± 0.3
ProTrp: 0.0 ± 0.0
ProTyr: 1.481 ± 0.694
ProXaa: 0.0 ± 0.0
Gln
GlnAla: 4.443 ± 0.896
GlnCys: 0.0 ± 0.0
GlnAsp: 2.592 ± 1.293
GlnGlu: 4.813 ± 1.642
GlnPhe: 1.851 ± 0.637
GlnGly: 2.221 ± 0.728
GlnHis: 0.37 ± 0.326
GlnIle: 3.332 ± 1.237
GlnLys: 2.962 ± 1.281
GlnLeu: 4.073 ± 1.194
GlnMet: 0.74 ± 0.499
GlnAsn: 1.481 ± 0.699
GlnPro: 0.74 ± 0.491
GlnGln: 3.332 ± 1.109
GlnArg: 2.592 ± 0.835
GlnSer: 1.481 ± 0.68
GlnThr: 1.481 ± 0.849
GlnVal: 2.592 ± 0.931
GlnTrp: 0.74 ± 0.499
GlnTyr: 1.111 ± 0.564
GlnXaa: 0.0 ± 0.0
Arg
ArgAla: 4.443 ± 1.373
ArgCys: 0.0 ± 0.0
ArgAsp: 2.962 ± 0.892
ArgGlu: 4.443 ± 0.969
ArgPhe: 2.221 ± 0.963
ArgGly: 3.332 ± 1.131
ArgHis: 0.74 ± 0.348
ArgIle: 2.592 ± 0.794
ArgLys: 3.332 ± 0.68
ArgLeu: 5.924 ± 1.691
ArgMet: 2.221 ± 0.776
ArgAsn: 1.481 ± 0.622
ArgPro: 0.74 ± 0.603
ArgGln: 2.221 ± 0.934
ArgArg: 1.851 ± 0.806
ArgSer: 2.221 ± 0.705
ArgThr: 1.851 ± 0.526
ArgVal: 1.481 ± 0.819
ArgTrp: 0.37 ± 0.384
ArgTyr: 2.962 ± 1.049
ArgXaa: 0.0 ± 0.0
Ser
SerAla: 3.332 ± 1.071
SerCys: 1.111 ± 0.664
SerAsp: 3.332 ± 0.896
SerGlu: 4.813 ± 1.182
SerPhe: 3.332 ± 0.919
SerGly: 2.592 ± 0.612
SerHis: 1.111 ± 0.533
SerIle: 1.851 ± 0.543
SerLys: 3.702 ± 0.815
SerLeu: 5.183 ± 1.031
SerMet: 1.111 ± 0.863
SerAsn: 2.221 ± 0.556
SerPro: 1.111 ± 0.751
SerGln: 3.332 ± 1.058
SerArg: 2.592 ± 1.155
SerSer: 2.221 ± 0.916
SerThr: 1.111 ± 0.646
SerVal: 1.111 ± 0.642
SerTrp: 0.37 ± 0.365
SerTyr: 2.221 ± 0.777
SerXaa: 0.0 ± 0.0
Thr
ThrAla: 2.592 ± 0.876
ThrCys: 0.37 ± 0.423
ThrAsp: 3.332 ± 1.136
ThrGlu: 3.332 ± 1.074
ThrPhe: 2.221 ± 0.625
ThrGly: 2.221 ± 0.603
ThrHis: 1.111 ± 0.412
ThrIle: 4.443 ± 1.006
ThrLys: 7.405 ± 2.305
ThrLeu: 5.183 ± 0.986
ThrMet: 0.74 ± 0.592
ThrAsn: 2.962 ± 1.467
ThrPro: 1.851 ± 0.788
ThrGln: 3.702 ± 1.029
ThrArg: 2.962 ± 0.638
ThrSer: 2.592 ± 1.103
ThrThr: 2.592 ± 0.744
ThrVal: 4.073 ± 2.068
ThrTrp: 0.0 ± 0.0
ThrTyr: 3.332 ± 1.483
ThrXaa: 0.0 ± 0.0
Val
ValAla: 5.183 ± 1.618
ValCys: 0.0 ± 0.0
ValAsp: 2.592 ± 1.011
ValGlu: 4.073 ± 0.939
ValPhe: 3.332 ± 0.89
ValGly: 0.74 ± 0.348
ValHis: 0.37 ± 0.302
ValIle: 1.851 ± 0.729
ValLys: 5.924 ± 1.934
ValLeu: 4.813 ± 0.887
ValMet: 1.481 ± 0.676
ValAsn: 5.924 ± 0.897
ValPro: 1.111 ± 0.679
ValGln: 1.111 ± 0.562
ValArg: 2.592 ± 0.896
ValSer: 2.592 ± 1.278
ValThr: 2.592 ± 0.923
ValVal: 2.962 ± 1.036
ValTrp: 0.74 ± 0.44
ValTyr: 2.962 ± 0.925
ValXaa: 0.0 ± 0.0
Trp
TrpAla: 0.37 ± 0.3
TrpCys: 0.37 ± 0.302
TrpAsp: 0.0 ± 0.0
TrpGlu: 2.592 ± 0.934
TrpPhe: 0.37 ± 0.365
TrpGly: 0.74 ± 0.404
TrpHis: 0.0 ± 0.0
TrpIle: 1.111 ± 0.635
TrpLys: 1.111 ± 0.627
TrpLeu: 0.37 ± 0.3
TrpMet: 0.0 ± 0.0
TrpAsn: 0.37 ± 0.358
TrpPro: 0.0 ± 0.0
TrpGln: 0.37 ± 0.397
TrpArg: 0.37 ± 0.302
TrpSer: 0.37 ± 0.3
TrpThr: 0.37 ± 0.302
TrpVal: 0.0 ± 0.0
TrpTrp: 0.0 ± 0.0
TrpTyr: 0.0 ± 0.0
TrpXaa: 0.0 ± 0.0
Tyr
TyrAla: 1.481 ± 0.669
TyrCys: 0.37 ± 0.3
TyrAsp: 2.221 ± 0.629
TyrGlu: 2.592 ± 0.861
TyrPhe: 2.592 ± 1.087
TyrGly: 2.221 ± 1.065
TyrHis: 1.111 ± 0.57
TyrIle: 2.221 ± 0.794
TyrLys: 5.924 ± 1.864
TyrLeu: 4.813 ± 1.396
TyrMet: 0.37 ± 0.326
TyrAsn: 4.073 ± 0.857
TyrPro: 1.851 ± 0.588
TyrGln: 2.592 ± 0.715
TyrArg: 2.221 ± 0.829
TyrSer: 2.962 ± 0.904
TyrThr: 2.592 ± 0.972
TyrVal: 0.74 ± 0.477
TyrTrp: 0.0 ± 0.0
TyrTyr: 2.962 ± 1.24
TyrXaa: 0.0 ± 0.0
Xaa
XaaAla: 0.0 ± 0.0
XaaCys: 0.0 ± 0.0
XaaAsp: 0.0 ± 0.0
XaaGlu: 0.0 ± 0.0
XaaPhe: 0.0 ± 0.0
XaaGly: 0.0 ± 0.0
XaaHis: 0.0 ± 0.0
XaaIle: 0.0 ± 0.0
XaaLys: 0.0 ± 0.0
XaaLeu: 0.0 ± 0.0
XaaMet: 0.0 ± 0.0
XaaAsn: 0.0 ± 0.0
XaaPro: 0.0 ± 0.0
XaaGln: 0.0 ± 0.0
XaaArg: 0.0 ± 0.0
XaaSer: 0.0 ± 0.0
XaaThr: 0.0 ± 0.0
XaaVal: 0.0 ± 0.0
XaaTrp: 0.0 ± 0.0
XaaTyr: 0.0 ± 0.0
XaaXaa: 0.0 ± 0.0
Statistics based on 17 proteins (2702 amino acids)

Note: The error was estimated by bootstrapping (100 resamples) at the protein level.
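
A protein-level bootstrap of this kind could look roughly like the sketch below. It is an illustration under stated assumptions rather than the database's actual procedure: the function names, the toy sequences, the fixed seed, the use of the sample standard deviation as the error, and the counting of pairs within each protein are all choices made for the example.

```python
import random
import statistics


def single_dipeptide_permille(proteins, dipeptide):
    """Frequency (per mille) of one dipeptide over overlapping pairs within proteins."""
    hits = 0
    total = 0
    for seq in proteins:
        for i in range(len(seq) - 1):
            total += 1
            if seq[i:i + 2] == dipeptide:
                hits += 1
    return 1000.0 * hits / total if total else 0.0


def bootstrap_sd(proteins, dipeptide, n_resamples=100, seed=0):
    """Estimate the error by resampling whole proteins with replacement."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    estimates = []
    for _ in range(n_resamples):
        # Draw as many proteins as in the original set, with replacement
        sample = rng.choices(proteins, k=len(proteins))
        estimates.append(single_dipeptide_permille(sample, dipeptide))
    # Assumption: the reported error is the standard deviation of the estimates
    return statistics.stdev(estimates)


# Toy sequences for illustration only (not the Javan431 proteome)
proteins = ["MKLLEEK", "MAEKLL", "MKKAA"]
mean_kl = single_dipeptide_permille(proteins, "KL")
sd_kl = bootstrap_sd(proteins, "KL")
```

Resampling whole proteins, rather than individual residues or dipeptides, keeps each sequence intact, so the spread reflects how much the estimate depends on which proteins happen to be in the set.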

The dipeptide statistics above (along with other statistics for this proteome) can be downloaded from this CSV file
See this proteome in: uniprot_link
Proteome-pI is available under the Creative Commons Attribution-NoDerivs license; for more details see here

Reference: Kozlowski LP. Proteome-pI 2.0: Proteome Isoelectric Point Database Update. Nucleic Acids Res. 2021, doi: 10.1093/nar/gkab944
Contact: Lukasz P. Kozlowski