Amino acid dipeptide frequency for Bacillus phage pGIL02

Apart from single amino acid frequencies, one can also calculate the so-called amino acid dipeptide frequency, i.e. how often each ordered pair of adjacent residues occurs.

There are 400 possible dipeptides (441 if Xaa is counted as an additional, 21st amino acid representing all non-standard or unknown amino acids). Thus, if dipeptides occurred randomly and uniformly in proteins, each of the 400 dipeptides would be present at a frequency of 0.25% (2.5‰). As this is not the case in nature, for better visibility all dipeptides that occur more often than expected are marked in red in the table, and those that are underrepresented are marked in blue.
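As a minimal sketch of how such frequencies could be computed (not necessarily how Proteome-pI computes them), the Python below counts overlapping pairs of adjacent residues within each protein and normalizes by the total number of pairs. The function name, the toy sequences, and the choice of denominator are assumptions made for illustration. One-letter codes are used, with X standing for Xaa.

```python
from collections import Counter


def dipeptide_frequencies(proteins):
    """Return {dipeptide: frequency in per mille} for a list of sequences.

    Pairs are counted within each protein only, so no pair spans the
    boundary between two proteins.
    Assumption: frequencies are normalized by the total number of pairs.
    """
    counts = Counter()
    for seq in proteins:
        seq = seq.upper()
        for i in range(len(seq) - 1):
            counts[seq[i:i + 2]] += 1
    total = sum(counts.values())
    if total == 0:
        return {}
    return {pair: 1000.0 * n / total for pair, n in counts.items()}


# Toy input for illustration only; not taken from the pGIL02 proteome.
toy = ["MKKLA", "AKK"]
freqs = dipeptide_frequencies(toy)
for pair in sorted(freqs):
    print(f"{pair}: {freqs[pair]:.3f}")
```

Dipeptides that never occur are simply absent from the returned dictionary; a full 441-cell table would fill those cells with 0.0.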

All values are given in per mille (‰), so they need to be multiplied by 10^-3 to obtain fractions.
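Because the expectation is quoted in percent while the table uses per mille, a small conversion helper can make the comparison explicit. This is only a sketch: the names are hypothetical, and it assumes the threshold for over- and underrepresentation is the uniform expectation over 400 dipeptides (2.5‰).

```python
# Uniform expectation over the 400 standard dipeptides, in per mille.
UNIFORM_PER_MILLE = 1000.0 / 400  # 2.5 per mille, i.e. 0.25 %


def per_mille_to_fraction(value):
    """Convert a per-mille value to a plain fraction (multiply by 10^-3)."""
    return value * 1e-3


def representation(value_per_mille, expected=UNIFORM_PER_MILLE):
    """Classify a frequency relative to the expected value (assumed threshold)."""
    if value_per_mille > expected:
        return "over"
    if value_per_mille < expected:
        return "under"
    return "as expected"


# Two values taken from the table: GluGlu and AlaAla.
for name, value in [("GluGlu", 10.844), ("AlaAla", 0.471)]:
    print(name, per_mille_to_fraction(value), representation(value))
```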

For more information, see the sequence space article on Wikipedia.

Ala Cys Asp Glu Phe Gly His Ile Lys Leu Met Asn Pro Gln Arg Ser Thr Val Trp Tyr Xaa
Ala
AlaAla: 0.471 ± 0.296
AlaCys: 0.943 ± 0.448
AlaAsp: 2.829 ± 0.976
AlaGlu: 6.365 ± 1.939
AlaPhe: 2.593 ± 0.946
AlaGly: 5.658 ± 1.326
AlaHis: 1.886 ± 0.549
AlaIle: 2.357 ± 0.797
AlaLys: 6.836 ± 1.554
AlaLeu: 5.186 ± 1.119
AlaMet: 1.886 ± 0.572
AlaAsn: 2.122 ± 0.883
AlaPro: 2.357 ± 0.627
AlaGln: 2.357 ± 0.686
AlaArg: 3.536 ± 0.714
AlaSer: 4.479 ± 1.0
AlaThr: 5.422 ± 1.49
AlaVal: 2.593 ± 0.933
AlaTrp: 0.236 ± 0.228
AlaTyr: 2.829 ± 0.555
AlaXaa: 0.0 ± 0.0
Cys
CysAla: 0.471 ± 0.245
CysCys: 0.0 ± 0.0
CysAsp: 0.707 ± 0.494
CysGlu: 0.707 ± 0.33
CysPhe: 0.943 ± 0.37
CysGly: 0.0 ± 0.0
CysHis: 0.236 ± 0.187
CysIle: 0.943 ± 0.49
CysLys: 0.707 ± 0.323
CysLeu: 0.236 ± 0.23
CysMet: 0.236 ± 0.187
CysAsn: 0.236 ± 0.238
CysPro: 0.707 ± 0.481
CysGln: 0.0 ± 0.0
CysArg: 0.943 ± 0.586
CysSer: 0.0 ± 0.0
CysThr: 0.236 ± 0.23
CysVal: 0.236 ± 0.234
CysTrp: 0.0 ± 0.0
CysTyr: 0.236 ± 0.226
CysXaa: 0.0 ± 0.0
Asp
AspAla: 3.065 ± 0.792
AspCys: 0.471 ± 0.374
AspAsp: 1.886 ± 0.6
AspGlu: 4.243 ± 1.113
AspPhe: 4.008 ± 1.152
AspGly: 3.065 ± 0.68
AspHis: 0.707 ± 0.404
AspIle: 1.886 ± 0.529
AspLys: 5.658 ± 1.638
AspLeu: 4.008 ± 0.845
AspMet: 3.065 ± 0.695
AspAsn: 1.65 ± 0.457
AspPro: 3.065 ± 0.819
AspGln: 0.943 ± 0.551
AspArg: 2.122 ± 0.585
AspSer: 3.536 ± 0.825
AspThr: 2.829 ± 0.617
AspVal: 3.3 ± 0.594
AspTrp: 0.471 ± 0.272
AspTyr: 2.829 ± 1.0
AspXaa: 0.0 ± 0.0
Glu
GluAla: 4.008 ± 1.001
GluCys: 0.471 ± 0.322
GluAsp: 3.536 ± 1.147
GluGlu: 10.844 ± 5.111
GluPhe: 4.243 ± 0.703
GluGly: 6.365 ± 1.244
GluHis: 1.414 ± 0.776
GluIle: 4.008 ± 1.141
GluLys: 4.95 ± 1.354
GluLeu: 6.836 ± 1.064
GluMet: 1.886 ± 0.633
GluAsn: 3.772 ± 0.86
GluPro: 2.122 ± 0.578
GluGln: 3.3 ± 0.741
GluArg: 5.658 ± 1.014
GluSer: 1.886 ± 0.617
GluThr: 4.715 ± 1.287
GluVal: 5.186 ± 1.301
GluTrp: 0.943 ± 0.503
GluTyr: 3.3 ± 1.322
GluXaa: 0.0 ± 0.0
Phe
PheAla: 3.065 ± 0.774
PheCys: 0.471 ± 0.477
PheAsp: 3.536 ± 0.799
PheGlu: 4.008 ± 0.926
PhePhe: 1.886 ± 0.653
PheGly: 2.122 ± 0.849
PheHis: 0.471 ± 0.324
PheIle: 3.3 ± 0.761
PheLys: 1.65 ± 0.502
PheLeu: 3.536 ± 1.054
PheMet: 1.414 ± 0.527
PheAsn: 2.122 ± 0.622
PhePro: 2.829 ± 0.791
PheGln: 1.414 ± 0.63
PheArg: 1.886 ± 0.551
PheSer: 2.829 ± 1.158
PheThr: 4.008 ± 0.841
PheVal: 2.122 ± 0.638
PheTrp: 0.707 ± 0.397
PheTyr: 0.707 ± 0.355
PheXaa: 0.0 ± 0.0
Gly
GlyAla: 4.243 ± 0.914
GlyCys: 0.707 ± 0.332
GlyAsp: 4.243 ± 1.811
GlyGlu: 3.065 ± 0.866
GlyPhe: 3.3 ± 0.982
GlyGly: 8.958 ± 2.155
GlyHis: 0.471 ± 0.311
GlyIle: 3.065 ± 1.091
GlyLys: 7.779 ± 1.594
GlyLeu: 4.479 ± 1.274
GlyMet: 2.357 ± 0.772
GlyAsn: 2.829 ± 0.821
GlyPro: 1.179 ± 0.48
GlyGln: 2.593 ± 0.815
GlyArg: 3.3 ± 0.806
GlySer: 4.715 ± 1.217
GlyThr: 4.243 ± 1.276
GlyVal: 5.658 ± 1.038
GlyTrp: 1.414 ± 0.641
GlyTyr: 4.95 ± 0.851
GlyXaa: 0.0 ± 0.0
His
HisAla: 1.179 ± 0.514
HisCys: 0.236 ± 0.226
HisAsp: 0.943 ± 0.426
HisGlu: 1.414 ± 0.499
HisPhe: 0.707 ± 0.392
HisGly: 0.471 ± 0.305
HisHis: 0.471 ± 0.477
HisIle: 1.179 ± 0.419
HisLys: 1.65 ± 0.5
HisLeu: 0.471 ± 0.348
HisMet: 0.0 ± 0.0
HisAsn: 0.707 ± 0.411
HisPro: 0.471 ± 0.305
HisGln: 0.471 ± 0.359
HisArg: 0.471 ± 0.322
HisSer: 1.179 ± 0.54
HisThr: 0.707 ± 0.324
HisVal: 2.357 ± 0.744
HisTrp: 0.0 ± 0.0
HisTyr: 1.179 ± 0.444
HisXaa: 0.0 ± 0.0
Ile
IleAla: 3.3 ± 0.66
IleCys: 0.0 ± 0.0
IleAsp: 2.122 ± 0.733
IleGlu: 4.243 ± 0.914
IlePhe: 1.414 ± 0.607
IleGly: 2.593 ± 0.69
IleHis: 0.707 ± 0.422
IleIle: 3.772 ± 1.202
IleLys: 3.065 ± 0.793
IleLeu: 4.95 ± 0.845
IleMet: 1.414 ± 0.416
IleAsn: 3.772 ± 0.929
IlePro: 3.3 ± 1.162
IleGln: 3.536 ± 0.777
IleArg: 2.593 ± 0.6
IleSer: 2.122 ± 0.697
IleThr: 1.886 ± 0.603
IleVal: 3.772 ± 0.904
IleTrp: 1.179 ± 0.619
IleTyr: 3.3 ± 0.856
IleXaa: 0.0 ± 0.0
Lys
LysAla: 6.836 ± 1.189
LysCys: 0.0 ± 0.0
LysAsp: 4.715 ± 0.809
LysGlu: 8.251 ± 1.659
LysPhe: 1.886 ± 0.706
LysGly: 6.601 ± 1.631
LysHis: 1.179 ± 0.641
LysIle: 3.772 ± 0.856
LysLys: 9.901 ± 2.865
LysLeu: 6.129 ± 1.643
LysMet: 2.357 ± 0.876
LysAsn: 4.243 ± 0.931
LysPro: 5.422 ± 1.615
LysGln: 4.243 ± 0.841
LysArg: 5.186 ± 1.092
LysSer: 4.008 ± 1.013
LysThr: 5.893 ± 1.201
LysVal: 4.008 ± 0.633
LysTrp: 1.414 ± 0.6
LysTyr: 2.593 ± 0.84
LysXaa: 0.0 ± 0.0
Leu
LeuAla: 4.243 ± 0.996
LeuCys: 0.943 ± 0.6
LeuAsp: 4.008 ± 0.932
LeuGlu: 6.836 ± 1.402
LeuPhe: 5.186 ± 0.816
LeuGly: 3.3 ± 0.762
LeuHis: 0.707 ± 0.279
LeuIle: 2.829 ± 0.797
LeuLys: 5.893 ± 1.129
LeuLeu: 6.365 ± 1.199
LeuMet: 3.3 ± 1.11
LeuAsn: 4.243 ± 1.486
LeuPro: 4.243 ± 1.216
LeuGln: 3.536 ± 0.869
LeuArg: 2.829 ± 0.665
LeuSer: 4.715 ± 1.185
LeuThr: 4.243 ± 0.849
LeuVal: 4.243 ± 1.074
LeuTrp: 1.65 ± 0.576
LeuTyr: 3.3 ± 1.453
LeuXaa: 0.0 ± 0.0
Met
MetAla: 2.357 ± 0.626
MetCys: 0.471 ± 0.305
MetAsp: 1.65 ± 0.747
MetGlu: 2.829 ± 0.571
MetPhe: 0.0 ± 0.0
MetGly: 1.886 ± 0.947
MetHis: 0.471 ± 0.286
MetIle: 1.179 ± 0.562
MetLys: 1.65 ± 0.762
MetLeu: 1.886 ± 0.746
MetMet: 0.707 ± 0.394
MetAsn: 2.357 ± 0.803
MetPro: 0.943 ± 0.492
MetGln: 1.179 ± 0.496
MetArg: 0.943 ± 0.497
MetSer: 1.65 ± 0.8
MetThr: 1.65 ± 0.463
MetVal: 2.593 ± 0.637
MetTrp: 0.471 ± 0.304
MetTyr: 1.886 ± 0.545
MetXaa: 0.0 ± 0.0
Asn
AsnAla: 4.715 ± 0.857
AsnCys: 0.236 ± 0.187
AsnAsp: 2.829 ± 0.619
AsnGlu: 3.065 ± 0.666
AsnPhe: 2.122 ± 0.562
AsnGly: 4.008 ± 0.797
AsnHis: 1.414 ± 0.523
AsnIle: 2.122 ± 0.843
AsnLys: 2.593 ± 0.588
AsnLeu: 3.3 ± 1.035
AsnMet: 1.414 ± 0.805
AsnAsn: 3.3 ± 0.919
AsnPro: 0.943 ± 0.43
AsnGln: 0.707 ± 0.535
AsnArg: 1.886 ± 0.684
AsnSer: 3.536 ± 0.765
AsnThr: 4.715 ± 0.839
AsnVal: 3.065 ± 1.073
AsnTrp: 0.471 ± 0.321
AsnTyr: 1.65 ± 0.517
AsnXaa: 0.0 ± 0.0
Pro
ProAla: 3.3 ± 1.08
ProCys: 0.471 ± 0.321
ProAsp: 1.886 ± 0.734
ProGlu: 2.357 ± 0.589
ProPhe: 2.122 ± 0.579
ProGly: 2.122 ± 0.796
ProHis: 0.471 ± 0.257
ProIle: 3.3 ± 0.966
ProLys: 4.715 ± 1.558
ProLeu: 2.829 ± 0.762
ProMet: 0.236 ± 0.255
ProAsn: 2.122 ± 0.557
ProPro: 1.414 ± 0.774
ProGln: 1.414 ± 0.598
ProArg: 2.122 ± 0.692
ProSer: 4.008 ± 0.918
ProThr: 2.357 ± 0.593
ProVal: 4.008 ± 0.888
ProTrp: 0.471 ± 0.301
ProTyr: 2.122 ± 0.726
ProXaa: 0.0 ± 0.0
Gln
GlnAla: 2.593 ± 1.035
GlnCys: 0.471 ± 0.324
GlnAsp: 1.886 ± 0.625
GlnGlu: 1.65 ± 0.839
GlnPhe: 0.943 ± 0.367
GlnGly: 2.357 ± 0.815
GlnHis: 0.471 ± 0.311
GlnIle: 2.122 ± 0.504
GlnLys: 3.536 ± 0.65
GlnLeu: 3.536 ± 0.92
GlnMet: 1.65 ± 0.666
GlnAsn: 1.886 ± 0.714
GlnPro: 1.414 ± 0.471
GlnGln: 1.65 ± 0.982
GlnArg: 1.886 ± 0.672
GlnSer: 1.65 ± 0.659
GlnThr: 1.886 ± 0.666
GlnVal: 3.065 ± 0.778
GlnTrp: 0.707 ± 0.376
GlnTyr: 1.65 ± 0.724
GlnXaa: 0.0 ± 0.0
Arg
ArgAla: 3.772 ± 0.666
ArgCys: 0.236 ± 0.226
ArgAsp: 3.065 ± 0.95
ArgGlu: 4.95 ± 1.077
ArgPhe: 1.414 ± 0.66
ArgGly: 2.357 ± 0.969
ArgHis: 0.236 ± 0.255
ArgIle: 3.536 ± 0.861
ArgLys: 5.186 ± 1.035
ArgLeu: 4.479 ± 1.072
ArgMet: 1.414 ± 0.566
ArgAsn: 1.65 ± 0.588
ArgPro: 2.357 ± 0.983
ArgGln: 1.65 ± 0.76
ArgArg: 2.122 ± 0.538
ArgSer: 2.357 ± 0.862
ArgThr: 1.414 ± 0.47
ArgVal: 3.772 ± 1.168
ArgTrp: 0.0 ± 0.0
ArgTyr: 0.943 ± 0.432
ArgXaa: 0.0 ± 0.0
Ser
SerAla: 2.593 ± 0.946
SerCys: 0.471 ± 0.257
SerAsp: 2.357 ± 0.785
SerGlu: 3.3 ± 0.666
SerPhe: 1.886 ± 0.653
SerGly: 5.658 ± 1.472
SerHis: 1.414 ± 0.429
SerIle: 5.422 ± 1.465
SerLys: 6.129 ± 1.041
SerLeu: 3.3 ± 0.899
SerMet: 1.414 ± 0.597
SerAsn: 2.593 ± 0.847
SerPro: 2.829 ± 0.726
SerGln: 2.122 ± 0.687
SerArg: 3.065 ± 0.998
SerSer: 4.243 ± 0.961
SerThr: 2.357 ± 0.677
SerVal: 3.536 ± 0.888
SerTrp: 0.943 ± 0.537
SerTyr: 2.357 ± 0.632
SerXaa: 0.0 ± 0.0
Thr
ThrAla: 3.065 ± 0.624
ThrCys: 0.0 ± 0.0
ThrAsp: 2.829 ± 1.03
ThrGlu: 2.593 ± 0.5
ThrPhe: 3.536 ± 0.748
ThrGly: 6.601 ± 1.224
ThrHis: 0.471 ± 0.381
ThrIle: 3.3 ± 0.678
ThrLys: 6.365 ± 1.206
ThrLeu: 6.365 ± 1.427
ThrMet: 0.471 ± 0.32
ThrAsn: 3.3 ± 0.731
ThrPro: 2.357 ± 0.603
ThrGln: 2.357 ± 1.159
ThrArg: 2.593 ± 0.683
ThrSer: 4.243 ± 1.284
ThrThr: 4.243 ± 1.139
ThrVal: 4.243 ± 0.795
ThrTrp: 0.943 ± 0.432
ThrTyr: 0.471 ± 0.274
ThrXaa: 0.0 ± 0.0
Val
ValAla: 4.95 ± 0.964
ValCys: 0.0 ± 0.0
ValAsp: 3.536 ± 0.889
ValGlu: 4.243 ± 0.872
ValPhe: 3.536 ± 0.876
ValGly: 4.95 ± 0.765
ValHis: 1.414 ± 0.49
ValIle: 2.593 ± 0.633
ValLys: 4.479 ± 0.854
ValLeu: 5.658 ± 0.884
ValMet: 1.414 ± 0.51
ValAsn: 2.357 ± 0.961
ValPro: 4.243 ± 0.881
ValGln: 1.65 ± 0.529
ValArg: 2.122 ± 0.744
ValSer: 4.008 ± 1.123
ValThr: 4.95 ± 1.32
ValVal: 5.186 ± 1.196
ValTrp: 1.414 ± 0.578
ValTyr: 3.065 ± 0.778
ValXaa: 0.0 ± 0.0
Trp
TrpAla: 1.65 ± 0.494
TrpCys: 0.471 ± 0.287
TrpAsp: 0.707 ± 0.315
TrpGlu: 0.943 ± 0.447
TrpPhe: 1.179 ± 0.586
TrpGly: 0.707 ± 0.399
TrpHis: 0.471 ± 0.452
TrpIle: 0.471 ± 0.303
TrpLys: 1.414 ± 0.454
TrpLeu: 0.943 ± 0.572
TrpMet: 0.0 ± 0.0
TrpAsn: 0.236 ± 0.222
TrpPro: 0.0 ± 0.0
TrpGln: 0.943 ± 0.634
TrpArg: 0.471 ± 0.299
TrpSer: 0.943 ± 0.444
TrpThr: 0.471 ± 0.303
TrpVal: 0.471 ± 0.283
TrpTrp: 0.0 ± 0.0
TrpTyr: 0.943 ± 0.567
TrpXaa: 0.0 ± 0.0
Tyr
TyrAla: 3.065 ± 0.999
TyrCys: 0.707 ± 0.346
TyrAsp: 3.536 ± 0.944
TyrGlu: 3.536 ± 0.876
TyrPhe: 1.179 ± 0.467
TyrGly: 3.772 ± 1.252
TyrHis: 1.179 ± 0.548
TyrIle: 1.886 ± 0.936
TyrLys: 4.479 ± 1.037
TyrLeu: 2.122 ± 0.586
TyrMet: 1.886 ± 0.475
TyrAsn: 2.593 ± 0.716
TyrPro: 1.65 ± 0.439
TyrGln: 0.943 ± 0.423
TyrArg: 1.179 ± 0.461
TyrSer: 1.886 ± 0.531
TyrThr: 2.122 ± 0.678
TyrVal: 2.593 ± 0.827
TyrTrp: 0.0 ± 0.0
TyrTyr: 2.357 ± 0.596
TyrXaa: 0.0 ± 0.0
Xaa
XaaAla: 0.0 ± 0.0
XaaCys: 0.0 ± 0.0
XaaAsp: 0.0 ± 0.0
XaaGlu: 0.0 ± 0.0
XaaPhe: 0.0 ± 0.0
XaaGly: 0.0 ± 0.0
XaaHis: 0.0 ± 0.0
XaaIle: 0.0 ± 0.0
XaaLys: 0.0 ± 0.0
XaaLeu: 0.0 ± 0.0
XaaMet: 0.0 ± 0.0
XaaAsn: 0.0 ± 0.0
XaaPro: 0.0 ± 0.0
XaaGln: 0.0 ± 0.0
XaaArg: 0.0 ± 0.0
XaaSer: 0.0 ± 0.0
XaaThr: 0.0 ± 0.0
XaaVal: 0.0 ± 0.0
XaaTrp: 0.0 ± 0.0
XaaTyr: 0.0 ± 0.0
XaaXaa: 0.0 ± 0.0
Statistics based on 21 proteins (4243 amino acids)

Note: The error has been estimated by bootstrapping (×100) at the protein level.
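A protein-level bootstrap of this kind could look roughly like the sketch below: whole proteins are resampled with replacement 100 times, the dipeptide frequency is recomputed for each resample, and the spread of those estimates is reported. The function names and toy sequences are hypothetical, and using the sample standard deviation as the error is an assumption; the source states only that bootstrapping (×100) was done at the protein level.

```python
import random
import statistics


def pair_frequency(proteins, pair):
    """Frequency of one dipeptide in per mille, counted within proteins."""
    count = 0
    total = 0
    for seq in proteins:
        for i in range(len(seq) - 1):
            total += 1
            if seq[i:i + 2] == pair:
                count += 1
    return 1000.0 * count / total if total else 0.0


def bootstrap_error(proteins, pair, n_resamples=100, seed=0):
    """Spread of the frequency over protein-level bootstrap resamples.

    Assumption: the error is the sample standard deviation of the
    resampled estimates.
    """
    rng = random.Random(seed)  # fixed seed keeps the sketch reproducible
    estimates = []
    for _ in range(n_resamples):
        # Resample whole proteins with replacement, keeping the set size.
        sample = rng.choices(proteins, k=len(proteins))
        estimates.append(pair_frequency(sample, pair))
    return statistics.stdev(estimates)


# Toy input for illustration only; one-letter codes.
toy = ["MKKLAEE", "AKKGG", "MEELK", "GGKTL"]
print(pair_frequency(toy, "KK"), bootstrap_error(toy, "KK"))
```

A dipeptide that occurs in no protein gets a frequency of 0.0 in every resample and therefore an error of 0.0, which matches the 0.0 ± 0.0 entries in the table.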

The dipeptide statistics above (among other statistics for this proteome) can be downloaded from this CSV file
See this proteome in: uniprot_link
Proteome-pI is available under the Creative Commons Attribution-NoDerivs license; for more details see here

Reference: Kozlowski LP. Proteome-pI 2.0: Proteome Isoelectric Point Database Update. Nucleic Acids Res. 2021, doi: 10.1093/nar/gkab944
Contact: Lukasz P. Kozlowski