Amino acid dipeptide frequency for Bacillus phage Aurora

Apart from single amino acid frequencies, one can also calculate the so-called amino acid dipeptide frequency, that is, how often each ordered pair of adjacent residues occurs.
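As a rough illustration of how such frequencies can be counted, the sketch below tallies overlapping residue pairs within each protein and reports them in per mille. It is not the database's own code: the function name `dipeptide_permille`, the use of one-letter codes (with `X` standing for Xaa), and the choice to normalise by the total number of dipeptides are all assumptions made for this example.

```python
from collections import Counter

# The 20 standard amino acids in one-letter code; anything else becomes X (Xaa).
STANDARD = set("ACDEFGHIKLMNPQRSTVWY")


def dipeptide_permille(proteins):
    """Return {dipeptide: frequency in per mille} pooled over all proteins.

    Assumption: frequencies are normalised by the total number of
    dipeptides counted; the database may use a different denominator.
    """
    counts = Counter()
    for seq in proteins:
        # Map non-standard or unknown residues to X.
        seq = "".join(aa if aa in STANDARD else "X" for aa in seq.upper())
        # Overlapping pairs inside one protein; pairs never span two proteins.
        counts.update(a + b for a, b in zip(seq, seq[1:]))
    total = sum(counts.values())
    if total == 0:
        return {}
    return {pair: 1000.0 * n / total for pair, n in counts.items()}


# Toy input (not phage data): "MKKLE" gives MK, KK, KL, LE; "MKE" gives MK, KE.
freqs = dipeptide_permille(["MKKLE", "MKE"])
```

With this toy input, `freqs["MK"]` is two out of six dipeptides, and all frequencies sum to 1000‰.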

There are 400 possible dipeptides (441 if Xaa is counted as an additional 21st amino acid representing all non-standard or unknown amino acids). Thus, if all dipeptides occurred uniformly at random in proteins, each would be present at a frequency of 0.25%. As this is not the case in nature, for better visibility dipeptides that are more frequent than expected are marked in red, and those that are underrepresented are marked in blue in the table.

All values are given in per mille (‰), so they need to be multiplied by 10⁻³ to obtain fractions. On this scale the uniform expectation of 0.25% corresponds to 2.5‰.
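The arithmetic behind the expected value and the over/under marking can be sketched as follows. The helper names are hypothetical, and the plain comparison against the uniform expectation (ignoring the error estimate) is an assumption about how the colouring is decided.

```python
def expected_permille(n_residue_types=20):
    """Uniform expectation for one dipeptide, in per mille."""
    return 1000.0 / (n_residue_types ** 2)


def permille_to_fraction(value_permille):
    """Convert a per mille value to a plain fraction."""
    return value_permille * 1e-3


def classify(value_permille, n_residue_types=20):
    """Label a dipeptide relative to the uniform expectation.

    Assumption: a simple comparison with the expected value; the
    source does not say whether the error estimate is considered.
    """
    expected = expected_permille(n_residue_types)
    if value_permille > expected:
        return "over"   # shown in red in the original table
    if value_permille < expected:
        return "under"  # shown in blue in the original table
    return "expected"


# Two values taken from the table: LysLys and AlaAla.
lyslys = classify(10.503)
alaala = classify(0.42)
```

Here `expected_permille()` is 2.5 for 20 residue types (about 2.27 if Xaa is counted as a 21st), so LysLys comes out as overrepresented and AlaAla as underrepresented.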

For more information, see the sequence space article on Wikipedia.

Each block below lists one row of the table (first residue); within a block the dipeptides follow the column order (second residue): Ala Cys Asp Glu Phe Gly His Ile Lys Leu Met Asn Pro Gln Arg Ser Thr Val Trp Tyr Xaa
Ala
AlaAla: 0.42 ± 0.237
AlaCys: 0.98 ± 0.367
AlaAsp: 1.961 ± 0.553
AlaGlu: 3.641 ± 0.779
AlaPhe: 2.941 ± 0.68
AlaGly: 3.501 ± 0.708
AlaHis: 0.14 ± 0.116
AlaIle: 2.661 ± 0.659
AlaLys: 4.901 ± 0.863
AlaLeu: 2.521 ± 0.501
AlaMet: 1.82 ± 0.485
AlaAsn: 2.661 ± 0.508
AlaPro: 1.26 ± 0.501
AlaGln: 2.241 ± 0.429
AlaArg: 1.68 ± 0.427
AlaSer: 1.961 ± 0.514
AlaThr: 2.661 ± 0.484
AlaVal: 2.661 ± 0.917
AlaTrp: 0.56 ± 0.233
AlaTyr: 2.241 ± 0.563
AlaXaa: 0.0 ± 0.0
Cys
CysAla: 0.28 ± 0.191
CysCys: 0.14 ± 0.128
CysAsp: 1.961 ± 0.547
CysGlu: 1.12 ± 0.328
CysPhe: 0.7 ± 0.265
CysGly: 0.84 ± 0.437
CysHis: 0.14 ± 0.111
CysIle: 0.42 ± 0.21
CysLys: 1.54 ± 0.512
CysLeu: 0.42 ± 0.239
CysMet: 0.42 ± 0.259
CysAsn: 0.42 ± 0.225
CysPro: 0.7 ± 0.378
CysGln: 0.14 ± 0.134
CysArg: 0.56 ± 0.327
CysSer: 0.28 ± 0.183
CysThr: 0.0 ± 0.0
CysVal: 0.7 ± 0.333
CysTrp: 0.28 ± 0.171
CysTyr: 0.42 ± 0.206
CysXaa: 0.0 ± 0.0
Asp
AspAla: 2.941 ± 0.585
AspCys: 1.12 ± 0.526
AspAsp: 2.941 ± 0.775
AspGlu: 4.761 ± 0.648
AspPhe: 4.621 ± 0.629
AspGly: 5.601 ± 1.234
AspHis: 0.98 ± 0.349
AspIle: 6.162 ± 0.972
AspLys: 5.601 ± 0.582
AspLeu: 3.081 ± 0.693
AspMet: 1.82 ± 0.559
AspAsn: 4.621 ± 0.717
AspPro: 2.521 ± 0.542
AspGln: 2.241 ± 0.51
AspArg: 1.82 ± 0.484
AspSer: 2.521 ± 0.61
AspThr: 3.081 ± 0.799
AspVal: 5.882 ± 0.801
AspTrp: 0.42 ± 0.211
AspTyr: 2.801 ± 0.579
AspXaa: 0.0 ± 0.0
Glu
GluAla: 3.641 ± 0.745
GluCys: 0.56 ± 0.258
GluAsp: 3.781 ± 0.568
GluGlu: 7.002 ± 1.569
GluPhe: 4.341 ± 0.863
GluGly: 4.341 ± 0.676
GluHis: 0.84 ± 0.32
GluIle: 5.741 ± 0.951
GluLys: 5.882 ± 1.201
GluLeu: 9.382 ± 1.562
GluMet: 2.941 ± 0.722
GluAsn: 5.181 ± 0.921
GluPro: 0.84 ± 0.288
GluGln: 2.941 ± 0.506
GluArg: 3.501 ± 0.757
GluSer: 4.901 ± 0.834
GluThr: 4.201 ± 0.942
GluVal: 5.041 ± 0.903
GluTrp: 1.12 ± 0.294
GluTyr: 3.781 ± 0.827
GluXaa: 0.0 ± 0.0
Phe
PheAla: 1.82 ± 0.471
PheCys: 0.42 ± 0.249
PheAsp: 3.921 ± 0.676
PheGlu: 3.921 ± 0.549
PhePhe: 1.82 ± 0.419
PheGly: 1.961 ± 0.571
PheHis: 0.7 ± 0.336
PheIle: 4.341 ± 0.823
PheLys: 4.481 ± 0.904
PheLeu: 2.941 ± 0.582
PheMet: 1.961 ± 0.409
PheAsn: 2.101 ± 0.595
PhePro: 0.98 ± 0.374
PheGln: 0.98 ± 0.385
PheArg: 0.84 ± 0.302
PheSer: 3.641 ± 0.761
PheThr: 3.221 ± 0.723
PheVal: 2.941 ± 0.579
PheTrp: 0.42 ± 0.221
PheTyr: 2.661 ± 0.543
PheXaa: 0.0 ± 0.0
Gly
GlyAla: 2.521 ± 0.625
GlyCys: 0.56 ± 0.27
GlyAsp: 3.221 ± 0.676
GlyGlu: 5.601 ± 0.897
GlyPhe: 3.501 ± 0.562
GlyGly: 3.221 ± 0.759
GlyHis: 0.7 ± 0.391
GlyIle: 4.901 ± 0.838
GlyLys: 6.302 ± 0.964
GlyLeu: 4.201 ± 0.542
GlyMet: 2.381 ± 0.536
GlyAsn: 5.041 ± 1.005
GlyPro: 0.56 ± 0.236
GlyGln: 1.68 ± 0.507
GlyArg: 2.381 ± 0.589
GlySer: 4.761 ± 0.689
GlyThr: 5.601 ± 1.204
GlyVal: 3.641 ± 0.594
GlyTrp: 0.14 ± 0.151
GlyTyr: 3.081 ± 0.64
GlyXaa: 0.0 ± 0.0
His
HisAla: 0.84 ± 0.261
HisCys: 0.14 ± 0.128
HisAsp: 0.98 ± 0.402
HisGlu: 0.14 ± 0.111
HisPhe: 0.84 ± 0.254
HisGly: 0.7 ± 0.361
HisHis: 0.56 ± 0.265
HisIle: 1.4 ± 0.321
HisLys: 1.54 ± 0.449
HisLeu: 1.4 ± 0.362
HisMet: 0.42 ± 0.284
HisAsn: 0.56 ± 0.196
HisPro: 0.14 ± 0.116
HisGln: 0.28 ± 0.18
HisArg: 0.28 ± 0.18
HisSer: 0.42 ± 0.217
HisThr: 1.26 ± 0.461
HisVal: 0.84 ± 0.338
HisTrp: 0.0 ± 0.0
HisTyr: 1.26 ± 0.45
HisXaa: 0.0 ± 0.0
Ile
IleAla: 3.921 ± 0.772
IleCys: 0.42 ± 0.229
IleAsp: 6.722 ± 0.762
IleGlu: 6.862 ± 1.202
IlePhe: 1.68 ± 0.418
IleGly: 4.201 ± 0.602
IleHis: 1.12 ± 0.429
IleIle: 4.761 ± 0.824
IleLys: 6.722 ± 0.941
IleLeu: 2.801 ± 0.636
IleMet: 2.381 ± 0.584
IleAsn: 5.181 ± 0.738
IlePro: 1.4 ± 0.401
IleGln: 2.941 ± 0.591
IleArg: 4.201 ± 0.859
IleSer: 3.221 ± 0.668
IleThr: 4.201 ± 0.647
IleVal: 4.201 ± 0.742
IleTrp: 0.84 ± 0.414
IleTyr: 3.081 ± 0.748
IleXaa: 0.0 ± 0.0
Lys
LysAla: 4.341 ± 1.187
LysCys: 1.12 ± 0.406
LysAsp: 6.862 ± 1.193
LysGlu: 9.382 ± 1.623
LysPhe: 3.921 ± 0.794
LysGly: 6.722 ± 0.801
LysHis: 1.4 ± 0.416
LysIle: 5.601 ± 0.637
LysLys: 10.503 ± 1.088
LysLeu: 6.722 ± 1.071
LysMet: 3.921 ± 0.803
LysAsn: 6.302 ± 0.941
LysPro: 0.98 ± 0.403
LysGln: 3.361 ± 0.863
LysArg: 4.621 ± 0.818
LysSer: 3.781 ± 0.701
LysThr: 5.741 ± 0.898
LysVal: 7.142 ± 0.961
LysTrp: 0.98 ± 0.324
LysTyr: 4.201 ± 0.866
LysXaa: 0.0 ± 0.0
Leu
LeuAla: 2.241 ± 0.506
LeuCys: 0.42 ± 0.228
LeuAsp: 5.041 ± 0.76
LeuGlu: 5.041 ± 0.822
LeuPhe: 2.241 ± 0.602
LeuGly: 3.921 ± 0.687
LeuHis: 1.68 ± 0.402
LeuIle: 4.341 ± 0.801
LeuLys: 7.282 ± 1.346
LeuLeu: 3.221 ± 0.647
LeuMet: 2.241 ± 0.552
LeuAsn: 4.481 ± 0.748
LeuPro: 1.82 ± 0.446
LeuGln: 2.941 ± 0.709
LeuArg: 3.501 ± 0.809
LeuSer: 3.641 ± 0.638
LeuThr: 5.181 ± 0.864
LeuVal: 3.921 ± 0.71
LeuTrp: 0.98 ± 0.38
LeuTyr: 3.501 ± 0.54
LeuXaa: 0.0 ± 0.0
Met
MetAla: 0.84 ± 0.286
MetCys: 0.56 ± 0.288
MetAsp: 2.381 ± 0.602
MetGlu: 2.101 ± 0.543
MetPhe: 1.961 ± 0.576
MetGly: 1.82 ± 0.448
MetHis: 0.14 ± 0.128
MetIle: 1.82 ± 0.674
MetLys: 2.801 ± 0.607
MetLeu: 3.221 ± 0.77
MetMet: 1.26 ± 0.504
MetAsn: 3.361 ± 0.686
MetPro: 0.56 ± 0.247
MetGln: 0.98 ± 0.337
MetArg: 1.82 ± 0.462
MetSer: 2.381 ± 0.514
MetThr: 1.4 ± 0.425
MetVal: 1.54 ± 0.446
MetTrp: 0.98 ± 0.337
MetTyr: 1.961 ± 0.513
MetXaa: 0.0 ± 0.0
Asn
AsnAla: 2.381 ± 0.606
AsnCys: 1.12 ± 0.462
AsnAsp: 4.621 ± 0.766
AsnGlu: 5.181 ± 0.736
AsnPhe: 3.081 ± 0.681
AsnGly: 4.341 ± 0.775
AsnHis: 0.56 ± 0.268
AsnIle: 5.181 ± 0.741
AsnLys: 6.862 ± 1.19
AsnLeu: 5.181 ± 0.734
AsnMet: 1.54 ± 0.491
AsnAsn: 6.302 ± 1.06
AsnPro: 2.241 ± 0.655
AsnGln: 1.961 ± 0.537
AsnArg: 2.381 ± 0.563
AsnSer: 5.041 ± 0.731
AsnThr: 4.061 ± 1.032
AsnVal: 4.481 ± 0.867
AsnTrp: 0.98 ± 0.292
AsnTyr: 2.801 ± 0.811
AsnXaa: 0.0 ± 0.0
Pro
ProAla: 0.84 ± 0.299
ProCys: 0.42 ± 0.221
ProAsp: 1.68 ± 0.523
ProGlu: 1.68 ± 0.472
ProPhe: 0.98 ± 0.516
ProGly: 0.7 ± 0.25
ProHis: 0.14 ± 0.13
ProIle: 1.54 ± 0.385
ProLys: 2.521 ± 0.539
ProLeu: 0.98 ± 0.387
ProMet: 0.84 ± 0.386
ProAsn: 1.961 ± 0.59
ProPro: 0.84 ± 0.252
ProGln: 0.84 ± 0.382
ProArg: 0.7 ± 0.34
ProSer: 1.4 ± 0.476
ProThr: 1.82 ± 0.448
ProVal: 1.82 ± 0.449
ProTrp: 0.28 ± 0.187
ProTyr: 1.82 ± 0.574
ProXaa: 0.0 ± 0.0
Gln
GlnAla: 1.54 ± 0.432
GlnCys: 0.14 ± 0.111
GlnAsp: 1.4 ± 0.521
GlnGlu: 1.54 ± 0.533
GlnPhe: 1.12 ± 0.429
GlnGly: 2.381 ± 0.682
GlnHis: 0.28 ± 0.168
GlnIle: 1.82 ± 0.517
GlnLys: 2.941 ± 0.627
GlnLeu: 3.781 ± 0.759
GlnMet: 0.7 ± 0.277
GlnAsn: 2.521 ± 0.445
GlnPro: 1.26 ± 0.36
GlnGln: 0.84 ± 0.308
GlnArg: 1.26 ± 0.32
GlnSer: 2.241 ± 0.658
GlnThr: 2.521 ± 0.586
GlnVal: 1.961 ± 0.545
GlnTrp: 0.28 ± 0.189
GlnTyr: 2.661 ± 0.583
GlnXaa: 0.0 ± 0.0
Arg
ArgAla: 2.661 ± 0.444
ArgCys: 0.42 ± 0.223
ArgAsp: 2.241 ± 0.486
ArgGlu: 3.221 ± 0.676
ArgPhe: 2.381 ± 0.558
ArgGly: 2.101 ± 0.591
ArgHis: 0.28 ± 0.158
ArgIle: 2.661 ± 0.636
ArgLys: 4.621 ± 0.925
ArgLeu: 2.241 ± 0.505
ArgMet: 1.4 ± 0.435
ArgAsn: 2.521 ± 0.742
ArgPro: 1.12 ± 0.381
ArgGln: 1.54 ± 0.452
ArgArg: 2.241 ± 0.602
ArgSer: 1.961 ± 0.576
ArgThr: 1.54 ± 0.402
ArgVal: 2.101 ± 0.55
ArgTrp: 0.42 ± 0.191
ArgTyr: 1.961 ± 0.646
ArgXaa: 0.0 ± 0.0
Ser
SerAla: 2.801 ± 0.549
SerCys: 0.84 ± 0.283
SerAsp: 3.921 ± 0.662
SerGlu: 3.781 ± 0.695
SerPhe: 2.521 ± 0.557
SerGly: 2.941 ± 0.553
SerHis: 1.12 ± 0.365
SerIle: 4.061 ± 0.919
SerLys: 4.621 ± 0.912
SerLeu: 4.061 ± 0.787
SerMet: 1.68 ± 0.377
SerAsn: 4.341 ± 0.783
SerPro: 0.7 ± 0.314
SerGln: 2.101 ± 0.559
SerArg: 2.101 ± 0.48
SerSer: 3.081 ± 0.975
SerThr: 3.361 ± 0.842
SerVal: 3.221 ± 0.622
SerTrp: 0.14 ± 0.134
SerTyr: 2.661 ± 0.526
SerXaa: 0.0 ± 0.0
Thr
ThrAla: 2.801 ± 0.794
ThrCys: 0.7 ± 0.316
ThrAsp: 3.081 ± 0.716
ThrGlu: 4.341 ± 0.873
ThrPhe: 2.801 ± 0.445
ThrGly: 5.041 ± 1.171
ThrHis: 0.98 ± 0.352
ThrIle: 5.461 ± 0.675
ThrLys: 5.461 ± 0.782
ThrLeu: 5.041 ± 0.878
ThrMet: 2.101 ± 0.494
ThrAsn: 4.201 ± 0.898
ThrPro: 1.4 ± 0.399
ThrGln: 1.26 ± 0.409
ThrArg: 1.82 ± 0.513
ThrSer: 3.781 ± 0.776
ThrThr: 3.921 ± 1.22
ThrVal: 4.201 ± 0.649
ThrTrp: 0.56 ± 0.267
ThrTyr: 1.68 ± 0.493
ThrXaa: 0.0 ± 0.0
Val
ValAla: 3.221 ± 0.723
ValCys: 0.14 ± 0.138
ValAsp: 5.041 ± 0.795
ValGlu: 5.601 ± 0.869
ValPhe: 1.54 ± 0.451
ValGly: 5.882 ± 0.732
ValHis: 1.12 ± 0.319
ValIle: 4.481 ± 0.807
ValLys: 7.282 ± 0.872
ValLeu: 3.501 ± 0.607
ValMet: 1.68 ± 0.588
ValAsn: 3.221 ± 0.665
ValPro: 2.941 ± 0.704
ValGln: 1.82 ± 0.517
ValArg: 1.54 ± 0.45
ValSer: 3.221 ± 0.728
ValThr: 4.341 ± 0.956
ValVal: 4.201 ± 0.816
ValTrp: 1.12 ± 0.364
ValTyr: 2.521 ± 0.629
ValXaa: 0.0 ± 0.0
Trp
TrpAla: 0.14 ± 0.131
TrpCys: 0.42 ± 0.264
TrpAsp: 0.42 ± 0.22
TrpGlu: 0.84 ± 0.375
TrpPhe: 1.12 ± 0.357
TrpGly: 0.7 ± 0.257
TrpHis: 0.56 ± 0.212
TrpIle: 0.56 ± 0.223
TrpLys: 1.12 ± 0.371
TrpLeu: 0.84 ± 0.315
TrpMet: 0.56 ± 0.281
TrpAsn: 1.26 ± 0.322
TrpPro: 0.0 ± 0.0
TrpGln: 0.84 ± 0.259
TrpArg: 0.7 ± 0.301
TrpSer: 0.14 ± 0.118
TrpThr: 0.28 ± 0.193
TrpVal: 0.56 ± 0.301
TrpTrp: 0.28 ± 0.302
TrpTyr: 0.56 ± 0.297
TrpXaa: 0.0 ± 0.0
Tyr
TyrAla: 3.221 ± 0.493
TyrCys: 0.84 ± 0.339
TyrAsp: 3.361 ± 0.608
TyrGlu: 4.201 ± 0.84
TyrPhe: 1.961 ± 0.452
TyrGly: 3.081 ± 0.638
TyrHis: 0.56 ± 0.264
TyrIle: 2.941 ± 0.661
TyrLys: 4.761 ± 0.8
TyrLeu: 2.101 ± 0.614
TyrMet: 1.68 ± 0.463
TyrAsn: 4.061 ± 0.564
TyrPro: 1.54 ± 0.393
TyrGln: 1.26 ± 0.334
TyrArg: 1.68 ± 0.411
TyrSer: 1.68 ± 0.497
TyrThr: 2.241 ± 0.43
TyrVal: 3.361 ± 0.703
TyrTrp: 1.12 ± 0.449
TyrTyr: 1.961 ± 0.395
TyrXaa: 0.0 ± 0.0
Xaa
XaaAla: 0.0 ± 0.0
XaaCys: 0.0 ± 0.0
XaaAsp: 0.0 ± 0.0
XaaGlu: 0.0 ± 0.0
XaaPhe: 0.0 ± 0.0
XaaGly: 0.0 ± 0.0
XaaHis: 0.0 ± 0.0
XaaIle: 0.0 ± 0.0
XaaLys: 0.0 ± 0.0
XaaLeu: 0.0 ± 0.0
XaaMet: 0.0 ± 0.0
XaaAsn: 0.0 ± 0.0
XaaPro: 0.0 ± 0.0
XaaGln: 0.0 ± 0.0
XaaArg: 0.0 ± 0.0
XaaSer: 0.0 ± 0.0
XaaThr: 0.0 ± 0.0
XaaVal: 0.0 ± 0.0
XaaTrp: 0.0 ± 0.0
XaaTyr: 0.0 ± 0.0
XaaXaa: 0.0 ± 0.0
Statistics based on 40 proteins (7142 amino acids)

Note: The error was estimated by bootstrapping (×100) at the protein level.
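A protein-level bootstrap of this kind can be sketched as below: whole proteins are resampled with replacement, the frequency is recomputed for each resample, and the spread of the resulting estimates is reported. This is only an illustration under assumptions. The function names, the fixed seed, and the use of the sample standard deviation as the error are choices made for this example; the source states only that 100 bootstrap rounds were run at the protein level.

```python
import random
import statistics
from collections import Counter


def pair_permille(proteins, pair):
    """Frequency of one dipeptide (one-letter codes) in per mille."""
    counts = Counter()
    for seq in proteins:
        # Overlapping pairs inside each protein only.
        counts.update(a + b for a, b in zip(seq, seq[1:]))
    total = sum(counts.values())
    return 1000.0 * counts[pair] / total if total else 0.0


def bootstrap_error(proteins, pair, n_rounds=100, seed=0):
    """Spread of the frequency over protein-level bootstrap resamples.

    Assumption: the error is the sample standard deviation of the
    resampled estimates; the database may define it differently.
    """
    rng = random.Random(seed)
    estimates = []
    for _ in range(n_rounds):
        # Draw as many proteins as in the original set, with replacement.
        sample = rng.choices(proteins, k=len(proteins))
        estimates.append(pair_permille(sample, pair))
    return statistics.stdev(estimates)


# Toy sequences for illustration only (not the phage proteome).
proteins = ["MKKLEEK", "MAKKE", "MLEKK", "MKEL"]
err = bootstrap_error(proteins, "KK")
```

Resampling whole proteins rather than individual dipeptides keeps each protein's composition intact, so the error reflects variation between proteins.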

The dipeptide statistics above (among other statistics for this proteome) can be downloaded from this CSV file
See this proteome in: uniprot_link
Proteome-pI is available under the Creative Commons Attribution-NoDerivs license; for more details see here

Reference: Kozlowski LP. Proteome-pI 2.0: Proteome Isoelectric Point Database Update. Nucleic Acids Res. 2021, doi: 10.1093/nar/gkab944
Contact: Lukasz P. Kozlowski