Amino acid dipeptide frequency for Bacillus phage phi29 (Bacteriophage phi-29)

Apart from single amino acid frequencies, one can also calculate the so-called amino acid dipeptide frequency, i.e. how often each ordered pair of adjacent amino acids occurs.

There are 400 possible dipeptides (441 if we count Xaa as an additional 21st amino acid representing all non-standard or unknown amino acids). Thus, if dipeptides occurred randomly in proteins, with all 20 amino acids equally likely, each dipeptide would be present with a frequency of 0.25% (1/400, i.e. 2.5‰). As this is not the case in nature, for better visibility all dipeptides that are more frequent than expected are marked in red, and those that are underrepresented are marked in blue in the table.
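As a minimal sketch of the calculation described above, the following Python snippet counts overlapping adjacent residue pairs within each protein and reports them in per mille. The function name `dipeptide_frequencies`, the one-letter input sequences, and the normalisation by the total number of counted pairs are assumptions made for illustration; the exact procedure used to produce the table on this page may differ.

```python
from collections import Counter
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues, one-letter codes

def dipeptide_frequencies(proteins):
    """Return {dipeptide: frequency in per mille} for a list of sequences.

    Pairs are counted with a sliding window inside each protein, so no
    pair spans two proteins. Non-standard letters are mapped to 'X' (Xaa).
    Normalising by the number of counted pairs is an assumption.
    """
    counts = Counter()
    for seq in proteins:
        seq = "".join(c if c in AMINO_ACIDS else "X" for c in seq.upper())
        counts.update(seq[i:i + 2] for i in range(len(seq) - 1))
    total = sum(counts.values())
    alphabet = AMINO_ACIDS + "X"
    return {
        a + b: (1000.0 * counts[a + b] / total if total else 0.0)
        for a, b in product(alphabet, repeat=2)
    }

# Uniform expectation over the 400 standard dipeptides: 1/400 = 2.5 per mille.
EXPECTED_PER_MILLE = 1000.0 / 400

# Toy input (hypothetical sequences, not taken from the phi29 proteome).
freqs = dipeptide_frequencies(["MKKLA", "MKV"])
over = sorted(d for d, f in freqs.items() if f > EXPECTED_PER_MILLE)
```

With only six pairs in the toy input, every observed dipeptide lands far above 2.5‰; on a real proteome the same comparison separates over- and underrepresented pairs.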

All values are given in per mille (‰) and therefore need to be multiplied by 10⁻³ to obtain fractions.
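The unit conversion can be sketched in a few lines of Python. The helper names below are hypothetical, and the example value is the AlaAla entry from the table on this page.

```python
def per_mille_to_fraction(value_per_mille):
    """Convert a per mille value to a plain fraction (multiply by 10^-3)."""
    return value_per_mille * 1e-3

def per_mille_to_percent(value_per_mille):
    """Convert a per mille value to percent (divide by 10)."""
    return value_per_mille / 10.0

ala_ala = 3.361  # AlaAla value from the table, in per mille
fraction = per_mille_to_fraction(ala_ala)  # about 0.003361
percent = per_mille_to_percent(ala_ala)    # about 0.3361 percent
```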

For more information, see the sequence space article on Wikipedia.

Ala Cys Asp Glu Phe Gly His Ile Lys Leu Met Asn Pro Gln Arg Ser Thr Val Trp Tyr Xaa
Ala
AlaAla: 3.361 ± 0.901
AlaCys: 0.8 ± 0.336
AlaAsp: 3.201 ± 0.805
AlaGlu: 4.001 ± 0.769
AlaPhe: 2.561 ± 0.508
AlaGly: 3.361 ± 0.808
AlaHis: 0.8 ± 0.246
AlaIle: 2.881 ± 0.742
AlaLys: 4.321 ± 0.663
AlaLeu: 4.962 ± 0.82
AlaMet: 1.12 ± 0.439
AlaAsn: 2.721 ± 0.668
AlaPro: 1.601 ± 0.663
AlaGln: 2.721 ± 0.686
AlaArg: 2.561 ± 0.546
AlaSer: 4.161 ± 0.7
AlaThr: 3.361 ± 0.926
AlaVal: 4.962 ± 1.199
AlaTrp: 1.12 ± 0.435
AlaTyr: 3.041 ± 0.604
AlaXaa: 0.0 ± 0.0
Cys
CysAla: 0.32 ± 0.191
CysCys: 0.0 ± 0.0
CysAsp: 0.48 ± 0.314
CysGlu: 0.8 ± 0.312
CysPhe: 0.16 ± 0.157
CysGly: 0.48 ± 0.253
CysHis: 0.32 ± 0.221
CysIle: 0.32 ± 0.197
CysLys: 0.0 ± 0.0
CysLeu: 0.48 ± 0.244
CysMet: 0.32 ± 0.253
CysAsn: 0.32 ± 0.174
CysPro: 0.16 ± 0.175
CysGln: 0.0 ± 0.0
CysArg: 0.16 ± 0.153
CysSer: 0.32 ± 0.221
CysThr: 0.32 ± 0.175
CysVal: 0.96 ± 0.403
CysTrp: 0.16 ± 0.161
CysTyr: 0.96 ± 0.319
CysXaa: 0.0 ± 0.0
Asp
AspAla: 2.561 ± 0.513
AspCys: 0.64 ± 0.21
AspAsp: 4.321 ± 0.611
AspGlu: 4.321 ± 0.859
AspPhe: 3.361 ± 0.642
AspGly: 6.402 ± 1.152
AspHis: 0.8 ± 0.308
AspIle: 5.602 ± 1.128
AspLys: 4.641 ± 0.953
AspLeu: 5.442 ± 1.126
AspMet: 1.921 ± 0.644
AspAsn: 4.641 ± 0.665
AspPro: 2.081 ± 0.502
AspGln: 0.8 ± 0.329
AspArg: 1.761 ± 0.656
AspSer: 3.361 ± 0.645
AspThr: 3.361 ± 0.985
AspVal: 4.161 ± 0.808
AspTrp: 0.8 ± 0.432
AspTyr: 3.201 ± 1.163
AspXaa: 0.0 ± 0.0
Glu
GluAla: 3.841 ± 0.882
GluCys: 0.0 ± 0.0
GluAsp: 4.001 ± 0.941
GluGlu: 4.802 ± 0.672
GluPhe: 1.601 ± 0.423
GluGly: 3.841 ± 0.774
GluHis: 1.28 ± 0.432
GluIle: 5.442 ± 0.645
GluLys: 5.602 ± 1.09
GluLeu: 6.722 ± 1.217
GluMet: 2.241 ± 0.523
GluAsn: 4.962 ± 0.804
GluPro: 0.8 ± 0.408
GluGln: 3.201 ± 0.966
GluArg: 2.721 ± 0.563
GluSer: 3.041 ± 0.716
GluThr: 4.481 ± 1.031
GluVal: 4.001 ± 0.992
GluTrp: 1.12 ± 0.56
GluTyr: 3.681 ± 0.777
GluXaa: 0.0 ± 0.0
Phe
PheAla: 2.881 ± 0.822
PheCys: 0.16 ± 0.175
PheAsp: 3.841 ± 0.719
PheGlu: 4.001 ± 0.796
PhePhe: 1.44 ± 0.551
PheGly: 2.241 ± 0.63
PheHis: 0.8 ± 0.309
PheIle: 2.881 ± 0.814
PheLys: 4.802 ± 1.445
PheLeu: 3.201 ± 0.866
PheMet: 1.28 ± 0.274
PheAsn: 3.361 ± 0.765
PhePro: 1.601 ± 0.375
PheGln: 0.48 ± 0.317
PheArg: 1.761 ± 0.453
PheSer: 2.721 ± 0.564
PheThr: 2.081 ± 0.496
PheVal: 2.241 ± 0.655
PheTrp: 0.16 ± 0.195
PheTyr: 1.921 ± 0.518
PheXaa: 0.0 ± 0.0
Gly
GlyAla: 3.841 ± 0.893
GlyCys: 0.32 ± 0.245
GlyAsp: 5.442 ± 1.131
GlyGlu: 3.201 ± 0.678
GlyPhe: 3.841 ± 0.895
GlyGly: 5.922 ± 1.263
GlyHis: 1.28 ± 0.482
GlyIle: 4.481 ± 1.41
GlyLys: 4.161 ± 0.728
GlyLeu: 5.762 ± 0.806
GlyMet: 1.28 ± 0.641
GlyAsn: 5.282 ± 1.204
GlyPro: 0.0 ± 0.0
GlyGln: 2.241 ± 0.534
GlyArg: 1.44 ± 0.456
GlySer: 4.481 ± 0.736
GlyThr: 4.962 ± 0.595
GlyVal: 4.962 ± 1.07
GlyTrp: 0.8 ± 0.463
GlyTyr: 3.041 ± 0.719
GlyXaa: 0.0 ± 0.0
His
HisAla: 0.64 ± 0.301
HisCys: 0.0 ± 0.0
HisAsp: 0.8 ± 0.28
HisGlu: 1.44 ± 0.526
HisPhe: 0.96 ± 0.432
HisGly: 1.28 ± 0.461
HisHis: 0.48 ± 0.301
HisIle: 1.12 ± 0.394
HisLys: 1.761 ± 0.455
HisLeu: 1.28 ± 0.422
HisMet: 0.48 ± 0.203
HisAsn: 0.96 ± 0.31
HisPro: 0.32 ± 0.197
HisGln: 0.32 ± 0.205
HisArg: 0.0 ± 0.0
HisSer: 0.48 ± 0.26
HisThr: 0.48 ± 0.223
HisVal: 1.44 ± 0.381
HisTrp: 0.16 ± 0.146
HisTyr: 0.96 ± 0.382
HisXaa: 0.0 ± 0.0
Ile
IleAla: 3.841 ± 0.588
IleCys: 0.48 ± 0.244
IleAsp: 4.962 ± 0.844
IleGlu: 5.122 ± 1.093
IlePhe: 2.881 ± 0.863
IleGly: 4.321 ± 0.779
IleHis: 1.28 ± 0.407
IleIle: 4.161 ± 0.942
IleLys: 6.242 ± 1.223
IleLeu: 3.841 ± 1.088
IleMet: 0.96 ± 0.481
IleAsn: 6.562 ± 1.007
IlePro: 2.401 ± 0.595
IleGln: 2.081 ± 0.716
IleArg: 3.681 ± 0.658
IleSer: 4.161 ± 0.913
IleThr: 4.481 ± 0.666
IleVal: 3.681 ± 0.734
IleTrp: 0.32 ± 0.175
IleTyr: 2.721 ± 0.651
IleXaa: 0.0 ± 0.0
Lys
LysAla: 3.841 ± 0.699
LysCys: 0.16 ± 0.153
LysAsp: 5.122 ± 0.912
LysGlu: 4.962 ± 1.219
LysPhe: 3.201 ± 0.782
LysGly: 4.001 ± 0.697
LysHis: 0.64 ± 0.348
LysIle: 5.602 ± 0.781
LysLys: 5.762 ± 1.121
LysLeu: 6.722 ± 1.064
LysMet: 4.001 ± 0.972
LysAsn: 4.321 ± 0.702
LysPro: 2.081 ± 0.596
LysGln: 2.881 ± 0.473
LysArg: 4.161 ± 0.719
LysSer: 4.481 ± 0.876
LysThr: 4.802 ± 0.694
LysVal: 5.442 ± 0.863
LysTrp: 1.28 ± 0.475
LysTyr: 1.761 ± 0.463
LysXaa: 0.0 ± 0.0
Leu
LeuAla: 4.161 ± 0.887
LeuCys: 0.8 ± 0.351
LeuAsp: 4.161 ± 0.637
LeuGlu: 7.362 ± 1.514
LeuPhe: 3.841 ± 0.817
LeuGly: 4.161 ± 0.713
LeuHis: 1.12 ± 0.413
LeuIle: 4.641 ± 0.679
LeuLys: 6.402 ± 1.179
LeuLeu: 4.641 ± 1.141
LeuMet: 2.721 ± 0.554
LeuAsn: 5.922 ± 0.757
LeuPro: 3.041 ± 0.548
LeuGln: 2.881 ± 0.82
LeuArg: 3.041 ± 0.762
LeuSer: 7.042 ± 0.865
LeuThr: 5.282 ± 0.739
LeuVal: 5.442 ± 0.782
LeuTrp: 0.64 ± 0.335
LeuTyr: 3.041 ± 0.725
LeuXaa: 0.0 ± 0.0
Met
MetAla: 1.44 ± 0.374
MetCys: 0.16 ± 0.146
MetAsp: 1.28 ± 0.455
MetGlu: 1.601 ± 0.516
MetPhe: 0.96 ± 0.643
MetGly: 1.601 ± 0.501
MetHis: 0.16 ± 0.157
MetIle: 2.721 ± 0.8
MetLys: 2.721 ± 0.646
MetLeu: 1.601 ± 0.506
MetMet: 1.28 ± 0.364
MetAsn: 1.12 ± 0.308
MetPro: 0.96 ± 0.349
MetGln: 1.44 ± 0.512
MetArg: 1.28 ± 0.426
MetSer: 1.44 ± 0.35
MetThr: 1.44 ± 0.458
MetVal: 2.561 ± 0.745
MetTrp: 0.16 ± 0.176
MetTyr: 1.12 ± 0.426
MetXaa: 0.0 ± 0.0
Asn
AsnAla: 4.962 ± 0.966
AsnCys: 0.64 ± 0.491
AsnAsp: 5.122 ± 1.131
AsnGlu: 3.201 ± 0.712
AsnPhe: 1.921 ± 0.477
AsnGly: 4.641 ± 0.862
AsnHis: 0.8 ± 0.351
AsnIle: 3.521 ± 0.857
AsnLys: 5.602 ± 1.07
AsnLeu: 4.481 ± 0.557
AsnMet: 2.081 ± 0.483
AsnAsn: 4.001 ± 1.026
AsnPro: 3.041 ± 0.565
AsnGln: 2.881 ± 0.746
AsnArg: 2.241 ± 0.578
AsnSer: 3.841 ± 0.378
AsnThr: 4.962 ± 1.021
AsnVal: 4.802 ± 0.723
AsnTrp: 0.64 ± 0.21
AsnTyr: 3.041 ± 0.713
AsnXaa: 0.0 ± 0.0
Pro
ProAla: 1.44 ± 0.407
ProCys: 0.16 ± 0.146
ProAsp: 2.241 ± 0.676
ProGlu: 1.601 ± 0.49
ProPhe: 1.601 ± 0.658
ProGly: 0.48 ± 0.223
ProHis: 0.64 ± 0.348
ProIle: 0.96 ± 0.421
ProLys: 1.44 ± 0.546
ProLeu: 2.401 ± 0.474
ProMet: 0.32 ± 0.191
ProAsn: 3.041 ± 0.61
ProPro: 0.48 ± 0.225
ProGln: 1.12 ± 0.402
ProArg: 1.761 ± 0.388
ProSer: 1.761 ± 0.571
ProThr: 2.401 ± 0.566
ProVal: 2.241 ± 0.859
ProTrp: 0.16 ± 0.157
ProTyr: 2.241 ± 0.472
ProXaa: 0.0 ± 0.0
Gln
GlnAla: 2.881 ± 0.786
GlnCys: 0.0 ± 0.0
GlnAsp: 1.44 ± 0.308
GlnGlu: 2.721 ± 0.726
GlnPhe: 2.241 ± 0.532
GlnGly: 3.041 ± 0.59
GlnHis: 0.64 ± 0.308
GlnIle: 2.081 ± 0.586
GlnLys: 1.921 ± 0.574
GlnLeu: 3.521 ± 0.74
GlnMet: 0.64 ± 0.255
GlnAsn: 1.28 ± 0.342
GlnPro: 0.32 ± 0.204
GlnGln: 0.8 ± 0.37
GlnArg: 1.761 ± 0.698
GlnSer: 1.761 ± 0.519
GlnThr: 1.601 ± 0.527
GlnVal: 2.881 ± 0.681
GlnTrp: 0.64 ± 0.334
GlnTyr: 1.44 ± 0.435
GlnXaa: 0.0 ± 0.0
Arg
ArgAla: 2.401 ± 0.696
ArgCys: 0.32 ± 0.232
ArgAsp: 1.601 ± 0.62
ArgGlu: 3.041 ± 0.599
ArgPhe: 1.921 ± 0.861
ArgGly: 3.041 ± 0.461
ArgHis: 0.48 ± 0.299
ArgIle: 3.041 ± 0.698
ArgLys: 2.561 ± 0.779
ArgLeu: 3.841 ± 0.774
ArgMet: 1.601 ± 0.483
ArgAsn: 2.401 ± 0.487
ArgPro: 1.12 ± 0.3
ArgGln: 0.96 ± 0.379
ArgArg: 2.081 ± 0.677
ArgSer: 2.721 ± 0.488
ArgThr: 3.041 ± 0.866
ArgVal: 2.561 ± 0.575
ArgTrp: 0.64 ± 0.259
ArgTyr: 1.44 ± 0.368
ArgXaa: 0.0 ± 0.0
Ser
SerAla: 3.041 ± 1.169
SerCys: 0.48 ± 0.238
SerAsp: 3.521 ± 0.842
SerGlu: 5.122 ± 0.927
SerPhe: 3.681 ± 0.894
SerGly: 5.442 ± 1.046
SerHis: 0.8 ± 0.331
SerIle: 4.481 ± 0.963
SerLys: 3.201 ± 0.715
SerLeu: 6.562 ± 1.107
SerMet: 0.96 ± 0.32
SerAsn: 5.442 ± 1.065
SerPro: 1.761 ± 0.361
SerGln: 2.241 ± 0.537
SerArg: 3.841 ± 0.719
SerSer: 5.922 ± 1.11
SerThr: 3.201 ± 0.609
SerVal: 4.321 ± 0.754
SerTrp: 0.32 ± 0.224
SerTyr: 3.041 ± 0.655
SerXaa: 0.0 ± 0.0
Thr
ThrAla: 5.282 ± 1.084
ThrCys: 0.48 ± 0.253
ThrAsp: 4.802 ± 0.798
ThrGlu: 3.041 ± 0.561
ThrPhe: 3.041 ± 0.714
ThrGly: 4.641 ± 0.788
ThrHis: 0.8 ± 0.308
ThrIle: 4.641 ± 0.769
ThrLys: 4.962 ± 0.845
ThrLeu: 4.321 ± 0.638
ThrMet: 0.8 ± 0.379
ThrAsn: 2.561 ± 0.497
ThrPro: 2.721 ± 0.476
ThrGln: 1.44 ± 0.541
ThrArg: 1.601 ± 0.466
ThrSer: 4.962 ± 1.32
ThrThr: 5.922 ± 1.718
ThrVal: 5.122 ± 1.108
ThrTrp: 1.44 ± 0.522
ThrTyr: 1.921 ± 0.538
ThrXaa: 0.0 ± 0.0
Val
ValAla: 3.841 ± 0.897
ValCys: 0.96 ± 0.529
ValAsp: 4.962 ± 1.007
ValGlu: 3.521 ± 0.709
ValPhe: 3.041 ± 0.985
ValGly: 3.201 ± 0.697
ValHis: 0.8 ± 0.36
ValIle: 4.802 ± 0.74
ValLys: 4.962 ± 0.855
ValLeu: 6.082 ± 1.253
ValMet: 1.12 ± 0.38
ValAsn: 3.361 ± 0.673
ValPro: 2.401 ± 0.559
ValGln: 2.881 ± 0.518
ValArg: 2.401 ± 0.531
ValSer: 7.202 ± 1.465
ValThr: 5.122 ± 0.893
ValVal: 3.681 ± 0.852
ValTrp: 1.28 ± 0.505
ValTyr: 3.361 ± 0.642
ValXaa: 0.0 ± 0.0
Trp
TrpAla: 0.8 ± 0.492
TrpCys: 0.0 ± 0.0
TrpAsp: 0.16 ± 0.153
TrpGlu: 0.8 ± 0.304
TrpPhe: 0.8 ± 0.465
TrpGly: 0.64 ± 0.31
TrpHis: 0.48 ± 0.211
TrpIle: 0.48 ± 0.265
TrpLys: 0.8 ± 0.446
TrpLeu: 1.12 ± 0.527
TrpMet: 0.64 ± 0.348
TrpAsn: 1.12 ± 0.336
TrpPro: 0.0 ± 0.0
TrpGln: 1.12 ± 0.353
TrpArg: 0.64 ± 0.329
TrpSer: 0.96 ± 0.311
TrpThr: 0.96 ± 0.583
TrpVal: 0.8 ± 0.371
TrpTrp: 0.0 ± 0.0
TrpTyr: 0.48 ± 0.194
TrpXaa: 0.0 ± 0.0
Tyr
TyrAla: 2.241 ± 0.636
TyrCys: 0.48 ± 0.25
TyrAsp: 2.721 ± 0.647
TyrGlu: 2.721 ± 0.537
TyrPhe: 0.8 ± 0.366
TyrGly: 4.001 ± 0.905
TyrHis: 0.96 ± 0.403
TyrIle: 4.321 ± 0.932
TyrLys: 3.521 ± 0.756
TyrLeu: 3.361 ± 0.531
TyrMet: 1.12 ± 0.566
TyrAsn: 3.041 ± 0.671
TyrPro: 1.601 ± 0.411
TyrGln: 1.12 ± 0.396
TyrArg: 1.921 ± 0.614
TyrSer: 2.561 ± 0.609
TyrThr: 2.241 ± 0.557
TyrVal: 2.721 ± 0.831
TyrTrp: 0.96 ± 0.364
TyrTyr: 1.601 ± 0.57
TyrXaa: 0.0 ± 0.0
Xaa
XaaAla: 0.0 ± 0.0
XaaCys: 0.0 ± 0.0
XaaAsp: 0.0 ± 0.0
XaaGlu: 0.0 ± 0.0
XaaPhe: 0.0 ± 0.0
XaaGly: 0.0 ± 0.0
XaaHis: 0.0 ± 0.0
XaaIle: 0.0 ± 0.0
XaaLys: 0.0 ± 0.0
XaaLeu: 0.0 ± 0.0
XaaMet: 0.0 ± 0.0
XaaAsn: 0.0 ± 0.0
XaaPro: 0.0 ± 0.0
XaaGln: 0.0 ± 0.0
XaaArg: 0.0 ± 0.0
XaaSer: 0.0 ± 0.0
XaaThr: 0.0 ± 0.0
XaaVal: 0.0 ± 0.0
XaaTrp: 0.0 ± 0.0
XaaTyr: 0.0 ± 0.0
XaaXaa: 0.0 ± 0.0
Statistics based on 29 proteins (6249 amino acids)
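Each table entry above carries a dipeptide label, a value and an error estimate in the form `AlaAla: 3.361 ± 0.901`. The following sketch shows one way to parse such entries and compare them with the uniform expectation of 2.5‰. The names `parse_cell` and `classify` are hypothetical, and using 2.5‰ as the threshold is an assumption based on the expectation stated in the introduction; the colouring rule actually used on the page may differ.

```python
import re

# Two three-letter residue codes, then "value ± error".
CELL = re.compile(r"([A-Z][a-z]{2})([A-Z][a-z]{2}): ([0-9.]+) ± ([0-9.]+)")

def parse_cell(line):
    """Return (first, second, value, error) from one table entry, or None."""
    match = CELL.search(line)
    if match is None:
        return None  # e.g. a row header such as "Ala"
    first, second, value, error = match.groups()
    return first, second, float(value), float(error)

def classify(value_per_mille, expected=2.5):
    """Label a frequency relative to the assumed uniform expectation."""
    if value_per_mille > expected:
        return "over"
    if value_per_mille < expected:
        return "under"
    return "expected"

entry = parse_cell("LeuGlu: 7.362 ± 1.514")
label = classify(entry[2])
```

Because the pattern is searched for rather than anchored at the start of the line, it also tolerates a displayed value fused in front of the label.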

Note: The error was estimated by bootstrapping (×100) at the protein level.
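A minimal sketch of protein-level bootstrapping follows. It assumes that whole proteins are resampled with replacement, that the dipeptide frequency is recomputed for each of 100 replicates, and that the standard deviation across replicates is reported as the error. The helper names, the choice of the standard deviation, and the fixed seed are assumptions; the source does not describe these details.

```python
import random
import statistics

def dipeptide_per_mille(proteins, dipeptide):
    """Frequency (per mille) of one dipeptide among all adjacent pairs."""
    total = sum(max(len(s) - 1, 0) for s in proteins)
    hits = sum(
        1 for s in proteins for i in range(len(s) - 1) if s[i:i + 2] == dipeptide
    )
    return 1000.0 * hits / total if total else 0.0

def bootstrap_error(proteins, dipeptide, replicates=100, seed=0):
    """Standard deviation of the frequency over resampled protein sets."""
    rng = random.Random(seed)  # fixed seed only to make the sketch repeatable
    values = []
    for _ in range(replicates):
        # Resample whole proteins, not individual residues.
        sample = rng.choices(proteins, k=len(proteins))
        values.append(dipeptide_per_mille(sample, dipeptide))
    return statistics.stdev(values)

# Toy input (hypothetical one-letter sequences, not the phi29 proteome).
toy_proteins = ["MKKLA", "MKV", "AAKK"]
kk_error = bootstrap_error(toy_proteins, "KK")
```

Resampling at the protein level keeps each sequence intact, so the estimate reflects variation between proteins rather than between individual positions.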

The dipeptide statistics above (along with other statistics for this proteome) can be downloaded from this CSV file
See this proteome in: uniprot_link
Proteome-pI is available under the Creative Commons Attribution-NoDerivs license.

Reference: Kozlowski LP. Proteome-pI 2.0: Proteome Isoelectric Point Database Update. Nucleic Acids Res. 2021, doi: 10.1093/nar/gkab944
Contact: Lukasz P. Kozlowski