Amino acid dipeptide frequency for Rhodococcus phage Toil

Apart from single amino acid frequencies, one can also calculate the so-called amino acid dipeptide frequency.

There are 400 possible dipeptides (441 if Xaa is counted as an additional 21st amino acid representing all non-standard or unknown amino acids). Thus, if dipeptides occurred randomly with equal probability in proteins, each dipeptide would be present at a frequency of 0.25% (2.5‰). As this is not the case in nature, for better visibility all dipeptides that are more frequent than expected are marked in red, and those that are underrepresented are marked in blue in the table.
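
The counting and the comparison with the uniform expectation can be sketched as follows. This is a minimal illustration, not the database's actual pipeline: the function names, the one-letter alphabet, the toy sequences, and the choice to normalise by the total number of dipeptides are all assumptions.

```python
from collections import Counter
from itertools import product

# The 20 standard amino acids in one-letter code (an assumption; the table
# uses three-letter codes plus Xaa for unknown residues).
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

# Uniform expectation: 1 / 400 = 0.25% = 2.5 per mille.
EXPECTED_PER_MILLE = 1000.0 / len(AMINO_ACIDS) ** 2


def dipeptide_frequencies(proteins):
    """Return per-mille frequencies of overlapping dipeptides.

    Dipeptides are counted within each protein only, never across
    protein boundaries. Normalising by the total dipeptide count is an
    assumption of this sketch.
    """
    counts = Counter()
    for seq in proteins:
        for i in range(len(seq) - 1):
            counts[seq[i:i + 2]] += 1
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no dipeptides found in the input sequences")
    return {
        a + b: 1000.0 * counts[a + b] / total
        for a, b in product(AMINO_ACIDS, repeat=2)
    }


def classify(freq_per_mille):
    """Label a dipeptide relative to the uniform expectation."""
    if freq_per_mille > EXPECTED_PER_MILLE:
        return "over"   # marked in red in the table
    if freq_per_mille < EXPECTED_PER_MILLE:
        return "under"  # marked in blue in the table
    return "expected"


toy_proteome = ["MAAGL", "AAGT"]  # hypothetical sequences, not phage Toil
freqs = dipeptide_frequencies(toy_proteome)
print(EXPECTED_PER_MILLE)
print(freqs["AA"], classify(freqs["AA"]))
```

With real data, `toy_proteome` would be replaced by the protein sequences of the proteome of interest.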

All values are given in per mille (‰), so they need to be multiplied by 10^-3 to obtain fractions.
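
As a small worked example of the unit conversion, the snippet below turns the AlaAla value from the table (9.29‰) into a fraction and a percentage and compares it with the 0.25% uniform expectation. The helper names are hypothetical and serve only as illustration.

```python
def per_mille_to_fraction(value_per_mille):
    """Convert a per-mille value to a plain fraction (multiply by 10^-3)."""
    return value_per_mille * 1e-3


def per_mille_to_percent(value_per_mille):
    """Convert a per-mille value to a percentage (divide by 10)."""
    return value_per_mille / 10.0


ala_ala = 9.29                   # AlaAla value from the table, in per mille
expected_percent = 100.0 / 400   # 0.25%, uniform expectation over 400 dipeptides

fraction = per_mille_to_fraction(ala_ala)   # about 0.00929
percent = per_mille_to_percent(ala_ala)     # about 0.929 %
fold = percent / expected_percent           # about 3.7 times the expectation

print(fraction, percent, fold)
```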

For more information, see the sequence space article on Wikipedia.

Dipeptide frequencies in ‰ (value ± error), grouped by first residue.
Residue order: Ala Cys Asp Glu Phe Gly His Ile Lys Leu Met Asn Pro Gln Arg Ser Thr Val Trp Tyr Xaa

Ala
AlaAla: 9.29 ± 1.878
AlaCys: 0.911 ± 0.413
AlaAsp: 4.372 ± 0.943
AlaGlu: 5.282 ± 0.737
AlaPhe: 4.007 ± 0.736
AlaGly: 10.018 ± 1.793
AlaHis: 1.639 ± 0.759
AlaIle: 4.736 ± 0.906
AlaLys: 2.914 ± 0.839
AlaLeu: 7.832 ± 1.298
AlaMet: 2.55 ± 0.702
AlaAsn: 3.643 ± 0.697
AlaPro: 6.375 ± 1.708
AlaGln: 7.104 ± 1.187
AlaArg: 4.918 ± 1.085
AlaSer: 5.282 ± 1.04
AlaThr: 7.104 ± 1.742
AlaVal: 7.468 ± 1.673
AlaTrp: 1.275 ± 0.488
AlaTyr: 2.55 ± 0.642
AlaXaa: 0.0 ± 0.0

Cys
CysAla: 0.729 ± 0.396
CysCys: 0.182 ± 0.171
CysAsp: 0.546 ± 0.337
CysGlu: 0.182 ± 0.157
CysPhe: 0.0 ± 0.0
CysGly: 0.182 ± 0.179
CysHis: 0.0 ± 0.0
CysIle: 0.0 ± 0.0
CysLys: 0.729 ± 0.345
CysLeu: 0.729 ± 0.296
CysMet: 0.0 ± 0.0
CysAsn: 0.182 ± 0.191
CysPro: 0.546 ± 0.259
CysGln: 0.364 ± 0.234
CysArg: 1.093 ± 0.526
CysSer: 1.093 ± 0.478
CysThr: 0.364 ± 0.342
CysVal: 0.911 ± 0.383
CysTrp: 0.0 ± 0.0
CysTyr: 0.182 ± 0.157
CysXaa: 0.0 ± 0.0

Asp
AspAla: 5.282 ± 0.645
AspCys: 1.275 ± 0.622
AspAsp: 3.643 ± 1.026
AspGlu: 4.007 ± 0.87
AspPhe: 1.275 ± 0.545
AspGly: 2.914 ± 0.772
AspHis: 1.275 ± 0.523
AspIle: 2.732 ± 0.64
AspLys: 2.732 ± 0.842
AspLeu: 6.375 ± 0.971
AspMet: 0.729 ± 0.339
AspAsn: 2.732 ± 0.59
AspPro: 2.732 ± 0.558
AspGln: 2.55 ± 0.631
AspArg: 3.097 ± 0.744
AspSer: 2.914 ± 0.891
AspThr: 4.736 ± 1.099
AspVal: 3.279 ± 0.565
AspTrp: 0.911 ± 0.383
AspTyr: 2.004 ± 0.563
AspXaa: 0.0 ± 0.0

Glu
GluAla: 7.104 ± 1.534
GluCys: 0.729 ± 0.334
GluAsp: 1.639 ± 0.629
GluGlu: 2.55 ± 0.592
GluPhe: 1.821 ± 0.568
GluGly: 3.097 ± 0.816
GluHis: 1.821 ± 0.502
GluIle: 1.457 ± 0.557
GluLys: 0.911 ± 0.341
GluLeu: 5.1 ± 1.059
GluMet: 1.275 ± 0.594
GluAsn: 2.004 ± 0.566
GluPro: 2.186 ± 0.901
GluGln: 1.639 ± 0.702
GluArg: 2.732 ± 0.758
GluSer: 2.55 ± 0.662
GluThr: 2.732 ± 0.737
GluVal: 4.554 ± 0.941
GluTrp: 1.093 ± 0.409
GluTyr: 2.004 ± 0.675
GluXaa: 0.0 ± 0.0

Phe
PheAla: 3.643 ± 0.719
PheCys: 0.182 ± 0.187
PheAsp: 2.368 ± 0.498
PheGlu: 1.275 ± 0.514
PhePhe: 1.457 ± 0.359
PheGly: 4.554 ± 1.1
PheHis: 0.729 ± 0.317
PheIle: 1.275 ± 0.42
PheLys: 0.911 ± 0.423
PheLeu: 2.914 ± 0.71
PheMet: 0.729 ± 0.365
PheAsn: 2.004 ± 0.592
PhePro: 1.093 ± 0.455
PheGln: 1.275 ± 0.437
PheArg: 1.639 ± 0.507
PheSer: 3.279 ± 0.94
PheThr: 2.368 ± 0.649
PheVal: 3.643 ± 0.928
PheTrp: 0.729 ± 0.486
PheTyr: 1.275 ± 0.548
PheXaa: 0.0 ± 0.0

Gly
GlyAla: 7.832 ± 1.332
GlyCys: 0.364 ± 0.235
GlyAsp: 4.372 ± 0.975
GlyGlu: 3.461 ± 0.775
GlyPhe: 4.007 ± 1.05
GlyGly: 9.29 ± 2.102
GlyHis: 1.821 ± 0.486
GlyIle: 5.282 ± 0.848
GlyLys: 2.732 ± 1.094
GlyLeu: 4.918 ± 0.829
GlyMet: 2.368 ± 0.734
GlyAsn: 5.282 ± 0.998
GlyPro: 2.004 ± 0.495
GlyGln: 2.914 ± 0.608
GlyArg: 2.914 ± 0.785
GlySer: 6.557 ± 0.957
GlyThr: 8.379 ± 1.375
GlyVal: 6.375 ± 1.017
GlyTrp: 2.368 ± 0.824
GlyTyr: 3.461 ± 0.696
GlyXaa: 0.0 ± 0.0

His
HisAla: 1.821 ± 0.558
HisCys: 0.182 ± 0.172
HisAsp: 1.275 ± 0.566
HisGlu: 1.275 ± 0.534
HisPhe: 0.182 ± 0.171
HisGly: 1.457 ± 0.638
HisHis: 0.729 ± 0.355
HisIle: 0.911 ± 0.387
HisLys: 0.911 ± 0.339
HisLeu: 2.186 ± 0.735
HisMet: 0.182 ± 0.181
HisAsn: 0.729 ± 0.314
HisPro: 1.639 ± 0.819
HisGln: 0.182 ± 0.157
HisArg: 1.093 ± 0.434
HisSer: 0.911 ± 0.463
HisThr: 1.093 ± 0.429
HisVal: 2.186 ± 0.779
HisTrp: 0.364 ± 0.263
HisTyr: 0.546 ± 0.324
HisXaa: 0.0 ± 0.0

Ile
IleAla: 4.372 ± 0.777
IleCys: 0.364 ± 0.23
IleAsp: 3.643 ± 0.896
IleGlu: 1.093 ± 0.414
IlePhe: 1.457 ± 0.5
IleGly: 5.464 ± 1.099
IleHis: 0.729 ± 0.369
IleIle: 3.643 ± 0.777
IleLys: 1.639 ± 0.655
IleLeu: 3.461 ± 0.797
IleMet: 0.911 ± 0.442
IleAsn: 1.275 ± 0.363
IlePro: 4.007 ± 0.772
IleGln: 2.55 ± 0.776
IleArg: 2.732 ± 0.683
IleSer: 4.007 ± 0.861
IleThr: 2.914 ± 0.615
IleVal: 3.279 ± 1.053
IleTrp: 0.546 ± 0.302
IleTyr: 1.639 ± 0.452
IleXaa: 0.0 ± 0.0

Lys
LysAla: 4.736 ± 1.136
LysCys: 0.182 ± 0.168
LysAsp: 1.821 ± 0.575
LysGlu: 1.093 ± 0.543
LysPhe: 1.639 ± 0.491
LysGly: 2.368 ± 0.467
LysHis: 0.911 ± 0.353
LysIle: 1.093 ± 0.563
LysLys: 1.639 ± 0.64
LysLeu: 2.004 ± 0.66
LysMet: 0.911 ± 0.398
LysAsn: 1.639 ± 0.642
LysPro: 2.004 ± 0.643
LysGln: 1.821 ± 0.557
LysArg: 2.186 ± 0.764
LysSer: 1.457 ± 0.501
LysThr: 1.821 ± 0.642
LysVal: 3.461 ± 0.788
LysTrp: 1.639 ± 0.497
LysTyr: 0.729 ± 0.362
LysXaa: 0.0 ± 0.0

Leu
LeuAla: 6.922 ± 1.279
LeuCys: 0.182 ± 0.157
LeuAsp: 4.007 ± 0.954
LeuGlu: 4.007 ± 0.747
LeuPhe: 2.914 ± 0.761
LeuGly: 7.468 ± 1.098
LeuHis: 0.911 ± 0.52
LeuIle: 2.55 ± 0.884
LeuLys: 2.004 ± 0.541
LeuLeu: 4.554 ± 0.735
LeuMet: 2.186 ± 0.636
LeuAsn: 4.554 ± 0.93
LeuPro: 4.736 ± 0.814
LeuGln: 2.732 ± 0.91
LeuArg: 5.464 ± 1.17
LeuSer: 6.011 ± 0.821
LeuThr: 6.193 ± 0.991
LeuVal: 4.736 ± 0.825
LeuTrp: 1.093 ± 0.369
LeuTyr: 1.821 ± 0.624
LeuXaa: 0.0 ± 0.0

Met
MetAla: 3.097 ± 0.613
MetCys: 0.364 ± 0.279
MetAsp: 0.182 ± 0.171
MetGlu: 0.364 ± 0.273
MetPhe: 0.546 ± 0.341
MetGly: 1.457 ± 0.575
MetHis: 0.729 ± 0.288
MetIle: 1.093 ± 0.39
MetLys: 1.093 ± 0.453
MetLeu: 2.186 ± 0.797
MetMet: 0.911 ± 0.355
MetAsn: 2.004 ± 0.533
MetPro: 1.093 ± 0.363
MetGln: 1.457 ± 0.57
MetArg: 0.364 ± 0.263
MetSer: 1.275 ± 0.526
MetThr: 3.279 ± 0.711
MetVal: 2.732 ± 0.787
MetTrp: 0.0 ± 0.0
MetTyr: 0.911 ± 0.334
MetXaa: 0.0 ± 0.0

Asn
AsnAla: 5.464 ± 1.195
AsnCys: 0.364 ± 0.264
AsnAsp: 1.457 ± 0.646
AsnGlu: 2.004 ± 0.711
AsnPhe: 1.639 ± 0.433
AsnGly: 6.011 ± 1.271
AsnHis: 0.364 ± 0.286
AsnIle: 3.097 ± 0.819
AsnLys: 1.275 ± 0.578
AsnLeu: 3.279 ± 1.002
AsnMet: 1.821 ± 0.676
AsnAsn: 2.368 ± 0.928
AsnPro: 3.279 ± 0.963
AsnGln: 2.186 ± 0.773
AsnArg: 2.004 ± 0.41
AsnSer: 2.004 ± 0.528
AsnThr: 2.914 ± 0.656
AsnVal: 3.643 ± 0.804
AsnTrp: 0.911 ± 0.383
AsnTyr: 1.275 ± 0.476
AsnXaa: 0.0 ± 0.0

Pro
ProAla: 5.1 ± 1.169
ProCys: 0.364 ± 0.292
ProAsp: 4.189 ± 0.803
ProGlu: 3.461 ± 0.87
ProPhe: 2.368 ± 0.611
ProGly: 3.825 ± 0.837
ProHis: 0.729 ± 0.4
ProIle: 2.004 ± 0.682
ProLys: 2.732 ± 0.732
ProLeu: 3.097 ± 0.754
ProMet: 1.639 ± 0.543
ProAsn: 3.279 ± 0.691
ProPro: 2.732 ± 0.911
ProGln: 2.004 ± 0.671
ProArg: 1.639 ± 0.841
ProSer: 2.368 ± 0.742
ProThr: 2.55 ± 0.56
ProVal: 3.461 ± 0.737
ProTrp: 1.275 ± 0.382
ProTyr: 2.186 ± 0.803
ProXaa: 0.0 ± 0.0

Gln
GlnAla: 3.825 ± 0.839
GlnCys: 0.364 ± 0.313
GlnAsp: 1.821 ± 0.541
GlnGlu: 1.639 ± 0.549
GlnPhe: 1.275 ± 0.442
GlnGly: 3.825 ± 0.639
GlnHis: 0.546 ± 0.319
GlnIle: 2.55 ± 0.765
GlnLys: 1.275 ± 0.624
GlnLeu: 4.918 ± 1.04
GlnMet: 0.911 ± 0.43
GlnAsn: 2.368 ± 0.48
GlnPro: 3.097 ± 0.637
GlnGln: 2.55 ± 0.745
GlnArg: 2.55 ± 0.741
GlnSer: 4.189 ± 0.823
GlnThr: 2.368 ± 0.55
GlnVal: 3.279 ± 0.748
GlnTrp: 0.729 ± 0.327
GlnTyr: 0.911 ± 0.4
GlnXaa: 0.0 ± 0.0

Arg
ArgAla: 4.189 ± 0.704
ArgCys: 0.182 ± 0.197
ArgAsp: 4.554 ± 0.905
ArgGlu: 3.097 ± 0.87
ArgPhe: 2.186 ± 0.618
ArgGly: 2.914 ± 0.564
ArgHis: 1.275 ± 0.434
ArgIle: 4.372 ± 0.852
ArgLys: 2.004 ± 0.769
ArgLeu: 3.097 ± 0.774
ArgMet: 1.093 ± 0.396
ArgAsn: 2.55 ± 1.123
ArgPro: 1.639 ± 0.664
ArgGln: 1.821 ± 0.634
ArgArg: 2.732 ± 0.789
ArgSer: 4.189 ± 0.978
ArgThr: 2.732 ± 0.734
ArgVal: 2.914 ± 0.802
ArgTrp: 0.729 ± 0.355
ArgTyr: 2.004 ± 0.758
ArgXaa: 0.0 ± 0.0

Ser
SerAla: 6.375 ± 1.215
SerCys: 0.364 ± 0.224
SerAsp: 4.736 ± 0.968
SerGlu: 2.55 ± 0.631
SerPhe: 3.461 ± 0.784
SerGly: 6.557 ± 0.856
SerHis: 0.546 ± 0.334
SerIle: 3.461 ± 1.278
SerLys: 3.097 ± 0.706
SerLeu: 4.554 ± 0.919
SerMet: 1.457 ± 0.406
SerAsn: 2.914 ± 1.097
SerPro: 2.914 ± 0.908
SerGln: 2.55 ± 0.734
SerArg: 2.186 ± 0.905
SerSer: 4.189 ± 1.071
SerThr: 6.193 ± 0.8
SerVal: 4.554 ± 0.869
SerTrp: 1.093 ± 0.366
SerTyr: 2.004 ± 0.646
SerXaa: 0.0 ± 0.0

Thr
ThrAla: 8.015 ± 1.153
ThrCys: 0.364 ± 0.227
ThrAsp: 4.007 ± 0.918
ThrGlu: 6.193 ± 1.382
ThrPhe: 2.368 ± 0.671
ThrGly: 4.918 ± 1.09
ThrHis: 1.639 ± 0.678
ThrIle: 4.554 ± 0.684
ThrLys: 2.004 ± 0.577
ThrLeu: 4.554 ± 0.761
ThrMet: 2.186 ± 0.673
ThrAsn: 2.732 ± 0.725
ThrPro: 3.279 ± 0.803
ThrGln: 3.279 ± 0.774
ThrArg: 4.189 ± 0.989
ThrSer: 2.914 ± 0.81
ThrThr: 5.647 ± 1.204
ThrVal: 6.557 ± 1.218
ThrTrp: 0.729 ± 0.387
ThrTyr: 1.639 ± 0.557
ThrXaa: 0.0 ± 0.0

Val
ValAla: 8.561 ± 1.38
ValCys: 0.364 ± 0.273
ValAsp: 2.732 ± 0.815
ValGlu: 3.643 ± 0.756
ValPhe: 3.461 ± 0.715
ValGly: 5.464 ± 1.422
ValHis: 1.275 ± 0.389
ValIle: 2.368 ± 0.663
ValLys: 2.368 ± 0.703
ValLeu: 4.189 ± 1.008
ValMet: 1.821 ± 0.789
ValAsn: 4.007 ± 0.839
ValPro: 3.825 ± 0.872
ValGln: 3.461 ± 0.749
ValArg: 3.825 ± 0.925
ValSer: 7.468 ± 1.426
ValThr: 5.647 ± 1.112
ValVal: 5.464 ± 1.185
ValTrp: 1.821 ± 0.5
ValTyr: 3.643 ± 0.876
ValXaa: 0.0 ± 0.0

Trp
TrpAla: 0.546 ± 0.291
TrpCys: 0.364 ± 0.302
TrpAsp: 1.457 ± 0.56
TrpGlu: 0.364 ± 0.262
TrpPhe: 0.364 ± 0.353
TrpGly: 2.004 ± 0.589
TrpHis: 1.457 ± 0.747
TrpIle: 0.546 ± 0.353
TrpLys: 0.729 ± 0.344
TrpLeu: 2.55 ± 0.637
TrpMet: 0.364 ± 0.281
TrpAsn: 0.911 ± 0.334
TrpPro: 0.364 ± 0.237
TrpGln: 1.275 ± 0.516
TrpArg: 1.275 ± 0.399
TrpSer: 1.275 ± 0.469
TrpThr: 0.364 ± 0.227
TrpVal: 0.729 ± 0.277
TrpTrp: 0.182 ± 0.19
TrpTyr: 1.093 ± 0.481
TrpXaa: 0.0 ± 0.0

Tyr
TyrAla: 2.368 ± 0.738
TyrCys: 0.364 ± 0.234
TyrAsp: 4.372 ± 0.793
TyrGlu: 1.457 ± 0.495
TyrPhe: 0.911 ± 0.431
TyrGly: 2.914 ± 0.612
TyrHis: 0.729 ± 0.396
TyrIle: 2.004 ± 0.701
TyrLys: 1.457 ± 0.425
TyrLeu: 2.55 ± 0.668
TyrMet: 0.911 ± 0.387
TyrAsn: 0.364 ± 0.238
TyrPro: 1.457 ± 0.532
TyrGln: 1.275 ± 0.375
TyrArg: 1.639 ± 0.519
TyrSer: 1.821 ± 0.529
TyrThr: 2.368 ± 0.508
TyrVal: 2.186 ± 0.595
TyrTrp: 0.729 ± 0.307
TyrTyr: 0.911 ± 0.386
TyrXaa: 0.0 ± 0.0

Xaa
XaaAla: 0.0 ± 0.0
XaaCys: 0.0 ± 0.0
XaaAsp: 0.0 ± 0.0
XaaGlu: 0.0 ± 0.0
XaaPhe: 0.0 ± 0.0
XaaGly: 0.0 ± 0.0
XaaHis: 0.0 ± 0.0
XaaIle: 0.0 ± 0.0
XaaLys: 0.0 ± 0.0
XaaLeu: 0.0 ± 0.0
XaaMet: 0.0 ± 0.0
XaaAsn: 0.0 ± 0.0
XaaPro: 0.0 ± 0.0
XaaGln: 0.0 ± 0.0
XaaArg: 0.0 ± 0.0
XaaSer: 0.0 ± 0.0
XaaThr: 0.0 ± 0.0
XaaVal: 0.0 ± 0.0
XaaTrp: 0.0 ± 0.0
XaaTyr: 0.0 ± 0.0
XaaXaa: 0.0 ± 0.0

Statistics based on 35 proteins (5491 amino acids)

Note: The error was estimated by bootstrapping (x100) at the protein level.
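
Protein-level bootstrapping can be sketched as below: whole proteins are resampled with replacement, the dipeptide frequency is recomputed for each resample, and the spread of those estimates is reported. This is only an illustration under stated assumptions. The function names, the toy sequences, the fixed seed, and the use of the standard deviation as the error measure are not taken from the source.

```python
import random
import statistics
from collections import Counter


def dipeptide_per_mille(proteins, dipeptide):
    """Per-mille frequency of one dipeptide in a list of sequences."""
    counts = Counter()
    for seq in proteins:
        for i in range(len(seq) - 1):
            counts[seq[i:i + 2]] += 1
    total = sum(counts.values())
    return 1000.0 * counts[dipeptide] / total if total else 0.0


def bootstrap_error(proteins, dipeptide, n_resamples=100, seed=0):
    """Estimate the error of a dipeptide frequency by protein-level bootstrap.

    Each resample draws len(proteins) whole proteins with replacement.
    The standard deviation of the resampled frequencies is returned
    (an assumed error measure for this sketch).
    """
    rng = random.Random(seed)
    estimates = []
    for _ in range(n_resamples):
        resample = rng.choices(proteins, k=len(proteins))
        estimates.append(dipeptide_per_mille(resample, dipeptide))
    return statistics.stdev(estimates)


toy_proteome = ["MAAGL", "AAGT", "MKLV", "GGAAT"]  # hypothetical sequences
value = dipeptide_per_mille(toy_proteome, "AA")
error = bootstrap_error(toy_proteome, "AA")
print(f"AA: {value:.3f} ± {error:.3f}")
```

Resampling whole proteins rather than individual residues keeps each sequence intact, so the estimate reflects variation between proteins.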

The dipeptide statistics above (along with other statistics for this proteome) can be downloaded from this CSV file
See this proteome in: uniprot_link
Proteome-pI is available under Creative Commons Attribution-NoDerivs license, for more details see here

Reference: Kozlowski LP. Proteome-pI 2.0: Proteome Isoelectric Point Database Update. Nucleic Acids Res. 2021, doi: 10.1093/nar/gkab944
Contact: Lukasz P. Kozlowski