Amino acid dipeptide frequency for Helicobacter pylori bacteriophage KHP30

Apart from single amino acid frequencies, one can also calculate the so-called amino acid dipeptide frequency.

There are 400 possible dipeptides (441 if Xaa is counted as an additional, 21st amino acid representing all non-standard or unknown amino acids). If dipeptides occurred at random with equal probability, each of the 400 would therefore be expected at a frequency of 0.25% (2.5‰). This is not the case in nature, so for better visibility dipeptides that are more frequent than expected are marked in red in the table, and those that are underrepresented are marked in blue.
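As a rough illustration of how such frequencies could be computed, the Python sketch below counts overlapping residue pairs within each protein and reports them in per mille over all 441 dipeptides. The function names, the toy sequences, and the choice to normalise by the total number of pairs counted are assumptions made for this example; the page does not describe the exact procedure used by Proteome-pI, nor whether the colour marking uses exactly the uniform 2.5‰ threshold.

```python
from collections import Counter
from itertools import product

# Three-letter codes keyed by one-letter code for the 20 standard amino acids.
THREE_LETTER = {
    "A": "Ala", "C": "Cys", "D": "Asp", "E": "Glu", "F": "Phe",
    "G": "Gly", "H": "His", "I": "Ile", "K": "Lys", "L": "Leu",
    "M": "Met", "N": "Asn", "P": "Pro", "Q": "Gln", "R": "Arg",
    "S": "Ser", "T": "Thr", "V": "Val", "W": "Trp", "Y": "Tyr",
}


def to_code(letter):
    """Map a one-letter residue to its three-letter code, or Xaa if non-standard."""
    return THREE_LETTER.get(letter.upper(), "Xaa")


def dipeptide_per_mille(proteins):
    """Return {dipeptide: frequency in per mille} for all 441 dipeptides.

    Pairs are counted within each protein only, never across protein boundaries.
    Assumption: frequencies are normalised by the total number of pairs counted.
    """
    counts = Counter()
    for seq in proteins:
        for first, second in zip(seq, seq[1:]):
            counts[to_code(first) + to_code(second)] += 1
    total = sum(counts.values())
    codes = list(THREE_LETTER.values()) + ["Xaa"]
    return {
        a + b: (1000.0 * counts[a + b] / total if total else 0.0)
        for a, b in product(codes, repeat=2)
    }


# Uniform expectation over the 400 standard dipeptides, in per mille.
EXPECTED_PER_MILLE = 1000.0 / 400


def representation(value_per_mille, expected=EXPECTED_PER_MILLE):
    """Label a frequency as over- or under-represented relative to `expected`."""
    if value_per_mille > expected:
        return "over"
    if value_per_mille < expected:
        return "under"
    return "as expected"


# Hypothetical toy sequences, not taken from the KHP30 proteome.
toy = ["MKLLKA", "AKLX"]
freqs = dipeptide_per_mille(toy)
```

With this threshold, a table value such as LeuLys (16.011‰) would be labelled "over" and CysCys (0.122‰) "under".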

All values are given in per mille (‰), so they need to be multiplied by 10^-3 to obtain fractions.
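The conversion can be sketched in Python as follows. The helper `parse_entry` and its regular expression are hypothetical, written only for this example; it looks for a dipeptide label followed by a value and an error, and returns both as plain fractions.

```python
import re

# Matches e.g. "LeuLys: 16.011 ± 1.769" anywhere in a line.
ENTRY = re.compile(r"([A-Z][a-z]{2})([A-Z][a-z]{2}):\s*([\d.]+)\s*±\s*([\d.]+)")


def parse_entry(line):
    """Return (first, second, frequency, error), with both numbers as fractions."""
    match = ENTRY.search(line)
    if match is None:
        raise ValueError(f"no dipeptide entry found in: {line!r}")
    first, second, value, error = match.groups()
    # Table values are in per mille, so multiply by 10^-3.
    return first, second, float(value) * 1e-3, float(error) * 1e-3


first, second, frequency, error = parse_entry("LeuLys: 16.011 ± 1.769")
```

Here `frequency` is about 0.016011, that is, roughly 1.6% of all dipeptides.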

For more information, see the sequence space article on Wikipedia.

Residue order (rows: first residue; columns: second residue): Ala Cys Asp Glu Phe Gly His Ile Lys Leu Met Asn Pro Gln Arg Ser Thr Val Trp Tyr Xaa
Entries are given as dipeptide: frequency ± error (‰).
Ala
AlaAla: 1.833 ± 0.745
AlaCys: 0.856 ± 0.376
AlaAsp: 2.2 ± 0.402
AlaGlu: 3.055 ± 0.831
AlaPhe: 4.033 ± 0.856
AlaGly: 2.933 ± 0.87
AlaHis: 0.611 ± 0.318
AlaIle: 6.967 ± 0.908
AlaLys: 7.578 ± 1.19
AlaLeu: 9.9 ± 1.55
AlaMet: 1.222 ± 0.456
AlaAsn: 6.6 ± 1.038
AlaPro: 1.467 ± 0.269
AlaGln: 2.933 ± 0.581
AlaArg: 2.933 ± 0.573
AlaSer: 3.3 ± 0.683
AlaThr: 2.567 ± 0.545
AlaVal: 2.2 ± 0.531
AlaTrp: 0.244 ± 0.164
AlaTyr: 2.078 ± 0.597
AlaXaa: 0.0 ± 0.0
Cys
CysAla: 0.367 ± 0.199
CysCys: 0.122 ± 0.116
CysAsp: 0.489 ± 0.282
CysGlu: 0.733 ± 0.299
CysPhe: 0.611 ± 0.319
CysGly: 0.489 ± 0.344
CysHis: 0.122 ± 0.128
CysIle: 0.611 ± 0.371
CysLys: 0.367 ± 0.227
CysLeu: 1.1 ± 0.482
CysMet: 0.122 ± 0.133
CysAsn: 0.489 ± 0.301
CysPro: 0.367 ± 0.176
CysGln: 0.244 ± 0.165
CysArg: 0.244 ± 0.167
CysSer: 0.122 ± 0.138
CysThr: 0.611 ± 0.263
CysVal: 0.367 ± 0.227
CysTrp: 0.0 ± 0.0
CysTyr: 0.122 ± 0.114
CysXaa: 0.0 ± 0.0
Asp
AspAla: 2.933 ± 0.749
AspCys: 0.367 ± 0.249
AspAsp: 2.689 ± 0.64
AspGlu: 3.422 ± 0.6
AspPhe: 4.155 ± 0.735
AspGly: 1.1 ± 0.261
AspHis: 0.489 ± 0.22
AspIle: 2.567 ± 0.784
AspLys: 6.6 ± 1.07
AspLeu: 7.333 ± 1.13
AspMet: 1.344 ± 0.549
AspAsn: 4.4 ± 0.574
AspPro: 2.2 ± 0.443
AspGln: 0.978 ± 0.347
AspArg: 1.467 ± 0.393
AspSer: 2.689 ± 0.582
AspThr: 1.589 ± 0.527
AspVal: 1.344 ± 0.497
AspTrp: 0.0 ± 0.0
AspTyr: 2.933 ± 0.495
AspXaa: 0.0 ± 0.0
Glu
GluAla: 5.5 ± 0.948
GluCys: 0.122 ± 0.114
GluAsp: 1.956 ± 0.423
GluGlu: 5.867 ± 1.028
GluPhe: 4.278 ± 0.793
GluGly: 1.589 ± 0.414
GluHis: 1.222 ± 0.491
GluIle: 6.722 ± 0.753
GluLys: 7.455 ± 0.84
GluLeu: 9.533 ± 0.894
GluMet: 0.978 ± 0.226
GluAsn: 6.233 ± 0.666
GluPro: 1.344 ± 0.389
GluGln: 4.767 ± 0.94
GluArg: 5.622 ± 1.027
GluSer: 7.455 ± 0.73
GluThr: 4.644 ± 0.654
GluVal: 3.911 ± 0.77
GluTrp: 0.489 ± 0.221
GluTyr: 2.2 ± 0.443
GluXaa: 0.0 ± 0.0
Phe
PheAla: 1.1 ± 0.374
PheCys: 0.489 ± 0.223
PheAsp: 3.178 ± 0.689
PheGlu: 3.667 ± 0.707
PhePhe: 3.544 ± 0.548
PheGly: 1.589 ± 0.38
PheHis: 0.856 ± 0.258
PheIle: 3.055 ± 0.669
PheLys: 5.989 ± 0.798
PheLeu: 6.233 ± 0.934
PheMet: 0.856 ± 0.214
PheAsn: 2.689 ± 0.461
PhePro: 0.611 ± 0.245
PheGln: 0.489 ± 0.241
PheArg: 1.956 ± 0.414
PheSer: 5.133 ± 0.81
PheThr: 3.055 ± 0.532
PheVal: 1.467 ± 0.403
PheTrp: 0.0 ± 0.0
PheTyr: 1.833 ± 0.614
PheXaa: 0.0 ± 0.0
Gly
GlyAla: 2.811 ± 0.933
GlyCys: 0.611 ± 0.306
GlyAsp: 1.589 ± 0.423
GlyGlu: 1.833 ± 0.437
GlyPhe: 2.933 ± 0.481
GlyGly: 3.3 ± 0.927
GlyHis: 0.244 ± 0.17
GlyIle: 2.811 ± 0.745
GlyLys: 2.078 ± 0.588
GlyLeu: 5.133 ± 0.619
GlyMet: 1.222 ± 0.384
GlyAsn: 3.544 ± 0.651
GlyPro: 0.122 ± 0.135
GlyGln: 0.978 ± 0.249
GlyArg: 1.1 ± 0.303
GlySer: 2.444 ± 0.566
GlyThr: 0.978 ± 0.296
GlyVal: 3.667 ± 0.777
GlyTrp: 0.0 ± 0.0
GlyTyr: 2.2 ± 0.41
GlyXaa: 0.0 ± 0.0
His
HisAla: 1.1 ± 0.287
HisCys: 0.0 ± 0.0
HisAsp: 0.856 ± 0.443
HisGlu: 1.222 ± 0.416
HisPhe: 0.611 ± 0.265
HisGly: 0.367 ± 0.187
HisHis: 0.122 ± 0.137
HisIle: 0.856 ± 0.298
HisLys: 1.467 ± 0.401
HisLeu: 1.344 ± 0.554
HisMet: 0.244 ± 0.174
HisAsn: 0.856 ± 0.327
HisPro: 0.367 ± 0.221
HisGln: 0.367 ± 0.254
HisArg: 0.611 ± 0.306
HisSer: 0.489 ± 0.189
HisThr: 1.222 ± 0.512
HisVal: 0.367 ± 0.217
HisTrp: 0.0 ± 0.0
HisTyr: 0.367 ± 0.224
HisXaa: 0.0 ± 0.0
Ile
IleAla: 5.255 ± 0.731
IleCys: 1.1 ± 0.478
IleAsp: 4.033 ± 0.773
IleGlu: 5.989 ± 0.892
IlePhe: 2.322 ± 0.541
IleGly: 1.833 ± 0.583
IleHis: 0.489 ± 0.212
IleIle: 4.278 ± 0.795
IleLys: 8.433 ± 0.931
IleLeu: 6.6 ± 0.727
IleMet: 1.711 ± 0.476
IleAsn: 5.378 ± 1.293
IlePro: 1.467 ± 0.472
IleGln: 3.544 ± 0.977
IleArg: 4.155 ± 0.494
IleSer: 4.522 ± 0.691
IleThr: 3.911 ± 0.56
IleVal: 3.055 ± 0.572
IleTrp: 0.244 ± 0.191
IleTyr: 2.2 ± 0.471
IleXaa: 0.0 ± 0.0
Lys
LysAla: 9.166 ± 1.253
LysCys: 0.733 ± 0.385
LysAsp: 6.355 ± 1.194
LysGlu: 12.955 ± 1.384
LysPhe: 2.689 ± 0.408
LysGly: 2.811 ± 0.596
LysHis: 2.2 ± 0.83
LysIle: 8.678 ± 1.097
LysLys: 9.533 ± 1.622
LysLeu: 7.7 ± 0.89
LysMet: 1.344 ± 0.402
LysAsn: 10.389 ± 1.425
LysPro: 3.422 ± 0.618
LysGln: 6.478 ± 1.049
LysArg: 5.255 ± 0.752
LysSer: 5.133 ± 1.226
LysThr: 5.378 ± 0.925
LysVal: 4.767 ± 0.783
LysTrp: 0.611 ± 0.262
LysTyr: 2.933 ± 0.506
LysXaa: 0.0 ± 0.0
Leu
LeuAla: 5.378 ± 0.487
LeuCys: 1.344 ± 0.566
LeuAsp: 4.889 ± 0.569
LeuGlu: 11.244 ± 1.207
LeuPhe: 3.789 ± 0.772
LeuGly: 6.111 ± 0.81
LeuHis: 0.489 ± 0.207
LeuIle: 6.111 ± 0.864
LeuLys: 16.011 ± 1.769
LeuLeu: 7.455 ± 1.085
LeuMet: 1.956 ± 0.47
LeuAsn: 11.122 ± 1.489
LeuPro: 1.956 ± 0.493
LeuGln: 4.4 ± 0.965
LeuArg: 4.155 ± 0.661
LeuSer: 6.844 ± 0.981
LeuThr: 4.155 ± 0.793
LeuVal: 3.422 ± 0.682
LeuTrp: 0.978 ± 0.393
LeuTyr: 2.2 ± 0.503
LeuXaa: 0.0 ± 0.0
Met
MetAla: 0.978 ± 0.33
MetCys: 0.0 ± 0.0
MetAsp: 1.1 ± 0.432
MetGlu: 0.978 ± 0.244
MetPhe: 0.856 ± 0.413
MetGly: 1.467 ± 0.559
MetHis: 0.367 ± 0.18
MetIle: 1.467 ± 0.44
MetLys: 1.833 ± 0.643
MetLeu: 1.956 ± 0.403
MetMet: 0.122 ± 0.117
MetAsn: 1.833 ± 0.507
MetPro: 0.733 ± 0.304
MetGln: 1.467 ± 0.455
MetArg: 0.733 ± 0.305
MetSer: 0.733 ± 0.247
MetThr: 0.367 ± 0.237
MetVal: 0.244 ± 0.142
MetTrp: 0.244 ± 0.214
MetTyr: 0.489 ± 0.187
MetXaa: 0.0 ± 0.0
Asn
AsnAla: 10.144 ± 1.929
AsnCys: 0.122 ± 0.108
AsnAsp: 4.767 ± 0.617
AsnGlu: 6.6 ± 0.78
AsnPhe: 3.789 ± 0.781
AsnGly: 2.567 ± 0.444
AsnHis: 1.1 ± 0.32
AsnIle: 4.033 ± 0.596
AsnLys: 8.066 ± 0.954
AsnLeu: 6.6 ± 0.965
AsnMet: 1.222 ± 0.343
AsnAsn: 7.578 ± 1.27
AsnPro: 1.589 ± 0.433
AsnGln: 5.622 ± 1.33
AsnArg: 3.055 ± 0.477
AsnSer: 4.522 ± 0.63
AsnThr: 4.278 ± 0.899
AsnVal: 2.2 ± 0.565
AsnTrp: 0.244 ± 0.166
AsnTyr: 5.622 ± 1.138
AsnXaa: 0.0 ± 0.0
Pro
ProAla: 0.733 ± 0.271
ProCys: 0.0 ± 0.0
ProAsp: 0.611 ± 0.199
ProGlu: 0.978 ± 0.372
ProPhe: 2.078 ± 0.419
ProGly: 0.244 ± 0.163
ProHis: 0.122 ± 0.102
ProIle: 1.956 ± 0.429
ProLys: 3.911 ± 0.662
ProLeu: 2.2 ± 0.686
ProMet: 0.244 ± 0.178
ProAsn: 2.567 ± 0.449
ProPro: 0.367 ± 0.226
ProGln: 0.978 ± 0.325
ProArg: 0.489 ± 0.25
ProSer: 2.444 ± 0.423
ProThr: 1.833 ± 0.54
ProVal: 0.733 ± 0.221
ProTrp: 0.0 ± 0.0
ProTyr: 0.856 ± 0.397
ProXaa: 0.0 ± 0.0
Gln
GlnAla: 5.133 ± 0.846
GlnCys: 0.367 ± 0.23
GlnAsp: 2.2 ± 0.42
GlnGlu: 4.767 ± 1.157
GlnPhe: 1.589 ± 0.38
GlnGly: 1.833 ± 0.636
GlnHis: 0.244 ± 0.168
GlnIle: 4.155 ± 1.024
GlnLys: 5.744 ± 1.144
GlnLeu: 2.933 ± 0.632
GlnMet: 0.611 ± 0.275
GlnAsn: 4.155 ± 0.763
GlnPro: 0.611 ± 0.262
GlnGln: 3.544 ± 0.84
GlnArg: 1.344 ± 0.425
GlnSer: 2.933 ± 0.68
GlnThr: 2.322 ± 0.515
GlnVal: 2.2 ± 0.508
GlnTrp: 0.244 ± 0.124
GlnTyr: 1.1 ± 0.41
GlnXaa: 0.0 ± 0.0
Arg
ArgAla: 3.544 ± 0.563
ArgCys: 0.122 ± 0.114
ArgAsp: 2.567 ± 0.435
ArgGlu: 2.933 ± 0.62
ArgPhe: 2.078 ± 0.391
ArgGly: 1.344 ± 0.32
ArgHis: 0.856 ± 0.294
ArgIle: 2.2 ± 0.551
ArgLys: 3.422 ± 0.661
ArgLeu: 5.867 ± 0.605
ArgMet: 0.733 ± 0.32
ArgAsn: 2.078 ± 0.625
ArgPro: 1.1 ± 0.422
ArgGln: 2.078 ± 0.736
ArgArg: 0.856 ± 0.404
ArgSer: 3.544 ± 0.929
ArgThr: 1.956 ± 0.449
ArgVal: 1.589 ± 0.34
ArgTrp: 0.244 ± 0.13
ArgTyr: 1.589 ± 0.508
ArgXaa: 0.0 ± 0.0
Ser
SerAla: 2.689 ± 0.544
SerCys: 0.367 ± 0.283
SerAsp: 5.255 ± 0.737
SerGlu: 6.233 ± 0.991
SerPhe: 4.033 ± 0.893
SerGly: 3.789 ± 0.473
SerHis: 0.489 ± 0.234
SerIle: 3.422 ± 0.7
SerLys: 6.478 ± 0.917
SerLeu: 8.922 ± 1.083
SerMet: 1.222 ± 0.38
SerAsn: 4.278 ± 0.524
SerPro: 1.467 ± 0.408
SerGln: 2.567 ± 0.464
SerArg: 1.344 ± 0.304
SerSer: 2.811 ± 0.796
SerThr: 2.567 ± 0.705
SerVal: 5.255 ± 0.929
SerTrp: 0.489 ± 0.241
SerTyr: 3.055 ± 0.466
SerXaa: 0.0 ± 0.0
Thr
ThrAla: 2.2 ± 0.536
ThrCys: 0.367 ± 0.209
ThrAsp: 1.833 ± 0.352
ThrGlu: 3.055 ± 0.39
ThrPhe: 0.978 ± 0.361
ThrGly: 2.567 ± 0.549
ThrHis: 1.344 ± 0.478
ThrIle: 4.522 ± 0.858
ThrLys: 4.644 ± 0.805
ThrLeu: 4.155 ± 0.773
ThrMet: 1.1 ± 0.419
ThrAsn: 3.3 ± 0.583
ThrPro: 2.811 ± 0.562
ThrGln: 3.055 ± 0.671
ThrArg: 1.589 ± 0.546
ThrSer: 3.667 ± 0.777
ThrThr: 3.789 ± 0.753
ThrVal: 0.611 ± 0.276
ThrTrp: 0.611 ± 0.261
ThrTyr: 1.833 ± 0.509
ThrXaa: 0.0 ± 0.0
Val
ValAla: 2.444 ± 0.553
ValCys: 0.244 ± 0.195
ValAsp: 2.322 ± 0.447
ValGlu: 2.2 ± 0.563
ValPhe: 1.589 ± 0.517
ValGly: 2.811 ± 0.732
ValHis: 0.244 ± 0.189
ValIle: 3.178 ± 0.655
ValLys: 4.522 ± 0.843
ValLeu: 5.011 ± 0.93
ValMet: 0.733 ± 0.339
ValAsn: 2.444 ± 0.42
ValPro: 0.611 ± 0.265
ValGln: 0.856 ± 0.374
ValArg: 1.833 ± 0.57
ValSer: 4.644 ± 0.875
ValThr: 1.1 ± 0.337
ValVal: 2.567 ± 0.597
ValTrp: 0.244 ± 0.157
ValTyr: 1.1 ± 0.454
ValXaa: 0.0 ± 0.0
Trp
TrpAla: 0.0 ± 0.0
TrpCys: 0.0 ± 0.0
TrpAsp: 0.0 ± 0.0
TrpGlu: 0.856 ± 0.414
TrpPhe: 0.0 ± 0.0
TrpGly: 0.244 ± 0.185
TrpHis: 0.122 ± 0.114
TrpIle: 0.611 ± 0.262
TrpLys: 0.122 ± 0.114
TrpLeu: 0.244 ± 0.194
TrpMet: 0.244 ± 0.165
TrpAsn: 0.489 ± 0.35
TrpPro: 0.122 ± 0.119
TrpGln: 0.122 ± 0.102
TrpArg: 0.244 ± 0.166
TrpSer: 0.611 ± 0.334
TrpThr: 0.244 ± 0.152
TrpVal: 0.489 ± 0.224
TrpTrp: 0.0 ± 0.0
TrpTyr: 0.244 ± 0.235
TrpXaa: 0.0 ± 0.0
Tyr
TyrAla: 2.2 ± 0.562
TyrCys: 0.367 ± 0.348
TyrAsp: 1.711 ± 0.556
TyrGlu: 3.3 ± 0.566
TyrPhe: 1.833 ± 0.627
TyrGly: 0.611 ± 0.215
TyrHis: 1.1 ± 0.362
TyrIle: 2.444 ± 0.604
TyrLys: 4.278 ± 0.721
TyrLeu: 3.789 ± 0.696
TyrMet: 0.611 ± 0.302
TyrAsn: 3.3 ± 0.662
TyrPro: 0.611 ± 0.249
TyrGln: 2.567 ± 0.624
TyrArg: 1.711 ± 0.433
TyrSer: 2.689 ± 0.529
TyrThr: 1.467 ± 0.473
TyrVal: 0.244 ± 0.17
TyrTrp: 0.122 ± 0.126
TyrTyr: 1.833 ± 0.613
TyrXaa: 0.0 ± 0.0
Xaa
XaaAla: 0.0 ± 0.0
XaaCys: 0.0 ± 0.0
XaaAsp: 0.0 ± 0.0
XaaGlu: 0.0 ± 0.0
XaaPhe: 0.0 ± 0.0
XaaGly: 0.0 ± 0.0
XaaHis: 0.0 ± 0.0
XaaIle: 0.0 ± 0.0
XaaLys: 0.0 ± 0.0
XaaLeu: 0.0 ± 0.0
XaaMet: 0.0 ± 0.0
XaaAsn: 0.0 ± 0.0
XaaPro: 0.0 ± 0.0
XaaGln: 0.0 ± 0.0
XaaArg: 0.0 ± 0.0
XaaSer: 0.0 ± 0.0
XaaThr: 0.0 ± 0.0
XaaVal: 0.0 ± 0.0
XaaTrp: 0.0 ± 0.0
XaaTyr: 0.0 ± 0.0
XaaXaa: 0.0 ± 0.0
Statistics based on 30 proteins (8183 amino acids)

Note: The error was estimated by bootstrapping (100 replicates) at the protein level.
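One way to read "bootstrapping at the protein level" is sketched below in Python: whole proteins are resampled with replacement, the frequency is recomputed for each resampled set, and the spread of those estimates is reported as the error. The function names, the use of the standard deviation as the error measure, and the fixed seed are assumptions for this example; the page does not specify these details.

```python
import random
import statistics
from collections import Counter


def pair_frequency(proteins, pair):
    """Per mille frequency of a one-letter-coded dipeptide (e.g. "LK")."""
    counts = Counter()
    for seq in proteins:
        # Overlapping pairs within one protein; never across protein boundaries.
        counts.update(a + b for a, b in zip(seq, seq[1:]))
    total = sum(counts.values())
    return 1000.0 * counts[pair] / total if total else 0.0


def bootstrap_error(proteins, pair, replicates=100, seed=0):
    """Standard deviation of the frequency over protein-level bootstrap replicates."""
    rng = random.Random(seed)
    estimates = []
    for _ in range(replicates):
        # Resample whole proteins with replacement, keeping the sample size fixed.
        sample = rng.choices(proteins, k=len(proteins))
        estimates.append(pair_frequency(sample, pair))
    return statistics.stdev(estimates)


# Hypothetical toy sequences, not taken from the KHP30 proteome.
toy = ["MKLLKA", "AKLKLK", "GGSGGS", "LKLKAA"]
error = bootstrap_error(toy, "LK")
```

Resampling proteins rather than individual residues keeps each sequence intact, so pairs are always drawn from real neighbouring positions.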

The dipeptide statistics above (along with other statistics for this proteome) can be downloaded from this CSV file
See this proteome in: uniprot_link
Proteome-pI is available under the Creative Commons Attribution-NoDerivs license; for more details see here

Reference: Kozlowski LP. Proteome-pI 2.0: Proteome Isoelectric Point Database Update. Nucleic Acids Res. 2021, doi: 10.1093/nar/gkab944
Contact: Lukasz P. Kozlowski