Amino acid dipeptide frequency for Bacillus phage PBC4

Apart from single amino acid frequencies, one can also calculate the so-called amino acid dipeptide frequency, that is, how often each ordered pair of adjacent amino acids occurs.

There are 400 possible dipeptides (441 if Xaa is counted as an additional, 21st amino acid representing all non-standard or unknown amino acids). If dipeptides occurred in proteins at random with equal probability, each of the 400 would therefore have a frequency of 0.25% (2.5‰). This is not the case in nature, so for better visibility the dipeptides that occur more often than expected are marked in red in the table, and the underrepresented ones are marked in blue.
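The counting described above can be sketched in Python as follows. This is a minimal illustration, not the database's own pipeline: the function names `dipeptide_per_mille` and `classify` are hypothetical, dividing by the total number of dipeptides is an assumption, and so is the use of the uniform 2.5‰ value as the "expected" threshold.

```python
from collections import Counter
from itertools import product

# One-letter to three-letter codes, in the column order used by the table.
ONE_TO_THREE = {
    "A": "Ala", "C": "Cys", "D": "Asp", "E": "Glu", "F": "Phe",
    "G": "Gly", "H": "His", "I": "Ile", "K": "Lys", "L": "Leu",
    "M": "Met", "N": "Asn", "P": "Pro", "Q": "Gln", "R": "Arg",
    "S": "Ser", "T": "Thr", "V": "Val", "W": "Trp", "Y": "Tyr",
}

# Uniform expectation over the 400 standard dipeptides: 1/400 = 2.5 per mille.
UNIFORM_EXPECTATION = 1000.0 / 400


def three_letter(residue):
    """Map a one-letter code to three letters; anything else becomes Xaa."""
    return ONE_TO_THREE.get(residue.upper(), "Xaa")


def dipeptide_per_mille(proteins):
    """Return {dipeptide: frequency in per mille} for all 441 ordered pairs.

    Assumption: overlapping pairs are counted within each protein only,
    and the denominator is the total number of counted pairs.
    """
    counts = Counter()
    for seq in proteins:
        codes = [three_letter(r) for r in seq]
        counts.update(a + b for a, b in zip(codes, codes[1:]))
    total = sum(counts.values())
    names = list(ONE_TO_THREE.values()) + ["Xaa"]
    return {
        a + b: (1000.0 * counts[a + b] / total if total else 0.0)
        for a, b in product(names, repeat=2)
    }


def classify(freq):
    """Label a per mille frequency relative to the uniform expectation."""
    if freq > UNIFORM_EXPECTATION:
        return "over"
    if freq < UNIFORM_EXPECTATION:
        return "under"
    return "expected"


# Toy input: the pairs are MK, KK, KL (first protein) and AK, KK (second).
freqs = dipeptide_per_mille(["MKKL", "AKK"])
print(len(freqs))               # 441
print(freqs["LysLys"])          # 400.0 (2 of 5 pairs)
print(classify(5.096))          # over  (AlaAla value from the table)
print(classify(0.71))           # under (AlaCys value from the table)
```

With real data, `proteins` would be the list of protein sequences of the proteome, for example read from a FASTA file.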

All values are given in per mille (‰); to obtain fractions, multiply them by 10^-3.
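As a small worked example of this unit conversion, the helpers below (hypothetical names, not part of the database) turn a per mille value into a fraction or a percentage. The AlaAla value 5.096‰ is taken from the table.

```python
def per_mille_to_fraction(value):
    """Convert a per mille value to a plain fraction (multiply by 10^-3)."""
    return value * 1e-3


def per_mille_to_percent(value):
    """Convert a per mille value to percent (1% equals 10 per mille)."""
    return value / 10.0


ala_ala = 5.096  # AlaAla frequency in per mille, from the table
print(per_mille_to_fraction(ala_ala))  # about 0.005096
print(per_mille_to_percent(ala_ala))   # about 0.5096 (percent)
print(per_mille_to_percent(2.5))       # 0.25, the uniform expectation in percent
```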

For more information, see the sequence space article on Wikipedia.

Residue order (first residue = row, second residue = column): Ala Cys Asp Glu Phe Gly His Ile Lys Leu Met Asn Pro Gln Arg Ser Thr Val Trp Tyr Xaa
Ala
AlaAla: 5.096 ± 0.708
AlaCys: 0.71 ± 0.173
AlaAsp: 3.133 ± 0.431
AlaGlu: 4.302 ± 0.645
AlaPhe: 2.297 ± 0.281
AlaGly: 4.637 ± 0.727
AlaHis: 0.71 ± 0.176
AlaIle: 4.135 ± 0.606
AlaLys: 5.723 ± 0.453
AlaLeu: 4.845 ± 0.373
AlaMet: 1.462 ± 0.297
AlaAsn: 3.801 ± 0.497
AlaPro: 1.295 ± 0.212
AlaGln: 2.047 ± 0.331
AlaArg: 1.671 ± 0.254
AlaSer: 3.008 ± 0.428
AlaThr: 3.718 ± 0.551
AlaVal: 3.885 ± 0.374
AlaTrp: 0.794 ± 0.148
AlaTyr: 2.673 ± 0.319
AlaXaa: 0.0 ± 0.0
Cys
CysAla: 0.376 ± 0.119
CysCys: 0.0 ± 0.0
CysAsp: 0.919 ± 0.228
CysGlu: 0.668 ± 0.175
CysPhe: 0.418 ± 0.112
CysGly: 0.501 ± 0.125
CysHis: 0.209 ± 0.097
CysIle: 0.71 ± 0.192
CysLys: 0.919 ± 0.219
CysLeu: 0.459 ± 0.142
CysMet: 0.668 ± 0.192
CysAsn: 0.501 ± 0.172
CysPro: 0.543 ± 0.169
CysGln: 0.292 ± 0.132
CysArg: 0.292 ± 0.117
CysSer: 0.418 ± 0.155
CysThr: 0.501 ± 0.155
CysVal: 0.376 ± 0.152
CysTrp: 0.042 ± 0.04
CysTyr: 0.292 ± 0.135
CysXaa: 0.0 ± 0.0
Asp
AspAla: 3.216 ± 0.478
AspCys: 0.585 ± 0.188
AspAsp: 3.718 ± 0.412
AspGlu: 5.054 ± 0.658
AspPhe: 3.551 ± 0.49
AspGly: 3.801 ± 0.386
AspHis: 1.17 ± 0.249
AspIle: 5.054 ± 0.611
AspLys: 5.806 ± 0.568
AspLeu: 5.43 ± 0.418
AspMet: 1.963 ± 0.222
AspAsn: 3.718 ± 0.462
AspPro: 1.629 ± 0.291
AspGln: 1.42 ± 0.201
AspArg: 2.59 ± 0.516
AspSer: 3.509 ± 0.448
AspThr: 3.049 ± 0.442
AspVal: 4.094 ± 0.339
AspTrp: 0.919 ± 0.191
AspTyr: 3.342 ± 0.394
AspXaa: 0.0 ± 0.0
Glu
GluAla: 4.01 ± 0.52
GluCys: 0.961 ± 0.23
GluAsp: 3.968 ± 0.486
GluGlu: 7.185 ± 0.821
GluPhe: 3.467 ± 0.429
GluGly: 4.177 ± 0.441
GluHis: 1.462 ± 0.323
GluIle: 6.099 ± 0.66
GluLys: 7.644 ± 0.737
GluLeu: 7.101 ± 0.639
GluMet: 2.715 ± 0.357
GluAsn: 4.553 ± 0.527
GluPro: 1.754 ± 0.334
GluGln: 3.718 ± 0.548
GluArg: 3.216 ± 0.389
GluSer: 3.885 ± 0.409
GluThr: 4.511 ± 0.547
GluVal: 5.723 ± 0.638
GluTrp: 0.835 ± 0.205
GluTyr: 3.133 ± 0.369
GluXaa: 0.0 ± 0.0
Phe
PheAla: 3.216 ± 0.41
PheCys: 0.501 ± 0.172
PheAsp: 3.509 ± 0.416
PheGlu: 3.509 ± 0.401
PhePhe: 1.671 ± 0.354
PheGly: 2.799 ± 0.55
PheHis: 0.585 ± 0.141
PheIle: 2.673 ± 0.357
PheLys: 3.843 ± 0.438
PheLeu: 3.258 ± 0.472
PheMet: 1.128 ± 0.219
PheAsn: 2.757 ± 0.337
PhePro: 1.044 ± 0.228
PheGln: 1.044 ± 0.196
PheArg: 1.671 ± 0.276
PheSer: 2.757 ± 0.321
PheThr: 3.634 ± 0.379
PheVal: 2.757 ± 0.313
PheTrp: 0.668 ± 0.177
PheTyr: 1.838 ± 0.245
PheXaa: 0.0 ± 0.0
Gly
GlyAla: 3.718 ± 0.629
GlyCys: 0.794 ± 0.195
GlyAsp: 3.008 ± 0.339
GlyGlu: 4.428 ± 0.47
GlyPhe: 3.216 ± 0.355
GlyGly: 4.762 ± 0.796
GlyHis: 1.337 ± 0.277
GlyIle: 4.052 ± 0.379
GlyLys: 6.015 ± 0.469
GlyLeu: 5.096 ± 0.475
GlyMet: 1.754 ± 0.427
GlyAsn: 3.759 ± 0.506
GlyPro: 0.042 ± 0.037
GlyGln: 3.216 ± 0.944
GlyArg: 2.548 ± 0.294
GlySer: 4.511 ± 0.492
GlyThr: 4.553 ± 0.656
GlyVal: 4.637 ± 0.524
GlyTrp: 0.752 ± 0.188
GlyTyr: 3.133 ± 0.339
GlyXaa: 0.0 ± 0.0
His
HisAla: 0.668 ± 0.15
HisCys: 0.125 ± 0.064
HisAsp: 1.546 ± 0.306
HisGlu: 1.546 ± 0.321
HisPhe: 0.835 ± 0.176
HisGly: 1.003 ± 0.232
HisHis: 0.251 ± 0.106
HisIle: 1.253 ± 0.239
HisLys: 1.42 ± 0.273
HisLeu: 1.671 ± 0.287
HisMet: 0.501 ± 0.148
HisAsn: 1.128 ± 0.223
HisPro: 0.585 ± 0.153
HisGln: 0.585 ± 0.156
HisArg: 0.418 ± 0.111
HisSer: 0.752 ± 0.178
HisThr: 0.835 ± 0.14
HisVal: 1.211 ± 0.239
HisTrp: 0.251 ± 0.105
HisTyr: 0.794 ± 0.2
HisXaa: 0.0 ± 0.0
Ile
IleAla: 4.177 ± 0.422
IleCys: 0.752 ± 0.228
IleAsp: 6.015 ± 0.604
IleGlu: 6.725 ± 0.603
IlePhe: 2.256 ± 0.294
IleGly: 4.177 ± 0.406
IleHis: 1.713 ± 0.331
IleIle: 4.302 ± 0.483
IleLys: 7.101 ± 0.688
IleLeu: 3.467 ± 0.343
IleMet: 1.921 ± 0.283
IleAsn: 3.968 ± 0.463
IlePro: 2.005 ± 0.254
IleGln: 3.049 ± 0.327
IleArg: 2.423 ± 0.294
IleSer: 3.509 ± 0.421
IleThr: 4.595 ± 0.47
IleVal: 4.678 ± 0.399
IleTrp: 0.376 ± 0.115
IleTyr: 2.548 ± 0.311
IleXaa: 0.0 ± 0.0
Lys
LysAla: 5.43 ± 0.553
LysCys: 0.627 ± 0.16
LysAsp: 5.096 ± 0.417
LysGlu: 9.148 ± 0.929
LysPhe: 3.843 ± 0.403
LysGly: 5.514 ± 0.512
LysHis: 1.462 ± 0.291
LysIle: 7.059 ± 0.516
LysLys: 7.644 ± 0.673
LysLeu: 8.02 ± 0.591
LysMet: 3.008 ± 0.38
LysAsn: 4.845 ± 0.494
LysPro: 2.673 ± 0.293
LysGln: 2.84 ± 0.395
LysArg: 3.216 ± 0.416
LysSer: 3.843 ± 0.387
LysThr: 5.138 ± 0.474
LysVal: 6.516 ± 0.697
LysTrp: 1.044 ± 0.168
LysTyr: 3.759 ± 0.418
LysXaa: 0.0 ± 0.0
Leu
LeuAla: 5.221 ± 0.356
LeuCys: 0.501 ± 0.159
LeuAsp: 5.388 ± 0.465
LeuGlu: 6.349 ± 0.589
LeuPhe: 2.548 ± 0.315
LeuGly: 5.221 ± 0.573
LeuHis: 1.337 ± 0.236
LeuIle: 5.096 ± 0.503
LeuLys: 7.602 ± 0.63
LeuLeu: 5.096 ± 0.556
LeuMet: 1.963 ± 0.29
LeuAsn: 5.263 ± 0.469
LeuPro: 2.757 ± 0.338
LeuGln: 2.757 ± 0.377
LeuArg: 2.757 ± 0.292
LeuSer: 3.718 ± 0.428
LeuThr: 4.386 ± 0.386
LeuVal: 3.843 ± 0.409
LeuTrp: 1.086 ± 0.207
LeuTyr: 2.924 ± 0.364
LeuXaa: 0.0 ± 0.0
Met
MetAla: 2.172 ± 0.284
MetCys: 0.042 ± 0.04
MetAsp: 1.378 ± 0.241
MetGlu: 2.172 ± 0.349
MetPhe: 1.587 ± 0.232
MetGly: 1.546 ± 0.418
MetHis: 0.251 ± 0.098
MetIle: 1.838 ± 0.312
MetLys: 3.509 ± 0.35
MetLeu: 2.172 ± 0.273
MetMet: 1.086 ± 0.206
MetAsn: 2.13 ± 0.376
MetPro: 0.627 ± 0.172
MetGln: 0.835 ± 0.253
MetArg: 1.086 ± 0.228
MetSer: 2.256 ± 0.388
MetThr: 1.587 ± 0.251
MetVal: 2.089 ± 0.288
MetTrp: 0.292 ± 0.146
MetTyr: 0.919 ± 0.185
MetXaa: 0.0 ± 0.0
Asn
AsnAla: 3.551 ± 0.445
AsnCys: 0.543 ± 0.187
AsnAsp: 4.135 ± 0.385
AsnGlu: 4.637 ± 0.612
AsnPhe: 2.297 ± 0.35
AsnGly: 4.971 ± 0.793
AsnHis: 1.003 ± 0.179
AsnIle: 4.386 ± 0.394
AsnLys: 5.18 ± 0.476
AsnLeu: 4.762 ± 0.424
AsnMet: 1.754 ± 0.265
AsnAsn: 4.511 ± 0.471
AsnPro: 2.464 ± 0.481
AsnGln: 2.464 ± 0.474
AsnArg: 2.59 ± 0.346
AsnSer: 3.425 ± 0.542
AsnThr: 2.297 ± 0.344
AsnVal: 4.261 ± 0.454
AsnTrp: 0.585 ± 0.146
AsnTyr: 2.757 ± 0.402
AsnXaa: 0.0 ± 0.0
Pro
ProAla: 2.005 ± 0.27
ProCys: 0.209 ± 0.101
ProAsp: 1.838 ± 0.371
ProGlu: 2.506 ± 0.308
ProPhe: 1.88 ± 0.424
ProGly: 0.042 ± 0.039
ProHis: 0.501 ± 0.135
ProIle: 1.838 ± 0.251
ProLys: 2.464 ± 0.385
ProLeu: 1.796 ± 0.281
ProMet: 1.044 ± 0.234
ProAsn: 2.13 ± 0.42
ProPro: 0.376 ± 0.118
ProGln: 1.211 ± 0.266
ProArg: 0.459 ± 0.112
ProSer: 1.587 ± 0.27
ProThr: 1.671 ± 0.263
ProVal: 1.963 ± 0.31
ProTrp: 0.167 ± 0.066
ProTyr: 1.044 ± 0.228
ProXaa: 0.0 ± 0.0
Gln
GlnAla: 1.921 ± 0.343
GlnCys: 0.459 ± 0.165
GlnAsp: 1.546 ± 0.195
GlnGlu: 2.757 ± 0.452
GlnPhe: 1.42 ± 0.254
GlnGly: 3.049 ± 0.648
GlnHis: 0.627 ± 0.165
GlnIle: 2.172 ± 0.317
GlnLys: 2.381 ± 0.415
GlnLeu: 3.467 ± 0.377
GlnMet: 1.754 ± 0.311
GlnAsn: 2.381 ± 0.354
GlnPro: 1.211 ± 0.459
GlnGln: 2.464 ± 1.274
GlnArg: 1.587 ± 0.242
GlnSer: 1.295 ± 0.234
GlnThr: 2.13 ± 0.532
GlnVal: 2.089 ± 0.259
GlnTrp: 0.376 ± 0.136
GlnTyr: 1.211 ± 0.196
GlnXaa: 0.0 ± 0.0
Arg
ArgAla: 1.796 ± 0.29
ArgCys: 0.167 ± 0.08
ArgAsp: 2.464 ± 0.337
ArgGlu: 2.172 ± 0.354
ArgPhe: 2.047 ± 0.26
ArgGly: 2.464 ± 0.305
ArgHis: 0.835 ± 0.19
ArgIle: 2.423 ± 0.324
ArgLys: 3.759 ± 0.5
ArgLeu: 2.715 ± 0.338
ArgMet: 1.128 ± 0.217
ArgAsn: 2.005 ± 0.323
ArgPro: 1.003 ± 0.248
ArgGln: 0.961 ± 0.19
ArgArg: 1.629 ± 0.25
ArgSer: 2.089 ± 0.261
ArgThr: 2.13 ± 0.298
ArgVal: 2.13 ± 0.31
ArgTrp: 0.459 ± 0.13
ArgTyr: 1.754 ± 0.301
ArgXaa: 0.0 ± 0.0
Ser
SerAla: 3.467 ± 0.549
SerCys: 0.543 ± 0.163
SerAsp: 2.799 ± 0.38
SerGlu: 3.425 ± 0.4
SerPhe: 3.3 ± 0.406
SerGly: 4.971 ± 0.779
SerHis: 0.877 ± 0.201
SerIle: 3.718 ± 0.45
SerLys: 4.302 ± 0.379
SerLeu: 4.052 ± 0.395
SerMet: 1.504 ± 0.322
SerAsn: 3.049 ± 0.304
SerPro: 1.295 ± 0.316
SerGln: 1.629 ± 0.277
SerArg: 1.546 ± 0.277
SerSer: 3.676 ± 0.44
SerThr: 2.799 ± 0.39
SerVal: 2.715 ± 0.315
SerTrp: 1.044 ± 0.199
SerTyr: 2.84 ± 0.385
SerXaa: 0.0 ± 0.0
Thr
ThrAla: 3.843 ± 0.727
ThrCys: 0.292 ± 0.119
ThrAsp: 3.759 ± 0.384
ThrGlu: 3.509 ± 0.42
ThrPhe: 3.049 ± 0.331
ThrGly: 4.637 ± 0.511
ThrHis: 0.835 ± 0.182
ThrIle: 4.511 ± 0.57
ThrLys: 4.344 ± 0.4
ThrLeu: 4.01 ± 0.489
ThrMet: 1.378 ± 0.241
ThrAsn: 3.843 ± 0.467
ThrPro: 2.339 ± 0.493
ThrGln: 2.13 ± 0.289
ThrArg: 1.587 ± 0.268
ThrSer: 3.383 ± 0.578
ThrThr: 4.094 ± 0.621
ThrVal: 4.344 ± 0.513
ThrTrp: 0.376 ± 0.104
ThrTyr: 2.464 ± 0.325
ThrXaa: 0.0 ± 0.0
Val
ValAla: 3.258 ± 0.438
ValCys: 0.501 ± 0.144
ValAsp: 4.887 ± 0.519
ValGlu: 5.388 ± 0.558
ValPhe: 2.924 ± 0.353
ValGly: 3.843 ± 0.389
ValHis: 1.003 ± 0.181
ValIle: 4.678 ± 0.491
ValLys: 6.642 ± 0.57
ValLeu: 4.47 ± 0.443
ValMet: 1.546 ± 0.365
ValAsn: 4.261 ± 0.438
ValPro: 2.089 ± 0.245
ValGln: 2.047 ± 0.289
ValArg: 2.59 ± 0.281
ValSer: 2.84 ± 0.258
ValThr: 4.386 ± 0.431
ValVal: 4.302 ± 0.438
ValTrp: 0.961 ± 0.206
ValTyr: 2.548 ± 0.365
ValXaa: 0.0 ± 0.0
Trp
TrpAla: 0.251 ± 0.101
TrpCys: 0.209 ± 0.114
TrpAsp: 0.919 ± 0.214
TrpGlu: 1.086 ± 0.248
TrpPhe: 0.627 ± 0.196
TrpGly: 0.585 ± 0.178
TrpHis: 0.292 ± 0.139
TrpIle: 1.086 ± 0.21
TrpLys: 0.752 ± 0.181
TrpLeu: 0.794 ± 0.159
TrpMet: 0.167 ± 0.074
TrpAsn: 0.835 ± 0.149
TrpPro: 0.042 ± 0.037
TrpGln: 0.376 ± 0.152
TrpArg: 0.585 ± 0.147
TrpSer: 0.543 ± 0.155
TrpThr: 0.668 ± 0.175
TrpVal: 0.794 ± 0.165
TrpTrp: 0.042 ± 0.045
TrpTyr: 0.752 ± 0.198
TrpXaa: 0.0 ± 0.0
Tyr
TyrAla: 2.548 ± 0.32
TyrCys: 0.585 ± 0.194
TyrAsp: 3.592 ± 0.468
TyrGlu: 3.551 ± 0.415
TyrPhe: 1.629 ± 0.28
TyrGly: 2.632 ± 0.345
TyrHis: 0.919 ± 0.165
TyrIle: 2.673 ± 0.384
TyrLys: 3.676 ± 0.491
TyrLeu: 3.133 ± 0.416
TyrMet: 1.086 ± 0.24
TyrAsn: 3.133 ± 0.296
TyrPro: 0.961 ± 0.21
TyrGln: 1.211 ± 0.171
TyrArg: 1.671 ± 0.238
TyrSer: 2.59 ± 0.348
TyrThr: 2.047 ± 0.282
TyrVal: 2.757 ± 0.324
TyrTrp: 0.334 ± 0.106
TyrTyr: 2.464 ± 0.405
TyrXaa: 0.0 ± 0.0
Xaa
XaaAla: 0.0 ± 0.0
XaaCys: 0.0 ± 0.0
XaaAsp: 0.0 ± 0.0
XaaGlu: 0.0 ± 0.0
XaaPhe: 0.0 ± 0.0
XaaGly: 0.0 ± 0.0
XaaHis: 0.0 ± 0.0
XaaIle: 0.0 ± 0.0
XaaLys: 0.0 ± 0.0
XaaLeu: 0.0 ± 0.0
XaaMet: 0.0 ± 0.0
XaaAsn: 0.0 ± 0.0
XaaPro: 0.0 ± 0.0
XaaGln: 0.0 ± 0.0
XaaArg: 0.0 ± 0.0
XaaSer: 0.0 ± 0.0
XaaThr: 0.0 ± 0.0
XaaVal: 0.0 ± 0.0
XaaTrp: 0.0 ± 0.0
XaaTyr: 0.0 ± 0.0
XaaXaa: 0.0 ± 0.0
Statistics based on 123 proteins (23941 amino acids)

Note: The error was estimated by bootstrapping (100 replicates) at the protein level.
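A protein-level bootstrap of this kind can be sketched as follows. This is only an illustration under stated assumptions: whole proteins are resampled with replacement, the frequency is recomputed for each of the 100 replicates, and the sample standard deviation of the replicates is reported as the error. The function names, the fixed seed, and the choice of standard deviation are assumptions, not details confirmed by the database.

```python
import random
import statistics


def dipeptide_frequency(proteins, first, second):
    """Per mille frequency of the one-letter pair first+second.

    Overlapping pairs are counted within each protein only.
    """
    hits = 0
    total = 0
    for seq in proteins:
        for a, b in zip(seq, seq[1:]):
            total += 1
            if a == first and b == second:
                hits += 1
    return 1000.0 * hits / total if total else 0.0


def bootstrap_error(proteins, first, second, replicates=100, seed=0):
    """Estimate the error of a dipeptide frequency by protein-level bootstrap."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    estimates = []
    for _ in range(replicates):
        # Resample whole proteins with replacement, keeping the proteome size.
        sample = rng.choices(proteins, k=len(proteins))
        estimates.append(dipeptide_frequency(sample, first, second))
    return statistics.stdev(estimates)


# Toy proteome; real input would be the 123 protein sequences.
toy = ["MKKLAK", "AKKE", "GGSKK", "MLLA"]
mean = dipeptide_frequency(toy, "K", "K")
error = bootstrap_error(toy, "K", "K")
print(f"LysLys: {mean:.3f} ± {error:.3f}")
```

Resampling proteins rather than individual residues keeps each sequence intact, so the estimate reflects variation between proteins.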

The dipeptide statistics above (along with other statistics for this proteome) can be downloaded from this CSV file
See this proteome in: uniprot_link
Proteome-pI is available under the Creative Commons Attribution-NoDerivs license; for more details see here

Reference: Kozlowski LP. Proteome-pI 2.0: Proteome Isoelectric Point Database Update. Nucleic Acids Res. 2021, doi: 10.1093/nar/gkab944
Contact: Lukasz P. Kozlowski