Amino acid dipeptide frequency for Streptomyces phage Yaboi

Apart from single amino acid frequencies, one can also calculate the so-called amino acid dipeptide frequency, that is, the frequency of each ordered pair of adjacent residues.
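As a minimal sketch of what such a calculation could look like, the Python snippet below counts overlapping residue pairs in a list of protein sequences and reports them in per mille. The function name `dipeptide_per_mille`, the one-letter sequence encoding, and the toy input are assumptions made for illustration. The normalisation (dividing by the total number of pairs) is also an assumption; the page does not state which denominator it uses.

```python
from collections import Counter

def dipeptide_per_mille(sequences):
    """Return {dipeptide: frequency in per mille} over overlapping residue pairs.

    Hypothetical helper: normalises by the total number of pairs counted,
    which may differ from the normalisation used by the database.
    """
    counts = Counter()
    for seq in sequences:
        # Slide a window of width 2 along each protein; pairs never span two proteins.
        for i in range(len(seq) - 1):
            counts[seq[i:i + 2]] += 1
    total = sum(counts.values())
    if total == 0:
        return {}
    return {pair: 1000.0 * n / total for pair, n in counts.items()}

# Toy input in one-letter code, not taken from the Yaboi proteome.
toy = ["MAAK", "AAG"]
freqs = dipeptide_per_mille(toy)
```

Because the pairs are ordered, "AK" and "KA" are counted separately, which matches the table where, for example, AlaCys and CysAla have different values.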

There are 400 possible dipeptides (441 if Xaa is counted as an additional 21st amino acid representing all non-standard or unknown amino acids). Thus, if all 400 standard dipeptides occurred with equal probability in proteins, each would be present at a frequency of 0.25% (2.5‰). As this is not the case in nature, for better visibility all dipeptides that are more frequent than expected are marked in red, and those that are underrepresented are marked in blue in the table.
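The uniform expectation and the red/blue marking can be illustrated with a short Python sketch. The `classify` function and its simple above/below rule are assumptions; the page does not say whether its colouring also takes the error estimate into account. The three example values are copied from the table.

```python
STANDARD_AMINO_ACIDS = 20

# Uniform expectation: 1 / 400 of all dipeptides, expressed in per mille.
expected_per_mille = 1000.0 / (STANDARD_AMINO_ACIDS ** 2)  # 2.5

def classify(observed_per_mille, expected=expected_per_mille):
    """Hypothetical colouring rule: plain comparison with the uniform expectation."""
    if observed_per_mille > expected:
        return "red"   # more frequent than expected
    if observed_per_mille < expected:
        return "blue"  # underrepresented
    return "neutral"

# Values in per mille taken from the table on this page.
observed = {"AlaAla": 7.612, "CysCys": 0.207, "ValPhe": 2.508}
labels = {pair: classify(value) for pair, value in observed.items()}
```

Under this rule AlaAla and ValPhe would be marked red and CysCys blue.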

All values are presented in per mille (‰) and therefore need to be multiplied by 10⁻³ to obtain fractions.
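For readers who prefer fractions or percentages, the conversion is a single multiplication. The helper names below are assumptions introduced for this sketch.

```python
def per_mille_to_fraction(value):
    """Convert a per mille value to a plain fraction (multiply by 10^-3)."""
    return value * 1e-3

def per_mille_to_percent(value):
    """Convert a per mille value to percent (1 per mille = 0.1 %)."""
    return value * 0.1

# AlaAla is listed as 7.612 per mille in the table.
ala_ala_fraction = per_mille_to_fraction(7.612)
ala_ala_percent = per_mille_to_percent(7.612)
```

With these helpers, the 0.25% uniform expectation corresponds to 2.5‰, so table values can be compared with it directly.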

For more information, see the sequence space article on Wikipedia.

Ala Cys Asp Glu Phe Gly His Ile Lys Leu Met Asn Pro Gln Arg Ser Thr Val Trp Tyr Xaa
Ala
AlaAla: 7.612 ± 1.047
AlaCys: 0.767 ± 0.163
AlaAsp: 4.691 ± 0.375
AlaGlu: 6.373 ± 0.503
AlaPhe: 3.157 ± 0.337
AlaGly: 6.167 ± 0.462
AlaHis: 1.357 ± 0.18
AlaIle: 4.19 ± 0.678
AlaLys: 5.163 ± 0.363
AlaLeu: 6.816 ± 0.6
AlaMet: 2.36 ± 0.301
AlaAsn: 4.249 ± 0.448
AlaPro: 3.01 ± 0.346
AlaGln: 2.833 ± 0.331
AlaArg: 4.809 ± 0.436
AlaSer: 4.308 ± 0.451
AlaThr: 4.603 ± 0.617
AlaVal: 5.518 ± 0.364
AlaTrp: 1.416 ± 0.186
AlaTyr: 3.187 ± 0.331
AlaXaa: 0.0 ± 0.0
Cys
CysAla: 0.826 ± 0.176
CysCys: 0.207 ± 0.093
CysAsp: 0.59 ± 0.149
CysGlu: 0.531 ± 0.144
CysPhe: 0.354 ± 0.111
CysGly: 1.062 ± 0.227
CysHis: 0.266 ± 0.164
CysIle: 0.59 ± 0.139
CysLys: 0.767 ± 0.156
CysLeu: 0.708 ± 0.14
CysMet: 0.443 ± 0.115
CysAsn: 0.561 ± 0.137
CysPro: 0.502 ± 0.138
CysGln: 0.325 ± 0.114
CysArg: 0.738 ± 0.163
CysSer: 0.738 ± 0.16
CysThr: 0.384 ± 0.118
CysVal: 0.679 ± 0.156
CysTrp: 0.236 ± 0.077
CysTyr: 0.325 ± 0.085
CysXaa: 0.0 ± 0.0
Asp
AspAla: 5.34 ± 0.479
AspCys: 0.59 ± 0.128
AspAsp: 3.836 ± 0.378
AspGlu: 4.573 ± 0.442
AspPhe: 3.039 ± 0.349
AspGly: 5.281 ± 0.432
AspHis: 0.856 ± 0.173
AspIle: 3.393 ± 0.338
AspLys: 3.511 ± 0.348
AspLeu: 4.957 ± 0.376
AspMet: 2.419 ± 0.312
AspAsn: 3.039 ± 0.267
AspPro: 2.242 ± 0.287
AspGln: 1.918 ± 0.206
AspArg: 3.246 ± 0.277
AspSer: 4.042 ± 0.351
AspThr: 3.069 ± 0.352
AspVal: 4.367 ± 0.449
AspTrp: 1.682 ± 0.223
AspTyr: 2.39 ± 0.278
AspXaa: 0.0 ± 0.0
Glu
GluAla: 6.432 ± 0.511
GluCys: 0.797 ± 0.173
GluAsp: 4.603 ± 0.427
GluGlu: 5.665 ± 0.484
GluPhe: 3.187 ± 0.304
GluGly: 4.337 ± 0.339
GluHis: 1.357 ± 0.268
GluIle: 4.337 ± 0.308
GluLys: 4.868 ± 0.544
GluLeu: 5.636 ± 0.466
GluMet: 2.124 ± 0.235
GluAsn: 2.892 ± 0.301
GluPro: 2.006 ± 0.299
GluGln: 2.774 ± 0.397
GluArg: 4.072 ± 0.39
GluSer: 3.452 ± 0.35
GluThr: 3.806 ± 0.418
GluVal: 5.163 ± 0.493
GluTrp: 1.298 ± 0.209
GluTyr: 2.921 ± 0.397
GluXaa: 0.0 ± 0.0
Phe
PheAla: 2.685 ± 0.343
PheCys: 0.443 ± 0.103
PheAsp: 2.98 ± 0.276
PheGlu: 3.452 ± 0.284
PhePhe: 1.328 ± 0.221
PheGly: 2.98 ± 0.259
PheHis: 0.59 ± 0.12
PheIle: 1.652 ± 0.224
PheLys: 2.213 ± 0.273
PheLeu: 2.331 ± 0.288
PheMet: 1.062 ± 0.176
PheAsn: 2.213 ± 0.231
PhePro: 1.092 ± 0.173
PheGln: 0.944 ± 0.194
PheArg: 2.183 ± 0.254
PheSer: 2.685 ± 0.363
PheThr: 2.567 ± 0.325
PheVal: 2.685 ± 0.325
PheTrp: 0.443 ± 0.128
PheTyr: 1.564 ± 0.195
PheXaa: 0.0 ± 0.0
Gly
GlyAla: 4.544 ± 0.503
GlyCys: 0.944 ± 0.188
GlyAsp: 4.426 ± 0.442
GlyGlu: 4.455 ± 0.441
GlyPhe: 3.216 ± 0.326
GlyGly: 4.691 ± 0.459
GlyHis: 1.505 ± 0.224
GlyIle: 4.485 ± 0.407
GlyLys: 5.754 ± 0.518
GlyLeu: 5.222 ± 0.457
GlyMet: 2.655 ± 0.259
GlyAsn: 3.6 ± 0.327
GlyPro: 2.065 ± 0.252
GlyGln: 2.036 ± 0.264
GlyArg: 4.426 ± 0.365
GlySer: 4.396 ± 0.415
GlyThr: 5.34 ± 0.695
GlyVal: 5.518 ± 0.363
GlyTrp: 1.564 ± 0.178
GlyTyr: 2.892 ± 0.268
GlyXaa: 0.0 ± 0.0
His
HisAla: 1.092 ± 0.198
HisCys: 0.266 ± 0.084
HisAsp: 1.033 ± 0.178
HisGlu: 1.151 ± 0.179
HisPhe: 0.738 ± 0.156
HisGly: 1.505 ± 0.235
HisHis: 0.531 ± 0.138
HisIle: 0.915 ± 0.12
HisLys: 1.062 ± 0.159
HisLeu: 1.151 ± 0.177
HisMet: 0.413 ± 0.104
HisAsn: 0.797 ± 0.175
HisPro: 0.826 ± 0.149
HisGln: 0.59 ± 0.119
HisArg: 1.121 ± 0.182
HisSer: 0.974 ± 0.157
HisThr: 0.974 ± 0.182
HisVal: 1.18 ± 0.208
HisTrp: 0.295 ± 0.104
HisTyr: 0.679 ± 0.12
HisXaa: 0.0 ± 0.0
Ile
IleAla: 4.337 ± 0.341
IleCys: 0.679 ± 0.163
IleAsp: 4.337 ± 0.373
IleGlu: 4.898 ± 0.44
IlePhe: 1.564 ± 0.205
IleGly: 3.718 ± 0.318
IleHis: 0.826 ± 0.157
IleIle: 2.951 ± 0.281
IleLys: 4.16 ± 0.413
IleLeu: 3.452 ± 0.336
IleMet: 1.387 ± 0.207
IleAsn: 2.744 ± 0.326
IlePro: 1.947 ± 0.262
IleGln: 1.977 ± 0.254
IleArg: 2.951 ± 0.291
IleSer: 2.921 ± 0.28
IleThr: 3.511 ± 0.314
IleVal: 4.101 ± 0.351
IleTrp: 0.856 ± 0.164
IleTyr: 1.652 ± 0.264
IleXaa: 0.0 ± 0.0
Lys
LysAla: 5.429 ± 0.483
LysCys: 0.62 ± 0.145
LysAsp: 3.511 ± 0.395
LysGlu: 3.482 ± 0.385
LysPhe: 2.272 ± 0.241
LysGly: 4.042 ± 0.338
LysHis: 1.121 ± 0.195
LysIle: 4.514 ± 0.359
LysLys: 4.986 ± 0.447
LysLeu: 4.455 ± 0.391
LysMet: 2.213 ± 0.247
LysAsn: 3.836 ± 0.338
LysPro: 2.626 ± 0.335
LysGln: 2.508 ± 0.26
LysArg: 3.836 ± 0.402
LysSer: 3.511 ± 0.297
LysThr: 3.836 ± 0.306
LysVal: 4.337 ± 0.313
LysTrp: 1.298 ± 0.236
LysTyr: 2.537 ± 0.245
LysXaa: 0.0 ± 0.0
Leu
LeuAla: 6.639 ± 0.453
LeuCys: 0.797 ± 0.181
LeuAsp: 5.281 ± 0.409
LeuGlu: 5.813 ± 0.496
LeuPhe: 2.803 ± 0.306
LeuGly: 4.927 ± 0.412
LeuHis: 1.328 ± 0.202
LeuIle: 3.865 ± 0.345
LeuLys: 4.278 ± 0.398
LeuLeu: 5.045 ± 0.475
LeuMet: 2.036 ± 0.293
LeuAsn: 3.246 ± 0.309
LeuPro: 2.98 ± 0.254
LeuGln: 1.682 ± 0.295
LeuArg: 4.396 ± 0.285
LeuSer: 4.898 ± 0.318
LeuThr: 4.927 ± 0.388
LeuVal: 4.16 ± 0.378
LeuTrp: 1.328 ± 0.196
LeuTyr: 2.626 ± 0.268
LeuXaa: 0.0 ± 0.0
Met
MetAla: 2.862 ± 0.282
MetCys: 0.354 ± 0.112
MetAsp: 1.623 ± 0.228
MetGlu: 1.682 ± 0.24
MetPhe: 0.856 ± 0.189
MetGly: 2.036 ± 0.363
MetHis: 0.413 ± 0.104
MetIle: 1.947 ± 0.208
MetLys: 2.095 ± 0.274
MetLeu: 1.77 ± 0.237
MetMet: 0.649 ± 0.141
MetAsn: 1.328 ± 0.175
MetPro: 1.21 ± 0.192
MetGln: 1.151 ± 0.262
MetArg: 1.77 ± 0.238
MetSer: 2.124 ± 0.248
MetThr: 2.419 ± 0.26
MetVal: 1.741 ± 0.21
MetTrp: 0.295 ± 0.083
MetTyr: 0.826 ± 0.145
MetXaa: 0.0 ± 0.0
Asn
AsnAla: 3.452 ± 0.418
AsnCys: 0.502 ± 0.136
AsnAsp: 2.951 ± 0.295
AsnGlu: 3.187 ± 0.303
AsnPhe: 1.623 ± 0.23
AsnGly: 4.337 ± 0.404
AsnHis: 0.944 ± 0.164
AsnIle: 1.829 ± 0.263
AsnLys: 3.482 ± 0.366
AsnLeu: 3.541 ± 0.355
AsnMet: 1.298 ± 0.179
AsnAsn: 1.8 ± 0.265
AsnPro: 2.331 ± 0.223
AsnGln: 1.475 ± 0.222
AsnArg: 2.272 ± 0.256
AsnSer: 2.626 ± 0.32
AsnThr: 2.715 ± 0.312
AsnVal: 3.393 ± 0.363
AsnTrp: 1.121 ± 0.191
AsnTyr: 1.947 ± 0.229
AsnXaa: 0.0 ± 0.0
Pro
ProAla: 3.216 ± 0.334
ProCys: 0.266 ± 0.084
ProAsp: 2.715 ± 0.374
ProGlu: 2.715 ± 0.273
ProPhe: 1.623 ± 0.215
ProGly: 3.098 ± 0.338
ProHis: 0.62 ± 0.127
ProIle: 1.711 ± 0.199
ProLys: 1.977 ± 0.253
ProLeu: 2.508 ± 0.265
ProMet: 0.915 ± 0.178
ProAsn: 1.682 ± 0.217
ProPro: 1.121 ± 0.262
ProGln: 0.885 ± 0.156
ProArg: 1.8 ± 0.238
ProSer: 2.39 ± 0.375
ProThr: 2.154 ± 0.33
ProVal: 3.511 ± 0.304
ProTrp: 0.502 ± 0.123
ProTyr: 1.092 ± 0.158
ProXaa: 0.0 ± 0.0
Gln
GlnAla: 3.836 ± 0.465
GlnCys: 0.384 ± 0.108
GlnAsp: 1.298 ± 0.226
GlnGlu: 1.8 ± 0.279
GlnPhe: 1.475 ± 0.198
GlnGly: 1.534 ± 0.212
GlnHis: 0.413 ± 0.124
GlnIle: 2.065 ± 0.253
GlnLys: 2.242 ± 0.366
GlnLeu: 2.626 ± 0.291
GlnMet: 1.033 ± 0.238
GlnAsn: 1.269 ± 0.204
GlnPro: 0.797 ± 0.155
GlnGln: 1.239 ± 0.258
GlnArg: 2.183 ± 0.362
GlnSer: 2.213 ± 0.243
GlnThr: 1.623 ± 0.246
GlnVal: 2.242 ± 0.252
GlnTrp: 0.59 ± 0.142
GlnTyr: 1.18 ± 0.202
GlnXaa: 0.0 ± 0.0
Arg
ArgAla: 5.016 ± 0.463
ArgCys: 0.708 ± 0.18
ArgAsp: 3.246 ± 0.324
ArgGlu: 4.337 ± 0.412
ArgPhe: 2.213 ± 0.252
ArgGly: 4.278 ± 0.364
ArgHis: 0.944 ± 0.181
ArgIle: 3.128 ± 0.313
ArgLys: 3.924 ± 0.401
ArgLeu: 4.249 ± 0.335
ArgMet: 2.036 ± 0.235
ArgAsn: 2.508 ± 0.285
ArgPro: 1.888 ± 0.275
ArgGln: 1.918 ± 0.279
ArgArg: 3.482 ± 0.459
ArgSer: 2.774 ± 0.337
ArgThr: 2.921 ± 0.289
ArgVal: 4.278 ± 0.353
ArgTrp: 1.21 ± 0.206
ArgTyr: 2.213 ± 0.282
ArgXaa: 0.0 ± 0.0
Ser
SerAla: 4.691 ± 0.529
SerCys: 0.443 ± 0.121
SerAsp: 3.806 ± 0.377
SerGlu: 3.836 ± 0.333
SerPhe: 2.331 ± 0.25
SerGly: 5.636 ± 0.424
SerHis: 1.092 ± 0.19
SerIle: 3.01 ± 0.31
SerLys: 3.423 ± 0.335
SerLeu: 4.78 ± 0.363
SerMet: 1.8 ± 0.21
SerAsn: 2.449 ± 0.362
SerPro: 2.154 ± 0.283
SerGln: 1.593 ± 0.216
SerArg: 3.836 ± 0.3
SerSer: 3.659 ± 0.417
SerThr: 3.305 ± 0.495
SerVal: 4.337 ± 0.31
SerTrp: 1.387 ± 0.196
SerTyr: 2.006 ± 0.273
SerXaa: 0.0 ± 0.0
Thr
ThrAla: 4.927 ± 0.61
ThrCys: 0.649 ± 0.131
ThrAsp: 3.747 ± 0.387
ThrGlu: 4.072 ± 0.353
ThrPhe: 2.065 ± 0.196
ThrGly: 4.986 ± 0.643
ThrHis: 0.797 ± 0.175
ThrIle: 3.6 ± 0.366
ThrLys: 3.01 ± 0.301
ThrLeu: 4.249 ± 0.365
ThrMet: 1.269 ± 0.182
ThrAsn: 2.774 ± 0.449
ThrPro: 3.128 ± 0.393
ThrGln: 2.419 ± 0.265
ThrArg: 2.951 ± 0.27
ThrSer: 3.629 ± 0.32
ThrThr: 3.924 ± 0.488
ThrVal: 4.544 ± 0.418
ThrTrp: 1.298 ± 0.205
ThrTyr: 2.065 ± 0.295
ThrXaa: 0.0 ± 0.0
Val
ValAla: 5.311 ± 0.327
ValCys: 0.797 ± 0.181
ValAsp: 5.016 ± 0.479
ValGlu: 4.898 ± 0.355
ValPhe: 2.508 ± 0.235
ValGly: 4.927 ± 0.3
ValHis: 1.18 ± 0.177
ValIle: 3.924 ± 0.318
ValLys: 4.721 ± 0.387
ValLeu: 4.455 ± 0.343
ValMet: 1.534 ± 0.196
ValAsn: 2.921 ± 0.287
ValPro: 2.862 ± 0.346
ValGln: 1.977 ± 0.24
ValArg: 4.219 ± 0.319
ValSer: 4.986 ± 0.459
ValThr: 4.75 ± 0.633
ValVal: 5.459 ± 0.425
ValTrp: 1.564 ± 0.261
ValTyr: 2.98 ± 0.337
ValXaa: 0.0 ± 0.0
Trp
TrpAla: 1.18 ± 0.187
TrpCys: 0.266 ± 0.104
TrpAsp: 1.357 ± 0.204
TrpGlu: 1.977 ± 0.237
TrpPhe: 0.738 ± 0.141
TrpGly: 1.239 ± 0.228
TrpHis: 0.443 ± 0.113
TrpIle: 0.974 ± 0.183
TrpLys: 1.269 ± 0.242
TrpLeu: 1.682 ± 0.267
TrpMet: 0.708 ± 0.176
TrpAsn: 1.092 ± 0.162
TrpPro: 0.443 ± 0.112
TrpGln: 0.738 ± 0.145
TrpArg: 1.003 ± 0.172
TrpSer: 1.062 ± 0.177
TrpThr: 1.121 ± 0.167
TrpVal: 1.033 ± 0.164
TrpTrp: 0.531 ± 0.099
TrpTyr: 0.62 ± 0.118
TrpXaa: 0.0 ± 0.0
Tyr
TyrAla: 3.157 ± 0.312
TyrCys: 0.354 ± 0.123
TyrAsp: 2.744 ± 0.311
TyrGlu: 2.833 ± 0.31
TyrPhe: 1.003 ± 0.193
TyrGly: 2.951 ± 0.27
TyrHis: 0.708 ± 0.154
TyrIle: 1.888 ± 0.265
TyrLys: 2.065 ± 0.227
TyrLeu: 3.423 ± 0.323
TyrMet: 0.767 ± 0.137
TyrAsn: 1.829 ± 0.209
TyrPro: 1.269 ± 0.248
TyrGln: 1.062 ± 0.138
TyrArg: 2.006 ± 0.273
TyrSer: 2.183 ± 0.241
TyrThr: 2.154 ± 0.224
TyrVal: 2.744 ± 0.277
TyrTrp: 0.561 ± 0.126
TyrTyr: 1.328 ± 0.211
TyrXaa: 0.0 ± 0.0
Xaa
XaaAla: 0.0 ± 0.0
XaaCys: 0.0 ± 0.0
XaaAsp: 0.0 ± 0.0
XaaGlu: 0.0 ± 0.0
XaaPhe: 0.0 ± 0.0
XaaGly: 0.0 ± 0.0
XaaHis: 0.0 ± 0.0
XaaIle: 0.0 ± 0.0
XaaLys: 0.0 ± 0.0
XaaLeu: 0.0 ± 0.0
XaaMet: 0.0 ± 0.0
XaaAsn: 0.0 ± 0.0
XaaPro: 0.0 ± 0.0
XaaGln: 0.0 ± 0.0
XaaArg: 0.0 ± 0.0
XaaSer: 0.0 ± 0.0
XaaThr: 0.0 ± 0.0
XaaVal: 0.0 ± 0.0
XaaTrp: 0.0 ± 0.0
XaaTyr: 0.0 ± 0.0
XaaXaa: 0.0 ± 0.0
Statistics based on 215 proteins (33893 amino acids)

Note: The error was estimated by bootstrapping (100 replicates) at the protein level.
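A protein-level bootstrap of this kind can be sketched as follows: whole proteins are resampled with replacement, the dipeptide frequency is recomputed for each replicate, and the spread of the replicates serves as the error. The function names, the seed, the toy sequences, the normalisation by total pair count, and the use of the sample standard deviation are all assumptions; the page only states that 100 bootstrap replicates were drawn at the protein level.

```python
import random
from statistics import stdev

def count_pair(seq, pair):
    """Count overlapping occurrences of a two-residue string in one sequence."""
    return sum(1 for i in range(len(seq) - 1) if seq[i:i + 2] == pair)

def bootstrap_sd(sequences, pair, n_boot=100, seed=0):
    """Hypothetical protein-level bootstrap of a dipeptide frequency (per mille)."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    estimates = []
    for _ in range(n_boot):
        # Resample whole proteins with replacement, keeping the proteome size.
        sample = rng.choices(sequences, k=len(sequences))
        total = sum(max(len(s) - 1, 0) for s in sample)
        hits = sum(count_pair(s, pair) for s in sample)
        estimates.append(1000.0 * hits / total if total else 0.0)
    return stdev(estimates)

# Toy proteome in one-letter code, not the Yaboi data.
toy = ["MAAK", "AAG", "MKVLAA", "GAVK"]
sd_aa = bootstrap_sd(toy, "AA")
```

Resampling proteins rather than individual residues keeps each protein's internal composition intact, so the estimate reflects variation between proteins.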

The dipeptide statistics above (along with other statistics for this proteome) can be downloaded from this CSV file
See this proteome in: uniprot_link
Proteome-pI is available under the Creative Commons Attribution-NoDerivs license; for more details, see here

Reference: Kozlowski LP. Proteome-pI 2.0: Proteome Isoelectric Point Database Update. Nucleic Acids Res. 2021, doi: 10.1093/nar/gkab944
Contact: Lukasz P. Kozlowski