Amino acid dipeptide frequency for Mycobacterium phage EniyanLRS

Apart from single amino acid frequencies, one can also calculate the so-called amino acid dipeptide frequency.

There are 400 possible dipeptides (441 if Xaa is counted as an additional 21st amino acid representing all non-standard or unknown amino acids). If dipeptides occurred randomly in proteins, each one would therefore be expected at a frequency of 0.25% (2.5‰). Since this is not the case in nature, for better visibility the dipeptides that occur more often than expected are marked in red in the table, and those that are underrepresented are marked in blue.

All values are given in per mille (‰), so they need to be multiplied by 10⁻³ to obtain fractions.
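The calculation described above can be sketched in a few lines of Python. This is only an illustration, not the code used to build this page: the function names are hypothetical, sequences are assumed to be given in one-letter code (the table uses three-letter codes), pairs are counted only within a protein, and normalizing by the total number of counted pairs is an assumption about the convention.

```python
from collections import Counter

# The 20 standard amino acids in one-letter code; anything else is pooled as "X" (Xaa).
STANDARD = set("ACDEFGHIKLMNPQRSTVWY")

# Uniform expectation over the 400 standard dipeptides: 1/400 = 0.25% = 2.5 per mille.
EXPECTED_PER_MILLE = 1000.0 / 400


def dipeptide_per_mille(proteins):
    """Return {dipeptide: frequency in per mille} for a list of protein sequences."""
    counts = Counter()
    for seq in proteins:
        # Map non-standard or unknown residues to "X" before counting.
        residues = [aa if aa in STANDARD else "X" for aa in seq.upper()]
        # Overlapping neighbours within one protein; pairs never span two proteins.
        for first, second in zip(residues, residues[1:]):
            counts[first + second] += 1
    total = sum(counts.values())
    if total == 0:
        return {}
    # Assumption: normalize by the total number of counted pairs.
    return {pair: 1000.0 * n / total for pair, n in counts.items()}


def classify(freq_per_mille):
    """Compare a frequency with the uniform expectation of 2.5 per mille."""
    if freq_per_mille > EXPECTED_PER_MILLE:
        return "over"
    if freq_per_mille < EXPECTED_PER_MILLE:
        return "under"
    return "expected"
```

With the values from the table, `classify(9.31)` (AlaAla) gives "over" and `classify(0.085)` (CysCys) gives "under". The sketch ignores the error estimate when classifying.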

For more information, see the sequence space article on Wikipedia.

Rows give the first residue of the dipeptide and columns the second; each cell is the frequency ± error in ‰.

| 1st \ 2nd | Ala | Cys | Asp | Glu | Phe | Gly | His | Ile | Lys | Leu | Met | Asn | Pro | Gln | Arg | Ser | Thr | Val | Trp | Tyr | Xaa |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Ala | 9.31 ± 0.826 | 0.854 ± 0.206 | 5.296 ± 0.391 | 7.004 ± 0.542 | 3.203 ± 0.348 | 7.132 ± 0.859 | 1.836 ± 0.253 | 5.125 ± 0.469 | 5.168 ± 0.529 | 6.791 ± 0.628 | 2.904 ± 0.288 | 3.075 ± 0.335 | 4.313 ± 0.446 | 3.63 ± 0.601 | 5.253 ± 0.496 | 4.698 ± 0.472 | 5.296 ± 0.611 | 5.381 ± 0.551 | 1.794 ± 0.252 | 2.477 ± 0.308 | 0.0 ± 0.0 |
| Cys | 1.068 ± 0.24 | 0.085 ± 0.055 | 0.384 ± 0.123 | 0.854 ± 0.189 | 0.384 ± 0.125 | 1.11 ± 0.237 | 0.384 ± 0.144 | 0.555 ± 0.143 | 0.726 ± 0.207 | 0.512 ± 0.154 | 0.256 ± 0.11 | 0.512 ± 0.142 | 0.512 ± 0.147 | 0.256 ± 0.104 | 0.641 ± 0.175 | 0.512 ± 0.193 | 0.342 ± 0.123 | 0.384 ± 0.123 | 0.384 ± 0.138 | 0.299 ± 0.117 | 0.0 ± 0.0 |
| Asp | 6.363 ± 0.458 | 0.683 ± 0.199 | 5.467 ± 0.572 | 5.168 ± 0.576 | 2.819 ± 0.352 | 4.954 ± 0.583 | 1.324 ± 0.265 | 3.032 ± 0.398 | 2.349 ± 0.291 | 5.936 ± 0.538 | 2.007 ± 0.314 | 1.708 ± 0.271 | 4.783 ± 0.436 | 2.007 ± 0.252 | 3.673 ± 0.366 | 2.819 ± 0.292 | 3.459 ± 0.412 | 4.356 ± 0.482 | 1.922 ± 0.275 | 2.221 ± 0.281 | 0.0 ± 0.0 |
| Glu | 5.894 ± 0.55 | 0.555 ± 0.179 | 4.484 ± 0.51 | 4.313 ± 0.567 | 2.52 ± 0.306 | 4.698 ± 0.454 | 1.367 ± 0.198 | 3.075 ± 0.39 | 2.648 ± 0.414 | 6.961 ± 0.616 | 1.836 ± 0.302 | 2.221 ± 0.317 | 2.477 ± 0.317 | 1.965 ± 0.278 | 3.972 ± 0.537 | 3.801 ± 0.435 | 3.331 ± 0.35 | 4.783 ± 0.441 | 2.477 ± 0.281 | 2.562 ± 0.445 | 0.0 ± 0.0 |
| Phe | 2.733 ± 0.34 | 0.299 ± 0.125 | 2.99 ± 0.379 | 2.52 ± 0.414 | 1.025 ± 0.212 | 3.545 ± 0.589 | 0.726 ± 0.175 | 1.58 ± 0.254 | 1.196 ± 0.207 | 2.221 ± 0.354 | 0.982 ± 0.199 | 1.623 ± 0.259 | 1.196 ± 0.217 | 0.726 ± 0.153 | 2.477 ± 0.314 | 2.05 ± 0.323 | 2.093 ± 0.345 | 2.904 ± 0.282 | 0.683 ± 0.152 | 1.068 ± 0.222 | 0.0 ± 0.0 |
| Gly | 6.876 ± 0.848 | 0.811 ± 0.19 | 5.082 ± 0.55 | 5.296 ± 0.456 | 2.733 ± 0.356 | 7.132 ± 0.895 | 2.05 ± 0.314 | 3.972 ± 0.542 | 4.228 ± 0.478 | 5.936 ± 0.57 | 1.452 ± 0.235 | 3.886 ± 0.484 | 3.673 ± 0.537 | 3.246 ± 0.49 | 3.801 ± 0.464 | 4.698 ± 0.653 | 5.637 ± 0.551 | 6.022 ± 0.639 | 2.349 ± 0.356 | 3.075 ± 0.335 | 0.0 ± 0.0 |
| His | 1.623 ± 0.309 | 0.47 ± 0.16 | 1.409 ± 0.267 | 1.153 ± 0.208 | 1.11 ± 0.218 | 1.965 ± 0.323 | 0.897 ± 0.196 | 1.239 ± 0.244 | 1.324 ± 0.222 | 1.965 ± 0.341 | 0.598 ± 0.169 | 0.854 ± 0.202 | 1.879 ± 0.34 | 0.512 ± 0.18 | 1.495 ± 0.245 | 0.769 ± 0.204 | 1.666 ± 0.197 | 1.324 ± 0.256 | 0.47 ± 0.128 | 0.854 ± 0.194 | 0.0 ± 0.0 |
| Ile | 4.228 ± 0.496 | 0.512 ± 0.169 | 3.63 ± 0.368 | 3.972 ± 0.461 | 1.409 ± 0.213 | 2.99 ± 0.4 | 1.324 ± 0.207 | 1.922 ± 0.288 | 2.093 ± 0.281 | 3.801 ± 0.472 | 1.068 ± 0.209 | 2.178 ± 0.351 | 3.716 ± 0.419 | 1.537 ± 0.245 | 2.562 ± 0.312 | 2.135 ± 0.299 | 3.032 ± 0.371 | 3.801 ± 0.377 | 0.598 ± 0.146 | 0.769 ± 0.208 | 0.0 ± 0.0 |
| Lys | 4.655 ± 0.5 | 0.427 ± 0.13 | 2.007 ± 0.333 | 2.007 ± 0.272 | 1.324 ± 0.178 | 3.758 ± 0.475 | 0.641 ± 0.181 | 2.178 ± 0.294 | 1.708 ± 0.333 | 3.801 ± 0.438 | 1.409 ± 0.254 | 1.751 ± 0.246 | 1.965 ± 0.276 | 1.708 ± 0.31 | 3.374 ± 0.533 | 2.264 ± 0.3 | 2.605 ± 0.356 | 3.417 ± 0.409 | 1.025 ± 0.176 | 1.452 ± 0.243 | 0.0 ± 0.0 |
| Leu | 7.687 ± 0.634 | 0.854 ± 0.223 | 5.509 ± 0.568 | 4.997 ± 0.485 | 2.306 ± 0.337 | 6.705 ± 0.727 | 1.708 ± 0.287 | 3.075 ± 0.339 | 3.459 ± 0.308 | 5.637 ± 0.566 | 2.306 ± 0.342 | 3.716 ± 0.366 | 4.911 ± 0.498 | 2.392 ± 0.314 | 5.637 ± 0.523 | 3.801 ± 0.402 | 4.527 ± 0.409 | 5.381 ± 0.495 | 1.324 ± 0.27 | 2.733 ± 0.479 | 0.0 ± 0.0 |
| Met | 2.691 ± 0.388 | 0.128 ± 0.071 | 1.708 ± 0.248 | 1.623 ± 0.302 | 0.769 ± 0.149 | 2.349 ± 0.318 | 0.342 ± 0.136 | 1.452 ± 0.239 | 1.11 ± 0.26 | 2.306 ± 0.324 | 0.47 ± 0.145 | 1.239 ± 0.198 | 1.281 ± 0.212 | 0.982 ± 0.234 | 2.007 ± 0.314 | 2.221 ± 0.32 | 1.367 ± 0.24 | 1.324 ± 0.231 | 0.598 ± 0.161 | 0.769 ± 0.173 | 0.0 ± 0.0 |
| Asn | 3.203 ± 0.393 | 0.171 ± 0.079 | 2.434 ± 0.311 | 1.879 ± 0.252 | 1.281 ± 0.257 | 3.331 ± 0.45 | 0.897 ± 0.212 | 1.708 ± 0.295 | 1.58 ± 0.317 | 3.502 ± 0.478 | 0.854 ± 0.195 | 1.495 ± 0.273 | 3.459 ± 0.337 | 1.794 ± 0.267 | 1.965 ± 0.218 | 2.007 ± 0.293 | 2.05 ± 0.318 | 2.605 ± 0.344 | 0.982 ± 0.196 | 1.068 ± 0.214 | 0.0 ± 0.0 |
| Pro | 4.1 ± 0.434 | 0.47 ± 0.148 | 4.015 ± 0.396 | 4.655 ± 0.452 | 1.836 ± 0.229 | 4.869 ± 0.7 | 1.794 ± 0.273 | 1.965 ± 0.337 | 2.392 ± 0.383 | 2.947 ± 0.379 | 1.281 ± 0.204 | 2.05 ± 0.324 | 3.16 ± 0.474 | 2.221 ± 0.33 | 2.648 ± 0.346 | 2.947 ± 0.345 | 3.374 ± 0.422 | 3.801 ± 0.456 | 1.239 ± 0.206 | 1.281 ± 0.226 | 0.0 ± 0.0 |
| Gln | 4.015 ± 0.801 | 0.512 ± 0.175 | 1.922 ± 0.295 | 1.965 ± 0.283 | 1.409 ± 0.269 | 2.434 ± 0.551 | 0.811 ± 0.159 | 2.007 ± 0.244 | 1.11 ± 0.223 | 3.246 ± 0.335 | 1.281 ± 0.261 | 1.666 ± 0.284 | 1.751 ± 0.326 | 1.708 ± 0.468 | 2.52 ± 0.337 | 2.007 ± 0.25 | 1.708 ± 0.258 | 2.05 ± 0.273 | 1.153 ± 0.189 | 1.068 ± 0.202 | 0.0 ± 0.0 |
| Arg | 5.253 ± 0.461 | 0.811 ± 0.199 | 4.356 ± 0.355 | 4.484 ± 0.45 | 2.221 ± 0.291 | 4.698 ± 0.457 | 1.537 ± 0.282 | 2.691 ± 0.346 | 2.99 ± 0.406 | 4.185 ± 0.372 | 2.093 ± 0.339 | 2.562 ± 0.331 | 2.007 ± 0.319 | 2.392 ± 0.319 | 4.399 ± 0.566 | 2.904 ± 0.381 | 3.331 ± 0.358 | 5.381 ± 0.544 | 1.708 ± 0.277 | 1.836 ± 0.288 | 0.0 ± 0.0 |
| Ser | 4.869 ± 0.566 | 0.427 ± 0.128 | 3.417 ± 0.378 | 2.947 ± 0.362 | 1.666 ± 0.282 | 4.954 ± 0.502 | 1.452 ± 0.267 | 3.032 ± 0.314 | 2.135 ± 0.304 | 4.271 ± 0.429 | 1.239 ± 0.226 | 1.751 ± 0.338 | 2.562 ± 0.341 | 2.733 ± 0.298 | 3.032 ± 0.374 | 2.904 ± 0.412 | 2.691 ± 0.322 | 3.459 ± 0.439 | 1.367 ± 0.279 | 1.623 ± 0.224 | 0.0 ± 0.0 |
| Thr | 4.57 ± 0.515 | 0.683 ± 0.211 | 3.459 ± 0.422 | 3.374 ± 0.423 | 2.605 ± 0.37 | 5.68 ± 0.616 | 1.11 ± 0.209 | 3.502 ± 0.428 | 2.007 ± 0.381 | 4.954 ± 0.393 | 1.239 ± 0.259 | 1.495 ± 0.228 | 3.331 ± 0.585 | 1.794 ± 0.255 | 3.288 ± 0.334 | 3.459 ± 0.376 | 3.886 ± 0.431 | 4.826 ± 0.594 | 1.794 ± 0.326 | 2.093 ± 0.278 | 0.0 ± 0.0 |
| Val | 6.449 ± 0.573 | 0.811 ± 0.179 | 5.125 ± 0.394 | 4.271 ± 0.388 | 2.264 ± 0.301 | 5.467 ± 0.607 | 1.452 ± 0.226 | 2.733 ± 0.356 | 2.904 ± 0.323 | 5.125 ± 0.503 | 1.708 ± 0.225 | 3.032 ± 0.368 | 3.758 ± 0.457 | 2.648 ± 0.311 | 4.655 ± 0.411 | 4.399 ± 0.357 | 5.381 ± 0.615 | 5.979 ± 0.776 | 1.708 ± 0.271 | 2.135 ± 0.308 | 0.0 ± 0.0 |
| Trp | 2.007 ± 0.249 | 0.384 ± 0.122 | 2.093 ± 0.301 | 1.666 ± 0.232 | 0.555 ± 0.15 | 1.794 ± 0.281 | 0.683 ± 0.188 | 1.281 ± 0.228 | 1.068 ± 0.197 | 2.264 ± 0.31 | 0.726 ± 0.169 | 0.598 ± 0.148 | 0.726 ± 0.154 | 1.11 ± 0.206 | 1.708 ± 0.273 | 0.94 ± 0.177 | 1.623 ± 0.246 | 2.477 ± 0.359 | 0.769 ± 0.2 | 0.94 ± 0.214 | 0.0 ± 0.0 |
| Tyr | 2.947 ± 0.327 | 0.214 ± 0.096 | 2.264 ± 0.325 | 1.794 ± 0.337 | 1.196 ± 0.251 | 2.392 ± 0.325 | 1.239 ± 0.257 | 1.153 ± 0.225 | 1.281 ± 0.246 | 2.306 ± 0.386 | 0.982 ± 0.171 | 0.854 ± 0.221 | 1.58 ± 0.252 | 0.982 ± 0.223 | 2.776 ± 0.39 | 1.196 ± 0.221 | 1.751 ± 0.278 | 2.349 ± 0.342 | 1.068 ± 0.215 | 1.068 ± 0.215 | 0.0 ± 0.0 |
| Xaa | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 |

Statistics based on 148 proteins (23416 amino acids)

Note: The error was estimated by bootstrapping (100 replicates) at the protein level.
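A protein-level bootstrap of this kind can be sketched as follows. The exact procedure used for this page is not described here, so this is a minimal illustration under stated assumptions: whole proteins are resampled with replacement, the frequency of one dipeptide is recomputed for each replicate, and the standard deviation across replicates is taken as the error. The function names, the one-letter sequence input, and the `seed` parameter are hypothetical.

```python
import random
import statistics


def pair_per_mille(proteins, pair):
    """Frequency (per mille) of one dipeptide among all neighbouring pairs."""
    total = 0
    hits = 0
    for seq in proteins:
        # Slide over overlapping neighbours inside a single protein.
        for i in range(len(seq) - 1):
            total += 1
            if seq[i:i + 2] == pair:
                hits += 1
    return 1000.0 * hits / total if total else 0.0


def bootstrap_error(proteins, pair, n_rounds=100, seed=0):
    """Standard deviation of the dipeptide frequency over bootstrap replicates.

    Assumption: each replicate resamples whole proteins with replacement,
    keeping the number of proteins fixed.
    """
    rng = random.Random(seed)
    estimates = []
    for _ in range(n_rounds):
        sample = rng.choices(proteins, k=len(proteins))
        estimates.append(pair_per_mille(sample, pair))
    return statistics.stdev(estimates)
```

Resampling whole proteins rather than individual residues keeps neighbouring residues together, so the dipeptide context inside each protein is preserved in every replicate.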

The dipeptide statistics above (along with other statistics for this proteome) can be downloaded as a CSV file.
This proteome is also available in UniProt.
Proteome-pI is available under the Creative Commons Attribution-NoDerivs license.

Reference: Kozlowski LP. Proteome-pI 2.0: Proteome Isoelectric Point Database Update. Nucleic Acids Res. 2021, doi: 10.1093/nar/gkab944
Contact: Lukasz P. Kozlowski