Amino acid dipeptide frequency for Mycobacterium phage Emma

Apart from single amino acid frequencies, one can also calculate the so-called amino acid dipeptide frequency.

There are 400 possible dipeptides (441 if Xaa is counted as an additional, 21st amino acid representing all non-standard or unknown amino acids). Thus, if dipeptides occurred randomly in proteins, with all amino acids equally frequent, each dipeptide would be present at a frequency of 0.25% (2.5‰). As this is not the case in nature, for better visibility all dipeptides that occur more often than expected are marked in red, and those that are underrepresented are marked in blue in the table.
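As a rough illustration of the calculation described above, the following Python sketch counts overlapping residue pairs within each protein and reports them in per mille. The function names, the mapping of non-standard residues to X, and the simple comparison against the uniform 1/400 expectation are assumptions made for this example; the page does not state how the database counts pairs or where it sets its red/blue thresholds.

```python
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # 20 standard residues, one-letter codes

# Uniform expectation: 1 of 400 possible pairs, expressed in per mille.
UNIFORM_EXPECTED = 1000.0 / len(AMINO_ACIDS) ** 2


def dipeptide_permille(proteins):
    """Return {dipeptide: frequency in per mille} for a list of sequences."""
    counts = Counter()
    for seq in proteins:
        # Overlapping windows of length 2; pairs never span two proteins.
        for a, b in zip(seq, seq[1:]):
            # Assumption: anything outside the 20 standard residues is Xaa (X).
            a = a if a in AMINO_ACIDS else "X"
            b = b if b in AMINO_ACIDS else "X"
            counts[a + b] += 1
    total = sum(counts.values())
    if total == 0:
        return {}
    return {pair: 1000.0 * n / total for pair, n in counts.items()}


def label(freq_permille, expected=UNIFORM_EXPECTED):
    """Classify a frequency relative to the (assumed) uniform expectation."""
    if freq_permille > expected:
        return "over"
    if freq_permille < expected:
        return "under"
    return "expected"
```

For instance, `label(13.415)` classifies the AlaAla value from the table as overrepresented and `label(0.825)` classifies AlaCys as underrepresented, relative to the uniform 2.5‰ baseline.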

All values are given in per mille (‰) and therefore need to be multiplied by 10⁻³ to obtain fractions.

For more information, see the sequence space article on Wikipedia.

Ala Cys Asp Glu Phe Gly His Ile Lys Leu Met Asn Pro Gln Arg Ser Thr Val Trp Tyr Xaa
Ala
13.415AlaAla: 13.415 ± 1.503
0.825AlaCys: 0.825 ± 0.197
7.752AlaAsp: 7.752 ± 0.714
7.312AlaGlu: 7.312 ± 0.709
2.749AlaPhe: 2.749 ± 0.387
10.006AlaGly: 10.006 ± 1.381
2.199AlaHis: 2.199 ± 0.345
3.958AlaIle: 3.958 ± 0.615
3.629AlaLys: 3.629 ± 0.412
8.082AlaLeu: 8.082 ± 0.725
2.254AlaMet: 2.254 ± 0.317
2.474AlaAsn: 2.474 ± 0.498
4.948AlaPro: 4.948 ± 0.671
3.244AlaGln: 3.244 ± 0.559
7.367AlaArg: 7.367 ± 0.715
5.773AlaSer: 5.773 ± 0.627
6.487AlaThr: 6.487 ± 0.504
6.982AlaVal: 6.982 ± 0.624
2.749AlaTrp: 2.749 ± 0.443
2.199AlaTyr: 2.199 ± 0.296
0.0AlaXaa: 0.0 ± 0.0
Cys
1.045CysAla: 1.045 ± 0.244
0.055CysCys: 0.055 ± 0.056
1.374CysAsp: 1.374 ± 0.359
0.77CysGlu: 0.77 ± 0.211
0.275CysPhe: 0.275 ± 0.122
1.704CysGly: 1.704 ± 0.355
0.22CysHis: 0.22 ± 0.105
0.11CysIle: 0.11 ± 0.083
0.495CysLys: 0.495 ± 0.171
0.605CysLeu: 0.605 ± 0.223
0.165CysMet: 0.165 ± 0.092
0.385CysAsn: 0.385 ± 0.126
1.21CysPro: 1.21 ± 0.339
0.275CysGln: 0.275 ± 0.124
0.825CysArg: 0.825 ± 0.25
0.66CysSer: 0.66 ± 0.185
0.825CysThr: 0.825 ± 0.238
0.715CysVal: 0.715 ± 0.215
0.165CysTrp: 0.165 ± 0.096
0.165CysTyr: 0.165 ± 0.093
0.0CysXaa: 0.0 ± 0.0
Asp
6.542AspAla: 6.542 ± 0.586
1.045AspCys: 1.045 ± 0.28
4.618AspAsp: 4.618 ± 0.529
3.299AspGlu: 3.299 ± 0.39
1.704AspPhe: 1.704 ± 0.261
6.707AspGly: 6.707 ± 0.639
1.484AspHis: 1.484 ± 0.29
2.364AspIle: 2.364 ± 0.39
1.649AspLys: 1.649 ± 0.284
5.938AspLeu: 5.938 ± 0.563
0.935AspMet: 0.935 ± 0.242
1.869AspAsn: 1.869 ± 0.377
4.728AspPro: 4.728 ± 0.574
2.089AspGln: 2.089 ± 0.328
5.443AspArg: 5.443 ± 0.581
3.903AspSer: 3.903 ± 0.502
4.453AspThr: 4.453 ± 0.464
4.453AspVal: 4.453 ± 0.572
1.319AspTrp: 1.319 ± 0.24
2.034AspTyr: 2.034 ± 0.288
0.0AspXaa: 0.0 ± 0.0
Glu
6.268GluAla: 6.268 ± 0.689
0.935GluCys: 0.935 ± 0.277
3.134GluAsp: 3.134 ± 0.336
2.969GluGlu: 2.969 ± 0.542
2.144GluPhe: 2.144 ± 0.325
2.969GluGly: 2.969 ± 0.464
1.429GluHis: 1.429 ± 0.407
1.869GluIle: 1.869 ± 0.296
1.594GluLys: 1.594 ± 0.284
5.223GluLeu: 5.223 ± 0.728
1.814GluMet: 1.814 ± 0.279
1.814GluAsn: 1.814 ± 0.288
2.639GluPro: 2.639 ± 0.426
3.464GluGln: 3.464 ± 0.403
5.003GluArg: 5.003 ± 0.554
2.694GluSer: 2.694 ± 0.463
3.848GluThr: 3.848 ± 0.52
4.123GluVal: 4.123 ± 0.494
1.869GluTrp: 1.869 ± 0.371
2.144GluTyr: 2.144 ± 0.4
0.0GluXaa: 0.0 ± 0.0
Phe
3.409PheAla: 3.409 ± 0.469
0.22PheCys: 0.22 ± 0.12
2.309PheAsp: 2.309 ± 0.443
1.429PheGlu: 1.429 ± 0.279
0.88PhePhe: 0.88 ± 0.252
2.474PheGly: 2.474 ± 0.627
0.44PheHis: 0.44 ± 0.148
1.649PheIle: 1.649 ± 0.359
0.825PheLys: 0.825 ± 0.248
2.034PheLeu: 2.034 ± 0.304
0.77PheMet: 0.77 ± 0.222
1.265PheAsn: 1.265 ± 0.357
1.484PhePro: 1.484 ± 0.324
1.045PheGln: 1.045 ± 0.302
1.814PheArg: 1.814 ± 0.265
1.265PheSer: 1.265 ± 0.258
2.584PheThr: 2.584 ± 0.38
1.979PheVal: 1.979 ± 0.306
0.55PheTrp: 0.55 ± 0.153
0.935PheTyr: 0.935 ± 0.25
0.0PheXaa: 0.0 ± 0.0
Gly
8.852GlyAla: 8.852 ± 1.411
0.935GlyCys: 0.935 ± 0.229
6.377GlyAsp: 6.377 ± 0.678
3.958GlyGlu: 3.958 ± 0.515
2.969GlyPhe: 2.969 ± 0.394
10.666GlyGly: 10.666 ± 2.891
1.814GlyHis: 1.814 ± 0.297
4.068GlyIle: 4.068 ± 0.543
2.419GlyLys: 2.419 ± 0.365
6.158GlyLeu: 6.158 ± 0.666
2.364GlyMet: 2.364 ± 0.398
3.134GlyAsn: 3.134 ± 0.443
4.123GlyPro: 4.123 ± 0.484
2.309GlyGln: 2.309 ± 0.482
4.893GlyArg: 4.893 ± 0.681
6.103GlySer: 6.103 ± 0.847
6.213GlyThr: 6.213 ± 0.783
5.773GlyVal: 5.773 ± 0.611
2.694GlyTrp: 2.694 ± 0.338
2.034GlyTyr: 2.034 ± 0.397
0.0GlyXaa: 0.0 ± 0.0
His
1.979HisAla: 1.979 ± 0.363
0.275HisCys: 0.275 ± 0.163
1.429HisAsp: 1.429 ± 0.287
1.045HisGlu: 1.045 ± 0.269
0.495HisPhe: 0.495 ± 0.145
1.814HisGly: 1.814 ± 0.269
0.935HisHis: 0.935 ± 0.284
1.484HisIle: 1.484 ± 0.288
0.605HisLys: 0.605 ± 0.198
0.99HisLeu: 0.99 ± 0.22
0.77HisMet: 0.77 ± 0.19
0.77HisAsn: 0.77 ± 0.186
1.594HisPro: 1.594 ± 0.264
0.825HisGln: 0.825 ± 0.208
2.144HisArg: 2.144 ± 0.391
0.605HisSer: 0.605 ± 0.195
1.594HisThr: 1.594 ± 0.329
1.429HisVal: 1.429 ± 0.296
0.44HisTrp: 0.44 ± 0.151
0.88HisTyr: 0.88 ± 0.236
0.0HisXaa: 0.0 ± 0.0
Ile
5.168IleAla: 5.168 ± 0.485
0.715IleCys: 0.715 ± 0.247
3.794IleAsp: 3.794 ± 0.422
3.629IleGlu: 3.629 ± 0.346
0.66IlePhe: 0.66 ± 0.235
3.519IleGly: 3.519 ± 0.532
1.539IleHis: 1.539 ± 0.254
1.484IleIle: 1.484 ± 0.304
1.155IleLys: 1.155 ± 0.238
2.089IleLeu: 2.089 ± 0.42
0.275IleMet: 0.275 ± 0.107
2.034IleAsn: 2.034 ± 0.303
2.914IlePro: 2.914 ± 0.368
1.374IleGln: 1.374 ± 0.242
3.134IleArg: 3.134 ± 0.499
2.089IleSer: 2.089 ± 0.506
3.574IleThr: 3.574 ± 0.394
3.134IleVal: 3.134 ± 0.401
1.045IleTrp: 1.045 ± 0.218
0.715IleTyr: 0.715 ± 0.232
0.0IleXaa: 0.0 ± 0.0
Lys
3.299LysAla: 3.299 ± 0.428
0.495LysCys: 0.495 ± 0.19
1.649LysAsp: 1.649 ± 0.288
1.594LysGlu: 1.594 ± 0.301
1.21LysPhe: 1.21 ± 0.214
2.474LysGly: 2.474 ± 0.341
1.045LysHis: 1.045 ± 0.267
0.935LysIle: 0.935 ± 0.297
1.1LysLys: 1.1 ± 0.3
2.749LysLeu: 2.749 ± 0.45
0.825LysMet: 0.825 ± 0.188
0.99LysAsn: 0.99 ± 0.284
2.309LysPro: 2.309 ± 0.367
1.704LysGln: 1.704 ± 0.287
1.814LysArg: 1.814 ± 0.323
1.759LysSer: 1.759 ± 0.287
2.309LysThr: 2.309 ± 0.356
2.309LysVal: 2.309 ± 0.36
0.715LysTrp: 0.715 ± 0.302
1.045LysTyr: 1.045 ± 0.263
0.0LysXaa: 0.0 ± 0.0
Leu
8.412LeuAla: 8.412 ± 0.795
0.66LeuCys: 0.66 ± 0.232
4.728LeuAsp: 4.728 ± 0.545
3.794LeuGlu: 3.794 ± 0.519
2.364LeuPhe: 2.364 ± 0.332
5.553LeuGly: 5.553 ± 0.56
1.155LeuHis: 1.155 ± 0.259
3.684LeuIle: 3.684 ± 0.476
1.924LeuLys: 1.924 ± 0.302
4.893LeuLeu: 4.893 ± 0.521
1.484LeuMet: 1.484 ± 0.265
2.639LeuAsn: 2.639 ± 0.335
5.058LeuPro: 5.058 ± 0.708
2.529LeuGln: 2.529 ± 0.426
4.673LeuArg: 4.673 ± 0.658
6.048LeuSer: 6.048 ± 0.583
6.048LeuThr: 6.048 ± 0.603
4.893LeuVal: 4.893 ± 0.486
1.155LeuTrp: 1.155 ± 0.225
1.979LeuTyr: 1.979 ± 0.335
0.0LeuXaa: 0.0 ± 0.0
Met
2.309MetAla: 2.309 ± 0.359
0.275MetCys: 0.275 ± 0.131
1.1MetAsp: 1.1 ± 0.248
0.77MetGlu: 0.77 ± 0.176
0.825MetPhe: 0.825 ± 0.236
1.759MetGly: 1.759 ± 0.243
0.22MetHis: 0.22 ± 0.115
0.99MetIle: 0.99 ± 0.215
1.045MetLys: 1.045 ± 0.307
1.814MetLeu: 1.814 ± 0.295
0.605MetMet: 0.605 ± 0.229
0.935MetAsn: 0.935 ± 0.233
1.1MetPro: 1.1 ± 0.253
0.44MetGln: 0.44 ± 0.123
1.649MetArg: 1.649 ± 0.283
2.859MetSer: 2.859 ± 0.352
2.419MetThr: 2.419 ± 0.332
1.429MetVal: 1.429 ± 0.304
0.22MetTrp: 0.22 ± 0.118
0.44MetTyr: 0.44 ± 0.151
0.0MetXaa: 0.0 ± 0.0
Asn
3.409AsnAla: 3.409 ± 0.379
0.165AsnCys: 0.165 ± 0.088
1.704AsnAsp: 1.704 ± 0.287
1.594AsnGlu: 1.594 ± 0.299
0.935AsnPhe: 0.935 ± 0.301
3.629AsnGly: 3.629 ± 0.567
0.825AsnHis: 0.825 ± 0.184
1.649AsnIle: 1.649 ± 0.44
1.1AsnLys: 1.1 ± 0.227
2.144AsnLeu: 2.144 ± 0.322
0.55AsnMet: 0.55 ± 0.14
1.979AsnAsn: 1.979 ± 0.372
2.584AsnPro: 2.584 ± 0.358
1.045AsnGln: 1.045 ± 0.323
2.364AsnArg: 2.364 ± 0.399
1.869AsnSer: 1.869 ± 0.288
2.584AsnThr: 2.584 ± 0.339
1.869AsnVal: 1.869 ± 0.356
0.88AsnTrp: 0.88 ± 0.201
0.77AsnTyr: 0.77 ± 0.155
0.0AsnXaa: 0.0 ± 0.0
Pro
4.838ProAla: 4.838 ± 0.64
0.44ProCys: 0.44 ± 0.149
4.288ProAsp: 4.288 ± 0.484
4.288ProGlu: 4.288 ± 0.515
1.759ProPhe: 1.759 ± 0.345
6.597ProGly: 6.597 ± 0.689
1.429ProHis: 1.429 ± 0.283
2.254ProIle: 2.254 ± 0.251
2.199ProLys: 2.199 ± 0.448
4.398ProLeu: 4.398 ± 0.528
1.649ProMet: 1.649 ± 0.385
2.254ProAsn: 2.254 ± 0.317
3.958ProPro: 3.958 ± 0.608
2.199ProGln: 2.199 ± 0.412
3.299ProArg: 3.299 ± 0.514
3.079ProSer: 3.079 ± 0.395
2.914ProThr: 2.914 ± 0.39
5.003ProVal: 5.003 ± 0.568
1.045ProTrp: 1.045 ± 0.203
1.374ProTyr: 1.374 ± 0.264
0.0ProXaa: 0.0 ± 0.0
Gln
4.508GlnAla: 4.508 ± 0.636
0.275GlnCys: 0.275 ± 0.133
1.429GlnAsp: 1.429 ± 0.301
1.649GlnGlu: 1.649 ± 0.353
1.21GlnPhe: 1.21 ± 0.206
2.199GlnGly: 2.199 ± 0.518
0.715GlnHis: 0.715 ± 0.18
1.759GlnIle: 1.759 ± 0.289
1.045GlnLys: 1.045 ± 0.179
3.409GlnLeu: 3.409 ± 0.469
0.605GlnMet: 0.605 ± 0.178
0.935GlnAsn: 0.935 ± 0.244
2.529GlnPro: 2.529 ± 0.316
1.319GlnGln: 1.319 ± 0.258
2.254GlnArg: 2.254 ± 0.332
2.749GlnSer: 2.749 ± 0.422
1.649GlnThr: 1.649 ± 0.333
2.419GlnVal: 2.419 ± 0.341
0.88GlnTrp: 0.88 ± 0.187
0.99GlnTyr: 0.99 ± 0.258
0.0GlnXaa: 0.0 ± 0.0
Arg
6.377ArgAla: 6.377 ± 0.639
1.319ArgCys: 1.319 ± 0.328
3.903ArgAsp: 3.903 ± 0.608
5.058ArgGlu: 5.058 ± 0.651
1.924ArgPhe: 1.924 ± 0.373
4.013ArgGly: 4.013 ± 0.453
1.649ArgHis: 1.649 ± 0.332
3.519ArgIle: 3.519 ± 0.567
2.584ArgLys: 2.584 ± 0.425
5.003ArgLeu: 5.003 ± 0.615
3.079ArgMet: 3.079 ± 0.518
2.199ArgAsn: 2.199 ± 0.363
3.189ArgPro: 3.189 ± 0.429
1.924ArgGln: 1.924 ± 0.299
5.553ArgArg: 5.553 ± 0.749
3.629ArgSer: 3.629 ± 0.366
3.574ArgThr: 3.574 ± 0.487
5.443ArgVal: 5.443 ± 0.516
1.979ArgTrp: 1.979 ± 0.344
1.869ArgTyr: 1.869 ± 0.304
0.0ArgXaa: 0.0 ± 0.0
Ser
6.158SerAla: 6.158 ± 0.89
0.715SerCys: 0.715 ± 0.265
4.123SerAsp: 4.123 ± 0.469
3.409SerGlu: 3.409 ± 0.534
2.034SerPhe: 2.034 ± 0.4
6.817SerGly: 6.817 ± 0.694
1.1SerHis: 1.1 ± 0.242
3.134SerIle: 3.134 ± 0.424
2.639SerLys: 2.639 ± 0.41
3.958SerLeu: 3.958 ± 0.445
1.265SerMet: 1.265 ± 0.283
2.034SerAsn: 2.034 ± 0.363
3.629SerPro: 3.629 ± 0.332
1.979SerGln: 1.979 ± 0.328
2.969SerArg: 2.969 ± 0.359
4.233SerSer: 4.233 ± 0.662
3.464SerThr: 3.464 ± 0.515
4.453SerVal: 4.453 ± 0.462
1.539SerTrp: 1.539 ± 0.251
1.374SerTyr: 1.374 ± 0.191
0.0SerXaa: 0.0 ± 0.0
Thr
6.432ThrAla: 6.432 ± 0.563
0.825ThrCys: 0.825 ± 0.213
4.398ThrAsp: 4.398 ± 0.69
3.739ThrGlu: 3.739 ± 0.468
1.979ThrPhe: 1.979 ± 0.369
6.542ThrGly: 6.542 ± 0.604
1.649ThrHis: 1.649 ± 0.316
3.409ThrIle: 3.409 ± 0.427
2.474ThrLys: 2.474 ± 0.394
5.003ThrLeu: 5.003 ± 0.5
1.265ThrMet: 1.265 ± 0.226
2.254ThrAsn: 2.254 ± 0.367
4.618ThrPro: 4.618 ± 0.505
1.869ThrGln: 1.869 ± 0.306
3.848ThrArg: 3.848 ± 0.395
3.684ThrSer: 3.684 ± 0.392
5.278ThrThr: 5.278 ± 0.653
6.103ThrVal: 6.103 ± 0.665
1.155ThrTrp: 1.155 ± 0.236
2.034ThrTyr: 2.034 ± 0.31
0.0ThrXaa: 0.0 ± 0.0
Val
6.872ValAla: 6.872 ± 0.564
1.21ValCys: 1.21 ± 0.29
5.608ValAsp: 5.608 ± 0.502
4.783ValGlu: 4.783 ± 0.561
1.814ValPhe: 1.814 ± 0.357
5.443ValGly: 5.443 ± 0.669
1.374ValHis: 1.374 ± 0.243
3.299ValIle: 3.299 ± 0.565
1.979ValLys: 1.979 ± 0.3
5.058ValLeu: 5.058 ± 0.567
1.484ValMet: 1.484 ± 0.262
2.364ValAsn: 2.364 ± 0.377
4.233ValPro: 4.233 ± 0.441
2.969ValGln: 2.969 ± 0.389
4.783ValArg: 4.783 ± 0.666
5.278ValSer: 5.278 ± 0.56
5.333ValThr: 5.333 ± 0.493
6.048ValVal: 6.048 ± 0.671
1.759ValTrp: 1.759 ± 0.349
1.155ValTyr: 1.155 ± 0.258
0.0ValXaa: 0.0 ± 0.0
Trp
2.474TrpAla: 2.474 ± 0.406
0.22TrpCys: 0.22 ± 0.1
1.429TrpAsp: 1.429 ± 0.25
1.045TrpGlu: 1.045 ± 0.311
0.77TrpPhe: 0.77 ± 0.247
0.99TrpGly: 0.99 ± 0.249
0.495TrpHis: 0.495 ± 0.172
0.88TrpIle: 0.88 ± 0.214
1.155TrpLys: 1.155 ± 0.206
2.199TrpLeu: 2.199 ± 0.387
0.77TrpMet: 0.77 ± 0.214
0.605TrpAsn: 0.605 ± 0.226
0.99TrpPro: 0.99 ± 0.229
0.935TrpGln: 0.935 ± 0.257
2.089TrpArg: 2.089 ± 0.414
1.594TrpSer: 1.594 ± 0.405
1.594TrpThr: 1.594 ± 0.294
1.924TrpVal: 1.924 ± 0.483
1.045TrpTrp: 1.045 ± 0.241
0.385TrpTyr: 0.385 ± 0.163
0.0TrpXaa: 0.0 ± 0.0
Tyr
2.639TyrAla: 2.639 ± 0.345
0.44TyrCys: 0.44 ± 0.167
1.429TyrAsp: 1.429 ± 0.355
1.814TyrGlu: 1.814 ± 0.301
0.66TyrPhe: 0.66 ± 0.223
1.979TyrGly: 1.979 ± 0.366
0.385TyrHis: 0.385 ± 0.126
1.539TyrIle: 1.539 ± 0.273
0.715TyrLys: 0.715 ± 0.225
1.924TyrLeu: 1.924 ± 0.286
0.22TyrMet: 0.22 ± 0.113
0.715TyrAsn: 0.715 ± 0.225
1.429TyrPro: 1.429 ± 0.268
0.99TyrGln: 0.99 ± 0.225
1.869TyrArg: 1.869 ± 0.353
1.155TyrSer: 1.155 ± 0.255
1.869TyrThr: 1.869 ± 0.373
2.364TyrVal: 2.364 ± 0.326
0.385TyrTrp: 0.385 ± 0.145
0.715TyrTyr: 0.715 ± 0.162
0.0TyrXaa: 0.0 ± 0.0
Xaa
0.0XaaAla: 0.0 ± 0.0
0.0XaaCys: 0.0 ± 0.0
0.0XaaAsp: 0.0 ± 0.0
0.0XaaGlu: 0.0 ± 0.0
0.0XaaPhe: 0.0 ± 0.0
0.0XaaGly: 0.0 ± 0.0
0.0XaaHis: 0.0 ± 0.0
0.0XaaIle: 0.0 ± 0.0
0.0XaaLys: 0.0 ± 0.0
0.0XaaLeu: 0.0 ± 0.0
0.0XaaMet: 0.0 ± 0.0
0.0XaaAsn: 0.0 ± 0.0
0.0XaaPro: 0.0 ± 0.0
0.0XaaGln: 0.0 ± 0.0
0.0XaaArg: 0.0 ± 0.0
0.0XaaSer: 0.0 ± 0.0
0.0XaaThr: 0.0 ± 0.0
0.0XaaVal: 0.0 ± 0.0
0.0XaaTrp: 0.0 ± 0.0
0.0XaaTyr: 0.0 ± 0.0
0.0XaaXaa: 0.0 ± 0.0
Statistics based on 109 proteins (18190 amino acids)

Note: The error was estimated by bootstrapping (100 replicates) at the protein level.
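One way to read the note above is sketched below in Python: whole proteins are resampled with replacement, the dipeptide frequency is recomputed for each replicate, and the spread of the replicate values is taken as the error. This is only a minimal sketch under stated assumptions. The function names, the fixed random seed, and the use of the sample standard deviation are choices made for this example, not details confirmed by the database.

```python
import random
from collections import Counter
from statistics import stdev


def pair_counts(seq):
    """Count overlapping residue pairs in a single protein sequence."""
    return Counter(a + b for a, b in zip(seq, seq[1:]))


def bootstrap_error(proteins, pair, n_rounds=100, seed=0):
    """Estimate the error (in per mille) of one dipeptide frequency."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    per_protein = [pair_counts(s) for s in proteins]
    estimates = []
    for _ in range(n_rounds):
        # Resample whole proteins with replacement (protein-level bootstrap).
        sample = rng.choices(per_protein, k=len(per_protein))
        total = sum(sum(c.values()) for c in sample)
        hits = sum(c[pair] for c in sample)
        estimates.append(1000.0 * hits / total if total else 0.0)
    # Assumption: the reported error is the spread of the replicate values.
    return stdev(estimates)
```

Resampling proteins rather than individual residue pairs keeps each protein's internal composition intact, so the estimate reflects variation between proteins in the proteome.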

The dipeptide statistics above (along with other statistics for this proteome) can be downloaded from this CSV file
See this proteome in: uniprot_link
Proteome-pI is available under the Creative Commons Attribution-NoDerivs license; for more details see here

Reference: Kozlowski LP. Proteome-pI 2.0: Proteome Isoelectric Point Database Update. Nucleic Acids Res. 2021, doi: 10.1093/nar/gkab944
Contact: Lukasz P. Kozlowski