Amino acid dipeptide frequency for Mycobacterium phage IPhane7

Apart from single amino acid frequencies, one can also calculate the so-called amino acid dipeptide frequency, i.e. how often each ordered pair of adjacent residues occurs.

There are 400 possible dipeptides (441 if Xaa is counted as an additional 21st amino acid representing all non-standard or unknown amino acids). Thus, if dipeptides occurred randomly in proteins, each of the 400 dipeptides would be present at a frequency of 0.25% (2.5‰). As this is not the case in nature, for better visibility all dipeptides that are more frequent than expected are marked in red, and those that are underrepresented are marked in blue in the table.
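As a rough illustration of the calculation described above, the following Python sketch counts overlapping adjacent residue pairs and reports them in per mille. It is not the code used by Proteome-pI. The function names, the toy sequences, the choice not to count pairs across protein boundaries, and the use of the uniform 0.25% (2.5‰) value as the "expected" threshold are all assumptions made for this example.

```python
from collections import Counter

STANDARD = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard amino acids, one-letter codes

# Uniform expectation from the text: 1/400 = 0.25 % = 2.5 per mille.
UNIFORM_PERMILLE = 1000.0 / 400


def dipeptide_permille(proteins):
    """Return {dipeptide: frequency in per mille} for a list of sequences.

    Assumption: pairs are overlapping and are counted within each protein only.
    """
    counts = Counter()
    for seq in proteins:
        # Map every non-standard or unknown letter to X (Xaa).
        seq = "".join(c if c in STANDARD else "X" for c in seq.upper())
        counts.update(seq[i:i + 2] for i in range(len(seq) - 1))
    total = sum(counts.values())
    if total == 0:
        return {}
    return {pair: 1000.0 * n / total for pair, n in counts.items()}


def classify(freq_permille, expected=UNIFORM_PERMILLE):
    """Label a frequency relative to the (assumed) uniform expectation."""
    if freq_permille > expected:
        return "over"
    if freq_permille < expected:
        return "under"
    return "expected"


# Toy input (hypothetical sequences, not taken from the proteome).
toy = ["MAAL", "AAG"]
freqs = dipeptide_permille(toy)
```

With the table values, `classify(11.147)` (AlaAla) gives `"over"` and `classify(0.737)` (AlaCys) gives `"under"` under this uniform assumption.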

All values are given in per mille (‰), so they need to be multiplied by 10⁻³ to obtain fractions.
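The unit conversion can be written out as a short Python sketch. The helper names are hypothetical; the only value used is the AlaAla entry from the table.

```python
def permille_to_fraction(value_permille: float) -> float:
    """Convert a per-mille value to a plain fraction (multiply by 10^-3)."""
    return value_permille * 1e-3


def permille_to_percent(value_permille: float) -> float:
    """Convert a per-mille value to percent (divide by 10)."""
    return value_permille / 10.0


ala_ala = 11.147  # AlaAla value from the table, in per mille
fraction = permille_to_fraction(ala_ala)  # about 0.011147
percent = permille_to_percent(ala_ala)    # about 1.1147 %
```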

For more information, see the sequence space article on Wikipedia.

Dipeptide frequencies (‰, value ± error), grouped by first residue. Within each group the second residue runs in the order:
Ala Cys Asp Glu Phe Gly His Ile Lys Leu Met Asn Pro Gln Arg Ser Thr Val Trp Tyr Xaa

Ala
AlaAla: 11.147 ± 1.454
AlaCys: 0.737 ± 0.196
AlaAsp: 6.766 ± 0.558
AlaGlu: 7.894 ± 0.758
AlaPhe: 2.689 ± 0.376
AlaGly: 6.94 ± 0.916
AlaHis: 2.082 ± 0.354
AlaIle: 5.075 ± 0.442
AlaLys: 5.291 ± 0.52
AlaLeu: 10.149 ± 0.879
AlaMet: 2.559 ± 0.339
AlaAsn: 3.513 ± 0.656
AlaPro: 4.251 ± 0.612
AlaGln: 4.337 ± 0.438
AlaArg: 5.942 ± 0.533
AlaSer: 4.684 ± 0.431
AlaThr: 5.725 ± 0.671
AlaVal: 7.807 ± 0.58
AlaTrp: 2.299 ± 0.315
AlaTyr: 2.429 ± 0.318
AlaXaa: 0.0 ± 0.0

Cys
CysAla: 0.867 ± 0.234
CysCys: 0.043 ± 0.041
CysAsp: 0.52 ± 0.166
CysGlu: 0.39 ± 0.14
CysPhe: 0.173 ± 0.084
CysGly: 0.998 ± 0.22
CysHis: 0.087 ± 0.061
CysIle: 0.477 ± 0.162
CysLys: 0.173 ± 0.09
CysLeu: 0.52 ± 0.169
CysMet: 0.26 ± 0.125
CysAsn: 0.434 ± 0.136
CysPro: 0.651 ± 0.234
CysGln: 0.173 ± 0.086
CysArg: 0.52 ± 0.155
CysSer: 0.477 ± 0.166
CysThr: 0.477 ± 0.131
CysVal: 0.867 ± 0.213
CysTrp: 0.347 ± 0.127
CysTyr: 0.347 ± 0.119
CysXaa: 0.0 ± 0.0

Asp
AspAla: 6.549 ± 0.468
AspCys: 0.737 ± 0.235
AspAsp: 5.118 ± 0.528
AspGlu: 4.641 ± 0.532
AspPhe: 2.082 ± 0.334
AspGly: 6.81 ± 0.476
AspHis: 1.431 ± 0.243
AspIle: 2.776 ± 0.336
AspLys: 2.385 ± 0.356
AspLeu: 5.638 ± 0.515
AspMet: 1.865 ± 0.354
AspAsn: 1.995 ± 0.29
AspPro: 4.598 ± 0.377
AspGln: 1.952 ± 0.304
AspArg: 4.554 ± 0.491
AspSer: 2.906 ± 0.49
AspThr: 3.253 ± 0.356
AspVal: 3.99 ± 0.451
AspTrp: 1.475 ± 0.302
AspTyr: 2.169 ± 0.265
AspXaa: 0.0 ± 0.0

Glu
GluAla: 6.94 ± 0.608
GluCys: 0.52 ± 0.17
GluAsp: 4.12 ± 0.613
GluGlu: 3.817 ± 0.551
GluPhe: 2.385 ± 0.382
GluGly: 5.031 ± 0.475
GluHis: 1.431 ± 0.261
GluIle: 3.383 ± 0.496
GluLys: 2.776 ± 0.348
GluLeu: 5.725 ± 0.577
GluMet: 2.082 ± 0.293
GluAsn: 2.039 ± 0.3
GluPro: 2.212 ± 0.341
GluGln: 2.819 ± 0.328
GluArg: 4.511 ± 0.42
GluSer: 2.819 ± 0.276
GluThr: 3.21 ± 0.368
GluVal: 4.684 ± 0.531
GluTrp: 1.865 ± 0.381
GluTyr: 1.908 ± 0.311
GluXaa: 0.0 ± 0.0

Phe
PheAla: 2.732 ± 0.414
PheCys: 0.217 ± 0.092
PheAsp: 2.993 ± 0.354
PheGlu: 2.082 ± 0.277
PhePhe: 0.911 ± 0.19
PheGly: 2.299 ± 0.321
PheHis: 0.607 ± 0.191
PheIle: 1.648 ± 0.264
PheLys: 1.605 ± 0.376
PheLeu: 2.212 ± 0.374
PheMet: 0.607 ± 0.193
PheAsn: 1.128 ± 0.215
PhePro: 1.561 ± 0.245
PheGln: 1.345 ± 0.266
PheArg: 1.648 ± 0.23
PheSer: 1.735 ± 0.28
PheThr: 1.995 ± 0.324
PheVal: 2.472 ± 0.343
PheTrp: 0.477 ± 0.153
PheTyr: 0.737 ± 0.194
PheXaa: 0.0 ± 0.0

Gly
GlyAla: 7.33 ± 0.986
GlyCys: 0.781 ± 0.178
GlyAsp: 5.205 ± 0.565
GlyGlu: 4.901 ± 0.439
GlyPhe: 2.863 ± 0.305
GlyGly: 8.501 ± 1.172
GlyHis: 1.084 ± 0.325
GlyIle: 3.86 ± 0.599
GlyLys: 3.947 ± 0.363
GlyLeu: 6.072 ± 0.794
GlyMet: 1.865 ± 0.276
GlyAsn: 3.383 ± 0.444
GlyPro: 3.34 ± 0.749
GlyGln: 3.036 ± 0.52
GlyArg: 4.467 ± 0.53
GlySer: 5.465 ± 0.563
GlyThr: 5.942 ± 0.533
GlyVal: 6.202 ± 0.588
GlyTrp: 2.342 ± 0.365
GlyTyr: 2.863 ± 0.331
GlyXaa: 0.0 ± 0.0

His
HisAla: 1.345 ± 0.28
HisCys: 0.347 ± 0.12
HisAsp: 1.214 ± 0.302
HisGlu: 1.778 ± 0.358
HisPhe: 0.694 ± 0.154
HisGly: 1.692 ± 0.289
HisHis: 0.564 ± 0.162
HisIle: 1.214 ± 0.229
HisLys: 0.781 ± 0.181
HisLeu: 1.865 ± 0.369
HisMet: 0.39 ± 0.126
HisAsn: 0.781 ± 0.158
HisPro: 0.954 ± 0.268
HisGln: 0.39 ± 0.13
HisArg: 1.518 ± 0.304
HisSer: 0.954 ± 0.28
HisThr: 1.561 ± 0.29
HisVal: 1.301 ± 0.216
HisTrp: 0.39 ± 0.132
HisTyr: 0.651 ± 0.199
HisXaa: 0.0 ± 0.0

Ile
IleAla: 4.424 ± 0.359
IleCys: 0.347 ± 0.132
IleAsp: 4.034 ± 0.389
IleGlu: 3.773 ± 0.379
IlePhe: 1.041 ± 0.188
IleGly: 4.12 ± 0.437
IleHis: 0.911 ± 0.237
IleIle: 2.125 ± 0.278
IleLys: 2.385 ± 0.272
IleLeu: 3.904 ± 0.417
IleMet: 1.128 ± 0.217
IleAsn: 1.692 ± 0.296
IlePro: 3.34 ± 0.53
IleGln: 1.605 ± 0.285
IleArg: 3.687 ± 0.394
IleSer: 1.605 ± 0.245
IleThr: 3.47 ± 0.377
IleVal: 2.689 ± 0.315
IleTrp: 0.911 ± 0.181
IleTyr: 1.388 ± 0.292
IleXaa: 0.0 ± 0.0

Lys
LysAla: 5.291 ± 0.585
LysCys: 0.26 ± 0.106
LysAsp: 2.472 ± 0.337
LysGlu: 1.908 ± 0.314
LysPhe: 1.388 ± 0.319
LysGly: 3.513 ± 0.468
LysHis: 0.867 ± 0.175
LysIle: 2.125 ± 0.296
LysLys: 2.039 ± 0.294
LysLeu: 3.47 ± 0.316
LysMet: 1.084 ± 0.203
LysAsn: 1.692 ± 0.3
LysPro: 2.169 ± 0.307
LysGln: 1.778 ± 0.264
LysArg: 2.819 ± 0.356
LysSer: 2.559 ± 0.358
LysThr: 2.732 ± 0.368
LysVal: 3.296 ± 0.379
LysTrp: 1.561 ± 0.278
LysTyr: 1.692 ± 0.264
LysXaa: 0.0 ± 0.0

Leu
LeuAla: 9.715 ± 0.511
LeuCys: 0.564 ± 0.149
LeuAsp: 5.855 ± 0.543
LeuGlu: 5.248 ± 0.428
LeuPhe: 2.516 ± 0.319
LeuGly: 6.202 ± 0.662
LeuHis: 1.345 ± 0.288
LeuIle: 3.947 ± 0.479
LeuLys: 3.123 ± 0.358
LeuLeu: 5.552 ± 0.538
LeuMet: 1.865 ± 0.261
LeuAsn: 2.949 ± 0.553
LeuPro: 4.294 ± 0.459
LeuGln: 2.732 ± 0.384
LeuArg: 5.205 ± 0.457
LeuSer: 4.901 ± 0.396
LeuThr: 4.944 ± 0.527
LeuVal: 5.769 ± 0.58
LeuTrp: 1.431 ± 0.252
LeuTyr: 2.516 ± 0.409
LeuXaa: 0.0 ± 0.0

Met
MetAla: 3.557 ± 0.344
MetCys: 0.26 ± 0.107
MetAsp: 1.214 ± 0.252
MetGlu: 1.258 ± 0.198
MetPhe: 0.651 ± 0.174
MetGly: 1.518 ± 0.251
MetHis: 0.477 ± 0.147
MetIle: 0.911 ± 0.242
MetLys: 0.824 ± 0.181
MetLeu: 1.692 ± 0.301
MetMet: 0.347 ± 0.12
MetAsn: 0.694 ± 0.142
MetPro: 1.561 ± 0.211
MetGln: 1.084 ± 0.257
MetArg: 1.692 ± 0.283
MetSer: 3.296 ± 0.423
MetThr: 1.908 ± 0.316
MetVal: 1.171 ± 0.234
MetTrp: 0.477 ± 0.151
MetTyr: 0.694 ± 0.189
MetXaa: 0.0 ± 0.0

Asn
AsnAla: 3.687 ± 0.517
AsnCys: 0.347 ± 0.128
AsnAsp: 1.778 ± 0.321
AsnGlu: 2.472 ± 0.338
AsnPhe: 1.128 ± 0.252
AsnGly: 3.6 ± 0.446
AsnHis: 0.998 ± 0.186
AsnIle: 1.301 ± 0.24
AsnLys: 1.431 ± 0.202
AsnLeu: 2.516 ± 0.324
AsnMet: 1.041 ± 0.223
AsnAsn: 1.041 ± 0.201
AsnPro: 2.516 ± 0.299
AsnGln: 1.258 ± 0.241
AsnArg: 2.732 ± 0.32
AsnSer: 2.039 ± 0.268
AsnThr: 1.995 ± 0.328
AsnVal: 2.169 ± 0.289
AsnTrp: 0.694 ± 0.167
AsnTyr: 0.911 ± 0.239
AsnXaa: 0.0 ± 0.0

Pro
ProAla: 5.335 ± 0.802
ProCys: 0.347 ± 0.137
ProAsp: 3.557 ± 0.362
ProGlu: 3.47 ± 0.414
ProPhe: 1.648 ± 0.218
ProGly: 4.598 ± 0.881
ProHis: 1.041 ± 0.236
ProIle: 1.735 ± 0.301
ProLys: 2.255 ± 0.367
ProLeu: 3.904 ± 0.503
ProMet: 1.692 ± 0.296
ProAsn: 1.648 ± 0.328
ProPro: 2.212 ± 0.451
ProGln: 1.778 ± 0.276
ProArg: 2.516 ± 0.319
ProSer: 2.559 ± 0.33
ProThr: 2.863 ± 0.353
ProVal: 3.86 ± 0.419
ProTrp: 0.954 ± 0.221
ProTyr: 1.388 ± 0.245
ProXaa: 0.0 ± 0.0

Gln
GlnAla: 4.034 ± 0.484
GlnCys: 0.607 ± 0.204
GlnAsp: 1.865 ± 0.258
GlnGlu: 1.822 ± 0.285
GlnPhe: 0.824 ± 0.181
GlnGly: 2.993 ± 0.398
GlnHis: 0.781 ± 0.212
GlnIle: 2.863 ± 0.327
GlnLys: 1.388 ± 0.242
GlnLeu: 2.906 ± 0.324
GlnMet: 1.041 ± 0.201
GlnAsn: 0.954 ± 0.193
GlnPro: 1.648 ± 0.231
GlnGln: 1.822 ± 0.303
GlnArg: 2.819 ± 0.343
GlnSer: 1.518 ± 0.255
GlnThr: 2.082 ± 0.336
GlnVal: 2.602 ± 0.319
GlnTrp: 1.128 ± 0.207
GlnTyr: 0.954 ± 0.213
GlnXaa: 0.0 ± 0.0

Arg
ArgAla: 6.463 ± 0.441
ArgCys: 0.651 ± 0.173
ArgAsp: 3.6 ± 0.39
ArgGlu: 3.86 ± 0.337
ArgPhe: 2.385 ± 0.356
ArgGly: 4.598 ± 0.432
ArgHis: 1.128 ± 0.225
ArgIle: 3.947 ± 0.449
ArgLys: 3.773 ± 0.473
ArgLeu: 4.598 ± 0.373
ArgMet: 2.385 ± 0.377
ArgAsn: 2.819 ± 0.333
ArgPro: 2.776 ± 0.419
ArgGln: 2.863 ± 0.365
ArgArg: 3.817 ± 0.421
ArgSer: 2.559 ± 0.31
ArgThr: 3.47 ± 0.394
ArgVal: 4.554 ± 0.485
ArgTrp: 1.518 ± 0.304
ArgTyr: 1.865 ± 0.287
ArgXaa: 0.0 ± 0.0

Ser
SerAla: 4.988 ± 0.471
SerCys: 0.39 ± 0.161
SerAsp: 3.557 ± 0.365
SerGlu: 2.385 ± 0.38
SerPhe: 1.952 ± 0.282
SerGly: 5.161 ± 0.534
SerHis: 0.998 ± 0.232
SerIle: 2.299 ± 0.393
SerLys: 1.952 ± 0.258
SerLeu: 4.077 ± 0.392
SerMet: 1.084 ± 0.257
SerAsn: 2.689 ± 0.4
SerPro: 2.039 ± 0.337
SerGln: 1.258 ± 0.217
SerArg: 2.906 ± 0.339
SerSer: 3.687 ± 0.501
SerThr: 4.164 ± 0.512
SerVal: 4.12 ± 0.435
SerTrp: 1.518 ± 0.349
SerTyr: 1.952 ± 0.267
SerXaa: 0.0 ± 0.0

Thr
ThrAla: 6.246 ± 0.604
ThrCys: 0.347 ± 0.123
ThrAsp: 3.904 ± 0.397
ThrGlu: 3.904 ± 0.366
ThrPhe: 1.908 ± 0.311
ThrGly: 5.248 ± 0.476
ThrHis: 1.084 ± 0.225
ThrIle: 3.383 ± 0.435
ThrLys: 2.559 ± 0.327
ThrLeu: 5.335 ± 0.496
ThrMet: 0.998 ± 0.211
ThrAsn: 2.039 ± 0.308
ThrPro: 3.817 ± 0.534
ThrGln: 2.082 ± 0.247
ThrArg: 3.47 ± 0.448
ThrSer: 2.602 ± 0.463
ThrThr: 3.643 ± 0.361
ThrVal: 4.988 ± 0.463
ThrTrp: 1.258 ± 0.23
ThrTyr: 2.299 ± 0.305
ThrXaa: 0.0 ± 0.0

Val
ValAla: 6.332 ± 0.592
ValCys: 0.651 ± 0.241
ValAsp: 5.508 ± 0.567
ValGlu: 4.814 ± 0.586
ValPhe: 1.995 ± 0.241
ValGly: 5.552 ± 0.467
ValHis: 1.952 ± 0.343
ValIle: 3.513 ± 0.41
ValLys: 3.47 ± 0.275
ValLeu: 5.682 ± 0.499
ValMet: 1.605 ± 0.324
ValAsn: 2.602 ± 0.302
ValPro: 3.253 ± 0.361
ValGln: 2.559 ± 0.357
ValArg: 4.467 ± 0.436
ValSer: 4.467 ± 0.457
ValThr: 4.598 ± 0.627
ValVal: 4.684 ± 0.465
ValTrp: 1.301 ± 0.257
ValTyr: 1.605 ± 0.297
ValXaa: 0.0 ± 0.0

Trp
TrpAla: 2.385 ± 0.315
TrpCys: 0.304 ± 0.162
TrpAsp: 1.431 ± 0.221
TrpGlu: 1.995 ± 0.288
TrpPhe: 0.781 ± 0.191
TrpGly: 1.388 ± 0.256
TrpHis: 0.694 ± 0.22
TrpIle: 0.911 ± 0.172
TrpLys: 0.954 ± 0.197
TrpLeu: 2.516 ± 0.442
TrpMet: 0.651 ± 0.189
TrpAsn: 0.694 ± 0.17
TrpPro: 0.737 ± 0.156
TrpGln: 0.737 ± 0.184
TrpArg: 1.648 ± 0.262
TrpSer: 0.954 ± 0.264
TrpThr: 1.345 ± 0.257
TrpVal: 1.648 ± 0.274
TrpTrp: 0.781 ± 0.195
TrpTyr: 0.651 ± 0.208
TrpXaa: 0.0 ± 0.0

Tyr
TyrAla: 2.949 ± 0.381
TyrCys: 0.347 ± 0.114
TyrAsp: 2.212 ± 0.287
TyrGlu: 1.865 ± 0.366
TyrPhe: 0.998 ± 0.181
TyrGly: 2.689 ± 0.337
TyrHis: 0.824 ± 0.213
TyrIle: 1.084 ± 0.18
TyrLys: 1.605 ± 0.278
TyrLeu: 2.342 ± 0.304
TyrMet: 0.694 ± 0.173
TyrAsn: 0.867 ± 0.198
TyrPro: 1.431 ± 0.297
TyrGln: 1.041 ± 0.204
TyrArg: 2.732 ± 0.317
TyrSer: 1.388 ± 0.268
TyrThr: 1.735 ± 0.339
TyrVal: 1.735 ± 0.229
TyrTrp: 0.477 ± 0.147
TyrTyr: 1.171 ± 0.264
TyrXaa: 0.0 ± 0.0

Xaa
XaaAla: 0.0 ± 0.0
XaaCys: 0.0 ± 0.0
XaaAsp: 0.0 ± 0.0
XaaGlu: 0.0 ± 0.0
XaaPhe: 0.0 ± 0.0
XaaGly: 0.0 ± 0.0
XaaHis: 0.0 ± 0.0
XaaIle: 0.0 ± 0.0
XaaLys: 0.0 ± 0.0
XaaLeu: 0.0 ± 0.0
XaaMet: 0.0 ± 0.0
XaaAsn: 0.0 ± 0.0
XaaPro: 0.0 ± 0.0
XaaGln: 0.0 ± 0.0
XaaArg: 0.0 ± 0.0
XaaSer: 0.0 ± 0.0
XaaThr: 0.0 ± 0.0
XaaVal: 0.0 ± 0.0
XaaTrp: 0.0 ± 0.0
XaaTyr: 0.0 ± 0.0
XaaXaa: 0.0 ± 0.0

Statistics based on 137 proteins (23057 amino acids)

Note: The error has been estimated by bootstrapping (×100) at the protein level.
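The exact bootstrap procedure is not described on this page. One plausible reading, sketched below in Python, is to resample whole proteins with replacement 100 times, recompute the dipeptide frequency on each resample, and report the standard deviation of those estimates. The function names, the toy sequences, the fixed seed, and the use of the sample standard deviation as the error are assumptions made for this example.

```python
import random
import statistics


def pair_permille(proteins, pair):
    """Frequency of one dipeptide (per mille) over overlapping pairs within proteins."""
    hits = 0
    total = 0
    for seq in proteins:
        for i in range(len(seq) - 1):
            total += 1
            if seq[i:i + 2] == pair:
                hits += 1
    return 1000.0 * hits / total if total else 0.0


def bootstrap_error(proteins, pair, n_rounds=100, seed=0):
    """Standard deviation of the frequency across protein-level resamples."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    estimates = []
    for _ in range(n_rounds):
        # Draw as many proteins as in the original set, with replacement.
        sample = rng.choices(proteins, k=len(proteins))
        estimates.append(pair_permille(sample, pair))
    return statistics.stdev(estimates)


# Toy input (hypothetical sequences, not taken from the proteome).
toy = ["MAAL", "AAG", "MKV", "GAAA"]
err = bootstrap_error(toy, "AA")
```

Resampling whole proteins rather than individual residues keeps each sequence intact, so the estimate reflects variation between proteins.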

The dipeptide statistics above (among other statistics for this proteome) can be downloaded from this CSV file
See this proteome in: uniprot_link
Proteome-pI is available under Creative Commons Attribution-NoDerivs license, for more details see here

Reference: Kozlowski LP. Proteome-pI 2.0: Proteome Isoelectric Point Database Update. Nucleic Acids Res. 2021, doi: 10.1093/nar/gkab944
Contact: Lukasz P. Kozlowski