Amino acid dipeptide frequency for Mycobacterium phage JoieB

Apart from single amino acid frequencies, one can also calculate the so-called amino acid dipeptide frequency.

There are 400 possible dipeptides (441 if we consider Xaa as an additional 21st amino acid representing all non-standard or unknown amino acids). Thus, if all dipeptides were equally likely to occur in proteins, each amino acid dipeptide would be present at a frequency of 0.25% (2.5‰). As this is not the case in nature, for better visibility all overrepresented dipeptides are marked in red in the table, and underrepresented ones are marked in blue.
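The counting described above can be sketched in a few lines of Python. This is a minimal illustration, not the code used to build the table: the function names (`dipeptide_permille`, `classify`), the one-letter input alphabet, the pooling of every non-standard letter into `X` (Xaa), and the use of the uniform 2.5‰ value as the "expected" baseline are all assumptions made for the example.

```python
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues, one-letter codes

# Uniform baseline: 1 of 400 possible dipeptides, expressed in per mille.
UNIFORM_PERMILLE = 1000.0 / len(AMINO_ACIDS) ** 2  # 2.5


def dipeptide_permille(proteins):
    """Return {dipeptide: frequency in per mille} pooled over all proteins."""
    counts = Counter()
    for seq in proteins:
        # Overlapping residue pairs within one protein; a pair never spans
        # the boundary between two proteins.
        for first, second in zip(seq, seq[1:]):
            # Assumption: any letter outside the 20 standard ones becomes X (Xaa).
            first = first if first in AMINO_ACIDS else "X"
            second = second if second in AMINO_ACIDS else "X"
            counts[first + second] += 1
    total = sum(counts.values())
    if total == 0:
        return {}
    return {pair: 1000.0 * n / total for pair, n in counts.items()}


def classify(freq_permille, expected=UNIFORM_PERMILLE):
    """Label a frequency as over- or underrepresented against the baseline."""
    if freq_permille > expected:
        return "over"
    if freq_permille < expected:
        return "under"
    return "as expected"
```

With this baseline, a value such as 12.097‰ would be labelled "over" and 0.919‰ "under". A composition-aware baseline (the product of the two single-residue frequencies) is a common alternative, but the page does not say which one drives the colouring.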

All values are presented in per mille (‰) and therefore need to be multiplied by 10⁻³ to obtain fractions.

For more information, see the sequence space article on Wikipedia.

Ala Cys Asp Glu Phe Gly His Ile Lys Leu Met Asn Pro Gln Arg Ser Thr Val Trp Tyr Xaa
Ala
AlaAla: 12.097 ± 1.046
AlaCys: 0.919 ± 0.23
AlaAsp: 6.436 ± 0.578
AlaGlu: 6.726 ± 0.686
AlaPhe: 2.903 ± 0.418
AlaGly: 6.339 ± 0.536
AlaHis: 2.371 ± 0.328
AlaIle: 4.597 ± 0.519
AlaLys: 4.355 ± 0.513
AlaLeu: 8.565 ± 0.776
AlaMet: 2.613 ± 0.305
AlaAsn: 3.726 ± 0.511
AlaPro: 4.742 ± 0.494
AlaGln: 3.774 ± 0.569
AlaArg: 6.049 ± 0.542
AlaSer: 4.452 ± 0.578
AlaThr: 5.758 ± 0.483
AlaVal: 6.097 ± 0.646
AlaTrp: 2.032 ± 0.278
AlaTyr: 3.0 ± 0.347
AlaXaa: 0.0 ± 0.0
Cys
CysAla: 1.065 ± 0.243
CysCys: 0.242 ± 0.122
CysAsp: 1.403 ± 0.328
CysGlu: 0.677 ± 0.207
CysPhe: 0.29 ± 0.118
CysGly: 1.645 ± 0.387
CysHis: 0.484 ± 0.162
CysIle: 0.823 ± 0.209
CysLys: 0.339 ± 0.157
CysLeu: 0.435 ± 0.143
CysMet: 0.097 ± 0.069
CysAsn: 0.629 ± 0.146
CysPro: 0.871 ± 0.201
CysGln: 0.726 ± 0.186
CysArg: 0.968 ± 0.214
CysSer: 0.581 ± 0.191
CysThr: 0.435 ± 0.136
CysVal: 0.871 ± 0.185
CysTrp: 0.242 ± 0.108
CysTyr: 0.387 ± 0.146
CysXaa: 0.0 ± 0.0
Asp
AspAla: 5.42 ± 0.491
AspCys: 0.677 ± 0.225
AspAsp: 5.274 ± 0.653
AspGlu: 5.661 ± 0.522
AspPhe: 2.129 ± 0.304
AspGly: 4.79 ± 0.559
AspHis: 1.21 ± 0.22
AspIle: 3.823 ± 0.563
AspLys: 2.129 ± 0.349
AspLeu: 6.0 ± 0.512
AspMet: 1.839 ± 0.312
AspAsn: 1.79 ± 0.321
AspPro: 4.839 ± 0.576
AspGln: 1.597 ± 0.318
AspArg: 4.161 ± 0.42
AspSer: 2.468 ± 0.408
AspThr: 3.0 ± 0.463
AspVal: 4.79 ± 0.454
AspTrp: 1.403 ± 0.231
AspTyr: 2.565 ± 0.338
AspXaa: 0.0 ± 0.0
Glu
GluAla: 7.065 ± 0.684
GluCys: 0.823 ± 0.268
GluAsp: 3.678 ± 0.528
GluGlu: 4.887 ± 0.525
GluPhe: 3.387 ± 0.391
GluGly: 4.694 ± 0.458
GluHis: 1.597 ± 0.348
GluIle: 2.613 ± 0.348
GluLys: 2.129 ± 0.371
GluLeu: 5.952 ± 0.579
GluMet: 2.661 ± 0.401
GluAsn: 1.936 ± 0.341
GluPro: 2.758 ± 0.421
GluGln: 2.855 ± 0.355
GluArg: 5.323 ± 0.512
GluSer: 3.484 ± 0.426
GluThr: 3.29 ± 0.361
GluVal: 4.936 ± 0.619
GluTrp: 1.694 ± 0.331
GluTyr: 1.79 ± 0.361
GluXaa: 0.0 ± 0.0
Phe
PheAla: 2.371 ± 0.289
PheCys: 0.581 ± 0.17
PheAsp: 2.661 ± 0.333
PheGlu: 2.419 ± 0.311
PhePhe: 1.016 ± 0.192
PheGly: 3.29 ± 0.434
PheHis: 0.774 ± 0.177
PheIle: 1.548 ± 0.282
PheLys: 1.694 ± 0.285
PheLeu: 2.613 ± 0.422
PheMet: 0.726 ± 0.168
PheAsn: 1.452 ± 0.237
PhePro: 1.113 ± 0.239
PheGln: 1.016 ± 0.184
PheArg: 2.129 ± 0.306
PheSer: 1.306 ± 0.227
PheThr: 2.274 ± 0.423
PheVal: 2.129 ± 0.326
PheTrp: 0.871 ± 0.23
PheTyr: 1.161 ± 0.197
PheXaa: 0.0 ± 0.0
Gly
GlyAla: 6.581 ± 0.841
GlyCys: 0.677 ± 0.198
GlyAsp: 5.952 ± 0.678
GlyGlu: 4.258 ± 0.533
GlyPhe: 3.484 ± 0.394
GlyGly: 8.952 ± 1.272
GlyHis: 1.548 ± 0.256
GlyIle: 3.339 ± 0.448
GlyLys: 3.919 ± 0.483
GlyLeu: 6.339 ± 0.634
GlyMet: 2.081 ± 0.335
GlyAsn: 3.0 ± 0.443
GlyPro: 4.016 ± 0.663
GlyGln: 2.903 ± 0.44
GlyArg: 5.516 ± 0.6
GlySer: 5.42 ± 0.609
GlyThr: 4.597 ± 0.651
GlyVal: 5.71 ± 0.581
GlyTrp: 2.032 ± 0.37
GlyTyr: 2.661 ± 0.396
GlyXaa: 0.0 ± 0.0
His
HisAla: 1.936 ± 0.362
HisCys: 0.242 ± 0.116
HisAsp: 1.5 ± 0.36
HisGlu: 1.839 ± 0.269
HisPhe: 0.339 ± 0.123
HisGly: 1.936 ± 0.275
HisHis: 0.968 ± 0.196
HisIle: 1.016 ± 0.239
HisLys: 0.581 ± 0.184
HisLeu: 2.177 ± 0.311
HisMet: 0.29 ± 0.118
HisAsn: 0.871 ± 0.169
HisPro: 1.79 ± 0.252
HisGln: 0.823 ± 0.189
HisArg: 1.113 ± 0.236
HisSer: 1.016 ± 0.22
HisThr: 0.968 ± 0.199
HisVal: 1.113 ± 0.228
HisTrp: 0.29 ± 0.122
HisTyr: 0.823 ± 0.191
HisXaa: 0.0 ± 0.0
Ile
IleAla: 5.081 ± 0.505
IleCys: 0.823 ± 0.228
IleAsp: 3.774 ± 0.433
IleGlu: 3.484 ± 0.444
IlePhe: 1.21 ± 0.268
IleGly: 3.871 ± 0.507
IleHis: 1.161 ± 0.242
IleIle: 1.645 ± 0.276
IleLys: 1.548 ± 0.29
IleLeu: 3.339 ± 0.405
IleMet: 0.774 ± 0.154
IleAsn: 1.548 ± 0.334
IlePro: 2.807 ± 0.474
IleGln: 1.548 ± 0.259
IleArg: 3.097 ± 0.379
IleSer: 2.468 ± 0.326
IleThr: 2.419 ± 0.479
IleVal: 2.903 ± 0.415
IleTrp: 0.726 ± 0.207
IleTyr: 1.065 ± 0.261
IleXaa: 0.0 ± 0.0
Lys
LysAla: 4.645 ± 0.54
LysCys: 0.823 ± 0.216
LysAsp: 1.355 ± 0.26
LysGlu: 1.79 ± 0.464
LysPhe: 1.452 ± 0.296
LysGly: 2.952 ± 0.339
LysHis: 1.016 ± 0.204
LysIle: 1.597 ± 0.258
LysLys: 2.371 ± 0.4
LysLeu: 3.339 ± 0.375
LysMet: 1.016 ± 0.213
LysAsn: 0.968 ± 0.266
LysPro: 2.661 ± 0.438
LysGln: 1.742 ± 0.354
LysArg: 3.097 ± 0.417
LysSer: 1.597 ± 0.303
LysThr: 1.839 ± 0.349
LysVal: 2.323 ± 0.326
LysTrp: 0.871 ± 0.174
LysTyr: 0.581 ± 0.19
LysXaa: 0.0 ± 0.0
Leu
LeuAla: 7.887 ± 0.717
LeuCys: 1.065 ± 0.32
LeuAsp: 6.339 ± 0.628
LeuGlu: 5.613 ± 0.486
LeuPhe: 2.565 ± 0.458
LeuGly: 6.339 ± 0.878
LeuHis: 1.403 ± 0.254
LeuIle: 3.871 ± 0.424
LeuLys: 3.0 ± 0.363
LeuLeu: 6.484 ± 0.61
LeuMet: 1.839 ± 0.273
LeuAsn: 2.855 ± 0.484
LeuPro: 5.71 ± 0.711
LeuGln: 2.661 ± 0.388
LeuArg: 5.613 ± 0.574
LeuSer: 4.79 ± 0.471
LeuThr: 5.42 ± 0.411
LeuVal: 5.323 ± 0.483
LeuTrp: 1.113 ± 0.22
LeuTyr: 2.226 ± 0.31
LeuXaa: 0.0 ± 0.0
Met
MetAla: 3.29 ± 0.426
MetCys: 0.435 ± 0.139
MetAsp: 0.919 ± 0.21
MetGlu: 1.21 ± 0.271
MetPhe: 1.161 ± 0.23
MetGly: 2.129 ± 0.262
MetHis: 0.484 ± 0.136
MetIle: 1.065 ± 0.239
MetLys: 0.968 ± 0.217
MetLeu: 1.597 ± 0.249
MetMet: 0.581 ± 0.149
MetAsn: 1.161 ± 0.231
MetPro: 1.79 ± 0.31
MetGln: 0.484 ± 0.173
MetArg: 1.645 ± 0.243
MetSer: 2.323 ± 0.321
MetThr: 2.468 ± 0.323
MetVal: 1.5 ± 0.335
MetTrp: 0.871 ± 0.189
MetTyr: 0.194 ± 0.078
MetXaa: 0.0 ± 0.0
Asn
AsnAla: 3.968 ± 0.373
AsnCys: 0.581 ± 0.177
AsnAsp: 2.032 ± 0.276
AsnGlu: 1.984 ± 0.31
AsnPhe: 1.21 ± 0.242
AsnGly: 3.774 ± 0.453
AsnHis: 0.532 ± 0.158
AsnIle: 1.452 ± 0.259
AsnLys: 1.016 ± 0.237
AsnLeu: 2.855 ± 0.403
AsnMet: 0.484 ± 0.129
AsnAsn: 1.306 ± 0.28
AsnPro: 2.661 ± 0.409
AsnGln: 1.597 ± 0.346
AsnArg: 2.081 ± 0.359
AsnSer: 1.839 ± 0.353
AsnThr: 2.032 ± 0.29
AsnVal: 2.177 ± 0.402
AsnTrp: 0.919 ± 0.183
AsnTyr: 0.774 ± 0.235
AsnXaa: 0.0 ± 0.0
Pro
ProAla: 4.549 ± 0.63
ProCys: 0.581 ± 0.175
ProAsp: 3.726 ± 0.449
ProGlu: 4.839 ± 0.562
ProPhe: 1.839 ± 0.261
ProGly: 5.468 ± 0.802
ProHis: 1.452 ± 0.288
ProIle: 2.274 ± 0.414
ProLys: 1.984 ± 0.347
ProLeu: 4.113 ± 0.389
ProMet: 1.403 ± 0.236
ProAsn: 2.323 ± 0.329
ProPro: 3.436 ± 0.457
ProGln: 2.274 ± 0.294
ProArg: 3.242 ± 0.469
ProSer: 4.645 ± 0.552
ProThr: 3.532 ± 0.546
ProVal: 3.629 ± 0.378
ProTrp: 1.113 ± 0.225
ProTyr: 1.597 ± 0.273
ProXaa: 0.0 ± 0.0
Gln
GlnAla: 3.774 ± 0.611
GlnCys: 0.677 ± 0.211
GlnAsp: 1.694 ± 0.273
GlnGlu: 2.177 ± 0.32
GlnPhe: 0.871 ± 0.178
GlnGly: 2.468 ± 0.376
GlnHis: 0.677 ± 0.174
GlnIle: 1.936 ± 0.316
GlnLys: 1.645 ± 0.269
GlnLeu: 3.823 ± 0.445
GlnMet: 1.258 ± 0.233
GlnAsn: 0.774 ± 0.167
GlnPro: 1.984 ± 0.282
GlnGln: 2.032 ± 0.496
GlnArg: 3.145 ± 0.46
GlnSer: 1.887 ± 0.325
GlnThr: 1.694 ± 0.244
GlnVal: 2.613 ± 0.384
GlnTrp: 1.113 ± 0.223
GlnTyr: 0.823 ± 0.17
GlnXaa: 0.0 ± 0.0
Arg
ArgAla: 6.145 ± 0.525
ArgCys: 1.161 ± 0.257
ArgAsp: 4.161 ± 0.392
ArgGlu: 4.549 ± 0.543
ArgPhe: 2.323 ± 0.344
ArgGly: 5.032 ± 0.526
ArgHis: 1.306 ± 0.297
ArgIle: 3.436 ± 0.368
ArgLys: 3.097 ± 0.449
ArgLeu: 6.097 ± 0.509
ArgMet: 1.548 ± 0.296
ArgAsn: 2.419 ± 0.322
ArgPro: 3.29 ± 0.336
ArgGln: 2.661 ± 0.296
ArgArg: 6.484 ± 0.78
ArgSer: 2.952 ± 0.372
ArgThr: 2.661 ± 0.33
ArgVal: 4.21 ± 0.538
ArgTrp: 2.274 ± 0.371
ArgTyr: 1.887 ± 0.332
ArgXaa: 0.0 ± 0.0
Ser
SerAla: 5.274 ± 0.604
SerCys: 0.484 ± 0.161
SerAsp: 4.161 ± 0.549
SerGlu: 3.242 ± 0.385
SerPhe: 1.645 ± 0.316
SerGly: 4.984 ± 0.667
SerHis: 1.161 ± 0.244
SerIle: 2.565 ± 0.369
SerLys: 1.597 ± 0.238
SerLeu: 4.258 ± 0.531
SerMet: 2.081 ± 0.321
SerAsn: 1.694 ± 0.294
SerPro: 3.048 ± 0.373
SerGln: 1.887 ± 0.294
SerArg: 3.871 ± 0.593
SerSer: 3.968 ± 0.528
SerThr: 3.484 ± 0.43
SerVal: 3.532 ± 0.443
SerTrp: 1.21 ± 0.212
SerTyr: 0.919 ± 0.214
SerXaa: 0.0 ± 0.0
Thr
ThrAla: 5.516 ± 0.492
ThrCys: 0.581 ± 0.2
ThrAsp: 3.387 ± 0.45
ThrGlu: 2.807 ± 0.417
ThrPhe: 1.887 ± 0.306
ThrGly: 5.178 ± 0.604
ThrHis: 1.161 ± 0.253
ThrIle: 2.661 ± 0.329
ThrLys: 1.887 ± 0.275
ThrLeu: 5.081 ± 0.654
ThrMet: 1.5 ± 0.272
ThrAsn: 2.371 ± 0.383
ThrPro: 3.919 ± 0.546
ThrGln: 2.129 ± 0.275
ThrArg: 3.532 ± 0.401
ThrSer: 2.468 ± 0.375
ThrThr: 3.436 ± 0.501
ThrVal: 4.161 ± 0.548
ThrTrp: 1.21 ± 0.271
ThrTyr: 1.258 ± 0.266
ThrXaa: 0.0 ± 0.0
Val
ValAla: 7.016 ± 0.584
ValCys: 0.968 ± 0.236
ValAsp: 3.436 ± 0.402
ValGlu: 5.661 ± 0.522
ValPhe: 2.081 ± 0.322
ValGly: 5.032 ± 0.525
ValHis: 0.871 ± 0.207
ValIle: 3.0 ± 0.46
ValLys: 2.419 ± 0.308
ValLeu: 5.226 ± 0.465
ValMet: 1.984 ± 0.313
ValAsn: 2.71 ± 0.317
ValPro: 4.452 ± 0.449
ValGln: 2.468 ± 0.384
ValArg: 3.823 ± 0.532
ValSer: 4.597 ± 0.434
ValThr: 3.968 ± 0.474
ValVal: 5.855 ± 0.642
ValTrp: 0.774 ± 0.246
ValTyr: 1.548 ± 0.258
ValXaa: 0.0 ± 0.0
Trp
TrpAla: 1.452 ± 0.302
TrpCys: 0.532 ± 0.144
TrpAsp: 1.742 ± 0.319
TrpGlu: 1.645 ± 0.282
TrpPhe: 0.581 ± 0.174
TrpGly: 1.645 ± 0.257
TrpHis: 0.871 ± 0.214
TrpIle: 0.726 ± 0.189
TrpLys: 0.774 ± 0.182
TrpLeu: 1.839 ± 0.333
TrpMet: 0.726 ± 0.169
TrpAsn: 0.677 ± 0.134
TrpPro: 0.823 ± 0.158
TrpGln: 0.919 ± 0.186
TrpArg: 1.452 ± 0.249
TrpSer: 1.258 ± 0.266
TrpThr: 1.306 ± 0.244
TrpVal: 2.032 ± 0.33
TrpTrp: 0.774 ± 0.224
TrpTyr: 0.774 ± 0.178
TrpXaa: 0.0 ± 0.0
Tyr
TyrAla: 2.419 ± 0.332
TyrCys: 0.339 ± 0.134
TyrAsp: 1.936 ± 0.275
TyrGlu: 2.081 ± 0.308
TyrPhe: 0.823 ± 0.181
TyrGly: 2.468 ± 0.306
TyrHis: 0.581 ± 0.178
TyrIle: 1.306 ± 0.27
TyrLys: 0.581 ± 0.172
TyrLeu: 1.984 ± 0.382
TyrMet: 0.726 ± 0.15
TyrAsn: 1.065 ± 0.167
TyrPro: 1.306 ± 0.282
TyrGln: 0.968 ± 0.182
TyrArg: 1.355 ± 0.256
TyrSer: 1.548 ± 0.25
TyrThr: 1.548 ± 0.305
TyrVal: 1.984 ± 0.283
TyrTrp: 1.016 ± 0.237
TyrTyr: 0.919 ± 0.193
TyrXaa: 0.0 ± 0.0
Xaa
XaaAla: 0.0 ± 0.0
XaaCys: 0.0 ± 0.0
XaaAsp: 0.0 ± 0.0
XaaGlu: 0.0 ± 0.0
XaaPhe: 0.0 ± 0.0
XaaGly: 0.0 ± 0.0
XaaHis: 0.0 ± 0.0
XaaIle: 0.0 ± 0.0
XaaLys: 0.0 ± 0.0
XaaLeu: 0.0 ± 0.0
XaaMet: 0.0 ± 0.0
XaaAsn: 0.0 ± 0.0
XaaPro: 0.0 ± 0.0
XaaGln: 0.0 ± 0.0
XaaArg: 0.0 ± 0.0
XaaSer: 0.0 ± 0.0
XaaThr: 0.0 ± 0.0
XaaVal: 0.0 ± 0.0
XaaTrp: 0.0 ± 0.0
XaaTyr: 0.0 ± 0.0
XaaXaa: 0.0 ± 0.0
Statistics based on 117 proteins (20667 amino acids)

Note: The error was estimated by bootstrapping (100 replicates) at the protein level.
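One way to read "bootstrapping at the protein level" is to resample whole proteins with replacement, recompute the dipeptide frequency on each resampled set, and report the spread of those estimates. The sketch below follows that reading; it is an assumption about the procedure, not the database's actual code. The function names, the fixed seed, and the choice of the sample standard deviation as the reported error are all hypothetical.

```python
import random
import statistics
from collections import Counter


def pair_counts(seq):
    """Count overlapping residue pairs in one protein sequence."""
    return Counter(a + b for a, b in zip(seq, seq[1:]))


def bootstrap_error(proteins, pair, n_rounds=100, seed=0):
    """Estimate the error (in per mille) of one dipeptide frequency.

    Each round draws len(proteins) proteins with replacement, so a protein
    is always kept or dropped as a whole unit.
    """
    rng = random.Random(seed)  # fixed seed only to make the sketch repeatable
    per_protein = [pair_counts(s) for s in proteins]
    estimates = []
    for _ in range(n_rounds):
        sample = rng.choices(per_protein, k=len(per_protein))
        total = sum(sum(c.values()) for c in sample)
        hits = sum(c[pair] for c in sample)  # a Counter returns 0 if absent
        if total:
            estimates.append(1000.0 * hits / total)
    # Assumption: the error is the standard deviation across replicates.
    return statistics.stdev(estimates) if len(estimates) > 1 else 0.0
```

Resampling proteins rather than individual dipeptides keeps the correlation between neighbouring residues within a protein intact, which is presumably why the note specifies the protein level.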

The dipeptide statistics above (among other statistics for this proteome) can be downloaded from this CSV file
See this proteome in: uniprot_link
Proteome-pI is available under a Creative Commons Attribution-NoDerivs license; for more details see here

Reference: Kozlowski LP. Proteome-pI 2.0: Proteome Isoelectric Point Database Update. Nucleic Acids Res. 2021, doi: 10.1093/nar/gkab944
Contact: Lukasz P. Kozlowski