Amino acid dipeptide frequency for Mycobacterium phage Anthony

Apart from single amino acid frequencies, one can also calculate the so-called amino acid dipeptide frequency, i.e. how often each ordered pair of adjacent residues occurs.

There are 400 possible dipeptides (441 if Xaa is counted as an additional, 21st amino acid representing all non-standard or unknown amino acids). Thus, if all dipeptides were equally likely to occur in proteins, each would be present at a frequency of 0.25% (2.5‰). As this is not the case in nature, for better visibility dipeptides that occur more often than expected are marked in red, and those that are underrepresented are marked in blue in the table.
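The counting described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the code used to build this page: the function names are hypothetical, pairs are counted as overlapping windows within each protein, any non-standard letter is mapped to X (Xaa), and the red/blue threshold is taken to be the uniform expectation of 1/400.

```python
from collections import Counter
from itertools import product

# One-letter codes in the same order as the table header (Ala, Cys, ..., Tyr).
STANDARD = "ACDEFGHIKLMNPQRSTVWY"
ALPHABET = STANDARD + "X"  # X stands for Xaa, the 21st catch-all residue

UNIFORM_400 = 1000.0 / 400  # 2.5 per mille, i.e. 0.25 %
UNIFORM_441 = 1000.0 / 441  # expectation when Xaa is included


def dipeptide_per_mille(proteins):
    """Return {dipeptide: frequency in per mille} for a list of sequences.

    Assumptions: overlapping pairs, no pair spans two proteins,
    and every non-standard letter is treated as X.
    """
    counts = Counter()
    for seq in proteins:
        cleaned = "".join(c if c in STANDARD else "X" for c in seq.upper())
        counts.update(cleaned[i:i + 2] for i in range(len(cleaned) - 1))
    total = sum(counts.values())
    if total == 0:
        return {a + b: 0.0 for a, b in product(ALPHABET, repeat=2)}
    return {a + b: 1000.0 * counts[a + b] / total
            for a, b in product(ALPHABET, repeat=2)}


def colour(freq, expected=UNIFORM_400):
    """Classify a per mille value against the assumed uniform expectation."""
    if freq > expected:
        return "red"
    if freq < expected:
        return "blue"
    return "none"
```

With these assumptions, the AlaAla value of 11.039‰ from the table would be classified as red and the AlaCys value of 0.671‰ as blue.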

All values are given in per mille (‰) and therefore need to be multiplied by 10⁻³ to obtain fractions.
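As a small worked example of the unit conversion, the sketch below turns a per mille value into a plain fraction and into a percentage. The helper names are hypothetical and chosen only for illustration.

```python
def per_mille_to_fraction(value):
    """Convert a per mille value to a fraction (multiply by 10^-3)."""
    return value * 1e-3


def per_mille_to_percent(value):
    """Convert a per mille value to a percentage (divide by 10)."""
    return value / 10.0


# AlaAla is listed as 11.039 per mille in the table.
ala_ala_fraction = per_mille_to_fraction(11.039)  # about 0.011039
ala_ala_percent = per_mille_to_percent(11.039)    # about 1.1039 %
```

Read this way, the uniform expectation of 0.25% corresponds to 2.5‰ on the scale used in the table.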

For more information see sequence space article on Wikipedia.

Ala Cys Asp Glu Phe Gly His Ile Lys Leu Met Asn Pro Gln Arg Ser Thr Val Trp Tyr Xaa
Ala
AlaAla: 11.039 ± 0.808
AlaCys: 0.671 ± 0.208
AlaAsp: 6.038 ± 0.681
AlaGlu: 6.221 ± 0.627
AlaPhe: 4.147 ± 0.484
AlaGly: 8.538 ± 0.837
AlaHis: 1.403 ± 0.28
AlaIle: 4.513 ± 0.452
AlaLys: 5.306 ± 0.603
AlaLeu: 9.392 ± 0.758
AlaMet: 2.866 ± 0.416
AlaAsn: 3.049 ± 0.516
AlaPro: 4.879 ± 0.777
AlaGln: 3.354 ± 0.444
AlaArg: 5.428 ± 0.775
AlaSer: 5.611 ± 0.608
AlaThr: 5.733 ± 0.737
AlaVal: 6.343 ± 0.729
AlaTrp: 2.013 ± 0.32
AlaTyr: 2.074 ± 0.288
AlaXaa: 0.0 ± 0.0
Cys
CysAla: 0.61 ± 0.188
CysCys: 0.061 ± 0.059
CysAsp: 0.671 ± 0.206
CysGlu: 0.549 ± 0.209
CysPhe: 0.366 ± 0.148
CysGly: 0.854 ± 0.208
CysHis: 0.427 ± 0.142
CysIle: 0.427 ± 0.205
CysLys: 0.488 ± 0.202
CysLeu: 0.488 ± 0.206
CysMet: 0.122 ± 0.096
CysAsn: 0.427 ± 0.138
CysPro: 0.549 ± 0.197
CysGln: 0.244 ± 0.122
CysArg: 0.488 ± 0.164
CysSer: 0.244 ± 0.114
CysThr: 0.427 ± 0.167
CysVal: 0.793 ± 0.228
CysTrp: 0.305 ± 0.147
CysTyr: 0.366 ± 0.132
CysXaa: 0.0 ± 0.0
Asp
AspAla: 6.709 ± 0.515
AspCys: 0.793 ± 0.258
AspAsp: 3.293 ± 0.473
AspGlu: 5.123 ± 0.67
AspPhe: 2.5 ± 0.399
AspGly: 5.306 ± 0.612
AspHis: 1.525 ± 0.37
AspIle: 2.805 ± 0.433
AspLys: 2.439 ± 0.438
AspLeu: 6.099 ± 0.656
AspMet: 1.464 ± 0.282
AspAsn: 1.83 ± 0.297
AspPro: 5.001 ± 0.511
AspGln: 1.952 ± 0.258
AspArg: 3.354 ± 0.394
AspSer: 3.72 ± 0.484
AspThr: 3.476 ± 0.48
AspVal: 4.269 ± 0.527
AspTrp: 1.281 ± 0.295
AspTyr: 3.049 ± 0.438
AspXaa: 0.0 ± 0.0
Glu
GluAla: 7.989 ± 0.657
GluCys: 0.366 ± 0.164
GluAsp: 4.513 ± 0.571
GluGlu: 4.025 ± 0.463
GluPhe: 2.561 ± 0.313
GluGly: 4.452 ± 0.553
GluHis: 1.159 ± 0.243
GluIle: 3.415 ± 0.509
GluLys: 3.049 ± 0.429
GluLeu: 5.977 ± 0.675
GluMet: 1.952 ± 0.342
GluAsn: 1.891 ± 0.303
GluPro: 2.805 ± 0.371
GluGln: 2.378 ± 0.298
GluArg: 5.245 ± 0.638
GluSer: 3.049 ± 0.386
GluThr: 3.537 ± 0.443
GluVal: 4.452 ± 0.511
GluTrp: 1.281 ± 0.286
GluTyr: 1.342 ± 0.317
GluXaa: 0.0 ± 0.0
Phe
PheAla: 2.866 ± 0.417
PheCys: 0.122 ± 0.09
PheAsp: 2.805 ± 0.396
PheGlu: 2.439 ± 0.343
PhePhe: 0.793 ± 0.301
PheGly: 2.927 ± 0.362
PheHis: 0.61 ± 0.238
PheIle: 1.83 ± 0.324
PheLys: 1.891 ± 0.289
PheLeu: 2.439 ± 0.403
PheMet: 0.732 ± 0.19
PheAsn: 1.586 ± 0.321
PhePro: 1.83 ± 0.29
PheGln: 1.464 ± 0.297
PheArg: 1.891 ± 0.247
PheSer: 2.257 ± 0.423
PheThr: 2.196 ± 0.36
PheVal: 2.805 ± 0.417
PheTrp: 0.366 ± 0.167
PheTyr: 0.915 ± 0.238
PheXaa: 0.0 ± 0.0
Gly
GlyAla: 5.855 ± 0.911
GlyCys: 0.793 ± 0.216
GlyAsp: 5.977 ± 0.616
GlyGlu: 4.513 ± 0.492
GlyPhe: 2.622 ± 0.404
GlyGly: 7.379 ± 0.921
GlyHis: 1.647 ± 0.374
GlyIle: 4.269 ± 0.6
GlyLys: 3.842 ± 0.441
GlyLeu: 6.526 ± 0.789
GlyMet: 2.622 ± 0.311
GlyAsn: 3.11 ± 0.517
GlyPro: 3.476 ± 0.793
GlyGln: 3.781 ± 0.428
GlyArg: 4.208 ± 0.508
GlySer: 4.696 ± 0.623
GlyThr: 5.428 ± 0.58
GlyVal: 5.733 ± 0.638
GlyTrp: 1.83 ± 0.407
GlyTyr: 2.927 ± 0.386
GlyXaa: 0.0 ± 0.0
His
HisAla: 1.525 ± 0.316
HisCys: 0.305 ± 0.136
HisAsp: 1.098 ± 0.257
HisGlu: 1.708 ± 0.365
HisPhe: 0.427 ± 0.14
HisGly: 1.586 ± 0.31
HisHis: 0.244 ± 0.13
HisIle: 0.854 ± 0.265
HisLys: 1.037 ± 0.275
HisLeu: 0.732 ± 0.28
HisMet: 0.244 ± 0.118
HisAsn: 0.61 ± 0.156
HisPro: 1.159 ± 0.269
HisGln: 0.488 ± 0.162
HisArg: 1.22 ± 0.26
HisSer: 1.037 ± 0.332
HisThr: 1.647 ± 0.459
HisVal: 1.159 ± 0.278
HisTrp: 0.305 ± 0.178
HisTyr: 0.305 ± 0.161
HisXaa: 0.0 ± 0.0
Ile
IleAla: 5.916 ± 0.574
IleCys: 0.427 ± 0.144
IleAsp: 3.659 ± 0.473
IleGlu: 3.659 ± 0.61
IlePhe: 1.525 ± 0.282
IleGly: 4.513 ± 0.576
IleHis: 0.915 ± 0.325
IleIle: 2.196 ± 0.41
IleLys: 2.805 ± 0.344
IleLeu: 4.391 ± 0.568
IleMet: 0.61 ± 0.187
IleAsn: 1.769 ± 0.38
IlePro: 3.354 ± 0.469
IleGln: 1.647 ± 0.345
IleArg: 3.232 ± 0.38
IleSer: 2.561 ± 0.385
IleThr: 3.903 ± 0.511
IleVal: 3.171 ± 0.433
IleTrp: 0.793 ± 0.213
IleTyr: 1.22 ± 0.292
IleXaa: 0.0 ± 0.0
Lys
LysAla: 4.94 ± 0.568
LysCys: 0.549 ± 0.232
LysAsp: 3.049 ± 0.443
LysGlu: 2.622 ± 0.438
LysPhe: 1.464 ± 0.31
LysGly: 4.147 ± 0.637
LysHis: 0.732 ± 0.208
LysIle: 2.439 ± 0.392
LysLys: 2.927 ± 0.537
LysLeu: 3.781 ± 0.545
LysMet: 0.915 ± 0.239
LysAsn: 1.342 ± 0.226
LysPro: 3.293 ± 0.558
LysGln: 2.074 ± 0.354
LysArg: 3.049 ± 0.52
LysSer: 2.927 ± 0.355
LysThr: 2.439 ± 0.365
LysVal: 4.513 ± 0.494
LysTrp: 0.915 ± 0.267
LysTyr: 1.403 ± 0.307
LysXaa: 0.0 ± 0.0
Leu
LeuAla: 8.538 ± 0.78
LeuCys: 0.732 ± 0.191
LeuAsp: 4.879 ± 0.518
LeuGlu: 6.038 ± 0.536
LeuPhe: 2.744 ± 0.404
LeuGly: 6.282 ± 0.54
LeuHis: 1.769 ± 0.336
LeuIle: 5.184 ± 0.544
LeuLys: 4.33 ± 0.61
LeuLeu: 5.062 ± 0.471
LeuMet: 2.013 ± 0.368
LeuAsn: 2.561 ± 0.529
LeuPro: 4.391 ± 0.592
LeuGln: 2.317 ± 0.442
LeuArg: 5.611 ± 0.482
LeuSer: 5.916 ± 0.532
LeuThr: 4.818 ± 0.515
LeuVal: 4.574 ± 0.502
LeuTrp: 1.22 ± 0.275
LeuTyr: 2.257 ± 0.345
LeuXaa: 0.0 ± 0.0
Met
MetAla: 2.135 ± 0.32
MetCys: 0.061 ± 0.06
MetAsp: 1.281 ± 0.267
MetGlu: 1.281 ± 0.28
MetPhe: 0.671 ± 0.198
MetGly: 1.769 ± 0.362
MetHis: 0.427 ± 0.149
MetIle: 1.22 ± 0.324
MetLys: 1.342 ± 0.36
MetLeu: 1.342 ± 0.27
MetMet: 0.793 ± 0.191
MetAsn: 0.976 ± 0.277
MetPro: 1.22 ± 0.276
MetGln: 0.732 ± 0.219
MetArg: 2.074 ± 0.322
MetSer: 2.5 ± 0.369
MetThr: 2.317 ± 0.358
MetVal: 1.159 ± 0.288
MetTrp: 0.244 ± 0.109
MetTyr: 0.732 ± 0.249
MetXaa: 0.0 ± 0.0
Asn
AsnAla: 3.781 ± 0.412
AsnCys: 0.305 ± 0.153
AsnAsp: 2.439 ± 0.439
AsnGlu: 2.074 ± 0.322
AsnPhe: 1.342 ± 0.31
AsnGly: 3.293 ± 0.44
AsnHis: 0.793 ± 0.21
AsnIle: 1.342 ± 0.234
AsnLys: 1.22 ± 0.238
AsnLeu: 2.683 ± 0.443
AsnMet: 1.037 ± 0.236
AsnAsn: 1.281 ± 0.214
AsnPro: 2.439 ± 0.36
AsnGln: 1.403 ± 0.411
AsnArg: 1.769 ± 0.358
AsnSer: 1.586 ± 0.321
AsnThr: 1.83 ± 0.356
AsnVal: 2.013 ± 0.354
AsnTrp: 0.915 ± 0.197
AsnTyr: 1.098 ± 0.31
AsnXaa: 0.0 ± 0.0
Pro
ProAla: 4.574 ± 0.71
ProCys: 0.488 ± 0.198
ProAsp: 3.537 ± 0.462
ProGlu: 3.964 ± 0.442
ProPhe: 1.952 ± 0.378
ProGly: 4.269 ± 0.506
ProHis: 0.854 ± 0.256
ProIle: 2.683 ± 0.397
ProLys: 2.988 ± 0.557
ProLeu: 3.049 ± 0.425
ProMet: 1.647 ± 0.342
ProAsn: 1.83 ± 0.345
ProPro: 2.196 ± 0.424
ProGln: 1.403 ± 0.341
ProArg: 2.927 ± 0.469
ProSer: 2.927 ± 0.365
ProThr: 3.598 ± 0.474
ProVal: 4.269 ± 0.452
ProTrp: 0.915 ± 0.315
ProTyr: 1.769 ± 0.309
ProXaa: 0.0 ± 0.0
Gln
GlnAla: 3.537 ± 0.55
GlnCys: 0.061 ± 0.055
GlnAsp: 1.708 ± 0.368
GlnGlu: 1.586 ± 0.331
GlnPhe: 1.525 ± 0.313
GlnGly: 3.476 ± 0.827
GlnHis: 0.61 ± 0.166
GlnIle: 2.683 ± 0.522
GlnLys: 1.403 ± 0.292
GlnLeu: 3.049 ± 0.371
GlnMet: 0.61 ± 0.161
GlnAsn: 1.159 ± 0.215
GlnPro: 1.159 ± 0.281
GlnGln: 1.403 ± 0.405
GlnArg: 2.622 ± 0.474
GlnSer: 1.281 ± 0.259
GlnThr: 1.647 ± 0.251
GlnVal: 2.561 ± 0.417
GlnTrp: 0.793 ± 0.229
GlnTyr: 1.037 ± 0.215
GlnXaa: 0.0 ± 0.0
Arg
ArgAla: 6.16 ± 0.658
ArgCys: 0.732 ± 0.219
ArgAsp: 4.025 ± 0.492
ArgGlu: 4.025 ± 0.619
ArgPhe: 2.317 ± 0.436
ArgGly: 3.598 ± 0.404
ArgHis: 1.159 ± 0.307
ArgIle: 3.476 ± 0.454
ArgLys: 3.232 ± 0.499
ArgLeu: 6.038 ± 0.642
ArgMet: 1.952 ± 0.281
ArgAsn: 3.049 ± 0.333
ArgPro: 2.257 ± 0.352
ArgGln: 1.952 ± 0.336
ArgArg: 5.611 ± 0.65
ArgSer: 3.049 ± 0.51
ArgThr: 2.927 ± 0.425
ArgVal: 4.086 ± 0.515
ArgTrp: 1.159 ± 0.277
ArgTyr: 1.952 ± 0.414
ArgXaa: 0.0 ± 0.0
Ser
SerAla: 4.635 ± 0.496
SerCys: 0.488 ± 0.146
SerAsp: 4.086 ± 0.429
SerGlu: 4.208 ± 0.558
SerPhe: 2.317 ± 0.334
SerGly: 5.428 ± 0.594
SerHis: 0.732 ± 0.198
SerIle: 2.805 ± 0.362
SerLys: 2.988 ± 0.412
SerLeu: 5.428 ± 0.623
SerMet: 1.159 ± 0.266
SerAsn: 1.464 ± 0.302
SerPro: 2.744 ± 0.343
SerGln: 2.013 ± 0.34
SerArg: 3.72 ± 0.509
SerSer: 3.659 ± 0.514
SerThr: 3.232 ± 0.476
SerVal: 3.781 ± 0.399
SerTrp: 1.525 ± 0.296
SerTyr: 2.196 ± 0.322
SerXaa: 0.0 ± 0.0
Thr
ThrAla: 7.318 ± 0.633
ThrCys: 0.427 ± 0.147
ThrAsp: 3.964 ± 0.603
ThrGlu: 3.903 ± 0.469
ThrPhe: 1.83 ± 0.292
ThrGly: 4.818 ± 0.456
ThrHis: 0.976 ± 0.271
ThrIle: 2.561 ± 0.33
ThrLys: 2.866 ± 0.371
ThrLeu: 5.184 ± 0.5
ThrMet: 1.22 ± 0.247
ThrAsn: 2.013 ± 0.314
ThrPro: 3.903 ± 0.548
ThrGln: 1.647 ± 0.305
ThrArg: 3.354 ± 0.563
ThrSer: 3.293 ± 0.411
ThrThr: 3.781 ± 0.591
ThrVal: 4.818 ± 0.461
ThrTrp: 0.793 ± 0.241
ThrTyr: 2.013 ± 0.284
ThrXaa: 0.0 ± 0.0
Val
ValAla: 5.916 ± 0.647
ValCys: 0.732 ± 0.203
ValAsp: 5.245 ± 0.586
ValGlu: 4.94 ± 0.517
ValPhe: 2.317 ± 0.366
ValGly: 4.879 ± 0.553
ValHis: 0.976 ± 0.251
ValIle: 4.208 ± 0.604
ValLys: 3.537 ± 0.382
ValLeu: 5.062 ± 0.579
ValMet: 1.22 ± 0.276
ValAsn: 3.171 ± 0.451
ValPro: 2.622 ± 0.387
ValGln: 1.647 ± 0.336
ValArg: 4.208 ± 0.425
ValSer: 5.123 ± 0.615
ValThr: 4.94 ± 0.507
ValVal: 4.818 ± 0.529
ValTrp: 1.098 ± 0.299
ValTyr: 1.952 ± 0.357
ValXaa: 0.0 ± 0.0
Trp
TrpAla: 1.952 ± 0.317
TrpCys: 0.244 ± 0.107
TrpAsp: 1.342 ± 0.252
TrpGlu: 1.281 ± 0.262
TrpPhe: 0.427 ± 0.162
TrpGly: 1.281 ± 0.33
TrpHis: 0.366 ± 0.147
TrpIle: 1.769 ± 0.32
TrpLys: 0.732 ± 0.274
TrpLeu: 1.159 ± 0.284
TrpMet: 0.549 ± 0.161
TrpAsn: 0.671 ± 0.235
TrpPro: 0.671 ± 0.195
TrpGln: 0.915 ± 0.197
TrpArg: 0.793 ± 0.193
TrpSer: 1.342 ± 0.313
TrpThr: 1.403 ± 0.306
TrpVal: 0.854 ± 0.221
TrpTrp: 0.549 ± 0.254
TrpTyr: 0.488 ± 0.192
TrpXaa: 0.0 ± 0.0
Tyr
TyrAla: 2.622 ± 0.31
TyrCys: 0.549 ± 0.199
TyrAsp: 2.378 ± 0.277
TyrGlu: 1.281 ± 0.324
TyrPhe: 0.915 ± 0.198
TyrGly: 2.439 ± 0.419
TyrHis: 0.366 ± 0.149
TyrIle: 1.586 ± 0.259
TyrLys: 1.098 ± 0.249
TyrLeu: 3.354 ± 0.472
TyrMet: 0.488 ± 0.15
TyrAsn: 0.854 ± 0.216
TyrPro: 1.891 ± 0.308
TyrGln: 1.098 ± 0.234
TyrArg: 2.013 ± 0.356
TyrSer: 1.83 ± 0.429
TyrThr: 1.403 ± 0.342
TyrVal: 2.439 ± 0.363
TyrTrp: 0.488 ± 0.181
TyrTyr: 1.159 ± 0.376
TyrXaa: 0.0 ± 0.0
Xaa
XaaAla: 0.0 ± 0.0
XaaCys: 0.0 ± 0.0
XaaAsp: 0.0 ± 0.0
XaaGlu: 0.0 ± 0.0
XaaPhe: 0.0 ± 0.0
XaaGly: 0.0 ± 0.0
XaaHis: 0.0 ± 0.0
XaaIle: 0.0 ± 0.0
XaaLys: 0.0 ± 0.0
XaaLeu: 0.0 ± 0.0
XaaMet: 0.0 ± 0.0
XaaAsn: 0.0 ± 0.0
XaaPro: 0.0 ± 0.0
XaaGln: 0.0 ± 0.0
XaaArg: 0.0 ± 0.0
XaaSer: 0.0 ± 0.0
XaaThr: 0.0 ± 0.0
XaaVal: 0.0 ± 0.0
XaaTrp: 0.0 ± 0.0
XaaTyr: 0.0 ± 0.0
XaaXaa: 0.0 ± 0.0
Statistics based on 89 proteins (16398 amino acids)

Note: The error was estimated by bootstrapping (100 replicates) at the protein level.
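A protein-level bootstrap of this kind can be sketched as follows. This is a minimal illustration, not the database's own procedure: the function names are hypothetical, whole proteins are resampled with replacement, and the reported error is assumed to be the standard deviation of the per-replicate frequencies.

```python
import random
import statistics
from collections import Counter


def dipeptide_counts(seq):
    """Count overlapping residue pairs within one protein sequence."""
    return Counter(seq[i:i + 2] for i in range(len(seq) - 1))


def bootstrap_error(proteins, dipeptide, n_replicates=100, seed=0):
    """Return (mean, standard deviation) of a dipeptide frequency in per mille.

    Each replicate draws len(proteins) whole proteins with replacement,
    so the resampling unit is the protein, not the single residue pair.
    """
    rng = random.Random(seed)
    per_protein = [dipeptide_counts(s) for s in proteins]
    estimates = []
    for _ in range(n_replicates):
        sample = rng.choices(per_protein, k=len(per_protein))
        total = sum(sum(c.values()) for c in sample)
        hits = sum(c[dipeptide] for c in sample)
        estimates.append(1000.0 * hits / total if total else 0.0)
    return statistics.mean(estimates), statistics.stdev(estimates)
```

Resampling whole proteins rather than individual pairs keeps the dependence between neighbouring residues within a protein intact, which is presumably why the error is stated at the protein level.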

The dipeptide statistics above (among other statistics for this proteome) can be downloaded from this CSV file
See this proteome in UniProt
Proteome-pI is available under the Creative Commons Attribution-NoDerivs license; for more details see here

Reference: Kozlowski LP. Proteome-pI 2.0: Proteome Isoelectric Point Database Update. Nucleic Acids Res. 2021, doi: 10.1093/nar/gkab944
Contact: Lukasz P. Kozlowski