Amino acid dipeptide frequency for Mycobacterium phage BaconJack

Apart from single amino acid frequencies, one can also calculate the so-called amino acid dipeptide frequency, i.e. how often each ordered pair of adjacent amino acids occurs.
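A minimal sketch of how such frequencies can be computed, assuming one-letter sequences and overlapping pairs counted within each protein (never across protein boundaries). The function name `dipeptide_permille` and the toy sequences are illustrative only; the exact counting rules used by the database are not stated on this page.

```python
from collections import Counter


def dipeptide_permille(proteins):
    """Return {dipeptide: frequency in per mille} for one-letter sequences."""
    counts = Counter()
    for seq in proteins:
        # Overlapping adjacent pairs inside one protein; a pair never
        # spans the end of one protein and the start of the next.
        counts.update(seq[i:i + 2] for i in range(len(seq) - 1))
    total = sum(counts.values())
    if total == 0:
        return {}
    return {pair: 1000.0 * n / total for pair, n in counts.items()}


# Toy input (hypothetical sequences, not taken from this proteome).
toy = ["MAAG", "AAL"]
freqs = dipeptide_permille(toy)
```

With the toy input there are five pairs in total (MA, AA, AG, AA, AL), so `freqs["AA"]` is 2 out of 5, i.e. 400‰.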

There are 400 possible dipeptides (441 if Xaa is counted as an additional 21st amino acid representing all non-standard or unknown amino acids). Thus, if all dipeptides occurred with equal probability in proteins, each would be present with a frequency of 0.25%. As this is not the case in nature, for better visibility dipeptides that are more frequent than expected are marked in red, and those that are underrepresented are marked in blue in the table.

All values are given in per mille (‰) and therefore need to be multiplied by 10^-3 to obtain fractions. On this scale, the uniform expectation of 0.25% corresponds to 2.5‰.
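The arithmetic behind the uniform expectation and the per mille scale can be sketched as follows. The helper names are hypothetical, and the simple greater-than or less-than comparison is an assumption: the page does not say whether the red and blue marking also takes the error estimate into account.

```python
N_STANDARD = 20  # standard amino acids; use 21 to include Xaa


def uniform_expectation_permille(n_residues=N_STANDARD):
    """Expected per mille frequency if every dipeptide were equally likely."""
    return 1000.0 / (n_residues ** 2)


def permille_to_fraction(value_permille):
    """Convert a per mille value to a plain fraction (multiply by 10^-3)."""
    return value_permille * 1e-3


def classify(observed_permille, expected_permille=uniform_expectation_permille()):
    """Label a value relative to the uniform expectation (assumed rule)."""
    if observed_permille > expected_permille:
        return "over"
    if observed_permille < expected_permille:
        return "under"
    return "as expected"


# Two values from the table: AlaAla (12.141) and CysCys (0.119).
labels = {"AlaAla": classify(12.141), "CysCys": classify(0.119)}
```

For 20 amino acids the expectation is 1000/400 = 2.5‰; with Xaa included it is 1000/441, roughly 2.27‰. Under this rule AlaAla is overrepresented and CysCys is underrepresented.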

For more information, see the sequence space article on Wikipedia.

Ala Cys Asp Glu Phe Gly His Ile Lys Leu Met Asn Pro Gln Arg Ser Thr Val Trp Tyr Xaa
Ala
AlaAla: 12.141 ± 1.08
AlaCys: 0.655 ± 0.209
AlaAsp: 7.082 ± 0.746
AlaGlu: 6.011 ± 0.785
AlaPhe: 2.857 ± 0.442
AlaGly: 7.618 ± 0.806
AlaHis: 1.726 ± 0.363
AlaIle: 4.166 ± 0.713
AlaLys: 3.809 ± 0.456
AlaLeu: 8.748 ± 0.906
AlaMet: 2.262 ± 0.411
AlaAsn: 2.381 ± 0.447
AlaPro: 4.702 ± 0.67
AlaGln: 2.857 ± 0.421
AlaArg: 6.606 ± 0.539
AlaSer: 5.118 ± 0.591
AlaThr: 5.416 ± 0.6
AlaVal: 8.153 ± 0.723
AlaTrp: 1.785 ± 0.324
AlaTyr: 2.678 ± 0.38
AlaXaa: 0.0 ± 0.0
Cys
CysAla: 0.714 ± 0.208
CysCys: 0.119 ± 0.075
CysAsp: 0.536 ± 0.159
CysGlu: 0.536 ± 0.175
CysPhe: 0.179 ± 0.094
CysGly: 0.357 ± 0.154
CysHis: 0.119 ± 0.095
CysIle: 0.238 ± 0.114
CysLys: 0.238 ± 0.114
CysLeu: 0.595 ± 0.186
CysMet: 0.119 ± 0.078
CysAsn: 0.298 ± 0.139
CysPro: 0.357 ± 0.159
CysGln: 0.119 ± 0.09
CysArg: 0.655 ± 0.219
CysSer: 0.298 ± 0.129
CysThr: 0.357 ± 0.144
CysVal: 0.238 ± 0.108
CysTrp: 0.06 ± 0.065
CysTyr: 0.119 ± 0.077
CysXaa: 0.0 ± 0.0
Asp
AspAla: 6.427 ± 0.641
AspCys: 0.655 ± 0.213
AspAsp: 4.702 ± 0.567
AspGlu: 4.047 ± 0.547
AspPhe: 2.083 ± 0.275
AspGly: 6.427 ± 0.63
AspHis: 1.071 ± 0.275
AspIle: 2.976 ± 0.458
AspLys: 2.44 ± 0.42
AspLeu: 7.023 ± 0.695
AspMet: 1.131 ± 0.203
AspAsn: 1.666 ± 0.302
AspPro: 4.821 ± 0.591
AspGln: 1.547 ± 0.321
AspArg: 3.868 ± 0.395
AspSer: 3.154 ± 0.43
AspThr: 4.047 ± 0.429
AspVal: 4.642 ± 0.451
AspTrp: 1.488 ± 0.269
AspTyr: 2.083 ± 0.35
AspXaa: 0.0 ± 0.0
Glu
GluAla: 6.249 ± 0.722
GluCys: 0.238 ± 0.131
GluAsp: 5.178 ± 0.639
GluGlu: 5.297 ± 0.659
GluPhe: 2.202 ± 0.365
GluGly: 3.809 ± 0.473
GluHis: 1.428 ± 0.306
GluIle: 3.511 ± 0.43
GluLys: 2.678 ± 0.422
GluLeu: 7.023 ± 0.609
GluMet: 1.607 ± 0.283
GluAsn: 1.726 ± 0.31
GluPro: 2.797 ± 0.428
GluGln: 2.5 ± 0.387
GluArg: 3.987 ± 0.543
GluSer: 3.154 ± 0.423
GluThr: 3.63 ± 0.502
GluVal: 5.356 ± 0.672
GluTrp: 1.369 ± 0.355
GluTyr: 2.678 ± 0.482
GluXaa: 0.0 ± 0.0
Phe
PheAla: 2.262 ± 0.327
PheCys: 0.238 ± 0.132
PheAsp: 2.678 ± 0.326
PheGlu: 2.083 ± 0.292
PhePhe: 0.476 ± 0.16
PheGly: 3.571 ± 0.491
PheHis: 0.536 ± 0.225
PheIle: 1.309 ± 0.26
PheLys: 1.25 ± 0.278
PheLeu: 2.202 ± 0.392
PheMet: 0.536 ± 0.198
PheAsn: 1.369 ± 0.26
PhePro: 1.726 ± 0.294
PheGln: 0.774 ± 0.174
PheArg: 1.845 ± 0.364
PheSer: 2.142 ± 0.404
PheThr: 1.904 ± 0.367
PheVal: 1.904 ± 0.349
PheTrp: 0.536 ± 0.172
PheTyr: 0.833 ± 0.255
PheXaa: 0.0 ± 0.0
Gly
GlyAla: 6.844 ± 1.0
GlyCys: 0.476 ± 0.162
GlyAsp: 5.713 ± 0.476
GlyGlu: 4.523 ± 0.493
GlyPhe: 2.976 ± 0.459
GlyGly: 9.225 ± 2.215
GlyHis: 1.726 ± 0.376
GlyIle: 4.106 ± 0.654
GlyLys: 3.273 ± 0.492
GlyLeu: 7.915 ± 0.909
GlyMet: 1.964 ± 0.33
GlyAsn: 3.154 ± 0.445
GlyPro: 3.69 ± 0.665
GlyGln: 2.619 ± 0.353
GlyArg: 5.237 ± 0.501
GlySer: 6.249 ± 0.951
GlyThr: 5.416 ± 0.767
GlyVal: 5.654 ± 0.562
GlyTrp: 2.44 ± 0.332
GlyTyr: 2.857 ± 0.461
GlyXaa: 0.0 ± 0.0
His
HisAla: 1.488 ± 0.276
HisCys: 0.179 ± 0.149
HisAsp: 1.25 ± 0.234
HisGlu: 1.309 ± 0.319
HisPhe: 0.595 ± 0.207
HisGly: 1.488 ± 0.327
HisHis: 0.774 ± 0.213
HisIle: 1.012 ± 0.244
HisLys: 0.774 ± 0.23
HisLeu: 1.785 ± 0.336
HisMet: 0.238 ± 0.124
HisAsn: 0.119 ± 0.073
HisPro: 1.071 ± 0.248
HisGln: 0.833 ± 0.245
HisArg: 1.726 ± 0.369
HisSer: 0.774 ± 0.23
HisThr: 1.131 ± 0.287
HisVal: 1.845 ± 0.378
HisTrp: 0.595 ± 0.16
HisTyr: 0.774 ± 0.254
HisXaa: 0.0 ± 0.0
Ile
IleAla: 5.713 ± 0.572
IleCys: 0.179 ± 0.108
IleAsp: 2.976 ± 0.305
IleGlu: 3.809 ± 0.454
IlePhe: 0.655 ± 0.185
IleGly: 4.047 ± 0.485
IleHis: 0.833 ± 0.232
IleIle: 1.666 ± 0.298
IleLys: 1.666 ± 0.34
IleLeu: 3.63 ± 0.395
IleMet: 0.833 ± 0.18
IleAsn: 2.083 ± 0.332
IlePro: 3.035 ± 0.403
IleGln: 1.428 ± 0.318
IleArg: 3.511 ± 0.46
IleSer: 3.63 ± 0.552
IleThr: 3.571 ± 0.375
IleVal: 3.392 ± 0.499
IleTrp: 0.893 ± 0.197
IleTyr: 1.666 ± 0.31
IleXaa: 0.0 ± 0.0
Lys
LysAla: 3.928 ± 0.563
LysCys: 0.179 ± 0.11
LysAsp: 2.559 ± 0.47
LysGlu: 1.964 ± 0.346
LysPhe: 1.369 ± 0.284
LysGly: 2.202 ± 0.364
LysHis: 1.309 ± 0.355
LysIle: 2.083 ± 0.388
LysLys: 1.964 ± 0.35
LysLeu: 3.273 ± 0.384
LysMet: 0.952 ± 0.214
LysAsn: 1.488 ± 0.257
LysPro: 3.035 ± 0.478
LysGln: 1.488 ± 0.257
LysArg: 2.559 ± 0.48
LysSer: 2.44 ± 0.387
LysThr: 2.5 ± 0.391
LysVal: 3.154 ± 0.428
LysTrp: 0.833 ± 0.22
LysTyr: 1.131 ± 0.312
LysXaa: 0.0 ± 0.0
Leu
LeuAla: 9.165 ± 0.81
LeuCys: 0.238 ± 0.115
LeuAsp: 5.773 ± 0.561
LeuGlu: 5.892 ± 0.592
LeuPhe: 2.083 ± 0.343
LeuGly: 7.558 ± 0.732
LeuHis: 1.607 ± 0.34
LeuIle: 4.821 ± 0.597
LeuLys: 3.809 ± 0.489
LeuLeu: 6.189 ± 0.525
LeuMet: 1.607 ± 0.275
LeuAsn: 3.095 ± 0.471
LeuPro: 5.356 ± 0.661
LeuGln: 2.5 ± 0.375
LeuArg: 6.011 ± 0.518
LeuSer: 5.951 ± 0.565
LeuThr: 6.487 ± 0.517
LeuVal: 4.88 ± 0.636
LeuTrp: 1.012 ± 0.275
LeuTyr: 2.5 ± 0.421
LeuXaa: 0.0 ± 0.0
Met
MetAla: 2.381 ± 0.344
MetCys: 0.0 ± 0.0
MetAsp: 1.071 ± 0.258
MetGlu: 1.309 ± 0.325
MetPhe: 0.417 ± 0.136
MetGly: 1.369 ± 0.3
MetHis: 0.298 ± 0.131
MetIle: 0.595 ± 0.191
MetLys: 0.952 ± 0.213
MetLeu: 1.369 ± 0.278
MetMet: 0.179 ± 0.109
MetAsn: 0.952 ± 0.217
MetPro: 0.833 ± 0.231
MetGln: 0.536 ± 0.172
MetArg: 1.547 ± 0.353
MetSer: 2.559 ± 0.406
MetThr: 2.083 ± 0.278
MetVal: 1.071 ± 0.281
MetTrp: 0.298 ± 0.107
MetTyr: 0.417 ± 0.16
MetXaa: 0.0 ± 0.0
Asn
AsnAla: 3.095 ± 0.423
AsnCys: 0.0 ± 0.0
AsnAsp: 2.023 ± 0.36
AsnGlu: 2.083 ± 0.366
AsnPhe: 0.893 ± 0.264
AsnGly: 3.571 ± 0.597
AsnHis: 0.774 ± 0.187
AsnIle: 1.369 ± 0.317
AsnLys: 0.714 ± 0.232
AsnLeu: 2.321 ± 0.411
AsnMet: 0.476 ± 0.142
AsnAsn: 0.893 ± 0.227
AsnPro: 2.5 ± 0.382
AsnGln: 0.893 ± 0.241
AsnArg: 1.607 ± 0.399
AsnSer: 2.083 ± 0.399
AsnThr: 2.023 ± 0.322
AsnVal: 2.678 ± 0.386
AsnTrp: 0.714 ± 0.205
AsnTyr: 1.25 ± 0.258
AsnXaa: 0.0 ± 0.0
Pro
ProAla: 5.178 ± 0.569
ProCys: 0.298 ± 0.146
ProAsp: 3.868 ± 0.559
ProGlu: 4.285 ± 0.541
ProPhe: 1.964 ± 0.342
ProGly: 4.999 ± 0.553
ProHis: 0.774 ± 0.219
ProIle: 2.559 ± 0.358
ProLys: 2.381 ± 0.366
ProLeu: 4.344 ± 0.549
ProMet: 0.952 ± 0.276
ProAsn: 1.488 ± 0.279
ProPro: 3.392 ± 0.462
ProGln: 1.547 ± 0.3
ProArg: 2.738 ± 0.423
ProSer: 3.749 ± 0.508
ProThr: 4.047 ± 0.495
ProVal: 3.809 ± 0.493
ProTrp: 0.893 ± 0.311
ProTyr: 1.369 ± 0.374
ProXaa: 0.0 ± 0.0
Gln
GlnAla: 3.035 ± 0.459
GlnCys: 0.179 ± 0.123
GlnAsp: 1.369 ± 0.357
GlnGlu: 1.666 ± 0.284
GlnPhe: 1.25 ± 0.302
GlnGly: 2.321 ± 0.295
GlnHis: 0.655 ± 0.165
GlnIle: 2.619 ± 0.384
GlnLys: 1.071 ± 0.224
GlnLeu: 3.333 ± 0.445
GlnMet: 1.071 ± 0.222
GlnAsn: 0.476 ± 0.133
GlnPro: 1.845 ± 0.342
GlnGln: 1.726 ± 0.34
GlnArg: 2.023 ± 0.357
GlnSer: 1.666 ± 0.308
GlnThr: 1.25 ± 0.264
GlnVal: 2.321 ± 0.346
GlnTrp: 0.714 ± 0.142
GlnTyr: 0.595 ± 0.143
GlnXaa: 0.0 ± 0.0
Arg
ArgAla: 5.118 ± 0.552
ArgCys: 0.774 ± 0.245
ArgAsp: 3.273 ± 0.545
ArgGlu: 4.761 ± 0.657
ArgPhe: 1.964 ± 0.395
ArgGly: 4.642 ± 0.545
ArgHis: 1.071 ± 0.252
ArgIle: 3.928 ± 0.539
ArgLys: 3.511 ± 0.58
ArgLeu: 6.368 ± 0.565
ArgMet: 1.845 ± 0.407
ArgAsn: 2.5 ± 0.479
ArgPro: 2.381 ± 0.359
ArgGln: 2.083 ± 0.397
ArgArg: 5.297 ± 0.687
ArgSer: 4.344 ± 0.55
ArgThr: 3.333 ± 0.459
ArgVal: 5.297 ± 0.486
ArgTrp: 1.488 ± 0.297
ArgTyr: 1.726 ± 0.295
ArgXaa: 0.0 ± 0.0
Ser
SerAla: 6.13 ± 0.795
SerCys: 0.417 ± 0.178
SerAsp: 3.511 ± 0.45
SerGlu: 3.69 ± 0.535
SerPhe: 2.023 ± 0.366
SerGly: 7.38 ± 1.033
SerHis: 1.428 ± 0.308
SerIle: 2.976 ± 0.48
SerLys: 2.5 ± 0.471
SerLeu: 4.88 ± 0.596
SerMet: 1.19 ± 0.263
SerAsn: 2.44 ± 0.422
SerPro: 3.095 ± 0.525
SerGln: 1.964 ± 0.28
SerArg: 3.333 ± 0.415
SerSer: 3.571 ± 0.668
SerThr: 3.095 ± 0.506
SerVal: 4.166 ± 0.52
SerTrp: 1.666 ± 0.313
SerTyr: 1.547 ± 0.325
SerXaa: 0.0 ± 0.0
Thr
ThrAla: 6.07 ± 0.685
ThrCys: 0.417 ± 0.175
ThrAsp: 4.463 ± 0.584
ThrGlu: 4.583 ± 0.464
ThrPhe: 2.142 ± 0.386
ThrGly: 6.189 ± 0.573
ThrHis: 1.071 ± 0.346
ThrIle: 2.976 ± 0.525
ThrLys: 2.619 ± 0.371
ThrLeu: 5.654 ± 0.584
ThrMet: 0.833 ± 0.198
ThrAsn: 1.726 ± 0.305
ThrPro: 4.047 ± 0.469
ThrGln: 1.726 ± 0.329
ThrArg: 3.392 ± 0.471
ThrSer: 3.333 ± 0.48
ThrThr: 4.88 ± 0.581
ThrVal: 5.832 ± 0.693
ThrTrp: 1.012 ± 0.244
ThrTyr: 1.964 ± 0.34
ThrXaa: 0.0 ± 0.0
Val
ValAla: 7.142 ± 0.716
ValCys: 0.595 ± 0.199
ValAsp: 5.535 ± 0.518
ValGlu: 4.88 ± 0.552
ValPhe: 2.678 ± 0.334
ValGly: 4.94 ± 0.566
ValHis: 1.488 ± 0.258
ValIle: 3.333 ± 0.453
ValLys: 2.916 ± 0.39
ValLeu: 5.594 ± 0.544
ValMet: 1.369 ± 0.303
ValAsn: 2.619 ± 0.411
ValPro: 3.809 ± 0.46
ValGln: 2.262 ± 0.46
ValArg: 5.535 ± 0.7
ValSer: 4.642 ± 0.423
ValThr: 5.773 ± 0.612
ValVal: 5.059 ± 0.716
ValTrp: 1.19 ± 0.284
ValTyr: 2.083 ± 0.452
ValXaa: 0.0 ± 0.0
Trp
TrpAla: 1.19 ± 0.257
TrpCys: 0.238 ± 0.116
TrpAsp: 1.547 ± 0.265
TrpGlu: 1.19 ± 0.232
TrpPhe: 0.893 ± 0.247
TrpGly: 1.785 ± 0.33
TrpHis: 0.595 ± 0.211
TrpIle: 1.131 ± 0.255
TrpLys: 0.595 ± 0.232
TrpLeu: 1.785 ± 0.331
TrpMet: 0.476 ± 0.158
TrpAsn: 0.417 ± 0.144
TrpPro: 0.833 ± 0.225
TrpGln: 0.833 ± 0.206
TrpArg: 1.428 ± 0.368
TrpSer: 0.774 ± 0.252
TrpThr: 1.607 ± 0.392
TrpVal: 1.785 ± 0.278
TrpTrp: 0.476 ± 0.221
TrpTyr: 0.357 ± 0.167
TrpXaa: 0.0 ± 0.0
Tyr
TyrAla: 2.559 ± 0.384
TyrCys: 0.179 ± 0.129
TyrAsp: 1.19 ± 0.242
TyrGlu: 2.5 ± 0.389
TyrPhe: 0.476 ± 0.157
TyrGly: 2.559 ± 0.419
TyrHis: 0.476 ± 0.156
TyrIle: 1.845 ± 0.36
TyrLys: 1.25 ± 0.295
TyrLeu: 2.5 ± 0.439
TyrMet: 0.536 ± 0.157
TyrAsn: 1.071 ± 0.263
TyrPro: 1.309 ± 0.315
TyrGln: 0.952 ± 0.208
TyrArg: 2.797 ± 0.43
TyrSer: 1.369 ± 0.298
TyrThr: 2.381 ± 0.356
TyrVal: 2.202 ± 0.357
TyrTrp: 0.476 ± 0.189
TyrTyr: 0.476 ± 0.148
TyrXaa: 0.0 ± 0.0
Xaa
XaaAla: 0.0 ± 0.0
XaaCys: 0.0 ± 0.0
XaaAsp: 0.0 ± 0.0
XaaGlu: 0.0 ± 0.0
XaaPhe: 0.0 ± 0.0
XaaGly: 0.0 ± 0.0
XaaHis: 0.0 ± 0.0
XaaIle: 0.0 ± 0.0
XaaLys: 0.0 ± 0.0
XaaLeu: 0.0 ± 0.0
XaaMet: 0.0 ± 0.0
XaaAsn: 0.0 ± 0.0
XaaPro: 0.0 ± 0.0
XaaGln: 0.0 ± 0.0
XaaArg: 0.0 ± 0.0
XaaSer: 0.0 ± 0.0
XaaThr: 0.0 ± 0.0
XaaVal: 0.0 ± 0.0
XaaTrp: 0.0 ± 0.0
XaaTyr: 0.0 ± 0.0
XaaXaa: 0.0 ± 0.0
Statistics based on 97 proteins (16804 amino acids)
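For readers who want to work with the table entries programmatically, a small parsing sketch follows. It assumes entries of the form `AlaAla: 12.141 ± 1.08`, optionally preceded by a repeated copy of the value, and uses a hypothetical helper name `parse_entry`; it is not part of the Proteome-pI tooling.

```python
import re

# Optional leading number, two three-letter residue codes, value, error.
ENTRY = re.compile(
    r"^(?:[\d.]+)?([A-Z][a-z]{2})([A-Z][a-z]{2}):\s*([\d.]+)\s*±\s*([\d.]+)$"
)


def parse_entry(line):
    """Return (first, second, value_permille, error_permille) or None."""
    match = ENTRY.match(line.strip())
    if match is None:
        return None  # e.g. a row label such as "Ala"
    first, second, value, error = match.groups()
    return first, second, float(value), float(error)


rows = [
    parse_entry("AlaAla: 12.141 ± 1.08"),
    parse_entry("12.141AlaAla: 12.141 ± 1.08"),
    parse_entry("Ala"),
]
```

Summing the parsed values over all 441 entries should give a total close to 1000‰, up to rounding, which makes a convenient sanity check after parsing.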

Note: The error was estimated by bootstrapping (x100) at the protein level, i.e. by resampling whole proteins.
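Protein-level bootstrapping can be sketched as below: whole proteins are resampled with replacement, the frequency is recomputed for each replicate, and the spread of the replicates is reported. Using the sample standard deviation as the error, the function names, and the toy sequences are all assumptions made for illustration; the page does not specify the exact statistic.

```python
import random
from collections import Counter
from statistics import stdev


def pair_counts(seq):
    """Count overlapping dipeptides within one protein sequence."""
    return Counter(seq[i:i + 2] for i in range(len(seq) - 1))


def bootstrap_error(proteins, pair, n_boot=100, seed=0):
    """Spread (per mille) of one dipeptide frequency over bootstrap replicates."""
    rng = random.Random(seed)
    per_protein = [pair_counts(seq) for seq in proteins]
    estimates = []
    for _ in range(n_boot):
        # Resample whole proteins with replacement (protein-level bootstrap).
        sample = rng.choices(per_protein, k=len(per_protein))
        total = sum(sum(counts.values()) for counts in sample)
        hits = sum(counts[pair] for counts in sample)
        estimates.append(1000.0 * hits / total if total else 0.0)
    return stdev(estimates)


# Toy input (hypothetical sequences, not taken from this proteome).
toy = ["MAAG", "AAL", "GAAAK"]
error_aa = bootstrap_error(toy, "AA")
```

Resampling proteins rather than individual residues keeps the dipeptides of one protein together, so the estimate reflects variation between proteins.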

The dipeptide statistics above (along with other statistics for this proteome) can be downloaded from this CSV file
See this proteome in: uniprot_link
Proteome-pI is available under the Creative Commons Attribution-NoDerivs license; for more details see here

Reference: Kozlowski LP. Proteome-pI 2.0: Proteome Isoelectric Point Database Update. Nucleic Acids Res. 2021, doi: 10.1093/nar/gkab944
Contact: Lukasz P. Kozlowski