Amino acid dipeptide frequency for Mycobacterium phage Batiatus

Apart from single amino acid frequencies, one can also calculate the so-called amino acid dipeptide frequency.

There are 400 possible dipeptides (441 if Xaa is counted as an additional 21st amino acid representing all non-standard or unknown amino acids). Thus, if dipeptides occurred randomly in proteins, each of the 400 dipeptides would be present at a frequency of 0.25% (2.5‰). As this is not the case in nature, for better visibility all dipeptides that occur more often than expected are marked in red in the table, and those that are underrepresented are marked in blue.
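The calculation described above can be sketched in Python. This is only an illustration under stated assumptions, not the code used to build this page: the function names are hypothetical, pairs are counted as overlapping windows within each protein, any letter outside the 20 standard one-letter codes is mapped to X (Xaa), and the red/blue decision is a plain comparison against the uniform expectation of 2.5‰.

```python
from collections import Counter
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # 20 standard residues, one-letter codes
UNIFORM_PERMILLE = 1000.0 / 400       # 2.5 per mille, i.e. 0.25 %


def dipeptide_permille(proteins):
    """Return {dipeptide: frequency in per mille} for all 441 pairs.

    Assumption: overlapping pairs are counted inside each protein only,
    so no pair spans the boundary between two proteins.
    """
    counts = Counter()
    for seq in proteins:
        for a, b in zip(seq, seq[1:]):
            # Assumption: every non-standard or unknown letter becomes X (Xaa).
            a = a if a in AMINO_ACIDS else "X"
            b = b if b in AMINO_ACIDS else "X"
            counts[a + b] += 1
    total = sum(counts.values())
    alphabet = AMINO_ACIDS + "X"
    return {
        a + b: (1000.0 * counts[a + b] / total) if total else 0.0
        for a, b in product(alphabet, repeat=2)
    }


def classify(freq_permille, expected=UNIFORM_PERMILLE):
    """Label a frequency relative to the uniform expectation."""
    if freq_permille > expected:
        return "over"   # would be marked red
    if freq_permille < expected:
        return "under"  # would be marked blue
    return "expected"


if __name__ == "__main__":
    toy = ["AAAC", "AC"]  # made-up sequences, not from this proteome
    freqs = dipeptide_permille(toy)
    print(freqs["AA"], classify(freqs["AA"]))
```

With the values from the table, `classify(14.502)` (AlaAla) returns "over" and `classify(0.92)` (AlaCys) returns "under".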

All values are given in per mille (‰) and therefore need to be multiplied by 10⁻³ to obtain fractions.
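As a small worked example of this unit conversion, the sketch below turns a per mille value into a fraction and into a percentage. The helper names are hypothetical and chosen only for illustration.

```python
def permille_to_fraction(value_permille):
    """Convert a per mille value to a plain fraction (multiply by 10^-3)."""
    return value_permille * 1e-3


def permille_to_percent(value_permille):
    """Convert a per mille value to percent (1 % equals 10 per mille)."""
    return value_permille / 10.0


if __name__ == "__main__":
    # AlaAla is listed in the table as 14.502 per mille.
    print(permille_to_fraction(14.502))  # about 0.0145
    print(permille_to_percent(14.502))   # about 1.45 %
```

The uniform expectation of 0.25% mentioned above corresponds to 2.5‰ in the units of the table.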

For more information, see the sequence space article on Wikipedia.

Ala Cys Asp Glu Phe Gly His Ile Lys Leu Met Asn Pro Gln Arg Ser Thr Val Trp Tyr Xaa
Ala
AlaAla: 14.502 ± 1.679
AlaCys: 0.92 ± 0.226
AlaAsp: 7.359 ± 0.661
AlaGlu: 7.63 ± 0.818
AlaPhe: 2.814 ± 0.39
AlaGly: 10.065 ± 1.221
AlaHis: 2.489 ± 0.406
AlaIle: 4.329 ± 0.557
AlaLys: 3.896 ± 0.404
AlaLeu: 8.171 ± 0.726
AlaMet: 2.543 ± 0.468
AlaAsn: 2.327 ± 0.335
AlaPro: 5.141 ± 0.575
AlaGln: 3.355 ± 0.477
AlaArg: 8.171 ± 0.924
AlaSer: 5.411 ± 0.648
AlaThr: 6.115 ± 0.567
AlaVal: 7.143 ± 0.593
AlaTrp: 2.922 ± 0.618
AlaTyr: 2.219 ± 0.294
AlaXaa: 0.0 ± 0.0
Cys
CysAla: 0.974 ± 0.236
CysCys: 0.108 ± 0.08
CysAsp: 1.569 ± 0.399
CysGlu: 0.758 ± 0.201
CysPhe: 0.162 ± 0.086
CysGly: 1.353 ± 0.292
CysHis: 0.162 ± 0.102
CysIle: 0.216 ± 0.119
CysLys: 0.433 ± 0.174
CysLeu: 0.92 ± 0.247
CysMet: 0.216 ± 0.096
CysAsn: 0.433 ± 0.146
CysPro: 0.92 ± 0.242
CysGln: 0.325 ± 0.13
CysArg: 0.649 ± 0.184
CysSer: 0.595 ± 0.181
CysThr: 0.541 ± 0.175
CysVal: 0.703 ± 0.197
CysTrp: 0.271 ± 0.114
CysTyr: 0.162 ± 0.106
CysXaa: 0.0 ± 0.0
Asp
AspAla: 7.413 ± 0.621
AspCys: 0.866 ± 0.229
AspAsp: 4.491 ± 0.601
AspGlu: 3.193 ± 0.411
AspPhe: 2.11 ± 0.272
AspGly: 7.63 ± 0.774
AspHis: 1.19 ± 0.241
AspIle: 2.381 ± 0.299
AspLys: 1.515 ± 0.265
AspLeu: 6.385 ± 0.562
AspMet: 0.866 ± 0.243
AspAsn: 1.461 ± 0.326
AspPro: 5.032 ± 0.654
AspGln: 2.11 ± 0.333
AspArg: 4.924 ± 0.618
AspSer: 3.842 ± 0.588
AspThr: 3.734 ± 0.438
AspVal: 4.329 ± 0.531
AspTrp: 1.569 ± 0.302
AspTyr: 1.948 ± 0.289
AspXaa: 0.0 ± 0.0
Glu
GluAla: 6.223 ± 0.685
GluCys: 0.758 ± 0.272
GluAsp: 2.922 ± 0.344
GluGlu: 2.76 ± 0.614
GluPhe: 2.165 ± 0.326
GluGly: 2.922 ± 0.393
GluHis: 1.407 ± 0.341
GluIle: 2.381 ± 0.338
GluLys: 1.623 ± 0.257
GluLeu: 5.357 ± 0.73
GluMet: 1.786 ± 0.309
GluAsn: 2.273 ± 0.287
GluPro: 2.976 ± 0.42
GluGln: 2.76 ± 0.399
GluArg: 4.545 ± 0.606
GluSer: 3.247 ± 0.528
GluThr: 4.383 ± 0.675
GluVal: 3.95 ± 0.597
GluTrp: 1.407 ± 0.278
GluTyr: 1.732 ± 0.377
GluXaa: 0.0 ± 0.0
Phe
PheAla: 3.409 ± 0.441
PheCys: 0.162 ± 0.08
PheAsp: 2.327 ± 0.423
PheGlu: 1.732 ± 0.31
PhePhe: 0.866 ± 0.235
PheGly: 3.247 ± 0.592
PheHis: 0.433 ± 0.142
PheIle: 1.515 ± 0.362
PheLys: 1.028 ± 0.251
PheLeu: 1.569 ± 0.264
PheMet: 0.866 ± 0.26
PheAsn: 1.19 ± 0.33
PhePro: 1.677 ± 0.294
PheGln: 1.082 ± 0.327
PheArg: 1.569 ± 0.252
PheSer: 1.299 ± 0.265
PheThr: 2.219 ± 0.423
PheVal: 1.948 ± 0.295
PheTrp: 0.487 ± 0.127
PheTyr: 1.136 ± 0.322
PheXaa: 0.0 ± 0.0
Gly
GlyAla: 9.037 ± 1.174
GlyCys: 1.19 ± 0.253
GlyAsp: 6.602 ± 0.642
GlyGlu: 3.626 ± 0.552
GlyPhe: 2.814 ± 0.41
GlyGly: 10.281 ± 1.948
GlyHis: 1.732 ± 0.289
GlyIle: 4.383 ± 0.665
GlyLys: 2.273 ± 0.331
GlyLeu: 6.115 ± 0.559
GlyMet: 2.381 ± 0.392
GlyAsn: 2.976 ± 0.363
GlyPro: 4.437 ± 0.505
GlyGln: 2.435 ± 0.51
GlyArg: 5.357 ± 0.656
GlySer: 6.439 ± 0.884
GlyThr: 6.169 ± 0.775
GlyVal: 5.411 ± 0.61
GlyTrp: 2.652 ± 0.382
GlyTyr: 2.11 ± 0.349
GlyXaa: 0.0 ± 0.0
His
HisAla: 1.84 ± 0.397
HisCys: 0.379 ± 0.167
HisAsp: 0.812 ± 0.194
HisGlu: 1.353 ± 0.287
HisPhe: 0.379 ± 0.127
HisGly: 1.84 ± 0.301
HisHis: 0.974 ± 0.253
HisIle: 1.407 ± 0.294
HisLys: 0.812 ± 0.241
HisLeu: 1.353 ± 0.338
HisMet: 0.703 ± 0.186
HisAsn: 0.703 ± 0.185
HisPro: 1.623 ± 0.317
HisGln: 0.758 ± 0.192
HisArg: 2.219 ± 0.416
HisSer: 0.703 ± 0.186
HisThr: 1.19 ± 0.315
HisVal: 1.461 ± 0.301
HisTrp: 0.541 ± 0.185
HisTyr: 0.703 ± 0.186
HisXaa: 0.0 ± 0.0
Ile
IleAla: 5.628 ± 0.584
IleCys: 0.541 ± 0.184
IleAsp: 4.058 ± 0.526
IleGlu: 3.626 ± 0.422
IlePhe: 0.812 ± 0.254
IleGly: 3.355 ± 0.522
IleHis: 1.623 ± 0.33
IleIle: 1.299 ± 0.258
IleLys: 0.974 ± 0.243
IleLeu: 2.327 ± 0.414
IleMet: 0.379 ± 0.128
IleAsn: 1.786 ± 0.278
IlePro: 3.193 ± 0.357
IleGln: 1.299 ± 0.229
IleArg: 2.814 ± 0.475
IleSer: 2.327 ± 0.397
IleThr: 3.626 ± 0.444
IleVal: 2.76 ± 0.333
IleTrp: 0.92 ± 0.272
IleTyr: 0.541 ± 0.181
IleXaa: 0.0 ± 0.0
Lys
LysAla: 3.084 ± 0.458
LysCys: 0.487 ± 0.19
LysAsp: 1.515 ± 0.313
LysGlu: 1.299 ± 0.268
LysPhe: 1.299 ± 0.268
LysGly: 2.706 ± 0.357
LysHis: 0.812 ± 0.234
LysIle: 0.866 ± 0.258
LysLys: 1.19 ± 0.396
LysLeu: 2.435 ± 0.384
LysMet: 0.649 ± 0.165
LysAsn: 0.812 ± 0.197
LysPro: 2.489 ± 0.428
LysGln: 1.569 ± 0.259
LysArg: 2.219 ± 0.332
LysSer: 1.84 ± 0.357
LysThr: 2.11 ± 0.324
LysVal: 2.219 ± 0.369
LysTrp: 0.758 ± 0.219
LysTyr: 0.974 ± 0.253
LysXaa: 0.0 ± 0.0
Leu
LeuAla: 7.576 ± 0.716
LeuCys: 0.758 ± 0.225
LeuAsp: 5.303 ± 0.6
LeuGlu: 3.626 ± 0.545
LeuPhe: 2.543 ± 0.301
LeuGly: 5.628 ± 0.517
LeuHis: 1.136 ± 0.258
LeuIle: 3.571 ± 0.467
LeuLys: 1.84 ± 0.326
LeuLeu: 4.816 ± 0.548
LeuMet: 1.515 ± 0.266
LeuAsn: 2.814 ± 0.376
LeuPro: 5.411 ± 0.69
LeuGln: 2.76 ± 0.409
LeuArg: 5.411 ± 0.611
LeuSer: 5.465 ± 0.538
LeuThr: 5.411 ± 0.589
LeuVal: 5.032 ± 0.585
LeuTrp: 1.245 ± 0.296
LeuTyr: 2.056 ± 0.372
LeuXaa: 0.0 ± 0.0
Met
MetAla: 2.219 ± 0.377
MetCys: 0.271 ± 0.123
MetAsp: 1.136 ± 0.248
MetGlu: 1.19 ± 0.218
MetPhe: 0.703 ± 0.222
MetGly: 1.569 ± 0.291
MetHis: 0.162 ± 0.101
MetIle: 0.812 ± 0.222
MetLys: 0.758 ± 0.189
MetLeu: 1.623 ± 0.265
MetMet: 0.487 ± 0.196
MetAsn: 1.028 ± 0.204
MetPro: 1.623 ± 0.252
MetGln: 0.595 ± 0.196
MetArg: 1.677 ± 0.318
MetSer: 2.76 ± 0.378
MetThr: 2.002 ± 0.295
MetVal: 1.461 ± 0.334
MetTrp: 0.325 ± 0.123
MetTyr: 0.433 ± 0.167
MetXaa: 0.0 ± 0.0
Asn
AsnAla: 3.788 ± 0.477
AsnCys: 0.271 ± 0.121
AsnAsp: 1.623 ± 0.291
AsnGlu: 1.569 ± 0.288
AsnPhe: 0.812 ± 0.273
AsnGly: 4.437 ± 0.632
AsnHis: 0.974 ± 0.169
AsnIle: 1.786 ± 0.485
AsnLys: 1.082 ± 0.259
AsnLeu: 2.11 ± 0.327
AsnMet: 0.487 ± 0.157
AsnAsn: 1.786 ± 0.362
AsnPro: 2.273 ± 0.324
AsnGln: 1.028 ± 0.311
AsnArg: 1.894 ± 0.384
AsnSer: 1.569 ± 0.26
AsnThr: 2.435 ± 0.355
AsnVal: 1.948 ± 0.303
AsnTrp: 0.758 ± 0.186
AsnTyr: 0.487 ± 0.151
AsnXaa: 0.0 ± 0.0
Pro
ProAla: 5.898 ± 0.727
ProCys: 0.703 ± 0.219
ProAsp: 4.383 ± 0.615
ProGlu: 3.842 ± 0.382
ProPhe: 1.732 ± 0.337
ProGly: 6.494 ± 0.737
ProHis: 1.515 ± 0.297
ProIle: 2.706 ± 0.387
ProLys: 2.435 ± 0.436
ProLeu: 4.329 ± 0.568
ProMet: 1.732 ± 0.382
ProAsn: 2.273 ± 0.346
ProPro: 3.788 ± 0.561
ProGln: 2.219 ± 0.397
ProArg: 3.247 ± 0.52
ProSer: 3.409 ± 0.448
ProThr: 3.139 ± 0.415
ProVal: 4.762 ± 0.533
ProTrp: 1.19 ± 0.245
ProTyr: 1.623 ± 0.268
ProXaa: 0.0 ± 0.0
Gln
GlnAla: 4.437 ± 0.569
GlnCys: 0.271 ± 0.117
GlnAsp: 1.786 ± 0.277
GlnGlu: 1.732 ± 0.347
GlnPhe: 1.082 ± 0.238
GlnGly: 2.219 ± 0.476
GlnHis: 0.758 ± 0.254
GlnIle: 1.677 ± 0.271
GlnLys: 1.082 ± 0.178
GlnLeu: 3.03 ± 0.414
GlnMet: 0.541 ± 0.178
GlnAsn: 0.92 ± 0.21
GlnPro: 2.489 ± 0.414
GlnGln: 1.136 ± 0.252
GlnArg: 2.273 ± 0.316
GlnSer: 2.543 ± 0.458
GlnThr: 1.623 ± 0.342
GlnVal: 2.381 ± 0.352
GlnTrp: 0.703 ± 0.159
GlnTyr: 0.866 ± 0.236
GlnXaa: 0.0 ± 0.0
Arg
ArgAla: 6.331 ± 0.646
ArgCys: 1.299 ± 0.279
ArgAsp: 4.058 ± 0.578
ArgGlu: 4.762 ± 0.636
ArgPhe: 2.056 ± 0.398
ArgGly: 3.68 ± 0.486
ArgHis: 1.353 ± 0.273
ArgIle: 3.788 ± 0.549
ArgLys: 2.489 ± 0.416
ArgLeu: 5.249 ± 0.619
ArgMet: 2.165 ± 0.39
ArgAsn: 2.543 ± 0.511
ArgPro: 3.788 ± 0.419
ArgGln: 2.273 ± 0.378
ArgArg: 6.169 ± 0.989
ArgSer: 4.6 ± 0.492
ArgThr: 3.571 ± 0.503
ArgVal: 5.249 ± 0.553
ArgTrp: 2.11 ± 0.328
ArgTyr: 2.11 ± 0.332
ArgXaa: 0.0 ± 0.0
Ser
SerAla: 7.089 ± 1.166
SerCys: 0.433 ± 0.151
SerAsp: 4.437 ± 0.521
SerGlu: 3.03 ± 0.403
SerPhe: 2.056 ± 0.409
SerGly: 6.872 ± 0.821
SerHis: 1.19 ± 0.251
SerIle: 2.976 ± 0.373
SerLys: 2.435 ± 0.454
SerLeu: 4.167 ± 0.481
SerMet: 1.082 ± 0.302
SerAsn: 2.11 ± 0.361
SerPro: 3.084 ± 0.321
SerGln: 1.569 ± 0.307
SerArg: 3.896 ± 0.454
SerSer: 3.68 ± 0.563
SerThr: 3.571 ± 0.446
SerVal: 4.383 ± 0.48
SerTrp: 1.461 ± 0.297
SerTyr: 1.407 ± 0.234
SerXaa: 0.0 ± 0.0
Thr
ThrAla: 6.169 ± 0.764
ThrCys: 0.649 ± 0.214
ThrAsp: 3.788 ± 0.613
ThrGlu: 3.788 ± 0.396
ThrPhe: 1.515 ± 0.298
ThrGly: 5.628 ± 0.597
ThrHis: 1.407 ± 0.289
ThrIle: 3.084 ± 0.428
ThrLys: 1.948 ± 0.31
ThrLeu: 4.816 ± 0.392
ThrMet: 1.569 ± 0.314
ThrAsn: 2.327 ± 0.357
ThrPro: 4.654 ± 0.554
ThrGln: 2.002 ± 0.313
ThrArg: 4.221 ± 0.437
ThrSer: 3.734 ± 0.477
ThrThr: 4.491 ± 0.487
ThrVal: 6.115 ± 0.683
ThrTrp: 1.082 ± 0.265
ThrTyr: 1.894 ± 0.339
ThrXaa: 0.0 ± 0.0
Val
ValAla: 7.576 ± 0.612
ValCys: 0.92 ± 0.204
ValAsp: 5.628 ± 0.598
ValGlu: 5.195 ± 0.672
ValPhe: 2.219 ± 0.401
ValGly: 5.465 ± 0.652
ValHis: 1.245 ± 0.298
ValIle: 2.597 ± 0.411
ValLys: 2.056 ± 0.339
ValLeu: 5.032 ± 0.569
ValMet: 1.569 ± 0.228
ValAsn: 1.948 ± 0.364
ValPro: 4.113 ± 0.487
ValGln: 2.652 ± 0.319
ValArg: 4.329 ± 0.605
ValSer: 4.762 ± 0.573
ValThr: 5.032 ± 0.503
ValVal: 6.548 ± 0.712
ValTrp: 1.948 ± 0.362
ValTyr: 1.407 ± 0.243
ValXaa: 0.0 ± 0.0
Trp
TrpAla: 1.786 ± 0.306
TrpCys: 0.216 ± 0.124
TrpAsp: 1.569 ± 0.268
TrpGlu: 1.136 ± 0.305
TrpPhe: 0.92 ± 0.233
TrpGly: 1.082 ± 0.244
TrpHis: 0.595 ± 0.186
TrpIle: 1.19 ± 0.272
TrpLys: 0.812 ± 0.211
TrpLeu: 1.948 ± 0.387
TrpMet: 0.974 ± 0.243
TrpAsn: 0.541 ± 0.205
TrpPro: 1.136 ± 0.272
TrpGln: 0.974 ± 0.283
TrpArg: 2.219 ± 0.432
TrpSer: 1.677 ± 0.507
TrpThr: 1.732 ± 0.288
TrpVal: 2.056 ± 0.382
TrpTrp: 0.974 ± 0.188
TrpTyr: 0.325 ± 0.138
TrpXaa: 0.0 ± 0.0
Tyr
TyrAla: 2.543 ± 0.371
TyrCys: 0.271 ± 0.128
TyrAsp: 1.623 ± 0.36
TyrGlu: 1.732 ± 0.268
TyrPhe: 0.703 ± 0.211
TyrGly: 1.948 ± 0.396
TyrHis: 0.595 ± 0.166
TyrIle: 0.92 ± 0.182
TyrLys: 0.649 ± 0.226
TyrLeu: 2.165 ± 0.36
TyrMet: 0.271 ± 0.116
TyrAsn: 0.866 ± 0.248
TyrPro: 1.623 ± 0.253
TyrGln: 0.649 ± 0.189
TyrArg: 1.84 ± 0.37
TyrSer: 1.028 ± 0.232
TyrThr: 1.786 ± 0.386
TyrVal: 2.381 ± 0.304
TyrTrp: 0.541 ± 0.165
TyrTyr: 0.758 ± 0.176
TyrXaa: 0.0 ± 0.0
Xaa
XaaAla: 0.0 ± 0.0
XaaCys: 0.0 ± 0.0
XaaAsp: 0.0 ± 0.0
XaaGlu: 0.0 ± 0.0
XaaPhe: 0.0 ± 0.0
XaaGly: 0.0 ± 0.0
XaaHis: 0.0 ± 0.0
XaaIle: 0.0 ± 0.0
XaaLys: 0.0 ± 0.0
XaaLeu: 0.0 ± 0.0
XaaMet: 0.0 ± 0.0
XaaAsn: 0.0 ± 0.0
XaaPro: 0.0 ± 0.0
XaaGln: 0.0 ± 0.0
XaaArg: 0.0 ± 0.0
XaaSer: 0.0 ± 0.0
XaaThr: 0.0 ± 0.0
XaaVal: 0.0 ± 0.0
XaaTrp: 0.0 ± 0.0
XaaTyr: 0.0 ± 0.0
XaaXaa: 0.0 ± 0.0
Statistics based on 105 proteins (18481 amino acids)

Note: The error was estimated by bootstrapping (×100) at the protein level.
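A protein-level bootstrap of this kind could look roughly like the sketch below. It is an illustration only, not the procedure used for this database: the function names, the fixed random seed, counting pairs within each protein, and the use of the sample standard deviation as the error are all assumptions made here.

```python
import random
import statistics


def single_dipeptide_permille(proteins, dipeptide):
    """Frequency of one dipeptide in per mille over overlapping pairs."""
    hits = 0
    total = 0
    for seq in proteins:
        for a, b in zip(seq, seq[1:]):
            total += 1
            if a + b == dipeptide:
                hits += 1
    return 1000.0 * hits / total if total else 0.0


def bootstrap_error(proteins, dipeptide, n_resamples=100, seed=0):
    """Spread of the frequency when whole proteins are resampled.

    Each resample draws len(proteins) proteins with replacement, so the
    resampling unit is the protein, not the individual residue.
    """
    rng = random.Random(seed)  # fixed seed only to make the sketch repeatable
    estimates = []
    for _ in range(n_resamples):
        sample = rng.choices(proteins, k=len(proteins))
        estimates.append(single_dipeptide_permille(sample, dipeptide))
    return statistics.stdev(estimates)


if __name__ == "__main__":
    toy = ["AAGA", "GAAC", "ACGA"]  # made-up sequences, not from this proteome
    value = single_dipeptide_permille(toy, "AA")
    error = bootstrap_error(toy, "AA")
    print(f"AA: {value:.3f} ± {error:.3f}")
```

Resampling whole proteins rather than single residues keeps the pairs inside each protein intact, which is one plausible reading of "at the protein level".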

The dipeptide statistics above (among other statistics for this proteome) can be downloaded from this CSV file
See this proteome in: uniprot_link
Proteome-pI is available under Creative Commons Attribution-NoDerivs license, for more details see here

Reference: Kozlowski LP. Proteome-pI 2.0: Proteome Isoelectric Point Database Update. Nucleic Acids Res. 2021, doi: 10.1093/nar/gkab944
Contact: Lukasz P. Kozlowski