Amino acid dipeptide frequency for Mycobacterium phage Odin

Apart from single amino acid frequencies, one can also calculate the so-called amino acid dipeptide frequency.

There are 400 possible dipeptides (441 if Xaa is counted as an additional 21st amino acid representing all non-standard or unknown amino acids). Thus, if all 400 dipeptides were equally likely in proteins, each would be present at a frequency of 0.25% (2.5‰). As this is not the case in nature, for better visibility all dipeptides that are more frequent than expected are marked in red, and those that are underrepresented are marked in blue in the table.

All values are given in per mille (‰) and therefore need to be multiplied by 10⁻³ to obtain fractions.
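As an illustration of the calculation described above, here is a minimal Python sketch that counts overlapping dipeptides within each protein and reports them in per mille. The function names, the toy sequences, and the choice to normalise by the total number of dipeptides are assumptions made for this example; the database may normalise differently.

```python
from collections import Counter
from itertools import product

# One-letter to three-letter codes for the 20 standard amino acids.
ONE_TO_THREE = {
    "A": "Ala", "C": "Cys", "D": "Asp", "E": "Glu", "F": "Phe", "G": "Gly",
    "H": "His", "I": "Ile", "K": "Lys", "L": "Leu", "M": "Met", "N": "Asn",
    "P": "Pro", "Q": "Gln", "R": "Arg", "S": "Ser", "T": "Thr", "V": "Val",
    "W": "Trp", "Y": "Tyr",
}


def three_letter(residue):
    """Map a one-letter code to three letters; anything else becomes Xaa."""
    return ONE_TO_THREE.get(residue.upper(), "Xaa")


def dipeptide_per_mille(proteins):
    """Return all 441 dipeptide frequencies in per mille.

    Dipeptides are counted inside each protein only, never across
    protein boundaries. Normalising by the total dipeptide count is
    an assumption of this sketch.
    """
    counts = Counter()
    for seq in proteins:
        codes = [three_letter(r) for r in seq]
        counts.update(a + b for a, b in zip(codes, codes[1:]))
    total = sum(counts.values())
    names = list(ONE_TO_THREE.values()) + ["Xaa"]
    return {
        a + b: (1000.0 * counts[a + b] / total if total else 0.0)
        for a, b in product(names, repeat=2)
    }


# Hypothetical toy sequences, not taken from the phage proteome.
toy_proteins = ["MAAG", "AAL"]
freqs = dipeptide_per_mille(toy_proteins)

# Expectation if all 400 standard dipeptides were equally likely.
uniform_expectation = 1000.0 / 400
```

With real data, `freqs["AlaAla"]` would be compared against `uniform_expectation` (2.5‰) to see whether a dipeptide is over- or underrepresented under the equal-likelihood assumption.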

For more information, see the sequence space article on Wikipedia.

Dipeptide frequencies in ‰ (value ± error), grouped by first residue. Within each group the second residue follows the order Ala, Cys, Asp, Glu, Phe, Gly, His, Ile, Lys, Leu, Met, Asn, Pro, Gln, Arg, Ser, Thr, Val, Trp, Tyr, Xaa.

Ala
AlaAla: 11.478 ± 1.213
AlaCys: 0.565 ± 0.216
AlaAsp: 5.708 ± 0.532
AlaGlu: 7.59 ± 0.862
AlaPhe: 3.136 ± 0.39
AlaGly: 8.342 ± 0.822
AlaHis: 1.505 ± 0.254
AlaIle: 3.889 ± 0.462
AlaLys: 5.582 ± 0.783
AlaLeu: 9.095 ± 0.853
AlaMet: 3.262 ± 0.402
AlaAsn: 3.638 ± 0.621
AlaPro: 4.892 ± 0.924
AlaGln: 4.014 ± 0.584
AlaArg: 5.331 ± 0.576
AlaSer: 4.579 ± 0.738
AlaThr: 5.018 ± 0.428
AlaVal: 5.833 ± 0.588
AlaTrp: 1.631 ± 0.316
AlaTyr: 2.572 ± 0.462
AlaXaa: 0.0 ± 0.0

Cys
CysAla: 0.565 ± 0.164
CysCys: 0.0 ± 0.0
CysAsp: 0.439 ± 0.159
CysGlu: 0.627 ± 0.185
CysPhe: 0.439 ± 0.164
CysGly: 0.69 ± 0.197
CysHis: 0.188 ± 0.102
CysIle: 0.251 ± 0.147
CysLys: 0.627 ± 0.2
CysLeu: 0.627 ± 0.211
CysMet: 0.251 ± 0.17
CysAsn: 0.314 ± 0.112
CysPro: 0.69 ± 0.284
CysGln: 0.125 ± 0.084
CysArg: 0.69 ± 0.177
CysSer: 0.502 ± 0.21
CysThr: 0.439 ± 0.185
CysVal: 0.565 ± 0.216
CysTrp: 0.314 ± 0.133
CysTyr: 0.376 ± 0.151
CysXaa: 0.0 ± 0.0

Asp
AspAla: 5.959 ± 0.797
AspCys: 0.815 ± 0.291
AspAsp: 3.387 ± 0.481
AspGlu: 4.14 ± 0.646
AspPhe: 2.446 ± 0.43
AspGly: 4.453 ± 0.563
AspHis: 1.756 ± 0.353
AspIle: 3.952 ± 0.407
AspLys: 2.572 ± 0.419
AspLeu: 5.52 ± 0.726
AspMet: 1.254 ± 0.285
AspAsn: 1.631 ± 0.376
AspPro: 4.892 ± 0.628
AspGln: 1.631 ± 0.353
AspArg: 3.073 ± 0.546
AspSer: 3.073 ± 0.386
AspThr: 3.638 ± 0.387
AspVal: 4.14 ± 0.412
AspTrp: 1.568 ± 0.323
AspTyr: 2.383 ± 0.323
AspXaa: 0.0 ± 0.0

Glu
GluAla: 7.84 ± 0.834
GluCys: 0.376 ± 0.157
GluAsp: 3.763 ± 0.67
GluGlu: 5.018 ± 0.659
GluPhe: 3.199 ± 0.451
GluGly: 5.081 ± 0.597
GluHis: 1.192 ± 0.253
GluIle: 3.513 ± 0.448
GluLys: 2.76 ± 0.437
GluLeu: 6.774 ± 0.765
GluMet: 2.383 ± 0.325
GluAsn: 2.07 ± 0.331
GluPro: 3.011 ± 0.461
GluGln: 2.07 ± 0.327
GluArg: 4.704 ± 0.619
GluSer: 2.885 ± 0.449
GluThr: 3.262 ± 0.414
GluVal: 4.516 ± 0.504
GluTrp: 1.317 ± 0.323
GluTyr: 1.944 ± 0.295
GluXaa: 0.0 ± 0.0

Phe
PheAla: 3.073 ± 0.376
PheCys: 0.188 ± 0.101
PheAsp: 2.007 ± 0.396
PheGlu: 2.697 ± 0.325
PhePhe: 0.439 ± 0.193
PheGly: 2.634 ± 0.369
PheHis: 0.627 ± 0.21
PheIle: 1.568 ± 0.326
PheLys: 1.819 ± 0.354
PheLeu: 2.133 ± 0.401
PheMet: 0.439 ± 0.155
PheAsn: 1.944 ± 0.365
PhePro: 1.694 ± 0.335
PheGln: 1.38 ± 0.378
PheArg: 2.07 ± 0.336
PheSer: 1.631 ± 0.322
PheThr: 2.07 ± 0.333
PheVal: 2.572 ± 0.414
PheTrp: 0.502 ± 0.207
PheTyr: 1.004 ± 0.297
PheXaa: 0.0 ± 0.0

Gly
GlyAla: 6.649 ± 0.834
GlyCys: 0.69 ± 0.2
GlyAsp: 6.021 ± 0.619
GlyGlu: 3.826 ± 0.4
GlyPhe: 2.885 ± 0.479
GlyGly: 8.593 ± 1.307
GlyHis: 2.07 ± 0.363
GlyIle: 4.202 ± 0.616
GlyLys: 3.701 ± 0.44
GlyLeu: 5.833 ± 0.791
GlyMet: 2.007 ± 0.405
GlyAsn: 2.572 ± 0.436
GlyPro: 3.199 ± 0.389
GlyGln: 2.948 ± 0.541
GlyArg: 4.14 ± 0.503
GlySer: 4.642 ± 0.922
GlyThr: 5.394 ± 0.721
GlyVal: 6.774 ± 0.648
GlyTrp: 1.882 ± 0.354
GlyTyr: 2.321 ± 0.378
GlyXaa: 0.0 ± 0.0

His
HisAla: 1.568 ± 0.356
HisCys: 0.251 ± 0.11
HisAsp: 1.443 ± 0.394
HisGlu: 1.756 ± 0.385
HisPhe: 0.439 ± 0.189
HisGly: 1.631 ± 0.367
HisHis: 0.753 ± 0.243
HisIle: 1.192 ± 0.249
HisLys: 1.192 ± 0.255
HisLeu: 1.944 ± 0.315
HisMet: 0.376 ± 0.139
HisAsn: 0.627 ± 0.204
HisPro: 1.066 ± 0.192
HisGln: 0.941 ± 0.271
HisArg: 1.192 ± 0.28
HisSer: 0.627 ± 0.186
HisThr: 1.443 ± 0.347
HisVal: 1.192 ± 0.262
HisTrp: 0.251 ± 0.126
HisTyr: 0.815 ± 0.278
HisXaa: 0.0 ± 0.0

Ile
IleAla: 5.081 ± 0.498
IleCys: 0.439 ± 0.175
IleAsp: 3.262 ± 0.41
IleGlu: 4.265 ± 0.448
IlePhe: 1.066 ± 0.221
IleGly: 4.077 ± 0.546
IleHis: 1.066 ± 0.256
IleIle: 2.321 ± 0.354
IleLys: 2.885 ± 0.346
IleLeu: 4.453 ± 0.504
IleMet: 0.314 ± 0.133
IleAsn: 2.383 ± 0.351
IlePro: 3.826 ± 0.455
IleGln: 1.694 ± 0.328
IleArg: 3.136 ± 0.393
IleSer: 3.262 ± 0.468
IleThr: 3.387 ± 0.398
IleVal: 3.513 ± 0.483
IleTrp: 0.69 ± 0.211
IleTyr: 0.941 ± 0.223
IleXaa: 0.0 ± 0.0

Lys
LysAla: 5.143 ± 0.806
LysCys: 0.314 ± 0.147
LysAsp: 3.324 ± 0.481
LysGlu: 2.509 ± 0.454
LysPhe: 0.753 ± 0.227
LysGly: 3.763 ± 0.473
LysHis: 0.69 ± 0.177
LysIle: 2.383 ± 0.374
LysLys: 3.073 ± 0.557
LysLeu: 3.889 ± 0.546
LysMet: 1.443 ± 0.315
LysAsn: 1.631 ± 0.301
LysPro: 3.324 ± 0.533
LysGln: 2.07 ± 0.378
LysArg: 3.324 ± 0.496
LysSer: 2.634 ± 0.38
LysThr: 3.513 ± 0.591
LysVal: 3.763 ± 0.41
LysTrp: 0.502 ± 0.186
LysTyr: 1.254 ± 0.268
LysXaa: 0.0 ± 0.0

Leu
LeuAla: 8.907 ± 0.952
LeuCys: 0.878 ± 0.267
LeuAsp: 4.704 ± 0.525
LeuGlu: 5.331 ± 0.632
LeuPhe: 2.446 ± 0.259
LeuGly: 6.774 ± 0.837
LeuHis: 1.505 ± 0.307
LeuIle: 4.453 ± 0.568
LeuLys: 3.262 ± 0.419
LeuLeu: 4.83 ± 0.527
LeuMet: 2.572 ± 0.346
LeuAsn: 2.446 ± 0.464
LeuPro: 4.202 ± 0.512
LeuGln: 2.76 ± 0.538
LeuArg: 5.269 ± 0.78
LeuSer: 5.708 ± 0.602
LeuThr: 5.018 ± 0.594
LeuVal: 5.269 ± 0.613
LeuTrp: 1.443 ± 0.262
LeuTyr: 1.944 ± 0.396
LeuXaa: 0.0 ± 0.0

Met
MetAla: 2.321 ± 0.513
MetCys: 0.125 ± 0.088
MetAsp: 1.192 ± 0.26
MetGlu: 1.317 ± 0.239
MetPhe: 0.69 ± 0.207
MetGly: 1.631 ± 0.356
MetHis: 0.502 ± 0.174
MetIle: 1.505 ± 0.346
MetLys: 1.756 ± 0.368
MetLeu: 1.631 ± 0.367
MetMet: 0.878 ± 0.218
MetAsn: 0.753 ± 0.218
MetPro: 1.129 ± 0.244
MetGln: 1.066 ± 0.279
MetArg: 1.944 ± 0.361
MetSer: 2.383 ± 0.326
MetThr: 2.697 ± 0.415
MetVal: 1.443 ± 0.282
MetTrp: 0.251 ± 0.121
MetTyr: 0.502 ± 0.17
MetXaa: 0.0 ± 0.0

Asn
AsnAla: 2.885 ± 0.5
AsnCys: 0.376 ± 0.156
AsnAsp: 2.007 ± 0.319
AsnGlu: 2.195 ± 0.399
AsnPhe: 0.878 ± 0.23
AsnGly: 3.324 ± 0.482
AsnHis: 0.815 ± 0.226
AsnIle: 1.317 ± 0.248
AsnLys: 1.066 ± 0.303
AsnLeu: 2.446 ± 0.339
AsnMet: 0.815 ± 0.204
AsnAsn: 0.753 ± 0.256
AsnPro: 2.321 ± 0.37
AsnGln: 1.505 ± 0.424
AsnArg: 2.634 ± 0.448
AsnSer: 1.631 ± 0.342
AsnThr: 2.195 ± 0.395
AsnVal: 3.136 ± 0.491
AsnTrp: 0.627 ± 0.184
AsnTyr: 1.192 ± 0.247
AsnXaa: 0.0 ± 0.0

Pro
ProAla: 5.269 ± 0.743
ProCys: 0.376 ± 0.153
ProAsp: 3.136 ± 0.378
ProGlu: 4.516 ± 0.491
ProPhe: 2.195 ± 0.39
ProGly: 4.391 ± 0.537
ProHis: 1.882 ± 0.334
ProIle: 2.823 ± 0.352
ProLys: 2.572 ± 0.502
ProLeu: 3.575 ± 0.48
ProMet: 1.066 ± 0.269
ProAsn: 2.133 ± 0.375
ProPro: 2.572 ± 0.509
ProGln: 1.944 ± 0.43
ProArg: 2.76 ± 0.411
ProSer: 2.383 ± 0.427
ProThr: 3.638 ± 0.474
ProVal: 4.014 ± 0.487
ProTrp: 1.129 ± 0.356
ProTyr: 1.443 ± 0.333
ProXaa: 0.0 ± 0.0

Gln
GlnAla: 4.202 ± 0.615
GlnCys: 0.251 ± 0.122
GlnAsp: 2.007 ± 0.342
GlnGlu: 1.882 ± 0.349
GlnPhe: 1.317 ± 0.251
GlnGly: 3.011 ± 0.481
GlnHis: 0.753 ± 0.195
GlnIle: 2.383 ± 0.319
GlnLys: 2.258 ± 0.381
GlnLeu: 2.697 ± 0.549
GlnMet: 1.192 ± 0.248
GlnAsn: 1.066 ± 0.271
GlnPro: 1.317 ± 0.262
GlnGln: 1.756 ± 0.432
GlnArg: 2.697 ± 0.448
GlnSer: 2.634 ± 0.392
GlnThr: 2.007 ± 0.301
GlnVal: 2.509 ± 0.417
GlnTrp: 0.815 ± 0.207
GlnTyr: 1.129 ± 0.236
GlnXaa: 0.0 ± 0.0

Arg
ArgAla: 5.52 ± 0.654
ArgCys: 1.129 ± 0.299
ArgAsp: 3.513 ± 0.477
ArgGlu: 4.704 ± 0.714
ArgPhe: 2.133 ± 0.448
ArgGly: 3.324 ± 0.492
ArgHis: 1.066 ± 0.254
ArgIle: 4.077 ± 0.477
ArgLys: 3.324 ± 0.473
ArgLeu: 5.269 ± 0.629
ArgMet: 2.07 ± 0.336
ArgAsn: 2.195 ± 0.343
ArgPro: 2.634 ± 0.409
ArgGln: 2.258 ± 0.417
ArgArg: 5.771 ± 0.632
ArgSer: 2.885 ± 0.393
ArgThr: 2.697 ± 0.395
ArgVal: 4.892 ± 0.555
ArgTrp: 1.317 ± 0.304
ArgTyr: 2.07 ± 0.437
ArgXaa: 0.0 ± 0.0

Ser
SerAla: 4.202 ± 0.516
SerCys: 0.376 ± 0.166
SerAsp: 3.889 ± 0.445
SerGlu: 3.889 ± 0.509
SerPhe: 2.383 ± 0.358
SerGly: 4.767 ± 0.675
SerHis: 0.815 ± 0.226
SerIle: 2.509 ± 0.429
SerLys: 2.697 ± 0.421
SerLeu: 4.453 ± 0.542
SerMet: 0.878 ± 0.225
SerAsn: 1.505 ± 0.285
SerPro: 3.073 ± 0.403
SerGln: 2.007 ± 0.332
SerArg: 4.077 ± 0.534
SerSer: 3.45 ± 0.507
SerThr: 3.387 ± 0.5
SerVal: 3.826 ± 0.464
SerTrp: 1.443 ± 0.399
SerTyr: 1.568 ± 0.246
SerXaa: 0.0 ± 0.0

Thr
ThrAla: 5.582 ± 0.579
ThrCys: 0.502 ± 0.178
ThrAsp: 3.575 ± 0.595
ThrGlu: 4.014 ± 0.553
ThrPhe: 2.321 ± 0.367
ThrGly: 5.645 ± 0.577
ThrHis: 1.254 ± 0.24
ThrIle: 2.634 ± 0.372
ThrLys: 3.136 ± 0.494
ThrLeu: 5.143 ± 0.635
ThrMet: 1.443 ± 0.319
ThrAsn: 1.882 ± 0.363
ThrPro: 4.579 ± 0.555
ThrGln: 2.634 ± 0.382
ThrArg: 2.509 ± 0.404
ThrSer: 3.011 ± 0.459
ThrThr: 3.262 ± 0.527
ThrVal: 4.453 ± 0.563
ThrTrp: 1.254 ± 0.279
ThrTyr: 1.694 ± 0.321
ThrXaa: 0.0 ± 0.0

Val
ValAla: 7.339 ± 0.82
ValCys: 0.815 ± 0.255
ValAsp: 5.457 ± 0.638
ValGlu: 4.14 ± 0.499
ValPhe: 2.007 ± 0.392
ValGly: 5.143 ± 0.59
ValHis: 1.443 ± 0.27
ValIle: 3.826 ± 0.566
ValLys: 3.513 ± 0.395
ValLeu: 5.018 ± 0.535
ValMet: 1.819 ± 0.44
ValAsn: 2.76 ± 0.433
ValPro: 3.073 ± 0.455
ValGln: 2.572 ± 0.401
ValArg: 4.83 ± 0.517
ValSer: 4.579 ± 0.63
ValThr: 4.14 ± 0.542
ValVal: 5.269 ± 0.551
ValTrp: 1.129 ± 0.226
ValTyr: 2.07 ± 0.394
ValXaa: 0.0 ± 0.0

Trp
TrpAla: 1.694 ± 0.361
TrpCys: 0.251 ± 0.144
TrpAsp: 1.192 ± 0.274
TrpGlu: 1.129 ± 0.244
TrpPhe: 0.753 ± 0.192
TrpGly: 1.254 ± 0.325
TrpHis: 0.627 ± 0.22
TrpIle: 1.317 ± 0.263
TrpLys: 0.815 ± 0.233
TrpLeu: 1.254 ± 0.275
TrpMet: 0.314 ± 0.126
TrpAsn: 0.565 ± 0.198
TrpPro: 0.815 ± 0.22
TrpGln: 1.192 ± 0.358
TrpArg: 0.815 ± 0.195
TrpSer: 1.066 ± 0.267
TrpThr: 1.38 ± 0.306
TrpVal: 1.317 ± 0.228
TrpTrp: 0.627 ± 0.191
TrpTyr: 0.69 ± 0.279
TrpXaa: 0.0 ± 0.0

Tyr
TyrAla: 2.383 ± 0.421
TyrCys: 0.0 ± 0.0
TyrAsp: 2.258 ± 0.301
TyrGlu: 2.133 ± 0.355
TyrPhe: 0.565 ± 0.158
TyrGly: 1.819 ± 0.425
TyrHis: 0.188 ± 0.129
TyrIle: 1.944 ± 0.333
TyrLys: 0.69 ± 0.178
TyrLeu: 3.136 ± 0.454
TyrMet: 0.878 ± 0.239
TyrAsn: 1.254 ± 0.251
TyrPro: 1.505 ± 0.256
TyrGln: 1.317 ± 0.285
TyrArg: 1.882 ± 0.371
TyrSer: 1.694 ± 0.313
TyrThr: 1.944 ± 0.37
TyrVal: 1.944 ± 0.407
TyrTrp: 0.376 ± 0.159
TyrTyr: 1.004 ± 0.278
TyrXaa: 0.0 ± 0.0

Xaa
XaaAla: 0.0 ± 0.0
XaaCys: 0.0 ± 0.0
XaaAsp: 0.0 ± 0.0
XaaGlu: 0.0 ± 0.0
XaaPhe: 0.0 ± 0.0
XaaGly: 0.0 ± 0.0
XaaHis: 0.0 ± 0.0
XaaIle: 0.0 ± 0.0
XaaLys: 0.0 ± 0.0
XaaLeu: 0.0 ± 0.0
XaaMet: 0.0 ± 0.0
XaaAsn: 0.0 ± 0.0
XaaPro: 0.0 ± 0.0
XaaGln: 0.0 ± 0.0
XaaArg: 0.0 ± 0.0
XaaSer: 0.0 ± 0.0
XaaThr: 0.0 ± 0.0
XaaVal: 0.0 ± 0.0
XaaTrp: 0.0 ± 0.0
XaaTyr: 0.0 ± 0.0
XaaXaa: 0.0 ± 0.0

Statistics based on 94 proteins (15944 amino acids)

Note: The error has been estimated by bootstrapping (×100) at the protein level
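To make the note above concrete, here is a minimal Python sketch of a protein-level bootstrap: whole proteins are resampled with replacement 100 times, and the dipeptide frequency is recomputed for each replicate. Taking the standard deviation over replicates as the reported error, the function names, the one-letter dipeptide keys, and the toy sequences are all assumptions of this example; the source does not specify the exact estimator.

```python
import random
import statistics
from collections import Counter


def dipeptide_counts(seq):
    """Count overlapping dipeptides (one-letter codes) in one protein."""
    return Counter(seq[i:i + 2] for i in range(len(seq) - 1))


def bootstrap_error(proteins, dipeptide, n_boot=100, seed=0):
    """Return (mean, standard deviation) of a dipeptide frequency in per mille.

    Resampling is done at the protein level: each replicate draws
    len(proteins) proteins with replacement. Using the standard
    deviation as the error is an assumption of this sketch.
    """
    rng = random.Random(seed)
    per_protein = [dipeptide_counts(s) for s in proteins]
    estimates = []
    for _ in range(n_boot):
        sample = rng.choices(per_protein, k=len(per_protein))
        total = sum(sum(c.values()) for c in sample)
        hits = sum(c[dipeptide] for c in sample)
        estimates.append(1000.0 * hits / total if total else 0.0)
    return statistics.mean(estimates), statistics.stdev(estimates)


# Hypothetical toy sequences, not taken from the phage proteome.
toy_proteins = ["MAAG", "AAL", "MKV", "GAAA"]
mean_aa, err_aa = bootstrap_error(toy_proteins, "AA")
mean_ww, err_ww = bootstrap_error(toy_proteins, "WW")
```

A dipeptide that never occurs, such as "WW" in the toy data, gets a frequency and error of exactly zero in every replicate, which matches the 0.0 ± 0.0 entries in the statistics above.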

The dipeptide statistics above (among other statistics for this proteome) can be downloaded from this CSV file
See this proteome in: uniprot_link
Proteome-pI is available under Creative Commons Attribution-NoDerivs license, for more details see here

Reference: Kozlowski LP. Proteome-pI 2.0: Proteome Isoelectric Point Database Update. Nucleic Acids Res. 2021, doi: 10.1093/nar/gkab944
Contact: Lukasz P. Kozlowski