Amino acid dipeptide frequency for Mycobacterium phage DonSanchon

Apart from single amino acid frequencies, one can also calculate the so-called amino acid dipeptide frequency, i.e. how often each ordered pair of adjacent amino acids occurs.
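The calculation can be sketched in Python as follows. This is only an illustration under stated assumptions, not the code used to build this page: the function name dipeptide_frequencies is hypothetical, dipeptides are assumed to be counted as overlapping adjacent pairs within each protein and pooled over the whole proteome, and any letter outside the 20 standard residues is assumed to map to X (Xaa).

```python
from collections import Counter
from itertools import product

# 20 standard residues in the one-letter code; X stands for Xaa.
STANDARD = "ACDEFGHIKLMNPQRSTVWY"
ALPHABET = STANDARD + "X"


def dipeptide_frequencies(proteins):
    """Return {dipeptide: frequency in per mille}, pooled over all proteins.

    Assumption: overlapping adjacent pairs are counted, and a pair never
    spans two different proteins.
    """
    counts = Counter()
    for seq in proteins:
        # Map anything outside the 20 standard residues to X (Xaa).
        cleaned = "".join(c if c in STANDARD else "X" for c in seq.upper())
        counts.update(cleaned[i:i + 2] for i in range(len(cleaned) - 1))
    total = sum(counts.values())
    return {
        a + b: (1000.0 * counts[a + b] / total if total else 0.0)
        for a, b in product(ALPHABET, repeat=2)
    }


# Toy sequences, not taken from the DonSanchon proteome.
freqs = dipeptide_frequencies(["MAAG", "AAC"])
print(freqs["AA"])  # 2 of the 5 dipeptides are AA -> 400.0
print(len(freqs))   # 441 entries, including the Xaa combinations
```

The result always holds all 441 combinations, so dipeptides that never occur appear with a frequency of 0.0, as in the Xaa entries of the table.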

There are 400 possible dipeptides (441 if Xaa is counted as an additional 21st amino acid representing all non-standard or unknown amino acids). Thus, if dipeptides occurred uniformly at random in proteins, each of the 400 dipeptides would be present at a frequency of 0.25%. As this is not the case in nature, for better visibility all dipeptides that occur more often than expected are marked in red in the table, and those that are underrepresented are marked in blue.
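The counts and the uniform expectation quoted above can be checked with a few lines of Python. This is a minimal sketch; the helper name uniform_expectation is hypothetical and simply enumerates all ordered pairs of residues.

```python
from itertools import product

STANDARD = ["Ala", "Cys", "Asp", "Glu", "Phe", "Gly", "His", "Ile", "Lys", "Leu",
            "Met", "Asn", "Pro", "Gln", "Arg", "Ser", "Thr", "Val", "Trp", "Tyr"]


def uniform_expectation(residues):
    """Return (number of ordered dipeptides, expected frequency in percent)."""
    n = len(list(product(residues, repeat=2)))
    return n, 100.0 / n


n20, pct20 = uniform_expectation(STANDARD)
n21, pct21 = uniform_expectation(STANDARD + ["Xaa"])
print(n20, pct20)            # 400 0.25
print(n21, round(pct21, 4))  # 441 0.2268
```

Order matters here: AlaCys and CysAla are distinct dipeptides, which is why the count is 20 x 20 rather than the number of unordered pairs.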

All values are given in per mille (‰) and therefore need to be multiplied by 10⁻³ to obtain fractions.
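As a small worked example of the unit conversion, the sketch below turns per mille values into fractions and percentages and compares them with the uniform expectation of 0.25% (2.5 ‰). The function names are hypothetical, and the assumption that 2.5 ‰ is the exact threshold behind the red/blue marking is not confirmed by this page.

```python
# Uniform expectation for 400 dipeptides, expressed in per mille (0.25 %).
EXPECTED_PER_MILLE = 1000.0 / 400


def per_mille_to_fraction(value):
    """Convert a per mille value to a plain fraction (multiply by 10^-3)."""
    return value * 1e-3


def per_mille_to_percent(value):
    """Convert a per mille value to percent."""
    return value / 10.0


def representation(value, expected=EXPECTED_PER_MILLE):
    """Label a per mille value relative to the assumed expectation."""
    if value > expected:
        return "over"
    if value < expected:
        return "under"
    return "as expected"


ala_ala = 18.078  # AlaAla value from the table, in per mille
cys_cys = 0.0     # CysCys value from the table, in per mille
print(round(per_mille_to_fraction(ala_ala), 6))  # 0.018078
print(round(per_mille_to_percent(ala_ala), 4))   # 1.8078
print(representation(ala_ala))                   # over
print(representation(cys_cys))                   # under
```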

For more information, see the sequence space article on Wikipedia.

Ala Cys Asp Glu Phe Gly His Ile Lys Leu Met Asn Pro Gln Arg Ser Thr Val Trp Tyr Xaa
Ala
AlaAla: 18.078 ± 1.395
AlaCys: 1.174 ± 0.327
AlaAsp: 7.372 ± 0.6
AlaGlu: 7.513 ± 0.788
AlaPhe: 2.348 ± 0.422
AlaGly: 10.706 ± 1.074
AlaHis: 2.536 ± 0.424
AlaIle: 4.555 ± 0.606
AlaLys: 3.616 ± 0.436
AlaLeu: 10.659 ± 0.915
AlaMet: 3.287 ± 0.404
AlaAsn: 3.052 ± 0.353
AlaPro: 6.715 ± 0.547
AlaGln: 4.226 ± 0.523
AlaArg: 8.734 ± 0.757
AlaSer: 5.588 ± 0.511
AlaThr: 7.795 ± 0.627
AlaVal: 9.015 ± 0.851
AlaTrp: 1.972 ± 0.324
AlaTyr: 3.052 ± 0.356
AlaXaa: 0.0 ± 0.0
Cys
CysAla: 0.845 ± 0.233
CysCys: 0.0 ± 0.0
CysAsp: 0.986 ± 0.232
CysGlu: 0.657 ± 0.218
CysPhe: 0.329 ± 0.115
CysGly: 1.362 ± 0.303
CysHis: 0.376 ± 0.119
CysIle: 0.235 ± 0.1
CysLys: 0.235 ± 0.109
CysLeu: 0.517 ± 0.193
CysMet: 0.235 ± 0.1
CysAsn: 0.282 ± 0.11
CysPro: 0.892 ± 0.241
CysGln: 0.235 ± 0.108
CysArg: 0.751 ± 0.207
CysSer: 0.47 ± 0.152
CysThr: 0.61 ± 0.188
CysVal: 0.798 ± 0.194
CysTrp: 0.235 ± 0.116
CysTyr: 0.329 ± 0.13
CysXaa: 0.0 ± 0.0
Asp
AspAla: 7.184 ± 0.549
AspCys: 0.939 ± 0.241
AspAsp: 4.883 ± 0.584
AspGlu: 5.118 ± 0.744
AspPhe: 1.69 ± 0.235
AspGly: 6.057 ± 0.621
AspHis: 1.268 ± 0.293
AspIle: 3.146 ± 0.414
AspLys: 1.878 ± 0.316
AspLeu: 5.775 ± 0.497
AspMet: 1.033 ± 0.273
AspAsn: 2.113 ± 0.365
AspPro: 5.165 ± 0.598
AspGln: 1.503 ± 0.277
AspArg: 4.226 ± 0.543
AspSer: 2.864 ± 0.376
AspThr: 3.991 ± 0.476
AspVal: 3.944 ± 0.366
AspTrp: 1.127 ± 0.245
AspTyr: 1.831 ± 0.265
AspXaa: 0.0 ± 0.0
Glu
GluAla: 7.466 ± 0.94
GluCys: 0.657 ± 0.191
GluAsp: 3.616 ± 0.596
GluGlu: 2.254 ± 0.304
GluPhe: 2.254 ± 0.33
GluGly: 4.508 ± 0.45
GluHis: 1.268 ± 0.279
GluIle: 3.803 ± 0.608
GluLys: 1.596 ± 0.296
GluLeu: 6.433 ± 0.594
GluMet: 0.845 ± 0.212
GluAsn: 1.127 ± 0.207
GluPro: 3.287 ± 0.569
GluGln: 2.583 ± 0.347
GluArg: 4.273 ± 0.459
GluSer: 1.831 ± 0.319
GluThr: 3.616 ± 0.447
GluVal: 5.916 ± 0.636
GluTrp: 0.986 ± 0.238
GluTyr: 1.08 ± 0.197
GluXaa: 0.0 ± 0.0
Phe
PheAla: 2.911 ± 0.35
PheCys: 0.282 ± 0.102
PheAsp: 1.643 ± 0.268
PheGlu: 1.268 ± 0.22
PhePhe: 0.423 ± 0.124
PheGly: 3.099 ± 0.556
PheHis: 0.47 ± 0.123
PheIle: 0.657 ± 0.177
PheLys: 1.08 ± 0.221
PheLeu: 1.503 ± 0.304
PheMet: 0.282 ± 0.107
PheAsn: 0.892 ± 0.218
PhePro: 1.315 ± 0.241
PheGln: 0.563 ± 0.159
PheArg: 1.503 ± 0.284
PheSer: 1.127 ± 0.241
PheThr: 1.596 ± 0.228
PheVal: 1.456 ± 0.218
PheTrp: 0.329 ± 0.132
PheTyr: 0.657 ± 0.177
PheXaa: 0.0 ± 0.0
Gly
GlyAla: 9.485 ± 1.387
GlyCys: 1.127 ± 0.298
GlyAsp: 5.494 ± 0.565
GlyGlu: 5.071 ± 0.552
GlyPhe: 1.972 ± 0.317
GlyGly: 13.194 ± 2.427
GlyHis: 1.925 ± 0.347
GlyIle: 4.789 ± 0.387
GlyLys: 3.569 ± 0.426
GlyLeu: 7.748 ± 1.002
GlyMet: 2.16 ± 0.353
GlyAsn: 2.489 ± 0.453
GlyPro: 4.508 ± 0.598
GlyGln: 3.475 ± 0.568
GlyArg: 5.822 ± 0.694
GlySer: 5.306 ± 0.473
GlyThr: 6.996 ± 0.706
GlyVal: 6.715 ± 0.68
GlyTrp: 1.972 ± 0.385
GlyTyr: 2.77 ± 0.316
GlyXaa: 0.0 ± 0.0
His
HisAla: 1.925 ± 0.351
HisCys: 0.235 ± 0.141
HisAsp: 1.08 ± 0.234
HisGlu: 1.55 ± 0.32
HisPhe: 0.47 ± 0.144
HisGly: 1.55 ± 0.254
HisHis: 0.704 ± 0.281
HisIle: 1.503 ± 0.318
HisLys: 0.47 ± 0.177
HisLeu: 1.315 ± 0.306
HisMet: 0.517 ± 0.15
HisAsn: 0.47 ± 0.139
HisPro: 1.127 ± 0.261
HisGln: 0.329 ± 0.134
HisArg: 2.113 ± 0.355
HisSer: 0.892 ± 0.213
HisThr: 1.878 ± 0.336
HisVal: 1.08 ± 0.234
HisTrp: 0.188 ± 0.072
HisTyr: 0.563 ± 0.177
HisXaa: 0.0 ± 0.0
Ile
IleAla: 5.729 ± 0.46
IleCys: 0.517 ± 0.156
IleAsp: 3.897 ± 0.458
IleGlu: 4.836 ± 0.516
IlePhe: 0.892 ± 0.206
IleGly: 5.165 ± 0.524
IleHis: 0.61 ± 0.167
IleIle: 1.643 ± 0.297
IleLys: 1.409 ± 0.356
IleLeu: 2.817 ± 0.334
IleMet: 0.751 ± 0.159
IleAsn: 1.362 ± 0.256
IlePro: 2.723 ± 0.379
IleGln: 0.892 ± 0.226
IleArg: 2.348 ± 0.404
IleSer: 2.489 ± 0.337
IleThr: 3.522 ± 0.354
IleVal: 3.052 ± 0.456
IleTrp: 0.329 ± 0.113
IleTyr: 0.892 ± 0.217
IleXaa: 0.0 ± 0.0
Lys
LysAla: 4.038 ± 0.533
LysCys: 0.141 ± 0.085
LysAsp: 1.456 ± 0.242
LysGlu: 0.798 ± 0.271
LysPhe: 0.657 ± 0.169
LysGly: 2.583 ± 0.341
LysHis: 0.517 ± 0.159
LysIle: 1.643 ± 0.321
LysLys: 0.892 ± 0.222
LysLeu: 2.958 ± 0.363
LysMet: 0.376 ± 0.154
LysAsn: 0.751 ± 0.219
LysPro: 1.315 ± 0.268
LysGln: 1.456 ± 0.245
LysArg: 2.113 ± 0.375
LysSer: 1.362 ± 0.206
LysThr: 2.395 ± 0.34
LysVal: 2.77 ± 0.307
LysTrp: 0.329 ± 0.106
LysTyr: 0.704 ± 0.186
LysXaa: 0.0 ± 0.0
Leu
LeuAla: 11.645 ± 0.776
LeuCys: 1.033 ± 0.249
LeuAsp: 6.386 ± 0.678
LeuGlu: 3.099 ± 0.313
LeuPhe: 1.925 ± 0.433
LeuGly: 7.372 ± 1.081
LeuHis: 1.08 ± 0.202
LeuIle: 3.85 ± 0.495
LeuLys: 2.301 ± 0.429
LeuLeu: 5.212 ± 0.624
LeuMet: 1.643 ± 0.237
LeuAsn: 3.334 ± 0.583
LeuPro: 5.869 ± 0.582
LeuGln: 1.596 ± 0.271
LeuArg: 4.038 ± 0.478
LeuSer: 3.991 ± 0.463
LeuThr: 7.184 ± 0.517
LeuVal: 4.836 ± 0.511
LeuTrp: 1.268 ± 0.261
LeuTyr: 1.737 ± 0.332
LeuXaa: 0.0 ± 0.0
Met
MetAla: 2.489 ± 0.339
MetCys: 0.235 ± 0.116
MetAsp: 0.657 ± 0.181
MetGlu: 0.704 ± 0.125
MetPhe: 0.798 ± 0.157
MetGly: 1.456 ± 0.223
MetHis: 0.376 ± 0.144
MetIle: 0.892 ± 0.182
MetLys: 0.657 ± 0.166
MetLeu: 1.456 ± 0.231
MetMet: 0.47 ± 0.145
MetAsn: 0.423 ± 0.141
MetPro: 1.503 ± 0.289
MetGln: 0.376 ± 0.138
MetArg: 1.221 ± 0.232
MetSer: 1.878 ± 0.344
MetThr: 2.77 ± 0.283
MetVal: 1.409 ± 0.268
MetTrp: 0.61 ± 0.185
MetTyr: 0.517 ± 0.163
MetXaa: 0.0 ± 0.0
Asn
AsnAla: 3.381 ± 0.37
AsnCys: 0.235 ± 0.103
AsnAsp: 2.019 ± 0.323
AsnGlu: 1.268 ± 0.223
AsnPhe: 0.329 ± 0.102
AsnGly: 3.428 ± 0.445
AsnHis: 0.657 ± 0.19
AsnIle: 1.409 ± 0.306
AsnLys: 1.08 ± 0.26
AsnLeu: 1.878 ± 0.327
AsnMet: 0.47 ± 0.118
AsnAsn: 1.268 ± 0.315
AsnPro: 1.831 ± 0.284
AsnGln: 0.376 ± 0.126
AsnArg: 2.019 ± 0.374
AsnSer: 1.55 ± 0.334
AsnThr: 2.536 ± 0.421
AsnVal: 1.831 ± 0.268
AsnTrp: 0.376 ± 0.143
AsnTyr: 0.939 ± 0.209
AsnXaa: 0.0 ± 0.0
Pro
ProAla: 7.278 ± 0.633
ProCys: 0.329 ± 0.113
ProAsp: 4.273 ± 0.628
ProGlu: 5.588 ± 0.806
ProPhe: 1.315 ± 0.237
ProGly: 6.104 ± 0.725
ProHis: 1.221 ± 0.275
ProIle: 2.536 ± 0.471
ProLys: 1.221 ± 0.198
ProLeu: 3.522 ± 0.412
ProMet: 1.174 ± 0.205
ProAsn: 2.066 ± 0.283
ProPro: 4.555 ± 0.713
ProGln: 1.878 ± 0.296
ProArg: 3.616 ± 0.467
ProSer: 2.442 ± 0.359
ProThr: 4.038 ± 0.352
ProVal: 4.836 ± 0.511
ProTrp: 1.409 ± 0.242
ProTyr: 1.033 ± 0.25
ProXaa: 0.0 ± 0.0
Gln
GlnAla: 4.273 ± 0.74
GlnCys: 0.282 ± 0.135
GlnAsp: 1.127 ± 0.234
GlnGlu: 0.892 ± 0.192
GlnPhe: 0.986 ± 0.211
GlnGly: 2.254 ± 0.378
GlnHis: 0.845 ± 0.197
GlnIle: 1.972 ± 0.323
GlnLys: 0.845 ± 0.222
GlnLeu: 2.536 ± 0.359
GlnMet: 0.563 ± 0.154
GlnAsn: 0.704 ± 0.19
GlnPro: 1.69 ± 0.255
GlnGln: 1.596 ± 0.221
GlnArg: 3.052 ± 0.452
GlnSer: 1.596 ± 0.275
GlnThr: 2.301 ± 0.278
GlnVal: 2.019 ± 0.299
GlnTrp: 0.704 ± 0.184
GlnTyr: 0.704 ± 0.207
GlnXaa: 0.0 ± 0.0
Arg
ArgAla: 7.278 ± 0.653
ArgCys: 0.657 ± 0.187
ArgAsp: 4.32 ± 0.456
ArgGlu: 4.977 ± 0.61
ArgPhe: 1.456 ± 0.245
ArgGly: 4.085 ± 0.521
ArgHis: 1.878 ± 0.358
ArgIle: 2.348 ± 0.297
ArgLys: 1.878 ± 0.347
ArgLeu: 5.729 ± 0.439
ArgMet: 1.737 ± 0.414
ArgAsn: 2.348 ± 0.373
ArgPro: 3.897 ± 0.514
ArgGln: 2.536 ± 0.299
ArgArg: 6.245 ± 0.686
ArgSer: 2.77 ± 0.304
ArgThr: 3.897 ± 0.467
ArgVal: 5.447 ± 0.61
ArgTrp: 1.737 ± 0.296
ArgTyr: 2.254 ± 0.406
ArgXaa: 0.0 ± 0.0
Ser
SerAla: 4.93 ± 0.58
SerCys: 0.47 ± 0.116
SerAsp: 3.334 ± 0.252
SerGlu: 2.254 ± 0.336
SerPhe: 0.751 ± 0.22
SerGly: 5.963 ± 0.686
SerHis: 0.704 ± 0.175
SerIle: 1.643 ± 0.354
SerLys: 1.221 ± 0.267
SerLeu: 4.038 ± 0.697
SerMet: 1.268 ± 0.234
SerAsn: 1.08 ± 0.247
SerPro: 3.193 ± 0.402
SerGln: 2.019 ± 0.331
SerArg: 3.428 ± 0.28
SerSer: 2.583 ± 0.38
SerThr: 3.991 ± 0.476
SerVal: 3.803 ± 0.455
SerTrp: 1.127 ± 0.216
SerTyr: 1.596 ± 0.224
SerXaa: 0.0 ± 0.0
Thr
ThrAla: 8.828 ± 0.593
ThrCys: 0.423 ± 0.154
ThrAsp: 5.353 ± 0.459
ThrGlu: 3.616 ± 0.455
ThrPhe: 1.69 ± 0.279
ThrGly: 7.043 ± 0.548
ThrHis: 0.986 ± 0.282
ThrIle: 3.709 ± 0.475
ThrLys: 2.536 ± 0.29
ThrLeu: 5.259 ± 0.463
ThrMet: 1.784 ± 0.3
ThrAsn: 1.972 ± 0.386
ThrPro: 5.024 ± 0.602
ThrGln: 2.019 ± 0.274
ThrArg: 3.756 ± 0.462
ThrSer: 4.179 ± 0.347
ThrThr: 3.897 ± 0.489
ThrVal: 6.433 ± 0.581
ThrTrp: 1.221 ± 0.273
ThrTyr: 1.503 ± 0.289
ThrXaa: 0.0 ± 0.0
Val
ValAla: 8.405 ± 0.47
ValCys: 0.892 ± 0.223
ValAsp: 4.508 ± 0.45
ValGlu: 5.916 ± 0.661
ValPhe: 1.456 ± 0.222
ValGly: 7.466 ± 0.557
ValHis: 1.596 ± 0.308
ValIle: 3.569 ± 0.36
ValLys: 1.972 ± 0.311
ValLeu: 6.527 ± 0.404
ValMet: 1.456 ± 0.289
ValAsn: 2.113 ± 0.303
ValPro: 3.85 ± 0.347
ValGln: 1.737 ± 0.282
ValArg: 4.602 ± 0.425
ValSer: 4.132 ± 0.375
ValThr: 5.118 ± 0.608
ValVal: 6.245 ± 0.585
ValTrp: 1.878 ± 0.351
ValTyr: 1.972 ± 0.405
ValXaa: 0.0 ± 0.0
Trp
TrpAla: 2.442 ± 0.369
TrpCys: 0.517 ± 0.192
TrpAsp: 1.174 ± 0.239
TrpGlu: 0.517 ± 0.121
TrpPhe: 0.704 ± 0.274
TrpGly: 1.08 ± 0.214
TrpHis: 0.47 ± 0.182
TrpIle: 0.939 ± 0.192
TrpLys: 0.235 ± 0.097
TrpLeu: 1.55 ± 0.347
TrpMet: 0.47 ± 0.132
TrpAsn: 0.282 ± 0.112
TrpPro: 0.845 ± 0.223
TrpGln: 0.657 ± 0.18
TrpArg: 1.409 ± 0.205
TrpSer: 1.362 ± 0.257
TrpThr: 1.456 ± 0.272
TrpVal: 1.55 ± 0.378
TrpTrp: 0.47 ± 0.153
TrpTyr: 0.423 ± 0.163
TrpXaa: 0.0 ± 0.0
Tyr
TyrAla: 3.381 ± 0.423
TyrCys: 0.235 ± 0.117
TyrAsp: 2.16 ± 0.28
TyrGlu: 1.503 ± 0.279
TyrPhe: 0.61 ± 0.179
TyrGly: 2.113 ± 0.389
TyrHis: 0.47 ± 0.149
TyrIle: 1.033 ± 0.217
TyrLys: 0.563 ± 0.196
TyrLeu: 1.972 ± 0.394
TyrMet: 0.329 ± 0.125
TyrAsn: 0.751 ± 0.153
TyrPro: 1.127 ± 0.31
TyrGln: 0.892 ± 0.253
TyrArg: 2.254 ± 0.418
TyrSer: 0.986 ± 0.203
TyrThr: 1.643 ± 0.274
TyrVal: 2.254 ± 0.279
TyrTrp: 0.282 ± 0.102
TyrTyr: 0.61 ± 0.189
TyrXaa: 0.0 ± 0.0
Xaa
XaaAla: 0.0 ± 0.0
XaaCys: 0.0 ± 0.0
XaaAsp: 0.0 ± 0.0
XaaGlu: 0.0 ± 0.0
XaaPhe: 0.0 ± 0.0
XaaGly: 0.0 ± 0.0
XaaHis: 0.0 ± 0.0
XaaIle: 0.0 ± 0.0
XaaLys: 0.0 ± 0.0
XaaLeu: 0.0 ± 0.0
XaaMet: 0.0 ± 0.0
XaaAsn: 0.0 ± 0.0
XaaPro: 0.0 ± 0.0
XaaGln: 0.0 ± 0.0
XaaArg: 0.0 ± 0.0
XaaSer: 0.0 ± 0.0
XaaThr: 0.0 ± 0.0
XaaVal: 0.0 ± 0.0
XaaTrp: 0.0 ± 0.0
XaaTyr: 0.0 ± 0.0
XaaXaa: 0.0 ± 0.0
Statistics based on 97 proteins (21298 amino acids)

Note: The error was estimated by bootstrapping (100 replicates) at the protein level.
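A protein-level bootstrap of this kind can be sketched as follows. This is an illustration only, resting on assumptions this page does not state: whole proteins are resampled with replacement, the dipeptide frequency is recomputed for each of the 100 replicates, and the reported error is taken to be the standard deviation across replicates. The function names, the seed parameter, and the toy sequences are hypothetical.

```python
import random
import statistics
from collections import Counter


def dipeptide_per_mille(proteins, dipeptide):
    """Frequency of one dipeptide (one-letter code), in per mille."""
    counts = Counter()
    for seq in proteins:
        # Overlapping adjacent pairs within each protein.
        counts.update(seq[i:i + 2] for i in range(len(seq) - 1))
    total = sum(counts.values())
    return 1000.0 * counts[dipeptide] / total if total else 0.0


def bootstrap_error(proteins, dipeptide, replicates=100, seed=0):
    """Standard deviation of the frequency over protein-level resamples."""
    rng = random.Random(seed)
    estimates = []
    for _ in range(replicates):
        # Resample whole proteins with replacement, keeping the proteome size.
        sample = rng.choices(proteins, k=len(proteins))
        estimates.append(dipeptide_per_mille(sample, dipeptide))
    return statistics.stdev(estimates)


# Toy sequences, not taken from the DonSanchon proteome.
toy = ["MAAG", "AACAA", "MKAAL", "GGAV"]
print(dipeptide_per_mille(toy, "AA"))
print(bootstrap_error(toy, "AA"))
```

Resampling proteins rather than individual residues keeps each sequence intact, so the spread reflects how much the estimate depends on which proteins happen to be in the proteome.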

The dipeptide statistics above (among other statistics for this proteome) can be downloaded from this CSV file
See this proteome in: uniprot_link
Proteome-pI is available under a Creative Commons Attribution-NoDerivs license; for more details see here

Reference: Kozlowski LP. Proteome-pI 2.0: Proteome Isoelectric Point Database Update. Nucleic Acids Res. 2021, doi: 10.1093/nar/gkab944
Contact: Lukasz P. Kozlowski