Amino acid dipeptide frequency for Mycobacterium virus Dorothy

Apart from single amino acid frequencies, one can also calculate the so-called amino acid dipeptide frequency, that is, how often each ordered pair of adjacent amino acids occurs in the proteome.
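As an illustration of the calculation, here is a minimal sketch that counts overlapping adjacent pairs within each protein and reports them in per mille. The function names, the toy sequences, and the normalization by the total number of pairs are assumptions made for this example; the database may count or normalize differently.

```python
from collections import Counter

# The 20 standard amino acids in one-letter code; anything else is mapped to "X" (Xaa).
STANDARD = set("ACDEFGHIKLMNPQRSTVWY")


def dipeptide_counts(proteins):
    """Count overlapping adjacent residue pairs; pairs never span two proteins."""
    counts = Counter()
    for seq in proteins:
        seq = "".join(aa if aa in STANDARD else "X" for aa in seq.upper())
        counts.update(seq[i:i + 2] for i in range(len(seq) - 1))
    return counts


def dipeptide_per_mille(proteins):
    """Return dipeptide frequencies in per mille of all counted pairs (assumed normalization)."""
    counts = dipeptide_counts(proteins)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {pair: 1000.0 * n / total for pair, n in counts.items()}


# Hypothetical toy sequences, not taken from the proteome.
toy = ["MAAG", "AAB"]
freqs = dipeptide_per_mille(toy)
for pair, value in sorted(freqs.items()):
    print(pair, round(value, 3))
```

In the toy input, the non-standard letter B is treated as Xaa, so the second sequence contributes the pairs AA and AX.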

There are 400 possible dipeptides (441 if Xaa is counted as an additional 21st amino acid representing all non-standard or unknown amino acids). Thus, if dipeptides occurred randomly in proteins and all 20 standard amino acids were equally frequent, each dipeptide would be present at a frequency of 0.25% (2.5‰). As this is not the case in nature, for better visibility dipeptides that are more frequent than expected are marked in red, and those that are underrepresented are marked in blue in the table.

All values are given in per mille (‰) and therefore need to be multiplied by 10⁻³ to obtain fractions.
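The unit conversion and the comparison with the random expectation can be sketched as follows. This is only an illustration: the helper names are hypothetical, and using the uniform value of 1/400 as the threshold for over- and underrepresentation is an assumption about how the table is coloured. The three sample values are copied from the table.

```python
# Uniform expectation for 400 equally likely dipeptides, in per mille (1000 / 400 = 2.5).
EXPECTED_PER_MILLE = 1000.0 / 400


def to_fraction(per_mille):
    """Convert a per mille value to a plain fraction by multiplying by 10^-3."""
    return per_mille * 1e-3


def classify(per_mille, expected=EXPECTED_PER_MILLE):
    """Label a dipeptide relative to the assumed uniform expectation."""
    if per_mille > expected:
        return "over"
    if per_mille < expected:
        return "under"
    return "expected"


# Values taken from the table (per mille).
values = {"AlaAla": 14.286, "CysCys": 0.107, "LysLeu": 2.505}
labels = {name: classify(v) for name, v in values.items()}
for name, v in values.items():
    print(name, to_fraction(v), labels[name])
```

A threshold that also accounts for the ± error of each cell would be a natural refinement, but the source does not say whether the colouring does so.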

For more information, see the sequence space article on Wikipedia.

Ala Cys Asp Glu Phe Gly His Ile Lys Leu Met Asn Pro Gln Arg Ser Thr Val Trp Tyr Xaa
Ala
14.286AlaAla: 14.286 ± 1.612
0.746AlaCys: 0.746 ± 0.191
6.93AlaAsp: 6.93 ± 0.649
7.036AlaGlu: 7.036 ± 0.681
2.825AlaPhe: 2.825 ± 0.342
9.968AlaGly: 9.968 ± 1.359
2.186AlaHis: 2.186 ± 0.367
4.051AlaIle: 4.051 ± 0.531
4.318AlaLys: 4.318 ± 0.388
8.369AlaLeu: 8.369 ± 0.663
2.239AlaMet: 2.239 ± 0.317
2.719AlaAsn: 2.719 ± 0.37
4.851AlaPro: 4.851 ± 0.575
3.731AlaGln: 3.731 ± 0.466
7.196AlaArg: 7.196 ± 0.686
5.224AlaSer: 5.224 ± 0.504
5.544AlaThr: 5.544 ± 0.493
6.503AlaVal: 6.503 ± 0.519
2.239AlaTrp: 2.239 ± 0.322
2.345AlaTyr: 2.345 ± 0.35
0.0AlaXaa: 0.0 ± 0.0
Cys
0.746CysAla: 0.746 ± 0.258
0.107CysCys: 0.107 ± 0.065
1.386CysAsp: 1.386 ± 0.334
1.066CysGlu: 1.066 ± 0.232
0.107CysPhe: 0.107 ± 0.086
1.919CysGly: 1.919 ± 0.365
0.213CysHis: 0.213 ± 0.108
0.267CysIle: 0.267 ± 0.126
0.373CysLys: 0.373 ± 0.148
0.959CysLeu: 0.959 ± 0.289
0.16CysMet: 0.16 ± 0.089
0.426CysAsn: 0.426 ± 0.154
1.119CysPro: 1.119 ± 0.27
0.48CysGln: 0.48 ± 0.159
0.906CysArg: 0.906 ± 0.246
0.693CysSer: 0.693 ± 0.192
0.64CysThr: 0.64 ± 0.187
0.693CysVal: 0.693 ± 0.193
0.267CysTrp: 0.267 ± 0.118
0.16CysTyr: 0.16 ± 0.086
0.0CysXaa: 0.0 ± 0.0
Asp
6.237AspAla: 6.237 ± 0.564
0.8AspCys: 0.8 ± 0.193
4.531AspAsp: 4.531 ± 0.493
3.145AspGlu: 3.145 ± 0.407
1.706AspPhe: 1.706 ± 0.22
6.663AspGly: 6.663 ± 0.622
1.386AspHis: 1.386 ± 0.33
2.612AspIle: 2.612 ± 0.358
1.919AspLys: 1.919 ± 0.31
6.13AspLeu: 6.13 ± 0.496
1.333AspMet: 1.333 ± 0.289
1.599AspAsn: 1.599 ± 0.32
4.531AspPro: 4.531 ± 0.498
2.665AspGln: 2.665 ± 0.325
5.277AspArg: 5.277 ± 0.625
3.571AspSer: 3.571 ± 0.555
4.158AspThr: 4.158 ± 0.441
4.158AspVal: 4.158 ± 0.495
1.866AspTrp: 1.866 ± 0.273
2.132AspTyr: 2.132 ± 0.33
0.0AspXaa: 0.0 ± 0.0
Glu
6.93GluAla: 6.93 ± 0.718
0.959GluCys: 0.959 ± 0.226
2.985GluAsp: 2.985 ± 0.354
2.878GluGlu: 2.878 ± 0.5
2.292GluPhe: 2.292 ± 0.349
3.625GluGly: 3.625 ± 0.394
1.759GluHis: 1.759 ± 0.382
2.026GluIle: 2.026 ± 0.315
2.026GluLys: 2.026 ± 0.286
5.757GluLeu: 5.757 ± 0.667
1.493GluMet: 1.493 ± 0.278
1.972GluAsn: 1.972 ± 0.317
2.665GluPro: 2.665 ± 0.368
2.665GluGln: 2.665 ± 0.398
4.691GluArg: 4.691 ± 0.68
3.198GluSer: 3.198 ± 0.454
4.371GluThr: 4.371 ± 0.551
4.104GluVal: 4.104 ± 0.458
1.546GluTrp: 1.546 ± 0.273
1.866GluTyr: 1.866 ± 0.355
0.0GluXaa: 0.0 ± 0.0
Phe
2.932PheAla: 2.932 ± 0.432
0.213PheCys: 0.213 ± 0.099
2.505PheAsp: 2.505 ± 0.389
1.812PheGlu: 1.812 ± 0.275
1.066PhePhe: 1.066 ± 0.238
2.932PheGly: 2.932 ± 0.603
0.586PheHis: 0.586 ± 0.176
1.493PheIle: 1.493 ± 0.328
1.173PheLys: 1.173 ± 0.271
1.599PheLeu: 1.599 ± 0.275
0.64PheMet: 0.64 ± 0.194
1.173PheAsn: 1.173 ± 0.337
1.706PhePro: 1.706 ± 0.27
1.173PheGln: 1.173 ± 0.335
1.599PheArg: 1.599 ± 0.252
1.546PheSer: 1.546 ± 0.328
2.079PheThr: 2.079 ± 0.362
1.919PheVal: 1.919 ± 0.258
0.64PheTrp: 0.64 ± 0.184
0.906PheTyr: 0.906 ± 0.231
0.0PheXaa: 0.0 ± 0.0
Gly
9.275GlyAla: 9.275 ± 1.311
1.119GlyCys: 1.119 ± 0.265
6.13GlyAsp: 6.13 ± 0.503
4.211GlyGlu: 4.211 ± 0.513
3.038GlyPhe: 3.038 ± 0.526
10.661GlyGly: 10.661 ± 2.351
2.026GlyHis: 2.026 ± 0.282
4.211GlyIle: 4.211 ± 0.62
2.772GlyLys: 2.772 ± 0.364
5.65GlyLeu: 5.65 ± 0.598
2.186GlyMet: 2.186 ± 0.452
3.358GlyAsn: 3.358 ± 0.358
4.371GlyPro: 4.371 ± 0.536
2.399GlyGln: 2.399 ± 0.529
5.49GlyArg: 5.49 ± 0.637
5.704GlySer: 5.704 ± 0.87
6.823GlyThr: 6.823 ± 0.894
5.917GlyVal: 5.917 ± 0.535
2.665GlyTrp: 2.665 ± 0.391
1.919GlyTyr: 1.919 ± 0.368
0.0GlyXaa: 0.0 ± 0.0
His
1.493HisAla: 1.493 ± 0.253
0.373HisCys: 0.373 ± 0.187
1.013HisAsp: 1.013 ± 0.206
1.439HisGlu: 1.439 ± 0.307
0.533HisPhe: 0.533 ± 0.162
1.546HisGly: 1.546 ± 0.324
0.906HisHis: 0.906 ± 0.245
1.599HisIle: 1.599 ± 0.272
0.906HisLys: 0.906 ± 0.219
1.652HisLeu: 1.652 ± 0.299
0.533HisMet: 0.533 ± 0.13
0.959HisAsn: 0.959 ± 0.207
1.493HisPro: 1.493 ± 0.278
0.746HisGln: 0.746 ± 0.197
2.186HisArg: 2.186 ± 0.355
0.8HisSer: 0.8 ± 0.183
1.493HisThr: 1.493 ± 0.356
1.279HisVal: 1.279 ± 0.278
0.48HisTrp: 0.48 ± 0.152
0.693HisTyr: 0.693 ± 0.158
0.0HisXaa: 0.0 ± 0.0
Ile
5.171IleAla: 5.171 ± 0.471
0.906IleCys: 0.906 ± 0.26
3.838IleAsp: 3.838 ± 0.428
3.465IleGlu: 3.465 ± 0.352
0.586IlePhe: 0.586 ± 0.213
3.945IleGly: 3.945 ± 0.457
1.279IleHis: 1.279 ± 0.31
1.493IleIle: 1.493 ± 0.289
1.013IleLys: 1.013 ± 0.244
2.452IleLeu: 2.452 ± 0.393
0.267IleMet: 0.267 ± 0.087
2.079IleAsn: 2.079 ± 0.354
2.559IlePro: 2.559 ± 0.309
1.226IleGln: 1.226 ± 0.24
2.505IleArg: 2.505 ± 0.326
2.026IleSer: 2.026 ± 0.396
3.412IleThr: 3.412 ± 0.42
3.252IleVal: 3.252 ± 0.34
1.066IleTrp: 1.066 ± 0.264
0.959IleTyr: 0.959 ± 0.227
0.0IleXaa: 0.0 ± 0.0
Lys
3.678LysAla: 3.678 ± 0.427
0.746LysCys: 0.746 ± 0.216
1.812LysAsp: 1.812 ± 0.296
1.652LysGlu: 1.652 ± 0.288
1.173LysPhe: 1.173 ± 0.207
2.399LysGly: 2.399 ± 0.351
0.906LysHis: 0.906 ± 0.257
0.8LysIle: 0.8 ± 0.248
1.493LysLys: 1.493 ± 0.462
2.505LysLeu: 2.505 ± 0.475
0.693LysMet: 0.693 ± 0.159
0.906LysAsn: 0.906 ± 0.202
2.505LysPro: 2.505 ± 0.439
1.652LysGln: 1.652 ± 0.268
2.932LysArg: 2.932 ± 0.446
2.026LysSer: 2.026 ± 0.367
1.919LysThr: 1.919 ± 0.344
2.612LysVal: 2.612 ± 0.45
0.906LysTrp: 0.906 ± 0.273
1.066LysTyr: 1.066 ± 0.275
0.0LysXaa: 0.0 ± 0.0
Leu
7.889LeuAla: 7.889 ± 0.815
0.853LeuCys: 0.853 ± 0.237
4.638LeuAsp: 4.638 ± 0.452
4.478LeuGlu: 4.478 ± 0.455
2.665LeuPhe: 2.665 ± 0.308
5.011LeuGly: 5.011 ± 0.532
1.066LeuHis: 1.066 ± 0.284
3.038LeuIle: 3.038 ± 0.42
2.612LeuLys: 2.612 ± 0.337
5.277LeuLeu: 5.277 ± 0.597
2.079LeuMet: 2.079 ± 0.319
2.345LeuAsn: 2.345 ± 0.472
5.171LeuPro: 5.171 ± 0.677
2.665LeuGln: 2.665 ± 0.386
5.384LeuArg: 5.384 ± 0.647
4.851LeuSer: 4.851 ± 0.565
4.957LeuThr: 4.957 ± 0.467
4.957LeuVal: 4.957 ± 0.444
1.546LeuTrp: 1.546 ± 0.307
2.239LeuTyr: 2.239 ± 0.339
0.0LeuXaa: 0.0 ± 0.0
Met
2.026MetAla: 2.026 ± 0.309
0.32MetCys: 0.32 ± 0.17
1.599MetAsp: 1.599 ± 0.3
1.119MetGlu: 1.119 ± 0.222
0.693MetPhe: 0.693 ± 0.183
2.026MetGly: 2.026 ± 0.287
0.16MetHis: 0.16 ± 0.093
1.119MetIle: 1.119 ± 0.293
0.586MetLys: 0.586 ± 0.149
1.759MetLeu: 1.759 ± 0.273
0.533MetMet: 0.533 ± 0.213
1.119MetAsn: 1.119 ± 0.231
1.119MetPro: 1.119 ± 0.278
0.586MetGln: 0.586 ± 0.153
1.652MetArg: 1.652 ± 0.276
2.559MetSer: 2.559 ± 0.406
1.706MetThr: 1.706 ± 0.291
1.333MetVal: 1.333 ± 0.365
0.213MetTrp: 0.213 ± 0.103
0.32MetTyr: 0.32 ± 0.127
0.0MetXaa: 0.0 ± 0.0
Asn
3.465AsnAla: 3.465 ± 0.429
0.16AsnCys: 0.16 ± 0.094
1.759AsnAsp: 1.759 ± 0.309
2.186AsnGlu: 2.186 ± 0.393
0.8AsnPhe: 0.8 ± 0.235
4.051AsnGly: 4.051 ± 0.511
0.959AsnHis: 0.959 ± 0.194
1.706AsnIle: 1.706 ± 0.423
1.013AsnLys: 1.013 ± 0.225
2.399AsnLeu: 2.399 ± 0.46
0.586AsnMet: 0.586 ± 0.145
1.759AsnAsn: 1.759 ± 0.431
2.878AsnPro: 2.878 ± 0.34
1.279AsnGln: 1.279 ± 0.373
2.079AsnArg: 2.079 ± 0.395
1.599AsnSer: 1.599 ± 0.316
1.866AsnThr: 1.866 ± 0.312
1.919AsnVal: 1.919 ± 0.286
0.8AsnTrp: 0.8 ± 0.187
0.586AsnTyr: 0.586 ± 0.146
0.0AsnXaa: 0.0 ± 0.0
Pro
4.904ProAla: 4.904 ± 0.616
0.693ProCys: 0.693 ± 0.192
4.371ProAsp: 4.371 ± 0.493
3.945ProGlu: 3.945 ± 0.331
1.493ProPhe: 1.493 ± 0.341
6.557ProGly: 6.557 ± 0.745
1.652ProHis: 1.652 ± 0.267
1.972ProIle: 1.972 ± 0.326
2.345ProLys: 2.345 ± 0.335
4.104ProLeu: 4.104 ± 0.501
1.706ProMet: 1.706 ± 0.374
2.292ProAsn: 2.292 ± 0.305
3.945ProPro: 3.945 ± 0.604
1.972ProGln: 1.972 ± 0.364
3.305ProArg: 3.305 ± 0.474
3.358ProSer: 3.358 ± 0.461
3.305ProThr: 3.305 ± 0.392
4.797ProVal: 4.797 ± 0.477
1.119ProTrp: 1.119 ± 0.215
1.546ProTyr: 1.546 ± 0.278
0.0ProXaa: 0.0 ± 0.0
Gln
4.318GlnAla: 4.318 ± 0.631
0.426GlnCys: 0.426 ± 0.211
1.546GlnAsp: 1.546 ± 0.256
2.132GlnGlu: 2.132 ± 0.315
1.173GlnPhe: 1.173 ± 0.211
2.132GlnGly: 2.132 ± 0.412
0.693GlnHis: 0.693 ± 0.199
1.866GlnIle: 1.866 ± 0.33
1.386GlnLys: 1.386 ± 0.251
3.038GlnLeu: 3.038 ± 0.426
0.746GlnMet: 0.746 ± 0.217
0.853GlnAsn: 0.853 ± 0.177
2.345GlnPro: 2.345 ± 0.397
1.386GlnGln: 1.386 ± 0.277
2.825GlnArg: 2.825 ± 0.365
2.132GlnSer: 2.132 ± 0.339
1.706GlnThr: 1.706 ± 0.352
2.505GlnVal: 2.505 ± 0.345
1.066GlnTrp: 1.066 ± 0.361
1.173GlnTyr: 1.173 ± 0.285
0.0GlnXaa: 0.0 ± 0.0
Arg
6.663ArgAla: 6.663 ± 0.568
1.386ArgCys: 1.386 ± 0.379
4.904ArgAsp: 4.904 ± 0.587
5.33ArgGlu: 5.33 ± 0.7
1.759ArgPhe: 1.759 ± 0.379
4.371ArgGly: 4.371 ± 0.446
1.439ArgHis: 1.439 ± 0.303
4.264ArgIle: 4.264 ± 0.579
2.186ArgLys: 2.186 ± 0.385
4.744ArgLeu: 4.744 ± 0.608
2.612ArgMet: 2.612 ± 0.379
2.399ArgAsn: 2.399 ± 0.343
3.945ArgPro: 3.945 ± 0.495
2.186ArgGln: 2.186 ± 0.398
5.81ArgArg: 5.81 ± 0.925
3.571ArgSer: 3.571 ± 0.336
3.305ArgThr: 3.305 ± 0.462
5.064ArgVal: 5.064 ± 0.543
2.186ArgTrp: 2.186 ± 0.396
2.239ArgTyr: 2.239 ± 0.361
0.0ArgXaa: 0.0 ± 0.0
Ser
5.757SerAla: 5.757 ± 0.639
0.426SerCys: 0.426 ± 0.174
4.264SerAsp: 4.264 ± 0.475
3.465SerGlu: 3.465 ± 0.461
2.079SerPhe: 2.079 ± 0.408
6.397SerGly: 6.397 ± 0.971
0.853SerHis: 0.853 ± 0.247
2.559SerIle: 2.559 ± 0.409
2.186SerLys: 2.186 ± 0.34
3.465SerLeu: 3.465 ± 0.363
1.226SerMet: 1.226 ± 0.234
2.079SerAsn: 2.079 ± 0.464
3.198SerPro: 3.198 ± 0.353
1.599SerGln: 1.599 ± 0.247
3.625SerArg: 3.625 ± 0.434
3.518SerSer: 3.518 ± 0.545
3.412SerThr: 3.412 ± 0.478
4.744SerVal: 4.744 ± 0.648
1.226SerTrp: 1.226 ± 0.285
1.226SerTyr: 1.226 ± 0.239
0.0SerXaa: 0.0 ± 0.0
Thr
6.023ThrAla: 6.023 ± 0.576
0.48ThrCys: 0.48 ± 0.157
3.625ThrAsp: 3.625 ± 0.558
3.731ThrGlu: 3.731 ± 0.461
1.759ThrPhe: 1.759 ± 0.341
6.93ThrGly: 6.93 ± 0.62
1.599ThrHis: 1.599 ± 0.343
3.252ThrIle: 3.252 ± 0.425
1.972ThrLys: 1.972 ± 0.293
4.264ThrLeu: 4.264 ± 0.539
1.066ThrMet: 1.066 ± 0.21
2.239ThrAsn: 2.239 ± 0.353
4.158ThrPro: 4.158 ± 0.441
1.919ThrGln: 1.919 ± 0.295
3.945ThrArg: 3.945 ± 0.385
3.998ThrSer: 3.998 ± 0.453
4.638ThrThr: 4.638 ± 0.646
5.33ThrVal: 5.33 ± 0.618
1.226ThrTrp: 1.226 ± 0.271
1.706ThrTyr: 1.706 ± 0.272
0.0ThrXaa: 0.0 ± 0.0
Val
6.93ValAla: 6.93 ± 0.586
1.439ValCys: 1.439 ± 0.268
5.33ValAsp: 5.33 ± 0.537
3.838ValGlu: 3.838 ± 0.557
2.505ValPhe: 2.505 ± 0.407
5.384ValGly: 5.384 ± 0.717
1.226ValHis: 1.226 ± 0.245
2.932ValIle: 2.932 ± 0.409
2.452ValLys: 2.452 ± 0.331
5.544ValLeu: 5.544 ± 0.559
1.386ValMet: 1.386 ± 0.243
2.132ValAsn: 2.132 ± 0.393
3.945ValPro: 3.945 ± 0.446
2.772ValGln: 2.772 ± 0.347
4.638ValArg: 4.638 ± 0.544
4.638ValSer: 4.638 ± 0.639
5.117ValThr: 5.117 ± 0.518
5.704ValVal: 5.704 ± 0.59
2.079ValTrp: 2.079 ± 0.357
1.386ValTyr: 1.386 ± 0.3
0.0ValXaa: 0.0 ± 0.0
Trp
2.186TrpAla: 2.186 ± 0.286
0.267TrpCys: 0.267 ± 0.138
1.599TrpAsp: 1.599 ± 0.309
1.173TrpGlu: 1.173 ± 0.335
0.746TrpPhe: 0.746 ± 0.192
1.279TrpGly: 1.279 ± 0.304
0.746TrpHis: 0.746 ± 0.211
1.119TrpIle: 1.119 ± 0.205
0.959TrpLys: 0.959 ± 0.193
2.026TrpLeu: 2.026 ± 0.338
0.853TrpMet: 0.853 ± 0.22
0.693TrpAsn: 0.693 ± 0.198
1.599TrpPro: 1.599 ± 0.336
1.439TrpGln: 1.439 ± 0.283
1.972TrpArg: 1.972 ± 0.496
1.226TrpSer: 1.226 ± 0.234
1.493TrpThr: 1.493 ± 0.222
2.079TrpVal: 2.079 ± 0.416
1.013TrpTrp: 1.013 ± 0.209
0.373TrpTyr: 0.373 ± 0.132
0.0TrpXaa: 0.0 ± 0.0
Tyr
2.399TyrAla: 2.399 ± 0.372
0.267TyrCys: 0.267 ± 0.12
1.652TyrAsp: 1.652 ± 0.348
1.759TyrGlu: 1.759 ± 0.306
0.693TyrPhe: 0.693 ± 0.19
2.132TyrGly: 2.132 ± 0.377
0.48TyrHis: 0.48 ± 0.176
1.119TyrIle: 1.119 ± 0.222
0.693TyrLys: 0.693 ± 0.219
2.026TyrLeu: 2.026 ± 0.279
0.16TyrMet: 0.16 ± 0.087
0.746TyrAsn: 0.746 ± 0.202
1.279TyrPro: 1.279 ± 0.234
0.906TyrGln: 0.906 ± 0.237
2.345TyrArg: 2.345 ± 0.388
1.066TyrSer: 1.066 ± 0.252
1.919TyrThr: 1.919 ± 0.338
2.505TyrVal: 2.505 ± 0.316
0.693TyrTrp: 0.693 ± 0.181
0.64TyrTyr: 0.64 ± 0.166
0.0TyrXaa: 0.0 ± 0.0
Xaa
0.0XaaAla: 0.0 ± 0.0
0.0XaaCys: 0.0 ± 0.0
0.0XaaAsp: 0.0 ± 0.0
0.0XaaGlu: 0.0 ± 0.0
0.0XaaPhe: 0.0 ± 0.0
0.0XaaGly: 0.0 ± 0.0
0.0XaaHis: 0.0 ± 0.0
0.0XaaIle: 0.0 ± 0.0
0.0XaaLys: 0.0 ± 0.0
0.0XaaLeu: 0.0 ± 0.0
0.0XaaMet: 0.0 ± 0.0
0.0XaaAsn: 0.0 ± 0.0
0.0XaaPro: 0.0 ± 0.0
0.0XaaGln: 0.0 ± 0.0
0.0XaaArg: 0.0 ± 0.0
0.0XaaSer: 0.0 ± 0.0
0.0XaaThr: 0.0 ± 0.0
0.0XaaVal: 0.0 ± 0.0
0.0XaaTrp: 0.0 ± 0.0
0.0XaaTyr: 0.0 ± 0.0
0.0XaaXaa: 0.0 ± 0.0
Statistics based on 104 proteins (18761 amino acids)

Note: The error was estimated by bootstrapping (100 replicates) at the protein level.
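A protein-level bootstrap of this kind can be sketched as below: whole proteins are resampled with replacement, the frequency is recomputed for each replicate, and the spread of the replicates is taken as the error. The exact procedure used by the database is not described here, so the use of the sample standard deviation, the normalization, the function names, and the toy sequences are all assumptions.

```python
import random
import statistics
from collections import Counter


def pair_per_mille(proteins, pair):
    """Frequency of one dipeptide, in per mille of all adjacent pairs (assumed normalization)."""
    counts = Counter()
    for seq in proteins:
        counts.update(seq[i:i + 2] for i in range(len(seq) - 1))
    total = sum(counts.values())
    return 1000.0 * counts[pair] / total if total else 0.0


def bootstrap_error(proteins, pair, n_rounds=100, seed=0):
    """Resample whole proteins with replacement and return the spread of the estimates."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    estimates = []
    for _ in range(n_rounds):
        sample = rng.choices(proteins, k=len(proteins))
        estimates.append(pair_per_mille(sample, pair))
    return statistics.stdev(estimates)


# Hypothetical toy proteome, not taken from the database.
toy = ["MAAGLAA", "MKVLAAG", "MGGSAAV", "MPLRRAA"]
err = bootstrap_error(toy, "AA")
print(round(pair_per_mille(toy, "AA"), 3), "±", round(err, 3))
```

Resampling proteins rather than individual residues keeps each sequence intact, so the estimate reflects variation between proteins.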

The dipeptide statistics above (along with other statistics for this proteome) can be downloaded from this CSV file
See this proteome in: uniprot_link
Proteome-pI is available under a Creative Commons Attribution-NoDerivs license; for more details, see here

Reference: Kozlowski LP. Proteome-pI 2.0: Proteome Isoelectric Point Database Update. Nucleic Acids Res. 2021, doi: 10.1093/nar/gkab944
Contact: Lukasz P. Kozlowski