Amino acid dipeptide frequency for Mycobacterium virus Dotproduct

Apart from single amino acid frequencies, one can also calculate the so-called amino acid dipeptide frequency, i.e. how often each ordered pair of adjacent amino acids occurs.

There are 400 possible dipeptides (441 if Xaa is counted as an additional, 21st amino acid representing all non-standard or unknown amino acids). Thus, if all dipeptides occurred with equal probability in proteins, each would be present at a frequency of 0.25% (2.5‰). As this is not the case in nature, for better visibility all overrepresented dipeptides are marked in red in the table, and underrepresented ones in blue.
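The counting and the comparison against the uniform baseline can be sketched as follows. This is a minimal illustration, not the database's own code: the function names, the toy sequences, the use of one-letter codes, and the choice to normalize by the total number of dipeptides are all assumptions.

```python
from collections import Counter
from itertools import product

# The 20 standard residues as one-letter codes (Xaa / "X" is left out here).
STANDARD = "ACDEFGHIKLMNPQRSTVWY"

def dipeptide_per_mille(sequences):
    """Return {dipeptide: frequency in per mille} over overlapping pairs."""
    counts = Counter()
    for seq in sequences:
        # Overlapping windows of length 2; pairs never span two proteins.
        counts.update(seq[i:i + 2] for i in range(len(seq) - 1))
    # Assumption: normalize by the total number of dipeptides observed.
    total = sum(counts.values())
    return {a + b: 1000.0 * counts[a + b] / total
            for a, b in product(STANDARD, repeat=2)}

# Uniform expectation: 1 / 400 = 0.25% = 2.5 per mille.
EXPECTED = 1000.0 / len(STANDARD) ** 2

def label(freq, expected=EXPECTED):
    """Colour class relative to the uniform expectation."""
    if freq > expected:
        return "red"
    if freq < expected:
        return "blue"
    return "neutral"

toy = ["MAAAGL", "MAGLAA"]  # made-up sequences, not the phage proteome
freqs = dipeptide_per_mille(toy)
print(freqs["AA"], label(freqs["AA"]), label(freqs["CC"]))
```

With real data, the input would be the protein sequences of the proteome; the page does not state whether the database applies any threshold or normalization beyond the simple comparison shown here.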

All values are given in per mille (‰) and therefore need to be multiplied by 10^-3 to obtain fractions.
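As a small worked example of the unit conversion, the snippet below turns the AlaAla entry of the table into a fraction and a percentage. The helper name is hypothetical and only serves the illustration.

```python
def per_mille_to_fraction(value):
    """Convert a per-mille value to a plain fraction (multiply by 10^-3)."""
    return value * 1e-3

ala_ala = 14.755  # AlaAla entry from the table, in per mille
fraction = per_mille_to_fraction(ala_ala)
percent = fraction * 100
print(f"{fraction:.6f} {percent:.4f}%")
```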

For more information, see the sequence space article on Wikipedia.

Ala Cys Asp Glu Phe Gly His Ile Lys Leu Met Asn Pro Gln Arg Ser Thr Val Trp Tyr Xaa
Ala
AlaAla: 14.755 ± 1.762
AlaCys: 0.968 ± 0.283
AlaAsp: 7.463 ± 0.706
AlaGlu: 6.95 ± 0.87
AlaPhe: 2.792 ± 0.425
AlaGly: 10.312 ± 1.336
AlaHis: 2.507 ± 0.404
AlaIle: 4.216 ± 0.506
AlaLys: 4.216 ± 0.462
AlaLeu: 8.659 ± 0.816
AlaMet: 2.621 ± 0.368
AlaAsn: 2.336 ± 0.444
AlaPro: 5.127 ± 0.676
AlaGln: 3.817 ± 0.532
AlaArg: 7.52 ± 0.714
AlaSer: 5.412 ± 0.589
AlaThr: 6.267 ± 0.49
AlaVal: 6.836 ± 0.584
AlaTrp: 2.621 ± 0.386
AlaTyr: 2.108 ± 0.315
AlaXaa: 0.0 ± 0.0
Cys
CysAla: 0.798 ± 0.257
CysCys: 0.114 ± 0.087
CysAsp: 1.196 ± 0.322
CysGlu: 0.912 ± 0.219
CysPhe: 0.171 ± 0.102
CysGly: 1.937 ± 0.408
CysHis: 0.114 ± 0.077
CysIle: 0.171 ± 0.097
CysLys: 0.399 ± 0.13
CysLeu: 0.912 ± 0.272
CysMet: 0.228 ± 0.1
CysAsn: 0.342 ± 0.136
CysPro: 1.025 ± 0.31
CysGln: 0.399 ± 0.152
CysArg: 1.025 ± 0.329
CysSer: 0.968 ± 0.348
CysThr: 0.627 ± 0.169
CysVal: 0.627 ± 0.155
CysTrp: 0.228 ± 0.105
CysTyr: 0.228 ± 0.112
CysXaa: 0.0 ± 0.0
Asp
AspAla: 7.121 ± 0.651
AspCys: 0.912 ± 0.231
AspAsp: 4.842 ± 0.593
AspGlu: 3.361 ± 0.396
AspPhe: 1.937 ± 0.284
AspGly: 7.349 ± 0.8
AspHis: 1.253 ± 0.249
AspIle: 2.165 ± 0.333
AspLys: 1.481 ± 0.313
AspLeu: 6.039 ± 0.644
AspMet: 1.196 ± 0.247
AspAsn: 1.652 ± 0.333
AspPro: 4.956 ± 0.657
AspGln: 2.336 ± 0.331
AspArg: 5.07 ± 0.636
AspSer: 3.019 ± 0.471
AspThr: 4.501 ± 0.507
AspVal: 4.444 ± 0.607
AspTrp: 1.367 ± 0.255
AspTyr: 2.051 ± 0.348
AspXaa: 0.0 ± 0.0
Glu
GluAla: 5.982 ± 0.69
GluCys: 0.968 ± 0.246
GluAsp: 2.279 ± 0.405
GluGlu: 2.507 ± 0.432
GluPhe: 2.108 ± 0.297
GluGly: 3.532 ± 0.466
GluHis: 1.31 ± 0.294
GluIle: 1.823 ± 0.34
GluLys: 1.88 ± 0.32
GluLeu: 5.526 ± 0.701
GluMet: 1.538 ± 0.286
GluAsn: 1.937 ± 0.298
GluPro: 3.133 ± 0.53
GluGln: 2.621 ± 0.393
GluArg: 5.013 ± 0.645
GluSer: 2.792 ± 0.447
GluThr: 4.558 ± 0.733
GluVal: 3.931 ± 0.506
GluTrp: 1.538 ± 0.28
GluTyr: 1.595 ± 0.316
GluXaa: 0.0 ± 0.0
Phe
PheAla: 2.849 ± 0.463
PheCys: 0.285 ± 0.127
PheAsp: 2.051 ± 0.391
PheGlu: 1.88 ± 0.323
PhePhe: 1.025 ± 0.315
PheGly: 2.735 ± 0.625
PheHis: 0.456 ± 0.146
PheIle: 1.595 ± 0.352
PheLys: 1.139 ± 0.247
PheLeu: 1.766 ± 0.284
PheMet: 0.684 ± 0.206
PheAsn: 1.367 ± 0.447
PhePro: 1.709 ± 0.306
PheGln: 1.139 ± 0.348
PheArg: 1.481 ± 0.249
PheSer: 1.652 ± 0.283
PheThr: 2.621 ± 0.391
PheVal: 1.88 ± 0.296
PheTrp: 0.627 ± 0.158
PheTyr: 0.798 ± 0.225
PheXaa: 0.0 ± 0.0
Gly
GlyAla: 9.628 ± 1.172
GlyCys: 1.196 ± 0.272
GlyAsp: 6.495 ± 0.65
GlyGlu: 4.045 ± 0.613
GlyPhe: 2.621 ± 0.452
GlyGly: 10.426 ± 1.987
GlyHis: 2.051 ± 0.328
GlyIle: 3.817 ± 0.571
GlyLys: 2.507 ± 0.369
GlyLeu: 6.096 ± 0.494
GlyMet: 2.45 ± 0.491
GlyAsn: 3.304 ± 0.422
GlyPro: 4.444 ± 0.588
GlyGln: 2.678 ± 0.575
GlyArg: 5.241 ± 0.641
GlySer: 5.754 ± 0.799
GlyThr: 6.096 ± 0.791
GlyVal: 6.552 ± 0.723
GlyTrp: 2.393 ± 0.341
GlyTyr: 2.051 ± 0.393
GlyXaa: 0.0 ± 0.0
His
HisAla: 1.823 ± 0.291
HisCys: 0.456 ± 0.188
HisAsp: 1.196 ± 0.26
HisGlu: 1.196 ± 0.296
HisPhe: 0.57 ± 0.15
HisGly: 1.538 ± 0.303
HisHis: 0.855 ± 0.23
HisIle: 1.424 ± 0.311
HisLys: 0.798 ± 0.178
HisLeu: 1.538 ± 0.319
HisMet: 0.57 ± 0.136
HisAsn: 0.855 ± 0.187
HisPro: 1.595 ± 0.262
HisGln: 0.627 ± 0.17
HisArg: 2.279 ± 0.392
HisSer: 0.855 ± 0.193
HisThr: 1.538 ± 0.346
HisVal: 1.025 ± 0.252
HisTrp: 0.57 ± 0.164
HisTyr: 0.798 ± 0.191
HisXaa: 0.0 ± 0.0
Ile
IleAla: 5.298 ± 0.468
IleCys: 0.57 ± 0.201
IleAsp: 3.76 ± 0.44
IleGlu: 3.076 ± 0.373
IlePhe: 0.798 ± 0.254
IleGly: 3.304 ± 0.476
IleHis: 1.481 ± 0.337
IleIle: 1.367 ± 0.281
IleLys: 1.367 ± 0.272
IleLeu: 2.564 ± 0.391
IleMet: 0.342 ± 0.158
IleAsn: 2.336 ± 0.319
IlePro: 2.45 ± 0.332
IleGln: 1.082 ± 0.236
IleArg: 2.222 ± 0.345
IleSer: 1.994 ± 0.403
IleThr: 3.646 ± 0.444
IleVal: 3.076 ± 0.403
IleTrp: 0.912 ± 0.245
IleTyr: 0.741 ± 0.22
IleXaa: 0.0 ± 0.0
Lys
LysAla: 3.703 ± 0.545
LysCys: 0.399 ± 0.146
LysAsp: 1.766 ± 0.305
LysGlu: 1.709 ± 0.269
LysPhe: 1.196 ± 0.219
LysGly: 2.507 ± 0.361
LysHis: 0.968 ± 0.24
LysIle: 1.025 ± 0.283
LysLys: 1.424 ± 0.401
LysLeu: 2.45 ± 0.465
LysMet: 0.456 ± 0.144
LysAsn: 0.912 ± 0.262
LysPro: 2.222 ± 0.414
LysGln: 1.481 ± 0.262
LysArg: 2.621 ± 0.478
LysSer: 1.823 ± 0.303
LysThr: 2.165 ± 0.342
LysVal: 2.564 ± 0.412
LysTrp: 0.912 ± 0.263
LysTyr: 1.139 ± 0.317
LysXaa: 0.0 ± 0.0
Leu
LeuAla: 7.748 ± 0.795
LeuCys: 0.684 ± 0.215
LeuAsp: 5.241 ± 0.558
LeuGlu: 3.418 ± 0.428
LeuPhe: 2.621 ± 0.281
LeuGly: 5.07 ± 0.572
LeuHis: 1.082 ± 0.297
LeuIle: 3.133 ± 0.394
LeuLys: 1.994 ± 0.382
LeuLeu: 5.241 ± 0.62
LeuMet: 1.538 ± 0.317
LeuAsn: 2.393 ± 0.347
LeuPro: 5.469 ± 0.636
LeuGln: 2.678 ± 0.395
LeuArg: 5.583 ± 0.587
LeuSer: 5.298 ± 0.471
LeuThr: 5.64 ± 0.518
LeuVal: 4.558 ± 0.567
LeuTrp: 1.367 ± 0.313
LeuTyr: 2.279 ± 0.449
LeuXaa: 0.0 ± 0.0
Met
MetAla: 2.678 ± 0.428
MetCys: 0.285 ± 0.188
MetAsp: 1.025 ± 0.297
MetGlu: 0.968 ± 0.204
MetPhe: 0.684 ± 0.181
MetGly: 1.994 ± 0.329
MetHis: 0.342 ± 0.147
MetIle: 0.855 ± 0.231
MetLys: 1.025 ± 0.286
MetLeu: 1.481 ± 0.267
MetMet: 0.57 ± 0.239
MetAsn: 1.082 ± 0.239
MetPro: 1.139 ± 0.281
MetGln: 0.513 ± 0.145
MetArg: 1.481 ± 0.335
MetSer: 2.735 ± 0.476
MetThr: 2.222 ± 0.335
MetVal: 1.253 ± 0.292
MetTrp: 0.342 ± 0.128
MetTyr: 0.342 ± 0.167
MetXaa: 0.0 ± 0.0
Asn
AsnAla: 3.589 ± 0.427
AsnCys: 0.114 ± 0.089
AsnAsp: 2.108 ± 0.285
AsnGlu: 1.823 ± 0.331
AsnPhe: 0.912 ± 0.31
AsnGly: 4.501 ± 0.76
AsnHis: 0.798 ± 0.162
AsnIle: 1.652 ± 0.483
AsnLys: 1.196 ± 0.245
AsnLeu: 2.222 ± 0.413
AsnMet: 0.57 ± 0.194
AsnAsn: 1.652 ± 0.466
AsnPro: 3.076 ± 0.431
AsnGln: 1.082 ± 0.366
AsnArg: 1.88 ± 0.384
AsnSer: 1.652 ± 0.313
AsnThr: 2.222 ± 0.294
AsnVal: 1.994 ± 0.399
AsnTrp: 0.741 ± 0.188
AsnTyr: 0.513 ± 0.136
AsnXaa: 0.0 ± 0.0
Pro
ProAla: 5.298 ± 0.73
ProCys: 0.741 ± 0.216
ProAsp: 4.558 ± 0.55
ProGlu: 4.216 ± 0.46
ProPhe: 1.766 ± 0.375
ProGly: 6.779 ± 0.773
ProHis: 1.424 ± 0.313
ProIle: 2.165 ± 0.317
ProLys: 2.222 ± 0.389
ProLeu: 4.102 ± 0.535
ProMet: 1.766 ± 0.406
ProAsn: 2.336 ± 0.362
ProPro: 4.273 ± 0.631
ProGln: 2.108 ± 0.34
ProArg: 3.076 ± 0.522
ProSer: 3.475 ± 0.462
ProThr: 3.589 ± 0.563
ProVal: 4.956 ± 0.59
ProTrp: 1.367 ± 0.271
ProTyr: 1.595 ± 0.334
ProXaa: 0.0 ± 0.0
Gln
GlnAla: 4.672 ± 0.608
GlnCys: 0.399 ± 0.146
GlnAsp: 1.538 ± 0.246
GlnGlu: 1.709 ± 0.266
GlnPhe: 1.139 ± 0.259
GlnGly: 2.393 ± 0.47
GlnHis: 0.912 ± 0.212
GlnIle: 1.595 ± 0.281
GlnLys: 1.253 ± 0.23
GlnLeu: 2.792 ± 0.425
GlnMet: 0.741 ± 0.195
GlnAsn: 0.912 ± 0.213
GlnPro: 2.792 ± 0.451
GlnGln: 1.196 ± 0.32
GlnArg: 2.678 ± 0.329
GlnSer: 2.279 ± 0.414
GlnThr: 1.823 ± 0.346
GlnVal: 2.051 ± 0.378
GlnTrp: 0.684 ± 0.163
GlnTyr: 0.627 ± 0.235
GlnXaa: 0.0 ± 0.0
Arg
ArgAla: 6.893 ± 0.643
ArgCys: 1.31 ± 0.352
ArgAsp: 4.444 ± 0.591
ArgGlu: 4.501 ± 0.648
ArgPhe: 1.823 ± 0.393
ArgGly: 4.615 ± 0.561
ArgHis: 1.538 ± 0.356
ArgIle: 3.817 ± 0.515
ArgLys: 2.621 ± 0.491
ArgLeu: 4.33 ± 0.619
ArgMet: 2.507 ± 0.378
ArgAsn: 2.792 ± 0.47
ArgPro: 3.589 ± 0.422
ArgGln: 2.165 ± 0.391
ArgArg: 5.697 ± 0.852
ArgSer: 3.646 ± 0.484
ArgThr: 3.532 ± 0.545
ArgVal: 5.583 ± 0.569
ArgTrp: 1.652 ± 0.38
ArgTyr: 1.766 ± 0.306
ArgXaa: 0.0 ± 0.0
Ser
SerAla: 5.526 ± 0.731
SerCys: 0.513 ± 0.191
SerAsp: 4.33 ± 0.556
SerGlu: 3.247 ± 0.455
SerPhe: 2.222 ± 0.438
SerGly: 6.722 ± 0.853
SerHis: 1.025 ± 0.219
SerIle: 2.678 ± 0.458
SerLys: 2.222 ± 0.398
SerLeu: 3.532 ± 0.465
SerMet: 1.481 ± 0.282
SerAsn: 1.994 ± 0.375
SerPro: 3.133 ± 0.376
SerGln: 1.766 ± 0.28
SerArg: 3.418 ± 0.526
SerSer: 3.532 ± 0.721
SerThr: 3.304 ± 0.427
SerVal: 3.874 ± 0.576
SerTrp: 1.481 ± 0.27
SerTyr: 1.424 ± 0.247
SerXaa: 0.0 ± 0.0
Thr
ThrAla: 6.95 ± 0.761
ThrCys: 0.57 ± 0.197
ThrAsp: 4.33 ± 0.592
ThrGlu: 3.418 ± 0.348
ThrPhe: 1.595 ± 0.377
ThrGly: 5.811 ± 0.632
ThrHis: 1.652 ± 0.319
ThrIle: 3.931 ± 0.489
ThrLys: 2.165 ± 0.395
ThrLeu: 4.956 ± 0.517
ThrMet: 1.196 ± 0.26
ThrAsn: 2.507 ± 0.373
ThrPro: 5.526 ± 0.86
ThrGln: 1.652 ± 0.337
ThrArg: 4.159 ± 0.495
ThrSer: 4.159 ± 0.442
ThrThr: 4.444 ± 0.67
ThrVal: 5.697 ± 0.618
ThrTrp: 0.912 ± 0.27
ThrTyr: 1.823 ± 0.277
ThrXaa: 0.0 ± 0.0
Val
ValAla: 7.406 ± 0.515
ValCys: 1.367 ± 0.354
ValAsp: 5.241 ± 0.57
ValGlu: 4.956 ± 0.518
ValPhe: 2.393 ± 0.443
ValGly: 5.754 ± 0.731
ValHis: 1.367 ± 0.284
ValIle: 2.45 ± 0.426
ValLys: 2.279 ± 0.308
ValLeu: 5.127 ± 0.565
ValMet: 1.31 ± 0.227
ValAsn: 2.279 ± 0.4
ValPro: 3.703 ± 0.429
ValGln: 3.019 ± 0.37
ValArg: 4.045 ± 0.588
ValSer: 4.387 ± 0.518
ValThr: 4.899 ± 0.604
ValVal: 6.438 ± 0.785
ValTrp: 1.937 ± 0.391
ValTyr: 1.196 ± 0.235
ValXaa: 0.0 ± 0.0
Trp
TrpAla: 2.051 ± 0.345
TrpCys: 0.171 ± 0.109
TrpAsp: 1.595 ± 0.336
TrpGlu: 1.139 ± 0.324
TrpPhe: 0.627 ± 0.18
TrpGly: 0.912 ± 0.225
TrpHis: 0.684 ± 0.214
TrpIle: 1.31 ± 0.262
TrpLys: 0.57 ± 0.152
TrpLeu: 1.652 ± 0.336
TrpMet: 0.968 ± 0.3
TrpAsn: 0.684 ± 0.261
TrpPro: 1.196 ± 0.301
TrpGln: 1.139 ± 0.282
TrpArg: 2.222 ± 0.426
TrpSer: 1.082 ± 0.219
TrpThr: 1.88 ± 0.314
TrpVal: 1.994 ± 0.465
TrpTrp: 0.968 ± 0.239
TrpTyr: 0.399 ± 0.152
TrpXaa: 0.0 ± 0.0
Tyr
TyrAla: 2.735 ± 0.373
TyrCys: 0.399 ± 0.175
TyrAsp: 1.595 ± 0.378
TyrGlu: 1.538 ± 0.318
TyrPhe: 0.741 ± 0.224
TyrGly: 1.652 ± 0.333
TyrHis: 0.285 ± 0.114
TyrIle: 1.253 ± 0.301
TyrLys: 0.684 ± 0.218
TyrLeu: 2.051 ± 0.368
TyrMet: 0.228 ± 0.108
TyrAsn: 0.627 ± 0.207
TyrPro: 1.253 ± 0.226
TyrGln: 0.684 ± 0.184
TyrArg: 2.108 ± 0.338
TyrSer: 0.912 ± 0.238
TyrThr: 1.88 ± 0.343
TyrVal: 2.222 ± 0.36
TyrTrp: 0.627 ± 0.193
TyrTyr: 0.627 ± 0.18
TyrXaa: 0.0 ± 0.0
Xaa
XaaAla: 0.0 ± 0.0
XaaCys: 0.0 ± 0.0
XaaAsp: 0.0 ± 0.0
XaaGlu: 0.0 ± 0.0
XaaPhe: 0.0 ± 0.0
XaaGly: 0.0 ± 0.0
XaaHis: 0.0 ± 0.0
XaaIle: 0.0 ± 0.0
XaaLys: 0.0 ± 0.0
XaaLeu: 0.0 ± 0.0
XaaMet: 0.0 ± 0.0
XaaAsn: 0.0 ± 0.0
XaaPro: 0.0 ± 0.0
XaaGln: 0.0 ± 0.0
XaaArg: 0.0 ± 0.0
XaaSer: 0.0 ± 0.0
XaaThr: 0.0 ± 0.0
XaaVal: 0.0 ± 0.0
XaaTrp: 0.0 ± 0.0
XaaTyr: 0.0 ± 0.0
XaaXaa: 0.0 ± 0.0
Statistics based on 98 proteins (17554 amino acids)

Note: The error was estimated by bootstrapping (x100) at the protein level.
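A protein-level bootstrap of this kind can be sketched as follows: whole proteins are resampled with replacement, the dipeptide frequency is recomputed for each resampled proteome, and the spread of those estimates is reported as the error. This is only a sketch under assumptions. The function names, the fixed seed, the use of the sample standard deviation as the spread measure, and the normalization by the total dipeptide count are not taken from the database.

```python
import random
from collections import Counter
from statistics import stdev

def pair_counts(seq):
    """Count overlapping dipeptides in one protein sequence."""
    return Counter(seq[i:i + 2] for i in range(len(seq) - 1))

def bootstrap_error(sequences, dipeptide, n_boot=100, seed=0):
    """Std. dev. of a dipeptide's per-mille frequency over resampled proteomes."""
    rng = random.Random(seed)
    per_protein = [pair_counts(s) for s in sequences]
    estimates = []
    for _ in range(n_boot):
        # Resample whole proteins with replacement (protein-level bootstrap).
        sample = rng.choices(per_protein, k=len(per_protein))
        hits = sum(c[dipeptide] for c in sample)
        total = sum(sum(c.values()) for c in sample)
        estimates.append(1000.0 * hits / total)
    # Assumption: the reported error is the sample standard deviation.
    return stdev(estimates)

toy = ["MAAAGL", "MAGLAA", "MKKLAA", "MGGAAV"]  # made-up sequences
print(round(bootstrap_error(toy, "AA"), 3))
```

Resampling proteins rather than individual residues keeps the dipeptides of each protein together, so the estimate reflects variation between proteins.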

The dipeptide statistics above (along with other statistics for this proteome) can be downloaded from this CSV file
See this proteome in: uniprot_link
Proteome-pI is available under the Creative Commons Attribution-NoDerivs license; for more details see here

Reference: Kozlowski LP. Proteome-pI 2.0: Proteome Isoelectric Point Database Update. Nucleic Acids Res. 2021, doi: 10.1093/nar/gkab944
Contact: Lukasz P. Kozlowski