Amino acid dipeptide frequency for Mycobacterium virus Wonder

Apart from single amino acid frequencies, one can also calculate the so-called amino acid dipeptide frequency.
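As a rough illustration of what such a calculation can look like, the minimal Python sketch below counts overlapping dipeptides within each protein and reports them in per mille. The function name `dipeptide_per_mille`, the use of one-letter codes (with `X` standing for Xaa), and the normalisation by the total number of dipeptides are assumptions made for this example; the exact procedure used by the database is not stated here.

```python
from collections import Counter

# The 20 standard amino acids in one-letter code (assumption: input uses
# one-letter codes; the table on this page uses three-letter names).
STANDARD = set("ACDEFGHIKLMNPQRSTVWY")


def dipeptide_per_mille(proteins):
    """Return {dipeptide: frequency in per mille} pooled over all proteins."""
    counts = Counter()
    for seq in proteins:
        # Map anything outside the 20 standard residues to X (Xaa).
        seq = "".join(c if c in STANDARD else "X" for c in seq.upper())
        # Overlapping pairs; a pair never spans two different proteins.
        counts.update(seq[i:i + 2] for i in range(len(seq) - 1))
    total = sum(counts.values())
    if total == 0:
        return {}
    return {dp: 1000.0 * n / total for dp, n in counts.items()}


# Toy input (hypothetical sequences, not taken from this proteome).
freqs = dipeptide_per_mille(["MAAL", "AAG"])
```

In the toy input there are five dipeptides in total (MA, AA, AL, AA, AG), so `freqs["AA"]` is two fifths of 1000.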

There are 400 possibilities (441 if we consider Xaa as an additional 21st amino acid representing all non-standard or unknown amino acids). Thus, if dipeptides occurred randomly in proteins, with all 20 amino acids equally likely, each dipeptide would be present at a frequency of 0.25%. As this is not the case in nature, for better visibility all overrepresented dipeptides are marked in red and underrepresented ones in blue in the table.

All values are given in per mille (‰), so they must be multiplied by 10⁻³ to obtain fractions.
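The arithmetic behind the expected value and the unit conversion can be sketched as follows. With 400 equally likely dipeptides the expected frequency is 1/400 = 0.25%, which is 2.5‰ in the units of the table. The helper names and the simple above/below comparison are assumptions for illustration; the exact colouring rule used on the page is not specified beyond "more than expected" and "underrepresented".

```python
def expected_per_mille(alphabet_size=20):
    """Expected dipeptide frequency (in per mille) if all pairs were equally likely."""
    return 1000.0 / alphabet_size ** 2


def per_mille_to_fraction(value):
    """Convert a per mille value to a plain fraction (multiply by 10^-3)."""
    return value * 1e-3


def classify(observed, expected):
    """Label a dipeptide relative to the uniform expectation (illustrative rule)."""
    if observed > expected:
        return "over"
    if observed < expected:
        return "under"
    return "as expected"


expected = expected_per_mille()  # 1000 / 400 = 2.5 per mille, i.e. 0.25%

# Three values copied from the table on this page (in per mille).
observed = {"AlaAla": 10.607, "AlaCys": 0.6, "ValGlu": 3.302}
labels = {name: classify(v, expected) for name, v in observed.items()}
```

Under this rule AlaAla (10.607‰) is labelled overrepresented and AlaCys (0.6‰) underrepresented; as a fraction, 10.607‰ is about 0.0106.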

For more information, see the sequence space article on Wikipedia.

Ala Cys Asp Glu Phe Gly His Ile Lys Leu Met Asn Pro Gln Arg Ser Thr Val Trp Tyr Xaa
Ala
AlaAla: 10.607 ± 1.371
AlaCys: 0.6 ± 0.222
AlaAsp: 5.304 ± 0.837
AlaGlu: 7.005 ± 1.044
AlaPhe: 4.403 ± 0.877
AlaGly: 7.405 ± 1.104
AlaHis: 2.302 ± 0.439
AlaIle: 4.303 ± 0.571
AlaLys: 4.403 ± 0.704
AlaLeu: 8.806 ± 1.55
AlaMet: 2.802 ± 0.669
AlaAsn: 3.002 ± 0.623
AlaPro: 3.803 ± 0.659
AlaGln: 4.603 ± 0.802
AlaArg: 6.805 ± 0.879
AlaSer: 4.503 ± 0.766
AlaThr: 5.704 ± 0.727
AlaVal: 7.605 ± 0.781
AlaTrp: 2.101 ± 0.433
AlaTyr: 2.001 ± 0.494
AlaXaa: 0.0 ± 0.0
Cys
CysAla: 0.801 ± 0.215
CysCys: 0.0 ± 0.0
CysAsp: 0.6 ± 0.267
CysGlu: 0.6 ± 0.239
CysPhe: 0.4 ± 0.188
CysGly: 0.5 ± 0.232
CysHis: 0.2 ± 0.134
CysIle: 0.2 ± 0.129
CysLys: 0.4 ± 0.161
CysLeu: 0.5 ± 0.248
CysMet: 0.2 ± 0.137
CysAsn: 0.7 ± 0.257
CysPro: 0.6 ± 0.299
CysGln: 0.1 ± 0.098
CysArg: 0.5 ± 0.191
CysSer: 0.3 ± 0.173
CysThr: 0.4 ± 0.182
CysVal: 0.6 ± 0.273
CysTrp: 0.1 ± 0.091
CysTyr: 0.3 ± 0.172
CysXaa: 0.0 ± 0.0
Asp
AspAla: 6.505 ± 0.817
AspCys: 0.901 ± 0.278
AspAsp: 3.502 ± 0.734
AspGlu: 3.502 ± 0.659
AspPhe: 2.702 ± 0.467
AspGly: 6.705 ± 0.841
AspHis: 1.301 ± 0.442
AspIle: 2.202 ± 0.372
AspLys: 1.901 ± 0.471
AspLeu: 5.604 ± 0.688
AspMet: 1.301 ± 0.306
AspAsn: 1.301 ± 0.42
AspPro: 6.104 ± 1.26
AspGln: 1.701 ± 0.381
AspArg: 3.502 ± 0.821
AspSer: 3.102 ± 0.554
AspThr: 3.903 ± 0.578
AspVal: 3.703 ± 0.434
AspTrp: 1.301 ± 0.297
AspTyr: 3.202 ± 0.527
AspXaa: 0.0 ± 0.0
Glu
GluAla: 6.705 ± 0.809
GluCys: 0.2 ± 0.152
GluAsp: 4.503 ± 0.959
GluGlu: 5.104 ± 1.179
GluPhe: 2.202 ± 0.449
GluGly: 4.403 ± 0.578
GluHis: 1.201 ± 0.352
GluIle: 3.202 ± 0.603
GluLys: 2.402 ± 0.406
GluLeu: 6.304 ± 0.876
GluMet: 1.701 ± 0.398
GluAsn: 2.402 ± 0.444
GluPro: 2.902 ± 0.701
GluGln: 3.102 ± 0.446
GluArg: 4.403 ± 0.716
GluSer: 2.702 ± 0.542
GluThr: 3.302 ± 0.662
GluVal: 4.803 ± 0.825
GluTrp: 1.401 ± 0.279
GluTyr: 2.602 ± 0.539
GluXaa: 0.0 ± 0.0
Phe
PheAla: 3.803 ± 0.783
PheCys: 0.1 ± 0.093
PheAsp: 2.802 ± 0.646
PheGlu: 2.101 ± 0.475
PhePhe: 0.7 ± 0.237
PheGly: 3.502 ± 0.631
PheHis: 0.801 ± 0.305
PheIle: 1.401 ± 0.307
PheLys: 1.401 ± 0.341
PheLeu: 3.803 ± 0.603
PheMet: 0.801 ± 0.223
PheAsn: 1.301 ± 0.453
PhePro: 1.601 ± 0.435
PheGln: 1.601 ± 0.402
PheArg: 2.402 ± 0.495
PheSer: 2.502 ± 0.438
PheThr: 2.502 ± 0.415
PheVal: 1.701 ± 0.404
PheTrp: 0.5 ± 0.229
PheTyr: 0.801 ± 0.295
PheXaa: 0.0 ± 0.0
Gly
GlyAla: 7.705 ± 1.271
GlyCys: 0.6 ± 0.254
GlyAsp: 6.204 ± 0.833
GlyGlu: 4.503 ± 0.614
GlyPhe: 3.502 ± 0.695
GlyGly: 6.705 ± 0.725
GlyHis: 1.601 ± 0.35
GlyIle: 4.703 ± 0.753
GlyLys: 3.202 ± 0.576
GlyLeu: 6.605 ± 0.885
GlyMet: 2.302 ± 0.676
GlyAsn: 3.402 ± 0.737
GlyPro: 5.604 ± 1.793
GlyGln: 2.902 ± 0.668
GlyArg: 4.603 ± 0.607
GlySer: 4.703 ± 0.904
GlyThr: 5.104 ± 0.805
GlyVal: 6.705 ± 0.888
GlyTrp: 2.402 ± 0.396
GlyTyr: 2.802 ± 0.535
GlyXaa: 0.0 ± 0.0
His
HisAla: 1.401 ± 0.41
HisCys: 0.2 ± 0.131
HisAsp: 1.301 ± 0.394
HisGlu: 1.401 ± 0.415
HisPhe: 0.901 ± 0.279
HisGly: 1.901 ± 0.544
HisHis: 0.7 ± 0.253
HisIle: 1.301 ± 0.26
HisLys: 0.4 ± 0.21
HisLeu: 1.501 ± 0.512
HisMet: 0.1 ± 0.092
HisAsn: 0.7 ± 0.267
HisPro: 1.301 ± 0.359
HisGln: 0.901 ± 0.256
HisArg: 2.001 ± 0.526
HisSer: 0.6 ± 0.259
HisThr: 1.101 ± 0.329
HisVal: 1.001 ± 0.311
HisTrp: 0.3 ± 0.212
HisTyr: 0.7 ± 0.29
HisXaa: 0.0 ± 0.0
Ile
IleAla: 4.803 ± 0.722
IleCys: 0.3 ± 0.149
IleAsp: 3.803 ± 0.459
IleGlu: 3.402 ± 0.687
IlePhe: 1.201 ± 0.308
IleGly: 3.803 ± 0.75
IleHis: 0.901 ± 0.317
IleIle: 1.701 ± 0.374
IleLys: 2.001 ± 0.452
IleLeu: 3.502 ± 0.456
IleMet: 0.5 ± 0.212
IleAsn: 1.101 ± 0.385
IlePro: 4.003 ± 0.517
IleGln: 1.801 ± 0.331
IleArg: 3.002 ± 0.497
IleSer: 3.302 ± 0.496
IleThr: 3.402 ± 0.593
IleVal: 2.702 ± 0.543
IleTrp: 1.301 ± 0.345
IleTyr: 0.801 ± 0.265
IleXaa: 0.0 ± 0.0
Lys
LysAla: 4.203 ± 0.701
LysCys: 0.1 ± 0.093
LysAsp: 2.202 ± 0.422
LysGlu: 2.502 ± 0.463
LysPhe: 1.601 ± 0.343
LysGly: 3.603 ± 0.769
LysHis: 1.201 ± 0.324
LysIle: 2.302 ± 0.431
LysLys: 2.001 ± 0.397
LysLeu: 3.302 ± 0.604
LysMet: 0.4 ± 0.15
LysAsn: 0.901 ± 0.253
LysPro: 2.101 ± 0.547
LysGln: 1.701 ± 0.379
LysArg: 2.602 ± 0.826
LysSer: 2.202 ± 0.531
LysThr: 2.702 ± 0.514
LysVal: 2.902 ± 0.498
LysTrp: 0.901 ± 0.307
LysTyr: 0.801 ± 0.222
LysXaa: 0.0 ± 0.0
Leu
LeuAla: 8.406 ± 1.525
LeuCys: 0.3 ± 0.159
LeuAsp: 5.404 ± 0.649
LeuGlu: 6.004 ± 0.801
LeuPhe: 2.101 ± 0.5
LeuGly: 6.905 ± 0.852
LeuHis: 1.801 ± 0.531
LeuIle: 4.703 ± 0.592
LeuLys: 3.202 ± 0.424
LeuLeu: 6.404 ± 0.977
LeuMet: 2.202 ± 0.532
LeuAsn: 3.002 ± 0.596
LeuPro: 4.403 ± 0.557
LeuGln: 2.702 ± 0.901
LeuArg: 6.605 ± 0.873
LeuSer: 4.503 ± 0.591
LeuThr: 5.404 ± 0.832
LeuVal: 5.004 ± 0.823
LeuTrp: 1.701 ± 0.364
LeuTyr: 2.702 ± 0.679
LeuXaa: 0.0 ± 0.0
Met
MetAla: 2.502 ± 0.533
MetCys: 0.0 ± 0.0
MetAsp: 1.601 ± 0.579
MetGlu: 1.501 ± 0.454
MetPhe: 0.801 ± 0.208
MetGly: 1.901 ± 0.375
MetHis: 0.2 ± 0.123
MetIle: 1.201 ± 0.313
MetLys: 0.801 ± 0.374
MetLeu: 1.501 ± 0.326
MetMet: 0.801 ± 0.426
MetAsn: 0.6 ± 0.222
MetPro: 2.101 ± 0.524
MetGln: 0.7 ± 0.434
MetArg: 2.202 ± 0.468
MetSer: 1.601 ± 0.341
MetThr: 1.701 ± 0.306
MetVal: 1.601 ± 0.399
MetTrp: 0.2 ± 0.121
MetTyr: 0.6 ± 0.301
MetXaa: 0.0 ± 0.0
Asn
AsnAla: 3.703 ± 0.656
AsnCys: 0.3 ± 0.154
AsnAsp: 1.901 ± 0.392
AsnGlu: 2.101 ± 0.302
AsnPhe: 1.301 ± 0.525
AsnGly: 3.002 ± 0.645
AsnHis: 0.7 ± 0.227
AsnIle: 1.501 ± 0.492
AsnLys: 1.001 ± 0.292
AsnLeu: 2.202 ± 0.6
AsnMet: 1.001 ± 0.312
AsnAsn: 1.001 ± 0.32
AsnPro: 2.802 ± 0.628
AsnGln: 1.101 ± 0.459
AsnArg: 1.801 ± 0.444
AsnSer: 1.801 ± 0.412
AsnThr: 1.901 ± 0.517
AsnVal: 3.002 ± 0.449
AsnTrp: 0.5 ± 0.198
AsnTyr: 1.101 ± 0.252
AsnXaa: 0.0 ± 0.0
Pro
ProAla: 5.904 ± 0.863
ProCys: 0.4 ± 0.244
ProAsp: 3.202 ± 0.644
ProGlu: 3.603 ± 0.845
ProPhe: 2.202 ± 0.408
ProGly: 4.503 ± 0.738
ProHis: 1.101 ± 0.234
ProIle: 2.802 ± 0.359
ProLys: 2.101 ± 0.506
ProLeu: 4.203 ± 0.908
ProMet: 1.501 ± 0.442
ProAsn: 3.002 ± 0.786
ProPro: 2.202 ± 0.553
ProGln: 2.902 ± 1.15
ProArg: 2.702 ± 0.669
ProSer: 2.702 ± 0.501
ProThr: 4.403 ± 1.055
ProVal: 3.803 ± 0.72
ProTrp: 0.901 ± 0.49
ProTyr: 1.701 ± 0.375
ProXaa: 0.0 ± 0.0
Gln
GlnAla: 4.103 ± 0.614
GlnCys: 0.2 ± 0.12
GlnAsp: 2.101 ± 0.422
GlnGlu: 2.302 ± 0.516
GlnPhe: 1.901 ± 0.696
GlnGly: 4.203 ± 1.569
GlnHis: 0.801 ± 0.276
GlnIle: 2.302 ± 0.437
GlnLys: 1.301 ± 0.384
GlnLeu: 3.903 ± 0.716
GlnMet: 1.501 ± 0.783
GlnAsn: 0.7 ± 0.302
GlnPro: 1.701 ± 0.353
GlnGln: 2.202 ± 0.849
GlnArg: 2.802 ± 0.576
GlnSer: 1.001 ± 0.338
GlnThr: 2.101 ± 0.594
GlnVal: 2.902 ± 0.619
GlnTrp: 0.6 ± 0.247
GlnTyr: 0.7 ± 0.338
GlnXaa: 0.0 ± 0.0
Arg
ArgAla: 6.004 ± 0.987
ArgCys: 1.401 ± 0.474
ArgAsp: 4.103 ± 0.669
ArgGlu: 5.804 ± 0.861
ArgPhe: 2.202 ± 0.531
ArgGly: 4.503 ± 0.663
ArgHis: 1.201 ± 0.326
ArgIle: 3.002 ± 0.551
ArgLys: 3.302 ± 0.724
ArgLeu: 5.404 ± 0.672
ArgMet: 1.601 ± 0.46
ArgAsn: 2.001 ± 0.383
ArgPro: 2.202 ± 0.374
ArgGln: 2.302 ± 0.473
ArgArg: 5.104 ± 0.692
ArgSer: 3.302 ± 0.74
ArgThr: 2.902 ± 0.573
ArgVal: 5.304 ± 0.64
ArgTrp: 1.801 ± 0.536
ArgTyr: 2.502 ± 0.557
ArgXaa: 0.0 ± 0.0
Ser
SerAla: 4.203 ± 0.775
SerCys: 0.0 ± 0.0
SerAsp: 2.702 ± 0.582
SerGlu: 3.803 ± 0.626
SerPhe: 2.502 ± 0.394
SerGly: 5.304 ± 1.07
SerHis: 0.7 ± 0.26
SerIle: 2.302 ± 0.363
SerLys: 2.202 ± 0.565
SerLeu: 4.803 ± 0.87
SerMet: 1.601 ± 0.459
SerAsn: 1.301 ± 0.287
SerPro: 2.802 ± 0.5
SerGln: 2.502 ± 0.638
SerArg: 4.403 ± 0.723
SerSer: 1.901 ± 0.304
SerThr: 2.402 ± 0.617
SerVal: 3.102 ± 0.649
SerTrp: 1.301 ± 0.333
SerTyr: 1.401 ± 0.367
SerXaa: 0.0 ± 0.0
Thr
ThrAla: 4.703 ± 0.707
ThrCys: 0.5 ± 0.223
ThrAsp: 4.103 ± 0.845
ThrGlu: 3.402 ± 0.708
ThrPhe: 2.202 ± 0.534
ThrGly: 7.105 ± 1.004
ThrHis: 0.901 ± 0.41
ThrIle: 2.001 ± 0.456
ThrLys: 2.402 ± 0.548
ThrLeu: 5.404 ± 0.742
ThrMet: 1.001 ± 0.3
ThrAsn: 2.001 ± 0.4
ThrPro: 4.103 ± 0.608
ThrGln: 2.502 ± 0.494
ThrArg: 3.202 ± 0.377
ThrSer: 2.402 ± 0.553
ThrThr: 3.703 ± 0.586
ThrVal: 5.004 ± 0.86
ThrTrp: 0.801 ± 0.261
ThrTyr: 2.001 ± 0.413
ThrXaa: 0.0 ± 0.0
Val
ValAla: 7.205 ± 1.061
ValCys: 1.001 ± 0.38
ValAsp: 5.004 ± 0.936
ValGlu: 3.302 ± 0.523
ValPhe: 2.001 ± 0.605
ValGly: 5.804 ± 0.584
ValHis: 1.401 ± 0.35
ValIle: 3.502 ± 0.516
ValLys: 4.703 ± 0.71
ValLeu: 5.004 ± 0.606
ValMet: 1.501 ± 0.366
ValAsn: 2.802 ± 0.519
ValPro: 3.803 ± 0.725
ValGln: 2.001 ± 0.41
ValArg: 4.403 ± 0.675
ValSer: 5.704 ± 0.738
ValThr: 3.603 ± 0.671
ValVal: 6.104 ± 0.738
ValTrp: 1.001 ± 0.332
ValTyr: 1.701 ± 0.419
ValXaa: 0.0 ± 0.0
Trp
TrpAla: 1.801 ± 0.536
TrpCys: 0.5 ± 0.283
TrpAsp: 1.301 ± 0.407
TrpGlu: 1.001 ± 0.266
TrpPhe: 0.4 ± 0.161
TrpGly: 1.701 ± 0.465
TrpHis: 0.3 ± 0.163
TrpIle: 1.601 ± 0.493
TrpLys: 0.3 ± 0.173
TrpLeu: 1.601 ± 0.318
TrpMet: 0.7 ± 0.237
TrpAsn: 1.101 ± 0.303
TrpPro: 1.001 ± 0.336
TrpGln: 1.001 ± 0.35
TrpArg: 1.001 ± 0.371
TrpSer: 0.901 ± 0.315
TrpThr: 1.501 ± 0.405
TrpVal: 1.401 ± 0.27
TrpTrp: 0.7 ± 0.332
TrpTyr: 0.6 ± 0.262
TrpXaa: 0.0 ± 0.0
Tyr
TyrAla: 3.002 ± 0.585
TyrCys: 0.4 ± 0.192
TyrAsp: 2.302 ± 0.419
TyrGlu: 2.602 ± 0.511
TyrPhe: 0.901 ± 0.304
TyrGly: 2.402 ± 0.468
TyrHis: 0.4 ± 0.176
TyrIle: 1.101 ± 0.322
TyrLys: 0.801 ± 0.257
TyrLeu: 3.102 ± 0.41
TyrMet: 0.5 ± 0.225
TyrAsn: 1.401 ± 0.419
TyrPro: 0.801 ± 0.196
TyrGln: 1.001 ± 0.285
TyrArg: 1.901 ± 0.459
TyrSer: 1.501 ± 0.326
TyrThr: 1.701 ± 0.339
TyrVal: 2.602 ± 0.573
TyrTrp: 0.6 ± 0.239
TyrTyr: 1.101 ± 0.438
TyrXaa: 0.0 ± 0.0
Xaa
XaaAla: 0.0 ± 0.0
XaaCys: 0.0 ± 0.0
XaaAsp: 0.0 ± 0.0
XaaGlu: 0.0 ± 0.0
XaaPhe: 0.0 ± 0.0
XaaGly: 0.0 ± 0.0
XaaHis: 0.0 ± 0.0
XaaIle: 0.0 ± 0.0
XaaLys: 0.0 ± 0.0
XaaLeu: 0.0 ± 0.0
XaaMet: 0.0 ± 0.0
XaaAsn: 0.0 ± 0.0
XaaPro: 0.0 ± 0.0
XaaGln: 0.0 ± 0.0
XaaArg: 0.0 ± 0.0
XaaSer: 0.0 ± 0.0
XaaThr: 0.0 ± 0.0
XaaVal: 0.0 ± 0.0
XaaTrp: 0.0 ± 0.0
XaaTyr: 0.0 ± 0.0
XaaXaa: 0.0 ± 0.0
Statistics based on 33 proteins (9994 amino acids)

Note: The error was estimated by bootstrapping (100 replicates) at the protein level.
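One way to read "bootstrapping at the protein level" is that whole proteins are resampled with replacement and the statistic is recomputed for each replicate. The Python sketch below illustrates that idea only. The function names, the pooled normalisation, the fixed seed, and the choice to report the standard deviation across replicates as the error are all assumptions; the page does not state which spread measure it reports.

```python
import random
from collections import Counter
from statistics import mean, stdev


def dipeptide_freq(proteins, dipeptide):
    """Per mille frequency of one dipeptide, pooled over the given proteins."""
    counts = Counter()
    for seq in proteins:
        # Overlapping pairs within a single protein.
        counts.update(seq[i:i + 2] for i in range(len(seq) - 1))
    total = sum(counts.values())
    return 1000.0 * counts[dipeptide] / total if total else 0.0


def bootstrap_error(proteins, dipeptide, n_replicates=100, seed=0):
    """Resample whole proteins with replacement; return (mean, std dev)."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    estimates = []
    for _ in range(n_replicates):
        # Each replicate has as many proteins as the original set.
        sample = rng.choices(proteins, k=len(proteins))
        estimates.append(dipeptide_freq(sample, dipeptide))
    return mean(estimates), stdev(estimates)


# Toy input (hypothetical sequences, not taken from this proteome).
toy_proteins = ["MAAL", "AAGK", "MKKL"]
aa_mean, aa_err = bootstrap_error(toy_proteins, "AA")
ww_mean, ww_err = bootstrap_error(toy_proteins, "WW")
```

A dipeptide that never occurs in any protein gives zero in every replicate, so both its mean and its spread are zero, which is consistent with the `0.0 ± 0.0` entries in the table.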

The dipeptide statistics above (among other statistics for this proteome) can be downloaded from this CSV file
See this proteome in: uniprot_link
Proteome-pI is available under a Creative Commons Attribution-NoDerivs license; for more details see here

Reference: Kozlowski LP. Proteome-pI 2.0: Proteome Isoelectric Point Database Update. Nucleic Acids Res. 2021, doi: 10.1093/nar/gkab944
Contact: Lukasz P. Kozlowski