Amino acid dipeptide frequency for Mycobacterium virus Kugel

Apart from single amino acid frequencies, one can also calculate the so-called amino acid dipeptide frequency, i.e. how often each ordered pair of adjacent amino acids occurs in the proteome.
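A minimal sketch of how such a frequency could be computed is given here. It assumes that proteins are supplied as one-letter-code strings, that overlapping pairs are counted, and that pairs never span two proteins; the function name and the toy sequences are illustrative and are not taken from Proteome-pI.

```python
from collections import Counter


def dipeptide_frequencies(proteins):
    """Return {dipeptide: frequency in per mille} over all given proteins.

    Assumption: overlapping windows of length 2 are counted within each
    protein, and the denominator is the total number of such windows.
    """
    counts = Counter()
    for seq in proteins:
        for i in range(len(seq) - 1):
            counts[seq[i:i + 2]] += 1
    total = sum(counts.values())
    if total == 0:
        return {}
    return {pair: 1000.0 * n / total for pair, n in counts.items()}


# Toy input (hypothetical sequences, not the Kugel proteome).
freqs = dipeptide_frequencies(["MAAG", "AAL"])
print(freqs["AA"])  # 2 of the 5 counted pairs are "AA"
```

The table on this page uses three-letter codes (for example AlaAla), so a mapping from one-letter to three-letter codes would be needed to reproduce its labels.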

There are 400 possible dipeptides (441 if we consider Xaa as an additional 21st amino acid representing all non-standard or unknown amino acids). Thus, if all dipeptides occurred in proteins with equal probability, each one would be present with a frequency of 0.25% (2.5‰). As this is not the case in nature, for better visibility all dipeptides that are more frequent than expected are marked in red, and those that are underrepresented are marked in blue in the table.
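The uniform expectation can be checked with a few lines of arithmetic. The sketch below assumes that "expected" means the uniform baseline of 1/400 and ignores the error estimates; whether the colouring on the original page uses exactly this rule is an assumption. The three sample values are copied from the table.

```python
def expected_uniform_per_mille(n_residue_types=20):
    """Per-mille frequency of each dipeptide if all were equally likely."""
    return 1000.0 / (n_residue_types ** 2)


def classify(observed_per_mille, expected_per_mille):
    """Label a dipeptide relative to the expected frequency."""
    if observed_per_mille > expected_per_mille:
        return "over"
    if observed_per_mille < expected_per_mille:
        return "under"
    return "as expected"


expected = expected_uniform_per_mille()          # 20 x 20 = 400 dipeptides
expected_with_xaa = expected_uniform_per_mille(21)  # 21 x 21 = 441 dipeptides

# Observed values (per mille) taken from the table for this proteome.
observed = {"AlaAla": 13.351, "CysCys": 0.0, "AspLys": 2.477}
labels = {name: classify(value, expected) for name, value in observed.items()}
print(expected, labels)
```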

All values are presented in per mille (‰) and therefore need to be multiplied by 10⁻³ to obtain fractions.
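As a small worked example of this unit conversion, the helper functions below (hypothetical names) turn a per-mille value into a fraction or a percentage, using the AlaAla value from the table.

```python
def per_mille_to_fraction(value):
    """Convert a per-mille value to a plain fraction (multiply by 10^-3)."""
    return value * 1e-3


def per_mille_to_percent(value):
    """Convert a per-mille value to percent (1 per mille = 0.1 percent)."""
    return value * 0.1


ala_ala = 13.351  # AlaAla frequency from the table, in per mille
print(per_mille_to_fraction(ala_ala), per_mille_to_percent(ala_ala))
```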

For more information, see the sequence space article on Wikipedia.

Ala Cys Asp Glu Phe Gly His Ile Lys Leu Met Asn Pro Gln Arg Ser Thr Val Trp Tyr Xaa
Ala
AlaAla: 13.351 ± 1.468
AlaCys: 0.544 ± 0.175
AlaAsp: 6.766 ± 0.611
AlaGlu: 6.827 ± 0.792
AlaPhe: 3.443 ± 0.501
AlaGly: 8.337 ± 0.822
AlaHis: 1.812 ± 0.365
AlaIle: 4.289 ± 0.54
AlaLys: 4.289 ± 0.564
AlaLeu: 8.82 ± 0.905
AlaMet: 2.477 ± 0.368
AlaAsn: 2.598 ± 0.478
AlaPro: 4.591 ± 0.641
AlaGln: 2.839 ± 0.44
AlaArg: 5.92 ± 0.615
AlaSer: 5.135 ± 0.549
AlaThr: 5.558 ± 0.581
AlaVal: 8.035 ± 0.795
AlaTrp: 1.571 ± 0.278
AlaTyr: 2.9 ± 0.401
AlaXaa: 0.0 ± 0.0
Cys
CysAla: 0.967 ± 0.31
CysCys: 0.0 ± 0.0
CysAsp: 0.483 ± 0.181
CysGlu: 0.665 ± 0.188
CysPhe: 0.302 ± 0.157
CysGly: 0.665 ± 0.216
CysHis: 0.181 ± 0.103
CysIle: 0.362 ± 0.146
CysLys: 0.242 ± 0.123
CysLeu: 0.483 ± 0.183
CysMet: 0.121 ± 0.09
CysAsn: 0.242 ± 0.104
CysPro: 0.423 ± 0.17
CysGln: 0.242 ± 0.123
CysArg: 0.362 ± 0.138
CysSer: 0.604 ± 0.187
CysThr: 0.423 ± 0.165
CysVal: 0.423 ± 0.149
CysTrp: 0.242 ± 0.11
CysTyr: 0.121 ± 0.091
CysXaa: 0.0 ± 0.0
Asp
AspAla: 6.706 ± 0.813
AspCys: 0.665 ± 0.21
AspAsp: 4.712 ± 0.487
AspGlu: 3.383 ± 0.494
AspPhe: 2.537 ± 0.32
AspGly: 6.222 ± 0.613
AspHis: 1.329 ± 0.294
AspIle: 2.598 ± 0.39
AspLys: 2.477 ± 0.534
AspLeu: 6.343 ± 0.713
AspMet: 1.269 ± 0.267
AspAsn: 1.812 ± 0.379
AspPro: 5.075 ± 0.607
AspGln: 1.752 ± 0.348
AspArg: 3.383 ± 0.424
AspSer: 2.839 ± 0.395
AspThr: 3.262 ± 0.339
AspVal: 4.893 ± 0.581
AspTrp: 1.873 ± 0.341
AspTyr: 2.416 ± 0.345
AspXaa: 0.0 ± 0.0
Glu
GluAla: 6.041 ± 0.689
GluCys: 0.544 ± 0.199
GluAsp: 5.135 ± 0.638
GluGlu: 5.135 ± 0.698
GluPhe: 1.873 ± 0.321
GluGly: 4.712 ± 0.596
GluHis: 1.329 ± 0.354
GluIle: 3.323 ± 0.389
GluLys: 2.537 ± 0.414
GluLeu: 6.766 ± 0.643
GluMet: 1.752 ± 0.305
GluAsn: 1.389 ± 0.274
GluPro: 2.779 ± 0.478
GluGln: 2.779 ± 0.487
GluArg: 3.806 ± 0.559
GluSer: 3.081 ± 0.366
GluThr: 3.383 ± 0.498
GluVal: 5.558 ± 0.721
GluTrp: 1.631 ± 0.346
GluTyr: 2.719 ± 0.487
GluXaa: 0.0 ± 0.0
Phe
PheAla: 2.416 ± 0.424
PheCys: 0.242 ± 0.141
PheAsp: 3.021 ± 0.372
PheGlu: 2.114 ± 0.343
PhePhe: 0.544 ± 0.19
PheGly: 3.564 ± 0.524
PheHis: 0.725 ± 0.228
PheIle: 1.269 ± 0.229
PheLys: 1.329 ± 0.277
PheLeu: 2.114 ± 0.373
PheMet: 0.544 ± 0.173
PheAsn: 0.967 ± 0.294
PhePro: 1.571 ± 0.264
PheGln: 0.785 ± 0.176
PheArg: 1.994 ± 0.415
PheSer: 1.812 ± 0.333
PheThr: 2.356 ± 0.367
PheVal: 2.235 ± 0.451
PheTrp: 0.604 ± 0.164
PheTyr: 0.967 ± 0.207
PheXaa: 0.0 ± 0.0
Gly
GlyAla: 7.612 ± 0.831
GlyCys: 0.544 ± 0.197
GlyAsp: 5.618 ± 0.594
GlyGlu: 4.893 ± 0.642
GlyPhe: 2.719 ± 0.471
GlyGly: 10.21 ± 2.188
GlyHis: 1.752 ± 0.281
GlyIle: 4.531 ± 0.739
GlyLys: 3.685 ± 0.528
GlyLeu: 7.249 ± 0.787
GlyMet: 1.994 ± 0.318
GlyAsn: 3.323 ± 0.492
GlyPro: 4.108 ± 0.681
GlyGln: 2.296 ± 0.338
GlyArg: 4.833 ± 0.488
GlySer: 6.041 ± 0.754
GlyThr: 5.437 ± 0.729
GlyVal: 5.135 ± 0.655
GlyTrp: 2.477 ± 0.349
GlyTyr: 3.021 ± 0.442
GlyXaa: 0.0 ± 0.0
His
HisAla: 1.812 ± 0.391
HisCys: 0.121 ± 0.128
HisAsp: 1.027 ± 0.212
HisGlu: 1.873 ± 0.421
HisPhe: 0.544 ± 0.203
HisGly: 1.752 ± 0.375
HisHis: 0.604 ± 0.212
HisIle: 0.906 ± 0.201
HisLys: 1.027 ± 0.322
HisLeu: 1.571 ± 0.433
HisMet: 0.121 ± 0.079
HisAsn: 0.362 ± 0.181
HisPro: 1.208 ± 0.246
HisGln: 0.785 ± 0.219
HisArg: 1.873 ± 0.377
HisSer: 0.906 ± 0.226
HisThr: 1.269 ± 0.319
HisVal: 1.692 ± 0.321
HisTrp: 0.665 ± 0.193
HisTyr: 0.785 ± 0.242
HisXaa: 0.0 ± 0.0
Ile
IleAla: 5.497 ± 0.777
IleCys: 0.302 ± 0.125
IleAsp: 3.323 ± 0.371
IleGlu: 3.866 ± 0.462
IlePhe: 0.906 ± 0.223
IleGly: 3.685 ± 0.513
IleHis: 0.906 ± 0.212
IleIle: 1.752 ± 0.3
IleLys: 1.692 ± 0.34
IleLeu: 3.625 ± 0.448
IleMet: 0.906 ± 0.235
IleAsn: 1.692 ± 0.277
IlePro: 3.383 ± 0.459
IleGln: 1.208 ± 0.303
IleArg: 3.081 ± 0.481
IleSer: 3.141 ± 0.497
IleThr: 3.504 ± 0.372
IleVal: 2.9 ± 0.561
IleTrp: 0.725 ± 0.181
IleTyr: 1.752 ± 0.294
IleXaa: 0.0 ± 0.0
Lys
LysAla: 3.504 ± 0.528
LysCys: 0.362 ± 0.163
LysAsp: 2.779 ± 0.487
LysGlu: 1.812 ± 0.363
LysPhe: 1.148 ± 0.254
LysGly: 2.779 ± 0.351
LysHis: 0.967 ± 0.237
LysIle: 2.658 ± 0.46
LysLys: 2.114 ± 0.514
LysLeu: 3.323 ± 0.432
LysMet: 0.785 ± 0.223
LysAsn: 1.571 ± 0.333
LysPro: 2.537 ± 0.38
LysGln: 1.571 ± 0.442
LysArg: 2.537 ± 0.467
LysSer: 2.477 ± 0.364
LysThr: 2.296 ± 0.392
LysVal: 3.443 ± 0.489
LysTrp: 0.846 ± 0.242
LysTyr: 0.906 ± 0.244
LysXaa: 0.0 ± 0.0
Leu
LeuAla: 8.941 ± 0.872
LeuCys: 0.483 ± 0.184
LeuAsp: 6.283 ± 0.559
LeuGlu: 5.316 ± 0.576
LeuPhe: 2.175 ± 0.351
LeuGly: 7.249 ± 0.766
LeuHis: 1.389 ± 0.325
LeuIle: 4.47 ± 0.434
LeuLys: 3.323 ± 0.424
LeuLeu: 5.558 ± 0.532
LeuMet: 1.812 ± 0.305
LeuAsn: 3.021 ± 0.428
LeuPro: 5.497 ± 0.58
LeuGln: 2.416 ± 0.532
LeuArg: 5.92 ± 0.56
LeuSer: 5.618 ± 0.604
LeuThr: 5.981 ± 0.545
LeuVal: 4.41 ± 0.644
LeuTrp: 1.269 ± 0.347
LeuTyr: 2.296 ± 0.355
LeuXaa: 0.0 ± 0.0
Met
MetAla: 2.416 ± 0.4
MetCys: 0.121 ± 0.078
MetAsp: 0.967 ± 0.222
MetGlu: 1.45 ± 0.303
MetPhe: 0.665 ± 0.182
MetGly: 1.389 ± 0.258
MetHis: 0.242 ± 0.124
MetIle: 0.665 ± 0.213
MetLys: 1.087 ± 0.258
MetLeu: 1.389 ± 0.326
MetMet: 0.121 ± 0.095
MetAsn: 0.725 ± 0.193
MetPro: 1.087 ± 0.244
MetGln: 0.423 ± 0.145
MetArg: 1.45 ± 0.344
MetSer: 2.779 ± 0.487
MetThr: 2.356 ± 0.339
MetVal: 1.027 ± 0.286
MetTrp: 0.302 ± 0.116
MetTyr: 0.302 ± 0.146
MetXaa: 0.0 ± 0.0
Asn
AsnAla: 3.746 ± 0.525
AsnCys: 0.121 ± 0.082
AsnAsp: 1.933 ± 0.354
AsnGlu: 1.873 ± 0.317
AsnPhe: 0.967 ± 0.276
AsnGly: 3.504 ± 0.468
AsnHis: 0.665 ± 0.202
AsnIle: 1.51 ± 0.289
AsnLys: 0.725 ± 0.246
AsnLeu: 2.114 ± 0.33
AsnMet: 0.483 ± 0.14
AsnAsn: 1.027 ± 0.234
AsnPro: 2.719 ± 0.418
AsnGln: 0.785 ± 0.225
AsnArg: 1.571 ± 0.361
AsnSer: 1.692 ± 0.365
AsnThr: 1.51 ± 0.296
AsnVal: 2.235 ± 0.472
AsnTrp: 0.604 ± 0.157
AsnTyr: 1.208 ± 0.294
AsnXaa: 0.0 ± 0.0
Pro
ProAla: 5.014 ± 0.554
ProCys: 0.604 ± 0.193
ProAsp: 3.625 ± 0.408
ProGlu: 4.712 ± 0.725
ProPhe: 2.175 ± 0.355
ProGly: 4.773 ± 0.533
ProHis: 1.148 ± 0.251
ProIle: 2.175 ± 0.368
ProLys: 1.812 ± 0.262
ProLeu: 4.289 ± 0.532
ProMet: 1.269 ± 0.321
ProAsn: 1.631 ± 0.322
ProPro: 3.323 ± 0.519
ProGln: 1.812 ± 0.4
ProArg: 3.021 ± 0.521
ProSer: 3.987 ± 0.549
ProThr: 3.866 ± 0.567
ProVal: 4.531 ± 0.588
ProTrp: 0.785 ± 0.262
ProTyr: 1.812 ± 0.352
ProXaa: 0.0 ± 0.0
Gln
GlnAla: 2.839 ± 0.427
GlnCys: 0.242 ± 0.113
GlnAsp: 1.45 ± 0.315
GlnGlu: 1.571 ± 0.283
GlnPhe: 1.269 ± 0.271
GlnGly: 2.175 ± 0.29
GlnHis: 0.544 ± 0.143
GlnIle: 2.537 ± 0.555
GlnLys: 1.389 ± 0.285
GlnLeu: 3.564 ± 0.488
GlnMet: 1.027 ± 0.263
GlnAsn: 0.483 ± 0.152
GlnPro: 1.812 ± 0.319
GlnGln: 1.631 ± 0.361
GlnArg: 2.054 ± 0.376
GlnSer: 1.631 ± 0.251
GlnThr: 1.873 ± 0.327
GlnVal: 2.356 ± 0.358
GlnTrp: 0.785 ± 0.182
GlnTyr: 0.483 ± 0.143
GlnXaa: 0.0 ± 0.0
Arg
ArgAla: 5.316 ± 0.642
ArgCys: 0.846 ± 0.304
ArgAsp: 3.141 ± 0.441
ArgGlu: 4.652 ± 0.755
ArgPhe: 1.933 ± 0.382
ArgGly: 5.377 ± 0.585
ArgHis: 1.269 ± 0.258
ArgIle: 2.658 ± 0.368
ArgLys: 3.202 ± 0.475
ArgLeu: 5.135 ± 0.616
ArgMet: 1.812 ± 0.366
ArgAsn: 2.175 ± 0.43
ArgPro: 2.719 ± 0.415
ArgGln: 2.054 ± 0.289
ArgArg: 5.497 ± 0.695
ArgSer: 3.383 ± 0.519
ArgThr: 3.443 ± 0.565
ArgVal: 5.256 ± 0.578
ArgTrp: 1.329 ± 0.306
ArgTyr: 1.631 ± 0.314
ArgXaa: 0.0 ± 0.0
Ser
SerAla: 6.887 ± 0.865
SerCys: 0.665 ± 0.245
SerAsp: 3.202 ± 0.419
SerGlu: 3.564 ± 0.418
SerPhe: 2.054 ± 0.42
SerGly: 5.92 ± 0.694
SerHis: 1.631 ± 0.291
SerIle: 2.779 ± 0.418
SerLys: 2.658 ± 0.462
SerLeu: 5.014 ± 0.549
SerMet: 1.329 ± 0.295
SerAsn: 1.873 ± 0.368
SerPro: 3.202 ± 0.421
SerGln: 1.994 ± 0.323
SerArg: 2.839 ± 0.366
SerSer: 3.746 ± 0.735
SerThr: 3.323 ± 0.395
SerVal: 3.806 ± 0.491
SerTrp: 1.692 ± 0.346
SerTyr: 1.208 ± 0.248
SerXaa: 0.0 ± 0.0
Thr
ThrAla: 5.92 ± 0.675
ThrCys: 0.242 ± 0.119
ThrAsp: 4.289 ± 0.531
ThrGlu: 4.108 ± 0.557
ThrPhe: 2.054 ± 0.383
ThrGly: 6.102 ± 0.768
ThrHis: 1.027 ± 0.33
ThrIle: 2.779 ± 0.539
ThrLys: 2.658 ± 0.423
ThrLeu: 5.86 ± 0.648
ThrMet: 1.087 ± 0.244
ThrAsn: 1.571 ± 0.339
ThrPro: 4.048 ± 0.605
ThrGln: 1.873 ± 0.39
ThrArg: 3.202 ± 0.416
ThrSer: 4.048 ± 0.571
ThrThr: 4.531 ± 0.585
ThrVal: 5.377 ± 0.572
ThrTrp: 1.027 ± 0.252
ThrTyr: 2.235 ± 0.36
ThrXaa: 0.0 ± 0.0
Val
ValAla: 6.887 ± 0.761
ValCys: 0.423 ± 0.172
ValAsp: 5.256 ± 0.53
ValGlu: 5.075 ± 0.544
ValPhe: 2.175 ± 0.35
ValGly: 4.893 ± 0.553
ValHis: 2.235 ± 0.377
ValIle: 3.866 ± 0.518
ValLys: 2.537 ± 0.327
ValLeu: 5.558 ± 0.564
ValMet: 1.087 ± 0.272
ValAsn: 2.719 ± 0.369
ValPro: 4.048 ± 0.521
ValGln: 2.296 ± 0.414
ValArg: 5.256 ± 0.643
ValSer: 4.229 ± 0.525
ValThr: 5.195 ± 0.656
ValVal: 4.591 ± 0.676
ValTrp: 1.51 ± 0.313
ValTyr: 2.175 ± 0.337
ValXaa: 0.0 ± 0.0
Trp
TrpAla: 1.571 ± 0.312
TrpCys: 0.242 ± 0.105
TrpAsp: 1.329 ± 0.329
TrpGlu: 1.208 ± 0.266
TrpPhe: 0.846 ± 0.212
TrpGly: 1.752 ± 0.247
TrpHis: 0.544 ± 0.18
TrpIle: 1.148 ± 0.207
TrpLys: 0.242 ± 0.145
TrpLeu: 1.752 ± 0.318
TrpMet: 0.362 ± 0.158
TrpAsn: 0.483 ± 0.167
TrpPro: 0.665 ± 0.24
TrpGln: 0.967 ± 0.237
TrpArg: 1.329 ± 0.317
TrpSer: 0.967 ± 0.18
TrpThr: 2.296 ± 0.397
TrpVal: 2.175 ± 0.353
TrpTrp: 0.725 ± 0.303
TrpTyr: 0.362 ± 0.14
TrpXaa: 0.0 ± 0.0
Tyr
TyrAla: 2.719 ± 0.36
TyrCys: 0.242 ± 0.14
TyrAsp: 1.208 ± 0.276
TyrGlu: 2.175 ± 0.32
TyrPhe: 0.846 ± 0.218
TyrGly: 2.416 ± 0.34
TyrHis: 0.604 ± 0.197
TyrIle: 1.571 ± 0.37
TyrLys: 1.389 ± 0.3
TyrLeu: 2.9 ± 0.53
TyrMet: 0.544 ± 0.172
TyrAsn: 1.329 ± 0.314
TyrPro: 1.571 ± 0.273
TyrGln: 1.208 ± 0.249
TyrArg: 2.9 ± 0.42
TyrSer: 1.269 ± 0.25
TyrThr: 2.175 ± 0.365
TyrVal: 1.933 ± 0.314
TyrTrp: 0.302 ± 0.135
TyrTyr: 0.725 ± 0.228
TyrXaa: 0.0 ± 0.0
Xaa
XaaAla: 0.0 ± 0.0
XaaCys: 0.0 ± 0.0
XaaAsp: 0.0 ± 0.0
XaaGlu: 0.0 ± 0.0
XaaPhe: 0.0 ± 0.0
XaaGly: 0.0 ± 0.0
XaaHis: 0.0 ± 0.0
XaaIle: 0.0 ± 0.0
XaaLys: 0.0 ± 0.0
XaaLeu: 0.0 ± 0.0
XaaMet: 0.0 ± 0.0
XaaAsn: 0.0 ± 0.0
XaaPro: 0.0 ± 0.0
XaaGln: 0.0 ± 0.0
XaaArg: 0.0 ± 0.0
XaaSer: 0.0 ± 0.0
XaaThr: 0.0 ± 0.0
XaaVal: 0.0 ± 0.0
XaaTrp: 0.0 ± 0.0
XaaTyr: 0.0 ± 0.0
XaaXaa: 0.0 ± 0.0
Statistics based on 95 proteins (16554 amino acids)

Note: The error has been estimated by bootstrapping (×100) at the protein level.
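One way to read this note is sketched below: whole proteins are resampled with replacement 100 times, the dipeptide frequency is recomputed for each resample, and the spread of those estimates is reported as the error. The use of the sample standard deviation, the fixed seed, the function names, and the toy sequences are all assumptions; the exact procedure used by Proteome-pI is not described on this page.

```python
import random
import statistics
from collections import Counter


def pair_counts(proteins):
    """Count overlapping dipeptides within each protein."""
    counts = Counter()
    for seq in proteins:
        counts.update(seq[i:i + 2] for i in range(len(seq) - 1))
    return counts


def bootstrap_error(proteins, pair, n_rounds=100, seed=0):
    """Standard deviation (per mille) of one dipeptide frequency
    across protein-level bootstrap resamples."""
    rng = random.Random(seed)  # seeded only to make the sketch repeatable
    estimates = []
    for _ in range(n_rounds):
        # Resample whole proteins with replacement.
        sample = rng.choices(proteins, k=len(proteins))
        counts = pair_counts(sample)
        total = sum(counts.values())
        estimates.append(1000.0 * counts[pair] / total if total else 0.0)
    return statistics.stdev(estimates)


# Toy input (hypothetical sequences, not the Kugel proteome).
toy_proteins = ["MAAG", "AAL", "MKKL", "GAAA"]
print(bootstrap_error(toy_proteins, "AA"))
```

Resampling at the protein level keeps each protein intact, so the error reflects variation between proteins and not only between individual residue pairs.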

The above dipeptide statistics (among other statistics for this proteome) can be downloaded from this CSV file
See this proteome in: uniprot_link
Proteome-pI is available under the Creative Commons Attribution-NoDerivs license; for more details see here

Reference: Kozlowski LP. Proteome-pI 2.0: Proteome Isoelectric Point Database Update. Nucleic Acids Res. 2021, doi: 10.1093/nar/gkab944
Contact: Lukasz P. Kozlowski