Amino acid dipeptide frequency for Mycobacterium virus KBG

Apart from single amino acid frequencies, one can also calculate the so-called amino acid dipeptide frequency.

There are 400 possible dipeptides (441 if Xaa is counted as an additional 21st amino acid representing all non-standard or unknown amino acids). Thus, if all 400 dipeptides occurred equally often in proteins, each would be present at a frequency of 0.25% (2.5‰). As this is not the case in nature, for better visibility dipeptides that are more frequent than expected are marked in red, and those that are underrepresented are marked in blue in the table.
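The counting described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the database's own code: the function name `dipeptide_per_mille`, the one-letter alphabet with `X` standing in for Xaa, the toy sequences, and the choice to normalise by the number of adjacent pairs counted are all assumptions made here.

```python
from collections import Counter
from itertools import product

# One-letter codes for the 20 standard amino acids; anything else is mapped to X (Xaa).
STANDARD = "ACDEFGHIKLMNPQRSTVWY"


def dipeptide_per_mille(proteins):
    """Return {dipeptide: frequency in per mille} for all 441 possible pairs."""
    counts = Counter()
    for seq in proteins:
        # Treat non-standard or unknown residues as X, mirroring the Xaa category.
        seq = "".join(c if c in STANDARD else "X" for c in seq.upper())
        # Adjacent pairs are counted within each protein, never across proteins.
        counts.update(a + b for a, b in zip(seq, seq[1:]))
    total = sum(counts.values())  # assumed normalisation: number of pairs counted
    alphabet = STANDARD + "X"
    return {
        a + b: (1000.0 * counts[a + b] / total if total else 0.0)
        for a, b in product(alphabet, repeat=2)
    }


# Hypothetical toy sequences, not taken from the proteome.
freqs = dipeptide_per_mille(["MAAG", "AGA"])
uniform = 1000.0 / 400  # 2.5 per mille if all 400 standard dipeptides were equally likely
over = sorted(dp for dp, f in freqs.items() if f > uniform)
```

With real data, `over` would correspond to the dipeptides marked in red and the remaining non-zero entries below `uniform` to those marked in blue.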

All values are given in per mille (‰), so they must be multiplied by 10⁻³ to obtain fractions.
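As a small worked example of the unit conversion, the sketch below turns the AlaAla value reported in the table (13.465‰) into a plain fraction and compares it with the uniform 2.5‰ expectation. The helper name `per_mille_to_fraction` is hypothetical.

```python
def per_mille_to_fraction(value_per_mille):
    """Convert a per mille value to a fraction by multiplying by 10^-3."""
    return value_per_mille * 1e-3


ala_ala = 13.465              # AlaAla frequency from the table, in per mille
uniform = 1000.0 / 400        # 2.5 per mille for 400 equally likely dipeptides

fraction = per_mille_to_fraction(ala_ala)  # about 0.013465, i.e. about 1.35%
fold = ala_ala / uniform                   # roughly 5.4 times the uniform expectation
```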

For more information, see the sequence space article on Wikipedia.

Columns (second residue): Ala Cys Asp Glu Phe Gly His Ile Lys Leu Met Asn Pro Gln Arg Ser Thr Val Trp Tyr Xaa
Ala
AlaAla: 13.465 ± 1.242
AlaCys: 0.728 ± 0.19
AlaAsp: 6.854 ± 0.676
AlaGlu: 6.49 ± 0.722
AlaPhe: 3.033 ± 0.491
AlaGly: 7.642 ± 0.794
AlaHis: 1.213 ± 0.294
AlaIle: 3.882 ± 0.576
AlaLys: 4.185 ± 0.467
AlaLeu: 8.734 ± 0.806
AlaMet: 2.366 ± 0.446
AlaAsn: 2.729 ± 0.415
AlaPro: 5.277 ± 0.718
AlaGln: 2.729 ± 0.378
AlaArg: 6.733 ± 0.571
AlaSer: 4.852 ± 0.574
AlaThr: 5.156 ± 0.578
AlaVal: 7.946 ± 0.679
AlaTrp: 1.759 ± 0.315
AlaTyr: 2.79 ± 0.375
AlaXaa: 0.0 ± 0.0
Cys
CysAla: 0.546 ± 0.191
CysCys: 0.061 ± 0.061
CysAsp: 0.425 ± 0.16
CysGlu: 0.789 ± 0.278
CysPhe: 0.121 ± 0.079
CysGly: 0.728 ± 0.239
CysHis: 0.121 ± 0.097
CysIle: 0.303 ± 0.145
CysLys: 0.364 ± 0.164
CysLeu: 0.485 ± 0.162
CysMet: 0.121 ± 0.103
CysAsn: 0.425 ± 0.167
CysPro: 0.364 ± 0.139
CysGln: 0.546 ± 0.209
CysArg: 0.546 ± 0.173
CysSer: 0.364 ± 0.145
CysThr: 0.243 ± 0.128
CysVal: 0.303 ± 0.13
CysTrp: 0.182 ± 0.107
CysTyr: 0.121 ± 0.083
CysXaa: 0.0 ± 0.0
Asp
AspAla: 6.369 ± 0.633
AspCys: 0.485 ± 0.161
AspAsp: 4.67 ± 0.49
AspGlu: 3.639 ± 0.488
AspPhe: 2.547 ± 0.362
AspGly: 6.065 ± 0.638
AspHis: 1.274 ± 0.333
AspIle: 2.608 ± 0.442
AspLys: 2.608 ± 0.377
AspLeu: 7.218 ± 0.735
AspMet: 0.91 ± 0.17
AspAsn: 1.698 ± 0.309
AspPro: 5.034 ± 0.574
AspGln: 1.577 ± 0.331
AspArg: 3.882 ± 0.47
AspSer: 3.275 ± 0.471
AspThr: 3.943 ± 0.427
AspVal: 4.367 ± 0.67
AspTrp: 1.82 ± 0.321
AspTyr: 2.062 ± 0.345
AspXaa: 0.0 ± 0.0
Glu
GluAla: 6.005 ± 0.701
GluCys: 0.121 ± 0.083
GluAsp: 5.156 ± 0.609
GluGlu: 4.913 ± 0.566
GluPhe: 1.638 ± 0.293
GluGly: 4.731 ± 0.555
GluHis: 1.516 ± 0.348
GluIle: 3.639 ± 0.517
GluLys: 2.547 ± 0.414
GluLeu: 6.672 ± 0.623
GluMet: 1.395 ± 0.289
GluAsn: 1.88 ± 0.329
GluPro: 2.608 ± 0.381
GluGln: 2.911 ± 0.42
GluArg: 3.943 ± 0.493
GluSer: 3.457 ± 0.437
GluThr: 3.397 ± 0.486
GluVal: 5.459 ± 0.603
GluTrp: 1.456 ± 0.33
GluTyr: 2.608 ± 0.441
GluXaa: 0.0 ± 0.0
Phe
PheAla: 2.244 ± 0.318
PheCys: 0.364 ± 0.187
PheAsp: 2.669 ± 0.349
PheGlu: 1.759 ± 0.307
PhePhe: 0.546 ± 0.17
PheGly: 3.579 ± 0.475
PheHis: 0.546 ± 0.235
PheIle: 1.516 ± 0.282
PheLys: 1.274 ± 0.284
PheLeu: 2.547 ± 0.436
PheMet: 0.607 ± 0.204
PheAsn: 1.274 ± 0.268
PhePro: 1.456 ± 0.327
PheGln: 0.789 ± 0.186
PheArg: 1.516 ± 0.316
PheSer: 2.184 ± 0.422
PheThr: 1.88 ± 0.352
PheVal: 2.062 ± 0.393
PheTrp: 0.546 ± 0.186
PheTyr: 0.849 ± 0.207
PheXaa: 0.0 ± 0.0
Gly
GlyAla: 7.46 ± 1.308
GlyCys: 0.728 ± 0.206
GlyAsp: 5.823 ± 0.467
GlyGlu: 4.913 ± 0.499
GlyPhe: 2.911 ± 0.389
GlyGly: 9.644 ± 2.954
GlyHis: 2.305 ± 0.379
GlyIle: 4.246 ± 0.683
GlyLys: 3.761 ± 0.525
GlyLeu: 7.642 ± 0.841
GlyMet: 2.123 ± 0.342
GlyAsn: 3.336 ± 0.42
GlyPro: 4.124 ± 0.569
GlyGln: 2.184 ± 0.288
GlyArg: 5.156 ± 0.552
GlySer: 5.701 ± 0.829
GlyThr: 4.852 ± 0.495
GlyVal: 5.156 ± 0.554
GlyTrp: 2.184 ± 0.39
GlyTyr: 2.729 ± 0.361
GlyXaa: 0.0 ± 0.0
His
HisAla: 1.88 ± 0.353
HisCys: 0.243 ± 0.146
HisAsp: 1.395 ± 0.282
HisGlu: 1.516 ± 0.323
HisPhe: 0.728 ± 0.209
HisGly: 1.759 ± 0.33
HisHis: 0.728 ± 0.205
HisIle: 0.849 ± 0.19
HisLys: 0.91 ± 0.276
HisLeu: 1.638 ± 0.355
HisMet: 0.061 ± 0.06
HisAsn: 0.364 ± 0.172
HisPro: 1.213 ± 0.23
HisGln: 1.031 ± 0.247
HisArg: 1.88 ± 0.36
HisSer: 0.667 ± 0.184
HisThr: 0.91 ± 0.247
HisVal: 1.395 ± 0.307
HisTrp: 0.607 ± 0.172
HisTyr: 0.667 ± 0.207
HisXaa: 0.0 ± 0.0
Ile
IleAla: 6.369 ± 0.742
IleCys: 0.303 ± 0.145
IleAsp: 3.154 ± 0.395
IleGlu: 4.003 ± 0.396
IlePhe: 0.728 ± 0.249
IleGly: 3.215 ± 0.46
IleHis: 1.152 ± 0.271
IleIle: 1.638 ± 0.293
IleLys: 1.577 ± 0.322
IleLeu: 3.579 ± 0.472
IleMet: 0.667 ± 0.159
IleAsn: 1.941 ± 0.336
IlePro: 3.397 ± 0.4
IleGln: 1.698 ± 0.401
IleArg: 3.7 ± 0.487
IleSer: 3.275 ± 0.478
IleThr: 3.336 ± 0.389
IleVal: 2.972 ± 0.594
IleTrp: 0.728 ± 0.212
IleTyr: 1.456 ± 0.263
IleXaa: 0.0 ± 0.0
Lys
LysAla: 3.943 ± 0.519
LysCys: 0.364 ± 0.139
LysAsp: 2.608 ± 0.395
LysGlu: 2.062 ± 0.338
LysPhe: 1.456 ± 0.287
LysGly: 2.487 ± 0.371
LysHis: 1.274 ± 0.327
LysIle: 2.669 ± 0.47
LysLys: 2.062 ± 0.449
LysLeu: 3.215 ± 0.426
LysMet: 1.274 ± 0.249
LysAsn: 1.274 ± 0.271
LysPro: 2.487 ± 0.36
LysGln: 1.334 ± 0.271
LysArg: 3.154 ± 0.536
LysSer: 2.244 ± 0.416
LysThr: 2.184 ± 0.379
LysVal: 3.397 ± 0.442
LysTrp: 0.849 ± 0.2
LysTyr: 1.274 ± 0.351
LysXaa: 0.0 ± 0.0
Leu
LeuAla: 8.734 ± 0.868
LeuCys: 0.243 ± 0.116
LeuAsp: 5.762 ± 0.667
LeuGlu: 5.52 ± 0.596
LeuPhe: 2.002 ± 0.404
LeuGly: 7.885 ± 0.839
LeuHis: 1.638 ± 0.325
LeuIle: 4.549 ± 0.571
LeuLys: 4.064 ± 0.494
LeuLeu: 5.216 ± 0.574
LeuMet: 1.941 ± 0.346
LeuAsn: 3.093 ± 0.449
LeuPro: 5.52 ± 0.641
LeuGln: 2.487 ± 0.401
LeuArg: 5.944 ± 0.581
LeuSer: 5.762 ± 0.502
LeuThr: 6.369 ± 0.529
LeuVal: 4.792 ± 0.623
LeuTrp: 1.092 ± 0.333
LeuTyr: 2.608 ± 0.39
LeuXaa: 0.0 ± 0.0
Met
MetAla: 2.062 ± 0.316
MetCys: 0.0 ± 0.0
MetAsp: 1.031 ± 0.246
MetGlu: 1.334 ± 0.302
MetPhe: 0.667 ± 0.2
MetGly: 1.152 ± 0.292
MetHis: 0.243 ± 0.138
MetIle: 0.789 ± 0.217
MetLys: 1.031 ± 0.261
MetLeu: 1.456 ± 0.313
MetMet: 0.303 ± 0.122
MetAsn: 1.152 ± 0.221
MetPro: 1.031 ± 0.218
MetGln: 0.789 ± 0.192
MetArg: 1.152 ± 0.278
MetSer: 2.487 ± 0.405
MetThr: 2.062 ± 0.294
MetVal: 1.334 ± 0.318
MetTrp: 0.243 ± 0.112
MetTyr: 0.364 ± 0.15
MetXaa: 0.0 ± 0.0
Asn
AsnAla: 3.033 ± 0.529
AsnCys: 0.061 ± 0.062
AsnAsp: 1.698 ± 0.309
AsnGlu: 1.638 ± 0.318
AsnPhe: 0.91 ± 0.26
AsnGly: 3.518 ± 0.575
AsnHis: 0.607 ± 0.175
AsnIle: 1.759 ± 0.315
AsnLys: 0.667 ± 0.252
AsnLeu: 2.487 ± 0.381
AsnMet: 0.485 ± 0.159
AsnAsn: 0.91 ± 0.249
AsnPro: 3.154 ± 0.329
AsnGln: 1.213 ± 0.276
AsnArg: 1.395 ± 0.351
AsnSer: 1.941 ± 0.476
AsnThr: 2.062 ± 0.385
AsnVal: 2.547 ± 0.457
AsnTrp: 0.667 ± 0.187
AsnTyr: 1.152 ± 0.27
AsnXaa: 0.0 ± 0.0
Pro
ProAla: 5.034 ± 0.665
ProCys: 0.849 ± 0.22
ProAsp: 4.61 ± 0.53
ProGlu: 4.852 ± 0.698
ProPhe: 2.184 ± 0.395
ProGly: 5.459 ± 0.637
ProHis: 0.849 ± 0.245
ProIle: 2.062 ± 0.331
ProLys: 2.123 ± 0.339
ProLeu: 4.124 ± 0.535
ProMet: 1.092 ± 0.297
ProAsn: 1.456 ± 0.315
ProPro: 2.729 ± 0.457
ProGln: 1.577 ± 0.298
ProArg: 2.669 ± 0.426
ProSer: 3.336 ± 0.431
ProThr: 4.246 ± 0.525
ProVal: 3.882 ± 0.511
ProTrp: 0.789 ± 0.294
ProTyr: 1.516 ± 0.384
ProXaa: 0.0 ± 0.0
Gln
GlnAla: 3.093 ± 0.415
GlnCys: 0.121 ± 0.092
GlnAsp: 1.152 ± 0.296
GlnGlu: 1.395 ± 0.262
GlnPhe: 0.91 ± 0.189
GlnGly: 2.608 ± 0.38
GlnHis: 0.485 ± 0.144
GlnIle: 2.79 ± 0.498
GlnLys: 1.092 ± 0.259
GlnLeu: 3.943 ± 0.468
GlnMet: 1.092 ± 0.265
GlnAsn: 0.364 ± 0.154
GlnPro: 1.638 ± 0.316
GlnGln: 1.82 ± 0.404
GlnArg: 2.123 ± 0.393
GlnSer: 1.759 ± 0.279
GlnThr: 1.88 ± 0.333
GlnVal: 2.669 ± 0.348
GlnTrp: 0.728 ± 0.19
GlnTyr: 0.789 ± 0.213
GlnXaa: 0.0 ± 0.0
Arg
ArgAla: 5.762 ± 0.667
ArgCys: 0.849 ± 0.272
ArgAsp: 3.275 ± 0.447
ArgGlu: 4.549 ± 0.616
ArgPhe: 1.88 ± 0.391
ArgGly: 4.792 ± 0.636
ArgHis: 1.031 ± 0.245
ArgIle: 3.275 ± 0.475
ArgLys: 3.518 ± 0.62
ArgLeu: 6.247 ± 0.685
ArgMet: 1.759 ± 0.358
ArgAsn: 2.305 ± 0.42
ArgPro: 2.487 ± 0.4
ArgGln: 1.941 ± 0.382
ArgArg: 5.883 ± 0.736
ArgSer: 3.943 ± 0.541
ArgThr: 3.154 ± 0.385
ArgVal: 5.095 ± 0.592
ArgTrp: 1.213 ± 0.287
ArgTyr: 1.88 ± 0.281
ArgXaa: 0.0 ± 0.0
Ser
SerAla: 5.762 ± 0.613
SerCys: 0.425 ± 0.189
SerAsp: 3.639 ± 0.473
SerGlu: 4.124 ± 0.52
SerPhe: 1.941 ± 0.397
SerGly: 6.49 ± 0.808
SerHis: 1.638 ± 0.358
SerIle: 2.669 ± 0.395
SerLys: 2.366 ± 0.347
SerLeu: 5.034 ± 0.637
SerMet: 1.456 ± 0.341
SerAsn: 2.123 ± 0.405
SerPro: 3.093 ± 0.461
SerGln: 2.002 ± 0.309
SerArg: 2.972 ± 0.404
SerSer: 3.518 ± 0.686
SerThr: 3.518 ± 0.512
SerVal: 3.761 ± 0.44
SerTrp: 1.516 ± 0.407
SerTyr: 1.516 ± 0.376
SerXaa: 0.0 ± 0.0
Thr
ThrAla: 6.551 ± 0.881
ThrCys: 0.364 ± 0.175
ThrAsp: 4.064 ± 0.552
ThrGlu: 4.003 ± 0.533
ThrPhe: 2.547 ± 0.409
ThrGly: 6.126 ± 0.597
ThrHis: 1.092 ± 0.277
ThrIle: 3.154 ± 0.498
ThrLys: 2.608 ± 0.365
ThrLeu: 5.459 ± 0.559
ThrMet: 0.789 ± 0.208
ThrAsn: 2.002 ± 0.361
ThrPro: 4.003 ± 0.534
ThrGln: 1.638 ± 0.331
ThrArg: 3.336 ± 0.545
ThrSer: 3.215 ± 0.511
ThrThr: 4.488 ± 0.496
ThrVal: 5.156 ± 0.605
ThrTrp: 1.092 ± 0.224
ThrTyr: 1.759 ± 0.383
ThrXaa: 0.0 ± 0.0
Val
ValAla: 6.551 ± 0.603
ValCys: 0.364 ± 0.172
ValAsp: 5.398 ± 0.615
ValGlu: 4.974 ± 0.508
ValPhe: 2.062 ± 0.342
ValGly: 4.792 ± 0.773
ValHis: 1.456 ± 0.292
ValIle: 4.003 ± 0.515
ValLys: 3.275 ± 0.479
ValLeu: 5.156 ± 0.574
ValMet: 1.152 ± 0.315
ValAsn: 2.002 ± 0.335
ValPro: 3.761 ± 0.429
ValGln: 2.244 ± 0.386
ValArg: 4.974 ± 0.679
ValSer: 4.852 ± 0.565
ValThr: 5.883 ± 0.647
ValVal: 5.58 ± 0.748
ValTrp: 1.334 ± 0.284
ValTyr: 1.88 ± 0.343
ValXaa: 0.0 ± 0.0
Trp
TrpAla: 1.274 ± 0.3
TrpCys: 0.303 ± 0.145
TrpAsp: 1.516 ± 0.323
TrpGlu: 1.092 ± 0.231
TrpPhe: 0.728 ± 0.209
TrpGly: 1.638 ± 0.257
TrpHis: 0.364 ± 0.151
TrpIle: 1.152 ± 0.235
TrpLys: 0.243 ± 0.122
TrpLeu: 1.759 ± 0.282
TrpMet: 0.425 ± 0.168
TrpAsn: 0.546 ± 0.224
TrpPro: 0.91 ± 0.267
TrpGln: 0.849 ± 0.201
TrpArg: 1.456 ± 0.329
TrpSer: 1.092 ± 0.292
TrpThr: 1.698 ± 0.373
TrpVal: 1.941 ± 0.28
TrpTrp: 0.789 ± 0.283
TrpTyr: 0.243 ± 0.105
TrpXaa: 0.0 ± 0.0
Tyr
TyrAla: 2.426 ± 0.366
TyrCys: 0.182 ± 0.11
TyrAsp: 1.152 ± 0.269
TyrGlu: 2.487 ± 0.304
TyrPhe: 0.667 ± 0.183
TyrGly: 2.669 ± 0.42
TyrHis: 0.91 ± 0.289
TyrIle: 1.334 ± 0.307
TyrLys: 1.334 ± 0.303
TyrLeu: 2.608 ± 0.427
TyrMet: 0.607 ± 0.186
TyrAsn: 1.031 ± 0.267
TyrPro: 1.152 ± 0.227
TyrGln: 1.092 ± 0.218
TyrArg: 2.487 ± 0.425
TyrSer: 1.638 ± 0.28
TyrThr: 2.184 ± 0.419
TyrVal: 1.88 ± 0.332
TyrTrp: 0.425 ± 0.195
TyrTyr: 0.546 ± 0.19
TyrXaa: 0.0 ± 0.0
Xaa
XaaAla: 0.0 ± 0.0
XaaCys: 0.0 ± 0.0
XaaAsp: 0.0 ± 0.0
XaaGlu: 0.0 ± 0.0
XaaPhe: 0.0 ± 0.0
XaaGly: 0.0 ± 0.0
XaaHis: 0.0 ± 0.0
XaaIle: 0.0 ± 0.0
XaaLys: 0.0 ± 0.0
XaaLeu: 0.0 ± 0.0
XaaMet: 0.0 ± 0.0
XaaAsn: 0.0 ± 0.0
XaaPro: 0.0 ± 0.0
XaaGln: 0.0 ± 0.0
XaaArg: 0.0 ± 0.0
XaaSer: 0.0 ± 0.0
XaaThr: 0.0 ± 0.0
XaaVal: 0.0 ± 0.0
XaaTrp: 0.0 ± 0.0
XaaTyr: 0.0 ± 0.0
XaaXaa: 0.0 ± 0.0
Statistics based on 89 proteins (16488 amino acids)

Note: The error was estimated by bootstrapping (100 replicates) at the protein level.
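Protein-level bootstrapping of this kind can be sketched as follows. This is a minimal illustration, not the database's actual procedure: it assumes that whole proteins are resampled with replacement, that the frequency is recomputed for each of the 100 replicates, and that the reported error is the standard deviation of those replicate values. The function names, the fixed seed, and the toy sequences are hypothetical.

```python
import random
import statistics


def dipeptide_freq(proteins, dipeptide):
    """Frequency (per mille) of one dipeptide over all adjacent residue pairs."""
    hits = 0
    total = 0
    for seq in proteins:
        for a, b in zip(seq, seq[1:]):
            total += 1
            if a + b == dipeptide:
                hits += 1
    return 1000.0 * hits / total if total else 0.0


def bootstrap_error(proteins, dipeptide, n_boot=100, seed=0):
    """Standard deviation of the frequency over protein-level bootstrap replicates."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    estimates = []
    for _ in range(n_boot):
        # Resample whole proteins with replacement, keeping the proteome size fixed.
        sample = rng.choices(proteins, k=len(proteins))
        estimates.append(dipeptide_freq(sample, dipeptide))
    return statistics.stdev(estimates)


# Hypothetical toy proteome, not taken from the database.
toy = ["MAAG", "AGA", "GGG"]
error = bootstrap_error(toy, "AG")
```

Resampling at the protein level rather than the residue level keeps each sequence intact, so the estimate reflects variation between proteins.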

The dipeptide statistics above (along with other statistics for this proteome) can be downloaded from this CSV file
See this proteome in: uniprot_link
Proteome-pI is available under a Creative Commons Attribution-NoDerivs license; for more details see here

Reference: Kozlowski LP. Proteome-pI 2.0: Proteome Isoelectric Point Database Update. Nucleic Acids Res. 2021, doi: 10.1093/nar/gkab944
Contact: Lukasz P. Kozlowski