Amino acid dipeptide frequency for Mycobacterium virus Mozy

Apart from single amino acid frequencies, one can also calculate the so-called amino acid dipeptide frequency, i.e. how often each ordered pair of adjacent amino acids occurs.
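As an illustration of how such a frequency can be obtained, here is a minimal Python sketch. It assumes that overlapping adjacent pairs are counted within each protein and that pairs never span two proteins; the exact counting procedure used for this page is not stated, so treat this as one plausible reading. The function name and the toy sequences are hypothetical, and one-letter residue codes are used instead of the three-letter codes in the table.

```python
from collections import Counter


def dipeptide_frequencies(proteins):
    """Return dipeptide frequencies in per mille for a list of sequences.

    Assumption: overlapping adjacent pairs are counted inside each
    protein; no pair is formed across two different proteins.
    """
    counts = Counter()
    for seq in proteins:
        for i in range(len(seq) - 1):
            counts[seq[i:i + 2]] += 1
    total = sum(counts.values())
    if total == 0:
        return {}
    # Scale to per mille so the numbers are comparable to the table
    return {pair: 1000.0 * n / total for pair, n in counts.items()}


# Toy input (hypothetical, not taken from the Mozy proteome)
freqs = dipeptide_frequencies(["MAAG", "AAK"])
```

The two toy sequences contain five dipeptides in total (MA, AA, AG, AA, AK), so the frequencies in `freqs` sum to 1000 per mille.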

There are 400 possible dipeptides (441 if Xaa is counted as an additional 21st amino acid representing all non-standard or unknown amino acids). Thus, if dipeptides occurred randomly and with equal probability in proteins, each dipeptide would be present with a frequency of 0.25% (2.5‰). As this is not the case in nature, for better visibility the dipeptides that occur more often than expected are marked in red in the table, and those that are underrepresented are marked in blue.
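The arithmetic behind the expected value can be sketched in a few lines of Python. This assumes that the comparison is made against the uniform expectation of 1/400; the exact rule used for the colouring is not spelled out on the page, and the function name `classify` is hypothetical. The example values are copied from the table.

```python
# Expected frequency if all 400 standard dipeptides were equally likely
N_STANDARD = 20
expected_permille = 1000.0 / (N_STANDARD ** 2)  # 2.5 per mille, i.e. 0.25%


def classify(observed_permille, expected=expected_permille):
    """Label a dipeptide relative to the uniform expectation (assumed rule)."""
    if observed_permille > expected:
        return "over"
    if observed_permille < expected:
        return "under"
    return "as expected"


# A few values taken from the table (per mille)
examples = {"AlaAla": 12.744, "GlyGly": 11.546, "CysCys": 0.0, "TyrMet": 0.109}
labels = {name: classify(value) for name, value in examples.items()}
```

Under this assumption AlaAla and GlyGly would be labelled as overrepresented, and CysCys and TyrMet as underrepresented.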

All values are given in per mille (‰) and therefore need to be multiplied by 10⁻³ to obtain fractions.

For more information, see the sequence space article on Wikipedia.

Dipeptide frequency table (rows: first residue; columns: second residue)
Ala Cys Asp Glu Phe Gly His Ile Lys Leu Met Asn Pro Gln Arg Ser Thr Val Trp Tyr Xaa
Ala
AlaAla: 12.744 ± 1.544
AlaCys: 1.089 ± 0.278
AlaAsp: 6.753 ± 0.614
AlaGlu: 7.788 ± 0.773
AlaPhe: 2.723 ± 0.435
AlaGly: 9.422 ± 1.165
AlaHis: 2.342 ± 0.419
AlaIle: 4.466 ± 0.584
AlaLys: 3.758 ± 0.479
AlaLeu: 8.115 ± 0.797
AlaMet: 2.723 ± 0.356
AlaAsn: 3.104 ± 0.487
AlaPro: 5.228 ± 0.586
AlaGln: 3.54 ± 0.572
AlaArg: 6.699 ± 0.62
AlaSer: 5.773 ± 0.588
AlaThr: 6.808 ± 0.587
AlaVal: 7.135 ± 0.555
AlaTrp: 2.233 ± 0.335
AlaTyr: 2.342 ± 0.304
AlaXaa: 0.0 ± 0.0
Cys
CysAla: 0.98 ± 0.295
CysCys: 0.0 ± 0.0
CysAsp: 0.98 ± 0.278
CysGlu: 0.871 ± 0.228
CysPhe: 0.272 ± 0.131
CysGly: 1.525 ± 0.32
CysHis: 0.654 ± 0.195
CysIle: 0.436 ± 0.157
CysLys: 0.381 ± 0.155
CysLeu: 0.871 ± 0.252
CysMet: 0.218 ± 0.092
CysAsn: 0.436 ± 0.172
CysPro: 1.144 ± 0.266
CysGln: 0.49 ± 0.173
CysArg: 1.035 ± 0.302
CysSer: 0.708 ± 0.248
CysThr: 0.708 ± 0.19
CysVal: 0.708 ± 0.229
CysTrp: 0.272 ± 0.135
CysTyr: 0.272 ± 0.117
CysXaa: 0.0 ± 0.0
Asp
AspAla: 6.536 ± 0.582
AspCys: 0.762 ± 0.22
AspAsp: 4.03 ± 0.531
AspGlu: 3.377 ± 0.485
AspPhe: 1.362 ± 0.244
AspGly: 6.808 ± 0.605
AspHis: 1.634 ± 0.252
AspIle: 2.723 ± 0.424
AspLys: 1.852 ± 0.295
AspLeu: 5.61 ± 0.593
AspMet: 0.98 ± 0.234
AspAsn: 1.525 ± 0.253
AspPro: 4.303 ± 0.628
AspGln: 2.505 ± 0.32
AspArg: 5.555 ± 0.577
AspSer: 3.431 ± 0.491
AspThr: 3.649 ± 0.408
AspVal: 4.52 ± 0.602
AspTrp: 1.416 ± 0.262
AspTyr: 1.634 ± 0.264
AspXaa: 0.0 ± 0.0
Glu
GluAla: 6.536 ± 0.651
GluCys: 0.871 ± 0.239
GluAsp: 2.887 ± 0.367
GluGlu: 3.322 ± 0.496
GluPhe: 2.342 ± 0.326
GluGly: 3.159 ± 0.421
GluHis: 1.089 ± 0.268
GluIle: 2.56 ± 0.386
GluLys: 1.579 ± 0.285
GluLeu: 5.011 ± 0.693
GluMet: 1.961 ± 0.34
GluAsn: 2.233 ± 0.402
GluPro: 2.342 ± 0.405
GluGln: 2.995 ± 0.405
GluArg: 4.956 ± 0.616
GluSer: 3.268 ± 0.454
GluThr: 4.194 ± 0.584
GluVal: 4.466 ± 0.569
GluTrp: 1.198 ± 0.267
GluTyr: 2.287 ± 0.426
GluXaa: 0.0 ± 0.0
Phe
PheAla: 2.669 ± 0.434
PheCys: 0.327 ± 0.129
PheAsp: 2.233 ± 0.354
PheGlu: 1.525 ± 0.292
PhePhe: 0.871 ± 0.269
PheGly: 3.377 ± 0.661
PheHis: 0.381 ± 0.142
PheIle: 1.198 ± 0.288
PheLys: 1.253 ± 0.336
PheLeu: 1.906 ± 0.259
PheMet: 0.817 ± 0.239
PheAsn: 1.144 ± 0.308
PhePro: 1.471 ± 0.323
PheGln: 0.98 ± 0.26
PheArg: 1.525 ± 0.225
PheSer: 1.852 ± 0.323
PheThr: 2.233 ± 0.307
PheVal: 1.797 ± 0.275
PheTrp: 0.762 ± 0.182
PheTyr: 0.926 ± 0.261
PheXaa: 0.0 ± 0.0
Gly
GlyAla: 9.912 ± 1.267
GlyCys: 1.144 ± 0.264
GlyAsp: 6.209 ± 0.531
GlyGlu: 3.758 ± 0.433
GlyPhe: 2.941 ± 0.414
GlyGly: 11.546 ± 2.912
GlyHis: 2.07 ± 0.366
GlyIle: 4.575 ± 0.526
GlyLys: 2.614 ± 0.347
GlyLeu: 5.555 ± 0.628
GlyMet: 2.233 ± 0.461
GlyAsn: 3.213 ± 0.387
GlyPro: 3.921 ± 0.541
GlyGln: 1.797 ± 0.457
GlyArg: 5.283 ± 0.609
GlySer: 5.936 ± 0.98
GlyThr: 6.536 ± 0.652
GlyVal: 5.555 ± 0.575
GlyTrp: 2.778 ± 0.441
GlyTyr: 2.233 ± 0.35
GlyXaa: 0.0 ± 0.0
His
HisAla: 2.287 ± 0.414
HisCys: 0.599 ± 0.239
HisAsp: 1.144 ± 0.248
HisGlu: 1.362 ± 0.305
HisPhe: 0.381 ± 0.122
HisGly: 1.634 ± 0.316
HisHis: 1.089 ± 0.262
HisIle: 1.471 ± 0.337
HisLys: 0.599 ± 0.209
HisLeu: 1.525 ± 0.324
HisMet: 0.436 ± 0.137
HisAsn: 0.545 ± 0.163
HisPro: 1.253 ± 0.294
HisGln: 0.926 ± 0.217
HisArg: 2.396 ± 0.502
HisSer: 0.654 ± 0.175
HisThr: 1.688 ± 0.358
HisVal: 1.634 ± 0.292
HisTrp: 0.545 ± 0.171
HisTyr: 0.871 ± 0.18
HisXaa: 0.0 ± 0.0
Ile
IleAla: 5.228 ± 0.529
IleCys: 0.817 ± 0.235
IleAsp: 3.54 ± 0.399
IleGlu: 3.322 ± 0.418
IlePhe: 0.817 ± 0.25
IleGly: 3.54 ± 0.491
IleHis: 1.253 ± 0.3
IleIle: 0.98 ± 0.223
IleLys: 1.362 ± 0.279
IleLeu: 2.233 ± 0.412
IleMet: 0.49 ± 0.167
IleAsn: 1.743 ± 0.261
IlePro: 3.649 ± 0.42
IleGln: 1.852 ± 0.292
IleArg: 2.505 ± 0.363
IleSer: 2.287 ± 0.444
IleThr: 3.54 ± 0.469
IleVal: 2.887 ± 0.392
IleTrp: 1.035 ± 0.269
IleTyr: 0.762 ± 0.195
IleXaa: 0.0 ± 0.0
Lys
LysAla: 3.268 ± 0.417
LysCys: 0.545 ± 0.19
LysAsp: 2.015 ± 0.287
LysGlu: 1.144 ± 0.265
LysPhe: 1.035 ± 0.219
LysGly: 2.342 ± 0.361
LysHis: 1.035 ± 0.209
LysIle: 0.98 ± 0.227
LysLys: 1.416 ± 0.273
LysLeu: 2.396 ± 0.429
LysMet: 0.654 ± 0.173
LysAsn: 0.926 ± 0.232
LysPro: 2.669 ± 0.333
LysGln: 1.797 ± 0.256
LysArg: 2.124 ± 0.305
LysSer: 2.233 ± 0.315
LysThr: 2.505 ± 0.489
LysVal: 1.852 ± 0.313
LysTrp: 0.708 ± 0.227
LysTyr: 0.98 ± 0.247
LysXaa: 0.0 ± 0.0
Leu
LeuAla: 7.788 ± 0.797
LeuCys: 0.762 ± 0.224
LeuAsp: 4.629 ± 0.486
LeuGlu: 4.194 ± 0.489
LeuPhe: 1.906 ± 0.244
LeuGly: 5.011 ± 0.569
LeuHis: 0.654 ± 0.179
LeuIle: 3.54 ± 0.381
LeuLys: 2.287 ± 0.444
LeuLeu: 4.629 ± 0.555
LeuMet: 1.307 ± 0.29
LeuAsn: 3.05 ± 0.474
LeuPro: 5.12 ± 0.765
LeuGln: 2.669 ± 0.429
LeuArg: 4.902 ± 0.614
LeuSer: 5.283 ± 0.568
LeuThr: 5.446 ± 0.56
LeuVal: 4.738 ± 0.467
LeuTrp: 1.144 ± 0.253
LeuTyr: 2.124 ± 0.384
LeuXaa: 0.0 ± 0.0
Met
MetAla: 2.07 ± 0.398
MetCys: 0.218 ± 0.174
MetAsp: 1.634 ± 0.312
MetGlu: 0.871 ± 0.197
MetPhe: 0.762 ± 0.195
MetGly: 2.015 ± 0.288
MetHis: 0.327 ± 0.144
MetIle: 0.98 ± 0.208
MetLys: 0.817 ± 0.21
MetLeu: 1.525 ± 0.26
MetMet: 0.436 ± 0.156
MetAsn: 0.926 ± 0.213
MetPro: 1.471 ± 0.28
MetGln: 0.436 ± 0.138
MetArg: 1.362 ± 0.26
MetSer: 2.669 ± 0.363
MetThr: 1.961 ± 0.368
MetVal: 1.525 ± 0.333
MetTrp: 0.49 ± 0.156
MetTyr: 0.327 ± 0.121
MetXaa: 0.0 ± 0.0
Asn
AsnAla: 3.213 ± 0.381
AsnCys: 0.272 ± 0.133
AsnAsp: 2.015 ± 0.349
AsnGlu: 1.797 ± 0.325
AsnPhe: 0.762 ± 0.256
AsnGly: 3.976 ± 0.509
AsnHis: 0.817 ± 0.202
AsnIle: 1.579 ± 0.451
AsnLys: 0.98 ± 0.244
AsnLeu: 2.614 ± 0.349
AsnMet: 0.654 ± 0.171
AsnAsn: 1.688 ± 0.369
AsnPro: 2.887 ± 0.433
AsnGln: 0.98 ± 0.314
AsnArg: 1.961 ± 0.353
AsnSer: 1.906 ± 0.327
AsnThr: 2.56 ± 0.307
AsnVal: 1.961 ± 0.37
AsnTrp: 0.654 ± 0.143
AsnTyr: 0.49 ± 0.125
AsnXaa: 0.0 ± 0.0
Pro
ProAla: 5.664 ± 0.591
ProCys: 0.599 ± 0.183
ProAsp: 4.248 ± 0.501
ProGlu: 4.684 ± 0.515
ProPhe: 1.579 ± 0.3
ProGly: 6.753 ± 0.711
ProHis: 1.525 ± 0.289
ProIle: 2.124 ± 0.323
ProLys: 2.124 ± 0.356
ProLeu: 4.139 ± 0.528
ProMet: 1.416 ± 0.255
ProAsn: 2.179 ± 0.328
ProPro: 3.595 ± 0.639
ProGln: 2.233 ± 0.44
ProArg: 3.104 ± 0.505
ProSer: 3.377 ± 0.406
ProThr: 2.941 ± 0.409
ProVal: 4.738 ± 0.566
ProTrp: 0.98 ± 0.221
ProTyr: 1.579 ± 0.323
ProXaa: 0.0 ± 0.0
Gln
GlnAla: 4.248 ± 0.604
GlnCys: 0.599 ± 0.21
GlnAsp: 1.416 ± 0.321
GlnGlu: 1.852 ± 0.343
GlnPhe: 1.144 ± 0.22
GlnGly: 2.723 ± 0.498
GlnHis: 0.871 ± 0.246
GlnIle: 2.07 ± 0.325
GlnLys: 1.307 ± 0.226
GlnLeu: 3.05 ± 0.382
GlnMet: 0.708 ± 0.216
GlnAsn: 1.089 ± 0.238
GlnPro: 2.342 ± 0.381
GlnGln: 1.525 ± 0.342
GlnArg: 2.723 ± 0.383
GlnSer: 2.505 ± 0.371
GlnThr: 1.471 ± 0.358
GlnVal: 2.07 ± 0.335
GlnTrp: 0.817 ± 0.239
GlnTyr: 1.089 ± 0.316
GlnXaa: 0.0 ± 0.0
Arg
ArgAla: 6.427 ± 0.624
ArgCys: 1.253 ± 0.304
ArgAsp: 4.357 ± 0.658
ArgGlu: 4.466 ± 0.614
ArgPhe: 2.233 ± 0.363
ArgGly: 4.194 ± 0.542
ArgHis: 1.416 ± 0.309
ArgIle: 3.268 ± 0.434
ArgLys: 2.342 ± 0.372
ArgLeu: 4.575 ± 0.565
ArgMet: 2.56 ± 0.421
ArgAsn: 2.505 ± 0.398
ArgPro: 3.431 ± 0.421
ArgGln: 2.342 ± 0.362
ArgArg: 5.555 ± 0.792
ArgSer: 4.139 ± 0.501
ArgThr: 3.322 ± 0.481
ArgVal: 5.065 ± 0.446
ArgTrp: 1.852 ± 0.35
ArgTyr: 2.233 ± 0.36
ArgXaa: 0.0 ± 0.0
Ser
SerAla: 6.154 ± 0.708
SerCys: 0.436 ± 0.168
SerAsp: 4.085 ± 0.476
SerGlu: 3.431 ± 0.47
SerPhe: 2.505 ± 0.453
SerGly: 6.699 ± 0.758
SerHis: 1.362 ± 0.254
SerIle: 3.104 ± 0.415
SerLys: 2.505 ± 0.397
SerLeu: 3.649 ± 0.468
SerMet: 1.416 ± 0.286
SerAsn: 1.797 ± 0.328
SerPro: 3.431 ± 0.37
SerGln: 1.852 ± 0.293
SerArg: 4.085 ± 0.487
SerSer: 3.976 ± 0.643
SerThr: 3.377 ± 0.482
SerVal: 4.575 ± 0.539
SerTrp: 1.525 ± 0.26
SerTyr: 1.525 ± 0.223
SerXaa: 0.0 ± 0.0
Thr
ThrAla: 6.645 ± 0.63
ThrCys: 0.871 ± 0.298
ThrAsp: 3.595 ± 0.439
ThrGlu: 3.595 ± 0.488
ThrPhe: 1.797 ± 0.347
ThrGly: 5.882 ± 0.681
ThrHis: 2.015 ± 0.39
ThrIle: 3.159 ± 0.446
ThrLys: 1.906 ± 0.322
ThrLeu: 4.303 ± 0.474
ThrMet: 1.307 ± 0.315
ThrAsn: 2.287 ± 0.373
ThrPro: 4.575 ± 0.472
ThrGln: 2.396 ± 0.323
ThrArg: 3.322 ± 0.467
ThrSer: 3.867 ± 0.378
ThrThr: 5.12 ± 0.739
ThrVal: 6.209 ± 0.734
ThrTrp: 1.362 ± 0.268
ThrTyr: 2.233 ± 0.339
ThrXaa: 0.0 ± 0.0
Val
ValAla: 7.897 ± 0.61
ValCys: 1.035 ± 0.24
ValAsp: 5.011 ± 0.605
ValGlu: 4.793 ± 0.486
ValPhe: 2.015 ± 0.327
ValGly: 6.1 ± 0.56
ValHis: 1.471 ± 0.297
ValIle: 2.778 ± 0.49
ValLys: 2.287 ± 0.339
ValLeu: 5.664 ± 0.608
ValMet: 1.362 ± 0.216
ValAsn: 2.124 ± 0.304
ValPro: 4.03 ± 0.404
ValGln: 2.56 ± 0.355
ValArg: 4.575 ± 0.592
ValSer: 4.956 ± 0.538
ValThr: 4.738 ± 0.475
ValVal: 5.664 ± 0.513
ValTrp: 1.634 ± 0.355
ValTyr: 1.471 ± 0.281
ValXaa: 0.0 ± 0.0
Trp
TrpAla: 1.906 ± 0.327
TrpCys: 0.436 ± 0.144
TrpAsp: 1.362 ± 0.29
TrpGlu: 0.926 ± 0.24
TrpPhe: 0.817 ± 0.197
TrpGly: 1.144 ± 0.267
TrpHis: 0.49 ± 0.192
TrpIle: 0.98 ± 0.179
TrpLys: 0.708 ± 0.185
TrpLeu: 1.579 ± 0.314
TrpMet: 0.98 ± 0.256
TrpAsn: 0.49 ± 0.215
TrpPro: 1.362 ± 0.285
TrpGln: 1.035 ± 0.277
TrpArg: 2.015 ± 0.309
TrpSer: 1.525 ± 0.332
TrpThr: 1.525 ± 0.307
TrpVal: 2.179 ± 0.447
TrpTrp: 0.98 ± 0.209
TrpTyr: 0.654 ± 0.183
TrpXaa: 0.0 ± 0.0
Tyr
TyrAla: 2.614 ± 0.36
TyrCys: 0.327 ± 0.132
TyrAsp: 2.07 ± 0.386
TyrGlu: 2.179 ± 0.322
TyrPhe: 0.98 ± 0.229
TyrGly: 1.961 ± 0.338
TyrHis: 0.545 ± 0.157
TyrIle: 0.871 ± 0.21
TyrLys: 0.545 ± 0.194
TyrLeu: 2.233 ± 0.334
TyrMet: 0.109 ± 0.075
TyrAsn: 0.762 ± 0.217
TyrPro: 1.362 ± 0.209
TyrGln: 0.708 ± 0.207
TyrArg: 1.852 ± 0.298
TyrSer: 1.144 ± 0.225
TyrThr: 2.179 ± 0.422
TyrVal: 2.887 ± 0.322
TyrTrp: 0.708 ± 0.198
TyrTyr: 0.49 ± 0.154
TyrXaa: 0.0 ± 0.0
Xaa
XaaAla: 0.0 ± 0.0
XaaCys: 0.0 ± 0.0
XaaAsp: 0.0 ± 0.0
XaaGlu: 0.0 ± 0.0
XaaPhe: 0.0 ± 0.0
XaaGly: 0.0 ± 0.0
XaaHis: 0.0 ± 0.0
XaaIle: 0.0 ± 0.0
XaaLys: 0.0 ± 0.0
XaaLeu: 0.0 ± 0.0
XaaMet: 0.0 ± 0.0
XaaAsn: 0.0 ± 0.0
XaaPro: 0.0 ± 0.0
XaaGln: 0.0 ± 0.0
XaaArg: 0.0 ± 0.0
XaaSer: 0.0 ± 0.0
XaaThr: 0.0 ± 0.0
XaaVal: 0.0 ± 0.0
XaaTrp: 0.0 ± 0.0
XaaTyr: 0.0 ± 0.0
XaaXaa: 0.0 ± 0.0
Statistics based on 108 proteins (18362 amino acids)

Note: The error was estimated by bootstrapping (100 replicates) at the protein level.
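For readers unfamiliar with the procedure, a protein-level bootstrap can be sketched as follows. This is only an illustration under stated assumptions: whole proteins are resampled with replacement, the statistic is recomputed for each replicate, and the reported error is taken to be the standard deviation across replicates. The page states only that bootstrapping (x100) at the protein level was used, so the choice of standard deviation, the function names, and the toy sequences are assumptions.

```python
import random
import statistics
from collections import Counter


def dipeptide_permille(proteins, pair):
    """Frequency of one dipeptide (per mille) over a list of sequences."""
    counts = Counter()
    for seq in proteins:
        for i in range(len(seq) - 1):
            counts[seq[i:i + 2]] += 1
    total = sum(counts.values())
    return 1000.0 * counts[pair] / total if total else 0.0


def bootstrap_error(proteins, pair, n_rounds=100, seed=0):
    """Standard deviation of the frequency over protein-level resamples."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    estimates = []
    for _ in range(n_rounds):
        # Draw as many proteins as in the original set, with replacement
        sample = rng.choices(proteins, k=len(proteins))
        estimates.append(dipeptide_permille(sample, pair))
    return statistics.stdev(estimates)


# Toy input (hypothetical, not taken from the Mozy proteome)
toy = ["MAAG", "AAK", "GGAA", "MKK"]
err = bootstrap_error(toy, "AA")
```

Resampling whole proteins rather than individual residues keeps each sequence intact, so the spread reflects how much the statistic depends on which proteins happen to be in the proteome.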

The dipeptide statistics above (among other statistics for this proteome) can be downloaded as a CSV file.
This proteome is also available in UniProt.
Proteome-pI is available under the Creative Commons Attribution-NoDerivs license.

Reference: Kozlowski LP. Proteome-pI 2.0: Proteome Isoelectric Point Database Update. Nucleic Acids Res. 2021, doi: 10.1093/nar/gkab944
Contact: Lukasz P. Kozlowski