Amino acid dipeptide frequency for Mycobacterium virus Rockyhorror

Apart from single amino acid frequencies, one can also calculate the so-called amino acid dipeptide frequency, i.e. how often each ordered pair of adjacent amino acids occurs.
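As a rough illustration of what such a calculation involves, the sketch below counts overlapping pairs of adjacent residues in a few made-up sequences and reports them in per mille. The function name `dipeptide_frequencies`, the one-letter input format, and the choice to normalise by the total number of counted pairs are assumptions made for this example; the exact normalisation used by Proteome-pI is not stated on this page.

```python
from collections import Counter

def dipeptide_frequencies(sequences):
    """Return {dipeptide: frequency in per mille} for overlapping residue pairs."""
    counts = Counter()
    for seq in sequences:
        # Each sequence is scanned on its own, so pairs never span two proteins.
        for i in range(len(seq) - 1):
            counts[seq[i:i + 2]] += 1
    total = sum(counts.values())
    if total == 0:
        return {}
    return {pair: 1000.0 * n / total for pair, n in counts.items()}

# Toy input (hypothetical sequences, one-letter codes).
freqs = dipeptide_frequencies(["AAGA", "GAA"])
for pair, value in sorted(freqs.items()):
    print(pair, round(value, 1))
```

Note that the order of residues matters: AG and GA are counted as different dipeptides, which is why the table has 400 (or 441) cells rather than 210.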

There are 400 possible dipeptides (441 if Xaa is counted as an additional 21st amino acid representing all non-standard or unknown amino acids). Thus, if dipeptides occurred randomly in proteins, each of the 400 would be present at a frequency of 0.25% (2.5‰). As this is not the case in nature, for better visibility dipeptides that are more frequent than expected are marked in red, and those that are underrepresented are marked in blue in the table.
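The comparison against the uniform expectation can be sketched as follows. This is only an illustration under stated assumptions: the baseline is the flat 1/400 value described above, the helper `classify` is hypothetical, and the error estimates are ignored. The threshold actually used for the colouring on this page is not documented here.

```python
# Uniform expectation over the 400 standard dipeptides, in per mille.
EXPECTED_PER_MILLE = 1000.0 / 400  # 2.5

def classify(observed_per_mille, expected=EXPECTED_PER_MILLE):
    """Label a dipeptide relative to the uniform expectation."""
    if observed_per_mille > expected:
        return "over"
    if observed_per_mille < expected:
        return "under"
    return "as expected"

# A few observed values copied from the table on this page (per mille).
observed = {"AlaAla": 13.701, "GlyGly": 11.07, "LysVal": 2.247, "CysCys": 0.11}
labels = {pair: classify(value) for pair, value in observed.items()}
for pair, label in labels.items():
    print(pair, observed[pair], label)
```

A flat baseline ignores the fact that single amino acids themselves differ in abundance, so a frequent residue such as Ala will tend to look overrepresented in most of its pairs.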

All values are given in per mille (‰), so they need to be multiplied by 10^-3 to obtain fractions.
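For readers who want plain fractions or percentages, the conversion is a single multiplication. The helper name `per_mille_to_fraction` below is made up for this example.

```python
def per_mille_to_fraction(value_per_mille):
    """Convert a per mille value to a plain fraction (multiply by 10^-3)."""
    return value_per_mille * 1e-3

ala_ala = per_mille_to_fraction(13.701)  # AlaAla value from the table
print(f"fraction: {ala_ala:.6f}, percent: {ala_ala * 100:.4f}%")
```

So the AlaAla entry of 13.701‰ corresponds to roughly 1.37% of all dipeptides.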

For more information, see the sequence space article on Wikipedia.

Ala Cys Asp Glu Phe Gly His Ile Lys Leu Met Asn Pro Gln Arg Ser Thr Val Trp Tyr Xaa
Ala
AlaAla: 13.701 ± 1.605
AlaCys: 1.206 ± 0.283
AlaAsp: 6.631 ± 0.567
AlaGlu: 7.782 ± 0.788
AlaPhe: 3.069 ± 0.419
AlaGly: 9.043 ± 1.099
AlaHis: 2.137 ± 0.462
AlaIle: 4.329 ± 0.478
AlaLys: 4.384 ± 0.517
AlaLeu: 8.275 ± 0.905
AlaMet: 2.411 ± 0.395
AlaAsn: 3.014 ± 0.487
AlaPro: 5.042 ± 0.632
AlaGln: 3.343 ± 0.386
AlaArg: 6.412 ± 0.629
AlaSer: 6.412 ± 0.544
AlaThr: 6.138 ± 0.457
AlaVal: 7.179 ± 0.575
AlaTrp: 2.028 ± 0.326
AlaTyr: 2.576 ± 0.337
AlaXaa: 0.0 ± 0.0
Cys
CysAla: 1.151 ± 0.322
CysCys: 0.11 ± 0.071
CysAsp: 0.877 ± 0.213
CysGlu: 1.096 ± 0.278
CysPhe: 0.274 ± 0.124
CysGly: 1.644 ± 0.329
CysHis: 0.164 ± 0.09
CysIle: 0.274 ± 0.117
CysLys: 0.603 ± 0.173
CysLeu: 0.822 ± 0.238
CysMet: 0.11 ± 0.075
CysAsn: 0.493 ± 0.161
CysPro: 1.096 ± 0.276
CysGln: 0.493 ± 0.184
CysArg: 0.877 ± 0.251
CysSer: 0.822 ± 0.279
CysThr: 0.877 ± 0.25
CysVal: 0.767 ± 0.222
CysTrp: 0.274 ± 0.122
CysTyr: 0.11 ± 0.076
CysXaa: 0.0 ± 0.0
Asp
AspAla: 6.467 ± 0.592
AspCys: 0.822 ± 0.205
AspAsp: 4.22 ± 0.473
AspGlu: 3.781 ± 0.522
AspPhe: 1.096 ± 0.191
AspGly: 5.535 ± 0.527
AspHis: 1.096 ± 0.234
AspIle: 2.411 ± 0.43
AspLys: 1.863 ± 0.337
AspLeu: 6.302 ± 0.575
AspMet: 1.041 ± 0.222
AspAsn: 1.918 ± 0.399
AspPro: 4.713 ± 0.639
AspGln: 2.521 ± 0.329
AspArg: 5.316 ± 0.656
AspSer: 2.959 ± 0.43
AspThr: 3.562 ± 0.411
AspVal: 4.932 ± 0.587
AspTrp: 1.37 ± 0.289
AspTyr: 2.083 ± 0.325
AspXaa: 0.0 ± 0.0
Glu
GluAla: 6.686 ± 0.719
GluCys: 0.932 ± 0.271
GluAsp: 2.576 ± 0.348
GluGlu: 2.521 ± 0.484
GluPhe: 2.083 ± 0.302
GluGly: 3.453 ± 0.435
GluHis: 1.754 ± 0.354
GluIle: 2.466 ± 0.343
GluLys: 1.973 ± 0.367
GluLeu: 5.48 ± 0.694
GluMet: 1.425 ± 0.303
GluAsn: 1.644 ± 0.226
GluPro: 3.069 ± 0.499
GluGln: 3.124 ± 0.411
GluArg: 5.261 ± 0.621
GluSer: 3.288 ± 0.492
GluThr: 4.11 ± 0.564
GluVal: 4.22 ± 0.55
GluTrp: 1.26 ± 0.27
GluTyr: 1.644 ± 0.341
GluXaa: 0.0 ± 0.0
Phe
PheAla: 2.631 ± 0.36
PheCys: 0.329 ± 0.14
PheAsp: 2.521 ± 0.458
PheGlu: 1.48 ± 0.264
PhePhe: 0.932 ± 0.262
PheGly: 3.288 ± 0.661
PheHis: 0.658 ± 0.207
PheIle: 1.425 ± 0.345
PheLys: 1.26 ± 0.304
PheLeu: 1.534 ± 0.228
PheMet: 0.658 ± 0.202
PheAsn: 1.206 ± 0.295
PhePro: 1.534 ± 0.305
PheGln: 1.151 ± 0.282
PheArg: 1.315 ± 0.271
PheSer: 1.534 ± 0.296
PheThr: 2.028 ± 0.285
PheVal: 1.863 ± 0.257
PheTrp: 0.822 ± 0.178
PheTyr: 0.767 ± 0.242
PheXaa: 0.0 ± 0.0
Gly
GlyAla: 9.536 ± 1.106
GlyCys: 0.986 ± 0.246
GlyAsp: 5.919 ± 0.55
GlyGlu: 4.165 ± 0.583
GlyPhe: 2.576 ± 0.422
GlyGly: 11.07 ± 1.91
GlyHis: 1.973 ± 0.288
GlyIle: 4.165 ± 0.501
GlyLys: 2.74 ± 0.404
GlyLeu: 6.028 ± 0.547
GlyMet: 2.302 ± 0.433
GlyAsn: 3.124 ± 0.363
GlyPro: 3.562 ± 0.531
GlyGln: 2.466 ± 0.513
GlyArg: 4.768 ± 0.541
GlySer: 5.754 ± 0.829
GlyThr: 6.412 ± 0.815
GlyVal: 6.302 ± 0.624
GlyTrp: 2.631 ± 0.347
GlyTyr: 2.192 ± 0.433
GlyXaa: 0.0 ± 0.0
His
HisAla: 1.809 ± 0.359
HisCys: 0.384 ± 0.175
HisAsp: 1.206 ± 0.296
HisGlu: 1.534 ± 0.318
HisPhe: 0.438 ± 0.116
HisGly: 1.863 ± 0.37
HisHis: 1.26 ± 0.321
HisIle: 1.151 ± 0.284
HisLys: 0.877 ± 0.238
HisLeu: 1.589 ± 0.265
HisMet: 0.384 ± 0.119
HisAsn: 1.041 ± 0.238
HisPro: 1.425 ± 0.216
HisGln: 0.712 ± 0.201
HisArg: 1.589 ± 0.267
HisSer: 0.932 ± 0.217
HisThr: 1.644 ± 0.381
HisVal: 1.644 ± 0.335
HisTrp: 0.548 ± 0.171
HisTyr: 0.932 ± 0.225
HisXaa: 0.0 ± 0.0
Ile
IleAla: 5.097 ± 0.504
IleCys: 0.603 ± 0.199
IleAsp: 3.124 ± 0.462
IleGlu: 3.233 ± 0.445
IlePhe: 0.767 ± 0.234
IleGly: 4.22 ± 0.46
IleHis: 1.37 ± 0.301
IleIle: 1.48 ± 0.249
IleLys: 1.206 ± 0.271
IleLeu: 2.137 ± 0.397
IleMet: 0.329 ± 0.126
IleAsn: 1.809 ± 0.267
IlePro: 2.905 ± 0.437
IleGln: 1.589 ± 0.263
IleArg: 2.411 ± 0.383
IleSer: 2.137 ± 0.358
IleThr: 3.562 ± 0.36
IleVal: 3.781 ± 0.426
IleTrp: 1.096 ± 0.263
IleTyr: 0.932 ± 0.209
IleXaa: 0.0 ± 0.0
Lys
LysAla: 3.781 ± 0.427
LysCys: 0.384 ± 0.141
LysAsp: 2.028 ± 0.319
LysGlu: 1.425 ± 0.288
LysPhe: 1.041 ± 0.207
LysGly: 2.411 ± 0.303
LysHis: 1.37 ± 0.313
LysIle: 0.877 ± 0.241
LysLys: 1.425 ± 0.358
LysLeu: 3.124 ± 0.428
LysMet: 0.822 ± 0.196
LysAsn: 1.096 ± 0.223
LysPro: 2.685 ± 0.374
LysGln: 1.37 ± 0.213
LysArg: 2.192 ± 0.41
LysSer: 2.028 ± 0.362
LysThr: 2.083 ± 0.339
LysVal: 2.247 ± 0.385
LysTrp: 0.986 ± 0.245
LysTyr: 1.041 ± 0.239
LysXaa: 0.0 ± 0.0
Leu
LeuAla: 8.221 ± 0.859
LeuCys: 0.712 ± 0.272
LeuAsp: 5.206 ± 0.558
LeuGlu: 4.11 ± 0.51
LeuPhe: 2.192 ± 0.36
LeuGly: 5.974 ± 0.59
LeuHis: 0.877 ± 0.249
LeuIle: 3.233 ± 0.484
LeuLys: 2.411 ± 0.362
LeuLeu: 4.549 ± 0.479
LeuMet: 1.151 ± 0.233
LeuAsn: 2.521 ± 0.408
LeuPro: 5.7 ± 0.727
LeuGln: 3.014 ± 0.507
LeuArg: 5.152 ± 0.586
LeuSer: 5.042 ± 0.533
LeuThr: 5.645 ± 0.531
LeuVal: 5.645 ± 0.539
LeuTrp: 1.315 ± 0.292
LeuTyr: 2.028 ± 0.366
LeuXaa: 0.0 ± 0.0
Met
MetAla: 2.028 ± 0.383
MetCys: 0.164 ± 0.126
MetAsp: 1.096 ± 0.274
MetGlu: 0.658 ± 0.159
MetPhe: 0.548 ± 0.175
MetGly: 1.809 ± 0.328
MetHis: 0.164 ± 0.095
MetIle: 1.151 ± 0.254
MetLys: 0.767 ± 0.195
MetLeu: 1.589 ± 0.243
MetMet: 0.493 ± 0.225
MetAsn: 0.986 ± 0.232
MetPro: 1.26 ± 0.261
MetGln: 0.384 ± 0.141
MetArg: 1.37 ± 0.255
MetSer: 3.069 ± 0.441
MetThr: 1.863 ± 0.294
MetVal: 1.26 ± 0.303
MetTrp: 0.329 ± 0.133
MetTyr: 0.274 ± 0.133
MetXaa: 0.0 ± 0.0
Asn
AsnAla: 3.781 ± 0.518
AsnCys: 0.274 ± 0.118
AsnAsp: 1.809 ± 0.316
AsnGlu: 1.918 ± 0.334
AsnPhe: 0.767 ± 0.267
AsnGly: 4.494 ± 0.653
AsnHis: 0.932 ± 0.244
AsnIle: 1.589 ± 0.44
AsnLys: 0.877 ± 0.268
AsnLeu: 2.521 ± 0.382
AsnMet: 0.658 ± 0.167
AsnAsn: 1.754 ± 0.322
AsnPro: 2.905 ± 0.38
AsnGln: 1.041 ± 0.281
AsnArg: 2.357 ± 0.369
AsnSer: 1.534 ± 0.263
AsnThr: 2.411 ± 0.298
AsnVal: 1.589 ± 0.266
AsnTrp: 0.658 ± 0.155
AsnTyr: 0.658 ± 0.167
AsnXaa: 0.0 ± 0.0
Pro
ProAla: 5.754 ± 0.636
ProCys: 0.658 ± 0.194
ProAsp: 4.439 ± 0.443
ProGlu: 4.658 ± 0.473
ProPhe: 2.028 ± 0.317
ProGly: 6.412 ± 0.73
ProHis: 1.425 ± 0.295
ProIle: 2.083 ± 0.322
ProLys: 1.973 ± 0.399
ProLeu: 4.329 ± 0.551
ProMet: 1.206 ± 0.265
ProAsn: 2.083 ± 0.274
ProPro: 4.165 ± 0.587
ProGln: 2.631 ± 0.336
ProArg: 2.959 ± 0.424
ProSer: 3.398 ± 0.326
ProThr: 3.179 ± 0.427
ProVal: 4.439 ± 0.514
ProTrp: 1.096 ± 0.273
ProTyr: 1.809 ± 0.361
ProXaa: 0.0 ± 0.0
Gln
GlnAla: 4.001 ± 0.563
GlnCys: 0.658 ± 0.243
GlnAsp: 1.206 ± 0.277
GlnGlu: 1.699 ± 0.375
GlnPhe: 1.041 ± 0.195
GlnGly: 2.466 ± 0.429
GlnHis: 0.932 ± 0.227
GlnIle: 1.699 ± 0.33
GlnLys: 1.206 ± 0.22
GlnLeu: 3.343 ± 0.54
GlnMet: 0.822 ± 0.263
GlnAsn: 0.932 ± 0.241
GlnPro: 2.357 ± 0.389
GlnGln: 1.754 ± 0.436
GlnArg: 2.685 ± 0.436
GlnSer: 2.631 ± 0.363
GlnThr: 1.754 ± 0.347
GlnVal: 2.302 ± 0.298
GlnTrp: 0.932 ± 0.21
GlnTyr: 1.096 ± 0.269
GlnXaa: 0.0 ± 0.0
Arg
ArgAla: 6.631 ± 0.79
ArgCys: 1.206 ± 0.318
ArgAsp: 4.494 ± 0.579
ArgGlu: 4.987 ± 0.599
ArgPhe: 2.028 ± 0.337
ArgGly: 4.001 ± 0.403
ArgHis: 1.26 ± 0.329
ArgIle: 4.165 ± 0.544
ArgLys: 1.973 ± 0.351
ArgLeu: 4.823 ± 0.488
ArgMet: 2.357 ± 0.355
ArgAsn: 2.192 ± 0.428
ArgPro: 3.124 ± 0.46
ArgGln: 1.589 ± 0.319
ArgArg: 4.878 ± 0.635
ArgSer: 4.329 ± 0.385
ArgThr: 3.233 ± 0.477
ArgVal: 4.439 ± 0.48
ArgTrp: 1.699 ± 0.324
ArgTyr: 2.083 ± 0.317
ArgXaa: 0.0 ± 0.0
Ser
SerAla: 5.59 ± 0.795
SerCys: 0.548 ± 0.255
SerAsp: 4.055 ± 0.437
SerGlu: 3.288 ± 0.467
SerPhe: 2.357 ± 0.369
SerGly: 6.467 ± 0.836
SerHis: 1.315 ± 0.275
SerIle: 2.959 ± 0.41
SerLys: 2.247 ± 0.416
SerLeu: 4.055 ± 0.452
SerMet: 1.48 ± 0.269
SerAsn: 2.411 ± 0.513
SerPro: 3.562 ± 0.462
SerGln: 1.534 ± 0.267
SerArg: 3.453 ± 0.421
SerSer: 4.329 ± 0.651
SerThr: 3.672 ± 0.519
SerVal: 5.097 ± 0.637
SerTrp: 1.315 ± 0.278
SerTyr: 1.48 ± 0.292
SerXaa: 0.0 ± 0.0
Thr
ThrAla: 5.754 ± 0.531
ThrCys: 0.767 ± 0.249
ThrAsp: 3.672 ± 0.5
ThrGlu: 3.946 ± 0.424
ThrPhe: 1.973 ± 0.419
ThrGly: 5.864 ± 0.621
ThrHis: 1.699 ± 0.25
ThrIle: 3.233 ± 0.527
ThrLys: 2.247 ± 0.39
ThrLeu: 4.603 ± 0.555
ThrMet: 0.877 ± 0.238
ThrAsn: 2.411 ± 0.376
ThrPro: 4.275 ± 0.466
ThrGln: 2.083 ± 0.318
ThrArg: 3.836 ± 0.495
ThrSer: 3.836 ± 0.467
ThrThr: 4.439 ± 0.721
ThrVal: 5.919 ± 0.526
ThrTrp: 0.986 ± 0.26
ThrTyr: 2.028 ± 0.338
ThrXaa: 0.0 ± 0.0
Val
ValAla: 7.453 ± 0.697
ValCys: 1.534 ± 0.271
ValAsp: 5.59 ± 0.611
ValGlu: 4.22 ± 0.62
ValPhe: 2.302 ± 0.398
ValGly: 5.535 ± 0.553
ValHis: 1.534 ± 0.26
ValIle: 2.905 ± 0.41
ValLys: 2.631 ± 0.353
ValLeu: 5.426 ± 0.554
ValMet: 1.37 ± 0.224
ValAsn: 2.631 ± 0.321
ValPro: 4.603 ± 0.59
ValGln: 2.74 ± 0.313
ValArg: 4.932 ± 0.591
ValSer: 4.768 ± 0.609
ValThr: 4.713 ± 0.429
ValVal: 6.028 ± 0.615
ValTrp: 1.589 ± 0.339
ValTyr: 1.26 ± 0.311
ValXaa: 0.0 ± 0.0
Trp
TrpAla: 2.192 ± 0.328
TrpCys: 0.384 ± 0.153
TrpAsp: 1.26 ± 0.241
TrpGlu: 0.822 ± 0.196
TrpPhe: 0.658 ± 0.151
TrpGly: 0.877 ± 0.257
TrpHis: 0.767 ± 0.21
TrpIle: 1.096 ± 0.225
TrpLys: 0.932 ± 0.196
TrpLeu: 1.918 ± 0.332
TrpMet: 0.986 ± 0.311
TrpAsn: 0.493 ± 0.198
TrpPro: 1.37 ± 0.31
TrpGln: 1.041 ± 0.276
TrpArg: 1.809 ± 0.325
TrpSer: 1.37 ± 0.237
TrpThr: 1.315 ± 0.289
TrpVal: 1.809 ± 0.411
TrpTrp: 0.877 ± 0.191
TrpTyr: 0.438 ± 0.132
TrpXaa: 0.0 ± 0.0
Tyr
TyrAla: 2.85 ± 0.379
TyrCys: 0.274 ± 0.129
TyrAsp: 2.083 ± 0.352
TyrGlu: 1.644 ± 0.316
TyrPhe: 0.767 ± 0.192
TyrGly: 2.247 ± 0.473
TyrHis: 0.274 ± 0.101
TyrIle: 1.096 ± 0.233
TyrLys: 0.822 ± 0.21
TyrLeu: 2.247 ± 0.363
TyrMet: 0.329 ± 0.121
TyrAsn: 0.767 ± 0.192
TyrPro: 1.37 ± 0.25
TyrGln: 0.932 ± 0.244
TyrArg: 1.918 ± 0.288
TyrSer: 1.041 ± 0.255
TyrThr: 1.809 ± 0.426
TyrVal: 2.357 ± 0.367
TyrTrp: 0.603 ± 0.184
TyrTyr: 0.712 ± 0.164
TyrXaa: 0.0 ± 0.0
Xaa
XaaAla: 0.0 ± 0.0
XaaCys: 0.0 ± 0.0
XaaAsp: 0.0 ± 0.0
XaaGlu: 0.0 ± 0.0
XaaPhe: 0.0 ± 0.0
XaaGly: 0.0 ± 0.0
XaaHis: 0.0 ± 0.0
XaaIle: 0.0 ± 0.0
XaaLys: 0.0 ± 0.0
XaaLeu: 0.0 ± 0.0
XaaMet: 0.0 ± 0.0
XaaAsn: 0.0 ± 0.0
XaaPro: 0.0 ± 0.0
XaaGln: 0.0 ± 0.0
XaaArg: 0.0 ± 0.0
XaaSer: 0.0 ± 0.0
XaaThr: 0.0 ± 0.0
XaaVal: 0.0 ± 0.0
XaaTrp: 0.0 ± 0.0
XaaTyr: 0.0 ± 0.0
XaaXaa: 0.0 ± 0.0
Statistics based on 107 proteins (18248 amino acids)

Note: The error was estimated by bootstrapping (x100) at the protein level.
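A protein-level bootstrap of this kind can be sketched as below: whole proteins are resampled with replacement, the frequency is recomputed for each resample, and the spread of those estimates is taken as the error. This is a minimal sketch under assumptions, not the Proteome-pI implementation: the function names, the toy sequences, the fixed seed, and the use of the standard deviation as the reported error are all choices made for this example.

```python
import random
import statistics
from collections import Counter

def pair_per_mille(sequences, pair):
    """Frequency of one dipeptide (per mille) across a set of sequences."""
    counts = Counter()
    for seq in sequences:
        for i in range(len(seq) - 1):
            counts[seq[i:i + 2]] += 1
    total = sum(counts.values())
    return 1000.0 * counts[pair] / total if total else 0.0

def bootstrap_error(sequences, pair, replicates=100, seed=0):
    """Standard deviation of the frequency over protein-level resamples."""
    rng = random.Random(seed)
    estimates = []
    for _ in range(replicates):
        # Resample whole proteins with replacement, keeping the proteome size.
        sample = rng.choices(sequences, k=len(sequences))
        estimates.append(pair_per_mille(sample, pair))
    return statistics.stdev(estimates)

# Hypothetical toy proteome (one-letter codes), not the real data.
proteome = ["MAAGLA", "MGAAV", "MKLAAR", "MSTAG", "MAAAL"]
point = pair_per_mille(proteome, "AA")
error = bootstrap_error(proteome, "AA")
print(f"AA: {point:.3f} ± {error:.3f} per mille")
```

Resampling proteins rather than individual residues keeps each sequence intact, so the error reflects how much the estimate depends on which proteins happen to be in the proteome.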

The dipeptide statistics above (among other statistics for this proteome) can be downloaded from this CSV file
See this proteome in: uniprot_link
Proteome-pI is available under Creative Commons Attribution-NoDerivs license, for more details see here

Reference: Kozlowski LP. Proteome-pI 2.0: Proteome Isoelectric Point Database Update. Nucleic Acids Res. 2021, doi: 10.1093/nar/gkab944
Contact: Lukasz P. Kozlowski