Amino acid dipeptide frequency for Mycobacterium phage Carlyle

Apart from single amino acid frequencies, one can also calculate the so-called amino acid dipeptide frequency, i.e. how often each ordered pair of adjacent amino acids occurs.
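A minimal sketch of how such frequencies could be computed. It assumes that dipeptides are counted as overlapping adjacent residue pairs within each protein and never across protein boundaries; the page does not spell out the counting scheme. The function name dipeptide_permille and the toy sequences are hypothetical, and one-letter residue codes are used instead of the three-letter codes shown in the table.

```python
from collections import Counter


def dipeptide_permille(proteins):
    """Return {dipeptide: frequency in per mille} for a list of sequences.

    Assumption: overlapping adjacent pairs are counted inside each protein
    only, so no pair spans two different proteins.
    """
    counts = Counter()
    for seq in proteins:
        for i in range(len(seq) - 1):
            counts[seq[i:i + 2]] += 1
    total = sum(counts.values())
    if total == 0:
        return {}
    # Per mille: parts per thousand of all counted dipeptides.
    return {pair: 1000.0 * n / total for pair, n in counts.items()}


# Toy input (hypothetical sequences, not taken from the phage proteome).
freqs = dipeptide_permille(["AAC", "AA"])
```

With this toy input, "AA" is counted twice and "AC" once, so the two frequencies sum to 1000 per mille.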

There are 400 possible dipeptides (441 if Xaa is counted as an additional 21st amino acid representing all non-standard or unknown amino acids). Thus, if dipeptides occurred randomly in proteins, with all amino acids equally frequent, each dipeptide would be present at a frequency of 0.25% (2.5‰). As this is not the case in nature, for better visibility dipeptides that are more frequent than expected are marked in red in the table, and those that are underrepresented are marked in blue.
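The comparison against the uniform expectation can be sketched as follows. This is an illustration only: it assumes a plain threshold at 1/400 with no allowance for the reported error, which may differ from the rule the page actually uses for colouring. The function name classify and the labels are hypothetical; the three example values are copied from the table below.

```python
# Uniform expectation over 400 dipeptides, in per mille (0.25% = 2.5 per mille).
EXPECTED_PERMILLE = 1000.0 / 400


def classify(freq_permille, expected=EXPECTED_PERMILLE):
    """Label a dipeptide frequency relative to the uniform expectation."""
    if freq_permille > expected:
        return "over"    # would be marked in red
    if freq_permille < expected:
        return "under"   # would be marked in blue
    return "expected"


# Values in per mille, taken from the table for this proteome.
examples = {"AlaAla": 13.582, "CysCys": 0.126, "MetCys": 0.0}
labels = {pair: classify(value) for pair, value in examples.items()}

# Converting a per mille value to a plain fraction: multiply by 1e-3.
ala_ala_fraction = examples["AlaAla"] * 1e-3
```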

All values are given in per mille (‰); multiply them by 10⁻³ to obtain fractions.

For more information, see the sequence space article on Wikipedia.

Ala Cys Asp Glu Phe Gly His Ile Lys Leu Met Asn Pro Gln Arg Ser Thr Val Trp Tyr Xaa
Ala
AlaAla: 13.582 ± 1.25
AlaCys: 0.695 ± 0.198
AlaAsp: 6.443 ± 0.712
AlaGlu: 6.822 ± 0.71
AlaPhe: 2.969 ± 0.39
AlaGly: 7.265 ± 0.691
AlaHis: 1.39 ± 0.332
AlaIle: 5.18 ± 0.581
AlaLys: 3.853 ± 0.513
AlaLeu: 9.665 ± 0.911
AlaMet: 2.148 ± 0.347
AlaAsn: 2.464 ± 0.336
AlaPro: 4.548 ± 0.662
AlaGln: 2.969 ± 0.481
AlaArg: 6.38 ± 0.588
AlaSer: 5.433 ± 0.767
AlaThr: 5.37 ± 0.601
AlaVal: 8.212 ± 0.604
AlaTrp: 2.085 ± 0.343
AlaTyr: 2.653 ± 0.351
AlaXaa: 0.0 ± 0.0
Cys
CysAla: 0.442 ± 0.156
CysCys: 0.126 ± 0.093
CysAsp: 0.632 ± 0.186
CysGlu: 0.695 ± 0.242
CysPhe: 0.126 ± 0.084
CysGly: 0.505 ± 0.187
CysHis: 0.19 ± 0.099
CysIle: 0.442 ± 0.167
CysLys: 0.253 ± 0.119
CysLeu: 0.505 ± 0.17
CysMet: 0.063 ± 0.068
CysAsn: 0.253 ± 0.13
CysPro: 0.19 ± 0.103
CysGln: 0.126 ± 0.073
CysArg: 0.695 ± 0.23
CysSer: 0.442 ± 0.175
CysThr: 0.379 ± 0.148
CysVal: 0.253 ± 0.102
CysTrp: 0.253 ± 0.146
CysTyr: 0.126 ± 0.072
CysXaa: 0.0 ± 0.0
Asp
AspAla: 6.633 ± 0.68
AspCys: 0.632 ± 0.187
AspAsp: 4.548 ± 0.582
AspGlu: 3.917 ± 0.46
AspPhe: 2.401 ± 0.305
AspGly: 6.064 ± 0.567
AspHis: 1.074 ± 0.246
AspIle: 2.843 ± 0.413
AspLys: 2.464 ± 0.434
AspLeu: 7.202 ± 0.742
AspMet: 1.327 ± 0.236
AspAsn: 1.706 ± 0.313
AspPro: 4.485 ± 0.515
AspGln: 1.516 ± 0.326
AspArg: 3.917 ± 0.391
AspSer: 3.222 ± 0.408
AspThr: 3.79 ± 0.425
AspVal: 4.738 ± 0.491
AspTrp: 2.021 ± 0.28
AspTyr: 2.021 ± 0.315
AspXaa: 0.0 ± 0.0
Glu
GluAla: 6.254 ± 0.71
GluCys: 0.316 ± 0.15
GluAsp: 4.675 ± 0.555
GluGlu: 5.117 ± 0.637
GluPhe: 2.274 ± 0.361
GluGly: 3.79 ± 0.434
GluHis: 1.706 ± 0.346
GluIle: 3.601 ± 0.479
GluLys: 2.969 ± 0.421
GluLeu: 6.822 ± 0.636
GluMet: 1.895 ± 0.344
GluAsn: 1.516 ± 0.34
GluPro: 2.021 ± 0.323
GluGln: 2.464 ± 0.303
GluArg: 3.79 ± 0.531
GluSer: 3.664 ± 0.456
GluThr: 4.043 ± 0.528
GluVal: 6.064 ± 0.604
GluTrp: 1.263 ± 0.276
GluTyr: 2.969 ± 0.467
GluXaa: 0.0 ± 0.0
Phe
PheAla: 2.274 ± 0.327
PheCys: 0.253 ± 0.1
PheAsp: 2.464 ± 0.303
PheGlu: 2.464 ± 0.391
PhePhe: 0.632 ± 0.17
PheGly: 3.727 ± 0.468
PheHis: 0.569 ± 0.216
PheIle: 1.263 ± 0.245
PheLys: 1.263 ± 0.257
PheLeu: 2.464 ± 0.415
PheMet: 0.632 ± 0.169
PheAsn: 1.137 ± 0.242
PhePro: 1.769 ± 0.316
PheGln: 0.884 ± 0.205
PheArg: 1.453 ± 0.323
PheSer: 2.021 ± 0.387
PheThr: 1.958 ± 0.322
PheVal: 1.958 ± 0.367
PheTrp: 0.442 ± 0.147
PheTyr: 0.948 ± 0.258
PheXaa: 0.0 ± 0.0
Gly
GlyAla: 6.886 ± 0.991
GlyCys: 0.695 ± 0.18
GlyAsp: 6.317 ± 0.517
GlyGlu: 4.485 ± 0.564
GlyPhe: 3.032 ± 0.563
GlyGly: 7.833 ± 1.328
GlyHis: 1.832 ± 0.299
GlyIle: 4.043 ± 0.669
GlyLys: 3.917 ± 0.452
GlyLeu: 7.707 ± 1.07
GlyMet: 1.895 ± 0.355
GlyAsn: 3.032 ± 0.386
GlyPro: 3.727 ± 0.548
GlyGln: 2.401 ± 0.366
GlyArg: 4.485 ± 0.492
GlySer: 6.064 ± 0.676
GlyThr: 4.927 ± 0.652
GlyVal: 5.496 ± 0.604
GlyTrp: 2.527 ± 0.389
GlyTyr: 2.906 ± 0.349
GlyXaa: 0.0 ± 0.0
His
HisAla: 1.642 ± 0.316
HisCys: 0.063 ± 0.069
HisAsp: 1.2 ± 0.264
HisGlu: 1.579 ± 0.328
HisPhe: 0.442 ± 0.169
HisGly: 1.706 ± 0.338
HisHis: 0.569 ± 0.171
HisIle: 1.074 ± 0.239
HisLys: 0.884 ± 0.252
HisLeu: 1.327 ± 0.276
HisMet: 0.063 ± 0.06
HisAsn: 0.253 ± 0.124
HisPro: 1.453 ± 0.275
HisGln: 1.137 ± 0.238
HisArg: 1.39 ± 0.338
HisSer: 0.695 ± 0.191
HisThr: 1.263 ± 0.291
HisVal: 1.579 ± 0.303
HisTrp: 0.379 ± 0.156
HisTyr: 0.695 ± 0.22
HisXaa: 0.0 ± 0.0
Ile
IleAla: 6.57 ± 0.632
IleCys: 0.316 ± 0.127
IleAsp: 3.411 ± 0.365
IleGlu: 3.727 ± 0.468
IlePhe: 0.948 ± 0.259
IleGly: 3.98 ± 0.498
IleHis: 0.884 ± 0.227
IleIle: 1.895 ± 0.314
IleLys: 1.832 ± 0.352
IleLeu: 2.78 ± 0.38
IleMet: 0.884 ± 0.178
IleAsn: 2.211 ± 0.402
IlePro: 3.348 ± 0.432
IleGln: 1.453 ± 0.306
IleArg: 3.348 ± 0.415
IleSer: 3.411 ± 0.624
IleThr: 3.601 ± 0.401
IleVal: 3.348 ± 0.593
IleTrp: 0.758 ± 0.197
IleTyr: 1.453 ± 0.275
IleXaa: 0.0 ± 0.0
Lys
LysAla: 3.727 ± 0.557
LysCys: 0.316 ± 0.142
LysAsp: 2.337 ± 0.446
LysGlu: 2.337 ± 0.395
LysPhe: 1.579 ± 0.292
LysGly: 2.59 ± 0.391
LysHis: 1.074 ± 0.259
LysIle: 1.895 ± 0.354
LysLys: 1.895 ± 0.344
LysLeu: 3.159 ± 0.438
LysMet: 0.884 ± 0.241
LysAsn: 1.516 ± 0.269
LysPro: 2.716 ± 0.436
LysGln: 1.263 ± 0.243
LysArg: 3.285 ± 0.422
LysSer: 2.969 ± 0.391
LysThr: 2.148 ± 0.415
LysVal: 3.285 ± 0.458
LysTrp: 0.884 ± 0.215
LysTyr: 0.948 ± 0.317
LysXaa: 0.0 ± 0.0
Leu
LeuAla: 9.349 ± 0.743
LeuCys: 0.316 ± 0.129
LeuAsp: 6.57 ± 0.635
LeuGlu: 5.243 ± 0.539
LeuPhe: 1.958 ± 0.381
LeuGly: 7.644 ± 0.865
LeuHis: 1.516 ± 0.303
LeuIle: 4.991 ± 0.577
LeuLys: 4.043 ± 0.442
LeuLeu: 5.559 ± 0.461
LeuMet: 1.706 ± 0.32
LeuAsn: 3.222 ± 0.442
LeuPro: 5.243 ± 0.565
LeuGln: 2.274 ± 0.431
LeuArg: 6.443 ± 0.521
LeuSer: 5.559 ± 0.468
LeuThr: 6.001 ± 0.461
LeuVal: 4.611 ± 0.577
LeuTrp: 1.074 ± 0.319
LeuTyr: 2.337 ± 0.387
LeuXaa: 0.0 ± 0.0
Met
MetAla: 2.401 ± 0.327
MetCys: 0.0 ± 0.0
MetAsp: 1.011 ± 0.223
MetGlu: 1.39 ± 0.317
MetPhe: 0.569 ± 0.182
MetGly: 1.39 ± 0.268
MetHis: 0.316 ± 0.145
MetIle: 0.758 ± 0.238
MetLys: 0.948 ± 0.195
MetLeu: 1.2 ± 0.283
MetMet: 0.063 ± 0.066
MetAsn: 0.948 ± 0.214
MetPro: 0.948 ± 0.278
MetGln: 0.758 ± 0.197
MetArg: 1.137 ± 0.226
MetSer: 2.527 ± 0.458
MetThr: 2.527 ± 0.413
MetVal: 0.948 ± 0.249
MetTrp: 0.379 ± 0.125
MetTyr: 0.505 ± 0.189
MetXaa: 0.0 ± 0.0
Asn
AsnAla: 3.285 ± 0.495
AsnCys: 0.19 ± 0.109
AsnAsp: 1.895 ± 0.357
AsnGlu: 2.274 ± 0.352
AsnPhe: 0.821 ± 0.217
AsnGly: 3.032 ± 0.449
AsnHis: 0.632 ± 0.166
AsnIle: 1.516 ± 0.33
AsnLys: 1.011 ± 0.294
AsnLeu: 2.274 ± 0.376
AsnMet: 0.505 ± 0.138
AsnAsn: 0.948 ± 0.228
AsnPro: 2.464 ± 0.356
AsnGln: 1.137 ± 0.245
AsnArg: 1.263 ± 0.28
AsnSer: 1.958 ± 0.452
AsnThr: 1.958 ± 0.339
AsnVal: 2.906 ± 0.396
AsnTrp: 0.758 ± 0.238
AsnTyr: 1.2 ± 0.305
AsnXaa: 0.0 ± 0.0
Pro
ProAla: 5.054 ± 0.597
ProCys: 0.442 ± 0.166
ProAsp: 4.232 ± 0.46
ProGlu: 4.169 ± 0.519
ProPhe: 2.085 ± 0.321
ProGly: 4.991 ± 0.554
ProHis: 0.821 ± 0.204
ProIle: 2.464 ± 0.438
ProLys: 1.895 ± 0.287
ProLeu: 4.485 ± 0.54
ProMet: 0.695 ± 0.205
ProAsn: 1.453 ± 0.282
ProPro: 3.222 ± 0.513
ProGln: 1.832 ± 0.36
ProArg: 2.59 ± 0.409
ProSer: 3.727 ± 0.458
ProThr: 3.853 ± 0.512
ProVal: 3.727 ± 0.457
ProTrp: 0.948 ± 0.289
ProTyr: 1.516 ± 0.312
ProXaa: 0.0 ± 0.0
Gln
GlnAla: 3.159 ± 0.425
GlnCys: 0.19 ± 0.144
GlnAsp: 1.327 ± 0.302
GlnGlu: 1.832 ± 0.342
GlnPhe: 1.137 ± 0.214
GlnGly: 2.401 ± 0.329
GlnHis: 0.569 ± 0.177
GlnIle: 2.59 ± 0.416
GlnLys: 0.948 ± 0.187
GlnLeu: 3.222 ± 0.444
GlnMet: 1.011 ± 0.236
GlnAsn: 0.569 ± 0.14
GlnPro: 2.274 ± 0.366
GlnGln: 1.706 ± 0.323
GlnArg: 1.642 ± 0.307
GlnSer: 1.895 ± 0.329
GlnThr: 1.516 ± 0.281
GlnVal: 2.59 ± 0.378
GlnTrp: 0.758 ± 0.163
GlnTyr: 0.632 ± 0.226
GlnXaa: 0.0 ± 0.0
Arg
ArgAla: 5.433 ± 0.533
ArgCys: 0.695 ± 0.206
ArgAsp: 3.095 ± 0.467
ArgGlu: 4.738 ± 0.682
ArgPhe: 1.832 ± 0.339
ArgGly: 4.485 ± 0.507
ArgHis: 1.2 ± 0.25
ArgIle: 3.601 ± 0.526
ArgLys: 3.032 ± 0.509
ArgLeu: 6.128 ± 0.709
ArgMet: 2.274 ± 0.351
ArgAsn: 2.148 ± 0.461
ArgPro: 2.464 ± 0.389
ArgGln: 2.085 ± 0.355
ArgArg: 5.433 ± 0.698
ArgSer: 3.727 ± 0.486
ArgThr: 2.969 ± 0.458
ArgVal: 4.611 ± 0.446
ArgTrp: 1.327 ± 0.301
ArgTyr: 1.769 ± 0.292
ArgXaa: 0.0 ± 0.0
Ser
SerAla: 6.443 ± 0.856
SerCys: 0.442 ± 0.153
SerAsp: 4.043 ± 0.42
SerGlu: 3.601 ± 0.525
SerPhe: 2.085 ± 0.386
SerGly: 6.822 ± 0.833
SerHis: 1.327 ± 0.302
SerIle: 2.906 ± 0.481
SerLys: 2.401 ± 0.392
SerLeu: 5.054 ± 0.535
SerMet: 1.39 ± 0.329
SerAsn: 2.337 ± 0.425
SerPro: 3.159 ± 0.383
SerGln: 2.401 ± 0.329
SerArg: 3.222 ± 0.408
SerSer: 3.664 ± 0.651
SerThr: 2.906 ± 0.418
SerVal: 4.296 ± 0.453
SerTrp: 1.516 ± 0.335
SerTyr: 1.453 ± 0.299
SerXaa: 0.0 ± 0.0
Thr
ThrAla: 5.938 ± 0.648
ThrCys: 0.19 ± 0.115
ThrAsp: 4.106 ± 0.533
ThrGlu: 4.422 ± 0.446
ThrPhe: 1.895 ± 0.348
ThrGly: 6.696 ± 0.708
ThrHis: 1.074 ± 0.353
ThrIle: 2.527 ± 0.534
ThrLys: 2.527 ± 0.35
ThrLeu: 5.559 ± 0.632
ThrMet: 1.011 ± 0.228
ThrAsn: 1.769 ± 0.305
ThrPro: 3.917 ± 0.478
ThrGln: 1.706 ± 0.332
ThrArg: 3.474 ± 0.584
ThrSer: 2.969 ± 0.593
ThrThr: 4.043 ± 0.538
ThrVal: 5.749 ± 0.682
ThrTrp: 0.948 ± 0.244
ThrTyr: 1.769 ± 0.314
ThrXaa: 0.0 ± 0.0
Val
ValAla: 7.075 ± 0.659
ValCys: 0.442 ± 0.143
ValAsp: 5.054 ± 0.517
ValGlu: 5.243 ± 0.559
ValPhe: 2.211 ± 0.35
ValGly: 5.054 ± 0.718
ValHis: 1.642 ± 0.252
ValIle: 3.538 ± 0.448
ValLys: 3.222 ± 0.419
ValLeu: 6.001 ± 0.643
ValMet: 1.137 ± 0.298
ValAsn: 2.906 ± 0.378
ValPro: 3.727 ± 0.466
ValGln: 1.769 ± 0.366
ValArg: 4.991 ± 0.62
ValSer: 5.117 ± 0.496
ValThr: 5.433 ± 0.57
ValVal: 5.054 ± 0.505
ValTrp: 1.2 ± 0.315
ValTyr: 2.274 ± 0.444
ValXaa: 0.0 ± 0.0
Trp
TrpAla: 1.453 ± 0.325
TrpCys: 0.19 ± 0.112
TrpAsp: 1.642 ± 0.293
TrpGlu: 1.074 ± 0.24
TrpPhe: 0.884 ± 0.199
TrpGly: 1.832 ± 0.342
TrpHis: 0.505 ± 0.154
TrpIle: 1.39 ± 0.266
TrpLys: 0.505 ± 0.23
TrpLeu: 1.706 ± 0.314
TrpMet: 0.379 ± 0.176
TrpAsn: 0.442 ± 0.149
TrpPro: 0.948 ± 0.286
TrpGln: 1.011 ± 0.25
TrpArg: 1.263 ± 0.33
TrpSer: 1.011 ± 0.303
TrpThr: 1.579 ± 0.363
TrpVal: 1.895 ± 0.259
TrpTrp: 0.632 ± 0.21
TrpTyr: 0.379 ± 0.163
TrpXaa: 0.0 ± 0.0
Tyr
TyrAla: 2.527 ± 0.393
TyrCys: 0.19 ± 0.106
TyrAsp: 1.516 ± 0.333
TyrGlu: 2.085 ± 0.354
TyrPhe: 0.695 ± 0.211
TyrGly: 2.464 ± 0.394
TyrHis: 0.505 ± 0.188
TyrIle: 1.642 ± 0.345
TyrLys: 0.948 ± 0.205
TyrLeu: 2.843 ± 0.442
TyrMet: 0.695 ± 0.182
TyrAsn: 1.2 ± 0.29
TyrPro: 1.579 ± 0.282
TyrGln: 1.074 ± 0.24
TyrArg: 2.843 ± 0.357
TyrSer: 1.39 ± 0.268
TyrThr: 2.021 ± 0.346
TyrVal: 1.769 ± 0.289
TyrTrp: 0.569 ± 0.213
TyrTyr: 0.569 ± 0.197
TyrXaa: 0.0 ± 0.0
Xaa
XaaAla: 0.0 ± 0.0
XaaCys: 0.0 ± 0.0
XaaAsp: 0.0 ± 0.0
XaaGlu: 0.0 ± 0.0
XaaPhe: 0.0 ± 0.0
XaaGly: 0.0 ± 0.0
XaaHis: 0.0 ± 0.0
XaaIle: 0.0 ± 0.0
XaaLys: 0.0 ± 0.0
XaaLeu: 0.0 ± 0.0
XaaMet: 0.0 ± 0.0
XaaAsn: 0.0 ± 0.0
XaaPro: 0.0 ± 0.0
XaaGln: 0.0 ± 0.0
XaaArg: 0.0 ± 0.0
XaaSer: 0.0 ± 0.0
XaaThr: 0.0 ± 0.0
XaaVal: 0.0 ± 0.0
XaaTrp: 0.0 ± 0.0
XaaTyr: 0.0 ± 0.0
XaaXaa: 0.0 ± 0.0
Statistics based on 91 proteins (15831 amino acids)

Note: The error was estimated by bootstrapping (100 replicates) at the protein level.
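One way to read "bootstrapping at the protein level" is sketched below: whole proteins are resampled with replacement, the dipeptide frequency is recomputed for each replicate, and the spread of the replicate values is reported as the error. This is an assumption about the procedure, not a description of the Proteome-pI implementation; in particular, using the sample standard deviation as the error, the function names, and the fixed seed are all choices made for this illustration.

```python
import random
from collections import Counter
from statistics import stdev


def pair_permille(proteins, pair):
    """Frequency of one dipeptide (per mille) over a list of sequences."""
    counts = Counter()
    for seq in proteins:
        for i in range(len(seq) - 1):
            counts[seq[i:i + 2]] += 1
    total = sum(counts.values())
    return 1000.0 * counts[pair] / total if total else 0.0


def bootstrap_error(proteins, pair, n_boot=100, seed=0):
    """Spread of the frequency over n_boot protein-level resamples.

    Assumption: each replicate draws len(proteins) whole proteins with
    replacement, and the error is the sample standard deviation.
    """
    rng = random.Random(seed)  # seeded so the sketch is reproducible
    estimates = []
    for _ in range(n_boot):
        sample = rng.choices(proteins, k=len(proteins))
        estimates.append(pair_permille(sample, pair))
    return stdev(estimates)


# Toy input (hypothetical sequences, one-letter residue codes).
toy = ["AAAA", "ACAC", "CCCC"]
error_aa = bootstrap_error(toy, "AA")
```

Resampling whole proteins rather than individual dipeptides keeps the pairs from one protein together, so the estimate reflects variation between proteins.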

The dipeptide statistics above (along with other statistics for this proteome) can be downloaded from this CSV file
See this proteome in: uniprot_link
Proteome-pI is available under the Creative Commons Attribution-NoDerivs license; for more details see here

Reference: Kozlowski LP. Proteome-pI 2.0: Proteome Isoelectric Point Database Update. Nucleic Acids Res. 2021, doi: 10.1093/nar/gkab944
Contact: Lukasz P. Kozlowski