Amino acid dipeptide frequency for Mycobacterium phage Cheetobro

In addition to single amino acid frequencies, one can also calculate the so-called amino acid dipeptide frequencies.

There are 400 possible dipeptides (441 if Xaa is counted as an additional 21st amino acid representing all non-standard or unknown amino acids). Thus, if dipeptides occurred randomly in proteins, each of the 400 dipeptides would be present at a frequency of 0.25% (2.5‰). As this is not the case in nature, for better visibility all dipeptides that occur more often than expected are marked in red, and those that are underrepresented are marked in blue in the table.
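As a minimal sketch of the calculation described above, the following Python counts overlapping dipeptides within each protein, reports them in per mille, and labels a value as over- or underrepresented relative to the uniform expectation of 1/400 (2.5‰). The function names, the toy sequences, and the choice to normalise by the total number of overlapping pairs are assumptions for illustration, not necessarily how Proteome-pI computes its table.

```python
from collections import Counter

STANDARD = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard amino acids, one-letter codes

def dipeptide_per_mille(proteins):
    """Return {dipeptide: frequency in per mille} over all proteins.

    Assumption: overlapping pairs are counted within each protein (never
    across protein boundaries) and normalised by the total pair count.
    """
    counts = Counter()
    for seq in proteins:
        for i in range(len(seq) - 1):
            counts[seq[i:i + 2]] += 1
    total = sum(counts.values())
    return {dp: 1000.0 * n / total for dp, n in counts.items()}

def classify(freq_per_mille, n_symbols=len(STANDARD)):
    """Label a frequency against the uniform expectation 1000 / n_symbols**2."""
    expected = 1000.0 / n_symbols ** 2  # 2.5 per mille for 20 amino acids
    if freq_per_mille > expected:
        return "over"
    if freq_per_mille < expected:
        return "under"
    return "as expected"

# Toy input (hypothetical sequences, not taken from the proteome).
toy = ["MAAALG", "MKAA"]
freqs = dipeptide_per_mille(toy)
print(round(freqs["AA"], 1), classify(freqs["AA"]))
```

Applied to values from the table, `classify(19.653)` (AlaAla) returns "over" and `classify(0.164)` (CysCys) returns "under".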

All values are given in per mille (‰) and therefore need to be multiplied by 10⁻³ to obtain fractions.
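For illustration, the unit conversion can be sketched as follows. The helper name is hypothetical; the value 19.653 is the AlaAla entry from the table.

```python
def per_mille_to_fraction(value):
    """Convert a per mille value to a plain fraction (multiply by 10^-3)."""
    return value * 1e-3

ala_ala = 19.653                           # AlaAla entry from the table, in per mille
fraction = per_mille_to_fraction(ala_ala)  # about 0.019653
percent = fraction * 100                   # about 1.9653 %
uniform = 1 / 400                          # 0.0025, i.e. 0.25 % or 2.5 per mille
print(f"{fraction:.6f} {percent:.4f}% ratio to uniform: {fraction / uniform:.2f}")
```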

For more information, see the sequence space article on Wikipedia.

Ala Cys Asp Glu Phe Gly His Ile Lys Leu Met Asn Pro Gln Arg Ser Thr Val Trp Tyr Xaa
Ala
AlaAla: 19.653 ± 1.941
AlaCys: 1.095 ± 0.258
AlaAsp: 8.157 ± 0.702
AlaGlu: 11.058 ± 1.197
AlaPhe: 3.066 ± 0.597
AlaGly: 10.511 ± 1.305
AlaHis: 2.628 ± 0.481
AlaIle: 5.365 ± 0.55
AlaLys: 3.668 ± 0.36
AlaLeu: 10.237 ± 0.728
AlaMet: 2.956 ± 0.429
AlaAsn: 3.23 ± 0.398
AlaPro: 5.748 ± 0.821
AlaGln: 5.31 ± 0.481
AlaArg: 9.416 ± 0.96
AlaSer: 5.091 ± 0.574
AlaThr: 5.42 ± 0.745
AlaVal: 8.978 ± 0.741
AlaTrp: 2.026 ± 0.297
AlaTyr: 3.011 ± 0.354
AlaXaa: 0.0 ± 0.0
Cys
CysAla: 1.314 ± 0.307
CysCys: 0.164 ± 0.083
CysAsp: 0.657 ± 0.188
CysGlu: 0.602 ± 0.203
CysPhe: 0.328 ± 0.135
CysGly: 1.259 ± 0.343
CysHis: 0.219 ± 0.115
CysIle: 0.547 ± 0.152
CysLys: 0.438 ± 0.153
CysLeu: 0.712 ± 0.201
CysMet: 0.109 ± 0.072
CysAsn: 0.164 ± 0.088
CysPro: 0.931 ± 0.248
CysGln: 0.383 ± 0.139
CysArg: 1.15 ± 0.303
CysSer: 0.985 ± 0.221
CysThr: 0.438 ± 0.177
CysVal: 0.383 ± 0.171
CysTrp: 0.109 ± 0.082
CysTyr: 0.0 ± 0.0
CysXaa: 0.0 ± 0.0
Asp
AspAla: 7.609 ± 0.64
AspCys: 0.547 ± 0.189
AspAsp: 7.171 ± 1.011
AspGlu: 5.693 ± 0.581
AspPhe: 2.08 ± 0.304
AspGly: 7.007 ± 0.605
AspHis: 1.04 ± 0.271
AspIle: 1.15 ± 0.262
AspLys: 1.971 ± 0.302
AspLeu: 5.255 ± 0.627
AspMet: 1.314 ± 0.31
AspAsn: 1.752 ± 0.325
AspPro: 4.325 ± 0.564
AspGln: 1.697 ± 0.322
AspArg: 5.255 ± 0.679
AspSer: 2.409 ± 0.376
AspThr: 3.339 ± 0.364
AspVal: 4.379 ± 0.489
AspTrp: 0.876 ± 0.18
AspTyr: 1.15 ± 0.253
AspXaa: 0.0 ± 0.0
Glu
GluAla: 8.157 ± 0.979
GluCys: 0.766 ± 0.269
GluAsp: 2.573 ± 0.373
GluGlu: 1.697 ± 0.345
GluPhe: 1.095 ± 0.301
GluGly: 3.942 ± 0.498
GluHis: 1.752 ± 0.348
GluIle: 1.588 ± 0.31
GluLys: 2.737 ± 0.448
GluLeu: 8.157 ± 0.729
GluMet: 1.314 ± 0.217
GluAsn: 1.04 ± 0.218
GluPro: 3.175 ± 0.453
GluGln: 2.956 ± 0.379
GluArg: 5.255 ± 0.752
GluSer: 2.409 ± 0.355
GluThr: 2.737 ± 0.376
GluVal: 4.817 ± 0.528
GluTrp: 1.15 ± 0.208
GluTyr: 1.697 ± 0.347
GluXaa: 0.0 ± 0.0
Phe
PheAla: 2.463 ± 0.341
PheCys: 0.164 ± 0.086
PheAsp: 2.792 ± 0.36
PheGlu: 1.861 ± 0.422
PhePhe: 0.602 ± 0.146
PheGly: 3.449 ± 0.44
PheHis: 0.712 ± 0.186
PheIle: 1.095 ± 0.253
PheLys: 0.712 ± 0.192
PheLeu: 1.807 ± 0.268
PheMet: 0.657 ± 0.193
PheAsn: 0.985 ± 0.211
PhePro: 0.985 ± 0.33
PheGln: 0.876 ± 0.26
PheArg: 1.588 ± 0.302
PheSer: 1.15 ± 0.263
PheThr: 1.807 ± 0.371
PheVal: 2.299 ± 0.353
PheTrp: 0.438 ± 0.137
PheTyr: 0.657 ± 0.209
PheXaa: 0.0 ± 0.0
Gly
GlyAla: 8.759 ± 1.097
GlyCys: 0.985 ± 0.259
GlyAsp: 5.748 ± 0.667
GlyGlu: 6.022 ± 0.664
GlyPhe: 2.026 ± 0.432
GlyGly: 9.471 ± 1.134
GlyHis: 1.697 ± 0.308
GlyIle: 3.23 ± 0.558
GlyLys: 3.942 ± 0.524
GlyLeu: 6.296 ± 1.067
GlyMet: 1.588 ± 0.311
GlyAsn: 3.175 ± 0.4
GlyPro: 3.832 ± 0.54
GlyGln: 3.504 ± 0.443
GlyArg: 6.186 ± 0.531
GlySer: 4.872 ± 0.734
GlyThr: 5.584 ± 0.622
GlyVal: 6.186 ± 0.623
GlyTrp: 1.861 ± 0.323
GlyTyr: 2.518 ± 0.388
GlyXaa: 0.0 ± 0.0
His
HisAla: 1.697 ± 0.406
HisCys: 0.219 ± 0.098
HisAsp: 1.259 ± 0.259
HisGlu: 0.876 ± 0.214
HisPhe: 0.821 ± 0.199
HisGly: 2.244 ± 0.415
HisHis: 0.821 ± 0.163
HisIle: 0.985 ± 0.226
HisLys: 0.657 ± 0.187
HisLeu: 1.533 ± 0.301
HisMet: 0.328 ± 0.121
HisAsn: 0.493 ± 0.147
HisPro: 1.04 ± 0.252
HisGln: 0.657 ± 0.176
HisArg: 2.299 ± 0.463
HisSer: 0.821 ± 0.204
HisThr: 0.876 ± 0.208
HisVal: 2.354 ± 0.388
HisTrp: 0.274 ± 0.117
HisTyr: 0.493 ± 0.173
HisXaa: 0.0 ± 0.0
Ile
IleAla: 4.27 ± 0.555
IleCys: 0.219 ± 0.112
IleAsp: 3.339 ± 0.393
IleGlu: 3.394 ± 0.422
IlePhe: 0.821 ± 0.217
IleGly: 4.161 ± 0.739
IleHis: 0.493 ± 0.144
IleIle: 1.259 ± 0.249
IleLys: 1.533 ± 0.495
IleLeu: 2.299 ± 0.321
IleMet: 0.274 ± 0.101
IleAsn: 1.807 ± 0.267
IlePro: 2.026 ± 0.418
IleGln: 0.657 ± 0.227
IleArg: 2.628 ± 0.344
IleSer: 1.697 ± 0.404
IleThr: 2.847 ± 0.386
IleVal: 3.504 ± 0.454
IleTrp: 0.602 ± 0.211
IleTyr: 0.766 ± 0.229
IleXaa: 0.0 ± 0.0
Lys
LysAla: 5.31 ± 0.635
LysCys: 0.657 ± 0.194
LysAsp: 1.533 ± 0.3
LysGlu: 0.766 ± 0.182
LysPhe: 0.876 ± 0.217
LysGly: 1.971 ± 0.254
LysHis: 1.095 ± 0.258
LysIle: 1.259 ± 0.247
LysLys: 0.821 ± 0.317
LysLeu: 3.011 ± 0.362
LysMet: 1.04 ± 0.2
LysAsn: 1.095 ± 0.219
LysPro: 2.354 ± 0.403
LysGln: 1.642 ± 0.284
LysArg: 2.409 ± 0.407
LysSer: 1.861 ± 0.293
LysThr: 2.08 ± 0.402
LysVal: 2.19 ± 0.345
LysTrp: 0.493 ± 0.154
LysTyr: 0.602 ± 0.179
LysXaa: 0.0 ± 0.0
Leu
LeuAla: 11.934 ± 0.81
LeuCys: 0.657 ± 0.208
LeuAsp: 7.5 ± 0.789
LeuGlu: 2.19 ± 0.319
LeuPhe: 2.409 ± 0.288
LeuGly: 6.898 ± 0.663
LeuHis: 1.204 ± 0.297
LeuIle: 3.613 ± 0.431
LeuLys: 2.244 ± 0.446
LeuLeu: 6.077 ± 0.66
LeuMet: 1.423 ± 0.271
LeuAsn: 2.628 ± 0.287
LeuPro: 4.051 ± 0.657
LeuGln: 3.12 ± 0.432
LeuArg: 7.171 ± 0.585
LeuSer: 5.201 ± 0.502
LeuThr: 5.584 ± 0.552
LeuVal: 5.858 ± 0.534
LeuTrp: 1.478 ± 0.32
LeuTyr: 2.026 ± 0.32
LeuXaa: 0.0 ± 0.0
Met
MetAla: 1.478 ± 0.237
MetCys: 0.0 ± 0.0
MetAsp: 0.602 ± 0.153
MetGlu: 0.657 ± 0.185
MetPhe: 0.547 ± 0.195
MetGly: 1.259 ± 0.273
MetHis: 0.383 ± 0.147
MetIle: 0.931 ± 0.245
MetLys: 0.602 ± 0.192
MetLeu: 2.135 ± 0.304
MetMet: 0.219 ± 0.108
MetAsn: 0.876 ± 0.237
MetPro: 1.15 ± 0.265
MetGln: 0.493 ± 0.141
MetArg: 1.259 ± 0.251
MetSer: 1.752 ± 0.324
MetThr: 2.299 ± 0.344
MetVal: 1.369 ± 0.24
MetTrp: 0.383 ± 0.148
MetTyr: 0.383 ± 0.132
MetXaa: 0.0 ± 0.0
Asn
AsnAla: 3.504 ± 0.624
AsnCys: 0.274 ± 0.123
AsnAsp: 1.423 ± 0.237
AsnGlu: 1.15 ± 0.208
AsnPhe: 0.766 ± 0.189
AsnGly: 2.901 ± 0.401
AsnHis: 0.328 ± 0.123
AsnIle: 0.985 ± 0.362
AsnLys: 0.985 ± 0.219
AsnLeu: 2.354 ± 0.378
AsnMet: 0.493 ± 0.12
AsnAsn: 0.876 ± 0.189
AsnPro: 3.066 ± 0.496
AsnGln: 0.821 ± 0.226
AsnArg: 2.135 ± 0.507
AsnSer: 1.04 ± 0.266
AsnThr: 2.026 ± 0.421
AsnVal: 2.956 ± 0.373
AsnTrp: 0.493 ± 0.14
AsnTyr: 0.766 ± 0.202
AsnXaa: 0.0 ± 0.0
Pro
ProAla: 9.142 ± 0.819
ProCys: 0.219 ± 0.099
ProAsp: 3.066 ± 0.394
ProGlu: 4.27 ± 0.565
ProPhe: 1.642 ± 0.261
ProGly: 6.569 ± 0.552
ProHis: 1.314 ± 0.294
ProIle: 2.628 ± 0.344
ProLys: 1.423 ± 0.261
ProLeu: 3.777 ± 0.491
ProMet: 0.547 ± 0.172
ProAsn: 1.314 ± 0.29
ProPro: 4.106 ± 0.704
ProGln: 1.533 ± 0.311
ProArg: 2.901 ± 0.467
ProSer: 2.792 ± 0.436
ProThr: 4.161 ± 0.484
ProVal: 4.325 ± 0.557
ProTrp: 0.602 ± 0.175
ProTyr: 1.861 ± 0.337
ProXaa: 0.0 ± 0.0
Gln
GlnAla: 5.584 ± 0.68
GlnCys: 0.657 ± 0.19
GlnAsp: 1.642 ± 0.281
GlnGlu: 1.314 ± 0.252
GlnPhe: 1.04 ± 0.205
GlnGly: 2.299 ± 0.31
GlnHis: 1.15 ± 0.305
GlnIle: 1.204 ± 0.224
GlnLys: 1.204 ± 0.285
GlnLeu: 3.613 ± 0.392
GlnMet: 0.931 ± 0.19
GlnAsn: 0.821 ± 0.168
GlnPro: 2.463 ± 0.375
GlnGln: 1.916 ± 0.384
GlnArg: 3.011 ± 0.392
GlnSer: 1.533 ± 0.256
GlnThr: 1.478 ± 0.327
GlnVal: 2.026 ± 0.299
GlnTrp: 0.766 ± 0.174
GlnTyr: 1.095 ± 0.238
GlnXaa: 0.0 ± 0.0
Arg
ArgAla: 8.485 ± 1.123
ArgCys: 1.423 ± 0.378
ArgAsp: 4.489 ± 0.645
ArgGlu: 3.996 ± 0.511
ArgPhe: 2.135 ± 0.332
ArgGly: 4.27 ± 0.574
ArgHis: 1.642 ± 0.401
ArgIle: 3.394 ± 0.487
ArgLys: 2.737 ± 0.408
ArgLeu: 7.609 ± 0.782
ArgMet: 2.026 ± 0.317
ArgAsn: 2.628 ± 0.405
ArgPro: 4.379 ± 0.579
ArgGln: 3.394 ± 0.374
ArgArg: 7.445 ± 0.888
ArgSer: 3.613 ± 0.475
ArgThr: 3.887 ± 0.484
ArgVal: 4.982 ± 0.598
ArgTrp: 2.409 ± 0.422
ArgTyr: 1.861 ± 0.355
ArgXaa: 0.0 ± 0.0
Ser
SerAla: 6.241 ± 0.937
SerCys: 0.657 ± 0.221
SerAsp: 2.682 ± 0.318
SerGlu: 2.026 ± 0.371
SerPhe: 1.807 ± 0.316
SerGly: 4.161 ± 0.634
SerHis: 0.876 ± 0.206
SerIle: 1.807 ± 0.259
SerLys: 1.478 ± 0.259
SerLeu: 3.832 ± 0.567
SerMet: 0.766 ± 0.201
SerAsn: 1.697 ± 0.409
SerPro: 2.737 ± 0.379
SerGln: 1.259 ± 0.28
SerArg: 3.12 ± 0.498
SerSer: 3.449 ± 0.648
SerThr: 3.777 ± 0.558
SerVal: 3.723 ± 0.469
SerTrp: 1.259 ± 0.241
SerTyr: 1.642 ± 0.254
SerXaa: 0.0 ± 0.0
Thr
ThrAla: 6.679 ± 1.02
ThrCys: 0.547 ± 0.184
ThrAsp: 3.339 ± 0.418
ThrGlu: 3.668 ± 0.451
ThrPhe: 1.971 ± 0.351
ThrGly: 5.639 ± 0.627
ThrHis: 1.204 ± 0.241
ThrIle: 3.613 ± 0.553
ThrLys: 2.026 ± 0.336
ThrLeu: 4.325 ± 0.555
ThrMet: 0.931 ± 0.221
ThrAsn: 1.478 ± 0.278
ThrPro: 4.544 ± 0.354
ThrGln: 1.259 ± 0.241
ThrArg: 3.832 ± 0.586
ThrSer: 2.244 ± 0.399
ThrThr: 3.668 ± 0.517
ThrVal: 5.803 ± 0.512
ThrTrp: 1.314 ± 0.287
ThrTyr: 1.423 ± 0.267
ThrXaa: 0.0 ± 0.0
Val
ValAla: 9.471 ± 0.812
ValCys: 0.821 ± 0.262
ValAsp: 5.693 ± 0.632
ValGlu: 5.365 ± 0.554
ValPhe: 1.971 ± 0.313
ValGly: 5.858 ± 0.623
ValHis: 1.15 ± 0.278
ValIle: 2.628 ± 0.377
ValLys: 2.573 ± 0.458
ValLeu: 6.077 ± 0.636
ValMet: 1.095 ± 0.281
ValAsn: 1.916 ± 0.288
ValPro: 4.872 ± 0.583
ValGln: 2.135 ± 0.336
ValArg: 5.091 ± 0.566
ValSer: 3.613 ± 0.42
ValThr: 4.872 ± 0.478
ValVal: 6.46 ± 0.644
ValTrp: 1.807 ± 0.314
ValTyr: 2.08 ± 0.318
ValXaa: 0.0 ± 0.0
Trp
TrpAla: 1.807 ± 0.291
TrpCys: 0.328 ± 0.125
TrpAsp: 0.821 ± 0.182
TrpGlu: 0.766 ± 0.22
TrpPhe: 0.766 ± 0.18
TrpGly: 1.15 ± 0.258
TrpHis: 0.712 ± 0.153
TrpIle: 0.547 ± 0.15
TrpLys: 0.493 ± 0.153
TrpLeu: 2.08 ± 0.292
TrpMet: 0.328 ± 0.125
TrpAsn: 0.602 ± 0.205
TrpPro: 0.985 ± 0.223
TrpGln: 1.423 ± 0.236
TrpArg: 2.299 ± 0.393
TrpSer: 0.821 ± 0.249
TrpThr: 1.478 ± 0.31
TrpVal: 1.04 ± 0.262
TrpTrp: 0.438 ± 0.15
TrpTyr: 0.328 ± 0.134
TrpXaa: 0.0 ± 0.0
Tyr
TyrAla: 3.066 ± 0.382
TyrCys: 0.493 ± 0.186
TyrAsp: 1.533 ± 0.289
TyrGlu: 1.642 ± 0.368
TyrPhe: 0.328 ± 0.104
TyrGly: 2.463 ± 0.421
TyrHis: 0.328 ± 0.124
TyrIle: 0.766 ± 0.223
TyrLys: 0.876 ± 0.248
TyrLeu: 2.08 ± 0.419
TyrMet: 0.383 ± 0.129
TyrAsn: 0.766 ± 0.212
TyrPro: 1.423 ± 0.307
TyrGln: 0.712 ± 0.202
TyrArg: 2.354 ± 0.313
TyrSer: 1.642 ± 0.277
TyrThr: 1.15 ± 0.265
TyrVal: 1.916 ± 0.303
TyrTrp: 0.438 ± 0.146
TyrTyr: 0.712 ± 0.157
TyrXaa: 0.0 ± 0.0
Xaa
XaaAla: 0.0 ± 0.0
XaaCys: 0.0 ± 0.0
XaaAsp: 0.0 ± 0.0
XaaGlu: 0.0 ± 0.0
XaaPhe: 0.0 ± 0.0
XaaGly: 0.0 ± 0.0
XaaHis: 0.0 ± 0.0
XaaIle: 0.0 ± 0.0
XaaLys: 0.0 ± 0.0
XaaLeu: 0.0 ± 0.0
XaaMet: 0.0 ± 0.0
XaaAsn: 0.0 ± 0.0
XaaPro: 0.0 ± 0.0
XaaGln: 0.0 ± 0.0
XaaArg: 0.0 ± 0.0
XaaSer: 0.0 ± 0.0
XaaThr: 0.0 ± 0.0
XaaVal: 0.0 ± 0.0
XaaTrp: 0.0 ± 0.0
XaaTyr: 0.0 ± 0.0
XaaXaa: 0.0 ± 0.0
Statistics based on 92 proteins (18268 amino acids)

Note: The error was estimated by bootstrapping (×100) at the protein level.
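A minimal sketch of such a protein-level bootstrap is shown below: whole proteins are resampled with replacement, the dipeptide frequency is recomputed for each resample, and the spread over 100 replicates is taken as the error. The function names, the use of the standard deviation as the error measure, the fixed seed, and the toy sequences are assumptions; the note above only states that the bootstrap was run 100 times at the protein level.

```python
import random
import statistics
from collections import Counter

def dipeptide_freq(proteins, dipeptide):
    """Frequency (per mille) of one dipeptide among all overlapping pairs."""
    counts = Counter()
    for seq in proteins:
        counts.update(seq[i:i + 2] for i in range(len(seq) - 1))
    total = sum(counts.values())
    return 1000.0 * counts[dipeptide] / total if total else 0.0

def bootstrap_error(proteins, dipeptide, n_rounds=100, seed=0):
    """Resample whole proteins with replacement and return the standard
    deviation of the recomputed frequency (assumed error measure)."""
    rng = random.Random(seed)
    replicates = []
    for _ in range(n_rounds):
        sample = rng.choices(proteins, k=len(proteins))
        replicates.append(dipeptide_freq(sample, dipeptide))
    return statistics.stdev(replicates)

# Hypothetical toy proteome, not the phage sequences.
toy = ["MAAALG", "MKAAV", "MGLLAA", "MTTAR"]
print(dipeptide_freq(toy, "AA"), "±", bootstrap_error(toy, "AA"))
```

Resampling whole proteins rather than individual residues keeps each sequence intact, so the estimate reflects variation between proteins.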

The dipeptide statistics above (among other statistics for this proteome) can be downloaded from this CSV file
See this proteome in: uniprot_link
Proteome-pI is available under the Creative Commons Attribution-NoDerivs license, for more details see here

Reference: Kozlowski LP. Proteome-pI 2.0: Proteome Isoelectric Point Database Update. Nucleic Acids Res. 2021, doi: 10.1093/nar/gkab944
Contact: Lukasz P. Kozlowski