Amino acid dipeptide frequency for Mycobacterium phage Charlie

Apart from single amino acid frequencies, one can also calculate the so-called amino acid dipeptide frequency, that is, how often each ordered pair of adjacent amino acids occurs.

There are 400 possible dipeptides (441 if Xaa is counted as an additional, 21st amino acid representing all non-standard or unknown amino acids). Thus, if dipeptides occurred randomly in proteins, each of the 400 would be present at a frequency of 0.25% (2.5‰). As this is not the case in nature, for better visibility all dipeptides that are more frequent than expected are marked in red, and those that are underrepresented are marked in blue in the table.

All values are given in per mille (‰) and therefore need to be multiplied by 10^-3 to obtain fractions.
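The calculation described above can be sketched as follows. This is only a minimal illustration, not the code used to build this page: it assumes that dipeptides are counted as overlapping adjacent pairs within each protein, that any letter outside the 20 standard one-letter codes is treated as Xaa (written `X`), and that "expected" means the uniform 1/400 value. The names `dipeptide_permille` and `label` are hypothetical.

```python
from collections import Counter
from itertools import product

STANDARD = "ACDEFGHIKLMNPQRSTVWY"   # 20 standard amino acids, one-letter codes
EXPECTED_PERMILLE = 1000.0 / 400    # 2.5 per mille if all 400 pairs were equally likely


def dipeptide_permille(proteins):
    """Return {dipeptide: frequency in per mille} for a list of protein sequences.

    Assumption: overlapping adjacent pairs are counted, and pairs never span
    two proteins. Non-standard letters are mapped to 'X' (Xaa).
    """
    counts = Counter()
    for seq in proteins:
        seq = "".join(c if c in STANDARD else "X" for c in seq.upper())
        for i in range(len(seq) - 1):
            counts[seq[i:i + 2]] += 1
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no dipeptides found (all sequences shorter than 2)")
    alphabet = STANDARD + "X"        # 21 symbols, hence 441 possible pairs
    return {a + b: 1000.0 * counts[a + b] / total
            for a, b in product(alphabet, repeat=2)}


def label(freq_permille):
    """Classify a per mille value against the uniform expectation."""
    if freq_permille > EXPECTED_PERMILLE:
        return "over"
    if freq_permille < EXPECTED_PERMILLE:
        return "under"
    return "expected"


if __name__ == "__main__":
    toy = ["AAAA", "ACA"]            # toy input, not real phage proteins
    freqs = dipeptide_permille(toy)
    print(freqs["AA"], label(freqs["AA"]))
```

Applied to values from the table, `label(20.542)` (AlaAla) gives "over" and `label(0.571)` (AlaCys) gives "under"; multiplying a per mille value by 10^-3 gives the fraction, for example 20.542‰ corresponds to 0.020542.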

For more information, see the sequence space article on Wikipedia.

Dipeptide frequency ± error (‰), grouped by first residue. Within each group the second residue follows the order Ala, Cys, Asp, Glu, Phe, Gly, His, Ile, Lys, Leu, Met, Asn, Pro, Gln, Arg, Ser, Thr, Val, Trp, Tyr, Xaa.

Ala
AlaAla: 20.542 ± 2.916
AlaCys: 0.571 ± 0.266
AlaAsp: 9.058 ± 0.728
AlaGlu: 7.703 ± 0.915
AlaPhe: 3.495 ± 0.474
AlaGly: 9.558 ± 1.351
AlaHis: 2.496 ± 0.433
AlaIle: 5.492 ± 0.599
AlaLys: 3.709 ± 0.623
AlaLeu: 9.772 ± 0.876
AlaMet: 2.211 ± 0.406
AlaAsn: 3.638 ± 0.596
AlaPro: 6.847 ± 0.819
AlaGln: 5.421 ± 0.716
AlaArg: 7.347 ± 0.857
AlaSer: 6.419 ± 0.781
AlaThr: 8.06 ± 0.814
AlaVal: 6.562 ± 0.628
AlaTrp: 1.854 ± 0.478
AlaTyr: 2.282 ± 0.407
AlaXaa: 0.0 ± 0.0

Cys
CysAla: 1.854 ± 0.464
CysCys: 0.214 ± 0.143
CysAsp: 1.07 ± 0.329
CysGlu: 0.428 ± 0.207
CysPhe: 0.071 ± 0.081
CysGly: 1.427 ± 0.38
CysHis: 0.357 ± 0.209
CysIle: 0.214 ± 0.142
CysLys: 0.285 ± 0.16
CysLeu: 0.713 ± 0.279
CysMet: 0.143 ± 0.088
CysAsn: 0.357 ± 0.19
CysPro: 1.569 ± 0.412
CysGln: 0.642 ± 0.2
CysArg: 0.856 ± 0.293
CysSer: 0.428 ± 0.173
CysThr: 0.713 ± 0.196
CysVal: 0.856 ± 0.216
CysTrp: 0.214 ± 0.141
CysTyr: 0.143 ± 0.093
CysXaa: 0.0 ± 0.0

Asp
AspAla: 7.703 ± 0.805
AspCys: 1.284 ± 0.345
AspAsp: 5.207 ± 0.756
AspGlu: 4.565 ± 0.823
AspPhe: 2.14 ± 0.325
AspGly: 6.063 ± 0.737
AspHis: 0.999 ± 0.321
AspIle: 2.924 ± 0.351
AspLys: 1.498 ± 0.366
AspLeu: 6.063 ± 0.484
AspMet: 1.498 ± 0.281
AspAsn: 1.854 ± 0.34
AspPro: 4.708 ± 0.627
AspGln: 2.568 ± 0.461
AspArg: 3.923 ± 0.629
AspSer: 3.638 ± 0.547
AspThr: 3.352 ± 0.519
AspVal: 3.138 ± 0.434
AspTrp: 1.07 ± 0.388
AspTyr: 1.141 ± 0.293
AspXaa: 0.0 ± 0.0

Glu
GluAla: 5.706 ± 0.678
GluCys: 0.999 ± 0.278
GluAsp: 3.566 ± 0.524
GluGlu: 1.569 ± 0.356
GluPhe: 2.568 ± 0.422
GluGly: 2.853 ± 0.417
GluHis: 1.427 ± 0.304
GluIle: 2.782 ± 0.493
GluLys: 2.14 ± 0.368
GluLeu: 5.136 ± 0.634
GluMet: 1.569 ± 0.429
GluAsn: 1.712 ± 0.271
GluPro: 2.71 ± 0.576
GluGln: 3.21 ± 0.453
GluArg: 4.494 ± 0.744
GluSer: 2.924 ± 0.5
GluThr: 3.638 ± 0.535
GluVal: 4.28 ± 0.586
GluTrp: 0.999 ± 0.284
GluTyr: 1.355 ± 0.286
GluXaa: 0.0 ± 0.0

Phe
PheAla: 3.852 ± 0.523
PheCys: 0.357 ± 0.2
PheAsp: 1.926 ± 0.549
PheGlu: 1.783 ± 0.353
PhePhe: 0.999 ± 0.329
PheGly: 3.709 ± 0.535
PheHis: 1.07 ± 0.304
PheIle: 1.284 ± 0.272
PheLys: 0.713 ± 0.211
PheLeu: 1.997 ± 0.362
PheMet: 0.642 ± 0.246
PheAsn: 0.785 ± 0.207
PhePro: 0.856 ± 0.274
PheGln: 0.642 ± 0.202
PheArg: 1.641 ± 0.339
PheSer: 1.712 ± 0.31
PheThr: 2.496 ± 0.35
PheVal: 2.782 ± 0.464
PheTrp: 0.357 ± 0.272
PheTyr: 0.999 ± 0.254
PheXaa: 0.0 ± 0.0

Gly
GlyAla: 8.417 ± 1.24
GlyCys: 0.927 ± 0.308
GlyAsp: 5.136 ± 0.632
GlyGlu: 3.994 ± 0.566
GlyPhe: 2.782 ± 0.475
GlyGly: 9.201 ± 1.297
GlyHis: 1.783 ± 0.346
GlyIle: 4.28 ± 0.639
GlyLys: 1.997 ± 0.358
GlyLeu: 6.491 ± 0.684
GlyMet: 1.783 ± 0.344
GlyAsn: 2.924 ± 0.439
GlyPro: 4.494 ± 0.743
GlyGln: 4.422 ± 0.548
GlyArg: 5.849 ± 0.461
GlySer: 4.779 ± 0.67
GlyThr: 6.063 ± 0.824
GlyVal: 6.491 ± 0.636
GlyTrp: 1.997 ± 0.314
GlyTyr: 2.853 ± 0.647
GlyXaa: 0.0 ± 0.0

His
HisAla: 1.997 ± 0.4
HisCys: 0.357 ± 0.187
HisAsp: 1.498 ± 0.291
HisGlu: 1.712 ± 0.422
HisPhe: 0.357 ± 0.154
HisGly: 1.926 ± 0.413
HisHis: 0.713 ± 0.21
HisIle: 0.856 ± 0.208
HisLys: 0.785 ± 0.248
HisLeu: 1.498 ± 0.427
HisMet: 0.143 ± 0.087
HisAsn: 0.214 ± 0.137
HisPro: 1.07 ± 0.29
HisGln: 1.141 ± 0.276
HisArg: 1.641 ± 0.333
HisSer: 0.571 ± 0.25
HisThr: 1.712 ± 0.287
HisVal: 1.213 ± 0.31
HisTrp: 0.499 ± 0.167
HisTyr: 0.927 ± 0.218
HisXaa: 0.0 ± 0.0

Ile
IleAla: 5.492 ± 0.67
IleCys: 0.143 ± 0.096
IleAsp: 3.78 ± 0.646
IleGlu: 4.422 ± 0.444
IlePhe: 0.927 ± 0.272
IleGly: 4.28 ± 0.473
IleHis: 1.355 ± 0.333
IleIle: 1.355 ± 0.338
IleLys: 1.141 ± 0.313
IleLeu: 2.354 ± 0.46
IleMet: 0.214 ± 0.12
IleAsn: 1.997 ± 0.356
IlePro: 2.425 ± 0.435
IleGln: 1.569 ± 0.302
IleArg: 3.638 ± 0.582
IleSer: 2.14 ± 0.384
IleThr: 3.709 ± 0.517
IleVal: 2.71 ± 0.556
IleTrp: 0.785 ± 0.22
IleTyr: 1.141 ± 0.315
IleXaa: 0.0 ± 0.0

Lys
LysAla: 4.208 ± 0.82
LysCys: 0.357 ± 0.18
LysAsp: 0.999 ± 0.255
LysGlu: 1.213 ± 0.231
LysPhe: 0.999 ± 0.228
LysGly: 2.354 ± 0.42
LysHis: 0.571 ± 0.175
LysIle: 1.355 ± 0.271
LysLys: 0.571 ± 0.179
LysLeu: 2.639 ± 0.514
LysMet: 0.713 ± 0.214
LysAsn: 0.571 ± 0.175
LysPro: 2.568 ± 0.385
LysGln: 1.427 ± 0.332
LysArg: 2.425 ± 0.377
LysSer: 1.712 ± 0.333
LysThr: 1.926 ± 0.378
LysVal: 2.068 ± 0.44
LysTrp: 0.785 ± 0.208
LysTyr: 0.713 ± 0.234
LysXaa: 0.0 ± 0.0

Leu
LeuAla: 10.342 ± 0.999
LeuCys: 0.927 ± 0.3
LeuAsp: 5.278 ± 0.564
LeuGlu: 3.638 ± 0.462
LeuPhe: 2.853 ± 0.507
LeuGly: 8.274 ± 0.835
LeuHis: 1.213 ± 0.295
LeuIle: 3.852 ± 0.471
LeuLys: 2.568 ± 0.459
LeuLeu: 6.776 ± 0.802
LeuMet: 1.141 ± 0.332
LeuAsn: 2.71 ± 0.442
LeuPro: 4.422 ± 0.669
LeuGln: 2.782 ± 0.436
LeuArg: 5.35 ± 0.75
LeuSer: 4.066 ± 0.649
LeuThr: 5.849 ± 0.677
LeuVal: 4.85 ± 0.542
LeuTrp: 1.284 ± 0.284
LeuTyr: 1.141 ± 0.236
LeuXaa: 0.0 ± 0.0

Met
MetAla: 3.281 ± 0.463
MetCys: 0.071 ± 0.077
MetAsp: 0.785 ± 0.262
MetGlu: 0.571 ± 0.264
MetPhe: 0.999 ± 0.217
MetGly: 0.571 ± 0.167
MetHis: 0.285 ± 0.138
MetIle: 1.213 ± 0.321
MetLys: 0.571 ± 0.208
MetLeu: 1.427 ± 0.293
MetMet: 0.357 ± 0.149
MetAsn: 0.856 ± 0.265
MetPro: 1.498 ± 0.343
MetGln: 0.713 ± 0.174
MetArg: 1.213 ± 0.307
MetSer: 2.282 ± 0.381
MetThr: 1.783 ± 0.356
MetVal: 0.642 ± 0.175
MetTrp: 0.785 ± 0.287
MetTyr: 0.428 ± 0.159
MetXaa: 0.0 ± 0.0

Asn
AsnAla: 3.424 ± 0.452
AsnCys: 0.357 ± 0.146
AsnAsp: 1.854 ± 0.361
AsnGlu: 0.927 ± 0.271
AsnPhe: 0.999 ± 0.258
AsnGly: 3.638 ± 0.545
AsnHis: 0.214 ± 0.111
AsnIle: 1.641 ± 0.359
AsnLys: 0.713 ± 0.289
AsnLeu: 2.996 ± 0.476
AsnMet: 0.428 ± 0.169
AsnAsn: 1.498 ± 0.296
AsnPro: 2.996 ± 0.386
AsnGln: 1.355 ± 0.357
AsnArg: 1.997 ± 0.362
AsnSer: 1.427 ± 0.314
AsnThr: 1.712 ± 0.29
AsnVal: 1.712 ± 0.345
AsnTrp: 0.428 ± 0.167
AsnTyr: 0.642 ± 0.246
AsnXaa: 0.0 ± 0.0

Pro
ProAla: 6.776 ± 0.78
ProCys: 0.713 ± 0.262
ProAsp: 4.565 ± 0.571
ProGlu: 4.779 ± 0.621
ProPhe: 1.997 ± 0.34
ProGly: 5.92 ± 0.808
ProHis: 1.213 ± 0.276
ProIle: 2.282 ± 0.418
ProLys: 1.213 ± 0.292
ProLeu: 4.708 ± 0.545
ProMet: 1.569 ± 0.385
ProAsn: 1.712 ± 0.313
ProPro: 4.137 ± 0.493
ProGln: 2.14 ± 0.435
ProArg: 2.568 ± 0.475
ProSer: 3.281 ± 0.474
ProThr: 3.994 ± 0.601
ProVal: 4.779 ± 0.519
ProTrp: 1.355 ± 0.424
ProTyr: 1.284 ± 0.351
ProXaa: 0.0 ± 0.0

Gln
GlnAla: 4.993 ± 0.884
GlnCys: 0.499 ± 0.165
GlnAsp: 1.641 ± 0.477
GlnGlu: 1.427 ± 0.335
GlnPhe: 1.141 ± 0.211
GlnGly: 2.068 ± 0.313
GlnHis: 1.213 ± 0.311
GlnIle: 2.782 ± 0.426
GlnLys: 1.213 ± 0.248
GlnLeu: 4.636 ± 0.517
GlnMet: 1.141 ± 0.261
GlnAsn: 0.642 ± 0.19
GlnPro: 2.425 ± 0.549
GlnGln: 1.569 ± 0.415
GlnArg: 3.638 ± 0.641
GlnSer: 1.997 ± 0.396
GlnThr: 2.639 ± 0.367
GlnVal: 3.138 ± 0.471
GlnTrp: 0.642 ± 0.237
GlnTyr: 0.713 ± 0.198
GlnXaa: 0.0 ± 0.0

Arg
ArgAla: 7.061 ± 0.734
ArgCys: 1.213 ± 0.351
ArgAsp: 4.636 ± 0.597
ArgGlu: 4.066 ± 0.684
ArgPhe: 1.783 ± 0.479
ArgGly: 5.421 ± 0.661
ArgHis: 1.498 ± 0.318
ArgIle: 3.923 ± 0.647
ArgLys: 2.782 ± 0.516
ArgLeu: 4.708 ± 0.519
ArgMet: 2.211 ± 0.533
ArgAsn: 1.926 ± 0.328
ArgPro: 4.28 ± 0.701
ArgGln: 2.782 ± 0.64
ArgArg: 6.847 ± 1.111
ArgSer: 2.568 ± 0.47
ArgThr: 3.994 ± 0.616
ArgVal: 3.923 ± 0.479
ArgTrp: 1.569 ± 0.283
ArgTyr: 1.926 ± 0.391
ArgXaa: 0.0 ± 0.0

Ser
SerAla: 7.133 ± 1.057
SerCys: 0.642 ± 0.2
SerAsp: 2.853 ± 0.45
SerGlu: 2.211 ± 0.395
SerPhe: 1.926 ± 0.379
SerGly: 5.064 ± 0.555
SerHis: 0.713 ± 0.224
SerIle: 1.569 ± 0.367
SerLys: 1.641 ± 0.343
SerLeu: 3.994 ± 0.602
SerMet: 1.926 ± 0.356
SerAsn: 1.284 ± 0.315
SerPro: 3.566 ± 0.429
SerGln: 1.783 ± 0.417
SerArg: 3.067 ± 0.409
SerSer: 3.067 ± 0.402
SerThr: 3.067 ± 0.427
SerVal: 3.566 ± 0.354
SerTrp: 1.07 ± 0.234
SerTyr: 1.641 ± 0.365
SerXaa: 0.0 ± 0.0

Thr
ThrAla: 8.488 ± 0.707
ThrCys: 0.571 ± 0.208
ThrAsp: 4.351 ± 0.571
ThrGlu: 3.566 ± 0.552
ThrPhe: 1.712 ± 0.339
ThrGly: 6.348 ± 0.848
ThrHis: 1.07 ± 0.284
ThrIle: 3.067 ± 0.441
ThrLys: 2.425 ± 0.403
ThrLeu: 5.777 ± 0.601
ThrMet: 1.141 ± 0.256
ThrAsn: 1.926 ± 0.311
ThrPro: 4.85 ± 0.59
ThrGln: 2.068 ± 0.355
ThrArg: 3.923 ± 0.534
ThrSer: 2.924 ± 0.488
ThrThr: 4.565 ± 0.799
ThrVal: 6.705 ± 0.831
ThrTrp: 0.785 ± 0.252
ThrTyr: 1.427 ± 0.314
ThrXaa: 0.0 ± 0.0

Val
ValAla: 7.989 ± 0.949
ValCys: 0.999 ± 0.239
ValAsp: 4.066 ± 0.502
ValGlu: 5.207 ± 0.788
ValPhe: 1.569 ± 0.389
ValGly: 5.064 ± 0.553
ValHis: 1.284 ± 0.309
ValIle: 3.281 ± 0.48
ValLys: 3.067 ± 0.509
ValLeu: 3.994 ± 0.419
ValMet: 0.999 ± 0.321
ValAsn: 2.996 ± 0.504
ValPro: 3.709 ± 0.398
ValGln: 2.354 ± 0.305
ValArg: 5.635 ± 0.618
ValSer: 3.495 ± 0.519
ValThr: 5.064 ± 0.56
ValVal: 4.636 ± 0.623
ValTrp: 0.856 ± 0.261
ValTyr: 1.213 ± 0.28
ValXaa: 0.0 ± 0.0

Trp
TrpAla: 1.498 ± 0.371
TrpCys: 0.927 ± 0.28
TrpAsp: 0.927 ± 0.232
TrpGlu: 0.428 ± 0.158
TrpPhe: 0.642 ± 0.243
TrpGly: 1.07 ± 0.287
TrpHis: 0.713 ± 0.228
TrpIle: 0.571 ± 0.205
TrpLys: 0.428 ± 0.151
TrpLeu: 1.854 ± 0.427
TrpMet: 0.143 ± 0.122
TrpAsn: 0.499 ± 0.172
TrpPro: 0.856 ± 0.278
TrpGln: 0.856 ± 0.225
TrpArg: 1.498 ± 0.376
TrpSer: 1.141 ± 0.241
TrpThr: 1.712 ± 0.342
TrpVal: 1.712 ± 0.319
TrpTrp: 0.713 ± 0.218
TrpTyr: 0.428 ± 0.151
TrpXaa: 0.0 ± 0.0

Tyr
TyrAla: 2.282 ± 0.34
TyrCys: 0.428 ± 0.159
TyrAsp: 2.14 ± 0.336
TyrGlu: 1.427 ± 0.276
TyrPhe: 0.642 ± 0.25
TyrGly: 1.854 ± 0.306
TyrHis: 0.642 ± 0.215
TyrIle: 0.856 ± 0.244
TyrLys: 0.713 ± 0.223
TyrLeu: 1.783 ± 0.378
TyrMet: 0.285 ± 0.125
TyrAsn: 0.856 ± 0.224
TyrPro: 0.999 ± 0.295
TyrGln: 0.713 ± 0.212
TyrArg: 1.569 ± 0.306
TyrSer: 1.213 ± 0.237
TyrThr: 1.641 ± 0.363
TyrVal: 1.712 ± 0.446
TyrTrp: 0.642 ± 0.275
TyrTyr: 0.785 ± 0.223
TyrXaa: 0.0 ± 0.0

Xaa
XaaAla: 0.0 ± 0.0
XaaCys: 0.0 ± 0.0
XaaAsp: 0.0 ± 0.0
XaaGlu: 0.0 ± 0.0
XaaPhe: 0.0 ± 0.0
XaaGly: 0.0 ± 0.0
XaaHis: 0.0 ± 0.0
XaaIle: 0.0 ± 0.0
XaaLys: 0.0 ± 0.0
XaaLeu: 0.0 ± 0.0
XaaMet: 0.0 ± 0.0
XaaAsn: 0.0 ± 0.0
XaaPro: 0.0 ± 0.0
XaaGln: 0.0 ± 0.0
XaaArg: 0.0 ± 0.0
XaaSer: 0.0 ± 0.0
XaaThr: 0.0 ± 0.0
XaaVal: 0.0 ± 0.0
XaaTrp: 0.0 ± 0.0
XaaTyr: 0.0 ± 0.0
XaaXaa: 0.0 ± 0.0

Statistics based on 68 proteins (14021 amino acids)

Note: The error was estimated by bootstrapping (×100) at the protein level.
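As a rough illustration of what bootstrapping at the protein level means, the sketch below resamples whole proteins with replacement 100 times and reports the spread of the recomputed frequency. It is not the code behind this page: the use of the standard deviation as the error, the overlapping-pair counting, the fixed seed, and the names `pair_permille` and `bootstrap_error` are all assumptions made for the example.

```python
import random
import statistics


def pair_permille(proteins, pair):
    """Frequency (per mille) of one dipeptide among all adjacent pairs."""
    total = 0
    hits = 0
    for seq in proteins:
        for i in range(len(seq) - 1):
            total += 1
            if seq[i:i + 2] == pair:
                hits += 1
    return 1000.0 * hits / total if total else 0.0


def bootstrap_error(proteins, pair, rounds=100, seed=0):
    """Estimate the error of a dipeptide frequency by protein-level bootstrap.

    Each round draws len(proteins) whole proteins with replacement, so the
    resampling unit is the protein, not the individual residue or pair.
    Assumption: the reported error is the standard deviation of the
    per-round estimates.
    """
    rng = random.Random(seed)        # fixed seed only to make the sketch repeatable
    estimates = []
    for _ in range(rounds):
        sample = rng.choices(proteins, k=len(proteins))
        estimates.append(pair_permille(sample, pair))
    return statistics.pstdev(estimates)


if __name__ == "__main__":
    toy = ["AAAA", "CCCC", "ACAC", "AACA"]   # toy input, not real phage proteins
    print(pair_permille(toy, "AA"), bootstrap_error(toy, "AA"))
```

Resampling whole proteins keeps the pairs inside each protein together, which is why rare dipeptides concentrated in a few proteins tend to get relatively large errors.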

The dipeptide statistics above (among other statistics for this proteome) can be downloaded from this CSV file
See this proteome in: uniprot_link
Proteome-pI is available under a Creative Commons Attribution-NoDerivs license; for more details see here

Reference: Kozlowski LP. Proteome-pI 2.0: Proteome Isoelectric Point Database Update. Nucleic Acids Res. 2021, doi: 10.1093/nar/gkab944

Contact: Lukasz P. Kozlowski