Amino acid dipeptide frequency for Mycobacterium phage Krakatau

Apart from single amino acid frequencies, one can also calculate the so-called amino acid dipeptide frequency, i.e. how often each ordered pair of adjacent residues occurs.

There are 400 possible dipeptides (441 if Xaa is counted as an additional 21st amino acid representing all non-standard or unknown amino acids). If dipeptides occurred in proteins at random and with equal probability, each of the 400 would be present at a frequency of 0.25% (2.5‰). This is not the case in nature, so for better visibility, dipeptides that occur more often than expected are marked in red in the table, and those that are underrepresented are marked in blue.
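As a rough illustration of how such frequencies can be computed, the Python sketch below counts overlapping residue pairs within each protein and reports them per mille. The function names, the one-letter alphabet, the mapping of any non-standard letter to X, and the choice of denominator (total number of counted pairs) are assumptions made for this example; the exact procedure used by Proteome-pI is not described on this page.

```python
from collections import Counter
from itertools import product

STANDARD = "ACDEFGHIKLMNPQRSTVWY"    # 20 standard residues, one-letter codes
ALPHABET = STANDARD + "X"            # X stands in for Xaa (assumption)
UNIFORM_EXPECTATION = 1000.0 / 400   # 2.5 per mille, i.e. 0.25%


def normalize(seq):
    """Upper-case a sequence and map every non-standard letter to X."""
    return "".join(c if c in STANDARD else "X" for c in seq.upper())


def dipeptide_permille(proteins):
    """Return {dipeptide: frequency in per mille} for all 441 pairs.

    Pairs are overlapping and are counted within each protein only,
    never across protein boundaries.
    """
    counts = Counter()
    for seq in proteins:
        seq = normalize(seq)
        for first, second in zip(seq, seq[1:]):
            counts[first + second] += 1
    total = sum(counts.values())
    if total == 0:
        return {}
    return {
        a + b: 1000.0 * counts[a + b] / total
        for a, b in product(ALPHABET, repeat=2)
    }


def classify(value, expected=UNIFORM_EXPECTATION):
    """Label a per-mille value relative to the uniform expectation."""
    if value > expected:
        return "over"
    if value < expected:
        return "under"
    return "as expected"
```

Under these assumptions, a value such as AlaAla at 13.329‰ would be labelled "over" and CysCys at 0.059‰ would be labelled "under".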

All values are given in per mille (‰), so they need to be multiplied by 10⁻³ to obtain fractions; for example, 13.329‰ corresponds to 0.013329, or 1.3329%.

For more information, see the sequence space article on Wikipedia.

Ala Cys Asp Glu Phe Gly His Ile Lys Leu Met Asn Pro Gln Arg Ser Thr Val Trp Tyr Xaa
Ala
AlaAla: 13.329 ± 1.684
AlaCys: 1.062 ± 0.225
AlaAsp: 7.136 ± 0.618
AlaGlu: 7.431 ± 0.802
AlaPhe: 3.008 ± 0.366
AlaGly: 10.203 ± 1.33
AlaHis: 2.359 ± 0.393
AlaIle: 3.951 ± 0.523
AlaLys: 3.892 ± 0.456
AlaLeu: 8.139 ± 0.849
AlaMet: 2.595 ± 0.413
AlaAsn: 2.477 ± 0.344
AlaPro: 5.19 ± 0.622
AlaGln: 3.539 ± 0.477
AlaArg: 7.844 ± 0.853
AlaSer: 5.426 ± 0.667
AlaThr: 6.546 ± 0.5
AlaVal: 6.723 ± 0.714
AlaTrp: 2.536 ± 0.371
AlaTyr: 2.359 ± 0.31
AlaXaa: 0.0 ± 0.0
Cys
CysAla: 1.121 ± 0.266
CysCys: 0.059 ± 0.065
CysAsp: 1.356 ± 0.282
CysGlu: 0.944 ± 0.229
CysPhe: 0.236 ± 0.122
CysGly: 1.769 ± 0.331
CysHis: 0.236 ± 0.11
CysIle: 0.177 ± 0.107
CysLys: 0.767 ± 0.193
CysLeu: 0.531 ± 0.228
CysMet: 0.295 ± 0.117
CysAsn: 0.354 ± 0.149
CysPro: 1.121 ± 0.26
CysGln: 0.177 ± 0.099
CysArg: 0.944 ± 0.264
CysSer: 0.649 ± 0.192
CysThr: 0.885 ± 0.249
CysVal: 0.649 ± 0.192
CysTrp: 0.354 ± 0.121
CysTyr: 0.295 ± 0.145
CysXaa: 0.0 ± 0.0
Asp
AspAla: 6.723 ± 0.635
AspCys: 0.885 ± 0.253
AspAsp: 4.423 ± 0.606
AspGlu: 3.303 ± 0.419
AspPhe: 1.651 ± 0.279
AspGly: 6.664 ± 0.602
AspHis: 1.003 ± 0.207
AspIle: 2.536 ± 0.379
AspLys: 1.71 ± 0.281
AspLeu: 5.721 ± 0.501
AspMet: 0.885 ± 0.226
AspAsn: 1.71 ± 0.374
AspPro: 5.367 ± 0.636
AspGln: 2.359 ± 0.343
AspArg: 5.603 ± 0.621
AspSer: 2.949 ± 0.566
AspThr: 3.951 ± 0.534
AspVal: 4.01 ± 0.522
AspTrp: 1.474 ± 0.324
AspTyr: 2.064 ± 0.317
AspXaa: 0.0 ± 0.0
Glu
GluAla: 6.782 ± 0.691
GluCys: 0.885 ± 0.251
GluAsp: 2.89 ± 0.4
GluGlu: 3.244 ± 0.552
GluPhe: 2.064 ± 0.343
GluGly: 3.362 ± 0.421
GluHis: 1.828 ± 0.453
GluIle: 2.064 ± 0.34
GluLys: 1.828 ± 0.332
GluLeu: 5.662 ± 0.745
GluMet: 1.651 ± 0.33
GluAsn: 1.946 ± 0.357
GluPro: 2.654 ± 0.457
GluGln: 2.949 ± 0.39
GluArg: 5.131 ± 0.66
GluSer: 2.772 ± 0.398
GluThr: 3.892 ± 0.565
GluVal: 4.128 ± 0.471
GluTrp: 1.415 ± 0.293
GluTyr: 1.769 ± 0.331
GluXaa: 0.0 ± 0.0
Phe
PheAla: 3.244 ± 0.427
PheCys: 0.413 ± 0.151
PheAsp: 2.182 ± 0.418
PheGlu: 1.71 ± 0.281
PhePhe: 0.944 ± 0.289
PheGly: 2.89 ± 0.643
PheHis: 0.531 ± 0.197
PheIle: 1.71 ± 0.373
PheLys: 1.062 ± 0.266
PheLeu: 1.533 ± 0.259
PheMet: 0.649 ± 0.22
PheAsn: 1.238 ± 0.328
PhePro: 1.651 ± 0.298
PheGln: 1.18 ± 0.33
PheArg: 1.651 ± 0.257
PheSer: 1.474 ± 0.3
PheThr: 2.831 ± 0.465
PheVal: 2.241 ± 0.306
PheTrp: 0.708 ± 0.182
PheTyr: 1.003 ± 0.314
PheXaa: 0.0 ± 0.0
Gly
GlyAla: 9.495 ± 1.227
GlyCys: 1.003 ± 0.236
GlyAsp: 6.428 ± 0.559
GlyGlu: 4.01 ± 0.511
GlyPhe: 2.772 ± 0.513
GlyGly: 10.085 ± 2.192
GlyHis: 1.828 ± 0.339
GlyIle: 4.6 ± 0.586
GlyLys: 2.831 ± 0.43
GlyLeu: 6.251 ± 0.544
GlyMet: 2.536 ± 0.575
GlyAsn: 3.303 ± 0.475
GlyPro: 4.246 ± 0.553
GlyGln: 2.595 ± 0.658
GlyArg: 5.367 ± 0.685
GlySer: 5.957 ± 0.962
GlyThr: 6.546 ± 0.865
GlyVal: 6.134 ± 0.596
GlyTrp: 2.418 ± 0.441
GlyTyr: 1.71 ± 0.349
GlyXaa: 0.0 ± 0.0
His
HisAla: 1.71 ± 0.371
HisCys: 0.177 ± 0.133
HisAsp: 0.767 ± 0.189
HisGlu: 1.121 ± 0.25
HisPhe: 0.531 ± 0.153
HisGly: 1.71 ± 0.316
HisHis: 1.062 ± 0.289
HisIle: 1.651 ± 0.34
HisLys: 0.767 ± 0.247
HisLeu: 1.828 ± 0.311
HisMet: 0.472 ± 0.128
HisAsn: 0.944 ± 0.195
HisPro: 1.297 ± 0.246
HisGln: 0.649 ± 0.196
HisArg: 1.946 ± 0.338
HisSer: 0.944 ± 0.236
HisThr: 1.533 ± 0.358
HisVal: 1.651 ± 0.363
HisTrp: 0.59 ± 0.169
HisTyr: 0.826 ± 0.197
HisXaa: 0.0 ± 0.0
Ile
IleAla: 5.249 ± 0.53
IleCys: 0.649 ± 0.206
IleAsp: 3.303 ± 0.489
IleGlu: 3.657 ± 0.392
IlePhe: 0.59 ± 0.247
IleGly: 4.187 ± 0.527
IleHis: 1.592 ± 0.321
IleIle: 1.297 ± 0.22
IleLys: 0.826 ± 0.23
IleLeu: 2.3 ± 0.439
IleMet: 0.354 ± 0.139
IleAsn: 1.71 ± 0.264
IlePro: 2.595 ± 0.326
IleGln: 1.415 ± 0.237
IleArg: 2.654 ± 0.442
IleSer: 2.005 ± 0.43
IleThr: 3.892 ± 0.449
IleVal: 3.126 ± 0.39
IleTrp: 0.885 ± 0.22
IleTyr: 0.708 ± 0.206
IleXaa: 0.0 ± 0.0
Lys
LysAla: 3.539 ± 0.47
LysCys: 0.649 ± 0.214
LysAsp: 1.828 ± 0.306
LysGlu: 1.71 ± 0.322
LysPhe: 1.297 ± 0.241
LysGly: 2.536 ± 0.418
LysHis: 1.062 ± 0.264
LysIle: 0.885 ± 0.218
LysLys: 1.474 ± 0.41
LysLeu: 2.831 ± 0.498
LysMet: 0.767 ± 0.202
LysAsn: 0.767 ± 0.198
LysPro: 2.182 ± 0.398
LysGln: 1.474 ± 0.269
LysArg: 2.595 ± 0.477
LysSer: 2.182 ± 0.346
LysThr: 1.769 ± 0.336
LysVal: 2.536 ± 0.461
LysTrp: 0.767 ± 0.22
LysTyr: 0.826 ± 0.238
LysXaa: 0.0 ± 0.0
Leu
LeuAla: 8.139 ± 0.907
LeuCys: 0.767 ± 0.216
LeuAsp: 5.249 ± 0.62
LeuGlu: 4.069 ± 0.562
LeuPhe: 3.126 ± 0.348
LeuGly: 5.603 ± 0.61
LeuHis: 1.415 ± 0.264
LeuIle: 3.067 ± 0.467
LeuLys: 2.005 ± 0.326
LeuLeu: 4.895 ± 0.641
LeuMet: 1.651 ± 0.285
LeuAsn: 2.359 ± 0.398
LeuPro: 5.721 ± 0.638
LeuGln: 2.536 ± 0.406
LeuArg: 4.895 ± 0.718
LeuSer: 5.013 ± 0.449
LeuThr: 5.308 ± 0.55
LeuVal: 4.659 ± 0.493
LeuTrp: 1.238 ± 0.275
LeuTyr: 2.005 ± 0.367
LeuXaa: 0.0 ± 0.0
Met
MetAla: 2.359 ± 0.345
MetCys: 0.354 ± 0.184
MetAsp: 1.238 ± 0.306
MetGlu: 0.885 ± 0.194
MetPhe: 0.531 ± 0.208
MetGly: 1.651 ± 0.311
MetHis: 0.118 ± 0.094
MetIle: 0.826 ± 0.224
MetLys: 0.826 ± 0.267
MetLeu: 1.828 ± 0.265
MetMet: 0.531 ± 0.228
MetAsn: 1.18 ± 0.249
MetPro: 1.297 ± 0.252
MetGln: 0.649 ± 0.183
MetArg: 1.769 ± 0.309
MetSer: 2.89 ± 0.378
MetThr: 1.946 ± 0.312
MetVal: 1.356 ± 0.348
MetTrp: 0.354 ± 0.136
MetTyr: 0.354 ± 0.137
MetXaa: 0.0 ± 0.0
Asn
AsnAla: 3.539 ± 0.398
AsnCys: 0.295 ± 0.139
AsnAsp: 2.123 ± 0.353
AsnGlu: 1.415 ± 0.298
AsnPhe: 0.885 ± 0.31
AsnGly: 4.541 ± 0.736
AsnHis: 0.944 ± 0.199
AsnIle: 1.769 ± 0.383
AsnLys: 1.062 ± 0.238
AsnLeu: 2.359 ± 0.386
AsnMet: 0.59 ± 0.165
AsnAsn: 1.592 ± 0.329
AsnPro: 2.241 ± 0.367
AsnGln: 1.062 ± 0.301
AsnArg: 1.946 ± 0.375
AsnSer: 1.474 ± 0.307
AsnThr: 2.005 ± 0.328
AsnVal: 2.123 ± 0.309
AsnTrp: 0.885 ± 0.181
AsnTyr: 0.708 ± 0.187
AsnXaa: 0.0 ± 0.0
Pro
ProAla: 5.308 ± 0.666
ProCys: 0.708 ± 0.192
ProAsp: 4.187 ± 0.541
ProGlu: 4.6 ± 0.557
ProPhe: 1.828 ± 0.34
ProGly: 6.782 ± 0.698
ProHis: 1.415 ± 0.317
ProIle: 1.946 ± 0.374
ProLys: 2.005 ± 0.429
ProLeu: 4.364 ± 0.597
ProMet: 1.533 ± 0.326
ProAsn: 2.418 ± 0.394
ProPro: 3.657 ± 0.622
ProGln: 2.182 ± 0.416
ProArg: 3.362 ± 0.468
ProSer: 2.89 ± 0.381
ProThr: 3.303 ± 0.486
ProVal: 4.6 ± 0.552
ProTrp: 1.121 ± 0.273
ProTyr: 1.474 ± 0.267
ProXaa: 0.0 ± 0.0
Gln
GlnAla: 4.246 ± 0.595
GlnCys: 0.413 ± 0.199
GlnAsp: 1.238 ± 0.255
GlnGlu: 1.415 ± 0.281
GlnPhe: 1.003 ± 0.238
GlnGly: 2.477 ± 0.534
GlnHis: 0.708 ± 0.216
GlnIle: 1.946 ± 0.349
GlnLys: 1.297 ± 0.203
GlnLeu: 3.303 ± 0.367
GlnMet: 0.767 ± 0.212
GlnAsn: 0.767 ± 0.251
GlnPro: 2.831 ± 0.39
GlnGln: 1.415 ± 0.245
GlnArg: 2.595 ± 0.389
GlnSer: 2.536 ± 0.359
GlnThr: 1.887 ± 0.347
GlnVal: 2.241 ± 0.364
GlnTrp: 0.531 ± 0.194
GlnTyr: 0.767 ± 0.267
GlnXaa: 0.0 ± 0.0
Arg
ArgAla: 6.428 ± 0.63
ArgCys: 1.415 ± 0.378
ArgAsp: 4.423 ± 0.676
ArgGlu: 4.954 ± 0.753
ArgPhe: 2.123 ± 0.455
ArgGly: 4.128 ± 0.494
ArgHis: 1.651 ± 0.368
ArgIle: 3.833 ± 0.555
ArgLys: 2.595 ± 0.431
ArgLeu: 5.249 ± 0.797
ArgMet: 2.241 ± 0.391
ArgAsn: 2.595 ± 0.501
ArgPro: 3.657 ± 0.429
ArgGln: 1.946 ± 0.474
ArgArg: 5.485 ± 0.991
ArgSer: 4.187 ± 0.55
ArgThr: 3.48 ± 0.511
ArgVal: 5.249 ± 0.574
ArgTrp: 2.005 ± 0.417
ArgTyr: 1.887 ± 0.398
ArgXaa: 0.0 ± 0.0
Ser
SerAla: 5.721 ± 0.886
SerCys: 0.472 ± 0.183
SerAsp: 3.951 ± 0.478
SerGlu: 3.185 ± 0.515
SerPhe: 2.359 ± 0.442
SerGly: 6.487 ± 0.873
SerHis: 1.062 ± 0.275
SerIle: 2.595 ± 0.43
SerLys: 2.536 ± 0.415
SerLeu: 3.715 ± 0.498
SerMet: 1.474 ± 0.267
SerAsn: 2.241 ± 0.377
SerPro: 3.185 ± 0.361
SerGln: 1.651 ± 0.321
SerArg: 3.362 ± 0.439
SerSer: 4.246 ± 0.731
SerThr: 3.303 ± 0.417
SerVal: 4.364 ± 0.548
SerTrp: 1.356 ± 0.315
SerTyr: 1.238 ± 0.25
SerXaa: 0.0 ± 0.0
Thr
ThrAla: 7.018 ± 0.735
ThrCys: 0.944 ± 0.251
ThrAsp: 3.774 ± 0.563
ThrGlu: 3.539 ± 0.377
ThrPhe: 2.123 ± 0.375
ThrGly: 6.134 ± 0.656
ThrHis: 1.474 ± 0.283
ThrIle: 3.303 ± 0.442
ThrLys: 1.946 ± 0.37
ThrLeu: 4.482 ± 0.523
ThrMet: 1.238 ± 0.291
ThrAsn: 2.536 ± 0.361
ThrPro: 4.423 ± 0.538
ThrGln: 1.887 ± 0.333
ThrArg: 4.01 ± 0.49
ThrSer: 3.539 ± 0.407
ThrThr: 4.895 ± 0.705
ThrVal: 5.544 ± 0.684
ThrTrp: 1.356 ± 0.336
ThrTyr: 1.651 ± 0.357
ThrXaa: 0.0 ± 0.0
Val
ValAla: 7.313 ± 0.566
ValCys: 1.062 ± 0.253
ValAsp: 5.485 ± 0.589
ValGlu: 4.482 ± 0.645
ValPhe: 2.3 ± 0.388
ValGly: 6.192 ± 0.767
ValHis: 1.18 ± 0.266
ValIle: 2.654 ± 0.413
ValLys: 2.477 ± 0.388
ValLeu: 5.072 ± 0.596
ValMet: 1.415 ± 0.263
ValAsn: 2.064 ± 0.328
ValPro: 3.774 ± 0.418
ValGln: 2.831 ± 0.438
ValArg: 4.305 ± 0.611
ValSer: 5.131 ± 0.545
ValThr: 4.718 ± 0.52
ValVal: 6.251 ± 0.808
ValTrp: 1.828 ± 0.332
ValTyr: 1.356 ± 0.26
ValXaa: 0.0 ± 0.0
Trp
TrpAla: 2.064 ± 0.31
TrpCys: 0.295 ± 0.144
TrpAsp: 1.415 ± 0.265
TrpGlu: 1.356 ± 0.385
TrpPhe: 0.767 ± 0.191
TrpGly: 1.238 ± 0.311
TrpHis: 0.472 ± 0.187
TrpIle: 1.18 ± 0.246
TrpLys: 1.003 ± 0.234
TrpLeu: 1.828 ± 0.394
TrpMet: 0.944 ± 0.268
TrpAsn: 0.531 ± 0.221
TrpPro: 1.297 ± 0.314
TrpGln: 1.121 ± 0.279
TrpArg: 1.887 ± 0.474
TrpSer: 1.062 ± 0.287
TrpThr: 1.71 ± 0.32
TrpVal: 1.946 ± 0.506
TrpTrp: 1.121 ± 0.222
TrpTyr: 0.295 ± 0.136
TrpXaa: 0.0 ± 0.0
Tyr
TyrAla: 2.418 ± 0.368
TyrCys: 0.413 ± 0.156
TyrAsp: 1.592 ± 0.354
TyrGlu: 1.887 ± 0.398
TyrPhe: 0.708 ± 0.227
TyrGly: 1.474 ± 0.33
TyrHis: 0.236 ± 0.109
TyrIle: 1.003 ± 0.227
TyrLys: 0.767 ± 0.212
TyrLeu: 1.828 ± 0.283
TyrMet: 0.295 ± 0.134
TyrAsn: 0.826 ± 0.223
TyrPro: 1.297 ± 0.259
TyrGln: 0.767 ± 0.233
TyrArg: 2.182 ± 0.326
TyrSer: 1.18 ± 0.263
TyrThr: 1.474 ± 0.33
TyrVal: 2.359 ± 0.337
TyrTrp: 0.59 ± 0.151
TyrTyr: 0.767 ± 0.177
TyrXaa: 0.0 ± 0.0
Xaa
XaaAla: 0.0 ± 0.0
XaaCys: 0.0 ± 0.0
XaaAsp: 0.0 ± 0.0
XaaGlu: 0.0 ± 0.0
XaaPhe: 0.0 ± 0.0
XaaGly: 0.0 ± 0.0
XaaHis: 0.0 ± 0.0
XaaIle: 0.0 ± 0.0
XaaLys: 0.0 ± 0.0
XaaLeu: 0.0 ± 0.0
XaaMet: 0.0 ± 0.0
XaaAsn: 0.0 ± 0.0
XaaPro: 0.0 ± 0.0
XaaGln: 0.0 ± 0.0
XaaArg: 0.0 ± 0.0
XaaSer: 0.0 ± 0.0
XaaThr: 0.0 ± 0.0
XaaVal: 0.0 ± 0.0
XaaTrp: 0.0 ± 0.0
XaaTyr: 0.0 ± 0.0
XaaXaa: 0.0 ± 0.0
Statistics based on 97 proteins (16957 amino acids)

Note: The error was estimated by bootstrapping (100 resamples) at the protein level.
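One plausible reading of "bootstrapping at the protein level" is that whole proteins are resampled with replacement, the frequency is recomputed for each resample, and the spread of those estimates is reported as the error. The Python sketch below follows that reading; the function names, the fixed seed, and the use of the population standard deviation as the error measure are assumptions for illustration, not the documented Proteome-pI implementation.

```python
import random
import statistics
from collections import Counter


def pair_counts(seq):
    """Count overlapping residue pairs within a single protein."""
    return Counter(a + b for a, b in zip(seq, seq[1:]))


def bootstrap_error(proteins, dipeptide, n_resamples=100, seed=0):
    """Estimate the per-mille error of one dipeptide frequency.

    Proteins (not individual residues) are the resampling unit, so
    each replicate draws len(proteins) whole proteins with replacement.
    """
    rng = random.Random(seed)
    per_protein = [pair_counts(p) for p in proteins]
    if not per_protein:
        return 0.0
    estimates = []
    for _ in range(n_resamples):
        sample = rng.choices(per_protein, k=len(per_protein))
        total = sum(sum(c.values()) for c in sample)
        hits = sum(c[dipeptide] for c in sample)
        estimates.append(1000.0 * hits / total if total else 0.0)
    # Spread of the replicate estimates, in per mille.
    return statistics.pstdev(estimates)
```

Resampling whole proteins rather than individual pairs keeps the dependence between pairs from the same protein intact, which is why a dipeptide concentrated in a few proteins would tend to show a larger error.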

The dipeptide statistics above (along with other statistics for this proteome) can be downloaded from this CSV file
See this proteome in: uniprot_link
Proteome-pI is available under a Creative Commons Attribution-NoDerivs license; for more details see here

Reference: Kozlowski LP. Proteome-pI 2.0: Proteome Isoelectric Point Database Update. Nucleic Acids Res. 2021, doi: 10.1093/nar/gkab944
Contact: Lukasz P. Kozlowski