Amino acid dipeptide frequency for Mycobacterium phage Big3

Apart from single amino acid frequencies, one can also calculate the so-called amino acid dipeptide frequency, i.e. how often each ordered pair of adjacent residues occurs.
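As a rough illustration of what a dipeptide frequency is, the sketch below counts overlapping pairs of adjacent residues within each protein and reports them in per mille. The function name, the one-letter input sequences, and the choice to normalize by the total number of dipeptides are assumptions made for this example; the page does not state how its values were normalized.

```python
from collections import Counter

def dipeptide_per_mille(proteins):
    """Return {dipeptide: frequency in per mille} over all adjacent residue pairs."""
    counts = Counter()
    for seq in proteins:
        # Overlapping windows of length 2; pairs never span two proteins.
        counts.update(seq[i:i + 2] for i in range(len(seq) - 1))
    total = sum(counts.values())
    if total == 0:
        return {}
    return {pair: 1000.0 * n / total for pair, n in counts.items()}

# Toy input (hypothetical sequences, not taken from the Big3 proteome).
freqs = dipeptide_per_mille(["MAAG", "AAL"])
```

Here the two toy sequences contain five dipeptides in total (MA, AA, AG, AA, AL), so `freqs["AA"]` is two fifths of 1000.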

There are 400 possible dipeptides (441 if we consider Xaa as an additional 21st amino acid representing all non-standard or unknown amino acids). Thus, if dipeptides occurred randomly in proteins, each of the 400 dipeptides would be present with a frequency of 0.25% (2.5‰). As this is not the case in nature, for better visibility all dipeptides that are more frequent than expected are marked in red, and those that are underrepresented are marked in blue in the table.
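The uniform expectation and the over/under labelling can be sketched as follows. This is a minimal example, assuming that a value is simply compared against 1000/400 = 2.5‰; the exact rule the page uses for its red and blue colouring is not stated, and the function names are hypothetical. The three observed values are copied from the table.

```python
def expected_per_mille(alphabet_size=20):
    """Uniform expectation for a single dipeptide, in per mille."""
    return 1000.0 / alphabet_size ** 2

def classify(value_per_mille, expected):
    """Label a value relative to the uniform expectation (assumed rule)."""
    if value_per_mille > expected:
        return "over"
    if value_per_mille < expected:
        return "under"
    return "as expected"

expected = expected_per_mille()  # 20 standard amino acids -> 400 dipeptides
observed = {"AlaAla": 12.495, "CysCys": 0.0, "AspGln": 1.82}
labels = {name: classify(value, expected) for name, value in observed.items()}

# Per mille to plain fraction: multiply by 10^-3.
fraction = observed["AlaAla"] * 1e-3
```

Passing `alphabet_size=21` gives the expectation for the 441-dipeptide case that includes Xaa.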

All values are given in per mille (‰), so they need to be multiplied by 10⁻³ to obtain fractions.

For more information, see the sequence space article on Wikipedia.

Ala Cys Asp Glu Phe Gly His Ile Lys Leu Met Asn Pro Gln Arg Ser Thr Val Trp Tyr Xaa
Ala
12.495AlaAla: 12.495 ± 1.657
0.607AlaCys: 0.607 ± 0.173
6.126AlaAsp: 6.126 ± 0.712
5.883AlaGlu: 5.883 ± 0.655
3.154AlaPhe: 3.154 ± 0.522
7.824AlaGly: 7.824 ± 0.713
1.456AlaHis: 1.456 ± 0.32
3.7AlaIle: 3.7 ± 0.546
3.761AlaLys: 3.761 ± 0.562
9.401AlaLeu: 9.401 ± 0.887
2.608AlaMet: 2.608 ± 0.473
2.79AlaAsn: 2.79 ± 0.449
4.549AlaPro: 4.549 ± 0.703
3.093AlaGln: 3.093 ± 0.523
6.611AlaArg: 6.611 ± 0.609
5.52AlaSer: 5.52 ± 0.628
6.308AlaThr: 6.308 ± 0.682
8.31AlaVal: 8.31 ± 0.711
1.82AlaTrp: 1.82 ± 0.382
3.093AlaTyr: 3.093 ± 0.437
0.0AlaXaa: 0.0 ± 0.0
Cys
0.789CysAla: 0.789 ± 0.243
0.0CysCys: 0.0 ± 0.0
0.485CysAsp: 0.485 ± 0.174
0.789CysGlu: 0.789 ± 0.221
0.121CysPhe: 0.121 ± 0.077
0.364CysGly: 0.364 ± 0.166
0.121CysHis: 0.121 ± 0.103
0.182CysIle: 0.182 ± 0.108
0.303CysLys: 0.303 ± 0.135
0.364CysLeu: 0.364 ± 0.145
0.121CysMet: 0.121 ± 0.085
0.243CysAsn: 0.243 ± 0.12
0.303CysPro: 0.303 ± 0.156
0.243CysGln: 0.243 ± 0.107
0.546CysArg: 0.546 ± 0.193
0.243CysSer: 0.243 ± 0.11
0.303CysThr: 0.303 ± 0.13
0.243CysVal: 0.243 ± 0.106
0.182CysTrp: 0.182 ± 0.102
0.061CysTyr: 0.061 ± 0.058
0.0CysXaa: 0.0 ± 0.0
Asp
6.369AspAla: 6.369 ± 0.717
0.485AspCys: 0.485 ± 0.174
4.61AspAsp: 4.61 ± 0.547
4.124AspGlu: 4.124 ± 0.528
2.426AspPhe: 2.426 ± 0.355
6.369AspGly: 6.369 ± 0.745
1.334AspHis: 1.334 ± 0.288
2.729AspIle: 2.729 ± 0.425
2.911AspLys: 2.911 ± 0.408
7.218AspLeu: 7.218 ± 0.741
1.152AspMet: 1.152 ± 0.199
1.759AspAsn: 1.759 ± 0.345
4.913AspPro: 4.913 ± 0.559
1.82AspGln: 1.82 ± 0.376
3.518AspArg: 3.518 ± 0.385
3.761AspSer: 3.761 ± 0.447
3.518AspThr: 3.518 ± 0.435
3.7AspVal: 3.7 ± 0.523
1.577AspTrp: 1.577 ± 0.282
2.244AspTyr: 2.244 ± 0.306
0.0AspXaa: 0.0 ± 0.0
Glu
6.005GluAla: 6.005 ± 0.704
0.182GluCys: 0.182 ± 0.131
4.852GluAsp: 4.852 ± 0.507
5.095GluGlu: 5.095 ± 0.577
2.244GluPhe: 2.244 ± 0.352
4.064GluGly: 4.064 ± 0.389
1.274GluHis: 1.274 ± 0.307
3.397GluIle: 3.397 ± 0.477
2.547GluLys: 2.547 ± 0.407
7.764GluLeu: 7.764 ± 0.687
1.516GluMet: 1.516 ± 0.314
1.759GluAsn: 1.759 ± 0.43
2.426GluPro: 2.426 ± 0.412
2.487GluGln: 2.487 ± 0.379
4.064GluArg: 4.064 ± 0.605
3.336GluSer: 3.336 ± 0.439
4.124GluThr: 4.124 ± 0.58
5.701GluVal: 5.701 ± 0.575
1.638GluTrp: 1.638 ± 0.299
2.851GluTyr: 2.851 ± 0.513
0.0GluXaa: 0.0 ± 0.0
Phe
2.426PheAla: 2.426 ± 0.324
0.182PheCys: 0.182 ± 0.131
2.911PheAsp: 2.911 ± 0.412
2.366PheGlu: 2.366 ± 0.453
0.607PhePhe: 0.607 ± 0.183
3.336PheGly: 3.336 ± 0.531
0.789PheHis: 0.789 ± 0.29
1.456PheIle: 1.456 ± 0.251
1.152PheLys: 1.152 ± 0.292
2.487PheLeu: 2.487 ± 0.43
0.607PheMet: 0.607 ± 0.219
0.97PheAsn: 0.97 ± 0.23
1.638PhePro: 1.638 ± 0.308
1.031PheGln: 1.031 ± 0.258
1.941PheArg: 1.941 ± 0.397
2.184PheSer: 2.184 ± 0.487
1.82PheThr: 1.82 ± 0.369
2.062PheVal: 2.062 ± 0.361
0.607PheTrp: 0.607 ± 0.172
0.97PheTyr: 0.97 ± 0.259
0.0PheXaa: 0.0 ± 0.0
Gly
6.854GlyAla: 6.854 ± 0.898
0.607GlyCys: 0.607 ± 0.191
5.944GlyAsp: 5.944 ± 0.653
4.731GlyGlu: 4.731 ± 0.507
2.79GlyPhe: 2.79 ± 0.437
9.098GlyGly: 9.098 ± 1.918
1.88GlyHis: 1.88 ± 0.38
4.306GlyIle: 4.306 ± 0.569
3.943GlyLys: 3.943 ± 0.483
7.764GlyLeu: 7.764 ± 0.884
1.82GlyMet: 1.82 ± 0.311
3.154GlyAsn: 3.154 ± 0.447
3.579GlyPro: 3.579 ± 0.574
2.669GlyGln: 2.669 ± 0.306
5.034GlyArg: 5.034 ± 0.529
6.308GlySer: 6.308 ± 0.968
4.974GlyThr: 4.974 ± 0.658
5.641GlyVal: 5.641 ± 0.658
2.547GlyTrp: 2.547 ± 0.407
2.669GlyTyr: 2.669 ± 0.429
0.0GlyXaa: 0.0 ± 0.0
His
1.88HisAla: 1.88 ± 0.361
0.182HisCys: 0.182 ± 0.108
1.092HisAsp: 1.092 ± 0.218
1.516HisGlu: 1.516 ± 0.346
0.667HisPhe: 0.667 ± 0.169
1.516HisGly: 1.516 ± 0.358
0.546HisHis: 0.546 ± 0.218
1.152HisIle: 1.152 ± 0.251
0.97HisLys: 0.97 ± 0.27
1.274HisLeu: 1.274 ± 0.328
0.121HisMet: 0.121 ± 0.081
0.303HisAsn: 0.303 ± 0.121
1.334HisPro: 1.334 ± 0.303
0.97HisGln: 0.97 ± 0.272
1.456HisArg: 1.456 ± 0.359
0.485HisSer: 0.485 ± 0.168
0.91HisThr: 0.91 ± 0.27
1.456HisVal: 1.456 ± 0.325
0.546HisTrp: 0.546 ± 0.179
0.667HisTyr: 0.667 ± 0.219
0.0HisXaa: 0.0 ± 0.0
Ile
5.762IleAla: 5.762 ± 0.792
0.303IleCys: 0.303 ± 0.127
3.943IleAsp: 3.943 ± 0.466
3.7IleGlu: 3.7 ± 0.466
0.849IlePhe: 0.849 ± 0.196
3.882IleGly: 3.882 ± 0.426
0.789IleHis: 0.789 ± 0.218
1.516IleIle: 1.516 ± 0.311
1.516IleLys: 1.516 ± 0.304
3.579IleLeu: 3.579 ± 0.48
0.607IleMet: 0.607 ± 0.168
1.82IleAsn: 1.82 ± 0.299
3.397IlePro: 3.397 ± 0.437
1.213IleGln: 1.213 ± 0.416
4.003IleArg: 4.003 ± 0.56
3.336IleSer: 3.336 ± 0.52
3.215IleThr: 3.215 ± 0.467
2.851IleVal: 2.851 ± 0.566
0.91IleTrp: 0.91 ± 0.199
1.395IleTyr: 1.395 ± 0.255
0.0IleXaa: 0.0 ± 0.0
Lys
3.579LysAla: 3.579 ± 0.531
0.243LysCys: 0.243 ± 0.118
2.366LysAsp: 2.366 ± 0.411
1.88LysGlu: 1.88 ± 0.332
1.698LysPhe: 1.698 ± 0.28
2.547LysGly: 2.547 ± 0.376
1.213LysHis: 1.213 ± 0.309
2.487LysIle: 2.487 ± 0.455
2.184LysLys: 2.184 ± 0.453
3.7LysLeu: 3.7 ± 0.512
1.092LysMet: 1.092 ± 0.25
1.638LysAsn: 1.638 ± 0.26
2.79LysPro: 2.79 ± 0.353
1.395LysGln: 1.395 ± 0.308
2.669LysArg: 2.669 ± 0.392
2.366LysSer: 2.366 ± 0.373
2.487LysThr: 2.487 ± 0.349
3.093LysVal: 3.093 ± 0.511
0.91LysTrp: 0.91 ± 0.227
0.849LysTyr: 0.849 ± 0.262
0.0LysXaa: 0.0 ± 0.0
Leu
9.28LeuAla: 9.28 ± 0.827
0.364LeuCys: 0.364 ± 0.132
6.187LeuAsp: 6.187 ± 0.601
5.944LeuGlu: 5.944 ± 0.548
2.366LeuPhe: 2.366 ± 0.458
8.067LeuGly: 8.067 ± 0.691
1.334LeuHis: 1.334 ± 0.35
5.095LeuIle: 5.095 ± 0.587
4.549LeuLys: 4.549 ± 0.456
5.701LeuLeu: 5.701 ± 0.571
1.82LeuMet: 1.82 ± 0.28
3.397LeuAsn: 3.397 ± 0.411
5.398LeuPro: 5.398 ± 0.623
2.426LeuGln: 2.426 ± 0.451
6.187LeuArg: 6.187 ± 0.601
5.216LeuSer: 5.216 ± 0.546
6.308LeuThr: 6.308 ± 0.494
4.913LeuVal: 4.913 ± 0.594
1.092LeuTrp: 1.092 ± 0.327
2.244LeuTyr: 2.244 ± 0.414
0.0LeuXaa: 0.0 ± 0.0
Met
2.366MetAla: 2.366 ± 0.314
0.0MetCys: 0.0 ± 0.0
1.334MetAsp: 1.334 ± 0.302
1.395MetGlu: 1.395 ± 0.284
0.667MetPhe: 0.667 ± 0.197
1.759MetGly: 1.759 ± 0.308
0.243MetHis: 0.243 ± 0.125
0.789MetIle: 0.789 ± 0.202
0.91MetLys: 0.91 ± 0.221
1.274MetLeu: 1.274 ± 0.285
0.121MetMet: 0.121 ± 0.086
1.092MetAsn: 1.092 ± 0.245
1.152MetPro: 1.152 ± 0.287
0.607MetGln: 0.607 ± 0.174
1.213MetArg: 1.213 ± 0.253
1.82MetSer: 1.82 ± 0.305
2.002MetThr: 2.002 ± 0.332
1.092MetVal: 1.092 ± 0.282
0.425MetTrp: 0.425 ± 0.155
0.425MetTyr: 0.425 ± 0.191
0.0MetXaa: 0.0 ± 0.0
Asn
2.851AsnAla: 2.851 ± 0.488
0.061AsnCys: 0.061 ± 0.072
1.941AsnAsp: 1.941 ± 0.382
1.82AsnGlu: 1.82 ± 0.348
0.91AsnPhe: 0.91 ± 0.218
3.761AsnGly: 3.761 ± 0.533
0.607AsnHis: 0.607 ± 0.2
1.395AsnIle: 1.395 ± 0.272
0.849AsnLys: 0.849 ± 0.228
2.487AsnLeu: 2.487 ± 0.344
0.546AsnMet: 0.546 ± 0.157
1.031AsnAsn: 1.031 ± 0.275
2.608AsnPro: 2.608 ± 0.369
1.213AsnGln: 1.213 ± 0.353
1.638AsnArg: 1.638 ± 0.37
1.88AsnSer: 1.88 ± 0.373
2.184AsnThr: 2.184 ± 0.404
2.669AsnVal: 2.669 ± 0.405
0.728AsnTrp: 0.728 ± 0.181
1.213AsnTyr: 1.213 ± 0.294
0.0AsnXaa: 0.0 ± 0.0
Pro
5.459ProAla: 5.459 ± 0.689
0.485ProCys: 0.485 ± 0.179
4.246ProAsp: 4.246 ± 0.515
4.003ProGlu: 4.003 ± 0.509
1.941ProPhe: 1.941 ± 0.374
5.156ProGly: 5.156 ± 0.601
0.849ProHis: 0.849 ± 0.246
2.305ProIle: 2.305 ± 0.376
1.941ProLys: 1.941 ± 0.315
4.549ProLeu: 4.549 ± 0.552
0.849ProMet: 0.849 ± 0.245
1.759ProAsn: 1.759 ± 0.319
3.093ProPro: 3.093 ± 0.48
1.577ProGln: 1.577 ± 0.344
2.547ProArg: 2.547 ± 0.402
3.518ProSer: 3.518 ± 0.42
4.246ProThr: 4.246 ± 0.543
3.761ProVal: 3.761 ± 0.62
0.849ProTrp: 0.849 ± 0.269
1.759ProTyr: 1.759 ± 0.341
0.0ProXaa: 0.0 ± 0.0
Gln
3.336GlnAla: 3.336 ± 0.494
0.061GlnCys: 0.061 ± 0.06
1.334GlnAsp: 1.334 ± 0.408
1.88GlnGlu: 1.88 ± 0.281
1.213GlnPhe: 1.213 ± 0.243
2.366GlnGly: 2.366 ± 0.325
0.667GlnHis: 0.667 ± 0.225
2.669GlnIle: 2.669 ± 0.536
0.91GlnLys: 0.91 ± 0.236
3.579GlnLeu: 3.579 ± 0.473
1.092GlnMet: 1.092 ± 0.283
0.607GlnAsn: 0.607 ± 0.224
2.062GlnPro: 2.062 ± 0.367
1.516GlnGln: 1.516 ± 0.43
2.002GlnArg: 2.002 ± 0.31
1.274GlnSer: 1.274 ± 0.259
1.577GlnThr: 1.577 ± 0.28
2.608GlnVal: 2.608 ± 0.358
0.607GlnTrp: 0.607 ± 0.158
0.485GlnTyr: 0.485 ± 0.158
0.0GlnXaa: 0.0 ± 0.0
Arg
6.369ArgAla: 6.369 ± 0.771
0.728ArgCys: 0.728 ± 0.209
3.275ArgAsp: 3.275 ± 0.343
4.913ArgGlu: 4.913 ± 0.583
2.123ArgPhe: 2.123 ± 0.464
4.488ArgGly: 4.488 ± 0.526
1.092ArgHis: 1.092 ± 0.32
3.033ArgIle: 3.033 ± 0.424
3.215ArgLys: 3.215 ± 0.51
5.883ArgLeu: 5.883 ± 0.636
1.941ArgMet: 1.941 ± 0.339
2.244ArgAsn: 2.244 ± 0.441
2.608ArgPro: 2.608 ± 0.446
1.941ArgGln: 1.941 ± 0.352
5.034ArgArg: 5.034 ± 0.668
4.003ArgSer: 4.003 ± 0.485
2.851ArgThr: 2.851 ± 0.417
4.913ArgVal: 4.913 ± 0.508
1.213ArgTrp: 1.213 ± 0.268
1.88ArgTyr: 1.88 ± 0.36
0.0ArgXaa: 0.0 ± 0.0
Ser
6.308SerAla: 6.308 ± 0.81
0.485SerCys: 0.485 ± 0.199
3.7SerAsp: 3.7 ± 0.386
3.7SerGlu: 3.7 ± 0.444
1.698SerPhe: 1.698 ± 0.347
6.49SerGly: 6.49 ± 1.04
1.577SerHis: 1.577 ± 0.29
3.093SerIle: 3.093 ± 0.469
2.366SerLys: 2.366 ± 0.407
5.216SerLeu: 5.216 ± 0.6
1.577SerMet: 1.577 ± 0.35
2.123SerAsn: 2.123 ± 0.407
2.851SerPro: 2.851 ± 0.423
1.698SerGln: 1.698 ± 0.287
2.972SerArg: 2.972 ± 0.372
3.336SerSer: 3.336 ± 0.739
3.518SerThr: 3.518 ± 0.542
3.821SerVal: 3.821 ± 0.436
1.092SerTrp: 1.092 ± 0.284
1.031SerTyr: 1.031 ± 0.286
0.0SerXaa: 0.0 ± 0.0
Thr
6.308ThrAla: 6.308 ± 0.747
0.182ThrCys: 0.182 ± 0.106
4.428ThrAsp: 4.428 ± 0.644
4.488ThrGlu: 4.488 ± 0.482
2.305ThrPhe: 2.305 ± 0.379
6.005ThrGly: 6.005 ± 0.725
0.849ThrHis: 0.849 ± 0.227
3.093ThrIle: 3.093 ± 0.541
2.426ThrLys: 2.426 ± 0.344
5.701ThrLeu: 5.701 ± 0.573
0.849ThrMet: 0.849 ± 0.238
2.002ThrAsn: 2.002 ± 0.358
3.639ThrPro: 3.639 ± 0.483
2.062ThrGln: 2.062 ± 0.373
3.336ThrArg: 3.336 ± 0.482
3.639ThrSer: 3.639 ± 0.501
4.367ThrThr: 4.367 ± 0.573
5.398ThrVal: 5.398 ± 0.674
1.152ThrTrp: 1.152 ± 0.252
1.88ThrTyr: 1.88 ± 0.352
0.0ThrXaa: 0.0 ± 0.0
Val
7.097ValAla: 7.097 ± 0.719
0.425ValCys: 0.425 ± 0.18
4.974ValAsp: 4.974 ± 0.587
5.641ValGlu: 5.641 ± 0.549
2.244ValPhe: 2.244 ± 0.334
4.974ValGly: 4.974 ± 0.793
1.334ValHis: 1.334 ± 0.254
3.397ValIle: 3.397 ± 0.504
3.154ValLys: 3.154 ± 0.448
5.398ValLeu: 5.398 ± 0.634
1.152ValMet: 1.152 ± 0.333
2.184ValAsn: 2.184 ± 0.365
4.306ValPro: 4.306 ± 0.465
2.062ValGln: 2.062 ± 0.393
4.852ValArg: 4.852 ± 0.672
4.246ValSer: 4.246 ± 0.412
5.459ValThr: 5.459 ± 0.522
5.034ValVal: 5.034 ± 0.636
1.334ValTrp: 1.334 ± 0.32
1.941ValTyr: 1.941 ± 0.35
0.0ValXaa: 0.0 ± 0.0
Trp
1.395TrpAla: 1.395 ± 0.338
0.243TrpCys: 0.243 ± 0.105
1.516TrpAsp: 1.516 ± 0.29
1.031TrpGlu: 1.031 ± 0.21
0.91TrpPhe: 0.91 ± 0.237
1.759TrpGly: 1.759 ± 0.352
0.364TrpHis: 0.364 ± 0.158
1.274TrpIle: 1.274 ± 0.204
0.243TrpLys: 0.243 ± 0.128
2.062TrpLeu: 2.062 ± 0.322
0.364TrpMet: 0.364 ± 0.145
0.425TrpAsn: 0.425 ± 0.145
0.849TrpPro: 0.849 ± 0.246
0.789TrpGln: 0.789 ± 0.19
1.395TrpArg: 1.395 ± 0.345
1.031TrpSer: 1.031 ± 0.235
1.88TrpThr: 1.88 ± 0.372
1.88TrpVal: 1.88 ± 0.293
0.607TrpTrp: 0.607 ± 0.213
0.364TrpTyr: 0.364 ± 0.136
0.0TrpXaa: 0.0 ± 0.0
Tyr
2.184TyrAla: 2.184 ± 0.318
0.182TyrCys: 0.182 ± 0.13
1.274TyrAsp: 1.274 ± 0.29
2.426TyrGlu: 2.426 ± 0.36
0.607TyrPhe: 0.607 ± 0.155
2.487TyrGly: 2.487 ± 0.384
0.728TyrHis: 0.728 ± 0.241
1.516TyrIle: 1.516 ± 0.354
1.395TyrLys: 1.395 ± 0.237
2.487TyrLeu: 2.487 ± 0.407
0.667TyrMet: 0.667 ± 0.163
1.092TyrAsn: 1.092 ± 0.265
1.395TyrPro: 1.395 ± 0.316
1.031TyrGln: 1.031 ± 0.25
2.729TyrArg: 2.729 ± 0.409
1.152TyrSer: 1.152 ± 0.221
2.062TyrThr: 2.062 ± 0.345
2.062TyrVal: 2.062 ± 0.351
0.546TyrTrp: 0.546 ± 0.165
0.546TyrTyr: 0.546 ± 0.198
0.0TyrXaa: 0.0 ± 0.0
Xaa
0.0XaaAla: 0.0 ± 0.0
0.0XaaCys: 0.0 ± 0.0
0.0XaaAsp: 0.0 ± 0.0
0.0XaaGlu: 0.0 ± 0.0
0.0XaaPhe: 0.0 ± 0.0
0.0XaaGly: 0.0 ± 0.0
0.0XaaHis: 0.0 ± 0.0
0.0XaaIle: 0.0 ± 0.0
0.0XaaLys: 0.0 ± 0.0
0.0XaaLeu: 0.0 ± 0.0
0.0XaaMet: 0.0 ± 0.0
0.0XaaAsn: 0.0 ± 0.0
0.0XaaPro: 0.0 ± 0.0
0.0XaaGln: 0.0 ± 0.0
0.0XaaArg: 0.0 ± 0.0
0.0XaaSer: 0.0 ± 0.0
0.0XaaThr: 0.0 ± 0.0
0.0XaaVal: 0.0 ± 0.0
0.0XaaTrp: 0.0 ± 0.0
0.0XaaTyr: 0.0 ± 0.0
0.0XaaXaa: 0.0 ± 0.0
Statistics based on 91 proteins (16488 amino acids)

Note: The error was estimated by bootstrapping (×100) at the protein level.
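A protein-level bootstrap of this kind can be sketched as below: whole proteins are resampled with replacement, the frequency is recomputed for each resample, and the spread of the 100 estimates is taken as the error. This is an illustration under assumptions, not the page's actual procedure: the use of the sample standard deviation as the error, the fixed seed, the function names, and the toy sequences are all hypothetical.

```python
import random
import statistics
from collections import Counter

def pair_frequency(proteins, pair):
    """Per-mille frequency of one dipeptide (one-letter codes) in a protein set."""
    counts = Counter()
    for seq in proteins:
        counts.update(seq[i:i + 2] for i in range(len(seq) - 1))
    total = sum(counts.values())
    return 1000.0 * counts[pair] / total if total else 0.0

def bootstrap_error(proteins, pair, n_resamples=100, seed=0):
    """Standard deviation of the frequency over protein-level resamples."""
    rng = random.Random(seed)  # seeded only to make the example reproducible
    estimates = []
    for _ in range(n_resamples):
        # Resample whole proteins with replacement, keeping the set size fixed.
        sample = rng.choices(proteins, k=len(proteins))
        estimates.append(pair_frequency(sample, pair))
    return statistics.stdev(estimates)

# Toy input (hypothetical sequences, not taken from the Big3 proteome).
toy_proteins = ["MAAG", "AAL", "MKV"]
err = bootstrap_error(toy_proteins, "AA")
```

Resampling proteins rather than individual residues keeps each sequence intact, so dipeptides that cluster in a few proteins show up as a larger error.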

The dipeptide statistics above (among other statistics for this proteome) can be downloaded from this CSV file
See this proteome in: uniprot_link
Proteome-pI is available under the Creative Commons Attribution-NoDerivs license; for more details, see here

Reference: Kozlowski LP. Proteome-pI 2.0: Proteome Isoelectric Point Database Update. Nucleic Acids Res. 2021, doi: 10.1093/nar/gkab944
Contact: Lukasz P. Kozlowski