Amino acid dipeptide frequency for Gordonia phage Beyoncage

Apart from single amino acid frequencies, one can also calculate the so-called amino acid dipeptide frequency.

There are 400 possibilities (441 if we consider Xaa as an additional 21st amino acid representing all non-standard or unknown amino acids). Thus, if dipeptides occurred randomly in proteins, each of the 400 dipeptides would be present at a frequency of 0.25% (2.5‰). As this is not the case in nature, for better visibility all overrepresented dipeptides are marked in red, and those that are underrepresented are marked in blue in the table.
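As an illustration of how such frequencies can be obtained, here is a minimal Python sketch. It assumes that dipeptides are counted as overlapping pairs within each protein (never across protein boundaries) and that any non-standard residue is mapped to X (Xaa); the page does not state the exact procedure, so these choices, the function names, and the toy sequences are hypothetical.

```python
from collections import Counter

# The 20 standard amino acids in one-letter code (the table uses three-letter codes).
STANDARD = set("ACDEFGHIKLMNPQRSTVWY")


def dipeptide_counts(proteins):
    """Count overlapping dipeptides within each protein sequence.

    Assumption: residues outside the 20 standard ones are mapped to 'X' (Xaa).
    """
    counts = Counter()
    for seq in proteins:
        seq = "".join(aa if aa in STANDARD else "X" for aa in seq.upper())
        for i in range(len(seq) - 1):
            counts[seq[i:i + 2]] += 1
    return counts


def dipeptide_permille(proteins):
    """Return dipeptide frequencies in per mille of all counted dipeptides."""
    counts = dipeptide_counts(proteins)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {dp: 1000.0 * n / total for dp, n in counts.items()}


# Frequency expected if all 400 standard dipeptides were equally likely.
UNIFORM_PERMILLE = 1000.0 / 400  # 2.5 per mille, i.e. 0.25%

# Toy input (not real phage proteins): 'B' is non-standard and becomes 'X'.
toy = ["MAAK", "AAB"]
freqs = dipeptide_permille(toy)
# The toy set yields 5 dipeptides in total: MA, AA, AK, AA, AX.
```

With real data, `proteins` would be the list of protein sequences of the proteome, for example read from a FASTA file.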

All values are given in per mille (‰) and therefore need to be multiplied by 10⁻³ to obtain fractions.
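The unit conversion and the comparison with the uniform expectation can be sketched in Python as follows. The three example values are taken from the table below; the simple threshold rule is an assumption, since the page does not say exactly how the red and blue markings are decided (for instance, whether the error estimate is taken into account).

```python
# Uniform expectation over the 400 standard dipeptides, in per mille.
EXPECTED_PERMILLE = 1000.0 / 400  # 2.5


def permille_to_fraction(value):
    """Convert a per mille value to a plain fraction (multiply by 10^-3)."""
    return value * 1e-3


def classify(value, expected=EXPECTED_PERMILLE):
    """Label a per mille value relative to the uniform expectation.

    Assumption: a plain greater-than / less-than comparison.
    """
    if value > expected:
        return "over"
    if value < expected:
        return "under"
    return "expected"


# Values copied from the table (per mille).
examples = {"AlaAla": 9.734, "CysCys": 0.047, "AspIle": 2.552}
labels = {name: classify(v) for name, v in examples.items()}
```

Under this rule, AlaAla and AspIle fall above 2.5‰ and CysCys falls below it.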

For more information, see the sequence space article on Wikipedia.

Ala Cys Asp Glu Phe Gly His Ile Lys Leu Met Asn Pro Gln Arg Ser Thr Val Trp Tyr Xaa
Ala
9.734AlaAla: 9.734 ± 1.372
0.614AlaCys: 0.614 ± 0.173
5.245AlaAsp: 5.245 ± 0.555
7.277AlaGlu: 7.277 ± 0.81
3.024AlaPhe: 3.024 ± 0.678
7.041AlaGly: 7.041 ± 0.786
1.323AlaHis: 1.323 ± 0.207
5.198AlaIle: 5.198 ± 0.592
5.387AlaLys: 5.387 ± 0.499
7.371AlaLeu: 7.371 ± 0.618
3.166AlaMet: 3.166 ± 0.483
3.78AlaAsn: 3.78 ± 0.501
3.827AlaPro: 3.827 ± 0.449
4.3AlaGln: 4.3 ± 0.596
4.914AlaArg: 4.914 ± 0.544
5.812AlaSer: 5.812 ± 0.736
4.583AlaThr: 4.583 ± 0.525
5.907AlaVal: 5.907 ± 0.761
1.654AlaTrp: 1.654 ± 0.315
2.174AlaTyr: 2.174 ± 0.35
0.0AlaXaa: 0.0 ± 0.0
Cys
0.425CysAla: 0.425 ± 0.15
0.047CysCys: 0.047 ± 0.05
0.756CysAsp: 0.756 ± 0.221
0.52CysGlu: 0.52 ± 0.177
0.189CysPhe: 0.189 ± 0.094
0.898CysGly: 0.898 ± 0.241
0.142CysHis: 0.142 ± 0.073
0.236CysIle: 0.236 ± 0.107
0.425CysLys: 0.425 ± 0.142
0.473CysLeu: 0.473 ± 0.155
0.142CysMet: 0.142 ± 0.08
0.425CysAsn: 0.425 ± 0.15
0.425CysPro: 0.425 ± 0.161
0.236CysGln: 0.236 ± 0.114
0.189CysArg: 0.189 ± 0.097
0.378CysSer: 0.378 ± 0.132
0.851CysThr: 0.851 ± 0.183
0.425CysVal: 0.425 ± 0.142
0.236CysTrp: 0.236 ± 0.11
0.331CysTyr: 0.331 ± 0.116
0.0CysXaa: 0.0 ± 0.0
Asp
5.765AspAla: 5.765 ± 0.495
0.331AspCys: 0.331 ± 0.123
5.434AspAsp: 5.434 ± 1.111
6.615AspGlu: 6.615 ± 0.652
2.693AspPhe: 2.693 ± 0.464
4.064AspGly: 4.064 ± 0.534
1.559AspHis: 1.559 ± 0.258
2.552AspIle: 2.552 ± 0.316
3.26AspLys: 3.26 ± 0.438
5.481AspLeu: 5.481 ± 0.537
1.89AspMet: 1.89 ± 0.311
2.221AspAsn: 2.221 ± 0.375
4.442AspPro: 4.442 ± 0.48
2.41AspGln: 2.41 ± 0.337
3.827AspArg: 3.827 ± 0.526
3.071AspSer: 3.071 ± 0.382
3.26AspThr: 3.26 ± 0.56
4.725AspVal: 4.725 ± 0.454
1.229AspTrp: 1.229 ± 0.33
1.843AspTyr: 1.843 ± 0.327
0.0AspXaa: 0.0 ± 0.0
Glu
8.222GluAla: 8.222 ± 0.712
0.803GluCys: 0.803 ± 0.216
6.001GluAsp: 6.001 ± 0.849
6.568GluGlu: 6.568 ± 0.89
2.693GluPhe: 2.693 ± 0.395
6.379GluGly: 6.379 ± 0.581
1.607GluHis: 1.607 ± 0.374
3.686GluIle: 3.686 ± 0.414
4.394GluLys: 4.394 ± 0.512
5.529GluLeu: 5.529 ± 0.436
2.599GluMet: 2.599 ± 0.301
2.552GluAsn: 2.552 ± 0.28
2.315GluPro: 2.315 ± 0.321
2.741GluGln: 2.741 ± 0.37
4.3GluArg: 4.3 ± 0.582
2.835GluSer: 2.835 ± 0.33
2.457GluThr: 2.457 ± 0.356
4.205GluVal: 4.205 ± 0.429
1.89GluTrp: 1.89 ± 0.265
2.882GluTyr: 2.882 ± 0.416
0.0GluXaa: 0.0 ± 0.0
Phe
3.213PheAla: 3.213 ± 0.524
0.331PheCys: 0.331 ± 0.123
2.079PheAsp: 2.079 ± 0.25
2.079PheGlu: 2.079 ± 0.348
1.087PhePhe: 1.087 ± 0.213
3.308PheGly: 3.308 ± 0.39
0.614PheHis: 0.614 ± 0.172
1.465PheIle: 1.465 ± 0.314
1.701PheLys: 1.701 ± 0.282
2.646PheLeu: 2.646 ± 0.375
0.851PheMet: 0.851 ± 0.198
1.134PheAsn: 1.134 ± 0.26
1.89PhePro: 1.89 ± 0.303
1.701PheGln: 1.701 ± 0.309
2.363PheArg: 2.363 ± 0.316
1.985PheSer: 1.985 ± 0.259
1.89PheThr: 1.89 ± 0.283
2.504PheVal: 2.504 ± 0.37
0.425PheTrp: 0.425 ± 0.122
0.992PheTyr: 0.992 ± 0.208
0.0PheXaa: 0.0 ± 0.0
Gly
6.19GlyAla: 6.19 ± 0.981
0.756GlyCys: 0.756 ± 0.219
6.048GlyAsp: 6.048 ± 1.128
5.859GlyGlu: 5.859 ± 0.714
3.119GlyPhe: 3.119 ± 0.428
7.182GlyGly: 7.182 ± 0.91
1.843GlyHis: 1.843 ± 0.355
5.245GlyIle: 5.245 ± 0.75
5.292GlyLys: 5.292 ± 0.692
6.332GlyLeu: 6.332 ± 0.742
2.646GlyMet: 2.646 ± 0.372
3.071GlyAsn: 3.071 ± 0.304
3.686GlyPro: 3.686 ± 0.57
3.875GlyGln: 3.875 ± 0.654
4.394GlyArg: 4.394 ± 0.459
5.34GlySer: 5.34 ± 0.451
5.245GlyThr: 5.245 ± 0.606
5.765GlyVal: 5.765 ± 0.572
1.748GlyTrp: 1.748 ± 0.247
2.41GlyTyr: 2.41 ± 0.461
0.0GlyXaa: 0.0 ± 0.0
His
1.937HisAla: 1.937 ± 0.357
0.095HisCys: 0.095 ± 0.066
1.134HisAsp: 1.134 ± 0.23
1.323HisGlu: 1.323 ± 0.252
0.945HisPhe: 0.945 ± 0.223
1.559HisGly: 1.559 ± 0.285
0.567HisHis: 0.567 ± 0.164
0.851HisIle: 0.851 ± 0.16
0.756HisLys: 0.756 ± 0.209
1.937HisLeu: 1.937 ± 0.265
0.425HisMet: 0.425 ± 0.131
0.425HisAsn: 0.425 ± 0.165
1.181HisPro: 1.181 ± 0.26
0.52HisGln: 0.52 ± 0.174
1.465HisArg: 1.465 ± 0.321
1.134HisSer: 1.134 ± 0.228
1.323HisThr: 1.323 ± 0.317
1.181HisVal: 1.181 ± 0.223
0.52HisTrp: 0.52 ± 0.161
0.898HisTyr: 0.898 ± 0.248
0.0HisXaa: 0.0 ± 0.0
Ile
5.859IleAla: 5.859 ± 0.8
0.567IleCys: 0.567 ± 0.175
3.497IleAsp: 3.497 ± 0.433
2.93IleGlu: 2.93 ± 0.441
1.654IlePhe: 1.654 ± 0.279
4.064IleGly: 4.064 ± 0.553
1.087IleHis: 1.087 ± 0.189
1.796IleIle: 1.796 ± 0.283
2.457IleLys: 2.457 ± 0.289
3.733IleLeu: 3.733 ± 0.521
1.087IleMet: 1.087 ± 0.215
1.843IleAsn: 1.843 ± 0.274
2.504IlePro: 2.504 ± 0.415
2.552IleGln: 2.552 ± 0.455
3.449IleArg: 3.449 ± 0.289
2.693IleSer: 2.693 ± 0.403
2.268IleThr: 2.268 ± 0.341
2.977IleVal: 2.977 ± 0.498
0.614IleTrp: 0.614 ± 0.221
1.276IleTyr: 1.276 ± 0.233
0.0IleXaa: 0.0 ± 0.0
Lys
6.001LysAla: 6.001 ± 0.695
0.425LysCys: 0.425 ± 0.112
3.355LysAsp: 3.355 ± 0.498
3.922LysGlu: 3.922 ± 0.479
1.843LysPhe: 1.843 ± 0.314
4.3LysGly: 4.3 ± 0.717
0.803LysHis: 0.803 ± 0.236
3.449LysIle: 3.449 ± 0.359
3.308LysLys: 3.308 ± 0.543
4.3LysLeu: 4.3 ± 0.482
1.418LysMet: 1.418 ± 0.241
2.646LysAsn: 2.646 ± 0.39
2.599LysPro: 2.599 ± 0.38
2.174LysGln: 2.174 ± 0.269
4.82LysArg: 4.82 ± 0.768
2.363LysSer: 2.363 ± 0.398
2.552LysThr: 2.552 ± 0.299
3.071LysVal: 3.071 ± 0.389
1.181LysTrp: 1.181 ± 0.247
1.465LysTyr: 1.465 ± 0.261
0.0LysXaa: 0.0 ± 0.0
Leu
6.71LeuAla: 6.71 ± 0.564
0.756LeuCys: 0.756 ± 0.212
4.583LeuAsp: 4.583 ± 0.494
5.434LeuGlu: 5.434 ± 0.444
2.126LeuPhe: 2.126 ± 0.287
6.615LeuGly: 6.615 ± 0.621
1.465LeuHis: 1.465 ± 0.323
3.308LeuIle: 3.308 ± 0.382
4.347LeuLys: 4.347 ± 0.473
4.583LeuLeu: 4.583 ± 0.422
1.985LeuMet: 1.985 ± 0.299
2.835LeuAsn: 2.835 ± 0.294
3.638LeuPro: 3.638 ± 0.405
2.599LeuGln: 2.599 ± 0.376
5.765LeuArg: 5.765 ± 0.519
4.253LeuSer: 4.253 ± 0.396
3.686LeuThr: 3.686 ± 0.452
5.529LeuVal: 5.529 ± 0.568
1.087LeuTrp: 1.087 ± 0.2
1.748LeuTyr: 1.748 ± 0.284
0.0LeuXaa: 0.0 ± 0.0
Met
2.693MetAla: 2.693 ± 0.346
0.378MetCys: 0.378 ± 0.13
1.04MetAsp: 1.04 ± 0.282
1.654MetGlu: 1.654 ± 0.28
0.898MetPhe: 0.898 ± 0.199
1.937MetGly: 1.937 ± 0.32
0.52MetHis: 0.52 ± 0.166
1.418MetIle: 1.418 ± 0.286
1.843MetLys: 1.843 ± 0.309
1.607MetLeu: 1.607 ± 0.224
0.425MetMet: 0.425 ± 0.138
0.709MetAsn: 0.709 ± 0.181
0.851MetPro: 0.851 ± 0.202
1.229MetGln: 1.229 ± 0.232
1.465MetArg: 1.465 ± 0.267
2.41MetSer: 2.41 ± 0.37
1.937MetThr: 1.937 ± 0.318
1.796MetVal: 1.796 ± 0.286
0.567MetTrp: 0.567 ± 0.154
0.803MetTyr: 0.803 ± 0.188
0.0MetXaa: 0.0 ± 0.0
Asn
3.875AsnAla: 3.875 ± 0.568
0.189AsnCys: 0.189 ± 0.101
1.559AsnAsp: 1.559 ± 0.223
2.41AsnGlu: 2.41 ± 0.355
1.181AsnPhe: 1.181 ± 0.248
3.638AsnGly: 3.638 ± 0.416
0.425AsnHis: 0.425 ± 0.162
1.418AsnIle: 1.418 ± 0.278
1.843AsnLys: 1.843 ± 0.291
2.741AsnLeu: 2.741 ± 0.357
0.851AsnMet: 0.851 ± 0.219
1.229AsnAsn: 1.229 ± 0.252
2.599AsnPro: 2.599 ± 0.395
1.465AsnGln: 1.465 ± 0.252
3.071AsnArg: 3.071 ± 0.375
2.174AsnSer: 2.174 ± 0.297
1.937AsnThr: 1.937 ± 0.322
2.504AsnVal: 2.504 ± 0.3
0.803AsnTrp: 0.803 ± 0.158
0.756AsnTyr: 0.756 ± 0.235
0.0AsnXaa: 0.0 ± 0.0
Pro
4.347ProAla: 4.347 ± 0.566
0.189ProCys: 0.189 ± 0.092
3.591ProAsp: 3.591 ± 0.459
4.347ProGlu: 4.347 ± 0.548
1.323ProPhe: 1.323 ± 0.28
4.725ProGly: 4.725 ± 0.479
1.323ProHis: 1.323 ± 0.303
2.552ProIle: 2.552 ± 0.387
2.788ProLys: 2.788 ± 0.501
2.599ProLeu: 2.599 ± 0.375
0.945ProMet: 0.945 ± 0.21
1.748ProAsn: 1.748 ± 0.29
2.552ProPro: 2.552 ± 0.453
2.126ProGln: 2.126 ± 0.343
1.89ProArg: 1.89 ± 0.381
2.693ProSer: 2.693 ± 0.489
3.166ProThr: 3.166 ± 0.478
2.788ProVal: 2.788 ± 0.392
0.851ProTrp: 0.851 ± 0.246
1.512ProTyr: 1.512 ± 0.347
0.0ProXaa: 0.0 ± 0.0
Gln
3.969GlnAla: 3.969 ± 0.55
0.095GlnCys: 0.095 ± 0.06
2.457GlnAsp: 2.457 ± 0.475
3.071GlnGlu: 3.071 ± 0.412
1.843GlnPhe: 1.843 ± 0.314
5.103GlnGly: 5.103 ± 1.325
0.567GlnHis: 0.567 ± 0.151
1.985GlnIle: 1.985 ± 0.279
1.985GlnLys: 1.985 ± 0.371
3.497GlnLeu: 3.497 ± 0.441
1.087GlnMet: 1.087 ± 0.216
1.465GlnAsn: 1.465 ± 0.265
1.418GlnPro: 1.418 ± 0.254
2.079GlnGln: 2.079 ± 0.389
2.741GlnArg: 2.741 ± 0.459
1.512GlnSer: 1.512 ± 0.311
2.504GlnThr: 2.504 ± 0.356
2.646GlnVal: 2.646 ± 0.31
0.662GlnTrp: 0.662 ± 0.206
2.032GlnTyr: 2.032 ± 0.352
0.0GlnXaa: 0.0 ± 0.0
Arg
4.583ArgAla: 4.583 ± 0.5
0.425ArgCys: 0.425 ± 0.128
4.394ArgAsp: 4.394 ± 0.482
4.725ArgGlu: 4.725 ± 0.472
2.268ArgPhe: 2.268 ± 0.352
4.631ArgGly: 4.631 ± 0.532
1.229ArgHis: 1.229 ± 0.3
2.788ArgIle: 2.788 ± 0.324
5.009ArgLys: 5.009 ± 0.611
4.3ArgLeu: 4.3 ± 0.435
1.748ArgMet: 1.748 ± 0.253
2.41ArgAsn: 2.41 ± 0.363
2.221ArgPro: 2.221 ± 0.369
3.449ArgGln: 3.449 ± 0.463
4.205ArgArg: 4.205 ± 0.714
3.449ArgSer: 3.449 ± 0.502
3.213ArgThr: 3.213 ± 0.315
4.725ArgVal: 4.725 ± 0.533
1.134ArgTrp: 1.134 ± 0.237
1.701ArgTyr: 1.701 ± 0.302
0.0ArgXaa: 0.0 ± 0.0
Ser
4.489SerAla: 4.489 ± 0.52
0.614SerCys: 0.614 ± 0.218
3.827SerAsp: 3.827 ± 0.47
3.827SerGlu: 3.827 ± 0.389
1.796SerPhe: 1.796 ± 0.26
5.812SerGly: 5.812 ± 0.542
1.181SerHis: 1.181 ± 0.28
2.741SerIle: 2.741 ± 0.429
2.552SerLys: 2.552 ± 0.378
3.827SerLeu: 3.827 ± 0.389
1.37SerMet: 1.37 ± 0.293
1.937SerAsn: 1.937 ± 0.318
1.843SerPro: 1.843 ± 0.29
2.032SerGln: 2.032 ± 0.327
3.733SerArg: 3.733 ± 0.463
3.308SerSer: 3.308 ± 0.562
3.119SerThr: 3.119 ± 0.388
4.016SerVal: 4.016 ± 0.452
1.04SerTrp: 1.04 ± 0.224
1.843SerTyr: 1.843 ± 0.316
0.0SerXaa: 0.0 ± 0.0
Thr
4.489ThrAla: 4.489 ± 0.458
0.331ThrCys: 0.331 ± 0.108
2.835ThrAsp: 2.835 ± 0.443
3.497ThrGlu: 3.497 ± 0.425
2.268ThrPhe: 2.268 ± 0.291
5.623ThrGly: 5.623 ± 0.629
1.04ThrHis: 1.04 ± 0.254
2.977ThrIle: 2.977 ± 0.379
2.882ThrLys: 2.882 ± 0.31
3.875ThrLeu: 3.875 ± 0.393
0.945ThrMet: 0.945 ± 0.192
1.748ThrAsn: 1.748 ± 0.324
3.733ThrPro: 3.733 ± 0.423
1.796ThrGln: 1.796 ± 0.298
2.693ThrArg: 2.693 ± 0.354
3.355ThrSer: 3.355 ± 0.452
3.166ThrThr: 3.166 ± 0.347
4.158ThrVal: 4.158 ± 0.41
0.945ThrTrp: 0.945 ± 0.233
1.418ThrTyr: 1.418 ± 0.31
0.0ThrXaa: 0.0 ± 0.0
Val
5.954ValAla: 5.954 ± 0.724
0.378ValCys: 0.378 ± 0.139
4.205ValAsp: 4.205 ± 0.345
5.056ValGlu: 5.056 ± 0.388
1.985ValPhe: 1.985 ± 0.347
5.103ValGly: 5.103 ± 0.445
1.418ValHis: 1.418 ± 0.283
3.591ValIle: 3.591 ± 0.39
3.922ValLys: 3.922 ± 0.479
4.725ValLeu: 4.725 ± 0.458
1.748ValMet: 1.748 ± 0.256
2.315ValAsn: 2.315 ± 0.352
3.686ValPro: 3.686 ± 0.486
3.402ValGln: 3.402 ± 0.351
4.064ValArg: 4.064 ± 0.419
3.733ValSer: 3.733 ± 0.431
3.969ValThr: 3.969 ± 0.444
4.867ValVal: 4.867 ± 0.515
1.087ValTrp: 1.087 ± 0.214
2.032ValTyr: 2.032 ± 0.321
0.0ValXaa: 0.0 ± 0.0
Trp
1.748TrpAla: 1.748 ± 0.318
0.095TrpCys: 0.095 ± 0.073
1.748TrpAsp: 1.748 ± 0.283
1.276TrpGlu: 1.276 ± 0.231
0.614TrpPhe: 0.614 ± 0.182
1.087TrpGly: 1.087 ± 0.273
0.52TrpHis: 0.52 ± 0.15
0.803TrpIle: 0.803 ± 0.179
0.898TrpLys: 0.898 ± 0.217
1.229TrpLeu: 1.229 ± 0.265
0.425TrpMet: 0.425 ± 0.131
0.992TrpAsn: 0.992 ± 0.207
1.087TrpPro: 1.087 ± 0.346
0.709TrpGln: 0.709 ± 0.233
1.134TrpArg: 1.134 ± 0.257
0.992TrpSer: 0.992 ± 0.214
1.229TrpThr: 1.229 ± 0.184
1.181TrpVal: 1.181 ± 0.263
0.284TrpTrp: 0.284 ± 0.106
0.567TrpTyr: 0.567 ± 0.17
0.0TrpXaa: 0.0 ± 0.0
Tyr
2.126TyrAla: 2.126 ± 0.431
0.331TyrCys: 0.331 ± 0.143
2.882TyrAsp: 2.882 ± 0.409
2.268TyrGlu: 2.268 ± 0.391
0.709TyrPhe: 0.709 ± 0.175
2.977TyrGly: 2.977 ± 0.43
0.945TyrHis: 0.945 ± 0.244
0.851TyrIle: 0.851 ± 0.21
0.945TyrLys: 0.945 ± 0.263
2.268TyrLeu: 2.268 ± 0.322
0.378TyrMet: 0.378 ± 0.144
1.181TyrAsn: 1.181 ± 0.275
1.559TyrPro: 1.559 ± 0.386
1.323TyrGln: 1.323 ± 0.245
2.126TyrArg: 2.126 ± 0.419
1.418TyrSer: 1.418 ± 0.251
1.465TyrThr: 1.465 ± 0.303
2.268TyrVal: 2.268 ± 0.37
0.662TyrTrp: 0.662 ± 0.188
0.709TyrTyr: 0.709 ± 0.169
0.0TyrXaa: 0.0 ± 0.0
Xaa
0.0XaaAla: 0.0 ± 0.0
0.0XaaCys: 0.0 ± 0.0
0.0XaaAsp: 0.0 ± 0.0
0.0XaaGlu: 0.0 ± 0.0
0.0XaaPhe: 0.0 ± 0.0
0.0XaaGly: 0.0 ± 0.0
0.0XaaHis: 0.0 ± 0.0
0.0XaaIle: 0.0 ± 0.0
0.0XaaLys: 0.0 ± 0.0
0.0XaaLeu: 0.0 ± 0.0
0.0XaaMet: 0.0 ± 0.0
0.0XaaAsn: 0.0 ± 0.0
0.0XaaPro: 0.0 ± 0.0
0.0XaaGln: 0.0 ± 0.0
0.0XaaArg: 0.0 ± 0.0
0.0XaaSer: 0.0 ± 0.0
0.0XaaThr: 0.0 ± 0.0
0.0XaaVal: 0.0 ± 0.0
0.0XaaTrp: 0.0 ± 0.0
0.0XaaTyr: 0.0 ± 0.0
0.0XaaXaa: 0.0 ± 0.0
Statistics based on 96 proteins (21164 amino acids)

Note: The error was estimated by bootstrapping (100 replicates) at the protein level.

The dipeptide statistics above (among other statistics for this proteome) can be downloaded from this CSV file
See this proteome in: uniprot_link
Proteome-pI is available under Creative Commons Attribution-NoDerivs license, for more details see here

Reference: Kozlowski LP. Proteome-pI 2.0: Proteome Isoelectric Point Database Update. Nucleic Acids Res. 2021, doi: 10.1093/nar/gkab944
Contact: Lukasz P. Kozlowski