Amino acid dipeptide frequency for Marinomonas phage CPG1g

Apart from single amino acid frequencies, one can also calculate the so-called amino acid dipeptide frequency.

There are 400 possibilities (441 if we consider Xaa as an additional 21st amino acid representing all non-standard or unknown amino acids). Thus, if dipeptides occurred randomly in proteins, each of the 400 amino acid dipeptides would be present with a frequency of 0.25% (2.5‰). As this is not the case in nature, for better visibility all overrepresented dipeptides are marked in red, and those that are underrepresented are marked in blue in the table.

All values are presented in per mille (‰) and therefore need to be multiplied by 10⁻³ to obtain fractions (for example, 7.739‰ = 0.007739).
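The counting described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the Proteome-pI implementation: the function name `dipeptide_permille`, the use of one-letter residue codes, counting overlapping pairs within each protein, and normalising by the total number of pairs are all assumptions made here, and the toy sequences are made up.

```python
from collections import Counter

def dipeptide_permille(proteins):
    """Return {dipeptide: frequency in per mille} (hypothetical helper).

    Assumption: overlapping residue pairs are counted within each protein
    and normalised by the total number of pairs.
    """
    counts = Counter()
    for seq in proteins:
        # Overlapping windows of length 2; a pair never spans two proteins.
        counts.update(seq[i:i + 2] for i in range(len(seq) - 1))
    total = sum(counts.values())
    if total == 0:
        return {}
    return {pair: 1000.0 * n / total for pair, n in counts.items()}

# Toy input (made-up sequences, not taken from the CPG1g proteome).
toy = ["MKAAL", "AAG"]
freqs = dipeptide_permille(toy)

# Uniform expectation: 1/400 = 0.25% = 2.5 per mille per dipeptide.
uniform = 1000.0 / 400
overrepresented = sorted(p for p, f in freqs.items() if f > uniform)
underrepresented = sorted(p for p, f in freqs.items() if f < uniform)
```

Comparing each value against the uniform expectation of 2.5‰ is one simple way to decide which dipeptides count as over- or underrepresented; whether the table colouring uses exactly this threshold is an assumption.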

For more information, see the sequence space article on Wikipedia.

Ala Cys Asp Glu Phe Gly His Ile Lys Leu Met Asn Pro Gln Arg Ser Thr Val Trp Tyr Xaa
Ala
AlaAla: 7.739 ± 1.003
AlaCys: 1.075 ± 0.324
AlaAsp: 5.517 ± 0.594
AlaGlu: 4.586 ± 0.521
AlaPhe: 2.723 ± 0.482
AlaGly: 6.162 ± 0.568
AlaHis: 0.86 ± 0.194
AlaIle: 5.446 ± 0.683
AlaLys: 5.302 ± 0.68
AlaLeu: 5.231 ± 0.77
AlaMet: 2.221 ± 0.348
AlaAsn: 3.081 ± 0.35
AlaPro: 2.221 ± 0.394
AlaGln: 2.866 ± 0.67
AlaArg: 4.299 ± 0.501
AlaSer: 6.162 ± 0.715
AlaThr: 3.941 ± 0.49
AlaVal: 5.016 ± 0.38
AlaTrp: 0.717 ± 0.225
AlaTyr: 2.723 ± 0.399
AlaXaa: 0.0 ± 0.0
Cys
CysAla: 0.43 ± 0.184
CysCys: 0.0 ± 0.0
CysAsp: 0.788 ± 0.286
CysGlu: 0.573 ± 0.21
CysPhe: 0.645 ± 0.198
CysGly: 0.215 ± 0.124
CysHis: 0.287 ± 0.127
CysIle: 0.573 ± 0.218
CysLys: 1.146 ± 0.362
CysLeu: 1.146 ± 0.335
CysMet: 0.358 ± 0.136
CysAsn: 0.645 ± 0.214
CysPro: 0.43 ± 0.145
CysGln: 0.143 ± 0.121
CysArg: 0.43 ± 0.239
CysSer: 0.43 ± 0.139
CysThr: 0.43 ± 0.202
CysVal: 0.645 ± 0.21
CysTrp: 0.0 ± 0.0
CysTyr: 0.287 ± 0.124
CysXaa: 0.0 ± 0.0
Asp
AspAla: 5.732 ± 0.583
AspCys: 0.43 ± 0.233
AspAsp: 3.654 ± 0.494
AspGlu: 3.869 ± 0.779
AspPhe: 2.58 ± 0.402
AspGly: 4.872 ± 0.671
AspHis: 0.788 ± 0.238
AspIle: 4.801 ± 0.704
AspLys: 4.299 ± 0.679
AspLeu: 5.374 ± 0.736
AspMet: 1.935 ± 0.464
AspAsn: 2.723 ± 0.489
AspPro: 2.15 ± 0.391
AspGln: 1.863 ± 0.275
AspArg: 2.723 ± 0.589
AspSer: 4.801 ± 0.441
AspThr: 3.726 ± 0.558
AspVal: 3.511 ± 0.647
AspTrp: 1.075 ± 0.276
AspTyr: 3.296 ± 0.37
AspXaa: 0.0 ± 0.0
Glu
GluAla: 6.592 ± 0.704
GluCys: 0.502 ± 0.268
GluAsp: 5.374 ± 0.495
GluGlu: 8.957 ± 1.353
GluPhe: 3.368 ± 0.504
GluGly: 6.091 ± 0.621
GluHis: 1.218 ± 0.24
GluIle: 2.794 ± 0.466
GluLys: 4.443 ± 0.555
GluLeu: 7.309 ± 0.675
GluMet: 2.15 ± 0.43
GluAsn: 2.866 ± 0.372
GluPro: 2.15 ± 0.524
GluGln: 2.221 ± 0.598
GluArg: 3.654 ± 0.649
GluSer: 5.517 ± 0.779
GluThr: 3.224 ± 0.397
GluVal: 6.735 ± 0.614
GluTrp: 1.648 ± 0.279
GluTyr: 3.224 ± 0.573
GluXaa: 0.0 ± 0.0
Phe
PheAla: 2.221 ± 0.312
PheCys: 0.287 ± 0.224
PheAsp: 3.439 ± 0.428
PheGlu: 2.293 ± 0.371
PhePhe: 1.146 ± 0.29
PheGly: 1.505 ± 0.353
PheHis: 0.43 ± 0.149
PheIle: 1.72 ± 0.39
PheLys: 3.081 ± 0.407
PheLeu: 2.651 ± 0.497
PheMet: 1.29 ± 0.301
PheAsn: 2.508 ± 0.482
PhePro: 1.576 ± 0.408
PheGln: 0.717 ± 0.159
PheArg: 2.15 ± 0.284
PheSer: 3.153 ± 0.64
PheThr: 2.15 ± 0.459
PheVal: 2.006 ± 0.462
PheTrp: 0.717 ± 0.28
PheTyr: 0.931 ± 0.219
PheXaa: 0.0 ± 0.0
Gly
GlyAla: 4.801 ± 0.615
GlyCys: 0.717 ± 0.219
GlyAsp: 4.371 ± 0.643
GlyGlu: 5.446 ± 0.692
GlyPhe: 2.436 ± 0.505
GlyGly: 4.944 ± 0.762
GlyHis: 1.361 ± 0.326
GlyIle: 3.798 ± 0.46
GlyLys: 5.947 ± 0.825
GlyLeu: 5.517 ± 0.697
GlyMet: 1.863 ± 0.373
GlyAsn: 2.938 ± 0.388
GlyPro: 0.143 ± 0.102
GlyGln: 2.651 ± 0.584
GlyArg: 3.439 ± 0.63
GlySer: 5.016 ± 0.564
GlyThr: 5.087 ± 0.751
GlyVal: 4.299 ± 0.646
GlyTrp: 0.86 ± 0.236
GlyTyr: 2.794 ± 0.406
GlyXaa: 0.0 ± 0.0
His
HisAla: 0.717 ± 0.245
HisCys: 0.43 ± 0.184
HisAsp: 1.576 ± 0.293
HisGlu: 1.003 ± 0.325
HisPhe: 0.645 ± 0.209
HisGly: 1.075 ± 0.338
HisHis: 0.358 ± 0.173
HisIle: 0.717 ± 0.265
HisLys: 1.576 ± 0.36
HisLeu: 2.221 ± 0.475
HisMet: 0.502 ± 0.23
HisAsn: 0.43 ± 0.216
HisPro: 0.645 ± 0.219
HisGln: 0.645 ± 0.229
HisArg: 1.003 ± 0.234
HisSer: 1.29 ± 0.292
HisThr: 0.931 ± 0.33
HisVal: 0.788 ± 0.228
HisTrp: 0.358 ± 0.163
HisTyr: 0.788 ± 0.243
HisXaa: 0.0 ± 0.0
Ile
IleAla: 4.801 ± 0.561
IleCys: 0.358 ± 0.142
IleAsp: 4.013 ± 0.593
IleGlu: 4.944 ± 0.57
IlePhe: 1.361 ± 0.244
IleGly: 3.941 ± 0.5
IleHis: 1.218 ± 0.339
IleIle: 2.58 ± 0.568
IleLys: 5.016 ± 0.619
IleLeu: 3.726 ± 0.524
IleMet: 1.146 ± 0.227
IleAsn: 3.009 ± 0.542
IlePro: 1.863 ± 0.32
IleGln: 1.935 ± 0.346
IleArg: 2.723 ± 0.43
IleSer: 3.798 ± 0.514
IleThr: 3.798 ± 0.523
IleVal: 3.081 ± 0.525
IleTrp: 0.573 ± 0.22
IleTyr: 2.15 ± 0.385
IleXaa: 0.0 ± 0.0
Lys
LysAla: 6.95 ± 0.835
LysCys: 0.86 ± 0.304
LysAsp: 5.732 ± 0.687
LysGlu: 8.025 ± 1.075
LysPhe: 1.433 ± 0.327
LysGly: 3.869 ± 0.756
LysHis: 1.433 ± 0.287
LysIle: 2.078 ± 0.387
LysLys: 3.726 ± 0.714
LysLeu: 5.374 ± 0.653
LysMet: 2.293 ± 0.367
LysAsn: 2.221 ± 0.452
LysPro: 2.293 ± 0.438
LysGln: 3.153 ± 0.478
LysArg: 4.371 ± 0.503
LysSer: 3.941 ± 0.616
LysThr: 3.009 ± 0.473
LysVal: 4.872 ± 0.589
LysTrp: 0.717 ± 0.219
LysTyr: 2.651 ± 0.389
LysXaa: 0.0 ± 0.0
Leu
LeuAla: 6.162 ± 0.658
LeuCys: 0.931 ± 0.269
LeuAsp: 5.087 ± 0.531
LeuGlu: 6.234 ± 0.661
LeuPhe: 2.866 ± 0.566
LeuGly: 5.589 ± 0.634
LeuHis: 1.72 ± 0.404
LeuIle: 4.443 ± 0.459
LeuLys: 5.087 ± 0.786
LeuLeu: 5.374 ± 0.732
LeuMet: 2.436 ± 0.389
LeuAsn: 4.299 ± 0.589
LeuPro: 3.511 ± 0.5
LeuGln: 3.153 ± 0.488
LeuArg: 4.013 ± 0.625
LeuSer: 7.309 ± 0.679
LeuThr: 5.446 ± 0.666
LeuVal: 4.657 ± 0.624
LeuTrp: 0.502 ± 0.151
LeuTyr: 3.153 ± 0.544
LeuXaa: 0.0 ± 0.0
Met
MetAla: 2.651 ± 0.38
MetCys: 0.215 ± 0.11
MetAsp: 1.29 ± 0.433
MetGlu: 1.863 ± 0.402
MetPhe: 1.361 ± 0.374
MetGly: 1.863 ± 0.409
MetHis: 0.573 ± 0.181
MetIle: 1.433 ± 0.275
MetLys: 2.006 ± 0.35
MetLeu: 3.009 ± 0.451
MetMet: 0.143 ± 0.087
MetAsn: 1.075 ± 0.297
MetPro: 0.717 ± 0.239
MetGln: 1.146 ± 0.206
MetArg: 1.003 ± 0.329
MetSer: 2.723 ± 0.439
MetThr: 1.863 ± 0.408
MetVal: 1.218 ± 0.262
MetTrp: 0.143 ± 0.091
MetTyr: 1.433 ± 0.329
MetXaa: 0.0 ± 0.0
Asn
AsnAla: 2.866 ± 0.423
AsnCys: 0.287 ± 0.156
AsnAsp: 1.433 ± 0.409
AsnGlu: 3.296 ± 0.482
AsnPhe: 1.72 ± 0.241
AsnGly: 3.869 ± 0.493
AsnHis: 0.502 ± 0.196
AsnIle: 3.009 ± 0.405
AsnLys: 3.798 ± 0.52
AsnLeu: 4.443 ± 0.624
AsnMet: 1.29 ± 0.307
AsnAsn: 2.866 ± 0.437
AsnPro: 2.365 ± 0.407
AsnGln: 1.863 ± 0.453
AsnArg: 2.436 ± 0.414
AsnSer: 2.938 ± 0.377
AsnThr: 3.153 ± 0.484
AsnVal: 3.081 ± 0.612
AsnTrp: 0.358 ± 0.134
AsnTyr: 2.436 ± 0.563
AsnXaa: 0.0 ± 0.0
Pro
ProAla: 2.078 ± 0.388
ProCys: 0.215 ± 0.121
ProAsp: 2.508 ± 0.406
ProGlu: 3.224 ± 0.535
ProPhe: 1.218 ± 0.41
ProGly: 0.143 ± 0.111
ProHis: 0.502 ± 0.16
ProIle: 1.648 ± 0.433
ProLys: 1.791 ± 0.392
ProLeu: 2.436 ± 0.385
ProMet: 1.075 ± 0.218
ProAsn: 2.15 ± 0.372
ProPro: 0.717 ± 0.263
ProGln: 0.86 ± 0.241
ProArg: 0.86 ± 0.193
ProSer: 4.013 ± 0.52
ProThr: 2.006 ± 0.355
ProVal: 2.365 ± 0.434
ProTrp: 0.573 ± 0.158
ProTyr: 1.72 ± 0.309
ProXaa: 0.0 ± 0.0
Gln
GlnAla: 2.866 ± 0.615
GlnCys: 0.358 ± 0.17
GlnAsp: 1.791 ± 0.375
GlnGlu: 3.798 ± 0.558
GlnPhe: 1.433 ± 0.306
GlnGly: 3.009 ± 0.37
GlnHis: 0.931 ± 0.259
GlnIle: 2.365 ± 0.321
GlnLys: 1.935 ± 0.404
GlnLeu: 2.723 ± 0.6
GlnMet: 1.075 ± 0.223
GlnAsn: 1.075 ± 0.277
GlnPro: 0.86 ± 0.212
GlnGln: 1.146 ± 0.431
GlnArg: 1.72 ± 0.284
GlnSer: 2.078 ± 0.379
GlnThr: 1.505 ± 0.369
GlnVal: 2.651 ± 0.458
GlnTrp: 0.502 ± 0.141
GlnTyr: 0.788 ± 0.272
GlnXaa: 0.0 ± 0.0
Arg
ArgAla: 3.726 ± 0.518
ArgCys: 0.645 ± 0.304
ArgAsp: 3.009 ± 0.666
ArgGlu: 4.371 ± 0.47
ArgPhe: 2.436 ± 0.404
ArgGly: 2.866 ± 0.533
ArgHis: 1.075 ± 0.294
ArgIle: 3.439 ± 0.412
ArgLys: 3.869 ± 0.661
ArgLeu: 3.654 ± 0.494
ArgMet: 1.863 ± 0.47
ArgAsn: 2.221 ± 0.388
ArgPro: 1.075 ± 0.253
ArgGln: 1.075 ± 0.305
ArgArg: 2.15 ± 0.442
ArgSer: 3.153 ± 0.579
ArgThr: 2.006 ± 0.438
ArgVal: 3.511 ± 0.561
ArgTrp: 0.788 ± 0.193
ArgTyr: 1.576 ± 0.368
ArgXaa: 0.0 ± 0.0
Ser
SerAla: 4.371 ± 0.56
SerCys: 0.502 ± 0.19
SerAsp: 4.729 ± 0.614
SerGlu: 5.159 ± 0.538
SerPhe: 2.436 ± 0.453
SerGly: 5.661 ± 0.752
SerHis: 1.218 ± 0.249
SerIle: 4.944 ± 0.757
SerLys: 4.872 ± 0.685
SerLeu: 6.592 ± 0.688
SerMet: 2.15 ± 0.544
SerAsn: 4.443 ± 0.749
SerPro: 2.508 ± 0.416
SerGln: 2.58 ± 0.547
SerArg: 3.726 ± 0.552
SerSer: 5.876 ± 1.009
SerThr: 5.016 ± 0.827
SerVal: 5.159 ± 0.639
SerTrp: 0.645 ± 0.215
SerTyr: 2.938 ± 0.511
SerXaa: 0.0 ± 0.0
Thr
ThrAla: 4.228 ± 0.763
ThrCys: 0.86 ± 0.276
ThrAsp: 3.224 ± 0.563
ThrGlu: 4.156 ± 0.506
ThrPhe: 2.436 ± 0.404
ThrGly: 4.371 ± 0.606
ThrHis: 1.218 ± 0.254
ThrIle: 4.156 ± 0.548
ThrLys: 2.866 ± 0.539
ThrLeu: 6.234 ± 0.78
ThrMet: 1.075 ± 0.278
ThrAsn: 2.221 ± 0.412
ThrPro: 3.081 ± 0.337
ThrGln: 2.723 ± 0.427
ThrArg: 2.794 ± 0.44
ThrSer: 3.511 ± 0.64
ThrThr: 3.081 ± 0.625
ThrVal: 3.439 ± 0.54
ThrTrp: 0.215 ± 0.167
ThrTyr: 1.863 ± 0.332
ThrXaa: 0.0 ± 0.0
Val
ValAla: 5.374 ± 0.494
ValCys: 0.358 ± 0.162
ValAsp: 4.156 ± 0.481
ValGlu: 4.443 ± 0.56
ValPhe: 1.791 ± 0.386
ValGly: 4.514 ± 0.513
ValHis: 0.788 ± 0.233
ValIle: 3.009 ± 0.434
ValLys: 4.514 ± 0.651
ValLeu: 4.801 ± 0.754
ValMet: 1.505 ± 0.322
ValAsn: 4.586 ± 0.417
ValPro: 2.15 ± 0.333
ValGln: 2.651 ± 0.486
ValArg: 3.009 ± 0.511
ValSer: 5.302 ± 0.819
ValThr: 4.586 ± 0.733
ValVal: 4.228 ± 0.63
ValTrp: 0.788 ± 0.236
ValTyr: 2.078 ± 0.378
ValXaa: 0.0 ± 0.0
Trp
TrpAla: 1.003 ± 0.255
TrpCys: 0.287 ± 0.132
TrpAsp: 0.358 ± 0.128
TrpGlu: 1.29 ± 0.233
TrpPhe: 0.645 ± 0.216
TrpGly: 0.86 ± 0.265
TrpHis: 0.215 ± 0.117
TrpIle: 0.502 ± 0.164
TrpLys: 0.931 ± 0.373
TrpLeu: 0.717 ± 0.281
TrpMet: 0.287 ± 0.157
TrpAsn: 0.573 ± 0.154
TrpPro: 0.358 ± 0.166
TrpGln: 0.215 ± 0.11
TrpArg: 0.645 ± 0.234
TrpSer: 0.86 ± 0.237
TrpThr: 0.43 ± 0.2
TrpVal: 1.075 ± 0.204
TrpTrp: 0.072 ± 0.077
TrpTyr: 0.43 ± 0.204
TrpXaa: 0.0 ± 0.0
Tyr
TyrAla: 2.58 ± 0.478
TyrCys: 0.287 ± 0.139
TyrAsp: 2.15 ± 0.379
TyrGlu: 2.508 ± 0.42
TyrPhe: 1.29 ± 0.334
TyrGly: 3.009 ± 0.479
TyrHis: 1.003 ± 0.236
TyrIle: 2.58 ± 0.47
TyrLys: 2.794 ± 0.544
TyrLeu: 3.439 ± 0.382
TyrMet: 1.003 ± 0.201
TyrAsn: 2.078 ± 0.388
TyrPro: 1.29 ± 0.382
TyrGln: 1.003 ± 0.208
TyrArg: 1.433 ± 0.253
TyrSer: 3.654 ± 0.473
TyrThr: 2.365 ± 0.416
TyrVal: 2.293 ± 0.309
TyrTrp: 0.502 ± 0.186
TyrTyr: 1.29 ± 0.331
TyrXaa: 0.0 ± 0.0
Xaa
XaaAla: 0.0 ± 0.0
XaaCys: 0.0 ± 0.0
XaaAsp: 0.0 ± 0.0
XaaGlu: 0.0 ± 0.0
XaaPhe: 0.0 ± 0.0
XaaGly: 0.0 ± 0.0
XaaHis: 0.0 ± 0.0
XaaIle: 0.0 ± 0.0
XaaLys: 0.0 ± 0.0
XaaLeu: 0.0 ± 0.0
XaaMet: 0.0 ± 0.0
XaaAsn: 0.0 ± 0.0
XaaPro: 0.0 ± 0.0
XaaGln: 0.0 ± 0.0
XaaArg: 0.0 ± 0.0
XaaSer: 0.0 ± 0.0
XaaThr: 0.0 ± 0.0
XaaVal: 0.0 ± 0.0
XaaTrp: 0.0 ± 0.0
XaaTyr: 0.0 ± 0.0
XaaXaa: 0.0 ± 0.0
Statistics based on 50 proteins (13957 amino acids)

Note: The error has been estimated by bootstrapping (x100) at the protein level.
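A protein-level bootstrap of this kind can be sketched as follows. This is a hedged illustration, not the procedure used to produce the table: the function names `pair_permille` and `bootstrap_error`, the use of the standard deviation of the resampled estimates as the reported error, the fixed random seed, and the toy sequences are all assumptions introduced here. Only the resampling of whole proteins and the 100 rounds come from the note above.

```python
import random
import statistics
from collections import Counter

def pair_permille(proteins, pair):
    """Frequency of one dipeptide in per mille (hypothetical helper)."""
    counts = Counter()
    for seq in proteins:
        # Overlapping residue pairs within each protein.
        counts.update(seq[i:i + 2] for i in range(len(seq) - 1))
    total = sum(counts.values())
    return 1000.0 * counts[pair] / total if total else 0.0

def bootstrap_error(proteins, pair, n_rounds=100, seed=0):
    """Spread of the dipeptide frequency over protein-level resamples.

    Assumption: the error is the sample standard deviation of the
    estimates obtained from n_rounds bootstrap replicates.
    """
    rng = random.Random(seed)  # fixed seed only to make the sketch reproducible
    estimates = []
    for _ in range(n_rounds):
        # Resample whole proteins with replacement, keeping the set size.
        sample = rng.choices(proteins, k=len(proteins))
        estimates.append(pair_permille(sample, pair))
    return statistics.stdev(estimates)

# Toy input (made-up sequences, not taken from the CPG1g proteome).
toy = ["MKAAL", "AAG", "GGK"]
err_aa = bootstrap_error(toy, "AA")
err_ww = bootstrap_error(toy, "WW")  # a pair that never occurs
```

Resampling whole proteins rather than individual residues keeps each sequence intact, so the estimate reflects variation between proteins in the proteome.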

The above dipeptide statistics (among other stats for this proteome) can be downloaded from this CSV file
See this proteome in: uniprot_link
Proteome-pI is available under Creative Commons Attribution-NoDerivs license, for more details see here

Reference: Kozlowski LP. Proteome-pI 2.0: Proteome Isoelectric Point Database Update. Nucleic Acids Res. 2021, doi: 10.1093/nar/gkab944
Contact: Lukasz P. Kozlowski