Amino acid dipeptide frequency for Sphingomonas phage Scott

Apart from single amino acid frequencies, one can also calculate the so-called amino acid dipeptide frequency.

There are 400 possible dipeptides (441 if Xaa is counted as an additional 21st amino acid representing all non-standard or unknown amino acids). Thus, if dipeptides occurred randomly and with equal probability in proteins, each of the 400 dipeptides would be present at a frequency of 0.25% (2.5‰). As this is not the case in nature, for better visibility all dipeptides that are more frequent than expected are marked in red in the table, and those that are underrepresented are marked in blue.

All values are given in per mille (‰) and therefore need to be multiplied by 10^-3 to obtain fractions.
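As an illustration of the two paragraphs above, the following minimal Python sketch counts dipeptides and compares them with the uniform expectation of 1/400 = 0.25% = 2.5‰. It is not the Proteome-pI implementation: the function names `dipeptide_per_mille` and `classify` are hypothetical, the sequences use one-letter codes rather than the three-letter codes of the table, and both the counting scheme (overlapping pairs that never span two proteins) and the normalization (by the number of counted pairs) are assumptions.

```python
from collections import Counter

# Uniform expectation over the 400 standard dipeptides: 1/400 = 2.5 per mille.
UNIFORM_PER_MILLE = 1000.0 / 400


def dipeptide_per_mille(proteins):
    """Return {dipeptide: frequency in per mille} for a list of sequences.

    Assumption: overlapping windows of length 2 are counted inside each
    protein, and frequencies are normalized by the number of counted pairs.
    """
    counts = Counter()
    for seq in proteins:
        counts.update(seq[i:i + 2] for i in range(len(seq) - 1))
    total = sum(counts.values())
    return {pair: 1000.0 * n / total for pair, n in counts.items()}


def classify(freq_per_mille, expected=UNIFORM_PER_MILLE):
    """Label a frequency relative to the expected value."""
    if freq_per_mille > expected:
        return "over"
    if freq_per_mille < expected:
        return "under"
    return "as expected"


# Toy input, not taken from the phage proteome.
toy = ["MAAGL", "AAK"]
freqs = dipeptide_per_mille(toy)
for pair in sorted(freqs):
    print(pair, round(freqs[pair], 3), classify(freqs[pair]))
```

With this threshold, a table value such as AlaAla (14.048‰) would be classified as overrepresented and CysCys (0.228‰) as underrepresented.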

For more information, see the sequence space article on Wikipedia.

Ala Cys Asp Glu Phe Gly His Ile Lys Leu Met Asn Pro Gln Arg Ser Thr Val Trp Tyr Xaa
Ala
AlaAla: 14.048 ± 1.922
AlaCys: 1.139 ± 0.337
AlaAsp: 8.733 ± 0.77
AlaGlu: 6.834 ± 0.754
AlaPhe: 3.797 ± 0.495
AlaGly: 8.809 ± 0.948
AlaHis: 0.987 ± 0.258
AlaIle: 4.632 ± 0.493
AlaLys: 6.379 ± 0.976
AlaLeu: 10.403 ± 0.987
AlaMet: 3.189 ± 0.533
AlaAsn: 4.404 ± 0.597
AlaPro: 3.341 ± 0.419
AlaGln: 5.543 ± 0.764
AlaArg: 5.771 ± 0.8
AlaSer: 5.24 ± 0.754
AlaThr: 5.999 ± 0.869
AlaVal: 6.986 ± 0.577
AlaTrp: 1.671 ± 0.309
AlaTyr: 2.886 ± 0.309
AlaXaa: 0.0 ± 0.0
Cys
CysAla: 0.987 ± 0.315
CysCys: 0.228 ± 0.173
CysAsp: 0.076 ± 0.084
CysGlu: 0.532 ± 0.231
CysPhe: 0.607 ± 0.215
CysGly: 0.759 ± 0.208
CysHis: 0.0 ± 0.0
CysIle: 1.063 ± 0.403
CysLys: 0.456 ± 0.191
CysLeu: 0.607 ± 0.224
CysMet: 0.152 ± 0.108
CysAsn: 0.456 ± 0.187
CysPro: 0.607 ± 0.21
CysGln: 0.228 ± 0.141
CysArg: 0.38 ± 0.156
CysSer: 0.456 ± 0.202
CysThr: 0.228 ± 0.145
CysVal: 0.532 ± 0.231
CysTrp: 0.228 ± 0.116
CysTyr: 0.38 ± 0.171
CysXaa: 0.0 ± 0.0
Asp
AspAla: 6.227 ± 0.569
AspCys: 0.683 ± 0.212
AspAsp: 3.265 ± 0.599
AspGlu: 3.265 ± 0.409
AspPhe: 3.113 ± 0.451
AspGly: 5.316 ± 0.776
AspHis: 0.835 ± 0.207
AspIle: 3.265 ± 0.413
AspLys: 3.417 ± 0.51
AspLeu: 5.619 ± 0.59
AspMet: 1.898 ± 0.54
AspAsn: 2.582 ± 0.458
AspPro: 4.176 ± 0.671
AspGln: 2.658 ± 0.549
AspArg: 3.721 ± 0.623
AspSer: 2.506 ± 0.411
AspThr: 3.341 ± 0.537
AspVal: 4.404 ± 0.567
AspTrp: 0.911 ± 0.254
AspTyr: 1.898 ± 0.347
AspXaa: 0.0 ± 0.0
Glu
GluAla: 5.695 ± 0.807
GluCys: 0.304 ± 0.19
GluAsp: 3.645 ± 0.533
GluGlu: 2.202 ± 0.544
GluPhe: 1.747 ± 0.31
GluGly: 4.101 ± 0.576
GluHis: 1.063 ± 0.321
GluIle: 2.43 ± 0.486
GluLys: 2.962 ± 0.548
GluLeu: 4.632 ± 0.684
GluMet: 0.911 ± 0.29
GluAsn: 1.519 ± 0.367
GluPro: 2.126 ± 0.387
GluGln: 3.721 ± 0.574
GluArg: 4.101 ± 0.739
GluSer: 2.202 ± 0.336
GluThr: 3.113 ± 0.467
GluVal: 4.404 ± 0.666
GluTrp: 1.215 ± 0.375
GluTyr: 2.05 ± 0.398
GluXaa: 0.0 ± 0.0
Phe
PheAla: 2.734 ± 0.396
PheCys: 0.456 ± 0.197
PheAsp: 3.189 ± 0.389
PheGlu: 2.05 ± 0.378
PhePhe: 1.063 ± 0.196
PheGly: 2.962 ± 0.467
PheHis: 0.683 ± 0.227
PheIle: 1.671 ± 0.366
PheLys: 1.974 ± 0.414
PheLeu: 2.962 ± 0.358
PheMet: 0.532 ± 0.181
PheAsn: 1.822 ± 0.277
PhePro: 0.987 ± 0.293
PheGln: 1.367 ± 0.279
PheArg: 2.582 ± 0.337
PheSer: 3.037 ± 0.379
PheThr: 2.278 ± 0.493
PheVal: 2.126 ± 0.442
PheTrp: 0.304 ± 0.141
PheTyr: 1.063 ± 0.215
PheXaa: 0.0 ± 0.0
Gly
GlyAla: 8.429 ± 0.995
GlyCys: 0.835 ± 0.281
GlyAsp: 4.632 ± 0.458
GlyGlu: 4.48 ± 0.561
GlyPhe: 3.873 ± 0.463
GlyGly: 7.973 ± 0.882
GlyHis: 1.443 ± 0.358
GlyIle: 3.645 ± 0.427
GlyLys: 5.543 ± 0.502
GlyLeu: 5.316 ± 0.839
GlyMet: 3.189 ± 0.5
GlyAsn: 3.873 ± 0.699
GlyPro: 2.354 ± 0.449
GlyGln: 3.949 ± 0.361
GlyArg: 6.379 ± 0.735
GlySer: 5.999 ± 0.551
GlyThr: 5.088 ± 0.546
GlyVal: 4.328 ± 0.43
GlyTrp: 1.443 ± 0.311
GlyTyr: 2.354 ± 0.383
GlyXaa: 0.0 ± 0.0
His
HisAla: 1.139 ± 0.281
HisCys: 0.152 ± 0.106
HisAsp: 1.063 ± 0.221
HisGlu: 0.532 ± 0.255
HisPhe: 0.532 ± 0.175
HisGly: 0.987 ± 0.283
HisHis: 0.304 ± 0.138
HisIle: 0.911 ± 0.251
HisLys: 0.987 ± 0.258
HisLeu: 1.595 ± 0.278
HisMet: 0.835 ± 0.246
HisAsn: 0.532 ± 0.201
HisPro: 0.759 ± 0.238
HisGln: 0.456 ± 0.152
HisArg: 0.607 ± 0.281
HisSer: 0.456 ± 0.161
HisThr: 1.215 ± 0.291
HisVal: 1.443 ± 0.292
HisTrp: 0.456 ± 0.192
HisTyr: 0.532 ± 0.23
HisXaa: 0.0 ± 0.0
Ile
IleAla: 5.543 ± 0.567
IleCys: 0.304 ± 0.154
IleAsp: 3.645 ± 0.398
IleGlu: 3.113 ± 0.612
IlePhe: 1.367 ± 0.366
IleGly: 4.936 ± 0.589
IleHis: 0.835 ± 0.267
IleIle: 1.898 ± 0.389
IleLys: 2.582 ± 0.399
IleLeu: 2.734 ± 0.445
IleMet: 1.063 ± 0.285
IleAsn: 2.81 ± 0.365
IlePro: 1.974 ± 0.378
IleGln: 1.671 ± 0.27
IleArg: 2.582 ± 0.391
IleSer: 1.443 ± 0.259
IleThr: 3.189 ± 0.405
IleVal: 3.645 ± 0.668
IleTrp: 0.532 ± 0.162
IleTyr: 1.063 ± 0.29
IleXaa: 0.0 ± 0.0
Lys
LysAla: 7.594 ± 1.092
LysCys: 0.304 ± 0.156
LysAsp: 2.81 ± 0.424
LysGlu: 2.658 ± 0.552
LysPhe: 1.367 ± 0.277
LysGly: 3.189 ± 0.58
LysHis: 0.987 ± 0.34
LysIle: 2.278 ± 0.371
LysLys: 1.898 ± 0.367
LysLeu: 5.543 ± 0.586
LysMet: 1.291 ± 0.33
LysAsn: 2.202 ± 0.442
LysPro: 2.43 ± 0.395
LysGln: 1.974 ± 0.33
LysArg: 2.582 ± 0.459
LysSer: 3.417 ± 0.428
LysThr: 2.962 ± 0.484
LysVal: 3.341 ± 0.515
LysTrp: 0.683 ± 0.232
LysTyr: 1.822 ± 0.333
LysXaa: 0.0 ± 0.0
Leu
LeuAla: 11.542 ± 0.813
LeuCys: 0.835 ± 0.295
LeuAsp: 6.075 ± 0.794
LeuGlu: 4.936 ± 0.771
LeuPhe: 2.506 ± 0.484
LeuGly: 5.771 ± 0.638
LeuHis: 0.987 ± 0.239
LeuIle: 3.265 ± 0.549
LeuLys: 2.81 ± 0.509
LeuLeu: 6.91 ± 0.755
LeuMet: 2.05 ± 0.487
LeuAsn: 3.721 ± 0.514
LeuPro: 4.101 ± 0.601
LeuGln: 2.81 ± 0.371
LeuArg: 5.543 ± 0.431
LeuSer: 5.164 ± 0.82
LeuThr: 4.86 ± 0.534
LeuVal: 5.316 ± 0.523
LeuTrp: 0.683 ± 0.219
LeuTyr: 3.189 ± 0.381
LeuXaa: 0.0 ± 0.0
Met
MetAla: 3.417 ± 0.554
MetCys: 0.076 ± 0.099
MetAsp: 2.202 ± 0.403
MetGlu: 1.215 ± 0.384
MetPhe: 0.683 ± 0.223
MetGly: 2.202 ± 0.308
MetHis: 0.456 ± 0.178
MetIle: 1.443 ± 0.406
MetLys: 0.911 ± 0.331
MetLeu: 2.81 ± 0.655
MetMet: 0.683 ± 0.257
MetAsn: 1.291 ± 0.339
MetPro: 1.291 ± 0.319
MetGln: 1.215 ± 0.261
MetArg: 1.822 ± 0.422
MetSer: 1.671 ± 0.339
MetThr: 1.443 ± 0.308
MetVal: 1.519 ± 0.234
MetTrp: 0.683 ± 0.277
MetTyr: 0.532 ± 0.169
MetXaa: 0.0 ± 0.0
Asn
AsnAla: 3.417 ± 0.427
AsnCys: 0.532 ± 0.272
AsnAsp: 2.582 ± 0.371
AsnGlu: 1.671 ± 0.339
AsnPhe: 1.443 ± 0.477
AsnGly: 4.48 ± 0.538
AsnHis: 0.911 ± 0.2
AsnIle: 2.278 ± 0.415
AsnLys: 2.278 ± 0.366
AsnLeu: 2.886 ± 0.511
AsnMet: 1.595 ± 0.325
AsnAsn: 1.974 ± 0.401
AsnPro: 2.81 ± 0.38
AsnGln: 1.291 ± 0.267
AsnArg: 2.202 ± 0.44
AsnSer: 1.974 ± 0.41
AsnThr: 3.265 ± 0.447
AsnVal: 3.037 ± 0.541
AsnTrp: 0.759 ± 0.229
AsnTyr: 1.139 ± 0.282
AsnXaa: 0.0 ± 0.0
Pro
ProAla: 4.86 ± 0.784
ProCys: 0.456 ± 0.185
ProAsp: 3.113 ± 0.355
ProGlu: 2.582 ± 0.471
ProPhe: 1.747 ± 0.329
ProGly: 3.493 ± 0.526
ProHis: 0.456 ± 0.146
ProIle: 1.898 ± 0.445
ProLys: 2.734 ± 0.529
ProLeu: 3.797 ± 0.549
ProMet: 1.443 ± 0.27
ProAsn: 2.202 ± 0.476
ProPro: 1.367 ± 0.322
ProGln: 1.291 ± 0.304
ProArg: 1.747 ± 0.461
ProSer: 2.962 ± 0.522
ProThr: 2.734 ± 0.543
ProVal: 3.949 ± 0.452
ProTrp: 0.228 ± 0.105
ProTyr: 1.519 ± 0.31
ProXaa: 0.0 ± 0.0
Gln
GlnAla: 6.227 ± 1.066
GlnCys: 0.228 ± 0.115
GlnAsp: 1.974 ± 0.353
GlnGlu: 1.822 ± 0.341
GlnPhe: 1.747 ± 0.324
GlnGly: 3.037 ± 0.651
GlnHis: 1.443 ± 0.287
GlnIle: 2.202 ± 0.421
GlnLys: 2.126 ± 0.375
GlnLeu: 3.721 ± 0.497
GlnMet: 0.835 ± 0.299
GlnAsn: 1.139 ± 0.296
GlnPro: 2.05 ± 0.496
GlnGln: 1.519 ± 0.465
GlnArg: 2.81 ± 0.48
GlnSer: 2.202 ± 0.335
GlnThr: 1.974 ± 0.332
GlnVal: 2.658 ± 0.302
GlnTrp: 0.987 ± 0.2
GlnTyr: 1.139 ± 0.31
GlnXaa: 0.0 ± 0.0
Arg
ArgAla: 4.784 ± 0.558
ArgCys: 0.532 ± 0.212
ArgAsp: 2.886 ± 0.369
ArgGlu: 4.025 ± 0.571
ArgPhe: 2.43 ± 0.375
ArgGly: 5.012 ± 0.555
ArgHis: 1.215 ± 0.309
ArgIle: 4.101 ± 0.571
ArgLys: 2.658 ± 0.449
ArgLeu: 4.936 ± 0.639
ArgMet: 2.05 ± 0.424
ArgAsn: 2.658 ± 0.395
ArgPro: 2.43 ± 0.456
ArgGln: 2.658 ± 0.435
ArgArg: 4.252 ± 0.629
ArgSer: 3.797 ± 0.453
ArgThr: 2.658 ± 0.508
ArgVal: 4.252 ± 0.485
ArgTrp: 0.987 ± 0.384
ArgTyr: 2.202 ± 0.468
ArgXaa: 0.0 ± 0.0
Ser
SerAla: 5.771 ± 0.515
SerCys: 0.532 ± 0.209
SerAsp: 3.189 ± 0.375
SerGlu: 2.962 ± 0.443
SerPhe: 2.506 ± 0.468
SerGly: 6.075 ± 0.905
SerHis: 0.759 ± 0.239
SerIle: 2.43 ± 0.408
SerLys: 2.582 ± 0.408
SerLeu: 4.404 ± 0.498
SerMet: 1.595 ± 0.385
SerAsn: 2.126 ± 0.361
SerPro: 2.354 ± 0.41
SerGln: 2.506 ± 0.513
SerArg: 3.037 ± 0.594
SerSer: 3.189 ± 0.431
SerThr: 3.417 ± 0.468
SerVal: 3.797 ± 0.556
SerTrp: 0.683 ± 0.264
SerTyr: 2.202 ± 0.599
SerXaa: 0.0 ± 0.0
Thr
ThrAla: 6.986 ± 0.719
ThrCys: 0.38 ± 0.186
ThrAsp: 3.113 ± 0.423
ThrGlu: 2.354 ± 0.575
ThrPhe: 1.747 ± 0.314
ThrGly: 5.847 ± 0.489
ThrHis: 0.607 ± 0.187
ThrIle: 2.81 ± 0.529
ThrLys: 2.81 ± 0.509
ThrLeu: 4.86 ± 0.922
ThrMet: 1.139 ± 0.248
ThrAsn: 1.822 ± 0.316
ThrPro: 4.176 ± 0.585
ThrGln: 1.747 ± 0.368
ThrArg: 3.721 ± 0.496
ThrSer: 3.493 ± 0.477
ThrThr: 4.252 ± 0.549
ThrVal: 4.556 ± 0.718
ThrTrp: 0.911 ± 0.243
ThrTyr: 1.822 ± 0.353
ThrXaa: 0.0 ± 0.0
Val
ValAla: 7.214 ± 0.728
ValCys: 0.38 ± 0.174
ValAsp: 4.101 ± 0.493
ValGlu: 4.632 ± 0.665
ValPhe: 1.898 ± 0.332
ValGly: 6.075 ± 0.971
ValHis: 1.139 ± 0.257
ValIle: 3.189 ± 0.57
ValLys: 3.797 ± 0.608
ValLeu: 4.632 ± 0.536
ValMet: 1.519 ± 0.355
ValAsn: 3.569 ± 0.436
ValPro: 3.189 ± 0.556
ValGln: 2.582 ± 0.349
ValArg: 3.797 ± 0.562
ValSer: 4.101 ± 0.7
ValThr: 4.404 ± 0.583
ValVal: 3.949 ± 0.653
ValTrp: 0.911 ± 0.29
ValTyr: 1.898 ± 0.364
ValXaa: 0.0 ± 0.0
Trp
TrpAla: 1.519 ± 0.347
TrpCys: 0.38 ± 0.194
TrpAsp: 0.987 ± 0.336
TrpGlu: 0.607 ± 0.176
TrpPhe: 0.607 ± 0.218
TrpGly: 0.835 ± 0.248
TrpHis: 0.228 ± 0.119
TrpIle: 0.532 ± 0.164
TrpLys: 0.456 ± 0.181
TrpLeu: 1.974 ± 0.43
TrpMet: 0.607 ± 0.234
TrpAsn: 0.532 ± 0.244
TrpPro: 0.38 ± 0.182
TrpGln: 1.291 ± 0.494
TrpArg: 0.835 ± 0.253
TrpSer: 0.835 ± 0.216
TrpThr: 0.911 ± 0.263
TrpVal: 1.139 ± 0.33
TrpTrp: 0.228 ± 0.132
TrpTyr: 0.152 ± 0.088
TrpXaa: 0.0 ± 0.0
Tyr
TyrAla: 2.658 ± 0.361
TyrCys: 0.228 ± 0.145
TyrAsp: 1.822 ± 0.359
TyrGlu: 1.671 ± 0.315
TyrPhe: 1.063 ± 0.322
TyrGly: 3.341 ± 0.493
TyrHis: 0.152 ± 0.136
TyrIle: 0.987 ± 0.229
TyrLys: 1.974 ± 0.408
TyrLeu: 2.734 ± 0.549
TyrMet: 0.911 ± 0.271
TyrAsn: 1.291 ± 0.402
TyrPro: 1.671 ± 0.456
TyrGln: 1.367 ± 0.293
TyrArg: 2.05 ± 0.426
TyrSer: 1.974 ± 0.319
TyrThr: 1.822 ± 0.279
TyrVal: 1.595 ± 0.315
TyrTrp: 0.532 ± 0.234
TyrTyr: 1.063 ± 0.251
TyrXaa: 0.0 ± 0.0
Xaa
XaaAla: 0.0 ± 0.0
XaaCys: 0.0 ± 0.0
XaaAsp: 0.0 ± 0.0
XaaGlu: 0.0 ± 0.0
XaaPhe: 0.0 ± 0.0
XaaGly: 0.0 ± 0.0
XaaHis: 0.0 ± 0.0
XaaIle: 0.0 ± 0.0
XaaLys: 0.0 ± 0.0
XaaLeu: 0.0 ± 0.0
XaaMet: 0.0 ± 0.0
XaaAsn: 0.0 ± 0.0
XaaPro: 0.0 ± 0.0
XaaGln: 0.0 ± 0.0
XaaArg: 0.0 ± 0.0
XaaSer: 0.0 ± 0.0
XaaThr: 0.0 ± 0.0
XaaVal: 0.0 ± 0.0
XaaTrp: 0.0 ± 0.0
XaaTyr: 0.0 ± 0.0
XaaXaa: 0.0 ± 0.0
Statistics based on 51 proteins (13170 amino acids)

Note: The error has been estimated by bootstrapping (x100) at the protein level.
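A protein-level bootstrap of this kind could look roughly like the Python sketch below. It is only an illustration under stated assumptions, not the Proteome-pI code: the helper names `pair_freq` and `bootstrap_error` are hypothetical, sequences use one-letter codes, whole proteins are resampled with replacement 100 times, and the reported error is taken to be the sample standard deviation across the resampled estimates (the exact statistic is not stated in the note).

```python
import random
import statistics


def pair_freq(proteins, pair):
    """Per mille frequency of one dipeptide over overlapping within-protein pairs."""
    hits = total = 0
    for seq in proteins:
        for i in range(len(seq) - 1):
            total += 1
            hits += seq[i:i + 2] == pair
    return 1000.0 * hits / total if total else 0.0


def bootstrap_error(proteins, pair, n_rounds=100, seed=0):
    """Estimate the error of a dipeptide frequency by resampling proteins.

    Assumption: each round draws len(proteins) whole proteins with
    replacement; the error is the standard deviation over all rounds.
    """
    rng = random.Random(seed)
    estimates = []
    for _ in range(n_rounds):
        sample = rng.choices(proteins, k=len(proteins))
        estimates.append(pair_freq(sample, pair))
    return statistics.stdev(estimates)


# Toy input, not taken from the phage proteome.
toy = ["MAAGL", "AAK", "GLAAV"]
print(pair_freq(toy, "AA"), "+/-", bootstrap_error(toy, "AA"))
```

Resampling whole proteins rather than individual dipeptides keeps the pairs from one protein together, so the estimated error reflects variation between proteins.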

The dipeptide statistics above (among other statistics for this proteome) can be downloaded from this CSV file
See this proteome in: uniprot_link
Proteome-pI is available under a Creative Commons Attribution-NoDerivs license; for more details see here

Reference: Kozlowski LP. Proteome-pI 2.0: Proteome Isoelectric Point Database Update. Nucleic Acids Res. 2021, doi: 10.1093/nar/gkab944
Contact: Lukasz P. Kozlowski