Amino acid dipeptide frequency for Clostridioides phage phiSemix9P1

Apart from single amino acid frequencies, one can also calculate the so-called amino acid dipeptide frequency, that is, how often each ordered pair of adjacent amino acids occurs.

There are 400 possible dipeptides (441 if Xaa is counted as an additional, 21st amino acid representing all non-standard or unknown amino acids). Thus, if dipeptides occurred randomly in proteins, each one would be present at a frequency of 0.25% (2.5‰). As this is not the case in nature, for better visibility all dipeptides that occur more often than expected are marked in red in the table, and those that are underrepresented are marked in blue.
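The counting described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the code used by the database: the function names, the toy sequences, and the choice to normalise by the total number of dipeptides counted are assumptions, and the database's exact normalisation is not stated here. The 2.5‰ threshold is the uniform expectation (1/400) mentioned in the text.

```python
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard one-letter codes

# Uniform expectation from the text: 1/400 = 0.25 % = 2.5 per mille.
UNIFORM_PERMILLE = 1000.0 / 400


def dipeptide_permille(proteins):
    """Return {dipeptide: frequency in per mille} over a list of sequences.

    Assumption: frequencies are normalised by the total number of
    dipeptides counted; pairs never span two different proteins.
    """
    counts = Counter()
    for seq in proteins:
        # Overlapping windows of length 2 along each sequence.
        for first, second in zip(seq, seq[1:]):
            # Anything outside the 20 standard residues is mapped to X (Xaa).
            first = first if first in AMINO_ACIDS else "X"
            second = second if second in AMINO_ACIDS else "X"
            counts[first + second] += 1
    total = sum(counts.values())
    if total == 0:
        return {}
    return {pair: 1000.0 * n / total for pair, n in counts.items()}


def classify(value_permille, expected=UNIFORM_PERMILLE):
    """Label a frequency relative to the uniform expectation."""
    if value_permille > expected:
        return "over"
    if value_permille < expected:
        return "under"
    return "expected"


# Hypothetical toy input, not taken from the phage proteome.
toy = ["MKKLE", "MKEI"]
freqs = dipeptide_permille(toy)
```

With values from the table, `classify(10.293)` (LysLys) gives `"over"` and `classify(0.324)` (AlaCys) gives `"under"`.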

All values are given in per mille (‰) and therefore need to be multiplied by 10⁻³ to obtain fractions.

For more information, see the sequence space article on Wikipedia.

Ala Cys Asp Glu Phe Gly His Ile Lys Leu Met Asn Pro Gln Arg Ser Thr Val Trp Tyr Xaa
Ala
AlaAla: 1.813 ± 0.533
AlaCys: 0.324 ± 0.173
AlaAsp: 2.46 ± 0.452
AlaGlu: 2.784 ± 0.498
AlaPhe: 2.46 ± 0.679
AlaGly: 3.172 ± 0.855
AlaHis: 0.388 ± 0.138
AlaIle: 4.791 ± 0.538
AlaLys: 4.467 ± 0.471
AlaLeu: 4.402 ± 0.68
AlaMet: 1.165 ± 0.328
AlaAsn: 3.172 ± 0.486
AlaPro: 0.906 ± 0.254
AlaGln: 0.453 ± 0.159
AlaArg: 1.359 ± 0.272
AlaSer: 3.625 ± 0.589
AlaThr: 3.496 ± 0.593
AlaVal: 2.46 ± 0.461
AlaTrp: 0.388 ± 0.131
AlaTyr: 1.813 ± 0.467
AlaXaa: 0.0 ± 0.0
Cys
CysAla: 0.453 ± 0.189
CysCys: 0.0 ± 0.0
CysAsp: 0.324 ± 0.138
CysGlu: 0.453 ± 0.182
CysPhe: 0.842 ± 0.228
CysGly: 0.324 ± 0.144
CysHis: 0.194 ± 0.163
CysIle: 1.036 ± 0.301
CysLys: 0.583 ± 0.274
CysLeu: 0.453 ± 0.155
CysMet: 0.324 ± 0.133
CysAsn: 0.518 ± 0.182
CysPro: 0.324 ± 0.14
CysGln: 0.259 ± 0.143
CysArg: 0.388 ± 0.162
CysSer: 1.295 ± 0.268
CysThr: 0.388 ± 0.187
CysVal: 0.259 ± 0.138
CysTrp: 0.194 ± 0.123
CysTyr: 0.259 ± 0.118
CysXaa: 0.0 ± 0.0
Asp
AspAla: 2.654 ± 0.427
AspCys: 0.712 ± 0.246
AspAsp: 2.978 ± 0.444
AspGlu: 3.69 ± 0.569
AspPhe: 2.46 ± 0.354
AspGly: 2.978 ± 0.522
AspHis: 0.259 ± 0.111
AspIle: 5.956 ± 0.816
AspLys: 6.733 ± 0.797
AspLeu: 5.179 ± 0.6
AspMet: 1.359 ± 0.267
AspAsn: 5.114 ± 0.44
AspPro: 1.036 ± 0.372
AspGln: 1.165 ± 0.246
AspArg: 2.201 ± 0.429
AspSer: 5.114 ± 0.821
AspThr: 3.949 ± 0.526
AspVal: 2.784 ± 0.413
AspTrp: 0.388 ± 0.16
AspTyr: 2.913 ± 0.467
AspXaa: 0.0 ± 0.0
Glu
GluAla: 3.172 ± 0.615
GluCys: 0.842 ± 0.246
GluAsp: 4.078 ± 0.526
GluGlu: 6.603 ± 1.061
GluPhe: 3.561 ± 0.542
GluGly: 2.978 ± 0.448
GluHis: 1.165 ± 0.296
GluIle: 8.416 ± 1.086
GluLys: 8.804 ± 0.938
GluLeu: 7.639 ± 0.824
GluMet: 1.618 ± 0.384
GluAsn: 6.085 ± 0.808
GluPro: 0.842 ± 0.219
GluGln: 2.331 ± 0.436
GluArg: 2.46 ± 0.443
GluSer: 4.014 ± 0.45
GluThr: 3.69 ± 0.481
GluVal: 3.82 ± 0.524
GluTrp: 0.712 ± 0.209
GluTyr: 3.82 ± 0.635
GluXaa: 0.0 ± 0.0
Phe
PheAla: 1.877 ± 0.328
PheCys: 0.647 ± 0.193
PheAsp: 2.719 ± 0.401
PheGlu: 2.719 ± 0.396
PhePhe: 1.877 ± 0.35
PheGly: 2.331 ± 0.298
PheHis: 0.324 ± 0.127
PheIle: 4.532 ± 0.528
PheLys: 5.697 ± 0.698
PheLeu: 3.431 ± 0.494
PheMet: 1.683 ± 0.444
PheAsn: 2.589 ± 0.36
PhePro: 0.777 ± 0.264
PheGln: 0.842 ± 0.282
PheArg: 1.295 ± 0.34
PheSer: 3.302 ± 0.441
PheThr: 2.589 ± 0.383
PheVal: 2.007 ± 0.307
PheTrp: 0.324 ± 0.121
PheTyr: 1.683 ± 0.36
PheXaa: 0.0 ± 0.0
Gly
GlyAla: 2.848 ± 0.625
GlyCys: 0.453 ± 0.213
GlyAsp: 3.107 ± 0.538
GlyGlu: 3.107 ± 0.455
GlyPhe: 2.266 ± 0.393
GlyGly: 4.014 ± 1.81
GlyHis: 0.647 ± 0.184
GlyIle: 3.82 ± 0.716
GlyLys: 4.402 ± 0.526
GlyLeu: 3.82 ± 0.511
GlyMet: 1.165 ± 0.304
GlyAsn: 4.143 ± 0.543
GlyPro: 0.647 ± 0.237
GlyGln: 1.683 ± 0.34
GlyArg: 1.554 ± 0.37
GlySer: 3.561 ± 0.832
GlyThr: 3.755 ± 0.664
GlyVal: 3.625 ± 0.752
GlyTrp: 0.712 ± 0.22
GlyTyr: 2.978 ± 0.691
GlyXaa: 0.0 ± 0.0
His
HisAla: 0.388 ± 0.139
HisCys: 0.129 ± 0.102
HisAsp: 0.712 ± 0.214
HisGlu: 0.388 ± 0.131
HisPhe: 0.583 ± 0.168
HisGly: 0.324 ± 0.136
HisHis: 0.129 ± 0.086
HisIle: 1.036 ± 0.26
HisLys: 1.489 ± 0.366
HisLeu: 0.842 ± 0.196
HisMet: 0.388 ± 0.166
HisAsn: 1.036 ± 0.261
HisPro: 0.194 ± 0.099
HisGln: 0.065 ± 0.054
HisArg: 0.259 ± 0.126
HisSer: 0.583 ± 0.212
HisThr: 0.712 ± 0.313
HisVal: 0.583 ± 0.195
HisTrp: 0.259 ± 0.128
HisTyr: 0.194 ± 0.123
HisXaa: 0.0 ± 0.0
Ile
IleAla: 5.05 ± 0.516
IleCys: 0.971 ± 0.279
IleAsp: 6.215 ± 0.781
IleGlu: 8.869 ± 1.097
IlePhe: 3.755 ± 0.672
IleGly: 4.143 ± 0.56
IleHis: 0.712 ± 0.22
IleIle: 8.481 ± 0.916
IleLys: 9.905 ± 0.786
IleLeu: 8.61 ± 0.738
IleMet: 1.877 ± 0.356
IleAsn: 6.992 ± 0.79
IlePro: 2.525 ± 0.424
IleGln: 3.625 ± 0.599
IleArg: 3.043 ± 0.554
IleSer: 7.574 ± 0.796
IleThr: 5.308 ± 0.546
IleVal: 5.826 ± 0.765
IleTrp: 0.518 ± 0.202
IleTyr: 4.532 ± 0.827
IleXaa: 0.0 ± 0.0
Lys
LysAla: 4.208 ± 0.561
LysCys: 0.518 ± 0.177
LysAsp: 6.927 ± 0.641
LysGlu: 9.84 ± 0.935
LysPhe: 3.496 ± 0.463
LysGly: 4.92 ± 0.567
LysHis: 1.165 ± 0.313
LysIle: 11.005 ± 0.881
LysLys: 10.293 ± 1.093
LysLeu: 8.027 ± 0.679
LysMet: 2.848 ± 0.481
LysAsn: 7.639 ± 0.687
LysPro: 2.395 ± 0.388
LysGln: 2.719 ± 0.424
LysArg: 3.302 ± 0.549
LysSer: 5.438 ± 0.687
LysThr: 6.021 ± 0.664
LysVal: 7.38 ± 0.653
LysTrp: 0.777 ± 0.175
LysTyr: 5.179 ± 0.712
LysXaa: 0.0 ± 0.0
Leu
LeuAla: 4.467 ± 0.849
LeuCys: 0.842 ± 0.192
LeuAsp: 6.15 ± 0.677
LeuGlu: 6.862 ± 0.621
LeuPhe: 4.078 ± 0.435
LeuGly: 3.561 ± 0.473
LeuHis: 0.518 ± 0.142
LeuIle: 8.804 ± 0.767
LeuLys: 9.581 ± 0.834
LeuLeu: 6.992 ± 0.988
LeuMet: 1.877 ± 0.349
LeuAsn: 5.826 ± 0.711
LeuPro: 2.201 ± 0.489
LeuGln: 2.719 ± 0.368
LeuArg: 2.201 ± 0.355
LeuSer: 6.344 ± 0.783
LeuThr: 4.791 ± 0.604
LeuVal: 4.337 ± 0.432
LeuTrp: 0.583 ± 0.199
LeuTyr: 2.978 ± 0.521
LeuXaa: 0.0 ± 0.0
Met
MetAla: 1.101 ± 0.207
MetCys: 0.194 ± 0.106
MetAsp: 1.683 ± 0.358
MetGlu: 1.813 ± 0.387
MetPhe: 1.036 ± 0.338
MetGly: 1.489 ± 0.313
MetHis: 0.194 ± 0.103
MetIle: 1.942 ± 0.422
MetLys: 2.719 ± 0.367
MetLeu: 1.877 ± 0.429
MetMet: 0.259 ± 0.109
MetAsn: 1.554 ± 0.321
MetPro: 0.518 ± 0.211
MetGln: 0.777 ± 0.274
MetArg: 0.906 ± 0.204
MetSer: 1.942 ± 0.375
MetThr: 0.971 ± 0.329
MetVal: 1.036 ± 0.275
MetTrp: 0.0 ± 0.0
MetTyr: 1.23 ± 0.325
MetXaa: 0.0 ± 0.0
Asn
AsnAla: 3.496 ± 0.376
AsnCys: 0.453 ± 0.177
AsnAsp: 4.143 ± 0.479
AsnGlu: 5.567 ± 0.743
AsnPhe: 3.172 ± 0.397
AsnGly: 4.726 ± 0.753
AsnHis: 0.777 ± 0.207
AsnIle: 7.251 ± 0.82
AsnLys: 7.574 ± 0.782
AsnLeu: 7.251 ± 0.576
AsnMet: 1.554 ± 0.374
AsnAsn: 4.596 ± 0.633
AsnPro: 1.424 ± 0.317
AsnGln: 1.748 ± 0.336
AsnArg: 2.848 ± 0.412
AsnSer: 4.532 ± 0.636
AsnThr: 4.014 ± 0.671
AsnVal: 4.337 ± 0.536
AsnTrp: 0.647 ± 0.237
AsnTyr: 2.784 ± 0.508
AsnXaa: 0.0 ± 0.0
Pro
ProAla: 0.971 ± 0.276
ProCys: 0.194 ± 0.113
ProAsp: 0.842 ± 0.244
ProGlu: 1.489 ± 0.409
ProPhe: 0.906 ± 0.213
ProGly: 1.424 ± 0.321
ProHis: 0.194 ± 0.108
ProIle: 2.266 ± 0.413
ProLys: 1.683 ± 0.346
ProLeu: 1.618 ± 0.333
ProMet: 0.712 ± 0.254
ProAsn: 1.101 ± 0.279
ProPro: 0.453 ± 0.21
ProGln: 0.453 ± 0.16
ProArg: 0.906 ± 0.247
ProSer: 2.46 ± 0.395
ProThr: 1.618 ± 0.328
ProVal: 1.165 ± 0.275
ProTrp: 0.065 ± 0.066
ProTyr: 0.583 ± 0.231
ProXaa: 0.0 ± 0.0
Gln
GlnAla: 1.489 ± 0.307
GlnCys: 0.453 ± 0.21
GlnAsp: 1.748 ± 0.309
GlnGlu: 2.978 ± 0.413
GlnPhe: 0.388 ± 0.159
GlnGly: 1.424 ± 0.298
GlnHis: 0.259 ± 0.138
GlnIle: 2.072 ± 0.336
GlnLys: 2.525 ± 0.36
GlnLeu: 2.978 ± 0.449
GlnMet: 0.324 ± 0.147
GlnAsn: 1.942 ± 0.269
GlnPro: 0.583 ± 0.196
GlnGln: 0.842 ± 0.237
GlnArg: 1.101 ± 0.285
GlnSer: 1.618 ± 0.376
GlnThr: 1.489 ± 0.276
GlnVal: 1.489 ± 0.283
GlnTrp: 0.194 ± 0.138
GlnTyr: 0.906 ± 0.204
GlnXaa: 0.0 ± 0.0
Arg
ArgAla: 1.877 ± 0.386
ArgCys: 0.194 ± 0.175
ArgAsp: 1.683 ± 0.364
ArgGlu: 3.107 ± 0.505
ArgPhe: 1.23 ± 0.313
ArgGly: 2.395 ± 0.352
ArgHis: 0.388 ± 0.168
ArgIle: 3.302 ± 0.525
ArgLys: 3.755 ± 0.622
ArgLeu: 2.266 ± 0.367
ArgMet: 0.647 ± 0.22
ArgAsn: 1.942 ± 0.389
ArgPro: 0.453 ± 0.186
ArgGln: 0.906 ± 0.225
ArgArg: 1.036 ± 0.3
ArgSer: 1.877 ± 0.346
ArgThr: 2.072 ± 0.384
ArgVal: 1.877 ± 0.402
ArgTrp: 0.065 ± 0.063
ArgTyr: 1.23 ± 0.239
ArgXaa: 0.0 ± 0.0
Ser
SerAla: 3.172 ± 0.46
SerCys: 0.583 ± 0.203
SerAsp: 3.302 ± 0.439
SerGlu: 4.92 ± 0.484
SerPhe: 2.913 ± 0.427
SerGly: 4.208 ± 0.8
SerHis: 0.971 ± 0.218
SerIle: 7.121 ± 0.667
SerLys: 7.121 ± 0.702
SerLeu: 5.503 ± 0.614
SerMet: 1.813 ± 0.375
SerAsn: 5.891 ± 0.642
SerPro: 1.359 ± 0.256
SerGln: 1.813 ± 0.386
SerArg: 2.719 ± 0.495
SerSer: 5.956 ± 0.924
SerThr: 5.308 ± 0.609
SerVal: 2.784 ± 0.462
SerTrp: 0.518 ± 0.156
SerTyr: 3.431 ± 0.621
SerXaa: 0.0 ± 0.0
Thr
ThrAla: 2.848 ± 0.632
ThrCys: 0.583 ± 0.181
ThrAsp: 4.273 ± 0.651
ThrGlu: 4.014 ± 0.534
ThrPhe: 3.172 ± 0.442
ThrGly: 3.043 ± 0.454
ThrHis: 0.647 ± 0.213
ThrIle: 5.826 ± 0.778
ThrLys: 5.567 ± 0.662
ThrLeu: 6.28 ± 0.507
ThrMet: 1.295 ± 0.259
ThrAsn: 4.402 ± 0.544
ThrPro: 1.942 ± 0.446
ThrGln: 1.813 ± 0.353
ThrArg: 1.748 ± 0.277
ThrSer: 3.69 ± 0.439
ThrThr: 3.561 ± 0.694
ThrVal: 2.978 ± 0.492
ThrTrp: 0.388 ± 0.156
ThrTyr: 2.784 ± 0.511
ThrXaa: 0.0 ± 0.0
Val
ValAla: 2.201 ± 0.52
ValCys: 0.259 ± 0.123
ValAsp: 3.237 ± 0.413
ValGlu: 3.366 ± 0.516
ValPhe: 2.719 ± 0.416
ValGly: 2.913 ± 0.475
ValHis: 0.712 ± 0.195
ValIle: 4.855 ± 0.607
ValLys: 5.567 ± 0.731
ValLeu: 3.69 ± 0.509
ValMet: 1.295 ± 0.342
ValAsn: 4.596 ± 0.538
ValPro: 1.295 ± 0.322
ValGln: 1.101 ± 0.238
ValArg: 1.489 ± 0.33
ValSer: 4.467 ± 0.483
ValThr: 3.496 ± 0.469
ValVal: 2.784 ± 0.448
ValTrp: 0.453 ± 0.15
ValTyr: 2.978 ± 0.391
ValXaa: 0.0 ± 0.0
Trp
TrpAla: 0.129 ± 0.1
TrpCys: 0.0 ± 0.0
TrpAsp: 0.324 ± 0.115
TrpGlu: 0.777 ± 0.25
TrpPhe: 0.518 ± 0.207
TrpGly: 0.259 ± 0.129
TrpHis: 0.0 ± 0.0
TrpIle: 0.842 ± 0.222
TrpLys: 0.388 ± 0.14
TrpLeu: 1.165 ± 0.284
TrpMet: 0.324 ± 0.139
TrpAsn: 0.777 ± 0.21
TrpPro: 0.065 ± 0.054
TrpGln: 0.129 ± 0.096
TrpArg: 0.129 ± 0.073
TrpSer: 0.842 ± 0.204
TrpThr: 0.259 ± 0.105
TrpVal: 0.194 ± 0.105
TrpTrp: 0.0 ± 0.0
TrpTyr: 0.259 ± 0.114
TrpXaa: 0.0 ± 0.0
Tyr
TyrAla: 1.683 ± 0.344
TyrCys: 0.518 ± 0.194
TyrAsp: 2.719 ± 0.399
TyrGlu: 3.302 ± 0.66
TyrPhe: 1.813 ± 0.361
TyrGly: 1.683 ± 0.439
TyrHis: 0.647 ± 0.18
TyrIle: 5.179 ± 0.6
TyrLys: 5.244 ± 0.595
TyrLeu: 3.69 ± 0.514
TyrMet: 0.647 ± 0.222
TyrAsn: 2.978 ± 0.49
TyrPro: 1.165 ± 0.289
TyrGln: 1.359 ± 0.243
TyrArg: 1.359 ± 0.276
TyrSer: 3.172 ± 0.65
TyrThr: 3.366 ± 0.776
TyrVal: 1.813 ± 0.364
TyrTrp: 0.194 ± 0.102
TyrTyr: 2.784 ± 0.372
TyrXaa: 0.0 ± 0.0
Xaa
XaaAla: 0.0 ± 0.0
XaaCys: 0.0 ± 0.0
XaaAsp: 0.0 ± 0.0
XaaGlu: 0.0 ± 0.0
XaaPhe: 0.0 ± 0.0
XaaGly: 0.0 ± 0.0
XaaHis: 0.0 ± 0.0
XaaIle: 0.0 ± 0.0
XaaLys: 0.0 ± 0.0
XaaLeu: 0.0 ± 0.0
XaaMet: 0.0 ± 0.0
XaaAsn: 0.0 ± 0.0
XaaPro: 0.0 ± 0.0
XaaGln: 0.0 ± 0.0
XaaArg: 0.0 ± 0.0
XaaSer: 0.0 ± 0.0
XaaThr: 0.0 ± 0.0
XaaVal: 0.0 ± 0.0
XaaTrp: 0.0 ± 0.0
XaaTyr: 0.0 ± 0.0
XaaXaa: 0.0 ± 0.0
Statistics based on 74 proteins (15448 amino acids)

Note: The error was estimated by bootstrapping (×100) at the protein level.
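A protein-level bootstrap of this kind can be sketched as follows. This is a minimal illustration under stated assumptions, not the database's implementation: the function names are hypothetical, the error is taken to be the standard deviation of the resampled estimates, and the frequency is normalised by the number of dipeptides in each resample.

```python
import random
import statistics


def pair_permille(proteins, pair):
    """Frequency of one dipeptide (e.g. "KK") in per mille.

    Assumption: normalised by the total number of dipeptides counted.
    """
    total = sum(max(len(seq) - 1, 0) for seq in proteins)
    if total == 0:
        return 0.0
    hits = sum(
        1
        for seq in proteins
        for i in range(len(seq) - 1)
        if seq[i:i + 2] == pair
    )
    return 1000.0 * hits / total


def bootstrap_error(proteins, pair, n_resamples=100, seed=0):
    """Estimate the spread of a dipeptide frequency by resampling proteins.

    Whole proteins are drawn with replacement (protein-level bootstrap);
    the returned error is the standard deviation of the resampled
    frequencies, which is an assumption about what "error" means here.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    estimates = []
    for _ in range(n_resamples):
        sample = rng.choices(proteins, k=len(proteins))
        estimates.append(pair_permille(sample, pair))
    return statistics.stdev(estimates)
```

A dipeptide that never occurs has the same frequency (zero) in every resample, so its estimated error is also zero, which is consistent with the `0.0 ± 0.0` entries in the table.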

The dipeptide statistics above (among other statistics for this proteome) can be downloaded from this CSV file
See this proteome in: uniprot_link
Proteome-pI is available under the Creative Commons Attribution-NoDerivs license; for more details see here

Reference: Kozlowski LP. Proteome-pI 2.0: Proteome Isoelectric Point Database Update. Nucleic Acids Res. 2021, doi: 10.1093/nar/gkab944
Contact: Lukasz P. Kozlowski