Amino acid dipeptide frequency for Microbacterium phage Smarties

Apart from single amino acid frequencies, one can also calculate the so-called amino acid dipeptide frequency.

There are 400 possible dipeptides (441 if Xaa is counted as an additional 21st amino acid representing all non-standard or unknown amino acids). Thus, if all 400 dipeptides were equally likely in proteins, each would occur with a frequency of 0.25% (2.5‰). As this is not the case in nature, for better visibility all dipeptides that occur more often than expected are marked in red, and those that are underrepresented are marked in blue in the table.
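
The counting described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: overlapping dipeptides are counted within each protein (not across protein boundaries), any letter outside the 20 standard amino acids is mapped to X (Xaa), and "expected" means the uniform 1/400 value. The function names are hypothetical, and the exact procedure used by the database is not described on this page.

```python
from collections import Counter

STANDARD = "ACDEFGHIKLMNPQRSTVWY"      # one-letter codes of the 20 standard amino acids
EXPECTED_PERMILLE = 1000.0 / 400       # uniform expectation: 0.25% = 2.5 per mille


def dipeptide_permille(proteins):
    """Return {dipeptide: frequency in per mille} for a list of protein sequences.

    Assumption: overlapping dipeptides are counted inside each protein only.
    """
    counts = Counter()
    for seq in proteins:
        # Map non-standard or unknown residues to X (Xaa).
        seq = "".join(aa if aa in STANDARD else "X" for aa in seq.upper())
        counts.update(seq[i:i + 2] for i in range(len(seq) - 1))
    total = sum(counts.values())
    if total == 0:
        return {}
    return {pair: 1000.0 * n / total for pair, n in counts.items()}


def classify(freq_permille):
    """Label a frequency relative to the uniform expectation (hypothetical helper)."""
    if freq_permille > EXPECTED_PERMILLE:
        return "over"       # would be marked in red
    if freq_permille < EXPECTED_PERMILLE:
        return "under"      # would be marked in blue
    return "expected"


if __name__ == "__main__":
    freqs = dipeptide_permille(["AAAC", "ACA"])
    for pair, value in sorted(freqs.items()):
        print(pair, round(value, 3), classify(value))
```

With the values from the table, AlaAla (17.618‰) would be classified as overrepresented and AlaCys (0.866‰) as underrepresented relative to 2.5‰.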

All values are given in per mille (‰) and therefore need to be multiplied by 10^-3 to obtain fractions.
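
As a small worked example of this unit conversion, the sketch below turns a per mille value into a fraction and a percentage. The helper names are hypothetical and exist only for illustration; the AlaAla value is taken from the table.

```python
def permille_to_fraction(value_permille):
    """Convert a per mille value to a plain fraction (multiply by 10^-3)."""
    return value_permille * 1e-3


def permille_to_percent(value_permille):
    """Convert a per mille value to percent (1 percent = 10 per mille)."""
    return value_permille / 10.0


if __name__ == "__main__":
    ala_ala = 17.618  # AlaAla frequency in per mille, from the table
    print(permille_to_fraction(ala_ala))  # fraction of all dipeptides
    print(permille_to_percent(ala_ala))   # same value in percent
    print(permille_to_percent(2.5))       # uniform expectation in percent
```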

For more information, see the sequence space article on Wikipedia.

Ala Cys Asp Glu Phe Gly His Ile Lys Leu Met Asn Pro Gln Arg Ser Thr Val Trp Tyr Xaa
Ala
AlaAla: 17.618 ± 1.653
AlaCys: 0.866 ± 0.244
AlaAsp: 8.029 ± 0.832
AlaGlu: 7.278 ± 0.747
AlaPhe: 3.524 ± 0.465
AlaGly: 9.473 ± 0.895
AlaHis: 2.484 ± 0.454
AlaIle: 5.372 ± 0.614
AlaLys: 3.928 ± 0.507
AlaLeu: 10.744 ± 1.339
AlaMet: 3.235 ± 0.453
AlaAsn: 2.946 ± 0.554
AlaPro: 5.776 ± 0.568
AlaGln: 4.274 ± 0.427
AlaArg: 8.202 ± 0.866
AlaSer: 6.412 ± 0.621
AlaThr: 7.394 ± 0.76
AlaVal: 6.585 ± 0.857
AlaTrp: 2.599 ± 0.442
AlaTyr: 2.311 ± 0.373
AlaXaa: 0.0 ± 0.0
Cys
CysAla: 0.693 ± 0.232
CysCys: 0.289 ± 0.146
CysAsp: 0.982 ± 0.25
CysGlu: 0.809 ± 0.208
CysPhe: 0.173 ± 0.079
CysGly: 1.386 ± 0.271
CysHis: 0.231 ± 0.122
CysIle: 0.289 ± 0.114
CysLys: 0.058 ± 0.064
CysLeu: 0.52 ± 0.155
CysMet: 0.116 ± 0.089
CysAsn: 0.289 ± 0.125
CysPro: 1.155 ± 0.244
CysGln: 0.173 ± 0.088
CysArg: 0.52 ± 0.179
CysSer: 0.347 ± 0.14
CysThr: 0.462 ± 0.154
CysVal: 0.866 ± 0.223
CysTrp: 0.116 ± 0.072
CysTyr: 0.231 ± 0.107
CysXaa: 0.0 ± 0.0
Asp
AspAla: 8.087 ± 0.818
AspCys: 0.693 ± 0.252
AspAsp: 5.314 ± 0.775
AspGlu: 4.91 ± 0.802
AspPhe: 1.617 ± 0.324
AspGly: 5.661 ± 0.56
AspHis: 1.444 ± 0.348
AspIle: 2.368 ± 0.362
AspLys: 1.098 ± 0.246
AspLeu: 5.719 ± 0.612
AspMet: 1.56 ± 0.315
AspAsn: 2.311 ± 0.328
AspPro: 3.755 ± 0.508
AspGln: 2.311 ± 0.352
AspArg: 4.794 ± 0.694
AspSer: 3.119 ± 0.486
AspThr: 3.235 ± 0.388
AspVal: 4.39 ± 0.563
AspTrp: 1.617 ± 0.301
AspTyr: 1.617 ± 0.304
AspXaa: 0.0 ± 0.0
Glu
GluAla: 8.202 ± 0.855
GluCys: 0.404 ± 0.156
GluAsp: 4.159 ± 0.546
GluGlu: 4.794 ± 0.553
GluPhe: 1.444 ± 0.296
GluGly: 4.968 ± 0.626
GluHis: 1.964 ± 0.397
GluIle: 1.271 ± 0.26
GluLys: 1.733 ± 0.304
GluLeu: 3.235 ± 0.49
GluMet: 1.733 ± 0.29
GluAsn: 1.791 ± 0.302
GluPro: 5.314 ± 0.769
GluGln: 2.368 ± 0.38
GluArg: 5.141 ± 0.645
GluSer: 3.177 ± 0.36
GluThr: 4.159 ± 0.511
GluVal: 5.488 ± 0.689
GluTrp: 1.733 ± 0.311
GluTyr: 1.791 ± 0.466
GluXaa: 0.0 ± 0.0
Phe
PheAla: 2.253 ± 0.381
PheCys: 0.289 ± 0.155
PheAsp: 2.195 ± 0.403
PheGlu: 1.964 ± 0.349
PhePhe: 0.578 ± 0.194
PheGly: 2.368 ± 0.402
PheHis: 0.404 ± 0.133
PheIle: 1.213 ± 0.25
PheLys: 0.635 ± 0.201
PheLeu: 1.964 ± 0.343
PheMet: 0.462 ± 0.131
PheAsn: 0.52 ± 0.19
PhePro: 0.982 ± 0.209
PheGln: 0.347 ± 0.186
PheArg: 1.675 ± 0.288
PheSer: 1.155 ± 0.335
PheThr: 2.426 ± 0.333
PheVal: 1.848 ± 0.342
PheTrp: 0.462 ± 0.205
PheTyr: 0.52 ± 0.176
PheXaa: 0.0 ± 0.0
Gly
GlyAla: 8.029 ± 1.171
GlyCys: 0.693 ± 0.216
GlyAsp: 5.141 ± 0.623
GlyGlu: 5.43 ± 0.551
GlyPhe: 2.715 ± 0.407
GlyGly: 8.549 ± 0.772
GlyHis: 1.906 ± 0.432
GlyIle: 4.159 ± 0.572
GlyLys: 2.599 ± 0.443
GlyLeu: 7.163 ± 0.747
GlyMet: 2.715 ± 0.452
GlyAsn: 2.195 ± 0.435
GlyPro: 3.639 ± 0.37
GlyGln: 3.524 ± 0.424
GlyArg: 4.968 ± 0.654
GlySer: 5.892 ± 0.657
GlyThr: 6.412 ± 0.681
GlyVal: 6.412 ± 0.582
GlyTrp: 2.311 ± 0.31
GlyTyr: 3.235 ± 0.446
GlyXaa: 0.0 ± 0.0
His
HisAla: 1.906 ± 0.3
HisCys: 0.231 ± 0.124
HisAsp: 1.56 ± 0.331
HisGlu: 1.386 ± 0.299
HisPhe: 0.52 ± 0.167
HisGly: 1.329 ± 0.308
HisHis: 0.462 ± 0.165
HisIle: 0.578 ± 0.177
HisLys: 0.462 ± 0.162
HisLeu: 2.137 ± 0.343
HisMet: 0.289 ± 0.126
HisAsn: 0.173 ± 0.094
HisPro: 2.253 ± 0.354
HisGln: 0.231 ± 0.136
HisArg: 1.56 ± 0.34
HisSer: 1.098 ± 0.238
HisThr: 1.213 ± 0.279
HisVal: 1.329 ± 0.354
HisTrp: 0.347 ± 0.123
HisTyr: 0.462 ± 0.167
HisXaa: 0.0 ± 0.0
Ile
IleAla: 4.506 ± 0.554
IleCys: 0.289 ± 0.181
IleAsp: 3.755 ± 0.457
IleGlu: 4.043 ± 0.513
IlePhe: 0.693 ± 0.175
IleGly: 3.524 ± 0.42
IleHis: 0.635 ± 0.183
IleIle: 2.311 ± 0.442
IleLys: 0.982 ± 0.222
IleLeu: 2.599 ± 0.319
IleMet: 0.866 ± 0.249
IleAsn: 1.213 ± 0.297
IlePro: 3.581 ± 0.683
IleGln: 1.271 ± 0.298
IleArg: 2.195 ± 0.414
IleSer: 2.137 ± 0.51
IleThr: 4.91 ± 0.63
IleVal: 3.812 ± 0.5
IleTrp: 0.693 ± 0.199
IleTyr: 0.809 ± 0.197
IleXaa: 0.0 ± 0.0
Lys
LysAla: 3.466 ± 0.572
LysCys: 0.289 ± 0.128
LysAsp: 1.213 ± 0.288
LysGlu: 0.347 ± 0.135
LysPhe: 0.866 ± 0.236
LysGly: 2.484 ± 0.301
LysHis: 0.751 ± 0.214
LysIle: 0.693 ± 0.229
LysLys: 0.52 ± 0.161
LysLeu: 0.866 ± 0.271
LysMet: 0.347 ± 0.118
LysAsn: 0.289 ± 0.111
LysPro: 1.848 ± 0.313
LysGln: 0.809 ± 0.209
LysArg: 2.599 ± 0.47
LysSer: 1.733 ± 0.312
LysThr: 0.924 ± 0.302
LysVal: 3.119 ± 0.464
LysTrp: 0.347 ± 0.135
LysTyr: 0.693 ± 0.176
LysXaa: 0.0 ± 0.0
Leu
LeuAla: 9.011 ± 0.7
LeuCys: 0.347 ± 0.13
LeuAsp: 4.852 ± 0.627
LeuGlu: 4.39 ± 0.583
LeuPhe: 1.675 ± 0.418
LeuGly: 7.451 ± 0.989
LeuHis: 1.155 ± 0.242
LeuIle: 4.506 ± 0.799
LeuLys: 0.924 ± 0.238
LeuLeu: 5.834 ± 0.67
LeuMet: 1.502 ± 0.312
LeuAsn: 1.502 ± 0.31
LeuPro: 4.563 ± 0.468
LeuGln: 2.195 ± 0.325
LeuArg: 5.314 ± 0.576
LeuSer: 5.141 ± 0.696
LeuThr: 6.989 ± 0.622
LeuVal: 6.181 ± 0.566
LeuTrp: 1.098 ± 0.242
LeuTyr: 1.386 ± 0.334
LeuXaa: 0.0 ± 0.0
Met
MetAla: 3.35 ± 0.457
MetCys: 0.173 ± 0.11
MetAsp: 1.617 ± 0.306
MetGlu: 0.578 ± 0.182
MetPhe: 0.462 ± 0.181
MetGly: 1.906 ± 0.377
MetHis: 0.404 ± 0.149
MetIle: 1.56 ± 0.368
MetLys: 0.462 ± 0.147
MetLeu: 1.56 ± 0.274
MetMet: 0.462 ± 0.134
MetAsn: 0.693 ± 0.153
MetPro: 1.213 ± 0.305
MetGln: 0.809 ± 0.228
MetArg: 1.386 ± 0.273
MetSer: 3.35 ± 0.44
MetThr: 2.599 ± 0.375
MetVal: 1.444 ± 0.297
MetTrp: 0.289 ± 0.101
MetTyr: 0.751 ± 0.18
MetXaa: 0.0 ± 0.0
Asn
AsnAla: 3.697 ± 0.564
AsnCys: 0.347 ± 0.135
AsnAsp: 1.155 ± 0.275
AsnGlu: 1.155 ± 0.297
AsnPhe: 0.347 ± 0.173
AsnGly: 3.986 ± 0.649
AsnHis: 0.347 ± 0.145
AsnIle: 1.098 ± 0.251
AsnLys: 0.289 ± 0.132
AsnLeu: 2.253 ± 0.302
AsnMet: 0.404 ± 0.145
AsnAsn: 0.289 ± 0.15
AsnPro: 2.426 ± 0.504
AsnGln: 0.751 ± 0.193
AsnArg: 2.137 ± 0.384
AsnSer: 1.733 ± 0.335
AsnThr: 1.733 ± 0.299
AsnVal: 1.733 ± 0.383
AsnTrp: 0.347 ± 0.132
AsnTyr: 0.578 ± 0.157
AsnXaa: 0.0 ± 0.0
Pro
ProAla: 7.278 ± 0.726
ProCys: 0.866 ± 0.227
ProAsp: 4.679 ± 0.71
ProGlu: 5.256 ± 0.678
ProPhe: 1.906 ± 0.313
ProGly: 5.719 ± 0.544
ProHis: 0.982 ± 0.223
ProIle: 2.195 ± 0.313
ProLys: 1.617 ± 0.338
ProLeu: 3.639 ± 0.501
ProMet: 1.329 ± 0.285
ProAsn: 2.599 ± 0.431
ProPro: 4.043 ± 0.48
ProGln: 2.195 ± 0.684
ProArg: 3.061 ± 0.536
ProSer: 3.177 ± 0.406
ProThr: 4.968 ± 0.648
ProVal: 5.025 ± 0.662
ProTrp: 0.866 ± 0.193
ProTyr: 1.444 ± 0.282
ProXaa: 0.0 ± 0.0
Gln
GlnAla: 3.986 ± 0.499
GlnCys: 0.462 ± 0.196
GlnAsp: 1.617 ± 0.3
GlnGlu: 1.617 ± 0.331
GlnPhe: 1.213 ± 0.292
GlnGly: 2.195 ± 0.42
GlnHis: 0.982 ± 0.276
GlnIle: 1.848 ± 0.361
GlnLys: 0.751 ± 0.216
GlnLeu: 1.098 ± 0.4
GlnMet: 1.386 ± 0.287
GlnAsn: 1.386 ± 0.406
GlnPro: 2.599 ± 0.413
GlnGln: 1.444 ± 0.253
GlnArg: 2.773 ± 0.514
GlnSer: 1.906 ± 0.473
GlnThr: 2.195 ± 0.401
GlnVal: 2.311 ± 0.446
GlnTrp: 0.809 ± 0.243
GlnTyr: 0.751 ± 0.217
GlnXaa: 0.0 ± 0.0
Arg
ArgAla: 7.914 ± 0.832
ArgCys: 0.924 ± 0.226
ArgAsp: 4.39 ± 0.691
ArgGlu: 4.563 ± 0.52
ArgPhe: 1.56 ± 0.267
ArgGly: 5.892 ± 0.536
ArgHis: 1.386 ± 0.323
ArgIle: 3.004 ± 0.473
ArgLys: 1.791 ± 0.345
ArgLeu: 6.701 ± 0.615
ArgMet: 2.195 ± 0.406
ArgAsn: 1.56 ± 0.271
ArgPro: 3.581 ± 0.522
ArgGln: 2.715 ± 0.431
ArgArg: 7.22 ± 0.948
ArgSer: 2.83 ± 0.427
ArgThr: 2.888 ± 0.455
ArgVal: 5.256 ± 0.474
ArgTrp: 1.791 ± 0.376
ArgTyr: 2.195 ± 0.301
ArgXaa: 0.0 ± 0.0
Ser
SerAla: 7.278 ± 1.001
SerCys: 0.404 ± 0.146
SerAsp: 3.928 ± 0.418
SerGlu: 3.87 ± 0.56
SerPhe: 0.982 ± 0.25
SerGly: 5.083 ± 0.747
SerHis: 0.866 ± 0.204
SerIle: 2.888 ± 0.48
SerLys: 1.098 ± 0.239
SerLeu: 4.563 ± 0.571
SerMet: 2.079 ± 0.336
SerAsn: 1.56 ± 0.312
SerPro: 3.466 ± 0.413
SerGln: 2.022 ± 0.411
SerArg: 3.061 ± 0.394
SerSer: 2.542 ± 0.426
SerThr: 4.852 ± 0.622
SerVal: 3.466 ± 0.427
SerTrp: 1.155 ± 0.308
SerTyr: 1.444 ± 0.244
SerXaa: 0.0 ± 0.0
Thr
ThrAla: 8.665 ± 1.071
ThrCys: 0.751 ± 0.254
ThrAsp: 4.448 ± 0.551
ThrGlu: 3.928 ± 0.595
ThrPhe: 1.271 ± 0.252
ThrGly: 6.412 ± 0.726
ThrHis: 1.155 ± 0.267
ThrIle: 3.639 ± 0.555
ThrLys: 1.791 ± 0.396
ThrLeu: 6.585 ± 0.63
ThrMet: 1.502 ± 0.295
ThrAsn: 2.079 ± 0.372
ThrPro: 5.488 ± 0.631
ThrGln: 1.848 ± 0.383
ThrArg: 4.852 ± 0.631
ThrSer: 3.697 ± 0.513
ThrThr: 4.39 ± 0.581
ThrVal: 5.372 ± 0.59
ThrTrp: 0.982 ± 0.224
ThrTyr: 1.502 ± 0.286
ThrXaa: 0.0 ± 0.0
Val
ValAla: 8.376 ± 0.76
ValCys: 0.809 ± 0.227
ValAsp: 3.639 ± 0.389
ValGlu: 5.199 ± 0.577
ValPhe: 1.56 ± 0.26
ValGly: 5.719 ± 0.703
ValHis: 1.329 ± 0.319
ValIle: 4.217 ± 0.51
ValLys: 2.311 ± 0.346
ValLeu: 5.199 ± 0.437
ValMet: 1.791 ± 0.307
ValAsn: 2.137 ± 0.356
ValPro: 5.141 ± 0.605
ValGln: 2.022 ± 0.339
ValArg: 5.314 ± 0.611
ValSer: 4.506 ± 0.536
ValThr: 6.123 ± 0.462
ValVal: 5.43 ± 0.712
ValTrp: 1.733 ± 0.27
ValTyr: 1.271 ± 0.233
ValXaa: 0.0 ± 0.0
Trp
TrpAla: 2.137 ± 0.381
TrpCys: 0.347 ± 0.13
TrpAsp: 1.098 ± 0.259
TrpGlu: 1.617 ± 0.305
TrpPhe: 0.751 ± 0.207
TrpGly: 1.386 ± 0.293
TrpHis: 0.462 ± 0.148
TrpIle: 0.866 ± 0.281
TrpLys: 0.693 ± 0.234
TrpLeu: 1.733 ± 0.315
TrpMet: 0.289 ± 0.141
TrpAsn: 0.866 ± 0.226
TrpPro: 0.866 ± 0.271
TrpGln: 1.04 ± 0.218
TrpArg: 1.386 ± 0.272
TrpSer: 1.271 ± 0.329
TrpThr: 0.924 ± 0.23
TrpVal: 1.444 ± 0.313
TrpTrp: 0.866 ± 0.26
TrpTyr: 0.693 ± 0.179
TrpXaa: 0.0 ± 0.0
Tyr
TyrAla: 2.599 ± 0.363
TyrCys: 0.231 ± 0.11
TyrAsp: 1.848 ± 0.339
TyrGlu: 1.733 ± 0.395
TyrPhe: 0.404 ± 0.156
TyrGly: 2.195 ± 0.405
TyrHis: 0.173 ± 0.127
TyrIle: 0.751 ± 0.168
TyrLys: 0.462 ± 0.156
TyrLeu: 1.906 ± 0.368
TyrMet: 0.751 ± 0.231
TyrAsn: 0.404 ± 0.176
TyrPro: 1.213 ± 0.323
TyrGln: 0.982 ± 0.255
TyrArg: 2.195 ± 0.353
TyrSer: 1.386 ± 0.283
TyrThr: 1.56 ± 0.269
TyrVal: 2.311 ± 0.396
TyrTrp: 0.578 ± 0.163
TyrTyr: 0.635 ± 0.187
TyrXaa: 0.0 ± 0.0
Xaa
XaaAla: 0.0 ± 0.0
XaaCys: 0.0 ± 0.0
XaaAsp: 0.0 ± 0.0
XaaGlu: 0.0 ± 0.0
XaaPhe: 0.0 ± 0.0
XaaGly: 0.0 ± 0.0
XaaHis: 0.0 ± 0.0
XaaIle: 0.0 ± 0.0
XaaLys: 0.0 ± 0.0
XaaLeu: 0.0 ± 0.0
XaaMet: 0.0 ± 0.0
XaaAsn: 0.0 ± 0.0
XaaPro: 0.0 ± 0.0
XaaGln: 0.0 ± 0.0
XaaArg: 0.0 ± 0.0
XaaSer: 0.0 ± 0.0
XaaThr: 0.0 ± 0.0
XaaVal: 0.0 ± 0.0
XaaTrp: 0.0 ± 0.0
XaaTyr: 0.0 ± 0.0
XaaXaa: 0.0 ± 0.0
Statistics based on 100 proteins (17313 amino acids)

Note: The error was estimated by bootstrapping (×100) at the protein level.
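
A protein-level bootstrap of this kind can be sketched as follows. The page does not describe the exact procedure, so this is one plausible reading rather than the database's implementation: whole proteins are resampled with replacement, the dipeptide frequency is recomputed for each of 100 replicates, and the standard deviation across replicates is reported as the error. The function names, the default seed, and the choice of the sample standard deviation are all assumptions.

```python
import random
import statistics
from collections import Counter


def dipeptide_permille(proteins):
    """Frequencies (per mille) of overlapping dipeptides counted within each protein."""
    counts = Counter()
    for seq in proteins:
        counts.update(seq[i:i + 2] for i in range(len(seq) - 1))
    total = sum(counts.values())
    if total == 0:
        return {}
    return {pair: 1000.0 * n / total for pair, n in counts.items()}


def bootstrap_error(proteins, pair, replicates=100, seed=0):
    """Return (mean, standard deviation) of one dipeptide frequency over bootstrap replicates.

    Assumption: each replicate draws len(proteins) whole proteins with replacement.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    values = []
    for _ in range(replicates):
        sample = rng.choices(proteins, k=len(proteins))
        values.append(dipeptide_permille(sample).get(pair, 0.0))
    return statistics.mean(values), statistics.stdev(values)


if __name__ == "__main__":
    toy_proteome = ["MAAALK", "MKAAG", "MGLLA", "MAAV"]  # made-up sequences
    mean, err = bootstrap_error(toy_proteome, "AA")
    print(f"AA: {mean:.3f} ± {err:.3f}")
```

Resampling whole proteins rather than individual residues keeps each sequence intact, so the spread reflects how much the frequency depends on which proteins happen to be in the proteome.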

The dipeptide statistics above (among other statistics for this proteome) can be downloaded as a CSV file.
This proteome is also available in UniProt.
Proteome-pI is available under the Creative Commons Attribution-NoDerivs license.

Reference: Kozlowski LP. Proteome-pI 2.0: Proteome Isoelectric Point Database Update. Nucleic Acids Res. 2021, doi: 10.1093/nar/gkab944
Contact: Lukasz P. Kozlowski