Amino acid dipeptide frequency for Shigella phage Sf21

Apart from single amino acid frequencies, one can also calculate the so-called amino acid dipeptide frequency, i.e. how often each ordered pair of adjacent amino acids occurs.
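As an illustration of the idea, here is a minimal Python sketch that counts dipeptides in a set of protein sequences and reports them in per mille. It is not the code used to build this page: the function name `dipeptide_per_mille`, the one-letter sequence input, the use of overlapping adjacent pairs, and the normalisation by the total number of counted pairs are all assumptions made for the example.

```python
from collections import Counter


def dipeptide_per_mille(proteins):
    """Return {dipeptide: frequency in per mille} for a list of sequences.

    Assumptions (not taken from the page): sequences use one-letter codes,
    pairs overlap (positions i and i+1), and the result is normalised by the
    total number of pairs counted.
    """
    counts = Counter()
    for seq in proteins:
        # Each protein is scanned separately, so no pair spans two proteins.
        for i in range(len(seq) - 1):
            counts[seq[i:i + 2]] += 1
    total = sum(counts.values())
    if total == 0:
        return {}
    return {pair: 1000.0 * n / total for pair, n in counts.items()}


# Hypothetical toy input, not real Sf21 proteins.
toy_proteins = ["MKKLA", "AKK"]
toy_freqs = dipeptide_per_mille(toy_proteins)
```

With this normalisation the frequencies of all observed pairs add up to 1000‰; a different denominator would rescale every value by the same factor.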

There are 400 possible dipeptides (441 if Xaa is counted as an additional 21st amino acid representing all non-standard or unknown amino acids). Thus, if dipeptides occurred uniformly at random in proteins, each of the 400 dipeptides would be present at a frequency of 0.25% (2.5‰). As this is not the case in nature, for better visibility dipeptides that are more frequent than expected are marked in red in the table, and those that are underrepresented are marked in blue.
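The comparison against the uniform expectation can be sketched in a few lines of Python. This is only an illustration: it assumes the colouring compares each value with the 1/400 baseline and ignores the error estimate, which the page does not state explicitly. The names `classify` and `expected_per_mille` are hypothetical; the four sample values are copied from the table.

```python
N_STANDARD = 20  # standard amino acids; use 21 to include Xaa

# Uniform expectation in per mille: 1000 / 400 = 2.5
expected_per_mille = 1000.0 / N_STANDARD ** 2


def classify(observed_per_mille, expected=expected_per_mille):
    """Label a dipeptide frequency relative to the uniform baseline."""
    if observed_per_mille > expected:
        return "over"   # assumed to correspond to red in the table
    if observed_per_mille < expected:
        return "under"  # assumed to correspond to blue in the table
    return "as expected"


# Values (per mille) taken from the table on this page.
sample = {"GluLeu": 6.947, "AlaAla": 4.753, "CysCys": 0.192, "TrpTrp": 0.192}
labels = {pair: classify(value) for pair, value in sample.items()}
```

With 21 symbols the baseline drops to 1000/441, roughly 2.27‰, so borderline values could change class depending on which convention is used.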

All values are given in per mille (‰) and therefore need to be multiplied by 10^-3 to obtain fractions; for example, 4.753‰ corresponds to 0.004753 (0.4753%).

For more information, see the sequence space article on Wikipedia.

Ala Cys Asp Glu Phe Gly His Ile Lys Leu Met Asn Pro Gln Arg Ser Thr Val Trp Tyr Xaa
Ala
AlaAla: 4.753 ± 0.319
AlaCys: 0.481 ± 0.088
AlaAsp: 3.31 ± 0.278
AlaGlu: 5.138 ± 0.354
AlaPhe: 2.367 ± 0.191
AlaGly: 4.099 ± 0.355
AlaHis: 1.135 ± 0.172
AlaIle: 4.503 ± 0.265
AlaLys: 4.753 ± 0.34
AlaLeu: 5.561 ± 0.342
AlaMet: 1.328 ± 0.158
AlaAsn: 3.194 ± 0.308
AlaPro: 2.174 ± 0.218
AlaGln: 2.829 ± 0.234
AlaArg: 2.848 ± 0.229
AlaSer: 4.099 ± 0.296
AlaThr: 3.002 ± 0.421
AlaVal: 4.657 ± 0.309
AlaTrp: 0.981 ± 0.144
AlaTyr: 2.405 ± 0.207
AlaXaa: 0.0 ± 0.0
Cys
CysAla: 0.597 ± 0.096
CysCys: 0.192 ± 0.071
CysAsp: 0.808 ± 0.129
CysGlu: 0.866 ± 0.119
CysPhe: 0.366 ± 0.091
CysGly: 0.885 ± 0.146
CysHis: 0.269 ± 0.07
CysIle: 0.654 ± 0.108
CysLys: 0.673 ± 0.127
CysLeu: 0.789 ± 0.109
CysMet: 0.289 ± 0.078
CysAsn: 0.539 ± 0.105
CysPro: 0.52 ± 0.098
CysGln: 0.385 ± 0.099
CysArg: 0.558 ± 0.088
CysSer: 0.731 ± 0.145
CysThr: 0.539 ± 0.11
CysVal: 0.75 ± 0.112
CysTrp: 0.154 ± 0.054
CysTyr: 0.481 ± 0.099
CysXaa: 0.0 ± 0.0
Asp
AspAla: 3.868 ± 0.277
AspCys: 0.597 ± 0.102
AspAsp: 4.022 ± 0.288
AspGlu: 4.849 ± 0.356
AspPhe: 3.464 ± 0.249
AspGly: 4.407 ± 0.324
AspHis: 0.885 ± 0.117
AspIle: 5.022 ± 0.285
AspLys: 4.945 ± 0.312
AspLeu: 4.849 ± 0.365
AspMet: 1.982 ± 0.202
AspAsn: 2.713 ± 0.181
AspPro: 2.04 ± 0.249
AspGln: 1.328 ± 0.152
AspArg: 2.136 ± 0.192
AspSer: 4.156 ± 0.278
AspThr: 3.04 ± 0.219
AspVal: 4.195 ± 0.301
AspTrp: 1.232 ± 0.151
AspTyr: 3.483 ± 0.272
AspXaa: 0.0 ± 0.0
Glu
GluAla: 4.657 ± 0.323
GluCys: 0.866 ± 0.137
GluAsp: 4.33 ± 0.347
GluGlu: 5.176 ± 0.336
GluPhe: 3.194 ± 0.271
GluGly: 3.772 ± 0.278
GluHis: 1.193 ± 0.162
GluIle: 6.312 ± 0.362
GluLys: 5.215 ± 0.371
GluLeu: 6.947 ± 0.359
GluMet: 2.136 ± 0.194
GluAsn: 4.099 ± 0.277
GluPro: 1.559 ± 0.186
GluGln: 2.598 ± 0.243
GluArg: 2.886 ± 0.211
GluSer: 4.407 ± 0.327
GluThr: 4.176 ± 0.348
GluVal: 4.984 ± 0.298
GluTrp: 1.251 ± 0.148
GluTyr: 3.791 ± 0.288
GluXaa: 0.0 ± 0.0
Phe
PheAla: 2.386 ± 0.188
PheCys: 0.481 ± 0.1
PheAsp: 3.252 ± 0.264
PheGlu: 3.579 ± 0.279
PhePhe: 1.52 ± 0.168
PheGly: 2.906 ± 0.198
PheHis: 0.616 ± 0.124
PheIle: 3.541 ± 0.213
PheLys: 4.137 ± 0.295
PheLeu: 2.348 ± 0.203
PheMet: 1.424 ± 0.175
PheAsn: 3.06 ± 0.259
PhePro: 1.135 ± 0.133
PheGln: 1.366 ± 0.168
PheArg: 1.77 ± 0.168
PheSer: 3.329 ± 0.242
PheThr: 2.906 ± 0.207
PheVal: 2.694 ± 0.217
PheTrp: 0.558 ± 0.103
PheTyr: 2.04 ± 0.198
PheXaa: 0.0 ± 0.0
Gly
GlyAla: 3.002 ± 0.301
GlyCys: 0.673 ± 0.132
GlyAsp: 4.06 ± 0.279
GlyGlu: 3.906 ± 0.268
GlyPhe: 2.79 ± 0.229
GlyGly: 3.637 ± 0.615
GlyHis: 0.962 ± 0.126
GlyIle: 4.464 ± 0.308
GlyLys: 4.33 ± 0.312
GlyLeu: 4.58 ± 0.352
GlyMet: 2.04 ± 0.229
GlyAsn: 3.156 ± 0.343
GlyPro: 1.732 ± 0.192
GlyGln: 2.271 ± 0.26
GlyArg: 2.251 ± 0.224
GlySer: 4.022 ± 0.294
GlyThr: 3.945 ± 0.355
GlyVal: 4.156 ± 0.29
GlyTrp: 0.866 ± 0.135
GlyTyr: 2.925 ± 0.226
GlyXaa: 0.0 ± 0.0
His
HisAla: 0.904 ± 0.12
HisCys: 0.269 ± 0.087
HisAsp: 0.808 ± 0.13
HisGlu: 1.232 ± 0.149
HisPhe: 0.827 ± 0.13
HisGly: 0.943 ± 0.138
HisHis: 0.346 ± 0.085
HisIle: 1.328 ± 0.168
HisLys: 1.308 ± 0.176
HisLeu: 1.366 ± 0.151
HisMet: 0.423 ± 0.087
HisAsn: 0.635 ± 0.104
HisPro: 0.866 ± 0.143
HisGln: 0.52 ± 0.103
HisArg: 0.808 ± 0.12
HisSer: 1.212 ± 0.141
HisThr: 0.943 ± 0.158
HisVal: 1.039 ± 0.138
HisTrp: 0.346 ± 0.079
HisTyr: 0.597 ± 0.1
HisXaa: 0.0 ± 0.0
Ile
IleAla: 5.157 ± 0.29
IleCys: 0.75 ± 0.122
IleAsp: 5.407 ± 0.363
IleGlu: 5.734 ± 0.336
IlePhe: 2.829 ± 0.266
IleGly: 3.541 ± 0.216
IleHis: 1.347 ± 0.173
IleIle: 5.696 ± 0.352
IleLys: 6.908 ± 0.354
IleLeu: 4.599 ± 0.273
IleMet: 1.828 ± 0.198
IleAsn: 5.08 ± 0.273
IlePro: 2.925 ± 0.217
IleGln: 2.694 ± 0.21
IleArg: 3.637 ± 0.276
IleSer: 4.907 ± 0.383
IleThr: 4.637 ± 0.29
IleVal: 4.484 ± 0.277
IleTrp: 0.635 ± 0.115
IleTyr: 2.79 ± 0.246
IleXaa: 0.0 ± 0.0
Lys
LysAla: 5.657 ± 0.328
LysCys: 0.866 ± 0.135
LysAsp: 5.215 ± 0.341
LysGlu: 5.792 ± 0.395
LysPhe: 4.099 ± 0.27
LysGly: 4.079 ± 0.26
LysHis: 1.713 ± 0.213
LysIle: 6.196 ± 0.313
LysLys: 5.099 ± 0.397
LysLeu: 6.119 ± 0.354
LysMet: 2.502 ± 0.214
LysAsn: 4.503 ± 0.298
LysPro: 2.367 ± 0.19
LysGln: 2.559 ± 0.227
LysArg: 3.156 ± 0.277
LysSer: 5.003 ± 0.319
LysThr: 4.33 ± 0.277
LysVal: 5.176 ± 0.357
LysTrp: 1.078 ± 0.119
LysTyr: 3.425 ± 0.274
LysXaa: 0.0 ± 0.0
Leu
LeuAla: 4.791 ± 0.318
LeuCys: 0.943 ± 0.121
LeuAsp: 4.926 ± 0.306
LeuGlu: 5.292 ± 0.336
LeuPhe: 3.329 ± 0.281
LeuGly: 4.233 ± 0.297
LeuHis: 1.174 ± 0.16
LeuIle: 5.349 ± 0.357
LeuLys: 6.292 ± 0.328
LeuLeu: 4.714 ± 0.312
LeuMet: 2.097 ± 0.223
LeuAsn: 4.522 ± 0.262
LeuPro: 2.963 ± 0.247
LeuGln: 2.405 ± 0.208
LeuArg: 3.175 ± 0.228
LeuSer: 5.388 ± 0.305
LeuThr: 4.214 ± 0.303
LeuVal: 4.233 ± 0.321
LeuTrp: 0.77 ± 0.122
LeuTyr: 2.848 ± 0.225
LeuXaa: 0.0 ± 0.0
Met
MetAla: 2.29 ± 0.199
MetCys: 0.346 ± 0.085
MetAsp: 1.385 ± 0.159
MetGlu: 1.79 ± 0.214
MetPhe: 1.405 ± 0.149
MetGly: 1.405 ± 0.175
MetHis: 0.423 ± 0.086
MetIle: 2.04 ± 0.206
MetLys: 3.079 ± 0.288
MetLeu: 1.886 ± 0.204
MetMet: 1.02 ± 0.143
MetAsn: 1.751 ± 0.202
MetPro: 0.693 ± 0.124
MetGln: 0.808 ± 0.1
MetArg: 1.02 ± 0.156
MetSer: 1.713 ± 0.156
MetThr: 1.636 ± 0.171
MetVal: 1.308 ± 0.144
MetTrp: 0.173 ± 0.053
MetTyr: 1.039 ± 0.136
MetXaa: 0.0 ± 0.0
Asn
AsnAla: 3.868 ± 0.264
AsnCys: 0.539 ± 0.099
AsnAsp: 3.502 ± 0.244
AsnGlu: 3.983 ± 0.284
AsnPhe: 2.444 ± 0.159
AsnGly: 3.772 ± 0.318
AsnHis: 0.693 ± 0.13
AsnIle: 4.522 ± 0.266
AsnLys: 4.695 ± 0.31
AsnLeu: 4.002 ± 0.277
AsnMet: 1.559 ± 0.167
AsnAsn: 3.271 ± 0.321
AsnPro: 2.29 ± 0.22
AsnGln: 1.751 ± 0.197
AsnArg: 2.502 ± 0.218
AsnSer: 3.752 ± 0.265
AsnThr: 2.675 ± 0.237
AsnVal: 3.156 ± 0.226
AsnTrp: 0.731 ± 0.11
AsnTyr: 2.348 ± 0.194
AsnXaa: 0.0 ± 0.0
Pro
ProAla: 2.02 ± 0.214
ProCys: 0.423 ± 0.076
ProAsp: 2.482 ± 0.22
ProGlu: 3.04 ± 0.275
ProPhe: 1.693 ± 0.199
ProGly: 2.328 ± 0.231
ProHis: 0.654 ± 0.1
ProIle: 2.059 ± 0.22
ProLys: 2.328 ± 0.268
ProLeu: 2.117 ± 0.173
ProMet: 0.789 ± 0.126
ProAsn: 1.924 ± 0.207
ProPro: 1.058 ± 0.147
ProGln: 1.02 ± 0.123
ProArg: 1.116 ± 0.174
ProSer: 2.251 ± 0.212
ProThr: 2.251 ± 0.218
ProVal: 2.579 ± 0.23
ProTrp: 0.673 ± 0.13
ProTyr: 1.482 ± 0.18
ProXaa: 0.0 ± 0.0
Gln
GlnAla: 2.251 ± 0.227
GlnCys: 0.308 ± 0.074
GlnAsp: 1.636 ± 0.189
GlnGlu: 2.482 ± 0.242
GlnPhe: 1.713 ± 0.167
GlnGly: 1.79 ± 0.155
GlnHis: 0.481 ± 0.119
GlnIle: 2.694 ± 0.213
GlnLys: 2.463 ± 0.231
GlnLeu: 2.963 ± 0.233
GlnMet: 0.981 ± 0.143
GlnAsn: 1.539 ± 0.171
GlnPro: 1.232 ± 0.143
GlnGln: 1.02 ± 0.139
GlnArg: 1.713 ± 0.201
GlnSer: 1.963 ± 0.184
GlnThr: 1.867 ± 0.177
GlnVal: 2.271 ± 0.239
GlnTrp: 0.616 ± 0.115
GlnTyr: 1.79 ± 0.175
GlnXaa: 0.0 ± 0.0
Arg
ArgAla: 2.425 ± 0.202
ArgCys: 0.558 ± 0.111
ArgAsp: 2.502 ± 0.202
ArgGlu: 3.406 ± 0.278
ArgPhe: 1.77 ± 0.174
ArgGly: 2.713 ± 0.201
ArgHis: 0.712 ± 0.114
ArgIle: 3.117 ± 0.264
ArgLys: 3.156 ± 0.256
ArgLeu: 3.56 ± 0.249
ArgMet: 0.924 ± 0.149
ArgAsn: 2.097 ± 0.197
ArgPro: 1.27 ± 0.174
ArgGln: 1.616 ± 0.149
ArgArg: 2.02 ± 0.214
ArgSer: 2.598 ± 0.216
ArgThr: 2.174 ± 0.22
ArgVal: 2.559 ± 0.237
ArgTrp: 0.75 ± 0.115
ArgTyr: 1.636 ± 0.195
ArgXaa: 0.0 ± 0.0
Ser
SerAla: 3.656 ± 0.242
SerCys: 0.673 ± 0.164
SerAsp: 4.022 ± 0.295
SerGlu: 4.233 ± 0.294
SerPhe: 2.886 ± 0.276
SerGly: 4.58 ± 0.315
SerHis: 1.116 ± 0.153
SerIle: 5.08 ± 0.294
SerLys: 5.484 ± 0.282
SerLeu: 4.888 ± 0.244
SerMet: 1.559 ± 0.191
SerAsn: 3.56 ± 0.289
SerPro: 2.502 ± 0.216
SerGln: 2.04 ± 0.189
SerArg: 2.598 ± 0.212
SerSer: 5.446 ± 0.452
SerThr: 4.253 ± 0.355
SerVal: 4.176 ± 0.308
SerTrp: 0.866 ± 0.103
SerTyr: 2.809 ± 0.261
SerXaa: 0.0 ± 0.0
Thr
ThrAla: 4.137 ± 0.442
ThrCys: 0.443 ± 0.106
ThrAsp: 3.329 ± 0.23
ThrGlu: 3.849 ± 0.311
ThrPhe: 2.598 ± 0.206
ThrGly: 4.099 ± 0.355
ThrHis: 1.039 ± 0.12
ThrIle: 4.137 ± 0.302
ThrLys: 3.579 ± 0.232
ThrLeu: 4.022 ± 0.29
ThrMet: 1.078 ± 0.135
ThrAsn: 2.848 ± 0.244
ThrPro: 2.771 ± 0.248
ThrGln: 1.867 ± 0.207
ThrArg: 2.482 ± 0.266
ThrSer: 3.252 ± 0.287
ThrThr: 3.175 ± 0.291
ThrVal: 4.599 ± 0.381
ThrTrp: 0.866 ± 0.117
ThrTyr: 2.463 ± 0.207
ThrXaa: 0.0 ± 0.0
Val
ValAla: 3.444 ± 0.232
ValCys: 0.962 ± 0.136
ValAsp: 3.964 ± 0.243
ValGlu: 5.657 ± 0.363
ValPhe: 2.752 ± 0.237
ValGly: 3.733 ± 0.325
ValHis: 1.058 ± 0.149
ValIle: 4.599 ± 0.312
ValLys: 5.773 ± 0.318
ValLeu: 4.657 ± 0.271
ValMet: 1.693 ± 0.211
ValAsn: 4.137 ± 0.279
ValPro: 2.194 ± 0.228
ValGln: 2.251 ± 0.256
ValArg: 2.521 ± 0.225
ValSer: 4.407 ± 0.292
ValThr: 3.829 ± 0.323
ValVal: 4.387 ± 0.333
ValTrp: 0.712 ± 0.108
ValTyr: 2.579 ± 0.235
ValXaa: 0.0 ± 0.0
Trp
TrpAla: 0.789 ± 0.13
TrpCys: 0.154 ± 0.055
TrpAsp: 0.866 ± 0.134
TrpGlu: 0.827 ± 0.108
TrpPhe: 0.962 ± 0.144
TrpGly: 0.616 ± 0.134
TrpHis: 0.173 ± 0.061
TrpIle: 0.924 ± 0.127
TrpLys: 1.559 ± 0.163
TrpLeu: 0.885 ± 0.126
TrpMet: 0.52 ± 0.102
TrpAsn: 0.981 ± 0.167
TrpPro: 0.404 ± 0.094
TrpGln: 0.616 ± 0.11
TrpArg: 0.481 ± 0.088
TrpSer: 0.904 ± 0.106
TrpThr: 0.597 ± 0.093
TrpVal: 0.847 ± 0.119
TrpTrp: 0.192 ± 0.054
TrpTyr: 0.789 ± 0.121
TrpXaa: 0.0 ± 0.0
Tyr
TyrAla: 2.963 ± 0.239
TyrCys: 0.52 ± 0.123
TyrAsp: 3.387 ± 0.272
TyrGlu: 2.713 ± 0.253
TyrPhe: 2.04 ± 0.221
TyrGly: 2.232 ± 0.223
TyrHis: 0.731 ± 0.112
TyrIle: 3.348 ± 0.243
TyrLys: 3.156 ± 0.309
TyrLeu: 2.809 ± 0.217
TyrMet: 1.02 ± 0.148
TyrAsn: 2.444 ± 0.23
TyrPro: 1.597 ± 0.184
TyrGln: 1.847 ± 0.171
TyrArg: 1.886 ± 0.164
TyrSer: 2.886 ± 0.266
TyrThr: 2.444 ± 0.229
TyrVal: 3.06 ± 0.226
TyrTrp: 0.635 ± 0.111
TyrTyr: 1.809 ± 0.247
TyrXaa: 0.0 ± 0.0
Xaa
XaaAla: 0.0 ± 0.0
XaaCys: 0.0 ± 0.0
XaaAsp: 0.0 ± 0.0
XaaGlu: 0.0 ± 0.0
XaaPhe: 0.0 ± 0.0
XaaGly: 0.0 ± 0.0
XaaHis: 0.0 ± 0.0
XaaIle: 0.0 ± 0.0
XaaLys: 0.0 ± 0.0
XaaLeu: 0.0 ± 0.0
XaaMet: 0.0 ± 0.0
XaaAsn: 0.0 ± 0.0
XaaPro: 0.0 ± 0.0
XaaGln: 0.0 ± 0.0
XaaArg: 0.0 ± 0.0
XaaSer: 0.0 ± 0.0
XaaThr: 0.0 ± 0.0
XaaVal: 0.0 ± 0.0
XaaTrp: 0.0 ± 0.0
XaaTyr: 0.0 ± 0.0
XaaXaa: 0.0 ± 0.0
Statistics based on 266 proteins (51969 amino acids)

Note: The error was estimated by bootstrapping (100 replicates) at the protein level.
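A protein-level bootstrap of this kind can be sketched as follows: whole proteins are resampled with replacement, the dipeptide frequency is recomputed for each replicate, and the spread of the replicates serves as the error. This is a hedged illustration rather than the database's actual procedure; in particular, reporting the standard deviation of the replicates is an assumption, and the function names, the seed, and the toy sequences are hypothetical.

```python
import random
import statistics


def pair_per_mille(proteins, pair):
    """Frequency (per mille) of one dipeptide among all overlapping pairs."""
    hits = 0
    total = 0
    for seq in proteins:
        for i in range(len(seq) - 1):
            total += 1
            if seq[i:i + 2] == pair:
                hits += 1
    return 1000.0 * hits / total if total else 0.0


def bootstrap_error(proteins, pair, n_boot=100, seed=0):
    """Standard deviation of the frequency over protein-level resamples."""
    rng = random.Random(seed)  # fixed seed only to make the sketch repeatable
    estimates = []
    for _ in range(n_boot):
        # Resample whole proteins with replacement, keeping the proteome size.
        resample = rng.choices(proteins, k=len(proteins))
        estimates.append(pair_per_mille(resample, pair))
    return statistics.stdev(estimates)


# Hypothetical toy proteome, not real Sf21 sequences.
toy_proteome = ["MKKLA", "AKKA", "MLLAG", "GAKKL", "MAGLK"]
kk_error = bootstrap_error(toy_proteome, "KK")
```

Resampling proteins rather than individual residues keeps each sequence intact, so the estimate reflects variation between proteins in the proteome.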

The dipeptide statistics above (along with other statistics for this proteome) can be downloaded from this CSV file
See this proteome in: uniprot_link
Proteome-pI is available under the Creative Commons Attribution-NoDerivs license; for more details, see here

Reference: Kozlowski LP. Proteome-pI 2.0: Proteome Isoelectric Point Database Update. Nucleic Acids Res. 2021, doi: 10.1093/nar/gkab944
Contact: Lukasz P. Kozlowski