Amino acid dipeptide frequency for Mycobacterium phage Schatzie

Apart from single amino acid frequencies, one can also calculate the so-called amino acid dipeptide frequency, i.e. how often each ordered pair of adjacent amino acids occurs.
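As a rough illustration of what such a calculation involves, the sketch below counts overlapping pairs of adjacent residues in each protein and reports them in per mille. It is not the database's own code: the function name `dipeptide_per_mille`, the use of one-letter residue codes, and normalizing by the total number of dipeptide windows are all assumptions made for this example.

```python
from collections import Counter


def dipeptide_per_mille(proteins):
    """Return {dipeptide: frequency in per mille} for a list of sequences.

    Assumption: frequencies are normalized by the total number of
    overlapping two-residue windows; the database may normalize differently.
    """
    counts = Counter()
    for seq in proteins:
        # Overlapping windows of length 2; a pair never spans two proteins.
        counts.update(seq[i:i + 2] for i in range(len(seq) - 1))
    total = sum(counts.values())
    if total == 0:
        return {}
    return {pair: 1000.0 * n / total for pair, n in counts.items()}


# Toy input (hypothetical sequences, not taken from this proteome).
freqs = dipeptide_per_mille(["MAAG", "AAK"])
print(freqs["AA"])
```

In the toy input, "AA" accounts for two of the five windows (MA, AA, AG, AA, AK).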

There are 400 possible dipeptides (441 if Xaa is counted as an additional, 21st amino acid representing all non-standard or unknown amino acids). Thus, if dipeptides occurred randomly in proteins, with all 20 amino acids equally likely, each dipeptide would be present at a frequency of 0.25% (2.5‰). As this is not the case in nature, for better visibility all dipeptides that occur more often than expected are marked in red in the table, and those that are underrepresented are marked in blue.
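The arithmetic behind the expected value can be sketched as follows. The helper names are hypothetical, and the simple greater-than/less-than comparison is only an assumption about how over- and underrepresentation might be decided; the exact colouring rule used in the table is not stated here. Values are expressed in per mille so they can be compared directly with the table entries.

```python
def expected_per_mille(alphabet_size=20):
    """Expected frequency of any one dipeptide if all pairs were equally likely."""
    return 1000.0 / alphabet_size ** 2


def classify(observed_per_mille, alphabet_size=20):
    """Label a dipeptide relative to the uniform expectation (assumed rule)."""
    expected = expected_per_mille(alphabet_size)
    if observed_per_mille > expected:
        return "over"
    if observed_per_mille < expected:
        return "under"
    return "as expected"


print(expected_per_mille(20))  # 20 * 20 = 400 pairs
print(expected_per_mille(21))  # 21 * 21 = 441 pairs when Xaa is included
print(classify(10.9))          # AlaAla value from the table
print(classify(0.145))         # CysCys value from the table
```

With 400 pairs the expectation is 2.5‰, so AlaAla (10.9‰) falls above it and CysCys (0.145‰) below it.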

All values are given in per mille (‰) and therefore need to be multiplied by 10⁻³ to obtain fractions.

For more information see sequence space article on Wikipedia.

Ala Cys Asp Glu Phe Gly His Ile Lys Leu Met Asn Pro Gln Arg Ser Thr Val Trp Tyr Xaa
Ala
AlaAla: 10.9 ± 0.792
AlaCys: 1.017 ± 0.177
AlaAsp: 6.54 ± 0.512
AlaGlu: 7.499 ± 0.578
AlaPhe: 3.517 ± 0.342
AlaGly: 7.063 ± 0.612
AlaHis: 2.267 ± 0.312
AlaIle: 4.825 ± 0.436
AlaLys: 3.779 ± 0.373
AlaLeu: 8.574 ± 0.493
AlaMet: 3.604 ± 0.419
AlaAsn: 2.936 ± 0.306
AlaPro: 3.895 ± 0.364
AlaGln: 3.081 ± 0.286
AlaArg: 6.685 ± 0.47
AlaSer: 5.028 ± 0.406
AlaThr: 5.203 ± 0.402
AlaVal: 6.511 ± 0.461
AlaTrp: 1.947 ± 0.249
AlaTyr: 2.558 ± 0.251
AlaXaa: 0.0 ± 0.0
Cys
CysAla: 0.901 ± 0.178
CysCys: 0.145 ± 0.056
CysAsp: 1.134 ± 0.216
CysGlu: 1.279 ± 0.208
CysPhe: 0.407 ± 0.11
CysGly: 1.744 ± 0.222
CysHis: 0.349 ± 0.103
CysIle: 0.494 ± 0.137
CysLys: 0.639 ± 0.125
CysLeu: 0.959 ± 0.203
CysMet: 0.262 ± 0.079
CysAsn: 0.552 ± 0.14
CysPro: 0.727 ± 0.149
CysGln: 0.407 ± 0.122
CysArg: 1.046 ± 0.208
CysSer: 0.523 ± 0.12
CysThr: 0.872 ± 0.137
CysVal: 0.843 ± 0.188
CysTrp: 0.203 ± 0.076
CysTyr: 0.465 ± 0.126
CysXaa: 0.0 ± 0.0
Asp
AspAla: 6.075 ± 0.441
AspCys: 0.93 ± 0.157
AspAsp: 4.825 ± 0.368
AspGlu: 5.406 ± 0.602
AspPhe: 2.151 ± 0.211
AspGly: 7.034 ± 0.447
AspHis: 1.744 ± 0.256
AspIle: 3.255 ± 0.316
AspLys: 2.558 ± 0.301
AspLeu: 5.871 ± 0.361
AspMet: 1.221 ± 0.175
AspAsn: 2.151 ± 0.256
AspPro: 3.43 ± 0.311
AspGln: 1.889 ± 0.228
AspArg: 4.04 ± 0.347
AspSer: 3.081 ± 0.326
AspThr: 3.255 ± 0.279
AspVal: 4.098 ± 0.346
AspTrp: 2.296 ± 0.281
AspTyr: 2.645 ± 0.277
AspXaa: 0.0 ± 0.0
Glu
GluAla: 7.208 ± 0.523
GluCys: 1.308 ± 0.181
GluAsp: 4.447 ± 0.443
GluGlu: 5.261 ± 0.492
GluPhe: 2.79 ± 0.334
GluGly: 4.65 ± 0.342
GluHis: 1.337 ± 0.222
GluIle: 4.331 ± 0.314
GluLys: 2.296 ± 0.306
GluLeu: 6.365 ± 0.461
GluMet: 2.383 ± 0.299
GluAsn: 1.482 ± 0.237
GluPro: 3.284 ± 0.328
GluGln: 2.296 ± 0.275
GluArg: 5.057 ± 0.446
GluSer: 3.255 ± 0.329
GluThr: 3.401 ± 0.317
GluVal: 4.36 ± 0.375
GluTrp: 1.395 ± 0.227
GluTyr: 2.442 ± 0.267
GluXaa: 0.0 ± 0.0
Phe
PheAla: 2.471 ± 0.288
PheCys: 0.552 ± 0.139
PheAsp: 2.296 ± 0.232
PheGlu: 2.267 ± 0.221
PhePhe: 1.046 ± 0.18
PheGly: 3.284 ± 0.436
PheHis: 0.639 ± 0.157
PheIle: 1.424 ± 0.185
PheLys: 1.075 ± 0.178
PheLeu: 2.383 ± 0.283
PheMet: 0.872 ± 0.142
PheAsn: 1.366 ± 0.216
PhePro: 1.802 ± 0.24
PheGln: 0.988 ± 0.189
PheArg: 2.064 ± 0.231
PheSer: 1.86 ± 0.231
PheThr: 2.035 ± 0.253
PheVal: 2.383 ± 0.254
PheTrp: 0.639 ± 0.138
PheTyr: 0.698 ± 0.133
PheXaa: 0.0 ± 0.0
Gly
GlyAla: 7.266 ± 0.642
GlyCys: 1.25 ± 0.201
GlyAsp: 5.813 ± 0.365
GlyGlu: 5.697 ± 0.49
GlyPhe: 3.372 ± 0.368
GlyGly: 7.528 ± 1.365
GlyHis: 2.209 ± 0.258
GlyIle: 3.837 ± 0.412
GlyLys: 3.749 ± 0.376
GlyLeu: 7.412 ± 0.551
GlyMet: 1.889 ± 0.246
GlyAsn: 3.11 ± 0.342
GlyPro: 3.604 ± 0.36
GlyGln: 2.703 ± 0.376
GlyArg: 5.581 ± 0.366
GlySer: 4.796 ± 0.339
GlyThr: 4.796 ± 0.341
GlyVal: 4.854 ± 0.341
GlyTrp: 2.035 ± 0.272
GlyTyr: 3.139 ± 0.294
GlyXaa: 0.0 ± 0.0
His
HisAla: 1.395 ± 0.199
HisCys: 0.349 ± 0.09
HisAsp: 2.093 ± 0.276
HisGlu: 1.54 ± 0.217
HisPhe: 0.581 ± 0.143
HisGly: 2.209 ± 0.282
HisHis: 0.988 ± 0.198
HisIle: 1.134 ± 0.209
HisLys: 0.698 ± 0.153
HisLeu: 2.645 ± 0.252
HisMet: 0.552 ± 0.123
HisAsn: 0.436 ± 0.106
HisPro: 1.366 ± 0.192
HisGln: 0.814 ± 0.137
HisArg: 1.918 ± 0.229
HisSer: 0.901 ± 0.163
HisThr: 0.698 ± 0.141
HisVal: 1.308 ± 0.192
HisTrp: 0.814 ± 0.155
HisTyr: 0.756 ± 0.165
HisXaa: 0.0 ± 0.0
Ile
IleAla: 5.522 ± 0.474
IleCys: 0.494 ± 0.128
IleAsp: 4.011 ± 0.348
IleGlu: 3.604 ± 0.308
IlePhe: 0.988 ± 0.197
IleGly: 3.953 ± 0.467
IleHis: 1.134 ± 0.168
IleIle: 2.064 ± 0.278
IleLys: 1.715 ± 0.233
IleLeu: 3.517 ± 0.333
IleMet: 1.075 ± 0.175
IleAsn: 1.57 ± 0.215
IlePro: 3.139 ± 0.298
IleGln: 1.54 ± 0.183
IleArg: 3.197 ± 0.321
IleSer: 2.122 ± 0.215
IleThr: 2.848 ± 0.3
IleVal: 3.575 ± 0.325
IleTrp: 0.785 ± 0.143
IleTyr: 1.25 ± 0.178
IleXaa: 0.0 ± 0.0
Lys
LysAla: 4.244 ± 0.442
LysCys: 0.581 ± 0.123
LysAsp: 1.918 ± 0.272
LysGlu: 1.918 ± 0.264
LysPhe: 1.104 ± 0.17
LysGly: 2.936 ± 0.345
LysHis: 0.93 ± 0.178
LysIle: 1.686 ± 0.235
LysLys: 1.976 ± 0.31
LysLeu: 3.779 ± 0.355
LysMet: 1.453 ± 0.237
LysAsn: 0.756 ± 0.112
LysPro: 2.587 ± 0.255
LysGln: 1.075 ± 0.187
LysArg: 2.674 ± 0.273
LysSer: 2.122 ± 0.301
LysThr: 1.889 ± 0.238
LysVal: 3.023 ± 0.322
LysTrp: 0.988 ± 0.162
LysTyr: 1.366 ± 0.212
LysXaa: 0.0 ± 0.0
Leu
LeuAla: 9.272 ± 0.514
LeuCys: 0.814 ± 0.165
LeuAsp: 5.697 ± 0.315
LeuGlu: 5.348 ± 0.416
LeuPhe: 2.412 ± 0.254
LeuGly: 6.743 ± 0.507
LeuHis: 1.976 ± 0.219
LeuIle: 2.936 ± 0.288
LeuLys: 3.11 ± 0.34
LeuLeu: 6.278 ± 0.451
LeuMet: 2.064 ± 0.218
LeuAsn: 3.43 ± 0.293
LeuPro: 4.302 ± 0.402
LeuGln: 2.558 ± 0.304
LeuArg: 5.232 ± 0.387
LeuSer: 5.261 ± 0.381
LeuThr: 5.319 ± 0.407
LeuVal: 5.086 ± 0.398
LeuTrp: 1.57 ± 0.223
LeuTyr: 2.18 ± 0.301
LeuXaa: 0.0 ± 0.0
Met
MetAla: 2.471 ± 0.241
MetCys: 0.233 ± 0.091
MetAsp: 1.279 ± 0.209
MetGlu: 1.628 ± 0.219
MetPhe: 0.756 ± 0.153
MetGly: 1.715 ± 0.208
MetHis: 0.407 ± 0.104
MetIle: 1.54 ± 0.24
MetLys: 1.163 ± 0.174
MetLeu: 1.453 ± 0.247
MetMet: 0.639 ± 0.135
MetAsn: 0.988 ± 0.164
MetPro: 1.308 ± 0.208
MetGln: 0.727 ± 0.142
MetArg: 1.308 ± 0.172
MetSer: 2.761 ± 0.27
MetThr: 2.18 ± 0.253
MetVal: 1.25 ± 0.168
MetTrp: 0.552 ± 0.121
MetTyr: 0.436 ± 0.116
MetXaa: 0.0 ± 0.0
Asn
AsnAla: 3.081 ± 0.302
AsnCys: 0.349 ± 0.089
AsnAsp: 2.122 ± 0.224
AsnGlu: 1.715 ± 0.199
AsnPhe: 1.192 ± 0.2
AsnGly: 3.662 ± 0.28
AsnHis: 0.785 ± 0.163
AsnIle: 1.337 ± 0.215
AsnLys: 1.308 ± 0.18
AsnLeu: 2.732 ± 0.287
AsnMet: 0.465 ± 0.147
AsnAsn: 0.581 ± 0.113
AsnPro: 2.616 ± 0.274
AsnGln: 0.814 ± 0.187
AsnArg: 2.383 ± 0.249
AsnSer: 1.511 ± 0.252
AsnThr: 1.511 ± 0.235
AsnVal: 2.238 ± 0.279
AsnTrp: 0.843 ± 0.186
AsnTyr: 0.93 ± 0.191
AsnXaa: 0.0 ± 0.0
Pro
ProAla: 5.232 ± 0.429
ProCys: 0.727 ± 0.152
ProAsp: 3.837 ± 0.356
ProGlu: 4.156 ± 0.427
ProPhe: 1.686 ± 0.217
ProGly: 5.261 ± 0.544
ProHis: 0.872 ± 0.152
ProIle: 2.238 ± 0.26
ProLys: 2.267 ± 0.276
ProLeu: 3.953 ± 0.278
ProMet: 1.163 ± 0.219
ProAsn: 2.093 ± 0.238
ProPro: 3.052 ± 0.369
ProGln: 1.366 ± 0.195
ProArg: 3.255 ± 0.304
ProSer: 2.296 ± 0.279
ProThr: 2.761 ± 0.323
ProVal: 4.215 ± 0.325
ProTrp: 1.482 ± 0.191
ProTyr: 1.54 ± 0.218
ProXaa: 0.0 ± 0.0
Gln
GlnAla: 3.139 ± 0.302
GlnCys: 0.494 ± 0.117
GlnAsp: 1.192 ± 0.16
GlnGlu: 1.947 ± 0.253
GlnPhe: 1.308 ± 0.205
GlnGly: 2.5 ± 0.329
GlnHis: 0.61 ± 0.123
GlnIle: 1.773 ± 0.214
GlnLys: 1.599 ± 0.314
GlnLeu: 2.645 ± 0.318
GlnMet: 0.814 ± 0.147
GlnAsn: 0.843 ± 0.18
GlnPro: 1.86 ± 0.224
GlnGln: 1.134 ± 0.194
GlnArg: 2.064 ± 0.257
GlnSer: 1.715 ± 0.217
GlnThr: 1.657 ± 0.22
GlnVal: 2.412 ± 0.295
GlnTrp: 0.698 ± 0.145
GlnTyr: 0.639 ± 0.111
GlnXaa: 0.0 ± 0.0
Arg
ArgAla: 6.685 ± 0.518
ArgCys: 1.134 ± 0.19
ArgAsp: 4.156 ± 0.299
ArgGlu: 4.185 ± 0.373
ArgPhe: 2.18 ± 0.249
ArgGly: 4.738 ± 0.334
ArgHis: 1.57 ± 0.244
ArgIle: 3.604 ± 0.396
ArgLys: 2.703 ± 0.298
ArgLeu: 4.767 ± 0.359
ArgMet: 1.831 ± 0.225
ArgAsn: 1.976 ± 0.222
ArgPro: 3.372 ± 0.362
ArgGln: 2.877 ± 0.331
ArgArg: 5.406 ± 0.464
ArgSer: 3.575 ± 0.311
ArgThr: 3.372 ± 0.426
ArgVal: 5.086 ± 0.353
ArgTrp: 2.093 ± 0.282
ArgTyr: 2.471 ± 0.28
ArgXaa: 0.0 ± 0.0
Ser
SerAla: 5.348 ± 0.416
SerCys: 0.785 ± 0.172
SerAsp: 3.459 ± 0.334
SerGlu: 3.953 ± 0.38
SerPhe: 1.395 ± 0.218
SerGly: 4.912 ± 0.478
SerHis: 1.279 ± 0.221
SerIle: 2.907 ± 0.271
SerLys: 1.947 ± 0.242
SerLeu: 4.68 ± 0.384
SerMet: 1.337 ± 0.193
SerAsn: 1.86 ± 0.267
SerPro: 2.936 ± 0.275
SerGln: 1.366 ± 0.225
SerArg: 3.546 ± 0.337
SerSer: 3.197 ± 0.387
SerThr: 3.052 ± 0.313
SerVal: 3.488 ± 0.361
SerTrp: 1.57 ± 0.217
SerTyr: 1.511 ± 0.184
SerXaa: 0.0 ± 0.0
Thr
ThrAla: 4.883 ± 0.406
ThrCys: 0.959 ± 0.177
ThrAsp: 3.604 ± 0.349
ThrGlu: 3.255 ± 0.32
ThrPhe: 2.006 ± 0.251
ThrGly: 4.941 ± 0.429
ThrHis: 1.25 ± 0.214
ThrIle: 2.936 ± 0.306
ThrLys: 1.802 ± 0.304
ThrLeu: 4.68 ± 0.351
ThrMet: 0.756 ± 0.152
ThrAsn: 1.657 ± 0.194
ThrPro: 4.127 ± 0.301
ThrGln: 1.482 ± 0.191
ThrArg: 3.197 ± 0.323
ThrSer: 3.226 ± 0.475
ThrThr: 2.965 ± 0.338
ThrVal: 4.447 ± 0.353
ThrTrp: 1.453 ± 0.224
ThrTyr: 1.773 ± 0.219
ThrXaa: 0.0 ± 0.0
Val
ValAla: 6.482 ± 0.456
ValCys: 0.785 ± 0.164
ValAsp: 5.813 ± 0.455
ValGlu: 5.493 ± 0.405
ValPhe: 1.715 ± 0.208
ValGly: 5.057 ± 0.491
ValHis: 1.599 ± 0.182
ValIle: 3.546 ± 0.316
ValLys: 2.674 ± 0.291
ValLeu: 4.592 ± 0.34
ValMet: 1.366 ± 0.213
ValAsn: 2.529 ± 0.297
ValPro: 3.604 ± 0.334
ValGln: 2.18 ± 0.276
ValArg: 4.505 ± 0.438
ValSer: 4.185 ± 0.392
ValThr: 4.389 ± 0.415
ValVal: 5.61 ± 0.55
ValTrp: 1.017 ± 0.161
ValTyr: 2.006 ± 0.318
ValXaa: 0.0 ± 0.0
Trp
TrpAla: 1.831 ± 0.198
TrpCys: 0.465 ± 0.119
TrpAsp: 1.337 ± 0.221
TrpGlu: 1.25 ± 0.188
TrpPhe: 0.756 ± 0.158
TrpGly: 1.715 ± 0.195
TrpHis: 0.814 ± 0.142
TrpIle: 1.134 ± 0.159
TrpLys: 0.785 ± 0.163
TrpLeu: 1.86 ± 0.267
TrpMet: 0.61 ± 0.117
TrpAsn: 0.872 ± 0.141
TrpPro: 1.163 ± 0.181
TrpGln: 0.669 ± 0.135
TrpArg: 1.831 ± 0.252
TrpSer: 1.831 ± 0.223
TrpThr: 1.511 ± 0.206
TrpVal: 2.035 ± 0.235
TrpTrp: 0.814 ± 0.159
TrpTyr: 0.669 ± 0.157
TrpXaa: 0.0 ± 0.0
Tyr
TyrAla: 3.023 ± 0.266
TyrCys: 0.581 ± 0.122
TyrAsp: 2.383 ± 0.273
TyrGlu: 1.947 ± 0.255
TyrPhe: 0.785 ± 0.144
TyrGly: 2.994 ± 0.317
TyrHis: 0.436 ± 0.115
TyrIle: 1.104 ± 0.213
TyrLys: 1.163 ± 0.187
TyrLeu: 2.471 ± 0.29
TyrMet: 0.407 ± 0.096
TyrAsn: 0.988 ± 0.146
TyrPro: 1.279 ± 0.192
TyrGln: 1.075 ± 0.171
TyrArg: 2.79 ± 0.298
TyrSer: 1.279 ± 0.185
TyrThr: 1.686 ± 0.184
TyrVal: 2.325 ± 0.245
TyrTrp: 0.756 ± 0.178
TyrTyr: 0.901 ± 0.183
TyrXaa: 0.0 ± 0.0
Xaa
XaaAla: 0.0 ± 0.0
XaaCys: 0.0 ± 0.0
XaaAsp: 0.0 ± 0.0
XaaGlu: 0.0 ± 0.0
XaaPhe: 0.0 ± 0.0
XaaGly: 0.0 ± 0.0
XaaHis: 0.0 ± 0.0
XaaIle: 0.0 ± 0.0
XaaLys: 0.0 ± 0.0
XaaLeu: 0.0 ± 0.0
XaaMet: 0.0 ± 0.0
XaaAsn: 0.0 ± 0.0
XaaPro: 0.0 ± 0.0
XaaGln: 0.0 ± 0.0
XaaArg: 0.0 ± 0.0
XaaSer: 0.0 ± 0.0
XaaThr: 0.0 ± 0.0
XaaVal: 0.0 ± 0.0
XaaTrp: 0.0 ± 0.0
XaaTyr: 0.0 ± 0.0
XaaXaa: 0.0 ± 0.0
Statistics based on 230 proteins (34406 amino acids)

Note: The error was estimated by bootstrapping (100 resamples) at the protein level.
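One way to read "bootstrapping at the protein level" is sketched below: whole proteins are resampled with replacement, the frequency of a dipeptide is recomputed for each resample, and the spread of those estimates is taken as the error. This is an illustration under stated assumptions, not the database's actual procedure: the function names, the use of the sample standard deviation as the error measure, the fixed random seed, and the one-letter residue codes are all choices made for the example.

```python
import random
import statistics


def pair_per_mille(proteins, pair):
    """Frequency of one dipeptide, in per mille, over overlapping windows."""
    total = 0
    hits = 0
    for seq in proteins:
        for i in range(len(seq) - 1):
            total += 1
            if seq[i:i + 2] == pair:
                hits += 1
    return 1000.0 * hits / total if total else 0.0


def bootstrap_error(proteins, pair, n_resamples=100, seed=0):
    """Standard deviation of the frequency across protein-level resamples."""
    rng = random.Random(seed)  # fixed seed only to make the sketch repeatable
    estimates = []
    for _ in range(n_resamples):
        # Draw as many proteins as in the original set, with replacement.
        sample = rng.choices(proteins, k=len(proteins))
        estimates.append(pair_per_mille(sample, pair))
    return statistics.stdev(estimates)


# Toy input (hypothetical sequences, not taken from this proteome).
toy = ["MAAGK", "AAKL", "GGAVA", "MKLV"]
print(bootstrap_error(toy, "AA"))
```

Resampling whole proteins rather than individual residues keeps each dipeptide within its original sequence context, so proteins with unusual composition contribute to the estimated spread.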

The dipeptide statistics above (along with other statistics for this proteome) can be downloaded from this CSV file
See this proteome in: uniprot_link
Proteome-pI is available under the Creative Commons Attribution-NoDerivs license; for more details see here

Reference: Kozlowski LP. Proteome-pI 2.0: Proteome Isoelectric Point Database Update. Nucleic Acids Res. 2021, doi: 10.1093/nar/gkab944
Contact: Lukasz P. Kozlowski