Amino acid dipeptide frequency for Bacillus phage phiNIT1

Apart from single amino acid frequencies, one can also calculate the so-called amino acid dipeptide frequency, i.e. how often each ordered pair of adjacent residues occurs.
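
As a minimal sketch of how such frequencies can be computed (not necessarily the procedure used by Proteome-pI), the snippet below counts overlapping residue pairs within each protein and reports them in per mille. The function name `dipeptide_frequencies`, the toy sequences, the mapping of non-standard letters to X (Xaa), and the normalisation by the total number of dipeptides are all assumptions made for illustration.

```python
from collections import Counter

STANDARD = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard amino acids (one-letter codes)


def dipeptide_frequencies(proteins):
    """Return per-mille frequencies of overlapping dipeptides.

    Assumption: pairs never span two proteins, and any letter outside
    the 20 standard residues is mapped to X (Xaa).
    """
    counts = Counter()
    for seq in proteins:
        seq = "".join(aa if aa in STANDARD else "X" for aa in seq.upper())
        # Residues i and i+1 form one dipeptide (overlapping windows).
        counts.update(seq[i:i + 2] for i in range(len(seq) - 1))
    total = sum(counts.values())
    if total == 0:
        return {}
    return {pair: 1000.0 * n / total for pair, n in counts.items()}


# Hypothetical toy input: two short sequences, one with a non-standard letter.
freqs = dipeptide_frequencies(["MKKLA", "MKB"])
```

Here the six dipeptides are MK, KK, KL, LA, MK and KX, so `freqs["MK"]` is one third of the total expressed in per mille.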

There are 400 possible dipeptides (441 if Xaa is counted as an additional 21st amino acid representing all non-standard or unknown amino acids). Thus, if dipeptides occurred uniformly at random in proteins, each amino acid dipeptide would be present at a frequency of 0.25% (2.5‰). As this is not the case in nature, for better visibility all dipeptides that are more frequent than expected are marked in red in the table, and those that are underrepresented are marked in blue.
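
The arithmetic behind the expected value can be sketched as follows. The helper names `expected_per_mille` and `classify` are hypothetical, and the rule of comparing each value directly with the uniform expectation (ignoring the error estimate) is an assumption; the page does not state the exact criterion used for colouring.

```python
def expected_per_mille(alphabet_size=20):
    """Uniform expectation for one dipeptide, in per mille."""
    return 1000.0 / alphabet_size ** 2


def classify(freq_per_mille, expected):
    """Label a dipeptide relative to the uniform expectation (assumed rule)."""
    if freq_per_mille > expected:
        return "over"   # would be marked red
    if freq_per_mille < expected:
        return "under"  # would be marked blue
    return "as expected"


expected = expected_per_mille(20)  # 1000 / 400 = 2.5 per mille, i.e. 0.25%

# Three values taken from the table on this page.
table_values = {"GluGlu": 9.033, "AlaArg": 2.557, "CysCys": 0.072}
labels = {name: classify(v, expected) for name, v in table_values.items()}
```

With 21 symbols (including Xaa), `expected_per_mille(21)` gives roughly 2.27‰ instead of 2.5‰.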

All values are given in per mille (‰) and therefore need to be multiplied by 10⁻³ to obtain fractions.
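
For illustration only, the unit conversion can be written as two small helpers; the function names are hypothetical and the example value is the AlaAla entry from the table.

```python
def per_mille_to_fraction(value):
    """Convert a per-mille value to a plain fraction (multiply by 10^-3)."""
    return value * 1e-3


def per_mille_to_percent(value):
    """Convert a per-mille value to percent (divide by 10)."""
    return value / 10.0


ala_ala_fraction = per_mille_to_fraction(5.377)  # about 0.005377
ala_ala_percent = per_mille_to_percent(5.377)    # about 0.5377 %
```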

For more information, see the sequence space article on Wikipedia.

Rows (headed by the first residue) list the second residue in this order: Ala Cys Asp Glu Phe Gly His Ile Lys Leu Met Asn Pro Gln Arg Ser Thr Val Trp Tyr Xaa
Ala
AlaAla: 5.377 ± 0.604
AlaCys: 0.311 ± 0.083
AlaAsp: 4.182 ± 0.314
AlaGlu: 4.445 ± 0.356
AlaPhe: 2.246 ± 0.213
AlaGly: 3.991 ± 0.519
AlaHis: 1.051 ± 0.153
AlaIle: 4.087 ± 0.313
AlaLys: 4.23 ± 0.376
AlaLeu: 5.783 ± 0.32
AlaMet: 1.888 ± 0.204
AlaAsn: 2.677 ± 0.318
AlaPro: 2.294 ± 0.298
AlaGln: 2.342 ± 0.29
AlaArg: 2.557 ± 0.252
AlaSer: 3.274 ± 0.381
AlaThr: 4.182 ± 0.417
AlaVal: 3.943 ± 0.352
AlaTrp: 0.55 ± 0.103
AlaTyr: 2.629 ± 0.269
AlaXaa: 0.0 ± 0.0
Cys
CysAla: 0.454 ± 0.123
CysCys: 0.072 ± 0.036
CysAsp: 0.502 ± 0.108
CysGlu: 0.382 ± 0.107
CysPhe: 0.311 ± 0.09
CysGly: 0.693 ± 0.153
CysHis: 0.119 ± 0.05
CysIle: 0.55 ± 0.128
CysLys: 0.526 ± 0.119
CysLeu: 0.621 ± 0.129
CysMet: 0.263 ± 0.077
CysAsn: 0.335 ± 0.098
CysPro: 0.454 ± 0.122
CysGln: 0.167 ± 0.068
CysArg: 0.191 ± 0.074
CysSer: 0.765 ± 0.154
CysThr: 0.311 ± 0.092
CysVal: 0.478 ± 0.117
CysTrp: 0.119 ± 0.058
CysTyr: 0.239 ± 0.078
CysXaa: 0.0 ± 0.0
Asp
AspAla: 3.919 ± 0.359
AspCys: 0.526 ± 0.134
AspAsp: 3.322 ± 0.329
AspGlu: 4.421 ± 0.41
AspPhe: 3.202 ± 0.354
AspGly: 4.349 ± 0.462
AspHis: 1.099 ± 0.152
AspIle: 4.756 ± 0.355
AspLys: 4.493 ± 0.421
AspLeu: 6.046 ± 0.359
AspMet: 2.151 ± 0.231
AspAsn: 3.059 ± 0.297
AspPro: 2.246 ± 0.256
AspGln: 1.936 ± 0.199
AspArg: 2.748 ± 0.312
AspSer: 3.752 ± 0.311
AspThr: 4.134 ± 0.342
AspVal: 5.305 ± 0.419
AspTrp: 0.789 ± 0.147
AspTyr: 3.632 ± 0.288
AspXaa: 0.0 ± 0.0
Glu
GluAla: 5.281 ± 0.412
GluCys: 0.597 ± 0.153
GluAsp: 5.903 ± 0.457
GluGlu: 9.033 ± 0.775
GluPhe: 2.581 ± 0.243
GluGly: 5.138 ± 0.376
GluHis: 1.577 ± 0.254
GluIle: 4.708 ± 0.389
GluLys: 5.473 ± 0.401
GluLeu: 7.026 ± 0.458
GluMet: 2.581 ± 0.253
GluAsn: 3.585 ± 0.36
GluPro: 2.031 ± 0.254
GluGln: 2.987 ± 0.338
GluArg: 3.776 ± 0.334
GluSer: 3.728 ± 0.319
GluThr: 3.585 ± 0.334
GluVal: 5.735 ± 0.389
GluTrp: 1.004 ± 0.153
GluTyr: 3.441 ± 0.329
GluXaa: 0.0 ± 0.0
Phe
PheAla: 1.84 ± 0.213
PheCys: 0.502 ± 0.114
PheAsp: 2.461 ± 0.261
PheGlu: 2.39 ± 0.238
PhePhe: 1.362 ± 0.199
PheGly: 2.151 ± 0.223
PheHis: 0.741 ± 0.124
PheIle: 2.7 ± 0.271
PheLys: 2.772 ± 0.257
PheLeu: 2.939 ± 0.242
PheMet: 0.813 ± 0.131
PheAsn: 2.509 ± 0.24
PhePro: 1.171 ± 0.137
PheGln: 1.29 ± 0.211
PheArg: 1.529 ± 0.194
PheSer: 3.178 ± 0.278
PheThr: 2.724 ± 0.266
PheVal: 2.342 ± 0.247
PheTrp: 0.382 ± 0.095
PheTyr: 1.577 ± 0.242
PheXaa: 0.0 ± 0.0
Gly
GlyAla: 3.776 ± 0.498
GlyCys: 0.741 ± 0.139
GlyAsp: 3.991 ± 0.339
GlyGlu: 4.708 ± 0.401
GlyPhe: 2.509 ± 0.286
GlyGly: 4.827 ± 0.641
GlyHis: 1.028 ± 0.171
GlyIle: 4.039 ± 0.371
GlyLys: 4.732 ± 0.373
GlyLeu: 5.042 ± 0.389
GlyMet: 1.506 ± 0.216
GlyAsn: 3.752 ± 0.333
GlyPro: 0.645 ± 0.137
GlyGln: 2.175 ± 0.23
GlyArg: 2.939 ± 0.243
GlySer: 4.564 ± 0.409
GlyThr: 4.11 ± 0.387
GlyVal: 4.612 ± 0.336
GlyTrp: 0.789 ± 0.153
GlyTyr: 3.226 ± 0.265
GlyXaa: 0.0 ± 0.0
His
HisAla: 1.171 ± 0.153
HisCys: 0.191 ± 0.068
HisAsp: 0.813 ± 0.155
HisGlu: 0.956 ± 0.174
HisPhe: 0.621 ± 0.142
HisGly: 1.147 ± 0.189
HisHis: 0.55 ± 0.121
HisIle: 1.362 ± 0.197
HisLys: 1.147 ± 0.178
HisLeu: 1.864 ± 0.244
HisMet: 0.55 ± 0.122
HisAsn: 0.932 ± 0.145
HisPro: 0.741 ± 0.14
HisGln: 0.574 ± 0.128
HisArg: 0.956 ± 0.162
HisSer: 1.314 ± 0.204
HisThr: 1.123 ± 0.196
HisVal: 1.792 ± 0.203
HisTrp: 0.335 ± 0.082
HisTyr: 0.836 ± 0.142
HisXaa: 0.0 ± 0.0
Ile
IleAla: 3.728 ± 0.319
IleCys: 0.574 ± 0.124
IleAsp: 4.541 ± 0.348
IleGlu: 4.708 ± 0.37
IlePhe: 1.816 ± 0.193
IleGly: 3.441 ± 0.326
IleHis: 1.195 ± 0.169
IleIle: 3.202 ± 0.271
IleLys: 4.995 ± 0.345
IleLeu: 4.66 ± 0.353
IleMet: 1.458 ± 0.182
IleAsn: 3.441 ± 0.328
IlePro: 2.222 ± 0.231
IleGln: 2.438 ± 0.229
IleArg: 2.7 ± 0.271
IleSer: 4.78 ± 0.434
IleThr: 4.087 ± 0.486
IleVal: 4.087 ± 0.336
IleTrp: 0.406 ± 0.095
IleTyr: 2.175 ± 0.253
IleXaa: 0.0 ± 0.0
Lys
LysAla: 5.066 ± 0.375
LysCys: 0.478 ± 0.112
LysAsp: 4.636 ± 0.299
LysGlu: 7.408 ± 0.614
LysPhe: 2.772 ± 0.312
LysGly: 4.947 ± 0.385
LysHis: 1.768 ± 0.251
LysIle: 3.178 ± 0.303
LysLys: 6.572 ± 0.515
LysLeu: 5.735 ± 0.419
LysMet: 1.96 ± 0.2
LysAsn: 3.704 ± 0.405
LysPro: 2.175 ± 0.298
LysGln: 2.82 ± 0.341
LysArg: 3.346 ± 0.32
LysSer: 4.588 ± 0.403
LysThr: 4.134 ± 0.289
LysVal: 4.827 ± 0.344
LysTrp: 0.908 ± 0.135
LysTyr: 3.107 ± 0.263
LysXaa: 0.0 ± 0.0
Leu
LeuAla: 5.138 ± 0.358
LeuCys: 0.645 ± 0.136
LeuAsp: 6.644 ± 0.436
LeuGlu: 6.93 ± 0.484
LeuPhe: 2.987 ± 0.32
LeuGly: 4.708 ± 0.325
LeuHis: 1.41 ± 0.203
LeuIle: 4.541 ± 0.33
LeuLys: 7.002 ± 0.425
LeuLeu: 6.763 ± 0.54
LeuMet: 2.031 ± 0.22
LeuAsn: 4.612 ± 0.317
LeuPro: 3.178 ± 0.29
LeuGln: 2.987 ± 0.32
LeuArg: 3.68 ± 0.295
LeuSer: 6.118 ± 0.458
LeuThr: 5.879 ± 0.439
LeuVal: 4.995 ± 0.385
LeuTrp: 0.932 ± 0.145
LeuTyr: 3.441 ± 0.311
LeuXaa: 0.0 ± 0.0
Met
MetAla: 1.816 ± 0.222
MetCys: 0.119 ± 0.052
MetAsp: 1.96 ± 0.216
MetGlu: 2.055 ± 0.231
MetPhe: 1.147 ± 0.161
MetGly: 1.41 ± 0.208
MetHis: 0.406 ± 0.099
MetIle: 1.482 ± 0.204
MetLys: 2.222 ± 0.214
MetLeu: 1.984 ± 0.225
MetMet: 0.502 ± 0.097
MetAsn: 1.529 ± 0.203
MetPro: 0.717 ± 0.133
MetGln: 0.932 ± 0.167
MetArg: 1.29 ± 0.156
MetSer: 2.342 ± 0.246
MetThr: 1.84 ± 0.212
MetVal: 1.123 ± 0.146
MetTrp: 0.143 ± 0.059
MetTyr: 1.147 ± 0.15
MetXaa: 0.0 ± 0.0
Asn
AsnAla: 2.987 ± 0.333
AsnCys: 0.478 ± 0.111
AsnAsp: 3.059 ± 0.267
AsnGlu: 3.154 ± 0.28
AsnPhe: 1.577 ± 0.184
AsnGly: 4.134 ± 0.372
AsnHis: 1.099 ± 0.175
AsnIle: 3.489 ± 0.386
AsnLys: 4.11 ± 0.357
AsnLeu: 4.493 ± 0.318
AsnMet: 1.243 ± 0.195
AsnAsn: 3.131 ± 0.308
AsnPro: 2.294 ± 0.272
AsnGln: 1.768 ± 0.211
AsnArg: 2.509 ± 0.257
AsnSer: 3.274 ± 0.314
AsnThr: 3.322 ± 0.307
AsnVal: 3.513 ± 0.304
AsnTrp: 0.645 ± 0.115
AsnTyr: 1.984 ± 0.203
AsnXaa: 0.0 ± 0.0
Pro
ProAla: 2.103 ± 0.282
ProCys: 0.215 ± 0.075
ProAsp: 2.294 ± 0.243
ProGlu: 3.131 ± 0.391
ProPhe: 1.123 ± 0.184
ProGly: 1.028 ± 0.142
ProHis: 0.813 ± 0.139
ProIle: 1.84 ± 0.198
ProLys: 2.342 ± 0.276
ProLeu: 2.557 ± 0.249
ProMet: 0.621 ± 0.142
ProAsn: 1.745 ± 0.187
ProPro: 0.884 ± 0.157
ProGln: 1.195 ± 0.16
ProArg: 1.099 ± 0.187
ProSer: 2.151 ± 0.225
ProThr: 2.653 ± 0.325
ProVal: 2.222 ± 0.252
ProTrp: 0.239 ± 0.083
ProTyr: 1.816 ± 0.207
ProXaa: 0.0 ± 0.0
Gln
GlnAla: 2.318 ± 0.204
GlnCys: 0.191 ± 0.065
GlnAsp: 1.888 ± 0.217
GlnGlu: 3.441 ± 0.334
GlnPhe: 1.243 ± 0.19
GlnGly: 1.768 ± 0.205
GlnHis: 0.597 ± 0.128
GlnIle: 2.151 ± 0.251
GlnLys: 2.461 ± 0.241
GlnLeu: 3.417 ± 0.288
GlnMet: 1.314 ± 0.193
GlnAsn: 2.103 ± 0.244
GlnPro: 1.338 ± 0.212
GlnGln: 2.222 ± 0.314
GlnArg: 2.199 ± 0.245
GlnSer: 2.366 ± 0.255
GlnThr: 1.577 ± 0.224
GlnVal: 2.796 ± 0.256
GlnTrp: 0.335 ± 0.079
GlnTyr: 1.314 ± 0.187
GlnXaa: 0.0 ± 0.0
Arg
ArgAla: 2.151 ± 0.224
ArgCys: 0.263 ± 0.085
ArgAsp: 2.485 ± 0.316
ArgGlu: 3.848 ± 0.323
ArgPhe: 2.127 ± 0.228
ArgGly: 3.154 ± 0.273
ArgHis: 0.741 ± 0.139
ArgIle: 3.298 ± 0.286
ArgLys: 3.632 ± 0.317
ArgLeu: 4.134 ± 0.349
ArgMet: 1.41 ± 0.211
ArgAsn: 2.031 ± 0.254
ArgPro: 1.051 ± 0.152
ArgGln: 1.864 ± 0.236
ArgArg: 2.222 ± 0.223
ArgSer: 2.222 ± 0.226
ArgThr: 2.748 ± 0.315
ArgVal: 3.513 ± 0.298
ArgTrp: 0.335 ± 0.092
ArgTyr: 2.127 ± 0.213
ArgXaa: 0.0 ± 0.0
Ser
SerAla: 3.226 ± 0.293
SerCys: 0.478 ± 0.123
SerAsp: 3.585 ± 0.327
SerGlu: 4.564 ± 0.371
SerPhe: 2.963 ± 0.244
SerGly: 4.684 ± 0.443
SerHis: 1.123 ± 0.177
SerIle: 4.11 ± 0.358
SerLys: 5.21 ± 0.407
SerLeu: 5.281 ± 0.29
SerMet: 1.96 ± 0.218
SerAsn: 3.322 ± 0.34
SerPro: 1.792 ± 0.208
SerGln: 2.294 ± 0.259
SerArg: 2.605 ± 0.275
SerSer: 5.066 ± 0.444
SerThr: 4.325 ± 0.416
SerVal: 4.612 ± 0.349
SerTrp: 0.884 ± 0.147
SerTyr: 2.916 ± 0.286
SerXaa: 0.0 ± 0.0
Thr
ThrAla: 4.087 ± 0.413
ThrCys: 0.287 ± 0.076
ThrAsp: 4.493 ± 0.421
ThrGlu: 4.517 ± 0.326
ThrPhe: 2.246 ± 0.254
ThrGly: 5.042 ± 0.362
ThrHis: 1.219 ± 0.182
ThrIle: 3.871 ± 0.362
ThrLys: 3.919 ± 0.317
ThrLeu: 5.855 ± 0.447
ThrMet: 1.147 ± 0.179
ThrAsn: 2.796 ± 0.319
ThrPro: 2.748 ± 0.248
ThrGln: 2.509 ± 0.21
ThrArg: 2.796 ± 0.215
ThrSer: 3.298 ± 0.285
ThrThr: 4.421 ± 0.437
ThrVal: 4.995 ± 0.374
ThrTrp: 0.574 ± 0.129
ThrTyr: 2.963 ± 0.321
ThrXaa: 0.0 ± 0.0
Val
ValAla: 4.23 ± 0.35
ValCys: 0.478 ± 0.1
ValAsp: 5.138 ± 0.379
ValGlu: 5.329 ± 0.355
ValPhe: 2.677 ± 0.252
ValGly: 3.824 ± 0.292
ValHis: 1.482 ± 0.194
ValIle: 3.585 ± 0.286
ValLys: 4.708 ± 0.368
ValLeu: 5.162 ± 0.363
ValMet: 1.577 ± 0.229
ValAsn: 3.728 ± 0.323
ValPro: 2.82 ± 0.254
ValGln: 2.605 ± 0.184
ValArg: 3.226 ± 0.282
ValSer: 5.019 ± 0.363
ValThr: 4.923 ± 0.438
ValVal: 4.421 ± 0.328
ValTrp: 0.717 ± 0.139
ValTyr: 3.107 ± 0.277
ValXaa: 0.0 ± 0.0
Trp
TrpAla: 0.478 ± 0.11
TrpCys: 0.119 ± 0.058
TrpAsp: 0.908 ± 0.153
TrpGlu: 1.051 ± 0.154
TrpPhe: 0.43 ± 0.107
TrpGly: 0.574 ± 0.116
TrpHis: 0.167 ± 0.076
TrpIle: 0.669 ± 0.129
TrpLys: 0.717 ± 0.113
TrpLeu: 0.669 ± 0.126
TrpMet: 0.287 ± 0.101
TrpAsn: 0.55 ± 0.101
TrpPro: 0.0 ± 0.0
TrpGln: 0.335 ± 0.096
TrpArg: 0.454 ± 0.105
TrpSer: 0.621 ± 0.118
TrpThr: 0.717 ± 0.112
TrpVal: 1.075 ± 0.182
TrpTrp: 0.191 ± 0.079
TrpTyr: 0.717 ± 0.16
TrpXaa: 0.0 ± 0.0
Tyr
TyrAla: 2.892 ± 0.248
TyrCys: 0.287 ± 0.08
TyrAsp: 3.083 ± 0.272
TyrGlu: 3.298 ± 0.274
TyrPhe: 1.434 ± 0.188
TyrGly: 2.7 ± 0.33
TyrHis: 0.789 ± 0.154
TyrIle: 3.059 ± 0.266
TyrLys: 2.868 ± 0.31
TyrLeu: 4.588 ± 0.323
TyrMet: 0.884 ± 0.132
TyrAsn: 2.724 ± 0.292
TyrPro: 1.267 ± 0.161
TyrGln: 1.601 ± 0.198
TyrArg: 2.533 ± 0.266
TyrSer: 2.581 ± 0.268
TyrThr: 2.987 ± 0.304
TyrVal: 2.366 ± 0.236
TyrTrp: 0.454 ± 0.096
TyrTyr: 1.721 ± 0.199
TyrXaa: 0.0 ± 0.0
Xaa
XaaAla: 0.0 ± 0.0
XaaCys: 0.0 ± 0.0
XaaAsp: 0.0 ± 0.0
XaaGlu: 0.0 ± 0.0
XaaPhe: 0.0 ± 0.0
XaaGly: 0.0 ± 0.0
XaaHis: 0.0 ± 0.0
XaaIle: 0.0 ± 0.0
XaaLys: 0.0 ± 0.0
XaaLeu: 0.0 ± 0.0
XaaMet: 0.0 ± 0.0
XaaAsn: 0.0 ± 0.0
XaaPro: 0.0 ± 0.0
XaaGln: 0.0 ± 0.0
XaaArg: 0.0 ± 0.0
XaaSer: 0.0 ± 0.0
XaaThr: 0.0 ± 0.0
XaaVal: 0.0 ± 0.0
XaaTrp: 0.0 ± 0.0
XaaTyr: 0.0 ± 0.0
XaaXaa: 0.0 ± 0.0
Statistics based on 211 proteins (41846 amino acids)

Note: The error was estimated by bootstrapping (100 replicates) at the protein level.
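
A protein-level bootstrap of this kind can be sketched as below. This is an illustration under stated assumptions, not the database's actual code: whole proteins are resampled with replacement, the frequency of one dipeptide is recomputed for each replicate, and the standard deviation across replicates is taken as the error. The function names, the fixed seed, the toy sequences, and the use of the standard deviation as the reported error are all hypothetical choices.

```python
import random
import statistics


def pair_frequency(proteins, pair):
    """Per-mille frequency of one dipeptide over overlapping windows."""
    hits = total = 0
    for seq in proteins:
        for i in range(len(seq) - 1):
            total += 1
            if seq[i:i + 2] == pair:
                hits += 1
    return 1000.0 * hits / total if total else 0.0


def bootstrap_error(proteins, pair, replicates=100, seed=0):
    """Resample whole proteins with replacement; return (mean, std dev)."""
    rng = random.Random(seed)  # fixed seed only to make the sketch repeatable
    estimates = []
    for _ in range(replicates):
        sample = rng.choices(proteins, k=len(proteins))
        estimates.append(pair_frequency(sample, pair))
    return statistics.mean(estimates), statistics.stdev(estimates)


# Hypothetical toy proteome (one-letter codes), not data from this page.
toy = ["MKKLAEEL", "MAEELKK", "MGSSHHK", "MKLLEEA"]
mean_ee, err_ee = bootstrap_error(toy, "EE")
```

Resampling proteins rather than individual residues keeps each sequence intact, so the spread reflects variation between proteins.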

The dipeptide statistics above (among other statistics for this proteome) can be downloaded from this CSV file
See this proteome in: uniprot_link
Proteome-pI is available under a Creative Commons Attribution-NoDerivs license; for more details see here

Reference: Kozlowski LP. Proteome-pI 2.0: Proteome Isoelectric Point Database Update. Nucleic Acids Res. 2021, doi: 10.1093/nar/gkab944
Contact: Lukasz P. Kozlowski