Amino acid dipeptide frequency for Bacillus phage Bastille

Apart from single amino acid frequencies, one can also calculate the so-called amino acid dipeptide frequency, i.e. how often each ordered pair of adjacent amino acids occurs in the proteins.
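
As a rough illustration of how such a frequency can be computed, the sketch below counts overlapping residue pairs in a list of protein sequences and reports them in per mille. It is only a minimal sketch: the function name dipeptide_frequencies, the one-letter input sequences, and the normalization by the total number of counted pairs are assumptions, since this page does not state the exact procedure used by the database.

```python
from collections import Counter


def dipeptide_frequencies(proteins):
    """Return {dipeptide: frequency in per mille} over overlapping residue pairs.

    Assumption: frequencies are normalized by the total number of pairs counted.
    """
    counts = Counter()
    for seq in proteins:
        # Overlapping pairs inside one protein; pairs never span two proteins.
        for i in range(len(seq) - 1):
            counts[seq[i:i + 2]] += 1
    total = sum(counts.values())
    if total == 0:
        return {}
    return {pair: 1000.0 * n / total for pair, n in counts.items()}


# Toy input (hypothetical sequences, not taken from the phage proteome).
freqs = dipeptide_frequencies(["MKKLA", "AKK"])
for pair, value in sorted(freqs.items()):
    print(f"{pair}: {value:.3f}")
```

The pair is treated as ordered, so "KL" and "LK" are counted separately, which matches the table where, for example, AlaCys and CysAla have different values.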

There are 400 possible dipeptides (441 if Xaa is counted as an additional 21st amino acid representing all non-standard or unknown amino acids). Thus, if dipeptides occurred at random in proteins, with all amino acids equally frequent, each dipeptide would be present with a frequency of 0.25% (2.5‰). As this is not the case in nature, for better visibility all dipeptides that occur more often than expected are marked in red, and those that are underrepresented are marked in blue in the table.
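
The arithmetic behind the expected value can be sketched as follows. The helper names expected_per_mille and classify are hypothetical, and comparing each value directly against the uniform expectation is an assumption; the page does not state the exact threshold used for the red and blue marking.

```python
STANDARD_AMINO_ACIDS = 20


def expected_per_mille(alphabet_size=STANDARD_AMINO_ACIDS):
    """Uniform expectation for one dipeptide, in per mille."""
    return 1000.0 / alphabet_size ** 2


def classify(observed_per_mille, expected=None):
    """Label a value as 'over', 'under' or 'expected' (assumed rule)."""
    if expected is None:
        expected = expected_per_mille()
    if observed_per_mille > expected:
        return "over"
    if observed_per_mille < expected:
        return "under"
    return "expected"


print(expected_per_mille())   # 2.5 per mille, i.e. 0.25%
print(classify(8.4))          # LysGlu value from the table: over
print(classify(0.148))        # CysCys value from the table: under
```

With Xaa included, expected_per_mille(21) gives roughly 2.27‰ instead of 2.5‰.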

All values are given in per mille (‰), so they need to be multiplied by 10⁻³ to obtain fractions.
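
A small sketch of the unit conversion, using hypothetical helper names; it only restates the per mille definition and is not taken from the database code.

```python
def per_mille_to_fraction(value):
    """Convert a per mille value to a plain fraction (multiply by 10^-3)."""
    return value * 1e-3


def per_mille_to_percent(value):
    """Convert a per mille value to percent (divide by 10)."""
    return value / 10.0


# AlaAla is listed as 4.168 per mille: about 0.004168 as a fraction, about 0.4168%.
print(per_mille_to_fraction(4.168))
print(per_mille_to_percent(4.168))
```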

For more information, see the sequence space article on Wikipedia.

Rows: first amino acid of the dipeptide; columns: second amino acid. Each cell gives frequency ± error in ‰.

1st \ 2nd | Ala | Cys | Asp | Glu | Phe | Gly | His | Ile | Lys | Leu | Met | Asn | Pro | Gln | Arg | Ser | Thr | Val | Trp | Tyr | Xaa
Ala | 4.168 ± 0.443 | 0.465 ± 0.086 | 3.597 ± 0.282 | 3.851 ± 0.276 | 2.581 ± 0.24 | 3.809 ± 0.372 | 1.206 ± 0.179 | 3.618 ± 0.343 | 5.057 ± 0.294 | 5.586 ± 0.295 | 1.735 ± 0.213 | 3.068 ± 0.308 | 2.264 ± 0.267 | 2.412 ± 0.23 | 2.729 ± 0.292 | 3.449 ± 0.29 | 4.041 ± 0.37 | 3.618 ± 0.263 | 0.783 ± 0.111 | 2.645 ± 0.23 | 0.0 ± 0.0
Cys | 0.36 ± 0.082 | 0.148 ± 0.061 | 0.825 ± 0.145 | 0.614 ± 0.103 | 0.212 ± 0.079 | 0.635 ± 0.146 | 0.254 ± 0.075 | 0.592 ± 0.102 | 0.656 ± 0.139 | 0.635 ± 0.107 | 0.339 ± 0.084 | 0.296 ± 0.082 | 0.317 ± 0.086 | 0.169 ± 0.069 | 0.233 ± 0.07 | 0.487 ± 0.112 | 0.487 ± 0.111 | 0.592 ± 0.097 | 0.127 ± 0.051 | 0.529 ± 0.109 | 0.0 ± 0.0
Asp | 4.02 ± 0.342 | 0.614 ± 0.108 | 4.147 ± 0.338 | 6.009 ± 0.427 | 3.195 ± 0.242 | 4.253 ± 0.329 | 0.592 ± 0.095 | 5.417 ± 0.333 | 6.051 ± 0.436 | 5.057 ± 0.324 | 2.095 ± 0.228 | 3.703 ± 0.256 | 1.481 ± 0.187 | 0.889 ± 0.131 | 3.153 ± 0.266 | 3.385 ± 0.263 | 3.914 ± 0.348 | 4.697 ± 0.305 | 0.889 ± 0.15 | 3.068 ± 0.258 | 0.0 ± 0.0
Glu | 4.782 ± 0.359 | 0.571 ± 0.124 | 5.671 ± 0.495 | 8.146 ± 0.756 | 3.533 ± 0.341 | 4.613 ± 0.3 | 1.566 ± 0.226 | 5.628 ± 0.381 | 6.263 ± 0.543 | 8.315 ± 0.602 | 2.983 ± 0.291 | 3.47 ± 0.244 | 1.82 ± 0.23 | 3.237 ± 0.324 | 3.364 ± 0.258 | 3.914 ± 0.332 | 3.364 ± 0.24 | 5.988 ± 0.469 | 1.143 ± 0.191 | 3.343 ± 0.316 | 0.0 ± 0.0
Phe | 2.349 ± 0.219 | 0.55 ± 0.117 | 3.026 ± 0.233 | 2.856 ± 0.256 | 1.27 ± 0.21 | 2.074 ± 0.186 | 0.91 ± 0.138 | 2.624 ± 0.229 | 2.983 ± 0.228 | 3.364 ± 0.257 | 0.952 ± 0.143 | 2.687 ± 0.244 | 1.016 ± 0.162 | 1.1 ± 0.154 | 1.523 ± 0.194 | 2.433 ± 0.234 | 2.856 ± 0.293 | 3.153 ± 0.253 | 0.381 ± 0.099 | 1.862 ± 0.204 | 0.0 ± 0.0
Gly | 3.597 ± 0.47 | 0.487 ± 0.12 | 4.189 ± 0.3 | 4.613 ± 0.307 | 3.026 ± 0.218 | 5.205 ± 0.689 | 0.889 ± 0.162 | 4.549 ± 0.396 | 4.888 ± 0.302 | 4.274 ± 0.31 | 2.074 ± 0.217 | 3.428 ± 0.344 | 0.0 ± 0.0 | 1.925 ± 0.238 | 2.603 ± 0.248 | 3.597 ± 0.415 | 3.682 ± 0.348 | 4.951 ± 0.326 | 0.931 ± 0.144 | 3.322 ± 0.24 | 0.0 ± 0.0
His | 0.994 ± 0.143 | 0.148 ± 0.059 | 1.206 ± 0.167 | 1.354 ± 0.181 | 0.825 ± 0.163 | 1.058 ± 0.159 | 0.444 ± 0.106 | 1.439 ± 0.184 | 1.523 ± 0.197 | 1.185 ± 0.177 | 0.635 ± 0.138 | 1.206 ± 0.163 | 0.804 ± 0.122 | 0.487 ± 0.082 | 0.719 ± 0.133 | 1.037 ± 0.175 | 1.164 ± 0.165 | 1.46 ± 0.175 | 0.254 ± 0.067 | 0.868 ± 0.145 | 0.0 ± 0.0
Ile | 4.38 ± 0.306 | 0.55 ± 0.118 | 5.057 ± 0.313 | 5.882 ± 0.505 | 1.841 ± 0.173 | 3.703 ± 0.325 | 1.143 ± 0.176 | 4.443 ± 0.342 | 6.221 ± 0.364 | 4.909 ± 0.421 | 1.862 ± 0.203 | 3.724 ± 0.282 | 2.349 ± 0.224 | 2.116 ± 0.217 | 3.089 ± 0.236 | 3.682 ± 0.276 | 4.316 ± 0.308 | 4.338 ± 0.306 | 0.529 ± 0.106 | 2.666 ± 0.229 | 0.0 ± 0.0
Lys | 4.93 ± 0.295 | 0.529 ± 0.13 | 5.607 ± 0.376 | 8.4 ± 0.596 | 2.708 ± 0.211 | 5.586 ± 0.315 | 1.523 ± 0.194 | 4.549 ± 0.279 | 5.967 ± 0.509 | 6.411 ± 0.435 | 2.539 ± 0.277 | 3.66 ± 0.282 | 2.031 ± 0.207 | 2.899 ± 0.274 | 3.914 ± 0.354 | 3.724 ± 0.263 | 3.703 ± 0.266 | 5.819 ± 0.372 | 0.804 ± 0.151 | 3.216 ± 0.285 | 0.0 ± 0.0
Leu | 5.417 ± 0.302 | 0.741 ± 0.151 | 6.199 ± 0.348 | 7.532 ± 0.509 | 3.216 ± 0.286 | 4.951 ± 0.302 | 1.883 ± 0.223 | 4.845 ± 0.483 | 6.432 ± 0.424 | 6.157 ± 0.476 | 2.2 ± 0.237 | 3.914 ± 0.273 | 2.751 ± 0.275 | 2.92 ± 0.255 | 3.724 ± 0.272 | 5.226 ± 0.336 | 5.332 ± 0.298 | 5.819 ± 0.434 | 0.677 ± 0.124 | 2.835 ± 0.225 | 0.0 ± 0.0
Met | 2.137 ± 0.22 | 0.212 ± 0.065 | 1.46 ± 0.19 | 2.264 ± 0.241 | 1.016 ± 0.185 | 1.756 ± 0.241 | 0.529 ± 0.104 | 2.116 ± 0.236 | 2.454 ± 0.231 | 2.518 ± 0.231 | 0.487 ± 0.113 | 1.714 ± 0.189 | 0.783 ± 0.11 | 0.846 ± 0.135 | 1.418 ± 0.188 | 1.989 ± 0.201 | 1.841 ± 0.237 | 1.693 ± 0.191 | 0.402 ± 0.11 | 1.185 ± 0.168 | 0.0 ± 0.0
Asn | 3.068 ± 0.258 | 0.508 ± 0.11 | 3.153 ± 0.281 | 3.449 ± 0.259 | 2.052 ± 0.228 | 4.401 ± 0.417 | 0.931 ± 0.132 | 3.491 ± 0.307 | 4.274 ± 0.286 | 4.507 ± 0.299 | 1.904 ± 0.209 | 3.385 ± 0.278 | 2.264 ± 0.222 | 1.693 ± 0.203 | 2.729 ± 0.27 | 2.645 ± 0.244 | 2.962 ± 0.329 | 3.47 ± 0.262 | 0.592 ± 0.114 | 2.751 ± 0.22 | 0.0 ± 0.0
Pro | 1.777 ± 0.195 | 0.169 ± 0.058 | 2.2 ± 0.227 | 2.349 ± 0.265 | 1.143 ± 0.182 | 0.042 ± 0.029 | 0.571 ± 0.122 | 1.756 ± 0.164 | 2.137 ± 0.289 | 2.497 ± 0.224 | 0.783 ± 0.161 | 2.052 ± 0.267 | 0.698 ± 0.121 | 0.931 ± 0.168 | 1.248 ± 0.152 | 2.179 ± 0.253 | 2.2 ± 0.306 | 2.052 ± 0.27 | 0.296 ± 0.076 | 1.523 ± 0.185 | 0.0 ± 0.0
Gln | 2.052 ± 0.235 | 0.212 ± 0.063 | 1.798 ± 0.185 | 2.962 ± 0.294 | 1.058 ± 0.162 | 2.137 ± 0.304 | 0.656 ± 0.109 | 1.925 ± 0.206 | 2.349 ± 0.201 | 2.666 ± 0.283 | 0.931 ± 0.148 | 1.714 ± 0.194 | 1.333 ± 0.217 | 1.714 ± 0.449 | 1.291 ± 0.168 | 1.925 ± 0.233 | 1.925 ± 0.218 | 2.285 ± 0.235 | 0.444 ± 0.111 | 1.502 ± 0.16 | 0.0 ± 0.0
Arg | 2.687 ± 0.246 | 0.381 ± 0.086 | 2.962 ± 0.268 | 3.576 ± 0.345 | 2.095 ± 0.219 | 2.751 ± 0.259 | 0.698 ± 0.111 | 3.28 ± 0.262 | 3.533 ± 0.306 | 4.274 ± 0.292 | 1.291 ± 0.153 | 2.327 ± 0.228 | 0.952 ± 0.153 | 1.502 ± 0.185 | 1.883 ± 0.267 | 2.052 ± 0.244 | 2.645 ± 0.284 | 2.962 ± 0.256 | 0.444 ± 0.088 | 2.052 ± 0.22 | 0.0 ± 0.0
Ser | 3.301 ± 0.326 | 0.296 ± 0.074 | 3.005 ± 0.258 | 3.936 ± 0.31 | 2.624 ± 0.255 | 4.189 ± 0.292 | 1.1 ± 0.164 | 4.316 ± 0.282 | 3.745 ± 0.272 | 4.993 ± 0.319 | 1.545 ± 0.187 | 2.856 ± 0.314 | 1.714 ± 0.238 | 1.862 ± 0.232 | 2.412 ± 0.212 | 3.195 ± 0.318 | 3.407 ± 0.297 | 3.682 ± 0.278 | 0.931 ± 0.163 | 2.835 ± 0.265 | 0.0 ± 0.0
Thr | 3.576 ± 0.447 | 0.614 ± 0.137 | 3.533 ± 0.254 | 3.809 ± 0.321 | 2.454 ± 0.308 | 4.041 ± 0.397 | 1.016 ± 0.152 | 3.893 ± 0.292 | 4.105 ± 0.286 | 5.269 ± 0.347 | 1.333 ± 0.163 | 3.216 ± 0.331 | 2.327 ± 0.282 | 1.798 ± 0.206 | 2.793 ± 0.258 | 3.597 ± 0.316 | 3.872 ± 0.328 | 4.845 ± 0.405 | 0.741 ± 0.12 | 3.428 ± 0.281 | 0.0 ± 0.0
Val | 4.232 ± 0.295 | 0.741 ± 0.11 | 5.12 ± 0.323 | 5.395 ± 0.358 | 2.899 ± 0.253 | 3.957 ± 0.315 | 1.46 ± 0.197 | 4.147 ± 0.344 | 5.544 ± 0.37 | 5.565 ± 0.31 | 1.65 ± 0.167 | 3.893 ± 0.334 | 2.116 ± 0.249 | 2.37 ± 0.233 | 2.878 ± 0.27 | 3.745 ± 0.243 | 4.951 ± 0.329 | 5.332 ± 0.384 | 0.677 ± 0.102 | 3.682 ± 0.307 | 0.0 ± 0.0
Trp | 0.614 ± 0.094 | 0.148 ± 0.05 | 0.91 ± 0.147 | 1.079 ± 0.152 | 0.614 ± 0.124 | 0.487 ± 0.099 | 0.402 ± 0.104 | 0.783 ± 0.141 | 0.614 ± 0.119 | 1.143 ± 0.177 | 0.212 ± 0.069 | 0.825 ± 0.17 | 0.0 ± 0.0 | 0.444 ± 0.096 | 0.487 ± 0.106 | 0.931 ± 0.166 | 0.656 ± 0.128 | 0.741 ± 0.124 | 0.169 ± 0.061 | 0.571 ± 0.094 | 0.0 ± 0.0
Tyr | 2.243 ± 0.213 | 0.423 ± 0.094 | 2.962 ± 0.271 | 3.639 ± 0.331 | 1.608 ± 0.214 | 2.581 ± 0.214 | 1.037 ± 0.145 | 3.385 ± 0.307 | 3.745 ± 0.306 | 3.449 ± 0.314 | 1.248 ± 0.158 | 3.089 ± 0.208 | 1.587 ± 0.173 | 1.566 ± 0.199 | 2.158 ± 0.218 | 2.751 ± 0.304 | 2.941 ± 0.244 | 2.856 ± 0.213 | 0.614 ± 0.101 | 2.222 ± 0.224 | 0.0 ± 0.0
Xaa | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0

Statistics based on 273 proteins (47263 amino acids)

Note: The error was estimated by bootstrapping (100 replicates) at the protein level.
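
One way to read "bootstrapping at the protein level" is sketched below: whole proteins are resampled with replacement, the dipeptide frequency is recomputed for each resampled set, and the spread of those estimates is reported as the error. This is a generic bootstrap under stated assumptions, not the database's confirmed procedure; the function names, the use of the standard deviation as the error, and the fixed seed are all hypothetical choices.

```python
import random
import statistics


def pair_frequency(proteins, pair):
    """Frequency (per mille) of one dipeptide among all overlapping pairs."""
    total = 0
    hits = 0
    for seq in proteins:
        for i in range(len(seq) - 1):
            total += 1
            if seq[i:i + 2] == pair:
                hits += 1
    return 1000.0 * hits / total if total else 0.0


def bootstrap_error(proteins, pair, replicates=100, seed=0):
    """Standard deviation of the frequency over protein-level resamples."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    estimates = []
    for _ in range(replicates):
        # Resample whole proteins with replacement, keeping the set size.
        sample = rng.choices(proteins, k=len(proteins))
        estimates.append(pair_frequency(sample, pair))
    return statistics.stdev(estimates)


# Toy input (hypothetical sequences, not taken from the phage proteome).
proteins = ["MKKLAE", "AKKE", "MLLKE", "GGKKA"]
err = bootstrap_error(proteins, "KK")
print(f"KK: {pair_frequency(proteins, 'KK'):.3f} ± {err:.3f}")
```

Resampling proteins rather than individual residues keeps each sequence intact, so a dipeptide that is concentrated in a few proteins gets a larger error than one spread evenly across the proteome.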

The dipeptide statistics above (among other statistics for this proteome) can be downloaded from this CSV file
See this proteome in: uniprot_link
Proteome-pI is available under Creative Commons Attribution-NoDerivs license, for more details see here

Reference: Kozlowski LP. Proteome-pI 2.0: Proteome Isoelectric Point Database Update. Nucleic Acids Res. 2021, doi: 10.1093/nar/gkab944
Contact: Lukasz P. Kozlowski