Amino acid dipeptide frequency for Bacillus phage BCP8-2

Apart from single amino acid frequencies, one can also calculate the so-called amino acid dipeptide frequency, i.e. how often each ordered pair of adjacent residues occurs.

There are 400 possible dipeptides (441 if Xaa is counted as an additional 21st amino acid representing all non-standard or unknown amino acids). Thus, if dipeptides occurred at random with equal probability, each of the 400 would be present at a frequency of 0.25% (2.5‰). This is not the case in nature, so for better visibility dipeptides that are more frequent than expected are marked in red in the table, and those that are underrepresented are marked in blue.

All values are given in per mille (‰), so they need to be multiplied by 10⁻³ to obtain fractions.
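
The counting and the comparison against the 0.25% expectation can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the database's own code: the function names (`dipeptide_per_mille`, `classify`, `per_mille_to_fraction`) are hypothetical, sequences are assumed to be in one-letter code, and normalising by the total number of overlapping residue pairs is an assumption, since the page does not state the exact denominator.

```python
from collections import Counter

# The 20 standard residues in one-letter code; anything else is treated as Xaa ("X").
STANDARD = "ACDEFGHIKLMNPQRSTVWY"

# Uniform expectation over the 400 standard dipeptides: 1/400 = 0.25% = 2.5 per mille.
EXPECTED_PER_MILLE = 1000.0 / len(STANDARD) ** 2


def dipeptide_per_mille(proteins):
    """Return {dipeptide: frequency in per mille} over all overlapping residue pairs."""
    counts = Counter()
    for seq in proteins:
        seq = "".join(aa if aa in STANDARD else "X" for aa in seq.upper())
        # Overlapping windows of length 2; a pair never spans two proteins.
        counts.update(seq[i:i + 2] for i in range(len(seq) - 1))
    total = sum(counts.values())  # assumed denominator: all counted pairs
    return {dp: 1000.0 * n / total for dp, n in counts.items()} if total else {}


def classify(freq_per_mille, expected=EXPECTED_PER_MILLE):
    """Label a frequency relative to the uniform expectation."""
    if freq_per_mille > expected:
        return "over"
    if freq_per_mille < expected:
        return "under"
    return "as expected"


def per_mille_to_fraction(value):
    """Convert a per-mille value to a plain fraction (multiply by 10^-3)."""
    return value * 1e-3
```

With these helpers, `classify(8.896)` (the GluGlu value in the table) gives "over" and `classify(0.087)` (GlyPro) gives "under".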

For more information, see the sequence space article on Wikipedia.

Ala Cys Asp Glu Phe Gly His Ile Lys Leu Met Asn Pro Gln Arg Ser Thr Val Trp Tyr Xaa
Ala
AlaAla: 2.856 ± 0.369
AlaCys: 0.262 ± 0.076
AlaAsp: 3.445 ± 0.273
AlaGlu: 4.426 ± 0.377
AlaPhe: 2.268 ± 0.183
AlaGly: 4.295 ± 0.416
AlaHis: 0.85 ± 0.127
AlaIle: 4.077 ± 0.291
AlaLys: 4.753 ± 0.328
AlaLeu: 4.666 ± 0.328
AlaMet: 1.722 ± 0.19
AlaAsn: 2.856 ± 0.317
AlaPro: 2.18 ± 0.371
AlaGln: 2.355 ± 0.276
AlaArg: 2.704 ± 0.305
AlaSer: 3.161 ± 0.356
AlaThr: 4.186 ± 0.364
AlaVal: 3.314 ± 0.282
AlaTrp: 0.741 ± 0.118
AlaTyr: 3.074 ± 0.316
AlaXaa: 0.0 ± 0.0
Cys
CysAla: 0.262 ± 0.076
CysCys: 0.153 ± 0.066
CysAsp: 0.589 ± 0.113
CysGlu: 0.567 ± 0.112
CysPhe: 0.305 ± 0.079
CysGly: 0.676 ± 0.135
CysHis: 0.174 ± 0.06
CysIle: 0.436 ± 0.118
CysLys: 0.632 ± 0.121
CysLeu: 0.501 ± 0.105
CysMet: 0.109 ± 0.059
CysAsn: 0.48 ± 0.108
CysPro: 0.371 ± 0.099
CysGln: 0.196 ± 0.063
CysArg: 0.283 ± 0.077
CysSer: 0.458 ± 0.097
CysThr: 0.458 ± 0.131
CysVal: 0.61 ± 0.108
CysTrp: 0.131 ± 0.06
CysTyr: 0.589 ± 0.109
CysXaa: 0.0 ± 0.0
Asp
AspAla: 3.314 ± 0.274
AspCys: 0.501 ± 0.126
AspAsp: 3.161 ± 0.285
AspGlu: 4.818 ± 0.299
AspPhe: 2.66 ± 0.245
AspGly: 4.426 ± 0.337
AspHis: 0.894 ± 0.17
AspIle: 5.124 ± 0.382
AspLys: 5.364 ± 0.376
AspLeu: 5.124 ± 0.345
AspMet: 1.57 ± 0.185
AspAsn: 3.379 ± 0.313
AspPro: 2.311 ± 0.226
AspGln: 1.548 ± 0.198
AspArg: 2.856 ± 0.293
AspSer: 3.488 ± 0.307
AspThr: 3.14 ± 0.258
AspVal: 4.164 ± 0.292
AspTrp: 0.981 ± 0.15
AspTyr: 3.728 ± 0.304
AspXaa: 0.0 ± 0.0
Glu
GluAla: 4.121 ± 0.275
GluCys: 0.545 ± 0.114
GluAsp: 5.451 ± 0.429
GluGlu: 8.896 ± 0.891
GluPhe: 3.205 ± 0.294
GluGly: 4.709 ± 0.341
GluHis: 1.483 ± 0.231
GluIle: 5.364 ± 0.398
GluLys: 5.974 ± 0.441
GluLeu: 7.391 ± 0.487
GluMet: 2.442 ± 0.249
GluAsn: 3.663 ± 0.292
GluPro: 2.137 ± 0.441
GluGln: 3.314 ± 0.265
GluArg: 3.51 ± 0.289
GluSer: 3.685 ± 0.343
GluThr: 3.379 ± 0.293
GluVal: 5.712 ± 0.372
GluTrp: 1.156 ± 0.155
GluTyr: 3.968 ± 0.293
GluXaa: 0.0 ± 0.0
Phe
PheAla: 2.049 ± 0.18
PheCys: 0.392 ± 0.091
PheAsp: 2.66 ± 0.268
PheGlu: 2.878 ± 0.224
PhePhe: 1.33 ± 0.185
PheGly: 2.311 ± 0.206
PheHis: 0.829 ± 0.188
PheIle: 2.507 ± 0.26
PheLys: 2.943 ± 0.252
PheLeu: 2.791 ± 0.217
PheMet: 1.286 ± 0.177
PheAsn: 2.355 ± 0.275
PhePro: 0.959 ± 0.154
PheGln: 1.308 ± 0.16
PheArg: 1.395 ± 0.158
PheSer: 2.943 ± 0.288
PheThr: 2.507 ± 0.294
PheVal: 2.704 ± 0.262
PheTrp: 0.305 ± 0.089
PheTyr: 1.613 ± 0.177
PheXaa: 0.0 ± 0.0
Gly
GlyAla: 3.554 ± 0.528
GlyCys: 0.589 ± 0.105
GlyAsp: 3.598 ± 0.323
GlyGlu: 4.709 ± 0.335
GlyPhe: 2.42 ± 0.239
GlyGly: 5.233 ± 0.696
GlyHis: 1.068 ± 0.162
GlyIle: 4.382 ± 0.32
GlyLys: 5.429 ± 0.305
GlyLeu: 4.797 ± 0.309
GlyMet: 1.853 ± 0.257
GlyAsn: 4.186 ± 0.361
GlyPro: 0.087 ± 0.047
GlyGln: 2.442 ± 0.277
GlyArg: 2.922 ± 0.248
GlySer: 4.666 ± 0.409
GlyThr: 5.124 ± 0.444
GlyVal: 4.557 ± 0.312
GlyTrp: 1.003 ± 0.146
GlyTyr: 3.467 ± 0.319
GlyXaa: 0.0 ± 0.0
His
HisAla: 1.09 ± 0.17
HisCys: 0.087 ± 0.044
HisAsp: 0.938 ± 0.157
HisGlu: 1.286 ± 0.187
HisPhe: 0.61 ± 0.126
HisGly: 0.916 ± 0.156
HisHis: 0.392 ± 0.09
HisIle: 1.417 ± 0.178
HisLys: 1.265 ± 0.2
HisLeu: 1.744 ± 0.235
HisMet: 0.327 ± 0.097
HisAsn: 1.112 ± 0.154
HisPro: 0.654 ± 0.112
HisGln: 0.458 ± 0.111
HisArg: 1.265 ± 0.169
HisSer: 0.872 ± 0.146
HisThr: 1.352 ± 0.201
HisVal: 1.679 ± 0.209
HisTrp: 0.174 ± 0.077
HisTyr: 0.763 ± 0.157
HisXaa: 0.0 ± 0.0
Ile
IleAla: 4.491 ± 0.345
IleCys: 0.392 ± 0.101
IleAsp: 4.339 ± 0.326
IleGlu: 5.538 ± 0.357
IlePhe: 1.94 ± 0.215
IleGly: 4.012 ± 0.288
IleHis: 1.374 ± 0.198
IleIle: 3.707 ± 0.319
IleLys: 5.451 ± 0.401
IleLeu: 4.295 ± 0.332
IleMet: 1.722 ± 0.194
IleAsn: 3.816 ± 0.324
IlePro: 2.943 ± 0.253
IleGln: 2.159 ± 0.192
IleArg: 2.987 ± 0.275
IleSer: 4.491 ± 0.315
IleThr: 4.339 ± 0.396
IleVal: 4.688 ± 0.357
IleTrp: 0.436 ± 0.109
IleTyr: 2.311 ± 0.196
IleXaa: 0.0 ± 0.0
Lys
LysAla: 4.557 ± 0.348
LysCys: 0.458 ± 0.103
LysAsp: 5.778 ± 0.325
LysGlu: 8.089 ± 0.514
LysPhe: 2.355 ± 0.244
LysGly: 5.451 ± 0.407
LysHis: 1.788 ± 0.236
LysIle: 4.382 ± 0.316
LysLys: 6.367 ± 0.479
LysLeu: 6.694 ± 0.359
LysMet: 2.115 ± 0.19
LysAsn: 3.641 ± 0.328
LysPro: 2.725 ± 0.242
LysGln: 3.576 ± 0.251
LysArg: 3.816 ± 0.346
LysSer: 3.859 ± 0.337
LysThr: 4.361 ± 0.317
LysVal: 5.756 ± 0.355
LysTrp: 0.916 ± 0.142
LysTyr: 3.292 ± 0.324
LysXaa: 0.0 ± 0.0
Leu
LeuAla: 5.32 ± 0.345
LeuCys: 0.654 ± 0.121
LeuAsp: 5.255 ± 0.393
LeuGlu: 6.672 ± 0.413
LeuPhe: 3.161 ± 0.277
LeuGly: 5.058 ± 0.327
LeuHis: 1.417 ± 0.179
LeuIle: 4.382 ± 0.369
LeuLys: 6.345 ± 0.363
LeuLeu: 6.105 ± 0.539
LeuMet: 2.071 ± 0.25
LeuAsn: 3.99 ± 0.279
LeuPro: 2.943 ± 0.27
LeuGln: 3.598 ± 0.285
LeuArg: 3.99 ± 0.313
LeuSer: 4.426 ± 0.334
LeuThr: 5.385 ± 0.355
LeuVal: 5.364 ± 0.402
LeuTrp: 0.872 ± 0.159
LeuTyr: 3.096 ± 0.284
LeuXaa: 0.0 ± 0.0
Met
MetAla: 1.613 ± 0.156
MetCys: 0.196 ± 0.068
MetAsp: 1.766 ± 0.186
MetGlu: 2.028 ± 0.185
MetPhe: 1.134 ± 0.149
MetGly: 1.265 ± 0.173
MetHis: 0.392 ± 0.08
MetIle: 1.701 ± 0.234
MetLys: 2.769 ± 0.225
MetLeu: 2.18 ± 0.215
MetMet: 0.785 ± 0.167
MetAsn: 1.722 ± 0.18
MetPro: 0.676 ± 0.139
MetGln: 0.785 ± 0.125
MetArg: 1.374 ± 0.174
MetSer: 1.744 ± 0.195
MetThr: 1.984 ± 0.205
MetVal: 1.548 ± 0.155
MetTrp: 0.262 ± 0.092
MetTyr: 0.916 ± 0.131
MetXaa: 0.0 ± 0.0
Asn
AsnAla: 3.183 ± 0.305
AsnCys: 0.48 ± 0.108
AsnAsp: 2.529 ± 0.214
AsnGlu: 3.205 ± 0.255
AsnPhe: 2.115 ± 0.199
AsnGly: 4.448 ± 0.401
AsnHis: 1.025 ± 0.162
AsnIle: 3.685 ± 0.298
AsnLys: 4.077 ± 0.313
AsnLeu: 3.837 ± 0.266
AsnMet: 1.766 ± 0.171
AsnAsn: 2.987 ± 0.305
AsnPro: 2.769 ± 0.265
AsnGln: 1.831 ± 0.219
AsnArg: 2.813 ± 0.253
AsnSer: 2.682 ± 0.247
AsnThr: 3.249 ± 0.31
AsnVal: 3.794 ± 0.305
AsnTrp: 0.61 ± 0.109
AsnTyr: 2.268 ± 0.224
AsnXaa: 0.0 ± 0.0
Pro
ProAla: 1.657 ± 0.23
ProCys: 0.327 ± 0.089
ProAsp: 1.94 ± 0.24
ProGlu: 2.377 ± 0.221
ProPhe: 1.526 ± 0.216
ProGly: 1.33 ± 0.205
ProHis: 0.61 ± 0.111
ProIle: 2.18 ± 0.23
ProLys: 3.096 ± 0.298
ProLeu: 2.529 ± 0.261
ProMet: 0.916 ± 0.151
ProAsn: 2.137 ± 0.222
ProPro: 0.894 ± 0.141
ProGln: 1.265 ± 0.201
ProArg: 1.134 ± 0.151
ProSer: 2.071 ± 0.218
ProThr: 2.551 ± 0.361
ProVal: 2.616 ± 0.322
ProTrp: 0.196 ± 0.064
ProTyr: 1.613 ± 0.178
ProXaa: 0.0 ± 0.0
Gln
GlnAla: 2.834 ± 0.327
GlnCys: 0.305 ± 0.089
GlnAsp: 2.137 ± 0.229
GlnGlu: 3.249 ± 0.256
GlnPhe: 1.156 ± 0.18
GlnGly: 2.638 ± 0.297
GlnHis: 0.938 ± 0.188
GlnIle: 1.897 ± 0.185
GlnLys: 2.616 ± 0.257
GlnLeu: 3.336 ± 0.277
GlnMet: 0.894 ± 0.138
GlnAsn: 1.439 ± 0.17
GlnPro: 1.548 ± 0.263
GlnGln: 1.94 ± 0.25
GlnArg: 1.33 ± 0.151
GlnSer: 2.224 ± 0.209
GlnThr: 2.246 ± 0.209
GlnVal: 2.704 ± 0.265
GlnTrp: 0.501 ± 0.087
GlnTyr: 1.33 ± 0.158
GlnXaa: 0.0 ± 0.0
Arg
ArgAla: 2.638 ± 0.284
ArgCys: 0.567 ± 0.121
ArgAsp: 2.464 ± 0.203
ArgGlu: 3.925 ± 0.365
ArgPhe: 2.006 ± 0.198
ArgGly: 2.747 ± 0.33
ArgHis: 0.654 ± 0.114
ArgIle: 3.009 ± 0.244
ArgLys: 3.598 ± 0.353
ArgLeu: 3.903 ± 0.255
ArgMet: 1.657 ± 0.178
ArgAsn: 2.071 ± 0.216
ArgPro: 1.265 ± 0.143
ArgGln: 1.526 ± 0.166
ArgArg: 2.159 ± 0.253
ArgSer: 2.202 ± 0.227
ArgThr: 2.507 ± 0.277
ArgVal: 3.292 ± 0.337
ArgTrp: 0.654 ± 0.107
ArgTyr: 2.049 ± 0.24
ArgXaa: 0.0 ± 0.0
Ser
SerAla: 3.096 ± 0.264
SerCys: 0.327 ± 0.091
SerAsp: 3.641 ± 0.277
SerGlu: 3.488 ± 0.294
SerPhe: 2.813 ± 0.218
SerGly: 3.946 ± 0.374
SerHis: 0.981 ± 0.168
SerIle: 3.728 ± 0.333
SerLys: 4.491 ± 0.258
SerLeu: 4.928 ± 0.347
SerMet: 1.308 ± 0.164
SerAsn: 3.096 ± 0.36
SerPro: 1.853 ± 0.196
SerGln: 1.919 ± 0.189
SerArg: 2.595 ± 0.24
SerSer: 4.012 ± 0.715
SerThr: 3.532 ± 0.342
SerVal: 4.121 ± 0.33
SerTrp: 0.763 ± 0.173
SerTyr: 2.66 ± 0.254
SerXaa: 0.0 ± 0.0
Thr
ThrAla: 3.51 ± 0.384
ThrCys: 0.436 ± 0.09
ThrAsp: 3.99 ± 0.296
ThrGlu: 4.252 ± 0.311
ThrPhe: 2.616 ± 0.274
ThrGly: 4.775 ± 0.383
ThrHis: 1.417 ± 0.183
ThrIle: 4.928 ± 0.438
ThrLys: 4.273 ± 0.366
ThrLeu: 5.102 ± 0.35
ThrMet: 1.286 ± 0.181
ThrAsn: 3.249 ± 0.375
ThrPro: 2.725 ± 0.296
ThrGln: 2.159 ± 0.198
ThrArg: 2.551 ± 0.253
ThrSer: 2.856 ± 0.285
ThrThr: 3.685 ± 0.348
ThrVal: 5.211 ± 0.291
ThrTrp: 0.938 ± 0.159
ThrTyr: 3.031 ± 0.291
ThrXaa: 0.0 ± 0.0
Val
ValAla: 4.622 ± 0.324
ValCys: 0.632 ± 0.12
ValAsp: 4.448 ± 0.32
ValGlu: 5.603 ± 0.474
ValPhe: 2.398 ± 0.228
ValGly: 4.317 ± 0.345
ValHis: 1.047 ± 0.151
ValIle: 4.731 ± 0.336
ValLys: 5.451 ± 0.299
ValLeu: 5.712 ± 0.387
ValMet: 1.613 ± 0.179
ValAsn: 3.467 ± 0.278
ValPro: 2.66 ± 0.294
ValGln: 2.943 ± 0.264
ValArg: 2.943 ± 0.277
ValSer: 4.143 ± 0.321
ValThr: 4.862 ± 0.376
ValVal: 4.491 ± 0.337
ValTrp: 0.85 ± 0.142
ValTyr: 3.096 ± 0.293
ValXaa: 0.0 ± 0.0
Trp
TrpAla: 0.872 ± 0.156
TrpCys: 0.196 ± 0.067
TrpAsp: 1.047 ± 0.16
TrpGlu: 0.938 ± 0.155
TrpPhe: 0.48 ± 0.127
TrpGly: 0.72 ± 0.157
TrpHis: 0.24 ± 0.076
TrpIle: 0.741 ± 0.11
TrpLys: 0.981 ± 0.166
TrpLeu: 0.872 ± 0.133
TrpMet: 0.24 ± 0.073
TrpAsn: 0.654 ± 0.138
TrpPro: 0.0 ± 0.0
TrpGln: 0.458 ± 0.096
TrpArg: 0.545 ± 0.116
TrpSer: 0.763 ± 0.151
TrpThr: 0.698 ± 0.121
TrpVal: 0.894 ± 0.146
TrpTrp: 0.196 ± 0.061
TrpTyr: 0.589 ± 0.104
TrpXaa: 0.0 ± 0.0
Tyr
TyrAla: 2.551 ± 0.206
TyrCys: 0.48 ± 0.1
TyrAsp: 3.379 ± 0.275
TyrGlu: 3.161 ± 0.302
TyrPhe: 1.57 ± 0.196
TyrGly: 2.813 ± 0.236
TyrHis: 0.829 ± 0.127
TyrIle: 3.052 ± 0.298
TyrLys: 4.121 ± 0.36
TyrLeu: 3.598 ± 0.294
TyrMet: 1.134 ± 0.141
TyrAsn: 3.052 ± 0.285
TyrPro: 1.265 ± 0.156
TyrGln: 1.483 ± 0.168
TyrArg: 1.722 ± 0.183
TyrSer: 2.551 ± 0.242
TyrThr: 3.445 ± 0.278
TyrVal: 2.813 ± 0.279
TyrTrp: 0.436 ± 0.095
TyrTyr: 1.897 ± 0.239
TyrXaa: 0.0 ± 0.0
Xaa
XaaAla: 0.0 ± 0.0
XaaCys: 0.0 ± 0.0
XaaAsp: 0.0 ± 0.0
XaaGlu: 0.0 ± 0.0
XaaPhe: 0.0 ± 0.0
XaaGly: 0.0 ± 0.0
XaaHis: 0.0 ± 0.0
XaaIle: 0.0 ± 0.0
XaaLys: 0.0 ± 0.0
XaaLeu: 0.0 ± 0.0
XaaMet: 0.0 ± 0.0
XaaAsn: 0.0 ± 0.0
XaaPro: 0.0 ± 0.0
XaaGln: 0.0 ± 0.0
XaaArg: 0.0 ± 0.0
XaaSer: 0.0 ± 0.0
XaaThr: 0.0 ± 0.0
XaaVal: 0.0 ± 0.0
XaaTrp: 0.0 ± 0.0
XaaTyr: 0.0 ± 0.0
XaaXaa: 0.0 ± 0.0
Statistics based on 220 proteins (45866 amino acids)

Note: The errors were estimated by bootstrapping (100 replicates) at the protein level.
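
A protein-level bootstrap of this kind can be sketched as follows. This is a minimal illustration, not the procedure actually used by the database: the function names (`dipeptide_freq`, `bootstrap_error`) are hypothetical, sequences are assumed to be in one-letter code, and taking the sample standard deviation of the replicate estimates as the reported error is an assumption.

```python
import random
import statistics


def dipeptide_freq(proteins, dipeptide):
    """Per-mille frequency of one dipeptide over all overlapping residue pairs."""
    hits = total = 0
    for seq in proteins:
        for i in range(len(seq) - 1):
            total += 1
            hits += seq[i:i + 2] == dipeptide
    return 1000.0 * hits / total if total else 0.0


def bootstrap_error(proteins, dipeptide, n_boot=100, seed=0):
    """Spread of the frequency across resampled proteomes (assumed: sample std dev)."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    estimates = []
    for _ in range(n_boot):
        # Resample whole proteins with replacement, keeping the proteome size fixed.
        sample = rng.choices(proteins, k=len(proteins))
        estimates.append(dipeptide_freq(sample, dipeptide))
    return statistics.stdev(estimates)
```

Resampling whole proteins rather than individual residue pairs keeps the pairs within a protein together, so the error reflects variation between proteins.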

The dipeptide statistics above (along with other statistics for this proteome) can be downloaded from this CSV file
See this proteome in: uniprot_link
Proteome-pI is available under the Creative Commons Attribution-NoDerivs license; for more details see here

Reference: Kozlowski LP. Proteome-pI 2.0: Proteome Isoelectric Point Database Update. Nucleic Acids Res. 2021, doi: 10.1093/nar/gkab944
Contact: Lukasz P. Kozlowski