Amino acid dipeptide frequency for Bacillus phage TsarBomba

Apart from single amino acid frequencies, one can also calculate the so-called amino acid dipeptide frequency.
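
As a rough illustration of what a dipeptide frequency is, the sketch below counts overlapping residue pairs in a list of protein sequences and reports them in per mille. It is only a minimal sketch: the function name `dipeptide_per_mille`, the one-letter input alphabet, and the normalization by the total number of pairs are assumptions made for this example, not a description of how Proteome-pI computes its values.

```python
from collections import Counter

# The 20 standard amino acids in one-letter code.
STANDARD = set("ACDEFGHIKLMNPQRSTVWY")


def dipeptide_per_mille(sequences):
    """Return {dipeptide: frequency in per mille} for overlapping residue pairs."""
    counts = Counter()
    for seq in sequences:
        # Anything outside the 20 standard residues is mapped to 'X' (Xaa).
        cleaned = "".join(c if c in STANDARD else "X" for c in seq.upper())
        # Each sequence is scanned separately, so no pair spans two proteins.
        counts.update(cleaned[i:i + 2] for i in range(len(cleaned) - 1))
    total = sum(counts.values())
    if total == 0:
        return {}
    # Assumption: normalize by the total number of counted pairs.
    return {pair: 1000.0 * n / total for pair, n in counts.items()}


# Two toy sequences; 'U' is non-standard and is counted as 'X'.
freqs = dipeptide_per_mille(["MKKLA", "AKKU"])
```

The toy input yields seven pairs in total, two of which are `KK`, so `freqs["KK"]` is 2/7 expressed in per mille.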

There are 400 possible dipeptides (441 if we consider Xaa as an additional 21st amino acid representing all non-standard or unknown amino acids). Thus, if dipeptides occurred randomly in proteins, each would be present at a frequency of 0.25% (2.5‰). As this is not the case in nature, for better visibility all dipeptides that are more frequent than expected are marked in red, and those that are underrepresented are marked in blue in the table.
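
The uniform expectation and the over/under-representation labels can be sketched as follows. The helper names are hypothetical, and the simple greater-than/less-than rule is an assumption: the page does not state whether its colouring also takes the error estimate into account.

```python
def uniform_expectation_per_mille(n_residue_types=20):
    """Frequency (in per mille) of each dipeptide if all pairs were equally likely."""
    return 1000.0 / (n_residue_types ** 2)


def classify(observed_per_mille, expected_per_mille):
    """Label a dipeptide relative to the uniform expectation (assumed rule)."""
    if observed_per_mille > expected_per_mille:
        return "over"   # shown in red on the original page
    if observed_per_mille < expected_per_mille:
        return "under"  # shown in blue on the original page
    return "as expected"


expected = uniform_expectation_per_mille()   # 20 * 20 = 400 dipeptides
ala_ala = classify(5.665, expected)          # AlaAla value from the table
ala_cys = classify(0.293, expected)          # AlaCys value from the table
```

With 21 residue types (including Xaa) the same function gives 1000/441, roughly 2.27‰, instead of 2.5‰.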

All values are given in per mille (‰), so they need to be multiplied by 10^-3 to obtain fractions.
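
To make the unit conversion concrete, the following sketch reads one table entry that ends in the form `GluGlu: 8.153 ± 0.852` and converts both the value and its error from per mille to plain fractions. The regular expression and the function name `parse_entry` are assumptions introduced for this example; they are not part of Proteome-pI.

```python
import re

# Two three-letter residue codes, a colon, then "value ± error".
ENTRY = re.compile(r"([A-Z][a-z]{2})([A-Z][a-z]{2}):\s*([\d.]+)\s*±\s*([\d.]+)")


def parse_entry(line):
    """Return (first, second, fraction, error_fraction), or None if no entry is found."""
    match = ENTRY.search(line)
    if match is None:
        return None
    first, second, value, error = match.groups()
    # Per mille -> fraction: multiply by 10^-3.
    return first, second, float(value) * 1e-3, float(error) * 1e-3


entry = parse_entry("GluGlu: 8.153 ± 0.852")
```

Here the GluGlu frequency of 8.153‰ corresponds to a fraction of about 0.008153, that is, about 0.8153%.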

For more information, see the sequence space article on Wikipedia.

Ala Cys Asp Glu Phe Gly His Ile Lys Leu Met Asn Pro Gln Arg Ser Thr Val Trp Tyr Xaa
Ala
AlaAla: 5.665 ± 0.616
AlaCys: 0.293 ± 0.092
AlaAsp: 3.679 ± 0.305
AlaGlu: 4.892 ± 0.37
AlaPhe: 2.467 ± 0.206
AlaGly: 4.035 ± 0.554
AlaHis: 1.108 ± 0.151
AlaIle: 4.244 ± 0.349
AlaLys: 5.059 ± 0.317
AlaLeu: 5.394 ± 0.37
AlaMet: 2.132 ± 0.191
AlaAsn: 3.366 ± 0.374
AlaPro: 2.446 ± 0.403
AlaGln: 3.052 ± 0.289
AlaArg: 2.843 ± 0.255
AlaSer: 3.324 ± 0.376
AlaThr: 4.327 ± 0.461
AlaVal: 3.742 ± 0.312
AlaTrp: 1.045 ± 0.137
AlaTyr: 2.425 ± 0.219
AlaXaa: 0.0 ± 0.0
Cys
CysAla: 0.334 ± 0.086
CysCys: 0.105 ± 0.042
CysAsp: 0.376 ± 0.081
CysGlu: 0.272 ± 0.083
CysPhe: 0.334 ± 0.079
CysGly: 0.669 ± 0.122
CysHis: 0.167 ± 0.065
CysIle: 0.439 ± 0.091
CysLys: 0.815 ± 0.146
CysLeu: 0.564 ± 0.118
CysMet: 0.355 ± 0.079
CysAsn: 0.648 ± 0.126
CysPro: 0.418 ± 0.093
CysGln: 0.063 ± 0.032
CysArg: 0.23 ± 0.059
CysSer: 0.502 ± 0.106
CysThr: 0.523 ± 0.102
CysVal: 0.585 ± 0.117
CysTrp: 0.063 ± 0.036
CysTyr: 0.397 ± 0.101
CysXaa: 0.0 ± 0.0
Asp
AspAla: 3.575 ± 0.266
AspCys: 0.418 ± 0.089
AspAsp: 2.843 ± 0.295
AspGlu: 4.578 ± 0.337
AspPhe: 2.759 ± 0.224
AspGly: 3.972 ± 0.31
AspHis: 1.024 ± 0.147
AspIle: 4.265 ± 0.298
AspLys: 5.143 ± 0.363
AspLeu: 4.829 ± 0.324
AspMet: 1.819 ± 0.21
AspAsn: 3.073 ± 0.288
AspPro: 1.714 ± 0.189
AspGln: 1.38 ± 0.209
AspArg: 2.969 ± 0.218
AspSer: 3.345 ± 0.304
AspThr: 3.261 ± 0.276
AspVal: 3.867 ± 0.279
AspTrp: 0.857 ± 0.144
AspTyr: 3.303 ± 0.298
AspXaa: 0.0 ± 0.0
Glu
GluAla: 4.202 ± 0.301
GluCys: 0.669 ± 0.135
GluAsp: 4.39 ± 0.352
GluGlu: 8.153 ± 0.852
GluPhe: 3.073 ± 0.271
GluGly: 4.871 ± 0.329
GluHis: 1.547 ± 0.192
GluIle: 4.955 ± 0.379
GluLys: 7.087 ± 0.43
GluLeu: 7.15 ± 0.408
GluMet: 2.425 ± 0.239
GluAsn: 3.491 ± 0.265
GluPro: 2.279 ± 0.362
GluGln: 3.178 ± 0.312
GluArg: 3.24 ± 0.307
GluSer: 3.115 ± 0.219
GluThr: 3.7 ± 0.283
GluVal: 5.895 ± 0.452
GluTrp: 0.878 ± 0.148
GluTyr: 3.387 ± 0.307
GluXaa: 0.0 ± 0.0
Phe
PheAla: 2.362 ± 0.221
PheCys: 0.564 ± 0.12
PheAsp: 2.927 ± 0.29
PheGlu: 2.53 ± 0.207
PhePhe: 1.275 ± 0.176
PheGly: 2.341 ± 0.188
PheHis: 0.732 ± 0.154
PheIle: 2.258 ± 0.238
PheLys: 2.697 ± 0.225
PheLeu: 3.428 ± 0.294
PheMet: 1.066 ± 0.149
PheAsn: 2.425 ± 0.261
PhePro: 0.92 ± 0.151
PheGln: 1.463 ± 0.178
PheArg: 1.965 ± 0.183
PheSer: 2.843 ± 0.307
PheThr: 2.613 ± 0.265
PheVal: 2.739 ± 0.254
PheTrp: 0.334 ± 0.086
PheTyr: 1.714 ± 0.223
PheXaa: 0.0 ± 0.0
Gly
GlyAla: 4.411 ± 0.498
GlyCys: 0.502 ± 0.099
GlyAsp: 3.387 ± 0.312
GlyGlu: 4.077 ± 0.257
GlyPhe: 2.55 ± 0.259
GlyGly: 5.519 ± 0.81
GlyHis: 1.254 ± 0.196
GlyIle: 3.909 ± 0.3
GlyLys: 5.268 ± 0.325
GlyLeu: 4.892 ± 0.331
GlyMet: 1.965 ± 0.253
GlyAsn: 3.805 ± 0.353
GlyPro: 0.0 ± 0.0
GlyGln: 2.404 ± 0.258
GlyArg: 2.801 ± 0.248
GlySer: 4.557 ± 0.502
GlyThr: 5.164 ± 0.557
GlyVal: 4.829 ± 0.319
GlyTrp: 0.794 ± 0.123
GlyTyr: 3.115 ± 0.249
GlyXaa: 0.0 ± 0.0
His
HisAla: 0.899 ± 0.14
HisCys: 0.188 ± 0.058
HisAsp: 1.087 ± 0.162
HisGlu: 1.401 ± 0.139
HisPhe: 0.669 ± 0.114
HisGly: 1.024 ± 0.169
HisHis: 0.481 ± 0.106
HisIle: 1.171 ± 0.176
HisLys: 1.401 ± 0.226
HisLeu: 1.672 ± 0.237
HisMet: 0.439 ± 0.103
HisAsn: 1.045 ± 0.141
HisPro: 0.753 ± 0.127
HisGln: 0.418 ± 0.1
HisArg: 1.066 ± 0.182
HisSer: 1.108 ± 0.16
HisThr: 1.171 ± 0.135
HisVal: 1.505 ± 0.172
HisTrp: 0.272 ± 0.09
HisTyr: 0.878 ± 0.143
HisXaa: 0.0 ± 0.0
Ile
IleAla: 4.286 ± 0.268
IleCys: 0.606 ± 0.12
IleAsp: 4.453 ± 0.418
IleGlu: 5.143 ± 0.347
IlePhe: 2.091 ± 0.207
IleGly: 4.077 ± 0.302
IleHis: 1.003 ± 0.166
IleIle: 4.097 ± 0.379
IleLys: 4.934 ± 0.336
IleLeu: 4.745 ± 0.356
IleMet: 1.672 ± 0.184
IleAsn: 3.617 ± 0.336
IlePro: 2.927 ± 0.233
IleGln: 2.55 ± 0.22
IleArg: 2.697 ± 0.204
IleSer: 4.139 ± 0.257
IleThr: 4.453 ± 0.348
IleVal: 4.495 ± 0.362
IleTrp: 0.585 ± 0.12
IleTyr: 2.55 ± 0.201
IleXaa: 0.0 ± 0.0
Lys
LysAla: 5.352 ± 0.366
LysCys: 0.502 ± 0.11
LysAsp: 4.578 ± 0.33
LysGlu: 8.09 ± 0.523
LysPhe: 2.592 ± 0.238
LysGly: 5.101 ± 0.317
LysHis: 1.484 ± 0.194
LysIle: 4.453 ± 0.289
LysLys: 6.794 ± 0.522
LysLeu: 6.669 ± 0.379
LysMet: 2.843 ± 0.277
LysAsn: 3.7 ± 0.284
LysPro: 2.676 ± 0.231
LysGln: 3.679 ± 0.341
LysArg: 3.93 ± 0.333
LysSer: 3.805 ± 0.284
LysThr: 3.533 ± 0.243
LysVal: 5.916 ± 0.437
LysTrp: 0.773 ± 0.133
LysTyr: 3.157 ± 0.292
LysXaa: 0.0 ± 0.0
Leu
LeuAla: 5.644 ± 0.396
LeuCys: 0.418 ± 0.089
LeuAsp: 5.456 ± 0.319
LeuGlu: 6.627 ± 0.492
LeuPhe: 3.261 ± 0.27
LeuGly: 4.829 ± 0.337
LeuHis: 1.401 ± 0.156
LeuIle: 5.122 ± 0.359
LeuLys: 6.627 ± 0.387
LeuLeu: 5.477 ± 0.417
LeuMet: 2.007 ± 0.214
LeuAsn: 3.784 ± 0.299
LeuPro: 3.052 ± 0.28
LeuGln: 3.428 ± 0.3
LeuArg: 3.679 ± 0.355
LeuSer: 5.059 ± 0.311
LeuThr: 5.247 ± 0.334
LeuVal: 5.31 ± 0.357
LeuTrp: 0.899 ± 0.121
LeuTyr: 3.136 ± 0.25
LeuXaa: 0.0 ± 0.0
Met
MetAla: 2.3 ± 0.227
MetCys: 0.23 ± 0.065
MetAsp: 1.401 ± 0.195
MetGlu: 2.32 ± 0.21
MetPhe: 1.108 ± 0.133
MetGly: 1.484 ± 0.209
MetHis: 0.606 ± 0.118
MetIle: 2.07 ± 0.221
MetLys: 2.718 ± 0.234
MetLeu: 2.258 ± 0.226
MetMet: 0.502 ± 0.102
MetAsn: 1.61 ± 0.199
MetPro: 0.753 ± 0.113
MetGln: 0.92 ± 0.139
MetArg: 1.568 ± 0.173
MetSer: 1.798 ± 0.207
MetThr: 1.944 ± 0.201
MetVal: 1.61 ± 0.193
MetTrp: 0.272 ± 0.065
MetTyr: 1.213 ± 0.173
MetXaa: 0.0 ± 0.0
Asn
AsnAla: 3.428 ± 0.305
AsnCys: 0.481 ± 0.105
AsnAsp: 2.404 ± 0.216
AsnGlu: 3.847 ± 0.28
AsnPhe: 1.756 ± 0.196
AsnGly: 4.662 ± 0.428
AsnHis: 1.003 ± 0.118
AsnIle: 3.554 ± 0.243
AsnLys: 3.993 ± 0.312
AsnLeu: 4.014 ± 0.304
AsnMet: 2.049 ± 0.227
AsnAsn: 3.094 ± 0.343
AsnPro: 2.53 ± 0.282
AsnGln: 1.881 ± 0.2
AsnArg: 2.822 ± 0.253
AsnSer: 2.509 ± 0.267
AsnThr: 3.178 ± 0.293
AsnVal: 3.638 ± 0.257
AsnTrp: 0.648 ± 0.106
AsnTyr: 2.216 ± 0.179
AsnXaa: 0.0 ± 0.0
Pro
ProAla: 2.111 ± 0.294
ProCys: 0.272 ± 0.094
ProAsp: 1.819 ± 0.212
ProGlu: 2.55 ± 0.333
ProPhe: 1.171 ± 0.151
ProGly: 1.714 ± 0.264
ProHis: 0.794 ± 0.111
ProIle: 2.153 ± 0.24
ProLys: 2.613 ± 0.235
ProLeu: 2.404 ± 0.209
ProMet: 0.962 ± 0.167
ProAsn: 1.986 ± 0.251
ProPro: 0.941 ± 0.129
ProGln: 1.505 ± 0.21
ProArg: 1.275 ± 0.164
ProSer: 1.756 ± 0.2
ProThr: 2.78 ± 0.29
ProVal: 2.613 ± 0.254
ProTrp: 0.209 ± 0.072
ProTyr: 1.589 ± 0.234
ProXaa: 0.0 ± 0.0
Gln
GlnAla: 2.864 ± 0.272
GlnCys: 0.334 ± 0.081
GlnAsp: 2.153 ± 0.212
GlnGlu: 3.742 ± 0.298
GlnPhe: 1.484 ± 0.178
GlnGly: 2.383 ± 0.236
GlnHis: 0.585 ± 0.085
GlnIle: 2.571 ± 0.24
GlnLys: 2.697 ± 0.231
GlnLeu: 3.115 ± 0.317
GlnMet: 1.275 ± 0.172
GlnAsn: 1.756 ± 0.216
GlnPro: 1.463 ± 0.228
GlnGln: 2.404 ± 0.309
GlnArg: 1.568 ± 0.188
GlnSer: 2.216 ± 0.243
GlnThr: 2.028 ± 0.203
GlnVal: 2.53 ± 0.234
GlnTrp: 0.376 ± 0.082
GlnTyr: 1.213 ± 0.165
GlnXaa: 0.0 ± 0.0
Arg
ArgAla: 2.174 ± 0.23
ArgCys: 0.376 ± 0.087
ArgAsp: 2.906 ± 0.297
ArgGlu: 3.261 ± 0.298
ArgPhe: 2.111 ± 0.191
ArgGly: 2.801 ± 0.249
ArgHis: 0.857 ± 0.145
ArgIle: 2.906 ± 0.281
ArgLys: 3.7 ± 0.301
ArgLeu: 4.077 ± 0.324
ArgMet: 1.526 ± 0.156
ArgAsn: 2.467 ± 0.233
ArgPro: 1.233 ± 0.165
ArgGln: 1.693 ± 0.187
ArgArg: 2.341 ± 0.221
ArgSer: 2.362 ± 0.195
ArgThr: 2.237 ± 0.232
ArgVal: 3.345 ± 0.268
ArgTrp: 0.523 ± 0.105
ArgTyr: 2.174 ± 0.201
ArgXaa: 0.0 ± 0.0
Ser
SerAla: 3.408 ± 0.309
SerCys: 0.355 ± 0.086
SerAsp: 3.449 ± 0.316
SerGlu: 3.157 ± 0.273
SerPhe: 2.927 ± 0.258
SerGly: 4.453 ± 0.42
SerHis: 0.962 ± 0.147
SerIle: 4.139 ± 0.265
SerLys: 4.181 ± 0.289
SerLeu: 4.578 ± 0.325
SerMet: 1.401 ± 0.171
SerAsn: 2.989 ± 0.355
SerPro: 1.756 ± 0.183
SerGln: 1.652 ± 0.189
SerArg: 2.362 ± 0.22
SerSer: 3.617 ± 0.393
SerThr: 3.491 ± 0.338
SerVal: 3.805 ± 0.259
SerTrp: 0.794 ± 0.13
SerTyr: 2.55 ± 0.214
SerXaa: 0.0 ± 0.0
Thr
ThrAla: 4.16 ± 0.555
ThrCys: 0.46 ± 0.096
ThrAsp: 3.449 ± 0.252
ThrGlu: 4.348 ± 0.318
ThrPhe: 2.801 ± 0.266
ThrGly: 4.369 ± 0.339
ThrHis: 1.087 ± 0.185
ThrIle: 4.306 ± 0.298
ThrLys: 4.474 ± 0.291
ThrLeu: 5.164 ± 0.314
ThrMet: 1.15 ± 0.146
ThrAsn: 3.136 ± 0.332
ThrPro: 2.969 ± 0.335
ThrGln: 1.986 ± 0.179
ThrArg: 2.425 ± 0.225
ThrSer: 3.366 ± 0.256
ThrThr: 3.554 ± 0.331
ThrVal: 4.913 ± 0.401
ThrTrp: 0.606 ± 0.156
ThrTyr: 3.052 ± 0.279
ThrXaa: 0.0 ± 0.0
Val
ValAla: 4.641 ± 0.367
ValCys: 0.481 ± 0.105
ValAsp: 4.453 ± 0.259
ValGlu: 5.184 ± 0.429
ValPhe: 2.906 ± 0.267
ValGly: 3.721 ± 0.282
ValHis: 1.317 ± 0.195
ValIle: 4.725 ± 0.302
ValLys: 5.331 ± 0.316
ValLeu: 5.184 ± 0.398
ValMet: 1.84 ± 0.16
ValAsn: 4.077 ± 0.348
ValPro: 2.948 ± 0.27
ValGln: 2.885 ± 0.193
ValArg: 3.052 ± 0.283
ValSer: 3.554 ± 0.303
ValThr: 4.913 ± 0.354
ValVal: 4.85 ± 0.323
ValTrp: 0.669 ± 0.116
ValTyr: 2.989 ± 0.252
ValXaa: 0.0 ± 0.0
Trp
TrpAla: 0.669 ± 0.155
TrpCys: 0.188 ± 0.06
TrpAsp: 0.836 ± 0.169
TrpGlu: 0.899 ± 0.145
TrpPhe: 0.523 ± 0.116
TrpGly: 0.585 ± 0.129
TrpHis: 0.314 ± 0.077
TrpIle: 0.69 ± 0.11
TrpLys: 0.815 ± 0.118
TrpLeu: 1.024 ± 0.163
TrpMet: 0.105 ± 0.047
TrpAsn: 0.815 ± 0.16
TrpPro: 0.0 ± 0.0
TrpGln: 0.46 ± 0.083
TrpArg: 0.355 ± 0.09
TrpSer: 0.773 ± 0.134
TrpThr: 0.648 ± 0.109
TrpVal: 0.732 ± 0.144
TrpTrp: 0.146 ± 0.049
TrpTyr: 0.669 ± 0.117
TrpXaa: 0.0 ± 0.0
Tyr
TyrAla: 3.052 ± 0.221
TyrCys: 0.418 ± 0.096
TyrAsp: 2.948 ± 0.283
TyrGlu: 2.613 ± 0.243
TyrPhe: 1.338 ± 0.156
TyrGly: 2.592 ± 0.269
TyrHis: 0.92 ± 0.166
TyrIle: 3.094 ± 0.286
TyrLys: 3.282 ± 0.302
TyrLeu: 3.826 ± 0.303
TyrMet: 1.129 ± 0.17
TyrAsn: 2.822 ± 0.228
TyrPro: 1.442 ± 0.16
TyrGln: 1.756 ± 0.208
TyrArg: 1.881 ± 0.222
TyrSer: 2.279 ± 0.198
TyrThr: 3.01 ± 0.247
TyrVal: 2.822 ± 0.224
TyrTrp: 0.502 ± 0.097
TyrTyr: 1.735 ± 0.177
TyrXaa: 0.0 ± 0.0
Xaa
XaaAla: 0.0 ± 0.0
XaaCys: 0.0 ± 0.0
XaaAsp: 0.0 ± 0.0
XaaGlu: 0.0 ± 0.0
XaaPhe: 0.0 ± 0.0
XaaGly: 0.0 ± 0.0
XaaHis: 0.0 ± 0.0
XaaIle: 0.0 ± 0.0
XaaLys: 0.0 ± 0.0
XaaLeu: 0.0 ± 0.0
XaaMet: 0.0 ± 0.0
XaaAsn: 0.0 ± 0.0
XaaPro: 0.0 ± 0.0
XaaGln: 0.0 ± 0.0
XaaArg: 0.0 ± 0.0
XaaSer: 0.0 ± 0.0
XaaThr: 0.0 ± 0.0
XaaVal: 0.0 ± 0.0
XaaTrp: 0.0 ± 0.0
XaaTyr: 0.0 ± 0.0
XaaXaa: 0.0 ± 0.0
Statistics based on 234 proteins (47836 amino acids)

Note: The error was estimated by bootstrapping (x100) at the protein level.
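
A protein-level bootstrap of this kind might look like the sketch below: whole proteins are resampled with replacement, the dipeptide frequency is recomputed for each resample, and the spread of the resulting estimates is taken as the error. This is only an illustration under stated assumptions. The function names, the fixed random seed, the normalization by the number of pairs, and the use of the sample standard deviation as the reported error are all choices made for this example and are not confirmed by the source.

```python
import random
from collections import Counter
from statistics import stdev


def pair_counts(seq):
    """Count overlapping residue pairs within a single protein sequence."""
    return Counter(seq[i:i + 2] for i in range(len(seq) - 1))


def bootstrap_error(sequences, pair, n_resamples=100, seed=0):
    """Standard deviation (in per mille) of a dipeptide frequency over resamples."""
    rng = random.Random(seed)
    per_protein = [pair_counts(s) for s in sequences]
    estimates = []
    for _ in range(n_resamples):
        # Resample whole proteins with replacement (protein-level bootstrap).
        sample = rng.choices(per_protein, k=len(per_protein))
        total = sum(sum(c.values()) for c in sample)
        hits = sum(c[pair] for c in sample)
        estimates.append(1000.0 * hits / total if total else 0.0)
    return stdev(estimates)


# Toy proteome of three short sequences (hypothetical data).
error_kk = bootstrap_error(["AKKA", "GGGG", "KKKK"], "KK")
```

Resampling proteins rather than individual residues keeps the pairs of each protein together, so the error reflects variation between proteins.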

You can download the above dipeptide statistics (among other statistics for this proteome) from this CSV file
See this proteome in: uniprot_link
Proteome-pI is available under the Creative Commons Attribution-NoDerivs license; for more details see here

Reference: Kozlowski LP. Proteome-pI 2.0: Proteome Isoelectric Point Database Update. Nucleic Acids Res. 2021, doi: 10.1093/nar/gkab944
Contact: Lukasz P. Kozlowski