Amino acid dipeptide frequency for Rhodococcus phage Finch

Apart from single amino acid frequencies, one can also calculate the so-called amino acid dipeptide frequency, i.e. how often each ordered pair of adjacent residues occurs.
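As a rough illustration of what such a calculation involves, the sketch below counts overlapping adjacent residue pairs within each protein and reports them in per mille. The function name, the one-letter toy sequences, and the choice to normalise by the total number of counted pairs are assumptions made for this example; the page does not state how Proteome-pI computes its values.

```python
from collections import Counter


def dipeptide_frequencies(proteins):
    """Return {dipeptide: frequency in per mille} for a list of sequences.

    Assumption: overlapping adjacent pairs are counted inside each protein
    (never across protein boundaries) and normalised by the total pair count.
    """
    counts = Counter()
    for seq in proteins:
        for i in range(len(seq) - 1):
            counts[seq[i:i + 2]] += 1
    total = sum(counts.values())
    if total == 0:
        return {}
    return {pair: 1000.0 * n / total for pair, n in counts.items()}


# Hypothetical toy sequences (one-letter codes), not from the Finch proteome.
toy_proteins = ["MAAK", "AAG"]
toy_freqs = dipeptide_frequencies(toy_proteins)
for pair, value in sorted(toy_freqs.items()):
    print(f"{pair}: {value:.1f} per mille")
```

The table on this page uses three-letter codes (for example AlaAla), so a one-letter result such as AA would need to be mapped accordingly.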

There are 400 possible dipeptides (441 if Xaa is counted as an additional 21st amino acid representing all non-standard or unknown amino acids). Thus, if all dipeptides were equally likely in proteins, each would be present at a frequency of 0.25% (2.5‰). As this is not the case in nature, for better visibility all dipeptides that are more frequent than expected are marked in red, and those that are underrepresented are marked in blue in the table.
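The expected value under the equal-likelihood assumption, and the comparison against it, can be sketched as follows. The function names are hypothetical, and a strict greater-than/less-than comparison is assumed here; the page does not say whether its colouring uses any tolerance around the expected value.

```python
def expected_per_mille(alphabet_size=20):
    """Uniform expectation in per mille; pass 21 if Xaa is counted."""
    return 1000.0 / alphabet_size ** 2


def classify(observed_per_mille, alphabet_size=20):
    """Label a dipeptide relative to the uniform expectation (assumed rule)."""
    expected = expected_per_mille(alphabet_size)
    if observed_per_mille > expected:
        return "over"
    if observed_per_mille < expected:
        return "under"
    return "as expected"


# Two values taken from the table on this page.
print(expected_per_mille())   # 20 amino acids -> 400 dipeptides
print(classify(9.118))        # AlaAla
print(classify(0.225))        # CysCys
```

With 21 symbols the expectation drops to roughly 2.27‰, so a dipeptide close to 2.5‰ could change category depending on which alphabet is assumed.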

All values are given in per mille (‰), so they need to be multiplied by 10⁻³ to obtain fractions.

For more information, see the sequence space article on Wikipedia.

Rows give the first residue of the dipeptide, columns the second; each cell is frequency ± error in ‰.

| | Ala | Cys | Asp | Glu | Phe | Gly | His | Ile | Lys | Leu | Met | Asn | Pro | Gln | Arg | Ser | Thr | Val | Trp | Tyr | Xaa |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Ala | 9.118 ± 0.737 | 0.45 ± 0.1 | 6.326 ± 0.493 | 7.497 ± 0.463 | 2.747 ± 0.247 | 6.461 ± 0.438 | 1.576 ± 0.218 | 4.097 ± 0.293 | 4.615 ± 0.317 | 7.137 ± 0.417 | 2.837 ± 0.234 | 3.715 ± 0.288 | 4.03 ± 0.404 | 4.007 ± 0.352 | 5.381 ± 0.329 | 6.056 ± 0.407 | 6.056 ± 0.42 | 6.461 ± 0.363 | 1.801 ± 0.185 | 2.386 ± 0.262 | 0.0 ± 0.0 |
| Cys | 0.45 ± 0.108 | 0.225 ± 0.104 | 0.81 ± 0.163 | 0.518 ± 0.101 | 0.27 ± 0.083 | 1.126 ± 0.198 | 0.203 ± 0.071 | 0.248 ± 0.075 | 0.495 ± 0.105 | 0.855 ± 0.144 | 0.18 ± 0.062 | 0.45 ± 0.096 | 0.765 ± 0.148 | 0.203 ± 0.073 | 0.675 ± 0.145 | 0.698 ± 0.137 | 0.495 ± 0.1 | 0.495 ± 0.098 | 0.338 ± 0.081 | 0.45 ± 0.102 | 0.0 ± 0.0 |
| Asp | 5.718 ± 0.38 | 0.698 ± 0.135 | 4.93 ± 0.473 | 5.178 ± 0.455 | 2.747 ± 0.272 | 5.943 ± 0.378 | 1.779 ± 0.216 | 3.399 ± 0.26 | 2.386 ± 0.217 | 5.786 ± 0.317 | 1.869 ± 0.208 | 2.341 ± 0.212 | 4.075 ± 0.301 | 2.859 ± 0.243 | 4.165 ± 0.334 | 3.557 ± 0.265 | 3.67 ± 0.306 | 4.818 ± 0.31 | 1.531 ± 0.213 | 2.431 ± 0.285 | 0.0 ± 0.0 |
| Glu | 7.159 ± 0.473 | 0.72 ± 0.148 | 5.246 ± 0.561 | 4.998 ± 0.436 | 3.084 ± 0.266 | 4.728 ± 0.329 | 1.598 ± 0.182 | 2.679 ± 0.23 | 2.769 ± 0.29 | 7.092 ± 0.52 | 1.914 ± 0.212 | 1.981 ± 0.191 | 3.399 ± 0.332 | 3.152 ± 0.263 | 4.66 ± 0.366 | 3.242 ± 0.279 | 3.58 ± 0.286 | 4.953 ± 0.341 | 1.418 ± 0.18 | 2.409 ± 0.281 | 0.0 ± 0.0 |
| Phe | 3.174 ± 0.289 | 0.383 ± 0.093 | 2.882 ± 0.276 | 2.139 ± 0.205 | 1.103 ± 0.148 | 2.927 ± 0.274 | 0.878 ± 0.166 | 1.081 ± 0.171 | 1.486 ± 0.188 | 2.116 ± 0.223 | 0.81 ± 0.117 | 1.238 ± 0.178 | 1.463 ± 0.218 | 1.486 ± 0.184 | 1.576 ± 0.197 | 1.801 ± 0.181 | 2.386 ± 0.259 | 2.116 ± 0.244 | 0.585 ± 0.098 | 1.126 ± 0.162 | 0.0 ± 0.0 |
| Gly | 5.493 ± 0.462 | 0.923 ± 0.154 | 4.953 ± 0.389 | 5.11 ± 0.415 | 2.611 ± 0.219 | 6.574 ± 0.597 | 1.508 ± 0.209 | 3.512 ± 0.34 | 3.354 ± 0.261 | 5.426 ± 0.347 | 1.959 ± 0.188 | 3.039 ± 0.271 | 3.084 ± 0.282 | 2.769 ± 0.295 | 4.413 ± 0.242 | 5.11 ± 0.329 | 5.628 ± 0.581 | 5.966 ± 0.352 | 1.869 ± 0.23 | 3.129 ± 0.307 | 0.0 ± 0.0 |
| His | 1.779 ± 0.219 | 0.315 ± 0.085 | 1.171 ± 0.152 | 1.238 ± 0.184 | 0.855 ± 0.16 | 1.981 ± 0.246 | 0.72 ± 0.14 | 1.126 ± 0.161 | 0.765 ± 0.15 | 1.959 ± 0.216 | 0.495 ± 0.111 | 0.788 ± 0.118 | 1.216 ± 0.172 | 0.428 ± 0.099 | 1.373 ± 0.196 | 0.923 ± 0.163 | 1.171 ± 0.171 | 1.351 ± 0.178 | 0.405 ± 0.101 | 0.991 ± 0.146 | 0.0 ± 0.0 |
| Ile | 3.872 ± 0.306 | 0.293 ± 0.074 | 3.197 ± 0.254 | 3.197 ± 0.265 | 0.968 ± 0.143 | 3.039 ± 0.291 | 0.855 ± 0.17 | 1.306 ± 0.187 | 1.959 ± 0.219 | 2.566 ± 0.237 | 1.148 ± 0.148 | 1.666 ± 0.186 | 2.476 ± 0.287 | 1.621 ± 0.22 | 2.611 ± 0.258 | 2.972 ± 0.297 | 3.062 ± 0.32 | 3.354 ± 0.28 | 0.495 ± 0.104 | 1.486 ± 0.223 | 0.0 ± 0.0 |
| Lys | 4.795 ± 0.318 | 0.293 ± 0.088 | 2.972 ± 0.232 | 2.386 ± 0.257 | 1.261 ± 0.182 | 2.792 ± 0.284 | 1.081 ± 0.168 | 1.846 ± 0.207 | 2.521 ± 0.366 | 3.872 ± 0.305 | 1.216 ± 0.167 | 1.351 ± 0.143 | 1.936 ± 0.175 | 1.328 ± 0.165 | 2.657 ± 0.26 | 2.521 ± 0.219 | 3.422 ± 0.349 | 3.647 ± 0.244 | 0.653 ± 0.124 | 1.531 ± 0.192 | 0.0 ± 0.0 |
| Leu | 8.96 ± 0.487 | 0.765 ± 0.13 | 6.394 ± 0.33 | 4.503 ± 0.406 | 1.914 ± 0.221 | 5.223 ± 0.33 | 1.576 ± 0.219 | 2.882 ± 0.232 | 3.67 ± 0.261 | 5.651 ± 0.433 | 2.274 ± 0.231 | 2.589 ± 0.316 | 3.895 ± 0.262 | 2.476 ± 0.222 | 5.043 ± 0.36 | 5.133 ± 0.325 | 5.088 ± 0.372 | 5.921 ± 0.354 | 0.675 ± 0.114 | 2.004 ± 0.243 | 0.0 ± 0.0 |
| Met | 2.882 ± 0.261 | 0.203 ± 0.067 | 1.553 ± 0.204 | 1.373 ± 0.217 | 1.013 ± 0.158 | 2.004 ± 0.212 | 0.405 ± 0.105 | 1.058 ± 0.147 | 1.148 ± 0.177 | 1.373 ± 0.191 | 0.675 ± 0.124 | 1.261 ± 0.183 | 1.463 ± 0.201 | 0.833 ± 0.165 | 1.643 ± 0.233 | 2.859 ± 0.242 | 2.364 ± 0.265 | 1.733 ± 0.221 | 0.293 ± 0.089 | 0.833 ± 0.14 | 0.0 ± 0.0 |
| Asn | 2.927 ± 0.325 | 0.473 ± 0.108 | 2.566 ± 0.251 | 2.184 ± 0.177 | 1.148 ± 0.185 | 3.647 ± 0.401 | 0.901 ± 0.138 | 1.621 ± 0.187 | 1.441 ± 0.174 | 2.949 ± 0.267 | 0.855 ± 0.172 | 1.396 ± 0.181 | 3.084 ± 0.271 | 1.328 ± 0.231 | 2.341 ± 0.235 | 1.576 ± 0.181 | 1.869 ± 0.233 | 2.071 ± 0.209 | 1.036 ± 0.183 | 1.058 ± 0.164 | 0.0 ± 0.0 |
| Pro | 4.413 ± 0.327 | 0.405 ± 0.1 | 4.03 ± 0.331 | 5.2 ± 0.43 | 1.576 ± 0.187 | 4.638 ± 0.364 | 0.901 ± 0.127 | 2.139 ± 0.234 | 2.319 ± 0.258 | 3.197 ± 0.294 | 1.013 ± 0.149 | 1.869 ± 0.218 | 2.769 ± 0.464 | 1.328 ± 0.192 | 2.139 ± 0.239 | 3.737 ± 0.317 | 3.467 ± 0.324 | 3.647 ± 0.383 | 0.608 ± 0.111 | 1.643 ± 0.21 | 0.0 ± 0.0 |
| Gln | 3.895 ± 0.309 | 0.225 ± 0.069 | 2.161 ± 0.246 | 2.611 ± 0.285 | 1.463 ± 0.181 | 2.566 ± 0.265 | 0.675 ± 0.131 | 1.598 ± 0.206 | 1.553 ± 0.2 | 3.444 ± 0.356 | 1.283 ± 0.195 | 1.171 ± 0.176 | 1.351 ± 0.151 | 1.869 ± 0.206 | 2.364 ± 0.207 | 2.049 ± 0.281 | 2.049 ± 0.198 | 2.769 ± 0.289 | 0.901 ± 0.137 | 1.441 ± 0.166 | 0.0 ± 0.0 |
| Arg | 4.885 ± 0.39 | 0.901 ± 0.155 | 3.422 ± 0.286 | 4.773 ± 0.374 | 1.936 ± 0.195 | 3.692 ± 0.261 | 1.531 ± 0.217 | 2.814 ± 0.252 | 3.174 ± 0.277 | 4.458 ± 0.334 | 1.914 ± 0.215 | 2.116 ± 0.206 | 2.364 ± 0.218 | 2.544 ± 0.261 | 4.593 ± 0.52 | 3.219 ± 0.263 | 3.039 ± 0.292 | 4.773 ± 0.388 | 1.193 ± 0.199 | 2.161 ± 0.242 | 0.0 ± 0.0 |
| Ser | 6.259 ± 0.409 | 0.473 ± 0.097 | 4.03 ± 0.294 | 3.692 ± 0.319 | 2.184 ± 0.22 | 4.953 ± 0.372 | 1.013 ± 0.158 | 2.837 ± 0.286 | 3.062 ± 0.336 | 4.232 ± 0.316 | 1.531 ± 0.171 | 2.116 ± 0.217 | 2.859 ± 0.27 | 2.566 ± 0.31 | 2.904 ± 0.261 | 4.795 ± 0.417 | 4.435 ± 0.332 | 4.615 ± 0.366 | 1.193 ± 0.184 | 2.296 ± 0.259 | 0.0 ± 0.0 |
| Thr | 6.326 ± 0.49 | 0.855 ± 0.149 | 3.737 ± 0.3 | 4.12 ± 0.333 | 2.296 ± 0.22 | 5.291 ± 0.426 | 1.418 ± 0.169 | 2.837 ± 0.222 | 2.476 ± 0.223 | 4.773 ± 0.366 | 1.621 ± 0.184 | 2.319 ± 0.266 | 4.435 ± 0.372 | 1.936 ± 0.231 | 3.377 ± 0.302 | 4.66 ± 0.469 | 4.863 ± 0.504 | 4.795 ± 0.422 | 0.968 ± 0.136 | 2.184 ± 0.254 | 0.0 ± 0.0 |
| Val | 6.731 ± 0.398 | 0.72 ± 0.123 | 5.426 ± 0.387 | 5.921 ± 0.404 | 2.319 ± 0.244 | 4.953 ± 0.404 | 1.351 ± 0.223 | 3.174 ± 0.271 | 3.017 ± 0.286 | 5.381 ± 0.419 | 2.049 ± 0.211 | 3.107 ± 0.304 | 3.895 ± 0.313 | 2.634 ± 0.306 | 4.435 ± 0.329 | 4.165 ± 0.317 | 4.773 ± 0.451 | 6.416 ± 0.472 | 1.036 ± 0.149 | 2.139 ± 0.201 | 0.0 ± 0.0 |
| Trp | 1.576 ± 0.192 | 0.135 ± 0.052 | 1.396 ± 0.268 | 1.238 ± 0.182 | 0.36 ± 0.1 | 1.328 ± 0.234 | 0.473 ± 0.12 | 0.653 ± 0.133 | 0.698 ± 0.113 | 1.216 ± 0.173 | 0.653 ± 0.132 | 0.855 ± 0.127 | 0.585 ± 0.11 | 0.698 ± 0.136 | 0.968 ± 0.138 | 1.126 ± 0.126 | 1.598 ± 0.207 | 1.396 ± 0.164 | 0.383 ± 0.078 | 0.518 ± 0.121 | 0.0 ± 0.0 |
| Tyr | 2.341 ± 0.237 | 0.495 ± 0.125 | 2.679 ± 0.278 | 2.904 ± 0.283 | 0.901 ± 0.149 | 2.566 ± 0.231 | 0.608 ± 0.108 | 1.238 ± 0.178 | 1.171 ± 0.158 | 3.152 ± 0.274 | 0.653 ± 0.12 | 1.103 ± 0.169 | 1.891 ± 0.232 | 1.373 ± 0.151 | 2.184 ± 0.268 | 1.869 ± 0.212 | 2.274 ± 0.265 | 2.341 ± 0.23 | 0.473 ± 0.107 | 1.081 ± 0.158 | 0.0 ± 0.0 |
| Xaa | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 |

Statistics based on 228 proteins (44420 amino acids)

Note: The error was estimated by bootstrapping (100 replicates) at the protein level.
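A protein-level bootstrap of this kind might look like the sketch below: whole proteins are resampled with replacement, the dipeptide frequency is recomputed for each replicate, and the spread of the replicate values is reported. The function names, the fixed seed, and the use of the standard deviation as the error measure are assumptions for illustration; the page states only that 100 bootstrap replicates were drawn at the protein level.

```python
import random
import statistics


def pair_count(seq, pair):
    """Count overlapping occurrences of a two-residue string in a sequence."""
    return sum(1 for i in range(len(seq) - 1) if seq[i:i + 2] == pair)


def bootstrap_error(proteins, pair, replicates=100, seed=0):
    """Return (mean, standard deviation) of the per mille frequency of `pair`.

    Each replicate resamples whole proteins with replacement, so residues
    from one protein always stay together.
    """
    rng = random.Random(seed)
    estimates = []
    for _ in range(replicates):
        sample = [rng.choice(proteins) for _ in proteins]
        total = sum(max(len(s) - 1, 0) for s in sample)
        hits = sum(pair_count(s, pair) for s in sample)
        estimates.append(1000.0 * hits / total if total else 0.0)
    return statistics.mean(estimates), statistics.stdev(estimates)


# Hypothetical toy sequences (one-letter codes), not from the Finch proteome.
toy_proteins = ["MAAKAA", "GAAG", "MKKLV", "AAGAA"]
mean_aa, sd_aa = bootstrap_error(toy_proteins, "AA")
print(f"AA: {mean_aa:.1f} ± {sd_aa:.1f} per mille")
```

Resampling proteins rather than individual residues keeps within-protein correlations intact, which is presumably why the error is described as being estimated at the protein level.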

The dipeptide statistics above (among other statistics for this proteome) can be downloaded from this CSV file
See this proteome in: uniprot_link
Proteome-pI is available under the Creative Commons Attribution-NoDerivs license; for more details see here

Reference: Kozlowski LP. Proteome-pI 2.0: Proteome Isoelectric Point Database Update. Nucleic Acids Res. 2021, doi: 10.1093/nar/gkab944
Contact: Lukasz P. Kozlowski