Amino acid dipeptide frequency for Rhodococcus phage NiceHouse

Apart from single amino acid frequencies, one can also calculate the so-called amino acid dipeptide frequency, i.e. how often each ordered pair of adjacent residues occurs.

There are 400 possible dipeptides (441 if Xaa is counted as an additional 21st amino acid representing all non-standard or unknown amino acids). Thus, if dipeptides occurred randomly and with equal probability in proteins, each of the 400 would be present with a frequency of 0.25% (2.5‰). As this is not the case in nature, for better visibility all dipeptides that are more frequent than expected are marked in red, and those that are underrepresented are marked in blue in the table.

All values are given in per mille (‰), so they need to be multiplied by 10^-3 to obtain fractions.
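As a rough illustration of the calculation described above, the following Python sketch counts overlapping residue pairs within each protein, expresses them in per mille, and compares a value with the uniform expectation of 1/400 (2.5‰). The function names, the one-letter alphabet string, and the normalization by the total number of counted pairs are assumptions made for this example; the database may compute its values differently.

```python
from collections import Counter
from itertools import product

# One-letter codes of the 20 standard amino acids (Xaa is left out here).
STANDARD = "ACDEFGHIKLMNPQRSTVWY"


def dipeptide_permille(proteins):
    """Return {dipeptide: frequency in per mille} for all 400 standard pairs."""
    counts = Counter()
    for seq in proteins:
        # Overlapping pairs are counted within each protein,
        # never across the boundary between two proteins.
        counts.update(seq[i:i + 2] for i in range(len(seq) - 1))
    total = sum(counts.values())
    if total == 0:
        return {a + b: 0.0 for a, b in product(STANDARD, repeat=2)}
    # Assumption: normalize by the total number of counted pairs.
    return {
        a + b: 1000.0 * counts[a + b] / total
        for a, b in product(STANDARD, repeat=2)
    }


def mark(freq_permille, expected=1000.0 / 400):
    """Classify a frequency against the uniform expectation (2.5 per mille)."""
    if freq_permille > expected:
        return "over"
    if freq_permille < expected:
        return "under"
    return "expected"


if __name__ == "__main__":
    toy = ["AAAA", "ACA"]  # hypothetical toy sequences, not real phage proteins
    freqs = dipeptide_permille(toy)
    print(freqs["AA"], mark(freqs["AA"]))
    print(freqs["GG"], mark(freqs["GG"]))
```

With this rule, the AlaAla value of 6.772‰ from the table would be labelled "over" and the AlaCys value of 0.552‰ "under".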

For more information, see the sequence space article on Wikipedia.

Dipeptide frequencies in ‰ (value ± error), grouped by the first residue:

Ala
AlaAla: 6.772 ± 0.759
AlaCys: 0.552 ± 0.129
AlaAsp: 5.043 ± 0.428
AlaGlu: 5.619 ± 0.425
AlaPhe: 2.449 ± 0.24
AlaGly: 6.364 ± 0.42
AlaHis: 1.105 ± 0.147
AlaIle: 4.346 ± 0.33
AlaLys: 4.803 ± 0.303
AlaLeu: 5.955 ± 0.482
AlaMet: 2.233 ± 0.326
AlaAsn: 4.202 ± 0.299
AlaPro: 2.906 ± 0.353
AlaGln: 2.377 ± 0.395
AlaArg: 3.41 ± 0.361
AlaSer: 4.346 ± 0.299
AlaThr: 5.235 ± 0.406
AlaVal: 5.331 ± 0.424
AlaTrp: 1.465 ± 0.172
AlaTyr: 2.69 ± 0.313
AlaXaa: 0.0 ± 0.0

Cys
CysAla: 0.792 ± 0.165
CysCys: 0.168 ± 0.079
CysAsp: 0.624 ± 0.138
CysGlu: 0.624 ± 0.133
CysPhe: 0.408 ± 0.097
CysGly: 0.84 ± 0.152
CysHis: 0.168 ± 0.061
CysIle: 0.6 ± 0.137
CysLys: 0.384 ± 0.09
CysLeu: 0.576 ± 0.119
CysMet: 0.24 ± 0.075
CysAsn: 0.6 ± 0.114
CysPro: 0.432 ± 0.098
CysGln: 0.6 ± 0.122
CysArg: 0.456 ± 0.119
CysSer: 0.696 ± 0.139
CysThr: 0.432 ± 0.096
CysVal: 0.624 ± 0.121
CysTrp: 0.192 ± 0.086
CysTyr: 0.24 ± 0.085
CysXaa: 0.0 ± 0.0

Asp
AspAla: 5.379 ± 0.35
AspCys: 0.961 ± 0.179
AspAsp: 4.322 ± 0.34
AspGlu: 5.403 ± 0.458
AspPhe: 3.074 ± 0.293
AspGly: 5.067 ± 0.348
AspHis: 1.561 ± 0.193
AspIle: 3.938 ± 0.284
AspLys: 3.65 ± 0.267
AspLeu: 5.091 ± 0.294
AspMet: 1.513 ± 0.215
AspAsn: 2.858 ± 0.324
AspPro: 2.762 ± 0.228
AspGln: 2.738 ± 0.23
AspArg: 3.122 ± 0.251
AspSer: 4.01 ± 0.341
AspThr: 3.65 ± 0.285
AspVal: 4.034 ± 0.302
AspTrp: 1.489 ± 0.169
AspTyr: 2.93 ± 0.251
AspXaa: 0.0 ± 0.0

Glu
GluAla: 5.067 ± 0.306
GluCys: 0.72 ± 0.157
GluAsp: 4.443 ± 0.411
GluGlu: 4.995 ± 0.448
GluPhe: 3.362 ± 0.329
GluGly: 3.818 ± 0.312
GluHis: 1.393 ± 0.226
GluIle: 5.091 ± 0.341
GluLys: 3.53 ± 0.355
GluLeu: 6.58 ± 0.489
GluMet: 1.729 ± 0.237
GluAsn: 3.77 ± 0.305
GluPro: 2.81 ± 0.293
GluGln: 3.314 ± 0.302
GluArg: 3.674 ± 0.35
GluSer: 3.362 ± 0.284
GluThr: 3.434 ± 0.315
GluVal: 4.25 ± 0.266
GluTrp: 1.369 ± 0.204
GluTyr: 3.434 ± 0.321
GluXaa: 0.0 ± 0.0

Phe
PheAla: 2.954 ± 0.244
PheCys: 0.264 ± 0.087
PheAsp: 3.602 ± 0.344
PheGlu: 2.521 ± 0.293
PhePhe: 1.537 ± 0.204
PheGly: 3.194 ± 0.244
PheHis: 0.792 ± 0.152
PheIle: 2.281 ± 0.242
PheLys: 2.497 ± 0.233
PheLeu: 3.026 ± 0.292
PheMet: 1.081 ± 0.166
PheAsn: 2.281 ± 0.233
PhePro: 1.225 ± 0.168
PheGln: 0.792 ± 0.138
PheArg: 1.873 ± 0.196
PheSer: 3.026 ± 0.302
PheThr: 2.81 ± 0.231
PheVal: 2.425 ± 0.294
PheTrp: 0.48 ± 0.107
PheTyr: 1.705 ± 0.223
PheXaa: 0.0 ± 0.0

Gly
GlyAla: 4.515 ± 0.427
GlyCys: 0.72 ± 0.114
GlyAsp: 3.65 ± 0.28
GlyGlu: 4.178 ± 0.327
GlyPhe: 3.362 ± 0.242
GlyGly: 4.13 ± 0.448
GlyHis: 1.489 ± 0.196
GlyIle: 4.779 ± 0.392
GlyLys: 3.554 ± 0.329
GlyLeu: 5.451 ± 0.429
GlyMet: 2.161 ± 0.226
GlyAsn: 3.29 ± 0.321
GlyPro: 2.257 ± 0.254
GlyGln: 2.666 ± 0.263
GlyArg: 3.026 ± 0.277
GlySer: 4.611 ± 0.471
GlyThr: 5.019 ± 0.46
GlyVal: 4.731 ± 0.334
GlyTrp: 1.561 ± 0.185
GlyTyr: 3.146 ± 0.316
GlyXaa: 0.0 ± 0.0

His
HisAla: 0.961 ± 0.16
HisCys: 0.264 ± 0.06
HisAsp: 1.249 ± 0.194
HisGlu: 1.273 ± 0.166
HisPhe: 0.985 ± 0.173
HisGly: 1.561 ± 0.177
HisHis: 0.24 ± 0.084
HisIle: 1.177 ± 0.202
HisLys: 1.177 ± 0.181
HisLeu: 1.249 ± 0.169
HisMet: 0.36 ± 0.088
HisAsn: 1.153 ± 0.173
HisPro: 0.768 ± 0.138
HisGln: 0.816 ± 0.134
HisArg: 1.225 ± 0.174
HisSer: 1.033 ± 0.146
HisThr: 0.864 ± 0.134
HisVal: 1.369 ± 0.169
HisTrp: 0.36 ± 0.098
HisTyr: 1.057 ± 0.179
HisXaa: 0.0 ± 0.0

Ile
IleAla: 5.211 ± 0.316
IleCys: 0.576 ± 0.116
IleAsp: 4.971 ± 0.343
IleGlu: 4.827 ± 0.305
IlePhe: 2.113 ± 0.249
IleGly: 4.01 ± 0.354
IleHis: 1.393 ± 0.202
IleIle: 3.338 ± 0.288
IleLys: 3.53 ± 0.301
IleLeu: 3.674 ± 0.34
IleMet: 1.321 ± 0.194
IleAsn: 3.098 ± 0.292
IlePro: 2.305 ± 0.222
IleGln: 2.642 ± 0.274
IleArg: 3.242 ± 0.29
IleSer: 3.818 ± 0.302
IleThr: 3.866 ± 0.241
IleVal: 4.274 ± 0.347
IleTrp: 0.937 ± 0.139
IleTyr: 2.041 ± 0.273
IleXaa: 0.0 ± 0.0

Lys
LysAla: 5.163 ± 0.434
LysCys: 0.576 ± 0.105
LysAsp: 3.842 ± 0.315
LysGlu: 3.746 ± 0.325
LysPhe: 2.377 ± 0.231
LysGly: 2.81 ± 0.334
LysHis: 1.081 ± 0.169
LysIle: 3.602 ± 0.283
LysLys: 3.29 ± 0.363
LysLeu: 4.899 ± 0.358
LysMet: 1.921 ± 0.209
LysAsn: 2.978 ± 0.278
LysPro: 2.161 ± 0.217
LysGln: 2.161 ± 0.206
LysArg: 3.266 ± 0.305
LysSer: 3.674 ± 0.258
LysThr: 3.794 ± 0.349
LysVal: 3.218 ± 0.275
LysTrp: 1.081 ± 0.196
LysTyr: 2.161 ± 0.205
LysXaa: 0.0 ± 0.0

Leu
LeuAla: 6.22 ± 0.406
LeuCys: 0.672 ± 0.142
LeuAsp: 5.835 ± 0.416
LeuGlu: 5.667 ± 0.423
LeuPhe: 3.242 ± 0.254
LeuGly: 5.163 ± 0.383
LeuHis: 1.561 ± 0.188
LeuIle: 4.635 ± 0.334
LeuLys: 4.683 ± 0.299
LeuLeu: 5.259 ± 0.454
LeuMet: 1.873 ± 0.228
LeuAsn: 4.443 ± 0.297
LeuPro: 3.098 ± 0.293
LeuGln: 2.593 ± 0.263
LeuArg: 3.146 ± 0.276
LeuSer: 5.259 ± 0.373
LeuThr: 4.683 ± 0.353
LeuVal: 4.611 ± 0.363
LeuTrp: 1.417 ± 0.193
LeuTyr: 2.69 ± 0.279
LeuXaa: 0.0 ± 0.0

Met
MetAla: 2.089 ± 0.259
MetCys: 0.288 ± 0.082
MetAsp: 1.321 ± 0.201
MetGlu: 1.177 ± 0.17
MetPhe: 0.913 ± 0.148
MetGly: 1.609 ± 0.216
MetHis: 0.504 ± 0.118
MetIle: 1.585 ± 0.177
MetLys: 1.753 ± 0.219
MetLeu: 2.185 ± 0.253
MetMet: 0.6 ± 0.14
MetAsn: 1.297 ± 0.159
MetPro: 0.961 ± 0.137
MetGln: 1.393 ± 0.258
MetArg: 1.297 ± 0.213
MetSer: 2.425 ± 0.268
MetThr: 2.113 ± 0.251
MetVal: 1.249 ± 0.163
MetTrp: 0.336 ± 0.09
MetTyr: 1.081 ± 0.141
MetXaa: 0.0 ± 0.0

Asn
AsnAla: 3.77 ± 0.421
AsnCys: 0.504 ± 0.124
AsnAsp: 2.834 ± 0.246
AsnGlu: 3.65 ± 0.283
AsnPhe: 1.705 ± 0.182
AsnGly: 4.346 ± 0.438
AsnHis: 0.937 ± 0.15
AsnIle: 2.858 ± 0.243
AsnLys: 3.026 ± 0.323
AsnLeu: 4.755 ± 0.394
AsnMet: 1.081 ± 0.164
AsnAsn: 2.882 ± 0.34
AsnPro: 2.642 ± 0.225
AsnGln: 1.705 ± 0.192
AsnArg: 2.954 ± 0.23
AsnSer: 3.266 ± 0.285
AsnThr: 2.666 ± 0.327
AsnVal: 3.146 ± 0.261
AsnTrp: 1.009 ± 0.178
AsnTyr: 1.849 ± 0.238
AsnXaa: 0.0 ± 0.0

Pro
ProAla: 3.65 ± 0.389
ProCys: 0.12 ± 0.051
ProAsp: 3.122 ± 0.275
ProGlu: 3.818 ± 0.326
ProPhe: 1.057 ± 0.17
ProGly: 3.074 ± 0.374
ProHis: 0.864 ± 0.149
ProIle: 2.257 ± 0.256
ProLys: 2.449 ± 0.329
ProLeu: 2.377 ± 0.239
ProMet: 0.889 ± 0.122
ProAsn: 1.993 ± 0.23
ProPro: 0.913 ± 0.159
ProGln: 0.937 ± 0.155
ProArg: 1.417 ± 0.205
ProSer: 2.089 ± 0.247
ProThr: 2.762 ± 0.281
ProVal: 3.746 ± 0.301
ProTrp: 0.504 ± 0.102
ProTyr: 1.225 ± 0.174
ProXaa: 0.0 ± 0.0

Gln
GlnAla: 3.314 ± 0.437
GlnCys: 0.288 ± 0.096
GlnAsp: 1.969 ± 0.204
GlnGlu: 2.401 ± 0.237
GlnPhe: 1.633 ± 0.196
GlnGly: 2.137 ± 0.214
GlnHis: 0.744 ± 0.135
GlnIle: 2.69 ± 0.267
GlnLys: 2.666 ± 0.476
GlnLeu: 2.834 ± 0.29
GlnMet: 1.441 ± 0.246
GlnAsn: 1.657 ± 0.207
GlnPro: 1.393 ± 0.192
GlnGln: 1.585 ± 0.273
GlnArg: 2.209 ± 0.287
GlnSer: 2.617 ± 0.236
GlnThr: 1.897 ± 0.308
GlnVal: 2.185 ± 0.254
GlnTrp: 0.696 ± 0.133
GlnTyr: 1.177 ± 0.159
GlnXaa: 0.0 ± 0.0

Arg
ArgAla: 3.17 ± 0.212
ArgCys: 0.336 ± 0.092
ArgAsp: 2.882 ± 0.304
ArgGlu: 3.386 ± 0.251
ArgPhe: 1.873 ± 0.259
ArgGly: 2.954 ± 0.307
ArgHis: 0.937 ± 0.134
ArgIle: 3.218 ± 0.283
ArgLys: 3.026 ± 0.31
ArgLeu: 4.226 ± 0.291
ArgMet: 1.489 ± 0.174
ArgAsn: 2.906 ± 0.294
ArgPro: 1.849 ± 0.216
ArgGln: 1.777 ± 0.25
ArgArg: 2.954 ± 0.274
ArgSer: 2.93 ± 0.265
ArgThr: 2.882 ± 0.257
ArgVal: 3.386 ± 0.285
ArgTrp: 0.889 ± 0.155
ArgTyr: 1.969 ± 0.209
ArgXaa: 0.0 ± 0.0

Ser
SerAla: 4.298 ± 0.423
SerCys: 0.528 ± 0.126
SerAsp: 4.058 ± 0.308
SerGlu: 4.539 ± 0.322
SerPhe: 2.786 ± 0.25
SerGly: 4.443 ± 0.411
SerHis: 0.985 ± 0.155
SerIle: 3.698 ± 0.292
SerLys: 3.122 ± 0.26
SerLeu: 4.683 ± 0.343
SerMet: 1.369 ± 0.177
SerAsn: 3.434 ± 0.354
SerPro: 2.666 ± 0.26
SerGln: 2.401 ± 0.247
SerArg: 2.834 ± 0.254
SerSer: 3.53 ± 0.34
SerThr: 4.25 ± 0.34
SerVal: 4.755 ± 0.291
SerTrp: 1.273 ± 0.177
SerTyr: 2.449 ± 0.229
SerXaa: 0.0 ± 0.0

Thr
ThrAla: 4.515 ± 0.385
ThrCys: 0.528 ± 0.109
ThrAsp: 4.346 ± 0.323
ThrGlu: 4.419 ± 0.313
ThrPhe: 2.377 ± 0.254
ThrGly: 5.307 ± 0.533
ThrHis: 0.889 ± 0.14
ThrIle: 3.722 ± 0.296
ThrLys: 3.218 ± 0.297
ThrLeu: 4.635 ± 0.34
ThrMet: 1.369 ± 0.166
ThrAsn: 3.026 ± 0.257
ThrPro: 3.17 ± 0.36
ThrGln: 2.209 ± 0.232
ThrArg: 2.449 ± 0.213
ThrSer: 3.29 ± 0.306
ThrThr: 3.89 ± 0.458
ThrVal: 4.995 ± 0.361
ThrTrp: 1.009 ± 0.152
ThrTyr: 2.617 ± 0.299
ThrXaa: 0.0 ± 0.0

Val
ValAla: 5.595 ± 0.418
ValCys: 0.864 ± 0.151
ValAsp: 5.019 ± 0.433
ValGlu: 4.01 ± 0.312
ValPhe: 2.834 ± 0.318
ValGly: 3.458 ± 0.298
ValHis: 1.297 ± 0.174
ValIle: 4.226 ± 0.324
ValLys: 4.394 ± 0.283
ValLeu: 5.043 ± 0.346
ValMet: 1.753 ± 0.161
ValAsn: 2.882 ± 0.283
ValPro: 2.93 ± 0.257
ValGln: 2.738 ± 0.241
ValArg: 3.554 ± 0.278
ValSer: 3.938 ± 0.356
ValThr: 4.058 ± 0.28
ValVal: 4.683 ± 0.364
ValTrp: 0.913 ± 0.171
ValTyr: 2.666 ± 0.248
ValXaa: 0.0 ± 0.0

Trp
TrpAla: 1.297 ± 0.154
TrpCys: 0.312 ± 0.092
TrpAsp: 1.417 ± 0.184
TrpGlu: 1.009 ± 0.165
TrpPhe: 0.696 ± 0.118
TrpGly: 1.129 ± 0.138
TrpHis: 0.528 ± 0.131
TrpIle: 1.105 ± 0.187
TrpLys: 0.672 ± 0.114
TrpLeu: 1.321 ± 0.169
TrpMet: 0.624 ± 0.121
TrpAsn: 1.129 ± 0.197
TrpPro: 0.552 ± 0.118
TrpGln: 0.624 ± 0.119
TrpArg: 0.84 ± 0.154
TrpSer: 1.297 ± 0.173
TrpThr: 1.177 ± 0.159
TrpVal: 1.129 ± 0.178
TrpTrp: 0.192 ± 0.059
TrpTyr: 0.792 ± 0.137
TrpXaa: 0.0 ± 0.0

Tyr
TyrAla: 2.473 ± 0.242
TyrCys: 0.456 ± 0.116
TyrAsp: 3.002 ± 0.318
TyrGlu: 2.882 ± 0.254
TyrPhe: 1.585 ± 0.191
TyrGly: 2.882 ± 0.283
TyrHis: 0.672 ± 0.164
TyrIle: 2.041 ± 0.207
TyrLys: 2.377 ± 0.292
TyrLeu: 2.93 ± 0.257
TyrMet: 1.105 ± 0.174
TyrAsn: 1.705 ± 0.221
TyrPro: 1.465 ± 0.218
TyrGln: 1.465 ± 0.164
TyrArg: 2.089 ± 0.239
TyrSer: 2.882 ± 0.257
TyrThr: 2.497 ± 0.23
TyrVal: 2.714 ± 0.308
TyrTrp: 0.672 ± 0.15
TyrTyr: 1.609 ± 0.215
TyrXaa: 0.0 ± 0.0

Xaa
XaaAla: 0.0 ± 0.0
XaaCys: 0.0 ± 0.0
XaaAsp: 0.0 ± 0.0
XaaGlu: 0.0 ± 0.0
XaaPhe: 0.0 ± 0.0
XaaGly: 0.0 ± 0.0
XaaHis: 0.0 ± 0.0
XaaIle: 0.0 ± 0.0
XaaLys: 0.0 ± 0.0
XaaLeu: 0.0 ± 0.0
XaaMet: 0.0 ± 0.0
XaaAsn: 0.0 ± 0.0
XaaPro: 0.0 ± 0.0
XaaGln: 0.0 ± 0.0
XaaArg: 0.0 ± 0.0
XaaSer: 0.0 ± 0.0
XaaThr: 0.0 ± 0.0
XaaVal: 0.0 ± 0.0
XaaTrp: 0.0 ± 0.0
XaaTyr: 0.0 ± 0.0
XaaXaa: 0.0 ± 0.0

Statistics based on 250 proteins (41644 amino acids)

Note: The error was estimated by bootstrapping (x100) at the protein level.
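One way such an error estimate could be obtained is sketched below: whole proteins are resampled with replacement 100 times, the dipeptide frequency is recomputed for each resample, and the spread of those values is reported. Taking the standard deviation as the error, the function names, and the fixed seed are all assumptions for this illustration and are not confirmed by the source.

```python
import random
import statistics
from collections import Counter


def pair_permille(proteins, pair):
    """Frequency of one dipeptide in per mille over all overlapping pairs."""
    counts = Counter()
    for seq in proteins:
        counts.update(seq[i:i + 2] for i in range(len(seq) - 1))
    total = sum(counts.values())
    return 1000.0 * counts[pair] / total if total else 0.0


def bootstrap_error(proteins, pair, n_resamples=100, seed=0):
    """Spread of the dipeptide frequency across protein-level resamples."""
    rng = random.Random(seed)  # fixed seed only to make the sketch repeatable
    estimates = []
    for _ in range(n_resamples):
        # Resample whole proteins with replacement, keeping the set size fixed.
        sample = rng.choices(proteins, k=len(proteins))
        estimates.append(pair_permille(sample, pair))
    # Assumption: the reported error is the standard deviation of the estimates.
    return statistics.stdev(estimates)


if __name__ == "__main__":
    toy = ["AAAA", "ACAC", "CCAA", "AACA"]  # hypothetical toy proteome
    print(pair_permille(toy, "AA"), bootstrap_error(toy, "AA"))
```

Resampling proteins rather than individual residues keeps each sequence intact, so the pairs inside a protein stay together in every resample.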

The dipeptide statistics above (along with other statistics for this proteome) can be downloaded from this CSV file
See this proteome in: uniprot_link
Proteome-pI is available under a Creative Commons Attribution-NoDerivs license; for more details see here

Reference: Kozlowski LP. Proteome-pI 2.0: Proteome Isoelectric Point Database Update. Nucleic Acids Res. 2021, doi: 10.1093/nar/gkab944
Contact: Lukasz P. Kozlowski