Amino acid dipeptide frequency for Rhodococcus phage Peregrin

Apart from single amino acid frequencies, one can also calculate the so-called amino acid dipeptide frequency, i.e. how often each ordered pair of adjacent amino acids occurs.

There are 400 possible dipeptides (441 if Xaa is counted as an additional, 21st amino acid representing all non-standard or unknown amino acids). Thus, if dipeptides occurred randomly and with equal probability in proteins, each of the 400 dipeptides would have a frequency of 0.25% (2.5‰). This is not the case in nature, so for better visibility dipeptides that are overrepresented relative to this expectation are marked in red in the table, and underrepresented ones are marked in blue.
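The calculation described above can be sketched in Python. This is only a minimal illustration, not the database's own code: the function name `dipeptide_per_mille`, the choice to count overlapping pairs within each protein (never across protein boundaries), the mapping of any non-standard letter to `X` (Xaa), and the normalisation by the total number of pairs are all assumptions, and the database may normalise differently.

```python
from collections import Counter

# The 20 standard amino acids in one-letter code.
STANDARD = set("ACDEFGHIKLMNPQRSTVWY")


def dipeptide_per_mille(proteins):
    """Return {dipeptide: frequency in per mille} for a list of protein sequences.

    Assumptions (not taken from the source): pairs overlap, pairs never span
    two proteins, and frequencies are normalised by the total pair count.
    """
    counts = Counter()
    for seq in proteins:
        # Collapse every non-standard or unknown residue into X (Xaa).
        seq = "".join(aa if aa in STANDARD else "X" for aa in seq.upper())
        # Count each adjacent, overlapping pair of residues.
        counts.update(seq[i:i + 2] for i in range(len(seq) - 1))
    total = sum(counts.values())
    if total == 0:
        return {}
    return {dp: 1000.0 * n / total for dp, n in counts.items()}


# Expected value if all 400 standard dipeptides were equally likely.
EXPECTED_PER_MILLE = 1000.0 / 400  # 2.5 per mille, i.e. 0.25%

if __name__ == "__main__":
    toy_proteome = ["MAALKE", "MKELLA"]  # hypothetical sequences
    for dipeptide, freq in sorted(dipeptide_per_mille(toy_proteome).items()):
        label = "over" if freq > EXPECTED_PER_MILLE else "under"
        print(f"{dipeptide}: {freq:.3f} ({label}represented)")
```

On a real proteome the same comparison against 2.5‰ would reproduce the red/blue distinction, provided the counting conventions match.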

All values are given in per mille (‰), so they need to be multiplied by 10^-3 to obtain fractions (for example, 6.071‰ = 0.006071).

For more information, see the sequence space article on Wikipedia.

Ala Cys Asp Glu Phe Gly His Ile Lys Leu Met Asn Pro Gln Arg Ser Thr Val Trp Tyr Xaa
Ala
AlaAla: 6.071 ± 0.809
AlaCys: 0.806 ± 0.155
AlaAsp: 4.056 ± 0.33
AlaGlu: 4.862 ± 0.354
AlaPhe: 3.25 ± 0.331
AlaGly: 5.013 ± 0.538
AlaHis: 1.159 ± 0.175
AlaIle: 4.912 ± 0.361
AlaLys: 5.089 ± 0.368
AlaLeu: 6.348 ± 0.614
AlaMet: 2.141 ± 0.288
AlaAsn: 2.973 ± 0.251
AlaPro: 2.292 ± 0.251
AlaGln: 2.695 ± 0.293
AlaArg: 3.376 ± 0.32
AlaSer: 4.257 ± 0.384
AlaThr: 4.056 ± 0.435
AlaVal: 4.031 ± 0.4
AlaTrp: 1.562 ± 0.188
AlaTyr: 2.544 ± 0.316
AlaXaa: 0.0 ± 0.0
Cys
CysAla: 0.655 ± 0.157
CysCys: 0.252 ± 0.074
CysAsp: 1.008 ± 0.174
CysGlu: 1.083 ± 0.193
CysPhe: 0.403 ± 0.124
CysGly: 1.134 ± 0.193
CysHis: 0.353 ± 0.11
CysIle: 0.705 ± 0.145
CysLys: 0.68 ± 0.139
CysLeu: 0.781 ± 0.162
CysMet: 0.202 ± 0.068
CysAsn: 0.403 ± 0.112
CysPro: 0.479 ± 0.117
CysGln: 0.176 ± 0.083
CysArg: 0.731 ± 0.13
CysSer: 0.731 ± 0.137
CysThr: 0.68 ± 0.145
CysVal: 0.68 ± 0.119
CysTrp: 0.277 ± 0.081
CysTyr: 0.479 ± 0.098
CysXaa: 0.0 ± 0.0
Asp
AspAla: 5.114 ± 0.363
AspCys: 0.932 ± 0.177
AspAsp: 4.635 ± 0.373
AspGlu: 5.769 ± 0.455
AspPhe: 3.199 ± 0.294
AspGly: 5.139 ± 0.345
AspHis: 1.285 ± 0.187
AspIle: 4.711 ± 0.34
AspLys: 3.905 ± 0.311
AspLeu: 6.273 ± 0.392
AspMet: 2.015 ± 0.239
AspAsn: 2.922 ± 0.245
AspPro: 2.569 ± 0.247
AspGln: 2.141 ± 0.244
AspArg: 2.998 ± 0.252
AspSer: 4.156 ± 0.316
AspThr: 3.678 ± 0.293
AspVal: 4.156 ± 0.451
AspTrp: 1.31 ± 0.155
AspTyr: 2.872 ± 0.316
AspXaa: 0.0 ± 0.0
Glu
GluAla: 4.988 ± 0.428
GluCys: 1.058 ± 0.198
GluAsp: 6.071 ± 0.446
GluGlu: 6.298 ± 0.668
GluPhe: 3.552 ± 0.306
GluGly: 4.232 ± 0.346
GluHis: 1.763 ± 0.211
GluIle: 5.492 ± 0.324
GluLys: 4.509 ± 0.49
GluLeu: 8.111 ± 0.486
GluMet: 2.116 ± 0.209
GluAsn: 3.25 ± 0.261
GluPro: 2.695 ± 0.261
GluGln: 2.569 ± 0.232
GluArg: 3.426 ± 0.306
GluSer: 4.005 ± 0.303
GluThr: 3.653 ± 0.443
GluVal: 5.391 ± 0.41
GluTrp: 1.411 ± 0.193
GluTyr: 3.779 ± 0.424
GluXaa: 0.0 ± 0.0
Phe
PheAla: 1.94 ± 0.233
PheCys: 0.529 ± 0.119
PheAsp: 3.174 ± 0.275
PheGlu: 3.023 ± 0.32
PhePhe: 1.285 ± 0.205
PheGly: 2.444 ± 0.224
PheHis: 0.806 ± 0.151
PheIle: 2.267 ± 0.264
PheLys: 2.418 ± 0.235
PheLeu: 3.073 ± 0.336
PheMet: 0.856 ± 0.14
PheAsn: 2.368 ± 0.236
PhePro: 1.058 ± 0.17
PheGln: 1.184 ± 0.187
PheArg: 1.688 ± 0.226
PheSer: 2.746 ± 0.252
PheThr: 2.393 ± 0.24
PheVal: 2.318 ± 0.255
PheTrp: 0.856 ± 0.158
PheTyr: 1.461 ± 0.223
PheXaa: 0.0 ± 0.0
Gly
GlyAla: 4.257 ± 0.633
GlyCys: 0.605 ± 0.118
GlyAsp: 4.534 ± 0.405
GlyGlu: 4.106 ± 0.313
GlyPhe: 2.847 ± 0.281
GlyGly: 3.879 ± 0.535
GlyHis: 1.537 ± 0.19
GlyIle: 4.786 ± 0.331
GlyLys: 3.879 ± 0.29
GlyLeu: 4.862 ± 0.552
GlyMet: 1.688 ± 0.189
GlyAsn: 3.25 ± 0.342
GlyPro: 2.116 ± 0.193
GlyGln: 2.04 ± 0.279
GlyArg: 3.098 ± 0.368
GlySer: 4.56 ± 0.364
GlyThr: 4.509 ± 0.487
GlyVal: 4.358 ± 0.286
GlyTrp: 1.587 ± 0.182
GlyTyr: 3.275 ± 0.371
GlyXaa: 0.0 ± 0.0
His
HisAla: 1.108 ± 0.195
HisCys: 0.227 ± 0.073
HisAsp: 1.385 ± 0.175
HisGlu: 1.436 ± 0.184
HisPhe: 0.907 ± 0.144
HisGly: 1.411 ± 0.17
HisHis: 0.353 ± 0.107
HisIle: 1.612 ± 0.218
HisLys: 1.26 ± 0.163
HisLeu: 1.033 ± 0.164
HisMet: 0.504 ± 0.095
HisAsn: 1.008 ± 0.162
HisPro: 0.856 ± 0.164
HisGln: 0.68 ± 0.123
HisArg: 1.285 ± 0.21
HisSer: 1.134 ± 0.173
HisThr: 1.335 ± 0.191
HisVal: 1.411 ± 0.163
HisTrp: 0.353 ± 0.106
HisTyr: 0.957 ± 0.165
HisXaa: 0.0 ± 0.0
Ile
IleAla: 5.718 ± 0.429
IleCys: 0.63 ± 0.142
IleAsp: 5.441 ± 0.383
IleGlu: 5.668 ± 0.417
IlePhe: 1.31 ± 0.182
IleGly: 4.005 ± 0.367
IleHis: 1.058 ± 0.167
IleIle: 4.056 ± 0.302
IleLys: 4.635 ± 0.328
IleLeu: 4.434 ± 0.327
IleMet: 1.411 ± 0.191
IleAsn: 2.998 ± 0.304
IlePro: 2.67 ± 0.366
IleGln: 1.814 ± 0.256
IleArg: 2.922 ± 0.255
IleSer: 4.333 ± 0.367
IleThr: 4.333 ± 0.325
IleVal: 4.383 ± 0.324
IleTrp: 0.957 ± 0.159
IleTyr: 2.368 ± 0.309
IleXaa: 0.0 ± 0.0
Lys
LysAla: 5.265 ± 0.413
LysCys: 0.63 ± 0.134
LysAsp: 4.786 ± 0.393
LysGlu: 5.416 ± 0.448
LysPhe: 2.343 ± 0.218
LysGly: 3.275 ± 0.307
LysHis: 1.108 ± 0.167
LysIle: 4.333 ± 0.564
LysLys: 4.434 ± 0.39
LysLeu: 5.366 ± 0.355
LysMet: 2.267 ± 0.244
LysAsn: 3.25 ± 0.32
LysPro: 2.721 ± 0.284
LysGln: 2.015 ± 0.222
LysArg: 2.998 ± 0.275
LysSer: 3.627 ± 0.306
LysThr: 3.753 ± 0.303
LysVal: 4.358 ± 0.381
LysTrp: 1.285 ± 0.192
LysTyr: 2.847 ± 0.331
LysXaa: 0.0 ± 0.0
Leu
LeuAla: 6.398 ± 0.45
LeuCys: 0.68 ± 0.147
LeuAsp: 6.298 ± 0.403
LeuGlu: 7.23 ± 0.522
LeuPhe: 2.973 ± 0.28
LeuGly: 5.618 ± 0.535
LeuHis: 1.562 ± 0.184
LeuIle: 4.61 ± 0.398
LeuLys: 5.97 ± 0.386
LeuLeu: 5.189 ± 0.412
LeuMet: 2.04 ± 0.224
LeuAsn: 4.308 ± 0.299
LeuPro: 3.124 ± 0.411
LeuGln: 2.091 ± 0.232
LeuArg: 3.325 ± 0.255
LeuSer: 5.164 ± 0.331
LeuThr: 3.602 ± 0.349
LeuVal: 4.207 ± 0.337
LeuTrp: 1.411 ± 0.203
LeuTyr: 2.519 ± 0.292
LeuXaa: 0.0 ± 0.0
Met
MetAla: 1.915 ± 0.207
MetCys: 0.353 ± 0.106
MetAsp: 1.335 ± 0.22
MetGlu: 1.814 ± 0.253
MetPhe: 0.982 ± 0.186
MetGly: 1.864 ± 0.373
MetHis: 0.655 ± 0.138
MetIle: 1.814 ± 0.228
MetLys: 2.04 ± 0.258
MetLeu: 1.663 ± 0.2
MetMet: 0.479 ± 0.115
MetAsn: 1.537 ± 0.19
MetPro: 1.159 ± 0.144
MetGln: 0.63 ± 0.13
MetArg: 1.335 ± 0.17
MetSer: 2.217 ± 0.247
MetThr: 1.688 ± 0.205
MetVal: 1.486 ± 0.204
MetTrp: 0.252 ± 0.084
MetTyr: 1.108 ± 0.16
MetXaa: 0.0 ± 0.0
Asn
AsnAla: 3.829 ± 0.466
AsnCys: 0.756 ± 0.13
AsnAsp: 2.771 ± 0.249
AsnGlu: 3.653 ± 0.286
AsnPhe: 1.587 ± 0.171
AsnGly: 3.577 ± 0.29
AsnHis: 1.209 ± 0.183
AsnIle: 2.771 ± 0.241
AsnLys: 3.376 ± 0.387
AsnLeu: 3.627 ± 0.294
AsnMet: 1.134 ± 0.17
AsnAsn: 2.192 ± 0.269
AsnPro: 2.318 ± 0.275
AsnGln: 1.637 ± 0.241
AsnArg: 2.217 ± 0.209
AsnSer: 3.073 ± 0.294
AsnThr: 2.418 ± 0.301
AsnVal: 2.897 ± 0.366
AsnTrp: 0.982 ± 0.158
AsnTyr: 2.343 ± 0.293
AsnXaa: 0.0 ± 0.0
Pro
ProAla: 2.318 ± 0.29
ProCys: 0.327 ± 0.08
ProAsp: 2.544 ± 0.238
ProGlu: 3.703 ± 0.317
ProPhe: 1.083 ± 0.14
ProGly: 2.494 ± 0.24
ProHis: 0.68 ± 0.144
ProIle: 1.915 ± 0.232
ProLys: 2.595 ± 0.261
ProLeu: 2.62 ± 0.282
ProMet: 0.756 ± 0.112
ProAsn: 1.94 ± 0.241
ProPro: 1.31 ± 0.302
ProGln: 1.159 ± 0.187
ProArg: 1.461 ± 0.198
ProSer: 1.965 ± 0.204
ProThr: 2.116 ± 0.231
ProVal: 3.3 ± 0.334
ProTrp: 0.63 ± 0.133
ProTyr: 1.763 ± 0.232
ProXaa: 0.0 ± 0.0
Gln
GlnAla: 2.947 ± 0.408
GlnCys: 0.378 ± 0.101
GlnAsp: 1.99 ± 0.221
GlnGlu: 2.318 ± 0.299
GlnPhe: 1.083 ± 0.141
GlnGly: 1.587 ± 0.252
GlnHis: 0.831 ± 0.14
GlnIle: 2.544 ± 0.33
GlnLys: 2.166 ± 0.197
GlnLeu: 2.494 ± 0.255
GlnMet: 0.957 ± 0.132
GlnAsn: 1.385 ± 0.157
GlnPro: 1.108 ± 0.159
GlnGln: 1.31 ± 0.203
GlnArg: 1.486 ± 0.173
GlnSer: 1.688 ± 0.173
GlnThr: 2.141 ± 0.383
GlnVal: 2.267 ± 0.228
GlnTrp: 0.579 ± 0.117
GlnTyr: 1.058 ± 0.182
GlnXaa: 0.0 ± 0.0
Arg
ArgAla: 2.746 ± 0.218
ArgCys: 0.378 ± 0.12
ArgAsp: 2.595 ± 0.251
ArgGlu: 3.653 ± 0.294
ArgPhe: 1.612 ± 0.23
ArgGly: 3.174 ± 0.361
ArgHis: 0.957 ± 0.172
ArgIle: 3.124 ± 0.241
ArgLys: 3.703 ± 0.351
ArgLeu: 3.476 ± 0.22
ArgMet: 1.461 ± 0.165
ArgAsn: 2.343 ± 0.24
ArgPro: 1.587 ± 0.194
ArgGln: 1.94 ± 0.23
ArgArg: 2.544 ± 0.294
ArgSer: 2.267 ± 0.251
ArgThr: 2.67 ± 0.286
ArgVal: 3.023 ± 0.254
ArgTrp: 0.982 ± 0.184
ArgTyr: 2.166 ± 0.266
ArgXaa: 0.0 ± 0.0
Ser
SerAla: 3.955 ± 0.379
SerCys: 0.63 ± 0.127
SerAsp: 3.804 ± 0.299
SerGlu: 4.635 ± 0.342
SerPhe: 2.393 ± 0.242
SerGly: 4.459 ± 0.428
SerHis: 1.26 ± 0.212
SerIle: 3.93 ± 0.297
SerLys: 4.207 ± 0.324
SerLeu: 4.358 ± 0.359
SerMet: 1.713 ± 0.209
SerAsn: 3.174 ± 0.27
SerPro: 1.94 ± 0.202
SerGln: 2.267 ± 0.214
SerArg: 3.098 ± 0.303
SerSer: 4.358 ± 0.617
SerThr: 3.93 ± 0.361
SerVal: 4.156 ± 0.403
SerTrp: 0.756 ± 0.143
SerTyr: 2.796 ± 0.315
SerXaa: 0.0 ± 0.0
Thr
ThrAla: 3.955 ± 0.356
ThrCys: 0.756 ± 0.163
ThrAsp: 3.577 ± 0.336
ThrGlu: 3.753 ± 0.315
ThrPhe: 2.847 ± 0.294
ThrGly: 4.408 ± 0.441
ThrHis: 1.234 ± 0.166
ThrIle: 3.401 ± 0.336
ThrLys: 3.753 ± 0.273
ThrLeu: 4.811 ± 0.402
ThrMet: 1.26 ± 0.179
ThrAsn: 2.67 ± 0.277
ThrPro: 2.091 ± 0.228
ThrGln: 1.839 ± 0.263
ThrArg: 2.67 ± 0.274
ThrSer: 3.854 ± 0.432
ThrThr: 3.451 ± 0.43
ThrVal: 4.257 ± 0.356
ThrTrp: 1.134 ± 0.18
ThrTyr: 1.688 ± 0.262
ThrXaa: 0.0 ± 0.0
Val
ValAla: 4.308 ± 0.302
ValCys: 1.033 ± 0.184
ValAsp: 5.164 ± 0.549
ValGlu: 5.265 ± 0.35
ValPhe: 2.217 ± 0.252
ValGly: 3.829 ± 0.34
ValHis: 1.184 ± 0.161
ValIle: 4.156 ± 0.298
ValLys: 4.257 ± 0.342
ValLeu: 5.265 ± 0.347
ValMet: 1.713 ± 0.204
ValAsn: 3.3 ± 0.289
ValPro: 2.519 ± 0.26
ValGln: 2.217 ± 0.218
ValArg: 2.947 ± 0.287
ValSer: 4.182 ± 0.35
ValThr: 3.476 ± 0.259
ValVal: 4.736 ± 0.411
ValTrp: 1.26 ± 0.168
ValTyr: 2.847 ± 0.309
ValXaa: 0.0 ± 0.0
Trp
TrpAla: 1.285 ± 0.162
TrpCys: 0.277 ± 0.08
TrpAsp: 1.587 ± 0.248
TrpGlu: 1.385 ± 0.212
TrpPhe: 0.781 ± 0.147
TrpGly: 1.134 ± 0.179
TrpHis: 0.403 ± 0.093
TrpIle: 1.033 ± 0.172
TrpLys: 1.234 ± 0.214
TrpLeu: 1.411 ± 0.218
TrpMet: 0.579 ± 0.136
TrpAsn: 1.008 ± 0.18
TrpPro: 0.63 ± 0.132
TrpGln: 0.63 ± 0.131
TrpArg: 0.932 ± 0.167
TrpSer: 0.932 ± 0.16
TrpThr: 1.108 ± 0.157
TrpVal: 1.36 ± 0.195
TrpTrp: 0.302 ± 0.106
TrpTyr: 0.63 ± 0.144
TrpXaa: 0.0 ± 0.0
Tyr
TyrAla: 2.721 ± 0.275
TyrCys: 0.655 ± 0.133
TyrAsp: 3.023 ± 0.271
TyrGlu: 3.3 ± 0.331
TyrPhe: 1.234 ± 0.19
TyrGly: 3.098 ± 0.379
TyrHis: 0.856 ± 0.162
TyrIle: 2.973 ± 0.359
TyrLys: 1.99 ± 0.276
TyrLeu: 3.124 ± 0.279
TyrMet: 1.008 ± 0.172
TyrAsn: 2.091 ± 0.237
TyrPro: 1.436 ± 0.198
TyrGln: 1.285 ± 0.198
TyrArg: 1.864 ± 0.215
TyrSer: 2.569 ± 0.32
TyrThr: 2.418 ± 0.28
TyrVal: 3.098 ± 0.331
TyrTrp: 0.756 ± 0.149
TyrTyr: 1.688 ± 0.218
TyrXaa: 0.0 ± 0.0
Xaa
XaaAla: 0.0 ± 0.0
XaaCys: 0.0 ± 0.0
XaaAsp: 0.0 ± 0.0
XaaGlu: 0.0 ± 0.0
XaaPhe: 0.0 ± 0.0
XaaGly: 0.0 ± 0.0
XaaHis: 0.0 ± 0.0
XaaIle: 0.0 ± 0.0
XaaLys: 0.0 ± 0.0
XaaLeu: 0.0 ± 0.0
XaaMet: 0.0 ± 0.0
XaaAsn: 0.0 ± 0.0
XaaPro: 0.0 ± 0.0
XaaGln: 0.0 ± 0.0
XaaArg: 0.0 ± 0.0
XaaSer: 0.0 ± 0.0
XaaThr: 0.0 ± 0.0
XaaVal: 0.0 ± 0.0
XaaTrp: 0.0 ± 0.0
XaaTyr: 0.0 ± 0.0
XaaXaa: 0.0 ± 0.0
Statistics based on 267 proteins (39698 amino acids)

Note: The error was estimated by bootstrapping (×100) at the protein level.
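A protein-level bootstrap of this kind can be sketched as follows. This is an illustrative sketch under stated assumptions, not the procedure actually used by the database: the helper names `pair_counts` and `bootstrap_error`, the fixed random seed, the use of the sample standard deviation as the reported error, and the normalisation by the number of pairs are all hypothetical choices.

```python
import random
from collections import Counter


def pair_counts(seq):
    """Count overlapping adjacent residue pairs in one protein sequence."""
    return Counter(seq[i:i + 2] for i in range(len(seq) - 1))


def bootstrap_error(proteins, dipeptide, n_boot=100, seed=0):
    """Estimate the error (in per mille) of one dipeptide frequency.

    Whole proteins are resampled with replacement n_boot times (n_boot >= 2);
    the standard deviation of the resampled frequencies is returned.
    """
    rng = random.Random(seed)  # fixed seed only for reproducibility
    per_protein = [pair_counts(s) for s in proteins]
    estimates = []
    for _ in range(n_boot):
        # Resample at the protein level, keeping each protein intact.
        sample = rng.choices(per_protein, k=len(per_protein))
        total = sum(sum(c.values()) for c in sample)
        hits = sum(c[dipeptide] for c in sample)
        estimates.append(1000.0 * hits / total if total else 0.0)
    mean = sum(estimates) / n_boot
    variance = sum((e - mean) ** 2 for e in estimates) / (n_boot - 1)
    return variance ** 0.5


if __name__ == "__main__":
    toy_proteome = ["MAALKE", "MKELLA", "AAGGAA"]  # hypothetical sequences
    print(bootstrap_error(toy_proteome, "AA"))
```

Resampling whole proteins rather than individual dipeptides keeps the pairs within a protein together, so the error reflects variation between proteins.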

The dipeptide statistics above (among other statistics for this proteome) can be downloaded from this CSV file
See this proteome in: uniprot_link
Proteome-pI is available under Creative Commons Attribution-NoDerivs license, for more details see here

Reference: Kozlowski LP. Proteome-pI 2.0: Proteome Isoelectric Point Database Update. Nucleic Acids Res. 2021, doi: 10.1093/nar/gkab944
Contact: Lukasz P. Kozlowski