Amino acid dipeptide frequency for Streptomyces phage Faust

Apart from single amino acid frequencies, one can also calculate the so-called amino acid dipeptide frequency, i.e. how often each ordered pair of adjacent amino acids occurs in the proteome.
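As an illustration of the idea, the following is a minimal sketch of how dipeptide frequencies could be counted from a set of protein sequences. The function name `dipeptide_frequencies` and the toy sequences are hypothetical, and normalising by the total number of dipeptides is an assumption; the database may normalise differently.

```python
from collections import Counter


def dipeptide_frequencies(proteins):
    """Return dipeptide frequencies in per mille for a list of sequences.

    Assumption: frequencies are normalised by the total number of
    dipeptides counted; the database may use a different denominator.
    """
    counts = Counter()
    for seq in proteins:
        # Overlapping adjacent pairs within one protein;
        # a pair never spans two different proteins.
        for first, second in zip(seq, seq[1:]):
            counts[first + second] += 1
    total = sum(counts.values())
    if total == 0:
        return {}
    return {pair: 1000.0 * n / total for pair, n in counts.items()}


# Toy input (one-letter codes), not taken from the phage proteome.
toy_proteins = ["MAAK", "MAK"]
for pair, value in sorted(dipeptide_frequencies(toy_proteins).items()):
    print(pair, value)
```

The order of the two residues matters here, which is why the table lists, for example, AlaCys and CysAla separately.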

There are 400 possibilities (441 if we consider Xaa as an additional 21st amino acid representing all non-standard or unknown amino acids). Thus, if dipeptides occurred randomly in proteins, with all 20 amino acids equally likely, each dipeptide would be present with a frequency of 1/400 = 0.25% (2.5‰). As this is not the case in nature, for better visibility all overrepresented dipeptides are marked in red, and those which are underrepresented are marked in blue in the table.

All values are given in per mille (‰) and therefore need to be multiplied by 10^-3 to obtain fractions.
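The unit conversion and the comparison with the random baseline can be sketched as follows. This assumes the baseline is the uniform 0.25% (2.5‰) figure mentioned above; the exact threshold used for the red and blue marking on the page is not stated, and the names `per_mille_to_fraction` and `classify` are hypothetical.

```python
# Uniform expectation for 400 dipeptides, expressed in per mille.
UNIFORM_EXPECTATION_PER_MILLE = 1000.0 / 400  # 2.5


def per_mille_to_fraction(value_per_mille):
    """Convert a per mille value to a plain fraction (multiply by 10^-3)."""
    return value_per_mille * 1e-3


def classify(value_per_mille, expected=UNIFORM_EXPECTATION_PER_MILLE):
    """Label a dipeptide relative to the assumed uniform baseline."""
    if value_per_mille > expected:
        return "over"
    if value_per_mille < expected:
        return "under"
    return "as expected"


# Two values taken from the table: AlaAla (7.65) and CysCys (0.136).
for name, value in [("AlaAla", 7.65), ("CysCys", 0.136)]:
    print(name, per_mille_to_fraction(value), classify(value))
```

Under this assumed baseline, AlaAla is classified as overrepresented and CysCys as underrepresented.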

For more information, see the sequence space article on Wikipedia.

Ala Cys Asp Glu Phe Gly His Ile Lys Leu Met Asn Pro Gln Arg Ser Thr Val Trp Tyr Xaa
Ala
AlaAla: 7.65 ± 0.921
AlaCys: 0.705 ± 0.194
AlaAsp: 4.476 ± 0.312
AlaGlu: 5.859 ± 0.543
AlaPhe: 3.716 ± 0.326
AlaGly: 6.212 ± 0.623
AlaHis: 1.194 ± 0.196
AlaIle: 4.476 ± 0.406
AlaLys: 4.856 ± 0.394
AlaLeu: 6.402 ± 0.499
AlaMet: 2.93 ± 0.422
AlaAsn: 3.418 ± 0.453
AlaPro: 2.767 ± 0.283
AlaGln: 3.309 ± 0.325
AlaArg: 4.422 ± 0.39
AlaSer: 4.367 ± 0.525
AlaThr: 4.774 ± 0.603
AlaVal: 5.913 ± 0.444
AlaTrp: 1.438 ± 0.192
AlaTyr: 2.604 ± 0.331
AlaXaa: 0.0 ± 0.0
Cys
CysAla: 0.543 ± 0.138
CysCys: 0.136 ± 0.06
CysAsp: 0.732 ± 0.168
CysGlu: 0.814 ± 0.161
CysPhe: 0.461 ± 0.155
CysGly: 1.058 ± 0.219
CysHis: 0.298 ± 0.092
CysIle: 0.543 ± 0.127
CysLys: 0.705 ± 0.167
CysLeu: 0.76 ± 0.156
CysMet: 0.353 ± 0.109
CysAsn: 0.244 ± 0.095
CysPro: 0.57 ± 0.144
CysGln: 0.38 ± 0.11
CysArg: 0.624 ± 0.168
CysSer: 0.488 ± 0.142
CysThr: 0.434 ± 0.109
CysVal: 0.543 ± 0.108
CysTrp: 0.109 ± 0.054
CysTyr: 0.326 ± 0.112
CysXaa: 0.0 ± 0.0
Asp
AspAla: 5.425 ± 0.426
AspCys: 0.814 ± 0.173
AspAsp: 3.771 ± 0.349
AspGlu: 5.208 ± 0.481
AspPhe: 3.472 ± 0.366
AspGly: 5.588 ± 0.367
AspHis: 1.166 ± 0.208
AspIle: 3.309 ± 0.331
AspLys: 3.743 ± 0.344
AspLeu: 4.774 ± 0.388
AspMet: 1.926 ± 0.236
AspAsn: 2.93 ± 0.317
AspPro: 1.926 ± 0.216
AspGln: 1.546 ± 0.187
AspArg: 2.848 ± 0.285
AspSer: 3.879 ± 0.325
AspThr: 3.418 ± 0.573
AspVal: 4.476 ± 0.465
AspTrp: 1.465 ± 0.16
AspTyr: 3.119 ± 0.28
AspXaa: 0.0 ± 0.0
Glu
GluAla: 5.669 ± 0.513
GluCys: 0.651 ± 0.15
GluAsp: 4.774 ± 0.39
GluGlu: 5.615 ± 0.57
GluPhe: 3.499 ± 0.319
GluGly: 4.584 ± 0.397
GluHis: 1.465 ± 0.215
GluIle: 3.988 ± 0.382
GluLys: 4.774 ± 0.423
GluLeu: 5.235 ± 0.405
GluMet: 2.279 ± 0.324
GluAsn: 3.201 ± 0.297
GluPro: 1.899 ± 0.215
GluGln: 2.55 ± 0.348
GluArg: 4.313 ± 0.417
GluSer: 3.581 ± 0.372
GluThr: 3.608 ± 0.367
GluVal: 4.828 ± 0.386
GluTrp: 1.573 ± 0.277
GluTyr: 3.038 ± 0.318
GluXaa: 0.0 ± 0.0
Phe
PheAla: 2.767 ± 0.319
PheCys: 0.543 ± 0.115
PheAsp: 3.174 ± 0.274
PheGlu: 2.821 ± 0.292
PhePhe: 1.628 ± 0.233
PheGly: 2.875 ± 0.309
PheHis: 0.949 ± 0.162
PheIle: 1.899 ± 0.218
PheLys: 2.306 ± 0.277
PheLeu: 2.658 ± 0.302
PheMet: 0.922 ± 0.16
PheAsn: 2.279 ± 0.229
PhePro: 1.248 ± 0.182
PheGln: 1.383 ± 0.193
PheArg: 2.306 ± 0.239
PheSer: 2.794 ± 0.311
PheThr: 2.468 ± 0.249
PheVal: 3.092 ± 0.349
PheTrp: 0.841 ± 0.174
PheTyr: 1.573 ± 0.218
PheXaa: 0.0 ± 0.0
Gly
GlyAla: 5.154 ± 0.422
GlyCys: 0.57 ± 0.132
GlyAsp: 4.747 ± 0.382
GlyGlu: 4.205 ± 0.312
GlyPhe: 3.228 ± 0.275
GlyGly: 5.398 ± 0.51
GlyHis: 1.6 ± 0.223
GlyIle: 5.127 ± 0.352
GlyLys: 4.774 ± 0.406
GlyLeu: 5.751 ± 0.478
GlyMet: 2.523 ± 0.285
GlyAsn: 3.716 ± 0.342
GlyPro: 2.387 ± 0.31
GlyGln: 2.143 ± 0.257
GlyArg: 4.015 ± 0.364
GlySer: 5.1 ± 0.489
GlyThr: 4.828 ± 0.791
GlyVal: 5.398 ± 0.472
GlyTrp: 1.817 ± 0.246
GlyTyr: 3.418 ± 0.35
GlyXaa: 0.0 ± 0.0
His
HisAla: 1.221 ± 0.181
HisCys: 0.19 ± 0.074
HisAsp: 1.221 ± 0.205
HisGlu: 1.221 ± 0.207
HisPhe: 0.841 ± 0.172
HisGly: 1.817 ± 0.265
HisHis: 0.543 ± 0.106
HisIle: 1.004 ± 0.168
HisLys: 0.922 ± 0.163
HisLeu: 1.411 ± 0.191
HisMet: 0.488 ± 0.13
HisAsn: 1.085 ± 0.196
HisPro: 0.732 ± 0.151
HisGln: 0.597 ± 0.162
HisArg: 1.112 ± 0.166
HisSer: 1.085 ± 0.171
HisThr: 1.031 ± 0.194
HisVal: 1.221 ± 0.198
HisTrp: 0.407 ± 0.109
HisTyr: 0.814 ± 0.16
HisXaa: 0.0 ± 0.0
Ile
IleAla: 5.154 ± 0.422
IleCys: 0.543 ± 0.142
IleAsp: 4.34 ± 0.336
IleGlu: 4.476 ± 0.388
IlePhe: 1.275 ± 0.156
IleGly: 4.449 ± 0.375
IleHis: 1.383 ± 0.196
IleIle: 2.794 ± 0.343
IleLys: 3.445 ± 0.361
IleLeu: 3.798 ± 0.367
IleMet: 1.166 ± 0.179
IleAsn: 1.79 ± 0.212
IlePro: 2.577 ± 0.271
IleGln: 1.6 ± 0.188
IleArg: 2.902 ± 0.366
IleSer: 2.984 ± 0.289
IleThr: 3.526 ± 0.419
IleVal: 4.015 ± 0.362
IleTrp: 0.787 ± 0.183
IleTyr: 1.546 ± 0.206
IleXaa: 0.0 ± 0.0
Lys
LysAla: 5.127 ± 0.521
LysCys: 0.705 ± 0.197
LysAsp: 2.875 ± 0.281
LysGlu: 3.906 ± 0.435
LysPhe: 2.523 ± 0.26
LysGly: 4.123 ± 0.361
LysHis: 1.329 ± 0.233
LysIle: 3.309 ± 0.295
LysLys: 4.639 ± 0.562
LysLeu: 3.364 ± 0.298
LysMet: 2.034 ± 0.265
LysAsn: 2.848 ± 0.341
LysPro: 2.36 ± 0.275
LysGln: 2.17 ± 0.232
LysArg: 3.852 ± 0.389
LysSer: 2.821 ± 0.298
LysThr: 4.042 ± 0.325
LysVal: 4.856 ± 0.442
LysTrp: 1.492 ± 0.226
LysTyr: 2.658 ± 0.259
LysXaa: 0.0 ± 0.0
Leu
LeuAla: 5.561 ± 0.434
LeuCys: 0.787 ± 0.182
LeuAsp: 5.696 ± 0.48
LeuGlu: 5.344 ± 0.352
LeuPhe: 2.713 ± 0.322
LeuGly: 5.1 ± 0.435
LeuHis: 1.248 ± 0.21
LeuIle: 3.933 ± 0.443
LeuLys: 4.096 ± 0.314
LeuLeu: 3.96 ± 0.364
LeuMet: 1.736 ± 0.242
LeuAsn: 3.038 ± 0.272
LeuPro: 2.604 ± 0.24
LeuGln: 2.062 ± 0.336
LeuArg: 3.743 ± 0.285
LeuSer: 4.069 ± 0.305
LeuThr: 4.259 ± 0.335
LeuVal: 4.937 ± 0.35
LeuTrp: 1.275 ± 0.165
LeuTyr: 3.092 ± 0.273
LeuXaa: 0.0 ± 0.0
Met
MetAla: 2.848 ± 0.374
MetCys: 0.244 ± 0.077
MetAsp: 1.492 ± 0.18
MetGlu: 1.275 ± 0.189
MetPhe: 0.895 ± 0.151
MetGly: 1.628 ± 0.226
MetHis: 0.434 ± 0.103
MetIle: 1.465 ± 0.183
MetLys: 2.143 ± 0.311
MetLeu: 2.089 ± 0.275
MetMet: 1.085 ± 0.157
MetAsn: 1.573 ± 0.276
MetPro: 1.411 ± 0.216
MetGln: 0.868 ± 0.176
MetArg: 1.817 ± 0.231
MetSer: 1.953 ± 0.248
MetThr: 2.143 ± 0.207
MetVal: 1.872 ± 0.204
MetTrp: 0.217 ± 0.069
MetTyr: 0.787 ± 0.132
MetXaa: 0.0 ± 0.0
Asn
AsnAla: 3.662 ± 0.367
AsnCys: 0.434 ± 0.112
AsnAsp: 2.496 ± 0.303
AsnGlu: 3.092 ± 0.307
AsnPhe: 1.573 ± 0.214
AsnGly: 4.34 ± 0.383
AsnHis: 0.841 ± 0.145
AsnIle: 1.953 ± 0.227
AsnLys: 2.74 ± 0.298
AsnLeu: 3.282 ± 0.313
AsnMet: 0.977 ± 0.157
AsnAsn: 1.926 ± 0.25
AsnPro: 2.333 ± 0.293
AsnGln: 1.302 ± 0.201
AsnArg: 2.197 ± 0.248
AsnSer: 2.685 ± 0.271
AsnThr: 2.767 ± 0.543
AsnVal: 3.581 ± 0.304
AsnTrp: 1.058 ± 0.174
AsnTyr: 1.736 ± 0.2
AsnXaa: 0.0 ± 0.0
Pro
ProAla: 2.631 ± 0.333
ProCys: 0.298 ± 0.095
ProAsp: 2.984 ± 0.295
ProGlu: 3.147 ± 0.35
ProPhe: 1.492 ± 0.176
ProGly: 2.333 ± 0.345
ProHis: 0.624 ± 0.124
ProIle: 1.926 ± 0.288
ProLys: 2.306 ± 0.297
ProLeu: 2.116 ± 0.226
ProMet: 0.705 ± 0.145
ProAsn: 1.926 ± 0.241
ProPro: 1.112 ± 0.278
ProGln: 1.194 ± 0.209
ProArg: 2.116 ± 0.248
ProSer: 2.007 ± 0.265
ProThr: 2.794 ± 0.375
ProVal: 3.309 ± 0.288
ProTrp: 0.678 ± 0.152
ProTyr: 1.411 ± 0.213
ProXaa: 0.0 ± 0.0
Gln
GlnAla: 2.902 ± 0.507
GlnCys: 0.407 ± 0.094
GlnAsp: 1.736 ± 0.199
GlnGlu: 2.279 ± 0.269
GlnPhe: 1.112 ± 0.144
GlnGly: 2.17 ± 0.241
GlnHis: 0.651 ± 0.15
GlnIle: 2.089 ± 0.234
GlnLys: 2.143 ± 0.257
GlnLeu: 2.279 ± 0.273
GlnMet: 0.678 ± 0.125
GlnAsn: 1.763 ± 0.235
GlnPro: 1.194 ± 0.176
GlnGln: 1.166 ± 0.202
GlnArg: 2.441 ± 0.354
GlnSer: 1.763 ± 0.292
GlnThr: 1.492 ± 0.167
GlnVal: 1.98 ± 0.204
GlnTrp: 0.787 ± 0.143
GlnTyr: 1.302 ± 0.2
GlnXaa: 0.0 ± 0.0
Arg
ArgAla: 4.91 ± 0.375
ArgCys: 0.326 ± 0.099
ArgAsp: 3.255 ± 0.266
ArgGlu: 3.825 ± 0.416
ArgPhe: 2.414 ± 0.277
ArgGly: 3.771 ± 0.311
ArgHis: 1.112 ± 0.146
ArgIle: 3.554 ± 0.329
ArgLys: 3.825 ± 0.448
ArgLeu: 3.635 ± 0.284
ArgMet: 1.763 ± 0.214
ArgAsn: 2.523 ± 0.256
ArgPro: 2.523 ± 0.259
ArgGln: 1.817 ± 0.278
ArgArg: 4.313 ± 0.478
ArgSer: 2.306 ± 0.283
ArgThr: 2.496 ± 0.275
ArgVal: 3.798 ± 0.308
ArgTrp: 0.977 ± 0.17
ArgTyr: 2.93 ± 0.259
ArgXaa: 0.0 ± 0.0
Ser
SerAla: 4.856 ± 0.432
SerCys: 0.624 ± 0.137
SerAsp: 3.852 ± 0.288
SerGlu: 3.825 ± 0.33
SerPhe: 2.306 ± 0.258
SerGly: 4.801 ± 0.529
SerHis: 0.922 ± 0.173
SerIle: 3.147 ± 0.291
SerLys: 3.336 ± 0.303
SerLeu: 4.042 ± 0.359
SerMet: 1.682 ± 0.35
SerAsn: 2.197 ± 0.34
SerPro: 2.224 ± 0.252
SerGln: 1.98 ± 0.224
SerArg: 3.065 ± 0.299
SerSer: 3.499 ± 0.381
SerThr: 3.689 ± 0.507
SerVal: 3.933 ± 0.356
SerTrp: 1.356 ± 0.204
SerTyr: 1.953 ± 0.208
SerXaa: 0.0 ± 0.0
Thr
ThrAla: 5.208 ± 0.686
ThrCys: 0.515 ± 0.12
ThrAsp: 3.554 ± 0.394
ThrGlu: 3.96 ± 0.346
ThrPhe: 2.631 ± 0.332
ThrGly: 5.642 ± 0.841
ThrHis: 1.031 ± 0.174
ThrIle: 3.309 ± 0.322
ThrLys: 2.984 ± 0.334
ThrLeu: 4.856 ± 0.386
ThrMet: 1.438 ± 0.18
ThrAsn: 2.658 ± 0.367
ThrPro: 2.74 ± 0.367
ThrGln: 2.116 ± 0.285
ThrArg: 2.523 ± 0.3
ThrSer: 3.499 ± 0.579
ThrThr: 4.611 ± 0.781
ThrVal: 5.425 ± 0.561
ThrTrp: 1.085 ± 0.206
ThrTyr: 2.441 ± 0.261
ThrXaa: 0.0 ± 0.0
Val
ValAla: 5.832 ± 0.448
ValCys: 0.977 ± 0.169
ValAsp: 5.534 ± 0.397
ValGlu: 5.181 ± 0.443
ValPhe: 2.441 ± 0.22
ValGly: 5.154 ± 0.398
ValHis: 1.194 ± 0.181
ValIle: 3.852 ± 0.374
ValLys: 4.367 ± 0.369
ValLeu: 4.774 ± 0.362
ValMet: 1.817 ± 0.212
ValAsn: 3.038 ± 0.265
ValPro: 2.767 ± 0.319
ValGln: 2.089 ± 0.249
ValArg: 4.232 ± 0.338
ValSer: 4.747 ± 0.406
ValThr: 5.398 ± 0.42
ValVal: 5.805 ± 0.458
ValTrp: 1.248 ± 0.218
ValTyr: 3.228 ± 0.387
ValXaa: 0.0 ± 0.0
Trp
TrpAla: 1.166 ± 0.194
TrpCys: 0.19 ± 0.08
TrpAsp: 1.6 ± 0.222
TrpGlu: 1.6 ± 0.241
TrpPhe: 1.058 ± 0.182
TrpGly: 1.655 ± 0.244
TrpHis: 0.38 ± 0.1
TrpIle: 0.977 ± 0.178
TrpLys: 0.922 ± 0.156
TrpLeu: 1.628 ± 0.29
TrpMet: 0.407 ± 0.117
TrpAsn: 1.058 ± 0.175
TrpPro: 0.488 ± 0.124
TrpGln: 0.543 ± 0.11
TrpArg: 0.868 ± 0.148
TrpSer: 1.194 ± 0.169
TrpThr: 1.411 ± 0.301
TrpVal: 1.248 ± 0.253
TrpTrp: 0.515 ± 0.128
TrpTyr: 0.841 ± 0.169
TrpXaa: 0.0 ± 0.0
Tyr
TyrAla: 3.201 ± 0.366
TyrCys: 0.461 ± 0.122
TyrAsp: 2.523 ± 0.301
TyrGlu: 3.472 ± 0.373
TyrPhe: 1.248 ± 0.211
TyrGly: 3.201 ± 0.226
TyrHis: 0.597 ± 0.135
TyrIle: 1.899 ± 0.237
TyrLys: 2.007 ± 0.299
TyrLeu: 2.441 ± 0.278
TyrMet: 1.302 ± 0.177
TyrAsn: 1.763 ± 0.187
TyrPro: 1.248 ± 0.226
TyrGln: 1.519 ± 0.17
TyrArg: 2.36 ± 0.25
TyrSer: 2.468 ± 0.276
TyrThr: 3.038 ± 0.381
TyrVal: 3.445 ± 0.345
TyrTrp: 0.624 ± 0.134
TyrTyr: 1.763 ± 0.265
TyrXaa: 0.0 ± 0.0
Xaa
XaaAla: 0.0 ± 0.0
XaaCys: 0.0 ± 0.0
XaaAsp: 0.0 ± 0.0
XaaGlu: 0.0 ± 0.0
XaaPhe: 0.0 ± 0.0
XaaGly: 0.0 ± 0.0
XaaHis: 0.0 ± 0.0
XaaIle: 0.0 ± 0.0
XaaLys: 0.0 ± 0.0
XaaLeu: 0.0 ± 0.0
XaaMet: 0.0 ± 0.0
XaaAsn: 0.0 ± 0.0
XaaPro: 0.0 ± 0.0
XaaGln: 0.0 ± 0.0
XaaArg: 0.0 ± 0.0
XaaSer: 0.0 ± 0.0
XaaThr: 0.0 ± 0.0
XaaVal: 0.0 ± 0.0
XaaTrp: 0.0 ± 0.0
XaaTyr: 0.0 ± 0.0
XaaXaa: 0.0 ± 0.0
Statistics based on 238 proteins (36866 amino acids)

Note: The error has been estimated by bootstrapping (x100) at the protein level.
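A protein-level bootstrap of this kind can be sketched roughly as follows: whole proteins are resampled with replacement 100 times, the frequency is recomputed for each resample, and the spread of the estimates is reported as the error. Using the standard deviation as the error measure, normalising by the number of dipeptides, and the function names are all assumptions here; the exact procedure used by the database is not described on this page.

```python
import random
import statistics


def dipeptide_per_mille(proteins, pair):
    """Frequency of one dipeptide (e.g. "AA") in per mille.

    Assumption: normalised by the total number of dipeptides.
    """
    count = 0
    total = 0
    for seq in proteins:
        total += max(len(seq) - 1, 0)
        count += sum(1 for a, b in zip(seq, seq[1:]) if a + b == pair)
    return 1000.0 * count / total if total else 0.0


def bootstrap_error(proteins, pair, n_resamples=100, seed=0):
    """Estimate the error of a dipeptide frequency by resampling proteins."""
    rng = random.Random(seed)
    estimates = []
    for _ in range(n_resamples):
        # Resample whole proteins with replacement (protein-level bootstrap).
        resample = rng.choices(proteins, k=len(proteins))
        estimates.append(dipeptide_per_mille(resample, pair))
    # Assumption: the reported error is the standard deviation of the estimates.
    return statistics.stdev(estimates)


# Toy input (one-letter codes), not taken from the phage proteome.
toy_proteins = ["MAAK", "MAK", "MKAAAL", "MLLK"]
print(dipeptide_per_mille(toy_proteins, "AA"), bootstrap_error(toy_proteins, "AA"))
```

Resampling whole proteins rather than individual residues keeps each sequence intact, so the estimate reflects variation between proteins.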

The dipeptide statistics above (among other statistics for this proteome) can be downloaded from this CSV file
See this proteome in: uniprot_link
Proteome-pI is available under the Creative Commons Attribution-NoDerivs license; for more details see here

Reference: Kozlowski LP. Proteome-pI 2.0: Proteome Isoelectric Point Database Update. Nucleic Acids Res. 2021, doi: 10.1093/nar/gkab944
Contact: Lukasz P. Kozlowski