Amino acid dipeptide frequency for Streptomyces phage Comrade

Apart from single amino acid frequencies, one can also calculate the so-called amino acid dipeptide frequency, i.e. how often each ordered pair of adjacent amino acids occurs.

There are 400 possible dipeptides (441 if Xaa is counted as an additional, 21st amino acid representing all non-standard or unknown amino acids). Thus, if all 400 dipeptides were equally likely in proteins, each would be present with a frequency of 0.25% (2.5‰). As this is not the case in nature, for better visibility all dipeptides that are more frequent than expected are marked in red, and those that are underrepresented are marked in blue in the table.

All values are given in per mille (‰) and therefore need to be multiplied by 10⁻³ to obtain fractions.
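As an illustration of the calculation described above, here is a minimal Python sketch that counts overlapping dipeptides in a set of protein sequences and reports them in per mille. It is not the database's own code: the function name `dipeptide_per_mille`, the toy sequences, and the choice to normalise by the number of dipeptides counted within each protein are assumptions made for this example.

```python
from collections import Counter
from itertools import product

STANDARD = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues, one-letter codes


def dipeptide_per_mille(proteins):
    """Return {dipeptide: frequency in per mille} for all 441 residue pairs.

    Assumption: pairs are counted within each protein only (never across
    two proteins) and normalised by the total number of pairs counted.
    """
    counts = Counter()
    for seq in proteins:
        # Map anything outside the 20 standard residues to X (Xaa).
        seq = "".join(aa if aa in STANDARD else "X" for aa in seq.upper())
        # Overlapping pairs of adjacent residues: positions (0,1), (1,2), ...
        counts.update(a + b for a, b in zip(seq, seq[1:]))
    total = sum(counts.values())
    if total == 0:
        return {}
    alphabet = STANDARD + "X"
    return {
        a + b: 1000.0 * counts[a + b] / total
        for a, b in product(alphabet, repeat=2)
    }


# Uniform expectation over the 400 standard dipeptides, in per mille.
UNIFORM_EXPECTATION = 1000.0 / len(STANDARD) ** 2

toy = ["MAAK", "MKAA"]  # made-up sequences, not taken from this proteome
freqs = dipeptide_per_mille(toy)
print(f"AA: {freqs['AA']:.3f} per mille (uniform expectation {UNIFORM_EXPECTATION})")
```

A value above `UNIFORM_EXPECTATION` would correspond to an overrepresented dipeptide under the uniform assumption, and a value below it to an underrepresented one.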

For more information, see the sequence space article on Wikipedia.

1st \ 2nd | Ala | Cys | Asp | Glu | Phe | Gly | His | Ile | Lys | Leu | Met | Asn | Pro | Gln | Arg | Ser | Thr | Val | Trp | Tyr | Xaa
Ala | 8.091 ± 1.25 | 0.76 ± 0.168 | 4.507 ± 0.381 | 5.458 ± 0.433 | 3.503 ± 0.296 | 6.706 ± 0.675 | 1.303 ± 0.2 | 4.018 ± 0.342 | 4.942 ± 0.37 | 6.163 ± 0.45 | 2.742 ± 0.328 | 4.127 ± 0.613 | 2.851 ± 0.34 | 3.068 ± 0.508 | 3.883 ± 0.308 | 4.154 ± 0.345 | 5.023 ± 0.568 | 6.245 ± 0.47 | 1.52 ± 0.192 | 2.715 ± 0.236 | 0.0 ± 0.0
Cys | 0.597 ± 0.134 | 0.109 ± 0.055 | 0.679 ± 0.144 | 0.787 ± 0.164 | 0.299 ± 0.089 | 1.33 ± 0.218 | 0.217 ± 0.073 | 0.326 ± 0.094 | 0.652 ± 0.162 | 0.706 ± 0.16 | 0.272 ± 0.083 | 0.516 ± 0.127 | 0.434 ± 0.121 | 0.462 ± 0.115 | 0.624 ± 0.136 | 0.597 ± 0.135 | 0.434 ± 0.12 | 0.489 ± 0.112 | 0.136 ± 0.068 | 0.38 ± 0.117 | 0.0 ± 0.0
Asp | 4.914 ± 0.453 | 0.842 ± 0.16 | 4.181 ± 0.424 | 5.267 ± 0.458 | 2.905 ± 0.288 | 5.702 ± 0.446 | 0.896 ± 0.19 | 3.258 ± 0.279 | 3.448 ± 0.337 | 4.942 ± 0.336 | 1.819 ± 0.227 | 3.367 ± 0.381 | 2.199 ± 0.249 | 1.575 ± 0.171 | 2.905 ± 0.349 | 3.584 ± 0.363 | 3.258 ± 0.359 | 5.159 ± 0.429 | 1.303 ± 0.182 | 2.824 ± 0.266 | 0.0 ± 0.0
Glu | 5.458 ± 0.516 | 0.652 ± 0.173 | 4.724 ± 0.432 | 4.887 ± 0.464 | 3.258 ± 0.29 | 4.29 ± 0.339 | 1.249 ± 0.207 | 3.964 ± 0.423 | 4.887 ± 0.403 | 5.62 ± 0.405 | 2.308 ± 0.253 | 2.851 ± 0.27 | 1.792 ± 0.205 | 2.444 ± 0.292 | 3.883 ± 0.38 | 3.584 ± 0.324 | 3.584 ± 0.357 | 4.724 ± 0.429 | 1.195 ± 0.184 | 3.448 ± 0.329 | 0.0 ± 0.0
Phe | 3.177 ± 0.31 | 0.489 ± 0.106 | 3.041 ± 0.251 | 3.122 ± 0.294 | 1.222 ± 0.195 | 2.932 ± 0.307 | 0.787 ± 0.155 | 1.928 ± 0.265 | 2.064 ± 0.214 | 2.742 ± 0.277 | 1.059 ± 0.167 | 2.118 ± 0.146 | 1.059 ± 0.159 | 1.113 ± 0.174 | 2.009 ± 0.221 | 2.688 ± 0.287 | 2.064 ± 0.222 | 3.177 ± 0.317 | 0.57 ± 0.125 | 1.412 ± 0.183 | 0.0 ± 0.0
Gly | 5.05 ± 0.438 | 0.57 ± 0.119 | 4.371 ± 0.323 | 4.399 ± 0.277 | 3.693 ± 0.307 | 6.055 ± 0.515 | 1.738 ± 0.229 | 4.616 ± 0.325 | 4.534 ± 0.41 | 5.702 ± 0.401 | 2.905 ± 0.265 | 4.154 ± 0.32 | 2.824 ± 0.285 | 2.281 ± 0.301 | 4.154 ± 0.339 | 4.996 ± 0.588 | 5.376 ± 0.788 | 5.756 ± 0.485 | 1.928 ± 0.264 | 3.448 ± 0.308 | 0.0 ± 0.0
His | 1.113 ± 0.217 | 0.244 ± 0.073 | 1.195 ± 0.218 | 1.005 ± 0.139 | 0.869 ± 0.155 | 1.602 ± 0.249 | 0.272 ± 0.101 | 1.113 ± 0.17 | 0.977 ± 0.159 | 1.195 ± 0.206 | 0.38 ± 0.102 | 0.95 ± 0.189 | 0.787 ± 0.174 | 0.624 ± 0.136 | 1.466 ± 0.217 | 1.086 ± 0.163 | 0.733 ± 0.125 | 1.629 ± 0.252 | 0.407 ± 0.096 | 0.652 ± 0.135 | 0.0 ± 0.0
Ile | 3.991 ± 0.349 | 0.624 ± 0.128 | 3.638 ± 0.405 | 4.154 ± 0.335 | 1.548 ± 0.198 | 4.181 ± 0.309 | 1.005 ± 0.157 | 2.688 ± 0.286 | 3.421 ± 0.234 | 4.1 ± 0.4 | 1.086 ± 0.159 | 2.281 ± 0.254 | 2.688 ± 0.277 | 1.982 ± 0.268 | 3.503 ± 0.351 | 3.122 ± 0.289 | 3.285 ± 0.364 | 4.073 ± 0.36 | 0.842 ± 0.158 | 1.656 ± 0.209 | 0.0 ± 0.0
Lys | 5.566 ± 0.533 | 0.733 ± 0.192 | 3.285 ± 0.327 | 3.421 ± 0.39 | 2.226 ± 0.252 | 4.046 ± 0.316 | 1.086 ± 0.196 | 2.824 ± 0.248 | 4.209 ± 0.421 | 3.557 ± 0.326 | 2.281 ± 0.264 | 3.421 ± 0.38 | 2.417 ± 0.243 | 1.901 ± 0.226 | 3.964 ± 0.393 | 3.503 ± 0.336 | 4.29 ± 0.285 | 4.48 ± 0.333 | 1.358 ± 0.183 | 2.769 ± 0.296 | 0.0 ± 0.0
Leu | 6.326 ± 0.463 | 0.733 ± 0.156 | 5.81 ± 0.416 | 5.05 ± 0.435 | 2.362 ± 0.305 | 5.512 ± 0.369 | 1.33 ± 0.158 | 3.584 ± 0.287 | 4.453 ± 0.32 | 4.453 ± 0.379 | 1.439 ± 0.195 | 3.122 ± 0.294 | 2.742 ± 0.265 | 2.064 ± 0.239 | 3.285 ± 0.281 | 4.426 ± 0.291 | 4.399 ± 0.333 | 5.077 ± 0.394 | 1.439 ± 0.192 | 2.742 ± 0.252 | 0.0 ± 0.0
Met | 2.769 ± 0.321 | 0.163 ± 0.063 | 1.711 ± 0.229 | 1.819 ± 0.224 | 0.842 ± 0.128 | 1.738 ± 0.212 | 0.407 ± 0.113 | 1.656 ± 0.234 | 1.656 ± 0.233 | 2.091 ± 0.223 | 0.815 ± 0.13 | 1.656 ± 0.218 | 1.439 ± 0.166 | 0.977 ± 0.22 | 1.656 ± 0.242 | 2.118 ± 0.227 | 1.819 ± 0.203 | 1.982 ± 0.234 | 0.407 ± 0.102 | 0.977 ± 0.173 | 0.0 ± 0.0
Asn | 4.073 ± 0.439 | 0.353 ± 0.128 | 2.851 ± 0.229 | 3.638 ± 0.34 | 1.602 ± 0.208 | 4.779 ± 0.478 | 1.059 ± 0.149 | 2.769 ± 0.27 | 2.987 ± 0.311 | 3.394 ± 0.303 | 1.33 ± 0.159 | 1.873 ± 0.28 | 2.498 ± 0.299 | 1.466 ± 0.177 | 2.335 ± 0.241 | 2.851 ± 0.343 | 2.96 ± 0.413 | 3.014 ± 0.3 | 0.679 ± 0.127 | 2.226 ± 0.249 | 0.0 ± 0.0
Pro | 3.068 ± 0.322 | 0.163 ± 0.056 | 2.797 ± 0.318 | 3.448 ± 0.423 | 1.358 ± 0.175 | 2.769 ± 0.315 | 0.597 ± 0.137 | 1.819 ± 0.209 | 2.552 ± 0.334 | 2.281 ± 0.247 | 0.597 ± 0.145 | 2.498 ± 0.261 | 1.168 ± 0.221 | 1.168 ± 0.14 | 2.009 ± 0.27 | 1.901 ± 0.309 | 2.254 ± 0.299 | 3.34 ± 0.273 | 0.597 ± 0.126 | 1.303 ± 0.173 | 0.0 ± 0.0
Gln | 3.258 ± 0.346 | 0.407 ± 0.097 | 1.819 ± 0.223 | 2.226 ± 0.323 | 1.168 ± 0.168 | 2.145 ± 0.232 | 0.57 ± 0.127 | 1.765 ± 0.219 | 2.254 ± 0.304 | 2.145 ± 0.314 | 1.005 ± 0.215 | 1.412 ± 0.201 | 0.977 ± 0.151 | 1.005 ± 0.211 | 2.281 ± 0.308 | 2.036 ± 0.287 | 1.493 ± 0.202 | 1.928 ± 0.217 | 0.543 ± 0.125 | 0.977 ± 0.148 | 0.0 ± 0.0
Arg | 4.697 ± 0.427 | 0.462 ± 0.121 | 3.15 ± 0.27 | 3.747 ± 0.389 | 2.335 ± 0.281 | 3.828 ± 0.288 | 0.95 ± 0.176 | 3.285 ± 0.264 | 4.263 ± 0.498 | 3.72 ± 0.374 | 1.52 ± 0.178 | 2.96 ± 0.295 | 2.362 ± 0.296 | 1.792 ± 0.276 | 3.883 ± 0.438 | 2.525 ± 0.262 | 2.444 ± 0.26 | 3.828 ± 0.342 | 1.005 ± 0.154 | 2.335 ± 0.261 | 0.0 ± 0.0
Ser | 4.561 ± 0.496 | 0.462 ± 0.121 | 3.693 ± 0.311 | 3.177 ± 0.319 | 2.498 ± 0.224 | 4.942 ± 0.607 | 1.276 ± 0.172 | 3.394 ± 0.289 | 3.421 ± 0.299 | 4.263 ± 0.339 | 1.738 ± 0.206 | 2.607 ± 0.31 | 2.444 ± 0.247 | 2.064 ± 0.239 | 2.96 ± 0.298 | 3.774 ± 0.421 | 3.964 ± 0.4 | 4.29 ± 0.366 | 1.358 ± 0.298 | 2.118 ± 0.227 | 0.0 ± 0.0
Thr | 5.105 ± 0.685 | 0.597 ± 0.138 | 3.72 ± 0.316 | 4.018 ± 0.385 | 2.498 ± 0.26 | 6.001 ± 0.665 | 1.059 ± 0.155 | 3.584 ± 0.451 | 2.715 ± 0.224 | 4.779 ± 0.346 | 1.358 ± 0.193 | 2.661 ± 0.391 | 2.498 ± 0.265 | 1.846 ± 0.255 | 2.742 ± 0.289 | 3.638 ± 0.6 | 4.29 ± 0.645 | 4.779 ± 0.468 | 1.005 ± 0.159 | 2.009 ± 0.251 | 0.0 ± 0.0
Val | 5.865 ± 0.457 | 1.005 ± 0.187 | 5.24 ± 0.423 | 4.887 ± 0.484 | 2.552 ± 0.256 | 4.752 ± 0.436 | 1.249 ± 0.224 | 4.181 ± 0.349 | 4.371 ± 0.328 | 4.589 ± 0.391 | 2.254 ± 0.281 | 2.932 ± 0.308 | 2.634 ± 0.331 | 1.738 ± 0.213 | 4.507 ± 0.327 | 4.806 ± 0.321 | 5.267 ± 0.556 | 5.675 ± 0.497 | 1.52 ± 0.233 | 3.448 ± 0.312 | 0.0 ± 0.0
Trp | 1.14 ± 0.169 | 0.299 ± 0.095 | 1.303 ± 0.184 | 1.493 ± 0.224 | 0.787 ± 0.146 | 1.575 ± 0.218 | 0.57 ± 0.129 | 0.977 ± 0.171 | 1.168 ± 0.21 | 1.412 ± 0.217 | 0.516 ± 0.11 | 1.005 ± 0.15 | 0.434 ± 0.114 | 0.489 ± 0.107 | 0.896 ± 0.143 | 1.276 ± 0.192 | 1.276 ± 0.223 | 1.059 ± 0.209 | 0.38 ± 0.091 | 0.787 ± 0.14 | 0.0 ± 0.0
Tyr | 2.96 ± 0.29 | 0.38 ± 0.096 | 2.634 ± 0.318 | 2.797 ± 0.303 | 1.249 ± 0.201 | 3.665 ± 0.297 | 0.679 ± 0.115 | 2.118 ± 0.23 | 2.362 ± 0.265 | 2.498 ± 0.29 | 1.249 ± 0.179 | 2.226 ± 0.236 | 1.439 ± 0.25 | 1.358 ± 0.169 | 2.172 ± 0.254 | 2.335 ± 0.298 | 2.688 ± 0.309 | 2.769 ± 0.308 | 0.652 ± 0.126 | 1.765 ± 0.217 | 0.0 ± 0.0
Xaa | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0
Statistics based on 228 proteins (36831 amino acids)

Note: The error has been estimated by bootstrapping (×100) at the protein level.
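Protein-level bootstrapping can be sketched as follows: whole proteins are resampled with replacement, the dipeptide frequency is recomputed for each resample, and the spread of those estimates is taken as the error. This is only an illustrative sketch. The function names, the fixed seed, the toy sequences, and the use of the sample standard deviation as the error measure are assumptions, not details confirmed by the source.

```python
import random
import statistics
from collections import Counter


def pair_counts(seq):
    """Count overlapping adjacent residue pairs in one protein sequence."""
    return Counter(a + b for a, b in zip(seq, seq[1:]))


def bootstrap_error(proteins, dipeptide, n_rounds=100, seed=0):
    """Return (mean, standard deviation) in per mille over bootstrap rounds.

    Each round draws len(proteins) proteins with replacement, so a protein
    may appear several times or not at all in a given resample.
    n_rounds must be at least 2 for the standard deviation to be defined.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    per_protein = [pair_counts(p) for p in proteins]
    estimates = []
    for _ in range(n_rounds):
        sample = rng.choices(per_protein, k=len(per_protein))
        total = sum(sum(c.values()) for c in sample)
        hits = sum(c[dipeptide] for c in sample)
        estimates.append(1000.0 * hits / total if total else 0.0)
    return statistics.mean(estimates), statistics.stdev(estimates)


toy = ["MAAK", "MKAA", "GGGG"]  # made-up sequences, not from this proteome
mean, err = bootstrap_error(toy, "AA")
print(f"AA: {mean:.3f} ± {err:.3f} per mille")
```

Resampling whole proteins rather than individual residues keeps each sequence intact, so the estimate reflects variation between proteins.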

The dipeptide statistics above (along with other statistics for this proteome) can be downloaded from this CSV file
See this proteome in: uniprot_link
Proteome-pI is available under the Creative Commons Attribution-NoDerivs license; for more details see here

Reference: Kozlowski LP. Proteome-pI 2.0: Proteome Isoelectric Point Database Update. Nucleic Acids Res. 2021, doi: 10.1093/nar/gkab944
Contact: Lukasz P. Kozlowski