Amino acid dipeptide frequency for Synechococcus phage S-SM1

Apart from single amino acid frequencies, one can also calculate the so-called amino acid dipeptide frequency, i.e. how often each ordered pair of adjacent amino acids occurs.

There are 400 possible dipeptides (441 if Xaa is counted as an additional, 21st amino acid representing all non-standard or unknown amino acids). Thus, if dipeptides occurred uniformly at random in proteins, each of the 400 dipeptides would be present at a frequency of 0.25% (2.5‰). As this is not the case in nature, for better visibility all dipeptides that occur more often than expected are marked in red in the table, and those that are underrepresented are marked in blue.
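The counting and the comparison against the uniform expectation described above can be sketched in Python. This is only an illustration under stated assumptions: the function names (`dipeptide_permille`, `classify`) and the toy sequences are hypothetical, dipeptides are counted as overlapping adjacent pairs within each protein and pooled over all proteins, and every value strictly above or below 2.5‰ is labelled over- or underrepresented. The exact counting and colouring rules used by the database are not stated on this page.

```python
from collections import Counter

STANDARD = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard amino acids, one-letter codes

# Uniform expectation: 1 of 400 dipeptides = 0.25 % = 2.5 per mille.
UNIFORM_PERMILLE = 1000.0 / 400


def dipeptide_permille(proteins):
    """Return {dipeptide: frequency in per mille} pooled over all proteins.

    Assumption: overlapping adjacent pairs are counted within each protein,
    and any non-standard letter is mapped to 'X' (Xaa).
    """
    counts = Counter()
    for seq in proteins:
        seq = "".join(c if c in STANDARD else "X" for c in seq.upper())
        counts.update(seq[i:i + 2] for i in range(len(seq) - 1))
    total = sum(counts.values())
    if total == 0:
        return {}
    return {dp: 1000.0 * n / total for dp, n in counts.items()}


def classify(freq_permille, expected=UNIFORM_PERMILLE):
    """Label a frequency relative to the uniform expectation (assumed rule)."""
    if freq_permille > expected:
        return "over"
    if freq_permille < expected:
        return "under"
    return "expected"


# Toy input, not real phage proteins.
toy_proteins = ["MAAGT", "MKAAB"]
freqs = dipeptide_permille(toy_proteins)
```

With the values from the table, `classify(6.894)` (AlaAla) gives "over" and `classify(0.443)` (AlaCys) gives "under".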

All values are given in per mille (‰); to obtain fractions, they need to be multiplied by 10^-3.
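As a small worked example of this unit conversion, the following Python sketch turns the AlaAla value from the table into a fraction and a percentage. The helper name `permille_to_fraction` is hypothetical and used only for illustration.

```python
def permille_to_fraction(value_permille):
    """Convert a per-mille value to a plain fraction (multiply by 10^-3)."""
    return value_permille * 1e-3


ala_ala_permille = 6.894                       # AlaAla entry from the table
ala_ala_fraction = permille_to_fraction(ala_ala_permille)
ala_ala_percent = ala_ala_fraction * 100       # per cent = fraction * 100

# Uniform expectation for comparison: 1/400 as a fraction, i.e. 2.5 per mille.
uniform_fraction = 1 / 400
```

Here the fraction is about 0.006894 (roughly 0.69%), compared with 0.0025 (0.25%) under the uniform expectation.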

For more information, see the sequence space article on Wikipedia.

Ala Cys Asp Glu Phe Gly His Ile Lys Leu Met Asn Pro Gln Arg Ser Thr Val Trp Tyr Xaa
Ala
AlaAla: 6.894 ± 0.522
AlaCys: 0.443 ± 0.111
AlaAsp: 4.2 ± 0.313
AlaGlu: 3.811 ± 0.325
AlaPhe: 3.084 ± 0.258
AlaGly: 7.036 ± 0.517
AlaHis: 1.01 ± 0.14
AlaIle: 4.59 ± 0.326
AlaLys: 3.687 ± 0.298
AlaLeu: 4.909 ± 0.326
AlaMet: 1.382 ± 0.217
AlaAsn: 3.669 ± 0.321
AlaPro: 3.066 ± 0.252
AlaGln: 2.659 ± 0.188
AlaArg: 2.853 ± 0.322
AlaSer: 5.246 ± 0.469
AlaThr: 6.008 ± 0.609
AlaVal: 4.644 ± 0.364
AlaTrp: 0.603 ± 0.116
AlaTyr: 2.233 ± 0.202
AlaXaa: 0.0 ± 0.0
Cys
CysAla: 0.62 ± 0.106
CysCys: 0.106 ± 0.051
CysAsp: 0.762 ± 0.168
CysGlu: 0.62 ± 0.13
CysPhe: 0.408 ± 0.102
CysGly: 0.514 ± 0.106
CysHis: 0.266 ± 0.076
CysIle: 0.39 ± 0.095
CysLys: 0.656 ± 0.165
CysLeu: 0.656 ± 0.115
CysMet: 0.319 ± 0.089
CysAsn: 0.425 ± 0.121
CysPro: 0.319 ± 0.076
CysGln: 0.337 ± 0.084
CysArg: 0.284 ± 0.086
CysSer: 0.585 ± 0.107
CysThr: 0.496 ± 0.101
CysVal: 0.567 ± 0.121
CysTrp: 0.124 ± 0.046
CysTyr: 0.39 ± 0.089
CysXaa: 0.0 ± 0.0
Asp
AspAla: 4.998 ± 0.338
AspCys: 0.727 ± 0.16
AspAsp: 4.502 ± 0.384
AspGlu: 4.289 ± 0.284
AspPhe: 2.907 ± 0.268
AspGly: 5.406 ± 0.489
AspHis: 1.063 ± 0.19
AspIle: 4.165 ± 0.333
AspLys: 3.421 ± 0.314
AspLeu: 5.211 ± 0.305
AspMet: 1.613 ± 0.221
AspAsn: 3.332 ± 0.273
AspPro: 3.367 ± 0.255
AspGln: 2.145 ± 0.183
AspArg: 2.57 ± 0.28
AspSer: 4.112 ± 0.293
AspThr: 4.236 ± 0.3
AspVal: 4.165 ± 0.295
AspTrp: 1.028 ± 0.135
AspTyr: 3.669 ± 0.248
AspXaa: 0.0 ± 0.0
Glu
GluAla: 3.562 ± 0.264
GluCys: 0.656 ± 0.132
GluAsp: 4.076 ± 0.325
GluGlu: 4.59 ± 0.546
GluPhe: 3.102 ± 0.22
GluGly: 4.041 ± 0.318
GluHis: 1.01 ± 0.171
GluIle: 4.094 ± 0.274
GluLys: 3.545 ± 0.435
GluLeu: 5.264 ± 0.348
GluMet: 1.56 ± 0.25
GluAsn: 3.137 ± 0.215
GluPro: 1.648 ± 0.188
GluGln: 2.34 ± 0.191
GluArg: 2.889 ± 0.302
GluSer: 3.509 ± 0.302
GluThr: 4.449 ± 0.311
GluVal: 4.041 ± 0.26
GluTrp: 0.957 ± 0.153
GluTyr: 2.605 ± 0.208
GluXaa: 0.0 ± 0.0
Phe
PheAla: 3.013 ± 0.267
PheCys: 0.496 ± 0.111
PheAsp: 3.757 ± 0.271
PheGlu: 2.783 ± 0.251
PhePhe: 1.684 ± 0.165
PheGly: 3.102 ± 0.274
PheHis: 0.638 ± 0.138
PheIle: 2.517 ± 0.297
PheLys: 2.074 ± 0.216
PheLeu: 3.226 ± 0.31
PheMet: 1.046 ± 0.153
PheAsn: 2.836 ± 0.194
PhePro: 1.861 ± 0.197
PheGln: 1.489 ± 0.146
PheArg: 1.648 ± 0.15
PheSer: 3.013 ± 0.282
PheThr: 3.243 ± 0.339
PheVal: 2.924 ± 0.247
PheTrp: 0.408 ± 0.084
PheTyr: 1.577 ± 0.154
PheXaa: 0.0 ± 0.0
Gly
GlyAla: 6.451 ± 0.631
GlyCys: 0.656 ± 0.134
GlyAsp: 4.839 ± 0.333
GlyGlu: 4.2 ± 0.325
GlyPhe: 2.942 ± 0.202
GlyGly: 7.745 ± 1.037
GlyHis: 1.258 ± 0.181
GlyIle: 4.006 ± 0.275
GlyLys: 3.881 ± 0.338
GlyLeu: 4.679 ± 0.328
GlyMet: 1.755 ± 0.276
GlyAsn: 4.466 ± 0.434
GlyPro: 2.198 ± 0.225
GlyGln: 2.978 ± 0.247
GlyArg: 2.871 ± 0.248
GlySer: 6.168 ± 0.594
GlyThr: 7.426 ± 0.898
GlyVal: 5.211 ± 0.366
GlyTrp: 1.046 ± 0.141
GlyTyr: 3.456 ± 0.297
GlyXaa: 0.0 ± 0.0
His
HisAla: 0.744 ± 0.139
HisCys: 0.195 ± 0.064
HisAsp: 0.815 ± 0.153
HisGlu: 0.939 ± 0.146
HisPhe: 0.904 ± 0.159
HisGly: 1.117 ± 0.176
HisHis: 0.496 ± 0.126
HisIle: 0.744 ± 0.123
HisLys: 1.046 ± 0.187
HisLeu: 1.152 ± 0.194
HisMet: 0.39 ± 0.112
HisAsn: 0.815 ± 0.147
HisPro: 1.081 ± 0.151
HisGln: 0.638 ± 0.123
HisArg: 0.638 ± 0.119
HisSer: 0.939 ± 0.122
HisThr: 0.922 ± 0.126
HisVal: 1.01 ± 0.19
HisTrp: 0.301 ± 0.097
HisTyr: 0.922 ± 0.177
HisXaa: 0.0 ± 0.0
Ile
IleAla: 4.13 ± 0.289
IleCys: 0.567 ± 0.132
IleAsp: 5.087 ± 0.36
IleGlu: 3.828 ± 0.266
IlePhe: 2.428 ± 0.2
IleGly: 3.811 ± 0.339
IleHis: 0.727 ± 0.152
IleIle: 3.562 ± 0.241
IleLys: 3.952 ± 0.284
IleLeu: 4.307 ± 0.331
IleMet: 1.028 ± 0.171
IleAsn: 3.687 ± 0.211
IlePro: 2.712 ± 0.28
IleGln: 2.286 ± 0.207
IleArg: 2.251 ± 0.212
IleSer: 4.13 ± 0.408
IleThr: 5.618 ± 0.583
IleVal: 4.147 ± 0.353
IleTrp: 0.461 ± 0.102
IleTyr: 2.003 ± 0.211
IleXaa: 0.0 ± 0.0
Lys
LysAla: 3.651 ± 0.366
LysCys: 0.762 ± 0.14
LysAsp: 3.367 ± 0.277
LysGlu: 4.271 ± 0.479
LysPhe: 2.694 ± 0.255
LysGly: 3.403 ± 0.303
LysHis: 0.815 ± 0.164
LysIle: 3.811 ± 0.354
LysLys: 4.661 ± 0.6
LysLeu: 4.626 ± 0.334
LysMet: 1.471 ± 0.206
LysAsn: 2.605 ± 0.269
LysPro: 1.914 ± 0.205
LysGln: 2.233 ± 0.303
LysArg: 2.286 ± 0.261
LysSer: 3.633 ± 0.355
LysThr: 3.492 ± 0.283
LysVal: 4.094 ± 0.294
LysTrp: 0.744 ± 0.134
LysTyr: 2.641 ± 0.322
LysXaa: 0.0 ± 0.0
Leu
LeuAla: 4.785 ± 0.286
LeuCys: 0.744 ± 0.156
LeuAsp: 5.175 ± 0.316
LeuGlu: 4.502 ± 0.355
LeuPhe: 3.137 ± 0.216
LeuGly: 4.661 ± 0.392
LeuHis: 1.365 ± 0.189
LeuIle: 4.342 ± 0.364
LeuLys: 4.98 ± 0.424
LeuLeu: 5.423 ± 0.392
LeuMet: 1.382 ± 0.223
LeuAsn: 4.892 ± 0.327
LeuPro: 2.907 ± 0.308
LeuGln: 3.031 ± 0.244
LeuArg: 3.509 ± 0.29
LeuSer: 5.388 ± 0.284
LeuThr: 5.742 ± 0.524
LeuVal: 4.218 ± 0.266
LeuTrp: 0.567 ± 0.125
LeuTyr: 3.031 ± 0.24
LeuXaa: 0.0 ± 0.0
Met
MetAla: 1.542 ± 0.239
MetCys: 0.142 ± 0.054
MetAsp: 1.347 ± 0.258
MetGlu: 1.418 ± 0.24
MetPhe: 0.744 ± 0.127
MetGly: 1.117 ± 0.192
MetHis: 0.443 ± 0.122
MetIle: 1.187 ± 0.19
MetLys: 2.091 ± 0.328
MetLeu: 1.755 ± 0.227
MetMet: 0.567 ± 0.12
MetAsn: 1.312 ± 0.182
MetPro: 1.01 ± 0.158
MetGln: 0.851 ± 0.136
MetArg: 0.922 ± 0.146
MetSer: 1.755 ± 0.227
MetThr: 1.684 ± 0.263
MetVal: 1.099 ± 0.161
MetTrp: 0.301 ± 0.087
MetTyr: 0.62 ± 0.11
MetXaa: 0.0 ± 0.0
Asn
AsnAla: 4.147 ± 0.306
AsnCys: 0.443 ± 0.081
AsnAsp: 3.19 ± 0.251
AsnGlu: 3.226 ± 0.306
AsnPhe: 2.552 ± 0.285
AsnGly: 4.484 ± 0.44
AsnHis: 0.691 ± 0.097
AsnIle: 3.438 ± 0.319
AsnLys: 2.712 ± 0.259
AsnLeu: 4.821 ± 0.366
AsnMet: 0.939 ± 0.186
AsnAsn: 3.137 ± 0.331
AsnPro: 3.119 ± 0.214
AsnGln: 2.074 ± 0.209
AsnArg: 2.269 ± 0.183
AsnSer: 3.669 ± 0.301
AsnThr: 3.864 ± 0.398
AsnVal: 3.935 ± 0.324
AsnTrp: 0.585 ± 0.115
AsnTyr: 2.712 ± 0.213
AsnXaa: 0.0 ± 0.0
Pro
ProAla: 2.694 ± 0.255
ProCys: 0.354 ± 0.092
ProAsp: 2.481 ± 0.218
ProGlu: 2.889 ± 0.248
ProPhe: 1.701 ± 0.2
ProGly: 3.456 ± 0.352
ProHis: 0.833 ± 0.13
ProIle: 2.233 ± 0.193
ProLys: 2.056 ± 0.282
ProLeu: 2.428 ± 0.221
ProMet: 0.532 ± 0.126
ProAsn: 2.251 ± 0.158
ProPro: 1.489 ± 0.214
ProGln: 1.382 ± 0.18
ProArg: 1.613 ± 0.189
ProSer: 3.421 ± 0.329
ProThr: 3.013 ± 0.223
ProVal: 2.57 ± 0.214
ProTrp: 0.443 ± 0.107
ProTyr: 1.843 ± 0.196
ProXaa: 0.0 ± 0.0
Gln
GlnAla: 2.481 ± 0.212
GlnCys: 0.354 ± 0.101
GlnAsp: 2.428 ± 0.146
GlnGlu: 2.464 ± 0.232
GlnPhe: 1.595 ± 0.182
GlnGly: 2.375 ± 0.251
GlnHis: 0.638 ± 0.149
GlnIle: 2.375 ± 0.242
GlnLys: 2.162 ± 0.223
GlnLeu: 3.19 ± 0.271
GlnMet: 1.028 ± 0.2
GlnAsn: 2.02 ± 0.154
GlnPro: 1.134 ± 0.159
GlnGln: 1.524 ± 0.197
GlnArg: 1.737 ± 0.215
GlnSer: 2.428 ± 0.205
GlnThr: 2.41 ± 0.275
GlnVal: 2.588 ± 0.224
GlnTrp: 0.372 ± 0.093
GlnTyr: 1.808 ± 0.2
GlnXaa: 0.0 ± 0.0
Arg
ArgAla: 2.836 ± 0.239
ArgCys: 0.213 ± 0.05
ArgAsp: 2.233 ± 0.204
ArgGlu: 2.162 ± 0.253
ArgPhe: 1.896 ± 0.173
ArgGly: 3.031 ± 0.311
ArgHis: 0.744 ± 0.137
ArgIle: 2.978 ± 0.225
ArgLys: 2.694 ± 0.357
ArgLeu: 3.474 ± 0.241
ArgMet: 1.365 ± 0.223
ArgAsn: 1.879 ± 0.208
ArgPro: 1.489 ± 0.149
ArgGln: 1.542 ± 0.179
ArgArg: 2.145 ± 0.306
ArgSer: 2.481 ± 0.264
ArgThr: 2.588 ± 0.259
ArgVal: 2.907 ± 0.25
ArgTrp: 0.372 ± 0.088
ArgTyr: 2.269 ± 0.195
ArgXaa: 0.0 ± 0.0
Ser
SerAla: 5.299 ± 0.384
SerCys: 0.479 ± 0.107
SerAsp: 4.218 ± 0.283
SerGlu: 3.687 ± 0.284
SerPhe: 3.385 ± 0.33
SerGly: 7.284 ± 0.69
SerHis: 0.851 ± 0.103
SerIle: 4.378 ± 0.387
SerLys: 3.297 ± 0.353
SerLeu: 4.803 ± 0.336
SerMet: 1.56 ± 0.205
SerAsn: 3.775 ± 0.308
SerPro: 2.623 ± 0.313
SerGln: 2.588 ± 0.223
SerArg: 2.783 ± 0.256
SerSer: 5.211 ± 0.389
SerThr: 5.193 ± 0.522
SerVal: 4.36 ± 0.363
SerTrp: 0.656 ± 0.139
SerTyr: 2.765 ± 0.216
SerXaa: 0.0 ± 0.0
Thr
ThrAla: 6.38 ± 0.8
ThrCys: 0.372 ± 0.084
ThrAsp: 4.732 ± 0.373
ThrGlu: 3.545 ± 0.237
ThrPhe: 3.332 ± 0.435
ThrGly: 6.965 ± 0.848
ThrHis: 0.904 ± 0.134
ThrIle: 5.104 ± 0.489
ThrLys: 3.651 ± 0.285
ThrLeu: 5.831 ± 0.43
ThrMet: 1.436 ± 0.21
ThrAsn: 4.147 ± 0.521
ThrPro: 3.208 ± 0.249
ThrGln: 2.623 ± 0.198
ThrArg: 2.605 ± 0.214
ThrSer: 5.53 ± 0.545
ThrThr: 6.593 ± 0.72
ThrVal: 5.973 ± 0.71
ThrTrp: 0.886 ± 0.136
ThrTyr: 2.871 ± 0.288
ThrXaa: 0.0 ± 0.0
Val
ValAla: 4.839 ± 0.323
ValCys: 0.443 ± 0.094
ValAsp: 5.317 ± 0.385
ValGlu: 4.502 ± 0.244
ValPhe: 2.57 ± 0.224
ValGly: 5.441 ± 0.402
ValHis: 0.939 ± 0.152
ValIle: 3.793 ± 0.218
ValLys: 3.35 ± 0.263
ValLeu: 4.378 ± 0.277
ValMet: 1.418 ± 0.233
ValAsn: 4.112 ± 0.357
ValPro: 2.659 ± 0.226
ValGln: 2.286 ± 0.201
ValArg: 2.853 ± 0.213
ValSer: 4.821 ± 0.307
ValThr: 5.725 ± 0.711
ValVal: 4.714 ± 0.425
ValTrp: 0.585 ± 0.11
ValTyr: 2.233 ± 0.192
ValXaa: 0.0 ± 0.0
Trp
TrpAla: 0.62 ± 0.103
TrpCys: 0.142 ± 0.05
TrpAsp: 0.904 ± 0.122
TrpGlu: 0.815 ± 0.148
TrpPhe: 0.461 ± 0.112
TrpGly: 0.603 ± 0.096
TrpHis: 0.372 ± 0.089
TrpIle: 0.496 ± 0.093
TrpLys: 0.833 ± 0.166
TrpLeu: 0.638 ± 0.143
TrpMet: 0.248 ± 0.076
TrpAsn: 0.762 ± 0.106
TrpPro: 0.213 ± 0.071
TrpGln: 0.425 ± 0.088
TrpArg: 0.443 ± 0.09
TrpSer: 0.78 ± 0.127
TrpThr: 0.833 ± 0.132
TrpVal: 0.833 ± 0.117
TrpTrp: 0.142 ± 0.054
TrpTyr: 0.461 ± 0.084
TrpXaa: 0.0 ± 0.0
Tyr
TyrAla: 2.34 ± 0.194
TyrCys: 0.549 ± 0.11
TyrAsp: 3.492 ± 0.27
TyrGlu: 2.534 ± 0.311
TyrPhe: 1.808 ± 0.198
TyrGly: 2.641 ± 0.196
TyrHis: 0.744 ± 0.132
TyrIle: 2.517 ± 0.205
TyrLys: 2.428 ± 0.22
TyrLeu: 3.013 ± 0.215
TyrMet: 0.851 ± 0.138
TyrAsn: 2.747 ± 0.255
TyrPro: 1.684 ± 0.177
TyrGln: 1.684 ± 0.19
TyrArg: 2.127 ± 0.199
TyrSer: 2.375 ± 0.209
TyrThr: 3.155 ± 0.409
TyrVal: 2.978 ± 0.242
TyrTrp: 0.443 ± 0.112
TyrTyr: 2.074 ± 0.206
TyrXaa: 0.0 ± 0.0
Xaa
XaaAla: 0.0 ± 0.0
XaaCys: 0.0 ± 0.0
XaaAsp: 0.0 ± 0.0
XaaGlu: 0.0 ± 0.0
XaaPhe: 0.0 ± 0.0
XaaGly: 0.0 ± 0.0
XaaHis: 0.0 ± 0.0
XaaIle: 0.0 ± 0.0
XaaLys: 0.0 ± 0.0
XaaLeu: 0.0 ± 0.0
XaaMet: 0.0 ± 0.0
XaaAsn: 0.0 ± 0.0
XaaPro: 0.0 ± 0.0
XaaGln: 0.0 ± 0.0
XaaArg: 0.0 ± 0.0
XaaSer: 0.0 ± 0.0
XaaThr: 0.0 ± 0.0
XaaVal: 0.0 ± 0.0
XaaTrp: 0.0 ± 0.0
XaaTyr: 0.0 ± 0.0
XaaXaa: 0.0 ± 0.0
Statistics based on 234 proteins (56423 amino acids)

Note: The error has been estimated by bootstrapping (×100) at the protein level.
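A protein-level bootstrap of this kind might look like the following Python sketch. It rests on assumptions not confirmed by this page: whole proteins are resampled with replacement, the frequency is recomputed for each resample, and the reported error is taken to be the standard deviation of those estimates. The function names, the fixed seed, and the toy sequences are hypothetical.

```python
import random
import statistics


def dipeptide_frequency_permille(proteins, dipeptide):
    """Frequency (per mille) of one dipeptide among all overlapping pairs."""
    hits = 0
    total = 0
    for seq in proteins:
        n_pairs = max(len(seq) - 1, 0)
        total += n_pairs
        hits += sum(1 for i in range(n_pairs) if seq[i:i + 2] == dipeptide)
    return 1000.0 * hits / total if total else 0.0


def bootstrap_error(proteins, dipeptide, n_resamples=100, seed=0):
    """Estimate the error by resampling whole proteins with replacement.

    Assumption: the error is the standard deviation over the resamples.
    """
    rng = random.Random(seed)  # fixed seed only to make the sketch repeatable
    estimates = []
    for _ in range(n_resamples):
        resample = rng.choices(proteins, k=len(proteins))
        estimates.append(dipeptide_frequency_permille(resample, dipeptide))
    return statistics.stdev(estimates)


# Toy input, not real phage proteins.
toy_proteins = ["MAAGT", "MKLAA", "MGGSV", "MTTAAK"]
point_estimate = dipeptide_frequency_permille(toy_proteins, "AA")
error = bootstrap_error(toy_proteins, "AA")
```

Resampling whole proteins rather than individual residue pairs keeps the pairs within one protein together, which is presumably what "at the protein level" refers to.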

The dipeptide statistics above (among other statistics for this proteome) can be downloaded from this CSV file
See this proteome in: uniprot_link
Proteome-pI is available under the Creative Commons Attribution-NoDerivs license; for more details see here

Reference: Kozlowski LP. Proteome-pI 2.0: Proteome Isoelectric Point Database Update. Nucleic Acids Res. 2021, doi: 10.1093/nar/gkab944
Contact: Lukasz P. Kozlowski