Amino acid dipeptide frequency for Shigella phage Ag3

Apart from single amino acid frequencies, one can also calculate the so-called amino acid dipeptide frequency.

There are 400 possible dipeptides (441 if Xaa is counted as an additional, 21st amino acid representing all non-standard or unknown amino acids). Thus, if dipeptides occurred randomly in proteins, each of the 400 would be present at a frequency of 0.25% (2.5‰). As this is not the case in nature, for better visibility all dipeptides that are more frequent than expected are marked in red in the table, and those that are underrepresented are marked in blue.
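The counting described above can be sketched in a few lines of Python. This is a minimal illustration, not the database's own code: the function name `dipeptide_per_mille`, the toy sequences, the use of overlapping windows within each protein, and the normalization by the total number of windows are all assumptions, and Proteome-pI may normalize differently.

```python
from collections import Counter
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues, one-letter codes
ALPHABET = AMINO_ACIDS + "X"          # X stands for Xaa (non-standard or unknown)


def dipeptide_per_mille(proteins):
    """Return {dipeptide: frequency in per mille} for a list of sequences.

    Assumption: overlapping windows of length 2 are counted inside each
    protein (never across two proteins) and divided by the total number
    of such windows.
    """
    counts = Counter()
    for seq in proteins:
        # Map every non-standard residue to X before counting.
        cleaned = "".join(c if c in AMINO_ACIDS else "X" for c in seq.upper())
        for i in range(len(cleaned) - 1):
            counts[cleaned[i:i + 2]] += 1
    total = sum(counts.values())
    if total == 0:
        return {}
    return {
        a + b: 1000.0 * counts[a + b] / total
        for a, b in product(ALPHABET, repeat=2)
    }


# Uniform expectation over the 400 standard dipeptides: 2.5 per mille (0.25%).
UNIFORM_EXPECTATION = 1000.0 / 400

# Hypothetical toy input, not taken from the Shigella phage Ag3 proteome.
toy_proteins = ["MKAAL", "AAGK"]
freqs = dipeptide_per_mille(toy_proteins)
```

With the toy input, `freqs` holds all 441 keys (including the Xaa combinations), and `freqs["AA"]` can be compared directly against `UNIFORM_EXPECTATION`.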

All values are given in per mille (‰), so they need to be multiplied by 10⁻³ to obtain fractions.
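As a small worked example of the unit conversion, the sketch below turns two values from the table (AlaAla and CysCys) into fractions and compares them with the uniform expectation of 0.25%. The helper names are hypothetical, and the simple greater-than/less-than rule is an assumption; the exact rule used for the red and blue marking on the page is not stated here.

```python
EXPECTED_PER_MILLE = 1000.0 / 400  # 2.5 per mille, i.e. 0.25%


def per_mille_to_fraction(value):
    """Convert a per mille value to a plain fraction (multiply by 10^-3)."""
    return value * 1e-3


def representation(value, expected=EXPECTED_PER_MILLE):
    """Classify a per mille value against the uniform expectation.

    Assumption: a plain comparison that ignores the reported error.
    """
    if value > expected:
        return "over"
    if value < expected:
        return "under"
    return "as expected"


ala_ala = 6.377  # AlaAla, per mille, from the table
cys_cys = 0.165  # CysCys, per mille, from the table

ala_ala_fraction = per_mille_to_fraction(ala_ala)  # about 0.006377
ala_ala_label = representation(ala_ala)            # "over"
cys_cys_label = representation(cys_cys)            # "under"
```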

For more information, see the sequence space article on Wikipedia.

Ala Cys Asp Glu Phe Gly His Ile Lys Leu Met Asn Pro Gln Arg Ser Thr Val Trp Tyr Xaa
Ala
6.377AlaAla: 6.377 ± 0.433
0.761AlaCys: 0.761 ± 0.11
4.464AlaAsp: 4.464 ± 0.297
5.081AlaGlu: 5.081 ± 0.46
2.469AlaPhe: 2.469 ± 0.28
5.04AlaGly: 5.04 ± 0.426
1.44AlaHis: 1.44 ± 0.174
4.238AlaIle: 4.238 ± 0.311
4.67AlaLys: 4.67 ± 0.322
6.007AlaLeu: 6.007 ± 0.391
1.913AlaMet: 1.913 ± 0.237
3.23AlaAsn: 3.23 ± 0.231
2.839AlaPro: 2.839 ± 0.25
2.839AlaGln: 2.839 ± 0.249
3.415AlaArg: 3.415 ± 0.298
3.991AlaSer: 3.991 ± 0.388
4.567AlaThr: 4.567 ± 0.4
5.102AlaVal: 5.102 ± 0.308
0.967AlaTrp: 0.967 ± 0.147
2.571AlaTyr: 2.571 ± 0.244
0.0AlaXaa: 0.0 ± 0.0
Cys
0.72CysAla: 0.72 ± 0.119
0.165CysCys: 0.165 ± 0.063
0.741CysAsp: 0.741 ± 0.134
0.741CysGlu: 0.741 ± 0.119
0.37CysPhe: 0.37 ± 0.077
0.782CysGly: 0.782 ± 0.135
0.35CysHis: 0.35 ± 0.094
0.802CysIle: 0.802 ± 0.125
0.761CysLys: 0.761 ± 0.144
0.823CysLeu: 0.823 ± 0.137
0.309CysMet: 0.309 ± 0.081
0.597CysAsn: 0.597 ± 0.095
0.576CysPro: 0.576 ± 0.109
0.329CysGln: 0.329 ± 0.088
0.576CysArg: 0.576 ± 0.111
0.782CysSer: 0.782 ± 0.127
0.617CysThr: 0.617 ± 0.119
1.07CysVal: 1.07 ± 0.15
0.123CysTrp: 0.123 ± 0.045
0.473CysTyr: 0.473 ± 0.103
0.0CysXaa: 0.0 ± 0.0
Asp
4.958AspAla: 4.958 ± 0.361
0.638AspCys: 0.638 ± 0.116
4.176AspAsp: 4.176 ± 0.388
4.238AspGlu: 4.238 ± 0.308
3.189AspPhe: 3.189 ± 0.258
4.958AspGly: 4.958 ± 0.353
1.008AspHis: 1.008 ± 0.146
4.361AspIle: 4.361 ± 0.277
3.621AspLys: 3.621 ± 0.308
5.493AspLeu: 5.493 ± 0.367
2.098AspMet: 2.098 ± 0.194
2.839AspAsn: 2.839 ± 0.276
2.942AspPro: 2.942 ± 0.24
2.263AspGln: 2.263 ± 0.241
2.777AspArg: 2.777 ± 0.258
3.374AspSer: 3.374 ± 0.231
3.6AspThr: 3.6 ± 0.263
4.629AspVal: 4.629 ± 0.303
1.07AspTrp: 1.07 ± 0.134
3.271AspTyr: 3.271 ± 0.265
0.0AspXaa: 0.0 ± 0.0
Glu
4.649GluAla: 4.649 ± 0.366
0.72GluCys: 0.72 ± 0.137
4.176GluAsp: 4.176 ± 0.368
4.505GluGlu: 4.505 ± 0.383
2.942GluPhe: 2.942 ± 0.269
4.197GluGly: 4.197 ± 0.277
1.399GluHis: 1.399 ± 0.163
3.929GluIle: 3.929 ± 0.291
3.497GluLys: 3.497 ± 0.306
6.357GluLeu: 6.357 ± 0.359
2.242GluMet: 2.242 ± 0.205
2.921GluAsn: 2.921 ± 0.287
2.16GluPro: 2.16 ± 0.208
2.613GluGln: 2.613 ± 0.234
3.682GluArg: 3.682 ± 0.296
3.209GluSer: 3.209 ± 0.282
3.785GluThr: 3.785 ± 0.27
4.834GluVal: 4.834 ± 0.366
1.193GluTrp: 1.193 ± 0.159
2.798GluTyr: 2.798 ± 0.255
0.0GluXaa: 0.0 ± 0.0
Phe
2.448PheAla: 2.448 ± 0.241
0.37PheCys: 0.37 ± 0.097
2.674PheAsp: 2.674 ± 0.218
3.353PheGlu: 3.353 ± 0.305
1.666PhePhe: 1.666 ± 0.21
3.003PheGly: 3.003 ± 0.304
0.946PheHis: 0.946 ± 0.156
2.736PheIle: 2.736 ± 0.229
2.736PheLys: 2.736 ± 0.238
3.209PheLeu: 3.209 ± 0.233
1.378PheMet: 1.378 ± 0.186
2.325PheAsn: 2.325 ± 0.23
1.543PhePro: 1.543 ± 0.196
1.543PheGln: 1.543 ± 0.174
2.222PheArg: 2.222 ± 0.23
2.921PheSer: 2.921 ± 0.267
2.715PheThr: 2.715 ± 0.222
2.757PheVal: 2.757 ± 0.219
0.638PheTrp: 0.638 ± 0.131
1.605PheTyr: 1.605 ± 0.17
0.0PheXaa: 0.0 ± 0.0
Gly
4.382GlyAla: 4.382 ± 0.322
1.029GlyCys: 1.029 ± 0.156
3.888GlyAsp: 3.888 ± 0.309
4.629GlyGlu: 4.629 ± 0.306
3.003GlyPhe: 3.003 ± 0.274
5.452GlyGly: 5.452 ± 0.565
1.358GlyHis: 1.358 ± 0.187
4.546GlyIle: 4.546 ± 0.272
4.937GlyLys: 4.937 ± 0.379
5.287GlyLeu: 5.287 ± 0.316
1.995GlyMet: 1.995 ± 0.179
3.271GlyAsn: 3.271 ± 0.299
1.09GlyPro: 1.09 ± 0.146
2.654GlyGln: 2.654 ± 0.205
3.271GlyArg: 3.271 ± 0.229
4.156GlySer: 4.156 ± 0.379
4.238GlyThr: 4.238 ± 0.417
5.76GlyVal: 5.76 ± 0.397
1.255GlyTrp: 1.255 ± 0.169
2.427GlyTyr: 2.427 ± 0.21
0.0GlyXaa: 0.0 ± 0.0
His
1.173HisAla: 1.173 ± 0.184
0.35HisCys: 0.35 ± 0.077
1.193HisAsp: 1.193 ± 0.199
0.802HisGlu: 0.802 ± 0.116
0.987HisPhe: 0.987 ± 0.143
1.09HisGly: 1.09 ± 0.183
0.576HisHis: 0.576 ± 0.103
1.317HisIle: 1.317 ± 0.166
1.358HisLys: 1.358 ± 0.16
1.605HisLeu: 1.605 ± 0.164
0.535HisMet: 0.535 ± 0.11
0.864HisAsn: 0.864 ± 0.144
1.029HisPro: 1.029 ± 0.141
0.761HisGln: 0.761 ± 0.128
1.09HisArg: 1.09 ± 0.155
1.008HisSer: 1.008 ± 0.166
1.193HisThr: 1.193 ± 0.179
1.173HisVal: 1.173 ± 0.181
0.206HisTrp: 0.206 ± 0.065
0.885HisTyr: 0.885 ± 0.144
0.0HisXaa: 0.0 ± 0.0
Ile
4.279IleAla: 4.279 ± 0.309
0.699IleCys: 0.699 ± 0.128
5.41IleAsp: 5.41 ± 0.34
4.732IleGlu: 4.732 ± 0.314
1.934IlePhe: 1.934 ± 0.179
3.682IleGly: 3.682 ± 0.269
1.214IleHis: 1.214 ± 0.156
3.456IleIle: 3.456 ± 0.311
3.6IleLys: 3.6 ± 0.273
3.95IleLeu: 3.95 ± 0.286
1.358IleMet: 1.358 ± 0.172
3.312IleAsn: 3.312 ± 0.304
2.859IlePro: 2.859 ± 0.237
2.592IleGln: 2.592 ± 0.223
3.148IleArg: 3.148 ± 0.249
3.106IleSer: 3.106 ± 0.285
4.258IleThr: 4.258 ± 0.343
3.888IleVal: 3.888 ± 0.236
0.658IleTrp: 0.658 ± 0.123
1.625IleTyr: 1.625 ± 0.187
0.0IleXaa: 0.0 ± 0.0
Lys
4.135LysAla: 4.135 ± 0.336
0.658LysCys: 0.658 ± 0.118
4.361LysAsp: 4.361 ± 0.321
4.3LysGlu: 4.3 ± 0.414
3.127LysPhe: 3.127 ± 0.212
4.279LysGly: 4.279 ± 0.353
1.214LysHis: 1.214 ± 0.161
3.497LysIle: 3.497 ± 0.246
4.114LysLys: 4.114 ± 0.353
5.39LysLeu: 5.39 ± 0.312
2.839LysMet: 2.839 ± 0.282
2.654LysAsn: 2.654 ± 0.236
2.489LysPro: 2.489 ± 0.232
2.407LysGln: 2.407 ± 0.272
3.394LysArg: 3.394 ± 0.293
4.032LysSer: 4.032 ± 0.286
3.477LysThr: 3.477 ± 0.272
4.3LysVal: 4.3 ± 0.327
1.008LysTrp: 1.008 ± 0.147
2.119LysTyr: 2.119 ± 0.199
0.0LysXaa: 0.0 ± 0.0
Leu
6.377LeuAla: 6.377 ± 0.369
0.802LeuCys: 0.802 ± 0.132
5.328LeuAsp: 5.328 ± 0.361
4.793LeuGlu: 4.793 ± 0.346
3.394LeuPhe: 3.394 ± 0.221
5.452LeuGly: 5.452 ± 0.322
1.522LeuHis: 1.522 ± 0.181
3.97LeuIle: 3.97 ± 0.279
6.562LeuLys: 6.562 ± 0.455
6.521LeuLeu: 6.521 ± 0.428
2.139LeuMet: 2.139 ± 0.205
4.279LeuAsn: 4.279 ± 0.313
3.518LeuPro: 3.518 ± 0.267
2.798LeuGln: 2.798 ± 0.264
3.929LeuArg: 3.929 ± 0.264
4.937LeuSer: 4.937 ± 0.307
5.081LeuThr: 5.081 ± 0.39
6.418LeuVal: 6.418 ± 0.372
0.802LeuTrp: 0.802 ± 0.141
2.942LeuTyr: 2.942 ± 0.278
0.0LeuXaa: 0.0 ± 0.0
Met
2.551MetAla: 2.551 ± 0.249
0.329MetCys: 0.329 ± 0.08
1.563MetAsp: 1.563 ± 0.171
1.543MetGlu: 1.543 ± 0.209
1.522MetPhe: 1.522 ± 0.195
1.666MetGly: 1.666 ± 0.192
0.473MetHis: 0.473 ± 0.11
1.563MetIle: 1.563 ± 0.165
2.304MetLys: 2.304 ± 0.269
2.427MetLeu: 2.427 ± 0.231
0.843MetMet: 0.843 ± 0.138
1.481MetAsn: 1.481 ± 0.195
0.967MetPro: 0.967 ± 0.163
1.152MetGln: 1.152 ± 0.127
1.646MetArg: 1.646 ± 0.177
2.139MetSer: 2.139 ± 0.187
1.605MetThr: 1.605 ± 0.186
1.872MetVal: 1.872 ± 0.201
0.309MetTrp: 0.309 ± 0.076
0.843MetTyr: 0.843 ± 0.151
0.0MetXaa: 0.0 ± 0.0
Asn
3.847AsnAla: 3.847 ± 0.271
0.617AsnCys: 0.617 ± 0.126
2.489AsnAsp: 2.489 ± 0.249
2.489AsnGlu: 2.489 ± 0.264
2.139AsnPhe: 2.139 ± 0.217
4.053AsnGly: 4.053 ± 0.304
1.049AsnHis: 1.049 ± 0.155
2.942AsnIle: 2.942 ± 0.237
2.798AsnLys: 2.798 ± 0.206
3.765AsnLeu: 3.765 ± 0.28
1.481AsnMet: 1.481 ± 0.176
2.736AsnAsn: 2.736 ± 0.28
2.427AsnPro: 2.427 ± 0.273
1.625AsnGln: 1.625 ± 0.17
2.366AsnArg: 2.366 ± 0.27
2.489AsnSer: 2.489 ± 0.219
3.127AsnThr: 3.127 ± 0.342
3.477AsnVal: 3.477 ± 0.311
0.761AsnTrp: 0.761 ± 0.134
1.851AsnTyr: 1.851 ± 0.208
0.0AsnXaa: 0.0 ± 0.0
Pro
2.654ProAla: 2.654 ± 0.234
0.432ProCys: 0.432 ± 0.099
3.127ProAsp: 3.127 ± 0.242
3.209ProGlu: 3.209 ± 0.275
1.81ProPhe: 1.81 ± 0.182
2.469ProGly: 2.469 ± 0.239
0.802ProHis: 0.802 ± 0.125
1.769ProIle: 1.769 ± 0.189
2.366ProLys: 2.366 ± 0.234
3.25ProLeu: 3.25 ± 0.214
0.864ProMet: 0.864 ± 0.15
1.707ProAsn: 1.707 ± 0.18
1.255ProPro: 1.255 ± 0.181
1.584ProGln: 1.584 ± 0.182
1.687ProArg: 1.687 ± 0.201
2.777ProSer: 2.777 ± 0.274
2.489ProThr: 2.489 ± 0.222
2.983ProVal: 2.983 ± 0.239
0.597ProTrp: 0.597 ± 0.107
1.296ProTyr: 1.296 ± 0.172
0.0ProXaa: 0.0 ± 0.0
Gln
2.942GlnAla: 2.942 ± 0.27
0.35GlnCys: 0.35 ± 0.097
2.325GlnAsp: 2.325 ± 0.203
2.222GlnGlu: 2.222 ± 0.296
1.954GlnPhe: 1.954 ± 0.214
2.386GlnGly: 2.386 ± 0.204
0.555GlnHis: 0.555 ± 0.116
2.366GlnIle: 2.366 ± 0.222
2.325GlnLys: 2.325 ± 0.242
3.024GlnLeu: 3.024 ± 0.223
1.152GlnMet: 1.152 ± 0.146
1.563GlnAsn: 1.563 ± 0.206
1.234GlnPro: 1.234 ± 0.143
1.687GlnGln: 1.687 ± 0.196
1.913GlnArg: 1.913 ± 0.188
2.283GlnSer: 2.283 ± 0.182
2.283GlnThr: 2.283 ± 0.224
2.633GlnVal: 2.633 ± 0.22
0.576GlnTrp: 0.576 ± 0.101
1.625GlnTyr: 1.625 ± 0.158
0.0GlnXaa: 0.0 ± 0.0
Arg
3.148ArgAla: 3.148 ± 0.231
0.741ArgCys: 0.741 ± 0.152
3.086ArgAsp: 3.086 ± 0.233
3.436ArgGlu: 3.436 ± 0.291
2.119ArgPhe: 2.119 ± 0.232
3.127ArgGly: 3.127 ± 0.248
0.967ArgHis: 0.967 ± 0.142
3.086ArgIle: 3.086 ± 0.232
3.271ArgLys: 3.271 ± 0.292
4.567ArgLeu: 4.567 ± 0.317
1.81ArgMet: 1.81 ± 0.198
2.469ArgAsn: 2.469 ± 0.254
1.749ArgPro: 1.749 ± 0.212
1.893ArgGln: 1.893 ± 0.199
3.292ArgArg: 3.292 ± 0.306
2.901ArgSer: 2.901 ± 0.257
2.51ArgThr: 2.51 ± 0.239
3.25ArgVal: 3.25 ± 0.268
0.72ArgTrp: 0.72 ± 0.111
2.078ArgTyr: 2.078 ± 0.23
0.0ArgXaa: 0.0 ± 0.0
Ser
3.847SerAla: 3.847 ± 0.319
0.535SerCys: 0.535 ± 0.122
3.662SerAsp: 3.662 ± 0.256
3.394SerGlu: 3.394 ± 0.244
2.345SerPhe: 2.345 ± 0.218
4.649SerGly: 4.649 ± 0.481
0.864SerHis: 0.864 ± 0.124
4.114SerIle: 4.114 ± 0.252
3.621SerLys: 3.621 ± 0.241
5.081SerLeu: 5.081 ± 0.33
1.522SerMet: 1.522 ± 0.173
3.065SerAsn: 3.065 ± 0.33
2.283SerPro: 2.283 ± 0.227
2.263SerGln: 2.263 ± 0.206
2.757SerArg: 2.757 ± 0.246
4.032SerSer: 4.032 ± 0.363
3.538SerThr: 3.538 ± 0.384
4.444SerVal: 4.444 ± 0.373
0.576SerTrp: 0.576 ± 0.102
2.427SerTyr: 2.427 ± 0.252
0.0SerXaa: 0.0 ± 0.0
Thr
4.546ThrAla: 4.546 ± 0.411
0.699ThrCys: 0.699 ± 0.117
3.518ThrAsp: 3.518 ± 0.332
3.888ThrGlu: 3.888 ± 0.26
2.901ThrPhe: 2.901 ± 0.288
4.423ThrGly: 4.423 ± 0.392
1.049ThrHis: 1.049 ± 0.143
4.114ThrIle: 4.114 ± 0.329
3.559ThrLys: 3.559 ± 0.278
5.164ThrLeu: 5.164 ± 0.439
1.049ThrMet: 1.049 ± 0.155
2.674ThrAsn: 2.674 ± 0.262
3.559ThrPro: 3.559 ± 0.278
1.872ThrGln: 1.872 ± 0.211
3.189ThrArg: 3.189 ± 0.234
3.148ThrSer: 3.148 ± 0.304
3.95ThrThr: 3.95 ± 0.349
4.896ThrVal: 4.896 ± 0.425
0.864ThrTrp: 0.864 ± 0.133
1.81ThrTyr: 1.81 ± 0.241
0.0ThrXaa: 0.0 ± 0.0
Val
4.876ValAla: 4.876 ± 0.343
0.967ValCys: 0.967 ± 0.143
5.637ValAsp: 5.637 ± 0.371
5.02ValGlu: 5.02 ± 0.394
2.736ValPhe: 2.736 ± 0.251
4.567ValGly: 4.567 ± 0.356
1.152ValHis: 1.152 ± 0.138
4.279ValIle: 4.279 ± 0.284
5.02ValLys: 5.02 ± 0.348
5.431ValLeu: 5.431 ± 0.34
1.893ValMet: 1.893 ± 0.208
3.847ValAsn: 3.847 ± 0.308
2.695ValPro: 2.695 ± 0.249
2.674ValGln: 2.674 ± 0.252
3.106ValArg: 3.106 ± 0.274
4.649ValSer: 4.649 ± 0.361
5.102ValThr: 5.102 ± 0.468
6.11ValVal: 6.11 ± 0.404
1.111ValTrp: 1.111 ± 0.156
2.962ValTyr: 2.962 ± 0.289
0.0ValXaa: 0.0 ± 0.0
Trp
1.111TrpAla: 1.111 ± 0.146
0.288TrpCys: 0.288 ± 0.08
0.905TrpAsp: 0.905 ± 0.145
1.337TrpGlu: 1.337 ± 0.17
0.638TrpPhe: 0.638 ± 0.103
0.885TrpGly: 0.885 ± 0.135
0.247TrpHis: 0.247 ± 0.066
0.699TrpIle: 0.699 ± 0.13
0.782TrpLys: 0.782 ± 0.134
1.358TrpLeu: 1.358 ± 0.186
0.37TrpMet: 0.37 ± 0.092
0.72TrpAsn: 0.72 ± 0.116
0.411TrpPro: 0.411 ± 0.089
0.329TrpGln: 0.329 ± 0.086
0.905TrpArg: 0.905 ± 0.134
0.679TrpSer: 0.679 ± 0.107
0.658TrpThr: 0.658 ± 0.112
1.152TrpVal: 1.152 ± 0.155
0.165TrpTrp: 0.165 ± 0.051
0.432TrpTyr: 0.432 ± 0.105
0.0TrpXaa: 0.0 ± 0.0
Tyr
2.962TyrAla: 2.962 ± 0.272
0.555TyrCys: 0.555 ± 0.116
2.777TyrAsp: 2.777 ± 0.24
2.098TyrGlu: 2.098 ± 0.186
1.543TyrPhe: 1.543 ± 0.152
2.469TyrGly: 2.469 ± 0.248
1.008TyrHis: 1.008 ± 0.135
1.975TyrIle: 1.975 ± 0.224
1.893TyrLys: 1.893 ± 0.189
2.88TyrLeu: 2.88 ± 0.243
0.987TyrMet: 0.987 ± 0.116
1.975TyrAsn: 1.975 ± 0.18
1.502TyrPro: 1.502 ± 0.175
1.522TyrGln: 1.522 ± 0.186
1.851TyrArg: 1.851 ± 0.231
2.407TyrSer: 2.407 ± 0.261
2.078TyrThr: 2.078 ± 0.228
3.086TyrVal: 3.086 ± 0.245
0.473TyrTrp: 0.473 ± 0.09
1.358TyrTyr: 1.358 ± 0.151
0.0TyrXaa: 0.0 ± 0.0
Xaa
0.0XaaAla: 0.0 ± 0.0
0.0XaaCys: 0.0 ± 0.0
0.0XaaAsp: 0.0 ± 0.0
0.0XaaGlu: 0.0 ± 0.0
0.0XaaPhe: 0.0 ± 0.0
0.0XaaGly: 0.0 ± 0.0
0.0XaaHis: 0.0 ± 0.0
0.0XaaIle: 0.0 ± 0.0
0.0XaaLys: 0.0 ± 0.0
0.0XaaLeu: 0.0 ± 0.0
0.0XaaMet: 0.0 ± 0.0
0.0XaaAsn: 0.0 ± 0.0
0.0XaaPro: 0.0 ± 0.0
0.0XaaGln: 0.0 ± 0.0
0.0XaaArg: 0.0 ± 0.0
0.0XaaSer: 0.0 ± 0.0
0.0XaaThr: 0.0 ± 0.0
0.0XaaVal: 0.0 ± 0.0
0.0XaaTrp: 0.0 ± 0.0
0.0XaaTyr: 0.0 ± 0.0
0.0XaaXaa: 0.0 ± 0.0
Statistics based on 216 proteins (48611 amino acids)

Note: The error was estimated by bootstrapping (100 replicates) at the protein level.

The dipeptide statistics above (along with other statistics for this proteome) can be downloaded from this CSV file
See this proteome in: uniprot_link
Proteome-pI is available under a Creative Commons Attribution-NoDerivs license; for more details see here

Reference: Kozlowski LP. Proteome-pI 2.0: Proteome Isoelectric Point Database Update. Nucleic Acids Res. 2021, doi: 10.1093/nar/gkab944
Contact: Lukasz P. Kozlowski