Amino acid dipeptide frequency for Mycobacterium phage Hannaconda

Apart from single amino acid frequencies, one can also calculate the so-called amino acid dipeptide frequency, i.e. how often each ordered pair of adjacent amino acids occurs.
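A minimal sketch of how such frequencies could be computed from a set of protein sequences. It assumes overlapping adjacent pairs are counted within each protein (never across protein boundaries), that any non-standard letter is mapped to X (Xaa), and that counts are normalised by the total number of dipeptides; the database may use a different convention. The function name `dipeptide_permille` is hypothetical.

```python
from collections import Counter
from itertools import product

STANDARD = "ACDEFGHIKLMNPQRSTVWY"   # the 20 standard amino acids (one-letter codes)
ALPHABET = STANDARD + "X"           # X stands for Xaa (non-standard or unknown)


def dipeptide_permille(proteins):
    """Return a dict mapping each of the 441 dipeptides to its frequency in per mille."""
    counts = Counter()
    for seq in proteins:
        # Map every non-standard letter to X so the alphabet has exactly 21 symbols.
        seq = "".join(c if c in STANDARD else "X" for c in seq.upper())
        # Overlapping adjacent pairs: positions (0,1), (1,2), ...
        counts.update(a + b for a, b in zip(seq, seq[1:]))
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no dipeptides found (all sequences shorter than 2 residues)")
    return {a + b: 1000.0 * counts[a + b] / total
            for a, b in product(ALPHABET, repeat=2)}


if __name__ == "__main__":
    freqs = dipeptide_permille(["MAAG", "MAGG"])
    print(freqs["MA"], freqs["AA"], freqs["GG"])
```

A protein of length n contributes n - 1 dipeptides, so short proteins carry less weight than long ones in the pooled frequency.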

There are 400 possible dipeptides (441 if Xaa is counted as an additional 21st amino acid representing all non-standard or unknown amino acids). Thus, if dipeptides occurred uniformly at random in proteins, each of the 400 dipeptides would be present with a frequency of 0.25% (2.5‰). As this is not the case in nature, for better visibility all dipeptides that are more frequent than expected are marked in red in the table, and those that are underrepresented are marked in blue.
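The marking rule described above can be sketched as follows. This assumes a strict comparison against the uniform expectation of 1/400; whether the page applies any tolerance around that threshold is not stated. The names `expected_permille` and `mark` are hypothetical.

```python
N_STANDARD = 20  # standard amino acids; use 21 to include Xaa


def expected_permille(n_residues=N_STANDARD):
    """Uniform expectation for one dipeptide, in per mille: 1000 / n^2."""
    return 1000.0 / (n_residues ** 2)


def mark(observed_permille, expected=None):
    """Return 'red' if over-represented, 'blue' if under-represented, else 'none'."""
    if expected is None:
        expected = expected_permille()
    if observed_permille > expected:
        return "red"
    if observed_permille < expected:
        return "blue"
    return "none"


if __name__ == "__main__":
    print(expected_permille())  # 2.5
    print(mark(11.0))           # AlaAla from the table -> red
    print(mark(0.117))          # CysCys from the table -> blue
```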

All values are given in per mille (‰), so they need to be multiplied by 10^-3 to obtain fractions.
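As a worked example of the unit conversion, using the AlaAla value from the table and the uniform expectation for 400 dipeptides:

```latex
\[
f_{\mathrm{AlaAla}} = 11.0\ \text{\textperthousand} = 11.0 \times 10^{-3} = 0.0110 = 1.10\,\%
\]
\[
f_{\mathrm{expected}} = \frac{1}{400} = 0.0025 = 0.25\,\% = 2.5\ \text{\textperthousand}
\]
```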

For more information, see the sequence space article on Wikipedia.

Dipeptide frequencies (‰ ± error), grouped by first residue; second residues in the order: Ala Cys Asp Glu Phe Gly His Ile Lys Leu Met Asn Pro Gln Arg Ser Thr Val Trp Tyr Xaa
Ala
AlaAla: 11.0 ± 0.75
AlaCys: 1.517 ± 0.239
AlaAsp: 6.215 ± 0.427
AlaGlu: 7.557 ± 0.611
AlaPhe: 3.122 ± 0.32
AlaGly: 7.44 ± 0.794
AlaHis: 2.159 ± 0.267
AlaIle: 5.398 ± 0.427
AlaLys: 4.085 ± 0.362
AlaLeu: 8.666 ± 0.528
AlaMet: 3.21 ± 0.336
AlaAsn: 3.093 ± 0.293
AlaPro: 4.464 ± 0.396
AlaGln: 2.889 ± 0.238
AlaArg: 6.623 ± 0.447
AlaSer: 4.289 ± 0.372
AlaThr: 4.873 ± 0.393
AlaVal: 5.719 ± 0.401
AlaTrp: 2.042 ± 0.232
AlaTyr: 2.422 ± 0.246
AlaXaa: 0.0 ± 0.0
Cys
CysAla: 1.05 ± 0.207
CysCys: 0.117 ± 0.084
CysAsp: 1.313 ± 0.232
CysGlu: 1.284 ± 0.229
CysPhe: 0.438 ± 0.113
CysGly: 2.188 ± 0.335
CysHis: 0.525 ± 0.135
CysIle: 0.671 ± 0.148
CysLys: 0.759 ± 0.147
CysLeu: 1.138 ± 0.218
CysMet: 0.233 ± 0.093
CysAsn: 0.7 ± 0.185
CysPro: 0.875 ± 0.201
CysGln: 0.379 ± 0.135
CysArg: 1.167 ± 0.218
CysSer: 0.846 ± 0.184
CysThr: 0.934 ± 0.17
CysVal: 0.905 ± 0.198
CysTrp: 0.263 ± 0.079
CysTyr: 0.467 ± 0.129
CysXaa: 0.0 ± 0.0
Asp
AspAla: 5.281 ± 0.395
AspCys: 1.138 ± 0.211
AspAsp: 3.881 ± 0.314
AspGlu: 4.727 ± 0.528
AspPhe: 1.926 ± 0.186
AspGly: 6.594 ± 0.456
AspHis: 1.897 ± 0.236
AspIle: 3.18 ± 0.313
AspLys: 2.889 ± 0.253
AspLeu: 5.427 ± 0.357
AspMet: 1.284 ± 0.193
AspAsn: 1.809 ± 0.243
AspPro: 3.326 ± 0.269
AspGln: 2.13 ± 0.232
AspArg: 3.881 ± 0.299
AspSer: 2.772 ± 0.331
AspThr: 3.151 ± 0.288
AspVal: 4.348 ± 0.412
AspTrp: 2.188 ± 0.282
AspTyr: 2.305 ± 0.224
AspXaa: 0.0 ± 0.0
Glu
GluAla: 6.799 ± 0.577
GluCys: 1.401 ± 0.266
GluAsp: 4.202 ± 0.352
GluGlu: 5.631 ± 0.478
GluPhe: 3.151 ± 0.313
GluGly: 4.493 ± 0.441
GluHis: 1.605 ± 0.23
GluIle: 3.939 ± 0.322
GluLys: 2.743 ± 0.289
GluLeu: 6.157 ± 0.584
GluMet: 2.334 ± 0.297
GluAsn: 1.634 ± 0.224
GluPro: 3.035 ± 0.356
GluGln: 2.334 ± 0.258
GluArg: 5.165 ± 0.453
GluSer: 3.414 ± 0.355
GluThr: 3.093 ± 0.273
GluVal: 4.493 ± 0.369
GluTrp: 1.517 ± 0.246
GluTyr: 2.947 ± 0.377
GluXaa: 0.0 ± 0.0
Phe
PheAla: 2.509 ± 0.284
PheCys: 0.467 ± 0.105
PheAsp: 2.393 ± 0.277
PheGlu: 2.072 ± 0.222
PhePhe: 1.08 ± 0.203
PheGly: 3.297 ± 0.333
PheHis: 0.7 ± 0.151
PheIle: 1.722 ± 0.212
PheLys: 0.992 ± 0.132
PheLeu: 2.072 ± 0.272
PheMet: 0.729 ± 0.135
PheAsn: 1.255 ± 0.224
PhePro: 1.867 ± 0.247
PheGln: 0.817 ± 0.187
PheArg: 1.663 ± 0.202
PheSer: 2.072 ± 0.244
PheThr: 1.897 ± 0.264
PheVal: 1.926 ± 0.25
PheTrp: 0.554 ± 0.147
PheTyr: 0.846 ± 0.146
PheXaa: 0.0 ± 0.0
Gly
GlyAla: 7.324 ± 0.719
GlyCys: 1.605 ± 0.231
GlyAsp: 5.865 ± 0.397
GlyGlu: 5.952 ± 0.483
GlyPhe: 2.859 ± 0.288
GlyGly: 8.491 ± 1.473
GlyHis: 2.013 ± 0.26
GlyIle: 3.647 ± 0.368
GlyLys: 3.297 ± 0.342
GlyLeu: 6.828 ± 0.71
GlyMet: 2.101 ± 0.25
GlyAsn: 3.501 ± 0.387
GlyPro: 3.997 ± 0.366
GlyGln: 2.597 ± 0.38
GlyArg: 5.836 ± 0.441
GlySer: 5.194 ± 0.509
GlyThr: 4.844 ± 0.528
GlyVal: 5.573 ± 0.407
GlyTrp: 2.305 ± 0.315
GlyTyr: 3.21 ± 0.25
GlyXaa: 0.0 ± 0.0
His
HisAla: 1.751 ± 0.21
HisCys: 0.584 ± 0.136
HisAsp: 1.663 ± 0.226
HisGlu: 1.984 ± 0.253
HisPhe: 0.525 ± 0.143
HisGly: 2.363 ± 0.29
HisHis: 0.875 ± 0.194
HisIle: 1.021 ± 0.18
HisLys: 0.759 ± 0.132
HisLeu: 2.276 ± 0.271
HisMet: 0.467 ± 0.113
HisAsn: 0.554 ± 0.124
HisPro: 1.576 ± 0.199
HisGln: 0.613 ± 0.129
HisArg: 2.072 ± 0.237
HisSer: 0.905 ± 0.187
HisThr: 0.788 ± 0.144
HisVal: 1.43 ± 0.217
HisTrp: 0.875 ± 0.176
HisTyr: 0.671 ± 0.136
HisXaa: 0.0 ± 0.0
Ile
IleAla: 5.281 ± 0.368
IleCys: 0.554 ± 0.13
IleAsp: 3.297 ± 0.334
IleGlu: 4.173 ± 0.381
IlePhe: 1.401 ± 0.196
IleGly: 3.589 ± 0.407
IleHis: 1.255 ± 0.172
IleIle: 2.042 ± 0.277
IleLys: 1.78 ± 0.214
IleLeu: 3.385 ± 0.401
IleMet: 0.759 ± 0.155
IleAsn: 1.867 ± 0.238
IlePro: 3.268 ± 0.315
IleGln: 1.546 ± 0.214
IleArg: 2.947 ± 0.325
IleSer: 2.393 ± 0.255
IleThr: 3.326 ± 0.336
IleVal: 3.531 ± 0.388
IleTrp: 0.759 ± 0.138
IleTyr: 1.313 ± 0.188
IleXaa: 0.0 ± 0.0
Lys
LysAla: 3.91 ± 0.407
LysCys: 0.554 ± 0.136
LysAsp: 1.838 ± 0.257
LysGlu: 1.984 ± 0.25
LysPhe: 1.255 ± 0.193
LysGly: 2.918 ± 0.337
LysHis: 0.992 ± 0.165
LysIle: 1.546 ± 0.234
LysLys: 2.305 ± 0.353
LysLeu: 3.618 ± 0.371
LysMet: 1.488 ± 0.223
LysAsn: 0.817 ± 0.149
LysPro: 2.655 ± 0.273
LysGln: 1.109 ± 0.169
LysArg: 2.772 ± 0.369
LysSer: 2.655 ± 0.395
LysThr: 1.984 ± 0.19
LysVal: 2.889 ± 0.332
LysTrp: 1.08 ± 0.186
LysTyr: 1.576 ± 0.194
LysXaa: 0.0 ± 0.0
Leu
LeuAla: 8.812 ± 0.482
LeuCys: 0.905 ± 0.176
LeuAsp: 5.661 ± 0.474
LeuGlu: 4.493 ± 0.416
LeuPhe: 1.809 ± 0.218
LeuGly: 6.536 ± 0.464
LeuHis: 1.722 ± 0.192
LeuIle: 3.151 ± 0.36
LeuLys: 2.976 ± 0.264
LeuLeu: 5.748 ± 0.483
LeuMet: 2.159 ± 0.242
LeuAsn: 2.363 ± 0.253
LeuPro: 4.348 ± 0.378
LeuGln: 2.568 ± 0.279
LeuArg: 6.157 ± 0.437
LeuSer: 5.106 ± 0.396
LeuThr: 5.631 ± 0.471
LeuVal: 5.427 ± 0.349
LeuTrp: 1.255 ± 0.179
LeuTyr: 2.042 ± 0.258
LeuXaa: 0.0 ± 0.0
Met
MetAla: 2.276 ± 0.276
MetCys: 0.292 ± 0.103
MetAsp: 1.284 ± 0.179
MetGlu: 1.313 ± 0.192
MetPhe: 0.642 ± 0.127
MetGly: 1.488 ± 0.211
MetHis: 0.496 ± 0.124
MetIle: 1.138 ± 0.164
MetLys: 1.225 ± 0.167
MetLeu: 1.751 ± 0.214
MetMet: 0.408 ± 0.106
MetAsn: 1.401 ± 0.197
MetPro: 1.138 ± 0.21
MetGln: 0.759 ± 0.151
MetArg: 1.926 ± 0.242
MetSer: 2.451 ± 0.262
MetThr: 2.188 ± 0.297
MetVal: 1.459 ± 0.201
MetTrp: 0.7 ± 0.132
MetTyr: 0.554 ± 0.122
MetXaa: 0.0 ± 0.0
Asn
AsnAla: 3.122 ± 0.331
AsnCys: 0.496 ± 0.12
AsnAsp: 2.159 ± 0.297
AsnGlu: 2.013 ± 0.259
AsnPhe: 0.992 ± 0.179
AsnGly: 3.997 ± 0.4
AsnHis: 0.875 ± 0.164
AsnIle: 1.342 ± 0.224
AsnLys: 0.992 ± 0.156
AsnLeu: 2.568 ± 0.255
AsnMet: 0.729 ± 0.131
AsnAsn: 0.788 ± 0.165
AsnPro: 2.48 ± 0.253
AsnGln: 0.905 ± 0.19
AsnArg: 2.597 ± 0.28
AsnSer: 1.838 ± 0.272
AsnThr: 1.663 ± 0.234
AsnVal: 1.955 ± 0.256
AsnTrp: 0.525 ± 0.114
AsnTyr: 0.7 ± 0.131
AsnXaa: 0.0 ± 0.0
Pro
ProAla: 5.048 ± 0.365
ProCys: 0.759 ± 0.165
ProAsp: 3.706 ± 0.317
ProGlu: 4.377 ± 0.332
ProPhe: 1.751 ± 0.234
ProGly: 5.544 ± 0.615
ProHis: 1.109 ± 0.169
ProIle: 2.568 ± 0.223
ProLys: 1.984 ± 0.274
ProLeu: 3.589 ± 0.282
ProMet: 1.342 ± 0.193
ProAsn: 2.042 ± 0.206
ProPro: 2.772 ± 0.348
ProGln: 1.196 ± 0.267
ProArg: 2.918 ± 0.274
ProSer: 2.684 ± 0.298
ProThr: 3.268 ± 0.317
ProVal: 4.348 ± 0.4
ProTrp: 1.459 ± 0.216
ProTyr: 1.371 ± 0.213
ProXaa: 0.0 ± 0.0
Gln
GlnAla: 3.18 ± 0.291
GlnCys: 0.408 ± 0.112
GlnAsp: 1.284 ± 0.234
GlnGlu: 1.692 ± 0.235
GlnPhe: 1.167 ± 0.205
GlnGly: 1.867 ± 0.221
GlnHis: 0.642 ± 0.135
GlnIle: 1.459 ± 0.211
GlnLys: 2.013 ± 0.306
GlnLeu: 2.305 ± 0.334
GlnMet: 0.846 ± 0.166
GlnAsn: 0.788 ± 0.129
GlnPro: 1.663 ± 0.265
GlnGln: 1.109 ± 0.273
GlnArg: 2.83 ± 0.303
GlnSer: 2.042 ± 0.283
GlnThr: 1.517 ± 0.187
GlnVal: 1.867 ± 0.249
GlnTrp: 0.7 ± 0.156
GlnTyr: 0.846 ± 0.136
GlnXaa: 0.0 ± 0.0
Arg
ArgAla: 6.828 ± 0.506
ArgCys: 1.021 ± 0.181
ArgAsp: 3.764 ± 0.342
ArgGlu: 4.902 ± 0.459
ArgPhe: 2.159 ± 0.273
ArgGly: 5.427 ± 0.4
ArgHis: 1.663 ± 0.229
ArgIle: 3.91 ± 0.363
ArgLys: 3.035 ± 0.297
ArgLeu: 5.252 ± 0.348
ArgMet: 2.247 ± 0.274
ArgAsn: 2.218 ± 0.25
ArgPro: 3.093 ± 0.33
ArgGln: 2.539 ± 0.324
ArgArg: 5.602 ± 0.508
ArgSer: 2.947 ± 0.236
ArgThr: 3.414 ± 0.397
ArgVal: 4.873 ± 0.39
ArgTrp: 2.013 ± 0.267
ArgTyr: 2.451 ± 0.277
ArgXaa: 0.0 ± 0.0
Ser
SerAla: 5.281 ± 0.417
SerCys: 0.875 ± 0.185
SerAsp: 3.501 ± 0.326
SerGlu: 3.356 ± 0.271
SerPhe: 1.255 ± 0.197
SerGly: 6.215 ± 0.648
SerHis: 0.992 ± 0.167
SerIle: 2.684 ± 0.28
SerLys: 2.042 ± 0.244
SerLeu: 4.493 ± 0.32
SerMet: 1.284 ± 0.19
SerAsn: 1.809 ± 0.257
SerPro: 3.035 ± 0.307
SerGln: 1.838 ± 0.204
SerArg: 3.093 ± 0.279
SerSer: 3.151 ± 0.388
SerThr: 2.976 ± 0.323
SerVal: 3.735 ± 0.35
SerTrp: 1.313 ± 0.225
SerTyr: 1.488 ± 0.192
SerXaa: 0.0 ± 0.0
Thr
ThrAla: 4.931 ± 0.406
ThrCys: 1.138 ± 0.224
ThrAsp: 3.589 ± 0.313
ThrGlu: 3.297 ± 0.371
ThrPhe: 2.101 ± 0.247
ThrGly: 5.281 ± 0.492
ThrHis: 1.167 ± 0.179
ThrIle: 3.122 ± 0.321
ThrLys: 1.838 ± 0.285
ThrLeu: 4.523 ± 0.386
ThrMet: 0.846 ± 0.157
ThrAsn: 1.692 ± 0.256
ThrPro: 4.435 ± 0.463
ThrGln: 1.225 ± 0.214
ThrArg: 3.122 ± 0.306
ThrSer: 2.363 ± 0.308
ThrThr: 2.947 ± 0.339
ThrVal: 4.873 ± 0.438
ThrTrp: 1.342 ± 0.2
ThrTyr: 1.926 ± 0.24
ThrXaa: 0.0 ± 0.0
Val
ValAla: 7.003 ± 0.52
ValCys: 1.255 ± 0.22
ValAsp: 5.135 ± 0.413
ValGlu: 5.573 ± 0.468
ValPhe: 1.926 ± 0.218
ValGly: 4.814 ± 0.374
ValHis: 1.576 ± 0.19
ValIle: 3.56 ± 0.324
ValLys: 2.509 ± 0.293
ValLeu: 4.464 ± 0.372
ValMet: 1.342 ± 0.206
ValAsn: 2.684 ± 0.268
ValPro: 3.618 ± 0.353
ValGln: 1.897 ± 0.256
ValArg: 4.289 ± 0.37
ValSer: 4.143 ± 0.414
ValThr: 4.173 ± 0.434
ValVal: 5.252 ± 0.419
ValTrp: 1.255 ± 0.204
ValTyr: 1.984 ± 0.238
ValXaa: 0.0 ± 0.0
Trp
TrpAla: 2.159 ± 0.239
TrpCys: 0.7 ± 0.156
TrpAsp: 1.43 ± 0.196
TrpGlu: 1.488 ± 0.223
TrpPhe: 0.671 ± 0.149
TrpGly: 1.692 ± 0.212
TrpHis: 0.817 ± 0.185
TrpIle: 1.08 ± 0.189
TrpLys: 0.496 ± 0.116
TrpLeu: 2.013 ± 0.243
TrpMet: 0.642 ± 0.138
TrpAsn: 0.671 ± 0.132
TrpPro: 1.138 ± 0.151
TrpGln: 0.467 ± 0.118
TrpArg: 1.955 ± 0.277
TrpSer: 1.546 ± 0.218
TrpThr: 1.342 ± 0.19
TrpVal: 1.78 ± 0.213
TrpTrp: 0.613 ± 0.126
TrpTyr: 0.671 ± 0.168
TrpXaa: 0.0 ± 0.0
Tyr
TyrAla: 3.239 ± 0.293
TyrCys: 0.467 ± 0.133
TyrAsp: 1.955 ± 0.232
TyrGlu: 2.247 ± 0.245
TyrPhe: 0.7 ± 0.169
TyrGly: 2.801 ± 0.325
TyrHis: 0.671 ± 0.148
TyrIle: 1.284 ± 0.229
TyrLys: 1.225 ± 0.194
TyrLeu: 2.597 ± 0.304
TyrMet: 0.408 ± 0.113
TyrAsn: 1.05 ± 0.158
TyrPro: 1.05 ± 0.173
TyrGln: 1.313 ± 0.209
TyrArg: 2.743 ± 0.308
TyrSer: 1.605 ± 0.225
TyrThr: 1.751 ± 0.237
TyrVal: 2.072 ± 0.281
TyrTrp: 0.613 ± 0.147
TyrTyr: 0.905 ± 0.183
TyrXaa: 0.0 ± 0.0
Xaa
XaaAla: 0.0 ± 0.0
XaaCys: 0.0 ± 0.0
XaaAsp: 0.0 ± 0.0
XaaGlu: 0.0 ± 0.0
XaaPhe: 0.0 ± 0.0
XaaGly: 0.0 ± 0.0
XaaHis: 0.0 ± 0.0
XaaIle: 0.0 ± 0.0
XaaLys: 0.0 ± 0.0
XaaLeu: 0.0 ± 0.0
XaaMet: 0.0 ± 0.0
XaaAsn: 0.0 ± 0.0
XaaPro: 0.0 ± 0.0
XaaGln: 0.0 ± 0.0
XaaArg: 0.0 ± 0.0
XaaSer: 0.0 ± 0.0
XaaThr: 0.0 ± 0.0
XaaVal: 0.0 ± 0.0
XaaTrp: 0.0 ± 0.0
XaaTyr: 0.0 ± 0.0
XaaXaa: 0.0 ± 0.0
Statistics based on 231 proteins (34273 amino acids)

Note: The error has been estimated by bootstrapping (100 replicates) at the protein level.
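A protein-level bootstrap of this kind might look like the sketch below. It assumes that whole proteins are resampled with replacement, that the frequency is recomputed for each of 100 replicates, and that the reported error is the sample standard deviation across replicates; the page does not spell out these details. All function names are hypothetical.

```python
import random
import statistics


def dipeptide_count(seq, dipeptide):
    """Count overlapping occurrences of one dipeptide in a sequence."""
    return sum(1 for i in range(len(seq) - 1) if seq[i:i + 2] == dipeptide)


def permille(proteins, dipeptide):
    """Pooled frequency of one dipeptide over a list of proteins, in per mille."""
    hits = sum(dipeptide_count(s, dipeptide) for s in proteins)
    total = sum(max(len(s) - 1, 0) for s in proteins)
    return 1000.0 * hits / total if total else 0.0


def bootstrap_error(proteins, dipeptide, n_replicates=100, seed=0):
    """Standard deviation of the frequency over protein-level bootstrap replicates."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    replicates = []
    for _ in range(n_replicates):
        # Resample whole proteins (not residues) with replacement.
        sample = rng.choices(proteins, k=len(proteins))
        replicates.append(permille(sample, dipeptide))
    return statistics.stdev(replicates)


if __name__ == "__main__":
    toy = ["MAAGA", "MKAAL", "MGGSA", "MAAAA"]
    print(permille(toy, "AA"), "+/-", bootstrap_error(toy, "AA"))
```

Resampling at the protein level keeps each protein's internal sequence intact, so the error reflects variation between proteins rather than between individual residue pairs.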

The dipeptide statistics above (among other statistics for this proteome) can be downloaded from this CSV file
See this proteome in: uniprot_link
Proteome-pI is available under the Creative Commons Attribution-NoDerivs license; for more details see here

Reference: Kozlowski LP. Proteome-pI 2.0: Proteome Isoelectric Point Database Update. Nucleic Acids Res. 2021, doi: 10.1093/nar/gkab944
Contact: Lukasz P. Kozlowski