Amino acid dipeptide frequency for Mycobacterium phage Ejimix

Apart from single amino acid frequencies, one can also calculate the so-called amino acid dipeptide frequency, i.e. how often each ordered pair of adjacent amino acids occurs.
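As an illustration, dipeptide frequencies can be counted with a sliding window of length two over each protein sequence. The sketch below is not the Proteome-pI implementation: the function name `dipeptide_permille`, the use of one-letter codes, the mapping of every non-standard letter to `X` (Xaa), and the choice to normalize by the total number of dipeptides are all assumptions made for the example.

```python
from collections import Counter

# The 20 standard amino acids in one-letter code.
STANDARD = set("ACDEFGHIKLMNPQRSTVWY")


def dipeptide_permille(proteins):
    """Return {dipeptide: frequency in per mille} for a list of sequences.

    Assumption: frequencies are normalized by the total number of
    overlapping dipeptides; the source does not state its denominator.
    """
    counts = Counter()
    for seq in proteins:
        # Map anything outside the 20 standard residues to 'X' (Xaa).
        seq = "".join(c if c in STANDARD else "X" for c in seq.upper())
        # Overlapping windows of length 2; pairs never span two proteins.
        counts.update(seq[i:i + 2] for i in range(len(seq) - 1))
    total = sum(counts.values())
    if total == 0:
        return {}
    return {pair: 1000.0 * n / total for pair, n in counts.items()}


# Toy input (hypothetical sequences, not taken from the proteome).
toy = ["MAAG", "MAGA"]
freqs = dipeptide_permille(toy)
for pair in sorted(freqs):
    print(pair, round(freqs[pair], 3))
```

Counting within each protein separately avoids creating artificial dipeptides across the boundary between the end of one protein and the start of the next.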

There are 400 possible dipeptides (441 if Xaa is counted as an additional 21st amino acid representing all non-standard or unknown amino acids). Thus, if dipeptides occurred randomly in proteins, with all amino acids equally frequent, each dipeptide would appear with a frequency of 0.25% (1/400). This is not the case in nature, so for better visibility dipeptides that are more frequent than expected are marked in red in the table, and underrepresented ones are marked in blue.

All values are given in per mille (‰) and therefore need to be multiplied by 10⁻³ to obtain fractions.
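The unit conversion and the red/blue marking can be sketched as follows. The uniform expectation of 0.25% corresponds to 2.5‰ in the units of the table. The function name `classify` is hypothetical, and the plain greater-than/less-than comparison against 2.5‰ is an assumption; the page does not say whether its colouring also takes the error estimate into account.

```python
# Uniform expectation over the 400 standard dipeptides, in per mille.
EXPECTED_PERMILLE = 1000.0 / 400  # 2.5 per mille, i.e. 0.25 %


def classify(permille, expected=EXPECTED_PERMILLE):
    """Return (fraction, colour) for a table value given in per mille."""
    fraction = permille * 1e-3  # per mille -> plain fraction
    if permille > expected:
        colour = "red"    # more frequent than expected
    elif permille < expected:
        colour = "blue"   # less frequent than expected
    else:
        colour = "none"   # exactly at the expectation
    return fraction, colour


# Three values copied from the table on this page.
sample = {"AlaAla": 10.971, "CysCys": 0.115, "ValLys": 2.62}
for name, value in sample.items():
    fraction, colour = classify(value)
    print(f"{name}: {value} per mille = {fraction:.6f} ({colour})")
```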

For more information, see the sequence space article on Wikipedia.

Ala Cys Asp Glu Phe Gly His Ile Lys Leu Met Asn Pro Gln Arg Ser Thr Val Trp Tyr Xaa
Ala
AlaAla: 10.971 ± 0.785
AlaCys: 1.037 ± 0.181
AlaAsp: 6.537 ± 0.466
AlaGlu: 7.573 ± 0.506
AlaPhe: 3.398 ± 0.288
AlaGly: 6.911 ± 0.634
AlaHis: 2.131 ± 0.29
AlaIle: 4.809 ± 0.406
AlaLys: 4.06 ± 0.375
AlaLeu: 8.812 ± 0.593
AlaMet: 3.456 ± 0.391
AlaAsn: 2.851 ± 0.332
AlaPro: 4.06 ± 0.349
AlaGln: 2.851 ± 0.386
AlaArg: 6.767 ± 0.487
AlaSer: 5.068 ± 0.426
AlaThr: 5.154 ± 0.361
AlaVal: 6.45 ± 0.428
AlaTrp: 2.16 ± 0.236
AlaTyr: 2.592 ± 0.29
AlaXaa: 0.0 ± 0.0
Cys
CysAla: 1.094 ± 0.213
CysCys: 0.115 ± 0.055
CysAsp: 1.267 ± 0.211
CysGlu: 1.238 ± 0.21
CysPhe: 0.346 ± 0.094
CysGly: 1.814 ± 0.267
CysHis: 0.432 ± 0.118
CysIle: 0.576 ± 0.167
CysLys: 0.634 ± 0.16
CysLeu: 0.979 ± 0.205
CysMet: 0.346 ± 0.105
CysAsn: 0.662 ± 0.153
CysPro: 0.691 ± 0.149
CysGln: 0.518 ± 0.127
CysArg: 1.065 ± 0.225
CysSer: 0.547 ± 0.114
CysThr: 0.806 ± 0.135
CysVal: 0.691 ± 0.132
CysTrp: 0.23 ± 0.078
CysTyr: 0.518 ± 0.12
CysXaa: 0.0 ± 0.0
Asp
AspAla: 6.191 ± 0.432
AspCys: 1.152 ± 0.203
AspAsp: 4.55 ± 0.416
AspGlu: 5.356 ± 0.546
AspPhe: 2.16 ± 0.193
AspGly: 7.055 ± 0.449
AspHis: 1.613 ± 0.244
AspIle: 3.369 ± 0.303
AspLys: 2.361 ± 0.236
AspLeu: 5.673 ± 0.331
AspMet: 1.065 ± 0.168
AspAsn: 2.304 ± 0.239
AspPro: 3.6 ± 0.31
AspGln: 1.929 ± 0.215
AspArg: 4.089 ± 0.325
AspSer: 2.908 ± 0.299
AspThr: 3.283 ± 0.319
AspVal: 4.175 ± 0.367
AspTrp: 2.217 ± 0.272
AspTyr: 2.563 ± 0.307
AspXaa: 0.0 ± 0.0
Glu
GluAla: 6.911 ± 0.516
GluCys: 1.382 ± 0.219
GluAsp: 4.492 ± 0.395
GluGlu: 5.011 ± 0.454
GluPhe: 2.88 ± 0.298
GluGly: 4.06 ± 0.337
GluHis: 1.238 ± 0.199
GluIle: 4.175 ± 0.358
GluLys: 2.707 ± 0.38
GluLeu: 6.306 ± 0.478
GluMet: 2.505 ± 0.31
GluAsn: 1.526 ± 0.202
GluPro: 3.139 ± 0.32
GluGln: 2.39 ± 0.259
GluArg: 4.867 ± 0.38
GluSer: 3.571 ± 0.366
GluThr: 3.686 ± 0.291
GluVal: 4.751 ± 0.361
GluTrp: 1.325 ± 0.207
GluTyr: 2.419 ± 0.259
GluXaa: 0.0 ± 0.0
Phe
PheAla: 2.39 ± 0.266
PheCys: 0.461 ± 0.122
PheAsp: 2.332 ± 0.279
PheGlu: 2.361 ± 0.231
PhePhe: 1.037 ± 0.168
PheGly: 2.995 ± 0.363
PheHis: 0.605 ± 0.144
PheIle: 1.411 ± 0.159
PheLys: 1.008 ± 0.192
PheLeu: 2.332 ± 0.252
PheMet: 0.777 ± 0.12
PheAsn: 1.382 ± 0.221
PhePro: 1.843 ± 0.245
PheGln: 0.921 ± 0.19
PheArg: 2.045 ± 0.245
PheSer: 1.929 ± 0.251
PheThr: 1.872 ± 0.209
PheVal: 2.275 ± 0.233
PheTrp: 0.576 ± 0.128
PheTyr: 0.691 ± 0.125
PheXaa: 0.0 ± 0.0
Gly
GlyAla: 7.141 ± 0.628
GlyCys: 1.267 ± 0.222
GlyAsp: 5.788 ± 0.403
GlyGlu: 5.788 ± 0.44
GlyPhe: 3.081 ± 0.306
GlyGly: 7.516 ± 1.479
GlyHis: 2.304 ± 0.267
GlyIle: 3.801 ± 0.34
GlyLys: 3.859 ± 0.338
GlyLeu: 7.141 ± 0.555
GlyMet: 1.958 ± 0.23
GlyAsn: 3.139 ± 0.333
GlyPro: 3.628 ± 0.316
GlyGln: 2.62 ± 0.331
GlyArg: 5.442 ± 0.321
GlySer: 4.838 ± 0.336
GlyThr: 4.636 ± 0.418
GlyVal: 5.126 ± 0.343
GlyTrp: 1.929 ± 0.273
GlyTyr: 3.254 ± 0.258
GlyXaa: 0.0 ± 0.0
His
HisAla: 1.555 ± 0.212
HisCys: 0.288 ± 0.088
HisAsp: 1.757 ± 0.219
HisGlu: 1.613 ± 0.197
HisPhe: 0.547 ± 0.135
HisGly: 2.188 ± 0.271
HisHis: 0.95 ± 0.172
HisIle: 0.979 ± 0.19
HisLys: 0.662 ± 0.154
HisLeu: 2.62 ± 0.293
HisMet: 0.576 ± 0.131
HisAsn: 0.634 ± 0.117
HisPro: 1.497 ± 0.227
HisGln: 0.893 ± 0.153
HisArg: 1.67 ± 0.22
HisSer: 0.864 ± 0.163
HisThr: 0.806 ± 0.158
HisVal: 1.353 ± 0.208
HisTrp: 0.72 ± 0.166
HisTyr: 0.72 ± 0.171
HisXaa: 0.0 ± 0.0
Ile
IleAla: 5.414 ± 0.451
IleCys: 0.605 ± 0.145
IleAsp: 4.003 ± 0.369
IleGlu: 4.06 ± 0.347
IlePhe: 0.893 ± 0.182
IleGly: 3.83 ± 0.405
IleHis: 1.296 ± 0.167
IleIle: 2.102 ± 0.265
IleLys: 1.641 ± 0.215
IleLeu: 3.571 ± 0.365
IleMet: 0.979 ± 0.179
IleAsn: 1.67 ± 0.197
IlePro: 3.11 ± 0.293
IleGln: 1.526 ± 0.207
IleArg: 2.908 ± 0.306
IleSer: 2.188 ± 0.23
IleThr: 3.024 ± 0.24
IleVal: 3.398 ± 0.364
IleTrp: 0.806 ± 0.137
IleTyr: 1.238 ± 0.197
IleXaa: 0.0 ± 0.0
Lys
LysAla: 4.262 ± 0.42
LysCys: 0.691 ± 0.149
LysAsp: 1.929 ± 0.249
LysGlu: 2.045 ± 0.322
LysPhe: 1.094 ± 0.15
LysGly: 3.196 ± 0.382
LysHis: 1.065 ± 0.18
LysIle: 1.584 ± 0.203
LysLys: 1.929 ± 0.29
LysLeu: 3.628 ± 0.329
LysMet: 1.584 ± 0.224
LysAsn: 0.864 ± 0.122
LysPro: 2.563 ± 0.323
LysGln: 1.065 ± 0.173
LysArg: 2.937 ± 0.339
LysSer: 1.785 ± 0.214
LysThr: 1.699 ± 0.22
LysVal: 3.024 ± 0.32
LysTrp: 1.181 ± 0.187
LysTyr: 1.382 ± 0.192
LysXaa: 0.0 ± 0.0
Leu
LeuAla: 8.984 ± 0.524
LeuCys: 0.835 ± 0.174
LeuAsp: 5.903 ± 0.345
LeuGlu: 5.27 ± 0.394
LeuPhe: 2.304 ± 0.255
LeuGly: 6.479 ± 0.457
LeuHis: 1.872 ± 0.202
LeuIle: 3.11 ± 0.338
LeuLys: 3.168 ± 0.316
LeuLeu: 6.22 ± 0.434
LeuMet: 1.958 ± 0.214
LeuAsn: 3.542 ± 0.314
LeuPro: 4.435 ± 0.369
LeuGln: 2.649 ± 0.273
LeuArg: 5.27 ± 0.465
LeuSer: 5.154 ± 0.373
LeuThr: 5.126 ± 0.333
LeuVal: 4.924 ± 0.363
LeuTrp: 1.44 ± 0.207
LeuTyr: 1.958 ± 0.264
LeuXaa: 0.0 ± 0.0
Met
MetAla: 2.476 ± 0.258
MetCys: 0.23 ± 0.099
MetAsp: 1.411 ± 0.191
MetGlu: 1.411 ± 0.202
MetPhe: 0.835 ± 0.134
MetGly: 1.958 ± 0.21
MetHis: 0.374 ± 0.098
MetIle: 1.44 ± 0.192
MetLys: 1.181 ± 0.206
MetLeu: 1.353 ± 0.195
MetMet: 0.634 ± 0.116
MetAsn: 0.979 ± 0.162
MetPro: 1.296 ± 0.198
MetGln: 0.72 ± 0.16
MetArg: 1.325 ± 0.191
MetSer: 2.707 ± 0.307
MetThr: 2.419 ± 0.291
MetVal: 1.152 ± 0.204
MetTrp: 0.49 ± 0.105
MetTyr: 0.461 ± 0.102
MetXaa: 0.0 ± 0.0
Asn
AsnAla: 3.6 ± 0.345
AsnCys: 0.288 ± 0.092
AsnAsp: 1.987 ± 0.199
AsnGlu: 1.641 ± 0.228
AsnPhe: 1.152 ± 0.178
AsnGly: 3.6 ± 0.309
AsnHis: 0.806 ± 0.117
AsnIle: 1.296 ± 0.183
AsnLys: 1.382 ± 0.195
AsnLeu: 2.764 ± 0.308
AsnMet: 0.49 ± 0.127
AsnAsn: 0.662 ± 0.133
AsnPro: 2.62 ± 0.271
AsnGln: 1.037 ± 0.25
AsnArg: 2.592 ± 0.252
AsnSer: 1.641 ± 0.2
AsnThr: 1.67 ± 0.244
AsnVal: 2.62 ± 0.267
AsnTrp: 0.806 ± 0.144
AsnTyr: 0.893 ± 0.157
AsnXaa: 0.0 ± 0.0
Pro
ProAla: 5.011 ± 0.473
ProCys: 0.835 ± 0.149
ProAsp: 3.715 ± 0.337
ProGlu: 4.291 ± 0.377
ProPhe: 1.757 ± 0.241
ProGly: 5.356 ± 0.613
ProHis: 0.893 ± 0.141
ProIle: 2.073 ± 0.225
ProLys: 2.275 ± 0.262
ProLeu: 3.859 ± 0.33
ProMet: 1.181 ± 0.187
ProAsn: 2.16 ± 0.197
ProPro: 3.081 ± 0.41
ProGln: 1.411 ± 0.188
ProArg: 3.052 ± 0.263
ProSer: 2.045 ± 0.295
ProThr: 3.052 ± 0.318
ProVal: 4.262 ± 0.31
ProTrp: 1.353 ± 0.173
ProTyr: 1.613 ± 0.215
ProXaa: 0.0 ± 0.0
Gln
GlnAla: 3.456 ± 0.353
GlnCys: 0.461 ± 0.122
GlnAsp: 1.325 ± 0.222
GlnGlu: 1.987 ± 0.294
GlnPhe: 1.238 ± 0.216
GlnGly: 2.419 ± 0.268
GlnHis: 0.461 ± 0.1
GlnIle: 1.814 ± 0.216
GlnLys: 1.641 ± 0.312
GlnLeu: 2.419 ± 0.272
GlnMet: 0.864 ± 0.146
GlnAsn: 0.95 ± 0.192
GlnPro: 1.814 ± 0.242
GlnGln: 1.325 ± 0.245
GlnArg: 2.073 ± 0.221
GlnSer: 1.613 ± 0.213
GlnThr: 1.641 ± 0.221
GlnVal: 2.448 ± 0.261
GlnTrp: 0.72 ± 0.143
GlnTyr: 0.749 ± 0.129
GlnXaa: 0.0 ± 0.0
Arg
ArgAla: 6.767 ± 0.509
ArgCys: 1.094 ± 0.206
ArgAsp: 4.348 ± 0.337
ArgGlu: 3.974 ± 0.39
ArgPhe: 1.987 ± 0.214
ArgGly: 5.039 ± 0.385
ArgHis: 1.641 ± 0.211
ArgIle: 3.456 ± 0.354
ArgLys: 2.764 ± 0.279
ArgLeu: 4.435 ± 0.342
ArgMet: 1.958 ± 0.247
ArgAsn: 1.929 ± 0.208
ArgPro: 3.11 ± 0.304
ArgGln: 2.707 ± 0.34
ArgArg: 5.154 ± 0.53
ArgSer: 3.11 ± 0.29
ArgThr: 3.427 ± 0.398
ArgVal: 5.126 ± 0.382
ArgTrp: 2.16 ± 0.255
ArgTyr: 2.563 ± 0.258
ArgXaa: 0.0 ± 0.0
Ser
SerAla: 5.327 ± 0.472
SerCys: 0.777 ± 0.156
SerAsp: 3.34 ± 0.318
SerGlu: 4.06 ± 0.415
SerPhe: 1.353 ± 0.219
SerGly: 4.982 ± 0.447
SerHis: 1.152 ± 0.197
SerIle: 2.995 ± 0.263
SerLys: 1.872 ± 0.233
SerLeu: 4.118 ± 0.322
SerMet: 1.382 ± 0.181
SerAsn: 2.073 ± 0.284
SerPro: 2.678 ± 0.263
SerGln: 1.181 ± 0.18
SerArg: 3.542 ± 0.307
SerSer: 3.11 ± 0.375
SerThr: 2.995 ± 0.336
SerVal: 3.484 ± 0.356
SerTrp: 1.641 ± 0.208
SerTyr: 1.641 ± 0.216
SerXaa: 0.0 ± 0.0
Thr
ThrAla: 4.579 ± 0.344
ThrCys: 1.123 ± 0.208
ThrAsp: 3.513 ± 0.322
ThrGlu: 2.995 ± 0.286
ThrPhe: 2.016 ± 0.264
ThrGly: 5.183 ± 0.403
ThrHis: 1.209 ± 0.183
ThrIle: 3.081 ± 0.259
ThrLys: 1.843 ± 0.245
ThrLeu: 5.039 ± 0.346
ThrMet: 0.864 ± 0.13
ThrAsn: 1.728 ± 0.269
ThrPro: 4.147 ± 0.401
ThrGln: 1.526 ± 0.208
ThrArg: 2.822 ± 0.245
ThrSer: 3.398 ± 0.401
ThrThr: 2.995 ± 0.359
ThrVal: 4.521 ± 0.431
ThrTrp: 1.613 ± 0.228
ThrTyr: 1.901 ± 0.282
ThrXaa: 0.0 ± 0.0
Val
ValAla: 6.709 ± 0.426
ValCys: 0.72 ± 0.15
ValAsp: 5.558 ± 0.468
ValGlu: 5.471 ± 0.397
ValPhe: 1.699 ± 0.217
ValGly: 5.039 ± 0.449
ValHis: 1.641 ± 0.199
ValIle: 3.916 ± 0.313
ValLys: 2.62 ± 0.281
ValLeu: 4.809 ± 0.404
ValMet: 1.267 ± 0.24
ValAsn: 2.62 ± 0.302
ValPro: 3.571 ± 0.309
ValGln: 2.361 ± 0.26
ValArg: 4.492 ± 0.4
ValSer: 4.118 ± 0.356
ValThr: 4.521 ± 0.39
ValVal: 5.5 ± 0.5
ValTrp: 0.95 ± 0.147
ValTyr: 2.102 ± 0.286
ValXaa: 0.0 ± 0.0
Trp
TrpAla: 1.814 ± 0.189
TrpCys: 0.547 ± 0.14
TrpAsp: 1.584 ± 0.227
TrpGlu: 1.209 ± 0.194
TrpPhe: 0.691 ± 0.149
TrpGly: 1.584 ± 0.201
TrpHis: 0.691 ± 0.134
TrpIle: 1.065 ± 0.173
TrpLys: 0.72 ± 0.133
TrpLeu: 2.102 ± 0.234
TrpMet: 0.518 ± 0.133
TrpAsn: 0.893 ± 0.144
TrpPro: 0.95 ± 0.141
TrpGln: 0.691 ± 0.116
TrpArg: 1.929 ± 0.236
TrpSer: 1.757 ± 0.221
TrpThr: 1.555 ± 0.272
TrpVal: 1.929 ± 0.226
TrpTrp: 0.777 ± 0.144
TrpTyr: 0.777 ± 0.171
TrpXaa: 0.0 ± 0.0
Tyr
TyrAla: 2.995 ± 0.299
TyrCys: 0.662 ± 0.136
TyrAsp: 2.275 ± 0.281
TyrGlu: 1.987 ± 0.194
TyrPhe: 0.777 ± 0.17
TyrGly: 3.052 ± 0.281
TyrHis: 0.547 ± 0.112
TyrIle: 1.209 ± 0.201
TyrLys: 1.181 ± 0.188
TyrLeu: 2.419 ± 0.282
TyrMet: 0.461 ± 0.122
TyrAsn: 0.979 ± 0.166
TyrPro: 1.44 ± 0.2
TyrGln: 1.094 ± 0.192
TyrArg: 2.707 ± 0.297
TyrSer: 1.353 ± 0.173
TyrThr: 1.728 ± 0.196
TyrVal: 2.476 ± 0.297
TyrTrp: 0.691 ± 0.149
TyrTyr: 0.921 ± 0.182
TyrXaa: 0.0 ± 0.0
Xaa
XaaAla: 0.0 ± 0.0
XaaCys: 0.0 ± 0.0
XaaAsp: 0.0 ± 0.0
XaaGlu: 0.0 ± 0.0
XaaPhe: 0.0 ± 0.0
XaaGly: 0.0 ± 0.0
XaaHis: 0.0 ± 0.0
XaaIle: 0.0 ± 0.0
XaaLys: 0.0 ± 0.0
XaaLeu: 0.0 ± 0.0
XaaMet: 0.0 ± 0.0
XaaAsn: 0.0 ± 0.0
XaaPro: 0.0 ± 0.0
XaaGln: 0.0 ± 0.0
XaaArg: 0.0 ± 0.0
XaaSer: 0.0 ± 0.0
XaaThr: 0.0 ± 0.0
XaaVal: 0.0 ± 0.0
XaaTrp: 0.0 ± 0.0
XaaTyr: 0.0 ± 0.0
XaaXaa: 0.0 ± 0.0
Statistics based on 234 proteins (34728 amino acids)

Note: The error was estimated by bootstrapping (×100) at the protein level.
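Bootstrapping at the protein level means that whole proteins, not individual residues, are resampled with replacement, and the statistic is recomputed for each resampled proteome. The following is a minimal sketch under stated assumptions, not the Proteome-pI code: the helper names `pair_permille` and `bootstrap_error` are hypothetical, sequences are assumed to be one-letter strings, and reporting the sample standard deviation over the 100 replicates as the error is an assumption (the page does not say which spread measure it uses).

```python
import random
import statistics
from collections import Counter


def pair_permille(proteins, pair):
    """Frequency of one dipeptide, in per mille of all dipeptides."""
    counts = Counter()
    for seq in proteins:
        counts.update(seq[i:i + 2] for i in range(len(seq) - 1))
    total = sum(counts.values())
    return 1000.0 * counts[pair] / total if total else 0.0


def bootstrap_error(proteins, pair, n_resamples=100, seed=0):
    """Spread of the dipeptide frequency over protein-level resamples."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    estimates = []
    for _ in range(n_resamples):
        # Draw as many proteins as the proteome has, with replacement.
        resample = rng.choices(proteins, k=len(proteins))
        estimates.append(pair_permille(resample, pair))
    # Assumption: the reported error is the sample standard deviation.
    return statistics.stdev(estimates)


# Toy proteome (hypothetical sequences, not from Mycobacterium phage Ejimix).
toy = ["MAAGAA", "MKLV", "MAAK", "MGGAAL", "MKKAA"]
print(pair_permille(toy, "AA"), bootstrap_error(toy, "AA"))
```

Resampling whole proteins keeps the residues of each protein together, so the error reflects variation between proteins rather than treating every dipeptide as an independent observation.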

The dipeptide statistics above (along with other statistics for this proteome) can be downloaded from this CSV file
See this proteome in: uniprot_link
Proteome-pI is available under the Creative Commons Attribution-NoDerivs license; for more details see here

Reference: Kozlowski LP. Proteome-pI 2.0: Proteome Isoelectric Point Database Update. Nucleic Acids Res. 2021, doi: 10.1093/nar/gkab944
Contact: Lukasz P. Kozlowski