Amino acid dipeptide frequency for Mycobacterium phage Bobby

Apart from single amino acid frequencies, one can also calculate the so-called amino acid dipeptide frequency, i.e. how often each ordered pair of adjacent amino acids occurs.

There are 400 possible dipeptides (441 if Xaa is counted as an additional, 21st amino acid representing all non-standard or unknown amino acids). Thus, if the 400 standard dipeptides occurred uniformly at random in proteins, each would be present at a frequency of 0.25% (2.5‰). As this is not the case in nature, for better visibility all dipeptides that are more frequent than expected are marked in red, and those that are underrepresented are marked in blue in the table.

All values are presented in per mille (‰) and therefore need to be multiplied by 10^-3 to obtain fractions.
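As an illustration of the calculation described above, here is a minimal Python sketch that counts overlapping adjacent pairs within each protein and reports them in per mille. It is not the code used to build this page: the function name `dipeptide_permille`, the toy sequences, the mapping of every non-standard letter to X, and the choice to normalise by the total number of counted pairs are all assumptions made for the example.

```python
from collections import Counter
from itertools import product

STANDARD = "ACDEFGHIKLMNPQRSTVWY"  # 20 standard residues, one-letter codes
ALPHABET = STANDARD + "X"          # X stands in for Xaa (assumption)


def dipeptide_permille(proteins):
    """Return {dipeptide: frequency in per mille} for all 441 ordered pairs.

    Pairs are counted with an overlapping window inside each protein and
    never span two proteins. Normalising by the total number of pairs is
    an assumption of this sketch.
    """
    counts = Counter()
    for seq in proteins:
        # Map anything outside the 20 standard letters to X.
        cleaned = "".join(c if c in STANDARD else "X" for c in seq.upper())
        counts.update(a + b for a, b in zip(cleaned, cleaned[1:]))
    total = sum(counts.values())
    keys = [a + b for a, b in product(ALPHABET, repeat=2)]
    if total == 0:
        return {k: 0.0 for k in keys}
    return {k: 1000.0 * counts[k] / total for k in keys}


# Hypothetical toy input, not taken from the proteome on this page.
toy = ["AAC", "ACA"]
freqs = dipeptide_permille(toy)

# Uniform expectation over the 400 standard dipeptides, in per mille.
uniform_expectation = 1000.0 / 400

# Per mille value and the corresponding fraction (value * 10^-3).
print(freqs["AC"], freqs["AC"] / 1000.0)
```

A dipeptide whose value lies above `uniform_expectation` would count as overrepresented in the sense used for the table colouring, and one below it as underrepresented.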

For more information, see the sequence space article on Wikipedia.

Ala Cys Asp Glu Phe Gly His Ile Lys Leu Met Asn Pro Gln Arg Ser Thr Val Trp Tyr Xaa
Ala
12.192AlaAla: 12.192 ± 1.188
1.593AlaCys: 1.593 ± 0.265
6.805AlaAsp: 6.805 ± 0.535
7.964AlaGlu: 7.964 ± 0.554
3.301AlaPhe: 3.301 ± 0.353
6.689AlaGly: 6.689 ± 0.529
2.201AlaHis: 2.201 ± 0.271
4.981AlaIle: 4.981 ± 0.37
4.141AlaLys: 4.141 ± 0.381
8.977AlaLeu: 8.977 ± 0.47
3.359AlaMet: 3.359 ± 0.339
3.041AlaAsn: 3.041 ± 0.354
3.533AlaPro: 3.533 ± 0.336
2.983AlaGln: 2.983 ± 0.351
6.747AlaArg: 6.747 ± 0.564
4.865AlaSer: 4.865 ± 0.363
5.705AlaThr: 5.705 ± 0.378
6.284AlaVal: 6.284 ± 0.43
1.911AlaTrp: 1.911 ± 0.27
2.722AlaTyr: 2.722 ± 0.293
0.0AlaXaa: 0.0 ± 0.0
Cys
1.043CysAla: 1.043 ± 0.192
0.087CysCys: 0.087 ± 0.048
1.361CysAsp: 1.361 ± 0.275
0.985CysGlu: 0.985 ± 0.189
0.463CysPhe: 0.463 ± 0.135
1.94CysGly: 1.94 ± 0.275
0.463CysHis: 0.463 ± 0.094
0.55CysIle: 0.55 ± 0.121
0.927CysLys: 0.927 ± 0.197
1.216CysLeu: 1.216 ± 0.228
0.116CysMet: 0.116 ± 0.052
0.753CysAsn: 0.753 ± 0.136
0.666CysPro: 0.666 ± 0.16
0.492CysGln: 0.492 ± 0.118
0.869CysArg: 0.869 ± 0.23
0.724CysSer: 0.724 ± 0.153
0.811CysThr: 0.811 ± 0.171
0.782CysVal: 0.782 ± 0.166
0.348CysTrp: 0.348 ± 0.105
0.348CysTyr: 0.348 ± 0.112
0.0CysXaa: 0.0 ± 0.0
Asp
6.429AspAla: 6.429 ± 0.481
1.014AspCys: 1.014 ± 0.18
4.199AspAsp: 4.199 ± 0.366
5.299AspGlu: 5.299 ± 0.52
2.056AspPhe: 2.056 ± 0.224
6.863AspGly: 6.863 ± 0.514
1.68AspHis: 1.68 ± 0.221
3.272AspIle: 3.272 ± 0.314
2.404AspLys: 2.404 ± 0.309
5.56AspLeu: 5.56 ± 0.354
1.274AspMet: 1.274 ± 0.167
2.172AspAsn: 2.172 ± 0.282
4.025AspPro: 4.025 ± 0.34
1.853AspGln: 1.853 ± 0.228
4.489AspArg: 4.489 ± 0.329
2.635AspSer: 2.635 ± 0.26
3.128AspThr: 3.128 ± 0.321
3.996AspVal: 3.996 ± 0.397
1.94AspTrp: 1.94 ± 0.222
2.635AspTyr: 2.635 ± 0.276
0.0AspXaa: 0.0 ± 0.0
Glu
7.298GluAla: 7.298 ± 0.472
1.274GluCys: 1.274 ± 0.227
4.112GluAsp: 4.112 ± 0.424
4.952GluGlu: 4.952 ± 0.466
3.041GluPhe: 3.041 ± 0.316
4.489GluGly: 4.489 ± 0.329
1.593GluHis: 1.593 ± 0.211
4.17GluIle: 4.17 ± 0.337
2.78GluLys: 2.78 ± 0.316
6.66GluLeu: 6.66 ± 0.569
2.114GluMet: 2.114 ± 0.27
1.68GluAsn: 1.68 ± 0.203
3.359GluPro: 3.359 ± 0.373
2.404GluGln: 2.404 ± 0.272
4.807GluArg: 4.807 ± 0.378
3.359GluSer: 3.359 ± 0.312
3.417GluThr: 3.417 ± 0.298
4.286GluVal: 4.286 ± 0.36
1.332GluTrp: 1.332 ± 0.222
2.606GluTyr: 2.606 ± 0.33
0.0GluXaa: 0.0 ± 0.0
Phe
2.635PheAla: 2.635 ± 0.219
0.463PheCys: 0.463 ± 0.11
2.027PheAsp: 2.027 ± 0.245
2.23PheGlu: 2.23 ± 0.265
0.985PhePhe: 0.985 ± 0.173
2.954PheGly: 2.954 ± 0.42
0.724PheHis: 0.724 ± 0.153
1.39PheIle: 1.39 ± 0.2
1.187PheLys: 1.187 ± 0.191
2.288PheLeu: 2.288 ± 0.289
0.724PheMet: 0.724 ± 0.155
1.332PheAsn: 1.332 ± 0.218
1.853PhePro: 1.853 ± 0.24
0.84PheGln: 0.84 ± 0.15
2.056PheArg: 2.056 ± 0.245
1.882PheSer: 1.882 ± 0.249
2.143PheThr: 2.143 ± 0.254
2.577PheVal: 2.577 ± 0.232
0.405PheTrp: 0.405 ± 0.11
0.898PheTyr: 0.898 ± 0.159
0.0PheXaa: 0.0 ± 0.0
Gly
6.776GlyAla: 6.776 ± 0.625
1.564GlyCys: 1.564 ± 0.25
5.531GlyAsp: 5.531 ± 0.427
5.357GlyGlu: 5.357 ± 0.419
3.214GlyPhe: 3.214 ± 0.368
7.587GlyGly: 7.587 ± 1.545
2.085GlyHis: 2.085 ± 0.246
4.025GlyIle: 4.025 ± 0.408
4.141GlyLys: 4.141 ± 0.368
6.805GlyLeu: 6.805 ± 0.509
1.998GlyMet: 1.998 ± 0.302
3.099GlyAsn: 3.099 ± 0.318
3.996GlyPro: 3.996 ± 0.445
2.288GlyGln: 2.288 ± 0.316
5.357GlyArg: 5.357 ± 0.335
4.807GlySer: 4.807 ± 0.38
4.952GlyThr: 4.952 ± 0.439
5.705GlyVal: 5.705 ± 0.375
2.027GlyTrp: 2.027 ± 0.221
3.128GlyTyr: 3.128 ± 0.319
0.0GlyXaa: 0.0 ± 0.0
His
2.027HisAla: 2.027 ± 0.18
0.376HisCys: 0.376 ± 0.101
1.593HisAsp: 1.593 ± 0.23
1.68HisGlu: 1.68 ± 0.239
0.579HisPhe: 0.579 ± 0.122
2.259HisGly: 2.259 ± 0.276
1.043HisHis: 1.043 ± 0.156
0.753HisIle: 0.753 ± 0.147
0.782HisLys: 0.782 ± 0.139
2.461HisLeu: 2.461 ± 0.313
0.434HisMet: 0.434 ± 0.111
0.666HisAsn: 0.666 ± 0.133
1.564HisPro: 1.564 ± 0.211
0.869HisGln: 0.869 ± 0.143
1.94HisArg: 1.94 ± 0.234
0.666HisSer: 0.666 ± 0.153
0.898HisThr: 0.898 ± 0.161
1.274HisVal: 1.274 ± 0.206
0.753HisTrp: 0.753 ± 0.137
0.637HisTyr: 0.637 ± 0.14
0.0HisXaa: 0.0 ± 0.0
Ile
5.56IleAla: 5.56 ± 0.504
0.579IleCys: 0.579 ± 0.121
3.794IleAsp: 3.794 ± 0.383
4.315IleGlu: 4.315 ± 0.351
1.216IlePhe: 1.216 ± 0.206
4.17IleGly: 4.17 ± 0.472
1.158IleHis: 1.158 ± 0.171
1.766IleIle: 1.766 ± 0.238
1.68IleLys: 1.68 ± 0.206
3.678IleLeu: 3.678 ± 0.33
0.782IleMet: 0.782 ± 0.147
1.651IleAsn: 1.651 ± 0.229
2.693IlePro: 2.693 ± 0.241
1.564IleGln: 1.564 ± 0.2
3.301IleArg: 3.301 ± 0.336
2.056IleSer: 2.056 ± 0.245
2.983IleThr: 2.983 ± 0.281
3.156IleVal: 3.156 ± 0.322
0.608IleTrp: 0.608 ± 0.115
1.274IleTyr: 1.274 ± 0.193
0.0IleXaa: 0.0 ± 0.0
Lys
4.112LysAla: 4.112 ± 0.343
0.637LysCys: 0.637 ± 0.134
2.317LysAsp: 2.317 ± 0.312
1.969LysGlu: 1.969 ± 0.266
1.216LysPhe: 1.216 ± 0.193
3.012LysGly: 3.012 ± 0.273
0.869LysHis: 0.869 ± 0.162
1.593LysIle: 1.593 ± 0.227
1.998LysLys: 1.998 ± 0.279
3.504LysLeu: 3.504 ± 0.342
1.39LysMet: 1.39 ± 0.196
0.84LysAsn: 0.84 ± 0.149
2.433LysPro: 2.433 ± 0.287
1.129LysGln: 1.129 ± 0.178
3.214LysArg: 3.214 ± 0.367
1.969LysSer: 1.969 ± 0.272
1.998LysThr: 1.998 ± 0.247
3.33LysVal: 3.33 ± 0.291
1.129LysTrp: 1.129 ± 0.166
1.361LysTyr: 1.361 ± 0.216
0.0LysXaa: 0.0 ± 0.0
Leu
9.441LeuAla: 9.441 ± 0.49
0.956LeuCys: 0.956 ± 0.164
5.531LeuAsp: 5.531 ± 0.32
5.126LeuGlu: 5.126 ± 0.382
2.404LeuPhe: 2.404 ± 0.279
6.168LeuGly: 6.168 ± 0.495
1.998LeuHis: 1.998 ± 0.201
3.07LeuIle: 3.07 ± 0.31
3.243LeuLys: 3.243 ± 0.279
6.023LeuLeu: 6.023 ± 0.419
1.738LeuMet: 1.738 ± 0.23
3.359LeuAsn: 3.359 ± 0.306
4.807LeuPro: 4.807 ± 0.382
2.954LeuGln: 2.954 ± 0.316
5.879LeuArg: 5.879 ± 0.537
5.299LeuSer: 5.299 ± 0.361
5.502LeuThr: 5.502 ± 0.374
4.604LeuVal: 4.604 ± 0.414
1.1LeuTrp: 1.1 ± 0.174
2.201LeuTyr: 2.201 ± 0.296
0.0LeuXaa: 0.0 ± 0.0
Met
2.259MetAla: 2.259 ± 0.261
0.145MetCys: 0.145 ± 0.074
1.593MetAsp: 1.593 ± 0.188
1.129MetGlu: 1.129 ± 0.154
0.666MetPhe: 0.666 ± 0.141
1.824MetGly: 1.824 ± 0.241
0.492MetHis: 0.492 ± 0.117
1.129MetIle: 1.129 ± 0.164
0.985MetLys: 0.985 ± 0.165
1.303MetLeu: 1.303 ± 0.234
0.579MetMet: 0.579 ± 0.14
1.187MetAsn: 1.187 ± 0.185
1.043MetPro: 1.043 ± 0.203
0.55MetGln: 0.55 ± 0.11
1.274MetArg: 1.274 ± 0.176
2.375MetSer: 2.375 ± 0.266
2.49MetThr: 2.49 ± 0.282
1.129MetVal: 1.129 ± 0.198
0.579MetTrp: 0.579 ± 0.127
0.434MetTyr: 0.434 ± 0.117
0.0MetXaa: 0.0 ± 0.0
Asn
3.417AsnAla: 3.417 ± 0.369
0.203AsnCys: 0.203 ± 0.077
2.085AsnAsp: 2.085 ± 0.222
1.882AsnGlu: 1.882 ± 0.215
1.245AsnPhe: 1.245 ± 0.176
3.707AsnGly: 3.707 ± 0.31
0.869AsnHis: 0.869 ± 0.137
1.1AsnIle: 1.1 ± 0.199
1.158AsnLys: 1.158 ± 0.216
2.896AsnLeu: 2.896 ± 0.318
0.521AsnMet: 0.521 ± 0.143
0.811AsnAsn: 0.811 ± 0.146
2.722AsnPro: 2.722 ± 0.312
0.956AsnGln: 0.956 ± 0.249
2.809AsnArg: 2.809 ± 0.297
1.593AsnSer: 1.593 ± 0.221
1.477AsnThr: 1.477 ± 0.225
2.172AsnVal: 2.172 ± 0.248
0.898AsnTrp: 0.898 ± 0.188
0.898AsnTyr: 0.898 ± 0.177
0.0AsnXaa: 0.0 ± 0.0
Pro
5.097ProAla: 5.097 ± 0.393
0.811ProCys: 0.811 ± 0.165
3.765ProAsp: 3.765 ± 0.314
4.518ProGlu: 4.518 ± 0.395
1.853ProPhe: 1.853 ± 0.248
5.763ProGly: 5.763 ± 0.707
0.869ProHis: 0.869 ± 0.166
2.23ProIle: 2.23 ± 0.247
2.114ProLys: 2.114 ± 0.258
4.054ProLeu: 4.054 ± 0.34
0.898ProMet: 0.898 ± 0.184
1.969ProAsn: 1.969 ± 0.259
2.809ProPro: 2.809 ± 0.302
1.187ProGln: 1.187 ± 0.198
3.099ProArg: 3.099 ± 0.291
2.577ProSer: 2.577 ± 0.243
2.838ProThr: 2.838 ± 0.305
3.852ProVal: 3.852 ± 0.32
1.187ProTrp: 1.187 ± 0.18
1.39ProTyr: 1.39 ± 0.191
0.0ProXaa: 0.0 ± 0.0
Gln
3.243GlnAla: 3.243 ± 0.454
0.434GlnCys: 0.434 ± 0.129
1.303GlnAsp: 1.303 ± 0.172
2.027GlnGlu: 2.027 ± 0.287
1.1GlnPhe: 1.1 ± 0.157
2.577GlnGly: 2.577 ± 0.358
0.521GlnHis: 0.521 ± 0.137
2.056GlnIle: 2.056 ± 0.242
1.622GlnLys: 1.622 ± 0.307
2.606GlnLeu: 2.606 ± 0.285
0.956GlnMet: 0.956 ± 0.158
0.927GlnAsn: 0.927 ± 0.208
1.911GlnPro: 1.911 ± 0.241
1.506GlnGln: 1.506 ± 0.278
1.969GlnArg: 1.969 ± 0.219
1.882GlnSer: 1.882 ± 0.192
1.303GlnThr: 1.303 ± 0.223
2.404GlnVal: 2.404 ± 0.356
0.782GlnTrp: 0.782 ± 0.152
0.637GlnTyr: 0.637 ± 0.112
0.0GlnXaa: 0.0 ± 0.0
Arg
6.718ArgAla: 6.718 ± 0.522
1.187ArgCys: 1.187 ± 0.21
4.633ArgAsp: 4.633 ± 0.351
4.662ArgGlu: 4.662 ± 0.412
1.969ArgPhe: 1.969 ± 0.225
5.242ArgGly: 5.242 ± 0.441
1.766ArgHis: 1.766 ± 0.228
3.823ArgIle: 3.823 ± 0.314
2.983ArgLys: 2.983 ± 0.322
5.01ArgLeu: 5.01 ± 0.401
1.824ArgMet: 1.824 ± 0.254
1.853ArgAsn: 1.853 ± 0.237
3.099ArgPro: 3.099 ± 0.283
3.128ArgGln: 3.128 ± 0.31
5.357ArgArg: 5.357 ± 0.5
2.838ArgSer: 2.838 ± 0.275
2.983ArgThr: 2.983 ± 0.32
4.662ArgVal: 4.662 ± 0.41
2.375ArgTrp: 2.375 ± 0.261
2.433ArgTyr: 2.433 ± 0.264
0.0ArgXaa: 0.0 ± 0.0
Ser
4.923SerAla: 4.923 ± 0.407
0.898SerCys: 0.898 ± 0.177
3.62SerAsp: 3.62 ± 0.377
3.649SerGlu: 3.649 ± 0.312
1.477SerPhe: 1.477 ± 0.237
5.27SerGly: 5.27 ± 0.564
1.274SerHis: 1.274 ± 0.173
2.693SerIle: 2.693 ± 0.262
1.795SerLys: 1.795 ± 0.205
4.518SerLeu: 4.518 ± 0.265
1.477SerMet: 1.477 ± 0.172
1.882SerAsn: 1.882 ± 0.258
2.838SerPro: 2.838 ± 0.296
1.303SerGln: 1.303 ± 0.202
3.359SerArg: 3.359 ± 0.315
3.272SerSer: 3.272 ± 0.358
2.896SerThr: 2.896 ± 0.27
3.446SerVal: 3.446 ± 0.328
1.564SerTrp: 1.564 ± 0.245
1.419SerTyr: 1.419 ± 0.206
0.0SerXaa: 0.0 ± 0.0
Thr
4.778ThrAla: 4.778 ± 0.429
0.869ThrCys: 0.869 ± 0.215
3.272ThrAsp: 3.272 ± 0.317
3.446ThrGlu: 3.446 ± 0.311
1.795ThrPhe: 1.795 ± 0.204
4.691ThrGly: 4.691 ± 0.386
1.1ThrHis: 1.1 ± 0.18
3.301ThrIle: 3.301 ± 0.373
1.882ThrLys: 1.882 ± 0.263
4.865ThrLeu: 4.865 ± 0.334
0.724ThrMet: 0.724 ± 0.135
1.709ThrAsn: 1.709 ± 0.231
3.996ThrPro: 3.996 ± 0.349
1.535ThrGln: 1.535 ± 0.212
3.533ThrArg: 3.533 ± 0.317
3.185ThrSer: 3.185 ± 0.492
2.606ThrThr: 2.606 ± 0.377
4.72ThrVal: 4.72 ± 0.43
1.506ThrTrp: 1.506 ± 0.244
1.795ThrTyr: 1.795 ± 0.238
0.0ThrXaa: 0.0 ± 0.0
Val
6.747ValAla: 6.747 ± 0.454
0.724ValCys: 0.724 ± 0.165
5.705ValAsp: 5.705 ± 0.517
5.184ValGlu: 5.184 ± 0.345
1.593ValPhe: 1.593 ± 0.182
4.691ValGly: 4.691 ± 0.42
1.361ValHis: 1.361 ± 0.173
3.88ValIle: 3.88 ± 0.312
2.461ValLys: 2.461 ± 0.311
4.489ValLeu: 4.489 ± 0.449
1.245ValMet: 1.245 ± 0.173
2.838ValAsn: 2.838 ± 0.289
3.591ValPro: 3.591 ± 0.35
2.143ValGln: 2.143 ± 0.266
4.025ValArg: 4.025 ± 0.356
4.373ValSer: 4.373 ± 0.375
4.17ValThr: 4.17 ± 0.404
5.734ValVal: 5.734 ± 0.508
1.216ValTrp: 1.216 ± 0.191
1.738ValTyr: 1.738 ± 0.265
0.0ValXaa: 0.0 ± 0.0
Trp
2.085TrpAla: 2.085 ± 0.268
0.55TrpCys: 0.55 ± 0.144
1.39TrpAsp: 1.39 ± 0.231
1.303TrpGlu: 1.303 ± 0.184
0.579TrpPhe: 0.579 ± 0.143
1.506TrpGly: 1.506 ± 0.179
0.666TrpHis: 0.666 ± 0.129
1.1TrpIle: 1.1 ± 0.17
0.695TrpLys: 0.695 ± 0.155
2.056TrpLeu: 2.056 ± 0.29
0.666TrpMet: 0.666 ± 0.129
0.869TrpAsn: 0.869 ± 0.161
0.869TrpPro: 0.869 ± 0.166
0.927TrpGln: 0.927 ± 0.178
1.68TrpArg: 1.68 ± 0.222
1.622TrpSer: 1.622 ± 0.217
1.448TrpThr: 1.448 ± 0.211
1.68TrpVal: 1.68 ± 0.243
0.724TrpTrp: 0.724 ± 0.157
0.666TrpTyr: 0.666 ± 0.134
0.0TrpXaa: 0.0 ± 0.0
Tyr
3.243TyrAla: 3.243 ± 0.339
0.579TyrCys: 0.579 ± 0.126
2.288TyrAsp: 2.288 ± 0.275
2.056TyrGlu: 2.056 ± 0.267
0.84TyrPhe: 0.84 ± 0.137
3.041TyrGly: 3.041 ± 0.315
0.521TyrHis: 0.521 ± 0.118
1.1TyrIle: 1.1 ± 0.175
1.043TyrLys: 1.043 ± 0.194
2.433TyrLeu: 2.433 ± 0.267
0.319TyrMet: 0.319 ± 0.074
0.782TyrAsn: 0.782 ± 0.131
1.071TyrPro: 1.071 ± 0.198
1.158TyrGln: 1.158 ± 0.201
2.78TyrArg: 2.78 ± 0.277
1.593TyrSer: 1.593 ± 0.196
1.564TyrThr: 1.564 ± 0.218
2.114TyrVal: 2.114 ± 0.263
0.695TyrTrp: 0.695 ± 0.17
0.84TyrTyr: 0.84 ± 0.177
0.0TyrXaa: 0.0 ± 0.0
Xaa
0.0XaaAla: 0.0 ± 0.0
0.0XaaCys: 0.0 ± 0.0
0.0XaaAsp: 0.0 ± 0.0
0.0XaaGlu: 0.0 ± 0.0
0.0XaaPhe: 0.0 ± 0.0
0.0XaaGly: 0.0 ± 0.0
0.0XaaHis: 0.0 ± 0.0
0.0XaaIle: 0.0 ± 0.0
0.0XaaLys: 0.0 ± 0.0
0.0XaaLeu: 0.0 ± 0.0
0.0XaaMet: 0.0 ± 0.0
0.0XaaAsn: 0.0 ± 0.0
0.0XaaPro: 0.0 ± 0.0
0.0XaaGln: 0.0 ± 0.0
0.0XaaArg: 0.0 ± 0.0
0.0XaaSer: 0.0 ± 0.0
0.0XaaThr: 0.0 ± 0.0
0.0XaaVal: 0.0 ± 0.0
0.0XaaTrp: 0.0 ± 0.0
0.0XaaTyr: 0.0 ± 0.0
0.0XaaXaa: 0.0 ± 0.0
Statistics based on 220 proteins (34533 amino acids)

Note: The error has been estimated by bootstrapping (100 resamples) at the protein level.
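Protein-level bootstrapping can be sketched as follows: whole proteins are resampled with replacement, the dipeptide frequency is recomputed for each resample, and the spread of the resulting estimates is taken as the error. This is only an illustration under stated assumptions. The function names, the fixed seed, the toy sequences, and the use of the sample standard deviation as the error measure are not taken from the source, which states only that 100 bootstrap rounds were run at the protein level.

```python
import random
import statistics
from collections import Counter


def pair_permille(proteins, pair):
    """Frequency of one dipeptide in per mille (overlapping pairs per protein)."""
    counts = Counter(a + b for seq in proteins for a, b in zip(seq, seq[1:]))
    total = sum(counts.values())
    return 1000.0 * counts[pair] / total if total else 0.0


def bootstrap_error(proteins, pair, n_resamples=100, seed=0):
    """Standard deviation of the frequency over protein-level resamples.

    Each resample draws len(proteins) whole proteins with replacement.
    Using the standard deviation as the error is an assumption.
    """
    rng = random.Random(seed)
    estimates = []
    for _ in range(n_resamples):
        sample = rng.choices(proteins, k=len(proteins))
        estimates.append(pair_permille(sample, pair))
    return statistics.stdev(estimates)


# Hypothetical toy proteome, not the data behind the table.
toy = ["AAC", "ACA", "CCC", "ACAC"]
print(pair_permille(toy, "AC"), "±", bootstrap_error(toy, "AC"))
```

Resampling whole proteins rather than individual residues keeps the adjacent pairs inside each protein intact, which is presumably why the error is estimated at the protein level.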

The dipeptide statistics above (along with other statistics for this proteome) can be downloaded from this CSV file
See this proteome in: uniprot_link
Proteome-pI is available under the Creative Commons Attribution-NoDerivs license; for more details see here

Reference: Kozlowski LP. Proteome-pI 2.0: Proteome Isoelectric Point Database Update. Nucleic Acids Res. 2021, doi: 10.1093/nar/gkab944
Contact: Lukasz P. Kozlowski