Amino acid dipeptide frequency for Mycobacterium phage Zeeculate

Apart from single amino acid frequencies, one can also calculate the so-called amino acid dipeptide frequency, i.e. how often each ordered pair of adjacent amino acids occurs.
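A minimal sketch of how such a frequency could be computed is given here. It assumes one-letter protein sequences (the table uses three-letter codes), counts overlapping pairs within each protein only, and normalises by the total number of pairs. The function name and the toy sequences are hypothetical, and the database's exact normalisation is not stated on this page.

```python
from collections import Counter


def dipeptide_per_mille(proteins):
    """Return {dipeptide: frequency in per mille} for one-letter sequences."""
    counts = Counter()
    for seq in proteins:
        # Overlapping adjacent pairs; a pair never spans two proteins.
        for i in range(len(seq) - 1):
            counts[seq[i:i + 2]] += 1
    total = sum(counts.values())
    if total == 0:
        return {}
    # Assumption: normalise by the total number of counted pairs.
    return {pair: 1000.0 * n / total for pair, n in counts.items()}


# Hypothetical toy input, not taken from the proteome.
toy_proteins = ["MAAGL", "AAK"]
toy_freqs = dipeptide_per_mille(toy_proteins)
for pair in sorted(toy_freqs):
    print(pair, round(toy_freqs[pair], 1))
```

With this normalisation the frequencies of all observed dipeptides sum to 1000‰.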

There are 400 possible dipeptides (441 if Xaa is counted as an additional, 21st amino acid representing all non-standard or unknown amino acids). Thus, if dipeptides occurred in proteins at random with equal probability, each would be present at a frequency of 0.25% (2.5‰). This is not the case in nature, so for better visibility dipeptides that are more frequent than expected are marked in red in the table, and underrepresented ones are marked in blue.
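The comparison against the uniform expectation can be sketched as follows. The page does not state the exact rule used for the red and blue marking, so this assumes a plain comparison with 1/400 = 2.5‰; the names `classify` and `sample` are hypothetical, and the four values are copied from the table.

```python
N_STANDARD = 20
# Uniform expectation in per mille: 1000 / 400 = 2.5 (i.e. 0.25%).
EXPECTED_PER_MILLE = 1000.0 / (N_STANDARD ** 2)


def classify(observed, expected=EXPECTED_PER_MILLE):
    """Label a per-mille frequency relative to the uniform expectation."""
    if observed > expected:
        return "over"   # would be red under the assumed rule
    if observed < expected:
        return "under"  # would be blue under the assumed rule
    return "as expected"


# Values in per mille taken from the table.
sample = {"AlaAla": 13.01, "CysCys": 0.0, "GlyGly": 9.191, "TrpTrp": 0.537}
for name, value in sample.items():
    ratio = value / EXPECTED_PER_MILLE
    print(f"{name}: {value} per mille, {ratio:.2f}x expected, {classify(value)}")
```

If Xaa is included as a 21st residue, the same calculation gives 1000 / 441, roughly 2.27‰.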

All values are given in per mille (‰), so they need to be multiplied by 10⁻³ to obtain fractions.

For more information see sequence space article on Wikipedia.

Ala Cys Asp Glu Phe Gly His Ile Lys Leu Met Asn Pro Gln Arg Ser Thr Val Trp Tyr Xaa
Ala
AlaAla: 13.01 ± 1.127
AlaCys: 0.955 ± 0.188
AlaAsp: 6.983 ± 0.688
AlaGlu: 5.789 ± 0.666
AlaPhe: 2.865 ± 0.37
AlaGly: 8.296 ± 0.836
AlaHis: 1.492 ± 0.316
AlaIle: 3.581 ± 0.44
AlaLys: 4.118 ± 0.592
AlaLeu: 8.773 ± 0.862
AlaMet: 2.686 ± 0.506
AlaAsn: 2.745 ± 0.318
AlaPro: 4.416 ± 0.764
AlaGln: 2.865 ± 0.469
AlaArg: 5.968 ± 0.493
AlaSer: 5.073 ± 0.616
AlaThr: 5.132 ± 0.622
AlaVal: 7.818 ± 0.646
AlaTrp: 1.969 ± 0.359
AlaTyr: 3.044 ± 0.418
AlaXaa: 0.0 ± 0.0
Cys
CysAla: 0.836 ± 0.237
CysCys: 0.0 ± 0.0
CysAsp: 0.836 ± 0.183
CysGlu: 0.656 ± 0.167
CysPhe: 0.119 ± 0.076
CysGly: 0.597 ± 0.179
CysHis: 0.06 ± 0.072
CysIle: 0.418 ± 0.178
CysLys: 0.358 ± 0.162
CysLeu: 0.358 ± 0.15
CysMet: 0.119 ± 0.086
CysAsn: 0.358 ± 0.139
CysPro: 0.239 ± 0.123
CysGln: 0.179 ± 0.118
CysArg: 0.597 ± 0.209
CysSer: 0.477 ± 0.166
CysThr: 0.537 ± 0.19
CysVal: 0.298 ± 0.133
CysTrp: 0.358 ± 0.132
CysTyr: 0.239 ± 0.117
CysXaa: 0.0 ± 0.0
Asp
AspAla: 6.386 ± 0.611
AspCys: 0.597 ± 0.16
AspAsp: 4.834 ± 0.542
AspGlu: 4.178 ± 0.616
AspPhe: 2.507 ± 0.325
AspGly: 6.207 ± 0.632
AspHis: 1.074 ± 0.252
AspIle: 2.626 ± 0.402
AspLys: 2.566 ± 0.434
AspLeu: 7.102 ± 0.645
AspMet: 1.313 ± 0.211
AspAsn: 2.268 ± 0.331
AspPro: 4.476 ± 0.579
AspGln: 1.432 ± 0.322
AspArg: 4.058 ± 0.43
AspSer: 3.342 ± 0.543
AspThr: 3.521 ± 0.358
AspVal: 5.073 ± 0.506
AspTrp: 1.552 ± 0.277
AspTyr: 2.148 ± 0.334
AspXaa: 0.0 ± 0.0
Glu
GluAla: 6.445 ± 0.673
GluCys: 0.418 ± 0.213
GluAsp: 5.013 ± 0.5
GluGlu: 5.073 ± 0.563
GluPhe: 2.328 ± 0.388
GluGly: 4.178 ± 0.462
GluHis: 1.492 ± 0.31
GluIle: 3.879 ± 0.413
GluLys: 2.566 ± 0.386
GluLeu: 7.042 ± 0.505
GluMet: 1.552 ± 0.258
GluAsn: 1.611 ± 0.311
GluPro: 2.745 ± 0.35
GluGln: 2.745 ± 0.416
GluArg: 3.879 ± 0.515
GluSer: 3.223 ± 0.432
GluThr: 3.342 ± 0.488
GluVal: 5.132 ± 0.615
GluTrp: 1.552 ± 0.354
GluTyr: 2.626 ± 0.417
GluXaa: 0.0 ± 0.0
Phe
PheAla: 2.148 ± 0.278
PheCys: 0.418 ± 0.163
PheAsp: 2.865 ± 0.358
PheGlu: 2.208 ± 0.297
PhePhe: 0.418 ± 0.133
PheGly: 3.879 ± 0.464
PheHis: 0.836 ± 0.277
PheIle: 1.492 ± 0.286
PheLys: 1.194 ± 0.285
PheLeu: 2.148 ± 0.396
PheMet: 0.537 ± 0.167
PheAsn: 1.015 ± 0.267
PhePro: 1.91 ± 0.319
PheGln: 0.955 ± 0.208
PheArg: 1.969 ± 0.354
PheSer: 1.432 ± 0.287
PheThr: 2.148 ± 0.358
PheVal: 1.91 ± 0.358
PheTrp: 0.597 ± 0.185
PheTyr: 1.015 ± 0.273
PheXaa: 0.0 ± 0.0
Gly
GlyAla: 6.863 ± 0.882
GlyCys: 0.776 ± 0.216
GlyAsp: 6.147 ± 0.579
GlyGlu: 5.073 ± 0.58
GlyPhe: 3.044 ± 0.462
GlyGly: 9.191 ± 2.112
GlyHis: 1.79 ± 0.366
GlyIle: 4.357 ± 0.599
GlyLys: 4.058 ± 0.456
GlyLeu: 7.997 ± 0.787
GlyMet: 1.91 ± 0.352
GlyAsn: 3.521 ± 0.494
GlyPro: 4.118 ± 0.646
GlyGln: 2.745 ± 0.295
GlyArg: 5.013 ± 0.596
GlySer: 6.147 ± 0.845
GlyThr: 5.371 ± 0.602
GlyVal: 5.55 ± 0.62
GlyTrp: 2.507 ± 0.349
GlyTyr: 2.865 ± 0.458
GlyXaa: 0.0 ± 0.0
His
HisAla: 1.731 ± 0.307
HisCys: 0.119 ± 0.093
HisAsp: 1.015 ± 0.204
HisGlu: 1.313 ± 0.283
HisPhe: 0.716 ± 0.187
HisGly: 1.611 ± 0.397
HisHis: 0.477 ± 0.154
HisIle: 1.134 ± 0.214
HisLys: 0.955 ± 0.289
HisLeu: 1.492 ± 0.339
HisMet: 0.239 ± 0.123
HisAsn: 0.418 ± 0.145
HisPro: 1.492 ± 0.277
HisGln: 0.776 ± 0.195
HisArg: 1.79 ± 0.329
HisSer: 0.597 ± 0.159
HisThr: 1.253 ± 0.312
HisVal: 1.671 ± 0.319
HisTrp: 0.477 ± 0.158
HisTyr: 0.597 ± 0.216
HisXaa: 0.0 ± 0.0
Ile
IleAla: 6.028 ± 0.633
IleCys: 0.239 ± 0.11
IleAsp: 3.581 ± 0.347
IleGlu: 3.581 ± 0.447
IlePhe: 0.895 ± 0.244
IleGly: 3.939 ± 0.533
IleHis: 1.015 ± 0.252
IleIle: 1.432 ± 0.279
IleLys: 1.671 ± 0.353
IleLeu: 2.865 ± 0.362
IleMet: 0.716 ± 0.185
IleAsn: 2.268 ± 0.308
IlePro: 3.103 ± 0.361
IleGln: 1.134 ± 0.297
IleArg: 3.521 ± 0.473
IleSer: 3.103 ± 0.466
IleThr: 3.76 ± 0.442
IleVal: 2.507 ± 0.504
IleTrp: 0.776 ± 0.198
IleTyr: 1.432 ± 0.258
IleXaa: 0.0 ± 0.0
Lys
LysAla: 3.282 ± 0.468
LysCys: 0.119 ± 0.08
LysAsp: 2.507 ± 0.529
LysGlu: 2.208 ± 0.354
LysPhe: 1.552 ± 0.278
LysGly: 2.686 ± 0.381
LysHis: 1.373 ± 0.282
LysIle: 2.805 ± 0.529
LysLys: 2.208 ± 0.415
LysLeu: 3.044 ± 0.458
LysMet: 0.955 ± 0.221
LysAsn: 1.492 ± 0.244
LysPro: 2.865 ± 0.513
LysGln: 1.373 ± 0.35
LysArg: 2.924 ± 0.471
LysSer: 2.387 ± 0.432
LysThr: 2.686 ± 0.381
LysVal: 3.223 ± 0.487
LysTrp: 0.895 ± 0.245
LysTyr: 0.716 ± 0.199
LysXaa: 0.0 ± 0.0
Leu
LeuAla: 9.25 ± 0.825
LeuCys: 0.418 ± 0.134
LeuAsp: 6.266 ± 0.547
LeuGlu: 5.431 ± 0.576
LeuPhe: 2.328 ± 0.383
LeuGly: 7.818 ± 0.746
LeuHis: 1.432 ± 0.302
LeuIle: 4.118 ± 0.453
LeuLys: 3.879 ± 0.484
LeuLeu: 5.371 ± 0.637
LeuMet: 1.91 ± 0.242
LeuAsn: 3.163 ± 0.44
LeuPro: 5.789 ± 0.606
LeuGln: 2.566 ± 0.442
LeuArg: 6.087 ± 0.489
LeuSer: 5.491 ± 0.572
LeuThr: 6.028 ± 0.554
LeuVal: 4.357 ± 0.546
LeuTrp: 0.955 ± 0.302
LeuTyr: 2.328 ± 0.427
LeuXaa: 0.0 ± 0.0
Met
MetAla: 2.387 ± 0.352
MetCys: 0.0 ± 0.0
MetAsp: 0.955 ± 0.182
MetGlu: 1.552 ± 0.274
MetPhe: 0.477 ± 0.179
MetGly: 1.671 ± 0.273
MetHis: 0.179 ± 0.095
MetIle: 0.955 ± 0.232
MetLys: 1.074 ± 0.246
MetLeu: 1.611 ± 0.286
MetMet: 0.179 ± 0.096
MetAsn: 1.253 ± 0.242
MetPro: 0.955 ± 0.221
MetGln: 0.597 ± 0.178
MetArg: 1.313 ± 0.29
MetSer: 1.731 ± 0.333
MetThr: 2.089 ± 0.252
MetVal: 1.313 ± 0.353
MetTrp: 0.239 ± 0.095
MetTyr: 0.418 ± 0.156
MetXaa: 0.0 ± 0.0
Asn
AsnAla: 3.044 ± 0.503
AsnCys: 0.119 ± 0.08
AsnAsp: 2.268 ± 0.375
AsnGlu: 1.85 ± 0.303
AsnPhe: 0.895 ± 0.245
AsnGly: 3.581 ± 0.479
AsnHis: 0.716 ± 0.195
AsnIle: 1.492 ± 0.299
AsnLys: 0.955 ± 0.291
AsnLeu: 2.447 ± 0.363
AsnMet: 0.776 ± 0.201
AsnAsn: 0.895 ± 0.196
AsnPro: 2.924 ± 0.345
AsnGln: 1.313 ± 0.365
AsnArg: 1.313 ± 0.313
AsnSer: 1.91 ± 0.41
AsnThr: 2.686 ± 0.323
AsnVal: 2.328 ± 0.391
AsnTrp: 0.776 ± 0.159
AsnTyr: 1.253 ± 0.268
AsnXaa: 0.0 ± 0.0
Pro
ProAla: 5.61 ± 0.516
ProCys: 0.477 ± 0.189
ProAsp: 4.536 ± 0.478
ProGlu: 4.237 ± 0.517
ProPhe: 2.328 ± 0.345
ProGly: 5.073 ± 0.608
ProHis: 1.015 ± 0.211
ProIle: 1.969 ± 0.384
ProLys: 2.208 ± 0.268
ProLeu: 4.834 ± 0.506
ProMet: 0.836 ± 0.22
ProAsn: 1.552 ± 0.296
ProPro: 2.924 ± 0.411
ProGln: 1.611 ± 0.365
ProArg: 2.805 ± 0.416
ProSer: 3.581 ± 0.452
ProThr: 3.939 ± 0.504
ProVal: 3.879 ± 0.503
ProTrp: 0.656 ± 0.245
ProTyr: 1.611 ± 0.295
ProXaa: 0.0 ± 0.0
Gln
GlnAla: 3.521 ± 0.715
GlnCys: 0.06 ± 0.065
GlnAsp: 1.313 ± 0.254
GlnGlu: 1.492 ± 0.292
GlnPhe: 1.015 ± 0.246
GlnGly: 2.745 ± 0.358
GlnHis: 0.477 ± 0.135
GlnIle: 2.328 ± 0.401
GlnLys: 1.074 ± 0.38
GlnLeu: 3.76 ± 0.437
GlnMet: 0.776 ± 0.206
GlnAsn: 0.776 ± 0.228
GlnPro: 1.969 ± 0.304
GlnGln: 1.611 ± 0.326
GlnArg: 1.79 ± 0.339
GlnSer: 1.313 ± 0.235
GlnThr: 1.671 ± 0.285
GlnVal: 2.387 ± 0.378
GlnTrp: 0.836 ± 0.213
GlnTyr: 0.597 ± 0.182
GlnXaa: 0.0 ± 0.0
Arg
ArgAla: 5.132 ± 0.684
ArgCys: 0.955 ± 0.281
ArgAsp: 2.686 ± 0.401
ArgGlu: 5.013 ± 0.653
ArgPhe: 1.85 ± 0.371
ArgGly: 5.55 ± 0.555
ArgHis: 1.313 ± 0.302
ArgIle: 2.984 ± 0.421
ArgLys: 3.223 ± 0.457
ArgLeu: 5.61 ± 0.656
ArgMet: 2.328 ± 0.36
ArgAsn: 2.566 ± 0.436
ArgPro: 2.148 ± 0.314
ArgGln: 1.671 ± 0.297
ArgArg: 5.849 ± 0.678
ArgSer: 3.879 ± 0.497
ArgThr: 3.282 ± 0.475
ArgVal: 5.312 ± 0.638
ArgTrp: 1.253 ± 0.303
ArgTyr: 1.91 ± 0.357
ArgXaa: 0.0 ± 0.0
Ser
SerAla: 5.789 ± 0.596
SerCys: 0.477 ± 0.183
SerAsp: 3.402 ± 0.449
SerGlu: 3.879 ± 0.431
SerPhe: 1.432 ± 0.341
SerGly: 6.326 ± 0.831
SerHis: 1.432 ± 0.285
SerIle: 2.447 ± 0.42
SerLys: 2.268 ± 0.365
SerLeu: 5.192 ± 0.635
SerMet: 1.194 ± 0.239
SerAsn: 1.85 ± 0.399
SerPro: 3.103 ± 0.437
SerGln: 1.671 ± 0.249
SerArg: 2.924 ± 0.382
SerSer: 2.984 ± 0.64
SerThr: 3.342 ± 0.565
SerVal: 3.64 ± 0.412
SerTrp: 1.194 ± 0.276
SerTyr: 1.253 ± 0.293
SerXaa: 0.0 ± 0.0
Thr
ThrAla: 6.147 ± 0.717
ThrCys: 0.477 ± 0.171
ThrAsp: 4.118 ± 0.485
ThrGlu: 4.595 ± 0.464
ThrPhe: 2.268 ± 0.338
ThrGly: 6.684 ± 0.557
ThrHis: 1.492 ± 0.341
ThrIle: 2.686 ± 0.472
ThrLys: 2.566 ± 0.367
ThrLeu: 5.61 ± 0.57
ThrMet: 0.776 ± 0.181
ThrAsn: 1.671 ± 0.329
ThrPro: 3.999 ± 0.481
ThrGln: 1.91 ± 0.333
ThrArg: 3.342 ± 0.484
ThrSer: 3.402 ± 0.464
ThrThr: 4.416 ± 0.584
ThrVal: 5.968 ± 0.58
ThrTrp: 1.194 ± 0.259
ThrTyr: 1.611 ± 0.297
ThrXaa: 0.0 ± 0.0
Val
ValAla: 6.207 ± 0.704
ValCys: 0.477 ± 0.152
ValAsp: 5.192 ± 0.554
ValGlu: 5.312 ± 0.536
ValPhe: 2.447 ± 0.341
ValGly: 4.894 ± 0.631
ValHis: 1.194 ± 0.218
ValIle: 3.939 ± 0.497
ValLys: 2.984 ± 0.404
ValLeu: 5.431 ± 0.51
ValMet: 1.313 ± 0.376
ValAsn: 2.328 ± 0.368
ValPro: 4.416 ± 0.437
ValGln: 2.029 ± 0.313
ValArg: 5.491 ± 0.645
ValSer: 3.999 ± 0.506
ValThr: 5.55 ± 0.477
ValVal: 5.252 ± 0.68
ValTrp: 1.074 ± 0.248
ValTyr: 2.029 ± 0.402
ValXaa: 0.0 ± 0.0
Trp
TrpAla: 1.373 ± 0.285
TrpCys: 0.298 ± 0.118
TrpAsp: 1.432 ± 0.231
TrpGlu: 1.015 ± 0.21
TrpPhe: 0.955 ± 0.21
TrpGly: 1.79 ± 0.312
TrpHis: 0.418 ± 0.158
TrpIle: 1.253 ± 0.242
TrpLys: 0.298 ± 0.149
TrpLeu: 1.85 ± 0.287
TrpMet: 0.298 ± 0.148
TrpAsn: 0.597 ± 0.163
TrpPro: 0.836 ± 0.25
TrpGln: 0.836 ± 0.212
TrpArg: 1.253 ± 0.287
TrpSer: 0.597 ± 0.205
TrpThr: 1.91 ± 0.331
TrpVal: 1.85 ± 0.304
TrpTrp: 0.537 ± 0.201
TrpTyr: 0.358 ± 0.159
TrpXaa: 0.0 ± 0.0
Tyr
TyrAla: 2.447 ± 0.365
TyrCys: 0.239 ± 0.161
TyrAsp: 1.313 ± 0.241
TyrGlu: 2.387 ± 0.316
TyrPhe: 0.656 ± 0.184
TyrGly: 2.328 ± 0.358
TyrHis: 0.597 ± 0.179
TyrIle: 1.671 ± 0.335
TyrLys: 1.074 ± 0.234
TyrLeu: 2.447 ± 0.369
TyrMet: 0.477 ± 0.156
TyrAsn: 1.194 ± 0.272
TyrPro: 1.253 ± 0.236
TyrGln: 1.492 ± 0.304
TyrArg: 2.626 ± 0.392
TyrSer: 1.134 ± 0.232
TyrThr: 2.268 ± 0.353
TyrVal: 2.089 ± 0.35
TyrTrp: 0.358 ± 0.123
TyrTyr: 0.537 ± 0.161
TyrXaa: 0.0 ± 0.0
Xaa
XaaAla: 0.0 ± 0.0
XaaCys: 0.0 ± 0.0
XaaAsp: 0.0 ± 0.0
XaaGlu: 0.0 ± 0.0
XaaPhe: 0.0 ± 0.0
XaaGly: 0.0 ± 0.0
XaaHis: 0.0 ± 0.0
XaaIle: 0.0 ± 0.0
XaaLys: 0.0 ± 0.0
XaaLeu: 0.0 ± 0.0
XaaMet: 0.0 ± 0.0
XaaAsn: 0.0 ± 0.0
XaaPro: 0.0 ± 0.0
XaaGln: 0.0 ± 0.0
XaaArg: 0.0 ± 0.0
XaaSer: 0.0 ± 0.0
XaaThr: 0.0 ± 0.0
XaaVal: 0.0 ± 0.0
XaaTrp: 0.0 ± 0.0
XaaTyr: 0.0 ± 0.0
XaaXaa: 0.0 ± 0.0
Statistics based on 91 proteins (16757 amino acids)

Note: The error has been estimated by bootstrapping (×100) at the protein level.
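A protein-level bootstrap of this kind could look roughly like the sketch below. The page only states that 100 resamples were taken at the protein level, so everything else here is an assumption: whole proteins are resampled with replacement, the frequency is recomputed for each resample, and the sample standard deviation of those estimates is reported as the error. The function names and toy sequences are hypothetical.

```python
import random


def pair_per_mille(proteins, pair):
    """Per-mille frequency of one dipeptide over all within-protein pairs."""
    total = 0
    hits = 0
    for seq in proteins:
        for i in range(len(seq) - 1):
            total += 1
            if seq[i:i + 2] == pair:
                hits += 1
    return 1000.0 * hits / total if total else 0.0


def bootstrap_error(proteins, pair, n_resamples=100, seed=0):
    """Standard deviation of the frequency over protein-level resamples."""
    rng = random.Random(seed)
    estimates = []
    for _ in range(n_resamples):
        # Resample whole proteins with replacement, keeping the proteome size.
        resample = rng.choices(proteins, k=len(proteins))
        estimates.append(pair_per_mille(resample, pair))
    mean = sum(estimates) / len(estimates)
    variance = sum((x - mean) ** 2 for x in estimates) / (len(estimates) - 1)
    return variance ** 0.5


# Hypothetical toy proteome, not taken from the database.
toy_proteome = ["MAAGL", "AAK", "GGLK", "MKAAL"]
toy_error = bootstrap_error(toy_proteome, "AA")
print(round(pair_per_mille(toy_proteome, "AA"), 1), "+/-", round(toy_error, 1))
```

Resampling proteins rather than individual residues keeps each protein's internal pair structure intact, which is one plausible reason for bootstrapping at that level.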

The dipeptide statistics above (along with other statistics for this proteome) can be downloaded from this CSV file
See this proteome in: uniprot_link
Proteome-pI is available under Creative Commons Attribution-NoDerivs license, for more details see here

Reference: Kozlowski LP. Proteome-pI 2.0: Proteome Isoelectric Point Database Update. Nucleic Acids Res. 2021, doi: 10.1093/nar/gkab944
Contact: Lukasz P. Kozlowski