Amino acid dipeptide frequency for Mycobacterium phage LastJedi

Apart from single amino acid frequencies, one can also calculate the so-called amino acid dipeptide frequency, i.e. how often each ordered pair of adjacent amino acids occurs.

There are 400 possible dipeptides (441 if Xaa is counted as an additional 21st amino acid representing all non-standard or unknown amino acids). Thus, if dipeptides occurred randomly in proteins, each amino acid dipeptide would be present with a frequency of 0.25% (2.5‰). Since this is not the case in nature, dipeptides that occur more often than expected are marked in red in the table, and those that are underrepresented are marked in blue, for better visibility.
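The counting described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the method used by the database: the function names `dipeptide_permille` and `classify` are hypothetical, sequences are assumed to be given in one-letter code, pairs are counted as overlapping windows, and the counts are normalised by the total number of dipeptide windows (the database may normalise differently).

```python
from collections import Counter
from itertools import product

STANDARD = "ACDEFGHIKLMNPQRSTVWY"   # 20 standard residues, one-letter codes
ALPHABET = STANDARD + "X"           # X stands for Xaa (non-standard or unknown)

# Uniform expectation from the text: 1/400 = 0.25 % = 2.5 per mille.
UNIFORM_PERMILLE = 1000.0 / 400


def dipeptide_permille(proteins):
    """Return {dipeptide: frequency in per mille} over overlapping pairs.

    Assumption: frequencies are normalised by the total number of
    dipeptide windows found in all proteins.
    """
    counts = Counter()
    for seq in proteins:
        # Map every residue outside the 20 standard ones to X (Xaa).
        seq = "".join(c if c in STANDARD else "X" for c in seq.upper())
        counts.update(seq[i:i + 2] for i in range(len(seq) - 1))
    total = sum(counts.values())
    return {
        a + b: (1000.0 * counts[a + b] / total if total else 0.0)
        for a, b in product(ALPHABET, repeat=2)
    }


def classify(freq_permille, expected=UNIFORM_PERMILLE):
    """Label a frequency relative to the uniform expectation."""
    if freq_permille > expected:
        return "over"    # would be marked red
    if freq_permille < expected:
        return "under"   # would be marked blue
    return "as expected"


if __name__ == "__main__":
    freqs = dipeptide_permille(["MAAGL", "MKAAR"])  # toy sequences, not real data
    print(freqs["AA"], classify(freqs["AA"]))
```

Comparing against the flat 2.5‰ baseline mirrors the text; a stricter analysis would compare each pair with the product of its two single amino acid frequencies, which is a different technique and is not shown here.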

All values are given in per mille (‰), so they need to be multiplied by 10⁻³ to obtain fractions.
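As a small worked example of the unit handling, the sketch below converts a per mille value to a fraction and to a percentage, and compares it with the uniform expectation of 0.25% (2.5‰). The helper names are hypothetical; the input value 14.568 is the AlaAla entry from the table.

```python
UNIFORM_PERMILLE = 2.5  # 0.25 % expressed in per mille (1/400)


def permille_to_fraction(value_permille):
    """Per mille -> plain fraction (multiply by 10^-3)."""
    return value_permille * 1e-3


def permille_to_percent(value_permille):
    """Per mille -> percent (divide by 10)."""
    return value_permille / 10.0


def fold_over_uniform(value_permille):
    """Ratio of an observed frequency to the uniform 2.5 per mille baseline."""
    return value_permille / UNIFORM_PERMILLE


if __name__ == "__main__":
    ala_ala = 14.568  # AlaAla value from the table, in per mille
    print(permille_to_fraction(ala_ala))  # fraction of all dipeptides
    print(permille_to_percent(ala_ala))   # same value in percent
    print(fold_over_uniform(ala_ala))     # enrichment relative to 2.5 per mille
```

For AlaAla this gives a fraction of about 0.0146, that is roughly 5.8 times the uniform expectation.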

For more information, see the sequence space article on Wikipedia.

Rows: first amino acid; columns: second amino acid. Each cell: frequency ± error (‰).

    | Ala | Cys | Asp | Glu | Phe | Gly | His | Ile | Lys | Leu | Met | Asn | Pro | Gln | Arg | Ser | Thr | Val | Trp | Tyr | Xaa
Ala | 14.568 ± 1.633 | 1.138 ± 0.272 | 7.625 ± 0.623 | 6.544 ± 0.572 | 2.561 ± 0.365 | 8.763 ± 1.04 | 2.504 ± 0.371 | 4.666 ± 0.571 | 4.04 ± 0.461 | 8.536 ± 0.762 | 1.935 ± 0.319 | 3.357 ± 0.443 | 4.951 ± 0.609 | 3.471 ± 0.425 | 8.137 ± 0.771 | 4.723 ± 0.522 | 6.373 ± 0.646 | 6.772 ± 0.58 | 2.504 ± 0.414 | 2.276 ± 0.335 | 0.0 ± 0.0
Cys | 1.252 ± 0.299 | 0.0 ± 0.0 | 1.423 ± 0.339 | 0.797 ± 0.235 | 0.171 ± 0.095 | 1.423 ± 0.272 | 0.398 ± 0.14 | 0.057 ± 0.056 | 0.683 ± 0.178 | 0.854 ± 0.25 | 0.171 ± 0.081 | 0.569 ± 0.148 | 1.252 ± 0.328 | 0.228 ± 0.119 | 0.683 ± 0.216 | 0.683 ± 0.224 | 0.854 ± 0.218 | 0.683 ± 0.181 | 0.285 ± 0.104 | 0.285 ± 0.123 | 0.0 ± 0.0
Asp | 7.284 ± 0.563 | 1.195 ± 0.294 | 5.235 ± 0.607 | 4.154 ± 0.44 | 1.593 ± 0.248 | 7.056 ± 0.64 | 1.252 ± 0.265 | 2.447 ± 0.332 | 1.764 ± 0.281 | 6.487 ± 0.475 | 1.423 ± 0.302 | 1.48 ± 0.305 | 5.065 ± 0.457 | 2.561 ± 0.298 | 5.121 ± 0.552 | 3.414 ± 0.5 | 4.325 ± 0.487 | 4.268 ± 0.539 | 1.593 ± 0.247 | 1.707 ± 0.323 | 0.0 ± 0.0
Glu | 6.658 ± 0.73 | 0.854 ± 0.253 | 3.301 ± 0.421 | 3.073 ± 0.526 | 2.162 ± 0.404 | 3.244 ± 0.434 | 1.764 ± 0.39 | 2.39 ± 0.413 | 2.39 ± 0.34 | 5.975 ± 0.681 | 1.48 ± 0.306 | 1.48 ± 0.271 | 2.959 ± 0.545 | 3.244 ± 0.364 | 4.894 ± 0.613 | 3.187 ± 0.529 | 4.325 ± 0.669 | 4.268 ± 0.493 | 1.65 ± 0.278 | 1.65 ± 0.301 | 0.0 ± 0.0
Phe | 2.731 ± 0.347 | 0.228 ± 0.105 | 3.357 ± 0.618 | 1.366 ± 0.282 | 0.683 ± 0.189 | 2.788 ± 0.637 | 0.569 ± 0.228 | 1.309 ± 0.376 | 0.797 ± 0.232 | 1.878 ± 0.258 | 0.967 ± 0.222 | 0.967 ± 0.275 | 1.366 ± 0.295 | 0.91 ± 0.32 | 1.821 ± 0.355 | 1.423 ± 0.26 | 2.333 ± 0.346 | 2.504 ± 0.334 | 0.569 ± 0.161 | 0.797 ± 0.24 | 0.0 ± 0.0
Gly | 7.91 ± 1.162 | 1.195 ± 0.282 | 5.52 ± 0.492 | 4.894 ± 0.542 | 2.333 ± 0.354 | 7.967 ± 1.381 | 2.276 ± 0.394 | 3.813 ± 0.512 | 2.447 ± 0.316 | 5.52 ± 0.538 | 2.561 ± 0.551 | 2.731 ± 0.409 | 3.983 ± 0.641 | 2.561 ± 0.571 | 6.26 ± 0.838 | 4.78 ± 0.662 | 5.349 ± 0.647 | 6.373 ± 0.628 | 1.878 ± 0.357 | 2.106 ± 0.406 | 0.0 ± 0.0
His | 2.106 ± 0.324 | 0.171 ± 0.106 | 1.195 ± 0.21 | 1.423 ± 0.315 | 0.512 ± 0.157 | 1.48 ± 0.331 | 1.024 ± 0.297 | 1.707 ± 0.356 | 1.024 ± 0.283 | 1.593 ± 0.254 | 0.569 ± 0.162 | 1.138 ± 0.263 | 1.536 ± 0.271 | 0.74 ± 0.176 | 2.219 ± 0.365 | 0.797 ± 0.2 | 1.48 ± 0.318 | 1.593 ± 0.336 | 0.398 ± 0.121 | 0.854 ± 0.22 | 0.0 ± 0.0
Ile | 5.008 ± 0.641 | 0.683 ± 0.228 | 3.699 ± 0.549 | 3.87 ± 0.443 | 0.854 ± 0.224 | 3.813 ± 0.446 | 1.252 ± 0.307 | 1.252 ± 0.256 | 1.195 ± 0.293 | 1.992 ± 0.35 | 0.455 ± 0.114 | 1.423 ± 0.291 | 2.675 ± 0.347 | 1.821 ± 0.337 | 2.788 ± 0.473 | 1.935 ± 0.401 | 3.813 ± 0.428 | 3.187 ± 0.402 | 1.138 ± 0.323 | 0.797 ± 0.18 | 0.0 ± 0.0
Lys | 3.813 ± 0.494 | 0.569 ± 0.179 | 2.447 ± 0.298 | 1.423 ± 0.261 | 1.366 ± 0.237 | 2.788 ± 0.374 | 1.138 ± 0.284 | 1.081 ± 0.24 | 1.309 ± 0.328 | 2.845 ± 0.487 | 0.569 ± 0.168 | 1.024 ± 0.226 | 2.276 ± 0.372 | 1.309 ± 0.269 | 2.219 ± 0.394 | 2.049 ± 0.273 | 2.049 ± 0.361 | 2.219 ± 0.314 | 0.569 ± 0.158 | 1.024 ± 0.268 | 0.0 ± 0.0
Leu | 8.081 ± 0.747 | 0.854 ± 0.248 | 5.178 ± 0.578 | 4.382 ± 0.494 | 2.276 ± 0.417 | 5.918 ± 0.642 | 1.081 ± 0.226 | 3.414 ± 0.403 | 2.049 ± 0.357 | 4.951 ± 0.643 | 1.65 ± 0.317 | 2.845 ± 0.411 | 5.349 ± 0.639 | 3.073 ± 0.388 | 5.178 ± 0.732 | 5.121 ± 0.556 | 6.146 ± 0.517 | 5.292 ± 0.571 | 1.309 ± 0.287 | 2.276 ± 0.369 | 0.0 ± 0.0
Met | 2.731 ± 0.437 | 0.228 ± 0.183 | 1.48 ± 0.31 | 0.91 ± 0.241 | 0.512 ± 0.186 | 1.423 ± 0.278 | 0.228 ± 0.114 | 0.967 ± 0.195 | 0.569 ± 0.182 | 2.106 ± 0.322 | 0.626 ± 0.284 | 0.74 ± 0.178 | 1.309 ± 0.236 | 0.455 ± 0.143 | 1.423 ± 0.316 | 2.788 ± 0.402 | 2.447 ± 0.338 | 1.366 ± 0.325 | 0.228 ± 0.113 | 0.398 ± 0.153 | 0.0 ± 0.0
Asn | 3.244 ± 0.361 | 0.398 ± 0.152 | 1.366 ± 0.258 | 1.48 ± 0.344 | 0.74 ± 0.261 | 3.016 ± 0.514 | 0.91 ± 0.2 | 1.024 ± 0.313 | 0.74 ± 0.229 | 3.016 ± 0.43 | 0.91 ± 0.186 | 1.821 ± 0.355 | 2.788 ± 0.314 | 1.138 ± 0.322 | 2.049 ± 0.31 | 1.309 ± 0.291 | 1.764 ± 0.265 | 2.162 ± 0.377 | 0.683 ± 0.178 | 0.797 ± 0.182 | 0.0 ± 0.0
Pro | 5.349 ± 0.603 | 0.854 ± 0.204 | 4.496 ± 0.616 | 4.496 ± 0.566 | 2.276 ± 0.376 | 6.317 ± 0.689 | 1.764 ± 0.362 | 1.764 ± 0.288 | 2.504 ± 0.466 | 4.552 ± 0.512 | 1.593 ± 0.39 | 2.162 ± 0.32 | 4.097 ± 0.486 | 2.561 ± 0.374 | 2.902 ± 0.557 | 2.845 ± 0.307 | 3.187 ± 0.404 | 4.78 ± 0.511 | 1.252 ± 0.291 | 1.195 ± 0.268 | 0.0 ± 0.0
Gln | 4.325 ± 0.498 | 0.341 ± 0.143 | 1.707 ± 0.296 | 1.935 ± 0.315 | 1.195 ± 0.227 | 2.675 ± 0.468 | 1.081 ± 0.231 | 1.992 ± 0.325 | 1.309 ± 0.26 | 3.073 ± 0.411 | 0.398 ± 0.161 | 0.74 ± 0.234 | 2.618 ± 0.358 | 1.081 ± 0.218 | 2.675 ± 0.473 | 1.821 ± 0.323 | 1.764 ± 0.349 | 2.902 ± 0.423 | 0.797 ± 0.226 | 0.854 ± 0.234 | 0.0 ± 0.0
Arg | 7.17 ± 0.612 | 1.252 ± 0.346 | 5.349 ± 0.566 | 4.78 ± 0.604 | 2.731 ± 0.427 | 4.496 ± 0.594 | 1.48 ± 0.36 | 4.78 ± 0.555 | 2.845 ± 0.463 | 5.178 ± 0.649 | 2.049 ± 0.391 | 2.333 ± 0.395 | 3.414 ± 0.388 | 1.935 ± 0.305 | 6.544 ± 0.729 | 3.414 ± 0.447 | 3.301 ± 0.526 | 4.78 ± 0.797 | 2.333 ± 0.411 | 1.821 ± 0.293 | 0.0 ± 0.0
Ser | 5.634 ± 0.728 | 0.455 ± 0.226 | 3.699 ± 0.494 | 2.845 ± 0.385 | 2.219 ± 0.387 | 4.723 ± 0.527 | 0.967 ± 0.211 | 2.219 ± 0.323 | 2.276 ± 0.409 | 3.983 ± 0.451 | 1.593 ± 0.251 | 1.821 ± 0.245 | 3.642 ± 0.355 | 1.593 ± 0.3 | 2.959 ± 0.44 | 2.959 ± 0.49 | 3.13 ± 0.467 | 4.552 ± 0.505 | 1.024 ± 0.244 | 1.48 ± 0.254 | 0.0 ± 0.0
Thr | 6.715 ± 0.695 | 0.512 ± 0.19 | 4.382 ± 0.539 | 4.04 ± 0.391 | 1.764 ± 0.335 | 6.203 ± 0.592 | 1.821 ± 0.348 | 3.585 ± 0.469 | 2.049 ± 0.354 | 4.723 ± 0.66 | 1.423 ± 0.312 | 1.536 ± 0.325 | 5.008 ± 0.648 | 2.049 ± 0.3 | 3.756 ± 0.498 | 3.756 ± 0.485 | 4.666 ± 0.661 | 5.121 ± 0.661 | 0.967 ± 0.23 | 1.764 ± 0.247 | 0.0 ± 0.0
Val | 6.601 ± 0.633 | 1.024 ± 0.216 | 4.951 ± 0.507 | 5.292 ± 0.585 | 1.935 ± 0.361 | 5.406 ± 0.478 | 1.252 ± 0.294 | 3.187 ± 0.493 | 2.675 ± 0.355 | 5.292 ± 0.575 | 1.65 ± 0.313 | 1.764 ± 0.372 | 4.04 ± 0.479 | 2.675 ± 0.334 | 5.634 ± 0.665 | 4.78 ± 0.661 | 4.894 ± 0.526 | 6.146 ± 0.638 | 1.707 ± 0.269 | 1.764 ± 0.295 | 0.0 ± 0.0
Trp | 1.935 ± 0.311 | 0.398 ± 0.145 | 1.309 ± 0.278 | 1.081 ± 0.335 | 0.854 ± 0.189 | 1.138 ± 0.214 | 0.569 ± 0.169 | 0.91 ± 0.225 | 0.74 ± 0.17 | 1.878 ± 0.355 | 0.626 ± 0.198 | 0.626 ± 0.146 | 0.91 ± 0.265 | 1.024 ± 0.251 | 2.618 ± 0.421 | 1.195 ± 0.254 | 1.821 ± 0.31 | 1.536 ± 0.339 | 1.024 ± 0.245 | 0.341 ± 0.147 | 0.0 ± 0.0
Tyr | 2.333 ± 0.392 | 0.398 ± 0.125 | 1.707 ± 0.424 | 2.162 ± 0.296 | 0.626 ± 0.202 | 1.821 ± 0.268 | 0.228 ± 0.103 | 1.081 ± 0.201 | 0.683 ± 0.187 | 1.992 ± 0.428 | 0.228 ± 0.104 | 0.797 ± 0.19 | 1.707 ± 0.228 | 0.91 ± 0.218 | 2.106 ± 0.364 | 0.797 ± 0.227 | 1.821 ± 0.274 | 2.162 ± 0.271 | 0.626 ± 0.172 | 0.398 ± 0.126 | 0.0 ± 0.0
Xaa | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0

Statistics based on 94 proteins (17574 amino acids)

Note: The error has been estimated by bootstrapping (×100) at the protein level.
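A protein-level bootstrap of this kind can be sketched as follows. This is a minimal illustration, not the database's actual procedure: the function names are hypothetical, whole proteins are resampled with replacement, and the spread of the 100 resampled estimates is summarised by the sample standard deviation, which is an assumption (the source does not say which statistic is reported after the ± sign).

```python
import random
import statistics


def pair_permille(proteins, dipeptide):
    """Frequency of one dipeptide (per mille) over overlapping windows."""
    hits = 0
    windows = 0
    for seq in proteins:
        for i in range(len(seq) - 1):
            windows += 1
            if seq[i:i + 2] == dipeptide:
                hits += 1
    return 1000.0 * hits / windows if windows else 0.0


def bootstrap_error(proteins, dipeptide, n_resamples=100, seed=0):
    """Estimate the error of a dipeptide frequency by resampling proteins.

    Each resample draws len(proteins) whole proteins with replacement,
    so the protein (not the residue) is the unit of resampling.
    """
    rng = random.Random(seed)  # fixed seed keeps the sketch reproducible
    estimates = []
    for _ in range(n_resamples):
        sample = rng.choices(proteins, k=len(proteins))
        estimates.append(pair_permille(sample, dipeptide))
    # Assumption: the reported error is the standard deviation of the estimates.
    return statistics.stdev(estimates)


if __name__ == "__main__":
    toy = ["MAAGL", "MKAAR", "MSTVW", "MAAAA"]  # toy sequences, not real data
    print(pair_permille(toy, "AA"), bootstrap_error(toy, "AA"))
```

Resampling whole proteins rather than individual dipeptides keeps the dependence between neighbouring residues within a protein intact, which is presumably why the error is estimated at the protein level.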

The above dipeptide statistics (among other statistics for this proteome) can be downloaded from this CSV file
See this proteome in: uniprot_link
Proteome-pI is available under the Creative Commons Attribution-NoDerivs license; for more details see here

Reference: Kozlowski LP. Proteome-pI 2.0: Proteome Isoelectric Point Database Update. Nucleic Acids Res. 2021, doi: 10.1093/nar/gkab944
Contact: Lukasz P. Kozlowski