Amino acid dipeptide frequency for Mycobacterium phage GageAP

Apart from single amino acid frequencies, one can also calculate the so-called amino acid dipeptide frequency, i.e. how often each ordered pair of adjacent residues occurs.

There are 400 possible dipeptides (441 if Xaa is counted as an additional 21st amino acid representing all non-standard or unknown amino acids). Thus, if dipeptides occurred randomly in proteins, with all 20 amino acids equally likely, each dipeptide would be present at a frequency of 0.25% (2.5‰). As this is not the case in nature, for better visibility the dipeptides that are more frequent than expected are marked in red, and those that are underrepresented are marked in blue in the table.

All values are given in per mille (‰) and therefore need to be multiplied by 10⁻³ to obtain fractions.
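As an illustration of the calculation described above, the following is a minimal Python sketch that counts adjacent residue pairs within each protein and reports them in per mille, together with the uniform expectation of 2.5‰. The function names (dipeptide_per_mille, classify) are hypothetical, and normalizing by the total number of counted dipeptides is an assumption; the database may normalize differently.

```python
from collections import Counter

# The 20 standard amino acids (one-letter codes); anything else maps to X (Xaa).
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

# Expected per-mille frequency if all 400 standard dipeptides were equally likely.
UNIFORM_PER_MILLE = 1000.0 / len(AMINO_ACIDS) ** 2  # 2.5


def dipeptide_per_mille(proteins):
    """Return {dipeptide: frequency in per mille} for a list of sequences.

    Assumption: pairs are counted within each protein only (no pair spans
    two proteins) and normalized by the total number of pairs counted.
    """
    counts = Counter()
    for seq in proteins:
        # Replace non-standard or unknown residues with X before pairing.
        cleaned = "".join(c if c in AMINO_ACIDS else "X" for c in seq.upper())
        for first, second in zip(cleaned, cleaned[1:]):
            counts[first + second] += 1
    total = sum(counts.values())
    if total == 0:
        return {}
    return {pair: 1000.0 * n / total for pair, n in counts.items()}


def classify(per_mille):
    """Label a value relative to the uniform expectation (red/blue in the table)."""
    if per_mille > UNIFORM_PER_MILLE:
        return "over"
    if per_mille < UNIFORM_PER_MILLE:
        return "under"
    return "expected"


if __name__ == "__main__":
    toy = ["AAC", "AC"]  # toy input, not real phage proteins
    for pair, value in sorted(dipeptide_per_mille(toy).items()):
        print(pair, round(value, 3), classify(value))
```

For the values in the table, classify(12.976) gives "over" for AlaAla and classify(0.121) gives "under" for CysCys.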

For more information, see the sequence space article on Wikipedia.

Ala Cys Asp Glu Phe Gly His Ile Lys Leu Met Asn Pro Gln Arg Ser Thr Val Trp Tyr Xaa
Ala
12.976AlaAla: 12.976 ± 1.63
0.728AlaCys: 0.728 ± 0.236
6.549AlaAsp: 6.549 ± 0.661
6.306AlaGlu: 6.306 ± 0.767
3.274AlaPhe: 3.274 ± 0.502
7.337AlaGly: 7.337 ± 0.789
2.062AlaHis: 2.062 ± 0.417
4.608AlaIle: 4.608 ± 0.671
3.82AlaLys: 3.82 ± 0.441
8.974AlaLeu: 8.974 ± 0.842
2.789AlaMet: 2.789 ± 0.392
2.971AlaAsn: 2.971 ± 0.521
4.911AlaPro: 4.911 ± 0.674
3.153AlaGln: 3.153 ± 0.477
6.064AlaArg: 6.064 ± 0.58
4.73AlaSer: 4.73 ± 0.569
5.76AlaThr: 5.76 ± 0.659
9.338AlaVal: 9.338 ± 0.917
1.819AlaTrp: 1.819 ± 0.395
2.85AlaTyr: 2.85 ± 0.329
0.0AlaXaa: 0.0 ± 0.0
Cys
0.667CysAla: 0.667 ± 0.192
0.121CysCys: 0.121 ± 0.084
0.364CysAsp: 0.364 ± 0.147
0.788CysGlu: 0.788 ± 0.22
0.182CysPhe: 0.182 ± 0.1
0.606CysGly: 0.606 ± 0.195
0.121CysHis: 0.121 ± 0.079
0.303CysIle: 0.303 ± 0.148
0.485CysLys: 0.485 ± 0.172
0.485CysLeu: 0.485 ± 0.177
0.182CysMet: 0.182 ± 0.108
0.243CysAsn: 0.243 ± 0.111
0.424CysPro: 0.424 ± 0.196
0.243CysGln: 0.243 ± 0.1
0.606CysArg: 0.606 ± 0.182
0.243CysSer: 0.243 ± 0.118
0.424CysThr: 0.424 ± 0.174
0.424CysVal: 0.424 ± 0.174
0.243CysTrp: 0.243 ± 0.131
0.182CysTyr: 0.182 ± 0.131
0.0CysXaa: 0.0 ± 0.0
Asp
6.064AspAla: 6.064 ± 0.632
0.546AspCys: 0.546 ± 0.162
4.487AspAsp: 4.487 ± 0.573
3.396AspGlu: 3.396 ± 0.499
2.304AspPhe: 2.304 ± 0.398
5.639AspGly: 5.639 ± 0.67
1.213AspHis: 1.213 ± 0.284
2.729AspIle: 2.729 ± 0.483
2.547AspLys: 2.547 ± 0.465
6.609AspLeu: 6.609 ± 0.542
1.334AspMet: 1.334 ± 0.234
1.88AspAsn: 1.88 ± 0.324
4.608AspPro: 4.608 ± 0.595
1.637AspGln: 1.637 ± 0.403
3.396AspArg: 3.396 ± 0.415
3.456AspSer: 3.456 ± 0.526
4.002AspThr: 4.002 ± 0.453
4.002AspVal: 4.002 ± 0.591
1.758AspTrp: 1.758 ± 0.347
2.122AspTyr: 2.122 ± 0.392
0.0AspXaa: 0.0 ± 0.0
Glu
6.549GluAla: 6.549 ± 0.785
0.243GluCys: 0.243 ± 0.132
4.305GluAsp: 4.305 ± 0.577
4.79GluGlu: 4.79 ± 0.553
2.425GluPhe: 2.425 ± 0.382
3.881GluGly: 3.881 ± 0.511
1.273GluHis: 1.273 ± 0.303
2.971GluIle: 2.971 ± 0.532
2.911GluLys: 2.911 ± 0.414
7.155GluLeu: 7.155 ± 0.566
1.334GluMet: 1.334 ± 0.31
1.819GluAsn: 1.819 ± 0.38
2.789GluPro: 2.789 ± 0.483
2.729GluGln: 2.729 ± 0.389
3.699GluArg: 3.699 ± 0.601
3.699GluSer: 3.699 ± 0.508
4.184GluThr: 4.184 ± 0.711
5.336GluVal: 5.336 ± 0.664
1.516GluTrp: 1.516 ± 0.381
2.607GluTyr: 2.607 ± 0.506
0.0GluXaa: 0.0 ± 0.0
Phe
2.425PheAla: 2.425 ± 0.317
0.364PheCys: 0.364 ± 0.167
2.789PheAsp: 2.789 ± 0.355
1.88PheGlu: 1.88 ± 0.375
0.546PhePhe: 0.546 ± 0.2
3.214PheGly: 3.214 ± 0.515
0.728PheHis: 0.728 ± 0.235
1.273PheIle: 1.273 ± 0.288
1.455PheLys: 1.455 ± 0.278
2.486PheLeu: 2.486 ± 0.416
0.728PheMet: 0.728 ± 0.179
0.91PheAsn: 0.91 ± 0.226
1.758PhePro: 1.758 ± 0.287
1.031PheGln: 1.031 ± 0.281
1.94PheArg: 1.94 ± 0.346
2.122PheSer: 2.122 ± 0.532
1.88PheThr: 1.88 ± 0.337
2.183PheVal: 2.183 ± 0.396
0.606PheTrp: 0.606 ± 0.167
1.152PheTyr: 1.152 ± 0.3
0.0PheXaa: 0.0 ± 0.0
Gly
6.791GlyAla: 6.791 ± 1.003
0.667GlyCys: 0.667 ± 0.201
5.457GlyAsp: 5.457 ± 0.412
5.275GlyGlu: 5.275 ± 0.47
2.971GlyPhe: 2.971 ± 0.512
6.488GlyGly: 6.488 ± 1.046
1.758GlyHis: 1.758 ± 0.374
3.941GlyIle: 3.941 ± 0.594
3.456GlyLys: 3.456 ± 0.615
7.64GlyLeu: 7.64 ± 0.867
1.698GlyMet: 1.698 ± 0.334
3.335GlyAsn: 3.335 ± 0.399
3.881GlyPro: 3.881 ± 0.599
2.062GlyGln: 2.062 ± 0.341
4.487GlyArg: 4.487 ± 0.507
5.336GlySer: 5.336 ± 0.577
5.154GlyThr: 5.154 ± 0.643
5.578GlyVal: 5.578 ± 0.657
2.122GlyTrp: 2.122 ± 0.42
2.607GlyTyr: 2.607 ± 0.366
0.0GlyXaa: 0.0 ± 0.0
His
2.001HisAla: 2.001 ± 0.407
0.182HisCys: 0.182 ± 0.098
1.334HisAsp: 1.334 ± 0.276
1.152HisGlu: 1.152 ± 0.251
0.667HisPhe: 0.667 ± 0.193
1.334HisGly: 1.334 ± 0.304
0.788HisHis: 0.788 ± 0.211
0.788HisIle: 0.788 ± 0.202
1.213HisLys: 1.213 ± 0.358
1.698HisLeu: 1.698 ± 0.4
0.121HisMet: 0.121 ± 0.081
0.546HisAsn: 0.546 ± 0.187
1.637HisPro: 1.637 ± 0.332
1.152HisGln: 1.152 ± 0.24
1.516HisArg: 1.516 ± 0.345
0.606HisSer: 0.606 ± 0.218
1.031HisThr: 1.031 ± 0.275
1.758HisVal: 1.758 ± 0.338
0.546HisTrp: 0.546 ± 0.157
0.546HisTyr: 0.546 ± 0.214
0.0HisXaa: 0.0 ± 0.0
Ile
6.124IleAla: 6.124 ± 0.749
0.303IleCys: 0.303 ± 0.122
3.396IleAsp: 3.396 ± 0.416
3.699IleGlu: 3.699 ± 0.466
0.788IlePhe: 0.788 ± 0.206
3.638IleGly: 3.638 ± 0.428
1.031IleHis: 1.031 ± 0.252
1.577IleIle: 1.577 ± 0.305
1.455IleLys: 1.455 ± 0.323
3.638IleLeu: 3.638 ± 0.527
0.849IleMet: 0.849 ± 0.198
1.94IleAsn: 1.94 ± 0.351
3.153IlePro: 3.153 ± 0.437
1.395IleGln: 1.395 ± 0.449
3.759IleArg: 3.759 ± 0.452
3.456IleSer: 3.456 ± 0.425
3.153IleThr: 3.153 ± 0.471
2.547IleVal: 2.547 ± 0.429
0.728IleTrp: 0.728 ± 0.196
1.334IleTyr: 1.334 ± 0.257
0.0IleXaa: 0.0 ± 0.0
Lys
3.577LysAla: 3.577 ± 0.5
0.303LysCys: 0.303 ± 0.128
2.607LysAsp: 2.607 ± 0.437
2.425LysGlu: 2.425 ± 0.376
1.516LysPhe: 1.516 ± 0.26
2.486LysGly: 2.486 ± 0.422
1.152LysHis: 1.152 ± 0.282
2.668LysIle: 2.668 ± 0.411
1.819LysLys: 1.819 ± 0.357
3.335LysLeu: 3.335 ± 0.458
1.091LysMet: 1.091 ± 0.26
1.577LysAsn: 1.577 ± 0.266
2.425LysPro: 2.425 ± 0.391
1.819LysGln: 1.819 ± 0.366
3.517LysArg: 3.517 ± 0.609
2.183LysSer: 2.183 ± 0.38
2.607LysThr: 2.607 ± 0.393
3.335LysVal: 3.335 ± 0.485
0.849LysTrp: 0.849 ± 0.234
0.97LysTyr: 0.97 ± 0.25
0.0LysXaa: 0.0 ± 0.0
Leu
9.884LeuAla: 9.884 ± 0.966
0.485LeuCys: 0.485 ± 0.141
5.578LeuAsp: 5.578 ± 0.598
5.76LeuGlu: 5.76 ± 0.659
2.062LeuPhe: 2.062 ± 0.428
7.155LeuGly: 7.155 ± 0.84
1.577LeuHis: 1.577 ± 0.327
4.79LeuIle: 4.79 ± 0.674
4.669LeuLys: 4.669 ± 0.544
5.821LeuLeu: 5.821 ± 0.584
1.698LeuMet: 1.698 ± 0.294
2.607LeuAsn: 2.607 ± 0.409
5.882LeuPro: 5.882 ± 0.641
2.547LeuGln: 2.547 ± 0.496
5.942LeuArg: 5.942 ± 0.567
5.882LeuSer: 5.882 ± 0.492
6.67LeuThr: 6.67 ± 0.527
4.305LeuVal: 4.305 ± 0.658
1.273LeuTrp: 1.273 ± 0.321
2.486LeuTyr: 2.486 ± 0.417
0.0LeuXaa: 0.0 ± 0.0
Met
2.183MetAla: 2.183 ± 0.325
0.061MetCys: 0.061 ± 0.049
1.152MetAsp: 1.152 ± 0.227
1.455MetGlu: 1.455 ± 0.301
0.606MetPhe: 0.606 ± 0.195
1.395MetGly: 1.395 ± 0.325
0.364MetHis: 0.364 ± 0.143
0.788MetIle: 0.788 ± 0.242
0.97MetLys: 0.97 ± 0.212
1.213MetLeu: 1.213 ± 0.281
0.121MetMet: 0.121 ± 0.092
1.091MetAsn: 1.091 ± 0.259
1.152MetPro: 1.152 ± 0.256
0.728MetGln: 0.728 ± 0.175
1.334MetArg: 1.334 ± 0.289
2.365MetSer: 2.365 ± 0.385
2.244MetThr: 2.244 ± 0.272
1.334MetVal: 1.334 ± 0.401
0.303MetTrp: 0.303 ± 0.114
0.364MetTyr: 0.364 ± 0.134
0.0MetXaa: 0.0 ± 0.0
Asn
2.85AsnAla: 2.85 ± 0.499
0.121AsnCys: 0.121 ± 0.079
2.122AsnAsp: 2.122 ± 0.385
1.698AsnGlu: 1.698 ± 0.284
0.97AsnPhe: 0.97 ± 0.244
3.456AsnGly: 3.456 ± 0.465
0.667AsnHis: 0.667 ± 0.206
1.758AsnIle: 1.758 ± 0.307
0.728AsnLys: 0.728 ± 0.218
2.365AsnLeu: 2.365 ± 0.422
0.485AsnMet: 0.485 ± 0.158
0.788AsnAsn: 0.788 ± 0.223
2.729AsnPro: 2.729 ± 0.37
1.213AsnGln: 1.213 ± 0.279
1.577AsnArg: 1.577 ± 0.277
2.001AsnSer: 2.001 ± 0.391
1.758AsnThr: 1.758 ± 0.266
2.425AsnVal: 2.425 ± 0.38
0.788AsnTrp: 0.788 ± 0.17
1.273AsnTyr: 1.273 ± 0.295
0.0AsnXaa: 0.0 ± 0.0
Pro
5.942ProAla: 5.942 ± 0.62
0.485ProCys: 0.485 ± 0.163
4.123ProAsp: 4.123 ± 0.448
4.487ProGlu: 4.487 ± 0.542
2.062ProPhe: 2.062 ± 0.401
5.578ProGly: 5.578 ± 0.605
0.91ProHis: 0.91 ± 0.261
2.244ProIle: 2.244 ± 0.398
1.88ProLys: 1.88 ± 0.252
4.487ProLeu: 4.487 ± 0.616
1.031ProMet: 1.031 ± 0.241
1.516ProAsn: 1.516 ± 0.318
3.456ProPro: 3.456 ± 0.749
1.577ProGln: 1.577 ± 0.319
2.668ProArg: 2.668 ± 0.437
4.063ProSer: 4.063 ± 0.487
4.002ProThr: 4.002 ± 0.506
4.002ProVal: 4.002 ± 0.55
0.91ProTrp: 0.91 ± 0.3
1.516ProTyr: 1.516 ± 0.315
0.0ProXaa: 0.0 ± 0.0
Gln
3.517GlnAla: 3.517 ± 0.5
0.061GlnCys: 0.061 ± 0.058
1.152GlnAsp: 1.152 ± 0.263
1.637GlnGlu: 1.637 ± 0.257
1.031GlnPhe: 1.031 ± 0.212
2.607GlnGly: 2.607 ± 0.316
0.667GlnHis: 0.667 ± 0.176
2.85GlnIle: 2.85 ± 0.568
1.395GlnLys: 1.395 ± 0.265
4.063GlnLeu: 4.063 ± 0.508
0.97GlnMet: 0.97 ± 0.266
0.424GlnAsn: 0.424 ± 0.14
2.122GlnPro: 2.122 ± 0.361
1.819GlnGln: 1.819 ± 0.404
2.001GlnArg: 2.001 ± 0.367
1.577GlnSer: 1.577 ± 0.261
1.637GlnThr: 1.637 ± 0.306
2.729GlnVal: 2.729 ± 0.443
0.546GlnTrp: 0.546 ± 0.171
0.667GlnTyr: 0.667 ± 0.181
0.0GlnXaa: 0.0 ± 0.0
Arg
6.124ArgAla: 6.124 ± 0.654
0.606ArgCys: 0.606 ± 0.202
2.971ArgAsp: 2.971 ± 0.411
4.972ArgGlu: 4.972 ± 0.63
2.122ArgPhe: 2.122 ± 0.413
4.73ArgGly: 4.73 ± 0.525
1.091ArgHis: 1.091 ± 0.266
3.092ArgIle: 3.092 ± 0.445
3.274ArgLys: 3.274 ± 0.463
6.064ArgLeu: 6.064 ± 0.865
1.88ArgMet: 1.88 ± 0.309
2.062ArgAsn: 2.062 ± 0.378
2.85ArgPro: 2.85 ± 0.446
2.122ArgGln: 2.122 ± 0.414
5.397ArgArg: 5.397 ± 0.696
3.456ArgSer: 3.456 ± 0.475
3.032ArgThr: 3.032 ± 0.458
4.851ArgVal: 4.851 ± 0.552
0.97ArgTrp: 0.97 ± 0.244
1.758ArgTyr: 1.758 ± 0.336
0.0ArgXaa: 0.0 ± 0.0
Ser
5.942SerAla: 5.942 ± 0.823
0.606SerCys: 0.606 ± 0.211
3.032SerAsp: 3.032 ± 0.421
4.244SerGlu: 4.244 ± 0.582
1.94SerPhe: 1.94 ± 0.318
6.064SerGly: 6.064 ± 0.678
1.516SerHis: 1.516 ± 0.327
2.668SerIle: 2.668 ± 0.41
2.85SerLys: 2.85 ± 0.397
5.457SerLeu: 5.457 ± 0.652
1.758SerMet: 1.758 ± 0.409
2.183SerAsn: 2.183 ± 0.449
3.214SerPro: 3.214 ± 0.412
1.758SerGln: 1.758 ± 0.292
3.092SerArg: 3.092 ± 0.362
3.274SerSer: 3.274 ± 0.602
3.396SerThr: 3.396 ± 0.522
3.396SerVal: 3.396 ± 0.433
1.273SerTrp: 1.273 ± 0.271
1.637SerTyr: 1.637 ± 0.326
0.0SerXaa: 0.0 ± 0.0
Thr
6.731ThrAla: 6.731 ± 0.898
0.303ThrCys: 0.303 ± 0.151
4.184ThrAsp: 4.184 ± 0.591
4.426ThrGlu: 4.426 ± 0.548
2.365ThrPhe: 2.365 ± 0.383
5.882ThrGly: 5.882 ± 0.569
1.091ThrHis: 1.091 ± 0.311
2.729ThrIle: 2.729 ± 0.58
2.729ThrLys: 2.729 ± 0.308
5.7ThrLeu: 5.7 ± 0.611
0.97ThrMet: 0.97 ± 0.237
1.758ThrAsn: 1.758 ± 0.303
4.063ThrPro: 4.063 ± 0.543
2.244ThrGln: 2.244 ± 0.365
3.699ThrArg: 3.699 ± 0.5
3.517ThrSer: 3.517 ± 0.503
4.305ThrThr: 4.305 ± 0.614
5.215ThrVal: 5.215 ± 0.583
1.091ThrTrp: 1.091 ± 0.248
2.001ThrTyr: 2.001 ± 0.373
0.0ThrXaa: 0.0 ± 0.0
Val
6.549ValAla: 6.549 ± 0.755
0.424ValCys: 0.424 ± 0.163
5.397ValAsp: 5.397 ± 0.687
4.548ValGlu: 4.548 ± 0.589
2.244ValPhe: 2.244 ± 0.312
5.033ValGly: 5.033 ± 0.724
1.455ValHis: 1.455 ± 0.279
3.577ValIle: 3.577 ± 0.486
3.274ValLys: 3.274 ± 0.553
5.275ValLeu: 5.275 ± 0.524
1.213ValMet: 1.213 ± 0.314
2.425ValAsn: 2.425 ± 0.393
3.941ValPro: 3.941 ± 0.496
2.365ValGln: 2.365 ± 0.394
4.851ValArg: 4.851 ± 0.634
4.851ValSer: 4.851 ± 0.463
6.124ValThr: 6.124 ± 0.64
4.79ValVal: 4.79 ± 0.778
1.031ValTrp: 1.031 ± 0.24
2.486ValTyr: 2.486 ± 0.438
0.0ValXaa: 0.0 ± 0.0
Trp
1.698TrpAla: 1.698 ± 0.316
0.364TrpCys: 0.364 ± 0.156
1.455TrpAsp: 1.455 ± 0.303
0.91TrpGlu: 0.91 ± 0.2
0.788TrpPhe: 0.788 ± 0.247
1.577TrpGly: 1.577 ± 0.282
0.364TrpHis: 0.364 ± 0.16
1.091TrpIle: 1.091 ± 0.247
0.303TrpLys: 0.303 ± 0.138
1.819TrpLeu: 1.819 ± 0.273
0.424TrpMet: 0.424 ± 0.178
0.606TrpAsn: 0.606 ± 0.163
0.667TrpPro: 0.667 ± 0.213
0.788TrpGln: 0.788 ± 0.215
1.213TrpArg: 1.213 ± 0.328
0.91TrpSer: 0.91 ± 0.192
1.637TrpThr: 1.637 ± 0.389
2.062TrpVal: 2.062 ± 0.28
0.485TrpTrp: 0.485 ± 0.212
0.243TrpTyr: 0.243 ± 0.139
0.0TrpXaa: 0.0 ± 0.0
Tyr
2.425TyrAla: 2.425 ± 0.42
0.364TyrCys: 0.364 ± 0.162
1.273TyrAsp: 1.273 ± 0.321
2.365TyrGlu: 2.365 ± 0.393
0.606TyrPhe: 0.606 ± 0.214
2.607TyrGly: 2.607 ± 0.402
0.788TyrHis: 0.788 ± 0.2
1.516TyrIle: 1.516 ± 0.347
1.031TyrLys: 1.031 ± 0.193
2.789TyrLeu: 2.789 ± 0.454
0.485TyrMet: 0.485 ± 0.15
1.091TyrAsn: 1.091 ± 0.284
1.273TyrPro: 1.273 ± 0.264
1.091TyrGln: 1.091 ± 0.31
2.729TyrArg: 2.729 ± 0.418
1.577TyrSer: 1.577 ± 0.32
2.001TyrThr: 2.001 ± 0.357
2.244TyrVal: 2.244 ± 0.399
0.546TyrTrp: 0.546 ± 0.182
0.424TyrTyr: 0.424 ± 0.141
0.0TyrXaa: 0.0 ± 0.0
Xaa
0.0XaaAla: 0.0 ± 0.0
0.0XaaCys: 0.0 ± 0.0
0.0XaaAsp: 0.0 ± 0.0
0.0XaaGlu: 0.0 ± 0.0
0.0XaaPhe: 0.0 ± 0.0
0.0XaaGly: 0.0 ± 0.0
0.0XaaHis: 0.0 ± 0.0
0.0XaaIle: 0.0 ± 0.0
0.0XaaLys: 0.0 ± 0.0
0.0XaaLeu: 0.0 ± 0.0
0.0XaaMet: 0.0 ± 0.0
0.0XaaAsn: 0.0 ± 0.0
0.0XaaPro: 0.0 ± 0.0
0.0XaaGln: 0.0 ± 0.0
0.0XaaArg: 0.0 ± 0.0
0.0XaaSer: 0.0 ± 0.0
0.0XaaThr: 0.0 ± 0.0
0.0XaaVal: 0.0 ± 0.0
0.0XaaTrp: 0.0 ± 0.0
0.0XaaTyr: 0.0 ± 0.0
0.0XaaXaa: 0.0 ± 0.0
Statistics based on 91 proteins (16493 amino acids)

Note: The error was estimated by bootstrapping (100 replicates) at the protein level.
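A protein-level bootstrap of this kind can be sketched as follows: whole proteins are resampled with replacement, the dipeptide frequency is recomputed for each replicate, and the spread of the replicate values is reported. This is only a sketch under stated assumptions: the function names are hypothetical, and using the sample standard deviation as the error measure is an assumption, since the page does not say which statistic is reported.

```python
import random
import statistics
from collections import Counter


def per_mille(proteins):
    """Dipeptide frequencies in per mille, counted within each protein."""
    counts = Counter(a + b for seq in proteins for a, b in zip(seq, seq[1:]))
    total = sum(counts.values())
    if total == 0:
        return {}
    return {pair: 1000.0 * n / total for pair, n in counts.items()}


def bootstrap_error(proteins, pair, replicates=100, seed=0):
    """Estimate the error of one dipeptide frequency by resampling proteins.

    Assumption: the error is the sample standard deviation over replicates.
    """
    rng = random.Random(seed)  # fixed seed keeps the sketch reproducible
    values = []
    for _ in range(replicates):
        # Resample whole proteins with replacement (protein-level bootstrap).
        sample = rng.choices(proteins, k=len(proteins))
        values.append(per_mille(sample).get(pair, 0.0))
    return statistics.stdev(values)


if __name__ == "__main__":
    toy = ["AAAC", "ACAC", "CCAA", "AAAA"]  # toy input, not real phage proteins
    observed = per_mille(toy)["AA"]
    print(f"AA: {observed:.3f} ± {bootstrap_error(toy, 'AA'):.3f}")
```

Resampling whole proteins rather than individual residues keeps each protein's internal composition intact, so the estimate reflects variation between proteins.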

The dipeptide statistics above (among other statistics for this proteome) can be downloaded from this CSV file
See this proteome in: uniprot_link
Proteome-pI is available under Creative Commons Attribution-NoDerivs license, for more details see here

Reference: Kozlowski LP. Proteome-pI 2.0: Proteome Isoelectric Point Database Update. Nucleic Acids Res. 2021, doi: 10.1093/nar/gkab944
Contact: Lukasz P. Kozlowski