Amino acid dipeptide frequency for Mycobacterium phage Pepe

Apart from single amino acid frequencies, one can also calculate the so-called amino acid dipeptide frequency, that is, how often each ordered pair of adjacent residues occurs.
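A minimal Python sketch of how such a frequency could be computed from a set of protein sequences. The function name `dipeptide_per_mille`, the toy sequences, the use of one-letter codes, and the choice to normalize by the total number of dipeptides are assumptions made for illustration; the source does not describe the database's actual pipeline.

```python
from collections import Counter

def dipeptide_per_mille(sequences):
    """Return {dipeptide: frequency in per mille} over all sequences.

    Dipeptides are overlapping pairs of adjacent residues; a pair never
    spans two different proteins.
    """
    counts = Counter()
    for seq in sequences:
        for i in range(len(seq) - 1):
            counts[seq[i:i + 2]] += 1
    total = sum(counts.values())
    if total == 0:
        return {}
    # Assumption: normalize by the total number of dipeptides observed.
    return {pair: 1000.0 * n / total for pair, n in counts.items()}

# Toy input (hypothetical sequences, one-letter amino acid codes).
toy = ["MAAK", "AAL"]
freqs = dipeptide_per_mille(toy)
```

Counting within each protein separately keeps the last residue of one protein from being paired with the first residue of the next.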

There are 400 possible dipeptides (441 if Xaa is counted as an additional, 21st amino acid representing all non-standard or unknown amino acids). Thus, if dipeptides occurred uniformly at random in proteins, each would be present at a frequency of 0.25% (2.5‰). As this is not the case in nature, for better visibility all dipeptides that are more frequent than expected are marked in red in the table, and those that are underrepresented are marked in blue.

All values are given in per mille (‰) and must therefore be multiplied by 10^-3 to obtain fractions.
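As a worked illustration of the two paragraphs above, the sketch below converts a per-mille value to a plain fraction and compares it with the uniform expectation of 1/400. The helper names are hypothetical, and the simple above/below rule is an assumption; the source does not state the exact criterion used for the red and blue marking.

```python
UNIFORM_EXPECTED = 1 / 400  # 0.25 % = 2.5 per mille for 20 standard residues

def per_mille_to_fraction(value):
    """Convert a per-mille value to a fraction (multiply by 10^-3)."""
    return value * 1e-3

def representation(value_per_mille, expected=UNIFORM_EXPECTED):
    """Label a dipeptide relative to the uniform expectation."""
    fraction = per_mille_to_fraction(value_per_mille)
    if fraction > expected:
        return "over"
    if fraction < expected:
        return "under"
    return "as expected"

# Values taken from the table: AlaAla 12.832 and CysCys 0.0 (per mille).
ala_ala = representation(12.832)
cys_cys = representation(0.0)
```

With these inputs AlaAla falls above the uniform expectation and CysCys below it.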

For more information, see the sequence space article on Wikipedia.

Dipeptide frequencies (‰, value ± error), grouped by first residue. Second residue order: Ala, Cys, Asp, Glu, Phe, Gly, His, Ile, Lys, Leu, Met, Asn, Pro, Gln, Arg, Ser, Thr, Val, Trp, Tyr, Xaa.

Ala
AlaAla: 12.832 ± 1.175
AlaCys: 0.529 ± 0.163
AlaAsp: 5.887 ± 0.591
AlaGlu: 6.813 ± 0.794
AlaPhe: 2.977 ± 0.452
AlaGly: 7.541 ± 0.888
AlaHis: 1.389 ± 0.425
AlaIle: 4.432 ± 0.667
AlaLys: 4.167 ± 0.549
AlaLeu: 9.591 ± 0.983
AlaMet: 2.249 ± 0.404
AlaAsn: 2.646 ± 0.386
AlaPro: 5.292 ± 0.752
AlaGln: 2.91 ± 0.476
AlaArg: 6.416 ± 0.68
AlaSer: 5.027 ± 0.631
AlaThr: 6.019 ± 0.696
AlaVal: 8.93 ± 0.766
AlaTrp: 2.117 ± 0.347
AlaTyr: 3.109 ± 0.463
AlaXaa: 0.0 ± 0.0

Cys
CysAla: 0.595 ± 0.229
CysCys: 0.0 ± 0.0
CysAsp: 0.397 ± 0.15
CysGlu: 0.661 ± 0.227
CysPhe: 0.198 ± 0.112
CysGly: 0.661 ± 0.266
CysHis: 0.198 ± 0.108
CysIle: 0.265 ± 0.16
CysLys: 0.198 ± 0.107
CysLeu: 0.397 ± 0.179
CysMet: 0.132 ± 0.103
CysAsn: 0.265 ± 0.139
CysPro: 0.331 ± 0.149
CysGln: 0.132 ± 0.081
CysArg: 0.463 ± 0.196
CysSer: 0.265 ± 0.139
CysThr: 0.331 ± 0.144
CysVal: 0.397 ± 0.166
CysTrp: 0.132 ± 0.079
CysTyr: 0.0 ± 0.0
CysXaa: 0.0 ± 0.0

Asp
AspAla: 6.152 ± 0.688
AspCys: 0.529 ± 0.151
AspAsp: 4.829 ± 0.568
AspGlu: 3.969 ± 0.622
AspPhe: 2.514 ± 0.451
AspGly: 5.887 ± 0.755
AspHis: 0.992 ± 0.258
AspIle: 2.447 ± 0.422
AspLys: 3.109 ± 0.427
AspLeu: 6.681 ± 0.764
AspMet: 1.058 ± 0.199
AspAsn: 1.588 ± 0.344
AspPro: 4.3 ± 0.627
AspGln: 1.588 ± 0.382
AspArg: 3.77 ± 0.509
AspSer: 3.638 ± 0.451
AspThr: 3.638 ± 0.391
AspVal: 3.77 ± 0.591
AspTrp: 1.786 ± 0.34
AspTyr: 1.918 ± 0.372
AspXaa: 0.0 ± 0.0

Glu
GluAla: 5.689 ± 0.667
GluCys: 0.397 ± 0.229
GluAsp: 4.432 ± 0.558
GluGlu: 5.093 ± 0.693
GluPhe: 2.051 ± 0.371
GluGly: 4.432 ± 0.497
GluHis: 1.389 ± 0.352
GluIle: 3.307 ± 0.506
GluLys: 3.043 ± 0.542
GluLeu: 6.879 ± 0.55
GluMet: 1.521 ± 0.295
GluAsn: 1.588 ± 0.326
GluPro: 2.778 ± 0.475
GluGln: 2.977 ± 0.528
GluArg: 4.101 ± 0.504
GluSer: 3.373 ± 0.44
GluThr: 3.836 ± 0.588
GluVal: 7.012 ± 0.739
GluTrp: 1.521 ± 0.363
GluTyr: 2.712 ± 0.464
GluXaa: 0.0 ± 0.0

Phe
PheAla: 2.249 ± 0.378
PheCys: 0.265 ± 0.156
PheAsp: 2.58 ± 0.34
PheGlu: 2.381 ± 0.399
PhePhe: 0.595 ± 0.204
PheGly: 3.77 ± 0.551
PheHis: 0.728 ± 0.271
PheIle: 1.72 ± 0.267
PheLys: 1.058 ± 0.263
PheLeu: 2.844 ± 0.517
PheMet: 0.595 ± 0.214
PheAsn: 1.455 ± 0.26
PhePro: 1.654 ± 0.349
PheGln: 1.058 ± 0.235
PheArg: 2.051 ± 0.417
PheSer: 2.315 ± 0.398
PheThr: 1.852 ± 0.367
PheVal: 1.852 ± 0.405
PheTrp: 0.661 ± 0.22
PheTyr: 0.86 ± 0.223
PheXaa: 0.0 ± 0.0

Gly
GlyAla: 6.548 ± 0.863
GlyCys: 0.728 ± 0.273
GlyAsp: 5.887 ± 0.502
GlyGlu: 5.292 ± 0.544
GlyPhe: 3.175 ± 0.545
GlyGly: 7.673 ± 1.164
GlyHis: 2.051 ± 0.428
GlyIle: 5.093 ± 0.781
GlyLys: 3.373 ± 0.559
GlyLeu: 7.408 ± 0.716
GlyMet: 2.183 ± 0.381
GlyAsn: 3.175 ± 0.455
GlyPro: 3.638 ± 0.616
GlyGln: 2.91 ± 0.353
GlyArg: 5.226 ± 0.54
GlySer: 4.763 ± 0.569
GlyThr: 5.424 ± 0.647
GlyVal: 4.895 ± 0.663
GlyTrp: 2.381 ± 0.404
GlyTyr: 2.514 ± 0.391
GlyXaa: 0.0 ± 0.0

His
HisAla: 1.588 ± 0.309
HisCys: 0.198 ± 0.127
HisAsp: 1.257 ± 0.244
HisGlu: 1.323 ± 0.28
HisPhe: 0.926 ± 0.218
HisGly: 1.72 ± 0.375
HisHis: 0.529 ± 0.218
HisIle: 1.124 ± 0.225
HisLys: 1.191 ± 0.335
HisLeu: 1.323 ± 0.317
HisMet: 0.066 ± 0.066
HisAsn: 0.198 ± 0.108
HisPro: 1.191 ± 0.238
HisGln: 0.926 ± 0.275
HisArg: 1.455 ± 0.334
HisSer: 0.728 ± 0.209
HisThr: 0.926 ± 0.232
HisVal: 1.72 ± 0.368
HisTrp: 0.529 ± 0.191
HisTyr: 0.728 ± 0.258
HisXaa: 0.0 ± 0.0

Ile
IleAla: 6.218 ± 0.585
IleCys: 0.198 ± 0.098
IleAsp: 3.44 ± 0.471
IleGlu: 3.572 ± 0.442
IlePhe: 1.058 ± 0.269
IleGly: 4.233 ± 0.561
IleHis: 1.257 ± 0.308
IleIle: 1.72 ± 0.333
IleLys: 1.521 ± 0.366
IleLeu: 3.307 ± 0.371
IleMet: 0.992 ± 0.255
IleAsn: 1.654 ± 0.339
IlePro: 3.043 ± 0.413
IleGln: 1.588 ± 0.419
IleArg: 3.175 ± 0.474
IleSer: 3.241 ± 0.393
IleThr: 3.241 ± 0.415
IleVal: 3.241 ± 0.605
IleTrp: 0.728 ± 0.208
IleTyr: 1.389 ± 0.279
IleXaa: 0.0 ± 0.0

Lys
LysAla: 3.506 ± 0.459
LysCys: 0.265 ± 0.127
LysAsp: 2.646 ± 0.429
LysGlu: 1.984 ± 0.361
LysPhe: 1.521 ± 0.283
LysGly: 2.91 ± 0.474
LysHis: 0.992 ± 0.267
LysIle: 1.786 ± 0.444
LysLys: 2.051 ± 0.394
LysLeu: 3.836 ± 0.468
LysMet: 1.058 ± 0.235
LysAsn: 1.852 ± 0.333
LysPro: 2.249 ± 0.424
LysGln: 1.588 ± 0.295
LysArg: 3.109 ± 0.536
LysSer: 2.646 ± 0.426
LysThr: 2.117 ± 0.373
LysVal: 3.638 ± 0.53
LysTrp: 0.661 ± 0.215
LysTyr: 0.992 ± 0.264
LysXaa: 0.0 ± 0.0

Leu
LeuAla: 9.922 ± 1.206
LeuCys: 0.265 ± 0.128
LeuAsp: 6.085 ± 0.553
LeuGlu: 5.622 ± 0.609
LeuPhe: 2.315 ± 0.466
LeuGly: 7.276 ± 0.657
LeuHis: 1.455 ± 0.318
LeuIle: 4.829 ± 0.612
LeuLys: 3.77 ± 0.524
LeuLeu: 5.953 ± 0.672
LeuMet: 1.654 ± 0.249
LeuAsn: 2.91 ± 0.41
LeuPro: 5.556 ± 0.637
LeuGln: 2.712 ± 0.493
LeuArg: 6.35 ± 0.554
LeuSer: 5.755 ± 0.664
LeuThr: 6.681 ± 0.765
LeuVal: 4.763 ± 0.625
LeuTrp: 1.191 ± 0.319
LeuTyr: 2.315 ± 0.446
LeuXaa: 0.0 ± 0.0

Met
MetAla: 2.381 ± 0.353
MetCys: 0.0 ± 0.0
MetAsp: 0.86 ± 0.23
MetGlu: 1.654 ± 0.353
MetPhe: 0.595 ± 0.171
MetGly: 1.323 ± 0.257
MetHis: 0.397 ± 0.241
MetIle: 0.661 ± 0.196
MetLys: 1.124 ± 0.252
MetLeu: 1.191 ± 0.28
MetMet: 0.132 ± 0.09
MetAsn: 0.86 ± 0.203
MetPro: 1.323 ± 0.271
MetGln: 0.728 ± 0.243
MetArg: 1.191 ± 0.278
MetSer: 2.051 ± 0.405
MetThr: 2.447 ± 0.39
MetVal: 1.191 ± 0.36
MetTrp: 0.265 ± 0.117
MetTyr: 0.397 ± 0.149
MetXaa: 0.0 ± 0.0

Asn
AsnAla: 3.506 ± 0.588
AsnCys: 0.265 ± 0.252
AsnAsp: 1.72 ± 0.429
AsnGlu: 1.786 ± 0.421
AsnPhe: 0.794 ± 0.214
AsnGly: 3.638 ± 0.495
AsnHis: 0.661 ± 0.212
AsnIle: 1.588 ± 0.273
AsnLys: 0.728 ± 0.202
AsnLeu: 2.117 ± 0.375
AsnMet: 0.661 ± 0.155
AsnAsn: 0.86 ± 0.234
AsnPro: 2.712 ± 0.418
AsnGln: 0.926 ± 0.231
AsnArg: 1.323 ± 0.288
AsnSer: 1.654 ± 0.32
AsnThr: 1.72 ± 0.405
AsnVal: 2.778 ± 0.435
AsnTrp: 0.661 ± 0.187
AsnTyr: 1.124 ± 0.261
AsnXaa: 0.0 ± 0.0

Pro
ProAla: 5.292 ± 0.579
ProCys: 0.331 ± 0.145
ProAsp: 4.101 ± 0.508
ProGlu: 3.969 ± 0.621
ProPhe: 1.852 ± 0.409
ProGly: 4.895 ± 0.756
ProHis: 0.728 ± 0.24
ProIle: 2.514 ± 0.403
ProLys: 1.984 ± 0.356
ProLeu: 4.63 ± 0.579
ProMet: 1.257 ± 0.311
ProAsn: 1.323 ± 0.334
ProPro: 2.381 ± 0.42
ProGln: 1.455 ± 0.325
ProArg: 2.315 ± 0.456
ProSer: 3.969 ± 0.572
ProThr: 3.506 ± 0.45
ProVal: 3.836 ± 0.436
ProTrp: 0.992 ± 0.346
ProTyr: 1.72 ± 0.374
ProXaa: 0.0 ± 0.0

Gln
GlnAla: 3.175 ± 0.733
GlnCys: 0.066 ± 0.062
GlnAsp: 1.058 ± 0.33
GlnGlu: 1.984 ± 0.324
GlnPhe: 1.191 ± 0.36
GlnGly: 2.91 ± 0.439
GlnHis: 0.595 ± 0.155
GlnIle: 2.249 ± 0.338
GlnLys: 1.389 ± 0.28
GlnLeu: 3.836 ± 0.602
GlnMet: 1.058 ± 0.247
GlnAsn: 0.529 ± 0.175
GlnPro: 1.852 ± 0.347
GlnGln: 1.786 ± 0.428
GlnArg: 2.117 ± 0.374
GlnSer: 1.786 ± 0.308
GlnThr: 1.72 ± 0.35
GlnVal: 2.514 ± 0.362
GlnTrp: 0.661 ± 0.176
GlnTyr: 0.595 ± 0.176
GlnXaa: 0.0 ± 0.0

Arg
ArgAla: 5.755 ± 0.632
ArgCys: 0.529 ± 0.199
ArgAsp: 3.506 ± 0.527
ArgGlu: 5.093 ± 0.674
ArgPhe: 2.183 ± 0.413
ArgGly: 5.027 ± 0.635
ArgHis: 1.058 ± 0.287
ArgIle: 3.373 ± 0.558
ArgLys: 3.109 ± 0.44
ArgLeu: 6.085 ± 0.709
ArgMet: 1.72 ± 0.305
ArgAsn: 2.249 ± 0.378
ArgPro: 2.183 ± 0.366
ArgGln: 1.588 ± 0.348
ArgArg: 5.226 ± 0.709
ArgSer: 3.506 ± 0.502
ArgThr: 2.91 ± 0.581
ArgVal: 5.358 ± 0.678
ArgTrp: 1.124 ± 0.264
ArgTyr: 1.654 ± 0.295
ArgXaa: 0.0 ± 0.0

Ser
SerAla: 6.218 ± 0.578
SerCys: 0.265 ± 0.124
SerAsp: 3.241 ± 0.508
SerGlu: 3.903 ± 0.497
SerPhe: 2.051 ± 0.364
SerGly: 5.953 ± 0.716
SerHis: 1.455 ± 0.324
SerIle: 2.844 ± 0.413
SerLys: 2.183 ± 0.334
SerLeu: 5.49 ± 0.573
SerMet: 1.455 ± 0.264
SerAsn: 2.117 ± 0.433
SerPro: 3.109 ± 0.418
SerGln: 1.852 ± 0.331
SerArg: 2.712 ± 0.37
SerSer: 3.241 ± 0.537
SerThr: 3.241 ± 0.455
SerVal: 3.969 ± 0.426
SerTrp: 1.257 ± 0.29
SerTyr: 1.521 ± 0.314
SerXaa: 0.0 ± 0.0

Thr
ThrAla: 6.879 ± 0.803
ThrCys: 0.198 ± 0.118
ThrAsp: 4.3 ± 0.635
ThrGlu: 4.432 ± 0.554
ThrPhe: 2.58 ± 0.35
ThrGly: 5.755 ± 0.586
ThrHis: 1.323 ± 0.364
ThrIle: 2.514 ± 0.55
ThrLys: 2.646 ± 0.378
ThrLeu: 6.085 ± 0.67
ThrMet: 0.728 ± 0.178
ThrAsn: 1.588 ± 0.297
ThrPro: 3.44 ± 0.474
ThrGln: 1.918 ± 0.401
ThrArg: 3.373 ± 0.603
ThrSer: 3.307 ± 0.504
ThrThr: 3.903 ± 0.637
ThrVal: 5.424 ± 0.647
ThrTrp: 1.455 ± 0.33
ThrTyr: 1.72 ± 0.346
ThrXaa: 0.0 ± 0.0

Val
ValAla: 7.739 ± 0.854
ValCys: 0.529 ± 0.195
ValAsp: 5.424 ± 0.58
ValGlu: 5.226 ± 0.557
ValPhe: 2.447 ± 0.349
ValGly: 4.167 ± 0.561
ValHis: 1.654 ± 0.306
ValIle: 3.704 ± 0.446
ValLys: 3.109 ± 0.489
ValLeu: 5.689 ± 0.691
ValMet: 1.257 ± 0.355
ValAsn: 2.712 ± 0.394
ValPro: 4.035 ± 0.559
ValGln: 2.646 ± 0.426
ValArg: 4.829 ± 0.706
ValSer: 4.366 ± 0.466
ValThr: 6.152 ± 0.6
ValVal: 5.622 ± 0.714
ValTrp: 1.588 ± 0.354
ValTyr: 1.786 ± 0.361
ValXaa: 0.0 ± 0.0

Trp
TrpAla: 1.72 ± 0.33
TrpCys: 0.265 ± 0.125
TrpAsp: 1.588 ± 0.285
TrpGlu: 0.86 ± 0.254
TrpPhe: 0.992 ± 0.243
TrpGly: 1.918 ± 0.373
TrpHis: 0.331 ± 0.156
TrpIle: 1.191 ± 0.251
TrpLys: 0.331 ± 0.14
TrpLeu: 1.918 ± 0.318
TrpMet: 0.265 ± 0.125
TrpAsn: 0.463 ± 0.142
TrpPro: 0.728 ± 0.234
TrpGln: 0.794 ± 0.232
TrpArg: 1.389 ± 0.29
TrpSer: 1.124 ± 0.3
TrpThr: 1.984 ± 0.399
TrpVal: 1.984 ± 0.349
TrpTrp: 0.728 ± 0.271
TrpTyr: 0.331 ± 0.129
TrpXaa: 0.0 ± 0.0

Tyr
TyrAla: 2.91 ± 0.498
TyrCys: 0.198 ± 0.191
TyrAsp: 0.926 ± 0.281
TyrGlu: 2.514 ± 0.393
TyrPhe: 0.529 ± 0.183
TyrGly: 2.646 ± 0.4
TyrHis: 0.463 ± 0.151
TyrIle: 1.455 ± 0.367
TyrLys: 1.323 ± 0.254
TyrLeu: 2.183 ± 0.467
TyrMet: 0.661 ± 0.186
TyrAsn: 1.191 ± 0.265
TyrPro: 1.257 ± 0.291
TyrGln: 0.992 ± 0.268
TyrArg: 2.646 ± 0.445
TyrSer: 1.323 ± 0.264
TyrThr: 1.918 ± 0.366
TyrVal: 1.786 ± 0.351
TyrTrp: 0.463 ± 0.188
TyrTyr: 0.728 ± 0.217
TyrXaa: 0.0 ± 0.0

Xaa
XaaAla: 0.0 ± 0.0
XaaCys: 0.0 ± 0.0
XaaAsp: 0.0 ± 0.0
XaaGlu: 0.0 ± 0.0
XaaPhe: 0.0 ± 0.0
XaaGly: 0.0 ± 0.0
XaaHis: 0.0 ± 0.0
XaaIle: 0.0 ± 0.0
XaaLys: 0.0 ± 0.0
XaaLeu: 0.0 ± 0.0
XaaMet: 0.0 ± 0.0
XaaAsn: 0.0 ± 0.0
XaaPro: 0.0 ± 0.0
XaaGln: 0.0 ± 0.0
XaaArg: 0.0 ± 0.0
XaaSer: 0.0 ± 0.0
XaaThr: 0.0 ± 0.0
XaaVal: 0.0 ± 0.0
XaaTrp: 0.0 ± 0.0
XaaTyr: 0.0 ± 0.0
XaaXaa: 0.0 ± 0.0

Statistics based on 86 proteins (15119 amino acids)

Note: The error has been estimated by bootstrapping (100 replicates) at the protein level.
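Protein-level bootstrapping can be sketched as follows: resample whole proteins with replacement, recompute the frequency on each resample, and summarize the spread across replicates. This is a sketch under assumptions. The function name, the fixed seed, the toy sequences, and the use of the standard deviation as the reported error are illustrative choices; the source only states that the error was bootstrapped 100 times at the protein level.

```python
import random
import statistics

def bootstrap_dipeptide_error(sequences, pair, n_boot=100, seed=0):
    """Estimate mean and spread of one dipeptide's per-mille frequency.

    Whole proteins are resampled with replacement (protein-level bootstrap).
    """
    rng = random.Random(seed)

    def per_mille(seqs):
        hits = 0
        total = 0
        for seq in seqs:
            for i in range(len(seq) - 1):
                total += 1
                if seq[i:i + 2] == pair:
                    hits += 1
        return 1000.0 * hits / total if total else 0.0

    replicates = []
    for _ in range(n_boot):
        # Draw as many proteins as the original set, with replacement.
        sample = [rng.choice(sequences) for _ in sequences]
        replicates.append(per_mille(sample))
    # Assumption: the error is the standard deviation over replicates.
    return statistics.mean(replicates), statistics.pstdev(replicates)

# Toy proteome (hypothetical sequences, one-letter amino acid codes).
toy = ["MAAK", "AAL", "MKV", "GAAG"]
mean, err = bootstrap_dipeptide_error(toy, "AA")
```

Resampling proteins rather than individual residues keeps adjacent residues together, so each replicate still contains only dipeptides that occur in real sequences.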

The dipeptide statistics above (among other statistics for this proteome) can be downloaded from this CSV file
See this proteome in: uniprot_link
Proteome-pI is available under the Creative Commons Attribution-NoDerivs license; for more details see here

Reference: Kozlowski LP. Proteome-pI 2.0: Proteome Isoelectric Point Database Update. Nucleic Acids Res. 2021, doi: 10.1093/nar/gkab944
Contact: Lukasz P. Kozlowski