Amino acid dipeptide frequency for Mycobacterium phage Butters

Apart from single amino acid frequencies, one can also calculate the so-called amino acid dipeptide frequency.

There are 400 possible dipeptides (441 if Xaa is counted as an additional 21st amino acid representing all non-standard or unknown amino acids). Thus, if dipeptides occurred randomly in proteins, each of the 400 dipeptides would be present at a frequency of 0.25% (2.5‰). Since this is not the case in nature, for better visibility dipeptides that are more frequent than expected are marked in red in the table, and those that are underrepresented are marked in blue.
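As a rough illustration of the calculation described above, the following Python sketch counts overlapping dipeptides within each protein and compares each frequency with the uniform expectation of 1/400. The function name `dipeptide_frequencies`, the toy sequences, and the choice to map every non-standard letter to X (Xaa) are assumptions made for illustration; this is not the Proteome-pI implementation.

```python
from collections import Counter

STANDARD = "ACDEFGHIKLMNPQRSTVWY"  # one-letter codes of the 20 standard amino acids


def dipeptide_frequencies(proteins):
    """Return {dipeptide: frequency in per mille} over all proteins.

    Dipeptides are counted as overlapping pairs within each protein;
    pairs never span two proteins. Non-standard letters become 'X' (Xaa).
    """
    counts = Counter()
    for seq in proteins:
        cleaned = "".join(c if c in STANDARD else "X" for c in seq.upper())
        counts.update(cleaned[i:i + 2] for i in range(len(cleaned) - 1))
    total = sum(counts.values())
    if total == 0:
        return {}
    return {pair: 1000.0 * n / total for pair, n in counts.items()}


# Uniform expectation: 1 / 400 = 0.25 % = 2.5 per mille
EXPECTED_PER_MILLE = 1000.0 / len(STANDARD) ** 2

# Toy input (hypothetical sequences, not taken from the phage proteome)
freqs = dipeptide_frequencies(["AAGA", "GAA"])
over = sorted(p for p, f in freqs.items() if f > EXPECTED_PER_MILLE)
print(freqs, over)
```

With real data, the list of proteins would be read from the proteome FASTA file, and dipeptides below `EXPECTED_PER_MILLE` would be the underrepresented ones.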

All values are given in per mille (‰) and therefore need to be multiplied by 10^-3 to obtain fractions.
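For example, the AlaAla value of 17.912‰ in the table corresponds to a fraction of 0.017912, i.e. about 1.79%. A minimal Python sketch of the conversion follows; the helper name `per_mille_to_fraction` is hypothetical, and the comparison against 2.5‰ assumes the uniform expectation over 400 dipeptides stated earlier.

```python
def per_mille_to_fraction(value_per_mille):
    """Convert a per mille value to a plain fraction (multiply by 10^-3)."""
    return value_per_mille * 1e-3


# 2.5 per mille if all 400 dipeptides were equally likely (assumption from the text)
UNIFORM_PER_MILLE = 1000.0 / 400

ala_ala = 17.912  # AlaAla frequency from the table, in per mille
fraction = per_mille_to_fraction(ala_ala)
fold = ala_ala / UNIFORM_PER_MILLE  # ratio to the uniform expectation

print(f"fraction = {fraction:.6f}, fold over uniform = {fold:.2f}")
```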

For more information, see the sequence space article on Wikipedia.

Amino acid order (first and second residue): Ala Cys Asp Glu Phe Gly His Ile Lys Leu Met Asn Pro Gln Arg Ser Thr Val Trp Tyr Xaa
Each entry reads: dipeptide (first residue, second residue): frequency ± error, in ‰

Ala
AlaAla: 17.912 ± 2.319
AlaCys: 1.129 ± 0.321
AlaAsp: 7.376 ± 0.811
AlaGlu: 8.053 ± 1.13
AlaPhe: 3.387 ± 0.559
AlaGly: 9.558 ± 1.149
AlaHis: 2.032 ± 0.36
AlaIle: 5.87 ± 0.848
AlaLys: 3.763 ± 0.587
AlaLeu: 8.956 ± 0.794
AlaMet: 2.484 ± 0.563
AlaAsn: 3.462 ± 0.539
AlaPro: 7.15 ± 0.676
AlaGln: 4.516 ± 0.747
AlaArg: 8.505 ± 0.961
AlaSer: 6.924 ± 0.81
AlaThr: 6.247 ± 0.653
AlaVal: 7.677 ± 0.651
AlaTrp: 2.333 ± 0.545
AlaTyr: 2.408 ± 0.435
AlaXaa: 0.0 ± 0.0

Cys
CysAla: 1.43 ± 0.341
CysCys: 0.075 ± 0.078
CysAsp: 0.828 ± 0.311
CysGlu: 0.527 ± 0.196
CysPhe: 0.0 ± 0.0
CysGly: 1.279 ± 0.341
CysHis: 0.226 ± 0.152
CysIle: 0.075 ± 0.07
CysLys: 0.226 ± 0.132
CysLeu: 0.828 ± 0.23
CysMet: 0.151 ± 0.106
CysAsn: 0.301 ± 0.169
CysPro: 0.978 ± 0.299
CysGln: 0.301 ± 0.157
CysArg: 0.828 ± 0.24
CysSer: 0.527 ± 0.184
CysThr: 0.753 ± 0.233
CysVal: 0.753 ± 0.258
CysTrp: 0.075 ± 0.081
CysTyr: 0.0 ± 0.0
CysXaa: 0.0 ± 0.0

Asp
AspAla: 7.225 ± 0.853
AspCys: 0.903 ± 0.244
AspAsp: 5.193 ± 0.781
AspGlu: 5.268 ± 0.671
AspPhe: 1.505 ± 0.325
AspGly: 7.902 ± 0.791
AspHis: 1.129 ± 0.294
AspIle: 2.408 ± 0.375
AspLys: 1.806 ± 0.368
AspLeu: 5.795 ± 0.571
AspMet: 1.204 ± 0.256
AspAsn: 1.58 ± 0.425
AspPro: 3.838 ± 0.559
AspGln: 2.107 ± 0.326
AspArg: 4.139 ± 0.608
AspSer: 3.387 ± 0.665
AspThr: 2.484 ± 0.37
AspVal: 3.537 ± 0.575
AspTrp: 0.978 ± 0.292
AspTyr: 1.656 ± 0.287
AspXaa: 0.0 ± 0.0

Glu
GluAla: 6.171 ± 0.537
GluCys: 0.828 ± 0.22
GluAsp: 3.236 ± 0.428
GluGlu: 2.785 ± 0.517
GluPhe: 2.484 ± 0.427
GluGly: 3.161 ± 0.629
GluHis: 1.355 ± 0.354
GluIle: 3.236 ± 0.46
GluLys: 1.957 ± 0.351
GluLeu: 5.87 ± 0.688
GluMet: 1.806 ± 0.423
GluAsn: 1.806 ± 0.296
GluPro: 2.559 ± 0.429
GluGln: 2.86 ± 0.462
GluArg: 4.44 ± 0.584
GluSer: 2.785 ± 0.492
GluThr: 4.741 ± 0.657
GluVal: 4.139 ± 0.608
GluTrp: 1.505 ± 0.283
GluTyr: 1.355 ± 0.299
GluXaa: 0.0 ± 0.0

Phe
PheAla: 4.064 ± 0.639
PheCys: 0.226 ± 0.127
PheAsp: 2.107 ± 0.503
PheGlu: 1.43 ± 0.322
PhePhe: 0.978 ± 0.311
PheGly: 3.161 ± 0.603
PheHis: 0.828 ± 0.265
PheIle: 1.129 ± 0.278
PheLys: 0.903 ± 0.244
PheLeu: 1.731 ± 0.408
PheMet: 0.753 ± 0.237
PheAsn: 1.355 ± 0.285
PhePro: 1.204 ± 0.31
PheGln: 0.602 ± 0.228
PheArg: 1.806 ± 0.415
PheSer: 1.43 ± 0.297
PheThr: 2.107 ± 0.393
PheVal: 2.032 ± 0.445
PheTrp: 0.452 ± 0.16
PheTyr: 0.828 ± 0.217
PheXaa: 0.0 ± 0.0

Gly
GlyAla: 8.505 ± 1.085
GlyCys: 0.978 ± 0.371
GlyAsp: 5.268 ± 0.733
GlyGlu: 4.215 ± 0.645
GlyPhe: 2.86 ± 0.506
GlyGly: 12.87 ± 2.097
GlyHis: 1.279 ± 0.298
GlyIle: 4.741 ± 0.827
GlyLys: 1.882 ± 0.311
GlyLeu: 7.526 ± 0.651
GlyMet: 1.505 ± 0.337
GlyAsn: 3.387 ± 0.467
GlyPro: 4.516 ± 0.826
GlyGln: 3.763 ± 0.461
GlyArg: 6.171 ± 0.554
GlySer: 6.774 ± 0.952
GlyThr: 6.698 ± 0.61
GlyVal: 6.774 ± 0.776
GlyTrp: 2.032 ± 0.35
GlyTyr: 3.086 ± 0.532
GlyXaa: 0.0 ± 0.0

His
HisAla: 1.806 ± 0.374
HisCys: 0.226 ± 0.143
HisAsp: 1.355 ± 0.267
HisGlu: 1.129 ± 0.319
HisPhe: 0.376 ± 0.152
HisGly: 1.731 ± 0.324
HisHis: 0.978 ± 0.318
HisIle: 1.505 ± 0.326
HisLys: 0.602 ± 0.192
HisLeu: 1.505 ± 0.347
HisMet: 0.376 ± 0.178
HisAsn: 0.301 ± 0.189
HisPro: 1.054 ± 0.319
HisGln: 0.753 ± 0.246
HisArg: 2.032 ± 0.45
HisSer: 0.527 ± 0.225
HisThr: 1.279 ± 0.27
HisVal: 1.129 ± 0.293
HisTrp: 0.452 ± 0.16
HisTyr: 0.602 ± 0.191
HisXaa: 0.0 ± 0.0

Ile
IleAla: 6.171 ± 0.73
IleCys: 0.151 ± 0.112
IleAsp: 3.989 ± 0.55
IleGlu: 3.537 ± 0.427
IlePhe: 0.978 ± 0.27
IleGly: 4.741 ± 0.524
IleHis: 0.978 ± 0.256
IleIle: 1.355 ± 0.265
IleLys: 1.204 ± 0.3
IleLeu: 2.935 ± 0.384
IleMet: 0.452 ± 0.174
IleAsn: 1.957 ± 0.332
IlePro: 2.935 ± 0.425
IleGln: 1.129 ± 0.235
IleArg: 3.387 ± 0.412
IleSer: 2.183 ± 0.39
IleThr: 3.462 ± 0.534
IleVal: 3.613 ± 0.508
IleTrp: 0.527 ± 0.231
IleTyr: 0.903 ± 0.192
IleXaa: 0.0 ± 0.0

Lys
LysAla: 4.666 ± 0.992
LysCys: 0.151 ± 0.106
LysAsp: 1.054 ± 0.256
LysGlu: 1.279 ± 0.307
LysPhe: 0.828 ± 0.249
LysGly: 2.032 ± 0.439
LysHis: 0.226 ± 0.108
LysIle: 1.43 ± 0.291
LysLys: 0.677 ± 0.309
LysLeu: 2.032 ± 0.382
LysMet: 0.828 ± 0.247
LysAsn: 0.527 ± 0.195
LysPro: 2.709 ± 0.444
LysGln: 1.43 ± 0.383
LysArg: 2.107 ± 0.359
LysSer: 1.731 ± 0.39
LysThr: 1.505 ± 0.389
LysVal: 1.957 ± 0.369
LysTrp: 0.452 ± 0.174
LysTyr: 0.978 ± 0.282
LysXaa: 0.0 ± 0.0

Leu
LeuAla: 10.537 ± 1.031
LeuCys: 0.602 ± 0.229
LeuAsp: 5.193 ± 0.638
LeuGlu: 4.064 ± 0.368
LeuPhe: 3.537 ± 0.537
LeuGly: 7.978 ± 0.785
LeuHis: 1.731 ± 0.394
LeuIle: 3.688 ± 0.474
LeuLys: 2.559 ± 0.439
LeuLeu: 6.472 ± 0.804
LeuMet: 1.505 ± 0.377
LeuAsn: 2.484 ± 0.548
LeuPro: 3.763 ± 0.591
LeuGln: 2.032 ± 0.376
LeuArg: 5.72 ± 0.831
LeuSer: 4.44 ± 0.544
LeuThr: 5.645 ± 0.768
LeuVal: 5.569 ± 0.556
LeuTrp: 1.204 ± 0.315
LeuTyr: 1.58 ± 0.325
LeuXaa: 0.0 ± 0.0

Met
MetAla: 3.086 ± 0.556
MetCys: 0.075 ± 0.086
MetAsp: 0.903 ± 0.264
MetGlu: 0.677 ± 0.272
MetPhe: 0.828 ± 0.229
MetGly: 0.903 ± 0.234
MetHis: 0.301 ± 0.14
MetIle: 0.903 ± 0.265
MetLys: 0.677 ± 0.207
MetLeu: 1.656 ± 0.342
MetMet: 0.527 ± 0.216
MetAsn: 0.677 ± 0.221
MetPro: 0.978 ± 0.212
MetGln: 0.602 ± 0.199
MetArg: 1.355 ± 0.313
MetSer: 1.882 ± 0.291
MetThr: 2.634 ± 0.557
MetVal: 0.903 ± 0.211
MetTrp: 0.828 ± 0.324
MetTyr: 0.226 ± 0.126
MetXaa: 0.0 ± 0.0

Asn
AsnAla: 3.236 ± 0.503
AsnCys: 0.301 ± 0.14
AsnAsp: 1.882 ± 0.428
AsnGlu: 0.828 ± 0.26
AsnPhe: 0.753 ± 0.241
AsnGly: 4.44 ± 0.578
AsnHis: 0.151 ± 0.104
AsnIle: 1.957 ± 0.401
AsnLys: 0.753 ± 0.286
AsnLeu: 2.333 ± 0.431
AsnMet: 0.301 ± 0.155
AsnAsn: 0.978 ± 0.249
AsnPro: 2.183 ± 0.373
AsnGln: 0.828 ± 0.322
AsnArg: 2.634 ± 0.449
AsnSer: 1.58 ± 0.374
AsnThr: 1.505 ± 0.332
AsnVal: 1.957 ± 0.295
AsnTrp: 0.602 ± 0.179
AsnTyr: 0.677 ± 0.279
AsnXaa: 0.0 ± 0.0

Pro
ProAla: 7.3 ± 0.886
ProCys: 0.602 ± 0.2
ProAsp: 4.139 ± 0.505
ProGlu: 4.591 ± 0.549
ProPhe: 2.258 ± 0.395
ProGly: 6.397 ± 0.813
ProHis: 0.903 ± 0.253
ProIle: 2.107 ± 0.402
ProLys: 1.279 ± 0.274
ProLeu: 4.516 ± 0.75
ProMet: 1.58 ± 0.439
ProAsn: 1.806 ± 0.345
ProPro: 3.613 ± 0.588
ProGln: 1.957 ± 0.336
ProArg: 4.44 ± 0.659
ProSer: 2.484 ± 0.451
ProThr: 3.312 ± 0.475
ProVal: 4.139 ± 0.526
ProTrp: 1.279 ± 0.361
ProTyr: 1.279 ± 0.409
ProXaa: 0.0 ± 0.0

Gln
GlnAla: 4.29 ± 0.924
GlnCys: 0.226 ± 0.127
GlnAsp: 1.58 ± 0.357
GlnGlu: 1.279 ± 0.363
GlnPhe: 0.753 ± 0.283
GlnGly: 1.505 ± 0.272
GlnHis: 0.978 ± 0.248
GlnIle: 2.408 ± 0.4
GlnLys: 1.054 ± 0.297
GlnLeu: 4.139 ± 0.645
GlnMet: 0.753 ± 0.231
GlnAsn: 0.376 ± 0.151
GlnPro: 2.559 ± 0.419
GlnGln: 1.806 ± 0.419
GlnArg: 3.161 ± 0.484
GlnSer: 1.656 ± 0.292
GlnThr: 2.258 ± 0.385
GlnVal: 2.408 ± 0.345
GlnTrp: 0.828 ± 0.242
GlnTyr: 0.903 ± 0.267
GlnXaa: 0.0 ± 0.0

Arg
ArgAla: 6.774 ± 0.728
ArgCys: 1.129 ± 0.358
ArgAsp: 4.967 ± 0.522
ArgGlu: 4.741 ± 0.694
ArgPhe: 1.882 ± 0.336
ArgGly: 5.795 ± 0.657
ArgHis: 1.957 ± 0.352
ArgIle: 3.312 ± 0.44
ArgLys: 2.333 ± 0.45
ArgLeu: 5.946 ± 0.639
ArgMet: 1.882 ± 0.514
ArgAsn: 2.258 ± 0.398
ArgPro: 4.666 ± 0.51
ArgGln: 2.559 ± 0.651
ArgArg: 6.623 ± 0.941
ArgSer: 3.462 ± 0.571
ArgThr: 3.914 ± 0.629
ArgVal: 4.741 ± 0.497
ArgTrp: 1.204 ± 0.275
ArgTyr: 1.505 ± 0.363
ArgXaa: 0.0 ± 0.0

Ser
SerAla: 6.924 ± 0.806
SerCys: 0.452 ± 0.17
SerAsp: 3.688 ± 0.455
SerGlu: 1.882 ± 0.348
SerPhe: 1.43 ± 0.282
SerGly: 6.397 ± 0.798
SerHis: 1.355 ± 0.334
SerIle: 1.882 ± 0.353
SerLys: 1.957 ± 0.375
SerLeu: 3.763 ± 0.551
SerMet: 2.107 ± 0.328
SerAsn: 0.978 ± 0.442
SerPro: 3.613 ± 0.477
SerGln: 1.731 ± 0.344
SerArg: 3.387 ± 0.507
SerSer: 3.236 ± 0.546
SerThr: 4.064 ± 0.579
SerVal: 3.236 ± 0.447
SerTrp: 0.903 ± 0.266
SerTyr: 1.279 ± 0.264
SerXaa: 0.0 ± 0.0

Thr
ThrAla: 8.128 ± 0.573
ThrCys: 0.376 ± 0.177
ThrAsp: 3.688 ± 0.59
ThrGlu: 4.064 ± 0.629
ThrPhe: 1.731 ± 0.461
ThrGly: 6.322 ± 0.818
ThrHis: 1.129 ± 0.322
ThrIle: 3.387 ± 0.663
ThrLys: 1.806 ± 0.346
ThrLeu: 5.569 ± 0.784
ThrMet: 1.054 ± 0.259
ThrAsn: 1.355 ± 0.274
ThrPro: 5.72 ± 0.696
ThrGln: 1.731 ± 0.368
ThrArg: 3.236 ± 0.455
ThrSer: 3.462 ± 0.585
ThrThr: 3.688 ± 0.659
ThrVal: 6.171 ± 0.961
ThrTrp: 0.677 ± 0.241
ThrTyr: 1.731 ± 0.367
ThrXaa: 0.0 ± 0.0

Val
ValAla: 7.978 ± 0.801
ValCys: 1.054 ± 0.299
ValAsp: 4.666 ± 0.69
ValGlu: 6.171 ± 0.82
ValPhe: 1.43 ± 0.379
ValGly: 5.118 ± 0.531
ValHis: 1.129 ± 0.274
ValIle: 3.537 ± 0.608
ValLys: 2.258 ± 0.546
ValLeu: 4.892 ± 0.543
ValMet: 0.753 ± 0.282
ValAsn: 2.785 ± 0.491
ValPro: 4.139 ± 0.544
ValGln: 2.408 ± 0.34
ValArg: 4.666 ± 0.522
ValSer: 3.688 ± 0.454
ValThr: 5.569 ± 0.681
ValVal: 5.268 ± 0.708
ValTrp: 0.903 ± 0.178
ValTyr: 1.505 ± 0.321
ValXaa: 0.0 ± 0.0

Trp
TrpAla: 1.957 ± 0.416
TrpCys: 0.376 ± 0.164
TrpAsp: 1.204 ± 0.286
TrpGlu: 0.527 ± 0.213
TrpPhe: 0.602 ± 0.246
TrpGly: 1.054 ± 0.312
TrpHis: 0.602 ± 0.207
TrpIle: 0.753 ± 0.245
TrpLys: 0.301 ± 0.138
TrpLeu: 1.882 ± 0.465
TrpMet: 0.151 ± 0.128
TrpAsn: 0.602 ± 0.208
TrpPro: 0.978 ± 0.298
TrpGln: 0.828 ± 0.224
TrpArg: 1.129 ± 0.274
TrpSer: 1.054 ± 0.236
TrpThr: 1.204 ± 0.354
TrpVal: 1.882 ± 0.302
TrpTrp: 0.677 ± 0.175
TrpTyr: 0.301 ± 0.135
TrpXaa: 0.0 ± 0.0

Tyr
TyrAla: 1.58 ± 0.36
TyrCys: 0.301 ± 0.14
TyrAsp: 1.731 ± 0.371
TyrGlu: 2.032 ± 0.404
TyrPhe: 0.527 ± 0.188
TyrGly: 2.484 ± 0.408
TyrHis: 0.452 ± 0.173
TyrIle: 0.903 ± 0.307
TyrLys: 0.602 ± 0.189
TyrLeu: 1.957 ± 0.337
TyrMet: 0.376 ± 0.167
TyrAsn: 0.903 ± 0.216
TyrPro: 0.828 ± 0.221
TyrGln: 0.978 ± 0.229
TyrArg: 1.882 ± 0.382
TyrSer: 1.054 ± 0.301
TyrThr: 1.957 ± 0.326
TyrVal: 2.032 ± 0.542
TyrTrp: 0.226 ± 0.122
TyrTyr: 0.677 ± 0.199
TyrXaa: 0.0 ± 0.0

Xaa
XaaAla: 0.0 ± 0.0
XaaCys: 0.0 ± 0.0
XaaAsp: 0.0 ± 0.0
XaaGlu: 0.0 ± 0.0
XaaPhe: 0.0 ± 0.0
XaaGly: 0.0 ± 0.0
XaaHis: 0.0 ± 0.0
XaaIle: 0.0 ± 0.0
XaaLys: 0.0 ± 0.0
XaaLeu: 0.0 ± 0.0
XaaMet: 0.0 ± 0.0
XaaAsn: 0.0 ± 0.0
XaaPro: 0.0 ± 0.0
XaaGln: 0.0 ± 0.0
XaaArg: 0.0 ± 0.0
XaaSer: 0.0 ± 0.0
XaaThr: 0.0 ± 0.0
XaaVal: 0.0 ± 0.0
XaaTrp: 0.0 ± 0.0
XaaTyr: 0.0 ± 0.0
XaaXaa: 0.0 ± 0.0

Statistics based on 66 proteins (13288 amino acids)

Note: The error has been estimated by bootstrapping (100 replicates) at the protein level.
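The note above does not spell out the procedure. One plausible reading is that whole proteins are resampled with replacement 100 times and the spread of each dipeptide frequency across the replicates is reported as the error. The Python sketch below follows that assumption; the function names, the fixed seed, the toy sequences, and the use of the population standard deviation are illustrative choices, not the database's documented method.

```python
import random
import statistics
from collections import Counter


def per_mille(proteins, pair):
    """Frequency of one dipeptide (per mille) over overlapping pairs within proteins."""
    counts = Counter()
    for seq in proteins:
        counts.update(seq[i:i + 2] for i in range(len(seq) - 1))
    total = sum(counts.values())
    return 1000.0 * counts[pair] / total if total else 0.0


def bootstrap_error(proteins, pair, n_rounds=100, seed=0):
    """Resample whole proteins with replacement; return the spread of the frequency."""
    rng = random.Random(seed)
    estimates = []
    for _ in range(n_rounds):
        # Each replicate draws as many proteins as the original set contains
        sample = rng.choices(proteins, k=len(proteins))
        estimates.append(per_mille(sample, pair))
    return statistics.pstdev(estimates)


# Hypothetical toy proteome; real input would be the 66 phage proteins
toy = ["MAAGL", "MGGAA", "MKLAA", "MAAAA"]
print(per_mille(toy, "AA"), bootstrap_error(toy, "AA"))
```

Resampling proteins rather than individual dipeptides keeps the pairs from one protein together, so the estimate reflects variation between proteins.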

The dipeptide statistics above (among other statistics for this proteome) can be downloaded from this CSV file
See this proteome in: uniprot_link
Proteome-pI is available under Creative Commons Attribution-NoDerivs license, for more details see here

Reference: Kozlowski LP. Proteome-pI 2.0: Proteome Isoelectric Point Database Update. Nucleic Acids Res. 2021, doi: 10.1093/nar/gkab944
Contact: Lukasz P. Kozlowski