Amino acid dipeptide frequency for Mycobacterium phage Fulbright

Apart from single amino acid frequencies, one can also calculate the so-called amino acid dipeptide frequency.

There are 400 possible dipeptides (441 if we consider Xaa as an additional 21st amino acid representing all non-standard or unknown amino acids). Thus, if dipeptides occurred randomly in proteins, each of the 400 standard dipeptides would be present at a frequency of 0.25% (2.5‰). As this is not the case in nature, for better visibility all overrepresented dipeptides are marked in red, and those that are underrepresented are marked in blue in the table.
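As an illustration of how such frequencies can be computed, here is a minimal Python sketch. It assumes overlapping dipeptides counted within each protein, every residue outside the 20 standard ones mapped to Xaa (X), and normalisation by the total number of dipeptides; the function names are hypothetical, and the database's own counting and normalisation may differ.

```python
from collections import Counter
from itertools import product

STANDARD = "ACDEFGHIKLMNPQRSTVWY"  # 20 standard residues, one-letter codes


def dipeptide_counts(proteins):
    """Count overlapping dipeptides within each protein (never across proteins)."""
    counts = Counter()
    for seq in proteins:
        # Assumption: anything outside the 20 standard residues becomes X (Xaa).
        seq = "".join(aa if aa in STANDARD else "X" for aa in seq.upper())
        counts.update(seq[i:i + 2] for i in range(len(seq) - 1))
    return counts


def dipeptide_per_mille(proteins):
    """Return the frequency of all 441 possible dipeptides in per mille."""
    counts = dipeptide_counts(proteins)
    total = sum(counts.values())
    alphabet = STANDARD + "X"
    return {
        a + b: (1000.0 * counts[a + b] / total if total else 0.0)
        for a, b in product(alphabet, repeat=2)
    }


# Toy input for illustration only, not the Fulbright proteome.
freqs = dipeptide_per_mille(["MAAG", "AAXA"])
uniform = 1000.0 / 400  # 2.5 per mille (0.25%) for each of the 400 standard pairs
print(len(freqs), round(freqs["AA"], 3), uniform)
```

The dictionary always holds 441 keys, so dipeptides that never occur (for example everything involving Xaa in this proteome) appear with a frequency of 0.0.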

All values are given in per mille (‰), so they must be multiplied by 10^-3 to obtain fractions.
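A short Python sketch of the unit conversion, using the AlaAla and CysPhe values from the table. The ratio to the uniform expectation of 2.5‰ is shown only as an illustration; the helper names are hypothetical, and the exact rule the page uses to colour cells red or blue is not stated here.

```python
UNIFORM_PER_MILLE = 1000.0 / 400  # 2.5 per mille for each of the 400 standard dipeptides


def per_mille_to_fraction(value):
    """Convert a per mille value to a plain fraction (multiply by 10^-3)."""
    return value * 1e-3


def ratio_to_uniform(value_per_mille):
    """How many times more (or less) frequent a dipeptide is than the uniform expectation."""
    return value_per_mille / UNIFORM_PER_MILLE


ala_ala = 19.311  # AlaAla, per mille
cys_phe = 0.073   # CysPhe, per mille

print(round(per_mille_to_fraction(ala_ala), 6))  # fraction of all dipeptides
print(round(ratio_to_uniform(ala_ala), 2))       # above 1: more frequent than uniform
print(round(ratio_to_uniform(cys_phe), 3))       # below 1: less frequent than uniform
```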

For more information, see the sequence space article on Wikipedia.

Rows give the first amino acid of the dipeptide and columns the second. Each cell shows the frequency ± estimated error, in ‰.

| 1st \ 2nd | Ala | Cys | Asp | Glu | Phe | Gly | His | Ile | Lys | Leu | Met | Asn | Pro | Gln | Arg | Ser | Thr | Val | Trp | Tyr | Xaa |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Ala | 19.311 ± 2.673 | 0.731 ± 0.264 | 8.851 ± 0.725 | 7.68 ± 0.955 | 2.999 ± 0.508 | 9.143 ± 1.318 | 2.487 ± 0.556 | 5.486 ± 0.654 | 3.292 ± 0.535 | 9.802 ± 0.747 | 2.048 ± 0.379 | 3.145 ± 0.611 | 7.242 ± 0.89 | 5.559 ± 0.751 | 7.68 ± 0.735 | 6.583 ± 0.717 | 8.778 ± 0.788 | 7.022 ± 0.695 | 1.975 ± 0.466 | 1.902 ± 0.311 | 0.0 ± 0.0 |
| Cys | 1.463 ± 0.364 | 0.366 ± 0.202 | 1.097 ± 0.28 | 0.585 ± 0.194 | 0.073 ± 0.072 | 1.317 ± 0.35 | 0.219 ± 0.131 | 0.073 ± 0.071 | 0.219 ± 0.135 | 0.439 ± 0.173 | 0.293 ± 0.135 | 0.293 ± 0.184 | 1.097 ± 0.391 | 0.439 ± 0.163 | 1.317 ± 0.39 | 0.439 ± 0.161 | 0.878 ± 0.271 | 0.731 ± 0.231 | 0.219 ± 0.136 | 0.219 ± 0.127 | 0.0 ± 0.0 |
| Asp | 7.973 ± 0.683 | 0.805 ± 0.258 | 5.047 ± 0.793 | 5.193 ± 0.71 | 1.902 ± 0.337 | 6.949 ± 0.807 | 1.682 ± 0.298 | 2.999 ± 0.538 | 1.609 ± 0.427 | 5.779 ± 0.574 | 1.317 ± 0.293 | 2.048 ± 0.429 | 4.389 ± 0.572 | 2.048 ± 0.402 | 4.608 ± 0.708 | 3.292 ± 0.534 | 3.657 ± 0.487 | 3.731 ± 0.514 | 1.024 ± 0.287 | 1.317 ± 0.339 | 0.0 ± 0.0 |
| Glu | 5.779 ± 0.569 | 1.097 ± 0.298 | 3.804 ± 0.64 | 2.194 ± 0.477 | 2.341 ± 0.441 | 2.78 ± 0.565 | 1.097 ± 0.275 | 3.218 ± 0.478 | 2.341 ± 0.409 | 5.12 ± 0.634 | 1.829 ± 0.488 | 1.975 ± 0.264 | 3.072 ± 0.563 | 2.633 ± 0.407 | 4.535 ± 0.673 | 2.926 ± 0.459 | 3.877 ± 0.501 | 4.023 ± 0.469 | 0.878 ± 0.28 | 1.536 ± 0.303 | 0.0 ± 0.0 |
| Phe | 3.511 ± 0.505 | 0.439 ± 0.208 | 1.756 ± 0.528 | 1.39 ± 0.34 | 0.805 ± 0.293 | 3.145 ± 0.471 | 0.878 ± 0.275 | 1.244 ± 0.276 | 0.805 ± 0.178 | 1.536 ± 0.279 | 0.512 ± 0.245 | 0.878 ± 0.275 | 1.17 ± 0.337 | 0.658 ± 0.202 | 1.902 ± 0.29 | 1.317 ± 0.263 | 2.268 ± 0.424 | 2.341 ± 0.476 | 0.512 ± 0.165 | 1.024 ± 0.271 | 0.0 ± 0.0 |
| Gly | 7.9 ± 1.09 | 1.317 ± 0.375 | 5.34 ± 0.611 | 3.657 ± 0.543 | 2.414 ± 0.541 | 9.948 ± 1.63 | 1.609 ± 0.351 | 4.535 ± 0.711 | 1.902 ± 0.363 | 6.51 ± 0.57 | 1.975 ± 0.401 | 2.999 ± 0.556 | 4.608 ± 0.683 | 3.584 ± 0.415 | 5.852 ± 0.552 | 5.559 ± 0.764 | 5.925 ± 0.617 | 6.876 ± 0.71 | 1.829 ± 0.369 | 3.072 ± 0.626 | 0.0 ± 0.0 |
| His | 1.756 ± 0.396 | 0.293 ± 0.133 | 1.536 ± 0.345 | 1.317 ± 0.375 | 0.146 ± 0.107 | 1.902 ± 0.462 | 1.024 ± 0.262 | 0.731 ± 0.236 | 0.658 ± 0.215 | 1.682 ± 0.51 | 0.146 ± 0.095 | 0.439 ± 0.166 | 1.17 ± 0.345 | 1.244 ± 0.274 | 2.194 ± 0.399 | 0.512 ± 0.207 | 1.609 ± 0.315 | 1.244 ± 0.311 | 0.439 ± 0.158 | 0.878 ± 0.233 | 0.0 ± 0.0 |
| Ile | 5.925 ± 0.58 | 0.219 ± 0.121 | 3.804 ± 0.557 | 3.877 ± 0.54 | 1.024 ± 0.298 | 4.316 ± 0.602 | 0.951 ± 0.27 | 1.317 ± 0.256 | 1.097 ± 0.335 | 2.194 ± 0.452 | 0.439 ± 0.166 | 2.048 ± 0.331 | 2.414 ± 0.45 | 1.17 ± 0.323 | 3.584 ± 0.579 | 2.487 ± 0.502 | 3.584 ± 0.563 | 2.999 ± 0.525 | 0.878 ± 0.205 | 0.731 ± 0.172 | 0.0 ± 0.0 |
| Lys | 4.608 ± 1.02 | 0.219 ± 0.143 | 1.244 ± 0.302 | 0.878 ± 0.208 | 0.805 ± 0.208 | 2.414 ± 0.439 | 0.439 ± 0.17 | 1.609 ± 0.304 | 0.293 ± 0.144 | 1.902 ± 0.372 | 0.878 ± 0.23 | 0.658 ± 0.213 | 2.121 ± 0.479 | 1.097 ± 0.316 | 2.706 ± 0.505 | 1.536 ± 0.332 | 1.829 ± 0.382 | 2.268 ± 0.406 | 0.439 ± 0.191 | 1.17 ± 0.297 | 0.0 ± 0.0 |
| Leu | 10.168 ± 0.94 | 0.805 ± 0.283 | 5.413 ± 0.539 | 3.218 ± 0.354 | 2.414 ± 0.439 | 7.827 ± 0.782 | 1.609 ± 0.333 | 3.292 ± 0.474 | 2.633 ± 0.42 | 6.218 ± 0.643 | 1.097 ± 0.316 | 2.56 ± 0.541 | 3.877 ± 0.752 | 2.853 ± 0.511 | 5.486 ± 0.812 | 4.535 ± 0.535 | 4.974 ± 0.665 | 5.486 ± 0.626 | 1.024 ± 0.231 | 1.317 ± 0.288 | 0.0 ± 0.0 |
| Met | 3.218 ± 0.513 | 0.146 ± 0.151 | 0.731 ± 0.239 | 0.658 ± 0.259 | 0.951 ± 0.294 | 0.805 ± 0.234 | 0.146 ± 0.115 | 1.097 ± 0.298 | 0.658 ± 0.246 | 1.317 ± 0.33 | 0.439 ± 0.156 | 0.731 ± 0.258 | 1.463 ± 0.288 | 0.512 ± 0.198 | 1.097 ± 0.291 | 2.633 ± 0.389 | 1.975 ± 0.43 | 1.024 ± 0.297 | 0.805 ± 0.298 | 0.366 ± 0.152 | 0.0 ± 0.0 |
| Asn | 3.877 ± 0.703 | 0.366 ± 0.15 | 1.902 ± 0.379 | 1.024 ± 0.254 | 1.024 ± 0.262 | 3.511 ± 0.624 | 0.658 ± 0.285 | 1.463 ± 0.368 | 0.658 ± 0.243 | 2.853 ± 0.479 | 0.366 ± 0.131 | 1.463 ± 0.264 | 2.487 ± 0.346 | 1.244 ± 0.343 | 1.536 ± 0.347 | 1.317 ± 0.283 | 1.756 ± 0.433 | 1.975 ± 0.281 | 0.658 ± 0.2 | 0.512 ± 0.243 | 0.0 ± 0.0 |
| Pro | 6.364 ± 0.929 | 0.585 ± 0.211 | 5.193 ± 0.509 | 4.535 ± 0.694 | 1.902 ± 0.328 | 5.779 ± 0.901 | 1.097 ± 0.262 | 2.268 ± 0.429 | 1.17 ± 0.367 | 3.292 ± 0.479 | 1.463 ± 0.466 | 1.682 ± 0.392 | 3.145 ± 0.503 | 2.268 ± 0.398 | 3.218 ± 0.431 | 2.706 ± 0.472 | 3.438 ± 0.511 | 4.462 ± 0.453 | 1.536 ± 0.321 | 1.39 ± 0.393 | 0.0 ± 0.0 |
| Gln | 5.34 ± 1.015 | 0.366 ± 0.133 | 1.609 ± 0.484 | 1.609 ± 0.31 | 0.805 ± 0.207 | 1.609 ± 0.306 | 1.244 ± 0.355 | 2.341 ± 0.397 | 1.682 ± 0.386 | 4.608 ± 0.636 | 0.731 ± 0.201 | 0.512 ± 0.195 | 2.121 ± 0.508 | 1.756 ± 0.434 | 3.292 ± 0.637 | 1.756 ± 0.369 | 2.414 ± 0.364 | 2.999 ± 0.512 | 0.658 ± 0.235 | 0.805 ± 0.281 | 0.0 ± 0.0 |
| Arg | 6.437 ± 0.747 | 1.244 ± 0.359 | 4.901 ± 0.672 | 4.681 ± 0.703 | 1.756 ± 0.375 | 4.316 ± 0.626 | 2.048 ± 0.495 | 3.365 ± 0.522 | 2.194 ± 0.392 | 5.486 ± 0.661 | 2.268 ± 0.53 | 1.682 ± 0.275 | 4.023 ± 0.563 | 3.365 ± 0.632 | 5.998 ± 0.863 | 3.292 ± 0.451 | 4.681 ± 0.818 | 5.12 ± 0.58 | 1.682 ± 0.353 | 1.682 ± 0.354 | 0.0 ± 0.0 |
| Ser | 7.095 ± 0.778 | 0.439 ± 0.207 | 3.657 ± 0.509 | 1.975 ± 0.421 | 1.682 ± 0.319 | 5.193 ± 0.609 | 0.658 ± 0.208 | 1.829 ± 0.38 | 1.463 ± 0.305 | 3.438 ± 0.544 | 2.048 ± 0.321 | 1.536 ± 0.354 | 2.853 ± 0.379 | 1.829 ± 0.336 | 3.145 ± 0.429 | 3.072 ± 0.41 | 3.511 ± 0.389 | 4.462 ± 0.701 | 0.878 ± 0.246 | 1.609 ± 0.368 | 0.0 ± 0.0 |
| Thr | 8.266 ± 0.76 | 0.805 ± 0.227 | 4.243 ± 0.569 | 3.804 ± 0.494 | 2.121 ± 0.371 | 7.095 ± 0.93 | 0.951 ± 0.278 | 2.926 ± 0.425 | 2.706 ± 0.395 | 5.925 ± 0.576 | 1.317 ± 0.275 | 1.536 ± 0.362 | 4.828 ± 0.609 | 2.268 ± 0.354 | 4.535 ± 0.606 | 2.853 ± 0.594 | 4.755 ± 0.591 | 6.291 ± 0.86 | 0.805 ± 0.27 | 1.463 ± 0.258 | 0.0 ± 0.0 |
| Val | 9.436 ± 0.94 | 0.658 ± 0.189 | 4.169 ± 0.594 | 6.73 ± 0.792 | 1.902 ± 0.463 | 5.632 ± 0.672 | 1.097 ± 0.257 | 3.365 ± 0.621 | 2.853 ± 0.581 | 5.047 ± 0.539 | 0.951 ± 0.248 | 2.78 ± 0.381 | 3.731 ± 0.574 | 2.341 ± 0.291 | 4.535 ± 0.525 | 3.584 ± 0.479 | 5.706 ± 0.612 | 5.779 ± 0.781 | 1.097 ± 0.295 | 1.609 ± 0.303 | 0.0 ± 0.0 |
| Trp | 1.39 ± 0.27 | 0.366 ± 0.161 | 1.097 ± 0.255 | 0.293 ± 0.14 | 0.512 ± 0.244 | 1.244 ± 0.322 | 0.585 ± 0.209 | 0.878 ± 0.218 | 0.219 ± 0.11 | 2.194 ± 0.474 | 0.366 ± 0.145 | 0.585 ± 0.178 | 0.805 ± 0.236 | 0.731 ± 0.234 | 1.317 ± 0.311 | 1.097 ± 0.264 | 1.756 ± 0.374 | 1.975 ± 0.353 | 0.805 ± 0.25 | 0.439 ± 0.176 | 0.0 ± 0.0 |
| Tyr | 1.975 ± 0.417 | 0.512 ± 0.172 | 1.975 ± 0.434 | 1.756 ± 0.252 | 0.512 ± 0.21 | 2.487 ± 0.392 | 0.439 ± 0.183 | 1.024 ± 0.286 | 0.658 ± 0.242 | 1.463 ± 0.302 | 0.219 ± 0.133 | 1.097 ± 0.201 | 0.878 ± 0.264 | 0.658 ± 0.177 | 1.682 ± 0.442 | 0.878 ± 0.242 | 2.121 ± 0.372 | 2.194 ± 0.553 | 0.512 ± 0.205 | 0.731 ± 0.216 | 0.0 ± 0.0 |
| Xaa | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 |

Statistics based on 70 proteins (13672 amino acids)

Note: The error was estimated by bootstrapping (×100) at the protein level.
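A minimal Python sketch of what bootstrapping at the protein level could look like: whole proteins are resampled with replacement, the dipeptide frequency is recomputed for each resample, and the spread of the 100 estimates is reported. Using the sample standard deviation as the error, the normalisation by total dipeptide count, and the function names are all assumptions for illustration, not a description of the database's actual implementation.

```python
import random
import statistics
from collections import Counter


def pair_per_mille(proteins, pair):
    """Frequency of one dipeptide (one-letter codes, e.g. "AA") in per mille."""
    counts = Counter()
    for seq in proteins:
        counts.update(seq[i:i + 2] for i in range(len(seq) - 1))
    total = sum(counts.values())
    return 1000.0 * counts[pair] / total if total else 0.0


def bootstrap_error(proteins, pair, n_resamples=100, seed=0):
    """Standard deviation of the frequency over protein-level bootstrap resamples."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    estimates = []
    for _ in range(n_resamples):
        # Resample whole proteins with replacement, keeping the proteome size.
        sample = rng.choices(proteins, k=len(proteins))
        estimates.append(pair_per_mille(sample, pair))
    return statistics.stdev(estimates)


# Toy proteome for illustration only.
toy = ["MAAGAA", "MKKLAA", "MGGSTA", "MAAAAA"]
print(pair_per_mille(toy, "AA"), bootstrap_error(toy, "AA"))
```

Resampling proteins rather than individual dipeptides keeps each protein's composition intact, so the error reflects how much the estimate depends on which proteins happen to be in the proteome.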

The dipeptide statistics above (among other statistics for this proteome) can be downloaded from this CSV file
See this proteome in: uniprot_link
Proteome-pI is available under the Creative Commons Attribution-NoDerivs license; for more details, see here

Reference: Kozlowski LP. Proteome-pI 2.0: Proteome Isoelectric Point Database Update. Nucleic Acids Res. 2021, doi: 10.1093/nar/gkab944
Contact: Lukasz P. Kozlowski