Amino acid dipeptide frequency for Mycobacterium phage RidgeCB

Apart from single amino acid frequencies, one can also calculate the so-called amino acid dipeptide frequency.

There are 400 possible dipeptides (441 if Xaa is counted as an additional 21st amino acid representing all non-standard or unknown amino acids). Thus, if dipeptides occurred in proteins at random, with all amino acids equally frequent, each of the 400 dipeptides would be present with a frequency of 0.25% (2.5‰). This is not the case in nature, so for better visibility all dipeptides that are more frequent than expected are marked in red in the table, and those that are underrepresented are marked in blue.
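As an illustration of the calculation described above, here is a minimal Python sketch that counts overlapping dipeptides in a set of protein sequences and expresses them in per mille. The function names, the one-letter alphabet constant, and the toy sequences are hypothetical and are not taken from Proteome-pI. The sketch assumes dipeptides are counted within each protein and never across protein boundaries, and it reports only the 400 standard dipeptides.

```python
from collections import Counter
from itertools import product

# One-letter codes of the 20 standard amino acids (Xaa is left out of this sketch).
STANDARD = "ACDEFGHIKLMNPQRSTVWY"


def dipeptide_counts(proteins):
    """Count overlapping dipeptides inside each protein sequence."""
    counts = Counter()
    for seq in proteins:
        for i in range(len(seq) - 1):
            counts[seq[i:i + 2]] += 1
    return counts


def dipeptide_permille(proteins):
    """Return the frequency of each of the 400 standard dipeptides in per mille."""
    counts = dipeptide_counts(proteins)
    total = sum(counts.values())
    if total == 0:
        return {a + b: 0.0 for a, b in product(STANDARD, repeat=2)}
    return {a + b: 1000.0 * counts[a + b] / total
            for a, b in product(STANDARD, repeat=2)}


# Toy input, not real phage proteins.
proteins = ["MAAGL", "MGGAA"]
freqs = dipeptide_permille(proteins)

# Expectation if all 400 dipeptides were equally likely: 0.25% = 2.5 per mille.
uniform_permille = 1000.0 / 400
```

In this toy input, AlaAla ("AA") accounts for 2 of the 8 dipeptides, which is far above the uniform expectation of 2.5‰.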

All values are given in per mille (‰) and therefore need to be multiplied by 10^-3 to obtain fractions.
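The unit conversion can be sketched in a few lines of Python. The helper names are hypothetical, and the classification threshold is an assumption: it uses the uniform expectation of 2.5‰ for 400 dipeptides, which may differ from the exact rule used to colour the table. The two sample values are the AlaAla and CysCys entries from the table below.

```python
# Uniform expectation for 400 equally likely dipeptides, in per mille.
UNIFORM_PERMILLE = 1000.0 / 400


def permille_to_fraction(value_permille):
    """Convert a per mille value to a plain fraction (multiply by 10^-3)."""
    return value_permille * 1e-3


def classify(value_permille, expected=UNIFORM_PERMILLE):
    """Label a dipeptide relative to the assumed uniform expectation."""
    if value_permille > expected:
        return "over"
    if value_permille < expected:
        return "under"
    return "as expected"


ala_ala = 12.835  # AlaAla, per mille
cys_cys = 0.123   # CysCys, per mille

ala_ala_fraction = permille_to_fraction(ala_ala)
ala_ala_label = classify(ala_ala)
cys_cys_label = classify(cys_cys)
```

If Xaa were counted as a 21st amino acid, the expected value would instead be 1000/441, roughly 2.27‰, and could be passed as `expected`.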

For more information, see the sequence space article on Wikipedia.

Ala Cys Asp Glu Phe Gly His Ile Lys Leu Met Asn Pro Gln Arg Ser Thr Val Trp Tyr Xaa
Ala
AlaAla: 12.835 ± 1.109
AlaCys: 0.679 ± 0.2
AlaAsp: 6.417 ± 0.731
AlaGlu: 5.985 ± 0.72
AlaPhe: 2.962 ± 0.523
AlaGly: 7.96 ± 0.795
AlaHis: 1.419 ± 0.299
AlaIle: 4.073 ± 0.498
AlaLys: 4.073 ± 0.577
AlaLeu: 8.947 ± 1.074
AlaMet: 1.975 ± 0.398
AlaAsn: 2.345 ± 0.382
AlaPro: 5.307 ± 0.7
AlaGln: 3.024 ± 0.412
AlaArg: 6.541 ± 0.56
AlaSer: 4.875 ± 0.491
AlaThr: 6.109 ± 0.708
AlaVal: 8.577 ± 0.781
AlaTrp: 1.543 ± 0.374
AlaTyr: 2.653 ± 0.374
AlaXaa: 0.0 ± 0.0
Cys
CysAla: 0.74 ± 0.205
CysCys: 0.123 ± 0.094
CysAsp: 0.37 ± 0.136
CysGlu: 0.679 ± 0.199
CysPhe: 0.123 ± 0.095
CysGly: 0.494 ± 0.19
CysHis: 0.185 ± 0.121
CysIle: 0.247 ± 0.137
CysLys: 0.247 ± 0.117
CysLeu: 0.37 ± 0.183
CysMet: 0.062 ± 0.066
CysAsn: 0.309 ± 0.134
CysPro: 0.309 ± 0.136
CysGln: 0.309 ± 0.151
CysArg: 0.247 ± 0.115
CysSer: 0.494 ± 0.189
CysThr: 0.37 ± 0.15
CysVal: 0.247 ± 0.113
CysTrp: 0.123 ± 0.085
CysTyr: 0.123 ± 0.078
CysXaa: 0.0 ± 0.0
Asp
AspAla: 6.356 ± 0.643
AspCys: 0.494 ± 0.161
AspAsp: 4.566 ± 0.482
AspGlu: 3.887 ± 0.534
AspPhe: 2.468 ± 0.343
AspGly: 5.985 ± 0.697
AspHis: 1.111 ± 0.272
AspIle: 2.715 ± 0.442
AspLys: 3.024 ± 0.488
AspLeu: 6.664 ± 0.708
AspMet: 1.234 ± 0.223
AspAsn: 1.728 ± 0.359
AspPro: 4.443 ± 0.528
AspGln: 1.728 ± 0.363
AspArg: 4.134 ± 0.543
AspSer: 2.777 ± 0.414
AspThr: 3.949 ± 0.388
AspVal: 5.368 ± 0.653
AspTrp: 1.666 ± 0.311
AspTyr: 1.789 ± 0.368
AspXaa: 0.0 ± 0.0
Glu
GluAla: 5.985 ± 0.746
GluCys: 0.37 ± 0.171
GluAsp: 5.245 ± 0.58
GluGlu: 5.245 ± 0.616
GluPhe: 1.913 ± 0.335
GluGly: 4.319 ± 0.516
GluHis: 1.419 ± 0.285
GluIle: 3.579 ± 0.472
GluLys: 2.345 ± 0.362
GluLeu: 6.294 ± 0.546
GluMet: 1.604 ± 0.326
GluAsn: 1.728 ± 0.331
GluPro: 2.777 ± 0.496
GluGln: 2.715 ± 0.438
GluArg: 3.826 ± 0.564
GluSer: 3.394 ± 0.467
GluThr: 3.517 ± 0.464
GluVal: 5.183 ± 0.632
GluTrp: 1.358 ± 0.378
GluTyr: 2.53 ± 0.479
GluXaa: 0.0 ± 0.0
Phe
PheAla: 2.592 ± 0.478
PheCys: 0.247 ± 0.157
PheAsp: 2.962 ± 0.455
PheGlu: 2.036 ± 0.334
PhePhe: 0.555 ± 0.188
PheGly: 3.579 ± 0.463
PheHis: 0.679 ± 0.217
PheIle: 1.296 ± 0.299
PheLys: 1.481 ± 0.288
PheLeu: 2.345 ± 0.43
PheMet: 0.555 ± 0.165
PheAsn: 1.666 ± 0.323
PhePro: 1.543 ± 0.268
PheGln: 0.679 ± 0.183
PheArg: 1.789 ± 0.412
PheSer: 1.666 ± 0.282
PheThr: 2.098 ± 0.385
PheVal: 1.666 ± 0.35
PheTrp: 0.555 ± 0.167
PheTyr: 0.987 ± 0.269
PheXaa: 0.0 ± 0.0
Gly
GlyAla: 7.281 ± 0.785
GlyCys: 0.555 ± 0.185
GlyAsp: 5.985 ± 0.469
GlyGlu: 4.134 ± 0.552
GlyPhe: 2.592 ± 0.583
GlyGly: 9.318 ± 2.135
GlyHis: 1.728 ± 0.324
GlyIle: 4.381 ± 0.769
GlyLys: 3.702 ± 0.592
GlyLeu: 7.651 ± 0.739
GlyMet: 2.098 ± 0.402
GlyAsn: 3.456 ± 0.428
GlyPro: 3.579 ± 0.502
GlyGln: 2.283 ± 0.307
GlyArg: 5.492 ± 0.578
GlySer: 6.356 ± 0.919
GlyThr: 5.307 ± 0.586
GlyVal: 6.047 ± 0.824
GlyTrp: 2.838 ± 0.398
GlyTyr: 2.9 ± 0.441
GlyXaa: 0.0 ± 0.0
His
HisAla: 1.543 ± 0.278
HisCys: 0.123 ± 0.115
HisAsp: 1.111 ± 0.267
HisGlu: 1.543 ± 0.297
HisPhe: 0.679 ± 0.191
HisGly: 1.296 ± 0.312
HisHis: 0.74 ± 0.211
HisIle: 0.802 ± 0.183
HisLys: 1.111 ± 0.331
HisLeu: 1.728 ± 0.364
HisMet: 0.185 ± 0.115
HisAsn: 0.123 ± 0.095
HisPro: 1.111 ± 0.268
HisGln: 0.864 ± 0.223
HisArg: 1.358 ± 0.285
HisSer: 0.679 ± 0.204
HisThr: 0.987 ± 0.257
HisVal: 1.604 ± 0.337
HisTrp: 0.617 ± 0.172
HisTyr: 0.679 ± 0.207
HisXaa: 0.0 ± 0.0
Ile
IleAla: 5.924 ± 0.751
IleCys: 0.185 ± 0.116
IleAsp: 3.702 ± 0.416
IleGlu: 3.764 ± 0.405
IlePhe: 0.864 ± 0.248
IleGly: 4.134 ± 0.536
IleHis: 0.802 ± 0.197
IleIle: 2.16 ± 0.325
IleLys: 1.789 ± 0.302
IleLeu: 3.209 ± 0.464
IleMet: 0.926 ± 0.222
IleAsn: 1.728 ± 0.259
IlePro: 3.085 ± 0.416
IleGln: 1.481 ± 0.301
IleArg: 3.27 ± 0.459
IleSer: 3.456 ± 0.471
IleThr: 3.209 ± 0.393
IleVal: 2.9 ± 0.55
IleTrp: 0.679 ± 0.191
IleTyr: 1.481 ± 0.288
IleXaa: 0.0 ± 0.0
Lys
LysAla: 4.319 ± 0.553
LysCys: 0.247 ± 0.127
LysAsp: 2.468 ± 0.47
LysGlu: 2.16 ± 0.438
LysPhe: 1.419 ± 0.258
LysGly: 2.838 ± 0.378
LysHis: 0.802 ± 0.237
LysIle: 2.283 ± 0.386
LysLys: 1.913 ± 0.448
LysLeu: 3.456 ± 0.436
LysMet: 1.049 ± 0.228
LysAsn: 1.481 ± 0.258
LysPro: 2.653 ± 0.435
LysGln: 1.789 ± 0.417
LysArg: 2.838 ± 0.489
LysSer: 2.221 ± 0.371
LysThr: 2.283 ± 0.417
LysVal: 3.27 ± 0.492
LysTrp: 0.74 ± 0.236
LysTyr: 1.172 ± 0.303
LysXaa: 0.0 ± 0.0
Leu
LeuAla: 9.071 ± 0.744
LeuCys: 0.37 ± 0.168
LeuAsp: 6.171 ± 0.59
LeuGlu: 5.43 ± 0.587
LeuPhe: 2.283 ± 0.402
LeuGly: 7.775 ± 0.683
LeuHis: 1.358 ± 0.276
LeuIle: 4.381 ± 0.448
LeuLys: 3.826 ± 0.506
LeuLeu: 5.8 ± 0.529
LeuMet: 1.604 ± 0.279
LeuAsn: 2.838 ± 0.375
LeuPro: 5.862 ± 0.572
LeuGln: 2.468 ± 0.54
LeuArg: 5.739 ± 0.523
LeuSer: 5.8 ± 0.631
LeuThr: 6.047 ± 0.608
LeuVal: 4.875 ± 0.699
LeuTrp: 1.296 ± 0.323
LeuTyr: 2.838 ± 0.451
LeuXaa: 0.0 ± 0.0
Met
MetAla: 2.407 ± 0.362
MetCys: 0.0 ± 0.0
MetAsp: 0.864 ± 0.209
MetGlu: 1.543 ± 0.324
MetPhe: 0.617 ± 0.192
MetGly: 1.358 ± 0.28
MetHis: 0.185 ± 0.114
MetIle: 0.37 ± 0.155
MetLys: 1.049 ± 0.276
MetLeu: 1.234 ± 0.281
MetMet: 0.062 ± 0.057
MetAsn: 1.111 ± 0.237
MetPro: 0.74 ± 0.199
MetGln: 0.494 ± 0.162
MetArg: 1.543 ± 0.313
MetSer: 2.777 ± 0.469
MetThr: 1.975 ± 0.332
MetVal: 0.926 ± 0.222
MetTrp: 0.247 ± 0.107
MetTyr: 0.309 ± 0.152
MetXaa: 0.0 ± 0.0
Asn
AsnAla: 3.641 ± 0.577
AsnCys: 0.062 ± 0.063
AsnAsp: 1.851 ± 0.399
AsnGlu: 2.16 ± 0.366
AsnPhe: 0.926 ± 0.277
AsnGly: 3.826 ± 0.586
AsnHis: 0.74 ± 0.192
AsnIle: 1.543 ± 0.354
AsnLys: 0.864 ± 0.247
AsnLeu: 2.653 ± 0.382
AsnMet: 0.555 ± 0.162
AsnAsn: 0.864 ± 0.245
AsnPro: 2.592 ± 0.404
AsnGln: 0.864 ± 0.24
AsnArg: 1.666 ± 0.384
AsnSer: 1.604 ± 0.323
AsnThr: 2.221 ± 0.386
AsnVal: 2.592 ± 0.472
AsnTrp: 0.802 ± 0.21
AsnTyr: 1.358 ± 0.303
AsnXaa: 0.0 ± 0.0
Pro
ProAla: 5.06 ± 0.55
ProCys: 0.247 ± 0.13
ProAsp: 4.134 ± 0.486
ProGlu: 4.134 ± 0.547
ProPhe: 2.16 ± 0.363
ProGly: 4.936 ± 0.682
ProHis: 0.74 ± 0.213
ProIle: 2.653 ± 0.423
ProLys: 1.913 ± 0.31
ProLeu: 4.381 ± 0.492
ProMet: 1.049 ± 0.225
ProAsn: 1.913 ± 0.308
ProPro: 2.962 ± 0.498
ProGln: 1.543 ± 0.329
ProArg: 2.653 ± 0.496
ProSer: 3.641 ± 0.43
ProThr: 3.641 ± 0.474
ProVal: 3.887 ± 0.368
ProTrp: 0.926 ± 0.231
ProTyr: 1.419 ± 0.352
ProXaa: 0.0 ± 0.0
Gln
GlnAla: 2.962 ± 0.498
GlnCys: 0.062 ± 0.056
GlnAsp: 1.358 ± 0.355
GlnGlu: 1.481 ± 0.26
GlnPhe: 1.172 ± 0.265
GlnGly: 2.345 ± 0.377
GlnHis: 0.617 ± 0.17
GlnIle: 2.838 ± 0.501
GlnLys: 1.234 ± 0.304
GlnLeu: 3.394 ± 0.501
GlnMet: 0.74 ± 0.223
GlnAsn: 0.432 ± 0.143
GlnPro: 1.604 ± 0.325
GlnGln: 1.913 ± 0.391
GlnArg: 1.728 ± 0.404
GlnSer: 1.975 ± 0.324
GlnThr: 1.913 ± 0.371
GlnVal: 2.53 ± 0.385
GlnTrp: 0.679 ± 0.185
GlnTyr: 0.617 ± 0.147
GlnXaa: 0.0 ± 0.0
Arg
ArgAla: 5.43 ± 0.715
ArgCys: 0.802 ± 0.229
ArgAsp: 3.147 ± 0.476
ArgGlu: 4.813 ± 0.638
ArgPhe: 1.975 ± 0.357
ArgGly: 4.936 ± 0.797
ArgHis: 1.049 ± 0.23
ArgIle: 3.394 ± 0.561
ArgLys: 3.394 ± 0.538
ArgLeu: 6.171 ± 0.781
ArgMet: 1.975 ± 0.33
ArgAsn: 2.283 ± 0.495
ArgPro: 2.592 ± 0.4
ArgGln: 1.789 ± 0.28
ArgArg: 6.109 ± 0.742
ArgSer: 4.011 ± 0.612
ArgThr: 2.777 ± 0.493
ArgVal: 5.615 ± 0.553
ArgTrp: 1.543 ± 0.32
ArgTyr: 1.543 ± 0.267
ArgXaa: 0.0 ± 0.0
Ser
SerAla: 6.232 ± 0.684
SerCys: 0.432 ± 0.176
SerAsp: 3.394 ± 0.442
SerGlu: 3.641 ± 0.53
SerPhe: 2.036 ± 0.453
SerGly: 6.788 ± 0.927
SerHis: 1.604 ± 0.338
SerIle: 2.715 ± 0.451
SerLys: 2.345 ± 0.362
SerLeu: 4.998 ± 0.565
SerMet: 1.234 ± 0.258
SerAsn: 2.9 ± 0.498
SerPro: 3.024 ± 0.477
SerGln: 1.789 ± 0.282
SerArg: 3.024 ± 0.41
SerSer: 3.764 ± 0.687
SerThr: 3.209 ± 0.43
SerVal: 3.887 ± 0.624
SerTrp: 1.604 ± 0.379
SerTyr: 1.481 ± 0.304
SerXaa: 0.0 ± 0.0
Thr
ThrAla: 5.553 ± 0.607
ThrCys: 0.309 ± 0.178
ThrAsp: 4.505 ± 0.515
ThrGlu: 4.505 ± 0.579
ThrPhe: 2.098 ± 0.357
ThrGly: 6.232 ± 0.633
ThrHis: 0.926 ± 0.256
ThrIle: 3.147 ± 0.526
ThrLys: 2.345 ± 0.329
ThrLeu: 6.047 ± 0.644
ThrMet: 0.926 ± 0.217
ThrAsn: 1.851 ± 0.287
ThrPro: 3.641 ± 0.507
ThrGln: 1.666 ± 0.296
ThrArg: 3.024 ± 0.567
ThrSer: 3.579 ± 0.531
ThrThr: 3.949 ± 0.501
ThrVal: 5.43 ± 0.65
ThrTrp: 1.049 ± 0.287
ThrTyr: 1.728 ± 0.389
ThrXaa: 0.0 ± 0.0
Val
ValAla: 6.541 ± 0.671
ValCys: 0.494 ± 0.174
ValAsp: 5.43 ± 0.561
ValGlu: 4.936 ± 0.618
ValPhe: 2.653 ± 0.374
ValGly: 5.43 ± 0.664
ValHis: 1.666 ± 0.269
ValIle: 3.27 ± 0.398
ValLys: 2.9 ± 0.402
ValLeu: 5.862 ± 0.588
ValMet: 0.987 ± 0.283
ValAsn: 2.838 ± 0.404
ValPro: 4.011 ± 0.444
ValGln: 2.345 ± 0.454
ValArg: 5.739 ± 0.647
ValSer: 4.319 ± 0.492
ValThr: 5.307 ± 0.563
ValVal: 4.998 ± 0.7
ValTrp: 1.789 ± 0.312
ValTyr: 2.345 ± 0.402
ValXaa: 0.0 ± 0.0
Trp
TrpAla: 1.481 ± 0.361
TrpCys: 0.247 ± 0.107
TrpAsp: 1.419 ± 0.324
TrpGlu: 0.926 ± 0.239
TrpPhe: 0.864 ± 0.226
TrpGly: 2.036 ± 0.34
TrpHis: 0.494 ± 0.181
TrpIle: 1.234 ± 0.258
TrpLys: 0.37 ± 0.197
TrpLeu: 2.221 ± 0.421
TrpMet: 0.432 ± 0.18
TrpAsn: 0.494 ± 0.156
TrpPro: 0.679 ± 0.217
TrpGln: 0.802 ± 0.184
TrpArg: 1.481 ± 0.281
TrpSer: 0.987 ± 0.253
TrpThr: 1.543 ± 0.346
TrpVal: 2.283 ± 0.361
TrpTrp: 0.617 ± 0.232
TrpTyr: 0.37 ± 0.157
TrpXaa: 0.0 ± 0.0
Tyr
TyrAla: 2.345 ± 0.359
TyrCys: 0.123 ± 0.083
TyrAsp: 0.987 ± 0.3
TyrGlu: 2.345 ± 0.362
TyrPhe: 0.617 ± 0.169
TyrGly: 2.16 ± 0.377
TyrHis: 0.617 ± 0.192
TyrIle: 1.666 ± 0.332
TyrLys: 1.481 ± 0.308
TyrLeu: 2.407 ± 0.418
TyrMet: 0.494 ± 0.173
TyrAsn: 1.296 ± 0.311
TyrPro: 1.481 ± 0.287
TyrGln: 1.049 ± 0.229
TyrArg: 3.147 ± 0.406
TyrSer: 1.728 ± 0.363
TyrThr: 2.098 ± 0.386
TyrVal: 1.975 ± 0.308
TyrTrp: 0.309 ± 0.156
TyrTyr: 0.679 ± 0.191
TyrXaa: 0.0 ± 0.0
Xaa
XaaAla: 0.0 ± 0.0
XaaCys: 0.0 ± 0.0
XaaAsp: 0.0 ± 0.0
XaaGlu: 0.0 ± 0.0
XaaPhe: 0.0 ± 0.0
XaaGly: 0.0 ± 0.0
XaaHis: 0.0 ± 0.0
XaaIle: 0.0 ± 0.0
XaaLys: 0.0 ± 0.0
XaaLeu: 0.0 ± 0.0
XaaMet: 0.0 ± 0.0
XaaAsn: 0.0 ± 0.0
XaaPro: 0.0 ± 0.0
XaaGln: 0.0 ± 0.0
XaaArg: 0.0 ± 0.0
XaaSer: 0.0 ± 0.0
XaaThr: 0.0 ± 0.0
XaaVal: 0.0 ± 0.0
XaaTrp: 0.0 ± 0.0
XaaTyr: 0.0 ± 0.0
XaaXaa: 0.0 ± 0.0
Statistics based on 94 proteins (16207 amino acids)

Note: The error was estimated by bootstrapping (100 resamples) at the protein level.
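A protein-level bootstrap of this kind can be sketched as follows. This is only an illustration under stated assumptions, not the Proteome-pI implementation: whole proteins are resampled with replacement, the dipeptide frequency is recomputed for each resample, and the error is taken as the standard deviation of those estimates. The function names, the fixed seed, and the toy sequences are hypothetical.

```python
import random
import statistics
from collections import Counter


def dipeptide_permille(proteins, dipeptide):
    """Frequency of one dipeptide in per mille, counted within each protein."""
    counts = Counter()
    for seq in proteins:
        for i in range(len(seq) - 1):
            counts[seq[i:i + 2]] += 1
    total = sum(counts.values())
    return 1000.0 * counts[dipeptide] / total if total else 0.0


def bootstrap_error(proteins, dipeptide, n_resamples=100, seed=0):
    """Standard deviation of the frequency over protein-level resamples.

    Assumption: the reported error is the spread of the bootstrap estimates.
    """
    rng = random.Random(seed)
    estimates = []
    for _ in range(n_resamples):
        # Draw as many proteins as in the original set, with replacement.
        sample = rng.choices(proteins, k=len(proteins))
        estimates.append(dipeptide_permille(sample, dipeptide))
    return statistics.stdev(estimates)


# Toy input, not real phage proteins.
toy_proteins = ["MAAGL", "MGGAA", "AAAAA", "MKLVT"]
aa_value = dipeptide_permille(toy_proteins, "AA")
aa_error = bootstrap_error(toy_proteins, "AA")
```

Resampling whole proteins rather than individual dipeptides keeps the dependence between neighbouring residues within a protein intact, which is one plausible reason to bootstrap at this level.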

The dipeptide statistics above (along with other statistics for this proteome) can be downloaded from this CSV file
See this proteome in: uniprot_link
Proteome-pI is available under Creative Commons Attribution-NoDerivs license, for more details see here

Reference: Kozlowski LP. Proteome-pI 2.0: Proteome Isoelectric Point Database Update. Nucleic Acids Res. 2021, doi: 10.1093/nar/gkab944
Contact: Lukasz P. Kozlowski