Amino acid dipeptide frequency for Mycobacterium phage Herbertwm

Apart from single amino acid frequencies, one can also calculate the so-called amino acid dipeptide frequency.
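As a minimal sketch of how such a frequency could be computed (not necessarily the procedure used by Proteome-pI), the Python snippet below counts overlapping residue pairs within each protein and reports them in per mille. The function name `dipeptide_permille`, the one-letter input alphabet, the mapping of non-standard residues to `X`, and the normalization by the total number of counted pairs are all assumptions made for illustration.

```python
from collections import Counter

# The 20 standard amino acids in one-letter code.
STANDARD = set("ACDEFGHIKLMNPQRSTVWY")


def dipeptide_permille(proteins):
    """Return {dipeptide: frequency in per mille} over overlapping pairs.

    Assumption: frequencies are normalized by the total number of pairs
    counted; the database may use a different denominator.
    """
    counts = Counter()
    for seq in proteins:
        # Map anything outside the 20 standard residues to 'X' (Xaa).
        seq = "".join(aa if aa in STANDARD else "X" for aa in seq.upper())
        # Overlapping windows of length 2; pairs never span two proteins.
        counts.update(seq[i:i + 2] for i in range(len(seq) - 1))
    total = sum(counts.values())
    if total == 0:
        return {}
    return {pair: 1000.0 * n / total for pair, n in counts.items()}


# Toy input (hypothetical sequences, not taken from the phage proteome).
freqs = dipeptide_permille(["MAAL", "MKAA"])
```

Here the two toy sequences yield six pairs in total, two of which are `AA`, so `freqs["AA"]` is one third of 1000.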

There are 400 possible dipeptides (441 if Xaa is counted as an additional, 21st amino acid representing all non-standard or unknown amino acids). Thus, if dipeptides occurred randomly with equal probability in proteins, each amino acid dipeptide would be present at a frequency of 0.25% (2.5‰). As this is not the case in nature, for better visibility all dipeptides that are more frequent than expected are marked in red, and those that are underrepresented are marked in blue in the table.

All values are given in per mille (‰) and therefore need to be multiplied by 10⁻³ to obtain fractions.
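The uniform expectation and the over/under-representation marking described above can be illustrated with a short Python sketch. It assumes that the reference value is the uniform expectation (1/400, or 1/441 when Xaa is included); the function names and the returned labels are hypothetical, and the database may use a different baseline.

```python
def uniform_expectation_permille(alphabet_size=20):
    """Per mille frequency of each dipeptide if all pairs were equally likely."""
    return 1000.0 / alphabet_size ** 2


def classify(observed_permille, expected_permille):
    """Label a dipeptide relative to the expected frequency (labels are illustrative)."""
    if observed_permille > expected_permille:
        return "overrepresented"   # shown in red on the original page
    if observed_permille < expected_permille:
        return "underrepresented"  # shown in blue on the original page
    return "as expected"


expected = uniform_expectation_permille()      # 20 standard amino acids
ala_ala = classify(10.619, expected)           # AlaAla value from the table
ala_cys = classify(0.494, expected)            # AlaCys value from the table
fraction = 10.619 * 1e-3                       # per mille converted to a fraction
```

With 20 amino acids the expectation is 2.5‰, so AlaAla (10.619‰) falls above it and AlaCys (0.494‰) below it.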

For more information, see the sequence space article on Wikipedia.

Ala Cys Asp Glu Phe Gly His Ile Lys Leu Met Asn Pro Gln Arg Ser Thr Val Trp Tyr Xaa
Ala
AlaAla: 10.619 ± 0.951
AlaCys: 0.494 ± 0.15
AlaAsp: 5.433 ± 0.622
AlaGlu: 5.989 ± 0.666
AlaPhe: 3.334 ± 0.455
AlaGly: 7.471 ± 0.692
AlaHis: 1.914 ± 0.329
AlaIle: 4.075 ± 0.48
AlaLys: 4.877 ± 0.541
AlaLeu: 8.15 ± 0.692
AlaMet: 3.087 ± 0.42
AlaAsn: 3.766 ± 0.482
AlaPro: 4.754 ± 0.758
AlaGln: 3.457 ± 0.544
AlaArg: 6.297 ± 0.747
AlaSer: 4.692 ± 0.448
AlaThr: 5.68 ± 0.668
AlaVal: 7.038 ± 0.796
AlaTrp: 1.79 ± 0.351
AlaTyr: 2.902 ± 0.419
AlaXaa: 0.0 ± 0.0
Cys
CysAla: 0.494 ± 0.16
CysCys: 0.309 ± 0.198
CysAsp: 0.679 ± 0.238
CysGlu: 0.37 ± 0.139
CysPhe: 0.37 ± 0.14
CysGly: 0.926 ± 0.298
CysHis: 0.185 ± 0.09
CysIle: 0.247 ± 0.119
CysLys: 0.309 ± 0.146
CysLeu: 0.556 ± 0.191
CysMet: 0.309 ± 0.136
CysAsn: 0.556 ± 0.188
CysPro: 0.37 ± 0.15
CysGln: 0.432 ± 0.163
CysArg: 0.679 ± 0.201
CysSer: 0.741 ± 0.314
CysThr: 0.741 ± 0.182
CysVal: 0.926 ± 0.25
CysTrp: 0.494 ± 0.168
CysTyr: 0.617 ± 0.186
CysXaa: 0.0 ± 0.0
Asp
AspAla: 6.73 ± 0.765
AspCys: 0.864 ± 0.273
AspAsp: 5.001 ± 0.656
AspGlu: 4.445 ± 0.714
AspPhe: 2.655 ± 0.459
AspGly: 5.557 ± 0.559
AspHis: 1.543 ± 0.327
AspIle: 2.902 ± 0.42
AspLys: 2.902 ± 0.572
AspLeu: 6.421 ± 0.917
AspMet: 1.173 ± 0.325
AspAsn: 1.605 ± 0.279
AspPro: 4.63 ± 0.53
AspGln: 2.099 ± 0.42
AspArg: 3.396 ± 0.576
AspSer: 2.778 ± 0.381
AspThr: 3.828 ± 0.567
AspVal: 4.569 ± 0.49
AspTrp: 1.111 ± 0.269
AspTyr: 2.47 ± 0.372
AspXaa: 0.0 ± 0.0
Glu
GluAla: 7.1 ± 0.715
GluCys: 0.185 ± 0.107
GluAsp: 4.322 ± 0.579
GluGlu: 4.384 ± 0.661
GluPhe: 2.47 ± 0.439
GluGly: 4.26 ± 0.495
GluHis: 1.42 ± 0.263
GluIle: 3.766 ± 0.594
GluLys: 2.284 ± 0.361
GluLeu: 7.471 ± 0.917
GluMet: 1.852 ± 0.355
GluAsn: 1.729 ± 0.367
GluPro: 2.655 ± 0.39
GluGln: 2.47 ± 0.352
GluArg: 3.643 ± 0.527
GluSer: 2.778 ± 0.41
GluThr: 3.581 ± 0.526
GluVal: 4.384 ± 0.499
GluTrp: 1.42 ± 0.293
GluTyr: 1.852 ± 0.284
GluXaa: 0.0 ± 0.0
Phe
PheAla: 3.334 ± 0.43
PheCys: 0.247 ± 0.113
PheAsp: 3.087 ± 0.595
PheGlu: 3.21 ± 0.451
PhePhe: 0.679 ± 0.315
PheGly: 3.087 ± 0.443
PheHis: 0.741 ± 0.228
PheIle: 1.482 ± 0.287
PheLys: 1.235 ± 0.275
PheLeu: 2.717 ± 0.449
PheMet: 0.556 ± 0.176
PheAsn: 1.297 ± 0.303
PhePro: 1.79 ± 0.282
PheGln: 1.235 ± 0.31
PheArg: 1.852 ± 0.302
PheSer: 2.346 ± 0.504
PheThr: 2.284 ± 0.319
PheVal: 2.037 ± 0.401
PheTrp: 0.37 ± 0.122
PheTyr: 0.679 ± 0.169
PheXaa: 0.0 ± 0.0
Gly
GlyAla: 6.853 ± 0.885
GlyCys: 0.864 ± 0.247
GlyAsp: 5.989 ± 0.993
GlyGlu: 4.939 ± 0.559
GlyPhe: 2.593 ± 0.359
GlyGly: 5.927 ± 0.735
GlyHis: 2.037 ± 0.413
GlyIle: 4.692 ± 1.007
GlyLys: 4.013 ± 0.592
GlyLeu: 7.285 ± 0.78
GlyMet: 2.161 ± 0.35
GlyAsn: 2.964 ± 0.557
GlyPro: 5.557 ± 2.02
GlyGln: 3.087 ± 0.466
GlyArg: 3.89 ± 0.491
GlySer: 3.828 ± 0.647
GlyThr: 4.137 ± 0.606
GlyVal: 5.989 ± 0.693
GlyTrp: 2.531 ± 0.35
GlyTyr: 2.223 ± 0.294
GlyXaa: 0.0 ± 0.0
His
HisAla: 2.099 ± 0.417
HisCys: 0.309 ± 0.127
HisAsp: 1.852 ± 0.364
HisGlu: 1.667 ± 0.342
HisPhe: 0.37 ± 0.159
HisGly: 1.79 ± 0.541
HisHis: 0.741 ± 0.215
HisIle: 1.235 ± 0.255
HisLys: 0.803 ± 0.224
HisLeu: 1.543 ± 0.291
HisMet: 0.247 ± 0.143
HisAsn: 0.556 ± 0.17
HisPro: 1.543 ± 0.3
HisGln: 0.864 ± 0.201
HisArg: 1.976 ± 0.395
HisSer: 0.556 ± 0.189
HisThr: 1.05 ± 0.281
HisVal: 1.173 ± 0.263
HisTrp: 0.494 ± 0.188
HisTyr: 0.432 ± 0.209
HisXaa: 0.0 ± 0.0
Ile
IleAla: 4.445 ± 0.496
IleCys: 0.864 ± 0.222
IleAsp: 2.84 ± 0.433
IleGlu: 4.877 ± 0.612
IlePhe: 1.358 ± 0.284
IleGly: 3.89 ± 0.624
IleHis: 0.741 ± 0.198
IleIle: 2.037 ± 0.367
IleLys: 2.655 ± 0.553
IleLeu: 3.828 ± 0.469
IleMet: 0.494 ± 0.193
IleAsn: 2.037 ± 0.387
IlePro: 3.149 ± 0.411
IleGln: 1.605 ± 0.358
IleArg: 3.519 ± 0.452
IleSer: 3.149 ± 0.353
IleThr: 3.89 ± 0.513
IleVal: 3.704 ± 0.558
IleTrp: 0.741 ± 0.184
IleTyr: 1.235 ± 0.311
IleXaa: 0.0 ± 0.0
Lys
LysAla: 4.816 ± 0.612
LysCys: 0.309 ± 0.14
LysAsp: 2.531 ± 0.515
LysGlu: 2.099 ± 0.311
LysPhe: 1.543 ± 0.422
LysGly: 4.013 ± 0.955
LysHis: 0.864 ± 0.257
LysIle: 2.099 ± 0.403
LysLys: 3.149 ± 0.614
LysLeu: 4.322 ± 0.586
LysMet: 1.173 ± 0.273
LysAsn: 1.05 ± 0.271
LysPro: 3.457 ± 0.501
LysGln: 1.852 ± 0.386
LysArg: 3.272 ± 0.555
LysSer: 2.284 ± 0.353
LysThr: 2.717 ± 0.462
LysVal: 4.075 ± 0.691
LysTrp: 0.679 ± 0.182
LysTyr: 1.111 ± 0.272
LysXaa: 0.0 ± 0.0
Leu
LeuAla: 9.261 ± 0.835
LeuCys: 0.988 ± 0.237
LeuAsp: 5.248 ± 0.559
LeuGlu: 4.63 ± 0.487
LeuPhe: 2.717 ± 0.423
LeuGly: 6.421 ± 0.522
LeuHis: 1.852 ± 0.393
LeuIle: 3.766 ± 0.505
LeuLys: 3.89 ± 0.704
LeuLeu: 5.063 ± 0.514
LeuMet: 2.778 ± 0.432
LeuAsn: 2.84 ± 0.497
LeuPro: 4.939 ± 0.526
LeuGln: 2.717 ± 0.385
LeuArg: 5.433 ± 0.533
LeuSer: 5.433 ± 0.714
LeuThr: 5.557 ± 0.753
LeuVal: 4.692 ± 0.556
LeuTrp: 1.235 ± 0.258
LeuTyr: 2.531 ± 0.398
LeuXaa: 0.0 ± 0.0
Met
MetAla: 2.531 ± 0.423
MetCys: 0.123 ± 0.078
MetAsp: 1.05 ± 0.28
MetGlu: 1.42 ± 0.276
MetPhe: 0.617 ± 0.162
MetGly: 1.852 ± 0.384
MetHis: 0.432 ± 0.175
MetIle: 1.482 ± 0.328
MetLys: 1.976 ± 0.374
MetLeu: 1.42 ± 0.295
MetMet: 0.741 ± 0.184
MetAsn: 1.05 ± 0.234
MetPro: 0.741 ± 0.198
MetGln: 1.111 ± 0.187
MetArg: 1.543 ± 0.298
MetSer: 2.902 ± 0.422
MetThr: 2.47 ± 0.458
MetVal: 1.173 ± 0.252
MetTrp: 0.37 ± 0.134
MetTyr: 0.864 ± 0.252
MetXaa: 0.0 ± 0.0
Asn
AsnAla: 2.902 ± 0.498
AsnCys: 0.679 ± 0.222
AsnAsp: 1.976 ± 0.331
AsnGlu: 2.284 ± 0.351
AsnPhe: 1.111 ± 0.266
AsnGly: 3.828 ± 0.558
AsnHis: 0.988 ± 0.233
AsnIle: 1.297 ± 0.32
AsnLys: 1.358 ± 0.256
AsnLeu: 2.593 ± 0.548
AsnMet: 0.988 ± 0.229
AsnAsn: 0.803 ± 0.286
AsnPro: 2.531 ± 0.302
AsnGln: 1.05 ± 0.246
AsnArg: 2.223 ± 0.389
AsnSer: 0.926 ± 0.211
AsnThr: 1.358 ± 0.27
AsnVal: 2.284 ± 0.511
AsnTrp: 0.679 ± 0.162
AsnTyr: 0.803 ± 0.216
AsnXaa: 0.0 ± 0.0
Pro
ProAla: 4.569 ± 0.488
ProCys: 0.37 ± 0.163
ProAsp: 3.704 ± 0.51
ProGlu: 4.137 ± 0.496
ProPhe: 2.593 ± 0.37
ProGly: 4.26 ± 0.528
ProHis: 0.926 ± 0.239
ProIle: 2.964 ± 0.457
ProLys: 2.717 ± 0.552
ProLeu: 3.149 ± 0.455
ProMet: 1.358 ± 0.332
ProAsn: 2.161 ± 0.412
ProPro: 2.655 ± 0.478
ProGln: 3.21 ± 1.195
ProArg: 3.643 ± 0.497
ProSer: 2.408 ± 0.362
ProThr: 3.025 ± 0.425
ProVal: 3.581 ± 0.433
ProTrp: 1.729 ± 0.301
ProTyr: 1.605 ± 0.333
ProXaa: 0.0 ± 0.0
Gln
GlnAla: 3.334 ± 0.52
GlnCys: 0.123 ± 0.085
GlnAsp: 1.605 ± 0.297
GlnGlu: 2.161 ± 0.434
GlnPhe: 1.297 ± 0.302
GlnGly: 4.63 ± 2.088
GlnHis: 1.173 ± 0.275
GlnIle: 2.778 ± 0.399
GlnLys: 2.099 ± 0.413
GlnLeu: 3.21 ± 0.55
GlnMet: 1.173 ± 0.225
GlnAsn: 1.235 ± 0.313
GlnPro: 1.543 ± 0.292
GlnGln: 1.543 ± 0.315
GlnArg: 1.914 ± 0.348
GlnSer: 1.482 ± 0.399
GlnThr: 1.42 ± 0.341
GlnVal: 2.593 ± 0.413
GlnTrp: 0.864 ± 0.235
GlnTyr: 1.297 ± 0.226
GlnXaa: 0.0 ± 0.0
Arg
ArgAla: 6.236 ± 0.847
ArgCys: 1.358 ± 0.397
ArgAsp: 3.951 ± 0.384
ArgGlu: 4.075 ± 0.667
ArgPhe: 2.346 ± 0.394
ArgGly: 4.075 ± 0.544
ArgHis: 1.297 ± 0.284
ArgIle: 3.643 ± 0.451
ArgLys: 3.149 ± 0.543
ArgLeu: 5.186 ± 0.604
ArgMet: 2.47 ± 0.387
ArgAsn: 1.79 ± 0.335
ArgPro: 2.655 ± 0.435
ArgGln: 1.729 ± 0.381
ArgArg: 5.186 ± 0.733
ArgSer: 3.272 ± 0.568
ArgThr: 2.84 ± 0.425
ArgVal: 4.013 ± 0.478
ArgTrp: 1.297 ± 0.274
ArgTyr: 2.47 ± 0.367
ArgXaa: 0.0 ± 0.0
Ser
SerAla: 4.26 ± 0.519
SerCys: 0.432 ± 0.172
SerAsp: 3.21 ± 0.481
SerGlu: 3.334 ± 0.456
SerPhe: 1.976 ± 0.357
SerGly: 5.124 ± 0.772
SerHis: 0.679 ± 0.189
SerIle: 3.025 ± 0.46
SerLys: 2.655 ± 0.397
SerLeu: 4.445 ± 0.524
SerMet: 1.482 ± 0.421
SerAsn: 1.05 ± 0.241
SerPro: 3.025 ± 0.458
SerGln: 2.223 ± 0.369
SerArg: 3.581 ± 0.473
SerSer: 2.655 ± 0.437
SerThr: 3.334 ± 0.509
SerVal: 3.766 ± 0.504
SerTrp: 0.864 ± 0.244
SerTyr: 2.284 ± 0.345
SerXaa: 0.0 ± 0.0
Thr
ThrAla: 5.433 ± 0.577
ThrCys: 0.37 ± 0.153
ThrAsp: 4.198 ± 0.541
ThrGlu: 3.396 ± 0.375
ThrPhe: 1.976 ± 0.28
ThrGly: 5.68 ± 1.245
ThrHis: 1.173 ± 0.275
ThrIle: 3.21 ± 0.536
ThrLys: 2.593 ± 0.524
ThrLeu: 4.816 ± 0.442
ThrMet: 1.358 ± 0.293
ThrAsn: 1.667 ± 0.332
ThrPro: 4.137 ± 0.585
ThrGln: 1.914 ± 0.367
ThrArg: 2.902 ± 0.397
ThrSer: 3.21 ± 0.542
ThrThr: 3.21 ± 0.539
ThrVal: 4.63 ± 0.493
ThrTrp: 1.42 ± 0.272
ThrTyr: 1.42 ± 0.237
ThrXaa: 0.0 ± 0.0
Val
ValAla: 6.297 ± 0.639
ValCys: 0.679 ± 0.226
ValAsp: 5.804 ± 0.609
ValGlu: 3.951 ± 0.592
ValPhe: 2.531 ± 0.438
ValGly: 5.063 ± 0.68
ValHis: 1.235 ± 0.312
ValIle: 3.581 ± 0.413
ValLys: 3.025 ± 0.387
ValLeu: 6.112 ± 0.724
ValMet: 1.235 ± 0.256
ValAsn: 2.593 ± 0.4
ValPro: 2.47 ± 0.388
ValGln: 2.778 ± 0.411
ValArg: 4.322 ± 0.532
ValSer: 4.939 ± 0.657
ValThr: 4.26 ± 0.591
ValVal: 4.877 ± 0.468
ValTrp: 1.482 ± 0.281
ValTyr: 1.852 ± 0.376
ValXaa: 0.0 ± 0.0
Trp
TrpAla: 2.161 ± 0.389
TrpCys: 0.37 ± 0.179
TrpAsp: 1.543 ± 0.333
TrpGlu: 0.988 ± 0.24
TrpPhe: 0.741 ± 0.235
TrpGly: 1.297 ± 0.309
TrpHis: 0.864 ± 0.242
TrpIle: 0.926 ± 0.238
TrpLys: 1.05 ± 0.25
TrpLeu: 1.297 ± 0.233
TrpMet: 0.617 ± 0.176
TrpAsn: 0.741 ± 0.175
TrpPro: 0.864 ± 0.249
TrpGln: 0.926 ± 0.228
TrpArg: 1.235 ± 0.246
TrpSer: 1.173 ± 0.251
TrpThr: 1.543 ± 0.317
TrpVal: 1.358 ± 0.294
TrpTrp: 0.556 ± 0.172
TrpTyr: 0.432 ± 0.151
TrpXaa: 0.0 ± 0.0
Tyr
TyrAla: 2.346 ± 0.409
TyrCys: 0.37 ± 0.171
TyrAsp: 2.655 ± 0.316
TyrGlu: 1.667 ± 0.288
TyrPhe: 0.988 ± 0.25
TyrGly: 2.408 ± 0.347
TyrHis: 0.494 ± 0.187
TyrIle: 1.852 ± 0.323
TyrLys: 0.679 ± 0.207
TyrLeu: 2.408 ± 0.347
TyrMet: 0.556 ± 0.264
TyrAsn: 1.235 ± 0.282
TyrPro: 1.297 ± 0.343
TyrGln: 1.173 ± 0.264
TyrArg: 2.531 ± 0.382
TyrSer: 1.852 ± 0.3
TyrThr: 1.79 ± 0.379
TyrVal: 2.223 ± 0.332
TyrTrp: 0.556 ± 0.205
TyrTyr: 0.803 ± 0.22
TyrXaa: 0.0 ± 0.0
Xaa
XaaAla: 0.0 ± 0.0
XaaCys: 0.0 ± 0.0
XaaAsp: 0.0 ± 0.0
XaaGlu: 0.0 ± 0.0
XaaPhe: 0.0 ± 0.0
XaaGly: 0.0 ± 0.0
XaaHis: 0.0 ± 0.0
XaaIle: 0.0 ± 0.0
XaaLys: 0.0 ± 0.0
XaaLeu: 0.0 ± 0.0
XaaMet: 0.0 ± 0.0
XaaAsn: 0.0 ± 0.0
XaaPro: 0.0 ± 0.0
XaaGln: 0.0 ± 0.0
XaaArg: 0.0 ± 0.0
XaaSer: 0.0 ± 0.0
XaaThr: 0.0 ± 0.0
XaaVal: 0.0 ± 0.0
XaaTrp: 0.0 ± 0.0
XaaTyr: 0.0 ± 0.0
XaaXaa: 0.0 ± 0.0
Statistics based on 88 proteins (16198 amino acids)

Note: The error was estimated by bootstrapping (×100) at the protein level.
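A protein-level bootstrap of this kind can be sketched in Python as follows. This is an illustration under stated assumptions, not the database's actual code: whole proteins are resampled with replacement 100 times, the frequency is recomputed for each replicate, and the error is taken to be the sample standard deviation across replicates. The function names, the fixed seed, and the choice of standard deviation as the error measure are all hypothetical.

```python
import random
import statistics
from collections import Counter


def pair_permille(proteins, pair):
    """Per mille frequency of one dipeptide over overlapping pairs."""
    counts = Counter()
    for seq in proteins:
        counts.update(seq[i:i + 2] for i in range(len(seq) - 1))
    total = sum(counts.values())
    return 1000.0 * counts[pair] / total if total else 0.0


def bootstrap_error(proteins, pair, replicates=100, seed=0):
    """Spread of the frequency when whole proteins are resampled."""
    if replicates < 2:
        raise ValueError("need at least two replicates")
    rng = random.Random(seed)  # fixed seed only to make the sketch reproducible
    estimates = []
    for _ in range(replicates):
        # Resample proteins (not residues) with replacement, same sample size.
        sample = rng.choices(proteins, k=len(proteins))
        estimates.append(pair_permille(sample, pair))
    # Assumption: the reported error is the standard deviation of replicates.
    return statistics.stdev(estimates)


# Toy input (hypothetical sequences, not taken from the phage proteome).
toy = ["MAAL", "MKKK", "AAAA"]
error_aa = bootstrap_error(toy, "AA", seed=1)
```

Resampling at the protein level keeps each protein's internal composition intact, so the estimate reflects variation between proteins rather than between individual residue pairs.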

The dipeptide statistics above (among other statistics for this proteome) can be downloaded from this CSV file
See this proteome in: uniprot_link
Proteome-pI is available under the Creative Commons Attribution-NoDerivs license; for more details see here

Reference: Kozlowski LP. Proteome-pI 2.0: Proteome Isoelectric Point Database Update. Nucleic Acids Res. 2021, doi: 10.1093/nar/gkab944
Contact: Lukasz P. Kozlowski