Amino acid dipeptide frequency for Mycobacterium phage Guanica15

Apart from single amino acid frequencies, one can also calculate the so-called amino acid dipeptide frequency.
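
As a rough illustration of what such a calculation involves, the sketch below counts overlapping residue pairs in a list of protein sequences and reports them in per mille. It is not the code used by the database: the function name `dipeptide_frequencies`, the one-letter input format, the rule that pairs never span two proteins, and the normalisation by the total number of pairs are all assumptions made for this example.

```python
from collections import Counter
from itertools import product

# Three-letter codes in the order used by the table. Anything that is not one
# of the 20 standard residues is mapped to Xaa (assumption for this sketch).
ONE_TO_THREE = {
    "A": "Ala", "C": "Cys", "D": "Asp", "E": "Glu", "F": "Phe", "G": "Gly",
    "H": "His", "I": "Ile", "K": "Lys", "L": "Leu", "M": "Met", "N": "Asn",
    "P": "Pro", "Q": "Gln", "R": "Arg", "S": "Ser", "T": "Thr", "V": "Val",
    "W": "Trp", "Y": "Tyr",
}


def dipeptide_frequencies(proteins):
    """Return {dipeptide: frequency in per mille} for all 441 combinations."""
    counts = Counter()
    for seq in proteins:
        codes = [ONE_TO_THREE.get(res, "Xaa") for res in seq.upper()]
        # Overlapping pairs within one sequence; pairs never span two proteins.
        counts.update(a + b for a, b in zip(codes, codes[1:]))
    total = sum(counts.values())
    names = list(ONE_TO_THREE.values()) + ["Xaa"]
    return {
        a + b: (1000.0 * counts[a + b] / total if total else 0.0)
        for a, b in product(names, repeat=2)
    }


# Hypothetical toy input: 8 pairs in total, 2 of them AlaAla.
freqs = dipeptide_frequencies(["MKVLAA", "AAGX"])
print(freqs["AlaAla"])  # 250.0
```

The database may normalise differently (for example by the total number of residues), so values from this sketch should not be expected to reproduce the table exactly.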

There are 400 possible dipeptides (441 if Xaa is counted as an additional, 21st amino acid representing all non-standard or unknown amino acids). Thus, if dipeptides occurred uniformly at random in proteins, each dipeptide would be present with a frequency of 0.25% (2.5‰). As this is not the case in nature, for better visibility all dipeptides that are more frequent than expected are marked in red, and those that are underrepresented are marked in blue in the table.

All values are given in per mille (‰) and therefore need to be multiplied by 10^-3 to obtain fractions.
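
The arithmetic behind the uniform expectation and the per mille scale can be checked with a few lines. The sketch below is only an illustration: the helper name `classify` and the simple greater-than/less-than rule are assumptions, and the page does not state whether the colouring also takes the error estimate into account. The three sample values are copied from the table.

```python
N_STANDARD = 20

# Uniform expectation: each of the 20 x 20 dipeptides is equally likely.
# 1 / 400 = 0.25 %, which is 2.5 on the per mille scale used by the table.
expected_per_mille = 1000 / N_STANDARD**2


def classify(freq_per_mille, expected=expected_per_mille):
    """Label a per mille value relative to the uniform expectation."""
    if freq_per_mille > expected:
        return "over"
    if freq_per_mille < expected:
        return "under"
    return "as expected"


# Values copied from the table (per mille).
samples = {"AlaAla": 21.883, "CysCys": 0.105, "AspLeu": 5.916}
for name, value in samples.items():
    fraction = value * 1e-3  # per mille -> fraction
    print(name, fraction, classify(value))
```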

For more information, see the sequence space article on Wikipedia.

Dipeptide frequencies in ‰ (value ± error), grouped by first residue. Within each group the second residue follows the order: Ala Cys Asp Glu Phe Gly His Ile Lys Leu Met Asn Pro Gln Arg Ser Thr Val Trp Tyr Xaa
Ala
AlaAla: 21.883 ± 1.941
AlaCys: 1.413 ± 0.327
AlaAsp: 8.428 ± 0.766
AlaGlu: 9.79 ± 1.031
AlaPhe: 2.932 ± 0.609
AlaGly: 9.214 ± 1.055
AlaHis: 3.141 ± 0.451
AlaIle: 6.282 ± 0.565
AlaLys: 3.507 ± 0.426
AlaLeu: 11.569 ± 0.774
AlaMet: 3.141 ± 0.367
AlaAsn: 3.193 ± 0.505
AlaPro: 7.067 ± 0.713
AlaGln: 4.607 ± 0.573
AlaArg: 8.795 ± 0.812
AlaSer: 5.602 ± 0.649
AlaThr: 6.649 ± 0.807
AlaVal: 9.947 ± 0.727
AlaTrp: 2.565 ± 0.366
AlaTyr: 2.618 ± 0.435
AlaXaa: 0.0 ± 0.0
Cys
CysAla: 0.995 ± 0.237
CysCys: 0.105 ± 0.072
CysAsp: 0.524 ± 0.254
CysGlu: 0.576 ± 0.199
CysPhe: 0.262 ± 0.115
CysGly: 1.675 ± 0.378
CysHis: 0.105 ± 0.076
CysIle: 0.419 ± 0.166
CysLys: 0.366 ± 0.131
CysLeu: 0.733 ± 0.216
CysMet: 0.052 ± 0.048
CysAsn: 0.157 ± 0.089
CysPro: 0.995 ± 0.232
CysGln: 0.262 ± 0.115
CysArg: 0.838 ± 0.214
CysSer: 0.733 ± 0.229
CysThr: 0.576 ± 0.179
CysVal: 0.785 ± 0.203
CysTrp: 0.576 ± 0.163
CysTyr: 0.157 ± 0.095
CysXaa: 0.0 ± 0.0
Asp
AspAla: 7.957 ± 0.891
AspCys: 0.838 ± 0.258
AspAsp: 6.02 ± 0.912
AspGlu: 5.602 ± 0.738
AspPhe: 1.256 ± 0.212
AspGly: 6.858 ± 0.534
AspHis: 0.942 ± 0.251
AspIle: 1.623 ± 0.336
AspLys: 2.303 ± 0.34
AspLeu: 5.916 ± 0.683
AspMet: 1.78 ± 0.319
AspAsn: 1.675 ± 0.315
AspPro: 4.24 ± 0.491
AspGln: 2.408 ± 0.374
AspArg: 4.607 ± 0.578
AspSer: 2.042 ± 0.316
AspThr: 2.879 ± 0.366
AspVal: 3.769 ± 0.439
AspTrp: 1.152 ± 0.229
AspTyr: 1.466 ± 0.3
AspXaa: 0.0 ± 0.0
Glu
GluAla: 8.533 ± 0.933
GluCys: 0.838 ± 0.196
GluAsp: 2.146 ± 0.378
GluGlu: 1.152 ± 0.294
GluPhe: 2.199 ± 0.317
GluGly: 4.293 ± 0.541
GluHis: 1.728 ± 0.285
GluIle: 2.722 ± 0.319
GluLys: 1.361 ± 0.362
GluLeu: 5.916 ± 0.62
GluMet: 1.361 ± 0.258
GluAsn: 1.309 ± 0.257
GluPro: 2.618 ± 0.428
GluGln: 2.46 ± 0.34
GluArg: 5.392 ± 0.645
GluSer: 2.303 ± 0.304
GluThr: 2.984 ± 0.326
GluVal: 5.235 ± 0.729
GluTrp: 0.785 ± 0.226
GluTyr: 2.042 ± 0.369
GluXaa: 0.0 ± 0.0
Phe
PheAla: 3.036 ± 0.437
PheCys: 0.314 ± 0.107
PheAsp: 3.036 ± 0.402
PheGlu: 1.361 ± 0.242
PhePhe: 0.576 ± 0.184
PheGly: 2.827 ± 0.363
PheHis: 0.366 ± 0.115
PheIle: 1.256 ± 0.215
PheLys: 1.047 ± 0.221
PheLeu: 1.675 ± 0.344
PheMet: 0.576 ± 0.178
PheAsn: 1.256 ± 0.285
PhePro: 1.256 ± 0.251
PheGln: 0.681 ± 0.205
PheArg: 1.675 ± 0.368
PheSer: 1.099 ± 0.288
PheThr: 1.571 ± 0.253
PheVal: 2.199 ± 0.372
PheTrp: 0.366 ± 0.12
PheTyr: 0.366 ± 0.198
PheXaa: 0.0 ± 0.0
Gly
GlyAla: 11.203 ± 0.967
GlyCys: 1.204 ± 0.3
GlyAsp: 4.973 ± 0.562
GlyGlu: 4.607 ± 0.543
GlyPhe: 2.042 ± 0.388
GlyGly: 10.156 ± 1.655
GlyHis: 1.885 ± 0.337
GlyIle: 2.879 ± 0.494
GlyLys: 3.717 ± 0.529
GlyLeu: 7.015 ± 0.642
GlyMet: 1.885 ± 0.275
GlyAsn: 3.193 ± 0.412
GlyPro: 3.769 ± 0.522
GlyGln: 1.78 ± 0.355
GlyArg: 5.916 ± 0.488
GlySer: 6.073 ± 0.68
GlyThr: 5.287 ± 0.524
GlyVal: 6.91 ± 0.76
GlyTrp: 2.513 ± 0.294
GlyTyr: 2.303 ± 0.488
GlyXaa: 0.0 ± 0.0
His
HisAla: 2.513 ± 0.412
HisCys: 0.262 ± 0.103
HisAsp: 1.571 ± 0.357
HisGlu: 1.309 ± 0.314
HisPhe: 0.524 ± 0.207
HisGly: 2.356 ± 0.39
HisHis: 0.471 ± 0.142
HisIle: 0.785 ± 0.237
HisLys: 0.471 ± 0.161
HisLeu: 1.885 ± 0.241
HisMet: 0.366 ± 0.127
HisAsn: 0.733 ± 0.166
HisPro: 1.466 ± 0.29
HisGln: 0.262 ± 0.116
HisArg: 1.832 ± 0.336
HisSer: 0.89 ± 0.204
HisThr: 1.413 ± 0.217
HisVal: 1.728 ± 0.305
HisTrp: 0.681 ± 0.213
HisTyr: 0.681 ± 0.193
HisXaa: 0.0 ± 0.0
Ile
IleAla: 6.334 ± 0.624
IleCys: 0.157 ± 0.083
IleAsp: 3.455 ± 0.52
IleGlu: 2.827 ± 0.407
IlePhe: 0.576 ± 0.193
IleGly: 4.24 ± 0.824
IleHis: 0.471 ± 0.156
IleIle: 0.838 ± 0.221
IleLys: 1.675 ± 0.272
IleLeu: 2.932 ± 0.377
IleMet: 0.366 ± 0.134
IleAsn: 1.309 ± 0.258
IlePro: 2.146 ± 0.447
IleGln: 1.099 ± 0.248
IleArg: 2.408 ± 0.304
IleSer: 2.042 ± 0.311
IleThr: 2.513 ± 0.297
IleVal: 3.612 ± 0.439
IleTrp: 0.524 ± 0.145
IleTyr: 0.366 ± 0.126
IleXaa: 0.0 ± 0.0
Lys
LysAla: 3.926 ± 0.493
LysCys: 0.419 ± 0.154
LysAsp: 1.152 ± 0.215
LysGlu: 0.471 ± 0.137
LysPhe: 1.204 ± 0.287
LysGly: 2.879 ± 0.44
LysHis: 0.681 ± 0.166
LysIle: 1.675 ± 0.364
LysLys: 0.89 ± 0.236
LysLeu: 3.298 ± 0.36
LysMet: 1.309 ± 0.255
LysAsn: 0.419 ± 0.154
LysPro: 2.46 ± 0.392
LysGln: 0.576 ± 0.131
LysArg: 2.984 ± 0.454
LysSer: 1.152 ± 0.228
LysThr: 2.094 ± 0.365
LysVal: 3.089 ± 0.363
LysTrp: 0.314 ± 0.107
LysTyr: 1.047 ± 0.275
LysXaa: 0.0 ± 0.0
Leu
LeuAla: 13.14 ± 0.83
LeuCys: 0.733 ± 0.221
LeuAsp: 7.905 ± 0.747
LeuGlu: 2.408 ± 0.358
LeuPhe: 1.832 ± 0.268
LeuGly: 6.753 ± 0.631
LeuHis: 2.251 ± 0.396
LeuIle: 3.874 ± 0.464
LeuLys: 2.094 ± 0.332
LeuLeu: 5.34 ± 0.525
LeuMet: 1.675 ± 0.215
LeuAsn: 2.251 ± 0.329
LeuPro: 4.083 ± 0.527
LeuGln: 2.251 ± 0.382
LeuArg: 6.02 ± 0.724
LeuSer: 5.392 ± 0.437
LeuThr: 4.973 ± 0.538
LeuVal: 5.811 ± 0.595
LeuTrp: 1.309 ± 0.248
LeuTyr: 1.571 ± 0.326
LeuXaa: 0.0 ± 0.0
Met
MetAla: 2.775 ± 0.398
MetCys: 0.052 ± 0.055
MetAsp: 0.89 ± 0.214
MetGlu: 0.733 ± 0.2
MetPhe: 0.89 ± 0.196
MetGly: 1.571 ± 0.311
MetHis: 0.628 ± 0.169
MetIle: 1.099 ± 0.231
MetLys: 0.419 ± 0.14
MetLeu: 1.518 ± 0.313
MetMet: 0.785 ± 0.214
MetAsn: 0.838 ± 0.193
MetPro: 1.309 ± 0.263
MetGln: 0.628 ± 0.163
MetArg: 1.309 ± 0.251
MetSer: 2.513 ± 0.35
MetThr: 2.042 ± 0.329
MetVal: 1.571 ± 0.339
MetTrp: 0.419 ± 0.144
MetTyr: 0.419 ± 0.171
MetXaa: 0.0 ± 0.0
Asn
AsnAla: 3.612 ± 0.46
AsnCys: 0.366 ± 0.141
AsnAsp: 1.571 ± 0.293
AsnGlu: 1.413 ± 0.26
AsnPhe: 0.942 ± 0.318
AsnGly: 3.769 ± 0.471
AsnHis: 0.576 ± 0.157
AsnIle: 0.995 ± 0.237
AsnLys: 0.838 ± 0.232
AsnLeu: 2.303 ± 0.34
AsnMet: 0.471 ± 0.144
AsnAsn: 0.628 ± 0.215
AsnPro: 2.199 ± 0.319
AsnGln: 0.681 ± 0.155
AsnArg: 1.413 ± 0.262
AsnSer: 1.047 ± 0.201
AsnThr: 1.675 ± 0.241
AsnVal: 2.513 ± 0.376
AsnTrp: 0.524 ± 0.167
AsnTyr: 0.366 ± 0.115
AsnXaa: 0.0 ± 0.0
Pro
ProAla: 8.114 ± 0.673
ProCys: 0.524 ± 0.154
ProAsp: 3.193 ± 0.396
ProGlu: 4.293 ± 0.5
ProPhe: 1.78 ± 0.322
ProGly: 5.287 ± 0.577
ProHis: 1.152 ± 0.296
ProIle: 1.885 ± 0.248
ProLys: 2.146 ± 0.436
ProLeu: 3.874 ± 0.472
ProMet: 1.361 ± 0.247
ProAsn: 1.413 ± 0.319
ProPro: 3.089 ± 0.443
ProGln: 1.885 ± 0.346
ProArg: 3.089 ± 0.456
ProSer: 2.775 ± 0.38
ProThr: 2.722 ± 0.374
ProVal: 4.816 ± 0.431
ProTrp: 1.152 ± 0.251
ProTyr: 1.361 ± 0.262
ProXaa: 0.0 ± 0.0
Gln
GlnAla: 3.926 ± 0.541
GlnCys: 0.471 ± 0.16
GlnAsp: 0.995 ± 0.214
GlnGlu: 1.309 ± 0.24
GlnPhe: 0.838 ± 0.231
GlnGly: 2.356 ± 0.317
GlnHis: 0.942 ± 0.189
GlnIle: 1.78 ± 0.272
GlnLys: 0.733 ± 0.209
GlnLeu: 2.094 ± 0.366
GlnMet: 1.152 ± 0.214
GlnAsn: 0.681 ± 0.238
GlnPro: 2.146 ± 0.334
GlnGln: 1.675 ± 0.341
GlnArg: 2.984 ± 0.381
GlnSer: 0.733 ± 0.187
GlnThr: 1.78 ± 0.345
GlnVal: 3.35 ± 0.461
GlnTrp: 0.733 ± 0.217
GlnTyr: 0.995 ± 0.189
GlnXaa: 0.0 ± 0.0
Arg
ArgAla: 7.381 ± 0.835
ArgCys: 0.785 ± 0.237
ArgAsp: 4.293 ± 0.535
ArgGlu: 5.026 ± 0.604
ArgPhe: 2.146 ± 0.411
ArgGly: 5.235 ± 0.507
ArgHis: 2.094 ± 0.343
ArgIle: 2.722 ± 0.383
ArgLys: 2.827 ± 0.427
ArgLeu: 6.23 ± 0.837
ArgMet: 2.199 ± 0.279
ArgAsn: 2.146 ± 0.351
ArgPro: 3.665 ± 0.445
ArgGln: 3.141 ± 0.418
ArgArg: 5.916 ± 0.67
ArgSer: 3.089 ± 0.365
ArgThr: 3.717 ± 0.503
ArgVal: 5.183 ± 0.581
ArgTrp: 2.146 ± 0.352
ArgTyr: 2.146 ± 0.391
ArgXaa: 0.0 ± 0.0
Ser
SerAla: 6.596 ± 0.87
SerCys: 0.262 ± 0.125
SerAsp: 2.67 ± 0.422
SerGlu: 2.513 ± 0.374
SerPhe: 1.518 ± 0.247
SerGly: 4.397 ± 0.633
SerHis: 1.204 ± 0.329
SerIle: 1.832 ± 0.375
SerLys: 1.152 ± 0.256
SerLeu: 4.293 ± 0.41
SerMet: 1.204 ± 0.274
SerAsn: 1.309 ± 0.28
SerPro: 3.455 ± 0.405
SerGln: 1.309 ± 0.245
SerArg: 3.193 ± 0.452
SerSer: 3.717 ± 0.576
SerThr: 3.193 ± 0.379
SerVal: 3.56 ± 0.348
SerTrp: 1.361 ± 0.319
SerTyr: 1.309 ± 0.233
SerXaa: 0.0 ± 0.0
Thr
ThrAla: 6.858 ± 0.846
ThrCys: 0.628 ± 0.178
ThrAsp: 3.403 ± 0.458
ThrGlu: 3.665 ± 0.489
ThrPhe: 1.989 ± 0.335
ThrGly: 5.235 ± 0.561
ThrHis: 1.047 ± 0.264
ThrIle: 3.246 ± 0.444
ThrLys: 2.199 ± 0.343
ThrLeu: 4.188 ± 0.425
ThrMet: 0.628 ± 0.163
ThrAsn: 1.309 ± 0.263
ThrPro: 3.56 ± 0.414
ThrGln: 1.466 ± 0.307
ThrArg: 3.874 ± 0.53
ThrSer: 3.141 ± 0.394
ThrThr: 3.141 ± 0.413
ThrVal: 4.764 ± 0.45
ThrTrp: 1.152 ± 0.206
ThrTyr: 1.728 ± 0.297
ThrXaa: 0.0 ± 0.0
Val
ValAla: 9.371 ± 0.875
ValCys: 0.995 ± 0.223
ValAsp: 6.439 ± 0.674
ValGlu: 5.811 ± 0.669
ValPhe: 1.937 ± 0.303
ValGly: 6.439 ± 0.748
ValHis: 1.728 ± 0.337
ValIle: 2.775 ± 0.388
ValLys: 3.036 ± 0.322
ValLeu: 5.968 ± 0.615
ValMet: 1.361 ± 0.254
ValAsn: 2.513 ± 0.36
ValPro: 4.816 ± 0.538
ValGln: 2.199 ± 0.376
ValArg: 5.392 ± 0.566
ValSer: 3.507 ± 0.433
ValThr: 5.026 ± 0.491
ValVal: 6.491 ± 0.76
ValTrp: 2.618 ± 0.394
ValTyr: 1.466 ± 0.275
ValXaa: 0.0 ± 0.0
Trp
TrpAla: 1.675 ± 0.279
TrpCys: 0.262 ± 0.117
TrpAsp: 0.89 ± 0.251
TrpGlu: 0.89 ± 0.163
TrpPhe: 0.366 ± 0.134
TrpGly: 1.571 ± 0.236
TrpHis: 0.471 ± 0.13
TrpIle: 0.785 ± 0.203
TrpLys: 0.471 ± 0.165
TrpLeu: 3.141 ± 0.387
TrpMet: 0.209 ± 0.109
TrpAsn: 0.89 ± 0.215
TrpPro: 0.838 ± 0.2
TrpGln: 1.413 ± 0.245
TrpArg: 2.565 ± 0.376
TrpSer: 0.838 ± 0.226
TrpThr: 1.047 ± 0.206
TrpVal: 2.408 ± 0.316
TrpTrp: 0.576 ± 0.181
TrpTyr: 0.628 ± 0.155
TrpXaa: 0.0 ± 0.0
Tyr
TyrAla: 2.565 ± 0.287
TyrCys: 0.262 ± 0.109
TyrAsp: 1.78 ± 0.332
TyrGlu: 1.623 ± 0.274
TyrPhe: 0.733 ± 0.176
TyrGly: 2.199 ± 0.37
TyrHis: 0.314 ± 0.119
TyrIle: 0.419 ± 0.156
TyrLys: 0.785 ± 0.192
TyrLeu: 1.78 ± 0.308
TyrMet: 0.471 ± 0.153
TyrAsn: 0.681 ± 0.135
TyrPro: 0.733 ± 0.18
TyrGln: 0.89 ± 0.224
TyrArg: 1.885 ± 0.327
TyrSer: 1.361 ± 0.248
TyrThr: 1.937 ± 0.305
TyrVal: 2.146 ± 0.35
TyrTrp: 0.471 ± 0.23
TyrTyr: 0.314 ± 0.117
TyrXaa: 0.0 ± 0.0
Xaa
XaaAla: 0.0 ± 0.0
XaaCys: 0.0 ± 0.0
XaaAsp: 0.0 ± 0.0
XaaGlu: 0.0 ± 0.0
XaaPhe: 0.0 ± 0.0
XaaGly: 0.0 ± 0.0
XaaHis: 0.0 ± 0.0
XaaIle: 0.0 ± 0.0
XaaLys: 0.0 ± 0.0
XaaLeu: 0.0 ± 0.0
XaaMet: 0.0 ± 0.0
XaaAsn: 0.0 ± 0.0
XaaPro: 0.0 ± 0.0
XaaGln: 0.0 ± 0.0
XaaArg: 0.0 ± 0.0
XaaSer: 0.0 ± 0.0
XaaThr: 0.0 ± 0.0
XaaVal: 0.0 ± 0.0
XaaTrp: 0.0 ± 0.0
XaaTyr: 0.0 ± 0.0
XaaXaa: 0.0 ± 0.0
Statistics based on 102 proteins (19103 amino acids)

Note: The error was estimated by bootstrapping (100 replicates) at the protein level.
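
To make the note above concrete, a protein-level bootstrap can be sketched as follows: whole proteins are resampled with replacement, the statistic is recomputed for each resample, and the spread of the replicates serves as the error. This is a minimal sketch and not the database's own procedure. The names `pair_per_mille` and `bootstrap_error`, the use of the sample standard deviation as the error measure, and the toy sequences are all assumptions.

```python
import random
import statistics


def pair_per_mille(proteins, pair):
    """Per mille frequency of one dipeptide (one-letter codes, e.g. "AA")."""
    hits = total = 0
    for seq in proteins:
        # Overlapping pairs within each protein (assumption for this sketch).
        for a, b in zip(seq, seq[1:]):
            total += 1
            hits += (a + b == pair)
    return 1000.0 * hits / total if total else 0.0


def bootstrap_error(proteins, statistic, n_replicates=100, seed=0):
    """Standard deviation of `statistic` over protein-level resamples."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    replicates = []
    for _ in range(n_replicates):
        # Resample whole proteins with replacement, keeping the sample size.
        sample = rng.choices(proteins, k=len(proteins))
        replicates.append(statistic(sample))
    return statistics.stdev(replicates)


# Hypothetical toy proteome; real input would be the 102 protein sequences.
toy = ["MAAKL", "AAGAA", "MKVLC", "GAAAV"]
error = bootstrap_error(toy, lambda sample: pair_per_mille(sample, "AA"))
print(round(error, 3))
```

Resampling proteins rather than individual residues keeps each sequence intact, so pairs that co-occur within one protein stay together in every replicate.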

The dipeptide statistics above (among other statistics for this proteome) can be downloaded from this CSV file
See this proteome in: uniprot_link
Proteome-pI is available under the Creative Commons Attribution-NoDerivs license; for more details see here

Reference: Kozlowski LP. Proteome-pI 2.0: Proteome Isoelectric Point Database Update. Nucleic Acids Res. 2021, doi: 10.1093/nar/gkab944
Contact: Lukasz P. Kozlowski