Amino acid dipeptide frequency for Mycobacterium phage Hosp

Apart from single amino acid frequencies, one can also calculate the so-called amino acid dipeptide frequency, that is, how often each ordered pair of adjacent residues occurs.

There are 400 possible dipeptides (441 if Xaa is counted as an additional, 21st amino acid representing all non-standard or unknown amino acids). Thus, if all dipeptides occurred in proteins randomly and with equal probability, each of the 400 would be present at a frequency of 0.25% (2.5‰). As this is not the case in nature, for better visibility all dipeptides that occur more often than expected are marked in red, and those that are underrepresented are marked in blue in the table.
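As a minimal sketch of the calculation described above, the following Python snippet counts overlapping dipeptides in a set of protein sequences and reports each one as a fraction of all dipeptides observed. The function name `dipeptide_frequencies`, the toy sequences, and the choice to normalise by the total dipeptide count are assumptions made for illustration; the exact normalisation used for the table is not stated on this page.

```python
from collections import Counter

# One-letter codes of the 20 standard amino acids.
STANDARD = set("ACDEFGHIKLMNPQRSTVWY")

# Uniform expectation: each of the 20 x 20 dipeptides equally likely.
EXPECTED_UNIFORM = 1 / (20 * 20)  # 0.0025, i.e. 0.25%


def dipeptide_frequencies(sequences):
    """Return {dipeptide: fraction} over all overlapping residue pairs.

    Hypothetical helper: residues outside the 20 standard ones are
    mapped to 'X' (Xaa), giving up to 441 possible keys.
    """
    counts = Counter()
    for seq in sequences:
        cleaned = "".join(aa if aa in STANDARD else "X" for aa in seq.upper())
        # Pairs are counted within a protein only, never across two proteins.
        for i in range(len(cleaned) - 1):
            counts[cleaned[i:i + 2]] += 1
    total = sum(counts.values())
    if total == 0:
        return {}
    return {pair: n / total for pair, n in counts.items()}


# Toy input (not real phage proteins): MA, AA, AG from the first
# sequence and AA, AK from the second, five dipeptides in total.
toy = ["MAAG", "AAK"]
freqs = dipeptide_frequencies(toy)
```

With real data, `sequences` would hold the protein sequences of the proteome, and each resulting fraction could be compared with `EXPECTED_UNIFORM`.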

All values are given in per mille (‰); multiply them by 10⁻³ to obtain fractions.
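The unit conversion can be illustrated with a short Python sketch. It converts two values taken from the table (AlaAla and CysLys) from per mille to fractions and compares them with the uniform expectation of 1/400, which is 2.5‰. The helper names and the plain greater-than/less-than comparison are assumptions for illustration; the colouring on the page may use a different threshold.

```python
# Uniform expectation for 400 equally likely dipeptides, in per mille.
EXPECTED_PER_MILLE = 1000 / 400  # 2.5


def per_mille_to_fraction(value):
    """Convert a per mille value to a plain fraction (multiply by 10^-3)."""
    return value * 1e-3


def representation(value_per_mille, expected=EXPECTED_PER_MILLE):
    """Classify a dipeptide relative to the uniform expectation.

    Hypothetical rule: a simple comparison that ignores the error estimate.
    """
    if value_per_mille > expected:
        return "over"
    if value_per_mille < expected:
        return "under"
    return "as expected"


# Values copied from the table, in per mille.
ala_ala = 17.412
cys_lys = 0.044

ala_ala_fraction = per_mille_to_fraction(ala_ala)  # about 0.017412
ala_ala_class = representation(ala_ala)
cys_lys_class = representation(cys_lys)
```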

For more information, see the sequence space article on Wikipedia.

Ala Cys Asp Glu Phe Gly His Ile Lys Leu Met Asn Pro Gln Arg Ser Thr Val Trp Tyr Xaa
Ala
17.412AlaAla: 17.412 ± 1.582
1.022AlaCys: 1.022 ± 0.214
7.862AlaAsp: 7.862 ± 0.537
10.305AlaGlu: 10.305 ± 0.871
2.71AlaPhe: 2.71 ± 0.364
10.483AlaGly: 10.483 ± 1.433
1.821AlaHis: 1.821 ± 0.381
4.753AlaIle: 4.753 ± 0.557
3.287AlaLys: 3.287 ± 0.464
11.016AlaLeu: 11.016 ± 0.968
3.109AlaMet: 3.109 ± 0.351
2.887AlaAsn: 2.887 ± 0.415
8.129AlaPro: 8.129 ± 0.64
5.197AlaGln: 5.197 ± 0.682
8.44AlaArg: 8.44 ± 0.817
6.219AlaSer: 6.219 ± 0.631
7.64AlaThr: 7.64 ± 0.683
8.573AlaVal: 8.573 ± 0.705
2.177AlaTrp: 2.177 ± 0.296
2.354AlaTyr: 2.354 ± 0.372
0.0AlaXaa: 0.0 ± 0.0
Cys
1.11CysAla: 1.11 ± 0.285
0.222CysCys: 0.222 ± 0.083
0.666CysAsp: 0.666 ± 0.175
0.355CysGlu: 0.355 ± 0.127
0.311CysPhe: 0.311 ± 0.108
1.555CysGly: 1.555 ± 0.304
0.355CysHis: 0.355 ± 0.142
0.133CysIle: 0.133 ± 0.081
0.044CysLys: 0.044 ± 0.045
0.4CysLeu: 0.4 ± 0.13
0.178CysMet: 0.178 ± 0.098
0.267CysAsn: 0.267 ± 0.105
0.933CysPro: 0.933 ± 0.275
0.489CysGln: 0.489 ± 0.152
1.11CysArg: 1.11 ± 0.262
0.489CysSer: 0.489 ± 0.135
0.311CysThr: 0.311 ± 0.128
0.577CysVal: 0.577 ± 0.172
0.267CysTrp: 0.267 ± 0.122
0.044CysTyr: 0.044 ± 0.042
0.0CysXaa: 0.0 ± 0.0
Asp
7.507AspAla: 7.507 ± 0.494
1.066AspCys: 1.066 ± 0.334
3.909AspAsp: 3.909 ± 0.477
3.909AspGlu: 3.909 ± 0.588
1.421AspPhe: 1.421 ± 0.238
6.13AspGly: 6.13 ± 0.549
1.421AspHis: 1.421 ± 0.28
2.754AspIle: 2.754 ± 0.432
1.421AspLys: 1.421 ± 0.268
4.708AspLeu: 4.708 ± 0.372
1.199AspMet: 1.199 ± 0.253
1.599AspAsn: 1.599 ± 0.267
5.33AspPro: 5.33 ± 0.623
2.71AspGln: 2.71 ± 0.348
4.397AspArg: 4.397 ± 0.503
2.487AspSer: 2.487 ± 0.356
3.953AspThr: 3.953 ± 0.506
4.442AspVal: 4.442 ± 0.322
1.421AspTrp: 1.421 ± 0.228
1.199AspTyr: 1.199 ± 0.235
0.0AspXaa: 0.0 ± 0.0
Glu
8.75GluAla: 8.75 ± 0.729
0.844GluCys: 0.844 ± 0.201
3.109GluAsp: 3.109 ± 0.444
2.621GluGlu: 2.621 ± 0.376
1.954GluPhe: 1.954 ± 0.285
4.708GluGly: 4.708 ± 0.439
0.977GluHis: 0.977 ± 0.181
2.754GluIle: 2.754 ± 0.392
1.199GluLys: 1.199 ± 0.322
5.997GluLeu: 5.997 ± 0.69
1.155GluMet: 1.155 ± 0.194
1.244GluAsn: 1.244 ± 0.347
4.575GluPro: 4.575 ± 0.555
2.798GluGln: 2.798 ± 0.395
3.864GluArg: 3.864 ± 0.484
2.177GluSer: 2.177 ± 0.237
4.575GluThr: 4.575 ± 0.503
4.842GluVal: 4.842 ± 0.527
1.244GluTrp: 1.244 ± 0.222
1.333GluTyr: 1.333 ± 0.2
0.0GluXaa: 0.0 ± 0.0
Phe
2.843PheAla: 2.843 ± 0.317
0.222PheCys: 0.222 ± 0.104
2.354PheAsp: 2.354 ± 0.339
1.199PheGlu: 1.199 ± 0.255
0.4PhePhe: 0.4 ± 0.129
3.109PheGly: 3.109 ± 0.429
0.622PheHis: 0.622 ± 0.175
1.066PheIle: 1.066 ± 0.217
0.844PheLys: 0.844 ± 0.184
1.866PheLeu: 1.866 ± 0.303
0.222PheMet: 0.222 ± 0.107
0.666PheAsn: 0.666 ± 0.155
1.199PhePro: 1.199 ± 0.269
0.844PheGln: 0.844 ± 0.196
2.043PheArg: 2.043 ± 0.263
1.11PheSer: 1.11 ± 0.25
1.643PheThr: 1.643 ± 0.287
2.399PheVal: 2.399 ± 0.305
0.4PheTrp: 0.4 ± 0.132
0.311PheTyr: 0.311 ± 0.137
0.0PheXaa: 0.0 ± 0.0
Gly
10.172GlyAla: 10.172 ± 1.513
0.755GlyCys: 0.755 ± 0.217
5.863GlyAsp: 5.863 ± 0.416
5.286GlyGlu: 5.286 ± 0.441
2.443GlyPhe: 2.443 ± 0.359
10.661GlyGly: 10.661 ± 1.7
1.777GlyHis: 1.777 ± 0.341
3.642GlyIle: 3.642 ± 0.383
2.887GlyLys: 2.887 ± 0.299
8.884GlyLeu: 8.884 ± 1.045
2.088GlyMet: 2.088 ± 0.318
3.02GlyAsn: 3.02 ± 0.389
3.82GlyPro: 3.82 ± 0.418
5.019GlyGln: 5.019 ± 0.478
6.263GlyArg: 6.263 ± 0.594
4.842GlySer: 4.842 ± 0.436
5.863GlyThr: 5.863 ± 0.562
6.752GlyVal: 6.752 ± 0.688
2.399GlyTrp: 2.399 ± 0.344
2.976GlyTyr: 2.976 ± 0.388
0.0GlyXaa: 0.0 ± 0.0
His
1.555HisAla: 1.555 ± 0.278
0.178HisCys: 0.178 ± 0.089
1.288HisAsp: 1.288 ± 0.256
0.489HisGlu: 0.489 ± 0.143
0.533HisPhe: 0.533 ± 0.159
1.555HisGly: 1.555 ± 0.279
0.444HisHis: 0.444 ± 0.136
0.844HisIle: 0.844 ± 0.158
0.4HisLys: 0.4 ± 0.136
1.421HisLeu: 1.421 ± 0.241
0.444HisMet: 0.444 ± 0.117
0.533HisAsn: 0.533 ± 0.178
1.599HisPro: 1.599 ± 0.283
0.489HisGln: 0.489 ± 0.136
1.999HisArg: 1.999 ± 0.364
0.489HisSer: 0.489 ± 0.14
1.466HisThr: 1.466 ± 0.345
0.844HisVal: 0.844 ± 0.2
0.489HisTrp: 0.489 ± 0.157
0.444HisTyr: 0.444 ± 0.126
0.0HisXaa: 0.0 ± 0.0
Ile
4.753IleAla: 4.753 ± 0.436
0.222IleCys: 0.222 ± 0.092
3.687IleAsp: 3.687 ± 0.343
3.376IleGlu: 3.376 ± 0.407
0.8IlePhe: 0.8 ± 0.162
3.998IleGly: 3.998 ± 0.504
0.666IleHis: 0.666 ± 0.19
1.91IleIle: 1.91 ± 0.297
1.421IleLys: 1.421 ± 0.235
2.576IleLeu: 2.576 ± 0.26
0.355IleMet: 0.355 ± 0.174
1.421IleAsn: 1.421 ± 0.28
3.198IlePro: 3.198 ± 0.356
1.555IleGln: 1.555 ± 0.227
2.576IleArg: 2.576 ± 0.368
2.043IleSer: 2.043 ± 0.299
3.909IleThr: 3.909 ± 0.418
2.976IleVal: 2.976 ± 0.286
0.267IleTrp: 0.267 ± 0.103
0.666IleTyr: 0.666 ± 0.192
0.0IleXaa: 0.0 ± 0.0
Lys
3.82LysAla: 3.82 ± 0.527
0.178LysCys: 0.178 ± 0.087
1.51LysAsp: 1.51 ± 0.286
1.466LysGlu: 1.466 ± 0.298
0.533LysPhe: 0.533 ± 0.147
2.443LysGly: 2.443 ± 0.383
0.355LysHis: 0.355 ± 0.122
1.333LysIle: 1.333 ± 0.273
0.666LysLys: 0.666 ± 0.183
2.665LysLeu: 2.665 ± 0.358
0.577LysMet: 0.577 ± 0.166
0.933LysAsn: 0.933 ± 0.161
1.288LysPro: 1.288 ± 0.268
0.8LysGln: 0.8 ± 0.173
1.643LysArg: 1.643 ± 0.264
1.555LysSer: 1.555 ± 0.241
2.665LysThr: 2.665 ± 0.348
1.999LysVal: 1.999 ± 0.316
0.355LysTrp: 0.355 ± 0.127
0.666LysTyr: 0.666 ± 0.187
0.0LysXaa: 0.0 ± 0.0
Leu
11.238LeuAla: 11.238 ± 0.67
0.844LeuCys: 0.844 ± 0.224
5.197LeuAsp: 5.197 ± 0.5
4.042LeuGlu: 4.042 ± 0.438
1.954LeuPhe: 1.954 ± 0.379
8.04LeuGly: 8.04 ± 0.809
1.421LeuHis: 1.421 ± 0.271
3.82LeuIle: 3.82 ± 0.339
2.043LeuLys: 2.043 ± 0.381
5.774LeuLeu: 5.774 ± 0.459
2.265LeuMet: 2.265 ± 0.297
1.821LeuAsn: 1.821 ± 0.37
5.774LeuPro: 5.774 ± 0.645
2.576LeuGln: 2.576 ± 0.445
4.264LeuArg: 4.264 ± 0.507
4.264LeuSer: 4.264 ± 0.432
6.707LeuThr: 6.707 ± 0.539
5.375LeuVal: 5.375 ± 0.403
1.288LeuTrp: 1.288 ± 0.243
1.51LeuTyr: 1.51 ± 0.333
0.0LeuXaa: 0.0 ± 0.0
Met
2.976MetAla: 2.976 ± 0.401
0.267MetCys: 0.267 ± 0.099
1.51MetAsp: 1.51 ± 0.2
0.8MetGlu: 0.8 ± 0.199
0.533MetPhe: 0.533 ± 0.159
2.043MetGly: 2.043 ± 0.341
0.4MetHis: 0.4 ± 0.134
1.199MetIle: 1.199 ± 0.266
0.622MetLys: 0.622 ± 0.192
1.333MetLeu: 1.333 ± 0.264
0.355MetMet: 0.355 ± 0.129
0.489MetAsn: 0.489 ± 0.15
1.777MetPro: 1.777 ± 0.239
0.533MetGln: 0.533 ± 0.161
1.555MetArg: 1.555 ± 0.227
1.599MetSer: 1.599 ± 0.254
2.843MetThr: 2.843 ± 0.311
1.066MetVal: 1.066 ± 0.208
0.311MetTrp: 0.311 ± 0.129
0.4MetTyr: 0.4 ± 0.129
0.0MetXaa: 0.0 ± 0.0
Asn
3.909AsnAla: 3.909 ± 0.558
0.267AsnCys: 0.267 ± 0.095
1.866AsnAsp: 1.866 ± 0.283
1.244AsnGlu: 1.244 ± 0.254
0.533AsnPhe: 0.533 ± 0.128
3.198AsnGly: 3.198 ± 0.354
0.711AsnHis: 0.711 ± 0.174
1.155AsnIle: 1.155 ± 0.285
0.933AsnLys: 0.933 ± 0.192
2.132AsnLeu: 2.132 ± 0.259
0.355AsnMet: 0.355 ± 0.116
0.844AsnAsn: 0.844 ± 0.201
1.821AsnPro: 1.821 ± 0.27
0.888AsnGln: 0.888 ± 0.217
1.199AsnArg: 1.199 ± 0.305
1.421AsnSer: 1.421 ± 0.205
1.91AsnThr: 1.91 ± 0.297
1.954AsnVal: 1.954 ± 0.286
0.355AsnTrp: 0.355 ± 0.147
0.888AsnTyr: 0.888 ± 0.185
0.0AsnXaa: 0.0 ± 0.0
Pro
6.885ProAla: 6.885 ± 0.651
0.355ProCys: 0.355 ± 0.13
5.33ProAsp: 5.33 ± 0.657
5.419ProGlu: 5.419 ± 0.668
1.954ProPhe: 1.954 ± 0.261
6.796ProGly: 6.796 ± 0.546
0.888ProHis: 0.888 ± 0.224
2.798ProIle: 2.798 ± 0.324
1.777ProLys: 1.777 ± 0.253
3.953ProLeu: 3.953 ± 0.361
1.421ProMet: 1.421 ± 0.255
1.866ProAsn: 1.866 ± 0.309
4.753ProPro: 4.753 ± 0.528
1.155ProGln: 1.155 ± 0.242
3.509ProArg: 3.509 ± 0.381
3.465ProSer: 3.465 ± 0.464
4.22ProThr: 4.22 ± 0.531
6.085ProVal: 6.085 ± 0.415
1.288ProTrp: 1.288 ± 0.279
0.933ProTyr: 0.933 ± 0.24
0.0ProXaa: 0.0 ± 0.0
Gln
5.863GlnAla: 5.863 ± 0.829
0.4GlnCys: 0.4 ± 0.138
1.199GlnAsp: 1.199 ± 0.216
1.11GlnGlu: 1.11 ± 0.248
0.844GlnPhe: 0.844 ± 0.184
3.154GlnGly: 3.154 ± 0.411
0.444GlnHis: 0.444 ± 0.128
2.399GlnIle: 2.399 ± 0.317
0.711GlnLys: 0.711 ± 0.171
3.554GlnLeu: 3.554 ± 0.451
1.022GlnMet: 1.022 ± 0.211
0.666GlnAsn: 0.666 ± 0.146
2.399GlnPro: 2.399 ± 0.348
0.977GlnGln: 0.977 ± 0.189
3.065GlnArg: 3.065 ± 0.438
0.888GlnSer: 0.888 ± 0.202
2.443GlnThr: 2.443 ± 0.424
3.331GlnVal: 3.331 ± 0.255
0.577GlnTrp: 0.577 ± 0.201
0.755GlnTyr: 0.755 ± 0.187
0.0GlnXaa: 0.0 ± 0.0
Arg
7.773ArgAla: 7.773 ± 0.693
0.666ArgCys: 0.666 ± 0.168
4.309ArgAsp: 4.309 ± 0.416
3.82ArgGlu: 3.82 ± 0.384
1.866ArgPhe: 1.866 ± 0.286
5.375ArgGly: 5.375 ± 0.5
1.333ArgHis: 1.333 ± 0.286
1.999ArgIle: 1.999 ± 0.298
2.043ArgLys: 2.043 ± 0.305
6.263ArgLeu: 6.263 ± 0.541
2.043ArgMet: 2.043 ± 0.311
1.954ArgAsn: 1.954 ± 0.231
4.042ArgPro: 4.042 ± 0.532
2.443ArgGln: 2.443 ± 0.337
5.508ArgArg: 5.508 ± 0.635
3.376ArgSer: 3.376 ± 0.347
3.42ArgThr: 3.42 ± 0.361
5.241ArgVal: 5.241 ± 0.598
1.688ArgTrp: 1.688 ± 0.3
1.643ArgTyr: 1.643 ± 0.279
0.0ArgXaa: 0.0 ± 0.0
Ser
6.13SerAla: 6.13 ± 0.459
0.4SerCys: 0.4 ± 0.127
2.443SerAsp: 2.443 ± 0.35
2.177SerGlu: 2.177 ± 0.352
1.11SerPhe: 1.11 ± 0.283
5.552SerGly: 5.552 ± 0.609
0.577SerHis: 0.577 ± 0.136
1.954SerIle: 1.954 ± 0.315
1.555SerLys: 1.555 ± 0.261
4.131SerLeu: 4.131 ± 0.443
0.888SerMet: 0.888 ± 0.197
1.333SerAsn: 1.333 ± 0.25
2.932SerPro: 2.932 ± 0.4
1.999SerGln: 1.999 ± 0.271
3.198SerArg: 3.198 ± 0.332
2.576SerSer: 2.576 ± 0.389
3.065SerThr: 3.065 ± 0.396
4.886SerVal: 4.886 ± 0.521
1.155SerTrp: 1.155 ± 0.207
0.755SerTyr: 0.755 ± 0.168
0.0SerXaa: 0.0 ± 0.0
Thr
9.506ThrAla: 9.506 ± 0.672
0.622ThrCys: 0.622 ± 0.169
4.087ThrAsp: 4.087 ± 0.565
5.597ThrGlu: 5.597 ± 0.655
1.954ThrPhe: 1.954 ± 0.32
7.462ThrGly: 7.462 ± 0.601
1.155ThrHis: 1.155 ± 0.228
2.621ThrIle: 2.621 ± 0.326
2.487ThrLys: 2.487 ± 0.382
4.975ThrLeu: 4.975 ± 0.57
1.821ThrMet: 1.821 ± 0.338
1.688ThrAsn: 1.688 ± 0.27
4.575ThrPro: 4.575 ± 0.602
1.599ThrGln: 1.599 ± 0.257
4.309ThrArg: 4.309 ± 0.401
3.376ThrSer: 3.376 ± 0.329
3.953ThrThr: 3.953 ± 0.455
5.064ThrVal: 5.064 ± 0.569
2.265ThrTrp: 2.265 ± 0.304
1.466ThrTyr: 1.466 ± 0.273
0.0ThrXaa: 0.0 ± 0.0
Val
8.351ValAla: 8.351 ± 0.605
0.577ValCys: 0.577 ± 0.168
4.309ValAsp: 4.309 ± 0.459
5.241ValGlu: 5.241 ± 0.375
1.954ValPhe: 1.954 ± 0.317
6.263ValGly: 6.263 ± 0.696
1.11ValHis: 1.11 ± 0.223
3.864ValIle: 3.864 ± 0.385
2.177ValLys: 2.177 ± 0.263
5.241ValLeu: 5.241 ± 0.492
1.866ValMet: 1.866 ± 0.305
2.843ValAsn: 2.843 ± 0.384
4.131ValPro: 4.131 ± 0.342
2.31ValGln: 2.31 ± 0.301
4.575ValArg: 4.575 ± 0.513
4.442ValSer: 4.442 ± 0.437
7.018ValThr: 7.018 ± 0.596
5.33ValVal: 5.33 ± 0.578
1.155ValTrp: 1.155 ± 0.259
1.732ValTyr: 1.732 ± 0.239
0.0ValXaa: 0.0 ± 0.0
Trp
2.043TrpAla: 2.043 ± 0.379
0.444TrpCys: 0.444 ± 0.145
0.977TrpAsp: 0.977 ± 0.217
1.022TrpGlu: 1.022 ± 0.218
1.11TrpPhe: 1.11 ± 0.273
0.888TrpGly: 0.888 ± 0.159
0.666TrpHis: 0.666 ± 0.175
0.711TrpIle: 0.711 ± 0.15
0.444TrpLys: 0.444 ± 0.115
1.599TrpLeu: 1.599 ± 0.282
0.622TrpMet: 0.622 ± 0.159
0.844TrpAsn: 0.844 ± 0.203
1.288TrpPro: 1.288 ± 0.244
0.489TrpGln: 0.489 ± 0.162
1.466TrpArg: 1.466 ± 0.25
1.199TrpSer: 1.199 ± 0.247
1.688TrpThr: 1.688 ± 0.321
1.022TrpVal: 1.022 ± 0.251
0.577TrpTrp: 0.577 ± 0.167
0.755TrpTyr: 0.755 ± 0.191
0.0TrpXaa: 0.0 ± 0.0
Tyr
2.798TyrAla: 2.798 ± 0.331
0.267TyrCys: 0.267 ± 0.102
1.51TyrAsp: 1.51 ± 0.233
1.288TyrGlu: 1.288 ± 0.21
0.444TyrPhe: 0.444 ± 0.139
1.999TyrGly: 1.999 ± 0.325
0.355TyrHis: 0.355 ± 0.117
0.666TyrIle: 0.666 ± 0.133
0.489TyrLys: 0.489 ± 0.135
1.732TyrLeu: 1.732 ± 0.263
0.577TyrMet: 0.577 ± 0.153
0.666TyrAsn: 0.666 ± 0.187
1.288TyrPro: 1.288 ± 0.281
0.933TyrGln: 0.933 ± 0.18
1.821TyrArg: 1.821 ± 0.325
0.8TyrSer: 0.8 ± 0.237
1.288TyrThr: 1.288 ± 0.255
1.732TyrVal: 1.732 ± 0.271
0.178TyrTrp: 0.178 ± 0.085
0.533TyrTyr: 0.533 ± 0.15
0.0TyrXaa: 0.0 ± 0.0
Xaa
0.0XaaAla: 0.0 ± 0.0
0.0XaaCys: 0.0 ± 0.0
0.0XaaAsp: 0.0 ± 0.0
0.0XaaGlu: 0.0 ± 0.0
0.0XaaPhe: 0.0 ± 0.0
0.0XaaGly: 0.0 ± 0.0
0.0XaaHis: 0.0 ± 0.0
0.0XaaIle: 0.0 ± 0.0
0.0XaaLys: 0.0 ± 0.0
0.0XaaLeu: 0.0 ± 0.0
0.0XaaMet: 0.0 ± 0.0
0.0XaaAsn: 0.0 ± 0.0
0.0XaaPro: 0.0 ± 0.0
0.0XaaGln: 0.0 ± 0.0
0.0XaaArg: 0.0 ± 0.0
0.0XaaSer: 0.0 ± 0.0
0.0XaaThr: 0.0 ± 0.0
0.0XaaVal: 0.0 ± 0.0
0.0XaaTrp: 0.0 ± 0.0
0.0XaaTyr: 0.0 ± 0.0
0.0XaaXaa: 0.0 ± 0.0
Statistics based on 97 proteins (22514 amino acids)

Note: The error has been estimated by bootstrapping (100 resamples) at the protein level.
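A protein-level bootstrap of this kind can be sketched in Python as follows: whole proteins are resampled with replacement, the statistic is recomputed on each resample, and the spread of the results serves as the error. Reporting the standard deviation as the error, the function names, the fixed seed, and the toy sequences are all assumptions made for illustration; the page does not state which spread measure was used.

```python
import random
import statistics


def ala_ala_per_mille(proteins):
    """Frequency of the AlaAla dipeptide, in per mille of all dipeptides."""
    pairs = 0
    hits = 0
    for seq in proteins:
        for i in range(len(seq) - 1):
            pairs += 1
            hits += seq[i:i + 2] == "AA"
    return 1000 * hits / pairs if pairs else 0.0


def bootstrap_error(proteins, statistic, n_resamples=100, seed=0):
    """Standard deviation of `statistic` over protein-level resamples.

    Each resample draws len(proteins) whole proteins with replacement,
    so residues from one protein always stay together.
    """
    rng = random.Random(seed)
    estimates = []
    for _ in range(n_resamples):
        sample = [rng.choice(proteins) for _ in proteins]
        estimates.append(statistic(sample))
    return statistics.stdev(estimates)


# Toy sequences, not real phage proteins.
toy = ["MAAG", "AAK", "MKT"]
error = bootstrap_error(toy, ala_ala_per_mille)
```

Resampling whole proteins rather than individual residues keeps the dipeptides of each protein intact, so the error reflects variation between proteins.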

The dipeptide statistics above (among other statistics for this proteome) can be downloaded from this CSV file
See this proteome in: uniprot_link
Proteome-pI is available under the Creative Commons Attribution-NoDerivs license; for more details see here

Reference: Kozlowski LP. Proteome-pI 2.0: Proteome Isoelectric Point Database Update. Nucleic Acids Res. 2021, doi: 10.1093/nar/gkab944
Contact: Lukasz P. Kozlowski