Amino acid dipeptide frequency for Mycobacterium phage Gancho

Apart from single amino acid frequencies, one can also calculate the so-called amino acid dipeptide frequency, i.e. how often each ordered pair of adjacent residues occurs.

There are 400 possible dipeptides (441 if Xaa is counted as an additional, 21st amino acid representing all non-standard or unknown amino acids). Thus, if dipeptides occurred randomly in proteins, with all amino acids equally likely, each of the 400 dipeptides would be present at a frequency of 0.25% (2.5‰). As this is not the case in nature, for better visibility dipeptides that occur more often than expected are marked in red in the table, and those that are underrepresented are marked in blue.
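The counting described above can be sketched in a few lines of Python. This is a minimal illustration, not the Proteome-pI implementation: the function name `dipeptide_per_mille`, the use of overlapping windows, the normalisation by the total number of dipeptides, and the mapping of every non-standard letter to `X` (Xaa) are all assumptions.

```python
from collections import Counter
from itertools import product

STANDARD = "ACDEFGHIKLMNPQRSTVWY"  # one-letter codes of the 20 standard amino acids
ALPHABET = STANDARD + "X"          # X stands for Xaa (assumed catch-all symbol)


def dipeptide_per_mille(proteins):
    """Return the frequency of all 441 dipeptides, in per mille (hypothetical helper)."""
    counts = Counter()
    for seq in proteins:
        # Assumption: anything outside the 20 standard residues is treated as Xaa.
        seq = "".join(aa if aa in STANDARD else "X" for aa in seq.upper())
        # Overlapping windows of length 2; a pair never spans two proteins.
        counts.update(seq[i:i + 2] for i in range(len(seq) - 1))
    total = sum(counts.values())
    if total == 0:
        return {a + b: 0.0 for a, b in product(ALPHABET, repeat=2)}
    return {a + b: 1000.0 * counts[a + b] / total
            for a, b in product(ALPHABET, repeat=2)}


# Toy input (not real Gancho proteins): the pairs are AA, AA, AC, AC, CD.
freqs = dipeptide_per_mille(["AAAC", "ACD"])
print(len(freqs), freqs["AA"], freqs["CD"], freqs["XX"])
```

With real data, the list of sequences would be the 86 proteins of the proteome, and the 400 standard entries could then be compared with the uniform expectation of 2.5‰.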

All values are given in per mille (‰), so they need to be multiplied by 10⁻³ to obtain fractions.
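As a small worked example of the unit conversion, the sketch below turns two values from the table (AlaAla and CysCys) into fractions and compares them with the uniform expectation of 1/400. The helper names are hypothetical, and classifying a value simply by whether it lies above or below 2.5‰ is an assumption; the page does not state the exact rule behind the red and blue marking.

```python
# Uniform expectation quoted in the text: 1/400 = 0.25 % = 2.5 per mille.
EXPECTED_PER_MILLE = 1000.0 / 400


def per_mille_to_fraction(value):
    """Convert a per mille value to a plain fraction (multiply by 10^-3)."""
    return value * 1e-3


def classify(value, expected=EXPECTED_PER_MILLE):
    """Label a dipeptide relative to the uniform expectation (assumed rule)."""
    if value > expected:
        return "over"
    if value < expected:
        return "under"
    return "as expected"


# Values copied from the table, in per mille.
for name, value in [("AlaAla", 23.76), ("CysCys", 0.058)]:
    ratio = value / EXPECTED_PER_MILLE
    print(name, per_mille_to_fraction(value), round(ratio, 2), classify(value))
```

Under this reading, AlaAla occurs roughly 9.5 times more often than the uniform expectation, while CysCys is far below it.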

For more information, see the sequence space article on Wikipedia.

Ala Cys Asp Glu Phe Gly His Ile Lys Leu Met Asn Pro Gln Arg Ser Thr Val Trp Tyr Xaa
Ala
AlaAla: 23.76 ± 2.04
AlaCys: 0.805 ± 0.261
AlaAsp: 11.276 ± 0.981
AlaGlu: 8.457 ± 0.918
AlaPhe: 3.107 ± 0.39
AlaGly: 10.356 ± 1.403
AlaHis: 2.014 ± 0.387
AlaIle: 6.271 ± 0.628
AlaLys: 4.372 ± 0.652
AlaLeu: 9.09 ± 0.793
AlaMet: 4.142 ± 0.595
AlaAsn: 4.718 ± 0.575
AlaPro: 8.054 ± 0.907
AlaGln: 3.337 ± 0.413
AlaArg: 8.917 ± 0.741
AlaSer: 4.89 ± 0.487
AlaThr: 7.997 ± 0.542
AlaVal: 10.01 ± 0.874
AlaTrp: 2.531 ± 0.358
AlaTyr: 2.992 ± 0.438
AlaXaa: 0.0 ± 0.0
Cys
CysAla: 1.266 ± 0.378
CysCys: 0.058 ± 0.055
CysAsp: 0.46 ± 0.18
CysGlu: 0.403 ± 0.152
CysPhe: 0.058 ± 0.047
CysGly: 1.899 ± 0.439
CysHis: 0.115 ± 0.083
CysIle: 0.633 ± 0.223
CysLys: 0.23 ± 0.137
CysLeu: 0.748 ± 0.238
CysMet: 0.115 ± 0.087
CysAsn: 0.69 ± 0.284
CysPro: 0.805 ± 0.29
CysGln: 0.058 ± 0.064
CysArg: 0.805 ± 0.201
CysSer: 0.288 ± 0.138
CysThr: 0.978 ± 0.326
CysVal: 0.46 ± 0.215
CysTrp: 0.288 ± 0.139
CysTyr: 0.115 ± 0.087
CysXaa: 0.0 ± 0.0
Asp
AspAla: 10.356 ± 0.902
AspCys: 0.575 ± 0.159
AspAsp: 6.789 ± 0.912
AspGlu: 3.797 ± 0.546
AspPhe: 1.036 ± 0.263
AspGly: 7.076 ± 0.533
AspHis: 1.726 ± 0.345
AspIle: 4.257 ± 0.526
AspLys: 0.92 ± 0.298
AspLeu: 3.855 ± 0.585
AspMet: 1.783 ± 0.326
AspAsn: 2.531 ± 0.41
AspPro: 6.961 ± 0.657
AspGln: 1.496 ± 0.263
AspArg: 5.005 ± 0.596
AspSer: 1.266 ± 0.266
AspThr: 4.89 ± 0.563
AspVal: 5.293 ± 0.544
AspTrp: 1.668 ± 0.296
AspTyr: 1.553 ± 0.286
AspXaa: 0.0 ± 0.0
Glu
GluAla: 8.86 ± 0.931
GluCys: 0.748 ± 0.239
GluAsp: 2.819 ± 0.527
GluGlu: 2.129 ± 0.351
GluPhe: 1.093 ± 0.258
GluGly: 3.222 ± 0.38
GluHis: 1.151 ± 0.254
GluIle: 1.266 ± 0.263
GluLys: 0.805 ± 0.206
GluLeu: 3.394 ± 0.454
GluMet: 1.266 ± 0.3
GluAsn: 1.438 ± 0.308
GluPro: 3.682 ± 0.677
GluGln: 2.129 ± 0.364
GluArg: 4.43 ± 0.558
GluSer: 1.496 ± 0.223
GluThr: 2.704 ± 0.363
GluVal: 4.66 ± 0.527
GluTrp: 0.805 ± 0.198
GluTyr: 1.266 ± 0.252
GluXaa: 0.0 ± 0.0
Phe
PheAla: 2.761 ± 0.403
PheCys: 0.115 ± 0.086
PheAsp: 1.956 ± 0.399
PheGlu: 1.438 ± 0.338
PhePhe: 0.403 ± 0.119
PheGly: 3.855 ± 0.536
PheHis: 0.403 ± 0.159
PheIle: 1.323 ± 0.311
PheLys: 0.92 ± 0.212
PheLeu: 1.323 ± 0.24
PheMet: 0.633 ± 0.196
PheAsn: 1.093 ± 0.232
PhePro: 1.323 ± 0.312
PheGln: 0.403 ± 0.155
PheArg: 1.726 ± 0.369
PheSer: 0.978 ± 0.277
PheThr: 1.208 ± 0.265
PheVal: 2.244 ± 0.353
PheTrp: 0.345 ± 0.131
PheTyr: 0.748 ± 0.166
PheXaa: 0.0 ± 0.0
Gly
GlyAla: 9.378 ± 1.061
GlyCys: 0.748 ± 0.204
GlyAsp: 7.652 ± 0.701
GlyGlu: 3.049 ± 0.449
GlyPhe: 2.589 ± 0.344
GlyGly: 11.966 ± 2.477
GlyHis: 1.899 ± 0.381
GlyIle: 4.142 ± 0.578
GlyLys: 2.359 ± 0.292
GlyLeu: 6.443 ± 0.802
GlyMet: 1.726 ± 0.352
GlyAsn: 3.164 ± 0.526
GlyPro: 4.718 ± 0.618
GlyGln: 3.222 ± 0.453
GlyArg: 7.537 ± 0.721
GlySer: 5.235 ± 0.647
GlyThr: 6.386 ± 0.677
GlyVal: 7.076 ± 0.548
GlyTrp: 1.036 ± 0.222
GlyTyr: 2.474 ± 0.38
GlyXaa: 0.0 ± 0.0
His
HisAla: 2.071 ± 0.405
HisCys: 0.403 ± 0.142
HisAsp: 1.266 ± 0.314
HisGlu: 0.92 ± 0.283
HisPhe: 0.403 ± 0.159
HisGly: 1.899 ± 0.323
HisHis: 0.633 ± 0.223
HisIle: 1.036 ± 0.242
HisLys: 0.46 ± 0.156
HisLeu: 1.093 ± 0.267
HisMet: 0.345 ± 0.127
HisAsn: 0.748 ± 0.215
HisPro: 1.668 ± 0.345
HisGln: 0.288 ± 0.132
HisArg: 2.761 ± 0.454
HisSer: 0.46 ± 0.161
HisThr: 1.726 ± 0.433
HisVal: 2.359 ± 0.303
HisTrp: 0.345 ± 0.134
HisTyr: 0.748 ± 0.214
HisXaa: 0.0 ± 0.0
Ile
IleAla: 6.443 ± 0.661
IleCys: 0.518 ± 0.181
IleAsp: 3.337 ± 0.474
IleGlu: 2.877 ± 0.456
IlePhe: 0.46 ± 0.165
IleGly: 3.97 ± 0.512
IleHis: 1.093 ± 0.265
IleIle: 1.899 ± 0.41
IleLys: 0.633 ± 0.175
IleLeu: 2.474 ± 0.391
IleMet: 0.633 ± 0.238
IleAsn: 1.438 ± 0.321
IlePro: 3.164 ± 0.56
IleGln: 1.208 ± 0.222
IleArg: 4.43 ± 0.559
IleSer: 1.381 ± 0.247
IleThr: 2.704 ± 0.389
IleVal: 3.279 ± 0.426
IleTrp: 0.345 ± 0.153
IleTyr: 1.323 ± 0.265
IleXaa: 0.0 ± 0.0
Lys
LysAla: 4.833 ± 0.785
LysCys: 0.23 ± 0.145
LysAsp: 1.899 ± 0.359
LysGlu: 0.748 ± 0.179
LysPhe: 0.92 ± 0.247
LysGly: 2.244 ± 0.366
LysHis: 0.46 ± 0.147
LysIle: 1.151 ± 0.362
LysLys: 0.46 ± 0.167
LysLeu: 1.668 ± 0.334
LysMet: 0.518 ± 0.139
LysAsn: 0.345 ± 0.138
LysPro: 2.129 ± 0.376
LysGln: 0.978 ± 0.233
LysArg: 1.899 ± 0.372
LysSer: 1.151 ± 0.294
LysThr: 1.668 ± 0.311
LysVal: 1.726 ± 0.231
LysTrp: 0.46 ± 0.13
LysTyr: 0.69 ± 0.194
LysXaa: 0.0 ± 0.0
Leu
LeuAla: 7.767 ± 0.684
LeuCys: 0.748 ± 0.223
LeuAsp: 4.2 ± 0.509
LeuGlu: 3.797 ± 0.413
LeuPhe: 2.416 ± 0.349
LeuGly: 5.408 ± 0.628
LeuHis: 1.611 ± 0.319
LeuIle: 2.359 ± 0.367
LeuLys: 1.668 ± 0.376
LeuLeu: 4.948 ± 0.515
LeuMet: 1.093 ± 0.216
LeuAsn: 2.014 ± 0.403
LeuPro: 4.602 ± 0.629
LeuGln: 1.899 ± 0.397
LeuArg: 5.12 ± 0.549
LeuSer: 3.74 ± 0.405
LeuThr: 4.545 ± 0.554
LeuVal: 4.2 ± 0.51
LeuTrp: 1.208 ± 0.279
LeuTyr: 1.438 ± 0.253
LeuXaa: 0.0 ± 0.0
Met
MetAla: 2.186 ± 0.411
MetCys: 0.345 ± 0.136
MetAsp: 1.208 ± 0.275
MetGlu: 0.518 ± 0.154
MetPhe: 0.978 ± 0.239
MetGly: 1.611 ± 0.273
MetHis: 0.345 ± 0.129
MetIle: 1.208 ± 0.262
MetLys: 1.323 ± 0.29
MetLeu: 1.381 ± 0.309
MetMet: 0.575 ± 0.171
MetAsn: 0.863 ± 0.222
MetPro: 1.668 ± 0.261
MetGln: 0.403 ± 0.151
MetArg: 1.496 ± 0.3
MetSer: 1.726 ± 0.32
MetThr: 2.474 ± 0.342
MetVal: 1.208 ± 0.201
MetTrp: 0.69 ± 0.213
MetTyr: 0.23 ± 0.102
MetXaa: 0.0 ± 0.0
Asn
AsnAla: 4.027 ± 0.5
AsnCys: 0.288 ± 0.132
AsnAsp: 2.186 ± 0.373
AsnGlu: 1.323 ± 0.305
AsnPhe: 0.46 ± 0.176
AsnGly: 3.682 ± 0.521
AsnHis: 0.69 ± 0.174
AsnIle: 1.036 ± 0.315
AsnLys: 0.518 ± 0.191
AsnLeu: 2.014 ± 0.573
AsnMet: 0.69 ± 0.186
AsnAsn: 0.633 ± 0.242
AsnPro: 3.74 ± 0.434
AsnGln: 0.748 ± 0.223
AsnArg: 2.474 ± 0.436
AsnSer: 1.151 ± 0.242
AsnThr: 2.186 ± 0.507
AsnVal: 2.589 ± 0.391
AsnTrp: 0.403 ± 0.162
AsnTyr: 0.403 ± 0.154
AsnXaa: 0.0 ± 0.0
Pro
ProAla: 9.205 ± 1.061
ProCys: 0.575 ± 0.193
ProAsp: 6.616 ± 0.583
ProGlu: 5.523 ± 0.891
ProPhe: 1.783 ± 0.289
ProGly: 6.213 ± 0.664
ProHis: 1.956 ± 0.484
ProIle: 2.704 ± 0.486
ProLys: 2.301 ± 0.347
ProLeu: 3.624 ± 0.487
ProMet: 1.151 ± 0.245
ProAsn: 2.646 ± 0.373
ProPro: 4.545 ± 0.752
ProGln: 1.899 ± 0.307
ProArg: 4.257 ± 0.518
ProSer: 2.301 ± 0.345
ProThr: 4.66 ± 0.576
ProVal: 5.293 ± 0.571
ProTrp: 1.036 ± 0.227
ProTyr: 1.323 ± 0.283
ProXaa: 0.0 ± 0.0
Gln
GlnAla: 3.74 ± 0.461
GlnCys: 0.115 ± 0.075
GlnAsp: 0.633 ± 0.174
GlnGlu: 0.518 ± 0.156
GlnPhe: 1.266 ± 0.328
GlnGly: 1.726 ± 0.362
GlnHis: 0.633 ± 0.158
GlnIle: 1.323 ± 0.273
GlnLys: 0.575 ± 0.187
GlnLeu: 2.416 ± 0.431
GlnMet: 0.805 ± 0.197
GlnAsn: 0.748 ± 0.192
GlnPro: 2.301 ± 0.352
GlnGln: 1.381 ± 0.3
GlnArg: 2.761 ± 0.386
GlnSer: 1.553 ± 0.282
GlnThr: 2.186 ± 0.394
GlnVal: 1.208 ± 0.275
GlnTrp: 0.978 ± 0.234
GlnTyr: 0.633 ± 0.201
GlnXaa: 0.0 ± 0.0
Arg
ArgAla: 9.953 ± 0.782
ArgCys: 1.036 ± 0.332
ArgAsp: 4.718 ± 0.574
ArgGlu: 3.74 ± 0.387
ArgPhe: 2.014 ± 0.387
ArgGly: 6.674 ± 0.61
ArgHis: 2.301 ± 0.529
ArgIle: 3.394 ± 0.497
ArgLys: 2.531 ± 0.383
ArgLeu: 6.098 ± 0.638
ArgMet: 1.841 ± 0.382
ArgAsn: 2.359 ± 0.342
ArgPro: 5.235 ± 0.635
ArgGln: 1.783 ± 0.327
ArgArg: 7.939 ± 0.961
ArgSer: 2.992 ± 0.378
ArgThr: 5.58 ± 0.603
ArgVal: 5.235 ± 0.657
ArgTrp: 1.899 ± 0.357
ArgTyr: 1.553 ± 0.291
ArgXaa: 0.0 ± 0.0
Ser
SerAla: 6.156 ± 0.672
SerCys: 0.403 ± 0.144
SerAsp: 2.416 ± 0.356
SerGlu: 1.783 ± 0.379
SerPhe: 1.611 ± 0.37
SerGly: 5.235 ± 0.552
SerHis: 0.69 ± 0.225
SerIle: 1.726 ± 0.317
SerLys: 1.381 ± 0.289
SerLeu: 2.474 ± 0.45
SerMet: 1.323 ± 0.253
SerAsn: 0.748 ± 0.185
SerPro: 1.956 ± 0.383
SerGln: 1.266 ± 0.337
SerArg: 2.877 ± 0.468
SerSer: 2.129 ± 0.488
SerThr: 2.877 ± 0.417
SerVal: 2.589 ± 0.393
SerTrp: 0.633 ± 0.174
SerTyr: 0.805 ± 0.167
SerXaa: 0.0 ± 0.0
Thr
ThrAla: 8.917 ± 0.689
ThrCys: 1.266 ± 0.344
ThrAsp: 4.545 ± 0.463
ThrGlu: 2.761 ± 0.422
ThrPhe: 1.899 ± 0.304
ThrGly: 6.961 ± 0.598
ThrHis: 0.805 ± 0.222
ThrIle: 2.934 ± 0.449
ThrLys: 1.726 ± 0.352
ThrLeu: 4.718 ± 0.6
ThrMet: 1.151 ± 0.233
ThrAsn: 2.071 ± 0.325
ThrPro: 6.213 ± 0.635
ThrGln: 1.956 ± 0.34
ThrArg: 4.833 ± 0.567
ThrSer: 2.589 ± 0.444
ThrThr: 5.811 ± 0.771
ThrVal: 6.443 ± 0.621
ThrTrp: 1.323 ± 0.232
ThrTyr: 1.783 ± 0.3
ThrXaa: 0.0 ± 0.0
Val
ValAla: 10.183 ± 0.881
ValCys: 0.69 ± 0.202
ValAsp: 6.098 ± 0.476
ValGlu: 3.797 ± 0.381
ValPhe: 1.841 ± 0.376
ValGly: 5.12 ± 0.619
ValHis: 2.014 ± 0.36
ValIle: 3.509 ± 0.464
ValLys: 2.244 ± 0.494
ValLeu: 4.257 ± 0.519
ValMet: 1.899 ± 0.339
ValAsn: 1.956 ± 0.359
ValPro: 4.085 ± 0.432
ValGln: 1.553 ± 0.331
ValArg: 5.58 ± 0.661
ValSer: 4.142 ± 0.44
ValThr: 6.271 ± 0.698
ValVal: 5.178 ± 0.638
ValTrp: 1.323 ± 0.286
ValTyr: 1.783 ± 0.317
ValXaa: 0.0 ± 0.0
Trp
TrpAla: 1.956 ± 0.268
TrpCys: 0.518 ± 0.186
TrpAsp: 1.438 ± 0.228
TrpGlu: 0.978 ± 0.279
TrpPhe: 0.345 ± 0.151
TrpGly: 0.978 ± 0.207
TrpHis: 0.69 ± 0.243
TrpIle: 0.748 ± 0.198
TrpLys: 0.403 ± 0.148
TrpLeu: 1.496 ± 0.302
TrpMet: 0.518 ± 0.185
TrpAsn: 0.345 ± 0.136
TrpPro: 1.266 ± 0.307
TrpGln: 0.748 ± 0.199
TrpArg: 1.783 ± 0.279
TrpSer: 0.805 ± 0.214
TrpThr: 1.496 ± 0.271
TrpVal: 0.805 ± 0.198
TrpTrp: 0.748 ± 0.212
TrpTyr: 0.288 ± 0.11
TrpXaa: 0.0 ± 0.0
Tyr
TyrAla: 3.279 ± 0.457
TyrCys: 0.288 ± 0.14
TyrAsp: 1.381 ± 0.32
TyrGlu: 0.805 ± 0.237
TyrPhe: 0.518 ± 0.166
TyrGly: 2.704 ± 0.42
TyrHis: 0.403 ± 0.143
TyrIle: 0.69 ± 0.2
TyrLys: 0.345 ± 0.125
TyrLeu: 1.266 ± 0.204
TyrMet: 0.173 ± 0.116
TyrAsn: 0.69 ± 0.187
TyrPro: 1.496 ± 0.309
TyrGln: 0.748 ± 0.239
TyrArg: 2.244 ± 0.418
TyrSer: 0.805 ± 0.182
TyrThr: 2.359 ± 0.301
TyrVal: 1.668 ± 0.341
TyrTrp: 0.288 ± 0.144
TyrTyr: 0.575 ± 0.174
TyrXaa: 0.0 ± 0.0
Xaa
XaaAla: 0.0 ± 0.0
XaaCys: 0.0 ± 0.0
XaaAsp: 0.0 ± 0.0
XaaGlu: 0.0 ± 0.0
XaaPhe: 0.0 ± 0.0
XaaGly: 0.0 ± 0.0
XaaHis: 0.0 ± 0.0
XaaIle: 0.0 ± 0.0
XaaLys: 0.0 ± 0.0
XaaLeu: 0.0 ± 0.0
XaaMet: 0.0 ± 0.0
XaaAsn: 0.0 ± 0.0
XaaPro: 0.0 ± 0.0
XaaGln: 0.0 ± 0.0
XaaArg: 0.0 ± 0.0
XaaSer: 0.0 ± 0.0
XaaThr: 0.0 ± 0.0
XaaVal: 0.0 ± 0.0
XaaTrp: 0.0 ± 0.0
XaaTyr: 0.0 ± 0.0
XaaXaa: 0.0 ± 0.0
Statistics based on 86 proteins (17383 amino acids)

Note: The error has been estimated by bootstrapping (×100) at the protein level.
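To illustrate what bootstrapping at the protein level could look like, here is a minimal Python sketch. It assumes that "×100" means 100 resampling rounds, that whole proteins are drawn with replacement, and that the reported error is the standard deviation of the resampled frequencies; none of these details is confirmed by the page, and the function names are hypothetical.

```python
import random
import statistics


def dipeptide_freq(proteins, dipeptide):
    """Frequency of one dipeptide in per mille, using overlapping windows."""
    hits = total = 0
    for seq in proteins:
        for i in range(len(seq) - 1):
            total += 1
            hits += seq[i:i + 2] == dipeptide
    return 1000.0 * hits / total if total else 0.0


def bootstrap_error(proteins, dipeptide, n_rounds=100, seed=0):
    """Spread of the frequency over resampled proteomes (assumed: standard deviation)."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    estimates = []
    for _ in range(n_rounds):
        # Resample whole proteins with replacement, keeping the proteome size.
        sample = rng.choices(proteins, k=len(proteins))
        estimates.append(dipeptide_freq(sample, dipeptide))
    return statistics.stdev(estimates)


# Toy sequences in one-letter code (not real Gancho proteins).
toy_proteome = ["AAAC", "ACD", "GGAA", "MKAA"]
print(dipeptide_freq(toy_proteome, "AA"), bootstrap_error(toy_proteome, "AA"))
```

Resampling proteins rather than individual residues keeps each sequence intact, so the error reflects variation between proteins in the proteome.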

The dipeptide statistics above (along with other statistics for this proteome) can be downloaded from this CSV file
See this proteome in: uniprot_link
Proteome-pI is available under the Creative Commons Attribution-NoDerivs license; for more details see here

Reference: Kozlowski LP. Proteome-pI 2.0: Proteome Isoelectric Point Database Update. Nucleic Acids Res. 2021, doi: 10.1093/nar/gkab944
Contact: Lukasz P. Kozlowski