Amino acid dipeptide frequency for Mycobacterium phage Byougenkin

Apart from single amino acid frequencies, one can also calculate the so-called amino acid dipeptide frequency, i.e. how often each ordered pair of adjacent amino acids occurs.
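As a minimal sketch of how such a frequency can be computed, the snippet below counts overlapping residue pairs in a list of one-letter protein sequences and reports them in per mille. The function name, the toy sequences, and the choice to normalise by the total number of pairs are assumptions made for illustration; the exact counting and normalisation used by the database are not described on this page.

```python
from collections import Counter


def dipeptide_frequencies(proteins):
    """Return {dipeptide: frequency in per mille} for a list of sequences.

    Assumption: overlapping pairs are counted within each protein and the
    counts are divided by the total number of pairs found.
    """
    counts = Counter()
    for seq in proteins:
        # Pairs at positions (0,1), (1,2), ...; a pair never spans two proteins.
        counts.update(seq[i:i + 2] for i in range(len(seq) - 1))
    total = sum(counts.values())
    if total == 0:
        return {}
    return {pair: 1000.0 * n / total for pair, n in counts.items()}


# Made-up toy sequences, not taken from the proteome.
toy = ["MAAG", "AAK"]
freqs = dipeptide_frequencies(toy)
for pair in sorted(freqs):
    print(pair, round(freqs[pair], 1))
```

With real data, the sequences would come from the proteome's FASTA file rather than a hand-written list.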

There are 400 possible dipeptides (441 if Xaa is counted as an additional, 21st amino acid representing all non-standard or unknown amino acids). Thus, if dipeptides occurred randomly in proteins, each one would be present at a frequency of 0.25% (2.5‰). As this is not the case in nature, for better visibility dipeptides that occur more often than expected are marked in red in the table, and those that are underrepresented are marked in blue.
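The expected value under this uniform assumption, and the comparison against it, can be sketched as follows. The function names are hypothetical, and the 20-letter baseline and the simple greater-than / less-than rule are assumptions; the page does not state the exact threshold used for the colouring.

```python
def expected_uniform_permille(alphabet_size=20):
    """Expected frequency (per mille) if every dipeptide were equally likely."""
    return 1000.0 / alphabet_size ** 2


def classify(observed_permille, alphabet_size=20):
    """Label a dipeptide relative to the uniform expectation (assumed rule)."""
    expected = expected_uniform_permille(alphabet_size)
    if observed_permille > expected:
        return "over"    # would be marked in red
    if observed_permille < expected:
        return "under"   # would be marked in blue
    return "as expected"


# Two values taken from the table: AlaAla (14.0) and CysCys (0.0).
print(expected_uniform_permille(20))
print(classify(14.0), classify(0.0))
```

Passing `alphabet_size=21` gives the baseline for the 441-dipeptide case that includes Xaa.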

All values are given in per mille (‰); multiply them by 10⁻³ to obtain fractions.
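As a small worked example of the unit conversion, the sketch below turns a per mille value into a fraction and into a percentage. The helper names are hypothetical; the AlaAla value of 14.0‰ is taken from the table and corresponds to a fraction of about 0.014, i.e. 1.4%.

```python
def permille_to_fraction(value_permille):
    """Convert per mille to a plain fraction (multiply by 10^-3)."""
    return value_permille * 1e-3


def permille_to_percent(value_permille):
    """Convert per mille to percent (divide by 10)."""
    return value_permille / 10.0


ala_ala = 14.0  # AlaAla frequency in per mille, from the table
print(permille_to_fraction(ala_ala), permille_to_percent(ala_ala))
```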

For more information, see the sequence space article on Wikipedia.

Ala Cys Asp Glu Phe Gly His Ile Lys Leu Met Asn Pro Gln Arg Ser Thr Val Trp Tyr Xaa
Ala
AlaAla: 14.0 ± 1.638
AlaCys: 1.205 ± 0.274
AlaAsp: 7.516 ± 0.605
AlaGlu: 7.057 ± 0.787
AlaPhe: 2.41 ± 0.378
AlaGly: 9.467 ± 1.283
AlaHis: 2.639 ± 0.418
AlaIle: 4.246 ± 0.596
AlaLys: 4.303 ± 0.492
AlaLeu: 7.172 ± 0.849
AlaMet: 2.295 ± 0.372
AlaAsn: 2.467 ± 0.411
AlaPro: 5.508 ± 0.573
AlaGln: 3.5 ± 0.523
AlaArg: 7.459 ± 0.753
AlaSer: 4.82 ± 0.608
AlaThr: 6.828 ± 0.595
AlaVal: 6.598 ± 0.549
AlaTrp: 2.811 ± 0.383
AlaTyr: 2.41 ± 0.336
AlaXaa: 0.0 ± 0.0
Cys
CysAla: 1.148 ± 0.357
CysCys: 0.0 ± 0.0
CysAsp: 1.434 ± 0.414
CysGlu: 0.689 ± 0.169
CysPhe: 0.172 ± 0.111
CysGly: 1.492 ± 0.369
CysHis: 0.172 ± 0.099
CysIle: 0.172 ± 0.099
CysLys: 0.516 ± 0.198
CysLeu: 1.033 ± 0.262
CysMet: 0.23 ± 0.099
CysAsn: 0.516 ± 0.188
CysPro: 1.262 ± 0.343
CysGln: 0.287 ± 0.146
CysArg: 0.861 ± 0.265
CysSer: 0.689 ± 0.262
CysThr: 0.861 ± 0.253
CysVal: 0.574 ± 0.161
CysTrp: 0.23 ± 0.12
CysTyr: 0.115 ± 0.072
CysXaa: 0.0 ± 0.0
Asp
AspAla: 6.828 ± 0.71
AspCys: 0.861 ± 0.222
AspAsp: 4.246 ± 0.539
AspGlu: 3.27 ± 0.518
AspPhe: 1.893 ± 0.291
AspGly: 6.713 ± 0.622
AspHis: 1.664 ± 0.346
AspIle: 2.41 ± 0.412
AspLys: 1.721 ± 0.262
AspLeu: 5.795 ± 0.541
AspMet: 1.205 ± 0.333
AspAsn: 1.664 ± 0.329
AspPro: 5.393 ± 0.623
AspGln: 2.41 ± 0.388
AspArg: 5.279 ± 0.719
AspSer: 3.5 ± 0.564
AspThr: 4.361 ± 0.561
AspVal: 4.303 ± 0.62
AspTrp: 1.664 ± 0.336
AspTyr: 1.779 ± 0.35
AspXaa: 0.0 ± 0.0
Glu
GluAla: 5.91 ± 0.725
GluCys: 0.918 ± 0.271
GluAsp: 3.098 ± 0.363
GluGlu: 3.041 ± 0.676
GluPhe: 2.352 ± 0.394
GluGly: 3.328 ± 0.472
GluHis: 1.836 ± 0.45
GluIle: 1.951 ± 0.312
GluLys: 2.123 ± 0.32
GluLeu: 5.106 ± 0.698
GluMet: 1.664 ± 0.322
GluAsn: 2.18 ± 0.334
GluPro: 2.811 ± 0.443
GluGln: 2.984 ± 0.454
GluArg: 4.59 ± 0.614
GluSer: 3.041 ± 0.436
GluThr: 4.361 ± 0.693
GluVal: 4.303 ± 0.539
GluTrp: 1.262 ± 0.27
GluTyr: 2.066 ± 0.384
GluXaa: 0.0 ± 0.0
Phe
PheAla: 2.984 ± 0.515
PheCys: 0.402 ± 0.156
PheAsp: 2.295 ± 0.441
PheGlu: 1.721 ± 0.342
PhePhe: 1.033 ± 0.286
PheGly: 2.869 ± 0.707
PheHis: 0.459 ± 0.168
PheIle: 1.377 ± 0.343
PheLys: 0.861 ± 0.28
PheLeu: 2.008 ± 0.308
PheMet: 0.918 ± 0.267
PheAsn: 1.32 ± 0.372
PhePro: 1.434 ± 0.272
PheGln: 0.975 ± 0.287
PheArg: 1.664 ± 0.282
PheSer: 1.664 ± 0.305
PheThr: 2.18 ± 0.356
PheVal: 2.066 ± 0.233
PheTrp: 0.459 ± 0.138
PheTyr: 0.975 ± 0.258
PheXaa: 0.0 ± 0.0
Gly
GlyAla: 8.606 ± 1.348
GlyCys: 0.918 ± 0.263
GlyAsp: 6.483 ± 0.653
GlyGlu: 4.074 ± 0.622
GlyPhe: 2.639 ± 0.436
GlyGly: 9.18 ± 1.585
GlyHis: 2.123 ± 0.329
GlyIle: 4.188 ± 0.611
GlyLys: 2.582 ± 0.381
GlyLeu: 6.197 ± 0.596
GlyMet: 2.467 ± 0.419
GlyAsn: 2.984 ± 0.441
GlyPro: 4.82 ± 0.526
GlyGln: 2.123 ± 0.576
GlyArg: 5.106 ± 0.669
GlySer: 5.795 ± 0.837
GlyThr: 6.77 ± 0.897
GlyVal: 5.68 ± 0.596
GlyTrp: 2.467 ± 0.402
GlyTyr: 2.123 ± 0.327
GlyXaa: 0.0 ± 0.0
His
HisAla: 2.066 ± 0.455
HisCys: 0.344 ± 0.147
HisAsp: 1.32 ± 0.261
HisGlu: 1.549 ± 0.297
HisPhe: 0.516 ± 0.155
HisGly: 1.779 ± 0.297
HisHis: 1.148 ± 0.345
HisIle: 1.549 ± 0.319
HisLys: 0.861 ± 0.242
HisLeu: 1.664 ± 0.336
HisMet: 0.689 ± 0.202
HisAsn: 0.918 ± 0.212
HisPro: 1.779 ± 0.372
HisGln: 0.861 ± 0.263
HisArg: 2.066 ± 0.423
HisSer: 0.918 ± 0.187
HisThr: 1.664 ± 0.399
HisVal: 1.434 ± 0.389
HisTrp: 0.402 ± 0.148
HisTyr: 0.803 ± 0.199
HisXaa: 0.0 ± 0.0
Ile
IleAla: 5.451 ± 0.577
IleCys: 0.574 ± 0.18
IleAsp: 3.844 ± 0.493
IleGlu: 3.729 ± 0.393
IlePhe: 0.746 ± 0.277
IleGly: 3.844 ± 0.557
IleHis: 1.492 ± 0.297
IleIle: 1.492 ± 0.273
IleLys: 1.033 ± 0.267
IleLeu: 2.18 ± 0.401
IleMet: 0.402 ± 0.16
IleAsn: 1.779 ± 0.281
IlePro: 2.582 ± 0.277
IleGln: 1.492 ± 0.254
IleArg: 2.639 ± 0.408
IleSer: 1.779 ± 0.487
IleThr: 3.5 ± 0.367
IleVal: 2.811 ± 0.328
IleTrp: 1.033 ± 0.223
IleTyr: 0.803 ± 0.227
IleXaa: 0.0 ± 0.0
Lys
LysAla: 3.729 ± 0.495
LysCys: 0.516 ± 0.158
LysAsp: 1.721 ± 0.307
LysGlu: 1.549 ± 0.302
LysPhe: 1.377 ± 0.245
LysGly: 2.754 ± 0.376
LysHis: 1.434 ± 0.334
LysIle: 0.975 ± 0.223
LysLys: 1.262 ± 0.342
LysLeu: 2.639 ± 0.438
LysMet: 0.689 ± 0.197
LysAsn: 1.09 ± 0.335
LysPro: 2.238 ± 0.338
LysGln: 1.434 ± 0.252
LysArg: 2.123 ± 0.346
LysSer: 2.066 ± 0.35
LysThr: 2.123 ± 0.325
LysVal: 2.754 ± 0.443
LysTrp: 0.459 ± 0.149
LysTyr: 1.09 ± 0.289
LysXaa: 0.0 ± 0.0
Leu
LeuAla: 7.229 ± 0.709
LeuCys: 0.803 ± 0.243
LeuAsp: 4.59 ± 0.686
LeuGlu: 3.729 ± 0.507
LeuPhe: 2.639 ± 0.312
LeuGly: 4.82 ± 0.507
LeuHis: 1.148 ± 0.295
LeuIle: 3.385 ± 0.47
LeuLys: 2.066 ± 0.327
LeuLeu: 4.647 ± 0.509
LeuMet: 1.549 ± 0.292
LeuAsn: 2.352 ± 0.375
LeuPro: 5.565 ± 0.667
LeuGln: 3.098 ± 0.461
LeuArg: 5.451 ± 0.63
LeuSer: 5.279 ± 0.6
LeuThr: 6.541 ± 0.591
LeuVal: 4.82 ± 0.552
LeuTrp: 1.262 ± 0.308
LeuTyr: 2.123 ± 0.397
LeuXaa: 0.0 ± 0.0
Met
MetAla: 2.41 ± 0.361
MetCys: 0.172 ± 0.112
MetAsp: 1.205 ± 0.227
MetGlu: 1.09 ± 0.21
MetPhe: 0.746 ± 0.215
MetGly: 1.951 ± 0.309
MetHis: 0.115 ± 0.082
MetIle: 0.861 ± 0.269
MetLys: 0.746 ± 0.224
MetLeu: 2.123 ± 0.311
MetMet: 0.516 ± 0.221
MetAsn: 1.09 ± 0.242
MetPro: 1.148 ± 0.238
MetGln: 0.23 ± 0.098
MetArg: 1.434 ± 0.291
MetSer: 2.639 ± 0.413
MetThr: 2.18 ± 0.375
MetVal: 1.377 ± 0.305
MetTrp: 0.402 ± 0.144
MetTyr: 0.402 ± 0.165
MetXaa: 0.0 ± 0.0
Asn
AsnAla: 3.557 ± 0.455
AsnCys: 0.115 ± 0.083
AsnAsp: 2.18 ± 0.421
AsnGlu: 1.549 ± 0.316
AsnPhe: 0.918 ± 0.307
AsnGly: 3.844 ± 0.45
AsnHis: 0.975 ± 0.211
AsnIle: 1.549 ± 0.436
AsnLys: 0.918 ± 0.193
AsnLeu: 2.18 ± 0.378
AsnMet: 0.516 ± 0.153
AsnAsn: 1.779 ± 0.4
AsnPro: 2.697 ± 0.355
AsnGln: 1.205 ± 0.373
AsnArg: 1.664 ± 0.36
AsnSer: 1.32 ± 0.253
AsnThr: 2.639 ± 0.344
AsnVal: 1.836 ± 0.39
AsnTrp: 0.918 ± 0.202
AsnTyr: 0.918 ± 0.2
AsnXaa: 0.0 ± 0.0
Pro
ProAla: 5.738 ± 0.64
ProCys: 0.631 ± 0.168
ProAsp: 4.762 ± 0.631
ProGlu: 3.959 ± 0.455
ProPhe: 1.664 ± 0.309
ProGly: 6.885 ± 0.762
ProHis: 1.664 ± 0.343
ProIle: 2.18 ± 0.333
ProLys: 2.754 ± 0.485
ProLeu: 4.188 ± 0.626
ProMet: 1.549 ± 0.303
ProAsn: 2.066 ± 0.319
ProPro: 3.959 ± 0.547
ProGln: 2.238 ± 0.344
ProArg: 3.557 ± 0.597
ProSer: 3.385 ± 0.462
ProThr: 3.5 ± 0.457
ProVal: 4.82 ± 0.546
ProTrp: 0.918 ± 0.248
ProTyr: 1.492 ± 0.266
ProXaa: 0.0 ± 0.0
Gln
GlnAla: 4.992 ± 0.664
GlnCys: 0.459 ± 0.178
GlnAsp: 1.32 ± 0.281
GlnGlu: 1.664 ± 0.325
GlnPhe: 1.205 ± 0.23
GlnGly: 2.639 ± 0.516
GlnHis: 0.803 ± 0.239
GlnIle: 1.721 ± 0.328
GlnLys: 1.262 ± 0.215
GlnLeu: 3.041 ± 0.484
GlnMet: 0.689 ± 0.19
GlnAsn: 0.918 ± 0.268
GlnPro: 2.525 ± 0.411
GlnGln: 1.148 ± 0.284
GlnArg: 2.467 ± 0.427
GlnSer: 2.008 ± 0.372
GlnThr: 1.664 ± 0.335
GlnVal: 2.18 ± 0.411
GlnTrp: 0.574 ± 0.183
GlnTyr: 0.861 ± 0.273
GlnXaa: 0.0 ± 0.0
Arg
ArgAla: 5.91 ± 0.585
ArgCys: 1.377 ± 0.337
ArgAsp: 4.82 ± 0.636
ArgGlu: 4.82 ± 0.705
ArgPhe: 2.238 ± 0.422
ArgGly: 4.475 ± 0.503
ArgHis: 1.377 ± 0.336
ArgIle: 3.787 ± 0.53
ArgLys: 2.238 ± 0.409
ArgLeu: 4.82 ± 0.675
ArgMet: 2.467 ± 0.419
ArgAsn: 2.352 ± 0.383
ArgPro: 3.213 ± 0.401
ArgGln: 1.836 ± 0.326
ArgArg: 5.451 ± 0.928
ArgSer: 3.5 ± 0.349
ArgThr: 3.443 ± 0.51
ArgVal: 5.68 ± 0.703
ArgTrp: 1.836 ± 0.313
ArgTyr: 1.836 ± 0.331
ArgXaa: 0.0 ± 0.0
Ser
SerAla: 6.082 ± 0.856
SerCys: 0.459 ± 0.166
SerAsp: 3.959 ± 0.483
SerGlu: 3.557 ± 0.48
SerPhe: 1.951 ± 0.371
SerGly: 5.91 ± 0.76
SerHis: 1.205 ± 0.234
SerIle: 2.525 ± 0.444
SerLys: 2.869 ± 0.434
SerLeu: 3.844 ± 0.485
SerMet: 1.549 ± 0.301
SerAsn: 1.951 ± 0.333
SerPro: 3.213 ± 0.327
SerGln: 1.664 ± 0.278
SerArg: 3.156 ± 0.441
SerSer: 3.385 ± 0.606
SerThr: 2.926 ± 0.43
SerVal: 4.131 ± 0.515
SerTrp: 1.721 ± 0.303
SerTyr: 1.262 ± 0.188
SerXaa: 0.0 ± 0.0
Thr
ThrAla: 6.139 ± 0.56
ThrCys: 1.033 ± 0.284
ThrAsp: 4.246 ± 0.596
ThrGlu: 3.787 ± 0.479
ThrPhe: 1.779 ± 0.29
ThrGly: 6.656 ± 0.656
ThrHis: 1.893 ± 0.407
ThrIle: 3.557 ± 0.504
ThrLys: 1.951 ± 0.366
ThrLeu: 4.934 ± 0.598
ThrMet: 1.262 ± 0.294
ThrAsn: 2.582 ± 0.472
ThrPro: 5.336 ± 0.687
ThrGln: 2.238 ± 0.404
ThrArg: 3.959 ± 0.488
ThrSer: 4.246 ± 0.501
ThrThr: 5.279 ± 0.757
ThrVal: 5.565 ± 0.685
ThrTrp: 1.033 ± 0.273
ThrTyr: 1.721 ± 0.306
ThrXaa: 0.0 ± 0.0
Val
ValAla: 7.287 ± 0.659
ValCys: 1.205 ± 0.298
ValAsp: 4.762 ± 0.597
ValGlu: 5.221 ± 0.593
ValPhe: 2.18 ± 0.395
ValGly: 5.451 ± 0.62
ValHis: 1.32 ± 0.288
ValIle: 2.697 ± 0.427
ValLys: 2.295 ± 0.361
ValLeu: 5.393 ± 0.759
ValMet: 1.32 ± 0.216
ValAsn: 2.008 ± 0.334
ValPro: 4.131 ± 0.409
ValGln: 2.926 ± 0.341
ValArg: 3.959 ± 0.555
ValSer: 5.279 ± 0.647
ValThr: 4.705 ± 0.596
ValVal: 6.024 ± 0.718
ValTrp: 1.664 ± 0.357
ValTyr: 1.377 ± 0.27
ValXaa: 0.0 ± 0.0
Trp
TrpAla: 2.123 ± 0.336
TrpCys: 0.172 ± 0.105
TrpAsp: 1.434 ± 0.305
TrpGlu: 1.205 ± 0.37
TrpPhe: 0.631 ± 0.177
TrpGly: 1.148 ± 0.262
TrpHis: 0.574 ± 0.17
TrpIle: 1.262 ± 0.247
TrpLys: 0.918 ± 0.208
TrpLeu: 2.008 ± 0.365
TrpMet: 0.631 ± 0.205
TrpAsn: 0.689 ± 0.25
TrpPro: 0.975 ± 0.249
TrpGln: 0.918 ± 0.264
TrpArg: 2.18 ± 0.464
TrpSer: 1.148 ± 0.244
TrpThr: 1.607 ± 0.309
TrpVal: 1.664 ± 0.424
TrpTrp: 1.09 ± 0.208
TrpTyr: 0.402 ± 0.142
TrpXaa: 0.0 ± 0.0
Tyr
TyrAla: 2.525 ± 0.363
TyrCys: 0.287 ± 0.137
TyrAsp: 1.721 ± 0.369
TyrGlu: 1.664 ± 0.333
TyrPhe: 0.574 ± 0.185
TyrGly: 2.123 ± 0.386
TyrHis: 0.287 ± 0.113
TyrIle: 1.32 ± 0.248
TyrLys: 0.746 ± 0.192
TyrLeu: 2.18 ± 0.33
TyrMet: 0.172 ± 0.101
TyrAsn: 0.574 ± 0.149
TyrPro: 1.32 ± 0.234
TyrGln: 0.574 ± 0.199
TyrArg: 2.41 ± 0.374
TyrSer: 0.861 ± 0.215
TyrThr: 2.066 ± 0.395
TyrVal: 2.582 ± 0.342
TyrTrp: 0.574 ± 0.171
TyrTyr: 0.631 ± 0.166
TyrXaa: 0.0 ± 0.0
Xaa
XaaAla: 0.0 ± 0.0
XaaCys: 0.0 ± 0.0
XaaAsp: 0.0 ± 0.0
XaaGlu: 0.0 ± 0.0
XaaPhe: 0.0 ± 0.0
XaaGly: 0.0 ± 0.0
XaaHis: 0.0 ± 0.0
XaaIle: 0.0 ± 0.0
XaaLys: 0.0 ± 0.0
XaaLeu: 0.0 ± 0.0
XaaMet: 0.0 ± 0.0
XaaAsn: 0.0 ± 0.0
XaaPro: 0.0 ± 0.0
XaaGln: 0.0 ± 0.0
XaaArg: 0.0 ± 0.0
XaaSer: 0.0 ± 0.0
XaaThr: 0.0 ± 0.0
XaaVal: 0.0 ± 0.0
XaaTrp: 0.0 ± 0.0
XaaTyr: 0.0 ± 0.0
XaaXaa: 0.0 ± 0.0
Statistics based on 101 proteins (17430 amino acids)

Note: The error has been estimated by bootstrapping (×100) at the protein level.
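A protein-level bootstrap can be sketched as follows: whole proteins are resampled with replacement, the dipeptide frequency is recomputed for each resample, and the spread of those estimates is reported. The function names, the toy sequences, and the use of the standard deviation as the error measure are assumptions for illustration; the page only states that 100 bootstrap rounds were run at the protein level.

```python
import random
import statistics
from collections import Counter


def pair_frequency(proteins, pair):
    """Frequency (per mille) of one dipeptide among all overlapping pairs."""
    counts = Counter()
    for seq in proteins:
        counts.update(seq[i:i + 2] for i in range(len(seq) - 1))
    total = sum(counts.values())
    return 1000.0 * counts[pair] / total if total else 0.0


def bootstrap_error(proteins, pair, n_resamples=100, seed=0):
    """Standard deviation of the frequency over protein-level resamples."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    estimates = []
    for _ in range(n_resamples):
        # Resample whole proteins with replacement, keeping the set size.
        sample = rng.choices(proteins, k=len(proteins))
        estimates.append(pair_frequency(sample, pair))
    return statistics.stdev(estimates)


# Made-up toy sequences, not taken from the proteome.
toy = ["MAAG", "AAK", "GAAL", "MKV"]
err = bootstrap_error(toy, "AA")
print(round(pair_frequency(toy, "AA"), 1), "±", round(err, 1))
```

Resampling proteins rather than individual residues keeps each sequence intact, so pairs within a protein stay together in every resample.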

The dipeptide statistics above (along with other statistics for this proteome) can be downloaded from this CSV file
See this proteome in: uniprot_link
Proteome-pI is available under a Creative Commons Attribution-NoDerivs license; for more details see here

Reference: Kozlowski LP. Proteome-pI 2.0: Proteome Isoelectric Point Database Update. Nucleic Acids Res. 2021, doi: 10.1093/nar/gkab944
Contact: Lukasz P. Kozlowski