Amino acid dipeptide frequency for Mycobacterium phage Gandalph

Apart from single amino acid frequencies, one can also calculate the so-called amino acid dipeptide frequency, that is, how often each ordered pair of adjacent amino acids occurs.
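As a rough illustration of what such a calculation involves, the Python sketch below counts overlapping adjacent residue pairs in a set of protein sequences and reports them in per mille. The function name, the one-letter input format, and the choice to normalise by the total number of pairs are assumptions made for this example; the exact procedure used by Proteome-pI is not described on this page.

```python
from collections import Counter
from typing import Dict, Iterable


def dipeptide_frequencies(proteins: Iterable[str]) -> Dict[str, float]:
    """Return dipeptide frequencies in per mille for a set of protein sequences.

    Assumption: overlapping adjacent pairs are counted within each protein
    (never across protein boundaries) and normalised by the total pair count.
    """
    counts: Counter = Counter()
    for seq in proteins:
        seq = seq.upper()
        # A protein of length n contributes n - 1 overlapping dipeptides.
        counts.update(seq[i:i + 2] for i in range(len(seq) - 1))
    total = sum(counts.values())
    if total == 0:
        return {}
    return {pair: 1000.0 * n / total for pair, n in counts.items()}


# Toy input in one-letter code (hypothetical, not from the Gandalph proteome).
freqs = dipeptide_frequencies(["MAAG", "AAK"])
```

Here the two toy sequences yield five dipeptides in total, two of which are "AA". The table on this page uses three-letter codes, so a real pipeline would also need a one-letter to three-letter mapping.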

There are 400 possible dipeptides (441 if Xaa is counted as an additional 21st amino acid representing all non-standard or unknown amino acids). Thus, if all dipeptides were equally likely in proteins, each would be present at a frequency of 0.25% (2.5‰). As this is not the case in nature, for better visibility all dipeptides that are more frequent than expected are marked in red in the table, and those that are underrepresented are marked in blue.
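The uniform baseline can be worked out in a few lines of Python. This is only a sketch: the helper names are hypothetical, and whether the page's colouring uses the 400-dipeptide or the 441-dipeptide baseline (or some tolerance band around it) is not stated, so the 400-dipeptide value is an assumption here. The three observed values are taken from the table on this page.

```python
def uniform_expectation_per_mille(alphabet_size: int) -> float:
    """Expected frequency (per mille) of each dipeptide if all were equally likely."""
    return 1000.0 / (alphabet_size ** 2)


def classify(observed_per_mille: float, expected_per_mille: float) -> str:
    """Label a dipeptide relative to the uniform baseline (no tolerance band)."""
    if observed_per_mille > expected_per_mille:
        return "over"
    if observed_per_mille < expected_per_mille:
        return "under"
    return "as expected"


expected_400 = uniform_expectation_per_mille(20)  # 20 standard amino acids
expected_441 = uniform_expectation_per_mille(21)  # with Xaa as a 21st symbol

# Observed values (per mille) copied from the table on this page.
observed = {"AlaAla": 14.238, "CysCys": 0.057, "GlyGly": 10.905}
labels = {pair: classify(value, expected_400) for pair, value in observed.items()}
```

With the 20-letter alphabet the baseline is 2.5‰, so AlaAla and GlyGly come out as overrepresented and CysCys as underrepresented.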

All values are given in per mille (‰) and therefore need to be multiplied by 10⁻³ to obtain fractions.
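For readers who prefer fractions or percentages, the conversion is a single multiplication. The helper names below are hypothetical; the input value is the AlaAla entry from the table.

```python
def per_mille_to_fraction(value_per_mille: float) -> float:
    """Convert a per mille value to a plain fraction (multiply by 10^-3)."""
    return value_per_mille * 1e-3


def per_mille_to_percent(value_per_mille: float) -> float:
    """Convert a per mille value to percent (divide by 10)."""
    return value_per_mille / 10.0


ala_ala_fraction = per_mille_to_fraction(14.238)  # roughly 0.0142
ala_ala_percent = per_mille_to_percent(14.238)    # roughly 1.42 %
```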

For more information, see the sequence space article on Wikipedia.

Dipeptide frequency ± error (‰), grouped by first residue; second residues follow the order:
Ala Cys Asp Glu Phe Gly His Ile Lys Leu Met Asn Pro Gln Arg Ser Thr Val Trp Tyr Xaa

Ala
AlaAla: 14.238 ± 1.427
AlaCys: 0.904 ± 0.265
AlaAsp: 7.345 ± 0.549
AlaGlu: 7.176 ± 0.583
AlaPhe: 2.656 ± 0.409
AlaGly: 9.436 ± 1.17
AlaHis: 2.373 ± 0.379
AlaIle: 4.407 ± 0.544
AlaLys: 4.125 ± 0.408
AlaLeu: 8.08 ± 0.782
AlaMet: 2.656 ± 0.43
AlaAsn: 3.277 ± 0.514
AlaPro: 4.633 ± 0.556
AlaGln: 2.995 ± 0.452
AlaArg: 7.91 ± 0.701
AlaSer: 5.029 ± 0.555
AlaThr: 6.441 ± 0.598
AlaVal: 6.441 ± 0.601
AlaTrp: 2.543 ± 0.383
AlaTyr: 2.486 ± 0.409
AlaXaa: 0.0 ± 0.0

Cys
CysAla: 0.791 ± 0.22
CysCys: 0.057 ± 0.06
CysAsp: 1.356 ± 0.309
CysGlu: 0.678 ± 0.232
CysPhe: 0.226 ± 0.114
CysGly: 1.808 ± 0.428
CysHis: 0.283 ± 0.148
CysIle: 0.17 ± 0.094
CysLys: 0.452 ± 0.137
CysLeu: 0.678 ± 0.272
CysMet: 0.226 ± 0.094
CysAsn: 0.339 ± 0.125
CysPro: 1.243 ± 0.293
CysGln: 0.283 ± 0.117
CysArg: 0.961 ± 0.259
CysSer: 0.565 ± 0.234
CysThr: 0.735 ± 0.228
CysVal: 0.791 ± 0.235
CysTrp: 0.226 ± 0.114
CysTyr: 0.283 ± 0.123
CysXaa: 0.0 ± 0.0

Asp
AspAla: 7.006 ± 0.653
AspCys: 1.017 ± 0.232
AspAsp: 4.746 ± 0.641
AspGlu: 3.729 ± 0.412
AspPhe: 1.695 ± 0.269
AspGly: 6.498 ± 0.716
AspHis: 1.074 ± 0.233
AspIle: 1.978 ± 0.348
AspLys: 1.808 ± 0.278
AspLeu: 5.82 ± 0.605
AspMet: 1.187 ± 0.255
AspAsn: 1.752 ± 0.363
AspPro: 4.577 ± 0.568
AspGln: 2.317 ± 0.338
AspArg: 5.481 ± 0.681
AspSer: 3.56 ± 0.537
AspThr: 3.955 ± 0.545
AspVal: 4.859 ± 0.526
AspTrp: 1.187 ± 0.276
AspTyr: 2.204 ± 0.352
AspXaa: 0.0 ± 0.0

Glu
GluAla: 6.498 ± 0.74
GluCys: 1.13 ± 0.299
GluAsp: 3.164 ± 0.404
GluGlu: 2.995 ± 0.494
GluPhe: 2.204 ± 0.309
GluGly: 3.56 ± 0.412
GluHis: 1.639 ± 0.41
GluIle: 2.147 ± 0.355
GluLys: 2.26 ± 0.329
GluLeu: 5.65 ± 0.674
GluMet: 1.752 ± 0.284
GluAsn: 1.695 ± 0.306
GluPro: 2.995 ± 0.486
GluGln: 2.769 ± 0.398
GluArg: 4.746 ± 0.687
GluSer: 3.108 ± 0.538
GluThr: 4.294 ± 0.536
GluVal: 4.633 ± 0.562
GluTrp: 1.639 ± 0.286
GluTyr: 1.639 ± 0.291
GluXaa: 0.0 ± 0.0

Phe
PheAla: 2.825 ± 0.413
PheCys: 0.226 ± 0.117
PheAsp: 2.373 ± 0.475
PheGlu: 1.526 ± 0.268
PhePhe: 0.791 ± 0.24
PheGly: 2.43 ± 0.669
PheHis: 0.509 ± 0.191
PheIle: 1.469 ± 0.302
PheLys: 1.017 ± 0.252
PheLeu: 2.204 ± 0.317
PheMet: 0.735 ± 0.201
PheAsn: 1.13 ± 0.36
PhePro: 1.582 ± 0.294
PheGln: 1.074 ± 0.348
PheArg: 1.469 ± 0.256
PheSer: 1.469 ± 0.26
PheThr: 2.26 ± 0.316
PheVal: 2.204 ± 0.271
PheTrp: 0.396 ± 0.133
PheTyr: 0.848 ± 0.232
PheXaa: 0.0 ± 0.0

Gly
GlyAla: 8.814 ± 1.186
GlyCys: 1.017 ± 0.302
GlyAsp: 5.707 ± 0.439
GlyGlu: 4.69 ± 0.512
GlyPhe: 2.373 ± 0.419
GlyGly: 10.905 ± 2.208
GlyHis: 1.695 ± 0.355
GlyIle: 4.012 ± 0.596
GlyLys: 2.373 ± 0.352
GlyLeu: 5.537 ± 0.54
GlyMet: 2.43 ± 0.497
GlyAsn: 2.769 ± 0.428
GlyPro: 4.238 ± 0.507
GlyGln: 2.26 ± 0.55
GlyArg: 5.142 ± 0.606
GlySer: 6.893 ± 0.943
GlyThr: 6.554 ± 0.84
GlyVal: 5.876 ± 0.599
GlyTrp: 2.147 ± 0.366
GlyTyr: 2.034 ± 0.428
GlyXaa: 0.0 ± 0.0

His
HisAla: 1.865 ± 0.35
HisCys: 0.339 ± 0.177
HisAsp: 1.3 ± 0.26
HisGlu: 1.13 ± 0.29
HisPhe: 0.509 ± 0.153
HisGly: 1.469 ± 0.299
HisHis: 0.791 ± 0.251
HisIle: 1.582 ± 0.307
HisLys: 0.961 ± 0.199
HisLeu: 1.413 ± 0.306
HisMet: 0.396 ± 0.111
HisAsn: 0.961 ± 0.215
HisPro: 1.356 ± 0.27
HisGln: 0.735 ± 0.164
HisArg: 1.978 ± 0.39
HisSer: 1.187 ± 0.282
HisThr: 1.356 ± 0.359
HisVal: 1.469 ± 0.281
HisTrp: 0.509 ± 0.164
HisTyr: 0.791 ± 0.19
HisXaa: 0.0 ± 0.0

Ile
IleAla: 5.029 ± 0.575
IleCys: 0.622 ± 0.242
IleAsp: 4.181 ± 0.454
IleGlu: 3.786 ± 0.472
IlePhe: 0.509 ± 0.137
IleGly: 3.786 ± 0.439
IleHis: 1.469 ± 0.28
IleIle: 1.469 ± 0.304
IleLys: 1.13 ± 0.245
IleLeu: 1.978 ± 0.33
IleMet: 0.509 ± 0.158
IleAsn: 1.808 ± 0.272
IlePro: 3.051 ± 0.349
IleGln: 1.639 ± 0.292
IleArg: 2.373 ± 0.468
IleSer: 2.43 ± 0.425
IleThr: 3.616 ± 0.591
IleVal: 3.108 ± 0.349
IleTrp: 0.961 ± 0.228
IleTyr: 0.961 ± 0.192
IleXaa: 0.0 ± 0.0

Lys
LysAla: 3.786 ± 0.57
LysCys: 0.452 ± 0.153
LysAsp: 1.639 ± 0.304
LysGlu: 1.356 ± 0.276
LysPhe: 1.243 ± 0.217
LysGly: 2.317 ± 0.394
LysHis: 0.904 ± 0.228
LysIle: 0.961 ± 0.201
LysLys: 1.695 ± 0.377
LysLeu: 2.43 ± 0.42
LysMet: 0.791 ± 0.18
LysAsn: 0.735 ± 0.197
LysPro: 2.147 ± 0.352
LysGln: 1.808 ± 0.272
LysArg: 2.769 ± 0.39
LysSer: 2.204 ± 0.27
LysThr: 2.147 ± 0.365
LysVal: 2.599 ± 0.414
LysTrp: 1.187 ± 0.329
LysTyr: 0.678 ± 0.195
LysXaa: 0.0 ± 0.0

Leu
LeuAla: 7.628 ± 0.724
LeuCys: 0.622 ± 0.196
LeuAsp: 4.294 ± 0.538
LeuGlu: 4.294 ± 0.509
LeuPhe: 2.26 ± 0.277
LeuGly: 4.803 ± 0.531
LeuHis: 1.074 ± 0.234
LeuIle: 3.277 ± 0.416
LeuLys: 2.204 ± 0.385
LeuLeu: 5.085 ± 0.532
LeuMet: 1.752 ± 0.279
LeuAsn: 2.486 ± 0.279
LeuPro: 5.085 ± 0.678
LeuGln: 2.712 ± 0.445
LeuArg: 5.142 ± 0.62
LeuSer: 4.859 ± 0.484
LeuThr: 5.311 ± 0.462
LeuVal: 4.633 ± 0.549
LeuTrp: 1.3 ± 0.336
LeuTyr: 2.091 ± 0.353
LeuXaa: 0.0 ± 0.0

Met
MetAla: 2.204 ± 0.322
MetCys: 0.283 ± 0.215
MetAsp: 1.187 ± 0.256
MetGlu: 1.3 ± 0.24
MetPhe: 0.622 ± 0.178
MetGly: 1.639 ± 0.271
MetHis: 0.057 ± 0.074
MetIle: 0.904 ± 0.237
MetLys: 0.961 ± 0.245
MetLeu: 1.808 ± 0.301
MetMet: 0.396 ± 0.2
MetAsn: 0.961 ± 0.244
MetPro: 1.13 ± 0.231
MetGln: 0.452 ± 0.159
MetArg: 1.639 ± 0.291
MetSer: 3.164 ± 0.362
MetThr: 2.317 ± 0.368
MetVal: 1.469 ± 0.333
MetTrp: 0.226 ± 0.09
MetTyr: 0.339 ± 0.119
MetXaa: 0.0 ± 0.0

Asn
AsnAla: 3.277 ± 0.342
AsnCys: 0.113 ± 0.07
AsnAsp: 1.978 ± 0.327
AsnGlu: 1.808 ± 0.365
AsnPhe: 0.735 ± 0.248
AsnGly: 4.351 ± 0.622
AsnHis: 0.791 ± 0.168
AsnIle: 1.582 ± 0.476
AsnLys: 1.074 ± 0.254
AsnLeu: 2.26 ± 0.364
AsnMet: 0.904 ± 0.221
AsnAsn: 1.808 ± 0.412
AsnPro: 2.712 ± 0.318
AsnGln: 1.187 ± 0.359
AsnArg: 2.147 ± 0.382
AsnSer: 1.639 ± 0.323
AsnThr: 1.978 ± 0.344
AsnVal: 1.808 ± 0.37
AsnTrp: 0.509 ± 0.167
AsnTyr: 0.622 ± 0.176
AsnXaa: 0.0 ± 0.0

Pro
ProAla: 5.255 ± 0.597
ProCys: 0.848 ± 0.224
ProAsp: 3.616 ± 0.45
ProGlu: 4.633 ± 0.539
ProPhe: 1.752 ± 0.28
ProGly: 6.498 ± 0.718
ProHis: 1.3 ± 0.292
ProIle: 2.147 ± 0.355
ProLys: 2.091 ± 0.397
ProLeu: 4.52 ± 0.571
ProMet: 1.582 ± 0.371
ProAsn: 1.978 ± 0.286
ProPro: 3.842 ± 0.573
ProGln: 2.091 ± 0.351
ProArg: 3.729 ± 0.487
ProSer: 3.334 ± 0.49
ProThr: 3.842 ± 0.487
ProVal: 4.294 ± 0.47
ProTrp: 1.13 ± 0.244
ProTyr: 1.187 ± 0.302
ProXaa: 0.0 ± 0.0

Gln
GlnAla: 4.012 ± 0.6
GlnCys: 0.339 ± 0.153
GlnAsp: 1.526 ± 0.316
GlnGlu: 1.695 ± 0.311
GlnPhe: 1.187 ± 0.262
GlnGly: 2.486 ± 0.498
GlnHis: 0.904 ± 0.224
GlnIle: 2.147 ± 0.368
GlnLys: 1.582 ± 0.321
GlnLeu: 2.938 ± 0.307
GlnMet: 0.735 ± 0.18
GlnAsn: 0.791 ± 0.21
GlnPro: 2.599 ± 0.399
GlnGln: 1.13 ± 0.305
GlnArg: 2.204 ± 0.319
GlnSer: 2.26 ± 0.334
GlnThr: 1.582 ± 0.274
GlnVal: 2.43 ± 0.341
GlnTrp: 0.565 ± 0.167
GlnTyr: 0.848 ± 0.246
GlnXaa: 0.0 ± 0.0

Arg
ArgAla: 6.837 ± 0.687
ArgCys: 1.243 ± 0.341
ArgAsp: 4.803 ± 0.643
ArgGlu: 4.52 ± 0.618
ArgPhe: 2.147 ± 0.363
ArgGly: 4.464 ± 0.472
ArgHis: 1.639 ± 0.369
ArgIle: 4.181 ± 0.595
ArgLys: 2.712 ± 0.363
ArgLeu: 4.69 ± 0.515
ArgMet: 2.147 ± 0.383
ArgAsn: 2.543 ± 0.396
ArgPro: 4.068 ± 0.479
ArgGln: 1.752 ± 0.405
ArgArg: 5.424 ± 0.878
ArgSer: 4.125 ± 0.447
ArgThr: 3.334 ± 0.486
ArgVal: 5.65 ± 0.666
ArgTrp: 1.865 ± 0.302
ArgTyr: 2.091 ± 0.27
ArgXaa: 0.0 ± 0.0

Ser
SerAla: 5.82 ± 0.818
SerCys: 0.396 ± 0.15
SerAsp: 4.69 ± 0.534
SerGlu: 3.503 ± 0.445
SerPhe: 1.921 ± 0.279
SerGly: 6.441 ± 0.908
SerHis: 1.3 ± 0.234
SerIle: 2.656 ± 0.42
SerLys: 1.808 ± 0.321
SerLeu: 3.39 ± 0.334
SerMet: 1.526 ± 0.266
SerAsn: 2.43 ± 0.429
SerPro: 3.447 ± 0.41
SerGln: 1.808 ± 0.292
SerArg: 3.842 ± 0.435
SerSer: 3.786 ± 0.573
SerThr: 3.221 ± 0.454
SerVal: 5.311 ± 0.551
SerTrp: 1.582 ± 0.26
SerTyr: 1.413 ± 0.24
SerXaa: 0.0 ± 0.0

Thr
ThrAla: 7.006 ± 0.739
ThrCys: 0.678 ± 0.258
ThrAsp: 4.407 ± 0.517
ThrGlu: 3.899 ± 0.387
ThrPhe: 1.752 ± 0.372
ThrGly: 6.102 ± 0.674
ThrHis: 1.695 ± 0.323
ThrIle: 3.503 ± 0.505
ThrLys: 1.695 ± 0.296
ThrLeu: 4.012 ± 0.478
ThrMet: 1.017 ± 0.238
ThrAsn: 2.147 ± 0.379
ThrPro: 4.633 ± 0.658
ThrGln: 2.204 ± 0.342
ThrArg: 4.294 ± 0.517
ThrSer: 4.012 ± 0.483
ThrThr: 4.746 ± 0.649
ThrVal: 5.763 ± 0.519
ThrTrp: 1.3 ± 0.278
ThrTyr: 1.526 ± 0.303
ThrXaa: 0.0 ± 0.0

Val
ValAla: 7.402 ± 0.65
ValCys: 1.13 ± 0.224
ValAsp: 5.255 ± 0.546
ValGlu: 4.972 ± 0.579
ValPhe: 2.373 ± 0.366
ValGly: 5.368 ± 0.652
ValHis: 1.639 ± 0.303
ValIle: 3.447 ± 0.452
ValLys: 2.712 ± 0.468
ValLeu: 5.311 ± 0.664
ValMet: 1.074 ± 0.219
ValAsn: 2.26 ± 0.379
ValPro: 3.899 ± 0.433
ValGln: 2.995 ± 0.353
ValArg: 4.351 ± 0.545
ValSer: 4.633 ± 0.597
ValThr: 5.085 ± 0.586
ValVal: 6.554 ± 0.748
ValTrp: 2.034 ± 0.367
ValTyr: 1.469 ± 0.271
ValXaa: 0.0 ± 0.0

Trp
TrpAla: 2.204 ± 0.353
TrpCys: 0.283 ± 0.122
TrpAsp: 1.413 ± 0.265
TrpGlu: 1.017 ± 0.358
TrpPhe: 0.735 ± 0.242
TrpGly: 1.017 ± 0.251
TrpHis: 0.509 ± 0.155
TrpIle: 1.243 ± 0.269
TrpLys: 0.678 ± 0.163
TrpLeu: 1.526 ± 0.344
TrpMet: 1.017 ± 0.261
TrpAsn: 0.565 ± 0.231
TrpPro: 1.017 ± 0.29
TrpGln: 1.017 ± 0.235
TrpArg: 2.373 ± 0.381
TrpSer: 1.413 ± 0.308
TrpThr: 1.639 ± 0.319
TrpVal: 1.639 ± 0.442
TrpTrp: 1.13 ± 0.231
TrpTyr: 0.622 ± 0.172
TrpXaa: 0.0 ± 0.0

Tyr
TyrAla: 2.656 ± 0.409
TyrCys: 0.452 ± 0.174
TyrAsp: 1.469 ± 0.343
TyrGlu: 1.865 ± 0.322
TyrPhe: 0.848 ± 0.201
TyrGly: 2.091 ± 0.36
TyrHis: 0.452 ± 0.176
TyrIle: 1.13 ± 0.221
TyrLys: 0.509 ± 0.205
TyrLeu: 1.582 ± 0.266
TyrMet: 0.17 ± 0.1
TyrAsn: 0.848 ± 0.205
TyrPro: 1.413 ± 0.266
TyrGln: 0.735 ± 0.175
TyrArg: 2.26 ± 0.323
TyrSer: 0.622 ± 0.174
TyrThr: 1.978 ± 0.396
TyrVal: 2.43 ± 0.297
TyrTrp: 0.622 ± 0.181
TyrTyr: 0.509 ± 0.181
TyrXaa: 0.0 ± 0.0

Xaa
XaaAla: 0.0 ± 0.0
XaaCys: 0.0 ± 0.0
XaaAsp: 0.0 ± 0.0
XaaGlu: 0.0 ± 0.0
XaaPhe: 0.0 ± 0.0
XaaGly: 0.0 ± 0.0
XaaHis: 0.0 ± 0.0
XaaIle: 0.0 ± 0.0
XaaLys: 0.0 ± 0.0
XaaLeu: 0.0 ± 0.0
XaaMet: 0.0 ± 0.0
XaaAsn: 0.0 ± 0.0
XaaPro: 0.0 ± 0.0
XaaGln: 0.0 ± 0.0
XaaArg: 0.0 ± 0.0
XaaSer: 0.0 ± 0.0
XaaThr: 0.0 ± 0.0
XaaVal: 0.0 ± 0.0
XaaTrp: 0.0 ± 0.0
XaaTyr: 0.0 ± 0.0
XaaXaa: 0.0 ± 0.0

Statistics based on 104 proteins (17700 amino acids)
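To work with the listed values programmatically, one option is to parse each "dipeptide: value ± error" entry into a lookup keyed by the two residues. The sketch below is an illustration only; the regular expression and function name are hypothetical, and the pattern is written so that a repeated value in front of the dipeptide name is simply ignored.

```python
import re
from typing import Dict, Tuple

# Two three-letter residue codes, a colon, then "value ± error".
CELL = re.compile(r"([A-Z][a-z]{2})([A-Z][a-z]{2}):\s*([0-9.]+)\s*±\s*([0-9.]+)")


def parse_cells(text: str) -> Dict[Tuple[str, str], Tuple[float, float]]:
    """Map (first, second) residue codes to (frequency, error), both in per mille."""
    table: Dict[Tuple[str, str], Tuple[float, float]] = {}
    for match in CELL.finditer(text):
        first, second, value, error = match.groups()
        table[(first, second)] = (float(value), float(error))
    return table


# Three entries copied from the table on this page.
sample = """AlaAla: 14.238 ± 1.427
AlaCys: 0.904 ± 0.265
CysAla: 0.791 ± 0.22"""
cells = parse_cells(sample)
```

Note that the key is an ordered pair: AlaCys (0.904‰) and CysAla (0.791‰) are distinct entries with different frequencies.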

Note: The error was estimated by bootstrapping (×100) at the protein level.
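A protein-level bootstrap of this kind could look roughly like the Python sketch below: whole proteins are resampled with replacement 100 times, the dipeptide frequency is recomputed for each resample, and the spread of those estimates is reported. The function names, the one-letter input, the fixed seed, and the use of the population standard deviation as the error measure are all assumptions of this example; the page only states that bootstrapping (×100) at the protein level was used.

```python
import random
import statistics
from collections import Counter
from typing import List, Sequence


def pair_counts(seq: str) -> Counter:
    """Count overlapping adjacent residue pairs in one protein."""
    return Counter(seq[i:i + 2] for i in range(len(seq) - 1))


def bootstrap_error(proteins: Sequence[str], pair: str,
                    n_resamples: int = 100, seed: int = 0) -> float:
    """Spread (per mille) of one dipeptide's frequency over protein-level resamples."""
    rng = random.Random(seed)
    per_protein = [pair_counts(p) for p in proteins]
    estimates: List[float] = []
    for _ in range(n_resamples):
        # Resample whole proteins with replacement, keeping the sample size.
        sample = rng.choices(per_protein, k=len(per_protein))
        total = sum(sum(c.values()) for c in sample)
        hits = sum(c[pair] for c in sample)
        estimates.append(1000.0 * hits / total if total else 0.0)
    # Assumption: the error is the standard deviation of the resampled estimates.
    return statistics.pstdev(estimates)


# Toy proteins in one-letter code (hypothetical, not from the Gandalph proteome).
toy_error = bootstrap_error(["MAAG", "AAK", "MKKL"], "AA")
```

Resampling proteins rather than individual residues keeps each protein's internal composition intact, so the error reflects variation between proteins in the proteome.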

The dipeptide statistics above (along with other statistics for this proteome) can be downloaded from this CSV file
See this proteome in: uniprot_link
Proteome-pI is available under the Creative Commons Attribution-NoDerivs license; for more details see here

Reference: Kozlowski LP. Proteome-pI 2.0: Proteome Isoelectric Point Database Update. Nucleic Acids Res. 2021, doi: 10.1093/nar/gkab944
Contact: Lukasz P. Kozlowski