Amino acid dipeptide frequency for Mycobacterium phage Mahavrat

Apart from single amino acid frequencies, one can also calculate the so-called amino acid dipeptide frequency, i.e. how often each ordered pair of adjacent residues occurs.
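As a minimal sketch of how such frequencies could be counted, the function below slides a window of two residues over each protein sequence and reports every dipeptide's share in per mille. The function name `dipeptide_permille`, the one-letter input alphabet, the mapping of any non-standard letter to Xaa, and the choice to normalise by the total number of counted dipeptides are assumptions made for illustration; the source does not describe the exact procedure used by Proteome-pI.

```python
from collections import Counter

# One-letter to three-letter codes for the 20 standard amino acids.
THREE_LETTER = {
    "A": "Ala", "C": "Cys", "D": "Asp", "E": "Glu", "F": "Phe",
    "G": "Gly", "H": "His", "I": "Ile", "K": "Lys", "L": "Leu",
    "M": "Met", "N": "Asn", "P": "Pro", "Q": "Gln", "R": "Arg",
    "S": "Ser", "T": "Thr", "V": "Val", "W": "Trp", "Y": "Tyr",
}


def dipeptide_permille(proteins):
    """Return {dipeptide: frequency in per mille} for a list of sequences."""
    counts = Counter()
    for seq in proteins:
        # Any letter outside the 20 standard ones is treated as Xaa (assumption).
        codes = [THREE_LETTER.get(res, "Xaa") for res in seq.upper()]
        # Overlapping pairs within one protein; pairs never span two proteins.
        for first, second in zip(codes, codes[1:]):
            counts[first + second] += 1
    total = sum(counts.values())
    if total == 0:
        return {}
    return {pair: 1000.0 * n / total for pair, n in counts.items()}


if __name__ == "__main__":
    toy = ["MAAGA", "MAGX"]  # hypothetical toy sequences, not phage data
    for pair, value in sorted(dipeptide_permille(toy).items()):
        print(f"{pair}: {value:.1f}")
```

Keys are built from three-letter codes so that they match the labels used in the table (AlaAla, AlaCys, and so on).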

There are 400 possible dipeptides (441 if Xaa is counted as a 21st amino acid representing all non-standard or unknown amino acids). Thus, if dipeptides occurred randomly in proteins, each one would be present at a frequency of 0.25% (2.5‰). As this is not the case in nature, for better visibility all dipeptides that occur more often than expected are marked in red in the table, and those that are underrepresented are marked in blue.
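The uniform expectation and the colour rule can be sketched as follows. The function name `classify` and the strict greater-than/less-than comparison against the 400-dipeptide expectation are assumptions; the source only states that over-represented dipeptides are shown in red and under-represented ones in blue.

```python
# Expected per mille frequency if all dipeptides were equally likely.
EXPECTED_20 = 1000.0 / (20 * 20)   # 2.5 per mille, i.e. 0.25%
EXPECTED_21 = 1000.0 / (21 * 21)   # about 2.27 per mille when Xaa is included


def classify(permille, expected=EXPECTED_20):
    """Label a dipeptide frequency relative to the uniform expectation."""
    if permille > expected:
        return "over-represented (red)"
    if permille < expected:
        return "under-represented (blue)"
    return "as expected"


# Two values taken from the table: AlaAla and CysCys.
print(classify(14.624))  # over-represented (red)
print(classify(0.224))   # under-represented (blue)
```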

All values are given in per mille (‰), so they must be multiplied by 10⁻³ to obtain fractions.
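A short illustration of the unit conversion, using the AlaAla entry from the table; the helper name `permille_to_fraction` is hypothetical.

```python
def permille_to_fraction(value):
    """Convert a per mille value to a plain fraction."""
    return value * 1e-3


ala_ala = 14.624  # AlaAla entry from the table, in per mille
fraction = permille_to_fraction(ala_ala)
print(f"{fraction:.6f}")         # 0.014624
print(f"{fraction * 100:.4f}%")  # 1.4624%
```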

For more information, see the sequence space article on Wikipedia.

Ala Cys Asp Glu Phe Gly His Ile Lys Leu Met Asn Pro Gln Arg Ser Thr Val Trp Tyr Xaa
Ala
AlaAla: 14.624 ± 1.736
AlaCys: 1.121 ± 0.306
AlaAsp: 7.228 ± 0.614
AlaGlu: 7.62 ± 0.703
AlaPhe: 2.858 ± 0.4
AlaGly: 10.142 ± 1.271
AlaHis: 2.129 ± 0.409
AlaIle: 4.034 ± 0.529
AlaLys: 4.314 ± 0.415
AlaLeu: 7.732 ± 0.828
AlaMet: 2.858 ± 0.49
AlaAsn: 3.082 ± 0.573
AlaPro: 4.595 ± 0.522
AlaGln: 3.362 ± 0.373
AlaArg: 7.004 ± 0.652
AlaSer: 5.771 ± 0.63
AlaThr: 6.724 ± 0.51
AlaVal: 6.724 ± 0.695
AlaTrp: 2.353 ± 0.381
AlaTyr: 2.409 ± 0.364
AlaXaa: 0.0 ± 0.0
Cys
CysAla: 0.897 ± 0.264
CysCys: 0.224 ± 0.143
CysAsp: 1.233 ± 0.299
CysGlu: 0.784 ± 0.219
CysPhe: 0.336 ± 0.14
CysGly: 1.793 ± 0.451
CysHis: 0.168 ± 0.108
CysIle: 0.168 ± 0.124
CysLys: 0.616 ± 0.168
CysLeu: 0.784 ± 0.228
CysMet: 0.28 ± 0.127
CysAsn: 0.336 ± 0.131
CysPro: 1.289 ± 0.286
CysGln: 0.224 ± 0.115
CysArg: 0.84 ± 0.232
CysSer: 0.897 ± 0.268
CysThr: 0.897 ± 0.264
CysVal: 0.504 ± 0.159
CysTrp: 0.336 ± 0.139
CysTyr: 0.168 ± 0.106
CysXaa: 0.0 ± 0.0
Asp
AspAla: 6.444 ± 0.627
AspCys: 1.009 ± 0.228
AspAsp: 4.595 ± 0.542
AspGlu: 3.082 ± 0.445
AspPhe: 2.017 ± 0.301
AspGly: 6.107 ± 0.751
AspHis: 1.233 ± 0.27
AspIle: 2.129 ± 0.428
AspLys: 1.737 ± 0.294
AspLeu: 5.659 ± 0.553
AspMet: 0.953 ± 0.227
AspAsn: 1.793 ± 0.339
AspPro: 4.931 ± 0.551
AspGln: 2.577 ± 0.34
AspArg: 5.379 ± 0.736
AspSer: 3.642 ± 0.548
AspThr: 4.37 ± 0.598
AspVal: 4.539 ± 0.534
AspTrp: 1.457 ± 0.331
AspTyr: 2.129 ± 0.399
AspXaa: 0.0 ± 0.0
Glu
GluAla: 5.995 ± 0.69
GluCys: 1.233 ± 0.259
GluAsp: 3.194 ± 0.419
GluGlu: 2.914 ± 0.537
GluPhe: 2.073 ± 0.315
GluGly: 3.418 ± 0.487
GluHis: 1.625 ± 0.369
GluIle: 2.073 ± 0.339
GluLys: 1.737 ± 0.274
GluLeu: 5.323 ± 0.573
GluMet: 1.345 ± 0.291
GluAsn: 2.241 ± 0.341
GluPro: 3.306 ± 0.501
GluGln: 2.858 ± 0.5
GluArg: 5.435 ± 0.64
GluSer: 3.026 ± 0.495
GluThr: 3.81 ± 0.625
GluVal: 3.978 ± 0.559
GluTrp: 1.289 ± 0.286
GluTyr: 1.793 ± 0.33
GluXaa: 0.0 ± 0.0
Phe
PheAla: 3.306 ± 0.419
PheCys: 0.224 ± 0.12
PheAsp: 2.409 ± 0.398
PheGlu: 1.513 ± 0.315
PhePhe: 1.233 ± 0.277
PheGly: 2.858 ± 0.545
PheHis: 0.504 ± 0.176
PheIle: 1.401 ± 0.307
PheLys: 0.728 ± 0.232
PheLeu: 2.073 ± 0.282
PheMet: 0.728 ± 0.2
PheAsn: 1.289 ± 0.355
PhePro: 1.625 ± 0.313
PheGln: 1.177 ± 0.344
PheArg: 1.513 ± 0.252
PheSer: 1.289 ± 0.251
PheThr: 2.241 ± 0.328
PheVal: 2.297 ± 0.352
PheTrp: 0.56 ± 0.152
PheTyr: 0.897 ± 0.244
PheXaa: 0.0 ± 0.0
Gly
GlyAla: 9.133 ± 1.091
GlyCys: 1.009 ± 0.275
GlyAsp: 6.051 ± 0.583
GlyGlu: 4.427 ± 0.606
GlyPhe: 2.633 ± 0.492
GlyGly: 11.543 ± 2.279
GlyHis: 2.017 ± 0.309
GlyIle: 3.698 ± 0.58
GlyLys: 2.241 ± 0.336
GlyLeu: 5.379 ± 0.562
GlyMet: 2.353 ± 0.475
GlyAsn: 3.138 ± 0.425
GlyPro: 4.483 ± 0.607
GlyGln: 2.521 ± 0.549
GlyArg: 4.707 ± 0.607
GlySer: 6.276 ± 0.816
GlyThr: 7.116 ± 0.937
GlyVal: 5.827 ± 0.679
GlyTrp: 2.409 ± 0.37
GlyTyr: 2.521 ± 0.459
GlyXaa: 0.0 ± 0.0
His
HisAla: 1.681 ± 0.275
HisCys: 0.336 ± 0.153
HisAsp: 1.065 ± 0.269
HisGlu: 1.233 ± 0.28
HisPhe: 0.504 ± 0.15
HisGly: 1.625 ± 0.303
HisHis: 0.784 ± 0.257
HisIle: 1.457 ± 0.279
HisLys: 1.009 ± 0.247
HisLeu: 1.401 ± 0.287
HisMet: 0.56 ± 0.151
HisAsn: 0.728 ± 0.22
HisPro: 2.017 ± 0.322
HisGln: 0.672 ± 0.212
HisArg: 1.793 ± 0.299
HisSer: 1.065 ± 0.257
HisThr: 1.905 ± 0.37
HisVal: 1.009 ± 0.258
HisTrp: 0.392 ± 0.154
HisTyr: 0.953 ± 0.207
HisXaa: 0.0 ± 0.0
Ile
IleAla: 5.547 ± 0.577
IleCys: 0.504 ± 0.2
IleAsp: 3.418 ± 0.422
IleGlu: 3.586 ± 0.438
IlePhe: 0.728 ± 0.242
IleGly: 3.642 ± 0.467
IleHis: 1.681 ± 0.351
IleIle: 1.457 ± 0.29
IleLys: 0.953 ± 0.203
IleLeu: 2.297 ± 0.462
IleMet: 0.28 ± 0.117
IleAsn: 1.793 ± 0.278
IlePro: 2.577 ± 0.335
IleGln: 1.457 ± 0.307
IleArg: 2.633 ± 0.4
IleSer: 1.849 ± 0.463
IleThr: 3.922 ± 0.408
IleVal: 2.914 ± 0.346
IleTrp: 0.897 ± 0.204
IleTyr: 0.672 ± 0.176
IleXaa: 0.0 ± 0.0
Lys
LysAla: 3.866 ± 0.541
LysCys: 0.448 ± 0.133
LysAsp: 1.401 ± 0.311
LysGlu: 1.345 ± 0.241
LysPhe: 1.345 ± 0.192
LysGly: 2.241 ± 0.39
LysHis: 1.009 ± 0.255
LysIle: 0.897 ± 0.223
LysLys: 1.401 ± 0.38
LysLeu: 2.69 ± 0.518
LysMet: 0.728 ± 0.145
LysAsn: 0.953 ± 0.235
LysPro: 2.746 ± 0.53
LysGln: 1.569 ± 0.255
LysArg: 2.465 ± 0.46
LysSer: 1.905 ± 0.339
LysThr: 1.793 ± 0.314
LysVal: 2.185 ± 0.363
LysTrp: 0.953 ± 0.367
LysTyr: 1.065 ± 0.286
LysXaa: 0.0 ± 0.0
Leu
LeuAla: 8.069 ± 0.788
LeuCys: 0.784 ± 0.302
LeuAsp: 4.931 ± 0.609
LeuGlu: 3.53 ± 0.452
LeuPhe: 2.185 ± 0.311
LeuGly: 5.323 ± 0.599
LeuHis: 0.897 ± 0.299
LeuIle: 2.69 ± 0.45
LeuLys: 2.353 ± 0.38
LeuLeu: 4.987 ± 0.641
LeuMet: 1.681 ± 0.302
LeuAsn: 2.409 ± 0.381
LeuPro: 5.491 ± 0.606
LeuGln: 2.69 ± 0.415
LeuArg: 4.707 ± 0.558
LeuSer: 4.819 ± 0.537
LeuThr: 5.547 ± 0.614
LeuVal: 4.819 ± 0.437
LeuTrp: 1.065 ± 0.292
LeuTyr: 2.409 ± 0.359
LeuXaa: 0.0 ± 0.0
Met
MetAla: 2.129 ± 0.354
MetCys: 0.056 ± 0.056
MetAsp: 1.177 ± 0.296
MetGlu: 0.953 ± 0.188
MetPhe: 0.672 ± 0.205
MetGly: 1.737 ± 0.318
MetHis: 0.112 ± 0.071
MetIle: 0.616 ± 0.193
MetLys: 0.84 ± 0.203
MetLeu: 1.849 ± 0.284
MetMet: 0.56 ± 0.25
MetAsn: 0.728 ± 0.199
MetPro: 1.121 ± 0.236
MetGln: 0.336 ± 0.14
MetArg: 1.625 ± 0.336
MetSer: 2.69 ± 0.367
MetThr: 2.577 ± 0.356
MetVal: 1.233 ± 0.295
MetTrp: 0.28 ± 0.128
MetTyr: 0.392 ± 0.126
MetXaa: 0.0 ± 0.0
Asn
AsnAla: 3.754 ± 0.493
AsnCys: 0.224 ± 0.122
AsnAsp: 1.849 ± 0.309
AsnGlu: 1.625 ± 0.392
AsnPhe: 0.728 ± 0.233
AsnGly: 4.875 ± 0.61
AsnHis: 0.784 ± 0.187
AsnIle: 1.793 ± 0.441
AsnLys: 1.065 ± 0.249
AsnLeu: 2.017 ± 0.387
AsnMet: 0.448 ± 0.174
AsnAsn: 1.905 ± 0.403
AsnPro: 2.465 ± 0.339
AsnGln: 1.009 ± 0.3
AsnArg: 2.017 ± 0.401
AsnSer: 1.737 ± 0.306
AsnThr: 2.073 ± 0.302
AsnVal: 1.961 ± 0.386
AsnTrp: 0.728 ± 0.197
AsnTyr: 0.897 ± 0.212
AsnXaa: 0.0 ± 0.0
Pro
ProAla: 5.491 ± 0.582
ProCys: 0.616 ± 0.198
ProAsp: 4.427 ± 0.556
ProGlu: 4.427 ± 0.562
ProPhe: 1.737 ± 0.327
ProGly: 6.332 ± 0.843
ProHis: 1.569 ± 0.309
ProIle: 2.129 ± 0.279
ProLys: 2.017 ± 0.366
ProLeu: 4.763 ± 0.578
ProMet: 1.233 ± 0.258
ProAsn: 2.185 ± 0.351
ProPro: 4.034 ± 0.576
ProGln: 2.073 ± 0.433
ProArg: 3.586 ± 0.591
ProSer: 3.642 ± 0.455
ProThr: 3.474 ± 0.463
ProVal: 4.707 ± 0.504
ProTrp: 1.289 ± 0.296
ProTyr: 1.737 ± 0.256
ProXaa: 0.0 ± 0.0
Gln
GlnAla: 4.258 ± 0.483
GlnCys: 0.448 ± 0.184
GlnAsp: 1.401 ± 0.259
GlnGlu: 1.513 ± 0.308
GlnPhe: 1.233 ± 0.225
GlnGly: 2.914 ± 0.445
GlnHis: 0.672 ± 0.198
GlnIle: 1.849 ± 0.337
GlnLys: 1.513 ± 0.311
GlnLeu: 2.521 ± 0.393
GlnMet: 0.728 ± 0.204
GlnAsn: 0.784 ± 0.259
GlnPro: 2.465 ± 0.405
GlnGln: 1.233 ± 0.394
GlnArg: 2.353 ± 0.401
GlnSer: 2.297 ± 0.381
GlnThr: 2.241 ± 0.328
GlnVal: 2.185 ± 0.295
GlnTrp: 0.448 ± 0.129
GlnTyr: 0.897 ± 0.292
GlnXaa: 0.0 ± 0.0
Arg
ArgAla: 6.892 ± 0.647
ArgCys: 1.457 ± 0.412
ArgAsp: 4.314 ± 0.624
ArgGlu: 5.099 ± 0.755
ArgPhe: 1.961 ± 0.362
ArgGly: 3.81 ± 0.409
ArgHis: 1.457 ± 0.309
ArgIle: 4.258 ± 0.578
ArgLys: 2.241 ± 0.36
ArgLeu: 4.595 ± 0.576
ArgMet: 2.409 ± 0.375
ArgAsn: 2.577 ± 0.392
ArgPro: 3.53 ± 0.474
ArgGln: 1.625 ± 0.368
ArgArg: 5.659 ± 0.901
ArgSer: 3.698 ± 0.409
ArgThr: 3.362 ± 0.502
ArgVal: 5.379 ± 0.619
ArgTrp: 1.905 ± 0.353
ArgTyr: 2.129 ± 0.321
ArgXaa: 0.0 ± 0.0
Ser
SerAla: 5.883 ± 0.811
SerCys: 0.56 ± 0.194
SerAsp: 4.483 ± 0.504
SerGlu: 3.138 ± 0.491
SerPhe: 1.793 ± 0.297
SerGly: 6.388 ± 0.926
SerHis: 1.009 ± 0.311
SerIle: 2.858 ± 0.477
SerLys: 2.073 ± 0.328
SerLeu: 3.586 ± 0.397
SerMet: 1.457 ± 0.302
SerAsn: 2.353 ± 0.428
SerPro: 3.474 ± 0.36
SerGln: 1.849 ± 0.311
SerArg: 3.81 ± 0.424
SerSer: 3.866 ± 0.504
SerThr: 3.586 ± 0.528
SerVal: 4.37 ± 0.588
SerTrp: 1.289 ± 0.302
SerTyr: 1.401 ± 0.214
SerXaa: 0.0 ± 0.0
Thr
ThrAla: 7.06 ± 0.553
ThrCys: 0.728 ± 0.217
ThrAsp: 4.37 ± 0.586
ThrGlu: 4.034 ± 0.442
ThrPhe: 2.017 ± 0.35
ThrGly: 6.388 ± 0.718
ThrHis: 1.737 ± 0.352
ThrIle: 4.034 ± 0.469
ThrLys: 2.017 ± 0.36
ThrLeu: 4.595 ± 0.584
ThrMet: 1.065 ± 0.257
ThrAsn: 2.129 ± 0.334
ThrPro: 4.875 ± 0.574
ThrGln: 2.297 ± 0.355
ThrArg: 4.146 ± 0.51
ThrSer: 4.034 ± 0.503
ThrThr: 4.931 ± 0.707
ThrVal: 5.379 ± 0.631
ThrTrp: 1.625 ± 0.368
ThrTyr: 1.961 ± 0.387
ThrXaa: 0.0 ± 0.0
Val
ValAla: 7.116 ± 0.638
ValCys: 1.233 ± 0.271
ValAsp: 5.267 ± 0.547
ValGlu: 4.931 ± 0.565
ValPhe: 2.241 ± 0.425
ValGly: 4.763 ± 0.669
ValHis: 1.457 ± 0.318
ValIle: 2.633 ± 0.338
ValLys: 2.577 ± 0.37
ValLeu: 4.987 ± 0.544
ValMet: 1.065 ± 0.238
ValAsn: 2.185 ± 0.348
ValPro: 3.866 ± 0.392
ValGln: 2.746 ± 0.385
ValArg: 4.707 ± 0.641
ValSer: 3.922 ± 0.527
ValThr: 5.491 ± 0.491
ValVal: 5.883 ± 0.697
ValTrp: 1.625 ± 0.339
ValTyr: 1.289 ± 0.252
ValXaa: 0.0 ± 0.0
Trp
TrpAla: 2.017 ± 0.303
TrpCys: 0.224 ± 0.116
TrpAsp: 1.233 ± 0.297
TrpGlu: 1.177 ± 0.323
TrpPhe: 0.728 ± 0.249
TrpGly: 1.121 ± 0.272
TrpHis: 0.728 ± 0.204
TrpIle: 1.065 ± 0.234
TrpLys: 0.784 ± 0.172
TrpLeu: 2.017 ± 0.391
TrpMet: 0.784 ± 0.236
TrpAsn: 0.504 ± 0.224
TrpPro: 1.009 ± 0.28
TrpGln: 0.897 ± 0.219
TrpArg: 1.961 ± 0.472
TrpSer: 1.681 ± 0.346
TrpThr: 1.345 ± 0.241
TrpVal: 1.793 ± 0.394
TrpTrp: 1.065 ± 0.241
TrpTyr: 0.448 ± 0.169
TrpXaa: 0.0 ± 0.0
Tyr
TyrAla: 2.521 ± 0.441
TyrCys: 0.336 ± 0.138
TyrAsp: 1.737 ± 0.378
TyrGlu: 1.737 ± 0.315
TyrPhe: 0.84 ± 0.247
TyrGly: 2.465 ± 0.395
TyrHis: 0.616 ± 0.222
TyrIle: 1.569 ± 0.315
TyrLys: 0.784 ± 0.201
TyrLeu: 2.129 ± 0.372
TyrMet: 0.112 ± 0.065
TyrAsn: 0.84 ± 0.193
TyrPro: 1.569 ± 0.256
TyrGln: 0.784 ± 0.209
TyrArg: 2.073 ± 0.381
TyrSer: 1.121 ± 0.251
TyrThr: 2.017 ± 0.375
TyrVal: 2.297 ± 0.34
TyrTrp: 0.616 ± 0.17
TyrTyr: 0.56 ± 0.157
TyrXaa: 0.0 ± 0.0
Xaa
XaaAla: 0.0 ± 0.0
XaaCys: 0.0 ± 0.0
XaaAsp: 0.0 ± 0.0
XaaGlu: 0.0 ± 0.0
XaaPhe: 0.0 ± 0.0
XaaGly: 0.0 ± 0.0
XaaHis: 0.0 ± 0.0
XaaIle: 0.0 ± 0.0
XaaLys: 0.0 ± 0.0
XaaLeu: 0.0 ± 0.0
XaaMet: 0.0 ± 0.0
XaaAsn: 0.0 ± 0.0
XaaPro: 0.0 ± 0.0
XaaGln: 0.0 ± 0.0
XaaArg: 0.0 ± 0.0
XaaSer: 0.0 ± 0.0
XaaThr: 0.0 ± 0.0
XaaVal: 0.0 ± 0.0
XaaTrp: 0.0 ± 0.0
XaaTyr: 0.0 ± 0.0
XaaXaa: 0.0 ± 0.0
Statistics based on 99 proteins (17848 amino acids)

Note: The error was estimated by bootstrapping (100 replicates) at the protein level.
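Protein-level bootstrapping can be sketched as below: whole proteins are resampled with replacement, the statistic is recomputed on each resample, and the spread across resamples is reported as the error. The function names, the fixed random seed, the use of the sample standard deviation as the error measure, the one-letter dipeptide notation, and the normalisation by the number of counted dipeptides are all assumptions for illustration; the source states only that 100 bootstrap rounds were run at the protein level.

```python
import random
import statistics


def pair_permille(proteins, pair):
    """Per mille frequency of one two-letter dipeptide, e.g. "AA"."""
    hits = total = 0
    for seq in proteins:
        for i in range(len(seq) - 1):
            total += 1
            hits += seq[i:i + 2] == pair
    return 1000.0 * hits / total if total else 0.0


def bootstrap_error(proteins, pair, rounds=100, seed=0):
    """Standard deviation of the frequency over protein-level resamples."""
    rng = random.Random(seed)
    estimates = []
    for _ in range(rounds):
        # Resample whole proteins with replacement, keeping the sample size.
        sample = rng.choices(proteins, k=len(proteins))
        estimates.append(pair_permille(sample, pair))
    return statistics.stdev(estimates)


if __name__ == "__main__":
    toy = ["MAAGA", "MKAAL", "MGGSA", "MAAAA"]  # hypothetical sequences
    print(pair_permille(toy, "AA"), bootstrap_error(toy, "AA"))
```

Resampling proteins rather than individual residues keeps each sequence intact, so dipeptides that cluster in a few proteins produce a larger error estimate.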

The dipeptide statistics above (along with other statistics for this proteome) can be downloaded from this CSV file
See this proteome in: uniprot_link
Proteome-pI is available under the Creative Commons Attribution-NoDerivs license; for more details see here

Reference: Kozlowski LP. Proteome-pI 2.0: Proteome Isoelectric Point Database Update. Nucleic Acids Res. 2021, doi: 10.1093/nar/gkab944
Contact: Lukasz P. Kozlowski