Amino acid dipeptide frequency for Mycobacterium phage Purky

Apart from single amino acid frequencies, one can also calculate the so-called amino acid dipeptide frequency.

There are 400 possible dipeptides (441 if Xaa is counted as an additional, 21st amino acid representing all non-standard or unknown amino acids). Thus, if dipeptides occurred at random in proteins, with all amino acids equally likely, each of the 400 dipeptides would have a frequency of 0.25% (2.5‰). This is not the case in nature, so for better visibility dipeptides that occur more often than expected are marked in red in the table, and those that are underrepresented are marked in blue.
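As an illustration of the calculation described above, the following minimal sketch counts overlapping dipeptides and compares them with the uniform expectation. It is not the Proteome-pI implementation: the function name `dipeptide_permille`, the toy sequences, the mapping of non-standard letters to X, and the choice to normalise by the total number of dipeptides are all assumptions made for this example.

```python
from collections import Counter

STANDARD = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard amino acids, one-letter codes


def dipeptide_permille(proteins):
    """Return {dipeptide: frequency in per mille} over all overlapping pairs.

    Assumption: frequencies are normalised by the total number of dipeptides.
    """
    counts = Counter()
    for seq in proteins:
        # Map every non-standard or unknown letter to X (Xaa).
        seq = "".join(c if c in STANDARD else "X" for c in seq.upper())
        # Overlapping pairs; a pair never spans two different proteins.
        counts.update(seq[i:i + 2] for i in range(len(seq) - 1))
    total = sum(counts.values())
    if total == 0:
        return {}
    return {pair: 1000.0 * n / total for pair, n in counts.items()}


# Uniform expectation: every dipeptide equally likely.
expected_20 = 1000.0 / 20 ** 2  # 2.5 per mille for 400 dipeptides
expected_21 = 1000.0 / 21 ** 2  # about 2.27 per mille when Xaa is included

# Hypothetical toy sequences, not taken from the phage proteome.
toy = ["MAAGL", "AAXK"]
freqs = dipeptide_permille(toy)
for pair, value in sorted(freqs.items()):
    label = "over" if value > expected_20 else "under"
    print(f"{pair}: {value:.3f} per mille ({label}represented)")
```

With only seven dipeptides in the toy input every observed pair is far above 2.5‰; the comparison becomes meaningful only for a whole proteome.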

All values are given in per mille (‰), so they must be multiplied by 10⁻³ to obtain fractions.
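The unit conversion can be sketched in a few lines. The two input values are taken from the table (AlaAla and CysPhe); the helper names are hypothetical, and the comparison assumes the uniform expectation for 400 dipeptides.

```python
UNIFORM_PERMILLE = 1000.0 / 400  # 2.5 per mille if all 400 dipeptides were equally likely


def permille_to_fraction(value):
    """Convert a per mille value to a plain fraction (multiply by 10^-3)."""
    return value * 1e-3


def fold_vs_uniform(value):
    """Ratio of an observed per mille value to the uniform expectation."""
    return value / UNIFORM_PERMILLE


ala_ala = 16.668  # AlaAla, from the table
cys_phe = 0.061   # CysPhe, from the table

for name, value in (("AlaAla", ala_ala), ("CysPhe", cys_phe)):
    fraction = permille_to_fraction(value)
    print(f"{name}: fraction {fraction:.6f}, {fold_vs_uniform(value):.2f}x uniform")
```

Under these assumptions AlaAla is several times more frequent than the uniform expectation, while CysPhe is far below it.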

For more information, see the sequence space article on Wikipedia.

Ala Cys Asp Glu Phe Gly His Ile Lys Leu Met Asn Pro Gln Arg Ser Thr Val Trp Tyr Xaa
Ala
AlaAla: 16.668 ± 1.616
AlaCys: 0.973 ± 0.239
AlaAsp: 8.577 ± 0.846
AlaGlu: 9.55 ± 1.035
AlaPhe: 2.981 ± 0.53
AlaGly: 9.185 ± 0.986
AlaHis: 1.947 ± 0.331
AlaIle: 6.083 ± 0.655
AlaLys: 4.258 ± 0.665
AlaLeu: 10.524 ± 0.835
AlaMet: 2.677 ± 0.353
AlaAsn: 3.528 ± 0.466
AlaPro: 6.752 ± 0.824
AlaGln: 4.501 ± 0.732
AlaArg: 6.996 ± 0.606
AlaSer: 6.326 ± 0.722
AlaThr: 6.874 ± 0.618
AlaVal: 6.996 ± 0.56
AlaTrp: 2.312 ± 0.447
AlaTyr: 2.251 ± 0.393
AlaXaa: 0.0 ± 0.0
Cys
CysAla: 1.217 ± 0.279
CysCys: 0.365 ± 0.158
CysAsp: 0.852 ± 0.232
CysGlu: 0.73 ± 0.24
CysPhe: 0.061 ± 0.059
CysGly: 1.521 ± 0.343
CysHis: 0.608 ± 0.204
CysIle: 0.304 ± 0.138
CysLys: 0.304 ± 0.14
CysLeu: 0.487 ± 0.177
CysMet: 0.182 ± 0.109
CysAsn: 0.243 ± 0.137
CysPro: 1.217 ± 0.331
CysGln: 0.243 ± 0.156
CysArg: 0.973 ± 0.227
CysSer: 0.122 ± 0.072
CysThr: 0.73 ± 0.2
CysVal: 0.304 ± 0.178
CysTrp: 0.182 ± 0.12
CysTyr: 0.304 ± 0.13
CysXaa: 0.0 ± 0.0
Asp
AspAla: 8.516 ± 0.81
AspCys: 0.669 ± 0.227
AspAsp: 6.144 ± 0.681
AspGlu: 6.144 ± 0.749
AspPhe: 1.703 ± 0.322
AspGly: 7.056 ± 0.683
AspHis: 1.338 ± 0.315
AspIle: 2.372 ± 0.339
AspLys: 1.642 ± 0.249
AspLeu: 5.292 ± 0.551
AspMet: 1.399 ± 0.234
AspAsn: 1.521 ± 0.29
AspPro: 5.536 ± 0.507
AspGln: 2.312 ± 0.556
AspArg: 4.501 ± 0.622
AspSer: 2.251 ± 0.378
AspThr: 3.589 ± 0.427
AspVal: 4.562 ± 0.544
AspTrp: 1.156 ± 0.253
AspTyr: 1.217 ± 0.341
AspXaa: 0.0 ± 0.0
Glu
GluAla: 6.57 ± 0.899
GluCys: 1.217 ± 0.306
GluAsp: 3.163 ± 0.427
GluGlu: 2.798 ± 0.39
GluPhe: 2.19 ± 0.369
GluGly: 3.528 ± 0.527
GluHis: 1.338 ± 0.308
GluIle: 3.772 ± 0.513
GluLys: 2.007 ± 0.313
GluLeu: 6.813 ± 0.849
GluMet: 1.825 ± 0.348
GluAsn: 1.521 ± 0.286
GluPro: 4.137 ± 0.526
GluGln: 2.372 ± 0.385
GluArg: 4.137 ± 0.475
GluSer: 2.798 ± 0.406
GluThr: 3.285 ± 0.441
GluVal: 3.711 ± 0.543
GluTrp: 1.582 ± 0.297
GluTyr: 1.399 ± 0.307
GluXaa: 0.0 ± 0.0
Phe
PheAla: 2.737 ± 0.343
PheCys: 0.243 ± 0.125
PheAsp: 2.251 ± 0.481
PheGlu: 1.642 ± 0.334
PhePhe: 0.487 ± 0.167
PheGly: 3.224 ± 0.611
PheHis: 0.426 ± 0.151
PheIle: 1.764 ± 0.379
PheLys: 1.338 ± 0.264
PheLeu: 2.129 ± 0.35
PheMet: 0.547 ± 0.161
PheAsn: 0.669 ± 0.211
PhePro: 1.947 ± 0.373
PheGln: 0.669 ± 0.228
PheArg: 1.825 ± 0.292
PheSer: 0.791 ± 0.212
PheThr: 1.703 ± 0.341
PheVal: 1.764 ± 0.279
PheTrp: 0.608 ± 0.183
PheTyr: 0.608 ± 0.167
PheXaa: 0.0 ± 0.0
Gly
GlyAla: 9.064 ± 1.327
GlyCys: 0.73 ± 0.232
GlyAsp: 5.353 ± 0.496
GlyGlu: 4.501 ± 0.529
GlyPhe: 2.92 ± 0.665
GlyGly: 8.76 ± 1.922
GlyHis: 1.095 ± 0.275
GlyIle: 3.528 ± 0.501
GlyLys: 3.042 ± 0.38
GlyLeu: 6.996 ± 0.758
GlyMet: 2.068 ± 0.383
GlyAsn: 2.92 ± 0.393
GlyPro: 4.441 ± 0.675
GlyGln: 3.65 ± 0.774
GlyArg: 5.961 ± 0.583
GlySer: 4.623 ± 0.584
GlyThr: 4.806 ± 0.694
GlyVal: 7.543 ± 0.876
GlyTrp: 1.825 ± 0.346
GlyTyr: 2.433 ± 0.364
GlyXaa: 0.0 ± 0.0
His
HisAla: 2.007 ± 0.417
HisCys: 0.243 ± 0.129
HisAsp: 1.338 ± 0.28
HisGlu: 1.217 ± 0.324
HisPhe: 0.365 ± 0.156
HisGly: 0.791 ± 0.258
HisHis: 0.973 ± 0.246
HisIle: 0.73 ± 0.239
HisLys: 0.608 ± 0.179
HisLeu: 1.095 ± 0.277
HisMet: 0.243 ± 0.11
HisAsn: 0.73 ± 0.178
HisPro: 1.825 ± 0.354
HisGln: 0.912 ± 0.228
HisArg: 2.129 ± 0.458
HisSer: 0.426 ± 0.163
HisThr: 0.912 ± 0.244
HisVal: 1.095 ± 0.285
HisTrp: 0.304 ± 0.162
HisTyr: 0.852 ± 0.188
HisXaa: 0.0 ± 0.0
Ile
IleAla: 5.84 ± 0.587
IleCys: 0.487 ± 0.161
IleAsp: 3.528 ± 0.44
IleGlu: 4.137 ± 0.465
IlePhe: 0.73 ± 0.217
IleGly: 3.893 ± 0.505
IleHis: 0.669 ± 0.184
IleIle: 1.217 ± 0.286
IleLys: 1.338 ± 0.229
IleLeu: 1.338 ± 0.217
IleMet: 0.487 ± 0.189
IleAsn: 1.703 ± 0.359
IlePro: 2.859 ± 0.438
IleGln: 2.19 ± 0.313
IleArg: 3.772 ± 0.5
IleSer: 3.407 ± 0.381
IleThr: 3.407 ± 0.518
IleVal: 3.346 ± 0.447
IleTrp: 0.365 ± 0.161
IleTyr: 1.095 ± 0.306
IleXaa: 0.0 ± 0.0
Lys
LysAla: 4.258 ± 0.523
LysCys: 0.365 ± 0.187
LysAsp: 1.095 ± 0.225
LysGlu: 1.034 ± 0.283
LysPhe: 1.642 ± 0.282
LysGly: 2.859 ± 0.523
LysHis: 0.547 ± 0.16
LysIle: 1.217 ± 0.281
LysLys: 1.034 ± 0.29
LysLeu: 3.042 ± 0.463
LysMet: 0.608 ± 0.169
LysAsn: 0.487 ± 0.236
LysPro: 1.947 ± 0.344
LysGln: 0.973 ± 0.231
LysArg: 2.555 ± 0.45
LysSer: 1.521 ± 0.319
LysThr: 2.494 ± 0.344
LysVal: 2.737 ± 0.409
LysTrp: 0.547 ± 0.162
LysTyr: 0.791 ± 0.191
LysXaa: 0.0 ± 0.0
Leu
LeuAla: 11.071 ± 0.765
LeuCys: 0.791 ± 0.244
LeuAsp: 5.657 ± 0.735
LeuGlu: 3.65 ± 0.422
LeuPhe: 1.947 ± 0.335
LeuGly: 6.813 ± 0.773
LeuHis: 1.156 ± 0.243
LeuIle: 4.745 ± 0.568
LeuLys: 2.129 ± 0.335
LeuLeu: 7.117 ± 0.655
LeuMet: 1.399 ± 0.333
LeuAsn: 2.068 ± 0.393
LeuPro: 5.657 ± 0.474
LeuGln: 2.92 ± 0.488
LeuArg: 5.901 ± 0.575
LeuSer: 3.346 ± 0.4
LeuThr: 4.745 ± 0.421
LeuVal: 4.745 ± 0.529
LeuTrp: 1.338 ± 0.225
LeuTyr: 1.521 ± 0.337
LeuXaa: 0.0 ± 0.0
Met
MetAla: 3.407 ± 0.445
MetCys: 0.304 ± 0.132
MetAsp: 1.034 ± 0.255
MetGlu: 0.547 ± 0.168
MetPhe: 0.669 ± 0.246
MetGly: 1.399 ± 0.32
MetHis: 0.304 ± 0.157
MetIle: 0.791 ± 0.181
MetLys: 0.547 ± 0.216
MetLeu: 1.642 ± 0.332
MetMet: 0.487 ± 0.189
MetAsn: 0.547 ± 0.172
MetPro: 1.46 ± 0.329
MetGln: 0.669 ± 0.208
MetArg: 1.521 ± 0.341
MetSer: 2.251 ± 0.35
MetThr: 1.886 ± 0.312
MetVal: 0.973 ± 0.229
MetTrp: 0.243 ± 0.132
MetTyr: 0.061 ± 0.061
MetXaa: 0.0 ± 0.0
Asn
AsnAla: 3.285 ± 0.573
AsnCys: 0.182 ± 0.095
AsnAsp: 1.764 ± 0.377
AsnGlu: 1.46 ± 0.298
AsnPhe: 0.426 ± 0.176
AsnGly: 2.92 ± 0.387
AsnHis: 0.547 ± 0.321
AsnIle: 1.46 ± 0.304
AsnLys: 0.426 ± 0.148
AsnLeu: 2.251 ± 0.431
AsnMet: 0.182 ± 0.102
AsnAsn: 0.547 ± 0.185
AsnPro: 3.102 ± 0.454
AsnGln: 1.156 ± 0.294
AsnArg: 2.433 ± 0.447
AsnSer: 1.825 ± 0.404
AsnThr: 1.399 ± 0.307
AsnVal: 1.277 ± 0.263
AsnTrp: 0.73 ± 0.212
AsnTyr: 0.487 ± 0.191
AsnXaa: 0.0 ± 0.0
Pro
ProAla: 7.665 ± 0.685
ProCys: 0.426 ± 0.159
ProAsp: 5.657 ± 0.637
ProGlu: 4.501 ± 0.635
ProPhe: 1.825 ± 0.351
ProGly: 7.3 ± 1.07
ProHis: 1.034 ± 0.277
ProIle: 2.129 ± 0.424
ProLys: 2.068 ± 0.405
ProLeu: 4.015 ± 0.586
ProMet: 1.217 ± 0.277
ProAsn: 2.007 ± 0.431
ProPro: 4.866 ± 0.816
ProGln: 2.494 ± 0.282
ProArg: 4.927 ± 0.714
ProSer: 2.92 ± 0.391
ProThr: 3.893 ± 0.529
ProVal: 4.441 ± 0.49
ProTrp: 1.642 ± 0.325
ProTyr: 1.582 ± 0.306
ProXaa: 0.0 ± 0.0
Gln
GlnAla: 4.441 ± 0.51
GlnCys: 0.547 ± 0.201
GlnAsp: 1.582 ± 0.304
GlnGlu: 1.642 ± 0.292
GlnPhe: 0.973 ± 0.233
GlnGly: 2.677 ± 0.609
GlnHis: 0.669 ± 0.212
GlnIle: 2.068 ± 0.36
GlnLys: 1.217 ± 0.283
GlnLeu: 3.285 ± 0.513
GlnMet: 0.852 ± 0.207
GlnAsn: 0.608 ± 0.204
GlnPro: 2.068 ± 0.383
GlnGln: 2.129 ± 0.5
GlnArg: 3.346 ± 0.492
GlnSer: 1.947 ± 0.268
GlnThr: 2.616 ± 0.481
GlnVal: 3.589 ± 0.487
GlnTrp: 1.034 ± 0.24
GlnTyr: 0.73 ± 0.193
GlnXaa: 0.0 ± 0.0
Arg
ArgAla: 7.908 ± 0.767
ArgCys: 0.73 ± 0.192
ArgAsp: 5.171 ± 0.622
ArgGlu: 4.137 ± 0.681
ArgPhe: 2.129 ± 0.333
ArgGly: 5.231 ± 0.498
ArgHis: 1.703 ± 0.347
ArgIle: 4.015 ± 0.478
ArgLys: 3.102 ± 0.405
ArgLeu: 5.11 ± 0.544
ArgMet: 1.764 ± 0.363
ArgAsn: 2.555 ± 0.424
ArgPro: 3.711 ± 0.61
ArgGln: 2.555 ± 0.402
ArgArg: 6.448 ± 0.974
ArgSer: 2.92 ± 0.61
ArgThr: 4.623 ± 0.459
ArgVal: 4.258 ± 0.491
ArgTrp: 1.399 ± 0.289
ArgTyr: 2.068 ± 0.395
ArgXaa: 0.0 ± 0.0
Ser
SerAla: 6.448 ± 0.833
SerCys: 0.182 ± 0.102
SerAsp: 3.102 ± 0.37
SerGlu: 2.007 ± 0.343
SerPhe: 1.217 ± 0.281
SerGly: 4.38 ± 0.68
SerHis: 0.73 ± 0.228
SerIle: 2.251 ± 0.409
SerLys: 1.521 ± 0.363
SerLeu: 4.015 ± 0.427
SerMet: 1.582 ± 0.286
SerAsn: 1.825 ± 0.309
SerPro: 3.224 ± 0.521
SerGln: 1.642 ± 0.316
SerArg: 3.224 ± 0.462
SerSer: 2.859 ± 0.404
SerThr: 3.467 ± 0.483
SerVal: 3.772 ± 0.534
SerTrp: 1.217 ± 0.243
SerTyr: 1.156 ± 0.237
SerXaa: 0.0 ± 0.0
Thr
ThrAla: 7.178 ± 0.876
ThrCys: 0.426 ± 0.186
ThrAsp: 4.076 ± 0.615
ThrGlu: 3.589 ± 0.436
ThrPhe: 1.642 ± 0.332
ThrGly: 5.536 ± 0.652
ThrHis: 1.277 ± 0.349
ThrIle: 3.528 ± 0.478
ThrLys: 2.129 ± 0.353
ThrLeu: 4.806 ± 0.429
ThrMet: 1.034 ± 0.308
ThrAsn: 1.582 ± 0.278
ThrPro: 4.684 ± 0.534
ThrGln: 2.312 ± 0.327
ThrArg: 3.65 ± 0.431
ThrSer: 2.981 ± 0.487
ThrThr: 3.954 ± 0.512
ThrVal: 5.536 ± 0.698
ThrTrp: 1.46 ± 0.344
ThrTyr: 1.764 ± 0.291
ThrXaa: 0.0 ± 0.0
Val
ValAla: 7.726 ± 0.687
ValCys: 0.912 ± 0.23
ValAsp: 5.901 ± 0.812
ValGlu: 4.866 ± 0.581
ValPhe: 2.068 ± 0.401
ValGly: 5.231 ± 0.585
ValHis: 1.582 ± 0.304
ValIle: 2.129 ± 0.361
ValLys: 1.947 ± 0.354
ValLeu: 5.11 ± 0.708
ValMet: 1.277 ± 0.301
ValAsn: 1.703 ± 0.311
ValPro: 4.441 ± 0.478
ValGln: 2.555 ± 0.415
ValArg: 4.076 ± 0.54
ValSer: 3.772 ± 0.506
ValThr: 5.353 ± 0.739
ValVal: 6.57 ± 0.855
ValTrp: 1.277 ± 0.302
ValTyr: 1.46 ± 0.257
ValXaa: 0.0 ± 0.0
Trp
TrpAla: 1.764 ± 0.385
TrpCys: 0.73 ± 0.185
TrpAsp: 0.912 ± 0.262
TrpGlu: 0.912 ± 0.209
TrpPhe: 0.973 ± 0.312
TrpGly: 1.886 ± 0.306
TrpHis: 0.304 ± 0.144
TrpIle: 0.608 ± 0.196
TrpLys: 0.547 ± 0.167
TrpLeu: 1.764 ± 0.353
TrpMet: 0.669 ± 0.181
TrpAsn: 0.547 ± 0.174
TrpPro: 1.156 ± 0.287
TrpGln: 0.852 ± 0.196
TrpArg: 1.642 ± 0.294
TrpSer: 1.399 ± 0.309
TrpThr: 1.338 ± 0.332
TrpVal: 1.217 ± 0.291
TrpTrp: 0.973 ± 0.264
TrpTyr: 0.547 ± 0.171
TrpXaa: 0.0 ± 0.0
Tyr
TyrAla: 2.19 ± 0.366
TyrCys: 0.426 ± 0.146
TyrAsp: 1.764 ± 0.381
TyrGlu: 1.156 ± 0.231
TyrPhe: 0.426 ± 0.159
TyrGly: 2.068 ± 0.382
TyrHis: 0.547 ± 0.167
TyrIle: 1.095 ± 0.285
TyrLys: 0.547 ± 0.198
TyrLeu: 1.886 ± 0.305
TyrMet: 0.182 ± 0.111
TyrAsn: 0.547 ± 0.159
TyrPro: 1.582 ± 0.302
TyrGln: 0.912 ± 0.225
TyrArg: 1.521 ± 0.351
TyrSer: 1.338 ± 0.313
TyrThr: 2.068 ± 0.379
TyrVal: 1.521 ± 0.308
TyrTrp: 0.547 ± 0.179
TyrTyr: 0.547 ± 0.163
TyrXaa: 0.0 ± 0.0
Xaa
XaaAla: 0.0 ± 0.0
XaaCys: 0.0 ± 0.0
XaaAsp: 0.0 ± 0.0
XaaGlu: 0.0 ± 0.0
XaaPhe: 0.0 ± 0.0
XaaGly: 0.0 ± 0.0
XaaHis: 0.0 ± 0.0
XaaIle: 0.0 ± 0.0
XaaLys: 0.0 ± 0.0
XaaLeu: 0.0 ± 0.0
XaaMet: 0.0 ± 0.0
XaaAsn: 0.0 ± 0.0
XaaPro: 0.0 ± 0.0
XaaGln: 0.0 ± 0.0
XaaArg: 0.0 ± 0.0
XaaSer: 0.0 ± 0.0
XaaThr: 0.0 ± 0.0
XaaVal: 0.0 ± 0.0
XaaTrp: 0.0 ± 0.0
XaaTyr: 0.0 ± 0.0
XaaXaa: 0.0 ± 0.0
Statistics based on 84 proteins (16440 amino acids)

Note: The error was estimated by bootstrapping (×100) at the protein level.
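One way to read "bootstrapping at the protein level" is sketched below: whole proteins are resampled with replacement, the dipeptide frequency is recomputed for each replicate, and the standard deviation across replicates is reported as the error. This is only an interpretation of the note, not the Proteome-pI code; the function names, the toy sequences, the fixed seed, and the use of the sample standard deviation are assumptions.

```python
import random
import statistics
from collections import Counter


def permille(proteins):
    """Dipeptide frequencies in per mille over all overlapping pairs."""
    counts = Counter()
    for seq in proteins:
        counts.update(seq[i:i + 2] for i in range(len(seq) - 1))
    total = sum(counts.values())
    if total == 0:
        return {}
    return {pair: 1000.0 * n / total for pair, n in counts.items()}


def bootstrap_error(proteins, pair, n_boot=100, seed=0):
    """Standard deviation of one dipeptide frequency over bootstrap replicates.

    Each replicate draws len(proteins) whole proteins with replacement.
    """
    rng = random.Random(seed)  # fixed seed only to make the sketch reproducible
    estimates = []
    for _ in range(n_boot):
        sample = rng.choices(proteins, k=len(proteins))
        estimates.append(permille(sample).get(pair, 0.0))
    return statistics.stdev(estimates)


# Hypothetical toy proteome, not the 84 phage proteins.
toy_proteome = ["MAAGLAAK", "MKVLAAR", "MGGSAAT", "MPLVAA", "MTTRK"]
point = permille(toy_proteome)["AA"]
err = bootstrap_error(toy_proteome, "AA")
print(f"AA: {point:.3f} ± {err:.3f} per mille")
```

Resampling proteins rather than individual dipeptides keeps the pairs from one protein together, so the error reflects variation between proteins.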

The dipeptide statistics above (along with other statistics for this proteome) can be downloaded from this CSV file
See this proteome in: uniprot_link
Proteome-pI is available under Creative Commons Attribution-NoDerivs license, for more details see here

Reference: Kozlowski LP. Proteome-pI 2.0: Proteome Isoelectric Point Database Update. Nucleic Acids Res. 2021, doi: 10.1093/nar/gkab944
Contact: Lukasz P. Kozlowski