Amino acid dipeptide frequency for Mycobacterium phage Spikelee

Apart from single amino acid frequencies, one can also calculate the so-called amino acid dipeptide frequency, i.e. how often each ordered pair of adjacent amino acids occurs.
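
As a rough illustration of what such a calculation involves, here is a minimal Python sketch. It is not the code used by Proteome-pI: the function name `dipeptide_per_mille`, the toy sequences, the use of one-letter codes, and the choice to count overlapping pairs within each protein and normalise by the total number of pairs are all assumptions made for this example.

```python
from collections import Counter


def dipeptide_per_mille(proteins):
    """Return {dipeptide: frequency in per mille} for a list of sequences.

    Assumption: pairs overlap (positions 0-1, 1-2, ...) and never span
    two different proteins.
    """
    counts = Counter()
    for seq in proteins:
        counts.update(seq[i:i + 2] for i in range(len(seq) - 1))
    total = sum(counts.values())
    if total == 0:
        return {}
    return {pair: 1000.0 * n / total for pair, n in counts.items()}


# Made-up toy sequences (one-letter codes), not taken from the phage proteome.
toy_proteome = ["MAAG", "AAGK"]
freqs = dipeptide_per_mille(toy_proteome)
```

In this toy input there are six dipeptides in total (MA, AA, AG, AA, AG, GK), so `freqs["AA"]` is two sixths of 1000.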

There are 400 possible dipeptides (441 if Xaa is counted as an additional, 21st amino acid representing all non-standard or unknown amino acids). Thus, if all 400 dipeptides were equally likely to occur in proteins, each would be present at a frequency of 0.25% (2.5‰). As this is not the case in nature, for better visibility all dipeptides that are more frequent than expected are marked in red, and those that are underrepresented are marked in blue in the table.

All values are given in per mille (‰); multiply them by 10^-3 to obtain fractions.
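
The comparison against the uniform expectation and the per mille conversion can be sketched in a few lines of Python. The names `classify` and `per_mille_to_fraction` are hypothetical, and the rule used here (strictly above or below 1/400) is an assumption: the page does not say whether the colouring takes the error estimate into account. The four values are copied from the table below.

```python
N_STANDARD = 20
# Uniform expectation over the 400 standard dipeptides, in per mille.
EXPECTED_PER_MILLE = 1000.0 / N_STANDARD ** 2


def classify(freq_per_mille, expected=EXPECTED_PER_MILLE):
    """Label a frequency relative to the uniform expectation (assumed rule)."""
    if freq_per_mille > expected:
        return "over"
    if freq_per_mille < expected:
        return "under"
    return "as expected"


def per_mille_to_fraction(value):
    """Convert a per mille value to a plain fraction."""
    return value * 1e-3


# Values taken from the table for this proteome.
observed = {"AlaAla": 15.757, "CysCys": 0.054, "LysAla": 3.953, "TrpTyr": 0.433}
labels = {pair: classify(value) for pair, value in observed.items()}
```

With these inputs, AlaAla and LysAla fall above the 2.5‰ expectation, while CysCys and TrpTyr fall below it.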

For more information, see the sequence space article on Wikipedia.

Dipeptide frequency (‰) ± error, grouped by first residue. Within each group the second residue follows the order: Ala, Cys, Asp, Glu, Phe, Gly, His, Ile, Lys, Leu, Met, Asn, Pro, Gln, Arg, Ser, Thr, Val, Trp, Tyr, Xaa.

Ala
AlaAla: 15.757 ± 2.177
AlaCys: 1.137 ± 0.241
AlaAsp: 6.281 ± 0.595
AlaGlu: 7.797 ± 0.89
AlaPhe: 2.816 ± 0.352
AlaGly: 9.801 ± 1.349
AlaHis: 1.895 ± 0.38
AlaIle: 4.603 ± 0.56
AlaLys: 3.953 ± 0.456
AlaLeu: 7.635 ± 0.765
AlaMet: 2.762 ± 0.469
AlaAsn: 3.628 ± 0.665
AlaPro: 5.252 ± 0.611
AlaGln: 3.79 ± 0.425
AlaArg: 6.931 ± 0.746
AlaSer: 6.335 ± 0.797
AlaThr: 5.956 ± 0.665
AlaVal: 6.877 ± 0.584
AlaTrp: 2.924 ± 0.491
AlaTyr: 2.274 ± 0.416
AlaXaa: 0.0 ± 0.0

Cys
CysAla: 0.921 ± 0.277
CysCys: 0.054 ± 0.054
CysAsp: 0.704 ± 0.2
CysGlu: 0.758 ± 0.217
CysPhe: 0.271 ± 0.116
CysGly: 1.3 ± 0.313
CysHis: 0.217 ± 0.102
CysIle: 0.271 ± 0.125
CysLys: 0.325 ± 0.144
CysLeu: 0.596 ± 0.189
CysMet: 0.108 ± 0.074
CysAsn: 0.541 ± 0.165
CysPro: 1.029 ± 0.262
CysGln: 0.325 ± 0.126
CysArg: 0.812 ± 0.221
CysSer: 0.812 ± 0.226
CysThr: 0.541 ± 0.185
CysVal: 0.65 ± 0.19
CysTrp: 0.325 ± 0.135
CysTyr: 0.162 ± 0.092
CysXaa: 0.0 ± 0.0

Asp
AspAla: 5.956 ± 0.535
AspCys: 0.541 ± 0.149
AspAsp: 4.548 ± 0.497
AspGlu: 3.411 ± 0.427
AspPhe: 1.3 ± 0.221
AspGly: 6.335 ± 0.496
AspHis: 1.191 ± 0.242
AspIle: 2.653 ± 0.458
AspLys: 1.841 ± 0.311
AspLeu: 6.498 ± 0.57
AspMet: 1.3 ± 0.28
AspAsn: 1.787 ± 0.293
AspPro: 4.711 ± 0.544
AspGln: 2.545 ± 0.359
AspArg: 5.523 ± 0.701
AspSer: 3.141 ± 0.364
AspThr: 3.844 ± 0.44
AspVal: 4.657 ± 0.607
AspTrp: 1.516 ± 0.292
AspTyr: 1.787 ± 0.331
AspXaa: 0.0 ± 0.0

Glu
GluAla: 5.361 ± 0.563
GluCys: 0.921 ± 0.226
GluAsp: 2.87 ± 0.401
GluGlu: 2.653 ± 0.411
GluPhe: 2.166 ± 0.318
GluGly: 3.249 ± 0.428
GluHis: 1.191 ± 0.388
GluIle: 3.086 ± 0.356
GluLys: 2.112 ± 0.31
GluLeu: 5.631 ± 0.808
GluMet: 1.787 ± 0.286
GluAsn: 2.112 ± 0.301
GluPro: 2.545 ± 0.41
GluGln: 3.195 ± 0.431
GluArg: 5.036 ± 0.618
GluSer: 3.303 ± 0.494
GluThr: 4.169 ± 0.526
GluVal: 4.115 ± 0.5
GluTrp: 1.462 ± 0.293
GluTyr: 1.57 ± 0.31
GluXaa: 0.0 ± 0.0

Phe
PheAla: 3.195 ± 0.416
PheCys: 0.217 ± 0.107
PheAsp: 2.058 ± 0.325
PheGlu: 1.57 ± 0.268
PhePhe: 0.866 ± 0.277
PheGly: 2.978 ± 0.593
PheHis: 0.433 ± 0.137
PheIle: 1.245 ± 0.332
PheLys: 1.245 ± 0.298
PheLeu: 1.949 ± 0.279
PheMet: 0.596 ± 0.211
PheAsn: 1.245 ± 0.358
PhePro: 1.733 ± 0.3
PheGln: 0.921 ± 0.293
PheArg: 1.733 ± 0.315
PheSer: 1.462 ± 0.265
PheThr: 2.491 ± 0.319
PheVal: 1.841 ± 0.276
PheTrp: 0.65 ± 0.177
PheTyr: 0.758 ± 0.266
PheXaa: 0.0 ± 0.0

Gly
GlyAla: 8.934 ± 1.238
GlyCys: 0.921 ± 0.267
GlyAsp: 6.498 ± 0.637
GlyGlu: 3.628 ± 0.539
GlyPhe: 2.816 ± 0.479
GlyGly: 11.263 ± 2.535
GlyHis: 1.841 ± 0.226
GlyIle: 4.169 ± 0.567
GlyLys: 3.141 ± 0.428
GlyLeu: 5.848 ± 0.532
GlyMet: 2.653 ± 0.456
GlyAsn: 2.978 ± 0.376
GlyPro: 4.007 ± 0.563
GlyGln: 2.382 ± 0.465
GlyArg: 4.873 ± 0.618
GlySer: 5.794 ± 0.824
GlyThr: 6.065 ± 0.711
GlyVal: 6.335 ± 0.651
GlyTrp: 2.328 ± 0.356
GlyTyr: 2.382 ± 0.387
GlyXaa: 0.0 ± 0.0

His
HisAla: 1.949 ± 0.361
HisCys: 0.487 ± 0.186
HisAsp: 1.462 ± 0.264
HisGlu: 1.137 ± 0.261
HisPhe: 0.379 ± 0.128
HisGly: 1.787 ± 0.319
HisHis: 0.758 ± 0.23
HisIle: 1.516 ± 0.327
HisLys: 0.596 ± 0.201
HisLeu: 1.57 ± 0.274
HisMet: 0.433 ± 0.139
HisAsn: 1.137 ± 0.233
HisPro: 1.191 ± 0.213
HisGln: 0.65 ± 0.182
HisArg: 1.462 ± 0.277
HisSer: 0.758 ± 0.204
HisThr: 1.408 ± 0.38
HisVal: 1.408 ± 0.342
HisTrp: 0.487 ± 0.183
HisTyr: 0.704 ± 0.182
HisXaa: 0.0 ± 0.0

Ile
IleAla: 5.306 ± 0.559
IleCys: 0.758 ± 0.263
IleAsp: 3.574 ± 0.468
IleGlu: 3.79 ± 0.491
IlePhe: 0.812 ± 0.236
IleGly: 4.603 ± 0.487
IleHis: 1.408 ± 0.315
IleIle: 1.3 ± 0.305
IleLys: 1.191 ± 0.286
IleLeu: 2.491 ± 0.369
IleMet: 0.271 ± 0.114
IleAsn: 1.679 ± 0.32
IlePro: 2.653 ± 0.375
IleGln: 1.516 ± 0.28
IleArg: 2.87 ± 0.389
IleSer: 1.949 ± 0.412
IleThr: 3.953 ± 0.486
IleVal: 3.465 ± 0.4
IleTrp: 0.921 ± 0.205
IleTyr: 0.975 ± 0.22
IleXaa: 0.0 ± 0.0

Lys
LysAla: 3.953 ± 0.511
LysCys: 0.325 ± 0.14
LysAsp: 1.57 ± 0.274
LysGlu: 1.462 ± 0.286
LysPhe: 1.3 ± 0.257
LysGly: 2.382 ± 0.379
LysHis: 1.029 ± 0.237
LysIle: 0.812 ± 0.179
LysLys: 1.787 ± 0.398
LysLeu: 3.141 ± 0.434
LysMet: 0.812 ± 0.19
LysAsn: 0.921 ± 0.248
LysPro: 2.978 ± 0.46
LysGln: 1.245 ± 0.171
LysArg: 2.003 ± 0.367
LysSer: 2.166 ± 0.323
LysThr: 2.003 ± 0.288
LysVal: 2.491 ± 0.387
LysTrp: 1.083 ± 0.322
LysTyr: 0.866 ± 0.262
LysXaa: 0.0 ± 0.0

Leu
LeuAla: 8.068 ± 0.89
LeuCys: 0.433 ± 0.152
LeuAsp: 5.523 ± 0.606
LeuGlu: 4.169 ± 0.526
LeuPhe: 2.437 ± 0.305
LeuGly: 5.523 ± 0.53
LeuHis: 1.083 ± 0.286
LeuIle: 3.357 ± 0.373
LeuLys: 2.166 ± 0.388
LeuLeu: 4.927 ± 0.631
LeuMet: 1.462 ± 0.352
LeuAsn: 2.924 ± 0.419
LeuPro: 5.415 ± 0.617
LeuGln: 2.653 ± 0.476
LeuArg: 5.198 ± 0.603
LeuSer: 4.873 ± 0.528
LeuThr: 5.469 ± 0.54
LeuVal: 5.361 ± 0.561
LeuTrp: 1.624 ± 0.347
LeuTyr: 2.274 ± 0.414
LeuXaa: 0.0 ± 0.0

Met
MetAla: 1.895 ± 0.343
MetCys: 0.271 ± 0.155
MetAsp: 1.137 ± 0.258
MetGlu: 0.921 ± 0.196
MetPhe: 0.704 ± 0.197
MetGly: 2.274 ± 0.312
MetHis: 0.162 ± 0.095
MetIle: 0.975 ± 0.264
MetLys: 0.975 ± 0.27
MetLeu: 1.733 ± 0.253
MetMet: 0.541 ± 0.197
MetAsn: 0.921 ± 0.24
MetPro: 1.137 ± 0.279
MetGln: 0.433 ± 0.145
MetArg: 1.624 ± 0.308
MetSer: 2.924 ± 0.431
MetThr: 2.166 ± 0.305
MetVal: 1.191 ± 0.303
MetTrp: 0.325 ± 0.151
MetTyr: 0.325 ± 0.139
MetXaa: 0.0 ± 0.0

Asn
AsnAla: 3.79 ± 0.422
AsnCys: 0.217 ± 0.098
AsnAsp: 2.003 ± 0.296
AsnGlu: 2.328 ± 0.424
AsnPhe: 0.975 ± 0.294
AsnGly: 4.332 ± 0.536
AsnHis: 0.866 ± 0.199
AsnIle: 1.408 ± 0.41
AsnLys: 0.921 ± 0.266
AsnLeu: 2.707 ± 0.389
AsnMet: 0.704 ± 0.171
AsnAsn: 1.841 ± 0.37
AsnPro: 2.545 ± 0.36
AsnGln: 1.354 ± 0.374
AsnArg: 2.003 ± 0.378
AsnSer: 1.245 ± 0.247
AsnThr: 2.437 ± 0.36
AsnVal: 1.624 ± 0.299
AsnTrp: 0.596 ± 0.167
AsnTyr: 0.866 ± 0.2
AsnXaa: 0.0 ± 0.0

Pro
ProAla: 5.252 ± 0.567
ProCys: 0.541 ± 0.172
ProAsp: 4.711 ± 0.45
ProGlu: 4.548 ± 0.495
ProPhe: 1.949 ± 0.345
ProGly: 6.335 ± 0.696
ProHis: 1.516 ± 0.277
ProIle: 2.058 ± 0.279
ProLys: 1.949 ± 0.311
ProLeu: 4.494 ± 0.524
ProMet: 1.624 ± 0.336
ProAsn: 2.003 ± 0.318
ProPro: 4.169 ± 0.552
ProGln: 2.22 ± 0.341
ProArg: 3.357 ± 0.51
ProSer: 3.465 ± 0.476
ProThr: 2.87 ± 0.363
ProVal: 4.007 ± 0.491
ProTrp: 1.029 ± 0.255
ProTyr: 1.624 ± 0.294
ProXaa: 0.0 ± 0.0

Gln
GlnAla: 4.494 ± 0.62
GlnCys: 0.379 ± 0.16
GlnAsp: 1.3 ± 0.287
GlnGlu: 1.679 ± 0.258
GlnPhe: 1.029 ± 0.242
GlnGly: 2.491 ± 0.411
GlnHis: 0.921 ± 0.237
GlnIle: 1.895 ± 0.317
GlnLys: 1.245 ± 0.222
GlnLeu: 3.357 ± 0.436
GlnMet: 0.758 ± 0.212
GlnAsn: 0.921 ± 0.198
GlnPro: 2.437 ± 0.452
GlnGln: 2.058 ± 0.486
GlnArg: 2.382 ± 0.397
GlnSer: 2.382 ± 0.322
GlnThr: 1.679 ± 0.361
GlnVal: 2.599 ± 0.394
GlnTrp: 0.704 ± 0.187
GlnTyr: 0.975 ± 0.269
GlnXaa: 0.0 ± 0.0

Arg
ArgAla: 7.147 ± 0.781
ArgCys: 1.083 ± 0.279
ArgAsp: 4.224 ± 0.488
ArgGlu: 4.982 ± 0.716
ArgPhe: 1.679 ± 0.327
ArgGly: 4.278 ± 0.391
ArgHis: 1.191 ± 0.289
ArgIle: 4.007 ± 0.517
ArgLys: 2.22 ± 0.316
ArgLeu: 4.819 ± 0.504
ArgMet: 2.437 ± 0.355
ArgAsn: 2.058 ± 0.437
ArgPro: 3.736 ± 0.468
ArgGln: 2.166 ± 0.349
ArgArg: 5.252 ± 0.637
ArgSer: 3.79 ± 0.445
ArgThr: 3.032 ± 0.503
ArgVal: 5.09 ± 0.501
ArgTrp: 1.462 ± 0.329
ArgTyr: 2.112 ± 0.335
ArgXaa: 0.0 ± 0.0

Ser
SerAla: 7.418 ± 1.112
SerCys: 0.379 ± 0.152
SerAsp: 4.061 ± 0.447
SerGlu: 3.357 ± 0.453
SerPhe: 2.274 ± 0.395
SerGly: 5.686 ± 0.792
SerHis: 1.354 ± 0.245
SerIle: 2.707 ± 0.406
SerLys: 2.22 ± 0.389
SerLeu: 3.953 ± 0.444
SerMet: 1.354 ± 0.248
SerAsn: 2.003 ± 0.335
SerPro: 3.465 ± 0.424
SerGln: 1.841 ± 0.275
SerArg: 3.628 ± 0.38
SerSer: 4.332 ± 0.78
SerThr: 3.032 ± 0.548
SerVal: 4.278 ± 0.466
SerTrp: 1.354 ± 0.301
SerTyr: 1.516 ± 0.247
SerXaa: 0.0 ± 0.0

Thr
ThrAla: 6.281 ± 0.749
ThrCys: 0.433 ± 0.182
ThrAsp: 3.736 ± 0.514
ThrGlu: 3.465 ± 0.367
ThrPhe: 1.733 ± 0.361
ThrGly: 5.523 ± 0.548
ThrHis: 1.733 ± 0.308
ThrIle: 3.682 ± 0.485
ThrLys: 2.545 ± 0.362
ThrLeu: 4.169 ± 0.583
ThrMet: 0.704 ± 0.173
ThrAsn: 2.058 ± 0.362
ThrPro: 4.115 ± 0.424
ThrGln: 1.841 ± 0.297
ThrArg: 3.953 ± 0.522
ThrSer: 3.79 ± 0.439
ThrThr: 4.603 ± 0.589
ThrVal: 5.902 ± 0.676
ThrTrp: 1.191 ± 0.254
ThrTyr: 2.058 ± 0.283
ThrXaa: 0.0 ± 0.0

Val
ValAla: 7.635 ± 0.622
ValCys: 0.975 ± 0.192
ValAsp: 5.523 ± 0.529
ValGlu: 3.953 ± 0.542
ValPhe: 1.787 ± 0.391
ValGly: 5.794 ± 0.567
ValHis: 1.624 ± 0.376
ValIle: 2.978 ± 0.345
ValLys: 2.437 ± 0.385
ValLeu: 5.902 ± 0.615
ValMet: 1.083 ± 0.199
ValAsn: 2.87 ± 0.363
ValPro: 4.115 ± 0.395
ValGln: 2.653 ± 0.298
ValArg: 4.494 ± 0.532
ValSer: 4.657 ± 0.563
ValThr: 4.873 ± 0.528
ValVal: 5.577 ± 0.546
ValTrp: 1.516 ± 0.322
ValTyr: 1.137 ± 0.318
ValXaa: 0.0 ± 0.0

Trp
TrpAla: 2.653 ± 0.402
TrpCys: 0.433 ± 0.188
TrpAsp: 1.354 ± 0.312
TrpGlu: 1.029 ± 0.251
TrpPhe: 0.921 ± 0.175
TrpGly: 1.029 ± 0.277
TrpHis: 0.487 ± 0.157
TrpIle: 1.462 ± 0.303
TrpLys: 0.758 ± 0.169
TrpLeu: 1.679 ± 0.379
TrpMet: 0.866 ± 0.241
TrpAsn: 0.379 ± 0.18
TrpPro: 1.029 ± 0.309
TrpGln: 1.029 ± 0.242
TrpArg: 1.949 ± 0.305
TrpSer: 1.624 ± 0.388
TrpThr: 1.245 ± 0.242
TrpVal: 1.57 ± 0.416
TrpTrp: 0.975 ± 0.237
TrpTyr: 0.433 ± 0.158
TrpXaa: 0.0 ± 0.0

Tyr
TyrAla: 2.437 ± 0.38
TyrCys: 0.217 ± 0.11
TyrAsp: 1.949 ± 0.344
TyrGlu: 1.895 ± 0.382
TyrPhe: 0.975 ± 0.198
TyrGly: 1.895 ± 0.362
TyrHis: 0.379 ± 0.155
TyrIle: 1.3 ± 0.266
TyrLys: 0.704 ± 0.206
TyrLeu: 1.895 ± 0.344
TyrMet: 0.325 ± 0.13
TyrAsn: 0.758 ± 0.209
TyrPro: 1.462 ± 0.256
TyrGln: 0.866 ± 0.206
TyrArg: 1.679 ± 0.304
TyrSer: 1.408 ± 0.311
TyrThr: 1.679 ± 0.301
TyrVal: 2.545 ± 0.316
TyrTrp: 0.433 ± 0.147
TyrTyr: 0.596 ± 0.159
TyrXaa: 0.0 ± 0.0

Xaa
XaaAla: 0.0 ± 0.0
XaaCys: 0.0 ± 0.0
XaaAsp: 0.0 ± 0.0
XaaGlu: 0.0 ± 0.0
XaaPhe: 0.0 ± 0.0
XaaGly: 0.0 ± 0.0
XaaHis: 0.0 ± 0.0
XaaIle: 0.0 ± 0.0
XaaLys: 0.0 ± 0.0
XaaLeu: 0.0 ± 0.0
XaaMet: 0.0 ± 0.0
XaaAsn: 0.0 ± 0.0
XaaPro: 0.0 ± 0.0
XaaGln: 0.0 ± 0.0
XaaArg: 0.0 ± 0.0
XaaSer: 0.0 ± 0.0
XaaThr: 0.0 ± 0.0
XaaVal: 0.0 ± 0.0
XaaTrp: 0.0 ± 0.0
XaaTyr: 0.0 ± 0.0
XaaXaa: 0.0 ± 0.0

Statistics based on 104 proteins (18469 amino acids)

Note: The error has been estimated by bootstrapping (100 replicates) at the protein level.
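
One way to picture a protein-level bootstrap is the following Python sketch. It rests on several assumptions: whole proteins are resampled with replacement, the frequency is recomputed for each of 100 resamples, and the sample standard deviation of those estimates is taken as the error. The function names, the toy sequences and the use of the standard deviation are illustrative choices, not details confirmed by the page.

```python
import random
from collections import Counter
from statistics import stdev


def dipeptide_per_mille(proteins):
    """Overlapping dipeptide frequencies in per mille (pairs stay within a protein)."""
    counts = Counter()
    for seq in proteins:
        counts.update(seq[i:i + 2] for i in range(len(seq) - 1))
    total = sum(counts.values())
    if total == 0:
        return {}
    return {pair: 1000.0 * n / total for pair, n in counts.items()}


def bootstrap_error(proteins, pair, n_rounds=100, seed=0):
    """Spread of the frequency of `pair` over protein-level resamples."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    estimates = []
    for _ in range(n_rounds):
        # Draw as many proteins as the proteome has, with replacement.
        sample = rng.choices(proteins, k=len(proteins))
        estimates.append(dipeptide_per_mille(sample).get(pair, 0.0))
    return stdev(estimates)


# Made-up toy sequences (one-letter codes), not taken from the phage proteome.
toy_proteome = ["MAAG", "AAGK", "MKKL", "GAVL"]
err_aa = bootstrap_error(toy_proteome, "AA")
err_ww = bootstrap_error(toy_proteome, "WW")  # pair absent from the toy data
```

Resampling whole proteins rather than individual residues keeps adjacent pairs intact, which is presumably why the error is estimated at the protein level. A pair that never occurs, such as the Xaa-containing dipeptides in the table, yields an error of zero.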

The dipeptide statistics above (among other statistics for this proteome) can be downloaded from this CSV file
See this proteome in: uniprot_link
Proteome-pI is available under a Creative Commons Attribution-NoDerivs license; for more details see here

Reference: Kozlowski LP. Proteome-pI 2.0: Proteome Isoelectric Point Database Update. Nucleic Acids Res. 2021, doi: 10.1093/nar/gkab944
Contact: Lukasz P. Kozlowski