Amino acid dipeptide frequency for Streptomyces phage Verse

Apart from single amino acid frequencies, one can also calculate the so-called amino acid dipeptide frequency, i.e. how often each ordered pair of adjacent amino acids occurs.
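As a rough illustration of what such a calculation involves, the sketch below counts overlapping adjacent residue pairs in a list of protein sequences and reports them in per mille. It is only a minimal sketch: the function name `dipeptide_per_mille`, the toy sequences, the use of one-letter codes, and the choice to count pairs within each protein separately are all assumptions, not the procedure used by Proteome-pI.

```python
from collections import Counter

def dipeptide_per_mille(proteins):
    """Return {dipeptide: frequency in per mille} for a list of protein strings.

    Hypothetical helper: pairs are counted within each protein only,
    which is an assumption about how dipeptides are defined.
    """
    counts = Counter()
    for seq in proteins:
        # Overlapping adjacent pairs, e.g. "MAAK" -> MA, AA, AK
        counts.update(seq[i:i + 2] for i in range(len(seq) - 1))
    total = sum(counts.values())
    if total == 0:
        return {}
    return {pair: 1000.0 * n / total for pair, n in counts.items()}

# Made-up toy sequences, not taken from the phage Verse proteome
toy_proteins = ["MAAK", "MAAG"]
freqs = dipeptide_per_mille(toy_proteins)
print(freqs["AA"])  # share of the Ala-Ala pair among all counted pairs
```

The table on this page uses three-letter codes (AlaAla rather than AA); mapping between the two notations is left out of the sketch for brevity.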

There are 400 possible dipeptides (441 if Xaa is counted as an additional 21st amino acid representing all non-standard or unknown amino acids). Thus, if dipeptides occurred randomly in proteins, each dipeptide would be present at a frequency of 0.25% (2.5‰). As this is not the case in nature, for better visibility all dipeptides that occur more often than expected are marked in red in the table, and those that are underrepresented are marked in blue.

All values are given in per mille (‰) and therefore need to be multiplied by 10⁻³ to obtain fractions.
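The arithmetic behind the expected value and the unit conversion can be sketched as follows. The three observed values are copied from the table on this page; the `classify` helper and the use of the uniform 1/400 expectation as the exact colouring threshold are assumptions made for illustration.

```python
N_STANDARD = 20  # standard amino acids, Xaa not counted

# Uniform expectation: 1 / 400 = 0.25% = 2.5 per mille
expected_per_mille = 1000.0 / (N_STANDARD ** 2)

def classify(observed_per_mille, expected=expected_per_mille):
    """Hypothetical helper: compare an observed value with the uniform expectation."""
    if observed_per_mille > expected:
        return "over"
    if observed_per_mille < expected:
        return "under"
    return "as expected"

# Values (in per mille) taken from the table on this page
observed = {"AlaAla": 12.602, "CysCys": 0.067, "AspGln": 2.534}

for name, value in observed.items():
    fraction = value * 1e-3  # per mille -> plain fraction
    print(name, fraction, classify(value))
```

With 441 possible dipeptides (Xaa included), the same calculation would give an expectation of 1000 / 441, roughly 2.27‰.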

For more information, see the sequence space article on Wikipedia.

Dipeptide frequencies (‰) ± estimated error, grouped by first residue. Within each group the second residue runs in the order Ala, Cys, Asp, Glu, Phe, Gly, His, Ile, Lys, Leu, Met, Asn, Pro, Gln, Arg, Ser, Thr, Val, Trp, Tyr, Xaa.

Ala
AlaAla: 12.602 ± 1.092
AlaCys: 0.8 ± 0.257
AlaAsp: 7.001 ± 0.63
AlaGlu: 7.401 ± 0.74
AlaPhe: 3.2 ± 0.489
AlaGly: 8.468 ± 0.814
AlaHis: 1.667 ± 0.362
AlaIle: 4.734 ± 0.689
AlaLys: 4.667 ± 0.73
AlaLeu: 10.135 ± 1.187
AlaMet: 2.934 ± 0.396
AlaAsn: 2.934 ± 0.357
AlaPro: 4.401 ± 0.556
AlaGln: 4.334 ± 0.476
AlaArg: 6.801 ± 0.71
AlaSer: 6.534 ± 0.524
AlaThr: 6.801 ± 0.732
AlaVal: 8.868 ± 0.829
AlaTrp: 2.467 ± 0.334
AlaTyr: 3.067 ± 0.52
AlaXaa: 0.0 ± 0.0

Cys
CysAla: 0.467 ± 0.223
CysCys: 0.067 ± 0.063
CysAsp: 0.4 ± 0.179
CysGlu: 0.4 ± 0.162
CysPhe: 0.133 ± 0.097
CysGly: 0.6 ± 0.166
CysHis: 0.333 ± 0.127
CysIle: 0.133 ± 0.089
CysLys: 0.133 ± 0.095
CysLeu: 0.467 ± 0.174
CysMet: 0.067 ± 0.05
CysAsn: 0.2 ± 0.087
CysPro: 0.6 ± 0.207
CysGln: 0.267 ± 0.137
CysArg: 0.6 ± 0.219
CysSer: 0.6 ± 0.258
CysThr: 0.667 ± 0.244
CysVal: 0.4 ± 0.173
CysTrp: 0.133 ± 0.093
CysTyr: 0.333 ± 0.18
CysXaa: 0.0 ± 0.0

Asp
AspAla: 6.668 ± 0.728
AspCys: 0.533 ± 0.173
AspAsp: 3.801 ± 0.604
AspGlu: 4.534 ± 0.665
AspPhe: 2.267 ± 0.356
AspGly: 6.067 ± 0.704
AspHis: 1.334 ± 0.376
AspIle: 3.067 ± 0.486
AspLys: 3.0 ± 0.481
AspLeu: 5.201 ± 0.56
AspMet: 1.734 ± 0.305
AspAsn: 1.667 ± 0.342
AspPro: 3.734 ± 0.516
AspGln: 2.534 ± 0.466
AspArg: 3.867 ± 0.576
AspSer: 2.334 ± 0.509
AspThr: 3.934 ± 0.447
AspVal: 4.934 ± 0.474
AspTrp: 1.467 ± 0.307
AspTyr: 1.667 ± 0.367
AspXaa: 0.0 ± 0.0

Glu
GluAla: 8.001 ± 0.832
GluCys: 0.6 ± 0.152
GluAsp: 5.001 ± 0.625
GluGlu: 4.201 ± 0.726
GluPhe: 1.867 ± 0.368
GluGly: 5.201 ± 0.645
GluHis: 1.2 ± 0.307
GluIle: 2.267 ± 0.365
GluLys: 2.467 ± 0.379
GluLeu: 6.401 ± 0.711
GluMet: 1.6 ± 0.348
GluAsn: 2.0 ± 0.353
GluPro: 2.534 ± 0.449
GluGln: 2.467 ± 0.422
GluArg: 4.267 ± 0.687
GluSer: 3.267 ± 0.559
GluThr: 3.2 ± 0.461
GluVal: 5.134 ± 0.578
GluTrp: 1.534 ± 0.258
GluTyr: 1.867 ± 0.301
GluXaa: 0.0 ± 0.0

Phe
PheAla: 3.267 ± 0.443
PheCys: 0.2 ± 0.126
PheAsp: 2.467 ± 0.437
PheGlu: 1.934 ± 0.346
PhePhe: 0.933 ± 0.215
PheGly: 3.067 ± 0.414
PheHis: 0.6 ± 0.174
PheIle: 1.4 ± 0.245
PheLys: 1.4 ± 0.32
PheLeu: 2.134 ± 0.316
PheMet: 0.867 ± 0.306
PheAsn: 1.0 ± 0.256
PhePro: 1.067 ± 0.281
PheGln: 0.6 ± 0.192
PheArg: 2.0 ± 0.373
PheSer: 1.534 ± 0.386
PheThr: 2.067 ± 0.413
PheVal: 1.734 ± 0.386
PheTrp: 0.733 ± 0.211
PheTyr: 1.334 ± 0.308
PheXaa: 0.0 ± 0.0

Gly
GlyAla: 7.934 ± 0.795
GlyCys: 0.067 ± 0.077
GlyAsp: 6.668 ± 0.835
GlyGlu: 5.067 ± 0.631
GlyPhe: 3.067 ± 0.495
GlyGly: 7.201 ± 0.843
GlyHis: 2.134 ± 0.44
GlyIle: 5.534 ± 0.689
GlyLys: 4.601 ± 0.669
GlyLeu: 7.068 ± 0.847
GlyMet: 1.934 ± 0.325
GlyAsn: 2.534 ± 0.386
GlyPro: 3.667 ± 0.803
GlyGln: 2.467 ± 0.382
GlyArg: 4.534 ± 0.514
GlySer: 5.601 ± 1.026
GlyThr: 6.134 ± 0.689
GlyVal: 6.534 ± 0.578
GlyTrp: 2.067 ± 0.42
GlyTyr: 2.334 ± 0.349
GlyXaa: 0.0 ± 0.0

His
HisAla: 1.734 ± 0.345
HisCys: 0.2 ± 0.098
HisAsp: 0.867 ± 0.271
HisGlu: 1.2 ± 0.277
HisPhe: 1.133 ± 0.208
HisGly: 1.534 ± 0.315
HisHis: 0.6 ± 0.197
HisIle: 0.6 ± 0.251
HisLys: 0.333 ± 0.138
HisLeu: 1.8 ± 0.35
HisMet: 0.2 ± 0.097
HisAsn: 0.467 ± 0.149
HisPro: 1.133 ± 0.242
HisGln: 0.8 ± 0.216
HisArg: 0.933 ± 0.256
HisSer: 0.467 ± 0.186
HisThr: 1.4 ± 0.333
HisVal: 1.0 ± 0.293
HisTrp: 0.733 ± 0.209
HisTyr: 1.133 ± 0.215
HisXaa: 0.0 ± 0.0

Ile
IleAla: 5.267 ± 0.614
IleCys: 0.2 ± 0.127
IleAsp: 2.334 ± 0.409
IleGlu: 4.334 ± 0.487
IlePhe: 0.8 ± 0.213
IleGly: 4.134 ± 0.652
IleHis: 0.533 ± 0.196
IleIle: 1.867 ± 0.672
IleLys: 1.534 ± 0.357
IleLeu: 3.134 ± 0.526
IleMet: 0.6 ± 0.188
IleAsn: 1.467 ± 0.309
IlePro: 2.4 ± 0.315
IleGln: 1.6 ± 0.427
IleArg: 3.2 ± 0.554
IleSer: 2.467 ± 0.41
IleThr: 2.934 ± 0.501
IleVal: 3.2 ± 0.421
IleTrp: 0.467 ± 0.201
IleTyr: 1.8 ± 0.378
IleXaa: 0.0 ± 0.0

Lys
LysAla: 6.067 ± 0.866
LysCys: 0.067 ± 0.08
LysAsp: 2.6 ± 0.466
LysGlu: 1.934 ± 0.36
LysPhe: 0.933 ± 0.233
LysGly: 4.867 ± 0.628
LysHis: 0.733 ± 0.249
LysIle: 2.267 ± 0.324
LysLys: 2.867 ± 0.538
LysLeu: 4.267 ± 0.571
LysMet: 0.8 ± 0.204
LysAsn: 1.133 ± 0.256
LysPro: 3.4 ± 0.544
LysGln: 1.6 ± 0.27
LysArg: 2.334 ± 0.404
LysSer: 2.267 ± 0.449
LysThr: 3.267 ± 0.492
LysVal: 3.0 ± 0.512
LysTrp: 0.4 ± 0.154
LysTyr: 1.133 ± 0.256
LysXaa: 0.0 ± 0.0

Leu
LeuAla: 10.401 ± 1.016
LeuCys: 0.4 ± 0.188
LeuAsp: 5.934 ± 0.61
LeuGlu: 4.201 ± 0.574
LeuPhe: 1.734 ± 0.338
LeuGly: 6.668 ± 0.736
LeuHis: 1.4 ± 0.343
LeuIle: 3.801 ± 0.467
LeuLys: 4.801 ± 0.651
LeuLeu: 6.001 ± 0.762
LeuMet: 1.734 ± 0.448
LeuAsn: 3.4 ± 0.428
LeuPro: 4.734 ± 0.594
LeuGln: 2.067 ± 0.325
LeuArg: 4.801 ± 0.63
LeuSer: 5.934 ± 0.544
LeuThr: 5.267 ± 0.6
LeuVal: 5.667 ± 0.629
LeuTrp: 1.334 ± 0.277
LeuTyr: 2.067 ± 0.374
LeuXaa: 0.0 ± 0.0

Met
MetAla: 3.267 ± 0.495
MetCys: 0.133 ± 0.085
MetAsp: 1.0 ± 0.251
MetGlu: 1.0 ± 0.226
MetPhe: 0.667 ± 0.269
MetGly: 1.133 ± 0.278
MetHis: 0.067 ± 0.056
MetIle: 1.334 ± 0.271
MetLys: 1.0 ± 0.301
MetLeu: 1.734 ± 0.286
MetMet: 0.333 ± 0.131
MetAsn: 0.8 ± 0.263
MetPro: 1.534 ± 0.375
MetGln: 0.467 ± 0.178
MetArg: 1.6 ± 0.384
MetSer: 2.2 ± 0.341
MetThr: 1.6 ± 0.259
MetVal: 1.4 ± 0.255
MetTrp: 0.2 ± 0.11
MetTyr: 0.267 ± 0.132
MetXaa: 0.0 ± 0.0

Asn
AsnAla: 3.134 ± 0.481
AsnCys: 0.4 ± 0.193
AsnAsp: 2.0 ± 0.37
AsnGlu: 1.6 ± 0.343
AsnPhe: 1.2 ± 0.315
AsnGly: 3.467 ± 0.527
AsnHis: 0.667 ± 0.205
AsnIle: 0.933 ± 0.247
AsnLys: 1.2 ± 0.341
AsnLeu: 2.8 ± 0.499
AsnMet: 0.267 ± 0.136
AsnAsn: 0.733 ± 0.248
AsnPro: 1.867 ± 0.351
AsnGln: 1.133 ± 0.265
AsnArg: 1.867 ± 0.274
AsnSer: 1.6 ± 0.249
AsnThr: 2.534 ± 0.368
AsnVal: 1.8 ± 0.374
AsnTrp: 0.533 ± 0.168
AsnTyr: 0.867 ± 0.156
AsnXaa: 0.0 ± 0.0

Pro
ProAla: 5.267 ± 0.717
ProCys: 0.533 ± 0.165
ProAsp: 3.067 ± 0.416
ProGlu: 4.134 ± 0.508
ProPhe: 1.467 ± 0.313
ProGly: 4.801 ± 0.463
ProHis: 0.667 ± 0.166
ProIle: 2.2 ± 0.523
ProLys: 2.2 ± 0.512
ProLeu: 3.334 ± 0.495
ProMet: 1.067 ± 0.274
ProAsn: 1.8 ± 0.405
ProPro: 1.867 ± 0.396
ProGln: 1.2 ± 0.37
ProArg: 2.134 ± 0.388
ProSer: 3.467 ± 0.575
ProThr: 4.401 ± 0.717
ProVal: 3.801 ± 0.515
ProTrp: 0.867 ± 0.23
ProTyr: 1.334 ± 0.368
ProXaa: 0.0 ± 0.0

Gln
GlnAla: 4.601 ± 0.686
GlnCys: 0.2 ± 0.107
GlnAsp: 1.4 ± 0.273
GlnGlu: 2.267 ± 0.387
GlnPhe: 1.067 ± 0.259
GlnGly: 2.734 ± 0.69
GlnHis: 0.467 ± 0.16
GlnIle: 1.934 ± 0.344
GlnLys: 1.4 ± 0.258
GlnLeu: 2.4 ± 0.462
GlnMet: 0.8 ± 0.236
GlnAsn: 1.267 ± 0.35
GlnPro: 1.2 ± 0.255
GlnGln: 0.8 ± 0.204
GlnArg: 2.734 ± 0.354
GlnSer: 1.667 ± 0.279
GlnThr: 1.934 ± 0.391
GlnVal: 2.2 ± 0.352
GlnTrp: 0.667 ± 0.19
GlnTyr: 1.0 ± 0.292
GlnXaa: 0.0 ± 0.0

Arg
ArgAla: 5.734 ± 0.758
ArgCys: 0.733 ± 0.181
ArgAsp: 4.067 ± 0.699
ArgGlu: 4.201 ± 0.539
ArgPhe: 2.6 ± 0.424
ArgGly: 4.267 ± 0.537
ArgHis: 1.133 ± 0.26
ArgIle: 2.2 ± 0.316
ArgLys: 3.2 ± 0.538
ArgLeu: 5.667 ± 0.679
ArgMet: 1.867 ± 0.349
ArgAsn: 1.867 ± 0.305
ArgPro: 2.4 ± 0.501
ArgGln: 2.067 ± 0.395
ArgArg: 6.067 ± 0.818
ArgSer: 3.6 ± 0.46
ArgThr: 3.2 ± 0.396
ArgVal: 4.201 ± 0.562
ArgTrp: 1.133 ± 0.296
ArgTyr: 2.2 ± 0.461
ArgXaa: 0.0 ± 0.0

Ser
SerAla: 5.934 ± 0.595
SerCys: 0.4 ± 0.159
SerAsp: 3.0 ± 0.534
SerGlu: 4.267 ± 0.541
SerPhe: 1.934 ± 0.382
SerGly: 5.734 ± 0.854
SerHis: 0.733 ± 0.214
SerIle: 2.4 ± 0.576
SerLys: 2.2 ± 0.35
SerLeu: 5.267 ± 0.574
SerMet: 1.334 ± 0.29
SerAsn: 1.734 ± 0.289
SerPro: 2.8 ± 0.405
SerGln: 2.067 ± 0.318
SerArg: 3.6 ± 0.556
SerSer: 4.001 ± 0.851
SerThr: 4.801 ± 0.738
SerVal: 4.334 ± 0.515
SerTrp: 1.4 ± 0.288
SerTyr: 1.867 ± 0.277
SerXaa: 0.0 ± 0.0

Thr
ThrAla: 6.334 ± 0.711
ThrCys: 0.6 ± 0.185
ThrAsp: 4.267 ± 0.425
ThrGlu: 3.867 ± 0.499
ThrPhe: 2.534 ± 0.428
ThrGly: 5.534 ± 0.917
ThrHis: 1.6 ± 0.452
ThrIle: 2.734 ± 0.39
ThrLys: 3.067 ± 0.392
ThrLeu: 5.601 ± 0.738
ThrMet: 1.067 ± 0.301
ThrAsn: 1.8 ± 0.401
ThrPro: 4.667 ± 0.987
ThrGln: 1.8 ± 0.321
ThrArg: 3.6 ± 0.51
ThrSer: 4.867 ± 0.827
ThrThr: 5.134 ± 0.747
ThrVal: 5.067 ± 0.565
ThrTrp: 1.2 ± 0.269
ThrTyr: 2.534 ± 0.388
ThrXaa: 0.0 ± 0.0

Val
ValAla: 7.534 ± 0.718
ValCys: 0.4 ± 0.164
ValAsp: 4.867 ± 0.508
ValGlu: 4.934 ± 0.715
ValPhe: 1.734 ± 0.35
ValGly: 6.801 ± 0.668
ValHis: 1.6 ± 0.298
ValIle: 3.534 ± 0.53
ValLys: 3.934 ± 0.525
ValLeu: 5.201 ± 0.75
ValMet: 1.4 ± 0.244
ValAsn: 1.934 ± 0.35
ValPro: 3.867 ± 0.503
ValGln: 2.6 ± 0.385
ValArg: 3.867 ± 0.554
ValSer: 3.934 ± 0.542
ValThr: 5.934 ± 0.696
ValVal: 5.934 ± 0.828
ValTrp: 0.8 ± 0.22
ValTyr: 1.467 ± 0.339
ValXaa: 0.0 ± 0.0

Trp
TrpAla: 2.134 ± 0.376
TrpCys: 0.267 ± 0.121
TrpAsp: 1.6 ± 0.387
TrpGlu: 1.2 ± 0.272
TrpPhe: 0.467 ± 0.173
TrpGly: 1.267 ± 0.261
TrpHis: 0.4 ± 0.164
TrpIle: 0.333 ± 0.142
TrpLys: 1.067 ± 0.222
TrpLeu: 1.667 ± 0.38
TrpMet: 0.533 ± 0.151
TrpAsn: 0.867 ± 0.221
TrpPro: 0.4 ± 0.199
TrpGln: 0.8 ± 0.207
TrpArg: 1.334 ± 0.334
TrpSer: 1.267 ± 0.355
TrpThr: 1.334 ± 0.328
TrpVal: 1.4 ± 0.331
TrpTrp: 0.2 ± 0.107
TrpTyr: 0.4 ± 0.147
TrpXaa: 0.0 ± 0.0

Tyr
TyrAla: 3.534 ± 0.39
TyrCys: 0.2 ± 0.119
TyrAsp: 2.0 ± 0.379
TyrGlu: 2.4 ± 0.467
TyrPhe: 0.6 ± 0.199
TyrGly: 3.6 ± 0.641
TyrHis: 0.4 ± 0.164
TyrIle: 1.0 ± 0.239
TyrLys: 0.867 ± 0.299
TyrLeu: 2.067 ± 0.369
TyrMet: 0.533 ± 0.177
TyrAsn: 1.0 ± 0.254
TyrPro: 1.334 ± 0.326
TyrGln: 1.067 ± 0.248
TyrArg: 2.267 ± 0.375
TyrSer: 2.2 ± 0.508
TyrThr: 1.334 ± 0.28
TyrVal: 1.667 ± 0.342
TyrTrp: 0.6 ± 0.197
TyrTyr: 0.8 ± 0.235
TyrXaa: 0.0 ± 0.0

Xaa
XaaAla: 0.0 ± 0.0
XaaCys: 0.0 ± 0.0
XaaAsp: 0.0 ± 0.0
XaaGlu: 0.0 ± 0.0
XaaPhe: 0.0 ± 0.0
XaaGly: 0.0 ± 0.0
XaaHis: 0.0 ± 0.0
XaaIle: 0.0 ± 0.0
XaaLys: 0.0 ± 0.0
XaaLeu: 0.0 ± 0.0
XaaMet: 0.0 ± 0.0
XaaAsn: 0.0 ± 0.0
XaaPro: 0.0 ± 0.0
XaaGln: 0.0 ± 0.0
XaaArg: 0.0 ± 0.0
XaaSer: 0.0 ± 0.0
XaaThr: 0.0 ± 0.0
XaaVal: 0.0 ± 0.0
XaaTrp: 0.0 ± 0.0
XaaTyr: 0.0 ± 0.0
XaaXaa: 0.0 ± 0.0

Statistics based on 75 proteins (14999 amino acids)

Note: The error was estimated by bootstrapping (100×) at the protein level.
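One way to picture a protein-level bootstrap is sketched below: whole proteins are resampled with replacement, the frequency is recomputed for each resample, and the spread of the replicates is reported. This is only a sketch under stated assumptions: the helper names, the toy sequences, and the use of the standard deviation across replicates as the "error" are illustrative choices, not a confirmed description of the Proteome-pI procedure.

```python
import random
import statistics
from collections import Counter

def pair_per_mille(proteins, pair):
    """Frequency of one dipeptide (per mille) over overlapping pairs in each protein."""
    counts = Counter()
    for seq in proteins:
        counts.update(seq[i:i + 2] for i in range(len(seq) - 1))
    total = sum(counts.values())
    return 1000.0 * counts[pair] / total if total else 0.0

def bootstrap_error(proteins, pair, n_rounds=100, seed=0):
    """Hypothetical helper: spread of the frequency across protein-level resamples."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    estimates = []
    for _ in range(n_rounds):
        # Resample whole proteins with replacement, keeping the proteome size
        sample = rng.choices(proteins, k=len(proteins))
        estimates.append(pair_per_mille(sample, pair))
    # Assumption: the reported error is the standard deviation of the replicates
    return statistics.stdev(estimates)

# Made-up toy sequences, not taken from the phage Verse proteome
toy_proteins = ["MAAK", "MAAG", "MKKL", "MGAA"]
err = bootstrap_error(toy_proteins, "AA")
print(err)
```

Resampling proteins rather than individual residues keeps each sequence intact, so pairs that tend to co-occur in the same protein stay together in every replicate.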

The dipeptide statistics above (among other statistics for this proteome) can be downloaded from this CSV file
See this proteome in: uniprot_link
Proteome-pI is available under the Creative Commons Attribution-NoDerivs license; for more details see here

Reference: Kozlowski LP. Proteome-pI 2.0: Proteome Isoelectric Point Database Update. Nucleic Acids Res. 2021, doi: 10.1093/nar/gkab944
Contact: Lukasz P. Kozlowski