Amino acid dipeptide frequency for Mycobacterium phage Jolie1

Apart from single amino acid frequencies, one can also calculate the so-called amino acid dipeptide frequency, i.e. how often each ordered pair of adjacent residues occurs.
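As an illustration, the calculation can be sketched in Python as below. This is a minimal sketch and not the Proteome-pI implementation: the function name dipeptide_per_mille, the use of one-letter residue codes, and the normalisation by the total number of counted pairs are assumptions made for the example.

```python
from collections import Counter


def dipeptide_per_mille(proteins):
    """Return {dipeptide: frequency in per mille} for a list of sequences.

    Hypothetical helper: sequences are assumed to use one-letter codes.
    """
    counts = Counter()
    for seq in proteins:
        # Overlapping windows of length 2; a pair never spans two proteins.
        for i in range(len(seq) - 1):
            counts[seq[i:i + 2]] += 1
    total = sum(counts.values())
    if total == 0:
        return {}
    # Assumption: normalise by the total number of counted dipeptides.
    return {pair: 1000.0 * n / total for pair, n in counts.items()}


# Toy input: "MAAG" gives MA, AA, AG and "AAK" gives AA, AK (5 pairs in total).
freqs = dipeptide_per_mille(["MAAG", "AAK"])
```

The order of residues matters here, so AlaCys and CysAla are counted separately, as in the table.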

There are 400 possible dipeptides (441 if Xaa is counted as an additional 21st amino acid representing all non-standard or unknown amino acids). Thus, if dipeptides occurred uniformly at random in proteins, each of the 400 would be present at a frequency of 0.25% (2.5‰). As this is not the case in nature, for better visibility all dipeptides that are more frequent than expected are marked in red in the table, and those that are underrepresented are marked in blue.
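The comparison against the uniform expectation can be sketched as follows. The page does not state the exact rule used for the colouring, so the plain greater-than/less-than test here is an assumption, and classify is a hypothetical helper. The three example values are taken from the table.

```python
STANDARD_PAIRS = 20 * 20                       # 400 dipeptides from 20 standard amino acids
EXPECTED_PER_MILLE = 1000.0 / STANDARD_PAIRS   # 2.5 per mille, i.e. 0.25 %


def classify(observed_per_mille, expected=EXPECTED_PER_MILLE):
    """Label a dipeptide relative to the uniform expectation (assumed rule)."""
    if observed_per_mille > expected:
        return "over"    # would be marked in red
    if observed_per_mille < expected:
        return "under"   # would be marked in blue
    return "expected"


# Observed values in per mille, copied from the table.
examples = {"AlaAla": 16.734, "AlaAsn": 2.649, "CysCys": 0.132}
labels = {pair: classify(value) for pair, value in examples.items()}
fold = {pair: value / EXPECTED_PER_MILLE for pair, value in examples.items()}
```

With these inputs AlaAla is roughly 6.7 times more frequent than the uniform expectation, while CysCys is far below it.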

All values are given in per mille (‰) and therefore need to be multiplied by 10^-3 to obtain fractions.

For more information, see the sequence space article on Wikipedia.

Ala Cys Asp Glu Phe Gly His Ile Lys Leu Met Asn Pro Gln Arg Ser Thr Val Trp Tyr Xaa
Ala
AlaAla: 16.734 ± 1.337
AlaCys: 0.971 ± 0.205
AlaAsp: 8.08 ± 0.58
AlaGlu: 10.155 ± 0.838
AlaPhe: 2.914 ± 0.377
AlaGly: 10.42 ± 1.374
AlaHis: 1.899 ± 0.379
AlaIle: 4.813 ± 0.404
AlaLys: 3.532 ± 0.535
AlaLeu: 10.729 ± 0.935
AlaMet: 3.047 ± 0.314
AlaAsn: 2.649 ± 0.428
AlaPro: 8.08 ± 0.589
AlaGln: 5.254 ± 0.57
AlaArg: 8.654 ± 0.812
AlaSer: 6.711 ± 0.636
AlaThr: 7.992 ± 0.748
AlaVal: 7.948 ± 0.601
AlaTrp: 2.252 ± 0.316
AlaTyr: 2.296 ± 0.29
AlaXaa: 0.0 ± 0.0
Cys
CysAla: 1.06 ± 0.247
CysCys: 0.132 ± 0.083
CysAsp: 0.662 ± 0.151
CysGlu: 0.265 ± 0.102
CysPhe: 0.265 ± 0.129
CysGly: 1.678 ± 0.342
CysHis: 0.353 ± 0.147
CysIle: 0.088 ± 0.057
CysLys: 0.044 ± 0.045
CysLeu: 0.53 ± 0.164
CysMet: 0.177 ± 0.089
CysAsn: 0.221 ± 0.099
CysPro: 1.148 ± 0.349
CysGln: 0.397 ± 0.153
CysArg: 1.104 ± 0.269
CysSer: 0.397 ± 0.113
CysThr: 0.353 ± 0.123
CysVal: 0.662 ± 0.21
CysTrp: 0.309 ± 0.129
CysTyr: 0.088 ± 0.061
CysXaa: 0.0 ± 0.0
Asp
AspAla: 7.859 ± 0.519
AspCys: 1.148 ± 0.344
AspAsp: 4.46 ± 0.439
AspGlu: 4.106 ± 0.679
AspPhe: 1.325 ± 0.262
AspGly: 6.093 ± 0.51
AspHis: 1.369 ± 0.254
AspIle: 2.738 ± 0.37
AspLys: 1.236 ± 0.221
AspLeu: 4.857 ± 0.424
AspMet: 1.192 ± 0.275
AspAsn: 1.59 ± 0.23
AspPro: 5.431 ± 0.541
AspGln: 2.693 ± 0.322
AspArg: 4.769 ± 0.538
AspSer: 2.384 ± 0.281
AspThr: 3.532 ± 0.366
AspVal: 4.371 ± 0.378
AspTrp: 1.634 ± 0.266
AspTyr: 1.413 ± 0.28
AspXaa: 0.0 ± 0.0
Glu
GluAla: 8.213 ± 0.721
GluCys: 0.927 ± 0.221
GluAsp: 3.047 ± 0.533
GluGlu: 2.782 ± 0.36
GluPhe: 1.943 ± 0.252
GluGly: 4.857 ± 0.468
GluHis: 1.104 ± 0.204
GluIle: 2.738 ± 0.382
GluLys: 0.971 ± 0.259
GluLeu: 5.784 ± 0.534
GluMet: 1.28 ± 0.201
GluAsn: 1.28 ± 0.292
GluPro: 4.327 ± 0.549
GluGln: 2.605 ± 0.405
GluArg: 4.062 ± 0.519
GluSer: 2.428 ± 0.29
GluThr: 4.239 ± 0.344
GluVal: 5.034 ± 0.488
GluTrp: 1.28 ± 0.223
GluTyr: 1.192 ± 0.22
GluXaa: 0.0 ± 0.0
Phe
PheAla: 2.826 ± 0.421
PheCys: 0.265 ± 0.152
PheAsp: 2.693 ± 0.396
PheGlu: 1.325 ± 0.263
PhePhe: 0.397 ± 0.142
PheGly: 3.091 ± 0.391
PheHis: 0.618 ± 0.19
PheIle: 1.192 ± 0.228
PheLys: 0.795 ± 0.181
PheLeu: 1.854 ± 0.319
PheMet: 0.265 ± 0.111
PheAsn: 0.706 ± 0.15
PhePro: 1.148 ± 0.243
PheGln: 0.751 ± 0.188
PheArg: 1.854 ± 0.29
PheSer: 1.06 ± 0.229
PheThr: 1.678 ± 0.317
PheVal: 2.296 ± 0.371
PheTrp: 0.353 ± 0.128
PheTyr: 0.353 ± 0.133
PheXaa: 0.0 ± 0.0
Gly
GlyAla: 10.376 ± 1.271
GlyCys: 0.795 ± 0.208
GlyAsp: 5.387 ± 0.596
GlyGlu: 5.387 ± 0.43
GlyPhe: 2.296 ± 0.341
GlyGly: 10.42 ± 1.688
GlyHis: 1.854 ± 0.355
GlyIle: 3.93 ± 0.462
GlyLys: 2.914 ± 0.335
GlyLeu: 8.698 ± 0.887
GlyMet: 2.296 ± 0.35
GlyAsn: 3.047 ± 0.419
GlyPro: 4.018 ± 0.466
GlyGln: 4.636 ± 0.477
GlyArg: 6.491 ± 0.529
GlySer: 4.813 ± 0.445
GlyThr: 5.872 ± 0.624
GlyVal: 6.623 ± 0.64
GlyTrp: 2.34 ± 0.383
GlyTyr: 2.826 ± 0.342
GlyXaa: 0.0 ± 0.0
His
HisAla: 1.545 ± 0.268
HisCys: 0.177 ± 0.092
HisAsp: 1.28 ± 0.268
HisGlu: 0.397 ± 0.125
HisPhe: 0.486 ± 0.155
HisGly: 1.59 ± 0.27
HisHis: 0.486 ± 0.139
HisIle: 0.751 ± 0.167
HisLys: 0.442 ± 0.13
HisLeu: 1.413 ± 0.256
HisMet: 0.486 ± 0.155
HisAsn: 0.442 ± 0.153
HisPro: 1.59 ± 0.274
HisGln: 0.574 ± 0.156
HisArg: 1.899 ± 0.368
HisSer: 0.574 ± 0.149
HisThr: 1.545 ± 0.335
HisVal: 0.927 ± 0.212
HisTrp: 0.486 ± 0.146
HisTyr: 0.442 ± 0.129
HisXaa: 0.0 ± 0.0
Ile
IleAla: 4.724 ± 0.46
IleCys: 0.177 ± 0.087
IleAsp: 3.841 ± 0.387
IleGlu: 3.179 ± 0.397
IlePhe: 0.618 ± 0.124
IleGly: 4.327 ± 0.513
IleHis: 0.574 ± 0.178
IleIle: 1.899 ± 0.266
IleLys: 1.148 ± 0.285
IleLeu: 2.826 ± 0.332
IleMet: 0.442 ± 0.156
IleAsn: 1.369 ± 0.257
IlePro: 3.267 ± 0.354
IleGln: 1.678 ± 0.222
IleArg: 2.561 ± 0.341
IleSer: 2.031 ± 0.335
IleThr: 4.106 ± 0.48
IleVal: 2.826 ± 0.343
IleTrp: 0.265 ± 0.106
IleTyr: 0.706 ± 0.172
IleXaa: 0.0 ± 0.0
Lys
LysAla: 3.886 ± 0.513
LysCys: 0.177 ± 0.082
LysAsp: 1.457 ± 0.255
LysGlu: 1.325 ± 0.257
LysPhe: 0.486 ± 0.107
LysGly: 2.605 ± 0.35
LysHis: 0.309 ± 0.121
LysIle: 1.545 ± 0.263
LysLys: 0.706 ± 0.182
LysLeu: 2.428 ± 0.326
LysMet: 0.618 ± 0.145
LysAsn: 0.927 ± 0.161
LysPro: 1.192 ± 0.226
LysGln: 0.883 ± 0.221
LysArg: 1.634 ± 0.299
LysSer: 1.369 ± 0.244
LysThr: 2.428 ± 0.327
LysVal: 1.987 ± 0.263
LysTrp: 0.353 ± 0.125
LysTyr: 0.618 ± 0.167
LysXaa: 0.0 ± 0.0
Leu
LeuAla: 11.348 ± 0.824
LeuCys: 0.751 ± 0.234
LeuAsp: 4.989 ± 0.488
LeuGlu: 4.283 ± 0.419
LeuPhe: 1.987 ± 0.351
LeuGly: 8.036 ± 0.818
LeuHis: 1.369 ± 0.226
LeuIle: 3.621 ± 0.412
LeuLys: 1.854 ± 0.299
LeuLeu: 6.005 ± 0.501
LeuMet: 2.208 ± 0.276
LeuAsn: 1.81 ± 0.345
LeuPro: 5.696 ± 0.633
LeuGln: 2.428 ± 0.408
LeuArg: 4.504 ± 0.457
LeuSer: 4.46 ± 0.423
LeuThr: 7.109 ± 0.574
LeuVal: 5.519 ± 0.542
LeuTrp: 1.325 ± 0.254
LeuTyr: 1.722 ± 0.311
LeuXaa: 0.0 ± 0.0
Met
MetAla: 2.782 ± 0.355
MetCys: 0.265 ± 0.101
MetAsp: 1.545 ± 0.269
MetGlu: 0.706 ± 0.218
MetPhe: 0.53 ± 0.157
MetGly: 1.987 ± 0.3
MetHis: 0.442 ± 0.147
MetIle: 1.192 ± 0.248
MetLys: 0.662 ± 0.167
MetLeu: 1.369 ± 0.266
MetMet: 0.353 ± 0.133
MetAsn: 0.442 ± 0.146
MetPro: 1.678 ± 0.24
MetGln: 0.574 ± 0.16
MetArg: 1.59 ± 0.273
MetSer: 1.81 ± 0.232
MetThr: 2.826 ± 0.325
MetVal: 0.927 ± 0.188
MetTrp: 0.265 ± 0.124
MetTyr: 0.442 ± 0.159
MetXaa: 0.0 ± 0.0
Asn
AsnAla: 3.753 ± 0.486
AsnCys: 0.265 ± 0.087
AsnAsp: 1.899 ± 0.284
AsnGlu: 1.236 ± 0.244
AsnPhe: 0.53 ± 0.148
AsnGly: 3.356 ± 0.378
AsnHis: 0.706 ± 0.153
AsnIle: 1.104 ± 0.28
AsnLys: 0.795 ± 0.189
AsnLeu: 2.031 ± 0.311
AsnMet: 0.442 ± 0.147
AsnAsn: 0.839 ± 0.221
AsnPro: 2.075 ± 0.346
AsnGln: 0.706 ± 0.189
AsnArg: 1.148 ± 0.286
AsnSer: 1.369 ± 0.221
AsnThr: 1.899 ± 0.291
AsnVal: 1.81 ± 0.308
AsnTrp: 0.353 ± 0.143
AsnTyr: 0.839 ± 0.157
AsnXaa: 0.0 ± 0.0
Pro
ProAla: 7.109 ± 0.623
ProCys: 0.353 ± 0.131
ProAsp: 5.343 ± 0.656
ProGlu: 5.475 ± 0.718
ProPhe: 2.075 ± 0.273
ProGly: 7.109 ± 0.619
ProHis: 0.839 ± 0.19
ProIle: 2.649 ± 0.358
ProLys: 1.722 ± 0.254
ProLeu: 4.15 ± 0.36
ProMet: 1.28 ± 0.178
ProAsn: 1.854 ± 0.286
ProPro: 4.283 ± 0.514
ProGln: 1.236 ± 0.232
ProArg: 3.621 ± 0.343
ProSer: 3.621 ± 0.445
ProThr: 3.709 ± 0.396
ProVal: 6.049 ± 0.395
ProTrp: 1.325 ± 0.251
ProTyr: 1.016 ± 0.262
ProXaa: 0.0 ± 0.0
Gln
GlnAla: 5.608 ± 0.625
GlnCys: 0.397 ± 0.148
GlnAsp: 1.28 ± 0.207
GlnGlu: 1.104 ± 0.294
GlnPhe: 0.883 ± 0.211
GlnGly: 2.826 ± 0.363
GlnHis: 0.442 ± 0.147
GlnIle: 2.649 ± 0.335
GlnLys: 0.751 ± 0.147
GlnLeu: 3.841 ± 0.408
GlnMet: 0.971 ± 0.199
GlnAsn: 0.839 ± 0.187
GlnPro: 2.208 ± 0.304
GlnGln: 1.104 ± 0.206
GlnArg: 2.914 ± 0.39
GlnSer: 0.839 ± 0.203
GlnThr: 2.296 ± 0.361
GlnVal: 3.356 ± 0.394
GlnTrp: 0.574 ± 0.228
GlnTyr: 0.883 ± 0.194
GlnXaa: 0.0 ± 0.0
Arg
ArgAla: 8.301 ± 0.716
ArgCys: 0.883 ± 0.186
ArgAsp: 4.46 ± 0.46
ArgGlu: 3.886 ± 0.428
ArgPhe: 1.854 ± 0.251
ArgGly: 4.901 ± 0.432
ArgHis: 1.369 ± 0.26
ArgIle: 2.031 ± 0.307
ArgLys: 2.119 ± 0.31
ArgLeu: 6.226 ± 0.458
ArgMet: 1.987 ± 0.319
ArgAsn: 1.899 ± 0.245
ArgPro: 4.592 ± 0.535
ArgGln: 2.782 ± 0.343
ArgArg: 5.431 ± 0.535
ArgSer: 3.356 ± 0.393
ArgThr: 3.532 ± 0.347
ArgVal: 4.813 ± 0.564
ArgTrp: 1.678 ± 0.264
ArgTyr: 1.59 ± 0.291
ArgXaa: 0.0 ± 0.0
Ser
SerAla: 6.358 ± 0.379
SerCys: 0.486 ± 0.158
SerAsp: 2.826 ± 0.344
SerGlu: 2.031 ± 0.395
SerPhe: 1.016 ± 0.252
SerGly: 5.74 ± 0.661
SerHis: 0.53 ± 0.157
SerIle: 1.899 ± 0.268
SerLys: 1.634 ± 0.292
SerLeu: 4.15 ± 0.417
SerMet: 1.016 ± 0.185
SerAsn: 1.236 ± 0.239
SerPro: 2.87 ± 0.403
SerGln: 1.987 ± 0.274
SerArg: 3.179 ± 0.342
SerSer: 2.384 ± 0.348
SerThr: 3.179 ± 0.427
SerVal: 4.636 ± 0.503
SerTrp: 1.236 ± 0.199
SerTyr: 0.839 ± 0.181
SerXaa: 0.0 ± 0.0
Thr
ThrAla: 9.626 ± 0.624
ThrCys: 0.574 ± 0.155
ThrAsp: 4.106 ± 0.499
ThrGlu: 5.519 ± 0.661
ThrPhe: 2.208 ± 0.354
ThrGly: 7.02 ± 0.624
ThrHis: 1.148 ± 0.206
ThrIle: 2.738 ± 0.336
ThrLys: 2.384 ± 0.346
ThrLeu: 5.298 ± 0.582
ThrMet: 1.59 ± 0.303
ThrAsn: 1.501 ± 0.255
ThrPro: 4.46 ± 0.647
ThrGln: 1.722 ± 0.241
ThrArg: 4.283 ± 0.368
ThrSer: 3.532 ± 0.367
ThrThr: 3.974 ± 0.497
ThrVal: 4.945 ± 0.482
ThrTrp: 2.119 ± 0.297
ThrTyr: 1.457 ± 0.285
ThrXaa: 0.0 ± 0.0
Val
ValAla: 8.345 ± 0.654
ValCys: 0.574 ± 0.16
ValAsp: 4.195 ± 0.457
ValGlu: 4.68 ± 0.423
ValPhe: 2.119 ± 0.306
ValGly: 5.917 ± 0.622
ValHis: 1.148 ± 0.23
ValIle: 3.665 ± 0.359
ValLys: 2.296 ± 0.336
ValLeu: 5.21 ± 0.577
ValMet: 1.766 ± 0.262
ValAsn: 3.091 ± 0.361
ValPro: 4.106 ± 0.445
ValGln: 2.34 ± 0.308
ValArg: 4.592 ± 0.545
ValSer: 4.15 ± 0.462
ValThr: 7.02 ± 0.667
ValVal: 5.343 ± 0.507
ValTrp: 0.971 ± 0.209
ValTyr: 1.678 ± 0.279
ValXaa: 0.0 ± 0.0
Trp
TrpAla: 2.075 ± 0.372
TrpCys: 0.397 ± 0.157
TrpAsp: 1.016 ± 0.209
TrpGlu: 1.016 ± 0.198
TrpPhe: 1.104 ± 0.288
TrpGly: 0.795 ± 0.159
TrpHis: 0.662 ± 0.183
TrpIle: 0.574 ± 0.171
TrpLys: 0.442 ± 0.113
TrpLeu: 1.722 ± 0.32
TrpMet: 0.53 ± 0.16
TrpAsn: 0.883 ± 0.207
TrpPro: 1.369 ± 0.297
TrpGln: 0.486 ± 0.16
TrpArg: 1.457 ± 0.251
TrpSer: 1.148 ± 0.219
TrpThr: 1.678 ± 0.267
TrpVal: 1.28 ± 0.263
TrpTrp: 0.618 ± 0.154
TrpTyr: 0.751 ± 0.196
TrpXaa: 0.0 ± 0.0
Tyr
TyrAla: 2.605 ± 0.343
TyrCys: 0.221 ± 0.103
TyrAsp: 1.634 ± 0.277
TyrGlu: 1.28 ± 0.196
TyrPhe: 0.486 ± 0.132
TyrGly: 2.252 ± 0.362
TyrHis: 0.309 ± 0.101
TyrIle: 0.751 ± 0.176
TyrLys: 0.486 ± 0.174
TyrLeu: 1.899 ± 0.323
TyrMet: 0.397 ± 0.119
TyrAsn: 0.53 ± 0.176
TyrPro: 1.28 ± 0.279
TyrGln: 0.927 ± 0.181
TyrArg: 1.943 ± 0.337
TyrSer: 0.795 ± 0.212
TyrThr: 1.236 ± 0.277
TyrVal: 1.854 ± 0.262
TyrTrp: 0.265 ± 0.096
TyrTyr: 0.442 ± 0.151
TyrXaa: 0.0 ± 0.0
Xaa
XaaAla: 0.0 ± 0.0
XaaCys: 0.0 ± 0.0
XaaAsp: 0.0 ± 0.0
XaaGlu: 0.0 ± 0.0
XaaPhe: 0.0 ± 0.0
XaaGly: 0.0 ± 0.0
XaaHis: 0.0 ± 0.0
XaaIle: 0.0 ± 0.0
XaaLys: 0.0 ± 0.0
XaaLeu: 0.0 ± 0.0
XaaMet: 0.0 ± 0.0
XaaAsn: 0.0 ± 0.0
XaaPro: 0.0 ± 0.0
XaaGln: 0.0 ± 0.0
XaaArg: 0.0 ± 0.0
XaaSer: 0.0 ± 0.0
XaaThr: 0.0 ± 0.0
XaaVal: 0.0 ± 0.0
XaaTrp: 0.0 ± 0.0
XaaTyr: 0.0 ± 0.0
XaaXaa: 0.0 ± 0.0
Statistics based on 98 proteins (22649 amino acids)

Note: The error has been estimated by bootstrapping (x100) at the protein level.
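One way to read "bootstrapping at the protein level" is sketched below: whole proteins are resampled with replacement, the frequency is recomputed for each resample, and the spread of the 100 estimates is reported as the error. The exact procedure used by Proteome-pI is not described on this page, so the function names, the one-letter codes, the fixed seed and the use of the population standard deviation are all assumptions.

```python
import random
import statistics


def pair_per_mille(proteins, pair):
    """Frequency of one dipeptide, in per mille, over all given proteins."""
    count = total = 0
    for seq in proteins:
        for i in range(len(seq) - 1):
            total += 1
            if seq[i:i + 2] == pair:
                count += 1
    return 1000.0 * count / total if total else 0.0


def bootstrap_error(proteins, pair, n_resamples=100, seed=0):
    """Spread of the frequency over protein-level bootstrap resamples."""
    rng = random.Random(seed)  # fixed seed only to make the sketch repeatable
    estimates = []
    for _ in range(n_resamples):
        # Resample whole proteins (not residues) with replacement.
        sample = rng.choices(proteins, k=len(proteins))
        estimates.append(pair_per_mille(sample, pair))
    # Assumption: report the standard deviation of the resampled estimates.
    return statistics.pstdev(estimates)


# Toy proteome of three short sequences in one-letter code.
toy = ["AAAA", "KKKK", "AKAK"]
error = bootstrap_error(toy, "AA")
```

Resampling whole proteins rather than individual residues keeps adjacent residues together, so each resample still contains only dipeptides that actually occur in the proteome.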

The above dipeptide statistics (among other statistics for this proteome) can be downloaded from this CSV file
See this proteome in: uniprot_link
Proteome-pI is available under the Creative Commons Attribution-NoDerivs license; for more details see here

Reference: Kozlowski LP. Proteome-pI 2.0: Proteome Isoelectric Point Database Update. Nucleic Acids Res. 2021, doi: 10.1093/nar/gkab944
Contact: Lukasz P. Kozlowski