Amino acid dipeptide frequency for Mycobacterium virus SG4

Apart from single amino acid frequencies, one can also calculate the so-called amino acid dipeptide frequency, i.e. how often each ordered pair of adjacent amino acids occurs.

There are 400 possible dipeptides (441 if we consider Xaa as an additional 21st amino acid representing all non-standard or unknown amino acids). Thus, if dipeptides occurred randomly and with equal probability in proteins, each of the 400 dipeptides would be present with a frequency of 0.25% (2.5‰). As this is not the case in nature, for better visibility, dipeptides that occur more often than expected are marked in red, and those that are underrepresented are marked in blue in the table.
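The counting described above can be sketched as follows. This is a minimal illustration, not the database's actual code: the function name `dipeptide_per_mille`, the one-letter alphabet, the choice to count overlapping pairs within each protein only, and the toy sequences are all assumptions made for the example.

```python
from collections import Counter
from itertools import product

# One-letter codes in the same order as the table; "X" stands for Xaa.
ALPHABET = "ACDEFGHIKLMNPQRSTVWY"

def dipeptide_per_mille(proteins):
    """Return {dipeptide: frequency in per mille} over overlapping pairs.

    Assumption: pairs are counted within each protein only (never across
    protein boundaries) and any non-standard letter is mapped to 'X'.
    """
    counts = Counter()
    for seq in proteins:
        cleaned = [aa if aa in ALPHABET else "X" for aa in seq.upper()]
        counts.update(a + b for a, b in zip(cleaned, cleaned[1:]))
    total = sum(counts.values())
    pairs = ("".join(p) for p in product(ALPHABET + "X", repeat=2))
    return {p: (1000.0 * counts[p] / total if total else 0.0) for p in pairs}

# Uniform expectation for the 400 standard dipeptides: 1/400 = 0.25% = 2.5 per mille.
UNIFORM_PER_MILLE = 1000.0 / 400

toy_proteins = ["MAAGAA", "AAGG"]  # hypothetical sequences, not from SG4
freqs = dipeptide_per_mille(toy_proteins)
over = sorted(p for p, f in freqs.items() if f > UNIFORM_PER_MILLE)
print(freqs["AA"], over)
```

The result always contains all 441 keys, so dipeptides that never occur (such as every Xaa pair in this proteome) appear with a frequency of 0.0.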

All values are given in per mille (‰) and therefore need to be multiplied by 10^-3 to obtain fractions.
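As a small worked example of the unit conversion, the sketch below turns a few per-mille values from the table into fractions and percentages and compares them with the uniform 0.25% baseline mentioned earlier. The helper name `per_mille_to_fraction` is hypothetical, and using the uniform baseline as the "expected" value is an assumption based on the description above.

```python
def per_mille_to_fraction(value):
    """Convert a per-mille table value to a plain fraction (multiply by 10^-3)."""
    return value * 1e-3

UNIFORM_PER_MILLE = 1000.0 / 400  # 0.25% baseline for the 400 standard dipeptides

# Three values copied from the table.
table_values = {"AlaAla": 15.254, "GlyGly": 10.881, "CysCys": 0.053}

for name, value in table_values.items():
    fraction = per_mille_to_fraction(value)
    fold = value / UNIFORM_PER_MILLE  # ratio to the uniform expectation
    label = "over" if fold > 1 else "under"
    print(f"{name}: {fraction:.6f} ({fraction * 100:.4f}%), "
          f"{fold:.2f}x uniform, {label}-represented")
```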

For more information, see the sequence space article on Wikipedia.

Ala Cys Asp Glu Phe Gly His Ile Lys Leu Met Asn Pro Gln Arg Ser Thr Val Trp Tyr Xaa
Ala
AlaAla: 15.254 ± 1.786
AlaCys: 1.067 ± 0.288
AlaAsp: 6.72 ± 0.584
AlaGlu: 8.0 ± 0.755
AlaPhe: 2.773 ± 0.422
AlaGly: 9.334 ± 1.144
AlaHis: 2.507 ± 0.381
AlaIle: 3.894 ± 0.556
AlaLys: 4.054 ± 0.427
AlaLeu: 8.747 ± 0.816
AlaMet: 2.933 ± 0.484
AlaAsn: 2.773 ± 0.372
AlaPro: 5.067 ± 0.612
AlaGln: 2.987 ± 0.414
AlaArg: 8.107 ± 0.972
AlaSer: 5.44 ± 0.671
AlaThr: 6.187 ± 0.583
AlaVal: 7.04 ± 0.587
AlaTrp: 2.827 ± 0.511
AlaTyr: 1.973 ± 0.289
AlaXaa: 0.0 ± 0.0
Cys
CysAla: 1.28 ± 0.263
CysCys: 0.053 ± 0.064
CysAsp: 1.6 ± 0.368
CysGlu: 0.533 ± 0.166
CysPhe: 0.107 ± 0.076
CysGly: 1.493 ± 0.293
CysHis: 0.373 ± 0.133
CysIle: 0.267 ± 0.122
CysLys: 0.427 ± 0.158
CysLeu: 0.747 ± 0.244
CysMet: 0.16 ± 0.096
CysAsn: 0.427 ± 0.143
CysPro: 0.907 ± 0.222
CysGln: 0.32 ± 0.143
CysArg: 0.693 ± 0.183
CysSer: 0.853 ± 0.215
CysThr: 0.747 ± 0.224
CysVal: 0.48 ± 0.143
CysTrp: 0.373 ± 0.139
CysTyr: 0.32 ± 0.133
CysXaa: 0.0 ± 0.0
Asp
AspAla: 7.2 ± 0.63
AspCys: 0.747 ± 0.249
AspAsp: 4.587 ± 0.611
AspGlu: 3.307 ± 0.487
AspPhe: 2.133 ± 0.323
AspGly: 6.88 ± 0.606
AspHis: 1.28 ± 0.251
AspIle: 2.453 ± 0.339
AspLys: 1.6 ± 0.299
AspLeu: 5.44 ± 0.498
AspMet: 1.067 ± 0.254
AspAsn: 1.653 ± 0.383
AspPro: 5.494 ± 0.624
AspGln: 2.347 ± 0.344
AspArg: 5.6 ± 0.642
AspSer: 3.787 ± 0.516
AspThr: 3.68 ± 0.447
AspVal: 4.32 ± 0.59
AspTrp: 1.44 ± 0.258
AspTyr: 1.547 ± 0.237
AspXaa: 0.0 ± 0.0
Glu
GluAla: 6.347 ± 0.692
GluCys: 1.173 ± 0.279
GluAsp: 3.36 ± 0.372
GluGlu: 3.36 ± 0.572
GluPhe: 2.347 ± 0.283
GluGly: 2.88 ± 0.405
GluHis: 1.92 ± 0.36
GluIle: 2.08 ± 0.347
GluLys: 1.973 ± 0.29
GluLeu: 5.654 ± 0.786
GluMet: 1.547 ± 0.278
GluAsn: 2.027 ± 0.308
GluPro: 2.773 ± 0.489
GluGln: 2.88 ± 0.373
GluArg: 5.28 ± 0.67
GluSer: 3.2 ± 0.477
GluThr: 3.894 ± 0.649
GluVal: 3.894 ± 0.443
GluTrp: 1.44 ± 0.265
GluTyr: 1.76 ± 0.365
GluXaa: 0.0 ± 0.0
Phe
PheAla: 3.36 ± 0.414
PheCys: 0.48 ± 0.188
PheAsp: 2.4 ± 0.408
PheGlu: 1.547 ± 0.348
PhePhe: 0.853 ± 0.263
PheGly: 3.2 ± 0.598
PheHis: 0.427 ± 0.177
PheIle: 1.28 ± 0.319
PheLys: 0.907 ± 0.236
PheLeu: 1.76 ± 0.292
PheMet: 0.853 ± 0.242
PheAsn: 0.96 ± 0.294
PhePro: 1.547 ± 0.28
PheGln: 1.013 ± 0.328
PheArg: 1.813 ± 0.284
PheSer: 1.707 ± 0.256
PheThr: 2.293 ± 0.416
PheVal: 2.027 ± 0.301
PheTrp: 0.533 ± 0.128
PheTyr: 0.853 ± 0.236
PheXaa: 0.0 ± 0.0
Gly
GlyAla: 9.547 ± 1.176
GlyCys: 1.067 ± 0.241
GlyAsp: 6.507 ± 0.673
GlyGlu: 3.84 ± 0.497
GlyPhe: 2.933 ± 0.406
GlyGly: 10.881 ± 2.574
GlyHis: 1.92 ± 0.257
GlyIle: 4.64 ± 0.607
GlyLys: 2.24 ± 0.332
GlyLeu: 6.507 ± 0.533
GlyMet: 2.187 ± 0.42
GlyAsn: 2.827 ± 0.349
GlyPro: 4.587 ± 0.57
GlyGln: 2.347 ± 0.555
GlyArg: 5.12 ± 0.652
GlySer: 5.814 ± 0.744
GlyThr: 6.507 ± 0.728
GlyVal: 6.24 ± 0.526
GlyTrp: 2.133 ± 0.33
GlyTyr: 1.707 ± 0.333
GlyXaa: 0.0 ± 0.0
His
HisAla: 2.08 ± 0.319
HisCys: 0.533 ± 0.191
HisAsp: 1.227 ± 0.247
HisGlu: 1.547 ± 0.31
HisPhe: 0.32 ± 0.12
HisGly: 1.92 ± 0.276
HisHis: 1.067 ± 0.282
HisIle: 1.387 ± 0.261
HisLys: 0.64 ± 0.185
HisLeu: 1.547 ± 0.311
HisMet: 0.373 ± 0.125
HisAsn: 0.693 ± 0.197
HisPro: 1.813 ± 0.289
HisGln: 0.96 ± 0.224
HisArg: 2.187 ± 0.391
HisSer: 0.747 ± 0.173
HisThr: 1.12 ± 0.27
HisVal: 1.547 ± 0.297
HisTrp: 0.48 ± 0.179
HisTyr: 0.747 ± 0.195
HisXaa: 0.0 ± 0.0
Ile
IleAla: 5.174 ± 0.496
IleCys: 0.64 ± 0.232
IleAsp: 3.894 ± 0.544
IleGlu: 3.52 ± 0.327
IlePhe: 0.853 ± 0.261
IleGly: 3.734 ± 0.513
IleHis: 1.493 ± 0.288
IleIle: 1.28 ± 0.266
IleLys: 0.96 ± 0.231
IleLeu: 2.027 ± 0.492
IleMet: 0.533 ± 0.183
IleAsn: 2.08 ± 0.314
IlePro: 2.88 ± 0.341
IleGln: 1.387 ± 0.255
IleArg: 2.72 ± 0.396
IleSer: 1.76 ± 0.349
IleThr: 3.52 ± 0.43
IleVal: 2.613 ± 0.336
IleTrp: 0.693 ± 0.174
IleTyr: 0.693 ± 0.194
IleXaa: 0.0 ± 0.0
Lys
LysAla: 3.947 ± 0.484
LysCys: 0.32 ± 0.13
LysAsp: 1.707 ± 0.277
LysGlu: 1.44 ± 0.253
LysPhe: 1.067 ± 0.183
LysGly: 2.56 ± 0.311
LysHis: 1.12 ± 0.255
LysIle: 1.013 ± 0.277
LysLys: 1.44 ± 0.376
LysLeu: 2.613 ± 0.493
LysMet: 0.48 ± 0.126
LysAsn: 0.747 ± 0.198
LysPro: 2.4 ± 0.341
LysGln: 1.707 ± 0.276
LysArg: 2.347 ± 0.364
LysSer: 1.813 ± 0.298
LysThr: 1.867 ± 0.271
LysVal: 1.92 ± 0.341
LysTrp: 0.747 ± 0.22
LysTyr: 0.853 ± 0.234
LysXaa: 0.0 ± 0.0
Leu
LeuAla: 8.374 ± 0.845
LeuCys: 0.587 ± 0.21
LeuAsp: 5.387 ± 0.634
LeuGlu: 4.16 ± 0.549
LeuPhe: 2.507 ± 0.357
LeuGly: 5.547 ± 0.557
LeuHis: 1.12 ± 0.254
LeuIle: 3.36 ± 0.463
LeuLys: 2.293 ± 0.317
LeuLeu: 4.534 ± 0.608
LeuMet: 1.707 ± 0.346
LeuAsn: 2.613 ± 0.373
LeuPro: 5.12 ± 0.666
LeuGln: 2.72 ± 0.421
LeuArg: 5.654 ± 0.69
LeuSer: 5.547 ± 0.533
LeuThr: 4.534 ± 0.443
LeuVal: 4.907 ± 0.611
LeuTrp: 1.547 ± 0.287
LeuTyr: 1.867 ± 0.376
LeuXaa: 0.0 ± 0.0
Met
MetAla: 2.24 ± 0.336
MetCys: 0.16 ± 0.101
MetAsp: 1.173 ± 0.245
MetGlu: 1.013 ± 0.203
MetPhe: 0.64 ± 0.238
MetGly: 1.867 ± 0.372
MetHis: 0.16 ± 0.082
MetIle: 0.96 ± 0.232
MetLys: 0.64 ± 0.218
MetLeu: 1.707 ± 0.259
MetMet: 0.533 ± 0.197
MetAsn: 0.907 ± 0.228
MetPro: 1.387 ± 0.306
MetGln: 0.373 ± 0.12
MetArg: 1.493 ± 0.243
MetSer: 3.2 ± 0.496
MetThr: 1.76 ± 0.325
MetVal: 1.44 ± 0.322
MetTrp: 0.32 ± 0.124
MetTyr: 0.267 ± 0.118
MetXaa: 0.0 ± 0.0
Asn
AsnAla: 3.574 ± 0.431
AsnCys: 0.16 ± 0.097
AsnAsp: 1.973 ± 0.289
AsnGlu: 1.547 ± 0.33
AsnPhe: 0.853 ± 0.286
AsnGly: 3.68 ± 0.468
AsnHis: 0.853 ± 0.186
AsnIle: 1.707 ± 0.463
AsnLys: 1.013 ± 0.189
AsnLeu: 2.187 ± 0.4
AsnMet: 0.693 ± 0.176
AsnAsn: 1.547 ± 0.299
AsnPro: 2.72 ± 0.38
AsnGln: 1.067 ± 0.302
AsnArg: 2.347 ± 0.351
AsnSer: 1.493 ± 0.291
AsnThr: 1.867 ± 0.304
AsnVal: 2.187 ± 0.339
AsnTrp: 0.747 ± 0.175
AsnTyr: 0.64 ± 0.152
AsnXaa: 0.0 ± 0.0
Pro
ProAla: 5.44 ± 0.669
ProCys: 0.853 ± 0.204
ProAsp: 4.587 ± 0.566
ProGlu: 4.0 ± 0.503
ProPhe: 1.76 ± 0.313
ProGly: 6.934 ± 0.665
ProHis: 1.28 ± 0.266
ProIle: 2.133 ± 0.314
ProLys: 1.973 ± 0.312
ProLeu: 4.64 ± 0.483
ProMet: 1.493 ± 0.34
ProAsn: 2.453 ± 0.321
ProPro: 3.68 ± 0.529
ProGln: 1.92 ± 0.34
ProArg: 3.574 ± 0.484
ProSer: 3.04 ± 0.487
ProThr: 2.827 ± 0.392
ProVal: 4.96 ± 0.551
ProTrp: 1.173 ± 0.267
ProTyr: 1.707 ± 0.299
ProXaa: 0.0 ± 0.0
Gln
GlnAla: 4.0 ± 0.526
GlnCys: 0.48 ± 0.233
GlnAsp: 1.547 ± 0.27
GlnGlu: 1.493 ± 0.306
GlnPhe: 1.12 ± 0.261
GlnGly: 2.187 ± 0.494
GlnHis: 0.427 ± 0.189
GlnIle: 1.813 ± 0.303
GlnLys: 1.493 ± 0.272
GlnLeu: 2.987 ± 0.35
GlnMet: 0.587 ± 0.158
GlnAsn: 0.96 ± 0.273
GlnPro: 2.72 ± 0.44
GlnGln: 1.333 ± 0.282
GlnArg: 2.4 ± 0.352
GlnSer: 2.72 ± 0.347
GlnThr: 1.813 ± 0.333
GlnVal: 2.4 ± 0.347
GlnTrp: 0.693 ± 0.194
GlnTyr: 0.8 ± 0.234
GlnXaa: 0.0 ± 0.0
Arg
ArgAla: 6.134 ± 0.658
ArgCys: 1.333 ± 0.319
ArgAsp: 4.0 ± 0.549
ArgGlu: 5.547 ± 0.66
ArgPhe: 2.027 ± 0.405
ArgGly: 4.534 ± 0.445
ArgHis: 1.76 ± 0.316
ArgIle: 3.787 ± 0.536
ArgLys: 2.347 ± 0.394
ArgLeu: 5.334 ± 0.657
ArgMet: 2.24 ± 0.42
ArgAsn: 2.56 ± 0.41
ArgPro: 3.467 ± 0.392
ArgGln: 2.293 ± 0.429
ArgArg: 6.187 ± 0.84
ArgSer: 4.427 ± 0.615
ArgThr: 3.254 ± 0.46
ArgVal: 5.92 ± 0.597
ArgTrp: 2.027 ± 0.304
ArgTyr: 2.347 ± 0.335
ArgXaa: 0.0 ± 0.0
Ser
SerAla: 6.187 ± 1.333
SerCys: 0.373 ± 0.115
SerAsp: 3.787 ± 0.507
SerGlu: 3.84 ± 0.53
SerPhe: 2.027 ± 0.409
SerGly: 7.574 ± 0.669
SerHis: 1.067 ± 0.207
SerIle: 2.667 ± 0.404
SerLys: 2.24 ± 0.387
SerLeu: 3.894 ± 0.514
SerMet: 1.493 ± 0.271
SerAsn: 1.867 ± 0.336
SerPro: 3.2 ± 0.334
SerGln: 1.76 ± 0.29
SerArg: 3.467 ± 0.479
SerSer: 3.734 ± 0.56
SerThr: 3.2 ± 0.43
SerVal: 5.12 ± 0.532
SerTrp: 1.493 ± 0.278
SerTyr: 1.28 ± 0.209
SerXaa: 0.0 ± 0.0
Thr
ThrAla: 6.027 ± 0.559
ThrCys: 0.427 ± 0.15
ThrAsp: 3.84 ± 0.54
ThrGlu: 3.04 ± 0.339
ThrPhe: 1.653 ± 0.327
ThrGly: 5.867 ± 0.575
ThrHis: 1.44 ± 0.26
ThrIle: 3.147 ± 0.441
ThrLys: 1.867 ± 0.29
ThrLeu: 4.267 ± 0.513
ThrMet: 0.96 ± 0.209
ThrAsn: 2.4 ± 0.421
ThrPro: 4.32 ± 0.507
ThrGln: 1.973 ± 0.291
ThrArg: 3.734 ± 0.383
ThrSer: 3.84 ± 0.438
ThrThr: 4.267 ± 0.538
ThrVal: 5.867 ± 0.652
ThrTrp: 1.12 ± 0.239
ThrTyr: 1.44 ± 0.308
ThrXaa: 0.0 ± 0.0
Val
ValAla: 7.414 ± 0.542
ValCys: 1.013 ± 0.203
ValAsp: 5.12 ± 0.595
ValGlu: 4.96 ± 0.543
ValPhe: 2.293 ± 0.359
ValGly: 5.654 ± 0.715
ValHis: 1.493 ± 0.315
ValIle: 2.613 ± 0.374
ValLys: 2.293 ± 0.351
ValLeu: 6.134 ± 0.585
ValMet: 1.493 ± 0.209
ValAsn: 2.187 ± 0.338
ValPro: 3.894 ± 0.407
ValGln: 2.773 ± 0.392
ValArg: 4.907 ± 0.658
ValSer: 4.747 ± 0.565
ValThr: 5.014 ± 0.476
ValVal: 6.88 ± 0.753
ValTrp: 1.44 ± 0.316
ValTyr: 1.653 ± 0.272
ValXaa: 0.0 ± 0.0
Trp
TrpAla: 1.973 ± 0.286
TrpCys: 0.373 ± 0.125
TrpAsp: 1.6 ± 0.314
TrpGlu: 1.28 ± 0.366
TrpPhe: 0.747 ± 0.202
TrpGly: 0.8 ± 0.207
TrpHis: 0.587 ± 0.188
TrpIle: 0.96 ± 0.181
TrpLys: 0.853 ± 0.177
TrpLeu: 1.76 ± 0.352
TrpMet: 0.64 ± 0.203
TrpAsn: 0.587 ± 0.192
TrpPro: 1.28 ± 0.293
TrpGln: 1.067 ± 0.28
TrpArg: 1.813 ± 0.378
TrpSer: 1.6 ± 0.364
TrpThr: 1.547 ± 0.26
TrpVal: 1.92 ± 0.39
TrpTrp: 1.067 ± 0.201
TrpTyr: 0.267 ± 0.111
TrpXaa: 0.0 ± 0.0
Tyr
TyrAla: 2.133 ± 0.354
TyrCys: 0.32 ± 0.126
TyrAsp: 1.547 ± 0.404
TyrGlu: 1.813 ± 0.335
TyrPhe: 0.8 ± 0.224
TyrGly: 2.027 ± 0.321
TyrHis: 0.64 ± 0.189
TyrIle: 1.013 ± 0.201
TyrLys: 0.853 ± 0.216
TyrLeu: 1.6 ± 0.291
TyrMet: 0.213 ± 0.098
TyrAsn: 0.587 ± 0.194
TyrPro: 1.227 ± 0.228
TyrGln: 0.747 ± 0.199
TyrArg: 2.08 ± 0.357
TyrSer: 0.8 ± 0.242
TyrThr: 1.6 ± 0.31
TyrVal: 2.293 ± 0.319
TyrTrp: 0.427 ± 0.165
TyrTyr: 0.64 ± 0.187
TyrXaa: 0.0 ± 0.0
Xaa
XaaAla: 0.0 ± 0.0
XaaCys: 0.0 ± 0.0
XaaAsp: 0.0 ± 0.0
XaaGlu: 0.0 ± 0.0
XaaPhe: 0.0 ± 0.0
XaaGly: 0.0 ± 0.0
XaaHis: 0.0 ± 0.0
XaaIle: 0.0 ± 0.0
XaaLys: 0.0 ± 0.0
XaaLeu: 0.0 ± 0.0
XaaMet: 0.0 ± 0.0
XaaAsn: 0.0 ± 0.0
XaaPro: 0.0 ± 0.0
XaaGln: 0.0 ± 0.0
XaaArg: 0.0 ± 0.0
XaaSer: 0.0 ± 0.0
XaaThr: 0.0 ± 0.0
XaaVal: 0.0 ± 0.0
XaaTrp: 0.0 ± 0.0
XaaTyr: 0.0 ± 0.0
XaaXaa: 0.0 ± 0.0
Statistics based on 104 proteins (18750 amino acids)

Note: The error has been estimated by bootstrapping (x100) at the protein level.
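A protein-level bootstrap of this kind can be sketched as below. This is only an illustration under stated assumptions: whole proteins are resampled with replacement 100 times, and the error is taken as the standard deviation of the resampled frequencies. The page does not say which spread measure is actually reported, and the function names and toy sequences are hypothetical.

```python
import random
import statistics
from collections import Counter

def dipeptide_freq(proteins, dipeptide):
    """Per-mille frequency of one dipeptide over overlapping within-protein pairs."""
    counts = Counter()
    for seq in proteins:
        counts.update(a + b for a, b in zip(seq, seq[1:]))
    total = sum(counts.values())
    return 1000.0 * counts[dipeptide] / total if total else 0.0

def bootstrap_error(proteins, dipeptide, n_rounds=100, seed=0):
    """Resample whole proteins with replacement and return the standard
    deviation of the resampled frequencies (one plausible error definition)."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    estimates = []
    for _ in range(n_rounds):
        sample = rng.choices(proteins, k=len(proteins))
        estimates.append(dipeptide_freq(sample, dipeptide))
    return statistics.stdev(estimates)

toy = ["MAAGAA", "AAGG", "MKLV", "GAAL"]  # hypothetical sequences, not from SG4
print(dipeptide_freq(toy, "AA"), bootstrap_error(toy, "AA"))
```

Resampling proteins rather than individual residues keeps each sequence intact, so the spread reflects how much the estimate depends on which proteins happen to be in the proteome.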

The above dipeptide statistics (among other statistics for this proteome) can be downloaded from this CSV file
See this proteome in: uniprot_link
Proteome-pI is available under the Creative Commons Attribution-NoDerivs license; for more details see here

Reference: Kozlowski LP. Proteome-pI 2.0: Proteome Isoelectric Point Database Update. Nucleic Acids Res. 2021, doi: 10.1093/nar/gkab944
Contact: Lukasz P. Kozlowski