Amino acid dipeptide frequency for Mycobacterium phage Soul22

Apart from single amino acid frequencies, one can also calculate the so-called amino acid dipeptide frequency.

There are 400 possible dipeptides (441 if Xaa is counted as an additional, 21st amino acid representing all non-standard or unknown amino acids). Thus, if dipeptides occurred randomly in proteins, each of the 400 dipeptides would be present at a frequency of 0.25% (2.5‰). As this is not the case in nature, for better visibility all dipeptides that occur more often than expected are marked in red in the table, and those that are underrepresented are marked in blue.
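The counting described above can be sketched in a few lines of Python. This is a minimal illustration, not the code used to build this page: the function names, the one-letter input alphabet, the mapping of every non-standard letter to X (Xaa), and the choice to normalise by the number of overlapping dipeptides are all assumptions.

```python
from collections import Counter
from itertools import product

STANDARD = "ACDEFGHIKLMNPQRSTVWY"   # 20 standard amino acids, one-letter codes
ALPHABET = STANDARD + "X"           # X stands in for Xaa (assumed mapping)
UNIFORM_PER_MILLE = 1000.0 / 400    # 2.5 per mille, i.e. 0.25 %


def dipeptide_per_mille(proteins):
    """Return {dipeptide: frequency in per mille} for a list of sequences."""
    counts = Counter()
    for seq in proteins:
        # Replace anything outside the 20 standard letters with X.
        cleaned = "".join(aa if aa in STANDARD else "X" for aa in seq.upper())
        # Overlapping windows of length 2, never crossing protein boundaries.
        counts.update(cleaned[i:i + 2] for i in range(len(cleaned) - 1))
    total = sum(counts.values())
    return {
        a + b: (1000.0 * counts[a + b] / total if total else 0.0)
        for a, b in product(ALPHABET, repeat=2)
    }


def classify(freq_per_mille, expected=UNIFORM_PER_MILLE):
    """Label a dipeptide relative to the uniform expectation."""
    if freq_per_mille > expected:
        return "over"
    if freq_per_mille < expected:
        return "under"
    return "expected"


freqs = dipeptide_per_mille(["MAAGL", "MAAK"])  # toy sequences, not real data
print(len(freqs), classify(freqs["AA"]), classify(freqs["WW"]))
```

With real data, `classify` would reproduce the red/blue split only if the page uses the same uniform 2.5‰ threshold, which is assumed here rather than confirmed.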

All values are given in per mille (‰), so they need to be multiplied by 10⁻³ to obtain fractions.
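As a small worked example of the unit conversion, the sketch below turns a per mille value into a fraction and then into an approximate raw count. The helper names are hypothetical, and using the 17615 amino acids reported in the page's statistics as the total is an assumption, since the page does not state which denominator it uses.

```python
def per_mille_to_fraction(value_per_mille):
    """Convert a per mille value (‰) to a plain fraction."""
    return value_per_mille * 1e-3


def implied_count(value_per_mille, total):
    """Approximate raw count behind a per mille value for a given total."""
    return round(per_mille_to_fraction(value_per_mille) * total)


ala_ala = 14.193          # AlaAla value from the table, in per mille
total_residues = 17615    # assumed denominator (amino acids in the proteome)

print(per_mille_to_fraction(ala_ala))          # about 0.0142
print(implied_count(ala_ala, total_residues))  # 250
```

The tabulated values appear consistent with this denominator (for example, 250 / 17615 ≈ 14.193‰), but that is an observation from the numbers, not a documented fact.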

For more information, see the sequence space article on Wikipedia.

Rows give the first residue of the dipeptide, columns the second; each cell is the frequency in ‰ ± the estimated error.

| 1st \ 2nd | Ala | Cys | Asp | Glu | Phe | Gly | His | Ile | Lys | Leu | Met | Asn | Pro | Gln | Arg | Ser | Thr | Val | Trp | Tyr | Xaa |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Ala | 14.193 ± 1.816 | 1.249 ± 0.322 | 5.961 ± 0.533 | 7.835 ± 1.04 | 3.52 ± 0.418 | 8.516 ± 1.201 | 2.101 ± 0.337 | 5.053 ± 0.477 | 4.428 ± 0.418 | 8.686 ± 0.937 | 3.52 ± 0.496 | 2.725 ± 0.424 | 4.542 ± 0.371 | 4.428 ± 0.471 | 7.38 ± 0.826 | 4.769 ± 0.475 | 5.45 ± 0.812 | 7.835 ± 0.656 | 2.328 ± 0.537 | 2.498 ± 0.369 | 0.0 ± 0.0 |
| Cys | 1.306 ± 0.307 | 0.227 ± 0.112 | 1.192 ± 0.343 | 0.738 ± 0.18 | 0.341 ± 0.133 | 1.76 ± 0.435 | 0.284 ± 0.135 | 0.511 ± 0.189 | 0.681 ± 0.204 | 0.908 ± 0.292 | 0.17 ± 0.108 | 0.625 ± 0.199 | 1.022 ± 0.235 | 0.454 ± 0.185 | 0.738 ± 0.211 | 1.192 ± 0.282 | 0.681 ± 0.221 | 0.738 ± 0.23 | 0.284 ± 0.13 | 0.227 ± 0.107 | 0.0 ± 0.0 |
| Asp | 7.551 ± 0.65 | 0.965 ± 0.279 | 4.372 ± 0.661 | 4.655 ± 0.598 | 2.328 ± 0.353 | 5.45 ± 0.533 | 1.59 ± 0.339 | 2.782 ± 0.397 | 1.987 ± 0.29 | 4.826 ± 0.59 | 1.363 ± 0.312 | 1.646 ± 0.336 | 3.747 ± 0.442 | 2.271 ± 0.443 | 4.769 ± 0.598 | 2.839 ± 0.404 | 3.123 ± 0.358 | 4.485 ± 0.475 | 1.703 ± 0.296 | 1.59 ± 0.324 | 0.0 ± 0.0 |
| Glu | 6.188 ± 0.597 | 1.363 ± 0.295 | 4.031 ± 0.54 | 3.066 ± 0.616 | 2.668 ± 0.355 | 3.577 ± 0.516 | 1.249 ± 0.284 | 2.441 ± 0.441 | 2.101 ± 0.353 | 4.939 ± 0.519 | 1.192 ± 0.23 | 2.157 ± 0.35 | 3.804 ± 0.542 | 2.612 ± 0.341 | 3.804 ± 0.671 | 3.633 ± 0.567 | 3.236 ± 0.444 | 4.712 ± 0.422 | 1.363 ± 0.256 | 1.363 ± 0.324 | 0.0 ± 0.0 |
| Phe | 3.236 ± 0.391 | 0.568 ± 0.18 | 2.101 ± 0.319 | 2.101 ± 0.331 | 0.681 ± 0.181 | 2.782 ± 0.425 | 0.568 ± 0.163 | 1.135 ± 0.263 | 1.249 ± 0.32 | 2.612 ± 0.411 | 0.965 ± 0.246 | 1.079 ± 0.225 | 1.59 ± 0.311 | 0.795 ± 0.207 | 1.476 ± 0.292 | 2.101 ± 0.363 | 2.214 ± 0.352 | 1.76 ± 0.32 | 0.568 ± 0.193 | 0.795 ± 0.208 | 0.0 ± 0.0 |
| Gly | 9.368 ± 1.297 | 0.738 ± 0.21 | 5.393 ± 0.52 | 4.428 ± 0.471 | 2.441 ± 0.377 | 10.9 ± 2.152 | 2.101 ± 0.301 | 3.52 ± 0.662 | 3.406 ± 0.434 | 6.131 ± 0.649 | 1.59 ± 0.3 | 2.839 ± 0.431 | 3.747 ± 0.421 | 3.577 ± 0.599 | 5.053 ± 0.489 | 4.939 ± 0.864 | 6.699 ± 0.759 | 6.756 ± 0.629 | 2.441 ± 0.353 | 2.328 ± 0.376 | 0.0 ± 0.0 |
| His | 1.703 ± 0.278 | 0.568 ± 0.193 | 1.363 ± 0.32 | 0.965 ± 0.249 | 0.625 ± 0.191 | 2.044 ± 0.285 | 0.738 ± 0.199 | 1.135 ± 0.318 | 0.795 ± 0.204 | 2.044 ± 0.368 | 0.284 ± 0.125 | 0.852 ± 0.234 | 1.419 ± 0.222 | 0.738 ± 0.198 | 1.703 ± 0.364 | 0.965 ± 0.213 | 1.192 ± 0.241 | 1.533 ± 0.299 | 0.681 ± 0.205 | 0.965 ± 0.204 | 0.0 ± 0.0 |
| Ile | 4.882 ± 0.655 | 0.795 ± 0.288 | 4.088 ± 0.506 | 3.69 ± 0.408 | 0.795 ± 0.198 | 4.201 ± 0.572 | 1.192 ± 0.214 | 1.533 ± 0.312 | 1.59 ± 0.283 | 2.271 ± 0.409 | 0.738 ± 0.219 | 1.817 ± 0.335 | 3.293 ± 0.393 | 1.476 ± 0.254 | 3.35 ± 0.498 | 1.987 ± 0.365 | 3.861 ± 0.497 | 2.555 ± 0.296 | 1.249 ± 0.25 | 0.908 ± 0.193 | 0.0 ± 0.0 |
| Lys | 4.144 ± 0.59 | 0.341 ± 0.121 | 2.271 ± 0.351 | 1.192 ± 0.3 | 0.908 ± 0.203 | 3.179 ± 0.382 | 1.419 ± 0.302 | 1.59 ± 0.253 | 1.419 ± 0.28 | 2.612 ± 0.402 | 0.681 ± 0.168 | 0.852 ± 0.26 | 2.725 ± 0.359 | 1.76 ± 0.374 | 3.463 ± 0.503 | 1.817 ± 0.338 | 2.214 ± 0.33 | 2.214 ± 0.307 | 0.681 ± 0.174 | 0.852 ± 0.228 | 0.0 ± 0.0 |
| Leu | 9.027 ± 0.714 | 0.795 ± 0.214 | 5.337 ± 0.64 | 3.747 ± 0.448 | 1.874 ± 0.317 | 6.131 ± 0.7 | 1.59 ± 0.392 | 4.088 ± 0.55 | 2.952 ± 0.423 | 5.621 ± 0.632 | 1.022 ± 0.24 | 1.93 ± 0.321 | 5.393 ± 0.55 | 2.839 ± 0.429 | 4.372 ± 0.519 | 5.791 ± 0.601 | 5.621 ± 0.603 | 4.769 ± 0.568 | 1.419 ± 0.403 | 1.533 ± 0.267 | 0.0 ± 0.0 |
| Met | 1.987 ± 0.391 | 0.284 ± 0.129 | 0.965 ± 0.173 | 0.852 ± 0.178 | 0.965 ± 0.284 | 1.533 ± 0.376 | 0.227 ± 0.12 | 1.022 ± 0.211 | 0.625 ± 0.212 | 1.874 ± 0.293 | 0.284 ± 0.119 | 1.135 ± 0.25 | 1.079 ± 0.207 | 0.454 ± 0.149 | 1.363 ± 0.255 | 2.441 ± 0.334 | 2.271 ± 0.325 | 1.022 ± 0.228 | 0.341 ± 0.148 | 0.397 ± 0.12 | 0.0 ± 0.0 |
| Asn | 3.406 ± 0.482 | 0.341 ± 0.144 | 1.419 ± 0.197 | 1.76 ± 0.288 | 0.511 ± 0.14 | 4.315 ± 0.535 | 0.908 ± 0.25 | 1.363 ± 0.283 | 0.965 ± 0.193 | 2.839 ± 0.38 | 0.681 ± 0.189 | 1.249 ± 0.243 | 2.612 ± 0.411 | 0.568 ± 0.199 | 1.817 ± 0.385 | 1.76 ± 0.282 | 2.498 ± 0.304 | 1.987 ± 0.271 | 0.625 ± 0.142 | 0.852 ± 0.198 | 0.0 ± 0.0 |
| Pro | 5.904 ± 0.545 | 0.908 ± 0.23 | 4.542 ± 0.503 | 4.542 ± 0.479 | 1.987 ± 0.336 | 5.791 ± 0.623 | 1.135 ± 0.24 | 1.93 ± 0.333 | 2.271 ± 0.369 | 3.066 ± 0.363 | 1.306 ± 0.258 | 2.384 ± 0.283 | 4.088 ± 0.541 | 2.044 ± 0.345 | 4.031 ± 0.549 | 2.441 ± 0.363 | 3.52 ± 0.478 | 4.485 ± 0.499 | 0.795 ± 0.217 | 1.59 ± 0.267 | 0.0 ± 0.0 |
| Gln | 4.201 ± 0.583 | 0.397 ± 0.183 | 1.703 ± 0.253 | 2.044 ± 0.339 | 1.249 ± 0.254 | 2.441 ± 0.367 | 1.022 ± 0.225 | 2.157 ± 0.355 | 1.76 ± 0.382 | 3.179 ± 0.345 | 0.908 ± 0.236 | 1.022 ± 0.326 | 2.555 ± 0.44 | 1.646 ± 0.311 | 1.93 ± 0.379 | 1.93 ± 0.272 | 1.59 ± 0.254 | 2.498 ± 0.407 | 0.738 ± 0.192 | 1.192 ± 0.267 | 0.0 ± 0.0 |
| Arg | 6.302 ± 0.591 | 1.135 ± 0.332 | 3.747 ± 0.494 | 4.201 ± 0.651 | 2.839 ± 0.404 | 4.939 ± 0.483 | 1.363 ± 0.315 | 3.747 ± 0.503 | 1.987 ± 0.322 | 5.904 ± 0.708 | 2.157 ± 0.413 | 1.987 ± 0.363 | 3.577 ± 0.444 | 2.952 ± 0.43 | 6.302 ± 0.928 | 3.69 ± 0.405 | 2.952 ± 0.356 | 4.599 ± 0.596 | 2.044 ± 0.281 | 2.044 ± 0.353 | 0.0 ± 0.0 |
| Ser | 5.904 ± 0.885 | 0.568 ± 0.192 | 3.633 ± 0.503 | 3.406 ± 0.498 | 1.817 ± 0.376 | 6.699 ± 0.972 | 0.908 ± 0.255 | 2.725 ± 0.355 | 2.101 ± 0.347 | 4.315 ± 0.504 | 1.306 ± 0.219 | 2.271 ± 0.356 | 3.406 ± 0.484 | 1.987 ± 0.346 | 4.428 ± 0.569 | 2.782 ± 0.498 | 3.747 ± 0.419 | 4.031 ± 0.417 | 1.306 ± 0.297 | 1.192 ± 0.249 | 0.0 ± 0.0 |
| Thr | 6.472 ± 0.779 | 0.681 ± 0.233 | 4.372 ± 0.487 | 3.463 ± 0.378 | 1.59 ± 0.367 | 5.564 ± 0.832 | 1.363 ± 0.23 | 3.917 ± 0.559 | 1.646 ± 0.286 | 5.11 ± 0.467 | 0.795 ± 0.209 | 2.214 ± 0.372 | 4.655 ± 0.64 | 1.646 ± 0.285 | 3.974 ± 0.529 | 4.428 ± 0.613 | 4.372 ± 0.694 | 4.542 ± 0.599 | 1.306 ± 0.228 | 1.306 ± 0.244 | 0.0 ± 0.0 |
| Val | 7.437 ± 0.756 | 1.306 ± 0.255 | 4.315 ± 0.555 | 4.655 ± 0.569 | 1.817 ± 0.223 | 4.939 ± 0.517 | 1.76 ± 0.29 | 3.236 ± 0.392 | 2.839 ± 0.332 | 4.769 ± 0.561 | 1.249 ± 0.256 | 2.101 ± 0.312 | 3.577 ± 0.368 | 2.157 ± 0.319 | 4.655 ± 0.624 | 5.961 ± 0.876 | 4.485 ± 0.533 | 5.848 ± 0.635 | 1.646 ± 0.278 | 1.079 ± 0.273 | 0.0 ± 0.0 |
| Trp | 1.476 ± 0.266 | 0.284 ± 0.129 | 1.703 ± 0.327 | 0.908 ± 0.224 | 0.454 ± 0.169 | 1.419 ± 0.371 | 0.227 ± 0.102 | 1.476 ± 0.304 | 0.908 ± 0.202 | 2.044 ± 0.32 | 0.625 ± 0.181 | 0.568 ± 0.165 | 1.022 ± 0.246 | 0.852 ± 0.206 | 1.703 ± 0.332 | 2.214 ± 0.506 | 1.76 ± 0.383 | 1.646 ± 0.302 | 0.738 ± 0.205 | 0.852 ± 0.198 | 0.0 ± 0.0 |
| Tyr | 2.328 ± 0.379 | 0.568 ± 0.218 | 1.533 ± 0.297 | 1.249 ± 0.267 | 0.965 ± 0.292 | 2.214 ± 0.4 | 0.568 ± 0.149 | 0.908 ± 0.201 | 0.625 ± 0.198 | 2.101 ± 0.394 | 0.227 ± 0.119 | 1.022 ± 0.247 | 0.908 ± 0.242 | 0.908 ± 0.252 | 2.214 ± 0.378 | 0.852 ± 0.199 | 1.93 ± 0.277 | 1.817 ± 0.258 | 0.625 ± 0.205 | 0.625 ± 0.179 | 0.0 ± 0.0 |
| Xaa | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 |
Statistics based on 102 proteins (17615 amino acids)

Note: The error has been estimated by bootstrapping (×100) at the protein level.
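A protein-level bootstrap of this kind could look roughly like the sketch below. The page states only that bootstrapping was done 100 times at the protein level. Resampling whole proteins with replacement and reporting the standard deviation of the replicate frequencies as the error are assumptions, as are all function and parameter names.

```python
import random
import statistics
from collections import Counter


def dipeptide_freq(proteins, dipeptide):
    """Frequency (per mille) of one dipeptide over a list of sequences."""
    counts = Counter()
    for seq in proteins:
        counts.update(seq[i:i + 2] for i in range(len(seq) - 1))
    total = sum(counts.values())
    return 1000.0 * counts[dipeptide] / total if total else 0.0


def bootstrap_error(proteins, dipeptide, n_resamples=100, seed=0):
    """Standard deviation of the frequency across bootstrap replicates.

    Each replicate draws len(proteins) whole proteins with replacement,
    so the resampling unit is the protein, not the individual residue.
    """
    rng = random.Random(seed)
    estimates = []
    for _ in range(n_resamples):
        sample = rng.choices(proteins, k=len(proteins))
        estimates.append(dipeptide_freq(sample, dipeptide))
    return statistics.pstdev(estimates)


toy = ["MAAGL", "MAAK", "MGGAA", "MKLV"]  # toy sequences, not real data
print(dipeptide_freq(toy, "AA"), bootstrap_error(toy, "AA"))
```

Resampling at the protein level keeps dipeptides from the same protein together, so the error reflects variation between proteins rather than treating every dipeptide as independent.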

The dipeptide statistics above (along with other statistics for this proteome) can be downloaded from this CSV file
See this proteome in: uniprot_link
Proteome-pI is available under the Creative Commons Attribution-NoDerivs license; for more details see here

Reference: Kozlowski LP. Proteome-pI 2.0: Proteome Isoelectric Point Database Update. Nucleic Acids Res. 2021, doi: 10.1093/nar/gkab944
Contact: Lukasz P. Kozlowski