Amino acid dipeptide frequency for Mycobacterium phage MarkPhew

Apart from single amino acid frequencies, one can also calculate the so-called amino acid dipeptide frequency, i.e. how often each ordered pair of adjacent amino acids occurs.
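
As a minimal sketch of how such frequencies can be computed, the Python snippet below counts every overlapping pair of adjacent residues within each protein and normalizes by the total number of pairs. The function name `dipeptide_frequencies`, the use of one-letter residue codes, and the choice not to count pairs across protein boundaries are assumptions made for illustration; the exact procedure used to build this table is not stated on this page.

```python
from collections import Counter


def dipeptide_frequencies(proteins):
    """Return dipeptide frequencies in per mille for a list of sequences.

    Assumption: overlapping pairs are counted within each protein only,
    and the result is normalized by the total number of counted pairs.
    """
    counts = Counter()
    for seq in proteins:
        # Slide a window of width 2 along the sequence.
        for i in range(len(seq) - 1):
            counts[seq[i:i + 2]] += 1
    total = sum(counts.values())
    if total == 0:
        return {}
    return {pair: 1000.0 * n / total for pair, n in counts.items()}


# Toy input (hypothetical sequences, not taken from this proteome).
toy = ["AAC", "ACA"]
freqs = dipeptide_frequencies(toy)
```

With the toy input there are four pairs in total (AA, AC, AC, CA), so `freqs["AC"]` is half of the total and the values sum to 1000 ‰.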

There are 400 possible dipeptides (441 if Xaa is counted as an additional 21st amino acid representing all non-standard or unknown amino acids). Thus, if all dipeptides were equally likely in proteins, each would occur at a frequency of 0.25% (2.5‰). As this is not the case in nature, for better visibility dipeptides that occur more often than expected are marked in red, and those that are underrepresented are marked in blue in the table.
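
The comparison against the uniform expectation can be sketched as follows. The helper names `expected_per_mille` and `classify` are hypothetical, and it is an assumption that the colouring uses a simple greater-than or less-than comparison against 1/400; the page does not say whether the ± error is taken into account. The three observed values are taken from the table.

```python
def expected_per_mille(n_amino_acids=20):
    """Uniform expectation for one dipeptide, in per mille."""
    return 1000.0 / (n_amino_acids ** 2)


def classify(observed, expected):
    """Label a frequency relative to the uniform expectation."""
    if observed > expected:
        return "overrepresented"
    if observed < expected:
        return "underrepresented"
    return "as expected"


expected = expected_per_mille()  # 20 standard amino acids: 1000 / 400

# Values in per mille, copied from the table on this page.
observed = {"AlaAla": 23.497, "AlaHis": 2.795, "CysCys": 0.207}
labels = {pair: classify(value, expected) for pair, value in observed.items()}
```

Calling `expected_per_mille(21)` gives the corresponding baseline when Xaa is counted as a 21st symbol (1000 / 441).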

All values are given in per mille (‰), so they need to be multiplied by 10⁻³ to obtain fractions.
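
A small sketch of the unit conversion, with hypothetical helper names, using the AlaAla value from the table as input:

```python
def per_mille_to_fraction(value):
    """Convert a per mille value to a plain fraction (multiply by 10^-3)."""
    return value * 1e-3


def per_mille_to_percent(value):
    """Convert a per mille value to percent (divide by 10)."""
    return value / 10.0


ala_ala_fraction = per_mille_to_fraction(23.497)  # AlaAla, from the table
ala_ala_percent = per_mille_to_percent(23.497)
```

Here 23.497 ‰ corresponds to a fraction of about 0.0235, or roughly 2.35%.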

For more information see sequence space article on Wikipedia.

Ala Cys Asp Glu Phe Gly His Ile Lys Leu Met Asn Pro Gln Arg Ser Thr Val Trp Tyr Xaa
Ala
AlaAla: 23.497 ± 1.798
AlaCys: 0.828 ± 0.22
AlaAsp: 8.436 ± 0.724
AlaGlu: 8.798 ± 0.899
AlaPhe: 3.778 ± 0.562
AlaGly: 10.144 ± 1.184
AlaHis: 2.795 ± 0.37
AlaIle: 4.71 ± 0.491
AlaLys: 5.279 ± 0.593
AlaLeu: 11.593 ± 0.895
AlaMet: 3.312 ± 0.484
AlaAsn: 3.105 ± 0.426
AlaPro: 6.676 ± 0.66
AlaGln: 4.917 ± 0.699
AlaArg: 7.453 ± 0.66
AlaSer: 6.159 ± 0.706
AlaThr: 7.039 ± 0.737
AlaVal: 10.247 ± 0.68
AlaTrp: 3.002 ± 0.324
AlaTyr: 2.898 ± 0.433
AlaXaa: 0.0 ± 0.0
Cys
CysAla: 1.139 ± 0.252
CysCys: 0.207 ± 0.113
CysAsp: 0.828 ± 0.213
CysGlu: 0.88 ± 0.291
CysPhe: 0.207 ± 0.105
CysGly: 1.501 ± 0.393
CysHis: 0.155 ± 0.116
CysIle: 0.311 ± 0.111
CysLys: 0.518 ± 0.189
CysLeu: 0.725 ± 0.25
CysMet: 0.207 ± 0.126
CysAsn: 0.311 ± 0.123
CysPro: 0.725 ± 0.244
CysGln: 0.155 ± 0.108
CysArg: 0.776 ± 0.176
CysSer: 0.932 ± 0.29
CysThr: 0.828 ± 0.277
CysVal: 0.621 ± 0.175
CysTrp: 0.466 ± 0.147
CysTyr: 0.259 ± 0.115
CysXaa: 0.0 ± 0.0
Asp
AspAla: 8.177 ± 0.729
AspCys: 0.673 ± 0.201
AspAsp: 4.865 ± 0.606
AspGlu: 5.589 ± 0.632
AspPhe: 1.035 ± 0.231
AspGly: 7.297 ± 0.55
AspHis: 1.035 ± 0.204
AspIle: 1.656 ± 0.286
AspLys: 1.708 ± 0.343
AspLeu: 6.262 ± 0.61
AspMet: 1.708 ± 0.346
AspAsn: 1.397 ± 0.323
AspPro: 4.399 ± 0.466
AspGln: 1.708 ± 0.256
AspArg: 4.71 ± 0.553
AspSer: 2.536 ± 0.352
AspThr: 2.846 ± 0.359
AspVal: 4.606 ± 0.554
AspTrp: 0.725 ± 0.196
AspTyr: 1.397 ± 0.277
AspXaa: 0.0 ± 0.0
Glu
GluAla: 6.728 ± 0.812
GluCys: 1.035 ± 0.267
GluAsp: 3.157 ± 0.628
GluGlu: 1.139 ± 0.298
GluPhe: 1.915 ± 0.32
GluGly: 4.658 ± 0.55
GluHis: 1.967 ± 0.28
GluIle: 2.277 ± 0.384
GluLys: 1.397 ± 0.321
GluLeu: 5.486 ± 0.548
GluMet: 1.449 ± 0.264
GluAsn: 0.776 ± 0.22
GluPro: 3.054 ± 0.439
GluGln: 2.588 ± 0.361
GluArg: 4.813 ± 0.639
GluSer: 2.484 ± 0.378
GluThr: 2.329 ± 0.33
GluVal: 5.175 ± 0.723
GluTrp: 1.242 ± 0.252
GluTyr: 1.967 ± 0.364
GluXaa: 0.0 ± 0.0
Phe
PheAla: 3.468 ± 0.548
PheCys: 0.259 ± 0.15
PheAsp: 2.795 ± 0.387
PheGlu: 1.139 ± 0.196
PhePhe: 0.466 ± 0.151
PheGly: 3.209 ± 0.399
PheHis: 0.518 ± 0.151
PheIle: 0.828 ± 0.215
PheLys: 1.087 ± 0.321
PheLeu: 2.277 ± 0.317
PheMet: 0.569 ± 0.164
PheAsn: 1.087 ± 0.283
PhePro: 1.294 ± 0.311
PheGln: 0.828 ± 0.229
PheArg: 1.19 ± 0.268
PheSer: 1.19 ± 0.271
PheThr: 1.397 ± 0.298
PheVal: 2.484 ± 0.329
PheTrp: 0.414 ± 0.122
PheTyr: 0.414 ± 0.169
PheXaa: 0.0 ± 0.0
Gly
GlyAla: 10.506 ± 1.197
GlyCys: 0.983 ± 0.297
GlyAsp: 4.71 ± 0.554
GlyGlu: 5.331 ± 0.558
GlyPhe: 2.743 ± 0.369
GlyGly: 10.661 ± 1.639
GlyHis: 2.07 ± 0.36
GlyIle: 3.209 ± 0.583
GlyLys: 3.675 ± 0.488
GlyLeu: 7.246 ± 0.936
GlyMet: 2.122 ± 0.312
GlyAsn: 2.639 ± 0.383
GlyPro: 3.726 ± 0.484
GlyGln: 2.225 ± 0.332
GlyArg: 6.418 ± 0.661
GlySer: 5.279 ± 0.594
GlyThr: 6.055 ± 0.716
GlyVal: 7.142 ± 0.586
GlyTrp: 2.588 ± 0.313
GlyTyr: 2.536 ± 0.404
GlyXaa: 0.0 ± 0.0
His
HisAla: 2.536 ± 0.408
HisCys: 0.466 ± 0.157
HisAsp: 1.656 ± 0.32
HisGlu: 0.983 ± 0.192
HisPhe: 0.518 ± 0.178
HisGly: 2.225 ± 0.365
HisHis: 0.776 ± 0.217
HisIle: 0.621 ± 0.199
HisLys: 0.362 ± 0.145
HisLeu: 2.018 ± 0.292
HisMet: 0.414 ± 0.15
HisAsn: 0.621 ± 0.168
HisPro: 1.242 ± 0.247
HisGln: 0.466 ± 0.144
HisArg: 2.225 ± 0.374
HisSer: 1.087 ± 0.232
HisThr: 1.294 ± 0.241
HisVal: 2.329 ± 0.304
HisTrp: 0.518 ± 0.183
HisTyr: 0.362 ± 0.127
HisXaa: 0.0 ± 0.0
Ile
IleAla: 5.952 ± 0.521
IleCys: 0.207 ± 0.12
IleAsp: 3.002 ± 0.423
IleGlu: 3.623 ± 0.465
IlePhe: 0.518 ± 0.161
IleGly: 4.347 ± 0.744
IleHis: 0.414 ± 0.173
IleIle: 0.725 ± 0.202
IleLys: 1.242 ± 0.279
IleLeu: 2.018 ± 0.338
IleMet: 0.414 ± 0.148
IleAsn: 1.242 ± 0.241
IlePro: 1.915 ± 0.373
IleGln: 0.518 ± 0.146
IleArg: 2.174 ± 0.277
IleSer: 1.604 ± 0.245
IleThr: 2.381 ± 0.367
IleVal: 3.83 ± 0.474
IleTrp: 0.776 ± 0.212
IleTyr: 0.362 ± 0.124
IleXaa: 0.0 ± 0.0
Lys
LysAla: 3.675 ± 0.545
LysCys: 0.466 ± 0.165
LysAsp: 1.294 ± 0.258
LysGlu: 0.518 ± 0.163
LysPhe: 0.673 ± 0.161
LysGly: 2.846 ± 0.525
LysHis: 0.518 ± 0.167
LysIle: 1.501 ± 0.369
LysLys: 0.569 ± 0.202
LysLeu: 2.846 ± 0.295
LysMet: 0.983 ± 0.255
LysAsn: 0.518 ± 0.211
LysPro: 3.261 ± 0.494
LysGln: 0.932 ± 0.22
LysArg: 3.002 ± 0.378
LysSer: 1.087 ± 0.179
LysThr: 2.225 ± 0.322
LysVal: 2.329 ± 0.313
LysTrp: 0.518 ± 0.177
LysTyr: 1.035 ± 0.262
LysXaa: 0.0 ± 0.0
Leu
LeuAla: 12.68 ± 0.858
LeuCys: 1.035 ± 0.247
LeuAsp: 7.815 ± 0.558
LeuGlu: 1.863 ± 0.41
LeuPhe: 2.277 ± 0.43
LeuGly: 6.78 ± 0.672
LeuHis: 2.122 ± 0.331
LeuIle: 3.364 ± 0.433
LeuLys: 1.811 ± 0.258
LeuLeu: 6.055 ± 0.751
LeuMet: 1.553 ± 0.264
LeuAsn: 2.329 ± 0.333
LeuPro: 4.244 ± 0.517
LeuGln: 2.691 ± 0.388
LeuArg: 6.676 ± 0.521
LeuSer: 5.382 ± 0.557
LeuThr: 5.175 ± 0.447
LeuVal: 5.227 ± 0.494
LeuTrp: 1.397 ± 0.252
LeuTyr: 1.915 ± 0.31
LeuXaa: 0.0 ± 0.0
Met
MetAla: 3.054 ± 0.331
MetCys: 0.155 ± 0.098
MetAsp: 0.983 ± 0.207
MetGlu: 0.466 ± 0.132
MetPhe: 0.88 ± 0.181
MetGly: 1.501 ± 0.318
MetHis: 0.673 ± 0.173
MetIle: 1.139 ± 0.255
MetLys: 0.569 ± 0.164
MetLeu: 1.242 ± 0.267
MetMet: 0.259 ± 0.106
MetAsn: 0.569 ± 0.147
MetPro: 1.553 ± 0.286
MetGln: 0.569 ± 0.202
MetArg: 1.811 ± 0.337
MetSer: 2.432 ± 0.327
MetThr: 1.604 ± 0.275
MetVal: 1.863 ± 0.292
MetTrp: 0.311 ± 0.1
MetTyr: 0.466 ± 0.177
MetXaa: 0.0 ± 0.0
Asn
AsnAla: 4.14 ± 0.494
AsnCys: 0.311 ± 0.116
AsnAsp: 1.242 ± 0.317
AsnGlu: 0.725 ± 0.15
AsnPhe: 0.725 ± 0.216
AsnGly: 3.416 ± 0.497
AsnHis: 0.311 ± 0.113
AsnIle: 0.673 ± 0.259
AsnLys: 0.673 ± 0.138
AsnLeu: 1.76 ± 0.333
AsnMet: 0.207 ± 0.095
AsnAsn: 0.725 ± 0.204
AsnPro: 2.225 ± 0.385
AsnGln: 0.725 ± 0.137
AsnArg: 2.07 ± 0.329
AsnSer: 0.88 ± 0.24
AsnThr: 1.449 ± 0.254
AsnVal: 2.174 ± 0.364
AsnTrp: 0.259 ± 0.112
AsnTyr: 0.828 ± 0.187
AsnXaa: 0.0 ± 0.0
Pro
ProAla: 8.436 ± 0.632
ProCys: 0.621 ± 0.209
ProAsp: 3.83 ± 0.365
ProGlu: 4.089 ± 0.533
ProPhe: 1.553 ± 0.234
ProGly: 5.848 ± 0.53
ProHis: 1.139 ± 0.281
ProIle: 1.967 ± 0.333
ProLys: 1.915 ± 0.282
ProLeu: 4.347 ± 0.448
ProMet: 0.88 ± 0.189
ProAsn: 1.19 ± 0.434
ProPro: 2.795 ± 0.378
ProGln: 1.397 ± 0.318
ProArg: 3.675 ± 0.548
ProSer: 3.416 ± 0.447
ProThr: 3.209 ± 0.475
ProVal: 4.658 ± 0.585
ProTrp: 0.932 ± 0.184
ProTyr: 1.656 ± 0.249
ProXaa: 0.0 ± 0.0
Gln
GlnAla: 4.865 ± 0.526
GlnCys: 0.259 ± 0.132
GlnAsp: 1.139 ± 0.234
GlnGlu: 1.035 ± 0.25
GlnPhe: 1.035 ± 0.344
GlnGly: 2.484 ± 0.348
GlnHis: 0.88 ± 0.171
GlnIle: 1.553 ± 0.255
GlnLys: 0.414 ± 0.133
GlnLeu: 2.484 ± 0.441
GlnMet: 0.776 ± 0.192
GlnAsn: 0.518 ± 0.185
GlnPro: 1.967 ± 0.233
GlnGln: 1.346 ± 0.243
GlnArg: 2.174 ± 0.284
GlnSer: 1.604 ± 0.262
GlnThr: 1.967 ± 0.247
GlnVal: 3.105 ± 0.493
GlnTrp: 0.932 ± 0.241
GlnTyr: 0.673 ± 0.178
GlnXaa: 0.0 ± 0.0
Arg
ArgAla: 7.246 ± 0.716
ArgCys: 1.294 ± 0.283
ArgAsp: 4.14 ± 0.563
ArgGlu: 4.865 ± 0.683
ArgPhe: 1.76 ± 0.307
ArgGly: 4.14 ± 0.433
ArgHis: 1.863 ± 0.419
ArgIle: 3.105 ± 0.526
ArgLys: 3.002 ± 0.45
ArgLeu: 6.314 ± 0.528
ArgMet: 2.277 ± 0.312
ArgAsn: 2.381 ± 0.41
ArgPro: 4.192 ± 0.557
ArgGln: 2.277 ± 0.316
ArgArg: 6.521 ± 0.815
ArgSer: 3.468 ± 0.348
ArgThr: 3.623 ± 0.547
ArgVal: 4.192 ± 0.511
ArgTrp: 1.915 ± 0.394
ArgTyr: 1.967 ± 0.39
ArgXaa: 0.0 ± 0.0
Ser
SerAla: 6.78 ± 0.674
SerCys: 0.311 ± 0.152
SerAsp: 3.105 ± 0.334
SerGlu: 2.898 ± 0.381
SerPhe: 1.553 ± 0.321
SerGly: 4.347 ± 0.607
SerHis: 1.397 ± 0.229
SerIle: 2.329 ± 0.406
SerLys: 1.139 ± 0.19
SerLeu: 4.037 ± 0.4
SerMet: 1.656 ± 0.261
SerAsn: 1.604 ± 0.269
SerPro: 3.571 ± 0.385
SerGln: 1.708 ± 0.251
SerArg: 2.95 ± 0.367
SerSer: 2.898 ± 0.549
SerThr: 2.95 ± 0.402
SerVal: 3.985 ± 0.507
SerTrp: 1.087 ± 0.215
SerTyr: 1.604 ± 0.267
SerXaa: 0.0 ± 0.0
Thr
ThrAla: 6.935 ± 0.773
ThrCys: 0.983 ± 0.257
ThrAsp: 3.105 ± 0.429
ThrGlu: 3.519 ± 0.403
ThrPhe: 1.967 ± 0.312
ThrGly: 6.366 ± 0.628
ThrHis: 1.397 ± 0.294
ThrIle: 2.95 ± 0.4
ThrLys: 1.76 ± 0.376
ThrLeu: 4.244 ± 0.464
ThrMet: 0.776 ± 0.199
ThrAsn: 1.346 ± 0.235
ThrPro: 3.623 ± 0.517
ThrGln: 1.76 ± 0.317
ThrArg: 2.743 ± 0.456
ThrSer: 2.588 ± 0.465
ThrThr: 3.571 ± 0.436
ThrVal: 5.279 ± 0.525
ThrTrp: 1.087 ± 0.211
ThrTyr: 1.708 ± 0.373
ThrXaa: 0.0 ± 0.0
Val
ValAla: 9.937 ± 0.836
ValCys: 1.087 ± 0.233
ValAsp: 5.331 ± 0.649
ValGlu: 5.952 ± 0.69
ValPhe: 2.07 ± 0.298
ValGly: 6.469 ± 0.633
ValHis: 2.07 ± 0.459
ValIle: 3.105 ± 0.463
ValLys: 2.898 ± 0.384
ValLeu: 6.469 ± 0.568
ValMet: 1.656 ± 0.265
ValAsn: 1.501 ± 0.229
ValPro: 5.02 ± 0.555
ValGln: 2.329 ± 0.304
ValArg: 4.71 ± 0.471
ValSer: 4.14 ± 0.474
ValThr: 4.554 ± 0.403
ValVal: 6.469 ± 0.826
ValTrp: 1.915 ± 0.282
ValTyr: 1.811 ± 0.244
ValXaa: 0.0 ± 0.0
Trp
TrpAla: 2.536 ± 0.428
TrpCys: 0.362 ± 0.118
TrpAsp: 0.828 ± 0.229
TrpGlu: 0.776 ± 0.218
TrpPhe: 0.673 ± 0.176
TrpGly: 1.242 ± 0.241
TrpHis: 0.466 ± 0.163
TrpIle: 0.88 ± 0.192
TrpLys: 0.207 ± 0.11
TrpLeu: 2.639 ± 0.384
TrpMet: 0.414 ± 0.139
TrpAsn: 0.776 ± 0.171
TrpPro: 1.035 ± 0.273
TrpGln: 1.242 ± 0.201
TrpArg: 2.329 ± 0.376
TrpSer: 1.087 ± 0.243
TrpThr: 1.19 ± 0.219
TrpVal: 1.346 ± 0.31
TrpTrp: 0.414 ± 0.155
TrpTyr: 0.362 ± 0.122
TrpXaa: 0.0 ± 0.0
Tyr
TyrAla: 2.432 ± 0.338
TyrCys: 0.207 ± 0.116
TyrAsp: 2.07 ± 0.364
TyrGlu: 1.449 ± 0.279
TyrPhe: 0.569 ± 0.179
TyrGly: 2.381 ± 0.37
TyrHis: 0.155 ± 0.101
TyrIle: 0.362 ± 0.144
TyrLys: 0.569 ± 0.177
TyrLeu: 2.329 ± 0.362
TyrMet: 0.414 ± 0.141
TyrAsn: 0.88 ± 0.2
TyrPro: 1.035 ± 0.234
TyrGln: 0.828 ± 0.24
TyrArg: 2.122 ± 0.366
TyrSer: 1.656 ± 0.277
TyrThr: 2.018 ± 0.345
TyrVal: 2.432 ± 0.326
TyrTrp: 0.311 ± 0.122
TyrTyr: 0.621 ± 0.187
TyrXaa: 0.0 ± 0.0
Xaa
XaaAla: 0.0 ± 0.0
XaaCys: 0.0 ± 0.0
XaaAsp: 0.0 ± 0.0
XaaGlu: 0.0 ± 0.0
XaaPhe: 0.0 ± 0.0
XaaGly: 0.0 ± 0.0
XaaHis: 0.0 ± 0.0
XaaIle: 0.0 ± 0.0
XaaLys: 0.0 ± 0.0
XaaLeu: 0.0 ± 0.0
XaaMet: 0.0 ± 0.0
XaaAsn: 0.0 ± 0.0
XaaPro: 0.0 ± 0.0
XaaGln: 0.0 ± 0.0
XaaArg: 0.0 ± 0.0
XaaSer: 0.0 ± 0.0
XaaThr: 0.0 ± 0.0
XaaVal: 0.0 ± 0.0
XaaTrp: 0.0 ± 0.0
XaaTyr: 0.0 ± 0.0
XaaXaa: 0.0 ± 0.0
Statistics based on 102 proteins (19323 amino acids)

Note: The error was estimated by bootstrapping (100 replicates) at the protein level.
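
Protein-level bootstrapping can be sketched roughly as below: whole proteins are resampled with replacement, the frequency is recomputed for each replicate, and the spread of the replicate values serves as the error. This is only an illustration under stated assumptions; the function names, the one-letter residue codes, the fixed seed, and the use of the sample standard deviation as the error measure are not taken from the database and may differ from what was actually done.

```python
import random
from statistics import stdev


def frequency_per_mille(proteins, dipeptide):
    """Frequency of one dipeptide (overlapping pairs, within proteins)."""
    hits = 0
    total = 0
    for seq in proteins:
        for i in range(len(seq) - 1):
            total += 1
            if seq[i:i + 2] == dipeptide:
                hits += 1
    return 1000.0 * hits / total if total else 0.0


def bootstrap_error(proteins, dipeptide, n_replicates=100, seed=0):
    """Standard deviation of the frequency over protein-level resamples."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    estimates = []
    for _ in range(n_replicates):
        # Resample whole proteins with replacement, keeping the set size.
        sample = rng.choices(proteins, k=len(proteins))
        estimates.append(frequency_per_mille(sample, dipeptide))
    return stdev(estimates)


# Toy input (hypothetical sequences, not taken from this proteome).
toy = ["AAAA", "CCCC", "ACAC", "AACC"]
error_aa = bootstrap_error(toy, "AA")
```

Resampling proteins rather than individual residues keeps each sequence intact, so the estimate reflects variation between proteins. If every protein in the input is identical, all replicates give the same frequency and the estimated error is zero.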

The dipeptide statistics above (along with other statistics for this proteome) can be downloaded from this CSV file
See this proteome in: uniprot_link
Proteome-pI is available under a Creative Commons Attribution-NoDerivs license; for more details see here

Reference: Kozlowski LP. Proteome-pI 2.0: Proteome Isoelectric Point Database Update. Nucleic Acids Res. 2021, doi: 10.1093/nar/gkab944
Contact: Lukasz P. Kozlowski