Amino acid dipeptide frequency for Mycobacterium phage Mova

Apart from single amino acid frequencies, one can also calculate the so-called amino acid dipeptide frequency, that is, how often each ordered pair of adjacent amino acids occurs.
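As an illustration of the dipeptide frequency described above, here is a minimal Python sketch. The function name dipeptide_frequencies is hypothetical, and the choices made here (overlapping pairs counted within each protein only, normalization by the total number of pairs, any non-standard letter mapped to X for Xaa) are assumptions; the exact procedure used for this table is not stated in the text.

```python
from collections import Counter

# The 20 standard amino acids in one-letter code.
STANDARD = set("ACDEFGHIKLMNPQRSTVWY")


def dipeptide_frequencies(proteins):
    """Return {dipeptide: frequency in per mille} for a list of sequences."""
    counts = Counter()
    for seq in proteins:
        # Assumption: anything outside the standard 20 is treated as X (Xaa).
        seq = "".join(aa if aa in STANDARD else "X" for aa in seq.upper())
        # Count overlapping adjacent pairs, never across protein boundaries.
        counts.update(seq[i:i + 2] for i in range(len(seq) - 1))
    total = sum(counts.values())
    if total == 0:
        return {}
    return {pair: 1000.0 * n / total for pair, n in counts.items()}


# Toy input, not real phage data: "B" is non-standard and becomes X.
freqs = dipeptide_frequencies(["MAAG", "AAB"])
for pair, value in sorted(freqs.items()):
    print(pair, round(value, 3))
```

Keeping the counts in a Counter makes it easy to add further proteins before normalizing.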

There are 400 possibilities (441 if Xaa is counted as an additional, 21st amino acid representing all non-standard or unknown amino acids). Thus, if dipeptides occurred randomly in proteins, with every amino acid equally likely, each dipeptide would be present at a frequency of 0.25% (2.5‰). As this is not the case in nature, for better visibility all dipeptides that occur more often than expected are marked in red in the table, and those that are underrepresented are marked in blue.
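The expected value and the over/under labelling can be sketched in a few lines of Python. This assumes the threshold is exactly the uniform expectation over the 400 standard dipeptides, which the text implies but does not spell out; the function name classify is hypothetical, and the three example values are taken from the table.

```python
N_STANDARD = 20
n_dipeptides = N_STANDARD ** 2               # 400 ordered pairs
expected_per_mille = 1000.0 / n_dipeptides   # 0.25% expressed in per mille


def classify(observed_per_mille, expected=expected_per_mille):
    """Label a dipeptide relative to the uniform expectation."""
    if observed_per_mille > expected:
        return "over"    # shown in red in the original table
    if observed_per_mille < expected:
        return "under"   # shown in blue in the original table
    return "as expected"


examples = {"AlaAla": 12.903, "CysCys": 0.108, "GluGlu": 2.494}
labels = {name: classify(value) for name, value in examples.items()}
print(expected_per_mille, labels)
```

With Xaa included, the same calculation would use 21 ** 2 = 441 pairs and a slightly lower expectation.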

All values are given in per mille (‰), so they need to be multiplied by 10⁻³ to obtain fractions; for example, 12.903‰ corresponds to a fraction of 0.012903.

For more information, see the sequence space article on Wikipedia.

Ala Cys Asp Glu Phe Gly His Ile Lys Leu Met Asn Pro Gln Arg Ser Thr Val Trp Tyr Xaa
Ala
AlaAla: 12.903 ± 1.342
AlaCys: 1.139 ± 0.27
AlaAsp: 6.777 ± 0.616
AlaGlu: 7.373 ± 0.669
AlaPhe: 2.494 ± 0.375
AlaGly: 9.488 ± 1.107
AlaHis: 2.385 ± 0.428
AlaIle: 4.934 ± 0.532
AlaLys: 4.337 ± 0.555
AlaLeu: 8.295 ± 0.722
AlaMet: 2.169 ± 0.343
AlaAsn: 2.765 ± 0.424
AlaPro: 5.042 ± 0.593
AlaGln: 3.361 ± 0.442
AlaArg: 6.831 ± 0.597
AlaSer: 5.367 ± 0.677
AlaThr: 6.018 ± 0.527
AlaVal: 6.452 ± 0.523
AlaTrp: 2.385 ± 0.358
AlaTyr: 2.657 ± 0.322
AlaXaa: 0.0 ± 0.0
Cys
CysAla: 0.922 ± 0.287
CysCys: 0.108 ± 0.088
CysAsp: 0.813 ± 0.226
CysGlu: 0.813 ± 0.227
CysPhe: 0.163 ± 0.083
CysGly: 1.193 ± 0.349
CysHis: 0.217 ± 0.094
CysIle: 0.488 ± 0.167
CysLys: 0.325 ± 0.133
CysLeu: 1.03 ± 0.244
CysMet: 0.054 ± 0.058
CysAsn: 0.325 ± 0.129
CysPro: 1.084 ± 0.29
CysGln: 0.488 ± 0.163
CysArg: 0.922 ± 0.267
CysSer: 0.813 ± 0.296
CysThr: 0.867 ± 0.237
CysVal: 0.596 ± 0.192
CysTrp: 0.325 ± 0.122
CysTyr: 0.271 ± 0.125
CysXaa: 0.0 ± 0.0
Asp
AspAla: 6.777 ± 0.489
AspCys: 1.139 ± 0.283
AspAsp: 4.554 ± 0.636
AspGlu: 3.903 ± 0.492
AspPhe: 1.898 ± 0.285
AspGly: 6.56 ± 0.532
AspHis: 1.464 ± 0.263
AspIle: 2.711 ± 0.437
AspLys: 2.114 ± 0.319
AspLeu: 6.018 ± 0.651
AspMet: 1.139 ± 0.252
AspAsn: 1.898 ± 0.326
AspPro: 5.422 ± 0.574
AspGln: 2.277 ± 0.353
AspArg: 5.313 ± 0.61
AspSer: 3.144 ± 0.424
AspThr: 3.741 ± 0.449
AspVal: 4.663 ± 0.623
AspTrp: 1.518 ± 0.296
AspTyr: 1.735 ± 0.315
AspXaa: 0.0 ± 0.0
Glu
GluAla: 5.964 ± 0.561
GluCys: 1.03 ± 0.241
GluAsp: 3.741 ± 0.417
GluGlu: 2.494 ± 0.426
GluPhe: 2.385 ± 0.317
GluGly: 3.199 ± 0.403
GluHis: 1.41 ± 0.383
GluIle: 3.09 ± 0.363
GluLys: 1.789 ± 0.322
GluLeu: 5.801 ± 0.726
GluMet: 1.843 ± 0.328
GluAsn: 1.952 ± 0.284
GluPro: 2.819 ± 0.323
GluGln: 2.819 ± 0.419
GluArg: 5.15 ± 0.645
GluSer: 3.144 ± 0.459
GluThr: 3.687 ± 0.498
GluVal: 3.958 ± 0.504
GluTrp: 1.03 ± 0.234
GluTyr: 1.572 ± 0.349
GluXaa: 0.0 ± 0.0
Phe
PheAla: 3.09 ± 0.419
PheCys: 0.325 ± 0.136
PheAsp: 2.385 ± 0.315
PheGlu: 1.464 ± 0.289
PhePhe: 0.867 ± 0.245
PheGly: 3.524 ± 0.546
PheHis: 0.434 ± 0.149
PheIle: 1.247 ± 0.335
PheLys: 1.193 ± 0.255
PheLeu: 1.626 ± 0.306
PheMet: 1.03 ± 0.248
PheAsn: 1.193 ± 0.33
PhePro: 1.952 ± 0.323
PheGln: 0.867 ± 0.281
PheArg: 1.898 ± 0.342
PheSer: 1.355 ± 0.27
PheThr: 2.169 ± 0.284
PheVal: 1.789 ± 0.302
PheTrp: 0.488 ± 0.145
PheTyr: 0.867 ± 0.265
PheXaa: 0.0 ± 0.0
Gly
GlyAla: 9.271 ± 1.065
GlyCys: 0.922 ± 0.274
GlyAsp: 6.018 ± 0.607
GlyGlu: 4.283 ± 0.508
GlyPhe: 2.602 ± 0.39
GlyGly: 10.626 ± 1.992
GlyHis: 1.681 ± 0.327
GlyIle: 3.849 ± 0.592
GlyLys: 2.169 ± 0.324
GlyLeu: 5.855 ± 0.626
GlyMet: 2.385 ± 0.374
GlyAsn: 3.307 ± 0.436
GlyPro: 3.903 ± 0.552
GlyGln: 1.843 ± 0.54
GlyArg: 5.367 ± 0.665
GlySer: 6.289 ± 0.802
GlyThr: 6.668 ± 0.672
GlyVal: 6.126 ± 0.599
GlyTrp: 2.494 ± 0.383
GlyTyr: 2.548 ± 0.497
GlyXaa: 0.0 ± 0.0
His
HisAla: 1.681 ± 0.394
HisCys: 0.651 ± 0.233
HisAsp: 1.518 ± 0.312
HisGlu: 1.084 ± 0.28
HisPhe: 0.488 ± 0.145
HisGly: 2.223 ± 0.296
HisHis: 1.084 ± 0.264
HisIle: 1.301 ± 0.275
HisLys: 0.705 ± 0.219
HisLeu: 1.464 ± 0.296
HisMet: 0.38 ± 0.124
HisAsn: 0.651 ± 0.179
HisPro: 1.084 ± 0.247
HisGln: 0.596 ± 0.162
HisArg: 2.006 ± 0.409
HisSer: 0.922 ± 0.181
HisThr: 1.464 ± 0.333
HisVal: 1.247 ± 0.26
HisTrp: 0.542 ± 0.158
HisTyr: 0.976 ± 0.219
HisXaa: 0.0 ± 0.0
Ile
IleAla: 5.53 ± 0.571
IleCys: 0.542 ± 0.205
IleAsp: 4.066 ± 0.488
IleGlu: 3.416 ± 0.395
IlePhe: 0.867 ± 0.229
IleGly: 3.795 ± 0.522
IleHis: 1.355 ± 0.336
IleIle: 1.247 ± 0.249
IleLys: 1.247 ± 0.259
IleLeu: 2.06 ± 0.361
IleMet: 0.542 ± 0.162
IleAsn: 2.006 ± 0.285
IlePro: 3.361 ± 0.38
IleGln: 1.572 ± 0.265
IleArg: 3.144 ± 0.439
IleSer: 2.169 ± 0.401
IleThr: 3.307 ± 0.445
IleVal: 2.873 ± 0.353
IleTrp: 1.139 ± 0.28
IleTyr: 0.813 ± 0.2
IleXaa: 0.0 ± 0.0
Lys
LysAla: 3.741 ± 0.438
LysCys: 0.325 ± 0.133
LysAsp: 2.114 ± 0.378
LysGlu: 1.03 ± 0.257
LysPhe: 1.193 ± 0.209
LysGly: 2.385 ± 0.394
LysHis: 0.976 ± 0.23
LysIle: 1.084 ± 0.29
LysLys: 1.41 ± 0.272
LysLeu: 3.09 ± 0.531
LysMet: 0.488 ± 0.129
LysAsn: 0.813 ± 0.219
LysPro: 2.657 ± 0.404
LysGln: 1.843 ± 0.264
LysArg: 2.277 ± 0.369
LysSer: 2.006 ± 0.332
LysThr: 2.331 ± 0.374
LysVal: 2.169 ± 0.368
LysTrp: 0.705 ± 0.179
LysTyr: 0.922 ± 0.221
LysXaa: 0.0 ± 0.0
Leu
LeuAla: 7.699 ± 0.714
LeuCys: 0.596 ± 0.154
LeuAsp: 5.313 ± 0.633
LeuGlu: 4.012 ± 0.444
LeuPhe: 1.952 ± 0.307
LeuGly: 5.801 ± 0.59
LeuHis: 0.867 ± 0.252
LeuIle: 3.687 ± 0.42
LeuLys: 2.548 ± 0.435
LeuLeu: 4.175 ± 0.517
LeuMet: 1.193 ± 0.316
LeuAsn: 2.982 ± 0.473
LeuPro: 5.205 ± 0.665
LeuGln: 2.765 ± 0.471
LeuArg: 5.855 ± 0.634
LeuSer: 5.584 ± 0.619
LeuThr: 5.15 ± 0.566
LeuVal: 5.909 ± 0.692
LeuTrp: 1.03 ± 0.311
LeuTyr: 2.169 ± 0.397
LeuXaa: 0.0 ± 0.0
Met
MetAla: 2.06 ± 0.373
MetCys: 0.054 ± 0.066
MetAsp: 0.922 ± 0.223
MetGlu: 0.813 ± 0.193
MetPhe: 0.651 ± 0.197
MetGly: 1.681 ± 0.327
MetHis: 0.271 ± 0.121
MetIle: 0.867 ± 0.205
MetLys: 0.867 ± 0.268
MetLeu: 1.626 ± 0.256
MetMet: 0.488 ± 0.23
MetAsn: 0.922 ± 0.217
MetPro: 1.355 ± 0.26
MetGln: 0.325 ± 0.118
MetArg: 1.518 ± 0.253
MetSer: 2.819 ± 0.359
MetThr: 1.952 ± 0.327
MetVal: 1.681 ± 0.324
MetTrp: 0.488 ± 0.159
MetTyr: 0.271 ± 0.113
MetXaa: 0.0 ± 0.0
Asn
AsnAla: 3.524 ± 0.516
AsnCys: 0.217 ± 0.112
AsnAsp: 1.789 ± 0.238
AsnGlu: 2.06 ± 0.319
AsnPhe: 0.813 ± 0.284
AsnGly: 4.229 ± 0.711
AsnHis: 0.813 ± 0.165
AsnIle: 1.572 ± 0.455
AsnLys: 0.976 ± 0.26
AsnLeu: 2.44 ± 0.34
AsnMet: 0.651 ± 0.152
AsnAsn: 1.626 ± 0.324
AsnPro: 2.494 ± 0.353
AsnGln: 1.084 ± 0.318
AsnArg: 2.331 ± 0.414
AsnSer: 1.735 ± 0.283
AsnThr: 2.277 ± 0.314
AsnVal: 1.681 ± 0.278
AsnTrp: 0.542 ± 0.158
AsnTyr: 0.813 ± 0.15
AsnXaa: 0.0 ± 0.0
Pro
ProAla: 5.747 ± 0.637
ProCys: 0.651 ± 0.212
ProAsp: 4.229 ± 0.55
ProGlu: 4.391 ± 0.493
ProPhe: 1.952 ± 0.331
ProGly: 6.289 ± 0.709
ProHis: 1.355 ± 0.266
ProIle: 2.548 ± 0.282
ProLys: 2.331 ± 0.387
ProLeu: 4.12 ± 0.473
ProMet: 1.572 ± 0.361
ProAsn: 2.277 ± 0.316
ProPro: 3.795 ± 0.596
ProGln: 2.385 ± 0.407
ProArg: 3.361 ± 0.511
ProSer: 3.253 ± 0.431
ProThr: 2.602 ± 0.357
ProVal: 4.934 ± 0.542
ProTrp: 0.922 ± 0.263
ProTyr: 1.898 ± 0.298
ProXaa: 0.0 ± 0.0
Gln
GlnAla: 4.175 ± 0.554
GlnCys: 0.271 ± 0.148
GlnAsp: 1.41 ± 0.277
GlnGlu: 1.735 ± 0.341
GlnPhe: 1.084 ± 0.26
GlnGly: 2.385 ± 0.486
GlnHis: 0.867 ± 0.192
GlnIle: 2.114 ± 0.336
GlnLys: 1.084 ± 0.217
GlnLeu: 3.416 ± 0.48
GlnMet: 0.542 ± 0.175
GlnAsn: 0.813 ± 0.269
GlnPro: 1.789 ± 0.323
GlnGln: 1.355 ± 0.368
GlnArg: 2.06 ± 0.328
GlnSer: 2.548 ± 0.358
GlnThr: 1.41 ± 0.343
GlnVal: 2.331 ± 0.334
GlnTrp: 0.922 ± 0.193
GlnTyr: 1.247 ± 0.257
GlnXaa: 0.0 ± 0.0
Arg
ArgAla: 6.506 ± 0.534
ArgCys: 1.247 ± 0.326
ArgAsp: 4.879 ± 0.687
ArgGlu: 5.15 ± 0.629
ArgPhe: 2.548 ± 0.342
ArgGly: 3.795 ± 0.409
ArgHis: 1.518 ± 0.348
ArgIle: 3.741 ± 0.513
ArgLys: 2.223 ± 0.349
ArgLeu: 5.638 ± 0.646
ArgMet: 2.765 ± 0.433
ArgAsn: 2.114 ± 0.474
ArgPro: 3.578 ± 0.42
ArgGln: 2.114 ± 0.354
ArgArg: 5.638 ± 0.745
ArgSer: 4.337 ± 0.503
ArgThr: 3.253 ± 0.494
ArgVal: 4.988 ± 0.611
ArgTrp: 1.898 ± 0.275
ArgTyr: 2.223 ± 0.43
ArgXaa: 0.0 ± 0.0
Ser
SerAla: 5.909 ± 0.681
SerCys: 0.542 ± 0.188
SerAsp: 4.175 ± 0.561
SerGlu: 3.578 ± 0.505
SerPhe: 2.277 ± 0.391
SerGly: 6.668 ± 0.696
SerHis: 1.03 ± 0.199
SerIle: 2.765 ± 0.403
SerLys: 2.385 ± 0.392
SerLeu: 3.632 ± 0.463
SerMet: 1.084 ± 0.228
SerAsn: 2.331 ± 0.55
SerPro: 3.47 ± 0.437
SerGln: 1.681 ± 0.31
SerArg: 4.066 ± 0.458
SerSer: 3.849 ± 0.523
SerThr: 3.09 ± 0.425
SerVal: 4.879 ± 0.613
SerTrp: 1.735 ± 0.288
SerTyr: 1.681 ± 0.266
SerXaa: 0.0 ± 0.0
Thr
ThrAla: 5.801 ± 0.617
ThrCys: 0.38 ± 0.169
ThrAsp: 3.958 ± 0.585
ThrGlu: 3.687 ± 0.469
ThrPhe: 1.735 ± 0.322
ThrGly: 5.53 ± 0.657
ThrHis: 1.735 ± 0.334
ThrIle: 3.47 ± 0.406
ThrLys: 2.169 ± 0.344
ThrLeu: 4.229 ± 0.491
ThrMet: 0.976 ± 0.223
ThrAsn: 2.114 ± 0.328
ThrPro: 4.554 ± 0.523
ThrGln: 1.843 ± 0.291
ThrArg: 3.849 ± 0.434
ThrSer: 3.632 ± 0.39
ThrThr: 4.934 ± 0.671
ThrVal: 5.53 ± 0.572
ThrTrp: 1.084 ± 0.268
ThrTyr: 1.952 ± 0.282
ThrXaa: 0.0 ± 0.0
Val
ValAla: 7.427 ± 0.55
ValCys: 1.193 ± 0.272
ValAsp: 5.476 ± 0.493
ValGlu: 4.391 ± 0.556
ValPhe: 2.169 ± 0.292
ValGly: 5.476 ± 0.587
ValHis: 1.518 ± 0.357
ValIle: 2.385 ± 0.439
ValLys: 2.331 ± 0.436
ValLeu: 6.072 ± 0.646
ValMet: 1.247 ± 0.232
ValAsn: 2.114 ± 0.278
ValPro: 4.283 ± 0.443
ValGln: 2.44 ± 0.364
ValArg: 4.446 ± 0.638
ValSer: 5.313 ± 0.544
ValThr: 4.879 ± 0.447
ValVal: 5.964 ± 0.7
ValTrp: 1.464 ± 0.303
ValTyr: 1.247 ± 0.318
ValXaa: 0.0 ± 0.0
Trp
TrpAla: 2.006 ± 0.306
TrpCys: 0.325 ± 0.139
TrpAsp: 1.681 ± 0.303
TrpGlu: 1.139 ± 0.257
TrpPhe: 0.705 ± 0.173
TrpGly: 0.759 ± 0.201
TrpHis: 0.542 ± 0.176
TrpIle: 0.922 ± 0.21
TrpLys: 0.813 ± 0.197
TrpLeu: 1.843 ± 0.356
TrpMet: 0.867 ± 0.234
TrpAsn: 0.651 ± 0.231
TrpPro: 1.247 ± 0.317
TrpGln: 1.084 ± 0.23
TrpArg: 1.789 ± 0.277
TrpSer: 1.355 ± 0.273
TrpThr: 1.301 ± 0.292
TrpVal: 1.735 ± 0.424
TrpTrp: 0.867 ± 0.197
TrpTyr: 0.759 ± 0.225
TrpXaa: 0.0 ± 0.0
Tyr
TyrAla: 2.44 ± 0.381
TyrCys: 0.163 ± 0.11
TyrAsp: 2.169 ± 0.417
TyrGlu: 2.114 ± 0.33
TyrPhe: 0.922 ± 0.192
TyrGly: 2.602 ± 0.448
TyrHis: 0.542 ± 0.163
TyrIle: 1.03 ± 0.198
TyrLys: 0.542 ± 0.19
TyrLeu: 2.06 ± 0.332
TyrMet: 0.163 ± 0.08
TyrAsn: 0.813 ± 0.2
TyrPro: 1.518 ± 0.298
TyrGln: 0.813 ± 0.22
TyrArg: 2.331 ± 0.402
TyrSer: 1.193 ± 0.234
TyrThr: 2.06 ± 0.477
TyrVal: 2.331 ± 0.297
TyrTrp: 0.813 ± 0.23
TyrTyr: 0.542 ± 0.134
TyrXaa: 0.0 ± 0.0
Xaa
XaaAla: 0.0 ± 0.0
XaaCys: 0.0 ± 0.0
XaaAsp: 0.0 ± 0.0
XaaGlu: 0.0 ± 0.0
XaaPhe: 0.0 ± 0.0
XaaGly: 0.0 ± 0.0
XaaHis: 0.0 ± 0.0
XaaIle: 0.0 ± 0.0
XaaLys: 0.0 ± 0.0
XaaLeu: 0.0 ± 0.0
XaaMet: 0.0 ± 0.0
XaaAsn: 0.0 ± 0.0
XaaPro: 0.0 ± 0.0
XaaGln: 0.0 ± 0.0
XaaArg: 0.0 ± 0.0
XaaSer: 0.0 ± 0.0
XaaThr: 0.0 ± 0.0
XaaVal: 0.0 ± 0.0
XaaTrp: 0.0 ± 0.0
XaaTyr: 0.0 ± 0.0
XaaXaa: 0.0 ± 0.0
Statistics based on 106 proteins (18446 amino acids)
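To get a feel for what the per mille values mean in absolute terms, they can be converted back into approximate occurrence counts. The sketch below assumes the total number of amino acids (18446) as the denominator; the text does not say which denominator was used, and the number of within-protein dipeptides (18446 - 106 = 18340) would give slightly smaller counts. The function name approx_count is hypothetical.

```python
TOTAL_AMINO_ACIDS = 18446  # from the statistics line
N_PROTEINS = 106           # from the statistics line


def approx_count(per_mille, denominator=TOTAL_AMINO_ACIDS):
    """Convert a per mille frequency into an approximate occurrence count."""
    return per_mille / 1000.0 * denominator


# Values taken from the table: the most frequent and one of the rarest pairs.
for name, value in [("AlaAla", 12.903), ("CysMet", 0.054)]:
    print(name, round(approx_count(value), 1))
```

Under this assumption, the smallest non-zero values in the table correspond to roughly a single occurrence in the whole proteome, which is why their bootstrap errors are as large as the values themselves.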

Note: The error has been estimated by bootstrapping (100×) at the protein level.
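A protein-level bootstrap of this kind could look roughly like the following Python sketch. The source only states that bootstrapping was done 100 times at the protein level; resampling whole proteins with replacement and reporting the standard deviation of the resampled frequencies is an assumption about the details, and the function names and toy sequences are hypothetical.

```python
import random
import statistics


def dipeptide_per_mille(proteins, pair):
    """Frequency (per mille) of one dipeptide across a list of sequences."""
    hits = total = 0
    for seq in proteins:
        for i in range(len(seq) - 1):
            total += 1
            hits += seq[i:i + 2] == pair  # True counts as 1
    return 1000.0 * hits / total if total else 0.0


def bootstrap_error(proteins, pair, n_resamples=100, seed=0):
    """Standard deviation of the frequency over protein-level resamples."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    estimates = []
    for _ in range(n_resamples):
        # Resample whole proteins with replacement, keeping the sample size.
        sample = rng.choices(proteins, k=len(proteins))
        estimates.append(dipeptide_per_mille(sample, pair))
    return statistics.stdev(estimates)


# Toy sequences in one-letter code, not real phage proteins.
proteins = ["MAAGLAA", "MKTAYIAK", "MGGSAAL", "MVLSPADK"]
err = bootstrap_error(proteins, "AA")
print(round(dipeptide_per_mille(proteins, "AA"), 3), "±", round(err, 3))
```

Resampling whole proteins rather than individual residues keeps the dipeptides of one protein together, so the error reflects variation between proteins. A dipeptide that never occurs gets an error of exactly zero, matching the 0.0 ± 0.0 entries for Xaa.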

The dipeptide statistics above (among other statistics for this proteome) can be downloaded from this CSV file
See this proteome in: uniprot_link
Proteome-pI is available under Creative Commons Attribution-NoDerivs license, for more details see here

Reference: Kozlowski LP. Proteome-pI 2.0: Proteome Isoelectric Point Database Update. Nucleic Acids Res. 2021, doi: 10.1093/nar/gkab944
Contact: Lukasz P. Kozlowski