Amino acid dipeptide frequency for Mycobacterium phage CASbig

Apart from single amino acid frequencies, one can also calculate the so-called amino acid dipeptide frequency, i.e. how often each ordered pair of adjacent residues occurs in the proteome.
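As an illustration of what such a calculation can look like, here is a minimal Python sketch that counts overlapping adjacent residue pairs in a set of protein sequences and reports them in per mille. The function name `dipeptide_per_mille`, the one-letter input alphabet, the mapping of non-standard residues to `X` (Xaa) and the normalization by the total number of counted pairs are all assumptions made for this example; the database may use a different denominator (for example the total residue count).

```python
from collections import Counter

# The 20 standard amino acids in one-letter code.
STANDARD = set("ACDEFGHIKLMNPQRSTVWY")


def dipeptide_per_mille(proteins):
    """Return {dipeptide: frequency in per mille} over all adjacent residue pairs.

    Hypothetical helper: pairs are counted within each protein only,
    never across the boundary between two proteins.
    """
    counts = Counter()
    for seq in proteins:
        # Assumption: anything outside the 20 standard residues becomes 'X' (Xaa).
        seq = "".join(c if c in STANDARD else "X" for c in seq.upper())
        # Overlapping pairs at positions (0,1), (1,2), ...
        counts.update(seq[i:i + 2] for i in range(len(seq) - 1))
    total = sum(counts.values())
    if total == 0:
        return {}
    return {pair: 1000 * n / total for pair, n in counts.items()}


# Toy input, not real phage proteins.
toy = ["MAAK", "AAC"]
freqs = dipeptide_per_mille(toy)
for pair, value in sorted(freqs.items()):
    print(f"{pair}: {value:.1f} per mille")
```

In the toy input, the pair `AA` accounts for two of the five counted pairs, so it comes out at 400 per mille.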

There are 400 possible dipeptides (441 if Xaa is counted as an additional, 21st amino acid representing all non-standard or unknown amino acids). Thus, if dipeptides occurred in proteins at random with equal probability, each dipeptide would be present at a frequency of 0.25% (2.5‰). As this is not the case in nature, dipeptides that occur more often than expected are marked in red in the table and underrepresented ones in blue, for better visibility.
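The arithmetic behind the expected value can be sketched in Python. The snippet below computes the uniform expectation in per mille for alphabets of 20 and 21 residues and compares a few values taken from the table against it. The helper names and the simple above/below rule are assumptions for illustration; the exact rule the page uses for red and blue marking is not stated here.

```python
def uniform_expectation_per_mille(alphabet_size):
    """Expected per-mille frequency if all ordered pairs were equally likely."""
    return 1000 / alphabet_size ** 2


expected_20 = uniform_expectation_per_mille(20)  # 400 possible dipeptides
expected_21 = uniform_expectation_per_mille(21)  # 441 possible dipeptides (with Xaa)

# A few observed values (per mille) copied from the table.
observed = {"AlaAla": 12.17, "LeuLeu": 6.842, "CysCys": 0.112, "TrpTrp": 0.393}


def classify(value, expected):
    """Hypothetical rule: compare a value with the uniform expectation."""
    if value > expected:
        return "over"
    if value < expected:
        return "under"
    return "as expected"


labels = {pair: classify(value, expected_20) for pair, value in observed.items()}
print(expected_20, round(expected_21, 3))
print(labels)
```

With the 20-residue expectation of 2.5‰, AlaAla and LeuLeu come out as overrepresented, while CysCys and TrpTrp come out as underrepresented.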

All values are given in per mille (‰), so they need to be multiplied by 10^-3 to obtain fractions.

For more information, see the sequence space article on Wikipedia.

Rows give the first residue of the dipeptide, columns the second; each cell shows the frequency (‰) ± the estimated error.

| 1st \ 2nd | Ala | Cys | Asp | Glu | Phe | Gly | His | Ile | Lys | Leu | Met | Asn | Pro | Gln | Arg | Ser | Thr | Val | Trp | Tyr | Xaa |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Ala | 12.17 ± 1.407 | 0.953 ± 0.234 | 6.282 ± 0.593 | 5.945 ± 0.601 | 2.86 ± 0.464 | 7.684 ± 0.707 | 1.514 ± 0.349 | 3.982 ± 0.577 | 3.87 ± 0.481 | 9.422 ± 0.937 | 2.299 ± 0.422 | 2.468 ± 0.403 | 4.879 ± 0.744 | 2.636 ± 0.414 | 6.169 ± 0.492 | 5.384 ± 0.716 | 5.552 ± 0.638 | 7.964 ± 0.753 | 1.626 ± 0.288 | 2.524 ± 0.321 | 0.0 ± 0.0 |
| Cys | 0.841 ± 0.241 | 0.112 ± 0.076 | 0.617 ± 0.209 | 0.673 ± 0.175 | 0.168 ± 0.085 | 0.729 ± 0.219 | 0.393 ± 0.16 | 0.393 ± 0.164 | 0.224 ± 0.148 | 0.561 ± 0.183 | 0.168 ± 0.107 | 0.168 ± 0.083 | 0.393 ± 0.147 | 0.28 ± 0.112 | 0.617 ± 0.202 | 1.01 ± 0.374 | 0.28 ± 0.133 | 0.505 ± 0.146 | 0.112 ± 0.091 | 0.224 ± 0.104 | 0.0 ± 0.0 |
| Asp | 6.45 ± 0.57 | 0.673 ± 0.209 | 4.206 ± 0.537 | 3.421 ± 0.503 | 2.187 ± 0.312 | 5.272 ± 0.571 | 1.57 ± 0.294 | 2.804 ± 0.503 | 2.019 ± 0.341 | 8.076 ± 0.899 | 1.122 ± 0.254 | 1.739 ± 0.367 | 4.992 ± 0.581 | 1.626 ± 0.37 | 5.16 ± 0.722 | 3.589 ± 0.445 | 3.421 ± 0.46 | 5.048 ± 0.623 | 1.626 ± 0.332 | 1.683 ± 0.336 | 0.0 ± 0.0 |
| Glu | 5.496 ± 0.69 | 0.28 ± 0.134 | 4.823 ± 0.574 | 3.814 ± 0.5 | 1.795 ± 0.365 | 3.758 ± 0.529 | 1.234 ± 0.285 | 3.253 ± 0.462 | 2.187 ± 0.382 | 6.786 ± 0.603 | 1.29 ± 0.223 | 1.795 ± 0.479 | 2.58 ± 0.369 | 2.243 ± 0.381 | 4.319 ± 0.67 | 2.692 ± 0.377 | 3.646 ± 0.48 | 5.272 ± 0.635 | 1.346 ± 0.32 | 2.299 ± 0.448 | 0.0 ± 0.0 |
| Phe | 2.692 ± 0.48 | 0.561 ± 0.222 | 2.468 ± 0.348 | 1.963 ± 0.26 | 0.505 ± 0.206 | 3.421 ± 0.441 | 0.673 ± 0.259 | 1.234 ± 0.245 | 1.178 ± 0.227 | 2.243 ± 0.429 | 0.617 ± 0.182 | 1.29 ± 0.28 | 1.57 ± 0.298 | 0.729 ± 0.197 | 2.019 ± 0.317 | 1.963 ± 0.352 | 2.187 ± 0.396 | 1.795 ± 0.389 | 0.617 ± 0.186 | 0.785 ± 0.249 | 0.0 ± 0.0 |
| Gly | 6.394 ± 0.744 | 0.673 ± 0.207 | 6.113 ± 0.544 | 4.15 ± 0.515 | 2.86 ± 0.458 | 6.674 ± 0.923 | 2.243 ± 0.412 | 4.262 ± 0.649 | 3.477 ± 0.529 | 7.684 ± 0.792 | 1.851 ± 0.413 | 3.309 ± 0.437 | 3.982 ± 0.592 | 2.131 ± 0.379 | 5.665 ± 0.666 | 6.674 ± 0.66 | 4.879 ± 0.67 | 5.665 ± 0.624 | 2.019 ± 0.341 | 2.356 ± 0.345 | 0.0 ± 0.0 |
| His | 1.683 ± 0.351 | 0.224 ± 0.126 | 1.514 ± 0.302 | 1.626 ± 0.341 | 0.729 ± 0.208 | 1.57 ± 0.392 | 0.505 ± 0.177 | 0.785 ± 0.171 | 0.953 ± 0.275 | 2.187 ± 0.523 | 0.224 ± 0.112 | 0.224 ± 0.129 | 1.514 ± 0.29 | 1.066 ± 0.229 | 1.963 ± 0.383 | 0.953 ± 0.262 | 1.29 ± 0.28 | 1.795 ± 0.377 | 0.561 ± 0.161 | 0.505 ± 0.216 | 0.0 ± 0.0 |
| Ile | 5.328 ± 0.572 | 0.617 ± 0.19 | 3.365 ± 0.427 | 3.029 ± 0.441 | 0.785 ± 0.233 | 4.15 ± 0.515 | 0.785 ± 0.265 | 1.458 ± 0.244 | 1.346 ± 0.223 | 3.029 ± 0.423 | 0.673 ± 0.196 | 1.57 ± 0.298 | 3.141 ± 0.482 | 1.402 ± 0.281 | 3.365 ± 0.446 | 3.029 ± 0.544 | 3.589 ± 0.503 | 2.692 ± 0.492 | 0.729 ± 0.237 | 1.29 ± 0.302 | 0.0 ± 0.0 |
| Lys | 3.197 ± 0.508 | 0.168 ± 0.095 | 1.963 ± 0.382 | 1.57 ± 0.294 | 1.29 ± 0.264 | 2.299 ± 0.418 | 0.953 ± 0.332 | 2.299 ± 0.404 | 1.514 ± 0.365 | 2.86 ± 0.431 | 0.785 ± 0.184 | 1.346 ± 0.296 | 2.412 ± 0.436 | 1.402 ± 0.383 | 2.524 ± 0.521 | 2.58 ± 0.353 | 2.299 ± 0.386 | 3.029 ± 0.466 | 0.449 ± 0.184 | 1.122 ± 0.242 | 0.0 ± 0.0 |
| Leu | 8.861 ± 0.722 | 0.28 ± 0.119 | 6.73 ± 0.87 | 5.609 ± 0.584 | 2.299 ± 0.369 | 7.515 ± 0.666 | 1.907 ± 0.388 | 3.758 ± 0.42 | 3.646 ± 0.436 | 6.842 ± 0.629 | 1.683 ± 0.334 | 3.365 ± 0.423 | 5.496 ± 0.554 | 2.636 ± 0.474 | 6.618 ± 0.71 | 5.833 ± 0.523 | 5.609 ± 0.518 | 6.169 ± 0.813 | 0.729 ± 0.304 | 2.131 ± 0.383 | 0.0 ± 0.0 |
| Met | 2.131 ± 0.38 | 0.056 ± 0.055 | 1.066 ± 0.199 | 1.29 ± 0.281 | 0.841 ± 0.201 | 1.626 ± 0.322 | 0.393 ± 0.135 | 0.785 ± 0.183 | 0.953 ± 0.253 | 1.178 ± 0.275 | 0.224 ± 0.12 | 0.897 ± 0.169 | 1.514 ± 0.3 | 0.673 ± 0.196 | 1.122 ± 0.26 | 1.851 ± 0.324 | 1.963 ± 0.258 | 1.178 ± 0.255 | 0.224 ± 0.108 | 0.337 ± 0.132 | 0.0 ± 0.0 |
| Asn | 3.533 ± 0.5 | 0.056 ± 0.053 | 2.356 ± 0.407 | 1.739 ± 0.298 | 1.066 ± 0.258 | 3.533 ± 0.536 | 0.729 ± 0.185 | 1.234 ± 0.289 | 0.561 ± 0.227 | 2.468 ± 0.337 | 0.673 ± 0.16 | 0.673 ± 0.194 | 2.468 ± 0.431 | 0.953 ± 0.206 | 1.683 ± 0.336 | 1.907 ± 0.441 | 1.795 ± 0.305 | 2.468 ± 0.355 | 0.785 ± 0.2 | 1.178 ± 0.3 | 0.0 ± 0.0 |
| Pro | 5.384 ± 0.584 | 0.449 ± 0.158 | 4.543 ± 0.475 | 4.375 ± 0.535 | 1.907 ± 0.36 | 5.496 ± 0.594 | 1.01 ± 0.243 | 2.692 ± 0.434 | 1.907 ± 0.291 | 4.655 ± 0.575 | 0.953 ± 0.314 | 1.851 ± 0.282 | 2.86 ± 0.458 | 1.402 ± 0.318 | 3.646 ± 0.552 | 4.431 ± 0.729 | 4.038 ± 0.528 | 4.599 ± 0.547 | 0.953 ± 0.292 | 1.626 ± 0.369 | 0.0 ± 0.0 |
| Gln | 3.197 ± 0.533 | 0.056 ± 0.071 | 1.234 ± 0.277 | 1.29 ± 0.271 | 0.953 ± 0.192 | 2.187 ± 0.283 | 0.617 ± 0.212 | 2.356 ± 0.517 | 0.953 ± 0.268 | 3.421 ± 0.472 | 0.841 ± 0.237 | 0.561 ± 0.161 | 2.075 ± 0.398 | 2.131 ± 0.453 | 2.973 ± 0.757 | 1.683 ± 0.267 | 1.683 ± 0.339 | 2.636 ± 0.361 | 0.561 ± 0.173 | 0.505 ± 0.183 | 0.0 ± 0.0 |
| Arg | 6.225 ± 0.696 | 0.841 ± 0.244 | 3.702 ± 0.608 | 4.599 ± 0.578 | 2.131 ± 0.394 | 5.609 ± 0.648 | 1.739 ± 0.331 | 3.309 ± 0.5 | 2.748 ± 0.517 | 6.282 ± 0.743 | 1.963 ± 0.315 | 2.468 ± 0.465 | 4.262 ± 0.707 | 2.524 ± 0.452 | 6.506 ± 0.981 | 4.543 ± 0.668 | 3.309 ± 0.454 | 7.067 ± 0.916 | 1.29 ± 0.292 | 1.683 ± 0.311 | 0.0 ± 0.0 |
| Ser | 7.179 ± 0.791 | 0.785 ± 0.226 | 3.365 ± 0.403 | 4.487 ± 0.616 | 2.243 ± 0.443 | 6.113 ± 0.705 | 1.683 ± 0.36 | 2.524 ± 0.389 | 2.075 ± 0.328 | 4.599 ± 0.614 | 1.57 ± 0.251 | 1.851 ± 0.357 | 4.094 ± 0.658 | 2.356 ± 0.278 | 4.319 ± 0.643 | 5.889 ± 1.219 | 3.365 ± 0.504 | 4.487 ± 0.673 | 1.346 ± 0.273 | 1.29 ± 0.327 | 0.0 ± 0.0 |
| Thr | 5.609 ± 0.631 | 0.393 ± 0.176 | 4.038 ± 0.53 | 4.094 ± 0.508 | 2.019 ± 0.348 | 6.169 ± 0.679 | 1.122 ± 0.305 | 2.692 ± 0.516 | 2.356 ± 0.35 | 5.16 ± 0.638 | 1.122 ± 0.323 | 1.795 ± 0.366 | 3.926 ± 0.478 | 1.626 ± 0.344 | 3.926 ± 0.504 | 3.589 ± 0.596 | 3.589 ± 0.545 | 5.104 ± 0.605 | 1.01 ± 0.258 | 1.907 ± 0.324 | 0.0 ± 0.0 |
| Val | 6.282 ± 0.647 | 0.841 ± 0.232 | 6.057 ± 0.587 | 4.487 ± 0.608 | 2.524 ± 0.327 | 5.384 ± 0.54 | 2.019 ± 0.338 | 3.141 ± 0.445 | 2.58 ± 0.363 | 6.001 ± 0.692 | 1.29 ± 0.344 | 2.468 ± 0.372 | 4.711 ± 0.545 | 2.356 ± 0.416 | 6.45 ± 0.862 | 5.721 ± 0.632 | 5.328 ± 0.6 | 5.833 ± 0.725 | 1.122 ± 0.296 | 1.963 ± 0.368 | 0.0 ± 0.0 |
| Trp | 1.514 ± 0.319 | 0.112 ± 0.081 | 1.122 ± 0.313 | 0.729 ± 0.155 | 0.841 ± 0.216 | 1.57 ± 0.313 | 0.337 ± 0.131 | 1.01 ± 0.195 | 0.393 ± 0.179 | 1.626 ± 0.24 | 0.449 ± 0.181 | 0.953 ± 0.253 | 0.617 ± 0.213 | 0.841 ± 0.207 | 1.402 ± 0.311 | 0.841 ± 0.216 | 1.458 ± 0.318 | 1.402 ± 0.302 | 0.393 ± 0.142 | 0.28 ± 0.131 | 0.0 ± 0.0 |
| Tyr | 2.019 ± 0.335 | 0.337 ± 0.152 | 1.122 ± 0.274 | 2.019 ± 0.344 | 0.505 ± 0.158 | 2.299 ± 0.371 | 0.449 ± 0.141 | 1.458 ± 0.308 | 1.122 ± 0.278 | 2.636 ± 0.389 | 0.449 ± 0.144 | 1.066 ± 0.269 | 1.29 ± 0.342 | 0.897 ± 0.232 | 2.356 ± 0.321 | 1.458 ± 0.313 | 1.963 ± 0.409 | 1.851 ± 0.335 | 0.393 ± 0.162 | 0.505 ± 0.164 | 0.0 ± 0.0 |
| Xaa | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 |

Statistics based on 97 proteins (17831 amino acids)

Note: The error was estimated by bootstrapping (100 replicates) at the protein level.
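To make the idea of protein-level bootstrapping concrete, here is a rough Python sketch: whole proteins are resampled with replacement, the dipeptide frequency is recomputed for each resample, and the spread of those estimates serves as the error. The function names, the one-letter sequence input, the fixed random seed, the normalization by pair count and the use of the sample standard deviation as the error measure are all assumptions for this example and not a description of the database's actual code.

```python
import random
import statistics
from collections import Counter


def pair_counts(seq):
    """Count overlapping adjacent residue pairs in one sequence."""
    return Counter(seq[i:i + 2] for i in range(len(seq) - 1))


def bootstrap_error(proteins, pair, n_boot=100, seed=0):
    """Hypothetical helper: (mean, standard deviation) of the per-mille
    frequency of `pair` over n_boot protein-level resamples."""
    rng = random.Random(seed)
    # Precompute (hits, number of pairs) per protein so resampling is cheap.
    per_protein = [(pair_counts(s)[pair], max(len(s) - 1, 0)) for s in proteins]
    estimates = []
    for _ in range(n_boot):
        # Draw as many proteins as in the original set, with replacement.
        sample = rng.choices(per_protein, k=len(per_protein))
        hits = sum(h for h, _ in sample)
        total = sum(n for _, n in sample)
        estimates.append(1000 * hits / total if total else 0.0)
    return statistics.mean(estimates), statistics.stdev(estimates)


# Toy sequences, not real phage proteins.
toy = ["MAAKAA", "MKLV", "AACAA", "MGGS"]
mean_aa, err_aa = bootstrap_error(toy, "AA")
print(f"AA: {mean_aa:.1f} ± {err_aa:.1f} per mille")
```

Resampling whole proteins rather than individual residues keeps each protein's internal composition intact, so the error reflects variation between proteins.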

The dipeptide statistics above (among other statistics for this proteome) can be downloaded from this CSV file
See this proteome in: uniprot_link
Proteome-pI is available under Creative Commons Attribution-NoDerivs license, for more details see here

Reference: Kozlowski LP. Proteome-pI 2.0: Proteome Isoelectric Point Database Update. Nucleic Acids Res. 2021, doi: 10.1093/nar/gkab944
Contact: Lukasz P. Kozlowski