Amino acid dipeptide frequency for Mycobacterium phage Fancypants

Apart from single amino acid frequencies, one can also calculate the so-called amino acid dipeptide frequency.

There are 400 possible dipeptides (441 if we consider Xaa as an additional 21st amino acid representing all non-standard or unknown amino acids). Thus, if dipeptides occurred randomly in proteins, each of the 400 dipeptides would be present with a frequency of 0.25% (2.5‰). As this is not the case in nature, for better visibility all dipeptides that are more frequent than expected are marked in red in the table, and those that are underrepresented are marked in blue.
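The counting described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the database's own code: the names `dipeptide_permille`, `representation`, `ONE_TO_THREE`, and `EXPECTED_PERMILLE` are hypothetical, dipeptides are taken as overlapping adjacent pairs within each protein, frequencies are normalized by the total number of counted pairs, and a value is called over- or underrepresented simply by comparing it with the uniform expectation of 2.5‰. The database may normalize or apply thresholds differently.

```python
from collections import Counter
from itertools import product

# One-letter to three-letter codes, in the same order as the table columns.
# Any other symbol is mapped to Xaa (assumption for this sketch).
ONE_TO_THREE = {
    "A": "Ala", "C": "Cys", "D": "Asp", "E": "Glu", "F": "Phe", "G": "Gly",
    "H": "His", "I": "Ile", "K": "Lys", "L": "Leu", "M": "Met", "N": "Asn",
    "P": "Pro", "Q": "Gln", "R": "Arg", "S": "Ser", "T": "Thr", "V": "Val",
    "W": "Trp", "Y": "Tyr",
}
RESIDUES = list(ONE_TO_THREE.values()) + ["Xaa"]  # 21 labels

# Uniform expectation over the 400 standard dipeptides: 2.5 per mille (0.25 %).
EXPECTED_PERMILLE = 1000.0 / 400


def dipeptide_permille(proteins):
    """Return {dipeptide: frequency in per mille} for all 441 pairs."""
    counts = Counter()
    for seq in proteins:
        codes = [ONE_TO_THREE.get(aa, "Xaa") for aa in seq.upper()]
        # Overlapping adjacent pairs; pairs never span two proteins.
        counts.update(first + second for first, second in zip(codes, codes[1:]))
    total = sum(counts.values())
    return {
        first + second: (1000.0 * counts[first + second] / total if total else 0.0)
        for first, second in product(RESIDUES, repeat=2)
    }


def representation(value_permille):
    """Compare a frequency with the uniform expectation (assumed threshold)."""
    if value_permille > EXPECTED_PERMILLE:
        return "over"
    if value_permille < EXPECTED_PERMILLE:
        return "under"
    return "as expected"


# Toy input, not phage data: pairs are AlaAla, AlaGly, GlyAla.
toy_freqs = dipeptide_permille(["AAG", "GA"])
print(round(toy_freqs["AlaAla"], 3), representation(toy_freqs["AlaAla"]))
```

Applied to the values in the table, `representation(14.765)` would label AlaAla as overrepresented and `representation(0.736)` would label AlaCys as underrepresented.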

All values are given in per mille (‰); multiply them by 10⁻³ to obtain fractions.
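As a small worked example of the unit conversion, the snippet below turns a per mille value from the table into a fraction and a percentage, and compares it with the uniform expectation of 2.5‰. The helper name `permille_to_fraction` is hypothetical and used only for illustration.

```python
def permille_to_fraction(value_permille):
    """Convert a per mille value to a plain fraction (multiply by 10^-3)."""
    return value_permille * 1e-3


ala_ala_permille = 14.765                      # AlaAla value from the table
fraction = permille_to_fraction(ala_ala_permille)
percent = fraction * 100                       # roughly 1.48 %
fold_over_uniform = ala_ala_permille / 2.5     # roughly 5.9 times the uniform expectation

print(f"{fraction:.6f} {percent:.4f}% {fold_over_uniform:.2f}x")
```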

For more information, see the sequence space article on Wikipedia.

1st \ 2nd | Ala | Cys | Asp | Glu | Phe | Gly | His | Ile | Lys | Leu | Met | Asn | Pro | Gln | Arg | Ser | Thr | Val | Trp | Tyr | Xaa
Ala | 14.765 ± 1.889 | 0.736 ± 0.2 | 7.356 ± 0.591 | 7.671 ± 0.708 | 2.785 ± 0.402 | 9.615 ± 1.258 | 2.312 ± 0.361 | 4.098 ± 0.542 | 4.256 ± 0.423 | 8.302 ± 0.84 | 2.68 ± 0.42 | 2.68 ± 0.39 | 4.519 ± 0.537 | 3.52 ± 0.428 | 7.409 ± 0.668 | 5.359 ± 0.519 | 6.148 ± 0.475 | 6.726 ± 0.6 | 1.892 ± 0.309 | 2.312 ± 0.33 | 0.0 ± 0.0
Cys | 0.946 ± 0.275 | 0.158 ± 0.098 | 1.156 ± 0.256 | 0.841 ± 0.222 | 0.263 ± 0.124 | 1.734 ± 0.347 | 0.263 ± 0.128 | 0.158 ± 0.087 | 0.368 ± 0.157 | 0.631 ± 0.226 | 0.158 ± 0.096 | 0.368 ± 0.129 | 1.419 ± 0.33 | 0.263 ± 0.134 | 0.788 ± 0.249 | 0.893 ± 0.275 | 0.578 ± 0.195 | 0.893 ± 0.214 | 0.368 ± 0.132 | 0.263 ± 0.114 | 0.0 ± 0.0
Asp | 6.673 ± 0.611 | 0.736 ± 0.179 | 4.519 ± 0.534 | 3.468 ± 0.462 | 1.576 ± 0.255 | 6.253 ± 0.596 | 1.366 ± 0.256 | 2.522 ± 0.304 | 1.681 ± 0.328 | 5.78 ± 0.52 | 1.208 ± 0.279 | 2.102 ± 0.386 | 4.939 ± 0.531 | 2.417 ± 0.292 | 4.992 ± 0.625 | 3.52 ± 0.526 | 4.203 ± 0.53 | 4.992 ± 0.647 | 1.419 ± 0.27 | 2.207 ± 0.31 | 0.0 ± 0.0
Glu | 6.095 ± 0.649 | 1.314 ± 0.3 | 3.258 ± 0.337 | 3.153 ± 0.508 | 2.207 ± 0.306 | 3.783 ± 0.478 | 1.524 ± 0.367 | 2.207 ± 0.281 | 1.997 ± 0.332 | 5.832 ± 0.676 | 1.576 ± 0.267 | 1.944 ± 0.296 | 3.205 ± 0.446 | 3.047 ± 0.354 | 5.097 ± 0.597 | 2.732 ± 0.493 | 3.625 ± 0.535 | 4.466 ± 0.529 | 1.892 ± 0.332 | 1.944 ± 0.377 | 0.0 ± 0.0
Phe | 3.205 ± 0.371 | 0.315 ± 0.139 | 2.522 ± 0.37 | 1.629 ± 0.294 | 1.103 ± 0.261 | 2.312 ± 0.664 | 0.578 ± 0.166 | 1.524 ± 0.331 | 0.841 ± 0.213 | 1.681 ± 0.241 | 0.841 ± 0.238 | 1.261 ± 0.309 | 1.314 ± 0.259 | 1.156 ± 0.311 | 1.997 ± 0.328 | 1.208 ± 0.236 | 2.207 ± 0.324 | 2.207 ± 0.315 | 0.525 ± 0.163 | 0.893 ± 0.269 | 0.0 ± 0.0
Gly | 9.09 ± 1.244 | 1.208 ± 0.279 | 6.148 ± 0.498 | 4.256 ± 0.52 | 2.89 ± 0.431 | 10.771 ± 2.114 | 1.786 ± 0.335 | 3.993 ± 0.561 | 2.627 ± 0.348 | 5.727 ± 0.578 | 2.68 ± 0.526 | 3.31 ± 0.353 | 3.836 ± 0.65 | 2.575 ± 0.578 | 4.729 ± 0.643 | 5.832 ± 0.749 | 6.253 ± 0.778 | 5.202 ± 0.536 | 2.575 ± 0.408 | 2.049 ± 0.409 | 0.0 ± 0.0
His | 1.944 ± 0.334 | 0.473 ± 0.2 | 1.156 ± 0.239 | 0.946 ± 0.268 | 0.473 ± 0.155 | 1.629 ± 0.291 | 0.998 ± 0.296 | 1.786 ± 0.312 | 0.736 ± 0.208 | 1.576 ± 0.34 | 0.578 ± 0.143 | 0.841 ± 0.183 | 1.576 ± 0.268 | 0.631 ± 0.166 | 2.259 ± 0.401 | 0.841 ± 0.202 | 1.681 ± 0.35 | 1.103 ± 0.253 | 0.315 ± 0.111 | 0.998 ± 0.193 | 0.0 ± 0.0
Ile | 5.464 ± 0.516 | 0.788 ± 0.275 | 3.52 ± 0.476 | 3.52 ± 0.368 | 0.683 ± 0.239 | 4.256 ± 0.446 | 1.576 ± 0.309 | 1.366 ± 0.267 | 1.261 ± 0.269 | 1.892 ± 0.351 | 0.21 ± 0.09 | 1.524 ± 0.339 | 2.89 ± 0.262 | 1.524 ± 0.284 | 2.68 ± 0.379 | 1.786 ± 0.458 | 3.941 ± 0.439 | 3.468 ± 0.361 | 1.051 ± 0.26 | 0.631 ± 0.215 | 0.0 ± 0.0
Lys | 3.731 ± 0.471 | 0.42 ± 0.141 | 1.734 ± 0.293 | 1.366 ± 0.248 | 1.366 ± 0.225 | 2.417 ± 0.332 | 0.893 ± 0.249 | 1.156 ± 0.3 | 1.524 ± 0.357 | 2.575 ± 0.491 | 0.736 ± 0.206 | 0.788 ± 0.217 | 2.575 ± 0.468 | 1.576 ± 0.236 | 2.47 ± 0.395 | 2.102 ± 0.325 | 2.364 ± 0.399 | 2.47 ± 0.346 | 0.736 ± 0.24 | 0.893 ± 0.209 | 0.0 ± 0.0
Leu | 8.354 ± 0.933 | 0.736 ± 0.24 | 4.781 ± 0.562 | 3.783 ± 0.453 | 2.312 ± 0.264 | 5.885 ± 0.6 | 1.051 ± 0.237 | 2.627 ± 0.4 | 2.575 ± 0.404 | 5.097 ± 0.598 | 1.366 ± 0.265 | 2.785 ± 0.359 | 5.57 ± 0.713 | 2.575 ± 0.403 | 4.781 ± 0.594 | 4.992 ± 0.453 | 5.149 ± 0.447 | 5.044 ± 0.441 | 1.051 ± 0.228 | 2.049 ± 0.331 | 0.0 ± 0.0
Met | 1.892 ± 0.347 | 0.105 ± 0.094 | 1.419 ± 0.273 | 0.893 ± 0.182 | 0.841 ± 0.224 | 2.102 ± 0.301 | 0.263 ± 0.105 | 0.736 ± 0.223 | 0.946 ± 0.276 | 1.997 ± 0.301 | 0.473 ± 0.202 | 0.683 ± 0.179 | 1.208 ± 0.219 | 0.42 ± 0.143 | 1.524 ± 0.276 | 2.785 ± 0.338 | 2.154 ± 0.36 | 1.524 ± 0.367 | 0.263 ± 0.121 | 0.473 ± 0.143 | 0.0 ± 0.0
Asn | 3.415 ± 0.334 | 0.315 ± 0.136 | 2.259 ± 0.338 | 1.944 ± 0.368 | 0.683 ± 0.234 | 4.046 ± 0.54 | 1.051 ± 0.233 | 1.681 ± 0.436 | 1.051 ± 0.226 | 2.259 ± 0.404 | 0.525 ± 0.145 | 1.524 ± 0.381 | 2.102 ± 0.354 | 0.893 ± 0.342 | 1.892 ± 0.31 | 1.576 ± 0.232 | 2.417 ± 0.385 | 2.102 ± 0.327 | 0.631 ± 0.122 | 0.998 ± 0.193 | 0.0 ± 0.0
Pro | 4.834 ± 0.577 | 0.841 ± 0.247 | 3.993 ± 0.429 | 4.887 ± 0.535 | 1.629 ± 0.315 | 6.463 ± 0.743 | 1.524 ± 0.288 | 1.892 ± 0.3 | 1.681 ± 0.343 | 4.466 ± 0.536 | 1.419 ± 0.291 | 2.312 ± 0.343 | 3.941 ± 0.581 | 1.892 ± 0.392 | 3.205 ± 0.565 | 3.783 ± 0.432 | 3.153 ± 0.444 | 4.781 ± 0.522 | 1.103 ± 0.229 | 1.629 ± 0.265 | 0.0 ± 0.0
Gln | 4.466 ± 0.466 | 0.42 ± 0.187 | 1.366 ± 0.257 | 1.944 ± 0.319 | 1.156 ± 0.206 | 2.259 ± 0.502 | 0.946 ± 0.235 | 2.049 ± 0.419 | 1.261 ± 0.274 | 2.995 ± 0.41 | 0.578 ± 0.174 | 0.893 ± 0.271 | 2.575 ± 0.391 | 0.893 ± 0.223 | 2.312 ± 0.331 | 2.312 ± 0.315 | 1.734 ± 0.308 | 2.312 ± 0.306 | 0.578 ± 0.186 | 0.998 ± 0.277 | 0.0 ± 0.0
Arg | 7.093 ± 0.67 | 1.156 ± 0.328 | 4.361 ± 0.503 | 5.254 ± 0.633 | 1.892 ± 0.359 | 3.993 ± 0.447 | 1.471 ± 0.38 | 4.414 ± 0.596 | 2.522 ± 0.386 | 4.676 ± 0.571 | 2.522 ± 0.423 | 2.417 ± 0.378 | 3.573 ± 0.403 | 1.892 ± 0.415 | 5.622 ± 0.854 | 3.783 ± 0.387 | 3.153 ± 0.562 | 5.464 ± 0.575 | 2.259 ± 0.401 | 1.786 ± 0.299 | 0.0 ± 0.0
Ser | 4.992 ± 0.751 | 0.42 ± 0.143 | 4.151 ± 0.517 | 2.89 ± 0.354 | 1.786 ± 0.315 | 6.2 ± 0.76 | 1.103 ± 0.303 | 2.942 ± 0.419 | 2.47 ± 0.376 | 3.993 ± 0.376 | 1.419 ± 0.298 | 2.522 ± 0.485 | 3.625 ± 0.425 | 1.576 ± 0.319 | 3.625 ± 0.41 | 3.836 ± 0.646 | 3.1 ± 0.383 | 4.519 ± 0.486 | 1.314 ± 0.267 | 1.524 ± 0.235 | 0.0 ± 0.0
Thr | 6.41 ± 0.531 | 0.683 ± 0.201 | 4.203 ± 0.576 | 3.993 ± 0.371 | 1.892 ± 0.349 | 5.885 ± 0.707 | 1.576 ± 0.286 | 3.468 ± 0.548 | 2.154 ± 0.366 | 3.941 ± 0.433 | 0.998 ± 0.238 | 2.207 ± 0.401 | 3.836 ± 0.429 | 1.997 ± 0.314 | 4.571 ± 0.527 | 3.941 ± 0.478 | 4.887 ± 0.731 | 5.885 ± 0.644 | 1.156 ± 0.308 | 1.786 ± 0.31 | 0.0 ± 0.0
Val | 7.093 ± 0.49 | 1.103 ± 0.267 | 5.517 ± 0.426 | 5.097 ± 0.541 | 2.259 ± 0.398 | 5.097 ± 0.726 | 1.366 ± 0.229 | 2.942 ± 0.46 | 2.522 ± 0.409 | 5.149 ± 0.547 | 1.524 ± 0.224 | 2.102 ± 0.327 | 4.098 ± 0.372 | 2.89 ± 0.295 | 5.254 ± 0.656 | 4.361 ± 0.526 | 5.202 ± 0.475 | 6.41 ± 0.689 | 1.839 ± 0.328 | 1.681 ± 0.336 | 0.0 ± 0.0
Trp | 1.997 ± 0.324 | 0.368 ± 0.143 | 1.314 ± 0.257 | 1.314 ± 0.34 | 0.525 ± 0.145 | 0.736 ± 0.182 | 0.525 ± 0.162 | 1.314 ± 0.242 | 0.736 ± 0.166 | 2.102 ± 0.334 | 0.946 ± 0.226 | 0.525 ± 0.199 | 1.208 ± 0.311 | 1.103 ± 0.264 | 2.154 ± 0.421 | 1.156 ± 0.274 | 1.419 ± 0.282 | 1.629 ± 0.395 | 0.946 ± 0.201 | 0.683 ± 0.217 | 0.0 ± 0.0
Tyr | 2.68 ± 0.48 | 0.21 ± 0.117 | 1.839 ± 0.432 | 2.049 ± 0.326 | 0.998 ± 0.237 | 2.102 ± 0.359 | 0.368 ± 0.148 | 1.156 ± 0.232 | 0.683 ± 0.207 | 1.892 ± 0.304 | 0.21 ± 0.098 | 0.578 ± 0.133 | 1.419 ± 0.223 | 1.208 ± 0.271 | 1.997 ± 0.306 | 1.208 ± 0.226 | 2.102 ± 0.336 | 2.312 ± 0.314 | 0.788 ± 0.187 | 0.683 ± 0.185 | 0.0 ± 0.0
Xaa | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0
Statistics based on 110 proteins (19033 amino acids)

Note: The error was estimated by bootstrapping (100 resamples) at the protein level.
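A protein-level bootstrap of this kind could look roughly like the sketch below. It is an illustration under assumptions, not the procedure used by the database: the function names `dipeptide_freq` and `bootstrap_error` are hypothetical, whole proteins are resampled with replacement, the frequency is recomputed for each resample, and the reported error is taken to be the sample standard deviation over the resamples. Sequences use one-letter codes here, so AlaAla is written `"AA"`.

```python
import random
import statistics
from collections import Counter


def dipeptide_freq(proteins, pair):
    """Frequency of one dipeptide (one-letter codes) in per mille."""
    counts = Counter()
    for seq in proteins:
        # Overlapping adjacent pairs within each protein.
        counts.update(seq[i:i + 2] for i in range(len(seq) - 1))
    total = sum(counts.values())
    return 1000.0 * counts[pair] / total if total else 0.0


def bootstrap_error(proteins, pair, n_rounds=100, seed=0):
    """Standard deviation of the frequency over protein-level resamples."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    estimates = []
    for _ in range(n_rounds):
        # Draw as many proteins as in the original set, with replacement.
        sample = rng.choices(proteins, k=len(proteins))
        estimates.append(dipeptide_freq(sample, pair))
    return statistics.stdev(estimates)


# Toy sequences, not phage data.
toy_proteins = ["AAGA", "GGA", "AAAA", "MKAAL"]
print(dipeptide_freq(toy_proteins, "AA"), bootstrap_error(toy_proteins, "AA"))
```

Resampling whole proteins rather than individual residues keeps the adjacent pairs inside each protein intact, which is presumably why the estimate is made at the protein level.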

The dipeptide statistics above (along with other statistics for this proteome) can be downloaded from this CSV file
See this proteome in: uniprot_link
Proteome-pI is available under the Creative Commons Attribution-NoDerivs license; for more details see here

Reference: Kozlowski LP. Proteome-pI 2.0: Proteome Isoelectric Point Database Update. Nucleic Acids Res. 2021, doi: 10.1093/nar/gkab944
Contact: Lukasz P. Kozlowski