Amino acid dipeptide frequency for Mycobacterium phage Contagion

Apart from single amino acid frequencies, one can also calculate the so-called amino acid dipeptide frequency, i.e. how often each ordered pair of adjacent residues occurs.

There are 400 possible dipeptides (441 if Xaa is counted as an additional 21st amino acid representing all non-standard or unknown amino acids). Thus, if dipeptides occurred randomly in proteins, each of the 400 standard dipeptides would be present at a frequency of 0.25% (2.5‰). As this is not the case in nature, for better visibility all dipeptides that are more frequent than expected are marked in red, and those that are underrepresented are marked in blue in the table.
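The counting described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the Proteome-pI implementation: the names `dipeptide_permille` and `classify`, the choice to count overlapping pairs within each protein (never across protein boundaries), and the toy sequences are all made up for the example.

```python
from collections import Counter
from itertools import product

STANDARD = "ACDEFGHIKLMNPQRSTVWY"   # 20 standard amino acids (one-letter codes)
ALPHABET = STANDARD + "X"           # X stands for Xaa (non-standard or unknown)

# Uniform expectation over the 400 standard dipeptides: 0.25 % = 2.5 per mille.
UNIFORM_PERMILLE = 1000.0 / 400


def dipeptide_permille(proteins):
    """Return {dipeptide: frequency in per mille} for a list of sequences.

    Assumption: overlapping adjacent pairs are counted inside each protein.
    """
    counts = Counter()
    for seq in proteins:
        # Map every letter outside the 20 standard ones to X (Xaa).
        seq = "".join(c if c in STANDARD else "X" for c in seq.upper())
        counts.update(seq[i:i + 2] for i in range(len(seq) - 1))
    total = sum(counts.values())
    return {
        a + b: (1000.0 * counts[a + b] / total if total else 0.0)
        for a, b in product(ALPHABET, repeat=2)
    }


def classify(freq_permille):
    """Label a frequency relative to the uniform 2.5 per mille expectation."""
    if freq_permille > UNIFORM_PERMILLE:
        return "over"    # would be marked red
    if freq_permille < UNIFORM_PERMILLE:
        return "under"   # would be marked blue
    return "expected"


# Hypothetical toy input: pairs are MA, AA, AG, GL and AA, AK (6 in total).
toy = ["MAAGL", "AAK"]
freqs = dipeptide_permille(toy)
print(round(freqs["AA"], 3), classify(freqs["AA"]))
```

With real data one would pass all protein sequences of the proteome instead of `toy`; values from the table such as 12.186 (AlaAla) or 0.214 (CysCys) then fall on either side of the 2.5‰ threshold.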

All values are given in per mille (‰) and therefore need to be multiplied by 10^-3 to obtain fractions.
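As a small worked example of this unit conversion, the snippet below turns a per mille value into a fraction and a percentage. The helper names are hypothetical; the input value 12.186 is the AlaAla entry from the table.

```python
def permille_to_fraction(value_permille):
    """Convert per mille to a plain fraction (multiply by 10^-3)."""
    return value_permille * 1e-3


def permille_to_percent(value_permille):
    """Convert per mille to percent (1 % = 10 per mille)."""
    return value_permille / 10.0


ala_ala = 12.186  # AlaAla frequency in per mille, taken from the table
print(permille_to_fraction(ala_ala))  # about 0.012186
print(permille_to_percent(ala_ala))   # about 1.2186 (%)
```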

For more information, see the sequence space article on Wikipedia.

Dipeptide frequencies in ‰ (value ± error), grouped by first residue.
Residue order: Ala Cys Asp Glu Phe Gly His Ile Lys Leu Met Asn Pro Gln Arg Ser Thr Val Trp Tyr Xaa

Ala
AlaAla: 12.186 ± 1.15
AlaCys: 1.154 ± 0.311
AlaAsp: 6.328 ± 0.539
AlaGlu: 8.637 ± 0.65
AlaPhe: 2.865 ± 0.353
AlaGly: 8.722 ± 0.901
AlaHis: 2.394 ± 0.333
AlaIle: 3.848 ± 0.443
AlaLys: 5.045 ± 0.394
AlaLeu: 7.953 ± 0.632
AlaMet: 3.121 ± 0.313
AlaAsn: 3.036 ± 0.458
AlaPro: 4.618 ± 0.376
AlaGln: 3.549 ± 0.381
AlaArg: 6.071 ± 0.693
AlaSer: 5.644 ± 0.552
AlaThr: 6.285 ± 0.46
AlaVal: 5.815 ± 0.439
AlaTrp: 1.796 ± 0.254
AlaTyr: 2.651 ± 0.3
AlaXaa: 0.0 ± 0.0

Cys
CysAla: 1.154 ± 0.244
CysCys: 0.214 ± 0.099
CysAsp: 0.599 ± 0.168
CysGlu: 0.684 ± 0.195
CysPhe: 0.342 ± 0.126
CysGly: 2.052 ± 0.361
CysHis: 0.257 ± 0.11
CysIle: 0.47 ± 0.136
CysLys: 0.556 ± 0.138
CysLeu: 1.368 ± 0.233
CysMet: 0.128 ± 0.07
CysAsn: 0.47 ± 0.141
CysPro: 0.641 ± 0.197
CysGln: 0.513 ± 0.176
CysArg: 0.641 ± 0.184
CysSer: 0.812 ± 0.223
CysThr: 0.47 ± 0.157
CysVal: 0.684 ± 0.159
CysTrp: 0.214 ± 0.104
CysTyr: 0.428 ± 0.124
CysXaa: 0.0 ± 0.0

Asp
AspAla: 5.858 ± 0.612
AspCys: 1.24 ± 0.277
AspAsp: 3.891 ± 0.485
AspGlu: 4.404 ± 0.515
AspPhe: 1.411 ± 0.227
AspGly: 7.055 ± 0.538
AspHis: 1.539 ± 0.276
AspIle: 3.25 ± 0.393
AspLys: 2.608 ± 0.353
AspLeu: 4.618 ± 0.39
AspMet: 1.496 ± 0.267
AspAsn: 1.753 ± 0.326
AspPro: 3.848 ± 0.304
AspGln: 1.71 ± 0.33
AspArg: 4.062 ± 0.485
AspSer: 3.079 ± 0.347
AspThr: 3.506 ± 0.369
AspVal: 4.361 ± 0.376
AspTrp: 1.539 ± 0.213
AspTyr: 2.651 ± 0.415
AspXaa: 0.0 ± 0.0

Glu
GluAla: 6.969 ± 0.583
GluCys: 1.112 ± 0.221
GluAsp: 4.96 ± 0.486
GluGlu: 4.917 ± 0.654
GluPhe: 2.651 ± 0.294
GluGly: 4.404 ± 0.426
GluHis: 1.411 ± 0.217
GluIle: 2.907 ± 0.338
GluLys: 2.651 ± 0.394
GluLeu: 7.269 ± 0.585
GluMet: 1.197 ± 0.209
GluAsn: 2.095 ± 0.271
GluPro: 2.95 ± 0.406
GluGln: 2.352 ± 0.299
GluArg: 4.105 ± 0.426
GluSer: 2.608 ± 0.346
GluThr: 3.677 ± 0.358
GluVal: 4.532 ± 0.489
GluTrp: 1.24 ± 0.255
GluTyr: 2.352 ± 0.372
GluXaa: 0.0 ± 0.0

Phe
PheAla: 2.822 ± 0.369
PheCys: 0.599 ± 0.176
PheAsp: 2.394 ± 0.309
PheGlu: 2.052 ± 0.267
PhePhe: 0.983 ± 0.235
PheGly: 2.565 ± 0.366
PheHis: 0.983 ± 0.188
PheIle: 1.411 ± 0.256
PheLys: 1.197 ± 0.188
PheLeu: 1.924 ± 0.323
PheMet: 0.556 ± 0.131
PheAsn: 1.026 ± 0.21
PhePro: 1.197 ± 0.195
PheGln: 0.812 ± 0.194
PheArg: 1.325 ± 0.211
PheSer: 1.924 ± 0.284
PheThr: 2.181 ± 0.301
PheVal: 2.01 ± 0.381
PheTrp: 0.299 ± 0.093
PheTyr: 0.941 ± 0.211
PheXaa: 0.0 ± 0.0

Gly
GlyAla: 7.91 ± 0.704
GlyCys: 0.684 ± 0.186
GlyAsp: 5.687 ± 0.466
GlyGlu: 5.43 ± 0.414
GlyPhe: 2.181 ± 0.323
GlyGly: 9.62 ± 1.842
GlyHis: 2.181 ± 0.282
GlyIle: 4.147 ± 0.53
GlyLys: 4.019 ± 0.406
GlyLeu: 6.542 ± 0.469
GlyMet: 2.138 ± 0.356
GlyAsn: 3.421 ± 0.374
GlyPro: 3.677 ± 0.521
GlyGln: 2.48 ± 0.389
GlyArg: 4.19 ± 0.404
GlySer: 5.302 ± 0.661
GlyThr: 6.499 ± 0.681
GlyVal: 6.585 ± 0.517
GlyTrp: 1.753 ± 0.235
GlyTyr: 3.506 ± 0.456
GlyXaa: 0.0 ± 0.0

His
HisAla: 1.924 ± 0.337
HisCys: 0.513 ± 0.156
HisAsp: 1.539 ± 0.28
HisGlu: 1.411 ± 0.274
HisPhe: 0.684 ± 0.133
HisGly: 2.309 ± 0.29
HisHis: 1.026 ± 0.217
HisIle: 1.026 ± 0.207
HisLys: 1.112 ± 0.234
HisLeu: 2.523 ± 0.319
HisMet: 0.513 ± 0.128
HisAsn: 0.983 ± 0.203
HisPro: 1.668 ± 0.227
HisGln: 0.898 ± 0.235
HisArg: 1.539 ± 0.25
HisSer: 1.24 ± 0.192
HisThr: 1.325 ± 0.246
HisVal: 1.368 ± 0.24
HisTrp: 0.342 ± 0.123
HisTyr: 0.983 ± 0.221
HisXaa: 0.0 ± 0.0

Ile
IleAla: 4.575 ± 0.426
IleCys: 0.385 ± 0.138
IleAsp: 3.463 ± 0.367
IleGlu: 3.592 ± 0.42
IlePhe: 0.941 ± 0.19
IleGly: 3.25 ± 0.416
IleHis: 1.283 ± 0.253
IleIle: 1.582 ± 0.284
IleLys: 1.839 ± 0.25
IleLeu: 3.25 ± 0.388
IleMet: 0.855 ± 0.166
IleAsn: 2.01 ± 0.29
IlePro: 2.437 ± 0.278
IleGln: 1.71 ± 0.308
IleArg: 2.608 ± 0.309
IleSer: 2.651 ± 0.428
IleThr: 2.865 ± 0.369
IleVal: 3.378 ± 0.392
IleTrp: 0.513 ± 0.133
IleTyr: 1.24 ± 0.225
IleXaa: 0.0 ± 0.0

Lys
LysAla: 4.789 ± 0.464
LysCys: 0.684 ± 0.16
LysAsp: 2.651 ± 0.42
LysGlu: 2.394 ± 0.314
LysPhe: 1.24 ± 0.216
LysGly: 3.421 ± 0.42
LysHis: 1.197 ± 0.245
LysIle: 1.668 ± 0.274
LysLys: 2.309 ± 0.322
LysLeu: 3.934 ± 0.351
LysMet: 1.197 ± 0.243
LysAsn: 1.582 ± 0.316
LysPro: 2.608 ± 0.408
LysGln: 1.71 ± 0.248
LysArg: 3.036 ± 0.473
LysSer: 2.865 ± 0.407
LysThr: 1.839 ± 0.257
LysVal: 3.506 ± 0.286
LysTrp: 1.197 ± 0.244
LysTyr: 1.112 ± 0.218
LysXaa: 0.0 ± 0.0

Leu
LeuAla: 9.193 ± 0.827
LeuCys: 0.727 ± 0.186
LeuAsp: 4.618 ± 0.441
LeuGlu: 4.661 ± 0.492
LeuPhe: 2.181 ± 0.368
LeuGly: 5.943 ± 0.484
LeuHis: 1.796 ± 0.257
LeuIle: 3.934 ± 0.42
LeuLys: 3.934 ± 0.361
LeuLeu: 6.585 ± 0.588
LeuMet: 2.437 ± 0.294
LeuAsn: 3.463 ± 0.375
LeuPro: 3.805 ± 0.424
LeuGln: 2.907 ± 0.389
LeuArg: 5.43 ± 0.506
LeuSer: 5.045 ± 0.46
LeuThr: 5.858 ± 0.586
LeuVal: 5.302 ± 0.431
LeuTrp: 1.24 ± 0.252
LeuTyr: 1.71 ± 0.296
LeuXaa: 0.0 ± 0.0

Met
MetAla: 2.608 ± 0.326
MetCys: 0.299 ± 0.098
MetAsp: 1.24 ± 0.262
MetGlu: 1.496 ± 0.282
MetPhe: 0.641 ± 0.157
MetGly: 1.625 ± 0.277
MetHis: 0.385 ± 0.118
MetIle: 1.026 ± 0.218
MetLys: 1.753 ± 0.323
MetLeu: 1.539 ± 0.211
MetMet: 0.47 ± 0.138
MetAsn: 0.855 ± 0.2
MetPro: 1.069 ± 0.192
MetGln: 0.641 ± 0.209
MetArg: 1.283 ± 0.188
MetSer: 2.651 ± 0.391
MetThr: 1.967 ± 0.278
MetVal: 1.411 ± 0.229
MetTrp: 0.599 ± 0.14
MetTyr: 0.299 ± 0.114
MetXaa: 0.0 ± 0.0

Asn
AsnAla: 2.95 ± 0.395
AsnCys: 0.299 ± 0.116
AsnAsp: 2.266 ± 0.303
AsnGlu: 1.881 ± 0.29
AsnPhe: 0.983 ± 0.232
AsnGly: 4.661 ± 0.584
AsnHis: 0.812 ± 0.189
AsnIle: 1.325 ± 0.32
AsnLys: 0.941 ± 0.195
AsnLeu: 3.463 ± 0.498
AsnMet: 0.385 ± 0.187
AsnAsn: 0.983 ± 0.23
AsnPro: 2.779 ± 0.393
AsnGln: 1.197 ± 0.185
AsnArg: 2.138 ± 0.335
AsnSer: 1.924 ± 0.299
AsnThr: 1.881 ± 0.293
AsnVal: 2.052 ± 0.344
AsnTrp: 0.556 ± 0.162
AsnTyr: 0.77 ± 0.171
AsnXaa: 0.0 ± 0.0

Pro
ProAla: 5.174 ± 0.58
ProCys: 0.641 ± 0.204
ProAsp: 3.463 ± 0.357
ProGlu: 3.378 ± 0.369
ProPhe: 1.796 ± 0.306
ProGly: 4.96 ± 0.629
ProHis: 1.582 ± 0.249
ProIle: 2.352 ± 0.282
ProLys: 2.565 ± 0.449
ProLeu: 3.763 ± 0.298
ProMet: 1.325 ± 0.198
ProAsn: 1.796 ± 0.255
ProPro: 2.993 ± 0.445
ProGln: 1.796 ± 0.285
ProArg: 2.608 ± 0.43
ProSer: 3.036 ± 0.313
ProThr: 3.335 ± 0.325
ProVal: 3.549 ± 0.435
ProTrp: 1.24 ± 0.196
ProTyr: 1.668 ± 0.254
ProXaa: 0.0 ± 0.0

Gln
GlnAla: 3.805 ± 0.363
GlnCys: 0.342 ± 0.143
GlnAsp: 1.71 ± 0.274
GlnGlu: 1.924 ± 0.286
GlnPhe: 0.77 ± 0.176
GlnGly: 2.394 ± 0.337
GlnHis: 0.428 ± 0.139
GlnIle: 1.582 ± 0.259
GlnLys: 1.496 ± 0.298
GlnLeu: 3.549 ± 0.519
GlnMet: 0.898 ± 0.205
GlnAsn: 0.941 ± 0.199
GlnPro: 1.668 ± 0.246
GlnGln: 0.983 ± 0.199
GlnArg: 2.437 ± 0.32
GlnSer: 2.052 ± 0.323
GlnThr: 1.625 ± 0.233
GlnVal: 2.565 ± 0.331
GlnTrp: 0.812 ± 0.194
GlnTyr: 0.983 ± 0.191
GlnXaa: 0.0 ± 0.0

Arg
ArgAla: 6.756 ± 0.774
ArgCys: 0.812 ± 0.227
ArgAsp: 4.19 ± 0.45
ArgGlu: 3.378 ± 0.363
ArgPhe: 1.796 ± 0.279
ArgGly: 3.805 ± 0.428
ArgHis: 2.01 ± 0.337
ArgIle: 2.608 ± 0.373
ArgLys: 3.079 ± 0.381
ArgLeu: 4.404 ± 0.544
ArgMet: 2.095 ± 0.353
ArgAsn: 1.582 ± 0.242
ArgPro: 2.993 ± 0.356
ArgGln: 1.881 ± 0.249
ArgArg: 3.976 ± 0.593
ArgSer: 2.736 ± 0.362
ArgThr: 2.779 ± 0.379
ArgVal: 4.361 ± 0.537
ArgTrp: 1.496 ± 0.25
ArgTyr: 2.309 ± 0.272
ArgXaa: 0.0 ± 0.0

Ser
SerAla: 5.088 ± 0.492
SerCys: 0.428 ± 0.118
SerAsp: 3.164 ± 0.346
SerGlu: 3.634 ± 0.394
SerPhe: 2.052 ± 0.304
SerGly: 5.986 ± 0.987
SerHis: 1.454 ± 0.232
SerIle: 2.181 ± 0.329
SerLys: 2.138 ± 0.336
SerLeu: 4.96 ± 0.576
SerMet: 1.197 ± 0.19
SerAsn: 2.052 ± 0.265
SerPro: 2.907 ± 0.328
SerGln: 1.539 ± 0.311
SerArg: 3.292 ± 0.419
SerSer: 3.805 ± 0.559
SerThr: 3.805 ± 0.53
SerVal: 3.848 ± 0.389
SerTrp: 1.496 ± 0.254
SerTyr: 1.454 ± 0.246
SerXaa: 0.0 ± 0.0

Thr
ThrAla: 6.499 ± 0.57
ThrCys: 0.684 ± 0.174
ThrAsp: 3.592 ± 0.368
ThrGlu: 4.105 ± 0.418
ThrPhe: 2.01 ± 0.301
ThrGly: 5.387 ± 0.563
ThrHis: 1.112 ± 0.187
ThrIle: 3.207 ± 0.369
ThrLys: 2.48 ± 0.367
ThrLeu: 4.917 ± 0.444
ThrMet: 1.411 ± 0.286
ThrAsn: 2.223 ± 0.32
ThrPro: 5.088 ± 0.541
ThrGln: 2.01 ± 0.354
ThrArg: 2.95 ± 0.502
ThrSer: 2.565 ± 0.294
ThrThr: 3.292 ± 0.445
ThrVal: 5.473 ± 0.415
ThrTrp: 1.411 ± 0.358
ThrTyr: 2.01 ± 0.243
ThrXaa: 0.0 ± 0.0

Val
ValAla: 7.44 ± 0.54
ValCys: 0.941 ± 0.196
ValAsp: 4.917 ± 0.509
ValGlu: 5.644 ± 0.501
ValPhe: 2.095 ± 0.336
ValGly: 5.131 ± 0.441
ValHis: 1.753 ± 0.269
ValIle: 3.549 ± 0.387
ValLys: 3.421 ± 0.434
ValLeu: 4.019 ± 0.453
ValMet: 1.496 ± 0.255
ValAsn: 2.523 ± 0.33
ValPro: 3.634 ± 0.318
ValGln: 2.223 ± 0.317
ValArg: 4.147 ± 0.464
ValSer: 3.976 ± 0.38
ValThr: 5.516 ± 0.667
ValVal: 5.943 ± 0.671
ValTrp: 0.812 ± 0.229
ValTyr: 1.924 ± 0.336
ValXaa: 0.0 ± 0.0

Trp
TrpAla: 1.454 ± 0.234
TrpCys: 0.47 ± 0.142
TrpAsp: 1.582 ± 0.248
TrpGlu: 0.812 ± 0.22
TrpPhe: 0.77 ± 0.209
TrpGly: 1.24 ± 0.264
TrpHis: 0.641 ± 0.162
TrpIle: 1.026 ± 0.198
TrpLys: 0.684 ± 0.153
TrpLeu: 1.668 ± 0.278
TrpMet: 0.299 ± 0.124
TrpAsn: 0.727 ± 0.169
TrpPro: 0.812 ± 0.156
TrpGln: 0.898 ± 0.217
TrpArg: 0.898 ± 0.219
TrpSer: 1.197 ± 0.218
TrpThr: 1.154 ± 0.253
TrpVal: 2.138 ± 0.312
TrpTrp: 0.47 ± 0.139
TrpTyr: 0.641 ± 0.175
TrpXaa: 0.0 ± 0.0

Tyr
TyrAla: 2.565 ± 0.322
TyrCys: 0.47 ± 0.158
TyrAsp: 1.668 ± 0.341
TyrGlu: 1.924 ± 0.264
TyrPhe: 0.812 ± 0.193
TyrGly: 3.421 ± 0.487
TyrHis: 0.684 ± 0.172
TyrIle: 1.325 ± 0.213
TyrLys: 1.112 ± 0.217
TyrLeu: 2.523 ± 0.364
TyrMet: 0.599 ± 0.184
TyrAsn: 0.727 ± 0.149
TyrPro: 1.625 ± 0.304
TyrGln: 1.154 ± 0.209
TyrArg: 2.309 ± 0.3
TyrSer: 1.368 ± 0.218
TyrThr: 2.565 ± 0.345
TyrVal: 2.223 ± 0.327
TyrTrp: 0.556 ± 0.188
TyrTyr: 0.77 ± 0.206
TyrXaa: 0.0 ± 0.0

Xaa
XaaAla: 0.0 ± 0.0
XaaCys: 0.0 ± 0.0
XaaAsp: 0.0 ± 0.0
XaaGlu: 0.0 ± 0.0
XaaPhe: 0.0 ± 0.0
XaaGly: 0.0 ± 0.0
XaaHis: 0.0 ± 0.0
XaaIle: 0.0 ± 0.0
XaaLys: 0.0 ± 0.0
XaaLeu: 0.0 ± 0.0
XaaMet: 0.0 ± 0.0
XaaAsn: 0.0 ± 0.0
XaaPro: 0.0 ± 0.0
XaaGln: 0.0 ± 0.0
XaaArg: 0.0 ± 0.0
XaaSer: 0.0 ± 0.0
XaaThr: 0.0 ± 0.0
XaaVal: 0.0 ± 0.0
XaaTrp: 0.0 ± 0.0
XaaTyr: 0.0 ± 0.0
XaaXaa: 0.0 ± 0.0

Statistics based on 140 proteins (23389 amino acids)

Note: The error was estimated by bootstrapping (100 replicates) at the protein level.
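One way such a protein-level bootstrap could look is sketched below. The source does not describe the exact procedure, so this is an assumption: whole proteins are resampled with replacement, the dipeptide frequency is recomputed for each replicate, and the standard deviation over replicates is reported as the error. The function names, the fixed seed, and the toy sequences are hypothetical.

```python
import random
import statistics
from collections import Counter


def pair_counts(seq):
    """Count overlapping adjacent residue pairs within one protein."""
    return Counter(seq[i:i + 2] for i in range(len(seq) - 1))


def bootstrap_error(proteins, dipeptide, n_boot=100, seed=0):
    """Return (mean, standard deviation) of a dipeptide frequency in per mille.

    Assumption: the resampling unit is the whole protein, not the residue.
    """
    rng = random.Random(seed)
    per_protein = [pair_counts(p) for p in proteins]
    totals = [sum(c.values()) for c in per_protein]
    indices = range(len(per_protein))
    estimates = []
    for _ in range(n_boot):
        # Draw as many proteins as in the original set, with replacement.
        sample = rng.choices(indices, k=len(per_protein))
        total = sum(totals[i] for i in sample)
        hits = sum(per_protein[i][dipeptide] for i in sample)
        estimates.append(1000.0 * hits / total if total else 0.0)
    return statistics.mean(estimates), statistics.stdev(estimates)


# Hypothetical toy proteome; a real run would use all 140 protein sequences.
toy = ["MAAGL", "AAK", "GLLK"]
mean_aa, err_aa = bootstrap_error(toy, "AA")
print(f"AA: {mean_aa:.3f} ± {err_aa:.3f} per mille")
```

Resampling whole proteins rather than individual residues keeps the pairs of each protein together, so a single protein with an unusual composition (for example a glycine-rich one) contributes to the spread as a unit.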

The dipeptide statistics above (among other statistics for this proteome) can be downloaded from this CSV file
See this proteome in: uniprot_link
Proteome-pI is available under the Creative Commons Attribution-NoDerivs license; for more details see here

Reference: Kozlowski LP. Proteome-pI 2.0: Proteome Isoelectric Point Database Update. Nucleic Acids Res. 2021, doi: 10.1093/nar/gkab944
Contact: Lukasz P. Kozlowski