Amino acid dipeptide frequency for Mycobacterium phage CRB2

Apart from single amino acid frequencies, one can also calculate the so-called amino acid dipeptide frequency.
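A minimal sketch of how such a frequency could be computed is given here. It assumes one-letter sequences (the table uses three-letter codes), overlapping pairs counted within each protein only, and normalisation by the number of pairs counted; the function name `dipeptide_per_mille` is hypothetical, and the database's exact counting and normalisation may differ.

```python
from collections import Counter


def dipeptide_per_mille(proteins):
    """Return {dipeptide: frequency in per mille} over overlapping pairs."""
    counts = Counter()
    for seq in proteins:
        # Assumption: pairs are counted inside each protein, never across
        # the boundary between two proteins.
        counts.update(seq[i:i + 2] for i in range(len(seq) - 1))
    total = sum(counts.values())
    if total == 0:
        return {}
    return {pair: 1000.0 * n / total for pair, n in counts.items()}


# Toy input, not taken from the proteome: pairs are MA, AA, AG, AA, AK.
freqs = dipeptide_per_mille(["MAAG", "AAK"])
```

Here `freqs["AA"]` is the share of all counted pairs that are Ala-Ala, expressed in per mille.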

There are 400 possible dipeptides (441 if Xaa is counted as an additional 21st amino acid representing all non-standard or unknown amino acids). Thus, if dipeptides occurred randomly in proteins, each dipeptide would be present at a frequency of 0.25% (1/400, i.e. 2.5‰). As this is not the case in nature, for better visibility all overrepresented dipeptides are marked in red, and underrepresented ones in blue, in the table.
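The uniform baseline can be worked out in a few lines. The sketch below assumes that "more than expected" simply means above the uniform value of 1000/400 = 2.5‰; the page does not state the exact threshold used for colouring, and the helper names are hypothetical.

```python
def expected_per_mille(alphabet_size=20):
    """Uniform expectation for one dipeptide, in per mille."""
    return 1000.0 / alphabet_size ** 2


def classify(value, expected):
    """Label a per-mille value relative to the uniform expectation."""
    if value > expected:
        return "over"
    if value < expected:
        return "under"
    return "as expected"


baseline = expected_per_mille()          # 20 standard amino acids
baseline_xaa = expected_per_mille(21)    # with Xaa as a 21st symbol

# Two values taken from the table: AlaAla and CysCys.
labels = {
    "AlaAla": classify(17.91, baseline),
    "CysCys": classify(0.131, baseline),
}
```

With the 20-letter alphabet the baseline is 2.5‰, so AlaAla (17.91‰) falls well above it and CysCys (0.131‰) well below.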

All values are presented in per mille (‰) and therefore need to be multiplied by 10⁻³ to obtain fractions.

For more information, see the sequence space article on Wikipedia.

Ala Cys Asp Glu Phe Gly His Ile Lys Leu Met Asn Pro Gln Arg Ser Thr Val Trp Tyr Xaa
Ala
AlaAla: 17.91 ± 1.353
AlaCys: 0.523 ± 0.144
AlaAsp: 8.323 ± 0.805
AlaGlu: 8.846 ± 0.664
AlaPhe: 2.658 ± 0.332
AlaGly: 11.679 ± 1.249
AlaHis: 2.702 ± 0.408
AlaIle: 5.36 ± 0.491
AlaLys: 5.011 ± 0.616
AlaLeu: 11.156 ± 0.83
AlaMet: 3.617 ± 0.379
AlaAsn: 3.268 ± 0.388
AlaPro: 7.931 ± 0.8
AlaGln: 4.401 ± 0.567
AlaArg: 8.323 ± 0.854
AlaSer: 7.059 ± 0.477
AlaThr: 7.103 ± 0.573
AlaVal: 8.803 ± 0.731
AlaTrp: 2.266 ± 0.269
AlaTyr: 2.353 ± 0.306
AlaXaa: 0.0 ± 0.0
Cys
CysAla: 0.784 ± 0.191
CysCys: 0.131 ± 0.081
CysAsp: 0.61 ± 0.154
CysGlu: 0.349 ± 0.128
CysPhe: 0.174 ± 0.095
CysGly: 1.569 ± 0.255
CysHis: 0.261 ± 0.097
CysIle: 0.261 ± 0.119
CysLys: 0.174 ± 0.08
CysLeu: 0.305 ± 0.146
CysMet: 0.174 ± 0.089
CysAsn: 0.174 ± 0.082
CysPro: 0.872 ± 0.214
CysGln: 0.305 ± 0.126
CysArg: 1.089 ± 0.25
CysSer: 0.828 ± 0.206
CysThr: 0.392 ± 0.165
CysVal: 0.741 ± 0.198
CysTrp: 0.218 ± 0.114
CysTyr: 0.174 ± 0.086
CysXaa: 0.0 ± 0.0
Asp
AspAla: 6.667 ± 0.506
AspCys: 0.741 ± 0.208
AspAsp: 4.14 ± 0.561
AspGlu: 4.096 ± 0.577
AspPhe: 1.612 ± 0.235
AspGly: 6.319 ± 0.494
AspHis: 1.874 ± 0.327
AspIle: 2.876 ± 0.261
AspLys: 1.438 ± 0.274
AspLeu: 4.706 ± 0.504
AspMet: 1.394 ± 0.233
AspAsn: 1.264 ± 0.207
AspPro: 4.488 ± 0.409
AspGln: 2.484 ± 0.42
AspArg: 3.617 ± 0.487
AspSer: 3.181 ± 0.363
AspThr: 2.876 ± 0.315
AspVal: 3.443 ± 0.384
AspTrp: 1.307 ± 0.296
AspTyr: 1.482 ± 0.296
AspXaa: 0.0 ± 0.0
Glu
GluAla: 9.02 ± 0.777
GluCys: 1.002 ± 0.237
GluAsp: 2.832 ± 0.473
GluGlu: 3.835 ± 0.451
GluPhe: 1.656 ± 0.284
GluGly: 4.924 ± 0.627
GluHis: 1.089 ± 0.172
GluIle: 3.181 ± 0.388
GluLys: 1.917 ± 0.282
GluLeu: 6.58 ± 0.504
GluMet: 1.438 ± 0.285
GluAsn: 1.307 ± 0.26
GluPro: 4.183 ± 0.563
GluGln: 2.658 ± 0.354
GluArg: 4.924 ± 0.597
GluSer: 2.135 ± 0.279
GluThr: 3.225 ± 0.378
GluVal: 4.924 ± 0.583
GluTrp: 0.828 ± 0.191
GluTyr: 1.046 ± 0.167
GluXaa: 0.0 ± 0.0
Phe
PheAla: 3.05 ± 0.294
PheCys: 0.131 ± 0.082
PheAsp: 2.484 ± 0.398
PheGlu: 1.438 ± 0.262
PhePhe: 0.566 ± 0.157
PheGly: 2.745 ± 0.439
PheHis: 0.959 ± 0.18
PheIle: 1.22 ± 0.237
PheLys: 0.828 ± 0.206
PheLeu: 1.525 ± 0.313
PheMet: 0.349 ± 0.112
PheAsn: 0.915 ± 0.213
PhePro: 1.046 ± 0.196
PheGln: 0.566 ± 0.159
PheArg: 1.264 ± 0.2
PheSer: 1.874 ± 0.297
PheThr: 1.438 ± 0.23
PheVal: 2.005 ± 0.306
PheTrp: 0.436 ± 0.136
PheTyr: 0.305 ± 0.122
PheXaa: 0.0 ± 0.0
Gly
GlyAla: 11.069 ± 1.264
GlyCys: 1.089 ± 0.234
GlyAsp: 4.924 ± 0.407
GlyGlu: 6.144 ± 0.547
GlyPhe: 2.135 ± 0.408
GlyGly: 10.633 ± 2.362
GlyHis: 1.83 ± 0.361
GlyIle: 4.227 ± 0.46
GlyLys: 3.399 ± 0.36
GlyLeu: 9.413 ± 0.762
GlyMet: 2.266 ± 0.408
GlyAsn: 2.484 ± 0.315
GlyPro: 4.183 ± 0.553
GlyGln: 4.183 ± 0.411
GlyArg: 5.621 ± 0.438
GlySer: 5.316 ± 0.5
GlyThr: 5.534 ± 0.503
GlyVal: 6.493 ± 0.685
GlyTrp: 3.138 ± 0.388
GlyTyr: 2.135 ± 0.332
GlyXaa: 0.0 ± 0.0
His
HisAla: 2.44 ± 0.352
HisCys: 0.261 ± 0.127
HisAsp: 1.264 ± 0.245
HisGlu: 0.784 ± 0.194
HisPhe: 0.784 ± 0.205
HisGly: 2.005 ± 0.286
HisHis: 0.741 ± 0.211
HisIle: 0.915 ± 0.22
HisLys: 0.392 ± 0.138
HisLeu: 1.83 ± 0.314
HisMet: 0.479 ± 0.15
HisAsn: 0.523 ± 0.156
HisPro: 1.743 ± 0.319
HisGln: 0.872 ± 0.203
HisArg: 1.83 ± 0.316
HisSer: 0.566 ± 0.156
HisThr: 1.177 ± 0.242
HisVal: 1.046 ± 0.194
HisTrp: 0.61 ± 0.15
HisTyr: 0.479 ± 0.135
HisXaa: 0.0 ± 0.0
Ile
IleAla: 6.449 ± 0.48
IleCys: 0.261 ± 0.118
IleAsp: 3.312 ± 0.335
IleGlu: 4.532 ± 0.455
IlePhe: 1.002 ± 0.162
IleGly: 4.706 ± 0.552
IleHis: 0.741 ± 0.19
IleIle: 2.31 ± 0.37
IleLys: 1.351 ± 0.319
IleLeu: 2.789 ± 0.354
IleMet: 0.436 ± 0.128
IleAsn: 2.135 ± 0.296
IlePro: 2.658 ± 0.304
IleGln: 1.525 ± 0.251
IleArg: 2.615 ± 0.439
IleSer: 2.484 ± 0.299
IleThr: 2.789 ± 0.351
IleVal: 2.527 ± 0.372
IleTrp: 0.523 ± 0.162
IleTyr: 0.784 ± 0.161
IleXaa: 0.0 ± 0.0
Lys
LysAla: 4.445 ± 0.55
LysCys: 0.305 ± 0.101
LysAsp: 1.699 ± 0.277
LysGlu: 1.525 ± 0.256
LysPhe: 0.523 ± 0.145
LysGly: 2.789 ± 0.374
LysHis: 0.436 ± 0.15
LysIle: 1.699 ± 0.237
LysLys: 1.264 ± 0.305
LysLeu: 3.094 ± 0.393
LysMet: 0.915 ± 0.208
LysAsn: 0.784 ± 0.164
LysPro: 1.264 ± 0.295
LysGln: 0.741 ± 0.183
LysArg: 1.612 ± 0.3
LysSer: 1.917 ± 0.27
LysThr: 1.612 ± 0.231
LysVal: 2.876 ± 0.359
LysTrp: 0.523 ± 0.161
LysTyr: 0.436 ± 0.131
LysXaa: 0.0 ± 0.0
Leu
LeuAla: 12.158 ± 0.714
LeuCys: 0.784 ± 0.22
LeuAsp: 5.839 ± 0.39
LeuGlu: 4.793 ± 0.488
LeuPhe: 1.917 ± 0.322
LeuGly: 8.541 ± 0.884
LeuHis: 1.569 ± 0.283
LeuIle: 4.14 ± 0.338
LeuLys: 1.961 ± 0.287
LeuLeu: 6.275 ± 0.614
LeuMet: 1.961 ± 0.291
LeuAsn: 2.31 ± 0.342
LeuPro: 6.449 ± 0.503
LeuGln: 2.658 ± 0.453
LeuArg: 4.619 ± 0.592
LeuSer: 5.011 ± 0.543
LeuThr: 5.796 ± 0.638
LeuVal: 4.924 ± 0.429
LeuTrp: 1.394 ± 0.237
LeuTyr: 1.874 ± 0.321
LeuXaa: 0.0 ± 0.0
Met
MetAla: 3.138 ± 0.328
MetCys: 0.218 ± 0.087
MetAsp: 1.307 ± 0.249
MetGlu: 1.351 ± 0.314
MetPhe: 0.784 ± 0.171
MetGly: 1.656 ± 0.318
MetHis: 0.523 ± 0.164
MetIle: 1.394 ± 0.264
MetLys: 0.697 ± 0.212
MetLeu: 1.787 ± 0.245
MetMet: 0.523 ± 0.137
MetAsn: 0.523 ± 0.146
MetPro: 1.83 ± 0.24
MetGln: 0.436 ± 0.112
MetArg: 1.699 ± 0.235
MetSer: 1.874 ± 0.268
MetThr: 1.917 ± 0.29
MetVal: 1.394 ± 0.248
MetTrp: 0.305 ± 0.117
MetTyr: 0.479 ± 0.118
MetXaa: 0.0 ± 0.0
Asn
AsnAla: 3.66 ± 0.45
AsnCys: 0.261 ± 0.088
AsnAsp: 1.612 ± 0.324
AsnGlu: 1.002 ± 0.185
AsnPhe: 0.959 ± 0.225
AsnGly: 2.702 ± 0.281
AsnHis: 0.479 ± 0.126
AsnIle: 1.133 ± 0.209
AsnLys: 0.741 ± 0.186
AsnLeu: 2.571 ± 0.407
AsnMet: 0.654 ± 0.138
AsnAsn: 0.523 ± 0.176
AsnPro: 2.527 ± 0.384
AsnGln: 0.697 ± 0.179
AsnArg: 1.089 ± 0.231
AsnSer: 1.307 ± 0.263
AsnThr: 1.656 ± 0.274
AsnVal: 1.656 ± 0.24
AsnTrp: 0.61 ± 0.168
AsnTyr: 0.566 ± 0.173
AsnXaa: 0.0 ± 0.0
Pro
ProAla: 9.108 ± 0.729
ProCys: 0.479 ± 0.133
ProAsp: 4.75 ± 0.527
ProGlu: 5.316 ± 0.485
ProPhe: 1.612 ± 0.272
ProGly: 7.277 ± 0.576
ProHis: 0.915 ± 0.164
ProIle: 2.789 ± 0.386
ProLys: 1.83 ± 0.259
ProLeu: 4.271 ± 0.376
ProMet: 1.089 ± 0.184
ProAsn: 1.874 ± 0.256
ProPro: 5.273 ± 0.543
ProGln: 1.569 ± 0.227
ProArg: 2.571 ± 0.359
ProSer: 2.745 ± 0.472
ProThr: 4.401 ± 0.643
ProVal: 4.793 ± 0.551
ProTrp: 1.046 ± 0.169
ProTyr: 0.915 ± 0.212
ProXaa: 0.0 ± 0.0
Gln
GlnAla: 5.055 ± 0.606
GlnCys: 0.131 ± 0.079
GlnAsp: 1.133 ± 0.238
GlnGlu: 1.961 ± 0.303
GlnPhe: 1.089 ± 0.229
GlnGly: 3.225 ± 0.358
GlnHis: 0.523 ± 0.151
GlnIle: 1.743 ± 0.261
GlnLys: 0.697 ± 0.202
GlnLeu: 4.096 ± 0.337
GlnMet: 1.264 ± 0.271
GlnAsn: 0.784 ± 0.193
GlnPro: 2.179 ± 0.312
GlnGln: 1.351 ± 0.466
GlnArg: 2.527 ± 0.272
GlnSer: 1.394 ± 0.26
GlnThr: 2.135 ± 0.318
GlnVal: 2.31 ± 0.297
GlnTrp: 0.479 ± 0.127
GlnTyr: 0.697 ± 0.178
GlnXaa: 0.0 ± 0.0
Arg
ArgAla: 8.367 ± 0.689
ArgCys: 0.915 ± 0.193
ArgAsp: 3.268 ± 0.378
ArgGlu: 3.748 ± 0.515
ArgPhe: 1.961 ± 0.263
ArgGly: 4.183 ± 0.405
ArgHis: 1.394 ± 0.247
ArgIle: 2.571 ± 0.304
ArgLys: 1.699 ± 0.299
ArgLeu: 6.406 ± 0.591
ArgMet: 1.612 ± 0.285
ArgAsn: 1.917 ± 0.284
ArgPro: 3.268 ± 0.515
ArgGln: 2.571 ± 0.369
ArgArg: 6.537 ± 0.691
ArgSer: 3.355 ± 0.365
ArgThr: 3.486 ± 0.42
ArgVal: 4.445 ± 0.457
ArgTrp: 1.699 ± 0.31
ArgTyr: 1.699 ± 0.372
ArgXaa: 0.0 ± 0.0
Ser
SerAla: 6.319 ± 0.488
SerCys: 0.784 ± 0.189
SerAsp: 2.702 ± 0.344
SerGlu: 3.181 ± 0.487
SerPhe: 1.351 ± 0.281
SerGly: 6.972 ± 0.692
SerHis: 0.828 ± 0.155
SerIle: 2.397 ± 0.331
SerLys: 1.961 ± 0.39
SerLeu: 4.314 ± 0.326
SerMet: 1.394 ± 0.248
SerAsn: 1.351 ± 0.283
SerPro: 2.876 ± 0.384
SerGln: 2.353 ± 0.283
SerArg: 3.66 ± 0.486
SerSer: 2.527 ± 0.325
SerThr: 3.181 ± 0.319
SerVal: 3.138 ± 0.348
SerTrp: 1.438 ± 0.305
SerTyr: 1.22 ± 0.235
SerXaa: 0.0 ± 0.0
Thr
ThrAla: 7.19 ± 0.544
ThrCys: 0.566 ± 0.127
ThrAsp: 3.312 ± 0.348
ThrGlu: 3.617 ± 0.449
ThrPhe: 1.612 ± 0.249
ThrGly: 5.36 ± 0.551
ThrHis: 1.089 ± 0.2
ThrIle: 3.312 ± 0.381
ThrLys: 1.787 ± 0.31
ThrLeu: 3.791 ± 0.379
ThrMet: 1.787 ± 0.273
ThrAsn: 1.612 ± 0.282
ThrPro: 5.098 ± 0.585
ThrGln: 1.394 ± 0.243
ThrArg: 3.225 ± 0.33
ThrSer: 4.009 ± 0.392
ThrThr: 3.181 ± 0.486
ThrVal: 4.096 ± 0.447
ThrTrp: 1.264 ± 0.249
ThrTyr: 1.482 ± 0.233
ThrXaa: 0.0 ± 0.0
Val
ValAla: 7.582 ± 0.478
ValCys: 0.436 ± 0.136
ValAsp: 3.704 ± 0.458
ValGlu: 4.009 ± 0.425
ValPhe: 1.438 ± 0.258
ValGly: 5.796 ± 0.526
ValHis: 1.699 ± 0.322
ValIle: 3.138 ± 0.325
ValLys: 2.397 ± 0.333
ValLeu: 6.406 ± 0.456
ValMet: 1.612 ± 0.33
ValAsn: 1.743 ± 0.288
ValPro: 4.227 ± 0.446
ValGln: 2.266 ± 0.359
ValArg: 5.098 ± 0.551
ValSer: 3.704 ± 0.411
ValThr: 4.619 ± 0.559
ValVal: 4.358 ± 0.531
ValTrp: 1.351 ± 0.25
ValTyr: 1.351 ± 0.232
ValXaa: 0.0 ± 0.0
Trp
TrpAla: 2.397 ± 0.321
TrpCys: 0.392 ± 0.135
TrpAsp: 1.046 ± 0.202
TrpGlu: 1.002 ± 0.196
TrpPhe: 0.741 ± 0.26
TrpGly: 1.394 ± 0.225
TrpHis: 0.784 ± 0.189
TrpIle: 0.741 ± 0.168
TrpLys: 0.349 ± 0.106
TrpLeu: 1.743 ± 0.333
TrpMet: 0.479 ± 0.142
TrpAsn: 0.523 ± 0.152
TrpPro: 1.22 ± 0.239
TrpGln: 0.741 ± 0.171
TrpArg: 1.612 ± 0.27
TrpSer: 1.743 ± 0.279
TrpThr: 1.002 ± 0.26
TrpVal: 1.482 ± 0.25
TrpTrp: 0.261 ± 0.095
TrpTyr: 0.436 ± 0.147
TrpXaa: 0.0 ± 0.0
Tyr
TyrAla: 2.527 ± 0.332
TyrCys: 0.174 ± 0.087
TyrAsp: 1.743 ± 0.29
TyrGlu: 1.177 ± 0.215
TyrPhe: 0.566 ± 0.141
TyrGly: 1.743 ± 0.292
TyrHis: 0.305 ± 0.117
TyrIle: 0.523 ± 0.172
TyrLys: 0.566 ± 0.152
TyrLeu: 1.917 ± 0.294
TyrMet: 0.349 ± 0.107
TyrAsn: 0.392 ± 0.132
TyrPro: 1.307 ± 0.251
TyrGln: 1.002 ± 0.22
TyrArg: 1.569 ± 0.328
TyrSer: 0.915 ± 0.183
TyrThr: 1.264 ± 0.251
TyrVal: 1.482 ± 0.225
TyrTrp: 0.392 ± 0.122
TyrTyr: 0.523 ± 0.189
TyrXaa: 0.0 ± 0.0
Xaa
XaaAla: 0.0 ± 0.0
XaaCys: 0.0 ± 0.0
XaaAsp: 0.0 ± 0.0
XaaGlu: 0.0 ± 0.0
XaaPhe: 0.0 ± 0.0
XaaGly: 0.0 ± 0.0
XaaHis: 0.0 ± 0.0
XaaIle: 0.0 ± 0.0
XaaLys: 0.0 ± 0.0
XaaLeu: 0.0 ± 0.0
XaaMet: 0.0 ± 0.0
XaaAsn: 0.0 ± 0.0
XaaPro: 0.0 ± 0.0
XaaGln: 0.0 ± 0.0
XaaArg: 0.0 ± 0.0
XaaSer: 0.0 ± 0.0
XaaThr: 0.0 ± 0.0
XaaVal: 0.0 ± 0.0
XaaTrp: 0.0 ± 0.0
XaaTyr: 0.0 ± 0.0
XaaXaa: 0.0 ± 0.0
Statistics based on 96 proteins (22949 amino acids)

Note: The error has been estimated by bootstrapping (100 replicates) at the protein level.
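One way to read "bootstrapping at the protein level" is sketched here: whole proteins are resampled with replacement, the frequency is recomputed for each replicate, and the spread of the replicates is reported. Taking the ± value to be the standard deviation across replicates is an assumption, as are the function names `pair_frequency` and `bootstrap_error` and the fixed seed; the database's own procedure may differ in detail.

```python
import random
import statistics
from collections import Counter


def pair_frequency(proteins, pair):
    """Per-mille frequency of one dipeptide over overlapping pairs."""
    counts = Counter()
    for seq in proteins:
        counts.update(seq[i:i + 2] for i in range(len(seq) - 1))
    total = sum(counts.values())
    return 1000.0 * counts[pair] / total if total else 0.0


def bootstrap_error(proteins, pair, replicates=100, seed=0):
    """Standard deviation of the frequency over resampled protein sets."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    estimates = []
    for _ in range(replicates):
        # Resample whole proteins with replacement, keeping the set size.
        sample = rng.choices(proteins, k=len(proteins))
        estimates.append(pair_frequency(sample, pair))
    return statistics.stdev(estimates)


# Toy sequences, not taken from the proteome.
toy = ["MAAGK", "AAKL", "MKLV", "GAAA"]
error = bootstrap_error(toy, "AA")
```

Resampling proteins rather than individual residues keeps each sequence intact, so the error reflects variation between proteins.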

The dipeptide statistics above (among other statistics for this proteome) can be downloaded from this CSV file
See this proteome in: uniprot_link
Proteome-pI is available under Creative Commons Attribution-NoDerivs license, for more details see here

Reference: Kozlowski LP. Proteome-pI 2.0: Proteome Isoelectric Point Database Update. Nucleic Acids Res. 2021, doi: 10.1093/nar/gkab944
Contact: Lukasz P. Kozlowski