Amino acid dipeptide frequency for Mycobacterium phage EleanorGeorge

Apart from single amino acid frequencies, one can also calculate the so-called amino acid dipeptide frequency, i.e. how often each ordered pair of adjacent amino acids occurs.

There are 400 possible dipeptides (441 if Xaa is counted as an additional 21st amino acid representing all non-standard or unknown amino acids). Thus, if dipeptides occurred randomly in proteins, with all amino acids equally likely, each dipeptide would be present with a frequency of 0.25% (2.5‰). This is not the case in nature, so for better visibility all overrepresented dipeptides are marked in red and all underrepresented ones in blue in the table.
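The calculation described above can be sketched in Python as follows. This is a minimal illustration, not the Proteome-pI implementation: the function names, the one-letter toy sequences, and the choice to normalise by the total number of counted pairs are all assumptions made for the example.

```python
from collections import Counter

STANDARD = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard amino acids (one-letter codes)
# Uniform expectation: 1 / (20 * 20) = 0.25% = 2.5 per mille
EXPECTED_PER_MILLE = 1000.0 / len(STANDARD) ** 2


def dipeptide_per_mille(proteins):
    """Return {dipeptide: frequency in per mille} for overlapping residue pairs.

    Pairs are counted inside each protein only, never across protein
    boundaries. Normalising by the total pair count is an assumption
    of this sketch.
    """
    counts = Counter()
    for seq in proteins:
        for first, second in zip(seq, seq[1:]):
            counts[first + second] += 1
    total = sum(counts.values())
    if total == 0:
        return {}
    return {pair: 1000.0 * n / total for pair, n in counts.items()}


def classify(freq_per_mille):
    """Label a frequency relative to the uniform 2.5 per mille expectation."""
    if freq_per_mille > EXPECTED_PER_MILLE:
        return "over"
    if freq_per_mille < EXPECTED_PER_MILLE:
        return "under"
    return "expected"


toy = ["MAAGAA", "MAGG"]  # hypothetical toy sequences, not phage proteins
freqs = dipeptide_per_mille(toy)
for pair in sorted(freqs):
    print(pair, round(freqs[pair], 1), classify(freqs[pair]))
```

With real data, `classify(13.47)` (the AlaAla value) falls in the "over" group and `classify(0.052)` (the CysCys value) in the "under" group, assuming the colouring uses the uniform expectation as its threshold.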

All values are given in per mille (‰), so they need to be multiplied by 10⁻³ to obtain fractions.
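As a short worked example of the unit conversion, the sketch below takes the AlaAla entry from the table and expresses it as a fraction, as a percentage, and relative to the uniform expectation of 0.25% (2.5‰) mentioned in the text. The helper name `per_mille_to_fraction` is an illustrative assumption.

```python
def per_mille_to_fraction(value):
    """Convert a per-mille value to a plain fraction (multiply by 10^-3)."""
    return value * 1e-3


ala_ala = 13.47  # AlaAla entry from the table, in per mille

fraction = per_mille_to_fraction(ala_ala)  # about 0.01347
percent = fraction * 100                   # about 1.347 %
fold_vs_uniform = ala_ala / 2.5            # ratio to the 2.5 per mille expectation

print(fraction, percent, fold_vs_uniform)
```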

For more information, see the sequence space article on Wikipedia.

Ala Cys Asp Glu Phe Gly His Ile Lys Leu Met Asn Pro Gln Arg Ser Thr Val Trp Tyr Xaa
Ala
AlaAla: 13.47 ± 1.798
AlaCys: 1.044 ± 0.272
AlaAsp: 7.466 ± 0.683
AlaGlu: 7.622 ± 0.766
AlaPhe: 3.08 ± 0.356
AlaGly: 9.345 ± 1.205
AlaHis: 2.245 ± 0.357
AlaIle: 4.438 ± 0.6
AlaLys: 3.707 ± 0.438
AlaLeu: 7.727 ± 0.675
AlaMet: 2.506 ± 0.364
AlaAsn: 2.871 ± 0.318
AlaPro: 5.482 ± 0.599
AlaGln: 3.498 ± 0.366
AlaArg: 8.406 ± 1.009
AlaSer: 5.221 ± 0.608
AlaThr: 6.526 ± 0.607
AlaVal: 6.735 ± 0.65
AlaTrp: 2.871 ± 0.366
AlaTyr: 1.827 ± 0.283
AlaXaa: 0.0 ± 0.0
Cys
CysAla: 1.305 ± 0.285
CysCys: 0.052 ± 0.051
CysAsp: 1.305 ± 0.299
CysGlu: 0.522 ± 0.159
CysPhe: 0.157 ± 0.077
CysGly: 1.514 ± 0.367
CysHis: 0.47 ± 0.156
CysIle: 0.313 ± 0.148
CysLys: 0.679 ± 0.225
CysLeu: 0.731 ± 0.237
CysMet: 0.261 ± 0.12
CysAsn: 0.365 ± 0.143
CysPro: 1.149 ± 0.26
CysGln: 0.365 ± 0.134
CysArg: 0.783 ± 0.26
CysSer: 0.679 ± 0.224
CysThr: 0.679 ± 0.201
CysVal: 0.627 ± 0.176
CysTrp: 0.313 ± 0.122
CysTyr: 0.261 ± 0.131
CysXaa: 0.0 ± 0.0
Asp
AspAla: 7.518 ± 0.518
AspCys: 0.783 ± 0.198
AspAsp: 5.116 ± 0.611
AspGlu: 3.446 ± 0.412
AspPhe: 1.827 ± 0.255
AspGly: 6.317 ± 0.552
AspHis: 1.618 ± 0.251
AspIle: 2.402 ± 0.375
AspLys: 1.775 ± 0.243
AspLeu: 5.534 ± 0.497
AspMet: 1.044 ± 0.202
AspAsn: 1.775 ± 0.344
AspPro: 5.221 ± 0.524
AspGln: 2.402 ± 0.318
AspArg: 5.169 ± 0.657
AspSer: 3.916 ± 0.516
AspThr: 3.655 ± 0.468
AspVal: 4.072 ± 0.636
AspTrp: 1.566 ± 0.277
AspTyr: 1.775 ± 0.29
AspXaa: 0.0 ± 0.0
Glu
GluAla: 5.482 ± 0.666
GluCys: 0.992 ± 0.244
GluAsp: 2.871 ± 0.372
GluGlu: 2.767 ± 0.439
GluPhe: 2.402 ± 0.346
GluGly: 3.55 ± 0.479
GluHis: 1.618 ± 0.334
GluIle: 2.141 ± 0.318
GluLys: 1.723 ± 0.295
GluLeu: 5.116 ± 0.575
GluMet: 1.514 ± 0.289
GluAsn: 2.349 ± 0.321
GluPro: 1.984 ± 0.373
GluGln: 3.289 ± 0.399
GluArg: 5.221 ± 0.617
GluSer: 3.394 ± 0.557
GluThr: 3.863 ± 0.574
GluVal: 4.281 ± 0.488
GluTrp: 1.357 ± 0.231
GluTyr: 2.036 ± 0.432
GluXaa: 0.0 ± 0.0
Phe
PheAla: 3.446 ± 0.455
PheCys: 0.365 ± 0.141
PheAsp: 2.245 ± 0.408
PheGlu: 1.514 ± 0.315
PhePhe: 0.94 ± 0.3
PheGly: 3.08 ± 0.672
PheHis: 0.365 ± 0.138
PheIle: 1.462 ± 0.348
PheLys: 0.94 ± 0.219
PheLeu: 2.088 ± 0.249
PheMet: 0.835 ± 0.231
PheAsn: 1.149 ± 0.302
PhePro: 1.775 ± 0.305
PheGln: 1.044 ± 0.291
PheArg: 1.566 ± 0.243
PheSer: 1.566 ± 0.249
PheThr: 2.141 ± 0.485
PheVal: 2.141 ± 0.301
PheTrp: 0.679 ± 0.149
PheTyr: 0.94 ± 0.274
PheXaa: 0.0 ± 0.0
Gly
GlyAla: 8.823 ± 1.133
GlyCys: 1.201 ± 0.281
GlyAsp: 6.056 ± 0.564
GlyGlu: 4.281 ± 0.484
GlyPhe: 2.715 ± 0.457
GlyGly: 10.181 ± 2.24
GlyHis: 2.088 ± 0.342
GlyIle: 4.386 ± 0.565
GlyLys: 2.61 ± 0.37
GlyLeu: 6.213 ± 0.577
GlyMet: 2.036 ± 0.436
GlyAsn: 2.976 ± 0.408
GlyPro: 4.594 ± 0.577
GlyGln: 2.193 ± 0.576
GlyArg: 5.012 ± 0.579
GlySer: 5.691 ± 0.749
GlyThr: 6.735 ± 0.673
GlyVal: 5.9 ± 0.621
GlyTrp: 2.402 ± 0.409
GlyTyr: 1.827 ± 0.331
GlyXaa: 0.0 ± 0.0
His
HisAla: 2.036 ± 0.358
HisCys: 0.418 ± 0.19
HisAsp: 1.096 ± 0.235
HisGlu: 1.514 ± 0.295
HisPhe: 0.365 ± 0.121
HisGly: 1.723 ± 0.27
HisHis: 0.888 ± 0.262
HisIle: 1.357 ± 0.264
HisLys: 0.888 ± 0.231
HisLeu: 1.514 ± 0.348
HisMet: 0.574 ± 0.183
HisAsn: 0.679 ± 0.184
HisPro: 1.775 ± 0.321
HisGln: 1.044 ± 0.227
HisArg: 1.827 ± 0.323
HisSer: 1.044 ± 0.199
HisThr: 1.201 ± 0.274
HisVal: 1.566 ± 0.289
HisTrp: 0.47 ± 0.154
HisTyr: 0.835 ± 0.224
HisXaa: 0.0 ± 0.0
Ile
IleAla: 5.743 ± 0.635
IleCys: 0.679 ± 0.261
IleAsp: 3.968 ± 0.457
IleGlu: 3.498 ± 0.339
IlePhe: 0.888 ± 0.231
IleGly: 4.02 ± 0.472
IleHis: 1.357 ± 0.297
IleIle: 1.41 ± 0.286
IleLys: 0.94 ± 0.224
IleLeu: 1.932 ± 0.384
IleMet: 0.418 ± 0.163
IleAsn: 1.723 ± 0.268
IlePro: 2.976 ± 0.323
IleGln: 1.41 ± 0.234
IleArg: 2.976 ± 0.434
IleSer: 2.036 ± 0.355
IleThr: 3.446 ± 0.408
IleVal: 2.819 ± 0.306
IleTrp: 0.679 ± 0.181
IleTyr: 0.783 ± 0.193
IleXaa: 0.0 ± 0.0
Lys
LysAla: 3.968 ± 0.442
LysCys: 0.365 ± 0.127
LysAsp: 1.671 ± 0.301
LysGlu: 1.41 ± 0.301
LysPhe: 1.201 ± 0.221
LysGly: 2.506 ± 0.355
LysHis: 0.94 ± 0.251
LysIle: 0.94 ± 0.225
LysLys: 1.514 ± 0.437
LysLeu: 2.193 ± 0.42
LysMet: 0.835 ± 0.184
LysAsn: 0.94 ± 0.205
LysPro: 2.245 ± 0.342
LysGln: 1.775 ± 0.322
LysArg: 2.193 ± 0.357
LysSer: 1.984 ± 0.292
LysThr: 2.245 ± 0.391
LysVal: 2.454 ± 0.396
LysTrp: 0.835 ± 0.251
LysTyr: 0.992 ± 0.215
LysXaa: 0.0 ± 0.0
Leu
LeuAla: 7.466 ± 0.737
LeuCys: 0.783 ± 0.229
LeuAsp: 4.803 ± 0.592
LeuGlu: 4.02 ± 0.445
LeuPhe: 2.454 ± 0.313
LeuGly: 5.012 ± 0.639
LeuHis: 1.044 ± 0.23
LeuIle: 3.289 ± 0.495
LeuLys: 2.454 ± 0.368
LeuLeu: 4.855 ± 0.639
LeuMet: 1.618 ± 0.291
LeuAsn: 2.767 ± 0.373
LeuPro: 5.116 ± 0.688
LeuGln: 2.767 ± 0.388
LeuArg: 5.169 ± 0.697
LeuSer: 4.855 ± 0.505
LeuThr: 5.377 ± 0.511
LeuVal: 4.751 ± 0.62
LeuTrp: 1.149 ± 0.27
LeuTyr: 1.984 ± 0.385
LeuXaa: 0.0 ± 0.0
Met
MetAla: 2.193 ± 0.391
MetCys: 0.104 ± 0.079
MetAsp: 1.149 ± 0.269
MetGlu: 0.94 ± 0.218
MetPhe: 0.783 ± 0.244
MetGly: 1.514 ± 0.261
MetHis: 0.261 ± 0.117
MetIle: 0.94 ± 0.249
MetLys: 0.627 ± 0.221
MetLeu: 1.775 ± 0.279
MetMet: 0.522 ± 0.218
MetAsn: 0.731 ± 0.168
MetPro: 1.096 ± 0.213
MetGln: 0.418 ± 0.132
MetArg: 1.305 ± 0.241
MetSer: 2.767 ± 0.423
MetThr: 2.245 ± 0.31
MetVal: 1.566 ± 0.303
MetTrp: 0.313 ± 0.118
MetTyr: 0.365 ± 0.158
MetXaa: 0.0 ± 0.0
Asn
AsnAla: 3.237 ± 0.419
AsnCys: 0.418 ± 0.136
AsnAsp: 1.775 ± 0.278
AsnGlu: 1.566 ± 0.348
AsnPhe: 0.835 ± 0.272
AsnGly: 3.968 ± 0.46
AsnHis: 1.096 ± 0.24
AsnIle: 1.618 ± 0.42
AsnLys: 1.201 ± 0.253
AsnLeu: 1.827 ± 0.332
AsnMet: 0.731 ± 0.163
AsnAsn: 1.41 ± 0.306
AsnPro: 2.767 ± 0.356
AsnGln: 1.357 ± 0.289
AsnArg: 2.036 ± 0.381
AsnSer: 1.671 ± 0.278
AsnThr: 2.088 ± 0.298
AsnVal: 2.141 ± 0.356
AsnTrp: 0.835 ± 0.185
AsnTyr: 0.835 ± 0.174
AsnXaa: 0.0 ± 0.0
Pro
ProAla: 6.108 ± 0.659
ProCys: 0.574 ± 0.175
ProAsp: 4.124 ± 0.562
ProGlu: 4.281 ± 0.442
ProPhe: 1.88 ± 0.334
ProGly: 6.317 ± 0.686
ProHis: 1.671 ± 0.337
ProIle: 2.036 ± 0.261
ProLys: 2.088 ± 0.319
ProLeu: 4.542 ± 0.536
ProMet: 1.41 ± 0.297
ProAsn: 2.141 ± 0.39
ProPro: 3.602 ± 0.571
ProGln: 2.402 ± 0.393
ProArg: 3.55 ± 0.481
ProSer: 2.871 ± 0.418
ProThr: 3.341 ± 0.448
ProVal: 5.377 ± 0.643
ProTrp: 1.201 ± 0.291
ProTyr: 1.41 ± 0.266
ProXaa: 0.0 ± 0.0
Gln
GlnAla: 4.386 ± 0.419
GlnCys: 0.209 ± 0.139
GlnAsp: 1.723 ± 0.263
GlnGlu: 1.775 ± 0.279
GlnPhe: 1.149 ± 0.258
GlnGly: 2.506 ± 0.462
GlnHis: 0.888 ± 0.253
GlnIle: 1.88 ± 0.28
GlnLys: 1.357 ± 0.268
GlnLeu: 2.976 ± 0.422
GlnMet: 0.679 ± 0.177
GlnAsn: 1.096 ± 0.265
GlnPro: 2.924 ± 0.461
GlnGln: 1.514 ± 0.312
GlnArg: 2.088 ± 0.299
GlnSer: 2.454 ± 0.385
GlnThr: 1.671 ± 0.355
GlnVal: 2.819 ± 0.427
GlnTrp: 0.888 ± 0.22
GlnTyr: 0.783 ± 0.21
GlnXaa: 0.0 ± 0.0
Arg
ArgAla: 6.161 ± 0.598
ArgCys: 1.514 ± 0.347
ArgAsp: 4.594 ± 0.566
ArgGlu: 4.386 ± 0.615
ArgPhe: 2.193 ± 0.382
ArgGly: 4.177 ± 0.452
ArgHis: 1.357 ± 0.307
ArgIle: 3.968 ± 0.493
ArgLys: 2.819 ± 0.409
ArgLeu: 5.325 ± 0.533
ArgMet: 2.036 ± 0.345
ArgAsn: 2.506 ± 0.429
ArgPro: 3.655 ± 0.448
ArgGln: 2.193 ± 0.398
ArgArg: 5.743 ± 0.868
ArgSer: 4.386 ± 0.629
ArgThr: 3.237 ± 0.454
ArgVal: 5.012 ± 0.602
ArgTrp: 2.245 ± 0.341
ArgTyr: 1.984 ± 0.327
ArgXaa: 0.0 ± 0.0
Ser
SerAla: 6.787 ± 1.315
SerCys: 0.627 ± 0.165
SerAsp: 3.916 ± 0.471
SerGlu: 2.871 ± 0.459
SerPhe: 1.984 ± 0.471
SerGly: 6.735 ± 0.656
SerHis: 1.044 ± 0.201
SerIle: 2.715 ± 0.37
SerLys: 1.984 ± 0.357
SerLeu: 3.498 ± 0.453
SerMet: 1.514 ± 0.244
SerAsn: 2.036 ± 0.362
SerPro: 3.498 ± 0.364
SerGln: 1.723 ± 0.286
SerArg: 4.02 ± 0.446
SerSer: 3.863 ± 0.563
SerThr: 3.446 ± 0.42
SerVal: 4.751 ± 0.624
SerTrp: 1.462 ± 0.289
SerTyr: 1.305 ± 0.218
SerXaa: 0.0 ± 0.0
Thr
ThrAla: 6.735 ± 0.712
ThrCys: 0.574 ± 0.202
ThrAsp: 4.177 ± 0.535
ThrGlu: 3.55 ± 0.398
ThrPhe: 1.932 ± 0.316
ThrGly: 5.795 ± 0.629
ThrHis: 1.566 ± 0.297
ThrIle: 3.394 ± 0.397
ThrLys: 1.723 ± 0.32
ThrLeu: 4.699 ± 0.407
ThrMet: 1.253 ± 0.265
ThrAsn: 2.506 ± 0.342
ThrPro: 4.647 ± 0.503
ThrGln: 1.984 ± 0.289
ThrArg: 3.916 ± 0.405
ThrSer: 3.707 ± 0.435
ThrThr: 4.699 ± 0.594
ThrVal: 5.743 ± 0.613
ThrTrp: 1.149 ± 0.26
ThrTyr: 1.566 ± 0.253
ThrXaa: 0.0 ± 0.0
Val
ValAla: 7.257 ± 0.587
ValCys: 1.149 ± 0.222
ValAsp: 5.482 ± 0.53
ValGlu: 5.221 ± 0.576
ValPhe: 2.141 ± 0.359
ValGly: 6.056 ± 0.706
ValHis: 1.357 ± 0.265
ValIle: 2.819 ± 0.392
ValLys: 2.558 ± 0.372
ValLeu: 5.116 ± 0.552
ValMet: 1.149 ± 0.169
ValAsn: 2.141 ± 0.33
ValPro: 3.863 ± 0.391
ValGln: 2.61 ± 0.315
ValArg: 4.699 ± 0.594
ValSer: 5.377 ± 0.555
ValThr: 5.169 ± 0.479
ValVal: 6.422 ± 0.739
ValTrp: 1.618 ± 0.287
ValTyr: 1.201 ± 0.234
ValXaa: 0.0 ± 0.0
Trp
TrpAla: 1.88 ± 0.326
TrpCys: 0.313 ± 0.121
TrpAsp: 1.566 ± 0.303
TrpGlu: 1.096 ± 0.304
TrpPhe: 0.627 ± 0.187
TrpGly: 1.201 ± 0.264
TrpHis: 0.522 ± 0.169
TrpIle: 1.149 ± 0.202
TrpLys: 0.835 ± 0.183
TrpLeu: 2.193 ± 0.361
TrpMet: 0.522 ± 0.17
TrpAsn: 0.574 ± 0.196
TrpPro: 1.201 ± 0.288
TrpGln: 1.149 ± 0.272
TrpArg: 2.088 ± 0.413
TrpSer: 1.357 ± 0.274
TrpThr: 2.036 ± 0.348
TrpVal: 1.827 ± 0.452
TrpTrp: 0.888 ± 0.21
TrpTyr: 0.418 ± 0.146
TrpXaa: 0.0 ± 0.0
Tyr
TyrAla: 2.141 ± 0.374
TyrCys: 0.365 ± 0.134
TyrAsp: 1.827 ± 0.376
TyrGlu: 1.671 ± 0.288
TyrPhe: 0.731 ± 0.172
TyrGly: 2.402 ± 0.367
TyrHis: 0.418 ± 0.135
TyrIle: 1.044 ± 0.206
TyrLys: 0.783 ± 0.216
TyrLeu: 1.88 ± 0.266
TyrMet: 0.104 ± 0.067
TyrAsn: 0.679 ± 0.186
TyrPro: 1.357 ± 0.243
TyrGln: 0.627 ± 0.171
TyrArg: 1.723 ± 0.372
TyrSer: 0.94 ± 0.23
TyrThr: 1.514 ± 0.345
TyrVal: 2.402 ± 0.296
TyrTrp: 0.522 ± 0.157
TyrTyr: 0.574 ± 0.18
TyrXaa: 0.0 ± 0.0
Xaa
XaaAla: 0.0 ± 0.0
XaaCys: 0.0 ± 0.0
XaaAsp: 0.0 ± 0.0
XaaGlu: 0.0 ± 0.0
XaaPhe: 0.0 ± 0.0
XaaGly: 0.0 ± 0.0
XaaHis: 0.0 ± 0.0
XaaIle: 0.0 ± 0.0
XaaLys: 0.0 ± 0.0
XaaLeu: 0.0 ± 0.0
XaaMet: 0.0 ± 0.0
XaaAsn: 0.0 ± 0.0
XaaPro: 0.0 ± 0.0
XaaGln: 0.0 ± 0.0
XaaArg: 0.0 ± 0.0
XaaSer: 0.0 ± 0.0
XaaThr: 0.0 ± 0.0
XaaVal: 0.0 ± 0.0
XaaTrp: 0.0 ± 0.0
XaaTyr: 0.0 ± 0.0
XaaXaa: 0.0 ± 0.0
Statistics based on 108 proteins (19155 amino acids)

Note: The error was estimated by bootstrapping (×100) at the protein level
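To illustrate what bootstrapping at the protein level can look like, the sketch below resamples whole proteins with replacement 100 times and reports the standard deviation of the resulting per-mille estimates. The function names, the fixed seed, the toy sequences, and the use of the sample standard deviation as the error are assumptions of this example; the exact procedure used by Proteome-pI is not described here.

```python
import random
from collections import Counter
from statistics import stdev


def pair_per_mille(proteins, pair):
    """Per-mille frequency of one dipeptide over all overlapping pairs."""
    counts = Counter()
    for seq in proteins:
        counts.update(a + b for a, b in zip(seq, seq[1:]))
    total = sum(counts.values())
    return 1000.0 * counts[pair] / total if total else 0.0


def bootstrap_error(proteins, pair, n_resamples=100, seed=0):
    """Estimate the error of a dipeptide frequency by resampling proteins.

    Each replicate draws len(proteins) whole proteins with replacement,
    so the resampling unit is the protein, not the individual residue.
    """
    rng = random.Random(seed)  # fixed seed only to make the sketch reproducible
    estimates = []
    for _ in range(n_resamples):
        sample = rng.choices(proteins, k=len(proteins))
        estimates.append(pair_per_mille(sample, pair))
    return stdev(estimates)


toy = ["MAAGAA", "MAGG", "MKKAA"]  # hypothetical toy sequences
print(pair_per_mille(toy, "AA"), "±", bootstrap_error(toy, "AA"))
```

Resampling whole proteins keeps residues from the same protein together, so the error estimate reflects variation between proteins rather than treating every dipeptide as an independent observation.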

The dipeptide statistics above (along with other statistics for this proteome) can be downloaded from this CSV file
See this proteome in: uniprot_link
Proteome-pI is available under the Creative Commons Attribution-NoDerivs license; for more details see here

Reference: Kozlowski LP. Proteome-pI 2.0: Proteome Isoelectric Point Database Update. Nucleic Acids Res. 2021, doi: 10.1093/nar/gkab944
Contact: Lukasz P. Kozlowski