Amino acid dipeptide frequency for Mycobacterium phage BuzzLyseyear

Apart from single amino acid frequencies, one can also calculate the so-called amino acid dipeptide frequency, i.e. how often each ordered pair of adjacent amino acids occurs.
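As a rough illustration of the idea, the sketch below counts overlapping adjacent pairs in a list of protein sequences and reports them in per mille. It is a minimal sketch, not the database's own code: the function name `dipeptide_per_mille`, the use of one-letter codes, and the normalization by the total number of dipeptides are all assumptions made for this example.

```python
from collections import Counter


def dipeptide_per_mille(sequences):
    """Return {dipeptide: frequency in per mille} for a list of sequences.

    Hypothetical helper: sequences are one-letter-code strings, and the
    frequencies are normalized by the total number of dipeptides counted
    (an assumption; the page does not state its normalization).
    """
    counts = Counter()
    for seq in sequences:
        # Overlapping adjacent pairs inside one protein; a pair never
        # spans the boundary between two proteins.
        for first, second in zip(seq, seq[1:]):
            counts[first + second] += 1
    total = sum(counts.values())
    if total == 0:
        return {}
    return {pair: 1000.0 * n / total for pair, n in counts.items()}


# Toy input, not taken from the phage proteome.
freqs = dipeptide_per_mille(["AAG", "AG"])
for pair, value in sorted(freqs.items()):
    print(f"{pair}: {value:.3f}")
```

Counting per protein rather than on one concatenated string keeps the last residue of one protein from being paired with the first residue of the next.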

There are 400 possible dipeptides (441 if Xaa is counted as an additional, 21st amino acid representing all non-standard or unknown amino acids). Thus, if all dipeptides were equally likely in proteins, each would be present with a frequency of 0.25% (2.5‰). As this is not the case in nature, for better visibility all dipeptides that occur more often than expected are marked in red, and those that are underrepresented are marked in blue in the table.
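The comparison against the uniform expectation can be sketched as follows. This is only an illustration: the function `classify` and the simple greater-than or less-than rule are assumptions, since the page does not say exactly how the colour threshold is applied. The three observed values are copied from the table on this page.

```python
ALPHABET_SIZE = 20  # standard amino acids, Xaa not counted
EXPECTED_PER_MILLE = 1000.0 / ALPHABET_SIZE**2  # 1/400 = 0.25% = 2.5 per mille


def classify(value, expected=EXPECTED_PER_MILLE):
    """Hypothetical colouring rule: compare a per-mille value with the expectation."""
    if value > expected:
        return "red"   # overrepresented
    if value < expected:
        return "blue"  # underrepresented
    return "neutral"


# Values in per mille, taken from the table on this page.
observed = {"AlaAla": 13.221, "CysCys": 0.105, "GluSer": 2.886}
labels = {pair: classify(value) for pair, value in observed.items()}
print(labels)
```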

All values are given in per mille (‰); multiply them by 10⁻³ to obtain fractions.

For more information, see the sequence space article on Wikipedia.

Dipeptide frequencies in ‰ (value ± error), grouped by first residue; within each group the second residue follows the order:
Ala Cys Asp Glu Phe Gly His Ile Lys Leu Met Asn Pro Gln Arg Ser Thr Val Trp Tyr Xaa

Ala
AlaAla: 13.221 ± 1.348
AlaCys: 0.944 ± 0.192
AlaAsp: 6.821 ± 0.587
AlaGlu: 7.345 ± 0.718
AlaPhe: 2.676 ± 0.335
AlaGly: 9.496 ± 1.454
AlaHis: 2.256 ± 0.431
AlaIle: 4.302 ± 0.566
AlaLys: 4.25 ± 0.471
AlaLeu: 8.08 ± 0.684
AlaMet: 2.413 ± 0.37
AlaAsn: 2.991 ± 0.495
AlaPro: 4.355 ± 0.71
AlaGln: 3.778 ± 0.462
AlaArg: 6.716 ± 0.594
AlaSer: 6.034 ± 0.651
AlaThr: 6.086 ± 0.571
AlaVal: 6.873 ± 0.604
AlaTrp: 2.361 ± 0.349
AlaTyr: 2.204 ± 0.323
AlaXaa: 0.0 ± 0.0

Cys
CysAla: 0.997 ± 0.315
CysCys: 0.105 ± 0.073
CysAsp: 0.997 ± 0.255
CysGlu: 1.154 ± 0.28
CysPhe: 0.21 ± 0.087
CysGly: 1.836 ± 0.386
CysHis: 0.21 ± 0.102
CysIle: 0.157 ± 0.09
CysLys: 0.367 ± 0.132
CysLeu: 0.682 ± 0.202
CysMet: 0.157 ± 0.105
CysAsn: 0.367 ± 0.179
CysPro: 1.259 ± 0.302
CysGln: 0.262 ± 0.153
CysArg: 0.682 ± 0.203
CysSer: 0.787 ± 0.227
CysThr: 0.787 ± 0.228
CysVal: 0.892 ± 0.212
CysTrp: 0.315 ± 0.129
CysTyr: 0.262 ± 0.113
CysXaa: 0.0 ± 0.0

Asp
AspAla: 6.558 ± 0.498
AspCys: 0.944 ± 0.237
AspAsp: 4.774 ± 0.566
AspGlu: 3.83 ± 0.545
AspPhe: 0.997 ± 0.178
AspGly: 6.191 ± 0.553
AspHis: 1.417 ± 0.266
AspIle: 2.204 ± 0.366
AspLys: 1.994 ± 0.309
AspLeu: 6.243 ± 0.571
AspMet: 1.154 ± 0.275
AspAsn: 1.626 ± 0.338
AspPro: 4.46 ± 0.638
AspGln: 2.886 ± 0.335
AspArg: 5.771 ± 0.657
AspSer: 2.991 ± 0.427
AspThr: 4.302 ± 0.469
AspVal: 4.774 ± 0.593
AspTrp: 1.574 ± 0.28
AspTyr: 1.941 ± 0.278
AspXaa: 0.0 ± 0.0

Glu
GluAla: 6.506 ± 0.692
GluCys: 1.364 ± 0.379
GluAsp: 3.148 ± 0.329
GluGlu: 2.833 ± 0.467
GluPhe: 2.204 ± 0.357
GluGly: 3.2 ± 0.411
GluHis: 1.102 ± 0.31
GluIle: 2.518 ± 0.383
GluLys: 1.784 ± 0.278
GluLeu: 5.981 ± 0.713
GluMet: 1.522 ± 0.27
GluAsn: 2.046 ± 0.311
GluPro: 2.781 ± 0.43
GluGln: 2.886 ± 0.407
GluArg: 4.984 ± 0.578
GluSer: 2.886 ± 0.455
GluThr: 4.092 ± 0.481
GluVal: 4.46 ± 0.637
GluTrp: 0.892 ± 0.201
GluTyr: 1.836 ± 0.327
GluXaa: 0.0 ± 0.0

Phe
PheAla: 2.728 ± 0.428
PheCys: 0.21 ± 0.094
PheAsp: 2.623 ± 0.485
PheGlu: 1.364 ± 0.281
PhePhe: 0.787 ± 0.261
PheGly: 3.095 ± 0.636
PheHis: 0.682 ± 0.23
PheIle: 0.997 ± 0.323
PheLys: 1.364 ± 0.24
PheLeu: 1.784 ± 0.251
PheMet: 0.682 ± 0.193
PheAsn: 1.207 ± 0.323
PhePro: 1.522 ± 0.302
PheGln: 1.049 ± 0.307
PheArg: 1.574 ± 0.289
PheSer: 1.522 ± 0.261
PheThr: 2.256 ± 0.306
PheVal: 1.836 ± 0.311
PheTrp: 0.787 ± 0.193
PheTyr: 0.839 ± 0.242
PheXaa: 0.0 ± 0.0

Gly
GlyAla: 8.972 ± 1.144
GlyCys: 1.102 ± 0.31
GlyAsp: 5.929 ± 0.493
GlyGlu: 4.04 ± 0.513
GlyPhe: 2.518 ± 0.441
GlyGly: 10.651 ± 2.423
GlyHis: 2.151 ± 0.342
GlyIle: 4.145 ± 0.639
GlyLys: 2.886 ± 0.405
GlyLeu: 5.561 ± 0.535
GlyMet: 2.361 ± 0.437
GlyAsn: 2.991 ± 0.382
GlyPro: 3.83 ± 0.476
GlyGln: 2.308 ± 0.536
GlyArg: 5.614 ± 0.758
GlySer: 5.666 ± 0.92
GlyThr: 6.086 ± 0.785
GlyVal: 6.034 ± 0.55
GlyTrp: 2.676 ± 0.425
GlyTyr: 2.518 ± 0.43
GlyXaa: 0.0 ± 0.0

His
HisAla: 1.889 ± 0.404
HisCys: 0.367 ± 0.183
HisAsp: 1.207 ± 0.229
HisGlu: 1.207 ± 0.257
HisPhe: 0.577 ± 0.169
HisGly: 1.836 ± 0.274
HisHis: 0.892 ± 0.233
HisIle: 1.522 ± 0.341
HisLys: 1.102 ± 0.292
HisLeu: 1.522 ± 0.278
HisMet: 0.42 ± 0.13
HisAsn: 1.154 ± 0.273
HisPro: 1.364 ± 0.232
HisGln: 0.787 ± 0.211
HisArg: 1.679 ± 0.309
HisSer: 0.944 ± 0.212
HisThr: 1.679 ± 0.392
HisVal: 1.731 ± 0.325
HisTrp: 0.367 ± 0.142
HisTyr: 0.997 ± 0.221
HisXaa: 0.0 ± 0.0

Ile
IleAla: 5.771 ± 0.545
IleCys: 0.787 ± 0.26
IleAsp: 3.253 ± 0.4
IleGlu: 3.41 ± 0.367
IlePhe: 0.839 ± 0.257
IleGly: 3.882 ± 0.449
IleHis: 1.312 ± 0.265
IleIle: 1.417 ± 0.254
IleLys: 1.154 ± 0.278
IleLeu: 2.361 ± 0.369
IleMet: 0.157 ± 0.089
IleAsn: 1.469 ± 0.278
IlePro: 2.571 ± 0.337
IleGln: 1.731 ± 0.253
IleArg: 2.518 ± 0.431
IleSer: 1.889 ± 0.411
IleThr: 3.41 ± 0.386
IleVal: 3.568 ± 0.352
IleTrp: 0.944 ± 0.202
IleTyr: 0.787 ± 0.183
IleXaa: 0.0 ± 0.0

Lys
LysAla: 3.882 ± 0.457
LysCys: 0.472 ± 0.153
LysAsp: 1.731 ± 0.236
LysGlu: 1.417 ± 0.239
LysPhe: 1.102 ± 0.211
LysGly: 2.204 ± 0.314
LysHis: 1.154 ± 0.254
LysIle: 1.049 ± 0.302
LysLys: 1.364 ± 0.369
LysLeu: 3.148 ± 0.477
LysMet: 0.787 ± 0.163
LysAsn: 0.944 ± 0.236
LysPro: 2.833 ± 0.391
LysGln: 1.312 ± 0.188
LysArg: 2.466 ± 0.382
LysSer: 2.256 ± 0.324
LysThr: 2.046 ± 0.348
LysVal: 2.308 ± 0.365
LysTrp: 1.154 ± 0.291
LysTyr: 0.892 ± 0.223
LysXaa: 0.0 ± 0.0

Leu
LeuAla: 7.503 ± 0.781
LeuCys: 0.63 ± 0.229
LeuAsp: 4.827 ± 0.451
LeuGlu: 3.882 ± 0.448
LeuPhe: 2.466 ± 0.356
LeuGly: 5.981 ± 0.533
LeuHis: 0.892 ± 0.216
LeuIle: 3.148 ± 0.404
LeuLys: 2.256 ± 0.35
LeuLeu: 4.669 ± 0.57
LeuMet: 1.574 ± 0.284
LeuAsn: 2.571 ± 0.37
LeuPro: 5.719 ± 0.643
LeuGln: 2.833 ± 0.454
LeuArg: 5.404 ± 0.577
LeuSer: 5.247 ± 0.469
LeuThr: 5.666 ± 0.564
LeuVal: 5.614 ± 0.548
LeuTrp: 1.049 ± 0.206
LeuTyr: 1.994 ± 0.338
LeuXaa: 0.0 ± 0.0

Met
MetAla: 1.679 ± 0.309
MetCys: 0.367 ± 0.198
MetAsp: 1.259 ± 0.251
MetGlu: 1.049 ± 0.233
MetPhe: 0.682 ± 0.187
MetGly: 1.836 ± 0.342
MetHis: 0.21 ± 0.092
MetIle: 0.997 ± 0.258
MetLys: 1.049 ± 0.233
MetLeu: 1.784 ± 0.243
MetMet: 0.682 ± 0.227
MetAsn: 0.892 ± 0.199
MetPro: 1.259 ± 0.252
MetGln: 0.42 ± 0.155
MetArg: 1.417 ± 0.28
MetSer: 2.623 ± 0.398
MetThr: 2.256 ± 0.333
MetVal: 1.312 ± 0.317
MetTrp: 0.315 ± 0.167
MetTyr: 0.315 ± 0.117
MetXaa: 0.0 ± 0.0

Asn
AsnAla: 3.148 ± 0.364
AsnCys: 0.315 ± 0.149
AsnAsp: 1.731 ± 0.244
AsnGlu: 1.889 ± 0.377
AsnPhe: 0.787 ± 0.281
AsnGly: 4.145 ± 0.566
AsnHis: 0.944 ± 0.215
AsnIle: 1.574 ± 0.477
AsnLys: 0.892 ± 0.195
AsnLeu: 2.466 ± 0.383
AsnMet: 0.682 ± 0.169
AsnAsn: 1.941 ± 0.337
AsnPro: 2.466 ± 0.341
AsnGln: 1.049 ± 0.279
AsnArg: 2.886 ± 0.466
AsnSer: 1.312 ± 0.258
AsnThr: 2.308 ± 0.328
AsnVal: 1.574 ± 0.264
AsnTrp: 0.525 ± 0.166
AsnTyr: 0.577 ± 0.151
AsnXaa: 0.0 ± 0.0

Pro
ProAla: 5.614 ± 0.517
ProCys: 0.472 ± 0.158
ProAsp: 4.407 ± 0.438
ProGlu: 4.565 ± 0.49
ProPhe: 1.941 ± 0.365
ProGly: 6.453 ± 0.795
ProHis: 1.784 ± 0.32
ProIle: 2.204 ± 0.281
ProLys: 1.784 ± 0.359
ProLeu: 4.04 ± 0.574
ProMet: 1.259 ± 0.261
ProAsn: 1.994 ± 0.293
ProPro: 4.145 ± 0.507
ProGln: 2.361 ± 0.339
ProArg: 3.148 ± 0.394
ProSer: 3.305 ± 0.396
ProThr: 3.095 ± 0.391
ProVal: 4.565 ± 0.496
ProTrp: 0.839 ± 0.238
ProTyr: 1.679 ± 0.31
ProXaa: 0.0 ± 0.0

Gln
GlnAla: 4.512 ± 0.591
GlnCys: 0.577 ± 0.206
GlnAsp: 1.469 ± 0.285
GlnGlu: 1.522 ± 0.292
GlnPhe: 0.892 ± 0.189
GlnGly: 2.571 ± 0.425
GlnHis: 0.839 ± 0.215
GlnIle: 1.469 ± 0.299
GlnLys: 1.312 ± 0.222
GlnLeu: 3.2 ± 0.48
GlnMet: 0.787 ± 0.214
GlnAsn: 0.944 ± 0.23
GlnPro: 2.676 ± 0.445
GlnGln: 1.889 ± 0.44
GlnArg: 3.043 ± 0.406
GlnSer: 2.204 ± 0.338
GlnThr: 1.626 ± 0.326
GlnVal: 2.361 ± 0.362
GlnTrp: 0.787 ± 0.173
GlnTyr: 1.102 ± 0.27
GlnXaa: 0.0 ± 0.0

Arg
ArgAla: 6.821 ± 0.654
ArgCys: 1.259 ± 0.366
ArgAsp: 4.984 ± 0.633
ArgGlu: 5.089 ± 0.633
ArgPhe: 2.256 ± 0.411
ArgGly: 4.669 ± 0.391
ArgHis: 1.364 ± 0.317
ArgIle: 4.145 ± 0.49
ArgLys: 2.204 ± 0.367
ArgLeu: 4.407 ± 0.549
ArgMet: 2.728 ± 0.37
ArgAsn: 1.941 ± 0.425
ArgPro: 3.673 ± 0.474
ArgGln: 1.941 ± 0.26
ArgArg: 5.089 ± 0.654
ArgSer: 4.092 ± 0.447
ArgThr: 3.41 ± 0.583
ArgVal: 4.827 ± 0.52
ArgTrp: 2.046 ± 0.408
ArgTyr: 2.204 ± 0.392
ArgXaa: 0.0 ± 0.0

Ser
SerAla: 5.561 ± 0.772
SerCys: 0.472 ± 0.194
SerAsp: 4.092 ± 0.451
SerGlu: 3.62 ± 0.518
SerPhe: 2.466 ± 0.451
SerGly: 6.086 ± 0.806
SerHis: 1.312 ± 0.244
SerIle: 2.728 ± 0.414
SerLys: 2.518 ± 0.369
SerLeu: 4.092 ± 0.52
SerMet: 1.154 ± 0.225
SerAsn: 1.784 ± 0.393
SerPro: 3.568 ± 0.371
SerGln: 1.679 ± 0.285
SerArg: 3.253 ± 0.382
SerSer: 4.145 ± 0.588
SerThr: 3.305 ± 0.481
SerVal: 4.407 ± 0.458
SerTrp: 1.312 ± 0.216
SerTyr: 1.417 ± 0.26
SerXaa: 0.0 ± 0.0

Thr
ThrAla: 5.981 ± 0.561
ThrCys: 0.472 ± 0.177
ThrAsp: 4.092 ± 0.606
ThrGlu: 3.882 ± 0.43
ThrPhe: 1.784 ± 0.345
ThrGly: 5.666 ± 0.594
ThrHis: 1.889 ± 0.336
ThrIle: 3.463 ± 0.478
ThrLys: 2.046 ± 0.338
ThrLeu: 4.355 ± 0.49
ThrMet: 0.892 ± 0.228
ThrAsn: 2.204 ± 0.388
ThrPro: 4.25 ± 0.472
ThrGln: 1.836 ± 0.369
ThrArg: 4.25 ± 0.478
ThrSer: 3.673 ± 0.38
ThrThr: 4.722 ± 0.697
ThrVal: 6.139 ± 0.683
ThrTrp: 1.207 ± 0.249
ThrTyr: 2.308 ± 0.361
ThrXaa: 0.0 ± 0.0

Val
ValAla: 7.24 ± 0.606
ValCys: 1.154 ± 0.236
ValAsp: 5.719 ± 0.518
ValGlu: 4.145 ± 0.433
ValPhe: 2.151 ± 0.396
ValGly: 5.561 ± 0.557
ValHis: 1.889 ± 0.326
ValIle: 3.095 ± 0.363
ValLys: 2.728 ± 0.332
ValLeu: 5.247 ± 0.508
ValMet: 1.574 ± 0.266
ValAsn: 2.886 ± 0.375
ValPro: 4.355 ± 0.444
ValGln: 2.886 ± 0.33
ValArg: 4.827 ± 0.646
ValSer: 4.774 ± 0.499
ValThr: 4.932 ± 0.5
ValVal: 6.243 ± 0.742
ValTrp: 1.364 ± 0.263
ValTyr: 1.417 ± 0.36
ValXaa: 0.0 ± 0.0

Trp
TrpAla: 2.046 ± 0.284
TrpCys: 0.262 ± 0.101
TrpAsp: 1.259 ± 0.256
TrpGlu: 0.997 ± 0.243
TrpPhe: 0.787 ± 0.204
TrpGly: 0.892 ± 0.251
TrpHis: 0.577 ± 0.201
TrpIle: 0.997 ± 0.261
TrpLys: 0.787 ± 0.17
TrpLeu: 2.046 ± 0.38
TrpMet: 1.049 ± 0.251
TrpAsn: 0.525 ± 0.187
TrpPro: 0.997 ± 0.246
TrpGln: 0.997 ± 0.242
TrpArg: 1.731 ± 0.273
TrpSer: 1.364 ± 0.269
TrpThr: 1.259 ± 0.248
TrpVal: 1.941 ± 0.407
TrpTrp: 0.892 ± 0.174
TrpTyr: 0.525 ± 0.144
TrpXaa: 0.0 ± 0.0

Tyr
TyrAla: 2.571 ± 0.326
TyrCys: 0.21 ± 0.108
TyrAsp: 2.308 ± 0.317
TyrGlu: 1.784 ± 0.292
TyrPhe: 0.787 ± 0.201
TyrGly: 1.784 ± 0.338
TyrHis: 0.367 ± 0.108
TyrIle: 1.102 ± 0.228
TyrLys: 0.682 ± 0.208
TyrLeu: 2.256 ± 0.422
TyrMet: 0.315 ± 0.142
TyrAsn: 0.787 ± 0.172
TyrPro: 1.574 ± 0.255
TyrGln: 0.892 ± 0.233
TyrArg: 2.099 ± 0.357
TyrSer: 1.259 ± 0.244
TyrThr: 1.784 ± 0.365
TyrVal: 2.728 ± 0.308
TyrTrp: 0.525 ± 0.164
TyrTyr: 0.577 ± 0.153
TyrXaa: 0.0 ± 0.0

Xaa
XaaAla: 0.0 ± 0.0
XaaCys: 0.0 ± 0.0
XaaAsp: 0.0 ± 0.0
XaaGlu: 0.0 ± 0.0
XaaPhe: 0.0 ± 0.0
XaaGly: 0.0 ± 0.0
XaaHis: 0.0 ± 0.0
XaaIle: 0.0 ± 0.0
XaaLys: 0.0 ± 0.0
XaaLeu: 0.0 ± 0.0
XaaMet: 0.0 ± 0.0
XaaAsn: 0.0 ± 0.0
XaaPro: 0.0 ± 0.0
XaaGln: 0.0 ± 0.0
XaaArg: 0.0 ± 0.0
XaaSer: 0.0 ± 0.0
XaaThr: 0.0 ± 0.0
XaaVal: 0.0 ± 0.0
XaaTrp: 0.0 ± 0.0
XaaTyr: 0.0 ± 0.0
XaaXaa: 0.0 ± 0.0

Statistics based on 110 proteins (19061 amino acids)

Note: The error has been estimated by bootstrapping (×100) at the protein level.
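One plausible reading of "bootstrapping at the protein level" is sketched below: whole proteins are resampled with replacement, the dipeptide frequency is recomputed for each resample, and the spread of those estimates is reported as the error. This is an assumption-laden illustration, not the database's actual procedure; the helper names `pair_per_mille` and `bootstrap_error`, the use of the sample standard deviation, and the fixed seed are all choices made for this example.

```python
import random
import statistics
from collections import Counter


def pair_per_mille(sequences, pair):
    """Per-mille frequency of one dipeptide over all overlapping pairs."""
    counts = Counter()
    for seq in sequences:
        counts.update(a + b for a, b in zip(seq, seq[1:]))
    total = sum(counts.values())
    return 1000.0 * counts[pair] / total if total else 0.0


def bootstrap_error(sequences, pair, n_resamples=100, seed=0):
    """Standard deviation of the frequency over protein-level resamples.

    Each resample draws len(sequences) whole proteins with replacement,
    so residues from the same protein always stay together.
    """
    rng = random.Random(seed)  # fixed seed only to make the sketch repeatable
    estimates = []
    for _ in range(n_resamples):
        sample = rng.choices(sequences, k=len(sequences))
        estimates.append(pair_per_mille(sample, pair))
    return statistics.stdev(estimates)


# Toy proteins in one-letter code, not taken from the phage proteome.
proteins = ["AAGA", "GGAA", "AGAG", "AAAA"]
error = bootstrap_error(proteins, "AA")
print(f"AA: {pair_per_mille(proteins, 'AA'):.3f} ± {error:.3f}")
```

Resampling proteins rather than individual residues keeps the dependence between neighbouring residues intact, which matters for a statistic built from adjacent pairs.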

The dipeptide statistics above (among other statistics for this proteome) can be downloaded from this CSV file
See this proteome in: uniprot_link
Proteome-pI is available under Creative Commons Attribution-NoDerivs license, for more details see here

Reference: Kozlowski LP. Proteome-pI 2.0: Proteome Isoelectric Point Database Update. Nucleic Acids Res. 2021, doi: 10.1093/nar/gkab944
Contact: Lukasz P. Kozlowski