Amino acid dipeptide frequency for Mycobacterium phage Avocado

Apart from single amino acid frequencies, one can also calculate the so-called amino acid dipeptide frequency.

There are 400 possibilities (441 if we consider Xaa as an additional 21st amino acid representing all non-standard or unknown amino acids). Thus, if dipeptides occurred randomly in proteins, each amino acid dipeptide would be present at a frequency of 0.25% (2.5‰). As this is not the case in nature, for better visibility all dipeptides that are more frequent than expected are marked in red, and those that are underrepresented are marked in blue in the table.
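
The counting described above can be sketched in Python. This is a minimal illustration, not the database's own code: the function names, the toy sequences, the use of one-letter residue codes, and the choice to normalize by the total number of dipeptides counted are all assumptions (the database may normalize differently).

```python
from collections import Counter

# Expected frequency if all 400 dipeptides were equally likely, in per mille.
EXPECTED_PER_MILLE = 1000.0 / 400  # 2.5

def dipeptide_per_mille(sequences):
    """Return {dipeptide: frequency in per mille} over all sequences.

    Assumption: sequences use one-letter amino acid codes, and dipeptides
    are overlapping adjacent pairs that never span two proteins.
    """
    counts = Counter()
    for seq in sequences:
        counts.update(seq[i:i + 2] for i in range(len(seq) - 1))
    total = sum(counts.values())
    if total == 0:
        return {}
    return {pair: 1000.0 * n / total for pair, n in counts.items()}

def classify(freq, expected=EXPECTED_PER_MILLE):
    """Label a frequency relative to the uniform expectation."""
    if freq > expected:
        return "over"
    if freq < expected:
        return "under"
    return "expected"

# Hypothetical toy proteome, for illustration only.
toy = ["AAGA", "GAC"]
freqs = dipeptide_per_mille(toy)
```

With values from the table, `classify(19.259)` (AlaAla) gives "over" and `classify(0.808)` (AlaCys) gives "under", which corresponds to the red and blue marking described above.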

All values are given in per mille (‰), so they must be multiplied by 10⁻³ to obtain fractions.
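
As a small worked example of the unit conversion, the snippet below converts the AlaAla value from the table into a fraction and a percentage, and compares it with the uniform expectation of 1/400. The helper name is hypothetical.

```python
def per_mille_to_fraction(value):
    """Convert a per mille value to a plain fraction."""
    return value * 1e-3

ala_ala = 19.259                           # AlaAla, per mille (from the table)
fraction = per_mille_to_fraction(ala_ala)  # about 0.019259
percent = fraction * 100                   # about 1.9259 %
fold = fraction / (1 / 400)                # ratio to the uniform expectation
```

Here `fold` comes out at roughly 7.7, i.e. AlaAla is observed several times more often than the 0.25% expected under a uniform model.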

For more information, see the sequence space article on Wikipedia.

Ala Cys Asp Glu Phe Gly His Ile Lys Leu Met Asn Pro Gln Arg Ser Thr Val Trp Tyr Xaa
Ala
AlaAla: 19.259 ± 1.747
AlaCys: 0.808 ± 0.238
AlaAsp: 8.822 ± 0.836
AlaGlu: 7.677 ± 1.044
AlaPhe: 4.579 ± 0.6
AlaGly: 15.354 ± 1.891
AlaHis: 2.02 ± 0.505
AlaIle: 7.34 ± 0.712
AlaLys: 4.714 ± 0.517
AlaLeu: 10.303 ± 0.881
AlaMet: 2.896 ± 0.433
AlaAsn: 3.906 ± 0.54
AlaPro: 7.071 ± 0.747
AlaGln: 5.859 ± 0.828
AlaArg: 8.215 ± 0.998
AlaSer: 4.04 ± 0.688
AlaThr: 7.475 ± 0.566
AlaVal: 8.148 ± 0.873
AlaTrp: 1.953 ± 0.285
AlaTyr: 1.953 ± 0.357
AlaXaa: 0.0 ± 0.0
Cys
CysAla: 0.741 ± 0.273
CysCys: 0.135 ± 0.099
CysAsp: 0.404 ± 0.162
CysGlu: 0.673 ± 0.252
CysPhe: 0.337 ± 0.157
CysGly: 1.347 ± 0.375
CysHis: 0.135 ± 0.098
CysIle: 0.606 ± 0.18
CysLys: 0.067 ± 0.065
CysLeu: 0.404 ± 0.165
CysMet: 0.0 ± 0.0
CysAsn: 0.337 ± 0.171
CysPro: 0.943 ± 0.321
CysGln: 0.404 ± 0.175
CysArg: 1.01 ± 0.268
CysSer: 0.471 ± 0.18
CysThr: 0.606 ± 0.255
CysVal: 0.337 ± 0.194
CysTrp: 0.404 ± 0.195
CysTyr: 0.269 ± 0.16
CysXaa: 0.0 ± 0.0
Asp
AspAla: 9.966 ± 0.788
AspCys: 0.943 ± 0.371
AspAsp: 7.407 ± 0.95
AspGlu: 5.657 ± 0.643
AspPhe: 1.953 ± 0.46
AspGly: 7.071 ± 0.765
AspHis: 1.414 ± 0.285
AspIle: 1.145 ± 0.285
AspLys: 1.818 ± 0.296
AspLeu: 4.444 ± 0.431
AspMet: 1.347 ± 0.26
AspAsn: 1.953 ± 0.393
AspPro: 4.579 ± 0.593
AspGln: 3.704 ± 0.51
AspArg: 4.512 ± 0.593
AspSer: 2.357 ± 0.415
AspThr: 4.31 ± 0.449
AspVal: 6.667 ± 0.666
AspTrp: 1.414 ± 0.303
AspTyr: 1.953 ± 0.412
AspXaa: 0.0 ± 0.0
Glu
GluAla: 6.33 ± 0.739
GluCys: 0.404 ± 0.154
GluAsp: 4.175 ± 0.632
GluGlu: 2.828 ± 0.371
GluPhe: 2.02 ± 0.363
GluGly: 3.367 ± 0.496
GluHis: 1.684 ± 0.319
GluIle: 1.549 ± 0.338
GluLys: 1.886 ± 0.346
GluLeu: 4.242 ± 0.582
GluMet: 1.01 ± 0.252
GluAsn: 2.02 ± 0.431
GluPro: 4.983 ± 0.795
GluGln: 2.626 ± 0.488
GluArg: 4.579 ± 0.598
GluSer: 1.684 ± 0.338
GluThr: 2.29 ± 0.369
GluVal: 3.771 ± 0.531
GluTrp: 1.347 ± 0.329
GluTyr: 1.212 ± 0.277
GluXaa: 0.0 ± 0.0
Phe
PheAla: 3.838 ± 0.511
PheCys: 0.337 ± 0.152
PheAsp: 3.704 ± 0.588
PheGlu: 2.155 ± 0.349
PhePhe: 0.606 ± 0.175
PheGly: 3.838 ± 0.548
PheHis: 0.539 ± 0.206
PheIle: 1.212 ± 0.338
PheLys: 0.808 ± 0.23
PheLeu: 2.29 ± 0.527
PheMet: 0.135 ± 0.106
PheAsn: 0.943 ± 0.257
PhePro: 1.481 ± 0.353
PheGln: 1.01 ± 0.248
PheArg: 1.481 ± 0.329
PheSer: 1.145 ± 0.259
PheThr: 2.02 ± 0.393
PheVal: 2.626 ± 0.389
PheTrp: 0.404 ± 0.166
PheTyr: 0.943 ± 0.282
PheXaa: 0.0 ± 0.0
Gly
GlyAla: 11.313 ± 1.785
GlyCys: 0.875 ± 0.274
GlyAsp: 7.138 ± 0.789
GlyGlu: 4.512 ± 0.601
GlyPhe: 3.3 ± 0.474
GlyGly: 11.717 ± 3.053
GlyHis: 1.549 ± 0.406
GlyIle: 3.367 ± 0.603
GlyLys: 3.569 ± 0.6
GlyLeu: 7.138 ± 0.707
GlyMet: 2.222 ± 0.404
GlyAsn: 3.165 ± 0.509
GlyPro: 4.108 ± 0.603
GlyGln: 2.828 ± 0.462
GlyArg: 6.801 ± 0.757
GlySer: 4.444 ± 0.79
GlyThr: 5.589 ± 0.504
GlyVal: 6.263 ± 0.624
GlyTrp: 2.155 ± 0.401
GlyTyr: 2.357 ± 0.421
GlyXaa: 0.0 ± 0.0
His
HisAla: 2.02 ± 0.521
HisCys: 0.202 ± 0.126
HisAsp: 1.145 ± 0.345
HisGlu: 0.606 ± 0.23
HisPhe: 0.337 ± 0.147
HisGly: 2.222 ± 0.383
HisHis: 0.269 ± 0.154
HisIle: 0.741 ± 0.298
HisLys: 0.404 ± 0.148
HisLeu: 1.279 ± 0.296
HisMet: 0.606 ± 0.173
HisAsn: 0.404 ± 0.174
HisPro: 1.549 ± 0.348
HisGln: 1.01 ± 0.325
HisArg: 2.29 ± 0.449
HisSer: 0.673 ± 0.193
HisThr: 0.606 ± 0.234
HisVal: 1.414 ± 0.303
HisTrp: 0.202 ± 0.115
HisTyr: 0.135 ± 0.076
HisXaa: 0.0 ± 0.0
Ile
IleAla: 8.081 ± 0.715
IleCys: 0.337 ± 0.164
IleAsp: 3.367 ± 0.4
IleGlu: 1.616 ± 0.276
IlePhe: 0.673 ± 0.218
IleGly: 4.175 ± 0.804
IleHis: 0.808 ± 0.187
IleIle: 1.077 ± 0.267
IleLys: 1.481 ± 0.459
IleLeu: 2.357 ± 0.364
IleMet: 0.741 ± 0.277
IleAsn: 1.145 ± 0.245
IlePro: 2.761 ± 0.584
IleGln: 1.01 ± 0.202
IleArg: 3.232 ± 0.513
IleSer: 1.684 ± 0.322
IleThr: 2.02 ± 0.369
IleVal: 2.896 ± 0.36
IleTrp: 1.01 ± 0.259
IleTyr: 0.606 ± 0.223
IleXaa: 0.0 ± 0.0
Lys
LysAla: 4.714 ± 0.556
LysCys: 0.539 ± 0.229
LysAsp: 1.751 ± 0.397
LysGlu: 1.077 ± 0.209
LysPhe: 1.616 ± 0.299
LysGly: 2.02 ± 0.429
LysHis: 0.471 ± 0.182
LysIle: 2.155 ± 0.432
LysLys: 1.212 ± 0.275
LysLeu: 3.434 ± 0.522
LysMet: 0.875 ± 0.249
LysAsn: 1.347 ± 0.25
LysPro: 2.088 ± 0.412
LysGln: 1.077 ± 0.292
LysArg: 2.896 ± 0.549
LysSer: 1.818 ± 0.276
LysThr: 2.02 ± 0.416
LysVal: 2.088 ± 0.333
LysTrp: 0.404 ± 0.157
LysTyr: 0.673 ± 0.285
LysXaa: 0.0 ± 0.0
Leu
LeuAla: 9.697 ± 1.012
LeuCys: 0.471 ± 0.189
LeuAsp: 5.859 ± 0.715
LeuGlu: 3.569 ± 0.406
LeuPhe: 2.424 ± 0.364
LeuGly: 6.061 ± 0.602
LeuHis: 1.347 ± 0.278
LeuIle: 3.367 ± 0.551
LeuLys: 2.761 ± 0.478
LeuLeu: 5.589 ± 0.65
LeuMet: 1.616 ± 0.375
LeuAsn: 2.222 ± 0.56
LeuPro: 5.657 ± 0.746
LeuGln: 2.088 ± 0.323
LeuArg: 4.714 ± 0.528
LeuSer: 4.646 ± 0.517
LeuThr: 4.579 ± 0.463
LeuVal: 4.646 ± 0.546
LeuTrp: 1.886 ± 0.393
LeuTyr: 2.088 ± 0.389
LeuXaa: 0.0 ± 0.0
Met
MetAla: 2.963 ± 0.374
MetCys: 0.067 ± 0.063
MetAsp: 1.212 ± 0.23
MetGlu: 1.01 ± 0.209
MetPhe: 1.077 ± 0.253
MetGly: 1.684 ± 0.296
MetHis: 0.337 ± 0.123
MetIle: 0.943 ± 0.292
MetLys: 1.145 ± 0.251
MetLeu: 1.886 ± 0.319
MetMet: 0.135 ± 0.108
MetAsn: 0.606 ± 0.209
MetPro: 0.875 ± 0.292
MetGln: 0.606 ± 0.207
MetArg: 1.212 ± 0.241
MetSer: 1.414 ± 0.251
MetThr: 2.088 ± 0.402
MetVal: 1.01 ± 0.216
MetTrp: 0.269 ± 0.145
MetTyr: 0.202 ± 0.13
MetXaa: 0.0 ± 0.0
Asn
AsnAla: 4.444 ± 0.746
AsnCys: 0.673 ± 0.282
AsnAsp: 1.684 ± 0.397
AsnGlu: 1.347 ± 0.286
AsnPhe: 0.808 ± 0.206
AsnGly: 4.04 ± 0.528
AsnHis: 0.875 ± 0.276
AsnIle: 0.943 ± 0.205
AsnLys: 1.077 ± 0.272
AsnLeu: 2.424 ± 0.412
AsnMet: 0.539 ± 0.163
AsnAsn: 0.741 ± 0.194
AsnPro: 4.242 ± 0.412
AsnGln: 1.212 ± 0.253
AsnArg: 2.492 ± 0.434
AsnSer: 0.673 ± 0.213
AsnThr: 1.818 ± 0.317
AsnVal: 1.077 ± 0.247
AsnTrp: 0.808 ± 0.197
AsnTyr: 0.606 ± 0.169
AsnXaa: 0.0 ± 0.0
Pro
ProAla: 9.832 ± 0.908
ProCys: 0.337 ± 0.184
ProAsp: 6.734 ± 0.807
ProGlu: 4.242 ± 0.53
ProPhe: 1.684 ± 0.357
ProGly: 4.31 ± 0.636
ProHis: 0.875 ± 0.307
ProIle: 3.232 ± 0.454
ProLys: 2.29 ± 0.503
ProLeu: 4.916 ± 0.578
ProMet: 0.943 ± 0.325
ProAsn: 1.818 ± 0.341
ProPro: 4.512 ± 0.575
ProGln: 2.02 ± 0.328
ProArg: 4.108 ± 0.565
ProSer: 2.222 ± 0.409
ProThr: 4.646 ± 0.621
ProVal: 3.232 ± 0.677
ProTrp: 0.673 ± 0.216
ProTyr: 0.943 ± 0.259
ProXaa: 0.0 ± 0.0
Gln
GlnAla: 4.444 ± 0.625
GlnCys: 0.135 ± 0.094
GlnAsp: 2.29 ± 0.379
GlnGlu: 0.808 ± 0.231
GlnPhe: 1.481 ± 0.272
GlnGly: 3.3 ± 0.424
GlnHis: 0.673 ± 0.207
GlnIle: 1.347 ± 0.218
GlnLys: 1.481 ± 0.331
GlnLeu: 2.896 ± 0.33
GlnMet: 1.414 ± 0.273
GlnAsn: 1.751 ± 0.311
GlnPro: 2.088 ± 0.366
GlnGln: 1.751 ± 0.37
GlnArg: 3.03 ± 0.376
GlnSer: 1.818 ± 0.342
GlnThr: 2.424 ± 0.367
GlnVal: 2.694 ± 0.348
GlnTrp: 0.606 ± 0.177
GlnTyr: 0.741 ± 0.248
GlnXaa: 0.0 ± 0.0
Arg
ArgAla: 8.148 ± 1.03
ArgCys: 1.077 ± 0.326
ArgAsp: 4.242 ± 0.517
ArgGlu: 4.04 ± 0.499
ArgPhe: 1.549 ± 0.378
ArgGly: 4.579 ± 0.568
ArgHis: 1.279 ± 0.29
ArgIle: 2.694 ± 0.359
ArgLys: 2.828 ± 0.525
ArgLeu: 5.791 ± 0.614
ArgMet: 1.818 ± 0.323
ArgAsn: 2.29 ± 0.395
ArgPro: 4.512 ± 0.533
ArgGln: 2.492 ± 0.436
ArgArg: 6.195 ± 0.773
ArgSer: 3.098 ± 0.402
ArgThr: 5.118 ± 0.614
ArgVal: 3.569 ± 0.595
ArgTrp: 2.088 ± 0.528
ArgTyr: 1.481 ± 0.391
ArgXaa: 0.0 ± 0.0
Ser
SerAla: 4.175 ± 0.615
SerCys: 0.269 ± 0.135
SerAsp: 3.165 ± 0.516
SerGlu: 2.492 ± 0.357
SerPhe: 1.01 ± 0.29
SerGly: 4.646 ± 0.774
SerHis: 0.404 ± 0.142
SerIle: 2.02 ± 0.429
SerLys: 1.414 ± 0.356
SerLeu: 3.165 ± 0.453
SerMet: 0.741 ± 0.219
SerAsn: 1.953 ± 0.413
SerPro: 2.424 ± 0.356
SerGln: 1.347 ± 0.324
SerArg: 2.761 ± 0.43
SerSer: 2.492 ± 0.545
SerThr: 3.636 ± 0.464
SerVal: 2.896 ± 0.403
SerTrp: 0.808 ± 0.21
SerTyr: 1.01 ± 0.265
SerXaa: 0.0 ± 0.0
Thr
ThrAla: 9.832 ± 0.853
ThrCys: 0.606 ± 0.212
ThrAsp: 4.175 ± 0.6
ThrGlu: 3.502 ± 0.499
ThrPhe: 1.818 ± 0.334
ThrGly: 5.926 ± 0.653
ThrHis: 1.145 ± 0.261
ThrIle: 2.29 ± 0.362
ThrLys: 1.684 ± 0.256
ThrLeu: 3.973 ± 0.452
ThrMet: 2.088 ± 0.435
ThrAsn: 2.222 ± 0.44
ThrPro: 4.175 ± 0.65
ThrGln: 1.347 ± 0.308
ThrArg: 3.098 ± 0.42
ThrSer: 3.3 ± 0.595
ThrThr: 4.444 ± 0.621
ThrVal: 5.32 ± 0.465
ThrTrp: 1.01 ± 0.283
ThrTyr: 0.875 ± 0.176
ThrXaa: 0.0 ± 0.0
Val
ValAla: 7.677 ± 0.973
ValCys: 1.077 ± 0.295
ValAsp: 4.983 ± 0.568
ValGlu: 4.646 ± 0.497
ValPhe: 2.626 ± 0.533
ValGly: 5.522 ± 0.551
ValHis: 0.943 ± 0.206
ValIle: 3.3 ± 0.509
ValLys: 2.694 ± 0.461
ValLeu: 5.051 ± 0.535
ValMet: 1.01 ± 0.208
ValAsn: 2.02 ± 0.376
ValPro: 3.973 ± 0.581
ValGln: 2.626 ± 0.377
ValArg: 3.636 ± 0.443
ValSer: 2.828 ± 0.477
ValThr: 4.579 ± 0.647
ValVal: 4.983 ± 0.526
ValTrp: 1.549 ± 0.332
ValTyr: 1.077 ± 0.241
ValXaa: 0.0 ± 0.0
Trp
TrpAla: 2.761 ± 0.41
TrpCys: 0.135 ± 0.093
TrpAsp: 1.145 ± 0.239
TrpGlu: 0.741 ± 0.245
TrpPhe: 1.145 ± 0.275
TrpGly: 0.943 ± 0.33
TrpHis: 0.471 ± 0.193
TrpIle: 0.943 ± 0.197
TrpLys: 0.539 ± 0.202
TrpLeu: 2.088 ± 0.594
TrpMet: 0.269 ± 0.134
TrpAsn: 1.145 ± 0.316
TrpPro: 0.741 ± 0.203
TrpGln: 1.145 ± 0.247
TrpArg: 1.212 ± 0.307
TrpSer: 0.875 ± 0.274
TrpThr: 1.481 ± 0.345
TrpVal: 0.943 ± 0.281
TrpTrp: 0.269 ± 0.122
TrpTyr: 0.471 ± 0.161
TrpXaa: 0.0 ± 0.0
Tyr
TyrAla: 2.155 ± 0.366
TyrCys: 0.202 ± 0.104
TyrAsp: 1.212 ± 0.3
TyrGlu: 0.808 ± 0.221
TyrPhe: 0.471 ± 0.2
TyrGly: 2.357 ± 0.436
TyrHis: 0.875 ± 0.261
TyrIle: 0.808 ± 0.213
TyrLys: 0.404 ± 0.158
TyrLeu: 1.616 ± 0.311
TyrMet: 0.269 ± 0.188
TyrAsn: 0.539 ± 0.17
TyrPro: 0.808 ± 0.258
TyrGln: 1.01 ± 0.232
TyrArg: 1.414 ± 0.293
TyrSer: 1.145 ± 0.356
TyrThr: 1.077 ± 0.22
TyrVal: 2.222 ± 0.411
TyrTrp: 0.202 ± 0.109
TyrTyr: 0.269 ± 0.168
TyrXaa: 0.0 ± 0.0
Xaa
XaaAla: 0.0 ± 0.0
XaaCys: 0.0 ± 0.0
XaaAsp: 0.0 ± 0.0
XaaGlu: 0.0 ± 0.0
XaaPhe: 0.0 ± 0.0
XaaGly: 0.0 ± 0.0
XaaHis: 0.0 ± 0.0
XaaIle: 0.0 ± 0.0
XaaLys: 0.0 ± 0.0
XaaLeu: 0.0 ± 0.0
XaaMet: 0.0 ± 0.0
XaaAsn: 0.0 ± 0.0
XaaPro: 0.0 ± 0.0
XaaGln: 0.0 ± 0.0
XaaArg: 0.0 ± 0.0
XaaSer: 0.0 ± 0.0
XaaThr: 0.0 ± 0.0
XaaVal: 0.0 ± 0.0
XaaTrp: 0.0 ± 0.0
XaaTyr: 0.0 ± 0.0
XaaXaa: 0.0 ± 0.0
Statistics based on 68 proteins (14851 amino acids)

Note: The error was estimated by bootstrapping (×100) at the protein level.
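
One way such a protein-level bootstrap could look is sketched below. The exact procedure used by the database is not described here, so this is an assumption throughout: whole proteins are resampled with replacement, the frequency is recomputed for each of 100 rounds, and the sample standard deviation is taken as the error. Function names, the one-letter sequence encoding, and the fixed seed are illustrative choices.

```python
import random
import statistics
from collections import Counter

def dipeptide_per_mille(sequences):
    """Per mille frequency of each overlapping dipeptide in the sequences."""
    counts = Counter()
    for seq in sequences:
        counts.update(seq[i:i + 2] for i in range(len(seq) - 1))
    total = sum(counts.values())
    if total == 0:
        return {}
    return {pair: 1000.0 * n / total for pair, n in counts.items()}

def bootstrap_error(sequences, pair, n_rounds=100, seed=0):
    """Estimate the error of one dipeptide frequency by resampling proteins.

    Assumption: the error is the sample standard deviation over n_rounds
    bootstrap replicates, each drawn with replacement at the protein level.
    """
    rng = random.Random(seed)
    estimates = []
    for _ in range(n_rounds):
        # Draw as many proteins as in the original set, with replacement.
        sample = rng.choices(sequences, k=len(sequences))
        estimates.append(dipeptide_per_mille(sample).get(pair, 0.0))
    return statistics.stdev(estimates)

# Hypothetical toy proteome, for illustration only.
toy = ["AAGA", "GAC", "AAAA"]
error = bootstrap_error(toy, "AA")
```

Resampling whole proteins rather than individual dipeptides keeps the within-protein composition intact, so the spread reflects variation between proteins.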

The dipeptide statistics above (among other statistics for this proteome) can be downloaded from this CSV file
See this proteome in: uniprot_link
Proteome-pI is available under the Creative Commons Attribution-NoDerivs license; for more details see here

Reference: Kozlowski LP. Proteome-pI 2.0: Proteome Isoelectric Point Database Update. Nucleic Acids Res. 2021, doi: 10.1093/nar/gkab944
Contact: Lukasz P. Kozlowski