Amino acid dipeptide frequency for Mycobacterium phage GaugeLDP

Apart from single amino acid frequencies, one can also calculate amino acid dipeptide frequencies, i.e. how often each ordered pair of adjacent residues occurs.

There are 400 possible dipeptides (441 if Xaa is counted as an additional 21st amino acid representing all non-standard or unknown amino acids). Thus, if all 400 dipeptides were equally likely in proteins, each would be present at a frequency of 0.25% (2.5‰). As this is not the case in nature, for better visibility all overrepresented dipeptides are marked in red and all underrepresented ones in blue in the table.
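
As a rough illustration of how such frequencies can be obtained, the Python sketch below counts overlapping pairs of adjacent residues within each protein and reports them in per mille. The function name, the toy sequences, the one-letter alphabet and the choice to normalise by the total number of pairs are assumptions made for this example; they are not taken from the Proteome-pI implementation.

```python
from collections import Counter

# The 20 standard amino acids in one-letter code; anything else is mapped to X (Xaa).
STANDARD = "ACDEFGHIKLMNPQRSTVWY"


def dipeptide_frequencies(proteins):
    """Return {dipeptide: frequency in per mille} for a list of protein sequences.

    Pairs are overlapping and never span two different proteins.
    Normalising by the total number of pairs is an assumption of this sketch.
    """
    counts = Counter()
    for seq in proteins:
        for first, second in zip(seq, seq[1:]):
            first = first if first in STANDARD else "X"
            second = second if second in STANDARD else "X"
            counts[first + second] += 1
    total = sum(counts.values())
    if total == 0:
        return {}
    return {pair: 1000.0 * n / total for pair, n in counts.items()}


# Hypothetical toy sequences, not taken from the phage proteome.
proteins = ["MAAGL", "MKAAL"]
freqs = dipeptide_frequencies(proteins)

# Uniform expectation over the 400 standard dipeptides, in per mille.
uniform_expectation = 1000.0 / 400

print(freqs["AA"], uniform_expectation)  # 250.0 2.5
```

In the toy input, AA occurs in 2 of the 8 pairs, so it is far above the uniform expectation of 2.5‰; with only two short sequences this says nothing about real proteomes.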

All values are given in per mille (‰) and therefore need to be multiplied by 10⁻³ to obtain fractions.
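
The unit conversion can be sketched in a few lines of Python. The two sample values (AlaAla and CysCys) are copied from the table on this page; the helper names are hypothetical, and comparing against the uniform expectation of 1/400 is an assumption about how "more than expected" is judged, which may differ from the rule actually used to colour the table.

```python
# Uniform expectation for 400 equally likely dipeptides, in per mille.
UNIFORM_PER_MILLE = 1000.0 / 400  # 2.5


def per_mille_to_fraction(value):
    """Convert a per mille value to a plain fraction (multiply by 10^-3)."""
    return value * 1e-3


def classify(value, expected=UNIFORM_PER_MILLE):
    """Label a per mille frequency relative to the expected value."""
    if value > expected:
        return "over"
    if value < expected:
        return "under"
    return "as expected"


# Values copied from the table: AlaAla = 11.014, CysCys = 0.061 (both in per mille).
for name, value in [("AlaAla", 11.014), ("CysCys", 0.061)]:
    print(name, per_mille_to_fraction(value), classify(value))
```

With these inputs AlaAla is labelled "over" and CysCys "under", since 11.014‰ and 0.061‰ lie on either side of 2.5‰.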

For more information, see the sequence space article on Wikipedia.

Ala Cys Asp Glu Phe Gly His Ile Lys Leu Met Asn Pro Gln Arg Ser Thr Val Trp Tyr Xaa
Ala
11.014AlaAla: 11.014 ± 1.042
0.424AlaCys: 0.424 ± 0.188
5.749AlaAsp: 5.749 ± 0.661
7.565AlaGlu: 7.565 ± 0.867
4.115AlaPhe: 4.115 ± 0.585
7.746AlaGly: 7.746 ± 0.784
1.816AlaHis: 1.816 ± 0.296
5.326AlaIle: 5.326 ± 0.455
4.962AlaLys: 4.962 ± 0.606
9.199AlaLeu: 9.199 ± 0.946
2.844AlaMet: 2.844 ± 0.43
2.965AlaAsn: 2.965 ± 0.418
4.418AlaPro: 4.418 ± 0.646
3.813AlaGln: 3.813 ± 0.572
6.415AlaArg: 6.415 ± 0.724
5.084AlaSer: 5.084 ± 0.485
5.144AlaThr: 5.144 ± 0.563
6.899AlaVal: 6.899 ± 0.667
1.755AlaTrp: 1.755 ± 0.33
3.086AlaTyr: 3.086 ± 0.47
0.0AlaXaa: 0.0 ± 0.0
Cys
0.545CysAla: 0.545 ± 0.166
0.061CysCys: 0.061 ± 0.066
0.424CysAsp: 0.424 ± 0.178
0.484CysGlu: 0.484 ± 0.186
0.363CysPhe: 0.363 ± 0.132
0.726CysGly: 0.726 ± 0.214
0.303CysHis: 0.303 ± 0.135
0.303CysIle: 0.303 ± 0.153
0.424CysLys: 0.424 ± 0.143
0.666CysLeu: 0.666 ± 0.216
0.242CysMet: 0.242 ± 0.196
0.484CysAsn: 0.484 ± 0.149
0.605CysPro: 0.605 ± 0.195
0.242CysGln: 0.242 ± 0.128
0.545CysArg: 0.545 ± 0.235
0.363CysSer: 0.363 ± 0.162
0.484CysThr: 0.484 ± 0.189
0.666CysVal: 0.666 ± 0.189
0.424CysTrp: 0.424 ± 0.152
0.182CysTyr: 0.182 ± 0.109
0.0CysXaa: 0.0 ± 0.0
Asp
5.931AspAla: 5.931 ± 0.575
0.666AspCys: 0.666 ± 0.243
3.45AspAsp: 3.45 ± 0.509
5.205AspGlu: 5.205 ± 0.865
2.602AspPhe: 2.602 ± 0.444
5.81AspGly: 5.81 ± 0.464
1.21AspHis: 1.21 ± 0.279
3.571AspIle: 3.571 ± 0.524
2.118AspLys: 2.118 ± 0.333
6.233AspLeu: 6.233 ± 0.685
1.392AspMet: 1.392 ± 0.27
1.21AspAsn: 1.21 ± 0.269
4.297AspPro: 4.297 ± 0.549
1.755AspGln: 1.755 ± 0.308
2.239AspArg: 2.239 ± 0.337
2.844AspSer: 2.844 ± 0.423
3.328AspThr: 3.328 ± 0.478
4.418AspVal: 4.418 ± 0.435
0.908AspTrp: 0.908 ± 0.191
2.239AspTyr: 2.239 ± 0.333
0.0AspXaa: 0.0 ± 0.0
Glu
8.109GluAla: 8.109 ± 1.04
0.182GluCys: 0.182 ± 0.108
4.418GluAsp: 4.418 ± 0.556
4.72GluGlu: 4.72 ± 0.711
2.602GluPhe: 2.602 ± 0.454
4.72GluGly: 4.72 ± 0.563
1.513GluHis: 1.513 ± 0.324
3.207GluIle: 3.207 ± 0.423
2.179GluLys: 2.179 ± 0.43
6.778GluLeu: 6.778 ± 0.721
2.421GluMet: 2.421 ± 0.334
2.179GluAsn: 2.179 ± 0.332
2.965GluPro: 2.965 ± 0.43
2.965GluGln: 2.965 ± 0.41
4.66GluArg: 4.66 ± 0.506
2.481GluSer: 2.481 ± 0.287
3.692GluThr: 3.692 ± 0.481
4.66GluVal: 4.66 ± 0.584
1.573GluTrp: 1.573 ± 0.33
1.937GluTyr: 1.937 ± 0.361
0.0GluXaa: 0.0 ± 0.0
Phe
3.51PheAla: 3.51 ± 0.457
0.363PheCys: 0.363 ± 0.148
1.997PheAsp: 1.997 ± 0.448
2.36PheGlu: 2.36 ± 0.366
0.666PhePhe: 0.666 ± 0.23
3.389PheGly: 3.389 ± 0.534
0.726PheHis: 0.726 ± 0.178
1.271PheIle: 1.271 ± 0.285
1.21PheLys: 1.21 ± 0.292
2.179PheLeu: 2.179 ± 0.495
0.847PheMet: 0.847 ± 0.239
1.937PheAsn: 1.937 ± 0.377
1.816PhePro: 1.816 ± 0.307
1.271PheGln: 1.271 ± 0.35
2.239PheArg: 2.239 ± 0.376
2.179PheSer: 2.179 ± 0.408
2.3PheThr: 2.3 ± 0.304
2.118PheVal: 2.118 ± 0.42
0.605PheTrp: 0.605 ± 0.189
0.787PheTyr: 0.787 ± 0.207
0.0PheXaa: 0.0 ± 0.0
Gly
7.02GlyAla: 7.02 ± 0.86
0.666GlyCys: 0.666 ± 0.188
5.931GlyAsp: 5.931 ± 0.72
4.72GlyGlu: 4.72 ± 0.551
2.844GlyPhe: 2.844 ± 0.418
7.444GlyGly: 7.444 ± 1.229
1.695GlyHis: 1.695 ± 0.369
4.297GlyIle: 4.297 ± 0.547
4.176GlyLys: 4.176 ± 0.461
6.536GlyLeu: 6.536 ± 1.045
2.3GlyMet: 2.3 ± 0.35
3.631GlyAsn: 3.631 ± 0.531
5.265GlyPro: 5.265 ± 1.389
3.268GlyGln: 3.268 ± 0.463
4.297GlyArg: 4.297 ± 0.593
4.478GlySer: 4.478 ± 0.48
4.902GlyThr: 4.902 ± 0.511
6.415GlyVal: 6.415 ± 0.549
1.755GlyTrp: 1.755 ± 0.341
2.602GlyTyr: 2.602 ± 0.334
0.0GlyXaa: 0.0 ± 0.0
His
1.755HisAla: 1.755 ± 0.391
0.363HisCys: 0.363 ± 0.124
1.271HisAsp: 1.271 ± 0.276
1.392HisGlu: 1.392 ± 0.351
0.847HisPhe: 0.847 ± 0.214
1.573HisGly: 1.573 ± 0.417
0.424HisHis: 0.424 ± 0.151
1.392HisIle: 1.392 ± 0.223
0.908HisLys: 0.908 ± 0.24
1.331HisLeu: 1.331 ± 0.261
0.242HisMet: 0.242 ± 0.129
0.666HisAsn: 0.666 ± 0.173
0.968HisPro: 0.968 ± 0.219
0.968HisGln: 0.968 ± 0.225
1.513HisArg: 1.513 ± 0.354
0.605HisSer: 0.605 ± 0.198
0.908HisThr: 0.908 ± 0.204
1.21HisVal: 1.21 ± 0.272
0.424HisTrp: 0.424 ± 0.15
0.847HisTyr: 0.847 ± 0.278
0.0HisXaa: 0.0 ± 0.0
Ile
5.507IleAla: 5.507 ± 0.559
0.363IleCys: 0.363 ± 0.172
3.994IleAsp: 3.994 ± 0.362
4.962IleGlu: 4.962 ± 0.574
1.029IlePhe: 1.029 ± 0.232
3.934IleGly: 3.934 ± 0.607
0.787IleHis: 0.787 ± 0.205
1.816IleIle: 1.816 ± 0.299
2.3IleLys: 2.3 ± 0.362
4.297IleLeu: 4.297 ± 0.581
0.605IleMet: 0.605 ± 0.166
1.634IleAsn: 1.634 ± 0.353
3.328IlePro: 3.328 ± 0.466
1.331IleGln: 1.331 ± 0.298
3.752IleArg: 3.752 ± 0.522
2.905IleSer: 2.905 ± 0.39
3.268IleThr: 3.268 ± 0.447
2.723IleVal: 2.723 ± 0.412
0.726IleTrp: 0.726 ± 0.194
0.787IleTyr: 0.787 ± 0.236
0.0IleXaa: 0.0 ± 0.0
Lys
4.962LysAla: 4.962 ± 0.555
0.242LysCys: 0.242 ± 0.093
2.36LysAsp: 2.36 ± 0.396
2.542LysGlu: 2.542 ± 0.42
0.968LysPhe: 0.968 ± 0.241
4.055LysGly: 4.055 ± 0.599
0.847LysHis: 0.847 ± 0.2
2.058LysIle: 2.058 ± 0.363
2.118LysLys: 2.118 ± 0.426
3.571LysLeu: 3.571 ± 0.481
0.908LysMet: 0.908 ± 0.223
1.513LysAsn: 1.513 ± 0.249
3.631LysPro: 3.631 ± 0.571
1.573LysGln: 1.573 ± 0.272
3.571LysArg: 3.571 ± 0.574
1.997LysSer: 1.997 ± 0.326
3.026LysThr: 3.026 ± 0.402
4.176LysVal: 4.176 ± 0.524
0.787LysTrp: 0.787 ± 0.193
1.331LysTyr: 1.331 ± 0.29
0.0LysXaa: 0.0 ± 0.0
Leu
9.743LeuAla: 9.743 ± 0.892
1.029LeuCys: 1.029 ± 0.282
4.962LeuAsp: 4.962 ± 0.468
5.084LeuGlu: 5.084 ± 0.648
2.784LeuPhe: 2.784 ± 0.303
7.02LeuGly: 7.02 ± 0.907
1.816LeuHis: 1.816 ± 0.394
4.781LeuIle: 4.781 ± 0.395
3.51LeuLys: 3.51 ± 0.408
5.326LeuLeu: 5.326 ± 0.58
2.663LeuMet: 2.663 ± 0.486
2.421LeuAsn: 2.421 ± 0.421
4.418LeuPro: 4.418 ± 0.533
2.36LeuGln: 2.36 ± 0.413
5.689LeuArg: 5.689 ± 0.642
5.023LeuSer: 5.023 ± 0.668
5.084LeuThr: 5.084 ± 0.636
4.297LeuVal: 4.297 ± 0.552
1.452LeuTrp: 1.452 ± 0.297
2.36LeuTyr: 2.36 ± 0.422
0.0LeuXaa: 0.0 ± 0.0
Met
2.602MetAla: 2.602 ± 0.389
0.061MetCys: 0.061 ± 0.059
1.331MetAsp: 1.331 ± 0.273
1.513MetGlu: 1.513 ± 0.335
0.484MetPhe: 0.484 ± 0.15
2.058MetGly: 2.058 ± 0.367
0.545MetHis: 0.545 ± 0.199
1.452MetIle: 1.452 ± 0.307
1.513MetLys: 1.513 ± 0.338
2.36MetLeu: 2.36 ± 0.308
0.545MetMet: 0.545 ± 0.164
0.968MetAsn: 0.968 ± 0.207
1.392MetPro: 1.392 ± 0.376
1.15MetGln: 1.15 ± 0.315
1.816MetArg: 1.816 ± 0.299
1.937MetSer: 1.937 ± 0.314
1.695MetThr: 1.695 ± 0.299
1.937MetVal: 1.937 ± 0.369
0.121MetTrp: 0.121 ± 0.08
0.484MetTyr: 0.484 ± 0.169
0.0MetXaa: 0.0 ± 0.0
Asn
2.723AsnAla: 2.723 ± 0.476
0.484AsnCys: 0.484 ± 0.196
1.755AsnAsp: 1.755 ± 0.314
2.179AsnGlu: 2.179 ± 0.401
1.089AsnPhe: 1.089 ± 0.285
3.389AsnGly: 3.389 ± 0.461
0.847AsnHis: 0.847 ± 0.173
1.331AsnIle: 1.331 ± 0.365
1.331AsnLys: 1.331 ± 0.266
2.784AsnLeu: 2.784 ± 0.408
0.908AsnMet: 0.908 ± 0.216
0.545AsnAsn: 0.545 ± 0.159
2.542AsnPro: 2.542 ± 0.381
0.787AsnGln: 0.787 ± 0.215
1.937AsnArg: 1.937 ± 0.363
1.15AsnSer: 1.15 ± 0.277
1.513AsnThr: 1.513 ± 0.301
3.026AsnVal: 3.026 ± 0.377
0.787AsnTrp: 0.787 ± 0.224
0.908AsnTyr: 0.908 ± 0.218
0.0AsnXaa: 0.0 ± 0.0
Pro
5.568ProAla: 5.568 ± 0.581
0.363ProCys: 0.363 ± 0.142
3.752ProAsp: 3.752 ± 0.486
4.055ProGlu: 4.055 ± 0.59
2.3ProPhe: 2.3 ± 0.376
5.205ProGly: 5.205 ± 0.587
1.029ProHis: 1.029 ± 0.203
2.784ProIle: 2.784 ± 0.457
3.147ProLys: 3.147 ± 0.613
3.207ProLeu: 3.207 ± 0.474
1.271ProMet: 1.271 ± 0.28
2.421ProAsn: 2.421 ± 0.463
3.207ProPro: 3.207 ± 0.553
2.481ProGln: 2.481 ± 0.718
4.236ProArg: 4.236 ± 0.672
2.723ProSer: 2.723 ± 0.465
3.631ProThr: 3.631 ± 0.409
3.571ProVal: 3.571 ± 0.473
1.089ProTrp: 1.089 ± 0.322
1.271ProTyr: 1.271 ± 0.233
0.0ProXaa: 0.0 ± 0.0
Gln
4.357GlnAla: 4.357 ± 0.585
0.182GlnCys: 0.182 ± 0.1
1.452GlnAsp: 1.452 ± 0.276
1.816GlnGlu: 1.816 ± 0.306
1.21GlnPhe: 1.21 ± 0.377
3.994GlnGly: 3.994 ± 0.887
0.666GlnHis: 0.666 ± 0.214
2.421GlnIle: 2.421 ± 0.359
1.634GlnLys: 1.634 ± 0.297
2.965GlnLeu: 2.965 ± 0.624
0.908GlnMet: 0.908 ± 0.251
0.605GlnAsn: 0.605 ± 0.208
1.15GlnPro: 1.15 ± 0.275
1.695GlnGln: 1.695 ± 0.343
2.965GlnArg: 2.965 ± 0.571
1.937GlnSer: 1.937 ± 0.331
2.118GlnThr: 2.118 ± 0.326
2.663GlnVal: 2.663 ± 0.518
0.605GlnTrp: 0.605 ± 0.212
0.968GlnTyr: 0.968 ± 0.215
0.0GlnXaa: 0.0 ± 0.0
Arg
5.931ArgAla: 5.931 ± 0.694
0.545ArgCys: 0.545 ± 0.247
3.873ArgAsp: 3.873 ± 0.452
5.144ArgGlu: 5.144 ± 0.58
2.542ArgPhe: 2.542 ± 0.386
4.176ArgGly: 4.176 ± 0.549
1.513ArgHis: 1.513 ± 0.301
3.631ArgIle: 3.631 ± 0.45
3.51ArgLys: 3.51 ± 0.579
6.294ArgLeu: 6.294 ± 0.608
2.179ArgMet: 2.179 ± 0.388
1.937ArgAsn: 1.937 ± 0.366
3.328ArgPro: 3.328 ± 0.449
2.058ArgGln: 2.058 ± 0.356
4.962ArgArg: 4.962 ± 0.607
3.51ArgSer: 3.51 ± 0.509
3.207ArgThr: 3.207 ± 0.394
3.934ArgVal: 3.934 ± 0.417
1.452ArgTrp: 1.452 ± 0.311
2.36ArgTyr: 2.36 ± 0.404
0.0ArgXaa: 0.0 ± 0.0
Ser
4.539SerAla: 4.539 ± 0.468
0.484SerCys: 0.484 ± 0.149
2.663SerAsp: 2.663 ± 0.464
3.692SerGlu: 3.692 ± 0.432
1.816SerPhe: 1.816 ± 0.336
4.962SerGly: 4.962 ± 0.703
0.666SerHis: 0.666 ± 0.171
2.058SerIle: 2.058 ± 0.345
3.268SerLys: 3.268 ± 0.582
4.357SerLeu: 4.357 ± 0.586
1.271SerMet: 1.271 ± 0.266
1.029SerAsn: 1.029 ± 0.242
2.905SerPro: 2.905 ± 0.379
1.876SerGln: 1.876 ± 0.283
3.51SerArg: 3.51 ± 0.488
2.723SerSer: 2.723 ± 0.549
3.086SerThr: 3.086 ± 0.417
4.176SerVal: 4.176 ± 0.533
1.331SerTrp: 1.331 ± 0.279
1.695SerTyr: 1.695 ± 0.301
0.0SerXaa: 0.0 ± 0.0
Thr
5.386ThrAla: 5.386 ± 0.466
0.726ThrCys: 0.726 ± 0.228
3.147ThrAsp: 3.147 ± 0.528
3.147ThrGlu: 3.147 ± 0.435
1.816ThrPhe: 1.816 ± 0.293
5.87ThrGly: 5.87 ± 0.821
1.271ThrHis: 1.271 ± 0.279
2.421ThrIle: 2.421 ± 0.396
3.389ThrLys: 3.389 ± 0.441
4.841ThrLeu: 4.841 ± 0.733
1.392ThrMet: 1.392 ± 0.289
1.21ThrAsn: 1.21 ± 0.232
4.055ThrPro: 4.055 ± 0.572
2.602ThrGln: 2.602 ± 0.411
3.631ThrArg: 3.631 ± 0.489
3.147ThrSer: 3.147 ± 0.452
3.268ThrThr: 3.268 ± 0.553
4.72ThrVal: 4.72 ± 0.577
1.15ThrTrp: 1.15 ± 0.256
1.695ThrTyr: 1.695 ± 0.24
0.0ThrXaa: 0.0 ± 0.0
Val
7.02ValAla: 7.02 ± 0.837
0.787ValCys: 0.787 ± 0.22
5.507ValAsp: 5.507 ± 0.519
4.297ValGlu: 4.297 ± 0.444
2.179ValPhe: 2.179 ± 0.431
5.023ValGly: 5.023 ± 0.674
1.089ValHis: 1.089 ± 0.3
3.328ValIle: 3.328 ± 0.38
2.965ValLys: 2.965 ± 0.441
5.144ValLeu: 5.144 ± 0.644
1.695ValMet: 1.695 ± 0.331
2.965ValAsn: 2.965 ± 0.422
3.934ValPro: 3.934 ± 0.485
1.997ValGln: 1.997 ± 0.357
4.176ValArg: 4.176 ± 0.533
4.539ValSer: 4.539 ± 0.537
5.386ValThr: 5.386 ± 0.571
4.841ValVal: 4.841 ± 0.663
1.271ValTrp: 1.271 ± 0.296
1.452ValTyr: 1.452 ± 0.267
0.0ValXaa: 0.0 ± 0.0
Trp
1.816TrpAla: 1.816 ± 0.349
0.303TrpCys: 0.303 ± 0.154
0.847TrpAsp: 0.847 ± 0.227
1.392TrpGlu: 1.392 ± 0.257
0.666TrpPhe: 0.666 ± 0.179
1.331TrpGly: 1.331 ± 0.28
0.303TrpHis: 0.303 ± 0.132
1.15TrpIle: 1.15 ± 0.263
0.605TrpLys: 0.605 ± 0.199
1.15TrpLeu: 1.15 ± 0.259
0.605TrpMet: 0.605 ± 0.179
0.787TrpAsn: 0.787 ± 0.22
1.15TrpPro: 1.15 ± 0.276
1.089TrpGln: 1.089 ± 0.223
1.271TrpArg: 1.271 ± 0.271
0.968TrpSer: 0.968 ± 0.246
1.513TrpThr: 1.513 ± 0.303
1.089TrpVal: 1.089 ± 0.25
0.363TrpTrp: 0.363 ± 0.156
0.605TrpTyr: 0.605 ± 0.19
0.0TrpXaa: 0.0 ± 0.0
Tyr
2.3TyrAla: 2.3 ± 0.408
0.242TyrCys: 0.242 ± 0.113
2.542TyrAsp: 2.542 ± 0.352
1.695TyrGlu: 1.695 ± 0.366
0.847TyrPhe: 0.847 ± 0.2
1.816TyrGly: 1.816 ± 0.385
0.545TyrHis: 0.545 ± 0.161
1.452TyrIle: 1.452 ± 0.313
0.968TyrLys: 0.968 ± 0.232
2.602TyrLeu: 2.602 ± 0.352
0.666TyrMet: 0.666 ± 0.216
0.908TyrAsn: 0.908 ± 0.249
1.937TyrPro: 1.937 ± 0.341
1.029TyrGln: 1.029 ± 0.262
2.663TyrArg: 2.663 ± 0.389
1.573TyrSer: 1.573 ± 0.277
1.331TyrThr: 1.331 ± 0.276
2.058TyrVal: 2.058 ± 0.417
0.424TyrTrp: 0.424 ± 0.148
1.15TyrTyr: 1.15 ± 0.289
0.0TyrXaa: 0.0 ± 0.0
Xaa
0.0XaaAla: 0.0 ± 0.0
0.0XaaCys: 0.0 ± 0.0
0.0XaaAsp: 0.0 ± 0.0
0.0XaaGlu: 0.0 ± 0.0
0.0XaaPhe: 0.0 ± 0.0
0.0XaaGly: 0.0 ± 0.0
0.0XaaHis: 0.0 ± 0.0
0.0XaaIle: 0.0 ± 0.0
0.0XaaLys: 0.0 ± 0.0
0.0XaaLeu: 0.0 ± 0.0
0.0XaaMet: 0.0 ± 0.0
0.0XaaAsn: 0.0 ± 0.0
0.0XaaPro: 0.0 ± 0.0
0.0XaaGln: 0.0 ± 0.0
0.0XaaArg: 0.0 ± 0.0
0.0XaaSer: 0.0 ± 0.0
0.0XaaThr: 0.0 ± 0.0
0.0XaaVal: 0.0 ± 0.0
0.0XaaTrp: 0.0 ± 0.0
0.0XaaTyr: 0.0 ± 0.0
0.0XaaXaa: 0.0 ± 0.0
Statistics based on 90 proteins (16525 amino acids)

Note: The error was estimated by bootstrapping (100 resamples) at the protein level.
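
A protein-level bootstrap of this kind can be sketched as follows: whole proteins are resampled with replacement, the dipeptide frequency is recomputed for each resample, and the spread of those estimates is reported as the error. This is a minimal sketch under stated assumptions: the function names, the toy sequences, the fixed seed and the use of the sample standard deviation as the error measure are all choices made for this example, not details confirmed by the source.

```python
import random
import statistics


def pair_frequency(proteins, pair):
    """Per mille frequency of one dipeptide among all overlapping pairs."""
    total = 0
    hits = 0
    for seq in proteins:
        for i in range(len(seq) - 1):
            total += 1
            if seq[i:i + 2] == pair:
                hits += 1
    return 1000.0 * hits / total if total else 0.0


def bootstrap_error(proteins, pair, n_resamples=100, seed=0):
    """Spread of the frequency over resamples drawn at the protein level.

    Each resample draws len(proteins) whole proteins with replacement.
    Using the standard deviation as the error is an assumption of this sketch.
    """
    rng = random.Random(seed)
    estimates = []
    for _ in range(n_resamples):
        sample = rng.choices(proteins, k=len(proteins))
        estimates.append(pair_frequency(sample, pair))
    return statistics.stdev(estimates)


# Hypothetical toy sequences in one-letter code.
toy = ["MAAGL", "MKAAL", "GGAAV", "MLLKK"]
print(pair_frequency(toy, "AA"), bootstrap_error(toy, "AA"))
```

Resampling whole proteins rather than individual residues keeps each sequence intact, so the estimate reflects variation between proteins.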

The dipeptide statistics above (along with other statistics for this proteome) can be downloaded from this CSV file
See this proteome in: uniprot_link
Proteome-pI is available under the Creative Commons Attribution-NoDerivs license; for more details see here

Reference: Kozlowski LP. Proteome-pI 2.0: Proteome Isoelectric Point Database Update. Nucleic Acids Res. 2021, doi: 10.1093/nar/gkab944
Contact: Lukasz P. Kozlowski