Amino acid dipeptide frequency for Mycobacterium phage Gaia

Apart from single amino acid frequencies, one can also calculate the so-called amino acid dipeptide frequency.

There are 400 possible dipeptides (441 if Xaa is counted as an additional 21st amino acid representing all non-standard or unknown amino acids). Thus, if dipeptides occurred randomly in proteins, with all amino acids equally frequent, each dipeptide would be present at a frequency of 0.25% (2.5‰). As this is not the case in nature, for better visibility all overrepresented dipeptides are marked in red and all underrepresented ones are marked in blue in the table.
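The uniform expectation described above can be illustrated with a minimal sketch. It counts overlapping dipeptides within each protein (never across protein boundaries) and compares each frequency with 1000/400 = 2.5‰. The function names, the one-letter alphabet constant, the mapping of non-standard residues to X, and the choice of denominator (the number of dipeptides counted) are all assumptions made for illustration; the exact counting procedure used by the database is not stated here.

```python
from collections import Counter

STANDARD = "ACDEFGHIKLMNPQRSTVWY"  # 20 standard amino acids, one-letter codes
UNIFORM_PER_MILLE = 1000.0 / 400   # 2.5 per mille if all 400 dipeptides were equally likely


def dipeptide_per_mille(proteins):
    """Return {dipeptide: frequency in per mille} for a list of protein sequences."""
    counts = Counter()
    for seq in proteins:
        # Assumption: anything outside the 20 standard residues is treated as X (Xaa).
        residues = [r if r in STANDARD else "X" for r in seq.upper()]
        # Overlapping pairs within one protein: positions (0,1), (1,2), ...
        for first, second in zip(residues, residues[1:]):
            counts[first + second] += 1
    total = sum(counts.values())
    if total == 0:
        return {}
    return {pair: 1000.0 * n / total for pair, n in counts.items()}


def representation(freq_per_mille):
    """Classify a frequency relative to the uniform 2.5 per mille expectation."""
    if freq_per_mille > UNIFORM_PER_MILLE:
        return "over"
    if freq_per_mille < UNIFORM_PER_MILLE:
        return "under"
    return "as expected"
```

With the values from the table, `representation(9.88)` (AlaAla) gives `"over"` and `representation(0.436)` (CysCys) gives `"under"`.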

All values are presented in per mille (‰) and therefore need to be multiplied by 10⁻³ to obtain fractions.
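As a small worked example of this unit conversion, the sketch below turns a per mille value into a plain fraction and into a percentage. The helper names are hypothetical; the input is the AlaAla value from the table.

```python
def per_mille_to_fraction(value_per_mille):
    """Convert a per mille value to a plain fraction (multiply by 10^-3)."""
    return value_per_mille * 1e-3


def per_mille_to_percent(value_per_mille):
    """Convert a per mille value to a percentage (1 per mille = 0.1 %)."""
    return value_per_mille * 0.1


ala_ala = 9.88  # AlaAla value from the table, in per mille
print(f"{per_mille_to_fraction(ala_ala):.5f}")    # 0.00988
print(f"{per_mille_to_percent(ala_ala):.3f} %")   # 0.988 %
```

For comparison, the uniform expectation of 0.25% corresponds to 2.5‰.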

For more information, see the sequence space article on Wikipedia.

First \ Second | Ala | Cys | Asp | Glu | Phe | Gly | His | Ile | Lys | Leu | Met | Asn | Pro | Gln | Arg | Ser | Thr | Val | Trp | Tyr | Xaa
Ala | 9.88 ± 1.182 | 1.017 ± 0.205 | 6.357 ± 0.496 | 7.446 ± 0.553 | 2.543 ± 0.259 | 7.265 ± 0.773 | 1.961 ± 0.315 | 5.158 ± 0.5 | 4.214 ± 0.419 | 7.701 ± 0.713 | 3.523 ± 0.372 | 3.305 ± 0.277 | 3.741 ± 0.341 | 3.378 ± 0.491 | 6.611 ± 0.54 | 6.211 ± 0.522 | 5.049 ± 0.5 | 5.993 ± 0.467 | 2.216 ± 0.261 | 2.906 ± 0.367 | 0.0 ± 0.0
Cys | 1.09 ± 0.205 | 0.436 ± 0.12 | 1.271 ± 0.239 | 1.126 ± 0.204 | 0.254 ± 0.101 | 2.325 ± 0.373 | 0.472 ± 0.126 | 0.509 ± 0.189 | 0.581 ± 0.19 | 1.09 ± 0.211 | 0.218 ± 0.075 | 0.654 ± 0.162 | 0.69 ± 0.197 | 0.545 ± 0.141 | 0.835 ± 0.179 | 1.017 ± 0.295 | 0.509 ± 0.127 | 1.053 ± 0.197 | 0.254 ± 0.092 | 0.472 ± 0.157 | 0.0 ± 0.0
Asp | 6.575 ± 0.454 | 1.271 ± 0.288 | 4.577 ± 0.456 | 4.758 ± 0.458 | 2.361 ± 0.275 | 5.34 ± 0.412 | 1.707 ± 0.246 | 2.942 ± 0.313 | 3.124 ± 0.359 | 5.267 ± 0.475 | 1.162 ± 0.212 | 2.397 ± 0.33 | 3.996 ± 0.437 | 1.816 ± 0.23 | 4.613 ± 0.409 | 3.124 ± 0.331 | 3.451 ± 0.369 | 4.177 ± 0.427 | 1.744 ± 0.306 | 2.506 ± 0.257 | 0.0 ± 0.0
Glu | 6.902 ± 0.574 | 0.981 ± 0.184 | 4.25 ± 0.379 | 3.051 ± 0.39 | 2.579 ± 0.321 | 3.632 ± 0.376 | 1.526 ± 0.198 | 3.088 ± 0.28 | 2.579 ± 0.323 | 7.011 ± 0.557 | 1.744 ± 0.236 | 1.925 ± 0.268 | 2.797 ± 0.366 | 2.179 ± 0.249 | 5.085 ± 0.471 | 4.068 ± 0.373 | 3.124 ± 0.335 | 4.613 ± 0.407 | 1.562 ± 0.229 | 2.434 ± 0.294 | 0.0 ± 0.0
Phe | 2.034 ± 0.28 | 0.472 ± 0.141 | 2.688 ± 0.292 | 1.744 ± 0.218 | 0.654 ± 0.155 | 2.361 ± 0.352 | 0.726 ± 0.193 | 1.489 ± 0.238 | 1.235 ± 0.21 | 2.143 ± 0.313 | 0.763 ± 0.177 | 1.344 ± 0.229 | 1.961 ± 0.265 | 0.581 ± 0.114 | 1.853 ± 0.269 | 2.07 ± 0.33 | 2.143 ± 0.286 | 2.361 ± 0.316 | 0.472 ± 0.164 | 1.09 ± 0.22 | 0.0 ± 0.0
Gly | 6.938 ± 0.733 | 1.344 ± 0.243 | 5.412 ± 0.406 | 5.267 ± 0.414 | 2.761 ± 0.324 | 9.59 ± 1.63 | 2.143 ± 0.334 | 3.887 ± 0.434 | 3.669 ± 0.327 | 5.884 ± 0.521 | 2.434 ± 0.283 | 3.342 ± 0.427 | 2.979 ± 0.371 | 1.744 ± 0.253 | 4.758 ± 0.417 | 6.03 ± 0.53 | 4.831 ± 0.647 | 5.848 ± 0.454 | 2.543 ± 0.256 | 2.833 ± 0.272 | 0.0 ± 0.0
His | 1.526 ± 0.251 | 0.363 ± 0.108 | 1.526 ± 0.263 | 1.453 ± 0.237 | 0.472 ± 0.126 | 2.034 ± 0.324 | 0.69 ± 0.147 | 1.235 ± 0.187 | 0.908 ± 0.18 | 1.925 ± 0.265 | 0.654 ± 0.141 | 0.509 ± 0.144 | 1.235 ± 0.185 | 0.509 ± 0.138 | 1.889 ± 0.289 | 1.453 ± 0.249 | 1.199 ± 0.229 | 1.707 ± 0.257 | 0.654 ± 0.159 | 1.235 ± 0.253 | 0.0 ± 0.0
Ile | 5.884 ± 0.422 | 0.763 ± 0.201 | 3.632 ± 0.307 | 4.105 ± 0.42 | 0.981 ± 0.183 | 4.323 ± 0.34 | 1.162 ± 0.185 | 1.961 ± 0.312 | 1.562 ± 0.238 | 3.523 ± 0.377 | 0.981 ± 0.208 | 1.853 ± 0.242 | 2.652 ± 0.243 | 1.417 ± 0.226 | 2.906 ± 0.354 | 2.652 ± 0.312 | 3.269 ± 0.4 | 2.906 ± 0.29 | 0.981 ± 0.164 | 1.271 ± 0.254 | 0.0 ± 0.0
Lys | 4.795 ± 0.55 | 0.763 ± 0.205 | 2.179 ± 0.309 | 2.034 ± 0.318 | 1.271 ± 0.207 | 3.197 ± 0.271 | 1.017 ± 0.206 | 1.78 ± 0.237 | 1.526 ± 0.313 | 3.378 ± 0.361 | 0.944 ± 0.19 | 1.925 ± 0.254 | 1.998 ± 0.278 | 1.671 ± 0.242 | 3.342 ± 0.375 | 2.216 ± 0.268 | 2.361 ± 0.382 | 2.724 ± 0.289 | 0.763 ± 0.177 | 1.017 ± 0.199 | 0.0 ± 0.0
Leu | 8.827 ± 0.702 | 1.126 ± 0.222 | 5.085 ± 0.385 | 5.231 ± 0.529 | 2.325 ± 0.299 | 6.03 ± 0.442 | 1.707 ± 0.27 | 3.596 ± 0.332 | 3.305 ± 0.404 | 6.393 ± 0.642 | 1.671 ± 0.257 | 2.761 ± 0.332 | 4.468 ± 0.486 | 2.797 ± 0.426 | 6.502 ± 0.569 | 5.884 ± 0.456 | 4.323 ± 0.341 | 4.323 ± 0.29 | 1.09 ± 0.195 | 2.434 ± 0.255 | 0.0 ± 0.0
Met | 2.506 ± 0.269 | 0.218 ± 0.097 | 1.308 ± 0.197 | 1.453 ± 0.244 | 1.271 ± 0.26 | 1.598 ± 0.243 | 0.654 ± 0.162 | 1.308 ± 0.244 | 0.872 ± 0.185 | 1.453 ± 0.221 | 0.4 ± 0.104 | 1.017 ± 0.184 | 1.199 ± 0.204 | 0.763 ± 0.194 | 1.707 ± 0.252 | 2.325 ± 0.311 | 1.853 ± 0.234 | 1.308 ± 0.211 | 0.509 ± 0.118 | 0.618 ± 0.141 | 0.0 ± 0.0
Asn | 3.414 ± 0.454 | 0.654 ± 0.167 | 2.034 ± 0.258 | 1.744 ± 0.229 | 0.944 ± 0.22 | 3.85 ± 0.361 | 0.835 ± 0.186 | 1.925 ± 0.319 | 0.872 ± 0.152 | 2.688 ± 0.283 | 0.981 ± 0.218 | 1.453 ± 0.27 | 3.414 ± 0.335 | 0.908 ± 0.166 | 2.506 ± 0.311 | 2.252 ± 0.265 | 1.562 ± 0.258 | 2.107 ± 0.295 | 0.799 ± 0.159 | 0.799 ± 0.151 | 0.0 ± 0.0
Pro | 4.649 ± 0.435 | 0.581 ± 0.157 | 3.669 ± 0.418 | 3.741 ± 0.379 | 1.925 ± 0.303 | 5.267 ± 0.458 | 1.489 ± 0.265 | 2.434 ± 0.389 | 1.78 ± 0.321 | 3.705 ± 0.401 | 0.726 ± 0.166 | 2.034 ± 0.271 | 2.543 ± 0.345 | 1.671 ± 0.27 | 2.579 ± 0.306 | 3.378 ± 0.362 | 2.942 ± 0.291 | 3.669 ± 0.385 | 1.199 ± 0.193 | 1.053 ± 0.177 | 0.0 ± 0.0
Gln | 3.523 ± 0.45 | 0.291 ± 0.094 | 1.199 ± 0.189 | 1.998 ± 0.308 | 1.199 ± 0.198 | 2.107 ± 0.364 | 0.472 ± 0.121 | 1.635 ± 0.27 | 1.635 ± 0.243 | 2.652 ± 0.304 | 0.509 ± 0.143 | 1.053 ± 0.203 | 1.78 ± 0.266 | 1.09 ± 0.269 | 2.615 ± 0.348 | 1.671 ± 0.245 | 1.126 ± 0.176 | 2.034 ± 0.224 | 0.618 ± 0.159 | 0.726 ± 0.157 | 0.0 ± 0.0
Arg | 6.865 ± 0.533 | 1.308 ± 0.247 | 4.286 ± 0.426 | 5.158 ± 0.572 | 1.744 ± 0.254 | 4.722 ± 0.411 | 1.635 ± 0.223 | 3.56 ± 0.338 | 3.523 ± 0.402 | 5.884 ± 0.56 | 2.216 ± 0.285 | 2.47 ± 0.273 | 2.652 ± 0.33 | 2.179 ± 0.347 | 4.541 ± 0.485 | 3.85 ± 0.36 | 3.305 ± 0.408 | 4.541 ± 0.469 | 2.143 ± 0.28 | 2.397 ± 0.352 | 0.0 ± 0.0
Ser | 5.993 ± 0.576 | 1.126 ± 0.232 | 4.686 ± 0.434 | 3.56 ± 0.267 | 2.397 ± 0.295 | 5.812 ± 0.607 | 1.308 ± 0.246 | 3.051 ± 0.266 | 2.652 ± 0.343 | 4.867 ± 0.448 | 2.107 ± 0.222 | 1.889 ± 0.341 | 3.342 ± 0.364 | 1.744 ± 0.233 | 4.032 ± 0.471 | 4.105 ± 0.549 | 3.197 ± 0.315 | 4.686 ± 0.45 | 1.162 ± 0.164 | 1.671 ± 0.243 | 0.0 ± 0.0
Thr | 4.395 ± 0.391 | 0.69 ± 0.175 | 3.523 ± 0.338 | 2.833 ± 0.343 | 1.925 ± 0.255 | 5.194 ± 0.487 | 1.017 ± 0.187 | 3.523 ± 0.366 | 2.252 ± 0.3 | 4.613 ± 0.445 | 0.981 ± 0.181 | 1.853 ± 0.291 | 4.286 ± 0.398 | 1.38 ± 0.226 | 2.942 ± 0.274 | 2.797 ± 0.353 | 2.906 ± 0.385 | 4.541 ± 0.427 | 1.199 ± 0.245 | 1.489 ± 0.243 | 0.0 ± 0.0
Val | 5.558 ± 0.458 | 1.09 ± 0.218 | 5.993 ± 0.468 | 5.34 ± 0.454 | 1.489 ± 0.239 | 5.739 ± 0.466 | 1.199 ± 0.208 | 3.197 ± 0.308 | 2.543 ± 0.32 | 4.867 ± 0.329 | 1.562 ± 0.233 | 2.252 ± 0.248 | 3.088 ± 0.393 | 1.78 ± 0.278 | 4.686 ± 0.45 | 4.577 ± 0.357 | 4.177 ± 0.346 | 4.649 ± 0.502 | 1.853 ± 0.273 | 2.107 ± 0.28 | 0.0 ± 0.0
Trp | 2.288 ± 0.333 | 0.472 ± 0.139 | 1.489 ± 0.234 | 1.199 ± 0.209 | 0.581 ± 0.141 | 1.344 ± 0.197 | 0.654 ± 0.145 | 0.944 ± 0.2 | 0.872 ± 0.187 | 2.252 ± 0.283 | 0.509 ± 0.128 | 0.799 ± 0.227 | 0.944 ± 0.163 | 0.835 ± 0.178 | 1.744 ± 0.244 | 1.816 ± 0.263 | 1.271 ± 0.206 | 1.961 ± 0.317 | 0.545 ± 0.139 | 0.581 ± 0.152 | 0.0 ± 0.0
Tyr | 2.833 ± 0.333 | 0.436 ± 0.132 | 1.78 ± 0.281 | 1.889 ± 0.28 | 0.509 ± 0.131 | 2.506 ± 0.282 | 0.799 ± 0.163 | 1.562 ± 0.272 | 1.199 ± 0.236 | 2.579 ± 0.303 | 0.218 ± 0.08 | 0.908 ± 0.171 | 1.271 ± 0.209 | 0.944 ± 0.182 | 3.233 ± 0.344 | 1.889 ± 0.312 | 1.744 ± 0.299 | 2.543 ± 0.262 | 0.799 ± 0.162 | 0.69 ± 0.172 | 0.0 ± 0.0
Xaa | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0
Statistics based on 179 proteins (27531 amino acids)

Note: The error was estimated by bootstrapping (100 resamples) at the protein level.
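A protein-level bootstrap of this kind can be sketched as follows: whole proteins are resampled with replacement, the dipeptide frequency is recomputed for each resample, and the spread of those estimates is reported. The function names, the use of the sample standard deviation as the error, the fixed seed, and one-letter sequence input are assumptions for illustration; only the resampling unit (proteins) and the number of resamples (100) come from the note above.

```python
import random
import statistics


def pair_per_mille(proteins, pair):
    """Frequency (per mille) of one dipeptide among all overlapping pairs."""
    hits = 0
    total = 0
    for seq in proteins:
        for first, second in zip(seq, seq[1:]):
            total += 1
            if first + second == pair:
                hits += 1
    return 1000.0 * hits / total if total else 0.0


def bootstrap_error(proteins, pair, n_resamples=100, seed=0):
    """Estimate the error of a dipeptide frequency by resampling whole proteins."""
    if not proteins:
        raise ValueError("need at least one protein sequence")
    rng = random.Random(seed)  # fixed seed only to make the sketch reproducible
    estimates = []
    for _ in range(n_resamples):
        # Draw as many proteins as in the original set, with replacement.
        resample = rng.choices(proteins, k=len(proteins))
        estimates.append(pair_per_mille(resample, pair))
    # Assumption: the reported error is the standard deviation of the estimates.
    return statistics.stdev(estimates)
```

Resampling whole proteins rather than individual dipeptides keeps each sequence intact, so the estimate reflects protein-to-protein variation in composition.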

The dipeptide statistics above (along with other statistics for this proteome) can be downloaded from this CSV file
See this proteome in: uniprot_link
Proteome-pI is available under a Creative Commons Attribution-NoDerivs license; for more details see here

Reference: Kozlowski LP. Proteome-pI 2.0: Proteome Isoelectric Point Database Update. Nucleic Acids Res. 2021, doi: 10.1093/nar/gkab944
Contact: Lukasz P. Kozlowski