Amino acid dipeptide frequency for Mycobacterium phage Spartacus

Apart from single amino acid frequencies, one can also calculate the so-called amino acid dipeptide frequency, i.e. how often each ordered pair of adjacent amino acids occurs.

There are 400 possible dipeptides (441 if Xaa is counted as an additional 21st amino acid representing all non-standard or unknown amino acids). Thus, if dipeptides occurred randomly in proteins, with all amino acids equally likely, each of the 400 dipeptides would be present at a frequency of 0.25% (2.5‰). As this is not the case in nature, for better visibility all dipeptides that are more frequent than expected are marked in red in the table, and those that are underrepresented are marked in blue.
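As a rough illustration of how such frequencies can be obtained, the Python sketch below counts overlapping adjacent residue pairs over a set of protein sequences and reports each dipeptide in per mille. It is not the Proteome-pI implementation: the function name dipeptide_frequencies, the use of one-letter codes (with X standing for Xaa), the pooling of pairs across all proteins, and the toy sequences are all assumptions made for the example.

```python
from collections import Counter

# The 20 standard amino acids in one-letter code (assumed input alphabet).
STANDARD = set("ACDEFGHIKLMNPQRSTVWY")


def dipeptide_frequencies(proteins):
    """Return {dipeptide: frequency in per mille} over all overlapping pairs.

    Hypothetical helper: pairs are pooled across all proteins, and pairs
    never span two different proteins.
    """
    counts = Counter()
    for seq in proteins:
        # Map anything outside the 20 standard residues to X (Xaa).
        seq = "".join(aa if aa in STANDARD else "X" for aa in seq.upper())
        # Overlapping pairs: positions (0, 1), (1, 2), ...
        counts.update(seq[i:i + 2] for i in range(len(seq) - 1))
    total = sum(counts.values())
    if total == 0:
        return {}
    return {pair: 1000.0 * n / total for pair, n in counts.items()}


# Toy input, not taken from the Spartacus proteome.
toy = ["MAAGL", "AAGG"]
freqs = dipeptide_frequencies(toy)
for pair in sorted(freqs):
    print(pair, round(freqs[pair], 3))
```

With one-letter codes, the pair "AA" corresponds to AlaAla in the table, "AG" to AlaGly, and so on.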

All values are given in per mille (‰) and therefore need to be multiplied by 10⁻³ to obtain fractions.
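The unit conversion and the comparison with the random expectation can be sketched in a few lines of Python. The helper names are hypothetical, and the assumption that over- and underrepresentation are judged against the uniform expectation of 1/400 (0.25%, i.e. 2.5‰) follows the description above rather than any documented rule of the database.

```python
# Uniform expectation for 400 dipeptides, expressed in per mille (assumption).
EXPECTED_PER_MILLE = 1000.0 / 400  # 2.5


def per_mille_to_fraction(value):
    """Convert a per mille value to a plain fraction (multiply by 10^-3)."""
    return value * 1e-3


def representation(value_per_mille, expected=EXPECTED_PER_MILLE):
    """Label a dipeptide relative to the uniform expectation."""
    if value_per_mille > expected:
        return "over"
    if value_per_mille < expected:
        return "under"
    return "as expected"


# Two values from the table: AlaAla (14.68) and CysCys (0.104).
for name, value in [("AlaAla", 14.68), ("CysCys", 0.104)]:
    print(name, per_mille_to_fraction(value), representation(value))
```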

For more information, see the sequence space article on Wikipedia.

Ala Cys Asp Glu Phe Gly His Ile Lys Leu Met Asn Pro Gln Arg Ser Thr Val Trp Tyr Xaa
Ala
AlaAla: 14.68 ± 1.905
AlaCys: 0.934 ± 0.245
AlaAsp: 7.107 ± 0.659
AlaGlu: 7.107 ± 0.725
AlaPhe: 2.957 ± 0.394
AlaGly: 10.011 ± 1.09
AlaHis: 2.594 ± 0.381
AlaIle: 4.357 ± 0.441
AlaLys: 4.15 ± 0.409
AlaLeu: 7.885 ± 0.775
AlaMet: 2.646 ± 0.446
AlaAsn: 3.164 ± 0.472
AlaPro: 4.876 ± 0.588
AlaGln: 3.527 ± 0.403
AlaArg: 7.885 ± 0.958
AlaSer: 4.565 ± 0.587
AlaThr: 6.225 ± 0.605
AlaVal: 7.003 ± 0.613
AlaTrp: 2.438 ± 0.456
AlaTyr: 2.542 ± 0.334
AlaXaa: 0.0 ± 0.0
Cys
CysAla: 1.141 ± 0.252
CysCys: 0.104 ± 0.07
CysAsp: 1.141 ± 0.298
CysGlu: 0.934 ± 0.205
CysPhe: 0.311 ± 0.127
CysGly: 1.452 ± 0.309
CysHis: 0.207 ± 0.094
CysIle: 0.207 ± 0.103
CysLys: 0.622 ± 0.197
CysLeu: 0.571 ± 0.213
CysMet: 0.207 ± 0.101
CysAsn: 0.467 ± 0.158
CysPro: 0.986 ± 0.22
CysGln: 0.259 ± 0.117
CysArg: 1.037 ± 0.332
CysSer: 0.726 ± 0.247
CysThr: 0.83 ± 0.209
CysVal: 0.934 ± 0.217
CysTrp: 0.207 ± 0.095
CysTyr: 0.156 ± 0.093
CysXaa: 0.0 ± 0.0
Asp
AspAla: 7.47 ± 0.558
AspCys: 0.83 ± 0.22
AspAsp: 4.461 ± 0.473
AspGlu: 2.957 ± 0.4
AspPhe: 1.764 ± 0.227
AspGly: 6.017 ± 0.602
AspHis: 1.452 ± 0.227
AspIle: 2.49 ± 0.291
AspLys: 1.556 ± 0.265
AspLeu: 6.069 ± 0.552
AspMet: 1.193 ± 0.249
AspAsn: 1.919 ± 0.368
AspPro: 4.824 ± 0.583
AspGln: 2.334 ± 0.283
AspArg: 5.187 ± 0.562
AspSer: 3.32 ± 0.508
AspThr: 3.735 ± 0.475
AspVal: 4.617 ± 0.571
AspTrp: 1.452 ± 0.288
AspTyr: 2.179 ± 0.356
AspXaa: 0.0 ± 0.0
Glu
GluAla: 5.758 ± 0.647
GluCys: 0.986 ± 0.265
GluAsp: 3.268 ± 0.336
GluGlu: 2.594 ± 0.499
GluPhe: 2.49 ± 0.402
GluGly: 3.164 ± 0.378
GluHis: 1.193 ± 0.318
GluIle: 2.646 ± 0.32
GluLys: 2.231 ± 0.296
GluLeu: 5.135 ± 0.58
GluMet: 1.452 ± 0.343
GluAsn: 2.179 ± 0.312
GluPro: 2.697 ± 0.426
GluGln: 3.268 ± 0.427
GluArg: 5.084 ± 0.587
GluSer: 3.112 ± 0.532
GluThr: 3.89 ± 0.498
GluVal: 4.098 ± 0.542
GluTrp: 1.349 ± 0.259
GluTyr: 1.816 ± 0.334
GluXaa: 0.0 ± 0.0
Phe
PheAla: 3.112 ± 0.478
PheCys: 0.311 ± 0.131
PheAsp: 2.594 ± 0.486
PheGlu: 1.66 ± 0.283
PhePhe: 0.83 ± 0.249
PheGly: 3.112 ± 0.649
PheHis: 0.571 ± 0.201
PheIle: 1.401 ± 0.329
PheLys: 0.83 ± 0.214
PheLeu: 2.231 ± 0.264
PheMet: 0.986 ± 0.232
PheAsn: 1.193 ± 0.311
PhePro: 1.452 ± 0.269
PheGln: 0.934 ± 0.265
PheArg: 1.712 ± 0.264
PheSer: 1.349 ± 0.293
PheThr: 2.594 ± 0.355
PheVal: 2.282 ± 0.277
PheTrp: 0.882 ± 0.184
PheTyr: 0.934 ± 0.24
PheXaa: 0.0 ± 0.0
Gly
GlyAla: 9.13 ± 1.249
GlyCys: 1.089 ± 0.269
GlyAsp: 6.069 ± 0.547
GlyGlu: 4.098 ± 0.556
GlyPhe: 2.697 ± 0.414
GlyGly: 11.101 ± 2.499
GlyHis: 1.712 ± 0.286
GlyIle: 3.839 ± 0.54
GlyLys: 2.49 ± 0.369
GlyLeu: 6.484 ± 0.497
GlyMet: 2.231 ± 0.464
GlyAsn: 3.216 ± 0.379
GlyPro: 4.098 ± 0.545
GlyGln: 2.231 ± 0.551
GlyArg: 4.72 ± 0.56
GlySer: 6.173 ± 0.744
GlyThr: 6.017 ± 0.691
GlyVal: 5.862 ± 0.6
GlyTrp: 2.334 ± 0.389
GlyTyr: 2.334 ± 0.35
GlyXaa: 0.0 ± 0.0
His
HisAla: 1.608 ± 0.283
HisCys: 0.415 ± 0.158
HisAsp: 1.141 ± 0.205
HisGlu: 1.452 ± 0.274
HisPhe: 0.363 ± 0.133
HisGly: 1.867 ± 0.311
HisHis: 0.882 ± 0.279
HisIle: 1.556 ± 0.302
HisLys: 0.83 ± 0.235
HisLeu: 1.297 ± 0.324
HisMet: 0.622 ± 0.163
HisAsn: 0.778 ± 0.182
HisPro: 1.401 ± 0.205
HisGln: 0.83 ± 0.226
HisArg: 1.971 ± 0.373
HisSer: 0.882 ± 0.2
HisThr: 1.297 ± 0.335
HisVal: 1.556 ± 0.373
HisTrp: 0.622 ± 0.21
HisTyr: 0.83 ± 0.164
HisXaa: 0.0 ± 0.0
Ile
IleAla: 5.498 ± 0.571
IleCys: 0.622 ± 0.211
IleAsp: 3.787 ± 0.534
IleGlu: 3.89 ± 0.366
IlePhe: 0.934 ± 0.235
IleGly: 4.046 ± 0.492
IleHis: 1.349 ± 0.273
IleIle: 1.141 ± 0.232
IleLys: 1.089 ± 0.199
IleLeu: 2.023 ± 0.417
IleMet: 0.363 ± 0.132
IleAsn: 1.867 ± 0.29
IlePro: 2.957 ± 0.316
IleGln: 1.504 ± 0.215
IleArg: 2.179 ± 0.351
IleSer: 2.023 ± 0.364
IleThr: 3.839 ± 0.392
IleVal: 3.268 ± 0.466
IleTrp: 0.83 ± 0.19
IleTyr: 0.726 ± 0.199
IleXaa: 0.0 ± 0.0
Lys
LysAla: 4.15 ± 0.497
LysCys: 0.415 ± 0.13
LysAsp: 1.764 ± 0.295
LysGlu: 1.349 ± 0.249
LysPhe: 1.349 ± 0.213
LysGly: 2.438 ± 0.387
LysHis: 1.089 ± 0.238
LysIle: 1.245 ± 0.301
LysLys: 1.349 ± 0.371
LysLeu: 2.646 ± 0.422
LysMet: 0.674 ± 0.153
LysAsn: 0.726 ± 0.176
LysPro: 1.867 ± 0.294
LysGln: 1.66 ± 0.273
LysArg: 2.334 ± 0.316
LysSer: 1.764 ± 0.279
LysThr: 2.075 ± 0.277
LysVal: 2.023 ± 0.334
LysTrp: 0.778 ± 0.222
LysTyr: 1.089 ± 0.262
LysXaa: 0.0 ± 0.0
Leu
LeuAla: 8.403 ± 0.871
LeuCys: 0.622 ± 0.217
LeuAsp: 4.772 ± 0.586
LeuGlu: 3.787 ± 0.424
LeuPhe: 2.697 ± 0.431
LeuGly: 6.069 ± 0.642
LeuHis: 0.986 ± 0.247
LeuIle: 3.424 ± 0.519
LeuLys: 1.764 ± 0.283
LeuLeu: 4.617 ± 0.468
LeuMet: 1.66 ± 0.337
LeuAsn: 2.646 ± 0.332
LeuPro: 5.654 ± 0.709
LeuGln: 3.009 ± 0.449
LeuArg: 5.135 ± 0.659
LeuSer: 5.032 ± 0.499
LeuThr: 5.758 ± 0.565
LeuVal: 4.98 ± 0.531
LeuTrp: 1.297 ± 0.272
LeuTyr: 2.023 ± 0.395
LeuXaa: 0.0 ± 0.0
Met
MetAla: 2.231 ± 0.382
MetCys: 0.207 ± 0.128
MetAsp: 1.452 ± 0.303
MetGlu: 0.934 ± 0.181
MetPhe: 0.83 ± 0.188
MetGly: 1.764 ± 0.29
MetHis: 0.259 ± 0.115
MetIle: 0.934 ± 0.234
MetLys: 0.726 ± 0.194
MetLeu: 2.075 ± 0.278
MetMet: 0.622 ± 0.21
MetAsn: 0.83 ± 0.206
MetPro: 1.089 ± 0.214
MetGln: 0.363 ± 0.112
MetArg: 1.504 ± 0.259
MetSer: 3.112 ± 0.427
MetThr: 1.764 ± 0.275
MetVal: 1.245 ± 0.248
MetTrp: 0.311 ± 0.121
MetTyr: 0.311 ± 0.111
MetXaa: 0.0 ± 0.0
Asn
AsnAla: 3.631 ± 0.416
AsnCys: 0.311 ± 0.125
AsnAsp: 1.919 ± 0.319
AsnGlu: 1.66 ± 0.313
AsnPhe: 0.882 ± 0.285
AsnGly: 3.787 ± 0.439
AsnHis: 1.089 ± 0.244
AsnIle: 1.504 ± 0.383
AsnLys: 0.986 ± 0.208
AsnLeu: 2.438 ± 0.393
AsnMet: 0.467 ± 0.143
AsnAsn: 1.764 ± 0.406
AsnPro: 2.282 ± 0.373
AsnGln: 1.089 ± 0.299
AsnArg: 2.386 ± 0.348
AsnSer: 1.608 ± 0.271
AsnThr: 2.231 ± 0.349
AsnVal: 2.127 ± 0.323
AsnTrp: 0.674 ± 0.184
AsnTyr: 0.934 ± 0.172
AsnXaa: 0.0 ± 0.0
Pro
ProAla: 5.239 ± 0.561
ProCys: 0.674 ± 0.147
ProAsp: 4.357 ± 0.436
ProGlu: 4.617 ± 0.462
ProPhe: 1.919 ± 0.329
ProGly: 5.913 ± 0.685
ProHis: 1.712 ± 0.344
ProIle: 1.66 ± 0.268
ProLys: 1.556 ± 0.314
ProLeu: 4.409 ± 0.471
ProMet: 1.452 ± 0.299
ProAsn: 1.867 ± 0.278
ProPro: 3.942 ± 0.564
ProGln: 2.127 ± 0.337
ProArg: 3.372 ± 0.524
ProSer: 3.164 ± 0.429
ProThr: 3.268 ± 0.509
ProVal: 4.72 ± 0.519
ProTrp: 1.089 ± 0.229
ProTyr: 1.297 ± 0.246
ProXaa: 0.0 ± 0.0
Gln
GlnAla: 4.409 ± 0.54
GlnCys: 0.259 ± 0.131
GlnAsp: 1.245 ± 0.222
GlnGlu: 1.452 ± 0.261
GlnPhe: 1.141 ± 0.215
GlnGly: 2.594 ± 0.463
GlnHis: 0.519 ± 0.147
GlnIle: 1.919 ± 0.331
GlnLys: 1.452 ± 0.239
GlnLeu: 2.957 ± 0.45
GlnMet: 1.037 ± 0.198
GlnAsn: 0.83 ± 0.251
GlnPro: 2.231 ± 0.295
GlnGln: 1.037 ± 0.246
GlnArg: 2.386 ± 0.355
GlnSer: 2.594 ± 0.361
GlnThr: 1.919 ± 0.347
GlnVal: 2.646 ± 0.371
GlnTrp: 0.519 ± 0.158
GlnTyr: 0.83 ± 0.226
GlnXaa: 0.0 ± 0.0
Arg
ArgAla: 6.38 ± 0.635
ArgCys: 1.401 ± 0.354
ArgAsp: 4.15 ± 0.513
ArgGlu: 5.135 ± 0.64
ArgPhe: 2.282 ± 0.367
ArgGly: 3.994 ± 0.46
ArgHis: 1.141 ± 0.282
ArgIle: 4.461 ± 0.586
ArgLys: 2.542 ± 0.403
ArgLeu: 5.135 ± 0.548
ArgMet: 2.023 ± 0.312
ArgAsn: 2.334 ± 0.353
ArgPro: 3.527 ± 0.411
ArgGln: 2.075 ± 0.4
ArgArg: 5.706 ± 0.68
ArgSer: 4.046 ± 0.701
ArgThr: 3.164 ± 0.472
ArgVal: 5.706 ± 0.622
ArgTrp: 1.971 ± 0.333
ArgTyr: 2.075 ± 0.315
ArgXaa: 0.0 ± 0.0
Ser
SerAla: 6.121 ± 1.559
SerCys: 0.571 ± 0.18
SerAsp: 3.683 ± 0.558
SerGlu: 3.475 ± 0.445
SerPhe: 2.127 ± 0.473
SerGly: 6.017 ± 0.717
SerHis: 1.401 ± 0.261
SerIle: 2.957 ± 0.447
SerLys: 2.49 ± 0.379
SerLeu: 4.202 ± 0.49
SerMet: 1.401 ± 0.241
SerAsn: 2.075 ± 0.335
SerPro: 3.32 ± 0.327
SerGln: 1.504 ± 0.259
SerArg: 3.32 ± 0.359
SerSer: 3.631 ± 0.641
SerThr: 3.268 ± 0.405
SerVal: 4.254 ± 0.443
SerTrp: 1.297 ± 0.243
SerTyr: 1.349 ± 0.212
SerXaa: 0.0 ± 0.0
Thr
ThrAla: 6.173 ± 0.506
ThrCys: 0.778 ± 0.209
ThrAsp: 4.202 ± 0.54
ThrGlu: 3.527 ± 0.447
ThrPhe: 1.764 ± 0.335
ThrGly: 6.225 ± 0.612
ThrHis: 1.66 ± 0.303
ThrIle: 3.164 ± 0.438
ThrLys: 2.334 ± 0.345
ThrLeu: 4.565 ± 0.529
ThrMet: 1.245 ± 0.265
ThrAsn: 2.075 ± 0.373
ThrPro: 4.565 ± 0.483
ThrGln: 1.816 ± 0.337
ThrArg: 4.15 ± 0.453
ThrSer: 3.787 ± 0.456
ThrThr: 4.772 ± 0.571
ThrVal: 5.913 ± 0.6
ThrTrp: 1.245 ± 0.273
ThrTyr: 1.712 ± 0.293
ThrXaa: 0.0 ± 0.0
Val
ValAla: 6.795 ± 0.585
ValCys: 1.401 ± 0.258
ValAsp: 5.706 ± 0.513
ValGlu: 5.032 ± 0.657
ValPhe: 1.971 ± 0.333
ValGly: 5.343 ± 0.625
ValHis: 1.452 ± 0.285
ValIle: 2.957 ± 0.434
ValLys: 1.919 ± 0.292
ValLeu: 5.135 ± 0.598
ValMet: 1.193 ± 0.265
ValAsn: 2.386 ± 0.346
ValPro: 4.565 ± 0.427
ValGln: 2.801 ± 0.289
ValArg: 4.876 ± 0.59
ValSer: 4.824 ± 0.502
ValThr: 5.706 ± 0.468
ValVal: 6.64 ± 0.754
ValTrp: 1.867 ± 0.414
ValTyr: 1.504 ± 0.287
ValXaa: 0.0 ± 0.0
Trp
TrpAla: 2.282 ± 0.324
TrpCys: 0.311 ± 0.111
TrpAsp: 1.245 ± 0.255
TrpGlu: 0.986 ± 0.291
TrpPhe: 0.882 ± 0.218
TrpGly: 0.934 ± 0.235
TrpHis: 0.363 ± 0.132
TrpIle: 1.037 ± 0.241
TrpLys: 1.089 ± 0.212
TrpLeu: 2.127 ± 0.336
TrpMet: 0.778 ± 0.234
TrpAsn: 0.571 ± 0.196
TrpPro: 0.83 ± 0.22
TrpGln: 0.778 ± 0.217
TrpArg: 2.179 ± 0.42
TrpSer: 1.712 ± 0.465
TrpThr: 1.608 ± 0.284
TrpVal: 1.66 ± 0.395
TrpTrp: 1.037 ± 0.216
TrpTyr: 0.311 ± 0.123
TrpXaa: 0.0 ± 0.0
Tyr
TyrAla: 2.438 ± 0.359
TyrCys: 0.259 ± 0.114
TyrAsp: 1.764 ± 0.363
TyrGlu: 1.816 ± 0.288
TyrPhe: 0.778 ± 0.209
TyrGly: 2.075 ± 0.393
TyrHis: 0.571 ± 0.186
TyrIle: 1.037 ± 0.191
TyrLys: 0.778 ± 0.189
TyrLeu: 2.334 ± 0.341
TyrMet: 0.259 ± 0.124
TyrAsn: 0.882 ± 0.227
TyrPro: 1.193 ± 0.235
TyrGln: 0.83 ± 0.211
TyrArg: 2.127 ± 0.384
TyrSer: 1.037 ± 0.236
TyrThr: 1.712 ± 0.335
TyrVal: 2.542 ± 0.338
TyrTrp: 0.519 ± 0.171
TyrTyr: 0.674 ± 0.157
TyrXaa: 0.0 ± 0.0
Xaa
XaaAla: 0.0 ± 0.0
XaaCys: 0.0 ± 0.0
XaaAsp: 0.0 ± 0.0
XaaGlu: 0.0 ± 0.0
XaaPhe: 0.0 ± 0.0
XaaGly: 0.0 ± 0.0
XaaHis: 0.0 ± 0.0
XaaIle: 0.0 ± 0.0
XaaLys: 0.0 ± 0.0
XaaLeu: 0.0 ± 0.0
XaaMet: 0.0 ± 0.0
XaaAsn: 0.0 ± 0.0
XaaPro: 0.0 ± 0.0
XaaGln: 0.0 ± 0.0
XaaArg: 0.0 ± 0.0
XaaSer: 0.0 ± 0.0
XaaThr: 0.0 ± 0.0
XaaVal: 0.0 ± 0.0
XaaTrp: 0.0 ± 0.0
XaaTyr: 0.0 ± 0.0
XaaXaa: 0.0 ± 0.0
Statistics based on 110 proteins (19279 amino acids)

Note: The error was estimated by bootstrapping (×100) at the protein level.
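One way to read the note above is that whole proteins are resampled with replacement, the frequency is recomputed for each resample, and the spread of those estimates is reported as the error. The Python sketch below follows that reading. The function names, the use of the standard deviation as the error measure, the fixed seed, and the toy sequences are assumptions for illustration, not details confirmed by Proteome-pI.

```python
import random
import statistics
from collections import Counter


def pair_per_mille(proteins, pair):
    """Frequency of one dipeptide (one-letter codes) in per mille."""
    counts = Counter()
    for seq in proteins:
        # Count overlapping adjacent pairs within each protein.
        counts.update(seq[i:i + 2] for i in range(len(seq) - 1))
    total = sum(counts.values())
    return 1000.0 * counts[pair] / total if total else 0.0


def bootstrap_error(proteins, pair, n_resamples=100, seed=0):
    """Standard deviation of the frequency over protein-level resamples."""
    rng = random.Random(seed)  # fixed seed only to make the sketch repeatable
    estimates = []
    for _ in range(n_resamples):
        # Draw as many proteins as in the original set, with replacement.
        sample = rng.choices(proteins, k=len(proteins))
        estimates.append(pair_per_mille(sample, pair))
    return statistics.stdev(estimates)


# Toy input, not taken from the Spartacus proteome.
toy = ["MAAGL", "AAGG", "GGLL"]
print(pair_per_mille(toy, "AA"), bootstrap_error(toy, "AA"))
```

Resampling whole proteins rather than individual dipeptides keeps the pairs of each protein together, so the estimate reflects variation between proteins.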

The dipeptide statistics above (among other statistics for this proteome) can be downloaded from this CSV file
See this proteome in: uniprot_link
Proteome-pI is available under a Creative Commons Attribution-NoDerivs license; for more details see here

Reference: Kozlowski LP. Proteome-pI 2.0: Proteome Isoelectric Point Database Update. Nucleic Acids Res. 2021, doi: 10.1093/nar/gkab944
Contact: Lukasz P. Kozlowski