Amino acid dipeptide frequency for Mycobacterium phage Hegedechwinu

Apart from single amino acid frequencies, one can also calculate the so-called amino acid dipeptide frequency, that is, how often each ordered pair of adjacent amino acids occurs.
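A minimal sketch of how such a frequency could be computed is shown here. It is not the code used by Proteome-pI: the function name dipeptide_permille, the use of one-letter residue codes, the counting of overlapping pairs within each protein, and the normalisation by the total number of pairs are all assumptions made for illustration.

```python
from collections import Counter

# The 20 standard amino acids in one-letter code (the table uses three-letter codes).
STANDARD = set("ACDEFGHIKLMNPQRSTVWY")


def dipeptide_permille(proteins):
    """Return dipeptide frequencies in per mille, pooled over all proteins.

    Hypothetical helper: pairs are overlapping and never span two proteins.
    """
    counts = Counter()
    for seq in proteins:
        # Map anything outside the 20 standard residues to X (Xaa).
        seq = "".join(c if c in STANDARD else "X" for c in seq.upper())
        # Count every adjacent pair, e.g. "MAAG" -> MA, AA, AG.
        counts.update(a + b for a, b in zip(seq, seq[1:]))
    total = sum(counts.values())
    if total == 0:
        return {}
    return {pair: 1000.0 * n / total for pair, n in counts.items()}


# Toy input, not real phage proteins: "B" is non-standard and becomes X.
freqs = dipeptide_permille(["MAAG", "AAB"])
print(freqs["AA"])  # 400.0 (2 of the 5 pairs)
```

Pooling the counts over the whole proteome, rather than averaging per-protein frequencies, weights long proteins more heavily; which of the two the database uses is not stated on this page.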

There are 400 possible dipeptides (441 if Xaa is counted as an additional 21st amino acid representing all non-standard or unknown amino acids). Thus, if dipeptides occurred randomly in proteins, each of the 400 dipeptides would be present with a frequency of 0.25% (2.5‰). As this is not the case in nature, for better visibility all dipeptides that occur more often than expected are marked in red, and those that are underrepresented are marked in blue in the table.

All values are given in per mille (‰), so they need to be multiplied by 10^-3 to obtain fractions.
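The arithmetic behind the uniform expectation and the per mille units can be sketched as follows. The function names expected_permille and classify are hypothetical, and the strict comparison against the uniform expectation is an assumption: the page does not state the exact rule used to colour the table.

```python
def expected_permille(n_residues=20):
    """Uniform expectation for one dipeptide, in per mille."""
    # 20 residues give 20 * 20 = 400 dipeptides; 21 (with Xaa) give 441.
    return 1000.0 / (n_residues ** 2)


def classify(value_permille, n_residues=20):
    """Label a table value relative to the uniform expectation (assumed rule)."""
    expected = expected_permille(n_residues)
    if value_permille > expected:
        return "over"
    if value_permille < expected:
        return "under"
    return "expected"


def permille_to_fraction(value_permille):
    """Convert a per mille value to a plain fraction (multiply by 10^-3)."""
    return value_permille * 1e-3


print(expected_permille(20))  # 2.5, i.e. 0.25%
print(classify(13.89))        # over  (AlaAla in the table)
print(classify(0.054))        # under (CysCys in the table)
```

With 21 residues the expectation drops to 1000/441, roughly 2.27‰, so the choice between 400 and 441 possibilities slightly shifts which values count as overrepresented.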

For more information, see the sequence space article on Wikipedia.

Ala Cys Asp Glu Phe Gly His Ile Lys Leu Met Asn Pro Gln Arg Ser Thr Val Trp Tyr Xaa

Ala
AlaAla: 13.89 ± 1.502
AlaCys: 0.965 ± 0.246
AlaAsp: 7.455 ± 0.7
AlaGlu: 7.776 ± 0.821
AlaPhe: 2.521 ± 0.32
AlaGly: 9.761 ± 0.99
AlaHis: 2.413 ± 0.439
AlaIle: 4.344 ± 0.474
AlaLys: 4.29 ± 0.529
AlaLeu: 8.098 ± 0.67
AlaMet: 2.789 ± 0.418
AlaAsn: 2.735 ± 0.442
AlaPro: 5.309 ± 0.451
AlaGln: 3.701 ± 0.435
AlaArg: 7.508 ± 0.758
AlaSer: 5.953 ± 0.635
AlaThr: 5.899 ± 0.58
AlaVal: 7.133 ± 0.59
AlaTrp: 2.789 ± 0.382
AlaTyr: 2.467 ± 0.321
AlaXaa: 0.0 ± 0.0

Cys
CysAla: 0.858 ± 0.232
CysCys: 0.054 ± 0.06
CysAsp: 1.019 ± 0.242
CysGlu: 0.697 ± 0.186
CysPhe: 0.429 ± 0.152
CysGly: 1.287 ± 0.311
CysHis: 0.161 ± 0.097
CysIle: 0.429 ± 0.168
CysLys: 0.375 ± 0.132
CysLeu: 0.804 ± 0.235
CysMet: 0.107 ± 0.074
CysAsn: 0.536 ± 0.165
CysPro: 0.965 ± 0.227
CysGln: 0.322 ± 0.155
CysArg: 0.858 ± 0.231
CysSer: 0.536 ± 0.216
CysThr: 0.804 ± 0.231
CysVal: 0.483 ± 0.135
CysTrp: 0.268 ± 0.113
CysTyr: 0.161 ± 0.099
CysXaa: 0.0 ± 0.0

Asp
AspAla: 7.347 ± 0.785
AspCys: 0.965 ± 0.253
AspAsp: 4.505 ± 0.534
AspGlu: 3.486 ± 0.491
AspPhe: 1.502 ± 0.26
AspGly: 6.168 ± 0.496
AspHis: 1.394 ± 0.249
AspIle: 2.628 ± 0.421
AspLys: 1.716 ± 0.299
AspLeu: 6.168 ± 0.562
AspMet: 1.073 ± 0.238
AspAsn: 1.555 ± 0.3
AspPro: 4.612 ± 0.611
AspGln: 2.682 ± 0.315
AspArg: 5.47 ± 0.582
AspSer: 3.379 ± 0.484
AspThr: 3.54 ± 0.419
AspVal: 4.559 ± 0.556
AspTrp: 1.555 ± 0.296
AspTyr: 1.77 ± 0.339
AspXaa: 0.0 ± 0.0

Glu
GluAla: 6.489 ± 0.719
GluCys: 1.073 ± 0.29
GluAsp: 3.701 ± 0.371
GluGlu: 2.735 ± 0.507
GluPhe: 1.984 ± 0.278
GluGly: 3.164 ± 0.423
GluHis: 1.18 ± 0.3
GluIle: 2.628 ± 0.414
GluLys: 1.931 ± 0.36
GluLeu: 4.988 ± 0.608
GluMet: 1.716 ± 0.359
GluAsn: 1.77 ± 0.235
GluPro: 2.467 ± 0.355
GluGln: 3.218 ± 0.398
GluArg: 5.256 ± 0.577
GluSer: 2.95 ± 0.441
GluThr: 3.808 ± 0.627
GluVal: 3.754 ± 0.536
GluTrp: 1.073 ± 0.213
GluTyr: 1.716 ± 0.331
GluXaa: 0.0 ± 0.0

Phe
PheAla: 3.057 ± 0.408
PheCys: 0.268 ± 0.118
PheAsp: 2.306 ± 0.295
PheGlu: 1.502 ± 0.25
PhePhe: 0.912 ± 0.234
PheGly: 3.003 ± 0.595
PheHis: 0.483 ± 0.156
PheIle: 1.126 ± 0.351
PheLys: 1.073 ± 0.292
PheLeu: 1.984 ± 0.296
PheMet: 1.019 ± 0.277
PheAsn: 1.18 ± 0.31
PhePro: 1.984 ± 0.309
PheGln: 0.858 ± 0.275
PheArg: 1.448 ± 0.271
PheSer: 1.502 ± 0.305
PheThr: 1.931 ± 0.347
PheVal: 1.823 ± 0.306
PheTrp: 0.536 ± 0.18
PheTyr: 0.912 ± 0.272
PheXaa: 0.0 ± 0.0

Gly
GlyAla: 10.78 ± 1.029
GlyCys: 0.912 ± 0.21
GlyAsp: 6.06 ± 0.47
GlyGlu: 3.861 ± 0.515
GlyPhe: 3.057 ± 0.488
GlyGly: 10.726 ± 2.285
GlyHis: 2.038 ± 0.303
GlyIle: 4.29 ± 0.581
GlyLys: 2.628 ± 0.323
GlyLeu: 5.953 ± 0.495
GlyMet: 2.413 ± 0.392
GlyAsn: 3.111 ± 0.399
GlyPro: 3.593 ± 0.551
GlyGln: 2.145 ± 0.426
GlyArg: 4.72 ± 0.599
GlySer: 5.309 ± 0.758
GlyThr: 6.865 ± 0.565
GlyVal: 5.685 ± 0.658
GlyTrp: 2.735 ± 0.381
GlyTyr: 2.199 ± 0.358
GlyXaa: 0.0 ± 0.0

His
HisAla: 1.77 ± 0.359
HisCys: 0.536 ± 0.179
HisAsp: 1.073 ± 0.305
HisGlu: 1.234 ± 0.262
HisPhe: 0.375 ± 0.124
HisGly: 2.306 ± 0.383
HisHis: 1.019 ± 0.261
HisIle: 1.448 ± 0.311
HisLys: 0.965 ± 0.255
HisLeu: 1.18 ± 0.234
HisMet: 0.536 ± 0.179
HisAsn: 0.858 ± 0.187
HisPro: 1.555 ± 0.255
HisGln: 0.912 ± 0.247
HisArg: 1.823 ± 0.367
HisSer: 0.751 ± 0.202
HisThr: 1.502 ± 0.296
HisVal: 1.126 ± 0.276
HisTrp: 0.483 ± 0.164
HisTyr: 0.697 ± 0.186
HisXaa: 0.0 ± 0.0

Ile
IleAla: 5.47 ± 0.573
IleCys: 0.858 ± 0.272
IleAsp: 3.969 ± 0.46
IleGlu: 3.379 ± 0.317
IlePhe: 0.751 ± 0.273
IleGly: 3.379 ± 0.44
IleHis: 1.234 ± 0.264
IleIle: 1.394 ± 0.276
IleLys: 1.394 ± 0.261
IleLeu: 2.413 ± 0.407
IleMet: 0.322 ± 0.111
IleAsn: 2.038 ± 0.295
IlePro: 2.842 ± 0.348
IleGln: 1.77 ± 0.272
IleArg: 2.682 ± 0.434
IleSer: 1.931 ± 0.358
IleThr: 3.54 ± 0.404
IleVal: 3.379 ± 0.368
IleTrp: 0.965 ± 0.218
IleTyr: 0.536 ± 0.162
IleXaa: 0.0 ± 0.0

Lys
LysAla: 4.076 ± 0.493
LysCys: 0.375 ± 0.155
LysAsp: 1.555 ± 0.257
LysGlu: 1.502 ± 0.297
LysPhe: 1.234 ± 0.222
LysGly: 2.789 ± 0.292
LysHis: 1.126 ± 0.25
LysIle: 0.965 ± 0.277
LysLys: 1.341 ± 0.247
LysLeu: 2.896 ± 0.489
LysMet: 0.751 ± 0.184
LysAsn: 0.804 ± 0.224
LysPro: 3.432 ± 0.588
LysGln: 1.609 ± 0.274
LysArg: 1.931 ± 0.274
LysSer: 1.984 ± 0.327
LysThr: 2.145 ± 0.415
LysVal: 2.413 ± 0.398
LysTrp: 0.644 ± 0.187
LysTyr: 0.858 ± 0.223
LysXaa: 0.0 ± 0.0

Leu
LeuAla: 8.42 ± 0.856
LeuCys: 0.59 ± 0.171
LeuAsp: 4.88 ± 0.643
LeuGlu: 3.861 ± 0.491
LeuPhe: 2.038 ± 0.329
LeuGly: 5.685 ± 0.569
LeuHis: 1.019 ± 0.322
LeuIle: 3.808 ± 0.42
LeuLys: 2.574 ± 0.458
LeuLeu: 4.988 ± 0.613
LeuMet: 1.341 ± 0.269
LeuAsn: 2.682 ± 0.431
LeuPro: 5.095 ± 0.557
LeuGln: 2.735 ± 0.409
LeuArg: 5.256 ± 0.629
LeuSer: 5.524 ± 0.554
LeuThr: 5.309 ± 0.536
LeuVal: 5.417 ± 0.54
LeuTrp: 1.448 ± 0.299
LeuTyr: 2.092 ± 0.312
LeuXaa: 0.0 ± 0.0

Met
MetAla: 2.038 ± 0.358
MetCys: 0.215 ± 0.107
MetAsp: 1.126 ± 0.28
MetGlu: 1.019 ± 0.208
MetPhe: 0.697 ± 0.214
MetGly: 2.199 ± 0.369
MetHis: 0.161 ± 0.088
MetIle: 0.965 ± 0.219
MetLys: 0.751 ± 0.234
MetLeu: 1.394 ± 0.214
MetMet: 0.697 ± 0.234
MetAsn: 1.126 ± 0.225
MetPro: 1.234 ± 0.238
MetGln: 0.375 ± 0.128
MetArg: 1.448 ± 0.256
MetSer: 3.003 ± 0.442
MetThr: 1.984 ± 0.315
MetVal: 1.394 ± 0.344
MetTrp: 0.375 ± 0.158
MetTyr: 0.375 ± 0.117
MetXaa: 0.0 ± 0.0

Asn
AsnAla: 3.325 ± 0.363
AsnCys: 0.107 ± 0.069
AsnAsp: 1.609 ± 0.28
AsnGlu: 2.092 ± 0.361
AsnPhe: 0.912 ± 0.268
AsnGly: 3.915 ± 0.487
AsnHis: 0.858 ± 0.16
AsnIle: 1.502 ± 0.383
AsnLys: 1.126 ± 0.226
AsnLeu: 2.574 ± 0.372
AsnMet: 0.644 ± 0.172
AsnAsn: 1.663 ± 0.351
AsnPro: 2.413 ± 0.355
AsnGln: 0.965 ± 0.279
AsnArg: 2.413 ± 0.357
AsnSer: 1.555 ± 0.316
AsnThr: 2.36 ± 0.377
AsnVal: 1.716 ± 0.276
AsnTrp: 0.751 ± 0.18
AsnTyr: 0.644 ± 0.154
AsnXaa: 0.0 ± 0.0

Pro
ProAla: 5.578 ± 0.54
ProCys: 0.429 ± 0.141
ProAsp: 4.237 ± 0.512
ProGlu: 4.398 ± 0.537
ProPhe: 1.823 ± 0.236
ProGly: 6.168 ± 0.526
ProHis: 1.502 ± 0.299
ProIle: 2.145 ± 0.237
ProLys: 2.092 ± 0.472
ProLeu: 5.041 ± 0.607
ProMet: 1.341 ± 0.321
ProAsn: 1.984 ± 0.335
ProPro: 4.076 ± 0.531
ProGln: 2.467 ± 0.326
ProArg: 3.164 ± 0.457
ProSer: 3.218 ± 0.392
ProThr: 3.111 ± 0.363
ProVal: 4.988 ± 0.502
ProTrp: 0.912 ± 0.211
ProTyr: 1.931 ± 0.308
ProXaa: 0.0 ± 0.0

Gln
GlnAla: 4.612 ± 0.502
GlnCys: 0.322 ± 0.18
GlnAsp: 1.394 ± 0.294
GlnGlu: 1.716 ± 0.347
GlnPhe: 1.019 ± 0.241
GlnGly: 2.467 ± 0.432
GlnHis: 0.59 ± 0.203
GlnIle: 1.877 ± 0.331
GlnLys: 1.18 ± 0.211
GlnLeu: 3.218 ± 0.406
GlnMet: 0.751 ± 0.186
GlnAsn: 0.965 ± 0.246
GlnPro: 2.36 ± 0.414
GlnGln: 1.823 ± 0.383
GlnArg: 2.199 ± 0.287
GlnSer: 2.628 ± 0.353
GlnThr: 1.77 ± 0.348
GlnVal: 2.896 ± 0.383
GlnTrp: 1.073 ± 0.222
GlnTyr: 0.965 ± 0.248
GlnXaa: 0.0 ± 0.0

Arg
ArgAla: 6.221 ± 0.688
ArgCys: 1.287 ± 0.327
ArgAsp: 4.398 ± 0.575
ArgGlu: 5.202 ± 0.662
ArgPhe: 2.306 ± 0.361
ArgGly: 4.451 ± 0.519
ArgHis: 1.287 ± 0.291
ArgIle: 3.325 ± 0.454
ArgLys: 2.145 ± 0.282
ArgLeu: 4.88 ± 0.57
ArgMet: 2.252 ± 0.337
ArgAsn: 2.038 ± 0.356
ArgPro: 3.915 ± 0.51
ArgGln: 2.038 ± 0.341
ArgArg: 5.202 ± 0.668
ArgSer: 3.808 ± 0.416
ArgThr: 3.218 ± 0.465
ArgVal: 4.612 ± 0.529
ArgTrp: 1.77 ± 0.326
ArgTyr: 2.413 ± 0.322
ArgXaa: 0.0 ± 0.0

Ser
SerAla: 5.631 ± 0.725
SerCys: 0.375 ± 0.147
SerAsp: 4.076 ± 0.437
SerGlu: 3.701 ± 0.535
SerPhe: 2.36 ± 0.462
SerGly: 6.597 ± 0.742
SerHis: 1.287 ± 0.223
SerIle: 3.003 ± 0.383
SerLys: 2.521 ± 0.45
SerLeu: 3.701 ± 0.483
SerMet: 1.341 ± 0.261
SerAsn: 2.092 ± 0.335
SerPro: 3.486 ± 0.402
SerGln: 1.663 ± 0.29
SerArg: 3.593 ± 0.384
SerSer: 3.701 ± 0.557
SerThr: 2.682 ± 0.399
SerVal: 4.505 ± 0.461
SerTrp: 1.555 ± 0.265
SerTyr: 1.716 ± 0.257
SerXaa: 0.0 ± 0.0

Thr
ThrAla: 6.114 ± 0.607
ThrCys: 0.375 ± 0.14
ThrAsp: 3.808 ± 0.542
ThrGlu: 3.003 ± 0.395
ThrPhe: 1.663 ± 0.323
ThrGly: 6.382 ± 0.598
ThrHis: 1.77 ± 0.338
ThrIle: 3.164 ± 0.42
ThrLys: 2.145 ± 0.352
ThrLeu: 4.559 ± 0.486
ThrMet: 0.965 ± 0.247
ThrAsn: 2.306 ± 0.383
ThrPro: 4.451 ± 0.483
ThrGln: 1.931 ± 0.327
ThrArg: 3.486 ± 0.375
ThrSer: 3.432 ± 0.359
ThrThr: 5.202 ± 0.764
ThrVal: 5.631 ± 0.625
ThrTrp: 1.287 ± 0.267
ThrTyr: 1.931 ± 0.29
ThrXaa: 0.0 ± 0.0

Val
ValAla: 7.24 ± 0.531
ValCys: 1.019 ± 0.269
ValAsp: 5.149 ± 0.619
ValGlu: 4.183 ± 0.568
ValPhe: 1.931 ± 0.331
ValGly: 5.738 ± 0.615
ValHis: 1.448 ± 0.285
ValIle: 3.271 ± 0.433
ValLys: 2.413 ± 0.39
ValLeu: 5.738 ± 0.594
ValMet: 1.287 ± 0.224
ValAsn: 2.199 ± 0.326
ValPro: 4.076 ± 0.384
ValGln: 2.628 ± 0.335
ValArg: 4.666 ± 0.61
ValSer: 5.47 ± 0.598
ValThr: 4.666 ± 0.458
ValVal: 5.846 ± 0.587
ValTrp: 1.877 ± 0.342
ValTyr: 1.341 ± 0.292
ValXaa: 0.0 ± 0.0

Trp
TrpAla: 2.36 ± 0.282
TrpCys: 0.215 ± 0.109
TrpAsp: 1.716 ± 0.31
TrpGlu: 0.804 ± 0.194
TrpPhe: 0.697 ± 0.189
TrpGly: 0.912 ± 0.208
TrpHis: 0.59 ± 0.171
TrpIle: 0.858 ± 0.165
TrpLys: 1.126 ± 0.212
TrpLeu: 2.038 ± 0.444
TrpMet: 0.965 ± 0.262
TrpAsn: 0.644 ± 0.181
TrpPro: 1.234 ± 0.268
TrpGln: 1.019 ± 0.277
TrpArg: 1.877 ± 0.321
TrpSer: 1.823 ± 0.309
TrpThr: 1.341 ± 0.312
TrpVal: 2.038 ± 0.398
TrpTrp: 0.804 ± 0.166
TrpTyr: 0.483 ± 0.174
TrpXaa: 0.0 ± 0.0

Tyr
TyrAla: 2.574 ± 0.368
TyrCys: 0.215 ± 0.121
TyrAsp: 2.038 ± 0.328
TyrGlu: 1.448 ± 0.257
TyrPhe: 0.697 ± 0.206
TyrGly: 1.984 ± 0.355
TyrHis: 0.59 ± 0.16
TyrIle: 1.073 ± 0.217
TyrLys: 0.644 ± 0.175
TyrLeu: 2.145 ± 0.332
TyrMet: 0.215 ± 0.121
TyrAsn: 0.858 ± 0.208
TyrPro: 1.502 ± 0.214
TyrGln: 0.965 ± 0.231
TyrArg: 1.77 ± 0.327
TyrSer: 1.126 ± 0.195
TyrThr: 1.931 ± 0.335
TyrVal: 2.789 ± 0.357
TyrTrp: 0.644 ± 0.185
TyrTyr: 0.751 ± 0.17
TyrXaa: 0.0 ± 0.0

Xaa
XaaAla: 0.0 ± 0.0
XaaCys: 0.0 ± 0.0
XaaAsp: 0.0 ± 0.0
XaaGlu: 0.0 ± 0.0
XaaPhe: 0.0 ± 0.0
XaaGly: 0.0 ± 0.0
XaaHis: 0.0 ± 0.0
XaaIle: 0.0 ± 0.0
XaaLys: 0.0 ± 0.0
XaaLeu: 0.0 ± 0.0
XaaMet: 0.0 ± 0.0
XaaAsn: 0.0 ± 0.0
XaaPro: 0.0 ± 0.0
XaaGln: 0.0 ± 0.0
XaaArg: 0.0 ± 0.0
XaaSer: 0.0 ± 0.0
XaaThr: 0.0 ± 0.0
XaaVal: 0.0 ± 0.0
XaaTrp: 0.0 ± 0.0
XaaTyr: 0.0 ± 0.0
XaaXaa: 0.0 ± 0.0

Statistics based on 112 proteins (18647 amino acids)

Note: The error was estimated by bootstrapping (x100) at the protein level.
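One way to read "bootstrapping (x100) at the protein level" is sketched here: whole proteins are resampled with replacement 100 times, the dipeptide frequency is recomputed for each resample, and the spread of those estimates is reported. This is an illustrative assumption, not the database's actual code; the function names, the fixed seed, and the use of the sample standard deviation as the error are all hypothetical choices.

```python
import random
import statistics
from collections import Counter


def pair_counts(seq):
    """Count overlapping adjacent residue pairs in one protein sequence."""
    return Counter(a + b for a, b in zip(seq, seq[1:]))


def bootstrap_error(proteins, pair, n_rounds=100, seed=0):
    """Return (mean, standard deviation) of one dipeptide frequency in per mille.

    Hypothetical helper: each round resamples whole proteins with replacement.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    per_protein = [pair_counts(s) for s in proteins]
    estimates = []
    for _ in range(n_rounds):
        # Draw as many proteins as the proteome contains, with replacement.
        sample = rng.choices(per_protein, k=len(per_protein))
        total = sum(sum(c.values()) for c in sample)
        hits = sum(c[pair] for c in sample)  # a missing pair counts as 0
        estimates.append(1000.0 * hits / total if total else 0.0)
    return statistics.mean(estimates), statistics.stdev(estimates)


# Toy sequences, not the 112 phage proteins behind the table.
mean, err = bootstrap_error(["MAAG", "AAGK", "MKKL"], "AA")
print(f"AA: {mean:.3f} ± {err:.3f}")
```

Resampling proteins rather than individual residues keeps each sequence intact, so the error reflects how much the estimate depends on which proteins happen to be in the proteome.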

The dipeptide statistics above (along with other statistics for this proteome) can be downloaded from this CSV file
See this proteome in: uniprot_link
Proteome-pI is available under a Creative Commons Attribution-NoDerivs license; for more details see here

Reference: Kozlowski LP. Proteome-pI 2.0: Proteome Isoelectric Point Database Update. Nucleic Acids Res. 2021, doi: 10.1093/nar/gkab944
Contact: Lukasz P. Kozlowski