Amino acid dipeptide frequency for Mycobacterium phage QueenHazel

Apart from single amino acid frequencies, one can also calculate the so-called amino acid dipeptide frequency, i.e. how often each ordered pair of adjacent amino acids occurs in the proteome.
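A minimal sketch of such a calculation is shown below. It is not the Proteome-pI implementation: it assumes one-letter sequences, counts overlapping adjacent pairs within each protein (never across protein boundaries), and normalizes by the total number of pairs counted. The function name `dipeptide_frequencies` and the toy sequences are made up for illustration, and the database may normalize differently.

```python
from collections import Counter


def dipeptide_frequencies(proteins):
    """Return dipeptide frequencies in per mille for a list of sequences.

    Assumption: overlapping adjacent pairs are counted inside each protein,
    and the result is normalized by the total number of pairs counted.
    """
    counts = Counter()
    for seq in proteins:
        # zip(seq, seq[1:]) yields every adjacent pair; pairs never span two proteins.
        for first, second in zip(seq, seq[1:]):
            counts[first + second] += 1
    total = sum(counts.values())
    return {pair: 1000.0 * n / total for pair, n in counts.items()}


# Toy proteome with made-up sequences (not QueenHazel data).
toy_proteome = ["MAAG", "MAG"]
freqs = dipeptide_frequencies(toy_proteome)
for pair, value in sorted(freqs.items()):
    print(f"{pair}: {value:.1f} per mille")
```

Here the two toy proteins contribute five pairs in total (MA, AA, AG, MA, AG), so the frequencies sum to 1000‰.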

There are 400 possible dipeptides (441 if we count Xaa as an additional, 21st amino acid representing all non-standard or unknown amino acids). Thus, if dipeptides occurred randomly and with equal probability in proteins, each of the 400 would be present at a frequency of 0.25% (2.5‰). As this is not the case in nature, for better visibility dipeptides that occur more often than expected are marked in red in the table, and those that are underrepresented are marked in blue.
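The arithmetic behind the expected value can be sketched as follows. The `classify` helper is hypothetical: the page does not state the exact rule used for colouring, so the sketch simply assumes a direct comparison with the uniform expectation over the 400 standard dipeptides.

```python
STANDARD_AMINO_ACIDS = 20
WITH_XAA = 21  # standard set plus Xaa (non-standard or unknown residues)

n_pairs = STANDARD_AMINO_ACIDS ** 2   # 400 ordered pairs
n_pairs_xaa = WITH_XAA ** 2           # 441 ordered pairs
expected_permille = 1000.0 / n_pairs  # 2.5 per mille, i.e. 0.25%


def classify(observed_permille, expected=expected_permille):
    """Assumed rule: compare an observed frequency with the uniform expectation."""
    if observed_permille > expected:
        return "over"
    if observed_permille < expected:
        return "under"
    return "as expected"


# Values taken from the table: AlaAla = 16.5 and CysCys = 0.064 (per mille).
print(n_pairs, n_pairs_xaa, expected_permille)
print("AlaAla:", classify(16.5))
print("CysCys:", classify(0.064))
```

Under this assumption AlaAla falls well above the 2.5‰ baseline and CysCys well below it.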

All values are given in per mille (‰), so they need to be multiplied by 10^-3 to obtain fractions (for example, 16.5‰ = 0.0165 = 1.65%).
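As a small illustration of the unit conversion, the following sketch turns a per mille value into a fraction and a percentage. The helper names are made up for this example; the input is the AlaAla value from the table.

```python
def permille_to_fraction(value_permille):
    """Convert per mille to a plain fraction (multiply by 10^-3)."""
    return value_permille * 1e-3


def permille_to_percent(value_permille):
    """Convert per mille to percent (divide by 10)."""
    return value_permille / 10.0


ala_ala = 16.5  # AlaAla frequency from the table, in per mille
fraction = permille_to_fraction(ala_ala)
percent = permille_to_percent(ala_ala)
print(f"{ala_ala} per mille = {fraction:.4f} = {percent:.2f}%")
```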

For more information, see the sequence space article on Wikipedia.

AlaCysAspGluPheGlyHisIleLysLeuMetAsnProGlnArgSerThrValTrpTyrXaa
Ala
16.5AlaAla: 16.5 ± 1.391
1.031AlaCys: 1.031 ± 0.242
8.121AlaAsp: 8.121 ± 0.739
9.346AlaGlu: 9.346 ± 1.069
2.707AlaPhe: 2.707 ± 0.425
11.086AlaGly: 11.086 ± 1.361
1.934AlaHis: 1.934 ± 0.31
5.736AlaIle: 5.736 ± 0.656
4.77AlaLys: 4.77 ± 0.654
9.152AlaLeu: 9.152 ± 0.697
2.256AlaMet: 2.256 ± 0.429
3.223AlaAsn: 3.223 ± 0.493
6.445AlaPro: 6.445 ± 0.925
4.834AlaGln: 4.834 ± 0.658
7.154AlaArg: 7.154 ± 0.727
5.672AlaSer: 5.672 ± 0.722
6.768AlaThr: 6.768 ± 0.759
6.961AlaVal: 6.961 ± 0.731
2.385AlaTrp: 2.385 ± 0.571
2.965AlaTyr: 2.965 ± 0.313
0.0AlaXaa: 0.0 ± 0.0
Cys
0.773CysAla: 0.773 ± 0.252
0.064CysCys: 0.064 ± 0.067
0.516CysAsp: 0.516 ± 0.168
0.58CysGlu: 0.58 ± 0.207
0.129CysPhe: 0.129 ± 0.083
1.225CysGly: 1.225 ± 0.287
0.258CysHis: 0.258 ± 0.132
0.645CysIle: 0.645 ± 0.232
0.258CysLys: 0.258 ± 0.142
1.096CysLeu: 1.096 ± 0.299
0.258CysMet: 0.258 ± 0.121
0.322CysAsn: 0.322 ± 0.143
0.902CysPro: 0.902 ± 0.295
0.387CysGln: 0.387 ± 0.169
1.096CysArg: 1.096 ± 0.28
0.709CysSer: 0.709 ± 0.252
0.516CysThr: 0.516 ± 0.23
0.387CysVal: 0.387 ± 0.155
0.129CysTrp: 0.129 ± 0.103
0.258CysTyr: 0.258 ± 0.14
0.0CysXaa: 0.0 ± 0.0
Asp
7.154AspAla: 7.154 ± 0.803
1.096AspCys: 1.096 ± 0.291
6.123AspAsp: 6.123 ± 0.76
5.801AspGlu: 5.801 ± 0.703
1.74AspPhe: 1.74 ± 0.42
6.381AspGly: 6.381 ± 0.641
1.289AspHis: 1.289 ± 0.352
2.191AspIle: 2.191 ± 0.398
1.418AspLys: 1.418 ± 0.39
6.639AspLeu: 6.639 ± 0.644
1.354AspMet: 1.354 ± 0.274
1.611AspAsn: 1.611 ± 0.289
3.932AspPro: 3.932 ± 0.411
1.482AspGln: 1.482 ± 0.292
4.834AspArg: 4.834 ± 0.479
3.481AspSer: 3.481 ± 0.45
3.996AspThr: 3.996 ± 0.612
4.189AspVal: 4.189 ± 0.453
1.934AspTrp: 1.934 ± 0.466
1.998AspTyr: 1.998 ± 0.322
0.0AspXaa: 0.0 ± 0.0
Glu
7.219GluAla: 7.219 ± 0.83
0.967GluCys: 0.967 ± 0.319
3.029GluAsp: 3.029 ± 0.484
2.836GluGlu: 2.836 ± 0.584
2.643GluPhe: 2.643 ± 0.458
3.287GluGly: 3.287 ± 0.567
2.191GluHis: 2.191 ± 0.509
2.385GluIle: 2.385 ± 0.471
1.934GluLys: 1.934 ± 0.383
6.252GluLeu: 6.252 ± 0.709
1.096GluMet: 1.096 ± 0.296
1.354GluAsn: 1.354 ± 0.254
3.932GluPro: 3.932 ± 0.614
1.934GluGln: 1.934 ± 0.334
4.705GluArg: 4.705 ± 0.524
3.029GluSer: 3.029 ± 0.502
3.416GluThr: 3.416 ± 0.396
4.189GluVal: 4.189 ± 0.461
1.354GluTrp: 1.354 ± 0.282
1.611GluTyr: 1.611 ± 0.308
0.0GluXaa: 0.0 ± 0.0
Phe
3.287PheAla: 3.287 ± 0.428
0.258PheCys: 0.258 ± 0.123
2.707PheAsp: 2.707 ± 0.446
1.611PheGlu: 1.611 ± 0.299
0.902PhePhe: 0.902 ± 0.226
3.481PheGly: 3.481 ± 0.527
0.322PheHis: 0.322 ± 0.156
1.805PheIle: 1.805 ± 0.355
0.902PheLys: 0.902 ± 0.269
2.127PheLeu: 2.127 ± 0.42
0.516PheMet: 0.516 ± 0.178
0.838PheAsn: 0.838 ± 0.22
1.225PhePro: 1.225 ± 0.325
1.096PheGln: 1.096 ± 0.257
2.256PheArg: 2.256 ± 0.312
1.418PheSer: 1.418 ± 0.246
1.611PheThr: 1.611 ± 0.437
2.063PheVal: 2.063 ± 0.361
0.645PheTrp: 0.645 ± 0.198
0.451PheTyr: 0.451 ± 0.171
0.0PheXaa: 0.0 ± 0.0
Gly
9.668GlyAla: 9.668 ± 1.011
0.709GlyCys: 0.709 ± 0.212
4.705GlyAsp: 4.705 ± 0.565
4.447GlyGlu: 4.447 ± 0.572
2.965GlyPhe: 2.965 ± 0.502
10.248GlyGly: 10.248 ± 1.521
1.934GlyHis: 1.934 ± 0.411
4.125GlyIle: 4.125 ± 0.567
2.9GlyLys: 2.9 ± 0.396
6.897GlyLeu: 6.897 ± 0.947
1.611GlyMet: 1.611 ± 0.334
2.965GlyAsn: 2.965 ± 0.433
5.092GlyPro: 5.092 ± 0.739
3.803GlyGln: 3.803 ± 0.541
5.994GlyArg: 5.994 ± 0.642
5.285GlySer: 5.285 ± 0.719
6.059GlyThr: 6.059 ± 0.597
5.736GlyVal: 5.736 ± 0.617
1.676GlyTrp: 1.676 ± 0.344
1.934GlyTyr: 1.934 ± 0.237
0.0GlyXaa: 0.0 ± 0.0
His
2.514HisAla: 2.514 ± 0.509
0.193HisCys: 0.193 ± 0.111
1.547HisAsp: 1.547 ± 0.338
1.031HisGlu: 1.031 ± 0.286
0.516HisPhe: 0.516 ± 0.197
1.418HisGly: 1.418 ± 0.259
0.773HisHis: 0.773 ± 0.248
0.902HisIle: 0.902 ± 0.316
0.58HisLys: 0.58 ± 0.22
1.611HisLeu: 1.611 ± 0.31
0.58HisMet: 0.58 ± 0.188
0.516HisAsn: 0.516 ± 0.212
1.611HisPro: 1.611 ± 0.346
1.031HisGln: 1.031 ± 0.279
1.482HisArg: 1.482 ± 0.339
0.773HisSer: 0.773 ± 0.244
1.225HisThr: 1.225 ± 0.306
1.74HisVal: 1.74 ± 0.307
0.58HisTrp: 0.58 ± 0.235
0.709HisTyr: 0.709 ± 0.214
0.0HisXaa: 0.0 ± 0.0
Ile
5.285IleAla: 5.285 ± 0.534
0.193IleCys: 0.193 ± 0.106
3.223IleAsp: 3.223 ± 0.471
3.287IleGlu: 3.287 ± 0.401
0.773IlePhe: 0.773 ± 0.278
4.125IleGly: 4.125 ± 0.433
0.967IleHis: 0.967 ± 0.272
1.16IleIle: 1.16 ± 0.294
1.611IleLys: 1.611 ± 0.3
1.998IleLeu: 1.998 ± 0.341
0.451IleMet: 0.451 ± 0.242
1.482IleAsn: 1.482 ± 0.292
3.223IlePro: 3.223 ± 0.398
1.289IleGln: 1.289 ± 0.303
3.996IleArg: 3.996 ± 0.497
1.998IleSer: 1.998 ± 0.41
3.094IleThr: 3.094 ± 0.489
3.609IleVal: 3.609 ± 0.531
0.387IleTrp: 0.387 ± 0.198
1.16IleTyr: 1.16 ± 0.287
0.0IleXaa: 0.0 ± 0.0
Lys
3.803LysAla: 3.803 ± 0.528
0.129LysCys: 0.129 ± 0.087
2.256LysAsp: 2.256 ± 0.468
1.74LysGlu: 1.74 ± 0.406
1.805LysPhe: 1.805 ± 0.433
2.191LysGly: 2.191 ± 0.422
1.096LysHis: 1.096 ± 0.299
1.096LysIle: 1.096 ± 0.322
0.838LysLys: 0.838 ± 0.235
2.578LysLeu: 2.578 ± 0.371
1.16LysMet: 1.16 ± 0.273
1.16LysAsn: 1.16 ± 0.202
3.287LysPro: 3.287 ± 0.536
0.773LysGln: 0.773 ± 0.261
2.191LysArg: 2.191 ± 0.512
1.289LysSer: 1.289 ± 0.315
1.482LysThr: 1.482 ± 0.332
2.256LysVal: 2.256 ± 0.443
0.709LysTrp: 0.709 ± 0.202
0.451LysTyr: 0.451 ± 0.156
0.0LysXaa: 0.0 ± 0.0
Leu
10.184LeuAla: 10.184 ± 1.108
0.773LeuCys: 0.773 ± 0.266
6.639LeuAsp: 6.639 ± 0.702
3.609LeuGlu: 3.609 ± 0.49
1.676LeuPhe: 1.676 ± 0.297
7.477LeuGly: 7.477 ± 0.681
1.354LeuHis: 1.354 ± 0.307
3.481LeuIle: 3.481 ± 0.446
3.029LeuLys: 3.029 ± 0.552
5.801LeuLeu: 5.801 ± 0.601
1.547LeuMet: 1.547 ± 0.333
3.481LeuAsn: 3.481 ± 0.572
5.479LeuPro: 5.479 ± 0.636
2.9LeuGln: 2.9 ± 0.48
5.092LeuArg: 5.092 ± 0.652
3.609LeuSer: 3.609 ± 0.464
4.705LeuThr: 4.705 ± 0.515
4.512LeuVal: 4.512 ± 0.463
1.547LeuTrp: 1.547 ± 0.321
1.676LeuTyr: 1.676 ± 0.302
0.0LeuXaa: 0.0 ± 0.0
Met
3.029MetAla: 3.029 ± 0.427
0.258MetCys: 0.258 ± 0.146
0.516MetAsp: 0.516 ± 0.16
1.031MetGlu: 1.031 ± 0.355
0.516MetPhe: 0.516 ± 0.172
1.354MetGly: 1.354 ± 0.403
0.516MetHis: 0.516 ± 0.175
1.16MetIle: 1.16 ± 0.369
0.645MetLys: 0.645 ± 0.225
1.482MetLeu: 1.482 ± 0.301
0.129MetMet: 0.129 ± 0.092
0.516MetAsn: 0.516 ± 0.165
1.74MetPro: 1.74 ± 0.323
0.967MetGln: 0.967 ± 0.208
1.74MetArg: 1.74 ± 0.343
1.869MetSer: 1.869 ± 0.337
2.256MetThr: 2.256 ± 0.414
0.709MetVal: 0.709 ± 0.247
0.322MetTrp: 0.322 ± 0.131
0.387MetTyr: 0.387 ± 0.131
0.0MetXaa: 0.0 ± 0.0
Asn
3.867AsnAla: 3.867 ± 0.585
0.258AsnCys: 0.258 ± 0.124
1.547AsnAsp: 1.547 ± 0.33
1.418AsnGlu: 1.418 ± 0.363
0.322AsnPhe: 0.322 ± 0.129
3.609AsnGly: 3.609 ± 0.504
0.451AsnHis: 0.451 ± 0.157
1.289AsnIle: 1.289 ± 0.342
0.645AsnLys: 0.645 ± 0.233
2.965AsnLeu: 2.965 ± 0.473
0.387AsnMet: 0.387 ± 0.146
0.902AsnAsn: 0.902 ± 0.27
3.287AsnPro: 3.287 ± 0.543
0.709AsnGln: 0.709 ± 0.205
2.578AsnArg: 2.578 ± 0.469
1.869AsnSer: 1.869 ± 0.438
1.482AsnThr: 1.482 ± 0.371
2.063AsnVal: 2.063 ± 0.308
0.773AsnTrp: 0.773 ± 0.248
0.838AsnTyr: 0.838 ± 0.257
0.0AsnXaa: 0.0 ± 0.0
Pro
6.381ProAla: 6.381 ± 0.703
0.193ProCys: 0.193 ± 0.114
5.092ProAsp: 5.092 ± 0.691
3.867ProGlu: 3.867 ± 0.593
2.32ProPhe: 2.32 ± 0.322
6.381ProGly: 6.381 ± 0.883
1.031ProHis: 1.031 ± 0.284
3.416ProIle: 3.416 ± 0.412
2.127ProLys: 2.127 ± 0.408
4.705ProLeu: 4.705 ± 0.723
1.354ProMet: 1.354 ± 0.251
2.063ProAsn: 2.063 ± 0.352
3.287ProPro: 3.287 ± 0.505
3.029ProGln: 3.029 ± 0.474
2.9ProArg: 2.9 ± 0.486
3.481ProSer: 3.481 ± 0.573
2.836ProThr: 2.836 ± 0.415
4.447ProVal: 4.447 ± 0.508
1.611ProTrp: 1.611 ± 0.319
1.676ProTyr: 1.676 ± 0.41
0.0ProXaa: 0.0 ± 0.0
Gln
5.092GlnAla: 5.092 ± 0.555
0.322GlnCys: 0.322 ± 0.137
1.74GlnAsp: 1.74 ± 0.372
0.838GlnGlu: 0.838 ± 0.284
1.289GlnPhe: 1.289 ± 0.248
2.965GlnGly: 2.965 ± 0.677
0.773GlnHis: 0.773 ± 0.263
2.256GlnIle: 2.256 ± 0.405
1.289GlnLys: 1.289 ± 0.358
2.578GlnLeu: 2.578 ± 0.476
1.031GlnMet: 1.031 ± 0.226
1.418GlnAsn: 1.418 ± 0.282
2.643GlnPro: 2.643 ± 0.365
1.869GlnGln: 1.869 ± 0.476
3.996GlnArg: 3.996 ± 0.494
2.127GlnSer: 2.127 ± 0.335
1.934GlnThr: 1.934 ± 0.257
2.9GlnVal: 2.9 ± 0.427
0.709GlnTrp: 0.709 ± 0.215
0.516GlnTyr: 0.516 ± 0.258
0.0GlnXaa: 0.0 ± 0.0
Arg
8.121ArgAla: 8.121 ± 0.771
0.838ArgCys: 0.838 ± 0.242
5.092ArgAsp: 5.092 ± 0.529
3.932ArgGlu: 3.932 ± 0.536
2.191ArgPhe: 2.191 ± 0.394
4.576ArgGly: 4.576 ± 0.516
1.74ArgHis: 1.74 ± 0.333
2.449ArgIle: 2.449 ± 0.333
2.707ArgLys: 2.707 ± 0.452
5.156ArgLeu: 5.156 ± 0.667
2.449ArgMet: 2.449 ± 0.47
2.9ArgAsn: 2.9 ± 0.437
3.158ArgPro: 3.158 ± 0.416
2.836ArgGln: 2.836 ± 0.509
6.832ArgArg: 6.832 ± 0.937
2.707ArgSer: 2.707 ± 0.515
4.447ArgThr: 4.447 ± 0.63
4.318ArgVal: 4.318 ± 0.482
1.998ArgTrp: 1.998 ± 0.414
2.127ArgTyr: 2.127 ± 0.395
0.0ArgXaa: 0.0 ± 0.0
Ser
5.672SerAla: 5.672 ± 0.739
0.838SerCys: 0.838 ± 0.261
3.867SerAsp: 3.867 ± 0.471
2.127SerGlu: 2.127 ± 0.44
1.676SerPhe: 1.676 ± 0.29
5.027SerGly: 5.027 ± 0.56
0.773SerHis: 0.773 ± 0.253
2.32SerIle: 2.32 ± 0.34
1.74SerLys: 1.74 ± 0.343
3.674SerLeu: 3.674 ± 0.6
1.805SerMet: 1.805 ± 0.309
2.063SerAsn: 2.063 ± 0.447
2.643SerPro: 2.643 ± 0.428
1.934SerGln: 1.934 ± 0.433
3.481SerArg: 3.481 ± 0.48
2.256SerSer: 2.256 ± 0.453
3.029SerThr: 3.029 ± 0.344
3.481SerVal: 3.481 ± 0.522
1.289SerTrp: 1.289 ± 0.242
1.096SerTyr: 1.096 ± 0.237
0.0SerXaa: 0.0 ± 0.0
Thr
7.348ThrAla: 7.348 ± 0.723
0.709ThrCys: 0.709 ± 0.208
3.609ThrAsp: 3.609 ± 0.53
3.223ThrGlu: 3.223 ± 0.455
1.934ThrPhe: 1.934 ± 0.407
5.414ThrGly: 5.414 ± 0.68
1.354ThrHis: 1.354 ± 0.244
2.836ThrIle: 2.836 ± 0.416
1.998ThrLys: 1.998 ± 0.402
5.027ThrLeu: 5.027 ± 0.551
0.838ThrMet: 0.838 ± 0.249
1.805ThrAsn: 1.805 ± 0.386
4.254ThrPro: 4.254 ± 0.476
2.127ThrGln: 2.127 ± 0.381
2.385ThrArg: 2.385 ± 0.372
3.223ThrSer: 3.223 ± 0.447
3.609ThrThr: 3.609 ± 0.58
5.027ThrVal: 5.027 ± 0.541
1.482ThrTrp: 1.482 ± 0.384
1.934ThrTyr: 1.934 ± 0.318
0.0ThrXaa: 0.0 ± 0.0
Val
8.572ValAla: 8.572 ± 0.871
0.967ValCys: 0.967 ± 0.235
5.156ValAsp: 5.156 ± 0.602
5.285ValGlu: 5.285 ± 0.472
1.482ValPhe: 1.482 ± 0.286
4.77ValGly: 4.77 ± 0.566
1.289ValHis: 1.289 ± 0.318
2.707ValIle: 2.707 ± 0.427
2.063ValLys: 2.063 ± 0.395
5.156ValLeu: 5.156 ± 0.661
1.289ValMet: 1.289 ± 0.283
1.805ValAsn: 1.805 ± 0.36
3.674ValPro: 3.674 ± 0.47
2.449ValGln: 2.449 ± 0.441
4.447ValArg: 4.447 ± 0.52
3.609ValSer: 3.609 ± 0.572
4.963ValThr: 4.963 ± 0.628
6.123ValVal: 6.123 ± 0.752
0.838ValTrp: 0.838 ± 0.231
1.418ValTyr: 1.418 ± 0.247
0.0ValXaa: 0.0 ± 0.0
Trp
2.063TrpAla: 2.063 ± 0.378
0.387TrpCys: 0.387 ± 0.175
1.482TrpAsp: 1.482 ± 0.359
1.031TrpGlu: 1.031 ± 0.231
1.289TrpPhe: 1.289 ± 0.326
1.16TrpGly: 1.16 ± 0.238
0.516TrpHis: 0.516 ± 0.167
0.902TrpIle: 0.902 ± 0.249
0.516TrpLys: 0.516 ± 0.168
1.74TrpLeu: 1.74 ± 0.277
0.58TrpMet: 0.58 ± 0.199
0.387TrpAsn: 0.387 ± 0.154
0.967TrpPro: 0.967 ± 0.26
1.611TrpGln: 1.611 ± 0.353
1.805TrpArg: 1.805 ± 0.388
1.482TrpSer: 1.482 ± 0.373
1.225TrpThr: 1.225 ± 0.346
1.289TrpVal: 1.289 ± 0.265
0.645TrpTrp: 0.645 ± 0.222
0.451TrpTyr: 0.451 ± 0.142
0.0TrpXaa: 0.0 ± 0.0
Tyr
2.514TyrAla: 2.514 ± 0.403
0.322TyrCys: 0.322 ± 0.139
1.676TyrAsp: 1.676 ± 0.376
2.32TyrGlu: 2.32 ± 0.362
0.645TyrPhe: 0.645 ± 0.203
2.578TyrGly: 2.578 ± 0.427
0.773TyrHis: 0.773 ± 0.218
0.451TyrIle: 0.451 ± 0.172
0.322TyrLys: 0.322 ± 0.145
1.934TyrLeu: 1.934 ± 0.395
0.387TyrMet: 0.387 ± 0.181
0.451TyrAsn: 0.451 ± 0.157
1.418TyrPro: 1.418 ± 0.286
1.225TyrGln: 1.225 ± 0.28
1.676TyrArg: 1.676 ± 0.233
0.902TyrSer: 0.902 ± 0.222
1.482TyrThr: 1.482 ± 0.287
2.063TyrVal: 2.063 ± 0.304
0.516TyrTrp: 0.516 ± 0.165
1.031TyrTyr: 1.031 ± 0.281
0.0TyrXaa: 0.0 ± 0.0
Xaa
0.0XaaAla: 0.0 ± 0.0
0.0XaaCys: 0.0 ± 0.0
0.0XaaAsp: 0.0 ± 0.0
0.0XaaGlu: 0.0 ± 0.0
0.0XaaPhe: 0.0 ± 0.0
0.0XaaGly: 0.0 ± 0.0
0.0XaaHis: 0.0 ± 0.0
0.0XaaIle: 0.0 ± 0.0
0.0XaaLys: 0.0 ± 0.0
0.0XaaLeu: 0.0 ± 0.0
0.0XaaMet: 0.0 ± 0.0
0.0XaaAsn: 0.0 ± 0.0
0.0XaaPro: 0.0 ± 0.0
0.0XaaGln: 0.0 ± 0.0
0.0XaaArg: 0.0 ± 0.0
0.0XaaSer: 0.0 ± 0.0
0.0XaaThr: 0.0 ± 0.0
0.0XaaVal: 0.0 ± 0.0
0.0XaaTrp: 0.0 ± 0.0
0.0XaaTyr: 0.0 ± 0.0
0.0XaaXaa: 0.0 ± 0.0
Statistics based on 73 proteins (15516 amino acids)

Note: The error was estimated by bootstrapping (100 resamples) at the protein level.
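Protein-level bootstrapping can be sketched roughly as follows: whole proteins are resampled with replacement, the dipeptide frequency is recomputed for each resample, and the spread of those estimates is reported as the error. This is an illustrative sketch, not the database's code. The function names, the fixed seed, the toy sequences and the use of the sample standard deviation as the error measure are all assumptions.

```python
import random
import statistics
from collections import Counter


def pair_permille(proteins, pair):
    """Frequency of one dipeptide (per mille) over a list of one-letter sequences."""
    counts = Counter()
    for seq in proteins:
        counts.update(a + b for a, b in zip(seq, seq[1:]))
    total = sum(counts.values())
    return 1000.0 * counts[pair] / total if total else 0.0


def bootstrap_error(proteins, pair, n_resamples=100, seed=0):
    """Assumed error measure: standard deviation over protein-level resamples."""
    rng = random.Random(seed)
    estimates = []
    for _ in range(n_resamples):
        # Draw whole proteins with replacement, keeping the proteome size fixed.
        sample = rng.choices(proteins, k=len(proteins))
        estimates.append(pair_permille(sample, pair))
    return statistics.stdev(estimates)


# Made-up toy proteome, for illustration only.
toy = ["MAAG", "MAG", "AAAG", "MGG"]
error_aa = bootstrap_error(toy, "AA")
print(f"AA: {pair_permille(toy, 'AA'):.1f} ± {error_aa:.1f} per mille")
```

Resampling whole proteins rather than individual residues keeps each sequence intact, so the estimate reflects variation between proteins.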

The dipeptide statistics above (along with other statistics for this proteome) can be downloaded from this CSV file
See this proteome in: uniprot_link
Proteome-pI is available under Creative Commons Attribution-NoDerivs license, for more details see here

Reference: Kozlowski LP. Proteome-pI 2.0: Proteome Isoelectric Point Database Update. Nucleic Acids Res. 2021, doi: 10.1093/nar/gkab944
Contact: Lukasz P. Kozlowski