Amino acid dipeptide frequency for Mycobacterium phage XianYue

Apart from single amino acid frequencies, one can also calculate the so-called amino acid dipeptide frequency, i.e. how often each ordered pair of adjacent residues occurs.
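As a rough illustration of what such a calculation involves, the sketch below counts overlapping pairs of adjacent residues within each protein and reports them in per mille. The function name `dipeptide_permille`, the use of one-letter codes, and the choice never to count a pair spanning two proteins are assumptions made for this example; the exact procedure used to build the table is not described on this page.

```python
from collections import Counter


def dipeptide_permille(sequences):
    """Return {dipeptide: frequency in per mille} pooled over all proteins.

    Hypothetical helper: sequences are strings of one-letter residue codes.
    """
    counts = Counter()
    for seq in sequences:
        seq = seq.upper()
        # Overlapping adjacent pairs; a pair never spans two proteins.
        counts.update(seq[i:i + 2] for i in range(len(seq) - 1))
    total = sum(counts.values())
    if total == 0:
        return {}
    return {pair: 1000 * n / total for pair, n in counts.items()}


# Toy input: "MAAK" gives MA, AA, AK and "AAC" gives AA, AC (5 pairs in total).
freqs = dipeptide_permille(["MAAK", "AAC"])
for pair, value in sorted(freqs.items()):
    print(f"{pair}: {value:.1f}")
```

Pooling the counts over all proteins before dividing means long proteins contribute more pairs than short ones, which is one possible convention among several.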

There are 400 possible dipeptides (441 if Xaa is counted as an additional 21st amino acid representing all non-standard or unknown amino acids). Thus, if dipeptides occurred randomly and uniformly in proteins, each of the 400 would be present at a frequency of 0.25% (2.5‰). As this is not the case in nature, for better visibility all over-represented dipeptides are marked in red and all under-represented ones in blue in the table.

All values are given in per mille (‰), so they need to be multiplied by 10⁻³ to obtain fractions.
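The arithmetic behind the uniform expectation and the per mille convention can be sketched as follows. The helper names are hypothetical, and the plain comparison against the uniform expectation is an assumption: the page does not state the exact rule used for the red and blue colouring. The example values are taken from the table.

```python
def uniform_expectation_permille(n_residues):
    """Expected frequency (per mille) if all ordered pairs were equally likely."""
    return 1000 / n_residues ** 2


def permille_to_fraction(value):
    """Convert a per mille value to a plain fraction (multiply by 10^-3)."""
    return value * 1e-3


def classify(value_permille, expected_permille):
    """Assumed rule: compare a value directly with the uniform expectation."""
    if value_permille > expected_permille:
        return "over"
    if value_permille < expected_permille:
        return "under"
    return "as expected"


expected_20 = uniform_expectation_permille(20)  # 400 possible dipeptides
expected_21 = uniform_expectation_permille(21)  # 441 when Xaa is included

# A few values copied from the table (per mille).
examples = {"AlaAla": 11.589, "GluLeu": 8.722, "CysCys": 0.122}
for pair, value in examples.items():
    label = classify(value, expected_20)
    print(f"{pair}: {value} per mille = {permille_to_fraction(value):.6f} ({label})")
```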

For more information, see the sequence space article on Wikipedia.

Residue order: Ala Cys Asp Glu Phe Gly His Ile Lys Leu Met Asn Pro Gln Arg Ser Thr Val Trp Tyr Xaa
Entries are grouped by first residue; each line gives dipeptide: frequency ± error (‰)
Ala
AlaAla: 11.589 ± 0.998
AlaCys: 0.366 ± 0.113
AlaAsp: 5.55 ± 0.655
AlaGlu: 7.685 ± 0.812
AlaPhe: 3.843 ± 0.519
AlaGly: 8.356 ± 1.041
AlaHis: 1.403 ± 0.308
AlaIle: 4.758 ± 0.466
AlaLys: 5.124 ± 0.66
AlaLeu: 8.783 ± 0.837
AlaMet: 2.684 ± 0.345
AlaAsn: 3.111 ± 0.467
AlaPro: 4.636 ± 0.561
AlaGln: 3.66 ± 0.476
AlaArg: 5.672 ± 0.64
AlaSer: 4.453 ± 0.569
AlaThr: 5.794 ± 0.603
AlaVal: 7.502 ± 0.741
AlaTrp: 1.83 ± 0.306
AlaTyr: 2.806 ± 0.426
AlaXaa: 0.0 ± 0.0
Cys
CysAla: 0.549 ± 0.187
CysCys: 0.122 ± 0.121
CysAsp: 0.915 ± 0.218
CysGlu: 0.305 ± 0.132
CysPhe: 0.366 ± 0.153
CysGly: 1.037 ± 0.234
CysHis: 0.244 ± 0.151
CysIle: 0.427 ± 0.201
CysLys: 0.488 ± 0.182
CysLeu: 0.671 ± 0.193
CysMet: 0.183 ± 0.126
CysAsn: 0.427 ± 0.148
CysPro: 0.732 ± 0.254
CysGln: 0.244 ± 0.113
CysArg: 0.427 ± 0.166
CysSer: 0.427 ± 0.167
CysThr: 0.427 ± 0.151
CysVal: 0.488 ± 0.148
CysTrp: 0.427 ± 0.171
CysTyr: 0.183 ± 0.104
CysXaa: 0.0 ± 0.0
Asp
AspAla: 6.648 ± 0.679
AspCys: 0.854 ± 0.266
AspAsp: 3.843 ± 0.545
AspGlu: 4.941 ± 0.751
AspPhe: 3.111 ± 0.515
AspGly: 5.306 ± 0.532
AspHis: 1.22 ± 0.291
AspIle: 3.599 ± 0.408
AspLys: 2.44 ± 0.357
AspLeu: 5.124 ± 0.588
AspMet: 1.769 ± 0.311
AspAsn: 1.403 ± 0.237
AspPro: 4.88 ± 0.509
AspGln: 2.562 ± 0.3
AspArg: 2.684 ± 0.472
AspSer: 2.684 ± 0.475
AspThr: 2.806 ± 0.476
AspVal: 4.575 ± 0.473
AspTrp: 1.159 ± 0.246
AspTyr: 1.83 ± 0.272
AspXaa: 0.0 ± 0.0
Glu
GluAla: 6.892 ± 0.781
GluCys: 0.488 ± 0.19
GluAsp: 4.575 ± 0.756
GluGlu: 5.002 ± 0.771
GluPhe: 2.928 ± 0.371
GluGly: 5.185 ± 0.655
GluHis: 1.342 ± 0.3
GluIle: 3.599 ± 0.502
GluLys: 2.44 ± 0.392
GluLeu: 8.722 ± 0.86
GluMet: 2.379 ± 0.342
GluAsn: 2.196 ± 0.291
GluPro: 2.623 ± 0.438
GluGln: 1.891 ± 0.301
GluArg: 4.026 ± 0.585
GluSer: 2.806 ± 0.418
GluThr: 3.721 ± 0.454
GluVal: 4.575 ± 0.506
GluTrp: 1.708 ± 0.378
GluTyr: 2.379 ± 0.419
GluXaa: 0.0 ± 0.0
Phe
PheAla: 3.05 ± 0.433
PheCys: 0.427 ± 0.173
PheAsp: 2.623 ± 0.39
PheGlu: 2.806 ± 0.373
PhePhe: 0.976 ± 0.249
PheGly: 2.867 ± 0.381
PheHis: 0.793 ± 0.259
PheIle: 1.464 ± 0.27
PheLys: 1.525 ± 0.292
PheLeu: 2.867 ± 0.544
PheMet: 0.61 ± 0.176
PheAsn: 2.013 ± 0.36
PhePro: 1.586 ± 0.321
PheGln: 1.281 ± 0.323
PheArg: 2.013 ± 0.319
PheSer: 2.196 ± 0.39
PheThr: 2.135 ± 0.313
PheVal: 2.44 ± 0.34
PheTrp: 0.427 ± 0.154
PheTyr: 0.61 ± 0.183
PheXaa: 0.0 ± 0.0
Gly
GlyAla: 6.831 ± 0.947
GlyCys: 0.549 ± 0.188
GlyAsp: 5.733 ± 0.606
GlyGlu: 4.88 ± 0.605
GlyPhe: 3.172 ± 0.55
GlyGly: 8.478 ± 1.469
GlyHis: 1.952 ± 0.335
GlyIle: 4.148 ± 0.595
GlyLys: 4.209 ± 0.571
GlyLeu: 6.343 ± 0.784
GlyMet: 1.952 ± 0.329
GlyAsn: 3.416 ± 0.587
GlyPro: 3.172 ± 0.432
GlyGln: 3.355 ± 0.527
GlyArg: 4.27 ± 0.475
GlySer: 4.27 ± 0.684
GlyThr: 4.88 ± 0.746
GlyVal: 6.282 ± 0.777
GlyTrp: 1.647 ± 0.32
GlyTyr: 2.623 ± 0.403
GlyXaa: 0.0 ± 0.0
His
HisAla: 1.83 ± 0.382
HisCys: 0.305 ± 0.131
HisAsp: 1.525 ± 0.315
HisGlu: 1.464 ± 0.289
HisPhe: 0.427 ± 0.162
HisGly: 1.464 ± 0.384
HisHis: 0.732 ± 0.213
HisIle: 0.976 ± 0.236
HisLys: 0.854 ± 0.211
HisLeu: 1.159 ± 0.292
HisMet: 0.427 ± 0.16
HisAsn: 0.61 ± 0.165
HisPro: 1.159 ± 0.254
HisGln: 0.549 ± 0.198
HisArg: 1.159 ± 0.297
HisSer: 1.342 ± 0.282
HisThr: 0.976 ± 0.219
HisVal: 1.159 ± 0.261
HisTrp: 0.427 ± 0.184
HisTyr: 0.671 ± 0.258
HisXaa: 0.0 ± 0.0
Ile
IleAla: 5.246 ± 0.57
IleCys: 0.488 ± 0.152
IleAsp: 4.026 ± 0.448
IleGlu: 4.819 ± 0.668
IlePhe: 0.976 ± 0.214
IleGly: 3.965 ± 0.384
IleHis: 1.098 ± 0.234
IleIle: 2.074 ± 0.371
IleLys: 3.111 ± 0.393
IleLeu: 4.209 ± 0.482
IleMet: 0.488 ± 0.186
IleAsn: 2.501 ± 0.435
IlePro: 3.416 ± 0.422
IleGln: 1.464 ± 0.315
IleArg: 3.233 ± 0.396
IleSer: 2.318 ± 0.451
IleThr: 3.233 ± 0.3
IleVal: 2.562 ± 0.349
IleTrp: 0.488 ± 0.126
IleTyr: 1.159 ± 0.216
IleXaa: 0.0 ± 0.0
Lys
LysAla: 5.367 ± 0.614
LysCys: 0.244 ± 0.099
LysAsp: 2.257 ± 0.389
LysGlu: 2.379 ± 0.445
LysPhe: 1.403 ± 0.334
LysGly: 3.782 ± 0.702
LysHis: 0.732 ± 0.212
LysIle: 2.379 ± 0.375
LysLys: 2.928 ± 0.505
LysLeu: 3.904 ± 0.447
LysMet: 0.854 ± 0.207
LysAsn: 1.708 ± 0.379
LysPro: 2.989 ± 0.533
LysGln: 1.403 ± 0.302
LysArg: 3.111 ± 0.436
LysSer: 2.501 ± 0.3
LysThr: 2.745 ± 0.369
LysVal: 4.453 ± 0.489
LysTrp: 0.732 ± 0.228
LysTyr: 1.464 ± 0.315
LysXaa: 0.0 ± 0.0
Leu
LeuAla: 9.027 ± 0.934
LeuCys: 0.732 ± 0.219
LeuAsp: 5.246 ± 0.693
LeuGlu: 5.367 ± 0.642
LeuPhe: 2.562 ± 0.376
LeuGly: 6.404 ± 0.69
LeuHis: 1.891 ± 0.394
LeuIle: 4.148 ± 0.412
LeuLys: 3.111 ± 0.489
LeuLeu: 6.221 ± 0.697
LeuMet: 2.501 ± 0.406
LeuAsn: 2.623 ± 0.45
LeuPro: 4.514 ± 0.434
LeuGln: 2.257 ± 0.469
LeuArg: 6.099 ± 0.73
LeuSer: 5.489 ± 0.686
LeuThr: 5.733 ± 0.537
LeuVal: 4.697 ± 0.606
LeuTrp: 1.403 ± 0.252
LeuTyr: 2.562 ± 0.427
LeuXaa: 0.0 ± 0.0
Met
MetAla: 3.05 ± 0.383
MetCys: 0.122 ± 0.087
MetAsp: 1.22 ± 0.26
MetGlu: 1.098 ± 0.215
MetPhe: 0.549 ± 0.169
MetGly: 1.83 ± 0.346
MetHis: 0.61 ± 0.163
MetIle: 1.22 ± 0.296
MetLys: 1.281 ± 0.287
MetLeu: 1.769 ± 0.428
MetMet: 0.61 ± 0.16
MetAsn: 0.732 ± 0.202
MetPro: 1.464 ± 0.309
MetGln: 1.037 ± 0.23
MetArg: 1.586 ± 0.316
MetSer: 1.891 ± 0.291
MetThr: 2.135 ± 0.309
MetVal: 1.342 ± 0.238
MetTrp: 0.183 ± 0.097
MetTyr: 0.61 ± 0.174
MetXaa: 0.0 ± 0.0
Asn
AsnAla: 3.233 ± 0.449
AsnCys: 0.427 ± 0.177
AsnAsp: 2.013 ± 0.337
AsnGlu: 2.074 ± 0.389
AsnPhe: 0.793 ± 0.236
AsnGly: 3.66 ± 0.561
AsnHis: 0.915 ± 0.19
AsnIle: 1.647 ± 0.308
AsnLys: 1.403 ± 0.277
AsnLeu: 2.44 ± 0.378
AsnMet: 0.915 ± 0.252
AsnAsn: 0.915 ± 0.233
AsnPro: 2.745 ± 0.424
AsnGln: 0.793 ± 0.216
AsnArg: 2.135 ± 0.337
AsnSer: 1.525 ± 0.274
AsnThr: 1.708 ± 0.364
AsnVal: 2.379 ± 0.374
AsnTrp: 0.915 ± 0.235
AsnTyr: 1.037 ± 0.241
AsnXaa: 0.0 ± 0.0
Pro
ProAla: 5.611 ± 0.615
ProCys: 0.305 ± 0.146
ProAsp: 4.026 ± 0.547
ProGlu: 4.819 ± 0.626
ProPhe: 1.891 ± 0.378
ProGly: 4.392 ± 0.573
ProHis: 0.976 ± 0.212
ProIle: 2.928 ± 0.393
ProLys: 2.196 ± 0.438
ProLeu: 3.05 ± 0.458
ProMet: 0.732 ± 0.227
ProAsn: 2.074 ± 0.321
ProPro: 2.745 ± 0.518
ProGln: 1.708 ± 0.371
ProArg: 3.538 ± 0.562
ProSer: 3.416 ± 0.488
ProThr: 3.416 ± 0.45
ProVal: 4.087 ± 0.433
ProTrp: 1.403 ± 0.393
ProTyr: 1.586 ± 0.269
ProXaa: 0.0 ± 0.0
Gln
GlnAla: 3.721 ± 0.629
GlnCys: 0.244 ± 0.108
GlnAsp: 1.342 ± 0.289
GlnGlu: 1.952 ± 0.377
GlnPhe: 1.464 ± 0.246
GlnGly: 2.684 ± 0.378
GlnHis: 0.549 ± 0.161
GlnIle: 2.501 ± 0.341
GlnLys: 1.525 ± 0.283
GlnLeu: 3.172 ± 0.77
GlnMet: 0.976 ± 0.257
GlnAsn: 0.915 ± 0.267
GlnPro: 1.769 ± 0.4
GlnGln: 2.196 ± 0.429
GlnArg: 2.928 ± 0.519
GlnSer: 1.586 ± 0.305
GlnThr: 2.135 ± 0.254
GlnVal: 2.379 ± 0.369
GlnTrp: 0.671 ± 0.244
GlnTyr: 0.671 ± 0.174
GlnXaa: 0.0 ± 0.0
Arg
ArgAla: 4.697 ± 0.622
ArgCys: 0.915 ± 0.256
ArgAsp: 3.782 ± 0.464
ArgGlu: 5.124 ± 0.681
ArgPhe: 2.196 ± 0.388
ArgGly: 4.148 ± 0.534
ArgHis: 1.281 ± 0.326
ArgIle: 3.904 ± 0.404
ArgLys: 3.538 ± 0.501
ArgLeu: 5.55 ± 0.672
ArgMet: 1.952 ± 0.32
ArgAsn: 2.379 ± 0.388
ArgPro: 2.501 ± 0.365
ArgGln: 2.501 ± 0.407
ArgArg: 5.246 ± 0.714
ArgSer: 2.989 ± 0.522
ArgThr: 2.928 ± 0.419
ArgVal: 4.453 ± 0.597
ArgTrp: 1.22 ± 0.322
ArgTyr: 1.952 ± 0.318
ArgXaa: 0.0 ± 0.0
Ser
SerAla: 4.697 ± 0.446
SerCys: 0.549 ± 0.151
SerAsp: 3.05 ± 0.389
SerGlu: 3.477 ± 0.444
SerPhe: 1.708 ± 0.337
SerGly: 4.453 ± 0.451
SerHis: 0.732 ± 0.192
SerIle: 2.501 ± 0.328
SerLys: 3.355 ± 0.586
SerLeu: 4.514 ± 0.626
SerMet: 1.403 ± 0.307
SerAsn: 1.342 ± 0.264
SerPro: 3.843 ± 0.528
SerGln: 1.586 ± 0.318
SerArg: 3.66 ± 0.42
SerSer: 2.44 ± 0.547
SerThr: 2.806 ± 0.361
SerVal: 3.172 ± 0.426
SerTrp: 1.037 ± 0.242
SerTyr: 1.342 ± 0.257
SerXaa: 0.0 ± 0.0
Thr
ThrAla: 6.709 ± 0.64
ThrCys: 0.549 ± 0.16
ThrAsp: 3.355 ± 0.571
ThrGlu: 3.355 ± 0.391
ThrPhe: 2.257 ± 0.306
ThrGly: 5.367 ± 0.555
ThrHis: 0.488 ± 0.174
ThrIle: 3.355 ± 0.408
ThrLys: 3.66 ± 0.477
ThrLeu: 5.063 ± 0.611
ThrMet: 1.281 ± 0.26
ThrAsn: 1.403 ± 0.283
ThrPro: 3.782 ± 0.572
ThrGln: 2.318 ± 0.332
ThrArg: 2.928 ± 0.392
ThrSer: 2.867 ± 0.444
ThrThr: 4.087 ± 0.6
ThrVal: 4.453 ± 0.567
ThrTrp: 1.159 ± 0.253
ThrTyr: 1.708 ± 0.31
ThrXaa: 0.0 ± 0.0
Val
ValAla: 5.977 ± 0.827
ValCys: 0.976 ± 0.254
ValAsp: 5.55 ± 0.583
ValGlu: 4.819 ± 0.568
ValPhe: 2.318 ± 0.441
ValGly: 5.367 ± 0.678
ValHis: 1.098 ± 0.268
ValIle: 3.111 ± 0.366
ValLys: 2.867 ± 0.377
ValLeu: 5.185 ± 0.632
ValMet: 1.525 ± 0.362
ValAsn: 2.44 ± 0.41
ValPro: 3.721 ± 0.519
ValGln: 2.074 ± 0.416
ValArg: 5.063 ± 0.64
ValSer: 3.965 ± 0.546
ValThr: 5.428 ± 0.649
ValVal: 5.672 ± 0.501
ValTrp: 1.159 ± 0.292
ValTyr: 2.074 ± 0.434
ValXaa: 0.0 ± 0.0
Trp
TrpAla: 1.891 ± 0.376
TrpCys: 0.305 ± 0.168
TrpAsp: 0.976 ± 0.266
TrpGlu: 1.403 ± 0.28
TrpPhe: 1.098 ± 0.256
TrpGly: 1.342 ± 0.354
TrpHis: 0.488 ± 0.191
TrpIle: 0.732 ± 0.194
TrpLys: 0.61 ± 0.176
TrpLeu: 1.22 ± 0.263
TrpMet: 0.549 ± 0.158
TrpAsn: 0.671 ± 0.201
TrpPro: 1.159 ± 0.254
TrpGln: 1.037 ± 0.253
TrpArg: 1.159 ± 0.225
TrpSer: 0.793 ± 0.243
TrpThr: 1.342 ± 0.278
TrpVal: 1.342 ± 0.248
TrpTrp: 0.732 ± 0.181
TrpTyr: 0.427 ± 0.159
TrpXaa: 0.0 ± 0.0
Tyr
TyrAla: 2.989 ± 0.456
TyrCys: 0.122 ± 0.075
TyrAsp: 1.83 ± 0.292
TyrGlu: 1.769 ± 0.376
TyrPhe: 0.671 ± 0.187
TyrGly: 2.074 ± 0.358
TyrHis: 0.671 ± 0.188
TyrIle: 1.403 ± 0.248
TyrLys: 0.671 ± 0.228
TyrLeu: 2.623 ± 0.398
TyrMet: 0.488 ± 0.168
TyrAsn: 0.793 ± 0.209
TyrPro: 1.464 ± 0.338
TyrGln: 1.464 ± 0.3
TyrArg: 2.318 ± 0.424
TyrSer: 1.708 ± 0.317
TyrThr: 1.708 ± 0.327
TyrVal: 2.379 ± 0.445
TyrTrp: 0.549 ± 0.197
TyrTyr: 0.854 ± 0.237
TyrXaa: 0.0 ± 0.0
Xaa
XaaAla: 0.0 ± 0.0
XaaCys: 0.0 ± 0.0
XaaAsp: 0.0 ± 0.0
XaaGlu: 0.0 ± 0.0
XaaPhe: 0.0 ± 0.0
XaaGly: 0.0 ± 0.0
XaaHis: 0.0 ± 0.0
XaaIle: 0.0 ± 0.0
XaaLys: 0.0 ± 0.0
XaaLeu: 0.0 ± 0.0
XaaMet: 0.0 ± 0.0
XaaAsn: 0.0 ± 0.0
XaaPro: 0.0 ± 0.0
XaaGln: 0.0 ± 0.0
XaaArg: 0.0 ± 0.0
XaaSer: 0.0 ± 0.0
XaaThr: 0.0 ± 0.0
XaaVal: 0.0 ± 0.0
XaaTrp: 0.0 ± 0.0
XaaTyr: 0.0 ± 0.0
XaaXaa: 0.0 ± 0.0
Statistics based on 92 proteins (16396 amino acids)

Note: The error has been estimated by bootstrapping (100 replicates) at the protein level.
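A protein-level bootstrap of this kind can be sketched as follows: whole proteins are resampled with replacement, the dipeptide frequency is recomputed for each replicate, and the spread of the replicate values is taken as the error. This is only an illustration under stated assumptions: the function names are hypothetical, sequences use one-letter codes, and reporting the sample standard deviation across replicates is an assumed choice, since the page does not say which statistic the ± values represent.

```python
import random
import statistics
from collections import Counter


def pair_permille(sequences, pair):
    """Frequency (per mille) of one dipeptide, pooled over the given proteins."""
    counts = Counter()
    for seq in sequences:
        counts.update(seq[i:i + 2] for i in range(len(seq) - 1))
    total = sum(counts.values())
    return 1000 * counts[pair] / total if total else 0.0


def bootstrap_error(sequences, pair, replicates=100, seed=0):
    """Assumed error estimate: standard deviation over bootstrap replicates."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    estimates = []
    for _ in range(replicates):
        # Resample whole proteins (not residues) with replacement.
        sample = rng.choices(sequences, k=len(sequences))
        estimates.append(pair_permille(sample, pair))
    return statistics.stdev(estimates)


# Toy proteome in one-letter codes (not real phage sequences).
toy = ["AAAA", "CCCC", "ACAC", "AACC"]
print(f"AA: {pair_permille(toy, 'AA'):.1f} ± {bootstrap_error(toy, 'AA'):.1f}")
```

Resampling proteins rather than individual residues keeps each protein's internal composition intact, so the error reflects variation between proteins.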

The dipeptide statistics above (among other statistics for this proteome) can be downloaded from this CSV file
See this proteome in: uniprot_link
Proteome-pI is available under the Creative Commons Attribution-NoDerivs license; for more details see here

Reference: Kozlowski LP. Proteome-pI 2.0: Proteome Isoelectric Point Database Update. Nucleic Acids Res. 2021, doi: 10.1093/nar/gkab944
Contact: Lukasz P. Kozlowski