Amino acid dipeptide frequency for Mycobacterium phage Hamulus

Apart from single amino acid frequencies, one can also calculate the so-called amino acid dipeptide frequency.

There are 400 possibilities (441 if we consider Xaa as an additional 21st amino acid representing all non-standard or unknown amino acids). Thus, if dipeptides occurred at random with equal probability in proteins, each amino acid dipeptide would be present at a frequency of 0.25% (1/400, i.e. 2.5‰). As this is not the case in nature, for better visibility all overrepresented dipeptides are marked in red and all underrepresented ones in blue in the table.
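As an illustration of the calculation described above, the following minimal Python sketch counts overlapping dipeptides within each protein and reports them in per mille. It is not the Proteome-pI implementation: the function names, the toy sequences, the one-letter amino acid codes, and the choice to normalise by the total number of dipeptides are all assumptions, and the normalisation used by the database may differ. The red/blue labels simply compare a value with the uniform expectation of 2.5‰ stated in the text.

```python
from collections import Counter

# Uniform expectation from the text: 1/400 of all dipeptides, i.e. 2.5 per mille.
UNIFORM_PER_MILLE = 1000.0 / 400


def dipeptide_per_mille(proteins):
    """Return {dipeptide: frequency in per mille} for a list of sequences.

    Assumption: dipeptides are overlapping windows of length 2 that never
    span two proteins, normalised by the total number of dipeptides.
    """
    counts = Counter()
    for seq in proteins:
        for i in range(len(seq) - 1):
            counts[seq[i:i + 2]] += 1
    total = sum(counts.values())
    if total == 0:
        return {}
    return {pair: 1000.0 * n / total for pair, n in counts.items()}


def colour(value_per_mille):
    """Hypothetical labelling: 'red' above the uniform expectation, 'blue' below."""
    if value_per_mille > UNIFORM_PER_MILLE:
        return "red"
    if value_per_mille < UNIFORM_PER_MILLE:
        return "blue"
    return "neutral"


# Toy input (hypothetical, not the Hamulus proteome).
toy = ["MAAG", "AAK"]
freqs = dipeptide_per_mille(toy)
```

With the toy input there are five dipeptides in total (MA, AA, AG, AA, AK), so AA accounts for two of the five. Applied to a value from the table, `colour(14.262)` labels AlaAla as overrepresented and `colour(0.933)` labels AlaCys as underrepresented.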

All values are given in per mille (‰) and therefore need to be multiplied by 10⁻³ to obtain fractions.

For more information, see the sequence space article on Wikipedia.

Ala Cys Asp Glu Phe Gly His Ile Lys Leu Met Asn Pro Gln Arg Ser Thr Val Trp Tyr Xaa
Ala
14.262AlaAla: 14.262 ± 1.652
0.933AlaCys: 0.933 ± 0.22
7.076AlaAsp: 7.076 ± 0.703
7.186AlaGlu: 7.186 ± 0.779
2.578AlaPhe: 2.578 ± 0.381
9.819AlaGly: 9.819 ± 1.161
1.81AlaHis: 1.81 ± 0.279
3.95AlaIle: 3.95 ± 0.451
4.059AlaLys: 4.059 ± 0.531
7.954AlaLeu: 7.954 ± 0.763
2.743AlaMet: 2.743 ± 0.482
2.414AlaAsn: 2.414 ± 0.333
5.101AlaPro: 5.101 ± 0.586
3.346AlaGln: 3.346 ± 0.402
7.241AlaArg: 7.241 ± 0.693
5.376AlaSer: 5.376 ± 0.553
5.65AlaThr: 5.65 ± 0.52
7.076AlaVal: 7.076 ± 0.601
2.798AlaTrp: 2.798 ± 0.374
2.249AlaTyr: 2.249 ± 0.337
0.0AlaXaa: 0.0 ± 0.0
Cys
0.768CysAla: 0.768 ± 0.226
0.0CysCys: 0.0 ± 0.0
1.755CysAsp: 1.755 ± 0.307
0.987CysGlu: 0.987 ± 0.239
0.219CysPhe: 0.219 ± 0.104
1.646CysGly: 1.646 ± 0.364
0.219CysHis: 0.219 ± 0.12
0.11CysIle: 0.11 ± 0.089
0.549CysLys: 0.549 ± 0.177
0.768CysLeu: 0.768 ± 0.273
0.274CysMet: 0.274 ± 0.107
0.439CysAsn: 0.439 ± 0.146
1.426CysPro: 1.426 ± 0.335
0.439CysGln: 0.439 ± 0.159
0.878CysArg: 0.878 ± 0.264
0.494CysSer: 0.494 ± 0.164
0.658CysThr: 0.658 ± 0.189
0.713CysVal: 0.713 ± 0.214
0.329CysTrp: 0.329 ± 0.127
0.11CysTyr: 0.11 ± 0.084
0.0CysXaa: 0.0 ± 0.0
Asp
7.131AspAla: 7.131 ± 0.506
1.207AspCys: 1.207 ± 0.3
5.047AspAsp: 5.047 ± 0.566
3.456AspGlu: 3.456 ± 0.456
2.194AspPhe: 2.194 ± 0.347
6.692AspGly: 6.692 ± 0.796
1.152AspHis: 1.152 ± 0.23
2.468AspIle: 2.468 ± 0.334
1.81AspLys: 1.81 ± 0.305
6.253AspLeu: 6.253 ± 0.648
1.152AspMet: 1.152 ± 0.246
2.139AspAsn: 2.139 ± 0.337
5.376AspPro: 5.376 ± 0.679
2.359AspGln: 2.359 ± 0.374
5.266AspArg: 5.266 ± 0.661
3.127AspSer: 3.127 ± 0.544
4.334AspThr: 4.334 ± 0.462
4.608AspVal: 4.608 ± 0.626
1.426AspTrp: 1.426 ± 0.287
2.249AspTyr: 2.249 ± 0.357
0.0AspXaa: 0.0 ± 0.0
Glu
5.979GluAla: 5.979 ± 0.613
1.042GluCys: 1.042 ± 0.282
2.743GluAsp: 2.743 ± 0.366
2.688GluGlu: 2.688 ± 0.526
1.92GluPhe: 1.92 ± 0.309
4.114GluGly: 4.114 ± 0.497
1.536GluHis: 1.536 ± 0.357
2.249GluIle: 2.249 ± 0.433
1.975GluLys: 1.975 ± 0.314
5.595GluLeu: 5.595 ± 0.567
1.591GluMet: 1.591 ± 0.312
2.139GluAsn: 2.139 ± 0.293
2.852GluPro: 2.852 ± 0.429
2.578GluGln: 2.578 ± 0.425
4.553GluArg: 4.553 ± 0.624
2.688GluSer: 2.688 ± 0.421
3.95GluThr: 3.95 ± 0.5
3.73GluVal: 3.73 ± 0.485
1.207GluTrp: 1.207 ± 0.234
1.755GluTyr: 1.755 ± 0.331
0.0GluXaa: 0.0 ± 0.0
Phe
2.907PheAla: 2.907 ± 0.371
0.274PheCys: 0.274 ± 0.138
2.359PheAsp: 2.359 ± 0.351
1.481PheGlu: 1.481 ± 0.29
0.933PhePhe: 0.933 ± 0.258
3.182PheGly: 3.182 ± 0.625
0.439PheHis: 0.439 ± 0.147
1.536PheIle: 1.536 ± 0.351
1.207PheLys: 1.207 ± 0.262
1.481PheLeu: 1.481 ± 0.244
0.549PheMet: 0.549 ± 0.171
1.371PheAsn: 1.371 ± 0.355
1.81PhePro: 1.81 ± 0.298
1.152PheGln: 1.152 ± 0.326
1.481PheArg: 1.481 ± 0.253
1.81PheSer: 1.81 ± 0.287
2.578PheThr: 2.578 ± 0.43
1.975PheVal: 1.975 ± 0.3
0.603PheTrp: 0.603 ± 0.173
0.933PheTyr: 0.933 ± 0.257
0.0PheXaa: 0.0 ± 0.0
Gly
8.557GlyAla: 8.557 ± 0.913
0.878GlyCys: 0.878 ± 0.214
6.802GlyAsp: 6.802 ± 0.537
3.675GlyGlu: 3.675 ± 0.54
2.962GlyPhe: 2.962 ± 0.417
10.861GlyGly: 10.861 ± 2.869
2.194GlyHis: 2.194 ± 0.4
4.498GlyIle: 4.498 ± 0.582
2.633GlyLys: 2.633 ± 0.341
6.199GlyLeu: 6.199 ± 0.565
2.194GlyMet: 2.194 ± 0.453
3.291GlyAsn: 3.291 ± 0.424
4.334GlyPro: 4.334 ± 0.581
2.523GlyGln: 2.523 ± 0.523
5.54GlyArg: 5.54 ± 0.783
5.924GlySer: 5.924 ± 1.067
6.418GlyThr: 6.418 ± 0.782
5.869GlyVal: 5.869 ± 0.614
2.468GlyTrp: 2.468 ± 0.336
2.249GlyTyr: 2.249 ± 0.46
0.0GlyXaa: 0.0 ± 0.0
His
1.536HisAla: 1.536 ± 0.351
0.494HisCys: 0.494 ± 0.171
1.262HisAsp: 1.262 ± 0.326
1.317HisGlu: 1.317 ± 0.268
0.494HisPhe: 0.494 ± 0.138
1.92HisGly: 1.92 ± 0.309
1.097HisHis: 1.097 ± 0.302
1.262HisIle: 1.262 ± 0.255
0.713HisLys: 0.713 ± 0.201
1.207HisLeu: 1.207 ± 0.261
0.549HisMet: 0.549 ± 0.146
0.713HisAsn: 0.713 ± 0.212
1.81HisPro: 1.81 ± 0.357
0.823HisGln: 0.823 ± 0.189
2.03HisArg: 2.03 ± 0.419
0.878HisSer: 0.878 ± 0.201
1.426HisThr: 1.426 ± 0.371
1.262HisVal: 1.262 ± 0.243
0.713HisTrp: 0.713 ± 0.222
0.768HisTyr: 0.768 ± 0.228
0.0HisXaa: 0.0 ± 0.0
Ile
5.705IleAla: 5.705 ± 0.573
0.658IleCys: 0.658 ± 0.213
3.73IleAsp: 3.73 ± 0.462
3.291IleGlu: 3.291 ± 0.368
0.768IlePhe: 0.768 ± 0.235
3.95IleGly: 3.95 ± 0.516
1.81IleHis: 1.81 ± 0.365
1.7IleIle: 1.7 ± 0.295
0.878IleLys: 0.878 ± 0.208
2.304IleLeu: 2.304 ± 0.299
0.439IleMet: 0.439 ± 0.16
2.03IleAsn: 2.03 ± 0.273
2.468IlePro: 2.468 ± 0.37
1.317IleGln: 1.317 ± 0.246
2.414IleArg: 2.414 ± 0.431
1.975IleSer: 1.975 ± 0.367
3.401IleThr: 3.401 ± 0.432
2.688IleVal: 2.688 ± 0.38
0.768IleTrp: 0.768 ± 0.207
0.933IleTyr: 0.933 ± 0.25
0.0IleXaa: 0.0 ± 0.0
Lys
3.456LysAla: 3.456 ± 0.421
0.384LysCys: 0.384 ± 0.162
1.591LysAsp: 1.591 ± 0.293
1.646LysGlu: 1.646 ± 0.311
1.097LysPhe: 1.097 ± 0.19
2.468LysGly: 2.468 ± 0.352
1.042LysHis: 1.042 ± 0.272
0.987LysIle: 0.987 ± 0.268
1.591LysLys: 1.591 ± 0.391
2.578LysLeu: 2.578 ± 0.416
0.768LysMet: 0.768 ± 0.211
0.878LysAsn: 0.878 ± 0.221
2.578LysPro: 2.578 ± 0.456
1.481LysGln: 1.481 ± 0.236
2.633LysArg: 2.633 ± 0.425
1.865LysSer: 1.865 ± 0.295
2.084LysThr: 2.084 ± 0.333
2.468LysVal: 2.468 ± 0.414
1.042LysTrp: 1.042 ± 0.286
0.987LysTyr: 0.987 ± 0.238
0.0LysXaa: 0.0 ± 0.0
Leu
8.173LeuAla: 8.173 ± 0.785
0.823LeuCys: 0.823 ± 0.216
5.376LeuAsp: 5.376 ± 0.585
3.895LeuGlu: 3.895 ± 0.471
2.523LeuPhe: 2.523 ± 0.329
5.156LeuGly: 5.156 ± 0.491
1.097LeuHis: 1.097 ± 0.242
3.291LeuIle: 3.291 ± 0.438
2.03LeuLys: 2.03 ± 0.332
5.266LeuLeu: 5.266 ± 0.674
1.371LeuMet: 1.371 ± 0.253
2.468LeuAsn: 2.468 ± 0.403
5.266LeuPro: 5.266 ± 0.663
2.414LeuGln: 2.414 ± 0.416
5.485LeuArg: 5.485 ± 0.642
5.376LeuSer: 5.376 ± 0.525
5.65LeuThr: 5.65 ± 0.513
4.827LeuVal: 4.827 ± 0.535
1.426LeuTrp: 1.426 ± 0.294
1.865LeuTyr: 1.865 ± 0.328
0.0LeuXaa: 0.0 ± 0.0
Met
2.194MetAla: 2.194 ± 0.382
0.219MetCys: 0.219 ± 0.158
1.042MetAsp: 1.042 ± 0.251
1.042MetGlu: 1.042 ± 0.235
0.987MetPhe: 0.987 ± 0.244
1.92MetGly: 1.92 ± 0.265
0.165MetHis: 0.165 ± 0.106
0.823MetIle: 0.823 ± 0.208
0.933MetLys: 0.933 ± 0.234
1.536MetLeu: 1.536 ± 0.236
0.549MetMet: 0.549 ± 0.232
0.987MetAsn: 0.987 ± 0.225
1.481MetPro: 1.481 ± 0.29
0.603MetGln: 0.603 ± 0.157
1.262MetArg: 1.262 ± 0.246
2.743MetSer: 2.743 ± 0.357
2.468MetThr: 2.468 ± 0.318
1.426MetVal: 1.426 ± 0.323
0.219MetTrp: 0.219 ± 0.113
0.384MetTyr: 0.384 ± 0.174
0.0MetXaa: 0.0 ± 0.0
Asn
3.675AsnAla: 3.675 ± 0.419
0.165AsnCys: 0.165 ± 0.093
2.414AsnAsp: 2.414 ± 0.259
1.7AsnGlu: 1.7 ± 0.338
0.878AsnPhe: 0.878 ± 0.286
4.224AsnGly: 4.224 ± 0.478
0.878AsnHis: 0.878 ± 0.201
1.646AsnIle: 1.646 ± 0.444
1.097AsnLys: 1.097 ± 0.242
2.359AsnLeu: 2.359 ± 0.367
0.713AsnMet: 0.713 ± 0.202
1.7AsnAsn: 1.7 ± 0.369
2.798AsnPro: 2.798 ± 0.488
1.317AsnGln: 1.317 ± 0.361
2.139AsnArg: 2.139 ± 0.419
1.317AsnSer: 1.317 ± 0.218
2.304AsnThr: 2.304 ± 0.287
1.865AsnVal: 1.865 ± 0.32
0.768AsnTrp: 0.768 ± 0.15
0.494AsnTyr: 0.494 ± 0.152
0.0AsnXaa: 0.0 ± 0.0
Pro
4.882ProAla: 4.882 ± 0.51
0.878ProCys: 0.878 ± 0.214
5.266ProAsp: 5.266 ± 0.576
4.059ProGlu: 4.059 ± 0.377
1.92ProPhe: 1.92 ± 0.368
6.583ProGly: 6.583 ± 0.708
1.426ProHis: 1.426 ± 0.27
2.249ProIle: 2.249 ± 0.342
2.852ProLys: 2.852 ± 0.463
4.279ProLeu: 4.279 ± 0.561
1.426ProMet: 1.426 ± 0.315
2.523ProAsn: 2.523 ± 0.413
3.84ProPro: 3.84 ± 0.597
1.81ProGln: 1.81 ± 0.37
3.62ProArg: 3.62 ± 0.547
3.456ProSer: 3.456 ± 0.447
3.291ProThr: 3.291 ± 0.408
4.937ProVal: 4.937 ± 0.525
1.152ProTrp: 1.152 ± 0.258
1.646ProTyr: 1.646 ± 0.289
0.0ProXaa: 0.0 ± 0.0
Gln
4.224GlnAla: 4.224 ± 0.637
0.384GlnCys: 0.384 ± 0.18
1.481GlnAsp: 1.481 ± 0.268
1.7GlnGlu: 1.7 ± 0.329
1.042GlnPhe: 1.042 ± 0.238
2.249GlnGly: 2.249 ± 0.408
0.987GlnHis: 0.987 ± 0.276
1.591GlnIle: 1.591 ± 0.254
1.207GlnLys: 1.207 ± 0.267
3.127GlnLeu: 3.127 ± 0.414
0.603GlnMet: 0.603 ± 0.212
0.878GlnAsn: 0.878 ± 0.24
2.414GlnPro: 2.414 ± 0.381
1.317GlnGln: 1.317 ± 0.284
2.578GlnArg: 2.578 ± 0.354
2.468GlnSer: 2.468 ± 0.356
2.03GlnThr: 2.03 ± 0.297
2.139GlnVal: 2.139 ± 0.366
0.713GlnTrp: 0.713 ± 0.163
0.768GlnTyr: 0.768 ± 0.252
0.0GlnXaa: 0.0 ± 0.0
Arg
7.131ArgAla: 7.131 ± 0.719
1.207ArgCys: 1.207 ± 0.327
4.388ArgAsp: 4.388 ± 0.566
4.553ArgGlu: 4.553 ± 0.607
2.084ArgPhe: 2.084 ± 0.346
4.498ArgGly: 4.498 ± 0.46
1.481ArgHis: 1.481 ± 0.342
3.675ArgIle: 3.675 ± 0.59
2.194ArgLys: 2.194 ± 0.351
4.663ArgLeu: 4.663 ± 0.553
2.523ArgMet: 2.523 ± 0.399
2.907ArgAsn: 2.907 ± 0.477
3.84ArgPro: 3.84 ± 0.409
2.359ArgGln: 2.359 ± 0.355
5.869ArgArg: 5.869 ± 0.742
3.456ArgSer: 3.456 ± 0.357
3.401ArgThr: 3.401 ± 0.553
5.431ArgVal: 5.431 ± 0.53
1.755ArgTrp: 1.755 ± 0.33
1.591ArgTyr: 1.591 ± 0.295
0.0ArgXaa: 0.0 ± 0.0
Ser
5.156SerAla: 5.156 ± 0.658
0.603SerCys: 0.603 ± 0.189
4.114SerAsp: 4.114 ± 0.459
2.962SerGlu: 2.962 ± 0.415
1.975SerPhe: 1.975 ± 0.382
6.418SerGly: 6.418 ± 0.92
1.097SerHis: 1.097 ± 0.226
2.852SerIle: 2.852 ± 0.449
2.359SerLys: 2.359 ± 0.425
3.785SerLeu: 3.785 ± 0.466
1.426SerMet: 1.426 ± 0.279
1.7SerAsn: 1.7 ± 0.341
3.456SerPro: 3.456 ± 0.442
1.646SerGln: 1.646 ± 0.243
3.73SerArg: 3.73 ± 0.503
3.566SerSer: 3.566 ± 0.695
3.84SerThr: 3.84 ± 0.538
4.059SerVal: 4.059 ± 0.542
1.262SerTrp: 1.262 ± 0.27
1.152SerTyr: 1.152 ± 0.253
0.0SerXaa: 0.0 ± 0.0
Thr
6.034ThrAla: 6.034 ± 0.626
0.768ThrCys: 0.768 ± 0.265
3.95ThrAsp: 3.95 ± 0.622
3.73ThrGlu: 3.73 ± 0.406
1.92ThrPhe: 1.92 ± 0.345
5.76ThrGly: 5.76 ± 0.598
1.536ThrHis: 1.536 ± 0.303
3.511ThrIle: 3.511 ± 0.435
1.755ThrLys: 1.755 ± 0.33
5.101ThrLeu: 5.101 ± 0.558
1.426ThrMet: 1.426 ± 0.272
2.414ThrAsn: 2.414 ± 0.352
5.156ThrPro: 5.156 ± 0.498
1.865ThrGln: 1.865 ± 0.295
4.059ThrArg: 4.059 ± 0.458
4.114ThrSer: 4.114 ± 0.483
4.882ThrThr: 4.882 ± 0.597
5.54ThrVal: 5.54 ± 0.577
1.097ThrTrp: 1.097 ± 0.242
1.865ThrTyr: 1.865 ± 0.326
0.0ThrXaa: 0.0 ± 0.0
Val
7.131ValAla: 7.131 ± 0.579
1.152ValCys: 1.152 ± 0.255
5.65ValAsp: 5.65 ± 0.52
4.608ValGlu: 4.608 ± 0.574
2.249ValPhe: 2.249 ± 0.377
5.815ValGly: 5.815 ± 0.652
1.207ValHis: 1.207 ± 0.276
2.688ValIle: 2.688 ± 0.397
2.359ValLys: 2.359 ± 0.385
5.485ValLeu: 5.485 ± 0.569
1.536ValMet: 1.536 ± 0.267
2.139ValAsn: 2.139 ± 0.313
3.73ValPro: 3.73 ± 0.412
2.688ValGln: 2.688 ± 0.401
4.224ValArg: 4.224 ± 0.602
4.059ValSer: 4.059 ± 0.466
4.827ValThr: 4.827 ± 0.561
5.869ValVal: 5.869 ± 0.716
1.865ValTrp: 1.865 ± 0.369
1.371ValTyr: 1.371 ± 0.261
0.0ValXaa: 0.0 ± 0.0
Trp
2.03TrpAla: 2.03 ± 0.316
0.274TrpCys: 0.274 ± 0.141
1.536TrpAsp: 1.536 ± 0.303
1.097TrpGlu: 1.097 ± 0.305
0.713TrpPhe: 0.713 ± 0.18
0.987TrpGly: 0.987 ± 0.24
0.878TrpHis: 0.878 ± 0.246
1.207TrpIle: 1.207 ± 0.202
0.768TrpLys: 0.768 ± 0.185
1.755TrpLeu: 1.755 ± 0.347
0.878TrpMet: 0.878 ± 0.225
0.658TrpAsn: 0.658 ± 0.238
1.097TrpPro: 1.097 ± 0.251
0.933TrpGln: 0.933 ± 0.235
2.084TrpArg: 2.084 ± 0.374
1.426TrpSer: 1.426 ± 0.274
1.591TrpThr: 1.591 ± 0.273
2.03TrpVal: 2.03 ± 0.414
0.768TrpTrp: 0.768 ± 0.186
0.329TrpTyr: 0.329 ± 0.13
0.0TrpXaa: 0.0 ± 0.0
Tyr
2.688TyrAla: 2.688 ± 0.363
0.439TyrCys: 0.439 ± 0.173
1.865TyrAsp: 1.865 ± 0.385
1.81TyrGlu: 1.81 ± 0.301
0.603TyrPhe: 0.603 ± 0.192
1.81TyrGly: 1.81 ± 0.342
0.219TyrHis: 0.219 ± 0.095
0.933TyrIle: 0.933 ± 0.211
0.713TyrLys: 0.713 ± 0.185
2.084TyrLeu: 2.084 ± 0.368
0.219TyrMet: 0.219 ± 0.106
0.658TyrAsn: 0.658 ± 0.152
1.262TyrPro: 1.262 ± 0.227
0.933TyrGln: 0.933 ± 0.219
1.92TyrArg: 1.92 ± 0.344
1.042TyrSer: 1.042 ± 0.259
1.7TyrThr: 1.7 ± 0.36
2.139TyrVal: 2.139 ± 0.339
0.658TyrTrp: 0.658 ± 0.221
0.768TyrTyr: 0.768 ± 0.183
0.0TyrXaa: 0.0 ± 0.0
Xaa
0.0XaaAla: 0.0 ± 0.0
0.0XaaCys: 0.0 ± 0.0
0.0XaaAsp: 0.0 ± 0.0
0.0XaaGlu: 0.0 ± 0.0
0.0XaaPhe: 0.0 ± 0.0
0.0XaaGly: 0.0 ± 0.0
0.0XaaHis: 0.0 ± 0.0
0.0XaaIle: 0.0 ± 0.0
0.0XaaLys: 0.0 ± 0.0
0.0XaaLeu: 0.0 ± 0.0
0.0XaaMet: 0.0 ± 0.0
0.0XaaAsn: 0.0 ± 0.0
0.0XaaPro: 0.0 ± 0.0
0.0XaaGln: 0.0 ± 0.0
0.0XaaArg: 0.0 ± 0.0
0.0XaaSer: 0.0 ± 0.0
0.0XaaThr: 0.0 ± 0.0
0.0XaaVal: 0.0 ± 0.0
0.0XaaTrp: 0.0 ± 0.0
0.0XaaTyr: 0.0 ± 0.0
0.0XaaXaa: 0.0 ± 0.0
Statistics based on 105 proteins (18231 amino acids)

Note: The error was estimated by bootstrapping (×100) at the protein level.
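A protein-level bootstrap of this kind can be sketched as follows. This is only an illustration under stated assumptions, not the code used by Proteome-pI: the function names and toy sequences are hypothetical, whole proteins are resampled with replacement, and the reported error is taken to be the sample standard deviation over the replicates (the page does not say which spread measure the ± values represent).

```python
import random
import statistics
from collections import Counter


def pair_per_mille(proteins, pair):
    """Frequency of one dipeptide in per mille over overlapping windows."""
    counts = Counter()
    for seq in proteins:
        for i in range(len(seq) - 1):
            counts[seq[i:i + 2]] += 1
    total = sum(counts.values())
    return 1000.0 * counts[pair] / total if total else 0.0


def bootstrap_error(proteins, pair, replicates=100, seed=0):
    """Spread of the dipeptide frequency across bootstrap replicates.

    Assumption: each replicate draws len(proteins) whole proteins with
    replacement, and the error is the sample standard deviation.
    """
    rng = random.Random(seed)
    estimates = []
    for _ in range(replicates):
        sample = rng.choices(proteins, k=len(proteins))
        estimates.append(pair_per_mille(sample, pair))
    return statistics.stdev(estimates)


# Toy input (hypothetical, not the Hamulus proteome).
toy = ["MAAG", "AAK", "MKKL"]
err = bootstrap_error(toy, "AA")
```

Resampling proteins rather than individual residues keeps each sequence intact, so the estimate reflects variation between proteins. If every protein were identical, all replicates would give the same frequency and the error would be zero.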

The dipeptide statistics above (along with other statistics for this proteome) can be downloaded from this CSV file
See this proteome in: uniprot_link
Proteome-pI is available under a Creative Commons Attribution-NoDerivs license; for more details see here

Reference: Kozlowski LP. Proteome-pI 2.0: Proteome Isoelectric Point Database Update. Nucleic Acids Res. 2021, doi: 10.1093/nar/gkab944
Contact: Lukasz P. Kozlowski