Amino acid dipeptide frequency for Mycobacterium phage Theia

Apart from single amino acid frequencies, one can also calculate the so-called amino acid dipeptide frequency.

There are 400 possible dipeptides (441 if Xaa is counted as an additional, 21st amino acid representing all non-standard or unknown amino acids). Thus, if dipeptides occurred randomly in proteins, with all 400 equally likely, each amino acid dipeptide would be present at a frequency of 0.25% (2.5‰). As this is not the case in nature, for better visibility dipeptides that are more frequent than expected are marked in red in the table, and those that are underrepresented are marked in blue.
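
As a minimal sketch of how such a table could be computed (not necessarily the procedure used by Proteome-pI), the Python snippet below counts overlapping dipeptides within each protein, normalises by the total number of dipeptides counted, and compares a frequency with the uniform expectation of 1/400. The function names, the one-letter input alphabet, the mapping of non-standard residues to X, and the normalisation are all assumptions made for illustration.

```python
from collections import Counter
from itertools import product

STANDARD = "ACDEFGHIKLMNPQRSTVWY"  # 20 standard amino acids, one-letter codes


def dipeptide_frequencies(proteins):
    """Return {dipeptide: frequency in per mille} for a list of sequences.

    Assumptions (hypothetical, not taken from the source page): overlapping
    pairs are counted inside each protein only, never across protein
    boundaries, and any non-standard residue is mapped to 'X' (Xaa).
    """
    counts = Counter()
    for seq in proteins:
        cleaned = "".join(c if c in STANDARD else "X" for c in seq.upper())
        counts.update(cleaned[i:i + 2] for i in range(len(cleaned) - 1))
    total = sum(counts.values())
    alphabet = STANDARD + "X"
    return {
        a + b: (1000.0 * counts[a + b] / total if total else 0.0)
        for a, b in product(alphabet, repeat=2)
    }


def classify(freq_permille, n_dipeptides=400):
    """Label a frequency relative to the uniform expectation (2.5 per mille for 400)."""
    expected = 1000.0 / n_dipeptides
    if freq_permille > expected:
        return "over"
    if freq_permille < expected:
        return "under"
    return "expected"


# Toy input: two short sequences, the second ending in an unknown residue.
freqs = dipeptide_frequencies(["AAGL", "AAX"])
print(len(freqs), round(freqs["AA"], 1), classify(freqs["AA"]))
```

With real data, `proteins` would hold the sequences of the whole proteome, and `classify` would decide which cells to highlight as over- or underrepresented.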

All values are given in per mille (‰) and therefore need to be multiplied by 10⁻³ to obtain fractions.
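
The unit conversion can be sketched in a few lines of Python. The helper name is hypothetical, and the sample value is the AlaAla entry from the table on this page.

```python
def permille_to_fraction(value_permille):
    """Convert a per mille value to a plain fraction (multiply by 10**-3)."""
    return value_permille * 1e-3


# Uniform expectation for 400 dipeptides: 0.25 %, i.e. 2.5 per mille.
UNIFORM_PERMILLE = 1000.0 / 400

ala_ala = 11.026  # AlaAla frequency from the table, in per mille
print(permille_to_fraction(ala_ala))  # fraction of all dipeptides
print(ala_ala / UNIFORM_PERMILLE)     # fold difference versus the uniform expectation
```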

For more information, see the sequence space article on Wikipedia.

Rows give the first residue of the dipeptide, columns the second; each cell is frequency ± error (‰).

| | Ala | Cys | Asp | Glu | Phe | Gly | His | Ile | Lys | Leu | Met | Asn | Pro | Gln | Arg | Ser | Thr | Val | Trp | Tyr | Xaa |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Ala | 11.026 ± 1.154 | 0.376 ± 0.131 | 6.453 ± 0.676 | 7.017 ± 0.903 | 3.007 ± 0.508 | 7.643 ± 0.838 | 1.065 ± 0.252 | 4.448 ± 0.539 | 5.137 ± 0.755 | 7.393 ± 0.766 | 2.005 ± 0.393 | 2.819 ± 0.426 | 4.448 ± 0.56 | 3.007 ± 0.435 | 5.45 ± 0.611 | 4.636 ± 0.613 | 5.513 ± 0.601 | 7.455 ± 0.867 | 1.692 ± 0.276 | 2.193 ± 0.357 | 0.0 ± 0.0 |
| Cys | 0.564 ± 0.172 | 0.0 ± 0.0 | 0.689 ± 0.213 | 0.689 ± 0.279 | 0.376 ± 0.156 | 0.877 ± 0.233 | 0.188 ± 0.102 | 0.439 ± 0.147 | 0.376 ± 0.143 | 0.877 ± 0.243 | 0.063 ± 0.068 | 0.439 ± 0.198 | 0.251 ± 0.115 | 0.439 ± 0.185 | 0.564 ± 0.214 | 0.501 ± 0.196 | 0.251 ± 0.126 | 0.501 ± 0.184 | 0.376 ± 0.143 | 0.188 ± 0.116 | 0.0 ± 0.0 |
| Asp | 6.202 ± 0.743 | 0.94 ± 0.277 | 4.385 ± 0.538 | 4.824 ± 0.622 | 2.757 ± 0.439 | 5.826 ± 0.538 | 1.504 ± 0.356 | 2.757 ± 0.437 | 2.694 ± 0.469 | 6.453 ± 0.768 | 1.879 ± 0.277 | 1.879 ± 0.395 | 4.636 ± 0.7 | 2.694 ± 0.469 | 3.759 ± 0.463 | 3.07 ± 0.481 | 3.696 ± 0.428 | 6.014 ± 0.549 | 1.316 ± 0.311 | 2.757 ± 0.39 | 0.0 ± 0.0 |
| Glu | 7.142 ± 0.687 | 0.501 ± 0.186 | 5.262 ± 0.643 | 6.077 ± 0.619 | 2.005 ± 0.456 | 5.638 ± 0.57 | 1.253 ± 0.297 | 3.759 ± 0.465 | 3.696 ± 0.444 | 7.831 ± 0.759 | 2.005 ± 0.349 | 2.067 ± 0.354 | 3.007 ± 0.411 | 2.819 ± 0.471 | 4.699 ± 0.688 | 2.819 ± 0.431 | 3.446 ± 0.462 | 6.202 ± 0.536 | 1.754 ± 0.419 | 2.694 ± 0.543 | 0.0 ± 0.0 |
| Phe | 3.258 ± 0.522 | 0.188 ± 0.165 | 3.195 ± 0.485 | 2.13 ± 0.354 | 0.814 ± 0.243 | 3.258 ± 0.394 | 0.501 ± 0.179 | 1.441 ± 0.311 | 1.817 ± 0.324 | 3.132 ± 0.513 | 0.376 ± 0.145 | 1.629 ± 0.323 | 1.378 ± 0.335 | 1.19 ± 0.274 | 1.942 ± 0.318 | 1.629 ± 0.298 | 1.754 ± 0.3 | 2.318 ± 0.369 | 0.877 ± 0.309 | 1.002 ± 0.234 | 0.0 ± 0.0 |
| Gly | 6.14 ± 0.932 | 0.752 ± 0.241 | 5.889 ± 0.705 | 5.701 ± 0.513 | 2.757 ± 0.452 | 9.773 ± 2.292 | 1.942 ± 0.287 | 4.072 ± 0.494 | 4.511 ± 0.663 | 7.079 ± 0.75 | 2.13 ± 0.405 | 3.446 ± 0.421 | 3.884 ± 0.567 | 2.819 ± 0.463 | 4.636 ± 0.64 | 4.887 ± 0.59 | 4.699 ± 0.564 | 5.325 ± 0.648 | 1.629 ± 0.298 | 3.195 ± 0.606 | 0.0 ± 0.0 |
| His | 1.754 ± 0.331 | 0.125 ± 0.095 | 1.629 ± 0.321 | 1.378 ± 0.299 | 0.564 ± 0.152 | 2.005 ± 0.363 | 0.689 ± 0.202 | 1.253 ± 0.303 | 0.689 ± 0.22 | 1.566 ± 0.348 | 0.063 ± 0.057 | 0.376 ± 0.138 | 0.94 ± 0.229 | 0.814 ± 0.206 | 1.504 ± 0.3 | 0.689 ± 0.185 | 0.814 ± 0.211 | 1.566 ± 0.325 | 0.313 ± 0.166 | 0.626 ± 0.215 | 0.0 ± 0.0 |
| Ile | 5.2 ± 0.639 | 0.439 ± 0.149 | 4.26 ± 0.447 | 4.699 ± 0.555 | 1.316 ± 0.371 | 3.759 ± 0.466 | 1.065 ± 0.226 | 2.193 ± 0.357 | 2.13 ± 0.36 | 3.508 ± 0.453 | 0.689 ± 0.193 | 2.193 ± 0.394 | 3.634 ± 0.465 | 1.253 ± 0.246 | 3.195 ± 0.484 | 2.694 ± 0.392 | 2.819 ± 0.449 | 2.944 ± 0.447 | 0.564 ± 0.177 | 1.378 ± 0.313 | 0.0 ± 0.0 |
| Lys | 5.45 ± 0.735 | 0.251 ± 0.141 | 2.882 ± 0.367 | 3.007 ± 0.455 | 1.504 ± 0.273 | 3.258 ± 0.445 | 0.877 ± 0.25 | 2.506 ± 0.435 | 2.819 ± 0.447 | 3.884 ± 0.411 | 1.316 ± 0.258 | 1.065 ± 0.307 | 2.255 ± 0.373 | 1.441 ± 0.326 | 2.318 ± 0.458 | 2.005 ± 0.399 | 3.132 ± 0.443 | 4.135 ± 0.562 | 1.065 ± 0.247 | 1.19 ± 0.317 | 0.0 ± 0.0 |
| Leu | 7.894 ± 0.757 | 0.689 ± 0.239 | 6.39 ± 0.643 | 6.453 ± 0.647 | 2.13 ± 0.355 | 5.701 ± 0.646 | 1.002 ± 0.303 | 3.884 ± 0.486 | 4.573 ± 0.643 | 5.764 ± 0.6 | 2.255 ± 0.318 | 2.819 ± 0.495 | 3.822 ± 0.61 | 2.569 ± 0.601 | 5.45 ± 0.571 | 5.262 ± 0.609 | 5.262 ± 0.684 | 6.328 ± 0.659 | 1.253 ± 0.3 | 2.819 ± 0.486 | 0.0 ± 0.0 |
| Met | 1.942 ± 0.405 | 0.313 ± 0.119 | 1.316 ± 0.345 | 1.19 ± 0.253 | 1.002 ± 0.299 | 1.629 ± 0.334 | 0.376 ± 0.172 | 0.626 ± 0.209 | 1.504 ± 0.262 | 2.067 ± 0.359 | 0.313 ± 0.181 | 0.814 ± 0.248 | 1.065 ± 0.228 | 0.501 ± 0.177 | 1.754 ± 0.347 | 2.193 ± 0.435 | 2.631 ± 0.355 | 1.253 ± 0.32 | 0.313 ± 0.125 | 0.564 ± 0.208 | 0.0 ± 0.0 |
| Asn | 2.381 ± 0.464 | 0.439 ± 0.183 | 2.193 ± 0.35 | 2.318 ± 0.45 | 1.817 ± 0.362 | 3.947 ± 0.612 | 0.752 ± 0.197 | 1.692 ± 0.413 | 0.752 ± 0.179 | 3.007 ± 0.444 | 0.814 ± 0.288 | 1.002 ± 0.235 | 2.193 ± 0.42 | 0.814 ± 0.219 | 1.754 ± 0.408 | 1.879 ± 0.325 | 2.067 ± 0.407 | 2.067 ± 0.376 | 0.752 ± 0.231 | 0.626 ± 0.192 | 0.0 ± 0.0 |
| Pro | 4.448 ± 0.599 | 0.376 ± 0.178 | 3.07 ± 0.463 | 4.26 ± 0.47 | 1.817 ± 0.408 | 4.573 ± 0.663 | 0.94 ± 0.271 | 2.694 ± 0.37 | 1.692 ± 0.447 | 3.195 ± 0.528 | 1.002 ± 0.262 | 1.754 ± 0.333 | 2.757 ± 0.498 | 1.879 ± 0.359 | 1.942 ± 0.382 | 3.195 ± 0.356 | 4.01 ± 0.625 | 4.26 ± 0.556 | 1.002 ± 0.384 | 1.441 ± 0.305 | 0.0 ± 0.0 |
| Gln | 3.884 ± 0.663 | 0.125 ± 0.088 | 1.504 ± 0.311 | 2.631 ± 0.401 | 1.566 ± 0.333 | 2.381 ± 0.415 | 0.94 ± 0.203 | 2.255 ± 0.364 | 1.253 ± 0.336 | 3.634 ± 0.631 | 0.814 ± 0.254 | 0.752 ± 0.216 | 0.94 ± 0.264 | 1.128 ± 0.294 | 1.692 ± 0.4 | 1.629 ± 0.334 | 1.942 ± 0.324 | 2.631 ± 0.387 | 0.564 ± 0.221 | 1.19 ± 0.304 | 0.0 ± 0.0 |
| Arg | 5.137 ± 0.578 | 0.564 ± 0.225 | 3.07 ± 0.353 | 4.949 ± 0.555 | 3.007 ± 0.458 | 4.385 ± 0.666 | 1.566 ± 0.341 | 3.195 ± 0.559 | 3.195 ± 0.482 | 5.638 ± 0.636 | 1.504 ± 0.28 | 2.005 ± 0.292 | 2.631 ± 0.309 | 1.942 ± 0.374 | 4.511 ± 0.572 | 3.007 ± 0.392 | 2.882 ± 0.512 | 3.571 ± 0.481 | 1.19 ± 0.325 | 1.378 ± 0.318 | 0.0 ± 0.0 |
| Ser | 4.636 ± 0.462 | 0.626 ± 0.23 | 3.634 ± 0.505 | 3.884 ± 0.576 | 2.318 ± 0.339 | 5.638 ± 0.693 | 0.814 ± 0.229 | 2.443 ± 0.337 | 2.193 ± 0.366 | 3.195 ± 0.39 | 1.504 ± 0.305 | 1.879 ± 0.379 | 2.506 ± 0.382 | 2.067 ± 0.399 | 4.135 ± 0.623 | 3.132 ± 0.456 | 3.195 ± 0.456 | 3.759 ± 0.534 | 1.128 ± 0.264 | 1.504 ± 0.266 | 0.0 ± 0.0 |
| Thr | 4.824 ± 0.523 | 0.564 ± 0.213 | 4.197 ± 0.543 | 4.636 ± 0.557 | 2.005 ± 0.372 | 4.761 ± 0.763 | 1.19 ± 0.314 | 2.757 ± 0.381 | 2.569 ± 0.432 | 4.761 ± 0.47 | 1.754 ± 0.32 | 1.942 ± 0.36 | 4.197 ± 0.617 | 2.005 ± 0.363 | 2.819 ± 0.424 | 2.757 ± 0.413 | 3.132 ± 0.499 | 4.385 ± 0.499 | 1.253 ± 0.273 | 2.005 ± 0.377 | 0.0 ± 0.0 |
| Val | 6.641 ± 0.657 | 1.002 ± 0.26 | 5.576 ± 0.604 | 5.513 ± 0.623 | 1.817 ± 0.348 | 6.14 ± 0.668 | 1.629 ± 0.386 | 4.26 ± 0.558 | 3.383 ± 0.496 | 5.2 ± 0.728 | 1.566 ± 0.282 | 2.506 ± 0.401 | 3.634 ± 0.444 | 2.381 ± 0.362 | 4.01 ± 0.511 | 4.824 ± 0.586 | 4.448 ± 0.441 | 6.14 ± 0.742 | 1.504 ± 0.299 | 2.318 ± 0.472 | 0.0 ± 0.0 |
| Trp | 1.879 ± 0.42 | 0.188 ± 0.095 | 1.817 ± 0.289 | 1.002 ± 0.269 | 0.564 ± 0.217 | 1.378 ± 0.325 | 0.501 ± 0.182 | 1.316 ± 0.334 | 0.689 ± 0.203 | 1.378 ± 0.359 | 0.626 ± 0.208 | 0.877 ± 0.24 | 0.752 ± 0.195 | 0.877 ± 0.2 | 0.752 ± 0.208 | 1.692 ± 0.327 | 1.19 ± 0.274 | 1.19 ± 0.214 | 0.439 ± 0.2 | 0.376 ± 0.148 | 0.0 ± 0.0 |
| Tyr | 2.443 ± 0.378 | 0.188 ± 0.132 | 2.631 ± 0.451 | 2.318 ± 0.461 | 0.689 ± 0.249 | 2.694 ± 0.479 | 0.564 ± 0.207 | 1.942 ± 0.358 | 0.689 ± 0.209 | 2.757 ± 0.421 | 0.626 ± 0.207 | 1.002 ± 0.255 | 1.378 ± 0.273 | 0.814 ± 0.173 | 2.443 ± 0.512 | 1.692 ± 0.416 | 1.629 ± 0.329 | 2.443 ± 0.494 | 0.501 ± 0.179 | 0.94 ± 0.259 | 0.0 ± 0.0 |
| Xaa | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 |

Statistics based on 87 proteins (15963 amino acids)

Note: The error was estimated by bootstrapping (×100) at the protein level.
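
A protein-level bootstrap of this kind can be sketched as follows. This is an illustration under stated assumptions, not the database's actual code: whole proteins are resampled with replacement, the frequency is recomputed for each resample, and the error is taken to be the standard deviation across resamples (the page does not say which spread measure is reported). The function names, the toy sequences, and the fixed seed are hypothetical.

```python
import random
import statistics


def dipeptide_permille(proteins, dipeptide):
    """Frequency (per mille) of one dipeptide, counted within proteins."""
    hits = total = 0
    for seq in proteins:
        for i in range(len(seq) - 1):
            total += 1
            hits += seq[i:i + 2] == dipeptide
    return 1000.0 * hits / total if total else 0.0


def bootstrap_error(proteins, dipeptide, n_resamples=100, seed=0):
    """Standard deviation of the frequency over protein-level resamples."""
    rng = random.Random(seed)  # fixed seed only to make the sketch reproducible
    estimates = []
    for _ in range(n_resamples):
        # Draw as many proteins as in the original set, with replacement.
        sample = rng.choices(proteins, k=len(proteins))
        estimates.append(dipeptide_permille(sample, dipeptide))
    return statistics.pstdev(estimates)


proteins = ["MAAGL", "MKAAV", "MGGSA", "MLLAA"]  # toy sequences, one-letter codes
print(dipeptide_permille(proteins, "AA"), bootstrap_error(proteins, "AA"))
```

Resampling whole proteins rather than individual residues keeps each protein's internal composition intact, so the estimated error reflects variation between proteins.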

The above dipeptide statistics (among other statistics for this proteome) can be downloaded from this CSV file
See this proteome in: uniprot_link
Proteome-pI is available under the Creative Commons Attribution-NoDerivs license; for more details see here

Reference: Kozlowski LP. Proteome-pI 2.0: Proteome Isoelectric Point Database Update. Nucleic Acids Res. 2021, doi: 10.1093/nar/gkab944
Contact: Lukasz P. Kozlowski