Amino acid dipeptide frequency for Mycobacterium phage Catera

Apart from single amino acid frequencies, one can also calculate the so-called amino acid dipeptide frequency, i.e. how often each ordered pair of adjacent amino acids occurs.
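As a rough illustration of what such a calculation involves, the sketch below counts overlapping residue pairs in a list of protein sequences and reports them in per mille. The function name `dipeptide_permille`, the toy sequences, the use of one-letter codes, and the choice to normalize by the total number of pairs are all assumptions made for this example; the page does not state how the database performs the normalization.

```python
from collections import Counter


def dipeptide_permille(proteins):
    """Return {dipeptide: frequency in per mille} for a list of sequences.

    Assumption: frequencies are normalized by the total number of
    overlapping pairs; the database may normalize differently.
    """
    counts = Counter()
    for seq in proteins:
        # Overlapping windows of length 2; pairs never span two proteins.
        counts.update(seq[i:i + 2] for i in range(len(seq) - 1))
    total = sum(counts.values())
    return {dp: 1000.0 * n / total for dp, n in counts.items()}


# Hypothetical toy input (one-letter codes), not taken from the proteome.
toy = ["MAAG", "AGA"]
freqs = dipeptide_permille(toy)
print(freqs)
```

The toy input yields five pairs in total (MA, AA, AG, AG, GA), so AG accounts for two fifths of them.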

There are 400 possible dipeptides (441 if Xaa is counted as an additional 21st amino acid representing all non-standard or unknown amino acids). Thus, if all dipeptides occurred in proteins with equal probability, each of the 400 would be expected at a frequency of 0.25% (2.5‰). As this is not the case in nature, for better visibility dipeptides that are more frequent than expected are marked in red, and those that are underrepresented are marked in blue in the table.
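The expectation above is simple arithmetic and can be checked in a few lines. The sketch below computes the uniform expectation for the 20 standard amino acids and labels three values taken from the table as over- or underrepresented. The `classify` helper and its plain greater-than/less-than threshold are assumptions for illustration; the page does not say whether its colouring also takes the bootstrap error into account.

```python
N_STANDARD = 20  # standard amino acids, Xaa excluded

# Uniform expectation: 1 / 400 = 0.25 % = 2.5 per mille
expected_permille = 1000.0 / N_STANDARD ** 2


def classify(observed_permille, expected=expected_permille):
    """Label a dipeptide relative to the uniform expectation (hypothetical rule)."""
    if observed_permille > expected:
        return "over"
    if observed_permille < expected:
        return "under"
    return "as expected"


# Observed values (per mille) copied from the table on this page.
examples = {"AlaAla": 10.517, "CysCys": 0.065, "AspPhe": 2.51}
labels = {name: classify(value) for name, value in examples.items()}
print(expected_permille, labels)
```

Note that a uniform expectation ignores the very different abundances of the individual amino acids; comparing against the product of the two single amino acid frequencies would be a different, stricter baseline.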

All values are given in per mille (‰); multiply them by 10⁻³ to obtain fractions.

For more information, see the sequence space article on Wikipedia.

Ala Cys Asp Glu Phe Gly His Ile Lys Leu Met Asn Pro Gln Arg Ser Thr Val Trp Tyr Xaa
Ala
10.517AlaAla: 10.517 ± 0.66
0.649AlaCys: 0.649 ± 0.127
7.011AlaAsp: 7.011 ± 0.45
6.124AlaGlu: 6.124 ± 0.524
2.618AlaPhe: 2.618 ± 0.266
8.007AlaGly: 8.007 ± 0.569
1.839AlaHis: 1.839 ± 0.169
4.415AlaIle: 4.415 ± 0.349
4.177AlaLys: 4.177 ± 0.403
8.548AlaLeu: 8.548 ± 0.551
2.683AlaMet: 2.683 ± 0.242
3.376AlaAsn: 3.376 ± 0.315
5.302AlaPro: 5.302 ± 0.401
4.566AlaGln: 4.566 ± 0.377
6.341AlaArg: 6.341 ± 0.435
5.453AlaSer: 5.453 ± 0.46
5.497AlaThr: 5.497 ± 0.484
6.406AlaVal: 6.406 ± 0.427
1.515AlaTrp: 1.515 ± 0.186
2.792AlaTyr: 2.792 ± 0.254
0.0AlaXaa: 0.0 ± 0.0
Cys
0.649CysAla: 0.649 ± 0.135
0.065CysCys: 0.065 ± 0.036
0.628CysAsp: 0.628 ± 0.126
0.498CysGlu: 0.498 ± 0.099
0.303CysPhe: 0.303 ± 0.085
0.844CysGly: 0.844 ± 0.151
0.281CysHis: 0.281 ± 0.085
0.411CysIle: 0.411 ± 0.095
0.108CysLys: 0.108 ± 0.054
0.649CysLeu: 0.649 ± 0.136
0.195CysMet: 0.195 ± 0.062
0.151CysAsn: 0.151 ± 0.053
0.584CysPro: 0.584 ± 0.135
0.368CysGln: 0.368 ± 0.085
0.498CysArg: 0.498 ± 0.114
0.757CysSer: 0.757 ± 0.157
0.346CysThr: 0.346 ± 0.086
0.39CysVal: 0.39 ± 0.087
0.216CysTrp: 0.216 ± 0.076
0.238CysTyr: 0.238 ± 0.059
0.0CysXaa: 0.0 ± 0.0
Asp
6.644AspAla: 6.644 ± 0.374
0.498AspCys: 0.498 ± 0.103
5.085AspAsp: 5.085 ± 0.369
5.583AspGlu: 5.583 ± 0.393
2.51AspPhe: 2.51 ± 0.28
5.367AspGly: 5.367 ± 0.31
1.796AspHis: 1.796 ± 0.237
2.943AspIle: 2.943 ± 0.221
2.51AspLys: 2.51 ± 0.207
5.929AspLeu: 5.929 ± 0.403
1.948AspMet: 1.948 ± 0.225
1.926AspAsn: 1.926 ± 0.191
5.432AspPro: 5.432 ± 0.401
2.835AspGln: 2.835 ± 0.23
4.804AspArg: 4.804 ± 0.392
3.311AspSer: 3.311 ± 0.311
3.592AspThr: 3.592 ± 0.345
4.133AspVal: 4.133 ± 0.352
1.558AspTrp: 1.558 ± 0.192
2.748AspTyr: 2.748 ± 0.233
0.0AspXaa: 0.0 ± 0.0
Glu
6.341GluAla: 6.341 ± 0.404
0.563GluCys: 0.563 ± 0.132
4.869GluAsp: 4.869 ± 0.458
4.371GluGlu: 4.371 ± 0.409
1.883GluPhe: 1.883 ± 0.181
3.939GluGly: 3.939 ± 0.289
1.883GluHis: 1.883 ± 0.217
3.289GluIle: 3.289 ± 0.314
2.38GluLys: 2.38 ± 0.223
5.54GluLeu: 5.54 ± 0.408
1.601GluMet: 1.601 ± 0.163
1.536GluAsn: 1.536 ± 0.187
3.311GluPro: 3.311 ± 0.233
2.857GluGln: 2.857 ± 0.257
4.588GluArg: 4.588 ± 0.359
3.073GluSer: 3.073 ± 0.292
3.679GluThr: 3.679 ± 0.28
4.761GluVal: 4.761 ± 0.363
1.428GluTrp: 1.428 ± 0.161
1.753GluTyr: 1.753 ± 0.184
0.0GluXaa: 0.0 ± 0.0
Phe
2.857PheAla: 2.857 ± 0.252
0.346PheCys: 0.346 ± 0.101
2.857PheAsp: 2.857 ± 0.234
1.926PheGlu: 1.926 ± 0.187
0.671PhePhe: 0.671 ± 0.125
2.575PheGly: 2.575 ± 0.251
0.714PheHis: 0.714 ± 0.133
1.32PheIle: 1.32 ± 0.195
1.169PheLys: 1.169 ± 0.152
2.51PheLeu: 2.51 ± 0.262
0.628PheMet: 0.628 ± 0.137
0.931PheAsn: 0.931 ± 0.17
1.515PhePro: 1.515 ± 0.193
1.407PheGln: 1.407 ± 0.173
2.142PheArg: 2.142 ± 0.186
1.601PheSer: 1.601 ± 0.178
2.294PheThr: 2.294 ± 0.279
2.272PheVal: 2.272 ± 0.222
0.498PheTrp: 0.498 ± 0.125
1.147PheTyr: 1.147 ± 0.16
0.0PheXaa: 0.0 ± 0.0
Gly
6.535GlyAla: 6.535 ± 0.492
0.476GlyCys: 0.476 ± 0.118
5.064GlyAsp: 5.064 ± 0.327
4.544GlyGlu: 4.544 ± 0.282
2.575GlyPhe: 2.575 ± 0.25
8.786GlyGly: 8.786 ± 1.176
2.164GlyHis: 2.164 ± 0.211
3.527GlyIle: 3.527 ± 0.325
3.83GlyLys: 3.83 ± 0.369
6.341GlyLeu: 6.341 ± 0.413
1.904GlyMet: 1.904 ± 0.202
2.683GlyAsn: 2.683 ± 0.295
4.047GlyPro: 4.047 ± 0.398
3.657GlyGln: 3.657 ± 0.329
4.804GlyArg: 4.804 ± 0.301
4.847GlySer: 4.847 ± 0.347
5.518GlyThr: 5.518 ± 0.434
5.085GlyVal: 5.085 ± 0.327
1.991GlyTrp: 1.991 ± 0.2
2.575GlyTyr: 2.575 ± 0.259
0.0GlyXaa: 0.0 ± 0.0
His
2.077HisAla: 2.077 ± 0.226
0.173HisCys: 0.173 ± 0.064
1.601HisAsp: 1.601 ± 0.195
1.666HisGlu: 1.666 ± 0.187
0.887HisPhe: 0.887 ± 0.149
2.077HisGly: 2.077 ± 0.208
0.952HisHis: 0.952 ± 0.147
0.995HisIle: 0.995 ± 0.132
0.714HisLys: 0.714 ± 0.127
2.099HisLeu: 2.099 ± 0.24
0.454HisMet: 0.454 ± 0.096
0.606HisAsn: 0.606 ± 0.102
1.991HisPro: 1.991 ± 0.271
0.931HisGln: 0.931 ± 0.151
1.796HisArg: 1.796 ± 0.24
1.169HisSer: 1.169 ± 0.186
1.385HisThr: 1.385 ± 0.169
1.796HisVal: 1.796 ± 0.211
0.368HisTrp: 0.368 ± 0.087
0.866HisTyr: 0.866 ± 0.148
0.0HisXaa: 0.0 ± 0.0
Ile
4.912IleAla: 4.912 ± 0.3
0.498IleCys: 0.498 ± 0.125
3.008IleAsp: 3.008 ± 0.248
2.921IleGlu: 2.921 ± 0.294
1.147IlePhe: 1.147 ± 0.152
3.246IleGly: 3.246 ± 0.246
0.822IleHis: 0.822 ± 0.159
1.277IleIle: 1.277 ± 0.157
1.407IleLys: 1.407 ± 0.169
2.77IleLeu: 2.77 ± 0.223
0.866IleMet: 0.866 ± 0.152
1.818IleAsn: 1.818 ± 0.197
2.51IlePro: 2.51 ± 0.253
1.731IleGln: 1.731 ± 0.207
3.549IleArg: 3.549 ± 0.346
2.272IleSer: 2.272 ± 0.226
3.203IleThr: 3.203 ± 0.298
3.311IleVal: 3.311 ± 0.319
0.779IleTrp: 0.779 ± 0.138
1.233IleTyr: 1.233 ± 0.164
0.0IleXaa: 0.0 ± 0.0
Lys
4.285LysAla: 4.285 ± 0.35
0.325LysCys: 0.325 ± 0.085
2.077LysAsp: 2.077 ± 0.214
2.077LysGlu: 2.077 ± 0.228
1.407LysPhe: 1.407 ± 0.182
2.857LysGly: 2.857 ± 0.255
0.801LysHis: 0.801 ± 0.124
1.861LysIle: 1.861 ± 0.202
1.839LysLys: 1.839 ± 0.257
3.051LysLeu: 3.051 ± 0.268
1.125LysMet: 1.125 ± 0.175
1.407LysAsn: 1.407 ± 0.173
1.839LysPro: 1.839 ± 0.215
1.147LysGln: 1.147 ± 0.152
2.921LysArg: 2.921 ± 0.228
1.818LysSer: 1.818 ± 0.197
2.294LysThr: 2.294 ± 0.232
3.354LysVal: 3.354 ± 0.254
0.563LysTrp: 0.563 ± 0.115
1.147LysTyr: 1.147 ± 0.16
0.0LysXaa: 0.0 ± 0.0
Leu
8.245LeuAla: 8.245 ± 0.44
0.692LeuCys: 0.692 ± 0.119
6.622LeuAsp: 6.622 ± 0.426
4.068LeuGlu: 4.068 ± 0.357
1.969LeuPhe: 1.969 ± 0.204
5.388LeuGly: 5.388 ± 0.318
1.45LeuHis: 1.45 ± 0.191
3.203LeuIle: 3.203 ± 0.299
3.073LeuLys: 3.073 ± 0.26
5.194LeuLeu: 5.194 ± 0.404
2.229LeuMet: 2.229 ± 0.23
3.289LeuAsn: 3.289 ± 0.277
3.939LeuPro: 3.939 ± 0.334
2.857LeuGln: 2.857 ± 0.248
5.475LeuArg: 5.475 ± 0.412
4.155LeuSer: 4.155 ± 0.378
4.588LeuThr: 4.588 ± 0.409
5.648LeuVal: 5.648 ± 0.381
1.19LeuTrp: 1.19 ± 0.144
2.337LeuTyr: 2.337 ± 0.234
0.0LeuXaa: 0.0 ± 0.0
Met
2.77MetAla: 2.77 ± 0.262
0.13MetCys: 0.13 ± 0.052
1.645MetAsp: 1.645 ± 0.206
1.32MetGlu: 1.32 ± 0.156
0.757MetPhe: 0.757 ± 0.112
1.926MetGly: 1.926 ± 0.214
0.476MetHis: 0.476 ± 0.104
0.909MetIle: 0.909 ± 0.155
1.06MetLys: 1.06 ± 0.148
1.472MetLeu: 1.472 ± 0.213
0.779MetMet: 0.779 ± 0.136
0.844MetAsn: 0.844 ± 0.134
1.493MetPro: 1.493 ± 0.2
1.039MetGln: 1.039 ± 0.165
1.839MetArg: 1.839 ± 0.197
2.013MetSer: 2.013 ± 0.22
2.316MetThr: 2.316 ± 0.227
1.32MetVal: 1.32 ± 0.146
0.39MetTrp: 0.39 ± 0.092
0.519MetTyr: 0.519 ± 0.092
0.0MetXaa: 0.0 ± 0.0
Asn
3.051AsnAla: 3.051 ± 0.259
0.173AsnCys: 0.173 ± 0.066
1.969AsnAsp: 1.969 ± 0.218
1.666AsnGlu: 1.666 ± 0.157
1.212AsnPhe: 1.212 ± 0.167
3.268AsnGly: 3.268 ± 0.275
0.844AsnHis: 0.844 ± 0.147
1.407AsnIle: 1.407 ± 0.173
1.515AsnLys: 1.515 ± 0.157
2.121AsnLeu: 2.121 ± 0.209
0.909AsnMet: 0.909 ± 0.174
1.428AsnAsn: 1.428 ± 0.199
2.857AsnPro: 2.857 ± 0.285
1.363AsnGln: 1.363 ± 0.193
2.727AsnArg: 2.727 ± 0.224
1.666AsnSer: 1.666 ± 0.211
2.142AsnThr: 2.142 ± 0.229
2.013AsnVal: 2.013 ± 0.169
0.757AsnTrp: 0.757 ± 0.118
0.801AsnTyr: 0.801 ± 0.135
0.0AsnXaa: 0.0 ± 0.0
Pro
5.085ProAla: 5.085 ± 0.404
0.26ProCys: 0.26 ± 0.078
5.107ProAsp: 5.107 ± 0.355
4.588ProGlu: 4.588 ± 0.42
1.536ProPhe: 1.536 ± 0.173
5.605ProGly: 5.605 ± 0.446
1.45ProHis: 1.45 ± 0.221
2.034ProIle: 2.034 ± 0.191
1.731ProLys: 1.731 ± 0.211
3.181ProLeu: 3.181 ± 0.248
1.363ProMet: 1.363 ± 0.189
2.229ProAsn: 2.229 ± 0.231
3.246ProPro: 3.246 ± 0.451
2.229ProGln: 2.229 ± 0.229
2.835ProArg: 2.835 ± 0.28
2.857ProSer: 2.857 ± 0.276
3.852ProThr: 3.852 ± 0.328
4.609ProVal: 4.609 ± 0.312
1.169ProTrp: 1.169 ± 0.165
1.58ProTyr: 1.58 ± 0.228
0.0ProXaa: 0.0 ± 0.0
Gln
4.696GlnAla: 4.696 ± 0.388
0.281GlnCys: 0.281 ± 0.084
2.38GlnAsp: 2.38 ± 0.274
2.489GlnGlu: 2.489 ± 0.261
1.753GlnPhe: 1.753 ± 0.176
2.51GlnGly: 2.51 ± 0.258
1.255GlnHis: 1.255 ± 0.168
2.207GlnIle: 2.207 ± 0.191
0.952GlnLys: 0.952 ± 0.165
3.571GlnLeu: 3.571 ± 0.311
1.255GlnMet: 1.255 ± 0.17
1.472GlnAsn: 1.472 ± 0.183
1.861GlnPro: 1.861 ± 0.18
1.883GlnGln: 1.883 ± 0.263
3.614GlnArg: 3.614 ± 0.287
2.142GlnSer: 2.142 ± 0.244
2.467GlnThr: 2.467 ± 0.237
2.921GlnVal: 2.921 ± 0.297
0.606GlnTrp: 0.606 ± 0.111
1.212GlnTyr: 1.212 ± 0.159
0.0GlnXaa: 0.0 ± 0.0
Arg
6.189ArgAla: 6.189 ± 0.386
0.736ArgCys: 0.736 ± 0.125
4.22ArgAsp: 4.22 ± 0.31
5.497ArgGlu: 5.497 ± 0.413
2.38ArgPhe: 2.38 ± 0.296
4.285ArgGly: 4.285 ± 0.316
1.796ArgHis: 1.796 ± 0.241
2.965ArgIle: 2.965 ± 0.302
3.679ArgLys: 3.679 ± 0.319
4.956ArgLeu: 4.956 ± 0.421
1.818ArgMet: 1.818 ± 0.205
2.337ArgAsn: 2.337 ± 0.202
2.835ArgPro: 2.835 ± 0.244
3.246ArgGln: 3.246 ± 0.293
5.15ArgArg: 5.15 ± 0.391
3.722ArgSer: 3.722 ± 0.272
4.003ArgThr: 4.003 ± 0.311
4.977ArgVal: 4.977 ± 0.285
1.277ArgTrp: 1.277 ± 0.152
2.337ArgTyr: 2.337 ± 0.241
0.0ArgXaa: 0.0 ± 0.0
Ser
4.696SerAla: 4.696 ± 0.386
0.454SerCys: 0.454 ± 0.129
3.874SerAsp: 3.874 ± 0.378
2.921SerGlu: 2.921 ± 0.246
1.775SerPhe: 1.775 ± 0.208
5.367SerGly: 5.367 ± 0.409
0.909SerHis: 0.909 ± 0.16
2.251SerIle: 2.251 ± 0.209
2.056SerLys: 2.056 ± 0.219
4.263SerLeu: 4.263 ± 0.321
1.407SerMet: 1.407 ± 0.162
1.623SerAsn: 1.623 ± 0.155
2.835SerPro: 2.835 ± 0.22
2.142SerGln: 2.142 ± 0.207
3.462SerArg: 3.462 ± 0.216
3.506SerSer: 3.506 ± 0.323
3.917SerThr: 3.917 ± 0.281
3.939SerVal: 3.939 ± 0.312
1.169SerTrp: 1.169 ± 0.204
1.796SerTyr: 1.796 ± 0.194
0.0SerXaa: 0.0 ± 0.0
Thr
6.038ThrAla: 6.038 ± 0.44
0.714ThrCys: 0.714 ± 0.167
4.263ThrAsp: 4.263 ± 0.339
3.744ThrGlu: 3.744 ± 0.346
2.316ThrPhe: 2.316 ± 0.281
5.843ThrGly: 5.843 ± 0.384
1.861ThrHis: 1.861 ± 0.241
3.203ThrIle: 3.203 ± 0.304
2.251ThrLys: 2.251 ± 0.208
4.544ThrLeu: 4.544 ± 0.305
1.385ThrMet: 1.385 ± 0.161
2.359ThrAsn: 2.359 ± 0.272
4.35ThrPro: 4.35 ± 0.404
2.164ThrGln: 2.164 ± 0.225
3.203ThrArg: 3.203 ± 0.243
3.744ThrSer: 3.744 ± 0.319
4.458ThrThr: 4.458 ± 0.458
4.761ThrVal: 4.761 ± 0.376
1.407ThrTrp: 1.407 ± 0.198
2.034ThrTyr: 2.034 ± 0.211
0.0ThrXaa: 0.0 ± 0.0
Val
7.617ValAla: 7.617 ± 0.483
0.671ValCys: 0.671 ± 0.149
4.999ValAsp: 4.999 ± 0.259
4.847ValGlu: 4.847 ± 0.332
2.099ValPhe: 2.099 ± 0.231
5.324ValGly: 5.324 ± 0.36
1.861ValHis: 1.861 ± 0.195
3.268ValIle: 3.268 ± 0.245
2.272ValLys: 2.272 ± 0.215
5.064ValLeu: 5.064 ± 0.395
1.645ValMet: 1.645 ± 0.158
2.142ValAsn: 2.142 ± 0.244
4.371ValPro: 4.371 ± 0.314
2.878ValGln: 2.878 ± 0.357
4.306ValArg: 4.306 ± 0.333
3.7ValSer: 3.7 ± 0.251
5.237ValThr: 5.237 ± 0.365
5.865ValVal: 5.865 ± 0.479
1.125ValTrp: 1.125 ± 0.164
1.883ValTyr: 1.883 ± 0.245
0.0ValXaa: 0.0 ± 0.0
Trp
1.515TrpAla: 1.515 ± 0.18
0.216TrpCys: 0.216 ± 0.069
1.472TrpAsp: 1.472 ± 0.165
1.212TrpGlu: 1.212 ± 0.194
0.519TrpPhe: 0.519 ± 0.139
1.666TrpGly: 1.666 ± 0.196
0.411TrpHis: 0.411 ± 0.071
0.671TrpIle: 0.671 ± 0.11
0.671TrpLys: 0.671 ± 0.109
1.233TrpLeu: 1.233 ± 0.149
0.39TrpMet: 0.39 ± 0.094
0.606TrpAsn: 0.606 ± 0.141
0.866TrpPro: 0.866 ± 0.133
0.887TrpGln: 0.887 ± 0.115
1.688TrpArg: 1.688 ± 0.212
1.104TrpSer: 1.104 ± 0.131
1.558TrpThr: 1.558 ± 0.159
1.298TrpVal: 1.298 ± 0.188
0.541TrpTrp: 0.541 ± 0.12
0.541TrpTyr: 0.541 ± 0.094
0.0TrpXaa: 0.0 ± 0.0
Tyr
3.116TyrAla: 3.116 ± 0.217
0.26TyrCys: 0.26 ± 0.083
2.402TyrAsp: 2.402 ± 0.204
1.58TyrGlu: 1.58 ± 0.189
1.06TyrPhe: 1.06 ± 0.161
2.077TyrGly: 2.077 ± 0.235
1.039TyrHis: 1.039 ± 0.13
1.125TyrIle: 1.125 ± 0.145
0.801TyrLys: 0.801 ± 0.142
2.575TyrLeu: 2.575 ± 0.279
0.411TyrMet: 0.411 ± 0.072
1.147TyrAsn: 1.147 ± 0.156
1.363TyrPro: 1.363 ± 0.183
1.342TyrGln: 1.342 ± 0.202
2.662TyrArg: 2.662 ± 0.237
1.45TyrSer: 1.45 ± 0.157
2.294TyrThr: 2.294 ± 0.278
2.316TyrVal: 2.316 ± 0.241
0.519TyrTrp: 0.519 ± 0.108
0.995TyrTyr: 0.995 ± 0.167
0.0TyrXaa: 0.0 ± 0.0
Xaa
0.0XaaAla: 0.0 ± 0.0
0.0XaaCys: 0.0 ± 0.0
0.0XaaAsp: 0.0 ± 0.0
0.0XaaGlu: 0.0 ± 0.0
0.0XaaPhe: 0.0 ± 0.0
0.0XaaGly: 0.0 ± 0.0
0.0XaaHis: 0.0 ± 0.0
0.0XaaIle: 0.0 ± 0.0
0.0XaaLys: 0.0 ± 0.0
0.0XaaLeu: 0.0 ± 0.0
0.0XaaMet: 0.0 ± 0.0
0.0XaaAsn: 0.0 ± 0.0
0.0XaaPro: 0.0 ± 0.0
0.0XaaGln: 0.0 ± 0.0
0.0XaaArg: 0.0 ± 0.0
0.0XaaSer: 0.0 ± 0.0
0.0XaaThr: 0.0 ± 0.0
0.0XaaVal: 0.0 ± 0.0
0.0XaaTrp: 0.0 ± 0.0
0.0XaaTyr: 0.0 ± 0.0
0.0XaaXaa: 0.0 ± 0.0
Statistics based on 218 proteins (46211 amino acids)

Note: The error was estimated by bootstrapping (x100) at the protein level.
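For readers unfamiliar with the technique, protein-level bootstrapping can be sketched as follows: whole proteins are resampled with replacement, the dipeptide frequency is recomputed for each resample, and the spread of those estimates serves as the error. Everything in this block (the function names, the toy sequences, the fixed seed, normalizing by the number of pairs, and using the standard deviation as the reported error) is an assumption for illustration and not necessarily the procedure used to produce the table.

```python
import random
import statistics
from collections import Counter


def pair_counts(seq):
    """Count overlapping residue pairs within one protein."""
    return Counter(seq[i:i + 2] for i in range(len(seq) - 1))


def bootstrap_error(proteins, dipeptide, n_boot=100, seed=0):
    """Return (mean, standard deviation) in per mille over n_boot resamples."""
    rng = random.Random(seed)
    per_protein = [pair_counts(p) for p in proteins]
    estimates = []
    for _ in range(n_boot):
        # Resample whole proteins with replacement, keeping the sample size.
        sample = rng.choices(per_protein, k=len(per_protein))
        total = sum(sum(c.values()) for c in sample)
        hits = sum(c[dipeptide] for c in sample)
        estimates.append(1000.0 * hits / total if total else 0.0)
    return statistics.mean(estimates), statistics.stdev(estimates)


# Hypothetical toy proteome (one-letter codes), not the real data.
toy = ["MAAG", "AGA", "GGAA"]
mean_ag, err_ag = bootstrap_error(toy, "AG")
print(f"AG: {mean_ag:.3f} ± {err_ag:.3f} per mille")
```

Resampling proteins rather than individual residue pairs keeps pairs from the same protein together, so the error reflects variation between proteins.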

The dipeptide statistics above (along with other statistics for this proteome) can be downloaded from this CSV file
See this proteome in: uniprot_link
Proteome-pI is available under the Creative Commons Attribution-NoDerivs license; for more details see here

Reference: Kozlowski LP. Proteome-pI 2.0: Proteome Isoelectric Point Database Update. Nucleic Acids Res. 2021, doi: 10.1093/nar/gkab944
Contact: Lukasz P. Kozlowski