Amino acid dipeptide frequency for Mycobacterium phage Chris

Apart from single amino acid frequencies, one can also calculate the so-called amino acid dipeptide frequency, i.e. how often each ordered pair of adjacent amino acids occurs.

There are 400 possible dipeptides (441 if Xaa is counted as an additional, 21st amino acid representing all non-standard or unknown amino acids). Thus, if dipeptides occurred randomly in proteins, each of the 400 dipeptides would be present at a frequency of 0.25% (1/400, i.e. 2.5‰). As this is not the case in nature, for better visibility dipeptides that occur more often than expected are marked in red in the table, and those that are underrepresented are marked in blue.

All values are given in per mille (‰) and therefore need to be multiplied by 10^-3 to obtain fractions.
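The calculation described above can be sketched in Python as follows. This is only an illustration under stated assumptions, not the code used to build the table: the function names are hypothetical, dipeptides are assumed to be counted as overlapping adjacent pairs within each protein, and frequencies are normalized by the total number of pairs counted. The page does not state the exact counting or normalization it uses.

```python
from collections import Counter
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # 20 standard residues, one-letter codes

# Uniform expectation from the text: 1/400 = 0.25% = 2.5 per mille.
EXPECTED_PERMILLE = 1000.0 / 400


def dipeptide_permille(proteins):
    """Return {dipeptide: frequency in per mille} for all 441 pairs.

    Assumption: overlapping adjacent pairs are counted within each
    protein only, never across protein boundaries.
    """
    counts = Counter()
    for seq in proteins:
        for i in range(len(seq) - 1):
            a, b = seq[i], seq[i + 1]
            # Anything outside the 20 standard residues is folded into X (Xaa).
            a = a if a in AMINO_ACIDS else "X"
            b = b if b in AMINO_ACIDS else "X"
            counts[a + b] += 1
    total = sum(counts.values())
    alphabet = AMINO_ACIDS + "X"
    return {
        a + b: (1000.0 * counts[a + b] / total if total else 0.0)
        for a, b in product(alphabet, repeat=2)
    }


def classify(freq_permille):
    """Label a frequency relative to the uniform 2.5 per mille expectation."""
    if freq_permille > EXPECTED_PERMILLE:
        return "over"
    if freq_permille < EXPECTED_PERMILLE:
        return "under"
    return "expected"
```

With this convention, a table value such as 21.206‰ corresponds to a fraction of 21.206 × 10^-3 = 0.021206, and `classify(21.206)` labels it as overrepresented relative to 2.5‰.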

For more information, see the sequence space article on Wikipedia.

Ala Cys Asp Glu Phe Gly His Ile Lys Leu Met Asn Pro Gln Arg Ser Thr Val Trp Tyr Xaa
Ala
AlaAla: 21.206 ± 1.778
AlaCys: 1.047 ± 0.203
AlaAsp: 8.378 ± 0.728
AlaGlu: 8.483 ± 0.984
AlaPhe: 3.822 ± 0.641
AlaGly: 8.692 ± 1.177
AlaHis: 2.88 ± 0.427
AlaIle: 5.446 ± 0.571
AlaLys: 5.393 ± 0.615
AlaLeu: 11.31 ± 0.836
AlaMet: 3.142 ± 0.446
AlaAsn: 3.718 ± 0.471
AlaPro: 6.388 ± 0.614
AlaGln: 4.346 ± 0.6
AlaArg: 7.75 ± 0.781
AlaSer: 6.388 ± 0.733
AlaThr: 6.755 ± 0.683
AlaVal: 9.425 ± 0.629
AlaTrp: 2.828 ± 0.338
AlaTyr: 2.985 ± 0.47
AlaXaa: 0.0 ± 0.0
Cys
CysAla: 1.623 ± 0.319
CysCys: 0.209 ± 0.094
CysAsp: 0.576 ± 0.214
CysGlu: 0.89 ± 0.252
CysPhe: 0.157 ± 0.087
CysGly: 1.414 ± 0.285
CysHis: 0.157 ± 0.129
CysIle: 0.314 ± 0.131
CysLys: 0.471 ± 0.16
CysLeu: 0.785 ± 0.229
CysMet: 0.209 ± 0.118
CysAsn: 0.262 ± 0.117
CysPro: 0.943 ± 0.22
CysGln: 0.209 ± 0.093
CysArg: 0.681 ± 0.158
CysSer: 0.838 ± 0.194
CysThr: 0.524 ± 0.145
CysVal: 0.733 ± 0.202
CysTrp: 0.524 ± 0.162
CysTyr: 0.314 ± 0.134
CysXaa: 0.0 ± 0.0
Asp
AspAla: 6.964 ± 0.594
AspCys: 0.524 ± 0.179
AspAsp: 5.393 ± 0.731
AspGlu: 5.289 ± 0.633
AspPhe: 0.89 ± 0.193
AspGly: 7.226 ± 0.544
AspHis: 1.047 ± 0.258
AspIle: 1.728 ± 0.255
AspLys: 1.937 ± 0.288
AspLeu: 5.917 ± 0.618
AspMet: 1.833 ± 0.332
AspAsn: 1.833 ± 0.314
AspPro: 4.503 ± 0.489
AspGln: 2.513 ± 0.35
AspArg: 4.241 ± 0.55
AspSer: 2.723 ± 0.366
AspThr: 3.403 ± 0.359
AspVal: 4.713 ± 0.5
AspTrp: 0.89 ± 0.18
AspTyr: 1.78 ± 0.333
AspXaa: 0.0 ± 0.0
Glu
GluAla: 7.278 ± 0.899
GluCys: 0.89 ± 0.228
GluAsp: 2.932 ± 0.485
GluGlu: 1.361 ± 0.389
GluPhe: 2.042 ± 0.331
GluGly: 4.608 ± 0.518
GluHis: 1.518 ± 0.308
GluIle: 2.356 ± 0.417
GluLys: 1.518 ± 0.393
GluLeu: 5.289 ± 0.566
GluMet: 1.623 ± 0.307
GluAsn: 1.1 ± 0.254
GluPro: 3.403 ± 0.53
GluGln: 2.304 ± 0.393
GluArg: 4.608 ± 0.535
GluSer: 2.618 ± 0.323
GluThr: 2.356 ± 0.36
GluVal: 4.922 ± 0.558
GluTrp: 1.257 ± 0.237
GluTyr: 1.99 ± 0.422
GluXaa: 0.0 ± 0.0
Phe
PheAla: 3.351 ± 0.56
PheCys: 0.314 ± 0.162
PheAsp: 2.461 ± 0.375
PheGlu: 1.257 ± 0.247
PhePhe: 0.576 ± 0.179
PheGly: 3.194 ± 0.415
PheHis: 0.367 ± 0.135
PheIle: 0.943 ± 0.201
PheLys: 0.943 ± 0.327
PheLeu: 2.147 ± 0.309
PheMet: 0.524 ± 0.166
PheAsn: 1.047 ± 0.246
PhePro: 1.152 ± 0.233
PheGln: 0.681 ± 0.205
PheArg: 1.518 ± 0.266
PheSer: 1.1 ± 0.219
PheThr: 1.78 ± 0.382
PheVal: 2.67 ± 0.38
PheTrp: 0.419 ± 0.162
PheTyr: 0.419 ± 0.192
PheXaa: 0.0 ± 0.0
Gly
GlyAla: 10.786 ± 1.175
GlyCys: 1.204 ± 0.248
GlyAsp: 5.079 ± 0.531
GlyGlu: 5.289 ± 0.41
GlyPhe: 2.618 ± 0.471
GlyGly: 9.32 ± 1.56
GlyHis: 1.676 ± 0.302
GlyIle: 3.089 ± 0.657
GlyLys: 3.718 ± 0.476
GlyLeu: 6.755 ± 0.85
GlyMet: 2.252 ± 0.371
GlyAsn: 2.356 ± 0.331
GlyPro: 3.613 ± 0.453
GlyGln: 2.513 ± 0.363
GlyArg: 5.76 ± 0.538
GlySer: 5.131 ± 0.637
GlyThr: 5.707 ± 0.765
GlyVal: 6.807 ± 0.659
GlyTrp: 2.304 ± 0.268
GlyTyr: 2.304 ± 0.287
GlyXaa: 0.0 ± 0.0
His
HisAla: 2.304 ± 0.364
HisCys: 0.419 ± 0.167
HisAsp: 1.937 ± 0.377
HisGlu: 0.785 ± 0.185
HisPhe: 0.628 ± 0.187
HisGly: 1.833 ± 0.323
HisHis: 0.628 ± 0.167
HisIle: 0.733 ± 0.189
HisLys: 0.576 ± 0.174
HisLeu: 1.833 ± 0.281
HisMet: 0.367 ± 0.125
HisAsn: 0.628 ± 0.146
HisPro: 1.414 ± 0.319
HisGln: 0.628 ± 0.207
HisArg: 1.937 ± 0.325
HisSer: 0.995 ± 0.179
HisThr: 1.257 ± 0.271
HisVal: 2.094 ± 0.322
HisTrp: 0.628 ± 0.219
HisTyr: 0.733 ± 0.231
HisXaa: 0.0 ± 0.0
Ile
IleAla: 5.131 ± 0.473
IleCys: 0.105 ± 0.068
IleAsp: 3.194 ± 0.369
IleGlu: 3.299 ± 0.479
IlePhe: 0.524 ± 0.15
IleGly: 4.451 ± 0.828
IleHis: 0.628 ± 0.213
IleIle: 0.785 ± 0.24
IleLys: 1.466 ± 0.235
IleLeu: 2.304 ± 0.336
IleMet: 0.471 ± 0.137
IleAsn: 1.676 ± 0.24
IlePro: 1.885 ± 0.33
IleGln: 0.576 ± 0.147
IleArg: 1.833 ± 0.301
IleSer: 1.518 ± 0.282
IleThr: 2.461 ± 0.332
IleVal: 3.875 ± 0.462
IleTrp: 0.681 ± 0.188
IleTyr: 0.524 ± 0.182
IleXaa: 0.0 ± 0.0
Lys
LysAla: 3.822 ± 0.521
LysCys: 0.471 ± 0.151
LysAsp: 1.309 ± 0.238
LysGlu: 0.733 ± 0.187
LysPhe: 0.681 ± 0.202
LysGly: 2.932 ± 0.461
LysHis: 0.733 ± 0.197
LysIle: 1.571 ± 0.358
LysLys: 0.628 ± 0.192
LysLeu: 2.985 ± 0.341
LysMet: 1.047 ± 0.252
LysAsn: 0.628 ± 0.206
LysPro: 3.089 ± 0.447
LysGln: 1.047 ± 0.237
LysArg: 2.932 ± 0.431
LysSer: 1.361 ± 0.287
LysThr: 1.99 ± 0.312
LysVal: 2.828 ± 0.372
LysTrp: 0.576 ± 0.205
LysTyr: 0.838 ± 0.257
LysXaa: 0.0 ± 0.0
Leu
LeuAla: 12.41 ± 0.662
LeuCys: 1.047 ± 0.273
LeuAsp: 7.907 ± 0.743
LeuGlu: 2.304 ± 0.352
LeuPhe: 2.461 ± 0.363
LeuGly: 7.226 ± 0.827
LeuHis: 2.042 ± 0.344
LeuIle: 3.246 ± 0.448
LeuLys: 2.199 ± 0.316
LeuLeu: 5.707 ± 0.702
LeuMet: 1.623 ± 0.261
LeuAsn: 2.147 ± 0.367
LeuPro: 4.346 ± 0.561
LeuGln: 2.304 ± 0.418
LeuArg: 6.493 ± 0.581
LeuSer: 5.707 ± 0.404
LeuThr: 5.079 ± 0.543
LeuVal: 5.446 ± 0.582
LeuTrp: 1.676 ± 0.267
LeuTyr: 1.571 ± 0.328
LeuXaa: 0.0 ± 0.0
Met
MetAla: 3.299 ± 0.333
MetCys: 0.262 ± 0.105
MetAsp: 0.943 ± 0.192
MetGlu: 0.576 ± 0.159
MetPhe: 0.733 ± 0.199
MetGly: 1.623 ± 0.319
MetHis: 0.628 ± 0.193
MetIle: 1.204 ± 0.196
MetLys: 0.314 ± 0.135
MetLeu: 1.728 ± 0.295
MetMet: 0.367 ± 0.129
MetAsn: 0.785 ± 0.217
MetPro: 1.571 ± 0.286
MetGln: 0.576 ± 0.177
MetArg: 1.885 ± 0.36
MetSer: 2.461 ± 0.368
MetThr: 1.833 ± 0.32
MetVal: 1.676 ± 0.277
MetTrp: 0.524 ± 0.144
MetTyr: 0.471 ± 0.186
MetXaa: 0.0 ± 0.0
Asn
AsnAla: 3.77 ± 0.561
AsnCys: 0.419 ± 0.142
AsnAsp: 1.414 ± 0.322
AsnGlu: 0.89 ± 0.195
AsnPhe: 0.576 ± 0.182
AsnGly: 3.351 ± 0.49
AsnHis: 0.524 ± 0.184
AsnIle: 0.838 ± 0.29
AsnLys: 0.785 ± 0.197
AsnLeu: 2.252 ± 0.44
AsnMet: 0.314 ± 0.118
AsnAsn: 0.785 ± 0.228
AsnPro: 2.356 ± 0.331
AsnGln: 0.628 ± 0.164
AsnArg: 1.833 ± 0.292
AsnSer: 0.733 ± 0.232
AsnThr: 1.833 ± 0.313
AsnVal: 2.618 ± 0.451
AsnTrp: 0.209 ± 0.098
AsnTyr: 0.838 ± 0.219
AsnXaa: 0.0 ± 0.0
Pro
ProAla: 8.378 ± 0.65
ProCys: 0.576 ± 0.193
ProAsp: 4.084 ± 0.45
ProGlu: 4.451 ± 0.511
ProPhe: 1.361 ± 0.282
ProGly: 5.76 ± 0.513
ProHis: 1.152 ± 0.283
ProIle: 1.623 ± 0.319
ProLys: 1.99 ± 0.313
ProLeu: 4.765 ± 0.54
ProMet: 1.047 ± 0.217
ProAsn: 1.518 ± 0.336
ProPro: 3.403 ± 0.536
ProGln: 1.518 ± 0.345
ProArg: 3.351 ± 0.437
ProSer: 2.775 ± 0.358
ProThr: 3.456 ± 0.417
ProVal: 5.236 ± 0.662
ProTrp: 1.047 ± 0.206
ProTyr: 1.676 ± 0.234
ProXaa: 0.0 ± 0.0
Gln
GlnAla: 4.608 ± 0.63
GlnCys: 0.262 ± 0.114
GlnAsp: 0.995 ± 0.212
GlnGlu: 1.152 ± 0.221
GlnPhe: 0.995 ± 0.276
GlnGly: 2.828 ± 0.337
GlnHis: 0.943 ± 0.216
GlnIle: 1.518 ± 0.327
GlnLys: 0.471 ± 0.169
GlnLeu: 2.67 ± 0.459
GlnMet: 0.838 ± 0.181
GlnAsn: 0.419 ± 0.155
GlnPro: 2.094 ± 0.283
GlnGln: 1.833 ± 0.305
GlnArg: 1.885 ± 0.273
GlnSer: 1.518 ± 0.286
GlnThr: 2.356 ± 0.395
GlnVal: 2.67 ± 0.42
GlnTrp: 0.838 ± 0.237
GlnTyr: 0.838 ± 0.191
GlnXaa: 0.0 ± 0.0
Arg
ArgAla: 6.388 ± 0.701
ArgCys: 1.047 ± 0.234
ArgAsp: 4.189 ± 0.558
ArgGlu: 4.555 ± 0.644
ArgPhe: 1.78 ± 0.355
ArgGly: 3.875 ± 0.412
ArgHis: 1.78 ± 0.392
ArgIle: 3.142 ± 0.454
ArgLys: 3.089 ± 0.446
ArgLeu: 6.388 ± 0.607
ArgMet: 2.618 ± 0.325
ArgAsn: 2.147 ± 0.334
ArgPro: 4.294 ± 0.614
ArgGln: 2.461 ± 0.307
ArgArg: 5.969 ± 0.755
ArgSer: 2.985 ± 0.414
ArgThr: 3.665 ± 0.502
ArgVal: 4.137 ± 0.591
ArgTrp: 1.78 ± 0.352
ArgTyr: 1.833 ± 0.294
ArgXaa: 0.0 ± 0.0
Ser
SerAla: 6.807 ± 0.68
SerCys: 0.681 ± 0.198
SerAsp: 3.194 ± 0.365
SerGlu: 2.828 ± 0.476
SerPhe: 1.623 ± 0.238
SerGly: 4.503 ± 0.632
SerHis: 1.257 ± 0.261
SerIle: 1.728 ± 0.331
SerLys: 1.309 ± 0.236
SerLeu: 4.032 ± 0.437
SerMet: 1.309 ± 0.318
SerAsn: 1.571 ± 0.285
SerPro: 3.456 ± 0.457
SerGln: 1.78 ± 0.271
SerArg: 3.456 ± 0.396
SerSer: 2.775 ± 0.475
SerThr: 3.089 ± 0.389
SerVal: 3.927 ± 0.365
SerTrp: 0.995 ± 0.21
SerTyr: 1.623 ± 0.252
SerXaa: 0.0 ± 0.0
Thr
ThrAla: 6.859 ± 0.816
ThrCys: 0.628 ± 0.177
ThrAsp: 3.299 ± 0.444
ThrGlu: 3.77 ± 0.494
ThrPhe: 1.937 ± 0.289
ThrGly: 5.707 ± 0.64
ThrHis: 1.466 ± 0.277
ThrIle: 3.194 ± 0.356
ThrLys: 1.99 ± 0.327
ThrLeu: 4.765 ± 0.529
ThrMet: 0.733 ± 0.203
ThrAsn: 1.414 ± 0.277
ThrPro: 3.77 ± 0.543
ThrGln: 1.833 ± 0.299
ThrArg: 2.513 ± 0.424
ThrSer: 2.985 ± 0.541
ThrThr: 3.299 ± 0.445
ThrVal: 5.446 ± 0.549
ThrTrp: 1.1 ± 0.194
ThrTyr: 1.728 ± 0.326
ThrXaa: 0.0 ± 0.0
Val
ValAla: 9.792 ± 0.881
ValCys: 0.995 ± 0.207
ValAsp: 5.341 ± 0.631
ValGlu: 5.76 ± 0.731
ValPhe: 2.094 ± 0.364
ValGly: 6.388 ± 0.684
ValHis: 2.147 ± 0.446
ValIle: 2.67 ± 0.432
ValLys: 2.67 ± 0.417
ValLeu: 6.336 ± 0.633
ValMet: 2.042 ± 0.319
ValAsn: 1.571 ± 0.266
ValPro: 5.498 ± 0.632
ValGln: 2.094 ± 0.268
ValArg: 4.87 ± 0.537
ValSer: 4.555 ± 0.442
ValThr: 4.608 ± 0.521
ValVal: 7.226 ± 0.844
ValTrp: 1.78 ± 0.274
ValTyr: 1.833 ± 0.252
ValXaa: 0.0 ± 0.0
Trp
TrpAla: 2.566 ± 0.394
TrpCys: 0.419 ± 0.134
TrpAsp: 0.785 ± 0.191
TrpGlu: 0.733 ± 0.191
TrpPhe: 0.785 ± 0.199
TrpGly: 1.257 ± 0.279
TrpHis: 0.524 ± 0.167
TrpIle: 0.733 ± 0.182
TrpLys: 0.262 ± 0.118
TrpLeu: 2.67 ± 0.408
TrpMet: 0.471 ± 0.177
TrpAsn: 0.576 ± 0.165
TrpPro: 0.838 ± 0.206
TrpGln: 1.257 ± 0.237
TrpArg: 2.356 ± 0.402
TrpSer: 1.1 ± 0.274
TrpThr: 1.204 ± 0.243
TrpVal: 1.414 ± 0.302
TrpTrp: 0.524 ± 0.152
TrpTyr: 0.471 ± 0.152
TrpXaa: 0.0 ± 0.0
Tyr
TyrAla: 3.037 ± 0.455
TyrCys: 0.314 ± 0.129
TyrAsp: 1.99 ± 0.389
TyrGlu: 1.571 ± 0.277
TyrPhe: 0.576 ± 0.167
TyrGly: 2.094 ± 0.307
TyrHis: 0.262 ± 0.123
TyrIle: 0.524 ± 0.161
TyrLys: 0.471 ± 0.175
TyrLeu: 2.304 ± 0.36
TyrMet: 0.576 ± 0.176
TyrAsn: 0.785 ± 0.151
TyrPro: 1.152 ± 0.271
TyrGln: 0.628 ± 0.223
TyrArg: 2.252 ± 0.291
TyrSer: 1.676 ± 0.299
TyrThr: 1.78 ± 0.347
TyrVal: 2.409 ± 0.306
TyrTrp: 0.367 ± 0.112
TyrTyr: 0.628 ± 0.176
TyrXaa: 0.0 ± 0.0
Xaa
XaaAla: 0.0 ± 0.0
XaaCys: 0.0 ± 0.0
XaaAsp: 0.0 ± 0.0
XaaGlu: 0.0 ± 0.0
XaaPhe: 0.0 ± 0.0
XaaGly: 0.0 ± 0.0
XaaHis: 0.0 ± 0.0
XaaIle: 0.0 ± 0.0
XaaLys: 0.0 ± 0.0
XaaLeu: 0.0 ± 0.0
XaaMet: 0.0 ± 0.0
XaaAsn: 0.0 ± 0.0
XaaPro: 0.0 ± 0.0
XaaGln: 0.0 ± 0.0
XaaArg: 0.0 ± 0.0
XaaSer: 0.0 ± 0.0
XaaThr: 0.0 ± 0.0
XaaVal: 0.0 ± 0.0
XaaTrp: 0.0 ± 0.0
XaaTyr: 0.0 ± 0.0
XaaXaa: 0.0 ± 0.0
Statistics based on 100 proteins (19099 amino acids)

Note: The error has been estimated by bootstrapping (100 replicates) at the protein level.
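A protein-level bootstrap of this kind can be sketched as below. It is a minimal illustration, not the database's implementation: the function names are hypothetical, and it assumes that whole proteins are resampled with replacement, that the frequency is recomputed for each replicate, and that the reported error is the standard deviation across replicates. The page does not state which spread measure the ± values represent.

```python
import random
import statistics


def dipeptide_permille_single(proteins, dipeptide):
    """Frequency (per mille) of one dipeptide among overlapping adjacent pairs."""
    hits = 0
    total = 0
    for seq in proteins:
        for i in range(len(seq) - 1):
            total += 1
            if seq[i:i + 2] == dipeptide:
                hits += 1
    return 1000.0 * hits / total if total else 0.0


def bootstrap_error(proteins, dipeptide, n_rounds=100, seed=0):
    """Estimate the error of a dipeptide frequency by protein-level bootstrap.

    Assumption: the error is taken as the sample standard deviation of
    the frequency over n_rounds resampled proteomes.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    estimates = []
    for _ in range(n_rounds):
        # Resample whole proteins with replacement, keeping the proteome size fixed.
        sample = rng.choices(proteins, k=len(proteins))
        estimates.append(dipeptide_permille_single(sample, dipeptide))
    return statistics.stdev(estimates)
```

Resampling proteins rather than individual residues keeps each sequence intact, so the estimate reflects how much a frequency depends on which proteins happen to be in the proteome. A dipeptide that never occurs yields an error of exactly 0, consistent with the 0.0 ± 0.0 entries for Xaa.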

The dipeptide statistics above (among other statistics for this proteome) can be downloaded from this CSV file
See this proteome in: uniprot_link
Proteome-pI is available under the Creative Commons Attribution-NoDerivs license; for more details see here

Reference: Kozlowski LP. Proteome-pI 2.0: Proteome Isoelectric Point Database Update. Nucleic Acids Res. 2021, doi: 10.1093/nar/gkab944
Contact: Lukasz P. Kozlowski