Amino acid dipeptide frequency for Bacillus phage v_B-Bak1

Apart from single amino acid frequencies, one can also calculate the so-called amino acid dipeptide frequency, i.e. how often each ordered pair of adjacent residues occurs.

There are 400 possible dipeptides (441 if Xaa is counted as an additional 21st amino acid representing all non-standard or unknown amino acids). Thus, if dipeptides occurred randomly in proteins and all 20 amino acids were equally frequent, each dipeptide would be expected at a frequency of 1/400 = 0.25% (2.5‰). As this is not the case in nature, for better visibility all dipeptides that occur more often than expected are marked in red in the table, and those that are underrepresented are marked in blue.
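
As an illustration of the calculation described above, here is a minimal Python sketch that counts overlapping dipeptides and compares each frequency with the uniform 0.25% expectation. The function names, the use of one-letter residue codes, the mapping of every non-standard residue to X (Xaa), and the choice to divide by the total number of counted pairs are assumptions made for this example; the page does not state how the database itself performs the calculation.

```python
from collections import Counter

STANDARD = "ACDEFGHIKLMNPQRSTVWY"   # the 20 standard residues, one-letter codes
UNIFORM_PERMILLE = 1000.0 / 400     # 2.5 per mille, i.e. 0.25 %


def dipeptide_permille(proteins):
    """Return {dipeptide: frequency in per mille} for a list of sequences."""
    counts = Counter()
    for seq in proteins:
        # Assumption: anything outside the 20 standard residues becomes X (Xaa).
        seq = "".join(aa if aa in STANDARD else "X" for aa in seq.upper())
        # Overlapping pairs within one protein; pairs never span two proteins.
        counts.update(seq[i:i + 2] for i in range(len(seq) - 1))
    total = sum(counts.values())
    if total == 0:
        return {}
    return {pair: 1000.0 * n / total for pair, n in counts.items()}


def representation(freq_permille, expected=UNIFORM_PERMILLE):
    """Label a frequency as over- or underrepresented relative to `expected`."""
    if freq_permille > expected:
        return "over"
    if freq_permille < expected:
        return "under"
    return "as expected"
```

For example, representation(5.152) labels the AlaAla value from the table as "over", while representation(0.724) labels AlaCys as "under".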

All values are given in per mille (‰), so they need to be multiplied by 10⁻³ to obtain fractions.
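
The unit conversion can be sketched in a few lines of Python. The helper names below are hypothetical and exist only for this example; the AlaAla value is taken from the table.

```python
def permille_to_fraction(value_permille):
    """Convert a per-mille value to a plain fraction (multiply by 10^-3)."""
    return value_permille * 1e-3


def permille_to_percent(value_permille):
    """Convert a per-mille value to percent (1 % = 10 per mille)."""
    return value_permille / 10.0


ala_ala = 5.152  # AlaAla frequency from the table, in per mille
print(permille_to_fraction(ala_ala))  # fraction of all dipeptides
print(permille_to_percent(ala_ala))   # the same value in percent
```

In the same units, the uniform expectation of 0.25% corresponds to 2.5‰.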

For more information, see the sequence space article on Wikipedia.

Dipeptide (first residue, second residue): frequency ± error, in ‰
Residue order: Ala Cys Asp Glu Phe Gly His Ile Lys Leu Met Asn Pro Gln Arg Ser Thr Val Trp Tyr Xaa
Ala
AlaAla: 5.152 ± 0.764
AlaCys: 0.724 ± 0.188
AlaAsp: 3.406 ± 0.557
AlaGlu: 4.598 ± 0.67
AlaPhe: 2.384 ± 0.386
AlaGly: 4.556 ± 0.631
AlaHis: 0.596 ± 0.161
AlaIle: 3.875 ± 0.506
AlaLys: 5.961 ± 0.523
AlaLeu: 4.3 ± 0.532
AlaMet: 1.405 ± 0.294
AlaAsn: 3.96 ± 0.664
AlaPro: 1.192 ± 0.244
AlaGln: 2.086 ± 0.378
AlaArg: 1.959 ± 0.357
AlaSer: 2.81 ± 0.433
AlaThr: 3.704 ± 0.673
AlaVal: 4.088 ± 0.44
AlaTrp: 0.937 ± 0.232
AlaTyr: 2.512 ± 0.384
AlaXaa: 0.0 ± 0.0
Cys
CysAla: 0.255 ± 0.103
CysCys: 0.043 ± 0.048
CysAsp: 0.937 ± 0.226
CysGlu: 0.554 ± 0.157
CysPhe: 0.468 ± 0.167
CysGly: 0.213 ± 0.104
CysHis: 0.17 ± 0.081
CysIle: 0.639 ± 0.186
CysLys: 0.766 ± 0.213
CysLeu: 0.511 ± 0.167
CysMet: 0.426 ± 0.14
CysAsn: 0.511 ± 0.166
CysPro: 0.468 ± 0.137
CysGln: 0.213 ± 0.093
CysArg: 0.298 ± 0.116
CysSer: 0.426 ± 0.158
CysThr: 0.383 ± 0.146
CysVal: 0.383 ± 0.162
CysTrp: 0.085 ± 0.06
CysTyr: 0.298 ± 0.128
CysXaa: 0.0 ± 0.0
Asp
AspAla: 3.108 ± 0.454
AspCys: 0.511 ± 0.157
AspAsp: 3.491 ± 0.392
AspGlu: 4.982 ± 0.554
AspPhe: 2.895 ± 0.441
AspGly: 3.875 ± 0.436
AspHis: 1.277 ± 0.235
AspIle: 5.195 ± 0.465
AspLys: 6.216 ± 0.567
AspLeu: 4.854 ± 0.366
AspMet: 1.405 ± 0.247
AspAsn: 4.002 ± 0.5
AspPro: 1.916 ± 0.354
AspGln: 2.086 ± 0.342
AspArg: 2.938 ± 0.429
AspSer: 3.151 ± 0.341
AspThr: 3.321 ± 0.394
AspVal: 3.747 ± 0.349
AspTrp: 0.937 ± 0.205
AspTyr: 3.449 ± 0.419
AspXaa: 0.0 ± 0.0
Glu
GluAla: 4.598 ± 0.592
GluCys: 0.596 ± 0.157
GluAsp: 3.577 ± 0.425
GluGlu: 6.131 ± 0.665
GluPhe: 2.938 ± 0.389
GluGly: 4.513 ± 0.432
GluHis: 0.979 ± 0.225
GluIle: 5.706 ± 0.556
GluLys: 6.77 ± 0.622
GluLeu: 6.727 ± 0.64
GluMet: 2.98 ± 0.363
GluAsn: 4.513 ± 0.503
GluPro: 1.533 ± 0.307
GluGln: 3.406 ± 0.469
GluArg: 3.279 ± 0.488
GluSer: 4.258 ± 0.384
GluThr: 4.343 ± 0.46
GluVal: 5.322 ± 0.452
GluTrp: 1.15 ± 0.259
GluTyr: 3.023 ± 0.341
GluXaa: 0.0 ± 0.0
Phe
PheAla: 2.682 ± 0.337
PheCys: 0.468 ± 0.139
PheAsp: 3.406 ± 0.347
PheGlu: 3.789 ± 0.429
PhePhe: 1.533 ± 0.274
PheGly: 2.895 ± 0.47
PheHis: 0.554 ± 0.154
PheIle: 3.023 ± 0.373
PheLys: 3.747 ± 0.403
PheLeu: 3.321 ± 0.486
PheMet: 1.15 ± 0.202
PheAsn: 2.768 ± 0.313
PhePro: 1.15 ± 0.227
PheGln: 1.064 ± 0.216
PheArg: 1.618 ± 0.228
PheSer: 2.47 ± 0.323
PheThr: 3.364 ± 0.411
PheVal: 2.64 ± 0.355
PheTrp: 0.639 ± 0.205
PheTyr: 1.788 ± 0.317
PheXaa: 0.0 ± 0.0
Gly
GlyAla: 3.236 ± 0.682
GlyCys: 0.596 ± 0.195
GlyAsp: 2.725 ± 0.427
GlyGlu: 4.3 ± 0.327
GlyPhe: 2.895 ± 0.375
GlyGly: 5.024 ± 0.87
GlyHis: 1.32 ± 0.293
GlyIle: 4.258 ± 0.344
GlyLys: 5.237 ± 0.393
GlyLeu: 5.663 ± 0.471
GlyMet: 2.129 ± 0.377
GlyAsn: 3.789 ± 0.502
GlyPro: 0.085 ± 0.055
GlyGln: 2.895 ± 0.857
GlyArg: 2.725 ± 0.339
GlySer: 3.875 ± 0.504
GlyThr: 4.386 ± 0.665
GlyVal: 5.024 ± 0.643
GlyTrp: 1.022 ± 0.198
GlyTyr: 3.151 ± 0.35
GlyXaa: 0.0 ± 0.0
His
HisAla: 0.766 ± 0.193
HisCys: 0.128 ± 0.072
HisAsp: 1.064 ± 0.22
HisGlu: 0.852 ± 0.208
HisPhe: 0.937 ± 0.222
HisGly: 1.022 ± 0.191
HisHis: 0.17 ± 0.081
HisIle: 1.107 ± 0.235
HisLys: 1.405 ± 0.262
HisLeu: 1.703 ± 0.282
HisMet: 0.383 ± 0.13
HisAsn: 1.32 ± 0.273
HisPro: 0.511 ± 0.149
HisGln: 0.554 ± 0.148
HisArg: 0.341 ± 0.128
HisSer: 0.937 ± 0.178
HisThr: 0.937 ± 0.204
HisVal: 1.064 ± 0.247
HisTrp: 0.255 ± 0.104
HisTyr: 0.681 ± 0.169
HisXaa: 0.0 ± 0.0
Ile
IleAla: 4.13 ± 0.466
IleCys: 0.681 ± 0.221
IleAsp: 6.259 ± 0.582
IleGlu: 5.62 ± 0.462
IlePhe: 2.682 ± 0.343
IleGly: 4.854 ± 0.442
IleHis: 1.32 ± 0.239
IleIle: 4.045 ± 0.608
IleLys: 7.196 ± 0.704
IleLeu: 4.215 ± 0.384
IleMet: 1.959 ± 0.281
IleAsn: 4.173 ± 0.395
IlePro: 1.831 ± 0.275
IleGln: 2.384 ± 0.312
IleArg: 2.47 ± 0.381
IleSer: 3.832 ± 0.503
IleThr: 4.684 ± 0.605
IleVal: 4.726 ± 0.536
IleTrp: 0.383 ± 0.137
IleTyr: 2.682 ± 0.413
IleXaa: 0.0 ± 0.0
Lys
LysAla: 4.939 ± 0.511
LysCys: 0.894 ± 0.238
LysAsp: 5.918 ± 0.549
LysGlu: 8.431 ± 0.76
LysPhe: 3.832 ± 0.302
LysGly: 4.811 ± 0.421
LysHis: 1.064 ± 0.258
LysIle: 6.515 ± 0.513
LysLys: 6.94 ± 0.719
LysLeu: 7.281 ± 0.61
LysMet: 3.108 ± 0.38
LysAsn: 4.513 ± 0.48
LysPro: 2.895 ± 0.398
LysGln: 2.98 ± 0.394
LysArg: 3.491 ± 0.388
LysSer: 4.471 ± 0.364
LysThr: 4.386 ± 0.468
LysVal: 7.153 ± 0.583
LysTrp: 1.064 ± 0.184
LysTyr: 4.258 ± 0.437
LysXaa: 0.0 ± 0.0
Leu
LeuAla: 5.195 ± 0.566
LeuCys: 0.468 ± 0.126
LeuAsp: 5.237 ± 0.447
LeuGlu: 6.515 ± 0.535
LeuPhe: 3.023 ± 0.353
LeuGly: 4.343 ± 0.435
LeuHis: 1.49 ± 0.206
LeuIle: 4.684 ± 0.571
LeuLys: 7.451 ± 0.509
LeuLeu: 4.769 ± 0.559
LeuMet: 2.086 ± 0.362
LeuAsn: 5.918 ± 0.503
LeuPro: 2.682 ± 0.401
LeuGln: 2.853 ± 0.397
LeuArg: 3.193 ± 0.362
LeuSer: 4.045 ± 0.461
LeuThr: 5.578 ± 0.42
LeuVal: 3.917 ± 0.445
LeuTrp: 1.192 ± 0.265
LeuTyr: 2.427 ± 0.369
LeuXaa: 0.0 ± 0.0
Met
MetAla: 1.831 ± 0.313
MetCys: 0.255 ± 0.117
MetAsp: 1.575 ± 0.212
MetGlu: 2.129 ± 0.32
MetPhe: 1.32 ± 0.239
MetGly: 1.575 ± 0.371
MetHis: 0.468 ± 0.169
MetIle: 2.597 ± 0.331
MetLys: 3.279 ± 0.312
MetLeu: 2.001 ± 0.361
MetMet: 0.809 ± 0.192
MetAsn: 1.703 ± 0.353
MetPro: 0.724 ± 0.221
MetGln: 1.064 ± 0.195
MetArg: 1.15 ± 0.214
MetSer: 2.001 ± 0.289
MetThr: 1.575 ± 0.216
MetVal: 2.257 ± 0.322
MetTrp: 0.213 ± 0.109
MetTyr: 1.49 ± 0.284
MetXaa: 0.0 ± 0.0
Asn
AsnAla: 2.98 ± 0.382
AsnCys: 0.383 ± 0.178
AsnAsp: 3.704 ± 0.384
AsnGlu: 4.386 ± 0.496
AsnPhe: 2.384 ± 0.359
AsnGly: 4.513 ± 0.446
AsnHis: 0.894 ± 0.184
AsnIle: 4.3 ± 0.423
AsnLys: 6.004 ± 0.615
AsnLeu: 5.407 ± 0.4
AsnMet: 2.129 ± 0.275
AsnAsn: 3.875 ± 0.401
AsnPro: 2.044 ± 0.38
AsnGln: 2.47 ± 0.568
AsnArg: 2.597 ± 0.503
AsnSer: 3.704 ± 0.587
AsnThr: 2.81 ± 0.46
AsnVal: 3.789 ± 0.414
AsnTrp: 0.681 ± 0.198
AsnTyr: 2.427 ± 0.368
AsnXaa: 0.0 ± 0.0
Pro
ProAla: 1.788 ± 0.269
ProCys: 0.17 ± 0.103
ProAsp: 1.575 ± 0.322
ProGlu: 2.214 ± 0.312
ProPhe: 1.831 ± 0.313
ProGly: 0.0 ± 0.0
ProHis: 0.298 ± 0.107
ProIle: 2.129 ± 0.364
ProLys: 2.384 ± 0.391
ProLeu: 1.831 ± 0.321
ProMet: 1.022 ± 0.219
ProAsn: 2.086 ± 0.395
ProPro: 0.341 ± 0.135
ProGln: 1.064 ± 0.203
ProArg: 0.511 ± 0.14
ProSer: 1.831 ± 0.279
ProThr: 1.746 ± 0.374
ProVal: 1.916 ± 0.232
ProTrp: 0.213 ± 0.09
ProTyr: 1.235 ± 0.301
ProXaa: 0.0 ± 0.0
Gln
GlnAla: 2.129 ± 0.583
GlnCys: 0.255 ± 0.11
GlnAsp: 1.32 ± 0.267
GlnGlu: 2.853 ± 0.419
GlnPhe: 1.192 ± 0.272
GlnGly: 2.512 ± 0.469
GlnHis: 0.468 ± 0.168
GlnIle: 2.427 ± 0.329
GlnLys: 2.81 ± 0.408
GlnLeu: 3.534 ± 0.397
GlnMet: 1.363 ± 0.26
GlnAsn: 2.257 ± 0.437
GlnPro: 1.064 ± 0.365
GlnGln: 2.938 ± 1.543
GlnArg: 1.703 ± 0.283
GlnSer: 1.788 ± 0.265
GlnThr: 2.001 ± 0.495
GlnVal: 2.044 ± 0.303
GlnTrp: 0.426 ± 0.144
GlnTyr: 1.363 ± 0.251
GlnXaa: 0.0 ± 0.0
Arg
ArgAla: 2.257 ± 0.397
ArgCys: 0.255 ± 0.113
ArgAsp: 2.555 ± 0.295
ArgGlu: 2.555 ± 0.342
ArgPhe: 2.086 ± 0.285
ArgGly: 2.512 ± 0.423
ArgHis: 0.852 ± 0.2
ArgIle: 2.853 ± 0.331
ArgLys: 3.364 ± 0.445
ArgLeu: 2.895 ± 0.397
ArgMet: 1.235 ± 0.237
ArgAsn: 2.299 ± 0.297
ArgPro: 1.107 ± 0.262
ArgGln: 1.363 ± 0.262
ArgArg: 1.405 ± 0.26
ArgSer: 1.873 ± 0.24
ArgThr: 2.129 ± 0.239
ArgVal: 2.512 ± 0.354
ArgTrp: 0.383 ± 0.124
ArgTyr: 1.746 ± 0.308
ArgXaa: 0.0 ± 0.0
Ser
SerAla: 3.789 ± 0.61
SerCys: 0.511 ± 0.162
SerAsp: 3.151 ± 0.355
SerGlu: 3.151 ± 0.417
SerPhe: 3.534 ± 0.393
SerGly: 5.067 ± 0.56
SerHis: 1.022 ± 0.237
SerIle: 3.832 ± 0.548
SerLys: 4.3 ± 0.413
SerLeu: 4.854 ± 0.432
SerMet: 1.873 ± 0.374
SerAsn: 2.597 ± 0.391
SerPro: 1.49 ± 0.321
SerGln: 1.788 ± 0.281
SerArg: 1.235 ± 0.246
SerSer: 3.406 ± 0.624
SerThr: 3.108 ± 0.335
SerVal: 2.895 ± 0.347
SerTrp: 0.766 ± 0.25
SerTyr: 2.768 ± 0.293
SerXaa: 0.0 ± 0.0
Thr
ThrAla: 4.428 ± 0.879
ThrCys: 0.255 ± 0.119
ThrAsp: 4.088 ± 0.414
ThrGlu: 3.875 ± 0.501
ThrPhe: 2.512 ± 0.374
ThrGly: 4.939 ± 0.592
ThrHis: 1.022 ± 0.207
ThrIle: 5.322 ± 0.527
ThrLys: 4.726 ± 0.461
ThrLeu: 4.258 ± 0.5
ThrMet: 1.363 ± 0.252
ThrAsn: 3.449 ± 0.391
ThrPro: 2.044 ± 0.355
ThrGln: 1.746 ± 0.351
ThrArg: 2.044 ± 0.271
ThrSer: 3.449 ± 0.565
ThrThr: 4.641 ± 0.634
ThrVal: 4.556 ± 0.493
ThrTrp: 0.341 ± 0.111
ThrTyr: 2.512 ± 0.32
ThrXaa: 0.0 ± 0.0
Val
ValAla: 3.789 ± 0.46
ValCys: 0.426 ± 0.138
ValAsp: 4.641 ± 0.474
ValGlu: 4.854 ± 0.367
ValPhe: 2.853 ± 0.414
ValGly: 3.789 ± 0.458
ValHis: 1.15 ± 0.23
ValIle: 4.045 ± 0.485
ValLys: 6.557 ± 0.613
ValLeu: 4.513 ± 0.469
ValMet: 1.661 ± 0.328
ValAsn: 4.045 ± 0.413
ValPro: 2.342 ± 0.307
ValGln: 2.129 ± 0.39
ValArg: 2.938 ± 0.386
ValSer: 3.577 ± 0.387
ValThr: 4.939 ± 0.637
ValVal: 4.726 ± 0.484
ValTrp: 0.937 ± 0.222
ValTyr: 2.64 ± 0.37
ValXaa: 0.0 ± 0.0
Trp
TrpAla: 0.511 ± 0.224
TrpCys: 0.128 ± 0.071
TrpAsp: 1.107 ± 0.202
TrpGlu: 0.724 ± 0.169
TrpPhe: 0.809 ± 0.207
TrpGly: 0.681 ± 0.175
TrpHis: 0.298 ± 0.112
TrpIle: 0.937 ± 0.27
TrpLys: 0.681 ± 0.182
TrpLeu: 0.766 ± 0.222
TrpMet: 0.213 ± 0.091
TrpAsn: 0.852 ± 0.181
TrpPro: 0.043 ± 0.038
TrpGln: 0.426 ± 0.177
TrpArg: 0.426 ± 0.144
TrpSer: 0.766 ± 0.233
TrpThr: 0.937 ± 0.19
TrpVal: 1.064 ± 0.211
TrpTrp: 0.085 ± 0.06
TrpTyr: 0.681 ± 0.163
TrpXaa: 0.0 ± 0.0
Tyr
TyrAla: 2.64 ± 0.439
TyrCys: 0.298 ± 0.105
TyrAsp: 3.619 ± 0.453
TyrGlu: 3.832 ± 0.381
TyrPhe: 1.959 ± 0.308
TyrGly: 2.64 ± 0.336
TyrHis: 0.724 ± 0.174
TyrIle: 2.64 ± 0.368
TyrLys: 3.151 ± 0.442
TyrLeu: 3.662 ± 0.438
TyrMet: 1.277 ± 0.221
TyrAsn: 2.853 ± 0.375
TyrPro: 0.724 ± 0.161
TyrGln: 0.937 ± 0.167
TyrArg: 1.916 ± 0.32
TyrSer: 2.47 ± 0.373
TyrThr: 2.597 ± 0.331
TyrVal: 2.768 ± 0.305
TyrTrp: 0.426 ± 0.119
TyrTyr: 2.47 ± 0.38
TyrXaa: 0.0 ± 0.0
Xaa
XaaAla: 0.0 ± 0.0
XaaCys: 0.0 ± 0.0
XaaAsp: 0.0 ± 0.0
XaaGlu: 0.0 ± 0.0
XaaPhe: 0.0 ± 0.0
XaaGly: 0.0 ± 0.0
XaaHis: 0.0 ± 0.0
XaaIle: 0.0 ± 0.0
XaaLys: 0.0 ± 0.0
XaaLeu: 0.0 ± 0.0
XaaMet: 0.0 ± 0.0
XaaAsn: 0.0 ± 0.0
XaaPro: 0.0 ± 0.0
XaaGln: 0.0 ± 0.0
XaaArg: 0.0 ± 0.0
XaaSer: 0.0 ± 0.0
XaaThr: 0.0 ± 0.0
XaaVal: 0.0 ± 0.0
XaaTrp: 0.0 ± 0.0
XaaTyr: 0.0 ± 0.0
XaaXaa: 0.0 ± 0.0
Statistics based on 120 proteins (23487 amino acids)

Note: The error was estimated by bootstrapping (x100) at the protein level.
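
A protein-level bootstrap of this kind can be sketched as follows. The page gives no details beyond "bootstrapping (x100) at the protein level", so this is only one plausible reading: whole proteins are resampled with replacement 100 times, the frequency is recomputed each time, and the standard deviation of those estimates is reported. The function names, the fixed seed, and the use of the sample standard deviation are assumptions made for this example.

```python
import random
import statistics
from collections import Counter


def pair_permille(proteins, pair):
    """Per-mille frequency of one dipeptide among all overlapping pairs."""
    counts = Counter()
    for seq in proteins:
        counts.update(seq[i:i + 2] for i in range(len(seq) - 1))
    total = sum(counts.values())
    return 1000.0 * counts[pair] / total if total else 0.0


def bootstrap_error(proteins, pair, n_resamples=100, seed=0):
    """Standard deviation of the frequency over protein-level resamples."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    estimates = []
    for _ in range(n_resamples):
        # Resample whole proteins with replacement, keeping the proteome size.
        sample = rng.choices(proteins, k=len(proteins))
        estimates.append(pair_permille(sample, pair))
    return statistics.stdev(estimates)
```

Calling bootstrap_error(proteins, "AK") with a list of one-letter sequences would return an error estimate in the same per-mille units as the table. Resampling whole proteins rather than individual residues keeps each protein's internal composition intact.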

The dipeptide statistics above (among other statistics for this proteome) can be downloaded from this CSV file
See this proteome in: uniprot_link
Proteome-pI is available under the Creative Commons Attribution-NoDerivs license; for more details see here

Reference: Kozlowski LP. Proteome-pI 2.0: Proteome Isoelectric Point Database Update. Nucleic Acids Res. 2021, doi: 10.1093/nar/gkab944
Contact: Lukasz P. Kozlowski