Amino acid dipeptide frequency for Mycobacterium phage SimranZ1

Apart from single amino acid frequencies, one can also calculate the so-called amino acid dipeptide frequency, i.e. how often each ordered pair of adjacent amino acids occurs.
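As an illustration, dipeptide frequencies can be computed by counting overlapping pairs of adjacent residues. The sketch below is a minimal example, not the code used by the database: the function name, the one-letter sequence input, and the choice to normalize by the total number of counted pairs are all assumptions, and the toy sequences are hypothetical.

```python
from collections import Counter

def dipeptide_per_mille(proteins):
    """Return {dipeptide: frequency in per mille} for overlapping residue pairs.

    Assumption: frequencies are normalized by the total number of pairs counted.
    """
    counts = Counter()
    for seq in proteins:
        # Pairs are counted within one protein only, never across two proteins.
        for i in range(len(seq) - 1):
            counts[seq[i:i + 2]] += 1
    total = sum(counts.values())
    if total == 0:
        return {}
    return {pair: 1000.0 * n / total for pair, n in counts.items()}

# Toy input (hypothetical sequences, not taken from the SimranZ1 proteome).
example = dipeptide_per_mille(["MAAG", "AAK"])
print(example)
```

The table on this page uses three-letter codes (AlaAla, AlaCys, ...); mapping one-letter codes to three-letter codes would be a separate lookup step.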

There are 400 possible dipeptides (441 if Xaa is counted as an additional 21st amino acid representing all non-standard or unknown amino acids). Thus, if dipeptides occurred uniformly at random in proteins, each of the 400 dipeptides would be present at a frequency of 0.25% (2.5‰). As this is not the case in nature, for better visibility all dipeptides that are more frequent than expected are marked in red, and those that are underrepresented are marked in blue in the table.
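The arithmetic behind the expected frequency, and the comparison against it, can be sketched as follows. This is only an illustration: the function names are hypothetical, and the simple greater-than/less-than comparison with the uniform expectation is an assumption about how the red/blue marking is decided (the database may apply a different rule).

```python
STANDARD = 20   # standard amino acids
WITH_XAA = 21   # plus Xaa for non-standard or unknown residues

def uniform_expectation_per_mille(alphabet_size):
    """Frequency each dipeptide would have if all pairs were equally likely."""
    return 1000.0 / (alphabet_size ** 2)

def classify(observed_per_mille, expected_per_mille):
    """Label a dipeptide relative to the uniform expectation (assumed rule)."""
    if observed_per_mille > expected_per_mille:
        return "over"    # would be marked in red
    if observed_per_mille < expected_per_mille:
        return "under"   # would be marked in blue
    return "as expected"

expected = uniform_expectation_per_mille(STANDARD)  # 1000 / 400 = 2.5 per mille = 0.25%
print(expected)
print(classify(15.733, expected))  # AlaAla value from the table
print(classify(0.168, expected))   # CysCys value from the table
```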

All values are given in per mille (‰), so they need to be multiplied by 10^-3 to obtain fractions (for example, 15.733‰ = 0.015733).

For more information, see the sequence space article on Wikipedia.

Ala Cys Asp Glu Phe Gly His Ile Lys Leu Met Asn Pro Gln Arg Ser Thr Val Trp Tyr Xaa
Ala
AlaAla: 15.733 ± 2.243
AlaCys: 0.784 ± 0.263
AlaAsp: 6.943 ± 0.718
AlaGlu: 8.791 ± 0.988
AlaPhe: 2.632 ± 0.357
AlaGly: 9.295 ± 1.144
AlaHis: 1.848 ± 0.351
AlaIle: 3.807 ± 0.438
AlaLys: 3.919 ± 0.421
AlaLeu: 9.406 ± 0.781
AlaMet: 2.408 ± 0.407
AlaAsn: 3.08 ± 0.476
AlaPro: 4.535 ± 0.523
AlaGln: 3.191 ± 0.527
AlaArg: 7.223 ± 0.731
AlaSer: 6.215 ± 0.726
AlaThr: 6.551 ± 0.655
AlaVal: 7.111 ± 0.604
AlaTrp: 2.352 ± 0.472
AlaTyr: 2.296 ± 0.302
AlaXaa: 0.0 ± 0.0
Cys
CysAla: 1.064 ± 0.271
CysCys: 0.168 ± 0.108
CysAsp: 1.008 ± 0.283
CysGlu: 0.616 ± 0.186
CysPhe: 0.28 ± 0.131
CysGly: 1.624 ± 0.365
CysHis: 0.168 ± 0.086
CysIle: 0.392 ± 0.162
CysLys: 0.504 ± 0.162
CysLeu: 0.84 ± 0.303
CysMet: 0.112 ± 0.081
CysAsn: 0.336 ± 0.128
CysPro: 1.232 ± 0.277
CysGln: 0.112 ± 0.075
CysArg: 0.896 ± 0.28
CysSer: 0.728 ± 0.215
CysThr: 0.672 ± 0.208
CysVal: 0.504 ± 0.142
CysTrp: 0.112 ± 0.078
CysTyr: 0.336 ± 0.145
CysXaa: 0.0 ± 0.0
Asp
AspAla: 5.991 ± 0.569
AspCys: 0.84 ± 0.227
AspAsp: 4.815 ± 0.632
AspGlu: 3.583 ± 0.519
AspPhe: 1.232 ± 0.222
AspGly: 6.327 ± 0.603
AspHis: 1.736 ± 0.257
AspIle: 2.8 ± 0.432
AspLys: 1.848 ± 0.273
AspLeu: 5.991 ± 0.563
AspMet: 1.12 ± 0.246
AspAsn: 1.512 ± 0.342
AspPro: 4.255 ± 0.504
AspGln: 2.632 ± 0.361
AspArg: 5.543 ± 0.578
AspSer: 3.359 ± 0.459
AspThr: 3.639 ± 0.38
AspVal: 4.647 ± 0.585
AspTrp: 1.68 ± 0.293
AspTyr: 1.848 ± 0.355
AspXaa: 0.0 ± 0.0
Glu
GluAla: 6.103 ± 0.682
GluCys: 0.896 ± 0.256
GluAsp: 3.08 ± 0.391
GluGlu: 3.024 ± 0.537
GluPhe: 2.408 ± 0.287
GluGly: 2.968 ± 0.412
GluHis: 1.232 ± 0.36
GluIle: 2.408 ± 0.426
GluLys: 1.792 ± 0.333
GluLeu: 5.655 ± 0.669
GluMet: 1.456 ± 0.329
GluAsn: 2.128 ± 0.285
GluPro: 2.744 ± 0.363
GluGln: 3.247 ± 0.491
GluArg: 5.095 ± 0.664
GluSer: 3.247 ± 0.51
GluThr: 3.919 ± 0.511
GluVal: 4.087 ± 0.579
GluTrp: 1.008 ± 0.197
GluTyr: 1.624 ± 0.397
GluXaa: 0.0 ± 0.0
Phe
PheAla: 3.024 ± 0.448
PheCys: 0.392 ± 0.149
PheAsp: 2.24 ± 0.336
PheGlu: 1.288 ± 0.261
PhePhe: 0.784 ± 0.247
PheGly: 3.191 ± 0.637
PheHis: 0.392 ± 0.152
PheIle: 1.12 ± 0.311
PheLys: 1.176 ± 0.286
PheLeu: 1.792 ± 0.31
PheMet: 0.84 ± 0.234
PheAsn: 1.232 ± 0.36
PhePro: 1.512 ± 0.314
PheGln: 0.896 ± 0.265
PheArg: 1.288 ± 0.248
PheSer: 1.624 ± 0.345
PheThr: 1.848 ± 0.251
PheVal: 1.96 ± 0.243
PheTrp: 0.672 ± 0.171
PheTyr: 0.616 ± 0.228
PheXaa: 0.0 ± 0.0
Gly
GlyAla: 9.071 ± 1.191
GlyCys: 1.064 ± 0.248
GlyAsp: 5.935 ± 0.531
GlyGlu: 3.919 ± 0.44
GlyPhe: 2.464 ± 0.427
GlyGly: 10.694 ± 2.084
GlyHis: 1.904 ± 0.294
GlyIle: 4.087 ± 0.474
GlyLys: 2.912 ± 0.445
GlyLeu: 5.935 ± 0.617
GlyMet: 2.296 ± 0.46
GlyAsn: 2.968 ± 0.431
GlyPro: 4.255 ± 0.506
GlyGln: 2.352 ± 0.562
GlyArg: 5.151 ± 0.707
GlySer: 5.655 ± 0.777
GlyThr: 7.167 ± 0.707
GlyVal: 5.487 ± 0.634
GlyTrp: 2.632 ± 0.394
GlyTyr: 2.128 ± 0.381
GlyXaa: 0.0 ± 0.0
His
HisAla: 2.016 ± 0.394
HisCys: 0.448 ± 0.21
HisAsp: 1.232 ± 0.292
HisGlu: 1.288 ± 0.291
HisPhe: 0.448 ± 0.143
HisGly: 1.736 ± 0.358
HisHis: 1.064 ± 0.32
HisIle: 1.176 ± 0.236
HisLys: 1.064 ± 0.255
HisLeu: 1.792 ± 0.332
HisMet: 0.616 ± 0.17
HisAsn: 0.784 ± 0.22
HisPro: 1.456 ± 0.304
HisGln: 0.896 ± 0.27
HisArg: 1.4 ± 0.3
HisSer: 0.84 ± 0.206
HisThr: 1.512 ± 0.355
HisVal: 0.952 ± 0.242
HisTrp: 0.56 ± 0.169
HisTyr: 0.952 ± 0.202
HisXaa: 0.0 ± 0.0
Ile
IleAla: 5.431 ± 0.514
IleCys: 0.448 ± 0.202
IleAsp: 3.415 ± 0.467
IleGlu: 3.024 ± 0.439
IlePhe: 0.728 ± 0.213
IleGly: 4.199 ± 0.504
IleHis: 1.232 ± 0.301
IleIle: 1.4 ± 0.244
IleLys: 1.064 ± 0.255
IleLeu: 2.184 ± 0.383
IleMet: 0.392 ± 0.142
IleAsn: 1.904 ± 0.333
IlePro: 2.632 ± 0.372
IleGln: 1.4 ± 0.256
IleArg: 2.576 ± 0.4
IleSer: 2.184 ± 0.433
IleThr: 3.695 ± 0.522
IleVal: 3.303 ± 0.441
IleTrp: 0.784 ± 0.186
IleTyr: 0.616 ± 0.172
IleXaa: 0.0 ± 0.0
Lys
LysAla: 4.087 ± 0.458
LysCys: 0.336 ± 0.14
LysAsp: 1.792 ± 0.341
LysGlu: 1.176 ± 0.2
LysPhe: 1.176 ± 0.175
LysGly: 2.352 ± 0.318
LysHis: 1.008 ± 0.236
LysIle: 1.008 ± 0.237
LysLys: 1.288 ± 0.257
LysLeu: 2.968 ± 0.537
LysMet: 0.448 ± 0.123
LysAsn: 1.008 ± 0.234
LysPro: 2.408 ± 0.354
LysGln: 1.4 ± 0.209
LysArg: 2.52 ± 0.42
LysSer: 1.848 ± 0.278
LysThr: 2.352 ± 0.423
LysVal: 2.52 ± 0.395
LysTrp: 0.728 ± 0.198
LysTyr: 1.12 ± 0.263
LysXaa: 0.0 ± 0.0
Leu
LeuAla: 8.735 ± 0.751
LeuCys: 0.672 ± 0.244
LeuAsp: 5.319 ± 0.566
LeuGlu: 4.031 ± 0.516
LeuPhe: 2.128 ± 0.298
LeuGly: 5.711 ± 0.575
LeuHis: 1.12 ± 0.287
LeuIle: 3.247 ± 0.395
LeuLys: 2.184 ± 0.344
LeuLeu: 4.703 ± 0.553
LeuMet: 1.456 ± 0.259
LeuAsn: 2.744 ± 0.469
LeuPro: 5.655 ± 0.595
LeuGln: 2.576 ± 0.5
LeuArg: 5.991 ± 0.71
LeuSer: 5.935 ± 0.603
LeuThr: 5.991 ± 0.525
LeuVal: 5.263 ± 0.559
LeuTrp: 1.064 ± 0.182
LeuTyr: 2.408 ± 0.419
LeuXaa: 0.0 ± 0.0
Met
MetAla: 1.736 ± 0.347
MetCys: 0.168 ± 0.099
MetAsp: 1.232 ± 0.271
MetGlu: 0.896 ± 0.178
MetPhe: 0.728 ± 0.227
MetGly: 1.848 ± 0.26
MetHis: 0.224 ± 0.1
MetIle: 0.84 ± 0.228
MetLys: 0.896 ± 0.322
MetLeu: 1.624 ± 0.236
MetMet: 0.56 ± 0.224
MetAsn: 0.952 ± 0.238
MetPro: 1.232 ± 0.24
MetGln: 0.168 ± 0.081
MetArg: 1.68 ± 0.301
MetSer: 2.856 ± 0.433
MetThr: 2.128 ± 0.333
MetVal: 1.792 ± 0.405
MetTrp: 0.392 ± 0.159
MetTyr: 0.28 ± 0.117
MetXaa: 0.0 ± 0.0
Asn
AsnAla: 3.191 ± 0.348
AsnCys: 0.112 ± 0.073
AsnAsp: 1.68 ± 0.287
AsnGlu: 2.128 ± 0.4
AsnPhe: 0.728 ± 0.253
AsnGly: 4.199 ± 0.476
AsnHis: 0.896 ± 0.215
AsnIle: 1.568 ± 0.431
AsnLys: 1.008 ± 0.228
AsnLeu: 2.52 ± 0.325
AsnMet: 0.784 ± 0.18
AsnAsn: 1.512 ± 0.343
AsnPro: 2.632 ± 0.347
AsnGln: 0.952 ± 0.283
AsnArg: 1.96 ± 0.415
AsnSer: 1.624 ± 0.252
AsnThr: 2.24 ± 0.308
AsnVal: 1.904 ± 0.387
AsnTrp: 0.672 ± 0.194
AsnTyr: 0.84 ± 0.185
AsnXaa: 0.0 ± 0.0
Pro
ProAla: 5.431 ± 0.57
ProCys: 0.784 ± 0.274
ProAsp: 4.367 ± 0.584
ProGlu: 4.703 ± 0.563
ProPhe: 1.736 ± 0.278
ProGly: 6.719 ± 0.74
ProHis: 1.344 ± 0.276
ProIle: 2.184 ± 0.317
ProLys: 1.848 ± 0.324
ProLeu: 4.423 ± 0.502
ProMet: 1.12 ± 0.257
ProAsn: 2.24 ± 0.278
ProPro: 3.863 ± 0.568
ProGln: 2.072 ± 0.4
ProArg: 2.968 ± 0.437
ProSer: 3.639 ± 0.472
ProThr: 3.191 ± 0.421
ProVal: 4.199 ± 0.511
ProTrp: 1.344 ± 0.283
ProTyr: 1.904 ± 0.355
ProXaa: 0.0 ± 0.0
Gln
GlnAla: 4.087 ± 0.522
GlnCys: 0.56 ± 0.223
GlnAsp: 1.232 ± 0.256
GlnGlu: 1.4 ± 0.269
GlnPhe: 0.952 ± 0.209
GlnGly: 2.576 ± 0.493
GlnHis: 0.56 ± 0.185
GlnIle: 1.792 ± 0.324
GlnLys: 1.232 ± 0.235
GlnLeu: 3.639 ± 0.492
GlnMet: 0.84 ± 0.19
GlnAsn: 0.952 ± 0.265
GlnPro: 2.464 ± 0.326
GlnGln: 1.792 ± 0.471
GlnArg: 2.24 ± 0.3
GlnSer: 2.016 ± 0.396
GlnThr: 1.848 ± 0.442
GlnVal: 2.24 ± 0.408
GlnTrp: 0.728 ± 0.174
GlnTyr: 1.344 ± 0.303
GlnXaa: 0.0 ± 0.0
Arg
ArgAla: 6.551 ± 0.669
ArgCys: 1.456 ± 0.31
ArgAsp: 5.095 ± 0.643
ArgGlu: 4.871 ± 0.618
ArgPhe: 2.352 ± 0.351
ArgGly: 4.031 ± 0.465
ArgHis: 1.568 ± 0.337
ArgIle: 3.583 ± 0.474
ArgLys: 2.072 ± 0.34
ArgLeu: 5.207 ± 0.594
ArgMet: 2.464 ± 0.387
ArgAsn: 1.736 ± 0.393
ArgPro: 3.135 ± 0.391
ArgGln: 1.848 ± 0.372
ArgArg: 4.647 ± 0.568
ArgSer: 4.143 ± 0.515
ArgThr: 3.303 ± 0.479
ArgVal: 4.927 ± 0.505
ArgTrp: 1.568 ± 0.317
ArgTyr: 2.24 ± 0.312
ArgXaa: 0.0 ± 0.0
Ser
SerAla: 7.559 ± 1.413
SerCys: 0.392 ± 0.137
SerAsp: 3.975 ± 0.549
SerGlu: 3.08 ± 0.548
SerPhe: 2.016 ± 0.397
SerGly: 6.327 ± 0.702
SerHis: 1.568 ± 0.277
SerIle: 2.632 ± 0.413
SerLys: 2.408 ± 0.378
SerLeu: 3.695 ± 0.519
SerMet: 1.624 ± 0.316
SerAsn: 2.072 ± 0.369
SerPro: 3.919 ± 0.498
SerGln: 1.624 ± 0.34
SerArg: 3.527 ± 0.389
SerSer: 4.199 ± 0.584
SerThr: 3.695 ± 0.51
SerVal: 4.647 ± 0.492
SerTrp: 1.4 ± 0.261
SerTyr: 1.568 ± 0.25
SerXaa: 0.0 ± 0.0
Thr
ThrAla: 5.991 ± 0.738
ThrCys: 0.616 ± 0.216
ThrAsp: 3.583 ± 0.48
ThrGlu: 4.031 ± 0.409
ThrPhe: 1.96 ± 0.351
ThrGly: 5.935 ± 0.606
ThrHis: 1.792 ± 0.354
ThrIle: 3.247 ± 0.432
ThrLys: 2.408 ± 0.345
ThrLeu: 5.039 ± 0.576
ThrMet: 1.12 ± 0.277
ThrAsn: 2.184 ± 0.316
ThrPro: 5.263 ± 0.633
ThrGln: 2.128 ± 0.341
ThrArg: 3.751 ± 0.469
ThrSer: 3.751 ± 0.435
ThrThr: 4.647 ± 0.655
ThrVal: 6.159 ± 0.709
ThrTrp: 1.176 ± 0.25
ThrTyr: 1.904 ± 0.314
ThrXaa: 0.0 ± 0.0
Val
ValAla: 7.727 ± 0.611
ValCys: 1.232 ± 0.265
ValAsp: 5.095 ± 0.572
ValGlu: 3.919 ± 0.421
ValPhe: 1.736 ± 0.306
ValGly: 5.543 ± 0.575
ValHis: 1.456 ± 0.347
ValIle: 2.912 ± 0.381
ValLys: 2.296 ± 0.363
ValLeu: 5.431 ± 0.539
ValMet: 1.512 ± 0.226
ValAsn: 2.296 ± 0.348
ValPro: 4.255 ± 0.342
ValGln: 2.8 ± 0.327
ValArg: 4.255 ± 0.521
ValSer: 5.151 ± 0.528
ValThr: 5.095 ± 0.471
ValVal: 5.487 ± 0.628
ValTrp: 1.792 ± 0.318
ValTyr: 1.288 ± 0.303
ValXaa: 0.0 ± 0.0
Trp
TrpAla: 2.016 ± 0.282
TrpCys: 0.28 ± 0.128
TrpAsp: 1.512 ± 0.286
TrpGlu: 0.84 ± 0.262
TrpPhe: 0.56 ± 0.173
TrpGly: 0.84 ± 0.208
TrpHis: 0.616 ± 0.214
TrpIle: 1.008 ± 0.193
TrpLys: 0.84 ± 0.16
TrpLeu: 1.848 ± 0.34
TrpMet: 0.728 ± 0.211
TrpAsn: 0.616 ± 0.224
TrpPro: 1.064 ± 0.253
TrpGln: 1.344 ± 0.272
TrpArg: 2.016 ± 0.363
TrpSer: 1.68 ± 0.409
TrpThr: 1.288 ± 0.283
TrpVal: 1.848 ± 0.408
TrpTrp: 0.896 ± 0.206
TrpTyr: 0.448 ± 0.166
TrpXaa: 0.0 ± 0.0
Tyr
TyrAla: 2.576 ± 0.42
TyrCys: 0.056 ± 0.055
TyrAsp: 1.848 ± 0.37
TyrGlu: 1.568 ± 0.29
TyrPhe: 0.84 ± 0.22
TyrGly: 1.96 ± 0.356
TyrHis: 0.504 ± 0.131
TyrIle: 1.4 ± 0.294
TyrLys: 0.784 ± 0.23
TyrLeu: 2.24 ± 0.285
TyrMet: 0.28 ± 0.122
TyrAsn: 0.896 ± 0.267
TyrPro: 1.568 ± 0.258
TyrGln: 1.064 ± 0.263
TyrArg: 2.072 ± 0.357
TyrSer: 1.232 ± 0.269
TyrThr: 1.904 ± 0.356
TyrVal: 2.24 ± 0.295
TyrTrp: 0.728 ± 0.176
TyrTyr: 0.728 ± 0.18
TyrXaa: 0.0 ± 0.0
Xaa
XaaAla: 0.0 ± 0.0
XaaCys: 0.0 ± 0.0
XaaAsp: 0.0 ± 0.0
XaaGlu: 0.0 ± 0.0
XaaPhe: 0.0 ± 0.0
XaaGly: 0.0 ± 0.0
XaaHis: 0.0 ± 0.0
XaaIle: 0.0 ± 0.0
XaaLys: 0.0 ± 0.0
XaaLeu: 0.0 ± 0.0
XaaMet: 0.0 ± 0.0
XaaAsn: 0.0 ± 0.0
XaaPro: 0.0 ± 0.0
XaaGln: 0.0 ± 0.0
XaaArg: 0.0 ± 0.0
XaaSer: 0.0 ± 0.0
XaaThr: 0.0 ± 0.0
XaaVal: 0.0 ± 0.0
XaaTrp: 0.0 ± 0.0
XaaTyr: 0.0 ± 0.0
XaaXaa: 0.0 ± 0.0
Statistics based on 103 proteins (17861 amino acids)

Note: The error has been estimated by bootstrapping (100 resamplings) at the protein level.
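Protein-level bootstrapping can be sketched as follows: whole proteins are resampled with replacement, the frequency is recomputed for each resample, and the spread of those estimates serves as the error. This is an illustrative sketch only. The function names, the use of the sample standard deviation as the error measure, the normalization by total pair count, and the toy sequences are assumptions and not details confirmed by the database.

```python
import random
import statistics
from collections import Counter

def pair_per_mille(proteins, pair):
    """Per mille frequency of one dipeptide among all overlapping pairs."""
    counts = Counter()
    for seq in proteins:
        for i in range(len(seq) - 1):
            counts[seq[i:i + 2]] += 1
    total = sum(counts.values())
    return 1000.0 * counts[pair] / total if total else 0.0

def bootstrap_error(proteins, pair, rounds=100, seed=0):
    """Standard deviation of the frequency over protein-level resamples."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    estimates = []
    for _ in range(rounds):
        # Resample whole proteins with replacement (same number of proteins).
        sample = rng.choices(proteins, k=len(proteins))
        estimates.append(pair_per_mille(sample, pair))
    return statistics.stdev(estimates)

# Toy input (hypothetical sequences, not taken from the SimranZ1 proteome).
toy = ["AAAA", "GGGG", "AAGG"]
err = bootstrap_error(toy, "AA")
print(pair_per_mille(toy, "AA"), err)
```

Resampling proteins rather than individual residues keeps each protein's composition intact, so proteins with unusual composition contribute to the estimated error.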

The dipeptide statistics above (among other statistics for this proteome) can be downloaded from this CSV file
See this proteome in: uniprot_link
Proteome-pI is available under the Creative Commons Attribution-NoDerivs license; for more details see here

Reference: Kozlowski LP. Proteome-pI 2.0: Proteome Isoelectric Point Database Update. Nucleic Acids Res. 2021, doi: 10.1093/nar/gkab944
Contact: Lukasz P. Kozlowski