Amino acid dipeptide frequency for Mycobacterium phage StAnnes

Apart from single amino acid frequencies, one can also calculate the so-called amino acid dipeptide frequency, i.e. how often each ordered pair of adjacent amino acids occurs.
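As a rough illustration of the idea, the sketch below counts adjacent residue pairs in a set of protein sequences and reports them in per mille. It is not the Proteome-pI implementation: the function name `dipeptide_per_mille`, the use of one-letter codes, the overlapping windows, and the normalisation by the total number of counted pairs are all assumptions made for this example.

```python
from collections import Counter


def dipeptide_per_mille(proteins):
    """Return {dipeptide: frequency in per mille} over all given sequences.

    Assumptions (not taken from the source page): sequences use one-letter
    codes, windows overlap, and frequencies are normalised by the total
    number of counted pairs.
    """
    counts = Counter()
    for seq in proteins:
        # Slide a window of length 2; pairs never span two proteins.
        for i in range(len(seq) - 1):
            counts[seq[i:i + 2]] += 1
    total = sum(counts.values())
    if total == 0:
        return {}
    return {pair: 1000.0 * n / total for pair, n in counts.items()}


# Toy input: "MAAG" gives MA, AA, AG and "AAK" gives AA, AK (5 pairs in total).
freqs = dipeptide_per_mille(["MAAG", "AAK"])
```

With this toy input, `AA` accounts for two of the five counted pairs.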

There are 400 possible dipeptides (441 if Xaa is counted as an additional, 21st amino acid representing all non-standard or unknown amino acids). Thus, if dipeptides occurred in proteins at random with equal probability, each of the 400 dipeptides would be present at a frequency of 0.25% (2.5‰). As this is not the case in nature, for better visibility dipeptides that are more frequent than expected are marked in red, and those that are underrepresented are marked in blue in the table.
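The comparison against the uniform expectation can be sketched as follows. The helper `classify` and the simple above/below rule are assumptions for illustration; the exact threshold behind the page's colouring is not stated here. The three example values are taken from the table.

```python
# Uniform expectation over the 400 standard dipeptides: 1/400 = 0.25 % = 2.5 per mille.
EXPECTED_PER_MILLE = 1000.0 / 400


def classify(per_mille, expected=EXPECTED_PER_MILLE):
    """Label a frequency relative to the uniform expectation (hypothetical rule)."""
    if per_mille > expected:
        return "over"
    if per_mille < expected:
        return "under"
    return "expected"


# Values in per mille, copied from the table.
observed = {"AlaAla": 15.132, "CysCys": 0.0, "GluGlu": 2.583}
labels = {pair: classify(value) for pair, value in observed.items()}
fold = {pair: value / EXPECTED_PER_MILLE for pair, value in observed.items()}
```

Here `fold` gives the ratio of each observed frequency to the uniform expectation, which can be easier to compare across dipeptides than the raw per mille values.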

All values are given in per mille (‰) and therefore need to be multiplied by 10^-3 to obtain fractions.
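The unit conversion is simple arithmetic; a minimal sketch, with hypothetical helper names, is:

```python
def per_mille_to_fraction(value):
    """Convert a per mille value to a plain fraction (multiply by 10^-3)."""
    return value * 1e-3


def per_mille_to_percent(value):
    """Convert a per mille value to percent (divide by 10)."""
    return value / 10.0


# AlaAla is listed as 15.132 per mille in the table.
ala_ala_fraction = per_mille_to_fraction(15.132)
ala_ala_percent = per_mille_to_percent(15.132)
```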

For more information, see the sequence space article on Wikipedia.

Ala Cys Asp Glu Phe Gly His Ile Lys Leu Met Asn Pro Gln Arg Ser Thr Val Trp Tyr Xaa
Ala
AlaAla: 15.132 ± 1.735
AlaCys: 0.844 ± 0.205
AlaAsp: 7.434 ± 0.662
AlaGlu: 7.539 ± 0.776
AlaPhe: 2.952 ± 0.45
AlaGly: 9.49 ± 1.164
AlaHis: 2.109 ± 0.397
AlaIle: 4.165 ± 0.588
AlaLys: 3.638 ± 0.446
AlaLeu: 7.803 ± 0.738
AlaMet: 2.742 ± 0.417
AlaAsn: 2.214 ± 0.331
AlaPro: 4.798 ± 0.528
AlaGln: 3.532 ± 0.441
AlaArg: 7.961 ± 0.663
AlaSer: 5.325 ± 0.559
AlaThr: 5.694 ± 0.517
AlaVal: 6.59 ± 0.578
AlaTrp: 2.952 ± 0.446
AlaTyr: 2.531 ± 0.342
AlaXaa: 0.0 ± 0.0
Cys
CysAla: 1.371 ± 0.294
CysCys: 0.0 ± 0.0
CysAsp: 1.687 ± 0.352
CysGlu: 0.738 ± 0.209
CysPhe: 0.211 ± 0.129
CysGly: 1.634 ± 0.365
CysHis: 0.316 ± 0.128
CysIle: 0.369 ± 0.224
CysLys: 0.475 ± 0.169
CysLeu: 0.633 ± 0.222
CysMet: 0.264 ± 0.107
CysAsn: 0.422 ± 0.149
CysPro: 1.318 ± 0.321
CysGln: 0.316 ± 0.127
CysArg: 0.844 ± 0.245
CysSer: 0.791 ± 0.22
CysThr: 0.685 ± 0.207
CysVal: 0.633 ± 0.181
CysTrp: 0.316 ± 0.146
CysTyr: 0.158 ± 0.097
CysXaa: 0.0 ± 0.0
Asp
AspAla: 6.854 ± 0.528
AspCys: 1.213 ± 0.349
AspAsp: 4.798 ± 0.551
AspGlu: 2.952 ± 0.372
AspPhe: 2.109 ± 0.29
AspGly: 6.274 ± 0.589
AspHis: 1.371 ± 0.258
AspIle: 2.583 ± 0.363
AspLys: 1.74 ± 0.295
AspLeu: 5.852 ± 0.531
AspMet: 1.107 ± 0.279
AspAsn: 1.74 ± 0.366
AspPro: 5.272 ± 0.627
AspGln: 2.32 ± 0.382
AspArg: 5.43 ± 0.555
AspSer: 4.112 ± 0.571
AspThr: 3.691 ± 0.487
AspVal: 4.851 ± 0.533
AspTrp: 1.476 ± 0.294
AspTyr: 2.056 ± 0.327
AspXaa: 0.0 ± 0.0
Glu
GluAla: 6.116 ± 0.693
GluCys: 0.738 ± 0.219
GluAsp: 2.9 ± 0.418
GluGlu: 2.583 ± 0.585
GluPhe: 2.32 ± 0.393
GluGly: 3.48 ± 0.457
GluHis: 1.582 ± 0.336
GluIle: 2.32 ± 0.324
GluLys: 1.845 ± 0.251
GluLeu: 5.272 ± 0.68
GluMet: 1.371 ± 0.261
GluAsn: 2.162 ± 0.283
GluPro: 2.636 ± 0.389
GluGln: 2.9 ± 0.443
GluArg: 4.903 ± 0.571
GluSer: 2.794 ± 0.462
GluThr: 4.007 ± 0.53
GluVal: 3.796 ± 0.458
GluTrp: 1.371 ± 0.265
GluTyr: 1.845 ± 0.328
GluXaa: 0.0 ± 0.0
Phe
PheAla: 3.058 ± 0.396
PheCys: 0.369 ± 0.154
PheAsp: 2.478 ± 0.387
PheGlu: 1.634 ± 0.303
PhePhe: 0.738 ± 0.224
PheGly: 3.111 ± 0.592
PheHis: 0.422 ± 0.158
PheIle: 1.424 ± 0.315
PheLys: 0.896 ± 0.199
PheLeu: 1.845 ± 0.279
PheMet: 0.738 ± 0.236
PheAsn: 1.213 ± 0.294
PhePro: 1.74 ± 0.302
PheGln: 1.054 ± 0.303
PheArg: 1.582 ± 0.257
PheSer: 1.213 ± 0.289
PheThr: 2.689 ± 0.408
PheVal: 2.162 ± 0.261
PheTrp: 0.527 ± 0.156
PheTyr: 1.107 ± 0.269
PheXaa: 0.0 ± 0.0
Gly
GlyAla: 9.332 ± 1.059
GlyCys: 1.213 ± 0.32
GlyAsp: 5.958 ± 0.564
GlyGlu: 3.796 ± 0.522
GlyPhe: 3.058 ± 0.44
GlyGly: 10.017 ± 1.833
GlyHis: 1.793 ± 0.294
GlyIle: 4.481 ± 0.62
GlyLys: 2.636 ± 0.382
GlyLeu: 6.116 ± 0.618
GlyMet: 2.373 ± 0.419
GlyAsn: 2.9 ± 0.393
GlyPro: 4.112 ± 0.508
GlyGln: 2.32 ± 0.544
GlyArg: 5.325 ± 0.617
GlySer: 5.905 ± 0.827
GlyThr: 6.221 ± 0.661
GlyVal: 6.432 ± 0.705
GlyTrp: 2.583 ± 0.356
GlyTyr: 2.056 ± 0.317
GlyXaa: 0.0 ± 0.0
His
HisAla: 1.529 ± 0.383
HisCys: 0.369 ± 0.134
HisAsp: 0.949 ± 0.226
HisGlu: 1.318 ± 0.253
HisPhe: 0.475 ± 0.152
HisGly: 2.003 ± 0.373
HisHis: 0.949 ± 0.288
HisIle: 1.371 ± 0.282
HisLys: 0.738 ± 0.231
HisLeu: 1.16 ± 0.268
HisMet: 0.475 ± 0.156
HisAsn: 0.58 ± 0.19
HisPro: 1.74 ± 0.334
HisGln: 0.791 ± 0.239
HisArg: 2.478 ± 0.424
HisSer: 0.738 ± 0.204
HisThr: 1.582 ± 0.295
HisVal: 1.318 ± 0.289
HisTrp: 0.58 ± 0.181
HisTyr: 0.896 ± 0.211
HisXaa: 0.0 ± 0.0
Ile
IleAla: 5.114 ± 0.541
IleCys: 0.738 ± 0.206
IleAsp: 4.007 ± 0.546
IleGlu: 3.322 ± 0.353
IlePhe: 0.896 ± 0.262
IleGly: 3.691 ± 0.519
IleHis: 1.371 ± 0.319
IleIle: 1.529 ± 0.266
IleLys: 0.896 ± 0.218
IleLeu: 1.793 ± 0.382
IleMet: 0.475 ± 0.168
IleAsn: 2.162 ± 0.249
IlePro: 3.111 ± 0.317
IleGln: 1.318 ± 0.242
IleArg: 2.742 ± 0.426
IleSer: 2.003 ± 0.407
IleThr: 3.532 ± 0.444
IleVal: 2.478 ± 0.354
IleTrp: 0.949 ± 0.229
IleTyr: 0.949 ± 0.204
IleXaa: 0.0 ± 0.0
Lys
LysAla: 3.532 ± 0.517
LysCys: 0.422 ± 0.149
LysAsp: 1.582 ± 0.284
LysGlu: 1.213 ± 0.263
LysPhe: 1.107 ± 0.193
LysGly: 2.531 ± 0.371
LysHis: 0.896 ± 0.181
LysIle: 0.738 ± 0.212
LysLys: 1.529 ± 0.388
LysLeu: 2.267 ± 0.442
LysMet: 0.527 ± 0.15
LysAsn: 0.844 ± 0.183
LysPro: 2.583 ± 0.44
LysGln: 1.634 ± 0.282
LysArg: 2.583 ± 0.446
LysSer: 2.162 ± 0.329
LysThr: 1.793 ± 0.297
LysVal: 2.32 ± 0.419
LysTrp: 0.791 ± 0.191
LysTyr: 1.16 ± 0.284
LysXaa: 0.0 ± 0.0
Leu
LeuAla: 7.381 ± 0.618
LeuCys: 0.791 ± 0.243
LeuAsp: 5.43 ± 0.684
LeuGlu: 3.954 ± 0.484
LeuPhe: 2.267 ± 0.326
LeuGly: 5.114 ± 0.526
LeuHis: 1.002 ± 0.223
LeuIle: 3.374 ± 0.446
LeuLys: 2.162 ± 0.333
LeuLeu: 4.64 ± 0.559
LeuMet: 1.213 ± 0.233
LeuAsn: 2.531 ± 0.334
LeuPro: 5.378 ± 0.635
LeuGln: 2.636 ± 0.423
LeuArg: 5.061 ± 0.685
LeuSer: 5.536 ± 0.496
LeuThr: 5.22 ± 0.551
LeuVal: 4.903 ± 0.515
LeuTrp: 1.16 ± 0.257
LeuTyr: 1.951 ± 0.346
LeuXaa: 0.0 ± 0.0
Met
MetAla: 2.056 ± 0.425
MetCys: 0.211 ± 0.11
MetAsp: 1.476 ± 0.314
MetGlu: 0.844 ± 0.209
MetPhe: 0.685 ± 0.221
MetGly: 1.634 ± 0.27
MetHis: 0.264 ± 0.103
MetIle: 0.791 ± 0.213
MetLys: 0.896 ± 0.269
MetLeu: 1.582 ± 0.266
MetMet: 0.58 ± 0.235
MetAsn: 1.213 ± 0.255
MetPro: 1.318 ± 0.247
MetGln: 0.422 ± 0.123
MetArg: 1.529 ± 0.264
MetSer: 2.531 ± 0.332
MetThr: 2.003 ± 0.269
MetVal: 1.371 ± 0.302
MetTrp: 0.422 ± 0.12
MetTyr: 0.316 ± 0.116
MetXaa: 0.0 ± 0.0
Asn
AsnAla: 3.585 ± 0.42
AsnCys: 0.105 ± 0.074
AsnAsp: 2.267 ± 0.273
AsnGlu: 1.951 ± 0.291
AsnPhe: 0.791 ± 0.228
AsnGly: 4.376 ± 0.562
AsnHis: 0.738 ± 0.152
AsnIle: 1.582 ± 0.383
AsnLys: 0.896 ± 0.265
AsnLeu: 2.267 ± 0.372
AsnMet: 0.527 ± 0.147
AsnAsn: 1.793 ± 0.347
AsnPro: 2.742 ± 0.425
AsnGln: 1.213 ± 0.326
AsnArg: 2.109 ± 0.365
AsnSer: 1.582 ± 0.274
AsnThr: 2.267 ± 0.328
AsnVal: 2.109 ± 0.333
AsnTrp: 0.791 ± 0.157
AsnTyr: 0.791 ± 0.178
AsnXaa: 0.0 ± 0.0
Pro
ProAla: 5.061 ± 0.642
ProCys: 0.633 ± 0.203
ProAsp: 4.429 ± 0.576
ProGlu: 4.112 ± 0.55
ProPhe: 1.898 ± 0.313
ProGly: 6.485 ± 0.699
ProHis: 1.529 ± 0.277
ProIle: 2.373 ± 0.267
ProLys: 2.531 ± 0.44
ProLeu: 4.323 ± 0.569
ProMet: 1.424 ± 0.333
ProAsn: 2.794 ± 0.344
ProPro: 4.007 ± 0.597
ProGln: 1.898 ± 0.364
ProArg: 3.638 ± 0.549
ProSer: 2.847 ± 0.415
ProThr: 3.322 ± 0.518
ProVal: 5.22 ± 0.491
ProTrp: 1.16 ± 0.256
ProTyr: 1.529 ± 0.271
ProXaa: 0.0 ± 0.0
Gln
GlnAla: 4.323 ± 0.574
GlnCys: 0.422 ± 0.132
GlnAsp: 1.74 ± 0.279
GlnGlu: 1.793 ± 0.324
GlnPhe: 1.107 ± 0.221
GlnGly: 2.425 ± 0.418
GlnHis: 0.685 ± 0.192
GlnIle: 1.529 ± 0.334
GlnLys: 1.002 ± 0.232
GlnLeu: 3.374 ± 0.481
GlnMet: 0.633 ± 0.18
GlnAsn: 1.002 ± 0.276
GlnPro: 2.214 ± 0.375
GlnGln: 1.16 ± 0.269
GlnArg: 2.531 ± 0.41
GlnSer: 2.425 ± 0.383
GlnThr: 1.634 ± 0.333
GlnVal: 2.109 ± 0.363
GlnTrp: 0.791 ± 0.189
GlnTyr: 1.002 ± 0.262
GlnXaa: 0.0 ± 0.0
Arg
ArgAla: 6.854 ± 0.648
ArgCys: 1.529 ± 0.367
ArgAsp: 3.849 ± 0.505
ArgGlu: 5.061 ± 0.632
ArgPhe: 2.214 ± 0.371
ArgGly: 4.429 ± 0.486
ArgHis: 2.003 ± 0.37
ArgIle: 3.691 ± 0.464
ArgLys: 2.32 ± 0.385
ArgLeu: 5.378 ± 0.611
ArgMet: 2.689 ± 0.389
ArgAsn: 2.794 ± 0.502
ArgPro: 3.322 ± 0.419
ArgGln: 2.214 ± 0.352
ArgArg: 6.538 ± 0.739
ArgSer: 3.796 ± 0.36
ArgThr: 3.532 ± 0.517
ArgVal: 5.272 ± 0.565
ArgTrp: 1.687 ± 0.307
ArgTyr: 2.214 ± 0.336
ArgXaa: 0.0 ± 0.0
Ser
SerAla: 6.116 ± 0.829
SerCys: 0.633 ± 0.168
SerAsp: 3.902 ± 0.408
SerGlu: 3.058 ± 0.407
SerPhe: 1.687 ± 0.298
SerGly: 6.432 ± 0.731
SerHis: 1.054 ± 0.273
SerIle: 2.9 ± 0.372
SerLys: 2.425 ± 0.403
SerLeu: 3.638 ± 0.435
SerMet: 1.476 ± 0.281
SerAsn: 2.267 ± 0.294
SerPro: 3.427 ± 0.383
SerGln: 1.582 ± 0.281
SerArg: 3.48 ± 0.424
SerSer: 3.322 ± 0.517
SerThr: 3.374 ± 0.469
SerVal: 4.534 ± 0.525
SerTrp: 1.793 ± 0.303
SerTyr: 1.318 ± 0.235
SerXaa: 0.0 ± 0.0
Thr
ThrAla: 6.221 ± 0.651
ThrCys: 0.949 ± 0.239
ThrAsp: 4.271 ± 0.601
ThrGlu: 3.269 ± 0.404
ThrPhe: 1.687 ± 0.266
ThrGly: 6.063 ± 0.659
ThrHis: 1.476 ± 0.325
ThrIle: 3.427 ± 0.421
ThrLys: 1.898 ± 0.324
ThrLeu: 4.851 ± 0.52
ThrMet: 1.318 ± 0.269
ThrAsn: 2.267 ± 0.4
ThrPro: 4.903 ± 0.565
ThrGln: 2.056 ± 0.303
ThrArg: 3.796 ± 0.391
ThrSer: 3.585 ± 0.416
ThrThr: 4.64 ± 0.536
ThrVal: 5.061 ± 0.536
ThrTrp: 0.896 ± 0.254
ThrTyr: 2.32 ± 0.35
ThrXaa: 0.0 ± 0.0
Val
ValAla: 7.329 ± 0.595
ValCys: 1.213 ± 0.219
ValAsp: 5.22 ± 0.592
ValGlu: 4.534 ± 0.506
ValPhe: 2.267 ± 0.36
ValGly: 5.958 ± 0.599
ValHis: 1.213 ± 0.275
ValIle: 2.162 ± 0.414
ValLys: 2.162 ± 0.348
ValLeu: 4.903 ± 0.518
ValMet: 1.371 ± 0.22
ValAsn: 2.214 ± 0.306
ValPro: 4.06 ± 0.418
ValGln: 2.636 ± 0.342
ValArg: 4.376 ± 0.566
ValSer: 5.009 ± 0.579
ValThr: 5.378 ± 0.468
ValVal: 6.59 ± 0.823
ValTrp: 1.898 ± 0.368
ValTyr: 1.371 ± 0.279
ValXaa: 0.0 ± 0.0
Trp
TrpAla: 1.845 ± 0.301
TrpCys: 0.369 ± 0.139
TrpAsp: 1.74 ± 0.317
TrpGlu: 1.213 ± 0.314
TrpPhe: 0.844 ± 0.198
TrpGly: 1.054 ± 0.229
TrpHis: 0.685 ± 0.2
TrpIle: 1.16 ± 0.218
TrpLys: 0.844 ± 0.206
TrpLeu: 1.898 ± 0.342
TrpMet: 0.633 ± 0.233
TrpAsn: 0.685 ± 0.227
TrpPro: 1.213 ± 0.272
TrpGln: 1.002 ± 0.235
TrpArg: 2.267 ± 0.388
TrpSer: 1.424 ± 0.321
TrpThr: 1.634 ± 0.28
TrpVal: 1.845 ± 0.438
TrpTrp: 0.791 ± 0.173
TrpTyr: 0.527 ± 0.168
TrpXaa: 0.0 ± 0.0
Tyr
TyrAla: 2.636 ± 0.377
TyrCys: 0.422 ± 0.165
TyrAsp: 1.74 ± 0.34
TyrGlu: 2.056 ± 0.322
TyrPhe: 0.685 ± 0.172
TyrGly: 2.425 ± 0.4
TyrHis: 0.58 ± 0.192
TyrIle: 1.107 ± 0.241
TyrLys: 0.685 ± 0.19
TyrLeu: 2.109 ± 0.311
TyrMet: 0.158 ± 0.086
TyrAsn: 0.685 ± 0.154
TyrPro: 1.318 ± 0.196
TyrGln: 0.896 ± 0.273
TyrArg: 2.214 ± 0.404
TyrSer: 1.318 ± 0.255
TyrThr: 2.056 ± 0.379
TyrVal: 2.267 ± 0.306
TyrTrp: 0.738 ± 0.216
TyrTyr: 0.844 ± 0.166
TyrXaa: 0.0 ± 0.0
Xaa
XaaAla: 0.0 ± 0.0
XaaCys: 0.0 ± 0.0
XaaAsp: 0.0 ± 0.0
XaaGlu: 0.0 ± 0.0
XaaPhe: 0.0 ± 0.0
XaaGly: 0.0 ± 0.0
XaaHis: 0.0 ± 0.0
XaaIle: 0.0 ± 0.0
XaaLys: 0.0 ± 0.0
XaaLeu: 0.0 ± 0.0
XaaMet: 0.0 ± 0.0
XaaAsn: 0.0 ± 0.0
XaaPro: 0.0 ± 0.0
XaaGln: 0.0 ± 0.0
XaaArg: 0.0 ± 0.0
XaaSer: 0.0 ± 0.0
XaaThr: 0.0 ± 0.0
XaaVal: 0.0 ± 0.0
XaaTrp: 0.0 ± 0.0
XaaTyr: 0.0 ± 0.0
XaaXaa: 0.0 ± 0.0
Statistics based on 110 proteins (18968 amino acids)

Note: The error has been estimated by bootstrapping (x100) at the protein level.
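Protein-level bootstrapping can be sketched as follows: whole proteins are resampled with replacement, the frequency is recomputed for each resample, and the spread of those estimates is reported. This is only an illustration under stated assumptions. The function names, the one-letter sequences, the normalisation by counted pairs, and the use of the sample standard deviation as the reported error are all hypothetical; the page only states that 100 bootstrap rounds were run at the protein level.

```python
import random
import statistics
from collections import Counter


def pair_per_mille(proteins, pair):
    """Frequency of one dipeptide in per mille (assumed normalisation: all counted pairs)."""
    counts = Counter()
    for seq in proteins:
        for i in range(len(seq) - 1):
            counts[seq[i:i + 2]] += 1
    total = sum(counts.values())
    return 1000.0 * counts[pair] / total if total else 0.0


def bootstrap_error(proteins, pair, n_rounds=100, seed=0):
    """Standard deviation of the frequency over protein-level resamples."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    estimates = []
    for _ in range(n_rounds):
        # Resample whole proteins with replacement, keeping the sample size.
        sample = rng.choices(proteins, k=len(proteins))
        estimates.append(pair_per_mille(sample, pair))
    return statistics.stdev(estimates)


# Toy proteome in one-letter code (made up for this example).
toy_proteins = ["MAAGK", "MKAAL", "MGGAA", "MLKGA"]
toy_error = bootstrap_error(toy_proteins, "AA")
```

Resampling whole proteins rather than individual residues keeps the pairs within each sequence intact, so the estimate reflects variation between proteins.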

The above dipeptide statistics (among other statistics for this proteome) can be downloaded from this CSV file
See this proteome in: uniprot_link
Proteome-pI is available under Creative Commons Attribution-NoDerivs license, for more details see here

Reference: Kozlowski LP. Proteome-pI 2.0: Proteome Isoelectric Point Database Update. Nucleic Acids Res. 2021, doi: 10.1093/nar/gkab944
Contact: Lukasz P. Kozlowski