Amino acid dipeptide frequency for Streptomyces phage Gibson

Apart from single amino acid frequencies, one can also calculate the so-called amino acid dipeptide frequency, i.e. the frequency of each pair of adjacent amino acids.

There are 400 possible dipeptides (441 if Xaa is counted as an additional 21st amino acid representing all non-standard or unknown amino acids). Thus, if the 400 standard dipeptides occurred randomly and with equal probability in proteins, each would be present at a frequency of 0.25% (2.5‰). Since this is not the case in nature, dipeptides that are more frequent than expected are marked in red in the table, and underrepresented ones are marked in blue, for better visibility.
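The counting described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the database's own code: the function name `dipeptide_permille`, the toy sequences, the choice to count overlapping pairs within each protein, and the normalisation by the total number of pairs are all assumptions, since the page does not state the exact procedure.

```python
from collections import Counter
from itertools import product

# The 20 standard amino acids in one-letter code.
STANDARD = "ACDEFGHIKLMNPQRSTVWY"

def dipeptide_permille(proteins):
    """Return {dipeptide: frequency in per mille} over a list of sequences.

    Assumptions (not taken from the source page): overlapping pairs are
    counted within each protein, never across protein boundaries, and the
    counts are normalised by the total number of pairs. Any residue outside
    STANDARD is mapped to 'X' (Xaa).
    """
    counts = Counter()
    for seq in proteins:
        cleaned = "".join(c if c in STANDARD else "X" for c in seq.upper())
        counts.update(cleaned[i:i + 2] for i in range(len(cleaned) - 1))
    total = sum(counts.values())
    if total == 0:
        return {}
    return {pair: 1000.0 * n / total for pair, n in counts.items()}

# Uniform expectation: 1/400 of all dipeptides, i.e. 0.25% or 2.5 per mille.
n_standard_pairs = len(list(product(STANDARD, repeat=2)))
uniform_permille = 1000.0 / n_standard_pairs

toy = ["MAAAK", "MAK"]  # made-up sequences, not from the phage proteome
freqs = dipeptide_permille(toy)
overrepresented = sorted(p for p, f in freqs.items() if f > uniform_permille)
```

Comparing each value with `uniform_permille` mirrors the red/blue marking described above: values above 2.5‰ are overrepresented relative to the uniform expectation, values below it are underrepresented.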

All values are given in per mille (‰); multiply them by 10⁻³ to obtain fractions.
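As a small worked example of the unit conversion, the AlaAla value from the table (14.865‰) can be turned into a fraction and compared with the uniform expectation of 1/400. The helper name `permille_to_fraction` is hypothetical and only serves the illustration.

```python
def permille_to_fraction(value_permille):
    """Convert a per mille value to a plain fraction (multiply by 10^-3)."""
    return value_permille * 1e-3

ala_ala = permille_to_fraction(14.865)  # AlaAla value from the table
uniform = 1.0 / 400                     # 0.25% expressed as a fraction
enrichment = ala_ala / uniform          # ratio to the uniform expectation
```

Under these numbers AlaAla comes out at roughly 1.49% of all dipeptides, about six times the uniform expectation.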

For more information, see the sequence space article on Wikipedia.

Dipeptide frequencies in ‰ (value ± error), grouped by first residue; second residues appear in the order:
Ala Cys Asp Glu Phe Gly His Ile Lys Leu Met Asn Pro Gln Arg Ser Thr Val Trp Tyr Xaa

Ala
AlaAla: 14.865 ± 2.197
AlaCys: 0.972 ± 0.247
AlaAsp: 6.898 ± 0.465
AlaGlu: 8.939 ± 1.174
AlaPhe: 2.915 ± 0.419
AlaGly: 8.598 ± 0.792
AlaHis: 1.652 ± 0.245
AlaIle: 4.421 ± 0.435
AlaLys: 5.295 ± 0.654
AlaLeu: 7.044 ± 0.607
AlaMet: 3.303 ± 0.423
AlaAsn: 3.643 ± 0.541
AlaPro: 3.789 ± 0.432
AlaGln: 4.081 ± 0.655
AlaArg: 5.684 ± 0.523
AlaSer: 4.324 ± 0.337
AlaThr: 7.773 ± 0.982
AlaVal: 6.558 ± 0.528
AlaTrp: 1.603 ± 0.295
AlaTyr: 2.526 ± 0.345
AlaXaa: 0.0 ± 0.0

Cys
CysAla: 0.972 ± 0.243
CysCys: 0.243 ± 0.109
CysAsp: 0.874 ± 0.227
CysGlu: 1.069 ± 0.245
CysPhe: 0.291 ± 0.129
CysGly: 1.166 ± 0.324
CysHis: 0.34 ± 0.117
CysIle: 0.34 ± 0.142
CysLys: 0.874 ± 0.214
CysLeu: 0.632 ± 0.235
CysMet: 0.437 ± 0.149
CysAsn: 0.34 ± 0.151
CysPro: 0.729 ± 0.208
CysGln: 0.68 ± 0.183
CysArg: 0.874 ± 0.2
CysSer: 0.583 ± 0.197
CysThr: 0.68 ± 0.192
CysVal: 0.534 ± 0.145
CysTrp: 0.194 ± 0.089
CysTyr: 0.291 ± 0.124
CysXaa: 0.0 ± 0.0

Asp
AspAla: 7.238 ± 0.601
AspCys: 0.874 ± 0.201
AspAsp: 6.364 ± 0.709
AspGlu: 7.578 ± 0.945
AspPhe: 2.235 ± 0.32
AspGly: 5.247 ± 0.552
AspHis: 1.069 ± 0.187
AspIle: 2.963 ± 0.356
AspLys: 2.38 ± 0.454
AspLeu: 5.975 ± 0.564
AspMet: 1.7 ± 0.288
AspAsn: 2.429 ± 0.367
AspPro: 3.935 ± 0.599
AspGln: 2.38 ± 0.341
AspArg: 3.886 ± 0.469
AspSer: 3.789 ± 0.455
AspThr: 3.546 ± 0.615
AspVal: 4.178 ± 0.323
AspTrp: 1.069 ± 0.29
AspTyr: 2.478 ± 0.273
AspXaa: 0.0 ± 0.0

Glu
GluAla: 8.744 ± 0.936
GluCys: 1.263 ± 0.241
GluAsp: 5.927 ± 0.905
GluGlu: 5.975 ± 0.808
GluPhe: 2.38 ± 0.32
GluGly: 4.664 ± 0.573
GluHis: 1.7 ± 0.308
GluIle: 2.623 ± 0.373
GluLys: 2.915 ± 0.475
GluLeu: 7.238 ± 0.629
GluMet: 2.235 ± 0.39
GluAsn: 2.089 ± 0.335
GluPro: 3.303 ± 0.416
GluGln: 2.866 ± 0.586
GluArg: 4.469 ± 0.662
GluSer: 2.672 ± 0.384
GluThr: 3.206 ± 0.353
GluVal: 4.469 ± 0.516
GluTrp: 1.457 ± 0.258
GluTyr: 1.36 ± 0.288
GluXaa: 0.0 ± 0.0

Phe
PheAla: 2.672 ± 0.336
PheCys: 0.243 ± 0.105
PheAsp: 2.137 ± 0.406
PheGlu: 1.895 ± 0.374
PhePhe: 0.874 ± 0.25
PheGly: 2.72 ± 0.288
PheHis: 0.534 ± 0.152
PheIle: 0.972 ± 0.194
PheLys: 1.992 ± 0.353
PheLeu: 1.992 ± 0.253
PheMet: 0.972 ± 0.239
PheAsn: 1.117 ± 0.222
PhePro: 1.506 ± 0.408
PheGln: 1.069 ± 0.21
PheArg: 1.895 ± 0.272
PheSer: 1.603 ± 0.407
PheThr: 1.652 ± 0.319
PheVal: 2.137 ± 0.298
PheTrp: 0.583 ± 0.173
PheTyr: 0.874 ± 0.193
PheXaa: 0.0 ± 0.0

Gly
GlyAla: 6.364 ± 0.722
GlyCys: 1.069 ± 0.266
GlyAsp: 6.072 ± 0.991
GlyGlu: 6.072 ± 0.501
GlyPhe: 2.575 ± 0.356
GlyGly: 6.218 ± 0.557
GlyHis: 1.506 ± 0.332
GlyIle: 3.935 ± 0.474
GlyLys: 4.178 ± 0.575
GlyLeu: 5.004 ± 0.553
GlyMet: 2.332 ± 0.44
GlyAsn: 3.012 ± 0.379
GlyPro: 3.789 ± 0.418
GlyGln: 2.429 ± 0.303
GlyArg: 4.566 ± 0.572
GlySer: 5.149 ± 0.47
GlyThr: 5.052 ± 0.533
GlyVal: 5.635 ± 0.526
GlyTrp: 2.283 ± 0.387
GlyTyr: 2.332 ± 0.311
GlyXaa: 0.0 ± 0.0

His
HisAla: 1.555 ± 0.282
HisCys: 0.194 ± 0.107
HisAsp: 1.166 ± 0.26
HisGlu: 1.603 ± 0.308
HisPhe: 0.583 ± 0.185
HisGly: 1.214 ± 0.272
HisHis: 0.437 ± 0.181
HisIle: 0.972 ± 0.241
HisLys: 1.652 ± 0.297
HisLeu: 1.943 ± 0.338
HisMet: 0.826 ± 0.24
HisAsn: 0.534 ± 0.166
HisPro: 1.069 ± 0.238
HisGln: 0.874 ± 0.218
HisArg: 1.409 ± 0.31
HisSer: 0.972 ± 0.177
HisThr: 1.214 ± 0.202
HisVal: 1.117 ± 0.227
HisTrp: 0.534 ± 0.154
HisTyr: 0.486 ± 0.14
HisXaa: 0.0 ± 0.0

Ile
IleAla: 4.129 ± 0.455
IleCys: 0.486 ± 0.136
IleAsp: 3.643 ± 0.478
IleGlu: 3.012 ± 0.368
IlePhe: 1.117 ± 0.242
IleGly: 2.963 ± 0.409
IleHis: 1.069 ± 0.23
IleIle: 1.506 ± 0.392
IleLys: 2.623 ± 0.363
IleLeu: 3.012 ± 0.444
IleMet: 0.874 ± 0.213
IleAsn: 1.555 ± 0.253
IlePro: 2.429 ± 0.388
IleGln: 2.186 ± 0.331
IleArg: 2.769 ± 0.369
IleSer: 2.38 ± 0.434
IleThr: 2.866 ± 0.391
IleVal: 2.866 ± 0.287
IleTrp: 0.68 ± 0.19
IleTyr: 0.972 ± 0.21
IleXaa: 0.0 ± 0.0

Lys
LysAla: 5.538 ± 0.711
LysCys: 0.34 ± 0.138
LysAsp: 2.769 ± 0.442
LysGlu: 2.672 ± 0.406
LysPhe: 1.214 ± 0.197
LysGly: 3.401 ± 0.433
LysHis: 1.117 ± 0.263
LysIle: 1.652 ± 0.312
LysLys: 2.769 ± 0.485
LysLeu: 4.518 ± 0.499
LysMet: 1.7 ± 0.315
LysAsn: 1.214 ± 0.225
LysPro: 3.158 ± 0.502
LysGln: 2.332 ± 0.42
LysArg: 3.789 ± 0.519
LysSer: 2.623 ± 0.353
LysThr: 3.838 ± 0.522
LysVal: 3.498 ± 0.479
LysTrp: 1.214 ± 0.237
LysTyr: 1.263 ± 0.23
LysXaa: 0.0 ± 0.0

Leu
LeuAla: 7.481 ± 0.536
LeuCys: 1.02 ± 0.213
LeuAsp: 5.247 ± 0.564
LeuGlu: 4.955 ± 0.551
LeuPhe: 2.235 ± 0.321
LeuGly: 5.781 ± 0.545
LeuHis: 1.846 ± 0.345
LeuIle: 2.915 ± 0.388
LeuLys: 4.032 ± 0.499
LeuLeu: 5.149 ± 0.531
LeuMet: 2.235 ± 0.386
LeuAsn: 2.72 ± 0.325
LeuPro: 4.032 ± 0.413
LeuGln: 2.089 ± 0.3
LeuArg: 5.344 ± 0.551
LeuSer: 4.858 ± 0.506
LeuThr: 4.712 ± 0.698
LeuVal: 5.684 ± 0.522
LeuTrp: 1.457 ± 0.228
LeuTyr: 2.332 ± 0.383
LeuXaa: 0.0 ± 0.0

Met
MetAla: 3.06 ± 0.401
MetCys: 0.34 ± 0.118
MetAsp: 1.846 ± 0.313
MetGlu: 1.749 ± 0.284
MetPhe: 0.874 ± 0.173
MetGly: 1.895 ± 0.401
MetHis: 0.34 ± 0.133
MetIle: 0.972 ± 0.173
MetLys: 0.923 ± 0.234
MetLeu: 2.04 ± 0.331
MetMet: 0.632 ± 0.154
MetAsn: 0.874 ± 0.201
MetPro: 1.117 ± 0.253
MetGln: 1.117 ± 0.221
MetArg: 2.478 ± 0.398
MetSer: 2.526 ± 0.377
MetThr: 2.186 ± 0.454
MetVal: 1.263 ± 0.289
MetTrp: 0.68 ± 0.179
MetTyr: 0.534 ± 0.199
MetXaa: 0.0 ± 0.0

Asn
AsnAla: 3.352 ± 0.456
AsnCys: 0.534 ± 0.149
AsnAsp: 2.478 ± 0.365
AsnGlu: 1.992 ± 0.35
AsnPhe: 0.534 ± 0.141
AsnGly: 3.06 ± 0.492
AsnHis: 0.826 ± 0.229
AsnIle: 1.263 ± 0.23
AsnLys: 1.749 ± 0.283
AsnLeu: 1.603 ± 0.316
AsnMet: 0.777 ± 0.175
AsnAsn: 1.312 ± 0.279
AsnPro: 2.283 ± 0.333
AsnGln: 1.797 ± 0.386
AsnArg: 1.846 ± 0.289
AsnSer: 1.797 ± 0.313
AsnThr: 2.089 ± 0.504
AsnVal: 2.186 ± 0.452
AsnTrp: 0.632 ± 0.222
AsnTyr: 0.777 ± 0.204
AsnXaa: 0.0 ± 0.0

Pro
ProAla: 4.712 ± 0.52
ProCys: 0.437 ± 0.12
ProAsp: 3.546 ± 0.412
ProGlu: 3.643 ± 0.465
ProPhe: 1.069 ± 0.199
ProGly: 4.372 ± 0.543
ProHis: 1.069 ± 0.195
ProIle: 1.846 ± 0.284
ProLys: 2.526 ± 0.332
ProLeu: 3.401 ± 0.523
ProMet: 0.874 ± 0.196
ProAsn: 1.36 ± 0.27
ProPro: 1.312 ± 0.285
ProGln: 1.36 ± 0.229
ProArg: 2.429 ± 0.406
ProSer: 2.769 ± 0.324
ProThr: 3.935 ± 0.599
ProVal: 3.303 ± 0.444
ProTrp: 0.729 ± 0.178
ProTyr: 1.652 ± 0.282
ProXaa: 0.0 ± 0.0

Gln
GlnAla: 4.664 ± 0.73
GlnCys: 0.34 ± 0.155
GlnAsp: 1.166 ± 0.248
GlnGlu: 2.332 ± 0.361
GlnPhe: 1.117 ± 0.261
GlnGly: 2.72 ± 0.349
GlnHis: 0.632 ± 0.14
GlnIle: 1.409 ± 0.261
GlnLys: 2.429 ± 0.449
GlnLeu: 3.206 ± 0.294
GlnMet: 1.214 ± 0.212
GlnAsn: 1.117 ± 0.195
GlnPro: 1.214 ± 0.257
GlnGln: 1.506 ± 0.294
GlnArg: 2.963 ± 0.447
GlnSer: 1.943 ± 0.358
GlnThr: 1.895 ± 0.271
GlnVal: 2.478 ± 0.347
GlnTrp: 0.632 ± 0.16
GlnTyr: 1.652 ± 0.386
GlnXaa: 0.0 ± 0.0

Arg
ArgAla: 5.927 ± 0.658
ArgCys: 0.826 ± 0.19
ArgAsp: 4.032 ± 0.534
ArgGlu: 4.081 ± 0.644
ArgPhe: 2.38 ± 0.354
ArgGly: 4.518 ± 0.463
ArgHis: 1.7 ± 0.431
ArgIle: 4.032 ± 0.53
ArgLys: 4.324 ± 0.548
ArgLeu: 5.101 ± 0.672
ArgMet: 2.137 ± 0.343
ArgAsn: 1.749 ± 0.322
ArgPro: 2.623 ± 0.358
ArgGln: 1.506 ± 0.313
ArgArg: 4.615 ± 0.755
ArgSer: 3.255 ± 0.484
ArgThr: 2.672 ± 0.431
ArgVal: 4.129 ± 0.423
ArgTrp: 1.263 ± 0.248
ArgTyr: 2.866 ± 0.464
ArgXaa: 0.0 ± 0.0

Ser
SerAla: 5.684 ± 0.651
SerCys: 0.972 ± 0.229
SerAsp: 3.886 ± 0.352
SerGlu: 3.595 ± 0.494
SerPhe: 1.457 ± 0.276
SerGly: 5.927 ± 0.773
SerHis: 0.874 ± 0.193
SerIle: 2.866 ± 0.369
SerLys: 2.137 ± 0.346
SerLeu: 4.226 ± 0.505
SerMet: 1.603 ± 0.286
SerAsn: 1.846 ± 0.324
SerPro: 2.235 ± 0.378
SerGln: 1.652 ± 0.284
SerArg: 2.575 ± 0.398
SerSer: 3.498 ± 0.453
SerThr: 3.741 ± 0.536
SerVal: 3.886 ± 0.609
SerTrp: 0.923 ± 0.201
SerTyr: 1.506 ± 0.286
SerXaa: 0.0 ± 0.0

Thr
ThrAla: 6.17 ± 0.731
ThrCys: 0.583 ± 0.161
ThrAsp: 4.081 ± 0.447
ThrGlu: 3.206 ± 0.4
ThrPhe: 1.992 ± 0.355
ThrGly: 6.558 ± 0.936
ThrHis: 1.457 ± 0.309
ThrIle: 3.303 ± 0.481
ThrLys: 2.332 ± 0.38
ThrLeu: 4.906 ± 0.569
ThrMet: 1.652 ± 0.448
ThrAsn: 2.089 ± 0.38
ThrPro: 3.546 ± 0.441
ThrGln: 1.603 ± 0.305
ThrArg: 4.178 ± 0.459
ThrSer: 3.206 ± 0.715
ThrThr: 3.935 ± 0.725
ThrVal: 5.052 ± 0.648
ThrTrp: 0.972 ± 0.178
ThrTyr: 1.555 ± 0.273
ThrXaa: 0.0 ± 0.0

Val
ValAla: 6.655 ± 0.653
ValCys: 0.729 ± 0.241
ValAsp: 5.489 ± 0.687
ValGlu: 4.566 ± 0.518
ValPhe: 2.235 ± 0.283
ValGly: 5.101 ± 0.533
ValHis: 1.555 ± 0.318
ValIle: 2.818 ± 0.39
ValLys: 3.206 ± 0.473
ValLeu: 5.878 ± 0.584
ValMet: 0.923 ± 0.255
ValAsn: 2.089 ± 0.299
ValPro: 2.429 ± 0.388
ValGln: 3.012 ± 0.354
ValArg: 4.421 ± 0.571
ValSer: 4.178 ± 0.502
ValThr: 4.275 ± 0.566
ValVal: 5.344 ± 0.522
ValTrp: 1.214 ± 0.269
ValTyr: 2.186 ± 0.354
ValXaa: 0.0 ± 0.0

Trp
TrpAla: 1.797 ± 0.284
TrpCys: 0.437 ± 0.136
TrpAsp: 1.603 ± 0.245
TrpGlu: 1.214 ± 0.269
TrpPhe: 0.534 ± 0.147
TrpGly: 0.777 ± 0.191
TrpHis: 0.243 ± 0.122
TrpIle: 1.166 ± 0.212
TrpLys: 0.777 ± 0.195
TrpLeu: 1.603 ± 0.29
TrpMet: 0.389 ± 0.145
TrpAsn: 0.826 ± 0.23
TrpPro: 0.874 ± 0.208
TrpGln: 0.729 ± 0.204
TrpArg: 1.409 ± 0.295
TrpSer: 1.02 ± 0.216
TrpThr: 1.555 ± 0.409
TrpVal: 1.166 ± 0.317
TrpTrp: 0.291 ± 0.12
TrpTyr: 0.68 ± 0.182
TrpXaa: 0.0 ± 0.0

Tyr
TyrAla: 3.303 ± 0.43
TyrCys: 0.243 ± 0.105
TyrAsp: 2.089 ± 0.408
TyrGlu: 1.409 ± 0.261
TyrPhe: 0.826 ± 0.216
TyrGly: 2.672 ± 0.43
TyrHis: 0.389 ± 0.134
TyrIle: 1.166 ± 0.184
TyrLys: 1.506 ± 0.355
TyrLeu: 1.652 ± 0.321
TyrMet: 0.583 ± 0.157
TyrAsn: 1.02 ± 0.211
TyrPro: 0.972 ± 0.241
TyrGln: 1.36 ± 0.258
TyrArg: 2.137 ± 0.361
TyrSer: 1.943 ± 0.312
TyrThr: 1.555 ± 0.286
TyrVal: 2.818 ± 0.612
TyrTrp: 0.68 ± 0.194
TyrTyr: 0.632 ± 0.172
TyrXaa: 0.0 ± 0.0

Xaa
XaaAla: 0.0 ± 0.0
XaaCys: 0.0 ± 0.0
XaaAsp: 0.0 ± 0.0
XaaGlu: 0.0 ± 0.0
XaaPhe: 0.0 ± 0.0
XaaGly: 0.0 ± 0.0
XaaHis: 0.0 ± 0.0
XaaIle: 0.0 ± 0.0
XaaLys: 0.0 ± 0.0
XaaLeu: 0.0 ± 0.0
XaaMet: 0.0 ± 0.0
XaaAsn: 0.0 ± 0.0
XaaPro: 0.0 ± 0.0
XaaGln: 0.0 ± 0.0
XaaArg: 0.0 ± 0.0
XaaSer: 0.0 ± 0.0
XaaThr: 0.0 ± 0.0
XaaVal: 0.0 ± 0.0
XaaTrp: 0.0 ± 0.0
XaaTyr: 0.0 ± 0.0
XaaXaa: 0.0 ± 0.0

Statistics based on 105 proteins (20586 amino acids)

Note: The error has been estimated by bootstrapping (×100) at the protein level.
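A protein-level bootstrap of this kind could look roughly like the sketch below. It is not the database's implementation: the function names, the toy sequences, the fixed seed, and in particular the choice to report the sample standard deviation across replicates as the error are assumptions, since the page only states that bootstrapping (×100) was done at the protein level.

```python
import random
import statistics
from collections import Counter

def pair_permille(proteins, pair):
    """Frequency of one dipeptide (per mille) over a list of sequences."""
    counts = Counter()
    for seq in proteins:
        counts.update(seq[i:i + 2] for i in range(len(seq) - 1))
    total = sum(counts.values())
    return 1000.0 * counts[pair] / total if total else 0.0

def bootstrap_error(proteins, pair, replicates=100, seed=0):
    """Spread of the frequency across protein-level resamples.

    Assumption: the error is the sample standard deviation of the
    per-replicate frequencies; the source does not name the statistic.
    """
    rng = random.Random(seed)
    estimates = []
    for _ in range(replicates):
        # Resample whole proteins with replacement, keeping the set size.
        sample = rng.choices(proteins, k=len(proteins))
        estimates.append(pair_permille(sample, pair))
    return statistics.stdev(estimates)

toy = ["MAAAK", "MAK", "MKKAA", "MGA"]  # made-up sequences
err = bootstrap_error(toy, "AA")
```

Resampling whole proteins rather than individual residues keeps each sequence intact, so the estimate reflects how much the frequency depends on which proteins happen to be in the set.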

The dipeptide statistics above (among other statistics for this proteome) can be downloaded from this CSV file
See this proteome in: uniprot_link
Proteome-pI is available under the Creative Commons Attribution-NoDerivs license; for more details see here

Reference: Kozlowski LP. Proteome-pI 2.0: Proteome Isoelectric Point Database Update. Nucleic Acids Res. 2021, doi: 10.1093/nar/gkab944
Contact: Lukasz P. Kozlowski