Amino acid dipeptide frequency for Shigella phage SHSML-45

Apart from single amino acid frequencies, one can also calculate the so-called amino acid dipeptide frequency, i.e. how often each ordered pair of adjacent residues occurs.

There are 400 possible dipeptides (441 if we count Xaa as an additional 21st amino acid representing all non-standard or unknown amino acids). Thus, if dipeptides occurred randomly and uniformly in proteins, each of the 400 dipeptides would have a frequency of 0.25% (2.5‰). As this is not the case in nature, for better visibility all dipeptides that occur more often than expected are marked in red in the table, and those that are underrepresented are marked in blue.
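The counting described above can be sketched in a few lines of Python. This is only an illustration, not the code used to build the database: the function names, the one-letter alphabet with `X` standing in for Xaa, and the choice to count overlapping pairs within each protein are all assumptions.

```python
from collections import Counter
from itertools import product

# 20 standard residues in one-letter code; "X" stands in for Xaa (assumption).
ALPHABET = "ACDEFGHIKLMNPQRSTVWYX"


def dipeptide_permille(proteins):
    """Return {dipeptide: frequency in per mille} for a list of sequences.

    Overlapping pairs are counted inside each protein; a pair never spans
    two proteins. Any symbol outside the standard 20 is mapped to X.
    """
    counts = Counter()
    for seq in proteins:
        seq = "".join(c if c in ALPHABET else "X" for c in seq.upper())
        counts.update(seq[i:i + 2] for i in range(len(seq) - 1))
    total = sum(counts.values())
    return {
        a + b: (1000.0 * counts[a + b] / total if total else 0.0)
        for a, b in product(ALPHABET, repeat=2)
    }


def uniform_expectation_permille(n_residues=20):
    """Frequency of each dipeptide if all pairs were equally likely."""
    return 1000.0 / n_residues ** 2


def classify(observed, expected):
    """Label a dipeptide relative to an expected frequency."""
    if observed > expected:
        return "over"
    if observed < expected:
        return "under"
    return "as expected"


if __name__ == "__main__":
    freqs = dipeptide_permille(["AAAC", "AC"])  # toy input, not real data
    expected = uniform_expectation_permille(20)  # 2.5 per mille
    print(freqs["AA"], classify(freqs["AA"], expected))
```

The uniform expectation is the simplest possible baseline. A baseline built from the single amino acid frequencies of the proteome would be stricter, and the page does not say which one the red and blue marking uses.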

All values are given in per mille (‰) and therefore need to be multiplied by 10^-3 to obtain fractions.
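As a small worked example of the unit conversion, the snippet below turns the AlaAla value from the table into a fraction and a percentage and compares it with the uniform 1/400 baseline. The helper names are hypothetical and exist only for this illustration.

```python
def permille_to_fraction(value_permille):
    """Convert a per mille value to a plain fraction (multiply by 10^-3)."""
    return value_permille * 1e-3


def permille_to_percent(value_permille):
    """Convert a per mille value to percent (divide by 10)."""
    return value_permille / 10.0


ala_ala = 7.507   # AlaAla frequency taken from the table, in per mille
uniform = 2.5     # 1/400 expressed in per mille

fraction = permille_to_fraction(ala_ala)   # roughly 0.0075
percent = permille_to_percent(ala_ala)     # roughly 0.75
ratio = ala_ala / uniform                  # roughly 3 times the uniform value

print(f"{fraction:.6f} {percent:.4f}% {ratio:.2f}x")
```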

For more information, see the sequence space article on Wikipedia.

Ala Cys Asp Glu Phe Gly His Ile Lys Leu Met Asn Pro Gln Arg Ser Thr Val Trp Tyr Xaa
Ala
AlaAla: 7.507 ± 1.171
AlaCys: 0.577 ± 0.143
AlaAsp: 3.465 ± 0.388
AlaGlu: 5.536 ± 0.578
AlaPhe: 3.193 ± 0.372
AlaGly: 4.789 ± 0.436
AlaHis: 1.528 ± 0.275
AlaIle: 4.348 ± 0.374
AlaLys: 5.91 ± 0.496
AlaLeu: 6.046 ± 0.497
AlaMet: 1.936 ± 0.285
AlaAsn: 3.566 ± 0.356
AlaPro: 2.242 ± 0.332
AlaGln: 3.465 ± 0.406
AlaArg: 2.955 ± 0.35
AlaSer: 4.382 ± 0.627
AlaThr: 3.634 ± 0.442
AlaVal: 4.653 ± 0.463
AlaTrp: 0.985 ± 0.161
AlaTyr: 2.412 ± 0.326
AlaXaa: 0.0 ± 0.0
Cys
CysAla: 0.611 ± 0.163
CysCys: 0.238 ± 0.099
CysAsp: 0.577 ± 0.151
CysGlu: 0.577 ± 0.136
CysPhe: 0.543 ± 0.122
CysGly: 0.917 ± 0.199
CysHis: 0.204 ± 0.083
CysIle: 0.611 ± 0.133
CysLys: 0.645 ± 0.153
CysLeu: 0.849 ± 0.148
CysMet: 0.34 ± 0.102
CysAsn: 0.509 ± 0.116
CysPro: 0.543 ± 0.177
CysGln: 0.408 ± 0.143
CysArg: 0.679 ± 0.16
CysSer: 1.087 ± 0.254
CysThr: 0.476 ± 0.143
CysVal: 0.713 ± 0.174
CysTrp: 0.102 ± 0.056
CysTyr: 0.442 ± 0.145
CysXaa: 0.0 ± 0.0
Asp
AspAla: 4.382 ± 0.393
AspCys: 0.747 ± 0.162
AspAsp: 2.717 ± 0.468
AspGlu: 4.755 ± 0.47
AspPhe: 2.785 ± 0.396
AspGly: 3.702 ± 0.336
AspHis: 1.427 ± 0.208
AspIle: 4.619 ± 0.409
AspLys: 4.585 ± 0.45
AspLeu: 5.91 ± 0.499
AspMet: 1.868 ± 0.239
AspAsn: 2.853 ± 0.369
AspPro: 2.887 ± 0.282
AspGln: 1.427 ± 0.23
AspArg: 2.989 ± 0.335
AspSer: 3.499 ± 0.441
AspThr: 3.295 ± 0.423
AspVal: 3.6 ± 0.377
AspTrp: 0.747 ± 0.188
AspTyr: 2.683 ± 0.338
AspXaa: 0.0 ± 0.0
Glu
GluAla: 5.91 ± 0.482
GluCys: 0.815 ± 0.205
GluAsp: 3.566 ± 0.343
GluGlu: 5.061 ± 0.483
GluPhe: 2.581 ± 0.316
GluGly: 3.838 ± 0.424
GluHis: 1.528 ± 0.224
GluIle: 4.993 ± 0.411
GluLys: 5.401 ± 0.516
GluLeu: 7.71 ± 0.531
GluMet: 2.208 ± 0.296
GluAsn: 2.581 ± 0.297
GluPro: 1.393 ± 0.267
GluGln: 2.751 ± 0.442
GluArg: 3.363 ± 0.376
GluSer: 3.397 ± 0.287
GluThr: 3.702 ± 0.383
GluVal: 4.721 ± 0.414
GluTrp: 1.189 ± 0.185
GluTyr: 3.566 ± 0.368
GluXaa: 0.0 ± 0.0
Phe
PheAla: 2.581 ± 0.286
PheCys: 0.272 ± 0.099
PheAsp: 2.887 ± 0.305
PheGlu: 2.683 ± 0.333
PhePhe: 1.495 ± 0.225
PheGly: 2.955 ± 0.318
PheHis: 1.257 ± 0.268
PheIle: 2.683 ± 0.326
PheLys: 2.955 ± 0.299
PheLeu: 3.261 ± 0.293
PheMet: 0.781 ± 0.135
PheAsn: 2.785 ± 0.338
PhePro: 1.461 ± 0.224
PheGln: 0.985 ± 0.154
PheArg: 1.97 ± 0.34
PheSer: 2.921 ± 0.35
PheThr: 2.106 ± 0.241
PheVal: 2.853 ± 0.308
PheTrp: 0.543 ± 0.119
PheTyr: 1.291 ± 0.19
PheXaa: 0.0 ± 0.0
Gly
GlyAla: 4.212 ± 0.445
GlyCys: 1.189 ± 0.251
GlyAsp: 3.872 ± 0.405
GlyGlu: 4.653 ± 0.404
GlyPhe: 3.227 ± 0.34
GlyGly: 3.329 ± 0.382
GlyHis: 1.291 ± 0.215
GlyIle: 4.382 ± 0.405
GlyLys: 5.095 ± 0.369
GlyLeu: 4.619 ± 0.422
GlyMet: 1.664 ± 0.252
GlyAsn: 4.008 ± 0.554
GlyPro: 1.053 ± 0.204
GlyGln: 2.106 ± 0.26
GlyArg: 3.057 ± 0.324
GlySer: 4.178 ± 0.356
GlyThr: 4.076 ± 0.471
GlyVal: 4.416 ± 0.441
GlyTrp: 0.917 ± 0.218
GlyTyr: 3.465 ± 0.34
GlyXaa: 0.0 ± 0.0
His
HisAla: 1.257 ± 0.234
HisCys: 0.238 ± 0.087
HisAsp: 1.528 ± 0.237
HisGlu: 0.849 ± 0.164
HisPhe: 0.645 ± 0.176
HisGly: 1.528 ± 0.222
HisHis: 0.509 ± 0.115
HisIle: 1.732 ± 0.308
HisLys: 1.257 ± 0.211
HisLeu: 1.97 ± 0.245
HisMet: 0.306 ± 0.142
HisAsn: 0.985 ± 0.234
HisPro: 0.781 ± 0.19
HisGln: 0.476 ± 0.149
HisArg: 1.325 ± 0.2
HisSer: 1.155 ± 0.23
HisThr: 0.883 ± 0.167
HisVal: 0.781 ± 0.153
HisTrp: 0.102 ± 0.053
HisTyr: 0.781 ± 0.152
HisXaa: 0.0 ± 0.0
Ile
IleAla: 4.755 ± 0.455
IleCys: 0.747 ± 0.199
IleAsp: 5.299 ± 0.428
IleGlu: 4.687 ± 0.415
IlePhe: 2.242 ± 0.321
IleGly: 4.246 ± 0.446
IleHis: 1.155 ± 0.185
IleIle: 3.736 ± 0.353
IleLys: 4.45 ± 0.446
IleLeu: 5.163 ± 0.386
IleMet: 1.936 ± 0.298
IleAsn: 4.144 ± 0.457
IlePro: 2.921 ± 0.377
IleGln: 2.242 ± 0.292
IleArg: 2.785 ± 0.298
IleSer: 4.28 ± 0.383
IleThr: 4.076 ± 0.312
IleVal: 3.431 ± 0.305
IleTrp: 0.543 ± 0.136
IleTyr: 2.004 ± 0.272
IleXaa: 0.0 ± 0.0
Lys
LysAla: 6.148 ± 0.557
LysCys: 0.543 ± 0.144
LysAsp: 5.129 ± 0.447
LysGlu: 5.095 ± 0.384
LysPhe: 3.465 ± 0.303
LysGly: 3.736 ± 0.374
LysHis: 1.461 ± 0.217
LysIle: 3.736 ± 0.347
LysLys: 4.484 ± 0.501
LysLeu: 7.031 ± 0.52
LysMet: 2.276 ± 0.303
LysAsn: 4.178 ± 0.445
LysPro: 2.446 ± 0.316
LysGln: 2.921 ± 0.35
LysArg: 3.465 ± 0.325
LysSer: 4.382 ± 0.436
LysThr: 4.076 ± 0.402
LysVal: 4.687 ± 0.448
LysTrp: 0.917 ± 0.175
LysTyr: 3.227 ± 0.299
LysXaa: 0.0 ± 0.0
Leu
LeuAla: 6.997 ± 0.598
LeuCys: 0.849 ± 0.172
LeuAsp: 6.454 ± 0.469
LeuGlu: 7.167 ± 0.469
LeuPhe: 2.853 ± 0.314
LeuGly: 5.333 ± 0.513
LeuHis: 1.868 ± 0.291
LeuIle: 4.993 ± 0.471
LeuLys: 5.774 ± 0.462
LeuLeu: 6.42 ± 0.513
LeuMet: 1.8 ± 0.271
LeuAsn: 5.401 ± 0.46
LeuPro: 3.566 ± 0.41
LeuGln: 2.955 ± 0.389
LeuArg: 4.314 ± 0.313
LeuSer: 4.653 ± 0.405
LeuThr: 4.551 ± 0.376
LeuVal: 5.095 ± 0.481
LeuTrp: 0.577 ± 0.14
LeuTyr: 3.023 ± 0.362
LeuXaa: 0.0 ± 0.0
Met
MetAla: 1.732 ± 0.264
MetCys: 0.17 ± 0.082
MetAsp: 1.189 ± 0.213
MetGlu: 2.242 ± 0.295
MetPhe: 0.883 ± 0.219
MetGly: 1.528 ± 0.248
MetHis: 0.476 ± 0.143
MetIle: 1.902 ± 0.315
MetLys: 2.581 ± 0.414
MetLeu: 2.038 ± 0.271
MetMet: 0.509 ± 0.153
MetAsn: 1.019 ± 0.193
MetPro: 0.883 ± 0.15
MetGln: 1.087 ± 0.215
MetArg: 1.121 ± 0.194
MetSer: 2.276 ± 0.328
MetThr: 1.562 ± 0.222
MetVal: 1.427 ± 0.241
MetTrp: 0.17 ± 0.072
MetTyr: 0.917 ± 0.148
MetXaa: 0.0 ± 0.0
Asn
AsnAla: 3.329 ± 0.404
AsnCys: 0.611 ± 0.187
AsnAsp: 2.615 ± 0.302
AsnGlu: 2.751 ± 0.307
AsnPhe: 2.276 ± 0.255
AsnGly: 4.619 ± 0.485
AsnHis: 0.679 ± 0.16
AsnIle: 3.804 ± 0.315
AsnLys: 4.314 ± 0.495
AsnLeu: 4.993 ± 0.468
AsnMet: 1.155 ± 0.208
AsnAsn: 3.702 ± 0.431
AsnPro: 2.581 ± 0.322
AsnGln: 1.495 ± 0.208
AsnArg: 3.261 ± 0.369
AsnSer: 4.246 ± 0.462
AsnThr: 3.532 ± 0.364
AsnVal: 3.77 ± 0.347
AsnTrp: 0.577 ± 0.125
AsnTyr: 1.868 ± 0.309
AsnXaa: 0.0 ± 0.0
Pro
ProAla: 2.48 ± 0.32
ProCys: 0.34 ± 0.105
ProAsp: 2.48 ± 0.361
ProGlu: 3.465 ± 0.391
ProPhe: 1.562 ± 0.293
ProGly: 2.174 ± 0.288
ProHis: 0.611 ± 0.153
ProIle: 2.038 ± 0.314
ProLys: 1.936 ± 0.293
ProLeu: 2.072 ± 0.266
ProMet: 0.408 ± 0.117
ProAsn: 1.97 ± 0.301
ProPro: 1.427 ± 0.281
ProGln: 0.951 ± 0.196
ProArg: 1.427 ± 0.247
ProSer: 2.242 ± 0.31
ProThr: 2.174 ± 0.26
ProVal: 2.615 ± 0.35
ProTrp: 0.442 ± 0.127
ProTyr: 1.732 ± 0.233
ProXaa: 0.0 ± 0.0
Gln
GlnAla: 2.887 ± 0.532
GlnCys: 0.476 ± 0.121
GlnAsp: 1.868 ± 0.289
GlnGlu: 2.785 ± 0.373
GlnPhe: 1.461 ± 0.244
GlnGly: 1.834 ± 0.191
GlnHis: 0.543 ± 0.163
GlnIle: 2.344 ± 0.263
GlnLys: 2.581 ± 0.345
GlnLeu: 3.804 ± 0.476
GlnMet: 0.849 ± 0.139
GlnAsn: 1.766 ± 0.246
GlnPro: 0.408 ± 0.118
GlnGln: 2.038 ± 0.327
GlnArg: 1.766 ± 0.214
GlnSer: 1.902 ± 0.307
GlnThr: 1.698 ± 0.261
GlnVal: 2.615 ± 0.324
GlnTrp: 0.645 ± 0.185
GlnTyr: 1.393 ± 0.194
GlnXaa: 0.0 ± 0.0
Arg
ArgAla: 2.989 ± 0.411
ArgCys: 0.476 ± 0.143
ArgAsp: 2.887 ± 0.304
ArgGlu: 3.668 ± 0.417
ArgPhe: 1.732 ± 0.254
ArgGly: 3.668 ± 0.342
ArgHis: 0.611 ± 0.17
ArgIle: 3.261 ± 0.385
ArgLys: 3.532 ± 0.387
ArgLeu: 3.906 ± 0.366
ArgMet: 1.562 ± 0.226
ArgAsn: 2.717 ± 0.275
ArgPro: 1.461 ± 0.207
ArgGln: 1.902 ± 0.281
ArgArg: 2.446 ± 0.295
ArgSer: 2.514 ± 0.243
ArgThr: 2.819 ± 0.3
ArgVal: 3.091 ± 0.298
ArgTrp: 0.747 ± 0.18
ArgTyr: 1.868 ± 0.322
ArgXaa: 0.0 ± 0.0
Ser
SerAla: 3.6 ± 0.371
SerCys: 0.747 ± 0.154
SerAsp: 3.125 ± 0.313
SerGlu: 3.295 ± 0.34
SerPhe: 3.091 ± 0.261
SerGly: 4.925 ± 0.461
SerHis: 0.883 ± 0.203
SerIle: 4.28 ± 0.547
SerLys: 4.721 ± 0.436
SerLeu: 5.842 ± 0.447
SerMet: 1.698 ± 0.292
SerAsn: 3.974 ± 0.341
SerPro: 2.072 ± 0.303
SerGln: 1.63 ± 0.214
SerArg: 3.091 ± 0.325
SerSer: 3.94 ± 0.388
SerThr: 3.702 ± 0.355
SerVal: 3.872 ± 0.359
SerTrp: 1.155 ± 0.196
SerTyr: 2.751 ± 0.288
SerXaa: 0.0 ± 0.0
Thr
ThrAla: 3.974 ± 0.463
ThrCys: 0.442 ± 0.114
ThrAsp: 3.261 ± 0.32
ThrGlu: 3.363 ± 0.332
ThrPhe: 2.276 ± 0.296
ThrGly: 4.687 ± 0.47
ThrHis: 0.951 ± 0.224
ThrIle: 3.906 ± 0.359
ThrLys: 4.11 ± 0.468
ThrLeu: 4.518 ± 0.439
ThrMet: 1.427 ± 0.294
ThrAsn: 3.634 ± 0.354
ThrPro: 2.412 ± 0.262
ThrGln: 2.344 ± 0.293
ThrArg: 2.412 ± 0.327
ThrSer: 3.6 ± 0.363
ThrThr: 3.227 ± 0.393
ThrVal: 4.314 ± 0.428
ThrTrp: 0.713 ± 0.168
ThrTyr: 2.004 ± 0.274
ThrXaa: 0.0 ± 0.0
Val
ValAla: 4.789 ± 0.43
ValCys: 0.747 ± 0.172
ValAsp: 4.45 ± 0.404
ValGlu: 4.178 ± 0.354
ValPhe: 2.31 ± 0.292
ValGly: 3.804 ± 0.47
ValHis: 0.951 ± 0.179
ValIle: 4.246 ± 0.429
ValLys: 4.993 ± 0.381
ValLeu: 4.518 ± 0.352
ValMet: 1.562 ± 0.245
ValAsn: 3.057 ± 0.456
ValPro: 2.446 ± 0.316
ValGln: 2.242 ± 0.302
ValArg: 3.057 ± 0.315
ValSer: 4.45 ± 0.483
ValThr: 4.45 ± 0.476
ValVal: 4.382 ± 0.464
ValTrp: 0.781 ± 0.161
ValTyr: 2.514 ± 0.287
ValXaa: 0.0 ± 0.0
Trp
TrpAla: 0.577 ± 0.157
TrpCys: 0.204 ± 0.086
TrpAsp: 1.121 ± 0.191
TrpGlu: 1.223 ± 0.232
TrpPhe: 0.374 ± 0.123
TrpGly: 0.577 ± 0.145
TrpHis: 0.17 ± 0.07
TrpIle: 0.679 ± 0.169
TrpLys: 1.087 ± 0.176
TrpLeu: 1.223 ± 0.206
TrpMet: 0.442 ± 0.138
TrpAsn: 0.747 ± 0.177
TrpPro: 0.272 ± 0.102
TrpGln: 0.713 ± 0.146
TrpArg: 0.577 ± 0.138
TrpSer: 0.747 ± 0.217
TrpThr: 0.713 ± 0.167
TrpVal: 0.543 ± 0.125
TrpTrp: 0.204 ± 0.097
TrpTyr: 0.34 ± 0.127
TrpXaa: 0.0 ± 0.0
Tyr
TyrAla: 2.412 ± 0.307
TyrCys: 0.645 ± 0.127
TyrAsp: 3.023 ± 0.375
TyrGlu: 2.242 ± 0.267
TyrPhe: 1.732 ± 0.251
TyrGly: 2.615 ± 0.305
TyrHis: 0.985 ± 0.151
TyrIle: 2.547 ± 0.351
TyrLys: 2.989 ± 0.316
TyrLeu: 2.751 ± 0.318
TyrMet: 0.951 ± 0.17
TyrAsn: 2.581 ± 0.313
TyrPro: 1.291 ± 0.198
TyrGln: 1.528 ± 0.254
TyrArg: 1.8 ± 0.224
TyrSer: 2.547 ± 0.351
TyrThr: 2.819 ± 0.299
TyrVal: 2.344 ± 0.285
TyrTrp: 0.476 ± 0.129
TyrTyr: 1.868 ± 0.288
TyrXaa: 0.0 ± 0.0
Xaa
XaaAla: 0.0 ± 0.0
XaaCys: 0.0 ± 0.0
XaaAsp: 0.0 ± 0.0
XaaGlu: 0.0 ± 0.0
XaaPhe: 0.0 ± 0.0
XaaGly: 0.0 ± 0.0
XaaHis: 0.0 ± 0.0
XaaIle: 0.0 ± 0.0
XaaLys: 0.0 ± 0.0
XaaLeu: 0.0 ± 0.0
XaaMet: 0.0 ± 0.0
XaaAsn: 0.0 ± 0.0
XaaPro: 0.0 ± 0.0
XaaGln: 0.0 ± 0.0
XaaArg: 0.0 ± 0.0
XaaSer: 0.0 ± 0.0
XaaThr: 0.0 ± 0.0
XaaVal: 0.0 ± 0.0
XaaTrp: 0.0 ± 0.0
XaaTyr: 0.0 ± 0.0
XaaXaa: 0.0 ± 0.0
Statistics based on 139 proteins (29442 amino acids)

Note: The error was estimated by bootstrapping (100 replicates) at the protein level.
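Protein-level bootstrapping can be sketched as follows: whole proteins are resampled with replacement, the dipeptide frequency is recomputed for each replicate, and the spread of the replicates is reported as the error. This is a minimal sketch under assumptions, not the database's actual procedure: the function names are hypothetical, and using the sample standard deviation as the reported error is a guess, since the page does not state which spread statistic is used.

```python
import random
import statistics
from collections import Counter


def dipeptide_freq_permille(proteins, dipeptide):
    """Frequency of one dipeptide (per mille) over overlapping pairs."""
    counts = Counter()
    for seq in proteins:
        counts.update(seq[i:i + 2] for i in range(len(seq) - 1))
    total = sum(counts.values())
    return 1000.0 * counts[dipeptide] / total if total else 0.0


def bootstrap_error(proteins, dipeptide, n_replicates=100, seed=0):
    """Estimate the error of a dipeptide frequency by resampling proteins.

    Each replicate draws len(proteins) whole proteins with replacement,
    so residues from one protein always stay together.
    """
    rng = random.Random(seed)  # fixed seed only to make the sketch repeatable
    estimates = []
    for _ in range(n_replicates):
        sample = rng.choices(proteins, k=len(proteins))
        estimates.append(dipeptide_freq_permille(sample, dipeptide))
    # Assumption: the error is the standard deviation across replicates.
    return statistics.stdev(estimates)


if __name__ == "__main__":
    toy_proteome = ["AAAA", "CCCC", "ACAC", "AACC"]  # toy data, not SHSML-45
    mean = dipeptide_freq_permille(toy_proteome, "AA")
    err = bootstrap_error(toy_proteome, "AA")
    print(f"AA: {mean:.3f} ± {err:.3f}")
```

Resampling proteins rather than individual residues keeps each sequence intact, so the estimate reflects variation between proteins instead of treating every dipeptide as an independent draw.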

The dipeptide statistics above (along with other statistics for this proteome) can be downloaded from this CSV file
See this proteome in: uniprot_link
Proteome-pI is available under Creative Commons Attribution-NoDerivs license, for more details see here

Reference: Kozlowski LP. Proteome-pI 2.0: Proteome Isoelectric Point Database Update. Nucleic Acids Res. 2021, doi: 10.1093/nar/gkab944
Contact: Lukasz P. Kozlowski