Amino acid dipeptide frequency for Caulobacter phage Sansa

Apart from single amino acid frequencies, one can also calculate the so-called amino acid dipeptide frequency.

There are 400 possible dipeptides (441 if Xaa is counted as an additional 21st amino acid representing all non-standard or unknown amino acids). Thus, if dipeptides occurred randomly in proteins, with every amino acid equally likely, each dipeptide would be present at a frequency of 0.25% (2.5‰). This is not the case in nature, so for better visibility all dipeptides that are more frequent than expected are marked in red, and those that are underrepresented are marked in blue in the table.
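The calculation described above can be sketched in a few lines of Python. This is a minimal illustration, not the Proteome-pI implementation: it assumes one-letter amino acid codes, overlapping dipeptides counted within each protein only (never across protein boundaries), and every non-standard residue mapped to X (Xaa). The function names and the toy sequences are hypothetical.

```python
from collections import Counter

STANDARD = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard amino acids, one-letter codes


def dipeptide_counts(proteins):
    """Count overlapping dipeptides inside each protein sequence."""
    counts = Counter()
    for seq in proteins:
        # Assumption: anything outside the 20 standard residues becomes X (Xaa).
        residues = [aa if aa in STANDARD else "X" for aa in seq.upper()]
        counts.update(a + b for a, b in zip(residues, residues[1:]))
    return counts


def dipeptide_per_mille(proteins):
    """Return the frequency of each observed dipeptide in per mille."""
    counts = dipeptide_counts(proteins)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {pair: 1000.0 * n / total for pair, n in counts.items()}


# Expected frequency if all 400 standard dipeptides were equally likely.
UNIFORM_EXPECTATION = 1000.0 / 400  # 2.5 per mille, i.e. 0.25%

# Toy input (hypothetical sequences, not taken from the phage proteome).
toy = ["MAAG", "AAK"]
freqs = dipeptide_per_mille(toy)
overrepresented = sorted(p for p, f in freqs.items() if f > UNIFORM_EXPECTATION)
```

With real data, `proteins` would hold the sequences of the whole proteome, and each value in `freqs` could be compared against `UNIFORM_EXPECTATION` to decide whether a dipeptide is over- or underrepresented.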

All values are given in per mille (‰), so they need to be multiplied by 10⁻³ to obtain fractions.
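As a small worked example of the unit conversion, the snippet below converts the AlaAla value from the table into a fraction and a percentage, and compares it with the 2.5‰ uniform expectation. The helper name is hypothetical.

```python
def per_mille_to_fraction(value):
    """Convert a per mille value to a plain fraction (multiply by 10^-3)."""
    return value * 1e-3


ala_ala = 19.439  # AlaAla frequency from the table, in per mille

fraction = per_mille_to_fraction(ala_ala)  # about 0.0194
percent = fraction * 100                   # about 1.94 %
fold_over_uniform = ala_ala / 2.5          # about 7.8 times the uniform expectation
```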

For more information, see the sequence space article on Wikipedia.

Column order (second residue): Ala Cys Asp Glu Phe Gly His Ile Lys Leu Met Asn Pro Gln Arg Ser Thr Val Trp Tyr Xaa
Ala
AlaAla: 19.439 ± 2.219
AlaCys: 1.006 ± 0.198
AlaAsp: 7.164 ± 0.513
AlaGlu: 8.331 ± 0.953
AlaPhe: 3.099 ± 0.346
AlaGly: 9.619 ± 0.65
AlaHis: 2.093 ± 0.297
AlaIle: 5.152 ± 0.632
AlaLys: 5.634 ± 0.538
AlaLeu: 11.349 ± 0.804
AlaMet: 3.461 ± 0.395
AlaAsn: 3.059 ± 0.403
AlaPro: 5.956 ± 0.599
AlaGln: 5.152 ± 0.784
AlaArg: 8.975 ± 0.614
AlaSer: 7.043 ± 0.89
AlaThr: 5.715 ± 0.743
AlaVal: 8.291 ± 0.521
AlaTrp: 2.415 ± 0.366
AlaTyr: 2.737 ± 0.327
AlaXaa: 0.0 ± 0.0
Cys
CysAla: 0.604 ± 0.186
CysCys: 0.04 ± 0.044
CysAsp: 0.805 ± 0.2
CysGlu: 0.483 ± 0.159
CysPhe: 0.443 ± 0.125
CysGly: 0.966 ± 0.206
CysHis: 0.161 ± 0.084
CysIle: 0.563 ± 0.145
CysLys: 0.282 ± 0.095
CysLeu: 0.563 ± 0.156
CysMet: 0.282 ± 0.099
CysAsn: 0.121 ± 0.076
CysPro: 0.684 ± 0.164
CysGln: 0.282 ± 0.089
CysArg: 0.845 ± 0.203
CysSer: 0.362 ± 0.133
CysThr: 0.443 ± 0.144
CysVal: 0.604 ± 0.151
CysTrp: 0.08 ± 0.065
CysTyr: 0.322 ± 0.114
CysXaa: 0.0 ± 0.0
Asp
AspAla: 7.325 ± 0.573
AspCys: 0.885 ± 0.205
AspAsp: 3.703 ± 0.557
AspGlu: 4.266 ± 0.522
AspPhe: 2.093 ± 0.223
AspGly: 6.037 ± 0.626
AspHis: 0.966 ± 0.217
AspIle: 2.898 ± 0.299
AspLys: 2.616 ± 0.355
AspLeu: 7.244 ± 0.621
AspMet: 1.851 ± 0.304
AspAsn: 1.409 ± 0.259
AspPro: 4.145 ± 0.486
AspGln: 2.817 ± 0.418
AspArg: 4.467 ± 0.352
AspSer: 2.173 ± 0.255
AspThr: 2.536 ± 0.328
AspVal: 4.105 ± 0.324
AspTrp: 1.771 ± 0.346
AspTyr: 1.65 ± 0.258
AspXaa: 0.0 ± 0.0
Glu
GluAla: 9.176 ± 0.787
GluCys: 0.241 ± 0.106
GluAsp: 4.065 ± 0.504
GluGlu: 5.152 ± 0.499
GluPhe: 2.173 ± 0.268
GluGly: 5.031 ± 0.542
GluHis: 1.207 ± 0.249
GluIle: 3.783 ± 0.377
GluLys: 2.576 ± 0.325
GluLeu: 5.514 ± 0.519
GluMet: 1.851 ± 0.288
GluAsn: 2.415 ± 0.293
GluPro: 2.495 ± 0.369
GluGln: 2.495 ± 0.337
GluArg: 4.387 ± 0.41
GluSer: 2.455 ± 0.315
GluThr: 3.099 ± 0.332
GluVal: 3.703 ± 0.403
GluTrp: 1.006 ± 0.208
GluTyr: 1.127 ± 0.158
GluXaa: 0.0 ± 0.0
Phe
PheAla: 3.34 ± 0.382
PheCys: 0.402 ± 0.152
PheAsp: 2.697 ± 0.383
PheGlu: 2.334 ± 0.28
PhePhe: 0.885 ± 0.181
PheGly: 2.817 ± 0.365
PheHis: 0.322 ± 0.105
PheIle: 1.087 ± 0.181
PheLys: 1.167 ± 0.263
PheLeu: 2.415 ± 0.33
PheMet: 0.604 ± 0.165
PheAsn: 1.489 ± 0.242
PhePro: 1.489 ± 0.244
PheGln: 0.765 ± 0.201
PheArg: 1.811 ± 0.247
PheSer: 2.173 ± 0.276
PheThr: 1.892 ± 0.289
PheVal: 1.892 ± 0.269
PheTrp: 0.644 ± 0.144
PheTyr: 0.845 ± 0.188
PheXaa: 0.0 ± 0.0
Gly
GlyAla: 8.411 ± 0.667
GlyCys: 0.845 ± 0.201
GlyAsp: 5.272 ± 0.426
GlyGlu: 5.111 ± 0.493
GlyPhe: 2.978 ± 0.296
GlyGly: 8.371 ± 0.704
GlyHis: 1.248 ± 0.252
GlyIle: 3.22 ± 0.414
GlyLys: 4.266 ± 0.488
GlyLeu: 6.319 ± 0.528
GlyMet: 2.214 ± 0.315
GlyAsn: 2.495 ± 0.389
GlyPro: 3.622 ± 0.48
GlyGln: 3.179 ± 0.299
GlyArg: 5.353 ± 0.476
GlySer: 3.703 ± 0.407
GlyThr: 3.501 ± 0.451
GlyVal: 5.715 ± 0.522
GlyTrp: 1.529 ± 0.239
GlyTyr: 2.214 ± 0.303
GlyXaa: 0.0 ± 0.0
His
HisAla: 1.892 ± 0.239
HisCys: 0.08 ± 0.049
HisAsp: 0.966 ± 0.208
HisGlu: 1.328 ± 0.272
HisPhe: 0.765 ± 0.184
HisGly: 1.328 ± 0.301
HisHis: 0.483 ± 0.157
HisIle: 0.926 ± 0.198
HisLys: 0.805 ± 0.176
HisLeu: 1.65 ± 0.272
HisMet: 0.282 ± 0.106
HisAsn: 0.483 ± 0.141
HisPro: 1.328 ± 0.237
HisGln: 0.523 ± 0.144
HisArg: 1.248 ± 0.237
HisSer: 0.523 ± 0.165
HisThr: 0.805 ± 0.213
HisVal: 1.328 ± 0.242
HisTrp: 0.362 ± 0.112
HisTyr: 0.724 ± 0.204
HisXaa: 0.0 ± 0.0
Ile
IleAla: 5.433 ± 0.495
IleCys: 0.724 ± 0.194
IleAsp: 3.944 ± 0.404
IleGlu: 2.294 ± 0.316
IlePhe: 1.57 ± 0.216
IleGly: 3.944 ± 0.36
IleHis: 0.926 ± 0.203
IleIle: 2.254 ± 0.336
IleLys: 2.093 ± 0.336
IleLeu: 3.501 ± 0.335
IleMet: 1.087 ± 0.275
IleAsn: 2.053 ± 0.41
IlePro: 2.214 ± 0.353
IleGln: 1.046 ± 0.22
IleArg: 3.542 ± 0.294
IleSer: 2.294 ± 0.328
IleThr: 2.817 ± 0.336
IleVal: 3.22 ± 0.358
IleTrp: 0.724 ± 0.218
IleTyr: 0.805 ± 0.225
IleXaa: 0.0 ± 0.0
Lys
LysAla: 6.882 ± 0.657
LysCys: 0.282 ± 0.108
LysAsp: 2.697 ± 0.313
LysGlu: 2.737 ± 0.414
LysPhe: 0.805 ± 0.203
LysGly: 3.139 ± 0.271
LysHis: 0.684 ± 0.164
LysIle: 2.093 ± 0.291
LysLys: 1.771 ± 0.33
LysLeu: 2.857 ± 0.459
LysMet: 0.926 ± 0.22
LysAsn: 1.368 ± 0.24
LysPro: 3.018 ± 0.391
LysGln: 1.368 ± 0.223
LysArg: 2.938 ± 0.298
LysSer: 2.817 ± 0.357
LysThr: 3.34 ± 0.331
LysVal: 3.22 ± 0.33
LysTrp: 0.724 ± 0.163
LysTyr: 0.523 ± 0.142
LysXaa: 0.0 ± 0.0
Leu
LeuAla: 10.504 ± 0.703
LeuCys: 0.724 ± 0.168
LeuAsp: 5.353 ± 0.503
LeuGlu: 5.473 ± 0.505
LeuPhe: 2.375 ± 0.297
LeuGly: 5.433 ± 0.488
LeuHis: 1.529 ± 0.279
LeuIle: 3.622 ± 0.385
LeuLys: 4.186 ± 0.441
LeuLeu: 6.6 ± 0.438
LeuMet: 1.529 ± 0.277
LeuAsn: 3.099 ± 0.414
LeuPro: 3.622 ± 0.444
LeuGln: 3.582 ± 0.474
LeuArg: 6.238 ± 0.39
LeuSer: 4.95 ± 0.493
LeuThr: 5.313 ± 0.43
LeuVal: 6.359 ± 0.465
LeuTrp: 1.731 ± 0.264
LeuTyr: 1.731 ± 0.228
LeuXaa: 0.0 ± 0.0
Met
MetAla: 3.421 ± 0.438
MetCys: 0.241 ± 0.092
MetAsp: 1.489 ± 0.247
MetGlu: 1.288 ± 0.2
MetPhe: 0.644 ± 0.166
MetGly: 1.529 ± 0.234
MetHis: 0.402 ± 0.131
MetIle: 1.087 ± 0.195
MetLys: 1.368 ± 0.208
MetLeu: 2.294 ± 0.358
MetMet: 0.523 ± 0.155
MetAsn: 0.926 ± 0.216
MetPro: 1.328 ± 0.275
MetGln: 0.724 ± 0.169
MetArg: 2.053 ± 0.262
MetSer: 1.892 ± 0.286
MetThr: 1.771 ± 0.286
MetVal: 1.972 ± 0.288
MetTrp: 0.161 ± 0.073
MetTyr: 0.241 ± 0.089
MetXaa: 0.0 ± 0.0
Asn
AsnAla: 3.3 ± 0.583
AsnCys: 0.241 ± 0.088
AsnAsp: 1.731 ± 0.214
AsnGlu: 1.288 ± 0.192
AsnPhe: 0.966 ± 0.261
AsnGly: 3.3 ± 0.322
AsnHis: 0.604 ± 0.158
AsnIle: 1.409 ± 0.248
AsnLys: 1.006 ± 0.19
AsnLeu: 2.898 ± 0.315
AsnMet: 0.805 ± 0.195
AsnAsn: 0.765 ± 0.188
AsnPro: 2.294 ± 0.292
AsnGln: 1.167 ± 0.219
AsnArg: 1.932 ± 0.328
AsnSer: 1.489 ± 0.322
AsnThr: 1.69 ± 0.341
AsnVal: 2.334 ± 0.283
AsnTrp: 0.402 ± 0.127
AsnTyr: 0.805 ± 0.181
AsnXaa: 0.0 ± 0.0
Pro
ProAla: 6.761 ± 0.693
ProCys: 0.563 ± 0.15
ProAsp: 4.306 ± 0.594
ProGlu: 3.944 ± 0.506
ProPhe: 2.012 ± 0.288
ProGly: 4.266 ± 0.646
ProHis: 1.046 ± 0.243
ProIle: 2.375 ± 0.295
ProLys: 2.777 ± 0.407
ProLeu: 2.978 ± 0.374
ProMet: 1.167 ± 0.244
ProAsn: 1.65 ± 0.277
ProPro: 4.266 ± 0.586
ProGln: 1.328 ± 0.265
ProArg: 3.139 ± 0.533
ProSer: 2.938 ± 0.295
ProThr: 3.059 ± 0.522
ProVal: 4.467 ± 0.543
ProTrp: 0.765 ± 0.166
ProTyr: 1.167 ± 0.257
ProXaa: 0.0 ± 0.0
Gln
GlnAla: 5.594 ± 0.824
GlnCys: 0.121 ± 0.058
GlnAsp: 2.133 ± 0.286
GlnGlu: 2.133 ± 0.399
GlnPhe: 1.006 ± 0.188
GlnGly: 2.294 ± 0.273
GlnHis: 0.926 ± 0.201
GlnIle: 1.771 ± 0.334
GlnLys: 1.771 ± 0.371
GlnLeu: 3.542 ± 0.553
GlnMet: 0.805 ± 0.201
GlnAsn: 0.926 ± 0.2
GlnPro: 1.731 ± 0.292
GlnGln: 1.851 ± 0.422
GlnArg: 2.817 ± 0.284
GlnSer: 1.368 ± 0.23
GlnThr: 2.053 ± 0.28
GlnVal: 2.737 ± 0.294
GlnTrp: 0.483 ± 0.148
GlnTyr: 0.845 ± 0.185
GlnXaa: 0.0 ± 0.0
Arg
ArgAla: 8.975 ± 0.573
ArgCys: 0.644 ± 0.187
ArgAsp: 4.226 ± 0.388
ArgGlu: 4.83 ± 0.478
ArgPhe: 2.334 ± 0.327
ArgGly: 3.944 ± 0.396
ArgHis: 1.087 ± 0.29
ArgIle: 3.864 ± 0.358
ArgLys: 2.857 ± 0.424
ArgLeu: 5.997 ± 0.462
ArgMet: 1.932 ± 0.26
ArgAsn: 1.409 ± 0.234
ArgPro: 4.306 ± 0.542
ArgGln: 3.099 ± 0.411
ArgArg: 5.473 ± 0.589
ArgSer: 3.904 ± 0.458
ArgThr: 3.34 ± 0.352
ArgVal: 4.91 ± 0.41
ArgTrp: 1.409 ± 0.282
ArgTyr: 1.368 ± 0.263
ArgXaa: 0.0 ± 0.0
Ser
SerAla: 6.037 ± 0.75
SerCys: 0.483 ± 0.159
SerAsp: 3.179 ± 0.278
SerGlu: 2.616 ± 0.297
SerPhe: 2.173 ± 0.253
SerGly: 5.192 ± 0.441
SerHis: 0.885 ± 0.163
SerIle: 2.536 ± 0.317
SerLys: 2.093 ± 0.28
SerLeu: 4.669 ± 0.46
SerMet: 1.529 ± 0.302
SerAsn: 1.892 ± 0.334
SerPro: 2.817 ± 0.299
SerGln: 1.932 ± 0.327
SerArg: 3.179 ± 0.325
SerSer: 3.099 ± 0.377
SerThr: 2.817 ± 0.329
SerVal: 3.823 ± 0.318
SerTrp: 0.926 ± 0.221
SerTyr: 1.328 ± 0.237
SerXaa: 0.0 ± 0.0
Thr
ThrAla: 5.956 ± 0.649
ThrCys: 0.322 ± 0.104
ThrAsp: 3.381 ± 0.405
ThrGlu: 3.381 ± 0.383
ThrPhe: 1.69 ± 0.179
ThrGly: 4.709 ± 0.473
ThrHis: 1.127 ± 0.223
ThrIle: 3.018 ± 0.308
ThrLys: 2.817 ± 0.364
ThrLeu: 3.904 ± 0.329
ThrMet: 1.288 ± 0.209
ThrAsn: 1.248 ± 0.244
ThrPro: 3.864 ± 0.466
ThrGln: 1.851 ± 0.339
ThrArg: 3.542 ± 0.339
ThrSer: 3.542 ± 0.386
ThrThr: 3.139 ± 0.369
ThrVal: 3.461 ± 0.411
ThrTrp: 0.483 ± 0.123
ThrTyr: 1.046 ± 0.186
ThrXaa: 0.0 ± 0.0
Val
ValAla: 8.613 ± 0.549
ValCys: 0.644 ± 0.156
ValAsp: 4.588 ± 0.399
ValGlu: 4.87 ± 0.527
ValPhe: 1.892 ± 0.303
ValGly: 4.467 ± 0.459
ValHis: 1.368 ± 0.211
ValIle: 3.179 ± 0.418
ValLys: 2.737 ± 0.362
ValLeu: 5.836 ± 0.588
ValMet: 1.932 ± 0.23
ValAsn: 2.495 ± 0.273
ValPro: 3.381 ± 0.355
ValGln: 2.576 ± 0.3
ValArg: 4.548 ± 0.376
ValSer: 3.582 ± 0.429
ValThr: 4.427 ± 0.481
ValVal: 4.588 ± 0.461
ValTrp: 1.288 ± 0.264
ValTyr: 1.892 ± 0.265
ValXaa: 0.0 ± 0.0
Trp
TrpAla: 2.093 ± 0.334
TrpCys: 0.322 ± 0.109
TrpAsp: 1.006 ± 0.229
TrpGlu: 0.885 ± 0.188
TrpPhe: 0.604 ± 0.143
TrpGly: 1.248 ± 0.278
TrpHis: 0.362 ± 0.144
TrpIle: 0.885 ± 0.21
TrpLys: 0.604 ± 0.151
TrpLeu: 1.248 ± 0.212
TrpMet: 0.724 ± 0.183
TrpAsn: 0.644 ± 0.174
TrpPro: 1.127 ± 0.26
TrpGln: 0.402 ± 0.127
TrpArg: 1.61 ± 0.25
TrpSer: 1.328 ± 0.275
TrpThr: 0.926 ± 0.216
TrpVal: 0.885 ± 0.181
TrpTrp: 0.523 ± 0.146
TrpTyr: 0.241 ± 0.098
TrpXaa: 0.0 ± 0.0
Tyr
TyrAla: 1.972 ± 0.261
TyrCys: 0.121 ± 0.067
TyrAsp: 2.254 ± 0.325
TyrGlu: 1.207 ± 0.233
TyrPhe: 0.563 ± 0.19
TyrGly: 1.932 ± 0.238
TyrHis: 0.402 ± 0.107
TyrIle: 0.765 ± 0.162
TyrLys: 0.765 ± 0.144
TyrLeu: 2.576 ± 0.342
TyrMet: 0.523 ± 0.126
TyrAsn: 0.483 ± 0.133
TyrPro: 1.288 ± 0.265
TyrGln: 0.765 ± 0.162
TyrArg: 1.811 ± 0.273
TyrSer: 1.57 ± 0.265
TyrThr: 1.087 ± 0.207
TyrVal: 1.288 ± 0.265
TyrTrp: 0.241 ± 0.11
TyrTyr: 0.644 ± 0.165
TyrXaa: 0.0 ± 0.0
Xaa
XaaAla: 0.0 ± 0.0
XaaCys: 0.0 ± 0.0
XaaAsp: 0.0 ± 0.0
XaaGlu: 0.0 ± 0.0
XaaPhe: 0.0 ± 0.0
XaaGly: 0.0 ± 0.0
XaaHis: 0.0 ± 0.0
XaaIle: 0.0 ± 0.0
XaaLys: 0.0 ± 0.0
XaaLeu: 0.0 ± 0.0
XaaMet: 0.0 ± 0.0
XaaAsn: 0.0 ± 0.0
XaaPro: 0.0 ± 0.0
XaaGln: 0.0 ± 0.0
XaaArg: 0.0 ± 0.0
XaaSer: 0.0 ± 0.0
XaaThr: 0.0 ± 0.0
XaaVal: 0.0 ± 0.0
XaaTrp: 0.0 ± 0.0
XaaTyr: 0.0 ± 0.0
XaaXaa: 0.0 ± 0.0
Statistics based on 115 proteins (24848 amino acids)

Note: The error was estimated by bootstrapping (×100) at the protein level.
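One way to read this note is sketched below: whole proteins are resampled with replacement 100 times, the dipeptide frequency is recomputed for each resample, and the spread of those estimates is reported as the error. This is an assumption-laden illustration, not the Proteome-pI code. In particular, using the standard deviation as the error measure, the fixed seed, the function names and the toy sequences are all hypothetical choices.

```python
import random
import statistics
from collections import Counter


def pair_counts(seq):
    """Count overlapping dipeptides in a single protein sequence."""
    return Counter(a + b for a, b in zip(seq, seq[1:]))


def bootstrap_error(proteins, pair, n_boot=100, seed=0):
    """Estimate the error (in per mille) of one dipeptide frequency.

    Proteins, not individual residues, are the resampling unit.
    """
    rng = random.Random(seed)
    per_protein = [pair_counts(s) for s in proteins]
    estimates = []
    for _ in range(n_boot):
        # Draw as many proteins as in the original set, with replacement.
        sample = rng.choices(per_protein, k=len(per_protein))
        total = sum(sum(c.values()) for c in sample)
        hits = sum(c[pair] for c in sample)
        estimates.append(1000.0 * hits / total if total else 0.0)
    # Assumption: the reported error is the standard deviation of the replicates.
    return statistics.stdev(estimates)


# Toy input (hypothetical sequences, not taken from the phage proteome).
toy = ["MAAG", "AAK", "MKKA", "GAAG"]
err_aa = bootstrap_error(toy, "AA")
err_absent = bootstrap_error(toy, "WW")  # a pair that never occurs
```

A dipeptide that never occurs gives a zero estimate in every replicate, hence a zero error, which is consistent with the 0.0 ± 0.0 entries for all Xaa pairs in the table.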

The dipeptide statistics above (along with other statistics for this proteome) can be downloaded from this CSV file
See this proteome in: uniprot_link
Proteome-pI is available under a Creative Commons Attribution-NoDerivs license; for more details see here

Reference: Kozlowski LP. Proteome-pI 2.0: Proteome Isoelectric Point Database Update. Nucleic Acids Res. 2021, doi: 10.1093/nar/gkab944
Contact: Lukasz P. Kozlowski