Amino acid dipeptide frequency for Sinorhizobium phage PBC5

Apart from single amino acid frequencies, one can also calculate the so-called amino acid dipeptide frequency, that is, how often each ordered pair of adjacent residues occurs.
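As a rough illustration of what such a calculation involves, the following Python sketch counts overlapping adjacent residue pairs inside each protein and reports them in per mille. The function name `dipeptide_frequencies`, the one-letter input format, the mapping of any non-standard letter to `X` (standing in for Xaa), and the normalization by the total number of pairs are all assumptions made for this example; the page does not describe the database's actual pipeline.

```python
from collections import Counter

STANDARD = "ACDEFGHIKLMNPQRSTVWY"  # 20 standard amino acids, one-letter codes


def dipeptide_frequencies(proteins):
    """Return {dipeptide: frequency in per mille} for a list of sequences.

    Pairs are counted with a sliding window of width 2 inside each protein,
    so no pair spans two different proteins. Any letter outside the 20
    standard residues is mapped to 'X' (assumed stand-in for Xaa).
    """
    counts = Counter()
    for seq in proteins:
        cleaned = "".join(c if c in STANDARD else "X" for c in seq.upper())
        counts.update(cleaned[i:i + 2] for i in range(len(cleaned) - 1))
    total = sum(counts.values())
    if total == 0:
        return {}
    return {pair: 1000.0 * n / total for pair, n in counts.items()}


# Toy input for illustration only; these are not PBC5 proteins.
toy = ["MAAGL", "MAAK"]
for pair, value in sorted(dipeptide_frequencies(toy).items()):
    print(f"{pair}: {value:.1f}")
```

Counting within each protein separately avoids creating artificial pairs from the last residue of one protein and the first residue of the next.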

There are 400 possible dipeptides (441 if we consider Xaa as an additional 21st amino acid representing all non-standard or unknown amino acids). Thus, if dipeptides occurred randomly in proteins, with all amino acids equally frequent, each of the 400 dipeptides would be present with a frequency of 0.25% (2.5‰). As this is not the case in nature, for better visibility all dipeptides that occur more often than expected are marked in red, and those that are underrepresented are marked in blue in the table.
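The uniform baseline can be checked with a few lines of arithmetic. The sketch below assumes the 400-dipeptide baseline of 1/400 (2.5‰) as the expected value and uses three values copied from the table. The helper name `classify` and the simple above/below rule are illustrative assumptions, since the page does not state exactly how the colouring threshold is computed.

```python
EXPECTED_PER_MILLE = 1000.0 / 400  # uniform baseline for 20 x 20 dipeptides


def classify(freq_per_mille, expected=EXPECTED_PER_MILLE):
    """Label a dipeptide frequency relative to the uniform expectation."""
    if freq_per_mille > expected:
        return "over"
    if freq_per_mille < expected:
        return "under"
    return "as expected"


# Values copied from the table (per mille).
observed = {"AlaAla": 16.485, "ArgArg": 11.613, "CysLys": 0.117}

for name, value in observed.items():
    ratio = value / EXPECTED_PER_MILLE
    print(f"{name}: {value} per mille, {ratio:.2f}x expected, {classify(value)}")
```

If Xaa were counted as a 21st residue, the baseline would instead be 1000/441, roughly 2.27‰.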

All values are given in per mille (‰), so they must be multiplied by 10^-3 to obtain fractions.

For more information, see the sequence space article on Wikipedia.

Dipeptide frequencies (‰), given as value ± error. Entries are grouped by the first residue; within each group the second residue runs in the order: Ala Cys Asp Glu Phe Gly His Ile Lys Leu Met Asn Pro Gln Arg Ser Thr Val Trp Tyr Xaa
Ala
AlaAla: 16.485 ± 1.377
AlaCys: 1.598 ± 0.345
AlaAsp: 7.872 ± 0.571
AlaGlu: 7.054 ± 0.75
AlaPhe: 3.274 ± 0.481
AlaGly: 11.691 ± 1.032
AlaHis: 2.767 ± 0.319
AlaIle: 4.754 ± 0.463
AlaLys: 4.677 ± 0.577
AlaLeu: 9.899 ± 0.674
AlaMet: 2.923 ± 0.338
AlaAsn: 2.962 ± 0.415
AlaPro: 4.754 ± 0.498
AlaGln: 4.521 ± 0.449
AlaArg: 9.158 ± 0.61
AlaSer: 7.015 ± 0.713
AlaThr: 5.456 ± 0.495
AlaVal: 8.418 ± 0.85
AlaTrp: 1.208 ± 0.253
AlaTyr: 1.247 ± 0.206
AlaXaa: 0.0 ± 0.0
Cys
CysAla: 1.442 ± 0.345
CysCys: 0.818 ± 0.258
CysAsp: 0.74 ± 0.239
CysGlu: 0.74 ± 0.211
CysPhe: 0.624 ± 0.173
CysGly: 1.364 ± 0.327
CysHis: 0.39 ± 0.161
CysIle: 0.935 ± 0.24
CysLys: 0.117 ± 0.068
CysLeu: 1.052 ± 0.286
CysMet: 0.273 ± 0.118
CysAsn: 0.273 ± 0.094
CysPro: 0.507 ± 0.178
CysGln: 0.273 ± 0.105
CysArg: 2.26 ± 0.517
CysSer: 1.169 ± 0.297
CysThr: 0.39 ± 0.14
CysVal: 0.39 ± 0.14
CysTrp: 0.273 ± 0.106
CysTyr: 0.156 ± 0.099
CysXaa: 0.0 ± 0.0
Asp
AspAla: 7.443 ± 0.705
AspCys: 0.429 ± 0.132
AspAsp: 4.131 ± 0.549
AspGlu: 4.716 ± 0.596
AspPhe: 1.988 ± 0.404
AspGly: 5.885 ± 0.461
AspHis: 1.481 ± 0.262
AspIle: 3.04 ± 0.454
AspLys: 2.182 ± 0.38
AspLeu: 4.871 ± 0.604
AspMet: 0.74 ± 0.148
AspAsn: 1.637 ± 0.283
AspPro: 3.274 ± 0.459
AspGln: 1.832 ± 0.244
AspArg: 6.352 ± 0.552
AspSer: 2.104 ± 0.386
AspThr: 2.728 ± 0.317
AspVal: 3.663 ± 0.45
AspTrp: 0.974 ± 0.189
AspTyr: 1.091 ± 0.161
AspXaa: 0.0 ± 0.0
Glu
GluAla: 6.976 ± 0.712
GluCys: 0.779 ± 0.191
GluAsp: 2.806 ± 0.361
GluGlu: 3.313 ± 0.443
GluPhe: 1.325 ± 0.183
GluGly: 4.326 ± 0.556
GluHis: 1.832 ± 0.361
GluIle: 3.274 ± 0.346
GluLys: 2.533 ± 0.496
GluLeu: 6.041 ± 0.749
GluMet: 1.403 ± 0.235
GluAsn: 2.027 ± 0.282
GluPro: 3.468 ± 0.617
GluGln: 2.884 ± 0.434
GluArg: 5.261 ± 0.508
GluSer: 2.104 ± 0.266
GluThr: 3.079 ± 0.386
GluVal: 3.352 ± 0.462
GluTrp: 0.896 ± 0.183
GluTyr: 1.091 ± 0.25
GluXaa: 0.0 ± 0.0
Phe
PheAla: 3.04 ± 0.395
PheCys: 0.468 ± 0.14
PheAsp: 2.221 ± 0.237
PheGlu: 1.559 ± 0.235
PhePhe: 0.857 ± 0.203
PheGly: 3.079 ± 0.329
PheHis: 0.896 ± 0.203
PheIle: 1.208 ± 0.328
PheLys: 1.208 ± 0.213
PheLeu: 2.728 ± 0.378
PheMet: 0.585 ± 0.162
PheAsn: 0.935 ± 0.227
PhePro: 1.286 ± 0.197
PheGln: 1.091 ± 0.224
PheArg: 2.572 ± 0.402
PheSer: 2.377 ± 0.362
PheThr: 1.13 ± 0.28
PheVal: 2.027 ± 0.285
PheTrp: 0.39 ± 0.13
PheTyr: 0.663 ± 0.224
PheXaa: 0.0 ± 0.0
Gly
GlyAla: 9.86 ± 0.594
GlyCys: 0.935 ± 0.25
GlyAsp: 5.807 ± 0.457
GlyGlu: 4.443 ± 0.419
GlyPhe: 3.001 ± 0.391
GlyGly: 6.976 ± 0.887
GlyHis: 1.949 ± 0.227
GlyIle: 3.118 ± 0.346
GlyLys: 4.014 ± 0.48
GlyLeu: 8.301 ± 0.864
GlyMet: 2.065 ± 0.317
GlyAsn: 3.274 ± 0.46
GlyPro: 3.04 ± 0.375
GlyGln: 2.494 ± 0.243
GlyArg: 6.937 ± 0.692
GlySer: 5.534 ± 0.675
GlyThr: 4.599 ± 0.419
GlyVal: 6.859 ± 0.642
GlyTrp: 0.896 ± 0.199
GlyTyr: 1.403 ± 0.246
GlyXaa: 0.0 ± 0.0
His
HisAla: 3.313 ± 0.547
HisCys: 0.585 ± 0.166
HisAsp: 1.325 ± 0.315
HisGlu: 1.442 ± 0.292
HisPhe: 0.624 ± 0.174
HisGly: 2.104 ± 0.303
HisHis: 1.013 ± 0.225
HisIle: 1.169 ± 0.181
HisLys: 0.507 ± 0.191
HisLeu: 3.079 ± 0.488
HisMet: 0.156 ± 0.079
HisAsn: 0.585 ± 0.15
HisPro: 0.74 ± 0.15
HisGln: 1.442 ± 0.265
HisArg: 2.143 ± 0.324
HisSer: 1.208 ± 0.207
HisThr: 0.663 ± 0.158
HisVal: 2.143 ± 0.407
HisTrp: 0.39 ± 0.131
HisTyr: 0.701 ± 0.144
HisXaa: 0.0 ± 0.0
Ile
IleAla: 5.768 ± 0.512
IleCys: 0.624 ± 0.211
IleAsp: 3.78 ± 0.345
IleGlu: 3.274 ± 0.456
IlePhe: 1.247 ± 0.21
IleGly: 4.209 ± 0.524
IleHis: 0.818 ± 0.162
IleIle: 2.26 ± 0.341
IleLys: 1.091 ± 0.182
IleLeu: 2.923 ± 0.307
IleMet: 0.857 ± 0.155
IleAsn: 1.247 ± 0.28
IlePro: 1.559 ± 0.238
IleGln: 1.832 ± 0.273
IleArg: 3.78 ± 0.442
IleSer: 2.611 ± 0.37
IleThr: 2.494 ± 0.203
IleVal: 3.429 ± 0.367
IleTrp: 0.779 ± 0.171
IleTyr: 0.701 ± 0.214
IleXaa: 0.0 ± 0.0
Lys
LysAla: 5.456 ± 0.793
LysCys: 0.156 ± 0.073
LysAsp: 1.91 ± 0.348
LysGlu: 2.455 ± 0.453
LysPhe: 0.896 ± 0.186
LysGly: 2.494 ± 0.354
LysHis: 0.935 ± 0.24
LysIle: 1.325 ± 0.3
LysLys: 1.832 ± 0.276
LysLeu: 2.455 ± 0.308
LysMet: 0.663 ± 0.167
LysAsn: 1.013 ± 0.212
LysPro: 2.416 ± 0.413
LysGln: 1.715 ± 0.211
LysArg: 3.118 ± 0.42
LysSer: 2.338 ± 0.283
LysThr: 2.221 ± 0.323
LysVal: 2.884 ± 0.474
LysTrp: 0.896 ± 0.213
LysTyr: 0.818 ± 0.17
LysXaa: 0.0 ± 0.0
Leu
LeuAla: 9.626 ± 0.757
LeuCys: 1.364 ± 0.448
LeuAsp: 5.885 ± 0.745
LeuGlu: 5.222 ± 0.459
LeuPhe: 2.533 ± 0.352
LeuGly: 7.482 ± 0.73
LeuHis: 2.455 ± 0.424
LeuIle: 2.923 ± 0.341
LeuLys: 2.611 ± 0.272
LeuLeu: 8.652 ± 1.283
LeuMet: 1.91 ± 0.369
LeuAsn: 2.494 ± 0.304
LeuPro: 5.144 ± 0.467
LeuGln: 2.845 ± 0.229
LeuArg: 8.34 ± 0.826
LeuSer: 5.222 ± 0.48
LeuThr: 3.585 ± 0.362
LeuVal: 6.196 ± 0.857
LeuTrp: 0.779 ± 0.16
LeuTyr: 1.364 ± 0.314
LeuXaa: 0.0 ± 0.0
Met
MetAla: 2.182 ± 0.308
MetCys: 0.039 ± 0.042
MetAsp: 0.857 ± 0.253
MetGlu: 1.052 ± 0.203
MetPhe: 1.013 ± 0.244
MetGly: 1.169 ± 0.296
MetHis: 0.663 ± 0.153
MetIle: 0.546 ± 0.175
MetLys: 0.974 ± 0.185
MetLeu: 1.988 ± 0.243
MetMet: 0.624 ± 0.162
MetAsn: 0.507 ± 0.145
MetPro: 1.871 ± 0.316
MetGln: 0.896 ± 0.195
MetArg: 2.221 ± 0.284
MetSer: 1.598 ± 0.269
MetThr: 1.286 ± 0.298
MetVal: 1.247 ± 0.216
MetTrp: 0.273 ± 0.118
MetTyr: 0.312 ± 0.099
MetXaa: 0.0 ± 0.0
Asn
AsnAla: 3.78 ± 0.393
AsnCys: 0.39 ± 0.143
AsnAsp: 2.065 ± 0.311
AsnGlu: 1.364 ± 0.206
AsnPhe: 1.013 ± 0.194
AsnGly: 2.923 ± 0.505
AsnHis: 0.663 ± 0.178
AsnIle: 1.403 ± 0.223
AsnLys: 1.247 ± 0.334
AsnLeu: 1.949 ± 0.326
AsnMet: 0.74 ± 0.234
AsnAsn: 0.857 ± 0.275
AsnPro: 1.286 ± 0.248
AsnGln: 0.701 ± 0.179
AsnArg: 2.338 ± 0.279
AsnSer: 1.286 ± 0.229
AsnThr: 1.481 ± 0.275
AsnVal: 1.832 ± 0.365
AsnTrp: 0.234 ± 0.104
AsnTyr: 0.624 ± 0.151
AsnXaa: 0.0 ± 0.0
Pro
ProAla: 6.43 ± 0.794
ProCys: 0.779 ± 0.213
ProAsp: 3.78 ± 0.383
ProGlu: 2.455 ± 0.416
ProPhe: 1.442 ± 0.223
ProGly: 3.78 ± 0.383
ProHis: 0.857 ± 0.255
ProIle: 2.572 ± 0.325
ProLys: 2.027 ± 0.306
ProLeu: 3.741 ± 0.388
ProMet: 1.091 ± 0.242
ProAsn: 1.637 ± 0.205
ProPro: 2.494 ± 0.373
ProGln: 1.247 ± 0.26
ProArg: 4.365 ± 0.643
ProSer: 4.053 ± 0.477
ProThr: 2.806 ± 0.409
ProVal: 3.118 ± 0.399
ProTrp: 0.74 ± 0.175
ProTyr: 0.974 ± 0.229
ProXaa: 0.0 ± 0.0
Gln
GlnAla: 4.014 ± 0.605
GlnCys: 0.234 ± 0.097
GlnAsp: 1.442 ± 0.247
GlnGlu: 1.52 ± 0.248
GlnPhe: 1.091 ± 0.227
GlnGly: 2.611 ± 0.317
GlnHis: 1.091 ± 0.191
GlnIle: 2.221 ± 0.273
GlnLys: 1.559 ± 0.251
GlnLeu: 2.377 ± 0.325
GlnMet: 0.935 ± 0.175
GlnAsn: 0.974 ± 0.152
GlnPro: 2.143 ± 0.309
GlnGln: 1.676 ± 0.477
GlnArg: 4.014 ± 0.358
GlnSer: 1.637 ± 0.258
GlnThr: 1.91 ± 0.262
GlnVal: 2.416 ± 0.304
GlnTrp: 0.546 ± 0.141
GlnTyr: 0.663 ± 0.175
GlnXaa: 0.0 ± 0.0
Arg
ArgAla: 8.301 ± 0.532
ArgCys: 2.416 ± 0.569
ArgAsp: 4.482 ± 0.431
ArgGlu: 5.3 ± 0.42
ArgPhe: 3.079 ± 0.359
ArgGly: 6.352 ± 0.587
ArgHis: 2.845 ± 0.521
ArgIle: 4.871 ± 0.435
ArgLys: 3.429 ± 0.366
ArgLeu: 8.807 ± 0.802
ArgMet: 1.871 ± 0.264
ArgAsn: 2.416 ± 0.288
ArgPro: 4.871 ± 0.593
ArgGln: 3.313 ± 0.401
ArgArg: 11.613 ± 1.353
ArgSer: 6.703 ± 0.839
ArgThr: 3.741 ± 0.449
ArgVal: 5.339 ± 0.618
ArgTrp: 1.403 ± 0.248
ArgTyr: 1.871 ± 0.261
ArgXaa: 0.0 ± 0.0
Ser
SerAla: 7.093 ± 0.577
SerCys: 0.974 ± 0.264
SerAsp: 2.767 ± 0.282
SerGlu: 3.157 ± 0.469
SerPhe: 1.754 ± 0.285
SerGly: 5.495 ± 0.671
SerHis: 1.325 ± 0.243
SerIle: 3.196 ± 0.529
SerLys: 2.299 ± 0.326
SerLeu: 4.677 ± 0.448
SerMet: 1.559 ± 0.336
SerAsn: 1.481 ± 0.221
SerPro: 3.468 ± 0.429
SerGln: 2.143 ± 0.277
SerArg: 5.651 ± 0.649
SerSer: 4.988 ± 0.804
SerThr: 3.468 ± 0.515
SerVal: 3.741 ± 0.455
SerTrp: 1.013 ± 0.201
SerTyr: 0.74 ± 0.155
SerXaa: 0.0 ± 0.0
Thr
ThrAla: 5.768 ± 0.516
ThrCys: 0.74 ± 0.195
ThrAsp: 2.377 ± 0.321
ThrGlu: 2.923 ± 0.352
ThrPhe: 1.598 ± 0.191
ThrGly: 5.105 ± 0.481
ThrHis: 0.779 ± 0.169
ThrIle: 2.299 ± 0.315
ThrLys: 2.027 ± 0.283
ThrLeu: 4.053 ± 0.348
ThrMet: 1.13 ± 0.226
ThrAsn: 1.13 ± 0.298
ThrPro: 3.001 ± 0.387
ThrGln: 1.247 ± 0.199
ThrArg: 3.741 ± 0.489
ThrSer: 3.196 ± 0.388
ThrThr: 3.468 ± 0.555
ThrVal: 3.819 ± 0.405
ThrTrp: 0.974 ± 0.186
ThrTyr: 0.818 ± 0.184
ThrXaa: 0.0 ± 0.0
Val
ValAla: 8.418 ± 0.853
ValCys: 0.585 ± 0.184
ValAsp: 4.209 ± 0.42
ValGlu: 5.417 ± 0.65
ValPhe: 1.754 ± 0.242
ValGly: 6.391 ± 0.705
ValHis: 1.559 ± 0.288
ValIle: 3.118 ± 0.327
ValLys: 2.338 ± 0.268
ValLeu: 5.417 ± 0.688
ValMet: 1.208 ± 0.21
ValAsn: 1.598 ± 0.324
ValPro: 3.858 ± 0.419
ValGln: 1.559 ± 0.223
ValArg: 5.573 ± 0.485
ValSer: 3.936 ± 0.418
ValThr: 4.287 ± 0.547
ValVal: 5.612 ± 0.783
ValTrp: 0.624 ± 0.204
ValTyr: 0.818 ± 0.198
ValXaa: 0.0 ± 0.0
Trp
TrpAla: 1.169 ± 0.201
TrpCys: 0.117 ± 0.061
TrpAsp: 0.935 ± 0.193
TrpGlu: 0.507 ± 0.122
TrpPhe: 0.507 ± 0.119
TrpGly: 0.585 ± 0.155
TrpHis: 0.546 ± 0.145
TrpIle: 0.429 ± 0.114
TrpLys: 0.663 ± 0.144
TrpLeu: 1.832 ± 0.232
TrpMet: 0.312 ± 0.092
TrpAsn: 0.624 ± 0.169
TrpPro: 0.585 ± 0.175
TrpGln: 0.624 ± 0.138
TrpArg: 1.52 ± 0.199
TrpSer: 0.857 ± 0.196
TrpThr: 0.585 ± 0.156
TrpVal: 0.896 ± 0.191
TrpTrp: 0.273 ± 0.121
TrpTyr: 0.195 ± 0.077
TrpXaa: 0.0 ± 0.0
Tyr
TyrAla: 1.169 ± 0.22
TyrCys: 0.156 ± 0.077
TyrAsp: 1.169 ± 0.28
TyrGlu: 0.896 ± 0.181
TyrPhe: 0.701 ± 0.217
TyrGly: 1.481 ± 0.238
TyrHis: 0.468 ± 0.129
TyrIle: 0.624 ± 0.161
TyrLys: 0.663 ± 0.166
TyrLeu: 1.793 ± 0.274
TyrMet: 0.234 ± 0.081
TyrAsn: 0.507 ± 0.141
TyrPro: 0.624 ± 0.158
TyrGln: 0.74 ± 0.182
TyrArg: 2.065 ± 0.27
TyrSer: 1.052 ± 0.219
TyrThr: 0.818 ± 0.144
TyrVal: 0.896 ± 0.258
TyrTrp: 0.195 ± 0.082
TyrTyr: 0.234 ± 0.123
TyrXaa: 0.0 ± 0.0
Xaa
XaaAla: 0.0 ± 0.0
XaaCys: 0.0 ± 0.0
XaaAsp: 0.0 ± 0.0
XaaGlu: 0.0 ± 0.0
XaaPhe: 0.0 ± 0.0
XaaGly: 0.0 ± 0.0
XaaHis: 0.0 ± 0.0
XaaIle: 0.0 ± 0.0
XaaLys: 0.0 ± 0.0
XaaLeu: 0.0 ± 0.0
XaaMet: 0.0 ± 0.0
XaaAsn: 0.0 ± 0.0
XaaPro: 0.0 ± 0.0
XaaGln: 0.0 ± 0.0
XaaArg: 0.0 ± 0.0
XaaSer: 0.0 ± 0.0
XaaThr: 0.0 ± 0.0
XaaVal: 0.0 ± 0.0
XaaTrp: 0.0 ± 0.0
XaaTyr: 0.0 ± 0.0
XaaXaa: 0.0 ± 0.0
Statistics based on 83 proteins (25661 amino acids)

Note: The error was estimated by bootstrapping (100 resamples) at the protein level.
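A protein-level bootstrap of this kind could look roughly like the sketch below: whole proteins are resampled with replacement, the frequency is recomputed for each resample, and the spread of those estimates is reported. The function names `pair_frequency` and `bootstrap_error`, the use of the standard deviation as the error, and the fixed random seed are assumptions for illustration; the page only states that 100 bootstrap rounds at the protein level were used.

```python
import random
import statistics
from collections import Counter


def pair_frequency(proteins, pair):
    """Frequency (per mille) of one two-letter dipeptide in a protein set."""
    counts = Counter()
    for seq in proteins:
        counts.update(seq[i:i + 2] for i in range(len(seq) - 1))
    total = sum(counts.values())
    return 1000.0 * counts[pair] / total if total else 0.0


def bootstrap_error(proteins, pair, n_resamples=100, seed=0):
    """Standard deviation of the frequency over protein-level resamples."""
    rng = random.Random(seed)
    estimates = []
    for _ in range(n_resamples):
        # Draw as many proteins as in the original set, with replacement.
        sample = rng.choices(proteins, k=len(proteins))
        estimates.append(pair_frequency(sample, pair))
    return statistics.stdev(estimates)


# Toy input for illustration only; these are not PBC5 proteins.
toy = ["MAAGL", "MAAK", "MKRRL"]
print(pair_frequency(toy, "AA"), bootstrap_error(toy, "AA"))
```

Resampling whole proteins rather than individual dipeptides keeps the pairs from one protein together, so the error reflects variation between proteins.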

The dipeptide statistics above (among other statistics for this proteome) can be downloaded from this CSV file
See this proteome in: uniprot_link
Proteome-pI is available under Creative Commons Attribution-NoDerivs license, for more details see here

Reference: Kozlowski LP. Proteome-pI 2.0: Proteome Isoelectric Point Database Update. Nucleic Acids Res. 2021, doi: 10.1093/nar/gkab944
Contact: Lukasz P. Kozlowski