Amino acid dipeptide frequency for Mycobacterium phage SarFire

Apart from single amino acid frequencies, one can also calculate the so-called amino acid dipeptide frequency, i.e. how often each ordered pair of adjacent amino acids occurs.

There are 400 possible dipeptides (441 if Xaa is counted as an additional 21st amino acid representing all non-standard or unknown amino acids). Thus, if dipeptides occurred randomly (with equal probability) in proteins, each dipeptide would be present at a frequency of 0.25% (1/400). As this is not the case in nature, for better visibility all dipeptides that are more frequent than expected are marked in red, and those that are underrepresented are marked in blue in the table.
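The counting described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the pipeline used to build the table: the function name `dipeptide_permille`, the toy sequences, and the choice to normalize by the total number of overlapping pairs are all hypothetical.

```python
from collections import Counter
from itertools import product

STANDARD = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard amino acids, one-letter codes


def dipeptide_permille(proteins):
    """Return {dipeptide: frequency in per mille} over overlapping residue pairs.

    Assumption: frequencies are normalized by the total number of pairs;
    the database may use a different denominator.
    """
    counts = Counter()
    for seq in proteins:
        # Map anything outside the 20 standard letters to X (the Xaa class).
        seq = "".join(aa if aa in STANDARD else "X" for aa in seq.upper())
        # Overlapping pairs within one protein; pairs never span two proteins.
        counts.update(seq[i:i + 2] for i in range(len(seq) - 1))
    total = sum(counts.values())
    alphabet = STANDARD + "X"
    return {
        a + b: (1000.0 * counts[a + b] / total if total else 0.0)
        for a, b in product(alphabet, repeat=2)
    }


toy = ["MAAGL", "MKAAX"]  # hypothetical toy sequences, not phage proteins
freqs = dipeptide_permille(toy)
uniform = 1000.0 / 400  # uniform expectation: 2.5 per mille = 0.25%
print(len(freqs), freqs["AA"], uniform)
```

With 21 symbols the dictionary has 441 entries, matching the count given above; entries above `uniform` would correspond to the overrepresented dipeptides.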

All values are given in per mille (‰) and therefore need to be multiplied by 10^-3 to obtain fractions.
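As a small worked example of the unit conversion, the snippet below turns the AlaAla value from the table (14.21‰) into a fraction and a percentage and compares it with the 0.25% uniform expectation. The helper name `permille_to_fraction` is made up for this illustration.

```python
def permille_to_fraction(value_permille):
    """Convert a per mille value to a plain fraction (multiply by 10^-3)."""
    return value_permille * 1e-3


ala_ala = 14.21  # AlaAla frequency from the table, in per mille
fraction = permille_to_fraction(ala_ala)
percent = fraction * 100
ratio = percent / 0.25  # relative to the uniform 0.25% expectation

print(f"fraction={fraction:.5f} percent={percent:.3f}% ratio={ratio:.2f}x")
```

Under the uniform assumption, AlaAla is therefore roughly 5.7 times more frequent than expected.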

For more information, see the sequence space article on Wikipedia.

Ala Cys Asp Glu Phe Gly His Ile Lys Leu Met Asn Pro Gln Arg Ser Thr Val Trp Tyr Xaa
Ala
AlaAla: 14.21 ± 1.487
AlaCys: 0.651 ± 0.191
AlaAsp: 7.342 ± 0.751
AlaGlu: 6.987 ± 0.627
AlaPhe: 2.961 ± 0.494
AlaGly: 7.993 ± 0.746
AlaHis: 1.717 ± 0.315
AlaIle: 4.263 ± 0.503
AlaLys: 4.026 ± 0.487
AlaLeu: 9.592 ± 0.82
AlaMet: 2.842 ± 0.395
AlaAsn: 2.783 ± 0.451
AlaPro: 4.855 ± 0.72
AlaGln: 2.428 ± 0.372
AlaArg: 6.454 ± 0.507
AlaSer: 5.388 ± 0.487
AlaThr: 5.625 ± 0.631
AlaVal: 8.882 ± 0.64
AlaTrp: 1.895 ± 0.346
AlaTyr: 2.724 ± 0.326
AlaXaa: 0.0 ± 0.0
Cys
CysAla: 0.947 ± 0.287
CysCys: 0.0 ± 0.0
CysAsp: 0.355 ± 0.121
CysGlu: 0.711 ± 0.186
CysPhe: 0.118 ± 0.074
CysGly: 0.474 ± 0.184
CysHis: 0.178 ± 0.083
CysIle: 0.178 ± 0.115
CysLys: 0.355 ± 0.124
CysLeu: 0.533 ± 0.184
CysMet: 0.118 ± 0.083
CysAsn: 0.237 ± 0.102
CysPro: 0.355 ± 0.129
CysGln: 0.118 ± 0.07
CysArg: 0.414 ± 0.156
CysSer: 0.296 ± 0.132
CysThr: 0.533 ± 0.18
CysVal: 0.414 ± 0.138
CysTrp: 0.296 ± 0.125
CysTyr: 0.059 ± 0.052
CysXaa: 0.0 ± 0.0
Asp
AspAla: 7.046 ± 0.74
AspCys: 0.474 ± 0.171
AspAsp: 4.263 ± 0.521
AspGlu: 3.671 ± 0.482
AspPhe: 2.783 ± 0.361
AspGly: 6.039 ± 0.509
AspHis: 1.125 ± 0.248
AspIle: 2.783 ± 0.391
AspLys: 2.724 ± 0.474
AspLeu: 6.928 ± 0.753
AspMet: 1.184 ± 0.204
AspAsn: 1.776 ± 0.316
AspPro: 4.796 ± 0.598
AspGln: 1.539 ± 0.264
AspArg: 4.085 ± 0.47
AspSer: 3.612 ± 0.457
AspThr: 3.553 ± 0.363
AspVal: 4.559 ± 0.523
AspTrp: 1.776 ± 0.289
AspTyr: 1.895 ± 0.318
AspXaa: 0.0 ± 0.0
Glu
GluAla: 6.395 ± 0.743
GluCys: 0.296 ± 0.142
GluAsp: 4.855 ± 0.581
GluGlu: 5.092 ± 0.606
GluPhe: 2.013 ± 0.307
GluGly: 3.671 ± 0.42
GluHis: 1.717 ± 0.318
GluIle: 3.079 ± 0.415
GluLys: 2.783 ± 0.385
GluLeu: 7.046 ± 0.544
GluMet: 1.895 ± 0.317
GluAsn: 1.954 ± 0.416
GluPro: 2.428 ± 0.381
GluGln: 2.368 ± 0.316
GluArg: 4.263 ± 0.526
GluSer: 3.493 ± 0.438
GluThr: 4.204 ± 0.568
GluVal: 5.743 ± 0.588
GluTrp: 1.717 ± 0.322
GluTyr: 2.428 ± 0.381
GluXaa: 0.0 ± 0.0
Phe
PheAla: 2.132 ± 0.278
PheCys: 0.237 ± 0.14
PheAsp: 2.724 ± 0.391
PheGlu: 2.072 ± 0.303
PhePhe: 0.651 ± 0.17
PheGly: 3.434 ± 0.418
PheHis: 0.533 ± 0.219
PheIle: 1.48 ± 0.284
PheLys: 1.184 ± 0.272
PheLeu: 2.309 ± 0.401
PheMet: 0.533 ± 0.191
PheAsn: 1.066 ± 0.237
PhePro: 1.599 ± 0.273
PheGln: 0.888 ± 0.231
PheArg: 1.599 ± 0.319
PheSer: 1.895 ± 0.459
PheThr: 2.013 ± 0.352
PheVal: 1.895 ± 0.33
PheTrp: 0.592 ± 0.168
PheTyr: 0.888 ± 0.216
PheXaa: 0.0 ± 0.0
Gly
GlyAla: 6.987 ± 0.956
GlyCys: 0.651 ± 0.174
GlyAsp: 5.803 ± 0.492
GlyGlu: 4.974 ± 0.557
GlyPhe: 2.605 ± 0.457
GlyGly: 9.0 ± 2.656
GlyHis: 1.599 ± 0.286
GlyIle: 3.849 ± 0.629
GlyLys: 3.553 ± 0.503
GlyLeu: 7.46 ± 0.866
GlyMet: 1.836 ± 0.379
GlyAsn: 3.612 ± 0.693
GlyPro: 3.73 ± 0.5
GlyGln: 2.072 ± 0.343
GlyArg: 4.974 ± 0.57
GlySer: 6.217 ± 0.957
GlyThr: 5.329 ± 0.661
GlyVal: 5.388 ± 0.552
GlyTrp: 2.428 ± 0.414
GlyTyr: 2.842 ± 0.367
GlyXaa: 0.0 ± 0.0
His
HisAla: 1.776 ± 0.356
HisCys: 0.178 ± 0.119
HisAsp: 1.184 ± 0.239
HisGlu: 1.421 ± 0.292
HisPhe: 0.592 ± 0.178
HisGly: 1.658 ± 0.371
HisHis: 0.533 ± 0.171
HisIle: 0.888 ± 0.21
HisLys: 1.243 ± 0.281
HisLeu: 1.184 ± 0.271
HisMet: 0.237 ± 0.117
HisAsn: 0.296 ± 0.136
HisPro: 1.599 ± 0.278
HisGln: 0.829 ± 0.222
HisArg: 1.125 ± 0.257
HisSer: 0.888 ± 0.238
HisThr: 1.362 ± 0.299
HisVal: 1.717 ± 0.342
HisTrp: 0.355 ± 0.129
HisTyr: 0.651 ± 0.214
HisXaa: 0.0 ± 0.0
Ile
IleAla: 6.454 ± 0.747
IleCys: 0.296 ± 0.121
IleAsp: 2.961 ± 0.405
IleGlu: 3.73 ± 0.441
IlePhe: 1.007 ± 0.224
IleGly: 3.553 ± 0.471
IleHis: 0.888 ± 0.231
IleIle: 1.539 ± 0.309
IleLys: 1.776 ± 0.333
IleLeu: 3.316 ± 0.438
IleMet: 0.77 ± 0.228
IleAsn: 1.717 ± 0.302
IlePro: 3.079 ± 0.379
IleGln: 1.421 ± 0.319
IleArg: 3.967 ± 0.462
IleSer: 3.02 ± 0.387
IleThr: 3.138 ± 0.493
IleVal: 2.368 ± 0.455
IleTrp: 0.592 ± 0.16
IleTyr: 1.362 ± 0.23
IleXaa: 0.0 ± 0.0
Lys
LysAla: 3.849 ± 0.587
LysCys: 0.178 ± 0.101
LysAsp: 2.428 ± 0.419
LysGlu: 2.428 ± 0.413
LysPhe: 1.362 ± 0.265
LysGly: 2.309 ± 0.399
LysHis: 1.007 ± 0.257
LysIle: 2.664 ± 0.421
LysLys: 1.954 ± 0.452
LysLeu: 3.257 ± 0.454
LysMet: 1.007 ± 0.226
LysAsn: 1.599 ± 0.253
LysPro: 2.724 ± 0.386
LysGln: 1.48 ± 0.258
LysArg: 3.079 ± 0.385
LysSer: 2.546 ± 0.43
LysThr: 2.309 ± 0.418
LysVal: 3.079 ± 0.442
LysTrp: 0.592 ± 0.199
LysTyr: 1.066 ± 0.253
LysXaa: 0.0 ± 0.0
Leu
LeuAla: 10.184 ± 0.82
LeuCys: 0.414 ± 0.129
LeuAsp: 6.276 ± 0.717
LeuGlu: 5.21 ± 0.527
LeuPhe: 1.895 ± 0.345
LeuGly: 7.579 ± 0.9
LeuHis: 1.539 ± 0.328
LeuIle: 4.559 ± 0.571
LeuLys: 3.967 ± 0.461
LeuLeu: 5.566 ± 0.604
LeuMet: 1.776 ± 0.324
LeuAsn: 2.783 ± 0.358
LeuPro: 5.329 ± 0.524
LeuGln: 2.309 ± 0.389
LeuArg: 6.395 ± 0.514
LeuSer: 5.743 ± 0.544
LeuThr: 5.862 ± 0.36
LeuVal: 4.737 ± 0.618
LeuTrp: 1.243 ± 0.252
LeuTyr: 2.428 ± 0.393
LeuXaa: 0.0 ± 0.0
Met
MetAla: 2.368 ± 0.346
MetCys: 0.059 ± 0.073
MetAsp: 1.362 ± 0.254
MetGlu: 1.599 ± 0.305
MetPhe: 0.592 ± 0.168
MetGly: 1.125 ± 0.273
MetHis: 0.474 ± 0.211
MetIle: 0.77 ± 0.234
MetLys: 0.888 ± 0.23
MetLeu: 1.303 ± 0.27
MetMet: 0.237 ± 0.117
MetAsn: 1.125 ± 0.232
MetPro: 1.184 ± 0.282
MetGln: 0.651 ± 0.182
MetArg: 1.243 ± 0.251
MetSer: 1.954 ± 0.38
MetThr: 2.546 ± 0.289
MetVal: 1.243 ± 0.293
MetTrp: 0.355 ± 0.14
MetTyr: 0.474 ± 0.151
MetXaa: 0.0 ± 0.0
Asn
AsnAla: 3.079 ± 0.462
AsnCys: 0.059 ± 0.059
AsnAsp: 2.191 ± 0.375
AsnGlu: 2.132 ± 0.384
AsnPhe: 0.947 ± 0.234
AsnGly: 3.789 ± 0.539
AsnHis: 0.829 ± 0.208
AsnIle: 1.717 ± 0.332
AsnLys: 0.296 ± 0.154
AsnLeu: 2.309 ± 0.361
AsnMet: 0.474 ± 0.147
AsnAsn: 1.007 ± 0.227
AsnPro: 2.487 ± 0.362
AsnGln: 0.888 ± 0.238
AsnArg: 1.421 ± 0.305
AsnSer: 1.895 ± 0.419
AsnThr: 2.072 ± 0.369
AsnVal: 2.605 ± 0.416
AsnTrp: 0.829 ± 0.197
AsnTyr: 1.243 ± 0.297
AsnXaa: 0.0 ± 0.0
Pro
ProAla: 5.507 ± 0.578
ProCys: 0.414 ± 0.159
ProAsp: 4.382 ± 0.487
ProGlu: 4.855 ± 0.469
ProPhe: 2.072 ± 0.361
ProGly: 4.855 ± 0.644
ProHis: 0.77 ± 0.216
ProIle: 2.25 ± 0.339
ProLys: 1.836 ± 0.302
ProLeu: 4.322 ± 0.594
ProMet: 1.125 ± 0.258
ProAsn: 1.421 ± 0.312
ProPro: 2.961 ± 0.542
ProGln: 1.421 ± 0.301
ProArg: 3.079 ± 0.432
ProSer: 3.434 ± 0.401
ProThr: 3.967 ± 0.578
ProVal: 3.967 ± 0.418
ProTrp: 0.888 ± 0.275
ProTyr: 1.362 ± 0.299
ProXaa: 0.0 ± 0.0
Gln
GlnAla: 2.783 ± 0.497
GlnCys: 0.118 ± 0.097
GlnAsp: 1.243 ± 0.302
GlnGlu: 1.303 ± 0.213
GlnPhe: 1.007 ± 0.226
GlnGly: 2.368 ± 0.361
GlnHis: 0.651 ± 0.157
GlnIle: 2.664 ± 0.485
GlnLys: 0.888 ± 0.217
GlnLeu: 3.493 ± 0.432
GlnMet: 1.007 ± 0.233
GlnAsn: 0.474 ± 0.162
GlnPro: 1.954 ± 0.305
GlnGln: 1.658 ± 0.325
GlnArg: 1.776 ± 0.317
GlnSer: 1.658 ± 0.369
GlnThr: 1.48 ± 0.298
GlnVal: 2.368 ± 0.33
GlnTrp: 0.651 ± 0.182
GlnTyr: 0.533 ± 0.146
GlnXaa: 0.0 ± 0.0
Arg
ArgAla: 5.98 ± 0.649
ArgCys: 0.77 ± 0.207
ArgAsp: 3.434 ± 0.435
ArgGlu: 4.737 ± 0.594
ArgPhe: 1.895 ± 0.348
ArgGly: 5.447 ± 0.772
ArgHis: 1.303 ± 0.364
ArgIle: 2.901 ± 0.434
ArgLys: 3.375 ± 0.552
ArgLeu: 6.158 ± 0.7
ArgMet: 1.895 ± 0.333
ArgAsn: 2.191 ± 0.387
ArgPro: 2.487 ± 0.383
ArgGln: 1.895 ± 0.309
ArgArg: 4.974 ± 0.593
ArgSer: 3.789 ± 0.479
ArgThr: 3.197 ± 0.516
ArgVal: 5.388 ± 0.51
ArgTrp: 1.184 ± 0.244
ArgTyr: 1.599 ± 0.287
ArgXaa: 0.0 ± 0.0
Ser
SerAla: 6.572 ± 0.643
SerCys: 0.533 ± 0.203
SerAsp: 3.553 ± 0.399
SerGlu: 3.789 ± 0.524
SerPhe: 1.658 ± 0.283
SerGly: 6.75 ± 1.116
SerHis: 1.184 ± 0.252
SerIle: 2.783 ± 0.452
SerLys: 2.546 ± 0.415
SerLeu: 5.329 ± 0.558
SerMet: 1.539 ± 0.331
SerAsn: 1.954 ± 0.367
SerPro: 3.197 ± 0.478
SerGln: 1.836 ± 0.261
SerArg: 3.02 ± 0.368
SerSer: 3.316 ± 0.652
SerThr: 3.079 ± 0.431
SerVal: 3.908 ± 0.429
SerTrp: 1.303 ± 0.311
SerTyr: 1.658 ± 0.322
SerXaa: 0.0 ± 0.0
Thr
ThrAla: 6.454 ± 0.911
ThrCys: 0.355 ± 0.144
ThrAsp: 3.967 ± 0.48
ThrGlu: 4.382 ± 0.496
ThrPhe: 2.25 ± 0.382
ThrGly: 6.039 ± 0.566
ThrHis: 1.184 ± 0.311
ThrIle: 2.605 ± 0.518
ThrLys: 2.487 ± 0.394
ThrLeu: 5.98 ± 0.634
ThrMet: 0.888 ± 0.2
ThrAsn: 1.776 ± 0.314
ThrPro: 3.908 ± 0.535
ThrGln: 1.954 ± 0.367
ThrArg: 3.73 ± 0.559
ThrSer: 3.197 ± 0.499
ThrThr: 3.908 ± 0.558
ThrVal: 5.27 ± 0.66
ThrTrp: 1.007 ± 0.26
ThrTyr: 1.836 ± 0.291
ThrXaa: 0.0 ± 0.0
Val
ValAla: 6.75 ± 0.683
ValCys: 0.474 ± 0.159
ValAsp: 5.388 ± 0.505
ValGlu: 5.033 ± 0.52
ValPhe: 2.013 ± 0.282
ValGly: 4.441 ± 0.539
ValHis: 1.48 ± 0.245
ValIle: 3.493 ± 0.45
ValLys: 3.079 ± 0.459
ValLeu: 5.329 ± 0.564
ValMet: 1.303 ± 0.306
ValAsn: 2.664 ± 0.366
ValPro: 4.204 ± 0.501
ValGln: 2.487 ± 0.409
ValArg: 5.447 ± 0.724
ValSer: 4.618 ± 0.379
ValThr: 5.625 ± 0.687
ValVal: 5.151 ± 0.697
ValTrp: 1.303 ± 0.257
ValTyr: 2.546 ± 0.426
ValXaa: 0.0 ± 0.0
Trp
TrpAla: 1.599 ± 0.34
TrpCys: 0.237 ± 0.107
TrpAsp: 1.362 ± 0.265
TrpGlu: 0.888 ± 0.22
TrpPhe: 0.829 ± 0.197
TrpGly: 1.658 ± 0.302
TrpHis: 0.414 ± 0.149
TrpIle: 1.066 ± 0.217
TrpLys: 0.533 ± 0.207
TrpLeu: 1.836 ± 0.302
TrpMet: 0.355 ± 0.169
TrpAsn: 0.711 ± 0.194
TrpPro: 0.829 ± 0.244
TrpGln: 0.829 ± 0.201
TrpArg: 1.243 ± 0.329
TrpSer: 1.184 ± 0.236
TrpThr: 1.539 ± 0.337
TrpVal: 2.191 ± 0.327
TrpTrp: 0.592 ± 0.215
TrpTyr: 0.355 ± 0.12
TrpXaa: 0.0 ± 0.0
Tyr
TyrAla: 2.546 ± 0.41
TyrCys: 0.296 ± 0.173
TyrAsp: 1.539 ± 0.274
TyrGlu: 2.428 ± 0.374
TyrPhe: 0.474 ± 0.143
TyrGly: 2.724 ± 0.388
TyrHis: 0.592 ± 0.195
TyrIle: 1.421 ± 0.288
TyrLys: 1.421 ± 0.289
TyrLeu: 2.546 ± 0.388
TyrMet: 0.474 ± 0.124
TyrAsn: 1.125 ± 0.316
TyrPro: 1.184 ± 0.25
TyrGln: 1.007 ± 0.255
TyrArg: 2.368 ± 0.338
TyrSer: 1.362 ± 0.251
TyrThr: 1.836 ± 0.315
TyrVal: 1.954 ± 0.32
TyrTrp: 0.651 ± 0.205
TyrTyr: 0.533 ± 0.156
TyrXaa: 0.0 ± 0.0
Xaa
XaaAla: 0.0 ± 0.0
XaaCys: 0.0 ± 0.0
XaaAsp: 0.0 ± 0.0
XaaGlu: 0.0 ± 0.0
XaaPhe: 0.0 ± 0.0
XaaGly: 0.0 ± 0.0
XaaHis: 0.0 ± 0.0
XaaIle: 0.0 ± 0.0
XaaLys: 0.0 ± 0.0
XaaLeu: 0.0 ± 0.0
XaaMet: 0.0 ± 0.0
XaaAsn: 0.0 ± 0.0
XaaPro: 0.0 ± 0.0
XaaGln: 0.0 ± 0.0
XaaArg: 0.0 ± 0.0
XaaSer: 0.0 ± 0.0
XaaThr: 0.0 ± 0.0
XaaVal: 0.0 ± 0.0
XaaTrp: 0.0 ± 0.0
XaaTyr: 0.0 ± 0.0
XaaXaa: 0.0 ± 0.0
Statistics based on 99 proteins (16890 amino acids)

Note: The error was estimated by bootstrapping (100 replicates) at the protein level.
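A protein-level bootstrap of this kind might look like the sketch below: whole proteins are resampled with replacement, the dipeptide frequency is recomputed for each resample, and the spread of those estimates is reported as the error. The function names, the fixed seed, the pair-count normalization, and the use of the sample standard deviation are assumptions made for illustration; the page does not specify these details.

```python
import random
import statistics


def pair_permille(proteins, pair):
    """Frequency of one dipeptide (per mille) over overlapping pairs."""
    hits = sum(
        1 for s in proteins for i in range(len(s) - 1) if s[i:i + 2] == pair
    )
    total = sum(max(len(s) - 1, 0) for s in proteins)
    return 1000.0 * hits / total if total else 0.0


def bootstrap_error(proteins, pair, n_rounds=100, seed=0):
    """Standard deviation of the frequency across protein-level resamples."""
    rng = random.Random(seed)  # fixed seed only to make the sketch repeatable
    estimates = []
    for _ in range(n_rounds):
        # Resample whole proteins with replacement, keeping the set size.
        sample = rng.choices(proteins, k=len(proteins))
        estimates.append(pair_permille(sample, pair))
    return statistics.stdev(estimates)


toy = ["MAAGL", "MKAAV", "MGGSA", "MLLAA"]  # hypothetical toy sequences
print(pair_permille(toy, "AA"), bootstrap_error(toy, "AA"))
```

Resampling proteins rather than individual residues keeps the pairs within each protein together, so the error reflects variation between proteins.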

The dipeptide statistics above (along with other statistics for this proteome) can be downloaded from this CSV file
See this proteome in: uniprot_link
Proteome-pI is available under the Creative Commons Attribution-NoDerivs license; for more details see here

Reference: Kozlowski LP. Proteome-pI 2.0: Proteome Isoelectric Point Database Update. Nucleic Acids Res. 2021, doi: 10.1093/nar/gkab944
Contact: Lukasz P. Kozlowski