Amino acid dipeptide frequency for Mycobacterium phage Refuge

Apart from single amino acid frequencies, one can also calculate the so-called amino acid dipeptide frequency, that is, how often each ordered pair of adjacent residues occurs.
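As a rough illustration of what such a frequency means, the sketch below counts overlapping residue pairs within each protein and reports them in per mille. The function name `dipeptide_frequencies` and the toy sequences are hypothetical, and whether Proteome-pI counts pairs in exactly this way (for example at protein boundaries) is an assumption, not something stated on this page.

```python
from collections import Counter

def dipeptide_frequencies(proteins):
    """Return {dipeptide: frequency in per mille} for one-letter sequences."""
    counts = Counter()
    for seq in proteins:
        # Overlapping pairs: positions (0,1), (1,2), ... within one protein.
        for i in range(len(seq) - 1):
            counts[seq[i:i + 2]] += 1
    total = sum(counts.values())
    if total == 0:
        return {}
    return {pair: 1000.0 * n / total for pair, n in counts.items()}

# Toy input, not taken from the phage proteome.
freqs = dipeptide_frequencies(["MAAG", "AAK"])
for pair, value in sorted(freqs.items()):
    print(pair, round(value, 1))
```

Because every pair is counted once, the reported values sum to 1000‰ over all observed dipeptides.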

There are 400 possible dipeptides (441 if Xaa is counted as an additional 21st amino acid representing all non-standard or unknown amino acids). Thus, if all dipeptides were equally likely in proteins, each of the 400 would be present with a frequency of 0.25% (2.5‰). As this is not the case in nature, for better visibility all dipeptides that are more frequent than expected are marked in red in the table, and those that are underrepresented are marked in blue.
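The arithmetic behind the expected value can be sketched as follows. The helper names are hypothetical, and using the uniform expectation as the exact cutoff for the red/blue marking is an assumption based on the paragraph above; the site may apply a different rule.

```python
N_STANDARD = 20   # standard amino acids
N_WITH_XAA = 21   # standard amino acids plus Xaa

def expected_per_mille(alphabet_size):
    """Uniform expectation for a single dipeptide, in per mille."""
    return 1000.0 / alphabet_size ** 2

def classify(observed, expected):
    """Label a frequency relative to the uniform expectation."""
    if observed > expected:
        return "over"
    if observed < expected:
        return "under"
    return "as expected"

expected = expected_per_mille(N_STANDARD)  # 1000 / 400 = 2.5 per mille
# Three values copied from the table (per mille).
sample = {"AlaAla": 10.987, "CysCys": 0.059, "AspPhe": 2.673}
labels = {name: classify(value, expected) for name, value in sample.items()}
print(expected, labels)
```

With 21 symbols the same calculation gives 1000 / 441, a slightly lower expectation per dipeptide.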

All values are given in per mille (‰), so they need to be multiplied by 10⁻³ to obtain fractions.
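As a small worked example of this unit conversion, the snippet below turns one table value into a fraction and a percentage. The function names are illustrative only.

```python
def per_mille_to_fraction(value):
    """Convert a per mille value to a plain fraction (multiply by 10^-3)."""
    return value * 1e-3

def per_mille_to_percent(value):
    """Convert a per mille value to percent (divide by 10)."""
    return value / 10.0

ala_ala = 10.987  # AlaAla from the table, in per mille
print(per_mille_to_fraction(ala_ala))  # roughly 0.011 as a fraction
print(per_mille_to_percent(ala_ala))   # roughly 1.1 percent
```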

For more information, see the sequence space article on Wikipedia.

Ala Cys Asp Glu Phe Gly His Ile Lys Leu Met Asn Pro Gln Arg Ser Thr Val Trp Tyr Xaa
Ala
AlaAla: 10.987 ± 0.895
AlaCys: 0.653 ± 0.197
AlaAsp: 5.761 ± 0.462
AlaGlu: 6.711 ± 0.685
AlaPhe: 3.92 ± 0.462
AlaGly: 8.374 ± 0.905
AlaHis: 1.96 ± 0.352
AlaIle: 4.751 ± 0.506
AlaLys: 5.167 ± 0.652
AlaLeu: 7.246 ± 1.042
AlaMet: 3.029 ± 0.395
AlaAsn: 2.91 ± 0.429
AlaPro: 5.404 ± 0.83
AlaGln: 4.395 ± 0.525
AlaArg: 5.88 ± 0.573
AlaSer: 5.345 ± 0.462
AlaThr: 6.295 ± 0.862
AlaVal: 7.305 ± 0.738
AlaTrp: 1.722 ± 0.239
AlaTyr: 2.435 ± 0.355
AlaXaa: 0.0 ± 0.0
Cys
CysAla: 0.891 ± 0.226
CysCys: 0.059 ± 0.061
CysAsp: 0.653 ± 0.195
CysGlu: 0.297 ± 0.167
CysPhe: 0.356 ± 0.14
CysGly: 0.713 ± 0.213
CysHis: 0.178 ± 0.141
CysIle: 0.238 ± 0.133
CysLys: 0.297 ± 0.119
CysLeu: 0.594 ± 0.176
CysMet: 0.297 ± 0.145
CysAsn: 0.416 ± 0.14
CysPro: 0.594 ± 0.32
CysGln: 0.238 ± 0.104
CysArg: 0.475 ± 0.182
CysSer: 0.297 ± 0.146
CysThr: 0.356 ± 0.145
CysVal: 0.535 ± 0.177
CysTrp: 0.297 ± 0.143
CysTyr: 0.356 ± 0.117
CysXaa: 0.0 ± 0.0
Asp
AspAla: 4.989 ± 0.494
AspCys: 0.475 ± 0.165
AspAsp: 3.742 ± 0.521
AspGlu: 4.098 ± 0.574
AspPhe: 2.673 ± 0.406
AspGly: 5.167 ± 0.526
AspHis: 1.366 ± 0.344
AspIle: 3.266 ± 0.404
AspLys: 2.079 ± 0.323
AspLeu: 6.117 ± 0.761
AspMet: 1.307 ± 0.25
AspAsn: 1.841 ± 0.385
AspPro: 4.454 ± 0.624
AspGln: 1.782 ± 0.452
AspArg: 2.91 ± 0.411
AspSer: 4.157 ± 0.43
AspThr: 3.979 ± 0.504
AspVal: 4.098 ± 0.604
AspTrp: 1.128 ± 0.273
AspTyr: 1.9 ± 0.289
AspXaa: 0.0 ± 0.0
Glu
GluAla: 7.78 ± 0.698
GluCys: 0.119 ± 0.092
GluAsp: 3.266 ± 0.595
GluGlu: 3.86 ± 0.613
GluPhe: 2.969 ± 0.466
GluGly: 4.395 ± 0.465
GluHis: 1.722 ± 0.358
GluIle: 3.029 ± 0.466
GluLys: 2.257 ± 0.343
GluLeu: 5.107 ± 0.651
GluMet: 2.316 ± 0.448
GluAsn: 1.96 ± 0.307
GluPro: 2.079 ± 0.355
GluGln: 2.435 ± 0.359
GluArg: 4.573 ± 0.689
GluSer: 3.148 ± 0.424
GluThr: 3.801 ± 0.532
GluVal: 4.87 ± 0.533
GluTrp: 1.425 ± 0.295
GluTyr: 2.019 ± 0.356
GluXaa: 0.0 ± 0.0
Phe
PheAla: 3.563 ± 0.546
PheCys: 0.238 ± 0.142
PheAsp: 2.376 ± 0.459
PheGlu: 2.613 ± 0.393
PhePhe: 0.891 ± 0.214
PheGly: 3.385 ± 0.425
PheHis: 0.594 ± 0.203
PheIle: 1.663 ± 0.3
PheLys: 1.128 ± 0.276
PheLeu: 2.494 ± 0.494
PheMet: 0.356 ± 0.148
PheAsn: 1.782 ± 0.304
PhePro: 2.197 ± 0.31
PheGln: 1.188 ± 0.31
PheArg: 2.138 ± 0.325
PheSer: 2.257 ± 0.304
PheThr: 2.376 ± 0.338
PheVal: 2.316 ± 0.443
PheTrp: 0.535 ± 0.181
PheTyr: 0.831 ± 0.21
PheXaa: 0.0 ± 0.0
Gly
GlyAla: 7.186 ± 0.869
GlyCys: 0.713 ± 0.18
GlyAsp: 6.533 ± 0.699
GlyGlu: 4.751 ± 0.472
GlyPhe: 2.257 ± 0.343
GlyGly: 8.611 ± 1.642
GlyHis: 1.663 ± 0.371
GlyIle: 4.811 ± 0.717
GlyLys: 3.979 ± 0.401
GlyLeu: 5.939 ± 0.691
GlyMet: 1.841 ± 0.301
GlyAsn: 3.266 ± 0.471
GlyPro: 4.395 ± 1.167
GlyGln: 2.673 ± 0.422
GlyArg: 5.167 ± 0.604
GlySer: 4.751 ± 0.912
GlyThr: 5.642 ± 0.543
GlyVal: 6.473 ± 0.717
GlyTrp: 2.079 ± 0.277
GlyTyr: 2.791 ± 0.39
GlyXaa: 0.0 ± 0.0
His
HisAla: 1.366 ± 0.282
HisCys: 0.119 ± 0.071
HisAsp: 1.01 ± 0.295
HisGlu: 1.128 ± 0.297
HisPhe: 0.713 ± 0.208
HisGly: 1.722 ± 0.368
HisHis: 0.416 ± 0.138
HisIle: 0.95 ± 0.26
HisLys: 1.188 ± 0.297
HisLeu: 1.485 ± 0.31
HisMet: 0.416 ± 0.161
HisAsn: 0.356 ± 0.14
HisPro: 1.188 ± 0.241
HisGln: 0.95 ± 0.22
HisArg: 1.663 ± 0.38
HisSer: 1.01 ± 0.241
HisThr: 1.01 ± 0.246
HisVal: 1.544 ± 0.312
HisTrp: 0.416 ± 0.176
HisTyr: 0.535 ± 0.251
HisXaa: 0.0 ± 0.0
Ile
IleAla: 4.335 ± 0.469
IleCys: 0.594 ± 0.187
IleAsp: 3.742 ± 0.419
IleGlu: 4.692 ± 0.438
IlePhe: 0.95 ± 0.255
IleGly: 3.326 ± 0.469
IleHis: 1.069 ± 0.234
IleIle: 1.9 ± 0.284
IleLys: 2.732 ± 0.367
IleLeu: 4.038 ± 0.45
IleMet: 0.831 ± 0.224
IleAsn: 2.079 ± 0.407
IlePro: 3.742 ± 0.398
IleGln: 1.841 ± 0.472
IleArg: 2.91 ± 0.437
IleSer: 2.494 ± 0.339
IleThr: 3.682 ± 0.402
IleVal: 3.504 ± 0.411
IleTrp: 0.713 ± 0.2
IleTyr: 1.01 ± 0.275
IleXaa: 0.0 ± 0.0
Lys
LysAla: 4.989 ± 0.643
LysCys: 0.119 ± 0.096
LysAsp: 2.138 ± 0.338
LysGlu: 1.782 ± 0.294
LysPhe: 1.247 ± 0.22
LysGly: 3.623 ± 0.575
LysHis: 0.475 ± 0.169
LysIle: 2.019 ± 0.443
LysLys: 3.029 ± 0.48
LysLeu: 4.573 ± 0.506
LysMet: 0.891 ± 0.255
LysAsn: 1.069 ± 0.291
LysPro: 2.316 ± 0.393
LysGln: 1.663 ± 0.283
LysArg: 3.207 ± 0.545
LysSer: 2.791 ± 0.397
LysThr: 2.791 ± 0.382
LysVal: 3.801 ± 0.455
LysTrp: 0.713 ± 0.202
LysTyr: 1.069 ± 0.281
LysXaa: 0.0 ± 0.0
Leu
LeuAla: 9.205 ± 0.791
LeuCys: 0.831 ± 0.198
LeuAsp: 4.276 ± 0.442
LeuGlu: 4.454 ± 0.633
LeuPhe: 2.435 ± 0.338
LeuGly: 6.83 ± 0.754
LeuHis: 1.722 ± 0.37
LeuIle: 4.098 ± 0.36
LeuLys: 3.445 ± 0.442
LeuLeu: 5.523 ± 0.544
LeuMet: 1.722 ± 0.351
LeuAsn: 2.613 ± 0.409
LeuPro: 5.167 ± 0.548
LeuGln: 2.138 ± 0.368
LeuArg: 5.345 ± 0.655
LeuSer: 5.82 ± 0.624
LeuThr: 4.989 ± 0.576
LeuVal: 4.692 ± 0.58
LeuTrp: 1.663 ± 0.27
LeuTyr: 2.376 ± 0.464
LeuXaa: 0.0 ± 0.0
Met
MetAla: 3.088 ± 0.415
MetCys: 0.059 ± 0.058
MetAsp: 1.069 ± 0.213
MetGlu: 1.425 ± 0.283
MetPhe: 1.069 ± 0.295
MetGly: 2.019 ± 0.358
MetHis: 0.416 ± 0.158
MetIle: 1.425 ± 0.34
MetLys: 1.663 ± 0.318
MetLeu: 1.604 ± 0.356
MetMet: 0.653 ± 0.191
MetAsn: 0.95 ± 0.223
MetPro: 1.128 ± 0.311
MetGln: 1.01 ± 0.218
MetArg: 1.425 ± 0.316
MetSer: 2.079 ± 0.389
MetThr: 1.96 ± 0.35
MetVal: 1.722 ± 0.32
MetTrp: 0.475 ± 0.171
MetTyr: 0.594 ± 0.184
MetXaa: 0.0 ± 0.0
Asn
AsnAla: 3.207 ± 0.423
AsnCys: 0.475 ± 0.175
AsnAsp: 1.722 ± 0.303
AsnGlu: 1.782 ± 0.303
AsnPhe: 0.891 ± 0.226
AsnGly: 3.326 ± 0.501
AsnHis: 0.772 ± 0.21
AsnIle: 1.366 ± 0.262
AsnLys: 1.069 ± 0.253
AsnLeu: 3.385 ± 0.437
AsnMet: 0.95 ± 0.22
AsnAsn: 1.188 ± 0.362
AsnPro: 2.732 ± 0.355
AsnGln: 1.188 ± 0.217
AsnArg: 1.663 ± 0.281
AsnSer: 1.663 ± 0.302
AsnThr: 1.366 ± 0.27
AsnVal: 2.494 ± 0.398
AsnTrp: 0.594 ± 0.184
AsnTyr: 0.831 ± 0.21
AsnXaa: 0.0 ± 0.0
Pro
ProAla: 5.404 ± 0.525
ProCys: 0.416 ± 0.185
ProAsp: 3.92 ± 0.612
ProGlu: 4.87 ± 0.489
ProPhe: 2.019 ± 0.421
ProGly: 4.751 ± 0.534
ProHis: 1.307 ± 0.248
ProIle: 2.851 ± 0.409
ProLys: 2.019 ± 0.419
ProLeu: 3.088 ± 0.379
ProMet: 1.604 ± 0.375
ProAsn: 2.257 ± 0.305
ProPro: 1.9 ± 0.374
ProGln: 2.197 ± 0.448
ProArg: 3.979 ± 0.501
ProSer: 2.791 ± 0.423
ProThr: 3.92 ± 0.573
ProVal: 4.038 ± 0.542
ProTrp: 0.772 ± 0.304
ProTyr: 1.247 ± 0.269
ProXaa: 0.0 ± 0.0
Gln
GlnAla: 4.335 ± 0.5
GlnCys: 0.356 ± 0.156
GlnAsp: 1.485 ± 0.321
GlnGlu: 1.841 ± 0.358
GlnPhe: 1.247 ± 0.27
GlnGly: 4.276 ± 1.499
GlnHis: 0.416 ± 0.166
GlnIle: 2.91 ± 0.392
GlnLys: 1.485 ± 0.31
GlnLeu: 3.266 ± 0.799
GlnMet: 1.01 ± 0.217
GlnAsn: 0.95 ± 0.257
GlnPro: 1.425 ± 0.352
GlnGln: 2.316 ± 0.427
GlnArg: 1.96 ± 0.351
GlnSer: 1.782 ± 0.35
GlnThr: 1.9 ± 0.358
GlnVal: 2.554 ± 0.386
GlnTrp: 0.772 ± 0.17
GlnTyr: 1.188 ± 0.223
GlnXaa: 0.0 ± 0.0
Arg
ArgAla: 5.583 ± 0.562
ArgCys: 0.95 ± 0.322
ArgAsp: 4.514 ± 0.527
ArgGlu: 4.514 ± 0.592
ArgPhe: 2.197 ± 0.425
ArgGly: 4.87 ± 0.499
ArgHis: 1.188 ± 0.323
ArgIle: 3.445 ± 0.471
ArgLys: 2.673 ± 0.419
ArgLeu: 4.989 ± 0.508
ArgMet: 2.435 ± 0.347
ArgAsn: 1.96 ± 0.314
ArgPro: 3.029 ± 0.391
ArgGln: 2.138 ± 0.381
ArgArg: 4.929 ± 0.71
ArgSer: 3.266 ± 0.463
ArgThr: 2.91 ± 0.443
ArgVal: 4.395 ± 0.535
ArgTrp: 1.782 ± 0.346
ArgTyr: 1.96 ± 0.33
ArgXaa: 0.0 ± 0.0
Ser
SerAla: 5.464 ± 0.721
SerCys: 0.297 ± 0.116
SerAsp: 3.266 ± 0.405
SerGlu: 3.86 ± 0.561
SerPhe: 2.316 ± 0.416
SerGly: 6.236 ± 0.994
SerHis: 0.713 ± 0.176
SerIle: 2.494 ± 0.321
SerLys: 2.197 ± 0.348
SerLeu: 5.345 ± 0.597
SerMet: 1.96 ± 0.356
SerAsn: 1.128 ± 0.241
SerPro: 3.504 ± 0.477
SerGln: 2.257 ± 0.364
SerArg: 3.86 ± 0.538
SerSer: 3.801 ± 0.665
SerThr: 3.148 ± 0.481
SerVal: 3.92 ± 0.486
SerTrp: 1.188 ± 0.218
SerTyr: 1.841 ± 0.388
SerXaa: 0.0 ± 0.0
Thr
ThrAla: 6.711 ± 0.54
ThrCys: 0.594 ± 0.2
ThrAsp: 3.979 ± 0.532
ThrGlu: 3.326 ± 0.474
ThrPhe: 2.197 ± 0.339
ThrGly: 5.583 ± 0.941
ThrHis: 1.188 ± 0.301
ThrIle: 2.613 ± 0.418
ThrLys: 3.029 ± 0.535
ThrLeu: 5.286 ± 0.463
ThrMet: 1.247 ± 0.342
ThrAsn: 1.307 ± 0.325
ThrPro: 4.276 ± 0.528
ThrGln: 2.316 ± 0.356
ThrArg: 3.088 ± 0.393
ThrSer: 3.563 ± 0.469
ThrThr: 3.207 ± 0.592
ThrVal: 4.692 ± 0.516
ThrTrp: 1.069 ± 0.256
ThrTyr: 1.604 ± 0.27
ThrXaa: 0.0 ± 0.0
Val
ValAla: 6.889 ± 0.695
ValCys: 0.475 ± 0.184
ValAsp: 4.751 ± 0.597
ValGlu: 4.098 ± 0.466
ValPhe: 3.266 ± 0.504
ValGly: 4.751 ± 0.466
ValHis: 1.128 ± 0.259
ValIle: 3.742 ± 0.37
ValLys: 3.207 ± 0.428
ValLeu: 5.998 ± 0.598
ValMet: 1.604 ± 0.314
ValAsn: 2.969 ± 0.451
ValPro: 3.504 ± 0.418
ValGln: 2.197 ± 0.36
ValArg: 5.345 ± 0.457
ValSer: 5.226 ± 0.725
ValThr: 4.632 ± 0.603
ValVal: 4.87 ± 0.585
ValTrp: 1.247 ± 0.259
ValTyr: 1.782 ± 0.341
ValXaa: 0.0 ± 0.0
Trp
TrpAla: 1.96 ± 0.326
TrpCys: 0.356 ± 0.145
TrpAsp: 1.307 ± 0.283
TrpGlu: 1.069 ± 0.269
TrpPhe: 0.713 ± 0.209
TrpGly: 1.128 ± 0.296
TrpHis: 0.653 ± 0.231
TrpIle: 1.307 ± 0.262
TrpLys: 0.653 ± 0.161
TrpLeu: 1.128 ± 0.244
TrpMet: 0.713 ± 0.21
TrpAsn: 0.713 ± 0.21
TrpPro: 1.01 ± 0.283
TrpGln: 1.247 ± 0.309
TrpArg: 1.307 ± 0.266
TrpSer: 1.01 ± 0.227
TrpThr: 1.069 ± 0.213
TrpVal: 1.188 ± 0.259
TrpTrp: 0.594 ± 0.197
TrpTyr: 0.416 ± 0.163
TrpXaa: 0.0 ± 0.0
Tyr
TyrAla: 2.316 ± 0.372
TyrCys: 0.238 ± 0.158
TyrAsp: 2.019 ± 0.35
TyrGlu: 1.96 ± 0.374
TyrPhe: 0.594 ± 0.155
TyrGly: 2.435 ± 0.306
TyrHis: 0.238 ± 0.123
TyrIle: 1.307 ± 0.291
TyrLys: 1.01 ± 0.293
TyrLeu: 2.138 ± 0.376
TyrMet: 0.594 ± 0.168
TyrAsn: 0.95 ± 0.232
TyrPro: 1.307 ± 0.248
TyrGln: 1.307 ± 0.256
TyrArg: 1.9 ± 0.39
TyrSer: 1.604 ± 0.324
TyrThr: 1.841 ± 0.345
TyrVal: 2.673 ± 0.33
TyrTrp: 0.356 ± 0.129
TyrTyr: 0.772 ± 0.203
TyrXaa: 0.0 ± 0.0
Xaa
XaaAla: 0.0 ± 0.0
XaaCys: 0.0 ± 0.0
XaaAsp: 0.0 ± 0.0
XaaGlu: 0.0 ± 0.0
XaaPhe: 0.0 ± 0.0
XaaGly: 0.0 ± 0.0
XaaHis: 0.0 ± 0.0
XaaIle: 0.0 ± 0.0
XaaLys: 0.0 ± 0.0
XaaLeu: 0.0 ± 0.0
XaaMet: 0.0 ± 0.0
XaaAsn: 0.0 ± 0.0
XaaPro: 0.0 ± 0.0
XaaGln: 0.0 ± 0.0
XaaArg: 0.0 ± 0.0
XaaSer: 0.0 ± 0.0
XaaThr: 0.0 ± 0.0
XaaVal: 0.0 ± 0.0
XaaTrp: 0.0 ± 0.0
XaaTyr: 0.0 ± 0.0
XaaXaa: 0.0 ± 0.0
Statistics based on 91 proteins (16839 amino acids)

Note: The error has been estimated by bootstrapping (×100) at the protein level.
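A protein-level bootstrap of this kind can be sketched as below: whole proteins are resampled with replacement, the frequency is recomputed for each resample, and the spread of the results is reported. The function names, the toy sequences, the fixed seed, and the use of the standard deviation as the reported error are all assumptions for illustration; the page does not specify these details.

```python
import random
import statistics
from collections import Counter

def pair_frequency(proteins, pair):
    """Frequency of one dipeptide (per mille) over overlapping pairs."""
    counts = Counter()
    for seq in proteins:
        for i in range(len(seq) - 1):
            counts[seq[i:i + 2]] += 1
    total = sum(counts.values())
    return 1000.0 * counts[pair] / total if total else 0.0

def bootstrap_error(proteins, pair, n_resamples=100, seed=0):
    """Standard deviation of the frequency across protein-level resamples."""
    rng = random.Random(seed)
    estimates = []
    for _ in range(n_resamples):
        # Resample whole proteins with replacement, keeping the proteome size.
        resample = rng.choices(proteins, k=len(proteins))
        estimates.append(pair_frequency(resample, pair))
    return statistics.stdev(estimates)

# Toy one-letter sequences, not taken from the phage proteome.
toy = ["MAAGK", "MKAAL", "MGGAA", "MLKKA"]
print(pair_frequency(toy, "AA"), bootstrap_error(toy, "AA"))
```

Resampling proteins rather than individual residues keeps each sequence intact, so the estimate reflects variation between proteins.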

The above dipeptide statistics (among other statistics for this proteome) can be downloaded from this CSV file
See this proteome in: uniprot_link
Proteome-pI is available under the Creative Commons Attribution-NoDerivs license; for more details see here

Reference: Kozlowski LP. Proteome-pI 2.0: Proteome Isoelectric Point Database Update. Nucleic Acids Res. 2021, doi: 10.1093/nar/gkab944
Contact: Lukasz P. Kozlowski