Amino acid dipeptide frequency for Mycobacterium phage Bernal13

Apart from single amino acid frequencies, one can also calculate the so-called amino acid dipeptide frequency, i.e. how often each ordered pair of adjacent amino acids occurs in the proteome.

There are 400 possible dipeptides (441 if we consider Xaa as an additional 21st amino acid representing all non-standard or unknown amino acids). Thus, if dipeptides occurred randomly in proteins, with all 400 equally likely, each amino acid dipeptide would be present with a frequency of 0.25% (2.5‰). As this is not the case in nature, for better visibility all dipeptides that are more frequent than expected are marked in red, and those that are underrepresented are marked in blue in the table.
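The counting described above can be sketched in a few lines of Python. This is a minimal illustration, not the database's own code: the function names, the mapping of every non-standard letter to X, and the normalization by the total number of overlapping pairs are all assumptions made for the example.

```python
from collections import Counter

STANDARD = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard amino acids (one-letter codes)
EXPECTED = 1000.0 / 400            # 2.5 per mille if all 400 dipeptides were equally likely


def dipeptide_per_mille(proteins):
    """Return {dipeptide: frequency in per mille} over overlapping adjacent pairs.

    Assumption: frequencies are normalized by the total number of pairs counted.
    """
    counts = Counter()
    for seq in proteins:
        # Assumption: anything outside the 20 standard letters is treated as Xaa (X).
        seq = "".join(c if c in STANDARD else "X" for c in seq.upper())
        # Pairs overlap within a protein and never span two proteins.
        counts.update(seq[i:i + 2] for i in range(len(seq) - 1))
    total = sum(counts.values())
    return {pair: 1000.0 * n / total for pair, n in counts.items()}


def classify(value, expected=EXPECTED):
    """Label a per mille value relative to the uniform expectation."""
    if value > expected:
        return "over"   # would be marked red
    if value < expected:
        return "under"  # would be marked blue
    return "expected"
```

For example, `classify(21.113)` (the AlaAla value in the table) gives `"over"`, while `classify(0.591)` (AlaCys) gives `"under"`.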

All values are given in per mille (‰), so they need to be multiplied by 10⁻³ to obtain fractions.
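As a small worked example of the unit conversion, the sketch below turns a per mille value from the table into a fraction and compares it with the uniform expectation of 1/400. The helper names are hypothetical and chosen only for this illustration.

```python
UNIFORM_EXPECTATION = 1 / 400  # fraction per dipeptide if all 400 were equally likely


def per_mille_to_fraction(value):
    """Convert a per mille value to a plain fraction (multiply by 10^-3)."""
    return value * 1e-3


def fold_over_uniform(value):
    """Ratio of an observed per mille value to the uniform expectation."""
    return per_mille_to_fraction(value) / UNIFORM_EXPECTATION


print(fold_over_uniform(21.113))  # AlaAla: roughly 8.4 times the uniform expectation
print(fold_over_uniform(0.591))   # AlaCys: roughly 0.24 times the uniform expectation
```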

For more information, see the sequence space article on Wikipedia.

Ala Cys Asp Glu Phe Gly His Ile Lys Leu Met Asn Pro Gln Arg Ser Thr Val Trp Tyr Xaa
Ala
AlaAla: 21.113 ± 2.163
AlaCys: 0.591 ± 0.188
AlaAsp: 9.745 ± 0.763
AlaGlu: 7.013 ± 0.841
AlaPhe: 3.322 ± 0.386
AlaGly: 13.288 ± 1.39
AlaHis: 2.288 ± 0.443
AlaIle: 5.758 ± 0.678
AlaLys: 4.503 ± 0.609
AlaLeu: 8.563 ± 0.947
AlaMet: 2.731 ± 0.342
AlaAsn: 3.322 ± 0.42
AlaPro: 7.087 ± 0.833
AlaGln: 5.463 ± 0.914
AlaArg: 9.006 ± 0.903
AlaSer: 5.168 ± 0.595
AlaThr: 7.751 ± 0.717
AlaVal: 9.228 ± 1.016
AlaTrp: 2.288 ± 0.334
AlaTyr: 1.55 ± 0.481
AlaXaa: 0.0 ± 0.0
Cys
CysAla: 0.443 ± 0.195
CysCys: 0.221 ± 0.136
CysAsp: 0.96 ± 0.295
CysGlu: 0.295 ± 0.156
CysPhe: 0.148 ± 0.092
CysGly: 1.476 ± 0.36
CysHis: 0.148 ± 0.109
CysIle: 0.443 ± 0.197
CysLys: 0.443 ± 0.238
CysLeu: 0.664 ± 0.23
CysMet: 0.148 ± 0.113
CysAsn: 0.074 ± 0.068
CysPro: 0.738 ± 0.267
CysGln: 0.369 ± 0.15
CysArg: 0.443 ± 0.172
CysSer: 0.295 ± 0.167
CysThr: 0.591 ± 0.183
CysVal: 0.96 ± 0.247
CysTrp: 0.148 ± 0.098
CysTyr: 0.221 ± 0.141
CysXaa: 0.0 ± 0.0
Asp
AspAla: 8.563 ± 1.105
AspCys: 0.591 ± 0.24
AspAsp: 4.872 ± 0.776
AspGlu: 4.429 ± 0.892
AspPhe: 1.993 ± 0.32
AspGly: 6.423 ± 0.685
AspHis: 1.55 ± 0.419
AspIle: 2.731 ± 0.427
AspLys: 1.846 ± 0.373
AspLeu: 5.168 ± 0.541
AspMet: 1.476 ± 0.314
AspAsn: 1.919 ± 0.347
AspPro: 3.913 ± 0.514
AspGln: 2.953 ± 0.328
AspArg: 4.872 ± 0.794
AspSer: 3.174 ± 0.597
AspThr: 2.658 ± 0.457
AspVal: 4.872 ± 0.571
AspTrp: 1.476 ± 0.36
AspTyr: 2.215 ± 0.39
AspXaa: 0.0 ± 0.0
Glu
GluAla: 5.094 ± 0.779
GluCys: 0.664 ± 0.221
GluAsp: 3.322 ± 0.611
GluGlu: 1.624 ± 0.424
GluPhe: 0.886 ± 0.266
GluGly: 2.51 ± 0.382
GluHis: 0.738 ± 0.257
GluIle: 1.476 ± 0.342
GluLys: 0.812 ± 0.289
GluLeu: 7.678 ± 0.929
GluMet: 0.738 ± 0.222
GluAsn: 1.255 ± 0.266
GluPro: 2.362 ± 0.536
GluGln: 3.617 ± 0.505
GluArg: 3.986 ± 0.773
GluSer: 2.658 ± 0.403
GluThr: 3.396 ± 0.628
GluVal: 3.913 ± 0.625
GluTrp: 0.738 ± 0.23
GluTyr: 1.181 ± 0.319
GluXaa: 0.0 ± 0.0
Phe
PheAla: 2.658 ± 0.52
PheCys: 0.074 ± 0.07
PheAsp: 2.731 ± 0.522
PheGlu: 1.403 ± 0.324
PhePhe: 1.034 ± 0.324
PheGly: 3.691 ± 0.628
PheHis: 0.517 ± 0.197
PheIle: 1.107 ± 0.249
PheLys: 0.812 ± 0.24
PheLeu: 2.51 ± 0.601
PheMet: 0.369 ± 0.16
PheAsn: 0.738 ± 0.242
PhePro: 1.846 ± 0.398
PheGln: 0.591 ± 0.205
PheArg: 1.107 ± 0.254
PheSer: 0.96 ± 0.289
PheThr: 2.51 ± 0.394
PheVal: 2.067 ± 0.498
PheTrp: 0.591 ± 0.24
PheTyr: 0.886 ± 0.238
PheXaa: 0.0 ± 0.0
Gly
GlyAla: 9.449 ± 1.407
GlyCys: 1.181 ± 0.284
GlyAsp: 5.315 ± 0.635
GlyGlu: 3.47 ± 0.518
GlyPhe: 2.879 ± 0.545
GlyGly: 11.0 ± 2.88
GlyHis: 1.919 ± 0.422
GlyIle: 5.611 ± 1.025
GlyLys: 3.543 ± 0.76
GlyLeu: 6.423 ± 0.772
GlyMet: 1.919 ± 0.358
GlyAsn: 3.543 ± 0.778
GlyPro: 3.986 ± 0.617
GlyGln: 5.315 ± 0.512
GlyArg: 5.315 ± 0.752
GlySer: 5.463 ± 0.746
GlyThr: 5.758 ± 0.98
GlyVal: 7.161 ± 0.987
GlyTrp: 1.846 ± 0.518
GlyTyr: 2.141 ± 0.351
GlyXaa: 0.0 ± 0.0
His
HisAla: 2.067 ± 0.46
HisCys: 0.443 ± 0.171
HisAsp: 0.738 ± 0.217
HisGlu: 1.034 ± 0.266
HisPhe: 0.591 ± 0.225
HisGly: 1.846 ± 0.311
HisHis: 0.812 ± 0.275
HisIle: 0.96 ± 0.388
HisLys: 0.369 ± 0.203
HisLeu: 1.255 ± 0.299
HisMet: 0.221 ± 0.158
HisAsn: 0.738 ± 0.219
HisPro: 1.107 ± 0.334
HisGln: 1.107 ± 0.311
HisArg: 2.141 ± 0.492
HisSer: 0.664 ± 0.262
HisThr: 1.181 ± 0.31
HisVal: 1.034 ± 0.331
HisTrp: 0.591 ± 0.194
HisTyr: 0.517 ± 0.211
HisXaa: 0.0 ± 0.0
Ile
IleAla: 7.235 ± 0.853
IleCys: 0.517 ± 0.218
IleAsp: 3.101 ± 0.391
IleGlu: 2.953 ± 0.459
IlePhe: 1.329 ± 0.226
IleGly: 3.691 ± 0.557
IleHis: 0.443 ± 0.192
IleIle: 1.255 ± 0.36
IleLys: 1.034 ± 0.255
IleLeu: 2.584 ± 0.598
IleMet: 0.812 ± 0.225
IleAsn: 1.181 ± 0.386
IlePro: 2.362 ± 0.368
IleGln: 1.698 ± 0.444
IleArg: 3.322 ± 0.529
IleSer: 1.698 ± 0.314
IleThr: 2.879 ± 0.423
IleVal: 3.101 ± 0.417
IleTrp: 0.812 ± 0.206
IleTyr: 0.664 ± 0.17
IleXaa: 0.0 ± 0.0
Lys
LysAla: 4.134 ± 0.581
LysCys: 0.295 ± 0.194
LysAsp: 1.255 ± 0.338
LysGlu: 0.738 ± 0.251
LysPhe: 0.886 ± 0.218
LysGly: 2.141 ± 0.441
LysHis: 0.443 ± 0.179
LysIle: 1.403 ± 0.255
LysLys: 0.96 ± 0.237
LysLeu: 2.805 ± 0.39
LysMet: 0.664 ± 0.249
LysAsn: 0.664 ± 0.251
LysPro: 2.067 ± 0.525
LysGln: 1.624 ± 0.416
LysArg: 2.436 ± 0.531
LysSer: 1.698 ± 0.38
LysThr: 2.215 ± 0.417
LysVal: 2.51 ± 0.514
LysTrp: 0.443 ± 0.191
LysTyr: 0.664 ± 0.208
LysXaa: 0.0 ± 0.0
Leu
LeuAla: 10.409 ± 0.971
LeuCys: 0.738 ± 0.276
LeuAsp: 6.496 ± 0.783
LeuGlu: 4.06 ± 0.514
LeuPhe: 2.584 ± 0.426
LeuGly: 7.382 ± 1.026
LeuHis: 1.181 ± 0.307
LeuIle: 2.731 ± 0.489
LeuLys: 2.141 ± 0.483
LeuLeu: 5.758 ± 0.788
LeuMet: 0.812 ± 0.236
LeuAsn: 2.658 ± 0.403
LeuPro: 6.275 ± 0.671
LeuGln: 3.101 ± 0.516
LeuArg: 4.798 ± 0.579
LeuSer: 4.946 ± 0.664
LeuThr: 3.839 ± 0.418
LeuVal: 4.651 ± 0.584
LeuTrp: 1.329 ± 0.288
LeuTyr: 1.624 ± 0.314
LeuXaa: 0.0 ± 0.0
Met
MetAla: 2.584 ± 0.404
MetCys: 0.295 ± 0.151
MetAsp: 0.443 ± 0.257
MetGlu: 0.443 ± 0.154
MetPhe: 0.443 ± 0.169
MetGly: 1.034 ± 0.364
MetHis: 0.517 ± 0.223
MetIle: 1.181 ± 0.261
MetLys: 0.664 ± 0.214
MetLeu: 1.034 ± 0.298
MetMet: 0.295 ± 0.162
MetAsn: 0.517 ± 0.209
MetPro: 0.96 ± 0.229
MetGln: 0.812 ± 0.276
MetArg: 1.329 ± 0.309
MetSer: 2.288 ± 0.462
MetThr: 1.476 ± 0.365
MetVal: 1.107 ± 0.324
MetTrp: 0.443 ± 0.291
MetTyr: 0.074 ± 0.068
MetXaa: 0.0 ± 0.0
Asn
AsnAla: 3.47 ± 0.467
AsnCys: 0.369 ± 0.155
AsnAsp: 1.698 ± 0.42
AsnGlu: 1.55 ± 0.391
AsnPhe: 1.034 ± 0.265
AsnGly: 4.134 ± 0.708
AsnHis: 0.295 ± 0.123
AsnIle: 1.034 ± 0.275
AsnLys: 0.664 ± 0.18
AsnLeu: 2.658 ± 0.71
AsnMet: 0.591 ± 0.192
AsnAsn: 0.886 ± 0.286
AsnPro: 3.543 ± 0.685
AsnGln: 1.329 ± 0.344
AsnArg: 2.436 ± 0.395
AsnSer: 1.624 ± 0.455
AsnThr: 1.772 ± 0.312
AsnVal: 2.51 ± 0.327
AsnTrp: 0.664 ± 0.215
AsnTyr: 0.369 ± 0.146
AsnXaa: 0.0 ± 0.0
Pro
ProAla: 10.778 ± 1.265
ProCys: 0.443 ± 0.251
ProAsp: 3.691 ± 0.488
ProGlu: 3.322 ± 0.579
ProPhe: 1.476 ± 0.372
ProGly: 6.718 ± 1.059
ProHis: 1.846 ± 0.425
ProIle: 2.288 ± 0.432
ProLys: 2.436 ± 0.462
ProLeu: 2.953 ± 0.354
ProMet: 0.664 ± 0.2
ProAsn: 3.174 ± 0.451
ProPro: 4.503 ± 0.589
ProGln: 1.993 ± 0.449
ProArg: 3.101 ± 0.418
ProSer: 3.174 ± 0.432
ProThr: 4.651 ± 0.551
ProVal: 3.322 ± 0.487
ProTrp: 0.664 ± 0.208
ProTyr: 1.329 ± 0.318
ProXaa: 0.0 ± 0.0
Gln
GlnAla: 5.463 ± 0.638
GlnCys: 0.221 ± 0.215
GlnAsp: 2.731 ± 0.359
GlnGlu: 1.329 ± 0.301
GlnPhe: 1.107 ± 0.228
GlnGly: 1.919 ± 0.347
GlnHis: 1.181 ± 0.362
GlnIle: 2.805 ± 0.426
GlnLys: 1.403 ± 0.263
GlnLeu: 4.651 ± 0.622
GlnMet: 0.96 ± 0.273
GlnAsn: 1.476 ± 0.321
GlnPro: 3.765 ± 0.626
GlnGln: 2.879 ± 0.996
GlnArg: 3.322 ± 0.481
GlnSer: 3.101 ± 0.418
GlnThr: 2.658 ± 0.389
GlnVal: 2.51 ± 0.354
GlnTrp: 0.443 ± 0.201
GlnTyr: 1.181 ± 0.271
GlnXaa: 0.0 ± 0.0
Arg
ArgAla: 7.678 ± 0.8
ArgCys: 0.591 ± 0.216
ArgAsp: 5.02 ± 0.85
ArgGlu: 3.47 ± 0.716
ArgPhe: 2.067 ± 0.345
ArgGly: 5.463 ± 0.615
ArgHis: 1.181 ± 0.315
ArgIle: 3.027 ± 0.56
ArgLys: 1.919 ± 0.454
ArgLeu: 5.684 ± 0.632
ArgMet: 1.476 ± 0.453
ArgAsn: 2.51 ± 0.36
ArgPro: 3.986 ± 0.873
ArgGln: 2.805 ± 0.381
ArgArg: 5.389 ± 0.673
ArgSer: 3.396 ± 0.555
ArgThr: 4.282 ± 0.653
ArgVal: 4.356 ± 0.782
ArgTrp: 1.181 ± 0.288
ArgTyr: 2.215 ± 0.38
ArgXaa: 0.0 ± 0.0
Ser
SerAla: 7.751 ± 0.732
SerCys: 0.148 ± 0.087
SerAsp: 2.879 ± 0.455
SerGlu: 3.101 ± 0.585
SerPhe: 1.624 ± 0.328
SerGly: 4.134 ± 0.952
SerHis: 1.034 ± 0.278
SerIle: 2.51 ± 0.412
SerLys: 1.403 ± 0.255
SerLeu: 3.47 ± 0.69
SerMet: 1.403 ± 0.299
SerAsn: 1.55 ± 0.316
SerPro: 3.174 ± 0.543
SerGln: 1.698 ± 0.337
SerArg: 3.765 ± 0.614
SerSer: 3.174 ± 0.627
SerThr: 4.946 ± 0.596
SerVal: 3.322 ± 0.557
SerTrp: 1.919 ± 0.362
SerTyr: 1.034 ± 0.343
SerXaa: 0.0 ± 0.0
Thr
ThrAla: 9.302 ± 0.919
ThrCys: 0.517 ± 0.189
ThrAsp: 4.503 ± 0.748
ThrGlu: 2.51 ± 0.398
ThrPhe: 2.141 ± 0.448
ThrGly: 5.611 ± 0.844
ThrHis: 1.181 ± 0.32
ThrIle: 2.805 ± 0.487
ThrLys: 2.215 ± 0.418
ThrLeu: 4.503 ± 0.572
ThrMet: 0.591 ± 0.211
ThrAsn: 2.067 ± 0.382
ThrPro: 4.577 ± 0.66
ThrGln: 2.362 ± 0.396
ThrArg: 3.543 ± 0.573
ThrSer: 3.913 ± 0.476
ThrThr: 3.543 ± 0.529
ThrVal: 5.463 ± 0.703
ThrTrp: 0.738 ± 0.252
ThrTyr: 1.403 ± 0.31
ThrXaa: 0.0 ± 0.0
Val
ValAla: 8.047 ± 0.674
ValCys: 0.517 ± 0.192
ValAsp: 5.463 ± 0.458
ValGlu: 3.617 ± 0.681
ValPhe: 1.255 ± 0.298
ValGly: 6.865 ± 0.955
ValHis: 1.255 ± 0.364
ValIle: 1.919 ± 0.318
ValLys: 2.362 ± 0.406
ValLeu: 5.758 ± 0.576
ValMet: 1.329 ± 0.402
ValAsn: 2.436 ± 0.445
ValPro: 4.577 ± 0.527
ValGln: 2.953 ± 0.565
ValArg: 4.134 ± 0.405
ValSer: 4.503 ± 0.513
ValThr: 4.577 ± 0.461
ValVal: 5.389 ± 0.753
ValTrp: 1.107 ± 0.348
ValTyr: 1.55 ± 0.321
ValXaa: 0.0 ± 0.0
Trp
TrpAla: 1.403 ± 0.293
TrpCys: 0.443 ± 0.176
TrpAsp: 1.181 ± 0.338
TrpGlu: 1.403 ± 0.392
TrpPhe: 0.886 ± 0.243
TrpGly: 1.403 ± 0.353
TrpHis: 0.517 ± 0.175
TrpIle: 0.812 ± 0.229
TrpLys: 0.369 ± 0.159
TrpLeu: 1.255 ± 0.357
TrpMet: 0.443 ± 0.165
TrpAsn: 0.886 ± 0.29
TrpPro: 0.738 ± 0.231
TrpGln: 1.181 ± 0.284
TrpArg: 1.403 ± 0.293
TrpSer: 1.181 ± 0.261
TrpThr: 1.55 ± 0.505
TrpVal: 0.738 ± 0.22
TrpTrp: 0.074 ± 0.08
TrpTyr: 0.591 ± 0.217
TrpXaa: 0.0 ± 0.0
Tyr
TyrAla: 2.141 ± 0.417
TyrCys: 0.369 ± 0.159
TyrAsp: 1.55 ± 0.249
TyrGlu: 0.886 ± 0.269
TyrPhe: 0.517 ± 0.193
TyrGly: 2.51 ± 0.468
TyrHis: 0.369 ± 0.196
TyrIle: 0.812 ± 0.218
TyrLys: 0.221 ± 0.133
TyrLeu: 2.141 ± 0.442
TyrMet: 0.221 ± 0.144
TyrAsn: 1.034 ± 0.309
TyrPro: 0.96 ± 0.27
TyrGln: 1.255 ± 0.317
TyrArg: 1.846 ± 0.445
TyrSer: 0.96 ± 0.285
TyrThr: 1.255 ± 0.299
TyrVal: 1.403 ± 0.264
TyrTrp: 0.96 ± 0.278
TyrTyr: 0.664 ± 0.197
TyrXaa: 0.0 ± 0.0
Xaa
XaaAla: 0.0 ± 0.0
XaaCys: 0.0 ± 0.0
XaaAsp: 0.0 ± 0.0
XaaGlu: 0.0 ± 0.0
XaaPhe: 0.0 ± 0.0
XaaGly: 0.0 ± 0.0
XaaHis: 0.0 ± 0.0
XaaIle: 0.0 ± 0.0
XaaLys: 0.0 ± 0.0
XaaLeu: 0.0 ± 0.0
XaaMet: 0.0 ± 0.0
XaaAsn: 0.0 ± 0.0
XaaPro: 0.0 ± 0.0
XaaGln: 0.0 ± 0.0
XaaArg: 0.0 ± 0.0
XaaSer: 0.0 ± 0.0
XaaThr: 0.0 ± 0.0
XaaVal: 0.0 ± 0.0
XaaTrp: 0.0 ± 0.0
XaaTyr: 0.0 ± 0.0
XaaXaa: 0.0 ± 0.0
Statistics based on 60 proteins (13547 amino acids)
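To work with the table programmatically, its entries can be read with a small parser. The sketch below is an assumption-laden illustration: it relies only on the "FirstSecond: value ± error" pattern visible in the rows above (three-letter residue codes, values in per mille) and ignores any other text on a line; the function and variable names are hypothetical.

```python
import re

# Two three-letter residue codes, a colon, the value and its error (both in per mille).
ROW = re.compile(r"([A-Z][a-z]{2})([A-Z][a-z]{2}): ([0-9.]+) ± ([0-9.]+)")


def parse_rows(lines):
    """Return {(first, second): (value, error)} for every line that holds an entry."""
    table = {}
    for line in lines:
        match = ROW.search(line)  # search() tolerates extra text before the entry
        if match:
            first, second, value, error = match.groups()
            table[(first, second)] = (float(value), float(error))
    return table


def most_frequent(table, n=5):
    """Return the n dipeptides with the highest per mille value."""
    return sorted(table, key=lambda pair: table[pair][0], reverse=True)[:n]
```

Lines that carry only a row label (such as "Ala") or the column header do not match the pattern and are skipped.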

Note: The error was estimated by bootstrapping (x100) at the protein level, i.e. by resampling whole proteins rather than individual residues.
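A protein-level bootstrap of this kind might look roughly like the sketch below. The page states only that bootstrapping (x100) was done at the protein level; taking the sample standard deviation over the replicates as the reported error, the fixed seed, and all names are assumptions made for this illustration.

```python
import random
import statistics


def pair_per_mille(proteins, pair):
    """Frequency of one dipeptide (one-letter codes, e.g. "AA") in per mille."""
    total = hits = 0
    for seq in proteins:
        for i in range(len(seq) - 1):
            total += 1
            hits += seq[i:i + 2] == pair
    return 1000.0 * hits / total if total else 0.0


def bootstrap_error(proteins, pair, n_resamples=100, seed=0):
    """Estimate the error of a dipeptide frequency by resampling whole proteins.

    Assumption: the error is the standard deviation over the bootstrap replicates.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    estimates = []
    for _ in range(n_resamples):
        # Draw as many proteins as in the original set, with replacement.
        sample = rng.choices(proteins, k=len(proteins))
        estimates.append(pair_per_mille(sample, pair))
    return statistics.stdev(estimates)
```

Resampling whole proteins keeps the residues of each protein together, so the error reflects variation between proteins rather than between individual positions.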

The dipeptide statistics above (among other statistics for this proteome) can be downloaded from this CSV file
See this proteome in: uniprot_link
Proteome-pI is available under Creative Commons Attribution-NoDerivs license, for more details see here

Reference: Kozlowski LP. Proteome-pI 2.0: Proteome Isoelectric Point Database Update. Nucleic Acids Res. 2021, doi: 10.1093/nar/gkab944
Contact: Lukasz P. Kozlowski