Amino acid dipeptide frequency for Mycobacterium phage Bobi

Apart from single amino acid frequencies, one can also calculate the so-called amino acid dipeptide frequency.

There are 400 possible dipeptides (441 if Xaa is counted as an additional 21st amino acid representing all non-standard or unknown amino acids). Thus, if dipeptides occurred uniformly at random in proteins, each of the 400 dipeptides would be present at a frequency of 0.25% (2.5‰). As this is not the case in nature, for better visibility all dipeptides that are more frequent than expected are marked in red, and those that are underrepresented are marked in blue in the table.
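The counting described above can be sketched in Python. This is a minimal illustration, not the method used to build the table: it assumes that dipeptides are counted as overlapping adjacent pairs within each protein (never across protein boundaries), that any letter outside the 20 standard one-letter codes is mapped to X (Xaa), and that "expected" means the uniform 1/400 baseline. The function names are hypothetical.

```python
from collections import Counter

STANDARD = "ACDEFGHIKLMNPQRSTVWY"  # 20 standard amino acids, one-letter codes
UNIFORM_PERMILLE = 1000.0 / 400    # uniform expectation: 2.5 per mille

def dipeptide_counts(proteins):
    """Count overlapping adjacent residue pairs within each protein."""
    counts = Counter()
    for seq in proteins:
        # Assumption: anything outside the 20 standard letters becomes X (Xaa).
        cleaned = "".join(c if c in STANDARD else "X" for c in seq.upper())
        for i in range(len(cleaned) - 1):
            counts[cleaned[i:i + 2]] += 1
    return counts

def dipeptide_permille(proteins):
    """Return dipeptide frequencies in per mille of all counted pairs."""
    counts = dipeptide_counts(proteins)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {dp: 1000.0 * n / total for dp, n in counts.items()}

def classify(freq_permille, expected=UNIFORM_PERMILLE):
    """Label a frequency relative to the assumed uniform expectation."""
    if freq_permille > expected:
        return "over"
    if freq_permille < expected:
        return "under"
    return "as expected"
```

For example, `dipeptide_permille(["AAAC", "AC"])` counts four pairs in total (AA twice, AC twice), so each of the two dipeptides gets 500‰.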

All values are presented in per mille (‰) and therefore need to be multiplied by 10⁻³ to obtain fractions.

For more information, see the sequence space article on Wikipedia.
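As a small worked example of the unit conversion, the sketch below turns two per mille values from the table (AlaAla and CysCys) into fractions and compares them with the uniform baseline of 1/400. The helper names are hypothetical, and the choice of the 400-dipeptide baseline (rather than 441) is an assumption.

```python
UNIFORM_FRACTION = 1 / 400  # uniform expectation over the 400 standard dipeptides

def permille_to_fraction(value_permille):
    """Convert a per mille value to a plain fraction (multiply by 10^-3)."""
    return value_permille * 1e-3

def fold_vs_uniform(value_permille):
    """Ratio of an observed frequency to the assumed uniform expectation."""
    return permille_to_fraction(value_permille) / UNIFORM_FRACTION

# Values taken from the table: AlaAla = 14.89 per mille, CysCys = 0.16 per mille.
for name, value in [("AlaAla", 14.89), ("CysCys", 0.16)]:
    print(name, round(permille_to_fraction(value), 5), round(fold_vs_uniform(value), 3))
```

Under these assumptions AlaAla is roughly six times more frequent than the uniform expectation, while CysCys is far below it.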

Ala Cys Asp Glu Phe Gly His Ile Lys Leu Met Asn Pro Gln Arg Ser Thr Val Trp Tyr Xaa
Ala
AlaAla: 14.89 ± 1.622
AlaCys: 1.01 ± 0.239
AlaAsp: 7.232 ± 0.675
AlaGlu: 8.242 ± 0.846
AlaPhe: 3.031 ± 0.398
AlaGly: 9.359 ± 1.204
AlaHis: 2.393 ± 0.422
AlaIle: 3.935 ± 0.459
AlaLys: 3.882 ± 0.491
AlaLeu: 8.402 ± 0.666
AlaMet: 2.925 ± 0.373
AlaAsn: 2.925 ± 0.434
AlaPro: 5.424 ± 0.64
AlaGln: 3.457 ± 0.444
AlaArg: 7.019 ± 0.677
AlaSer: 5.69 ± 0.643
AlaThr: 6.275 ± 0.592
AlaVal: 7.019 ± 0.56
AlaTrp: 2.765 ± 0.594
AlaTyr: 2.34 ± 0.296
AlaXaa: 0.0 ± 0.0
Cys
CysAla: 1.276 ± 0.351
CysCys: 0.16 ± 0.11
CysAsp: 1.17 ± 0.244
CysGlu: 0.585 ± 0.208
CysPhe: 0.16 ± 0.098
CysGly: 1.276 ± 0.288
CysHis: 0.319 ± 0.142
CysIle: 0.425 ± 0.156
CysLys: 0.266 ± 0.139
CysLeu: 1.01 ± 0.249
CysMet: 0.266 ± 0.118
CysAsn: 0.425 ± 0.147
CysPro: 1.117 ± 0.296
CysGln: 0.372 ± 0.173
CysArg: 0.904 ± 0.236
CysSer: 0.425 ± 0.153
CysThr: 0.638 ± 0.216
CysVal: 0.638 ± 0.187
CysTrp: 0.372 ± 0.132
CysTyr: 0.372 ± 0.136
CysXaa: 0.0 ± 0.0
Asp
AspAla: 6.328 ± 0.497
AspCys: 1.01 ± 0.238
AspAsp: 4.307 ± 0.524
AspGlu: 3.563 ± 0.453
AspPhe: 1.489 ± 0.239
AspGly: 7.126 ± 0.59
AspHis: 1.489 ± 0.246
AspIle: 2.659 ± 0.443
AspLys: 1.542 ± 0.23
AspLeu: 6.009 ± 0.569
AspMet: 1.17 ± 0.265
AspAsn: 1.383 ± 0.286
AspPro: 4.786 ± 0.603
AspGln: 2.606 ± 0.326
AspArg: 5.743 ± 0.627
AspSer: 3.882 ± 0.522
AspThr: 3.244 ± 0.374
AspVal: 4.52 ± 0.576
AspTrp: 1.329 ± 0.274
AspTyr: 1.861 ± 0.288
AspXaa: 0.0 ± 0.0
Glu
GluAla: 6.647 ± 0.687
GluCys: 0.798 ± 0.23
GluAsp: 3.457 ± 0.365
GluGlu: 2.872 ± 0.471
GluPhe: 2.393 ± 0.305
GluGly: 2.925 ± 0.425
GluHis: 1.01 ± 0.251
GluIle: 3.137 ± 0.404
GluLys: 1.861 ± 0.322
GluLeu: 5.211 ± 0.617
GluMet: 1.702 ± 0.302
GluAsn: 2.074 ± 0.254
GluPro: 2.765 ± 0.425
GluGln: 2.872 ± 0.378
GluArg: 4.254 ± 0.555
GluSer: 3.457 ± 0.552
GluThr: 3.988 ± 0.695
GluVal: 3.829 ± 0.542
GluTrp: 0.904 ± 0.235
GluTyr: 1.702 ± 0.337
GluXaa: 0.0 ± 0.0
Phe
PheAla: 3.35 ± 0.387
PheCys: 0.319 ± 0.131
PheAsp: 2.499 ± 0.32
PheGlu: 1.542 ± 0.324
PhePhe: 0.798 ± 0.239
PheGly: 3.563 ± 0.8
PheHis: 0.691 ± 0.188
PheIle: 1.223 ± 0.317
PheLys: 1.117 ± 0.265
PheLeu: 1.861 ± 0.303
PheMet: 0.851 ± 0.231
PheAsn: 1.064 ± 0.28
PhePro: 1.595 ± 0.288
PheGln: 0.798 ± 0.283
PheArg: 1.436 ± 0.292
PheSer: 1.329 ± 0.293
PheThr: 2.127 ± 0.351
PheVal: 1.968 ± 0.302
PheTrp: 0.425 ± 0.148
PheTyr: 1.064 ± 0.27
PheXaa: 0.0 ± 0.0
Gly
GlyAla: 10.263 ± 1.202
GlyCys: 0.957 ± 0.248
GlyAsp: 6.062 ± 0.505
GlyGlu: 3.722 ± 0.512
GlyPhe: 3.084 ± 0.497
GlyGly: 10.795 ± 2.845
GlyHis: 2.021 ± 0.303
GlyIle: 4.201 ± 0.514
GlyLys: 2.765 ± 0.373
GlyLeu: 5.903 ± 0.649
GlyMet: 2.18 ± 0.455
GlyAsn: 2.659 ± 0.408
GlyPro: 3.51 ± 0.5
GlyGln: 2.393 ± 0.587
GlyArg: 4.999 ± 0.516
GlySer: 5.584 ± 0.918
GlyThr: 6.594 ± 0.647
GlyVal: 5.903 ± 0.674
GlyTrp: 2.765 ± 0.375
GlyTyr: 2.499 ± 0.392
GlyXaa: 0.0 ± 0.0
His
HisAla: 1.648 ± 0.349
HisCys: 0.479 ± 0.195
HisAsp: 1.01 ± 0.235
HisGlu: 1.223 ± 0.254
HisPhe: 0.425 ± 0.149
HisGly: 1.648 ± 0.27
HisHis: 1.064 ± 0.252
HisIle: 1.489 ± 0.301
HisLys: 0.744 ± 0.206
HisLeu: 1.968 ± 0.36
HisMet: 0.691 ± 0.236
HisAsn: 0.691 ± 0.226
HisPro: 1.648 ± 0.294
HisGln: 0.798 ± 0.23
HisArg: 2.021 ± 0.378
HisSer: 0.585 ± 0.165
HisThr: 1.489 ± 0.359
HisVal: 1.542 ± 0.29
HisTrp: 0.319 ± 0.138
HisTyr: 0.851 ± 0.18
HisXaa: 0.0 ± 0.0
Ile
IleAla: 5.371 ± 0.57
IleCys: 0.744 ± 0.238
IleAsp: 3.669 ± 0.426
IleGlu: 3.403 ± 0.383
IlePhe: 0.798 ± 0.296
IleGly: 3.457 ± 0.476
IleHis: 1.595 ± 0.284
IleIle: 1.117 ± 0.253
IleLys: 1.17 ± 0.262
IleLeu: 1.968 ± 0.389
IleMet: 0.372 ± 0.156
IleAsn: 1.861 ± 0.303
IlePro: 3.084 ± 0.403
IleGln: 1.276 ± 0.279
IleArg: 2.34 ± 0.461
IleSer: 1.914 ± 0.395
IleThr: 4.095 ± 0.403
IleVal: 2.978 ± 0.361
IleTrp: 1.01 ± 0.264
IleTyr: 0.904 ± 0.256
IleXaa: 0.0 ± 0.0
Lys
LysAla: 3.297 ± 0.411
LysCys: 0.372 ± 0.183
LysAsp: 1.489 ± 0.269
LysGlu: 1.436 ± 0.291
LysPhe: 1.064 ± 0.23
LysGly: 2.765 ± 0.349
LysHis: 1.17 ± 0.275
LysIle: 0.957 ± 0.241
LysLys: 1.276 ± 0.316
LysLeu: 3.191 ± 0.404
LysMet: 0.904 ± 0.218
LysAsn: 0.851 ± 0.245
LysPro: 2.872 ± 0.518
LysGln: 1.383 ± 0.226
LysArg: 2.34 ± 0.346
LysSer: 1.808 ± 0.369
LysThr: 2.127 ± 0.367
LysVal: 2.127 ± 0.343
LysTrp: 0.798 ± 0.189
LysTyr: 0.798 ± 0.248
LysXaa: 0.0 ± 0.0
Leu
LeuAla: 8.296 ± 0.843
LeuCys: 1.223 ± 0.283
LeuAsp: 5.318 ± 0.543
LeuGlu: 3.669 ± 0.453
LeuPhe: 1.914 ± 0.305
LeuGly: 6.009 ± 0.628
LeuHis: 0.798 ± 0.237
LeuIle: 3.457 ± 0.416
LeuLys: 2.127 ± 0.372
LeuLeu: 4.786 ± 0.611
LeuMet: 1.436 ± 0.28
LeuAsn: 2.765 ± 0.452
LeuPro: 4.626 ± 0.549
LeuGln: 2.659 ± 0.411
LeuArg: 6.062 ± 0.674
LeuSer: 5.158 ± 0.586
LeuThr: 5.265 ± 0.564
LeuVal: 5.584 ± 0.635
LeuTrp: 1.117 ± 0.236
LeuTyr: 1.702 ± 0.327
LeuXaa: 0.0 ± 0.0
Met
MetAla: 1.861 ± 0.389
MetCys: 0.266 ± 0.154
MetAsp: 1.276 ± 0.275
MetGlu: 0.957 ± 0.184
MetPhe: 0.691 ± 0.206
MetGly: 1.968 ± 0.367
MetHis: 0.106 ± 0.072
MetIle: 0.957 ± 0.246
MetLys: 0.851 ± 0.245
MetLeu: 1.489 ± 0.215
MetMet: 0.479 ± 0.214
MetAsn: 1.064 ± 0.214
MetPro: 1.648 ± 0.342
MetGln: 0.266 ± 0.114
MetArg: 1.914 ± 0.275
MetSer: 2.765 ± 0.369
MetThr: 1.861 ± 0.273
MetVal: 1.542 ± 0.37
MetTrp: 0.532 ± 0.17
MetTyr: 0.266 ± 0.121
MetXaa: 0.0 ± 0.0
Asn
AsnAla: 3.776 ± 0.382
AsnCys: 0.16 ± 0.094
AsnAsp: 1.755 ± 0.311
AsnGlu: 1.648 ± 0.258
AsnPhe: 0.798 ± 0.253
AsnGly: 4.361 ± 0.482
AsnHis: 1.01 ± 0.219
AsnIle: 1.329 ± 0.379
AsnLys: 0.957 ± 0.214
AsnLeu: 2.499 ± 0.369
AsnMet: 0.585 ± 0.187
AsnAsn: 1.702 ± 0.336
AsnPro: 2.446 ± 0.395
AsnGln: 1.064 ± 0.285
AsnArg: 2.18 ± 0.413
AsnSer: 1.648 ± 0.239
AsnThr: 2.287 ± 0.374
AsnVal: 1.702 ± 0.277
AsnTrp: 0.798 ± 0.186
AsnTyr: 0.532 ± 0.138
AsnXaa: 0.0 ± 0.0
Pro
ProAla: 5.477 ± 0.648
ProCys: 0.744 ± 0.209
ProAsp: 4.201 ± 0.48
ProGlu: 4.307 ± 0.426
ProPhe: 1.861 ± 0.256
ProGly: 5.956 ± 0.629
ProHis: 1.595 ± 0.328
ProIle: 2.233 ± 0.372
ProLys: 2.233 ± 0.441
ProLeu: 4.307 ± 0.565
ProMet: 1.436 ± 0.32
ProAsn: 2.233 ± 0.325
ProPro: 3.829 ± 0.54
ProGln: 2.499 ± 0.434
ProArg: 3.616 ± 0.561
ProSer: 2.606 ± 0.371
ProThr: 2.978 ± 0.38
ProVal: 4.733 ± 0.506
ProTrp: 0.957 ± 0.215
ProTyr: 1.914 ± 0.334
ProXaa: 0.0 ± 0.0
Gln
GlnAla: 4.68 ± 0.528
GlnCys: 0.425 ± 0.152
GlnAsp: 1.223 ± 0.248
GlnGlu: 1.648 ± 0.293
GlnPhe: 1.117 ± 0.243
GlnGly: 2.021 ± 0.39
GlnHis: 0.638 ± 0.212
GlnIle: 1.914 ± 0.324
GlnLys: 1.329 ± 0.273
GlnLeu: 3.031 ± 0.423
GlnMet: 0.585 ± 0.196
GlnAsn: 0.957 ± 0.21
GlnPro: 2.18 ± 0.376
GlnGln: 2.021 ± 0.412
GlnArg: 2.233 ± 0.352
GlnSer: 2.446 ± 0.302
GlnThr: 1.808 ± 0.327
GlnVal: 2.712 ± 0.385
GlnTrp: 0.851 ± 0.204
GlnTyr: 1.17 ± 0.278
GlnXaa: 0.0 ± 0.0
Arg
ArgAla: 6.115 ± 0.605
ArgCys: 1.17 ± 0.327
ArgAsp: 4.467 ± 0.563
ArgGlu: 4.999 ± 0.654
ArgPhe: 2.233 ± 0.365
ArgGly: 3.882 ± 0.435
ArgHis: 1.329 ± 0.341
ArgIle: 3.776 ± 0.45
ArgLys: 2.712 ± 0.446
ArgLeu: 4.839 ± 0.587
ArgMet: 2.393 ± 0.443
ArgAsn: 2.233 ± 0.391
ArgPro: 3.457 ± 0.424
ArgGln: 2.606 ± 0.452
ArgArg: 6.275 ± 0.907
ArgSer: 4.307 ± 0.51
ArgThr: 3.457 ± 0.538
ArgVal: 5.105 ± 0.68
ArgTrp: 1.968 ± 0.312
ArgTyr: 2.18 ± 0.363
ArgXaa: 0.0 ± 0.0
Ser
SerAla: 6.222 ± 0.974
SerCys: 0.319 ± 0.136
SerAsp: 3.776 ± 0.451
SerGlu: 3.297 ± 0.427
SerPhe: 2.233 ± 0.438
SerGly: 6.966 ± 0.784
SerHis: 1.223 ± 0.254
SerIle: 2.712 ± 0.375
SerLys: 2.446 ± 0.443
SerLeu: 3.457 ± 0.494
SerMet: 1.276 ± 0.275
SerAsn: 2.18 ± 0.415
SerPro: 3.51 ± 0.388
SerGln: 1.702 ± 0.289
SerArg: 3.722 ± 0.427
SerSer: 3.988 ± 0.611
SerThr: 3.031 ± 0.433
SerVal: 4.733 ± 0.522
SerTrp: 1.489 ± 0.283
SerTyr: 1.223 ± 0.256
SerXaa: 0.0 ± 0.0
Thr
ThrAla: 6.328 ± 0.615
ThrCys: 0.532 ± 0.206
ThrAsp: 3.988 ± 0.497
ThrGlu: 3.776 ± 0.466
ThrPhe: 1.542 ± 0.289
ThrGly: 5.903 ± 0.56
ThrHis: 1.542 ± 0.323
ThrIle: 3.244 ± 0.535
ThrLys: 2.021 ± 0.339
ThrLeu: 4.892 ± 0.451
ThrMet: 0.957 ± 0.215
ThrAsn: 2.34 ± 0.315
ThrPro: 4.786 ± 0.489
ThrGln: 1.755 ± 0.286
ThrArg: 3.722 ± 0.39
ThrSer: 4.095 ± 0.467
ThrThr: 4.733 ± 0.62
ThrVal: 5.052 ± 0.712
ThrTrp: 1.117 ± 0.249
ThrTyr: 2.074 ± 0.316
ThrXaa: 0.0 ± 0.0
Val
ValAla: 7.87 ± 0.554
ValCys: 1.01 ± 0.227
ValAsp: 5.584 ± 0.662
ValGlu: 3.988 ± 0.577
ValPhe: 2.233 ± 0.398
ValGly: 5.69 ± 0.615
ValHis: 1.329 ± 0.256
ValIle: 2.818 ± 0.446
ValLys: 2.127 ± 0.334
ValLeu: 5.69 ± 0.635
ValMet: 1.276 ± 0.218
ValAsn: 2.34 ± 0.293
ValPro: 3.776 ± 0.398
ValGln: 2.499 ± 0.321
ValArg: 5.105 ± 0.686
ValSer: 4.999 ± 0.602
ValThr: 4.733 ± 0.456
ValVal: 5.903 ± 0.585
ValTrp: 1.648 ± 0.337
ValTyr: 1.489 ± 0.326
ValXaa: 0.0 ± 0.0
Trp
TrpAla: 1.861 ± 0.297
TrpCys: 0.266 ± 0.106
TrpAsp: 1.702 ± 0.28
TrpGlu: 0.957 ± 0.237
TrpPhe: 0.851 ± 0.194
TrpGly: 0.957 ± 0.225
TrpHis: 0.425 ± 0.143
TrpIle: 0.851 ± 0.22
TrpLys: 0.904 ± 0.205
TrpLeu: 1.648 ± 0.279
TrpMet: 0.798 ± 0.217
TrpAsn: 0.744 ± 0.196
TrpPro: 1.117 ± 0.269
TrpGln: 1.117 ± 0.277
TrpArg: 1.648 ± 0.245
TrpSer: 1.648 ± 0.483
TrpThr: 1.489 ± 0.255
TrpVal: 2.127 ± 0.372
TrpTrp: 0.798 ± 0.191
TrpTyr: 0.479 ± 0.167
TrpXaa: 0.0 ± 0.0
Tyr
TyrAla: 2.659 ± 0.366
TyrCys: 0.213 ± 0.118
TyrAsp: 1.861 ± 0.412
TyrGlu: 2.021 ± 0.366
TyrPhe: 0.904 ± 0.261
TyrGly: 2.233 ± 0.396
TyrHis: 0.585 ± 0.181
TyrIle: 0.691 ± 0.158
TyrLys: 0.691 ± 0.201
TyrLeu: 1.808 ± 0.351
TyrMet: 0.425 ± 0.154
TyrAsn: 0.744 ± 0.155
TyrPro: 1.542 ± 0.263
TyrGln: 0.851 ± 0.208
TyrArg: 1.968 ± 0.361
TyrSer: 1.17 ± 0.274
TyrThr: 2.233 ± 0.389
TyrVal: 2.393 ± 0.399
TyrTrp: 0.425 ± 0.153
TyrTyr: 0.744 ± 0.169
TyrXaa: 0.0 ± 0.0
Xaa
XaaAla: 0.0 ± 0.0
XaaCys: 0.0 ± 0.0
XaaAsp: 0.0 ± 0.0
XaaGlu: 0.0 ± 0.0
XaaPhe: 0.0 ± 0.0
XaaGly: 0.0 ± 0.0
XaaHis: 0.0 ± 0.0
XaaIle: 0.0 ± 0.0
XaaLys: 0.0 ± 0.0
XaaLeu: 0.0 ± 0.0
XaaMet: 0.0 ± 0.0
XaaAsn: 0.0 ± 0.0
XaaPro: 0.0 ± 0.0
XaaGln: 0.0 ± 0.0
XaaArg: 0.0 ± 0.0
XaaSer: 0.0 ± 0.0
XaaThr: 0.0 ± 0.0
XaaVal: 0.0 ± 0.0
XaaTrp: 0.0 ± 0.0
XaaTyr: 0.0 ± 0.0
XaaXaa: 0.0 ± 0.0
Statistics based on 107 proteins (18806 amino acids)

Note: The error was estimated by bootstrapping (×100) at the protein level.
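A protein-level bootstrap of this kind can be sketched as follows. This is an illustration under stated assumptions rather than the procedure actually used for the table: it assumes that whole proteins are resampled with replacement, that the frequency is recomputed for each of the 100 replicates, and that the reported ± value is the standard deviation across replicates. The function names, the `seed` parameter, and the pair-counting rule are hypothetical.

```python
import random
import statistics
from collections import Counter

def dipeptide_permille(proteins):
    """Per mille frequency of overlapping adjacent pairs within each protein."""
    counts = Counter()
    for seq in proteins:
        for i in range(len(seq) - 1):
            counts[seq[i:i + 2]] += 1
    total = sum(counts.values())
    if total == 0:
        return {}
    return {dp: 1000.0 * n / total for dp, n in counts.items()}

def bootstrap_error(proteins, dipeptide, n_replicates=100, seed=0):
    """Estimate the spread of one dipeptide frequency by resampling proteins.

    Assumption: the error is the sample standard deviation over replicates.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    estimates = []
    for _ in range(n_replicates):
        # Resample whole proteins with replacement (protein-level bootstrap).
        sample = rng.choices(proteins, k=len(proteins))
        estimates.append(dipeptide_permille(sample).get(dipeptide, 0.0))
    return statistics.stdev(estimates)
```

Resampling whole proteins rather than individual residue pairs keeps the pairs from one protein together, so a few proteins with unusual composition contribute to the error estimate as units.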

The dipeptide statistics above (among other statistics for this proteome) can be downloaded from this CSV file
See this proteome in: uniprot_link
Proteome-pI is available under a Creative Commons Attribution-NoDerivs license; for more details see here

Reference: Kozlowski LP. Proteome-pI 2.0: Proteome Isoelectric Point Database Update. Nucleic Acids Res. 2021, doi: 10.1093/nar/gkab944
Contact: Lukasz P. Kozlowski