Amino acid dipeptide frequency for Mycobacterium phage BigMama

Apart from single amino acid frequencies, one can also calculate the so-called amino acid dipeptide frequency, that is, how often each ordered pair of adjacent amino acids occurs.

There are 400 possible dipeptides (441 if Xaa is counted as an additional 21st amino acid representing all non-standard or unknown amino acids). Thus, if dipeptides occurred randomly in proteins, each of the 400 would be present at a frequency of 0.25% (2.5‰). As this is not the case in nature, for better visibility all dipeptides that occur more often than expected are marked in red in the table, and those that are underrepresented are marked in blue.
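
As a rough illustration of the idea above, the following Python sketch counts overlapping dipeptides in a set of protein sequences and compares each frequency with the uniform expectation of 2.5‰. The function name `dipeptide_per_mille`, the one-letter alphabet, the toy sequences and the choice to normalise by the total number of dipeptides are all assumptions made for this example; the database may count or normalise differently.

```python
from collections import Counter
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # 20 standard residues, one-letter codes

def dipeptide_per_mille(proteins):
    """Return {dipeptide: frequency in per mille} over overlapping pairs.

    Assumption: frequencies are normalised by the total number of
    dipeptides; pairs never span two different proteins.
    """
    counts = Counter()
    for seq in proteins:
        counts.update(seq[i:i + 2] for i in range(len(seq) - 1))
    total = sum(counts.values())
    return {a + b: 1000.0 * counts[a + b] / total
            for a, b in product(AMINO_ACIDS, repeat=2)}

# Uniform expectation: 1 in 400, i.e. 0.25% or 2.5 per mille.
EXPECTED = 1000.0 / 400

# Made-up sequences for illustration only, not BigMama proteins.
toy_proteome = ["MAAAK", "MKAA"]
freqs = dipeptide_per_mille(toy_proteome)

# Dipeptides above the uniform expectation (shown in red in the table).
over = sorted(dp for dp, f in freqs.items() if f > EXPECTED)
# Dipeptides below it (shown in blue in the table).
under = sorted(dp for dp, f in freqs.items() if f < EXPECTED)
```

With such a tiny toy input almost every dipeptide is absent and therefore "underrepresented"; the comparison only becomes informative for a full proteome.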

All values are given in per mille (‰), so they need to be multiplied by 10⁻³ to obtain fractions.
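
The unit conversion can be sketched in a few lines of Python. The helper names below are hypothetical and chosen only for this example; the AlaAla value is taken from the table.

```python
def per_mille_to_fraction(value):
    """Convert a per-mille value to a plain fraction (multiply by 10^-3)."""
    return value * 1e-3

def per_mille_to_percent(value):
    """Convert a per-mille value to a percentage (divide by 10)."""
    return value / 10.0

def fold_over_uniform(value, n_dipeptides=400):
    """Ratio of an observed per-mille value to the uniform expectation."""
    return value / (1000.0 / n_dipeptides)

ala_ala = 10.085  # AlaAla frequency from the table, in per mille

fraction = per_mille_to_fraction(ala_ala)  # roughly 0.0101
percent = per_mille_to_percent(ala_ala)    # roughly 1.01 %
fold = fold_over_uniform(ala_ala)          # roughly 4 times the 2.5 per mille expectation
```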

For more information, see the sequence space article on Wikipedia.

Rows give the first amino acid of the dipeptide, columns the second; each cell is the frequency in ‰ ± the estimated error.

| 1st / 2nd | Ala | Cys | Asp | Glu | Phe | Gly | His | Ile | Lys | Leu | Met | Asn | Pro | Gln | Arg | Ser | Thr | Val | Trp | Tyr | Xaa |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Ala | 10.085 ± 0.76 | 0.731 ± 0.221 | 5.895 ± 0.494 | 7.893 ± 0.761 | 2.144 ± 0.306 | 7.162 ± 0.645 | 1.51 ± 0.242 | 4.19 ± 0.518 | 5.7 ± 0.659 | 7.942 ± 0.812 | 2.436 ± 0.369 | 3.459 ± 0.431 | 3.8 ± 0.413 | 3.557 ± 0.395 | 4.775 ± 0.535 | 5.359 ± 0.445 | 4.97 ± 0.503 | 5.798 ± 0.492 | 1.608 ± 0.258 | 2.972 ± 0.403 | 0.0 ± 0.0 |
| Cys | 1.072 ± 0.28 | 0.146 ± 0.081 | 0.536 ± 0.191 | 0.78 ± 0.234 | 0.341 ± 0.125 | 0.682 ± 0.19 | 0.292 ± 0.109 | 0.438 ± 0.175 | 0.438 ± 0.158 | 0.585 ± 0.185 | 0.195 ± 0.109 | 0.487 ± 0.169 | 1.023 ± 0.336 | 0.244 ± 0.105 | 0.487 ± 0.164 | 0.633 ± 0.193 | 0.536 ± 0.165 | 0.585 ± 0.171 | 0.146 ± 0.081 | 0.39 ± 0.129 | 0.0 ± 0.0 |
| Asp | 5.7 ± 0.387 | 0.292 ± 0.115 | 6.821 ± 1.595 | 5.798 ± 1.306 | 2.29 ± 0.296 | 5.652 ± 0.6 | 1.218 ± 0.237 | 2.875 ± 0.418 | 3.313 ± 0.427 | 5.993 ± 0.654 | 1.462 ± 0.267 | 2.192 ± 0.298 | 4.093 ± 0.393 | 2.046 ± 0.315 | 4.726 ± 0.433 | 3.41 ± 0.51 | 3.167 ± 0.411 | 4.287 ± 0.446 | 1.413 ± 0.274 | 2.29 ± 0.303 | 0.0 ± 0.0 |
| Glu | 6.285 ± 0.532 | 0.633 ± 0.239 | 6.382 ± 1.436 | 5.944 ± 0.58 | 2.826 ± 0.374 | 4.287 ± 0.492 | 1.315 ± 0.28 | 3.508 ± 0.415 | 2.631 ± 0.454 | 6.821 ± 0.467 | 1.608 ± 0.296 | 2.095 ± 0.238 | 2.387 ± 0.36 | 2.875 ± 0.368 | 5.213 ± 0.617 | 3.8 ± 0.419 | 2.631 ± 0.388 | 4.141 ± 0.594 | 1.559 ± 0.308 | 1.851 ± 0.343 | 0.0 ± 0.0 |
| Phe | 2.29 ± 0.413 | 0.438 ± 0.156 | 2.241 ± 0.3 | 2.095 ± 0.337 | 0.78 ± 0.207 | 3.898 ± 0.404 | 0.585 ± 0.188 | 1.413 ± 0.317 | 1.9 ± 0.306 | 2.582 ± 0.427 | 0.536 ± 0.152 | 1.51 ± 0.292 | 1.51 ± 0.288 | 1.413 ± 0.32 | 2.387 ± 0.403 | 1.705 ± 0.363 | 1.657 ± 0.239 | 1.559 ± 0.217 | 0.438 ± 0.159 | 1.169 ± 0.24 | 0.0 ± 0.0 |
| Gly | 6.577 ± 0.817 | 1.218 ± 0.254 | 5.262 ± 0.504 | 4.921 ± 0.459 | 2.631 ± 0.322 | 5.993 ± 0.726 | 1.608 ± 0.287 | 4.58 ± 0.542 | 5.457 ± 0.502 | 6.236 ± 0.643 | 2.241 ± 0.341 | 2.387 ± 0.258 | 4.239 ± 0.5 | 2.728 ± 0.364 | 3.898 ± 0.5 | 6.188 ± 0.597 | 6.626 ± 0.726 | 5.213 ± 0.507 | 1.9 ± 0.303 | 3.069 ± 0.392 | 0.0 ± 0.0 |
| His | 1.559 ± 0.328 | 0.341 ± 0.139 | 1.072 ± 0.228 | 1.023 ± 0.249 | 0.877 ± 0.173 | 1.851 ± 0.266 | 0.39 ± 0.149 | 0.926 ± 0.229 | 1.072 ± 0.223 | 1.754 ± 0.316 | 0.487 ± 0.128 | 0.926 ± 0.226 | 1.315 ± 0.298 | 0.633 ± 0.172 | 1.413 ± 0.36 | 0.585 ± 0.192 | 0.877 ± 0.192 | 1.023 ± 0.238 | 0.244 ± 0.118 | 0.828 ± 0.181 | 0.0 ± 0.0 |
| Ile | 4.531 ± 0.623 | 0.487 ± 0.172 | 3.264 ± 0.402 | 2.972 ± 0.347 | 1.364 ± 0.226 | 3.703 ± 0.454 | 0.877 ± 0.207 | 2.631 ± 0.399 | 2.485 ± 0.358 | 3.069 ± 0.415 | 1.267 ± 0.264 | 2.777 ± 0.444 | 3.069 ± 0.447 | 1.949 ± 0.483 | 3.605 ± 0.389 | 2.436 ± 0.286 | 3.216 ± 0.396 | 3.216 ± 0.497 | 0.828 ± 0.204 | 1.072 ± 0.261 | 0.0 ± 0.0 |
| Lys | 6.188 ± 0.654 | 0.438 ± 0.165 | 3.654 ± 0.412 | 3.557 ± 0.487 | 1.413 ± 0.208 | 3.41 ± 0.527 | 1.121 ± 0.241 | 3.118 ± 0.371 | 3.313 ± 0.517 | 4.434 ± 0.468 | 0.926 ± 0.197 | 1.949 ± 0.377 | 3.362 ± 0.369 | 1.803 ± 0.274 | 4.629 ± 0.557 | 3.167 ± 0.306 | 3.216 ± 0.445 | 3.703 ± 0.432 | 1.072 ± 0.189 | 1.462 ± 0.254 | 0.0 ± 0.0 |
| Leu | 7.162 ± 0.606 | 0.585 ± 0.217 | 4.97 ± 0.57 | 5.067 ± 0.611 | 2.387 ± 0.358 | 6.188 ± 0.663 | 1.559 ± 0.3 | 2.826 ± 0.368 | 4.97 ± 0.499 | 5.895 ± 0.539 | 1.803 ± 0.271 | 3.362 ± 0.403 | 4.141 ± 0.494 | 2.68 ± 0.375 | 5.359 ± 0.586 | 5.262 ± 0.591 | 4.97 ± 0.459 | 4.385 ± 0.425 | 1.121 ± 0.228 | 1.267 ± 0.247 | 0.0 ± 0.0 |
| Met | 2.728 ± 0.349 | 0.244 ± 0.107 | 1.267 ± 0.232 | 1.121 ± 0.173 | 0.585 ± 0.145 | 1.657 ± 0.263 | 0.341 ± 0.123 | 1.169 ± 0.241 | 1.315 ± 0.238 | 1.218 ± 0.3 | 0.536 ± 0.191 | 0.974 ± 0.239 | 1.413 ± 0.242 | 0.78 ± 0.198 | 1.462 ± 0.216 | 1.754 ± 0.28 | 1.51 ± 0.236 | 1.023 ± 0.224 | 0.438 ± 0.134 | 0.974 ± 0.218 | 0.0 ± 0.0 |
| Asn | 4.093 ± 0.557 | 0.097 ± 0.075 | 2.582 ± 0.394 | 2.436 ± 0.337 | 1.315 ± 0.29 | 3.557 ± 0.519 | 0.828 ± 0.194 | 1.851 ± 0.304 | 1.998 ± 0.347 | 2.339 ± 0.292 | 0.78 ± 0.216 | 1.413 ± 0.251 | 3.118 ± 0.536 | 1.315 ± 0.293 | 2.875 ± 0.38 | 2.046 ± 0.308 | 2.144 ± 0.273 | 2.046 ± 0.268 | 0.682 ± 0.169 | 1.364 ± 0.208 | 0.0 ± 0.0 |
| Pro | 4.044 ± 0.414 | 0.292 ± 0.137 | 3.313 ± 0.497 | 3.995 ± 0.483 | 1.949 ± 0.306 | 5.213 ± 0.587 | 1.267 ± 0.213 | 2.192 ± 0.278 | 3.313 ± 0.571 | 3.41 ± 0.384 | 0.877 ± 0.235 | 2.68 ± 0.37 | 2.387 ± 0.432 | 1.754 ± 0.29 | 2.29 ± 0.334 | 3.459 ± 0.376 | 3.069 ± 0.405 | 3.216 ± 0.373 | 1.121 ± 0.261 | 1.657 ± 0.264 | 0.0 ± 0.0 |
| Gln | 3.118 ± 0.408 | 0.195 ± 0.093 | 2.192 ± 0.273 | 2.046 ± 0.301 | 1.657 ± 0.248 | 2.728 ± 0.438 | 0.633 ± 0.163 | 2.533 ± 0.27 | 2.241 ± 0.361 | 2.777 ± 0.37 | 1.023 ± 0.204 | 1.803 ± 0.283 | 1.364 ± 0.248 | 1.364 ± 0.219 | 2.631 ± 0.27 | 2.144 ± 0.351 | 1.803 ± 0.316 | 2.29 ± 0.387 | 0.536 ± 0.175 | 0.877 ± 0.201 | 0.0 ± 0.0 |
| Arg | 5.408 ± 0.551 | 0.536 ± 0.18 | 3.995 ± 0.494 | 4.385 ± 0.519 | 1.803 ± 0.267 | 5.311 ± 0.472 | 1.121 ± 0.235 | 3.508 ± 0.381 | 4.629 ± 0.607 | 4.19 ± 0.459 | 1.851 ± 0.286 | 2.582 ± 0.432 | 2.777 ± 0.428 | 2.582 ± 0.279 | 5.164 ± 0.556 | 3.41 ± 0.448 | 3.362 ± 0.408 | 5.116 ± 0.565 | 1.51 ± 0.342 | 2.241 ± 0.345 | 0.0 ± 0.0 |
| Ser | 6.09 ± 0.573 | 0.974 ± 0.268 | 4.58 ± 0.522 | 3.118 ± 0.405 | 2.533 ± 0.394 | 6.577 ± 0.646 | 0.974 ± 0.233 | 2.533 ± 0.424 | 2.485 ± 0.33 | 3.995 ± 0.484 | 1.51 ± 0.245 | 1.803 ± 0.293 | 2.582 ± 0.392 | 2.241 ± 0.324 | 3.313 ± 0.417 | 4.19 ± 0.688 | 4.336 ± 0.479 | 2.68 ± 0.384 | 1.608 ± 0.389 | 1.754 ± 0.285 | 0.0 ± 0.0 |
| Thr | 5.7 ± 0.653 | 0.828 ± 0.213 | 3.41 ± 0.472 | 4.044 ± 0.549 | 2.192 ± 0.302 | 5.652 ± 0.541 | 1.267 ± 0.237 | 3.459 ± 0.427 | 2.826 ± 0.392 | 3.557 ± 0.431 | 0.926 ± 0.205 | 2.29 ± 0.335 | 3.41 ± 0.33 | 1.657 ± 0.281 | 2.777 ± 0.401 | 4.239 ± 0.549 | 3.362 ± 0.506 | 3.752 ± 0.431 | 1.51 ± 0.234 | 1.705 ± 0.24 | 0.0 ± 0.0 |
| Val | 5.652 ± 0.552 | 0.78 ± 0.238 | 3.946 ± 0.45 | 4.531 ± 0.466 | 1.803 ± 0.328 | 5.603 ± 0.553 | 1.169 ± 0.27 | 2.582 ± 0.275 | 3.557 ± 0.439 | 5.067 ± 0.52 | 1.072 ± 0.22 | 1.949 ± 0.326 | 3.313 ± 0.339 | 2.144 ± 0.393 | 3.995 ± 0.45 | 3.021 ± 0.332 | 3.654 ± 0.431 | 4.287 ± 0.497 | 1.121 ± 0.243 | 2.582 ± 0.301 | 0.0 ± 0.0 |
| Trp | 1.462 ± 0.291 | 0.244 ± 0.117 | 1.218 ± 0.256 | 1.121 ± 0.264 | 0.487 ± 0.213 | 1.267 ± 0.279 | 0.487 ± 0.153 | 1.218 ± 0.228 | 0.974 ± 0.212 | 1.705 ± 0.269 | 0.341 ± 0.113 | 0.828 ± 0.226 | 0.78 ± 0.206 | 0.926 ± 0.188 | 1.608 ± 0.317 | 1.218 ± 0.253 | 1.657 ± 0.271 | 1.462 ± 0.236 | 0.195 ± 0.111 | 0.438 ± 0.171 | 0.0 ± 0.0 |
| Tyr | 2.29 ± 0.39 | 0.536 ± 0.15 | 2.436 ± 0.388 | 1.949 ± 0.377 | 0.731 ± 0.189 | 2.582 ± 0.359 | 0.633 ± 0.165 | 1.267 ± 0.283 | 1.315 ± 0.233 | 2.387 ± 0.294 | 0.585 ± 0.148 | 1.413 ± 0.222 | 1.413 ± 0.299 | 1.267 ± 0.29 | 2.875 ± 0.391 | 1.851 ± 0.389 | 1.803 ± 0.295 | 2.046 ± 0.302 | 0.585 ± 0.155 | 0.974 ± 0.268 | 0.0 ± 0.0 |
| Xaa | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 |

Statistics based on 87 proteins (20526 amino acids)

Note: The error has been estimated by bootstrapping (×100) at the protein level.
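
The page does not describe the bootstrap procedure beyond the note above, so the following Python sketch shows only one plausible reading: whole proteins are resampled with replacement 100 times, the dipeptide frequency is recomputed for each resample, and the standard deviation of those estimates is reported as the error. The function names, the fixed seed and the toy sequences are assumptions made for this example.

```python
import random
import statistics

def dipeptide_frequency(proteins, dipeptide):
    """Per-mille frequency of one dipeptide over overlapping pairs."""
    hits = total = 0
    for seq in proteins:
        for i in range(len(seq) - 1):
            total += 1
            hits += seq[i:i + 2] == dipeptide
    return 1000.0 * hits / total if total else 0.0

def bootstrap_error(proteins, dipeptide, n_resamples=100, seed=0):
    """Standard deviation of the frequency over protein-level resamples."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    estimates = []
    for _ in range(n_resamples):
        # Resample whole proteins with replacement, keeping the proteome size.
        sample = rng.choices(proteins, k=len(proteins))
        estimates.append(dipeptide_frequency(sample, dipeptide))
    return statistics.stdev(estimates)

# Made-up sequences for illustration only, not BigMama proteins.
toy_proteome = ["MAAAK", "MKAA", "AAAA", "MKKK"]
point_estimate = dipeptide_frequency(toy_proteome, "AA")
error = bootstrap_error(toy_proteome, "AA")
```

Resampling whole proteins rather than individual residues keeps each sequence intact, so the error reflects how much a dipeptide's frequency depends on which proteins happen to be in the proteome.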

The dipeptide statistics above (among other statistics for this proteome) can be downloaded from this CSV file
See this proteome in: uniprot_link
Proteome-pI is available under a Creative Commons Attribution-NoDerivs license; for more details see here

Reference: Kozlowski LP. Proteome-pI 2.0: Proteome Isoelectric Point Database Update. Nucleic Acids Res. 2021, doi: 10.1093/nar/gkab944

Contact: Lukasz P. Kozlowski