Amino acid dipeptide frequency for Mycobacterium phage Airmid

Apart from single amino acid frequencies, one can also calculate the so-called amino acid dipeptide frequency, that is, how often each ordered pair of adjacent amino acids occurs.
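As a rough illustration of how such a frequency can be computed, the sketch below counts overlapping adjacent pairs within each protein and reports them in per mille. The function name `dipeptide_per_mille`, the one-letter alphabet, the pooling of non-standard letters into `X`, and the normalisation by the total number of dipeptides are assumptions made for this example; the database may normalise differently.

```python
from collections import Counter
from itertools import product

# 20 standard residues as one-letter codes; "X" stands in for Xaa (assumption).
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def dipeptide_per_mille(proteins):
    """Return {dipeptide: frequency in per mille} for a list of sequences."""
    counts = Counter()
    for seq in proteins:
        # Overlapping adjacent pairs; a pair never spans two proteins.
        for first, second in zip(seq, seq[1:]):
            first = first if first in AMINO_ACIDS else "X"
            second = second if second in AMINO_ACIDS else "X"
            counts[first + second] += 1
    total = sum(counts.values())
    alphabet = AMINO_ACIDS + "X"
    # Report all 441 ordered pairs, including those never observed.
    return {
        a + b: (1000.0 * counts[a + b] / total if total else 0.0)
        for a, b in product(alphabet, repeat=2)
    }


# Toy input, not taken from the Airmid proteome.
toy = ["MAAGL", "AAXK"]
freqs = dipeptide_per_mille(toy)
print(round(freqs["AA"], 3), len(freqs))
```

Counting pairs per protein, rather than over one concatenated string, avoids creating artificial dipeptides at protein boundaries.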

There are 400 possible dipeptides (441 if Xaa is counted as an additional, 21st amino acid representing all non-standard or unknown amino acids). Thus, if dipeptides occurred randomly in proteins, with every pair equally likely, each dipeptide would be present at a frequency of 0.25% (2.5‰). As this is not the case in nature, for better visibility all dipeptides that occur more often than expected are marked in red in the table, and those that are underrepresented are marked in blue.

All values are given in per mille (‰) and therefore need to be multiplied by 10^-3 to obtain fractions.
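The uniform expectation and the per-mille convention described above can be sketched as follows. This is a minimal example under assumptions: the helper names are hypothetical, and treating the uniform value (1000/400 = 2.5‰) as the red/blue threshold is one reading of the text, not a confirmed description of how the table was coloured. The two sample values are AlaAla and CysCys from the table.

```python
def uniform_expectation_per_mille(alphabet_size=20):
    """Per-mille frequency of each dipeptide if all ordered pairs were equally likely."""
    return 1000.0 / (alphabet_size ** 2)


def per_mille_to_fraction(value):
    """Convert a per-mille value to a plain fraction (multiply by 10^-3)."""
    return value * 1e-3


def classify(value, expected):
    """Label a per-mille value relative to the expectation (assumed threshold)."""
    if value > expected:
        return "over"
    if value < expected:
        return "under"
    return "as expected"


expected = uniform_expectation_per_mille()  # 400 dipeptides, Xaa ignored
for name, value in [("AlaAla", 9.873), ("CysCys", 0.063)]:
    print(name, per_mille_to_fraction(value), classify(value, expected))
```

Passing `alphabet_size=21` gives the expectation for the 441-pair case that includes Xaa.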

For more information, see the sequence space article on Wikipedia.

Rows give the first amino acid of the dipeptide and columns the second; each cell is the frequency ± error in ‰.

| First/Second | Ala | Cys | Asp | Glu | Phe | Gly | His | Ile | Lys | Leu | Met | Asn | Pro | Gln | Arg | Ser | Thr | Val | Trp | Tyr | Xaa |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Ala | 9.873 ± 0.965 | 0.377 ± 0.154 | 5.66 ± 0.634 | 6.54 ± 1.003 | 3.144 ± 0.419 | 6.351 ± 0.549 | 1.698 ± 0.324 | 4.213 ± 0.511 | 4.276 ± 0.531 | 7.546 ± 0.838 | 2.327 ± 0.458 | 3.27 ± 0.498 | 4.528 ± 0.527 | 3.081 ± 0.449 | 5.094 ± 0.667 | 5.345 ± 0.644 | 5.031 ± 0.641 | 7.232 ± 0.829 | 1.635 ± 0.314 | 1.824 ± 0.395 | 0.0 ± 0.0 |
| Cys | 0.566 ± 0.214 | 0.063 ± 0.075 | 0.503 ± 0.168 | 0.755 ± 0.247 | 0.377 ± 0.122 | 0.692 ± 0.193 | 0.252 ± 0.122 | 0.566 ± 0.198 | 0.503 ± 0.184 | 0.692 ± 0.204 | 0.126 ± 0.083 | 0.44 ± 0.122 | 0.377 ± 0.149 | 0.189 ± 0.113 | 0.943 ± 0.272 | 0.692 ± 0.224 | 0.314 ± 0.142 | 0.629 ± 0.219 | 0.189 ± 0.096 | 0.189 ± 0.113 | 0.0 ± 0.0 |
| Asp | 6.289 ± 0.652 | 0.943 ± 0.246 | 3.207 ± 0.515 | 4.213 ± 0.523 | 2.578 ± 0.44 | 6.163 ± 0.67 | 1.509 ± 0.35 | 3.396 ± 0.39 | 3.207 ± 0.474 | 6.037 ± 0.592 | 1.446 ± 0.27 | 2.138 ± 0.401 | 4.905 ± 0.656 | 3.018 ± 0.454 | 3.081 ± 0.396 | 3.333 ± 0.41 | 3.459 ± 0.439 | 5.219 ± 0.587 | 1.635 ± 0.28 | 3.522 ± 0.445 | 0.0 ± 0.0 |
| Glu | 6.163 ± 0.759 | 0.566 ± 0.183 | 5.282 ± 0.567 | 5.408 ± 0.653 | 2.39 ± 0.376 | 5.534 ± 0.594 | 1.509 ± 0.327 | 4.088 ± 0.552 | 3.018 ± 0.362 | 7.232 ± 0.69 | 2.201 ± 0.358 | 2.39 ± 0.367 | 2.578 ± 0.433 | 2.704 ± 0.317 | 5.094 ± 0.644 | 2.956 ± 0.484 | 3.647 ± 0.592 | 5.597 ± 0.644 | 1.258 ± 0.267 | 2.201 ± 0.477 | 0.0 ± 0.0 |
| Phe | 2.641 ± 0.394 | 0.189 ± 0.106 | 3.207 ± 0.373 | 2.201 ± 0.358 | 0.818 ± 0.246 | 2.704 ± 0.361 | 0.755 ± 0.233 | 1.195 ± 0.271 | 1.572 ± 0.285 | 2.704 ± 0.612 | 0.503 ± 0.146 | 1.509 ± 0.292 | 1.509 ± 0.4 | 1.195 ± 0.272 | 2.012 ± 0.322 | 2.012 ± 0.38 | 2.012 ± 0.346 | 2.641 ± 0.424 | 0.818 ± 0.263 | 1.258 ± 0.264 | 0.0 ± 0.0 |
| Gly | 6.1 ± 0.851 | 0.88 ± 0.285 | 5.723 ± 0.762 | 4.779 ± 0.555 | 2.327 ± 0.424 | 9.559 ± 1.989 | 1.509 ± 0.335 | 4.716 ± 0.637 | 4.591 ± 0.645 | 6.54 ± 0.766 | 2.138 ± 0.343 | 3.584 ± 0.543 | 3.773 ± 0.472 | 2.641 ± 0.596 | 4.402 ± 0.425 | 4.465 ± 0.583 | 4.402 ± 0.627 | 5.534 ± 0.532 | 2.012 ± 0.348 | 2.453 ± 0.426 | 0.0 ± 0.0 |
| His | 1.383 ± 0.312 | 0.126 ± 0.1 | 1.509 ± 0.295 | 1.572 ± 0.327 | 0.629 ± 0.177 | 1.635 ± 0.281 | 0.88 ± 0.306 | 1.195 ± 0.244 | 0.377 ± 0.154 | 1.509 ± 0.274 | 0.126 ± 0.087 | 0.566 ± 0.183 | 1.195 ± 0.251 | 0.566 ± 0.201 | 1.635 ± 0.343 | 1.069 ± 0.261 | 1.069 ± 0.215 | 1.635 ± 0.358 | 0.503 ± 0.195 | 0.692 ± 0.243 | 0.0 ± 0.0 |
| Ile | 5.723 ± 0.648 | 0.377 ± 0.127 | 4.402 ± 0.525 | 4.025 ± 0.475 | 1.321 ± 0.386 | 3.522 ± 0.505 | 1.006 ± 0.274 | 2.39 ± 0.361 | 1.949 ± 0.382 | 3.333 ± 0.432 | 0.692 ± 0.239 | 2.39 ± 0.445 | 3.71 ± 0.453 | 2.012 ± 0.331 | 4.088 ± 0.429 | 2.893 ± 0.407 | 3.333 ± 0.484 | 3.396 ± 0.492 | 0.44 ± 0.171 | 1.446 ± 0.338 | 0.0 ± 0.0 |
| Lys | 5.094 ± 0.662 | 0.314 ± 0.148 | 2.956 ± 0.456 | 2.264 ± 0.456 | 1.509 ± 0.323 | 3.333 ± 0.485 | 1.069 ± 0.209 | 2.201 ± 0.381 | 3.144 ± 0.556 | 3.584 ± 0.467 | 1.258 ± 0.287 | 0.943 ± 0.252 | 2.453 ± 0.327 | 1.069 ± 0.232 | 2.956 ± 0.481 | 2.578 ± 0.464 | 3.207 ± 0.458 | 4.213 ± 0.511 | 0.943 ± 0.244 | 1.258 ± 0.303 | 0.0 ± 0.0 |
| Leu | 7.295 ± 1.094 | 0.755 ± 0.223 | 6.603 ± 0.592 | 5.911 ± 0.678 | 2.075 ± 0.388 | 5.723 ± 0.683 | 0.88 ± 0.258 | 3.899 ± 0.541 | 4.402 ± 0.717 | 6.414 ± 0.654 | 2.201 ± 0.319 | 3.459 ± 0.59 | 3.836 ± 0.544 | 2.075 ± 0.431 | 5.66 ± 0.541 | 5.723 ± 0.717 | 5.848 ± 0.734 | 6.163 ± 0.58 | 1.258 ± 0.322 | 2.327 ± 0.472 | 0.0 ± 0.0 |
| Met | 1.887 ± 0.388 | 0.252 ± 0.122 | 1.258 ± 0.3 | 1.446 ± 0.344 | 0.755 ± 0.22 | 1.572 ± 0.333 | 0.126 ± 0.075 | 0.566 ± 0.205 | 1.069 ± 0.261 | 1.698 ± 0.382 | 0.503 ± 0.2 | 0.943 ± 0.284 | 1.195 ± 0.225 | 0.566 ± 0.199 | 1.572 ± 0.295 | 2.515 ± 0.404 | 2.767 ± 0.381 | 0.818 ± 0.232 | 0.692 ± 0.183 | 1.069 ± 0.337 | 0.0 ± 0.0 |
| Asn | 2.767 ± 0.416 | 0.252 ± 0.12 | 2.264 ± 0.45 | 1.761 ± 0.307 | 1.761 ± 0.353 | 4.968 ± 0.587 | 0.88 ± 0.225 | 1.509 ± 0.464 | 1.258 ± 0.268 | 2.956 ± 0.343 | 0.629 ± 0.191 | 0.755 ± 0.185 | 2.641 ± 0.43 | 0.943 ± 0.221 | 2.012 ± 0.432 | 1.572 ± 0.347 | 2.201 ± 0.528 | 2.39 ± 0.329 | 0.44 ± 0.164 | 0.818 ± 0.246 | 0.0 ± 0.0 |
| Pro | 4.025 ± 0.575 | 0.314 ± 0.127 | 3.584 ± 0.449 | 4.339 ± 0.544 | 1.761 ± 0.476 | 4.842 ± 0.78 | 0.943 ± 0.241 | 2.704 ± 0.402 | 2.201 ± 0.353 | 3.27 ± 0.439 | 1.006 ± 0.238 | 1.635 ± 0.312 | 3.081 ± 0.701 | 1.761 ± 0.33 | 2.012 ± 0.398 | 3.207 ± 0.4 | 3.773 ± 0.465 | 4.339 ± 0.619 | 1.195 ± 0.441 | 1.509 ± 0.309 | 0.0 ± 0.0 |
| Gln | 3.27 ± 0.457 | 0.314 ± 0.108 | 1.887 ± 0.38 | 2.327 ± 0.376 | 1.446 ± 0.298 | 2.012 ± 0.347 | 0.88 ± 0.203 | 2.704 ± 0.359 | 1.887 ± 0.314 | 4.088 ± 0.737 | 1.069 ± 0.235 | 0.818 ± 0.236 | 1.258 ± 0.311 | 1.069 ± 0.314 | 1.572 ± 0.426 | 1.509 ± 0.327 | 1.698 ± 0.378 | 2.83 ± 0.342 | 0.629 ± 0.211 | 1.006 ± 0.228 | 0.0 ± 0.0 |
| Arg | 5.094 ± 0.616 | 0.692 ± 0.28 | 3.333 ± 0.442 | 5.157 ± 0.633 | 2.641 ± 0.487 | 4.213 ± 0.618 | 1.383 ± 0.36 | 3.71 ± 0.456 | 2.83 ± 0.487 | 5.723 ± 0.481 | 1.761 ± 0.288 | 2.327 ± 0.43 | 2.39 ± 0.389 | 3.081 ± 0.559 | 5.157 ± 0.731 | 2.578 ± 0.501 | 2.956 ± 0.438 | 3.459 ± 0.444 | 1.446 ± 0.3 | 1.761 ± 0.358 | 0.0 ± 0.0 |
| Ser | 4.654 ± 0.51 | 0.566 ± 0.167 | 4.276 ± 0.515 | 4.591 ± 0.586 | 2.767 ± 0.345 | 4.465 ± 0.596 | 1.006 ± 0.25 | 3.27 ± 0.429 | 2.83 ± 0.358 | 3.584 ± 0.505 | 1.446 ± 0.307 | 1.761 ± 0.296 | 2.453 ± 0.353 | 2.39 ± 0.376 | 4.276 ± 0.589 | 3.144 ± 0.473 | 2.515 ± 0.408 | 3.27 ± 0.487 | 1.195 ± 0.237 | 1.887 ± 0.328 | 0.0 ± 0.0 |
| Thr | 5.031 ± 0.43 | 0.629 ± 0.214 | 3.836 ± 0.499 | 4.968 ± 0.504 | 2.138 ± 0.464 | 4.968 ± 0.683 | 1.258 ± 0.268 | 3.27 ± 0.403 | 2.641 ± 0.492 | 5.534 ± 0.758 | 1.572 ± 0.387 | 1.887 ± 0.384 | 4.213 ± 0.568 | 1.698 ± 0.29 | 3.081 ± 0.398 | 2.704 ± 0.442 | 4.339 ± 0.66 | 3.773 ± 0.533 | 1.006 ± 0.263 | 1.258 ± 0.269 | 0.0 ± 0.0 |
| Val | 6.477 ± 0.582 | 1.006 ± 0.247 | 5.534 ± 0.689 | 5.66 ± 0.529 | 1.887 ± 0.348 | 5.345 ± 0.579 | 1.383 ± 0.318 | 3.962 ± 0.5 | 3.081 ± 0.463 | 5.597 ± 0.596 | 1.132 ± 0.261 | 2.327 ± 0.43 | 2.956 ± 0.509 | 2.39 ± 0.436 | 3.773 ± 0.511 | 5.408 ± 0.526 | 4.654 ± 0.409 | 5.785 ± 0.627 | 1.572 ± 0.31 | 2.327 ± 0.387 | 0.0 ± 0.0 |
| Trp | 1.698 ± 0.298 | 0.189 ± 0.105 | 1.635 ± 0.351 | 1.321 ± 0.325 | 0.566 ± 0.21 | 1.698 ± 0.312 | 0.377 ± 0.134 | 1.321 ± 0.284 | 0.566 ± 0.174 | 1.572 ± 0.333 | 0.629 ± 0.182 | 1.006 ± 0.245 | 1.006 ± 0.246 | 0.88 ± 0.206 | 0.818 ± 0.233 | 1.572 ± 0.269 | 1.321 ± 0.319 | 1.006 ± 0.209 | 0.566 ± 0.17 | 0.377 ± 0.154 | 0.0 ± 0.0 |
| Tyr | 3.018 ± 0.483 | 0.252 ± 0.121 | 2.641 ± 0.428 | 2.578 ± 0.551 | 0.566 ± 0.195 | 2.83 ± 0.466 | 0.566 ± 0.198 | 1.572 ± 0.296 | 0.943 ± 0.239 | 2.893 ± 0.375 | 0.44 ± 0.154 | 0.629 ± 0.179 | 1.446 ± 0.317 | 0.943 ± 0.208 | 2.453 ± 0.442 | 1.069 ± 0.328 | 1.321 ± 0.256 | 2.327 ± 0.45 | 0.692 ± 0.211 | 0.818 ± 0.221 | 0.0 ± 0.0 |
| Xaa | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 |

Statistics based on 89 proteins (15903 amino acids)

Note: The error has been estimated by bootstrapping (×100) at the protein level.
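A protein-level bootstrap of this kind can be sketched as below: whole proteins are resampled with replacement, the frequency is recomputed for each replicate, and the spread of the replicates is taken as the error. The function names, the fixed seed, and the use of the sample standard deviation as the reported error are assumptions for illustration; the exact procedure used by the database is not described here.

```python
import random
import statistics


def dipeptide_frequency(proteins, dipeptide):
    """Per-mille frequency of one dipeptide (one-letter codes) over all proteins."""
    hits = 0
    total = 0
    for seq in proteins:
        for i in range(len(seq) - 1):
            total += 1
            if seq[i:i + 2] == dipeptide:
                hits += 1
    return 1000.0 * hits / total if total else 0.0


def bootstrap_error(proteins, dipeptide, n_replicates=100, seed=0):
    """Standard deviation of the frequency across protein-level resamples."""
    rng = random.Random(seed)  # fixed seed only to make the sketch reproducible
    estimates = []
    for _ in range(n_replicates):
        # Resample whole proteins with replacement, keeping the proteome size.
        sample = rng.choices(proteins, k=len(proteins))
        estimates.append(dipeptide_frequency(sample, dipeptide))
    return statistics.stdev(estimates)


# Toy input, not taken from the Airmid proteome.
toy = ["MAAGL", "AAXK", "GLAA", "KKAA"]
print(dipeptide_frequency(toy, "AA"), bootstrap_error(toy, "AA"))
```

Resampling proteins rather than individual residues keeps each sequence intact, so the replicate-to-replicate variation reflects differences between proteins.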

The dipeptide statistics above (among other statistics for this proteome) can be downloaded from this CSV file
See this proteome in: uniprot_link
Proteome-pI is available under a Creative Commons Attribution-NoDerivs license; for more details see here

Reference: Kozlowski LP. Proteome-pI 2.0: Proteome Isoelectric Point Database Update. Nucleic Acids Res. 2021, doi: 10.1093/nar/gkab944
Contact: Lukasz P. Kozlowski