Amino acid dipeptide frequency for Mycobacterium phage Arlo

Apart from single amino acid frequencies, one can also calculate the so-called amino acid dipeptide frequency, i.e. the frequency of each ordered pair of adjacent residues.

There are 400 possible dipeptides (441 if Xaa is counted as an additional 21st amino acid representing all non-standard or unknown amino acids). Thus, if all 400 dipeptides were equally likely to occur in proteins, each would be present at a frequency of 0.25% (2.5‰). As this is not the case in nature, for better visibility all dipeptides that are more frequent than expected are marked in red in the table, and those that are underrepresented are marked in blue.
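As an illustration of how such frequencies can be obtained, here is a minimal Python sketch. The function name `dipeptide_per_mille`, the mapping of every non-standard letter to X (Xaa), and the normalization by the total number of dipeptides are assumptions made for this example; the exact procedure used to build the table is not described on this page.

```python
from collections import Counter

# One-letter codes of the 20 standard amino acids.
STANDARD = set("ACDEFGHIKLMNPQRSTVWY")

def dipeptide_per_mille(sequences):
    """Return {dipeptide: frequency in per mille} for a list of protein sequences."""
    counts = Counter()
    for seq in sequences:
        # Assumption: anything outside the 20 standard residues becomes X (Xaa).
        cleaned = "".join(aa if aa in STANDARD else "X" for aa in seq.upper())
        # Count overlapping pairs; a pair never spans two proteins.
        counts.update(cleaned[i:i + 2] for i in range(len(cleaned) - 1))
    total = sum(counts.values())
    if total == 0:
        return {}
    # Assumption: normalize by the total number of dipeptides counted.
    return {pair: 1000.0 * n / total for pair, n in counts.items()}

# Toy input, not real phage proteins.
freqs = dipeptide_per_mille(["MAAK", "AAB"])
for pair, value in sorted(freqs.items()):
    print(pair, round(value, 3))
```

With the toy input, the two sequences contain five dipeptides in total, two of which are AA, and the letter B is counted as X.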

All values are given in per mille (‰), so they need to be multiplied by 10⁻³ to obtain fractions.
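The unit conversion and the comparison with the uniform expectation can be sketched in a few lines of Python. The helper names are hypothetical, and the plain greater-than/less-than comparison is an assumption: the page does not state the exact criterion behind the red and blue marking.

```python
def uniform_expectation_per_mille(n_residues=20):
    """Expected per mille frequency if all n_residues**2 dipeptides were equally likely."""
    return 1000.0 / (n_residues ** 2)

def per_mille_to_fraction(value):
    """Convert a per mille value to a plain fraction."""
    return value * 1e-3

def representation(value, n_residues=20):
    """Label a per mille value relative to the uniform expectation (assumed criterion)."""
    expected = uniform_expectation_per_mille(n_residues)
    if value > expected:
        return "over"
    if value < expected:
        return "under"
    return "as expected"

# Values taken from the table: AlaAla = 13.282, CysTyr = 0.061.
print(uniform_expectation_per_mille())   # 2.5
print(per_mille_to_fraction(13.282))
print(representation(13.282))            # over
print(representation(0.061))             # under
```

Passing `n_residues=21` gives the expectation for the 441-dipeptide case that includes Xaa.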

For more information, see the sequence space article on Wikipedia.

Dipeptides are listed by first residue, then by second residue, in the order: Ala Cys Asp Glu Phe Gly His Ile Lys Leu Met Asn Pro Gln Arg Ser Thr Val Trp Tyr Xaa
Ala
AlaAla: 13.282 ± 1.245
AlaCys: 0.922 ± 0.211
AlaAsp: 6.395 ± 0.576
AlaGlu: 6.764 ± 0.695
AlaPhe: 2.952 ± 0.503
AlaGly: 7.502 ± 0.884
AlaHis: 1.537 ± 0.325
AlaIle: 4.612 ± 0.458
AlaLys: 4.55 ± 0.714
AlaLeu: 9.285 ± 0.867
AlaMet: 2.706 ± 0.417
AlaAsn: 2.521 ± 0.414
AlaPro: 4.612 ± 0.738
AlaGln: 2.337 ± 0.359
AlaArg: 6.149 ± 0.524
AlaSer: 4.981 ± 0.513
AlaThr: 6.088 ± 0.715
AlaVal: 8.609 ± 0.764
AlaTrp: 1.722 ± 0.344
AlaTyr: 2.644 ± 0.39
AlaXaa: 0.0 ± 0.0
Cys
CysAla: 0.799 ± 0.273
CysCys: 0.0 ± 0.0
CysAsp: 0.553 ± 0.19
CysGlu: 0.738 ± 0.199
CysPhe: 0.123 ± 0.095
CysGly: 0.615 ± 0.21
CysHis: 0.184 ± 0.102
CysIle: 0.43 ± 0.186
CysLys: 0.307 ± 0.166
CysLeu: 0.492 ± 0.194
CysMet: 0.184 ± 0.102
CysAsn: 0.246 ± 0.121
CysPro: 0.369 ± 0.146
CysGln: 0.184 ± 0.104
CysArg: 0.615 ± 0.206
CysSer: 0.246 ± 0.114
CysThr: 0.492 ± 0.229
CysVal: 0.307 ± 0.113
CysTrp: 0.246 ± 0.121
CysTyr: 0.061 ± 0.074
CysXaa: 0.0 ± 0.0
Asp
AspAla: 6.026 ± 0.564
AspCys: 0.799 ± 0.252
AspAsp: 4.427 ± 0.466
AspGlu: 3.628 ± 0.466
AspPhe: 2.337 ± 0.344
AspGly: 5.78 ± 0.575
AspHis: 1.168 ± 0.313
AspIle: 2.952 ± 0.381
AspLys: 2.767 ± 0.495
AspLeu: 6.641 ± 0.762
AspMet: 1.476 ± 0.252
AspAsn: 1.783 ± 0.341
AspPro: 4.919 ± 0.581
AspGln: 1.476 ± 0.302
AspArg: 3.751 ± 0.46
AspSer: 3.69 ± 0.557
AspThr: 3.751 ± 0.439
AspVal: 4.305 ± 0.502
AspTrp: 1.722 ± 0.356
AspTyr: 2.152 ± 0.388
AspXaa: 0.0 ± 0.0
Glu
GluAla: 6.457 ± 0.701
GluCys: 0.492 ± 0.231
GluAsp: 5.35 ± 0.556
GluGlu: 5.042 ± 0.609
GluPhe: 2.214 ± 0.435
GluGly: 4.366 ± 0.614
GluHis: 1.599 ± 0.373
GluIle: 3.444 ± 0.504
GluLys: 2.398 ± 0.402
GluLeu: 6.641 ± 0.555
GluMet: 1.537 ± 0.266
GluAsn: 1.722 ± 0.359
GluPro: 2.644 ± 0.372
GluGln: 2.706 ± 0.35
GluArg: 3.69 ± 0.523
GluSer: 3.505 ± 0.374
GluThr: 3.936 ± 0.547
GluVal: 4.981 ± 0.592
GluTrp: 1.66 ± 0.357
GluTyr: 2.46 ± 0.534
GluXaa: 0.0 ± 0.0
Phe
PheAla: 2.583 ± 0.452
PheCys: 0.43 ± 0.2
PheAsp: 2.89 ± 0.385
PheGlu: 2.091 ± 0.344
PhePhe: 0.615 ± 0.185
PheGly: 3.69 ± 0.549
PheHis: 0.553 ± 0.206
PheIle: 1.23 ± 0.248
PheLys: 1.23 ± 0.287
PheLeu: 2.337 ± 0.433
PheMet: 0.553 ± 0.179
PheAsn: 1.168 ± 0.28
PhePro: 1.599 ± 0.314
PheGln: 1.045 ± 0.242
PheArg: 1.906 ± 0.386
PheSer: 1.414 ± 0.277
PheThr: 2.275 ± 0.397
PheVal: 1.906 ± 0.466
PheTrp: 0.553 ± 0.16
PheTyr: 1.045 ± 0.276
PheXaa: 0.0 ± 0.0
Gly
GlyAla: 7.441 ± 0.976
GlyCys: 0.738 ± 0.199
GlyAsp: 5.657 ± 0.501
GlyGlu: 4.55 ± 0.574
GlyPhe: 2.952 ± 0.467
GlyGly: 10.884 ± 2.981
GlyHis: 1.66 ± 0.285
GlyIle: 4.305 ± 0.753
GlyLys: 3.505 ± 0.486
GlyLeu: 6.887 ± 0.98
GlyMet: 2.091 ± 0.415
GlyAsn: 3.751 ± 0.583
GlyPro: 3.813 ± 0.541
GlyGln: 2.214 ± 0.363
GlyArg: 4.612 ± 0.517
GlySer: 7.133 ± 1.208
GlyThr: 5.534 ± 0.791
GlyVal: 5.534 ± 0.603
GlyTrp: 2.583 ± 0.417
GlyTyr: 2.89 ± 0.382
GlyXaa: 0.0 ± 0.0
His
HisAla: 1.906 ± 0.442
HisCys: 0.246 ± 0.142
HisAsp: 1.045 ± 0.202
HisGlu: 1.353 ± 0.273
HisPhe: 0.738 ± 0.199
HisGly: 1.906 ± 0.441
HisHis: 0.676 ± 0.195
HisIle: 0.922 ± 0.208
HisLys: 0.984 ± 0.269
HisLeu: 1.414 ± 0.317
HisMet: 0.184 ± 0.104
HisAsn: 0.307 ± 0.122
HisPro: 1.168 ± 0.252
HisGln: 0.984 ± 0.266
HisArg: 1.291 ± 0.23
HisSer: 0.676 ± 0.204
HisThr: 1.045 ± 0.259
HisVal: 1.599 ± 0.332
HisTrp: 0.492 ± 0.146
HisTyr: 0.799 ± 0.229
HisXaa: 0.0 ± 0.0
Ile
IleAla: 5.965 ± 0.598
IleCys: 0.246 ± 0.114
IleAsp: 3.567 ± 0.469
IleGlu: 3.936 ± 0.482
IlePhe: 1.23 ± 0.298
IleGly: 3.936 ± 0.466
IleHis: 1.045 ± 0.275
IleIle: 1.906 ± 0.352
IleLys: 1.783 ± 0.398
IleLeu: 3.136 ± 0.39
IleMet: 0.861 ± 0.216
IleAsn: 1.906 ± 0.339
IlePro: 3.075 ± 0.384
IleGln: 1.66 ± 0.323
IleArg: 3.444 ± 0.507
IleSer: 3.321 ± 0.444
IleThr: 3.321 ± 0.425
IleVal: 3.321 ± 0.533
IleTrp: 0.615 ± 0.179
IleTyr: 1.845 ± 0.28
IleXaa: 0.0 ± 0.0
Lys
LysAla: 3.874 ± 0.513
LysCys: 0.246 ± 0.141
LysAsp: 2.521 ± 0.446
LysGlu: 2.214 ± 0.339
LysPhe: 1.414 ± 0.27
LysGly: 2.583 ± 0.391
LysHis: 1.353 ± 0.315
LysIle: 2.521 ± 0.398
LysLys: 2.029 ± 0.4
LysLeu: 3.259 ± 0.421
LysMet: 1.107 ± 0.239
LysAsn: 1.66 ± 0.275
LysPro: 2.706 ± 0.472
LysGln: 1.845 ± 0.415
LysArg: 2.89 ± 0.46
LysSer: 2.89 ± 0.416
LysThr: 2.644 ± 0.455
LysVal: 3.075 ± 0.477
LysTrp: 0.861 ± 0.235
LysTyr: 1.045 ± 0.26
LysXaa: 0.0 ± 0.0
Leu
LeuAla: 9.224 ± 0.876
LeuCys: 0.307 ± 0.131
LeuAsp: 6.026 ± 0.605
LeuGlu: 5.35 ± 0.566
LeuPhe: 2.029 ± 0.369
LeuGly: 7.318 ± 0.701
LeuHis: 1.353 ± 0.293
LeuIle: 4.489 ± 0.509
LeuLys: 4.305 ± 0.46
LeuLeu: 5.78 ± 0.604
LeuMet: 1.599 ± 0.287
LeuAsn: 2.952 ± 0.389
LeuPro: 5.227 ± 0.534
LeuGln: 2.583 ± 0.49
LeuArg: 6.149 ± 0.56
LeuSer: 5.35 ± 0.637
LeuThr: 5.965 ± 0.465
LeuVal: 4.489 ± 0.633
LeuTrp: 1.045 ± 0.309
LeuTyr: 2.152 ± 0.399
LeuXaa: 0.0 ± 0.0
Met
MetAla: 2.337 ± 0.414
MetCys: 0.0 ± 0.0
MetAsp: 1.045 ± 0.228
MetGlu: 1.414 ± 0.323
MetPhe: 0.615 ± 0.159
MetGly: 1.476 ± 0.3
MetHis: 0.369 ± 0.158
MetIle: 0.738 ± 0.205
MetLys: 1.291 ± 0.291
MetLeu: 1.045 ± 0.252
MetMet: 0.123 ± 0.08
MetAsn: 0.984 ± 0.215
MetPro: 1.045 ± 0.251
MetGln: 0.492 ± 0.144
MetArg: 1.107 ± 0.281
MetSer: 2.644 ± 0.432
MetThr: 2.337 ± 0.366
MetVal: 1.168 ± 0.289
MetTrp: 0.369 ± 0.153
MetTyr: 0.369 ± 0.152
MetXaa: 0.0 ± 0.0
Asn
AsnAla: 3.382 ± 0.482
AsnCys: 0.0 ± 0.0
AsnAsp: 2.214 ± 0.401
AsnGlu: 1.968 ± 0.344
AsnPhe: 1.107 ± 0.255
AsnGly: 3.69 ± 0.518
AsnHis: 0.861 ± 0.252
AsnIle: 1.845 ± 0.364
AsnLys: 0.615 ± 0.194
AsnLeu: 2.398 ± 0.379
AsnMet: 0.615 ± 0.171
AsnAsn: 0.738 ± 0.201
AsnPro: 2.89 ± 0.398
AsnGln: 1.045 ± 0.219
AsnArg: 1.291 ± 0.294
AsnSer: 1.968 ± 0.372
AsnThr: 1.537 ± 0.241
AsnVal: 2.337 ± 0.475
AsnTrp: 0.799 ± 0.189
AsnTyr: 0.984 ± 0.231
AsnXaa: 0.0 ± 0.0
Pro
ProAla: 5.411 ± 0.549
ProCys: 0.246 ± 0.128
ProAsp: 4.182 ± 0.474
ProGlu: 4.243 ± 0.552
ProPhe: 2.214 ± 0.35
ProGly: 5.165 ± 0.696
ProHis: 0.676 ± 0.199
ProIle: 2.398 ± 0.374
ProLys: 2.152 ± 0.308
ProLeu: 4.182 ± 0.561
ProMet: 0.799 ± 0.217
ProAsn: 1.722 ± 0.35
ProPro: 2.644 ± 0.404
ProGln: 1.414 ± 0.338
ProArg: 2.46 ± 0.44
ProSer: 3.813 ± 0.465
ProThr: 3.69 ± 0.621
ProVal: 3.997 ± 0.455
ProTrp: 0.799 ± 0.281
ProTyr: 1.66 ± 0.37
ProXaa: 0.0 ± 0.0
Gln
GlnAla: 3.013 ± 0.526
GlnCys: 0.123 ± 0.087
GlnAsp: 1.23 ± 0.261
GlnGlu: 1.66 ± 0.28
GlnPhe: 1.045 ± 0.264
GlnGly: 2.398 ± 0.37
GlnHis: 0.553 ± 0.166
GlnIle: 2.521 ± 0.51
GlnLys: 1.23 ± 0.305
GlnLeu: 3.567 ± 0.509
GlnMet: 0.922 ± 0.272
GlnAsn: 0.492 ± 0.153
GlnPro: 1.968 ± 0.34
GlnGln: 1.66 ± 0.342
GlnArg: 1.537 ± 0.303
GlnSer: 1.845 ± 0.258
GlnThr: 1.476 ± 0.293
GlnVal: 2.337 ± 0.345
GlnTrp: 0.615 ± 0.147
GlnTyr: 0.492 ± 0.143
GlnXaa: 0.0 ± 0.0
Arg
ArgAla: 5.596 ± 0.647
ArgCys: 0.615 ± 0.201
ArgAsp: 3.075 ± 0.383
ArgGlu: 5.165 ± 0.726
ArgPhe: 1.968 ± 0.373
ArgGly: 4.796 ± 0.692
ArgHis: 1.045 ± 0.241
ArgIle: 3.259 ± 0.483
ArgLys: 3.444 ± 0.502
ArgLeu: 5.842 ± 0.727
ArgMet: 1.783 ± 0.368
ArgAsn: 2.029 ± 0.382
ArgPro: 2.214 ± 0.361
ArgGln: 1.66 ± 0.282
ArgArg: 5.227 ± 0.59
ArgSer: 3.813 ± 0.521
ArgThr: 2.706 ± 0.469
ArgVal: 4.919 ± 0.494
ArgTrp: 1.23 ± 0.262
ArgTyr: 1.783 ± 0.302
ArgXaa: 0.0 ± 0.0
Ser
SerAla: 6.026 ± 0.743
SerCys: 0.553 ± 0.206
SerAsp: 3.567 ± 0.45
SerGlu: 4.059 ± 0.431
SerPhe: 1.783 ± 0.429
SerGly: 7.01 ± 1.239
SerHis: 1.66 ± 0.339
SerIle: 3.075 ± 0.486
SerLys: 2.829 ± 0.441
SerLeu: 5.042 ± 0.458
SerMet: 1.537 ± 0.3
SerAsn: 2.214 ± 0.44
SerPro: 3.321 ± 0.487
SerGln: 1.783 ± 0.301
SerArg: 3.075 ± 0.478
SerSer: 3.259 ± 0.595
SerThr: 3.321 ± 0.475
SerVal: 4.366 ± 0.566
SerTrp: 1.353 ± 0.32
SerTyr: 1.414 ± 0.328
SerXaa: 0.0 ± 0.0
Thr
ThrAla: 5.719 ± 0.692
ThrCys: 0.369 ± 0.159
ThrAsp: 3.997 ± 0.483
ThrGlu: 4.366 ± 0.552
ThrPhe: 2.152 ± 0.357
ThrGly: 6.211 ± 0.647
ThrHis: 1.107 ± 0.263
ThrIle: 2.829 ± 0.544
ThrLys: 2.521 ± 0.296
ThrLeu: 5.78 ± 0.655
ThrMet: 0.799 ± 0.206
ThrAsn: 1.845 ± 0.352
ThrPro: 3.628 ± 0.478
ThrGln: 1.537 ± 0.324
ThrArg: 3.751 ± 0.585
ThrSer: 4.059 ± 0.653
ThrThr: 4.612 ± 0.612
ThrVal: 5.35 ± 0.631
ThrTrp: 1.045 ± 0.254
ThrTyr: 1.906 ± 0.305
ThrXaa: 0.0 ± 0.0
Val
ValAla: 7.256 ± 0.722
ValCys: 0.43 ± 0.151
ValAsp: 5.227 ± 0.586
ValGlu: 4.981 ± 0.542
ValPhe: 2.091 ± 0.353
ValGly: 4.919 ± 0.594
ValHis: 1.23 ± 0.245
ValIle: 3.751 ± 0.51
ValLys: 3.013 ± 0.415
ValLeu: 5.473 ± 0.526
ValMet: 0.984 ± 0.276
ValAsn: 2.521 ± 0.317
ValPro: 3.997 ± 0.494
ValGln: 2.214 ± 0.403
ValArg: 5.473 ± 0.679
ValSer: 4.366 ± 0.527
ValThr: 5.165 ± 0.573
ValVal: 4.55 ± 0.633
ValTrp: 1.168 ± 0.315
ValTyr: 2.029 ± 0.405
ValXaa: 0.0 ± 0.0
Trp
TrpAla: 1.599 ± 0.284
TrpCys: 0.246 ± 0.1
TrpAsp: 1.353 ± 0.307
TrpGlu: 1.045 ± 0.242
TrpPhe: 0.861 ± 0.24
TrpGly: 1.722 ± 0.285
TrpHis: 0.43 ± 0.133
TrpIle: 1.23 ± 0.237
TrpLys: 0.369 ± 0.2
TrpLeu: 2.029 ± 0.293
TrpMet: 0.369 ± 0.174
TrpAsn: 0.553 ± 0.182
TrpPro: 0.799 ± 0.255
TrpGln: 0.861 ± 0.246
TrpArg: 1.168 ± 0.288
TrpSer: 0.861 ± 0.19
TrpThr: 1.722 ± 0.358
TrpVal: 1.783 ± 0.316
TrpTrp: 0.553 ± 0.196
TrpTyr: 0.307 ± 0.137
TrpXaa: 0.0 ± 0.0
Tyr
TyrAla: 2.275 ± 0.403
TyrCys: 0.184 ± 0.098
TyrAsp: 1.23 ± 0.302
TyrGlu: 2.275 ± 0.409
TyrPhe: 0.676 ± 0.185
TyrGly: 2.767 ± 0.386
TyrHis: 0.676 ± 0.195
TyrIle: 1.66 ± 0.397
TyrLys: 1.414 ± 0.282
TyrLeu: 2.46 ± 0.391
TyrMet: 0.676 ± 0.187
TyrAsn: 1.168 ± 0.333
TyrPro: 1.23 ± 0.288
TyrGln: 0.984 ± 0.244
TyrArg: 2.583 ± 0.434
TyrSer: 1.414 ± 0.288
TyrThr: 2.029 ± 0.359
TyrVal: 1.906 ± 0.349
TyrTrp: 0.43 ± 0.137
TyrTyr: 0.553 ± 0.175
TyrXaa: 0.0 ± 0.0
Xaa
XaaAla: 0.0 ± 0.0
XaaCys: 0.0 ± 0.0
XaaAsp: 0.0 ± 0.0
XaaGlu: 0.0 ± 0.0
XaaPhe: 0.0 ± 0.0
XaaGly: 0.0 ± 0.0
XaaHis: 0.0 ± 0.0
XaaIle: 0.0 ± 0.0
XaaLys: 0.0 ± 0.0
XaaLeu: 0.0 ± 0.0
XaaMet: 0.0 ± 0.0
XaaAsn: 0.0 ± 0.0
XaaPro: 0.0 ± 0.0
XaaGln: 0.0 ± 0.0
XaaArg: 0.0 ± 0.0
XaaSer: 0.0 ± 0.0
XaaThr: 0.0 ± 0.0
XaaVal: 0.0 ± 0.0
XaaTrp: 0.0 ± 0.0
XaaTyr: 0.0 ± 0.0
XaaXaa: 0.0 ± 0.0
Statistics based on 96 proteins (16263 amino acids)

Note: The error has been estimated by bootstrapping (100 replicates) at the protein level.
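A protein-level bootstrap of this kind can be sketched as follows. This is only an illustration under stated assumptions: the function names are hypothetical, whole proteins are resampled with replacement, and the spread is summarized as a sample standard deviation. The page does not say which statistic the reported ± values correspond to.

```python
import random
import statistics
from collections import Counter

def pair_counts(seq):
    """Count overlapping dipeptides within a single protein sequence."""
    return Counter(seq[i:i + 2] for i in range(len(seq) - 1))

def bootstrap_per_mille(sequences, pair, n_boot=100, seed=0):
    """Return (mean, standard deviation) of the per mille frequency of `pair`.

    Each replicate resamples whole proteins with replacement, so residues
    from one protein are always kept or dropped together.
    """
    rng = random.Random(seed)
    per_protein = [pair_counts(s) for s in sequences]
    estimates = []
    for _ in range(n_boot):
        sample = rng.choices(per_protein, k=len(per_protein))
        total = sum(sum(c.values()) for c in sample)
        hits = sum(c[pair] for c in sample)  # Counter returns 0 for absent pairs
        estimates.append(1000.0 * hits / total if total else 0.0)
    # Assumption: report the sample standard deviation across replicates.
    return statistics.mean(estimates), statistics.stdev(estimates)

# Toy input, not real phage proteins.
proteins = ["MAAK", "AAGG", "MKAA", "GGGA"]
mean, sd = bootstrap_per_mille(proteins, "AA")
print(f"AA: {mean:.3f} ± {sd:.3f}")
```

Resampling proteins rather than individual dipeptides keeps neighbouring residues together, which is what "at the protein level" suggests.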

The dipeptide statistics above (along with other statistics for this proteome) can be downloaded from this CSV file
See this proteome in: uniprot_link
Proteome-pI is available under the Creative Commons Attribution-NoDerivs license; for more details see here

Reference: Kozlowski LP. Proteome-pI 2.0: Proteome Isoelectric Point Database Update. Nucleic Acids Res. 2021, doi: 10.1093/nar/gkab944
Contact: Lukasz P. Kozlowski