Amino acid dipeptide frequency for Mycobacterium phage ArcherNM

Apart from single amino acid frequencies, one can also calculate the so-called amino acid dipeptide frequency.

There are 400 possible dipeptides (441 if Xaa is counted as an additional, 21st amino acid representing all non-standard or unknown amino acids). Thus, if dipeptides occurred randomly in proteins, each amino acid dipeptide would be present at a frequency of 0.25% (2.5‰). As this is not the case in nature, for better visibility all overrepresented dipeptides are marked in red and all underrepresented ones in blue in the table.
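The calculation described above can be sketched in Python as follows. This is a minimal illustration, not the code used to build the table: the function names, the toy sequences, the choice to count overlapping pairs within each protein only, and the use of the uniform 1/400 expectation as the over/under threshold are all assumptions.

```python
from collections import Counter

STANDARD = "ACDEFGHIKLMNPQRSTVWY"  # 20 standard amino acids, one-letter codes

# Uniform expectation over the 400 standard dipeptides: 2.5 per mille (0.25%).
UNIFORM_PER_MILLE = 1000.0 / (len(STANDARD) ** 2)


def normalise(seq):
    """Map every non-standard or unknown residue to 'X' (Xaa)."""
    return "".join(aa if aa in STANDARD else "X" for aa in seq.upper())


def dipeptide_per_mille(proteins):
    """Return {dipeptide: frequency in per mille} over overlapping pairs.

    Assumption: pairs are counted inside each protein only, never across
    the boundary between two proteins.
    """
    counts = Counter()
    for seq in proteins:
        seq = normalise(seq)
        for i in range(len(seq) - 1):
            counts[seq[i:i + 2]] += 1
    total = sum(counts.values())
    if total == 0:
        return {}
    return {pair: 1000.0 * n / total for pair, n in counts.items()}


def classify(freq_per_mille):
    """Label a frequency relative to the uniform expectation (assumed threshold)."""
    if freq_per_mille > UNIFORM_PER_MILLE:
        return "over"
    if freq_per_mille < UNIFORM_PER_MILLE:
        return "under"
    return "expected"


if __name__ == "__main__":
    toy = ["MAAGLA", "MKAAL"]  # hypothetical sequences, not from this proteome
    for pair, freq in sorted(dipeptide_per_mille(toy).items()):
        print(pair, round(freq, 3), classify(freq))
```

Under this threshold, the AlaAla value in the table (11.508‰) would be labelled overrepresented and AlaCys (0.492‰) underrepresented.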

All values are given in per mille (‰) and therefore need to be multiplied by 10⁻³ to obtain fractions.
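As a small worked example of the unit conversion, the helpers below turn a per mille value into a fraction or a percentage. The function names are hypothetical and serve only as illustration.

```python
def per_mille_to_fraction(value):
    """Convert a per mille value to a plain fraction (multiply by 10^-3)."""
    return value * 1e-3


def per_mille_to_percent(value):
    """Convert a per mille value to a percentage (divide by 10)."""
    return value * 0.1


if __name__ == "__main__":
    ala_ala = 11.508  # AlaAla value taken from the table, in per mille
    print(per_mille_to_fraction(ala_ala))  # about 0.011508
    print(per_mille_to_percent(ala_ala))   # about 1.1508
```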

For more information, see the sequence space article on Wikipedia.

Ala Cys Asp Glu Phe Gly His Ile Lys Leu Met Asn Pro Gln Arg Ser Thr Val Trp Tyr Xaa
Ala
AlaAla: 11.508 ± 1.114
AlaCys: 0.492 ± 0.199
AlaAsp: 5.354 ± 0.582
AlaGlu: 7.508 ± 0.815
AlaPhe: 3.816 ± 0.549
AlaGly: 8.185 ± 0.843
AlaHis: 2.031 ± 0.348
AlaIle: 3.754 ± 0.502
AlaLys: 5.293 ± 0.588
AlaLeu: 8.677 ± 0.958
AlaMet: 3.139 ± 0.409
AlaAsn: 2.831 ± 0.413
AlaPro: 4.739 ± 0.58
AlaGln: 3.816 ± 0.598
AlaArg: 6.154 ± 0.662
AlaSer: 5.17 ± 0.588
AlaThr: 5.293 ± 0.646
AlaVal: 7.631 ± 0.615
AlaTrp: 1.785 ± 0.358
AlaTyr: 2.892 ± 0.391
AlaXaa: 0.0 ± 0.0
Cys
CysAla: 0.739 ± 0.184
CysCys: 0.062 ± 0.064
CysAsp: 0.492 ± 0.19
CysGlu: 0.677 ± 0.192
CysPhe: 0.431 ± 0.16
CysGly: 0.862 ± 0.255
CysHis: 0.185 ± 0.151
CysIle: 0.185 ± 0.106
CysLys: 0.308 ± 0.128
CysLeu: 0.677 ± 0.22
CysMet: 0.246 ± 0.114
CysAsn: 0.369 ± 0.136
CysPro: 0.431 ± 0.22
CysGln: 0.123 ± 0.095
CysArg: 0.615 ± 0.228
CysSer: 0.554 ± 0.205
CysThr: 0.677 ± 0.235
CysVal: 0.615 ± 0.191
CysTrp: 0.246 ± 0.131
CysTyr: 0.369 ± 0.155
CysXaa: 0.0 ± 0.0
Asp
AspAla: 6.216 ± 0.522
AspCys: 0.862 ± 0.269
AspAsp: 3.569 ± 0.496
AspGlu: 4.493 ± 0.656
AspPhe: 2.646 ± 0.361
AspGly: 5.046 ± 0.627
AspHis: 1.662 ± 0.366
AspIle: 3.262 ± 0.406
AspLys: 2.277 ± 0.317
AspLeu: 5.354 ± 0.65
AspMet: 1.908 ± 0.339
AspAsn: 1.169 ± 0.247
AspPro: 5.17 ± 0.596
AspGln: 1.908 ± 0.361
AspArg: 3.2 ± 0.452
AspSer: 2.585 ± 0.368
AspThr: 3.262 ± 0.589
AspVal: 4.493 ± 0.504
AspTrp: 1.539 ± 0.34
AspTyr: 2.092 ± 0.267
AspXaa: 0.0 ± 0.0
Glu
GluAla: 8.185 ± 1.001
GluCys: 0.246 ± 0.107
GluAsp: 4.554 ± 0.697
GluGlu: 4.923 ± 0.758
GluPhe: 3.077 ± 0.455
GluGly: 5.416 ± 0.567
GluHis: 1.6 ± 0.286
GluIle: 3.569 ± 0.427
GluLys: 2.523 ± 0.362
GluLeu: 6.954 ± 0.662
GluMet: 2.216 ± 0.322
GluAsn: 2.154 ± 0.342
GluPro: 2.523 ± 0.426
GluGln: 2.585 ± 0.394
GluArg: 4.677 ± 0.511
GluSer: 2.831 ± 0.461
GluThr: 3.631 ± 0.422
GluVal: 4.862 ± 0.524
GluTrp: 1.477 ± 0.306
GluTyr: 1.846 ± 0.279
GluXaa: 0.0 ± 0.0
Phe
PheAla: 3.569 ± 0.492
PheCys: 0.369 ± 0.19
PheAsp: 2.154 ± 0.388
PheGlu: 3.569 ± 0.494
PhePhe: 0.923 ± 0.258
PheGly: 3.016 ± 0.476
PheHis: 0.862 ± 0.297
PheIle: 1.662 ± 0.288
PheLys: 1.6 ± 0.348
PheLeu: 3.139 ± 0.442
PheMet: 0.615 ± 0.189
PheAsn: 1.539 ± 0.321
PhePro: 2.031 ± 0.425
PheGln: 1.292 ± 0.295
PheArg: 1.846 ± 0.317
PheSer: 2.031 ± 0.364
PheThr: 2.523 ± 0.322
PheVal: 2.092 ± 0.379
PheTrp: 0.369 ± 0.152
PheTyr: 0.677 ± 0.182
PheXaa: 0.0 ± 0.0
Gly
GlyAla: 8.124 ± 1.039
GlyCys: 0.8 ± 0.231
GlyAsp: 6.585 ± 0.931
GlyGlu: 5.17 ± 0.665
GlyPhe: 3.077 ± 0.405
GlyGly: 8.677 ± 1.842
GlyHis: 1.539 ± 0.292
GlyIle: 4.062 ± 0.568
GlyLys: 3.262 ± 0.435
GlyLeu: 6.77 ± 0.613
GlyMet: 2.031 ± 0.393
GlyAsn: 4.0 ± 0.655
GlyPro: 5.046 ± 1.942
GlyGln: 2.769 ± 0.468
GlyArg: 3.631 ± 0.383
GlySer: 4.123 ± 0.68
GlyThr: 5.354 ± 0.577
GlyVal: 6.339 ± 0.602
GlyTrp: 1.415 ± 0.277
GlyTyr: 2.646 ± 0.374
GlyXaa: 0.0 ± 0.0
His
HisAla: 1.6 ± 0.257
HisCys: 0.246 ± 0.114
HisAsp: 1.477 ± 0.348
HisGlu: 1.415 ± 0.31
HisPhe: 0.492 ± 0.177
HisGly: 1.539 ± 0.434
HisHis: 0.8 ± 0.218
HisIle: 1.169 ± 0.248
HisLys: 1.046 ± 0.26
HisLeu: 1.415 ± 0.286
HisMet: 0.308 ± 0.146
HisAsn: 0.615 ± 0.162
HisPro: 1.231 ± 0.241
HisGln: 0.985 ± 0.209
HisArg: 1.662 ± 0.331
HisSer: 1.046 ± 0.247
HisThr: 0.985 ± 0.233
HisVal: 0.923 ± 0.241
HisTrp: 0.554 ± 0.181
HisTyr: 0.615 ± 0.243
HisXaa: 0.0 ± 0.0
Ile
IleAla: 5.539 ± 0.586
IleCys: 0.492 ± 0.171
IleAsp: 3.262 ± 0.481
IleGlu: 4.308 ± 0.537
IlePhe: 1.415 ± 0.203
IleGly: 4.554 ± 0.768
IleHis: 0.985 ± 0.238
IleIle: 2.031 ± 0.315
IleLys: 2.216 ± 0.392
IleLeu: 3.754 ± 0.421
IleMet: 0.492 ± 0.18
IleAsn: 2.154 ± 0.386
IlePro: 3.262 ± 0.43
IleGln: 1.292 ± 0.241
IleArg: 2.4 ± 0.311
IleSer: 2.708 ± 0.379
IleThr: 2.646 ± 0.409
IleVal: 3.2 ± 0.52
IleTrp: 0.615 ± 0.195
IleTyr: 0.739 ± 0.209
IleXaa: 0.0 ± 0.0
Lys
LysAla: 4.923 ± 0.536
LysCys: 0.308 ± 0.124
LysAsp: 2.585 ± 0.366
LysGlu: 2.277 ± 0.43
LysPhe: 1.108 ± 0.256
LysGly: 4.123 ± 0.924
LysHis: 0.431 ± 0.151
LysIle: 2.092 ± 0.342
LysLys: 3.077 ± 0.544
LysLeu: 3.939 ± 0.488
LysMet: 0.862 ± 0.23
LysAsn: 1.231 ± 0.29
LysPro: 3.139 ± 0.498
LysGln: 1.6 ± 0.3
LysArg: 3.2 ± 0.52
LysSer: 2.216 ± 0.348
LysThr: 2.339 ± 0.367
LysVal: 3.631 ± 0.466
LysTrp: 0.739 ± 0.243
LysTyr: 1.292 ± 0.242
LysXaa: 0.0 ± 0.0
Leu
LeuAla: 9.478 ± 0.768
LeuCys: 0.615 ± 0.196
LeuAsp: 5.539 ± 0.658
LeuGlu: 6.093 ± 0.632
LeuPhe: 3.016 ± 0.364
LeuGly: 5.539 ± 0.759
LeuHis: 1.785 ± 0.398
LeuIle: 4.123 ± 0.585
LeuLys: 3.323 ± 0.406
LeuLeu: 5.108 ± 0.646
LeuMet: 2.216 ± 0.43
LeuAsn: 2.031 ± 0.387
LeuPro: 5.354 ± 0.554
LeuGln: 2.708 ± 0.588
LeuArg: 5.723 ± 0.708
LeuSer: 5.539 ± 0.582
LeuThr: 5.108 ± 0.633
LeuVal: 4.369 ± 0.58
LeuTrp: 1.6 ± 0.271
LeuTyr: 2.523 ± 0.324
LeuXaa: 0.0 ± 0.0
Met
MetAla: 2.4 ± 0.349
MetCys: 0.123 ± 0.087
MetAsp: 0.985 ± 0.232
MetGlu: 1.477 ± 0.233
MetPhe: 0.492 ± 0.158
MetGly: 1.6 ± 0.297
MetHis: 0.431 ± 0.159
MetIle: 1.231 ± 0.278
MetLys: 1.354 ± 0.371
MetLeu: 1.415 ± 0.37
MetMet: 0.677 ± 0.187
MetAsn: 0.8 ± 0.207
MetPro: 1.046 ± 0.34
MetGln: 0.923 ± 0.224
MetArg: 1.539 ± 0.233
MetSer: 2.585 ± 0.344
MetThr: 2.892 ± 0.405
MetVal: 1.292 ± 0.253
MetTrp: 0.246 ± 0.115
MetTyr: 0.862 ± 0.267
MetXaa: 0.0 ± 0.0
Asn
AsnAla: 2.954 ± 0.504
AsnCys: 0.554 ± 0.162
AsnAsp: 2.092 ± 0.34
AsnGlu: 2.277 ± 0.378
AsnPhe: 1.108 ± 0.321
AsnGly: 3.754 ± 0.537
AsnHis: 0.985 ± 0.206
AsnIle: 1.108 ± 0.255
AsnLys: 0.862 ± 0.218
AsnLeu: 3.139 ± 0.421
AsnMet: 0.739 ± 0.179
AsnAsn: 0.677 ± 0.196
AsnPro: 2.216 ± 0.403
AsnGln: 0.862 ± 0.271
AsnArg: 1.785 ± 0.372
AsnSer: 1.415 ± 0.255
AsnThr: 1.539 ± 0.323
AsnVal: 2.462 ± 0.414
AsnTrp: 0.985 ± 0.23
AsnTyr: 0.8 ± 0.216
AsnXaa: 0.0 ± 0.0
Pro
ProAla: 5.293 ± 0.603
ProCys: 0.185 ± 0.119
ProAsp: 4.123 ± 0.504
ProGlu: 4.739 ± 0.579
ProPhe: 1.662 ± 0.385
ProGly: 5.293 ± 0.718
ProHis: 0.923 ± 0.24
ProIle: 2.523 ± 0.369
ProLys: 3.016 ± 0.724
ProLeu: 3.508 ± 0.468
ProMet: 1.108 ± 0.297
ProAsn: 2.092 ± 0.353
ProPro: 2.462 ± 0.467
ProGln: 2.216 ± 0.674
ProArg: 4.123 ± 0.676
ProSer: 2.892 ± 0.335
ProThr: 3.323 ± 0.446
ProVal: 3.569 ± 0.404
ProTrp: 0.862 ± 0.313
ProTyr: 1.415 ± 0.321
ProXaa: 0.0 ± 0.0
Gln
GlnAla: 4.0 ± 0.527
GlnCys: 0.308 ± 0.148
GlnAsp: 0.923 ± 0.237
GlnGlu: 1.846 ± 0.377
GlnPhe: 1.539 ± 0.369
GlnGly: 4.739 ± 1.36
GlnHis: 0.923 ± 0.273
GlnIle: 2.646 ± 0.394
GlnLys: 1.723 ± 0.321
GlnLeu: 3.077 ± 0.55
GlnMet: 0.8 ± 0.23
GlnAsn: 0.923 ± 0.201
GlnPro: 1.231 ± 0.277
GlnGln: 1.846 ± 0.354
GlnArg: 2.339 ± 0.424
GlnSer: 1.662 ± 0.365
GlnThr: 2.277 ± 0.293
GlnVal: 2.646 ± 0.357
GlnTrp: 0.739 ± 0.244
GlnTyr: 0.923 ± 0.184
GlnXaa: 0.0 ± 0.0
Arg
ArgAla: 4.739 ± 0.633
ArgCys: 0.923 ± 0.275
ArgAsp: 4.308 ± 0.543
ArgGlu: 5.108 ± 0.715
ArgPhe: 2.523 ± 0.424
ArgGly: 4.431 ± 0.633
ArgHis: 1.477 ± 0.315
ArgIle: 3.631 ± 0.469
ArgLys: 3.016 ± 0.508
ArgLeu: 5.539 ± 0.651
ArgMet: 1.908 ± 0.338
ArgAsn: 1.969 ± 0.311
ArgPro: 2.523 ± 0.364
ArgGln: 2.216 ± 0.426
ArgArg: 5.231 ± 0.595
ArgSer: 2.954 ± 0.501
ArgThr: 2.954 ± 0.414
ArgVal: 4.369 ± 0.468
ArgTrp: 1.354 ± 0.271
ArgTyr: 2.031 ± 0.344
ArgXaa: 0.0 ± 0.0
Ser
SerAla: 4.677 ± 0.516
SerCys: 0.554 ± 0.194
SerAsp: 3.569 ± 0.372
SerGlu: 3.016 ± 0.357
SerPhe: 2.277 ± 0.342
SerGly: 4.862 ± 0.728
SerHis: 0.615 ± 0.177
SerIle: 2.892 ± 0.365
SerLys: 2.646 ± 0.397
SerLeu: 4.0 ± 0.648
SerMet: 1.108 ± 0.222
SerAsn: 1.231 ± 0.262
SerPro: 2.892 ± 0.368
SerGln: 2.462 ± 0.445
SerArg: 3.754 ± 0.465
SerSer: 3.139 ± 0.463
SerThr: 3.508 ± 0.434
SerVal: 3.446 ± 0.567
SerTrp: 0.923 ± 0.225
SerTyr: 1.539 ± 0.325
SerXaa: 0.0 ± 0.0
Thr
ThrAla: 5.723 ± 0.501
ThrCys: 0.554 ± 0.139
ThrAsp: 3.016 ± 0.406
ThrGlu: 2.831 ± 0.419
ThrPhe: 2.4 ± 0.393
ThrGly: 5.785 ± 1.004
ThrHis: 0.739 ± 0.216
ThrIle: 2.585 ± 0.372
ThrLys: 3.077 ± 0.47
ThrLeu: 5.17 ± 0.624
ThrMet: 1.354 ± 0.323
ThrAsn: 2.092 ± 0.36
ThrPro: 4.8 ± 0.657
ThrGln: 2.831 ± 0.394
ThrArg: 3.016 ± 0.512
ThrSer: 2.831 ± 0.448
ThrThr: 3.2 ± 0.447
ThrVal: 4.308 ± 0.532
ThrTrp: 1.231 ± 0.31
ThrTyr: 2.154 ± 0.352
ThrXaa: 0.0 ± 0.0
Val
ValAla: 6.154 ± 0.76
ValCys: 0.677 ± 0.177
ValAsp: 4.923 ± 0.53
ValGlu: 4.369 ± 0.602
ValPhe: 2.523 ± 0.469
ValGly: 4.369 ± 0.464
ValHis: 1.108 ± 0.249
ValIle: 3.139 ± 0.445
ValLys: 3.446 ± 0.429
ValLeu: 6.277 ± 0.613
ValMet: 1.415 ± 0.29
ValAsn: 2.585 ± 0.427
ValPro: 3.016 ± 0.524
ValGln: 2.646 ± 0.396
ValArg: 4.985 ± 0.619
ValSer: 3.816 ± 0.599
ValThr: 5.17 ± 0.572
ValVal: 5.354 ± 0.563
ValTrp: 1.539 ± 0.333
ValTyr: 1.846 ± 0.334
ValXaa: 0.0 ± 0.0
Trp
TrpAla: 1.477 ± 0.397
TrpCys: 0.246 ± 0.147
TrpAsp: 1.354 ± 0.282
TrpGlu: 1.477 ± 0.31
TrpPhe: 0.554 ± 0.188
TrpGly: 1.415 ± 0.266
TrpHis: 0.739 ± 0.241
TrpIle: 0.8 ± 0.229
TrpLys: 0.369 ± 0.147
TrpLeu: 1.415 ± 0.302
TrpMet: 0.431 ± 0.15
TrpAsn: 1.108 ± 0.238
TrpPro: 0.862 ± 0.223
TrpGln: 1.231 ± 0.277
TrpArg: 1.046 ± 0.219
TrpSer: 1.169 ± 0.286
TrpThr: 1.231 ± 0.277
TrpVal: 1.292 ± 0.24
TrpTrp: 0.492 ± 0.216
TrpTyr: 0.492 ± 0.17
TrpXaa: 0.0 ± 0.0
Tyr
TyrAla: 2.769 ± 0.384
TyrCys: 0.246 ± 0.12
TyrAsp: 1.846 ± 0.321
TyrGlu: 2.277 ± 0.353
TyrPhe: 0.8 ± 0.212
TyrGly: 2.216 ± 0.375
TyrHis: 0.308 ± 0.145
TyrIle: 1.6 ± 0.254
TyrLys: 0.677 ± 0.229
TyrLeu: 2.462 ± 0.35
TyrMet: 0.739 ± 0.244
TyrAsn: 0.739 ± 0.219
TyrPro: 1.354 ± 0.28
TyrGln: 0.8 ± 0.194
TyrArg: 2.277 ± 0.432
TyrSer: 1.846 ± 0.349
TyrThr: 1.785 ± 0.354
TyrVal: 2.462 ± 0.388
TyrTrp: 0.554 ± 0.212
TyrTyr: 0.677 ± 0.207
TyrXaa: 0.0 ± 0.0
Xaa
XaaAla: 0.0 ± 0.0
XaaCys: 0.0 ± 0.0
XaaAsp: 0.0 ± 0.0
XaaGlu: 0.0 ± 0.0
XaaPhe: 0.0 ± 0.0
XaaGly: 0.0 ± 0.0
XaaHis: 0.0 ± 0.0
XaaIle: 0.0 ± 0.0
XaaLys: 0.0 ± 0.0
XaaLeu: 0.0 ± 0.0
XaaMet: 0.0 ± 0.0
XaaAsn: 0.0 ± 0.0
XaaPro: 0.0 ± 0.0
XaaGln: 0.0 ± 0.0
XaaArg: 0.0 ± 0.0
XaaSer: 0.0 ± 0.0
XaaThr: 0.0 ± 0.0
XaaVal: 0.0 ± 0.0
XaaTrp: 0.0 ± 0.0
XaaTyr: 0.0 ± 0.0
XaaXaa: 0.0 ± 0.0
Statistics based on 91 proteins (16250 amino acids)

Note: The error was estimated by bootstrapping (x100) at the protein level.
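A protein-level bootstrap of this kind can be sketched as follows. This is only an illustration under stated assumptions: the source does not say how the error is derived from the replicates, so the sample standard deviation is used here, and the function names, the seed, and the toy sequences are hypothetical.

```python
import random
import statistics


def pair_per_mille(proteins, pair):
    """Per mille frequency of one dipeptide over overlapping pairs."""
    hits = 0
    total = 0
    for seq in proteins:
        for i in range(len(seq) - 1):
            total += 1
            if seq[i:i + 2] == pair:
                hits += 1
    return 1000.0 * hits / total if total else 0.0


def bootstrap_error(proteins, pair, replicates=100, seed=0):
    """Estimate the error of a dipeptide frequency by resampling proteins.

    Each replicate draws len(proteins) whole proteins with replacement,
    so the resampling unit is the protein, not the individual residue.
    Assumption: the error is the standard deviation across replicates.
    """
    rng = random.Random(seed)
    estimates = []
    for _ in range(replicates):
        sample = rng.choices(proteins, k=len(proteins))
        estimates.append(pair_per_mille(sample, pair))
    return statistics.stdev(estimates)


if __name__ == "__main__":
    toy = ["MAAGLA", "MKAAL", "MGGPLV", "MAAR"]  # hypothetical sequences
    print(pair_per_mille(toy, "AA"), "+/-", bootstrap_error(toy, "AA"))
```

Resampling whole proteins keeps the correlation between neighbouring residues within a protein intact, which resampling single dipeptides would not.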

The dipeptide statistics above (among other statistics for this proteome) can be downloaded from this CSV file
See this proteome in: uniprot_link
Proteome-pI is available under Creative Commons Attribution-NoDerivs license, for more details see here

Reference: Kozlowski LP. Proteome-pI 2.0: Proteome Isoelectric Point Database Update. Nucleic Acids Res. 2021, doi: 10.1093/nar/gkab944
Contact: Lukasz P. Kozlowski