Amino acid dipeptide frequency for Paenibacillus phage Halcyone

Apart from single amino acid frequencies, one can also calculate so-called amino acid dipeptide frequencies.

There are 400 possible dipeptides (441 if Xaa is counted as an additional 21st amino acid representing all non-standard or unknown amino acids). Thus, if dipeptides occurred randomly and with equal probability in proteins, each of the 400 dipeptides would be present at a frequency of 0.25%. As this is not the case in nature, for better visibility dipeptides that are more frequent than expected are marked in red, and those that are underrepresented are marked in blue in the table.
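As an illustration of the counting described above, here is a minimal Python sketch. It is not the database's own code: the function names, the one-letter input format, the mapping of unknown residues to `X` (Xaa), and the normalisation by the total number of adjacent pairs are all assumptions made for this example, as is the use of the uniform expectation as the threshold for over- and underrepresentation.

```python
from collections import Counter

STANDARD = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard amino acids, one-letter codes

def dipeptide_per_mille(proteins):
    """Return {dipeptide: frequency in per mille} over all adjacent residue pairs."""
    counts = Counter()
    for seq in proteins:
        # Assumption: every non-standard or unknown residue is collapsed to 'X' (Xaa).
        seq = "".join(c if c in STANDARD else "X" for c in seq.upper())
        # Overlapping pairs; a pair never spans two different proteins.
        counts.update(seq[i:i + 2] for i in range(len(seq) - 1))
    total = sum(counts.values())
    if total == 0:
        return {}
    return {dp: 1000.0 * n / total for dp, n in counts.items()}

# Uniform expectation for 400 dipeptides: 1000 / 400 = 2.5 per mille (0.25%).
UNIFORM_PER_MILLE = 1000.0 / (len(STANDARD) ** 2)

def classify(freq_per_mille):
    """Label a frequency relative to the uniform expectation (assumed threshold)."""
    if freq_per_mille > UNIFORM_PER_MILLE:
        return "over"
    if freq_per_mille < UNIFORM_PER_MILLE:
        return "under"
    return "expected"
```

With this threshold, a value such as 7.766 (AlaAla) would be labelled "over" and 0.934 (AlaCys) "under".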

All values are given in per mille (‰) and therefore need to be multiplied by 10⁻³ to obtain fractions.
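The unit conversion can be written out as a short Python sketch. The helper names are hypothetical and chosen only for this example.

```python
def per_mille_to_fraction(value_per_mille):
    """Convert a per-mille value to a plain fraction (multiply by 10^-3)."""
    return value_per_mille * 1e-3

def per_mille_to_percent(value_per_mille):
    """Convert a per-mille value to percent (divide by 10)."""
    return value_per_mille / 10.0
```

For example, the uniform expectation of 2.5‰ corresponds to 0.25%, and a table value of 7.766‰ corresponds to a fraction of about 0.007766.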

For more information, see the sequence space article on Wikipedia.

Ala Cys Asp Glu Phe Gly His Ile Lys Leu Met Asn Pro Gln Arg Ser Thr Val Trp Tyr Xaa
Ala
AlaAla: 7.766 ± 0.668
AlaCys: 0.934 ± 0.22
AlaAsp: 5.722 ± 0.549
AlaGlu: 6.832 ± 0.702
AlaPhe: 2.569 ± 0.327
AlaGly: 5.138 ± 0.765
AlaHis: 1.226 ± 0.231
AlaIle: 4.087 ± 0.487
AlaLys: 5.897 ± 0.551
AlaLeu: 7.591 ± 0.603
AlaMet: 1.635 ± 0.297
AlaAsn: 4.029 ± 0.56
AlaPro: 2.336 ± 0.334
AlaGln: 2.044 ± 0.332
AlaArg: 4.438 ± 0.589
AlaSer: 4.496 ± 0.566
AlaThr: 3.562 ± 0.541
AlaVal: 5.956 ± 0.726
AlaTrp: 1.226 ± 0.337
AlaTyr: 3.095 ± 0.468
AlaXaa: 0.0 ± 0.0
Cys
CysAla: 0.876 ± 0.242
CysCys: 0.058 ± 0.056
CysAsp: 0.526 ± 0.2
CysGlu: 0.701 ± 0.18
CysPhe: 0.584 ± 0.172
CysGly: 1.051 ± 0.282
CysHis: 0.234 ± 0.117
CysIle: 0.35 ± 0.13
CysLys: 0.759 ± 0.197
CysLeu: 0.759 ± 0.225
CysMet: 0.175 ± 0.098
CysAsn: 0.175 ± 0.096
CysPro: 0.584 ± 0.24
CysGln: 0.35 ± 0.175
CysArg: 1.285 ± 0.276
CysSer: 0.467 ± 0.16
CysThr: 0.467 ± 0.155
CysVal: 0.409 ± 0.145
CysTrp: 0.175 ± 0.103
CysTyr: 0.409 ± 0.156
CysXaa: 0.0 ± 0.0
Asp
AspAla: 5.08 ± 0.674
AspCys: 0.642 ± 0.193
AspAsp: 5.956 ± 0.595
AspGlu: 6.598 ± 0.499
AspPhe: 3.445 ± 0.426
AspGly: 4.321 ± 0.56
AspHis: 1.401 ± 0.252
AspIle: 5.489 ± 0.488
AspLys: 2.744 ± 0.363
AspLeu: 5.197 ± 0.423
AspMet: 1.927 ± 0.288
AspAsn: 2.219 ± 0.353
AspPro: 2.219 ± 0.322
AspGln: 1.577 ± 0.349
AspArg: 3.503 ± 0.42
AspSer: 3.737 ± 0.44
AspThr: 3.328 ± 0.374
AspVal: 5.314 ± 0.602
AspTrp: 1.46 ± 0.308
AspTyr: 2.628 ± 0.362
AspXaa: 0.0 ± 0.0
Glu
GluAla: 5.606 ± 0.554
GluCys: 0.817 ± 0.254
GluAsp: 4.087 ± 0.54
GluGlu: 6.89 ± 0.695
GluPhe: 3.27 ± 0.528
GluGly: 4.029 ± 0.561
GluHis: 1.343 ± 0.274
GluIle: 4.963 ± 0.572
GluLys: 6.481 ± 0.658
GluLeu: 8.875 ± 0.729
GluMet: 2.861 ± 0.382
GluAsn: 2.511 ± 0.368
GluPro: 2.744 ± 0.422
GluGln: 2.628 ± 0.394
GluArg: 5.722 ± 0.581
GluSer: 3.737 ± 0.548
GluThr: 4.204 ± 0.414
GluVal: 5.43 ± 0.7
GluTrp: 1.343 ± 0.316
GluTyr: 3.971 ± 0.522
GluXaa: 0.0 ± 0.0
Phe
PheAla: 2.803 ± 0.412
PheCys: 0.35 ± 0.142
PheAsp: 2.686 ± 0.375
PheGlu: 2.628 ± 0.469
PhePhe: 1.109 ± 0.297
PheGly: 2.394 ± 0.34
PheHis: 0.701 ± 0.196
PheIle: 2.277 ± 0.347
PheLys: 2.102 ± 0.281
PheLeu: 2.394 ± 0.365
PheMet: 1.109 ± 0.278
PheAsn: 1.985 ± 0.36
PhePro: 1.109 ± 0.284
PheGln: 0.759 ± 0.188
PheArg: 2.102 ± 0.306
PheSer: 1.985 ± 0.34
PheThr: 2.394 ± 0.349
PheVal: 2.686 ± 0.399
PheTrp: 0.35 ± 0.135
PheTyr: 1.635 ± 0.326
PheXaa: 0.0 ± 0.0
Gly
GlyAla: 4.029 ± 0.566
GlyCys: 0.467 ± 0.19
GlyAsp: 4.73 ± 0.535
GlyGlu: 4.671 ± 0.665
GlyPhe: 2.394 ± 0.371
GlyGly: 5.08 ± 0.644
GlyHis: 0.759 ± 0.201
GlyIle: 5.08 ± 0.568
GlyLys: 5.956 ± 0.538
GlyLeu: 5.43 ± 0.534
GlyMet: 1.693 ± 0.353
GlyAsn: 2.686 ± 0.372
GlyPro: 1.752 ± 0.376
GlyGln: 1.752 ± 0.35
GlyArg: 4.263 ± 0.538
GlySer: 2.686 ± 0.455
GlyThr: 3.854 ± 0.39
GlyVal: 4.554 ± 0.529
GlyTrp: 1.109 ± 0.227
GlyTyr: 2.452 ± 0.388
GlyXaa: 0.058 ± 0.059
His
HisAla: 1.693 ± 0.337
HisCys: 0.234 ± 0.111
HisAsp: 1.518 ± 0.308
HisGlu: 1.577 ± 0.26
HisPhe: 0.526 ± 0.184
HisGly: 1.343 ± 0.258
HisHis: 0.292 ± 0.144
HisIle: 0.759 ± 0.161
HisLys: 0.934 ± 0.208
HisLeu: 0.934 ± 0.259
HisMet: 0.35 ± 0.121
HisAsn: 0.584 ± 0.154
HisPro: 0.817 ± 0.211
HisGln: 0.584 ± 0.15
HisArg: 0.934 ± 0.315
HisSer: 0.934 ± 0.209
HisThr: 0.759 ± 0.239
HisVal: 1.168 ± 0.256
HisTrp: 0.234 ± 0.125
HisTyr: 0.817 ± 0.222
HisXaa: 0.0 ± 0.0
Ile
IleAla: 5.781 ± 0.697
IleCys: 1.109 ± 0.268
IleAsp: 5.372 ± 0.572
IleGlu: 4.963 ± 0.56
IlePhe: 1.577 ± 0.309
IleGly: 3.737 ± 0.572
IleHis: 0.993 ± 0.224
IleIle: 3.328 ± 0.376
IleLys: 4.554 ± 0.565
IleLeu: 4.087 ± 0.461
IleMet: 1.693 ± 0.378
IleAsn: 2.452 ± 0.329
IlePro: 2.511 ± 0.386
IleGln: 1.518 ± 0.251
IleArg: 4.496 ± 0.469
IleSer: 2.803 ± 0.476
IleThr: 3.62 ± 0.5
IleVal: 4.321 ± 0.633
IleTrp: 0.642 ± 0.2
IleTyr: 2.686 ± 0.474
IleXaa: 0.0 ± 0.0
Lys
LysAla: 5.722 ± 0.573
LysCys: 0.584 ± 0.197
LysAsp: 4.263 ± 0.45
LysGlu: 5.08 ± 0.712
LysPhe: 1.985 ± 0.376
LysGly: 4.321 ± 0.54
LysHis: 1.401 ± 0.348
LysIle: 5.022 ± 0.498
LysLys: 5.314 ± 0.674
LysLeu: 6.248 ± 0.636
LysMet: 1.401 ± 0.313
LysAsn: 1.869 ± 0.308
LysPro: 2.394 ± 0.33
LysGln: 2.277 ± 0.405
LysArg: 5.606 ± 0.535
LysSer: 3.854 ± 0.452
LysThr: 3.854 ± 0.477
LysVal: 4.905 ± 0.666
LysTrp: 1.343 ± 0.292
LysTyr: 3.387 ± 0.458
LysXaa: 0.0 ± 0.0
Leu
LeuAla: 7.883 ± 0.826
LeuCys: 0.701 ± 0.207
LeuAsp: 6.423 ± 0.579
LeuGlu: 5.897 ± 0.614
LeuPhe: 2.628 ± 0.332
LeuGly: 4.671 ± 0.445
LeuHis: 1.577 ± 0.325
LeuIle: 4.671 ± 0.454
LeuLys: 5.022 ± 0.513
LeuLeu: 6.131 ± 0.685
LeuMet: 1.869 ± 0.312
LeuAsn: 2.394 ± 0.405
LeuPro: 3.387 ± 0.472
LeuGln: 2.452 ± 0.356
LeuArg: 6.189 ± 0.642
LeuSer: 6.014 ± 0.531
LeuThr: 5.956 ± 0.496
LeuVal: 4.73 ± 0.553
LeuTrp: 0.701 ± 0.234
LeuTyr: 3.27 ± 0.452
LeuXaa: 0.0 ± 0.0
Met
MetAla: 2.219 ± 0.337
MetCys: 0.175 ± 0.095
MetAsp: 1.518 ± 0.261
MetGlu: 1.343 ± 0.288
MetPhe: 0.759 ± 0.168
MetGly: 1.285 ± 0.25
MetHis: 0.467 ± 0.143
MetIle: 1.285 ± 0.258
MetLys: 1.81 ± 0.319
MetLeu: 2.16 ± 0.379
MetMet: 0.584 ± 0.152
MetAsn: 1.46 ± 0.264
MetPro: 1.168 ± 0.289
MetGln: 0.759 ± 0.195
MetArg: 2.394 ± 0.327
MetSer: 2.102 ± 0.323
MetThr: 1.518 ± 0.284
MetVal: 1.168 ± 0.241
MetTrp: 0.292 ± 0.13
MetTyr: 0.759 ± 0.204
MetXaa: 0.0 ± 0.0
Asn
AsnAla: 2.803 ± 0.444
AsnCys: 0.467 ± 0.18
AsnAsp: 1.985 ± 0.348
AsnGlu: 3.854 ± 0.451
AsnPhe: 1.051 ± 0.231
AsnGly: 3.679 ± 0.487
AsnHis: 0.584 ± 0.173
AsnIle: 1.343 ± 0.269
AsnLys: 2.102 ± 0.344
AsnLeu: 2.978 ± 0.415
AsnMet: 0.934 ± 0.231
AsnAsn: 1.46 ± 0.293
AsnPro: 2.044 ± 0.398
AsnGln: 0.876 ± 0.276
AsnArg: 2.92 ± 0.443
AsnSer: 2.044 ± 0.37
AsnThr: 1.81 ± 0.301
AsnVal: 2.861 ± 0.36
AsnTrp: 0.642 ± 0.21
AsnTyr: 1.285 ± 0.264
AsnXaa: 0.0 ± 0.0
Pro
ProAla: 2.803 ± 0.485
ProCys: 0.234 ± 0.119
ProAsp: 2.861 ± 0.366
ProGlu: 3.679 ± 0.445
ProPhe: 0.934 ± 0.238
ProGly: 2.219 ± 0.422
ProHis: 0.35 ± 0.116
ProIle: 2.569 ± 0.355
ProLys: 2.336 ± 0.325
ProLeu: 2.686 ± 0.421
ProMet: 1.109 ± 0.255
ProAsn: 1.869 ± 0.315
ProPro: 0.876 ± 0.281
ProGln: 0.817 ± 0.209
ProArg: 1.869 ± 0.353
ProSer: 1.401 ± 0.229
ProThr: 2.452 ± 0.323
ProVal: 1.927 ± 0.344
ProTrp: 0.35 ± 0.12
ProTyr: 1.752 ± 0.275
ProXaa: 0.0 ± 0.0
Gln
GlnAla: 2.744 ± 0.408
GlnCys: 0.35 ± 0.167
GlnAsp: 1.343 ± 0.358
GlnGlu: 2.102 ± 0.384
GlnPhe: 0.876 ± 0.239
GlnGly: 1.285 ± 0.326
GlnHis: 0.467 ± 0.138
GlnIle: 1.927 ± 0.291
GlnLys: 2.044 ± 0.266
GlnLeu: 3.036 ± 0.374
GlnMet: 0.876 ± 0.243
GlnAsn: 1.343 ± 0.29
GlnPro: 0.526 ± 0.146
GlnGln: 0.993 ± 0.267
GlnArg: 2.336 ± 0.325
GlnSer: 1.168 ± 0.284
GlnThr: 2.277 ± 0.484
GlnVal: 1.46 ± 0.282
GlnTrp: 0.584 ± 0.177
GlnTyr: 1.168 ± 0.256
GlnXaa: 0.0 ± 0.0
Arg
ArgAla: 4.379 ± 0.588
ArgCys: 0.759 ± 0.214
ArgAsp: 3.854 ± 0.456
ArgGlu: 6.423 ± 0.75
ArgPhe: 2.394 ± 0.534
ArgGly: 3.562 ± 0.437
ArgHis: 1.051 ± 0.254
ArgIle: 4.087 ± 0.524
ArgLys: 6.423 ± 0.618
ArgLeu: 6.54 ± 0.653
ArgMet: 1.985 ± 0.323
ArgAsn: 3.971 ± 0.543
ArgPro: 1.985 ± 0.29
ArgGln: 2.452 ± 0.35
ArgArg: 4.379 ± 0.619
ArgSer: 2.569 ± 0.356
ArgThr: 2.861 ± 0.453
ArgVal: 3.328 ± 0.576
ArgTrp: 1.109 ± 0.203
ArgTyr: 2.219 ± 0.345
ArgXaa: 0.0 ± 0.0
Ser
SerAla: 4.263 ± 0.529
SerCys: 0.584 ± 0.239
SerAsp: 3.095 ± 0.382
SerGlu: 4.263 ± 0.514
SerPhe: 2.277 ± 0.283
SerGly: 4.146 ± 0.484
SerHis: 0.934 ± 0.262
SerIle: 3.211 ± 0.427
SerLys: 3.795 ± 0.446
SerLeu: 3.503 ± 0.422
SerMet: 1.285 ± 0.256
SerAsn: 1.693 ± 0.295
SerPro: 2.336 ± 0.36
SerGln: 1.46 ± 0.333
SerArg: 2.803 ± 0.417
SerSer: 3.211 ± 0.406
SerThr: 2.452 ± 0.432
SerVal: 4.087 ± 0.543
SerTrp: 0.817 ± 0.19
SerTyr: 2.452 ± 0.3
SerXaa: 0.0 ± 0.0
Thr
ThrAla: 4.613 ± 0.625
ThrCys: 0.701 ± 0.273
ThrAsp: 4.146 ± 0.42
ThrGlu: 4.438 ± 0.444
ThrPhe: 2.394 ± 0.36
ThrGly: 4.846 ± 0.502
ThrHis: 0.934 ± 0.212
ThrIle: 3.27 ± 0.475
ThrLys: 3.795 ± 0.603
ThrLeu: 3.679 ± 0.386
ThrMet: 1.109 ± 0.28
ThrAsn: 1.401 ± 0.308
ThrPro: 2.16 ± 0.377
ThrGln: 1.927 ± 0.327
ThrArg: 2.92 ± 0.381
ThrSer: 2.628 ± 0.316
ThrThr: 2.219 ± 0.361
ThrVal: 4.263 ± 0.452
ThrTrp: 0.876 ± 0.186
ThrTyr: 2.628 ± 0.413
ThrXaa: 0.0 ± 0.0
Val
ValAla: 5.138 ± 0.601
ValCys: 0.934 ± 0.281
ValAsp: 4.087 ± 0.365
ValGlu: 5.314 ± 0.554
ValPhe: 2.803 ± 0.335
ValGly: 4.263 ± 0.529
ValHis: 1.168 ± 0.282
ValIle: 4.671 ± 0.485
ValLys: 4.905 ± 0.627
ValLeu: 6.014 ± 0.603
ValMet: 1.46 ± 0.276
ValAsn: 2.394 ± 0.375
ValPro: 2.452 ± 0.434
ValGln: 1.869 ± 0.295
ValArg: 4.263 ± 0.464
ValSer: 3.795 ± 0.377
ValThr: 4.146 ± 0.465
ValVal: 4.321 ± 0.522
ValTrp: 0.701 ± 0.205
ValTyr: 2.861 ± 0.479
ValXaa: 0.0 ± 0.0
Trp
TrpAla: 0.934 ± 0.235
TrpCys: 0.0 ± 0.0
TrpAsp: 1.285 ± 0.357
TrpGlu: 1.343 ± 0.299
TrpPhe: 0.467 ± 0.166
TrpGly: 0.701 ± 0.226
TrpHis: 0.526 ± 0.212
TrpIle: 0.934 ± 0.242
TrpLys: 0.817 ± 0.215
TrpLeu: 1.168 ± 0.24
TrpMet: 0.409 ± 0.155
TrpAsn: 0.467 ± 0.156
TrpPro: 0.292 ± 0.133
TrpGln: 0.817 ± 0.19
TrpArg: 0.817 ± 0.222
TrpSer: 0.701 ± 0.216
TrpThr: 0.993 ± 0.269
TrpVal: 1.285 ± 0.268
TrpTrp: 0.117 ± 0.073
TrpTyr: 0.584 ± 0.173
TrpXaa: 0.0 ± 0.0
Tyr
TyrAla: 3.445 ± 0.395
TyrCys: 0.292 ± 0.124
TyrAsp: 3.328 ± 0.482
TyrGlu: 2.92 ± 0.41
TyrPhe: 1.518 ± 0.35
TyrGly: 3.387 ± 0.501
TyrHis: 0.584 ± 0.18
TyrIle: 2.92 ± 0.469
TyrLys: 2.978 ± 0.343
TyrLeu: 3.211 ± 0.403
TyrMet: 0.701 ± 0.209
TyrAsn: 0.759 ± 0.232
TyrPro: 1.401 ± 0.249
TyrGln: 0.993 ± 0.211
TyrArg: 3.095 ± 0.439
TyrSer: 2.336 ± 0.38
TyrThr: 2.219 ± 0.343
TyrVal: 3.387 ± 0.443
TyrTrp: 0.584 ± 0.19
TyrTyr: 1.869 ± 0.346
TyrXaa: 0.0 ± 0.0
Xaa
XaaAla: 0.0 ± 0.0
XaaCys: 0.0 ± 0.0
XaaAsp: 0.0 ± 0.0
XaaGlu: 0.0 ± 0.0
XaaPhe: 0.0 ± 0.0
XaaGly: 0.0 ± 0.0
XaaHis: 0.0 ± 0.0
XaaIle: 0.0 ± 0.0
XaaLys: 0.058 ± 0.059
XaaLeu: 0.0 ± 0.0
XaaMet: 0.0 ± 0.0
XaaAsn: 0.0 ± 0.0
XaaPro: 0.0 ± 0.0
XaaGln: 0.0 ± 0.0
XaaArg: 0.0 ± 0.0
XaaSer: 0.0 ± 0.0
XaaThr: 0.0 ± 0.0
XaaVal: 0.0 ± 0.0
XaaTrp: 0.0 ± 0.0
XaaTyr: 0.0 ± 0.0
XaaXaa: 0.0 ± 0.0
Statistics based on 90 proteins (17127 amino acids)

Note: The error was estimated by bootstrapping (x100) at the protein level.
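A protein-level bootstrap of this kind can be sketched in Python as follows. This is an illustrative assumption about the procedure, not the database's published code: whole proteins are resampled with replacement 100 times, the dipeptide frequency is recomputed for each replicate, and the standard deviation across replicates is taken as the error. The function names, the fixed seed, and the choice of standard deviation as the error measure are hypothetical.

```python
import random
import statistics
from collections import Counter

def dipeptide_freq(proteins, dipeptide):
    """Per-mille frequency of one dipeptide among all adjacent residue pairs."""
    counts = Counter()
    for seq in proteins:
        counts.update(seq[i:i + 2] for i in range(len(seq) - 1))
    total = sum(counts.values())
    return 1000.0 * counts[dipeptide] / total if total else 0.0

def bootstrap_error(proteins, dipeptide, n_rounds=100, seed=0):
    """Standard deviation of the frequency over protein-level bootstrap replicates."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    estimates = []
    for _ in range(n_rounds):
        # Resample whole proteins with replacement, keeping the number of proteins.
        sample = rng.choices(proteins, k=len(proteins))
        estimates.append(dipeptide_freq(sample, dipeptide))
    return statistics.stdev(estimates)
```

Resampling whole proteins rather than individual dipeptides keeps the pairs within each protein together, so the spread reflects variation between proteins.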

The dipeptide statistics above (along with other statistics for this proteome) can be downloaded from this CSV file
See this proteome in: uniprot_link
Proteome-pI is available under Creative Commons Attribution-NoDerivs license, for more details see here

Reference: Kozlowski LP. Proteome-pI 2.0: Proteome Isoelectric Point Database Update. Nucleic Acids Res. 2021, doi: 10.1093/nar/gkab944
Contact: Lukasz P. Kozlowski