Amino acid dipeptide frequency for Gordonia phage SoilAssassin

Apart from single amino acid frequencies, one can also calculate the so-called amino acid dipeptide frequency, that is, how often each ordered pair of adjacent amino acids occurs.
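As a rough illustration of what such a calculation involves, here is a minimal Python sketch. It assumes that overlapping adjacent pairs are counted within each protein and never across protein boundaries, and that frequencies are normalised by the total number of counted pairs; the source does not state either detail. The function names and toy sequences are hypothetical, and one-letter codes are used instead of the three-letter codes in the table.

```python
from collections import Counter

def dipeptide_counts(proteins):
    """Count overlapping adjacent residue pairs within each protein."""
    counts = Counter()
    for seq in proteins:
        # Assumption: pairs are not formed across protein boundaries.
        for i in range(len(seq) - 1):
            counts[seq[i:i + 2]] += 1
    return counts

def dipeptide_frequencies(proteins):
    """Return each dipeptide's share of all counted pairs (fraction 0..1)."""
    counts = dipeptide_counts(proteins)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {pair: n / total for pair, n in counts.items()}

# Made-up sequences for illustration, not taken from the phage proteome.
toy_proteins = ["MAAK", "MAGA"]
freqs = dipeptide_frequencies(toy_proteins)
```

With the toy input, six pairs are counted in total and "MA" accounts for two of them.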

There are 400 possible dipeptides (441 if Xaa is counted as an additional 21st amino acid representing all non-standard or unknown amino acids). Thus, if dipeptides occurred randomly in proteins, each amino acid dipeptide would be present with a frequency of 0.25% (2.5‰). As this is not the case in nature, for better visibility dipeptides that occur more often than expected are marked in red, and those that are underrepresented are marked in blue in the table.
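The arithmetic behind the expected value can be sketched as follows. This is only an illustration: the helper names are hypothetical, and using the uniform expectation as the exact cut-off for the red and blue marking is an assumption, since the source does not say which threshold the table applies.

```python
def expected_per_mille(alphabet_size=20):
    """Uniform expectation for a single dipeptide, in per mille."""
    return 1000.0 / alphabet_size ** 2

def classify(observed_per_mille, expected=None):
    """Label a value relative to the uniform expectation (assumed threshold)."""
    if expected is None:
        expected = expected_per_mille()
    if observed_per_mille > expected:
        return "over"
    if observed_per_mille < expected:
        return "under"
    return "expected"

uniform_20 = expected_per_mille()      # 400 possible dipeptides
uniform_21 = expected_per_mille(21)    # 441 possible dipeptides, with Xaa
ala_ala = classify(17.141)             # AlaAla value from the table
cys_cys = classify(0.128)              # CysCys value from the table
```

Under this assumption AlaAla (17.141‰) falls well above the 2.5‰ expectation and CysCys (0.128‰) well below it.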

All values are given in per mille (‰), so they need to be multiplied by 10⁻³ to obtain fractions.
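For readers who prefer fractions or percentages, the conversion is a one-liner. The helper names below are hypothetical and serve only to illustrate the unit change.

```python
def per_mille_to_fraction(value):
    """Convert a per mille value to a plain fraction (0..1)."""
    return value * 1e-3

def per_mille_to_percent(value):
    """Convert a per mille value to a percentage."""
    return value / 10.0

ala_ala_fraction = per_mille_to_fraction(17.141)  # AlaAla value from the table
expected_percent = per_mille_to_percent(2.5)      # uniform expectation for 400 dipeptides
```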

For more information, see the sequence space article on Wikipedia.

Ala Cys Asp Glu Phe Gly His Ile Lys Leu Met Asn Pro Gln Arg Ser Thr Val Trp Tyr Xaa
Ala
AlaAla: 17.141 ± 2.047
AlaCys: 1.091 ± 0.312
AlaAsp: 9.886 ± 0.896
AlaGlu: 9.63 ± 1.061
AlaPhe: 2.119 ± 0.388
AlaGly: 8.602 ± 0.79
AlaHis: 2.311 ± 0.358
AlaIle: 6.741 ± 0.623
AlaLys: 3.467 ± 0.485
AlaLeu: 8.089 ± 0.667
AlaMet: 3.274 ± 0.485
AlaAsn: 4.558 ± 0.617
AlaPro: 5.328 ± 0.559
AlaGln: 4.622 ± 0.531
AlaArg: 8.281 ± 0.849
AlaSer: 6.291 ± 0.713
AlaThr: 8.667 ± 1.083
AlaVal: 7.768 ± 0.734
AlaTrp: 1.798 ± 0.325
AlaTyr: 2.696 ± 0.35
AlaXaa: 0.0 ± 0.0
Cys
CysAla: 0.835 ± 0.282
CysCys: 0.128 ± 0.09
CysAsp: 1.027 ± 0.385
CysGlu: 0.899 ± 0.303
CysPhe: 0.128 ± 0.088
CysGly: 0.963 ± 0.421
CysHis: 0.193 ± 0.105
CysIle: 0.064 ± 0.062
CysLys: 0.385 ± 0.177
CysLeu: 0.449 ± 0.189
CysMet: 0.128 ± 0.114
CysAsn: 0.128 ± 0.1
CysPro: 0.77 ± 0.266
CysGln: 0.193 ± 0.1
CysArg: 1.091 ± 0.294
CysSer: 0.514 ± 0.193
CysThr: 0.321 ± 0.168
CysVal: 0.706 ± 0.212
CysTrp: 0.321 ± 0.17
CysTyr: 0.449 ± 0.141
CysXaa: 0.0 ± 0.0
Asp
AspAla: 7.768 ± 0.828
AspCys: 0.642 ± 0.233
AspAsp: 5.457 ± 0.71
AspGlu: 4.751 ± 0.499
AspPhe: 1.926 ± 0.369
AspGly: 7.383 ± 0.596
AspHis: 1.412 ± 0.307
AspIle: 2.568 ± 0.384
AspLys: 1.22 ± 0.243
AspLeu: 7.254 ± 0.733
AspMet: 1.284 ± 0.318
AspAsn: 2.183 ± 0.447
AspPro: 4.943 ± 0.497
AspGln: 2.054 ± 0.35
AspArg: 4.558 ± 0.583
AspSer: 2.76 ± 0.429
AspThr: 4.301 ± 0.618
AspVal: 5.072 ± 0.584
AspTrp: 1.348 ± 0.265
AspTyr: 1.412 ± 0.3
AspXaa: 0.0 ± 0.0
Glu
GluAla: 8.025 ± 0.772
GluCys: 0.835 ± 0.309
GluAsp: 2.504 ± 0.395
GluGlu: 2.632 ± 0.552
GluPhe: 2.696 ± 0.464
GluGly: 4.558 ± 0.562
GluHis: 1.412 ± 0.363
GluIle: 3.402 ± 0.465
GluLys: 1.798 ± 0.289
GluLeu: 4.686 ± 0.63
GluMet: 1.477 ± 0.32
GluAsn: 1.412 ± 0.275
GluPro: 3.146 ± 0.432
GluGln: 3.338 ± 0.602
GluArg: 5.393 ± 0.693
GluSer: 3.723 ± 0.586
GluThr: 3.338 ± 0.43
GluVal: 3.723 ± 0.532
GluTrp: 1.027 ± 0.315
GluTyr: 1.412 ± 0.284
GluXaa: 0.0 ± 0.0
Phe
PheAla: 3.21 ± 0.447
PheCys: 0.064 ± 0.078
PheAsp: 2.825 ± 0.447
PheGlu: 1.669 ± 0.339
PhePhe: 0.642 ± 0.198
PheGly: 2.696 ± 0.484
PheHis: 0.321 ± 0.2
PheIle: 0.835 ± 0.208
PheLys: 0.514 ± 0.184
PheLeu: 1.541 ± 0.27
PheMet: 0.385 ± 0.148
PheAsn: 0.514 ± 0.155
PhePro: 1.027 ± 0.261
PheGln: 0.706 ± 0.202
PheArg: 1.541 ± 0.347
PheSer: 1.477 ± 0.325
PheThr: 2.247 ± 0.407
PheVal: 2.119 ± 0.468
PheTrp: 1.284 ± 0.315
PheTyr: 0.449 ± 0.162
PheXaa: 0.0 ± 0.0
Gly
GlyAla: 7.96 ± 0.97
GlyCys: 0.77 ± 0.401
GlyAsp: 5.97 ± 0.693
GlyGlu: 5.714 ± 0.6
GlyPhe: 1.862 ± 0.29
GlyGly: 7.575 ± 1.007
GlyHis: 1.669 ± 0.267
GlyIle: 4.301 ± 0.471
GlyLys: 3.081 ± 0.521
GlyLeu: 5.906 ± 0.699
GlyMet: 1.669 ± 0.338
GlyAsn: 2.696 ± 0.401
GlyPro: 4.558 ± 0.433
GlyGln: 2.889 ± 0.463
GlyArg: 6.997 ± 0.699
GlySer: 4.751 ± 0.687
GlyThr: 5.778 ± 0.832
GlyVal: 6.548 ± 0.661
GlyTrp: 1.027 ± 0.275
GlyTyr: 2.632 ± 0.426
GlyXaa: 0.0 ± 0.0
His
HisAla: 1.862 ± 0.3
HisCys: 0.193 ± 0.13
HisAsp: 1.348 ± 0.301
HisGlu: 1.156 ± 0.315
HisPhe: 0.321 ± 0.15
HisGly: 1.733 ± 0.383
HisHis: 0.578 ± 0.2
HisIle: 0.963 ± 0.23
HisLys: 0.578 ± 0.176
HisLeu: 1.733 ± 0.393
HisMet: 0.064 ± 0.066
HisAsn: 0.514 ± 0.204
HisPro: 1.348 ± 0.291
HisGln: 1.156 ± 0.286
HisArg: 1.284 ± 0.317
HisSer: 0.642 ± 0.213
HisThr: 1.156 ± 0.258
HisVal: 0.835 ± 0.21
HisTrp: 0.449 ± 0.18
HisTyr: 0.642 ± 0.208
HisXaa: 0.0 ± 0.0
Ile
IleAla: 5.778 ± 0.695
IleCys: 0.193 ± 0.12
IleAsp: 4.044 ± 0.489
IleGlu: 3.788 ± 0.518
IlePhe: 1.156 ± 0.262
IleGly: 3.788 ± 0.465
IleHis: 1.027 ± 0.219
IleIle: 1.22 ± 0.293
IleLys: 1.091 ± 0.234
IleLeu: 2.889 ± 0.473
IleMet: 0.578 ± 0.21
IleAsn: 1.348 ± 0.302
IlePro: 2.632 ± 0.362
IleGln: 2.632 ± 0.403
IleArg: 3.659 ± 0.373
IleSer: 2.76 ± 0.432
IleThr: 3.146 ± 0.401
IleVal: 4.237 ± 0.532
IleTrp: 0.385 ± 0.205
IleTyr: 1.412 ± 0.336
IleXaa: 0.0 ± 0.0
Lys
LysAla: 3.788 ± 0.613
LysCys: 0.128 ± 0.08
LysAsp: 1.669 ± 0.304
LysGlu: 1.348 ± 0.329
LysPhe: 0.963 ± 0.305
LysGly: 1.669 ± 0.354
LysHis: 0.449 ± 0.179
LysIle: 1.541 ± 0.291
LysLys: 0.899 ± 0.322
LysLeu: 2.889 ± 0.558
LysMet: 0.514 ± 0.197
LysAsn: 0.963 ± 0.317
LysPro: 1.733 ± 0.323
LysGln: 0.835 ± 0.223
LysArg: 2.375 ± 0.399
LysSer: 1.798 ± 0.339
LysThr: 1.798 ± 0.294
LysVal: 2.696 ± 0.403
LysTrp: 0.321 ± 0.15
LysTyr: 0.706 ± 0.225
LysXaa: 0.0 ± 0.0
Leu
LeuAla: 10.464 ± 0.917
LeuCys: 0.835 ± 0.29
LeuAsp: 6.227 ± 0.655
LeuGlu: 3.916 ± 0.474
LeuPhe: 1.99 ± 0.456
LeuGly: 7.383 ± 0.758
LeuHis: 1.027 ± 0.261
LeuIle: 3.338 ± 0.462
LeuLys: 1.733 ± 0.38
LeuLeu: 5.072 ± 0.635
LeuMet: 1.669 ± 0.333
LeuAsn: 1.99 ± 0.42
LeuPro: 3.788 ± 0.465
LeuGln: 2.247 ± 0.436
LeuArg: 5.521 ± 0.708
LeuSer: 4.751 ± 0.535
LeuThr: 5.521 ± 0.656
LeuVal: 5.007 ± 0.652
LeuTrp: 1.669 ± 0.272
LeuTyr: 1.477 ± 0.273
LeuXaa: 0.0 ± 0.0
Met
MetAla: 2.889 ± 0.602
MetCys: 0.449 ± 0.197
MetAsp: 0.835 ± 0.219
MetGlu: 0.835 ± 0.199
MetPhe: 0.449 ± 0.166
MetGly: 1.477 ± 0.392
MetHis: 0.385 ± 0.157
MetIle: 1.412 ± 0.28
MetLys: 0.578 ± 0.208
MetLeu: 1.605 ± 0.336
MetMet: 0.514 ± 0.186
MetAsn: 0.578 ± 0.229
MetPro: 1.348 ± 0.295
MetGln: 0.449 ± 0.227
MetArg: 1.027 ± 0.222
MetSer: 1.669 ± 0.438
MetThr: 2.311 ± 0.391
MetVal: 1.348 ± 0.308
MetTrp: 0.257 ± 0.129
MetTyr: 0.128 ± 0.08
MetXaa: 0.0 ± 0.0
Asn
AsnAla: 3.017 ± 0.591
AsnCys: 0.193 ± 0.127
AsnAsp: 1.669 ± 0.267
AsnGlu: 1.22 ± 0.324
AsnPhe: 0.899 ± 0.367
AsnGly: 3.338 ± 0.475
AsnHis: 0.77 ± 0.198
AsnIle: 1.412 ± 0.367
AsnLys: 0.706 ± 0.154
AsnLeu: 2.439 ± 0.404
AsnMet: 0.449 ± 0.176
AsnAsn: 0.642 ± 0.182
AsnPro: 1.926 ± 0.423
AsnGln: 1.348 ± 0.287
AsnArg: 2.247 ± 0.46
AsnSer: 1.348 ± 0.276
AsnThr: 1.99 ± 0.467
AsnVal: 1.22 ± 0.28
AsnTrp: 0.642 ± 0.183
AsnTyr: 0.706 ± 0.206
AsnXaa: 0.0 ± 0.0
Pro
ProAla: 7.383 ± 0.899
ProCys: 0.642 ± 0.208
ProAsp: 4.879 ± 0.678
ProGlu: 4.558 ± 0.709
ProPhe: 0.835 ± 0.21
ProGly: 5.328 ± 0.618
ProHis: 1.027 ± 0.25
ProIle: 2.183 ± 0.407
ProLys: 1.798 ± 0.405
ProLeu: 3.21 ± 0.474
ProMet: 1.348 ± 0.279
ProAsn: 2.247 ± 0.444
ProPro: 3.338 ± 0.526
ProGln: 1.477 ± 0.312
ProArg: 4.365 ± 0.723
ProSer: 3.146 ± 0.425
ProThr: 3.916 ± 0.676
ProVal: 3.659 ± 0.515
ProTrp: 1.412 ± 0.337
ProTyr: 1.22 ± 0.259
ProXaa: 0.0 ± 0.0
Gln
GlnAla: 3.659 ± 0.497
GlnCys: 0.514 ± 0.194
GlnAsp: 1.605 ± 0.279
GlnGlu: 1.477 ± 0.321
GlnPhe: 0.963 ± 0.242
GlnGly: 2.375 ± 0.378
GlnHis: 1.027 ± 0.205
GlnIle: 2.311 ± 0.358
GlnLys: 1.605 ± 0.387
GlnLeu: 3.402 ± 0.422
GlnMet: 1.733 ± 0.318
GlnAsn: 0.706 ± 0.174
GlnPro: 2.054 ± 0.282
GlnGln: 1.541 ± 0.379
GlnArg: 3.852 ± 0.692
GlnSer: 1.477 ± 0.265
GlnThr: 2.311 ± 0.442
GlnVal: 3.017 ± 0.526
GlnTrp: 0.578 ± 0.188
GlnTyr: 0.835 ± 0.24
GlnXaa: 0.0 ± 0.0
Arg
ArgAla: 8.859 ± 0.739
ArgCys: 0.578 ± 0.271
ArgAsp: 3.723 ± 0.485
ArgGlu: 4.751 ± 0.681
ArgPhe: 2.439 ± 0.414
ArgGly: 4.751 ± 0.52
ArgHis: 0.963 ± 0.319
ArgIle: 4.173 ± 0.534
ArgLys: 2.953 ± 0.335
ArgLeu: 6.099 ± 0.659
ArgMet: 1.926 ± 0.371
ArgAsn: 2.311 ± 0.323
ArgPro: 4.558 ± 0.729
ArgGln: 2.889 ± 0.406
ArgArg: 6.612 ± 0.767
ArgSer: 4.43 ± 0.572
ArgThr: 4.686 ± 0.681
ArgVal: 5.328 ± 0.587
ArgTrp: 1.798 ± 0.413
ArgTyr: 1.99 ± 0.372
ArgXaa: 0.0 ± 0.0
Ser
SerAla: 7.511 ± 0.792
SerCys: 0.385 ± 0.167
SerAsp: 3.274 ± 0.559
SerGlu: 2.825 ± 0.447
SerPhe: 1.669 ± 0.35
SerGly: 5.521 ± 0.848
SerHis: 0.514 ± 0.215
SerIle: 2.953 ± 0.39
SerLys: 1.541 ± 0.347
SerLeu: 4.558 ± 0.525
SerMet: 0.835 ± 0.221
SerAsn: 1.156 ± 0.27
SerPro: 3.788 ± 0.553
SerGln: 1.926 ± 0.324
SerArg: 3.723 ± 0.447
SerSer: 3.21 ± 0.538
SerThr: 3.98 ± 0.515
SerVal: 3.21 ± 0.434
SerTrp: 1.477 ± 0.318
SerTyr: 1.156 ± 0.271
SerXaa: 0.0 ± 0.0
Thr
ThrAla: 9.694 ± 1.531
ThrCys: 0.706 ± 0.211
ThrAsp: 4.751 ± 0.479
ThrGlu: 2.632 ± 0.442
ThrPhe: 1.412 ± 0.334
ThrGly: 6.548 ± 0.899
ThrHis: 1.156 ± 0.269
ThrIle: 4.044 ± 0.5
ThrLys: 1.798 ± 0.306
ThrLeu: 5.842 ± 0.593
ThrMet: 0.642 ± 0.187
ThrAsn: 1.412 ± 0.367
ThrPro: 4.237 ± 0.527
ThrGln: 2.375 ± 0.351
ThrArg: 4.365 ± 0.538
ThrSer: 4.044 ± 0.644
ThrThr: 5.264 ± 0.781
ThrVal: 5.457 ± 0.599
ThrTrp: 1.477 ± 0.296
ThrTyr: 1.156 ± 0.37
ThrXaa: 0.0 ± 0.0
Val
ValAla: 8.153 ± 0.794
ValCys: 0.77 ± 0.267
ValAsp: 6.356 ± 0.463
ValGlu: 4.109 ± 0.513
ValPhe: 2.439 ± 0.425
ValGly: 5.328 ± 0.655
ValHis: 1.284 ± 0.358
ValIle: 3.146 ± 0.447
ValLys: 1.99 ± 0.348
ValLeu: 4.237 ± 0.572
ValMet: 1.22 ± 0.287
ValAsn: 1.733 ± 0.358
ValPro: 4.943 ± 0.607
ValGln: 2.439 ± 0.486
ValArg: 5.97 ± 0.68
ValSer: 3.916 ± 0.58
ValThr: 5.649 ± 0.858
ValVal: 5.264 ± 0.605
ValTrp: 0.77 ± 0.225
ValTyr: 1.156 ± 0.317
ValXaa: 0.0 ± 0.0
Trp
TrpAla: 1.862 ± 0.304
TrpCys: 0.257 ± 0.134
TrpAsp: 1.091 ± 0.33
TrpGlu: 1.027 ± 0.196
TrpPhe: 0.642 ± 0.212
TrpGly: 1.348 ± 0.276
TrpHis: 0.257 ± 0.128
TrpIle: 0.449 ± 0.186
TrpLys: 0.706 ± 0.276
TrpLeu: 2.183 ± 0.347
TrpMet: 0.449 ± 0.172
TrpAsn: 0.385 ± 0.165
TrpPro: 1.22 ± 0.313
TrpGln: 0.963 ± 0.211
TrpArg: 1.669 ± 0.268
TrpSer: 1.22 ± 0.24
TrpThr: 1.027 ± 0.247
TrpVal: 1.541 ± 0.308
TrpTrp: 0.321 ± 0.133
TrpTyr: 0.257 ± 0.098
TrpXaa: 0.0 ± 0.0
Tyr
TyrAla: 2.504 ± 0.376
TyrCys: 0.321 ± 0.145
TyrAsp: 1.156 ± 0.254
TyrGlu: 1.605 ± 0.344
TyrPhe: 0.514 ± 0.198
TyrGly: 2.054 ± 0.351
TyrHis: 0.77 ± 0.298
TyrIle: 0.706 ± 0.173
TyrLys: 0.642 ± 0.236
TyrLeu: 1.541 ± 0.257
TyrMet: 0.321 ± 0.135
TyrAsn: 0.706 ± 0.187
TyrPro: 1.284 ± 0.306
TyrGln: 0.963 ± 0.252
TyrArg: 1.284 ± 0.27
TyrSer: 1.284 ± 0.196
TyrThr: 1.605 ± 0.283
TyrVal: 2.119 ± 0.377
TyrTrp: 0.514 ± 0.206
TyrTyr: 0.257 ± 0.136
TyrXaa: 0.0 ± 0.0
Xaa
XaaAla: 0.0 ± 0.0
XaaCys: 0.0 ± 0.0
XaaAsp: 0.0 ± 0.0
XaaGlu: 0.0 ± 0.0
XaaPhe: 0.0 ± 0.0
XaaGly: 0.0 ± 0.0
XaaHis: 0.0 ± 0.0
XaaIle: 0.0 ± 0.0
XaaLys: 0.0 ± 0.0
XaaLeu: 0.0 ± 0.0
XaaMet: 0.0 ± 0.0
XaaAsn: 0.0 ± 0.0
XaaPro: 0.0 ± 0.0
XaaGln: 0.0 ± 0.0
XaaArg: 0.0 ± 0.0
XaaSer: 0.0 ± 0.0
XaaThr: 0.0 ± 0.0
XaaVal: 0.0 ± 0.0
XaaTrp: 0.0 ± 0.0
XaaTyr: 0.0 ± 0.0
XaaXaa: 0.0 ± 0.0
Statistics based on 74 proteins (15578 amino acids)

Note: The error has been estimated by bootstrapping (×100) at the protein level.
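One way such an error estimate could be obtained is sketched below. The sketch resamples whole proteins with replacement 100 times, recomputes the per mille frequency of one dipeptide in each replicate, and reports the standard deviation of those estimates. Using the standard deviation as the error measure, the within-protein pair counting, the fixed seed, and all function names are assumptions; the source only states that bootstrapping was done 100 times at the protein level.

```python
import random
import statistics

def pair_per_mille(proteins, pair):
    """Per mille frequency of one dipeptide among all counted pairs."""
    hits = total = 0
    for seq in proteins:
        for i in range(len(seq) - 1):
            total += 1
            if seq[i:i + 2] == pair:
                hits += 1
    return 1000.0 * hits / total if total else 0.0

def bootstrap_error(proteins, pair, replicates=100, seed=0):
    """Standard deviation of the frequency over protein-level resamples."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    estimates = []
    for _ in range(replicates):
        # Draw as many proteins as the proteome has, with replacement.
        sample = rng.choices(proteins, k=len(proteins))
        estimates.append(pair_per_mille(sample, pair))
    return statistics.stdev(estimates)

# Made-up sequences for illustration, not taken from the phage proteome.
toy_proteins = ["MAAK", "MAGA", "MKAA", "MGGA"]
aa_error = bootstrap_error(toy_proteins, "AA")
```

Resampling whole proteins rather than individual residues keeps each protein's internal composition intact, so the spread reflects variation between proteins.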

The above dipeptide statistics (among other statistics for this proteome) can be downloaded from this CSV file
See this proteome in: uniprot_link
Proteome-pI is available under the Creative Commons Attribution-NoDerivs license; for more details see here

Reference: Kozlowski LP. Proteome-pI 2.0: Proteome Isoelectric Point Database Update. Nucleic Acids Res. 2021, doi: 10.1093/nar/gkab944
Contact: Lukasz P. Kozlowski