Amino acid dipeptide frequency for Gordonia phage Denise

Apart from single amino acid frequencies, one can also calculate the so-called amino acid dipeptide frequency.

There are 400 possible dipeptides (441 if we consider Xaa as an additional 21st amino acid representing all non-standard or unknown amino acids). Thus, if dipeptides occurred randomly in proteins, each amino acid dipeptide would be present at a frequency of 0.25% (2.5‰). As this is not the case in nature, for better visibility all dipeptides that are more frequent than expected are marked in red, and those that are underrepresented are marked in blue in the table.
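As a minimal sketch of how such frequencies could be computed, the Python snippet below counts overlapping residue pairs within each protein and reports them in per mille. The function name `dipeptide_permille`, the toy sequences, and the choice of overlapping windows that never span two proteins are assumptions made for illustration; the page does not state the exact counting procedure used by the database.

```python
from collections import Counter


def dipeptide_permille(proteins):
    """Return dipeptide frequencies in per mille for a list of protein sequences.

    Assumption: overlapping pairs are counted inside each protein only,
    so no pair spans the boundary between two proteins.
    """
    counts = Counter()
    for seq in proteins:
        for i in range(len(seq) - 1):
            counts[seq[i:i + 2]] += 1
    total = sum(counts.values())
    if total == 0:
        return {}
    return {pair: 1000.0 * n / total for pair, n in counts.items()}


# Hypothetical toy input (one-letter codes), not the phage proteome.
toy_proteins = ["MAAG", "AAL"]
toy_freqs = dipeptide_permille(toy_proteins)
for pair, value in sorted(toy_freqs.items()):
    print(f"{pair}: {value:.1f}")
```

With the toy input there are five pairs in total, two of which are AA, so AA comes out at 400‰; real proteomes give far smaller values because the counts are spread over hundreds of distinct dipeptides.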

All values are given in per mille (‰), so they need to be multiplied by 10^-3 to obtain fractions.

For more information, see the sequence space article on Wikipedia.
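The unit conversion and the comparison with the uniform expectation can be sketched as follows. The two example values are taken from the table (AlaAla and CysCys); the helper names and the plain greater-than or less-than comparison against 2.5‰ are assumptions, since the page does not state the exact rule behind the red and blue marking.

```python
# Uniform expectation for 20 x 20 equally likely dipeptides, in per mille.
UNIFORM_PERMILLE = 1000.0 / 400


def permille_to_fraction(value):
    """Convert a per mille value to a plain fraction (multiply by 10^-3)."""
    return value * 1e-3


def classify(value, expected=UNIFORM_PERMILLE):
    """Label a per mille value relative to the expectation (assumed rule)."""
    if value > expected:
        return "over"
    if value < expected:
        return "under"
    return "expected"


# Values copied from the table on this page.
examples = {"AlaAla": 16.245, "CysCys": 0.14}
for name, value in examples.items():
    print(name, permille_to_fraction(value), classify(value))
```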

Ala Cys Asp Glu Phe Gly His Ile Lys Leu Met Asn Pro Gln Arg Ser Thr Val Trp Tyr Xaa
Ala
AlaAla: 16.245 ± 1.385
AlaCys: 0.77 ± 0.219
AlaAsp: 7.983 ± 0.868
AlaGlu: 8.823 ± 1.16
AlaPhe: 2.241 ± 0.371
AlaGly: 9.173 ± 0.807
AlaHis: 2.311 ± 0.453
AlaIle: 5.112 ± 0.651
AlaLys: 3.921 ± 0.474
AlaLeu: 8.333 ± 0.834
AlaMet: 3.991 ± 0.535
AlaAsn: 3.991 ± 0.54
AlaPro: 5.252 ± 0.418
AlaGln: 4.131 ± 0.532
AlaArg: 7.913 ± 0.739
AlaSer: 5.952 ± 0.578
AlaThr: 7.072 ± 1.002
AlaVal: 8.263 ± 0.702
AlaTrp: 1.891 ± 0.315
AlaTyr: 2.451 ± 0.37
AlaXaa: 0.0 ± 0.0
Cys
CysAla: 1.12 ± 0.339
CysCys: 0.14 ± 0.09
CysAsp: 0.77 ± 0.243
CysGlu: 0.42 ± 0.206
CysPhe: 0.07 ± 0.061
CysGly: 1.12 ± 0.377
CysHis: 0.28 ± 0.132
CysIle: 0.21 ± 0.11
CysLys: 0.42 ± 0.153
CysLeu: 0.56 ± 0.211
CysMet: 0.07 ± 0.063
CysAsn: 0.28 ± 0.142
CysPro: 0.77 ± 0.299
CysGln: 0.49 ± 0.197
CysArg: 0.84 ± 0.226
CysSer: 0.7 ± 0.241
CysThr: 0.35 ± 0.161
CysVal: 0.56 ± 0.192
CysTrp: 0.21 ± 0.118
CysTyr: 0.35 ± 0.199
CysXaa: 0.0 ± 0.0
Asp
AspAla: 7.212 ± 0.899
AspCys: 0.63 ± 0.249
AspAsp: 4.972 ± 0.6
AspGlu: 5.042 ± 0.585
AspPhe: 1.821 ± 0.304
AspGly: 6.862 ± 0.598
AspHis: 1.681 ± 0.337
AspIle: 2.801 ± 0.51
AspLys: 1.611 ± 0.246
AspLeu: 6.932 ± 0.686
AspMet: 1.751 ± 0.36
AspAsn: 1.611 ± 0.329
AspPro: 5.042 ± 0.538
AspGln: 2.241 ± 0.477
AspArg: 5.672 ± 0.636
AspSer: 3.151 ± 0.364
AspThr: 3.781 ± 0.579
AspVal: 4.201 ± 0.542
AspTrp: 1.611 ± 0.289
AspTyr: 1.33 ± 0.239
AspXaa: 0.0 ± 0.0
Glu
GluAla: 7.422 ± 0.768
GluCys: 0.84 ± 0.321
GluAsp: 3.361 ± 0.468
GluGlu: 3.711 ± 0.511
GluPhe: 2.101 ± 0.428
GluGly: 4.271 ± 0.469
GluHis: 2.311 ± 0.516
GluIle: 3.711 ± 0.559
GluLys: 1.821 ± 0.393
GluLeu: 5.602 ± 0.666
GluMet: 1.26 ± 0.311
GluAsn: 1.26 ± 0.276
GluPro: 3.011 ± 0.418
GluGln: 3.571 ± 0.765
GluArg: 5.392 ± 0.751
GluSer: 3.921 ± 0.524
GluThr: 3.431 ± 0.462
GluVal: 3.921 ± 0.78
GluTrp: 1.12 ± 0.351
GluTyr: 1.611 ± 0.306
GluXaa: 0.0 ± 0.0
Phe
PheAla: 2.661 ± 0.519
PheCys: 0.28 ± 0.135
PheAsp: 2.451 ± 0.487
PheGlu: 1.33 ± 0.274
PhePhe: 0.84 ± 0.227
PheGly: 3.151 ± 0.489
PheHis: 0.49 ± 0.177
PheIle: 1.19 ± 0.233
PheLys: 0.42 ± 0.195
PheLeu: 1.47 ± 0.307
PheMet: 0.28 ± 0.108
PheAsn: 0.91 ± 0.217
PhePro: 0.98 ± 0.291
PheGln: 0.77 ± 0.247
PheArg: 1.47 ± 0.268
PheSer: 1.541 ± 0.334
PheThr: 1.751 ± 0.31
PheVal: 2.031 ± 0.352
PheTrp: 0.91 ± 0.295
PheTyr: 0.21 ± 0.119
PheXaa: 0.0 ± 0.0
Gly
GlyAla: 7.142 ± 0.754
GlyCys: 0.77 ± 0.285
GlyAsp: 5.672 ± 0.649
GlyGlu: 5.532 ± 0.604
GlyPhe: 2.241 ± 0.36
GlyGly: 9.313 ± 1.479
GlyHis: 1.681 ± 0.358
GlyIle: 3.851 ± 0.529
GlyLys: 2.731 ± 0.441
GlyLeu: 7.422 ± 0.802
GlyMet: 2.311 ± 0.388
GlyAsn: 2.591 ± 0.445
GlyPro: 4.341 ± 0.628
GlyGln: 3.081 ± 0.493
GlyArg: 6.652 ± 0.614
GlySer: 4.552 ± 0.759
GlyThr: 5.882 ± 0.799
GlyVal: 6.582 ± 0.585
GlyTrp: 1.47 ± 0.367
GlyTyr: 1.891 ± 0.351
GlyXaa: 0.0 ± 0.0
His
HisAla: 1.26 ± 0.267
HisCys: 0.07 ± 0.078
HisAsp: 1.891 ± 0.388
HisGlu: 2.031 ± 0.429
HisPhe: 0.42 ± 0.155
HisGly: 1.821 ± 0.392
HisHis: 0.49 ± 0.168
HisIle: 1.26 ± 0.337
HisLys: 0.49 ± 0.165
HisLeu: 2.381 ± 0.43
HisMet: 0.42 ± 0.185
HisAsn: 0.42 ± 0.175
HisPro: 1.33 ± 0.298
HisGln: 0.63 ± 0.228
HisArg: 1.12 ± 0.253
HisSer: 1.19 ± 0.296
HisThr: 1.681 ± 0.329
HisVal: 1.26 ± 0.27
HisTrp: 0.63 ± 0.212
HisTyr: 0.91 ± 0.26
HisXaa: 0.0 ± 0.0
Ile
IleAla: 6.862 ± 0.725
IleCys: 0.14 ± 0.093
IleAsp: 4.692 ± 0.612
IleGlu: 3.291 ± 0.518
IlePhe: 1.19 ± 0.34
IleGly: 4.201 ± 0.465
IleHis: 1.19 ± 0.309
IleIle: 1.4 ± 0.248
IleLys: 1.26 ± 0.264
IleLeu: 2.451 ± 0.341
IleMet: 0.49 ± 0.175
IleAsn: 1.47 ± 0.29
IlePro: 2.731 ± 0.413
IleGln: 1.821 ± 0.341
IleArg: 3.921 ± 0.401
IleSer: 2.661 ± 0.464
IleThr: 3.011 ± 0.55
IleVal: 4.271 ± 0.536
IleTrp: 0.84 ± 0.233
IleTyr: 0.98 ± 0.262
IleXaa: 0.0 ± 0.0
Lys
LysAla: 3.571 ± 0.525
LysCys: 0.35 ± 0.204
LysAsp: 1.821 ± 0.371
LysGlu: 1.541 ± 0.341
LysPhe: 0.42 ± 0.195
LysGly: 1.821 ± 0.325
LysHis: 0.84 ± 0.229
LysIle: 1.05 ± 0.244
LysLys: 0.98 ± 0.252
LysLeu: 2.591 ± 0.457
LysMet: 0.49 ± 0.194
LysAsn: 0.63 ± 0.21
LysPro: 1.821 ± 0.35
LysGln: 0.98 ± 0.278
LysArg: 2.451 ± 0.437
LysSer: 2.381 ± 0.415
LysThr: 1.751 ± 0.399
LysVal: 2.311 ± 0.453
LysTrp: 0.7 ± 0.229
LysTyr: 0.49 ± 0.167
LysXaa: 0.0 ± 0.0
Leu
LeuAla: 9.593 ± 1.09
LeuCys: 0.98 ± 0.29
LeuAsp: 5.322 ± 0.671
LeuGlu: 4.131 ± 0.416
LeuPhe: 2.101 ± 0.453
LeuGly: 7.282 ± 0.75
LeuHis: 1.26 ± 0.305
LeuIle: 3.991 ± 0.468
LeuLys: 1.891 ± 0.353
LeuLeu: 5.532 ± 0.754
LeuMet: 1.12 ± 0.266
LeuAsn: 2.101 ± 0.36
LeuPro: 3.991 ± 0.524
LeuGln: 1.541 ± 0.299
LeuArg: 5.392 ± 0.641
LeuSer: 4.972 ± 0.624
LeuThr: 6.232 ± 0.769
LeuVal: 6.372 ± 0.787
LeuTrp: 1.4 ± 0.266
LeuTyr: 1.891 ± 0.346
LeuXaa: 0.0 ± 0.0
Met
MetAla: 3.221 ± 0.627
MetCys: 0.21 ± 0.12
MetAsp: 0.77 ± 0.193
MetGlu: 0.98 ± 0.344
MetPhe: 0.77 ± 0.203
MetGly: 1.33 ± 0.335
MetHis: 0.63 ± 0.2
MetIle: 0.91 ± 0.217
MetLys: 0.56 ± 0.189
MetLeu: 1.4 ± 0.354
MetMet: 0.21 ± 0.099
MetAsn: 0.35 ± 0.133
MetPro: 1.821 ± 0.322
MetGln: 0.84 ± 0.249
MetArg: 1.12 ± 0.228
MetSer: 1.611 ± 0.353
MetThr: 3.011 ± 0.414
MetVal: 1.33 ± 0.261
MetTrp: 0.42 ± 0.196
MetTyr: 0.21 ± 0.105
MetXaa: 0.0 ± 0.0
Asn
AsnAla: 2.871 ± 0.538
AsnCys: 0.07 ± 0.059
AsnAsp: 1.681 ± 0.47
AsnGlu: 1.541 ± 0.303
AsnPhe: 0.56 ± 0.193
AsnGly: 3.011 ± 0.368
AsnHis: 0.49 ± 0.184
AsnIle: 1.05 ± 0.302
AsnLys: 0.77 ± 0.232
AsnLeu: 2.521 ± 0.498
AsnMet: 0.49 ± 0.229
AsnAsn: 0.49 ± 0.167
AsnPro: 3.151 ± 0.386
AsnGln: 1.33 ± 0.257
AsnArg: 1.4 ± 0.314
AsnSer: 1.681 ± 0.317
AsnThr: 1.891 ± 0.341
AsnVal: 0.98 ± 0.248
AsnTrp: 0.56 ± 0.179
AsnTyr: 0.77 ± 0.272
AsnXaa: 0.0 ± 0.0
Pro
ProAla: 5.812 ± 0.689
ProCys: 0.63 ± 0.244
ProAsp: 5.532 ± 0.706
ProGlu: 4.411 ± 0.645
ProPhe: 0.98 ± 0.238
ProGly: 3.991 ± 0.593
ProHis: 0.84 ± 0.225
ProIle: 2.591 ± 0.487
ProLys: 2.451 ± 0.415
ProLeu: 3.361 ± 0.383
ProMet: 1.19 ± 0.269
ProAsn: 2.591 ± 0.455
ProPro: 3.081 ± 0.438
ProGln: 1.4 ± 0.314
ProArg: 3.431 ± 0.591
ProSer: 2.521 ± 0.417
ProThr: 2.871 ± 0.382
ProVal: 4.341 ± 0.518
ProTrp: 1.611 ± 0.383
ProTyr: 1.4 ± 0.311
ProXaa: 0.0 ± 0.0
Gln
GlnAla: 4.201 ± 0.813
GlnCys: 0.49 ± 0.173
GlnAsp: 1.12 ± 0.242
GlnGlu: 1.541 ± 0.285
GlnPhe: 0.91 ± 0.259
GlnGly: 1.961 ± 0.375
GlnHis: 0.91 ± 0.214
GlnIle: 2.241 ± 0.351
GlnLys: 1.26 ± 0.357
GlnLeu: 2.871 ± 0.449
GlnMet: 1.751 ± 0.33
GlnAsn: 0.77 ± 0.234
GlnPro: 1.891 ± 0.422
GlnGln: 1.961 ± 0.461
GlnArg: 3.641 ± 0.636
GlnSer: 1.961 ± 0.238
GlnThr: 1.751 ± 0.342
GlnVal: 2.381 ± 0.419
GlnTrp: 0.91 ± 0.24
GlnTyr: 0.98 ± 0.328
GlnXaa: 0.0 ± 0.0
Arg
ArgAla: 8.403 ± 0.603
ArgCys: 0.91 ± 0.253
ArgAsp: 4.061 ± 0.69
ArgGlu: 4.341 ± 0.588
ArgPhe: 2.521 ± 0.375
ArgGly: 5.042 ± 0.587
ArgHis: 1.751 ± 0.377
ArgIle: 5.182 ± 0.763
ArgLys: 2.801 ± 0.442
ArgLeu: 6.442 ± 0.731
ArgMet: 1.33 ± 0.275
ArgAsn: 2.381 ± 0.345
ArgPro: 3.781 ± 0.546
ArgGln: 3.221 ± 0.423
ArgArg: 6.582 ± 0.998
ArgSer: 3.431 ± 0.417
ArgThr: 4.832 ± 0.725
ArgVal: 5.812 ± 0.607
ArgTrp: 1.541 ± 0.321
ArgTyr: 1.33 ± 0.256
ArgXaa: 0.0 ± 0.0
Ser
SerAla: 6.722 ± 0.611
SerCys: 0.35 ± 0.149
SerAsp: 4.832 ± 0.598
SerGlu: 3.431 ± 0.503
SerPhe: 1.611 ± 0.316
SerGly: 6.092 ± 0.711
SerHis: 1.05 ± 0.32
SerIle: 3.221 ± 0.447
SerLys: 1.19 ± 0.274
SerLeu: 3.221 ± 0.478
SerMet: 1.33 ± 0.289
SerAsn: 0.84 ± 0.217
SerPro: 2.941 ± 0.469
SerGln: 1.751 ± 0.398
SerArg: 3.991 ± 0.69
SerSer: 2.801 ± 0.393
SerThr: 3.781 ± 0.833
SerVal: 3.781 ± 0.564
SerTrp: 1.821 ± 0.374
SerTyr: 1.751 ± 0.337
SerXaa: 0.0 ± 0.0
Thr
ThrAla: 7.843 ± 0.919
ThrCys: 0.63 ± 0.211
ThrAsp: 3.711 ± 0.512
ThrGlu: 3.571 ± 0.583
ThrPhe: 1.681 ± 0.359
ThrGly: 6.652 ± 0.811
ThrHis: 1.26 ± 0.329
ThrIle: 4.481 ± 0.719
ThrLys: 1.19 ± 0.278
ThrLeu: 5.182 ± 0.564
ThrMet: 1.05 ± 0.304
ThrAsn: 1.751 ± 0.46
ThrPro: 3.851 ± 0.501
ThrGln: 2.241 ± 0.372
ThrArg: 4.131 ± 0.591
ThrSer: 3.991 ± 0.699
ThrThr: 4.832 ± 0.763
ThrVal: 5.042 ± 0.666
ThrTrp: 1.961 ± 0.393
ThrTyr: 1.26 ± 0.272
ThrXaa: 0.0 ± 0.0
Val
ValAla: 8.613 ± 0.932
ValCys: 1.12 ± 0.315
ValAsp: 6.372 ± 0.624
ValGlu: 5.392 ± 0.532
ValPhe: 1.751 ± 0.261
ValGly: 5.252 ± 0.724
ValHis: 0.84 ± 0.233
ValIle: 3.081 ± 0.478
ValLys: 1.751 ± 0.355
ValLeu: 4.902 ± 0.71
ValMet: 1.19 ± 0.308
ValAsn: 2.101 ± 0.385
ValPro: 3.081 ± 0.497
ValGln: 1.681 ± 0.393
ValArg: 6.582 ± 0.566
ValSer: 4.622 ± 0.63
ValThr: 5.462 ± 0.794
ValVal: 5.462 ± 0.642
ValTrp: 0.91 ± 0.213
ValTyr: 1.611 ± 0.444
ValXaa: 0.0 ± 0.0
Trp
TrpAla: 2.241 ± 0.377
TrpCys: 0.14 ± 0.099
TrpAsp: 1.751 ± 0.323
TrpGlu: 1.4 ± 0.258
TrpPhe: 0.49 ± 0.202
TrpGly: 1.47 ± 0.314
TrpHis: 0.7 ± 0.243
TrpIle: 0.91 ± 0.25
TrpLys: 0.91 ± 0.245
TrpLeu: 1.891 ± 0.38
TrpMet: 0.7 ± 0.246
TrpAsn: 0.28 ± 0.127
TrpPro: 1.05 ± 0.269
TrpGln: 0.7 ± 0.16
TrpArg: 2.591 ± 0.481
TrpSer: 1.4 ± 0.323
TrpThr: 0.98 ± 0.23
TrpVal: 0.91 ± 0.241
TrpTrp: 0.7 ± 0.223
TrpTyr: 0.28 ± 0.126
TrpXaa: 0.0 ± 0.0
Tyr
TyrAla: 2.801 ± 0.424
TyrCys: 0.21 ± 0.1
TyrAsp: 1.19 ± 0.293
TyrGlu: 1.19 ± 0.265
TyrPhe: 0.42 ± 0.142
TyrGly: 1.891 ± 0.516
TyrHis: 0.7 ± 0.248
TyrIle: 0.63 ± 0.204
TyrLys: 0.63 ± 0.235
TyrLeu: 2.031 ± 0.45
TyrMet: 0.21 ± 0.105
TyrAsn: 0.49 ± 0.227
TyrPro: 1.12 ± 0.337
TyrGln: 1.12 ± 0.266
TyrArg: 1.4 ± 0.284
TyrSer: 1.47 ± 0.259
TyrThr: 1.961 ± 0.407
TyrVal: 1.961 ± 0.328
TyrTrp: 0.28 ± 0.135
TyrTyr: 0.28 ± 0.12
TyrXaa: 0.0 ± 0.0
Xaa
XaaAla: 0.0 ± 0.0
XaaCys: 0.0 ± 0.0
XaaAsp: 0.0 ± 0.0
XaaGlu: 0.0 ± 0.0
XaaPhe: 0.0 ± 0.0
XaaGly: 0.0 ± 0.0
XaaHis: 0.0 ± 0.0
XaaIle: 0.0 ± 0.0
XaaLys: 0.0 ± 0.0
XaaLeu: 0.0 ± 0.0
XaaMet: 0.0 ± 0.0
XaaAsn: 0.0 ± 0.0
XaaPro: 0.0 ± 0.0
XaaGln: 0.0 ± 0.0
XaaArg: 0.0 ± 0.0
XaaSer: 0.0 ± 0.0
XaaThr: 0.0 ± 0.0
XaaVal: 0.0 ± 0.0
XaaTrp: 0.0 ± 0.0
XaaTyr: 0.0 ± 0.0
XaaXaa: 0.0 ± 0.0
Statistics based on 78 proteins (14282 amino acids)

Note: The error was estimated by bootstrapping (x100) at the protein level.

The dipeptide statistics above (along with other statistics for this proteome) can be downloaded from this CSV file
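A protein-level bootstrap of this kind could be sketched as follows: whole proteins are resampled with replacement, the dipeptide frequency is recomputed for each resample, and the spread of those estimates is reported. Reading "x100" as 100 resampling rounds, using the sample standard deviation as the error, and the names `bootstrap_error`, `n_rounds` and `seed` are all assumptions; the page does not describe the procedure in more detail.

```python
import random
from collections import Counter
from statistics import stdev


def bootstrap_error(proteins, pair, n_rounds=100, seed=0):
    """Estimate the error (in per mille) of one dipeptide frequency.

    Assumption: proteins are resampled with replacement as whole units,
    and the error is the standard deviation over the resampled estimates.
    """
    rng = random.Random(seed)
    # Pre-compute overlapping dipeptide counts once per protein.
    per_protein = [
        Counter(seq[i:i + 2] for i in range(len(seq) - 1)) for seq in proteins
    ]
    estimates = []
    for _ in range(n_rounds):
        sample = [rng.choice(per_protein) for _ in per_protein]
        total = sum(sum(c.values()) for c in sample)
        hits = sum(c[pair] for c in sample)
        estimates.append(1000.0 * hits / total if total else 0.0)
    return stdev(estimates)


# Hypothetical toy sequences (one-letter codes), not the phage proteome.
toy_proteins = ["MAAG", "AAL", "GGK", "MKAA"]
print(bootstrap_error(toy_proteins, "AA"))
```

Resampling whole proteins rather than individual residues keeps each protein's internal composition intact, so the error reflects variation between proteins.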
See this proteome in: uniprot_link
Proteome-pI is available under Creative Commons Attribution-NoDerivs license, for more details see here

Reference: Kozlowski LP. Proteome-pI 2.0: Proteome Isoelectric Point Database Update. Nucleic Acids Res. 2021, doi: 10.1093/nar/gkab944
Contact: Lukasz P. Kozlowski