Amino acid dipeptide frequency for Streptococcus phage SOCP

Apart from single amino acid frequencies, one can also calculate the so-called amino acid dipeptide frequency.

There are 400 possible dipeptides (441 if we count Xaa as an additional, 21st amino acid representing all non-standard or unknown amino acids). Thus, if dipeptides occurred at random in proteins, each amino acid dipeptide should be present at a frequency of 0.25% (1/400, i.e. 2.5‰). Since this is not the case in nature, for better visibility all dipeptides that are more frequent than expected are marked in red, and those that are underrepresented are marked in blue in the table.
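The calculation described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the code used to build this page: the function name `dipeptide_per_mille`, the toy sequences, and the choice to normalise by the total number of overlapping within-protein pairs are all hypothetical, and the database may normalise differently. The sketch uses one-letter codes, whereas the table uses three-letter codes.

```python
from collections import Counter
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # 20 standard residues, one-letter codes
ALPHABET = AMINO_ACIDS + "X"          # X stands for Xaa (non-standard/unknown)

# Uniform expectation from the text: 1/400 = 0.25 % = 2.5 per mille.
UNIFORM_PER_MILLE = 1000.0 / 400


def dipeptide_per_mille(proteins):
    """Return {dipeptide: frequency in per mille} for all 441 dipeptides.

    Assumption: pairs are counted as overlapping windows within each
    protein and normalised by the total number of such pairs.
    """
    counts = Counter()
    for seq in proteins:
        # Map anything outside the 20 standard residues to X (Xaa).
        seq = "".join(aa if aa in AMINO_ACIDS else "X" for aa in seq.upper())
        counts.update(seq[i:i + 2] for i in range(len(seq) - 1))
    total = sum(counts.values())
    return {
        a + b: (1000.0 * counts[a + b] / total if total else 0.0)
        for a, b in product(ALPHABET, repeat=2)
    }


# Hypothetical toy input, not taken from the phage proteome.
toy_proteins = ["MKLLA", "AKLL"]
freqs = dipeptide_per_mille(toy_proteins)

# Dipeptides above the uniform expectation (the ones the page would mark red,
# assuming the uniform 2.5 per mille is the reference value).
over_represented = sorted(d for d, f in freqs.items() if f > UNIFORM_PER_MILLE)
print(over_represented)
```

With only two short toy sequences every observed dipeptide is far above 2.5‰, so the comparison becomes meaningful only for a whole proteome.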

All values are presented in per mille (‰), so they need to be multiplied by 10⁻³ to obtain fractions.
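As a small worked example of the unit conversion, the sketch below turns two values from the table (LeuLys at 8.782‰ and AlaAla at 0.338‰) into fractions and compares them with the uniform expectation of 1/400. The helper names `per_mille_to_fraction` and `fold_vs_uniform` are hypothetical, and using the uniform value as the reference is an assumption based on the description above.

```python
def per_mille_to_fraction(value_per_mille):
    """Convert a per mille value to a plain fraction (multiply by 10^-3)."""
    return value_per_mille * 1e-3


def fold_vs_uniform(value_per_mille, n_dipeptides=400):
    """Ratio of an observed frequency to the uniform expectation 1/n."""
    return per_mille_to_fraction(value_per_mille) * n_dipeptides


# Values copied from the table on this page.
for name, value in [("LeuLys", 8.782), ("AlaAla", 0.338)]:
    print(name, per_mille_to_fraction(value), round(fold_vs_uniform(value), 2))
```

Under this reading LeuLys occurs about 3.5 times more often than the uniform expectation, while AlaAla is strongly underrepresented.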

For more information, see the sequence space article on Wikipedia.

Ala Cys Asp Glu Phe Gly His Ile Lys Leu Met Asn Pro Gln Arg Ser Thr Val Trp Tyr Xaa
Ala
AlaAla: 0.338 ± 0.277
AlaCys: 0.169 ± 0.168
AlaAsp: 4.729 ± 0.742
AlaGlu: 5.067 ± 0.906
AlaPhe: 3.547 ± 1.157
AlaGly: 3.716 ± 0.784
AlaHis: 0.676 ± 0.28
AlaIle: 3.04 ± 0.645
AlaLys: 4.222 ± 0.85
AlaLeu: 7.6 ± 1.393
AlaMet: 1.182 ± 0.542
AlaAsn: 5.236 ± 1.077
AlaPro: 1.013 ± 0.467
AlaGln: 2.702 ± 0.755
AlaArg: 1.858 ± 0.432
AlaSer: 4.898 ± 1.137
AlaThr: 4.391 ± 1.115
AlaVal: 2.364 ± 0.659
AlaTrp: 1.351 ± 0.476
AlaTyr: 3.04 ± 0.578
AlaXaa: 0.0 ± 0.0
Cys
CysAla: 0.169 ± 0.165
CysCys: 0.0 ± 0.0
CysAsp: 0.676 ± 0.286
CysGlu: 0.169 ± 0.184
CysPhe: 0.338 ± 0.231
CysGly: 0.169 ± 0.181
CysHis: 0.169 ± 0.151
CysIle: 0.844 ± 0.357
CysLys: 0.169 ± 0.172
CysLeu: 1.013 ± 0.524
CysMet: 0.0 ± 0.0
CysAsn: 0.338 ± 0.215
CysPro: 0.0 ± 0.0
CysGln: 0.0 ± 0.0
CysArg: 0.169 ± 0.145
CysSer: 0.676 ± 0.312
CysThr: 0.169 ± 0.154
CysVal: 0.0 ± 0.0
CysTrp: 0.169 ± 0.19
CysTyr: 1.351 ± 0.494
CysXaa: 0.0 ± 0.0
Asp
AspAla: 3.209 ± 0.49
AspCys: 0.338 ± 0.22
AspAsp: 3.884 ± 1.133
AspGlu: 4.56 ± 0.746
AspPhe: 3.884 ± 0.954
AspGly: 4.898 ± 0.703
AspHis: 0.338 ± 0.228
AspIle: 3.716 ± 0.572
AspLys: 6.418 ± 1.125
AspLeu: 5.236 ± 0.841
AspMet: 1.351 ± 0.486
AspAsn: 4.391 ± 0.997
AspPro: 1.182 ± 0.374
AspGln: 1.182 ± 0.508
AspArg: 1.52 ± 0.41
AspSer: 2.871 ± 0.611
AspThr: 3.884 ± 0.843
AspVal: 3.209 ± 0.546
AspTrp: 0.844 ± 0.344
AspTyr: 4.222 ± 0.682
AspXaa: 0.0 ± 0.0
Glu
GluAla: 3.716 ± 0.648
GluCys: 0.507 ± 0.283
GluAsp: 3.884 ± 0.739
GluGlu: 6.418 ± 1.168
GluPhe: 4.053 ± 0.795
GluGly: 3.378 ± 0.652
GluHis: 0.676 ± 0.399
GluIle: 5.404 ± 0.942
GluLys: 5.236 ± 0.857
GluLeu: 6.249 ± 1.413
GluMet: 1.013 ± 0.485
GluAsn: 6.418 ± 0.923
GluPro: 1.013 ± 0.442
GluGln: 2.702 ± 0.61
GluArg: 4.56 ± 1.196
GluSer: 3.884 ± 1.193
GluThr: 5.067 ± 0.977
GluVal: 6.249 ± 0.997
GluTrp: 0.676 ± 0.371
GluTyr: 2.364 ± 0.535
GluXaa: 0.0 ± 0.0
Phe
PheAla: 3.378 ± 0.622
PheCys: 0.507 ± 0.256
PheAsp: 3.547 ± 0.791
PheGlu: 4.053 ± 0.941
PhePhe: 2.533 ± 0.531
PheGly: 2.702 ± 0.664
PheHis: 0.676 ± 0.332
PheIle: 4.222 ± 1.134
PheLys: 5.404 ± 0.728
PheLeu: 3.884 ± 0.876
PheMet: 1.351 ± 0.468
PheAsn: 3.378 ± 0.842
PhePro: 1.689 ± 0.65
PheGln: 2.702 ± 0.602
PheArg: 1.689 ± 0.5
PheSer: 2.871 ± 0.861
PheThr: 3.378 ± 0.626
PheVal: 3.378 ± 0.709
PheTrp: 1.013 ± 0.512
PheTyr: 2.871 ± 0.849
PheXaa: 0.0 ± 0.0
Gly
GlyAla: 2.702 ± 0.763
GlyCys: 0.507 ± 0.287
GlyAsp: 3.547 ± 0.731
GlyGlu: 2.364 ± 0.389
GlyPhe: 2.702 ± 0.632
GlyGly: 3.884 ± 1.117
GlyHis: 0.676 ± 0.318
GlyIle: 3.716 ± 0.771
GlyLys: 4.053 ± 0.84
GlyLeu: 4.053 ± 0.782
GlyMet: 1.689 ± 0.504
GlyAsn: 4.053 ± 1.034
GlyPro: 0.169 ± 0.156
GlyGln: 3.378 ± 1.366
GlyArg: 3.209 ± 0.698
GlySer: 4.053 ± 1.434
GlyThr: 2.702 ± 0.523
GlyVal: 3.884 ± 0.618
GlyTrp: 1.351 ± 0.673
GlyTyr: 3.884 ± 0.86
GlyXaa: 0.0 ± 0.0
His
HisAla: 0.676 ± 0.386
HisCys: 0.0 ± 0.0
HisAsp: 0.507 ± 0.399
HisGlu: 0.844 ± 0.326
HisPhe: 1.013 ± 0.311
HisGly: 0.507 ± 0.289
HisHis: 0.507 ± 0.327
HisIle: 1.182 ± 0.406
HisLys: 1.013 ± 0.365
HisLeu: 0.844 ± 0.364
HisMet: 0.507 ± 0.21
HisAsn: 0.676 ± 0.263
HisPro: 0.169 ± 0.158
HisGln: 0.507 ± 0.27
HisArg: 0.0 ± 0.0
HisSer: 1.351 ± 0.52
HisThr: 0.844 ± 0.344
HisVal: 1.182 ± 0.435
HisTrp: 0.338 ± 0.226
HisTyr: 2.027 ± 0.58
HisXaa: 0.0 ± 0.0
Ile
IleAla: 4.391 ± 0.909
IleCys: 0.507 ± 0.284
IleAsp: 4.729 ± 0.947
IleGlu: 6.08 ± 1.108
IlePhe: 4.053 ± 0.678
IleGly: 4.222 ± 0.967
IleHis: 1.351 ± 0.49
IleIle: 4.56 ± 0.691
IleLys: 5.911 ± 0.64
IleLeu: 5.067 ± 0.938
IleMet: 2.196 ± 0.621
IleAsn: 4.729 ± 0.566
IlePro: 2.871 ± 0.626
IleGln: 2.702 ± 0.629
IleArg: 2.364 ± 0.592
IleSer: 4.391 ± 0.928
IleThr: 4.053 ± 0.756
IleVal: 2.702 ± 0.602
IleTrp: 0.507 ± 0.282
IleTyr: 2.533 ± 0.619
IleXaa: 0.0 ± 0.0
Lys
LysAla: 5.742 ± 1.195
LysCys: 0.507 ± 0.252
LysAsp: 3.378 ± 0.682
LysGlu: 5.404 ± 0.983
LysPhe: 3.378 ± 0.879
LysGly: 4.56 ± 1.015
LysHis: 1.689 ± 0.442
LysIle: 6.756 ± 0.961
LysLys: 5.911 ± 0.982
LysLeu: 5.236 ± 1.139
LysMet: 1.858 ± 0.468
LysAsn: 5.911 ± 1.046
LysPro: 2.702 ± 0.675
LysGln: 4.56 ± 0.91
LysArg: 3.04 ± 0.784
LysSer: 6.249 ± 0.961
LysThr: 3.04 ± 0.683
LysVal: 2.871 ± 0.891
LysTrp: 1.013 ± 0.406
LysTyr: 3.209 ± 0.901
LysXaa: 0.0 ± 0.0
Leu
LeuAla: 4.56 ± 0.999
LeuCys: 0.169 ± 0.146
LeuAsp: 5.911 ± 0.907
LeuGlu: 7.6 ± 1.469
LeuPhe: 4.391 ± 0.849
LeuGly: 1.858 ± 0.72
LeuHis: 1.351 ± 0.541
LeuIle: 3.547 ± 0.748
LeuLys: 8.782 ± 1.809
LeuLeu: 4.898 ± 0.912
LeuMet: 0.507 ± 0.27
LeuAsn: 7.6 ± 0.737
LeuPro: 1.182 ± 0.433
LeuGln: 2.871 ± 0.689
LeuArg: 2.871 ± 0.625
LeuSer: 5.742 ± 0.966
LeuThr: 5.911 ± 1.089
LeuVal: 4.729 ± 0.776
LeuTrp: 0.676 ± 0.384
LeuTyr: 5.236 ± 1.047
LeuXaa: 0.0 ± 0.0
Met
MetAla: 2.027 ± 0.566
MetCys: 0.0 ± 0.0
MetAsp: 1.013 ± 0.408
MetGlu: 2.027 ± 0.457
MetPhe: 0.507 ± 0.262
MetGly: 1.182 ± 0.391
MetHis: 0.169 ± 0.201
MetIle: 1.182 ± 0.417
MetLys: 2.027 ± 0.54
MetLeu: 1.351 ± 0.494
MetMet: 0.338 ± 0.242
MetAsn: 2.027 ± 0.509
MetPro: 0.338 ± 0.246
MetGln: 0.844 ± 0.469
MetArg: 0.676 ± 0.286
MetSer: 1.351 ± 0.492
MetThr: 2.364 ± 0.602
MetVal: 1.182 ± 0.5
MetTrp: 0.0 ± 0.0
MetTyr: 1.013 ± 0.434
MetXaa: 0.0 ± 0.0
Asn
AsnAla: 6.418 ± 1.27
AsnCys: 0.0 ± 0.0
AsnAsp: 5.236 ± 0.7
AsnGlu: 5.067 ± 0.909
AsnPhe: 4.222 ± 0.778
AsnGly: 4.898 ± 0.891
AsnHis: 1.351 ± 0.413
AsnIle: 4.222 ± 0.763
AsnLys: 4.391 ± 0.873
AsnLeu: 5.573 ± 0.992
AsnMet: 2.027 ± 0.63
AsnAsn: 5.236 ± 1.06
AsnPro: 3.716 ± 0.757
AsnGln: 3.378 ± 0.733
AsnArg: 2.364 ± 0.53
AsnSer: 3.209 ± 0.864
AsnThr: 3.716 ± 0.818
AsnVal: 3.716 ± 0.72
AsnTrp: 0.507 ± 0.238
AsnTyr: 3.04 ± 0.8
AsnXaa: 0.0 ± 0.0
Pro
ProAla: 1.52 ± 0.618
ProCys: 0.338 ± 0.245
ProAsp: 2.196 ± 0.659
ProGlu: 1.52 ± 0.455
ProPhe: 1.858 ± 0.569
ProGly: 0.0 ± 0.0
ProHis: 0.676 ± 0.307
ProIle: 2.027 ± 0.574
ProLys: 1.52 ± 0.527
ProLeu: 1.858 ± 0.536
ProMet: 0.169 ± 0.168
ProAsn: 2.364 ± 0.651
ProPro: 0.676 ± 0.244
ProGln: 0.507 ± 0.248
ProArg: 0.676 ± 0.235
ProSer: 2.027 ± 0.598
ProThr: 1.182 ± 0.462
ProVal: 2.196 ± 0.73
ProTrp: 0.338 ± 0.381
ProTyr: 1.351 ± 0.493
ProXaa: 0.0 ± 0.0
Gln
GlnAla: 3.04 ± 0.669
GlnCys: 0.338 ± 0.203
GlnAsp: 2.702 ± 0.787
GlnGlu: 2.533 ± 0.57
GlnPhe: 2.027 ± 0.432
GlnGly: 2.196 ± 0.536
GlnHis: 0.338 ± 0.23
GlnIle: 2.871 ± 0.582
GlnLys: 3.04 ± 0.603
GlnLeu: 3.209 ± 0.886
GlnMet: 1.182 ± 0.443
GlnAsn: 1.351 ± 0.49
GlnPro: 2.027 ± 0.55
GlnGln: 3.209 ± 0.734
GlnArg: 1.182 ± 0.477
GlnSer: 2.702 ± 0.939
GlnThr: 1.182 ± 0.352
GlnVal: 2.364 ± 0.549
GlnTrp: 0.507 ± 0.244
GlnTyr: 1.858 ± 0.471
GlnXaa: 0.0 ± 0.0
Arg
ArgAla: 2.364 ± 0.605
ArgCys: 0.0 ± 0.0
ArgAsp: 1.52 ± 0.701
ArgGlu: 1.52 ± 0.496
ArgPhe: 2.871 ± 0.532
ArgGly: 2.533 ± 0.425
ArgHis: 0.338 ± 0.221
ArgIle: 3.04 ± 0.877
ArgLys: 2.702 ± 0.784
ArgLeu: 3.209 ± 0.818
ArgMet: 0.338 ± 0.259
ArgAsn: 2.533 ± 0.69
ArgPro: 0.844 ± 0.322
ArgGln: 0.844 ± 0.385
ArgArg: 2.364 ± 0.904
ArgSer: 1.689 ± 0.672
ArgThr: 3.378 ± 0.767
ArgVal: 2.702 ± 0.765
ArgTrp: 0.507 ± 0.336
ArgTyr: 1.858 ± 0.606
ArgXaa: 0.0 ± 0.0
Ser
SerAla: 3.884 ± 0.671
SerCys: 0.507 ± 0.274
SerAsp: 3.378 ± 0.917
SerGlu: 3.716 ± 0.544
SerPhe: 3.209 ± 0.804
SerGly: 5.404 ± 1.537
SerHis: 1.013 ± 0.422
SerIle: 4.053 ± 0.573
SerLys: 3.547 ± 0.773
SerLeu: 5.067 ± 0.975
SerMet: 1.858 ± 0.435
SerAsn: 4.391 ± 0.753
SerPro: 1.013 ± 0.421
SerGln: 2.027 ± 0.643
SerArg: 2.533 ± 0.77
SerSer: 3.884 ± 0.926
SerThr: 4.222 ± 1.133
SerVal: 4.729 ± 1.002
SerTrp: 0.844 ± 0.269
SerTyr: 3.716 ± 0.778
SerXaa: 0.0 ± 0.0
Thr
ThrAla: 4.56 ± 0.739
ThrCys: 0.338 ± 0.241
ThrAsp: 2.702 ± 0.596
ThrGlu: 4.391 ± 0.878
ThrPhe: 4.391 ± 1.038
ThrGly: 4.729 ± 0.816
ThrHis: 0.844 ± 0.338
ThrIle: 4.898 ± 0.88
ThrLys: 3.209 ± 0.788
ThrLeu: 5.573 ± 1.188
ThrMet: 0.676 ± 0.293
ThrAsn: 3.547 ± 0.863
ThrPro: 2.702 ± 0.737
ThrGln: 1.858 ± 0.518
ThrArg: 2.196 ± 0.878
ThrSer: 3.547 ± 0.573
ThrThr: 4.56 ± 0.94
ThrVal: 3.04 ± 0.713
ThrTrp: 0.507 ± 0.268
ThrTyr: 3.209 ± 0.705
ThrXaa: 0.0 ± 0.0
Val
ValAla: 4.56 ± 0.691
ValCys: 0.844 ± 0.345
ValAsp: 3.547 ± 0.633
ValGlu: 5.067 ± 0.851
ValPhe: 2.027 ± 0.632
ValGly: 3.04 ± 0.682
ValHis: 1.013 ± 0.402
ValIle: 5.067 ± 0.866
ValLys: 4.391 ± 0.923
ValLeu: 3.04 ± 0.703
ValMet: 1.689 ± 0.447
ValAsn: 3.884 ± 0.842
ValPro: 1.013 ± 0.385
ValGln: 1.351 ± 0.494
ValArg: 1.52 ± 0.431
ValSer: 3.884 ± 0.755
ValThr: 4.053 ± 0.833
ValVal: 2.871 ± 0.74
ValTrp: 1.013 ± 0.501
ValTyr: 2.364 ± 0.5
ValXaa: 0.0 ± 0.0
Trp
TrpAla: 1.013 ± 0.433
TrpCys: 0.0 ± 0.0
TrpAsp: 0.507 ± 0.317
TrpGlu: 0.676 ± 0.255
TrpPhe: 1.182 ± 0.46
TrpGly: 0.676 ± 0.315
TrpHis: 0.0 ± 0.0
TrpIle: 1.182 ± 0.355
TrpLys: 1.013 ± 0.473
TrpLeu: 1.182 ± 0.552
TrpMet: 0.0 ± 0.0
TrpAsn: 0.676 ± 0.337
TrpPro: 0.0 ± 0.0
TrpGln: 0.507 ± 0.398
TrpArg: 0.0 ± 0.0
TrpSer: 0.507 ± 0.309
TrpThr: 1.351 ± 0.46
TrpVal: 0.676 ± 0.404
TrpTrp: 0.676 ± 0.39
TrpTyr: 1.351 ± 0.844
TrpXaa: 0.0 ± 0.0
Tyr
TyrAla: 2.871 ± 0.784
TyrCys: 1.013 ± 0.429
TyrAsp: 3.547 ± 0.64
TyrGlu: 4.391 ± 0.882
TyrPhe: 3.04 ± 1.052
TyrGly: 2.027 ± 0.588
TyrHis: 0.844 ± 0.394
TyrIle: 4.391 ± 0.73
TyrLys: 4.222 ± 0.92
TyrLeu: 6.249 ± 1.22
TyrMet: 1.52 ± 0.459
TyrAsn: 3.547 ± 0.748
TyrPro: 0.676 ± 0.33
TyrGln: 2.027 ± 0.547
TyrArg: 2.364 ± 0.469
TyrSer: 3.209 ± 0.648
TyrThr: 2.027 ± 0.427
TyrVal: 2.196 ± 0.768
TyrTrp: 0.338 ± 0.228
TyrTyr: 3.547 ± 1.124
TyrXaa: 0.0 ± 0.0
Xaa
XaaAla: 0.0 ± 0.0
XaaCys: 0.0 ± 0.0
XaaAsp: 0.0 ± 0.0
XaaGlu: 0.0 ± 0.0
XaaPhe: 0.0 ± 0.0
XaaGly: 0.0 ± 0.0
XaaHis: 0.0 ± 0.0
XaaIle: 0.0 ± 0.0
XaaLys: 0.0 ± 0.0
XaaLeu: 0.0 ± 0.0
XaaMet: 0.0 ± 0.0
XaaAsn: 0.0 ± 0.0
XaaPro: 0.0 ± 0.0
XaaGln: 0.0 ± 0.0
XaaArg: 0.0 ± 0.0
XaaSer: 0.0 ± 0.0
XaaThr: 0.0 ± 0.0
XaaVal: 0.0 ± 0.0
XaaTrp: 0.0 ± 0.0
XaaTyr: 0.0 ± 0.0
XaaXaa: 0.0 ± 0.0
Statistics based on 27 proteins (5922 amino acids)

Note: The error was estimated by bootstrapping (100 replicates) at the protein level.
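One way to read "bootstrapping at the protein level" is to resample whole proteins with replacement, recompute the dipeptide frequency for each resampled set, and report the spread of those estimates. The sketch below follows that reading; it is an assumption, not the database's documented procedure. The function names, the use of the sample standard deviation as the error, and the toy sequences are all hypothetical.

```python
import random
import statistics
from collections import Counter


def dipeptide_frequency(proteins, dipeptide):
    """Frequency of one dipeptide in per mille over overlapping pairs."""
    counts = Counter()
    for seq in proteins:
        counts.update(seq[i:i + 2] for i in range(len(seq) - 1))
    total = sum(counts.values())
    return 1000.0 * counts[dipeptide] / total if total else 0.0


def bootstrap_error(proteins, dipeptide, n_replicates=100, seed=0):
    """Standard deviation of the frequency over protein-level resamples."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    estimates = []
    for _ in range(n_replicates):
        # Draw as many proteins as in the original set, with replacement.
        sample = rng.choices(proteins, k=len(proteins))
        estimates.append(dipeptide_frequency(sample, dipeptide))
    return statistics.stdev(estimates)


# Hypothetical toy proteome, not the 27 phage proteins.
toy_proteins = ["MKLLA", "AKLL", "MLKKA", "LLLK", "MAKA"]
print(dipeptide_frequency(toy_proteins, "LL"), bootstrap_error(toy_proteins, "LL"))
```

Resampling whole proteins rather than individual residues keeps each sequence intact, so the error reflects how much the estimate depends on which proteins happen to be in the proteome.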

The dipeptide statistics above (among other statistics for this proteome) can be downloaded from this CSV file
See this proteome in: uniprot_link
Proteome-pI is available under the Creative Commons Attribution-NoDerivs license; for more details see here

Reference: Kozlowski LP. Proteome-pI 2.0: Proteome Isoelectric Point Database Update. Nucleic Acids Res. 2021, doi: 10.1093/nar/gkab944
Contact: Lukasz P. Kozlowski