Amino acid dipeptide frequency for Streptococcus thermophilus (strain ATCC BAA-250 / LMG 18311)

Apart from single amino acid frequencies, one can also calculate the so-called amino acid dipeptide frequencies.

There are 400 possible dipeptides (441 if Xaa is counted as an additional 21st amino acid representing all non-standard or unknown amino acids). Thus, if dipeptides occurred in proteins uniformly at random, each of the 400 would be present with a frequency of 0.25% (2.5‰). As this is not the case in nature, for better visibility the dipeptides that occur more often than expected are marked in red in the table, and those that are underrepresented are marked in blue.
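The counting described above can be sketched in Python. This is a minimal illustration, not the code used by Proteome-pI: the function names, the one-letter input alphabet, the use of overlapping pairs within each protein, and the toy sequences are all assumptions made for the example.

```python
from collections import Counter

# The 20 standard amino acids as one-letter codes (assumed input alphabet).
STANDARD = "ACDEFGHIKLMNPQRSTVWY"

# Uniform expectation from the text: 1/400 = 0.25% = 2.5 per mille.
EXPECTED_PERMILLE = 1000.0 / (len(STANDARD) ** 2)


def dipeptide_permille(sequences):
    """Return {dipeptide: frequency in per mille} over overlapping residue pairs."""
    counts = Counter()
    for seq in sequences:
        # Each protein is scanned on its own, so no pair spans two proteins.
        for first, second in zip(seq, seq[1:]):
            # Any symbol outside the standard 20 is folded into 'X' (Xaa).
            first = first if first in STANDARD else "X"
            second = second if second in STANDARD else "X"
            counts[first + second] += 1
    total = sum(counts.values())
    if total == 0:
        return {}
    return {pair: 1000.0 * n / total for pair, n in counts.items()}


def classify(freq_permille, expected=EXPECTED_PERMILLE):
    """Label a frequency as over- or underrepresented versus the expectation."""
    if freq_permille > expected:
        return "over"
    if freq_permille < expected:
        return "under"
    return "equal"


# Made-up toy sequences, only to show the call pattern.
toy = ["MALLAK", "MKLLA"]
freqs = dipeptide_permille(toy)
print(classify(5.568))  # AlaAla value from the table
print(classify(0.48))   # AlaCys value from the table
```

With the uniform expectation of 2.5‰, a value such as AlaAla (5.568‰) falls on the overrepresented side and AlaCys (0.48‰) on the underrepresented side.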

All values are given in per mille (‰), so they need to be multiplied by 10⁻³ to obtain fractions.
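As a small worked example of the unit conversion, the sketch below turns a per mille value into a plain fraction and into a percentage. The helper names are hypothetical; the sample value is the LeuLeu entry from the table.

```python
def permille_to_fraction(value_permille):
    """Per mille to plain fraction: multiply by 10**-3."""
    return value_permille * 1e-3


def permille_to_percent(value_permille):
    """Per mille to percent: divide by 10."""
    return value_permille / 10.0


leu_leu = 9.663  # LeuLeu frequency in per mille, taken from the table
print(f"{permille_to_fraction(leu_leu):.6f}")  # fraction of all dipeptides
print(f"{permille_to_percent(leu_leu):.4f}")   # the same value in percent
```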

For more information, see the sequence space article on Wikipedia.

| 1st ↓ / 2nd → | Ala | Cys | Asp | Glu | Phe | Gly | His | Ile | Lys | Leu | Met | Asn | Pro | Gln | Arg | Ser | Thr | Val | Trp | Tyr | Xaa |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Ala | 5.568 ± 0.159 | 0.48 ± 0.033 | 4.209 ± 0.11 | 5.132 ± 0.121 | 3.338 ± 0.089 | 5.13 ± 0.136 | 1.251 ± 0.05 | 6.112 ± 0.138 | 5.52 ± 0.126 | 7.343 ± 0.149 | 2.004 ± 0.066 | 3.141 ± 0.092 | 2.072 ± 0.089 | 2.956 ± 0.091 | 2.923 ± 0.086 | 4.682 ± 0.112 | 4.147 ± 0.107 | 5.455 ± 0.118 | 0.541 ± 0.035 | 2.833 ± 0.079 | 0.0 ± 0.0 |
| Cys | 0.314 ± 0.028 | 0.059 ± 0.013 | 0.317 ± 0.03 | 0.247 ± 0.021 | 0.258 ± 0.021 | 0.471 ± 0.035 | 0.162 ± 0.02 | 0.389 ± 0.029 | 0.264 ± 0.025 | 0.679 ± 0.044 | 0.122 ± 0.016 | 0.244 ± 0.025 | 0.262 ± 0.026 | 0.297 ± 0.028 | 0.223 ± 0.023 | 0.408 ± 0.03 | 0.293 ± 0.028 | 0.321 ± 0.03 | 0.055 ± 0.012 | 0.242 ± 0.021 | 0.0 ± 0.0 |
| Asp | 3.962 ± 0.092 | 0.295 ± 0.024 | 2.951 ± 0.087 | 4.145 ± 0.11 | 3.521 ± 0.098 | 3.903 ± 0.11 | 1.002 ± 0.051 | 4.632 ± 0.103 | 4.468 ± 0.097 | 6.158 ± 0.131 | 1.602 ± 0.061 | 2.755 ± 0.083 | 1.55 ± 0.057 | 1.997 ± 0.073 | 2.196 ± 0.071 | 3.167 ± 0.086 | 2.75 ± 0.072 | 4.019 ± 0.103 | 0.677 ± 0.043 | 2.953 ± 0.085 | 0.0 ± 0.0 |
| Glu | 5.4 ± 0.136 | 0.286 ± 0.026 | 3.995 ± 0.103 | 5.83 ± 0.132 | 2.681 ± 0.082 | 3.663 ± 0.093 | 1.209 ± 0.049 | 5.217 ± 0.122 | 5.544 ± 0.13 | 6.769 ± 0.156 | 1.77 ± 0.073 | 3.916 ± 0.091 | 1.668 ± 0.065 | 2.427 ± 0.074 | 3.274 ± 0.085 | 3.497 ± 0.092 | 3.916 ± 0.089 | 5.114 ± 0.121 | 0.554 ± 0.038 | 1.951 ± 0.075 | 0.0 ± 0.0 |
| Phe | 3.285 ± 0.093 | 0.247 ± 0.022 | 3.268 ± 0.086 | 3.073 ± 0.089 | 2.115 ± 0.085 | 3.434 ± 0.083 | 0.858 ± 0.044 | 3.15 ± 0.097 | 2.709 ± 0.076 | 4.58 ± 0.138 | 1.1 ± 0.051 | 2.085 ± 0.072 | 1.631 ± 0.059 | 1.574 ± 0.061 | 1.628 ± 0.059 | 3.272 ± 0.1 | 2.56 ± 0.075 | 3.218 ± 0.103 | 0.493 ± 0.033 | 1.792 ± 0.071 | 0.0 ± 0.0 |
| Gly | 4.625 ± 0.143 | 0.432 ± 0.034 | 3.589 ± 0.092 | 3.65 ± 0.099 | 3.351 ± 0.092 | 4.333 ± 0.129 | 1.421 ± 0.059 | 5.324 ± 0.115 | 4.861 ± 0.117 | 6.828 ± 0.133 | 1.829 ± 0.064 | 2.897 ± 0.077 | 1.476 ± 0.063 | 2.916 ± 0.074 | 2.866 ± 0.082 | 3.752 ± 0.106 | 3.602 ± 0.085 | 5.038 ± 0.125 | 0.568 ± 0.035 | 2.615 ± 0.07 | 0.0 ± 0.0 |
| His | 1.196 ± 0.051 | 0.127 ± 0.015 | 1.03 ± 0.046 | 1.111 ± 0.049 | 1.074 ± 0.044 | 1.284 ± 0.058 | 0.572 ± 0.039 | 1.386 ± 0.06 | 1.083 ± 0.045 | 2.002 ± 0.068 | 0.487 ± 0.032 | 0.74 ± 0.039 | 0.906 ± 0.043 | 0.805 ± 0.043 | 0.845 ± 0.039 | 1.107 ± 0.049 | 0.926 ± 0.046 | 1.181 ± 0.06 | 0.166 ± 0.019 | 0.902 ± 0.044 | 0.0 ± 0.0 |
| Ile | 6.097 ± 0.135 | 0.522 ± 0.034 | 4.457 ± 0.111 | 4.911 ± 0.116 | 3.584 ± 0.109 | 4.986 ± 0.122 | 1.279 ± 0.052 | 5.499 ± 0.115 | 4.831 ± 0.115 | 7.513 ± 0.134 | 1.849 ± 0.078 | 3.266 ± 0.087 | 2.877 ± 0.079 | 2.475 ± 0.071 | 2.816 ± 0.079 | 5.003 ± 0.113 | 3.992 ± 0.091 | 5.171 ± 0.104 | 0.554 ± 0.033 | 2.633 ± 0.072 | 0.0 ± 0.0 |
| Lys | 5.258 ± 0.117 | 0.229 ± 0.022 | 4.322 ± 0.116 | 5.789 ± 0.121 | 2.17 ± 0.06 | 4.112 ± 0.096 | 1.255 ± 0.053 | 4.831 ± 0.108 | 5.455 ± 0.138 | 5.961 ± 0.107 | 2.039 ± 0.063 | 3.595 ± 0.098 | 2.074 ± 0.07 | 2.576 ± 0.075 | 3.218 ± 0.101 | 4.04 ± 0.095 | 4.08 ± 0.089 | 4.89 ± 0.106 | 0.581 ± 0.039 | 2.316 ± 0.087 | 0.0 ± 0.0 |
| Leu | 8.378 ± 0.136 | 0.544 ± 0.035 | 5.946 ± 0.122 | 6.952 ± 0.137 | 4.409 ± 0.133 | 6.627 ± 0.128 | 1.602 ± 0.055 | 6.924 ± 0.154 | 6.446 ± 0.11 | 9.663 ± 0.183 | 2.338 ± 0.068 | 4.372 ± 0.109 | 3.807 ± 0.09 | 3.102 ± 0.08 | 3.872 ± 0.096 | 7.564 ± 0.141 | 6.16 ± 0.114 | 7.07 ± 0.134 | 0.705 ± 0.05 | 3.163 ± 0.1 | 0.0 ± 0.0 |
| Met | 2.213 ± 0.075 | 0.127 ± 0.017 | 1.443 ± 0.056 | 1.535 ± 0.073 | 0.821 ± 0.035 | 1.753 ± 0.07 | 0.408 ± 0.043 | 1.912 ± 0.08 | 1.842 ± 0.056 | 2.259 ± 0.075 | 0.723 ± 0.046 | 1.194 ± 0.052 | 0.788 ± 0.038 | 0.797 ± 0.041 | 1.07 ± 0.045 | 1.792 ± 0.063 | 2.216 ± 0.077 | 1.716 ± 0.061 | 0.162 ± 0.023 | 0.618 ± 0.035 | 0.0 ± 0.0 |
| Asn | 3.015 ± 0.091 | 0.293 ± 0.029 | 2.423 ± 0.083 | 2.584 ± 0.075 | 2.1 ± 0.071 | 3.309 ± 0.096 | 1.113 ± 0.049 | 3.416 ± 0.09 | 3.049 ± 0.081 | 4.682 ± 0.116 | 1.185 ± 0.056 | 2.082 ± 0.082 | 2.209 ± 0.069 | 2.213 ± 0.064 | 2.185 ± 0.07 | 2.471 ± 0.081 | 2.364 ± 0.079 | 2.901 ± 0.071 | 0.426 ± 0.026 | 1.875 ± 0.067 | 0.0 ± 0.0 |
| Pro | 2.301 ± 0.09 | 0.168 ± 0.018 | 2.041 ± 0.077 | 2.945 ± 0.08 | 1.559 ± 0.062 | 1.921 ± 0.075 | 0.67 ± 0.039 | 2.543 ± 0.092 | 2.039 ± 0.067 | 2.964 ± 0.091 | 0.707 ± 0.043 | 1.683 ± 0.06 | 0.515 ± 0.031 | 1.126 ± 0.051 | 1.135 ± 0.045 | 2.054 ± 0.074 | 1.995 ± 0.068 | 2.6 ± 0.084 | 0.242 ± 0.023 | 1.301 ± 0.057 | 0.0 ± 0.0 |
| Gln | 3.327 ± 0.091 | 0.162 ± 0.019 | 2.019 ± 0.068 | 3.039 ± 0.098 | 1.62 ± 0.058 | 2.366 ± 0.073 | 0.644 ± 0.037 | 2.774 ± 0.081 | 2.681 ± 0.089 | 3.667 ± 0.092 | 0.995 ± 0.041 | 1.574 ± 0.061 | 1.091 ± 0.052 | 1.185 ± 0.061 | 1.419 ± 0.051 | 2.032 ± 0.071 | 2.272 ± 0.079 | 3.049 ± 0.083 | 0.271 ± 0.026 | 1.129 ± 0.051 | 0.0 ± 0.0 |
| Arg | 2.602 ± 0.08 | 0.234 ± 0.022 | 2.451 ± 0.077 | 3.01 ± 0.088 | 2.03 ± 0.067 | 2.395 ± 0.078 | 0.91 ± 0.04 | 2.938 ± 0.074 | 3.008 ± 0.089 | 4.246 ± 0.109 | 1.139 ± 0.052 | 1.829 ± 0.073 | 1.284 ± 0.058 | 1.98 ± 0.084 | 2.15 ± 0.072 | 2.205 ± 0.069 | 1.903 ± 0.065 | 2.956 ± 0.077 | 0.301 ± 0.028 | 1.628 ± 0.059 | 0.0 ± 0.0 |
| Ser | 3.953 ± 0.115 | 0.38 ± 0.032 | 3.739 ± 0.1 | 3.82 ± 0.107 | 3.082 ± 0.092 | 4.51 ± 0.108 | 1.284 ± 0.052 | 4.545 ± 0.106 | 4.093 ± 0.097 | 6.664 ± 0.135 | 1.399 ± 0.062 | 2.809 ± 0.087 | 1.89 ± 0.067 | 2.65 ± 0.071 | 2.541 ± 0.069 | 4.309 ± 0.16 | 3.198 ± 0.098 | 4.248 ± 0.083 | 0.587 ± 0.041 | 2.689 ± 0.084 | 0.0 ± 0.0 |
| Thr | 4.185 ± 0.106 | 0.323 ± 0.029 | 3.207 ± 0.07 | 3.401 ± 0.091 | 2.687 ± 0.076 | 4.099 ± 0.098 | 1.142 ± 0.052 | 4.499 ± 0.092 | 3.613 ± 0.096 | 5.621 ± 0.128 | 1.279 ± 0.05 | 2.436 ± 0.077 | 2.381 ± 0.084 | 1.938 ± 0.068 | 1.975 ± 0.065 | 3.61 ± 0.109 | 3.121 ± 0.088 | 4.342 ± 0.09 | 0.52 ± 0.032 | 2.318 ± 0.063 | 0.0 ± 0.0 |
| Val | 6.132 ± 0.124 | 0.413 ± 0.032 | 4.377 ± 0.111 | 4.691 ± 0.116 | 3.215 ± 0.095 | 4.584 ± 0.094 | 1.192 ± 0.047 | 5.125 ± 0.12 | 4.63 ± 0.101 | 6.92 ± 0.125 | 1.764 ± 0.06 | 3.056 ± 0.08 | 2.517 ± 0.077 | 2.111 ± 0.062 | 2.827 ± 0.08 | 4.746 ± 0.107 | 4.881 ± 0.105 | 5.457 ± 0.111 | 0.515 ± 0.032 | 2.349 ± 0.068 | 0.0 ± 0.0 |
| Trp | 0.48 ± 0.032 | 0.055 ± 0.012 | 0.476 ± 0.034 | 0.506 ± 0.036 | 0.456 ± 0.035 | 0.605 ± 0.036 | 0.196 ± 0.02 | 0.618 ± 0.036 | 0.504 ± 0.035 | 0.941 ± 0.047 | 0.236 ± 0.027 | 0.493 ± 0.034 | 0.162 ± 0.02 | 0.395 ± 0.028 | 0.391 ± 0.033 | 0.517 ± 0.037 | 0.441 ± 0.038 | 0.491 ± 0.032 | 0.107 ± 0.015 | 0.242 ± 0.025 | 0.0 ± 0.0 |
| Tyr | 2.545 ± 0.077 | 0.253 ± 0.023 | 2.445 ± 0.078 | 2.312 ± 0.073 | 2.002 ± 0.079 | 2.488 ± 0.082 | 0.757 ± 0.043 | 2.482 ± 0.08 | 2.15 ± 0.066 | 4.187 ± 0.1 | 0.816 ± 0.042 | 1.729 ± 0.064 | 1.351 ± 0.054 | 1.794 ± 0.076 | 1.646 ± 0.064 | 2.161 ± 0.075 | 1.879 ± 0.069 | 2.264 ± 0.076 | 0.288 ± 0.024 | 1.676 ± 0.059 | 0.0 ± 0.0 |
| Xaa | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 |

Statistics based on 1577 proteins (458119 amino acids)

Note: The error has been estimated by bootstrapping (100 resamples) at the protein level.
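A protein-level bootstrap of this kind could look roughly like the sketch below. It is an illustration under stated assumptions, not the Proteome-pI implementation: it assumes whole proteins are resampled with replacement, that the reported error is the standard deviation of the resampled frequencies, and that pairs are counted as overlapping one-letter residue pairs. The function names and the toy sequences are hypothetical.

```python
import random
import statistics
from collections import Counter


def pair_counts(seq):
    """Count overlapping residue pairs within a single protein sequence."""
    return Counter(a + b for a, b in zip(seq, seq[1:]))


def bootstrap_error(sequences, pair, n_resamples=100, seed=0):
    """Estimate the error (in per mille) of one dipeptide frequency.

    Whole proteins are drawn with replacement, so each resample keeps
    every chosen protein intact (resampling "at the protein level").
    """
    rng = random.Random(seed)
    per_protein = [pair_counts(s) for s in sequences]
    estimates = []
    for _ in range(n_resamples):
        sample = rng.choices(per_protein, k=len(per_protein))
        total = sum(sum(c.values()) for c in sample)
        hits = sum(c[pair] for c in sample)  # a Counter returns 0 for absent pairs
        estimates.append(1000.0 * hits / total if total else 0.0)
    # Assumption: the spread is summarised as the sample standard deviation.
    return statistics.stdev(estimates)


# Made-up toy proteome, only to show the call pattern.
toy = ["MALLAK", "MKLLA", "MAAAK", "LLLLGG"]
print(bootstrap_error(toy, "LL"))
```

Resampling whole proteins rather than individual residue pairs keeps the pairs from one protein together, so the estimate reflects variation between proteins.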

The dipeptide statistics above (among other statistics for this proteome) can be downloaded as a CSV file.
See this proteome in UniProt.
Proteome-pI is available under a Creative Commons Attribution-NoDerivs license.

Reference: Kozlowski LP. Proteome-pI 2.0: Proteome Isoelectric Point Database Update. Nucleic Acids Res. 2021, doi: 10.1093/nar/gkab944
Contact: Lukasz P. Kozlowski