Amino acid dipeptide frequency for Prevotella sp. CAG:891

Apart from single amino acid frequencies, one can also calculate the so-called amino acid dipeptide frequency.

There are 400 possible dipeptides (441 if Xaa is counted as an additional 21st amino acid representing all non-standard or unknown amino acids). Thus, if dipeptides occurred at random with equal probability, each of the 400 would be expected at a frequency of 0.25% (2.5‰). Since this is not the case in nature, for better visibility all dipeptides that occur more often than expected are marked in red, and those that are underrepresented are marked in blue in the table.
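As a rough illustration of how such frequencies and the uniform expectation could be computed, here is a minimal Python sketch. It assumes overlapping dipeptide windows within each protein and one-letter sequence codes; the function names and the toy sequences are hypothetical and are not taken from Proteome-pI.

```python
from collections import Counter

# The 20 standard amino acids in one-letter code (the table uses three-letter codes).
STANDARD = "ACDEFGHIKLMNPQRSTVWY"


def dipeptide_frequencies(proteins):
    """Return per-mille frequencies of dipeptides pooled over all proteins."""
    counts = Counter()
    for seq in proteins:
        # Overlapping windows of length 2; a dipeptide never spans two proteins.
        for i in range(len(seq) - 1):
            a, b = seq[i], seq[i + 1]
            # Assumption: any letter outside the 20 standard ones is pooled as X (Xaa).
            a = a if a in STANDARD else "X"
            b = b if b in STANDARD else "X"
            counts[a + b] += 1
    total = sum(counts.values())
    if total == 0:
        return {}
    return {dp: 1000.0 * n / total for dp, n in counts.items()}


def uniform_expectation(n_letters=20):
    """Per-mille frequency of each dipeptide if all were equally likely."""
    return 1000.0 / n_letters ** 2


# Toy input, not real proteome data.
toy = ["MKLLA", "MALLK"]
freqs = dipeptide_frequencies(toy)
expected = uniform_expectation()  # 2.5 per mille for 400 dipeptides
overrepresented = {dp for dp, f in freqs.items() if f > expected}
```

With 21 letters (including Xaa), `uniform_expectation(21)` gives 1000/441, roughly 2.27‰, instead of 2.5‰.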

All values are given in per mille (‰) and therefore need to be multiplied by 10^-3 to obtain fractions.
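The unit conversion can be sketched in a few lines of Python. The helper names are hypothetical; the AlaAla value is the one listed in the table, and 2.5‰ is the uniform expectation for 400 dipeptides.

```python
def per_mille_to_fraction(value):
    """Convert a per-mille value to a plain fraction (multiply by 10^-3)."""
    return value * 1e-3


def per_mille_to_percent(value):
    """Convert a per-mille value to percent (1 per mille = 0.1 percent)."""
    return value * 0.1


ala_ala = 7.451                               # AlaAla frequency in per mille
fraction = per_mille_to_fraction(ala_ala)     # about 0.007451
ratio = ala_ala / 2.5                         # fold change versus the uniform expectation
```

Here `ratio` is close to 3, meaning AlaAla occurs about three times as often as a uniform distribution over 400 dipeptides would predict.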

For more information, see the sequence space article on Wikipedia.

Ala Cys Asp Glu Phe Gly His Ile Lys Leu Met Asn Pro Gln Arg Ser Thr Val Trp Tyr Xaa
Ala
AlaAla: 7.451 ± 0.13
AlaCys: 1.16 ± 0.04
AlaAsp: 5.571 ± 0.103
AlaGlu: 5.877 ± 0.112
AlaPhe: 3.435 ± 0.081
AlaGly: 5.184 ± 0.094
AlaHis: 1.718 ± 0.058
AlaIle: 4.374 ± 0.101
AlaLys: 4.335 ± 0.097
AlaLeu: 7.816 ± 0.127
AlaMet: 2.093 ± 0.061
AlaAsn: 3.608 ± 0.08
AlaPro: 2.757 ± 0.069
AlaGln: 3.851 ± 0.074
AlaArg: 3.806 ± 0.094
AlaSer: 4.126 ± 0.079
AlaThr: 4.393 ± 0.076
AlaVal: 5.359 ± 0.086
AlaTrp: 0.874 ± 0.038
AlaTyr: 3.247 ± 0.065
AlaXaa: 0.0 ± 0.0
Cys
CysAla: 1.049 ± 0.043
CysCys: 0.256 ± 0.02
CysAsp: 0.703 ± 0.034
CysGlu: 0.739 ± 0.032
CysPhe: 0.627 ± 0.029
CysGly: 1.158 ± 0.046
CysHis: 0.372 ± 0.025
CysIle: 1.03 ± 0.038
CysLys: 0.738 ± 0.032
CysLeu: 1.238 ± 0.045
CysMet: 0.392 ± 0.027
CysAsn: 0.601 ± 0.03
CysPro: 0.652 ± 0.032
CysGln: 0.453 ± 0.027
CysArg: 0.725 ± 0.029
CysSer: 0.765 ± 0.036
CysThr: 0.749 ± 0.038
CysVal: 0.961 ± 0.041
CysTrp: 0.166 ± 0.017
CysTyr: 0.513 ± 0.027
CysXaa: 0.0 ± 0.0
Asp
AspAla: 4.472 ± 0.095
AspCys: 0.786 ± 0.036
AspAsp: 2.768 ± 0.072
AspGlu: 4.053 ± 0.082
AspPhe: 3.115 ± 0.076
AspGly: 3.927 ± 0.092
AspHis: 0.969 ± 0.043
AspIle: 3.399 ± 0.08
AspLys: 3.365 ± 0.082
AspLeu: 4.859 ± 0.091
AspMet: 1.38 ± 0.044
AspAsn: 2.699 ± 0.069
AspPro: 1.786 ± 0.058
AspGln: 1.263 ± 0.046
AspArg: 2.464 ± 0.06
AspSer: 3.074 ± 0.083
AspThr: 2.901 ± 0.066
AspVal: 3.621 ± 0.072
AspTrp: 0.755 ± 0.034
AspTyr: 2.645 ± 0.063
AspXaa: 0.0 ± 0.0
Glu
GluAla: 5.604 ± 0.109
GluCys: 0.721 ± 0.032
GluAsp: 2.682 ± 0.075
GluGlu: 4.211 ± 0.109
GluPhe: 2.191 ± 0.057
GluGly: 4.315 ± 0.095
GluHis: 1.423 ± 0.044
GluIle: 3.631 ± 0.073
GluLys: 4.156 ± 0.087
GluLeu: 5.963 ± 0.12
GluMet: 1.975 ± 0.051
GluAsn: 3.121 ± 0.073
GluPro: 1.923 ± 0.061
GluGln: 2.828 ± 0.072
GluArg: 3.529 ± 0.093
GluSer: 2.851 ± 0.066
GluThr: 3.293 ± 0.066
GluVal: 4.431 ± 0.085
GluTrp: 0.854 ± 0.034
GluTyr: 2.092 ± 0.061
GluXaa: 0.0 ± 0.0
Phe
PheAla: 3.506 ± 0.082
PheCys: 0.779 ± 0.035
PheAsp: 2.845 ± 0.069
PheGlu: 2.443 ± 0.054
PhePhe: 2.147 ± 0.07
PheGly: 3.218 ± 0.077
PheHis: 0.961 ± 0.037
PheIle: 2.739 ± 0.066
PheLys: 2.369 ± 0.066
PheLeu: 3.88 ± 0.08
PheMet: 1.246 ± 0.041
PheAsn: 2.278 ± 0.062
PhePro: 1.549 ± 0.048
PheGln: 1.296 ± 0.047
PheArg: 2.169 ± 0.062
PheSer: 3.144 ± 0.082
PheThr: 2.942 ± 0.065
PheVal: 3.158 ± 0.074
PheTrp: 0.451 ± 0.027
PheTyr: 1.795 ± 0.056
PheXaa: 0.001 ± 0.001
Gly
GlyAla: 4.838 ± 0.094
GlyCys: 1.025 ± 0.042
GlyAsp: 3.183 ± 0.064
GlyGlu: 3.904 ± 0.083
GlyPhe: 3.189 ± 0.073
GlyGly: 4.774 ± 0.112
GlyHis: 1.467 ± 0.046
GlyIle: 4.765 ± 0.101
GlyLys: 4.919 ± 0.104
GlyLeu: 5.95 ± 0.094
GlyMet: 2.098 ± 0.053
GlyAsn: 3.498 ± 0.083
GlyPro: 1.421 ± 0.05
GlyGln: 2.325 ± 0.078
GlyArg: 3.293 ± 0.065
GlySer: 3.867 ± 0.088
GlyThr: 4.08 ± 0.079
GlyVal: 4.958 ± 0.096
GlyTrp: 0.955 ± 0.041
GlyTyr: 3.043 ± 0.067
GlyXaa: 0.007 ± 0.003
His
HisAla: 1.664 ± 0.058
HisCys: 0.352 ± 0.021
HisAsp: 1.12 ± 0.046
HisGlu: 1.228 ± 0.041
HisPhe: 1.366 ± 0.044
HisGly: 1.394 ± 0.047
HisHis: 0.647 ± 0.035
HisIle: 1.634 ± 0.049
HisLys: 1.059 ± 0.04
HisLeu: 2.156 ± 0.064
HisMet: 0.391 ± 0.023
HisAsn: 1.045 ± 0.035
HisPro: 1.128 ± 0.044
HisGln: 0.718 ± 0.036
HisArg: 1.215 ± 0.047
HisSer: 1.32 ± 0.044
HisThr: 1.411 ± 0.044
HisVal: 1.349 ± 0.047
HisTrp: 0.281 ± 0.021
HisTyr: 0.976 ± 0.04
HisXaa: 0.0 ± 0.0
Ile
IleAla: 4.95 ± 0.101
IleCys: 1.026 ± 0.044
IleAsp: 4.072 ± 0.089
IleGlu: 4.174 ± 0.086
IlePhe: 2.305 ± 0.061
IleGly: 3.985 ± 0.091
IleHis: 1.336 ± 0.054
IleIle: 3.638 ± 0.096
IleLys: 3.465 ± 0.078
IleLeu: 4.811 ± 0.116
IleMet: 1.35 ± 0.047
IleAsn: 2.913 ± 0.062
IlePro: 2.653 ± 0.065
IleGln: 1.957 ± 0.049
IleArg: 2.834 ± 0.06
IleSer: 3.985 ± 0.08
IleThr: 3.53 ± 0.073
IleVal: 4.238 ± 0.088
IleTrp: 0.544 ± 0.029
IleTyr: 2.338 ± 0.056
IleXaa: 0.0 ± 0.0
Lys
LysAla: 5.011 ± 0.103
LysCys: 0.587 ± 0.036
LysAsp: 3.075 ± 0.081
LysGlu: 4.086 ± 0.096
LysPhe: 2.149 ± 0.054
LysGly: 4.119 ± 0.092
LysHis: 1.387 ± 0.038
LysIle: 3.199 ± 0.078
LysLys: 4.077 ± 0.097
LysLeu: 5.346 ± 0.09
LysMet: 1.883 ± 0.048
LysAsn: 3.173 ± 0.079
LysPro: 2.319 ± 0.06
LysGln: 2.943 ± 0.062
LysArg: 3.178 ± 0.07
LysSer: 3.05 ± 0.069
LysThr: 3.117 ± 0.066
LysVal: 4.107 ± 0.087
LysTrp: 0.635 ± 0.028
LysTyr: 2.402 ± 0.071
LysXaa: 0.0 ± 0.0
Leu
LeuAla: 7.083 ± 0.113
LeuCys: 1.566 ± 0.048
LeuAsp: 4.561 ± 0.099
LeuGlu: 4.522 ± 0.102
LeuPhe: 4.244 ± 0.096
LeuGly: 5.79 ± 0.099
LeuHis: 2.314 ± 0.063
LeuIle: 5.183 ± 0.118
LeuLys: 5.737 ± 0.102
LeuLeu: 9.41 ± 0.168
LeuMet: 2.575 ± 0.066
LeuAsn: 4.636 ± 0.086
LeuPro: 4.627 ± 0.086
LeuGln: 4.065 ± 0.088
LeuArg: 5.159 ± 0.095
LeuSer: 6.394 ± 0.114
LeuThr: 6.046 ± 0.102
LeuVal: 5.406 ± 0.106
LeuTrp: 0.988 ± 0.045
LeuTyr: 3.321 ± 0.069
LeuXaa: 0.001 ± 0.001
Met
MetAla: 2.39 ± 0.064
MetCys: 0.271 ± 0.02
MetAsp: 1.299 ± 0.043
MetGlu: 1.46 ± 0.043
MetPhe: 0.999 ± 0.043
MetGly: 1.954 ± 0.052
MetHis: 0.551 ± 0.029
MetIle: 1.329 ± 0.045
MetLys: 2.136 ± 0.048
MetLeu: 2.774 ± 0.069
MetMet: 0.873 ± 0.041
MetAsn: 1.532 ± 0.047
MetPro: 1.4 ± 0.051
MetGln: 1.334 ± 0.037
MetArg: 1.413 ± 0.043
MetSer: 1.516 ± 0.047
MetThr: 1.589 ± 0.047
MetVal: 1.713 ± 0.051
MetTrp: 0.237 ± 0.021
MetTyr: 0.723 ± 0.033
MetXaa: 0.0 ± 0.0
Asn
AsnAla: 3.927 ± 0.082
AsnCys: 0.574 ± 0.031
AsnAsp: 2.615 ± 0.075
AsnGlu: 3.22 ± 0.074
AsnPhe: 2.247 ± 0.057
AsnGly: 3.75 ± 0.092
AsnHis: 1.072 ± 0.038
AsnIle: 3.08 ± 0.069
AsnLys: 2.784 ± 0.076
AsnLeu: 4.485 ± 0.101
AsnMet: 1.323 ± 0.042
AsnAsn: 2.355 ± 0.067
AsnPro: 2.457 ± 0.056
AsnGln: 1.671 ± 0.053
AsnArg: 2.534 ± 0.066
AsnSer: 2.483 ± 0.063
AsnThr: 2.561 ± 0.067
AsnVal: 3.173 ± 0.07
AsnTrp: 0.604 ± 0.031
AsnTyr: 2.143 ± 0.061
AsnXaa: 0.0 ± 0.0
Pro
ProAla: 3.219 ± 0.075
ProCys: 0.463 ± 0.03
ProAsp: 2.587 ± 0.071
ProGlu: 3.297 ± 0.076
ProPhe: 1.917 ± 0.056
ProGly: 2.54 ± 0.064
ProHis: 0.877 ± 0.039
ProIle: 2.179 ± 0.057
ProLys: 1.978 ± 0.053
ProLeu: 3.654 ± 0.076
ProMet: 1.037 ± 0.04
ProAsn: 1.913 ± 0.058
ProPro: 0.901 ± 0.051
ProGln: 1.707 ± 0.051
ProArg: 1.539 ± 0.053
ProSer: 2.139 ± 0.069
ProThr: 2.41 ± 0.064
ProVal: 2.743 ± 0.064
ProTrp: 0.465 ± 0.025
ProTyr: 1.661 ± 0.047
ProXaa: 0.001 ± 0.001
Gln
GlnAla: 3.129 ± 0.065
GlnCys: 0.46 ± 0.025
GlnAsp: 1.367 ± 0.044
GlnGlu: 1.848 ± 0.058
GlnPhe: 1.558 ± 0.055
GlnGly: 2.415 ± 0.067
GlnHis: 1.057 ± 0.038
GlnIle: 2.446 ± 0.059
GlnLys: 2.417 ± 0.056
GlnLeu: 3.92 ± 0.076
GlnMet: 1.245 ± 0.04
GlnAsn: 1.907 ± 0.057
GlnPro: 1.74 ± 0.049
GlnGln: 2.083 ± 0.065
GlnArg: 2.311 ± 0.063
GlnSer: 2.203 ± 0.056
GlnThr: 2.626 ± 0.064
GlnVal: 2.365 ± 0.057
GlnTrp: 0.561 ± 0.031
GlnTyr: 1.441 ± 0.046
GlnXaa: 0.0 ± 0.0
Arg
ArgAla: 3.328 ± 0.067
ArgCys: 0.613 ± 0.032
ArgAsp: 2.282 ± 0.056
ArgGlu: 3.097 ± 0.079
ArgPhe: 2.471 ± 0.063
ArgGly: 2.776 ± 0.076
ArgHis: 1.245 ± 0.046
ArgIle: 3.532 ± 0.078
ArgLys: 3.247 ± 0.081
ArgLeu: 4.889 ± 0.103
ArgMet: 1.691 ± 0.047
ArgAsn: 2.584 ± 0.064
ArgPro: 1.926 ± 0.058
ArgGln: 2.213 ± 0.059
ArgArg: 2.876 ± 0.079
ArgSer: 2.78 ± 0.073
ArgThr: 2.888 ± 0.07
ArgVal: 3.119 ± 0.066
ArgTrp: 0.651 ± 0.03
ArgTyr: 2.236 ± 0.069
ArgXaa: 0.0 ± 0.0
Ser
SerAla: 4.623 ± 0.078
SerCys: 0.76 ± 0.035
SerAsp: 2.933 ± 0.068
SerGlu: 3.29 ± 0.072
SerPhe: 3.026 ± 0.081
SerGly: 4.049 ± 0.077
SerHis: 1.29 ± 0.049
SerIle: 3.732 ± 0.085
SerLys: 3.087 ± 0.087
SerLeu: 5.786 ± 0.115
SerMet: 1.475 ± 0.048
SerAsn: 2.481 ± 0.074
SerPro: 2.135 ± 0.061
SerGln: 2.082 ± 0.058
SerArg: 2.747 ± 0.075
SerSer: 3.431 ± 0.085
SerThr: 3.404 ± 0.09
SerVal: 4.197 ± 0.077
SerTrp: 0.711 ± 0.031
SerTyr: 2.285 ± 0.082
SerXaa: 0.0 ± 0.0
Thr
ThrAla: 5.093 ± 0.099
ThrCys: 0.64 ± 0.033
ThrAsp: 3.82 ± 0.085
ThrGlu: 3.458 ± 0.08
ThrPhe: 2.665 ± 0.061
ThrGly: 4.056 ± 0.073
ThrHis: 1.364 ± 0.046
ThrIle: 3.377 ± 0.068
ThrLys: 2.65 ± 0.068
ThrLeu: 6.142 ± 0.108
ThrMet: 1.15 ± 0.042
ThrAsn: 2.49 ± 0.065
ThrPro: 3.119 ± 0.069
ThrGln: 2.123 ± 0.058
ThrArg: 2.369 ± 0.061
ThrSer: 3.246 ± 0.072
ThrThr: 3.546 ± 0.084
ThrVal: 3.954 ± 0.099
ThrTrp: 0.604 ± 0.032
ThrTyr: 2.41 ± 0.073
ThrXaa: 0.001 ± 0.001
Val
ValAla: 5.508 ± 0.1
ValCys: 1.147 ± 0.046
ValAsp: 3.831 ± 0.084
ValGlu: 3.979 ± 0.091
ValPhe: 2.778 ± 0.072
ValGly: 4.522 ± 0.111
ValHis: 1.194 ± 0.042
ValIle: 3.905 ± 0.079
ValLys: 4.352 ± 0.082
ValLeu: 5.72 ± 0.101
ValMet: 1.907 ± 0.056
ValAsn: 3.377 ± 0.073
ValPro: 2.794 ± 0.06
ValGln: 2.209 ± 0.055
ValArg: 3.471 ± 0.074
ValSer: 4.359 ± 0.08
ValThr: 3.796 ± 0.078
ValVal: 5.041 ± 0.127
ValTrp: 0.819 ± 0.041
ValTyr: 2.578 ± 0.075
ValXaa: 0.001 ± 0.001
Trp
TrpAla: 0.76 ± 0.033
TrpCys: 0.162 ± 0.016
TrpAsp: 0.614 ± 0.033
TrpGlu: 0.588 ± 0.031
TrpPhe: 0.547 ± 0.029
TrpGly: 0.863 ± 0.037
TrpHis: 0.323 ± 0.019
TrpIle: 0.679 ± 0.032
TrpLys: 0.777 ± 0.031
TrpLeu: 1.147 ± 0.042
TrpMet: 0.409 ± 0.025
TrpAsn: 0.726 ± 0.034
TrpPro: 0.308 ± 0.021
TrpGln: 0.64 ± 0.032
TrpArg: 0.644 ± 0.034
TrpSer: 0.625 ± 0.034
TrpThr: 0.651 ± 0.036
TrpVal: 0.694 ± 0.032
TrpTrp: 0.175 ± 0.017
TrpTyr: 0.463 ± 0.029
TrpXaa: 0.0 ± 0.0
Tyr
TyrAla: 3.426 ± 0.078
TyrCys: 0.54 ± 0.029
TyrAsp: 2.373 ± 0.057
TyrGlu: 2.319 ± 0.061
TyrPhe: 1.938 ± 0.05
TyrGly: 2.771 ± 0.067
TyrHis: 0.888 ± 0.034
TyrIle: 2.253 ± 0.059
TyrLys: 2.196 ± 0.058
TyrLeu: 3.679 ± 0.077
TyrMet: 0.998 ± 0.039
TyrAsn: 2.122 ± 0.078
TyrPro: 1.63 ± 0.051
TyrGln: 1.336 ± 0.049
TyrArg: 2.164 ± 0.058
TyrSer: 2.166 ± 0.061
TyrThr: 2.345 ± 0.065
TyrVal: 2.702 ± 0.065
TyrTrp: 0.466 ± 0.029
TyrTyr: 1.929 ± 0.064
TyrXaa: 0.0 ± 0.0
Xaa
XaaAla: 0.003 ± 0.002
XaaCys: 0.0 ± 0.0
XaaAsp: 0.0 ± 0.0
XaaGlu: 0.0 ± 0.0
XaaPhe: 0.0 ± 0.0
XaaGly: 0.0 ± 0.0
XaaHis: 0.0 ± 0.0
XaaIle: 0.0 ± 0.0
XaaLys: 0.001 ± 0.001
XaaLeu: 0.001 ± 0.001
XaaMet: 0.0 ± 0.0
XaaAsn: 0.0 ± 0.0
XaaPro: 0.0 ± 0.0
XaaGln: 0.001 ± 0.001
XaaArg: 0.004 ± 0.002
XaaSer: 0.001 ± 0.001
XaaThr: 0.001 ± 0.001
XaaVal: 0.0 ± 0.0
XaaTrp: 0.0 ± 0.0
XaaTyr: 0.0 ± 0.0
XaaXaa: 0.014 ± 0.005
Statistics based on 2058 proteins (703646 amino acids)

Note: The error was estimated by bootstrapping (x100) at the protein level.
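A protein-level bootstrap of this kind might look like the following Python sketch. It assumes that whole proteins are resampled with replacement and that the reported error is the standard deviation over the 100 replicates; the page does not spell out either detail, and the function names and seed are hypothetical.

```python
import random
import statistics
from collections import Counter


def dipeptide_counts(seq):
    """Count overlapping dipeptides within a single protein sequence."""
    return Counter(seq[i:i + 2] for i in range(len(seq) - 1))


def bootstrap_error(proteins, dipeptide, n_boot=100, seed=0):
    """Return (mean, standard deviation) of a per-mille dipeptide frequency.

    Assumption: each replicate draws len(proteins) proteins with replacement,
    so the resampling unit is the protein, not the individual residue.
    """
    rng = random.Random(seed)
    per_protein = [dipeptide_counts(s) for s in proteins]
    estimates = []
    for _ in range(n_boot):
        sample = rng.choices(per_protein, k=len(per_protein))
        total = sum(sum(c.values()) for c in sample)
        hits = sum(c[dipeptide] for c in sample)
        estimates.append(1000.0 * hits / total if total else 0.0)
    return statistics.mean(estimates), statistics.stdev(estimates)


# Toy input, not real proteome data.
mean_ll, err_ll = bootstrap_error(["MKLLA", "MALLK", "LLLL"], "LL")
```

Resampling whole proteins rather than single residues keeps the dependence between neighbouring positions inside each protein intact, which is presumably why the error is estimated at the protein level.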

The dipeptide statistics above (along with other statistics for this proteome) can be downloaded from this CSV file
See this proteome in: uniprot_link
Proteome-pI is available under Creative Commons Attribution-NoDerivs license, for more details see here

Reference: Kozlowski LP. Proteome-pI 2.0: Proteome Isoelectric Point Database Update. Nucleic Acids Res. 2021, doi: 10.1093/nar/gkab944
Contact: Lukasz P. Kozlowski