Amino acid dipeptide frequency for Prevotella sp. CAG:5226

Apart from single amino acid frequencies, one can also calculate the so-called amino acid dipeptide frequency.

There are 400 possible dipeptides (441 if Xaa is counted as an additional 21st amino acid representing all non-standard or unknown amino acids). Thus, if dipeptides occurred randomly in proteins, each one would be present at a frequency of 0.25% (2.5‰). Because this is not the case in nature, for better visibility the dipeptides that occur more often than expected are marked in red in the table, and those that are underrepresented are marked in blue.
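The counting described above can be sketched in a few lines of Python. This is a minimal illustration, not the code used to build this page: the function names, the toy sequences, the use of overlapping pairs within each protein, and the mapping of every non-standard letter to X (Xaa) are all assumptions made for the example.

```python
from collections import Counter

STANDARD = "ACDEFGHIKLMNPQRSTVWY"  # 20 standard amino acids, one-letter codes

# Uniform expectation from the text: 1/400 = 0.25 % = 2.5 per mille.
EXPECTED_PERMILLE = 1000.0 / 400


def dipeptide_permille(sequences):
    """Return {dipeptide: frequency in per mille} over all overlapping pairs."""
    counts = Counter()
    for seq in sequences:
        # Assumption: anything outside the 20 standard letters becomes 'X' (Xaa).
        clean = "".join(c if c in STANDARD else "X" for c in seq.upper())
        # Overlapping pairs within one protein; pairs never span two proteins.
        counts.update(clean[i:i + 2] for i in range(len(clean) - 1))
    total = sum(counts.values())
    if total == 0:
        return {}
    return {pair: 1000.0 * n / total for pair, n in counts.items()}


def classify(freq_permille):
    """Label a frequency relative to the uniform 2.5 per mille expectation."""
    if freq_permille > EXPECTED_PERMILLE:
        return "over"
    if freq_permille < EXPECTED_PERMILLE:
        return "under"
    return "expected"


toy_proteome = ["MAALLK", "MALLA"]  # made-up sequences, for illustration only
freqs = dipeptide_permille(toy_proteome)
```

Applied to values from the table, `classify(8.05)` (AlaAla) gives "over" and `classify(0.283)` (CysCys) gives "under", which corresponds to the red and blue marking described above.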

All values are given in per mille (‰), so they need to be multiplied by 10⁻³ to obtain fractions.
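As a small worked example of the unit conversion, the helper functions below (hypothetical names, written only for this illustration) turn a per mille value into a fraction or a percentage and back.

```python
def permille_to_fraction(value_permille):
    """Per mille -> plain fraction (multiply by 10^-3)."""
    return value_permille * 1e-3


def permille_to_percent(value_permille):
    """Per mille -> percent (divide by 10)."""
    return value_permille / 10.0


def percent_to_permille(value_percent):
    """Percent -> per mille (multiply by 10)."""
    return value_percent * 10.0


ala_ala = 8.05  # AlaAla value from the table, in per mille
ala_ala_fraction = permille_to_fraction(ala_ala)  # about 0.00805
ala_ala_percent = permille_to_percent(ala_ala)    # about 0.805 %
uniform_permille = percent_to_permille(0.25)      # the 0.25 % expectation, in per mille
```

So the AlaAla entry of 8.05‰ corresponds to roughly 0.8% of all dipeptides, compared with the 0.25% expected if all dipeptides were equally likely.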

For more information, see the sequence space article on Wikipedia.

| 1st \ 2nd | Ala | Cys | Asp | Glu | Phe | Gly | His | Ile | Lys | Leu | Met | Asn | Pro | Gln | Arg | Ser | Thr | Val | Trp | Tyr | Xaa |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Ala | 8.05 ± 0.137 | 1.305 ± 0.046 | 5.775 ± 0.093 | 5.483 ± 0.095 | 3.379 ± 0.076 | 5.424 ± 0.095 | 1.84 ± 0.047 | 4.742 ± 0.097 | 4.621 ± 0.094 | 8.107 ± 0.123 | 2.31 ± 0.053 | 3.972 ± 0.085 | 3.039 ± 0.073 | 4.034 ± 0.085 | 3.766 ± 0.077 | 4.589 ± 0.082 | 5.205 ± 0.112 | 5.32 ± 0.099 | 0.951 ± 0.04 | 3.325 ± 0.066 | 0.001 ± 0.001 |
| Cys | 1.099 ± 0.039 | 0.283 ± 0.021 | 0.754 ± 0.032 | 0.705 ± 0.027 | 0.58 ± 0.027 | 1.216 ± 0.043 | 0.463 ± 0.024 | 0.936 ± 0.036 | 0.803 ± 0.032 | 1.197 ± 0.042 | 0.427 ± 0.023 | 0.712 ± 0.03 | 0.656 ± 0.033 | 0.453 ± 0.025 | 0.708 ± 0.032 | 0.877 ± 0.037 | 0.872 ± 0.035 | 0.977 ± 0.04 | 0.186 ± 0.016 | 0.676 ± 0.031 | 0.0 ± 0.0 |
| Asp | 5.018 ± 0.085 | 0.747 ± 0.032 | 3.291 ± 0.073 | 4.034 ± 0.087 | 2.837 ± 0.062 | 4.545 ± 0.1 | 1.023 ± 0.033 | 3.785 ± 0.091 | 3.685 ± 0.069 | 4.577 ± 0.067 | 1.621 ± 0.043 | 3.11 ± 0.061 | 1.771 ± 0.05 | 1.379 ± 0.041 | 2.459 ± 0.06 | 3.037 ± 0.072 | 3.207 ± 0.076 | 3.905 ± 0.072 | 0.805 ± 0.034 | 2.866 ± 0.068 | 0.0 ± 0.0 |
| Glu | 5.414 ± 0.101 | 0.664 ± 0.03 | 2.795 ± 0.067 | 3.655 ± 0.108 | 1.986 ± 0.052 | 4.103 ± 0.09 | 1.503 ± 0.046 | 3.249 ± 0.07 | 3.546 ± 0.089 | 5.275 ± 0.107 | 1.773 ± 0.046 | 2.689 ± 0.067 | 1.785 ± 0.052 | 2.742 ± 0.069 | 3.436 ± 0.073 | 2.602 ± 0.057 | 2.829 ± 0.059 | 4.095 ± 0.087 | 0.802 ± 0.035 | 2.151 ± 0.055 | 0.0 ± 0.0 |
| Phe | 3.393 ± 0.067 | 0.762 ± 0.032 | 2.872 ± 0.067 | 2.353 ± 0.058 | 1.702 ± 0.049 | 3.053 ± 0.074 | 0.808 ± 0.035 | 2.41 ± 0.057 | 2.355 ± 0.066 | 3.138 ± 0.077 | 1.204 ± 0.042 | 2.279 ± 0.055 | 1.322 ± 0.041 | 1.02 ± 0.039 | 1.848 ± 0.045 | 2.703 ± 0.069 | 2.589 ± 0.064 | 3.175 ± 0.068 | 0.421 ± 0.021 | 1.737 ± 0.052 | 0.0 ± 0.0 |
| Gly | 5.1 ± 0.098 | 1.089 ± 0.036 | 3.45 ± 0.077 | 3.751 ± 0.077 | 2.867 ± 0.059 | 4.972 ± 0.109 | 1.651 ± 0.045 | 4.607 ± 0.077 | 4.72 ± 0.089 | 5.592 ± 0.101 | 2.085 ± 0.055 | 3.443 ± 0.073 | 1.388 ± 0.049 | 2.561 ± 0.057 | 3.516 ± 0.058 | 4.101 ± 0.083 | 4.468 ± 0.084 | 5.05 ± 0.083 | 0.97 ± 0.042 | 3.223 ± 0.065 | 0.0 ± 0.0 |
| His | 1.708 ± 0.048 | 0.346 ± 0.02 | 1.339 ± 0.047 | 1.175 ± 0.042 | 1.21 ± 0.041 | 1.489 ± 0.053 | 0.666 ± 0.039 | 1.774 ± 0.048 | 1.263 ± 0.04 | 2.14 ± 0.051 | 0.472 ± 0.022 | 1.173 ± 0.04 | 1.034 ± 0.038 | 0.719 ± 0.029 | 1.023 ± 0.04 | 1.195 ± 0.048 | 1.473 ± 0.044 | 1.33 ± 0.046 | 0.291 ± 0.02 | 1.115 ± 0.035 | 0.0 ± 0.0 |
| Ile | 5.122 ± 0.094 | 0.975 ± 0.038 | 4.17 ± 0.077 | 3.726 ± 0.075 | 2.226 ± 0.063 | 4.01 ± 0.09 | 1.175 ± 0.039 | 3.668 ± 0.087 | 3.86 ± 0.074 | 4.352 ± 0.091 | 1.445 ± 0.04 | 3.034 ± 0.064 | 2.426 ± 0.055 | 1.714 ± 0.049 | 2.497 ± 0.057 | 3.814 ± 0.069 | 3.822 ± 0.077 | 4.229 ± 0.094 | 0.52 ± 0.027 | 2.328 ± 0.061 | 0.0 ± 0.0 |
| Lys | 5.35 ± 0.092 | 0.636 ± 0.032 | 3.373 ± 0.089 | 3.909 ± 0.092 | 2.027 ± 0.052 | 4.22 ± 0.073 | 1.416 ± 0.045 | 3.019 ± 0.073 | 3.795 ± 0.1 | 5.271 ± 0.084 | 1.91 ± 0.047 | 2.709 ± 0.066 | 2.156 ± 0.06 | 2.822 ± 0.072 | 3.281 ± 0.074 | 3.144 ± 0.064 | 3.044 ± 0.064 | 4.17 ± 0.076 | 0.7 ± 0.027 | 2.598 ± 0.067 | 0.0 ± 0.0 |
| Leu | 7.297 ± 0.108 | 1.614 ± 0.049 | 4.717 ± 0.078 | 4.134 ± 0.085 | 3.718 ± 0.09 | 5.66 ± 0.107 | 2.31 ± 0.053 | 4.488 ± 0.086 | 5.358 ± 0.091 | 8.653 ± 0.145 | 2.481 ± 0.059 | 4.275 ± 0.082 | 4.378 ± 0.086 | 3.757 ± 0.074 | 5.137 ± 0.08 | 6.094 ± 0.097 | 5.535 ± 0.094 | 5.414 ± 0.091 | 1.035 ± 0.039 | 3.368 ± 0.082 | 0.001 ± 0.001 |
| Met | 2.659 ± 0.061 | 0.334 ± 0.02 | 1.243 ± 0.036 | 1.391 ± 0.043 | 0.966 ± 0.036 | 1.917 ± 0.048 | 0.551 ± 0.026 | 1.315 ± 0.04 | 1.986 ± 0.049 | 2.808 ± 0.065 | 0.831 ± 0.035 | 1.322 ± 0.042 | 1.46 ± 0.046 | 1.445 ± 0.053 | 1.503 ± 0.041 | 1.636 ± 0.045 | 1.671 ± 0.043 | 1.798 ± 0.049 | 0.312 ± 0.022 | 0.772 ± 0.031 | 0.0 ± 0.0 |
| Asn | 4.186 ± 0.079 | 0.595 ± 0.027 | 2.869 ± 0.07 | 2.803 ± 0.065 | 2.075 ± 0.052 | 3.96 ± 0.078 | 0.965 ± 0.035 | 3.223 ± 0.07 | 2.991 ± 0.071 | 3.871 ± 0.071 | 1.306 ± 0.045 | 2.76 ± 0.075 | 2.153 ± 0.052 | 1.503 ± 0.045 | 2.281 ± 0.055 | 2.566 ± 0.064 | 3.006 ± 0.07 | 3.425 ± 0.065 | 0.599 ± 0.027 | 2.168 ± 0.059 | 0.0 ± 0.0 |
| Pro | 3.541 ± 0.083 | 0.496 ± 0.026 | 2.632 ± 0.064 | 2.836 ± 0.065 | 1.722 ± 0.046 | 2.338 ± 0.055 | 0.876 ± 0.033 | 2.058 ± 0.06 | 1.903 ± 0.051 | 3.362 ± 0.077 | 0.985 ± 0.037 | 1.756 ± 0.046 | 0.952 ± 0.041 | 1.785 ± 0.052 | 1.553 ± 0.046 | 2.14 ± 0.049 | 2.348 ± 0.058 | 2.898 ± 0.068 | 0.472 ± 0.023 | 1.683 ± 0.051 | 0.0 ± 0.0 |
| Gln | 3.353 ± 0.079 | 0.47 ± 0.025 | 1.53 ± 0.044 | 1.747 ± 0.055 | 1.458 ± 0.04 | 2.494 ± 0.06 | 1.063 ± 0.039 | 2.267 ± 0.053 | 2.212 ± 0.068 | 3.898 ± 0.082 | 1.258 ± 0.043 | 1.805 ± 0.062 | 1.666 ± 0.053 | 2.31 ± 0.087 | 2.315 ± 0.061 | 2.254 ± 0.051 | 2.443 ± 0.056 | 2.421 ± 0.057 | 0.542 ± 0.029 | 1.617 ± 0.044 | 0.0 ± 0.0 |
| Arg | 3.556 ± 0.073 | 0.646 ± 0.026 | 2.426 ± 0.055 | 2.731 ± 0.064 | 2.226 ± 0.05 | 2.737 ± 0.064 | 1.315 ± 0.046 | 3.405 ± 0.062 | 3.057 ± 0.072 | 4.563 ± 0.085 | 1.69 ± 0.048 | 2.475 ± 0.054 | 1.866 ± 0.057 | 2.22 ± 0.059 | 2.833 ± 0.074 | 2.565 ± 0.061 | 3.0 ± 0.06 | 3.216 ± 0.064 | 0.608 ± 0.031 | 2.283 ± 0.058 | 0.0 ± 0.0 |
| Ser | 4.78 ± 0.079 | 0.822 ± 0.033 | 3.238 ± 0.065 | 2.852 ± 0.059 | 2.527 ± 0.055 | 4.151 ± 0.084 | 1.291 ± 0.045 | 3.764 ± 0.074 | 3.17 ± 0.058 | 5.471 ± 0.096 | 1.453 ± 0.044 | 2.685 ± 0.064 | 2.096 ± 0.055 | 2.013 ± 0.051 | 2.633 ± 0.058 | 3.395 ± 0.085 | 3.543 ± 0.078 | 4.306 ± 0.074 | 0.692 ± 0.034 | 2.479 ± 0.059 | 0.0 ± 0.0 |
| Thr | 5.277 ± 0.087 | 0.756 ± 0.034 | 3.929 ± 0.082 | 3.07 ± 0.069 | 2.775 ± 0.064 | 4.239 ± 0.071 | 1.324 ± 0.044 | 3.742 ± 0.066 | 2.838 ± 0.065 | 6.34 ± 0.099 | 1.338 ± 0.043 | 2.671 ± 0.068 | 3.205 ± 0.067 | 2.24 ± 0.052 | 2.478 ± 0.053 | 3.473 ± 0.077 | 4.06 ± 0.093 | 3.584 ± 0.075 | 0.659 ± 0.031 | 2.64 ± 0.07 | 0.0 ± 0.0 |
| Val | 5.959 ± 0.109 | 1.236 ± 0.042 | 3.878 ± 0.075 | 3.947 ± 0.08 | 2.559 ± 0.054 | 4.359 ± 0.099 | 1.324 ± 0.04 | 3.661 ± 0.086 | 4.228 ± 0.086 | 5.871 ± 0.103 | 1.982 ± 0.05 | 3.29 ± 0.067 | 3.009 ± 0.067 | 2.296 ± 0.053 | 3.44 ± 0.075 | 4.282 ± 0.077 | 4.191 ± 0.08 | 5.296 ± 0.092 | 0.771 ± 0.039 | 2.628 ± 0.07 | 0.0 ± 0.0 |
| Trp | 0.838 ± 0.031 | 0.22 ± 0.019 | 0.645 ± 0.03 | 0.528 ± 0.029 | 0.488 ± 0.031 | 0.882 ± 0.039 | 0.43 ± 0.025 | 0.551 ± 0.027 | 0.718 ± 0.035 | 1.231 ± 0.044 | 0.279 ± 0.017 | 0.721 ± 0.028 | 0.331 ± 0.02 | 0.708 ± 0.036 | 0.714 ± 0.033 | 0.647 ± 0.027 | 0.68 ± 0.032 | 0.759 ± 0.037 | 0.192 ± 0.016 | 0.479 ± 0.027 | 0.0 ± 0.0 |
| Tyr | 3.537 ± 0.068 | 0.584 ± 0.028 | 2.779 ± 0.061 | 2.263 ± 0.059 | 1.817 ± 0.051 | 2.832 ± 0.059 | 0.943 ± 0.037 | 2.448 ± 0.058 | 2.441 ± 0.056 | 3.709 ± 0.074 | 1.071 ± 0.04 | 2.422 ± 0.058 | 1.64 ± 0.048 | 1.401 ± 0.044 | 1.999 ± 0.051 | 2.243 ± 0.067 | 2.651 ± 0.068 | 2.841 ± 0.062 | 0.516 ± 0.028 | 2.111 ± 0.062 | 0.0 ± 0.0 |
| Xaa | 0.001 ± 0.001 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.001 ± 0.001 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.005 ± 0.005 |

Statistics based on 2309 proteins (790966 amino acids)

Note: The error was estimated by bootstrapping (100×) at the protein level.
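The sketch below illustrates one way a protein-level bootstrap could be carried out. The page does not describe the exact procedure, so several things here are assumptions: whole proteins are resampled with replacement, 100 replicates are drawn, and the error is taken as the standard deviation across replicates. The function names and toy sequences are made up for the example.

```python
import random
import statistics


def pair_permille(sequences, pair):
    """Per mille frequency of one dipeptide over all overlapping pairs."""
    hits = total = 0
    for seq in sequences:
        for i in range(len(seq) - 1):
            total += 1
            if seq[i:i + 2] == pair:
                hits += 1
    return 1000.0 * hits / total if total else 0.0


def bootstrap_error(sequences, pair, n_boot=100, seed=0):
    """Std. dev. of the frequency across protein-level bootstrap resamples."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    replicates = []
    for _ in range(n_boot):
        # Assumption: resample whole proteins with replacement,
        # drawing as many proteins as the original set contains.
        sample = rng.choices(sequences, k=len(sequences))
        replicates.append(pair_permille(sample, pair))
    return statistics.stdev(replicates)


toy = ["MAALLK", "MALLA", "MKKLAA", "MLLLG"]  # made-up sequences
estimate = pair_permille(toy, "LL")
error = bootstrap_error(toy, "LL")
```

Resampling proteins rather than individual residues keeps each sequence intact, so the dipeptides inside a protein stay together in every replicate.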

The dipeptide statistics above (along with other statistics for this proteome) can be downloaded from this CSV file
See this proteome in: uniprot_link
Proteome-pI is available under a Creative Commons Attribution-NoDerivs license; for more details, see here

Reference: Kozlowski LP. Proteome-pI 2.0: Proteome Isoelectric Point Database Update. Nucleic Acids Res. 2021, doi: 10.1093/nar/gkab944

Contact: Lukasz P. Kozlowski