Amino acid dipeptide frequency for Prevotella sp. CAG:520

Apart from single amino acid frequencies, one can also calculate the so-called amino acid dipeptide frequency.

There are 400 possible dipeptides (441 if Xaa is counted as an additional 21st amino acid representing all non-standard or unknown amino acids). Thus, if dipeptides occurred randomly in proteins, each one would be present at a frequency of 0.25% (2.5‰). As this is not the case in nature, for better visibility dipeptides that are more frequent than expected are marked in red in the table, and underrepresented ones are marked in blue.
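The arithmetic behind the uniform expectation can be sketched in a few lines. This is only an illustration: the function names `uniform_expectation_permille` and `classify` are hypothetical, and the sketch assumes the red/blue marking simply compares each value with the uniform expectation, which the page does not spell out in more detail.

```python
from itertools import product

# Three-letter codes of the 20 standard amino acids, in the order used by the table.
STANDARD = ["Ala", "Cys", "Asp", "Glu", "Phe", "Gly", "His", "Ile", "Lys", "Leu",
            "Met", "Asn", "Pro", "Gln", "Arg", "Ser", "Thr", "Val", "Trp", "Tyr"]

def uniform_expectation_permille(alphabet):
    """Expected frequency (per mille) of each dipeptide if all were equally likely."""
    n_dipeptides = len(list(product(alphabet, repeat=2)))
    return 1000.0 / n_dipeptides

def classify(observed_permille, expected_permille):
    """Label a dipeptide relative to the uniform expectation (assumed rule)."""
    if observed_permille > expected_permille:
        return "over"
    if observed_permille < expected_permille:
        return "under"
    return "as expected"

expected_20 = uniform_expectation_permille(STANDARD)            # 400 dipeptides
expected_21 = uniform_expectation_permille(STANDARD + ["Xaa"])  # 441 dipeptides

# Two values taken from the table: AlaAla = 7.325, CysCys = 0.243 (per mille).
print(expected_20)                    # 2.5
print(classify(7.325, expected_20))   # over
print(classify(0.243, expected_20))   # under
```

With Xaa included, the expectation drops to 1000/441, roughly 2.27‰ per dipeptide.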

All values are given in per mille (‰); multiply them by 10⁻³ to obtain plain fractions.
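As a rough illustration of how such per mille values can be obtained, the sketch below counts overlapping residue pairs in a set of sequences. It is a minimal example under stated assumptions, not the database's actual pipeline: the function name `dipeptide_permille` is hypothetical, sequences use one-letter codes, and pairs are assumed never to span two proteins.

```python
from collections import Counter

def dipeptide_permille(proteins):
    """Return {dipeptide: frequency in per mille} over all overlapping residue pairs."""
    counts = Counter()
    for seq in proteins:
        # Each sequence is scanned on its own, so no pair spans two proteins.
        counts.update(seq[i:i + 2] for i in range(len(seq) - 1))
    total = sum(counts.values())
    if total == 0:
        return {}
    return {pair: 1000.0 * n / total for pair, n in counts.items()}

# Toy input (hypothetical sequences), not taken from the proteome.
toy = ["AAAL", "ALA"]
freqs = dipeptide_permille(toy)
# Pairs: AA, AA, AL from the first sequence; AL, LA from the second (5 in total).
print(freqs["AA"])          # 400.0 (per mille)
print(freqs["AA"] / 1000)   # 0.4 (plain fraction)
```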

For more information, see the sequence space article on Wikipedia.

Dipeptide frequency (‰) ± error, grouped by first residue. Within each group the second residue follows the order:
Ala Cys Asp Glu Phe Gly His Ile Lys Leu Met Asn Pro Gln Arg Ser Thr Val Trp Tyr Xaa

Ala
AlaAla: 7.325 ± 0.118
AlaCys: 1.078 ± 0.039
AlaAsp: 5.387 ± 0.092
AlaGlu: 5.575 ± 0.106
AlaPhe: 3.266 ± 0.078
AlaGly: 5.044 ± 0.09
AlaHis: 1.455 ± 0.046
AlaIle: 4.626 ± 0.083
AlaLys: 4.918 ± 0.08
AlaLeu: 7.383 ± 0.124
AlaMet: 2.392 ± 0.06
AlaAsn: 3.506 ± 0.067
AlaPro: 2.579 ± 0.071
AlaGln: 3.243 ± 0.059
AlaArg: 3.428 ± 0.074
AlaSer: 4.672 ± 0.079
AlaThr: 4.935 ± 0.1
AlaVal: 5.452 ± 0.093
AlaTrp: 0.862 ± 0.036
AlaTyr: 3.004 ± 0.069
AlaXaa: 0.003 ± 0.002

Cys
CysAla: 0.955 ± 0.041
CysCys: 0.243 ± 0.018
CysAsp: 0.816 ± 0.036
CysGlu: 0.776 ± 0.034
CysPhe: 0.601 ± 0.028
CysGly: 1.216 ± 0.048
CysHis: 0.312 ± 0.018
CysIle: 0.787 ± 0.037
CysLys: 0.753 ± 0.035
CysLeu: 1.138 ± 0.037
CysMet: 0.366 ± 0.022
CysAsn: 0.652 ± 0.029
CysPro: 0.533 ± 0.027
CysGln: 0.384 ± 0.022
CysArg: 0.709 ± 0.028
CysSer: 0.845 ± 0.037
CysThr: 0.761 ± 0.028
CysVal: 0.908 ± 0.041
CysTrp: 0.159 ± 0.015
CysTyr: 0.604 ± 0.029
CysXaa: 0.0 ± 0.0

Asp
AspAla: 4.852 ± 0.084
AspCys: 0.759 ± 0.039
AspAsp: 3.437 ± 0.078
AspGlu: 4.218 ± 0.086
AspPhe: 3.051 ± 0.065
AspGly: 4.963 ± 0.107
AspHis: 0.99 ± 0.033
AspIle: 4.06 ± 0.083
AspLys: 3.68 ± 0.072
AspLeu: 4.45 ± 0.081
AspMet: 1.816 ± 0.042
AspAsn: 3.224 ± 0.072
AspPro: 1.918 ± 0.047
AspGln: 1.37 ± 0.043
AspArg: 2.781 ± 0.063
AspSer: 3.143 ± 0.072
AspThr: 2.981 ± 0.063
AspVal: 4.26 ± 0.08
AspTrp: 0.86 ± 0.034
AspTyr: 2.926 ± 0.061
AspXaa: 0.0 ± 0.0

Glu
GluAla: 5.113 ± 0.104
GluCys: 0.739 ± 0.033
GluAsp: 3.214 ± 0.072
GluGlu: 4.457 ± 0.099
GluPhe: 2.503 ± 0.057
GluGly: 4.261 ± 0.083
GluHis: 1.434 ± 0.045
GluIle: 3.741 ± 0.086
GluLys: 4.492 ± 0.081
GluLeu: 5.395 ± 0.096
GluMet: 2.038 ± 0.053
GluAsn: 3.204 ± 0.063
GluPro: 1.876 ± 0.045
GluGln: 2.721 ± 0.062
GluArg: 3.388 ± 0.071
GluSer: 2.995 ± 0.062
GluThr: 3.243 ± 0.066
GluVal: 4.044 ± 0.086
GluTrp: 0.826 ± 0.033
GluTyr: 2.702 ± 0.064
GluXaa: 0.0 ± 0.0

Phe
PheAla: 3.418 ± 0.072
PheCys: 0.75 ± 0.033
PheAsp: 2.967 ± 0.056
PheGlu: 2.415 ± 0.056
PhePhe: 2.01 ± 0.06
PheGly: 3.282 ± 0.065
PheHis: 0.832 ± 0.031
PheIle: 2.579 ± 0.066
PheLys: 2.384 ± 0.055
PheLeu: 3.321 ± 0.075
PheMet: 1.22 ± 0.04
PheAsn: 2.264 ± 0.061
PhePro: 1.391 ± 0.048
PheGln: 1.046 ± 0.036
PheArg: 2.056 ± 0.061
PheSer: 3.141 ± 0.064
PheThr: 2.586 ± 0.061
PheVal: 3.15 ± 0.075
PheTrp: 0.472 ± 0.029
PheTyr: 1.771 ± 0.049
PheXaa: 0.001 ± 0.001

Gly
GlyAla: 5.16 ± 0.092
GlyCys: 1.037 ± 0.04
GlyAsp: 3.973 ± 0.077
GlyGlu: 4.359 ± 0.086
GlyPhe: 3.008 ± 0.063
GlyGly: 4.995 ± 0.096
GlyHis: 1.207 ± 0.036
GlyIle: 4.596 ± 0.087
GlyLys: 5.386 ± 0.08
GlyLeu: 5.448 ± 0.099
GlyMet: 2.127 ± 0.064
GlyAsn: 3.48 ± 0.078
GlyPro: 1.237 ± 0.036
GlyGln: 1.837 ± 0.052
GlyArg: 3.012 ± 0.072
GlySer: 4.101 ± 0.091
GlyThr: 4.291 ± 0.093
GlyVal: 5.46 ± 0.087
GlyTrp: 0.921 ± 0.042
GlyTyr: 3.2 ± 0.072
GlyXaa: 0.0 ± 0.0

His
HisAla: 1.342 ± 0.045
HisCys: 0.292 ± 0.016
HisAsp: 1.19 ± 0.041
HisGlu: 1.079 ± 0.041
HisPhe: 0.987 ± 0.043
HisGly: 1.361 ± 0.044
HisHis: 0.535 ± 0.031
HisIle: 1.43 ± 0.038
HisLys: 1.113 ± 0.04
HisLeu: 1.647 ± 0.041
HisMet: 0.388 ± 0.022
HisAsn: 1.087 ± 0.03
HisPro: 0.954 ± 0.043
HisGln: 0.565 ± 0.031
HisArg: 1.009 ± 0.033
HisSer: 1.159 ± 0.041
HisThr: 1.091 ± 0.035
HisVal: 1.256 ± 0.039
HisTrp: 0.237 ± 0.018
HisTyr: 0.951 ± 0.038
HisXaa: 0.0 ± 0.0

Ile
IleAla: 5.383 ± 0.098
IleCys: 0.907 ± 0.034
IleAsp: 4.384 ± 0.083
IleGlu: 4.032 ± 0.082
IlePhe: 2.292 ± 0.063
IleGly: 4.304 ± 0.076
IleHis: 1.163 ± 0.038
IleIle: 3.781 ± 0.094
IleLys: 3.834 ± 0.08
IleLeu: 4.53 ± 0.096
IleMet: 1.519 ± 0.044
IleAsn: 3.123 ± 0.068
IlePro: 2.42 ± 0.062
IleGln: 1.597 ± 0.047
IleArg: 2.764 ± 0.068
IleSer: 3.948 ± 0.079
IleThr: 3.749 ± 0.077
IleVal: 4.706 ± 0.089
IleTrp: 0.56 ± 0.026
IleTyr: 2.355 ± 0.064
IleXaa: 0.0 ± 0.0

Lys
LysAla: 5.306 ± 0.094
LysCys: 0.618 ± 0.026
LysAsp: 3.726 ± 0.069
LysGlu: 4.578 ± 0.095
LysPhe: 2.315 ± 0.051
LysGly: 4.264 ± 0.09
LysHis: 1.269 ± 0.045
LysIle: 3.572 ± 0.064
LysLys: 4.817 ± 0.105
LysLeu: 5.205 ± 0.087
LysMet: 2.134 ± 0.059
LysAsn: 3.383 ± 0.075
LysPro: 2.316 ± 0.061
LysGln: 2.74 ± 0.059
LysArg: 3.396 ± 0.069
LysSer: 3.421 ± 0.073
LysThr: 3.82 ± 0.066
LysVal: 4.301 ± 0.083
LysTrp: 0.785 ± 0.033
LysTyr: 2.784 ± 0.062
LysXaa: 0.0 ± 0.0

Leu
LeuAla: 6.93 ± 0.106
LeuCys: 1.343 ± 0.05
LeuAsp: 4.716 ± 0.082
LeuGlu: 4.269 ± 0.091
LeuPhe: 3.679 ± 0.081
LeuGly: 5.61 ± 0.114
LeuHis: 1.85 ± 0.043
LeuIle: 4.563 ± 0.08
LeuLys: 5.672 ± 0.096
LeuLeu: 7.68 ± 0.146
LeuMet: 2.487 ± 0.058
LeuAsn: 4.412 ± 0.09
LeuPro: 3.633 ± 0.071
LeuGln: 3.164 ± 0.07
LeuArg: 4.635 ± 0.082
LeuSer: 6.027 ± 0.099
LeuThr: 5.291 ± 0.086
LeuVal: 5.343 ± 0.107
LeuTrp: 0.923 ± 0.039
LeuTyr: 3.245 ± 0.078
LeuXaa: 0.0 ± 0.0

Met
MetAla: 2.351 ± 0.054
MetCys: 0.316 ± 0.02
MetAsp: 1.501 ± 0.045
MetGlu: 1.715 ± 0.043
MetPhe: 1.126 ± 0.042
MetGly: 1.881 ± 0.054
MetHis: 0.592 ± 0.029
MetIle: 1.473 ± 0.043
MetLys: 2.398 ± 0.052
MetLeu: 2.898 ± 0.057
MetMet: 0.926 ± 0.039
MetAsn: 1.53 ± 0.052
MetPro: 1.387 ± 0.046
MetGln: 1.283 ± 0.04
MetArg: 1.544 ± 0.051
MetSer: 1.741 ± 0.047
MetThr: 1.653 ± 0.051
MetVal: 1.702 ± 0.051
MetTrp: 0.262 ± 0.021
MetTyr: 0.918 ± 0.036
MetXaa: 0.0 ± 0.0

Asn
AsnAla: 3.994 ± 0.068
AsnCys: 0.627 ± 0.03
AsnAsp: 2.927 ± 0.061
AsnGlu: 2.875 ± 0.072
AsnPhe: 2.102 ± 0.059
AsnGly: 4.117 ± 0.093
AsnHis: 0.932 ± 0.034
AsnIle: 3.658 ± 0.081
AsnLys: 2.894 ± 0.078
AsnLeu: 3.912 ± 0.071
AsnMet: 1.332 ± 0.045
AsnAsn: 2.768 ± 0.077
AsnPro: 2.252 ± 0.062
AsnGln: 1.361 ± 0.046
AsnArg: 2.462 ± 0.058
AsnSer: 2.712 ± 0.07
AsnThr: 2.877 ± 0.065
AsnVal: 3.77 ± 0.078
AsnTrp: 0.667 ± 0.035
AsnTyr: 2.172 ± 0.06
AsnXaa: 0.0 ± 0.0

Pro
ProAla: 2.809 ± 0.059
ProCys: 0.387 ± 0.026
ProAsp: 2.366 ± 0.063
ProGlu: 2.983 ± 0.063
ProPhe: 1.699 ± 0.048
ProGly: 2.191 ± 0.054
ProHis: 0.72 ± 0.033
ProIle: 1.989 ± 0.049
ProLys: 2.106 ± 0.058
ProLeu: 2.931 ± 0.069
ProMet: 1.044 ± 0.039
ProAsn: 1.709 ± 0.051
ProPro: 0.713 ± 0.034
ProGln: 1.43 ± 0.042
ProArg: 1.348 ± 0.041
ProSer: 2.218 ± 0.063
ProThr: 2.312 ± 0.068
ProVal: 2.595 ± 0.06
ProTrp: 0.438 ± 0.027
ProTyr: 1.588 ± 0.044
ProXaa: 0.0 ± 0.0

Gln
GlnAla: 2.277 ± 0.059
GlnCys: 0.38 ± 0.021
GlnAsp: 1.466 ± 0.044
GlnGlu: 1.818 ± 0.058
GlnPhe: 1.365 ± 0.037
GlnGly: 1.887 ± 0.054
GlnHis: 0.777 ± 0.036
GlnIle: 2.104 ± 0.052
GlnLys: 2.356 ± 0.066
GlnLeu: 3.534 ± 0.075
GlnMet: 1.222 ± 0.04
GlnAsn: 1.858 ± 0.052
GlnPro: 1.398 ± 0.047
GlnGln: 1.78 ± 0.062
GlnArg: 2.028 ± 0.051
GlnSer: 1.867 ± 0.053
GlnThr: 2.238 ± 0.055
GlnVal: 1.948 ± 0.051
GlnTrp: 0.47 ± 0.026
GlnTyr: 1.433 ± 0.049
GlnXaa: 0.0 ± 0.0

Arg
ArgAla: 3.108 ± 0.072
ArgCys: 0.552 ± 0.021
ArgAsp: 2.457 ± 0.058
ArgGlu: 2.98 ± 0.076
ArgPhe: 2.257 ± 0.056
ArgGly: 2.722 ± 0.062
ArgHis: 1.122 ± 0.042
ArgIle: 3.499 ± 0.07
ArgLys: 3.25 ± 0.067
ArgLeu: 4.754 ± 0.084
ArgMet: 1.652 ± 0.046
ArgAsn: 2.47 ± 0.059
ArgPro: 1.693 ± 0.051
ArgGln: 1.999 ± 0.055
ArgArg: 2.843 ± 0.075
ArgSer: 2.654 ± 0.059
ArgThr: 2.662 ± 0.061
ArgVal: 2.914 ± 0.069
ArgTrp: 0.618 ± 0.026
ArgTyr: 2.261 ± 0.06
ArgXaa: 0.001 ± 0.001

Ser
SerAla: 4.989 ± 0.084
SerCys: 0.853 ± 0.034
SerAsp: 3.534 ± 0.072
SerGlu: 3.365 ± 0.074
SerPhe: 2.859 ± 0.062
SerGly: 4.237 ± 0.08
SerHis: 1.21 ± 0.039
SerIle: 3.765 ± 0.079
SerLys: 3.43 ± 0.068
SerLeu: 5.421 ± 0.092
SerMet: 1.61 ± 0.044
SerAsn: 2.72 ± 0.077
SerPro: 2.128 ± 0.049
SerGln: 1.983 ± 0.058
SerArg: 2.586 ± 0.063
SerSer: 3.657 ± 0.101
SerThr: 3.528 ± 0.081
SerVal: 4.631 ± 0.089
SerTrp: 0.767 ± 0.035
SerTyr: 2.581 ± 0.073
SerXaa: 0.003 ± 0.002

Thr
ThrAla: 4.986 ± 0.087
ThrCys: 0.685 ± 0.032
ThrAsp: 3.79 ± 0.075
ThrGlu: 3.275 ± 0.071
ThrPhe: 2.715 ± 0.06
ThrGly: 4.053 ± 0.073
ThrHis: 1.073 ± 0.036
ThrIle: 4.035 ± 0.084
ThrLys: 3.187 ± 0.067
ThrLeu: 5.735 ± 0.092
ThrMet: 1.496 ± 0.048
ThrAsn: 2.478 ± 0.056
ThrPro: 2.729 ± 0.067
ThrGln: 1.825 ± 0.05
ThrArg: 2.298 ± 0.049
ThrSer: 3.548 ± 0.078
ThrThr: 3.793 ± 0.086
ThrVal: 4.209 ± 0.077
ThrTrp: 0.755 ± 0.031
ThrTyr: 2.508 ± 0.075
ThrXaa: 0.001 ± 0.001

Val
ValAla: 5.731 ± 0.102
ValCys: 1.165 ± 0.04
ValAsp: 4.364 ± 0.081
ValGlu: 4.569 ± 0.084
ValPhe: 2.895 ± 0.064
ValGly: 4.665 ± 0.078
ValHis: 1.111 ± 0.039
ValIle: 3.827 ± 0.077
ValLys: 4.761 ± 0.087
ValLeu: 5.781 ± 0.092
ValMet: 1.878 ± 0.062
ValAsn: 3.385 ± 0.082
ValPro: 2.593 ± 0.061
ValGln: 2.045 ± 0.053
ValArg: 3.345 ± 0.057
ValSer: 4.626 ± 0.085
ValThr: 3.886 ± 0.085
ValVal: 5.534 ± 0.106
ValTrp: 0.795 ± 0.032
ValTyr: 2.832 ± 0.069
ValXaa: 0.0 ± 0.0

Trp
TrpAla: 0.83 ± 0.037
TrpCys: 0.183 ± 0.017
TrpAsp: 0.686 ± 0.031
TrpGlu: 0.608 ± 0.032
TrpPhe: 0.498 ± 0.026
TrpGly: 0.909 ± 0.038
TrpHis: 0.332 ± 0.023
TrpIle: 0.676 ± 0.028
TrpLys: 0.748 ± 0.034
TrpLeu: 1.172 ± 0.041
TrpMet: 0.399 ± 0.024
TrpAsn: 0.72 ± 0.034
TrpPro: 0.254 ± 0.019
TrpGln: 0.597 ± 0.028
TrpArg: 0.642 ± 0.033
TrpSer: 0.723 ± 0.028
TrpThr: 0.777 ± 0.04
TrpVal: 0.712 ± 0.032
TrpTrp: 0.152 ± 0.016
TrpTyr: 0.448 ± 0.026
TrpXaa: 0.0 ± 0.0

Tyr
TyrAla: 3.273 ± 0.064
TyrCys: 0.603 ± 0.031
TyrAsp: 2.839 ± 0.061
TyrGlu: 2.441 ± 0.056
TyrPhe: 1.828 ± 0.048
TyrGly: 2.873 ± 0.071
TyrHis: 0.757 ± 0.034
TyrIle: 2.586 ± 0.057
TyrLys: 2.572 ± 0.057
TyrLeu: 3.136 ± 0.077
TyrMet: 1.15 ± 0.037
TyrAsn: 2.438 ± 0.063
TyrPro: 1.572 ± 0.048
TyrGln: 1.236 ± 0.041
TyrArg: 2.192 ± 0.063
TyrSer: 2.657 ± 0.068
TyrThr: 2.698 ± 0.078
TyrVal: 2.896 ± 0.062
TyrTrp: 0.534 ± 0.026
TyrTyr: 2.051 ± 0.065
TyrXaa: 0.001 ± 0.001

Xaa
XaaAla: 0.003 ± 0.002
XaaCys: 0.0 ± 0.0
XaaAsp: 0.0 ± 0.0
XaaGlu: 0.0 ± 0.0
XaaPhe: 0.0 ± 0.0
XaaGly: 0.0 ± 0.0
XaaHis: 0.0 ± 0.0
XaaIle: 0.0 ± 0.0
XaaLys: 0.0 ± 0.0
XaaLeu: 0.001 ± 0.001
XaaMet: 0.0 ± 0.0
XaaAsn: 0.0 ± 0.0
XaaPro: 0.003 ± 0.002
XaaGln: 0.0 ± 0.0
XaaArg: 0.001 ± 0.001
XaaSer: 0.0 ± 0.0
XaaThr: 0.001 ± 0.001
XaaVal: 0.0 ± 0.0
XaaTrp: 0.001 ± 0.001
XaaTyr: 0.0 ± 0.0
XaaXaa: 0.001 ± 0.001

Statistics based on 2171 proteins (780981 amino acids)

Note: The error was estimated by bootstrapping (×100) at the protein level.
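A protein-level bootstrap of this kind might look like the sketch below. It is an illustration under stated assumptions rather than the database's own code: whole proteins are resampled with replacement, the frequency is recomputed for each of 100 resamples, and the error is taken to be the standard deviation of those estimates (the page does not say which spread measure is reported). The name `bootstrap_error` and its parameters are hypothetical.

```python
import random
import statistics
from collections import Counter

def pair_counts(seq):
    """Count overlapping residue pairs in one sequence (one-letter codes)."""
    return Counter(seq[i:i + 2] for i in range(len(seq) - 1))

def bootstrap_error(proteins, pair, n_resamples=100, seed=0):
    """Spread (standard deviation) of the per mille frequency of `pair`
    across resamples drawn with replacement at the protein level."""
    rng = random.Random(seed)
    per_protein = [pair_counts(seq) for seq in proteins]
    estimates = []
    for _ in range(n_resamples):
        # Resample whole proteins, keeping the original number of proteins.
        sample = rng.choices(per_protein, k=len(per_protein))
        hits = sum(c[pair] for c in sample)
        total = sum(sum(c.values()) for c in sample)
        estimates.append(1000.0 * hits / total if total else 0.0)
    return statistics.stdev(estimates)

# Toy input (hypothetical sequences), not taken from the proteome.
toy = ["AAAL", "ALA", "LLAA", "AALA"]
print(bootstrap_error(toy, "AA"))
```

Resampling proteins rather than individual dipeptides keeps the pairs within a protein together, so the estimate reflects variation between proteins.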

The dipeptide statistics above (among other statistics for this proteome) can be downloaded from this CSV file
See this proteome in: uniprot_link
Proteome-pI is available under Creative Commons Attribution-NoDerivs license, for more details see here

Reference: Kozlowski LP. Proteome-pI 2.0: Proteome Isoelectric Point Database Update. Nucleic Acids Res. 2021, doi: 10.1093/nar/gkab944
Contact: Lukasz P. Kozlowski