Amino acid dipeptide frequency for Roseburia sp. CAG:197

Apart from single amino acid frequencies, one can also calculate the so-called amino acid dipeptide frequency, that is, how often each ordered pair of adjacent amino acids occurs.
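A minimal sketch of how such a frequency could be computed from a set of protein sequences. It assumes one-letter amino acid codes, overlapping pairs counted within each protein (never across protein boundaries), and normalization by the total number of dipeptides; the function name is hypothetical and the database may use a different procedure.

```python
from collections import Counter

def dipeptide_frequencies(proteins):
    """Return dipeptide frequencies in per mille for a list of sequences."""
    counts = Counter()
    for seq in proteins:
        # Overlapping adjacent pairs: positions (0,1), (1,2), ...
        for i in range(len(seq) - 1):
            counts[seq[i:i + 2]] += 1
    total = sum(counts.values())
    if total == 0:
        return {}
    return {pair: 1000.0 * n / total for pair, n in counts.items()}

# Toy input, not taken from the proteome described here
freqs = dipeptide_frequencies(["AAK", "KA"])
```

With the toy input above, the three observed pairs (AA, AK, KA) each occur once, so each gets one third of the total.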

There are 400 possibilities (441 if we consider Xaa as an additional 21st amino acid representing all non-standard or unknown amino acids). Thus, if all dipeptides occurred in proteins with equal probability, each amino acid dipeptide would be present with a frequency of 0.25% (1/400, i.e. 2.5‰). As this is not the case in nature, for better visibility all overrepresented dipeptides are marked in red, and those which are underrepresented are marked in blue in the table.
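The arithmetic behind this expectation, and the over/under labelling, can be sketched as below. The function names are hypothetical, and using the uniform expectation as the threshold is an assumption based on the description above; the page does not spell out exactly how the colours are assigned.

```python
def uniform_expectation_permille(alphabet_size=20):
    """Expected per-mille frequency if all ordered pairs were equally likely."""
    return 1000.0 / alphabet_size ** 2

def classify(observed_permille, expected_permille):
    """Label a dipeptide relative to the expectation (assumed threshold)."""
    if observed_permille > expected_permille:
        return "over"   # would be shown in red
    if observed_permille < expected_permille:
        return "under"  # would be shown in blue
    return "expected"

expected = uniform_expectation_permille(20)   # 400 dipeptides
ala_ala = classify(6.509, expected)           # AlaAla value from the table
cys_cys = classify(0.24, expected)            # CysCys value from the table
```

Under this assumption AlaAla (6.509‰) falls above the 2.5‰ expectation and CysCys (0.24‰) falls below it.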

All values are presented in per mille (‰) and therefore need to be multiplied by 10⁻³ to obtain fractions.
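As a small illustration of the unit conversion, the helpers below turn a per-mille table value into a fraction or a percentage. The function names are hypothetical.

```python
def permille_to_fraction(value_permille):
    """Convert a per-mille value to a plain fraction (multiply by 10^-3)."""
    return value_permille * 1e-3

def permille_to_percent(value_permille):
    """Convert a per-mille value to a percentage (divide by 10)."""
    return value_permille / 10.0

# LeuLeu value from the table: 7.955 per mille
leu_leu_fraction = permille_to_fraction(7.955)
leu_leu_percent = permille_to_percent(7.955)
```

So a table entry of 7.955‰ corresponds to a fraction of about 0.007955, or roughly 0.8%.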

For more information, see the sequence space article on Wikipedia.

Ala Cys Asp Glu Phe Gly His Ile Lys Leu Met Asn Pro Gln Arg Ser Thr Val Trp Tyr Xaa
Ala
6.509AlaAla: 6.509 ± 0.114
1.146AlaCys: 1.146 ± 0.044
4.813AlaAsp: 4.813 ± 0.08
5.048AlaGlu: 5.048 ± 0.096
3.069AlaPhe: 3.069 ± 0.067
5.607AlaGly: 5.607 ± 0.1
1.157AlaHis: 1.157 ± 0.041
5.518AlaIle: 5.518 ± 0.098
5.545AlaLys: 5.545 ± 0.108
6.461AlaLeu: 6.461 ± 0.101
2.448AlaMet: 2.448 ± 0.061
2.953AlaAsn: 2.953 ± 0.062
1.792AlaPro: 1.792 ± 0.052
2.656AlaGln: 2.656 ± 0.065
2.595AlaArg: 2.595 ± 0.058
4.087AlaSer: 4.087 ± 0.08
3.752AlaThr: 3.752 ± 0.079
5.688AlaVal: 5.688 ± 0.097
0.626AlaTrp: 0.626 ± 0.031
3.05AlaTyr: 3.05 ± 0.066
0.007AlaXaa: 0.007 ± 0.003
Cys
1.026CysAla: 1.026 ± 0.039
0.24CysCys: 0.24 ± 0.02
0.872CysAsp: 0.872 ± 0.038
0.914CysGlu: 0.914 ± 0.036
0.589CysPhe: 0.589 ± 0.03
1.405CysGly: 1.405 ± 0.051
0.289CysHis: 0.289 ± 0.021
1.155CysIle: 1.155 ± 0.045
0.902CysLys: 0.902 ± 0.034
1.122CysLeu: 1.122 ± 0.043
0.487CysMet: 0.487 ± 0.026
0.623CysAsn: 0.623 ± 0.025
0.542CysPro: 0.542 ± 0.033
0.426CysGln: 0.426 ± 0.024
0.631CysArg: 0.631 ± 0.029
0.86CysSer: 0.86 ± 0.038
0.752CysThr: 0.752 ± 0.031
1.107CysVal: 1.107 ± 0.044
0.124CysTrp: 0.124 ± 0.013
0.653CysTyr: 0.653 ± 0.033
0.0CysXaa: 0.0 ± 0.0
Asp
4.921AspAla: 4.921 ± 0.094
0.805AspCys: 0.805 ± 0.033
3.461AspAsp: 3.461 ± 0.081
4.879AspGlu: 4.879 ± 0.085
2.675AspPhe: 2.675 ± 0.063
4.432AspGly: 4.432 ± 0.093
1.049AspHis: 1.049 ± 0.037
4.715AspIle: 4.715 ± 0.085
3.813AspLys: 3.813 ± 0.077
4.614AspLeu: 4.614 ± 0.084
1.937AspMet: 1.937 ± 0.056
2.618AspAsn: 2.618 ± 0.079
1.555AspPro: 1.555 ± 0.052
1.534AspGln: 1.534 ± 0.05
2.188AspArg: 2.188 ± 0.059
3.187AspSer: 3.187 ± 0.075
3.353AspThr: 3.353 ± 0.073
4.153AspVal: 4.153 ± 0.083
0.605AspTrp: 0.605 ± 0.034
2.916AspTyr: 2.916 ± 0.071
0.003AspXaa: 0.003 ± 0.002
Glu
5.194GluAla: 5.194 ± 0.092
0.837GluCys: 0.837 ± 0.041
4.428GluAsp: 4.428 ± 0.099
6.716GluGlu: 6.716 ± 0.123
2.439GluPhe: 2.439 ± 0.063
3.923GluGly: 3.923 ± 0.066
1.484GluHis: 1.484 ± 0.052
5.555GluIle: 5.555 ± 0.098
7.003GluLys: 7.003 ± 0.098
6.36GluLeu: 6.36 ± 0.102
2.289GluMet: 2.289 ± 0.05
4.329GluAsn: 4.329 ± 0.082
1.754GluPro: 1.754 ± 0.051
3.186GluGln: 3.186 ± 0.07
3.016GluArg: 3.016 ± 0.075
3.48GluSer: 3.48 ± 0.071
3.787GluThr: 3.787 ± 0.084
4.527GluVal: 4.527 ± 0.085
0.607GluTrp: 0.607 ± 0.027
2.896GluTyr: 2.896 ± 0.067
0.0GluXaa: 0.0 ± 0.0
Phe
2.997PheAla: 2.997 ± 0.066
0.778PheCys: 0.778 ± 0.037
2.578PheAsp: 2.578 ± 0.067
2.555PheGlu: 2.555 ± 0.069
1.75PhePhe: 1.75 ± 0.052
2.883PheGly: 2.883 ± 0.062
0.87PheHis: 0.87 ± 0.034
2.723PheIle: 2.723 ± 0.076
1.9PheLys: 1.9 ± 0.053
3.667PheLeu: 3.667 ± 0.083
1.199PheMet: 1.199 ± 0.041
1.467PheAsn: 1.467 ± 0.044
1.243PhePro: 1.243 ± 0.041
1.247PheGln: 1.247 ± 0.045
1.505PheArg: 1.505 ± 0.054
2.833PheSer: 2.833 ± 0.063
2.266PheThr: 2.266 ± 0.064
3.081PheVal: 3.081 ± 0.064
0.407PheTrp: 0.407 ± 0.026
1.753PheTyr: 1.753 ± 0.053
0.0PheXaa: 0.0 ± 0.0
Gly
4.711GlyAla: 4.711 ± 0.097
1.184GlyCys: 1.184 ± 0.041
3.477GlyAsp: 3.477 ± 0.077
4.395GlyGlu: 4.395 ± 0.074
2.88GlyPhe: 2.88 ± 0.064
4.255GlyGly: 4.255 ± 0.094
1.205GlyHis: 1.205 ± 0.046
5.926GlyIle: 5.926 ± 0.1
5.649GlyLys: 5.649 ± 0.101
5.268GlyLeu: 5.268 ± 0.098
2.488GlyMet: 2.488 ± 0.063
3.415GlyAsn: 3.415 ± 0.085
1.155GlyPro: 1.155 ± 0.043
2.071GlyGln: 2.071 ± 0.054
2.562GlyArg: 2.562 ± 0.064
3.732GlySer: 3.732 ± 0.074
4.189GlyThr: 4.189 ± 0.08
4.74GlyVal: 4.74 ± 0.092
0.589GlyTrp: 0.589 ± 0.03
3.195GlyTyr: 3.195 ± 0.075
0.001GlyXaa: 0.001 ± 0.001
His
1.262HisAla: 1.262 ± 0.049
0.305HisCys: 0.305 ± 0.019
0.902HisAsp: 0.902 ± 0.041
1.111HisGlu: 1.111 ± 0.041
0.88HisPhe: 0.88 ± 0.039
1.236HisGly: 1.236 ± 0.046
0.425HisHis: 0.425 ± 0.033
1.401HisIle: 1.401 ± 0.045
1.13HisLys: 1.13 ± 0.041
1.525HisLeu: 1.525 ± 0.047
0.627HisMet: 0.627 ± 0.03
0.767HisAsn: 0.767 ± 0.036
0.79HisPro: 0.79 ± 0.036
0.569HisGln: 0.569 ± 0.031
0.773HisArg: 0.773 ± 0.036
0.907HisSer: 0.907 ± 0.035
0.994HisThr: 0.994 ± 0.037
1.236HisVal: 1.236 ± 0.047
0.159HisTrp: 0.159 ± 0.014
0.813HisTyr: 0.813 ± 0.033
0.0HisXaa: 0.0 ± 0.0
Ile
5.985IleAla: 5.985 ± 0.092
1.393IleCys: 1.393 ± 0.047
4.407IleAsp: 4.407 ± 0.087
4.959IleGlu: 4.959 ± 0.09
2.734IlePhe: 2.734 ± 0.076
5.214IleGly: 5.214 ± 0.098
1.37IleHis: 1.37 ± 0.048
5.227IleIle: 5.227 ± 0.118
4.201IleLys: 4.201 ± 0.081
6.86IleLeu: 6.86 ± 0.106
2.048IleMet: 2.048 ± 0.057
3.062IleAsn: 3.062 ± 0.065
3.027IlePro: 3.027 ± 0.062
2.315IleGln: 2.315 ± 0.053
3.555IleArg: 3.555 ± 0.072
4.962IleSer: 4.962 ± 0.089
4.42IleThr: 4.42 ± 0.085
5.566IleVal: 5.566 ± 0.103
0.58IleTrp: 0.58 ± 0.027
2.945IleTyr: 2.945 ± 0.068
0.0IleXaa: 0.0 ± 0.0
Lys
5.291LysAla: 5.291 ± 0.092
0.771LysCys: 0.771 ± 0.035
4.527LysAsp: 4.527 ± 0.089
6.728LysGlu: 6.728 ± 0.111
2.15LysPhe: 2.15 ± 0.053
3.985LysGly: 3.985 ± 0.073
1.077LysHis: 1.077 ± 0.039
5.145LysIle: 5.145 ± 0.08
7.278LysLys: 7.278 ± 0.124
5.704LysLeu: 5.704 ± 0.087
2.397LysMet: 2.397 ± 0.058
4.347LysAsn: 4.347 ± 0.089
1.853LysPro: 1.853 ± 0.055
2.661LysGln: 2.661 ± 0.06
2.927LysArg: 2.927 ± 0.066
3.617LysSer: 3.617 ± 0.073
4.122LysThr: 4.122 ± 0.075
4.874LysVal: 4.874 ± 0.095
0.572LysTrp: 0.572 ± 0.029
3.1LysTyr: 3.1 ± 0.082
0.0LysXaa: 0.0 ± 0.0
Leu
6.646LeuAla: 6.646 ± 0.09
1.425LeuCys: 1.425 ± 0.043
5.026LeuAsp: 5.026 ± 0.092
5.823LeuGlu: 5.823 ± 0.106
3.772LeuPhe: 3.772 ± 0.091
5.552LeuGly: 5.552 ± 0.102
1.611LeuHis: 1.611 ± 0.053
5.788LeuIle: 5.788 ± 0.103
5.694LeuLys: 5.694 ± 0.09
7.955LeuLeu: 7.955 ± 0.138
2.671LeuMet: 2.671 ± 0.066
3.608LeuAsn: 3.608 ± 0.068
2.962LeuPro: 2.962 ± 0.065
3.046LeuGln: 3.046 ± 0.068
3.461LeuArg: 3.461 ± 0.078
5.885LeuSer: 5.885 ± 0.098
4.804LeuThr: 4.804 ± 0.079
5.709LeuVal: 5.709 ± 0.093
0.702LeuTrp: 0.702 ± 0.031
3.356LeuTyr: 3.356 ± 0.08
0.001LeuXaa: 0.001 ± 0.001
Met
2.265MetAla: 2.265 ± 0.057
0.392MetCys: 0.392 ± 0.023
2.176MetAsp: 2.176 ± 0.055
2.525MetGlu: 2.525 ± 0.063
1.053MetPhe: 1.053 ± 0.036
2.009MetGly: 2.009 ± 0.062
0.516MetHis: 0.516 ± 0.029
2.297MetIle: 2.297 ± 0.064
2.786MetLys: 2.786 ± 0.065
2.779MetLeu: 2.779 ± 0.065
0.98MetMet: 0.98 ± 0.04
1.679MetAsn: 1.679 ± 0.051
1.1MetPro: 1.1 ± 0.036
1.363MetGln: 1.363 ± 0.044
1.302MetArg: 1.302 ± 0.041
1.784MetSer: 1.784 ± 0.049
1.719MetThr: 1.719 ± 0.048
2.088MetVal: 2.088 ± 0.052
0.235MetTrp: 0.235 ± 0.017
1.11MetTyr: 1.11 ± 0.037
0.0MetXaa: 0.0 ± 0.0
Asn
3.492AsnAla: 3.492 ± 0.076
0.686AsnCys: 0.686 ± 0.036
2.448AsnAsp: 2.448 ± 0.059
3.177AsnGlu: 3.177 ± 0.079
1.685AsnPhe: 1.685 ± 0.054
3.567AsnGly: 3.567 ± 0.088
0.901AsnHis: 0.901 ± 0.036
3.63AsnIle: 3.63 ± 0.077
2.918AsnLys: 2.918 ± 0.069
3.91AsnLeu: 3.91 ± 0.067
1.563AsnMet: 1.563 ± 0.045
2.331AsnAsn: 2.331 ± 0.084
1.879AsnPro: 1.879 ± 0.05
1.728AsnGln: 1.728 ± 0.059
2.126AsnArg: 2.126 ± 0.055
2.342AsnSer: 2.342 ± 0.062
2.618AsnThr: 2.618 ± 0.067
3.334AsnVal: 3.334 ± 0.073
0.41AsnTrp: 0.41 ± 0.025
2.122AsnTyr: 2.122 ± 0.057
0.0AsnXaa: 0.0 ± 0.0
Pro
2.103ProAla: 2.103 ± 0.056
0.406ProCys: 0.406 ± 0.023
2.036ProAsp: 2.036 ± 0.052
2.835ProGlu: 2.835 ± 0.062
1.344ProPhe: 1.344 ± 0.044
2.022ProGly: 2.022 ± 0.055
0.488ProHis: 0.488 ± 0.026
2.087ProIle: 2.087 ± 0.053
2.067ProLys: 2.067 ± 0.056
2.354ProLeu: 2.354 ± 0.061
0.866ProMet: 0.866 ± 0.036
1.223ProAsn: 1.223 ± 0.042
0.566ProPro: 0.566 ± 0.031
1.195ProGln: 1.195 ± 0.045
0.777ProArg: 0.777 ± 0.033
1.51ProSer: 1.51 ± 0.043
1.507ProThr: 1.507 ± 0.044
2.637ProVal: 2.637 ± 0.063
0.305ProTrp: 0.305 ± 0.02
1.462ProTyr: 1.462 ± 0.047
0.001ProXaa: 0.001 ± 0.001
Gln
2.426GlnAla: 2.426 ± 0.064
0.368GlnCys: 0.368 ± 0.024
1.633GlnAsp: 1.633 ± 0.048
2.865GlnGlu: 2.865 ± 0.071
1.236GlnPhe: 1.236 ± 0.039
1.858GlnGly: 1.858 ± 0.051
0.516GlnHis: 0.516 ± 0.024
2.848GlnIle: 2.848 ± 0.062
3.353GlnLys: 3.353 ± 0.078
2.984GlnLeu: 2.984 ± 0.067
1.549GlnMet: 1.549 ± 0.054
1.805GlnAsn: 1.805 ± 0.051
1.003GlnPro: 1.003 ± 0.045
1.6GlnGln: 1.6 ± 0.054
1.329GlnArg: 1.329 ± 0.039
1.768GlnSer: 1.768 ± 0.055
1.987GlnThr: 1.987 ± 0.053
2.252GlnVal: 2.252 ± 0.055
0.313GlnTrp: 0.313 ± 0.023
1.416GlnTyr: 1.416 ± 0.042
0.0GlnXaa: 0.0 ± 0.0
Arg
2.439ArgAla: 2.439 ± 0.063
0.515ArgCys: 0.515 ± 0.027
2.049ArgAsp: 2.049 ± 0.055
3.369ArgGlu: 3.369 ± 0.081
1.654ArgPhe: 1.654 ± 0.049
2.163ArgGly: 2.163 ± 0.061
0.693ArgHis: 0.693 ± 0.033
3.313ArgIle: 3.313 ± 0.07
3.376ArgLys: 3.376 ± 0.067
3.41ArgLeu: 3.41 ± 0.08
1.572ArgMet: 1.572 ± 0.05
2.028ArgAsn: 2.028 ± 0.048
1.162ArgPro: 1.162 ± 0.041
1.653ArgGln: 1.653 ± 0.054
1.851ArgArg: 1.851 ± 0.053
1.909ArgSer: 1.909 ± 0.048
2.083ArgThr: 2.083 ± 0.055
2.544ArgVal: 2.544 ± 0.059
0.294ArgTrp: 0.294 ± 0.02
1.751ArgTyr: 1.751 ± 0.048
0.001ArgXaa: 0.001 ± 0.001
Ser
4.36SerAla: 4.36 ± 0.084
0.775SerCys: 0.775 ± 0.033
3.524SerAsp: 3.524 ± 0.075
3.557SerGlu: 3.557 ± 0.073
2.517SerPhe: 2.517 ± 0.053
4.73SerGly: 4.73 ± 0.087
1.008SerHis: 1.008 ± 0.038
4.131SerIle: 4.131 ± 0.085
3.736SerLys: 3.736 ± 0.088
4.668SerLeu: 4.668 ± 0.078
1.944SerMet: 1.944 ± 0.05
2.589SerAsn: 2.589 ± 0.07
1.412SerPro: 1.412 ± 0.044
1.909SerGln: 1.909 ± 0.055
2.223SerArg: 2.223 ± 0.052
3.705SerSer: 3.705 ± 0.094
2.892SerThr: 2.892 ± 0.064
4.381SerVal: 4.381 ± 0.08
0.473SerTrp: 0.473 ± 0.027
2.707SerTyr: 2.707 ± 0.064
0.0SerXaa: 0.0 ± 0.0
Thr
4.01ThrAla: 4.01 ± 0.083
0.693ThrCys: 0.693 ± 0.036
3.518ThrAsp: 3.518 ± 0.074
3.813ThrGlu: 3.813 ± 0.079
2.134ThrPhe: 2.134 ± 0.054
4.482ThrGly: 4.482 ± 0.072
0.897ThrHis: 0.897 ± 0.037
4.413ThrIle: 4.413 ± 0.081
3.662ThrLys: 3.662 ± 0.075
4.794ThrLeu: 4.794 ± 0.093
1.522ThrMet: 1.522 ± 0.045
2.373ThrAsn: 2.373 ± 0.062
1.909ThrPro: 1.909 ± 0.057
1.805ThrGln: 1.805 ± 0.058
1.878ThrArg: 1.878 ± 0.054
3.21ThrSer: 3.21 ± 0.069
3.283ThrThr: 3.283 ± 0.077
4.463ThrVal: 4.463 ± 0.092
0.422ThrTrp: 0.422 ± 0.025
2.548ThrTyr: 2.548 ± 0.066
0.003ThrXaa: 0.003 ± 0.002
Val
5.444ValAla: 5.444 ± 0.1
1.247ValCys: 1.247 ± 0.044
4.188ValAsp: 4.188 ± 0.085
4.701ValGlu: 4.701 ± 0.083
2.901ValPhe: 2.901 ± 0.067
4.359ValGly: 4.359 ± 0.079
1.108ValHis: 1.108 ± 0.038
5.351ValIle: 5.351 ± 0.097
4.723ValLys: 4.723 ± 0.096
6.54ValLeu: 6.54 ± 0.112
2.076ValMet: 2.076 ± 0.053
3.119ValAsn: 3.119 ± 0.07
2.451ValPro: 2.451 ± 0.056
2.184ValGln: 2.184 ± 0.051
2.792ValArg: 2.792 ± 0.062
4.602ValSer: 4.602 ± 0.08
4.518ValThr: 4.518 ± 0.089
5.185ValVal: 5.185 ± 0.112
0.584ValTrp: 0.584 ± 0.031
2.888ValTyr: 2.888 ± 0.068
0.001ValXaa: 0.001 ± 0.002
Trp
0.471TrpAla: 0.471 ± 0.024
0.12TrpCys: 0.12 ± 0.013
0.492TrpAsp: 0.492 ± 0.027
0.56TrpGlu: 0.56 ± 0.027
0.364TrpPhe: 0.364 ± 0.024
0.593TrpGly: 0.593 ± 0.028
0.183TrpHis: 0.183 ± 0.014
0.685TrpIle: 0.685 ± 0.03
0.817TrpLys: 0.817 ± 0.034
0.75TrpLeu: 0.75 ± 0.032
0.28TrpMet: 0.28 ± 0.019
0.535TrpAsn: 0.535 ± 0.031
0.2TrpPro: 0.2 ± 0.018
0.305TrpGln: 0.305 ± 0.02
0.314TrpArg: 0.314 ± 0.025
0.404TrpSer: 0.404 ± 0.024
0.39TrpThr: 0.39 ± 0.027
0.48TrpVal: 0.48 ± 0.029
0.097TrpTrp: 0.097 ± 0.011
0.383TrpTyr: 0.383 ± 0.024
0.0TrpXaa: 0.0 ± 0.0
Tyr
2.977TyrAla: 2.977 ± 0.067
0.632TyrCys: 0.632 ± 0.031
2.856TyrAsp: 2.856 ± 0.064
3.279TyrGlu: 3.279 ± 0.075
1.871TyrPhe: 1.871 ± 0.06
2.81TyrGly: 2.81 ± 0.066
0.963TyrHis: 0.963 ± 0.038
2.92TyrIle: 2.92 ± 0.073
2.577TyrLys: 2.577 ± 0.07
3.771TyrLeu: 3.771 ± 0.078
1.213TyrMet: 1.213 ± 0.048
1.998TyrAsn: 1.998 ± 0.061
1.436TyrPro: 1.436 ± 0.047
1.653TyrGln: 1.653 ± 0.046
2.021TyrArg: 2.021 ± 0.057
2.457TyrSer: 2.457 ± 0.072
2.393TyrThr: 2.393 ± 0.058
2.916TyrVal: 2.916 ± 0.072
0.32TyrTrp: 0.32 ± 0.021
2.152TyrTyr: 2.152 ± 0.066
0.0TyrXaa: 0.0 ± 0.0
Xaa
0.0XaaAla: 0.0 ± 0.0
0.0XaaCys: 0.0 ± 0.0
0.001XaaAsp: 0.001 ± 0.002
0.001XaaGlu: 0.001 ± 0.001
0.0XaaPhe: 0.0 ± 0.0
0.003XaaGly: 0.003 ± 0.002
0.001XaaHis: 0.001 ± 0.001
0.0XaaIle: 0.0 ± 0.0
0.0XaaLys: 0.0 ± 0.0
0.003XaaLeu: 0.003 ± 0.002
0.0XaaMet: 0.0 ± 0.0
0.0XaaAsn: 0.0 ± 0.0
0.0XaaPro: 0.0 ± 0.0
0.001XaaGln: 0.001 ± 0.001
0.003XaaArg: 0.003 ± 0.002
0.001XaaSer: 0.001 ± 0.001
0.001XaaThr: 0.001 ± 0.001
0.003XaaVal: 0.003 ± 0.002
0.0XaaTrp: 0.0 ± 0.0
0.0XaaTyr: 0.0 ± 0.0
0.024XaaXaa: 0.024 ± 0.007
Statistics based on 2420 proteins (741697 amino acids)

Note: The error has been estimated by bootstrapping (×100) at the protein level.
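One way such an error estimate could be obtained is sketched below: whole proteins are resampled with replacement 100 times, the dipeptide frequency is recomputed for each resample, and the standard deviation of those estimates is reported. This is an illustration under stated assumptions (one-letter codes, sample standard deviation as the error, hypothetical function name and seed), not necessarily the procedure used by the database.

```python
import random
import statistics
from collections import Counter

def bootstrap_error(proteins, dipeptide, n_resamples=100, seed=0):
    """Bootstrap (protein-level) error of a dipeptide frequency, in per mille."""
    rng = random.Random(seed)
    # Count dipeptides once per protein so resampling is cheap
    per_protein = [
        Counter(seq[i:i + 2] for i in range(len(seq) - 1)) for seq in proteins
    ]
    estimates = []
    for _ in range(n_resamples):
        # Resample whole proteins with replacement
        sample = rng.choices(per_protein, k=len(per_protein))
        hits = sum(c[dipeptide] for c in sample)
        total = sum(sum(c.values()) for c in sample)
        estimates.append(1000.0 * hits / total if total else 0.0)
    return statistics.stdev(estimates)

# Toy input, not taken from the proteome described here
error = bootstrap_error(["AAKA", "KKAA", "AKAK", "AAAA"], "AA")
```

Resampling at the protein level, rather than at the level of individual dipeptides, keeps the pairs from one protein together, so the estimate reflects variation between proteins.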

The dipeptide statistics above (among other statistics for this proteome) can be downloaded from this CSV file.
See this proteome in: uniprot_link
Proteome-pI is available under Creative Commons Attribution-NoDerivs license, for more details see here

Reference: Kozlowski LP. Proteome-pI 2.0: Proteome Isoelectric Point Database Update. Nucleic Acids Res. 2021, doi: 10.1093/nar/gkab944
Contact: Lukasz P. Kozlowski