Amino acid dipeptide frequency for Clostridium sp. CAG:7

Apart from single amino acid frequencies, one can also calculate the so-called amino acid dipeptide frequency, i.e. the frequency of each ordered pair of adjacent amino acids.
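As a rough illustration of what such a calculation involves, the following Python sketch counts overlapping residue pairs within each protein and reports them in per mille. It is not the code used to build this table: the function name `dipeptide_per_mille`, the toy sequences, the use of one-letter codes, and the choice to count overlapping pairs that never span two proteins are all assumptions made for the example.

```python
from collections import Counter


def dipeptide_per_mille(sequences):
    """Return {dipeptide: frequency in per mille} for a list of protein sequences.

    Assumption: dipeptides are overlapping windows of length 2 taken
    inside each protein, so a pair never spans two proteins.
    """
    counts = Counter()
    for seq in sequences:
        for i in range(len(seq) - 1):
            counts[seq[i:i + 2]] += 1
    total = sum(counts.values())
    if total == 0:
        return {}
    # Scale the fraction of all counted pairs to per mille.
    return {pair: 1000.0 * n / total for pair, n in counts.items()}


# Hypothetical toy proteome (one-letter codes), not taken from Clostridium sp. CAG:7.
toy = ["MKKLA", "AKK"]
freqs = dipeptide_per_mille(toy)
for pair, value in sorted(freqs.items()):
    print(f"{pair}: {value:.1f}")
```

In the toy input there are six pairs in total and two of them are KK, so KK accounts for one third of all dipeptides.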

There are 400 possible dipeptides (441 if Xaa is counted as an additional, 21st amino acid representing all non-standard or unknown amino acids). Thus, if dipeptides occurred randomly in proteins, each amino acid dipeptide would be present at a frequency of 0.25% (2.5‰). As this is not the case in nature, for better visibility all dipeptides that are more frequent than expected are marked in red in the table, and those that are underrepresented are marked in blue.
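The arithmetic behind these numbers can be checked with a short Python sketch. The page does not state the exact threshold used for the red and blue marking, so comparing each value against the uniform expectation of 1/400 is an assumption here, and the `classify` helper is hypothetical. The three example values are taken from the table.

```python
STANDARD_AMINO_ACIDS = 20
WITH_XAA = 21

n_pairs = STANDARD_AMINO_ACIDS ** 2   # 400 ordered pairs
n_pairs_xaa = WITH_XAA ** 2           # 441 ordered pairs when Xaa is included

# Uniform expectation over the 400 standard dipeptides, in per mille.
uniform_per_mille = 1000.0 / n_pairs  # 2.5 per mille, i.e. 0.25 %


def classify(freq_per_mille, expected=uniform_per_mille):
    """Label a dipeptide relative to the expected value (assumed threshold)."""
    if freq_per_mille > expected:
        return "over"
    if freq_per_mille < expected:
        return "under"
    return "as expected"


# Values copied from the table, in per mille.
examples = {"AlaAla": 8.912, "CysCys": 0.368, "TrpTrp": 0.125}
labels = {name: classify(value) for name, value in examples.items()}
print(n_pairs, n_pairs_xaa, uniform_per_mille, labels)
```

Under this assumption AlaAla would be marked as overrepresented, while CysCys and TrpTrp would be marked as underrepresented.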

All values are presented in per mille (‰) and therefore need to be multiplied by 10⁻³ to obtain fractions.
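Because the text quotes the random expectation in percent while the table uses per mille, a small conversion sketch may help when comparing the two. The helper names are hypothetical; the AlaAla value is taken from the table.

```python
def per_mille_to_fraction(value):
    """Convert a per mille value to a plain fraction (multiply by 10^-3)."""
    return value * 1e-3


def per_mille_to_percent(value):
    """Convert a per mille value to percent (divide by 10)."""
    return value / 10.0


ala_ala = 8.912  # AlaAla from the table, in per mille
fraction = per_mille_to_fraction(ala_ala)  # roughly 0.008912
percent = per_mille_to_percent(ala_ala)    # roughly 0.8912 %
print(f"{fraction:.6f} {percent:.4f}")
```

Expressed this way, AlaAla at about 0.89% is well above the 0.25% quoted for a random distribution.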

For more information, see the sequence space article on Wikipedia.

Ala Cys Asp Glu Phe Gly His Ile Lys Leu Met Asn Pro Gln Arg Ser Thr Val Trp Tyr Xaa
Ala
AlaAla: 8.912 ± 0.151
AlaCys: 1.232 ± 0.045
AlaAsp: 4.762 ± 0.075
AlaGlu: 5.59 ± 0.099
AlaPhe: 3.236 ± 0.062
AlaGly: 7.152 ± 0.098
AlaHis: 1.24 ± 0.034
AlaIle: 4.928 ± 0.096
AlaLys: 5.035 ± 0.075
AlaLeu: 7.382 ± 0.105
AlaMet: 2.738 ± 0.058
AlaAsn: 2.374 ± 0.054
AlaPro: 2.269 ± 0.056
AlaGln: 2.504 ± 0.052
AlaArg: 2.913 ± 0.057
AlaSer: 4.301 ± 0.079
AlaThr: 3.321 ± 0.066
AlaVal: 6.948 ± 0.116
AlaTrp: 0.683 ± 0.027
AlaTyr: 2.783 ± 0.06
AlaXaa: 0.004 ± 0.002
Cys
CysAla: 1.086 ± 0.034
CysCys: 0.368 ± 0.024
CysAsp: 0.831 ± 0.035
CysGlu: 0.887 ± 0.029
CysPhe: 0.674 ± 0.027
CysGly: 1.496 ± 0.043
CysHis: 0.319 ± 0.018
CysIle: 1.088 ± 0.033
CysLys: 0.726 ± 0.031
CysLeu: 1.333 ± 0.041
CysMet: 0.455 ± 0.023
CysAsn: 0.546 ± 0.027
CysPro: 0.663 ± 0.031
CysGln: 0.459 ± 0.019
CysArg: 0.726 ± 0.029
CysSer: 1.017 ± 0.036
CysThr: 0.757 ± 0.03
CysVal: 1.032 ± 0.033
CysTrp: 0.108 ± 0.01
CysTyr: 0.563 ± 0.028
CysXaa: 0.0 ± 0.0
Asp
AspAla: 4.106 ± 0.08
AspCys: 0.826 ± 0.034
AspAsp: 2.832 ± 0.071
AspGlu: 4.394 ± 0.081
AspPhe: 2.466 ± 0.05
AspGly: 4.539 ± 0.087
AspHis: 1.19 ± 0.039
AspIle: 3.989 ± 0.065
AspLys: 3.264 ± 0.059
AspLeu: 4.966 ± 0.073
AspMet: 1.923 ± 0.05
AspAsn: 1.92 ± 0.047
AspPro: 2.194 ± 0.059
AspGln: 1.8 ± 0.045
AspArg: 2.403 ± 0.055
AspSer: 2.697 ± 0.062
AspThr: 3.057 ± 0.069
AspVal: 3.719 ± 0.077
AspTrp: 0.59 ± 0.026
AspTyr: 2.625 ± 0.053
AspXaa: 0.001 ± 0.001
Glu
GluAla: 5.963 ± 0.093
GluCys: 0.726 ± 0.027
GluAsp: 4.436 ± 0.097
GluGlu: 7.484 ± 0.145
GluPhe: 2.436 ± 0.053
GluGly: 4.851 ± 0.087
GluHis: 1.411 ± 0.045
GluIle: 4.998 ± 0.097
GluLys: 6.991 ± 0.104
GluLeu: 6.387 ± 0.1
GluMet: 2.412 ± 0.048
GluAsn: 4.114 ± 0.079
GluPro: 2.069 ± 0.046
GluGln: 2.828 ± 0.059
GluArg: 3.234 ± 0.058
GluSer: 3.069 ± 0.07
GluThr: 4.022 ± 0.097
GluVal: 4.196 ± 0.074
GluTrp: 0.612 ± 0.023
GluTyr: 2.79 ± 0.069
GluXaa: 0.001 ± 0.001
Phe
PheAla: 3.123 ± 0.063
PheCys: 0.717 ± 0.028
PheAsp: 2.405 ± 0.056
PheGlu: 2.416 ± 0.048
PhePhe: 1.735 ± 0.054
PheGly: 3.174 ± 0.072
PheHis: 0.849 ± 0.029
PheIle: 2.699 ± 0.06
PheLys: 2.238 ± 0.049
PheLeu: 4.047 ± 0.1
PheMet: 1.348 ± 0.049
PheAsn: 1.479 ± 0.048
PhePro: 1.482 ± 0.045
PheGln: 1.201 ± 0.038
PheArg: 1.539 ± 0.044
PheSer: 2.768 ± 0.057
PheThr: 2.394 ± 0.051
PheVal: 2.674 ± 0.051
PheTrp: 0.463 ± 0.024
PheTyr: 1.689 ± 0.042
PheXaa: 0.0 ± 0.0
Gly
GlyAla: 5.735 ± 0.09
GlyCys: 1.429 ± 0.041
GlyAsp: 3.605 ± 0.071
GlyGlu: 4.871 ± 0.08
GlyPhe: 3.248 ± 0.058
GlyGly: 5.221 ± 0.092
GlyHis: 1.286 ± 0.036
GlyIle: 6.411 ± 0.09
GlyLys: 5.834 ± 0.087
GlyLeu: 6.506 ± 0.087
GlyMet: 2.74 ± 0.045
GlyAsn: 3.234 ± 0.064
GlyPro: 1.644 ± 0.05
GlyGln: 2.251 ± 0.052
GlyArg: 3.214 ± 0.062
GlySer: 4.497 ± 0.083
GlyThr: 4.549 ± 0.081
GlyVal: 5.163 ± 0.075
GlyTrp: 0.888 ± 0.036
GlyTyr: 3.239 ± 0.065
GlyXaa: 0.001 ± 0.001
His
HisAla: 1.117 ± 0.033
HisCys: 0.305 ± 0.017
HisAsp: 0.859 ± 0.038
HisGlu: 0.993 ± 0.035
HisPhe: 0.926 ± 0.032
HisGly: 1.304 ± 0.041
HisHis: 0.427 ± 0.03
HisIle: 1.384 ± 0.04
HisLys: 0.997 ± 0.031
HisLeu: 1.645 ± 0.043
HisMet: 0.689 ± 0.028
HisAsn: 0.685 ± 0.029
HisPro: 0.908 ± 0.036
HisGln: 0.555 ± 0.024
HisArg: 0.761 ± 0.03
HisSer: 0.962 ± 0.035
HisThr: 1.006 ± 0.033
HisVal: 1.214 ± 0.034
HisTrp: 0.188 ± 0.014
HisTyr: 0.768 ± 0.031
HisXaa: 0.0 ± 0.0
Ile
IleAla: 5.573 ± 0.095
IleCys: 1.392 ± 0.043
IleAsp: 3.474 ± 0.054
IleGlu: 4.146 ± 0.074
IlePhe: 2.797 ± 0.068
IleGly: 5.062 ± 0.088
IleHis: 1.446 ± 0.044
IleIle: 4.305 ± 0.075
IleLys: 3.756 ± 0.064
IleLeu: 7.003 ± 0.101
IleMet: 1.938 ± 0.048
IleAsn: 2.663 ± 0.058
IlePro: 3.296 ± 0.062
IleGln: 2.436 ± 0.048
IleArg: 3.675 ± 0.075
IleSer: 4.511 ± 0.067
IleThr: 4.084 ± 0.074
IleVal: 4.37 ± 0.076
IleTrp: 0.629 ± 0.026
IleTyr: 2.555 ± 0.052
IleXaa: 0.001 ± 0.001
Lys
LysAla: 5.582 ± 0.076
LysCys: 0.662 ± 0.028
LysAsp: 4.045 ± 0.069
LysGlu: 7.043 ± 0.106
LysPhe: 1.775 ± 0.043
LysGly: 4.551 ± 0.067
LysHis: 1.033 ± 0.035
LysIle: 4.402 ± 0.062
LysLys: 5.992 ± 0.088
LysLeu: 5.402 ± 0.08
LysMet: 2.328 ± 0.055
LysAsn: 3.217 ± 0.054
LysPro: 2.049 ± 0.053
LysGln: 2.27 ± 0.049
LysArg: 2.919 ± 0.058
LysSer: 2.957 ± 0.055
LysThr: 3.71 ± 0.06
LysVal: 4.38 ± 0.074
LysTrp: 0.678 ± 0.03
LysTyr: 2.442 ± 0.057
LysXaa: 0.001 ± 0.001
Leu
LeuAla: 7.17 ± 0.09
LeuCys: 1.467 ± 0.039
LeuAsp: 5.143 ± 0.08
LeuGlu: 6.601 ± 0.106
LeuPhe: 3.986 ± 0.094
LeuGly: 6.397 ± 0.097
LeuHis: 1.468 ± 0.035
LeuIle: 5.695 ± 0.091
LeuLys: 6.68 ± 0.088
LeuLeu: 8.512 ± 0.142
LeuMet: 2.842 ± 0.063
LeuAsn: 3.648 ± 0.058
LeuPro: 3.782 ± 0.073
LeuGln: 2.605 ± 0.058
LeuArg: 3.671 ± 0.065
LeuSer: 6.11 ± 0.082
LeuThr: 5.211 ± 0.081
LeuVal: 5.746 ± 0.088
LeuTrp: 0.824 ± 0.033
LeuTyr: 3.35 ± 0.062
LeuXaa: 0.001 ± 0.001
Met
MetAla: 3.034 ± 0.054
MetCys: 0.38 ± 0.022
MetAsp: 2.044 ± 0.05
MetGlu: 2.778 ± 0.053
MetPhe: 1.041 ± 0.037
MetGly: 2.517 ± 0.059
MetHis: 0.433 ± 0.025
MetIle: 2.381 ± 0.048
MetLys: 2.504 ± 0.056
MetLeu: 2.673 ± 0.065
MetMet: 1.027 ± 0.039
MetAsn: 1.49 ± 0.042
MetPro: 1.232 ± 0.037
MetGln: 0.932 ± 0.03
MetArg: 1.299 ± 0.036
MetSer: 1.813 ± 0.043
MetThr: 1.882 ± 0.047
MetVal: 2.208 ± 0.045
MetTrp: 0.228 ± 0.015
MetTyr: 0.881 ± 0.029
MetXaa: 0.002 ± 0.001
Asn
AsnAla: 2.925 ± 0.057
AsnCys: 0.601 ± 0.027
AsnAsp: 1.94 ± 0.05
AsnGlu: 2.594 ± 0.058
AsnPhe: 1.489 ± 0.043
AsnGly: 3.666 ± 0.075
AsnHis: 0.753 ± 0.027
AsnIle: 2.982 ± 0.058
AsnLys: 2.169 ± 0.047
AsnLeu: 3.307 ± 0.07
AsnMet: 1.299 ± 0.038
AsnAsn: 1.564 ± 0.048
AsnPro: 1.96 ± 0.044
AsnGln: 1.371 ± 0.038
AsnArg: 1.884 ± 0.053
AsnSer: 2.305 ± 0.056
AsnThr: 2.263 ± 0.049
AsnVal: 2.828 ± 0.062
AsnTrp: 0.479 ± 0.022
AsnTyr: 1.562 ± 0.043
AsnXaa: 0.001 ± 0.001
Pro
ProAla: 2.838 ± 0.059
ProCys: 0.475 ± 0.024
ProAsp: 2.442 ± 0.055
ProGlu: 3.608 ± 0.084
ProPhe: 1.619 ± 0.045
ProGly: 2.894 ± 0.058
ProHis: 0.622 ± 0.027
ProIle: 1.924 ± 0.05
ProLys: 1.964 ± 0.046
ProLeu: 3.033 ± 0.063
ProMet: 0.946 ± 0.034
ProAsn: 1.099 ± 0.031
ProPro: 0.773 ± 0.031
ProGln: 1.156 ± 0.037
ProArg: 1.011 ± 0.04
ProSer: 2.006 ± 0.047
ProThr: 1.558 ± 0.041
ProVal: 3.602 ± 0.061
ProTrp: 0.369 ± 0.017
ProTyr: 1.481 ± 0.036
ProXaa: 0.002 ± 0.002
Gln
GlnAla: 2.819 ± 0.06
GlnCys: 0.414 ± 0.019
GlnAsp: 1.503 ± 0.042
GlnGlu: 2.464 ± 0.061
GlnPhe: 1.224 ± 0.034
GlnGly: 2.033 ± 0.046
GlnHis: 0.387 ± 0.023
GlnIle: 2.66 ± 0.055
GlnLys: 2.4 ± 0.056
GlnLeu: 3.051 ± 0.064
GlnMet: 1.327 ± 0.036
GlnAsn: 1.404 ± 0.041
GlnPro: 1.046 ± 0.036
GlnGln: 1.054 ± 0.04
GlnArg: 1.168 ± 0.035
GlnSer: 1.558 ± 0.043
GlnThr: 1.702 ± 0.048
GlnVal: 2.498 ± 0.053
GlnTrp: 0.371 ± 0.02
GlnTyr: 1.333 ± 0.041
GlnXaa: 0.0 ± 0.0
Arg
ArgAla: 2.613 ± 0.051
ArgCys: 0.576 ± 0.025
ArgAsp: 2.128 ± 0.053
ArgGlu: 3.538 ± 0.072
ArgPhe: 1.844 ± 0.046
ArgGly: 2.409 ± 0.048
ArgHis: 0.771 ± 0.028
ArgIle: 3.231 ± 0.061
ArgLys: 3.311 ± 0.05
ArgLeu: 4.007 ± 0.076
ArgMet: 1.608 ± 0.045
ArgAsn: 1.803 ± 0.045
ArgPro: 1.442 ± 0.037
ArgGln: 1.714 ± 0.043
ArgArg: 2.146 ± 0.057
ArgSer: 2.333 ± 0.054
ArgThr: 2.244 ± 0.043
ArgVal: 2.519 ± 0.054
ArgTrp: 0.366 ± 0.022
ArgTyr: 1.846 ± 0.044
ArgXaa: 0.001 ± 0.001
Ser
SerAla: 4.304 ± 0.067
SerCys: 0.882 ± 0.032
SerAsp: 3.098 ± 0.058
SerGlu: 3.538 ± 0.078
SerPhe: 2.518 ± 0.058
SerGly: 4.99 ± 0.082
SerHis: 1.106 ± 0.034
SerIle: 3.809 ± 0.072
SerLys: 2.967 ± 0.056
SerLeu: 5.401 ± 0.08
SerMet: 1.898 ± 0.047
SerAsn: 2.044 ± 0.056
SerPro: 1.76 ± 0.043
SerGln: 2.035 ± 0.05
SerArg: 2.795 ± 0.057
SerSer: 3.606 ± 0.094
SerThr: 2.739 ± 0.066
SerVal: 4.208 ± 0.06
SerTrp: 0.62 ± 0.028
SerTyr: 2.401 ± 0.054
SerXaa: 0.0 ± 0.0
Thr
ThrAla: 4.951 ± 0.085
ThrCys: 0.734 ± 0.03
ThrAsp: 3.278 ± 0.062
ThrGlu: 3.967 ± 0.083
ThrPhe: 2.14 ± 0.051
ThrGly: 5.245 ± 0.074
ThrHis: 0.876 ± 0.033
ThrIle: 3.616 ± 0.062
ThrLys: 2.885 ± 0.058
ThrLeu: 4.714 ± 0.068
ThrMet: 1.511 ± 0.044
ThrAsn: 1.795 ± 0.047
ThrPro: 2.231 ± 0.049
ThrGln: 1.507 ± 0.039
ThrArg: 1.961 ± 0.048
ThrSer: 2.871 ± 0.072
ThrThr: 2.878 ± 0.072
ThrVal: 4.556 ± 0.077
ThrTrp: 0.538 ± 0.027
ThrTyr: 1.898 ± 0.051
ThrXaa: 0.001 ± 0.001
Val
ValAla: 5.21 ± 0.078
ValCys: 1.205 ± 0.038
ValAsp: 3.69 ± 0.057
ValGlu: 4.423 ± 0.072
ValPhe: 3.026 ± 0.064
ValGly: 4.403 ± 0.095
ValHis: 1.135 ± 0.034
ValIle: 5.377 ± 0.088
ValLys: 4.697 ± 0.083
ValLeu: 6.962 ± 0.101
ValMet: 2.2 ± 0.05
ValAsn: 2.732 ± 0.052
ValPro: 2.834 ± 0.054
ValGln: 1.936 ± 0.044
ValArg: 2.937 ± 0.059
ValSer: 4.596 ± 0.067
ValThr: 4.203 ± 0.072
ValVal: 4.942 ± 0.089
ValTrp: 0.66 ± 0.028
ValTyr: 2.788 ± 0.062
ValXaa: 0.0 ± 0.0
Trp
TrpAla: 0.554 ± 0.026
TrpCys: 0.158 ± 0.014
TrpAsp: 0.58 ± 0.027
TrpGlu: 0.709 ± 0.027
TrpPhe: 0.433 ± 0.022
TrpGly: 0.667 ± 0.023
TrpHis: 0.161 ± 0.013
TrpIle: 0.708 ± 0.028
TrpLys: 0.832 ± 0.029
TrpLeu: 0.958 ± 0.031
TrpMet: 0.353 ± 0.022
TrpAsn: 0.505 ± 0.023
TrpPro: 0.263 ± 0.018
TrpGln: 0.42 ± 0.02
TrpArg: 0.399 ± 0.021
TrpSer: 0.479 ± 0.022
TrpThr: 0.431 ± 0.024
TrpVal: 0.589 ± 0.028
TrpTrp: 0.125 ± 0.011
TrpTyr: 0.445 ± 0.03
TrpXaa: 0.0 ± 0.0
Tyr
TyrAla: 2.691 ± 0.054
TyrCys: 0.588 ± 0.024
TyrAsp: 2.405 ± 0.056
TyrGlu: 2.986 ± 0.067
TyrPhe: 1.828 ± 0.051
TyrGly: 2.992 ± 0.052
TyrHis: 0.78 ± 0.032
TyrIle: 2.525 ± 0.051
TyrLys: 2.197 ± 0.048
TyrLeu: 3.617 ± 0.083
TyrMet: 1.196 ± 0.039
TyrAsn: 1.625 ± 0.045
TyrPro: 1.52 ± 0.038
TyrGln: 1.382 ± 0.041
TyrArg: 1.797 ± 0.044
TyrSer: 2.263 ± 0.052
TyrThr: 2.142 ± 0.052
TyrVal: 2.559 ± 0.052
TyrTrp: 0.351 ± 0.02
TyrTyr: 1.773 ± 0.062
TyrXaa: 0.001 ± 0.001
Xaa
XaaAla: 0.003 ± 0.002
XaaCys: 0.0 ± 0.0
XaaAsp: 0.0 ± 0.0
XaaGlu: 0.0 ± 0.0
XaaPhe: 0.0 ± 0.0
XaaGly: 0.002 ± 0.001
XaaHis: 0.001 ± 0.001
XaaIle: 0.002 ± 0.002
XaaLys: 0.0 ± 0.0
XaaLeu: 0.0 ± 0.0
XaaMet: 0.001 ± 0.001
XaaAsn: 0.0 ± 0.0
XaaPro: 0.001 ± 0.001
XaaGln: 0.0 ± 0.0
XaaArg: 0.001 ± 0.001
XaaSer: 0.0 ± 0.0
XaaThr: 0.002 ± 0.002
XaaVal: 0.004 ± 0.002
XaaTrp: 0.0 ± 0.0
XaaTyr: 0.001 ± 0.001
XaaXaa: 0.027 ± 0.007
Statistics based on 2911 proteins (951847 amino acids)

Note: The error has been estimated by bootstrapping (100 resamples) at the protein level.
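A protein-level bootstrap of this kind can be sketched in Python as follows. The page only states that bootstrapping was done 100 times at the protein level; treating the reported ± value as the standard deviation of the resampled estimates is an assumption, and the function name `bootstrap_error`, the fixed seed, and the toy sequences are hypothetical.

```python
import random
from collections import Counter
from statistics import stdev


def bootstrap_error(sequences, pair, n_resamples=100, seed=0):
    """Estimate the spread of one dipeptide frequency (per mille).

    Whole proteins are resampled with replacement, so the resampling
    unit is the protein rather than the individual dipeptide.
    """
    rng = random.Random(seed)
    # Count dipeptides once per protein, then resample the per-protein counts.
    per_protein = [
        Counter(seq[i:i + 2] for i in range(len(seq) - 1)) for seq in sequences
    ]
    estimates = []
    for _ in range(n_resamples):
        sample = rng.choices(per_protein, k=len(per_protein))
        total = sum(sum(c.values()) for c in sample)
        hits = sum(c[pair] for c in sample)
        estimates.append(1000.0 * hits / total if total else 0.0)
    # Assumption: the error is the sample standard deviation of the estimates.
    return stdev(estimates)


# Hypothetical toy proteome (one-letter codes).
toy_proteome = ["MKKLA", "AKK", "GGG", "KKKK"]
error_kk = bootstrap_error(toy_proteome, "KK")
print(f"KK bootstrap error: {error_kk:.1f} per mille")
```

Resampling whole proteins keeps the dipeptides of one protein together, so the estimate reflects variation between proteins rather than between individual residue pairs.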

The dipeptide statistics above (among other statistics for this proteome) can be downloaded from this CSV file
See this proteome in: uniprot_link
Proteome-pI is available under the Creative Commons Attribution-NoDerivs license; for more details see here

Reference: Kozlowski LP. Proteome-pI 2.0: Proteome Isoelectric Point Database Update. Nucleic Acids Res. 2021, doi: 10.1093/nar/gkab944

Contact: Lukasz P. Kozlowski