Amino acid dipeptide frequency for Clostridium sp. CAG:167

Apart from single amino acid frequencies, one can also calculate the so-called amino acid dipeptide frequency, i.e. how often each ordered pair of adjacent residues occurs.
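As a rough illustration of what such a calculation might look like, the sketch below counts overlapping adjacent residue pairs within each protein and normalises by the total number of pairs. The function name, the one-letter toy sequences, and the choice of overlapping pairs pooled over all proteins are assumptions made for this example; the page does not state how the database computes its values.

```python
from collections import Counter


def dipeptide_frequencies(sequences):
    """Return {dipeptide: frequency in per mille} pooled over all sequences.

    Assumption: pairs overlap (residues i and i+1) and never span two proteins.
    """
    counts = Counter()
    for seq in sequences:
        for first, second in zip(seq, seq[1:]):
            counts[first + second] += 1
    total = sum(counts.values())
    if total == 0:
        return {}
    return {pair: 1000.0 * n / total for pair, n in counts.items()}


# Hypothetical toy proteome in one-letter code (not taken from the database).
toy_proteome = ["MKKLA", "AKK"]
freqs = dipeptide_frequencies(toy_proteome)
for pair in sorted(freqs):
    print(pair, round(freqs[pair], 1))
```

Pooling the counts before normalising weights long proteins more heavily than short ones; averaging per-protein frequencies would be an equally plausible design and would give different numbers.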

There are 400 possible dipeptides (441 if Xaa is counted as an additional, 21st amino acid representing all non-standard or unknown amino acids). Thus, if dipeptides occurred randomly in proteins, each of the 400 would be present at a frequency of 0.25% (2.5‰). As this is not the case in nature, for better visibility all dipeptides that occur more often than expected are marked in red in the table, and those that are underrepresented are marked in blue.
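The arithmetic behind the expected value can be sketched in a few lines. The `classify` helper is hypothetical, and comparing each value against the uniform expectation is an assumption about how the red and blue marking is decided; the two observed values are taken from the table on this page.

```python
N_STANDARD = 20   # standard amino acids
N_WITH_XAA = 21   # standard amino acids plus Xaa

possible = N_STANDARD ** 2            # 400 ordered pairs
possible_with_xaa = N_WITH_XAA ** 2   # 441 ordered pairs

expected_fraction = 1 / possible
expected_percent = 100 * expected_fraction     # 0.25 %
expected_per_mille = 1000 * expected_fraction  # 2.5 per mille


def classify(observed_per_mille, expected=expected_per_mille):
    """Label a dipeptide relative to the uniform expectation (assumed rule)."""
    if observed_per_mille > expected:
        return "overrepresented"
    if observed_per_mille < expected:
        return "underrepresented"
    return "as expected"


print(possible, possible_with_xaa)
print(classify(9.056))  # LysLys value from the table
print(classify(0.105))  # TrpTrp value from the table
```

A uniform expectation ignores the fact that the single amino acids themselves are not equally common; comparing against the product of the two single-residue frequencies would be a stricter baseline, but that is outside what the page describes.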

All values are presented in per mille (‰) and therefore need to be multiplied by 10⁻³ to obtain fractions.

For more information, see the sequence space article on Wikipedia.

Residue order: Ala Cys Asp Glu Phe Gly His Ile Lys Leu Met Asn Pro Gln Arg Ser Thr Val Trp Tyr Xaa
Each group below is headed by the first residue of the dipeptide; entries give frequency ± error in ‰.

Ala
AlaAla: 6.561 ± 0.134
AlaCys: 1.051 ± 0.043
AlaAsp: 4.202 ± 0.085
AlaGlu: 4.741 ± 0.092
AlaPhe: 2.817 ± 0.072
AlaGly: 6.197 ± 0.112
AlaHis: 1.11 ± 0.04
AlaIle: 4.312 ± 0.095
AlaLys: 5.442 ± 0.111
AlaLeu: 6.588 ± 0.121
AlaMet: 2.49 ± 0.066
AlaAsn: 2.378 ± 0.073
AlaPro: 2.01 ± 0.061
AlaGln: 2.156 ± 0.062
AlaArg: 3.021 ± 0.076
AlaSer: 4.337 ± 0.089
AlaThr: 3.521 ± 0.086
AlaVal: 6.306 ± 0.104
AlaTrp: 0.606 ± 0.03
AlaTyr: 2.769 ± 0.064
AlaXaa: 0.003 ± 0.002

Cys
CysAla: 0.877 ± 0.037
CysCys: 0.301 ± 0.022
CysAsp: 0.786 ± 0.037
CysGlu: 0.842 ± 0.035
CysPhe: 0.682 ± 0.033
CysGly: 1.436 ± 0.048
CysHis: 0.373 ± 0.028
CysIle: 1.086 ± 0.047
CysLys: 0.848 ± 0.041
CysLeu: 1.289 ± 0.048
CysMet: 0.477 ± 0.028
CysAsn: 0.517 ± 0.031
CysPro: 0.575 ± 0.036
CysGln: 0.575 ± 0.032
CysArg: 0.799 ± 0.039
CysSer: 0.875 ± 0.041
CysThr: 0.69 ± 0.034
CysVal: 1.081 ± 0.044
CysTrp: 0.079 ± 0.011
CysTyr: 0.588 ± 0.034
CysXaa: 0.0 ± 0.0

Asp
AspAla: 4.012 ± 0.075
AspCys: 0.832 ± 0.039
AspAsp: 2.819 ± 0.078
AspGlu: 4.581 ± 0.1
AspPhe: 2.454 ± 0.065
AspGly: 4.277 ± 0.096
AspHis: 0.999 ± 0.039
AspIle: 4.126 ± 0.091
AspLys: 3.973 ± 0.095
AspLeu: 4.81 ± 0.102
AspMet: 1.894 ± 0.059
AspAsn: 2.113 ± 0.057
AspPro: 1.794 ± 0.059
AspGln: 1.76 ± 0.052
AspArg: 2.326 ± 0.059
AspSer: 3.251 ± 0.082
AspThr: 3.105 ± 0.071
AspVal: 3.82 ± 0.085
AspTrp: 0.563 ± 0.03
AspTyr: 2.847 ± 0.072
AspXaa: 0.002 ± 0.002

Glu
GluAla: 5.344 ± 0.107
GluCys: 0.789 ± 0.036
GluAsp: 4.461 ± 0.086
GluGlu: 7.297 ± 0.151
GluPhe: 2.516 ± 0.071
GluGly: 4.512 ± 0.09
GluHis: 1.256 ± 0.056
GluIle: 5.263 ± 0.101
GluLys: 7.509 ± 0.139
GluLeu: 6.122 ± 0.127
GluMet: 2.43 ± 0.067
GluAsn: 3.633 ± 0.076
GluPro: 1.989 ± 0.062
GluGln: 3.033 ± 0.074
GluArg: 3.015 ± 0.084
GluSer: 3.313 ± 0.074
GluThr: 3.95 ± 0.087
GluVal: 4.806 ± 0.096
GluTrp: 0.593 ± 0.031
GluTyr: 2.987 ± 0.074
GluXaa: 0.002 ± 0.002

Phe
PheAla: 2.834 ± 0.066
PheCys: 0.777 ± 0.037
PheAsp: 2.402 ± 0.066
PheGlu: 2.638 ± 0.076
PhePhe: 1.741 ± 0.066
PheGly: 2.863 ± 0.068
PheHis: 0.872 ± 0.038
PheIle: 2.365 ± 0.078
PheLys: 1.97 ± 0.059
PheLeu: 3.424 ± 0.1
PheMet: 1.282 ± 0.049
PheAsn: 1.406 ± 0.051
PhePro: 1.267 ± 0.046
PheGln: 1.246 ± 0.046
PheArg: 1.622 ± 0.057
PheSer: 2.79 ± 0.083
PheThr: 2.207 ± 0.057
PheVal: 3.004 ± 0.068
PheTrp: 0.396 ± 0.026
PheTyr: 1.679 ± 0.055
PheXaa: 0.0 ± 0.0

Gly
GlyAla: 4.905 ± 0.099
GlyCys: 1.206 ± 0.048
GlyAsp: 3.533 ± 0.076
GlyGlu: 4.895 ± 0.099
GlyPhe: 2.922 ± 0.066
GlyGly: 4.838 ± 0.136
GlyHis: 1.276 ± 0.05
GlyIle: 5.897 ± 0.116
GlyLys: 6.368 ± 0.117
GlyLeu: 5.609 ± 0.12
GlyMet: 2.416 ± 0.062
GlyAsn: 2.956 ± 0.077
GlyPro: 1.366 ± 0.048
GlyGln: 2.375 ± 0.064
GlyArg: 3.164 ± 0.077
GlySer: 4.41 ± 0.091
GlyThr: 4.474 ± 0.083
GlyVal: 5.697 ± 0.113
GlyTrp: 0.625 ± 0.035
GlyTyr: 3.158 ± 0.083
GlyXaa: 0.003 ± 0.002

His
HisAla: 0.967 ± 0.046
HisCys: 0.365 ± 0.028
HisAsp: 0.861 ± 0.038
HisGlu: 1.053 ± 0.046
HisPhe: 0.878 ± 0.036
HisGly: 1.425 ± 0.052
HisHis: 0.425 ± 0.03
HisIle: 1.378 ± 0.053
HisLys: 1.021 ± 0.04
HisLeu: 1.485 ± 0.052
HisMet: 0.501 ± 0.029
HisAsn: 0.693 ± 0.034
HisPro: 0.869 ± 0.039
HisGln: 0.579 ± 0.035
HisArg: 0.84 ± 0.042
HisSer: 1.018 ± 0.043
HisThr: 1.002 ± 0.046
HisVal: 1.078 ± 0.04
HisTrp: 0.136 ± 0.014
HisTyr: 0.751 ± 0.033
HisXaa: 0.0 ± 0.0

Ile
IleAla: 5.014 ± 0.113
IleCys: 1.262 ± 0.045
IleAsp: 3.638 ± 0.089
IleGlu: 4.377 ± 0.093
IlePhe: 2.576 ± 0.062
IleGly: 4.665 ± 0.097
IleHis: 1.517 ± 0.059
IleIle: 4.058 ± 0.107
IleLys: 4.226 ± 0.095
IleLeu: 6.937 ± 0.132
IleMet: 1.899 ± 0.06
IleAsn: 2.559 ± 0.071
IlePro: 2.92 ± 0.075
IleGln: 2.752 ± 0.065
IleArg: 3.632 ± 0.082
IleSer: 4.369 ± 0.089
IleThr: 4.155 ± 0.089
IleVal: 4.952 ± 0.107
IleTrp: 0.556 ± 0.033
IleTyr: 2.622 ± 0.063
IleXaa: 0.002 ± 0.001

Lys
LysAla: 5.366 ± 0.107
LysCys: 0.724 ± 0.039
LysAsp: 4.672 ± 0.091
LysGlu: 7.612 ± 0.146
LysPhe: 1.905 ± 0.061
LysGly: 4.971 ± 0.106
LysHis: 0.904 ± 0.047
LysIle: 5.285 ± 0.097
LysLys: 9.056 ± 0.187
LysLeu: 5.46 ± 0.097
LysMet: 2.581 ± 0.071
LysAsn: 3.923 ± 0.085
LysPro: 2.04 ± 0.059
LysGln: 2.582 ± 0.073
LysArg: 3.281 ± 0.085
LysSer: 3.54 ± 0.085
LysThr: 4.355 ± 0.098
LysVal: 5.383 ± 0.095
LysTrp: 0.647 ± 0.036
LysTyr: 2.998 ± 0.08
LysXaa: 0.0 ± 0.0

Leu
LeuAla: 6.763 ± 0.117
LeuCys: 1.536 ± 0.062
LeuAsp: 4.922 ± 0.101
LeuGlu: 6.0 ± 0.129
LeuPhe: 3.644 ± 0.108
LeuGly: 5.94 ± 0.105
LeuHis: 1.536 ± 0.052
LeuIle: 5.272 ± 0.126
LeuLys: 6.078 ± 0.104
LeuLeu: 7.628 ± 0.165
LeuMet: 2.557 ± 0.071
LeuAsn: 3.342 ± 0.076
LeuPro: 3.239 ± 0.064
LeuGln: 3.012 ± 0.085
LeuArg: 3.637 ± 0.084
LeuSer: 5.975 ± 0.115
LeuThr: 5.03 ± 0.088
LeuVal: 5.788 ± 0.1
LeuTrp: 0.758 ± 0.034
LeuTyr: 3.359 ± 0.083
LeuXaa: 0.0 ± 0.0

Met
MetAla: 2.619 ± 0.068
MetCys: 0.369 ± 0.026
MetAsp: 2.067 ± 0.068
MetGlu: 2.739 ± 0.065
MetPhe: 0.999 ± 0.046
MetGly: 2.173 ± 0.064
MetHis: 0.452 ± 0.025
MetIle: 2.21 ± 0.061
MetLys: 2.736 ± 0.073
MetLeu: 2.554 ± 0.056
MetMet: 0.918 ± 0.042
MetAsn: 1.604 ± 0.05
MetPro: 1.099 ± 0.036
MetGln: 1.091 ± 0.038
MetArg: 1.259 ± 0.045
MetSer: 1.787 ± 0.052
MetThr: 1.826 ± 0.058
MetVal: 2.089 ± 0.07
MetTrp: 0.163 ± 0.016
MetTyr: 0.973 ± 0.04
MetXaa: 0.0 ± 0.0

Asn
AsnAla: 2.79 ± 0.075
AsnCys: 0.604 ± 0.03
AsnAsp: 1.95 ± 0.057
AsnGlu: 2.628 ± 0.063
AsnPhe: 1.511 ± 0.047
AsnGly: 3.335 ± 0.088
AsnHis: 0.767 ± 0.038
AsnIle: 3.066 ± 0.072
AsnLys: 2.949 ± 0.07
AsnLeu: 3.472 ± 0.076
AsnMet: 1.297 ± 0.048
AsnAsn: 1.787 ± 0.067
AsnPro: 1.747 ± 0.056
AsnGln: 1.482 ± 0.047
AsnArg: 1.872 ± 0.059
AsnSer: 2.2 ± 0.057
AsnThr: 2.281 ± 0.067
AsnVal: 2.744 ± 0.084
AsnTrp: 0.38 ± 0.025
AsnTyr: 1.668 ± 0.053
AsnXaa: 0.0 ± 0.0

Pro
ProAla: 2.481 ± 0.069
ProCys: 0.431 ± 0.028
ProAsp: 2.245 ± 0.059
ProGlu: 3.245 ± 0.087
ProPhe: 1.431 ± 0.045
ProGly: 2.456 ± 0.07
ProHis: 0.534 ± 0.03
ProIle: 1.771 ± 0.059
ProLys: 1.858 ± 0.062
ProLeu: 2.573 ± 0.073
ProMet: 0.85 ± 0.037
ProAsn: 1.018 ± 0.043
ProPro: 0.72 ± 0.039
ProGln: 1.034 ± 0.039
ProArg: 0.969 ± 0.043
ProSer: 1.89 ± 0.062
ProThr: 1.607 ± 0.065
ProVal: 3.126 ± 0.072
ProTrp: 0.339 ± 0.024
ProTyr: 1.43 ± 0.053
ProXaa: 0.0 ± 0.0

Gln
GlnAla: 2.649 ± 0.067
GlnCys: 0.482 ± 0.03
GlnAsp: 1.828 ± 0.054
GlnGlu: 2.731 ± 0.075
GlnPhe: 1.206 ± 0.042
GlnGly: 2.584 ± 0.063
GlnHis: 0.449 ± 0.029
GlnIle: 2.75 ± 0.067
GlnLys: 2.895 ± 0.08
GlnLeu: 2.605 ± 0.075
GlnMet: 1.37 ± 0.044
GlnAsn: 1.474 ± 0.05
GlnPro: 0.926 ± 0.041
GlnGln: 1.333 ± 0.047
GlnArg: 1.458 ± 0.062
GlnSer: 1.801 ± 0.051
GlnThr: 1.794 ± 0.055
GlnVal: 2.571 ± 0.063
GlnTrp: 0.425 ± 0.029
GlnTyr: 1.458 ± 0.049
GlnXaa: 0.0 ± 0.0

Arg
ArgAla: 2.333 ± 0.063
ArgCys: 0.615 ± 0.034
ArgAsp: 2.115 ± 0.062
ArgGlu: 3.63 ± 0.09
ArgPhe: 1.79 ± 0.055
ArgGly: 2.579 ± 0.076
ArgHis: 0.808 ± 0.04
ArgIle: 3.397 ± 0.069
ArgLys: 3.43 ± 0.073
ArgLeu: 3.797 ± 0.09
ArgMet: 1.531 ± 0.056
ArgAsn: 1.75 ± 0.056
ArgPro: 1.252 ± 0.048
ArgGln: 1.978 ± 0.059
ArgArg: 2.264 ± 0.073
ArgSer: 2.316 ± 0.056
ArgThr: 2.15 ± 0.062
ArgVal: 2.807 ± 0.077
ArgTrp: 0.407 ± 0.028
ArgTyr: 1.993 ± 0.061
ArgXaa: 0.0 ± 0.0

Ser
SerAla: 3.982 ± 0.084
SerCys: 0.75 ± 0.035
SerAsp: 3.329 ± 0.073
SerGlu: 3.513 ± 0.071
SerPhe: 2.552 ± 0.064
SerGly: 5.019 ± 0.108
SerHis: 1.094 ± 0.048
SerIle: 3.89 ± 0.086
SerLys: 3.916 ± 0.089
SerLeu: 5.398 ± 0.108
SerMet: 1.945 ± 0.054
SerAsn: 2.172 ± 0.065
SerPro: 1.685 ± 0.052
SerGln: 2.232 ± 0.064
SerArg: 2.624 ± 0.059
SerSer: 3.901 ± 0.123
SerThr: 2.993 ± 0.081
SerVal: 4.714 ± 0.09
SerTrp: 0.491 ± 0.035
SerTyr: 2.635 ± 0.072
SerXaa: 0.0 ± 0.0

Thr
ThrAla: 4.675 ± 0.106
ThrCys: 0.642 ± 0.032
ThrAsp: 3.161 ± 0.076
ThrGlu: 3.659 ± 0.1
ThrPhe: 1.989 ± 0.057
ThrGly: 4.724 ± 0.088
ThrHis: 0.829 ± 0.035
ThrIle: 3.977 ± 0.089
ThrLys: 3.96 ± 0.094
ThrLeu: 4.935 ± 0.096
ThrMet: 1.539 ± 0.045
ThrAsn: 2.058 ± 0.065
ThrPro: 2.265 ± 0.068
ThrGln: 1.547 ± 0.054
ThrArg: 1.978 ± 0.052
ThrSer: 3.156 ± 0.077
ThrThr: 3.264 ± 0.113
ThrVal: 5.041 ± 0.112
ThrTrp: 0.51 ± 0.034
ThrTyr: 2.186 ± 0.069
ThrXaa: 0.0 ± 0.0

Val
ValAla: 5.393 ± 0.122
ValCys: 1.297 ± 0.05
ValAsp: 4.174 ± 0.078
ValGlu: 4.909 ± 0.097
ValPhe: 3.044 ± 0.084
ValGly: 4.325 ± 0.071
ValHis: 1.091 ± 0.046
ValIle: 5.179 ± 0.104
ValLys: 5.46 ± 0.126
ValLeu: 6.973 ± 0.135
ValMet: 2.207 ± 0.051
ValAsn: 2.912 ± 0.08
ValPro: 2.723 ± 0.065
ValGln: 2.127 ± 0.061
ValArg: 2.915 ± 0.079
ValSer: 4.962 ± 0.11
ValThr: 4.941 ± 0.121
ValVal: 5.74 ± 0.102
ValTrp: 0.717 ± 0.04
ValTyr: 2.963 ± 0.073
ValXaa: 0.003 ± 0.002

Trp
TrpAla: 0.458 ± 0.03
TrpCys: 0.132 ± 0.016
TrpAsp: 0.564 ± 0.027
TrpGlu: 0.617 ± 0.032
TrpPhe: 0.384 ± 0.025
TrpGly: 0.625 ± 0.042
TrpHis: 0.203 ± 0.02
TrpIle: 0.717 ± 0.034
TrpLys: 0.729 ± 0.035
TrpLeu: 0.81 ± 0.041
TrpMet: 0.325 ± 0.026
TrpAsn: 0.428 ± 0.026
TrpPro: 0.216 ± 0.018
TrpGln: 0.469 ± 0.028
TrpArg: 0.322 ± 0.024
TrpSer: 0.46 ± 0.031
TrpThr: 0.415 ± 0.031
TrpVal: 0.503 ± 0.033
TrpTrp: 0.105 ± 0.014
TrpTyr: 0.347 ± 0.026
TrpXaa: 0.0 ± 0.0

Tyr
TyrAla: 2.595 ± 0.057
TyrCys: 0.609 ± 0.032
TyrAsp: 2.63 ± 0.057
TyrGlu: 3.112 ± 0.074
TyrPhe: 1.744 ± 0.063
TyrGly: 3.15 ± 0.082
TyrHis: 0.831 ± 0.039
TyrIle: 2.682 ± 0.061
TyrLys: 2.863 ± 0.074
TyrLeu: 3.548 ± 0.077
TyrMet: 1.214 ± 0.044
TyrAsn: 1.807 ± 0.062
TyrPro: 1.303 ± 0.048
TyrGln: 1.427 ± 0.047
TyrArg: 1.947 ± 0.06
TyrSer: 2.482 ± 0.073
TyrThr: 2.337 ± 0.068
TyrVal: 2.801 ± 0.069
TyrTrp: 0.328 ± 0.025
TyrTyr: 1.983 ± 0.061
TyrXaa: 0.002 ± 0.002

Xaa
XaaAla: 0.002 ± 0.001
XaaCys: 0.002 ± 0.002
XaaAsp: 0.0 ± 0.0
XaaGlu: 0.0 ± 0.0
XaaPhe: 0.0 ± 0.0
XaaGly: 0.0 ± 0.0
XaaHis: 0.0 ± 0.0
XaaIle: 0.006 ± 0.003
XaaLys: 0.0 ± 0.0
XaaLeu: 0.0 ± 0.0
XaaMet: 0.002 ± 0.001
XaaAsn: 0.0 ± 0.0
XaaPro: 0.0 ± 0.0
XaaGln: 0.0 ± 0.0
XaaArg: 0.0 ± 0.0
XaaSer: 0.002 ± 0.002
XaaThr: 0.0 ± 0.0
XaaVal: 0.002 ± 0.002
XaaTrp: 0.002 ± 0.002
XaaTyr: 0.0 ± 0.0
XaaXaa: 0.022 ± 0.007

Statistics based on 1994 proteins (630825 amino acids)

Note: The error has been estimated by bootstrapping (100 resamples) at the protein level.
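A protein-level bootstrap of this kind could be sketched as follows: whole proteins are resampled with replacement, the dipeptide frequency is recomputed for each resample, and the spread of those estimates is reported. The function name, the toy sequences, the fixed seed, and the use of the standard deviation as the error measure are all assumptions for illustration; the page only states that 100 bootstrap rounds were run at the protein level.

```python
import random
import statistics
from collections import Counter


def bootstrap_dipeptide_error(sequences, pair, n_resamples=100, seed=0):
    """Return (mean, standard deviation) of a dipeptide frequency in per mille.

    Proteins, not individual residue pairs, are the resampling unit.
    """
    rng = random.Random(seed)
    # Count pairs once per protein so each resample only sums counters.
    per_protein = [Counter(a + b for a, b in zip(s, s[1:])) for s in sequences]
    estimates = []
    for _ in range(n_resamples):
        sample = rng.choices(per_protein, k=len(per_protein))
        total = sum(sum(c.values()) for c in sample)
        hits = sum(c[pair] for c in sample)
        estimates.append(1000.0 * hits / total if total else 0.0)
    return statistics.mean(estimates), statistics.stdev(estimates)


# Hypothetical toy proteome in one-letter code (not taken from the database).
toy_proteome = ["MKKLA", "AKK", "GGKKA", "MLLV"]
mean_kk, err_kk = bootstrap_dipeptide_error(toy_proteome, "KK")
print(f"KK: {mean_kk:.1f} ± {err_kk:.1f} per mille")
```

Resampling whole proteins keeps the pairs within one protein together, so the estimate reflects protein-to-protein variation rather than treating every pair as independent.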

The above dipeptide statistics (among other statistics for this proteome) can be downloaded from this CSV file
See this proteome in: uniprot_link
Proteome-pI is available under Creative Commons Attribution-NoDerivs license, for more details see here

Reference: Kozlowski LP. Proteome-pI 2.0: Proteome Isoelectric Point Database Update. Nucleic Acids Res. 2021, doi: 10.1093/nar/gkab944
Contact: Lukasz P. Kozlowski