Amino acid dipeptide frequency for Clostridium sp. CAG:1219

Apart from single amino acid frequencies, one can also calculate the so-called amino acid dipeptide frequency, i.e. how often each ordered pair of adjacent residues occurs.

There are 400 possible dipeptides (441 if Xaa is counted as an additional 21st amino acid representing all non-standard or unknown amino acids). Thus, if dipeptides occurred at random with equal probability, each of the 400 would be present at a frequency of 0.25% (2.5‰). As this is not the case in nature, for better visibility dipeptides that are more frequent than expected are marked in red in the table, and those that are underrepresented are marked in blue.
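The calculation described above can be sketched in Python as follows. This is a minimal illustration, not the code used by Proteome-pI: the function names `dipeptide_permille` and `classify`, the toy sequences, and the choice to count overlapping pairs within each protein only are all assumptions, and the comparison uses the uniform 1/400 baseline from the text.

```python
from collections import Counter

STANDARD = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard amino acids (one-letter codes)
EXPECTED_PERMILLE = 1000.0 / 400   # uniform baseline: 2.5 per mille = 0.25 %


def dipeptide_permille(proteins):
    """Return {dipeptide: frequency in per mille} over all adjacent residue pairs.

    Assumption: pairs overlap (ABC -> AB, BC) and never span two proteins.
    """
    counts = Counter()
    for seq in proteins:
        # Map anything outside the 20 standard residues to X (Xaa).
        seq = "".join(c if c in STANDARD else "X" for c in seq.upper())
        counts.update(a + b for a, b in zip(seq, seq[1:]))
    total = sum(counts.values())
    if total == 0:
        return {}
    return {pair: 1000.0 * n / total for pair, n in counts.items()}


def classify(freq_permille, expected=EXPECTED_PERMILLE):
    """Label a frequency relative to the (assumed) uniform expectation."""
    if freq_permille > expected:
        return "over"    # would be shown in red
    if freq_permille < expected:
        return "under"   # would be shown in blue
    return "expected"


# Hypothetical toy input, only to show the call pattern.
toy = ["MKKLIE", "MKXLE"]
freqs = dipeptide_permille(toy)
```

With the table values, `classify(9.834)` (IleLys) gives "over" and `classify(0.249)` (AlaTrp) gives "under".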

All values are given in per mille (‰); multiply them by 10^-3 to obtain fractions.
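Because the random expectation above is quoted in percent while the table is in per mille, a small conversion helper may be useful. The function names below are hypothetical and serve only as a sketch of the unit arithmetic.

```python
def permille_to_fraction(value_permille):
    """Convert a per mille value to a plain fraction (multiply by 10^-3)."""
    return value_permille * 1e-3


def permille_to_percent(value_permille):
    """Convert a per mille value to percent (divide by 10)."""
    return value_permille / 10.0


# Example with a table value: IleLys is listed as 9.834 per mille.
ile_lys_fraction = permille_to_fraction(9.834)

# The uniform expectation of 0.25 % corresponds to 2.5 per mille.
uniform_percent = permille_to_percent(2.5)
```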

For more information, see the sequence space article on Wikipedia.

Ala Cys Asp Glu Phe Gly His Ile Lys Leu Met Asn Pro Gln Arg Ser Thr Val Trp Tyr Xaa
Ala
AlaAla: 2.669 ± 0.102
AlaCys: 0.726 ± 0.045
AlaAsp: 2.413 ± 0.095
AlaGlu: 3.146 ± 0.131
AlaPhe: 2.001 ± 0.084
AlaGly: 3.097 ± 0.117
AlaHis: 0.66 ± 0.049
AlaIle: 5.628 ± 0.151
AlaLys: 5.085 ± 0.149
AlaLeu: 4.428 ± 0.13
AlaMet: 1.362 ± 0.071
AlaAsn: 2.886 ± 0.1
AlaPro: 1.144 ± 0.062
AlaGln: 1.255 ± 0.075
AlaArg: 1.925 ± 0.082
AlaSer: 3.605 ± 0.126
AlaThr: 3.135 ± 0.112
AlaVal: 3.149 ± 0.108
AlaTrp: 0.249 ± 0.032
AlaTyr: 2.105 ± 0.084
AlaXaa: 0.007 ± 0.005
Cys
CysAla: 0.588 ± 0.05
CysCys: 0.218 ± 0.031
CysAsp: 0.778 ± 0.055
CysGlu: 0.84 ± 0.059
CysPhe: 0.529 ± 0.047
CysGly: 1.141 ± 0.069
CysHis: 0.145 ± 0.022
CysIle: 1.31 ± 0.064
CysLys: 1.082 ± 0.055
CysLeu: 0.861 ± 0.051
CysMet: 0.318 ± 0.033
CysAsn: 0.729 ± 0.051
CysPro: 0.442 ± 0.04
CysGln: 0.283 ± 0.029
CysArg: 0.429 ± 0.041
CysSer: 0.919 ± 0.053
CysThr: 0.57 ± 0.049
CysVal: 0.75 ± 0.052
CysTrp: 0.069 ± 0.017
CysTyr: 0.636 ± 0.054
CysXaa: 0.0 ± 0.0
Asp
AspAla: 2.72 ± 0.107
AspCys: 0.512 ± 0.038
AspAsp: 3.177 ± 0.11
AspGlu: 5.33 ± 0.137
AspPhe: 2.703 ± 0.096
AspGly: 3.298 ± 0.099
AspHis: 0.456 ± 0.041
AspIle: 6.754 ± 0.162
AspLys: 5.552 ± 0.13
AspLeu: 4.11 ± 0.129
AspMet: 1.576 ± 0.066
AspAsn: 3.405 ± 0.116
AspPro: 1.065 ± 0.074
AspGln: 0.778 ± 0.052
AspArg: 1.721 ± 0.095
AspSer: 3.757 ± 0.111
AspThr: 2.879 ± 0.104
AspVal: 4.034 ± 0.128
AspTrp: 0.211 ± 0.026
AspTyr: 2.745 ± 0.103
AspXaa: 0.0 ± 0.0
Glu
GluAla: 3.906 ± 0.132
GluCys: 0.767 ± 0.055
GluAsp: 4.297 ± 0.124
GluGlu: 7.664 ± 0.256
GluPhe: 3.083 ± 0.095
GluGly: 3.194 ± 0.125
GluHis: 0.771 ± 0.057
GluIle: 7.695 ± 0.174
GluLys: 9.492 ± 0.185
GluLeu: 6.198 ± 0.146
GluMet: 1.905 ± 0.091
GluAsn: 6.661 ± 0.156
GluPro: 1.193 ± 0.066
GluGln: 1.946 ± 0.098
GluArg: 2.489 ± 0.109
GluSer: 3.764 ± 0.125
GluThr: 3.011 ± 0.106
GluVal: 5.071 ± 0.14
GluTrp: 0.335 ± 0.038
GluTyr: 3.996 ± 0.11
GluXaa: 0.003 ± 0.003
Phe
PheAla: 2.185 ± 0.087
PheCys: 0.736 ± 0.06
PheAsp: 2.776 ± 0.102
PheGlu: 3.191 ± 0.113
PhePhe: 1.891 ± 0.092
PheGly: 2.665 ± 0.108
PheHis: 0.342 ± 0.032
PheIle: 3.951 ± 0.133
PheLys: 3.543 ± 0.105
PheLeu: 3.612 ± 0.127
PheMet: 0.982 ± 0.058
PheAsn: 2.655 ± 0.093
PhePro: 0.985 ± 0.06
PheGln: 0.657 ± 0.042
PheArg: 1.272 ± 0.067
PheSer: 3.346 ± 0.098
PheThr: 1.95 ± 0.085
PheVal: 2.824 ± 0.117
PheTrp: 0.221 ± 0.027
PheTyr: 1.735 ± 0.078
PheXaa: 0.0 ± 0.0
Gly
GlyAla: 3.308 ± 0.135
GlyCys: 0.66 ± 0.058
GlyAsp: 2.793 ± 0.089
GlyGlu: 3.73 ± 0.113
GlyPhe: 2.406 ± 0.092
GlyGly: 3.35 ± 0.15
GlyHis: 0.771 ± 0.053
GlyIle: 6.623 ± 0.161
GlyLys: 5.991 ± 0.168
GlyLeu: 3.986 ± 0.118
GlyMet: 1.435 ± 0.071
GlyAsn: 3.522 ± 0.115
GlyPro: 1.002 ± 0.11
GlyGln: 1.244 ± 0.062
GlyArg: 2.067 ± 0.096
GlySer: 3.184 ± 0.112
GlyThr: 3.536 ± 0.124
GlyVal: 3.948 ± 0.105
GlyTrp: 0.301 ± 0.031
GlyTyr: 2.783 ± 0.105
GlyXaa: 0.007 ± 0.005
His
HisAla: 0.643 ± 0.047
HisCys: 0.138 ± 0.024
HisAsp: 0.529 ± 0.048
HisGlu: 0.709 ± 0.046
HisPhe: 0.456 ± 0.04
HisGly: 0.764 ± 0.053
HisHis: 0.249 ± 0.031
HisIle: 1.175 ± 0.07
HisLys: 0.843 ± 0.051
HisLeu: 0.774 ± 0.056
HisMet: 0.28 ± 0.032
HisAsn: 0.608 ± 0.048
HisPro: 0.411 ± 0.044
HisGln: 0.235 ± 0.027
HisArg: 0.349 ± 0.036
HisSer: 0.705 ± 0.05
HisThr: 0.605 ± 0.046
HisVal: 0.66 ± 0.05
HisTrp: 0.048 ± 0.011
HisTyr: 0.415 ± 0.037
HisXaa: 0.0 ± 0.0
Ile
IleAla: 5.427 ± 0.156
IleCys: 1.455 ± 0.07
IleAsp: 6.326 ± 0.142
IleGlu: 7.467 ± 0.183
IlePhe: 4.539 ± 0.178
IleGly: 5.569 ± 0.172
IleHis: 0.999 ± 0.062
IleIle: 9.634 ± 0.203
IleLys: 9.834 ± 0.196
IleLeu: 9.312 ± 0.206
IleMet: 2.219 ± 0.077
IleAsn: 6.589 ± 0.172
IlePro: 2.824 ± 0.102
IleGln: 1.884 ± 0.087
IleArg: 3.09 ± 0.115
IleSer: 7.954 ± 0.173
IleThr: 5.372 ± 0.145
IleVal: 7.024 ± 0.178
IleTrp: 0.391 ± 0.037
IleTyr: 4.442 ± 0.129
IleXaa: 0.007 ± 0.005
Lys
LysAla: 4.273 ± 0.149
LysCys: 1.155 ± 0.071
LysAsp: 5.935 ± 0.157
LysGlu: 9.223 ± 0.201
LysPhe: 3.626 ± 0.121
LysGly: 4.355 ± 0.132
LysHis: 0.888 ± 0.056
LysIle: 11.051 ± 0.224
LysLys: 11.214 ± 0.237
LysLeu: 8.154 ± 0.187
LysMet: 3.045 ± 0.094
LysAsn: 8.607 ± 0.194
LysPro: 1.791 ± 0.087
LysGln: 2.665 ± 0.113
LysArg: 3.388 ± 0.103
LysSer: 5.555 ± 0.167
LysThr: 4.594 ± 0.129
LysVal: 6.633 ± 0.165
LysTrp: 0.463 ± 0.045
LysTyr: 5.59 ± 0.156
LysXaa: 0.003 ± 0.003
Leu
LeuAla: 4.411 ± 0.12
LeuCys: 1.092 ± 0.058
LeuAsp: 5.002 ± 0.141
LeuGlu: 6.309 ± 0.143
LeuPhe: 3.36 ± 0.135
LeuGly: 4.839 ± 0.13
LeuHis: 0.812 ± 0.055
LeuIle: 7.726 ± 0.178
LeuLys: 8.555 ± 0.181
LeuLeu: 6.526 ± 0.173
LeuMet: 1.901 ± 0.078
LeuAsn: 5.935 ± 0.155
LeuPro: 2.354 ± 0.093
LeuGln: 1.998 ± 0.079
LeuArg: 2.776 ± 0.12
LeuSer: 6.098 ± 0.134
LeuThr: 4.127 ± 0.116
LeuVal: 5.123 ± 0.13
LeuTrp: 0.415 ± 0.04
LeuTyr: 3.336 ± 0.121
LeuXaa: 0.003 ± 0.004
Met
MetAla: 1.545 ± 0.081
MetCys: 0.339 ± 0.035
MetAsp: 1.282 ± 0.072
MetGlu: 2.005 ± 0.082
MetPhe: 1.027 ± 0.061
MetGly: 1.369 ± 0.07
MetHis: 0.318 ± 0.031
MetIle: 2.088 ± 0.091
MetLys: 2.413 ± 0.092
MetLeu: 2.319 ± 0.089
MetMet: 0.646 ± 0.054
MetAsn: 1.493 ± 0.068
MetPro: 0.767 ± 0.056
MetGln: 0.895 ± 0.056
MetArg: 0.729 ± 0.051
MetSer: 1.638 ± 0.08
MetThr: 1.034 ± 0.06
MetVal: 1.431 ± 0.073
MetTrp: 0.149 ± 0.024
MetTyr: 1.179 ± 0.066
MetXaa: 0.003 ± 0.003
Asn
AsnAla: 2.835 ± 0.087
AsnCys: 0.837 ± 0.065
AsnAsp: 3.336 ± 0.092
AsnGlu: 4.895 ± 0.131
AsnPhe: 3.121 ± 0.097
AsnGly: 4.058 ± 0.117
AsnHis: 0.553 ± 0.047
AsnIle: 8.268 ± 0.189
AsnLys: 7.231 ± 0.162
AsnLeu: 5.818 ± 0.17
AsnMet: 1.887 ± 0.086
AsnAsn: 4.933 ± 0.155
AsnPro: 1.493 ± 0.076
AsnGln: 1.386 ± 0.07
AsnArg: 1.614 ± 0.088
AsnSer: 4.76 ± 0.145
AsnThr: 3.27 ± 0.121
AsnVal: 4.998 ± 0.163
AsnTrp: 0.297 ± 0.036
AsnTyr: 3.042 ± 0.111
AsnXaa: 0.003 ± 0.003
Pro
ProAla: 1.03 ± 0.068
ProCys: 0.346 ± 0.033
ProAsp: 1.151 ± 0.056
ProGlu: 1.763 ± 0.081
ProPhe: 1.009 ± 0.057
ProGly: 1.334 ± 0.07
ProHis: 0.373 ± 0.032
ProIle: 2.219 ± 0.103
ProLys: 2.053 ± 0.084
ProLeu: 1.773 ± 0.077
ProMet: 0.525 ± 0.047
ProAsn: 1.428 ± 0.072
ProPro: 0.467 ± 0.045
ProGln: 0.553 ± 0.041
ProArg: 0.722 ± 0.054
ProSer: 1.607 ± 0.088
ProThr: 1.421 ± 0.092
ProVal: 1.78 ± 0.089
ProTrp: 0.135 ± 0.022
ProTyr: 1.061 ± 0.068
ProXaa: 0.0 ± 0.0
Gln
GlnAla: 1.296 ± 0.07
GlnCys: 0.162 ± 0.022
GlnAsp: 1.234 ± 0.066
GlnGlu: 1.808 ± 0.089
GlnPhe: 0.664 ± 0.053
GlnGly: 1.213 ± 0.09
GlnHis: 0.207 ± 0.026
GlnIle: 2.454 ± 0.103
GlnLys: 2.676 ± 0.101
GlnLeu: 1.535 ± 0.085
GlnMet: 0.612 ± 0.044
GlnAsn: 1.874 ± 0.078
GlnPro: 0.377 ± 0.043
GlnGln: 0.508 ± 0.06
GlnArg: 0.871 ± 0.059
GlnSer: 1.314 ± 0.072
GlnThr: 1.051 ± 0.066
GlnVal: 1.58 ± 0.075
GlnTrp: 0.059 ± 0.014
GlnTyr: 0.861 ± 0.062
GlnXaa: 0.003 ± 0.003
Arg
ArgAla: 1.887 ± 0.085
ArgCys: 0.498 ± 0.047
ArgAsp: 1.642 ± 0.071
ArgGlu: 2.935 ± 0.105
ArgPhe: 1.334 ± 0.062
ArgGly: 1.884 ± 0.085
ArgHis: 0.432 ± 0.042
ArgIle: 2.931 ± 0.108
ArgLys: 3.35 ± 0.112
ArgLeu: 2.755 ± 0.107
ArgMet: 0.764 ± 0.044
ArgAsn: 1.908 ± 0.083
ArgPro: 0.733 ± 0.046
ArgGln: 0.93 ± 0.061
ArgArg: 1.262 ± 0.08
ArgSer: 1.386 ± 0.062
ArgThr: 1.473 ± 0.083
ArgVal: 2.313 ± 0.094
ArgTrp: 0.2 ± 0.025
ArgTyr: 1.441 ± 0.082
ArgXaa: 0.0 ± 0.0
Ser
SerAla: 3.09 ± 0.102
SerCys: 0.716 ± 0.054
SerAsp: 3.83 ± 0.115
SerGlu: 4.542 ± 0.135
SerPhe: 2.859 ± 0.124
SerGly: 3.972 ± 0.127
SerHis: 0.809 ± 0.056
SerIle: 6.402 ± 0.149
SerLys: 7.46 ± 0.188
SerLeu: 5.534 ± 0.119
SerMet: 1.559 ± 0.074
SerAsn: 4.611 ± 0.142
SerPro: 1.293 ± 0.063
SerGln: 1.628 ± 0.074
SerArg: 2.116 ± 0.084
SerSer: 5.071 ± 0.167
SerThr: 3.713 ± 0.132
SerVal: 4.096 ± 0.134
SerTrp: 0.415 ± 0.036
SerTyr: 3.118 ± 0.118
SerXaa: 0.0 ± 0.0
Thr
ThrAla: 2.662 ± 0.095
ThrCys: 0.588 ± 0.043
ThrAsp: 2.845 ± 0.104
ThrGlu: 3.142 ± 0.1
ThrPhe: 2.057 ± 0.088
ThrGly: 3.626 ± 0.169
ThrHis: 0.671 ± 0.046
ThrIle: 4.753 ± 0.15
ThrLys: 4.843 ± 0.129
ThrLeu: 4.815 ± 0.143
ThrMet: 1.075 ± 0.064
ThrAsn: 2.956 ± 0.128
ThrPro: 1.594 ± 0.086
ThrGln: 1.117 ± 0.071
ThrArg: 1.556 ± 0.071
ThrSer: 3.816 ± 0.124
ThrThr: 3.111 ± 0.174
ThrVal: 3.336 ± 0.138
ThrTrp: 0.249 ± 0.025
ThrTyr: 2.378 ± 0.096
ThrXaa: 0.0 ± 0.0
Val
ValAla: 3.91 ± 0.128
ValCys: 1.04 ± 0.069
ValAsp: 4.169 ± 0.135
ValGlu: 5.088 ± 0.154
ValPhe: 2.606 ± 0.101
ValGly: 3.868 ± 0.135
ValHis: 0.629 ± 0.049
ValIle: 6.353 ± 0.162
ValLys: 6.333 ± 0.16
ValLeu: 5.883 ± 0.158
ValMet: 1.289 ± 0.07
ValAsn: 3.896 ± 0.109
ValPro: 1.763 ± 0.079
ValGln: 1.338 ± 0.068
ValArg: 2.039 ± 0.083
ValSer: 4.777 ± 0.134
ValThr: 3.64 ± 0.124
ValVal: 4.853 ± 0.138
ValTrp: 0.294 ± 0.032
ValTyr: 3.035 ± 0.123
ValXaa: 0.003 ± 0.003
Trp
TrpAla: 0.232 ± 0.026
TrpCys: 0.086 ± 0.019
TrpAsp: 0.228 ± 0.03
TrpGlu: 0.29 ± 0.033
TrpPhe: 0.207 ± 0.03
TrpGly: 0.315 ± 0.035
TrpHis: 0.083 ± 0.017
TrpIle: 0.429 ± 0.039
TrpLys: 0.449 ± 0.047
TrpLeu: 0.377 ± 0.04
TrpMet: 0.152 ± 0.026
TrpAsn: 0.359 ± 0.041
TrpPro: 0.107 ± 0.019
TrpGln: 0.169 ± 0.023
TrpArg: 0.19 ± 0.025
TrpSer: 0.29 ± 0.036
TrpThr: 0.218 ± 0.026
TrpVal: 0.239 ± 0.027
TrpTrp: 0.055 ± 0.013
TrpTyr: 0.263 ± 0.029
TrpXaa: 0.0 ± 0.0
Tyr
TyrAla: 2.06 ± 0.088
TyrCys: 0.567 ± 0.045
TyrAsp: 3.042 ± 0.093
TyrGlu: 3.367 ± 0.088
TyrPhe: 2.036 ± 0.097
TyrGly: 2.734 ± 0.095
TyrHis: 0.429 ± 0.038
TyrIle: 4.905 ± 0.142
TyrLys: 4.207 ± 0.133
TyrLeu: 3.923 ± 0.118
TyrMet: 1.13 ± 0.059
TyrAsn: 3.581 ± 0.132
TyrPro: 0.944 ± 0.055
TyrGln: 0.885 ± 0.054
TyrArg: 1.448 ± 0.075
TyrSer: 3.284 ± 0.113
TyrThr: 2.475 ± 0.097
TyrVal: 2.962 ± 0.099
TyrTrp: 0.183 ± 0.026
TyrTyr: 2.126 ± 0.094
TyrXaa: 0.0 ± 0.0
Xaa
XaaAla: 0.0 ± 0.0
XaaCys: 0.0 ± 0.0
XaaAsp: 0.0 ± 0.0
XaaGlu: 0.0 ± 0.0
XaaPhe: 0.003 ± 0.003
XaaGly: 0.0 ± 0.0
XaaHis: 0.0 ± 0.0
XaaIle: 0.007 ± 0.005
XaaLys: 0.007 ± 0.005
XaaLeu: 0.003 ± 0.003
XaaMet: 0.003 ± 0.003
XaaAsn: 0.0 ± 0.0
XaaPro: 0.003 ± 0.003
XaaGln: 0.003 ± 0.003
XaaArg: 0.003 ± 0.004
XaaSer: 0.0 ± 0.0
XaaThr: 0.003 ± 0.003
XaaVal: 0.003 ± 0.003
XaaTrp: 0.0 ± 0.0
XaaTyr: 0.003 ± 0.003
XaaXaa: 0.059 ± 0.02
Statistics based on 1041 proteins (289292 amino acids)

Note: The error has been estimated by bootstrapping (x100) at the protein level.
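A protein-level bootstrap of this kind can be sketched as follows. The exact procedure used for this table is not described here, so this is a generic illustration under stated assumptions: proteins are resampled with replacement, the frequency is recomputed for each of 100 replicates, and the sample standard deviation of the replicates is taken as the error. The names `pair_permille` and `bootstrap_error`, the fixed seed, and the toy sequences are hypothetical.

```python
import random
import statistics


def pair_permille(proteins, pair):
    """Frequency (per mille) of one dipeptide among all adjacent residue pairs."""
    hits = total = 0
    for seq in proteins:
        for a, b in zip(seq, seq[1:]):
            total += 1
            hits += (a + b == pair)
    return 1000.0 * hits / total if total else 0.0


def bootstrap_error(proteins, pair, n_boot=100, seed=0):
    """Standard deviation of the frequency over n_boot protein-level resamples."""
    rng = random.Random(seed)  # fixed seed only to make the sketch reproducible
    estimates = []
    for _ in range(n_boot):
        # Resample whole proteins (not residues) with replacement.
        sample = rng.choices(proteins, k=len(proteins))
        estimates.append(pair_permille(sample, pair))
    return statistics.stdev(estimates)


# Hypothetical toy proteome, only to show the call pattern.
toy = ["MKKLIE", "MKILE", "MAKKE", "MLLKI"]
kk_value = pair_permille(toy, "KK")
kk_error = bootstrap_error(toy, "KK")
```

Resampling whole proteins rather than individual dipeptides keeps the pairs from one protein together, so the error reflects variation between proteins.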

The dipeptide statistics above (along with other statistics for this proteome) can be downloaded from this CSV file
See this proteome in: uniprot_link
Proteome-pI is available under the Creative Commons Attribution-NoDerivs license; for more details see here

Reference: Kozlowski LP. Proteome-pI 2.0: Proteome Isoelectric Point Database Update. Nucleic Acids Res. 2021, doi: 10.1093/nar/gkab944
Contact: Lukasz P. Kozlowski