Amino acid dipeptide frequency for Bacillus sp. UFRGS-B20

Apart from single amino acid frequencies, one can also calculate the so-called amino acid dipeptide frequency, i.e. how often each ordered pair of adjacent amino acids occurs.
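As a rough illustration of what such a calculation can look like, here is a minimal Python sketch. The page does not describe the exact counting procedure, so the choices here are assumptions: dipeptides are counted as overlapping windows within each protein, any residue outside the 20 standard one-letter codes is mapped to X (Xaa), and the function name dipeptide_permille is hypothetical.

```python
from collections import Counter

STANDARD = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard one-letter codes

def dipeptide_permille(sequences):
    """Return {dipeptide: frequency in per mille} over all sequences."""
    counts = Counter()
    for seq in sequences:
        # Assumption: anything outside the standard 20 residues becomes X (Xaa).
        clean = "".join(c if c in STANDARD else "X" for c in seq.upper())
        # Assumption: overlapping windows, residues i and i+1 form one dipeptide.
        counts.update(clean[i:i + 2] for i in range(len(clean) - 1))
    total = sum(counts.values())
    if total == 0:
        return {}
    return {pair: 1000.0 * n / total for pair, n in counts.items()}

# Two toy sequences (not taken from the proteome), one with a non-standard residue.
freqs = dipeptide_permille(["MALLK", "MABK"])
```

Here "MALLK" contributes MA, AL, LL and LK, and "MABK" contributes MA, AX and XK, so MA accounts for 2 of the 7 dipeptides counted.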

There are 400 possibilities (441 if we count Xaa as an additional 21st amino acid representing all non-standard or unknown amino acids). Thus, if dipeptides occurred randomly in proteins, each of the 400 dipeptides would be present at a frequency of 0.25% (2.5‰). As this is not the case in nature, for better visibility all dipeptides that are more frequent than expected are marked in red, and those that are underrepresented are marked in blue in the table.
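The comparison against the uniform expectation can be sketched in a few lines of Python. This assumes that the threshold for "more than expected" is exactly the uniform value (1000/400 = 2.5‰); the page does not state the precise rule behind the colouring, and the function names are hypothetical. The three example values are taken from the table.

```python
def expected_permille(n_residues=20):
    """Uniform expectation for a single dipeptide, in per mille."""
    return 1000.0 / (n_residues ** 2)

def classify(freq_permille, n_residues=20):
    """Label a dipeptide relative to the uniform expectation (assumed rule)."""
    expected = expected_permille(n_residues)
    if freq_permille > expected:
        return "over"    # would be marked red
    if freq_permille < expected:
        return "under"   # would be marked blue
    return "expected"

# Values in per mille, copied from the table.
examples = {"LeuLeu": 16.897, "AlaAla": 2.514, "TrpTrp": 0.219}
labels = {name: classify(value) for name, value in examples.items()}
```

With 21 residue types (Xaa included), expected_permille(21) gives roughly 2.27‰ instead of 2.5‰.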

All values are given in per mille (‰) and therefore need to be multiplied by 10⁻³ to obtain fractions.
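As a worked example of the unit conversion, take the AlaAla value from the table (2.514‰):

```latex
2.514\,\text{‰} \;=\; 2.514 \times 10^{-3} \;=\; 0.002514 \;=\; 0.2514\,\%
```

This is very close to the 0.25% expected for a uniform distribution over 400 dipeptides.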

For more information see sequence space article on Wikipedia.

Rows: first amino acid; columns: second amino acid. Cells: frequency ± error (‰).

     Ala  Cys  Asp  Glu  Phe  Gly  His  Ile  Lys  Leu  Met  Asn  Pro  Gln  Arg  Ser  Thr  Val  Trp  Tyr  Xaa
Ala  2.514±0.147  1.799±0.113  1.267±0.08  1.367±0.095  3.875±0.152  1.85±0.128  1.166±0.093  3.542±0.16  2.364±0.121  5.173±0.193  1.122±0.087  2.025±0.113  2.044±0.125  1.292±0.097  2.251±0.112  4.351±0.173  2.464±0.105  2.571±0.12  0.483±0.054  1.806±0.106  0.0±0.0
Cys  1.329±0.096  1.423±0.099  0.658±0.071  0.715±0.065  3.586±0.172  1.185±0.085  1.078±0.073  2.426±0.128  1.574±0.104  3.687±0.139  0.771±0.071  1.756±0.111  1.185±0.085  0.828±0.076  1.618±0.101  3.499±0.153  1.718±0.112  1.593±0.099  0.495±0.058  1.335±0.085  0.0±0.0
Asp  1.179±0.087  0.834±0.076  0.84±0.083  0.972±0.086  2.119±0.115  1.129±0.084  0.589±0.057  1.818±0.096  1.404±0.089  2.64±0.124  0.683±0.06  1.285±0.081  0.821±0.067  0.784±0.064  1.379±0.082  2.144±0.115  1.298±0.093  1.442±0.085  0.357±0.047  1.003±0.07  0.0±0.0
Glu  1.455±0.093  0.784±0.072  0.74±0.072  1.455±0.091  1.818±0.1  1.31±0.096  0.796±0.076  2.188±0.115  2.288±0.126  2.621±0.139  1.116±0.083  1.586±0.107  0.972±0.085  1.116±0.078  2.1±0.112  2.1±0.127  1.662±0.113  1.611±0.102  0.332±0.045  1.116±0.083  0.0±0.0
Phe  3.423±0.181  3.179±0.16  1.693±0.092  1.937±0.114  10.383±0.474  2.527±0.138  3.536±0.136  5.9±0.207  3.317±0.139  11.888±0.295  2.069±0.104  3.48±0.151  4.903±0.207  2.872±0.124  4.32±0.165  8.928±0.238  4.075±0.16  4.972±0.194  1.16±0.086  3.687±0.156  0.0±0.0
Gly  1.85±0.127  1.078±0.083  1.141±0.094  1.298±0.092  2.721±0.136  2.006±0.143  0.897±0.075  2.834±0.132  2.458±0.144  3.53±0.145  1.085±0.068  1.818±0.106  1.135±0.096  1.097±0.077  2.069±0.125  2.796±0.129  1.856±0.124  1.975±0.11  0.483±0.061  1.492±0.096  0.0±0.0
His  1.417±0.094  1.028±0.077  0.74±0.073  0.759±0.073  3.411±0.139  0.947±0.065  1.361±0.1  2.389±0.135  1.536±0.088  4.571±0.157  0.809±0.07  1.473±0.097  1.561±0.087  1.298±0.087  1.662±0.095  2.997±0.141  1.756±0.104  1.574±0.097  0.332±0.048  1.43±0.099  0.0±0.0
Ile  3.361±0.149  2.383±0.138  1.919±0.107  2.144±0.117  6.872±0.176  2.395±0.136  2.771±0.152  5.599±0.213  3.749±0.157  8.282±0.225  2.301±0.119  3.618±0.172  3.549±0.16  2.752±0.136  3.994±0.163  7.906±0.235  4.094±0.183  3.95±0.166  0.821±0.072  3.248±0.12  0.0±0.0
Lys  2.383±0.138  1.392±0.09  1.442±0.09  2.276±0.106  2.984±0.129  2.025±0.115  1.448±0.08  4.332±0.193  5.041±0.23  5.072±0.182  1.806±0.117  3.448±0.16  2.194±0.107  2.357±0.117  4.025±0.148  4.326±0.162  3.367±0.135  2.589±0.13  0.708±0.069  2.013±0.111  0.0±0.0
Leu  4.953±0.175  3.379±0.158  2.69±0.137  3.035±0.135  10.652±0.294  3.285±0.138  4.383±0.182  8.22±0.204  5.63±0.175  16.897±0.418  3.329±0.133  5.179±0.189  6.107±0.214  4.853±0.183  5.731±0.193  10.408±0.246  5.611±0.193  5.981±0.197  1.367±0.096  5.561±0.184  0.0±0.0
Met  1.448±0.086  0.69±0.064  0.922±0.069  0.947±0.08  2.113±0.112  1.085±0.091  0.997±0.084  2.088±0.108  1.9±0.112  3.191±0.112  0.508±0.052  1.517±0.092  1.279±0.073  1.103±0.093  1.53±0.098  1.843±0.1  1.436±0.107  1.423±0.088  0.276±0.039  1.21±0.083  0.0±0.0
Asn  2.477±0.126  1.63±0.114  1.361±0.101  1.743±0.111  3.737±0.164  2.044±0.124  1.649±0.093  3.611±0.187  2.84±0.136  4.74±0.181  1.392±0.081  2.696±0.151  2.22±0.126  1.624±0.091  2.79±0.147  4.414±0.171  2.89±0.136  2.589±0.119  0.715±0.073  1.743±0.111  0.0±0.0
Pro  2.288±0.117  1.567±0.116  1.047±0.084  1.066±0.087  4.314±0.182  1.323±0.1  1.574±0.105  3.511±0.134  2.006±0.124  5.919±0.201  1.11±0.09  1.919±0.124  2.828±0.156  1.323±0.105  2.207±0.114  4.727±0.175  2.307±0.126  2.552±0.158  0.451±0.054  1.912±0.113  0.0±0.0
Gln  1.373±0.106  0.828±0.071  0.815±0.065  1.185±0.088  2.188±0.129  1.103±0.093  1.185±0.085  2.589±0.129  2.213±0.133  3.868±0.154  1.091±0.084  1.85±0.106  1.411±0.093  1.392±0.102  2.226±0.115  2.909±0.126  1.95±0.114  1.73±0.108  0.401±0.046  1.423±0.085  0.0±0.0
Arg  2.295±0.12  1.85±0.102  1.379±0.091  1.48±0.102  4.577±0.18  1.843±0.108  1.567±0.098  4.558±0.184  3.862±0.15  5.693±0.176  1.812±0.107  3.26±0.142  2.138±0.12  1.662±0.113  3.599±0.177  4.301±0.178  2.972±0.15  2.52±0.125  0.821±0.075  2.408±0.104  0.0±0.0
Ser  4.564±0.173  3.147±0.134  1.956±0.114  2.163±0.119  9.016±0.278  3.26±0.14  2.796±0.136  6.934±0.177  4.395±0.164  11.461±0.285  2.107±0.114  4.483±0.171  4.521±0.169  2.433±0.117  4.464±0.186  9.881±0.318  4.803±0.177  4.872±0.17  1.041±0.079  3.931±0.161  0.0±0.0
Thr  2.433±0.115  2.006±0.112  1.329±0.099  1.555±0.091  4.289±0.163  2.351±0.153  1.624±0.103  4.608±0.17  2.884±0.154  5.549±0.182  1.323±0.089  2.426±0.113  2.414±0.127  1.298±0.088  2.721±0.131  5.461±0.204  3.323±0.171  3.129±0.131  0.577±0.062  2.144±0.119  0.0±0.0
Val  2.351±0.102  1.737±0.105  1.361±0.11  1.505±0.096  4.778±0.158  1.962±0.123  1.868±0.109  3.975±0.16  2.834±0.15  6.483±0.214  1.517±0.101  2.42±0.136  2.282±0.127  1.762±0.105  2.796±0.147  4.621±0.139  3.003±0.149  2.941±0.158  0.596±0.06  2.132±0.11  0.0±0.0
Trp  0.483±0.055  0.382±0.045  0.351±0.055  0.464±0.052  0.89±0.086  0.458±0.056  0.326±0.049  1.009±0.1  0.846±0.065  1.241±0.083  0.382±0.05  0.721±0.076  0.52±0.055  0.47±0.05  0.809±0.064  0.972±0.065  0.527±0.066  0.633±0.061  0.219±0.041  0.439±0.055  0.0±0.0
Tyr  1.749±0.117  1.342±0.098  1.072±0.081  1.103±0.076  3.925±0.151  1.53±0.09  1.442±0.105  3.574±0.15  2.088±0.124  4.746±0.155  1.166±0.088  1.944±0.107  1.868±0.11  1.417±0.092  2.307±0.112  3.643±0.137  2.332±0.108  2.276±0.122  0.508±0.06  1.756±0.118  0.0±0.0
Xaa  0.0±0.0  0.0±0.0  0.0±0.0  0.0±0.0  0.0±0.0  0.0±0.0  0.0±0.0  0.0±0.0  0.0±0.0  0.0±0.0  0.0±0.0  0.0±0.0  0.0±0.0  0.0±0.0  0.0±0.0  0.0±0.0  0.0±0.0  0.0±0.0  0.0±0.0  0.0±0.0  0.0±0.0
Statistics based on 1976 proteins (159495 amino acids)

Note: The error has been estimated by bootstrapping (×100) at the protein level.
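Bootstrapping at the protein level can be sketched as follows. This is only an illustration under stated assumptions: whole proteins are resampled with replacement to the original proteome size, the frequency is recomputed for each of the 100 resamples, and the sample standard deviation of those estimates is reported as the error. The page does not say which spread statistic was used, and the function names are hypothetical.

```python
import random
import statistics
from collections import Counter

def pair_permille(sequences, pair):
    """Per mille frequency of one dipeptide, counted over overlapping windows."""
    counts = Counter()
    for seq in sequences:
        counts.update(seq[i:i + 2] for i in range(len(seq) - 1))
    total = sum(counts.values())
    return 1000.0 * counts[pair] / total if total else 0.0

def bootstrap_error(sequences, pair, n_rounds=100, seed=0):
    """Standard deviation of the frequency across resampled proteomes."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    estimates = []
    for _ in range(n_rounds):
        # Resample whole proteins with replacement, keeping the proteome size.
        sample = rng.choices(sequences, k=len(sequences))
        estimates.append(pair_permille(sample, pair))
    return statistics.stdev(estimates)

# Toy proteome (not real data) to show the call.
toy_proteome = ["MALLKA", "MLLLG", "MKKAL", "MAAAG"]
error = bootstrap_error(toy_proteome, "LL")
```

Resampling proteins rather than individual residues keeps each protein's sequence intact, so correlations between neighbouring residues are preserved in every resample.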

The dipeptide statistics above (among other statistics for this proteome) can be downloaded from this CSV file
See this proteome in: uniprot_link
Proteome-pI is available under Creative Commons Attribution-NoDerivs license, for more details see here

Reference: Kozlowski LP. Proteome-pI 2.0: Proteome Isoelectric Point Database Update. Nucleic Acids Res. 2021, doi: 10.1093/nar/gkab944
Contact: Lukasz P. Kozlowski