Amino acid dipeptide frequency for Actinobacteria bacterium SCGC AG-212-D09

Apart from single amino acid frequencies, one can also calculate the so-called amino acid dipeptide frequency.

There are 400 possibilities (441 if we consider Xaa as an additional 21st amino acid representing all non-standard or unknown amino acids). Thus, if dipeptides occurred uniformly at random in proteins, each amino acid dipeptide would be present at a frequency of 0.25% (2.5‰). As this is not the case in nature, for better visibility all dipeptides that are more frequent than expected are marked in red, and those that are underrepresented are marked in blue in the table.
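The counting described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the code used to build the database: it assumes dipeptides are counted as overlapping pairs within each protein (never across protein boundaries), that any non-standard letter is mapped to Xaa (written "X" here), and that the reference value is the uniform expectation of 1/400. The names dipeptide_counts, per_mille and classify are hypothetical.

```python
from collections import Counter

STANDARD = "ACDEFGHIKLMNPQRSTVWY"   # the 20 standard amino acids (one-letter codes)
UNIFORM_EXPECTED = 1000.0 / 400     # 2.5 per mille, i.e. 0.25%


def dipeptide_counts(proteins):
    """Count overlapping dipeptides inside each protein sequence.

    Assumption: pairs never span two proteins, and every non-standard
    letter is treated as Xaa ("X").
    """
    counts = Counter()
    for seq in proteins:
        for first, second in zip(seq, seq[1:]):
            first = first if first in STANDARD else "X"
            second = second if second in STANDARD else "X"
            counts[first + second] += 1
    return counts


def per_mille(counts):
    """Convert raw counts into per mille frequencies over all dipeptides."""
    total = sum(counts.values())
    if total == 0:
        return {}
    return {dp: 1000.0 * n / total for dp, n in counts.items()}


def classify(freq):
    """Compare a per mille frequency with the uniform expectation."""
    if freq > UNIFORM_EXPECTED:
        return "over"    # shown in red in the table
    if freq < UNIFORM_EXPECTED:
        return "under"   # shown in blue in the table
    return "expected"


# Toy input, not real data: two short made-up sequences.
toy = ["AAAG", "AGX"]
freqs = per_mille(dipeptide_counts(toy))
for dipeptide, value in sorted(freqs.items()):
    print(dipeptide, round(value, 3), classify(value))
```

Applied to the table values, classify(18.654) reports AlaAla as overrepresented and classify(0.114) reports CysCys as underrepresented relative to the uniform 2.5‰ baseline.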

All values are given in per mille (‰), so they need to be multiplied by 10⁻³ to obtain fractions.
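As a small worked example of the unit conversion, the sketch below turns a per mille value into a fraction and a percentage, and then into an approximate dipeptide count. The count estimate rests on an assumption that is not stated on this page: that each protein of length n contributes n - 1 overlapping dipeptides, so the total number of dipeptides is the number of amino acids minus the number of proteins. The function names are hypothetical.

```python
def per_mille_to_fraction(value):
    """Convert a per mille value to a plain fraction (multiply by 10^-3)."""
    return value * 1e-3


def per_mille_to_percent(value):
    """Convert a per mille value to a percentage."""
    return value / 10.0


def estimated_count(value, n_amino_acids, n_proteins):
    """Estimate how many times a dipeptide occurs.

    Assumption: the total number of dipeptides is n_amino_acids - n_proteins
    (overlapping pairs, counted within each protein separately).
    """
    total_dipeptides = n_amino_acids - n_proteins
    return per_mille_to_fraction(value) * total_dipeptides


# AlaAla value from the table, with the proteome size reported on this page.
ala_ala = 18.654
print(per_mille_to_fraction(ala_ala))
print(per_mille_to_percent(ala_ala))
print(round(estimated_count(ala_ala, n_amino_acids=184414, n_proteins=651)))
```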

For more information, see the sequence space article on Wikipedia.

Ala Cys Asp Glu Phe Gly His Ile Lys Leu Met Asn Pro Gln Arg Ser Thr Val Trp Tyr Xaa
Ala
AlaAla: 18.654 ± 0.507
AlaCys: 1.171 ± 0.075
AlaAsp: 6.296 ± 0.204
AlaGlu: 7.158 ± 0.263
AlaPhe: 3.79 ± 0.144
AlaGly: 11.81 ± 0.336
AlaHis: 2.494 ± 0.126
AlaIle: 5.867 ± 0.171
AlaLys: 2.722 ± 0.126
AlaLeu: 14.001 ± 0.401
AlaMet: 2.348 ± 0.135
AlaAsn: 3.042 ± 0.186
AlaPro: 6.529 ± 0.259
AlaGln: 3.861 ± 0.18
AlaArg: 9.804 ± 0.297
AlaSer: 7.706 ± 0.222
AlaThr: 7.673 ± 0.225
AlaVal: 9.555 ± 0.316
AlaTrp: 1.714 ± 0.104
AlaTyr: 2.381 ± 0.129
AlaXaa: 0.0 ± 0.0
Cys
CysAla: 1.036 ± 0.077
CysCys: 0.114 ± 0.027
CysAsp: 0.542 ± 0.054
CysGlu: 0.434 ± 0.051
CysPhe: 0.347 ± 0.04
CysGly: 0.944 ± 0.081
CysHis: 0.217 ± 0.027
CysIle: 0.315 ± 0.041
CysLys: 0.157 ± 0.033
CysLeu: 0.678 ± 0.059
CysMet: 0.13 ± 0.025
CysAsn: 0.249 ± 0.033
CysPro: 0.542 ± 0.061
CysGln: 0.217 ± 0.038
CysArg: 0.743 ± 0.065
CysSer: 0.634 ± 0.065
CysThr: 0.428 ± 0.049
CysVal: 0.667 ± 0.07
CysTrp: 0.152 ± 0.031
CysTyr: 0.217 ± 0.033
CysXaa: 0.0 ± 0.0
Asp
AspAla: 7.011 ± 0.248
AspCys: 0.493 ± 0.054
AspAsp: 3.096 ± 0.161
AspGlu: 3.709 ± 0.201
AspPhe: 1.551 ± 0.093
AspGly: 5.813 ± 0.218
AspHis: 1.231 ± 0.094
AspIle: 1.892 ± 0.11
AspLys: 0.857 ± 0.074
AspLeu: 5.265 ± 0.202
AspMet: 0.781 ± 0.065
AspAsn: 0.944 ± 0.069
AspPro: 4.371 ± 0.166
AspGln: 1.855 ± 0.113
AspArg: 4.777 ± 0.182
AspSer: 2.511 ± 0.101
AspThr: 2.337 ± 0.124
AspVal: 4.49 ± 0.156
AspTrp: 0.954 ± 0.08
AspTyr: 1.383 ± 0.091
AspXaa: 0.0 ± 0.0
Glu
GluAla: 6.334 ± 0.246
GluCys: 0.385 ± 0.043
GluAsp: 2.646 ± 0.135
GluGlu: 3.042 ± 0.2
GluPhe: 1.73 ± 0.135
GluGly: 3.508 ± 0.154
GluHis: 1.6 ± 0.109
GluIle: 2.592 ± 0.144
GluLys: 1.068 ± 0.098
GluLeu: 7.743 ± 0.28
GluMet: 0.976 ± 0.082
GluAsn: 1.063 ± 0.091
GluPro: 3.319 ± 0.185
GluGln: 2.565 ± 0.139
GluArg: 6.203 ± 0.244
GluSer: 2.738 ± 0.142
GluThr: 2.549 ± 0.121
GluVal: 4.376 ± 0.179
GluTrp: 0.77 ± 0.064
GluTyr: 1.014 ± 0.09
GluXaa: 0.0 ± 0.0
Phe
PheAla: 4.078 ± 0.14
PheCys: 0.315 ± 0.037
PheAsp: 2.126 ± 0.108
PheGlu: 1.952 ± 0.112
PhePhe: 0.949 ± 0.072
PheGly: 3.27 ± 0.162
PheHis: 0.656 ± 0.055
PheIle: 1.177 ± 0.088
PheLys: 0.645 ± 0.068
PheLeu: 2.522 ± 0.138
PheMet: 0.434 ± 0.044
PheAsn: 0.884 ± 0.077
PhePro: 1.475 ± 0.09
PheGln: 0.889 ± 0.078
PheArg: 2.061 ± 0.123
PheSer: 1.746 ± 0.12
PheThr: 2.017 ± 0.131
PheVal: 2.522 ± 0.107
PheTrp: 0.428 ± 0.052
PheTyr: 0.732 ± 0.066
PheXaa: 0.0 ± 0.0
Gly
GlyAla: 10.954 ± 0.361
GlyCys: 0.954 ± 0.077
GlyAsp: 4.615 ± 0.161
GlyGlu: 4.598 ± 0.176
GlyPhe: 2.923 ± 0.129
GlyGly: 8.866 ± 0.416
GlyHis: 2.126 ± 0.102
GlyIle: 3.818 ± 0.171
GlyLys: 2.456 ± 0.124
GlyLeu: 8.172 ± 0.243
GlyMet: 1.855 ± 0.104
GlyAsn: 2.028 ± 0.144
GlyPro: 4.973 ± 0.199
GlyGln: 2.999 ± 0.149
GlyArg: 7.228 ± 0.21
GlySer: 6.534 ± 0.218
GlyThr: 5.786 ± 0.264
GlyVal: 6.67 ± 0.23
GlyTrp: 1.329 ± 0.093
GlyTyr: 2.071 ± 0.096
GlyXaa: 0.0 ± 0.0
His
HisAla: 2.706 ± 0.139
HisCys: 0.206 ± 0.032
HisAsp: 1.193 ± 0.092
HisGlu: 1.361 ± 0.095
HisPhe: 0.634 ± 0.057
HisGly: 2.082 ± 0.105
HisHis: 0.607 ± 0.071
HisIle: 0.819 ± 0.069
HisLys: 0.445 ± 0.046
HisLeu: 2.093 ± 0.113
HisMet: 0.304 ± 0.043
HisAsn: 0.553 ± 0.051
HisPro: 1.426 ± 0.097
HisGln: 0.7 ± 0.063
HisArg: 1.882 ± 0.106
HisSer: 0.981 ± 0.082
HisThr: 0.933 ± 0.073
HisVal: 1.708 ± 0.1
HisTrp: 0.39 ± 0.039
HisTyr: 0.493 ± 0.06
HisXaa: 0.0 ± 0.0
Ile
IleAla: 6.675 ± 0.22
IleCys: 0.293 ± 0.049
IleAsp: 2.993 ± 0.134
IleGlu: 2.863 ± 0.142
IlePhe: 1.236 ± 0.081
IleGly: 4.3 ± 0.181
IleHis: 0.803 ± 0.065
IleIle: 1.063 ± 0.094
IleLys: 0.775 ± 0.073
IleLeu: 2.771 ± 0.121
IleMet: 0.461 ± 0.05
IleAsn: 1.117 ± 0.086
IlePro: 2.294 ± 0.121
IleGln: 1.036 ± 0.087
IleArg: 2.738 ± 0.117
IleSer: 2.608 ± 0.134
IleThr: 2.614 ± 0.15
IleVal: 3.552 ± 0.142
IleTrp: 0.602 ± 0.066
IleTyr: 0.873 ± 0.067
IleXaa: 0.0 ± 0.0
Lys
LysAla: 2.256 ± 0.118
LysCys: 0.195 ± 0.033
LysAsp: 0.884 ± 0.076
LysGlu: 1.036 ± 0.089
LysPhe: 0.472 ± 0.058
LysGly: 1.432 ± 0.098
LysHis: 0.542 ± 0.063
LysIle: 0.841 ± 0.08
LysLys: 0.607 ± 0.078
LysLeu: 2.679 ± 0.144
LysMet: 0.309 ± 0.041
LysAsn: 0.461 ± 0.054
LysPro: 1.648 ± 0.114
LysGln: 0.911 ± 0.078
LysArg: 1.898 ± 0.125
LysSer: 1.041 ± 0.091
LysThr: 1.453 ± 0.106
LysVal: 1.73 ± 0.117
LysTrp: 0.266 ± 0.043
LysTyr: 0.418 ± 0.051
LysXaa: 0.0 ± 0.0
Leu
LeuAla: 13.605 ± 0.42
LeuCys: 0.813 ± 0.068
LeuAsp: 6.176 ± 0.185
LeuGlu: 5.976 ± 0.211
LeuPhe: 2.879 ± 0.145
LeuGly: 8.915 ± 0.254
LeuHis: 1.947 ± 0.113
LeuIle: 4.186 ± 0.173
LeuLys: 1.985 ± 0.108
LeuLeu: 9.717 ± 0.33
LeuMet: 1.404 ± 0.086
LeuAsn: 2.104 ± 0.116
LeuPro: 5.444 ± 0.2
LeuGln: 2.944 ± 0.143
LeuArg: 8.373 ± 0.228
LeuSer: 6.133 ± 0.205
LeuThr: 5.764 ± 0.167
LeuVal: 8.237 ± 0.225
LeuTrp: 1.269 ± 0.088
LeuTyr: 2.066 ± 0.116
LeuXaa: 0.0 ± 0.0
Met
MetAla: 2.066 ± 0.1
MetCys: 0.087 ± 0.023
MetAsp: 0.819 ± 0.069
MetGlu: 0.672 ± 0.07
MetPhe: 0.542 ± 0.054
MetGly: 1.133 ± 0.096
MetHis: 0.363 ± 0.044
MetIle: 0.781 ± 0.061
MetLys: 0.39 ± 0.046
MetLeu: 1.925 ± 0.122
MetMet: 0.244 ± 0.034
MetAsn: 0.488 ± 0.056
MetPro: 0.987 ± 0.063
MetGln: 0.428 ± 0.047
MetArg: 1.524 ± 0.091
MetSer: 1.226 ± 0.067
MetThr: 1.339 ± 0.09
MetVal: 1.139 ± 0.084
MetTrp: 0.211 ± 0.039
MetTyr: 0.255 ± 0.04
MetXaa: 0.0 ± 0.0
Asn
AsnAla: 2.76 ± 0.159
AsnCys: 0.206 ± 0.038
AsnAsp: 1.361 ± 0.102
AsnGlu: 1.052 ± 0.076
AsnPhe: 0.672 ± 0.068
AsnGly: 2.559 ± 0.163
AsnHis: 0.607 ± 0.06
AsnIle: 0.775 ± 0.076
AsnLys: 0.45 ± 0.051
AsnLeu: 2.164 ± 0.127
AsnMet: 0.255 ± 0.041
AsnAsn: 0.596 ± 0.081
AsnPro: 1.784 ± 0.12
AsnGln: 0.873 ± 0.084
AsnArg: 1.773 ± 0.111
AsnSer: 1.285 ± 0.083
AsnThr: 1.437 ± 0.108
AsnVal: 1.708 ± 0.104
AsnTrp: 0.266 ± 0.042
AsnTyr: 0.58 ± 0.057
AsnXaa: 0.0 ± 0.0
Pro
ProAla: 7.64 ± 0.253
ProCys: 0.428 ± 0.052
ProAsp: 3.763 ± 0.166
ProGlu: 3.584 ± 0.182
ProPhe: 1.648 ± 0.089
ProGly: 6.241 ± 0.221
ProHis: 1.036 ± 0.082
ProIle: 2.37 ± 0.121
ProLys: 1.334 ± 0.092
ProLeu: 4.962 ± 0.181
ProMet: 0.895 ± 0.065
ProAsn: 1.432 ± 0.094
ProPro: 3.899 ± 0.211
ProGln: 1.806 ± 0.11
ProArg: 3.796 ± 0.139
ProSer: 3.893 ± 0.181
ProThr: 3.563 ± 0.164
ProVal: 4.533 ± 0.163
ProTrp: 0.808 ± 0.074
ProTyr: 1.193 ± 0.081
ProXaa: 0.0 ± 0.0
Gln
GlnAla: 3.823 ± 0.192
GlnCys: 0.201 ± 0.039
GlnAsp: 1.263 ± 0.089
GlnGlu: 1.329 ± 0.098
GlnPhe: 0.878 ± 0.065
GlnGly: 2.408 ± 0.136
GlnHis: 0.672 ± 0.066
GlnIle: 1.735 ± 0.108
GlnLys: 0.521 ± 0.059
GlnLeu: 3.845 ± 0.17
GlnMet: 0.607 ± 0.065
GlnAsn: 0.737 ± 0.074
GlnPro: 2.012 ± 0.103
GlnGln: 1.518 ± 0.103
GlnArg: 2.76 ± 0.149
GlnSer: 1.871 ± 0.108
GlnThr: 1.822 ± 0.098
GlnVal: 2.516 ± 0.112
GlnTrp: 0.385 ± 0.046
GlnTyr: 0.743 ± 0.068
GlnXaa: 0.0 ± 0.0
Arg
ArgAla: 9.663 ± 0.32
ArgCys: 0.754 ± 0.061
ArgAsp: 4.571 ± 0.193
ArgGlu: 5.086 ± 0.226
ArgPhe: 2.641 ± 0.132
ArgGly: 5.976 ± 0.206
ArgHis: 1.909 ± 0.103
ArgIle: 3.557 ± 0.155
ArgLys: 1.67 ± 0.126
ArgLeu: 8.416 ± 0.241
ArgMet: 1.643 ± 0.096
ArgAsn: 1.507 ± 0.101
ArgPro: 4.501 ± 0.156
ArgGln: 2.429 ± 0.119
ArgArg: 8.611 ± 0.308
ArgSer: 4.886 ± 0.179
ArgThr: 3.969 ± 0.159
ArgVal: 6.095 ± 0.202
ArgTrp: 1.35 ± 0.093
ArgTyr: 1.985 ± 0.111
ArgXaa: 0.0 ± 0.0
Ser
SerAla: 7.575 ± 0.226
SerCys: 0.602 ± 0.064
SerAsp: 3.46 ± 0.153
SerGlu: 3.053 ± 0.139
SerPhe: 2.061 ± 0.106
SerGly: 6.334 ± 0.228
SerHis: 1.155 ± 0.083
SerIle: 2.343 ± 0.134
SerLys: 1.356 ± 0.09
SerLeu: 5.509 ± 0.171
SerMet: 1.22 ± 0.079
SerAsn: 1.621 ± 0.12
SerPro: 3.872 ± 0.142
SerGln: 1.632 ± 0.093
SerArg: 4.306 ± 0.158
SerSer: 4.777 ± 0.214
SerThr: 3.888 ± 0.188
SerVal: 4.268 ± 0.157
SerTrp: 0.824 ± 0.076
SerTyr: 1.285 ± 0.091
SerXaa: 0.0 ± 0.0
Thr
ThrAla: 7.098 ± 0.229
ThrCys: 0.423 ± 0.053
ThrAsp: 3.156 ± 0.135
ThrGlu: 2.972 ± 0.149
ThrPhe: 2.017 ± 0.101
ThrGly: 5.938 ± 0.243
ThrHis: 1.166 ± 0.079
ThrIle: 2.695 ± 0.136
ThrLys: 1.296 ± 0.094
ThrLeu: 5.553 ± 0.183
ThrMet: 0.889 ± 0.079
ThrAsn: 1.502 ± 0.104
ThrPro: 3.785 ± 0.183
ThrGln: 1.48 ± 0.109
ThrArg: 3.541 ± 0.142
ThrSer: 3.655 ± 0.179
ThrThr: 3.763 ± 0.236
ThrVal: 5.021 ± 0.206
ThrTrp: 0.835 ± 0.083
ThrTyr: 1.399 ± 0.114
ThrXaa: 0.0 ± 0.0
Val
ValAla: 10.666 ± 0.255
ValCys: 0.689 ± 0.057
ValAsp: 4.381 ± 0.162
ValGlu: 4.316 ± 0.191
ValPhe: 2.603 ± 0.107
ValGly: 6.632 ± 0.217
ValHis: 1.573 ± 0.087
ValIle: 3.536 ± 0.136
ValLys: 1.578 ± 0.113
ValLeu: 8.074 ± 0.272
ValMet: 1.291 ± 0.087
ValAsn: 1.898 ± 0.103
ValPro: 4.219 ± 0.162
ValGln: 2.283 ± 0.129
ValArg: 5.759 ± 0.201
ValSer: 4.87 ± 0.209
ValThr: 4.609 ± 0.199
ValVal: 6.724 ± 0.186
ValTrp: 1.106 ± 0.083
ValTyr: 1.567 ± 0.099
ValXaa: 0.0 ± 0.0
Trp
TrpAla: 1.399 ± 0.096
TrpCys: 0.19 ± 0.03
TrpAsp: 0.694 ± 0.059
TrpGlu: 0.759 ± 0.061
TrpPhe: 0.548 ± 0.071
TrpGly: 0.797 ± 0.071
TrpHis: 0.358 ± 0.04
TrpIle: 0.672 ± 0.062
TrpLys: 0.298 ± 0.043
TrpLeu: 1.573 ± 0.096
TrpMet: 0.255 ± 0.037
TrpAsn: 0.407 ± 0.046
TrpPro: 0.721 ± 0.059
TrpGln: 0.504 ± 0.058
TrpArg: 1.426 ± 0.097
TrpSer: 0.851 ± 0.075
TrpThr: 1.068 ± 0.087
TrpVal: 1.16 ± 0.086
TrpTrp: 0.325 ± 0.044
TrpTyr: 0.358 ± 0.047
TrpXaa: 0.0 ± 0.0
Tyr
TyrAla: 2.576 ± 0.113
TyrCys: 0.244 ± 0.035
TyrAsp: 1.291 ± 0.095
TyrGlu: 1.269 ± 0.073
TyrPhe: 0.754 ± 0.064
TyrGly: 1.996 ± 0.106
TyrHis: 0.455 ± 0.059
TyrIle: 0.634 ± 0.059
TyrLys: 0.466 ± 0.056
TyrLeu: 2.277 ± 0.102
TyrMet: 0.315 ± 0.041
TyrAsn: 0.531 ± 0.065
TyrPro: 1.025 ± 0.075
TyrGln: 0.716 ± 0.06
TyrArg: 2.017 ± 0.111
TyrSer: 1.193 ± 0.082
TyrThr: 1.171 ± 0.089
TyrVal: 1.703 ± 0.103
TyrTrp: 0.38 ± 0.051
TyrTyr: 0.537 ± 0.059
TyrXaa: 0.0 ± 0.0
Xaa
XaaAla: 0.0 ± 0.0
XaaCys: 0.0 ± 0.0
XaaAsp: 0.0 ± 0.0
XaaGlu: 0.0 ± 0.0
XaaPhe: 0.0 ± 0.0
XaaGly: 0.0 ± 0.0
XaaHis: 0.0 ± 0.0
XaaIle: 0.0 ± 0.0
XaaLys: 0.0 ± 0.0
XaaLeu: 0.0 ± 0.0
XaaMet: 0.0 ± 0.0
XaaAsn: 0.0 ± 0.0
XaaPro: 0.0 ± 0.0
XaaGln: 0.0 ± 0.0
XaaArg: 0.0 ± 0.0
XaaSer: 0.0 ± 0.0
XaaThr: 0.0 ± 0.0
XaaVal: 0.0 ± 0.0
XaaTrp: 0.0 ± 0.0
XaaTyr: 0.0 ± 0.0
XaaXaa: 0.0 ± 0.0
Statistics based on 651 proteins (184414 amino acids)

Note: The error has been estimated by bootstrapping (×100) at the protein level.
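A protein-level bootstrap of this kind could look roughly like the sketch below. It is not the database's own code: it assumes that each of the 100 rounds resamples whole proteins with replacement, recomputes the per mille frequency, and that the reported ± value is the standard deviation across rounds. The function names, the fixed random seed and the toy sequences are all hypothetical.

```python
import random
import statistics
from collections import Counter


def dipeptide_per_mille(proteins):
    """Per mille frequency of each overlapping dipeptide in a set of proteins."""
    counts = Counter()
    for seq in proteins:
        counts.update(a + b for a, b in zip(seq, seq[1:]))
    total = sum(counts.values())
    if total == 0:
        return {}
    return {dp: 1000.0 * n / total for dp, n in counts.items()}


def bootstrap_error(proteins, dipeptide, n_rounds=100, seed=0):
    """Mean and standard deviation of a dipeptide frequency over bootstrap rounds.

    Assumption: whole proteins (not residues) are resampled with replacement,
    and the error is the sample standard deviation of the replicates.
    """
    rng = random.Random(seed)  # fixed seed only to make the sketch repeatable
    estimates = []
    for _ in range(n_rounds):
        sample = rng.choices(proteins, k=len(proteins))
        estimates.append(dipeptide_per_mille(sample).get(dipeptide, 0.0))
    return statistics.mean(estimates), statistics.stdev(estimates)


# Toy proteome, not real data.
toy_proteome = ["MAAGLAA", "MKAALV", "MAGAAR", "MLLAAG"]
mean_aa, err_aa = bootstrap_error(toy_proteome, "AA")
print(f"AlaAla: {mean_aa:.3f} ± {err_aa:.3f}")
```

Resampling at the protein level keeps each sequence intact, so the estimate reflects variation between proteins rather than between individual residues.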

The dipeptide statistics above (along with other statistics for this proteome) can be downloaded from this CSV file
See this proteome in: uniprot_link
Proteome-pI is available under the Creative Commons Attribution-NoDerivs license; for more details see here

Reference: Kozlowski LP. Proteome-pI 2.0: Proteome Isoelectric Point Database Update. Nucleic Acids Res. 2021, doi: 10.1093/nar/gkab944
Contact: Lukasz P. Kozlowski