Amino acid dipeptide frequency for Acidithiobacillus sp. SH

Apart from single amino acid frequencies, one can also calculate the so-called amino acid dipeptide frequency.
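As an illustration of what a dipeptide frequency is, here is a minimal sketch in Python. It assumes sequences are given as one-letter-code strings and that overlapping pairs are counted within each protein, never across protein boundaries. The function name dipeptide_frequencies and the toy sequences are hypothetical; the counting procedure actually used by the database may differ in detail.

```python
from collections import Counter


def dipeptide_frequencies(proteins):
    """Return dipeptide frequencies in per mille for a list of sequences."""
    counts = Counter()
    for seq in proteins:
        # Overlapping pairs inside one protein; pairs never span two proteins.
        for i in range(len(seq) - 1):
            counts[seq[i:i + 2]] += 1
    total = sum(counts.values())
    # Scale to per mille so the numbers are comparable with the table.
    return {pair: 1000.0 * n / total for pair, n in counts.items()}


# Toy input (not real proteome data): 7 dipeptides in total, "AA" occurs twice.
toy = ["MKAAL", "AALG"]
freqs = dipeptide_frequencies(toy)
```

On the toy input the frequencies sum to 1000‰, which is a quick sanity check for any real run as well.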

There are 400 possible dipeptides (441 if Xaa is counted as a 21st amino acid representing all non-standard or unknown residues). If dipeptides occurred at random, with all amino acids equally likely, each one would appear with a frequency of 0.25% (2.5‰). This is not the case in nature, so for better visibility dipeptides that occur more often than expected are marked in red in the table, and those that are underrepresented are marked in blue.
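The comparison against the uniform expectation can be sketched as follows. This is a hedged illustration only: it assumes the baseline is exactly 1/400 (2.5‰), as stated above, and the helper name classify is hypothetical. The four observed values are taken from the table on this page.

```python
# Uniform expectation: 1 of 400 dipeptides, expressed in per mille.
UNIFORM_EXPECTED = 1000.0 / 400  # 2.5 per mille (0.25%)


def classify(observed_permille, expected_permille=UNIFORM_EXPECTED):
    """Label a dipeptide as over- or underrepresented against the baseline."""
    if observed_permille > expected_permille:
        return "over"   # shown in red in the original table
    if observed_permille < expected_permille:
        return "under"  # shown in blue in the original table
    return "equal"


# Values in per mille, copied from the table for this proteome.
observed = {"AlaAla": 12.841, "CysCys": 0.12, "GluPhe": 1.576, "ValLeu": 7.818}
labels = {pair: classify(value) for pair, value in observed.items()}
```

If Xaa is counted as a 21st residue, the same baseline becomes 1000/441, roughly 2.27‰.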

All values are given in per mille (‰), so they need to be multiplied by 10^-3 to obtain fractions.

For more information, see the sequence space article on Wikipedia.

Ala Cys Asp Glu Phe Gly His Ile Lys Leu Met Asn Pro Gln Arg Ser Thr Val Trp Tyr Xaa
Ala
AlaAla: 12.841 ± 0.185
AlaCys: 1.074 ± 0.038
AlaAsp: 5.321 ± 0.092
AlaGlu: 6.944 ± 0.107
AlaPhe: 3.642 ± 0.073
AlaGly: 8.385 ± 0.12
AlaHis: 2.585 ± 0.058
AlaIle: 5.891 ± 0.092
AlaLys: 3.423 ± 0.082
AlaLeu: 12.79 ± 0.157
AlaMet: 3.039 ± 0.066
AlaAsn: 2.808 ± 0.075
AlaPro: 4.424 ± 0.106
AlaGln: 5.071 ± 0.089
AlaArg: 6.543 ± 0.113
AlaSer: 5.546 ± 0.093
AlaThr: 4.63 ± 0.079
AlaVal: 7.262 ± 0.102
AlaTrp: 1.701 ± 0.054
AlaTyr: 2.697 ± 0.072
AlaXaa: 0.0 ± 0.0
Cys
CysAla: 1.026 ± 0.038
CysCys: 0.12 ± 0.013
CysAsp: 0.388 ± 0.019
CysGlu: 0.423 ± 0.025
CysPhe: 0.34 ± 0.022
CysGly: 0.917 ± 0.033
CysHis: 0.3 ± 0.019
CysIle: 0.436 ± 0.021
CysLys: 0.291 ± 0.018
CysLeu: 0.928 ± 0.037
CysMet: 0.199 ± 0.014
CysAsn: 0.288 ± 0.018
CysPro: 0.617 ± 0.029
CysGln: 0.419 ± 0.025
CysArg: 0.598 ± 0.027
CysSer: 0.578 ± 0.026
CysThr: 0.417 ± 0.022
CysVal: 0.556 ± 0.028
CysTrp: 0.144 ± 0.013
CysTyr: 0.218 ± 0.017
CysXaa: 0.0 ± 0.0
Asp
AspAla: 5.323 ± 0.095
AspCys: 0.469 ± 0.024
AspAsp: 2.38 ± 0.066
AspGlu: 2.602 ± 0.065
AspPhe: 2.264 ± 0.05
AspGly: 3.552 ± 0.067
AspHis: 1.284 ± 0.043
AspIle: 2.788 ± 0.059
AspLys: 1.789 ± 0.051
AspLeu: 5.831 ± 0.088
AspMet: 1.403 ± 0.045
AspAsn: 1.391 ± 0.049
AspPro: 2.837 ± 0.07
AspGln: 2.311 ± 0.054
AspArg: 2.93 ± 0.063
AspSer: 2.452 ± 0.055
AspThr: 2.305 ± 0.057
AspVal: 3.017 ± 0.059
AspTrp: 1.056 ± 0.036
AspTyr: 1.623 ± 0.043
AspXaa: 0.0 ± 0.0
Glu
GluAla: 6.367 ± 0.095
GluCys: 0.396 ± 0.02
GluAsp: 2.764 ± 0.059
GluGlu: 3.35 ± 0.083
GluPhe: 1.576 ± 0.05
GluGly: 3.595 ± 0.069
GluHis: 1.427 ± 0.045
GluIle: 3.654 ± 0.069
GluLys: 3.042 ± 0.069
GluLeu: 5.105 ± 0.087
GluMet: 1.642 ± 0.049
GluAsn: 2.283 ± 0.05
GluPro: 1.992 ± 0.063
GluGln: 3.001 ± 0.066
GluArg: 3.94 ± 0.078
GluSer: 3.309 ± 0.083
GluThr: 2.918 ± 0.068
GluVal: 3.61 ± 0.076
GluTrp: 0.762 ± 0.031
GluTyr: 1.295 ± 0.043
GluXaa: 0.0 ± 0.0
Phe
PheAla: 3.841 ± 0.07
PheCys: 0.483 ± 0.027
PheAsp: 1.787 ± 0.046
PheGlu: 1.501 ± 0.043
PhePhe: 1.774 ± 0.055
PheGly: 2.859 ± 0.061
PheHis: 1.008 ± 0.038
PheIle: 1.962 ± 0.051
PheLys: 0.999 ± 0.038
PheLeu: 4.089 ± 0.091
PheMet: 0.95 ± 0.037
PheAsn: 1.187 ± 0.035
PhePro: 1.793 ± 0.047
PheGln: 1.49 ± 0.04
PheArg: 2.006 ± 0.045
PheSer: 2.529 ± 0.056
PheThr: 1.876 ± 0.052
PheVal: 2.461 ± 0.06
PheTrp: 0.684 ± 0.028
PheTyr: 1.17 ± 0.041
PheXaa: 0.0 ± 0.0
Gly
GlyAla: 6.857 ± 0.113
GlyCys: 0.755 ± 0.031
GlyAsp: 3.597 ± 0.074
GlyGlu: 4.102 ± 0.076
GlyPhe: 3.169 ± 0.058
GlyGly: 5.761 ± 0.105
GlyHis: 2.145 ± 0.051
GlyIle: 4.864 ± 0.085
GlyLys: 3.793 ± 0.067
GlyLeu: 8.236 ± 0.117
GlyMet: 2.326 ± 0.052
GlyAsn: 2.449 ± 0.061
GlyPro: 2.968 ± 0.083
GlyGln: 3.501 ± 0.068
GlyArg: 4.386 ± 0.087
GlySer: 4.537 ± 0.127
GlyThr: 3.896 ± 0.169
GlyVal: 5.482 ± 0.097
GlyTrp: 1.205 ± 0.043
GlyTyr: 2.26 ± 0.054
GlyXaa: 0.0 ± 0.0
His
HisAla: 2.53 ± 0.056
HisCys: 0.36 ± 0.022
HisAsp: 1.278 ± 0.037
HisGlu: 1.27 ± 0.038
HisPhe: 1.228 ± 0.036
HisGly: 2.089 ± 0.055
HisHis: 0.898 ± 0.033
HisIle: 1.514 ± 0.046
HisLys: 0.79 ± 0.03
HisLeu: 3.214 ± 0.06
HisMet: 0.677 ± 0.025
HisAsn: 0.742 ± 0.031
HisPro: 1.959 ± 0.048
HisGln: 1.258 ± 0.041
HisArg: 1.54 ± 0.038
HisSer: 1.441 ± 0.048
HisThr: 1.131 ± 0.042
HisVal: 1.488 ± 0.044
HisTrp: 0.664 ± 0.031
HisTyr: 1.041 ± 0.039
HisXaa: 0.0 ± 0.0
Ile
IleAla: 6.081 ± 0.091
IleCys: 0.543 ± 0.026
IleAsp: 2.624 ± 0.061
IleGlu: 2.826 ± 0.066
IlePhe: 2.049 ± 0.05
IleGly: 4.161 ± 0.077
IleHis: 1.746 ± 0.048
IleIle: 3.105 ± 0.075
IleLys: 1.754 ± 0.048
IleLeu: 6.144 ± 0.105
IleMet: 1.22 ± 0.04
IleAsn: 1.894 ± 0.062
IlePro: 3.361 ± 0.067
IleGln: 2.421 ± 0.054
IleArg: 3.367 ± 0.061
IleSer: 3.501 ± 0.073
IleThr: 2.909 ± 0.065
IleVal: 3.512 ± 0.07
IleTrp: 0.808 ± 0.034
IleTyr: 1.46 ± 0.047
IleXaa: 0.0 ± 0.0
Lys
LysAla: 4.142 ± 0.085
LysCys: 0.199 ± 0.015
LysAsp: 1.853 ± 0.048
LysGlu: 2.054 ± 0.057
LysPhe: 0.848 ± 0.036
LysGly: 2.518 ± 0.062
LysHis: 0.872 ± 0.033
LysIle: 2.247 ± 0.06
LysLys: 2.021 ± 0.062
LysLeu: 3.329 ± 0.067
LysMet: 1.062 ± 0.041
LysAsn: 1.614 ± 0.049
LysPro: 1.934 ± 0.053
LysGln: 1.41 ± 0.049
LysArg: 2.164 ± 0.057
LysSer: 2.252 ± 0.062
LysThr: 2.273 ± 0.053
LysVal: 2.466 ± 0.059
LysTrp: 0.368 ± 0.022
LysTyr: 0.777 ± 0.033
LysXaa: 0.0 ± 0.0
Leu
LeuAla: 12.071 ± 0.175
LeuCys: 1.004 ± 0.031
LeuAsp: 5.895 ± 0.093
LeuGlu: 6.183 ± 0.108
LeuPhe: 3.988 ± 0.086
LeuGly: 8.413 ± 0.115
LeuHis: 3.019 ± 0.059
LeuIle: 5.276 ± 0.101
LeuLys: 3.645 ± 0.068
LeuLeu: 12.653 ± 0.202
LeuMet: 2.513 ± 0.057
LeuAsn: 3.105 ± 0.067
LeuPro: 6.351 ± 0.104
LeuGln: 6.177 ± 0.119
LeuArg: 7.974 ± 0.127
LeuSer: 7.036 ± 0.091
LeuThr: 4.962 ± 0.085
LeuVal: 6.848 ± 0.115
LeuTrp: 1.634 ± 0.056
LeuTyr: 2.619 ± 0.069
LeuXaa: 0.0 ± 0.0
Met
MetAla: 3.165 ± 0.061
MetCys: 0.105 ± 0.011
MetAsp: 1.537 ± 0.045
MetGlu: 1.45 ± 0.043
MetPhe: 0.618 ± 0.029
MetGly: 2.086 ± 0.052
MetHis: 0.675 ± 0.025
MetIle: 1.337 ± 0.04
MetLys: 0.978 ± 0.032
MetLeu: 2.606 ± 0.057
MetMet: 0.612 ± 0.031
MetAsn: 0.986 ± 0.039
MetPro: 1.509 ± 0.043
MetGln: 1.291 ± 0.039
MetArg: 1.722 ± 0.041
MetSer: 1.597 ± 0.041
MetThr: 1.529 ± 0.039
MetVal: 1.786 ± 0.048
MetTrp: 0.207 ± 0.018
MetTyr: 0.393 ± 0.021
MetXaa: 0.0 ± 0.0
Asn
AsnAla: 3.28 ± 0.077
AsnCys: 0.281 ± 0.02
AsnAsp: 1.42 ± 0.041
AsnGlu: 1.317 ± 0.04
AsnPhe: 1.119 ± 0.038
AsnGly: 2.496 ± 0.074
AsnHis: 0.839 ± 0.033
AsnIle: 1.858 ± 0.05
AsnLys: 1.031 ± 0.039
AsnLeu: 3.291 ± 0.061
AsnMet: 0.765 ± 0.034
AsnAsn: 1.144 ± 0.053
AsnPro: 2.334 ± 0.058
AsnGln: 1.413 ± 0.052
AsnArg: 1.807 ± 0.05
AsnSer: 1.779 ± 0.071
AsnThr: 1.78 ± 0.053
AsnVal: 1.866 ± 0.056
AsnTrp: 0.467 ± 0.025
AsnTyr: 0.888 ± 0.035
AsnXaa: 0.0 ± 0.0
Pro
ProAla: 5.461 ± 0.102
ProCys: 0.4 ± 0.022
ProAsp: 3.135 ± 0.072
ProGlu: 4.23 ± 0.085
ProPhe: 1.828 ± 0.051
ProGly: 4.231 ± 0.079
ProHis: 1.249 ± 0.041
ProIle: 2.341 ± 0.053
ProLys: 1.717 ± 0.056
ProLeu: 5.368 ± 0.089
ProMet: 1.314 ± 0.039
ProAsn: 1.274 ± 0.04
ProPro: 2.213 ± 0.07
ProGln: 2.295 ± 0.057
ProArg: 2.582 ± 0.059
ProSer: 2.699 ± 0.063
ProThr: 2.111 ± 0.046
ProVal: 4.108 ± 0.075
ProTrp: 0.883 ± 0.035
ProTyr: 1.482 ± 0.047
ProXaa: 0.0 ± 0.0
Gln
GlnAla: 5.311 ± 0.092
GlnCys: 0.36 ± 0.019
GlnAsp: 2.17 ± 0.056
GlnGlu: 2.982 ± 0.071
GlnPhe: 1.488 ± 0.041
GlnGly: 3.593 ± 0.067
GlnHis: 1.395 ± 0.04
GlnIle: 2.731 ± 0.065
GlnLys: 2.059 ± 0.055
GlnLeu: 4.576 ± 0.088
GlnMet: 1.333 ± 0.04
GlnAsn: 1.547 ± 0.051
GlnPro: 2.112 ± 0.055
GlnGln: 3.052 ± 0.08
GlnArg: 3.419 ± 0.076
GlnSer: 2.931 ± 0.066
GlnThr: 2.151 ± 0.053
GlnVal: 3.459 ± 0.063
GlnTrp: 0.815 ± 0.038
GlnTyr: 1.145 ± 0.043
GlnXaa: 0.0 ± 0.0
Arg
ArgAla: 5.527 ± 0.105
ArgCys: 0.603 ± 0.024
ArgAsp: 3.228 ± 0.069
ArgGlu: 3.849 ± 0.069
ArgPhe: 2.565 ± 0.058
ArgGly: 3.743 ± 0.071
ArgHis: 1.966 ± 0.048
ArgIle: 3.689 ± 0.073
ArgLys: 2.447 ± 0.061
ArgLeu: 7.401 ± 0.111
ArgMet: 1.7 ± 0.052
ArgAsn: 1.988 ± 0.052
ArgPro: 2.78 ± 0.072
ArgGln: 3.533 ± 0.079
ArgArg: 4.485 ± 0.084
ArgSer: 3.556 ± 0.074
ArgThr: 2.392 ± 0.058
ArgVal: 4.031 ± 0.065
ArgTrp: 1.191 ± 0.042
ArgTyr: 2.039 ± 0.057
ArgXaa: 0.0 ± 0.0
Ser
SerAla: 6.471 ± 0.096
SerCys: 0.57 ± 0.027
SerAsp: 2.593 ± 0.053
SerGlu: 2.825 ± 0.06
SerPhe: 2.039 ± 0.047
SerGly: 5.546 ± 0.137
SerHis: 1.403 ± 0.043
SerIle: 3.185 ± 0.059
SerLys: 1.929 ± 0.051
SerLeu: 6.469 ± 0.094
SerMet: 1.708 ± 0.049
SerAsn: 1.81 ± 0.063
SerPro: 3.142 ± 0.067
SerGln: 2.478 ± 0.059
SerArg: 3.463 ± 0.065
SerSer: 3.762 ± 0.096
SerThr: 3.074 ± 0.07
SerVal: 3.758 ± 0.073
SerTrp: 0.962 ± 0.037
SerTyr: 1.398 ± 0.049
SerXaa: 0.0 ± 0.0
Thr
ThrAla: 5.399 ± 0.082
ThrCys: 0.419 ± 0.024
ThrAsp: 2.22 ± 0.053
ThrGlu: 2.404 ± 0.054
ThrPhe: 1.674 ± 0.046
ThrGly: 4.372 ± 0.102
ThrHis: 1.323 ± 0.034
ThrIle: 2.585 ± 0.082
ThrLys: 1.23 ± 0.044
ThrLeu: 6.212 ± 0.102
ThrMet: 1.022 ± 0.034
ThrAsn: 1.247 ± 0.05
ThrPro: 3.01 ± 0.058
ThrGln: 2.047 ± 0.046
ThrArg: 2.892 ± 0.054
ThrSer: 2.476 ± 0.06
ThrThr: 2.235 ± 0.059
ThrVal: 3.429 ± 0.063
ThrTrp: 0.735 ± 0.033
ThrTyr: 1.236 ± 0.042
ThrXaa: 0.0 ± 0.0
Val
ValAla: 6.994 ± 0.117
ValCys: 0.63 ± 0.03
ValAsp: 3.519 ± 0.072
ValGlu: 3.78 ± 0.066
ValPhe: 2.593 ± 0.062
ValGly: 4.613 ± 0.095
ValHis: 1.719 ± 0.046
ValIle: 3.929 ± 0.083
ValLys: 2.179 ± 0.06
ValLeu: 7.818 ± 0.123
ValMet: 1.703 ± 0.047
ValAsn: 2.146 ± 0.057
ValPro: 3.298 ± 0.066
ValGln: 2.915 ± 0.071
ValArg: 3.848 ± 0.073
ValSer: 4.215 ± 0.069
ValThr: 3.601 ± 0.075
ValVal: 4.7 ± 0.09
ValTrp: 0.901 ± 0.037
ValTyr: 1.517 ± 0.046
ValXaa: 0.0 ± 0.0
Trp
TrpAla: 1.304 ± 0.043
TrpCys: 0.133 ± 0.011
TrpAsp: 0.664 ± 0.03
TrpGlu: 0.805 ± 0.036
TrpPhe: 0.546 ± 0.025
TrpGly: 1.082 ± 0.044
TrpHis: 0.446 ± 0.028
TrpIle: 0.929 ± 0.038
TrpLys: 0.563 ± 0.026
TrpLeu: 2.153 ± 0.058
TrpMet: 0.471 ± 0.024
TrpAsn: 0.45 ± 0.027
TrpPro: 0.746 ± 0.028
TrpGln: 1.079 ± 0.041
TrpArg: 1.163 ± 0.047
TrpSer: 0.925 ± 0.034
TrpThr: 0.737 ± 0.03
TrpVal: 1.094 ± 0.039
TrpTrp: 0.323 ± 0.024
TrpTyr: 0.401 ± 0.021
TrpXaa: 0.0 ± 0.0
Tyr
TyrAla: 2.624 ± 0.065
TyrCys: 0.3 ± 0.021
TyrAsp: 1.232 ± 0.046
TyrGlu: 1.13 ± 0.034
TyrPhe: 1.101 ± 0.037
TyrGly: 2.277 ± 0.067
TyrHis: 0.812 ± 0.031
TyrIle: 1.163 ± 0.037
TyrLys: 0.723 ± 0.033
TyrLeu: 3.155 ± 0.068
TyrMet: 0.532 ± 0.024
TyrAsn: 0.823 ± 0.037
TyrPro: 1.541 ± 0.046
TyrGln: 1.463 ± 0.053
TyrArg: 1.886 ± 0.054
TyrSer: 1.464 ± 0.049
TyrThr: 1.24 ± 0.042
TyrVal: 1.688 ± 0.044
TyrTrp: 0.506 ± 0.026
TyrTyr: 0.77 ± 0.032
TyrXaa: 0.0 ± 0.0
Xaa
XaaAla: 0.0 ± 0.0
XaaCys: 0.0 ± 0.0
XaaAsp: 0.0 ± 0.0
XaaGlu: 0.0 ± 0.0
XaaPhe: 0.0 ± 0.0
XaaGly: 0.0 ± 0.0
XaaHis: 0.0 ± 0.0
XaaIle: 0.0 ± 0.0
XaaLys: 0.0 ± 0.0
XaaLeu: 0.0 ± 0.0
XaaMet: 0.0 ± 0.0
XaaAsn: 0.0 ± 0.0
XaaPro: 0.0 ± 0.0
XaaGln: 0.0 ± 0.0
XaaArg: 0.0 ± 0.0
XaaSer: 0.0 ± 0.0
XaaThr: 0.0 ± 0.0
XaaVal: 0.0 ± 0.0
XaaTrp: 0.0 ± 0.0
XaaTyr: 0.0 ± 0.0
XaaXaa: 0.0 ± 0.0
Statistics based on 2792 proteins (849494 amino acids)

Note: The error was estimated by bootstrapping (100 replicates) at the protein level.
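A protein-level bootstrap of this kind can be sketched as below. This is an illustrative reconstruction under stated assumptions, not the database's actual code: whole proteins are resampled with replacement, the dipeptide frequency is recomputed for each replicate, and the spread is reported as the standard deviation of the replicates. The source does not say which spread measure the ± value represents, so the use of the standard deviation is an assumption, and the names pair_counts and bootstrap_error are hypothetical.

```python
import random
from collections import Counter
from statistics import stdev


def pair_counts(seq):
    """Count overlapping dipeptides within a single protein sequence."""
    return Counter(seq[i:i + 2] for i in range(len(seq) - 1))


def bootstrap_error(proteins, pair, replicates=100, seed=0):
    """Estimate the spread (in per mille) of one dipeptide's frequency."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    per_protein = [pair_counts(s) for s in proteins]
    estimates = []
    for _ in range(replicates):
        # Resample whole proteins with replacement (protein-level bootstrap).
        sample = rng.choices(per_protein, k=len(per_protein))
        total = sum(sum(c.values()) for c in sample)
        hits = sum(c[pair] for c in sample)
        estimates.append(1000.0 * hits / total if total else 0.0)
    # Assumption: the reported error is the standard deviation of replicates.
    return stdev(estimates)


# Toy input (not real proteome data).
toy = ["MKAAL", "AALG", "GGGA", "LLKA"]
err = bootstrap_error(toy, "AA")
```

Resampling proteins rather than individual dipeptides keeps the pairs from one protein together, so the estimate reflects variation between proteins.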

The dipeptide statistics above (along with other statistics for this proteome) can be downloaded from this CSV file
See this proteome in: uniprot_link
Proteome-pI is available under Creative Commons Attribution-NoDerivs license, for more details see here

Reference: Kozlowski LP. Proteome-pI 2.0: Proteome Isoelectric Point Database Update. Nucleic Acids Res. 2021, doi: 10.1093/nar/gkab944
Contact: Lukasz P. Kozlowski