Amino acid dipeptide frequency for Bathymodiolus azoricus thioautotrophic gill symbiont

Apart from single amino acid frequencies, one can also calculate the so-called amino acid dipeptide frequency, i.e. how often each ordered pair of adjacent amino acids occurs.

There are 400 possible dipeptides (441 if Xaa is counted as an additional 21st amino acid representing all non-standard or unknown amino acids). Thus, if dipeptides occurred at random in proteins, with all 20 amino acids equally frequent, each dipeptide would be present at a frequency of 0.25% (2.5‰). As this is not the case in nature, for better visibility all dipeptides that are more frequent than expected are marked in red in the table, and those that are underrepresented are marked in blue.
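The uniform expectation is simple arithmetic, and a minimal Python sketch makes the threshold explicit. It assumes, as the text implies, that a dipeptide counts as overrepresented when its frequency exceeds 2.5‰ and as underrepresented when it falls below that value. The function name `classify` is hypothetical, and the four sample values are copied from the table on this page.

```python
# Uniform expectation for dipeptides built from the 20 standard amino acids.
N_STANDARD = 20
N_DIPEPTIDES = N_STANDARD ** 2            # 400 ordered pairs
EXPECTED_PER_MILLE = 1000 / N_DIPEPTIDES  # 2.5 per mille, i.e. 0.25 %


def classify(freq_per_mille, expected=EXPECTED_PER_MILLE):
    """Label a frequency relative to the uniform expectation (assumed rule)."""
    if freq_per_mille > expected:
        return "over"   # shown in red in the original table
    if freq_per_mille < expected:
        return "under"  # shown in blue in the original table
    return "as expected"


# A few values from the table (per mille).
examples = {"LeuLeu": 10.273, "AlaAla": 4.103, "CysCys": 0.211, "TrpTrp": 0.138}
labels = {pair: classify(value) for pair, value in examples.items()}

for pair, label in labels.items():
    print(f"{pair}: {examples[pair]} per mille -> {label}")
```

Note that this compares against a flat expectation only. Comparing each dipeptide against the product of its two single amino acid frequencies would be a different, stricter test, and nothing on this page states that it was used.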

All values are given in per mille (‰), so they need to be multiplied by 10^-3 to obtain fractions.
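As an illustration of how such per mille values can be obtained, here is a minimal Python sketch. It assumes that dipeptides are counted as overlapping adjacent pairs within each protein, that pairs never span two proteins, and that any non-standard letter is mapped to X (Xaa). These are assumptions for the example, not a description of the exact Proteome-pI procedure. The function name `dipeptide_per_mille` and the toy sequences are hypothetical.

```python
from collections import Counter

STANDARD = set("ACDEFGHIKLMNPQRSTVWY")  # one-letter codes of the 20 standard amino acids


def dipeptide_per_mille(proteins):
    """Return dipeptide frequencies in per mille over all proteins."""
    counts = Counter()
    for seq in proteins:
        # Map anything outside the standard alphabet to X (Xaa).
        seq = "".join(aa if aa in STANDARD else "X" for aa in seq.upper())
        # Overlapping windows of length 2, counted within one protein only.
        counts.update(seq[i:i + 2] for i in range(len(seq) - 1))
    total = sum(counts.values())
    if total == 0:
        return {}
    return {pair: 1000 * n / total for pair, n in counts.items()}


toy_proteins = ["MKLLA", "MKA"]  # made-up sequences, not from the proteome
freqs = dipeptide_per_mille(toy_proteins)

# Converting a per mille value back to a fraction: multiply by 10^-3.
fraction_mk = freqs["MK"] * 1e-3
```

In the toy input there are six dipeptides in total, two of which are MK, so `freqs["MK"]` is one third of 1000‰.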

For more information, see the sequence space article on Wikipedia.

Ala Cys Asp Glu Phe Gly His Ile Lys Leu Met Asn Pro Gln Arg Ser Thr Val Trp Tyr Xaa
Ala
AlaAla: 4.103 ± 0.126
AlaCys: 0.83 ± 0.044
AlaAsp: 3.886 ± 0.108
AlaGlu: 3.421 ± 0.087
AlaPhe: 3.438 ± 0.111
AlaGly: 4.558 ± 0.128
AlaHis: 1.598 ± 0.058
AlaIle: 6.108 ± 0.104
AlaLys: 5.747 ± 0.121
AlaLeu: 8.383 ± 0.17
AlaMet: 2.123 ± 0.073
AlaAsn: 3.705 ± 0.099
AlaPro: 2.34 ± 0.092
AlaGln: 3.753 ± 0.106
AlaArg: 2.599 ± 0.077
AlaSer: 4.483 ± 0.096
AlaThr: 4.07 ± 0.096
AlaVal: 4.568 ± 0.112
AlaTrp: 0.851 ± 0.058
AlaTyr: 2.257 ± 0.079
AlaXaa: 0.0 ± 0.0
Cys
CysAla: 0.768 ± 0.044
CysCys: 0.211 ± 0.021
CysAsp: 0.63 ± 0.037
CysGlu: 0.611 ± 0.034
CysPhe: 0.513 ± 0.037
CysGly: 0.907 ± 0.045
CysHis: 0.29 ± 0.027
CysIle: 0.818 ± 0.046
CysLys: 0.605 ± 0.036
CysLeu: 1.041 ± 0.051
CysMet: 0.257 ± 0.025
CysAsn: 0.471 ± 0.03
CysPro: 0.528 ± 0.037
CysGln: 0.511 ± 0.035
CysArg: 0.353 ± 0.029
CysSer: 0.741 ± 0.043
CysThr: 0.555 ± 0.038
CysVal: 0.799 ± 0.039
CysTrp: 0.083 ± 0.015
CysTyr: 0.313 ± 0.027
CysXaa: 0.0 ± 0.0
Asp
AspAla: 3.98 ± 0.098
AspCys: 0.647 ± 0.04
AspAsp: 3.037 ± 0.078
AspGlu: 3.876 ± 0.099
AspPhe: 3.154 ± 0.093
AspGly: 3.709 ± 0.103
AspHis: 0.655 ± 0.037
AspIle: 4.917 ± 0.11
AspLys: 4.739 ± 0.101
AspLeu: 5.307 ± 0.124
AspMet: 1.408 ± 0.057
AspAsn: 3.252 ± 0.083
AspPro: 1.389 ± 0.058
AspGln: 1.335 ± 0.053
AspArg: 1.831 ± 0.063
AspSer: 3.202 ± 0.089
AspThr: 2.822 ± 0.075
AspVal: 3.381 ± 0.095
AspTrp: 0.793 ± 0.044
AspTyr: 2.303 ± 0.069
AspXaa: 0.0 ± 0.0
Glu
GluAla: 4.186 ± 0.109
GluCys: 0.611 ± 0.044
GluAsp: 2.695 ± 0.083
GluGlu: 2.918 ± 0.082
GluPhe: 2.541 ± 0.079
GluGly: 3.141 ± 0.092
GluHis: 1.245 ± 0.05
GluIle: 4.641 ± 0.099
GluLys: 4.787 ± 0.108
GluLeu: 5.482 ± 0.115
GluMet: 1.602 ± 0.064
GluAsn: 3.137 ± 0.09
GluPro: 1.454 ± 0.055
GluGln: 2.889 ± 0.085
GluArg: 2.449 ± 0.079
GluSer: 3.264 ± 0.084
GluThr: 2.524 ± 0.073
GluVal: 4.318 ± 0.101
GluTrp: 0.448 ± 0.032
GluTyr: 1.886 ± 0.061
GluXaa: 0.0 ± 0.0
Phe
PheAla: 3.233 ± 0.085
PheCys: 0.588 ± 0.039
PheAsp: 3.158 ± 0.08
PheGlu: 2.831 ± 0.076
PhePhe: 2.305 ± 0.081
PheGly: 2.906 ± 0.09
PheHis: 0.826 ± 0.044
PheIle: 3.557 ± 0.093
PheLys: 3.096 ± 0.085
PheLeu: 4.205 ± 0.118
PheMet: 1.106 ± 0.052
PheAsn: 2.568 ± 0.078
PhePro: 1.281 ± 0.046
PheGln: 1.206 ± 0.045
PheArg: 1.289 ± 0.051
PheSer: 3.694 ± 0.101
PheThr: 2.238 ± 0.071
PheVal: 3.05 ± 0.076
PheTrp: 0.594 ± 0.037
PheTyr: 1.656 ± 0.062
PheXaa: 0.0 ± 0.0
Gly
GlyAla: 4.887 ± 0.133
GlyCys: 0.822 ± 0.045
GlyAsp: 3.45 ± 0.088
GlyGlu: 3.776 ± 0.097
GlyPhe: 3.442 ± 0.092
GlyGly: 4.687 ± 0.138
GlyHis: 1.391 ± 0.064
GlyIle: 4.731 ± 0.129
GlyLys: 4.8 ± 0.11
GlyLeu: 6.239 ± 0.109
GlyMet: 1.806 ± 0.075
GlyAsn: 2.943 ± 0.09
GlyPro: 1.181 ± 0.059
GlyGln: 2.128 ± 0.077
GlyArg: 2.474 ± 0.098
GlySer: 3.888 ± 0.122
GlyThr: 3.004 ± 0.095
GlyVal: 5.511 ± 0.116
GlyTrp: 0.699 ± 0.037
GlyTyr: 2.39 ± 0.086
GlyXaa: 0.0 ± 0.0
His
HisAla: 1.289 ± 0.064
HisCys: 0.307 ± 0.027
HisAsp: 0.861 ± 0.038
HisGlu: 0.887 ± 0.045
HisPhe: 0.976 ± 0.045
HisGly: 1.327 ± 0.059
HisHis: 0.603 ± 0.041
HisIle: 1.85 ± 0.07
HisLys: 1.306 ± 0.058
HisLeu: 2.236 ± 0.073
HisMet: 0.49 ± 0.03
HisAsn: 1.179 ± 0.048
HisPro: 1.237 ± 0.061
HisGln: 1.072 ± 0.048
HisArg: 0.847 ± 0.042
HisSer: 1.402 ± 0.056
HisThr: 1.258 ± 0.053
HisVal: 0.876 ± 0.046
HisTrp: 0.275 ± 0.028
HisTyr: 0.868 ± 0.045
HisXaa: 0.0 ± 0.0
Ile
IleAla: 6.537 ± 0.129
IleCys: 0.857 ± 0.04
IleAsp: 5.346 ± 0.104
IleGlu: 5.307 ± 0.122
IlePhe: 3.404 ± 0.102
IleGly: 5.309 ± 0.134
IleHis: 1.516 ± 0.053
IleIle: 5.515 ± 0.122
IleLys: 6.379 ± 0.132
IleLeu: 7.182 ± 0.159
IleMet: 1.725 ± 0.058
IleAsn: 4.614 ± 0.103
IlePro: 2.687 ± 0.082
IleGln: 2.689 ± 0.081
IleArg: 2.668 ± 0.081
IleSer: 5.43 ± 0.124
IleThr: 4.51 ± 0.093
IleVal: 4.831 ± 0.11
IleTrp: 0.592 ± 0.037
IleTyr: 2.405 ± 0.089
IleXaa: 0.0 ± 0.0
Lys
LysAla: 5.265 ± 0.122
LysCys: 0.576 ± 0.029
LysAsp: 4.305 ± 0.107
LysGlu: 4.126 ± 0.106
LysPhe: 2.505 ± 0.077
LysGly: 4.147 ± 0.09
LysHis: 1.746 ± 0.06
LysIle: 6.439 ± 0.131
LysLys: 5.786 ± 0.146
LysLeu: 6.631 ± 0.139
LysMet: 2.113 ± 0.072
LysAsn: 4.681 ± 0.106
LysPro: 2.476 ± 0.071
LysGln: 3.874 ± 0.089
LysArg: 2.899 ± 0.074
LysSer: 4.579 ± 0.087
LysThr: 4.516 ± 0.106
LysVal: 4.639 ± 0.107
LysTrp: 0.576 ± 0.033
LysTyr: 2.105 ± 0.079
LysXaa: 0.0 ± 0.0
Leu
LeuAla: 7.426 ± 0.153
LeuCys: 0.991 ± 0.05
LeuAsp: 5.622 ± 0.103
LeuGlu: 5.96 ± 0.115
LeuPhe: 4.301 ± 0.124
LeuGly: 6.36 ± 0.15
LeuHis: 1.775 ± 0.071
LeuIle: 7.365 ± 0.144
LeuLys: 7.58 ± 0.136
LeuLeu: 10.273 ± 0.205
LeuMet: 2.847 ± 0.084
LeuAsn: 5.248 ± 0.107
LeuPro: 4.057 ± 0.088
LeuGln: 3.252 ± 0.089
LeuArg: 3.502 ± 0.095
LeuSer: 8.677 ± 0.167
LeuThr: 5.392 ± 0.116
LeuVal: 6.016 ± 0.125
LeuTrp: 0.909 ± 0.055
LeuTyr: 2.591 ± 0.073
LeuXaa: 0.0 ± 0.0
Met
MetAla: 2.13 ± 0.078
MetCys: 0.219 ± 0.022
MetAsp: 1.241 ± 0.049
MetGlu: 1.16 ± 0.058
MetPhe: 0.966 ± 0.049
MetGly: 1.84 ± 0.068
MetHis: 0.576 ± 0.038
MetIle: 1.919 ± 0.063
MetLys: 1.69 ± 0.07
MetLeu: 2.549 ± 0.068
MetMet: 0.774 ± 0.044
MetAsn: 1.318 ± 0.05
MetPro: 1.249 ± 0.058
MetGln: 1.31 ± 0.056
MetArg: 1.147 ± 0.049
MetSer: 2.036 ± 0.063
MetThr: 1.423 ± 0.058
MetVal: 1.859 ± 0.06
MetTrp: 0.179 ± 0.019
MetTyr: 0.544 ± 0.035
MetXaa: 0.0 ± 0.0
Asn
AsnAla: 4.124 ± 0.101
AsnCys: 0.507 ± 0.035
AsnAsp: 2.666 ± 0.073
AsnGlu: 2.801 ± 0.085
AsnPhe: 2.269 ± 0.084
AsnGly: 2.96 ± 0.086
AsnHis: 1.139 ± 0.053
AsnIle: 5.006 ± 0.12
AsnLys: 4.372 ± 0.101
AsnLeu: 4.706 ± 0.095
AsnMet: 1.131 ± 0.048
AsnAsn: 3.494 ± 0.092
AsnPro: 2.263 ± 0.085
AsnGln: 2.493 ± 0.084
AsnArg: 1.936 ± 0.064
AsnSer: 3.114 ± 0.091
AsnThr: 3.534 ± 0.087
AsnVal: 2.486 ± 0.074
AsnTrp: 0.524 ± 0.033
AsnTyr: 1.725 ± 0.066
AsnXaa: 0.0 ± 0.0
Pro
ProAla: 2.232 ± 0.069
ProCys: 0.415 ± 0.03
ProAsp: 1.848 ± 0.068
ProGlu: 2.282 ± 0.067
ProPhe: 1.496 ± 0.055
ProGly: 1.487 ± 0.067
ProHis: 0.672 ± 0.035
ProIle: 2.745 ± 0.075
ProLys: 2.468 ± 0.066
ProLeu: 3.686 ± 0.115
ProMet: 1.095 ± 0.055
ProAsn: 1.806 ± 0.058
ProPro: 1.156 ± 0.054
ProGln: 1.189 ± 0.048
ProArg: 1.133 ± 0.055
ProSer: 2.447 ± 0.088
ProThr: 1.894 ± 0.06
ProVal: 2.284 ± 0.071
ProTrp: 0.382 ± 0.029
ProTyr: 1.124 ± 0.048
ProXaa: 0.0 ± 0.0
Gln
GlnAla: 3.492 ± 0.095
GlnCys: 0.4 ± 0.031
GlnAsp: 2.23 ± 0.074
GlnGlu: 2.117 ± 0.08
GlnPhe: 1.418 ± 0.054
GlnGly: 2.543 ± 0.071
GlnHis: 0.991 ± 0.044
GlnIle: 3.143 ± 0.093
GlnLys: 3.033 ± 0.092
GlnLeu: 4.076 ± 0.103
GlnMet: 1.076 ± 0.049
GlnAsn: 2.032 ± 0.065
GlnPro: 1.089 ± 0.048
GlnGln: 2.186 ± 0.081
GlnArg: 1.658 ± 0.059
GlnSer: 2.812 ± 0.082
GlnThr: 2.39 ± 0.089
GlnVal: 2.816 ± 0.077
GlnTrp: 0.388 ± 0.024
GlnTyr: 1.341 ± 0.057
GlnXaa: 0.0 ± 0.0
Arg
ArgAla: 2.574 ± 0.078
ArgCys: 0.367 ± 0.032
ArgAsp: 2.021 ± 0.077
ArgGlu: 2.23 ± 0.074
ArgPhe: 1.946 ± 0.064
ArgGly: 2.29 ± 0.077
ArgHis: 0.811 ± 0.039
ArgIle: 2.895 ± 0.085
ArgLys: 2.401 ± 0.072
ArgLeu: 3.934 ± 0.096
ArgMet: 0.96 ± 0.045
ArgAsn: 1.662 ± 0.056
ArgPro: 1.118 ± 0.05
ArgGln: 1.435 ± 0.055
ArgArg: 1.602 ± 0.059
ArgSer: 2.247 ± 0.082
ArgThr: 1.75 ± 0.062
ArgVal: 2.599 ± 0.092
ArgTrp: 0.426 ± 0.029
ArgTyr: 1.491 ± 0.058
ArgXaa: 0.0 ± 0.0
Ser
SerAla: 5.108 ± 0.117
SerCys: 0.736 ± 0.039
SerAsp: 3.634 ± 0.084
SerGlu: 3.6 ± 0.088
SerPhe: 3.085 ± 0.073
SerGly: 5.079 ± 0.123
SerHis: 1.368 ± 0.054
SerIle: 5.557 ± 0.147
SerLys: 4.668 ± 0.105
SerLeu: 6.687 ± 0.126
SerMet: 1.656 ± 0.063
SerAsn: 3.494 ± 0.092
SerPro: 2.061 ± 0.073
SerGln: 2.384 ± 0.069
SerArg: 2.368 ± 0.073
SerSer: 4.593 ± 0.118
SerThr: 3.652 ± 0.101
SerVal: 4.754 ± 0.106
SerTrp: 0.617 ± 0.035
SerTyr: 1.977 ± 0.061
SerXaa: 0.0 ± 0.0
Thr
ThrAla: 3.356 ± 0.103
ThrCys: 0.53 ± 0.036
ThrAsp: 2.933 ± 0.08
ThrGlu: 2.584 ± 0.073
ThrPhe: 2.123 ± 0.062
ThrGly: 3.794 ± 0.09
ThrHis: 1.589 ± 0.06
ThrIle: 4.195 ± 0.102
ThrLys: 3.659 ± 0.093
ThrLeu: 6.258 ± 0.124
ThrMet: 1.26 ± 0.057
ThrAsn: 2.699 ± 0.081
ThrPro: 2.605 ± 0.072
ThrGln: 2.837 ± 0.081
ThrArg: 1.738 ± 0.058
ThrSer: 3.148 ± 0.075
ThrThr: 3.325 ± 0.102
ThrVal: 3.212 ± 0.088
ThrTrp: 0.532 ± 0.032
ThrTyr: 1.673 ± 0.064
ThrXaa: 0.0 ± 0.0
Val
ValAla: 5.34 ± 0.117
ValCys: 0.793 ± 0.048
ValAsp: 3.984 ± 0.101
ValGlu: 3.821 ± 0.103
ValPhe: 3.296 ± 0.098
ValGly: 4.264 ± 0.109
ValHis: 1.124 ± 0.045
ValIle: 5.24 ± 0.106
ValLys: 4.339 ± 0.104
ValLeu: 6.64 ± 0.117
ValMet: 1.717 ± 0.06
ValAsn: 3.116 ± 0.086
ValPro: 2.088 ± 0.07
ValGln: 2.032 ± 0.06
ValArg: 2.324 ± 0.082
ValSer: 4.689 ± 0.108
ValThr: 3.045 ± 0.079
ValVal: 5.15 ± 0.124
ValTrp: 0.626 ± 0.038
ValTyr: 1.788 ± 0.058
ValXaa: 0.0 ± 0.0
Trp
TrpAla: 0.617 ± 0.032
TrpCys: 0.159 ± 0.019
TrpAsp: 0.599 ± 0.039
TrpGlu: 0.411 ± 0.031
TrpPhe: 0.509 ± 0.037
TrpGly: 0.82 ± 0.072
TrpHis: 0.375 ± 0.03
TrpIle: 0.72 ± 0.038
TrpLys: 0.524 ± 0.036
TrpLeu: 1.137 ± 0.06
TrpMet: 0.327 ± 0.027
TrpAsn: 0.346 ± 0.032
TrpPro: 0.188 ± 0.021
TrpGln: 0.63 ± 0.039
TrpArg: 0.476 ± 0.03
TrpSer: 0.559 ± 0.038
TrpThr: 0.419 ± 0.028
TrpVal: 0.749 ± 0.04
TrpTrp: 0.138 ± 0.017
TrpTyr: 0.323 ± 0.031
TrpXaa: 0.0 ± 0.0
Tyr
TyrAla: 2.171 ± 0.062
TyrCys: 0.438 ± 0.032
TyrAsp: 1.589 ± 0.064
TyrGlu: 1.502 ± 0.058
TyrPhe: 1.698 ± 0.063
TyrGly: 2.121 ± 0.065
TyrHis: 0.853 ± 0.044
TyrIle: 2.142 ± 0.076
TyrLys: 2.007 ± 0.073
TyrLeu: 3.494 ± 0.102
TyrMet: 0.653 ± 0.032
TyrAsn: 1.483 ± 0.06
TyrPro: 1.393 ± 0.056
TyrGln: 2.03 ± 0.071
TyrArg: 1.468 ± 0.059
TyrSer: 2.078 ± 0.078
TyrThr: 1.729 ± 0.079
TyrVal: 1.516 ± 0.049
TyrTrp: 0.392 ± 0.032
TyrTyr: 1.266 ± 0.059
TyrXaa: 0.0 ± 0.0
Xaa
XaaAla: 0.0 ± 0.0
XaaCys: 0.0 ± 0.0
XaaAsp: 0.0 ± 0.0
XaaGlu: 0.0 ± 0.0
XaaPhe: 0.0 ± 0.0
XaaGly: 0.0 ± 0.0
XaaHis: 0.0 ± 0.0
XaaIle: 0.0 ± 0.0
XaaLys: 0.0 ± 0.0
XaaLeu: 0.0 ± 0.0
XaaMet: 0.0 ± 0.0
XaaAsn: 0.0 ± 0.0
XaaPro: 0.0 ± 0.0
XaaGln: 0.0 ± 0.0
XaaArg: 0.0 ± 0.0
XaaSer: 0.0 ± 0.0
XaaThr: 0.0 ± 0.0
XaaVal: 0.0 ± 0.0
XaaTrp: 0.0 ± 0.0
XaaTyr: 0.0 ± 0.0
XaaXaa: 0.0 ± 0.0
Statistics based on 2029 proteins (479403 amino acids)

Note: The error has been estimated by bootstrapping (x100) at the protein level.
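A protein-level bootstrap of this kind can be sketched in Python as follows. The sketch assumes that "x100" means 100 resampling rounds, that whole proteins are drawn with replacement, and that the reported error is the standard deviation of the resampled frequencies. None of these details is confirmed on this page. The function names and the toy sequences are hypothetical.

```python
import random
import statistics


def pair_per_mille(proteins, pair):
    """Frequency of one dipeptide (one-letter codes) in per mille."""
    hits = 0
    total = 0
    for seq in proteins:
        for i in range(len(seq) - 1):
            total += 1
            if seq[i:i + 2] == pair:
                hits += 1
    return 1000 * hits / total if total else 0.0


def bootstrap_error(proteins, pair, n_rounds=100, seed=0):
    """Standard deviation of the frequency over protein-level resamples."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    estimates = []
    for _ in range(n_rounds):
        # Draw whole proteins with replacement, keeping the sample size.
        sample = rng.choices(proteins, k=len(proteins))
        estimates.append(pair_per_mille(sample, pair))
    return statistics.stdev(estimates)


toy_proteins = ["MKLLA", "MKA", "LLGLL", "AKLA"]  # made-up sequences
value = pair_per_mille(toy_proteins, "LL")
error = bootstrap_error(toy_proteins, "LL")
print(f"LL: {value:.3f} ± {error:.3f} per mille")
```

Resampling whole proteins rather than individual dipeptides keeps the dependence between neighbouring residues inside each protein intact, which is presumably why the error is estimated at the protein level.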

The above dipeptide statistics (among other statistics for this proteome) can be downloaded from this CSV file
See this proteome in: uniprot_link
Proteome-pI is available under the Creative Commons Attribution-NoDerivs license; for more details see here

Reference: Kozlowski LP. Proteome-pI 2.0: Proteome Isoelectric Point Database Update. Nucleic Acids Res. 2021, doi: 10.1093/nar/gkab944
Contact: Lukasz P. Kozlowski