Amino acid dipeptide frequency for Nitrosopumilales archaeon

Apart from single amino acid frequencies, one can also calculate the so-called amino acid dipeptide frequency, that is, how often each ordered pair of adjacent residues occurs.
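As a minimal sketch of what such a calculation could look like (not the code used by Proteome-pI), the snippet below counts overlapping adjacent residue pairs inside each protein and reports them in per mille. The function names, the toy sequences, the use of one-letter codes, and the choice to normalise by the total number of pairs are all assumptions made for illustration.

```python
from collections import Counter


def dipeptide_counts(sequences):
    """Count overlapping adjacent residue pairs within each protein sequence."""
    counts = Counter()
    for seq in sequences:
        # Slide a window of length 2 along one sequence at a time,
        # so a pair never spans two different proteins.
        for i in range(len(seq) - 1):
            counts[seq[i:i + 2]] += 1
    return counts


def dipeptide_frequencies_permille(sequences):
    """Return {dipeptide: frequency in per mille}.

    Assumption: frequencies are normalised by the total number of pairs.
    """
    counts = dipeptide_counts(sequences)
    total = sum(counts.values())
    return {pair: 1000.0 * n / total for pair, n in counts.items()}


# Made-up sequences in one-letter code (hypothetical, not from this proteome).
toy_proteome = ["MAAL", "ALLK"]
freqs = dipeptide_frequencies_permille(toy_proteome)
for pair in sorted(freqs):
    print(pair, round(freqs[pair], 1))
```

The table on this page uses three-letter codes (for example AlaLeu rather than AL); mapping between the two notations is left out to keep the sketch short.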

There are 400 possibilities (441 if we consider Xaa as an additional 21st amino acid representing all non-standard or unknown amino acids). Thus, if dipeptides occurred at random in proteins, with all 20 amino acids equally likely, each amino acid dipeptide would be present with a frequency of 0.25% (2.5‰). As this is not the case in nature, for better visibility all overrepresented dipeptides are marked in red and all underrepresented ones in blue in the table.
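The expectation above can be checked with a few lines of arithmetic. The sketch below computes the uniform expectation for 20 amino acids and compares a handful of values copied from the table against it. The `classify` helper and its simple greater-than/less-than rule are assumptions; the page does not state the exact rule behind its red and blue marking.

```python
N_STANDARD = 20  # standard amino acids, Xaa excluded

# Uniform expectation: 1 / (20 * 20) = 0.25% = 2.5 per mille.
expected_permille = 1000.0 / N_STANDARD ** 2

# A few observed values (per mille) copied from the table on this page.
observed = {
    "LeuLeu": 8.22,
    "AlaAla": 5.829,
    "CysCys": 0.167,
    "TrpTrp": 0.113,
}


def classify(value, expected=expected_permille):
    """Hypothetical rule: compare a frequency with the uniform expectation."""
    if value > expected:
        return "over"
    if value < expected:
        return "under"
    return "as expected"


for pair, value in observed.items():
    ratio = value / expected_permille
    print(f"{pair}: {value} per mille, {classify(value)}represented, "
          f"{ratio:.2f}x the uniform expectation")
```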

All values are given in per mille (‰), so they need to be multiplied by 10⁻³ to obtain fractions.
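For readers who prefer fractions or percentages, the unit conversion can be sketched as follows. The helper names are hypothetical; the example value is the AlaAla entry from the table.

```python
def permille_to_fraction(value_permille):
    """Convert a per-mille value to a plain fraction (multiply by 10^-3)."""
    return value_permille * 1e-3


def permille_to_percent(value_permille):
    """Convert a per-mille value to percent (divide by 10)."""
    return value_permille / 10.0


ala_ala = 5.829  # AlaAla frequency in per mille, from the table
print(permille_to_fraction(ala_ala))
print(permille_to_percent(ala_ala))
```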

For more information, see the sequence space article on Wikipedia.

Ala Cys Asp Glu Phe Gly His Ile Lys Leu Met Asn Pro Gln Arg Ser Thr Val Trp Tyr Xaa
Ala
AlaAla: 5.829 ± 0.146
AlaCys: 0.746 ± 0.048
AlaAsp: 3.423 ± 0.108
AlaGlu: 3.973 ± 0.103
AlaPhe: 2.835 ± 0.098
AlaGly: 4.987 ± 0.16
AlaHis: 1.295 ± 0.059
AlaIle: 6.426 ± 0.162
AlaLys: 5.098 ± 0.122
AlaLeu: 6.718 ± 0.148
AlaMet: 1.904 ± 0.084
AlaAsn: 3.179 ± 0.097
AlaPro: 1.946 ± 0.081
AlaGln: 1.976 ± 0.076
AlaArg: 3.54 ± 0.096
AlaSer: 4.934 ± 0.124
AlaThr: 4.011 ± 0.112
AlaVal: 4.793 ± 0.125
AlaTrp: 0.501 ± 0.04
AlaTyr: 2.074 ± 0.081
AlaXaa: 0.0 ± 0.0
Cys
CysAla: 0.633 ± 0.04
CysCys: 0.167 ± 0.021
CysAsp: 0.576 ± 0.043
CysGlu: 0.561 ± 0.038
CysPhe: 0.388 ± 0.032
CysGly: 0.991 ± 0.057
CysHis: 0.2 ± 0.025
CysIle: 0.815 ± 0.062
CysLys: 0.651 ± 0.048
CysLeu: 0.889 ± 0.054
CysMet: 0.221 ± 0.023
CysAsn: 0.507 ± 0.036
CysPro: 0.504 ± 0.041
CysGln: 0.284 ± 0.03
CysArg: 0.576 ± 0.041
CysSer: 0.791 ± 0.051
CysThr: 0.552 ± 0.043
CysVal: 0.579 ± 0.041
CysTrp: 0.075 ± 0.014
CysTyr: 0.358 ± 0.035
CysXaa: 0.0 ± 0.0
Asp
AspAla: 3.582 ± 0.11
AspCys: 0.558 ± 0.038
AspAsp: 2.534 ± 0.095
AspGlu: 3.388 ± 0.12
AspPhe: 2.152 ± 0.076
AspGly: 3.405 ± 0.11
AspHis: 0.988 ± 0.05
AspIle: 5.026 ± 0.116
AspLys: 3.599 ± 0.124
AspLeu: 5.017 ± 0.112
AspMet: 1.262 ± 0.054
AspAsn: 2.653 ± 0.096
AspPro: 2.301 ± 0.081
AspGln: 1.307 ± 0.069
AspArg: 2.486 ± 0.091
AspSer: 3.629 ± 0.101
AspThr: 2.671 ± 0.089
AspVal: 3.802 ± 0.107
AspTrp: 0.484 ± 0.038
AspTyr: 1.916 ± 0.071
AspXaa: 0.0 ± 0.0
Glu
GluAla: 3.91 ± 0.136
GluCys: 0.534 ± 0.043
GluAsp: 2.919 ± 0.108
GluGlu: 4.298 ± 0.142
GluPhe: 2.483 ± 0.089
GluGly: 3.492 ± 0.114
GluHis: 1.143 ± 0.062
GluIle: 5.742 ± 0.124
GluLys: 4.943 ± 0.126
GluLeu: 5.671 ± 0.132
GluMet: 1.615 ± 0.071
GluAsn: 3.152 ± 0.105
GluPro: 1.916 ± 0.072
GluGln: 2.265 ± 0.087
GluArg: 3.241 ± 0.097
GluSer: 3.913 ± 0.116
GluThr: 2.615 ± 0.089
GluVal: 4.169 ± 0.134
GluTrp: 0.615 ± 0.048
GluTyr: 2.056 ± 0.083
GluXaa: 0.0 ± 0.0
Phe
PheAla: 2.707 ± 0.105
PheCys: 0.427 ± 0.036
PheAsp: 2.456 ± 0.09
PheGlu: 2.415 ± 0.097
PhePhe: 1.427 ± 0.071
PheGly: 3.143 ± 0.101
PheHis: 0.731 ± 0.052
PheIle: 3.017 ± 0.095
PheLys: 2.28 ± 0.077
PheLeu: 3.346 ± 0.116
PheMet: 0.806 ± 0.05
PheAsn: 1.856 ± 0.07
PhePro: 1.609 ± 0.068
PheGln: 1.095 ± 0.053
PheArg: 1.77 ± 0.083
PheSer: 3.071 ± 0.096
PheThr: 2.155 ± 0.075
PheVal: 2.841 ± 0.099
PheTrp: 0.373 ± 0.032
PheTyr: 1.376 ± 0.07
PheXaa: 0.0 ± 0.0
Gly
GlyAla: 4.632 ± 0.141
GlyCys: 0.722 ± 0.043
GlyAsp: 3.005 ± 0.088
GlyGlu: 3.34 ± 0.11
GlyPhe: 3.107 ± 0.103
GlyGly: 4.919 ± 0.152
GlyHis: 1.454 ± 0.079
GlyIle: 6.492 ± 0.143
GlyLys: 5.137 ± 0.115
GlyLeu: 5.969 ± 0.14
GlyMet: 2.065 ± 0.077
GlyAsn: 3.101 ± 0.104
GlyPro: 2.033 ± 0.085
GlyGln: 1.865 ± 0.079
GlyArg: 3.498 ± 0.111
GlySer: 4.883 ± 0.134
GlyThr: 3.695 ± 0.108
GlyVal: 4.563 ± 0.124
GlyTrp: 0.686 ± 0.048
GlyTyr: 2.51 ± 0.1
GlyXaa: 0.0 ± 0.0
His
HisAla: 1.239 ± 0.056
HisCys: 0.233 ± 0.024
HisAsp: 1.28 ± 0.062
HisGlu: 1.185 ± 0.053
HisPhe: 0.883 ± 0.049
HisGly: 1.325 ± 0.061
HisHis: 0.606 ± 0.048
HisIle: 1.6 ± 0.059
HisLys: 1.239 ± 0.058
HisLeu: 1.892 ± 0.079
HisMet: 0.543 ± 0.038
HisAsn: 0.988 ± 0.058
HisPro: 1.086 ± 0.055
HisGln: 0.564 ± 0.04
HisArg: 1.057 ± 0.056
HisSer: 1.459 ± 0.07
HisThr: 1.155 ± 0.065
HisVal: 1.274 ± 0.057
HisTrp: 0.152 ± 0.023
HisTyr: 0.752 ± 0.05
HisXaa: 0.0 ± 0.0
Ile
IleAla: 6.927 ± 0.159
IleCys: 0.901 ± 0.052
IleAsp: 5.026 ± 0.113
IleGlu: 5.429 ± 0.13
IlePhe: 3.038 ± 0.098
IleGly: 6.181 ± 0.154
IleHis: 1.683 ± 0.07
IleIle: 6.969 ± 0.167
IleLys: 5.471 ± 0.118
IleLeu: 7.814 ± 0.167
IleMet: 1.907 ± 0.076
IleAsn: 4.071 ± 0.11
IlePro: 3.659 ± 0.099
IleGln: 2.594 ± 0.089
IleArg: 4.525 ± 0.135
IleSer: 6.432 ± 0.14
IleThr: 4.784 ± 0.126
IleVal: 5.987 ± 0.14
IleTrp: 0.633 ± 0.039
IleTyr: 2.277 ± 0.081
IleXaa: 0.0 ± 0.0
Lys
LysAla: 4.319 ± 0.122
LysCys: 0.716 ± 0.05
LysAsp: 3.698 ± 0.105
LysGlu: 4.945 ± 0.117
LysPhe: 2.546 ± 0.104
LysGly: 4.346 ± 0.117
LysHis: 1.265 ± 0.054
LysIle: 6.271 ± 0.135
LysLys: 5.166 ± 0.133
LysLeu: 6.008 ± 0.139
LysMet: 2.119 ± 0.069
LysAsn: 3.534 ± 0.105
LysPro: 2.543 ± 0.082
LysGln: 2.289 ± 0.08
LysArg: 3.644 ± 0.101
LysSer: 4.623 ± 0.123
LysThr: 3.755 ± 0.117
LysVal: 4.605 ± 0.131
LysTrp: 0.573 ± 0.039
LysTyr: 2.429 ± 0.087
LysXaa: 0.0 ± 0.0
Leu
LeuAla: 7.017 ± 0.147
LeuCys: 0.869 ± 0.053
LeuAsp: 5.16 ± 0.112
LeuGlu: 5.611 ± 0.14
LeuPhe: 3.083 ± 0.11
LeuGly: 6.002 ± 0.119
LeuHis: 1.803 ± 0.07
LeuIle: 7.026 ± 0.155
LeuLys: 6.563 ± 0.144
LeuLeu: 8.22 ± 0.18
LeuMet: 2.134 ± 0.075
LeuAsn: 4.217 ± 0.116
LeuPro: 3.617 ± 0.102
LeuGln: 3.029 ± 0.093
LeuArg: 4.889 ± 0.109
LeuSer: 7.226 ± 0.158
LeuThr: 4.88 ± 0.144
LeuVal: 6.199 ± 0.111
LeuTrp: 0.66 ± 0.051
LeuTyr: 2.57 ± 0.087
LeuXaa: 0.0 ± 0.0
Met
MetAla: 1.791 ± 0.073
MetCys: 0.167 ± 0.019
MetAsp: 1.307 ± 0.058
MetGlu: 1.394 ± 0.066
MetPhe: 0.919 ± 0.051
MetGly: 1.636 ± 0.073
MetHis: 0.567 ± 0.045
MetIle: 2.092 ± 0.087
MetLys: 1.958 ± 0.073
MetLeu: 2.418 ± 0.083
MetMet: 0.722 ± 0.052
MetAsn: 1.361 ± 0.058
MetPro: 1.176 ± 0.06
MetGln: 0.842 ± 0.045
MetArg: 1.31 ± 0.064
MetSer: 2.0 ± 0.083
MetThr: 1.525 ± 0.068
MetVal: 1.734 ± 0.071
MetTrp: 0.143 ± 0.021
MetTyr: 0.719 ± 0.046
MetXaa: 0.0 ± 0.0
Asn
AsnAla: 3.45 ± 0.102
AsnCys: 0.564 ± 0.041
AsnAsp: 2.388 ± 0.077
AsnGlu: 2.97 ± 0.102
AsnPhe: 1.907 ± 0.075
AsnGly: 3.238 ± 0.103
AsnHis: 0.982 ± 0.056
AsnIle: 4.119 ± 0.125
AsnLys: 3.202 ± 0.112
AsnLeu: 4.31 ± 0.105
AsnMet: 1.236 ± 0.053
AsnAsn: 3.349 ± 0.14
AsnPro: 2.331 ± 0.097
AsnGln: 1.564 ± 0.076
AsnArg: 2.355 ± 0.075
AsnSer: 3.829 ± 0.122
AsnThr: 2.683 ± 0.096
AsnVal: 3.214 ± 0.108
AsnTrp: 0.415 ± 0.035
AsnTyr: 1.689 ± 0.068
AsnXaa: 0.0 ± 0.0
Pro
ProAla: 2.382 ± 0.084
ProCys: 0.328 ± 0.03
ProAsp: 2.164 ± 0.086
ProGlu: 2.447 ± 0.093
ProPhe: 1.746 ± 0.083
ProGly: 2.364 ± 0.094
ProHis: 0.898 ± 0.052
ProIle: 3.005 ± 0.103
ProLys: 2.31 ± 0.078
ProLeu: 3.471 ± 0.105
ProMet: 0.874 ± 0.054
ProAsn: 1.719 ± 0.072
ProPro: 1.558 ± 0.078
ProGln: 1.289 ± 0.058
ProArg: 1.665 ± 0.066
ProSer: 2.937 ± 0.101
ProThr: 2.435 ± 0.079
ProVal: 2.698 ± 0.1
ProTrp: 0.4 ± 0.041
ProTyr: 1.397 ± 0.059
ProXaa: 0.0 ± 0.0
Gln
GlnAla: 2.089 ± 0.092
GlnCys: 0.269 ± 0.026
GlnAsp: 1.412 ± 0.058
GlnGlu: 1.859 ± 0.079
GlnPhe: 1.128 ± 0.053
GlnGly: 1.689 ± 0.068
GlnHis: 0.74 ± 0.043
GlnIle: 2.949 ± 0.098
GlnLys: 2.373 ± 0.087
GlnLeu: 2.758 ± 0.101
GlnMet: 0.889 ± 0.05
GlnAsn: 1.656 ± 0.066
GlnPro: 1.042 ± 0.052
GlnGln: 1.498 ± 0.081
GlnArg: 1.618 ± 0.067
GlnSer: 2.221 ± 0.1
GlnThr: 1.662 ± 0.068
GlnVal: 1.958 ± 0.074
GlnTrp: 0.272 ± 0.028
GlnTyr: 1.14 ± 0.059
GlnXaa: 0.0 ± 0.0
Arg
ArgAla: 3.089 ± 0.105
ArgCys: 0.591 ± 0.046
ArgAsp: 2.716 ± 0.102
ArgGlu: 3.241 ± 0.115
ArgPhe: 2.027 ± 0.084
ArgGly: 2.928 ± 0.107
ArgHis: 1.068 ± 0.057
ArgIle: 4.713 ± 0.119
ArgLys: 3.668 ± 0.105
ArgLeu: 4.751 ± 0.129
ArgMet: 1.591 ± 0.068
ArgAsn: 2.758 ± 0.083
ArgPro: 1.573 ± 0.069
ArgGln: 1.692 ± 0.074
ArgArg: 2.883 ± 0.096
ArgSer: 3.176 ± 0.109
ArgThr: 2.6 ± 0.106
ArgVal: 3.238 ± 0.114
ArgTrp: 0.528 ± 0.045
ArgTyr: 1.949 ± 0.071
ArgXaa: 0.0 ± 0.0
Ser
SerAla: 4.808 ± 0.131
SerCys: 0.692 ± 0.043
SerAsp: 3.635 ± 0.091
SerGlu: 4.253 ± 0.126
SerPhe: 2.829 ± 0.106
SerGly: 5.172 ± 0.145
SerHis: 1.474 ± 0.072
SerIle: 6.16 ± 0.134
SerLys: 5.238 ± 0.13
SerLeu: 6.79 ± 0.142
SerMet: 1.967 ± 0.079
SerAsn: 3.776 ± 0.121
SerPro: 2.698 ± 0.091
SerGln: 2.224 ± 0.077
SerArg: 3.537 ± 0.102
SerSer: 6.172 ± 0.172
SerThr: 4.405 ± 0.134
SerVal: 4.614 ± 0.11
SerTrp: 0.63 ± 0.041
SerTyr: 2.244 ± 0.085
SerXaa: 0.0 ± 0.0
Thr
ThrAla: 3.844 ± 0.105
ThrCys: 0.489 ± 0.038
ThrAsp: 2.749 ± 0.084
ThrGlu: 3.095 ± 0.091
ThrPhe: 2.158 ± 0.081
ThrGly: 4.304 ± 0.118
ThrHis: 1.212 ± 0.056
ThrIle: 4.799 ± 0.128
ThrLys: 3.647 ± 0.098
ThrLeu: 4.766 ± 0.125
ThrMet: 1.406 ± 0.065
ThrAsn: 2.668 ± 0.088
ThrPro: 2.334 ± 0.086
ThrGln: 1.674 ± 0.065
ThrArg: 2.513 ± 0.086
ThrSer: 4.152 ± 0.128
ThrThr: 3.552 ± 0.142
ThrVal: 3.561 ± 0.105
ThrTrp: 0.433 ± 0.037
ThrTyr: 1.57 ± 0.063
ThrXaa: 0.0 ± 0.0
Val
ValAla: 5.184 ± 0.134
ValCys: 0.797 ± 0.046
ValAsp: 3.752 ± 0.111
ValGlu: 3.97 ± 0.109
ValPhe: 2.471 ± 0.094
ValGly: 4.605 ± 0.13
ValHis: 1.376 ± 0.073
ValIle: 5.978 ± 0.127
ValLys: 4.438 ± 0.105
ValLeu: 5.948 ± 0.131
ValMet: 1.653 ± 0.07
ValAsn: 3.116 ± 0.096
ValPro: 2.534 ± 0.082
ValGln: 1.934 ± 0.075
ValArg: 3.51 ± 0.097
ValSer: 4.841 ± 0.115
ValThr: 3.761 ± 0.109
ValVal: 5.071 ± 0.135
ValTrp: 0.54 ± 0.043
ValTyr: 1.964 ± 0.078
ValXaa: 0.0 ± 0.0
Trp
TrpAla: 0.439 ± 0.035
TrpCys: 0.09 ± 0.015
TrpAsp: 0.457 ± 0.039
TrpGlu: 0.388 ± 0.038
TrpPhe: 0.421 ± 0.033
TrpGly: 0.525 ± 0.038
TrpHis: 0.26 ± 0.028
TrpIle: 0.836 ± 0.055
TrpLys: 0.669 ± 0.049
TrpLeu: 0.872 ± 0.053
TrpMet: 0.257 ± 0.027
TrpAsn: 0.522 ± 0.044
TrpPro: 0.242 ± 0.025
TrpGln: 0.292 ± 0.03
TrpArg: 0.4 ± 0.033
TrpSer: 0.489 ± 0.042
TrpThr: 0.433 ± 0.035
TrpVal: 0.528 ± 0.04
TrpTrp: 0.113 ± 0.017
TrpTyr: 0.295 ± 0.028
TrpXaa: 0.0 ± 0.0
Tyr
TyrAla: 2.319 ± 0.085
TyrCys: 0.457 ± 0.04
TyrAsp: 2.104 ± 0.08
TyrGlu: 1.931 ± 0.083
TyrPhe: 1.352 ± 0.065
TyrGly: 2.423 ± 0.089
TyrHis: 0.767 ± 0.053
TyrIle: 2.277 ± 0.07
TyrLys: 1.746 ± 0.064
TyrLeu: 3.146 ± 0.1
TyrMet: 0.606 ± 0.043
TyrAsn: 1.612 ± 0.063
TyrPro: 1.388 ± 0.062
TyrGln: 0.943 ± 0.048
TyrArg: 1.764 ± 0.073
TyrSer: 2.543 ± 0.081
TyrThr: 1.597 ± 0.061
TyrVal: 2.012 ± 0.081
TyrTrp: 0.301 ± 0.028
TyrTyr: 1.301 ± 0.065
TyrXaa: 0.0 ± 0.0
Xaa
XaaAla: 0.0 ± 0.0
XaaCys: 0.0 ± 0.0
XaaAsp: 0.0 ± 0.0
XaaGlu: 0.0 ± 0.0
XaaPhe: 0.0 ± 0.0
XaaGly: 0.0 ± 0.0
XaaHis: 0.0 ± 0.0
XaaIle: 0.0 ± 0.0
XaaLys: 0.0 ± 0.0
XaaLeu: 0.0 ± 0.0
XaaMet: 0.0 ± 0.0
XaaAsn: 0.0 ± 0.0
XaaPro: 0.0 ± 0.0
XaaGln: 0.0 ± 0.0
XaaArg: 0.0 ± 0.0
XaaSer: 0.0 ± 0.0
XaaThr: 0.0 ± 0.0
XaaVal: 0.0 ± 0.0
XaaTrp: 0.0 ± 0.0
XaaTyr: 0.0 ± 0.0
XaaXaa: 0.0 ± 0.0
Statistics based on 1495 proteins (335054 amino acids)

Note: The error was estimated by bootstrapping (x100) at the protein level.
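To illustrate what bootstrapping at the protein level could mean, the sketch below resamples whole proteins with replacement 100 times, recomputes one dipeptide frequency for each resample, and takes the standard deviation of those estimates as the error. This is an assumption about the general technique, not the documented Proteome-pI procedure; the function names, the toy sequences, the fixed seed, and the use of the standard deviation are all hypothetical choices.

```python
import random
from statistics import stdev


def pair_permille(sequences, pair):
    """Frequency (per mille) of one dipeptide among all adjacent pairs."""
    total = 0
    hits = 0
    for seq in sequences:
        for i in range(len(seq) - 1):
            total += 1
            if seq[i:i + 2] == pair:
                hits += 1
    return 1000.0 * hits / total if total else 0.0


def bootstrap_error(sequences, pair, n_rounds=100, seed=0):
    """Standard deviation of the frequency over protein-level resamples."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    estimates = []
    for _ in range(n_rounds):
        # Resample whole proteins with replacement, keeping the proteome size.
        sample = rng.choices(sequences, k=len(sequences))
        estimates.append(pair_permille(sample, pair))
    return stdev(estimates)


# Made-up sequences in one-letter code (hypothetical, not from this proteome).
toy_proteome = ["MAAL", "ALLK", "GAAV", "KLLA"]
print(pair_permille(toy_proteome, "AA"), bootstrap_error(toy_proteome, "AA"))
```

Resampling whole proteins rather than individual residues keeps each sequence intact, so the spread reflects variation between proteins.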

The dipeptide statistics above (among other statistics for this proteome) can be downloaded from this CSV file
See this proteome in: uniprot_link
Proteome-pI is available under a Creative Commons Attribution-NoDerivs license; for more details see here

Reference: Kozlowski LP. Proteome-pI 2.0: Proteome Isoelectric Point Database Update. Nucleic Acids Res. 2021, doi: 10.1093/nar/gkab944
Contact: Lukasz P. Kozlowski