Amino acid dipeptide frequency for Nanohaloarchaea archaeon

Apart from single amino acid frequencies, one can also calculate the so-called amino acid dipeptide frequency, that is, how often each ordered pair of adjacent amino acids occurs.
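A minimal sketch of how such frequencies could be computed from a set of protein sequences. It assumes overlapping pairs are counted within each protein and that any letter outside the 20 standard one-letter codes is pooled into X (Xaa); the function name and these choices are illustrative assumptions, not necessarily the procedure used for this table.

```python
from collections import Counter
from itertools import product

STANDARD = "ACDEFGHIKLMNPQRSTVWY"  # 20 standard amino acids, one-letter codes


def dipeptide_permille(sequences):
    """Return {dipeptide: frequency in per mille} for all 441 ordered pairs."""
    counts = Counter()
    for seq in sequences:
        # Assumption: anything that is not a standard residue becomes X (Xaa).
        seq = "".join(c if c in STANDARD else "X" for c in seq.upper())
        # Overlapping pairs: a protein of length L contributes L - 1 dipeptides.
        counts.update(seq[i:i + 2] for i in range(len(seq) - 1))
    total = sum(counts.values())
    alphabet = STANDARD + "X"
    return {
        a + b: (1000.0 * counts[a + b] / total if total else 0.0)
        for a, b in product(alphabet, repeat=2)
    }
```

Calling `dipeptide_permille(["MKVLA", "MAAG"])` returns a dictionary with 441 entries whose values sum to 1000 whenever at least one dipeptide was counted.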

There are 400 possible dipeptides (441 if Xaa is counted as an additional 21st amino acid representing all non-standard or unknown amino acids). Thus, if dipeptides occurred randomly in proteins, each of the 400 would be present with a frequency of 0.25% (2.5‰). As this is not the case in nature, for better visibility all dipeptides that occur more often than expected are marked in red in the table, and those that are underrepresented are marked in blue.

All values are presented in per mille (‰) and therefore need to be multiplied by 10^-3 to obtain fractions.
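As a small illustration of the two points above, the sketch below converts a per mille value to a fraction and compares it with the uniform baseline of 400 equally likely dipeptides. The function names are hypothetical, and the simple greater-than/less-than rule is an assumption; the exact thresholds used for the red and blue marking are not stated here.

```python
# Uniform baseline from the text: 400 equally likely dipeptides.
EXPECTED_PERMILLE = 1000.0 / 400  # 2.5 per mille, i.e. 0.25%


def permille_to_fraction(value_permille):
    """Convert a per mille value to a plain fraction (multiply by 10^-3)."""
    return value_permille * 1e-3


def representation(value_permille, expected=EXPECTED_PERMILLE):
    """Label a dipeptide relative to the uniform baseline (assumed rule)."""
    if value_permille > expected:
        return "over"
    if value_permille < expected:
        return "under"
    return "expected"


# Two values taken from the table: GluGlu (16.029) and CysCys (0.04).
for name, value in [("GluGlu", 16.029), ("CysCys", 0.04)]:
    print(name, permille_to_fraction(value), representation(value))
```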

For more information, see the sequence space article on Wikipedia.

Ala Cys Asp Glu Phe Gly His Ile Lys Leu Met Asn Pro Gln Arg Ser Thr Val Trp Tyr Xaa
Ala
AlaAla: 5.419 ± 0.194
AlaCys: 0.44 ± 0.047
AlaAsp: 4.375 ± 0.131
AlaGlu: 7.368 ± 0.194
AlaPhe: 2.578 ± 0.11
AlaGly: 5.354 ± 0.193
AlaHis: 0.986 ± 0.066
AlaIle: 3.603 ± 0.127
AlaLys: 3.126 ± 0.109
AlaLeu: 6.069 ± 0.168
AlaMet: 1.556 ± 0.087
AlaAsn: 2.029 ± 0.098
AlaPro: 1.863 ± 0.096
AlaGln: 2.17 ± 0.107
AlaArg: 3.119 ± 0.116
AlaSer: 4.736 ± 0.154
AlaThr: 3.065 ± 0.123
AlaVal: 5.74 ± 0.177
AlaTrp: 0.603 ± 0.043
AlaTyr: 1.989 ± 0.083
AlaXaa: 0.0 ± 0.0
Cys
CysAla: 0.318 ± 0.035
CysCys: 0.04 ± 0.012
CysAsp: 0.44 ± 0.048
CysGlu: 0.462 ± 0.043
CysPhe: 0.231 ± 0.029
CysGly: 0.845 ± 0.069
CysHis: 0.094 ± 0.018
CysIle: 0.256 ± 0.03
CysLys: 0.318 ± 0.036
CysLeu: 0.404 ± 0.039
CysMet: 0.144 ± 0.021
CysAsn: 0.253 ± 0.034
CysPro: 0.415 ± 0.045
CysGln: 0.177 ± 0.026
CysArg: 0.365 ± 0.038
CysSer: 0.542 ± 0.051
CysThr: 0.271 ± 0.03
CysVal: 0.238 ± 0.028
CysTrp: 0.072 ± 0.016
CysTyr: 0.191 ± 0.026
CysXaa: 0.0 ± 0.0
Asp
AspAla: 4.126 ± 0.133
AspCys: 0.325 ± 0.038
AspAsp: 4.195 ± 0.154
AspGlu: 8.076 ± 0.222
AspPhe: 3.408 ± 0.124
AspGly: 4.408 ± 0.17
AspHis: 1.177 ± 0.065
AspIle: 5.137 ± 0.141
AspLys: 3.83 ± 0.117
AspLeu: 6.578 ± 0.17
AspMet: 1.74 ± 0.088
AspAsn: 2.765 ± 0.104
AspPro: 2.43 ± 0.081
AspGln: 2.292 ± 0.086
AspArg: 3.946 ± 0.133
AspSer: 4.578 ± 0.179
AspThr: 3.801 ± 0.132
AspVal: 4.881 ± 0.16
AspTrp: 0.747 ± 0.06
AspTyr: 2.783 ± 0.112
AspXaa: 0.0 ± 0.0
Glu
GluAla: 7.26 ± 0.188
GluCys: 0.625 ± 0.063
GluAsp: 8.845 ± 0.236
GluGlu: 16.029 ± 0.435
GluPhe: 3.787 ± 0.132
GluGly: 7.227 ± 0.162
GluHis: 1.686 ± 0.082
GluIle: 7.516 ± 0.208
GluLys: 9.137 ± 0.256
GluLeu: 8.693 ± 0.221
GluMet: 3.036 ± 0.104
GluAsn: 5.394 ± 0.157
GluPro: 2.552 ± 0.106
GluGln: 3.426 ± 0.119
GluArg: 4.783 ± 0.144
GluSer: 5.119 ± 0.16
GluThr: 4.946 ± 0.132
GluVal: 7.924 ± 0.189
GluTrp: 0.982 ± 0.073
GluTyr: 3.361 ± 0.122
GluXaa: 0.0 ± 0.0
Phe
PheAla: 2.195 ± 0.102
PheCys: 0.282 ± 0.036
PheAsp: 3.043 ± 0.104
PheGlu: 3.715 ± 0.129
PhePhe: 1.758 ± 0.087
PheGly: 2.686 ± 0.115
PheHis: 0.69 ± 0.055
PheIle: 2.274 ± 0.113
PheLys: 1.61 ± 0.071
PheLeu: 3.639 ± 0.147
PheMet: 1.079 ± 0.072
PheAsn: 1.635 ± 0.082
PhePro: 1.339 ± 0.076
PheGln: 1.588 ± 0.076
PheArg: 1.823 ± 0.08
PheSer: 3.469 ± 0.116
PheThr: 2.314 ± 0.106
PheVal: 2.368 ± 0.094
PheTrp: 0.357 ± 0.045
PheTyr: 1.415 ± 0.09
PheXaa: 0.0 ± 0.0
Gly
GlyAla: 4.195 ± 0.167
GlyCys: 0.473 ± 0.042
GlyAsp: 4.682 ± 0.171
GlyGlu: 7.502 ± 0.2
GlyPhe: 3.264 ± 0.127
GlyGly: 5.029 ± 0.196
GlyHis: 1.314 ± 0.076
GlyIle: 4.567 ± 0.148
GlyLys: 4.13 ± 0.13
GlyLeu: 7.332 ± 0.192
GlyMet: 1.704 ± 0.079
GlyAsn: 2.827 ± 0.129
GlyPro: 2.065 ± 0.098
GlyGln: 2.527 ± 0.107
GlyArg: 3.419 ± 0.124
GlySer: 5.04 ± 0.214
GlyThr: 3.866 ± 0.148
GlyVal: 4.744 ± 0.127
GlyTrp: 0.711 ± 0.054
GlyTyr: 2.646 ± 0.109
GlyXaa: 0.0 ± 0.0
His
HisAla: 1.022 ± 0.05
HisCys: 0.141 ± 0.022
HisAsp: 0.964 ± 0.063
HisGlu: 1.646 ± 0.077
HisPhe: 0.744 ± 0.051
HisGly: 1.31 ± 0.064
HisHis: 0.401 ± 0.044
HisIle: 1.04 ± 0.06
HisLys: 0.758 ± 0.055
HisLeu: 1.354 ± 0.069
HisMet: 0.354 ± 0.035
HisAsn: 0.596 ± 0.052
HisPro: 0.812 ± 0.06
HisGln: 0.646 ± 0.052
HisArg: 1.004 ± 0.063
HisSer: 1.126 ± 0.071
HisThr: 0.798 ± 0.053
HisVal: 1.235 ± 0.068
HisTrp: 0.231 ± 0.03
HisTyr: 0.581 ± 0.051
HisXaa: 0.0 ± 0.0
Ile
IleAla: 4.307 ± 0.172
IleCys: 0.314 ± 0.036
IleAsp: 4.859 ± 0.14
IleGlu: 7.072 ± 0.18
IlePhe: 2.332 ± 0.127
IleGly: 4.816 ± 0.159
IleHis: 1.101 ± 0.064
IleIle: 3.206 ± 0.139
IleLys: 2.794 ± 0.098
IleLeu: 4.812 ± 0.178
IleMet: 1.274 ± 0.077
IleAsn: 2.419 ± 0.094
IlePro: 2.545 ± 0.105
IleGln: 2.065 ± 0.093
IleArg: 2.859 ± 0.1
IleSer: 4.715 ± 0.142
IleThr: 3.148 ± 0.122
IleVal: 3.798 ± 0.131
IleTrp: 0.458 ± 0.04
IleTyr: 1.971 ± 0.082
IleXaa: 0.0 ± 0.0
Lys
LysAla: 4.397 ± 0.135
LysCys: 0.397 ± 0.043
LysAsp: 3.791 ± 0.135
LysGlu: 6.635 ± 0.186
LysPhe: 2.072 ± 0.091
LysGly: 3.599 ± 0.106
LysHis: 1.043 ± 0.063
LysIle: 4.231 ± 0.13
LysLys: 4.173 ± 0.161
LysLeu: 4.664 ± 0.135
LysMet: 1.671 ± 0.079
LysAsn: 2.422 ± 0.094
LysPro: 1.801 ± 0.087
LysGln: 2.025 ± 0.094
LysArg: 2.487 ± 0.111
LysSer: 3.245 ± 0.128
LysThr: 2.827 ± 0.117
LysVal: 4.404 ± 0.132
LysTrp: 0.484 ± 0.044
LysTyr: 1.964 ± 0.1
LysXaa: 0.0 ± 0.0
Leu
LeuAla: 5.762 ± 0.178
LeuCys: 0.365 ± 0.044
LeuAsp: 6.469 ± 0.164
LeuGlu: 9.805 ± 0.265
LeuPhe: 2.852 ± 0.123
LeuGly: 6.206 ± 0.18
LeuHis: 1.318 ± 0.07
LeuIle: 4.39 ± 0.158
LeuLys: 5.058 ± 0.137
LeuLeu: 6.7 ± 0.213
LeuMet: 2.119 ± 0.09
LeuAsn: 3.397 ± 0.115
LeuPro: 2.87 ± 0.106
LeuGln: 3.058 ± 0.11
LeuArg: 3.841 ± 0.123
LeuSer: 5.935 ± 0.162
LeuThr: 4.267 ± 0.152
LeuVal: 5.747 ± 0.157
LeuTrp: 0.668 ± 0.054
LeuTyr: 2.415 ± 0.105
LeuXaa: 0.0 ± 0.0
Met
MetAla: 1.884 ± 0.087
MetCys: 0.123 ± 0.021
MetAsp: 1.968 ± 0.098
MetGlu: 2.296 ± 0.095
MetPhe: 0.704 ± 0.052
MetGly: 1.592 ± 0.074
MetHis: 0.455 ± 0.041
MetIle: 1.596 ± 0.078
MetLys: 1.783 ± 0.095
MetLeu: 1.852 ± 0.103
MetMet: 0.697 ± 0.047
MetAsn: 1.116 ± 0.06
MetPro: 1.011 ± 0.061
MetGln: 0.877 ± 0.06
MetArg: 1.058 ± 0.056
MetSer: 1.762 ± 0.092
MetThr: 1.437 ± 0.076
MetVal: 1.65 ± 0.075
MetTrp: 0.162 ± 0.025
MetTyr: 0.664 ± 0.049
MetXaa: 0.0 ± 0.0
Asn
AsnAla: 2.581 ± 0.104
AsnCys: 0.332 ± 0.039
AsnAsp: 2.31 ± 0.105
AsnGlu: 2.935 ± 0.108
AsnPhe: 1.736 ± 0.078
AsnGly: 2.715 ± 0.099
AsnHis: 0.686 ± 0.048
AsnIle: 2.877 ± 0.102
AsnLys: 1.917 ± 0.088
AsnLeu: 3.744 ± 0.125
AsnMet: 1.032 ± 0.065
AsnAsn: 1.762 ± 0.105
AsnPro: 2.119 ± 0.086
AsnGln: 1.715 ± 0.08
AsnArg: 2.097 ± 0.083
AsnSer: 3.256 ± 0.131
AsnThr: 2.394 ± 0.106
AsnVal: 2.621 ± 0.108
AsnTrp: 0.491 ± 0.043
AsnTyr: 1.751 ± 0.084
AsnXaa: 0.0 ± 0.0
Pro
ProAla: 2.256 ± 0.095
ProCys: 0.206 ± 0.031
ProAsp: 2.899 ± 0.124
ProGlu: 5.336 ± 0.19
ProPhe: 1.278 ± 0.067
ProGly: 2.834 ± 0.099
ProHis: 0.675 ± 0.053
ProIle: 1.549 ± 0.073
ProLys: 1.365 ± 0.068
ProLeu: 2.303 ± 0.09
ProMet: 0.653 ± 0.051
ProAsn: 1.079 ± 0.064
ProPro: 0.978 ± 0.064
ProGln: 1.148 ± 0.069
ProArg: 1.184 ± 0.067
ProSer: 2.365 ± 0.11
ProThr: 1.563 ± 0.076
ProVal: 2.888 ± 0.113
ProTrp: 0.278 ± 0.031
ProTyr: 1.202 ± 0.072
ProXaa: 0.0 ± 0.0
Gln
GlnAla: 2.365 ± 0.1
GlnCys: 0.184 ± 0.024
GlnAsp: 2.401 ± 0.105
GlnGlu: 3.809 ± 0.157
GlnPhe: 1.09 ± 0.062
GlnGly: 2.39 ± 0.097
GlnHis: 0.567 ± 0.043
GlnIle: 2.166 ± 0.091
GlnLys: 2.585 ± 0.097
GlnLeu: 2.682 ± 0.091
GlnMet: 1.061 ± 0.064
GlnAsn: 1.765 ± 0.08
GlnPro: 1.144 ± 0.064
GlnGln: 1.787 ± 0.132
GlnArg: 1.755 ± 0.07
GlnSer: 1.946 ± 0.1
GlnThr: 1.711 ± 0.081
GlnVal: 2.217 ± 0.086
GlnTrp: 0.267 ± 0.031
GlnTyr: 1.014 ± 0.061
GlnXaa: 0.0 ± 0.0
Arg
ArgAla: 2.56 ± 0.101
ArgCys: 0.282 ± 0.033
ArgAsp: 3.477 ± 0.113
ArgGlu: 5.892 ± 0.171
ArgPhe: 1.827 ± 0.088
ArgGly: 2.866 ± 0.111
ArgHis: 0.769 ± 0.047
ArgIle: 3.014 ± 0.095
ArgLys: 3.823 ± 0.147
ArgLeu: 3.224 ± 0.105
ArgMet: 1.188 ± 0.067
ArgAsn: 2.329 ± 0.091
ArgPro: 1.671 ± 0.07
ArgGln: 1.538 ± 0.078
ArgArg: 2.148 ± 0.099
ArgSer: 2.671 ± 0.105
ArgThr: 2.321 ± 0.085
ArgVal: 3.043 ± 0.106
ArgTrp: 0.448 ± 0.039
ArgTyr: 1.625 ± 0.074
ArgXaa: 0.0 ± 0.0
Ser
SerAla: 4.144 ± 0.167
SerCys: 0.394 ± 0.039
SerAsp: 4.668 ± 0.185
SerGlu: 6.859 ± 0.179
SerPhe: 3.072 ± 0.117
SerGly: 5.856 ± 0.23
SerHis: 1.094 ± 0.064
SerIle: 4.217 ± 0.137
SerLys: 3.74 ± 0.132
SerLeu: 5.57 ± 0.176
SerMet: 1.747 ± 0.081
SerAsn: 2.729 ± 0.121
SerPro: 2.278 ± 0.106
SerGln: 2.469 ± 0.1
SerArg: 3.022 ± 0.112
SerSer: 5.812 ± 0.233
SerThr: 3.52 ± 0.147
SerVal: 4.318 ± 0.139
SerTrp: 0.664 ± 0.045
SerTyr: 2.206 ± 0.104
SerXaa: 0.0 ± 0.0
Thr
ThrAla: 3.971 ± 0.136
ThrCys: 0.35 ± 0.037
ThrAsp: 3.498 ± 0.129
ThrGlu: 4.736 ± 0.152
ThrPhe: 1.888 ± 0.101
ThrGly: 4.866 ± 0.168
ThrHis: 0.856 ± 0.054
ThrIle: 2.599 ± 0.106
ThrLys: 1.921 ± 0.097
ThrLeu: 4.227 ± 0.14
ThrMet: 1.004 ± 0.053
ThrAsn: 1.744 ± 0.084
ThrPro: 1.888 ± 0.085
ThrGln: 1.744 ± 0.078
ThrArg: 2.123 ± 0.095
ThrSer: 3.549 ± 0.143
ThrThr: 2.697 ± 0.154
ThrVal: 5.137 ± 0.166
ThrTrp: 0.404 ± 0.044
ThrTyr: 1.69 ± 0.095
ThrXaa: 0.0 ± 0.0
Val
ValAla: 4.859 ± 0.138
ValCys: 0.433 ± 0.037
ValAsp: 5.3 ± 0.162
ValGlu: 8.52 ± 0.206
ValPhe: 2.928 ± 0.105
ValGly: 4.495 ± 0.145
ValHis: 1.097 ± 0.069
ValIle: 4.159 ± 0.129
ValLys: 4.245 ± 0.129
ValLeu: 5.769 ± 0.163
ValMet: 1.628 ± 0.073
ValAsn: 2.816 ± 0.122
ValPro: 2.671 ± 0.106
ValGln: 2.17 ± 0.099
ValArg: 2.971 ± 0.1
ValSer: 4.852 ± 0.167
ValThr: 3.718 ± 0.134
ValVal: 5.181 ± 0.139
ValTrp: 0.56 ± 0.048
ValTyr: 2.249 ± 0.088
ValXaa: 0.0 ± 0.0
Trp
TrpAla: 0.606 ± 0.046
TrpCys: 0.069 ± 0.018
TrpAsp: 0.599 ± 0.054
TrpGlu: 0.794 ± 0.054
TrpPhe: 0.321 ± 0.033
TrpGly: 0.563 ± 0.044
TrpHis: 0.188 ± 0.024
TrpIle: 0.599 ± 0.053
TrpLys: 0.708 ± 0.056
TrpLeu: 0.682 ± 0.057
TrpMet: 0.31 ± 0.036
TrpAsn: 0.567 ± 0.048
TrpPro: 0.285 ± 0.032
TrpGln: 0.289 ± 0.029
TrpArg: 0.567 ± 0.044
TrpSer: 0.625 ± 0.06
TrpThr: 0.469 ± 0.046
TrpVal: 0.516 ± 0.043
TrpTrp: 0.094 ± 0.02
TrpTyr: 0.307 ± 0.039
TrpXaa: 0.0 ± 0.0
Tyr
TyrAla: 1.96 ± 0.073
TyrCys: 0.303 ± 0.034
TyrAsp: 2.44 ± 0.108
TyrGlu: 2.841 ± 0.107
TyrPhe: 1.412 ± 0.067
TyrGly: 2.379 ± 0.096
TyrHis: 0.567 ± 0.048
TyrIle: 1.87 ± 0.088
TyrLys: 1.314 ± 0.072
TyrLeu: 2.942 ± 0.099
TyrMet: 0.682 ± 0.048
TyrAsn: 1.43 ± 0.072
TyrPro: 1.3 ± 0.07
TyrGln: 1.195 ± 0.063
TyrArg: 2.148 ± 0.101
TyrSer: 3.036 ± 0.107
TyrThr: 1.758 ± 0.093
TyrVal: 1.971 ± 0.076
TyrTrp: 0.477 ± 0.042
TyrTyr: 1.166 ± 0.073
TyrXaa: 0.0 ± 0.0
Xaa
XaaAla: 0.0 ± 0.0
XaaCys: 0.0 ± 0.0
XaaAsp: 0.0 ± 0.0
XaaGlu: 0.0 ± 0.0
XaaPhe: 0.0 ± 0.0
XaaGly: 0.0 ± 0.0
XaaHis: 0.0 ± 0.0
XaaIle: 0.0 ± 0.0
XaaLys: 0.0 ± 0.0
XaaLeu: 0.0 ± 0.0
XaaMet: 0.0 ± 0.0
XaaAsn: 0.0 ± 0.0
XaaPro: 0.0 ± 0.0
XaaGln: 0.0 ± 0.0
XaaArg: 0.0 ± 0.0
XaaSer: 0.0 ± 0.0
XaaThr: 0.0 ± 0.0
XaaVal: 0.0 ± 0.0
XaaTrp: 0.0 ± 0.0
XaaTyr: 0.0 ± 0.0
XaaXaa: 0.0 ± 0.0
Statistics based on 1074 proteins (277001 amino acids)
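To give a feel for the scale behind these numbers, the short calculation below estimates how many dipeptide positions the proteome contains and what a single occurrence is worth in per mille. It assumes overlapping pairs counted within each protein, so a protein of length L contributes L - 1 dipeptides; this counting convention and the helper name are assumptions rather than something stated on this page.

```python
n_proteins = 1074
n_residues = 277001

# Assumption: each protein of length L contributes L - 1 overlapping pairs.
n_positions = n_residues - n_proteins

# Frequency contributed by a single dipeptide occurrence, in per mille.
permille_per_count = 1000.0 / n_positions


def approx_count(value_permille):
    """Rough number of occurrences behind a tabulated per mille value."""
    return round(value_permille * n_positions / 1000.0)


print(n_positions, permille_per_count, approx_count(0.04))
```

Under this assumption, rare dipeptides such as CysCys correspond to only a handful of occurrences in the whole proteome, which is why their relative errors are large.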

Note: The error has been estimated by bootstrapping (×100) at the protein level.
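A protein-level bootstrap of this kind can be sketched as follows: whole proteins are resampled with replacement, the dipeptide frequency is recomputed for each resample, and the spread of the replicates is taken as the error. Using the standard deviation as the reported error, the fixed seed, and the function names are assumptions made for this illustration; the database's exact procedure is not described here.

```python
import random
import statistics
from collections import Counter


def pair_counts(seq):
    """Count overlapping dipeptides in one protein sequence."""
    return Counter(seq[i:i + 2] for i in range(len(seq) - 1))


def bootstrap_error(sequences, dipeptide, n_boot=100, seed=0):
    """Return (mean, standard deviation) of a dipeptide frequency in per mille."""
    rng = random.Random(seed)
    # Count once per protein; resampling then only recombines these counts.
    per_protein = [pair_counts(s) for s in sequences]
    estimates = []
    for _ in range(n_boot):
        # Resample whole proteins with replacement (protein-level bootstrap).
        sample = rng.choices(per_protein, k=len(per_protein))
        total = sum(sum(c.values()) for c in sample)
        hits = sum(c[dipeptide] for c in sample)
        estimates.append(1000.0 * hits / total if total else 0.0)
    return statistics.mean(estimates), statistics.stdev(estimates)
```

Resampling proteins rather than individual dipeptides keeps pairs from the same protein together, so proteins with unusual composition are reflected in the error estimate.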

The above dipeptide statistics (among other statistics for this proteome) can be downloaded from this CSV file
See this proteome in: uniprot_link
Proteome-pI is available under a Creative Commons Attribution-NoDerivs license; for more details see here

Reference: Kozlowski LP. Proteome-pI 2.0: Proteome Isoelectric Point Database Update. Nucleic Acids Res. 2021, doi: 10.1093/nar/gkab944
Contact: Lukasz P. Kozlowski