Amino acid dipeptide frequency for Emiliania huxleyi virus 99B1

Apart from single amino acid frequencies, one can also calculate the so-called amino acid dipeptide frequency.

There are 400 possible dipeptides (441 if Xaa is counted as an additional, 21st amino acid representing all non-standard or unknown amino acids). Thus, if dipeptides occurred in proteins at random with equal probability, each amino acid dipeptide would be present at a frequency of 0.25% (2.5‰). As this is not the case in nature, for better visibility all dipeptides that occur more often than expected are marked in red in the table, and those that are underrepresented are marked in blue.
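The calculation described above can be sketched in Python. This is a minimal illustration, not the database's actual pipeline: the function names, the choice to count overlapping adjacent pairs within each protein, the normalisation by the total number of pairs, and the simple above/below comparison with the uniform expectation are all assumptions made for the example.

```python
from collections import Counter

STANDARD = "ACDEFGHIKLMNPQRSTVWY"  # one-letter codes of the 20 standard amino acids
UNIFORM_PERMILLE = 1000.0 / 400    # 2.5 per mille, i.e. 0.25%


def dipeptide_permille(proteins):
    """Return {dipeptide: frequency in per mille} for a list of sequences.

    Assumption: overlapping adjacent pairs are counted inside each protein,
    pairs never span two proteins, and frequencies are normalised by the
    total number of pairs. The database may normalise differently.
    """
    counts = Counter()
    for seq in proteins:
        # Assumption: anything outside the 20 standard letters becomes X (Xaa).
        seq = "".join(c if c in STANDARD else "X" for c in seq.upper())
        counts.update(seq[i:i + 2] for i in range(len(seq) - 1))
    total = sum(counts.values())
    if total == 0:
        return {}
    return {pair: 1000.0 * n / total for pair, n in counts.items()}


def classify(freq_permille):
    """Label a frequency relative to the uniform expectation (hypothetical rule)."""
    if freq_permille > UNIFORM_PERMILLE:
        return "over"
    if freq_permille < UNIFORM_PERMILLE:
        return "under"
    return "expected"


# Toy input, not real proteome data.
toy = ["MAAK", "AAL"]
freqs = dipeptide_permille(toy)
for pair, value in sorted(freqs.items()):
    print(pair, round(value, 3), classify(value))
```

Under this rule, the AlaAla value from the table (5.308‰) would count as overrepresented and AlaCys (0.982‰) as underrepresented.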

All values are given in per mille (‰) and therefore need to be multiplied by 10⁻³ to obtain fractions.
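As a small illustration of the unit conversion, the helpers below (hypothetical names, not part of any Proteome-pI tool) turn a per mille value into a fraction or a percentage. For example, the AlaAla entry of 5.308‰ corresponds to a fraction of about 0.005308, or about 0.5308%.

```python
def permille_to_fraction(value_permille):
    """Convert a per mille value to a plain fraction (multiply by 10^-3)."""
    return value_permille * 1e-3


def permille_to_percent(value_permille):
    """Convert a per mille value to percent (divide by 10)."""
    return value_permille / 10.0


# AlaAla value taken from the table.
ala_ala = 5.308
print(permille_to_fraction(ala_ala), permille_to_percent(ala_ala))
```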

For more information, see the "Sequence space" article on Wikipedia.

Ala Cys Asp Glu Phe Gly His Ile Lys Leu Met Asn Pro Gln Arg Ser Thr Val Trp Tyr Xaa
Ala
AlaAla: 5.308 ± 0.282
AlaCys: 0.982 ± 0.11
AlaAsp: 3.01 ± 0.173
AlaGlu: 3.1 ± 0.235
AlaPhe: 2.514 ± 0.157
AlaGly: 3.451 ± 0.208
AlaHis: 1.694 ± 0.151
AlaIle: 4.947 ± 0.257
AlaLys: 3.524 ± 0.215
AlaLeu: 5.083 ± 0.185
AlaMet: 1.874 ± 0.129
AlaAsn: 3.298 ± 0.203
AlaPro: 3.19 ± 0.226
AlaGln: 2.055 ± 0.159
AlaArg: 3.343 ± 0.207
AlaSer: 4.992 ± 0.24
AlaThr: 4.109 ± 0.212
AlaVal: 4.443 ± 0.221
AlaTrp: 0.775 ± 0.092
AlaTyr: 2.514 ± 0.163
AlaXaa: 0.0 ± 0.0
Cys
CysAla: 1.036 ± 0.083
CysCys: 0.487 ± 0.089
CysAsp: 0.991 ± 0.123
CysGlu: 1.0 ± 0.104
CysPhe: 0.631 ± 0.083
CysGly: 1.325 ± 0.122
CysHis: 0.36 ± 0.058
CysIle: 1.604 ± 0.134
CysLys: 1.199 ± 0.111
CysLeu: 0.973 ± 0.112
CysMet: 0.748 ± 0.095
CysAsn: 1.316 ± 0.129
CysPro: 0.928 ± 0.098
CysGln: 0.406 ± 0.07
CysArg: 0.73 ± 0.109
CysSer: 1.235 ± 0.13
CysThr: 1.37 ± 0.126
CysVal: 1.054 ± 0.106
CysTrp: 0.117 ± 0.039
CysTyr: 0.631 ± 0.085
CysXaa: 0.0 ± 0.0
Asp
AspAla: 4.217 ± 0.189
AspCys: 0.874 ± 0.099
AspAsp: 4.488 ± 0.253
AspGlu: 4.317 ± 0.27
AspPhe: 2.28 ± 0.139
AspGly: 3.334 ± 0.174
AspHis: 1.199 ± 0.12
AspIle: 4.596 ± 0.203
AspLys: 2.92 ± 0.18
AspLeu: 3.839 ± 0.186
AspMet: 2.073 ± 0.138
AspAsn: 3.082 ± 0.198
AspPro: 2.703 ± 0.18
AspGln: 1.595 ± 0.125
AspArg: 2.397 ± 0.154
AspSer: 3.605 ± 0.177
AspThr: 4.289 ± 0.197
AspVal: 4.118 ± 0.191
AspTrp: 0.622 ± 0.078
AspTyr: 2.019 ± 0.165
AspXaa: 0.0 ± 0.0
Glu
GluAla: 2.478 ± 0.173
GluCys: 1.153 ± 0.107
GluAsp: 2.586 ± 0.199
GluGlu: 3.028 ± 0.207
GluPhe: 2.253 ± 0.13
GluGly: 1.919 ± 0.146
GluHis: 1.532 ± 0.139
GluIle: 3.812 ± 0.189
GluLys: 3.028 ± 0.195
GluLeu: 5.01 ± 0.222
GluMet: 1.874 ± 0.137
GluAsn: 3.181 ± 0.17
GluPro: 2.037 ± 0.222
GluGln: 2.298 ± 0.158
GluArg: 2.415 ± 0.154
GluSer: 2.965 ± 0.168
GluThr: 3.551 ± 0.187
GluVal: 2.767 ± 0.173
GluTrp: 0.676 ± 0.072
GluTyr: 2.902 ± 0.169
GluXaa: 0.0 ± 0.0
Phe
PheAla: 2.568 ± 0.161
PheCys: 0.82 ± 0.116
PheAsp: 2.586 ± 0.165
PheGlu: 2.037 ± 0.136
PhePhe: 1.406 ± 0.131
PheGly: 2.469 ± 0.159
PheHis: 0.955 ± 0.104
PheIle: 3.163 ± 0.187
PheLys: 1.919 ± 0.135
PheLeu: 2.839 ± 0.171
PheMet: 1.523 ± 0.124
PheAsn: 2.289 ± 0.158
PhePro: 1.946 ± 0.152
PheGln: 0.739 ± 0.067
PheArg: 1.496 ± 0.112
PheSer: 2.812 ± 0.195
PheThr: 2.649 ± 0.144
PheVal: 2.821 ± 0.174
PheTrp: 0.406 ± 0.066
PheTyr: 1.523 ± 0.1
PheXaa: 0.0 ± 0.0
Gly
GlyAla: 3.433 ± 0.203
GlyCys: 0.928 ± 0.111
GlyAsp: 3.533 ± 0.19
GlyGlu: 2.397 ± 0.173
GlyPhe: 2.136 ± 0.142
GlyGly: 3.289 ± 0.288
GlyHis: 1.199 ± 0.091
GlyIle: 4.317 ± 0.184
GlyLys: 3.352 ± 0.184
GlyLeu: 3.704 ± 0.22
GlyMet: 1.541 ± 0.115
GlyAsn: 3.073 ± 0.22
GlyPro: 1.685 ± 0.144
GlyGln: 1.46 ± 0.112
GlyArg: 2.388 ± 0.151
GlySer: 3.65 ± 0.18
GlyThr: 3.965 ± 0.223
GlyVal: 3.578 ± 0.213
GlyTrp: 0.658 ± 0.082
GlyTyr: 2.469 ± 0.184
GlyXaa: 0.0 ± 0.0
His
HisAla: 2.073 ± 0.124
HisCys: 0.505 ± 0.071
HisAsp: 1.46 ± 0.129
HisGlu: 1.415 ± 0.131
HisPhe: 0.838 ± 0.089
HisGly: 1.577 ± 0.126
HisHis: 0.802 ± 0.084
HisIle: 2.118 ± 0.149
HisLys: 1.46 ± 0.123
HisLeu: 1.811 ± 0.129
HisMet: 0.901 ± 0.086
HisAsn: 1.334 ± 0.106
HisPro: 1.433 ± 0.117
HisGln: 0.811 ± 0.087
HisArg: 1.108 ± 0.111
HisSer: 1.586 ± 0.132
HisThr: 1.901 ± 0.161
HisVal: 1.928 ± 0.144
HisTrp: 0.198 ± 0.043
HisTyr: 1.072 ± 0.104
HisXaa: 0.0 ± 0.0
Ile
IleAla: 5.236 ± 0.194
IleCys: 1.406 ± 0.116
IleAsp: 4.344 ± 0.211
IleGlu: 3.767 ± 0.175
IlePhe: 2.451 ± 0.178
IleGly: 4.47 ± 0.248
IleHis: 2.01 ± 0.171
IleIle: 5.605 ± 0.249
IleLys: 4.226 ± 0.252
IleLeu: 5.668 ± 0.201
IleMet: 1.956 ± 0.128
IleAsn: 3.866 ± 0.215
IlePro: 3.316 ± 0.165
IleGln: 2.316 ± 0.136
IleArg: 3.46 ± 0.191
IleSer: 5.299 ± 0.261
IleThr: 5.11 ± 0.244
IleVal: 5.272 ± 0.254
IleTrp: 0.667 ± 0.081
IleTyr: 2.956 ± 0.163
IleXaa: 0.0 ± 0.0
Lys
LysAla: 2.577 ± 0.215
LysCys: 1.171 ± 0.127
LysAsp: 2.73 ± 0.174
LysGlu: 2.839 ± 0.166
LysPhe: 2.406 ± 0.156
LysGly: 1.865 ± 0.136
LysHis: 2.046 ± 0.136
LysIle: 3.749 ± 0.238
LysLys: 5.2 ± 0.337
LysLeu: 4.677 ± 0.235
LysMet: 1.919 ± 0.155
LysAsn: 3.767 ± 0.237
LysPro: 2.586 ± 0.164
LysGln: 2.208 ± 0.165
LysArg: 3.686 ± 0.248
LysSer: 3.641 ± 0.25
LysThr: 4.217 ± 0.233
LysVal: 2.974 ± 0.187
LysTrp: 0.703 ± 0.094
LysTyr: 3.433 ± 0.19
LysXaa: 0.0 ± 0.0
Leu
LeuAla: 5.046 ± 0.251
LeuCys: 1.451 ± 0.112
LeuAsp: 3.74 ± 0.204
LeuGlu: 3.433 ± 0.177
LeuPhe: 3.542 ± 0.209
LeuGly: 3.668 ± 0.219
LeuHis: 2.172 ± 0.143
LeuIle: 5.236 ± 0.224
LeuLys: 4.308 ± 0.213
LeuLeu: 6.335 ± 0.283
LeuMet: 2.271 ± 0.137
LeuAsn: 3.902 ± 0.202
LeuPro: 4.28 ± 0.218
LeuGln: 2.361 ± 0.135
LeuArg: 3.569 ± 0.229
LeuSer: 5.876 ± 0.228
LeuThr: 5.046 ± 0.218
LeuVal: 4.208 ± 0.178
LeuTrp: 0.811 ± 0.086
LeuTyr: 3.46 ± 0.196
LeuXaa: 0.0 ± 0.0
Met
MetAla: 2.001 ± 0.156
MetCys: 0.451 ± 0.066
MetAsp: 1.892 ± 0.157
MetGlu: 1.739 ± 0.126
MetPhe: 1.586 ± 0.125
MetGly: 1.361 ± 0.113
MetHis: 0.892 ± 0.076
MetIle: 1.775 ± 0.137
MetLys: 1.775 ± 0.129
MetLeu: 2.514 ± 0.168
MetMet: 1.126 ± 0.115
MetAsn: 1.676 ± 0.113
MetPro: 1.514 ± 0.107
MetGln: 1.09 ± 0.111
MetArg: 1.352 ± 0.13
MetSer: 2.676 ± 0.158
MetThr: 2.163 ± 0.129
MetVal: 1.55 ± 0.12
MetTrp: 0.369 ± 0.069
MetTyr: 1.712 ± 0.13
MetXaa: 0.0 ± 0.0
Asn
AsnAla: 4.01 ± 0.219
AsnCys: 0.811 ± 0.087
AsnAsp: 3.704 ± 0.212
AsnGlu: 2.956 ± 0.167
AsnPhe: 1.613 ± 0.116
AsnGly: 3.848 ± 0.202
AsnHis: 1.235 ± 0.111
AsnIle: 4.289 ± 0.209
AsnLys: 3.514 ± 0.193
AsnLeu: 3.596 ± 0.175
AsnMet: 1.892 ± 0.127
AsnAsn: 3.695 ± 0.209
AsnPro: 2.794 ± 0.13
AsnGln: 1.379 ± 0.092
AsnArg: 2.64 ± 0.159
AsnSer: 3.587 ± 0.185
AsnThr: 4.271 ± 0.194
AsnVal: 3.965 ± 0.188
AsnTrp: 0.541 ± 0.064
AsnTyr: 1.919 ± 0.173
AsnXaa: 0.0 ± 0.0
Pro
ProAla: 2.694 ± 0.19
ProCys: 0.73 ± 0.072
ProAsp: 3.055 ± 0.213
ProGlu: 2.424 ± 0.161
ProPhe: 1.82 ± 0.13
ProGly: 2.658 ± 0.181
ProHis: 1.144 ± 0.115
ProIle: 3.181 ± 0.195
ProLys: 2.226 ± 0.171
ProLeu: 3.442 ± 0.229
ProMet: 1.162 ± 0.099
ProAsn: 2.316 ± 0.184
ProPro: 8.948 ± 2.066
ProGln: 1.415 ± 0.1
ProArg: 2.019 ± 0.136
ProSer: 5.876 ± 0.711
ProThr: 3.415 ± 0.184
ProVal: 3.884 ± 0.286
ProTrp: 0.424 ± 0.064
ProTyr: 1.901 ± 0.137
ProXaa: 0.0 ± 0.0
Gln
GlnAla: 1.721 ± 0.127
GlnCys: 0.811 ± 0.148
GlnAsp: 1.235 ± 0.113
GlnGlu: 1.379 ± 0.126
GlnPhe: 1.424 ± 0.129
GlnGly: 1.108 ± 0.095
GlnHis: 0.973 ± 0.084
GlnIle: 2.136 ± 0.132
GlnLys: 1.974 ± 0.164
GlnLeu: 2.749 ± 0.156
GlnMet: 1.054 ± 0.095
GlnAsn: 1.631 ± 0.158
GlnPro: 1.478 ± 0.139
GlnGln: 1.334 ± 0.115
GlnArg: 1.802 ± 0.125
GlnSer: 2.19 ± 0.136
GlnThr: 2.181 ± 0.147
GlnVal: 1.956 ± 0.127
GlnTrp: 0.378 ± 0.059
GlnTyr: 1.388 ± 0.106
GlnXaa: 0.0 ± 0.0
Arg
ArgAla: 3.154 ± 0.212
ArgCys: 0.748 ± 0.091
ArgAsp: 2.803 ± 0.171
ArgGlu: 2.415 ± 0.157
ArgPhe: 1.829 ± 0.125
ArgGly: 2.28 ± 0.148
ArgHis: 1.334 ± 0.101
ArgIle: 3.469 ± 0.196
ArgLys: 3.046 ± 0.242
ArgLeu: 3.298 ± 0.173
ArgMet: 1.64 ± 0.135
ArgAsn: 2.92 ± 0.174
ArgPro: 1.892 ± 0.145
ArgGln: 1.974 ± 0.153
ArgArg: 3.614 ± 0.326
ArgSer: 3.244 ± 0.204
ArgThr: 3.037 ± 0.176
ArgVal: 3.136 ± 0.183
ArgTrp: 0.442 ± 0.063
ArgTyr: 1.802 ± 0.108
ArgXaa: 0.0 ± 0.0
Ser
SerAla: 4.623 ± 0.232
SerCys: 1.316 ± 0.139
SerAsp: 4.488 ± 0.213
SerGlu: 3.614 ± 0.168
SerPhe: 2.911 ± 0.154
SerGly: 4.064 ± 0.214
SerHis: 1.956 ± 0.132
SerIle: 4.965 ± 0.19
SerLys: 3.992 ± 0.226
SerLeu: 4.947 ± 0.218
SerMet: 2.028 ± 0.132
SerAsn: 3.731 ± 0.193
SerPro: 4.614 ± 0.67
SerGln: 1.983 ± 0.132
SerArg: 3.487 ± 0.191
SerSer: 6.155 ± 0.276
SerThr: 5.299 ± 0.264
SerVal: 4.821 ± 0.235
SerTrp: 0.847 ± 0.096
SerTyr: 3.109 ± 0.176
SerXaa: 0.0 ± 0.0
Thr
ThrAla: 4.542 ± 0.243
ThrCys: 1.379 ± 0.122
ThrAsp: 4.353 ± 0.257
ThrGlu: 3.83 ± 0.235
ThrPhe: 2.649 ± 0.162
ThrGly: 3.812 ± 0.256
ThrHis: 1.883 ± 0.142
ThrIle: 5.037 ± 0.268
ThrLys: 4.019 ± 0.249
ThrLeu: 5.254 ± 0.233
ThrMet: 1.82 ± 0.135
ThrAsn: 3.605 ± 0.194
ThrPro: 3.857 ± 0.184
ThrGln: 2.19 ± 0.136
ThrArg: 3.406 ± 0.232
ThrSer: 5.137 ± 0.226
ThrThr: 5.38 ± 0.239
ThrVal: 4.443 ± 0.21
ThrTrp: 0.757 ± 0.083
ThrTyr: 2.821 ± 0.157
ThrXaa: 0.0 ± 0.0
Val
ValAla: 4.19 ± 0.237
ValCys: 1.235 ± 0.115
ValAsp: 3.911 ± 0.164
ValGlu: 3.208 ± 0.15
ValPhe: 2.523 ± 0.149
ValGly: 3.388 ± 0.21
ValHis: 1.559 ± 0.131
ValIle: 5.119 ± 0.213
ValLys: 3.433 ± 0.207
ValLeu: 5.182 ± 0.22
ValMet: 2.019 ± 0.14
ValAsn: 3.668 ± 0.183
ValPro: 3.65 ± 0.21
ValGln: 1.865 ± 0.16
ValArg: 2.902 ± 0.187
ValSer: 4.677 ± 0.212
ValThr: 4.335 ± 0.191
ValVal: 4.596 ± 0.201
ValTrp: 0.712 ± 0.089
ValTyr: 2.884 ± 0.193
ValXaa: 0.0 ± 0.0
Trp
TrpAla: 0.604 ± 0.085
TrpCys: 0.216 ± 0.042
TrpAsp: 0.658 ± 0.082
TrpGlu: 0.433 ± 0.071
TrpPhe: 0.514 ± 0.072
TrpGly: 0.631 ± 0.087
TrpHis: 0.36 ± 0.055
TrpIle: 0.685 ± 0.086
TrpLys: 0.82 ± 0.086
TrpLeu: 0.838 ± 0.084
TrpMet: 0.342 ± 0.063
TrpAsn: 0.667 ± 0.081
TrpPro: 0.342 ± 0.058
TrpGln: 0.324 ± 0.051
TrpArg: 0.451 ± 0.061
TrpSer: 0.784 ± 0.086
TrpThr: 0.721 ± 0.085
TrpVal: 0.586 ± 0.075
TrpTrp: 0.234 ± 0.052
TrpTyr: 0.541 ± 0.075
TrpXaa: 0.0 ± 0.0
Tyr
TyrAla: 2.694 ± 0.163
TyrCys: 0.667 ± 0.098
TyrAsp: 3.001 ± 0.166
TyrGlu: 2.136 ± 0.143
TyrPhe: 1.64 ± 0.134
TyrGly: 2.235 ± 0.141
TyrHis: 1.027 ± 0.109
TyrIle: 3.596 ± 0.149
TyrLys: 2.46 ± 0.189
TyrLeu: 2.956 ± 0.174
TyrMet: 1.442 ± 0.102
TyrAsn: 3.244 ± 0.177
TyrPro: 1.541 ± 0.125
TyrGln: 1.099 ± 0.092
TyrArg: 1.811 ± 0.148
TyrSer: 2.965 ± 0.14
TyrThr: 3.163 ± 0.152
TyrVal: 2.965 ± 0.189
TyrTrp: 0.415 ± 0.06
TyrTyr: 1.892 ± 0.129
TyrXaa: 0.0 ± 0.0
Xaa
XaaAla: 0.0 ± 0.0
XaaCys: 0.0 ± 0.0
XaaAsp: 0.0 ± 0.0
XaaGlu: 0.0 ± 0.0
XaaPhe: 0.0 ± 0.0
XaaGly: 0.0 ± 0.0
XaaHis: 0.0 ± 0.0
XaaIle: 0.0 ± 0.0
XaaLys: 0.0 ± 0.0
XaaLeu: 0.0 ± 0.0
XaaMet: 0.0 ± 0.0
XaaAsn: 0.0 ± 0.0
XaaPro: 0.0 ± 0.0
XaaGln: 0.0 ± 0.0
XaaArg: 0.0 ± 0.0
XaaSer: 0.0 ± 0.0
XaaThr: 0.0 ± 0.0
XaaVal: 0.0 ± 0.0
XaaTrp: 0.0 ± 0.0
XaaTyr: 0.0 ± 0.0
XaaXaa: 0.0 ± 0.0
Statistics based on 444 proteins (110970 amino acids)

Note: The error was estimated by bootstrapping (×100) at the protein level.
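A protein-level bootstrap of this kind can be sketched as follows. This is only an illustration under stated assumptions: whole proteins are resampled with replacement 100 times, the dipeptide frequency is recomputed for each resample, and the standard deviation of the replicates is taken as the error. The function names, the fixed random seed, the normalisation by the total number of pairs, and the use of the standard deviation are choices made for the example and are not confirmed by the source.

```python
import random
import statistics
from collections import Counter


def pair_permille(proteins, pair):
    """Frequency of one dipeptide, in per mille, over a list of sequences."""
    counts = Counter()
    for seq in proteins:
        # Overlapping adjacent pairs; pairs never span two proteins.
        counts.update(seq[i:i + 2] for i in range(len(seq) - 1))
    total = sum(counts.values())
    return 1000.0 * counts[pair] / total if total else 0.0


def bootstrap_error(proteins, pair, n_boot=100, seed=0):
    """Return (mean, standard deviation) of the frequency over bootstrap resamples.

    Proteins, not residues, are the resampling unit. n_boot must be at least 2.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    replicates = []
    for _ in range(n_boot):
        # Draw as many proteins as the original set, with replacement.
        sample = rng.choices(proteins, k=len(proteins))
        replicates.append(pair_permille(sample, pair))
    return statistics.mean(replicates), statistics.stdev(replicates)


# Toy input, not real proteome data.
toy = ["MAAK", "AAL", "MKKL"]
mean, err = bootstrap_error(toy, "AA")
print(f"AA: {mean:.3f} ± {err:.3f}")
```

Resampling whole proteins rather than individual residues keeps each sequence intact, so the estimate reflects variation between proteins.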

The dipeptide statistics above (among other statistics for this proteome) can be downloaded from this CSV file
See this proteome in: uniprot_link
Proteome-pI is available under Creative Commons Attribution-NoDerivs license, for more details see here

Reference: Kozlowski LP. Proteome-pI 2.0: Proteome Isoelectric Point Database Update. Nucleic Acids Res. 2021, doi: 10.1093/nar/gkab944
Contact: Lukasz P. Kozlowski