Amino acid dipeptide frequency for Variola virus

Apart from single amino acid frequencies, one can also calculate the so-called amino acid dipeptide frequency, i.e. how often each ordered pair of adjacent amino acids occurs.
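As an illustration of how such a frequency could be computed, here is a minimal sketch. It assumes sequences given in one-letter code (the table below uses three-letter codes), counts overlapping pairs within each protein, and normalizes by the total number of pairs counted. The function name and the toy sequences are hypothetical, and the normalization used by the database itself may differ.

```python
from collections import Counter

def dipeptide_per_mille(sequences):
    """Return {dipeptide: frequency in per mille} for one-letter protein sequences."""
    counts = Counter()
    for seq in sequences:
        # Overlapping windows of length 2, taken within each protein only,
        # so no dipeptide spans the boundary between two proteins.
        for i in range(len(seq) - 1):
            counts[seq[i:i + 2]] += 1
    total = sum(counts.values())
    if total == 0:
        return {}
    # Assumption: normalize by the number of dipeptides counted.
    return {pair: 1000.0 * n / total for pair, n in counts.items()}

# Hypothetical toy sequences, not taken from the Variola virus proteome.
toy = ["MKLLA", "LLK"]
freqs = dipeptide_per_mille(toy)
```

With the toy input, "LL" occurs twice among six dipeptides in total, and the pair "AL" is never counted because it would straddle two proteins.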

There are 400 possible dipeptides (441 if Xaa is counted as an additional 21st amino acid representing all non-standard or unknown amino acids). Thus, if dipeptides occurred randomly in proteins, each amino acid dipeptide would be present at a frequency of 0.25% (2.5‰). As this is not the case in nature, for better visibility all dipeptides that are more frequent than expected are marked in red in the table, and those that are underrepresented are marked in blue.
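The expected value and the over/under comparison can be sketched as follows. The three example values are taken from the table on this page; the function name is hypothetical, and a plain comparison against the uniform expectation is an assumption about how the colouring is decided.

```python
STANDARD_AMINO_ACIDS = 20

# Uniform expectation: 1 / (20 * 20) = 0.25 % = 2.5 per mille.
expected = 1000.0 / (STANDARD_AMINO_ACIDS * STANDARD_AMINO_ACIDS)

def classify(freq_per_mille, expected=expected):
    """Label a dipeptide frequency relative to the uniform expectation."""
    if freq_per_mille > expected:
        return "over"    # would be marked in red
    if freq_per_mille < expected:
        return "under"   # would be marked in blue
    return "as expected"

# Values in per mille, copied from the table on this page.
examples = {"LeuLeu": 9.024, "AlaAla": 2.074, "TrpTrp": 0.0}
labels = {name: classify(value) for name, value in examples.items()}
```

Under this assumption LeuLeu is overrepresented, while AlaAla and TrpTrp are underrepresented.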

All values are given in per mille (‰) and therefore need to be multiplied by 10⁻³ to obtain fractions; for example, 9.024‰ (LeuLeu) corresponds to a fraction of 0.009024.

For more information, see the sequence space article on Wikipedia.

Ala Cys Asp Glu Phe Gly His Ile Lys Leu Met Asn Pro Gln Arg Ser Thr Val Trp Tyr Xaa
Ala
2.074AlaAla: 2.074 ± 0.21
1.009AlaCys: 1.009 ± 0.164
1.719AlaAsp: 1.719 ± 0.182
1.924AlaGlu: 1.924 ± 0.231
1.644AlaPhe: 1.644 ± 0.19
1.439AlaGly: 1.439 ± 0.163
0.486AlaHis: 0.486 ± 0.1
3.774AlaIle: 3.774 ± 0.239
2.896AlaLys: 2.896 ± 0.223
3.27AlaLeu: 3.27 ± 0.284
1.158AlaMet: 1.158 ± 0.143
2.429AlaAsn: 2.429 ± 0.222
1.028AlaPro: 1.028 ± 0.139
0.785AlaGln: 0.785 ± 0.115
1.495AlaArg: 1.495 ± 0.194
3.307AlaSer: 3.307 ± 0.283
2.504AlaThr: 2.504 ± 0.267
2.803AlaVal: 2.803 ± 0.199
0.243AlaTrp: 0.243 ± 0.061
1.607AlaTyr: 1.607 ± 0.164
0.0AlaXaa: 0.0 ± 0.0
Cys
0.841CysAla: 0.841 ± 0.106
0.579CysCys: 0.579 ± 0.115
1.271CysAsp: 1.271 ± 0.161
0.99CysGlu: 0.99 ± 0.165
0.841CysPhe: 0.841 ± 0.136
1.14CysGly: 1.14 ± 0.137
0.374CysHis: 0.374 ± 0.095
1.962CysIle: 1.962 ± 0.22
1.308CysLys: 1.308 ± 0.167
1.775CysLeu: 1.775 ± 0.188
0.598CysMet: 0.598 ± 0.11
1.457CysAsn: 1.457 ± 0.203
0.673CysPro: 0.673 ± 0.107
0.486CysGln: 0.486 ± 0.116
0.878CysArg: 0.878 ± 0.109
1.682CysSer: 1.682 ± 0.197
1.383CysThr: 1.383 ± 0.196
1.383CysVal: 1.383 ± 0.213
0.224CysTrp: 0.224 ± 0.071
1.327CysTyr: 1.327 ± 0.163
0.0CysXaa: 0.0 ± 0.0
Asp
2.634AspAla: 2.634 ± 0.207
0.972AspCys: 0.972 ± 0.159
5.474AspAsp: 5.474 ± 0.416
4.092AspGlu: 4.092 ± 0.218
2.765AspPhe: 2.765 ± 0.233
2.84AspGly: 2.84 ± 0.226
1.084AspHis: 1.084 ± 0.159
7.586AspIle: 7.586 ± 0.41
5.045AspLys: 5.045 ± 0.321
4.839AspLeu: 4.839 ± 0.312
1.588AspMet: 1.588 ± 0.187
4.821AspAsn: 4.821 ± 0.302
1.644AspPro: 1.644 ± 0.163
1.196AspGln: 1.196 ± 0.141
2.336AspArg: 2.336 ± 0.248
4.409AspSer: 4.409 ± 0.296
3.643AspThr: 3.643 ± 0.312
4.447AspVal: 4.447 ± 0.26
0.467AspTrp: 0.467 ± 0.103
3.513AspTyr: 3.513 ± 0.286
0.0AspXaa: 0.0 ± 0.0
Glu
2.093GluAla: 2.093 ± 0.216
1.065GluCys: 1.065 ± 0.122
3.419GluAsp: 3.419 ± 0.266
3.158GluGlu: 3.158 ± 0.294
2.56GluPhe: 2.56 ± 0.188
1.607GluGly: 1.607 ± 0.203
1.196GluHis: 1.196 ± 0.17
4.802GluIle: 4.802 ± 0.316
3.531GluLys: 3.531 ± 0.274
5.25GluLeu: 5.25 ± 0.402
1.345GluMet: 1.345 ± 0.18
3.382GluAsn: 3.382 ± 0.272
1.738GluPro: 1.738 ± 0.19
1.401GluGln: 1.401 ± 0.173
2.448GluArg: 2.448 ± 0.286
3.774GluSer: 3.774 ± 0.265
3.569GluThr: 3.569 ± 0.227
2.578GluVal: 2.578 ± 0.213
0.523GluTrp: 0.523 ± 0.078
3.606GluTyr: 3.606 ± 0.253
0.0GluXaa: 0.0 ± 0.0
Phe
1.569PheAla: 1.569 ± 0.211
0.953PheCys: 0.953 ± 0.125
3.139PheAsp: 3.139 ± 0.252
2.093PheGlu: 2.093 ± 0.171
2.186PhePhe: 2.186 ± 0.223
1.943PheGly: 1.943 ± 0.19
0.729PheHis: 0.729 ± 0.101
4.652PheIle: 4.652 ± 0.265
3.681PheLys: 3.681 ± 0.282
4.017PheLeu: 4.017 ± 0.369
1.439PheMet: 1.439 ± 0.153
3.662PheAsn: 3.662 ± 0.239
1.327PhePro: 1.327 ± 0.162
0.934PheGln: 0.934 ± 0.13
1.887PheArg: 1.887 ± 0.187
3.886PheSer: 3.886 ± 0.272
2.971PheThr: 2.971 ± 0.256
2.989PheVal: 2.989 ± 0.239
0.374PheTrp: 0.374 ± 0.084
2.261PheTyr: 2.261 ± 0.241
0.0PheXaa: 0.0 ± 0.0
Gly
1.85GlyAla: 1.85 ± 0.19
0.859GlyCys: 0.859 ± 0.133
2.522GlyAsp: 2.522 ± 0.218
2.13GlyGlu: 2.13 ± 0.206
1.868GlyPhe: 1.868 ± 0.191
1.981GlyGly: 1.981 ± 0.184
0.859GlyHis: 0.859 ± 0.13
3.662GlyIle: 3.662 ± 0.285
2.989GlyLys: 2.989 ± 0.259
3.046GlyLeu: 3.046 ± 0.218
0.859GlyMet: 0.859 ± 0.162
3.027GlyAsn: 3.027 ± 0.212
0.878GlyPro: 0.878 ± 0.124
0.71GlyGln: 0.71 ± 0.112
1.831GlyArg: 1.831 ± 0.173
2.915GlySer: 2.915 ± 0.249
2.317GlyThr: 2.317 ± 0.29
2.709GlyVal: 2.709 ± 0.26
0.206GlyTrp: 0.206 ± 0.064
2.261GlyTyr: 2.261 ± 0.239
0.0GlyXaa: 0.0 ± 0.0
His
0.822HisAla: 0.822 ± 0.118
0.598HisCys: 0.598 ± 0.123
1.046HisAsp: 1.046 ± 0.143
0.878HisGlu: 0.878 ± 0.127
0.897HisPhe: 0.897 ± 0.135
1.121HisGly: 1.121 ± 0.161
0.598HisHis: 0.598 ± 0.116
2.448HisIle: 2.448 ± 0.232
1.327HisLys: 1.327 ± 0.134
1.999HisLeu: 1.999 ± 0.175
0.579HisMet: 0.579 ± 0.114
1.327HisAsn: 1.327 ± 0.168
0.785HisPro: 0.785 ± 0.097
0.523HisGln: 0.523 ± 0.094
0.916HisArg: 0.916 ± 0.138
1.308HisSer: 1.308 ± 0.154
1.289HisThr: 1.289 ± 0.163
1.364HisVal: 1.364 ± 0.193
0.224HisTrp: 0.224 ± 0.066
0.916HisTyr: 0.916 ± 0.161
0.0HisXaa: 0.0 ± 0.0
Ile
3.494IleAla: 3.494 ± 0.226
1.607IleCys: 1.607 ± 0.173
7.081IleAsp: 7.081 ± 0.356
4.895IleGlu: 4.895 ± 0.31
4.092IlePhe: 4.092 ± 0.293
3.513IleGly: 3.513 ± 0.253
2.037IleHis: 2.037 ± 0.219
8.146IleIle: 8.146 ± 0.463
7.306IleLys: 7.306 ± 0.389
7.791IleLeu: 7.791 ± 0.441
2.167IleMet: 2.167 ± 0.2
7.791IleAsn: 7.791 ± 0.355
3.401IlePro: 3.401 ± 0.239
2.186IleGln: 2.186 ± 0.243
3.849IleArg: 3.849 ± 0.299
8.314IleSer: 8.314 ± 0.409
5.157IleThr: 5.157 ± 0.336
5.605IleVal: 5.605 ± 0.315
0.523IleTrp: 0.523 ± 0.105
4.503IleTyr: 4.503 ± 0.246
0.0IleXaa: 0.0 ± 0.0
Lys
2.037LysAla: 2.037 ± 0.198
1.831LysCys: 1.831 ± 0.188
5.25LysAsp: 5.25 ± 0.303
3.924LysGlu: 3.924 ± 0.265
3.475LysPhe: 3.475 ± 0.304
2.205LysGly: 2.205 ± 0.161
1.794LysHis: 1.794 ± 0.171
6.577LysIle: 6.577 ± 0.344
5.979LysLys: 5.979 ± 0.388
6.876LysLeu: 6.876 ± 0.362
2.037LysMet: 2.037 ± 0.195
5.138LysAsn: 5.138 ± 0.304
2.223LysPro: 2.223 ± 0.183
2.074LysGln: 2.074 ± 0.185
3.756LysArg: 3.756 ± 0.247
5.942LysSer: 5.942 ± 0.337
4.391LysThr: 4.391 ± 0.267
4.223LysVal: 4.223 ± 0.282
0.598LysTrp: 0.598 ± 0.122
4.559LysTyr: 4.559 ± 0.286
0.0LysXaa: 0.0 ± 0.0
Leu
3.288LeuAla: 3.288 ± 0.264
1.607LeuCys: 1.607 ± 0.196
5.717LeuAsp: 5.717 ± 0.335
5.007LeuGlu: 5.007 ± 0.345
4.783LeuPhe: 4.783 ± 0.324
3.102LeuGly: 3.102 ± 0.283
1.906LeuHis: 1.906 ± 0.245
6.39LeuIle: 6.39 ± 0.381
6.446LeuLys: 6.446 ± 0.332
9.024LeuLeu: 9.024 ± 0.51
2.578LeuMet: 2.578 ± 0.236
5.624LeuAsn: 5.624 ± 0.401
3.344LeuPro: 3.344 ± 0.296
2.055LeuGln: 2.055 ± 0.209
3.344LeuArg: 3.344 ± 0.28
7.567LeuSer: 7.567 ± 0.367
6.035LeuThr: 6.035 ± 0.335
5.437LeuVal: 5.437 ± 0.358
0.523LeuTrp: 0.523 ± 0.114
4.69LeuTyr: 4.69 ± 0.256
0.0LeuXaa: 0.0 ± 0.0
Met
1.551MetAla: 1.551 ± 0.166
0.673MetCys: 0.673 ± 0.105
2.037MetAsp: 2.037 ± 0.16
1.569MetGlu: 1.569 ± 0.163
1.252MetPhe: 1.252 ± 0.144
0.841MetGly: 0.841 ± 0.136
0.43MetHis: 0.43 ± 0.079
2.466MetIle: 2.466 ± 0.208
1.719MetLys: 1.719 ± 0.19
2.597MetLeu: 2.597 ± 0.231
0.953MetMet: 0.953 ± 0.145
1.794MetAsn: 1.794 ± 0.18
0.897MetPro: 0.897 ± 0.12
0.467MetGln: 0.467 ± 0.078
1.177MetArg: 1.177 ± 0.15
2.279MetSer: 2.279 ± 0.182
1.663MetThr: 1.663 ± 0.158
1.513MetVal: 1.513 ± 0.158
0.149MetTrp: 0.149 ± 0.043
1.663MetTyr: 1.663 ± 0.174
0.0MetXaa: 0.0 ± 0.0
Asn
2.709AsnAla: 2.709 ± 0.216
1.177AsnCys: 1.177 ± 0.151
4.764AsnAsp: 4.764 ± 0.342
3.625AsnGlu: 3.625 ± 0.279
2.634AsnPhe: 2.634 ± 0.238
3.176AsnGly: 3.176 ± 0.316
1.626AsnHis: 1.626 ± 0.199
7.847AsnIle: 7.847 ± 0.398
6.11AsnLys: 6.11 ± 0.294
4.802AsnLeu: 4.802 ± 0.308
2.167AsnMet: 2.167 ± 0.213
5.867AsnAsn: 5.867 ± 0.356
2.41AsnPro: 2.41 ± 0.192
1.42AsnGln: 1.42 ± 0.146
3.027AsnArg: 3.027 ± 0.226
4.353AsnSer: 4.353 ± 0.229
4.615AsnThr: 4.615 ± 0.323
4.559AsnVal: 4.559 ± 0.254
0.392AsnTrp: 0.392 ± 0.089
3.288AsnTyr: 3.288 ± 0.247
0.0AsnXaa: 0.0 ± 0.0
Pro
1.345ProAla: 1.345 ± 0.166
0.598ProCys: 0.598 ± 0.108
1.794ProAsp: 1.794 ± 0.176
2.205ProGlu: 2.205 ± 0.151
1.551ProPhe: 1.551 ± 0.195
1.401ProGly: 1.401 ± 0.195
0.71ProHis: 0.71 ± 0.137
3.102ProIle: 3.102 ± 0.27
2.037ProLys: 2.037 ± 0.203
2.859ProLeu: 2.859 ± 0.241
0.953ProMet: 0.953 ± 0.098
2.242ProAsn: 2.242 ± 0.195
1.719ProPro: 1.719 ± 0.271
0.654ProGln: 0.654 ± 0.117
1.569ProArg: 1.569 ± 0.211
2.56ProSer: 2.56 ± 0.216
2.336ProThr: 2.336 ± 0.194
2.093ProVal: 2.093 ± 0.191
0.318ProTrp: 0.318 ± 0.063
1.532ProTyr: 1.532 ± 0.171
0.0ProXaa: 0.0 ± 0.0
Gln
0.561GlnAla: 0.561 ± 0.105
0.486GlnCys: 0.486 ± 0.089
1.289GlnAsp: 1.289 ± 0.146
1.196GlnGlu: 1.196 ± 0.158
0.859GlnPhe: 0.859 ± 0.117
0.785GlnGly: 0.785 ± 0.148
0.654GlnHis: 0.654 ± 0.102
1.626GlnIle: 1.626 ± 0.181
1.495GlnLys: 1.495 ± 0.192
2.728GlnLeu: 2.728 ± 0.22
0.691GlnMet: 0.691 ± 0.124
1.495GlnAsn: 1.495 ± 0.175
0.71GlnPro: 0.71 ± 0.157
0.99GlnGln: 0.99 ± 0.158
1.084GlnArg: 1.084 ± 0.14
1.719GlnSer: 1.719 ± 0.166
1.551GlnThr: 1.551 ± 0.16
1.009GlnVal: 1.009 ± 0.153
0.206GlnTrp: 0.206 ± 0.058
1.663GlnTyr: 1.663 ± 0.175
0.0GlnXaa: 0.0 ± 0.0
Arg
1.177ArgAla: 1.177 ± 0.16
1.065ArgCys: 1.065 ± 0.147
2.616ArgAsp: 2.616 ± 0.232
2.111ArgGlu: 2.111 ± 0.224
2.205ArgPhe: 2.205 ± 0.205
1.85ArgGly: 1.85 ± 0.189
1.383ArgHis: 1.383 ± 0.183
3.513ArgIle: 3.513 ± 0.267
2.597ArgLys: 2.597 ± 0.252
4.148ArgLeu: 4.148 ± 0.262
1.065ArgMet: 1.065 ± 0.137
2.952ArgAsn: 2.952 ± 0.221
1.401ArgPro: 1.401 ± 0.177
1.345ArgGln: 1.345 ± 0.178
2.634ArgArg: 2.634 ± 0.259
3.008ArgSer: 3.008 ± 0.26
2.205ArgThr: 2.205 ± 0.222
2.354ArgVal: 2.354 ± 0.201
0.411ArgTrp: 0.411 ± 0.085
2.616ArgTyr: 2.616 ± 0.236
0.0ArgXaa: 0.0 ± 0.0
Ser
2.728SerAla: 2.728 ± 0.236
1.682SerCys: 1.682 ± 0.201
4.634SerAsp: 4.634 ± 0.32
3.494SerGlu: 3.494 ± 0.338
3.793SerPhe: 3.793 ± 0.266
3.102SerGly: 3.102 ± 0.258
1.551SerHis: 1.551 ± 0.182
7.287SerIle: 7.287 ± 0.356
6.203SerLys: 6.203 ± 0.327
7.212SerLeu: 7.212 ± 0.36
2.336SerMet: 2.336 ± 0.236
4.727SerAsn: 4.727 ± 0.289
2.989SerPro: 2.989 ± 0.266
2.074SerGln: 2.074 ± 0.235
3.344SerArg: 3.344 ± 0.282
6.838SerSer: 6.838 ± 0.489
5.007SerThr: 5.007 ± 0.394
5.082SerVal: 5.082 ± 0.32
0.355SerTrp: 0.355 ± 0.08
3.55SerTyr: 3.55 ± 0.259
0.0SerXaa: 0.0 ± 0.0
Thr
2.317ThrAla: 2.317 ± 0.224
1.289ThrCys: 1.289 ± 0.205
4.223ThrAsp: 4.223 ± 0.314
3.438ThrGlu: 3.438 ± 0.256
2.896ThrPhe: 2.896 ± 0.21
2.597ThrGly: 2.597 ± 0.207
1.364ThrHis: 1.364 ± 0.151
5.923ThrIle: 5.923 ± 0.341
4.578ThrLys: 4.578 ± 0.268
4.989ThrLeu: 4.989 ± 0.284
1.906ThrMet: 1.906 ± 0.19
3.886ThrAsn: 3.886 ± 0.266
2.522ThrPro: 2.522 ± 0.286
1.009ThrGln: 1.009 ± 0.148
2.578ThrArg: 2.578 ± 0.224
4.895ThrSer: 4.895 ± 0.322
3.942ThrThr: 3.942 ± 0.29
4.167ThrVal: 4.167 ± 0.288
0.523ThrTrp: 0.523 ± 0.087
2.84ThrTyr: 2.84 ± 0.237
0.0ThrXaa: 0.0 ± 0.0
Val
2.354ValAla: 2.354 ± 0.173
1.551ValCys: 1.551 ± 0.156
4.073ValAsp: 4.073 ± 0.255
3.55ValGlu: 3.55 ± 0.286
3.251ValPhe: 3.251 ± 0.285
1.906ValGly: 1.906 ± 0.19
1.046ValHis: 1.046 ± 0.134
5.736ValIle: 5.736 ± 0.359
4.914ValLys: 4.914 ± 0.341
5.4ValLeu: 5.4 ± 0.347
1.551ValMet: 1.551 ± 0.2
4.559ValAsn: 4.559 ± 0.26
1.906ValPro: 1.906 ± 0.219
1.252ValGln: 1.252 ± 0.155
2.317ValArg: 2.317 ± 0.205
4.989ValSer: 4.989 ± 0.329
3.83ValThr: 3.83 ± 0.314
3.718ValVal: 3.718 ± 0.289
0.224ValTrp: 0.224 ± 0.072
3.531ValTyr: 3.531 ± 0.3
0.0ValXaa: 0.0 ± 0.0
Trp
0.206TrpAla: 0.206 ± 0.058
0.168TrpCys: 0.168 ± 0.057
0.318TrpAsp: 0.318 ± 0.071
0.411TrpGlu: 0.411 ± 0.085
0.448TrpPhe: 0.448 ± 0.089
0.28TrpGly: 0.28 ± 0.075
0.112TrpHis: 0.112 ± 0.045
0.598TrpIle: 0.598 ± 0.126
0.598TrpLys: 0.598 ± 0.106
0.71TrpLeu: 0.71 ± 0.103
0.355TrpMet: 0.355 ± 0.083
0.43TrpAsn: 0.43 ± 0.114
0.262TrpPro: 0.262 ± 0.066
0.168TrpGln: 0.168 ± 0.061
0.206TrpArg: 0.206 ± 0.063
0.392TrpSer: 0.392 ± 0.079
0.467TrpThr: 0.467 ± 0.092
0.355TrpVal: 0.355 ± 0.089
0.0TrpTrp: 0.0 ± 0.0
0.299TrpTyr: 0.299 ± 0.074
0.0TrpXaa: 0.0 ± 0.0
Tyr
1.943TyrAla: 1.943 ± 0.187
1.401TyrCys: 1.401 ± 0.177
3.008TyrAsp: 3.008 ± 0.252
2.354TyrGlu: 2.354 ± 0.208
2.691TyrPhe: 2.691 ± 0.26
2.653TyrGly: 2.653 ± 0.237
1.028TyrHis: 1.028 ± 0.124
5.362TyrIle: 5.362 ± 0.282
4.129TyrLys: 4.129 ± 0.256
5.157TyrLeu: 5.157 ± 0.29
1.345TyrMet: 1.345 ± 0.162
3.961TyrAsn: 3.961 ± 0.294
1.719TyrPro: 1.719 ± 0.19
1.065TyrGln: 1.065 ± 0.165
1.981TyrArg: 1.981 ± 0.211
3.849TyrSer: 3.849 ± 0.256
3.008TyrThr: 3.008 ± 0.266
3.27TyrVal: 3.27 ± 0.233
0.299TyrTrp: 0.299 ± 0.077
3.027TyrTyr: 3.027 ± 0.237
0.0TyrXaa: 0.0 ± 0.0
Xaa
0.0XaaAla: 0.0 ± 0.0
0.0XaaCys: 0.0 ± 0.0
0.0XaaAsp: 0.0 ± 0.0
0.0XaaGlu: 0.0 ± 0.0
0.0XaaPhe: 0.0 ± 0.0
0.0XaaGly: 0.0 ± 0.0
0.0XaaHis: 0.0 ± 0.0
0.0XaaIle: 0.0 ± 0.0
0.0XaaLys: 0.0 ± 0.0
0.0XaaLeu: 0.0 ± 0.0
0.0XaaMet: 0.0 ± 0.0
0.0XaaAsn: 0.0 ± 0.0
0.0XaaPro: 0.0 ± 0.0
0.0XaaGln: 0.0 ± 0.0
0.0XaaArg: 0.0 ± 0.0
0.0XaaSer: 0.0 ± 0.0
0.0XaaThr: 0.0 ± 0.0
0.0XaaVal: 0.0 ± 0.0
0.0XaaTrp: 0.0 ± 0.0
0.0XaaTyr: 0.0 ± 0.0
0.0XaaXaa: 0.0 ± 0.0
Statistics based on 188 proteins (53522 amino acids)

Note: The error has been estimated by bootstrapping (×100) at the protein level.
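A protein-level bootstrap of this kind could look like the sketch below: whole proteins are resampled with replacement, the frequency is recomputed for each resample, and the spread of the 100 estimates is reported. This is an illustration under assumptions, not the database's actual code: the function names, the toy sequences, the one-letter input, the fixed seed, and the use of the sample standard deviation as the error are all hypothetical choices.

```python
import random
import statistics

def pair_per_mille(sequences, pair):
    """Frequency of one dipeptide (one-letter code) in per mille."""
    count = 0
    total = 0
    for seq in sequences:
        total += max(len(seq) - 1, 0)
        count += sum(1 for i in range(len(seq) - 1) if seq[i:i + 2] == pair)
    return 1000.0 * count / total if total else 0.0

def bootstrap_error(sequences, pair, n_rounds=100, seed=0):
    """Spread of the frequency over resampled proteomes."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    estimates = []
    for _ in range(n_rounds):
        # Resample whole proteins with replacement, not individual residues.
        sample = rng.choices(sequences, k=len(sequences))
        estimates.append(pair_per_mille(sample, pair))
    # Assumption: the error is the sample standard deviation of the estimates.
    return statistics.stdev(estimates)

# Hypothetical toy proteome, not taken from the Variola virus data.
toy = ["MKLLA", "LLK", "AKLLM", "MMKA"]
error_ll = bootstrap_error(toy, "LL")
```

Resampling at the protein level keeps each sequence intact, so the error reflects variation between proteins rather than between single positions.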

The dipeptide statistics above (among other statistics for this proteome) can be downloaded from this CSV file
See this proteome in: uniprot_link
Proteome-pI is available under a Creative Commons Attribution-NoDerivs license; for more details see here

Reference: Kozlowski LP. Proteome-pI 2.0: Proteome Isoelectric Point Database Update. Nucleic Acids Res. 2021, doi: 10.1093/nar/gkab944
Contact: Lukasz P. Kozlowski