Amino acid dipeptide frequency for Monkeypox virus

Apart from single amino acid frequencies, one can also calculate the so-called amino acid dipeptide frequency, i.e. how often each ordered pair of adjacent amino acids occurs.
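As a rough illustration of what such a frequency is, the sketch below counts adjacent residue pairs in a set of protein sequences and reports them in per mille. It is not the database's own code: the function name `dipeptide_frequencies`, the use of one-letter residue codes, and the choice to count overlapping pairs within each protein (never across protein boundaries) are all assumptions made for this example.

```python
from collections import Counter


def dipeptide_frequencies(proteins):
    """Return dipeptide frequencies in per mille for a list of sequences.

    Assumption: overlapping adjacent pairs are counted inside each protein,
    and the counts are normalized by the total number of pairs.
    """
    counts = Counter()
    for seq in proteins:
        # A sequence of length n contributes n - 1 overlapping pairs.
        for i in range(len(seq) - 1):
            counts[seq[i:i + 2]] += 1
    total = sum(counts.values())
    if total == 0:
        return {}
    return {pair: 1000.0 * n / total for pair, n in counts.items()}


# Hypothetical toy input, not Monkeypox virus data.
toy = ["MKKLL", "MKL"]
freqs = dipeptide_frequencies(toy)
for pair in sorted(freqs):
    print(pair, round(freqs[pair], 3))
```

Because the counts are normalized by the total number of pairs, the frequencies computed this way always sum to 1000‰.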

There are 400 possible dipeptides (441 if Xaa is counted as an additional 21st amino acid representing all non-standard or unknown amino acids). Thus, if dipeptides occurred randomly and with equal probability in proteins, each amino acid dipeptide would be present with a frequency of 0.25% (2.5‰). As this is not the case in nature, for better visibility all dipeptides that are more frequent than expected are marked in red, and those that are underrepresented are marked in blue in the table.
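The expected value and the red/blue marking can be sketched as follows. This assumes the marking simply compares each observed value with the uniform expectation of 1/400; the page does not state the exact rule, and the function name `classify` is hypothetical. The three observed values are taken from the table on this page.

```python
N_STANDARD = 20  # standard amino acids, giving 20 * 20 = 400 dipeptides

# Uniform expectation in per mille: 1000 / 400 = 2.5, i.e. 0.25 %.
expected_permille = 1000.0 / (N_STANDARD ** 2)


def classify(observed_permille, expected=expected_permille):
    """Label a dipeptide relative to the uniform expectation (assumed rule)."""
    if observed_permille > expected:
        return "over"   # would be marked red
    if observed_permille < expected:
        return "under"  # would be marked blue
    return "as expected"


# Values in per mille copied from the table on this page.
examples = {"LeuLeu": 9.004, "TrpHis": 0.11, "AlaLys": 2.599}
labels = {name: classify(value) for name, value in examples.items()}
print(expected_permille, labels)
```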

All values are given in per mille (‰), so they must be multiplied by 10⁻³ to obtain fractions.

For more information, see the sequence space article on Wikipedia.

Ala Cys Asp Glu Phe Gly His Ile Lys Leu Met Asn Pro Gln Arg Ser Thr Val Trp Tyr Xaa
Ala
AlaAla: 2.233 ± 0.242
AlaCys: 1.025 ± 0.15
AlaAsp: 1.995 ± 0.181
AlaGlu: 1.922 ± 0.169
AlaPhe: 1.739 ± 0.175
AlaGly: 1.592 ± 0.168
AlaHis: 0.458 ± 0.112
AlaIle: 3.898 ± 0.306
AlaLys: 2.599 ± 0.214
AlaLeu: 3.276 ± 0.232
AlaMet: 1.025 ± 0.145
AlaAsn: 2.269 ± 0.212
AlaPro: 0.988 ± 0.15
AlaGln: 0.659 ± 0.112
AlaArg: 1.519 ± 0.17
AlaSer: 3.349 ± 0.263
AlaThr: 2.397 ± 0.226
AlaVal: 2.873 ± 0.204
AlaTrp: 0.238 ± 0.058
AlaTyr: 1.72 ± 0.212
AlaXaa: 0.0 ± 0.0
Cys
CysAla: 0.75 ± 0.125
CysCys: 0.586 ± 0.11
CysAsp: 1.61 ± 0.182
CysGlu: 0.915 ± 0.176
CysPhe: 0.677 ± 0.115
CysGly: 1.025 ± 0.133
CysHis: 0.439 ± 0.094
CysIle: 2.031 ± 0.223
CysLys: 1.153 ± 0.124
CysLeu: 1.574 ± 0.15
CysMet: 0.586 ± 0.101
CysAsn: 1.318 ± 0.155
CysPro: 0.677 ± 0.121
CysGln: 0.458 ± 0.085
CysArg: 0.787 ± 0.126
CysSer: 1.702 ± 0.186
CysThr: 1.354 ± 0.143
CysVal: 1.318 ± 0.155
CysTrp: 0.22 ± 0.067
CysTyr: 1.244 ± 0.162
CysXaa: 0.0 ± 0.0
Asp
AspAla: 2.672 ± 0.216
AspCys: 0.97 ± 0.136
AspAsp: 5.819 ± 0.509
AspGlu: 4.484 ± 0.264
AspPhe: 2.891 ± 0.219
AspGly: 2.891 ± 0.224
AspHis: 1.135 ± 0.133
AspIle: 7.979 ± 0.426
AspLys: 4.996 ± 0.309
AspLeu: 4.886 ± 0.314
AspMet: 1.83 ± 0.172
AspAsn: 4.74 ± 0.318
AspPro: 1.757 ± 0.161
AspGln: 1.226 ± 0.132
AspArg: 2.379 ± 0.25
AspSer: 4.502 ± 0.279
AspThr: 4.282 ± 0.459
AspVal: 4.484 ± 0.265
AspTrp: 0.531 ± 0.104
AspTyr: 3.861 ± 0.318
AspXaa: 0.0 ± 0.0
Glu
GluAla: 2.068 ± 0.182
GluCys: 1.153 ± 0.125
GluAsp: 3.678 ± 0.288
GluGlu: 3.349 ± 0.298
GluPhe: 2.599 ± 0.229
GluGly: 1.592 ± 0.166
GluHis: 1.208 ± 0.15
GluIle: 5.033 ± 0.347
GluLys: 3.44 ± 0.244
GluLeu: 5.234 ± 0.345
GluMet: 1.354 ± 0.181
GluAsn: 3.422 ± 0.276
GluPro: 1.867 ± 0.237
GluGln: 1.482 ± 0.201
GluArg: 2.342 ± 0.267
GluSer: 3.898 ± 0.239
GluThr: 3.495 ± 0.271
GluVal: 2.818 ± 0.236
GluTrp: 0.512 ± 0.089
GluTyr: 3.88 ± 0.287
GluXaa: 0.0 ± 0.0
Phe
PheAla: 1.629 ± 0.166
PheCys: 0.842 ± 0.127
PheAsp: 3.111 ± 0.246
PheGlu: 2.068 ± 0.153
PhePhe: 2.141 ± 0.189
PheGly: 1.995 ± 0.191
PheHis: 0.769 ± 0.109
PheIle: 4.703 ± 0.305
PheLys: 3.312 ± 0.237
PheLeu: 4.081 ± 0.331
PheMet: 1.427 ± 0.148
PheAsn: 3.532 ± 0.243
PhePro: 1.318 ± 0.144
PheGln: 0.878 ± 0.104
PheArg: 1.903 ± 0.176
PheSer: 3.55 ± 0.296
PheThr: 2.983 ± 0.269
PheVal: 3.184 ± 0.205
PheTrp: 0.439 ± 0.099
PheTyr: 2.306 ± 0.198
PheXaa: 0.0 ± 0.0
Gly
GlyAla: 2.086 ± 0.204
GlyCys: 0.787 ± 0.124
GlyAsp: 2.672 ± 0.209
GlyGlu: 2.123 ± 0.164
GlyPhe: 1.903 ± 0.174
GlyGly: 2.159 ± 0.249
GlyHis: 0.878 ± 0.126
GlyIle: 3.733 ± 0.262
GlyLys: 3.111 ± 0.246
GlyLeu: 3.02 ± 0.233
GlyMet: 1.007 ± 0.137
GlyAsn: 2.837 ± 0.207
GlyPro: 0.915 ± 0.135
GlyGln: 0.677 ± 0.114
GlyArg: 1.958 ± 0.184
GlySer: 2.946 ± 0.22
GlyThr: 2.324 ± 0.209
GlyVal: 2.855 ± 0.247
GlyTrp: 0.275 ± 0.069
GlyTyr: 2.141 ± 0.189
GlyXaa: 0.0 ± 0.0
His
HisAla: 0.878 ± 0.118
HisCys: 0.549 ± 0.115
HisAsp: 1.226 ± 0.149
HisGlu: 0.915 ± 0.104
HisPhe: 0.824 ± 0.117
HisGly: 1.116 ± 0.147
HisHis: 0.458 ± 0.103
HisIle: 2.471 ± 0.226
HisLys: 1.263 ± 0.144
HisLeu: 1.976 ± 0.205
HisMet: 0.494 ± 0.091
HisAsn: 1.098 ± 0.158
HisPro: 0.769 ± 0.1
HisGln: 0.622 ± 0.116
HisArg: 0.97 ± 0.149
HisSer: 1.354 ± 0.148
HisThr: 1.318 ± 0.178
HisVal: 1.244 ± 0.136
HisTrp: 0.201 ± 0.058
HisTyr: 0.842 ± 0.122
HisXaa: 0.0 ± 0.0
Ile
IleAla: 3.44 ± 0.216
IleCys: 1.537 ± 0.171
IleAsp: 7.521 ± 0.371
IleGlu: 5.197 ± 0.311
IlePhe: 4.172 ± 0.276
IleGly: 3.678 ± 0.235
IleHis: 2.214 ± 0.182
IleIle: 8.382 ± 0.437
IleLys: 7.247 ± 0.414
IleLeu: 8.162 ± 0.485
IleMet: 2.306 ± 0.203
IleAsn: 7.595 ± 0.354
IlePro: 3.642 ± 0.282
IleGln: 1.958 ± 0.195
IleArg: 3.88 ± 0.283
IleSer: 8.473 ± 0.378
IleThr: 5.197 ± 0.322
IleVal: 5.453 ± 0.334
IleTrp: 0.531 ± 0.097
IleTyr: 4.319 ± 0.278
IleXaa: 0.0 ± 0.0
Lys
LysAla: 1.995 ± 0.176
LysCys: 1.592 ± 0.15
LysAsp: 4.795 ± 0.316
LysGlu: 4.026 ± 0.263
LysPhe: 3.129 ± 0.268
LysGly: 2.178 ± 0.144
LysHis: 1.867 ± 0.186
LysIle: 6.991 ± 0.426
LysLys: 5.893 ± 0.325
LysLeu: 6.643 ± 0.294
LysMet: 1.995 ± 0.213
LysAsn: 5.124 ± 0.298
LysPro: 2.086 ± 0.166
LysGln: 2.105 ± 0.174
LysArg: 3.44 ± 0.292
LysSer: 5.508 ± 0.348
LysThr: 4.246 ± 0.246
LysVal: 3.678 ± 0.272
LysTrp: 0.659 ± 0.107
LysTyr: 4.575 ± 0.318
LysXaa: 0.0 ± 0.0
Leu
LeuAla: 3.239 ± 0.246
LeuCys: 1.702 ± 0.172
LeuAsp: 5.874 ± 0.349
LeuGlu: 5.051 ± 0.33
LeuPhe: 4.795 ± 0.338
LeuGly: 3.166 ± 0.295
LeuHis: 1.629 ± 0.24
LeuIle: 6.405 ± 0.362
LeuLys: 5.948 ± 0.326
LeuLeu: 9.004 ± 0.47
LeuMet: 2.452 ± 0.225
LeuAsn: 5.142 ± 0.317
LeuPro: 3.367 ± 0.294
LeuGln: 1.958 ± 0.202
LeuArg: 3.404 ± 0.294
LeuSer: 7.704 ± 0.328
LeuThr: 5.618 ± 0.355
LeuVal: 5.453 ± 0.395
LeuTrp: 0.531 ± 0.106
LeuTyr: 4.74 ± 0.263
LeuXaa: 0.0 ± 0.0
Met
MetAla: 1.592 ± 0.17
MetCys: 0.494 ± 0.095
MetAsp: 1.976 ± 0.181
MetGlu: 1.537 ± 0.168
MetPhe: 1.373 ± 0.154
MetGly: 1.025 ± 0.14
MetHis: 0.366 ± 0.074
MetIle: 2.269 ± 0.207
MetLys: 1.885 ± 0.167
MetLeu: 2.342 ± 0.19
MetMet: 1.025 ± 0.153
MetAsn: 1.848 ± 0.197
MetPro: 0.86 ± 0.116
MetGln: 0.494 ± 0.09
MetArg: 1.281 ± 0.14
MetSer: 2.251 ± 0.219
MetThr: 1.665 ± 0.197
MetVal: 1.537 ± 0.151
MetTrp: 0.201 ± 0.066
MetTyr: 1.482 ± 0.145
MetXaa: 0.0 ± 0.0
Asn
AsnAla: 2.617 ± 0.239
AsnCys: 1.098 ± 0.172
AsnAsp: 4.795 ± 0.345
AsnGlu: 3.752 ± 0.285
AsnPhe: 2.416 ± 0.255
AsnGly: 3.074 ± 0.26
AsnHis: 1.482 ± 0.166
AsnIle: 7.21 ± 0.439
AsnLys: 5.673 ± 0.294
AsnLeu: 4.831 ± 0.33
AsnMet: 2.196 ± 0.18
AsnAsn: 5.307 ± 0.359
AsnPro: 2.379 ± 0.194
AsnGln: 1.318 ± 0.161
AsnArg: 2.91 ± 0.224
AsnSer: 4.41 ± 0.271
AsnThr: 4.392 ± 0.353
AsnVal: 4.264 ± 0.257
AsnTrp: 0.384 ± 0.083
AsnTyr: 3.239 ± 0.282
AsnXaa: 0.0 ± 0.0
Pro
ProAla: 1.244 ± 0.147
ProCys: 0.604 ± 0.113
ProAsp: 1.903 ± 0.196
ProGlu: 2.416 ± 0.209
ProPhe: 1.556 ± 0.169
ProGly: 1.336 ± 0.2
ProHis: 0.714 ± 0.135
ProIle: 3.166 ± 0.243
ProLys: 2.013 ± 0.206
ProLeu: 2.91 ± 0.224
ProMet: 0.842 ± 0.132
ProAsn: 2.105 ± 0.182
ProPro: 1.592 ± 0.211
ProGln: 0.787 ± 0.125
ProArg: 1.647 ± 0.256
ProSer: 2.58 ± 0.215
ProThr: 2.507 ± 0.243
ProVal: 2.233 ± 0.199
ProTrp: 0.293 ± 0.075
ProTyr: 1.482 ± 0.179
ProXaa: 0.0 ± 0.0
Gln
GlnAla: 0.622 ± 0.092
GlnCys: 0.549 ± 0.093
GlnAsp: 1.281 ± 0.14
GlnGlu: 1.208 ± 0.114
GlnPhe: 0.933 ± 0.118
GlnGly: 0.641 ± 0.137
GlnHis: 0.641 ± 0.11
GlnIle: 1.629 ± 0.172
GlnLys: 1.464 ± 0.185
GlnLeu: 2.342 ± 0.214
GlnMet: 0.714 ± 0.111
GlnAsn: 1.409 ± 0.168
GlnPro: 0.769 ± 0.159
GlnGln: 0.915 ± 0.143
GlnArg: 1.263 ± 0.139
GlnSer: 1.793 ± 0.194
GlnThr: 1.537 ± 0.182
GlnVal: 1.043 ± 0.142
GlnTrp: 0.238 ± 0.063
GlnTyr: 1.629 ± 0.183
GlnXaa: 0.0 ± 0.0
Arg
ArgAla: 1.244 ± 0.14
ArgCys: 0.952 ± 0.13
ArgAsp: 2.8 ± 0.236
ArgGlu: 2.013 ± 0.22
ArgPhe: 2.306 ± 0.18
ArgGly: 1.976 ± 0.221
ArgHis: 1.244 ± 0.156
ArgIle: 3.422 ± 0.251
ArgLys: 2.599 ± 0.196
ArgLeu: 4.044 ± 0.294
ArgMet: 1.007 ± 0.138
ArgAsn: 2.855 ± 0.234
ArgPro: 1.446 ± 0.213
ArgGln: 1.391 ± 0.177
ArgArg: 2.562 ± 0.254
ArgSer: 3.276 ± 0.278
ArgThr: 2.361 ± 0.22
ArgVal: 2.58 ± 0.23
ArgTrp: 0.329 ± 0.075
ArgTyr: 2.489 ± 0.224
ArgXaa: 0.0 ± 0.0
Ser
SerAla: 2.873 ± 0.253
SerCys: 1.61 ± 0.209
SerAsp: 4.74 ± 0.361
SerGlu: 3.825 ± 0.313
SerPhe: 3.623 ± 0.27
SerGly: 3.477 ± 0.272
SerHis: 1.592 ± 0.182
SerIle: 7.686 ± 0.391
SerLys: 6.112 ± 0.29
SerLeu: 6.789 ± 0.317
SerMet: 2.159 ± 0.208
SerAsn: 4.538 ± 0.294
SerPro: 3.02 ± 0.242
SerGln: 2.123 ± 0.201
SerArg: 3.184 ± 0.223
SerSer: 6.698 ± 0.413
SerThr: 5.087 ± 0.45
SerVal: 5.161 ± 0.277
SerTrp: 0.384 ± 0.089
SerTyr: 3.532 ± 0.32
SerXaa: 0.0 ± 0.0
Thr
ThrAla: 2.196 ± 0.201
ThrCys: 1.427 ± 0.182
ThrAsp: 4.667 ± 0.492
ThrGlu: 3.239 ± 0.292
ThrPhe: 2.745 ± 0.242
ThrGly: 2.617 ± 0.236
ThrHis: 1.116 ± 0.129
ThrIle: 5.984 ± 0.348
ThrLys: 4.41 ± 0.278
ThrLeu: 5.27 ± 0.32
ThrMet: 1.976 ± 0.201
ThrAsn: 3.806 ± 0.315
ThrPro: 2.58 ± 0.24
ThrGln: 1.135 ± 0.168
ThrArg: 2.525 ± 0.196
ThrSer: 4.795 ± 0.325
ThrThr: 4.026 ± 0.361
ThrVal: 4.319 ± 0.298
ThrTrp: 0.549 ± 0.088
ThrTyr: 3.001 ± 0.258
ThrXaa: 0.0 ± 0.0
Val
ValAla: 2.342 ± 0.181
ValCys: 1.409 ± 0.176
ValAsp: 4.337 ± 0.284
ValGlu: 3.495 ± 0.237
ValPhe: 3.312 ± 0.295
ValGly: 1.848 ± 0.181
ValHis: 1.153 ± 0.146
ValIle: 5.618 ± 0.316
ValLys: 4.904 ± 0.289
ValLeu: 5.106 ± 0.284
ValMet: 1.336 ± 0.169
ValAsn: 4.429 ± 0.239
ValPro: 1.94 ± 0.177
ValGln: 1.281 ± 0.135
ValArg: 2.727 ± 0.222
ValSer: 5.142 ± 0.316
ValThr: 3.971 ± 0.344
ValVal: 3.916 ± 0.269
ValTrp: 0.238 ± 0.069
ValTyr: 3.294 ± 0.253
ValXaa: 0.0 ± 0.0
Trp
TrpAla: 0.201 ± 0.067
TrpCys: 0.183 ± 0.059
TrpAsp: 0.293 ± 0.073
TrpGlu: 0.421 ± 0.087
TrpPhe: 0.403 ± 0.094
TrpGly: 0.311 ± 0.074
TrpHis: 0.11 ± 0.043
TrpIle: 0.641 ± 0.109
TrpLys: 0.604 ± 0.108
TrpLeu: 0.787 ± 0.14
TrpMet: 0.329 ± 0.079
TrpAsn: 0.494 ± 0.106
TrpPro: 0.293 ± 0.075
TrpGln: 0.165 ± 0.062
TrpArg: 0.256 ± 0.072
TrpSer: 0.531 ± 0.101
TrpThr: 0.421 ± 0.076
TrpVal: 0.311 ± 0.066
TrpTrp: 0.0 ± 0.0
TrpTyr: 0.366 ± 0.08
TrpXaa: 0.0 ± 0.0
Tyr
TyrAla: 1.867 ± 0.186
TyrCys: 1.354 ± 0.157
TyrAsp: 3.221 ± 0.234
TyrGlu: 2.471 ± 0.237
TyrPhe: 2.635 ± 0.216
TyrGly: 2.708 ± 0.233
TyrHis: 1.116 ± 0.155
TyrIle: 5.655 ± 0.323
TyrLys: 3.898 ± 0.269
TyrLeu: 4.904 ± 0.307
TyrMet: 1.427 ± 0.186
TyrAsn: 3.88 ± 0.353
TyrPro: 1.665 ± 0.169
TyrGln: 0.988 ± 0.161
TyrArg: 1.958 ± 0.228
TyrSer: 3.752 ± 0.24
TyrThr: 3.203 ± 0.274
TyrVal: 3.074 ± 0.219
TyrTrp: 0.366 ± 0.093
TyrTyr: 2.928 ± 0.215
TyrXaa: 0.0 ± 0.0
Xaa
XaaAla: 0.0 ± 0.0
XaaCys: 0.0 ± 0.0
XaaAsp: 0.0 ± 0.0
XaaGlu: 0.0 ± 0.0
XaaPhe: 0.0 ± 0.0
XaaGly: 0.0 ± 0.0
XaaHis: 0.0 ± 0.0
XaaIle: 0.0 ± 0.0
XaaLys: 0.0 ± 0.0
XaaLeu: 0.0 ± 0.0
XaaMet: 0.0 ± 0.0
XaaAsn: 0.0 ± 0.0
XaaPro: 0.0 ± 0.0
XaaGln: 0.0 ± 0.0
XaaArg: 0.0 ± 0.0
XaaSer: 0.0 ± 0.0
XaaThr: 0.0 ± 0.0
XaaVal: 0.0 ± 0.0
XaaTrp: 0.0 ± 0.0
XaaTyr: 0.0 ± 0.0
XaaXaa: 0.0 ± 0.0
Statistics based on 174 proteins (54645 amino acids)

Note: The error was estimated by bootstrapping (×100) at the protein level.
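One way to read this note is sketched below: whole proteins are resampled with replacement 100 times, the dipeptide frequency is recomputed for each resample, and the spread of those estimates is reported as the error. The exact procedure is not described on this page, so the use of the sample standard deviation, the helper names `pair_frequency` and `bootstrap_error`, the fixed seed, and the toy sequences are all assumptions.

```python
import random
import statistics


def pair_frequency(proteins, pair):
    """Frequency of one dipeptide in per mille over all overlapping pairs."""
    total = 0
    hits = 0
    for seq in proteins:
        for i in range(len(seq) - 1):
            total += 1
            if seq[i:i + 2] == pair:
                hits += 1
    return 1000.0 * hits / total if total else 0.0


def bootstrap_error(proteins, pair, n_resamples=100, seed=0):
    """Standard deviation of the frequency over protein-level resamples."""
    rng = random.Random(seed)  # fixed seed only to make the sketch repeatable
    estimates = []
    for _ in range(n_resamples):
        # Draw as many proteins as in the original set, with replacement.
        sample = rng.choices(proteins, k=len(proteins))
        estimates.append(pair_frequency(sample, pair))
    return statistics.stdev(estimates)


# Hypothetical toy proteome, not Monkeypox virus data.
toy = ["MKKLL", "MKLAA", "MAKKK", "MLLKA"]
print(pair_frequency(toy, "KK"), bootstrap_error(toy, "KK"))
```

Resampling whole proteins rather than individual residues keeps each sequence intact, so the estimate reflects variation between proteins in the proteome.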

The dipeptide statistics above (along with other statistics for this proteome) can be downloaded from this CSV file
See this proteome in: uniprot_link
Proteome-pI is available under the Creative Commons Attribution-NoDerivs license; for more details see here

Reference: Kozlowski LP. Proteome-pI 2.0: Proteome Isoelectric Point Database Update. Nucleic Acids Res. 2021, doi: 10.1093/nar/gkab944
Contact: Lukasz P. Kozlowski