Amino acid dipeptide frequency for Malacosoma neustria nuclear polyhedrosis virus (MnNPV)

Apart from single amino acid frequencies, one can also calculate the so-called amino acid dipeptide frequency, that is, how often each ordered pair of adjacent residues occurs.

There are 400 possible dipeptides (441 if Xaa is counted as an additional, 21st amino acid representing all non-standard or unknown amino acids). Thus, if dipeptides occurred randomly in proteins, each of the 400 dipeptides would be present at a frequency of 0.25% (2.5‰). As this is not the case in nature, for better visibility all overrepresented dipeptides are marked in red and all underrepresented ones in blue in the table.
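The counting described above can be sketched in a few lines of Python. This is a minimal illustration, not the code used by Proteome-pI: the function names, the toy sequences, the use of overlapping pairs that never span two proteins, and the normalization by the total number of counted pairs are all assumptions. The simple comparison against 2.5‰ is also an assumption, since the exact rule behind the red and blue marking is not stated here.

```python
from collections import Counter

STANDARD = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard amino acids, one-letter codes
EXPECTED_UNIFORM = 1000.0 / 400    # 2.5 per mille, i.e. 0.25 %

def dipeptide_per_mille(proteins):
    """Return {dipeptide: frequency in per mille} over overlapping residue pairs."""
    counts = Counter()
    for seq in proteins:
        # Assumption: anything outside the 20 standard residues is treated as X (Xaa).
        clean = "".join(c if c in STANDARD else "X" for c in seq.upper())
        # Overlapping windows of length 2; pairs never span two proteins.
        counts.update(clean[i:i + 2] for i in range(len(clean) - 1))
    total = sum(counts.values())
    if total == 0:
        return {}
    return {pair: 1000.0 * n / total for pair, n in counts.items()}

def classify(freq_per_mille, expected=EXPECTED_UNIFORM):
    """Label a frequency relative to the uniform expectation."""
    if freq_per_mille > expected:
        return "over"
    if freq_per_mille < expected:
        return "under"
    return "expected"

# Hypothetical toy sequences; U (selenocysteine) is mapped to X.
toy = ["MKLLNA", "MKKUL"]
freqs = dipeptide_per_mille(toy)
```

With real data, `classify(3.601)` would label the AlaAla value from the table as overrepresented and `classify(0.66)` would label AlaCys as underrepresented.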

All values are given in per mille (‰), so they need to be multiplied by 10⁻³ to obtain fractions.
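As a small worked example of the unit conversion, the snippet below turns the AlaAla value from the table into a fraction and a percentage. The helper names are illustrative only.

```python
def per_mille_to_fraction(value):
    """Convert a per mille value to a plain fraction (multiply by 10^-3)."""
    return value * 1e-3

def per_mille_to_percent(value):
    """Convert a per mille value to percent (divide by 10)."""
    return value / 10.0

ala_ala = 3.601                            # AlaAla frequency from the table, in per mille
fraction = per_mille_to_fraction(ala_ala)  # about 0.003601
percent = per_mille_to_percent(ala_ala)    # about 0.3601 %
```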

For more information, see the sequence space article on Wikipedia.

Ala Cys Asp Glu Phe Gly His Ile Lys Leu Met Asn Pro Gln Arg Ser Thr Val Trp Tyr Xaa
Ala
AlaAla: 3.601 ± 0.436
AlaCys: 0.66 ± 0.133
AlaAsp: 2.529 ± 0.242
AlaGlu: 2.226 ± 0.234
AlaPhe: 2.006 ± 0.239
AlaGly: 1.869 ± 0.244
AlaHis: 0.852 ± 0.166
AlaIle: 3.601 ± 0.304
AlaLys: 2.639 ± 0.236
AlaLeu: 4.15 ± 0.395
AlaMet: 1.044 ± 0.22
AlaAsn: 2.858 ± 0.275
AlaPro: 1.814 ± 0.272
AlaGln: 1.319 ± 0.201
AlaArg: 1.979 ± 0.241
AlaSer: 2.996 ± 0.263
AlaThr: 3.216 ± 0.368
AlaVal: 2.776 ± 0.289
AlaTrp: 0.302 ± 0.166
AlaTyr: 1.484 ± 0.179
AlaXaa: 0.0 ± 0.0
Cys
CysAla: 1.072 ± 0.162
CysCys: 0.632 ± 0.136
CysAsp: 1.567 ± 0.204
CysGlu: 1.429 ± 0.222
CysPhe: 0.825 ± 0.156
CysGly: 0.88 ± 0.162
CysHis: 0.22 ± 0.078
CysIle: 1.539 ± 0.218
CysLys: 1.842 ± 0.227
CysLeu: 1.787 ± 0.244
CysMet: 0.577 ± 0.136
CysAsn: 1.512 ± 0.206
CysPro: 1.017 ± 0.205
CysGln: 0.495 ± 0.114
CysArg: 1.154 ± 0.165
CysSer: 1.512 ± 0.18
CysThr: 1.127 ± 0.148
CysVal: 1.924 ± 0.214
CysTrp: 0.11 ± 0.05
CysTyr: 0.989 ± 0.182
CysXaa: 0.0 ± 0.0
Asp
AspAla: 2.336 ± 0.227
AspCys: 1.154 ± 0.208
AspAsp: 7.504 ± 0.73
AspGlu: 4.563 ± 0.344
AspPhe: 3.271 ± 0.325
AspGly: 2.199 ± 0.268
AspHis: 1.512 ± 0.189
AspIle: 4.645 ± 0.38
AspLys: 4.315 ± 0.393
AspLeu: 5.195 ± 0.345
AspMet: 1.347 ± 0.153
AspAsn: 6.239 ± 0.504
AspPro: 1.704 ± 0.187
AspGln: 1.237 ± 0.159
AspArg: 3.051 ± 0.341
AspSer: 3.82 ± 0.345
AspThr: 3.326 ± 0.31
AspVal: 4.123 ± 0.324
AspTrp: 0.357 ± 0.093
AspTyr: 3.656 ± 0.305
AspXaa: 0.0 ± 0.0
Glu
GluAla: 1.896 ± 0.275
GluCys: 1.457 ± 0.2
GluAsp: 2.996 ± 0.249
GluGlu: 3.161 ± 0.323
GluPhe: 2.254 ± 0.247
GluGly: 1.704 ± 0.226
GluHis: 1.869 ± 0.212
GluIle: 4.095 ± 0.372
GluLys: 4.947 ± 0.379
GluLeu: 5.167 ± 0.383
GluMet: 2.089 ± 0.227
GluAsn: 5.689 ± 0.378
GluPro: 1.319 ± 0.225
GluGln: 1.869 ± 0.26
GluArg: 2.556 ± 0.276
GluSer: 3.628 ± 0.293
GluThr: 3.601 ± 0.323
GluVal: 2.116 ± 0.198
GluTrp: 0.495 ± 0.111
GluTyr: 2.776 ± 0.264
GluXaa: 0.027 ± 0.027
Phe
PheAla: 1.896 ± 0.22
PheCys: 1.264 ± 0.201
PheAsp: 4.068 ± 0.277
PheGlu: 2.529 ± 0.242
PhePhe: 2.006 ± 0.245
PheGly: 1.594 ± 0.205
PheHis: 0.962 ± 0.158
PheIle: 3.958 ± 0.379
PheLys: 3.82 ± 0.324
PheLeu: 4.315 ± 0.378
PheMet: 0.962 ± 0.161
PheAsn: 5.14 ± 0.381
PhePro: 1.264 ± 0.247
PheGln: 1.347 ± 0.252
PheArg: 1.951 ± 0.233
PheSer: 2.694 ± 0.249
PheThr: 2.144 ± 0.252
PheVal: 4.233 ± 0.339
PheTrp: 0.412 ± 0.108
PheTyr: 2.474 ± 0.244
PheXaa: 0.0 ± 0.0
Gly
GlyAla: 1.787 ± 0.261
GlyCys: 0.715 ± 0.139
GlyAsp: 2.776 ± 0.335
GlyGlu: 1.622 ± 0.239
GlyPhe: 1.182 ± 0.199
GlyGly: 2.639 ± 0.316
GlyHis: 0.577 ± 0.124
GlyIle: 1.924 ± 0.305
GlyLys: 2.474 ± 0.318
GlyLeu: 2.474 ± 0.324
GlyMet: 0.77 ± 0.132
GlyAsn: 2.611 ± 0.278
GlyPro: 0.577 ± 0.143
GlyGln: 1.127 ± 0.19
GlyArg: 1.512 ± 0.262
GlySer: 1.814 ± 0.234
GlyThr: 1.594 ± 0.197
GlyVal: 2.391 ± 0.335
GlyTrp: 0.275 ± 0.082
GlyTyr: 1.759 ± 0.192
GlyXaa: 0.0 ± 0.0
His
HisAla: 0.989 ± 0.167
HisCys: 0.44 ± 0.097
HisAsp: 1.622 ± 0.228
HisGlu: 1.539 ± 0.201
HisPhe: 1.017 ± 0.162
HisGly: 0.797 ± 0.167
HisHis: 0.605 ± 0.158
HisIle: 1.539 ± 0.204
HisLys: 1.402 ± 0.159
HisLeu: 2.061 ± 0.203
HisMet: 0.742 ± 0.139
HisAsn: 1.951 ± 0.229
HisPro: 0.715 ± 0.14
HisGln: 1.209 ± 0.317
HisArg: 0.825 ± 0.152
HisSer: 1.237 ± 0.148
HisThr: 0.825 ± 0.138
HisVal: 1.649 ± 0.268
HisTrp: 0.137 ± 0.057
HisTyr: 1.319 ± 0.201
HisXaa: 0.0 ± 0.0
Ile
IleAla: 3.023 ± 0.295
IleCys: 1.649 ± 0.267
IleAsp: 5.552 ± 0.442
IleGlu: 5.799 ± 0.405
IlePhe: 4.095 ± 0.329
IleGly: 2.254 ± 0.254
IleHis: 1.264 ± 0.18
IleIle: 5.634 ± 0.425
IleLys: 7.064 ± 0.539
IleLeu: 6.239 ± 0.395
IleMet: 1.924 ± 0.297
IleAsn: 6.816 ± 0.434
IlePro: 1.951 ± 0.246
IleGln: 2.144 ± 0.261
IleArg: 2.886 ± 0.268
IleSer: 4.178 ± 0.384
IleThr: 4.04 ± 0.311
IleVal: 5.415 ± 0.342
IleTrp: 0.522 ± 0.109
IleTyr: 3.436 ± 0.318
IleXaa: 0.0 ± 0.0
Lys
LysAla: 2.061 ± 0.264
LysCys: 1.759 ± 0.283
LysAsp: 2.611 ± 0.285
LysGlu: 3.628 ± 0.3
LysPhe: 3.738 ± 0.392
LysGly: 1.457 ± 0.221
LysHis: 2.364 ± 0.26
LysIle: 6.596 ± 0.509
LysLys: 5.799 ± 0.574
LysLeu: 8.081 ± 0.504
LysMet: 2.364 ± 0.261
LysAsn: 6.679 ± 0.562
LysPro: 2.254 ± 0.279
LysGln: 3.106 ± 0.324
LysArg: 4.425 ± 0.38
LysSer: 4.837 ± 0.46
LysThr: 3.738 ± 0.296
LysVal: 3.628 ± 0.338
LysTrp: 0.632 ± 0.144
LysTyr: 4.205 ± 0.378
LysXaa: 0.027 ± 0.027
Leu
LeuAla: 3.711 ± 0.332
LeuCys: 2.226 ± 0.259
LeuAsp: 5.25 ± 0.354
LeuGlu: 4.288 ± 0.297
LeuPhe: 4.508 ± 0.343
LeuGly: 2.171 ± 0.284
LeuHis: 2.281 ± 0.242
LeuIle: 7.421 ± 0.477
LeuLys: 7.036 ± 0.563
LeuLeu: 9.043 ± 0.577
LeuMet: 2.721 ± 0.325
LeuAsn: 8.273 ± 0.494
LeuPro: 3.656 ± 0.291
LeuGln: 3.243 ± 0.309
LeuArg: 3.573 ± 0.283
LeuSer: 6.212 ± 0.337
LeuThr: 5.112 ± 0.382
LeuVal: 5.305 ± 0.464
LeuTrp: 0.742 ± 0.16
LeuTyr: 4.37 ± 0.364
LeuXaa: 0.0 ± 0.0
Met
MetAla: 1.787 ± 0.222
MetCys: 0.797 ± 0.129
MetAsp: 1.209 ± 0.17
MetGlu: 1.649 ± 0.208
MetPhe: 1.814 ± 0.236
MetGly: 0.88 ± 0.138
MetHis: 0.495 ± 0.122
MetIle: 2.171 ± 0.29
MetLys: 1.759 ± 0.327
MetLeu: 3.106 ± 0.339
MetMet: 1.402 ± 0.524
MetAsn: 1.732 ± 0.196
MetPro: 1.017 ± 0.18
MetGln: 0.742 ± 0.155
MetArg: 1.017 ± 0.146
MetSer: 2.419 ± 0.226
MetThr: 1.154 ± 0.175
MetVal: 1.072 ± 0.194
MetTrp: 0.22 ± 0.074
MetTyr: 1.732 ± 0.22
MetXaa: 0.0 ± 0.0
Asn
AsnAla: 3.711 ± 0.337
AsnCys: 1.484 ± 0.197
AsnAsp: 6.954 ± 0.582
AsnGlu: 6.074 ± 0.473
AsnPhe: 3.958 ± 0.331
AsnGly: 2.804 ± 0.24
AsnHis: 1.457 ± 0.19
AsnIle: 6.954 ± 0.493
AsnLys: 6.816 ± 0.518
AsnLeu: 6.377 ± 0.389
AsnMet: 1.814 ± 0.205
AsnAsn: 8.63 ± 0.7
AsnPro: 1.759 ± 0.227
AsnGln: 1.759 ± 0.267
AsnArg: 3.793 ± 0.294
AsnSer: 5.305 ± 0.463
AsnThr: 4.453 ± 0.296
AsnVal: 7.558 ± 0.446
AsnTrp: 0.44 ± 0.086
AsnTyr: 3.903 ± 0.311
AsnXaa: 0.027 ± 0.025
Pro
ProAla: 1.567 ± 0.205
ProCys: 0.495 ± 0.152
ProAsp: 1.924 ± 0.227
ProGlu: 1.402 ± 0.215
ProPhe: 1.704 ± 0.2
ProGly: 0.989 ± 0.182
ProHis: 0.687 ± 0.151
ProIle: 2.226 ± 0.279
ProLys: 1.319 ± 0.18
ProLeu: 3.381 ± 0.299
ProMet: 0.715 ± 0.141
ProAsn: 2.171 ± 0.234
ProPro: 2.419 ± 0.646
ProGln: 1.237 ± 0.184
ProArg: 1.374 ± 0.239
ProSer: 2.749 ± 0.289
ProThr: 2.364 ± 0.253
ProVal: 1.814 ± 0.277
ProTrp: 0.165 ± 0.064
ProTyr: 1.237 ± 0.19
ProXaa: 0.0 ± 0.0
Gln
GlnAla: 0.797 ± 0.139
GlnCys: 1.099 ± 0.194
GlnAsp: 1.099 ± 0.158
GlnGlu: 1.154 ± 0.224
GlnPhe: 1.979 ± 0.241
GlnGly: 0.77 ± 0.264
GlnHis: 1.704 ± 0.296
GlnIle: 2.364 ± 0.256
GlnLys: 1.869 ± 0.197
GlnLeu: 4.123 ± 0.344
GlnMet: 1.099 ± 0.16
GlnAsn: 2.391 ± 0.272
GlnPro: 1.072 ± 0.178
GlnGln: 2.501 ± 0.4
GlnArg: 1.567 ± 0.205
GlnSer: 2.226 ± 0.245
GlnThr: 1.732 ± 0.209
GlnVal: 1.182 ± 0.173
GlnTrp: 0.275 ± 0.087
GlnTyr: 1.814 ± 0.257
GlnXaa: 0.0 ± 0.0
Arg
ArgAla: 2.006 ± 0.241
ArgCys: 1.402 ± 0.22
ArgAsp: 2.556 ± 0.254
ArgGlu: 2.061 ± 0.244
ArgPhe: 2.226 ± 0.256
ArgGly: 1.512 ± 0.183
ArgHis: 0.935 ± 0.168
ArgIle: 3.436 ± 0.373
ArgLys: 3.353 ± 0.327
ArgLeu: 4.37 ± 0.362
ArgMet: 1.457 ± 0.212
ArgAsn: 3.436 ± 0.283
ArgPro: 1.292 ± 0.201
ArgGln: 2.309 ± 0.28
ArgArg: 2.886 ± 0.512
ArgSer: 2.996 ± 0.439
ArgThr: 1.979 ± 0.209
ArgVal: 2.089 ± 0.25
ArgTrp: 0.22 ± 0.074
ArgTyr: 2.089 ± 0.226
ArgXaa: 0.0 ± 0.0
Ser
SerAla: 3.271 ± 0.332
SerCys: 1.429 ± 0.183
SerAsp: 3.601 ± 0.342
SerGlu: 3.023 ± 0.266
SerPhe: 3.683 ± 0.347
SerGly: 2.391 ± 0.225
SerHis: 1.099 ± 0.15
SerIle: 4.947 ± 0.421
SerLys: 4.59 ± 0.386
SerLeu: 5.964 ± 0.403
SerMet: 1.951 ± 0.276
SerAsn: 4.947 ± 0.398
SerPro: 2.006 ± 0.283
SerGln: 2.226 ± 0.248
SerArg: 2.749 ± 0.309
SerSer: 6.404 ± 0.732
SerThr: 4.48 ± 0.357
SerVal: 4.892 ± 0.43
SerTrp: 0.467 ± 0.106
SerTyr: 2.474 ± 0.246
SerXaa: 0.0 ± 0.0
Thr
ThrAla: 3.106 ± 0.312
ThrCys: 0.907 ± 0.157
ThrAsp: 3.161 ± 0.285
ThrGlu: 2.309 ± 0.31
ThrPhe: 3.078 ± 0.297
ThrGly: 1.759 ± 0.215
ThrHis: 0.989 ± 0.15
ThrIle: 4.563 ± 0.368
ThrLys: 3.463 ± 0.342
ThrLeu: 5.992 ± 0.448
ThrMet: 1.539 ± 0.196
ThrAsn: 4.645 ± 0.343
ThrPro: 2.639 ± 0.237
ThrGln: 1.512 ± 0.196
ThrArg: 2.391 ± 0.263
ThrSer: 4.068 ± 0.402
ThrThr: 4.233 ± 0.46
ThrVal: 3.765 ± 0.324
ThrTrp: 0.302 ± 0.081
ThrTyr: 1.787 ± 0.24
ThrXaa: 0.0 ± 0.0
Val
ValAla: 3.271 ± 0.287
ValCys: 1.594 ± 0.229
ValAsp: 4.81 ± 0.336
ValGlu: 3.848 ± 0.301
ValPhe: 2.996 ± 0.31
ValGly: 2.144 ± 0.262
ValHis: 1.649 ± 0.202
ValIle: 3.381 ± 0.302
ValLys: 4.837 ± 0.435
ValLeu: 5.36 ± 0.404
ValMet: 1.677 ± 0.23
ValAsn: 5.14 ± 0.38
ValPro: 2.336 ± 0.26
ValGln: 2.089 ± 0.245
ValArg: 2.309 ± 0.24
ValSer: 4.205 ± 0.306
ValThr: 3.408 ± 0.306
ValVal: 4.947 ± 0.46
ValTrp: 0.77 ± 0.126
ValTyr: 3.82 ± 0.325
ValXaa: 0.0 ± 0.0
Trp
TrpAla: 0.302 ± 0.1
TrpCys: 0.165 ± 0.06
TrpAsp: 0.495 ± 0.105
TrpGlu: 0.467 ± 0.133
TrpPhe: 0.33 ± 0.1
TrpGly: 0.275 ± 0.119
TrpHis: 0.22 ± 0.077
TrpIle: 0.55 ± 0.13
TrpLys: 0.577 ± 0.112
TrpLeu: 0.385 ± 0.106
TrpMet: 0.192 ± 0.071
TrpAsn: 0.66 ± 0.117
TrpPro: 0.165 ± 0.061
TrpGln: 0.247 ± 0.078
TrpArg: 0.412 ± 0.109
TrpSer: 0.44 ± 0.13
TrpThr: 0.412 ± 0.116
TrpVal: 0.22 ± 0.083
TrpTrp: 0.165 ± 0.085
TrpTyr: 0.55 ± 0.122
TrpXaa: 0.0 ± 0.0
Tyr
TyrAla: 1.732 ± 0.208
TyrCys: 0.88 ± 0.16
TyrAsp: 3.271 ± 0.363
TyrGlu: 2.941 ± 0.262
TyrPhe: 2.364 ± 0.226
TyrGly: 1.512 ± 0.208
TyrHis: 0.88 ± 0.16
TyrIle: 4.315 ± 0.322
TyrLys: 4.068 ± 0.386
TyrLeu: 3.573 ± 0.29
TyrMet: 1.759 ± 0.212
TyrAsn: 4.508 ± 0.391
TyrPro: 0.962 ± 0.153
TyrGln: 1.099 ± 0.178
TyrArg: 2.199 ± 0.207
TyrSer: 2.831 ± 0.27
TyrThr: 3.243 ± 0.378
TyrVal: 3.491 ± 0.315
TyrTrp: 0.22 ± 0.07
TyrTyr: 3.161 ± 0.279
TyrXaa: 0.027 ± 0.027
Xaa
XaaAla: 0.0 ± 0.0
XaaCys: 0.0 ± 0.0
XaaAsp: 0.0 ± 0.0
XaaGlu: 0.0 ± 0.0
XaaPhe: 0.0 ± 0.0
XaaGly: 0.0 ± 0.0
XaaHis: 0.0 ± 0.0
XaaIle: 0.027 ± 0.025
XaaLys: 0.0 ± 0.0
XaaLeu: 0.0 ± 0.0
XaaMet: 0.0 ± 0.0
XaaAsn: 0.0 ± 0.0
XaaPro: 0.0 ± 0.0
XaaGln: 0.0 ± 0.0
XaaArg: 0.0 ± 0.0
XaaSer: 0.055 ± 0.039
XaaThr: 0.027 ± 0.027
XaaVal: 0.0 ± 0.0
XaaTrp: 0.0 ± 0.0
XaaTyr: 0.0 ± 0.0
XaaXaa: 0.0 ± 0.0
Statistics based on 131 proteins (36384 amino acids)

Note: The error was estimated by bootstrapping (×100) at the protein level.
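A protein-level bootstrap of this kind can be sketched as follows. This is only an illustration under stated assumptions, not the Proteome-pI implementation: whole proteins are resampled with replacement 100 times, the dipeptide frequency is recomputed for each resample, and the sample standard deviation of those estimates is reported as the error. The function names, the toy sequences, the fixed seed, and the choice of the sample standard deviation are all hypothetical.

```python
import random
import statistics
from collections import Counter

def pair_per_mille(proteins, pair):
    """Frequency of one dipeptide, in per mille, over overlapping residue pairs."""
    counts = Counter()
    for seq in proteins:
        counts.update(seq[i:i + 2] for i in range(len(seq) - 1))
    total = sum(counts.values())
    return 1000.0 * counts[pair] / total if total else 0.0

def bootstrap_error(proteins, pair, n_boot=100, seed=0):
    """Standard deviation of the frequency across protein-level resamples."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    estimates = []
    for _ in range(n_boot):
        # Resample whole proteins with replacement, keeping the proteome size.
        sample = rng.choices(proteins, k=len(proteins))
        estimates.append(pair_per_mille(sample, pair))
    return statistics.stdev(estimates)

# Hypothetical toy proteome.
toy = ["MKLLNA", "MKKAL", "LLLLKA", "MNNKLL"]
point = pair_per_mille(toy, "LL")   # point estimate on the full set
error = bootstrap_error(toy, "LL")  # bootstrap error, reported as point ± error
```

Resampling proteins rather than individual dipeptides keeps each sequence intact, so the error reflects variation between proteins, for example when one long protein contributes most occurrences of a given dipeptide.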

The dipeptide statistics above (among other statistics for this proteome) can be downloaded from this CSV file
See this proteome in: uniprot_link
Proteome-pI is available under the Creative Commons Attribution-NoDerivs license; for more details see here

Reference: Kozlowski LP. Proteome-pI 2.0: Proteome Isoelectric Point Database Update. Nucleic Acids Res. 2021, doi: 10.1093/nar/gkab944
Contact: Lukasz P. Kozlowski