Amino acid dipeptide frequency for Mycobacterium virus Optimus

Apart from single amino acid frequencies, one can also calculate the so-called amino acid dipeptide frequency, i.e. how often each ordered pair of adjacent amino acids occurs in the proteome.
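As an illustration of the idea, dipeptide frequencies can be obtained by sliding a window of length two along each protein sequence. The sketch below is not the code used by Proteome-pI; the function name `dipeptide_per_mille`, the use of one-letter codes, the mapping of non-standard letters to `X` (Xaa), and the normalization by the total number of counted dipeptides are all assumptions made for this example.

```python
from collections import Counter

# The 20 standard amino acids in one-letter code (the table uses three-letter codes).
STANDARD = set("ACDEFGHIKLMNPQRSTVWY")


def dipeptide_per_mille(proteins):
    """Return {dipeptide: frequency in per mille} pooled over all proteins.

    Hypothetical helper: normalization by the number of counted
    dipeptides is an assumption, not the documented database method.
    """
    counts = Counter()
    for seq in proteins:
        # Map anything outside the 20 standard letters to X (Xaa).
        seq = "".join(aa if aa in STANDARD else "X" for aa in seq.upper())
        # Overlapping windows of length 2; a pair never spans two proteins.
        counts.update(seq[i:i + 2] for i in range(len(seq) - 1))
    total = sum(counts.values())
    if total == 0:
        return {}
    return {pair: 1000.0 * n / total for pair, n in counts.items()}


# Toy input (not real proteome data): "U" is non-standard and becomes "X".
freqs = dipeptide_per_mille(["MAAGL", "AAU"])
```

With the toy input, `AA` occurs twice among six dipeptides, so `freqs["AA"]` is one third expressed in per mille.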

There are 400 possible dipeptides (441 if Xaa is counted as an additional 21st amino acid representing all non-standard or unknown amino acids). Thus, if dipeptides occurred randomly in proteins, each of the 400 dipeptides would be present with a frequency of 0.25%. As this is not the case in nature, for better visibility, dipeptides that are more frequent than expected are marked in red in the table, and those that are underrepresented are marked in blue.

All values are given in per mille (‰) and therefore need to be multiplied by 10⁻³ to obtain fractions.
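The arithmetic behind the expected value and the unit conversion can be sketched as follows. The uniform expectation of 1 in 400 equals 0.25%, which is 2.5‰ in the units of the table. The names `EXPECTED_PER_MILLE`, `per_mille_to_fraction`, and `classify` are hypothetical, and treating the uniform expectation as the exact threshold for the red/blue marking is an assumption.

```python
N_STANDARD = 20  # standard amino acids, giving 20 * 20 = 400 dipeptides

# Uniform expectation: 1/400 = 0.25 % = 2.5 per mille.
EXPECTED_PER_MILLE = 1000.0 / N_STANDARD**2


def per_mille_to_fraction(value):
    """Convert a per mille value to a plain fraction (multiply by 10^-3)."""
    return value * 1e-3


def classify(value, expected=EXPECTED_PER_MILLE):
    """Label a per mille frequency relative to the uniform expectation."""
    if value > expected:
        return "over"   # would be marked in red
    if value < expected:
        return "under"  # would be marked in blue
    return "expected"


# Two values taken from the table: AlaAla and CysCys.
ala_ala = classify(9.91)
cys_cys = classify(0.146)
```

Here AlaAla (9.91‰) falls above the 2.5‰ expectation and CysCys (0.146‰) falls below it.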

For more information, see the sequence space article on Wikipedia.

| First / Second | Ala | Cys | Asp | Glu | Phe | Gly | His | Ile | Lys | Leu | Met | Asn | Pro | Gln | Arg | Ser | Thr | Val | Trp | Tyr | Xaa |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Ala | 9.91 ± 0.767 | 1.428 ± 0.281 | 5.917 ± 0.534 | 7.228 ± 0.506 | 3.818 ± 0.395 | 7.403 ± 0.676 | 2.303 ± 0.307 | 5.101 ± 0.414 | 3.993 ± 0.361 | 8.336 ± 0.528 | 3.177 ± 0.358 | 3.148 ± 0.356 | 3.906 ± 0.322 | 2.711 ± 0.338 | 6.558 ± 0.41 | 5.305 ± 0.46 | 5.217 ± 0.429 | 6.296 ± 0.499 | 1.953 ± 0.224 | 2.477 ± 0.261 | 0.0 ± 0.0 |
| Cys | 1.049 ± 0.239 | 0.146 ± 0.068 | 1.282 ± 0.247 | 1.137 ± 0.191 | 0.379 ± 0.112 | 1.807 ± 0.249 | 0.379 ± 0.107 | 0.466 ± 0.126 | 0.874 ± 0.228 | 1.166 ± 0.25 | 0.262 ± 0.093 | 0.7 ± 0.163 | 0.816 ± 0.17 | 0.408 ± 0.109 | 1.137 ± 0.218 | 0.67 ± 0.142 | 0.787 ± 0.156 | 0.933 ± 0.16 | 0.262 ± 0.096 | 0.525 ± 0.127 | 0.0 ± 0.0 |
| Asp | 5.771 ± 0.359 | 1.166 ± 0.23 | 3.847 ± 0.357 | 4.693 ± 0.558 | 1.894 ± 0.227 | 6.937 ± 0.535 | 1.865 ± 0.253 | 3.264 ± 0.278 | 2.303 ± 0.291 | 5.071 ± 0.348 | 1.078 ± 0.169 | 2.303 ± 0.253 | 3.964 ± 0.325 | 1.953 ± 0.222 | 3.789 ± 0.318 | 3.002 ± 0.298 | 2.944 ± 0.297 | 3.964 ± 0.357 | 1.982 ± 0.246 | 2.332 ± 0.251 | 0.0 ± 0.0 |
| Glu | 6.616 ± 0.48 | 1.137 ± 0.193 | 4.11 ± 0.37 | 4.867 ± 0.508 | 2.769 ± 0.248 | 4.284 ± 0.34 | 1.341 ± 0.159 | 3.672 ± 0.33 | 2.769 ± 0.4 | 6.354 ± 0.534 | 2.128 ± 0.309 | 1.72 ± 0.22 | 3.527 ± 0.411 | 2.711 ± 0.258 | 4.955 ± 0.396 | 3.41 ± 0.353 | 3.352 ± 0.326 | 4.255 ± 0.42 | 1.428 ± 0.249 | 2.477 ± 0.32 | 0.0 ± 0.0 |
| Phe | 2.856 ± 0.273 | 0.583 ± 0.136 | 2.419 ± 0.307 | 2.273 ± 0.238 | 1.282 ± 0.204 | 3.002 ± 0.396 | 0.67 ± 0.172 | 1.137 ± 0.179 | 1.399 ± 0.175 | 1.924 ± 0.241 | 0.816 ± 0.144 | 1.516 ± 0.238 | 2.128 ± 0.283 | 0.845 ± 0.18 | 1.807 ± 0.271 | 1.982 ± 0.223 | 2.011 ± 0.208 | 2.215 ± 0.245 | 0.612 ± 0.146 | 0.729 ± 0.145 | 0.0 ± 0.0 |
| Gly | 7.432 ± 0.602 | 1.341 ± 0.236 | 5.509 ± 0.465 | 5.246 ± 0.393 | 3.177 ± 0.344 | 9.268 ± 1.541 | 2.186 ± 0.235 | 4.343 ± 0.451 | 3.876 ± 0.378 | 7.432 ± 0.568 | 2.303 ± 0.275 | 3.031 ± 0.307 | 3.789 ± 0.304 | 2.565 ± 0.347 | 5.8 ± 0.447 | 5.275 ± 0.484 | 5.13 ± 0.565 | 5.071 ± 0.411 | 1.982 ± 0.246 | 3.002 ± 0.234 | 0.0 ± 0.0 |
| His | 1.603 ± 0.236 | 0.525 ± 0.128 | 1.69 ± 0.24 | 1.603 ± 0.229 | 0.495 ± 0.135 | 2.39 ± 0.238 | 0.874 ± 0.148 | 0.933 ± 0.189 | 0.787 ± 0.158 | 2.448 ± 0.305 | 0.583 ± 0.122 | 0.729 ± 0.144 | 1.312 ± 0.185 | 0.904 ± 0.146 | 1.836 ± 0.201 | 0.904 ± 0.176 | 0.816 ± 0.184 | 1.195 ± 0.213 | 0.729 ± 0.152 | 0.7 ± 0.149 | 0.0 ± 0.0 |
| Ile | 5.567 ± 0.468 | 0.554 ± 0.129 | 3.789 ± 0.361 | 3.789 ± 0.342 | 1.049 ± 0.21 | 3.585 ± 0.366 | 1.253 ± 0.208 | 2.011 ± 0.276 | 1.632 ± 0.195 | 3.468 ± 0.378 | 0.729 ± 0.155 | 1.661 ± 0.216 | 3.002 ± 0.273 | 1.516 ± 0.178 | 2.944 ± 0.301 | 2.273 ± 0.241 | 2.798 ± 0.326 | 3.264 ± 0.33 | 0.874 ± 0.146 | 1.312 ± 0.181 | 0.0 ± 0.0 |
| Lys | 4.314 ± 0.402 | 0.729 ± 0.158 | 2.099 ± 0.228 | 2.157 ± 0.313 | 1.195 ± 0.156 | 3.031 ± 0.296 | 1.195 ± 0.193 | 1.69 ± 0.24 | 2.215 ± 0.323 | 3.935 ± 0.356 | 1.428 ± 0.186 | 0.816 ± 0.142 | 2.885 ± 0.305 | 1.253 ± 0.178 | 2.769 ± 0.338 | 1.836 ± 0.238 | 2.215 ± 0.252 | 3.089 ± 0.378 | 0.933 ± 0.149 | 1.399 ± 0.222 | 0.0 ± 0.0 |
| Leu | 8.54 ± 0.508 | 0.7 ± 0.126 | 5.421 ± 0.378 | 5.071 ± 0.399 | 2.39 ± 0.231 | 6.616 ± 0.488 | 1.836 ± 0.255 | 3.148 ± 0.319 | 3.41 ± 0.274 | 6.121 ± 0.5 | 1.836 ± 0.228 | 3.235 ± 0.261 | 4.634 ± 0.394 | 2.885 ± 0.268 | 5.509 ± 0.412 | 5.188 ± 0.353 | 5.334 ± 0.344 | 5.217 ± 0.387 | 1.457 ± 0.193 | 2.069 ± 0.258 | 0.0 ± 0.0 |
| Met | 2.244 ± 0.242 | 0.233 ± 0.093 | 1.399 ± 0.175 | 1.224 ± 0.172 | 0.641 ± 0.118 | 1.924 ± 0.281 | 0.408 ± 0.108 | 1.282 ± 0.194 | 1.224 ± 0.181 | 1.428 ± 0.198 | 0.554 ± 0.135 | 1.137 ± 0.195 | 1.166 ± 0.198 | 0.554 ± 0.123 | 1.428 ± 0.22 | 2.652 ± 0.265 | 2.273 ± 0.324 | 1.049 ± 0.198 | 0.583 ± 0.129 | 0.408 ± 0.084 | 0.0 ± 0.0 |
| Asn | 3.41 ± 0.359 | 0.379 ± 0.119 | 1.865 ± 0.231 | 1.632 ± 0.218 | 1.195 ± 0.178 | 3.672 ± 0.49 | 0.962 ± 0.171 | 1.486 ± 0.276 | 1.545 ± 0.247 | 2.507 ± 0.289 | 0.525 ± 0.137 | 0.758 ± 0.128 | 2.681 ± 0.289 | 0.816 ± 0.183 | 2.39 ± 0.245 | 1.836 ± 0.245 | 1.661 ± 0.229 | 2.273 ± 0.25 | 0.904 ± 0.183 | 0.758 ± 0.158 | 0.0 ± 0.0 |
| Pro | 5.246 ± 0.411 | 0.845 ± 0.171 | 3.702 ± 0.307 | 4.722 ± 0.384 | 1.778 ± 0.251 | 5.713 ± 0.586 | 0.962 ± 0.194 | 2.536 ± 0.241 | 2.244 ± 0.272 | 4.226 ± 0.349 | 0.991 ± 0.15 | 2.011 ± 0.28 | 3.264 ± 0.416 | 1.108 ± 0.19 | 3.468 ± 0.32 | 2.798 ± 0.311 | 2.944 ± 0.301 | 4.547 ± 0.341 | 1.224 ± 0.205 | 1.486 ± 0.2 | 0.0 ± 0.0 |
| Gln | 3.41 ± 0.292 | 0.437 ± 0.114 | 1.166 ± 0.176 | 1.807 ± 0.26 | 1.137 ± 0.206 | 2.303 ± 0.291 | 0.612 ± 0.136 | 1.574 ± 0.188 | 1.778 ± 0.299 | 2.244 ± 0.267 | 0.962 ± 0.154 | 0.874 ± 0.183 | 2.099 ± 0.241 | 1.37 ± 0.23 | 2.099 ± 0.262 | 1.924 ± 0.247 | 1.807 ± 0.279 | 2.215 ± 0.271 | 0.612 ± 0.096 | 0.729 ± 0.12 | 0.0 ± 0.0 |
| Arg | 6.296 ± 0.505 | 1.078 ± 0.219 | 4.372 ± 0.322 | 4.518 ± 0.444 | 2.128 ± 0.227 | 5.363 ± 0.394 | 1.516 ± 0.21 | 3.498 ± 0.392 | 2.769 ± 0.323 | 4.722 ± 0.439 | 1.778 ± 0.261 | 1.865 ± 0.211 | 3.498 ± 0.36 | 2.332 ± 0.274 | 5.275 ± 0.48 | 3.235 ± 0.298 | 3.264 ± 0.357 | 5.13 ± 0.35 | 1.982 ± 0.249 | 2.39 ± 0.313 | 0.0 ± 0.0 |
| Ser | 5.917 ± 0.477 | 0.729 ± 0.162 | 3.41 ± 0.323 | 3.789 ± 0.399 | 1.894 ± 0.237 | 5.713 ± 0.616 | 1.108 ± 0.199 | 2.827 ± 0.292 | 1.982 ± 0.234 | 4.372 ± 0.354 | 1.37 ± 0.152 | 2.099 ± 0.345 | 2.769 ± 0.3 | 1.574 ± 0.199 | 3.527 ± 0.357 | 3.381 ± 0.371 | 2.827 ± 0.297 | 3.76 ± 0.397 | 1.69 ± 0.229 | 1.486 ± 0.202 | 0.0 ± 0.0 |
| Thr | 4.547 ± 0.4 | 1.049 ± 0.187 | 2.973 ± 0.296 | 3.352 ± 0.281 | 1.953 ± 0.28 | 5.188 ± 0.436 | 1.02 ± 0.175 | 2.827 ± 0.333 | 1.778 ± 0.286 | 5.217 ± 0.35 | 0.904 ± 0.157 | 1.72 ± 0.252 | 4.343 ± 0.396 | 1.516 ± 0.227 | 2.681 ± 0.263 | 3.264 ± 0.357 | 3.235 ± 0.381 | 4.518 ± 0.421 | 1.778 ± 0.279 | 1.953 ± 0.29 | 0.0 ± 0.0 |
| Val | 6.47 ± 0.464 | 0.904 ± 0.161 | 5.713 ± 0.476 | 5.334 ± 0.396 | 1.836 ± 0.219 | 4.984 ± 0.436 | 1.428 ± 0.227 | 3.264 ± 0.273 | 2.769 ± 0.262 | 4.867 ± 0.404 | 1.312 ± 0.2 | 2.477 ± 0.255 | 3.498 ± 0.305 | 2.419 ± 0.262 | 4.43 ± 0.383 | 4.284 ± 0.386 | 4.284 ± 0.417 | 5.042 ± 0.426 | 0.874 ± 0.17 | 2.157 ± 0.264 | 0.0 ± 0.0 |
| Trp | 1.865 ± 0.224 | 0.612 ± 0.156 | 1.312 ± 0.174 | 1.282 ± 0.185 | 0.612 ± 0.145 | 1.516 ± 0.184 | 0.641 ± 0.152 | 1.049 ± 0.151 | 0.7 ± 0.163 | 2.069 ± 0.248 | 0.612 ± 0.145 | 0.67 ± 0.17 | 1.312 ± 0.187 | 0.641 ± 0.132 | 1.894 ± 0.242 | 1.807 ± 0.206 | 1.312 ± 0.179 | 2.011 ± 0.256 | 0.816 ± 0.14 | 0.641 ± 0.164 | 0.0 ± 0.0 |
| Tyr | 3.119 ± 0.29 | 0.612 ± 0.147 | 2.128 ± 0.257 | 2.157 ± 0.253 | 0.583 ± 0.128 | 3.089 ± 0.277 | 0.495 ± 0.132 | 0.962 ± 0.165 | 1.137 ± 0.178 | 2.711 ± 0.259 | 0.437 ± 0.12 | 0.787 ± 0.147 | 1.224 ± 0.216 | 1.078 ± 0.168 | 2.711 ± 0.294 | 1.195 ± 0.154 | 1.603 ± 0.226 | 2.303 ± 0.303 | 0.7 ± 0.152 | 0.904 ± 0.18 | 0.0 ± 0.0 |
| Xaa | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 |
Statistics based on 230 proteins (34311 amino acids)

Note: The error was estimated by bootstrapping (×100) at the protein level.
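A protein-level bootstrap of this kind can be sketched as below: whole proteins are resampled with replacement, the frequency is recomputed for each resample, and the spread of the resampled values serves as the error. This is a minimal illustration, not the database's implementation. The function names, the fixed seed, the toy sequences, and the choice of the sample standard deviation as the reported error are assumptions; the source states only that bootstrapping was done 100 times at the protein level.

```python
import random
import statistics


def pair_per_mille(proteins, pair):
    """Per mille frequency of one dipeptide, pooled over the given proteins."""
    count = 0
    total = 0
    for seq in proteins:
        windows = [seq[i:i + 2] for i in range(len(seq) - 1)]
        count += windows.count(pair)
        total += len(windows)
    return 1000.0 * count / total if total else 0.0


def bootstrap_error(proteins, pair, n_rounds=100, seed=0):
    """Estimate the error of a dipeptide frequency by resampling proteins.

    Assumption: the error is taken as the sample standard deviation
    over the bootstrap rounds.
    """
    rng = random.Random(seed)  # fixed seed only to make the sketch reproducible
    estimates = []
    for _ in range(n_rounds):
        # Resample whole proteins (not residues) with replacement.
        sample = rng.choices(proteins, k=len(proteins))
        estimates.append(pair_per_mille(sample, pair))
    return statistics.stdev(estimates)


# Toy sequences in one-letter code, not real proteome data.
toy_proteome = ["MAAGL", "AAKAA", "MKLV"]
toy_error = bootstrap_error(toy_proteome, "AA")
```

Resampling at the protein level keeps each sequence intact, so the estimate reflects variation between proteins rather than between individual residues.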

The dipeptide statistics above (among other statistics for this proteome) can be downloaded from this CSV file
See this proteome in: uniprot_link
Proteome-pI is available under the Creative Commons Attribution-NoDerivs license; for more details see here

Reference: Kozlowski LP. Proteome-pI 2.0: Proteome Isoelectric Point Database Update. Nucleic Acids Res. 2021, doi: 10.1093/nar/gkab944
Contact: Lukasz P. Kozlowski