Amino acid dipeptide frequency for Mycobacterium virus Yoshi

Apart from single amino acid frequencies, one can also calculate the so-called amino acid dipeptide frequency, that is, how often each ordered pair of adjacent residues occurs.

There are 400 possible dipeptides (441 if Xaa is counted as an additional, 21st amino acid representing all non-standard or unknown amino acids). Thus, if dipeptides occurred randomly and uniformly in proteins, each one would be present at a frequency of 0.25% (2.5‰). As this is not the case in nature, for better visibility dipeptides that are more frequent than expected are marked in red in the table, and those that are underrepresented are marked in blue.
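As a minimal sketch of how such frequencies could be computed (this is not the Proteome-pI code; the names `dipeptide_permille` and `classify` are hypothetical), the following counts overlapping residue pairs within each protein and reports them in per mille. It assumes one-letter sequence codes, maps any non-standard letter to `X` (Xaa), and normalizes by the total number of counted pairs; the database may normalize differently. The uniform expectation of 1000/400 = 2.5‰ is used as the baseline for over- and underrepresentation.

```python
from collections import Counter

STANDARD = set("ACDEFGHIKLMNPQRSTVWY")  # the 20 standard amino acids
UNIFORM_PERMILLE = 1000.0 / 400         # 2.5 per mille, i.e. 0.25 %


def dipeptide_permille(proteins):
    """Return {dipeptide: frequency in per mille} over overlapping pairs.

    Assumptions (not taken from the source): pairs are counted within
    each protein only, never across protein boundaries, and every
    non-standard letter is mapped to 'X' (Xaa).
    """
    counts = Counter()
    for seq in proteins:
        seq = "".join(c if c in STANDARD else "X" for c in seq.upper())
        counts.update(seq[i:i + 2] for i in range(len(seq) - 1))
    total = sum(counts.values())
    if total == 0:
        return {}
    return {pair: 1000.0 * n / total for pair, n in counts.items()}


def classify(freq_permille, expected=UNIFORM_PERMILLE):
    """Label a frequency relative to the uniform expectation."""
    if freq_permille > expected:
        return "over"
    if freq_permille < expected:
        return "under"
    return "as expected"


if __name__ == "__main__":
    # Toy input, not real proteome data.
    for pair, freq in sorted(dipeptide_permille(["AAAC", "GA"]).items()):
        print(pair, round(freq, 3), classify(freq))
```

With this baseline, AlaAla at 15.123‰ would be labeled "over" and AlaCys at 1.11‰ "under".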

All values are given in per mille (‰), so they need to be multiplied by 10^-3 to obtain fractions.
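For instance, the AlaAla value of 15.123‰ corresponds to a fraction of about 0.015123, or roughly 1.51%. A small sketch of the conversion (the helper names are hypothetical):

```python
def permille_to_fraction(value_permille):
    """Convert a per-mille value to a plain fraction (multiply by 10^-3)."""
    return value_permille * 1e-3


def permille_to_percent(value_permille):
    """Convert a per-mille value to percent (divide by 10)."""
    return value_permille / 10.0


if __name__ == "__main__":
    print(permille_to_fraction(15.123))  # AlaAla as a fraction
    print(permille_to_percent(2.5))      # uniform expectation in percent
```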

For more information, see the sequence space article on Wikipedia.

Ala Cys Asp Glu Phe Gly His Ile Lys Leu Met Asn Pro Gln Arg Ser Thr Val Trp Tyr Xaa
Ala
AlaAla: 15.123 ± 1.681
AlaCys: 1.11 ± 0.26
AlaAsp: 6.24 ± 0.497
AlaGlu: 8.619 ± 0.899
AlaPhe: 3.86 ± 0.419
AlaGly: 8.143 ± 1.249
AlaHis: 2.221 ± 0.356
AlaIle: 4.336 ± 0.448
AlaLys: 4.865 ± 0.428
AlaLeu: 9.36 ± 0.748
AlaMet: 3.596 ± 0.473
AlaAsn: 3.173 ± 0.429
AlaPro: 4.072 ± 0.482
AlaGln: 4.759 ± 0.415
AlaArg: 6.24 ± 0.766
AlaSer: 5.182 ± 0.504
AlaThr: 5.552 ± 0.752
AlaVal: 8.672 ± 0.652
AlaTrp: 2.432 ± 0.394
AlaTyr: 2.591 ± 0.366
AlaXaa: 0.0 ± 0.0
Cys
CysAla: 1.11 ± 0.293
CysCys: 0.317 ± 0.157
CysAsp: 0.952 ± 0.232
CysGlu: 0.952 ± 0.217
CysPhe: 0.264 ± 0.117
CysGly: 1.798 ± 0.392
CysHis: 0.317 ± 0.139
CysIle: 0.423 ± 0.19
CysLys: 0.476 ± 0.145
CysLeu: 0.74 ± 0.239
CysMet: 0.212 ± 0.107
CysAsn: 0.317 ± 0.12
CysPro: 0.74 ± 0.218
CysGln: 0.582 ± 0.222
CysArg: 1.163 ± 0.272
CysSer: 0.899 ± 0.266
CysThr: 0.899 ± 0.242
CysVal: 0.846 ± 0.206
CysTrp: 0.264 ± 0.094
CysTyr: 0.106 ± 0.073
CysXaa: 0.0 ± 0.0
Asp
AspAla: 6.557 ± 0.647
AspCys: 0.635 ± 0.189
AspAsp: 4.653 ± 0.571
AspGlu: 5.129 ± 0.666
AspPhe: 2.274 ± 0.316
AspGly: 5.658 ± 0.458
AspHis: 1.639 ± 0.327
AspIle: 2.75 ± 0.416
AspLys: 2.168 ± 0.356
AspLeu: 4.283 ± 0.488
AspMet: 1.586 ± 0.301
AspAsn: 2.009 ± 0.325
AspPro: 3.966 ± 0.407
AspGln: 2.274 ± 0.406
AspArg: 4.971 ± 0.505
AspSer: 2.538 ± 0.362
AspThr: 3.437 ± 0.45
AspVal: 4.442 ± 0.48
AspTrp: 1.639 ± 0.263
AspTyr: 1.851 ± 0.363
AspXaa: 0.0 ± 0.0
Glu
GluAla: 7.033 ± 0.547
GluCys: 1.005 ± 0.222
GluAsp: 3.913 ± 0.539
GluGlu: 3.173 ± 0.608
GluPhe: 2.115 ± 0.338
GluGly: 3.913 ± 0.584
GluHis: 1.428 ± 0.262
GluIle: 2.38 ± 0.409
GluLys: 2.062 ± 0.379
GluLeu: 5.235 ± 0.488
GluMet: 1.163 ± 0.247
GluAsn: 1.957 ± 0.305
GluPro: 3.543 ± 0.526
GluGln: 3.014 ± 0.396
GluArg: 4.019 ± 0.701
GluSer: 3.226 ± 0.43
GluThr: 2.644 ± 0.388
GluVal: 4.548 ± 0.442
GluTrp: 1.639 ± 0.271
GluTyr: 1.692 ± 0.362
GluXaa: 0.0 ± 0.0
Phe
PheAla: 2.908 ± 0.33
PheCys: 0.476 ± 0.145
PheAsp: 2.538 ± 0.326
PheGlu: 2.062 ± 0.319
PhePhe: 0.793 ± 0.187
PheGly: 2.855 ± 0.477
PheHis: 0.74 ± 0.172
PheIle: 1.481 ± 0.249
PheLys: 1.005 ± 0.277
PheLeu: 2.115 ± 0.322
PheMet: 0.952 ± 0.254
PheAsn: 0.952 ± 0.193
PhePro: 1.639 ± 0.27
PheGln: 0.846 ± 0.19
PheArg: 1.375 ± 0.294
PheSer: 2.062 ± 0.302
PheThr: 2.432 ± 0.401
PheVal: 1.692 ± 0.323
PheTrp: 0.687 ± 0.174
PheTyr: 0.529 ± 0.183
PheXaa: 0.0 ± 0.0
Gly
GlyAla: 8.989 ± 1.348
GlyCys: 0.899 ± 0.208
GlyAsp: 4.918 ± 0.476
GlyGlu: 4.495 ± 0.405
GlyPhe: 2.115 ± 0.321
GlyGly: 10.629 ± 1.955
GlyHis: 1.957 ± 0.284
GlyIle: 3.226 ± 0.405
GlyLys: 3.067 ± 0.399
GlyLeu: 6.504 ± 0.576
GlyMet: 1.745 ± 0.325
GlyAsn: 2.75 ± 0.422
GlyPro: 4.072 ± 0.427
GlyGln: 3.543 ± 0.5
GlyArg: 5.394 ± 0.394
GlySer: 4.971 ± 0.826
GlyThr: 5.975 ± 0.744
GlyVal: 7.244 ± 0.673
GlyTrp: 2.538 ± 0.427
GlyTyr: 2.38 ± 0.401
GlyXaa: 0.0 ± 0.0
His
HisAla: 1.639 ± 0.308
HisCys: 0.37 ± 0.131
HisAsp: 1.322 ± 0.309
HisGlu: 0.635 ± 0.206
HisPhe: 0.529 ± 0.151
HisGly: 1.957 ± 0.287
HisHis: 0.952 ± 0.245
HisIle: 0.899 ± 0.228
HisLys: 0.793 ± 0.184
HisLeu: 1.957 ± 0.363
HisMet: 0.317 ± 0.125
HisAsn: 0.74 ± 0.192
HisPro: 1.428 ± 0.292
HisGln: 0.846 ± 0.221
HisArg: 1.745 ± 0.28
HisSer: 1.269 ± 0.215
HisThr: 1.481 ± 0.273
HisVal: 1.375 ± 0.275
HisTrp: 0.635 ± 0.206
HisTyr: 1.11 ± 0.225
HisXaa: 0.0 ± 0.0
Ile
IleAla: 5.552 ± 0.585
IleCys: 0.899 ± 0.241
IleAsp: 3.913 ± 0.472
IleGlu: 2.961 ± 0.401
IlePhe: 0.74 ± 0.173
IleGly: 4.495 ± 0.495
IleHis: 1.163 ± 0.242
IleIle: 1.692 ± 0.319
IleLys: 1.428 ± 0.256
IleLeu: 1.798 ± 0.375
IleMet: 0.687 ± 0.192
IleAsn: 1.798 ± 0.335
IlePro: 2.38 ± 0.382
IleGln: 1.481 ± 0.276
IleArg: 3.596 ± 0.446
IleSer: 1.798 ± 0.295
IleThr: 3.331 ± 0.505
IleVal: 2.855 ± 0.347
IleTrp: 0.899 ± 0.211
IleTyr: 1.322 ± 0.292
IleXaa: 0.0 ± 0.0
Lys
LysAla: 4.283 ± 0.566
LysCys: 0.476 ± 0.161
LysAsp: 2.115 ± 0.357
LysGlu: 1.692 ± 0.358
LysPhe: 0.952 ± 0.194
LysGly: 2.591 ± 0.335
LysHis: 1.11 ± 0.242
LysIle: 1.428 ± 0.251
LysLys: 1.269 ± 0.248
LysLeu: 2.538 ± 0.378
LysMet: 0.74 ± 0.17
LysAsn: 0.74 ± 0.26
LysPro: 3.014 ± 0.339
LysGln: 1.798 ± 0.328
LysArg: 3.384 ± 0.476
LysSer: 1.851 ± 0.319
LysThr: 1.745 ± 0.32
LysVal: 2.538 ± 0.387
LysTrp: 0.899 ± 0.207
LysTyr: 1.163 ± 0.289
LysXaa: 0.0 ± 0.0
Leu
LeuAla: 9.73 ± 0.841
LeuCys: 0.793 ± 0.228
LeuAsp: 5.129 ± 0.544
LeuGlu: 3.067 ± 0.392
LeuPhe: 2.432 ± 0.381
LeuGly: 5.975 ± 0.7
LeuHis: 1.058 ± 0.231
LeuIle: 4.072 ± 0.517
LeuLys: 3.437 ± 0.44
LeuLeu: 6.187 ± 0.779
LeuMet: 1.11 ± 0.22
LeuAsn: 2.38 ± 0.392
LeuPro: 4.865 ± 0.535
LeuGln: 2.803 ± 0.402
LeuArg: 3.913 ± 0.47
LeuSer: 5.024 ± 0.534
LeuThr: 5.394 ± 0.495
LeuVal: 4.971 ± 0.541
LeuTrp: 0.74 ± 0.247
LeuTyr: 1.745 ± 0.257
LeuXaa: 0.0 ± 0.0
Met
MetAla: 2.538 ± 0.337
MetCys: 0.423 ± 0.155
MetAsp: 1.11 ± 0.204
MetGlu: 0.687 ± 0.184
MetPhe: 0.793 ± 0.227
MetGly: 1.745 ± 0.307
MetHis: 0.264 ± 0.102
MetIle: 0.899 ± 0.201
MetLys: 0.846 ± 0.241
MetLeu: 1.639 ± 0.249
MetMet: 0.476 ± 0.171
MetAsn: 0.793 ± 0.252
MetPro: 1.216 ± 0.244
MetGln: 0.529 ± 0.146
MetArg: 1.322 ± 0.273
MetSer: 2.115 ± 0.361
MetThr: 2.38 ± 0.329
MetVal: 1.269 ± 0.212
MetTrp: 0.635 ± 0.2
MetTyr: 0.423 ± 0.124
MetXaa: 0.0 ± 0.0
Asn
AsnAla: 3.49 ± 0.418
AsnCys: 0.476 ± 0.173
AsnAsp: 1.322 ± 0.257
AsnGlu: 1.957 ± 0.282
AsnPhe: 0.476 ± 0.135
AsnGly: 4.125 ± 0.458
AsnHis: 0.899 ± 0.25
AsnIle: 1.639 ± 0.266
AsnLys: 0.952 ± 0.194
AsnLeu: 2.644 ± 0.366
AsnMet: 0.423 ± 0.13
AsnAsn: 1.216 ± 0.22
AsnPro: 2.697 ± 0.461
AsnGln: 0.582 ± 0.172
AsnArg: 2.009 ± 0.349
AsnSer: 2.009 ± 0.276
AsnThr: 2.274 ± 0.323
AsnVal: 2.168 ± 0.248
AsnTrp: 0.37 ± 0.109
AsnTyr: 0.582 ± 0.176
AsnXaa: 0.0 ± 0.0
Pro
ProAla: 5.076 ± 0.539
ProCys: 0.899 ± 0.245
ProAsp: 4.442 ± 0.503
ProGlu: 4.177 ± 0.487
ProPhe: 1.798 ± 0.3
ProGly: 5.182 ± 0.633
ProHis: 1.058 ± 0.205
ProIle: 2.591 ± 0.362
ProLys: 2.009 ± 0.275
ProLeu: 2.908 ± 0.405
ProMet: 0.899 ± 0.183
ProAsn: 2.697 ± 0.427
ProPro: 4.125 ± 0.626
ProGln: 2.221 ± 0.496
ProArg: 2.961 ± 0.424
ProSer: 2.591 ± 0.358
ProThr: 3.913 ± 0.439
ProVal: 4.759 ± 0.591
ProTrp: 1.11 ± 0.25
ProTyr: 1.692 ± 0.245
ProXaa: 0.0 ± 0.0
Gln
GlnAla: 4.495 ± 0.525
GlnCys: 0.793 ± 0.258
GlnAsp: 1.481 ± 0.259
GlnGlu: 1.586 ± 0.31
GlnPhe: 1.11 ± 0.242
GlnGly: 2.697 ± 0.367
GlnHis: 1.216 ± 0.208
GlnIle: 1.798 ± 0.305
GlnLys: 1.904 ± 0.367
GlnLeu: 3.173 ± 0.384
GlnMet: 1.163 ± 0.285
GlnAsn: 1.216 ± 0.272
GlnPro: 2.538 ± 0.388
GlnGln: 1.957 ± 0.354
GlnArg: 2.644 ± 0.391
GlnSer: 2.115 ± 0.272
GlnThr: 1.745 ± 0.275
GlnVal: 2.485 ± 0.364
GlnTrp: 0.582 ± 0.189
GlnTyr: 1.11 ± 0.195
GlnXaa: 0.0 ± 0.0
Arg
ArgAla: 6.821 ± 0.688
ArgCys: 1.058 ± 0.252
ArgAsp: 4.072 ± 0.48
ArgGlu: 4.177 ± 0.592
ArgPhe: 2.009 ± 0.302
ArgGly: 4.495 ± 0.388
ArgHis: 1.163 ± 0.248
ArgIle: 3.913 ± 0.569
ArgLys: 2.908 ± 0.519
ArgLeu: 4.865 ± 0.544
ArgMet: 1.798 ± 0.302
ArgAsn: 1.745 ± 0.294
ArgPro: 3.49 ± 0.459
ArgGln: 3.014 ± 0.423
ArgArg: 5.552 ± 0.742
ArgSer: 4.125 ± 0.462
ArgThr: 3.226 ± 0.358
ArgVal: 4.019 ± 0.485
ArgTrp: 2.38 ± 0.308
ArgTyr: 1.798 ± 0.327
ArgXaa: 0.0 ± 0.0
Ser
SerAla: 6.028 ± 0.637
SerCys: 0.423 ± 0.147
SerAsp: 3.702 ± 0.494
SerGlu: 3.173 ± 0.452
SerPhe: 1.745 ± 0.377
SerGly: 6.663 ± 0.861
SerHis: 1.11 ± 0.207
SerIle: 2.75 ± 0.357
SerLys: 1.851 ± 0.32
SerLeu: 4.442 ± 0.534
SerMet: 1.322 ± 0.231
SerAsn: 2.274 ± 0.331
SerPro: 3.279 ± 0.423
SerGln: 1.904 ± 0.3
SerArg: 4.125 ± 0.569
SerSer: 3.331 ± 0.522
SerThr: 3.754 ± 0.433
SerVal: 3.913 ± 0.568
SerTrp: 1.11 ± 0.221
SerTyr: 0.952 ± 0.207
SerXaa: 0.0 ± 0.0
Thr
ThrAla: 6.769 ± 0.677
ThrCys: 0.793 ± 0.217
ThrAsp: 4.336 ± 0.445
ThrGlu: 3.596 ± 0.379
ThrPhe: 1.957 ± 0.398
ThrGly: 4.971 ± 0.685
ThrHis: 1.005 ± 0.174
ThrIle: 3.173 ± 0.549
ThrLys: 1.481 ± 0.256
ThrLeu: 4.918 ± 0.468
ThrMet: 1.375 ± 0.283
ThrAsn: 2.115 ± 0.327
ThrPro: 4.548 ± 0.522
ThrGln: 1.798 ± 0.315
ThrArg: 3.807 ± 0.452
ThrSer: 4.177 ± 0.527
ThrThr: 3.966 ± 0.57
ThrVal: 4.918 ± 0.564
ThrTrp: 1.586 ± 0.241
ThrTyr: 0.793 ± 0.196
ThrXaa: 0.0 ± 0.0
Val
ValAla: 8.408 ± 0.654
ValCys: 0.793 ± 0.194
ValAsp: 5.129 ± 0.556
ValGlu: 5.394 ± 0.535
ValPhe: 2.908 ± 0.452
ValGly: 4.918 ± 0.484
ValHis: 1.639 ± 0.293
ValIle: 3.173 ± 0.454
ValLys: 2.221 ± 0.271
ValLeu: 5.394 ± 0.676
ValMet: 1.216 ± 0.247
ValAsn: 1.957 ± 0.328
ValPro: 3.543 ± 0.41
ValGln: 2.274 ± 0.303
ValArg: 3.966 ± 0.506
ValSer: 5.658 ± 0.825
ValThr: 4.495 ± 0.543
ValVal: 6.663 ± 0.738
ValTrp: 1.375 ± 0.222
ValTyr: 1.957 ± 0.306
ValXaa: 0.0 ± 0.0
Trp
TrpAla: 1.269 ± 0.21
TrpCys: 0.317 ± 0.14
TrpAsp: 1.375 ± 0.232
TrpGlu: 0.846 ± 0.248
TrpPhe: 0.529 ± 0.146
TrpGly: 1.851 ± 0.369
TrpHis: 0.423 ± 0.141
TrpIle: 1.11 ± 0.228
TrpLys: 1.005 ± 0.203
TrpLeu: 2.168 ± 0.354
TrpMet: 1.005 ± 0.271
TrpAsn: 0.582 ± 0.146
TrpPro: 1.11 ± 0.22
TrpGln: 0.793 ± 0.175
TrpArg: 1.586 ± 0.31
TrpSer: 2.009 ± 0.33
TrpThr: 1.692 ± 0.367
TrpVal: 1.745 ± 0.311
TrpTrp: 0.899 ± 0.226
TrpTyr: 0.635 ± 0.168
TrpXaa: 0.0 ± 0.0
Tyr
TyrAla: 2.327 ± 0.379
TyrCys: 0.423 ± 0.149
TyrAsp: 1.798 ± 0.265
TyrGlu: 1.322 ± 0.247
TyrPhe: 0.74 ± 0.222
TyrGly: 2.75 ± 0.417
TyrHis: 0.635 ± 0.197
TyrIle: 1.058 ± 0.231
TyrLys: 0.582 ± 0.165
TyrLeu: 1.957 ± 0.341
TyrMet: 0.264 ± 0.119
TyrAsn: 0.899 ± 0.19
TyrPro: 0.899 ± 0.221
TyrGln: 0.793 ± 0.226
TyrArg: 2.855 ± 0.419
TyrSer: 0.899 ± 0.224
TyrThr: 1.745 ± 0.331
TyrVal: 1.957 ± 0.288
TyrTrp: 0.529 ± 0.155
TyrTyr: 0.476 ± 0.14
TyrXaa: 0.0 ± 0.0
Xaa
XaaAla: 0.0 ± 0.0
XaaCys: 0.0 ± 0.0
XaaAsp: 0.0 ± 0.0
XaaGlu: 0.0 ± 0.0
XaaPhe: 0.0 ± 0.0
XaaGly: 0.0 ± 0.0
XaaHis: 0.0 ± 0.0
XaaIle: 0.0 ± 0.0
XaaLys: 0.0 ± 0.0
XaaLeu: 0.0 ± 0.0
XaaMet: 0.0 ± 0.0
XaaAsn: 0.0 ± 0.0
XaaPro: 0.0 ± 0.0
XaaGln: 0.0 ± 0.0
XaaArg: 0.0 ± 0.0
XaaSer: 0.0 ± 0.0
XaaThr: 0.0 ± 0.0
XaaVal: 0.0 ± 0.0
XaaTrp: 0.0 ± 0.0
XaaTyr: 0.0 ± 0.0
XaaXaa: 0.0 ± 0.0
Statistics based on 116 proteins (18912 amino acids)

Note: The error was estimated by bootstrapping (100 replicates) at the protein level.
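A protein-level bootstrap can be sketched as follows: resample the proteins with replacement, recompute the frequency for each resample, and take the spread over the replicates. This is an illustration under assumptions, not the Proteome-pI implementation; the function names `pair_permille` and `bootstrap_error`, the fixed seed, the one-letter pair codes, and the use of the sample standard deviation as the reported error are all hypothetical choices.

```python
import random
import statistics


def pair_permille(proteins, pair):
    """Per-mille frequency of one dipeptide over overlapping pairs.

    Assumption: pairs are counted within each protein only and the
    frequency is normalized by the total number of counted pairs.
    """
    hits = total = 0
    for seq in proteins:
        for i in range(len(seq) - 1):
            total += 1
            hits += seq[i:i + 2] == pair
    return 1000.0 * hits / total if total else 0.0


def bootstrap_error(proteins, pair, n_rounds=100, seed=0):
    """Standard deviation of the frequency over protein-level resamples."""
    rng = random.Random(seed)
    replicates = []
    for _ in range(n_rounds):
        # Resample whole proteins with replacement, keeping the set size.
        sample = rng.choices(proteins, k=len(proteins))
        replicates.append(pair_permille(sample, pair))
    return statistics.stdev(replicates)


if __name__ == "__main__":
    # Toy sequences, not real proteome data.
    toy = ["AAAA", "CCCC", "AACC"]
    print(pair_permille(toy, "AA"), "±", bootstrap_error(toy, "AA"))
```

Resampling whole proteins rather than individual residues keeps each sequence intact, so the estimate reflects variation between proteins.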

The dipeptide statistics above (along with other statistics for this proteome) can be downloaded from this CSV file
See this proteome in: uniprot_link
Proteome-pI is available under a Creative Commons Attribution-NoDerivs license; for more details see here

Reference: Kozlowski LP. Proteome-pI 2.0: Proteome Isoelectric Point Database Update. Nucleic Acids Res. 2021, doi: 10.1093/nar/gkab944
Contact: Lukasz P. Kozlowski