Amino acid dipeptide frequency for Mycobacterium virus Marcell

Apart from single amino acid frequencies, one can also calculate so-called amino acid dipeptide frequencies.

There are 400 possible dipeptides (441 if Xaa is counted as an additional, 21st amino acid representing all non-standard or unknown amino acids). Thus, if dipeptides occurred randomly and with equal probability in proteins, each of the 400 dipeptides would be present at a frequency of 0.25% (2.5‰). As this is not the case in nature, for better visibility dipeptides that are more frequent than expected are marked in red in the table, and underrepresented ones are marked in blue.
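The comparison against the uniform expectation can be illustrated with a minimal Python sketch. It is not the database's own pipeline: the function names, the toy sequences, and the normalization (dividing by the total number of overlapping pairs) are assumptions made for illustration only.

```python
from collections import Counter

STANDARD = set("ACDEFGHIKLMNPQRSTVWY")  # 20 standard amino acids (one-letter codes)
EXPECTED_PER_MILLE = 1000.0 / 400       # uniform expectation: 2.5 per mille (0.25%)


def dipeptide_per_mille(sequences):
    """Return {dipeptide: frequency in per mille} over all given proteins."""
    counts = Counter()
    for seq in sequences:
        # Assumption: anything non-standard is mapped to X (Xaa).
        residues = [aa if aa in STANDARD else "X" for aa in seq.upper()]
        # Overlapping neighbours within one protein; pairs never span two proteins.
        counts.update(a + b for a, b in zip(residues, residues[1:]))
    total = sum(counts.values())
    if total == 0:
        return {}
    return {dp: 1000.0 * n / total for dp, n in counts.items()}


def classify(per_mille, expected=EXPECTED_PER_MILLE):
    """Label a frequency relative to the uniform expectation."""
    if per_mille > expected:
        return "over"
    if per_mille < expected:
        return "under"
    return "as expected"


# Toy input (hypothetical sequences, not taken from this proteome).
toy = ["AAAC", "CA"]
freqs = dipeptide_per_mille(toy)
for dp, value in sorted(freqs.items()):
    print(dp, round(value, 3), classify(value))
```

Applied to values from the table, `classify(12.964)` (AlaAla) returns "over" and `classify(0.651)` (AlaCys) returns "under", which would correspond to the red and blue markings respectively under this simple threshold.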

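All values are given in per mille (‰) and therefore need to be multiplied by 10^-3 to obtain fractions. For example, the AlaAla value of 12.964‰ corresponds to a fraction of 0.012964 (about 1.3%).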

For more information, see the sequence space article on Wikipedia.

Ala Cys Asp Glu Phe Gly His Ile Lys Leu Met Asn Pro Gln Arg Ser Thr Val Trp Tyr Xaa
Ala
12.964AlaAla: 12.964 ± 1.255
0.651AlaCys: 0.651 ± 0.181
6.906AlaAsp: 6.906 ± 0.685
6.059AlaGlu: 6.059 ± 0.688
2.997AlaPhe: 2.997 ± 0.505
8.469AlaGly: 8.469 ± 0.844
1.368AlaHis: 1.368 ± 0.358
4.3AlaIle: 4.3 ± 0.491
4.3AlaLys: 4.3 ± 0.545
9.251AlaLeu: 9.251 ± 0.98
2.215AlaMet: 2.215 ± 0.416
2.801AlaAsn: 2.801 ± 0.452
5.212AlaPro: 5.212 ± 0.715
2.671AlaGln: 2.671 ± 0.445
5.798AlaArg: 5.798 ± 0.508
4.886AlaSer: 4.886 ± 0.647
5.798AlaThr: 5.798 ± 0.636
8.99AlaVal: 8.99 ± 0.853
1.824AlaTrp: 1.824 ± 0.428
3.322AlaTyr: 3.322 ± 0.358
0.0AlaXaa: 0.0 ± 0.0
Cys
0.651CysAla: 0.651 ± 0.244
0.0CysCys: 0.0 ± 0.0
0.391CysAsp: 0.391 ± 0.132
0.717CysGlu: 0.717 ± 0.182
0.13CysPhe: 0.13 ± 0.076
0.521CysGly: 0.521 ± 0.223
0.13CysHis: 0.13 ± 0.097
0.261CysIle: 0.261 ± 0.133
0.195CysLys: 0.195 ± 0.118
0.391CysLeu: 0.391 ± 0.186
0.065CysMet: 0.065 ± 0.059
0.195CysAsn: 0.195 ± 0.104
0.195CysPro: 0.195 ± 0.108
0.195CysGln: 0.195 ± 0.118
0.326CysArg: 0.326 ± 0.132
0.261CysSer: 0.261 ± 0.135
0.326CysThr: 0.326 ± 0.141
0.261CysVal: 0.261 ± 0.115
0.261CysTrp: 0.261 ± 0.112
0.13CysTyr: 0.13 ± 0.09
0.0CysXaa: 0.0 ± 0.0
Asp
7.036AspAla: 7.036 ± 0.628
0.521AspCys: 0.521 ± 0.185
4.3AspAsp: 4.3 ± 0.504
3.518AspGlu: 3.518 ± 0.517
2.606AspPhe: 2.606 ± 0.362
5.928AspGly: 5.928 ± 0.608
1.303AspHis: 1.303 ± 0.311
2.801AspIle: 2.801 ± 0.427
2.932AspLys: 2.932 ± 0.427
7.231AspLeu: 7.231 ± 0.832
1.433AspMet: 1.433 ± 0.277
1.889AspAsn: 1.889 ± 0.39
4.691AspPro: 4.691 ± 0.58
1.564AspGln: 1.564 ± 0.286
3.453AspArg: 3.453 ± 0.418
3.127AspSer: 3.127 ± 0.385
3.909AspThr: 3.909 ± 0.397
4.691AspVal: 4.691 ± 0.515
1.954AspTrp: 1.954 ± 0.363
2.02AspTyr: 2.02 ± 0.301
0.0AspXaa: 0.0 ± 0.0
Glu
5.993GluAla: 5.993 ± 0.74
0.195GluCys: 0.195 ± 0.131
5.081GluAsp: 5.081 ± 0.569
4.691GluGlu: 4.691 ± 0.603
2.02GluPhe: 2.02 ± 0.303
3.779GluGly: 3.779 ± 0.488
1.238GluHis: 1.238 ± 0.265
3.322GluIle: 3.322 ± 0.562
2.606GluLys: 2.606 ± 0.428
6.384GluLeu: 6.384 ± 0.598
1.564GluMet: 1.564 ± 0.312
1.564GluAsn: 1.564 ± 0.387
2.932GluPro: 2.932 ± 0.502
2.476GluGln: 2.476 ± 0.392
3.779GluArg: 3.779 ± 0.643
3.388GluSer: 3.388 ± 0.436
3.713GluThr: 3.713 ± 0.479
5.537GluVal: 5.537 ± 0.583
1.694GluTrp: 1.694 ± 0.362
2.085GluTyr: 2.085 ± 0.423
0.0GluXaa: 0.0 ± 0.0
Phe
2.541PheAla: 2.541 ± 0.386
0.326PheCys: 0.326 ± 0.169
2.932PheAsp: 2.932 ± 0.328
1.954PheGlu: 1.954 ± 0.312
0.521PhePhe: 0.521 ± 0.172
3.779PheGly: 3.779 ± 0.466
0.717PheHis: 0.717 ± 0.251
1.303PheIle: 1.303 ± 0.261
1.173PheLys: 1.173 ± 0.295
2.345PheLeu: 2.345 ± 0.415
0.651PheMet: 0.651 ± 0.209
1.238PheAsn: 1.238 ± 0.317
1.564PhePro: 1.564 ± 0.345
0.977PheGln: 0.977 ± 0.252
1.889PheArg: 1.889 ± 0.334
1.824PheSer: 1.824 ± 0.316
2.28PheThr: 2.28 ± 0.395
2.02PheVal: 2.02 ± 0.384
0.521PheTrp: 0.521 ± 0.142
0.847PheTyr: 0.847 ± 0.196
0.0PheXaa: 0.0 ± 0.0
Gly
7.557GlyAla: 7.557 ± 0.838
0.456GlyCys: 0.456 ± 0.169
5.928GlyAsp: 5.928 ± 0.524
4.56GlyGlu: 4.56 ± 0.569
2.606GlyPhe: 2.606 ± 0.44
10.098GlyGly: 10.098 ± 2.236
1.954GlyHis: 1.954 ± 0.324
4.43GlyIle: 4.43 ± 0.74
3.713GlyLys: 3.713 ± 0.453
7.492GlyLeu: 7.492 ± 0.782
2.15GlyMet: 2.15 ± 0.467
3.257GlyAsn: 3.257 ± 0.412
4.169GlyPro: 4.169 ± 0.697
2.345GlyGln: 2.345 ± 0.342
4.951GlyArg: 4.951 ± 0.643
6.059GlySer: 6.059 ± 0.67
5.472GlyThr: 5.472 ± 0.736
5.016GlyVal: 5.016 ± 0.537
2.736GlyTrp: 2.736 ± 0.417
3.062GlyTyr: 3.062 ± 0.383
0.0GlyXaa: 0.0 ± 0.0
His
1.824HisAla: 1.824 ± 0.37
0.13HisCys: 0.13 ± 0.124
0.977HisAsp: 0.977 ± 0.234
1.433HisGlu: 1.433 ± 0.264
0.717HisPhe: 0.717 ± 0.185
1.564HisGly: 1.564 ± 0.42
0.651HisHis: 0.651 ± 0.209
1.107HisIle: 1.107 ± 0.251
1.173HisLys: 1.173 ± 0.281
1.824HisLeu: 1.824 ± 0.441
0.195HisMet: 0.195 ± 0.115
0.261HisAsn: 0.261 ± 0.118
1.173HisPro: 1.173 ± 0.244
0.912HisGln: 0.912 ± 0.248
1.368HisArg: 1.368 ± 0.375
0.717HisSer: 0.717 ± 0.175
1.238HisThr: 1.238 ± 0.289
1.564HisVal: 1.564 ± 0.335
0.521HisTrp: 0.521 ± 0.16
0.651HisTyr: 0.651 ± 0.195
0.0HisXaa: 0.0 ± 0.0
Ile
5.798IleAla: 5.798 ± 0.71
0.261IleCys: 0.261 ± 0.142
3.518IleAsp: 3.518 ± 0.455
3.583IleGlu: 3.583 ± 0.459
0.912IlePhe: 0.912 ± 0.253
4.56IleGly: 4.56 ± 0.576
0.782IleHis: 0.782 ± 0.193
1.824IleIle: 1.824 ± 0.307
1.889IleLys: 1.889 ± 0.292
3.127IleLeu: 3.127 ± 0.401
0.782IleMet: 0.782 ± 0.202
1.824IleAsn: 1.824 ± 0.312
3.062IlePro: 3.062 ± 0.399
1.368IleGln: 1.368 ± 0.302
3.388IleArg: 3.388 ± 0.454
3.127IleSer: 3.127 ± 0.519
3.518IleThr: 3.518 ± 0.461
2.736IleVal: 2.736 ± 0.579
0.651IleTrp: 0.651 ± 0.168
1.759IleTyr: 1.759 ± 0.273
0.0IleXaa: 0.0 ± 0.0
Lys
3.844LysAla: 3.844 ± 0.549
0.195LysCys: 0.195 ± 0.113
2.671LysAsp: 2.671 ± 0.488
1.954LysGlu: 1.954 ± 0.334
1.498LysPhe: 1.498 ± 0.309
2.736LysGly: 2.736 ± 0.375
1.042LysHis: 1.042 ± 0.275
2.28LysIle: 2.28 ± 0.435
2.02LysLys: 2.02 ± 0.391
3.127LysLeu: 3.127 ± 0.469
0.912LysMet: 0.912 ± 0.213
1.433LysAsn: 1.433 ± 0.283
2.606LysPro: 2.606 ± 0.501
1.824LysGln: 1.824 ± 0.37
3.192LysArg: 3.192 ± 0.525
2.541LysSer: 2.541 ± 0.379
2.541LysThr: 2.541 ± 0.412
3.388LysVal: 3.388 ± 0.502
0.782LysTrp: 0.782 ± 0.25
1.042LysTyr: 1.042 ± 0.35
0.0LysXaa: 0.0 ± 0.0
Leu
9.707LeuAla: 9.707 ± 0.878
0.261LeuCys: 0.261 ± 0.133
6.254LeuAsp: 6.254 ± 0.585
5.342LeuGlu: 5.342 ± 0.655
2.215LeuPhe: 2.215 ± 0.412
7.362LeuGly: 7.362 ± 0.681
1.564LeuHis: 1.564 ± 0.295
4.691LeuIle: 4.691 ± 0.568
3.974LeuLys: 3.974 ± 0.443
5.668LeuLeu: 5.668 ± 0.567
1.759LeuMet: 1.759 ± 0.31
3.453LeuAsn: 3.453 ± 0.45
5.603LeuPro: 5.603 ± 0.628
2.671LeuGln: 2.671 ± 0.463
5.863LeuArg: 5.863 ± 0.542
5.147LeuSer: 5.147 ± 0.528
5.733LeuThr: 5.733 ± 0.514
4.235LeuVal: 4.235 ± 0.64
0.977LeuTrp: 0.977 ± 0.322
2.15LeuTyr: 2.15 ± 0.449
0.0LeuXaa: 0.0 ± 0.0
Met
2.476MetAla: 2.476 ± 0.344
0.0MetCys: 0.0 ± 0.0
1.107MetAsp: 1.107 ± 0.277
1.498MetGlu: 1.498 ± 0.363
0.651MetPhe: 0.651 ± 0.174
1.238MetGly: 1.238 ± 0.261
0.261MetHis: 0.261 ± 0.121
0.521MetIle: 0.521 ± 0.179
1.173MetLys: 1.173 ± 0.248
1.173MetLeu: 1.173 ± 0.284
0.065MetMet: 0.065 ± 0.063
1.042MetAsn: 1.042 ± 0.196
1.107MetPro: 1.107 ± 0.24
0.586MetGln: 0.586 ± 0.199
1.042MetArg: 1.042 ± 0.282
2.28MetSer: 2.28 ± 0.461
2.41MetThr: 2.41 ± 0.402
1.238MetVal: 1.238 ± 0.279
0.195MetTrp: 0.195 ± 0.105
0.521MetTyr: 0.521 ± 0.172
0.0MetXaa: 0.0 ± 0.0
Asn
3.322AsnAla: 3.322 ± 0.551
0.065AsnCys: 0.065 ± 0.068
2.41AsnAsp: 2.41 ± 0.386
2.085AsnGlu: 2.085 ± 0.359
0.977AsnPhe: 0.977 ± 0.299
3.583AsnGly: 3.583 ± 0.47
0.782AsnHis: 0.782 ± 0.214
1.694AsnIle: 1.694 ± 0.315
0.521AsnLys: 0.521 ± 0.2
2.541AsnLeu: 2.541 ± 0.314
0.521AsnMet: 0.521 ± 0.165
0.912AsnAsn: 0.912 ± 0.204
2.801AsnPro: 2.801 ± 0.318
1.042AsnGln: 1.042 ± 0.225
1.238AsnArg: 1.238 ± 0.325
1.629AsnSer: 1.629 ± 0.396
2.215AsnThr: 2.215 ± 0.404
2.736AsnVal: 2.736 ± 0.455
0.782AsnTrp: 0.782 ± 0.186
1.238AsnTyr: 1.238 ± 0.273
0.0AsnXaa: 0.0 ± 0.0
Pro
4.886ProAla: 4.886 ± 0.609
0.391ProCys: 0.391 ± 0.159
4.495ProAsp: 4.495 ± 0.55
4.235ProGlu: 4.235 ± 0.524
2.28ProPhe: 2.28 ± 0.359
4.951ProGly: 4.951 ± 0.608
0.977ProHis: 0.977 ± 0.204
2.41ProIle: 2.41 ± 0.404
2.085ProLys: 2.085 ± 0.268
4.495ProLeu: 4.495 ± 0.527
1.303ProMet: 1.303 ± 0.305
1.564ProAsn: 1.564 ± 0.31
3.192ProPro: 3.192 ± 0.542
1.759ProGln: 1.759 ± 0.371
2.801ProArg: 2.801 ± 0.469
3.583ProSer: 3.583 ± 0.465
4.169ProThr: 4.169 ± 0.64
3.779ProVal: 3.779 ± 0.527
0.847ProTrp: 0.847 ± 0.267
1.433ProTyr: 1.433 ± 0.321
0.0ProXaa: 0.0 ± 0.0
Gln
2.866GlnAla: 2.866 ± 0.458
0.065GlnCys: 0.065 ± 0.057
1.303GlnAsp: 1.303 ± 0.329
1.759GlnGlu: 1.759 ± 0.295
1.173GlnPhe: 1.173 ± 0.234
2.345GlnGly: 2.345 ± 0.371
0.521GlnHis: 0.521 ± 0.189
2.866GlnIle: 2.866 ± 0.547
1.042GlnLys: 1.042 ± 0.331
3.844GlnLeu: 3.844 ± 0.422
0.847GlnMet: 0.847 ± 0.239
0.521GlnAsn: 0.521 ± 0.169
1.954GlnPro: 1.954 ± 0.353
1.694GlnGln: 1.694 ± 0.381
1.629GlnArg: 1.629 ± 0.445
1.954GlnSer: 1.954 ± 0.282
1.954GlnThr: 1.954 ± 0.376
2.476GlnVal: 2.476 ± 0.395
0.651GlnTrp: 0.651 ± 0.153
0.521GlnTyr: 0.521 ± 0.136
0.0GlnXaa: 0.0 ± 0.0
Arg
5.147ArgAla: 5.147 ± 0.613
0.717ArgCys: 0.717 ± 0.252
2.866ArgAsp: 2.866 ± 0.338
4.625ArgGlu: 4.625 ± 0.637
1.824ArgPhe: 1.824 ± 0.368
5.081ArgGly: 5.081 ± 0.713
1.173ArgHis: 1.173 ± 0.3
2.866ArgIle: 2.866 ± 0.474
3.453ArgLys: 3.453 ± 0.567
5.472ArgLeu: 5.472 ± 0.743
1.889ArgMet: 1.889 ± 0.357
2.345ArgAsn: 2.345 ± 0.497
2.15ArgPro: 2.15 ± 0.315
1.824ArgGln: 1.824 ± 0.326
5.407ArgArg: 5.407 ± 0.771
3.453ArgSer: 3.453 ± 0.559
2.932ArgThr: 2.932 ± 0.447
5.407ArgVal: 5.407 ± 0.661
1.303ArgTrp: 1.303 ± 0.309
1.759ArgTyr: 1.759 ± 0.312
0.0ArgXaa: 0.0 ± 0.0
Ser
7.036SerAla: 7.036 ± 0.851
0.456SerCys: 0.456 ± 0.203
3.127SerAsp: 3.127 ± 0.41
3.779SerGlu: 3.779 ± 0.595
2.085SerPhe: 2.085 ± 0.517
5.798SerGly: 5.798 ± 0.721
1.433SerHis: 1.433 ± 0.286
2.606SerIle: 2.606 ± 0.398
2.345SerLys: 2.345 ± 0.416
4.821SerLeu: 4.821 ± 0.477
1.368SerMet: 1.368 ± 0.238
2.41SerAsn: 2.41 ± 0.398
3.127SerPro: 3.127 ± 0.418
2.085SerGln: 2.085 ± 0.317
2.606SerArg: 2.606 ± 0.349
3.779SerSer: 3.779 ± 0.621
2.997SerThr: 2.997 ± 0.478
3.779SerVal: 3.779 ± 0.554
1.433SerTrp: 1.433 ± 0.33
1.368SerTyr: 1.368 ± 0.29
0.0SerXaa: 0.0 ± 0.0
Thr
6.189ThrAla: 6.189 ± 0.635
0.195ThrCys: 0.195 ± 0.115
4.625ThrAsp: 4.625 ± 0.563
4.3ThrGlu: 4.3 ± 0.507
2.215ThrPhe: 2.215 ± 0.361
6.906ThrGly: 6.906 ± 0.631
1.238ThrHis: 1.238 ± 0.396
2.606ThrIle: 2.606 ± 0.516
2.671ThrLys: 2.671 ± 0.339
5.863ThrLeu: 5.863 ± 0.626
0.912ThrMet: 0.912 ± 0.213
1.694ThrAsn: 1.694 ± 0.341
3.844ThrPro: 3.844 ± 0.567
1.759ThrGln: 1.759 ± 0.3
3.583ThrArg: 3.583 ± 0.592
3.648ThrSer: 3.648 ± 0.644
4.43ThrThr: 4.43 ± 0.517
5.863ThrVal: 5.863 ± 0.62
0.977ThrTrp: 0.977 ± 0.25
1.889ThrTyr: 1.889 ± 0.333
0.0ThrXaa: 0.0 ± 0.0
Val
6.906ValAla: 6.906 ± 0.674
0.391ValCys: 0.391 ± 0.134
5.277ValAsp: 5.277 ± 0.527
4.56ValGlu: 4.56 ± 0.519
2.345ValPhe: 2.345 ± 0.344
5.016ValGly: 5.016 ± 0.706
1.694ValHis: 1.694 ± 0.263
3.909ValIle: 3.909 ± 0.494
2.801ValLys: 2.801 ± 0.391
5.212ValLeu: 5.212 ± 0.555
1.042ValMet: 1.042 ± 0.309
2.801ValAsn: 2.801 ± 0.42
4.039ValPro: 4.039 ± 0.404
2.02ValGln: 2.02 ± 0.387
5.472ValArg: 5.472 ± 0.669
4.691ValSer: 4.691 ± 0.448
5.668ValThr: 5.668 ± 0.639
4.821ValVal: 4.821 ± 0.553
1.303ValTrp: 1.303 ± 0.29
2.28ValTyr: 2.28 ± 0.416
0.0ValXaa: 0.0 ± 0.0
Trp
1.629TrpAla: 1.629 ± 0.319
0.13TrpCys: 0.13 ± 0.084
1.368TrpAsp: 1.368 ± 0.313
0.912TrpGlu: 0.912 ± 0.222
0.912TrpPhe: 0.912 ± 0.25
1.824TrpGly: 1.824 ± 0.348
0.521TrpHis: 0.521 ± 0.174
1.173TrpIle: 1.173 ± 0.219
0.326TrpLys: 0.326 ± 0.225
1.824TrpLeu: 1.824 ± 0.345
0.391TrpMet: 0.391 ± 0.168
0.456TrpAsn: 0.456 ± 0.147
0.847TrpPro: 0.847 ± 0.267
0.977TrpGln: 0.977 ± 0.243
1.173TrpArg: 1.173 ± 0.301
0.977TrpSer: 0.977 ± 0.243
1.759TrpThr: 1.759 ± 0.365
2.085TrpVal: 2.085 ± 0.369
0.521TrpTrp: 0.521 ± 0.223
0.456TrpTyr: 0.456 ± 0.159
0.0TrpXaa: 0.0 ± 0.0
Tyr
2.41TyrAla: 2.41 ± 0.381
0.195TyrCys: 0.195 ± 0.131
1.368TyrAsp: 1.368 ± 0.336
2.476TyrGlu: 2.476 ± 0.359
0.717TyrPhe: 0.717 ± 0.214
2.476TyrGly: 2.476 ± 0.37
0.651TyrHis: 0.651 ± 0.203
1.498TyrIle: 1.498 ± 0.326
1.238TyrLys: 1.238 ± 0.239
2.541TyrLeu: 2.541 ± 0.375
0.456TyrMet: 0.456 ± 0.159
1.433TyrAsn: 1.433 ± 0.328
1.368TyrPro: 1.368 ± 0.286
1.238TyrGln: 1.238 ± 0.325
2.736TyrArg: 2.736 ± 0.395
1.368TyrSer: 1.368 ± 0.289
2.345TyrThr: 2.345 ± 0.406
1.694TyrVal: 1.694 ± 0.311
0.326TyrTrp: 0.326 ± 0.145
0.651TyrTyr: 0.651 ± 0.232
0.0TyrXaa: 0.0 ± 0.0
Xaa
0.0XaaAla: 0.0 ± 0.0
0.0XaaCys: 0.0 ± 0.0
0.0XaaAsp: 0.0 ± 0.0
0.0XaaGlu: 0.0 ± 0.0
0.0XaaPhe: 0.0 ± 0.0
0.0XaaGly: 0.0 ± 0.0
0.0XaaHis: 0.0 ± 0.0
0.0XaaIle: 0.0 ± 0.0
0.0XaaLys: 0.0 ± 0.0
0.0XaaLeu: 0.0 ± 0.0
0.0XaaMet: 0.0 ± 0.0
0.0XaaAsn: 0.0 ± 0.0
0.0XaaPro: 0.0 ± 0.0
0.0XaaGln: 0.0 ± 0.0
0.0XaaArg: 0.0 ± 0.0
0.0XaaSer: 0.0 ± 0.0
0.0XaaThr: 0.0 ± 0.0
0.0XaaVal: 0.0 ± 0.0
0.0XaaTrp: 0.0 ± 0.0
0.0XaaTyr: 0.0 ± 0.0
0.0XaaXaa: 0.0 ± 0.0
Statistics based on 83 proteins (15351 amino acids)

Note: The error was estimated by bootstrapping (×100) at the protein level.
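A protein-level bootstrap of this kind can be sketched as follows. This is a hedged illustration rather than the database's actual code: the function names are hypothetical, and reporting the sample standard deviation of the replicate estimates as the error is an assumption, since the page does not state which spread measure is used.

```python
import random
import statistics


def dipeptide_per_mille(sequences, dipeptide):
    """Per-mille frequency of one dipeptide among all overlapping pairs."""
    hits = 0
    total = 0
    for seq in sequences:
        for a, b in zip(seq, seq[1:]):
            total += 1
            if a + b == dipeptide:
                hits += 1
    return 1000.0 * hits / total if total else 0.0


def bootstrap_error(sequences, dipeptide, n_rounds=100, seed=0):
    """Resample whole proteins with replacement and return the spread of the estimate."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    estimates = []
    for _ in range(n_rounds):
        # Each replicate draws as many proteins as the original set, with replacement.
        sample = rng.choices(sequences, k=len(sequences))
        estimates.append(dipeptide_per_mille(sample, dipeptide))
    # Assumption: the error is the sample standard deviation across replicates.
    return statistics.stdev(estimates)


# Toy proteins (hypothetical, not from this proteome).
proteins = ["AAC", "CCA", "ACA", "AAAA"]
print(dipeptide_per_mille(proteins, "AA"), "±", bootstrap_error(proteins, "AA"))
```

Resampling whole proteins rather than individual residues keeps each protein's internal composition intact. A dipeptide that never occurs gets an error of exactly zero under this scheme, which is consistent with the "0.0 ± 0.0" entries in the table.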

The dipeptide statistics above (along with other statistics for this proteome) can be downloaded from this CSV file
See this proteome in: uniprot_link
Proteome-pI is available under the Creative Commons Attribution-NoDerivs license; for more details see here

Reference: Kozlowski LP. Proteome-pI 2.0: Proteome Isoelectric Point Database Update. Nucleic Acids Res. 2021, doi: 10.1093/nar/gkab944
Contact: Lukasz P. Kozlowski