Amino acid dipeptide frequency for Mycobacterium phage Kimona

Apart from single amino acid frequencies, one can also calculate the so-called amino acid dipeptide frequency, i.e. how often each ordered pair of adjacent residues occurs.
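As a rough illustration of the calculation, the sketch below counts overlapping residue pairs within each protein and reports them in per mille. The function name, the toy sequences, and the choice to normalise by the total number of counted pairs are assumptions made for this example; the database may normalise differently, and it reports three-letter codes rather than the one-letter codes used here.

```python
from collections import Counter

def dipeptide_frequencies(proteins):
    """Return {dipeptide: frequency in per mille} over overlapping residue pairs."""
    counts = Counter()
    for seq in proteins:
        # Overlapping pairs within one protein; pairs never span two proteins.
        for i in range(len(seq) - 1):
            counts[seq[i:i + 2]] += 1
    total = sum(counts.values())
    if total == 0:
        return {}
    # Assumption: normalise by the total number of counted dipeptides.
    return {pair: 1000.0 * n / total for pair, n in counts.items()}

# Hypothetical toy sequences, not taken from the phage proteome.
toy_proteome = ["MAAL", "MAG"]
freqs = dipeptide_frequencies(toy_proteome)
```

In the toy input there are five pairs in total (MA, AA, AL, MA, AG), so "MA" accounts for two of the five.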

There are 400 possible dipeptides (441 if Xaa is counted as an additional, 21st amino acid representing all non-standard or unknown amino acids). Thus, if dipeptides occurred in proteins randomly and with equal probability, each would be present at a frequency of 0.25% (2.5‰). As this is not the case in nature, for better visibility all dipeptides that are more frequent than expected are marked in red in the table, and those that are underrepresented are marked in blue.
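The expectation above is simply one divided by the number of possible dipeptides. The sketch below works this out in per mille and labels a value as over- or underrepresented. The function names are hypothetical, and it is an assumption that the page compares against the 20-letter expectation and ignores the error estimate; the source does not say which threshold is used for colouring.

```python
def expected_per_mille(alphabet_size=20):
    """Uniform expectation for a single dipeptide, in per mille."""
    return 1000.0 / (alphabet_size ** 2)

def classify(observed_per_mille, alphabet_size=20):
    """Label a frequency relative to the uniform expectation."""
    expected = expected_per_mille(alphabet_size)
    if observed_per_mille > expected:
        return "over"    # described as red in the table
    if observed_per_mille < expected:
        return "under"   # described as blue in the table
    return "expected"

ala_ala = classify(11.927)  # AlaAla value from the table
cys_cys = classify(0.129)   # CysCys value from the table
```

With 20 amino acids the expectation is 2.5‰; with Xaa included (21 letters) it drops to roughly 2.27‰.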

All values are given in per mille (‰) and therefore need to be multiplied by 10⁻³ to obtain fractions.
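For readers who prefer fractions or percentages, the conversion is a one-liner. The helper names below are hypothetical and only illustrate the unit change, using the AlaAla value from the table.

```python
def per_mille_to_fraction(value):
    """Convert a per mille value to a plain fraction (multiply by 10^-3)."""
    return value * 1e-3

def per_mille_to_percent(value):
    """Convert a per mille value to percent (divide by 10)."""
    return value / 10.0

ala_ala_fraction = per_mille_to_fraction(11.927)  # about 0.011927
ala_ala_percent = per_mille_to_percent(11.927)    # about 1.1927
```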

For more information, see the sequence space article on Wikipedia.

Ala Cys Asp Glu Phe Gly His Ile Lys Leu Met Asn Pro Gln Arg Ser Thr Val Trp Tyr Xaa
Ala
AlaAla: 11.927 ± 1.167
AlaCys: 0.774 ± 0.227
AlaAsp: 5.416 ± 0.559
AlaGlu: 7.156 ± 0.766
AlaPhe: 3.159 ± 0.439
AlaGly: 8.446 ± 1.02
AlaHis: 2.385 ± 0.395
AlaIle: 4.964 ± 0.517
AlaLys: 3.933 ± 0.607
AlaLeu: 10.96 ± 0.975
AlaMet: 2.579 ± 0.439
AlaAsn: 2.837 ± 0.423
AlaPro: 4.448 ± 0.658
AlaGln: 4.448 ± 0.586
AlaArg: 6.898 ± 0.697
AlaSer: 3.804 ± 0.411
AlaThr: 5.673 ± 0.669
AlaVal: 6.898 ± 0.639
AlaTrp: 2.256 ± 0.32
AlaTyr: 2.321 ± 0.462
AlaXaa: 0.0 ± 0.0
Cys
CysAla: 0.709 ± 0.223
CysCys: 0.129 ± 0.13
CysAsp: 0.451 ± 0.158
CysGlu: 0.451 ± 0.201
CysPhe: 0.193 ± 0.106
CysGly: 0.838 ± 0.262
CysHis: 0.258 ± 0.13
CysIle: 0.258 ± 0.142
CysLys: 0.258 ± 0.114
CysLeu: 0.451 ± 0.158
CysMet: 0.258 ± 0.129
CysAsn: 0.387 ± 0.159
CysPro: 0.451 ± 0.216
CysGln: 0.387 ± 0.176
CysArg: 0.838 ± 0.203
CysSer: 0.322 ± 0.137
CysThr: 0.516 ± 0.18
CysVal: 0.838 ± 0.221
CysTrp: 0.064 ± 0.059
CysTyr: 0.387 ± 0.16
CysXaa: 0.0 ± 0.0
Asp
AspAla: 6.576 ± 0.696
AspCys: 0.709 ± 0.271
AspAsp: 4.126 ± 0.515
AspGlu: 4.577 ± 0.592
AspPhe: 1.934 ± 0.326
AspGly: 5.416 ± 0.549
AspHis: 1.87 ± 0.401
AspIle: 2.385 ± 0.437
AspLys: 1.87 ± 0.394
AspLeu: 4.9 ± 0.526
AspMet: 1.547 ± 0.398
AspAsn: 1.741 ± 0.344
AspPro: 4.384 ± 0.652
AspGln: 1.934 ± 0.313
AspArg: 4.448 ± 0.647
AspSer: 3.288 ± 0.562
AspThr: 2.643 ± 0.438
AspVal: 4.384 ± 0.506
AspTrp: 1.16 ± 0.263
AspTyr: 2.256 ± 0.373
AspXaa: 0.0 ± 0.0
Glu
GluAla: 6.834 ± 0.659
GluCys: 0.387 ± 0.171
GluAsp: 4.835 ± 0.694
GluGlu: 5.609 ± 0.836
GluPhe: 3.03 ± 0.505
GluGly: 5.416 ± 0.591
GluHis: 1.225 ± 0.317
GluIle: 2.514 ± 0.3
GluLys: 2.45 ± 0.43
GluLeu: 8.188 ± 0.9
GluMet: 1.16 ± 0.282
GluAsn: 2.321 ± 0.367
GluPro: 2.708 ± 0.516
GluGln: 2.643 ± 0.367
GluArg: 5.029 ± 0.611
GluSer: 3.095 ± 0.394
GluThr: 2.966 ± 0.444
GluVal: 5.093 ± 0.551
GluTrp: 1.225 ± 0.236
GluTyr: 2.514 ± 0.461
GluXaa: 0.0 ± 0.0
Phe
PheAla: 3.417 ± 0.627
PheCys: 0.322 ± 0.16
PheAsp: 2.579 ± 0.367
PheGlu: 2.385 ± 0.423
PhePhe: 0.709 ± 0.194
PheGly: 2.901 ± 0.4
PheHis: 0.58 ± 0.213
PheIle: 1.354 ± 0.266
PheLys: 0.838 ± 0.22
PheLeu: 2.772 ± 0.501
PheMet: 0.838 ± 0.206
PheAsn: 1.354 ± 0.348
PhePro: 1.805 ± 0.386
PheGln: 1.16 ± 0.288
PheArg: 1.87 ± 0.362
PheSer: 1.934 ± 0.353
PheThr: 2.192 ± 0.399
PheVal: 2.45 ± 0.387
PheTrp: 0.451 ± 0.17
PheTyr: 0.709 ± 0.193
PheXaa: 0.0 ± 0.0
Gly
GlyAla: 6.06 ± 0.872
GlyCys: 0.709 ± 0.194
GlyAsp: 4.577 ± 0.482
GlyGlu: 4.642 ± 0.517
GlyPhe: 3.288 ± 0.486
GlyGly: 8.381 ± 1.384
GlyHis: 2.643 ± 0.476
GlyIle: 2.837 ± 0.433
GlyLys: 4.9 ± 0.553
GlyLeu: 6.64 ± 0.723
GlyMet: 2.256 ± 0.334
GlyAsn: 3.03 ± 0.535
GlyPro: 4.448 ± 0.498
GlyGln: 3.159 ± 0.593
GlyArg: 4.835 ± 0.645
GlySer: 4.513 ± 0.644
GlyThr: 4.835 ± 0.748
GlyVal: 6.06 ± 0.576
GlyTrp: 2.063 ± 0.335
GlyTyr: 3.095 ± 0.384
GlyXaa: 0.0 ± 0.0
His
HisAla: 2.128 ± 0.377
HisCys: 0.064 ± 0.069
HisAsp: 1.547 ± 0.327
HisGlu: 1.547 ± 0.268
HisPhe: 0.774 ± 0.195
HisGly: 2.128 ± 0.461
HisHis: 0.58 ± 0.227
HisIle: 1.16 ± 0.296
HisLys: 1.096 ± 0.295
HisLeu: 2.385 ± 0.449
HisMet: 0.193 ± 0.112
HisAsn: 0.193 ± 0.117
HisPro: 0.838 ± 0.202
HisGln: 0.967 ± 0.22
HisArg: 2.128 ± 0.367
HisSer: 0.903 ± 0.214
HisThr: 1.547 ± 0.329
HisVal: 1.354 ± 0.259
HisTrp: 0.516 ± 0.21
HisTyr: 0.516 ± 0.186
HisXaa: 0.0 ± 0.0
Ile
IleAla: 5.996 ± 0.647
IleCys: 0.322 ± 0.146
IleAsp: 2.966 ± 0.353
IleGlu: 4.577 ± 0.552
IlePhe: 1.354 ± 0.256
IleGly: 3.546 ± 0.418
IleHis: 1.225 ± 0.247
IleIle: 1.096 ± 0.261
IleLys: 1.289 ± 0.274
IleLeu: 3.095 ± 0.455
IleMet: 0.645 ± 0.308
IleAsn: 1.483 ± 0.307
IlePro: 3.224 ± 0.452
IleGln: 1.483 ± 0.303
IleArg: 3.739 ± 0.477
IleSer: 1.87 ± 0.492
IleThr: 3.739 ± 0.603
IleVal: 2.256 ± 0.396
IleTrp: 0.58 ± 0.175
IleTyr: 1.096 ± 0.268
IleXaa: 0.0 ± 0.0
Lys
LysAla: 5.222 ± 0.748
LysCys: 0.193 ± 0.111
LysAsp: 2.579 ± 0.335
LysGlu: 2.772 ± 0.461
LysPhe: 1.032 ± 0.232
LysGly: 3.61 ± 0.481
LysHis: 1.354 ± 0.275
LysIle: 1.934 ± 0.392
LysLys: 2.128 ± 0.379
LysLeu: 3.481 ± 0.493
LysMet: 1.032 ± 0.236
LysAsn: 1.225 ± 0.236
LysPro: 2.579 ± 0.528
LysGln: 1.612 ± 0.259
LysArg: 3.546 ± 0.766
LysSer: 1.87 ± 0.377
LysThr: 2.514 ± 0.435
LysVal: 3.224 ± 0.472
LysTrp: 1.225 ± 0.273
LysTyr: 1.096 ± 0.25
LysXaa: 0.0 ± 0.0
Leu
LeuAla: 8.897 ± 0.779
LeuCys: 0.322 ± 0.215
LeuAsp: 5.351 ± 0.506
LeuGlu: 5.416 ± 0.654
LeuPhe: 2.128 ± 0.307
LeuGly: 7.156 ± 0.764
LeuHis: 1.547 ± 0.342
LeuIle: 4.9 ± 0.507
LeuLys: 3.804 ± 0.432
LeuLeu: 5.931 ± 0.512
LeuMet: 2.514 ± 0.376
LeuAsn: 2.128 ± 0.385
LeuPro: 3.933 ± 0.413
LeuGln: 3.03 ± 0.632
LeuArg: 7.285 ± 0.746
LeuSer: 4.255 ± 0.478
LeuThr: 5.158 ± 0.508
LeuVal: 5.609 ± 0.572
LeuTrp: 1.354 ± 0.321
LeuTyr: 2.256 ± 0.379
LeuXaa: 0.0 ± 0.0
Met
MetAla: 2.708 ± 0.525
MetCys: 0.129 ± 0.13
MetAsp: 1.805 ± 0.4
MetGlu: 1.16 ± 0.239
MetPhe: 0.516 ± 0.191
MetGly: 1.805 ± 0.357
MetHis: 0.322 ± 0.155
MetIle: 0.838 ± 0.224
MetLys: 1.547 ± 0.278
MetLeu: 1.676 ± 0.252
MetMet: 0.451 ± 0.166
MetAsn: 0.774 ± 0.254
MetPro: 1.676 ± 0.363
MetGln: 1.096 ± 0.362
MetArg: 1.87 ± 0.402
MetSer: 2.063 ± 0.389
MetThr: 2.579 ± 0.365
MetVal: 1.225 ± 0.309
MetTrp: 0.322 ± 0.132
MetTyr: 0.451 ± 0.198
MetXaa: 0.0 ± 0.0
Asn
AsnAla: 3.804 ± 0.479
AsnCys: 0.258 ± 0.116
AsnAsp: 1.741 ± 0.305
AsnGlu: 1.87 ± 0.345
AsnPhe: 1.16 ± 0.288
AsnGly: 2.643 ± 0.513
AsnHis: 0.645 ± 0.177
AsnIle: 2.385 ± 0.421
AsnLys: 0.967 ± 0.218
AsnLeu: 2.321 ± 0.382
AsnMet: 0.645 ± 0.158
AsnAsn: 0.709 ± 0.256
AsnPro: 2.643 ± 0.389
AsnGln: 1.225 ± 0.3
AsnArg: 2.321 ± 0.373
AsnSer: 1.289 ± 0.285
AsnThr: 1.612 ± 0.274
AsnVal: 2.128 ± 0.365
AsnTrp: 0.774 ± 0.242
AsnTyr: 1.225 ± 0.273
AsnXaa: 0.0 ± 0.0
Pro
ProAla: 5.544 ± 0.545
ProCys: 0.709 ± 0.224
ProAsp: 3.933 ± 0.516
ProGlu: 3.804 ± 0.463
ProPhe: 1.934 ± 0.367
ProGly: 4.706 ± 0.739
ProHis: 1.483 ± 0.286
ProIle: 1.999 ± 0.317
ProLys: 1.87 ± 0.307
ProLeu: 3.352 ± 0.576
ProMet: 0.967 ± 0.21
ProAsn: 2.192 ± 0.446
ProPro: 2.45 ± 0.489
ProGln: 2.514 ± 0.401
ProArg: 2.643 ± 0.433
ProSer: 2.385 ± 0.406
ProThr: 3.546 ± 0.392
ProVal: 3.933 ± 0.538
ProTrp: 0.903 ± 0.349
ProTyr: 2.128 ± 0.306
ProXaa: 0.0 ± 0.0
Gln
GlnAla: 3.675 ± 0.681
GlnCys: 0.516 ± 0.183
GlnAsp: 1.805 ± 0.325
GlnGlu: 2.45 ± 0.365
GlnPhe: 1.676 ± 0.375
GlnGly: 3.224 ± 0.533
GlnHis: 0.387 ± 0.134
GlnIle: 1.612 ± 0.297
GlnLys: 1.805 ± 0.34
GlnLeu: 3.546 ± 0.589
GlnMet: 1.805 ± 0.324
GlnAsn: 1.418 ± 0.412
GlnPro: 1.87 ± 0.405
GlnGln: 1.999 ± 0.615
GlnArg: 2.837 ± 0.426
GlnSer: 1.483 ± 0.32
GlnThr: 1.741 ± 0.346
GlnVal: 2.772 ± 0.357
GlnTrp: 0.774 ± 0.225
GlnTyr: 0.838 ± 0.28
GlnXaa: 0.0 ± 0.0
Arg
ArgAla: 6.963 ± 0.661
ArgCys: 1.096 ± 0.309
ArgAsp: 3.739 ± 0.524
ArgGlu: 5.351 ± 0.512
ArgPhe: 2.063 ± 0.358
ArgGly: 4.32 ± 0.534
ArgHis: 1.676 ± 0.363
ArgIle: 4.062 ± 0.526
ArgLys: 4.964 ± 0.588
ArgLeu: 5.158 ± 0.565
ArgMet: 2.643 ± 0.423
ArgAsn: 2.579 ± 0.386
ArgPro: 2.579 ± 0.41
ArgGln: 2.643 ± 0.502
ArgArg: 7.35 ± 0.84
ArgSer: 3.288 ± 0.543
ArgThr: 2.192 ± 0.405
ArgVal: 5.351 ± 0.532
ArgTrp: 1.87 ± 0.398
ArgTyr: 2.772 ± 0.397
ArgXaa: 0.0 ± 0.0
Ser
SerAla: 3.933 ± 0.501
SerCys: 0.258 ± 0.115
SerAsp: 2.901 ± 0.42
SerGlu: 3.739 ± 0.506
SerPhe: 2.192 ± 0.36
SerGly: 4.577 ± 0.576
SerHis: 0.903 ± 0.223
SerIle: 2.579 ± 0.345
SerLys: 2.45 ± 0.48
SerLeu: 3.739 ± 0.442
SerMet: 1.612 ± 0.347
SerAsn: 1.032 ± 0.236
SerPro: 2.708 ± 0.376
SerGln: 2.256 ± 0.499
SerArg: 3.352 ± 0.494
SerSer: 2.901 ± 0.384
SerThr: 2.256 ± 0.343
SerVal: 3.159 ± 0.382
SerTrp: 1.225 ± 0.219
SerTyr: 0.967 ± 0.286
SerXaa: 0.0 ± 0.0
Thr
ThrAla: 5.351 ± 0.588
ThrCys: 0.322 ± 0.125
ThrAsp: 2.966 ± 0.449
ThrGlu: 3.739 ± 0.562
ThrPhe: 2.063 ± 0.404
ThrGly: 4.771 ± 0.604
ThrHis: 1.096 ± 0.243
ThrIle: 2.45 ± 0.353
ThrLys: 2.643 ± 0.41
ThrLeu: 4.964 ± 0.713
ThrMet: 1.16 ± 0.229
ThrAsn: 1.934 ± 0.317
ThrPro: 4.191 ± 0.533
ThrGln: 1.805 ± 0.328
ThrArg: 3.481 ± 0.329
ThrSer: 1.87 ± 0.308
ThrThr: 3.417 ± 0.658
ThrVal: 5.029 ± 0.645
ThrTrp: 1.354 ± 0.281
ThrTyr: 1.612 ± 0.306
ThrXaa: 0.0 ± 0.0
Val
ValAla: 6.769 ± 0.796
ValCys: 0.709 ± 0.223
ValAsp: 5.287 ± 0.641
ValGlu: 5.158 ± 0.599
ValPhe: 2.385 ± 0.443
ValGly: 5.738 ± 0.666
ValHis: 1.418 ± 0.323
ValIle: 3.352 ± 0.528
ValLys: 3.61 ± 0.421
ValLeu: 5.222 ± 0.644
ValMet: 1.096 ± 0.259
ValAsn: 2.579 ± 0.421
ValPro: 3.997 ± 0.538
ValGln: 1.999 ± 0.293
ValArg: 4.835 ± 0.73
ValSer: 5.093 ± 0.624
ValThr: 3.675 ± 0.501
ValVal: 5.931 ± 0.577
ValTrp: 0.903 ± 0.231
ValTyr: 1.805 ± 0.375
ValXaa: 0.0 ± 0.0
Trp
TrpAla: 1.934 ± 0.39
TrpCys: 0.258 ± 0.125
TrpAsp: 1.032 ± 0.268
TrpGlu: 1.418 ± 0.267
TrpPhe: 0.451 ± 0.137
TrpGly: 1.612 ± 0.352
TrpHis: 0.258 ± 0.122
TrpIle: 1.547 ± 0.276
TrpLys: 0.709 ± 0.187
TrpLeu: 1.805 ± 0.403
TrpMet: 0.774 ± 0.214
TrpAsn: 0.774 ± 0.247
TrpPro: 0.774 ± 0.244
TrpGln: 0.709 ± 0.262
TrpArg: 1.16 ± 0.275
TrpSer: 1.032 ± 0.259
TrpThr: 1.354 ± 0.286
TrpVal: 1.483 ± 0.296
TrpTrp: 0.709 ± 0.225
TrpTyr: 0.645 ± 0.197
TrpXaa: 0.0 ± 0.0
Tyr
TyrAla: 2.837 ± 0.367
TyrCys: 0.258 ± 0.14
TyrAsp: 2.514 ± 0.41
TyrGlu: 1.418 ± 0.33
TyrPhe: 0.58 ± 0.228
TyrGly: 1.741 ± 0.356
TyrHis: 0.709 ± 0.204
TyrIle: 1.354 ± 0.279
TyrLys: 1.096 ± 0.332
TyrLeu: 2.385 ± 0.382
TyrMet: 0.774 ± 0.257
TyrAsn: 1.676 ± 0.312
TyrPro: 1.289 ± 0.262
TyrGln: 1.16 ± 0.229
TyrArg: 2.192 ± 0.442
TyrSer: 1.483 ± 0.258
TyrThr: 2.128 ± 0.395
TyrVal: 2.385 ± 0.474
TyrTrp: 0.709 ± 0.231
TyrTyr: 1.032 ± 0.343
TyrXaa: 0.0 ± 0.0
Xaa
XaaAla: 0.0 ± 0.0
XaaCys: 0.0 ± 0.0
XaaAsp: 0.0 ± 0.0
XaaGlu: 0.0 ± 0.0
XaaPhe: 0.0 ± 0.0
XaaGly: 0.0 ± 0.0
XaaHis: 0.0 ± 0.0
XaaIle: 0.0 ± 0.0
XaaLys: 0.0 ± 0.0
XaaLeu: 0.0 ± 0.0
XaaMet: 0.0 ± 0.0
XaaAsn: 0.0 ± 0.0
XaaPro: 0.0 ± 0.0
XaaGln: 0.0 ± 0.0
XaaArg: 0.0 ± 0.0
XaaSer: 0.0 ± 0.0
XaaThr: 0.0 ± 0.0
XaaVal: 0.0 ± 0.0
XaaTrp: 0.0 ± 0.0
XaaTyr: 0.0 ± 0.0
XaaXaa: 0.0 ± 0.0
Statistics based on 87 proteins (15512 amino acids)

Note: The error was estimated by bootstrapping (×100) at the protein level.
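One plausible reading of protein-level bootstrapping is sketched below: whole proteins are resampled with replacement 100 times, the dipeptide frequency is recomputed for each resample, and the spread of those estimates is reported. The function names, the use of the standard deviation as the error, and the normalisation by total pair count are assumptions for illustration; the source does not describe the exact procedure.

```python
import random
import statistics
from collections import Counter

def pair_counts(seq):
    """Count overlapping residue pairs in a single protein."""
    return Counter(seq[i:i + 2] for i in range(len(seq) - 1))

def bootstrap_error(proteins, pair, n_rounds=100, seed=0):
    """Return (mean, standard deviation) of a dipeptide frequency in per mille."""
    rng = random.Random(seed)
    per_protein = [pair_counts(s) for s in proteins]
    estimates = []
    for _ in range(n_rounds):
        # Resample whole proteins with replacement (protein-level bootstrap).
        sample = rng.choices(per_protein, k=len(per_protein))
        total = sum(sum(c.values()) for c in sample)
        hits = sum(c[pair] for c in sample)
        estimates.append(1000.0 * hits / total if total else 0.0)
    # Assumption: the reported error is the standard deviation across rounds.
    return statistics.mean(estimates), statistics.stdev(estimates)

# Hypothetical toy sequences, not taken from the phage proteome.
mean_aa, err_aa = bootstrap_error(["MAAL", "MAG", "AAAA"], "AA")
```

Resampling proteins rather than individual residues keeps each protein's internal composition intact, so the error reflects variation between proteins.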

The dipeptide statistics above (along with other statistics for this proteome) can be downloaded from this CSV file
See this proteome in: uniprot_link
Proteome-pI is available under a Creative Commons Attribution-NoDerivs license; for more details see here

Reference: Kozlowski LP. Proteome-pI 2.0: Proteome Isoelectric Point Database Update. Nucleic Acids Res. 2021, doi: 10.1093/nar/gkab944
Contact: Lukasz P. Kozlowski