Amino acid dipeptide frequency for Roseburia phage Shimadzu

Apart from single amino acid frequencies, one can also calculate the so-called amino acid dipeptide frequency, i.e. how often each ordered pair of adjacent residues occurs in the proteome.
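
As an illustration of the idea, the following minimal Python sketch counts overlapping pairs of adjacent residues and reports them in per mille. It is not the code used by Proteome-pI: the function name `dipeptide_per_mille` and the toy sequences are hypothetical, the sequences use one-letter codes rather than the three-letter codes of the table, and dividing by the total number of dipeptides is an assumption, since the page does not state which denominator it uses.

```python
from collections import Counter


def dipeptide_per_mille(sequences):
    """Return {dipeptide: frequency in per mille} for a list of protein sequences.

    Hypothetical helper: counts overlapping pairs of adjacent residues.
    Pairs never span two different proteins.
    """
    counts = Counter()
    for seq in sequences:
        seq = seq.upper()
        # Slide a window of length 2 along the sequence.
        counts.update(seq[i:i + 2] for i in range(len(seq) - 1))
    total = sum(counts.values())
    if total == 0:
        return {}
    # Assumption: normalise by the total number of dipeptides observed.
    return {pair: 1000.0 * n / total for pair, n in counts.items()}


# Toy input (not real phage proteins): MA, AA, AK from the first
# sequence and MA, AK from the second, five dipeptides in total.
toy_proteome = ["MAAK", "MAK"]
freqs = dipeptide_per_mille(toy_proteome)
```

With this toy input, `freqs` holds three entries (`MA`, `AA` and `AK`) whose values sum to 1000‰.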

There are 400 possible dipeptides (441 if Xaa is counted as an additional 21st amino acid representing all non-standard or unknown amino acids). Thus, if dipeptides occurred randomly and with equal probability in proteins, each amino acid dipeptide would be present at a frequency of 0.25% (2.5‰). As this is not the case in nature, for better visibility dipeptides that are more frequent than expected are marked in red in the table, and those that are underrepresented are marked in blue.
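
The expectation above is simple arithmetic, sketched here in Python. The function names are hypothetical, and treating the uniform value as the threshold between over- and underrepresented dipeptides is an assumption: the page does not state the exact rule behind its red and blue marking.

```python
def uniform_expectation_per_mille(alphabet_size=20):
    """Per-mille frequency of each dipeptide if all ordered pairs were equally likely."""
    return 1000.0 / alphabet_size ** 2


def classify(observed_per_mille, expected_per_mille):
    """Label a dipeptide relative to the expected frequency (assumed rule)."""
    if observed_per_mille > expected_per_mille:
        return "over"
    if observed_per_mille < expected_per_mille:
        return "under"
    return "as expected"


expected_20 = uniform_expectation_per_mille(20)  # 1000 / 400 = 2.5 per mille
expected_21 = uniform_expectation_per_mille(21)  # 1000 / 441, with Xaa included

# Two values taken from the table on this page.
ala_ala = classify(11.977, expected_20)  # AlaAla lies above 2.5 per mille
cys_cys = classify(0.232, expected_20)   # CysCys lies below 2.5 per mille
```

A baseline built from the observed single amino acid frequencies (the product of the two residue frequencies) would be an alternative to the uniform one, but the source only describes the uniform case.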

All values are given in per mille (‰) and therefore need to be multiplied by 10⁻³ to obtain fractions.

For more information, see the sequence space article on Wikipedia.

Ala Cys Asp Glu Phe Gly His Ile Lys Leu Met Asn Pro Gln Arg Ser Thr Val Trp Tyr Xaa
Ala
AlaAla: 11.977 ± 1.416
AlaCys: 0.85 ± 0.247
AlaAsp: 5.255 ± 0.672
AlaGlu: 10.046 ± 1.139
AlaPhe: 2.782 ± 0.371
AlaGly: 6.723 ± 0.903
AlaHis: 1.236 ± 0.323
AlaIle: 4.946 ± 0.739
AlaLys: 4.636 ± 0.546
AlaLeu: 6.491 ± 0.758
AlaMet: 3.091 ± 0.464
AlaAsn: 3.014 ± 0.48
AlaPro: 4.559 ± 0.617
AlaGln: 4.173 ± 0.91
AlaArg: 4.25 ± 0.719
AlaSer: 6.182 ± 0.708
AlaThr: 6.646 ± 0.853
AlaVal: 6.336 ± 0.587
AlaTrp: 1.468 ± 0.308
AlaTyr: 2.705 ± 0.458
AlaXaa: 0.0 ± 0.0
Cys
CysAla: 0.309 ± 0.14
CysCys: 0.232 ± 0.137
CysAsp: 0.541 ± 0.236
CysGlu: 1.005 ± 0.261
CysPhe: 0.85 ± 0.248
CysGly: 1.314 ± 0.385
CysHis: 0.386 ± 0.193
CysIle: 0.309 ± 0.146
CysLys: 0.85 ± 0.239
CysLeu: 0.309 ± 0.148
CysMet: 0.309 ± 0.17
CysAsn: 0.773 ± 0.262
CysPro: 0.618 ± 0.222
CysGln: 0.386 ± 0.179
CysArg: 0.695 ± 0.308
CysSer: 0.695 ± 0.223
CysThr: 0.695 ± 0.228
CysVal: 1.236 ± 0.325
CysTrp: 0.155 ± 0.114
CysTyr: 0.232 ± 0.138
CysXaa: 0.0 ± 0.0
Asp
AspAla: 4.405 ± 0.824
AspCys: 0.695 ± 0.26
AspAsp: 4.791 ± 0.627
AspGlu: 3.786 ± 0.539
AspPhe: 2.782 ± 0.492
AspGly: 3.864 ± 0.478
AspHis: 0.773 ± 0.18
AspIle: 2.627 ± 0.414
AspLys: 3.4 ± 0.468
AspLeu: 4.482 ± 0.565
AspMet: 1.314 ± 0.393
AspAsn: 2.705 ± 0.336
AspPro: 3.709 ± 0.6
AspGln: 1.855 ± 0.31
AspArg: 3.091 ± 0.56
AspSer: 2.55 ± 0.528
AspThr: 4.018 ± 0.507
AspVal: 2.705 ± 0.425
AspTrp: 1.314 ± 0.274
AspTyr: 3.014 ± 0.416
AspXaa: 0.0 ± 0.0
Glu
GluAla: 9.041 ± 1.004
GluCys: 0.541 ± 0.2
GluAsp: 3.709 ± 0.455
GluGlu: 6.027 ± 0.769
GluPhe: 1.468 ± 0.372
GluGly: 4.25 ± 0.612
GluHis: 0.927 ± 0.27
GluIle: 5.796 ± 0.701
GluLys: 5.796 ± 0.661
GluLeu: 7.418 ± 0.938
GluMet: 3.091 ± 0.549
GluAsn: 3.786 ± 0.528
GluPro: 3.091 ± 0.663
GluGln: 2.318 ± 0.4
GluArg: 3.632 ± 0.686
GluSer: 3.014 ± 0.525
GluThr: 5.255 ± 0.675
GluVal: 4.636 ± 0.509
GluTrp: 0.618 ± 0.24
GluTyr: 3.4 ± 0.505
GluXaa: 0.0 ± 0.0
Phe
PheAla: 3.091 ± 0.374
PheCys: 0.927 ± 0.278
PheAsp: 2.395 ± 0.419
PheGlu: 2.627 ± 0.44
PhePhe: 1.005 ± 0.284
PheGly: 2.086 ± 0.406
PheHis: 0.155 ± 0.105
PheIle: 1.7 ± 0.404
PheLys: 2.164 ± 0.394
PheLeu: 2.086 ± 0.415
PheMet: 0.541 ± 0.175
PheAsn: 1.623 ± 0.319
PhePro: 0.773 ± 0.302
PheGln: 0.85 ± 0.276
PheArg: 1.082 ± 0.227
PheSer: 2.627 ± 0.39
PheThr: 2.859 ± 0.423
PheVal: 1.623 ± 0.349
PheTrp: 0.773 ± 0.225
PheTyr: 1.005 ± 0.251
PheXaa: 0.0 ± 0.0
Gly
GlyAla: 5.255 ± 0.636
GlyCys: 1.545 ± 0.361
GlyAsp: 3.864 ± 0.569
GlyGlu: 4.018 ± 0.539
GlyPhe: 2.627 ± 0.417
GlyGly: 6.182 ± 0.634
GlyHis: 1.314 ± 0.408
GlyIle: 3.709 ± 0.738
GlyLys: 4.791 ± 0.631
GlyLeu: 5.023 ± 0.456
GlyMet: 1.314 ± 0.266
GlyAsn: 2.627 ± 0.548
GlyPro: 1.777 ± 0.318
GlyGln: 2.936 ± 0.484
GlyArg: 4.405 ± 0.563
GlySer: 4.714 ± 0.608
GlyThr: 4.946 ± 0.629
GlyVal: 5.1 ± 0.549
GlyTrp: 1.082 ± 0.321
GlyTyr: 2.936 ± 0.409
GlyXaa: 0.0 ± 0.0
His
HisAla: 1.005 ± 0.285
HisCys: 0.077 ± 0.08
HisAsp: 0.541 ± 0.222
HisGlu: 1.082 ± 0.207
HisPhe: 1.005 ± 0.345
HisGly: 1.314 ± 0.253
HisHis: 0.464 ± 0.166
HisIle: 0.85 ± 0.27
HisLys: 1.236 ± 0.254
HisLeu: 0.773 ± 0.219
HisMet: 0.155 ± 0.117
HisAsn: 0.773 ± 0.209
HisPro: 0.464 ± 0.185
HisGln: 0.232 ± 0.136
HisArg: 0.773 ± 0.209
HisSer: 0.85 ± 0.261
HisThr: 0.618 ± 0.19
HisVal: 1.236 ± 0.301
HisTrp: 0.309 ± 0.155
HisTyr: 0.541 ± 0.206
HisXaa: 0.0 ± 0.0
Ile
IleAla: 6.182 ± 0.635
IleCys: 0.773 ± 0.242
IleAsp: 4.405 ± 0.521
IleGlu: 4.946 ± 0.629
IlePhe: 2.164 ± 0.363
IleGly: 3.091 ± 0.595
IleHis: 0.464 ± 0.179
IleIle: 4.636 ± 0.541
IleLys: 5.486 ± 0.597
IleLeu: 3.477 ± 0.447
IleMet: 1.391 ± 0.261
IleAsn: 2.936 ± 0.496
IlePro: 3.555 ± 0.602
IleGln: 3.786 ± 0.493
IleArg: 2.859 ± 0.387
IleSer: 2.627 ± 0.525
IleThr: 5.023 ± 0.53
IleVal: 3.168 ± 0.567
IleTrp: 0.464 ± 0.168
IleTyr: 2.086 ± 0.46
IleXaa: 0.0 ± 0.0
Lys
LysAla: 7.882 ± 0.739
LysCys: 0.618 ± 0.228
LysAsp: 3.014 ± 0.536
LysGlu: 4.636 ± 0.674
LysPhe: 1.7 ± 0.338
LysGly: 4.946 ± 0.715
LysHis: 1.005 ± 0.258
LysIle: 4.714 ± 0.575
LysLys: 5.641 ± 1.116
LysLeu: 4.791 ± 0.734
LysMet: 1.082 ± 0.27
LysAsn: 2.627 ± 0.441
LysPro: 2.473 ± 0.462
LysGln: 1.777 ± 0.368
LysArg: 2.936 ± 0.427
LysSer: 3.632 ± 0.602
LysThr: 5.409 ± 0.562
LysVal: 3.555 ± 0.767
LysTrp: 1.005 ± 0.303
LysTyr: 2.859 ± 0.597
LysXaa: 0.0 ± 0.0
Leu
LeuAla: 7.186 ± 0.904
LeuCys: 0.927 ± 0.224
LeuAsp: 3.477 ± 0.575
LeuGlu: 5.796 ± 0.582
LeuPhe: 1.932 ± 0.345
LeuGly: 4.327 ± 0.628
LeuHis: 1.236 ± 0.257
LeuIle: 4.714 ± 0.699
LeuLys: 4.868 ± 0.632
LeuLeu: 5.409 ± 0.643
LeuMet: 2.395 ± 0.519
LeuAsn: 3.014 ± 0.507
LeuPro: 3.4 ± 0.527
LeuGln: 2.859 ± 0.493
LeuArg: 4.405 ± 0.633
LeuSer: 4.559 ± 0.565
LeuThr: 5.177 ± 0.532
LeuVal: 4.25 ± 0.547
LeuTrp: 0.618 ± 0.222
LeuTyr: 2.859 ± 0.48
LeuXaa: 0.0 ± 0.0
Met
MetAla: 2.859 ± 0.495
MetCys: 0.464 ± 0.176
MetAsp: 1.391 ± 0.309
MetGlu: 1.468 ± 0.36
MetPhe: 0.618 ± 0.206
MetGly: 1.7 ± 0.363
MetHis: 0.155 ± 0.108
MetIle: 2.086 ± 0.308
MetLys: 1.7 ± 0.295
MetLeu: 2.164 ± 0.385
MetMet: 0.077 ± 0.081
MetAsn: 1.468 ± 0.327
MetPro: 1.159 ± 0.344
MetGln: 1.468 ± 0.322
MetArg: 1.159 ± 0.281
MetSer: 1.932 ± 0.287
MetThr: 1.391 ± 0.319
MetVal: 0.773 ± 0.212
MetTrp: 0.232 ± 0.13
MetTyr: 0.618 ± 0.177
MetXaa: 0.0 ± 0.0
Asn
AsnAla: 3.709 ± 0.482
AsnCys: 0.309 ± 0.148
AsnAsp: 2.782 ± 0.49
AsnGlu: 2.936 ± 0.404
AsnPhe: 0.773 ± 0.213
AsnGly: 5.1 ± 0.75
AsnHis: 0.618 ± 0.199
AsnIle: 3.864 ± 0.596
AsnLys: 2.627 ± 0.503
AsnLeu: 1.623 ± 0.337
AsnMet: 1.236 ± 0.341
AsnAsn: 1.236 ± 0.446
AsnPro: 1.855 ± 0.291
AsnGln: 1.005 ± 0.257
AsnArg: 2.318 ± 0.385
AsnSer: 2.55 ± 0.531
AsnThr: 2.086 ± 0.454
AsnVal: 2.782 ± 0.445
AsnTrp: 0.618 ± 0.198
AsnTyr: 1.391 ± 0.302
AsnXaa: 0.0 ± 0.0
Pro
ProAla: 4.714 ± 0.639
ProCys: 0.541 ± 0.244
ProAsp: 3.014 ± 0.483
ProGlu: 4.405 ± 0.603
ProPhe: 1.468 ± 0.311
ProGly: 2.627 ± 0.54
ProHis: 0.386 ± 0.165
ProIle: 2.55 ± 0.59
ProLys: 1.7 ± 0.394
ProLeu: 3.555 ± 0.508
ProMet: 0.927 ± 0.225
ProAsn: 1.314 ± 0.391
ProPro: 2.241 ± 0.363
ProGln: 1.468 ± 0.316
ProArg: 0.927 ± 0.278
ProSer: 2.627 ± 0.479
ProThr: 3.477 ± 0.572
ProVal: 2.859 ± 0.502
ProTrp: 0.618 ± 0.339
ProTyr: 1.777 ± 0.34
ProXaa: 0.0 ± 0.0
Gln
GlnAla: 3.786 ± 0.582
GlnCys: 0.386 ± 0.148
GlnAsp: 2.086 ± 0.495
GlnGlu: 3.168 ± 0.426
GlnPhe: 0.927 ± 0.309
GlnGly: 1.623 ± 0.361
GlnHis: 1.159 ± 0.297
GlnIle: 3.323 ± 0.554
GlnLys: 2.241 ± 0.365
GlnLeu: 3.323 ± 0.611
GlnMet: 1.005 ± 0.304
GlnAsn: 2.241 ± 0.447
GlnPro: 2.086 ± 0.456
GlnGln: 2.627 ± 0.46
GlnArg: 2.55 ± 0.474
GlnSer: 2.086 ± 0.362
GlnThr: 2.705 ± 0.494
GlnVal: 1.391 ± 0.331
GlnTrp: 0.464 ± 0.207
GlnTyr: 0.927 ± 0.231
GlnXaa: 0.0 ± 0.0
Arg
ArgAla: 3.632 ± 0.536
ArgCys: 0.618 ± 0.199
ArgAsp: 2.164 ± 0.504
ArgGlu: 4.096 ± 0.629
ArgPhe: 1.623 ± 0.354
ArgGly: 2.936 ± 0.47
ArgHis: 0.773 ± 0.263
ArgIle: 3.555 ± 0.565
ArgLys: 4.173 ± 0.822
ArgLeu: 4.018 ± 0.562
ArgMet: 1.236 ± 0.303
ArgAsn: 2.395 ± 0.521
ArgPro: 1.468 ± 0.36
ArgGln: 2.627 ± 0.405
ArgArg: 3.323 ± 0.672
ArgSer: 2.55 ± 0.392
ArgThr: 2.782 ± 0.425
ArgVal: 2.55 ± 0.402
ArgTrp: 0.386 ± 0.184
ArgTyr: 2.782 ± 0.471
ArgXaa: 0.0 ± 0.0
Ser
SerAla: 4.868 ± 0.702
SerCys: 0.618 ± 0.224
SerAsp: 3.786 ± 0.518
SerGlu: 4.25 ± 0.469
SerPhe: 2.086 ± 0.4
SerGly: 5.564 ± 0.572
SerHis: 1.005 ± 0.327
SerIle: 2.318 ± 0.329
SerLys: 4.559 ± 0.486
SerLeu: 4.559 ± 0.551
SerMet: 1.236 ± 0.287
SerAsn: 2.241 ± 0.538
SerPro: 1.777 ± 0.368
SerGln: 2.627 ± 0.448
SerArg: 2.473 ± 0.477
SerSer: 2.936 ± 0.513
SerThr: 4.173 ± 0.668
SerVal: 3.091 ± 0.539
SerTrp: 1.159 ± 0.336
SerTyr: 2.318 ± 0.377
SerXaa: 0.0 ± 0.0
Thr
ThrAla: 7.496 ± 1.03
ThrCys: 0.386 ± 0.145
ThrAsp: 3.786 ± 0.496
ThrGlu: 5.796 ± 0.721
ThrPhe: 2.009 ± 0.474
ThrGly: 5.718 ± 0.768
ThrHis: 0.85 ± 0.251
ThrIle: 4.791 ± 0.554
ThrLys: 3.4 ± 0.445
ThrLeu: 5.023 ± 0.501
ThrMet: 1.855 ± 0.371
ThrAsn: 1.468 ± 0.408
ThrPro: 3.941 ± 0.648
ThrGln: 3.245 ± 0.414
ThrArg: 3.323 ± 0.463
ThrSer: 3.168 ± 0.393
ThrThr: 5.1 ± 0.646
ThrVal: 4.636 ± 0.694
ThrTrp: 0.618 ± 0.209
ThrTyr: 2.395 ± 0.344
ThrXaa: 0.0 ± 0.0
Val
ValAla: 5.177 ± 0.609
ValCys: 0.618 ± 0.205
ValAsp: 3.632 ± 0.5
ValGlu: 3.555 ± 0.49
ValPhe: 2.318 ± 0.362
ValGly: 3.168 ± 0.57
ValHis: 0.85 ± 0.253
ValIle: 3.323 ± 0.477
ValLys: 4.018 ± 0.733
ValLeu: 5.718 ± 0.698
ValMet: 1.314 ± 0.274
ValAsn: 2.705 ± 0.593
ValPro: 2.936 ± 0.454
ValGln: 2.241 ± 0.45
ValArg: 1.932 ± 0.362
ValSer: 4.868 ± 0.621
ValThr: 3.632 ± 0.508
ValVal: 3.245 ± 0.509
ValTrp: 0.927 ± 0.231
ValTyr: 1.855 ± 0.362
ValXaa: 0.0 ± 0.0
Trp
TrpAla: 1.545 ± 0.37
TrpCys: 0.309 ± 0.151
TrpAsp: 0.695 ± 0.274
TrpGlu: 1.005 ± 0.324
TrpPhe: 0.464 ± 0.19
TrpGly: 0.541 ± 0.292
TrpHis: 0.0 ± 0.0
TrpIle: 0.927 ± 0.246
TrpLys: 0.773 ± 0.251
TrpLeu: 0.85 ± 0.288
TrpMet: 0.232 ± 0.143
TrpAsn: 0.773 ± 0.185
TrpPro: 0.232 ± 0.141
TrpGln: 0.773 ± 0.249
TrpArg: 1.314 ± 0.32
TrpSer: 0.927 ± 0.247
TrpThr: 0.464 ± 0.204
TrpVal: 1.236 ± 0.373
TrpTrp: 0.309 ± 0.136
TrpTyr: 0.386 ± 0.167
TrpXaa: 0.0 ± 0.0
Tyr
TyrAla: 3.245 ± 0.44
TyrCys: 0.464 ± 0.155
TyrAsp: 2.627 ± 0.351
TyrGlu: 3.168 ± 0.421
TyrPhe: 1.391 ± 0.323
TyrGly: 3.091 ± 0.445
TyrHis: 0.541 ± 0.193
TyrIle: 2.782 ± 0.486
TyrLys: 2.241 ± 0.363
TyrLeu: 2.318 ± 0.334
TyrMet: 1.005 ± 0.284
TyrAsn: 1.7 ± 0.277
TyrPro: 1.082 ± 0.257
TyrGln: 0.85 ± 0.27
TyrArg: 2.241 ± 0.432
TyrSer: 2.55 ± 0.502
TyrThr: 2.55 ± 0.477
TyrVal: 1.623 ± 0.311
TyrTrp: 0.618 ± 0.177
TyrTyr: 1.545 ± 0.319
TyrXaa: 0.0 ± 0.0
Xaa
XaaAla: 0.0 ± 0.0
XaaCys: 0.0 ± 0.0
XaaAsp: 0.0 ± 0.0
XaaGlu: 0.0 ± 0.0
XaaPhe: 0.0 ± 0.0
XaaGly: 0.0 ± 0.0
XaaHis: 0.0 ± 0.0
XaaIle: 0.0 ± 0.0
XaaLys: 0.0 ± 0.0
XaaLeu: 0.0 ± 0.0
XaaMet: 0.0 ± 0.0
XaaAsn: 0.0 ± 0.0
XaaPro: 0.0 ± 0.0
XaaGln: 0.0 ± 0.0
XaaArg: 0.0 ± 0.0
XaaSer: 0.0 ± 0.0
XaaThr: 0.0 ± 0.0
XaaVal: 0.0 ± 0.0
XaaTrp: 0.0 ± 0.0
XaaTyr: 0.0 ± 0.0
XaaXaa: 0.0 ± 0.0
Statistics based on 59 proteins (12942 amino acids)

Note: The error was estimated by bootstrapping (100 resamples) at the protein level.
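
To make the note concrete, here is a minimal Python sketch of a protein-level bootstrap: whole proteins are resampled with replacement, the dipeptide frequency is recomputed for each resample, and the spread of those estimates is taken as the error. The function names, the one-letter toy sequences and the fixed seed are hypothetical, and using the sample standard deviation as the reported error is an assumption, since the page does not say which statistic it uses.

```python
import random
import statistics
from collections import Counter


def pair_frequency(sequences, pair):
    """Per-mille frequency of one dipeptide over a list of sequences."""
    counts = Counter()
    for seq in sequences:
        counts.update(seq[i:i + 2] for i in range(len(seq) - 1))
    total = sum(counts.values())
    return 1000.0 * counts[pair] / total if total else 0.0


def bootstrap_error(sequences, pair, n_resamples=100, seed=0):
    """Spread of the frequency over resamples drawn at the protein level."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    estimates = []
    for _ in range(n_resamples):
        # Draw as many proteins as in the original set, with replacement.
        resample = rng.choices(sequences, k=len(sequences))
        estimates.append(pair_frequency(resample, pair))
    # Assumption: report the sample standard deviation of the estimates.
    return statistics.stdev(estimates)


# Toy input (not real phage proteins).
toy_proteome = ["MAAK", "MKKA", "AAAA"]
error_aa = bootstrap_error(toy_proteome, "AA")
```

Resampling whole proteins rather than individual residues keeps the adjacent pairs within each protein intact, which is what "at the protein level" suggests.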

The dipeptide statistics above (along with other statistics for this proteome) can be downloaded from this CSV file
See this proteome in: uniprot_link
Proteome-pI is available under the Creative Commons Attribution-NoDerivs license; for more details see here

Reference: Kozlowski LP. Proteome-pI 2.0: Proteome Isoelectric Point Database Update. Nucleic Acids Res. 2021, doi: 10.1093/nar/gkab944
Contact: Lukasz P. Kozlowski