Amino acid dipeptide frequency for Streptomyces phage Forthebois

Apart from single amino acid frequencies, one can also calculate the so-called amino acid dipeptide frequency.

There are 400 possible dipeptides (441 if Xaa is counted as an additional, 21st amino acid representing all non-standard or unknown amino acids). Thus, if dipeptides occurred randomly in proteins, with all amino acids equally frequent, each dipeptide would be present at a frequency of 0.25% (2.5‰). As this is not the case in nature, for better visibility dipeptides that occur more often than expected are marked in red in the table, and those that are underrepresented are marked in blue.
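The calculation described above can be sketched as follows. This is a minimal illustration, not the database's own code: it assumes one-letter amino acid codes, overlapping adjacent pairs, and pairs that never span two proteins. The function name `dipeptide_frequencies` and the toy sequences are hypothetical.

```python
from collections import Counter


def dipeptide_frequencies(proteins):
    """Return {dipeptide: fraction} from overlapping adjacent residue pairs."""
    counts = Counter()
    for seq in proteins:
        # Slide a window of width 2 along each protein (assumed convention);
        # pairs are never formed across two different proteins.
        counts.update(seq[i:i + 2] for i in range(len(seq) - 1))
    total = sum(counts.values())
    return {pair: n / total for pair, n in counts.items()} if total else {}


# Expectation if all 20 x 20 dipeptides were equally likely.
UNIFORM_EXPECTED = 1 / 400  # 0.0025, i.e. 0.25%

# Made-up sequences for illustration only, not taken from the proteome.
toy_proteins = ["AAC", "CA"]
freqs = dipeptide_frequencies(toy_proteins)
for pair, fraction in sorted(freqs.items()):
    print(pair, round(fraction, 4), fraction > UNIFORM_EXPECTED)
```

In the toy input each of the three observed pairs (AA, AC, CA) accounts for one third of all dipeptides, far above the uniform expectation, simply because the sample is tiny.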

All values are given in per mille (‰); multiply them by 10^-3 to obtain the corresponding fraction.
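As a small worked example of the unit, the sketch below converts a few per mille values from the table into fractions and compares them with the uniform expectation of 2.5‰ (1000/400). The helper names are hypothetical, and using exactly 2.5‰ as the cut-off for over- or underrepresentation is an assumption based on the 0.25% figure in the text.

```python
# Uniform expectation in per mille, assuming 400 equally likely dipeptides.
EXPECTED_PER_MILLE = 1000 / 400  # 2.5


def per_mille_to_fraction(value):
    """Convert a per mille value to a plain fraction (multiply by 10^-3)."""
    return value * 1e-3


def classify(value, expected=EXPECTED_PER_MILLE):
    """Label a per mille frequency relative to the assumed expectation."""
    if value > expected:
        return "over"
    if value < expected:
        return "under"
    return "as expected"


# Values copied from the table (per mille).
observed = {"AlaAla": 12.629, "GlyGly": 9.922, "LysLys": 2.706,
            "TrpTrp": 0.18, "CysCys": 0.0}

for pair, value in observed.items():
    print(pair, per_mille_to_fraction(value), classify(value))
```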

For more information, see the sequence space article on Wikipedia.

Ala Cys Asp Glu Phe Gly His Ile Lys Leu Met Asn Pro Gln Arg Ser Thr Val Trp Tyr Xaa
Ala
AlaAla: 12.629 ± 2.042
AlaCys: 0.361 ± 0.262
AlaAsp: 4.871 ± 1.167
AlaGlu: 5.593 ± 1.277
AlaPhe: 2.165 ± 0.719
AlaGly: 9.201 ± 1.53
AlaHis: 1.984 ± 0.545
AlaIle: 4.871 ± 0.977
AlaLys: 7.216 ± 1.404
AlaLeu: 6.855 ± 1.346
AlaMet: 3.608 ± 0.798
AlaAsn: 5.232 ± 0.852
AlaPro: 4.51 ± 1.513
AlaGln: 4.691 ± 0.908
AlaArg: 3.789 ± 0.797
AlaSer: 3.969 ± 0.92
AlaThr: 7.758 ± 1.541
AlaVal: 5.412 ± 1.244
AlaTrp: 1.984 ± 0.566
AlaTyr: 3.428 ± 0.869
AlaXaa: 0.0 ± 0.0
Cys
CysAla: 0.361 ± 0.235
CysCys: 0.0 ± 0.0
CysAsp: 0.361 ± 0.271
CysGlu: 0.0 ± 0.0
CysPhe: 0.0 ± 0.0
CysGly: 1.082 ± 0.51
CysHis: 0.541 ± 0.275
CysIle: 0.361 ± 0.208
CysLys: 0.361 ± 0.208
CysLeu: 0.361 ± 0.23
CysMet: 0.0 ± 0.0
CysAsn: 0.0 ± 0.0
CysPro: 0.361 ± 0.247
CysGln: 0.18 ± 0.172
CysArg: 1.082 ± 0.614
CysSer: 0.18 ± 0.16
CysThr: 0.361 ± 0.238
CysVal: 1.082 ± 0.526
CysTrp: 0.0 ± 0.0
CysTyr: 0.18 ± 0.16
CysXaa: 0.0 ± 0.0
Asp
AspAla: 6.495 ± 1.086
AspCys: 0.361 ± 0.207
AspAsp: 3.608 ± 1.036
AspGlu: 3.969 ± 1.177
AspPhe: 3.428 ± 0.717
AspGly: 4.871 ± 0.887
AspHis: 0.722 ± 0.442
AspIle: 2.706 ± 0.677
AspLys: 3.608 ± 0.916
AspLeu: 3.247 ± 0.678
AspMet: 1.263 ± 0.412
AspAsn: 2.165 ± 0.716
AspPro: 3.789 ± 0.798
AspGln: 2.526 ± 0.787
AspArg: 1.624 ± 0.669
AspSer: 2.526 ± 0.665
AspThr: 3.428 ± 0.75
AspVal: 4.149 ± 0.881
AspTrp: 0.902 ± 0.332
AspTyr: 0.541 ± 0.274
AspXaa: 0.0 ± 0.0
Glu
GluAla: 3.067 ± 0.728
GluCys: 0.361 ± 0.285
GluAsp: 2.526 ± 0.658
GluGlu: 1.804 ± 0.573
GluPhe: 3.428 ± 0.73
GluGly: 3.067 ± 0.67
GluHis: 0.902 ± 0.358
GluIle: 3.247 ± 0.746
GluLys: 3.428 ± 0.855
GluLeu: 3.067 ± 0.593
GluMet: 2.345 ± 0.691
GluAsn: 3.969 ± 0.765
GluPro: 2.345 ± 0.592
GluGln: 1.082 ± 0.521
GluArg: 1.624 ± 0.691
GluSer: 3.789 ± 0.876
GluThr: 2.706 ± 0.773
GluVal: 3.247 ± 0.634
GluTrp: 1.624 ± 0.479
GluTyr: 1.984 ± 0.743
GluXaa: 0.0 ± 0.0
Phe
PheAla: 2.706 ± 0.526
PheCys: 0.0 ± 0.0
PheAsp: 2.345 ± 0.844
PheGlu: 1.804 ± 0.563
PhePhe: 0.902 ± 0.328
PheGly: 4.871 ± 0.826
PheHis: 0.902 ± 0.433
PheIle: 1.263 ± 0.43
PheLys: 0.902 ± 0.375
PheLeu: 2.165 ± 0.673
PheMet: 0.361 ± 0.219
PheAsn: 2.345 ± 0.612
PhePro: 1.624 ± 0.519
PheGln: 1.804 ± 0.489
PheArg: 1.804 ± 0.575
PheSer: 1.804 ± 0.528
PheThr: 2.345 ± 0.661
PheVal: 3.247 ± 0.643
PheTrp: 0.722 ± 0.318
PheTyr: 0.902 ± 0.411
PheXaa: 0.0 ± 0.0
Gly
GlyAla: 6.855 ± 1.481
GlyCys: 0.722 ± 0.434
GlyAsp: 5.953 ± 1.182
GlyGlu: 4.33 ± 0.847
GlyPhe: 3.428 ± 0.647
GlyGly: 9.922 ± 2.024
GlyHis: 0.722 ± 0.408
GlyIle: 4.33 ± 0.856
GlyLys: 6.675 ± 1.625
GlyLeu: 5.593 ± 1.089
GlyMet: 2.165 ± 0.739
GlyAsn: 7.216 ± 1.232
GlyPro: 5.593 ± 0.975
GlyGln: 3.789 ± 0.69
GlyArg: 3.608 ± 0.722
GlySer: 7.397 ± 1.479
GlyThr: 6.675 ± 1.355
GlyVal: 5.593 ± 1.036
GlyTrp: 2.526 ± 0.617
GlyTyr: 2.706 ± 0.795
GlyXaa: 0.0 ± 0.0
His
HisAla: 1.082 ± 0.413
HisCys: 0.0 ± 0.0
HisAsp: 1.443 ± 0.492
HisGlu: 0.902 ± 0.443
HisPhe: 1.082 ± 0.504
HisGly: 1.443 ± 0.894
HisHis: 0.361 ± 0.259
HisIle: 0.541 ± 0.272
HisLys: 0.541 ± 0.292
HisLeu: 1.443 ± 0.428
HisMet: 0.18 ± 0.173
HisAsn: 0.541 ± 0.31
HisPro: 1.443 ± 0.554
HisGln: 0.541 ± 0.311
HisArg: 1.082 ± 0.581
HisSer: 0.18 ± 0.16
HisThr: 0.902 ± 0.422
HisVal: 1.443 ± 0.582
HisTrp: 0.18 ± 0.205
HisTyr: 1.082 ± 0.422
HisXaa: 0.0 ± 0.0
Ile
IleAla: 5.412 ± 1.045
IleCys: 0.361 ± 0.243
IleAsp: 1.804 ± 0.487
IleGlu: 1.624 ± 0.63
IlePhe: 1.082 ± 0.442
IleGly: 5.953 ± 1.18
IleHis: 1.624 ± 0.454
IleIle: 1.984 ± 0.652
IleLys: 1.984 ± 0.637
IleLeu: 4.691 ± 0.929
IleMet: 1.263 ± 0.403
IleAsn: 1.443 ± 0.443
IlePro: 2.165 ± 0.654
IleGln: 1.082 ± 0.46
IleArg: 3.428 ± 0.722
IleSer: 3.789 ± 0.87
IleThr: 5.051 ± 0.921
IleVal: 4.149 ± 0.982
IleTrp: 1.443 ± 0.571
IleTyr: 1.624 ± 0.484
IleXaa: 0.0 ± 0.0
Lys
LysAla: 4.871 ± 1.682
LysCys: 0.361 ± 0.288
LysAsp: 3.247 ± 0.641
LysGlu: 2.706 ± 0.597
LysPhe: 1.263 ± 0.553
LysGly: 4.33 ± 0.968
LysHis: 0.902 ± 0.304
LysIle: 3.428 ± 0.894
LysLys: 2.706 ± 0.783
LysLeu: 2.887 ± 0.781
LysMet: 2.887 ± 0.799
LysAsn: 2.165 ± 0.541
LysPro: 3.067 ± 0.841
LysGln: 2.526 ± 0.537
LysArg: 2.706 ± 0.695
LysSer: 3.247 ± 0.722
LysThr: 3.608 ± 0.685
LysVal: 3.428 ± 0.847
LysTrp: 1.443 ± 0.541
LysTyr: 0.361 ± 0.258
LysXaa: 0.0 ± 0.0
Leu
LeuAla: 7.577 ± 0.998
LeuCys: 0.361 ± 0.278
LeuAsp: 3.789 ± 0.792
LeuGlu: 3.969 ± 0.827
LeuPhe: 1.624 ± 0.48
LeuGly: 7.216 ± 0.933
LeuHis: 1.624 ± 0.562
LeuIle: 5.593 ± 0.965
LeuLys: 3.428 ± 0.912
LeuLeu: 3.608 ± 0.845
LeuMet: 2.165 ± 0.546
LeuAsn: 2.706 ± 0.565
LeuPro: 3.067 ± 0.7
LeuGln: 1.082 ± 0.45
LeuArg: 4.33 ± 0.761
LeuSer: 4.871 ± 0.851
LeuThr: 3.608 ± 0.735
LeuVal: 4.691 ± 1.061
LeuTrp: 1.443 ± 0.526
LeuTyr: 1.624 ± 0.632
LeuXaa: 0.0 ± 0.0
Met
MetAla: 4.33 ± 0.893
MetCys: 0.361 ± 0.234
MetAsp: 1.082 ± 0.448
MetGlu: 0.361 ± 0.25
MetPhe: 0.541 ± 0.35
MetGly: 2.887 ± 0.784
MetHis: 0.361 ± 0.234
MetIle: 0.902 ± 0.333
MetLys: 0.902 ± 0.362
MetLeu: 3.247 ± 0.894
MetMet: 1.984 ± 0.906
MetAsn: 1.263 ± 0.471
MetPro: 1.984 ± 0.609
MetGln: 0.18 ± 0.206
MetArg: 1.443 ± 0.491
MetSer: 1.804 ± 0.691
MetThr: 1.443 ± 0.54
MetVal: 1.263 ± 0.501
MetTrp: 0.18 ± 0.199
MetTyr: 1.443 ± 0.396
MetXaa: 0.0 ± 0.0
Asn
AsnAla: 5.412 ± 1.41
AsnCys: 0.18 ± 0.143
AsnAsp: 1.984 ± 0.545
AsnGlu: 2.165 ± 0.525
AsnPhe: 0.902 ± 0.352
AsnGly: 6.855 ± 1.229
AsnHis: 0.722 ± 0.326
AsnIle: 1.082 ± 0.353
AsnLys: 2.345 ± 0.67
AsnLeu: 3.067 ± 0.831
AsnMet: 1.082 ± 0.499
AsnAsn: 1.624 ± 0.672
AsnPro: 4.33 ± 1.216
AsnGln: 1.624 ± 0.532
AsnArg: 1.624 ± 0.456
AsnSer: 4.33 ± 0.753
AsnThr: 4.149 ± 0.739
AsnVal: 3.067 ± 0.735
AsnTrp: 0.541 ± 0.269
AsnTyr: 1.082 ± 0.334
AsnXaa: 0.0 ± 0.0
Pro
ProAla: 5.593 ± 1.909
ProCys: 0.361 ± 0.287
ProAsp: 4.871 ± 1.158
ProGlu: 3.067 ± 0.847
ProPhe: 2.887 ± 0.817
ProGly: 4.691 ± 1.172
ProHis: 0.722 ± 0.326
ProIle: 1.984 ± 0.488
ProLys: 4.149 ± 1.082
ProLeu: 2.706 ± 0.631
ProMet: 1.082 ± 0.357
ProAsn: 3.428 ± 0.847
ProPro: 2.887 ± 1.15
ProGln: 1.443 ± 0.624
ProArg: 1.984 ± 0.755
ProSer: 1.984 ± 0.727
ProThr: 5.051 ± 1.107
ProVal: 4.149 ± 1.284
ProTrp: 1.082 ± 0.531
ProTyr: 2.165 ± 0.79
ProXaa: 0.0 ± 0.0
Gln
GlnAla: 4.51 ± 0.743
GlnCys: 0.361 ± 0.296
GlnAsp: 1.443 ± 0.568
GlnGlu: 1.984 ± 0.813
GlnPhe: 1.443 ± 0.728
GlnGly: 2.526 ± 0.568
GlnHis: 0.361 ± 0.388
GlnIle: 1.984 ± 0.881
GlnLys: 0.722 ± 0.359
GlnLeu: 1.984 ± 0.511
GlnMet: 1.082 ± 0.405
GlnAsn: 1.624 ± 0.585
GlnPro: 1.443 ± 0.683
GlnGln: 1.443 ± 0.474
GlnArg: 2.345 ± 0.814
GlnSer: 2.165 ± 0.723
GlnThr: 3.247 ± 0.939
GlnVal: 1.804 ± 0.634
GlnTrp: 0.902 ± 0.581
GlnTyr: 1.984 ± 0.703
GlnXaa: 0.0 ± 0.0
Arg
ArgAla: 4.149 ± 0.679
ArgCys: 1.082 ± 0.342
ArgAsp: 2.887 ± 0.952
ArgGlu: 3.067 ± 0.768
ArgPhe: 1.263 ± 0.496
ArgGly: 3.789 ± 0.753
ArgHis: 0.541 ± 0.28
ArgIle: 3.789 ± 0.811
ArgLys: 3.969 ± 1.083
ArgLeu: 3.789 ± 0.815
ArgMet: 1.443 ± 0.438
ArgAsn: 1.984 ± 0.544
ArgPro: 2.345 ± 0.572
ArgGln: 2.165 ± 0.757
ArgArg: 3.247 ± 0.699
ArgSer: 2.165 ± 0.495
ArgThr: 2.526 ± 0.612
ArgVal: 4.33 ± 1.081
ArgTrp: 0.361 ± 0.248
ArgTyr: 1.263 ± 0.398
ArgXaa: 0.0 ± 0.0
Ser
SerAla: 8.118 ± 0.958
SerCys: 0.541 ± 0.358
SerAsp: 3.247 ± 0.712
SerGlu: 2.706 ± 0.715
SerPhe: 1.263 ± 0.446
SerGly: 7.216 ± 1.783
SerHis: 0.541 ± 0.272
SerIle: 3.608 ± 0.932
SerLys: 1.984 ± 0.656
SerLeu: 4.149 ± 0.797
SerMet: 1.984 ± 0.629
SerAsn: 2.526 ± 1.139
SerPro: 2.706 ± 0.752
SerGln: 2.706 ± 0.895
SerArg: 2.706 ± 0.527
SerSer: 4.149 ± 0.908
SerThr: 2.887 ± 0.557
SerVal: 3.969 ± 0.708
SerTrp: 0.541 ± 0.376
SerTyr: 2.706 ± 0.722
SerXaa: 0.0 ± 0.0
Thr
ThrAla: 7.577 ± 1.433
ThrCys: 0.902 ± 0.502
ThrAsp: 2.526 ± 0.598
ThrGlu: 2.887 ± 0.849
ThrPhe: 2.526 ± 0.659
ThrGly: 6.314 ± 1.076
ThrHis: 1.263 ± 0.435
ThrIle: 3.789 ± 0.991
ThrLys: 3.247 ± 0.672
ThrLeu: 5.051 ± 1.251
ThrMet: 0.902 ± 0.314
ThrAsn: 2.706 ± 0.663
ThrPro: 5.412 ± 1.674
ThrGln: 3.067 ± 0.879
ThrArg: 3.608 ± 0.845
ThrSer: 4.149 ± 0.815
ThrThr: 6.675 ± 1.094
ThrVal: 5.232 ± 1.424
ThrTrp: 0.902 ± 0.364
ThrTyr: 2.706 ± 0.721
ThrXaa: 0.0 ± 0.0
Val
ValAla: 5.773 ± 1.513
ValCys: 0.0 ± 0.0
ValAsp: 3.608 ± 0.658
ValGlu: 3.247 ± 0.822
ValPhe: 3.428 ± 0.869
ValGly: 6.675 ± 1.188
ValHis: 1.082 ± 0.348
ValIle: 4.51 ± 0.837
ValLys: 3.247 ± 0.653
ValLeu: 5.773 ± 1.442
ValMet: 1.082 ± 0.505
ValAsn: 2.165 ± 0.59
ValPro: 4.51 ± 0.991
ValGln: 1.804 ± 0.492
ValArg: 4.149 ± 1.027
ValSer: 4.691 ± 0.82
ValThr: 5.773 ± 0.964
ValVal: 4.33 ± 1.006
ValTrp: 1.082 ± 0.433
ValTyr: 1.624 ± 0.59
ValXaa: 0.0 ± 0.0
Trp
TrpAla: 1.082 ± 0.411
TrpCys: 0.18 ± 0.193
TrpAsp: 2.706 ± 0.622
TrpGlu: 1.082 ± 0.392
TrpPhe: 0.902 ± 0.403
TrpGly: 0.722 ± 0.384
TrpHis: 0.18 ± 0.199
TrpIle: 0.722 ± 0.399
TrpLys: 0.18 ± 0.205
TrpLeu: 1.082 ± 0.345
TrpMet: 0.361 ± 0.287
TrpAsn: 0.902 ± 0.522
TrpPro: 0.902 ± 0.317
TrpGln: 0.361 ± 0.292
TrpArg: 1.624 ± 0.344
TrpSer: 1.082 ± 0.538
TrpThr: 1.082 ± 0.522
TrpVal: 1.984 ± 0.579
TrpTrp: 0.18 ± 0.169
TrpTyr: 1.263 ± 0.409
TrpXaa: 0.0 ± 0.0
Tyr
TyrAla: 2.526 ± 0.643
TyrCys: 0.0 ± 0.0
TyrAsp: 1.624 ± 0.615
TyrGlu: 2.345 ± 0.698
TyrPhe: 1.263 ± 0.441
TyrGly: 2.165 ± 0.646
TyrHis: 0.361 ± 0.24
TyrIle: 1.263 ± 0.436
TyrLys: 0.541 ± 0.318
TyrLeu: 3.608 ± 0.936
TyrMet: 0.722 ± 0.285
TyrAsn: 2.165 ± 0.497
TyrPro: 1.804 ± 0.617
TyrGln: 1.263 ± 0.62
TyrArg: 2.345 ± 0.902
TyrSer: 1.984 ± 0.791
TyrThr: 2.165 ± 0.542
TyrVal: 1.984 ± 0.536
TyrTrp: 0.541 ± 0.43
TyrTyr: 0.541 ± 0.28
TyrXaa: 0.0 ± 0.0
Xaa
XaaAla: 0.0 ± 0.0
XaaCys: 0.0 ± 0.0
XaaAsp: 0.0 ± 0.0
XaaGlu: 0.0 ± 0.0
XaaPhe: 0.0 ± 0.0
XaaGly: 0.0 ± 0.0
XaaHis: 0.0 ± 0.0
XaaIle: 0.0 ± 0.0
XaaLys: 0.0 ± 0.0
XaaLeu: 0.0 ± 0.0
XaaMet: 0.0 ± 0.0
XaaAsn: 0.0 ± 0.0
XaaPro: 0.0 ± 0.0
XaaGln: 0.0 ± 0.0
XaaArg: 0.0 ± 0.0
XaaSer: 0.0 ± 0.0
XaaThr: 0.0 ± 0.0
XaaVal: 0.0 ± 0.0
XaaTrp: 0.0 ± 0.0
XaaTyr: 0.0 ± 0.0
XaaXaa: 0.0 ± 0.0
Statistics based on 36 proteins (5544 amino acids)

Note: The error has been estimated by bootstrapping (100 replicates) at the protein level.
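A protein-level bootstrap of this kind might look like the sketch below. It is only an illustration under stated assumptions: whole proteins are resampled with replacement, the frequency is recomputed for each replicate, and the sample standard deviation across replicates is reported as the error. The source states only that bootstrapping with 100 replicates at the protein level was used, so the function names, the choice of standard deviation, and the fixed seed are all hypothetical.

```python
import random
import statistics
from collections import Counter


def dipeptide_per_mille(proteins, pair):
    """Frequency of one dipeptide in per mille, from overlapping pairs."""
    counts = Counter()
    for seq in proteins:
        counts.update(seq[i:i + 2] for i in range(len(seq) - 1))
    total = sum(counts.values())
    return 1000.0 * counts[pair] / total if total else 0.0


def bootstrap_error(proteins, pair, n_replicates=100, seed=0):
    """Spread of the per mille frequency over protein-level resamples."""
    rng = random.Random(seed)  # fixed seed only to make the sketch repeatable
    estimates = []
    for _ in range(n_replicates):
        # Resample whole proteins with replacement (assumed procedure).
        sample = rng.choices(proteins, k=len(proteins))
        estimates.append(dipeptide_per_mille(sample, pair))
    return statistics.stdev(estimates)


# Made-up sequences for illustration only, not taken from the proteome.
toy_proteins = ["AAGA", "GGAA", "AGAG", "CAAC"]
print(dipeptide_per_mille(toy_proteins, "AA"), bootstrap_error(toy_proteins, "AA"))
```

Resampling whole proteins rather than individual dipeptides keeps pairs from the same protein together, so the error reflects how much the estimate depends on which proteins happen to be in the set.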

The dipeptide statistics above (along with other statistics for this proteome) can be downloaded from this CSV file
See this proteome in: uniprot_link
Proteome-pI is available under the Creative Commons Attribution-NoDerivs license; for more details see here

Reference: Kozlowski LP. Proteome-pI 2.0: Proteome Isoelectric Point Database Update. Nucleic Acids Res. 2021, doi: 10.1093/nar/gkab944
Contact: Lukasz P. Kozlowski