Amino acid dipeptide frequency for Streptococcus satellite phage Javan624

Apart from single amino acid frequencies, one can also calculate the so-called amino acid dipeptide frequency, that is, how often each ordered pair of adjacent residues occurs.

There are 400 possible dipeptides (441 if Xaa is counted as an additional 21st amino acid representing all non-standard or unknown amino acids). Thus, if all dipeptides occurred with equal probability, each would be present at a frequency of 0.25% (2.5‰; about 2.27‰ with 441 possibilities). As this is not the case in nature, for better visibility dipeptides that occur more often than expected are marked in red, and those that are underrepresented are marked in blue in the table.
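The counting described above can be sketched in Python. This is a minimal illustration, not the database's own pipeline: the function names are hypothetical, sequences are assumed to use one-letter codes, and frequencies are normalized by the number of adjacent pairs found within proteins (the exact normalization used by Proteome-pI is not stated here).

```python
from collections import Counter
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues, one-letter codes
ALPHABET = AMINO_ACIDS + "X"          # X stands for Xaa (non-standard or unknown)


def dipeptide_per_mille(proteins):
    """Return {dipeptide: frequency in per mille} for all 441 ordered pairs."""
    counts = Counter()
    for seq in proteins:
        # Overlapping windows of length 2; pairs never span two proteins.
        for a, b in zip(seq, seq[1:]):
            # Fold anything outside the 20 standard residues into X.
            a = a if a in AMINO_ACIDS else "X"
            b = b if b in AMINO_ACIDS else "X"
            counts[a + b] += 1
    total = sum(counts.values()) or 1  # avoid division by zero on empty input
    return {
        a + b: 1000.0 * counts[a + b] / total
        for a, b in product(ALPHABET, repeat=2)
    }


def classify(freq_per_mille, n_possible=400):
    """Compare a frequency with the uniform expectation of 1000 / n_possible per mille."""
    expected = 1000.0 / n_possible
    if freq_per_mille > expected:
        return "over"
    if freq_per_mille < expected:
        return "under"
    return "expected"
```

With the uniform expectation of 2.5‰, `classify(4.065)` (the AlaAla value) returns `"over"` and `classify(0.452)` (the AlaCys value) returns `"under"`.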

All values are given in per mille (‰); multiply them by 10^-3 to obtain fractions.
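As a worked example of the unit conversion: the GluLeu value of 13.098‰ corresponds to a fraction of 0.013098, or 1.3098%. The helper names below are hypothetical and only illustrate the arithmetic.

```python
def per_mille_to_fraction(value):
    """Convert a per mille value to a plain fraction (multiply by 10^-3)."""
    return value * 1e-3


def per_mille_to_percent(value):
    """Convert a per mille value to percent (divide by 10)."""
    return value / 10.0


# GluLeu from the table: 13.098 per mille
glu_leu_fraction = per_mille_to_fraction(13.098)
glu_leu_percent = per_mille_to_percent(13.098)
```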

For more information, see the sequence space article on Wikipedia.

Rows give the first residue of the dipeptide, columns the second; each cell is the frequency ± estimated error, in ‰.

| 1st \ 2nd | Ala | Cys | Asp | Glu | Phe | Gly | His | Ile | Lys | Leu | Met | Asn | Pro | Gln | Arg | Ser | Thr | Val | Trp | Tyr | Xaa |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Ala | 4.065 ± 1.323 | 0.452 ± 0.369 | 2.71 ± 1.321 | 4.065 ± 1.899 | 3.613 ± 1.04 | 1.807 ± 0.812 | 0.0 ± 0.0 | 4.968 ± 0.705 | 6.323 ± 1.067 | 4.968 ± 1.641 | 1.355 ± 0.827 | 4.968 ± 1.942 | 1.355 ± 0.698 | 1.355 ± 0.732 | 2.71 ± 0.81 | 3.613 ± 1.06 | 2.258 ± 1.018 | 3.613 ± 1.115 | 0.903 ± 0.554 | 2.258 ± 0.766 | 0.0 ± 0.0 |
| Cys | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.452 ± 0.513 | 0.452 ± 0.409 | 0.0 ± 0.0 | 0.452 ± 0.369 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.452 ± 0.369 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.903 ± 0.541 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 |
| Asp | 3.162 ± 1.55 | 0.452 ± 0.513 | 3.162 ± 1.709 | 4.065 ± 1.462 | 4.968 ± 1.819 | 1.355 ± 0.687 | 0.452 ± 0.395 | 2.71 ± 0.861 | 6.323 ± 2.09 | 5.42 ± 1.875 | 0.452 ± 0.467 | 2.258 ± 1.019 | 1.355 ± 0.561 | 3.613 ± 1.403 | 2.258 ± 0.954 | 0.903 ± 0.55 | 2.71 ± 1.107 | 4.968 ± 1.374 | 0.0 ± 0.0 | 2.71 ± 0.862 | 0.0 ± 0.0 |
| Glu | 3.613 ± 1.421 | 0.0 ± 0.0 | 2.71 ± 0.91 | 5.42 ± 1.482 | 2.258 ± 1.08 | 1.355 ± 0.798 | 0.452 ± 0.467 | 5.42 ± 1.179 | 8.582 ± 1.507 | 13.098 ± 2.659 | 4.517 ± 1.543 | 7.227 ± 1.576 | 1.355 ± 0.731 | 4.517 ± 1.111 | 0.903 ± 0.612 | 4.065 ± 1.19 | 4.065 ± 1.719 | 4.065 ± 1.272 | 1.355 ± 0.63 | 1.807 ± 0.828 | 0.0 ± 0.0 |
| Phe | 1.355 ± 0.543 | 0.0 ± 0.0 | 2.258 ± 0.899 | 4.968 ± 1.526 | 0.452 ± 0.492 | 0.903 ± 0.554 | 2.258 ± 0.816 | 3.613 ± 1.007 | 4.065 ± 1.986 | 5.42 ± 1.84 | 0.903 ± 0.538 | 2.71 ± 1.305 | 0.903 ± 0.55 | 2.258 ± 0.953 | 0.452 ± 0.369 | 3.613 ± 1.438 | 0.903 ± 0.826 | 1.355 ± 0.698 | 0.452 ± 0.409 | 1.807 ± 0.845 | 0.0 ± 0.0 |
| Gly | 2.258 ± 0.777 | 0.452 ± 0.369 | 1.355 ± 0.593 | 1.807 ± 0.876 | 2.258 ± 0.694 | 1.807 ± 1.002 | 0.452 ± 0.369 | 3.162 ± 1.028 | 3.613 ± 1.056 | 4.968 ± 1.215 | 0.0 ± 0.481 | 3.162 ± 1.278 | 0.0 ± 0.0 | 0.903 ± 0.67 | 0.903 ± 0.552 | 1.807 ± 0.544 | 1.355 ± 0.637 | 4.968 ± 1.55 | 0.0 ± 0.0 | 3.162 ± 1.911 | 0.0 ± 0.0 |
| His | 0.903 ± 0.504 | 0.0 ± 0.0 | 0.452 ± 0.513 | 0.903 ± 0.514 | 0.903 ± 0.504 | 0.903 ± 0.481 | 0.0 ± 0.0 | 0.452 ± 0.409 | 0.0 ± 0.0 | 2.71 ± 1.165 | 0.0 ± 0.0 | 0.452 ± 0.369 | 0.0 ± 0.0 | 1.355 ± 0.672 | 0.903 ± 0.481 | 1.355 ± 0.869 | 0.903 ± 0.558 | 0.903 ± 0.62 | 0.0 ± 0.0 | 0.903 ± 0.703 | 0.0 ± 0.0 |
| Ile | 3.162 ± 1.158 | 0.0 ± 0.0 | 4.517 ± 1.056 | 5.42 ± 1.876 | 2.71 ± 1.451 | 2.71 ± 1.157 | 1.355 ± 0.858 | 6.775 ± 2.073 | 11.743 ± 1.741 | 8.582 ± 1.51 | 0.452 ± 0.443 | 5.872 ± 1.386 | 2.71 ± 0.583 | 2.258 ± 0.985 | 2.71 ± 0.991 | 8.13 ± 2.173 | 1.807 ± 0.885 | 4.968 ± 1.201 | 0.452 ± 0.369 | 1.355 ± 0.704 | 0.0 ± 0.0 |
| Lys | 9.937 ± 2.207 | 0.452 ± 0.409 | 3.613 ± 1.385 | 7.678 ± 1.595 | 1.355 ± 0.921 | 6.775 ± 1.562 | 1.807 ± 0.953 | 9.033 ± 1.841 | 10.388 ± 1.988 | 7.227 ± 1.364 | 1.807 ± 0.797 | 7.227 ± 1.462 | 3.162 ± 1.193 | 6.323 ± 1.226 | 4.968 ± 1.452 | 5.42 ± 1.1 | 9.033 ± 1.493 | 6.775 ± 1.82 | 2.258 ± 1.211 | 2.71 ± 1.045 | 0.0 ± 0.0 |
| Leu | 7.678 ± 1.69 | 0.452 ± 0.465 | 8.582 ± 1.273 | 11.743 ± 2.458 | 2.71 ± 1.109 | 5.872 ± 1.579 | 1.355 ± 0.823 | 6.323 ± 1.995 | 10.388 ± 1.83 | 9.485 ± 1.117 | 3.613 ± 1.13 | 7.227 ± 2.051 | 3.162 ± 1.153 | 2.71 ± 1.076 | 4.065 ± 1.568 | 9.485 ± 2.292 | 3.613 ± 1.504 | 4.065 ± 1.029 | 0.0 ± 0.0 | 3.613 ± 0.857 | 0.0 ± 0.0 |
| Met | 1.355 ± 1.056 | 0.0 ± 0.0 | 1.355 ± 0.543 | 1.807 ± 1.239 | 0.452 ± 0.409 | 0.903 ± 0.628 | 0.0 ± 0.0 | 2.258 ± 1.249 | 2.71 ± 0.803 | 2.258 ± 0.749 | 1.355 ± 0.783 | 2.258 ± 0.939 | 0.903 ± 0.55 | 1.355 ± 0.582 | 0.0 ± 0.0 | 3.162 ± 1.442 | 2.71 ± 0.731 | 0.452 ± 0.473 | 0.452 ± 0.435 | 0.0 ± 0.0 | 0.0 ± 0.0 |
| Asn | 6.323 ± 1.355 | 0.0 ± 0.0 | 2.71 ± 0.733 | 3.162 ± 0.896 | 4.517 ± 1.154 | 4.065 ± 1.295 | 0.0 ± 0.0 | 4.968 ± 1.052 | 7.227 ± 1.722 | 8.13 ± 1.43 | 1.355 ± 0.727 | 4.968 ± 1.2 | 0.903 ± 0.553 | 4.065 ± 1.45 | 3.162 ± 1.099 | 4.968 ± 0.704 | 4.517 ± 1.863 | 4.065 ± 1.279 | 0.452 ± 0.625 | 2.258 ± 0.806 | 0.0 ± 0.0 |
| Pro | 0.452 ± 0.369 | 0.0 ± 0.0 | 2.258 ± 1.009 | 0.0 ± 0.0 | 0.452 ± 0.473 | 0.903 ± 0.552 | 0.0 ± 0.0 | 1.355 ± 0.845 | 3.613 ± 0.598 | 3.613 ± 1.331 | 0.452 ± 0.395 | 2.258 ± 0.899 | 0.903 ± 0.612 | 0.903 ± 0.57 | 0.903 ± 0.554 | 1.355 ± 0.554 | 0.903 ± 0.544 | 0.903 ± 0.676 | 0.452 ± 0.625 | 1.355 ± 0.714 | 0.0 ± 0.0 |
| Gln | 2.71 ± 1.606 | 0.452 ± 0.409 | 3.613 ± 2.069 | 4.517 ± 1.629 | 1.807 ± 1.635 | 1.355 ± 0.863 | 0.903 ± 0.473 | 4.968 ± 1.077 | 4.517 ± 1.02 | 5.872 ± 1.551 | 1.807 ± 0.823 | 3.162 ± 0.808 | 0.0 ± 0.0 | 1.807 ± 1.221 | 2.71 ± 0.978 | 3.162 ± 0.918 | 2.71 ± 0.831 | 1.807 ± 0.781 | 0.452 ± 0.413 | 1.355 ± 0.795 | 0.0 ± 0.0 |
| Arg | 2.71 ± 0.951 | 0.0 ± 0.0 | 1.807 ± 0.84 | 4.065 ± 1.212 | 1.807 ± 0.826 | 1.355 ± 0.684 | 0.452 ± 0.369 | 4.065 ± 1.399 | 2.258 ± 0.809 | 2.71 ± 1.016 | 2.71 ± 1.199 | 3.162 ± 1.573 | 0.452 ± 0.492 | 3.613 ± 1.139 | 2.258 ± 0.975 | 1.355 ± 0.7 | 1.355 ± 0.66 | 0.903 ± 0.481 | 0.903 ± 0.577 | 1.355 ± 0.817 | 0.0 ± 0.0 |
| Ser | 0.452 ± 0.492 | 0.0 ± 0.0 | 5.42 ± 1.478 | 4.065 ± 0.949 | 4.517 ± 1.778 | 2.258 ± 1.115 | 0.903 ± 0.473 | 6.323 ± 1.34 | 7.227 ± 2.758 | 6.775 ± 1.655 | 1.355 ± 0.811 | 4.968 ± 1.489 | 0.903 ± 0.55 | 2.71 ± 0.579 | 2.71 ± 0.937 | 1.807 ± 0.692 | 1.807 ± 0.823 | 6.323 ± 1.429 | 0.0 ± 0.0 | 1.355 ± 0.663 | 0.0 ± 0.0 |
| Thr | 1.807 ± 0.793 | 0.0 ± 0.0 | 2.71 ± 0.884 | 2.258 ± 0.998 | 1.355 ± 0.686 | 1.807 ± 0.748 | 0.903 ± 0.481 | 2.71 ± 0.948 | 3.162 ± 1.392 | 6.775 ± 2.405 | 0.903 ± 0.544 | 3.613 ± 1.195 | 2.71 ± 1.005 | 3.162 ± 0.985 | 2.258 ± 1.01 | 0.452 ± 0.462 | 3.162 ± 1.659 | 4.517 ± 0.884 | 0.452 ± 0.449 | 3.162 ± 1.44 | 0.0 ± 0.0 |
| Val | 3.613 ± 1.031 | 0.0 ± 0.0 | 1.807 ± 0.84 | 6.775 ± 2.252 | 2.258 ± 1.026 | 1.807 ± 1.074 | 2.258 ± 0.938 | 5.872 ± 1.28 | 7.678 ± 1.615 | 3.162 ± 0.92 | 1.355 ± 0.851 | 3.613 ± 1.803 | 1.355 ± 0.684 | 2.71 ± 0.761 | 2.71 ± 0.93 | 4.968 ± 1.957 | 3.162 ± 1.46 | 3.162 ± 1.382 | 0.452 ± 0.409 | 2.71 ± 1.145 | 0.0 ± 0.0 |
| Trp | 0.452 ± 0.369 | 0.0 ± 0.0 | 0.0 ± 0.0 | 1.355 ± 0.951 | 0.452 ± 0.513 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.903 ± 0.825 | 1.807 ± 1.011 | 0.903 ± 0.754 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.903 ± 0.6 | 0.452 ± 0.395 | 0.452 ± 0.369 | 0.0 ± 0.0 | 0.903 ± 0.552 | 0.452 ± 0.369 | 0.452 ± 0.513 | 0.0 ± 0.0 |
| Tyr | 1.355 ± 0.783 | 0.0 ± 0.0 | 2.258 ± 0.786 | 2.71 ± 1.124 | 1.807 ± 0.888 | 0.903 ± 0.665 | 0.452 ± 0.369 | 1.807 ± 1.0 | 5.872 ± 1.668 | 3.613 ± 1.383 | 0.903 ± 0.507 | 2.258 ± 1.221 | 0.903 ± 0.522 | 3.162 ± 1.055 | 1.807 ± 0.832 | 1.807 ± 1.114 | 0.903 ± 0.854 | 1.807 ± 1.268 | 0.0 ± 0.0 | 0.452 ± 0.409 | 0.0 ± 0.0 |
| Xaa | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 |
Statistics based on 14 proteins (2215 amino acids)

Note: The error was estimated by bootstrapping (100 replicates) at the protein level.
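A protein-level bootstrap of this kind can be sketched as follows. This is an illustration under stated assumptions, not the database's implementation: whole proteins are resampled with replacement, the statistic is recomputed for each replicate, and the spread is reported as the sample standard deviation. The function names, the normalization by within-protein pair count, and the choice of standard deviation as the error measure are all assumptions.

```python
import random
import statistics
from collections import Counter


def dipeptide_freq(proteins, dipeptide):
    """Frequency (per mille) of one dipeptide over all adjacent pairs."""
    counts = Counter()
    for seq in proteins:
        counts.update(a + b for a, b in zip(seq, seq[1:]))
    total = sum(counts.values())
    return 1000.0 * counts[dipeptide] / total if total else 0.0


def bootstrap_error(proteins, dipeptide, n_replicates=100, seed=0):
    """Standard deviation of the frequency over protein-level bootstrap replicates."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    estimates = []
    for _ in range(n_replicates):
        # Resample whole proteins with replacement, keeping the proteome size.
        sample = rng.choices(proteins, k=len(proteins))
        estimates.append(dipeptide_freq(sample, dipeptide))
    return statistics.stdev(estimates)
```

Resampling proteins rather than individual residues keeps each sequence intact, so the estimate reflects how much the frequency depends on which proteins happen to be in the proteome. With only 14 proteins, as here, the resulting errors are large relative to the values.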

The dipeptide statistics above (among other statistics for this proteome) can be downloaded from this CSV file
See this proteome in: uniprot_link
Proteome-pI is available under the Creative Commons Attribution-NoDerivs license; for more details see here

Reference: Kozlowski LP. Proteome-pI 2.0: Proteome Isoelectric Point Database Update. Nucleic Acids Res. 2021, doi: 10.1093/nar/gkab944
Contact: Lukasz P. Kozlowski