Amino acid dipeptide frequency for Streptococcus satellite phage Javan652

Apart from single amino acid frequencies, one can also calculate the amino acid dipeptide frequency, that is, how often each ordered pair of adjacent residues occurs.
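As a rough illustration of the idea, the sketch below counts overlapping adjacent residue pairs within each protein and reports them in per mille. The function name `dipeptide_per_mille`, the one-letter sequence input, and the normalization by the total number of counted pairs are assumptions made for this example; the exact normalization used by the database is not stated on this page and may differ.

```python
from collections import Counter


def dipeptide_per_mille(proteins):
    """Return {dipeptide: frequency in per mille} for a list of sequences.

    Hypothetical helper: sequences are one-letter strings, and the
    denominator is the total number of dipeptides counted (an assumption).
    """
    counts = Counter()
    for seq in proteins:
        # Overlapping adjacent pairs; a pair never spans two proteins.
        for i in range(len(seq) - 1):
            counts[seq[i:i + 2]] += 1
    total = sum(counts.values())
    if total == 0:
        return {}
    return {pair: 1000.0 * n / total for pair, n in counts.items()}


# Toy input, not the Javan652 proteome.
freqs = dipeptide_per_mille(["MKLLE", "MKE"])
```

Here the toy sequences yield six dipeptides in total, two of which are MK, so `freqs["MK"]` is one third of 1000.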

There are 400 possible dipeptides (441 if Xaa is counted as an additional 21st amino acid representing all non-standard or unknown amino acids). Thus, if dipeptides occurred randomly in proteins, each one would be present at a frequency of 0.25% (2.5‰). As this is not the case in nature, for better visibility all overrepresented dipeptides are marked in red and all underrepresented ones in blue in the table.
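The comparison against the uniform expectation can be sketched as follows. The threshold of 1000/400 = 2.5‰ follows from the text; the function name `classify` and the choice to ignore the error estimate when labelling a value are assumptions for this example, and the page's actual colouring rule may differ.

```python
# Uniform expectation for 20 x 20 dipeptides: 2.5 per mille (0.25 %).
EXPECTED_PER_MILLE = 1000.0 / (20 * 20)


def classify(freq_per_mille, expected=EXPECTED_PER_MILLE):
    """Label a frequency relative to the uniform expectation (hypothetical rule)."""
    if freq_per_mille > expected:
        return "over"   # would be marked red
    if freq_per_mille < expected:
        return "under"  # would be marked blue
    return "expected"


# Two values taken from the table on this page.
examples = {"GluLeu": 11.755, "AlaAla": 0.84}
labels = {name: classify(value) for name, value in examples.items()}
```

With these inputs GluLeu is labelled "over" and AlaAla "under".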

All values are presented in per mille (‰) and therefore need to be multiplied by 10^-3 to obtain fractions.
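The unit conversion is simple arithmetic; a minimal sketch is given below. The helper names are made up for this example.

```python
def per_mille_to_fraction(value):
    """Convert a per mille value to a plain fraction (multiply by 10^-3)."""
    return value * 1e-3


def per_mille_to_percent(value):
    """Convert a per mille value to percent (divide by 10)."""
    return value / 10.0


# GluLeu from the table: 11.755 per mille.
gluleu_fraction = per_mille_to_fraction(11.755)
gluleu_percent = per_mille_to_percent(11.755)
```

For GluLeu this gives a fraction of about 0.011755, or about 1.1755%.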

For more information, see the sequence space article on Wikipedia.

Dipeptide frequencies in ‰ (value ± error). Rows give the first residue, columns the second residue.

    | Ala | Cys | Asp | Glu | Phe | Gly | His | Ile | Lys | Leu | Met | Asn | Pro | Gln | Arg | Ser | Thr | Val | Trp | Tyr | Xaa
Ala | 0.84 ± 0.527 | 0.42 ± 0.362 | 3.359 ± 1.176 | 5.038 ± 1.223 | 2.939 ± 1.018 | 0.84 ± 0.573 | 1.679 ± 0.849 | 6.297 ± 1.592 | 6.297 ± 1.443 | 4.198 ± 1.652 | 1.679 ± 0.966 | 6.717 ± 1.121 | 2.519 ± 1.165 | 2.099 ± 0.911 | 2.099 ± 0.745 | 3.359 ± 1.656 | 4.618 ± 1.509 | 1.259 ± 0.745 | 0.42 ± 0.326 | 5.038 ± 0.893 | 0.0 ± 0.0
Cys | 0.84 ± 0.587 | 0.0 ± 0.0 | 0.42 ± 0.391 | 0.84 ± 0.487 | 0.0 ± 0.0 | 0.42 ± 0.372 | 0.0 ± 0.0 | 0.42 ± 0.352 | 0.0 ± 0.0 | 0.84 ± 0.49 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.42 ± 0.326 | 0.42 ± 0.391 | 0.0 ± 0.0 | 0.42 ± 0.372 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0
Asp | 1.679 ± 0.879 | 0.0 ± 0.0 | 2.099 ± 1.073 | 5.877 ± 1.663 | 3.778 ± 0.931 | 1.679 ± 1.1 | 0.42 ± 0.391 | 6.297 ± 1.17 | 7.137 ± 1.366 | 6.297 ± 1.25 | 0.84 ± 0.674 | 6.717 ± 1.655 | 0.84 ± 0.473 | 0.0 ± 0.0 | 2.939 ± 1.103 | 5.038 ± 2.036 | 2.519 ± 1.121 | 2.519 ± 0.878 | 0.0 ± 0.0 | 3.778 ± 1.607 | 0.0 ± 0.0
Glu | 7.137 ± 1.619 | 1.259 ± 0.724 | 3.359 ± 1.608 | 5.038 ± 1.86 | 2.099 ± 1.315 | 2.099 ± 0.931 | 0.42 ± 0.366 | 7.976 ± 1.655 | 7.557 ± 2.215 | 11.755 ± 2.216 | 2.099 ± 0.635 | 3.778 ± 0.812 | 2.519 ± 0.818 | 3.778 ± 0.783 | 5.038 ± 1.454 | 2.099 ± 0.718 | 5.038 ± 0.897 | 2.519 ± 1.085 | 0.0 ± 0.0 | 1.259 ± 0.678 | 0.0 ± 0.0
Phe | 0.84 ± 0.504 | 0.0 ± 0.0 | 4.198 ± 1.124 | 4.618 ± 1.455 | 3.359 ± 1.046 | 3.359 ± 0.902 | 0.42 ± 0.326 | 4.198 ± 1.583 | 5.458 ± 1.131 | 5.038 ± 1.173 | 0.42 ± 0.566 | 2.939 ± 0.828 | 0.84 ± 0.473 | 1.679 ± 0.861 | 1.259 ± 0.481 | 2.519 ± 0.824 | 1.679 ± 0.978 | 0.84 ± 0.565 | 0.0 ± 0.0 | 0.84 ± 0.579 | 0.0 ± 0.0
Gly | 1.259 ± 0.763 | 1.259 ± 0.543 | 1.259 ± 0.805 | 4.618 ± 1.678 | 2.939 ± 1.132 | 2.939 ± 1.111 | 0.42 ± 0.326 | 2.099 ± 0.708 | 4.618 ± 1.46 | 2.939 ± 1.215 | 1.679 ± 0.709 | 2.519 ± 1.174 | 0.0 ± 0.0 | 1.259 ± 0.834 | 2.939 ± 0.919 | 1.259 ± 0.977 | 3.778 ± 1.434 | 3.359 ± 1.035 | 0.84 ± 0.744 | 3.778 ± 1.49 | 0.0 ± 0.0
His | 1.679 ± 0.775 | 0.42 ± 0.372 | 1.259 ± 0.757 | 0.42 ± 0.478 | 1.259 ± 0.693 | 2.099 ± 0.76 | 0.0 ± 0.0 | 0.84 ± 0.444 | 0.84 ± 0.503 | 1.259 ± 0.659 | 0.0 ± 0.0 | 0.0 ± 0.0 | 2.099 ± 1.139 | 0.84 ± 0.551 | 1.679 ± 0.658 | 0.42 ± 0.326 | 1.679 ± 0.709 | 0.42 ± 0.566 | 0.0 ± 0.0 | 0.84 ± 0.416 | 0.0 ± 0.0
Ile | 5.877 ± 1.681 | 0.42 ± 0.391 | 5.877 ± 1.159 | 7.557 ± 1.627 | 2.099 ± 0.873 | 3.778 ± 0.895 | 0.84 ± 0.486 | 4.618 ± 1.559 | 5.877 ± 1.153 | 6.297 ± 1.523 | 1.259 ± 0.671 | 3.359 ± 1.255 | 1.259 ± 0.659 | 4.198 ± 1.34 | 1.679 ± 0.739 | 5.877 ± 1.928 | 5.458 ± 0.871 | 2.099 ± 0.87 | 0.0 ± 0.0 | 3.359 ± 1.075 | 0.0 ± 0.0
Lys | 7.557 ± 2.078 | 0.0 ± 0.0 | 3.778 ± 1.114 | 6.297 ± 1.67 | 2.519 ± 0.638 | 4.618 ± 1.558 | 2.939 ± 0.9 | 4.618 ± 0.899 | 7.976 ± 2.076 | 8.396 ± 2.037 | 1.679 ± 0.819 | 3.778 ± 0.928 | 2.519 ± 1.437 | 5.458 ± 1.205 | 7.137 ± 1.025 | 6.297 ± 1.331 | 5.877 ± 1.632 | 8.816 ± 1.405 | 0.42 ± 0.372 | 2.519 ± 0.838 | 0.0 ± 0.0
Leu | 6.717 ± 1.766 | 0.0 ± 0.0 | 7.557 ± 1.342 | 6.717 ± 1.965 | 3.778 ± 0.841 | 5.038 ± 1.019 | 2.939 ± 1.15 | 6.297 ± 1.647 | 8.396 ± 1.745 | 8.816 ± 1.101 | 1.679 ± 0.836 | 4.618 ± 1.348 | 2.519 ± 0.767 | 3.359 ± 1.229 | 4.198 ± 2.011 | 6.297 ± 1.412 | 6.297 ± 1.458 | 5.038 ± 1.805 | 0.84 ± 0.665 | 1.259 ± 0.627 | 0.0 ± 0.0
Met | 2.939 ± 1.124 | 0.0 ± 0.0 | 1.679 ± 0.6 | 1.679 ± 0.669 | 0.84 ± 0.585 | 0.42 ± 0.566 | 0.42 ± 0.566 | 1.679 ± 0.647 | 2.099 ± 0.718 | 0.84 ± 0.501 | 0.0 ± 0.0 | 1.259 ± 0.52 | 0.0 ± 0.0 | 0.42 ± 0.366 | 1.259 ± 0.747 | 1.679 ± 0.596 | 3.778 ± 1.703 | 1.679 ± 0.9 | 0.0 ± 0.0 | 0.42 ± 0.402 | 0.0 ± 0.0
Asn | 4.618 ± 0.986 | 0.42 ± 0.326 | 4.618 ± 1.006 | 5.877 ± 1.046 | 1.679 ± 1.212 | 2.099 ± 0.73 | 1.259 ± 0.724 | 3.778 ± 1.251 | 5.877 ± 1.126 | 3.778 ± 1.506 | 1.259 ± 0.735 | 4.198 ± 0.975 | 1.259 ± 0.568 | 2.099 ± 0.598 | 3.359 ± 0.923 | 3.778 ± 1.052 | 5.038 ± 1.713 | 1.259 ± 0.597 | 0.42 ± 0.362 | 2.939 ± 0.959 | 0.0 ± 0.0
Pro | 1.679 ± 0.543 | 0.0 ± 0.0 | 0.42 ± 0.372 | 1.679 ± 0.95 | 0.84 ± 0.704 | 0.42 ± 0.366 | 0.84 ± 0.473 | 1.679 ± 1.107 | 1.679 ± 0.77 | 2.519 ± 0.854 | 1.259 ± 0.732 | 2.939 ± 1.151 | 0.84 ± 0.521 | 0.42 ± 0.352 | 3.778 ± 1.041 | 1.259 ± 0.746 | 2.099 ± 0.776 | 1.679 ± 0.837 | 0.0 ± 0.0 | 1.259 ± 0.476 | 0.0 ± 0.0
Gln | 5.458 ± 1.849 | 0.0 ± 0.0 | 0.84 ± 0.644 | 2.519 ± 0.795 | 2.939 ± 0.821 | 1.679 ± 0.793 | 1.679 ± 0.586 | 1.259 ± 0.815 | 4.198 ± 0.836 | 3.778 ± 0.857 | 0.84 ± 0.502 | 1.259 ± 0.657 | 0.84 ± 0.514 | 2.519 ± 1.169 | 2.519 ± 1.134 | 2.519 ± 0.864 | 2.519 ± 0.834 | 1.679 ± 0.754 | 0.0 ± 0.0 | 1.259 ± 0.752 | 0.0 ± 0.0
Arg | 2.939 ± 1.466 | 0.0 ± 0.0 | 4.198 ± 1.129 | 2.939 ± 1.229 | 2.099 ± 0.88 | 2.939 ± 0.916 | 0.84 ± 0.444 | 4.198 ± 1.134 | 4.198 ± 1.113 | 3.778 ± 1.074 | 0.42 ± 0.362 | 4.198 ± 1.224 | 1.259 ± 0.826 | 3.359 ± 0.942 | 2.939 ± 1.079 | 3.778 ± 1.94 | 2.099 ± 0.943 | 2.939 ± 1.264 | 1.259 ± 0.719 | 2.939 ± 0.805 | 0.0 ± 0.0
Ser | 3.778 ± 1.455 | 0.42 ± 0.366 | 4.618 ± 1.455 | 3.778 ± 1.268 | 3.778 ± 1.042 | 1.679 ± 0.479 | 0.84 ± 0.449 | 4.198 ± 1.235 | 7.557 ± 1.104 | 5.458 ± 1.084 | 2.099 ± 0.933 | 4.198 ± 1.754 | 1.259 ± 0.487 | 1.679 ± 1.119 | 0.84 ± 0.557 | 3.359 ± 0.883 | 2.939 ± 0.756 | 1.679 ± 0.544 | 0.84 ± 0.651 | 3.778 ± 0.85 | 0.0 ± 0.0
Thr | 3.778 ± 1.343 | 0.0 ± 0.0 | 4.198 ± 1.95 | 2.939 ± 0.969 | 3.359 ± 1.33 | 5.038 ± 1.243 | 0.42 ± 0.326 | 4.618 ± 1.38 | 5.458 ± 1.693 | 5.038 ± 1.066 | 3.359 ± 1.125 | 2.099 ± 1.016 | 2.939 ± 1.07 | 3.778 ± 1.017 | 3.778 ± 1.713 | 2.099 ± 0.813 | 3.778 ± 1.259 | 5.038 ± 1.664 | 0.42 ± 0.352 | 4.198 ± 1.147 | 0.0 ± 0.0
Val | 1.679 ± 1.088 | 0.0 ± 0.0 | 2.939 ± 1.114 | 3.778 ± 1.227 | 2.099 ± 0.537 | 2.939 ± 0.82 | 0.42 ± 0.326 | 3.359 ± 1.362 | 4.618 ± 1.418 | 4.618 ± 1.042 | 1.259 ± 0.841 | 2.519 ± 0.75 | 1.679 ± 0.729 | 1.679 ± 0.784 | 0.42 ± 0.362 | 2.099 ± 0.926 | 6.717 ± 1.75 | 5.458 ± 1.4 | 1.259 ± 0.791 | 1.679 ± 0.548 | 0.0 ± 0.0
Trp | 0.0 ± 0.0 | 0.0 ± 0.0 | 1.259 ± 0.752 | 0.84 ± 0.678 | 0.42 ± 0.372 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.42 ± 0.366 | 0.42 ± 0.326 | 1.259 ± 0.605 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.84 ± 0.491 | 0.84 ± 0.416 | 0.0 ± 0.0 | 0.42 ± 0.362 | 0.42 ± 0.326 | 0.0 ± 0.0 | 0.0 ± 0.0
Tyr | 0.84 ± 0.625 | 0.42 ± 0.352 | 2.519 ± 0.947 | 3.778 ± 1.386 | 1.679 ± 0.821 | 1.679 ± 0.769 | 1.259 ± 0.775 | 2.939 ± 1.136 | 2.939 ± 1.223 | 5.458 ± 1.385 | 0.84 ± 0.625 | 2.519 ± 0.944 | 2.099 ± 0.748 | 1.679 ± 0.794 | 4.198 ± 1.441 | 3.778 ± 1.048 | 0.42 ± 0.362 | 2.099 ± 0.821 | 0.0 ± 0.0 | 2.099 ± 0.783 | 0.0 ± 0.0
Xaa | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0
Statistics based on 16 proteins (2383 amino acids)

Note: The error was estimated by bootstrapping (100 replicates) at the protein level.
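A protein-level bootstrap of this kind can be sketched as follows: whole proteins are resampled with replacement, the dipeptide frequencies are recomputed for each replicate, and the spread across replicates is reported. The function names, the fixed seed, the normalization by total dipeptide count, and the use of the population standard deviation as the error are all assumptions for this example; the page does not say which estimator the database uses.

```python
import random
import statistics
from collections import Counter


def dipeptide_per_mille(proteins):
    """Per mille frequency of each adjacent residue pair (hypothetical helper)."""
    counts = Counter()
    for seq in proteins:
        for i in range(len(seq) - 1):
            counts[seq[i:i + 2]] += 1
    total = sum(counts.values())
    if total == 0:
        return {}
    return {pair: 1000.0 * n / total for pair, n in counts.items()}


def bootstrap_error(proteins, n_replicates=100, seed=0):
    """Estimate per-dipeptide error by resampling whole proteins."""
    rng = random.Random(seed)  # fixed seed only to make the sketch repeatable
    observed_pairs = set(dipeptide_per_mille(proteins))
    replicates = []
    for _ in range(n_replicates):
        # Draw as many proteins as in the original set, with replacement.
        resampled = [rng.choice(proteins) for _ in proteins]
        replicates.append(dipeptide_per_mille(resampled))
    # A pair absent from a replicate counts as frequency 0 in that replicate.
    return {
        pair: statistics.pstdev([rep.get(pair, 0.0) for rep in replicates])
        for pair in observed_pairs
    }


# Toy input, not the Javan652 proteome.
errors_single = bootstrap_error(["MKLLE"])
errors_multi = bootstrap_error(["MKLLE", "MKE", "AAAK"])
```

With a single protein every replicate is identical, so all errors are zero; with several proteins the errors reflect how much each frequency depends on which proteins happen to be drawn.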

The dipeptide statistics above (among other statistics for this proteome) can be downloaded from this CSV file.
See this proteome in: uniprot_link
Proteome-pI is available under the Creative Commons Attribution-NoDerivs license; for more details see here.

Reference: Kozlowski LP. Proteome-pI 2.0: Proteome Isoelectric Point Database Update. Nucleic Acids Res. 2021, doi: 10.1093/nar/gkab944
Contact: Lukasz P. Kozlowski