Amino acid dipeptide frequency for Microviridae IME-16

Apart from single amino acid frequencies, one can also calculate the so-called amino acid dipeptide frequency.

There are 400 possibilities (441 if we consider Xaa as an additional 21st amino acid representing all non-standard or unknown amino acids). Thus, if dipeptides occurred randomly in proteins, each of the 400 amino acid dipeptides would be present at a frequency of 0.25% (2.5‰). As this is not the case in nature, for better visibility all dipeptides that are more frequent than expected are marked in red, and those that are underrepresented are marked in blue in the table.
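The counting described above can be sketched in a few lines of Python. This is a minimal illustration, not the code used to build the table: the function names, the one-letter alphabet, the choice to count only pairs within a protein, and the toy sequences are all assumptions made for the example.

```python
from collections import Counter
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues, one-letter codes
ALPHABET = AMINO_ACIDS + "X"          # X stands for Xaa (non-standard or unknown)

# Expected frequency if all 400 standard dipeptides were equally likely
EXPECTED_UNIFORM = 1000.0 / 400       # 2.5 per mille, i.e. 0.25%


def dipeptide_per_mille(proteins):
    """Return {dipeptide: frequency in per mille} over all adjacent residue pairs.

    Assumption: pairs are counted within each protein only, never across
    the boundary between two proteins.
    """
    counts = Counter()
    for seq in proteins:
        # Map anything outside the standard 20 residues to X (Xaa)
        seq = "".join(aa if aa in AMINO_ACIDS else "X" for aa in seq.upper())
        counts.update(seq[i:i + 2] for i in range(len(seq) - 1))
    total = sum(counts.values())
    if total == 0:
        return {a + b: 0.0 for a, b in product(ALPHABET, repeat=2)}
    return {a + b: 1000.0 * counts[a + b] / total
            for a, b in product(ALPHABET, repeat=2)}


def classify(value, expected=EXPECTED_UNIFORM):
    """Label a per mille value relative to the uniform expectation."""
    if value > expected:
        return "over"
    if value < expected:
        return "under"
    return "expected"


# Toy input (hypothetical sequences, not the IME-16 proteome)
toy = ["MKKLAS", "AKKB"]
freqs = dipeptide_per_mille(toy)
print(len(freqs), freqs["KK"], classify(freqs["KK"]))
```

With real data, `proteins` would be the list of protein sequences of the proteome, and `classify` would decide whether a table cell is shown as overrepresented or underrepresented.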

All values are given in per mille (‰), so they must be multiplied by 10^-3 to obtain fractions.
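As a worked conversion, taking the AlaAla value from the table (7.005‰) and the uniform expectation of 0.25%:

```latex
7.005\,\text{‰} = 7.005 \times 10^{-3} = 0.007005 = 0.7005\,\%
\qquad
0.25\,\% = \frac{1}{400} = 2.5 \times 10^{-3} = 2.5\,\text{‰}
```

On this scale AlaAla (7.005‰) lies above the 2.5‰ expectation, while a value such as AlaPhe (0.584‰) lies below it.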

For more information, see the sequence space article on Wikipedia.

Ala Cys Asp Glu Phe Gly His Ile Lys Leu Met Asn Pro Gln Arg Ser Thr Val Trp Tyr Xaa
Ala
AlaAla: 7.005 ± 2.946
AlaCys: 1.168 ± 1.049
AlaAsp: 4.086 ± 1.39
AlaGlu: 2.919 ± 2.246
AlaPhe: 0.584 ± 0.365
AlaGly: 2.335 ± 0.811
AlaHis: 0.0 ± 0.0
AlaIle: 3.503 ± 0.873
AlaLys: 2.919 ± 1.709
AlaLeu: 6.421 ± 1.933
AlaMet: 0.584 ± 0.365
AlaAsn: 4.67 ± 1.847
AlaPro: 5.254 ± 1.47
AlaGln: 0.584 ± 0.557
AlaArg: 4.086 ± 1.402
AlaSer: 7.005 ± 1.51
AlaThr: 4.086 ± 2.541
AlaVal: 4.67 ± 1.813
AlaTrp: 1.751 ± 0.843
AlaTyr: 3.503 ± 0.883
AlaXaa: 0.0 ± 0.0
Cys
CysAla: 0.584 ± 0.746
CysCys: 1.168 ± 1.269
CysAsp: 1.168 ± 0.73
CysGlu: 0.584 ± 0.767
CysPhe: 0.584 ± 0.767
CysGly: 0.0 ± 0.0
CysHis: 0.0 ± 0.0
CysIle: 1.168 ± 0.73
CysLys: 1.168 ± 1.168
CysLeu: 1.751 ± 1.45
CysMet: 1.168 ± 0.977
CysAsn: 1.168 ± 0.73
CysPro: 0.0 ± 0.0
CysGln: 0.584 ± 0.365
CysArg: 1.168 ± 1.459
CysSer: 0.0 ± 0.0
CysThr: 0.584 ± 0.767
CysVal: 1.751 ± 1.057
CysTrp: 0.0 ± 0.0
CysTyr: 1.751 ± 0.843
CysXaa: 0.0 ± 0.0
Asp
AspAla: 4.086 ± 2.554
AspCys: 1.168 ± 0.727
AspAsp: 3.503 ± 1.144
AspGlu: 2.919 ± 1.534
AspPhe: 5.838 ± 1.076
AspGly: 2.335 ± 0.641
AspHis: 0.584 ± 0.767
AspIle: 3.503 ± 1.336
AspLys: 4.086 ± 0.742
AspLeu: 3.503 ± 1.102
AspMet: 2.335 ± 1.5
AspAsn: 5.838 ± 1.922
AspPro: 2.335 ± 0.711
AspGln: 1.751 ± 1.401
AspArg: 1.751 ± 0.843
AspSer: 9.34 ± 2.658
AspThr: 6.421 ± 2.161
AspVal: 5.838 ± 1.818
AspTrp: 1.168 ± 1.114
AspTyr: 3.503 ± 1.205
AspXaa: 0.0 ± 0.0
Glu
GluAla: 2.919 ± 1.083
GluCys: 0.0 ± 0.0
GluAsp: 2.335 ± 1.135
GluGlu: 1.751 ± 0.818
GluPhe: 2.335 ± 1.907
GluGly: 1.751 ± 1.03
GluHis: 0.584 ± 0.746
GluIle: 1.168 ± 0.727
GluLys: 2.335 ± 1.388
GluLeu: 2.919 ± 1.211
GluMet: 0.0 ± 0.0
GluAsn: 0.0 ± 0.0
GluPro: 1.168 ± 0.819
GluGln: 1.751 ± 1.025
GluArg: 2.335 ± 1.354
GluSer: 1.751 ± 1.778
GluThr: 1.751 ± 0.688
GluVal: 4.67 ± 2.479
GluTrp: 0.0 ± 0.0
GluTyr: 3.503 ± 1.405
GluXaa: 0.0 ± 0.0
Phe
PheAla: 2.919 ± 1.879
PheCys: 1.168 ± 0.727
PheAsp: 7.005 ± 2.195
PheGlu: 3.503 ± 1.233
PhePhe: 1.168 ± 0.707
PheGly: 4.67 ± 1.971
PheHis: 1.168 ± 1.049
PheIle: 2.919 ± 1.575
PheLys: 2.335 ± 1.414
PheLeu: 7.589 ± 2.272
PheMet: 0.584 ± 0.557
PheAsn: 4.086 ± 1.243
PhePro: 4.67 ± 1.747
PheGln: 1.751 ± 0.721
PheArg: 1.751 ± 1.096
PheSer: 2.919 ± 1.226
PheThr: 4.086 ± 1.603
PheVal: 2.919 ± 1.147
PheTrp: 0.584 ± 0.767
PheTyr: 0.0 ± 0.0
PheXaa: 0.0 ± 0.0
Gly
GlyAla: 2.919 ± 0.96
GlyCys: 0.0 ± 0.0
GlyAsp: 6.421 ± 1.611
GlyGlu: 0.584 ± 0.557
GlyPhe: 1.751 ± 0.858
GlyGly: 1.751 ± 0.858
GlyHis: 0.584 ± 0.365
GlyIle: 4.086 ± 0.845
GlyLys: 2.919 ± 1.408
GlyLeu: 1.751 ± 0.721
GlyMet: 0.584 ± 0.365
GlyAsn: 4.086 ± 1.147
GlyPro: 0.0 ± 0.0
GlyGln: 1.168 ± 0.73
GlyArg: 1.751 ± 0.843
GlySer: 4.086 ± 2.054
GlyThr: 0.0 ± 0.0
GlyVal: 5.254 ± 1.642
GlyTrp: 0.0 ± 0.0
GlyTyr: 5.838 ± 1.883
GlyXaa: 0.0 ± 0.0
His
HisAla: 0.0 ± 0.0
HisCys: 0.0 ± 0.0
HisAsp: 2.335 ± 0.92
HisGlu: 1.168 ± 0.707
HisPhe: 0.584 ± 0.365
HisGly: 1.168 ± 0.73
HisHis: 0.584 ± 0.767
HisIle: 2.335 ± 1.475
HisLys: 0.0 ± 0.0
HisLeu: 1.168 ± 0.727
HisMet: 0.584 ± 0.365
HisAsn: 1.751 ± 1.155
HisPro: 1.168 ± 1.143
HisGln: 0.0 ± 0.0
HisArg: 0.0 ± 0.0
HisSer: 1.751 ± 1.03
HisThr: 0.584 ± 0.767
HisVal: 0.584 ± 0.365
HisTrp: 0.0 ± 0.0
HisTyr: 1.168 ± 0.727
HisXaa: 0.0 ± 0.0
Ile
IleAla: 2.919 ± 2.186
IleCys: 0.584 ± 0.737
IleAsp: 3.503 ± 1.294
IleGlu: 1.168 ± 0.53
IlePhe: 1.751 ± 0.858
IleGly: 4.67 ± 1.215
IleHis: 1.168 ± 0.707
IleIle: 1.168 ± 0.53
IleLys: 4.67 ± 1.009
IleLeu: 3.503 ± 0.839
IleMet: 0.0 ± 0.0
IleAsn: 2.919 ± 1.047
IlePro: 1.168 ± 0.53
IleGln: 2.335 ± 1.012
IleArg: 4.67 ± 2.402
IleSer: 7.589 ± 2.053
IleThr: 3.503 ± 0.941
IleVal: 0.584 ± 0.767
IleTrp: 0.584 ± 0.767
IleTyr: 4.67 ± 1.936
IleXaa: 0.0 ± 0.0
Lys
LysAla: 3.503 ± 2.563
LysCys: 0.584 ± 0.365
LysAsp: 4.086 ± 2.166
LysGlu: 2.335 ± 1.844
LysPhe: 3.503 ± 1.365
LysGly: 2.335 ± 1.31
LysHis: 0.0 ± 0.0
LysIle: 4.67 ± 2.419
LysLys: 4.67 ± 2.035
LysLeu: 7.589 ± 3.739
LysMet: 1.168 ± 0.53
LysAsn: 4.086 ± 2.582
LysPro: 1.751 ± 1.096
LysGln: 2.919 ± 1.491
LysArg: 5.254 ± 1.197
LysSer: 3.503 ± 1.591
LysThr: 3.503 ± 1.715
LysVal: 2.919 ± 1.807
LysTrp: 0.0 ± 0.0
LysTyr: 3.503 ± 1.649
LysXaa: 0.0 ± 0.0
Leu
LeuAla: 7.005 ± 2.124
LeuCys: 4.086 ± 2.897
LeuAsp: 6.421 ± 1.713
LeuGlu: 0.0 ± 0.0
LeuPhe: 3.503 ± 0.881
LeuGly: 6.421 ± 1.701
LeuHis: 1.751 ± 1.568
LeuIle: 5.254 ± 1.644
LeuLys: 8.173 ± 1.576
LeuLeu: 7.589 ± 2.46
LeuMet: 1.751 ± 1.011
LeuAsn: 6.421 ± 1.229
LeuPro: 4.086 ± 1.338
LeuGln: 1.168 ± 1.114
LeuArg: 4.086 ± 0.906
LeuSer: 10.508 ± 2.571
LeuThr: 4.67 ± 1.583
LeuVal: 3.503 ± 1.591
LeuTrp: 0.0 ± 0.0
LeuTyr: 1.751 ± 1.406
LeuXaa: 0.0 ± 0.0
Met
MetAla: 0.584 ± 0.365
MetCys: 0.0 ± 0.0
MetAsp: 0.0 ± 0.0
MetGlu: 0.0 ± 0.0
MetPhe: 2.335 ± 1.47
MetGly: 0.584 ± 0.557
MetHis: 0.0 ± 0.0
MetIle: 0.584 ± 0.746
MetLys: 0.584 ± 0.557
MetLeu: 2.919 ± 0.856
MetMet: 0.0 ± 0.0
MetAsn: 0.584 ± 0.557
MetPro: 1.751 ± 0.602
MetGln: 0.0 ± 0.0
MetArg: 2.335 ± 1.012
MetSer: 1.751 ± 0.721
MetThr: 0.584 ± 0.365
MetVal: 0.584 ± 1.087
MetTrp: 0.0 ± 0.0
MetTyr: 0.584 ± 0.767
MetXaa: 0.0 ± 0.0
Asn
AsnAla: 5.838 ± 1.877
AsnCys: 0.584 ± 0.365
AsnAsp: 1.751 ± 1.193
AsnGlu: 2.335 ± 1.769
AsnPhe: 4.67 ± 1.382
AsnGly: 2.335 ± 1.034
AsnHis: 1.168 ± 0.53
AsnIle: 4.086 ± 1.219
AsnLys: 5.254 ± 0.924
AsnLeu: 3.503 ± 1.034
AsnMet: 0.584 ± 0.557
AsnAsn: 4.67 ± 2.389
AsnPro: 2.335 ± 1.399
AsnGln: 0.584 ± 0.557
AsnArg: 1.168 ± 0.762
AsnSer: 9.924 ± 3.314
AsnThr: 1.751 ± 0.818
AsnVal: 5.254 ± 1.281
AsnTrp: 0.0 ± 0.0
AsnTyr: 1.751 ± 1.03
AsnXaa: 0.0 ± 0.0
Pro
ProAla: 5.254 ± 1.362
ProCys: 0.584 ± 0.767
ProAsp: 4.086 ± 1.39
ProGlu: 0.584 ± 0.365
ProPhe: 1.751 ± 1.025
ProGly: 1.168 ± 0.73
ProHis: 1.168 ± 0.727
ProIle: 1.168 ± 0.53
ProLys: 0.584 ± 1.087
ProLeu: 7.005 ± 2.044
ProMet: 0.0 ± 0.0
ProAsn: 1.168 ± 0.707
ProPro: 1.751 ± 1.778
ProGln: 0.584 ± 0.365
ProArg: 1.751 ± 1.057
ProSer: 3.503 ± 1.441
ProThr: 4.67 ± 2.024
ProVal: 4.67 ± 3.272
ProTrp: 0.0 ± 0.0
ProTyr: 2.919 ± 2.767
ProXaa: 0.0 ± 0.0
Gln
GlnAla: 0.584 ± 0.557
GlnCys: 0.584 ± 0.365
GlnAsp: 2.335 ± 1.135
GlnGlu: 0.584 ± 0.746
GlnPhe: 1.751 ± 1.108
GlnGly: 1.751 ± 1.025
GlnHis: 0.584 ± 0.746
GlnIle: 1.168 ± 1.114
GlnLys: 0.0 ± 0.0
GlnLeu: 2.919 ± 1.34
GlnMet: 0.0 ± 0.0
GlnAsn: 1.751 ± 1.671
GlnPro: 1.751 ± 1.096
GlnGln: 0.0 ± 0.0
GlnArg: 0.584 ± 0.557
GlnSer: 2.919 ± 0.96
GlnThr: 1.751 ± 0.907
GlnVal: 0.0 ± 0.0
GlnTrp: 0.0 ± 0.0
GlnTyr: 1.168 ± 0.834
GlnXaa: 0.0 ± 0.0
Arg
ArgAla: 2.919 ± 1.123
ArgCys: 0.584 ± 0.746
ArgAsp: 1.168 ± 0.53
ArgGlu: 4.67 ± 0.51
ArgPhe: 3.503 ± 0.883
ArgGly: 3.503 ± 0.881
ArgHis: 0.584 ± 0.557
ArgIle: 4.086 ± 1.39
ArgLys: 4.67 ± 1.127
ArgLeu: 5.254 ± 1.356
ArgMet: 1.168 ± 0.834
ArgAsn: 2.335 ± 1.1
ArgPro: 1.168 ± 1.193
ArgGln: 2.335 ± 0.72
ArgArg: 2.335 ± 1.088
ArgSer: 1.751 ± 1.45
ArgThr: 0.584 ± 1.087
ArgVal: 2.335 ± 1.017
ArgTrp: 0.0 ± 0.0
ArgTyr: 3.503 ± 1.316
ArgXaa: 0.0 ± 0.0
Ser
SerAla: 7.589 ± 1.879
SerCys: 1.751 ± 1.096
SerAsp: 7.005 ± 1.455
SerGlu: 2.919 ± 1.404
SerPhe: 9.924 ± 3.159
SerGly: 4.67 ± 1.478
SerHis: 3.503 ± 1.854
SerIle: 3.503 ± 1.879
SerLys: 4.086 ± 1.147
SerLeu: 9.924 ± 2.94
SerMet: 1.751 ± 0.954
SerAsn: 5.254 ± 1.941
SerPro: 4.086 ± 1.006
SerGln: 1.751 ± 0.818
SerArg: 5.254 ± 1.445
SerSer: 15.762 ± 4.084
SerThr: 8.173 ± 2.052
SerVal: 8.173 ± 2.809
SerTrp: 0.0 ± 0.0
SerTyr: 5.838 ± 1.859
SerXaa: 0.0 ± 0.0
Thr
ThrAla: 4.086 ± 1.189
ThrCys: 0.0 ± 0.0
ThrAsp: 7.589 ± 2.094
ThrGlu: 1.751 ± 0.907
ThrPhe: 2.919 ± 1.389
ThrGly: 0.0 ± 0.0
ThrHis: 1.751 ± 0.688
ThrIle: 3.503 ± 1.406
ThrLys: 2.919 ± 0.557
ThrLeu: 4.67 ± 2.291
ThrMet: 0.0 ± 0.0
ThrAsn: 2.335 ± 1.135
ThrPro: 2.919 ± 1.534
ThrGln: 0.0 ± 0.0
ThrArg: 2.335 ± 1.034
ThrSer: 9.34 ± 1.873
ThrThr: 0.584 ± 0.365
ThrVal: 3.503 ± 2.094
ThrTrp: 0.0 ± 0.0
ThrTyr: 5.254 ± 1.141
ThrXaa: 0.0 ± 0.0
Val
ValAla: 3.503 ± 1.48
ValCys: 1.751 ± 1.834
ValAsp: 3.503 ± 1.636
ValGlu: 3.503 ± 3.589
ValPhe: 4.086 ± 1.483
ValGly: 0.584 ± 0.365
ValHis: 0.584 ± 0.365
ValIle: 1.751 ± 1.786
ValLys: 2.919 ± 2.324
ValLeu: 3.503 ± 1.429
ValMet: 1.751 ± 0.71
ValAsn: 4.67 ± 1.921
ValPro: 4.67 ± 2.584
ValGln: 1.751 ± 1.315
ValArg: 3.503 ± 0.657
ValSer: 9.34 ± 1.323
ValThr: 5.254 ± 2.051
ValVal: 9.34 ± 3.985
ValTrp: 0.0 ± 0.0
ValTyr: 3.503 ± 1.233
ValXaa: 0.0 ± 0.0
Trp
TrpAla: 0.584 ± 0.746
TrpCys: 0.0 ± 0.0
TrpAsp: 0.0 ± 0.0
TrpGlu: 0.0 ± 0.0
TrpPhe: 0.584 ± 0.365
TrpGly: 0.584 ± 0.557
TrpHis: 0.0 ± 0.0
TrpIle: 0.0 ± 0.0
TrpLys: 1.751 ± 1.025
TrpLeu: 0.0 ± 0.0
TrpMet: 0.0 ± 0.0
TrpAsn: 0.0 ± 0.0
TrpPro: 0.0 ± 0.0
TrpGln: 0.0 ± 0.0
TrpArg: 0.0 ± 0.0
TrpSer: 1.168 ± 1.535
TrpThr: 0.0 ± 0.0
TrpVal: 0.0 ± 0.0
TrpTrp: 0.0 ± 0.0
TrpTyr: 0.0 ± 0.0
TrpXaa: 0.0 ± 0.0
Tyr
TyrAla: 2.335 ± 2.203
TyrCys: 0.584 ± 0.365
TyrAsp: 3.503 ± 2.226
TyrGlu: 1.751 ± 0.675
TyrPhe: 4.67 ± 1.088
TyrGly: 2.919 ± 1.302
TyrHis: 1.168 ± 0.834
TyrIle: 3.503 ± 1.38
TyrLys: 5.838 ± 0.87
TyrLeu: 4.67 ± 1.912
TyrMet: 1.168 ± 0.891
TyrAsn: 1.751 ± 0.858
TyrPro: 2.335 ± 1.1
TyrGln: 1.168 ± 0.762
TyrArg: 2.335 ± 1.354
TyrSer: 7.005 ± 2.735
TyrThr: 2.919 ± 0.913
TyrVal: 2.919 ± 1.441
TyrTrp: 0.584 ± 0.557
TyrTyr: 2.335 ± 1.1
TyrXaa: 0.0 ± 0.0
Xaa
XaaAla: 0.0 ± 0.0
XaaCys: 0.0 ± 0.0
XaaAsp: 0.0 ± 0.0
XaaGlu: 0.0 ± 0.0
XaaPhe: 0.0 ± 0.0
XaaGly: 0.0 ± 0.0
XaaHis: 0.0 ± 0.0
XaaIle: 0.0 ± 0.0
XaaLys: 0.0 ± 0.0
XaaLeu: 0.0 ± 0.0
XaaMet: 0.0 ± 0.0
XaaAsn: 0.0 ± 0.0
XaaPro: 0.0 ± 0.0
XaaGln: 0.0 ± 0.0
XaaArg: 0.0 ± 0.0
XaaSer: 0.0 ± 0.0
XaaThr: 0.0 ± 0.0
XaaVal: 0.0 ± 0.0
XaaTrp: 0.0 ± 0.0
XaaTyr: 0.0 ± 0.0
XaaXaa: 0.0 ± 0.0
Statistics based on 7 proteins (1714 amino acids)

Note: The error was estimated by bootstrapping (100 replicates) at the protein level.
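A protein-level bootstrap of this kind can be sketched as follows. This is only an illustration under stated assumptions: the function names are hypothetical, whole proteins are resampled with replacement, and the spread is reported as the sample standard deviation of the replicate estimates. The source does not say which spread statistic the ± values use.

```python
import random
import statistics


def pair_frequency(proteins, dipeptide):
    """Per mille frequency of one dipeptide over all adjacent pairs in `proteins`."""
    total = sum(max(len(seq) - 1, 0) for seq in proteins)
    hits = sum(seq[i:i + 2] == dipeptide
               for seq in proteins
               for i in range(len(seq) - 1))
    return 1000.0 * hits / total if total else 0.0


def bootstrap_error(proteins, dipeptide, replicates=100, seed=0):
    """Spread of the frequency estimate across protein-level bootstrap replicates.

    Each replicate draws len(proteins) whole proteins with replacement,
    so the proteins (not single residues) are the resampling unit.
    """
    rng = random.Random(seed)  # fixed seed keeps the sketch reproducible
    estimates = []
    for _ in range(replicates):
        sample = rng.choices(proteins, k=len(proteins))
        estimates.append(pair_frequency(sample, dipeptide))
    return statistics.stdev(estimates)


# Toy input (hypothetical sequences, not the IME-16 proteome)
toy = ["MKKLAS", "AKKLSS", "MSSSLA", "GSSKKA"]
print(pair_frequency(toy, "SS"), bootstrap_error(toy, "SS"))
```

A dipeptide that never occurs has a frequency of zero in every replicate, which is why the absent dipeptides in the table are listed as 0.0 ± 0.0.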

The dipeptide statistics above (along with other statistics for this proteome) can be downloaded from this CSV file
See this proteome in: uniprot_link
Proteome-pI is available under the Creative Commons Attribution-NoDerivs license; for more details see here

Reference: Kozlowski LP. Proteome-pI 2.0: Proteome Isoelectric Point Database Update. Nucleic Acids Res. 2021, doi: 10.1093/nar/gkab944
Contact: Lukasz P. Kozlowski