Amino acid dipeptide frequency for Sewage-associated gemycircularvirus 9

Apart from single amino acid frequencies, one can also calculate the so-called amino acid dipeptide frequency, that is, how often each ordered pair of adjacent amino acids occurs.

There are 400 possible dipeptides (441 if Xaa is counted as an additional, 21st amino acid representing all non-standard or unknown amino acids). Thus, if dipeptides occurred randomly in proteins, each of the 400 standard dipeptides would be present at a frequency of 0.25% (2.5‰). As this is not the case in nature, for better visibility dipeptides that occur more often than expected are marked in red in the table, and those that are underrepresented are marked in blue.
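As a rough illustration of the calculation described above, the following Python sketch counts overlapping dipeptides in a set of protein sequences and compares their frequencies (in per mille) with the uniform expectation of 1/400. The function names, the mapping of every non-standard letter to Xaa ("X"), and the toy sequences are assumptions made for this example; the exact procedure used by Proteome-pI is not described on this page.

```python
from collections import Counter

STANDARD = "ACDEFGHIKLMNPQRSTVWY"  # 20 standard amino acids, one-letter codes

# Uniform expectation if all 400 standard dipeptides were equally likely.
EXPECTED_PER_MILLE = 1000.0 / len(STANDARD) ** 2  # 2.5 per mille = 0.25%


def dipeptide_frequencies(sequences):
    """Return {dipeptide: frequency in per mille} over all sequences.

    Dipeptides are counted as overlapping windows of length 2 within each
    sequence. Assumption: any non-standard letter is treated as 'X' (Xaa).
    """
    counts = Counter()
    for seq in sequences:
        cleaned = "".join(c if c in STANDARD else "X" for c in seq.upper())
        for i in range(len(cleaned) - 1):
            counts[cleaned[i:i + 2]] += 1
    total = sum(counts.values())
    if total == 0:
        return {}
    return {pair: 1000.0 * n / total for pair, n in counts.items()}


def classify(freq_per_mille, expected=EXPECTED_PER_MILLE):
    """Label a frequency as over- or under-represented versus the expectation."""
    if freq_per_mille > expected:
        return "over"
    if freq_per_mille < expected:
        return "under"
    return "expected"


if __name__ == "__main__":
    toy = ["MGGRRAL", "GGSTP"]  # hypothetical sequences, not from this proteome
    for pair, freq in sorted(dipeptide_frequencies(toy).items()):
        print(pair, round(freq, 3), classify(freq))
```

With this labelling, a value such as GlyGly at 19.715‰ would count as over-represented, and any dipeptide at 0.0‰ as under-represented.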

All values are given in per mille (‰), so they need to be multiplied by 10^-3 to obtain fractions.
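The unit conversion can be written out as a small Python sketch. The helper names are hypothetical; the GlyGly value is taken from the table on this page.

```python
def per_mille_to_fraction(value):
    """Convert a per mille value to a plain fraction (multiply by 10^-3)."""
    return value * 1e-3


def per_mille_to_percent(value):
    """Convert a per mille value to percent (1 per mille = 0.1%)."""
    return value * 0.1


if __name__ == "__main__":
    gly_gly = 19.715  # GlyGly frequency from the table, in per mille
    print(per_mille_to_fraction(gly_gly))  # about 0.0197
    print(per_mille_to_percent(gly_gly))   # about 1.97 (percent)
```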

For more information, see the sequence space article on Wikipedia.

Ala Cys Asp Glu Phe Gly His Ile Lys Leu Met Asn Pro Gln Arg Ser Thr Val Trp Tyr Xaa
Ala
AlaAla: 2.191 ± 1.029
AlaCys: 2.191 ± 1.029
AlaAsp: 6.572 ± 0.272
AlaGlu: 1.095 ± 1.011
AlaPhe: 5.476 ± 1.064
AlaGly: 2.191 ± 2.022
AlaHis: 0.0 ± 0.0
AlaIle: 0.0 ± 0.0
AlaLys: 1.095 ± 1.011
AlaLeu: 9.858 ± 4.653
AlaMet: 0.0 ± 0.0
AlaAsn: 1.095 ± 0.808
AlaPro: 4.381 ± 0.992
AlaGln: 3.286 ± 0.136
AlaArg: 4.381 ± 0.721
AlaSer: 6.572 ± 1.609
AlaThr: 2.191 ± 2.022
AlaVal: 6.572 ± 1.609
AlaTrp: 2.191 ± 0.755
AlaTyr: 0.0 ± 0.0
AlaXaa: 0.0 ± 0.0
Cys
CysAla: 0.0 ± 0.0
CysCys: 2.191 ± 1.029
CysAsp: 2.191 ± 1.029
CysGlu: 0.0 ± 0.0
CysPhe: 4.381 ± 2.058
CysGly: 4.381 ± 0.992
CysHis: 0.0 ± 0.0
CysIle: 2.191 ± 1.029
CysLys: 0.0 ± 0.0
CysLeu: 1.095 ± 0.808
CysMet: 2.191 ± 1.616
CysAsn: 0.0 ± 0.0
CysPro: 1.095 ± 1.011
CysGln: 1.095 ± 0.808
CysArg: 5.476 ± 2.706
CysSer: 0.0 ± 0.0
CysThr: 0.0 ± 0.0
CysVal: 2.191 ± 1.616
CysTrp: 0.0 ± 0.0
CysTyr: 2.191 ± 1.029
CysXaa: 0.0 ± 0.0
Asp
AspAla: 1.095 ± 1.011
AspCys: 0.0 ± 0.0
AspAsp: 3.286 ± 1.632
AspGlu: 1.095 ± 0.808
AspPhe: 3.286 ± 0.136
AspGly: 6.572 ± 3.087
AspHis: 1.095 ± 1.011
AspIle: 8.762 ± 1.442
AspLys: 2.191 ± 1.029
AspLeu: 6.572 ± 3.453
AspMet: 1.095 ± 1.011
AspAsn: 2.191 ± 1.029
AspPro: 4.381 ± 0.721
AspGln: 0.0 ± 0.0
AspArg: 4.381 ± 2.058
AspSer: 1.095 ± 1.011
AspThr: 7.667 ± 5.639
AspVal: 3.286 ± 3.032
AspTrp: 6.572 ± 1.609
AspTyr: 4.381 ± 2.361
AspXaa: 0.0 ± 0.0
Glu
GluAla: 4.381 ± 2.058
GluCys: 1.095 ± 0.808
GluAsp: 0.0 ± 0.0
GluGlu: 0.0 ± 0.0
GluPhe: 4.381 ± 2.058
GluGly: 0.0 ± 0.0
GluHis: 0.0 ± 0.0
GluIle: 1.095 ± 1.011
GluLys: 3.286 ± 0.136
GluLeu: 4.381 ± 2.058
GluMet: 1.095 ± 0.714
GluAsn: 6.572 ± 1.609
GluPro: 3.286 ± 1.632
GluGln: 0.0 ± 0.0
GluArg: 0.0 ± 0.0
GluSer: 6.572 ± 1.609
GluThr: 1.095 ± 1.011
GluVal: 1.095 ± 0.808
GluTrp: 1.095 ± 0.808
GluTyr: 2.191 ± 1.029
GluXaa: 0.0 ± 0.0
Phe
PheAla: 3.286 ± 0.136
PheCys: 0.0 ± 0.0
PheAsp: 6.572 ± 1.609
PheGlu: 0.0 ± 0.0
PhePhe: 4.381 ± 0.721
PheGly: 3.286 ± 0.136
PheHis: 0.0 ± 0.0
PheIle: 3.286 ± 0.136
PheLys: 2.191 ± 0.755
PheLeu: 2.191 ± 2.022
PheMet: 0.0 ± 0.0
PheAsn: 2.191 ± 1.029
PhePro: 3.286 ± 1.59
PheGln: 0.0 ± 0.0
PheArg: 4.381 ± 0.721
PheSer: 1.095 ± 1.011
PheThr: 4.381 ± 1.011
PheVal: 5.476 ± 2.607
PheTrp: 2.191 ± 1.029
PheTyr: 0.0 ± 0.0
PheXaa: 0.0 ± 0.0
Gly
GlyAla: 4.381 ± 0.992
GlyCys: 2.191 ± 0.755
GlyAsp: 7.667 ± 0.651
GlyGlu: 5.476 ± 2.607
GlyPhe: 2.191 ± 2.022
GlyGly: 19.715 ± 6.234
GlyHis: 0.0 ± 0.0
GlyIle: 7.667 ± 2.628
GlyLys: 2.191 ± 1.616
GlyLeu: 6.572 ± 1.609
GlyMet: 1.095 ± 0.871
GlyAsn: 4.381 ± 4.043
GlyPro: 3.286 ± 0.136
GlyGln: 0.0 ± 0.0
GlyArg: 10.953 ± 3.765
GlySer: 3.286 ± 1.727
GlyThr: 6.572 ± 0.272
GlyVal: 3.286 ± 0.136
GlyTrp: 2.191 ± 1.029
GlyTyr: 2.191 ± 1.029
GlyXaa: 0.0 ± 0.0
His
HisAla: 2.191 ± 1.029
HisCys: 4.381 ± 2.058
HisAsp: 0.0 ± 0.0
HisGlu: 3.286 ± 0.136
HisPhe: 1.095 ± 1.011
HisGly: 0.0 ± 0.0
HisHis: 0.0 ± 0.0
HisIle: 1.095 ± 0.808
HisLys: 0.0 ± 0.0
HisLeu: 2.191 ± 1.029
HisMet: 0.0 ± 0.0
HisAsn: 0.0 ± 0.0
HisPro: 2.191 ± 1.029
HisGln: 0.0 ± 0.0
HisArg: 1.095 ± 0.871
HisSer: 1.095 ± 1.011
HisThr: 1.095 ± 1.011
HisVal: 0.0 ± 0.0
HisTrp: 0.0 ± 0.0
HisTyr: 0.0 ± 0.0
HisXaa: 0.0 ± 0.0
Ile
IleAla: 1.095 ± 1.011
IleCys: 3.286 ± 1.632
IleAsp: 2.191 ± 0.755
IleGlu: 2.191 ± 1.029
IlePhe: 3.286 ± 1.192
IleGly: 8.762 ± 1.115
IleHis: 0.0 ± 0.0
IleIle: 2.191 ± 2.022
IleLys: 1.095 ± 0.808
IleLeu: 2.191 ± 0.907
IleMet: 0.0 ± 0.0
IleAsn: 2.191 ± 2.022
IlePro: 1.095 ± 0.871
IleGln: 0.0 ± 0.0
IleArg: 5.476 ± 1.015
IleSer: 1.095 ± 0.871
IleThr: 6.572 ± 3.026
IleVal: 2.191 ± 2.022
IleTrp: 1.095 ± 0.808
IleTyr: 1.095 ± 0.808
IleXaa: 0.0 ± 0.0
Lys
LysAla: 1.095 ± 0.808
LysCys: 1.095 ± 0.808
LysAsp: 3.286 ± 1.727
LysGlu: 2.191 ± 0.755
LysPhe: 3.286 ± 1.632
LysGly: 3.286 ± 0.136
LysHis: 0.0 ± 0.0
LysIle: 0.0 ± 0.0
LysLys: 1.095 ± 1.011
LysLeu: 1.095 ± 0.808
LysMet: 1.095 ± 0.815
LysAsn: 1.095 ± 0.808
LysPro: 3.286 ± 0.136
LysGln: 0.0 ± 0.0
LysArg: 5.476 ± 2.017
LysSer: 5.476 ± 1.064
LysThr: 3.286 ± 1.59
LysVal: 0.0 ± 0.0
LysTrp: 3.286 ± 1.632
LysTyr: 2.191 ± 1.029
LysXaa: 0.0 ± 0.0
Leu
LeuAla: 2.191 ± 0.907
LeuCys: 2.191 ± 1.741
LeuAsp: 7.667 ± 2.089
LeuGlu: 2.191 ± 1.029
LeuPhe: 0.0 ± 0.0
LeuGly: 5.476 ± 2.607
LeuHis: 4.381 ± 2.058
LeuIle: 0.0 ± 0.0
LeuLys: 2.191 ± 2.022
LeuLeu: 6.572 ± 1.806
LeuMet: 0.0 ± 0.0
LeuAsn: 1.095 ± 1.011
LeuPro: 4.381 ± 2.661
LeuGln: 6.572 ± 1.806
LeuArg: 2.191 ± 1.741
LeuSer: 6.572 ± 1.536
LeuThr: 3.286 ± 0.136
LeuVal: 7.667 ± 2.312
LeuTrp: 2.191 ± 0.755
LeuTyr: 3.286 ± 0.136
LeuXaa: 0.0 ± 0.0
Met
MetAla: 0.0 ± 0.0
MetCys: 0.0 ± 0.0
MetAsp: 1.095 ± 1.011
MetGlu: 1.095 ± 0.808
MetPhe: 0.0 ± 0.0
MetGly: 2.191 ± 1.741
MetHis: 0.0 ± 0.0
MetIle: 1.095 ± 1.011
MetLys: 1.095 ± 0.808
MetLeu: 1.095 ± 1.011
MetMet: 0.0 ± 0.0
MetAsn: 1.095 ± 0.808
MetPro: 3.286 ± 0.136
MetGln: 0.0 ± 0.0
MetArg: 2.191 ± 2.022
MetSer: 1.095 ± 0.808
MetThr: 1.095 ± 1.011
MetVal: 0.0 ± 0.0
MetTrp: 0.0 ± 0.0
MetTyr: 2.191 ± 0.907
MetXaa: 0.0 ± 0.0
Asn
AsnAla: 3.286 ± 3.032
AsnCys: 1.095 ± 0.808
AsnAsp: 2.191 ± 2.022
AsnGlu: 0.0 ± 0.0
AsnPhe: 0.0 ± 0.0
AsnGly: 2.191 ± 1.029
AsnHis: 2.191 ± 1.029
AsnIle: 0.0 ± 0.0
AsnLys: 1.095 ± 0.808
AsnLeu: 5.476 ± 2.706
AsnMet: 1.095 ± 1.011
AsnAsn: 2.191 ± 2.022
AsnPro: 0.0 ± 0.0
AsnGln: 0.0 ± 0.0
AsnArg: 3.286 ± 0.136
AsnSer: 3.286 ± 1.192
AsnThr: 4.381 ± 1.509
AsnVal: 4.381 ± 1.011
AsnTrp: 0.0 ± 0.0
AsnTyr: 0.0 ± 0.0
AsnXaa: 0.0 ± 0.0
Pro
ProAla: 3.286 ± 3.032
ProCys: 0.0 ± 0.0
ProAsp: 0.0 ± 0.0
ProGlu: 5.476 ± 1.064
ProPhe: 4.381 ± 1.011
ProGly: 4.381 ± 1.011
ProHis: 2.191 ± 1.029
ProIle: 3.286 ± 1.59
ProLys: 4.381 ± 2.058
ProLeu: 1.095 ± 0.871
ProMet: 1.095 ± 1.011
ProAsn: 2.191 ± 1.029
ProPro: 2.191 ± 0.907
ProGln: 1.095 ± 1.011
ProArg: 7.667 ± 2.089
ProSer: 6.572 ± 3.265
ProThr: 4.381 ± 1.011
ProVal: 3.286 ± 3.032
ProTrp: 0.0 ± 0.0
ProTyr: 1.095 ± 0.871
ProXaa: 0.0 ± 0.0
Gln
GlnAla: 1.095 ± 0.808
GlnCys: 2.191 ± 1.029
GlnAsp: 1.095 ± 1.011
GlnGlu: 0.0 ± 0.0
GlnPhe: 1.095 ± 1.011
GlnGly: 1.095 ± 0.871
GlnHis: 0.0 ± 0.0
GlnIle: 0.0 ± 0.0
GlnLys: 2.191 ± 1.029
GlnLeu: 3.286 ± 0.136
GlnMet: 1.095 ± 0.871
GlnAsn: 0.0 ± 0.0
GlnPro: 0.0 ± 0.0
GlnGln: 0.0 ± 0.0
GlnArg: 2.191 ± 2.022
GlnSer: 4.381 ± 2.058
GlnThr: 0.0 ± 0.0
GlnVal: 0.0 ± 0.0
GlnTrp: 0.0 ± 0.0
GlnTyr: 0.0 ± 0.0
GlnXaa: 0.0 ± 0.0
Arg
ArgAla: 4.381 ± 2.058
ArgCys: 2.191 ± 1.029
ArgAsp: 6.572 ± 1.609
ArgGlu: 9.858 ± 4.627
ArgPhe: 4.381 ± 1.011
ArgGly: 5.476 ± 1.064
ArgHis: 3.286 ± 0.136
ArgIle: 3.286 ± 1.712
ArgLys: 5.476 ± 1.015
ArgLeu: 3.286 ± 1.462
ArgMet: 1.095 ± 0.977
ArgAsn: 0.0 ± 0.0
ArgPro: 4.381 ± 1.011
ArgGln: 2.191 ± 0.907
ArgArg: 15.334 ± 6.519
ArgSer: 6.572 ± 4.547
ArgThr: 7.667 ± 1.029
ArgVal: 3.286 ± 0.136
ArgTrp: 1.095 ± 1.011
ArgTyr: 2.191 ± 1.029
ArgXaa: 0.0 ± 0.0
Ser
SerAla: 6.572 ± 3.026
SerCys: 0.0 ± 0.0
SerAsp: 6.572 ± 0.272
SerGlu: 1.095 ± 1.011
SerPhe: 1.095 ± 0.808
SerGly: 9.858 ± 0.408
SerHis: 0.0 ± 0.0
SerIle: 7.667 ± 1.398
SerLys: 0.0 ± 0.0
SerLeu: 3.286 ± 1.632
SerMet: 0.0 ± 0.0
SerAsn: 4.381 ± 0.721
SerPro: 8.762 ± 2.603
SerGln: 0.0 ± 0.0
SerArg: 7.667 ± 1.029
SerSer: 5.476 ± 2.017
SerThr: 7.667 ± 5.55
SerVal: 3.286 ± 1.632
SerTrp: 0.0 ± 0.0
SerTyr: 1.095 ± 1.011
SerXaa: 0.0 ± 0.0
Thr
ThrAla: 7.667 ± 2.519
ThrCys: 0.0 ± 0.0
ThrAsp: 4.381 ± 2.556
ThrGlu: 1.095 ± 0.808
ThrPhe: 1.095 ± 1.011
ThrGly: 5.476 ± 1.064
ThrHis: 1.095 ± 1.011
ThrIle: 3.286 ± 3.032
ThrLys: 2.191 ± 2.022
ThrLeu: 2.191 ± 0.755
ThrMet: 3.286 ± 1.712
ThrAsn: 1.095 ± 1.011
ThrPro: 7.667 ± 1.029
ThrGln: 2.191 ± 1.029
ThrArg: 5.476 ± 2.017
ThrSer: 7.667 ± 7.076
ThrThr: 5.476 ± 5.054
ThrVal: 2.191 ± 2.022
ThrTrp: 0.0 ± 0.0
ThrTyr: 4.381 ± 1.011
ThrXaa: 0.0 ± 0.0
Val
ValAla: 1.095 ± 1.011
ValCys: 2.191 ± 1.029
ValAsp: 2.191 ± 1.029
ValGlu: 3.286 ± 1.632
ValPhe: 4.381 ± 0.721
ValGly: 8.762 ± 0.551
ValHis: 4.381 ± 2.058
ValIle: 2.191 ± 1.029
ValLys: 5.476 ± 1.525
ValLeu: 1.095 ± 1.011
ValMet: 1.095 ± 1.011
ValAsn: 4.381 ± 1.011
ValPro: 1.095 ± 1.011
ValGln: 0.0 ± 0.0
ValArg: 0.0 ± 0.0
ValSer: 4.381 ± 1.509
ValThr: 1.095 ± 1.011
ValVal: 1.095 ± 1.011
ValTrp: 0.0 ± 0.0
ValTyr: 4.381 ± 2.556
ValXaa: 0.0 ± 0.0
Trp
TrpAla: 3.286 ± 1.632
TrpCys: 1.095 ± 1.011
TrpAsp: 4.381 ± 2.058
TrpGlu: 0.0 ± 0.0
TrpPhe: 0.0 ± 0.0
TrpGly: 1.095 ± 0.808
TrpHis: 1.095 ± 1.011
TrpIle: 0.0 ± 0.0
TrpLys: 2.191 ± 1.029
TrpLeu: 5.476 ± 1.525
TrpMet: 1.095 ± 0.808
TrpAsn: 0.0 ± 0.0
TrpPro: 0.0 ± 0.0
TrpGln: 0.0 ± 0.0
TrpArg: 2.191 ± 2.022
TrpSer: 0.0 ± 0.0
TrpThr: 0.0 ± 0.0
TrpVal: 1.095 ± 0.808
TrpTrp: 0.0 ± 0.0
TrpTyr: 0.0 ± 0.0
TrpXaa: 0.0 ± 0.0
Tyr
TyrAla: 8.762 ± 2.603
TyrCys: 3.286 ± 1.632
TyrAsp: 1.095 ± 1.011
TyrGlu: 2.191 ± 1.029
TyrPhe: 0.0 ± 0.0
TyrGly: 1.095 ± 0.808
TyrHis: 0.0 ± 0.0
TyrIle: 0.0 ± 0.0
TyrLys: 2.191 ± 1.029
TyrLeu: 1.095 ± 0.871
TyrMet: 1.095 ± 0.808
TyrAsn: 0.0 ± 0.0
TyrPro: 0.0 ± 0.0
TyrGln: 3.286 ± 0.136
TyrArg: 3.286 ± 1.59
TyrSer: 2.191 ± 0.907
TyrThr: 0.0 ± 0.0
TyrVal: 2.191 ± 0.907
TyrTrp: 1.095 ± 1.011
TyrTyr: 0.0 ± 0.0
TyrXaa: 0.0 ± 0.0
Xaa
XaaAla: 0.0 ± 0.0
XaaCys: 0.0 ± 0.0
XaaAsp: 0.0 ± 0.0
XaaGlu: 0.0 ± 0.0
XaaPhe: 0.0 ± 0.0
XaaGly: 0.0 ± 0.0
XaaHis: 0.0 ± 0.0
XaaIle: 0.0 ± 0.0
XaaLys: 0.0 ± 0.0
XaaLeu: 0.0 ± 0.0
XaaMet: 0.0 ± 0.0
XaaAsn: 0.0 ± 0.0
XaaPro: 0.0 ± 0.0
XaaGln: 0.0 ± 0.0
XaaArg: 0.0 ± 0.0
XaaSer: 0.0 ± 0.0
XaaThr: 0.0 ± 0.0
XaaVal: 0.0 ± 0.0
XaaTrp: 0.0 ± 0.0
XaaTyr: 0.0 ± 0.0
XaaXaa: 0.0 ± 0.0
Statistics based on 3 proteins (914 amino acids)

Note: The error has been estimated by bootstrapping (×100) at the protein level.
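One way such an error estimate could be obtained is sketched below in Python: proteins are resampled with replacement, the dipeptide frequency is recomputed for each resample, and the standard deviation across resamples is reported. This is only an assumed reading of "bootstrapping at the protein level"; the function names, the fixed seed, and the use of the population standard deviation are choices made for this example, not details confirmed by the page.

```python
import random
import statistics
from collections import Counter


def dipeptide_per_mille(sequences, pair):
    """Frequency of one dipeptide (per mille) across the given sequences."""
    counts = Counter()
    for seq in sequences:
        for i in range(len(seq) - 1):
            counts[seq[i:i + 2]] += 1
    total = sum(counts.values())
    return 1000.0 * counts[pair] / total if total else 0.0


def bootstrap_error(sequences, pair, n_resamples=100, seed=0):
    """Spread of the frequency over protein-level bootstrap resamples.

    Each resample draws len(sequences) whole proteins with replacement.
    Assumption: the error is the standard deviation across resamples.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    estimates = []
    for _ in range(n_resamples):
        sample = [rng.choice(sequences) for _ in sequences]
        estimates.append(dipeptide_per_mille(sample, pair))
    return statistics.pstdev(estimates)


if __name__ == "__main__":
    toy = ["MGGRRAL", "GGSTP", "ARRGG"]  # hypothetical proteins
    print(dipeptide_per_mille(toy, "GG"), bootstrap_error(toy, "GG"))
```

With only 3 proteins, each resample is drawn from a very small pool, which may help explain why many of the reported errors are large relative to the values themselves.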

The dipeptide statistics above (among other statistics for this proteome) can be downloaded from this CSV file
See this proteome in: uniprot_link
Proteome-pI is available under the Creative Commons Attribution-NoDerivs license; for more details see here

Reference: Kozlowski LP. Proteome-pI 2.0: Proteome Isoelectric Point Database Update. Nucleic Acids Res. 2021, doi: 10.1093/nar/gkab944
Contact: Lukasz P. Kozlowski