Amino acid dipeptide frequency for Human feces-associated smacovirus

Apart from single amino acid frequencies, one can also calculate the so-called amino acid dipeptide frequency.
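As an illustration, a dipeptide frequency can be sketched as a count of overlapping residue pairs, normalized by the total number of pairs. The page does not state how Proteome-pI counts dipeptides, so the overlapping-window scheme, the one-letter input alphabet, and the function name dipeptide_permille below are assumptions made for this example only.

```python
from collections import Counter

# The 20 standard amino acids in one-letter code.
STANDARD = set("ACDEFGHIKLMNPQRSTVWY")


def dipeptide_permille(sequences):
    """Return {dipeptide: frequency in per mille} over overlapping residue pairs.

    Assumption: pairs are counted with a sliding window of width 2 inside
    each protein, and never across the boundary between two proteins.
    """
    counts = Counter()
    for seq in sequences:
        # Map anything outside the 20 standard residues to X (Xaa).
        cleaned = "".join(aa if aa in STANDARD else "X" for aa in seq.upper())
        counts.update(cleaned[i:i + 2] for i in range(len(cleaned) - 1))
    total = sum(counts.values())
    if total == 0:
        return {}
    return {pair: 1000.0 * n / total for pair, n in counts.items()}


# Two short made-up sequences, not the smacovirus proteins.
freqs = dipeptide_permille(["MKTAYIAKQR", "MAAS"])
```

Here the two toy sequences contribute 9 and 3 pairs, so each distinct pair has a frequency of 1000/12 per mille.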

There are 400 possible dipeptides (441 if Xaa is counted as an additional 21st amino acid representing all non-standard or unknown amino acids). Thus, if dipeptides occurred randomly and with equal probability in proteins, each of the 400 dipeptides would be present at a frequency of 0.25% (2.5‰). As this is not the case in nature, for better visibility all dipeptides that occur more often than expected are marked in red in the table, and those that are underrepresented are marked in blue.
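The comparison against the uniform expectation can be sketched as follows. The page does not say whether the coloring uses the 400-dipeptide or the 441-dipeptide baseline, nor whether any tolerance is applied, so the default of 20 residues and the helper names expected_permille and classify are assumptions for illustration.

```python
def expected_permille(alphabet_size=20):
    """Uniform expectation for one dipeptide, in per mille."""
    return 1000.0 / alphabet_size ** 2


def classify(observed_permille, alphabet_size=20):
    """Label a dipeptide relative to the uniform expectation."""
    expected = expected_permille(alphabet_size)
    if observed_permille > expected:
        return "over"      # would be marked red
    if observed_permille < expected:
        return "under"     # would be marked blue
    return "expected"


# Values taken from the table: AlaSer = 12.862, AlaGlu = 1.608 (per mille).
labels = (classify(12.862), classify(1.608))
```

With the 20-residue baseline the expectation is 2.5‰, so AlaSer counts as overrepresented and AlaGlu as underrepresented.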

All values are given in per mille (‰), so they need to be multiplied by 10^-3 to obtain fractions.
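As a worked conversion, taking the AlaAla value from the table and the uniform expectation mentioned earlier:

```latex
\[
4.823\,\text{\textperthousand} = 4.823 \times 10^{-3} = 0.004823 = 0.4823\,\%
\]
\[
0.25\,\% = 2.5 \times 10^{-3} = 2.5\,\text{\textperthousand}
\]
```

A table value above 2.5 therefore lies above the uniform expectation for 400 dipeptides, and a value below 2.5 lies under it.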

For more information, see the sequence space article on Wikipedia.

Ala Cys Asp Glu Phe Gly His Ile Lys Leu Met Asn Pro Gln Arg Ser Thr Val Trp Tyr Xaa
Ala
AlaAla: 4.823 ± 0.593
AlaCys: 0.0 ± 0.0
AlaAsp: 4.823 ± 0.593
AlaGlu: 1.608 ± 1.315
AlaPhe: 0.0 ± 0.0
AlaGly: 3.215 ± 1.907
AlaHis: 1.608 ± 1.315
AlaIle: 4.823 ± 1.676
AlaLys: 0.0 ± 0.0
AlaLeu: 6.431 ± 2.99
AlaMet: 1.608 ± 1.315
AlaAsn: 1.608 ± 0.954
AlaPro: 3.215 ± 1.907
AlaGln: 3.215 ± 0.361
AlaArg: 0.0 ± 0.0
AlaSer: 12.862 ± 5.361
AlaThr: 8.039 ± 2.5
AlaVal: 1.608 ± 0.954
AlaTrp: 3.215 ± 0.361
AlaTyr: 1.608 ± 1.315
AlaXaa: 0.0 ± 0.0
Cys
CysAla: 0.0 ± 0.0
CysCys: 0.0 ± 0.0
CysAsp: 0.0 ± 0.0
CysGlu: 0.0 ± 0.0
CysPhe: 0.0 ± 0.0
CysGly: 0.0 ± 0.0
CysHis: 0.0 ± 0.0
CysIle: 1.608 ± 1.315
CysLys: 3.215 ± 2.629
CysLeu: 0.0 ± 0.0
CysMet: 0.0 ± 0.0
CysAsn: 0.0 ± 0.0
CysPro: 0.0 ± 0.0
CysGln: 0.0 ± 0.0
CysArg: 0.0 ± 0.0
CysSer: 1.608 ± 0.954
CysThr: 0.0 ± 0.0
CysVal: 1.608 ± 1.315
CysTrp: 0.0 ± 0.0
CysTyr: 1.608 ± 1.315
CysXaa: 0.0 ± 0.0
Asp
AspAla: 1.608 ± 0.954
AspCys: 1.608 ± 1.315
AspAsp: 0.0 ± 0.0
AspGlu: 4.823 ± 1.676
AspPhe: 0.0 ± 0.0
AspGly: 8.039 ± 2.5
AspHis: 0.0 ± 0.0
AspIle: 1.608 ± 1.315
AspLys: 1.608 ± 1.315
AspLeu: 1.608 ± 0.954
AspMet: 3.215 ± 1.907
AspAsn: 1.608 ± 0.954
AspPro: 3.215 ± 1.907
AspGln: 1.608 ± 0.954
AspArg: 8.039 ± 4.305
AspSer: 4.823 ± 0.593
AspThr: 3.215 ± 0.361
AspVal: 3.215 ± 1.907
AspTrp: 0.0 ± 0.0
AspTyr: 0.0 ± 0.0
AspXaa: 0.0 ± 0.0
Glu
GluAla: 4.823 ± 1.676
GluCys: 0.0 ± 0.0
GluAsp: 1.608 ± 0.954
GluGlu: 1.608 ± 1.315
GluPhe: 1.608 ± 0.954
GluGly: 3.215 ± 0.361
GluHis: 0.0 ± 0.0
GluIle: 6.431 ± 2.99
GluLys: 3.215 ± 2.629
GluLeu: 1.608 ± 0.954
GluMet: 1.608 ± 0.954
GluAsn: 3.215 ± 1.907
GluPro: 0.0 ± 0.0
GluGln: 1.608 ± 1.315
GluArg: 1.608 ± 1.315
GluSer: 11.254 ± 4.666
GluThr: 3.215 ± 0.361
GluVal: 6.431 ± 2.99
GluTrp: 0.0 ± 0.0
GluTyr: 0.0 ± 0.0
GluXaa: 0.0 ± 0.0
Phe
PheAla: 3.215 ± 1.907
PheCys: 0.0 ± 0.0
PheAsp: 4.823 ± 2.861
PheGlu: 1.608 ± 0.954
PhePhe: 3.215 ± 0.361
PheGly: 3.215 ± 0.361
PheHis: 1.608 ± 0.954
PheIle: 0.0 ± 0.0
PheLys: 4.823 ± 0.593
PheLeu: 1.608 ± 0.954
PheMet: 4.823 ± 1.727
PheAsn: 0.0 ± 0.0
PhePro: 0.0 ± 0.0
PheGln: 0.0 ± 0.0
PheArg: 4.823 ± 2.861
PheSer: 0.0 ± 0.0
PheThr: 1.608 ± 0.954
PheVal: 1.608 ± 0.954
PheTrp: 0.0 ± 0.0
PheTyr: 0.0 ± 0.0
PheXaa: 0.0 ± 0.0
Gly
GlyAla: 3.215 ± 0.361
GlyCys: 1.608 ± 0.954
GlyAsp: 1.608 ± 0.954
GlyGlu: 3.215 ± 1.907
GlyPhe: 8.039 ± 4.768
GlyGly: 3.215 ± 2.629
GlyHis: 1.608 ± 0.954
GlyIle: 4.823 ± 2.861
GlyLys: 4.823 ± 3.944
GlyLeu: 4.823 ± 1.676
GlyMet: 0.0 ± 0.0
GlyAsn: 4.823 ± 0.593
GlyPro: 1.608 ± 1.315
GlyGln: 4.823 ± 1.676
GlyArg: 1.608 ± 0.954
GlySer: 4.823 ± 2.861
GlyThr: 6.431 ± 3.815
GlyVal: 4.823 ± 1.676
GlyTrp: 4.823 ± 1.676
GlyTyr: 3.215 ± 0.361
GlyXaa: 0.0 ± 0.0
His
HisAla: 1.608 ± 1.315
HisCys: 0.0 ± 0.0
HisAsp: 0.0 ± 0.0
HisGlu: 0.0 ± 0.0
HisPhe: 0.0 ± 0.0
HisGly: 3.215 ± 1.907
HisHis: 0.0 ± 0.0
HisIle: 3.215 ± 0.361
HisLys: 0.0 ± 0.0
HisLeu: 0.0 ± 0.0
HisMet: 0.0 ± 0.0
HisAsn: 0.0 ± 0.0
HisPro: 0.0 ± 0.0
HisGln: 0.0 ± 0.0
HisArg: 0.0 ± 0.0
HisSer: 0.0 ± 0.0
HisThr: 1.608 ± 0.954
HisVal: 1.608 ± 1.315
HisTrp: 1.608 ± 1.315
HisTyr: 1.608 ± 0.954
HisXaa: 0.0 ± 0.0
Ile
IleAla: 0.0 ± 0.0
IleCys: 1.608 ± 1.315
IleAsp: 6.431 ± 0.722
IleGlu: 4.823 ± 3.944
IlePhe: 3.215 ± 1.907
IleGly: 4.823 ± 1.676
IleHis: 1.608 ± 0.954
IleIle: 3.215 ± 2.629
IleLys: 1.608 ± 1.315
IleLeu: 9.646 ± 3.454
IleMet: 3.215 ± 2.629
IleAsn: 1.608 ± 1.315
IlePro: 3.215 ± 2.629
IleGln: 1.608 ± 0.954
IleArg: 4.823 ± 3.944
IleSer: 3.215 ± 0.361
IleThr: 3.215 ± 0.361
IleVal: 1.608 ± 0.954
IleTrp: 0.0 ± 0.0
IleTyr: 1.608 ± 0.954
IleXaa: 0.0 ± 0.0
Lys
LysAla: 6.431 ± 2.99
LysCys: 0.0 ± 0.0
LysAsp: 1.608 ± 1.315
LysGlu: 1.608 ± 1.315
LysPhe: 3.215 ± 1.907
LysGly: 8.039 ± 2.5
LysHis: 1.608 ± 1.315
LysIle: 1.608 ± 0.954
LysLys: 3.215 ± 2.629
LysLeu: 4.823 ± 1.676
LysMet: 1.608 ± 1.315
LysAsn: 3.215 ± 2.629
LysPro: 1.608 ± 1.315
LysGln: 4.823 ± 1.676
LysArg: 0.0 ± 0.0
LysSer: 1.608 ± 1.315
LysThr: 1.608 ± 0.954
LysVal: 1.608 ± 1.315
LysTrp: 6.431 ± 2.99
LysTyr: 0.0 ± 0.0
LysXaa: 0.0 ± 0.0
Leu
LeuAla: 4.823 ± 0.593
LeuCys: 0.0 ± 0.0
LeuAsp: 6.431 ± 1.546
LeuGlu: 4.823 ± 1.676
LeuPhe: 1.608 ± 0.954
LeuGly: 1.608 ± 0.954
LeuHis: 0.0 ± 0.0
LeuIle: 0.0 ± 0.0
LeuLys: 1.608 ± 1.315
LeuLeu: 1.608 ± 0.954
LeuMet: 0.0 ± 0.0
LeuAsn: 1.608 ± 0.954
LeuPro: 12.862 ± 7.629
LeuGln: 4.823 ± 2.861
LeuArg: 0.0 ± 0.0
LeuSer: 4.823 ± 1.676
LeuThr: 6.431 ± 0.722
LeuVal: 6.431 ± 2.99
LeuTrp: 1.608 ± 1.315
LeuTyr: 3.215 ± 0.361
LeuXaa: 0.0 ± 0.0
Met
MetAla: 3.215 ± 1.907
MetCys: 0.0 ± 0.0
MetAsp: 0.0 ± 0.0
MetGlu: 0.0 ± 0.0
MetPhe: 0.0 ± 0.0
MetGly: 1.608 ± 0.954
MetHis: 0.0 ± 0.0
MetIle: 3.215 ± 2.629
MetLys: 0.0 ± 0.0
MetLeu: 3.215 ± 0.361
MetMet: 0.0 ± 0.0
MetAsn: 1.608 ± 0.954
MetPro: 4.823 ± 2.861
MetGln: 0.0 ± 0.0
MetArg: 3.215 ± 0.361
MetSer: 3.215 ± 1.907
MetThr: 0.0 ± 0.0
MetVal: 3.215 ± 2.629
MetTrp: 0.0 ± 0.0
MetTyr: 0.0 ± 0.0
MetXaa: 0.0 ± 0.0
Asn
AsnAla: 3.215 ± 1.907
AsnCys: 0.0 ± 0.0
AsnAsp: 3.215 ± 2.629
AsnGlu: 0.0 ± 0.0
AsnPhe: 1.608 ± 1.315
AsnGly: 4.823 ± 0.593
AsnHis: 0.0 ± 0.0
AsnIle: 3.215 ± 0.361
AsnLys: 0.0 ± 0.0
AsnLeu: 0.0 ± 0.0
AsnMet: 0.0 ± 0.0
AsnAsn: 4.823 ± 2.861
AsnPro: 4.823 ± 2.861
AsnGln: 3.215 ± 0.361
AsnArg: 1.608 ± 0.954
AsnSer: 1.608 ± 0.954
AsnThr: 6.431 ± 0.722
AsnVal: 6.431 ± 1.546
AsnTrp: 0.0 ± 0.0
AsnTyr: 3.215 ± 1.907
AsnXaa: 0.0 ± 0.0
Pro
ProAla: 4.823 ± 2.861
ProCys: 0.0 ± 0.0
ProAsp: 1.608 ± 0.954
ProGlu: 1.608 ± 0.954
ProPhe: 1.608 ± 0.954
ProGly: 1.608 ± 0.954
ProHis: 0.0 ± 0.0
ProIle: 4.823 ± 2.861
ProLys: 3.215 ± 0.361
ProLeu: 4.823 ± 2.861
ProMet: 0.0 ± 0.0
ProAsn: 6.431 ± 1.546
ProPro: 4.823 ± 0.593
ProGln: 0.0 ± 0.0
ProArg: 8.039 ± 0.232
ProSer: 3.215 ± 0.361
ProThr: 6.431 ± 1.546
ProVal: 3.215 ± 1.907
ProTrp: 0.0 ± 0.0
ProTyr: 1.608 ± 1.315
ProXaa: 0.0 ± 0.0
Gln
GlnAla: 3.215 ± 0.361
GlnCys: 1.608 ± 1.315
GlnAsp: 0.0 ± 0.0
GlnGlu: 4.823 ± 0.593
GlnPhe: 1.608 ± 0.954
GlnGly: 1.608 ± 0.954
GlnHis: 1.608 ± 1.315
GlnIle: 6.431 ± 2.99
GlnLys: 1.608 ± 0.954
GlnLeu: 1.608 ± 1.315
GlnMet: 0.0 ± 0.0
GlnAsn: 1.608 ± 0.954
GlnPro: 0.0 ± 0.0
GlnGln: 1.608 ± 0.954
GlnArg: 1.608 ± 1.315
GlnSer: 3.215 ± 0.361
GlnThr: 1.608 ± 0.954
GlnVal: 4.823 ± 2.861
GlnTrp: 1.608 ± 1.315
GlnTyr: 4.823 ± 1.676
GlnXaa: 0.0 ± 0.0
Arg
ArgAla: 1.608 ± 1.315
ArgCys: 0.0 ± 0.0
ArgAsp: 1.608 ± 0.954
ArgGlu: 1.608 ± 1.315
ArgPhe: 3.215 ± 0.361
ArgGly: 4.823 ± 1.676
ArgHis: 0.0 ± 0.0
ArgIle: 4.823 ± 1.676
ArgLys: 3.215 ± 1.907
ArgLeu: 1.608 ± 0.954
ArgMet: 0.0 ± 0.0
ArgAsn: 0.0 ± 0.0
ArgPro: 3.215 ± 2.629
ArgGln: 0.0 ± 0.0
ArgArg: 3.215 ± 0.361
ArgSer: 3.215 ± 2.629
ArgThr: 3.215 ± 1.907
ArgVal: 8.039 ± 0.232
ArgTrp: 1.608 ± 1.315
ArgTyr: 4.823 ± 1.676
ArgXaa: 0.0 ± 0.0
Ser
SerAla: 6.431 ± 0.722
SerCys: 1.608 ± 1.315
SerAsp: 3.215 ± 1.907
SerGlu: 4.823 ± 1.676
SerPhe: 0.0 ± 0.0
SerGly: 8.039 ± 0.232
SerHis: 0.0 ± 0.0
SerIle: 3.215 ± 2.629
SerLys: 3.215 ± 2.629
SerLeu: 3.215 ± 1.907
SerMet: 1.608 ± 0.954
SerAsn: 3.215 ± 0.361
SerPro: 3.215 ± 1.907
SerGln: 0.0 ± 0.0
SerArg: 1.608 ± 1.315
SerSer: 4.823 ± 0.593
SerThr: 8.039 ± 4.768
SerVal: 4.823 ± 0.593
SerTrp: 3.215 ± 2.629
SerTyr: 4.823 ± 2.861
SerXaa: 0.0 ± 0.0
Thr
ThrAla: 4.823 ± 2.861
ThrCys: 0.0 ± 0.0
ThrAsp: 3.215 ± 0.361
ThrGlu: 0.0 ± 0.0
ThrPhe: 3.215 ± 1.907
ThrGly: 8.039 ± 0.232
ThrHis: 3.215 ± 1.907
ThrIle: 3.215 ± 0.361
ThrLys: 3.215 ± 2.629
ThrLeu: 6.431 ± 3.815
ThrMet: 3.215 ± 1.907
ThrAsn: 8.039 ± 2.037
ThrPro: 6.431 ± 3.815
ThrGln: 3.215 ± 0.361
ThrArg: 1.608 ± 1.315
ThrSer: 3.215 ± 1.907
ThrThr: 1.608 ± 0.954
ThrVal: 6.431 ± 1.546
ThrTrp: 1.608 ± 0.954
ThrTyr: 3.215 ± 1.907
ThrXaa: 0.0 ± 0.0
Val
ValAla: 3.215 ± 0.361
ValCys: 0.0 ± 0.0
ValAsp: 4.823 ± 1.676
ValGlu: 9.646 ± 3.352
ValPhe: 3.215 ± 0.361
ValGly: 4.823 ± 1.676
ValHis: 0.0 ± 0.0
ValIle: 3.215 ± 1.907
ValLys: 8.039 ± 0.232
ValLeu: 6.431 ± 2.99
ValMet: 1.608 ± 0.954
ValAsn: 3.215 ± 1.907
ValPro: 3.215 ± 0.361
ValGln: 4.823 ± 1.676
ValArg: 3.215 ± 1.907
ValSer: 0.0 ± 0.0
ValThr: 6.431 ± 3.815
ValVal: 3.215 ± 1.907
ValTrp: 3.215 ± 0.361
ValTyr: 1.608 ± 1.315
ValXaa: 0.0 ± 0.0
Trp
TrpAla: 0.0 ± 0.0
TrpCys: 1.608 ± 1.315
TrpAsp: 0.0 ± 0.0
TrpGlu: 1.608 ± 1.315
TrpPhe: 0.0 ± 0.0
TrpGly: 0.0 ± 0.0
TrpHis: 0.0 ± 0.0
TrpIle: 0.0 ± 0.0
TrpLys: 3.215 ± 0.361
TrpLeu: 3.215 ± 0.361
TrpMet: 3.215 ± 0.361
TrpAsn: 1.608 ± 0.954
TrpPro: 0.0 ± 0.0
TrpGln: 3.215 ± 2.629
TrpArg: 4.823 ± 1.676
TrpSer: 1.608 ± 1.315
TrpThr: 1.608 ± 1.315
TrpVal: 1.608 ± 1.315
TrpTrp: 0.0 ± 0.0
TrpTyr: 1.608 ± 1.315
TrpXaa: 0.0 ± 0.0
Tyr
TyrAla: 3.215 ± 0.361
TyrCys: 0.0 ± 0.0
TyrAsp: 4.823 ± 1.676
TyrGlu: 4.823 ± 0.593
TyrPhe: 3.215 ± 1.907
TyrGly: 0.0 ± 0.0
TyrHis: 1.608 ± 1.315
TyrIle: 1.608 ± 1.315
TyrLys: 4.823 ± 0.593
TyrLeu: 1.608 ± 0.954
TyrMet: 0.0 ± 0.0
TyrAsn: 0.0 ± 0.0
TyrPro: 1.608 ± 0.954
TyrGln: 6.431 ± 1.546
TyrArg: 0.0 ± 0.0
TyrSer: 0.0 ± 0.0
TyrThr: 3.215 ± 2.629
TyrVal: 1.608 ± 1.315
TyrTrp: 0.0 ± 0.0
TyrTyr: 4.823 ± 0.593
TyrXaa: 0.0 ± 0.0
Xaa
XaaAla: 0.0 ± 0.0
XaaCys: 0.0 ± 0.0
XaaAsp: 0.0 ± 0.0
XaaGlu: 0.0 ± 0.0
XaaPhe: 0.0 ± 0.0
XaaGly: 0.0 ± 0.0
XaaHis: 0.0 ± 0.0
XaaIle: 0.0 ± 0.0
XaaLys: 0.0 ± 0.0
XaaLeu: 0.0 ± 0.0
XaaMet: 0.0 ± 0.0
XaaAsn: 0.0 ± 0.0
XaaPro: 0.0 ± 0.0
XaaGln: 0.0 ± 0.0
XaaArg: 0.0 ± 0.0
XaaSer: 0.0 ± 0.0
XaaThr: 0.0 ± 0.0
XaaVal: 0.0 ± 0.0
XaaTrp: 0.0 ± 0.0
XaaTyr: 0.0 ± 0.0
XaaXaa: 0.0 ± 0.0
Statistics based on 2 proteins (623 amino acids)

Note: The error was estimated by bootstrapping (100 resamples) at the protein level.
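A protein-level bootstrap of this kind might look like the sketch below: whole proteins are resampled with replacement, the dipeptide frequency is recomputed for each resample, and the spread of the resulting estimates is reported. The page does not specify the exact procedure or which spread statistic is used, so the population standard deviation, the fixed seed, and the names pair_counts and bootstrap_error are assumptions for illustration.

```python
import random
import statistics
from collections import Counter


def pair_counts(seq):
    """Count overlapping residue pairs within a single protein sequence."""
    return Counter(seq[i:i + 2] for i in range(len(seq) - 1))


def bootstrap_error(sequences, pair, n_resamples=100, seed=0):
    """Spread (in per mille) of one dipeptide's frequency across resamples.

    Assumption: proteins, not residues, are the resampling unit.
    """
    rng = random.Random(seed)
    per_protein = [pair_counts(s) for s in sequences]
    estimates = []
    for _ in range(n_resamples):
        # Draw as many proteins as the proteome has, with replacement.
        sample = rng.choices(per_protein, k=len(per_protein))
        total = sum(sum(c.values()) for c in sample)
        hits = sum(c[pair] for c in sample)
        estimates.append(1000.0 * hits / total if total else 0.0)
    return statistics.pstdev(estimates)


# Made-up sequences: "AA" occurs only in the second one.
err = bootstrap_error(["MKTAYIAKQR", "MAAS"], "AA")
```

With only two proteins, each resample can contain just three distinct combinations of proteins, which may help explain why many errors in the table are large relative to the values themselves.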

The dipeptide statistics above (among other statistics for this proteome) can be downloaded from this CSV file
See this proteome in: uniprot_link
Proteome-pI is available under Creative Commons Attribution-NoDerivs license, for more details see here

Reference: Kozlowski LP. Proteome-pI 2.0: Proteome Isoelectric Point Database Update. Nucleic Acids Res. 2021, doi: 10.1093/nar/gkab944
Contact: Lukasz P. Kozlowski