Amino acid dipeptide frequency for Sewage-associated circular DNA virus-23

Apart from single amino acid frequencies, one can also calculate the so-called amino acid dipeptide frequency, i.e. how often each ordered pair of adjacent amino acids occurs.
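As an illustration of the idea, here is a minimal Python sketch of how dipeptide frequencies could be counted. It assumes one-letter amino acid codes, overlapping pairs, and pooling of counts over all proteins; the exact counting convention used for this table is not stated on the page. The function name and the toy sequences are hypothetical.

```python
from collections import Counter


def dipeptide_frequencies(proteins):
    """Return dipeptide frequencies in per mille, pooled over all proteins."""
    counts = Counter()
    for seq in proteins:
        # Overlapping adjacent pairs; a pair never spans two proteins.
        for i in range(len(seq) - 1):
            counts[seq[i:i + 2]] += 1
    total = sum(counts.values())
    if total == 0:
        return {}
    return {pair: 1000.0 * n / total for pair, n in counts.items()}


# Hypothetical toy sequences, not the proteins of this virus.
toy = ["MKRKA", "SGA"]
freqs = dipeptide_frequencies(toy)
for pair, value in sorted(freqs.items()):
    print(f"{pair}: {value:.3f}")
```

Because the pairs are ordered, "KR" and "RK" are counted separately, just as ArgLys and LysArg are separate cells in the table.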

There are 400 possibilities (441 if we consider Xaa as an additional 21st amino acid representing all non-standard or unknown amino acids). Thus, if dipeptides occurred randomly and with equal probability in proteins, each dipeptide would be present with a frequency of 0.25% (2.5‰). As this is not the case in nature, for better visibility all dipeptides that are more frequent than expected are marked in red, and those that are underrepresented are marked in blue in the table.
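The expected value and the red/blue marking can be sketched in a few lines of Python. This is only an illustration: the function names are hypothetical, and it assumes the 20-letter baseline (2.5‰) as the threshold, since the page does not say whether the table's colouring uses 400 or 441 possibilities.

```python
def expected_uniform_permille(alphabet_size=20):
    """Expected frequency (per mille) of each dipeptide if all were equally likely."""
    return 1000.0 / alphabet_size ** 2


def classify(freq_permille, alphabet_size=20):
    """Label a frequency relative to the uniform expectation (assumed threshold)."""
    expected = expected_uniform_permille(alphabet_size)
    if freq_permille > expected:
        return "over"      # would be marked red
    if freq_permille < expected:
        return "under"     # would be marked blue
    return "expected"


print(expected_uniform_permille(20))  # 400 possibilities
print(expected_uniform_permille(21))  # 441 possibilities, with Xaa
print(classify(9.885))                # AlaAla value from the table
print(classify(1.647))                # AlaAsp value from the table
```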

All values are given in per mille (‰), so they need to be multiplied by 10^-3 to obtain fractions.

For more information, see the sequence space article on Wikipedia.

Ala Cys Asp Glu Phe Gly His Ile Lys Leu Met Asn Pro Gln Arg Ser Thr Val Trp Tyr Xaa
Ala
AlaAla: 9.885 ± 2.76
AlaCys: 0.0 ± 0.0
AlaAsp: 1.647 ± 1.297
AlaGlu: 3.295 ± 2.429
AlaPhe: 0.0 ± 0.0
AlaGly: 6.59 ± 2.346
AlaHis: 0.0 ± 0.0
AlaIle: 3.295 ± 0.083
AlaLys: 4.942 ± 1.38
AlaLeu: 3.295 ± 0.083
AlaMet: 1.647 ± 1.214
AlaAsn: 4.942 ± 1.132
AlaPro: 3.295 ± 2.429
AlaGln: 1.647 ± 1.297
AlaArg: 3.295 ± 0.083
AlaSer: 6.59 ± 0.166
AlaThr: 4.942 ± 3.643
AlaVal: 6.59 ± 0.166
AlaTrp: 0.0 ± 0.0
AlaTyr: 3.295 ± 0.083
AlaXaa: 0.0 ± 0.0
Cys
CysAla: 0.0 ± 0.0
CysCys: 0.0 ± 0.0
CysAsp: 0.0 ± 0.0
CysGlu: 0.0 ± 0.0
CysPhe: 0.0 ± 0.0
CysGly: 3.295 ± 0.083
CysHis: 1.647 ± 1.214
CysIle: 0.0 ± 0.0
CysLys: 1.647 ± 1.214
CysLeu: 0.0 ± 0.0
CysMet: 1.647 ± 1.297
CysAsn: 1.647 ± 1.214
CysPro: 0.0 ± 0.0
CysGln: 0.0 ± 0.0
CysArg: 3.295 ± 2.594
CysSer: 1.647 ± 1.297
CysThr: 1.647 ± 1.297
CysVal: 1.647 ± 1.214
CysTrp: 0.0 ± 0.0
CysTyr: 0.0 ± 0.0
CysXaa: 0.0 ± 0.0
Asp
AspAla: 1.647 ± 1.214
AspCys: 0.0 ± 0.0
AspAsp: 4.942 ± 3.891
AspGlu: 3.295 ± 2.594
AspPhe: 4.942 ± 1.38
AspGly: 1.647 ± 1.214
AspHis: 1.647 ± 1.297
AspIle: 1.647 ± 1.214
AspLys: 1.647 ± 1.214
AspLeu: 1.647 ± 1.297
AspMet: 0.0 ± 0.0
AspAsn: 0.0 ± 0.0
AspPro: 1.647 ± 1.297
AspGln: 0.0 ± 0.0
AspArg: 3.295 ± 0.083
AspSer: 4.942 ± 1.38
AspThr: 3.295 ± 0.083
AspVal: 0.0 ± 0.0
AspTrp: 1.647 ± 1.297
AspTyr: 3.295 ± 2.594
AspXaa: 0.0 ± 0.0
Glu
GluAla: 4.942 ± 1.38
GluCys: 0.0 ± 0.0
GluAsp: 0.0 ± 0.0
GluGlu: 1.647 ± 1.297
GluPhe: 3.295 ± 2.594
GluGly: 3.295 ± 2.594
GluHis: 0.0 ± 0.0
GluIle: 9.885 ± 0.248
GluLys: 1.647 ± 1.214
GluLeu: 6.59 ± 5.189
GluMet: 0.0 ± 0.0
GluAsn: 1.647 ± 1.297
GluPro: 1.647 ± 1.214
GluGln: 3.295 ± 0.083
GluArg: 0.0 ± 0.0
GluSer: 1.647 ± 1.214
GluThr: 0.0 ± 0.0
GluVal: 4.942 ± 3.891
GluTrp: 1.647 ± 1.297
GluTyr: 0.0 ± 0.0
GluXaa: 0.0 ± 0.0
Phe
PheAla: 1.647 ± 1.214
PheCys: 0.0 ± 0.0
PheAsp: 3.295 ± 2.594
PheGlu: 1.647 ± 1.297
PhePhe: 0.0 ± 0.0
PheGly: 1.647 ± 1.297
PheHis: 1.647 ± 1.297
PheIle: 1.647 ± 1.297
PheLys: 1.647 ± 1.297
PheLeu: 0.0 ± 0.0
PheMet: 1.647 ± 1.214
PheAsn: 3.295 ± 0.083
PhePro: 0.0 ± 0.0
PheGln: 0.0 ± 0.0
PheArg: 0.0 ± 0.0
PheSer: 0.0 ± 0.0
PheThr: 1.647 ± 1.297
PheVal: 3.295 ± 0.083
PheTrp: 0.0 ± 0.0
PheTyr: 1.647 ± 1.297
PheXaa: 0.0 ± 0.0
Gly
GlyAla: 9.885 ± 4.775
GlyCys: 0.0 ± 0.0
GlyAsp: 0.0 ± 0.0
GlyGlu: 4.942 ± 3.643
GlyPhe: 4.942 ± 1.132
GlyGly: 3.295 ± 0.083
GlyHis: 0.0 ± 0.0
GlyIle: 6.59 ± 0.166
GlyLys: 6.59 ± 0.166
GlyLeu: 3.295 ± 2.429
GlyMet: 1.647 ± 1.297
GlyAsn: 3.295 ± 0.083
GlyPro: 4.942 ± 3.891
GlyGln: 3.295 ± 2.594
GlyArg: 6.59 ± 2.346
GlySer: 8.237 ± 3.56
GlyThr: 4.942 ± 1.38
GlyVal: 3.295 ± 0.083
GlyTrp: 1.647 ± 1.297
GlyTyr: 3.295 ± 2.594
GlyXaa: 0.0 ± 0.0
His
HisAla: 0.0 ± 0.0
HisCys: 0.0 ± 0.0
HisAsp: 0.0 ± 0.0
HisGlu: 0.0 ± 0.0
HisPhe: 0.0 ± 0.0
HisGly: 3.295 ± 0.083
HisHis: 0.0 ± 0.0
HisIle: 4.942 ± 1.38
HisLys: 0.0 ± 0.0
HisLeu: 0.0 ± 0.0
HisMet: 1.647 ± 0.799
HisAsn: 1.647 ± 1.214
HisPro: 0.0 ± 0.0
HisGln: 1.647 ± 1.297
HisArg: 3.295 ± 2.594
HisSer: 1.647 ± 1.297
HisThr: 0.0 ± 0.0
HisVal: 1.647 ± 1.297
HisTrp: 0.0 ± 0.0
HisTyr: 0.0 ± 0.0
HisXaa: 0.0 ± 0.0
Ile
IleAla: 3.295 ± 0.083
IleCys: 0.0 ± 0.0
IleAsp: 3.295 ± 0.083
IleGlu: 0.0 ± 0.0
IlePhe: 0.0 ± 0.0
IleGly: 0.0 ± 0.0
IleHis: 3.295 ± 2.429
IleIle: 8.237 ± 3.974
IleLys: 6.59 ± 5.189
IleLeu: 6.59 ± 2.346
IleMet: 3.295 ± 2.594
IleAsn: 4.942 ± 1.38
IlePro: 3.295 ± 2.429
IleGln: 1.647 ± 1.297
IleArg: 4.942 ± 3.643
IleSer: 1.647 ± 1.214
IleThr: 4.942 ± 1.132
IleVal: 0.0 ± 0.0
IleTrp: 0.0 ± 0.0
IleTyr: 4.942 ± 3.643
IleXaa: 0.0 ± 0.0
Lys
LysAla: 8.237 ± 3.56
LysCys: 1.647 ± 1.214
LysAsp: 3.295 ± 0.083
LysGlu: 3.295 ± 2.594
LysPhe: 3.295 ± 2.594
LysGly: 4.942 ± 1.38
LysHis: 0.0 ± 0.0
LysIle: 6.59 ± 0.166
LysLys: 4.942 ± 3.643
LysLeu: 0.0 ± 0.0
LysMet: 1.647 ± 1.214
LysAsn: 3.295 ± 2.429
LysPro: 3.295 ± 0.083
LysGln: 3.295 ± 2.429
LysArg: 6.59 ± 2.346
LysSer: 3.295 ± 0.083
LysThr: 1.647 ± 1.297
LysVal: 6.59 ± 2.346
LysTrp: 3.295 ± 0.083
LysTyr: 3.295 ± 0.083
LysXaa: 0.0 ± 0.0
Leu
LeuAla: 4.942 ± 1.38
LeuCys: 1.647 ± 1.214
LeuAsp: 0.0 ± 0.0
LeuGlu: 1.647 ± 1.214
LeuPhe: 0.0 ± 0.0
LeuGly: 1.647 ± 1.214
LeuHis: 1.647 ± 1.297
LeuIle: 1.647 ± 1.214
LeuLys: 6.59 ± 2.346
LeuLeu: 6.59 ± 0.166
LeuMet: 0.0 ± 0.0
LeuAsn: 3.295 ± 2.429
LeuPro: 1.647 ± 1.214
LeuGln: 1.647 ± 1.214
LeuArg: 6.59 ± 2.677
LeuSer: 6.59 ± 2.346
LeuThr: 4.942 ± 3.643
LeuVal: 1.647 ± 1.297
LeuTrp: 0.0 ± 0.0
LeuTyr: 3.295 ± 0.083
LeuXaa: 0.0 ± 0.0
Met
MetAla: 1.647 ± 1.214
MetCys: 0.0 ± 0.0
MetAsp: 3.295 ± 0.083
MetGlu: 1.647 ± 1.297
MetPhe: 0.0 ± 0.0
MetGly: 3.295 ± 0.083
MetHis: 0.0 ± 0.0
MetIle: 0.0 ± 0.0
MetLys: 1.647 ± 1.214
MetLeu: 0.0 ± 0.0
MetMet: 0.0 ± 0.0
MetAsn: 1.647 ± 1.214
MetPro: 0.0 ± 0.0
MetGln: 0.0 ± 0.0
MetArg: 1.647 ± 1.214
MetSer: 0.0 ± 0.0
MetThr: 1.647 ± 1.297
MetVal: 4.942 ± 1.38
MetTrp: 0.0 ± 0.0
MetTyr: 1.647 ± 1.214
MetXaa: 0.0 ± 0.0
Asn
AsnAla: 3.295 ± 2.429
AsnCys: 0.0 ± 0.0
AsnAsp: 3.295 ± 2.594
AsnGlu: 1.647 ± 1.297
AsnPhe: 0.0 ± 0.0
AsnGly: 1.647 ± 1.297
AsnHis: 1.647 ± 1.214
AsnIle: 0.0 ± 0.0
AsnLys: 4.942 ± 1.132
AsnLeu: 6.59 ± 2.346
AsnMet: 3.295 ± 0.083
AsnAsn: 1.647 ± 1.297
AsnPro: 1.647 ± 1.297
AsnGln: 3.295 ± 2.429
AsnArg: 3.295 ± 0.083
AsnSer: 6.59 ± 2.346
AsnThr: 6.59 ± 2.346
AsnVal: 4.942 ± 3.643
AsnTrp: 0.0 ± 0.0
AsnTyr: 4.942 ± 1.38
AsnXaa: 0.0 ± 0.0
Pro
ProAla: 4.942 ± 1.132
ProCys: 1.647 ± 1.297
ProAsp: 0.0 ± 0.0
ProGlu: 1.647 ± 1.297
ProPhe: 1.647 ± 1.297
ProGly: 4.942 ± 1.132
ProHis: 3.295 ± 2.594
ProIle: 1.647 ± 1.297
ProLys: 1.647 ± 1.214
ProLeu: 1.647 ± 1.214
ProMet: 0.0 ± 0.0
ProAsn: 0.0 ± 0.0
ProPro: 3.295 ± 0.083
ProGln: 1.647 ± 1.297
ProArg: 3.295 ± 0.083
ProSer: 1.647 ± 1.297
ProThr: 3.295 ± 2.429
ProVal: 0.0 ± 0.0
ProTrp: 1.647 ± 1.297
ProTyr: 3.295 ± 2.429
ProXaa: 0.0 ± 0.0
Gln
GlnAla: 0.0 ± 0.0
GlnCys: 1.647 ± 1.297
GlnAsp: 1.647 ± 1.214
GlnGlu: 1.647 ± 1.297
GlnPhe: 0.0 ± 0.0
GlnGly: 3.295 ± 0.083
GlnHis: 3.295 ± 2.594
GlnIle: 1.647 ± 1.214
GlnLys: 1.647 ± 1.214
GlnLeu: 0.0 ± 0.0
GlnMet: 0.0 ± 0.0
GlnAsn: 4.942 ± 1.132
GlnPro: 1.647 ± 1.297
GlnGln: 3.295 ± 2.594
GlnArg: 1.647 ± 1.297
GlnSer: 3.295 ± 0.083
GlnThr: 3.295 ± 2.429
GlnVal: 3.295 ± 0.083
GlnTrp: 0.0 ± 0.0
GlnTyr: 0.0 ± 0.0
GlnXaa: 0.0 ± 0.0
Arg
ArgAla: 3.295 ± 2.594
ArgCys: 3.295 ± 0.083
ArgAsp: 1.647 ± 1.297
ArgGlu: 3.295 ± 2.594
ArgPhe: 0.0 ± 0.0
ArgGly: 6.59 ± 0.166
ArgHis: 1.647 ± 1.297
ArgIle: 0.0 ± 0.0
ArgLys: 11.532 ± 0.966
ArgLeu: 4.942 ± 3.643
ArgMet: 1.647 ± 1.214
ArgAsn: 6.59 ± 2.346
ArgPro: 3.295 ± 2.594
ArgGln: 0.0 ± 0.0
ArgArg: 6.59 ± 5.189
ArgSer: 1.647 ± 1.297
ArgThr: 9.885 ± 4.775
ArgVal: 3.295 ± 2.594
ArgTrp: 4.942 ± 1.38
ArgTyr: 6.59 ± 2.346
ArgXaa: 0.0 ± 0.0
Ser
SerAla: 1.647 ± 1.297
SerCys: 0.0 ± 0.0
SerAsp: 3.295 ± 2.429
SerGlu: 3.295 ± 2.594
SerPhe: 1.647 ± 1.297
SerGly: 11.532 ± 0.966
SerHis: 0.0 ± 0.0
SerIle: 3.295 ± 2.429
SerLys: 4.942 ± 1.132
SerLeu: 1.647 ± 1.297
SerMet: 0.0 ± 0.0
SerAsn: 3.295 ± 0.083
SerPro: 3.295 ± 2.429
SerGln: 0.0 ± 0.0
SerArg: 4.942 ± 1.38
SerSer: 6.59 ± 2.346
SerThr: 3.295 ± 2.429
SerVal: 6.59 ± 2.677
SerTrp: 0.0 ± 0.0
SerTyr: 4.942 ± 1.132
SerXaa: 0.0 ± 0.0
Thr
ThrAla: 1.647 ± 1.214
ThrCys: 1.647 ± 1.214
ThrAsp: 6.59 ± 2.677
ThrGlu: 1.647 ± 1.297
ThrPhe: 0.0 ± 0.0
ThrGly: 4.942 ± 1.132
ThrHis: 0.0 ± 0.0
ThrIle: 3.295 ± 2.429
ThrLys: 4.942 ± 1.132
ThrLeu: 6.59 ± 4.858
ThrMet: 0.0 ± 0.0
ThrAsn: 3.295 ± 2.429
ThrPro: 3.295 ± 0.083
ThrGln: 4.942 ± 1.132
ThrArg: 4.942 ± 3.643
ThrSer: 3.295 ± 2.594
ThrThr: 3.295 ± 2.429
ThrVal: 4.942 ± 1.38
ThrTrp: 3.295 ± 2.429
ThrTyr: 1.647 ± 1.214
ThrXaa: 0.0 ± 0.0
Val
ValAla: 4.942 ± 1.38
ValCys: 1.647 ± 1.297
ValAsp: 3.295 ± 0.083
ValGlu: 4.942 ± 1.38
ValPhe: 4.942 ± 1.38
ValGly: 4.942 ± 1.38
ValHis: 0.0 ± 0.0
ValIle: 3.295 ± 0.083
ValLys: 1.647 ± 1.214
ValLeu: 1.647 ± 1.214
ValMet: 0.0 ± 0.0
ValAsn: 6.59 ± 0.166
ValPro: 3.295 ± 2.594
ValGln: 3.295 ± 2.429
ValArg: 6.59 ± 0.166
ValSer: 3.295 ± 0.083
ValThr: 3.295 ± 0.083
ValVal: 4.942 ± 3.643
ValTrp: 1.647 ± 1.297
ValTyr: 3.295 ± 2.594
ValXaa: 0.0 ± 0.0
Trp
TrpAla: 0.0 ± 0.0
TrpCys: 1.647 ± 1.297
TrpAsp: 0.0 ± 0.0
TrpGlu: 1.647 ± 1.297
TrpPhe: 0.0 ± 0.0
TrpGly: 1.647 ± 1.214
TrpHis: 0.0 ± 0.0
TrpIle: 0.0 ± 0.0
TrpLys: 1.647 ± 1.214
TrpLeu: 3.295 ± 2.429
TrpMet: 0.0 ± 0.0
TrpAsn: 3.295 ± 2.594
TrpPro: 0.0 ± 0.0
TrpGln: 0.0 ± 0.0
TrpArg: 1.647 ± 1.297
TrpSer: 0.0 ± 0.0
TrpThr: 0.0 ± 0.0
TrpVal: 3.295 ± 2.594
TrpTrp: 3.295 ± 0.083
TrpTyr: 1.647 ± 1.297
TrpXaa: 0.0 ± 0.0
Tyr
TyrAla: 3.295 ± 0.083
TyrCys: 3.295 ± 0.083
TyrAsp: 1.647 ± 1.297
TyrGlu: 4.942 ± 1.38
TyrPhe: 0.0 ± 0.0
TyrGly: 8.237 ± 3.56
TyrHis: 0.0 ± 0.0
TyrIle: 3.295 ± 2.429
TyrLys: 1.647 ± 1.297
TyrLeu: 1.647 ± 1.297
TyrMet: 3.295 ± 1.863
TyrAsn: 0.0 ± 0.0
TyrPro: 1.647 ± 1.214
TyrGln: 3.295 ± 2.594
TyrArg: 8.237 ± 1.463
TyrSer: 1.647 ± 1.214
TyrThr: 1.647 ± 1.297
TyrVal: 1.647 ± 1.214
TyrTrp: 0.0 ± 0.0
TyrTyr: 1.647 ± 1.214
TyrXaa: 0.0 ± 0.0
Xaa
XaaAla: 0.0 ± 0.0
XaaCys: 0.0 ± 0.0
XaaAsp: 0.0 ± 0.0
XaaGlu: 0.0 ± 0.0
XaaPhe: 0.0 ± 0.0
XaaGly: 0.0 ± 0.0
XaaHis: 0.0 ± 0.0
XaaIle: 0.0 ± 0.0
XaaLys: 0.0 ± 0.0
XaaLeu: 0.0 ± 0.0
XaaMet: 0.0 ± 0.0
XaaAsn: 0.0 ± 0.0
XaaPro: 0.0 ± 0.0
XaaGln: 0.0 ± 0.0
XaaArg: 0.0 ± 0.0
XaaSer: 0.0 ± 0.0
XaaThr: 0.0 ± 0.0
XaaVal: 0.0 ± 0.0
XaaTrp: 0.0 ± 0.0
XaaTyr: 0.0 ± 0.0
XaaXaa: 0.0 ± 0.0
Statistics based on 2 proteins (608 amino acids)

Note: The error has been estimated by bootstrapping (100 resamples) at the protein level.
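A protein-level bootstrap of this kind could look roughly like the Python sketch below. It is an assumption-laden illustration, not the database's actual procedure: whole proteins are resampled with replacement, the dipeptide frequency is recomputed for each resample, and the standard deviation of those estimates is taken as the error. The function names, the fixed seed, the use of the population standard deviation, and the toy sequences are all hypothetical.

```python
import random
import statistics
from collections import Counter


def dipeptide_permille(proteins, pair):
    """Frequency (per mille) of one dipeptide, pooled over the given proteins."""
    counts = Counter()
    for seq in proteins:
        for i in range(len(seq) - 1):
            counts[seq[i:i + 2]] += 1
    total = sum(counts.values())
    return 1000.0 * counts[pair] / total if total else 0.0


def bootstrap_error(proteins, pair, n_resamples=100, seed=0):
    """Spread of the frequency over resamples of whole proteins (assumed scheme)."""
    rng = random.Random(seed)
    estimates = []
    for _ in range(n_resamples):
        # Draw as many proteins as in the original set, with replacement.
        sample = rng.choices(proteins, k=len(proteins))
        estimates.append(dipeptide_permille(sample, pair))
    return statistics.pstdev(estimates)


# Hypothetical toy sequences in one-letter code, not the proteins of this virus.
toy = ["MKRKA", "SGA"]
print(bootstrap_error(toy, "MK"))
print(bootstrap_error(toy, "WW"))  # absent dipeptide: no spread
```

Under this scheme a dipeptide that is absent from every protein always gets an error of zero, which is consistent with the 0.0 ± 0.0 entries in the table. With only two proteins there are very few distinct resamples, so the estimated errors are necessarily coarse.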

The dipeptide statistics above (along with other statistics for this proteome) can be downloaded from this CSV file
See this proteome in: uniprot_link
Proteome-pI is available under the Creative Commons Attribution-NoDerivs license; for more details see here

Reference: Kozlowski LP. Proteome-pI 2.0: Proteome Isoelectric Point Database Update. Nucleic Acids Res. 2021, doi: 10.1093/nar/gkab944
Contact: Lukasz P. Kozlowski