Amino acid dipeptide frequency for Sewage-associated circular DNA virus-1

Apart from single amino acid frequencies, one can also calculate the so-called amino acid dipeptide frequency, i.e. how often each ordered pair of adjacent amino acids occurs.
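As an illustration, a dipeptide frequency can be obtained by counting overlapping pairs of adjacent residues. The sketch below is only one plausible way to do this: the function name `dipeptide_permille`, the one-letter input alphabet, and the choice to count pairs within each protein separately are assumptions, not the procedure documented for this database.

```python
from collections import Counter

# The 20 standard amino acids in one-letter code; anything else is treated as X (Xaa).
STANDARD = set("ACDEFGHIKLMNPQRSTVWY")

def dipeptide_permille(proteins):
    """Return {dipeptide: frequency in per mille} over overlapping residue pairs."""
    counts = Counter()
    for seq in proteins:
        # Map non-standard or unknown residues to X (assumption of this sketch).
        cleaned = "".join(aa if aa in STANDARD else "X" for aa in seq.upper())
        # A protein of length n contributes n - 1 overlapping dipeptides.
        counts.update(cleaned[i:i + 2] for i in range(len(cleaned) - 1))
    total = sum(counts.values())
    if total == 0:
        return {}
    return {pair: 1000.0 * n / total for pair, n in counts.items()}

# Two toy sequences (hypothetical, not taken from this proteome).
freqs = dipeptide_permille(["MKKLA", "AKKU"])
```

Here `freqs["KK"]` is 2 out of 7 counted pairs, expressed in per mille.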

There are 400 possible dipeptides (441 if Xaa is counted as an additional, 21st amino acid representing all non-standard or unknown amino acids). Thus, if dipeptides occurred randomly in proteins, with all 400 equally likely, each amino acid dipeptide would be present with a frequency of 0.25% (2.5‰). As this is not the case in nature, for better visibility all dipeptides that occur more often than expected are marked in red, and those that are underrepresented are marked in blue in the table.
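The expected value and the over/under marking can be sketched as follows. This is a hedged illustration: the names `expected_permille` and `classify` are hypothetical, and whether the page compares against 1/400 or 1/441, or applies any tolerance, is not stated in the text.

```python
def expected_permille(alphabet_size=20):
    """Uniform expectation in per mille: 1000 / (number of possible dipeptides)."""
    return 1000.0 / alphabet_size ** 2

def classify(observed_permille, alphabet_size=20):
    """Label a dipeptide relative to the uniform expectation (assumed rule)."""
    expected = expected_permille(alphabet_size)
    if observed_permille > expected:
        return "over"    # would be marked red
    if observed_permille < expected:
        return "under"   # would be marked blue
    return "expected"

ala_ala = classify(6.64)    # AlaAla value from the table
ala_cys = classify(1.328)   # AlaCys value from the table
```

With 20 amino acids the expectation is 2.5‰, so AlaAla (6.64‰) comes out as overrepresented and AlaCys (1.328‰) as underrepresented under this assumed rule.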

All values are presented in per mille (‰) and therefore need to be multiplied by 10^-3 to obtain fractions.
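As a small worked example of the unit conversion (the helper names are hypothetical):

```python
def permille_to_fraction(value):
    """Convert a per mille value to a plain fraction (multiply by 10^-3)."""
    return value * 1e-3

def permille_to_percent(value):
    """Convert a per mille value to percent (divide by 10)."""
    return value / 10.0

fraction = permille_to_fraction(6.64)  # AlaAla: about 0.00664
percent = permille_to_percent(6.64)    # AlaAla: about 0.664 %
```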

For more information, see the sequence space article on Wikipedia.

Ala Cys Asp Glu Phe Gly His Ile Lys Leu Met Asn Pro Gln Arg Ser Thr Val Trp Tyr Xaa
Ala
AlaAla: 6.64 ± 3.821
AlaCys: 1.328 ± 1.031
AlaAsp: 1.328 ± 0.764
AlaGlu: 1.328 ± 0.764
AlaPhe: 3.984 ± 1.298
AlaGly: 5.312 ± 1.261
AlaHis: 0.0 ± 0.0
AlaIle: 2.656 ± 0.267
AlaLys: 6.64 ± 2.026
AlaLeu: 5.312 ± 0.534
AlaMet: 1.328 ± 0.764
AlaAsn: 2.656 ± 1.528
AlaPro: 1.328 ± 0.764
AlaGln: 7.968 ± 0.994
AlaArg: 6.64 ± 0.23
AlaSer: 5.312 ± 2.33
AlaThr: 2.656 ± 0.267
AlaVal: 5.312 ± 2.33
AlaTrp: 1.328 ± 1.031
AlaTyr: 1.328 ± 1.031
AlaXaa: 0.0 ± 0.0
Cys
CysAla: 1.328 ± 0.764
CysCys: 0.0 ± 0.0
CysAsp: 0.0 ± 0.0
CysGlu: 1.328 ± 0.764
CysPhe: 0.0 ± 0.0
CysGly: 2.656 ± 0.267
CysHis: 0.0 ± 0.0
CysIle: 0.0 ± 0.0
CysLys: 0.0 ± 0.0
CysLeu: 1.328 ± 0.764
CysMet: 0.0 ± 0.0
CysAsn: 0.0 ± 0.0
CysPro: 0.0 ± 0.0
CysGln: 0.0 ± 0.0
CysArg: 0.0 ± 0.0
CysSer: 0.0 ± 0.0
CysThr: 0.0 ± 0.0
CysVal: 3.984 ± 1.298
CysTrp: 0.0 ± 0.0
CysTyr: 0.0 ± 0.0
CysXaa: 0.0 ± 0.0
Asp
AspAla: 1.328 ± 1.031
AspCys: 1.328 ± 0.764
AspAsp: 2.656 ± 0.267
AspGlu: 1.328 ± 1.031
AspPhe: 2.656 ± 2.063
AspGly: 1.328 ± 1.031
AspHis: 1.328 ± 1.031
AspIle: 1.328 ± 1.031
AspLys: 3.984 ± 1.298
AspLeu: 5.312 ± 0.534
AspMet: 0.0 ± 0.0
AspAsn: 0.0 ± 0.0
AspPro: 1.328 ± 0.764
AspGln: 3.984 ± 0.497
AspArg: 2.656 ± 1.528
AspSer: 5.312 ± 1.261
AspThr: 3.984 ± 0.497
AspVal: 1.328 ± 1.031
AspTrp: 3.984 ± 3.094
AspTyr: 2.656 ± 0.267
AspXaa: 0.0 ± 0.0
Glu
GluAla: 2.656 ± 0.267
GluCys: 0.0 ± 0.0
GluAsp: 1.328 ± 0.764
GluGlu: 1.328 ± 0.764
GluPhe: 2.656 ± 0.267
GluGly: 1.328 ± 1.031
GluHis: 0.0 ± 0.0
GluIle: 2.656 ± 1.528
GluLys: 0.0 ± 0.0
GluLeu: 1.328 ± 1.031
GluMet: 2.656 ± 0.267
GluAsn: 2.656 ± 2.063
GluPro: 2.656 ± 0.267
GluGln: 0.0 ± 0.0
GluArg: 6.64 ± 0.23
GluSer: 2.656 ± 0.267
GluThr: 2.656 ± 0.267
GluVal: 5.312 ± 3.057
GluTrp: 2.656 ± 2.063
GluTyr: 3.984 ± 0.497
GluXaa: 0.0 ± 0.0
Phe
PheAla: 2.656 ± 1.528
PheCys: 0.0 ± 0.0
PheAsp: 5.312 ± 4.125
PheGlu: 1.328 ± 0.764
PhePhe: 2.656 ± 2.063
PheGly: 3.984 ± 0.497
PheHis: 0.0 ± 0.0
PheIle: 1.328 ± 1.031
PheLys: 3.984 ± 2.293
PheLeu: 3.984 ± 1.298
PheMet: 0.0 ± 0.0
PheAsn: 1.328 ± 0.764
PhePro: 2.656 ± 0.267
PheGln: 0.0 ± 0.0
PheArg: 0.0 ± 0.0
PheSer: 5.312 ± 0.534
PheThr: 0.0 ± 0.0
PheVal: 3.984 ± 1.298
PheTrp: 1.328 ± 1.031
PheTyr: 0.0 ± 0.0
PheXaa: 0.0 ± 0.0
Gly
GlyAla: 9.296 ± 0.037
GlyCys: 0.0 ± 0.0
GlyAsp: 3.984 ± 1.298
GlyGlu: 2.656 ± 2.063
GlyPhe: 2.656 ± 1.528
GlyGly: 5.312 ± 3.057
GlyHis: 2.656 ± 0.267
GlyIle: 3.984 ± 0.497
GlyLys: 7.968 ± 0.994
GlyLeu: 2.656 ± 1.528
GlyMet: 1.328 ± 0.764
GlyAsn: 5.312 ± 1.261
GlyPro: 3.984 ± 0.497
GlyGln: 3.984 ± 2.293
GlyArg: 1.328 ± 1.031
GlySer: 6.64 ± 0.23
GlyThr: 2.656 ± 1.528
GlyVal: 2.656 ± 1.528
GlyTrp: 1.328 ± 1.031
GlyTyr: 1.328 ± 1.031
GlyXaa: 0.0 ± 0.0
His
HisAla: 2.656 ± 2.063
HisCys: 1.328 ± 0.764
HisAsp: 0.0 ± 0.0
HisGlu: 0.0 ± 0.0
HisPhe: 2.656 ± 0.267
HisGly: 1.328 ± 0.764
HisHis: 0.0 ± 0.0
HisIle: 2.656 ± 2.063
HisLys: 0.0 ± 0.0
HisLeu: 1.328 ± 1.031
HisMet: 0.0 ± 0.0
HisAsn: 0.0 ± 0.0
HisPro: 1.328 ± 1.031
HisGln: 1.328 ± 1.031
HisArg: 1.328 ± 1.031
HisSer: 2.656 ± 0.267
HisThr: 0.0 ± 0.0
HisVal: 0.0 ± 0.0
HisTrp: 0.0 ± 0.0
HisTyr: 2.656 ± 1.528
HisXaa: 0.0 ± 0.0
Ile
IleAla: 9.296 ± 1.833
IleCys: 0.0 ± 0.0
IleAsp: 3.984 ± 1.298
IleGlu: 3.984 ± 1.298
IlePhe: 0.0 ± 0.0
IleGly: 2.656 ± 0.267
IleHis: 1.328 ± 0.764
IleIle: 1.328 ± 1.031
IleLys: 5.312 ± 0.534
IleLeu: 0.0 ± 0.0
IleMet: 0.0 ± 0.721
IleAsn: 2.656 ± 1.528
IlePro: 0.0 ± 0.0
IleGln: 3.984 ± 2.293
IleArg: 0.0 ± 0.0
IleSer: 3.984 ± 0.497
IleThr: 3.984 ± 0.497
IleVal: 5.312 ± 2.33
IleTrp: 0.0 ± 0.0
IleTyr: 2.656 ± 0.267
IleXaa: 0.0 ± 0.0
Lys
LysAla: 3.984 ± 2.293
LysCys: 5.312 ± 0.534
LysAsp: 0.0 ± 0.0
LysGlu: 3.984 ± 2.293
LysPhe: 0.0 ± 0.0
LysGly: 5.312 ± 3.057
LysHis: 2.656 ± 0.267
LysIle: 1.328 ± 0.764
LysLys: 9.296 ± 0.037
LysLeu: 5.312 ± 3.057
LysMet: 0.0 ± 0.0
LysAsn: 2.656 ± 1.528
LysPro: 2.656 ± 1.528
LysGln: 2.656 ± 0.267
LysArg: 6.64 ± 0.23
LysSer: 5.312 ± 0.534
LysThr: 2.656 ± 2.063
LysVal: 5.312 ± 1.261
LysTrp: 1.328 ± 0.764
LysTyr: 3.984 ± 1.298
LysXaa: 0.0 ± 0.0
Leu
LeuAla: 2.656 ± 0.267
LeuCys: 0.0 ± 0.0
LeuAsp: 2.656 ± 2.063
LeuGlu: 2.656 ± 1.528
LeuPhe: 0.0 ± 0.0
LeuGly: 6.64 ± 1.565
LeuHis: 3.984 ± 3.094
LeuIle: 10.624 ± 0.727
LeuLys: 2.656 ± 1.528
LeuLeu: 7.968 ± 0.994
LeuMet: 1.328 ± 1.031
LeuAsn: 6.64 ± 0.23
LeuPro: 2.656 ± 0.267
LeuGln: 6.64 ± 0.23
LeuArg: 2.656 ± 1.528
LeuSer: 5.312 ± 0.534
LeuThr: 6.64 ± 1.565
LeuVal: 3.984 ± 1.298
LeuTrp: 1.328 ± 0.764
LeuTyr: 3.984 ± 2.293
LeuXaa: 0.0 ± 0.0
Met
MetAla: 0.0 ± 0.0
MetCys: 0.0 ± 0.0
MetAsp: 0.0 ± 0.0
MetGlu: 0.0 ± 0.0
MetPhe: 2.656 ± 1.528
MetGly: 0.0 ± 0.0
MetHis: 0.0 ± 0.0
MetIle: 0.0 ± 0.0
MetLys: 1.328 ± 0.764
MetLeu: 1.328 ± 1.031
MetMet: 0.0 ± 0.0
MetAsn: 0.0 ± 0.0
MetPro: 1.328 ± 1.031
MetGln: 1.328 ± 0.764
MetArg: 1.328 ± 0.764
MetSer: 5.312 ± 0.534
MetThr: 1.328 ± 0.764
MetVal: 2.656 ± 1.528
MetTrp: 1.328 ± 1.031
MetTyr: 1.328 ± 0.764
MetXaa: 0.0 ± 0.0
Asn
AsnAla: 1.328 ± 1.031
AsnCys: 0.0 ± 0.0
AsnAsp: 1.328 ± 0.764
AsnGlu: 2.656 ± 0.267
AsnPhe: 3.984 ± 1.298
AsnGly: 6.64 ± 1.565
AsnHis: 0.0 ± 0.0
AsnIle: 2.656 ± 0.267
AsnLys: 2.656 ± 0.267
AsnLeu: 2.656 ± 1.528
AsnMet: 1.328 ± 1.031
AsnAsn: 2.656 ± 1.528
AsnPro: 1.328 ± 0.764
AsnGln: 1.328 ± 0.764
AsnArg: 5.312 ± 3.057
AsnSer: 2.656 ± 1.528
AsnThr: 0.0 ± 0.0
AsnVal: 6.64 ± 2.026
AsnTrp: 0.0 ± 0.0
AsnTyr: 1.328 ± 1.031
AsnXaa: 0.0 ± 0.0
Pro
ProAla: 3.984 ± 1.298
ProCys: 0.0 ± 0.0
ProAsp: 2.656 ± 1.528
ProGlu: 0.0 ± 0.0
ProPhe: 1.328 ± 0.764
ProGly: 3.984 ± 2.293
ProHis: 1.328 ± 1.031
ProIle: 0.0 ± 0.0
ProLys: 5.312 ± 1.261
ProLeu: 2.656 ± 1.528
ProMet: 5.312 ± 1.016
ProAsn: 1.328 ± 1.031
ProPro: 5.312 ± 0.534
ProGln: 1.328 ± 1.031
ProArg: 3.984 ± 3.094
ProSer: 1.328 ± 0.764
ProThr: 1.328 ± 0.764
ProVal: 1.328 ± 0.764
ProTrp: 0.0 ± 0.0
ProTyr: 0.0 ± 0.0
ProXaa: 0.0 ± 0.0
Gln
GlnAla: 2.656 ± 1.528
GlnCys: 0.0 ± 0.0
GlnAsp: 3.984 ± 1.298
GlnGlu: 6.64 ± 3.361
GlnPhe: 1.328 ± 1.031
GlnGly: 2.656 ± 1.528
GlnHis: 0.0 ± 0.0
GlnIle: 2.656 ± 1.528
GlnLys: 2.656 ± 1.528
GlnLeu: 3.984 ± 2.293
GlnMet: 0.0 ± 0.0
GlnAsn: 3.984 ± 0.497
GlnPro: 2.656 ± 1.528
GlnGln: 2.656 ± 1.528
GlnArg: 2.656 ± 2.063
GlnSer: 7.968 ± 2.79
GlnThr: 1.328 ± 0.764
GlnVal: 1.328 ± 0.764
GlnTrp: 1.328 ± 1.031
GlnTyr: 1.328 ± 0.764
GlnXaa: 0.0 ± 0.0
Arg
ArgAla: 2.656 ± 1.528
ArgCys: 0.0 ± 0.0
ArgAsp: 2.656 ± 2.063
ArgGlu: 0.0 ± 0.0
ArgPhe: 0.0 ± 0.0
ArgGly: 2.656 ± 1.528
ArgHis: 6.64 ± 1.565
ArgIle: 1.328 ± 0.764
ArgLys: 1.328 ± 0.764
ArgLeu: 2.656 ± 0.267
ArgMet: 0.0 ± 0.0
ArgAsn: 2.656 ± 0.267
ArgPro: 3.984 ± 1.298
ArgGln: 5.312 ± 0.534
ArgArg: 2.656 ± 2.063
ArgSer: 5.312 ± 2.33
ArgThr: 5.312 ± 1.261
ArgVal: 3.984 ± 0.497
ArgTrp: 1.328 ± 0.764
ArgTyr: 0.0 ± 0.0
ArgXaa: 0.0 ± 0.0
Ser
SerAla: 5.312 ± 2.33
SerCys: 0.0 ± 0.0
SerAsp: 7.968 ± 2.597
SerGlu: 1.328 ± 1.031
SerPhe: 1.328 ± 1.031
SerGly: 6.64 ± 2.026
SerHis: 0.0 ± 0.0
SerIle: 7.968 ± 0.994
SerLys: 6.64 ± 2.026
SerLeu: 7.968 ± 0.801
SerMet: 2.656 ± 1.528
SerAsn: 3.984 ± 0.497
SerPro: 3.984 ± 2.293
SerGln: 5.312 ± 3.057
SerArg: 2.656 ± 0.267
SerSer: 9.296 ± 0.037
SerThr: 7.968 ± 2.597
SerVal: 3.984 ± 0.497
SerTrp: 2.656 ± 2.063
SerTyr: 3.984 ± 0.497
SerXaa: 0.0 ± 0.0
Thr
ThrAla: 2.656 ± 0.267
ThrCys: 0.0 ± 0.0
ThrAsp: 1.328 ± 0.764
ThrGlu: 3.984 ± 0.497
ThrPhe: 3.984 ± 2.293
ThrGly: 2.656 ± 0.267
ThrHis: 1.328 ± 0.764
ThrIle: 0.0 ± 0.0
ThrLys: 1.328 ± 0.764
ThrLeu: 5.312 ± 2.33
ThrMet: 0.0 ± 0.0
ThrAsn: 2.656 ± 2.063
ThrPro: 1.328 ± 0.764
ThrGln: 2.656 ± 2.063
ThrArg: 2.656 ± 2.063
ThrSer: 3.984 ± 1.298
ThrThr: 1.328 ± 0.764
ThrVal: 3.984 ± 2.293
ThrTrp: 3.984 ± 0.497
ThrTyr: 2.656 ± 0.267
ThrXaa: 0.0 ± 0.0
Val
ValAla: 3.984 ± 2.293
ValCys: 1.328 ± 0.764
ValAsp: 2.656 ± 1.528
ValGlu: 3.984 ± 0.497
ValPhe: 5.312 ± 2.33
ValGly: 5.312 ± 0.534
ValHis: 0.0 ± 0.0
ValIle: 5.312 ± 2.33
ValLys: 5.312 ± 0.534
ValLeu: 13.28 ± 3.131
ValMet: 2.656 ± 1.528
ValAsn: 2.656 ± 1.528
ValPro: 2.656 ± 0.267
ValGln: 0.0 ± 0.0
ValArg: 1.328 ± 0.764
ValSer: 7.968 ± 2.79
ValThr: 1.328 ± 1.031
ValVal: 3.984 ± 1.298
ValTrp: 1.328 ± 1.031
ValTyr: 1.328 ± 0.764
ValXaa: 0.0 ± 0.0
Trp
TrpAla: 2.656 ± 2.063
TrpCys: 0.0 ± 0.0
TrpAsp: 3.984 ± 0.497
TrpGlu: 2.656 ± 2.063
TrpPhe: 2.656 ± 0.267
TrpGly: 1.328 ± 1.031
TrpHis: 0.0 ± 0.0
TrpIle: 0.0 ± 0.0
TrpLys: 0.0 ± 0.0
TrpLeu: 2.656 ± 2.063
TrpMet: 0.0 ± 0.0
TrpAsn: 1.328 ± 1.031
TrpPro: 0.0 ± 0.0
TrpGln: 0.0 ± 0.0
TrpArg: 0.0 ± 0.0
TrpSer: 2.656 ± 0.267
TrpThr: 1.328 ± 1.031
TrpVal: 1.328 ± 1.031
TrpTrp: 2.656 ± 2.063
TrpTyr: 3.984 ± 1.298
TrpXaa: 0.0 ± 0.0
Tyr
TyrAla: 2.656 ± 1.528
TyrCys: 0.0 ± 0.0
TyrAsp: 0.0 ± 0.0
TyrGlu: 3.984 ± 2.293
TyrPhe: 0.0 ± 0.0
TyrGly: 3.984 ± 0.497
TyrHis: 0.0 ± 0.0
TyrIle: 3.984 ± 3.094
TyrLys: 3.984 ± 0.497
TyrLeu: 3.984 ± 0.497
TyrMet: 1.328 ± 0.764
TyrAsn: 1.328 ± 0.764
TyrPro: 1.328 ± 1.031
TyrGln: 1.328 ± 0.764
TyrArg: 0.0 ± 0.0
TyrSer: 2.656 ± 0.267
TyrThr: 1.328 ± 0.764
TyrVal: 5.312 ± 2.33
TyrTrp: 1.328 ± 1.031
TyrTyr: 0.0 ± 0.0
TyrXaa: 0.0 ± 0.0
Xaa
XaaAla: 0.0 ± 0.0
XaaCys: 0.0 ± 0.0
XaaAsp: 0.0 ± 0.0
XaaGlu: 0.0 ± 0.0
XaaPhe: 0.0 ± 0.0
XaaGly: 0.0 ± 0.0
XaaHis: 0.0 ± 0.0
XaaIle: 0.0 ± 0.0
XaaLys: 0.0 ± 0.0
XaaLeu: 0.0 ± 0.0
XaaMet: 0.0 ± 0.0
XaaAsn: 0.0 ± 0.0
XaaPro: 0.0 ± 0.0
XaaGln: 0.0 ± 0.0
XaaArg: 0.0 ± 0.0
XaaSer: 0.0 ± 0.0
XaaThr: 0.0 ± 0.0
XaaVal: 0.0 ± 0.0
XaaTrp: 0.0 ± 0.0
XaaTyr: 0.0 ± 0.0
XaaXaa: 0.0 ± 0.0
Statistics based on 2 proteins (754 amino acids)

Note: The error has been estimated by bootstrapping (×100) at the protein level.
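A protein-level bootstrap of this kind can be sketched as follows. This is a minimal illustration under stated assumptions: whole proteins are resampled with replacement, the frequency is recomputed for each of 100 replicates, and the spread is reported as a population standard deviation. The function names, the fixed seed, and the choice of standard deviation as the error measure are assumptions; the page does not specify them.

```python
import random
import statistics
from collections import Counter

def pair_permille(proteins, pair):
    """Frequency of one dipeptide in per mille over overlapping residue pairs."""
    counts = Counter()
    for seq in proteins:
        counts.update(seq[i:i + 2] for i in range(len(seq) - 1))
    total = sum(counts.values())
    return 1000.0 * counts[pair] / total if total else 0.0

def bootstrap_error(proteins, pair, replicates=100, seed=0):
    """Spread of the frequency across protein-level bootstrap resamples."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    estimates = []
    for _ in range(replicates):
        # Resample whole proteins with replacement, keeping the proteome size.
        sample = rng.choices(proteins, k=len(proteins))
        estimates.append(pair_permille(sample, pair))
    return statistics.pstdev(estimates)

# Toy proteome of two hypothetical sequences.
error = bootstrap_error(["AAAA", "KKKK"], "AA")
```

With only two proteins, as in this proteome, each resample is one of very few distinct combinations, so the estimated errors should be read with caution.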

The dipeptide statistics above (among other statistics for this proteome) can be downloaded from this CSV file
See this proteome in: uniprot_link
Proteome-pI is available under the Creative Commons Attribution-NoDerivs license; for more details see here

Reference: Kozlowski LP. Proteome-pI 2.0: Proteome Isoelectric Point Database Update. Nucleic Acids Res. 2021, doi: 10.1093/nar/gkab944
Contact: Lukasz P. Kozlowski