Amino acid dipeptide frequency for Sewage-associated circular DNA virus-37

Apart from single amino acid frequencies, one can also calculate the so-called amino acid dipeptide frequency, i.e. how often each ordered pair of adjacent amino acids occurs.

There are 400 possible dipeptides (441 if Xaa is counted as an additional 21st amino acid representing all non-standard or unknown amino acids). Thus, if dipeptides occurred randomly in proteins, with all 20 amino acids equally likely, each dipeptide would be present with a frequency of 0.25% (2.5‰). As this is not the case in nature, for better visibility all dipeptides that occur more often than expected are marked in red in the table, and those that are underrepresented are marked in blue.

All values are given in per mille (‰) and therefore need to be multiplied by 10⁻³ to obtain fractions.
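As a rough illustration of the two paragraphs above, the following Python sketch counts overlapping residue pairs in a set of protein sequences and reports them in per mille, next to the uniform expectation of 1/400 = 2.5‰. The function names, the toy sequences, the one-letter codes, and the simple above/below threshold are assumptions made for this example. The exact counting and normalization used by the database are not described on this page.

```python
from collections import Counter
from itertools import product

# The 20 standard amino acids in one-letter code (assumption: the 21st
# symbol, Xaa/"X", is left out, which gives 20 * 20 = 400 dipeptides).
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

# Uniform expectation in per mille: 1000 / 400 = 2.5
EXPECTED_PERMILLE = 1000.0 / len(AMINO_ACIDS) ** 2


def dipeptide_permille(proteins):
    """Return {dipeptide: frequency in per mille} for all 400 standard pairs."""
    counts = Counter()
    for seq in proteins:
        # Overlapping windows of length 2; a pair never spans two proteins.
        counts.update(seq[i:i + 2] for i in range(len(seq) - 1))
    total = sum(counts.values())
    if total == 0:
        return {a + b: 0.0 for a, b in product(AMINO_ACIDS, repeat=2)}
    # Pairs containing non-standard letters count towards the total
    # but are not reported in the returned dictionary.
    return {a + b: 1000.0 * counts[a + b] / total
            for a, b in product(AMINO_ACIDS, repeat=2)}


def classify(freq_permille):
    """Label a frequency relative to the uniform expectation."""
    if freq_permille > EXPECTED_PERMILLE:
        return "over"
    if freq_permille < EXPECTED_PERMILLE:
        return "under"
    return "expected"


# Hypothetical toy sequences, not taken from the proteome on this page.
toy_proteins = ["MAAK", "AAL"]
freqs = dipeptide_permille(toy_proteins)
for pair in ("AA", "AK", "CC"):
    # Multiply a per mille value by 1e-3 to get a plain fraction.
    print(pair, freqs[pair], freqs[pair] * 1e-3, classify(freqs[pair]))
```

With only two short proteins, as in this proteome, most of the 400 dipeptides are necessarily absent or seen once, so over- and underrepresentation should be read with that in mind.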

For more information, see the sequence space article on Wikipedia.

Ala Cys Asp Glu Phe Gly His Ile Lys Leu Met Asn Pro Gln Arg Ser Thr Val Trp Tyr Xaa
Ala
AlaAla: 7.716 ± 3.081
AlaCys: 0.0 ± 0.0
AlaAsp: 6.173 ± 2.473
AlaGlu: 1.543 ± 1.065
AlaPhe: 0.0 ± 0.0
AlaGly: 4.63 ± 0.951
AlaHis: 1.543 ± 1.065
AlaIle: 3.086 ± 2.358
AlaLys: 3.086 ± 2.358
AlaLeu: 7.716 ± 1.408
AlaMet: 3.086 ± 0.114
AlaAsn: 10.802 ± 0.722
AlaPro: 3.086 ± 2.13
AlaGln: 6.173 ± 0.229
AlaArg: 4.63 ± 1.294
AlaSer: 9.259 ± 4.146
AlaThr: 9.259 ± 1.901
AlaVal: 3.086 ± 2.13
AlaTrp: 3.086 ± 0.114
AlaTyr: 7.716 ± 1.408
AlaXaa: 0.0 ± 0.0
Cys
CysAla: 0.0 ± 0.0
CysCys: 0.0 ± 0.0
CysAsp: 3.086 ± 2.358
CysGlu: 1.543 ± 1.179
CysPhe: 1.543 ± 1.179
CysGly: 0.0 ± 0.0
CysHis: 1.543 ± 1.179
CysIle: 0.0 ± 0.0
CysLys: 0.0 ± 0.0
CysLeu: 0.0 ± 0.0
CysMet: 0.0 ± 0.0
CysAsn: 0.0 ± 0.0
CysPro: 0.0 ± 0.0
CysGln: 0.0 ± 0.0
CysArg: 1.543 ± 1.179
CysSer: 1.543 ± 1.065
CysThr: 1.543 ± 1.065
CysVal: 1.543 ± 1.065
CysTrp: 0.0 ± 0.0
CysTyr: 0.0 ± 0.0
CysXaa: 0.0 ± 0.0
Asp
AspAla: 6.173 ± 4.26
AspCys: 0.0 ± 0.0
AspAsp: 3.086 ± 2.358
AspGlu: 1.543 ± 1.179
AspPhe: 1.543 ± 1.179
AspGly: 4.63 ± 3.538
AspHis: 3.086 ± 2.358
AspIle: 4.63 ± 1.294
AspLys: 4.63 ± 3.538
AspLeu: 1.543 ± 1.179
AspMet: 1.543 ± 1.065
AspAsn: 1.543 ± 1.065
AspPro: 0.0 ± 0.0
AspGln: 3.086 ± 0.114
AspArg: 0.0 ± 0.0
AspSer: 3.086 ± 2.13
AspThr: 1.543 ± 1.065
AspVal: 4.63 ± 0.951
AspTrp: 4.63 ± 3.538
AspTyr: 4.63 ± 1.294
AspXaa: 0.0 ± 0.0
Glu
GluAla: 3.086 ± 2.358
GluCys: 0.0 ± 0.0
GluAsp: 3.086 ± 2.358
GluGlu: 1.543 ± 1.179
GluPhe: 1.543 ± 1.065
GluGly: 0.0 ± 0.0
GluHis: 0.0 ± 0.0
GluIle: 1.543 ± 1.065
GluLys: 1.543 ± 1.065
GluLeu: 3.086 ± 2.358
GluMet: 0.0 ± 0.0
GluAsn: 0.0 ± 0.0
GluPro: 0.0 ± 0.0
GluGln: 1.543 ± 1.065
GluArg: 3.086 ± 0.114
GluSer: 1.543 ± 1.179
GluThr: 1.543 ± 1.179
GluVal: 3.086 ± 2.358
GluTrp: 4.63 ± 1.294
GluTyr: 1.543 ± 1.065
GluXaa: 0.0 ± 0.0
Phe
PheAla: 3.086 ± 0.114
PheCys: 3.086 ± 0.114
PheAsp: 6.173 ± 2.473
PheGlu: 0.0 ± 0.0
PhePhe: 0.0 ± 0.0
PheGly: 1.543 ± 1.179
PheHis: 1.543 ± 1.179
PheIle: 1.543 ± 1.179
PheLys: 0.0 ± 0.0
PheLeu: 1.543 ± 1.179
PheMet: 3.086 ± 1.846
PheAsn: 9.259 ± 1.901
PhePro: 0.0 ± 0.0
PheGln: 6.173 ± 2.473
PheArg: 3.086 ± 0.114
PheSer: 1.543 ± 1.065
PheThr: 3.086 ± 0.114
PheVal: 1.543 ± 1.065
PheTrp: 0.0 ± 0.0
PheTyr: 1.543 ± 1.179
PheXaa: 0.0 ± 0.0
Gly
GlyAla: 7.716 ± 1.408
GlyCys: 1.543 ± 1.179
GlyAsp: 6.173 ± 0.229
GlyGlu: 3.086 ± 2.358
GlyPhe: 1.543 ± 1.065
GlyGly: 0.0 ± 0.0
GlyHis: 1.543 ± 1.179
GlyIle: 3.086 ± 2.13
GlyLys: 4.63 ± 0.951
GlyLeu: 6.173 ± 2.473
GlyMet: 0.0 ± 0.0
GlyAsn: 3.086 ± 2.13
GlyPro: 0.0 ± 0.0
GlyGln: 1.543 ± 1.065
GlyArg: 0.0 ± 0.0
GlySer: 7.716 ± 3.081
GlyThr: 4.63 ± 1.294
GlyVal: 4.63 ± 3.195
GlyTrp: 1.543 ± 1.065
GlyTyr: 0.0 ± 0.0
GlyXaa: 0.0 ± 0.0
His
HisAla: 1.543 ± 1.179
HisCys: 0.0 ± 0.0
HisAsp: 1.543 ± 1.065
HisGlu: 0.0 ± 0.0
HisPhe: 3.086 ± 2.358
HisGly: 3.086 ± 0.114
HisHis: 6.173 ± 0.229
HisIle: 3.086 ± 2.358
HisLys: 0.0 ± 0.0
HisLeu: 3.086 ± 0.114
HisMet: 1.543 ± 1.179
HisAsn: 0.0 ± 0.0
HisPro: 1.543 ± 1.179
HisGln: 0.0 ± 0.0
HisArg: 3.086 ± 2.13
HisSer: 3.086 ± 0.114
HisThr: 0.0 ± 0.0
HisVal: 4.63 ± 0.951
HisTrp: 0.0 ± 0.0
HisTyr: 0.0 ± 0.0
HisXaa: 0.0 ± 0.0
Ile
IleAla: 7.716 ± 0.836
IleCys: 0.0 ± 0.0
IleAsp: 1.543 ± 1.179
IleGlu: 0.0 ± 0.0
IlePhe: 6.173 ± 2.016
IleGly: 1.543 ± 1.179
IleHis: 1.543 ± 1.179
IleIle: 3.086 ± 2.358
IleLys: 1.543 ± 1.065
IleLeu: 3.086 ± 0.114
IleMet: 0.0 ± 0.0
IleAsn: 1.543 ± 1.065
IlePro: 3.086 ± 0.114
IleGln: 1.543 ± 1.065
IleArg: 0.0 ± 0.0
IleSer: 4.63 ± 1.294
IleThr: 4.63 ± 3.538
IleVal: 6.173 ± 0.229
IleTrp: 0.0 ± 0.0
IleTyr: 1.543 ± 1.179
IleXaa: 0.0 ± 0.0
Lys
LysAla: 0.0 ± 0.0
LysCys: 1.543 ± 1.065
LysAsp: 4.63 ± 0.951
LysGlu: 4.63 ± 3.538
LysPhe: 3.086 ± 0.114
LysGly: 1.543 ± 1.065
LysHis: 0.0 ± 0.0
LysIle: 1.543 ± 1.179
LysLys: 4.63 ± 3.195
LysLeu: 3.086 ± 2.13
LysMet: 0.0 ± 0.0
LysAsn: 1.543 ± 1.065
LysPro: 1.543 ± 1.179
LysGln: 3.086 ± 0.114
LysArg: 3.086 ± 0.114
LysSer: 4.63 ± 0.951
LysThr: 7.716 ± 5.325
LysVal: 3.086 ± 2.13
LysTrp: 0.0 ± 0.0
LysTyr: 1.543 ± 1.179
LysXaa: 0.0 ± 0.0
Leu
LeuAla: 3.086 ± 0.114
LeuCys: 1.543 ± 1.065
LeuAsp: 4.63 ± 3.195
LeuGlu: 1.543 ± 1.179
LeuPhe: 3.086 ± 0.114
LeuGly: 10.802 ± 3.766
LeuHis: 3.086 ± 0.114
LeuIle: 0.0 ± 0.0
LeuLys: 3.086 ± 2.13
LeuLeu: 4.63 ± 3.538
LeuMet: 1.543 ± 1.065
LeuAsn: 1.543 ± 1.179
LeuPro: 6.173 ± 2.473
LeuGln: 1.543 ± 1.179
LeuArg: 4.63 ± 0.951
LeuSer: 1.543 ± 1.065
LeuThr: 3.086 ± 0.114
LeuVal: 3.086 ± 0.114
LeuTrp: 0.0 ± 0.0
LeuTyr: 3.086 ± 0.114
LeuXaa: 0.0 ± 0.0
Met
MetAla: 1.543 ± 1.065
MetCys: 0.0 ± 0.0
MetAsp: 0.0 ± 0.0
MetGlu: 0.0 ± 0.0
MetPhe: 1.543 ± 1.179
MetGly: 0.0 ± 0.0
MetHis: 0.0 ± 0.0
MetIle: 0.0 ± 0.0
MetLys: 1.543 ± 1.065
MetLeu: 1.543 ± 1.179
MetMet: 1.543 ± 1.065
MetAsn: 0.0 ± 0.0
MetPro: 4.63 ± 0.951
MetGln: 0.0 ± 0.0
MetArg: 3.086 ± 2.13
MetSer: 1.543 ± 1.065
MetThr: 0.0 ± 0.0
MetVal: 0.0 ± 0.0
MetTrp: 0.0 ± 0.0
MetTyr: 1.543 ± 1.179
MetXaa: 0.0 ± 0.0
Asn
AsnAla: 7.716 ± 1.408
AsnCys: 1.543 ± 1.179
AsnAsp: 3.086 ± 2.13
AsnGlu: 1.543 ± 1.065
AsnPhe: 1.543 ± 1.179
AsnGly: 6.173 ± 2.016
AsnHis: 0.0 ± 0.0
AsnIle: 0.0 ± 0.0
AsnLys: 1.543 ± 1.065
AsnLeu: 3.086 ± 0.114
AsnMet: 1.543 ± 1.065
AsnAsn: 1.543 ± 1.065
AsnPro: 1.543 ± 1.065
AsnGln: 6.173 ± 2.016
AsnArg: 3.086 ± 0.114
AsnSer: 0.0 ± 0.0
AsnThr: 1.543 ± 1.179
AsnVal: 6.173 ± 2.016
AsnTrp: 3.086 ± 2.13
AsnTyr: 1.543 ± 1.179
AsnXaa: 0.0 ± 0.0
Pro
ProAla: 10.802 ± 3.766
ProCys: 0.0 ± 0.0
ProAsp: 0.0 ± 0.0
ProGlu: 0.0 ± 0.0
ProPhe: 3.086 ± 0.114
ProGly: 4.63 ± 3.195
ProHis: 4.63 ± 1.294
ProIle: 1.543 ± 1.179
ProLys: 3.086 ± 2.13
ProLeu: 0.0 ± 0.0
ProMet: 1.543 ± 1.065
ProAsn: 1.543 ± 1.179
ProPro: 7.716 ± 3.652
ProGln: 3.086 ± 2.358
ProArg: 3.086 ± 2.358
ProSer: 0.0 ± 0.0
ProThr: 1.543 ± 1.179
ProVal: 3.086 ± 2.13
ProTrp: 0.0 ± 0.0
ProTyr: 0.0 ± 0.0
ProXaa: 0.0 ± 0.0
Gln
GlnAla: 1.543 ± 1.065
GlnCys: 1.543 ± 1.179
GlnAsp: 1.543 ± 1.179
GlnGlu: 0.0 ± 0.0
GlnPhe: 4.63 ± 1.294
GlnGly: 1.543 ± 1.065
GlnHis: 0.0 ± 0.0
GlnIle: 3.086 ± 2.13
GlnLys: 1.543 ± 1.065
GlnLeu: 4.63 ± 0.951
GlnMet: 0.0 ± 0.0
GlnAsn: 1.543 ± 1.179
GlnPro: 1.543 ± 1.179
GlnGln: 1.543 ± 1.179
GlnArg: 4.63 ± 1.294
GlnSer: 4.63 ± 1.294
GlnThr: 3.086 ± 0.114
GlnVal: 3.086 ± 2.13
GlnTrp: 1.543 ± 1.179
GlnTyr: 3.086 ± 0.114
GlnXaa: 0.0 ± 0.0
Arg
ArgAla: 1.543 ± 1.065
ArgCys: 1.543 ± 1.179
ArgAsp: 1.543 ± 1.179
ArgGlu: 3.086 ± 0.114
ArgPhe: 0.0 ± 0.0
ArgGly: 3.086 ± 0.114
ArgHis: 1.543 ± 1.179
ArgIle: 0.0 ± 0.0
ArgLys: 6.173 ± 2.016
ArgLeu: 1.543 ± 1.179
ArgMet: 0.0 ± 0.0
ArgAsn: 6.173 ± 2.016
ArgPro: 4.63 ± 1.294
ArgGln: 1.543 ± 1.179
ArgArg: 6.173 ± 2.016
ArgSer: 1.543 ± 1.065
ArgThr: 6.173 ± 0.229
ArgVal: 4.63 ± 1.294
ArgTrp: 3.086 ± 0.114
ArgTyr: 1.543 ± 1.179
ArgXaa: 0.0 ± 0.0
Ser
SerAla: 6.173 ± 2.016
SerCys: 0.0 ± 0.0
SerAsp: 0.0 ± 0.0
SerGlu: 6.173 ± 2.016
SerPhe: 6.173 ± 0.229
SerGly: 3.086 ± 2.13
SerHis: 4.63 ± 0.951
SerIle: 3.086 ± 2.13
SerLys: 1.543 ± 1.179
SerLeu: 7.716 ± 3.081
SerMet: 1.543 ± 1.065
SerAsn: 1.543 ± 1.179
SerPro: 3.086 ± 2.358
SerGln: 1.543 ± 1.065
SerArg: 1.543 ± 1.065
SerSer: 7.716 ± 5.325
SerThr: 6.173 ± 2.016
SerVal: 3.086 ± 2.13
SerTrp: 1.543 ± 1.179
SerTyr: 1.543 ± 1.179
SerXaa: 0.0 ± 0.0
Thr
ThrAla: 1.543 ± 1.065
ThrCys: 0.0 ± 0.0
ThrAsp: 1.543 ± 1.179
ThrGlu: 3.086 ± 2.358
ThrPhe: 1.543 ± 1.179
ThrGly: 0.0 ± 0.0
ThrHis: 3.086 ± 2.13
ThrIle: 6.173 ± 0.229
ThrLys: 4.63 ± 0.951
ThrLeu: 1.543 ± 1.065
ThrMet: 0.0 ± 0.0
ThrAsn: 6.173 ± 2.016
ThrPro: 4.63 ± 0.951
ThrGln: 0.0 ± 0.0
ThrArg: 4.63 ± 3.538
ThrSer: 4.63 ± 0.951
ThrThr: 6.173 ± 0.229
ThrVal: 9.259 ± 1.901
ThrTrp: 3.086 ± 0.114
ThrTyr: 7.716 ± 3.081
ThrXaa: 0.0 ± 0.0
Val
ValAla: 12.346 ± 4.031
ValCys: 0.0 ± 0.0
ValAsp: 3.086 ± 0.114
ValGlu: 1.543 ± 1.065
ValPhe: 3.086 ± 0.114
ValGly: 7.716 ± 5.325
ValHis: 1.543 ± 1.065
ValIle: 6.173 ± 0.229
ValLys: 3.086 ± 0.114
ValLeu: 6.173 ± 2.016
ValMet: 0.0 ± 0.0
ValAsn: 4.63 ± 0.951
ValPro: 3.086 ± 0.114
ValGln: 3.086 ± 0.114
ValArg: 1.543 ± 1.179
ValSer: 4.63 ± 1.294
ValThr: 4.63 ± 0.951
ValVal: 1.543 ± 1.179
ValTrp: 0.0 ± 0.0
ValTyr: 3.086 ± 2.13
ValXaa: 0.0 ± 0.0
Trp
TrpAla: 6.173 ± 0.229
TrpCys: 1.543 ± 1.179
TrpAsp: 1.543 ± 1.179
TrpGlu: 1.543 ± 1.179
TrpPhe: 1.543 ± 1.179
TrpGly: 3.086 ± 2.358
TrpHis: 0.0 ± 0.0
TrpIle: 3.086 ± 0.114
TrpLys: 0.0 ± 0.0
TrpLeu: 0.0 ± 0.0
TrpMet: 0.0 ± 0.0
TrpAsn: 0.0 ± 0.0
TrpPro: 0.0 ± 0.0
TrpGln: 0.0 ± 0.0
TrpArg: 3.086 ± 2.13
TrpSer: 3.086 ± 0.114
TrpThr: 1.543 ± 1.065
TrpVal: 0.0 ± 0.0
TrpTrp: 0.0 ± 0.0
TrpTyr: 0.0 ± 0.0
TrpXaa: 0.0 ± 0.0
Tyr
TyrAla: 6.173 ± 2.473
TyrCys: 0.0 ± 0.0
TyrAsp: 3.086 ± 0.114
TyrGlu: 0.0 ± 0.0
TyrPhe: 3.086 ± 2.358
TyrGly: 1.543 ± 1.065
TyrHis: 0.0 ± 0.0
TyrIle: 4.63 ± 1.294
TyrLys: 4.63 ± 0.951
TyrLeu: 1.543 ± 1.065
TyrMet: 0.0 ± 0.0
TyrAsn: 0.0 ± 0.0
TyrPro: 3.086 ± 0.114
TyrGln: 3.086 ± 0.114
TyrArg: 1.543 ± 1.179
TyrSer: 1.543 ± 1.065
TyrThr: 3.086 ± 2.13
TyrVal: 4.63 ± 3.538
TyrTrp: 0.0 ± 0.0
TyrTyr: 1.543 ± 1.065
TyrXaa: 0.0 ± 0.0
Xaa
XaaAla: 0.0 ± 0.0
XaaCys: 0.0 ± 0.0
XaaAsp: 0.0 ± 0.0
XaaGlu: 0.0 ± 0.0
XaaPhe: 0.0 ± 0.0
XaaGly: 0.0 ± 0.0
XaaHis: 0.0 ± 0.0
XaaIle: 0.0 ± 0.0
XaaLys: 0.0 ± 0.0
XaaLeu: 0.0 ± 0.0
XaaMet: 0.0 ± 0.0
XaaAsn: 0.0 ± 0.0
XaaPro: 0.0 ± 0.0
XaaGln: 0.0 ± 0.0
XaaArg: 0.0 ± 0.0
XaaSer: 0.0 ± 0.0
XaaThr: 0.0 ± 0.0
XaaVal: 0.0 ± 0.0
XaaTrp: 0.0 ± 0.0
XaaTyr: 0.0 ± 0.0
XaaXaa: 0.0 ± 0.0
Statistics based on 2 proteins (649 amino acids)

Note: The error was estimated by bootstrapping (×100) at the protein level.
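A protein-level bootstrap of this kind could look like the sketch below: whole proteins are resampled with replacement, the statistic is recomputed for each resample, and the spread of the recomputed values is taken as the error. The helper names, the toy sequences, the fixed seed, and the use of the population standard deviation are assumptions for illustration, not the database's actual implementation.

```python
import random
import statistics


def pair_permille(proteins, pair):
    """Frequency of one dipeptide (one-letter codes) in per mille."""
    total = sum(max(len(seq) - 1, 0) for seq in proteins)
    hits = sum(seq[i:i + 2] == pair
               for seq in proteins
               for i in range(len(seq) - 1))
    return 1000.0 * hits / total if total else 0.0


def bootstrap_error(proteins, statistic, n_resamples=100, seed=0):
    """Standard deviation of `statistic` over protein-level resamples."""
    rng = random.Random(seed)  # fixed seed only to make the sketch repeatable
    values = []
    for _ in range(n_resamples):
        # Draw as many proteins as in the original set, with replacement.
        resample = [rng.choice(proteins) for _ in proteins]
        values.append(statistic(resample))
    return statistics.pstdev(values)


# Hypothetical toy sequences, not taken from the proteome on this page.
toy_proteins = ["MAAK", "AAL"]
err_aa = bootstrap_error(toy_proteins, lambda p: pair_permille(p, "AA"))
err_cc = bootstrap_error(toy_proteins, lambda p: pair_permille(p, "CC"))
print("AlaAla error:", err_aa)
print("CysCys error:", err_cc)
```

Under this scheme, a dipeptide that occurs in none of the proteins has a frequency of zero in every resample, so its estimated error is zero as well. With only two proteins, each resample can contain just a few distinct combinations, which is why many error values in the table repeat.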

The dipeptide statistics above (among other statistics for this proteome) can be downloaded from this CSV file
See this proteome in: uniprot_link
Proteome-pI is available under the Creative Commons Attribution-NoDerivs license; for more details see here

Reference: Kozlowski LP. Proteome-pI 2.0: Proteome Isoelectric Point Database Update. Nucleic Acids Res. 2021, doi: 10.1093/nar/gkab944
Contact: Lukasz P. Kozlowski