Amino acid dipeptide frequency for Sewage-associated gemycircularvirus 2

Besides single amino acid frequencies, one can also calculate the so-called amino acid dipeptide frequency: how often each ordered pair of adjacent residues occurs.

There are 400 possible dipeptides (441 if Xaa is counted as an additional 21st amino acid representing all non-standard or unknown amino acids). If dipeptides occurred at random with equal probability, each would therefore appear with a frequency of 0.25% (2.5‰). This is not the case in nature, so for better visibility dipeptides that are more frequent than expected are marked in red in the table, and those that are underrepresented are marked in blue.
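As a rough illustration of the counting described above, the following Python sketch tallies overlapping residue pairs and compares each frequency with the uniform expectation of 1/400. It is not the database's own code: the function names, the one-letter input alphabet, the mapping of unknown letters to "X", and the choice not to count pairs across protein boundaries are all assumptions made for this example.

```python
from collections import Counter
from itertools import product

# One-letter codes in the column order used by the table (Ala, Cys, Asp, ...).
STANDARD = "ACDEFGHIKLMNPQRSTVWY"


def dipeptide_permille(proteins):
    """Return {dipeptide: frequency in per mille} for all 441 pairs.

    Assumptions (not stated in the source): pairs are counted with a
    sliding window of width 2, pairs never span two proteins, and any
    letter outside the 20 standard ones is mapped to "X" (Xaa).
    """
    counts = Counter()
    for seq in proteins:
        clean = [aa if aa in STANDARD else "X" for aa in seq.upper()]
        counts.update(a + b for a, b in zip(clean, clean[1:]))
    total = sum(counts.values())
    if total == 0:
        return {}
    return {a + b: 1000.0 * counts[a + b] / total
            for a, b in product(STANDARD + "X", repeat=2)}


def classify(freq_permille, expected=1000.0 / 400):
    """Label a frequency relative to the uniform expectation (2.5 per mille)."""
    if freq_permille > expected:
        return "over"      # marked in red in the table
    if freq_permille < expected:
        return "under"     # marked in blue in the table
    return "expected"
```

For instance, `dipeptide_permille(["MAAR", "ARR"])["AR"]` gives 400.0, since AR accounts for 2 of the 5 pairs in these toy sequences, and `classify(6.105)` returns "over".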

All values are given in per mille (‰), so they need to be multiplied by 10⁻³ to obtain fractions (for example, 6.105‰ = 0.006105, or 0.6105%).

For more information, see the sequence space article on Wikipedia.

Ala Cys Asp Glu Phe Gly His Ile Lys Leu Met Asn Pro Gln Arg Ser Thr Val Trp Tyr Xaa
Ala
AlaAla: 6.105 ± 1.553
AlaCys: 0.0 ± 0.0
AlaAsp: 2.442 ± 0.9
AlaGlu: 12.21 ± 3.106
AlaPhe: 2.442 ± 1.182
AlaGly: 3.663 ± 0.482
AlaHis: 0.0 ± 0.0
AlaIle: 0.0 ± 0.0
AlaLys: 0.0 ± 0.0
AlaLeu: 1.221 ± 0.919
AlaMet: 0.0 ± 0.0
AlaAsn: 2.442 ± 1.182
AlaPro: 3.663 ± 0.482
AlaGln: 3.663 ± 1.603
AlaArg: 7.326 ± 3.143
AlaSer: 2.442 ± 1.839
AlaThr: 8.547 ± 0.83
AlaVal: 3.663 ± 1.603
AlaTrp: 1.221 ± 0.919
AlaTyr: 6.105 ± 0.429
AlaXaa: 0.0 ± 0.0
Cys
CysAla: 0.0 ± 0.0
CysCys: 0.0 ± 0.0
CysAsp: 1.221 ± 0.917
CysGlu: 0.0 ± 0.0
CysPhe: 2.442 ± 1.182
CysGly: 0.0 ± 0.0
CysHis: 1.221 ± 0.917
CysIle: 6.105 ± 2.664
CysLys: 0.0 ± 0.0
CysLeu: 2.442 ± 1.182
CysMet: 1.221 ± 0.919
CysAsn: 0.0 ± 0.0
CysPro: 0.0 ± 0.0
CysGln: 2.442 ± 1.182
CysArg: 0.0 ± 0.0
CysSer: 2.442 ± 1.182
CysThr: 0.0 ± 0.0
CysVal: 0.0 ± 0.0
CysTrp: 0.0 ± 0.0
CysTyr: 0.0 ± 0.0
CysXaa: 0.0 ± 0.0
Asp
AspAla: 1.221 ± 0.919
AspCys: 0.0 ± 0.0
AspAsp: 1.221 ± 0.917
AspGlu: 4.884 ± 0.729
AspPhe: 3.663 ± 1.603
AspGly: 6.105 ± 1.553
AspHis: 0.0 ± 0.0
AspIle: 2.442 ± 0.9
AspLys: 1.221 ± 0.917
AspLeu: 2.442 ± 1.182
AspMet: 4.884 ± 0.729
AspAsn: 1.221 ± 0.919
AspPro: 8.547 ± 2.315
AspGln: 2.442 ± 1.839
AspArg: 4.884 ± 2.363
AspSer: 3.663 ± 1.572
AspThr: 3.663 ± 2.758
AspVal: 9.768 ± 2.89
AspTrp: 3.663 ± 1.567
AspTyr: 3.663 ± 0.482
AspXaa: 0.0 ± 0.0
Glu
GluAla: 2.442 ± 1.182
GluCys: 2.442 ± 1.182
GluAsp: 6.105 ± 2.664
GluGlu: 0.0 ± 0.0
GluPhe: 3.663 ± 1.603
GluGly: 8.547 ± 0.83
GluHis: 0.0 ± 0.0
GluIle: 4.884 ± 2.363
GluLys: 0.0 ± 0.0
GluLeu: 2.442 ± 1.182
GluMet: 0.0 ± 0.0
GluAsn: 1.221 ± 0.919
GluPro: 3.663 ± 0.482
GluGln: 0.0 ± 0.0
GluArg: 8.547 ± 0.83
GluSer: 2.442 ± 0.9
GluThr: 4.884 ± 0.872
GluVal: 0.0 ± 0.0
GluTrp: 2.442 ± 1.182
GluTyr: 3.663 ± 1.603
GluXaa: 0.0 ± 0.0
Phe
PheAla: 4.884 ± 2.363
PheCys: 1.221 ± 0.919
PheAsp: 4.884 ± 0.729
PheGlu: 0.0 ± 0.0
PhePhe: 3.663 ± 1.603
PheGly: 3.663 ± 1.603
PheHis: 1.221 ± 0.917
PheIle: 1.221 ± 0.917
PheLys: 1.221 ± 1.379
PheLeu: 2.442 ± 1.182
PheMet: 0.0 ± 0.0
PheAsn: 3.663 ± 0.482
PhePro: 0.0 ± 0.0
PheGln: 0.0 ± 0.0
PheArg: 4.884 ± 0.872
PheSer: 2.442 ± 0.9
PheThr: 1.221 ± 0.919
PheVal: 3.663 ± 0.482
PheTrp: 2.442 ± 0.9
PheTyr: 4.884 ± 0.729
PheXaa: 0.0 ± 0.0
Gly
GlyAla: 7.326 ± 1.235
GlyCys: 0.0 ± 0.0
GlyAsp: 4.884 ± 0.872
GlyGlu: 3.663 ± 1.603
GlyPhe: 2.442 ± 1.839
GlyGly: 7.326 ± 1.745
GlyHis: 0.0 ± 0.0
GlyIle: 8.547 ± 2.717
GlyLys: 2.442 ± 1.833
GlyLeu: 8.547 ± 2.315
GlyMet: 0.0 ± 0.0
GlyAsn: 3.663 ± 1.567
GlyPro: 4.884 ± 2.413
GlyGln: 2.442 ± 1.839
GlyArg: 13.431 ± 3.175
GlySer: 1.221 ± 0.919
GlyThr: 3.663 ± 1.737
GlyVal: 2.442 ± 1.839
GlyTrp: 0.0 ± 0.0
GlyTyr: 1.221 ± 0.919
GlyXaa: 0.0 ± 0.0
His
HisAla: 2.442 ± 0.9
HisCys: 2.442 ± 1.182
HisAsp: 0.0 ± 0.0
HisGlu: 3.663 ± 0.482
HisPhe: 0.0 ± 0.0
HisGly: 0.0 ± 0.0
HisHis: 0.0 ± 0.0
HisIle: 0.0 ± 0.0
HisLys: 0.0 ± 0.0
HisLeu: 2.442 ± 1.182
HisMet: 0.0 ± 0.0
HisAsn: 1.221 ± 0.917
HisPro: 4.884 ± 2.363
HisGln: 0.0 ± 0.0
HisArg: 2.442 ± 1.182
HisSer: 1.221 ± 0.917
HisThr: 0.0 ± 0.0
HisVal: 2.442 ± 1.182
HisTrp: 0.0 ± 0.0
HisTyr: 0.0 ± 0.0
HisXaa: 0.0 ± 0.0
Ile
IleAla: 0.0 ± 0.0
IleCys: 0.0 ± 0.0
IleAsp: 2.442 ± 0.9
IleGlu: 2.442 ± 1.182
IlePhe: 4.884 ± 0.872
IleGly: 7.326 ± 1.235
IleHis: 2.442 ± 1.182
IleIle: 2.442 ± 1.182
IleLys: 2.442 ± 1.182
IleLeu: 2.442 ± 1.839
IleMet: 2.442 ± 1.182
IleAsn: 2.442 ± 1.839
IlePro: 1.221 ± 0.919
IleGln: 0.0 ± 0.0
IleArg: 2.442 ± 1.839
IleSer: 2.442 ± 1.182
IleThr: 4.884 ± 0.729
IleVal: 2.442 ± 1.182
IleTrp: 2.442 ± 1.182
IleTyr: 3.663 ± 0.482
IleXaa: 0.0 ± 0.0
Lys
LysAla: 1.221 ± 0.917
LysCys: 0.0 ± 0.0
LysAsp: 2.442 ± 1.833
LysGlu: 6.105 ± 1.553
LysPhe: 1.221 ± 0.917
LysGly: 2.442 ± 0.9
LysHis: 0.0 ± 0.0
LysIle: 1.221 ± 1.379
LysLys: 2.442 ± 1.839
LysLeu: 2.442 ± 1.182
LysMet: 0.0 ± 0.0
LysAsn: 1.221 ± 0.919
LysPro: 3.663 ± 0.482
LysGln: 1.221 ± 0.917
LysArg: 1.221 ± 0.917
LysSer: 0.0 ± 0.0
LysThr: 1.221 ± 0.917
LysVal: 3.663 ± 0.482
LysTrp: 0.0 ± 0.0
LysTyr: 3.663 ± 1.603
LysXaa: 0.0 ± 0.0
Leu
LeuAla: 4.884 ± 0.872
LeuCys: 4.884 ± 2.363
LeuAsp: 4.884 ± 2.33
LeuGlu: 4.884 ± 2.33
LeuPhe: 1.221 ± 0.917
LeuGly: 6.105 ± 1.585
LeuHis: 4.884 ± 2.363
LeuIle: 1.221 ± 0.917
LeuLys: 1.221 ± 0.917
LeuLeu: 4.884 ± 2.413
LeuMet: 0.0 ± 0.0
LeuAsn: 1.221 ± 0.919
LeuPro: 0.0 ± 0.0
LeuGln: 0.0 ± 0.0
LeuArg: 9.768 ± 1.744
LeuSer: 4.884 ± 2.363
LeuThr: 0.0 ± 0.0
LeuVal: 6.105 ± 2.385
LeuTrp: 1.221 ± 0.917
LeuTyr: 2.442 ± 1.839
LeuXaa: 0.0 ± 0.0
Met
MetAla: 4.884 ± 0.872
MetCys: 0.0 ± 0.0
MetAsp: 1.221 ± 0.919
MetGlu: 0.0 ± 0.0
MetPhe: 2.442 ± 1.182
MetGly: 1.221 ± 0.917
MetHis: 0.0 ± 0.0
MetIle: 0.0 ± 0.0
MetLys: 1.221 ± 0.919
MetLeu: 1.221 ± 0.919
MetMet: 2.442 ± 0.9
MetAsn: 0.0 ± 0.0
MetPro: 3.663 ± 0.482
MetGln: 0.0 ± 0.0
MetArg: 0.0 ± 0.0
MetSer: 0.0 ± 0.0
MetThr: 1.221 ± 0.919
MetVal: 0.0 ± 0.0
MetTrp: 0.0 ± 0.0
MetTyr: 1.221 ± 0.919
MetXaa: 0.0 ± 0.0
Asn
AsnAla: 2.442 ± 1.182
AsnCys: 1.221 ± 0.917
AsnAsp: 3.663 ± 2.758
AsnGlu: 0.0 ± 0.0
AsnPhe: 0.0 ± 0.0
AsnGly: 0.0 ± 0.0
AsnHis: 1.221 ± 0.917
AsnIle: 0.0 ± 0.0
AsnLys: 2.442 ± 0.9
AsnLeu: 3.663 ± 0.482
AsnMet: 0.0 ± 0.0
AsnAsn: 0.0 ± 0.0
AsnPro: 1.221 ± 0.919
AsnGln: 0.0 ± 0.0
AsnArg: 1.221 ± 0.919
AsnSer: 6.105 ± 1.726
AsnThr: 1.221 ± 0.919
AsnVal: 4.884 ± 0.872
AsnTrp: 0.0 ± 0.0
AsnTyr: 0.0 ± 0.0
AsnXaa: 0.0 ± 0.0
Pro
ProAla: 1.221 ± 0.919
ProCys: 1.221 ± 0.917
ProAsp: 4.884 ± 2.363
ProGlu: 3.663 ± 0.482
ProPhe: 0.0 ± 0.0
ProGly: 2.442 ± 1.839
ProHis: 0.0 ± 0.0
ProIle: 2.442 ± 1.839
ProLys: 3.663 ± 1.603
ProLeu: 2.442 ± 1.839
ProMet: 3.663 ± 2.758
ProAsn: 4.884 ± 2.363
ProPro: 1.221 ± 0.919
ProGln: 1.221 ± 0.919
ProArg: 3.663 ± 1.603
ProSer: 2.442 ± 1.839
ProThr: 4.884 ± 2.413
ProVal: 0.0 ± 0.0
ProTrp: 4.884 ± 2.363
ProTyr: 3.663 ± 0.482
ProXaa: 0.0 ± 0.0
Gln
GlnAla: 0.0 ± 0.0
GlnCys: 2.442 ± 1.182
GlnAsp: 0.0 ± 0.0
GlnGlu: 0.0 ± 0.0
GlnPhe: 1.221 ± 0.917
GlnGly: 2.442 ± 1.839
GlnHis: 4.884 ± 2.363
GlnIle: 0.0 ± 0.0
GlnLys: 1.221 ± 0.917
GlnLeu: 2.442 ± 1.182
GlnMet: 0.0 ± 0.0
GlnAsn: 0.0 ± 0.0
GlnPro: 2.442 ± 1.839
GlnGln: 2.442 ± 1.182
GlnArg: 0.0 ± 0.0
GlnSer: 1.221 ± 0.919
GlnThr: 1.221 ± 0.919
GlnVal: 2.442 ± 1.839
GlnTrp: 0.0 ± 0.0
GlnTyr: 1.221 ± 0.919
GlnXaa: 0.0 ± 0.0
Arg
ArgAla: 2.442 ± 0.9
ArgCys: 0.0 ± 0.0
ArgAsp: 3.663 ± 2.758
ArgGlu: 6.105 ± 1.553
ArgPhe: 3.663 ± 2.758
ArgGly: 4.884 ± 3.678
ArgHis: 2.442 ± 1.182
ArgIle: 4.884 ± 0.872
ArgLys: 0.0 ± 0.0
ArgLeu: 3.663 ± 1.567
ArgMet: 1.221 ± 1.642
ArgAsn: 0.0 ± 0.0
ArgPro: 7.326 ± 1.235
ArgGln: 0.0 ± 0.0
ArgArg: 18.315 ± 7.98
ArgSer: 13.431 ± 1.605
ArgThr: 7.326 ± 0.965
ArgVal: 15.873 ± 1.55
ArgTrp: 2.442 ± 1.182
ArgTyr: 2.442 ± 1.182
ArgXaa: 0.0 ± 0.0
Ser
SerAla: 4.884 ± 0.872
SerCys: 0.0 ± 0.0
SerAsp: 3.663 ± 0.482
SerGlu: 0.0 ± 0.0
SerPhe: 3.663 ± 0.482
SerGly: 4.884 ± 3.678
SerHis: 1.221 ± 0.919
SerIle: 3.663 ± 0.482
SerLys: 4.884 ± 0.872
SerLeu: 10.989 ± 3.832
SerMet: 0.0 ± 0.0
SerAsn: 0.0 ± 0.0
SerPro: 1.221 ± 0.919
SerGln: 1.221 ± 0.919
SerArg: 6.105 ± 1.726
SerSer: 2.442 ± 1.839
SerThr: 1.221 ± 0.919
SerVal: 1.221 ± 0.919
SerTrp: 0.0 ± 0.0
SerTyr: 2.442 ± 0.9
SerXaa: 0.0 ± 0.0
Thr
ThrAla: 6.105 ± 1.726
ThrCys: 0.0 ± 0.0
ThrAsp: 4.884 ± 0.729
ThrGlu: 3.663 ± 0.482
ThrPhe: 4.884 ± 1.86
ThrGly: 3.663 ± 0.482
ThrHis: 0.0 ± 0.0
ThrIle: 6.105 ± 1.726
ThrLys: 0.0 ± 0.0
ThrLeu: 2.442 ± 0.9
ThrMet: 1.221 ± 0.919
ThrAsn: 3.663 ± 2.758
ThrPro: 4.884 ± 0.872
ThrGln: 0.0 ± 0.0
ThrArg: 6.105 ± 3.296
ThrSer: 0.0 ± 0.0
ThrThr: 1.221 ± 0.919
ThrVal: 4.884 ± 2.413
ThrTrp: 4.884 ± 0.872
ThrTyr: 2.442 ± 1.182
ThrXaa: 0.0 ± 0.0
Val
ValAla: 6.105 ± 1.553
ValCys: 0.0 ± 0.0
ValAsp: 12.21 ± 4.056
ValGlu: 4.884 ± 2.33
ValPhe: 4.884 ± 0.729
ValGly: 9.768 ± 2.89
ValHis: 0.0 ± 0.0
ValIle: 3.663 ± 2.758
ValLys: 0.0 ± 0.0
ValLeu: 1.221 ± 0.917
ValMet: 1.221 ± 0.749
ValAsn: 1.221 ± 0.919
ValPro: 0.0 ± 0.0
ValGln: 3.663 ± 0.482
ValArg: 4.884 ± 3.678
ValSer: 3.663 ± 2.758
ValThr: 8.547 ± 1.068
ValVal: 4.884 ± 2.363
ValTrp: 0.0 ± 0.0
ValTyr: 0.0 ± 0.0
ValXaa: 0.0 ± 0.0
Trp
TrpAla: 3.663 ± 1.603
TrpCys: 0.0 ± 0.0
TrpAsp: 0.0 ± 0.0
TrpGlu: 0.0 ± 0.0
TrpPhe: 0.0 ± 0.0
TrpGly: 1.221 ± 0.917
TrpHis: 1.221 ± 0.919
TrpIle: 0.0 ± 0.0
TrpLys: 7.326 ± 3.545
TrpLeu: 2.442 ± 1.833
TrpMet: 0.0 ± 0.0
TrpAsn: 0.0 ± 0.0
TrpPro: 0.0 ± 0.0
TrpGln: 1.221 ± 0.919
TrpArg: 1.221 ± 0.919
TrpSer: 0.0 ± 0.0
TrpThr: 4.884 ± 0.872
TrpVal: 2.442 ± 1.182
TrpTrp: 0.0 ± 0.0
TrpTyr: 1.221 ± 0.919
TrpXaa: 0.0 ± 0.0
Tyr
TyrAla: 4.884 ± 2.33
TyrCys: 3.663 ± 1.603
TyrAsp: 4.884 ± 0.872
TyrGlu: 1.221 ± 0.919
TyrPhe: 1.221 ± 0.917
TyrGly: 2.442 ± 1.182
TyrHis: 2.442 ± 1.182
TyrIle: 2.442 ± 1.839
TyrLys: 3.663 ± 1.567
TyrLeu: 2.442 ± 1.182
TyrMet: 1.221 ± 1.08
TyrAsn: 1.221 ± 0.919
TyrPro: 0.0 ± 0.0
TyrGln: 3.663 ± 0.482
TyrArg: 4.884 ± 0.872
TyrSer: 1.221 ± 0.919
TyrThr: 1.221 ± 0.919
TyrVal: 0.0 ± 0.0
TyrTrp: 1.221 ± 0.919
TyrTyr: 1.221 ± 0.919
TyrXaa: 0.0 ± 0.0
Xaa
XaaAla: 0.0 ± 0.0
XaaCys: 0.0 ± 0.0
XaaAsp: 0.0 ± 0.0
XaaGlu: 0.0 ± 0.0
XaaPhe: 0.0 ± 0.0
XaaGly: 0.0 ± 0.0
XaaHis: 0.0 ± 0.0
XaaIle: 0.0 ± 0.0
XaaLys: 0.0 ± 0.0
XaaLeu: 0.0 ± 0.0
XaaMet: 0.0 ± 0.0
XaaAsn: 0.0 ± 0.0
XaaPro: 0.0 ± 0.0
XaaGln: 0.0 ± 0.0
XaaArg: 0.0 ± 0.0
XaaSer: 0.0 ± 0.0
XaaThr: 0.0 ± 0.0
XaaVal: 0.0 ± 0.0
XaaTrp: 0.0 ± 0.0
XaaTyr: 0.0 ± 0.0
XaaXaa: 0.0 ± 0.0
Statistics based on 3 proteins (820 amino acids)
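
To work with the table entries programmatically, a small parser along the following lines may help. This is only a sketch: the pattern `ROW` and the function `parse_rows` are hypothetical names, and the pattern assumes rows of the form "AlaAla: 6.105 ± 1.553", optionally preceded by a repeated copy of the value.

```python
import re

# The optional leading number covers rows where the cell value is repeated
# before the label, e.g. "6.105AlaAla: 6.105 ± 1.553".
ROW = re.compile(
    r"^\s*(?:\d+(?:\.\d+)?)?"
    r"(?P<first>[A-Z][a-z]{2})(?P<second>[A-Z][a-z]{2}):\s*"
    r"(?P<value>\d+(?:\.\d+)?)\s*±\s*(?P<error>\d+(?:\.\d+)?)\s*$"
)


def parse_rows(lines):
    """Return {(first, second): (value, error)} for every line matching ROW.

    Lines that do not look like table entries (row labels, the column
    header, blank lines) are skipped silently.
    """
    table = {}
    for line in lines:
        m = ROW.match(line)
        if m:
            key = (m["first"], m["second"])
            table[key] = (float(m["value"]), float(m["error"]))
    return table
```

With the parsed dictionary, `max(table, key=lambda k: table[k][0])` picks the most frequent dipeptide among the parsed rows.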

Note: The error was estimated by bootstrapping (100 replicates) at the protein level.
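
One way such a protein-level bootstrap could be set up is sketched below in Python. The source does not describe the exact procedure, so several points are assumptions: whole proteins are resampled with replacement, the frequency is recomputed for each replicate, and the reported error is taken to be the standard deviation over replicates. The names `pair_counts` and `bootstrap_error` are hypothetical.

```python
import random
import statistics
from collections import Counter


def pair_counts(seq):
    """Count overlapping residue pairs within one protein."""
    return Counter(a + b for a, b in zip(seq, seq[1:]))


def bootstrap_error(proteins, dipeptide, replicates=100, seed=0):
    """Estimate the spread of one dipeptide frequency (per mille).

    Each replicate draws len(proteins) proteins with replacement and
    recomputes the frequency. Returning the standard deviation over the
    replicates is an assumption, not a documented detail of the database.
    """
    rng = random.Random(seed)
    per_protein = [pair_counts(p) for p in proteins]
    if not per_protein:
        return 0.0
    values = []
    for _ in range(replicates):
        sample = rng.choices(per_protein, k=len(per_protein))
        total = sum(sum(c.values()) for c in sample)
        hits = sum(c[dipeptide] for c in sample)
        values.append(1000.0 * hits / total if total else 0.0)
    return statistics.pstdev(values)
```

With only 3 proteins, as here, the replicates can take just a handful of distinct compositions, which may be why many entries in the table share the same error values.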

The dipeptide statistics above (along with other statistics for this proteome) can be downloaded from this CSV file
See this proteome in: uniprot_link
Proteome-pI is available under Creative Commons Attribution-NoDerivs license, for more details see here

Reference: Kozlowski LP. Proteome-pI 2.0: Proteome Isoelectric Point Database Update. Nucleic Acids Res. 2021, doi: 10.1093/nar/gkab944
Contact: Lukasz P. Kozlowski