Amino acid dipeptide frequency for Sewage-associated circular DNA virus-25

Apart from single amino acid frequencies, one can also calculate the so-called amino acid dipeptide frequency, i.e. how often each ordered pair of adjacent amino acids occurs.

There are 400 possible dipeptides (441 if Xaa is counted as an additional, 21st amino acid representing all non-standard or unknown amino acids). Thus, if dipeptides occurred in proteins at random and with equal probability, each amino acid dipeptide would be present at a frequency of 0.25% (2.5‰). As this is not the case in nature, for better visibility all dipeptides that are more frequent than expected are marked in red in the table, and those that are underrepresented are marked in blue.
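The calculation described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the Proteome-pI implementation: the function names (`dipeptide_permille`, `classify`), the toy sequences, and the choice to count overlapping pairs within each protein separately are all assumptions made for the example.

```python
from collections import Counter

# One-letter to three-letter codes for the 20 standard amino acids.
# Anything else is mapped to Xaa (non-standard or unknown residue).
ONE_TO_THREE = {
    "A": "Ala", "C": "Cys", "D": "Asp", "E": "Glu", "F": "Phe", "G": "Gly",
    "H": "His", "I": "Ile", "K": "Lys", "L": "Leu", "M": "Met", "N": "Asn",
    "P": "Pro", "Q": "Gln", "R": "Arg", "S": "Ser", "T": "Thr", "V": "Val",
    "W": "Trp", "Y": "Tyr",
}

# Uniform expectation: 1 of 400 dipeptides, i.e. 0.25% or 2.5 per mille.
EXPECTED_PERMILLE = 1000.0 / 400


def dipeptide_permille(proteins):
    """Return {dipeptide: frequency in per mille} over all given proteins."""
    counts = Counter()
    for seq in proteins:
        codes = [ONE_TO_THREE.get(res, "Xaa") for res in seq.upper()]
        # Overlapping pairs inside one protein; pairs never span two proteins
        # (an assumption of this sketch).
        counts.update(a + b for a, b in zip(codes, codes[1:]))
    total = sum(counts.values())
    if total == 0:
        return {}
    return {pair: 1000.0 * n / total for pair, n in counts.items()}


def classify(value_permille, expected=EXPECTED_PERMILLE):
    """Label a frequency relative to the uniform expectation."""
    if value_permille > expected:
        return "over"
    if value_permille < expected:
        return "under"
    return "expected"


# Hypothetical toy sequences, not the proteins of this proteome.
toy = ["MKTAAT", "AATK"]
freqs = dipeptide_permille(toy)
for pair, value in sorted(freqs.items()):
    print(f"{pair}: {value:.3f} ({classify(value)})")
```

Applied to values from the table, `classify(10.929)` labels AlaAla as overrepresented and `classify(1.093)` labels AlaSer as underrepresented, assuming the colouring on the page uses the same 2.5‰ threshold.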

All values are given in per mille (‰) and therefore need to be multiplied by 10⁻³ to obtain fractions.
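As a small worked example of the unit conversion, the sketch below turns a per mille value into a fraction and a percentage. The helper names are hypothetical and exist only for this illustration.

```python
def permille_to_fraction(value_permille):
    """Convert a per mille value to a plain fraction (multiply by 10^-3)."""
    return value_permille / 1000.0


def permille_to_percent(value_permille):
    """Convert a per mille value to a percentage (divide by 10)."""
    return value_permille / 10.0


def percent_to_permille(value_percent):
    """Convert a percentage to per mille (multiply by 10)."""
    return value_percent * 10.0


# ThrThr is listed in the table as 14.208 per mille.
thr_thr = 14.208
print(f"fraction: {permille_to_fraction(thr_thr):.6f}")
print(f"percent:  {permille_to_percent(thr_thr):.4f}")
# The uniform expectation of 0.25% expressed in per mille.
print(f"expected: {percent_to_permille(0.25):.1f}")
```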

For more information, see the sequence space article on Wikipedia.

Ala Cys Asp Glu Phe Gly His Ile Lys Leu Met Asn Pro Gln Arg Ser Thr Val Trp Tyr Xaa
Ala
AlaAla: 10.929 ± 2.919
AlaCys: 0.0 ± 0.0
AlaAsp: 4.372 ± 1.685
AlaGlu: 3.279 ± 1.604
AlaPhe: 5.464 ± 0.941
AlaGly: 6.557 ± 2.517
AlaHis: 2.186 ± 0.663
AlaIle: 4.372 ± 1.678
AlaLys: 3.279 ± 0.391
AlaLeu: 6.557 ± 2.108
AlaMet: 2.186 ± 1.455
AlaAsn: 2.186 ± 0.663
AlaPro: 7.65 ± 1.648
AlaGln: 3.279 ± 1.054
AlaArg: 9.836 ± 3.652
AlaSer: 1.093 ± 0.728
AlaThr: 2.186 ± 1.101
AlaVal: 3.279 ± 2.183
AlaTrp: 1.093 ± 0.728
AlaTyr: 5.464 ± 2.841
AlaXaa: 0.0 ± 0.0
Cys
CysAla: 0.0 ± 0.0
CysCys: 0.0 ± 0.0
CysAsp: 1.093 ± 0.91
CysGlu: 1.093 ± 0.728
CysPhe: 1.093 ± 0.728
CysGly: 0.0 ± 0.0
CysHis: 0.0 ± 0.0
CysIle: 2.186 ± 0.839
CysLys: 0.0 ± 0.0
CysLeu: 2.186 ± 1.101
CysMet: 1.093 ± 0.92
CysAsn: 0.0 ± 0.0
CysPro: 1.093 ± 0.728
CysGln: 0.0 ± 0.0
CysArg: 0.0 ± 0.0
CysSer: 0.0 ± 0.0
CysThr: 1.093 ± 0.728
CysVal: 0.0 ± 0.0
CysTrp: 1.093 ± 0.728
CysTyr: 0.0 ± 0.0
CysXaa: 0.0 ± 0.0
Asp
AspAla: 2.186 ± 1.455
AspCys: 0.0 ± 0.0
AspAsp: 0.0 ± 0.0
AspGlu: 1.093 ± 0.728
AspPhe: 1.093 ± 0.92
AspGly: 1.093 ± 0.728
AspHis: 0.0 ± 0.0
AspIle: 6.557 ± 1.477
AspLys: 4.372 ± 1.121
AspLeu: 7.65 ± 2.55
AspMet: 2.186 ± 1.566
AspAsn: 1.093 ± 0.91
AspPro: 5.464 ± 1.604
AspGln: 2.186 ± 1.455
AspArg: 2.186 ± 1.101
AspSer: 1.093 ± 0.92
AspThr: 5.464 ± 1.484
AspVal: 3.279 ± 1.416
AspTrp: 0.0 ± 0.0
AspTyr: 1.093 ± 0.728
AspXaa: 0.0 ± 0.0
Glu
GluAla: 3.279 ± 2.183
GluCys: 0.0 ± 0.0
GluAsp: 3.279 ± 1.273
GluGlu: 4.372 ± 1.326
GluPhe: 4.372 ± 0.389
GluGly: 1.093 ± 0.728
GluHis: 0.0 ± 0.0
GluIle: 3.279 ± 0.391
GluLys: 1.093 ± 0.91
GluLeu: 3.279 ± 1.416
GluMet: 0.0 ± 0.0
GluAsn: 1.093 ± 0.91
GluPro: 3.279 ± 0.391
GluGln: 3.279 ± 1.054
GluArg: 2.186 ± 1.455
GluSer: 2.186 ± 1.101
GluThr: 0.0 ± 0.0
GluVal: 6.557 ± 1.127
GluTrp: 2.186 ± 0.663
GluTyr: 3.279 ± 1.799
GluXaa: 0.0 ± 0.0
Phe
PheAla: 4.372 ± 2.203
PheCys: 1.093 ± 0.91
PheAsp: 3.279 ± 1.604
PheGlu: 2.186 ± 0.663
PhePhe: 2.186 ± 0.663
PheGly: 2.186 ± 1.841
PheHis: 0.0 ± 0.0
PheIle: 2.186 ± 0.663
PheLys: 0.0 ± 0.0
PheLeu: 2.186 ± 0.663
PheMet: 0.0 ± 0.0
PheAsn: 2.186 ± 1.101
PhePro: 2.186 ± 1.455
PheGln: 2.186 ± 1.455
PheArg: 0.0 ± 0.0
PheSer: 5.464 ± 2.005
PheThr: 3.279 ± 1.054
PheVal: 4.372 ± 1.121
PheTrp: 0.0 ± 0.0
PheTyr: 2.186 ± 1.101
PheXaa: 0.0 ± 0.0
Gly
GlyAla: 9.836 ± 2.564
GlyCys: 1.093 ± 0.728
GlyAsp: 2.186 ± 1.101
GlyGlu: 2.186 ± 1.101
GlyPhe: 1.093 ± 0.92
GlyGly: 2.186 ± 0.839
GlyHis: 0.0 ± 0.0
GlyIle: 3.279 ± 1.273
GlyLys: 4.372 ± 1.678
GlyLeu: 3.279 ± 1.799
GlyMet: 0.0 ± 0.0
GlyAsn: 1.093 ± 0.728
GlyPro: 3.279 ± 1.273
GlyGln: 2.186 ± 1.455
GlyArg: 4.372 ± 0.389
GlySer: 3.279 ± 1.604
GlyThr: 6.557 ± 2.546
GlyVal: 1.093 ± 0.728
GlyTrp: 0.0 ± 0.0
GlyTyr: 3.279 ± 0.391
GlyXaa: 0.0 ± 0.0
His
HisAla: 0.0 ± 0.0
HisCys: 0.0 ± 0.0
HisAsp: 1.093 ± 0.91
HisGlu: 1.093 ± 0.92
HisPhe: 0.0 ± 0.0
HisGly: 2.186 ± 0.663
HisHis: 0.0 ± 0.0
HisIle: 1.093 ± 0.728
HisLys: 0.0 ± 0.0
HisLeu: 1.093 ± 0.91
HisMet: 2.186 ± 0.839
HisAsn: 0.0 ± 0.0
HisPro: 3.279 ± 1.416
HisGln: 0.0 ± 0.0
HisArg: 1.093 ± 0.728
HisSer: 0.0 ± 0.0
HisThr: 0.0 ± 0.0
HisVal: 4.372 ± 0.389
HisTrp: 0.0 ± 0.0
HisTyr: 1.093 ± 0.728
HisXaa: 0.0 ± 0.0
Ile
IleAla: 4.372 ± 1.685
IleCys: 1.093 ± 0.92
IleAsp: 2.186 ± 1.455
IleGlu: 2.186 ± 1.455
IlePhe: 3.279 ± 1.416
IleGly: 3.279 ± 1.273
IleHis: 0.0 ± 0.0
IleIle: 2.186 ± 1.101
IleLys: 5.464 ± 2.005
IleLeu: 4.372 ± 2.203
IleMet: 0.0 ± 0.0
IleAsn: 9.836 ± 4.619
IlePro: 5.464 ± 2.016
IleGln: 3.279 ± 0.391
IleArg: 6.557 ± 3.598
IleSer: 4.372 ± 2.658
IleThr: 7.65 ± 1.74
IleVal: 1.093 ± 0.91
IleTrp: 1.093 ± 0.728
IleTyr: 4.372 ± 2.203
IleXaa: 0.0 ± 0.0
Lys
LysAla: 3.279 ± 0.391
LysCys: 0.0 ± 0.0
LysAsp: 0.0 ± 0.0
LysGlu: 4.372 ± 2.287
LysPhe: 1.093 ± 0.91
LysGly: 5.464 ± 0.582
LysHis: 1.093 ± 0.92
LysIle: 2.186 ± 1.841
LysLys: 2.186 ± 0.839
LysLeu: 2.186 ± 0.663
LysMet: 0.0 ± 0.0
LysAsn: 1.093 ± 0.91
LysPro: 1.093 ± 0.728
LysGln: 1.093 ± 0.91
LysArg: 4.372 ± 1.678
LysSer: 5.464 ± 2.857
LysThr: 8.743 ± 0.736
LysVal: 0.0 ± 0.0
LysTrp: 1.093 ± 0.728
LysTyr: 4.372 ± 1.249
LysXaa: 0.0 ± 0.0
Leu
LeuAla: 3.279 ± 1.273
LeuCys: 0.0 ± 0.0
LeuAsp: 5.464 ± 2.005
LeuGlu: 4.372 ± 1.326
LeuPhe: 2.186 ± 0.663
LeuGly: 2.186 ± 1.841
LeuHis: 0.0 ± 0.0
LeuIle: 6.557 ± 4.386
LeuLys: 1.093 ± 0.92
LeuLeu: 4.372 ± 2.287
LeuMet: 0.0 ± 0.0
LeuAsn: 4.372 ± 0.389
LeuPro: 8.743 ± 1.984
LeuGln: 2.186 ± 1.455
LeuArg: 2.186 ± 0.663
LeuSer: 8.743 ± 3.248
LeuThr: 4.372 ± 2.287
LeuVal: 5.464 ± 2.373
LeuTrp: 1.093 ± 0.728
LeuTyr: 3.279 ± 0.391
LeuXaa: 0.0 ± 0.0
Met
MetAla: 2.186 ± 0.839
MetCys: 0.0 ± 0.0
MetAsp: 1.093 ± 0.728
MetGlu: 2.186 ± 1.455
MetPhe: 1.093 ± 0.92
MetGly: 1.093 ± 0.91
MetHis: 0.0 ± 0.0
MetIle: 0.0 ± 0.0
MetLys: 0.0 ± 0.0
MetLeu: 0.0 ± 0.0
MetMet: 0.0 ± 0.0
MetAsn: 1.093 ± 0.92
MetPro: 0.0 ± 0.0
MetGln: 1.093 ± 0.728
MetArg: 1.093 ± 0.728
MetSer: 2.186 ± 1.455
MetThr: 1.093 ± 0.92
MetVal: 0.0 ± 0.0
MetTrp: 0.0 ± 0.0
MetTyr: 1.093 ± 0.92
MetXaa: 0.0 ± 0.0
Asn
AsnAla: 5.464 ± 2.389
AsnCys: 0.0 ± 0.0
AsnAsp: 0.0 ± 0.0
AsnGlu: 7.65 ± 0.28
AsnPhe: 2.186 ± 1.101
AsnGly: 4.372 ± 0.389
AsnHis: 1.093 ± 0.91
AsnIle: 3.279 ± 1.799
AsnLys: 2.186 ± 0.663
AsnLeu: 1.093 ± 0.91
AsnMet: 2.186 ± 0.704
AsnAsn: 4.372 ± 1.326
AsnPro: 2.186 ± 0.663
AsnGln: 0.0 ± 0.0
AsnArg: 1.093 ± 0.92
AsnSer: 4.372 ± 2.287
AsnThr: 2.186 ± 1.82
AsnVal: 2.186 ± 0.663
AsnTrp: 0.0 ± 0.0
AsnTyr: 0.0 ± 0.0
AsnXaa: 0.0 ± 0.0
Pro
ProAla: 9.836 ± 2.693
ProCys: 1.093 ± 0.728
ProAsp: 3.279 ± 1.054
ProGlu: 1.093 ± 0.91
ProPhe: 0.0 ± 0.0
ProGly: 5.464 ± 0.941
ProHis: 4.372 ± 0.389
ProIle: 5.464 ± 2.016
ProLys: 4.372 ± 1.121
ProLeu: 6.557 ± 2.108
ProMet: 1.093 ± 0.728
ProAsn: 1.093 ± 0.728
ProPro: 2.186 ± 1.455
ProGln: 2.186 ± 1.455
ProArg: 1.093 ± 0.728
ProSer: 6.557 ± 1.82
ProThr: 6.557 ± 1.823
ProVal: 2.186 ± 0.663
ProTrp: 1.093 ± 0.728
ProTyr: 2.186 ± 1.82
ProXaa: 0.0 ± 0.0
Gln
GlnAla: 2.186 ± 1.455
GlnCys: 1.093 ± 0.728
GlnAsp: 1.093 ± 0.728
GlnGlu: 1.093 ± 0.728
GlnPhe: 1.093 ± 0.728
GlnGly: 3.279 ± 1.273
GlnHis: 0.0 ± 0.0
GlnIle: 2.186 ± 0.663
GlnLys: 3.279 ± 2.183
GlnLeu: 3.279 ± 1.054
GlnMet: 1.093 ± 0.728
GlnAsn: 1.093 ± 0.91
GlnPro: 1.093 ± 0.728
GlnGln: 1.093 ± 0.728
GlnArg: 4.372 ± 1.326
GlnSer: 0.0 ± 0.0
GlnThr: 1.093 ± 0.92
GlnVal: 4.372 ± 1.896
GlnTrp: 0.0 ± 0.0
GlnTyr: 3.279 ± 1.416
GlnXaa: 0.0 ± 0.0
Arg
ArgAla: 3.279 ± 1.054
ArgCys: 1.093 ± 0.92
ArgAsp: 6.557 ± 0.691
ArgGlu: 3.279 ± 1.054
ArgPhe: 6.557 ± 2.904
ArgGly: 2.186 ± 0.839
ArgHis: 1.093 ± 0.728
ArgIle: 5.464 ± 2.03
ArgLys: 1.093 ± 0.728
ArgLeu: 5.464 ± 0.941
ArgMet: 0.0 ± 0.0
ArgAsn: 0.0 ± 0.0
ArgPro: 4.372 ± 0.389
ArgGln: 2.186 ± 0.839
ArgArg: 2.186 ± 0.663
ArgSer: 2.186 ± 0.839
ArgThr: 2.186 ± 0.839
ArgVal: 2.186 ± 0.663
ArgTrp: 4.372 ± 0.389
ArgTyr: 5.464 ± 0.941
ArgXaa: 0.0 ± 0.0
Ser
SerAla: 4.372 ± 1.121
SerCys: 0.0 ± 0.0
SerAsp: 2.186 ± 1.455
SerGlu: 1.093 ± 0.91
SerPhe: 3.279 ± 1.814
SerGly: 3.279 ± 1.054
SerHis: 0.0 ± 0.0
SerIle: 9.836 ± 4.014
SerLys: 3.279 ± 1.799
SerLeu: 2.186 ± 1.841
SerMet: 1.093 ± 0.728
SerAsn: 8.743 ± 1.949
SerPro: 4.372 ± 0.389
SerGln: 4.372 ± 1.121
SerArg: 7.65 ± 0.28
SerSer: 6.557 ± 0.691
SerThr: 2.186 ± 1.841
SerVal: 6.557 ± 2.122
SerTrp: 1.093 ± 0.728
SerTyr: 2.186 ± 1.82
SerXaa: 0.0 ± 0.0
Thr
ThrAla: 7.65 ± 1.74
ThrCys: 2.186 ± 0.839
ThrAsp: 1.093 ± 0.91
ThrGlu: 2.186 ± 1.101
ThrPhe: 1.093 ± 0.91
ThrGly: 5.464 ± 2.389
ThrHis: 4.372 ± 1.685
ThrIle: 2.186 ± 1.82
ThrLys: 5.464 ± 2.159
ThrLeu: 4.372 ± 1.121
ThrMet: 0.0 ± 0.0
ThrAsn: 3.279 ± 1.814
ThrPro: 5.464 ± 2.575
ThrGln: 2.186 ± 1.455
ThrArg: 4.372 ± 1.896
ThrSer: 12.022 ± 3.413
ThrThr: 14.208 ± 4.713
ThrVal: 1.093 ± 0.728
ThrTrp: 0.0 ± 0.0
ThrTyr: 1.093 ± 0.92
ThrXaa: 0.0 ± 0.0
Val
ValAla: 3.279 ± 2.183
ValCys: 2.186 ± 0.663
ValAsp: 3.279 ± 1.604
ValGlu: 2.186 ± 1.455
ValPhe: 4.372 ± 1.685
ValGly: 1.093 ± 0.92
ValHis: 4.372 ± 1.121
ValIle: 6.557 ± 1.477
ValLys: 2.186 ± 0.839
ValLeu: 4.372 ± 0.389
ValMet: 0.0 ± 0.0
ValAsn: 4.372 ± 1.326
ValPro: 3.279 ± 1.416
ValGln: 1.093 ± 0.728
ValArg: 3.279 ± 1.054
ValSer: 2.186 ± 0.839
ValThr: 4.372 ± 1.685
ValVal: 4.372 ± 2.287
ValTrp: 0.0 ± 0.0
ValTyr: 1.093 ± 0.728
ValXaa: 0.0 ± 0.0
Trp
TrpAla: 0.0 ± 0.0
TrpCys: 0.0 ± 0.0
TrpAsp: 1.093 ± 0.92
TrpGlu: 0.0 ± 0.0
TrpPhe: 0.0 ± 0.0
TrpGly: 1.093 ± 0.728
TrpHis: 0.0 ± 0.0
TrpIle: 0.0 ± 0.0
TrpLys: 0.0 ± 0.0
TrpLeu: 2.186 ± 0.663
TrpMet: 0.0 ± 0.0
TrpAsn: 0.0 ± 0.0
TrpPro: 1.093 ± 0.91
TrpGln: 0.0 ± 0.0
TrpArg: 1.093 ± 0.728
TrpSer: 2.186 ± 1.455
TrpThr: 2.186 ± 1.455
TrpVal: 2.186 ± 1.455
TrpTrp: 0.0 ± 0.0
TrpTyr: 1.093 ± 0.728
TrpXaa: 0.0 ± 0.0
Tyr
TyrAla: 5.464 ± 1.484
TyrCys: 1.093 ± 0.728
TyrAsp: 5.464 ± 2.005
TyrGlu: 1.093 ± 0.92
TyrPhe: 0.0 ± 0.0
TyrGly: 1.093 ± 0.728
TyrHis: 0.0 ± 0.0
TyrIle: 3.279 ± 2.73
TyrLys: 5.464 ± 2.841
TyrLeu: 3.279 ± 1.814
TyrMet: 1.093 ± 0.728
TyrAsn: 0.0 ± 0.0
TyrPro: 2.186 ± 1.101
TyrGln: 2.186 ± 0.663
TyrArg: 2.186 ± 1.101
TyrSer: 4.372 ± 0.389
TyrThr: 4.372 ± 2.203
TyrVal: 3.279 ± 1.273
TyrTrp: 0.0 ± 0.0
TyrTyr: 2.186 ± 1.101
TyrXaa: 0.0 ± 0.0
Xaa
XaaAla: 0.0 ± 0.0
XaaCys: 0.0 ± 0.0
XaaAsp: 0.0 ± 0.0
XaaGlu: 0.0 ± 0.0
XaaPhe: 0.0 ± 0.0
XaaGly: 0.0 ± 0.0
XaaHis: 0.0 ± 0.0
XaaIle: 0.0 ± 0.0
XaaLys: 0.0 ± 0.0
XaaLeu: 0.0 ± 0.0
XaaMet: 0.0 ± 0.0
XaaAsn: 0.0 ± 0.0
XaaPro: 0.0 ± 0.0
XaaGln: 0.0 ± 0.0
XaaArg: 0.0 ± 0.0
XaaSer: 0.0 ± 0.0
XaaThr: 0.0 ± 0.0
XaaVal: 0.0 ± 0.0
XaaTrp: 0.0 ± 0.0
XaaTyr: 0.0 ± 0.0
XaaXaa: 0.0 ± 0.0
Statistics based on 3 proteins (916 amino acids)

Note: The error has been estimated by bootstrapping (100 resamples) at the protein level.
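A protein-level bootstrap can be sketched as follows: whole proteins are resampled with replacement, the dipeptide frequency is recomputed for each resample, and the spread of those estimates is reported as the error. The page does not describe the exact procedure, so this is a minimal sketch under assumptions: the function names, the fixed random seed, the toy sequences, and the use of the population standard deviation as the error measure are all choices made for this example.

```python
import random
import statistics


def pair_permille(proteins, pair):
    """Frequency (per mille) of one dipeptide given as two one-letter codes."""
    hits = 0
    total = 0
    for seq in proteins:
        # Overlapping adjacent residue pairs within a single protein.
        for a, b in zip(seq, seq[1:]):
            total += 1
            if a + b == pair:
                hits += 1
    return 1000.0 * hits / total if total else 0.0


def bootstrap_error(proteins, pair, n_resamples=100, seed=0):
    """Spread of the frequency estimate under protein-level resampling."""
    rng = random.Random(seed)
    estimates = []
    for _ in range(n_resamples):
        # Draw as many proteins as the proteome has, with replacement.
        sample = rng.choices(proteins, k=len(proteins))
        estimates.append(pair_permille(sample, pair))
    # Assumption: the error is the standard deviation of the estimates.
    return statistics.pstdev(estimates)


# Hypothetical toy proteome of three short proteins.
toy = ["MKTAAT", "AATK", "GGSGG"]
value = pair_permille(toy, "AA")
error = bootstrap_error(toy, "AA")
print(f"AlaAla: {value:.3f} ± {error:.3f}")
```

With only a few proteins the resamples differ strongly from each other, so the estimated error can be large relative to the value itself. A dipeptide that occurs in none of the proteins is absent from every resample, which is consistent with the 0.0 ± 0.0 entries in the table.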

The dipeptide statistics above (among other statistics for this proteome) can be downloaded from this CSV file
See this proteome in: uniprot_link
Proteome-pI is available under the Creative Commons Attribution-NoDerivs license; for more details see here

Reference: Kozlowski LP. Proteome-pI 2.0: Proteome Isoelectric Point Database Update. Nucleic Acids Res. 2021, doi: 10.1093/nar/gkab944
Contact: Lukasz P. Kozlowski