Amino acid dipeptide frequency for Circoviridae 2 LDMD-2013

Apart from single amino acid frequencies, one can also calculate the so-called amino acid dipeptide frequency, i.e. how often each ordered pair of adjacent amino acids occurs.
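The counting step can be sketched as follows. This is a minimal illustration, not the Proteome-pI implementation: the function name `dipeptide_frequencies`, the use of one-letter codes, and the normalisation by the total number of overlapping pairs are all assumptions made here.

```python
from collections import Counter


def dipeptide_frequencies(proteins):
    """Return overlapping dipeptide frequencies in per mille.

    `proteins` is a list of sequences in one-letter code (hypothetical
    input format; the table on this page uses three-letter codes).
    """
    counts = Counter()
    for seq in proteins:
        # Slide a window of width 2 along each protein; a pair never
        # spans the boundary between two proteins.
        for i in range(len(seq) - 1):
            counts[seq[i:i + 2]] += 1
    total = sum(counts.values())
    if total == 0:
        return {}
    # Assumed normalisation: share of all observed pairs, times 1000.
    return {pair: 1000.0 * n / total for pair, n in counts.items()}


# Toy input: pairs are MK, KK, KA from the first sequence and AK, KK
# from the second, so KK accounts for 2 of 5 pairs.
freqs = dipeptide_frequencies(["MKKA", "AKK"])
```

Because the pairs overlap, a protein of length n contributes n - 1 dipeptides.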

There are 400 possible dipeptides (441 if Xaa is counted as an additional 21st amino acid representing all non-standard or unknown amino acids). Thus, if dipeptides occurred randomly in proteins and all amino acids were equally frequent, each dipeptide would be present at a frequency of 0.25% (2.5‰). As this is not the case in nature, for better visibility all dipeptides that are more frequent than expected are marked in red in the table, and those that are underrepresented are marked in blue.
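The uniform expectation and the over/under comparison can be sketched as below. The helper names and the use of a strict greater/less-than comparison against the uniform value are assumptions for illustration; the exact rule used to colour the table is not stated on this page.

```python
def expected_uniform_permille(alphabet_size=20):
    """Expected per mille frequency of one dipeptide if all are equally likely."""
    return 1000.0 / alphabet_size ** 2


def classify(observed_permille, alphabet_size=20):
    """Label a dipeptide relative to the uniform expectation (assumed rule)."""
    expected = expected_uniform_permille(alphabet_size)
    if observed_permille > expected:
        return "over"    # would be marked red
    if observed_permille < expected:
        return "under"   # would be marked blue
    return "expected"


uniform_20 = expected_uniform_permille(20)   # 400 dipeptides
uniform_21 = expected_uniform_permille(21)   # 441 dipeptides, counting Xaa
ala_ala = classify(8.314)                    # AlaAla value from the table
ala_cys = classify(1.188)                    # AlaCys value from the table
```

With 20 amino acids the expectation is 2.5‰; with Xaa included it drops to roughly 2.27‰.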

All values are given in per mille (‰), so they need to be multiplied by 10⁻³ to obtain fractions.

For more information, see the sequence space article on Wikipedia.

Ala Cys Asp Glu Phe Gly His Ile Lys Leu Met Asn Pro Gln Arg Ser Thr Val Trp Tyr Xaa
Ala
AlaAla: 8.314 ± 2.851
AlaCys: 1.188 ± 1.076
AlaAsp: 4.751 ± 0.888
AlaGlu: 2.375 ± 0.421
AlaPhe: 1.188 ± 0.654
AlaGly: 9.501 ± 3.505
AlaHis: 3.563 ± 1.497
AlaIle: 3.563 ± 0.233
AlaLys: 7.126 ± 1.263
AlaLeu: 2.375 ± 0.421
AlaMet: 3.563 ± 1.694
AlaAsn: 4.751 ± 0.888
AlaPro: 4.751 ± 2.572
AlaGln: 5.938 ± 1.918
AlaArg: 4.751 ± 0.842
AlaSer: 10.689 ± 2.429
AlaThr: 5.938 ± 1.542
AlaVal: 5.938 ± 0.188
AlaTrp: 0.0 ± 0.0
AlaTyr: 7.126 ± 0.466
AlaXaa: 0.0 ± 0.0
Cys
CysAla: 0.0 ± 0.0
CysCys: 0.0 ± 0.0
CysAsp: 1.188 ± 0.654
CysGlu: 2.375 ± 0.421
CysPhe: 1.188 ± 1.076
CysGly: 1.188 ± 0.654
CysHis: 0.0 ± 0.0
CysIle: 1.188 ± 1.076
CysLys: 0.0 ± 0.0
CysLeu: 1.188 ± 0.654
CysMet: 0.0 ± 0.0
CysAsn: 0.0 ± 0.0
CysPro: 0.0 ± 0.0
CysGln: 0.0 ± 0.0
CysArg: 0.0 ± 0.0
CysSer: 2.375 ± 2.151
CysThr: 0.0 ± 0.0
CysVal: 2.375 ± 1.309
CysTrp: 0.0 ± 0.0
CysTyr: 0.0 ± 0.0
CysXaa: 0.0 ± 0.0
Asp
AspAla: 3.563 ± 0.233
AspCys: 0.0 ± 0.0
AspAsp: 1.188 ± 1.076
AspGlu: 4.751 ± 0.842
AspPhe: 1.188 ± 0.654
AspGly: 4.751 ± 0.842
AspHis: 3.563 ± 0.233
AspIle: 4.751 ± 0.842
AspLys: 1.188 ± 0.654
AspLeu: 3.563 ± 0.233
AspMet: 0.0 ± 0.0
AspAsn: 2.375 ± 1.309
AspPro: 4.751 ± 0.842
AspGln: 0.0 ± 0.0
AspArg: 3.563 ± 3.227
AspSer: 1.188 ± 0.654
AspThr: 4.751 ± 0.888
AspVal: 3.563 ± 0.233
AspTrp: 0.0 ± 0.0
AspTyr: 2.375 ± 0.421
AspXaa: 0.0 ± 0.0
Glu
GluAla: 4.751 ± 0.842
GluCys: 1.188 ± 0.654
GluAsp: 5.938 ± 1.918
GluGlu: 3.563 ± 1.497
GluPhe: 2.375 ± 1.309
GluGly: 7.126 ± 2.196
GluHis: 1.188 ± 1.076
GluIle: 2.375 ± 1.309
GluLys: 1.188 ± 1.076
GluLeu: 8.314 ± 2.851
GluMet: 1.188 ± 0.654
GluAsn: 2.375 ± 0.421
GluPro: 1.188 ± 1.076
GluGln: 2.375 ± 1.309
GluArg: 3.563 ± 3.227
GluSer: 2.375 ± 2.151
GluThr: 1.188 ± 1.076
GluVal: 4.751 ± 0.842
GluTrp: 0.0 ± 0.0
GluTyr: 5.938 ± 1.542
GluXaa: 0.0 ± 0.0
Phe
PheAla: 1.188 ± 0.654
PheCys: 1.188 ± 0.654
PheAsp: 1.188 ± 1.076
PheGlu: 2.375 ± 0.421
PhePhe: 2.375 ± 1.309
PheGly: 4.751 ± 0.842
PheHis: 0.0 ± 0.0
PheIle: 2.375 ± 2.151
PheLys: 1.188 ± 0.654
PheLeu: 1.188 ± 0.654
PheMet: 0.0 ± 0.0
PheAsn: 2.375 ± 0.421
PhePro: 3.563 ± 1.963
PheGln: 4.751 ± 0.842
PheArg: 0.0 ± 0.0
PheSer: 3.563 ± 1.497
PheThr: 2.375 ± 1.309
PheVal: 2.375 ± 1.309
PheTrp: 0.0 ± 0.0
PheTyr: 0.0 ± 0.0
PheXaa: 0.0 ± 0.0
Gly
GlyAla: 9.501 ± 0.045
GlyCys: 0.0 ± 0.0
GlyAsp: 5.938 ± 3.272
GlyGlu: 8.314 ± 2.851
GlyPhe: 3.563 ± 3.227
GlyGly: 4.751 ± 2.572
GlyHis: 0.0 ± 0.0
GlyIle: 3.563 ± 1.497
GlyLys: 4.751 ± 0.888
GlyLeu: 5.938 ± 1.542
GlyMet: 3.563 ± 1.963
GlyAsn: 3.563 ± 0.233
GlyPro: 4.751 ± 0.888
GlyGln: 3.563 ± 1.963
GlyArg: 8.314 ± 4.58
GlySer: 5.938 ± 0.188
GlyThr: 14.252 ± 2.527
GlyVal: 3.563 ± 1.497
GlyTrp: 0.0 ± 0.0
GlyTyr: 3.563 ± 1.497
GlyXaa: 0.0 ± 0.0
His
HisAla: 1.188 ± 0.654
HisCys: 0.0 ± 0.0
HisAsp: 1.188 ± 1.076
HisGlu: 0.0 ± 0.0
HisPhe: 0.0 ± 0.0
HisGly: 3.563 ± 0.233
HisHis: 0.0 ± 0.0
HisIle: 2.375 ± 2.151
HisLys: 1.188 ± 1.076
HisLeu: 0.0 ± 0.0
HisMet: 0.0 ± 0.0
HisAsn: 1.188 ± 0.654
HisPro: 2.375 ± 0.421
HisGln: 1.188 ± 1.076
HisArg: 2.375 ± 0.421
HisSer: 2.375 ± 0.421
HisThr: 0.0 ± 0.0
HisVal: 2.375 ± 0.421
HisTrp: 1.188 ± 1.076
HisTyr: 0.0 ± 0.0
HisXaa: 0.0 ± 0.0
Ile
IleAla: 7.126 ± 1.263
IleCys: 0.0 ± 0.0
IleAsp: 3.563 ± 0.233
IleGlu: 4.751 ± 4.302
IlePhe: 2.375 ± 0.421
IleGly: 1.188 ± 1.076
IleHis: 4.751 ± 2.572
IleIle: 7.126 ± 1.263
IleLys: 2.375 ± 1.309
IleLeu: 0.0 ± 0.0
IleMet: 1.188 ± 1.076
IleAsn: 1.188 ± 0.654
IlePro: 3.563 ± 1.497
IleGln: 1.188 ± 1.076
IleArg: 0.0 ± 0.0
IleSer: 0.0 ± 0.0
IleThr: 4.751 ± 0.842
IleVal: 2.375 ± 0.421
IleTrp: 0.0 ± 0.0
IleTyr: 0.0 ± 0.0
IleXaa: 0.0 ± 0.0
Lys
LysAla: 4.751 ± 0.842
LysCys: 0.0 ± 0.0
LysAsp: 3.563 ± 1.497
LysGlu: 2.375 ± 1.309
LysPhe: 2.375 ± 1.309
LysGly: 4.751 ± 0.888
LysHis: 0.0 ± 0.0
LysIle: 1.188 ± 1.076
LysLys: 0.0 ± 0.0
LysLeu: 4.751 ± 2.617
LysMet: 0.0 ± 0.0
LysAsn: 1.188 ± 0.654
LysPro: 2.375 ± 1.309
LysGln: 2.375 ± 1.309
LysArg: 3.563 ± 1.497
LysSer: 2.375 ± 0.421
LysThr: 5.938 ± 0.188
LysVal: 0.0 ± 0.0
LysTrp: 0.0 ± 0.0
LysTyr: 2.375 ± 0.421
LysXaa: 0.0 ± 0.0
Leu
LeuAla: 7.126 ± 1.263
LeuCys: 0.0 ± 0.0
LeuAsp: 1.188 ± 1.076
LeuGlu: 8.314 ± 1.121
LeuPhe: 1.188 ± 0.654
LeuGly: 4.751 ± 0.888
LeuHis: 0.0 ± 0.0
LeuIle: 1.188 ± 0.654
LeuLys: 2.375 ± 1.309
LeuLeu: 3.563 ± 3.227
LeuMet: 2.375 ± 0.882
LeuAsn: 3.563 ± 1.497
LeuPro: 1.188 ± 1.076
LeuGln: 0.0 ± 0.0
LeuArg: 3.563 ± 1.497
LeuSer: 7.126 ± 0.466
LeuThr: 1.188 ± 0.654
LeuVal: 4.751 ± 0.888
LeuTrp: 0.0 ± 0.0
LeuTyr: 3.563 ± 1.963
LeuXaa: 0.0 ± 0.0
Met
MetAla: 4.751 ± 2.617
MetCys: 0.0 ± 0.0
MetAsp: 0.0 ± 0.0
MetGlu: 0.0 ± 0.0
MetPhe: 1.188 ± 0.654
MetGly: 0.0 ± 0.0
MetHis: 0.0 ± 0.0
MetIle: 1.188 ± 0.654
MetLys: 2.375 ± 0.421
MetLeu: 0.0 ± 0.0
MetMet: 0.0 ± 0.0
MetAsn: 0.0 ± 0.0
MetPro: 1.188 ± 0.654
MetGln: 1.188 ± 0.654
MetArg: 0.0 ± 0.0
MetSer: 0.0 ± 0.0
MetThr: 1.188 ± 0.654
MetVal: 1.188 ± 0.654
MetTrp: 2.375 ± 0.421
MetTyr: 3.563 ± 0.233
MetXaa: 0.0 ± 0.0
Asn
AsnAla: 3.563 ± 1.497
AsnCys: 1.188 ± 0.654
AsnAsp: 0.0 ± 0.0
AsnGlu: 4.751 ± 0.842
AsnPhe: 0.0 ± 0.0
AsnGly: 1.188 ± 0.654
AsnHis: 0.0 ± 0.0
AsnIle: 4.751 ± 0.842
AsnLys: 1.188 ± 0.654
AsnLeu: 3.563 ± 0.233
AsnMet: 0.0 ± 0.0
AsnAsn: 3.563 ± 0.233
AsnPro: 4.751 ± 2.617
AsnGln: 3.563 ± 1.963
AsnArg: 1.188 ± 1.076
AsnSer: 3.563 ± 1.963
AsnThr: 2.375 ± 1.309
AsnVal: 1.188 ± 0.654
AsnTrp: 0.0 ± 0.0
AsnTyr: 2.375 ± 0.421
AsnXaa: 0.0 ± 0.0
Pro
ProAla: 9.501 ± 1.775
ProCys: 0.0 ± 0.0
ProAsp: 2.375 ± 2.151
ProGlu: 3.563 ± 1.497
ProPhe: 0.0 ± 0.0
ProGly: 3.563 ± 1.497
ProHis: 0.0 ± 0.0
ProIle: 0.0 ± 0.0
ProLys: 3.563 ± 1.497
ProLeu: 2.375 ± 1.309
ProMet: 1.188 ± 0.654
ProAsn: 0.0 ± 0.0
ProPro: 1.188 ± 0.654
ProGln: 0.0 ± 0.0
ProArg: 3.563 ± 0.233
ProSer: 3.563 ± 1.963
ProThr: 2.375 ± 0.421
ProVal: 3.563 ± 0.233
ProTrp: 2.375 ± 1.309
ProTyr: 2.375 ± 0.421
ProXaa: 0.0 ± 0.0
Gln
GlnAla: 2.375 ± 2.151
GlnCys: 1.188 ± 0.654
GlnAsp: 2.375 ± 1.309
GlnGlu: 1.188 ± 0.654
GlnPhe: 1.188 ± 0.654
GlnGly: 2.375 ± 0.421
GlnHis: 0.0 ± 0.0
GlnIle: 2.375 ± 0.421
GlnLys: 0.0 ± 0.0
GlnLeu: 3.563 ± 1.497
GlnMet: 0.0 ± 0.0
GlnAsn: 4.751 ± 0.842
GlnPro: 2.375 ± 0.421
GlnGln: 0.0 ± 0.0
GlnArg: 2.375 ± 1.309
GlnSer: 4.751 ± 0.888
GlnThr: 2.375 ± 1.309
GlnVal: 2.375 ± 0.421
GlnTrp: 1.188 ± 1.076
GlnTyr: 1.188 ± 0.654
GlnXaa: 0.0 ± 0.0
Arg
ArgAla: 4.751 ± 0.888
ArgCys: 1.188 ± 1.076
ArgAsp: 1.188 ± 1.076
ArgGlu: 2.375 ± 0.421
ArgPhe: 1.188 ± 0.654
ArgGly: 11.876 ± 0.376
ArgHis: 2.375 ± 2.151
ArgIle: 2.375 ± 2.151
ArgLys: 3.563 ± 1.963
ArgLeu: 2.375 ± 2.151
ArgMet: 0.0 ± 0.0
ArgAsn: 1.188 ± 0.654
ArgPro: 0.0 ± 0.0
ArgGln: 1.188 ± 1.076
ArgArg: 3.563 ± 0.233
ArgSer: 7.126 ± 1.263
ArgThr: 4.751 ± 4.302
ArgVal: 3.563 ± 1.963
ArgTrp: 0.0 ± 0.0
ArgTyr: 3.563 ± 1.497
ArgXaa: 0.0 ± 0.0
Ser
SerAla: 5.938 ± 1.542
SerCys: 0.0 ± 0.0
SerAsp: 0.0 ± 0.0
SerGlu: 1.188 ± 0.654
SerPhe: 3.563 ± 0.233
SerGly: 11.876 ± 3.084
SerHis: 3.563 ± 0.233
SerIle: 0.0 ± 0.0
SerLys: 4.751 ± 0.888
SerLeu: 2.375 ± 2.151
SerMet: 4.751 ± 0.888
SerAsn: 2.375 ± 1.309
SerPro: 4.751 ± 0.888
SerGln: 3.563 ± 1.497
SerArg: 4.751 ± 2.572
SerSer: 4.751 ± 0.842
SerThr: 8.314 ± 2.851
SerVal: 3.563 ± 1.963
SerTrp: 0.0 ± 0.0
SerTyr: 2.375 ± 2.151
SerXaa: 0.0 ± 0.0
Thr
ThrAla: 4.751 ± 0.888
ThrCys: 0.0 ± 0.0
ThrAsp: 7.126 ± 2.196
ThrGlu: 3.563 ± 1.497
ThrPhe: 4.751 ± 0.842
ThrGly: 9.501 ± 3.415
ThrHis: 2.375 ± 1.309
ThrIle: 3.563 ± 1.497
ThrLys: 0.0 ± 0.0
ThrLeu: 8.314 ± 0.609
ThrMet: 0.0 ± 0.0
ThrAsn: 2.375 ± 1.309
ThrPro: 1.188 ± 0.654
ThrGln: 1.188 ± 0.654
ThrArg: 7.126 ± 2.993
ThrSer: 3.563 ± 1.963
ThrThr: 2.375 ± 1.309
ThrVal: 4.751 ± 2.617
ThrTrp: 1.188 ± 0.654
ThrTyr: 3.563 ± 1.963
ThrXaa: 0.0 ± 0.0
Val
ValAla: 7.126 ± 1.263
ValCys: 2.375 ± 0.421
ValAsp: 2.375 ± 0.421
ValGlu: 4.751 ± 0.888
ValPhe: 4.751 ± 0.888
ValGly: 3.563 ± 1.963
ValHis: 1.188 ± 1.076
ValIle: 2.375 ± 0.421
ValLys: 4.751 ± 0.888
ValLeu: 3.563 ± 1.497
ValMet: 1.188 ± 0.654
ValAsn: 3.563 ± 0.233
ValPro: 0.0 ± 0.0
ValGln: 3.563 ± 1.963
ValArg: 4.751 ± 0.888
ValSer: 5.938 ± 3.272
ValThr: 1.188 ± 0.654
ValVal: 4.751 ± 2.572
ValTrp: 1.188 ± 0.654
ValTyr: 0.0 ± 0.0
ValXaa: 0.0 ± 0.0
Trp
TrpAla: 1.188 ± 0.654
TrpCys: 0.0 ± 0.0
TrpAsp: 2.375 ± 1.309
TrpGlu: 0.0 ± 0.0
TrpPhe: 0.0 ± 0.0
TrpGly: 1.188 ± 1.076
TrpHis: 0.0 ± 0.0
TrpIle: 0.0 ± 0.0
TrpLys: 2.375 ± 0.421
TrpLeu: 1.188 ± 0.654
TrpMet: 0.0 ± 0.0
TrpAsn: 0.0 ± 0.0
TrpPro: 0.0 ± 0.0
TrpGln: 1.188 ± 1.076
TrpArg: 0.0 ± 0.0
TrpSer: 0.0 ± 0.0
TrpThr: 0.0 ± 0.0
TrpVal: 0.0 ± 0.0
TrpTrp: 0.0 ± 0.0
TrpTyr: 2.375 ± 2.151
TrpXaa: 0.0 ± 0.0
Tyr
TyrAla: 5.938 ± 1.542
TyrCys: 3.563 ± 1.497
TyrAsp: 3.563 ± 1.497
TyrGlu: 2.375 ± 1.309
TyrPhe: 2.375 ± 2.151
TyrGly: 5.938 ± 3.272
TyrHis: 0.0 ± 0.0
TyrIle: 1.188 ± 1.076
TyrLys: 0.0 ± 0.0
TyrLeu: 0.0 ± 0.0
TyrMet: 0.0 ± 0.0
TyrAsn: 2.375 ± 1.309
TyrPro: 1.188 ± 1.076
TyrGln: 1.188 ± 1.076
TyrArg: 1.188 ± 1.076
TyrSer: 1.188 ± 0.654
TyrThr: 5.938 ± 1.542
TyrVal: 5.938 ± 0.188
TyrTrp: 2.375 ± 2.151
TyrTyr: 1.188 ± 0.654
TyrXaa: 0.0 ± 0.0
Xaa
XaaAla: 0.0 ± 0.0
XaaCys: 0.0 ± 0.0
XaaAsp: 0.0 ± 0.0
XaaGlu: 0.0 ± 0.0
XaaPhe: 0.0 ± 0.0
XaaGly: 0.0 ± 0.0
XaaHis: 0.0 ± 0.0
XaaIle: 0.0 ± 0.0
XaaLys: 0.0 ± 0.0
XaaLeu: 0.0 ± 0.0
XaaMet: 0.0 ± 0.0
XaaAsn: 0.0 ± 0.0
XaaPro: 0.0 ± 0.0
XaaGln: 0.0 ± 0.0
XaaArg: 0.0 ± 0.0
XaaSer: 0.0 ± 0.0
XaaThr: 0.0 ± 0.0
XaaVal: 0.0 ± 0.0
XaaTrp: 0.0 ± 0.0
XaaTyr: 0.0 ± 0.0
XaaXaa: 0.0 ± 0.0
Statistics based on 2 proteins (843 amino acids)

Note: The error has been estimated by bootstrapping (100 replicates) at the protein level.
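A protein-level bootstrap of this kind can be sketched as follows. This is only an illustration under stated assumptions: proteins are resampled with replacement, the frequency is recomputed for each replicate, and the error is taken as the sample standard deviation of the replicates. The page does not specify which spread statistic is reported, and the names `dipeptide_permille` and `bootstrap_error` are hypothetical.

```python
import random
import statistics


def dipeptide_permille(proteins, pair):
    """Per mille frequency of one dipeptide among all overlapping pairs."""
    total = sum(max(len(seq) - 1, 0) for seq in proteins)
    hits = sum(
        1
        for seq in proteins
        for i in range(len(seq) - 1)
        if seq[i:i + 2] == pair
    )
    return 1000.0 * hits / total if total else 0.0


def bootstrap_error(proteins, pair, n_boot=100, seed=0):
    """Spread of the frequency over protein-level bootstrap replicates.

    Assumption: the error is the sample standard deviation of the
    replicate frequencies; n_boot must be at least 2.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    replicates = []
    for _ in range(n_boot):
        # Resample whole proteins with replacement, keeping the same count.
        sample = [rng.choice(proteins) for _ in proteins]
        replicates.append(dipeptide_permille(sample, pair))
    return statistics.stdev(replicates)


# Toy input in one-letter code: identical proteins give zero spread.
err_identical = bootstrap_error(["AAAA", "AAAA"], "AA")
err_mixed = bootstrap_error(["AKAK", "GGGG"], "AK")
```

With only two proteins, as in this proteome, each replicate can take very few distinct values, which may explain why many error values in the list repeat.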

The dipeptide statistics above (among other statistics for this proteome) can be downloaded from this CSV file
See this proteome in: uniprot_link
Proteome-pI is available under a Creative Commons Attribution-NoDerivs license; for more details see here

Reference: Kozlowski LP. Proteome-pI 2.0: Proteome Isoelectric Point Database Update. Nucleic Acids Res. 2021, doi: 10.1093/nar/gkab944
Contact: Lukasz P. Kozlowski