Amino acid dipeptide frequency for Chaetoceros protobacilladnavirus 2 (Chaetoceros sp. DNA virus 7)

Apart from single amino acid frequencies, one can also calculate the so-called amino acid dipeptide frequency.

There are 400 possible dipeptides (441 if Xaa is counted as an additional 21st amino acid representing all non-standard or unknown amino acids). Thus, if dipeptides occurred randomly in proteins (and all 20 amino acids were equally common), each dipeptide would be present at a frequency of 0.25% (2.5‰). As this is not the case in nature, for better visibility all dipeptides that occur more often than expected are marked in red, and those that are underrepresented are marked in blue in the table.
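As a rough illustration of how such frequencies can be obtained, the following minimal Python sketch counts overlapping residue pairs within each protein and reports them in per mille. It is not the Proteome-pI implementation: the function name `dipeptide_permille`, the use of overlapping pairs, and the mapping of every non-standard letter to `X` (Xaa) are assumptions made for this example, and the toy sequences are invented.

```python
from collections import Counter

# The 20 standard amino acids in one-letter code.
STANDARD = set("ACDEFGHIKLMNPQRSTVWY")


def dipeptide_permille(proteins):
    """Return {dipeptide: frequency in per mille} for a list of sequences.

    Assumptions of this sketch: pairs overlap (positions i and i+1),
    pairs never span two proteins, and any non-standard or unknown
    residue is mapped to 'X' (Xaa).
    """
    counts = Counter()
    for seq in proteins:
        cleaned = "".join(aa if aa in STANDARD else "X" for aa in seq.upper())
        counts.update(cleaned[i:i + 2] for i in range(len(cleaned) - 1))
    total = sum(counts.values())
    if total == 0:
        return {}
    return {pair: 1000.0 * n / total for pair, n in counts.items()}


# Invented toy sequences; 'B' is non-standard and becomes 'X'.
toy_proteins = ["MKKA", "AKKB"]
toy_freqs = dipeptide_permille(toy_proteins)
for pair in sorted(toy_freqs):
    print(pair, round(toy_freqs[pair], 3))
```

With real data, the resulting values can be compared against the uniform expectation of 2.5‰ to see which dipeptides are over- or underrepresented.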

All values are given in per mille (‰), so they need to be multiplied by 10⁻³ to obtain fractions.
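To make the units concrete, here is a small Python sketch that converts a table value from per mille to a plain fraction and compares it with the uniform expectation of 1/400. The helper names `permille_to_fraction` and `fold_vs_uniform` are made up for this example; the GlyAla value is taken from the table.

```python
# Uniform expectation over the 400 standard dipeptides, in per mille (2.5).
EXPECTED_PERMILLE = 1000.0 / 400


def permille_to_fraction(value_permille):
    """Convert a per mille value to a fraction (multiply by 10^-3)."""
    return value_permille * 1e-3


def fold_vs_uniform(value_permille):
    """Ratio of an observed per mille value to the uniform expectation."""
    return value_permille / EXPECTED_PERMILLE


gly_ala = 12.037  # GlyAla frequency from the table, in per mille
print("fraction:", permille_to_fraction(gly_ala))
print("fold over uniform expectation:", fold_vs_uniform(gly_ala))
```

A ratio above 1 corresponds to a dipeptide that occurs more often than the uniform expectation, a ratio below 1 to one that occurs less often.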

For more information, see the sequence space article on Wikipedia.

Columns (second residue): Ala Cys Asp Glu Phe Gly His Ile Lys Leu Met Asn Pro Gln Arg Ser Thr Val Trp Tyr Xaa
Rows (first residue) are listed below; each entry reads dipeptide: frequency (‰) ± error.

Ala
AlaAla: 7.407 ± 4.613
AlaCys: 0.0 ± 0.0
AlaAsp: 3.704 ± 1.47
AlaGlu: 2.778 ± 1.242
AlaPhe: 3.704 ± 0.492
AlaGly: 7.407 ± 0.678
AlaHis: 1.852 ± 0.458
AlaIle: 3.704 ± 1.747
AlaLys: 5.556 ± 1.049
AlaLeu: 3.704 ± 2.582
AlaMet: 1.852 ± 1.875
AlaAsn: 5.556 ± 3.278
AlaPro: 6.481 ± 1.04
AlaGln: 2.778 ± 1.242
AlaArg: 6.481 ± 3.308
AlaSer: 6.481 ± 3.308
AlaThr: 4.63 ± 2.533
AlaVal: 4.63 ± 1.587
AlaTrp: 0.926 ± 0.669
AlaTyr: 0.926 ± 0.938
AlaXaa: 0.0 ± 0.0

Cys
CysAla: 0.0 ± 0.0
CysCys: 0.0 ± 0.0
CysAsp: 0.926 ± 0.669
CysGlu: 0.0 ± 0.0
CysPhe: 0.0 ± 0.0
CysGly: 0.0 ± 0.0
CysHis: 0.0 ± 0.0
CysIle: 0.0 ± 0.0
CysLys: 0.926 ± 0.669
CysLeu: 0.926 ± 0.759
CysMet: 0.0 ± 0.0
CysAsn: 1.852 ± 0.458
CysPro: 0.926 ± 0.669
CysGln: 0.0 ± 0.0
CysArg: 0.926 ± 0.669
CysSer: 0.0 ± 0.0
CysThr: 0.926 ± 0.669
CysVal: 1.852 ± 0.458
CysTrp: 0.926 ± 0.669
CysTyr: 0.926 ± 0.669
CysXaa: 0.0 ± 0.0

Asp
AspAla: 1.852 ± 0.458
AspCys: 0.926 ± 0.669
AspAsp: 6.481 ± 1.73
AspGlu: 4.63 ± 1.05
AspPhe: 4.63 ± 1.204
AspGly: 5.556 ± 1.049
AspHis: 0.0 ± 0.0
AspIle: 5.556 ± 1.718
AspLys: 3.704 ± 1.747
AspLeu: 3.704 ± 0.917
AspMet: 0.926 ± 0.669
AspAsn: 2.778 ± 0.526
AspPro: 4.63 ± 1.447
AspGln: 4.63 ± 0.308
AspArg: 1.852 ± 1.875
AspSer: 2.778 ± 1.639
AspThr: 6.481 ± 1.606
AspVal: 4.63 ± 1.226
AspTrp: 0.926 ± 0.759
AspTyr: 1.852 ± 0.458
AspXaa: 0.0 ± 0.0

Glu
GluAla: 7.407 ± 2.254
GluCys: 1.852 ± 1.338
GluAsp: 2.778 ± 0.859
GluGlu: 8.333 ± 3.74
GluPhe: 5.556 ± 1.053
GluGly: 4.63 ± 2.953
GluHis: 1.852 ± 1.338
GluIle: 3.704 ± 1.45
GluLys: 0.926 ± 0.669
GluLeu: 2.778 ± 1.684
GluMet: 2.778 ± 0.859
GluAsn: 6.481 ± 1.73
GluPro: 3.704 ± 2.675
GluGln: 2.778 ± 1.242
GluArg: 3.704 ± 2.582
GluSer: 3.704 ± 1.793
GluThr: 3.704 ± 1.47
GluVal: 2.778 ± 0.526
GluTrp: 0.926 ± 0.938
GluTyr: 1.852 ± 1.338
GluXaa: 0.0 ± 0.0

Phe
PheAla: 3.704 ± 1.787
PheCys: 0.0 ± 0.0
PheAsp: 2.778 ± 1.242
PheGlu: 3.704 ± 1.47
PhePhe: 1.852 ± 1.338
PheGly: 1.852 ± 1.338
PheHis: 2.778 ± 1.242
PheIle: 2.778 ± 1.242
PheLys: 0.926 ± 0.669
PheLeu: 1.852 ± 1.098
PheMet: 0.926 ± 0.759
PheAsn: 2.778 ± 1.061
PhePro: 1.852 ± 1.338
PheGln: 0.0 ± 0.0
PheArg: 1.852 ± 1.338
PheSer: 3.704 ± 0.971
PheThr: 1.852 ± 0.458
PheVal: 2.778 ± 0.859
PheTrp: 0.926 ± 0.669
PheTyr: 1.852 ± 1.338
PheXaa: 0.0 ± 0.0

Gly
GlyAla: 12.037 ± 3.679
GlyCys: 0.926 ± 0.669
GlyAsp: 3.704 ± 1.787
GlyGlu: 0.926 ± 0.669
GlyPhe: 2.778 ± 1.639
GlyGly: 7.407 ± 0.678
GlyHis: 0.926 ± 0.938
GlyIle: 0.926 ± 0.759
GlyLys: 2.778 ± 2.006
GlyLeu: 5.556 ± 3.748
GlyMet: 0.0 ± 0.0
GlyAsn: 1.852 ± 0.458
GlyPro: 1.852 ± 0.874
GlyGln: 4.63 ± 1.204
GlyArg: 4.63 ± 3.344
GlySer: 3.704 ± 1.47
GlyThr: 4.63 ± 2.533
GlyVal: 4.63 ± 2.533
GlyTrp: 1.852 ± 0.458
GlyTyr: 2.778 ± 1.896
GlyXaa: 0.0 ± 0.0

His
HisAla: 3.704 ± 1.47
HisCys: 0.0 ± 0.0
HisAsp: 1.852 ± 0.458
HisGlu: 1.852 ± 1.338
HisPhe: 0.0 ± 0.0
HisGly: 1.852 ± 1.338
HisHis: 0.926 ± 0.759
HisIle: 1.852 ± 0.874
HisLys: 3.704 ± 1.793
HisLeu: 0.926 ± 0.759
HisMet: 0.0 ± 0.0
HisAsn: 1.852 ± 0.874
HisPro: 2.778 ± 1.061
HisGln: 0.926 ± 0.669
HisArg: 2.778 ± 2.006
HisSer: 0.926 ± 0.938
HisThr: 1.852 ± 1.098
HisVal: 0.0 ± 0.0
HisTrp: 1.852 ± 1.338
HisTyr: 0.926 ± 0.669
HisXaa: 0.0 ± 0.0

Ile
IleAla: 4.63 ± 3.694
IleCys: 2.778 ± 2.278
IleAsp: 6.481 ± 2.063
IleGlu: 7.407 ± 3.587
IlePhe: 1.852 ± 1.338
IleGly: 2.778 ± 0.526
IleHis: 1.852 ± 0.458
IleIle: 2.778 ± 1.684
IleLys: 1.852 ± 1.519
IleLeu: 0.926 ± 0.669
IleMet: 0.926 ± 0.759
IleAsn: 3.704 ± 0.917
IlePro: 0.926 ± 0.759
IleGln: 1.852 ± 1.098
IleArg: 3.704 ± 0.492
IleSer: 0.926 ± 0.938
IleThr: 2.778 ± 1.061
IleVal: 3.704 ± 0.492
IleTrp: 0.0 ± 0.0
IleTyr: 1.852 ± 0.874
IleXaa: 0.0 ± 0.0

Lys
LysAla: 7.407 ± 3.467
LysCys: 0.926 ± 0.669
LysAsp: 2.778 ± 1.061
LysGlu: 3.704 ± 1.793
LysPhe: 1.852 ± 1.338
LysGly: 4.63 ± 1.226
LysHis: 2.778 ± 1.242
LysIle: 2.778 ± 1.639
LysLys: 6.481 ± 0.769
LysLeu: 4.63 ± 2.116
LysMet: 0.0 ± 0.0
LysAsn: 3.704 ± 1.747
LysPro: 3.704 ± 0.971
LysGln: 2.778 ± 0.859
LysArg: 6.481 ± 2.84
LysSer: 6.481 ± 1.867
LysThr: 3.704 ± 0.492
LysVal: 2.778 ± 0.526
LysTrp: 2.778 ± 1.061
LysTyr: 1.852 ± 1.338
LysXaa: 0.0 ± 0.0

Leu
LeuAla: 4.63 ± 2.384
LeuCys: 0.0 ± 0.0
LeuAsp: 2.778 ± 0.526
LeuGlu: 5.556 ± 2.419
LeuPhe: 0.926 ± 0.669
LeuGly: 2.778 ± 2.278
LeuHis: 2.778 ± 0.859
LeuIle: 1.852 ± 1.098
LeuLys: 6.481 ± 0.577
LeuLeu: 6.481 ± 0.577
LeuMet: 0.926 ± 0.693
LeuAsn: 7.407 ± 1.295
LeuPro: 0.926 ± 0.759
LeuGln: 2.778 ± 1.061
LeuArg: 0.0 ± 0.0
LeuSer: 2.778 ± 0.859
LeuThr: 4.63 ± 0.308
LeuVal: 2.778 ± 1.061
LeuTrp: 1.852 ± 1.338
LeuTyr: 0.0 ± 0.0
LeuXaa: 0.0 ± 0.0

Met
MetAla: 1.852 ± 1.519
MetCys: 0.0 ± 0.0
MetAsp: 2.778 ± 1.061
MetGlu: 0.926 ± 0.669
MetPhe: 0.0 ± 0.0
MetGly: 0.926 ± 0.759
MetHis: 0.926 ± 0.759
MetIle: 0.0 ± 0.0
MetLys: 0.0 ± 0.0
MetLeu: 0.926 ± 0.759
MetMet: 0.0 ± 0.0
MetAsn: 1.852 ± 1.098
MetPro: 1.852 ± 0.458
MetGln: 0.926 ± 0.938
MetArg: 0.0 ± 0.0
MetSer: 1.852 ± 1.338
MetThr: 0.926 ± 0.759
MetVal: 0.926 ± 0.938
MetTrp: 0.0 ± 0.0
MetTyr: 0.926 ± 0.759
MetXaa: 0.0 ± 0.0

Asn
AsnAla: 1.852 ± 1.519
AsnCys: 0.0 ± 0.0
AsnAsp: 5.556 ± 1.049
AsnGlu: 5.556 ± 1.053
AsnPhe: 0.926 ± 0.938
AsnGly: 2.778 ± 0.859
AsnHis: 1.852 ± 1.338
AsnIle: 3.704 ± 1.45
AsnLys: 5.556 ± 1.69
AsnLeu: 3.704 ± 1.787
AsnMet: 0.926 ± 1.364
AsnAsn: 4.63 ± 1.447
AsnPro: 2.778 ± 0.859
AsnGln: 1.852 ± 0.458
AsnArg: 2.778 ± 0.526
AsnSer: 0.926 ± 0.669
AsnThr: 3.704 ± 0.971
AsnVal: 5.556 ± 1.049
AsnTrp: 0.926 ± 0.759
AsnTyr: 2.778 ± 1.061
AsnXaa: 0.0 ± 0.0

Pro
ProAla: 1.852 ± 1.338
ProCys: 0.0 ± 0.0
ProAsp: 4.63 ± 1.204
ProGlu: 4.63 ± 2.041
ProPhe: 2.778 ± 1.061
ProGly: 2.778 ± 0.859
ProHis: 0.0 ± 0.0
ProIle: 2.778 ± 1.061
ProLys: 4.63 ± 0.308
ProLeu: 2.778 ± 0.859
ProMet: 0.926 ± 0.584
ProAsn: 1.852 ± 0.458
ProPro: 2.778 ± 2.006
ProGln: 0.926 ± 0.759
ProArg: 2.778 ± 1.061
ProSer: 4.63 ± 2.116
ProThr: 1.852 ± 0.458
ProVal: 2.778 ± 1.061
ProTrp: 0.0 ± 0.0
ProTyr: 1.852 ± 0.874
ProXaa: 0.0 ± 0.0

Gln
GlnAla: 2.778 ± 1.684
GlnCys: 0.0 ± 0.0
GlnAsp: 2.778 ± 0.526
GlnGlu: 3.704 ± 1.747
GlnPhe: 1.852 ± 0.458
GlnGly: 2.778 ± 2.278
GlnHis: 0.926 ± 0.669
GlnIle: 2.778 ± 0.526
GlnLys: 0.926 ± 0.669
GlnLeu: 4.63 ± 1.226
GlnMet: 0.0 ± 0.0
GlnAsn: 2.778 ± 2.278
GlnPro: 1.852 ± 0.458
GlnGln: 0.926 ± 0.759
GlnArg: 2.778 ± 2.006
GlnSer: 2.778 ± 0.859
GlnThr: 2.778 ± 1.061
GlnVal: 1.852 ± 1.338
GlnTrp: 0.926 ± 0.669
GlnTyr: 0.926 ± 0.669
GlnXaa: 0.0 ± 0.0

Arg
ArgAla: 6.481 ± 2.883
ArgCys: 0.0 ± 0.0
ArgAsp: 0.0 ± 0.0
ArgGlu: 2.778 ± 0.526
ArgPhe: 3.704 ± 1.47
ArgGly: 1.852 ± 1.098
ArgHis: 3.704 ± 0.492
ArgIle: 3.704 ± 1.47
ArgLys: 6.481 ± 2.679
ArgLeu: 3.704 ± 0.492
ArgMet: 1.852 ± 1.519
ArgAsn: 1.852 ± 0.458
ArgPro: 0.0 ± 0.0
ArgGln: 2.778 ± 1.242
ArgArg: 7.407 ± 0.731
ArgSer: 4.63 ± 1.662
ArgThr: 5.556 ± 1.049
ArgVal: 0.926 ± 0.759
ArgTrp: 0.926 ± 0.669
ArgTyr: 0.926 ± 0.669
ArgXaa: 0.0 ± 0.0

Ser
SerAla: 3.704 ± 0.917
SerCys: 0.0 ± 0.0
SerAsp: 9.259 ± 2.971
SerGlu: 2.778 ± 1.684
SerPhe: 0.0 ± 0.0
SerGly: 2.778 ± 2.278
SerHis: 1.852 ± 1.338
SerIle: 1.852 ± 0.458
SerLys: 8.333 ± 1.217
SerLeu: 3.704 ± 0.917
SerMet: 0.0 ± 0.0
SerAsn: 3.704 ± 0.917
SerPro: 4.63 ± 1.204
SerGln: 1.852 ± 1.098
SerArg: 1.852 ± 1.098
SerSer: 3.704 ± 0.971
SerThr: 0.926 ± 0.759
SerVal: 2.778 ± 0.526
SerTrp: 0.926 ± 0.669
SerTyr: 1.852 ± 1.098
SerXaa: 0.0 ± 0.0

Thr
ThrAla: 5.556 ± 0.376
ThrCys: 0.926 ± 0.669
ThrAsp: 4.63 ± 1.662
ThrGlu: 3.704 ± 1.47
ThrPhe: 1.852 ± 1.338
ThrGly: 7.407 ± 1.833
ThrHis: 0.926 ± 0.759
ThrIle: 1.852 ± 0.874
ThrLys: 7.407 ± 3.891
ThrLeu: 2.778 ± 2.278
ThrMet: 0.926 ± 0.759
ThrAsn: 1.852 ± 1.519
ThrPro: 0.0 ± 0.0
ThrGln: 3.704 ± 0.917
ThrArg: 1.852 ± 1.519
ThrSer: 4.63 ± 1.447
ThrThr: 6.481 ± 4.04
ThrVal: 3.704 ± 0.492
ThrTrp: 1.852 ± 1.338
ThrTyr: 1.852 ± 0.458
ThrXaa: 0.0 ± 0.0

Val
ValAla: 1.852 ± 1.098
ValCys: 0.926 ± 0.669
ValAsp: 1.852 ± 1.519
ValGlu: 3.704 ± 2.307
ValPhe: 2.778 ± 2.006
ValGly: 3.704 ± 1.787
ValHis: 1.852 ± 1.338
ValIle: 6.481 ± 1.366
ValLys: 3.704 ± 1.45
ValLeu: 1.852 ± 1.875
ValMet: 0.926 ± 0.759
ValAsn: 2.778 ± 2.006
ValPro: 2.778 ± 0.526
ValGln: 3.704 ± 1.793
ValArg: 3.704 ± 1.787
ValSer: 1.852 ± 0.458
ValThr: 4.63 ± 1.447
ValVal: 2.778 ± 1.639
ValTrp: 0.0 ± 0.0
ValTyr: 0.926 ± 0.938
ValXaa: 0.0 ± 0.0

Trp
TrpAla: 0.926 ± 0.669
TrpCys: 0.0 ± 0.0
TrpAsp: 0.926 ± 0.669
TrpGlu: 2.778 ± 1.242
TrpPhe: 1.852 ± 0.458
TrpGly: 1.852 ± 1.338
TrpHis: 0.0 ± 0.0
TrpIle: 0.926 ± 0.669
TrpLys: 0.0 ± 0.0
TrpLeu: 1.852 ± 1.338
TrpMet: 1.852 ± 0.458
TrpAsn: 0.0 ± 0.0
TrpPro: 1.852 ± 0.458
TrpGln: 0.0 ± 0.0
TrpArg: 2.778 ± 1.639
TrpSer: 0.0 ± 0.0
TrpThr: 1.852 ± 1.338
TrpVal: 0.0 ± 0.0
TrpTrp: 0.0 ± 0.0
TrpTyr: 0.0 ± 0.0
TrpXaa: 0.0 ± 0.0

Tyr
TyrAla: 0.926 ± 0.938
TyrCys: 1.852 ± 1.338
TyrAsp: 1.852 ± 0.458
TyrGlu: 2.778 ± 1.242
TyrPhe: 1.852 ± 1.338
TyrGly: 2.778 ± 1.684
TyrHis: 2.778 ± 1.242
TyrIle: 3.704 ± 0.971
TyrLys: 1.852 ± 0.458
TyrLeu: 0.926 ± 0.759
TyrMet: 0.926 ± 0.759
TyrAsn: 0.0 ± 0.0
TyrPro: 0.926 ± 0.669
TyrGln: 0.926 ± 0.759
TyrArg: 0.926 ± 0.669
TyrSer: 0.0 ± 0.0
TyrThr: 0.0 ± 0.0
TyrVal: 0.926 ± 0.669
TyrTrp: 0.926 ± 0.938
TyrTyr: 0.926 ± 0.669
TyrXaa: 0.0 ± 0.0

Xaa
XaaAla: 0.0 ± 0.0
XaaCys: 0.0 ± 0.0
XaaAsp: 0.0 ± 0.0
XaaGlu: 0.0 ± 0.0
XaaPhe: 0.0 ± 0.0
XaaGly: 0.0 ± 0.0
XaaHis: 0.0 ± 0.0
XaaIle: 0.0 ± 0.0
XaaLys: 0.0 ± 0.0
XaaLeu: 0.0 ± 0.0
XaaMet: 0.0 ± 0.0
XaaAsn: 0.0 ± 0.0
XaaPro: 0.0 ± 0.0
XaaGln: 0.0 ± 0.0
XaaArg: 0.0 ± 0.0
XaaSer: 0.0 ± 0.0
XaaThr: 0.0 ± 0.0
XaaVal: 0.0 ± 0.0
XaaTrp: 0.0 ± 0.0
XaaTyr: 0.0 ± 0.0
XaaXaa: 0.0 ± 0.0

Statistics based on 3 proteins (1081 amino acids)

Note: The error has been estimated by bootstrapping (×100) at the protein level.
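One way to picture a protein-level bootstrap is the Python sketch below: whole proteins are resampled with replacement, the dipeptide frequency is recomputed for each resample, and the spread of the estimates is reported. This is only an illustration under stated assumptions. The source does not say which statistic is reported as the error, so the use of the standard deviation here is an assumption, as are the function names, the fixed seed, and the invented toy sequences.

```python
import random
import statistics
from collections import Counter


def pair_counts(seq):
    """Count overlapping dipeptides within one protein sequence."""
    return Counter(seq[i:i + 2] for i in range(len(seq) - 1))


def bootstrap_error(proteins, dipeptide, n_boot=100, seed=0):
    """Return (mean, standard deviation) of a dipeptide frequency in per mille.

    Proteins, not residues, are the resampling unit. Using the standard
    deviation as the error is an assumption of this sketch.
    """
    rng = random.Random(seed)
    per_protein = [pair_counts(s) for s in proteins]
    estimates = []
    for _ in range(n_boot):
        # Draw as many proteins as the proteome has, with replacement.
        sample = rng.choices(per_protein, k=len(per_protein))
        total = sum(sum(c.values()) for c in sample)
        hits = sum(c[dipeptide] for c in sample)
        estimates.append(1000.0 * hits / total if total else 0.0)
    return statistics.mean(estimates), statistics.stdev(estimates)


# Invented toy proteome of three short sequences.
toy_proteome = ["MGAGAKK", "MSDSDKR", "MGALLNE"]
mean_ga, err_ga = bootstrap_error(toy_proteome, "GA")
print("GlyAla:", round(mean_ga, 3), "±", round(err_ga, 3))
```

In this sketch, a dipeptide that never occurs yields 0.0 ± 0.0, and with only a few proteins the resampled estimates can vary strongly.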

The dipeptide statistics above (among other statistics for this proteome) can be downloaded from this CSV file
See this proteome in: uniprot_link
Proteome-pI is available under the Creative Commons Attribution-NoDerivs license; for more details see here

Reference: Kozlowski LP. Proteome-pI 2.0: Proteome Isoelectric Point Database Update. Nucleic Acids Res. 2021, doi: 10.1093/nar/gkab944
Contact: Lukasz P. Kozlowski