Amino acid dipeptide frequency for Chaetoceros protobacilladnavirus 4

Apart from single amino acid frequencies, one can also calculate the so-called amino acid dipeptide frequency, that is, how often each ordered pair of adjacent amino acids occurs in the proteins.

There are 400 possible dipeptides (441 if Xaa is counted as an additional 21st amino acid representing all non-standard or unknown amino acids). Thus, if all 400 dipeptides were equally likely in proteins, each would be present at a frequency of 0.25% (2.5‰). As this is not the case in nature, for better visibility dipeptides that occur more often than expected are marked in red, and those that are underrepresented are marked in blue in the table.
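As a rough illustration of how such frequencies can be obtained, the Python sketch below counts overlapping dipeptides within each protein and reports them in per mille. The function name `dipeptide_frequencies`, the toy sequences, and the choice to count overlapping windows without crossing protein boundaries are assumptions made for illustration; the exact procedure used by the database is not described here.

```python
from collections import Counter

# One-letter codes for the 20 standard amino acids.
STANDARD = "ACDEFGHIKLMNPQRSTVWY"


def dipeptide_frequencies(proteins):
    """Return {dipeptide: frequency in per mille} for a list of sequences.

    Assumption: dipeptides are overlapping windows of length 2 counted
    inside each protein; residues outside the 20 standard ones become 'X'.
    """
    counts = Counter()
    for seq in proteins:
        cleaned = "".join(aa if aa in STANDARD else "X" for aa in seq.upper())
        for i in range(len(cleaned) - 1):
            counts[cleaned[i:i + 2]] += 1
    total = sum(counts.values())
    if total == 0:
        return {}
    return {pair: 1000.0 * n / total for pair, n in counts.items()}


# Toy sequences (hypothetical, not taken from the proteome).
freqs = dipeptide_frequencies(["MKKAK", "AKB"])
expected_uniform = 1000.0 / 400  # 2.5 per mille, i.e. 0.25%
```

Comparing each entry of `freqs` with `expected_uniform` then shows which dipeptides lie above or below the uniform expectation.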

All values are given in per mille (‰), so they need to be multiplied by 10⁻³ to obtain fractions.
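The conversion, and a comparison against the uniform expectation of 1/400, could be sketched as follows. The helper names are hypothetical, and the simple above/below rule is an assumption; the colouring of the table may rely on a different criterion.

```python
EXPECTED_FRACTION = 1 / 400  # uniform expectation for 20 x 20 dipeptides


def per_mille_to_fraction(value):
    """Convert a per mille value to a plain fraction."""
    return value * 1e-3


def representation(value_per_mille, expected=EXPECTED_FRACTION):
    """Label a dipeptide as over- or under-represented versus uniform."""
    fraction = per_mille_to_fraction(value_per_mille)
    if fraction > expected:
        return "over"
    if fraction < expected:
        return "under"
    return "as expected"


# Values taken from the table: LysAla = 12.175, AlaPhe = 1.623
print(representation(12.175))  # over
print(representation(1.623))   # under
```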

For more information, see the sequence space article on Wikipedia.

Rows: first residue; columns: second residue. Each cell: frequency ± error (‰).
    | Ala | Cys | Asp | Glu | Phe | Gly | His | Ile | Lys | Leu | Met | Asn | Pro | Gln | Arg | Ser | Thr | Val | Trp | Tyr | Xaa
Ala | 4.87 ± 3.608 | 2.435 ± 1.24 | 5.682 ± 1.58 | 6.494 ± 3.25 | 1.623 ± 0.739 | 6.494 ± 1.601 | 1.623 ± 1.526 | 7.305 ± 0.957 | 7.305 ± 1.792 | 3.247 ± 0.599 | 0.0 ± 0.485 | 3.247 ± 1.626 | 4.87 ± 2.763 | 1.623 ± 1.526 | 6.494 ± 1.745 | 3.247 ± 0.547 | 4.87 ± 3.596 | 2.435 ± 0.115 | 0.812 ± 0.601 | 4.058 ± 0.551 | 0.0 ± 0.0
Cys | 0.0 ± 0.0 | 0.0 ± 0.0 | 1.623 ± 1.279 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.812 ± 0.763 | 0.0 ± 0.0 | 0.0 ± 0.0 | 1.623 ± 1.279 | 0.812 ± 0.763 | 1.623 ± 0.625 | 0.0 ± 0.0 | 2.435 ± 1.24 | 1.623 ± 1.279 | 0.0 ± 0.0 | 0.812 ± 0.763 | 0.812 ± 0.64 | 0.0 ± 0.0 | 0.0 ± 0.0
Asp | 3.247 ± 1.569 | 2.435 ± 1.009 | 4.058 ± 2.355 | 5.682 ± 1.628 | 1.623 ± 0.656 | 8.929 ± 3.114 | 0.812 ± 0.763 | 6.494 ± 0.452 | 4.058 ± 0.842 | 4.87 ± 1.491 | 0.812 ± 0.467 | 1.623 ± 0.625 | 3.247 ± 0.547 | 1.623 ± 1.279 | 1.623 ± 0.739 | 2.435 ± 1.377 | 0.0 ± 0.0 | 2.435 ± 1.147 | 0.812 ± 0.601 | 0.812 ± 0.601 | 0.0 ± 0.0
Glu | 4.87 ± 1.016 | 0.0 ± 0.0 | 4.058 ± 0.551 | 5.682 ± 1.576 | 4.87 ± 1.967 | 4.058 ± 2.198 | 1.623 ± 1.279 | 1.623 ± 1.526 | 1.623 ± 0.656 | 9.74 ± 2.032 | 1.623 ± 0.739 | 3.247 ± 0.873 | 4.058 ± 2.355 | 4.058 ± 1.195 | 2.435 ± 1.083 | 4.058 ± 2.174 | 4.058 ± 1.687 | 1.623 ± 1.279 | 1.623 ± 0.625 | 0.812 ± 0.64 | 0.0 ± 0.0
Phe | 3.247 ± 1.478 | 0.0 ± 0.0 | 3.247 ± 1.25 | 1.623 ± 1.279 | 0.812 ± 0.64 | 0.0 ± 0.0 | 2.435 ± 0.115 | 2.435 ± 1.377 | 1.623 ± 0.625 | 1.623 ± 0.625 | 0.812 ± 0.64 | 2.435 ± 1.009 | 2.435 ± 1.147 | 1.623 ± 0.625 | 2.435 ± 1.083 | 6.494 ± 0.452 | 3.247 ± 0.599 | 1.623 ± 1.279 | 1.623 ± 1.279 | 0.812 ± 0.601 | 0.0 ± 0.0
Gly | 7.305 ± 0.919 | 0.0 ± 0.0 | 2.435 ± 1.804 | 4.058 ± 1.788 | 1.623 ± 1.279 | 3.247 ± 1.738 | 1.623 ± 1.203 | 1.623 ± 0.739 | 4.058 ± 1.185 | 3.247 ± 1.478 | 0.812 ± 0.601 | 2.435 ± 1.24 | 3.247 ± 0.599 | 4.058 ± 1.756 | 5.682 ± 1.576 | 5.682 ± 0.983 | 5.682 ± 2.214 | 4.87 ± 1.175 | 1.623 ± 0.625 | 0.812 ± 0.763 | 0.0 ± 0.0
His | 4.87 ± 1.875 | 0.812 ± 0.763 | 1.623 ± 1.203 | 2.435 ± 1.009 | 2.435 ± 1.009 | 0.0 ± 0.0 | 1.623 ± 0.625 | 0.812 ± 0.64 | 3.247 ± 0.599 | 0.812 ± 0.601 | 2.435 ± 1.083 | 0.812 ± 0.763 | 1.623 ± 1.279 | 1.623 ± 0.625 | 0.812 ± 0.64 | 1.623 ± 0.656 | 1.623 ± 0.656 | 0.812 ± 0.64 | 0.812 ± 0.64 | 1.623 ± 0.739 | 0.0 ± 0.0
Ile | 3.247 ± 1.626 | 0.812 ± 0.64 | 3.247 ± 0.547 | 3.247 ± 0.547 | 1.623 ± 0.739 | 5.682 ± 3.374 | 2.435 ± 1.009 | 1.623 ± 0.656 | 3.247 ± 0.873 | 3.247 ± 1.311 | 2.435 ± 1.377 | 3.247 ± 1.25 | 1.623 ± 0.625 | 2.435 ± 1.24 | 4.058 ± 0.551 | 5.682 ± 0.983 | 0.0 ± 0.0 | 3.247 ± 1.626 | 0.812 ± 0.64 | 1.623 ± 0.656 | 0.0 ± 0.0
Lys | 12.175 ± 3.457 | 0.812 ± 0.64 | 0.812 ± 0.763 | 4.058 ± 1.687 | 3.247 ± 0.547 | 3.247 ± 0.547 | 1.623 ± 1.526 | 2.435 ± 1.11 | 10.552 ± 3.377 | 4.058 ± 0.842 | 1.623 ± 0.739 | 3.247 ± 1.962 | 2.435 ± 1.009 | 3.247 ± 0.547 | 6.494 ± 0.452 | 4.058 ± 1.687 | 1.623 ± 0.656 | 2.435 ± 0.115 | 0.812 ± 0.763 | 3.247 ± 1.311 | 0.0 ± 0.0
Leu | 6.494 ± 2.956 | 0.812 ± 0.64 | 4.058 ± 0.551 | 4.058 ± 1.756 | 0.0 ± 0.0 | 3.247 ± 0.599 | 2.435 ± 1.083 | 4.058 ± 1.185 | 5.682 ± 0.619 | 4.87 ± 2.479 | 0.0 ± 0.0 | 3.247 ± 0.547 | 4.058 ± 3.007 | 2.435 ± 1.377 | 2.435 ± 1.083 | 2.435 ± 1.24 | 2.435 ± 0.115 | 4.058 ± 0.842 | 0.812 ± 0.64 | 2.435 ± 2.289 | 0.0 ± 0.0
Met | 0.0 ± 0.0 | 0.812 ± 0.763 | 0.812 ± 0.763 | 1.623 ± 1.279 | 0.812 ± 0.64 | 0.812 ± 0.601 | 0.812 ± 0.64 | 0.812 ± 0.64 | 0.812 ± 0.601 | 0.812 ± 0.64 | 0.0 ± 0.0 | 2.435 ± 1.24 | 2.435 ± 1.11 | 0.0 ± 0.0 | 0.812 ± 0.763 | 1.623 ± 0.656 | 3.247 ± 1.626 | 2.435 ± 1.377 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0
Asn | 4.058 ± 2.706 | 0.812 ± 0.64 | 3.247 ± 2.1 | 2.435 ± 1.009 | 1.623 ± 0.656 | 3.247 ± 0.873 | 2.435 ± 1.009 | 5.682 ± 1.576 | 4.058 ± 0.842 | 3.247 ± 0.547 | 1.623 ± 0.625 | 4.87 ± 1.326 | 4.87 ± 1.326 | 2.435 ± 1.24 | 0.0 ± 0.0 | 3.247 ± 1.25 | 3.247 ± 1.25 | 5.682 ± 2.214 | 0.812 ± 0.64 | 1.623 ± 0.625 | 0.0 ± 0.0
Pro | 4.87 ± 3.459 | 0.0 ± 0.0 | 4.058 ± 1.551 | 4.87 ± 2.167 | 4.87 ± 0.974 | 4.058 ± 1.195 | 2.435 ± 1.083 | 0.812 ± 0.601 | 3.247 ± 0.547 | 3.247 ± 2.405 | 0.812 ± 0.601 | 0.812 ± 0.601 | 2.435 ± 1.147 | 0.812 ± 0.763 | 3.247 ± 1.738 | 4.87 ± 2.295 | 3.247 ± 1.25 | 2.435 ± 0.115 | 0.0 ± 0.0 | 1.623 ± 1.203 | 0.0 ± 0.0
Gln | 2.435 ± 1.804 | 0.0 ± 0.0 | 2.435 ± 1.24 | 2.435 ± 1.009 | 1.623 ± 0.625 | 3.247 ± 1.962 | 1.623 ± 1.203 | 1.623 ± 0.625 | 0.0 ± 0.0 | 1.623 ± 1.526 | 0.812 ± 0.601 | 3.247 ± 0.873 | 1.623 ± 0.625 | 0.812 ± 0.601 | 4.87 ± 0.974 | 3.247 ± 2.1 | 2.435 ± 1.919 | 0.812 ± 0.601 | 0.0 ± 0.0 | 2.435 ± 1.147 | 0.0 ± 0.0
Arg | 4.058 ± 1.195 | 0.0 ± 0.0 | 3.247 ± 1.738 | 2.435 ± 1.147 | 2.435 ± 1.24 | 2.435 ± 1.377 | 0.812 ± 0.64 | 2.435 ± 0.115 | 8.929 ± 2.299 | 5.682 ± 1.206 | 0.812 ± 0.613 | 8.117 ± 0.658 | 2.435 ± 1.009 | 2.435 ± 1.147 | 4.87 ± 1.824 | 3.247 ± 1.625 | 4.87 ± 1.033 | 5.682 ± 0.464 | 0.0 ± 0.0 | 0.812 ± 0.64 | 0.0 ± 0.0
Ser | 6.494 ± 3.967 | 0.0 ± 0.0 | 2.435 ± 0.115 | 2.435 ± 0.115 | 5.682 ± 1.288 | 6.494 ± 0.948 | 4.058 ± 1.185 | 4.058 ± 1.185 | 6.494 ± 0.452 | 1.623 ± 0.625 | 0.812 ± 0.64 | 4.87 ± 2.017 | 0.812 ± 0.763 | 3.247 ± 1.25 | 4.87 ± 1.016 | 6.494 ± 1.88 | 2.435 ± 2.289 | 4.87 ± 0.974 | 0.812 ± 0.64 | 3.247 ± 0.599 | 0.0 ± 0.0
Thr | 2.435 ± 1.147 | 0.812 ± 0.64 | 4.058 ± 0.668 | 2.435 ± 1.083 | 1.623 ± 0.739 | 4.058 ± 1.635 | 0.0 ± 0.0 | 2.435 ± 1.11 | 3.247 ± 1.625 | 4.058 ± 0.668 | 0.0 ± 0.0 | 3.247 ± 0.547 | 2.435 ± 0.115 | 1.623 ± 1.526 | 4.87 ± 1.175 | 5.682 ± 1.288 | 3.247 ± 0.547 | 4.058 ± 0.842 | 1.623 ± 1.279 | 1.623 ± 0.625 | 0.0 ± 0.0
Val | 2.435 ± 1.11 | 0.812 ± 0.64 | 4.058 ± 2.174 | 5.682 ± 1.58 | 0.812 ± 0.64 | 0.812 ± 0.601 | 2.435 ± 1.919 | 4.87 ± 0.23 | 1.623 ± 0.625 | 1.623 ± 0.739 | 3.247 ± 1.962 | 4.87 ± 2.398 | 3.247 ± 2.405 | 2.435 ± 1.804 | 3.247 ± 1.25 | 4.058 ± 0.842 | 4.058 ± 1.756 | 3.247 ± 0.873 | 0.0 ± 0.0 | 0.812 ± 0.763 | 0.0 ± 0.0
Trp | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.812 ± 0.64 | 1.623 ± 0.739 | 0.0 ± 0.0 | 2.435 ± 1.147 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.812 ± 0.64 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.812 ± 0.64 | 1.623 ± 1.279 | 0.812 ± 0.763 | 2.435 ± 1.009 | 0.812 ± 0.64 | 0.812 ± 0.64 | 0.0 ± 0.0 | 1.623 ± 0.625 | 0.812 ± 0.64 | 0.0 ± 0.0
Tyr | 2.435 ± 1.083 | 0.812 ± 0.763 | 1.623 ± 0.739 | 2.435 ± 1.147 | 3.247 ± 0.599 | 1.623 ± 0.656 | 1.623 ± 0.625 | 1.623 ± 0.656 | 1.623 ± 0.625 | 2.435 ± 0.115 | 0.0 ± 0.0 | 1.623 ± 0.625 | 1.623 ± 0.739 | 0.0 ± 0.0 | 1.623 ± 1.279 | 1.623 ± 0.739 | 2.435 ± 1.11 | 0.812 ± 0.64 | 0.0 ± 0.0 | 1.623 ± 1.279 | 0.0 ± 0.0
Xaa | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0
Statistics based on 3 proteins (1233 amino acids)

Note: The error was estimated by bootstrapping (100 resamples) at the protein level.
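A minimal sketch of what a protein-level bootstrap could look like: whole proteins are resampled with replacement, the frequency is recomputed for each resample, and the spread across resamples serves as the error estimate. The function names, the fixed seed, the toy sequences, and the use of the standard deviation as the error are assumptions; the actual implementation behind the table is not described on this page.

```python
import random
import statistics


def pair_frequency(proteins, pair):
    """Frequency (per mille) of one dipeptide, counted within each protein."""
    hits = total = 0
    for seq in proteins:
        for i in range(len(seq) - 1):
            total += 1
            if seq[i:i + 2] == pair:
                hits += 1
    return 1000.0 * hits / total if total else 0.0


def bootstrap_error(proteins, pair, rounds=100, seed=0):
    """Standard deviation of the frequency over protein-level resamples."""
    rng = random.Random(seed)
    estimates = []
    for _ in range(rounds):
        # Resample whole proteins with replacement (same sample size).
        sample = rng.choices(proteins, k=len(proteins))
        estimates.append(pair_frequency(sample, pair))
    return statistics.pstdev(estimates)


# Toy sequences (hypothetical, not from the proteome).
toy = ["MKKAKA", "AKGG", "GGSTV"]
error = bootstrap_error(toy, "AK")
```

Under this scheme a dipeptide that occurs in none of the proteins has an error of exactly zero, because every resample yields the same frequency.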

The dipeptide statistics above (along with other statistics for this proteome) can be downloaded from this CSV file
See this proteome in: uniprot_link
Proteome-pI is available under the Creative Commons Attribution-NoDerivs license; for more details see here

Reference: Kozlowski LP. Proteome-pI 2.0: Proteome Isoelectric Point Database Update. Nucleic Acids Res. 2021, doi: 10.1093/nar/gkab944
Contact: Lukasz P. Kozlowski