Amino acid dipeptide frequency for Cotton leaf curl Kokhran virus

Apart from single amino acid frequencies, one can also calculate the so-called amino acid dipeptide frequency.

There are 400 possible dipeptides (441 if Xaa is counted as an additional, 21st amino acid representing all non-standard or unknown amino acids). Thus, if dipeptides occurred randomly in proteins, each amino acid dipeptide would be present at a frequency of 0.25% (1/400). As this is not the case in nature, for better visibility all dipeptides that occur more often than expected are marked in red in the table, and those that are underrepresented are marked in blue.
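The counting described above can be sketched in Python. This is a minimal illustration, not the code used to build the table: the function names `dipeptide_permille` and `classify` are hypothetical, sequences are given in one-letter codes (the table uses three-letter codes), and it assumes that dipeptides are counted as overlapping pairs that never span two proteins.

```python
from collections import Counter
from itertools import product

STANDARD = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard amino acids, one-letter codes

# Uniform expectation from the text: 1/400 = 0.25% = 2.5 per mille.
EXPECTED_PERMILLE = 1000.0 / 400


def dipeptide_permille(sequences):
    """Return {dipeptide: frequency in per mille} for all 441 pairs.

    Assumption: overlapping windows of length 2, counted within each
    protein separately; any non-standard letter is mapped to X (Xaa).
    """
    counts = Counter()
    for seq in sequences:
        seq = "".join(c if c in STANDARD else "X" for c in seq.upper())
        counts.update(seq[i:i + 2] for i in range(len(seq) - 1))
    total = sum(counts.values())
    alphabet = STANDARD + "X"
    return {
        a + b: (1000.0 * counts[a + b] / total if total else 0.0)
        for a, b in product(alphabet, repeat=2)
    }


def classify(freq_permille):
    """Label a frequency relative to the uniform expectation."""
    if freq_permille > EXPECTED_PERMILLE:
        return "over"   # would be marked red in the table
    if freq_permille < EXPECTED_PERMILLE:
        return "under"  # would be marked blue in the table
    return "expected"


# Toy input, not real viral proteins.
freqs = dipeptide_permille(["MSSA", "SSK"])
print(freqs["SS"], classify(freqs["SS"]))
```

The toy input only shows the mechanics; real use would pass the full protein sequences of the proteome.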

All values are given in per mille (‰), so they need to be multiplied by 10⁻³ to obtain fractions.
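As a small worked example of the unit conversion, the sketch below turns a per mille value into a fraction and compares it with the uniform expectation of 1/400. The helper names are hypothetical; the input is the SerSer value from the table.

```python
def permille_to_fraction(value_permille):
    """Convert a per mille value to a plain fraction (multiply by 10^-3)."""
    return value_permille * 1e-3


def fold_vs_uniform(value_permille, n_dipeptides=400):
    """Ratio of an observed frequency to the uniform expectation 1/n_dipeptides."""
    expected_permille = 1000.0 / n_dipeptides  # 2.5 per mille for 400 dipeptides
    return value_permille / expected_permille


ser_ser = 17.794  # SerSer frequency from the table, in per mille
print(permille_to_fraction(ser_ser))  # fraction of all dipeptides
print(fold_vs_uniform(ser_ser))       # how many times the uniform expectation
```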

For more information, see the sequence space article on Wikipedia.

Ala Cys Asp Glu Phe Gly His Ile Lys Leu Met Asn Pro Gln Arg Ser Thr Val Trp Tyr Xaa
Ala
AlaAla: 2.669 ± 1.376
AlaCys: 0.89 ± 0.785
AlaAsp: 1.779 ± 0.738
AlaGlu: 0.89 ± 0.656
AlaPhe: 1.779 ± 1.0
AlaGly: 1.779 ± 1.009
AlaHis: 2.669 ± 1.437
AlaIle: 0.89 ± 0.656
AlaLys: 1.779 ± 1.161
AlaLeu: 5.338 ± 1.668
AlaMet: 0.0 ± 0.0
AlaAsn: 1.779 ± 0.738
AlaPro: 0.89 ± 0.785
AlaGln: 5.338 ± 2.407
AlaArg: 3.559 ± 2.289
AlaSer: 4.448 ± 2.759
AlaThr: 6.228 ± 2.036
AlaVal: 0.89 ± 0.967
AlaTrp: 1.779 ± 0.738
AlaTyr: 0.0 ± 0.0
AlaXaa: 0.0 ± 0.0
Cys
CysAla: 0.89 ± 0.656
CysCys: 1.779 ± 2.449
CysAsp: 0.0 ± 0.0
CysGlu: 1.779 ± 1.32
CysPhe: 0.89 ± 0.874
CysGly: 2.669 ± 1.967
CysHis: 0.0 ± 0.0
CysIle: 1.779 ± 1.662
CysLys: 1.779 ± 0.738
CysLeu: 0.89 ± 0.831
CysMet: 0.89 ± 1.225
CysAsn: 0.89 ± 0.656
CysPro: 1.779 ± 2.449
CysGln: 1.779 ± 0.884
CysArg: 3.559 ± 1.447
CysSer: 4.448 ± 1.933
CysThr: 3.559 ± 0.956
CysVal: 1.779 ± 1.57
CysTrp: 0.0 ± 0.0
CysTyr: 0.89 ± 0.874
CysXaa: 0.0 ± 0.0
Asp
AspAla: 1.779 ± 1.0
AspCys: 0.89 ± 0.656
AspAsp: 2.669 ± 1.317
AspGlu: 2.669 ± 0.883
AspPhe: 0.89 ± 0.785
AspGly: 0.89 ± 0.656
AspHis: 0.89 ± 0.874
AspIle: 2.669 ± 1.583
AspLys: 0.89 ± 0.785
AspLeu: 3.559 ± 1.468
AspMet: 0.0 ± 0.0
AspAsn: 0.89 ± 0.785
AspPro: 2.669 ± 1.246
AspGln: 2.669 ± 1.872
AspArg: 2.669 ± 1.376
AspSer: 6.228 ± 1.66
AspThr: 1.779 ± 1.423
AspVal: 6.228 ± 2.118
AspTrp: 1.779 ± 0.884
AspTyr: 0.89 ± 1.225
AspXaa: 0.0 ± 0.0
Glu
GluAla: 2.669 ± 1.352
GluCys: 0.89 ± 0.831
GluAsp: 2.669 ± 1.862
GluGlu: 2.669 ± 1.127
GluPhe: 2.669 ± 1.446
GluGly: 3.559 ± 1.266
GluHis: 0.89 ± 0.831
GluIle: 0.0 ± 0.0
GluLys: 0.89 ± 0.656
GluLeu: 1.779 ± 1.169
GluMet: 0.0 ± 0.0
GluAsn: 3.559 ± 2.116
GluPro: 2.669 ± 1.466
GluGln: 2.669 ± 2.422
GluArg: 2.669 ± 1.447
GluSer: 3.559 ± 1.032
GluThr: 0.0 ± 0.0
GluVal: 0.89 ± 0.831
GluTrp: 0.0 ± 0.0
GluTyr: 0.89 ± 0.831
GluXaa: 0.0 ± 0.0
Phe
PheAla: 0.0 ± 0.0
PheCys: 0.89 ± 0.656
PheAsp: 2.669 ± 1.73
PheGlu: 2.669 ± 1.609
PhePhe: 4.448 ± 1.631
PheGly: 0.89 ± 0.785
PheHis: 3.559 ± 1.468
PheIle: 2.669 ± 1.169
PheLys: 2.669 ± 1.967
PheLeu: 4.448 ± 1.108
PheMet: 0.89 ± 0.656
PheAsn: 4.448 ± 1.605
PhePro: 3.559 ± 1.339
PheGln: 0.89 ± 0.831
PheArg: 6.228 ± 2.331
PheSer: 2.669 ± 1.967
PheThr: 2.669 ± 1.886
PheVal: 0.89 ± 0.967
PheTrp: 0.0 ± 0.0
PheTyr: 0.89 ± 0.785
PheXaa: 0.0 ± 0.0
Gly
GlyAla: 3.559 ± 1.937
GlyCys: 3.559 ± 1.16
GlyAsp: 0.89 ± 0.831
GlyGlu: 3.559 ± 1.449
GlyPhe: 0.89 ± 1.225
GlyGly: 1.779 ± 0.738
GlyHis: 1.779 ± 1.311
GlyIle: 2.669 ± 1.76
GlyLys: 5.338 ± 2.839
GlyLeu: 2.669 ± 0.883
GlyMet: 0.0 ± 0.0
GlyAsn: 0.89 ± 0.874
GlyPro: 2.669 ± 1.155
GlyGln: 2.669 ± 1.155
GlyArg: 2.669 ± 1.447
GlySer: 2.669 ± 1.967
GlyThr: 3.559 ± 1.266
GlyVal: 0.0 ± 0.0
GlyTrp: 0.89 ± 0.785
GlyTyr: 1.779 ± 1.423
GlyXaa: 0.0 ± 0.0
His
HisAla: 2.669 ± 1.352
HisCys: 1.779 ± 1.396
HisAsp: 2.669 ± 1.774
HisGlu: 0.89 ± 1.225
HisPhe: 4.448 ± 2.158
HisGly: 0.89 ± 1.225
HisHis: 0.0 ± 0.0
HisIle: 1.779 ± 0.738
HisLys: 0.89 ± 0.874
HisLeu: 2.669 ± 1.397
HisMet: 0.0 ± 0.0
HisAsn: 1.779 ± 1.201
HisPro: 1.779 ± 1.007
HisGln: 2.669 ± 1.246
HisArg: 3.559 ± 1.318
HisSer: 4.448 ± 1.594
HisThr: 0.89 ± 0.785
HisVal: 2.669 ± 1.436
HisTrp: 0.0 ± 0.0
HisTyr: 0.89 ± 0.656
HisXaa: 0.0 ± 0.0
Ile
IleAla: 0.89 ± 0.874
IleCys: 0.89 ± 1.225
IleAsp: 0.89 ± 0.656
IleGlu: 0.0 ± 0.0
IlePhe: 2.669 ± 1.317
IleGly: 0.89 ± 0.785
IleHis: 3.559 ± 1.85
IleIle: 6.228 ± 2.353
IleLys: 4.448 ± 1.108
IleLeu: 1.779 ± 1.217
IleMet: 2.669 ± 1.902
IleAsn: 0.0 ± 0.0
IlePro: 1.779 ± 0.884
IleGln: 3.559 ± 1.979
IleArg: 7.117 ± 1.997
IleSer: 7.117 ± 2.505
IleThr: 4.448 ± 1.865
IleVal: 1.779 ± 1.009
IleTrp: 2.669 ± 1.717
IleTyr: 2.669 ± 1.256
IleXaa: 0.0 ± 0.0
Lys
LysAla: 3.559 ± 1.714
LysCys: 2.669 ± 1.447
LysAsp: 1.779 ± 1.0
LysGlu: 2.669 ± 1.155
LysPhe: 3.559 ± 1.367
LysGly: 1.779 ± 0.884
LysHis: 1.779 ± 1.311
LysIle: 3.559 ± 1.439
LysLys: 1.779 ± 0.738
LysLeu: 0.0 ± 0.0
LysMet: 0.0 ± 0.0
LysAsn: 5.338 ± 2.31
LysPro: 4.448 ± 1.824
LysGln: 0.0 ± 0.0
LysArg: 4.448 ± 1.824
LysSer: 5.338 ± 2.382
LysThr: 1.779 ± 0.738
LysVal: 3.559 ± 2.286
LysTrp: 0.89 ± 0.785
LysTyr: 3.559 ± 1.154
LysXaa: 0.0 ± 0.0
Leu
LeuAla: 0.89 ± 0.874
LeuCys: 2.669 ± 1.447
LeuAsp: 2.669 ± 1.317
LeuGlu: 1.779 ± 0.884
LeuPhe: 1.779 ± 0.884
LeuGly: 4.448 ± 1.554
LeuHis: 1.779 ± 1.007
LeuIle: 5.338 ± 2.335
LeuLys: 6.228 ± 1.985
LeuLeu: 4.448 ± 3.219
LeuMet: 0.89 ± 0.785
LeuAsn: 2.669 ± 1.052
LeuPro: 0.0 ± 0.0
LeuGln: 3.559 ± 1.787
LeuArg: 7.117 ± 1.52
LeuSer: 7.117 ± 2.462
LeuThr: 4.448 ± 1.02
LeuVal: 2.669 ± 1.73
LeuTrp: 0.0 ± 0.0
LeuTyr: 5.338 ± 0.972
LeuXaa: 0.0 ± 0.0
Met
MetAla: 0.89 ± 0.785
MetCys: 2.669 ± 0.883
MetAsp: 2.669 ± 1.256
MetGlu: 1.779 ± 1.237
MetPhe: 2.669 ± 1.71
MetGly: 1.779 ± 1.169
MetHis: 0.0 ± 0.0
MetIle: 0.0 ± 0.0
MetLys: 0.0 ± 0.0
MetLeu: 1.779 ± 1.32
MetMet: 0.0 ± 0.0
MetAsn: 0.89 ± 0.785
MetPro: 0.89 ± 0.656
MetGln: 0.0 ± 0.0
MetArg: 0.89 ± 0.874
MetSer: 2.669 ± 0.883
MetThr: 0.89 ± 0.874
MetVal: 0.0 ± 0.0
MetTrp: 1.779 ± 1.597
MetTyr: 1.779 ± 1.57
MetXaa: 0.0 ± 0.0
Asn
AsnAla: 1.779 ± 0.738
AsnCys: 0.89 ± 0.831
AsnAsp: 0.89 ± 0.656
AsnGlu: 1.779 ± 1.32
AsnPhe: 1.779 ± 1.161
AsnGly: 0.0 ± 0.0
AsnHis: 4.448 ± 1.793
AsnIle: 2.669 ± 1.351
AsnLys: 0.89 ± 0.656
AsnLeu: 4.448 ± 2.482
AsnMet: 3.559 ± 1.299
AsnAsn: 1.779 ± 1.161
AsnPro: 1.779 ± 1.009
AsnGln: 2.669 ± 1.155
AsnArg: 4.448 ± 1.302
AsnSer: 5.338 ± 2.195
AsnThr: 1.779 ± 1.311
AsnVal: 1.779 ± 0.738
AsnTrp: 0.89 ± 0.656
AsnTyr: 1.779 ± 1.32
AsnXaa: 0.0 ± 0.0
Pro
ProAla: 1.779 ± 1.57
ProCys: 3.559 ± 1.49
ProAsp: 1.779 ± 1.32
ProGlu: 0.89 ± 0.831
ProPhe: 0.89 ± 0.874
ProGly: 0.89 ± 0.656
ProHis: 2.669 ± 1.437
ProIle: 3.559 ± 1.985
ProLys: 2.669 ± 1.967
ProLeu: 2.669 ± 1.246
ProMet: 0.89 ± 0.785
ProAsn: 2.669 ± 1.127
ProPro: 1.779 ± 1.0
ProGln: 2.669 ± 0.962
ProArg: 6.228 ± 1.29
ProSer: 5.338 ± 1.568
ProThr: 4.448 ± 2.776
ProVal: 3.559 ± 1.418
ProTrp: 0.89 ± 0.874
ProTyr: 1.779 ± 1.32
ProXaa: 0.0 ± 0.0
Gln
GlnAla: 3.559 ± 1.266
GlnCys: 0.89 ± 0.656
GlnAsp: 4.448 ± 3.615
GlnGlu: 1.779 ± 1.161
GlnPhe: 1.779 ± 1.217
GlnGly: 4.448 ± 1.927
GlnHis: 3.559 ± 0.956
GlnIle: 1.779 ± 1.311
GlnLys: 2.669 ± 2.303
GlnLeu: 3.559 ± 2.289
GlnMet: 1.779 ± 0.884
GlnAsn: 0.89 ± 0.831
GlnPro: 7.117 ± 3.27
GlnGln: 1.779 ± 1.161
GlnArg: 0.89 ± 0.967
GlnSer: 8.007 ± 1.76
GlnThr: 0.0 ± 0.0
GlnVal: 1.779 ± 1.009
GlnTrp: 0.0 ± 0.0
GlnTyr: 2.669 ± 1.256
GlnXaa: 0.0 ± 0.0
Arg
ArgAla: 2.669 ± 1.052
ArgCys: 2.669 ± 2.303
ArgAsp: 3.559 ± 1.714
ArgGlu: 4.448 ± 1.926
ArgPhe: 4.448 ± 1.915
ArgGly: 3.559 ± 1.477
ArgHis: 1.779 ± 1.32
ArgIle: 7.117 ± 4.084
ArgLys: 4.448 ± 1.353
ArgLeu: 2.669 ± 0.962
ArgMet: 2.669 ± 1.71
ArgAsn: 3.559 ± 1.468
ArgPro: 4.448 ± 2.309
ArgGln: 5.338 ± 1.757
ArgArg: 9.786 ± 4.586
ArgSer: 9.786 ± 2.531
ArgThr: 4.448 ± 1.825
ArgVal: 6.228 ± 1.999
ArgTrp: 0.89 ± 0.656
ArgTyr: 1.779 ± 1.32
ArgXaa: 0.0 ± 0.0
Ser
SerAla: 6.228 ± 2.971
SerCys: 2.669 ± 1.256
SerAsp: 5.338 ± 1.457
SerGlu: 1.779 ± 0.738
SerPhe: 6.228 ± 2.013
SerGly: 2.669 ± 1.397
SerHis: 2.669 ± 1.157
SerIle: 6.228 ± 2.694
SerLys: 5.338 ± 1.858
SerLeu: 8.897 ± 4.171
SerMet: 3.559 ± 2.336
SerAsn: 4.448 ± 2.343
SerPro: 8.007 ± 2.234
SerGln: 4.448 ± 2.707
SerArg: 8.897 ± 1.832
SerSer: 17.794 ± 5.819
SerThr: 7.117 ± 2.641
SerVal: 8.007 ± 1.931
SerTrp: 0.89 ± 0.831
SerTyr: 2.669 ± 1.967
SerXaa: 0.0 ± 0.0
Thr
ThrAla: 4.448 ± 1.108
ThrCys: 0.89 ± 0.656
ThrAsp: 0.89 ± 0.656
ThrGlu: 0.89 ± 0.831
ThrPhe: 0.89 ± 0.656
ThrGly: 5.338 ± 1.689
ThrHis: 3.559 ± 1.418
ThrIle: 3.559 ± 2.012
ThrLys: 3.559 ± 1.477
ThrLeu: 6.228 ± 1.74
ThrMet: 0.0 ± 0.0
ThrAsn: 4.448 ± 1.302
ThrPro: 2.669 ± 1.238
ThrGln: 3.559 ± 2.031
ThrArg: 4.448 ± 1.794
ThrSer: 4.448 ± 2.416
ThrThr: 3.559 ± 1.802
ThrVal: 2.669 ± 1.794
ThrTrp: 0.89 ± 0.656
ThrTyr: 0.89 ± 0.656
ThrXaa: 0.0 ± 0.0
Val
ValAla: 0.0 ± 0.0
ValCys: 0.89 ± 0.967
ValAsp: 2.669 ± 0.883
ValGlu: 0.89 ± 1.225
ValPhe: 1.779 ± 1.009
ValGly: 4.448 ± 1.594
ValHis: 1.779 ± 1.32
ValIle: 2.669 ± 1.273
ValLys: 3.559 ± 2.434
ValLeu: 3.559 ± 1.678
ValMet: 1.779 ± 1.57
ValAsn: 1.779 ± 1.009
ValPro: 1.779 ± 1.091
ValGln: 5.338 ± 2.033
ValArg: 4.448 ± 2.109
ValSer: 5.338 ± 1.878
ValThr: 3.559 ± 3.141
ValVal: 1.779 ± 1.091
ValTrp: 0.0 ± 0.0
ValTyr: 4.448 ± 2.116
ValXaa: 0.0 ± 0.0
Trp
TrpAla: 1.779 ± 0.738
TrpCys: 0.0 ± 0.0
TrpAsp: 0.89 ± 1.225
TrpGlu: 0.0 ± 0.0
TrpPhe: 0.89 ± 0.656
TrpGly: 0.0 ± 0.0
TrpHis: 0.0 ± 0.0
TrpIle: 0.89 ± 0.874
TrpLys: 0.89 ± 0.656
TrpLeu: 0.0 ± 0.0
TrpMet: 1.779 ± 1.009
TrpAsn: 0.0 ± 0.0
TrpPro: 0.0 ± 0.0
TrpGln: 0.89 ± 0.656
TrpArg: 0.89 ± 0.874
TrpSer: 2.669 ± 1.872
TrpThr: 0.0 ± 0.0
TrpVal: 0.89 ± 0.785
TrpTrp: 0.0 ± 0.0
TrpTyr: 1.779 ± 0.738
TrpXaa: 0.0 ± 0.0
Tyr
TyrAla: 2.669 ± 2.356
TyrCys: 0.0 ± 0.0
TyrAsp: 1.779 ± 1.091
TyrGlu: 1.779 ± 1.32
TyrPhe: 2.669 ± 0.867
TyrGly: 1.779 ± 0.884
TyrHis: 0.0 ± 0.0
TyrIle: 0.89 ± 0.831
TyrLys: 1.779 ± 1.311
TyrLeu: 4.448 ± 1.933
TyrMet: 1.779 ± 1.031
TyrAsn: 2.669 ± 1.609
TyrPro: 0.89 ± 0.656
TyrGln: 0.89 ± 0.785
TyrArg: 1.779 ± 1.57
TyrSer: 4.448 ± 1.882
TyrThr: 2.669 ± 1.246
TyrVal: 4.448 ± 2.47
TyrTrp: 0.0 ± 0.0
TyrTyr: 2.669 ± 1.127
TyrXaa: 0.0 ± 0.0
Xaa
XaaAla: 0.0 ± 0.0
XaaCys: 0.0 ± 0.0
XaaAsp: 0.0 ± 0.0
XaaGlu: 0.0 ± 0.0
XaaPhe: 0.0 ± 0.0
XaaGly: 0.0 ± 0.0
XaaHis: 0.0 ± 0.0
XaaIle: 0.0 ± 0.0
XaaLys: 0.0 ± 0.0
XaaLeu: 0.0 ± 0.0
XaaMet: 0.0 ± 0.0
XaaAsn: 0.0 ± 0.0
XaaPro: 0.0 ± 0.0
XaaGln: 0.0 ± 0.0
XaaArg: 0.0 ± 0.0
XaaSer: 0.0 ± 0.0
XaaThr: 0.0 ± 0.0
XaaVal: 0.0 ± 0.0
XaaTrp: 0.0 ± 0.0
XaaTyr: 0.0 ± 0.0
XaaXaa: 0.0 ± 0.0
Statistics based on 6 proteins (1125 amino acids)

Note: The error has been estimated by bootstrapping (x100) at the protein level.
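A protein-level bootstrap of this kind can be sketched as follows. The page does not spell out the exact procedure, so this is an assumption-laden illustration: whole proteins are resampled with replacement 100 times, the dipeptide frequency is recomputed for each resample, and the standard deviation of those estimates is taken as the error. The function names `pair_permille` and `bootstrap_error` are hypothetical.

```python
import random
import statistics
from collections import Counter


def pair_permille(sequences, pair):
    """Frequency of one dipeptide (one-letter codes) in per mille."""
    counts = Counter()
    for seq in sequences:
        # Overlapping pairs, counted within each protein separately.
        counts.update(seq[i:i + 2] for i in range(len(seq) - 1))
    total = sum(counts.values())
    return 1000.0 * counts[pair] / total if total else 0.0


def bootstrap_error(sequences, pair, n_rounds=100, seed=0):
    """Standard deviation of the frequency over protein-level resamples.

    Assumption: each round draws len(sequences) proteins with replacement.
    """
    rng = random.Random(seed)
    estimates = []
    for _ in range(n_rounds):
        sample = rng.choices(sequences, k=len(sequences))
        estimates.append(pair_permille(sample, pair))
    return statistics.pstdev(estimates)


# Toy input, not real viral proteins.
toy = ["MSSA", "SSK", "MKR"]
print(pair_permille(toy, "SS"), "±", bootstrap_error(toy, "SS"))
```

Under this scheme a dipeptide that never occurs has frequency 0 in every resample, which is consistent with the 0.0 ± 0.0 entries in the table.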

The dipeptide statistics above (among other statistics for this proteome) can be downloaded from this CSV file
See this proteome in: uniprot_link
Proteome-pI is available under a Creative Commons Attribution-NoDerivs license; for more details see here

Reference: Kozlowski LP. Proteome-pI 2.0: Proteome Isoelectric Point Database Update. Nucleic Acids Res. 2021, doi: 10.1093/nar/gkab944
Contact: Lukasz P. Kozlowski