Amino acid dipeptide frequency for Sida yellow leaf curl virus

Apart from single amino acid frequencies, one can also calculate the so-called amino acid dipeptide frequency.

There are 400 possible dipeptides (441 if we consider Xaa as an additional 21st amino acid representing all non-standard or unknown amino acids). Thus, if dipeptides occurred uniformly at random in proteins, each amino acid dipeptide would be present at a frequency of 0.25% (2.5‰). As this is not the case in nature, for better visibility all dipeptides that occur more often than expected are marked in red, and those that are underrepresented are marked in blue in the table.
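As a rough illustration of the calculation described above, the following Python sketch counts overlapping residue pairs and reports them in per mille, alongside the uniform expectation of 2.5‰. It is not the database's own code: the function name `dipeptide_permille`, the use of one-letter codes (with `X` standing in for Xaa), the toy sequences, and the choice to count pairs within each protein rather than across protein boundaries are all assumptions made for this example.

```python
from collections import Counter
from itertools import product

STANDARD = "ACDEFGHIKLMNPQRSTVWY"  # 20 standard amino acids, one-letter codes


def dipeptide_permille(proteins, alphabet=STANDARD + "X"):
    """Return {dipeptide: frequency in per mille} for all 21 x 21 pairs.

    Assumption: pairs are counted inside each protein only, and any
    residue outside the alphabet is mapped to X (the Xaa placeholder).
    """
    valid = set(alphabet)
    counts = Counter()
    for seq in proteins:
        seq = "".join(c if c in valid else "X" for c in seq.upper())
        # Overlapping windows of length 2: positions (0,1), (1,2), ...
        counts.update(seq[i:i + 2] for i in range(len(seq) - 1))
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no dipeptides found in the input sequences")
    return {
        a + b: 1000.0 * counts[a + b] / total
        for a, b in product(alphabet, repeat=2)
    }


# Hypothetical toy input, not the viral proteome itself.
freqs = dipeptide_permille(["MAAKL", "AAGK"])
expected = 1000.0 / len(STANDARD) ** 2  # uniform expectation over 400 pairs

over = sorted(d for d, f in freqs.items() if f > expected)
under = sorted(d for d, f in freqs.items() if f < expected)
print(f"expected: {expected} per mille, over-represented: {over}")
```

With such a short toy input almost every dipeptide is absent, so nearly all 441 entries fall in the underrepresented group; on a real proteome the split is far less extreme.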

All values are given in per mille (‰) and therefore need to be multiplied by 10^-3 to obtain fractions.
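The unit conversion can be checked with a few lines of Python. This is only a sketch: the helper name `permille_to_fraction` is made up for the example, the SerPro value is copied from the table, and the 2.5‰ reference is the uniform expectation for 400 dipeptides.

```python
def permille_to_fraction(value_permille):
    """Convert a per mille value to a plain fraction (multiply by 10^-3)."""
    return value_permille * 1e-3


ser_pro = 10.02          # SerPro entry from the table, in per mille
uniform_expected = 2.5   # 1/400 expressed in per mille

fraction = permille_to_fraction(ser_pro)  # roughly 0.01
percent = fraction * 100                  # roughly 1 percent
ratio = ser_pro / uniform_expected        # fold change over the uniform expectation

print(f"fraction={fraction:.5f}, percent={percent:.3f}, ratio={ratio:.2f}")
```

In other words, SerPro makes up about 1% of all dipeptides here, roughly four times the uniform expectation.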

For more information, see the sequence space article on Wikipedia.

Ala Cys Asp Glu Phe Gly His Ile Lys Leu Met Asn Pro Gln Arg Ser Thr Val Trp Tyr Xaa
Ala
AlaAla: 6.012 ± 3.326
AlaCys: 1.002 ± 0.782
AlaAsp: 1.002 ± 0.74
AlaGlu: 1.002 ± 1.167
AlaPhe: 0.0 ± 0.0
AlaGly: 4.008 ± 1.536
AlaHis: 0.0 ± 0.0
AlaIle: 0.0 ± 0.0
AlaLys: 5.01 ± 1.635
AlaLeu: 7.014 ± 2.194
AlaMet: 0.0 ± 0.0
AlaAsn: 1.002 ± 0.74
AlaPro: 0.0 ± 0.0
AlaGln: 1.002 ± 0.74
AlaArg: 3.006 ± 2.221
AlaSer: 7.014 ± 2.948
AlaThr: 4.008 ± 1.358
AlaVal: 3.006 ± 1.527
AlaTrp: 0.0 ± 0.0
AlaTyr: 1.002 ± 1.167
AlaXaa: 0.0 ± 0.0
Cys
CysAla: 1.002 ± 1.184
CysCys: 0.0 ± 0.0
CysAsp: 1.002 ± 0.74
CysGlu: 2.004 ± 0.768
CysPhe: 1.002 ± 1.102
CysGly: 1.002 ± 1.184
CysHis: 0.0 ± 0.0
CysIle: 2.004 ± 1.369
CysLys: 2.004 ± 0.768
CysLeu: 2.004 ± 1.133
CysMet: 0.0 ± 0.0
CysAsn: 1.002 ± 0.74
CysPro: 0.0 ± 0.0
CysGln: 0.0 ± 0.0
CysArg: 0.0 ± 0.0
CysSer: 1.002 ± 1.184
CysThr: 2.004 ± 1.369
CysVal: 1.002 ± 0.782
CysTrp: 0.0 ± 0.0
CysTyr: 0.0 ± 0.0
CysXaa: 0.0 ± 0.0
Asp
AspAla: 0.0 ± 0.0
AspCys: 0.0 ± 0.0
AspAsp: 4.008 ± 2.138
AspGlu: 4.008 ± 1.958
AspPhe: 6.012 ± 1.88
AspGly: 2.004 ± 1.133
AspHis: 0.0 ± 0.0
AspIle: 5.01 ± 1.024
AspLys: 1.002 ± 0.74
AspLeu: 5.01 ± 2.152
AspMet: 1.002 ± 0.99
AspAsn: 2.004 ± 1.369
AspPro: 3.006 ± 1.021
AspGln: 1.002 ± 1.184
AspArg: 4.008 ± 1.358
AspSer: 5.01 ± 1.118
AspThr: 1.002 ± 0.74
AspVal: 5.01 ± 1.99
AspTrp: 1.002 ± 0.74
AspTyr: 2.004 ± 1.481
AspXaa: 0.0 ± 0.0
Glu
GluAla: 3.006 ± 1.29
GluCys: 1.002 ± 1.184
GluAsp: 1.002 ± 1.167
GluGlu: 4.008 ± 2.961
GluPhe: 0.0 ± 0.0
GluGly: 4.008 ± 1.958
GluHis: 1.002 ± 0.74
GluIle: 1.002 ± 1.167
GluLys: 0.0 ± 0.0
GluLeu: 2.004 ± 1.14
GluMet: 1.002 ± 0.74
GluAsn: 6.012 ± 2.507
GluPro: 2.004 ± 0.768
GluGln: 3.006 ± 1.765
GluArg: 3.006 ± 2.109
GluSer: 3.006 ± 1.267
GluThr: 1.002 ± 1.102
GluVal: 1.002 ± 0.74
GluTrp: 3.006 ± 1.021
GluTyr: 2.004 ± 1.481
GluXaa: 0.0 ± 0.0
Phe
PheAla: 1.002 ± 1.167
PheCys: 1.002 ± 0.782
PheAsp: 3.006 ± 1.021
PheGlu: 0.0 ± 0.0
PhePhe: 1.002 ± 0.74
PheGly: 2.004 ± 0.768
PheHis: 2.004 ± 1.133
PheIle: 1.002 ± 0.74
PheLys: 2.004 ± 2.334
PheLeu: 4.008 ± 2.113
PheMet: 1.002 ± 0.74
PheAsn: 5.01 ± 1.305
PhePro: 3.006 ± 1.407
PheGln: 3.006 ± 1.683
PheArg: 4.008 ± 2.129
PheSer: 2.004 ± 1.246
PheThr: 4.008 ± 2.12
PheVal: 0.0 ± 0.0
PheTrp: 3.006 ± 1.888
PheTyr: 2.004 ± 1.564
PheXaa: 0.0 ± 0.0
Gly
GlyAla: 1.002 ± 0.74
GlyCys: 2.004 ± 1.369
GlyAsp: 3.006 ± 1.407
GlyGlu: 5.01 ± 1.977
GlyPhe: 3.006 ± 1.407
GlyGly: 7.014 ± 2.806
GlyHis: 1.002 ± 0.74
GlyIle: 4.008 ± 1.111
GlyLys: 7.014 ± 2.806
GlyLeu: 2.004 ± 1.531
GlyMet: 0.0 ± 0.0
GlyAsn: 2.004 ± 0.768
GlyPro: 3.006 ± 1.362
GlyGln: 4.008 ± 1.43
GlyArg: 1.002 ± 0.74
GlySer: 3.006 ± 1.407
GlyThr: 6.012 ± 2.287
GlyVal: 3.006 ± 2.185
GlyTrp: 0.0 ± 0.0
GlyTyr: 0.0 ± 0.0
GlyXaa: 0.0 ± 0.0
His
HisAla: 1.002 ± 0.782
HisCys: 2.004 ± 1.069
HisAsp: 2.004 ± 1.369
HisGlu: 0.0 ± 0.0
HisPhe: 0.0 ± 0.0
HisGly: 2.004 ± 1.531
HisHis: 1.002 ± 1.184
HisIle: 2.004 ± 1.656
HisLys: 2.004 ± 1.262
HisLeu: 2.004 ± 1.481
HisMet: 0.0 ± 0.0
HisAsn: 5.01 ± 2.773
HisPro: 3.006 ± 1.407
HisGln: 1.002 ± 0.782
HisArg: 3.006 ± 2.437
HisSer: 2.004 ± 1.14
HisThr: 3.006 ± 1.765
HisVal: 3.006 ± 1.745
HisTrp: 1.002 ± 0.74
HisTyr: 0.0 ± 0.0
HisXaa: 0.0 ± 0.0
Ile
IleAla: 0.0 ± 0.0
IleCys: 0.0 ± 0.0
IleAsp: 2.004 ± 1.069
IleGlu: 0.0 ± 0.0
IlePhe: 5.01 ± 2.63
IleGly: 2.004 ± 1.069
IleHis: 0.0 ± 0.0
IleIle: 2.004 ± 1.133
IleLys: 7.014 ± 0.837
IleLeu: 2.004 ± 1.262
IleMet: 0.0 ± 0.0
IleAsn: 0.0 ± 0.0
IlePro: 3.006 ± 1.407
IleGln: 2.004 ± 1.14
IleArg: 5.01 ± 3.143
IleSer: 4.008 ± 2.737
IleThr: 8.016 ± 3.287
IleVal: 3.006 ± 1.565
IleTrp: 1.002 ± 0.782
IleTyr: 4.008 ± 2.036
IleXaa: 0.0 ± 0.0
Lys
LysAla: 4.008 ± 1.193
LysCys: 0.0 ± 0.0
LysAsp: 6.012 ± 4.442
LysGlu: 3.006 ± 2.221
LysPhe: 5.01 ± 1.318
LysGly: 2.004 ± 1.481
LysHis: 1.002 ± 0.74
LysIle: 3.006 ± 1.745
LysLys: 1.002 ± 1.184
LysLeu: 4.008 ± 1.044
LysMet: 0.0 ± 0.0
LysAsn: 4.008 ± 1.536
LysPro: 5.01 ± 1.011
LysGln: 0.0 ± 0.0
LysArg: 5.01 ± 2.57
LysSer: 5.01 ± 1.635
LysThr: 4.008 ± 2.266
LysVal: 7.014 ± 3.432
LysTrp: 0.0 ± 0.0
LysTyr: 2.004 ± 0.768
LysXaa: 0.0 ± 0.0
Leu
LeuAla: 1.002 ± 0.74
LeuCys: 2.004 ± 0.768
LeuAsp: 9.018 ± 2.872
LeuGlu: 2.004 ± 1.656
LeuPhe: 2.004 ± 1.531
LeuGly: 5.01 ± 1.118
LeuHis: 3.006 ± 1.527
LeuIle: 5.01 ± 2.188
LeuLys: 5.01 ± 1.99
LeuLeu: 4.008 ± 2.493
LeuMet: 0.0 ± 0.0
LeuAsn: 3.006 ± 1.83
LeuPro: 4.008 ± 2.175
LeuGln: 2.004 ± 1.481
LeuArg: 5.01 ± 1.305
LeuSer: 6.012 ± 3.399
LeuThr: 5.01 ± 2.147
LeuVal: 5.01 ± 1.318
LeuTrp: 0.0 ± 0.0
LeuTyr: 7.014 ± 4.335
LeuXaa: 0.0 ± 0.0
Met
MetAla: 1.002 ± 0.782
MetCys: 0.0 ± 0.0
MetAsp: 3.006 ± 1.745
MetGlu: 0.0 ± 0.0
MetPhe: 2.004 ± 1.564
MetGly: 0.0 ± 0.0
MetHis: 1.002 ± 0.782
MetIle: 0.0 ± 0.0
MetLys: 2.004 ± 1.133
MetLeu: 4.008 ± 1.415
MetMet: 0.0 ± 0.0
MetAsn: 0.0 ± 0.0
MetPro: 2.004 ± 0.768
MetGln: 2.004 ± 1.133
MetArg: 0.0 ± 0.0
MetSer: 0.0 ± 0.0
MetThr: 0.0 ± 0.0
MetVal: 0.0 ± 0.0
MetTrp: 1.002 ± 0.74
MetTyr: 1.002 ± 0.782
MetXaa: 0.0 ± 0.0
Asn
AsnAla: 6.012 ± 3.386
AsnCys: 1.002 ± 0.74
AsnAsp: 2.004 ± 1.369
AsnGlu: 3.006 ± 1.362
AsnPhe: 2.004 ± 1.262
AsnGly: 2.004 ± 1.262
AsnHis: 7.014 ± 3.603
AsnIle: 4.008 ± 1.958
AsnLys: 3.006 ± 1.021
AsnLeu: 2.004 ± 1.14
AsnMet: 2.004 ± 1.424
AsnAsn: 2.004 ± 1.262
AsnPro: 8.016 ± 2.791
AsnGln: 0.0 ± 0.0
AsnArg: 1.002 ± 0.782
AsnSer: 5.01 ± 1.011
AsnThr: 0.0 ± 0.0
AsnVal: 4.008 ± 2.113
AsnTrp: 0.0 ± 0.0
AsnTyr: 3.006 ± 2.221
AsnXaa: 0.0 ± 0.0
Pro
ProAla: 1.002 ± 1.102
ProCys: 2.004 ± 1.369
ProAsp: 7.014 ± 3.367
ProGlu: 3.006 ± 2.109
ProPhe: 2.004 ± 1.133
ProGly: 2.004 ± 1.481
ProHis: 4.008 ± 1.401
ProIle: 1.002 ± 1.102
ProLys: 6.012 ± 2.053
ProLeu: 4.008 ± 2.113
ProMet: 3.006 ± 1.4
ProAsn: 1.002 ± 0.74
ProPro: 2.004 ± 1.069
ProGln: 7.014 ± 3.123
ProArg: 7.014 ± 1.676
ProSer: 6.012 ± 2.105
ProThr: 4.008 ± 3.168
ProVal: 6.012 ± 3.003
ProTrp: 2.004 ± 1.481
ProTyr: 1.002 ± 0.782
ProXaa: 0.0 ± 0.0
Gln
GlnAla: 3.006 ± 0.98
GlnCys: 1.002 ± 0.74
GlnAsp: 1.002 ± 1.184
GlnGlu: 3.006 ± 1.021
GlnPhe: 0.0 ± 0.0
GlnGly: 1.002 ± 1.184
GlnHis: 1.002 ± 1.184
GlnIle: 2.004 ± 1.14
GlnLys: 2.004 ± 1.481
GlnLeu: 4.008 ± 1.415
GlnMet: 1.002 ± 0.74
GlnAsn: 0.0 ± 0.0
GlnPro: 2.004 ± 1.531
GlnGln: 0.0 ± 0.0
GlnArg: 2.004 ± 1.246
GlnSer: 7.014 ± 1.81
GlnThr: 1.002 ± 0.74
GlnVal: 4.008 ± 2.392
GlnTrp: 0.0 ± 0.0
GlnTyr: 3.006 ± 1.29
GlnXaa: 0.0 ± 0.0
Arg
ArgAla: 4.008 ± 2.308
ArgCys: 2.004 ± 2.203
ArgAsp: 3.006 ± 2.346
ArgGlu: 3.006 ± 1.267
ArgPhe: 7.014 ± 3.368
ArgGly: 8.016 ± 3.517
ArgHis: 1.002 ± 0.782
ArgIle: 5.01 ± 1.444
ArgLys: 1.002 ± 0.782
ArgLeu: 4.008 ± 2.12
ArgMet: 1.002 ± 1.102
ArgAsn: 2.004 ± 1.481
ArgPro: 4.008 ± 1.536
ArgGln: 3.006 ± 1.484
ArgArg: 7.014 ± 4.419
ArgSer: 3.006 ± 1.304
ArgThr: 5.01 ± 1.988
ArgVal: 4.008 ± 1.044
ArgTrp: 0.0 ± 0.0
ArgTyr: 1.002 ± 1.167
ArgXaa: 0.0 ± 0.0
Ser
SerAla: 2.004 ± 1.481
SerCys: 1.002 ± 1.184
SerAsp: 2.004 ± 0.768
SerGlu: 1.002 ± 0.782
SerPhe: 2.004 ± 1.069
SerGly: 6.012 ± 3.466
SerHis: 1.002 ± 0.782
SerIle: 4.008 ± 1.18
SerLys: 3.006 ± 1.026
SerLeu: 7.014 ± 2.887
SerMet: 1.002 ± 2.06
SerAsn: 8.016 ± 1.598
SerPro: 10.02 ± 3.837
SerGln: 2.004 ± 2.369
SerArg: 6.012 ± 1.88
SerSer: 6.012 ± 4.405
SerThr: 6.012 ± 2.983
SerVal: 4.008 ± 1.472
SerTrp: 2.004 ± 0.768
SerTyr: 3.006 ± 0.98
SerXaa: 0.0 ± 0.0
Thr
ThrAla: 3.006 ± 1.484
ThrCys: 0.0 ± 0.0
ThrAsp: 1.002 ± 1.167
ThrGlu: 1.002 ± 0.782
ThrPhe: 1.002 ± 0.74
ThrGly: 3.006 ± 0.98
ThrHis: 5.01 ± 2.299
ThrIle: 2.004 ± 1.531
ThrLys: 2.004 ± 1.481
ThrLeu: 4.008 ± 1.43
ThrMet: 1.002 ± 0.74
ThrAsn: 5.01 ± 1.318
ThrPro: 11.022 ± 7.251
ThrGln: 1.002 ± 0.74
ThrArg: 4.008 ± 1.78
ThrSer: 7.014 ± 2.95
ThrThr: 5.01 ± 4.656
ThrVal: 4.008 ± 2.361
ThrTrp: 1.002 ± 1.167
ThrTyr: 5.01 ± 2.147
ThrXaa: 0.0 ± 0.0
Val
ValAla: 2.004 ± 0.768
ValCys: 0.0 ± 0.0
ValAsp: 0.0 ± 0.0
ValGlu: 3.006 ± 1.527
ValPhe: 2.004 ± 1.369
ValGly: 2.004 ± 1.564
ValHis: 2.004 ± 2.369
ValIle: 4.008 ± 2.201
ValLys: 6.012 ± 2.304
ValLeu: 4.008 ± 1.044
ValMet: 3.006 ± 1.745
ValAsn: 7.014 ± 1.277
ValPro: 5.01 ± 1.024
ValGln: 4.008 ± 1.144
ValArg: 4.008 ± 2.014
ValSer: 4.008 ± 1.958
ValThr: 2.004 ± 1.564
ValVal: 2.004 ± 0.768
ValTrp: 1.002 ± 1.167
ValTyr: 7.014 ± 2.236
ValXaa: 0.0 ± 0.0
Trp
TrpAla: 2.004 ± 1.481
TrpCys: 0.0 ± 0.0
TrpAsp: 0.0 ± 0.0
TrpGlu: 1.002 ± 1.167
TrpPhe: 0.0 ± 0.0
TrpGly: 0.0 ± 0.0
TrpHis: 0.0 ± 0.0
TrpIle: 0.0 ± 0.0
TrpLys: 2.004 ± 0.768
TrpLeu: 1.002 ± 0.782
TrpMet: 1.002 ± 0.782
TrpAsn: 1.002 ± 1.184
TrpPro: 0.0 ± 0.0
TrpGln: 1.002 ± 0.74
TrpArg: 2.004 ± 1.369
TrpSer: 1.002 ± 0.74
TrpThr: 2.004 ± 1.14
TrpVal: 2.004 ± 0.768
TrpTrp: 0.0 ± 0.0
TrpTyr: 0.0 ± 0.0
TrpXaa: 0.0 ± 0.0
Tyr
TyrAla: 3.006 ± 2.346
TyrCys: 1.002 ± 0.74
TyrAsp: 1.002 ± 0.782
TyrGlu: 3.006 ± 1.765
TyrPhe: 3.006 ± 0.98
TyrGly: 2.004 ± 0.768
TyrHis: 4.008 ± 2.28
TyrIle: 3.006 ± 1.29
TyrLys: 2.004 ± 1.481
TyrLeu: 7.014 ± 3.258
TyrMet: 2.004 ± 1.226
TyrAsn: 2.004 ± 0.768
TyrPro: 1.002 ± 0.74
TyrGln: 1.002 ± 0.74
TyrArg: 2.004 ± 1.564
TyrSer: 1.002 ± 0.74
TyrThr: 3.006 ± 2.448
TyrVal: 3.006 ± 2.61
TyrTrp: 0.0 ± 0.0
TyrTyr: 0.0 ± 0.0
TyrXaa: 0.0 ± 0.0
Xaa
XaaAla: 0.0 ± 0.0
XaaCys: 0.0 ± 0.0
XaaAsp: 0.0 ± 0.0
XaaGlu: 0.0 ± 0.0
XaaPhe: 0.0 ± 0.0
XaaGly: 0.0 ± 0.0
XaaHis: 0.0 ± 0.0
XaaIle: 0.0 ± 0.0
XaaLys: 0.0 ± 0.0
XaaLeu: 0.0 ± 0.0
XaaMet: 0.0 ± 0.0
XaaAsn: 0.0 ± 0.0
XaaPro: 0.0 ± 0.0
XaaGln: 0.0 ± 0.0
XaaArg: 0.0 ± 0.0
XaaSer: 0.0 ± 0.0
XaaThr: 0.0 ± 0.0
XaaVal: 0.0 ± 0.0
XaaTrp: 0.0 ± 0.0
XaaTyr: 0.0 ± 0.0
XaaXaa: 0.0 ± 0.0
Statistics based on 5 proteins (999 amino acids)

Note: The error was estimated by bootstrapping (×100) at the protein level.
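A protein-level bootstrap of this kind can be sketched as follows. This is an illustrative reconstruction, not the database's actual procedure: the function names, the fixed seed, the use of the sample standard deviation as the reported error, and the toy sequences are all assumptions. The idea is to resample whole proteins with replacement, recompute the dipeptide frequency on each resample, and take the spread of those estimates.

```python
import random
import statistics
from collections import Counter


def dipeptide_permille(proteins, dipeptide):
    """Frequency of one dipeptide, in per mille, over a list of sequences."""
    counts = Counter()
    for seq in proteins:
        counts.update(seq[i:i + 2] for i in range(len(seq) - 1))
    total = sum(counts.values())
    return 1000.0 * counts[dipeptide] / total if total else 0.0


def bootstrap_error(proteins, dipeptide, n_resamples=100, seed=0):
    """Spread of the frequency when whole proteins are resampled.

    Assumption: the error is the sample standard deviation of the
    per-resample estimates; the source only says "bootstrapping (x100)".
    """
    rng = random.Random(seed)
    estimates = []
    for _ in range(n_resamples):
        # Draw as many proteins as the proteome has, with replacement.
        sample = rng.choices(proteins, k=len(proteins))
        estimates.append(dipeptide_permille(sample, dipeptide))
    return statistics.stdev(estimates)


# Hypothetical toy proteome, not the viral sequences themselves.
toy_proteome = ["MAAKLSP", "AAGKTP", "MKKLASP", "SPTPAA", "MTPSPK"]
value = dipeptide_permille(toy_proteome, "SP")
error = bootstrap_error(toy_proteome, "SP")
print(f"SP: {value:.3f} ± {error:.3f} per mille")
```

Resampling at the protein level rather than the residue level keeps each sequence intact, so a dipeptide concentrated in a single protein ends up with a large error, which may be why some table entries have errors comparable to their values.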

The dipeptide statistics above (among other statistics for this proteome) can be downloaded from this CSV file
See this proteome in: uniprot_link
Proteome-pI is available under the Creative Commons Attribution-NoDerivs license; for more details see here

Reference: Kozlowski LP. Proteome-pI 2.0: Proteome Isoelectric Point Database Update. Nucleic Acids Res. 2021, doi: 10.1093/nar/gkab944
Contact: Lukasz P. Kozlowski