Amino acid dipeptide frequency for Tomato golden mosaic virus (strain Yellow vein) (TGMV)

Apart from single amino acid frequencies, one can also calculate the so-called amino acid dipeptide frequency.

There are 400 possible dipeptides (441 if Xaa is counted as an additional 21st amino acid representing all non-standard or unknown amino acids). If dipeptides occurred at random in proteins, each of the 400 would therefore be present at a frequency of 0.25% (2.5‰). This is not the case in nature, so for better visibility dipeptides that are overrepresented relative to this expectation are marked in red in the table, and those that are underrepresented are marked in blue.

All values are given in per mille (‰), so they need to be multiplied by 10⁻³ to obtain fractions.
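The calculation described above can be sketched in a few lines of Python. This is a minimal illustration, not the code used by Proteome-pI: the function names, the one-letter alphabet, the mapping of unknown residues to "X", and the choice to count overlapping dipeptides within each protein are all assumptions, and the toy sequences are made up.

```python
from collections import Counter

# 20 standard residues in the one-letter code; "X" stands in for Xaa (assumption).
STANDARD = "ACDEFGHIKLMNPQRSTVWY"

# Uniform expectation: 1 of 400 possible dipeptides, i.e. 0.25% or 2.5 per mille.
EXPECTED_PER_MILLE = 1000.0 / (len(STANDARD) ** 2)


def dipeptide_per_mille(sequences):
    """Return {dipeptide: frequency in per mille} over all sequences.

    Assumption: dipeptides are overlapping windows of length 2 inside each
    protein, and any residue outside the 20 standard ones is mapped to "X".
    """
    counts = Counter()
    for seq in sequences:
        cleaned = "".join(aa if aa in STANDARD else "X" for aa in seq.upper())
        counts.update(cleaned[i:i + 2] for i in range(len(cleaned) - 1))
    total = sum(counts.values())
    if total == 0:
        return {}
    return {pair: 1000.0 * n / total for pair, n in counts.items()}


def classify(per_mille, expected=EXPECTED_PER_MILLE):
    """Label a frequency as over- or underrepresented versus the expectation."""
    if per_mille > expected:
        return "over"
    if per_mille < expected:
        return "under"
    return "expected"


toy = ["MSSTRR", "MKA"]  # hypothetical sequences, not TGMV proteins
freqs = dipeptide_per_mille(toy)
for pair, value in sorted(freqs.items()):
    print(pair, round(value, 3), classify(value))
```

Multiplying any returned value by 10⁻³ gives the plain fraction; with this threshold a table entry of 12.056‰ would be labelled "over" and 0.67‰ "under".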

For more information, see the sequence space article on Wikipedia.

Dipeptide frequency (‰) ± error. Rows: first amino acid; columns: second amino acid.

First | Ala | Cys | Asp | Glu | Phe | Gly | His | Ile | Lys | Leu | Met | Asn | Pro | Gln | Arg | Ser | Thr | Val | Trp | Tyr | Xaa
Ala | 3.349 ± 1.685 | 2.009 ± 0.812 | 0.0 ± 0.0 | 2.679 ± 1.156 | 1.34 ± 0.937 | 2.009 ± 1.308 | 0.67 ± 0.525 | 2.009 ± 1.376 | 6.028 ± 2.047 | 4.689 ± 1.46 | 0.67 ± 0.628 | 2.009 ± 1.436 | 1.34 ± 0.817 | 1.34 ± 0.768 | 3.349 ± 1.448 | 8.038 ± 2.307 | 3.349 ± 1.559 | 1.34 ± 0.937 | 0.67 ± 0.621 | 0.0 ± 0.0 | 0.0 ± 0.0
Cys | 0.67 ± 0.735 | 0.67 ± 0.525 | 0.67 ± 0.628 | 0.67 ± 0.621 | 0.0 ± 0.0 | 0.67 ± 0.735 | 0.0 ± 0.0 | 2.009 ± 0.921 | 1.34 ± 0.656 | 1.34 ± 1.091 | 0.67 ± 0.628 | 2.009 ± 0.667 | 0.67 ± 0.628 | 1.34 ± 1.051 | 1.34 ± 0.768 | 1.34 ± 0.768 | 1.34 ± 0.817 | 1.34 ± 0.817 | 1.34 ± 1.342 | 0.0 ± 0.0 | 0.0 ± 0.0
Asp | 2.679 ± 1.154 | 0.67 ± 0.735 | 2.679 ± 1.379 | 2.679 ± 0.653 | 2.679 ± 1.245 | 2.679 ± 1.638 | 0.67 ± 0.628 | 3.349 ± 1.449 | 3.349 ± 1.702 | 6.698 ± 1.157 | 0.0 ± 0.0 | 2.679 ± 1.599 | 2.009 ± 1.399 | 0.67 ± 0.628 | 5.358 ± 1.669 | 6.028 ± 0.967 | 0.67 ± 0.525 | 3.349 ± 1.227 | 0.67 ± 0.525 | 0.67 ± 0.671 | 0.0 ± 0.0
Glu | 2.009 ± 1.014 | 0.67 ± 0.628 | 1.34 ± 0.864 | 3.349 ± 1.685 | 2.009 ± 0.667 | 4.019 ± 1.831 | 0.0 ± 0.0 | 3.349 ± 2.128 | 2.009 ± 1.178 | 3.349 ± 1.156 | 0.67 ± 0.525 | 7.368 ± 2.328 | 2.679 ± 0.85 | 2.009 ± 1.234 | 2.009 ± 0.667 | 4.019 ± 2.415 | 0.0 ± 0.0 | 1.34 ± 1.944 | 0.67 ± 0.735 | 1.34 ± 0.69 | 0.0 ± 0.0
Phe | 1.34 ± 0.941 | 0.67 ± 0.621 | 4.019 ± 2.096 | 0.67 ± 0.525 | 1.34 ± 0.69 | 2.679 ± 1.393 | 3.349 ± 1.97 | 1.34 ± 0.801 | 5.358 ± 2.627 | 3.349 ± 1.647 | 0.0 ± 0.0 | 2.679 ± 0.815 | 2.009 ± 1.11 | 4.019 ± 1.465 | 2.009 ± 1.521 | 3.349 ± 1.798 | 2.679 ± 0.679 | 2.009 ± 1.382 | 2.009 ± 1.399 | 2.009 ± 0.944 | 0.0 ± 0.0
Gly | 2.679 ± 0.997 | 2.679 ± 0.875 | 2.679 ± 1.06 | 4.019 ± 1.594 | 1.34 ± 0.875 | 2.009 ± 1.014 | 0.67 ± 0.525 | 2.009 ± 0.827 | 8.038 ± 2.251 | 0.67 ± 0.628 | 1.34 ± 1.268 | 3.349 ± 2.108 | 4.689 ± 1.376 | 2.679 ± 1.368 | 2.009 ± 0.968 | 3.349 ± 1.866 | 3.349 ± 1.327 | 2.679 ± 1.874 | 0.0 ± 0.0 | 1.34 ± 0.669 | 0.0 ± 0.0
His | 1.34 ± 1.241 | 1.34 ± 0.927 | 2.009 ± 1.319 | 1.34 ± 0.801 | 1.34 ± 0.69 | 1.34 ± 1.342 | 0.0 ± 0.0 | 1.34 ± 1.213 | 0.67 ± 0.776 | 4.019 ± 1.212 | 0.0 ± 0.0 | 3.349 ± 1.647 | 2.009 ± 1.053 | 2.679 ± 1.599 | 2.679 ± 1.859 | 2.679 ± 1.241 | 2.009 ± 1.367 | 4.019 ± 1.287 | 0.67 ± 0.525 | 0.67 ± 0.628 | 0.0 ± 0.0
Ile | 0.0 ± 0.0 | 1.34 ± 0.864 | 6.028 ± 1.869 | 4.019 ± 1.578 | 0.67 ± 0.525 | 3.349 ± 1.827 | 3.349 ± 1.228 | 4.019 ± 1.696 | 4.019 ± 1.084 | 4.019 ± 1.755 | 0.0 ± 0.0 | 4.019 ± 1.594 | 3.349 ± 1.358 | 2.009 ± 1.325 | 4.689 ± 1.367 | 6.028 ± 1.969 | 4.689 ± 2.084 | 2.009 ± 0.667 | 1.34 ± 0.823 | 2.679 ± 1.486 | 0.0 ± 0.0
Lys | 2.009 ± 1.209 | 0.0 ± 0.0 | 4.019 ± 1.697 | 2.679 ± 2.102 | 2.679 ± 1.156 | 2.679 ± 0.653 | 2.679 ± 1.433 | 6.698 ± 0.826 | 0.67 ± 0.525 | 4.019 ± 1.475 | 2.009 ± 0.803 | 3.349 ± 1.372 | 1.34 ± 0.656 | 0.67 ± 0.628 | 6.698 ± 2.572 | 2.679 ± 1.028 | 4.019 ± 1.514 | 5.358 ± 2.806 | 0.67 ± 0.628 | 5.358 ± 1.911 | 0.0 ± 0.0
Leu | 2.009 ± 1.183 | 0.67 ± 0.525 | 4.019 ± 1.42 | 2.679 ± 1.848 | 3.349 ± 1.572 | 4.019 ± 0.724 | 4.019 ± 1.051 | 1.34 ± 1.051 | 6.028 ± 1.438 | 2.009 ± 1.484 | 0.67 ± 0.621 | 7.368 ± 2.669 | 2.679 ± 1.153 | 4.689 ± 1.977 | 3.349 ± 1.156 | 6.698 ± 2.598 | 4.019 ± 2.161 | 5.358 ± 1.829 | 0.0 ± 0.0 | 5.358 ± 2.459 | 0.0 ± 0.0
Met | 2.009 ± 1.367 | 0.67 ± 0.621 | 4.019 ± 1.314 | 0.0 ± 0.0 | 1.34 ± 1.241 | 0.67 ± 0.972 | 0.67 ± 0.628 | 1.34 ± 0.669 | 0.67 ± 0.628 | 0.67 ± 0.776 | 0.0 ± 0.0 | 0.0 ± 0.0 | 1.34 ± 0.656 | 0.0 ± 0.0 | 1.34 ± 0.768 | 1.34 ± 0.669 | 1.34 ± 0.669 | 1.34 ± 0.669 | 0.67 ± 0.525 | 1.34 ± 0.88 | 0.0 ± 0.0
Asn | 4.019 ± 1.272 | 2.679 ± 0.679 | 2.679 ± 1.09 | 2.679 ± 1.634 | 0.67 ± 0.776 | 2.679 ± 1.19 | 3.349 ± 2.004 | 3.349 ± 0.952 | 2.009 ± 1.092 | 4.019 ± 1.483 | 1.34 ± 1.134 | 2.679 ± 1.486 | 2.009 ± 0.769 | 4.019 ± 2.107 | 4.019 ± 1.303 | 4.019 ± 1.527 | 4.019 ± 1.212 | 4.689 ± 0.813 | 0.67 ± 0.525 | 3.349 ± 1.37 | 0.0 ± 0.0
Pro | 0.0 ± 0.0 | 0.67 ± 0.621 | 1.34 ± 0.656 | 3.349 ± 1.524 | 1.34 ± 0.69 | 2.009 ± 1.211 | 2.009 ± 1.178 | 4.019 ± 1.868 | 4.019 ± 1.643 | 4.689 ± 2.845 | 1.34 ± 1.241 | 1.34 ± 0.69 | 2.009 ± 0.846 | 4.019 ± 1.986 | 2.009 ± 1.367 | 8.038 ± 1.768 | 2.679 ± 0.966 | 2.009 ± 0.72 | 2.009 ± 0.72 | 1.34 ± 0.817 | 0.0 ± 0.0
Gln | 3.349 ± 0.548 | 0.67 ± 0.525 | 1.34 ± 1.469 | 1.34 ± 0.817 | 3.349 ± 1.179 | 2.679 ± 1.145 | 2.009 ± 1.755 | 2.679 ± 1.64 | 0.0 ± 0.0 | 4.689 ± 2.474 | 0.67 ± 0.565 | 0.67 ± 0.525 | 3.349 ± 1.247 | 2.009 ± 1.991 | 5.358 ± 1.078 | 2.679 ± 0.96 | 2.009 ± 1.325 | 4.019 ± 1.207 | 0.67 ± 0.525 | 1.34 ± 0.656 | 0.0 ± 0.0
Arg | 4.689 ± 2.376 | 0.67 ± 0.628 | 4.019 ± 2.295 | 3.349 ± 2.088 | 8.038 ± 2.753 | 5.358 ± 2.182 | 2.009 ± 1.436 | 5.358 ± 1.976 | 1.34 ± 0.817 | 4.019 ± 1.348 | 0.67 ± 1.076 | 2.009 ± 1.47 | 3.349 ± 1.227 | 2.679 ± 1.146 | 8.707 ± 4.193 | 9.377 ± 1.558 | 4.019 ± 1.978 | 2.679 ± 1.393 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0
Ser | 6.028 ± 1.559 | 0.67 ± 0.628 | 2.679 ± 0.653 | 0.0 ± 0.0 | 3.349 ± 1.051 | 1.34 ± 0.864 | 2.679 ± 1.018 | 6.028 ± 2.278 | 5.358 ± 2.37 | 6.698 ± 1.719 | 2.679 ± 0.726 | 5.358 ± 1.942 | 6.028 ± 1.795 | 3.349 ± 1.827 | 9.377 ± 2.719 | 11.386 ± 3.425 | 12.056 ± 3.894 | 4.019 ± 1.337 | 1.34 ± 1.256 | 3.349 ± 0.548 | 0.0 ± 0.0
Thr | 4.019 ± 1.303 | 0.67 ± 0.972 | 2.009 ± 1.399 | 2.679 ± 1.074 | 4.689 ± 2.372 | 4.689 ± 0.95 | 3.349 ± 1.481 | 2.009 ± 1.885 | 1.34 ± 1.091 | 2.679 ± 1.035 | 2.009 ± 0.667 | 6.028 ± 1.241 | 4.689 ± 1.699 | 1.34 ± 0.69 | 2.009 ± 0.769 | 5.358 ± 2.694 | 2.679 ± 1.893 | 4.019 ± 1.303 | 0.67 ± 0.972 | 2.679 ± 1.341 | 0.0 ± 0.0
Val | 0.67 ± 0.972 | 0.67 ± 0.628 | 3.349 ± 1.228 | 3.349 ± 1.095 | 4.019 ± 1.212 | 2.679 ± 1.281 | 2.679 ± 1.804 | 4.019 ± 1.174 | 3.349 ± 1.085 | 2.679 ± 1.475 | 2.679 ± 1.295 | 1.34 ± 1.241 | 2.679 ± 0.753 | 3.349 ± 1.225 | 3.349 ± 1.559 | 4.689 ± 2.705 | 2.009 ± 1.164 | 4.019 ± 1.795 | 1.34 ± 0.864 | 4.689 ± 1.873 | 0.0 ± 0.0
Trp | 2.009 ± 1.053 | 0.0 ± 0.0 | 0.67 ± 0.628 | 1.34 ± 1.187 | 0.0 ± 0.0 | 0.67 ± 0.525 | 0.0 ± 0.0 | 0.0 ± 0.0 | 2.009 ± 0.72 | 1.34 ± 0.656 | 1.34 ± 0.88 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.67 ± 0.525 | 1.34 ± 0.929 | 0.67 ± 0.671 | 2.009 ± 0.913 | 0.67 ± 0.621 | 0.0 ± 0.0 | 0.67 ± 0.972 | 0.0 ± 0.0
Tyr | 2.009 ± 1.308 | 1.34 ± 0.801 | 0.67 ± 0.621 | 1.34 ± 1.241 | 3.349 ± 0.916 | 3.349 ± 0.879 | 1.34 ± 1.187 | 5.358 ± 1.451 | 2.679 ± 1.308 | 4.689 ± 2.717 | 1.34 ± 0.789 | 1.34 ± 0.656 | 2.009 ± 1.112 | 1.34 ± 0.69 | 2.009 ± 1.367 | 2.009 ± 0.858 | 0.67 ± 0.776 | 1.34 ± 1.342 | 0.0 ± 0.0 | 1.34 ± 0.669 | 0.0 ± 0.0
Xaa | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0
Statistics based on 7 proteins (1494 amino acids)
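
As a worked check on the scale of these numbers: the tabulated frequencies look like integer multiples of about 0.67‰, which is what a single occurrence would contribute if the denominator were 1493 (1494 residues minus one). That denominator is an inference from the values themselves, not something the page documents, and the helper names below are hypothetical.

```python
TOTAL_RESIDUES = 1494  # from the statistics line above
# Assumption inferred from the tabulated values, not a documented normalization.
DENOMINATOR = TOTAL_RESIDUES - 1


def per_mille_from_count(count, denominator=DENOMINATOR):
    """Convert a raw dipeptide count to a per-mille frequency (3 decimals)."""
    return round(1000.0 * count / denominator, 3)


def count_from_per_mille(value, denominator=DENOMINATOR):
    """Recover the nearest integer count behind a per-mille frequency."""
    return round(value * denominator / 1000.0)


for value in (0.67, 2.009, 9.377, 12.056):
    n = count_from_per_mille(value)
    print(value, "->", n, "occurrence(s) ->", per_mille_from_count(n))
```

Under this assumption the most frequent dipeptide, SerThr at 12.056‰, corresponds to 18 occurrences, and every entry of 0.67‰ to a single occurrence.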

Note: The error has been estimated by bootstrapping (×100) at the protein level.
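
A protein-level bootstrap of this kind can be sketched as follows. This is only an illustration under stated assumptions: whole proteins are resampled with replacement 100 times, the frequency is recomputed for each resample, and the error is taken as the standard deviation of those estimates. The source does not say which spread statistic is reported, and the function names and toy sequences are hypothetical.

```python
import random
import statistics
from collections import Counter


def dipeptide_frequency(proteins, pair):
    """Per-mille frequency of one dipeptide across a list of sequences."""
    counts = Counter()
    for seq in proteins:
        counts.update(seq[i:i + 2] for i in range(len(seq) - 1))
    total = sum(counts.values())
    return 1000.0 * counts[pair] / total if total else 0.0


def bootstrap_error(proteins, pair, n_resamples=100, seed=0):
    """Spread of the frequency over protein-level bootstrap resamples.

    Assumption: the reported error is the sample standard deviation of the
    resampled estimates; the database may use a different statistic.
    """
    rng = random.Random(seed)
    estimates = []
    for _ in range(n_resamples):
        # Draw as many proteins as in the original set, with replacement.
        resample = rng.choices(proteins, k=len(proteins))
        estimates.append(dipeptide_frequency(resample, pair))
    return statistics.stdev(estimates)


toy = ["MSSTRR", "MKA", "SSA"]  # made-up sequences, not TGMV proteins
print(dipeptide_frequency(toy, "SS"), "±", bootstrap_error(toy, "SS"))
```

Resampling whole proteins rather than individual residues keeps each protein's internal composition intact, and a dipeptide that never occurs has an error of exactly zero, which matches the 0.0 ± 0.0 entries in the table.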

The dipeptide statistics above (along with other statistics for this proteome) can be downloaded from this CSV file
See this proteome in: uniprot_link
Proteome-pI is available under Creative Commons Attribution-NoDerivs license, for more details see here

Reference: Kozlowski LP. Proteome-pI 2.0: Proteome Isoelectric Point Database Update. Nucleic Acids Res. 2021, doi: 10.1093/nar/gkab944
Contact: Lukasz P. Kozlowski