Amino acid dipeptide frequency for Tobacco mottle leaf curl virus

Apart from single amino acid frequencies, one can also calculate the so-called amino acid dipeptide frequencies, i.e. how often each ordered pair of adjacent amino acids occurs.
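As a rough illustration of what such a calculation involves, the sketch below counts overlapping pairs of adjacent residues and converts the counts to per mille. It is not the Proteome-pI implementation: the function name `dipeptide_frequencies` is hypothetical, sequences use one-letter codes rather than the three-letter codes shown on this page, and the choice not to count pairs that span two proteins is an assumption.

```python
from collections import Counter


def dipeptide_frequencies(proteins):
    """Return {dipeptide: frequency in per mille} for a list of sequences.

    Hypothetical helper: the windowing and normalisation are assumptions,
    not the database's documented method.
    """
    counts = Counter()
    for seq in proteins:
        # Overlapping windows of length 2; pairs never span two proteins.
        for i in range(len(seq) - 1):
            counts[seq[i:i + 2]] += 1
    total = sum(counts.values())
    if total == 0:
        return {}
    # Normalise by the total number of dipeptides and scale to per mille.
    return {pair: 1000.0 * n / total for pair, n in counts.items()}


# Two toy sequences give the pairs MA, AA, AK, AA, AL (5 in total).
freqs = dipeptide_frequencies(["MAAK", "AAL"])
print(freqs["AA"])
```

In this toy input the pair AA occurs twice out of five dipeptides, so it accounts for two fifths of the total.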

There are 400 possible dipeptides (441 if Xaa is counted as an additional 21st amino acid representing all non-standard or unknown amino acids). Thus, if dipeptides occurred in proteins at random and with equal probability, each one would be present at a frequency of 0.25% (2.5‰). As this is not the case in nature, for better visibility all overrepresented dipeptides are marked in red and all underrepresented ones in blue in the table.
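The expected value under the uniform assumption, and the over/under comparison, can be sketched as follows. The function names are hypothetical, and the exact rule the page uses to pick red or blue is not stated, so a plain comparison against the uniform expectation is assumed here.

```python
def expected_uniform_per_mille(alphabet_size=20):
    """Expected frequency (per mille) of each dipeptide if all were equally likely."""
    return 1000.0 / alphabet_size ** 2


def classify(freq_per_mille, alphabet_size=20):
    """Label a frequency relative to the uniform expectation (assumed rule)."""
    expected = expected_uniform_per_mille(alphabet_size)
    if freq_per_mille > expected:
        return "over"
    if freq_per_mille < expected:
        return "under"
    return "as expected"


print(expected_uniform_per_mille(20))  # 20 standard amino acids, 400 pairs
print(expected_uniform_per_mille(21))  # with Xaa, 441 pairs
print(classify(10.331))                # ArgArg value from the table
print(classify(1.033))                 # AlaAsp value from the table
```

With 20 amino acids the expectation is 2.5‰, so a value such as 10.331‰ (ArgArg) counts as overrepresented and 1.033‰ (AlaAsp) as underrepresented under this rule.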

All values are given in per mille (‰), so they need to be multiplied by 10⁻³ to obtain fractions.
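The unit conversion is simple enough to show directly. The helper names below are hypothetical and only illustrate the arithmetic.

```python
def per_mille_to_fraction(value):
    """Convert a per mille value to a plain fraction (multiply by 10^-3)."""
    return value * 1e-3


def per_mille_to_percent(value):
    """Convert a per mille value to percent (divide by 10)."""
    return value / 10.0


fraction = per_mille_to_fraction(3.099)  # AlaAla value from the table
percent = per_mille_to_percent(2.5)      # uniform expectation for 400 pairs
print(fraction, percent)
```

As a sanity check on the scale, the smallest non-zero value in the table, 1.033‰, is a fraction of about 0.001033, i.e. roughly one dipeptide in 968.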

For more information, see the sequence space article on Wikipedia.

Rows give the first residue of the dipeptide, columns the second; each cell is the frequency (‰) ± error.

First \ Second | Ala | Cys | Asp | Glu | Phe | Gly | His | Ile | Lys | Leu | Met | Asn | Pro | Gln | Arg | Ser | Thr | Val | Trp | Tyr | Xaa
Ala | 3.099 ± 1.142 | 2.066 ± 0.825 | 1.033 ± 0.716 | 3.099 ± 0.995 | 0.0 ± 0.0 | 1.033 ± 0.89 | 2.066 ± 1.094 | 3.099 ± 0.879 | 4.132 ± 1.65 | 7.231 ± 2.861 | 1.033 ± 1.023 | 4.132 ± 1.337 | 4.132 ± 2.234 | 1.033 ± 0.716 | 4.132 ± 2.234 | 7.231 ± 1.217 | 3.099 ± 1.937 | 1.033 ± 1.155 | 0.0 ± 0.0 | 2.066 ± 2.311 | 0.0 ± 0.0
Cys | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 1.033 ± 0.89 | 0.0 ± 0.0 | 2.066 ± 1.072 | 0.0 ± 0.0 | 1.033 ± 0.89 | 2.066 ± 0.825 | 1.033 ± 1.292 | 0.0 ± 0.0 | 1.033 ± 0.716 | 0.0 ± 0.0 | 1.033 ± 0.716 | 2.066 ± 1.072 | 3.099 ± 1.392 | 3.099 ± 0.879 | 1.033 ± 0.89 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0
Asp | 1.033 ± 0.716 | 1.033 ± 1.177 | 3.099 ± 2.135 | 2.066 ± 0.825 | 3.099 ± 1.263 | 2.066 ± 1.431 | 0.0 ± 0.0 | 4.132 ± 3.409 | 3.099 ± 1.263 | 6.198 ± 2.169 | 0.0 ± 0.0 | 1.033 ± 0.89 | 0.0 ± 0.0 | 1.033 ± 1.155 | 5.165 ± 1.417 | 5.165 ± 1.415 | 5.165 ± 2.379 | 4.132 ± 1.65 | 1.033 ± 0.716 | 0.0 ± 0.0 | 0.0 ± 0.0
Glu | 4.132 ± 1.337 | 1.033 ± 1.177 | 1.033 ± 1.155 | 7.231 ± 3.897 | 1.033 ± 0.716 | 5.165 ± 2.548 | 0.0 ± 0.0 | 2.066 ± 1.632 | 0.0 ± 0.0 | 4.132 ± 1.814 | 1.033 ± 0.716 | 8.264 ± 3.188 | 4.132 ± 1.119 | 3.099 ± 1.56 | 2.066 ± 1.072 | 2.066 ± 1.896 | 0.0 ± 0.0 | 2.066 ± 1.324 | 4.132 ± 0.968 | 2.066 ± 1.431 | 0.0 ± 0.0
Phe | 2.066 ± 1.678 | 1.033 ± 0.89 | 2.066 ± 0.825 | 1.033 ± 0.716 | 2.066 ± 1.431 | 3.099 ± 1.263 | 3.099 ± 0.879 | 1.033 ± 0.716 | 2.066 ± 2.311 | 3.099 ± 2.147 | 0.0 ± 0.0 | 3.099 ± 0.995 | 0.0 ± 0.0 | 5.165 ± 1.797 | 3.099 ± 1.448 | 0.0 ± 0.0 | 1.033 ± 1.177 | 0.0 ± 0.0 | 3.099 ± 1.86 | 1.033 ± 0.89 | 0.0 ± 0.0
Gly | 3.099 ± 1.443 | 3.099 ± 0.879 | 5.165 ± 2.566 | 6.198 ± 1.952 | 1.033 ± 1.177 | 6.198 ± 2.083 | 2.066 ± 1.072 | 2.066 ± 0.825 | 9.298 ± 3.387 | 1.033 ± 1.177 | 0.0 ± 0.0 | 1.033 ± 0.89 | 5.165 ± 1.233 | 3.099 ± 1.898 | 2.066 ± 1.431 | 5.165 ± 2.162 | 4.132 ± 1.085 | 1.033 ± 1.155 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0
His | 2.066 ± 1.779 | 1.033 ± 1.177 | 1.033 ± 0.89 | 1.033 ± 0.716 | 2.066 ± 1.431 | 1.033 ± 1.177 | 2.066 ± 1.072 | 3.099 ± 2.165 | 1.033 ± 1.155 | 4.132 ± 1.598 | 0.0 ± 0.0 | 3.099 ± 1.443 | 2.066 ± 1.072 | 3.099 ± 1.639 | 4.132 ± 1.407 | 1.033 ± 1.155 | 3.099 ± 1.86 | 4.132 ± 1.538 | 2.066 ± 1.324 | 1.033 ± 0.716 | 0.0 ± 0.0
Ile | 0.0 ± 0.0 | 1.033 ± 0.716 | 5.165 ± 3.167 | 1.033 ± 1.155 | 2.066 ± 1.072 | 1.033 ± 0.89 | 2.066 ± 1.896 | 2.066 ± 1.431 | 5.165 ± 1.594 | 1.033 ± 0.89 | 1.033 ± 0.647 | 1.033 ± 1.155 | 2.066 ± 1.072 | 2.066 ± 1.678 | 5.165 ± 2.707 | 6.198 ± 3.103 | 3.099 ± 1.57 | 4.132 ± 1.65 | 2.066 ± 1.323 | 6.198 ± 3.675 | 0.0 ± 0.0
Lys | 5.165 ± 1.052 | 0.0 ± 0.0 | 3.099 ± 2.147 | 4.132 ± 2.863 | 3.099 ± 1.443 | 1.033 ± 0.716 | 1.033 ± 0.716 | 7.231 ± 3.423 | 2.066 ± 1.431 | 2.066 ± 0.825 | 0.0 ± 0.0 | 2.066 ± 1.779 | 3.099 ± 0.879 | 0.0 ± 0.0 | 2.066 ± 1.779 | 4.132 ± 0.968 | 2.066 ± 1.431 | 5.165 ± 4.449 | 0.0 ± 0.0 | 3.099 ± 1.263 | 0.0 ± 0.0
Leu | 0.0 ± 0.0 | 1.033 ± 0.716 | 6.198 ± 2.319 | 4.132 ± 1.836 | 0.0 ± 0.0 | 6.198 ± 1.043 | 3.099 ± 1.443 | 2.066 ± 1.324 | 4.132 ± 1.65 | 4.132 ± 1.538 | 0.0 ± 0.0 | 3.099 ± 1.448 | 4.132 ± 2.528 | 3.099 ± 2.147 | 4.132 ± 1.992 | 6.198 ± 2.661 | 3.099 ± 1.691 | 6.198 ± 1.109 | 0.0 ± 0.0 | 3.099 ± 0.995 | 0.0 ± 0.0
Met | 1.033 ± 0.89 | 1.033 ± 0.89 | 3.099 ± 1.937 | 0.0 ± 0.0 | 2.066 ± 1.779 | 1.033 ± 1.292 | 1.033 ± 0.89 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 1.033 ± 0.89 | 3.099 ± 1.142 | 1.033 ± 0.716 | 2.066 ± 1.896 | 1.033 ± 0.716 | 0.0 ± 0.0 | 0.0 ± 0.0 | 2.066 ± 1.431 | 2.066 ± 1.323 | 0.0 ± 0.0
Asn | 4.132 ± 1.65 | 1.033 ± 0.716 | 3.099 ± 0.879 | 3.099 ± 1.56 | 2.066 ± 1.094 | 4.132 ± 1.129 | 8.264 ± 4.03 | 2.066 ± 1.277 | 1.033 ± 0.716 | 3.099 ± 1.443 | 3.099 ± 1.837 | 2.066 ± 1.323 | 3.099 ± 0.995 | 0.0 ± 0.0 | 3.099 ± 2.497 | 3.099 ± 1.142 | 0.0 ± 0.0 | 5.165 ± 2.458 | 0.0 ± 0.0 | 4.132 ± 1.879 | 0.0 ± 0.0
Pro | 2.066 ± 1.431 | 1.033 ± 0.89 | 2.066 ± 1.277 | 5.165 ± 2.379 | 1.033 ± 0.716 | 1.033 ± 0.716 | 5.165 ± 2.162 | 2.066 ± 1.323 | 1.033 ± 0.89 | 3.099 ± 1.389 | 3.099 ± 1.56 | 1.033 ± 0.716 | 3.099 ± 1.392 | 5.165 ± 3.167 | 4.132 ± 2.635 | 7.231 ± 2.762 | 2.066 ± 2.583 | 4.132 ± 0.968 | 2.066 ± 0.825 | 4.132 ± 2.646 | 0.0 ± 0.0
Gln | 3.099 ± 0.995 | 1.033 ± 0.716 | 2.066 ± 2.354 | 4.132 ± 0.968 | 1.033 ± 0.716 | 2.066 ± 1.072 | 3.099 ± 1.691 | 3.099 ± 1.443 | 2.066 ± 1.431 | 3.099 ± 1.443 | 0.0 ± 0.0 | 1.033 ± 0.716 | 3.099 ± 2.135 | 0.0 ± 0.0 | 4.132 ± 0.968 | 4.132 ± 1.879 | 0.0 ± 0.0 | 4.132 ± 1.538 | 0.0 ± 0.0 | 1.033 ± 0.89 | 0.0 ± 0.0
Arg | 5.165 ± 2.546 | 1.033 ± 0.716 | 3.099 ± 2.669 | 2.066 ± 1.324 | 7.231 ± 3.528 | 8.264 ± 2.549 | 3.099 ± 1.639 | 6.198 ± 1.489 | 2.066 ± 1.277 | 2.066 ± 1.896 | 2.066 ± 3.698 | 3.099 ± 1.263 | 6.198 ± 2.084 | 0.0 ± 0.0 | 10.331 ± 7.141 | 3.099 ± 1.691 | 8.264 ± 2.883 | 5.165 ± 1.233 | 0.0 ± 0.0 | 1.033 ± 0.89 | 0.0 ± 0.0
Ser | 6.198 ± 3.511 | 0.0 ± 0.0 | 2.066 ± 0.825 | 0.0 ± 0.0 | 1.033 ± 1.177 | 4.132 ± 1.437 | 1.033 ± 0.89 | 6.198 ± 3.259 | 4.132 ± 1.337 | 5.165 ± 2.162 | 0.0 ± 0.0 | 5.165 ± 1.517 | 8.264 ± 3.862 | 3.099 ± 2.147 | 7.231 ± 1.094 | 10.331 ± 4.925 | 7.231 ± 6.22 | 5.165 ± 2.184 | 0.0 ± 0.0 | 3.099 ± 1.263 | 0.0 ± 0.0
Thr | 3.099 ± 2.319 | 1.033 ± 1.292 | 2.066 ± 1.632 | 3.099 ± 2.497 | 1.033 ± 1.292 | 4.132 ± 1.085 | 4.132 ± 2.555 | 0.0 ± 0.0 | 0.0 ± 0.0 | 2.066 ± 0.825 | 2.066 ± 1.431 | 4.132 ± 0.968 | 3.099 ± 1.56 | 1.033 ± 0.716 | 6.198 ± 3.139 | 5.165 ± 4.294 | 4.132 ± 3.559 | 3.099 ± 1.86 | 1.033 ± 1.292 | 3.099 ± 1.443 | 0.0 ± 0.0
Val | 2.066 ± 1.431 | 1.033 ± 0.716 | 2.066 ± 0.825 | 3.099 ± 1.443 | 2.066 ± 1.277 | 4.132 ± 2.622 | 0.0 ± 0.0 | 3.099 ± 2.133 | 4.132 ± 1.65 | 4.132 ± 1.598 | 4.132 ± 3.559 | 5.165 ± 2.229 | 4.132 ± 0.968 | 6.198 ± 2.083 | 4.132 ± 2.708 | 2.066 ± 0.825 | 2.066 ± 1.779 | 1.033 ± 0.89 | 1.033 ± 1.155 | 5.165 ± 2.184 | 0.0 ± 0.0
Trp | 3.099 ± 2.147 | 0.0 ± 0.0 | 0.0 ± 0.0 | 1.033 ± 1.155 | 0.0 ± 0.0 | 1.033 ± 0.716 | 0.0 ± 0.0 | 0.0 ± 0.0 | 2.066 ± 0.825 | 1.033 ± 0.89 | 1.033 ± 0.89 | 2.066 ± 1.896 | 0.0 ± 0.0 | 1.033 ± 0.716 | 2.066 ± 1.277 | 1.033 ± 1.292 | 1.033 ± 1.155 | 2.066 ± 0.825 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0
Tyr | 4.132 ± 2.402 | 0.0 ± 0.0 | 1.033 ± 0.89 | 2.066 ± 1.779 | 5.165 ± 0.831 | 2.066 ± 0.825 | 1.033 ± 1.155 | 3.099 ± 0.995 | 1.033 ± 0.716 | 6.198 ± 3.215 | 2.066 ± 1.202 | 3.099 ± 0.995 | 1.033 ± 0.716 | 2.066 ± 0.825 | 3.099 ± 1.56 | 2.066 ± 0.825 | 1.033 ± 1.155 | 2.066 ± 1.094 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0
Xaa | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0

Statistics based on 5 proteins (969 amino acids)

Note: The error was estimated by bootstrapping (x100) at the protein level.
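One way to picture a protein-level bootstrap is sketched below: whole proteins are resampled with replacement, the dipeptide frequency is recomputed for each resample, and the spread of those estimates is reported. This is only a sketch under assumptions. The function names are hypothetical, sequences use one-letter codes, and using the sample standard deviation as the reported error is an assumption, since the page does not say which statistic is used.

```python
import random
from collections import Counter
from statistics import stdev


def dipeptide_per_mille(proteins, pair):
    """Frequency (per mille) of one dipeptide over a list of sequences."""
    counts = Counter()
    for seq in proteins:
        for i in range(len(seq) - 1):
            counts[seq[i:i + 2]] += 1
    total = sum(counts.values())
    return 1000.0 * counts[pair] / total if total else 0.0


def bootstrap_error(proteins, pair, n_resamples=100, seed=0):
    """Spread of the frequency over protein-level bootstrap resamples."""
    rng = random.Random(seed)  # fixed seed keeps the sketch reproducible
    estimates = []
    for _ in range(n_resamples):
        # Draw as many proteins as in the original set, with replacement.
        sample = [rng.choice(proteins) for _ in proteins]
        estimates.append(dipeptide_per_mille(sample, pair))
    # Assumption: the reported error is the sample standard deviation.
    return stdev(estimates)


toy_proteome = ["MAAK", "AAL", "GGS"]
err_present = bootstrap_error(toy_proteome, "AA")
err_absent = bootstrap_error(toy_proteome, "WW")
print(err_present, err_absent)
```

A dipeptide that never occurs has a frequency of zero in every resample, so its error is zero as well, which matches the 0.0 ± 0.0 entries in the table. With only 5 proteins, each resample can change the composition considerably, which is consistent with the large errors relative to the values.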

The dipeptide statistics above (along with other statistics for this proteome) can be downloaded from this CSV file
See this proteome in: uniprot_link
Proteome-pI is available under the Creative Commons Attribution-NoDerivs license; for more details see here

Reference: Kozlowski LP. Proteome-pI 2.0: Proteome Isoelectric Point Database Update. Nucleic Acids Res. 2021, doi: 10.1093/nar/gkab944
Contact: Lukasz P. Kozlowski