Amino acid dipeptide frequency for Vesicular stomatitis Indiana virus (strain Glasgow) (VSIV)

Apart from single amino acid frequencies, one can also calculate the so-called amino acid dipeptide frequency, that is, how often each ordered pair of adjacent amino acids occurs.
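As a rough illustration of how such frequencies could be computed, the following minimal sketch counts overlapping adjacent pairs within each protein and reports them in per mille. The function name `dipeptide_frequencies`, the use of one-letter codes, the choice not to count pairs across protein boundaries, and the normalization by the total number of pairs are all assumptions; the counting and normalization actually used by the database are not stated on this page.

```python
from collections import Counter

def dipeptide_frequencies(proteins):
    """Return dipeptide frequencies in per mille for a list of sequences.

    Assumption: overlapping windows of length 2, counted within each
    protein only, normalized by the total number of counted pairs.
    """
    counts = Counter()
    for seq in proteins:
        for i in range(len(seq) - 1):
            counts[seq[i:i + 2]] += 1
    total = sum(counts.values())
    if total == 0:
        return {}
    return {pair: 1000.0 * n / total for pair, n in counts.items()}

# Toy sequences (hypothetical, not taken from the VSIV proteome).
freqs = dipeptide_frequencies(["MKLLA", "MKA"])
```

With these toy sequences there are six pairs in total, two of which are MK, so `freqs["MK"]` is one third of 1000.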

There are 400 possible dipeptides (441 if Xaa is counted as an additional 21st amino acid representing all non-standard or unknown amino acids). Thus, if dipeptides occurred uniformly at random in proteins, each of the 400 dipeptides would be present at a frequency of 0.25% (2.5‰). As this is not the case in nature, for better visibility all dipeptides that occur more often than expected are marked in red, and those that are underrepresented are marked in blue in the table.
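The arithmetic behind the expected value can be checked with a short sketch. The `classify` helper and its plain comparison against the uniform expectation are assumptions made for illustration; the exact rule the page uses for colouring is not stated. The example values are copied from the table, which reports frequencies in per mille.

```python
STANDARD = 20   # standard amino acids
WITH_XAA = 21   # standard amino acids plus Xaa

expected_400 = 1000.0 / STANDARD ** 2   # 2.5 per mille, i.e. 0.25%
expected_441 = 1000.0 / WITH_XAA ** 2   # about 2.268 per mille

def classify(freq_permille, expected=expected_400):
    """Label a dipeptide relative to the uniform expectation (assumed rule)."""
    if freq_permille > expected:
        return "over"
    if freq_permille < expected:
        return "under"
    return "as expected"

# Frequencies in per mille, copied from the table.
examples = {"GlyLeu": 13.86, "SerSer": 11.807, "AlaMet": 0.0, "CysCys": 0.0}
labels = {pair: classify(value) for pair, value in examples.items()}
```

Under this assumed rule GlyLeu and SerSer come out as overrepresented, while AlaMet and CysCys, which do not occur at all, come out as underrepresented.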

All values are given in per mille (‰), so they need to be multiplied by 10⁻³ to obtain fractions.

For more information, see the sequence space article on Wikipedia.

Dipeptide frequencies in ‰ (value ± estimated error), grouped by first residue. Residue order: Ala Cys Asp Glu Phe Gly His Ile Lys Leu Met Asn Pro Gln Arg Ser Thr Val Trp Tyr Xaa
Ala
AlaAla: 8.214 ± 1.711
AlaCys: 1.54 ± 0.929
AlaAsp: 6.16 ± 1.402
AlaGlu: 2.053 ± 1.478
AlaPhe: 3.08 ± 0.659
AlaGly: 3.08 ± 0.794
AlaHis: 1.54 ± 0.917
AlaIle: 1.027 ± 0.588
AlaLys: 1.54 ± 0.906
AlaLeu: 4.107 ± 1.582
AlaMet: 0.0 ± 0.0
AlaAsn: 0.513 ± 0.417
AlaPro: 6.674 ± 2.165
AlaGln: 1.54 ± 0.531
AlaArg: 2.053 ± 0.752
AlaSer: 8.214 ± 1.665
AlaThr: 2.567 ± 0.675
AlaVal: 6.674 ± 1.575
AlaTrp: 1.54 ± 0.917
AlaTyr: 2.567 ± 0.675
AlaXaa: 0.0 ± 0.0
Cys
CysAla: 1.027 ± 0.834
CysCys: 0.0 ± 0.0
CysAsp: 1.027 ± 0.856
CysGlu: 1.54 ± 0.917
CysPhe: 0.513 ± 0.428
CysGly: 0.513 ± 0.428
CysHis: 0.513 ± 0.428
CysIle: 0.0 ± 0.0
CysLys: 3.08 ± 1.491
CysLeu: 0.0 ± 0.0
CysMet: 1.027 ± 0.895
CysAsn: 0.0 ± 0.0
CysPro: 1.54 ± 1.284
CysGln: 0.513 ± 0.428
CysArg: 0.0 ± 0.0
CysSer: 1.54 ± 0.929
CysThr: 0.513 ± 0.502
CysVal: 0.513 ± 0.417
CysTrp: 0.0 ± 0.0
CysTyr: 0.0 ± 0.0
CysXaa: 0.0 ± 0.0
Asp
AspAla: 1.54 ± 0.746
AspCys: 0.513 ± 0.417
AspAsp: 6.16 ± 1.483
AspGlu: 5.647 ± 1.267
AspPhe: 3.08 ± 0.659
AspGly: 2.567 ± 1.093
AspHis: 3.08 ± 1.835
AspIle: 3.593 ± 1.546
AspLys: 4.107 ± 1.236
AspLeu: 3.593 ± 1.546
AspMet: 1.027 ± 0.834
AspAsn: 1.54 ± 1.028
AspPro: 2.567 ± 0.864
AspGln: 2.567 ± 0.864
AspArg: 0.513 ± 0.428
AspSer: 5.647 ± 0.86
AspThr: 3.593 ± 0.838
AspVal: 6.674 ± 1.355
AspTrp: 1.54 ± 0.531
AspTyr: 2.053 ± 0.753
AspXaa: 0.0 ± 0.0
Glu
GluAla: 6.16 ± 2.396
GluCys: 1.54 ± 0.531
GluAsp: 4.107 ± 1.065
GluGlu: 4.107 ± 2.036
GluPhe: 1.027 ± 0.603
GluGly: 5.133 ± 1.093
GluHis: 3.08 ± 0.94
GluIle: 3.593 ± 1.443
GluLys: 4.107 ± 1.677
GluLeu: 4.62 ± 1.276
GluMet: 2.567 ± 0.681
GluAsn: 0.0 ± 0.0
GluPro: 1.027 ± 0.603
GluGln: 2.567 ± 1.04
GluArg: 1.027 ± 0.856
GluSer: 3.08 ± 0.94
GluThr: 1.54 ± 1.284
GluVal: 0.513 ± 0.417
GluTrp: 2.053 ± 0.894
GluTyr: 5.133 ± 1.262
GluXaa: 0.0 ± 0.0
Phe
PheAla: 1.027 ± 0.856
PheCys: 0.513 ± 0.417
PheAsp: 1.54 ± 1.25
PheGlu: 3.08 ± 1.072
PhePhe: 4.107 ± 1.221
PheGly: 4.62 ± 1.441
PheHis: 1.027 ± 0.834
PheIle: 2.567 ± 1.456
PheLys: 1.027 ± 0.834
PheLeu: 3.08 ± 1.142
PheMet: 1.54 ± 0.872
PheAsn: 3.08 ± 1.835
PhePro: 2.053 ± 1.712
PheGln: 1.54 ± 1.506
PheArg: 7.7 ± 2.308
PheSer: 3.08 ± 0.754
PheThr: 3.593 ± 1.0
PheVal: 0.513 ± 0.428
PheTrp: 0.513 ± 0.417
PheTyr: 1.54 ± 0.917
PheXaa: 0.0 ± 0.0
Gly
GlyAla: 2.053 ± 0.919
GlyCys: 0.0 ± 0.0
GlyAsp: 2.053 ± 0.852
GlyGlu: 2.567 ± 1.965
GlyPhe: 1.54 ± 0.898
GlyGly: 1.027 ± 0.603
GlyHis: 1.027 ± 0.426
GlyIle: 2.053 ± 0.703
GlyLys: 7.187 ± 1.755
GlyLeu: 13.86 ± 1.212
GlyMet: 3.08 ± 0.659
GlyAsn: 2.053 ± 0.752
GlyPro: 3.08 ± 1.433
GlyGln: 2.567 ± 1.092
GlyArg: 4.107 ± 0.916
GlySer: 2.567 ± 0.675
GlyThr: 5.133 ± 2.003
GlyVal: 5.133 ± 1.081
GlyTrp: 1.54 ± 0.746
GlyTyr: 1.54 ± 0.746
GlyXaa: 0.0 ± 0.0
His
HisAla: 2.053 ± 0.793
HisCys: 1.54 ± 0.917
HisAsp: 1.54 ± 0.917
HisGlu: 0.513 ± 0.417
HisPhe: 3.593 ± 1.788
HisGly: 1.027 ± 0.588
HisHis: 0.513 ± 0.428
HisIle: 0.513 ± 0.428
HisLys: 1.027 ± 0.588
HisLeu: 1.027 ± 0.426
HisMet: 2.053 ± 0.919
HisAsn: 2.053 ± 1.322
HisPro: 1.027 ± 0.588
HisGln: 0.0 ± 0.0
HisArg: 1.54 ± 0.917
HisSer: 2.567 ± 0.885
HisThr: 1.027 ± 0.588
HisVal: 1.027 ± 0.426
HisTrp: 2.567 ± 0.71
HisTyr: 0.513 ± 0.428
HisXaa: 0.0 ± 0.0
Ile
IleAla: 1.54 ± 0.844
IleCys: 0.513 ± 0.428
IleAsp: 3.593 ± 0.923
IleGlu: 4.107 ± 0.621
IlePhe: 2.053 ± 1.172
IleGly: 7.7 ± 1.61
IleHis: 0.513 ± 0.417
IleIle: 3.08 ± 1.072
IleLys: 2.567 ± 1.288
IleLeu: 6.674 ± 1.688
IleMet: 0.513 ± 0.523
IleAsn: 2.567 ± 1.115
IlePro: 0.513 ± 0.417
IleGln: 2.053 ± 1.278
IleArg: 3.08 ± 1.293
IleSer: 2.567 ± 1.681
IleThr: 2.053 ± 0.753
IleVal: 4.107 ± 0.883
IleTrp: 1.54 ± 0.917
IleTyr: 2.567 ± 0.885
IleXaa: 0.0 ± 0.0
Lys
LysAla: 8.727 ± 1.68
LysCys: 1.027 ± 0.856
LysAsp: 2.053 ± 1.105
LysGlu: 2.053 ± 0.753
LysPhe: 2.567 ± 0.885
LysGly: 4.107 ± 1.29
LysHis: 2.567 ± 0.803
LysIle: 3.593 ± 1.04
LysLys: 4.62 ± 1.609
LysLeu: 6.16 ± 2.672
LysMet: 3.593 ± 0.659
LysAsn: 2.053 ± 1.139
LysPro: 0.513 ± 0.417
LysGln: 1.027 ± 0.588
LysArg: 4.62 ± 0.958
LysSer: 9.754 ± 1.837
LysThr: 4.107 ± 1.362
LysVal: 1.54 ± 1.006
LysTrp: 2.567 ± 0.675
LysTyr: 2.567 ± 0.821
LysXaa: 0.0 ± 0.0
Leu
LeuAla: 7.187 ± 1.996
LeuCys: 1.54 ± 0.746
LeuAsp: 5.133 ± 0.964
LeuGlu: 2.053 ± 0.898
LeuPhe: 3.08 ± 1.796
LeuGly: 6.16 ± 1.168
LeuHis: 0.513 ± 0.428
LeuIle: 5.133 ± 1.32
LeuLys: 8.214 ± 2.123
LeuLeu: 2.567 ± 1.668
LeuMet: 3.08 ± 1.149
LeuAsn: 4.107 ± 0.964
LeuPro: 5.647 ± 0.86
LeuGln: 1.54 ± 0.531
LeuArg: 5.133 ± 0.693
LeuSer: 6.16 ± 2.023
LeuThr: 6.16 ± 2.023
LeuVal: 2.053 ± 0.852
LeuTrp: 0.513 ± 0.428
LeuTyr: 7.7 ± 1.234
LeuXaa: 0.0 ± 0.0
Met
MetAla: 2.053 ± 0.793
MetCys: 0.513 ± 0.428
MetAsp: 2.567 ± 0.864
MetGlu: 1.027 ± 0.753
MetPhe: 2.567 ± 1.092
MetGly: 1.54 ± 0.917
MetHis: 0.0 ± 0.0
MetIle: 4.107 ± 0.964
MetLys: 0.513 ± 0.428
MetLeu: 2.567 ± 0.675
MetMet: 1.027 ± 0.834
MetAsn: 1.027 ± 0.588
MetPro: 1.54 ± 0.726
MetGln: 1.027 ± 0.426
MetArg: 0.513 ± 0.523
MetSer: 3.593 ± 1.457
MetThr: 3.593 ± 1.745
MetVal: 1.027 ± 0.426
MetTrp: 0.0 ± 0.0
MetTyr: 2.567 ± 1.281
MetXaa: 0.0 ± 0.0
Asn
AsnAla: 1.027 ± 0.834
AsnCys: 0.513 ± 0.428
AsnAsp: 1.54 ± 0.746
AsnGlu: 1.54 ± 0.991
AsnPhe: 0.0 ± 0.0
AsnGly: 2.053 ± 1.278
AsnHis: 0.0 ± 0.0
AsnIle: 2.053 ± 0.919
AsnLys: 0.0 ± 0.0
AsnLeu: 4.107 ± 1.236
AsnMet: 0.0 ± 0.0
AsnAsn: 0.0 ± 0.0
AsnPro: 2.053 ± 1.139
AsnGln: 3.593 ± 0.766
AsnArg: 4.107 ± 1.088
AsnSer: 2.567 ± 0.675
AsnThr: 1.54 ± 0.906
AsnVal: 2.567 ± 0.675
AsnTrp: 1.027 ± 0.856
AsnTyr: 2.053 ± 0.753
AsnXaa: 0.0 ± 0.0
Pro
ProAla: 3.593 ± 1.073
ProCys: 0.0 ± 0.0
ProAsp: 3.593 ± 1.609
ProGlu: 7.187 ± 1.531
ProPhe: 4.62 ± 2.752
ProGly: 1.54 ± 0.746
ProHis: 3.08 ± 1.142
ProIle: 3.08 ± 1.288
ProLys: 2.567 ± 1.104
ProLeu: 4.107 ± 1.461
ProMet: 3.08 ± 1.835
ProAsn: 2.053 ± 0.793
ProPro: 3.593 ± 1.387
ProGln: 1.027 ± 0.426
ProArg: 0.0 ± 0.0
ProSer: 4.62 ± 1.917
ProThr: 1.027 ± 0.426
ProVal: 1.54 ± 0.531
ProTrp: 0.0 ± 0.0
ProTyr: 2.053 ± 0.703
ProXaa: 0.0 ± 0.0
Gln
GlnAla: 2.053 ± 1.478
GlnCys: 1.027 ± 0.426
GlnAsp: 1.027 ± 0.856
GlnGlu: 1.54 ± 0.531
GlnPhe: 1.54 ± 0.726
GlnGly: 4.62 ± 1.016
GlnHis: 0.0 ± 0.0
GlnIle: 2.053 ± 1.004
GlnLys: 1.027 ± 0.588
GlnLeu: 2.567 ± 0.675
GlnMet: 2.053 ± 1.172
GlnAsn: 1.54 ± 0.956
GlnPro: 3.593 ± 1.437
GlnGln: 1.54 ± 0.956
GlnArg: 0.513 ± 0.502
GlnSer: 1.54 ± 1.006
GlnThr: 1.027 ± 0.856
GlnVal: 2.053 ± 1.278
GlnTrp: 0.513 ± 0.502
GlnTyr: 1.027 ± 0.426
GlnXaa: 0.0 ± 0.0
Arg
ArgAla: 4.107 ± 0.916
ArgCys: 0.0 ± 0.0
ArgAsp: 1.027 ± 0.834
ArgGlu: 4.107 ± 0.307
ArgPhe: 1.027 ± 0.426
ArgGly: 2.567 ± 1.218
ArgHis: 1.54 ± 0.917
ArgIle: 1.027 ± 0.426
ArgLys: 3.593 ± 2.097
ArgLeu: 2.053 ± 1.176
ArgMet: 3.08 ± 0.746
ArgAsn: 0.513 ± 0.417
ArgPro: 5.133 ± 2.639
ArgGln: 2.567 ± 0.94
ArgArg: 1.54 ± 0.917
ArgSer: 3.593 ± 0.821
ArgThr: 4.107 ± 1.088
ArgVal: 2.053 ± 0.753
ArgTrp: 0.513 ± 0.428
ArgTyr: 3.08 ± 0.518
ArgXaa: 0.0 ± 0.0
Ser
SerAla: 2.567 ± 0.515
SerCys: 1.54 ± 0.929
SerAsp: 7.187 ± 1.531
SerGlu: 2.053 ± 0.941
SerPhe: 5.133 ± 1.35
SerGly: 5.133 ± 1.041
SerHis: 2.567 ± 0.71
SerIle: 3.593 ± 1.63
SerLys: 10.267 ± 2.0
SerLeu: 9.24 ± 1.356
SerMet: 1.027 ± 0.619
SerAsn: 6.16 ± 1.232
SerPro: 1.54 ± 0.635
SerGln: 2.567 ± 1.456
SerArg: 2.567 ± 1.052
SerSer: 11.807 ± 1.928
SerThr: 3.08 ± 1.063
SerVal: 4.107 ± 0.307
SerTrp: 0.513 ± 0.428
SerTyr: 3.593 ± 0.868
SerXaa: 0.0 ± 0.0
Thr
ThrAla: 1.027 ± 0.834
ThrCys: 0.513 ± 0.428
ThrAsp: 1.54 ± 1.284
ThrGlu: 2.567 ± 0.94
ThrPhe: 2.053 ± 0.941
ThrGly: 4.107 ± 2.277
ThrHis: 2.567 ± 0.71
ThrIle: 7.187 ± 1.289
ThrLys: 2.567 ± 0.821
ThrLeu: 1.54 ± 1.006
ThrMet: 1.54 ± 0.917
ThrAsn: 1.027 ± 0.834
ThrPro: 6.16 ± 1.037
ThrGln: 1.54 ± 0.956
ThrArg: 2.053 ± 0.752
ThrSer: 3.08 ± 0.909
ThrThr: 4.107 ± 1.704
ThrVal: 5.647 ± 1.094
ThrTrp: 2.053 ± 1.139
ThrTyr: 1.54 ± 0.917
ThrXaa: 0.0 ± 0.0
Val
ValAla: 2.567 ± 1.092
ValCys: 0.0 ± 0.0
ValAsp: 4.62 ± 1.615
ValGlu: 5.647 ± 0.78
ValPhe: 1.54 ± 0.898
ValGly: 3.593 ± 1.233
ValHis: 0.513 ± 0.428
ValIle: 1.54 ± 0.531
ValLys: 4.107 ± 0.307
ValLeu: 5.133 ± 1.395
ValMet: 1.027 ± 0.603
ValAsn: 1.027 ± 0.426
ValPro: 3.593 ± 0.617
ValGln: 1.54 ± 0.531
ValArg: 4.107 ± 1.532
ValSer: 5.133 ± 1.332
ValThr: 3.08 ± 1.072
ValVal: 1.54 ± 1.028
ValTrp: 1.54 ± 0.531
ValTyr: 1.027 ± 0.603
ValXaa: 0.0 ± 0.0
Trp
TrpAla: 2.053 ± 0.793
TrpCys: 0.0 ± 0.0
TrpAsp: 3.593 ± 1.661
TrpGlu: 0.0 ± 0.0
TrpPhe: 1.54 ± 0.746
TrpGly: 1.54 ± 0.726
TrpHis: 1.027 ± 0.856
TrpIle: 0.513 ± 0.417
TrpLys: 2.567 ± 1.161
TrpLeu: 2.053 ± 0.752
TrpMet: 0.513 ± 0.428
TrpAsn: 0.513 ± 0.502
TrpPro: 0.0 ± 0.0
TrpGln: 0.0 ± 0.0
TrpArg: 0.0 ± 0.0
TrpSer: 1.54 ± 0.531
TrpThr: 0.0 ± 0.0
TrpVal: 2.567 ± 0.885
TrpTrp: 0.0 ± 0.0
TrpTyr: 0.513 ± 0.428
TrpXaa: 0.0 ± 0.0
Tyr
TyrAla: 4.107 ± 1.291
TyrCys: 1.027 ± 0.856
TyrAsp: 1.54 ± 0.917
TyrGlu: 3.593 ± 0.974
TyrPhe: 3.08 ± 0.549
TyrGly: 1.54 ± 0.726
TyrHis: 2.053 ± 0.793
TyrIle: 3.593 ± 0.883
TyrLys: 5.133 ± 1.421
TyrLeu: 5.133 ± 0.532
TyrMet: 1.027 ± 0.426
TyrAsn: 0.513 ± 0.502
TyrPro: 0.513 ± 0.417
TyrGln: 1.54 ± 0.956
TyrArg: 2.053 ± 1.172
TyrSer: 3.593 ± 0.766
TyrThr: 2.567 ± 1.115
TyrVal: 1.027 ± 0.603
TyrTrp: 0.0 ± 0.0
TyrTyr: 0.0 ± 0.0
TyrXaa: 0.0 ± 0.0
Xaa
XaaAla: 0.0 ± 0.0
XaaCys: 0.0 ± 0.0
XaaAsp: 0.0 ± 0.0
XaaGlu: 0.0 ± 0.0
XaaPhe: 0.0 ± 0.0
XaaGly: 0.0 ± 0.0
XaaHis: 0.0 ± 0.0
XaaIle: 0.0 ± 0.0
XaaLys: 0.0 ± 0.0
XaaLeu: 0.0 ± 0.0
XaaMet: 0.0 ± 0.0
XaaAsn: 0.0 ± 0.0
XaaPro: 0.0 ± 0.0
XaaGln: 0.0 ± 0.0
XaaArg: 0.0 ± 0.0
XaaSer: 0.0 ± 0.0
XaaThr: 0.0 ± 0.0
XaaVal: 0.0 ± 0.0
XaaTrp: 0.0 ± 0.0
XaaTyr: 0.0 ± 0.0
XaaXaa: 0.0 ± 0.0
Statistics based on 8 proteins (1949 amino acids)

Note: The error was estimated by bootstrapping (100 resamples) at the protein level.
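A protein-level bootstrap of this kind could look roughly like the sketch below: whole proteins are resampled with replacement, the frequency is recomputed for each resample, and the spread of the resampled values is taken as the error. The function names, the fixed seed, the one-letter codes, the toy sequences, and the use of the sample standard deviation as the error measure are assumptions; the page states only that 100 bootstrap rounds were run at the protein level.

```python
import random
from collections import Counter
from statistics import stdev

def pair_permille(proteins, pair):
    """Frequency of one dipeptide in per mille over a list of sequences."""
    counts = Counter()
    for seq in proteins:
        for i in range(len(seq) - 1):
            counts[seq[i:i + 2]] += 1
    total = sum(counts.values())
    return 1000.0 * counts[pair] / total if total else 0.0

def bootstrap_error(proteins, pair, n_resamples=100, seed=0):
    """Spread of the frequency over protein-level bootstrap resamples."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    estimates = []
    for _ in range(n_resamples):
        # Draw as many proteins as in the original set, with replacement.
        sample = rng.choices(proteins, k=len(proteins))
        estimates.append(pair_permille(sample, pair))
    return stdev(estimates)

# Toy sequences (hypothetical, not taken from the VSIV proteome).
toy = ["MKLLA", "MKA", "AAKL"]
err_mk = bootstrap_error(toy, "MK")
err_ww = bootstrap_error(toy, "WW")
```

A dipeptide that never occurs has a frequency of zero in every resample, so its estimated error is zero as well, which matches the 0.0 ± 0.0 entries in the table.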

The dipeptide statistics above (among other statistics for this proteome) can be downloaded from this CSV file
See this proteome in: uniprot_link
Proteome-pI is available under a Creative Commons Attribution-NoDerivs license; for more details see here

Reference: Kozlowski LP. Proteome-pI 2.0: Proteome Isoelectric Point Database Update. Nucleic Acids Res. 2021, doi: 10.1093/nar/gkab944
Contact: Lukasz P. Kozlowski