Amino acid dipeptide frequency for SARS coronavirus PUMC03

Apart from single amino acid frequencies, one can also calculate the so-called amino acid dipeptide frequency, i.e. how often each ordered pair of adjacent amino acids occurs.
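As a rough illustration of what such a calculation could look like, the sketch below counts overlapping pairs of adjacent residues within each protein and normalises by the total number of pairs. The function name `dipeptide_frequencies`, the one-letter toy sequences, and the counting convention (overlapping windows, no pairs spanning two proteins) are assumptions made for this example; the page does not state the exact procedure used by Proteome-pI.

```python
from collections import Counter

def dipeptide_frequencies(proteins):
    """Return dipeptide frequencies (as fractions) over a list of sequences.

    Assumption: overlapping windows of length 2 are counted within each
    protein, and counts are normalised by the total number of dipeptides.
    """
    counts = Counter()
    for seq in proteins:
        # Slide a window of two residues along the sequence.
        for i in range(len(seq) - 1):
            counts[seq[i:i + 2]] += 1
    total = sum(counts.values())
    if total == 0:
        return {}
    return {pair: n / total for pair, n in counts.items()}

# Hypothetical toy sequences, not taken from the PUMC03 proteome.
freqs = dipeptide_frequencies(["MKLLA", "ALLK"])
```

In this toy input there are seven dipeptides in total, two of which are `LL`, so `freqs["LL"]` is 2/7.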

There are 400 possible dipeptides (441 if Xaa is counted as an additional 21st amino acid representing all non-standard or unknown amino acids). If dipeptides occurred in proteins at random with equal probability, each of the 400 would therefore have a frequency of 0.25%. This is not the case in nature, so for better visibility dipeptides that occur more often than expected are marked in red in the table, and those that are underrepresented are marked in blue.

All values are given in per mille (‰); multiply them by 10⁻³ to obtain fractions.
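The comparison against the uniform expectation can be sketched as follows. The example assumes the threshold is exactly the uniform value described above (1/400, i.e. 0.25% or 2.5‰); the page does not spell out the colouring rule in more detail, and the names `UNIFORM_EXPECTED_PER_MILLE` and `classify` are made up for this illustration. The four input values are copied from the table below.

```python
# Uniform expectation for 400 dipeptides, expressed in per mille.
UNIFORM_EXPECTED_PER_MILLE = 1000 / 400  # 2.5 per mille = 0.25 %

def classify(per_mille, expected=UNIFORM_EXPECTED_PER_MILLE):
    """Label a frequency as over- or underrepresented (assumed rule)."""
    if per_mille > expected:
        return "over"
    if per_mille < expected:
        return "under"
    return "expected"

# A few values copied from the table (in per mille).
table = {"LeuLeu": 9.253, "AlaAla": 6.355, "TrpTrp": 0.111, "CysCys": 1.672}
labels = {pair: classify(value) for pair, value in table.items()}

# Converting a per mille value to a plain fraction.
leu_leu_fraction = table["LeuLeu"] * 1e-3
```

Under this assumed rule LeuLeu and AlaAla come out as overrepresented, while TrpTrp and CysCys come out as underrepresented.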

For more information, see the sequence space article on Wikipedia.

Ala Cys Asp Glu Phe Gly His Ile Lys Leu Met Asn Pro Gln Arg Ser Thr Val Trp Tyr Xaa
Ala
AlaAla: 6.355 ± 0.816
AlaCys: 2.23 ± 0.877
AlaAsp: 3.233 ± 0.78
AlaGlu: 2.564 ± 0.726
AlaPhe: 2.676 ± 0.495
AlaGly: 4.348 ± 0.983
AlaHis: 1.003 ± 0.52
AlaIle: 4.236 ± 0.577
AlaLys: 3.902 ± 1.575
AlaLeu: 6.8 ± 0.896
AlaMet: 2.453 ± 0.61
AlaAsn: 3.902 ± 0.994
AlaPro: 2.899 ± 0.602
AlaGln: 2.453 ± 0.588
AlaArg: 3.01 ± 0.827
AlaSer: 4.905 ± 2.774
AlaThr: 5.017 ± 1.041
AlaVal: 4.905 ± 1.451
AlaTrp: 1.449 ± 0.569
AlaTyr: 3.902 ± 0.766
AlaXaa: 0.0 ± 0.0
Cys
CysAla: 2.23 ± 0.642
CysCys: 1.672 ± 0.494
CysAsp: 2.564 ± 0.561
CysGlu: 1.003 ± 0.52
CysPhe: 1.449 ± 0.476
CysGly: 2.564 ± 0.729
CysHis: 0.557 ± 0.165
CysIle: 1.672 ± 0.733
CysLys: 0.78 ± 0.228
CysLeu: 2.787 ± 0.627
CysMet: 0.557 ± 0.165
CysAsn: 1.449 ± 0.415
CysPro: 0.78 ± 0.228
CysGln: 0.557 ± 0.289
CysArg: 1.115 ± 0.578
CysSer: 2.118 ± 0.794
CysThr: 2.23 ± 0.789
CysVal: 2.899 ± 0.83
CysTrp: 0.223 ± 0.116
CysTyr: 1.561 ± 0.455
CysXaa: 0.0 ± 0.0
Asp
AspAla: 4.682 ± 1.182
AspCys: 1.338 ± 0.38
AspAsp: 2.899 ± 1.02
AspGlu: 2.787 ± 0.462
AspPhe: 2.899 ± 0.51
AspGly: 3.902 ± 0.994
AspHis: 0.892 ± 0.654
AspIle: 3.456 ± 0.982
AspLys: 3.122 ± 0.538
AspLeu: 5.017 ± 0.887
AspMet: 1.449 ± 0.334
AspAsn: 3.122 ± 1.315
AspPro: 1.338 ± 0.454
AspGln: 1.561 ± 0.698
AspArg: 1.672 ± 0.572
AspSer: 3.456 ± 0.664
AspThr: 3.344 ± 1.293
AspVal: 4.459 ± 1.764
AspTrp: 0.557 ± 0.289
AspTyr: 3.456 ± 0.773
AspXaa: 0.0 ± 0.0
Glu
GluAla: 3.122 ± 1.148
GluCys: 1.672 ± 0.637
GluAsp: 2.23 ± 0.745
GluGlu: 4.013 ± 1.264
GluPhe: 2.007 ± 0.816
GluGly: 2.787 ± 0.976
GluHis: 1.338 ± 0.48
GluIle: 3.122 ± 0.696
GluLys: 2.007 ± 0.642
GluLeu: 4.459 ± 0.663
GluMet: 0.78 ± 0.405
GluAsn: 2.007 ± 0.642
GluPro: 1.672 ± 0.405
GluGln: 1.895 ± 0.379
GluArg: 1.449 ± 0.473
GluSer: 2.341 ± 0.346
GluThr: 2.899 ± 1.039
GluVal: 3.567 ± 0.649
GluTrp: 0.446 ± 0.231
GluTyr: 1.895 ± 0.759
GluXaa: 0.0 ± 0.0
Phe
PheAla: 2.787 ± 1.427
PheCys: 1.784 ± 0.544
PheAsp: 3.122 ± 1.014
PheGlu: 1.895 ± 0.817
PhePhe: 2.23 ± 0.277
PheGly: 3.122 ± 1.907
PheHis: 0.669 ± 0.19
PheIle: 2.23 ± 0.659
PheLys: 3.567 ± 0.815
PheLeu: 4.571 ± 1.303
PheMet: 1.003 ± 0.513
PheAsn: 3.344 ± 1.981
PhePro: 2.118 ± 0.712
PheGln: 0.78 ± 0.569
PheArg: 1.449 ± 0.569
PheSer: 3.456 ± 1.06
PheThr: 3.567 ± 1.051
PheVal: 3.902 ± 0.923
PheTrp: 0.334 ± 0.173
PheTyr: 2.787 ± 0.642
PheXaa: 0.0 ± 0.0
Gly
GlyAla: 4.794 ± 1.926
GlyCys: 1.672 ± 0.498
GlyAsp: 4.013 ± 0.501
GlyGlu: 2.118 ± 0.355
GlyPhe: 3.79 ± 0.992
GlyGly: 4.125 ± 2.291
GlyHis: 1.561 ± 1.003
GlyIle: 3.679 ± 1.261
GlyLys: 3.01 ± 1.136
GlyLeu: 3.79 ± 0.62
GlyMet: 1.003 ± 0.831
GlyAsn: 3.233 ± 1.033
GlyPro: 2.453 ± 2.025
GlyGln: 2.118 ± 1.058
GlyArg: 1.784 ± 0.835
GlySer: 3.902 ± 0.742
GlyThr: 5.574 ± 2.006
GlyVal: 6.355 ± 1.354
GlyTrp: 0.446 ± 0.402
GlyTyr: 2.787 ± 0.393
GlyXaa: 0.0 ± 0.0
His
HisAla: 1.561 ± 0.501
HisCys: 0.669 ± 0.347
HisAsp: 1.003 ± 0.52
HisGlu: 1.115 ± 0.373
HisPhe: 1.338 ± 0.694
HisGly: 1.561 ± 0.68
HisHis: 0.557 ± 0.289
HisIle: 1.003 ± 0.831
HisLys: 0.669 ± 0.347
HisLeu: 2.118 ± 0.523
HisMet: 0.446 ± 0.231
HisAsn: 0.892 ± 0.463
HisPro: 0.557 ± 0.289
HisGln: 0.334 ± 0.173
HisArg: 0.223 ± 0.116
HisSer: 1.449 ± 0.653
HisThr: 2.23 ± 0.987
HisVal: 1.672 ± 0.498
HisTrp: 0.334 ± 0.441
HisTyr: 0.669 ± 0.19
HisXaa: 0.0 ± 0.0
Ile
IleAla: 3.679 ± 1.934
IleCys: 1.449 ± 0.664
IleAsp: 3.233 ± 0.655
IleGlu: 1.449 ± 0.535
IlePhe: 1.561 ± 0.648
IleGly: 3.902 ± 1.653
IleHis: 0.446 ± 0.158
IleIle: 2.787 ± 0.886
IleLys: 3.79 ± 0.953
IleLeu: 4.125 ± 0.392
IleMet: 1.449 ± 0.653
IleAsn: 2.899 ± 0.736
IlePro: 2.007 ± 0.635
IleGln: 1.895 ± 0.593
IleArg: 1.784 ± 0.522
IleSer: 3.567 ± 1.262
IleThr: 4.571 ± 0.942
IleVal: 4.013 ± 1.018
IleTrp: 0.446 ± 0.158
IleTyr: 1.003 ± 0.513
IleXaa: 0.0 ± 0.0
Lys
LysAla: 2.787 ± 1.015
LysCys: 1.895 ± 0.593
LysAsp: 3.01 ± 1.505
LysGlu: 2.787 ± 1.056
LysPhe: 2.787 ± 0.864
LysGly: 5.017 ± 0.704
LysHis: 2.007 ± 0.698
LysIle: 2.23 ± 0.929
LysLys: 2.899 ± 2.949
LysLeu: 6.243 ± 1.089
LysMet: 1.449 ± 0.334
LysAsn: 2.007 ± 0.635
LysPro: 3.567 ± 0.586
LysGln: 1.784 ± 1.551
LysArg: 2.564 ± 0.372
LysSer: 4.125 ± 0.777
LysThr: 3.456 ± 0.626
LysVal: 3.01 ± 0.771
LysTrp: 0.78 ± 0.228
LysTyr: 2.453 ± 0.79
LysXaa: 0.0 ± 0.0
Leu
LeuAla: 6.02 ± 1.426
LeuCys: 2.787 ± 0.597
LeuAsp: 5.017 ± 0.704
LeuGlu: 4.125 ± 1.595
LeuPhe: 2.899 ± 0.665
LeuGly: 5.24 ± 0.791
LeuHis: 1.672 ± 0.498
LeuIle: 3.01 ± 0.922
LeuLys: 6.689 ± 1.642
LeuLeu: 9.253 ± 2.015
LeuMet: 2.676 ± 1.046
LeuAsn: 6.243 ± 0.644
LeuPro: 4.905 ± 2.071
LeuGln: 4.459 ± 0.423
LeuArg: 4.571 ± 0.783
LeuSer: 7.135 ± 1.631
LeuThr: 5.797 ± 1.006
LeuVal: 5.24 ± 2.066
LeuTrp: 1.226 ± 0.991
LeuTyr: 3.567 ± 0.761
LeuXaa: 0.0 ± 0.0
Met
MetAla: 1.784 ± 1.577
MetCys: 0.78 ± 0.405
MetAsp: 1.449 ± 0.577
MetGlu: 0.669 ± 0.7
MetPhe: 0.892 ± 0.272
MetGly: 1.003 ± 0.317
MetHis: 0.446 ± 0.231
MetIle: 0.892 ± 0.315
MetLys: 0.892 ± 0.314
MetLeu: 2.787 ± 0.982
MetMet: 0.669 ± 0.347
MetAsn: 0.892 ± 0.272
MetPro: 1.338 ± 0.48
MetGln: 1.115 ± 0.373
MetArg: 0.669 ± 0.19
MetSer: 2.453 ± 0.991
MetThr: 1.338 ± 0.48
MetVal: 1.449 ± 0.535
MetTrp: 0.669 ± 1.037
MetTyr: 1.338 ± 0.38
MetXaa: 0.0 ± 0.0
Asn
AsnAla: 4.459 ± 0.845
AsnCys: 1.895 ± 0.817
AsnAsp: 1.784 ± 0.5
AsnGlu: 1.672 ± 0.498
AsnPhe: 2.453 ± 2.222
AsnGly: 4.236 ± 0.647
AsnHis: 1.449 ± 0.415
AsnIle: 2.453 ± 0.935
AsnLys: 2.787 ± 0.496
AsnLeu: 4.905 ± 1.381
AsnMet: 1.449 ± 0.578
AsnAsn: 3.567 ± 1.288
AsnPro: 1.895 ± 0.471
AsnGln: 1.672 ± 1.368
AsnArg: 2.118 ± 1.419
AsnSer: 3.456 ± 1.827
AsnThr: 3.567 ± 1.226
AsnVal: 5.017 ± 0.202
AsnTrp: 0.446 ± 0.504
AsnTyr: 2.676 ± 0.584
AsnXaa: 0.0 ± 0.0
Pro
ProAla: 3.122 ± 0.536
ProCys: 1.115 ± 0.33
ProAsp: 1.784 ± 0.798
ProGlu: 1.784 ± 0.835
ProPhe: 2.23 ± 1.475
ProGly: 2.118 ± 0.355
ProHis: 0.78 ± 0.228
ProIle: 2.564 ± 0.372
ProLys: 2.899 ± 1.53
ProLeu: 4.459 ± 0.656
ProMet: 0.446 ± 0.158
ProAsn: 2.23 ± 0.578
ProPro: 1.672 ± 0.23
ProGln: 1.561 ± 2.789
ProArg: 1.784 ± 1.65
ProSer: 2.341 ± 0.818
ProThr: 3.233 ± 0.839
ProVal: 3.233 ± 0.794
ProTrp: 0.334 ± 0.171
ProTyr: 1.115 ± 0.289
ProXaa: 0.0 ± 0.0
Gln
GlnAla: 3.122 ± 0.863
GlnCys: 0.892 ± 0.272
GlnAsp: 1.895 ± 0.754
GlnGlu: 1.672 ± 0.494
GlnPhe: 1.672 ± 0.875
GlnGly: 2.118 ± 2.625
GlnHis: 0.892 ± 0.314
GlnIle: 1.784 ± 1.572
GlnLys: 1.672 ± 0.903
GlnLeu: 3.679 ± 0.795
GlnMet: 1.115 ± 0.37
GlnAsn: 1.449 ± 0.822
GlnPro: 2.118 ± 0.712
GlnGln: 1.784 ± 1.137
GlnArg: 1.672 ± 1.348
GlnSer: 2.007 ± 0.262
GlnThr: 2.564 ± 0.726
GlnVal: 2.453 ± 1.044
GlnTrp: 0.669 ± 0.496
GlnTyr: 1.338 ± 0.454
GlnXaa: 0.0 ± 0.0
Arg
ArgAla: 3.567 ± 0.586
ArgCys: 1.226 ± 0.444
ArgAsp: 2.118 ± 0.794
ArgGlu: 2.453 ± 0.725
ArgPhe: 1.672 ± 0.574
ArgGly: 2.341 ± 3.01
ArgHis: 1.115 ± 0.373
ArgIle: 1.895 ± 1.246
ArgLys: 1.895 ± 0.759
ArgLeu: 3.233 ± 0.214
ArgMet: 0.557 ± 0.728
ArgAsn: 2.341 ± 1.297
ArgPro: 1.226 ± 0.986
ArgGln: 1.895 ± 1.627
ArgArg: 1.115 ± 1.812
ArgSer: 2.564 ± 1.117
ArgThr: 1.561 ± 1.063
ArgVal: 3.79 ± 0.62
ArgTrp: 0.334 ± 0.441
ArgTyr: 1.449 ± 0.476
ArgXaa: 0.0 ± 0.0
Ser
SerAla: 5.797 ± 1.757
SerCys: 1.561 ± 0.648
SerAsp: 4.236 ± 0.965
SerGlu: 3.344 ± 0.72
SerPhe: 3.902 ± 1.277
SerGly: 3.902 ± 2.174
SerHis: 1.561 ± 0.81
SerIle: 2.564 ± 0.729
SerLys: 3.456 ± 0.626
SerLeu: 5.797 ± 0.985
SerMet: 1.561 ± 0.482
SerAsn: 3.233 ± 1.46
SerPro: 2.118 ± 0.916
SerGln: 2.787 ± 0.785
SerArg: 2.341 ± 3.761
SerSer: 4.125 ± 1.199
SerThr: 5.017 ± 1.455
SerVal: 5.686 ± 1.62
SerTrp: 1.003 ± 0.296
SerTyr: 3.122 ± 1.002
SerXaa: 0.0 ± 0.0
Thr
ThrAla: 3.567 ± 1.526
ThrCys: 2.676 ± 1.388
ThrAsp: 3.567 ± 1.856
ThrGlu: 3.79 ± 0.746
ThrPhe: 4.348 ± 0.816
ThrGly: 4.459 ± 0.758
ThrHis: 1.338 ± 0.48
ThrIle: 3.902 ± 0.961
ThrLys: 3.456 ± 0.475
ThrLeu: 6.355 ± 0.695
ThrMet: 1.784 ± 0.601
ThrAsn: 3.456 ± 0.298
ThrPro: 2.899 ± 1.02
ThrGln: 3.567 ± 1.706
ThrArg: 3.01 ± 0.985
ThrSer: 5.686 ± 1.918
ThrThr: 5.686 ± 1.186
ThrVal: 5.128 ± 0.901
ThrTrp: 0.334 ± 0.441
ThrTyr: 2.341 ± 0.451
ThrXaa: 0.0 ± 0.0
Val
ValAla: 5.797 ± 1.242
ValCys: 2.007 ± 0.816
ValAsp: 5.24 ± 1.707
ValGlu: 4.125 ± 1.632
ValPhe: 4.125 ± 1.106
ValGly: 3.01 ± 1.139
ValHis: 0.892 ± 0.463
ValIle: 4.236 ± 1.214
ValLys: 4.794 ± 1.506
ValLeu: 7.246 ± 1.219
ValMet: 1.449 ± 0.415
ValAsn: 3.679 ± 1.136
ValPro: 2.899 ± 0.26
ValGln: 2.787 ± 0.772
ValArg: 3.122 ± 0.506
ValSer: 4.794 ± 1.283
ValThr: 6.355 ± 1.315
ValVal: 6.689 ± 1.977
ValTrp: 0.557 ± 0.413
ValTyr: 4.125 ± 0.921
ValXaa: 0.0 ± 0.0
Trp
TrpAla: 0.669 ± 0.347
TrpCys: 0.223 ± 0.116
TrpAsp: 0.557 ± 0.289
TrpGlu: 0.669 ± 0.19
TrpPhe: 0.892 ± 0.314
TrpGly: 0.334 ± 0.171
TrpHis: 0.223 ± 0.116
TrpIle: 0.557 ± 0.497
TrpLys: 0.446 ± 0.231
TrpLeu: 1.338 ± 1.074
TrpMet: 0.111 ± 0.058
TrpAsn: 1.338 ± 0.457
TrpPro: 0.557 ± 0.926
TrpGln: 0.223 ± 0.116
TrpArg: 0.334 ± 0.171
TrpSer: 0.78 ± 0.501
TrpThr: 0.557 ± 0.165
TrpVal: 0.78 ± 0.694
TrpTrp: 0.111 ± 0.058
TrpTyr: 0.446 ± 0.456
TrpXaa: 0.0 ± 0.0
Tyr
TyrAla: 2.341 ± 0.451
TyrCys: 1.226 ± 0.636
TyrAsp: 2.453 ± 0.851
TyrGlu: 1.672 ± 0.494
TyrPhe: 2.899 ± 0.468
TyrGly: 2.118 ± 0.422
TyrHis: 1.003 ± 0.379
TyrIle: 1.784 ± 0.5
TyrLys: 4.125 ± 0.698
TyrLeu: 3.567 ± 0.828
TyrMet: 1.003 ± 0.52
TyrAsn: 2.787 ± 0.24
TyrPro: 1.784 ± 0.703
TyrGln: 1.338 ± 0.685
TyrArg: 2.564 ± 0.826
TyrSer: 2.453 ± 0.576
TyrThr: 2.787 ± 0.824
TyrVal: 3.679 ± 1.136
TyrTrp: 0.446 ± 0.158
TyrTyr: 2.341 ± 0.769
TyrXaa: 0.0 ± 0.0
Xaa
XaaAla: 0.0 ± 0.0
XaaCys: 0.0 ± 0.0
XaaAsp: 0.0 ± 0.0
XaaGlu: 0.0 ± 0.0
XaaPhe: 0.0 ± 0.0
XaaGly: 0.0 ± 0.0
XaaHis: 0.0 ± 0.0
XaaIle: 0.0 ± 0.0
XaaLys: 0.0 ± 0.0
XaaLeu: 0.0 ± 0.0
XaaMet: 0.0 ± 0.0
XaaAsn: 0.0 ± 0.0
XaaPro: 0.0 ± 0.0
XaaGln: 0.0 ± 0.0
XaaArg: 0.0 ± 0.0
XaaSer: 0.0 ± 0.0
XaaThr: 0.0 ± 0.0
XaaVal: 0.0 ± 0.0
XaaTrp: 0.0 ± 0.0
XaaTyr: 0.0 ± 0.0
XaaXaa: 0.0 ± 0.0
Statistics based on 4 proteins (8971 amino acids)

Note: The error has been estimated by bootstrapping (100 replicates) at the protein level.
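A protein-level bootstrap of this kind might look roughly like the sketch below: whole proteins are resampled with replacement, the dipeptide frequency is recomputed for each resample, and the spread of the replicate values is reported as the error. The helper names, the use of the sample standard deviation as the error measure, the fixed seed, and the toy sequences are all assumptions made for illustration; the page only states that 100 bootstrap replicates were drawn at the protein level.

```python
import random
import statistics
from collections import Counter

def dipeptide_per_mille(proteins, pair):
    """Frequency of one dipeptide in per mille (overlapping windows assumed)."""
    counts = Counter()
    for seq in proteins:
        for i in range(len(seq) - 1):
            counts[seq[i:i + 2]] += 1
    total = sum(counts.values())
    return 1000 * counts[pair] / total if total else 0.0

def bootstrap_error(proteins, pair, n_replicates=100, seed=0):
    """Spread of the frequency over protein-level bootstrap resamples.

    Assumption: the error is taken as the sample standard deviation
    of the replicate estimates.
    """
    rng = random.Random(seed)
    estimates = []
    for _ in range(n_replicates):
        # Resample whole proteins with replacement, keeping the proteome size.
        sample = rng.choices(proteins, k=len(proteins))
        estimates.append(dipeptide_per_mille(sample, pair))
    return statistics.stdev(estimates)

# Hypothetical toy proteome in one-letter code, not the PUMC03 sequences.
toy_proteome = ["MKLLA", "ALLK", "GGLLG", "AAAA"]
ll_error = bootstrap_error(toy_proteome, "LL")
```

With only four proteins, as in this proteome, each resample can differ a lot from the original set, which is one plausible reason why some of the reported errors are as large as the values themselves.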

The dipeptide statistics above (among other statistics for this proteome) can be downloaded from this CSV file
See this proteome in: uniprot_link
Proteome-pI is available under the Creative Commons Attribution-NoDerivs license; for more details see here

Reference: Kozlowski LP. Proteome-pI 2.0: Proteome Isoelectric Point Database Update. Nucleic Acids Res. 2021, doi: 10.1093/nar/gkab944
Contact: Lukasz P. Kozlowski