Amino acid dipeptide frequency for SARS coronavirus PUMC02

Apart from single amino acid frequencies, one can also calculate the so-called amino acid dipeptide frequency.

There are 400 possibilities (441 if we consider Xaa as an additional 21st amino acid representing all non-standard or unknown amino acids). Thus, if dipeptides occurred randomly in proteins, each of the 400 amino acid dipeptides would be present with a frequency of 0.25% (2.5‰). As this is not the case in nature, for better visibility all dipeptides that occur more often than expected are marked in red, and those that are underrepresented are marked in blue in the table.
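The comparison described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names, the `AMINO_ACIDS` constant and the choice to count overlapping residue pairs within each protein are hypothetical and are not taken from Proteome-pI, which does not document its counting procedure on this page.

```python
from collections import Counter
from itertools import product

# Assumption: one-letter codes for the 20 standard amino acids.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

# Uniform expectation: 1 of 400 dipeptides, expressed in per mille (2.5).
UNIFORM_EXPECTED = 1000.0 / len(AMINO_ACIDS) ** 2


def dipeptide_permille(sequences):
    """Return {dipeptide: frequency in per mille} for the 400 standard pairs."""
    counts = Counter()
    for seq in sequences:
        # Overlapping pairs (0,1), (1,2), ...; pairs never span two proteins.
        counts.update(seq[i:i + 2] for i in range(len(seq) - 1))
    total = sum(counts.values())
    if total == 0:
        return {}
    return {a + b: 1000.0 * counts[a + b] / total
            for a, b in product(AMINO_ACIDS, repeat=2)}


def classify(freq_permille, expected=UNIFORM_EXPECTED):
    """Label a frequency relative to the uniform expectation."""
    if freq_permille > expected:
        return "over"
    if freq_permille < expected:
        return "under"
    return "as expected"
```

With the values from the table, `classify(9.253)` (LeuLeu) returns `"over"` and `classify(0.111)` (TrpTrp) returns `"under"`.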

All values are given in per mille (‰) and therefore need to be multiplied by 10^-3 to obtain fractions.
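A minimal sketch of the unit conversion, for readers who want to compare the table with percentages or raw fractions. The helper names are hypothetical.

```python
def permille_to_fraction(value_permille):
    """Convert a per mille value to a plain fraction (multiply by 10^-3)."""
    return value_permille * 1e-3


def permille_to_percent(value_permille):
    """Convert a per mille value to a percentage (divide by 10)."""
    return value_permille / 10.0
```

For instance, the LeuLeu entry of 9.253‰ corresponds to a fraction of about 0.009253, and the uniform expectation of 2.5‰ corresponds to 0.25%.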

For more information, see the sequence space article on Wikipedia.

Rows give the first residue of the dipeptide, columns the second; each cell is the frequency in ‰ ± the estimated error.

| | Ala | Cys | Asp | Glu | Phe | Gly | His | Ile | Lys | Leu | Met | Asn | Pro | Gln | Arg | Ser | Thr | Val | Trp | Tyr | Xaa |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Ala | 6.355 ± 0.822 | 2.23 ± 0.905 | 3.233 ± 0.781 | 2.564 ± 0.713 | 2.676 ± 0.436 | 4.348 ± 0.857 | 1.003 ± 0.522 | 4.236 ± 0.529 | 3.902 ± 1.598 | 6.8 ± 0.802 | 2.453 ± 0.633 | 3.902 ± 1.025 | 2.899 ± 0.596 | 2.453 ± 0.556 | 3.01 ± 0.818 | 4.905 ± 2.694 | 5.017 ± 0.981 | 4.905 ± 1.402 | 1.449 ± 0.526 | 3.902 ± 0.752 | 0.0 ± 0.0 |
| Cys | 2.23 ± 0.666 | 1.672 ± 0.509 | 2.564 ± 0.502 | 1.003 ± 0.522 | 1.449 ± 0.477 | 2.564 ± 0.756 | 0.557 ± 0.17 | 1.672 ± 0.738 | 0.78 ± 0.236 | 2.787 ± 0.572 | 0.557 ± 0.17 | 1.449 ± 0.431 | 0.78 ± 0.236 | 0.557 ± 0.29 | 1.115 ± 0.58 | 2.118 ± 0.8 | 2.23 ± 0.798 | 2.899 ± 0.861 | 0.223 ± 0.116 | 1.561 ± 0.472 | 0.0 ± 0.0 |
| Asp | 4.682 ± 1.228 | 1.338 ± 0.394 | 2.899 ± 0.951 | 2.787 ± 0.495 | 2.899 ± 0.543 | 3.902 ± 1.025 | 0.892 ± 0.72 | 3.456 ± 0.883 | 3.122 ± 0.492 | 5.017 ± 0.838 | 1.449 ± 0.302 | 3.122 ± 1.405 | 1.338 ± 0.452 | 1.561 ± 0.606 | 1.672 ± 0.579 | 3.456 ± 0.665 | 3.344 ± 1.315 | 4.459 ± 1.771 | 0.557 ± 0.29 | 3.456 ± 0.804 | 0.0 ± 0.0 |
| Glu | 3.122 ± 1.009 | 1.672 ± 0.641 | 2.23 ± 0.765 | 4.013 ± 1.253 | 2.007 ± 0.828 | 2.787 ± 0.84 | 1.338 ± 0.491 | 3.122 ± 0.606 | 2.007 ± 0.662 | 4.459 ± 0.722 | 0.78 ± 0.406 | 2.007 ± 0.662 | 1.672 ± 0.38 | 1.895 ± 0.427 | 1.449 ± 0.482 | 2.341 ± 0.381 | 2.899 ± 1.039 | 3.567 ± 0.712 | 0.446 ± 0.232 | 1.895 ± 0.771 | 0.0 ± 0.0 |
| Phe | 2.787 ± 1.551 | 1.784 ± 0.563 | 3.122 ± 1.044 | 1.895 ± 0.817 | 2.23 ± 0.308 | 3.122 ± 1.846 | 0.669 ± 0.197 | 2.23 ± 0.679 | 3.567 ± 0.865 | 4.571 ± 1.254 | 1.003 ± 0.51 | 3.344 ± 1.877 | 2.118 ± 0.68 | 0.78 ± 0.56 | 1.449 ± 0.526 | 3.456 ± 1.092 | 3.567 ± 1.064 | 3.902 ± 0.827 | 0.334 ± 0.174 | 2.787 ± 0.573 | 0.0 ± 0.0 |
| Gly | 4.794 ± 1.937 | 1.672 ± 0.516 | 4.013 ± 0.531 | 2.118 ± 0.359 | 3.79 ± 0.96 | 4.125 ± 2.116 | 1.561 ± 1.085 | 3.79 ± 1.168 | 3.01 ± 1.088 | 3.79 ± 0.613 | 1.003 ± 0.736 | 3.233 ± 1.018 | 2.453 ± 1.804 | 2.118 ± 0.933 | 1.784 ± 0.861 | 3.902 ± 0.701 | 5.574 ± 2.112 | 6.243 ± 1.42 | 0.446 ± 0.395 | 2.787 ± 0.444 | 0.0 ± 0.0 |
| His | 1.561 ± 0.499 | 0.669 ± 0.348 | 1.003 ± 0.522 | 1.115 ± 0.383 | 1.338 ± 0.696 | 1.561 ± 0.608 | 0.557 ± 0.29 | 1.003 ± 0.736 | 0.669 ± 0.348 | 2.118 ± 0.489 | 0.446 ± 0.232 | 0.892 ± 0.464 | 0.557 ± 0.29 | 0.334 ± 0.174 | 0.223 ± 0.116 | 1.449 ± 0.665 | 2.23 ± 0.986 | 1.672 ± 0.516 | 0.334 ± 0.398 | 0.669 ± 0.197 | 0.0 ± 0.0 |
| Ile | 3.79 ± 1.871 | 1.449 ± 0.662 | 3.233 ± 0.705 | 1.449 ± 0.546 | 1.561 ± 0.649 | 3.902 ± 1.812 | 0.446 ± 0.16 | 2.787 ± 0.844 | 3.79 ± 1.002 | 4.125 ± 0.363 | 1.449 ± 0.665 | 2.899 ± 0.707 | 2.007 ± 0.648 | 1.895 ± 0.611 | 1.784 ± 0.55 | 3.567 ± 1.276 | 4.571 ± 0.87 | 4.013 ± 1.056 | 0.446 ± 0.16 | 1.003 ± 0.51 | 0.0 ± 0.0 |
| Lys | 2.787 ± 1.036 | 1.895 ± 0.611 | 3.01 ± 1.432 | 2.787 ± 1.025 | 2.787 ± 0.892 | 5.017 ± 0.771 | 2.007 ± 0.605 | 2.23 ± 0.942 | 2.899 ± 2.648 | 6.243 ± 1.048 | 1.449 ± 0.302 | 2.007 ± 0.648 | 3.567 ± 0.652 | 1.784 ± 1.551 | 2.564 ± 0.406 | 4.125 ± 0.842 | 3.456 ± 0.593 | 3.01 ± 0.782 | 0.78 ± 0.236 | 2.453 ± 0.806 | 0.0 ± 0.0 |
| Leu | 6.02 ± 1.447 | 2.787 ± 0.516 | 5.017 ± 0.771 | 4.125 ± 1.58 | 2.899 ± 0.613 | 5.24 ± 0.695 | 1.672 ± 0.516 | 3.01 ± 0.949 | 6.689 ± 1.606 | 9.253 ± 2.267 | 2.676 ± 1.071 | 6.243 ± 0.668 | 4.905 ± 1.931 | 4.459 ± 0.417 | 4.571 ± 0.79 | 7.135 ± 1.596 | 5.797 ± 1.106 | 5.24 ± 2.12 | 1.226 ± 1.087 | 3.567 ± 0.682 | 0.0 ± 0.0 |
| Met | 1.784 ± 1.755 | 0.78 ± 0.406 | 1.449 ± 0.54 | 0.669 ± 0.768 | 0.892 ± 0.281 | 1.003 ± 0.324 | 0.446 ± 0.232 | 0.892 ± 0.319 | 0.892 ± 0.275 | 2.787 ± 0.962 | 0.669 ± 0.348 | 0.892 ± 0.281 | 1.338 ± 0.491 | 1.115 ± 0.383 | 0.669 ± 0.197 | 2.453 ± 0.88 | 1.338 ± 0.491 | 1.449 ± 0.546 | 0.669 ± 1.15 | 1.338 ± 0.394 | 0.0 ± 0.0 |
| Asn | 4.459 ± 0.923 | 1.895 ± 0.817 | 1.784 ± 0.446 | 1.672 ± 0.516 | 2.453 ± 2.204 | 4.236 ± 0.61 | 1.449 ± 0.431 | 2.453 ± 0.905 | 2.787 ± 0.465 | 4.905 ± 1.348 | 1.449 ± 0.588 | 3.567 ± 1.158 | 1.895 ± 0.499 | 1.672 ± 1.298 | 2.118 ± 1.541 | 3.456 ± 1.597 | 3.567 ± 1.215 | 5.017 ± 0.203 | 0.446 ± 0.558 | 2.676 ± 0.502 | 0.0 ± 0.0 |
| Pro | 3.122 ± 0.535 | 1.115 ± 0.339 | 1.784 ± 0.804 | 1.784 ± 0.858 | 2.23 ± 1.456 | 2.118 ± 0.359 | 0.78 ± 0.236 | 2.564 ± 0.406 | 2.899 ± 1.46 | 4.459 ± 0.622 | 0.446 ± 0.16 | 2.23 ± 0.496 | 1.672 ± 0.222 | 1.672 ± 2.542 | 1.784 ± 1.482 | 2.341 ± 0.802 | 3.233 ± 0.738 | 3.233 ± 0.748 | 0.334 ± 0.17 | 1.115 ± 0.248 | 0.0 ± 0.0 |
| Gln | 3.122 ± 0.763 | 0.892 ± 0.281 | 1.895 ± 0.763 | 1.672 ± 0.509 | 1.672 ± 0.903 | 2.118 ± 2.364 | 0.892 ± 0.275 | 1.784 ± 1.559 | 1.672 ± 0.872 | 3.679 ± 0.761 | 1.115 ± 0.379 | 1.449 ± 0.776 | 2.118 ± 0.68 | 1.784 ± 1.022 | 1.672 ± 1.336 | 2.007 ± 0.286 | 2.564 ± 0.713 | 2.453 ± 1.056 | 0.669 ± 0.541 | 1.338 ± 0.452 | 0.0 ± 0.0 |
| Arg | 3.567 ± 0.652 | 1.226 ± 0.453 | 2.118 ± 0.8 | 2.453 ± 0.751 | 1.672 ± 0.602 | 2.341 ± 3.007 | 1.115 ± 0.383 | 1.895 ± 1.376 | 1.895 ± 0.771 | 3.233 ± 0.197 | 0.557 ± 0.798 | 2.341 ± 1.291 | 1.226 ± 1.018 | 1.895 ± 1.455 | 1.115 ± 1.647 | 2.564 ± 1.101 | 1.561 ± 1.2 | 3.79 ± 0.613 | 0.334 ± 0.398 | 1.449 ± 0.477 | 0.0 ± 0.0 |
| Ser | 5.797 ± 1.542 | 1.561 ± 0.649 | 4.236 ± 1.012 | 3.344 ± 0.634 | 3.902 ± 1.138 | 3.902 ± 2.118 | 1.561 ± 0.813 | 2.564 ± 0.756 | 3.456 ± 0.593 | 5.797 ± 0.976 | 1.561 ± 0.53 | 3.233 ± 1.426 | 2.118 ± 0.898 | 2.676 ± 0.809 | 2.341 ± 3.558 | 4.125 ± 1.085 | 5.017 ± 1.481 | 5.686 ± 1.677 | 1.003 ± 0.255 | 3.122 ± 0.998 | 0.0 ± 0.0 |
| Thr | 3.567 ± 1.446 | 2.676 ± 1.393 | 3.567 ± 1.904 | 3.79 ± 0.723 | 4.348 ± 0.872 | 4.459 ± 0.709 | 1.338 ± 0.491 | 3.902 ± 0.893 | 3.456 ± 0.462 | 6.355 ± 0.698 | 1.784 ± 0.617 | 3.456 ± 0.26 | 2.899 ± 0.951 | 3.567 ± 1.644 | 3.01 ± 1.098 | 5.686 ± 1.816 | 5.686 ± 1.277 | 5.128 ± 0.957 | 0.334 ± 0.398 | 2.341 ± 0.458 | 0.0 ± 0.0 |
| Val | 5.686 ± 1.175 | 2.007 ± 0.828 | 5.24 ± 1.756 | 4.125 ± 1.611 | 4.125 ± 1.008 | 3.01 ± 1.151 | 0.892 ± 0.464 | 4.236 ± 1.261 | 4.794 ± 1.553 | 7.246 ± 1.213 | 1.449 ± 0.431 | 3.679 ± 1.174 | 3.01 ± 0.297 | 2.787 ± 0.74 | 3.122 ± 0.564 | 4.682 ± 1.346 | 6.355 ± 1.444 | 6.689 ± 2.036 | 0.557 ± 0.383 | 4.125 ± 0.859 | 0.0 ± 0.0 |
| Trp | 0.669 ± 0.348 | 0.223 ± 0.116 | 0.557 ± 0.29 | 0.669 ± 0.197 | 0.892 ± 0.275 | 0.334 ± 0.17 | 0.223 ± 0.116 | 0.557 ± 0.547 | 0.446 ± 0.232 | 1.338 ± 1.208 | 0.111 ± 0.058 | 1.338 ± 0.462 | 0.557 ± 0.97 | 0.223 ± 0.116 | 0.334 ± 0.17 | 0.78 ± 0.542 | 0.557 ± 0.17 | 0.78 ± 0.755 | 0.111 ± 0.058 | 0.446 ± 0.429 | 0.0 ± 0.0 |
| Tyr | 2.341 ± 0.458 | 1.226 ± 0.638 | 2.453 ± 0.872 | 1.672 ± 0.509 | 2.899 ± 0.446 | 2.118 ± 0.447 | 1.003 ± 0.363 | 1.784 ± 0.446 | 4.125 ± 0.789 | 3.567 ± 0.859 | 1.003 ± 0.522 | 2.787 ± 0.258 | 1.784 ± 0.714 | 1.338 ± 0.68 | 2.564 ± 0.915 | 2.453 ± 0.494 | 2.787 ± 0.848 | 3.679 ± 1.174 | 0.446 ± 0.16 | 2.341 ± 0.778 | 0.0 ± 0.0 |
| Xaa | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 |

Statistics based on 4 proteins (8971 amino acids)

Note: The error was estimated by bootstrapping (x100) at the protein level.
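One plausible reading of "bootstrapping at the protein level" is sketched below: whole proteins are resampled with replacement, the dipeptide frequency is recomputed for each replicate, and the spread of those estimates is reported as the error. This is an assumption, not the documented Proteome-pI procedure. The function names, the use of the standard deviation as the error measure and the fixed random seed are all hypothetical choices made for illustration.

```python
import random
import statistics


def dipeptide_permille(sequences, dipeptide):
    """Frequency (per mille) of one dipeptide over overlapping residue pairs."""
    hits = 0
    total = 0
    for seq in sequences:
        for i in range(len(seq) - 1):
            total += 1
            hits += seq[i:i + 2] == dipeptide
    return 1000.0 * hits / total if total else 0.0


def bootstrap_error(sequences, dipeptide, n_replicates=100, seed=0):
    """Spread of the frequency across protein-level bootstrap replicates.

    Assumption: the error is the population standard deviation of the
    replicate estimates; the source page does not say which measure it uses.
    """
    rng = random.Random(seed)
    estimates = []
    for _ in range(n_replicates):
        # Resample whole proteins with replacement, keeping the protein count.
        sample = rng.choices(sequences, k=len(sequences))
        estimates.append(dipeptide_permille(sample, dipeptide))
    return statistics.pstdev(estimates)
```

With only 4 proteins in this proteome, each replicate draws 4 proteins with replacement, so a single long protein can dominate the estimate. That may explain why some errors in the table exceed the frequency itself.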

The dipeptide statistics above (along with other statistics for this proteome) can be downloaded from this CSV file.
See this proteome in UniProt.
Proteome-pI is available under a Creative Commons Attribution-NoDerivs license; for more details see here.

Reference: Kozlowski LP. Proteome-pI 2.0: Proteome Isoelectric Point Database Update. Nucleic Acids Res. 2021, doi: 10.1093/nar/gkab944
Contact: Lukasz P. Kozlowski