Amino acid dipeptide frequency for Bat SARS-like coronavirus WIV1

Apart from single amino acid frequencies, one can also calculate the so-called amino acid dipeptide frequency, i.e. how often each ordered pair of adjacent amino acids occurs.
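
As a rough illustration of how such frequencies could be computed, the sketch below counts overlapping adjacent pairs within each protein and pools the counts over the proteome. The page does not describe the exact procedure, so the pooling, the mapping of non-standard residues to X (Xaa), and the function name `dipeptide_permille` are assumptions made for this example only.

```python
from collections import Counter

# The 20 standard amino acids in one-letter code.
STANDARD = set("ACDEFGHIKLMNPQRSTVWY")

def dipeptide_permille(proteins):
    """Return {dipeptide: frequency in per mille}, pooled over all proteins.

    Hypothetical helper: the counting scheme is an assumption, not the
    documented Proteome-pI procedure.
    """
    counts = Counter()
    for seq in proteins:
        # Map anything outside the 20 standard residues to X (Xaa).
        seq = "".join(aa if aa in STANDARD else "X" for aa in seq.upper())
        # Overlapping adjacent pairs; a pair never spans two proteins.
        counts.update(seq[i:i + 2] for i in range(len(seq) - 1))
    total = sum(counts.values())
    if total == 0:
        return {}
    return {pair: 1000.0 * n / total for pair, n in counts.items()}

# Toy input, not real WIV1 sequences.
freqs = dipeptide_permille(["MAAL", "ALB"])
```

On this toy input there are five pairs in total (MA, AA, AL, AL, LX), so `freqs["AL"]` is 400.0 and all frequencies sum to 1000.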

There are 400 possible dipeptides (441 if Xaa is counted as an additional 21st amino acid representing all non-standard or unknown amino acids). Thus, if dipeptides occurred randomly and with equal probability in proteins, each dipeptide would be present at a frequency of 0.25% (2.5‰). As this is not the case in nature, for better visibility all dipeptides that are more frequent than expected are marked in red in the table, and those that are underrepresented are marked in blue.
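
The expected value and the over/under labelling can be sketched as follows. The page does not say whether the threshold uses 400 or 441 dipeptides, or whether the error margin is taken into account, so the plain comparison below and the names `expected_permille` and `classify` are assumptions for illustration.

```python
def expected_permille(alphabet_size=20):
    """Uniform expectation in per mille: 1000 / (alphabet_size squared)."""
    return 1000.0 / alphabet_size ** 2

def classify(value, expected):
    """Label a frequency relative to the uniform expectation (assumed rule)."""
    if value > expected:
        return "over"    # would be shown in red
    if value < expected:
        return "under"   # would be shown in blue
    return "equal"

expected = expected_permille()            # 2.5 per mille for 400 dipeptides
leu_leu = classify(10.446, expected)      # LeuLeu value from the table
trp_trp = classify(0.07, expected)        # TrpTrp value from the table
```

With 21 residue types (including Xaa), `expected_permille(21)` gives roughly 2.268‰ instead of 2.5‰.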

All values are presented in per mille (‰) and therefore need to be multiplied by 10⁻³ to obtain fractions. For example, 6.31‰ corresponds to 0.00631, i.e. 0.631%.

For more information, see the sequence space article on Wikipedia.

Rows are grouped by first residue; within each group the second residue runs in the order: Ala Cys Asp Glu Phe Gly His Ile Lys Leu Met Asn Pro Gln Arg Ser Thr Val Trp Tyr Xaa. Values are frequency ± error in ‰.

Ala
AlaAla: 6.31 ± 0.551
AlaCys: 2.243 ± 0.341
AlaAsp: 2.874 ± 0.412
AlaGlu: 2.103 ± 0.465
AlaPhe: 2.874 ± 0.532
AlaGly: 4.697 ± 0.515
AlaHis: 1.052 ± 0.399
AlaIle: 4.066 ± 0.302
AlaLys: 3.646 ± 1.125
AlaLeu: 7.361 ± 0.712
AlaMet: 2.314 ± 0.391
AlaAsn: 3.646 ± 0.689
AlaPro: 2.594 ± 0.373
AlaGln: 2.173 ± 0.388
AlaArg: 3.435 ± 0.457
AlaSer: 4.978 ± 1.261
AlaThr: 5.188 ± 0.658
AlaVal: 4.557 ± 0.928
AlaTrp: 1.122 ± 0.219
AlaTyr: 3.856 ± 0.528
AlaXaa: 0.0 ± 0.0

Cys
CysAla: 2.664 ± 0.435
CysCys: 1.612 ± 0.278
CysAsp: 2.314 ± 0.525
CysGlu: 1.052 ± 0.398
CysPhe: 1.332 ± 0.488
CysGly: 2.524 ± 0.577
CysHis: 0.561 ± 0.125
CysIle: 1.753 ± 0.422
CysLys: 0.841 ± 0.301
CysLeu: 2.874 ± 0.583
CysMet: 0.701 ± 0.155
CysAsn: 1.332 ± 0.376
CysPro: 0.911 ± 0.233
CysGln: 0.561 ± 0.181
CysArg: 1.052 ± 0.347
CysSer: 2.103 ± 0.53
CysThr: 2.243 ± 0.426
CysVal: 2.944 ± 0.613
CysTrp: 0.351 ± 0.551
CysTyr: 1.612 ± 0.404
CysXaa: 0.0 ± 0.0

Asp
AspAla: 4.206 ± 0.965
AspCys: 1.192 ± 0.268
AspAsp: 2.454 ± 0.635
AspGlu: 2.874 ± 0.383
AspPhe: 2.874 ± 0.736
AspGly: 3.856 ± 0.649
AspHis: 1.052 ± 0.186
AspIle: 2.804 ± 0.574
AspLys: 2.524 ± 0.502
AspLeu: 4.557 ± 0.585
AspMet: 1.262 ± 0.278
AspAsn: 3.085 ± 0.446
AspPro: 1.472 ± 0.428
AspGln: 1.472 ± 0.323
AspArg: 1.262 ± 0.378
AspSer: 3.365 ± 0.458
AspThr: 3.716 ± 1.082
AspVal: 4.066 ± 0.691
AspTrp: 0.561 ± 0.181
AspTyr: 3.505 ± 0.432
AspXaa: 0.0 ± 0.0

Glu
GluAla: 3.295 ± 0.721
GluCys: 1.612 ± 0.474
GluAsp: 2.594 ± 0.552
GluGlu: 4.697 ± 0.929
GluPhe: 1.893 ± 0.605
GluGly: 2.664 ± 0.423
GluHis: 1.402 ± 0.411
GluIle: 2.594 ± 0.526
GluLys: 2.243 ± 0.548
GluLeu: 4.907 ± 0.917
GluMet: 1.052 ± 0.316
GluAsn: 1.823 ± 0.333
GluPro: 2.033 ± 0.674
GluGln: 1.823 ± 0.26
GluArg: 1.262 ± 0.296
GluSer: 2.804 ± 0.43
GluThr: 2.734 ± 0.646
GluVal: 4.136 ± 0.583
GluTrp: 0.421 ± 0.148
GluTyr: 2.173 ± 0.499
GluXaa: 0.0 ± 0.0

Phe
PheAla: 2.874 ± 0.791
PheCys: 1.893 ± 0.504
PheAsp: 2.874 ± 0.668
PheGlu: 1.683 ± 0.421
PhePhe: 2.173 ± 0.22
PheGly: 3.085 ± 1.12
PheHis: 0.491 ± 0.347
PheIle: 2.804 ± 0.488
PheLys: 3.015 ± 0.512
PheLeu: 5.679 ± 1.44
PheMet: 0.911 ± 0.239
PheAsn: 2.944 ± 1.319
PhePro: 1.963 ± 0.398
PheGln: 1.122 ± 0.791
PheArg: 1.472 ± 0.426
PheSer: 2.804 ± 0.399
PheThr: 3.926 ± 0.487
PheVal: 3.505 ± 0.599
PheTrp: 0.351 ± 0.316
PheTyr: 2.524 ± 0.56
PheXaa: 0.0 ± 0.0

Gly
GlyAla: 4.206 ± 0.781
GlyCys: 1.753 ± 0.357
GlyAsp: 3.786 ± 0.593
GlyGlu: 2.243 ± 0.254
GlyPhe: 3.365 ± 0.434
GlyGly: 4.066 ± 0.897
GlyHis: 1.332 ± 0.398
GlyIle: 3.575 ± 0.596
GlyLys: 2.874 ± 0.499
GlyLeu: 3.926 ± 0.576
GlyMet: 1.052 ± 0.395
GlyAsn: 2.734 ± 0.451
GlyPro: 2.384 ± 0.875
GlyGln: 2.173 ± 0.462
GlyArg: 1.753 ± 0.309
GlySer: 3.926 ± 0.3
GlyThr: 5.538 ± 0.936
GlyVal: 6.73 ± 1.057
GlyTrp: 0.351 ± 0.286
GlyTyr: 2.874 ± 0.393
GlyXaa: 0.0 ± 0.0

His
HisAla: 1.402 ± 0.218
HisCys: 0.701 ± 0.258
HisAsp: 0.911 ± 0.305
HisGlu: 1.192 ± 0.338
HisPhe: 1.262 ± 0.352
HisGly: 1.472 ± 0.324
HisHis: 0.561 ± 0.207
HisIle: 1.052 ± 0.438
HisLys: 0.771 ± 0.285
HisLeu: 2.033 ± 0.234
HisMet: 0.421 ± 0.206
HisAsn: 1.052 ± 0.328
HisPro: 0.561 ± 0.184
HisGln: 0.491 ± 0.305
HisArg: 0.351 ± 0.361
HisSer: 1.753 ± 0.304
HisThr: 1.893 ± 0.352
HisVal: 1.542 ± 0.342
HisTrp: 0.351 ± 0.183
HisTyr: 0.841 ± 0.418
HisXaa: 0.0 ± 0.0

Ile
IleAla: 3.575 ± 1.309
IleCys: 1.192 ± 0.333
IleAsp: 3.295 ± 0.373
IleGlu: 1.823 ± 0.336
IlePhe: 1.612 ± 0.466
IleGly: 3.295 ± 0.93
IleHis: 0.491 ± 0.435
IleIle: 2.944 ± 1.331
IleLys: 3.505 ± 0.592
IleLeu: 4.066 ± 0.423
IleMet: 1.823 ± 0.605
IleAsn: 3.085 ± 0.377
IlePro: 1.823 ± 0.451
IleGln: 2.033 ± 0.348
IleArg: 1.823 ± 0.325
IleSer: 3.435 ± 0.707
IleThr: 4.627 ± 0.611
IleVal: 4.487 ± 0.636
IleTrp: 0.351 ± 0.228
IleTyr: 1.052 ± 0.503
IleXaa: 0.0 ± 0.0

Lys
LysAla: 2.804 ± 0.821
LysCys: 1.893 ± 0.355
LysAsp: 2.804 ± 0.752
LysGlu: 2.944 ± 0.423
LysPhe: 2.734 ± 0.569
LysGly: 4.978 ± 0.882
LysHis: 1.683 ± 0.477
LysIle: 2.384 ± 0.527
LysLys: 2.804 ± 1.115
LysLeu: 6.73 ± 0.556
LysMet: 1.542 ± 0.284
LysAsn: 2.103 ± 0.262
LysPro: 3.435 ± 0.375
LysGln: 1.332 ± 0.743
LysArg: 2.664 ± 0.164
LysSer: 3.856 ± 0.507
LysThr: 3.505 ± 0.63
LysVal: 3.225 ± 0.663
LysTrp: 0.701 ± 0.177
LysTyr: 2.103 ± 0.358
LysXaa: 0.0 ± 0.0

Leu
LeuAla: 6.66 ± 0.803
LeuCys: 2.734 ± 0.408
LeuAsp: 5.679 ± 0.713
LeuGlu: 4.627 ± 1.028
LeuPhe: 3.225 ± 1.139
LeuGly: 5.328 ± 0.581
LeuHis: 1.753 ± 0.413
LeuIle: 3.435 ± 1.564
LeuLys: 6.87 ± 0.951
LeuLeu: 10.446 ± 1.66
LeuMet: 2.734 ± 0.399
LeuAsn: 6.31 ± 0.7
LeuPro: 4.697 ± 0.885
LeuGln: 4.697 ± 0.933
LeuArg: 4.907 ± 0.66
LeuSer: 7.642 ± 1.332
LeuThr: 5.819 ± 0.63
LeuVal: 6.169 ± 1.341
LeuTrp: 1.052 ± 0.364
LeuTyr: 3.505 ± 0.758
LeuXaa: 0.0 ± 0.0

Met
MetAla: 1.683 ± 0.559
MetCys: 0.911 ± 0.291
MetAsp: 1.542 ± 0.432
MetGlu: 0.841 ± 0.621
MetPhe: 0.911 ± 0.304
MetGly: 0.771 ± 0.286
MetHis: 0.491 ± 0.158
MetIle: 0.561 ± 0.307
MetLys: 1.052 ± 0.327
MetLeu: 3.015 ± 0.532
MetMet: 0.771 ± 0.268
MetAsn: 0.981 ± 0.259
MetPro: 1.402 ± 0.317
MetGln: 1.402 ± 0.318
MetArg: 0.771 ± 0.412
MetSer: 2.243 ± 0.469
MetThr: 1.472 ± 0.272
MetVal: 1.262 ± 0.36
MetTrp: 0.701 ± 0.35
MetTyr: 1.402 ± 0.293
MetXaa: 0.0 ± 0.0

Asn
AsnAla: 4.206 ± 0.648
AsnCys: 1.683 ± 0.347
AsnAsp: 1.683 ± 0.306
AsnGlu: 1.823 ± 0.35
AsnPhe: 2.033 ± 1.312
AsnGly: 4.347 ± 0.617
AsnHis: 1.472 ± 0.445
AsnIle: 2.524 ± 0.491
AsnLys: 3.015 ± 0.614
AsnLeu: 4.557 ± 0.647
AsnMet: 1.472 ± 0.445
AsnAsn: 3.575 ± 0.901
AsnPro: 1.823 ± 0.406
AsnGln: 1.332 ± 0.71
AsnArg: 1.612 ± 0.515
AsnSer: 3.646 ± 0.885
AsnThr: 3.155 ± 1.02
AsnVal: 4.417 ± 0.549
AsnTrp: 0.491 ± 0.207
AsnTyr: 2.384 ± 0.368
AsnXaa: 0.0 ± 0.0

Pro
ProAla: 2.804 ± 0.372
ProCys: 1.262 ± 0.418
ProAsp: 1.753 ± 0.544
ProGlu: 1.542 ± 0.318
ProPhe: 2.173 ± 0.873
ProGly: 2.033 ± 0.325
ProHis: 0.631 ± 0.188
ProIle: 2.874 ± 0.33
ProLys: 3.155 ± 0.656
ProLeu: 4.627 ± 0.73
ProMet: 0.491 ± 0.464
ProAsn: 2.033 ± 0.39
ProPro: 1.753 ± 0.303
ProGln: 1.332 ± 1.193
ProArg: 1.823 ± 0.761
ProSer: 2.314 ± 0.453
ProThr: 3.225 ± 0.493
ProVal: 3.435 ± 0.741
ProTrp: 0.28 ± 0.086
ProTyr: 1.052 ± 0.201
ProXaa: 0.0 ± 0.0

Gln
GlnAla: 3.015 ± 0.464
GlnCys: 1.122 ± 0.222
GlnAsp: 1.542 ± 0.398
GlnGlu: 2.033 ± 0.624
GlnPhe: 1.542 ± 0.554
GlnGly: 2.033 ± 1.082
GlnHis: 0.701 ± 0.381
GlnIle: 1.893 ± 1.289
GlnLys: 1.402 ± 0.414
GlnLeu: 3.856 ± 0.786
GlnMet: 0.981 ± 0.203
GlnAsn: 1.402 ± 0.514
GlnPro: 2.243 ± 0.357
GlnGln: 1.753 ± 0.47
GlnArg: 1.542 ± 0.5
GlnSer: 1.753 ± 0.302
GlnThr: 2.384 ± 0.347
GlnVal: 2.804 ± 0.632
GlnTrp: 0.631 ± 0.19
GlnTyr: 1.122 ± 0.237
GlnXaa: 0.0 ± 0.0

Arg
ArgAla: 3.716 ± 0.67
ArgCys: 1.122 ± 0.274
ArgAsp: 1.893 ± 0.41
ArgGlu: 2.384 ± 0.592
ArgPhe: 1.472 ± 0.355
ArgGly: 2.314 ± 1.275
ArgHis: 1.122 ± 0.238
ArgIle: 1.963 ± 1.035
ArgLys: 2.103 ± 0.331
ArgLeu: 2.944 ± 0.458
ArgMet: 0.561 ± 0.385
ArgAsn: 1.753 ± 0.749
ArgPro: 1.192 ± 0.381
ArgGln: 1.753 ± 0.807
ArgArg: 0.841 ± 0.775
ArgSer: 3.015 ± 0.453
ArgThr: 1.683 ± 0.47
ArgVal: 3.435 ± 0.51
ArgTrp: 0.421 ± 0.282
ArgTyr: 1.262 ± 0.369
ArgXaa: 0.0 ± 0.0

Ser
SerAla: 5.679 ± 0.896
SerCys: 1.963 ± 0.541
SerAsp: 3.575 ± 1.003
SerGlu: 3.716 ± 0.563
SerPhe: 4.206 ± 0.836
SerGly: 3.926 ± 0.968
SerHis: 1.612 ± 0.451
SerIle: 2.384 ± 0.317
SerLys: 3.155 ± 0.394
SerLeu: 6.59 ± 1.327
SerMet: 1.402 ± 0.388
SerAsn: 3.155 ± 0.848
SerPro: 2.314 ± 0.864
SerGln: 2.173 ± 0.473
SerArg: 1.963 ± 1.85
SerSer: 4.066 ± 0.966
SerThr: 4.907 ± 0.682
SerVal: 6.239 ± 0.745
SerTrp: 1.122 ± 0.16
SerTyr: 3.225 ± 0.535
SerXaa: 0.0 ± 0.0

Thr
ThrAla: 3.505 ± 1.194
ThrCys: 2.524 ± 0.824
ThrAsp: 3.085 ± 0.842
ThrGlu: 3.926 ± 0.587
ThrPhe: 4.557 ± 0.529
ThrGly: 4.206 ± 0.543
ThrHis: 1.612 ± 0.596
ThrIle: 4.557 ± 0.766
ThrLys: 3.856 ± 0.375
ThrLeu: 6.8 ± 0.529
ThrMet: 1.542 ± 0.402
ThrAsn: 3.225 ± 0.468
ThrPro: 3.015 ± 0.482
ThrGln: 3.295 ± 0.956
ThrArg: 2.734 ± 0.38
ThrSer: 5.749 ± 0.818
ThrThr: 6.38 ± 1.237
ThrVal: 5.048 ± 0.648
ThrTrp: 0.491 ± 0.187
ThrTyr: 2.314 ± 0.336
ThrXaa: 0.0 ± 0.0

Val
ValAla: 5.889 ± 0.891
ValCys: 2.033 ± 0.565
ValAsp: 4.767 ± 1.005
ValGlu: 4.347 ± 1.109
ValPhe: 3.856 ± 0.676
ValGly: 2.944 ± 0.667
ValHis: 1.192 ± 0.347
ValIle: 3.926 ± 0.389
ValLys: 4.907 ± 0.804
ValLeu: 8.202 ± 0.665
ValMet: 1.753 ± 0.293
ValAsn: 3.225 ± 0.676
ValPro: 2.944 ± 0.202
ValGln: 3.155 ± 0.597
ValArg: 3.085 ± 0.413
ValSer: 4.627 ± 0.911
ValThr: 6.941 ± 0.864
ValVal: 7.011 ± 0.94
ValTrp: 0.631 ± 0.153
ValTyr: 4.206 ± 0.441
ValXaa: 0.0 ± 0.0

Trp
TrpAla: 0.701 ± 0.258
TrpCys: 0.21 ± 0.08
TrpAsp: 0.351 ± 0.234
TrpGlu: 0.491 ± 0.114
TrpPhe: 1.402 ± 0.425
TrpGly: 0.21 ± 0.121
TrpHis: 0.351 ± 0.354
TrpIle: 0.491 ± 0.174
TrpLys: 0.561 ± 0.198
TrpLeu: 1.472 ± 0.45
TrpMet: 0.14 ± 0.055
TrpAsn: 1.262 ± 0.286
TrpPro: 0.421 ± 0.419
TrpGln: 0.28 ± 0.17
TrpArg: 0.21 ± 0.121
TrpSer: 0.701 ± 0.209
TrpThr: 0.421 ± 0.136
TrpVal: 0.631 ± 0.27
TrpTrp: 0.07 ± 0.047
TrpTyr: 0.491 ± 0.307
TrpXaa: 0.0 ± 0.0

Tyr
TyrAla: 1.963 ± 0.425
TyrCys: 1.683 ± 0.411
TyrAsp: 2.384 ± 0.607
TyrGlu: 1.823 ± 0.302
TyrPhe: 2.594 ± 0.455
TyrGly: 1.893 ± 0.234
TyrHis: 1.052 ± 0.327
TyrIle: 1.753 ± 0.347
TyrLys: 3.926 ± 0.469
TyrLeu: 3.646 ± 0.834
TyrMet: 1.192 ± 0.404
TyrAsn: 2.524 ± 0.336
TyrPro: 1.612 ± 0.387
TyrGln: 1.402 ± 0.587
TyrArg: 2.384 ± 0.382
TyrSer: 2.594 ± 0.689
TyrThr: 2.664 ± 0.374
TyrVal: 3.996 ± 0.866
TyrTrp: 0.421 ± 0.191
TyrTyr: 2.524 ± 0.335
TyrXaa: 0.0 ± 0.0

Xaa
XaaAla: 0.0 ± 0.0
XaaCys: 0.0 ± 0.0
XaaAsp: 0.0 ± 0.0
XaaGlu: 0.0 ± 0.0
XaaPhe: 0.0 ± 0.0
XaaGly: 0.0 ± 0.0
XaaHis: 0.0 ± 0.0
XaaIle: 0.0 ± 0.0
XaaLys: 0.0 ± 0.0
XaaLeu: 0.0 ± 0.0
XaaMet: 0.0 ± 0.0
XaaAsn: 0.0 ± 0.0
XaaPro: 0.0 ± 0.0
XaaGln: 0.0 ± 0.0
XaaArg: 0.0 ± 0.0
XaaSer: 0.0 ± 0.0
XaaThr: 0.0 ± 0.0
XaaVal: 0.0 ± 0.0
XaaTrp: 0.0 ± 0.0
XaaTyr: 0.0 ± 0.0
XaaXaa: 0.0 ± 0.0

Statistics based on 13 proteins (14265 amino acids)

Note: The error was estimated by bootstrapping (x100) at the protein level.
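
A protein-level bootstrap of this kind could look roughly like the sketch below: whole proteins are resampled with replacement, the frequency is recomputed for each resample, and the spread of the estimates is reported. The page only states "bootstrapping (x100) at the protein level", so the use of the standard deviation as the error, the fixed seed, and the names `pair_permille` and `bootstrap_error` are assumptions of this example.

```python
import random
import statistics

def pair_permille(proteins, pair):
    """Per-mille frequency of one dipeptide, pooled over the given proteins."""
    hits = total = 0
    for seq in proteins:
        for i in range(len(seq) - 1):
            total += 1
            hits += seq[i:i + 2] == pair  # True counts as 1
    return 1000.0 * hits / total if total else 0.0

def bootstrap_error(proteins, pair, n_rounds=100, seed=0):
    """Spread of the frequency over protein-level resamples (assumed: std. dev.)."""
    rng = random.Random(seed)
    estimates = []
    for _ in range(n_rounds):
        # Resample whole proteins with replacement, keeping the proteome size.
        sample = rng.choices(proteins, k=len(proteins))
        estimates.append(pair_permille(sample, pair))
    return statistics.pstdev(estimates)

# Toy sequences in one-letter code, not real WIV1 proteins.
toy = ["MALLA", "GLLK", "MKT"]
err = bootstrap_error(toy, "LL")
```

Resampling whole proteins rather than individual residues keeps each sequence intact, so the error reflects how much the estimate depends on which proteins are in the set.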

The dipeptide statistics above (along with other statistics for this proteome) can be downloaded from this CSV file
See this proteome in: uniprot_link
Proteome-pI is available under the Creative Commons Attribution-NoDerivs license; for more details see here

Reference: Kozlowski LP. Proteome-pI 2.0: Proteome Isoelectric Point Database Update. Nucleic Acids Res. 2021, doi: 10.1093/nar/gkab944
Contact: Lukasz P. Kozlowski