Amino acid dipeptide frequency for Changjiang astro-like virus

Apart from single amino acid frequencies, one can also calculate the so-called amino acid dipeptide frequency, i.e. how often each ordered pair of adjacent residues occurs.
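As a minimal sketch of such a calculation (not the Proteome-pI implementation), the snippet below counts overlapping residue pairs within each protein and reports them in per mille. The function name dipeptide_frequencies and the two toy sequences are hypothetical, one-letter codes are used instead of the three-letter codes of the table, and it is an assumption that pairs are not counted across protein boundaries.

```python
from collections import Counter


def dipeptide_frequencies(proteins):
    """Return {dipeptide: frequency in per mille} for overlapping residue pairs.

    Assumption: pairs are counted inside each protein only, never across
    the boundary between two proteins.
    """
    counts = Counter()
    for seq in proteins:
        # Slide a window of length 2 over the sequence.
        for i in range(len(seq) - 1):
            counts[seq[i:i + 2]] += 1
    total = sum(counts.values())
    if total == 0:
        return {}
    return {pair: 1000.0 * n / total for pair, n in counts.items()}


# Hypothetical toy input: two short sequences in one-letter code.
freqs = dipeptide_frequencies(["MKVLA", "AAK"])
for pair, value in sorted(freqs.items()):
    print(f"{pair}: {value:.3f}")
```

By construction, the frequencies of all observed dipeptides add up to 1000 per mille.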

There are 400 possible dipeptides (441 if Xaa is counted as an additional, 21st amino acid representing all non-standard or unknown amino acids). Thus, if dipeptides occurred randomly and with equal probability in proteins, each amino acid dipeptide would be present with a frequency of 0.25% (1/400, i.e. 2.5‰). As this is not the case in nature, for better visibility dipeptides that occur more often than expected are marked in red in the table, and those that are underrepresented are marked in blue.
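The expected value can be checked with a few lines of Python. This is only an illustrative sketch: the helper names expected_per_mille and classify are hypothetical, and the exact rule the page uses for colouring is not stated, so a plain comparison against the uniform expectation is assumed.

```python
def expected_per_mille(n_residue_types=20):
    """Uniform expectation for a single dipeptide, in per mille."""
    return 1000.0 / (n_residue_types ** 2)


def classify(freq_per_mille, expected):
    """Label a dipeptide relative to the uniform expectation (assumed rule)."""
    if freq_per_mille > expected:
        return "over"    # marked in red in the original table
    if freq_per_mille < expected:
        return "under"   # marked in blue in the original table
    return "expected"


expected = expected_per_mille()       # 20 * 20 = 400 dipeptides
print(expected)                       # 2.5 per mille, i.e. 0.25%
print(classify(9.519, expected))      # ValVal value from the table
print(classify(0.476, expected))      # AlaMet value from the table
```

With Xaa included, the same function called as expected_per_mille(21) gives the expectation for 441 dipeptides.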

All values are given in per mille (‰) and therefore need to be multiplied by 10⁻³ to obtain fractions.

For more information, see the sequence space article on Wikipedia.

Ala Cys Asp Glu Phe Gly His Ile Lys Leu Met Asn Pro Gln Arg Ser Thr Val Trp Tyr Xaa
Ala
AlaAla: 3.332 ± 0.636
AlaCys: 1.428 ± 1.069
AlaAsp: 4.76 ± 1.703
AlaGlu: 2.38 ± 1.256
AlaPhe: 0.952 ± 0.797
AlaGly: 3.332 ± 1.55
AlaHis: 0.952 ± 0.414
AlaIle: 4.284 ± 1.15
AlaLys: 2.38 ± 0.511
AlaLeu: 4.76 ± 1.294
AlaMet: 0.476 ± 0.356
AlaAsn: 2.856 ± 1.119
AlaPro: 2.856 ± 0.541
AlaGln: 1.428 ± 1.433
AlaArg: 1.904 ± 0.534
AlaSer: 2.856 ± 1.108
AlaThr: 4.284 ± 2.297
AlaVal: 4.284 ± 0.373
AlaTrp: 0.476 ± 0.398
AlaTyr: 3.332 ± 0.636
AlaXaa: 0.0 ± 0.0
Cys
CysAla: 1.428 ± 0.662
CysCys: 0.0 ± 0.0
CysAsp: 2.38 ± 1.31
CysGlu: 1.904 ± 0.555
CysPhe: 0.952 ± 0.414
CysGly: 2.38 ± 0.758
CysHis: 0.0 ± 0.0
CysIle: 1.904 ± 1.313
CysLys: 0.952 ± 0.51
CysLeu: 2.856 ± 1.367
CysMet: 1.904 ± 0.703
CysAsn: 0.476 ± 0.47
CysPro: 0.476 ± 0.356
CysGln: 0.0 ± 0.0
CysArg: 1.904 ± 1.322
CysSer: 1.428 ± 0.462
CysThr: 0.476 ± 0.398
CysVal: 2.38 ± 0.821
CysTrp: 0.476 ± 0.398
CysTyr: 0.476 ± 0.47
CysXaa: 0.0 ± 0.0
Asp
AspAla: 3.808 ± 1.496
AspCys: 0.952 ± 0.466
AspAsp: 3.808 ± 1.163
AspGlu: 2.38 ± 1.317
AspPhe: 1.904 ± 1.101
AspGly: 5.236 ± 2.463
AspHis: 1.428 ± 0.462
AspIle: 2.856 ± 0.957
AspLys: 2.38 ± 1.119
AspLeu: 3.332 ± 0.421
AspMet: 1.904 ± 0.321
AspAsn: 3.332 ± 1.271
AspPro: 3.808 ± 1.197
AspGln: 1.904 ± 0.979
AspArg: 3.332 ± 1.345
AspSer: 4.76 ± 0.981
AspThr: 1.904 ± 1.011
AspVal: 4.76 ± 0.416
AspTrp: 2.38 ± 0.734
AspTyr: 3.332 ± 0.777
AspXaa: 0.0 ± 0.0
Glu
GluAla: 1.428 ± 0.581
GluCys: 1.428 ± 0.662
GluAsp: 3.332 ± 1.206
GluGlu: 5.236 ± 1.402
GluPhe: 1.904 ± 1.353
GluGly: 3.808 ± 0.641
GluHis: 2.856 ± 0.48
GluIle: 3.332 ± 1.708
GluLys: 3.808 ± 0.75
GluLeu: 3.808 ± 1.067
GluMet: 2.38 ± 1.487
GluAsn: 2.856 ± 1.743
GluPro: 1.904 ± 0.726
GluGln: 2.38 ± 1.209
GluArg: 1.428 ± 0.462
GluSer: 1.904 ± 0.849
GluThr: 2.856 ± 1.184
GluVal: 5.712 ± 0.527
GluTrp: 1.904 ± 1.151
GluTyr: 3.808 ± 0.75
GluXaa: 0.0 ± 0.0
Phe
PheAla: 1.904 ± 0.375
PheCys: 1.904 ± 0.321
PheAsp: 2.38 ± 0.67
PheGlu: 2.38 ± 0.86
PhePhe: 0.952 ± 0.956
PheGly: 1.904 ± 0.375
PheHis: 0.952 ± 0.414
PheIle: 0.952 ± 0.51
PheLys: 2.856 ± 1.277
PheLeu: 2.856 ± 0.982
PheMet: 1.904 ± 0.534
PheAsn: 1.428 ± 0.648
PhePro: 0.952 ± 0.956
PheGln: 0.952 ± 0.797
PheArg: 0.476 ± 0.47
PheSer: 1.428 ± 0.904
PheThr: 2.38 ± 0.538
PheVal: 1.428 ± 0.904
PheTrp: 0.952 ± 0.466
PheTyr: 0.0 ± 0.0
PheXaa: 0.0 ± 0.0
Gly
GlyAla: 4.76 ± 1.002
GlyCys: 1.904 ± 1.322
GlyAsp: 4.76 ± 0.779
GlyGlu: 3.808 ± 0.768
GlyPhe: 1.904 ± 0.534
GlyGly: 4.284 ± 0.824
GlyHis: 1.904 ± 0.986
GlyIle: 3.332 ± 0.421
GlyLys: 3.808 ± 1.717
GlyLeu: 5.712 ± 2.352
GlyMet: 1.904 ± 1.101
GlyAsn: 2.38 ± 0.67
GlyPro: 0.952 ± 0.466
GlyGln: 1.428 ± 0.801
GlyArg: 2.38 ± 1.116
GlySer: 3.808 ± 1.279
GlyThr: 2.856 ± 0.541
GlyVal: 5.236 ± 1.086
GlyTrp: 3.332 ± 1.52
GlyTyr: 4.76 ± 0.926
GlyXaa: 0.0 ± 0.0
His
HisAla: 0.476 ± 0.398
HisCys: 0.476 ± 0.356
HisAsp: 2.38 ± 1.045
HisGlu: 1.904 ± 1.227
HisPhe: 0.0 ± 0.0
HisGly: 4.284 ± 1.782
HisHis: 0.0 ± 0.0
HisIle: 0.952 ± 0.466
HisLys: 2.38 ± 0.68
HisLeu: 0.952 ± 0.466
HisMet: 0.952 ± 0.414
HisAsn: 0.952 ± 0.414
HisPro: 0.0 ± 0.0
HisGln: 0.476 ± 0.356
HisArg: 1.904 ± 0.835
HisSer: 0.0 ± 0.0
HisThr: 2.38 ± 0.741
HisVal: 2.38 ± 1.045
HisTrp: 0.0 ± 0.0
HisTyr: 1.428 ± 0.684
HisXaa: 0.0 ± 0.0
Ile
IleAla: 1.904 ± 0.375
IleCys: 0.952 ± 0.576
IleAsp: 4.284 ± 1.432
IleGlu: 1.904 ± 0.321
IlePhe: 1.428 ± 0.581
IleGly: 2.38 ± 0.67
IleHis: 2.38 ± 0.86
IleIle: 2.856 ± 2.138
IleLys: 4.76 ± 1.891
IleLeu: 4.76 ± 0.508
IleMet: 1.428 ± 0.271
IleAsn: 4.284 ± 1.241
IlePro: 2.38 ± 0.68
IleGln: 0.952 ± 0.395
IleArg: 1.904 ± 0.835
IleSer: 3.808 ± 1.965
IleThr: 4.284 ± 1.202
IleVal: 6.663 ± 0.465
IleTrp: 2.38 ± 0.86
IleTyr: 2.856 ± 1.224
IleXaa: 0.0 ± 0.0
Lys
LysAla: 2.38 ± 0.68
LysCys: 0.952 ± 0.395
LysAsp: 1.904 ± 1.594
LysGlu: 1.904 ± 0.828
LysPhe: 2.856 ± 0.432
LysGly: 2.856 ± 0.957
LysHis: 1.428 ± 1.069
LysIle: 2.38 ± 0.511
LysLys: 5.712 ± 2.512
LysLeu: 6.663 ± 1.292
LysMet: 2.38 ± 0.511
LysAsn: 1.428 ± 0.904
LysPro: 2.38 ± 0.68
LysGln: 1.904 ± 0.534
LysArg: 6.663 ± 1.581
LysSer: 3.332 ± 0.964
LysThr: 4.284 ± 0.803
LysVal: 3.808 ± 0.75
LysTrp: 1.428 ± 0.684
LysTyr: 2.38 ± 0.547
LysXaa: 0.0 ± 0.0
Leu
LeuAla: 4.76 ± 1.464
LeuCys: 2.856 ± 1.356
LeuAsp: 5.236 ± 1.794
LeuGlu: 7.615 ± 1.674
LeuPhe: 1.428 ± 1.043
LeuGly: 2.856 ± 0.48
LeuHis: 1.904 ± 0.979
LeuIle: 4.76 ± 1.357
LeuLys: 5.236 ± 1.694
LeuLeu: 6.663 ± 0.867
LeuMet: 2.856 ± 1.324
LeuAsn: 4.284 ± 1.335
LeuPro: 3.332 ± 0.434
LeuGln: 1.904 ± 0.849
LeuArg: 5.712 ± 1.103
LeuSer: 7.615 ± 0.635
LeuThr: 1.904 ± 0.79
LeuVal: 6.188 ± 1.83
LeuTrp: 1.904 ± 0.979
LeuTyr: 3.332 ± 1.275
LeuXaa: 0.0 ± 0.0
Met
MetAla: 3.332 ± 0.708
MetCys: 0.0 ± 0.0
MetAsp: 0.476 ± 0.398
MetGlu: 1.428 ± 0.73
MetPhe: 1.428 ± 0.484
MetGly: 2.38 ± 0.511
MetHis: 0.0 ± 0.0
MetIle: 1.904 ± 0.828
MetLys: 1.428 ± 0.662
MetLeu: 2.38 ± 0.86
MetMet: 0.952 ± 0.395
MetAsn: 2.856 ± 0.957
MetPro: 0.952 ± 0.414
MetGln: 1.428 ± 0.271
MetArg: 2.38 ± 1.487
MetSer: 2.856 ± 1.419
MetThr: 0.476 ± 0.356
MetVal: 2.38 ± 1.782
MetTrp: 0.952 ± 0.51
MetTyr: 0.0 ± 0.0
MetXaa: 0.0 ± 0.0
Asn
AsnAla: 2.38 ± 0.538
AsnCys: 1.904 ± 1.322
AsnAsp: 0.952 ± 0.51
AsnGlu: 2.38 ± 0.747
AsnPhe: 1.428 ± 0.271
AsnGly: 5.712 ± 1.298
AsnHis: 1.904 ± 0.591
AsnIle: 4.284 ± 1.196
AsnLys: 1.904 ± 0.555
AsnLeu: 1.904 ± 0.591
AsnMet: 1.428 ± 0.271
AsnAsn: 2.38 ± 1.403
AsnPro: 1.904 ± 0.375
AsnGln: 0.952 ± 0.414
AsnArg: 1.428 ± 0.271
AsnSer: 2.856 ± 1.119
AsnThr: 5.236 ± 2.533
AsnVal: 4.284 ± 0.791
AsnTrp: 1.904 ± 0.983
AsnTyr: 0.476 ± 0.398
AsnXaa: 0.0 ± 0.0
Pro
ProAla: 0.476 ± 0.398
ProCys: 1.904 ± 1.227
ProAsp: 1.428 ± 0.271
ProGlu: 1.904 ± 0.555
ProPhe: 0.952 ± 0.713
ProGly: 3.808 ± 1.182
ProHis: 1.428 ± 0.662
ProIle: 3.332 ± 0.636
ProLys: 2.38 ± 0.67
ProLeu: 5.236 ± 0.803
ProMet: 0.0 ± 0.0
ProAsn: 2.38 ± 0.538
ProPro: 1.904 ± 0.979
ProGln: 2.38 ± 0.547
ProArg: 1.428 ± 0.271
ProSer: 2.856 ± 1.162
ProThr: 3.808 ± 0.613
ProVal: 2.38 ± 0.927
ProTrp: 0.476 ± 0.47
ProTyr: 0.0 ± 0.0
ProXaa: 0.0 ± 0.0
Gln
GlnAla: 1.428 ± 0.801
GlnCys: 0.952 ± 0.414
GlnAsp: 3.332 ± 1.15
GlnGlu: 1.428 ± 0.662
GlnPhe: 0.476 ± 0.47
GlnGly: 1.428 ± 0.904
GlnHis: 0.0 ± 0.0
GlnIle: 3.332 ± 2.543
GlnLys: 1.904 ± 1.258
GlnLeu: 3.808 ± 1.282
GlnMet: 1.428 ± 0.662
GlnAsn: 1.428 ± 0.581
GlnPro: 1.428 ± 0.581
GlnGln: 0.476 ± 0.356
GlnArg: 2.856 ± 0.713
GlnSer: 1.428 ± 0.904
GlnThr: 1.904 ± 0.534
GlnVal: 3.332 ± 1.898
GlnTrp: 1.428 ± 0.484
GlnTyr: 0.476 ± 0.398
GlnXaa: 0.0 ± 0.0
Arg
ArgAla: 3.332 ± 0.959
ArgCys: 1.904 ± 1.88
ArgAsp: 4.76 ± 1.459
ArgGlu: 2.38 ± 1.045
ArgPhe: 2.856 ± 1.108
ArgGly: 1.428 ± 0.484
ArgHis: 2.38 ± 0.838
ArgIle: 2.856 ± 1.224
ArgLys: 2.38 ± 1.209
ArgLeu: 8.091 ± 3.257
ArgMet: 0.952 ± 0.713
ArgAsn: 2.38 ± 0.199
ArgPro: 1.904 ± 0.726
ArgGln: 3.808 ± 1.522
ArgArg: 4.76 ± 0.926
ArgSer: 4.284 ± 0.791
ArgThr: 1.428 ± 1.433
ArgVal: 1.428 ± 0.801
ArgTrp: 1.428 ± 0.871
ArgTyr: 3.332 ± 1.271
ArgXaa: 0.0 ± 0.0
Ser
SerAla: 2.856 ± 0.432
SerCys: 0.952 ± 0.713
SerAsp: 3.808 ± 0.09
SerGlu: 2.856 ± 0.541
SerPhe: 4.284 ± 2.238
SerGly: 8.091 ± 1.6
SerHis: 0.0 ± 0.0
SerIle: 3.332 ± 0.548
SerLys: 4.284 ± 1.335
SerLeu: 4.284 ± 1.335
SerMet: 1.428 ± 0.271
SerAsn: 2.38 ± 0.927
SerPro: 3.332 ± 0.959
SerGln: 1.904 ± 0.321
SerArg: 2.856 ± 1.46
SerSer: 3.332 ± 0.821
SerThr: 5.236 ± 2.129
SerVal: 3.332 ± 0.421
SerTrp: 0.0 ± 0.0
SerTyr: 2.856 ± 0.891
SerXaa: 0.0 ± 0.0
Thr
ThrAla: 3.332 ± 2.061
ThrCys: 0.952 ± 0.661
ThrAsp: 2.38 ± 0.927
ThrGlu: 2.856 ± 0.48
ThrPhe: 1.428 ± 0.78
ThrGly: 2.856 ± 1.018
ThrHis: 1.428 ± 0.73
ThrIle: 5.236 ± 1.742
ThrLys: 4.76 ± 1.009
ThrLeu: 5.236 ± 1.178
ThrMet: 0.476 ± 0.398
ThrAsn: 1.904 ± 0.841
ThrPro: 3.332 ± 0.708
ThrGln: 4.284 ± 1.208
ThrArg: 4.76 ± 1.561
ThrSer: 3.332 ± 0.777
ThrThr: 2.856 ± 1.018
ThrVal: 4.284 ± 2.53
ThrTrp: 1.428 ± 0.866
ThrTyr: 4.76 ± 1.34
ThrXaa: 0.0 ± 0.0
Val
ValAla: 5.236 ± 1.688
ValCys: 2.38 ± 0.927
ValAsp: 3.808 ± 1.314
ValGlu: 5.712 ± 1.138
ValPhe: 0.476 ± 0.398
ValGly: 3.808 ± 0.09
ValHis: 1.428 ± 1.195
ValIle: 5.712 ± 1.129
ValLys: 2.856 ± 1.551
ValLeu: 3.808 ± 0.681
ValMet: 2.856 ± 1.108
ValAsn: 4.76 ± 1.072
ValPro: 3.808 ± 0.09
ValGln: 3.332 ± 1.4
ValArg: 4.284 ± 1.478
ValSer: 3.808 ± 1.496
ValThr: 6.188 ± 2.702
ValVal: 9.519 ± 3.386
ValTrp: 1.904 ± 0.979
ValTyr: 3.808 ± 0.681
ValXaa: 0.0 ± 0.0
Trp
TrpAla: 1.428 ± 0.648
TrpCys: 0.952 ± 0.466
TrpAsp: 1.904 ± 1.227
TrpGlu: 2.38 ± 1.388
TrpPhe: 1.428 ± 0.484
TrpGly: 0.476 ± 0.356
TrpHis: 0.476 ± 0.356
TrpIle: 0.0 ± 0.0
TrpLys: 1.428 ± 0.73
TrpLeu: 1.904 ± 0.979
TrpMet: 1.428 ± 0.67
TrpAsn: 0.476 ± 0.47
TrpPro: 0.952 ± 0.576
TrpGln: 0.952 ± 0.576
TrpArg: 2.856 ± 0.844
TrpSer: 3.332 ± 0.821
TrpThr: 1.904 ± 0.591
TrpVal: 0.952 ± 0.713
TrpTrp: 0.0 ± 0.0
TrpTyr: 0.476 ± 0.47
TrpXaa: 0.0 ± 0.0
Tyr
TyrAla: 3.332 ± 1.078
TyrCys: 0.476 ± 0.47
TyrAsp: 1.428 ± 0.462
TyrGlu: 3.332 ± 0.434
TyrPhe: 1.904 ± 0.841
TyrGly: 1.904 ± 1.151
TyrHis: 1.428 ± 0.871
TyrIle: 0.952 ± 0.395
TyrLys: 1.428 ± 0.73
TyrLeu: 3.332 ± 1.4
TyrMet: 0.476 ± 0.39
TyrAsn: 2.38 ± 1.076
TyrPro: 1.904 ± 0.726
TyrGln: 1.428 ± 0.271
TyrArg: 2.856 ± 0.836
TyrSer: 2.38 ± 0.199
TyrThr: 5.236 ± 0.634
TyrVal: 4.76 ± 0.553
TyrTrp: 0.952 ± 0.661
TyrTyr: 3.332 ± 0.636
TyrXaa: 0.0 ± 0.0
Xaa
XaaAla: 0.0 ± 0.0
XaaCys: 0.0 ± 0.0
XaaAsp: 0.0 ± 0.0
XaaGlu: 0.0 ± 0.0
XaaPhe: 0.0 ± 0.0
XaaGly: 0.0 ± 0.0
XaaHis: 0.0 ± 0.0
XaaIle: 0.0 ± 0.0
XaaLys: 0.0 ± 0.0
XaaLeu: 0.0 ± 0.0
XaaMet: 0.0 ± 0.0
XaaAsn: 0.0 ± 0.0
XaaPro: 0.0 ± 0.0
XaaGln: 0.0 ± 0.0
XaaArg: 0.0 ± 0.0
XaaSer: 0.0 ± 0.0
XaaThr: 0.0 ± 0.0
XaaVal: 0.0 ± 0.0
XaaTrp: 0.0 ± 0.0
XaaTyr: 0.0 ± 0.0
XaaXaa: 0.0 ± 0.0
Statistics based on 4 proteins (2102 amino acids)

Note: The error was estimated by bootstrapping (×100) at the protein level.
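A protein-level bootstrap of this kind can be sketched as follows. This is an illustration under stated assumptions, not the Proteome-pI code: the function names dipeptide_counts and bootstrap_error and the toy sequences are hypothetical, whole proteins are resampled with replacement 100 times, and the reported error is assumed to be the standard deviation of the resampled frequencies.

```python
import random
import statistics
from collections import Counter


def dipeptide_counts(proteins):
    """Count overlapping residue pairs within each protein."""
    counts = Counter()
    for seq in proteins:
        for i in range(len(seq) - 1):
            counts[seq[i:i + 2]] += 1
    return counts


def bootstrap_error(proteins, pair, n_rounds=100, seed=0):
    """Spread of the per-mille frequency of `pair` over bootstrap resamples.

    Assumption: the error is the population standard deviation of the
    frequency across resamples of whole proteins (drawn with replacement).
    """
    rng = random.Random(seed)
    estimates = []
    for _ in range(n_rounds):
        # Resample at the protein level, keeping the number of proteins fixed.
        sample = rng.choices(proteins, k=len(proteins))
        counts = dipeptide_counts(sample)
        total = sum(counts.values())
        estimates.append(1000.0 * counts[pair] / total if total else 0.0)
    return statistics.pstdev(estimates)


# Hypothetical toy proteome of four short sequences (one-letter code).
toy_proteins = ["MKVLAAK", "AAKVV", "MKKLL", "VVVAA"]
print(bootstrap_error(toy_proteins, "VV"))
```

A dipeptide that never occurs has frequency 0 in every resample, which matches the 0.0 ± 0.0 entries in the table.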

The dipeptide statistics above (among other statistics for this proteome) can be downloaded from this CSV file
See this proteome in: uniprot_link
Proteome-pI is available under Creative Commons Attribution-NoDerivs license, for more details see here

Reference: Kozlowski LP. Proteome-pI 2.0: Proteome Isoelectric Point Database Update. Nucleic Acids Res. 2021, doi: 10.1093/nar/gkab944
Contact: Lukasz P. Kozlowski