Amino acid dipeptide frequency for Sunguru virus

Apart from single amino acid frequencies, one can also calculate the so-called amino acid dipeptide frequency, i.e. how often each ordered pair of adjacent amino acids occurs.
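As an illustration of the idea, the following minimal Python sketch counts overlapping adjacent pairs and reports them in per mille. The function name dipeptide_per_mille, the use of one-letter codes, and the choice not to count pairs that span two proteins are assumptions made for this example; the page does not describe how the database computes its values.

```python
from collections import Counter


def dipeptide_per_mille(proteins):
    """Return {dipeptide: frequency in per mille} for a list of sequences.

    Assumption: sequences use one-letter amino acid codes and pairs are
    counted with a sliding window of length 2 inside each protein only.
    """
    counts = Counter()
    for seq in proteins:
        # Overlapping windows: "MKLLA" -> MK, KL, LL, LA
        counts.update(seq[i:i + 2] for i in range(len(seq) - 1))
    total = sum(counts.values())
    if total == 0:
        return {}
    return {pair: 1000.0 * n / total for pair, n in counts.items()}


# Toy input (hypothetical sequences, not the Sunguru virus proteome)
freqs = dipeptide_per_mille(["MKLLA", "LLK"])
```

With the toy input, six pairs are counted in total and "LL" occurs twice, so its frequency is one third of all pairs.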

There are 400 possible dipeptides (441 if Xaa is counted as an additional, 21st amino acid representing all non-standard or unknown amino acids). Thus, if dipeptides occurred randomly in proteins, each amino acid dipeptide would be present with a frequency of 0.25% (2.5‰). As this is not the case in nature, for better visibility all dipeptides that are more frequent than expected are marked in red, and those that are underrepresented are marked in blue in the table.
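The expected value under the uniform model is simply one divided by the number of possible dipeptides. The sketch below works this out in per mille and labels a value as over- or underrepresented by a plain comparison; the function names and the comparison rule are assumptions for illustration, since the page does not state the exact criterion behind the red and blue marking.

```python
def uniform_expectation_per_mille(alphabet_size=20):
    """Expected per-mille frequency if all dipeptides were equally likely."""
    return 1000.0 / alphabet_size ** 2


def classify(freq_per_mille, expected):
    """Label a dipeptide relative to the uniform expectation (assumed rule)."""
    if freq_per_mille > expected:
        return "over"
    if freq_per_mille < expected:
        return "under"
    return "as expected"


expected_20 = uniform_expectation_per_mille(20)   # 400 dipeptides
expected_21 = uniform_expectation_per_mille(21)   # 441 dipeptides, with Xaa

# Two values taken from the table: LeuLeu and CysCys
leu_leu = classify(11.066, expected_20)
cys_cys = classify(0.27, expected_20)
```

Here expected_20 is 2.5‰ (that is, 0.25%), so LeuLeu at 11.066‰ falls above the expectation and CysCys at 0.27‰ falls below it.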

All values are given in per mille (‰) and therefore need to be multiplied by 10⁻³ to obtain fractions.

For more information, see the sequence space article on Wikipedia.

Ala Cys Asp Glu Phe Gly His Ile Lys Leu Met Asn Pro Gln Arg Ser Thr Val Trp Tyr Xaa
Ala
AlaAla: 2.159 ± 0.683
AlaCys: 1.619 ± 1.085
AlaAsp: 2.969 ± 1.219
AlaGlu: 2.969 ± 0.985
AlaPhe: 1.889 ± 1.549
AlaGly: 3.509 ± 1.412
AlaHis: 1.08 ± 0.435
AlaIle: 3.239 ± 0.747
AlaLys: 2.159 ± 1.175
AlaLeu: 4.049 ± 1.411
AlaMet: 1.619 ± 1.376
AlaAsn: 1.619 ± 0.621
AlaPro: 1.35 ± 0.566
AlaGln: 1.08 ± 0.418
AlaArg: 1.889 ± 0.861
AlaSer: 2.969 ± 0.859
AlaThr: 1.619 ± 0.535
AlaVal: 2.699 ± 0.517
AlaTrp: 0.0 ± 0.0
AlaTyr: 1.08 ± 0.594
AlaXaa: 0.0 ± 0.0
Cys
CysAla: 1.35 ± 0.28
CysCys: 0.27 ± 0.331
CysAsp: 1.08 ± 0.418
CysGlu: 1.08 ± 0.655
CysPhe: 0.81 ± 0.493
CysGly: 0.81 ± 0.501
CysHis: 0.27 ± 0.522
CysIle: 0.54 ± 0.414
CysLys: 1.08 ± 0.592
CysLeu: 2.159 ± 0.733
CysMet: 0.0 ± 0.0
CysAsn: 0.81 ± 0.62
CysPro: 0.81 ± 0.992
CysGln: 0.54 ± 0.29
CysArg: 0.81 ± 0.354
CysSer: 1.08 ± 0.435
CysThr: 0.27 ± 0.161
CysVal: 1.35 ± 1.036
CysTrp: 0.27 ± 0.161
CysTyr: 0.81 ± 0.518
CysXaa: 0.0 ± 0.0
Asp
AspAla: 1.619 ± 2.11
AspCys: 0.81 ± 0.62
AspAsp: 5.398 ± 1.29
AspGlu: 2.429 ± 0.693
AspPhe: 3.239 ± 0.398
AspGly: 2.699 ± 0.909
AspHis: 0.81 ± 0.534
AspIle: 5.128 ± 1.152
AspLys: 4.049 ± 0.79
AspLeu: 7.018 ± 1.525
AspMet: 1.08 ± 0.285
AspAsn: 2.159 ± 0.551
AspPro: 4.588 ± 1.654
AspGln: 4.858 ± 0.979
AspArg: 3.239 ± 1.841
AspSer: 3.509 ± 0.735
AspThr: 1.35 ± 0.662
AspVal: 2.429 ± 0.736
AspTrp: 1.619 ± 0.721
AspTyr: 2.159 ± 0.87
AspXaa: 0.0 ± 0.0
Glu
GluAla: 1.889 ± 0.701
GluCys: 0.81 ± 0.493
GluAsp: 3.509 ± 1.457
GluGlu: 5.128 ± 2.87
GluPhe: 3.779 ± 0.505
GluGly: 3.509 ± 1.08
GluHis: 0.81 ± 0.6
GluIle: 4.049 ± 1.364
GluLys: 4.588 ± 0.794
GluLeu: 6.748 ± 1.85
GluMet: 0.81 ± 1.09
GluAsn: 1.619 ± 0.577
GluPro: 0.54 ± 0.345
GluGln: 2.429 ± 1.508
GluArg: 2.429 ± 0.683
GluSer: 4.049 ± 1.86
GluThr: 2.969 ± 0.928
GluVal: 4.049 ± 0.491
GluTrp: 1.35 ± 0.824
GluTyr: 2.699 ± 0.639
GluXaa: 0.0 ± 0.0
Phe
PheAla: 2.159 ± 0.518
PheCys: 1.08 ± 1.215
PheAsp: 3.239 ± 1.485
PheGlu: 2.699 ± 0.935
PhePhe: 2.699 ± 2.085
PheGly: 3.509 ± 0.902
PheHis: 1.619 ± 1.253
PheIle: 1.619 ± 0.86
PheLys: 5.128 ± 1.855
PheLeu: 6.208 ± 0.933
PheMet: 1.35 ± 0.28
PheAsn: 4.049 ± 1.313
PhePro: 2.429 ± 1.17
PheGln: 2.159 ± 0.53
PheArg: 2.699 ± 0.947
PheSer: 3.779 ± 0.953
PheThr: 0.81 ± 0.348
PheVal: 3.239 ± 0.766
PheTrp: 1.08 ± 0.454
PheTyr: 0.54 ± 0.469
PheXaa: 0.0 ± 0.0
Gly
GlyAla: 1.619 ± 0.476
GlyCys: 0.54 ± 0.485
GlyAsp: 2.159 ± 1.013
GlyGlu: 3.239 ± 1.281
GlyPhe: 4.049 ± 1.698
GlyGly: 4.318 ± 0.678
GlyHis: 1.08 ± 0.764
GlyIle: 5.128 ± 1.186
GlyLys: 3.239 ± 0.906
GlyLeu: 7.287 ± 1.912
GlyMet: 0.81 ± 0.458
GlyAsn: 2.969 ± 0.837
GlyPro: 2.429 ± 0.898
GlyGln: 1.619 ± 0.663
GlyArg: 2.159 ± 1.506
GlySer: 7.287 ± 1.217
GlyThr: 2.969 ± 1.122
GlyVal: 2.969 ± 0.577
GlyTrp: 1.08 ± 0.568
GlyTyr: 1.08 ± 0.47
GlyXaa: 0.0 ± 0.0
His
HisAla: 0.54 ± 0.345
HisCys: 0.0 ± 0.0
HisAsp: 1.35 ± 0.514
HisGlu: 1.35 ± 0.566
HisPhe: 1.619 ± 0.448
HisGly: 1.08 ± 1.257
HisHis: 0.54 ± 0.29
HisIle: 1.35 ± 0.662
HisLys: 0.54 ± 0.29
HisLeu: 1.889 ± 0.451
HisMet: 0.27 ± 0.161
HisAsn: 1.08 ± 0.579
HisPro: 2.429 ± 0.931
HisGln: 0.27 ± 0.161
HisArg: 1.08 ± 0.646
HisSer: 1.619 ± 0.869
HisThr: 0.0 ± 0.0
HisVal: 0.81 ± 0.484
HisTrp: 0.27 ± 0.161
HisTyr: 1.08 ± 0.579
HisXaa: 0.0 ± 0.0
Ile
IleAla: 2.429 ± 0.843
IleCys: 1.35 ± 0.602
IleAsp: 3.779 ± 0.84
IleGlu: 6.478 ± 1.429
IlePhe: 3.239 ± 1.065
IleGly: 5.128 ± 0.554
IleHis: 1.889 ± 0.49
IleIle: 5.128 ± 0.662
IleLys: 7.287 ± 1.901
IleLeu: 5.668 ± 1.787
IleMet: 1.619 ± 0.553
IleAsn: 3.239 ± 0.678
IlePro: 4.318 ± 1.211
IleGln: 3.509 ± 1.26
IleArg: 4.588 ± 1.837
IleSer: 6.478 ± 1.279
IleThr: 3.779 ± 0.954
IleVal: 2.969 ± 0.933
IleTrp: 2.159 ± 0.852
IleTyr: 2.159 ± 0.524
IleXaa: 0.0 ± 0.0
Lys
LysAla: 2.969 ± 0.512
LysCys: 0.27 ± 0.331
LysAsp: 6.478 ± 1.321
LysGlu: 3.509 ± 1.084
LysPhe: 4.049 ± 1.167
LysGly: 3.239 ± 0.828
LysHis: 1.08 ± 0.646
LysIle: 4.858 ± 1.41
LysLys: 5.128 ± 1.598
LysLeu: 6.748 ± 1.263
LysMet: 1.619 ± 0.709
LysAsn: 4.049 ± 0.953
LysPro: 3.239 ± 0.906
LysGln: 1.889 ± 0.687
LysArg: 5.398 ± 1.75
LysSer: 5.128 ± 1.195
LysThr: 2.969 ± 1.483
LysVal: 4.588 ± 0.336
LysTrp: 2.159 ± 0.729
LysTyr: 2.159 ± 0.947
LysXaa: 0.0 ± 0.0
Leu
LeuAla: 3.509 ± 0.569
LeuCys: 1.35 ± 0.531
LeuAsp: 4.588 ± 1.44
LeuGlu: 5.668 ± 1.193
LeuPhe: 4.049 ± 1.049
LeuGly: 6.208 ± 2.027
LeuHis: 1.08 ± 0.579
LeuIle: 9.177 ± 2.962
LeuLys: 8.637 ± 1.904
LeuLeu: 11.066 ± 2.636
LeuMet: 2.429 ± 0.585
LeuAsn: 6.748 ± 1.929
LeuPro: 3.779 ± 1.117
LeuGln: 2.429 ± 0.956
LeuArg: 8.637 ± 2.578
LeuSer: 11.066 ± 2.137
LeuThr: 4.858 ± 0.704
LeuVal: 3.239 ± 1.996
LeuTrp: 0.81 ± 0.545
LeuTyr: 2.969 ± 0.579
LeuXaa: 0.0 ± 0.0
Met
MetAla: 1.619 ± 0.581
MetCys: 0.27 ± 0.161
MetAsp: 1.889 ± 0.562
MetGlu: 1.35 ± 0.707
MetPhe: 1.619 ± 0.708
MetGly: 1.889 ± 0.594
MetHis: 0.0 ± 0.0
MetIle: 1.619 ± 0.71
MetLys: 1.35 ± 0.5
MetLeu: 1.35 ± 0.633
MetMet: 0.81 ± 0.789
MetAsn: 0.0 ± 0.0
MetPro: 0.27 ± 0.161
MetGln: 0.0 ± 0.0
MetArg: 1.619 ± 0.709
MetSer: 2.969 ± 1.192
MetThr: 1.08 ± 0.509
MetVal: 1.889 ± 0.752
MetTrp: 0.0 ± 0.0
MetTyr: 1.35 ± 0.616
MetXaa: 0.0 ± 0.0
Asn
AsnAla: 2.969 ± 0.806
AsnCys: 0.54 ± 0.29
AsnAsp: 2.429 ± 0.611
AsnGlu: 1.35 ± 0.641
AsnPhe: 1.619 ± 0.365
AsnGly: 1.619 ± 0.71
AsnHis: 2.429 ± 0.705
AsnIle: 4.588 ± 1.544
AsnLys: 4.318 ± 0.449
AsnLeu: 6.478 ± 1.257
AsnMet: 1.08 ± 0.564
AsnAsn: 2.969 ± 0.44
AsnPro: 3.239 ± 1.323
AsnGln: 3.509 ± 1.446
AsnArg: 2.429 ± 1.141
AsnSer: 4.858 ± 0.876
AsnThr: 2.429 ± 1.168
AsnVal: 2.699 ± 1.308
AsnTrp: 2.159 ± 0.545
AsnTyr: 0.81 ± 0.401
AsnXaa: 0.0 ± 0.0
Pro
ProAla: 2.159 ± 0.435
ProCys: 0.54 ± 0.414
ProAsp: 1.619 ± 0.488
ProGlu: 2.699 ± 2.113
ProPhe: 1.619 ± 0.663
ProGly: 1.35 ± 0.616
ProHis: 0.54 ± 0.323
ProIle: 4.049 ± 0.395
ProLys: 2.429 ± 0.781
ProLeu: 4.049 ± 0.801
ProMet: 0.54 ± 0.485
ProAsn: 2.699 ± 1.007
ProPro: 1.35 ± 0.966
ProGln: 1.35 ± 0.566
ProArg: 2.699 ± 0.819
ProSer: 4.049 ± 0.826
ProThr: 1.889 ± 0.673
ProVal: 2.699 ± 0.886
ProTrp: 0.81 ± 0.534
ProTyr: 3.239 ± 0.572
ProXaa: 0.0 ± 0.0
Gln
GlnAla: 2.699 ± 1.366
GlnCys: 0.81 ± 0.662
GlnAsp: 0.81 ± 0.348
GlnGlu: 1.889 ± 0.554
GlnPhe: 1.889 ± 0.87
GlnGly: 1.619 ± 0.663
GlnHis: 0.27 ± 0.533
GlnIle: 4.858 ± 2.125
GlnLys: 2.159 ± 0.717
GlnLeu: 3.239 ± 0.611
GlnMet: 0.54 ± 0.644
GlnAsn: 1.35 ± 0.807
GlnPro: 1.35 ± 0.566
GlnGln: 1.889 ± 0.987
GlnArg: 1.08 ± 0.67
GlnSer: 2.699 ± 1.051
GlnThr: 3.239 ± 0.895
GlnVal: 1.619 ± 0.621
GlnTrp: 0.81 ± 0.348
GlnTyr: 0.81 ± 0.787
GlnXaa: 0.0 ± 0.0
Arg
ArgAla: 1.35 ± 0.5
ArgCys: 0.81 ± 0.332
ArgAsp: 2.429 ± 1.09
ArgGlu: 3.509 ± 1.127
ArgPhe: 3.509 ± 1.521
ArgGly: 3.779 ± 1.403
ArgHis: 0.54 ± 0.323
ArgIle: 1.889 ± 0.93
ArgLys: 3.509 ± 0.698
ArgLeu: 5.398 ± 0.668
ArgMet: 1.08 ± 0.275
ArgAsn: 3.779 ± 1.259
ArgPro: 1.619 ± 0.507
ArgGln: 1.35 ± 0.566
ArgArg: 2.699 ± 0.921
ArgSer: 4.318 ± 0.753
ArgThr: 3.509 ± 1.106
ArgVal: 3.509 ± 0.797
ArgTrp: 1.889 ± 0.736
ArgTyr: 1.08 ± 0.548
ArgXaa: 0.0 ± 0.0
Ser
SerAla: 2.699 ± 0.964
SerCys: 1.35 ± 0.807
SerAsp: 7.287 ± 2.19
SerGlu: 4.858 ± 1.082
SerPhe: 4.858 ± 1.696
SerGly: 3.779 ± 1.208
SerHis: 2.159 ± 0.87
SerIle: 6.748 ± 1.592
SerLys: 5.398 ± 1.201
SerLeu: 8.637 ± 0.971
SerMet: 1.619 ± 0.701
SerAsn: 4.588 ± 0.841
SerPro: 2.969 ± 0.803
SerGln: 2.429 ± 0.683
SerArg: 2.969 ± 1.193
SerSer: 6.208 ± 1.402
SerThr: 4.858 ± 1.283
SerVal: 5.668 ± 1.66
SerTrp: 1.889 ± 1.038
SerTyr: 2.159 ± 0.641
SerXaa: 0.0 ± 0.0
Thr
ThrAla: 2.699 ± 1.103
ThrCys: 1.35 ± 1.057
ThrAsp: 2.429 ± 1.07
ThrGlu: 2.159 ± 0.632
ThrPhe: 0.81 ± 0.484
ThrGly: 3.779 ± 1.103
ThrHis: 1.35 ± 0.566
ThrIle: 4.588 ± 1.407
ThrLys: 2.699 ± 1.061
ThrLeu: 4.858 ± 1.243
ThrMet: 1.619 ± 0.969
ThrAsn: 1.889 ± 0.771
ThrPro: 1.619 ± 0.594
ThrGln: 0.81 ± 0.484
ThrArg: 1.08 ± 0.67
ThrSer: 4.049 ± 0.983
ThrThr: 4.588 ± 1.068
ThrVal: 2.969 ± 0.992
ThrTrp: 0.54 ± 0.29
ThrTyr: 1.35 ± 0.59
ThrXaa: 0.0 ± 0.0
Val
ValAla: 2.429 ± 2.002
ValCys: 1.619 ± 0.542
ValAsp: 3.509 ± 0.967
ValGlu: 4.049 ± 1.359
ValPhe: 2.699 ± 0.886
ValGly: 2.699 ± 1.029
ValHis: 0.81 ± 0.484
ValIle: 5.938 ± 0.938
ValLys: 3.779 ± 1.708
ValLeu: 3.239 ± 1.029
ValMet: 1.619 ± 0.542
ValAsn: 4.588 ± 0.332
ValPro: 2.429 ± 0.381
ValGln: 1.619 ± 0.774
ValArg: 2.159 ± 0.928
ValSer: 4.588 ± 0.518
ValThr: 1.889 ± 1.044
ValVal: 2.969 ± 0.646
ValTrp: 0.81 ± 0.484
ValTyr: 2.159 ± 0.509
ValXaa: 0.0 ± 0.0
Trp
TrpAla: 0.81 ± 0.332
TrpCys: 0.54 ± 0.29
TrpAsp: 1.889 ± 1.021
TrpGlu: 0.54 ± 0.487
TrpPhe: 1.619 ± 0.529
TrpGly: 1.08 ± 0.646
TrpHis: 0.0 ± 0.0
TrpIle: 0.54 ± 0.485
TrpLys: 2.699 ± 0.516
TrpLeu: 1.889 ± 0.636
TrpMet: 1.619 ± 1.059
TrpAsn: 0.54 ± 0.414
TrpPro: 0.27 ± 0.161
TrpGln: 0.81 ± 0.493
TrpArg: 0.54 ± 0.29
TrpSer: 0.54 ± 0.323
TrpThr: 1.08 ± 0.501
TrpVal: 2.429 ± 1.37
TrpTrp: 0.0 ± 0.0
TrpTyr: 0.27 ± 0.53
TrpXaa: 0.0 ± 0.0
Tyr
TyrAla: 1.889 ± 0.44
TyrCys: 0.27 ± 0.331
TyrAsp: 1.35 ± 0.8
TyrGlu: 1.08 ± 0.275
TyrPhe: 2.429 ± 0.705
TyrGly: 2.429 ± 0.725
TyrHis: 0.81 ± 0.6
TyrIle: 2.159 ± 0.509
TyrLys: 1.35 ± 0.668
TyrLeu: 4.318 ± 1.662
TyrMet: 0.54 ± 0.664
TyrAsn: 3.779 ± 1.396
TyrPro: 1.35 ± 0.28
TyrGln: 1.08 ± 0.623
TyrArg: 1.35 ± 0.824
TyrSer: 1.889 ± 0.49
TyrThr: 1.08 ± 0.545
TyrVal: 0.81 ± 0.332
TyrTrp: 0.0 ± 0.0
TyrTyr: 1.35 ± 1.155
TyrXaa: 0.0 ± 0.0
Xaa
XaaAla: 0.0 ± 0.0
XaaCys: 0.0 ± 0.0
XaaAsp: 0.0 ± 0.0
XaaGlu: 0.0 ± 0.0
XaaPhe: 0.0 ± 0.0
XaaGly: 0.0 ± 0.0
XaaHis: 0.0 ± 0.0
XaaIle: 0.0 ± 0.0
XaaLys: 0.0 ± 0.0
XaaLeu: 0.0 ± 0.0
XaaMet: 0.0 ± 0.0
XaaAsn: 0.0 ± 0.0
XaaPro: 0.0 ± 0.0
XaaGln: 0.0 ± 0.0
XaaArg: 0.0 ± 0.0
XaaSer: 0.0 ± 0.0
XaaThr: 0.0 ± 0.0
XaaVal: 0.0 ± 0.0
XaaTrp: 0.0 ± 0.0
XaaTyr: 0.0 ± 0.0
XaaXaa: 0.0 ± 0.0
Statistics based on 7 proteins (3706 amino acids)

Note: The error has been estimated by bootstrapping (100 replicates) at the protein level.
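A protein-level bootstrap of this kind could look roughly like the Python sketch below: whole proteins are resampled with replacement, the dipeptide frequency is recomputed for each replicate, and the spread of the replicates is reported. The function names, the fixed seed, and the use of the sample standard deviation as the error are assumptions for illustration; the page only states that bootstrapping was done 100 times at the protein level.

```python
import random
import statistics
from collections import Counter


def pair_per_mille(proteins, pair):
    """Per-mille frequency of one dipeptide among all adjacent pairs."""
    counts = Counter()
    for seq in proteins:
        counts.update(seq[i:i + 2] for i in range(len(seq) - 1))
    total = sum(counts.values())
    return 1000.0 * counts[pair] / total if total else 0.0


def bootstrap_error(proteins, pair, rounds=100, seed=0):
    """Spread of the frequency over resampled protein sets (assumed estimator)."""
    rng = random.Random(seed)  # fixed seed only to make the sketch repeatable
    estimates = []
    for _ in range(rounds):
        # Resample whole proteins with replacement, keeping the set size
        sample = rng.choices(proteins, k=len(proteins))
        estimates.append(pair_per_mille(sample, pair))
    return statistics.stdev(estimates)


# Toy input (hypothetical sequences, not the Sunguru virus proteome)
toy = ["MKLLA", "LLK", "AAAA"]
err_ll = bootstrap_error(toy, "LL")
err_ww = bootstrap_error(toy, "WW")  # pair never occurs
```

A dipeptide that never occurs in any protein yields zero in every replicate, which is consistent with the 0.0 ± 0.0 entries in the table.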

The dipeptide statistics above (among other statistics for this proteome) can be downloaded from this CSV file
See this proteome in: uniprot_link
Proteome-pI is available under the Creative Commons Attribution-NoDerivs license; for more details see here

Reference: Kozlowski LP. Proteome-pI 2.0: Proteome Isoelectric Point Database Update. Nucleic Acids Res. 2021, doi: 10.1093/nar/gkab944
Contact: Lukasz P. Kozlowski