Amino acid dipeptide frequency for Human gokushovirus

Apart from single amino acid frequencies, one can also calculate the so-called amino acid dipeptide frequency.

There are 400 possible dipeptides (441 if Xaa is counted as an additional 21st amino acid representing all non-standard or unknown amino acids). Thus, if dipeptides occurred randomly in proteins, with every amino acid equally likely, each dipeptide would be present at a frequency of 0.25% (2.5‰). This is not the case in nature, so for better visibility the dipeptides that are more frequent than expected are marked in red in the table, and those that are underrepresented are marked in blue.
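As an illustration of the calculation described above, here is a minimal Python sketch. It is not the Proteome-pI implementation: the function names, the one-letter alphabet, the choice to count overlapping pairs within each protein only, and the toy sequences are all assumptions made for the example. The comparison uses the uniform expectation of 1/400 (2.5‰) stated in the text.

```python
from collections import Counter
from itertools import product

STANDARD = "ACDEFGHIKLMNPQRSTVWY"  # 20 standard amino acids (one-letter codes)
ALPHABET = STANDARD + "X"          # X stands in for Xaa (non-standard/unknown)


def dipeptide_frequencies(sequences):
    """Return {dipeptide: frequency in per mille} for all 441 possible pairs."""
    counts = Counter()
    for seq in sequences:
        # Map anything outside the 20 standard residues to X (Xaa).
        cleaned = "".join(aa if aa in STANDARD else "X" for aa in seq.upper())
        # Overlapping adjacent pairs, counted within each protein only
        # (an assumption; pairs never span two proteins here).
        counts.update(cleaned[i:i + 2] for i in range(len(cleaned) - 1))
    total = sum(counts.values())
    return {
        a + b: (1000.0 * counts[a + b] / total if total else 0.0)
        for a, b in product(ALPHABET, repeat=2)
    }


def representation(freq_permille, expected_permille=1000.0 / 400):
    """Label a frequency relative to the uniform expectation (2.5 per mille)."""
    if freq_permille > expected_permille:
        return "over"
    if freq_permille < expected_permille:
        return "under"
    return "expected"


# Toy input for illustration only, not real gokushovirus proteins.
freqs = dipeptide_frequencies(["ACAC", "AAB"])
print(freqs["AC"], representation(freqs["AC"]))
```

With this labelling, a table value such as 9.302‰ would count as overrepresented and 0.775‰ as underrepresented.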

All values are given in per mille (‰) and must therefore be multiplied by 10^-3 to obtain fractions.
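The unit conversion can be sketched in a few lines of Python. The helper names are hypothetical, and the fold value is simply the ratio to the uniform expectation of 1/400 mentioned above, not a statistic reported by the database.

```python
def permille_to_fraction(value_permille):
    """Convert a per mille value to a plain fraction (multiply by 10^-3)."""
    return value_permille * 1e-3


def permille_to_percent(value_permille):
    """Convert a per mille value to percent (1% = 10 per mille)."""
    return value_permille / 10.0


def fold_vs_uniform(value_permille, n_dipeptides=400):
    """Ratio of an observed frequency to the uniform expectation 1/n_dipeptides."""
    return permille_to_fraction(value_permille) * n_dipeptides


ala_ala = 9.302  # AlaAla value from the table, in per mille
print(permille_to_fraction(ala_ala))
print(permille_to_percent(ala_ala))
print(fold_vs_uniform(ala_ala))
```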

For more information, see the sequence space article on Wikipedia.

Dipeptide: frequency (‰) ± error, grouped by first residue. Second residues appear in the order: Ala Cys Asp Glu Phe Gly His Ile Lys Leu Met Asn Pro Gln Arg Ser Thr Val Trp Tyr Xaa
Ala
AlaAla: 9.302 ± 4.123
AlaCys: 0.0 ± 0.0
AlaAsp: 6.202 ± 1.664
AlaGlu: 4.651 ± 1.463
AlaPhe: 0.775 ± 0.716
AlaGly: 5.426 ± 1.478
AlaHis: 2.326 ± 1.909
AlaIle: 3.876 ± 1.41
AlaLys: 7.752 ± 4.498
AlaLeu: 3.876 ± 1.826
AlaMet: 3.101 ± 1.211
AlaAsn: 4.651 ± 1.987
AlaPro: 3.876 ± 0.572
AlaGln: 3.876 ± 0.572
AlaArg: 4.651 ± 1.799
AlaSer: 5.426 ± 4.016
AlaThr: 7.752 ± 2.481
AlaVal: 3.876 ± 2.688
AlaTrp: 2.326 ± 1.613
AlaTyr: 3.876 ± 0.821
AlaXaa: 0.0 ± 0.0
Cys
CysAla: 0.775 ± 0.728
CysCys: 0.0 ± 0.0
CysAsp: 1.55 ± 0.962
CysGlu: 0.775 ± 0.728
CysPhe: 0.0 ± 0.0
CysGly: 0.775 ± 0.728
CysHis: 0.0 ± 0.0
CysIle: 0.0 ± 0.0
CysLys: 0.0 ± 0.0
CysLeu: 0.775 ± 0.538
CysMet: 0.0 ± 0.0
CysAsn: 0.0 ± 0.0
CysPro: 0.0 ± 0.0
CysGln: 0.0 ± 0.0
CysArg: 0.775 ± 0.728
CysSer: 0.775 ± 0.538
CysThr: 1.55 ± 0.603
CysVal: 0.0 ± 0.0
CysTrp: 0.775 ± 0.538
CysTyr: 0.775 ± 0.538
CysXaa: 0.0 ± 0.0
Asp
AspAla: 3.876 ± 1.844
AspCys: 0.0 ± 0.0
AspAsp: 2.326 ± 1.613
AspGlu: 4.651 ± 0.884
AspPhe: 1.55 ± 0.603
AspGly: 3.876 ± 1.033
AspHis: 0.0 ± 0.0
AspIle: 0.775 ± 0.977
AspLys: 3.876 ± 1.81
AspLeu: 3.101 ± 1.346
AspMet: 0.775 ± 0.716
AspAsn: 2.326 ± 0.88
AspPro: 2.326 ± 0.894
AspGln: 0.775 ± 0.728
AspArg: 3.876 ± 0.572
AspSer: 1.55 ± 0.962
AspThr: 6.202 ± 0.762
AspVal: 2.326 ± 1.54
AspTrp: 0.775 ± 0.728
AspTyr: 5.426 ± 2.883
AspXaa: 0.0 ± 0.0
Glu
GluAla: 6.977 ± 1.283
GluCys: 0.775 ± 0.538
GluAsp: 3.101 ± 1.196
GluGlu: 3.101 ± 1.206
GluPhe: 2.326 ± 0.804
GluGly: 2.326 ± 0.894
GluHis: 3.101 ± 1.206
GluIle: 4.651 ± 1.484
GluLys: 6.202 ± 2.051
GluLeu: 4.651 ± 1.44
GluMet: 2.326 ± 0.894
GluAsn: 3.101 ± 2.151
GluPro: 2.326 ± 1.225
GluGln: 2.326 ± 1.225
GluArg: 1.55 ± 0.606
GluSer: 1.55 ± 0.603
GluThr: 3.101 ± 1.058
GluVal: 3.101 ± 0.86
GluTrp: 2.326 ± 0.88
GluTyr: 6.977 ± 1.305
GluXaa: 0.0 ± 0.0
Phe
PheAla: 2.326 ± 1.077
PheCys: 0.0 ± 0.0
PheAsp: 1.55 ± 0.603
PheGlu: 2.326 ± 0.894
PhePhe: 2.326 ± 0.88
PheGly: 3.101 ± 1.328
PheHis: 0.775 ± 0.538
PheIle: 0.775 ± 0.728
PheLys: 2.326 ± 1.54
PheLeu: 3.101 ± 1.328
PheMet: 0.775 ± 0.716
PheAsn: 2.326 ± 1.214
PhePro: 0.0 ± 0.0
PheGln: 0.0 ± 0.0
PheArg: 2.326 ± 0.88
PheSer: 0.775 ± 0.728
PheThr: 2.326 ± 1.214
PheVal: 3.876 ± 1.771
PheTrp: 0.775 ± 0.538
PheTyr: 0.775 ± 0.977
PheXaa: 0.0 ± 0.0
Gly
GlyAla: 5.426 ± 1.605
GlyCys: 0.0 ± 0.0
GlyAsp: 3.101 ± 0.454
GlyGlu: 5.426 ± 1.983
GlyPhe: 3.101 ± 0.454
GlyGly: 6.202 ± 1.664
GlyHis: 1.55 ± 0.606
GlyIle: 6.202 ± 0.971
GlyLys: 3.101 ± 0.454
GlyLeu: 6.977 ± 2.148
GlyMet: 0.0 ± 0.0
GlyAsn: 5.426 ± 2.502
GlyPro: 0.0 ± 0.0
GlyGln: 4.651 ± 0.879
GlyArg: 3.876 ± 1.033
GlySer: 6.202 ± 2.611
GlyThr: 7.752 ± 2.59
GlyVal: 3.101 ± 1.685
GlyTrp: 0.0 ± 0.0
GlyTyr: 5.426 ± 1.174
GlyXaa: 0.0 ± 0.0
His
HisAla: 0.0 ± 0.0
HisCys: 0.0 ± 0.0
HisAsp: 0.775 ± 0.538
HisGlu: 0.775 ± 0.538
HisPhe: 2.326 ± 1.863
HisGly: 0.775 ± 0.728
HisHis: 0.0 ± 0.0
HisIle: 2.326 ± 0.804
HisLys: 0.775 ± 0.977
HisLeu: 0.775 ± 0.716
HisMet: 0.775 ± 0.728
HisAsn: 0.0 ± 0.0
HisPro: 0.0 ± 0.0
HisGln: 1.55 ± 1.075
HisArg: 0.775 ± 0.977
HisSer: 1.55 ± 0.603
HisThr: 0.775 ± 0.977
HisVal: 0.0 ± 0.0
HisTrp: 0.0 ± 0.0
HisTyr: 1.55 ± 0.603
HisXaa: 0.0 ± 0.0
Ile
IleAla: 5.426 ± 1.217
IleCys: 0.0 ± 0.0
IleAsp: 3.101 ± 1.923
IleGlu: 1.55 ± 0.603
IlePhe: 3.876 ± 1.41
IleGly: 9.302 ± 1.33
IleHis: 0.775 ± 0.977
IleIle: 1.55 ± 0.962
IleLys: 2.326 ± 1.54
IleLeu: 3.876 ± 2.169
IleMet: 3.876 ± 1.657
IleAsn: 3.101 ± 1.346
IlePro: 3.101 ± 1.328
IleGln: 3.876 ± 1.41
IleArg: 1.55 ± 0.962
IleSer: 0.0 ± 0.0
IleThr: 2.326 ± 1.261
IleVal: 0.0 ± 0.0
IleTrp: 0.0 ± 0.0
IleTyr: 3.101 ± 2.151
IleXaa: 0.0 ± 0.0
Lys
LysAla: 4.651 ± 1.695
LysCys: 0.0 ± 0.0
LysAsp: 6.202 ± 1.344
LysGlu: 3.876 ± 2.169
LysPhe: 1.55 ± 1.075
LysGly: 6.202 ± 3.306
LysHis: 1.55 ± 0.962
LysIle: 3.101 ± 2.815
LysLys: 7.752 ± 3.258
LysLeu: 1.55 ± 0.911
LysMet: 1.55 ± 0.949
LysAsn: 3.876 ± 0.572
LysPro: 4.651 ± 1.81
LysGln: 5.426 ± 2.537
LysArg: 3.876 ± 3.642
LysSer: 1.55 ± 0.911
LysThr: 4.651 ± 1.376
LysVal: 1.55 ± 0.962
LysTrp: 2.326 ± 0.479
LysTyr: 3.876 ± 2.169
LysXaa: 0.0 ± 0.0
Leu
LeuAla: 5.426 ± 1.352
LeuCys: 0.775 ± 0.538
LeuAsp: 2.326 ± 0.479
LeuGlu: 6.202 ± 3.053
LeuPhe: 1.55 ± 1.064
LeuGly: 6.202 ± 2.691
LeuHis: 0.0 ± 0.0
LeuIle: 1.55 ± 0.962
LeuLys: 6.977 ± 2.675
LeuLeu: 6.202 ± 2.116
LeuMet: 1.55 ± 0.606
LeuAsn: 2.326 ± 1.214
LeuPro: 4.651 ± 2.341
LeuGln: 1.55 ± 1.075
LeuArg: 3.101 ± 1.923
LeuSer: 5.426 ± 2.156
LeuThr: 3.101 ± 0.454
LeuVal: 2.326 ± 0.804
LeuTrp: 1.55 ± 0.603
LeuTyr: 1.55 ± 1.075
LeuXaa: 0.0 ± 0.0
Met
MetAla: 1.55 ± 0.911
MetCys: 0.0 ± 0.0
MetAsp: 3.876 ± 0.872
MetGlu: 0.0 ± 0.0
MetPhe: 0.775 ± 0.538
MetGly: 1.55 ± 0.606
MetHis: 0.0 ± 0.0
MetIle: 1.55 ± 0.606
MetLys: 1.55 ± 1.075
MetLeu: 1.55 ± 0.962
MetMet: 0.775 ± 0.716
MetAsn: 1.55 ± 0.911
MetPro: 3.101 ± 1.058
MetGln: 2.326 ± 0.479
MetArg: 2.326 ± 1.212
MetSer: 3.876 ± 1.353
MetThr: 0.775 ± 0.728
MetVal: 0.775 ± 0.538
MetTrp: 0.775 ± 0.728
MetTyr: 0.775 ± 0.716
MetXaa: 0.0 ± 0.0
Asn
AsnAla: 7.752 ± 2.521
AsnCys: 0.775 ± 0.538
AsnAsp: 3.101 ± 2.151
AsnGlu: 6.977 ± 1.442
AsnPhe: 0.775 ± 0.977
AsnGly: 3.101 ± 1.346
AsnHis: 0.0 ± 0.0
AsnIle: 1.55 ± 0.603
AsnLys: 3.101 ± 0.818
AsnLeu: 4.651 ± 3.358
AsnMet: 2.326 ± 1.066
AsnAsn: 3.101 ± 1.211
AsnPro: 1.55 ± 0.911
AsnGln: 3.876 ± 1.353
AsnArg: 0.775 ± 0.538
AsnSer: 5.426 ± 3.696
AsnThr: 3.876 ± 0.813
AsnVal: 5.426 ± 1.352
AsnTrp: 0.0 ± 0.0
AsnTyr: 1.55 ± 1.075
AsnXaa: 0.0 ± 0.0
Pro
ProAla: 3.101 ± 0.86
ProCys: 1.55 ± 1.457
ProAsp: 0.775 ± 0.538
ProGlu: 3.876 ± 1.033
ProPhe: 0.775 ± 0.716
ProGly: 3.101 ± 1.328
ProHis: 0.775 ± 0.728
ProIle: 5.426 ± 1.204
ProLys: 0.0 ± 0.0
ProLeu: 2.326 ± 0.894
ProMet: 1.55 ± 0.606
ProAsn: 1.55 ± 0.911
ProPro: 0.775 ± 0.728
ProGln: 0.775 ± 0.716
ProArg: 3.876 ± 1.41
ProSer: 3.101 ± 1.346
ProThr: 3.101 ± 1.346
ProVal: 3.101 ± 1.612
ProTrp: 0.0 ± 0.0
ProTyr: 1.55 ± 1.457
ProXaa: 0.0 ± 0.0
Gln
GlnAla: 3.101 ± 1.211
GlnCys: 0.775 ± 0.728
GlnAsp: 1.55 ± 0.911
GlnGlu: 3.101 ± 1.346
GlnPhe: 0.775 ± 0.538
GlnGly: 4.651 ± 0.969
GlnHis: 1.55 ± 0.962
GlnIle: 2.326 ± 1.212
GlnLys: 4.651 ± 3.529
GlnLeu: 2.326 ± 0.88
GlnMet: 1.55 ± 0.603
GlnAsn: 1.55 ± 1.457
GlnPro: 1.55 ± 1.075
GlnGln: 3.101 ± 1.555
GlnArg: 2.326 ± 0.894
GlnSer: 3.101 ± 2.157
GlnThr: 3.101 ± 1.328
GlnVal: 1.55 ± 0.606
GlnTrp: 1.55 ± 1.457
GlnTyr: 2.326 ± 1.261
GlnXaa: 0.0 ± 0.0
Arg
ArgAla: 3.101 ± 0.454
ArgCys: 0.775 ± 0.977
ArgAsp: 1.55 ± 0.603
ArgGlu: 6.202 ± 1.597
ArgPhe: 0.775 ± 0.538
ArgGly: 3.876 ± 1.778
ArgHis: 0.0 ± 0.0
ArgIle: 3.876 ± 1.033
ArgLys: 3.101 ± 1.823
ArgLeu: 3.101 ± 1.328
ArgMet: 1.55 ± 0.603
ArgAsn: 3.101 ± 1.206
ArgPro: 2.326 ± 1.225
ArgGln: 1.55 ± 0.603
ArgArg: 1.55 ± 1.457
ArgSer: 1.55 ± 1.075
ArgThr: 2.326 ± 0.804
ArgVal: 3.101 ± 1.923
ArgTrp: 0.775 ± 0.977
ArgTyr: 4.651 ± 0.969
ArgXaa: 0.0 ± 0.0
Ser
SerAla: 10.078 ± 4.83
SerCys: 0.775 ± 0.538
SerAsp: 0.0 ± 0.0
SerGlu: 4.651 ± 1.567
SerPhe: 0.775 ± 0.716
SerGly: 5.426 ± 1.942
SerHis: 0.775 ± 0.538
SerIle: 3.101 ± 0.454
SerLys: 3.101 ± 1.206
SerLeu: 6.202 ± 1.506
SerMet: 0.0 ± 0.0
SerAsn: 4.651 ± 3.305
SerPro: 2.326 ± 0.88
SerGln: 3.101 ± 0.818
SerArg: 2.326 ± 0.894
SerSer: 3.876 ± 2.598
SerThr: 3.101 ± 0.454
SerVal: 1.55 ± 1.432
SerTrp: 1.55 ± 1.432
SerTyr: 5.426 ± 1.272
SerXaa: 0.0 ± 0.0
Thr
ThrAla: 5.426 ± 2.612
ThrCys: 2.326 ± 0.88
ThrAsp: 3.876 ± 0.872
ThrGlu: 3.876 ± 1.826
ThrPhe: 0.0 ± 0.0
ThrGly: 4.651 ± 1.463
ThrHis: 0.775 ± 0.977
ThrIle: 4.651 ± 2.341
ThrLys: 3.101 ± 0.818
ThrLeu: 3.876 ± 1.844
ThrMet: 0.775 ± 0.728
ThrAsn: 6.202 ± 1.201
ThrPro: 3.876 ± 1.448
ThrGln: 1.55 ± 1.432
ThrArg: 3.101 ± 0.86
ThrSer: 6.977 ± 3.243
ThrThr: 6.202 ± 1.446
ThrVal: 3.101 ± 1.823
ThrTrp: 0.775 ± 0.538
ThrTyr: 3.876 ± 0.821
ThrXaa: 0.0 ± 0.0
Val
ValAla: 2.326 ± 0.479
ValCys: 0.0 ± 0.0
ValAsp: 2.326 ± 0.479
ValGlu: 1.55 ± 0.962
ValPhe: 0.775 ± 0.728
ValGly: 0.0 ± 0.0
ValHis: 0.0 ± 0.0
ValIle: 3.101 ± 0.818
ValLys: 4.651 ± 1.535
ValLeu: 1.55 ± 0.962
ValMet: 1.55 ± 0.911
ValAsn: 3.876 ± 1.41
ValPro: 3.876 ± 0.572
ValGln: 2.326 ± 0.804
ValArg: 3.101 ± 0.818
ValSer: 3.876 ± 0.572
ValThr: 5.426 ± 1.899
ValVal: 0.775 ± 0.538
ValTrp: 0.0 ± 0.0
ValTyr: 0.775 ± 0.538
ValXaa: 0.0 ± 0.0
Trp
TrpAla: 2.326 ± 0.88
TrpCys: 0.0 ± 0.0
TrpAsp: 0.775 ± 0.977
TrpGlu: 1.55 ± 0.603
TrpPhe: 2.326 ± 1.613
TrpGly: 3.101 ± 1.206
TrpHis: 0.775 ± 0.538
TrpIle: 0.775 ± 0.538
TrpLys: 0.775 ± 0.977
TrpLeu: 0.775 ± 0.728
TrpMet: 1.55 ± 0.911
TrpAsn: 1.55 ± 1.274
TrpPro: 0.0 ± 0.0
TrpGln: 0.0 ± 0.0
TrpArg: 0.775 ± 0.728
TrpSer: 0.775 ± 0.716
TrpThr: 0.775 ± 0.538
TrpVal: 0.0 ± 0.0
TrpTrp: 0.0 ± 0.0
TrpTyr: 0.775 ± 0.728
TrpXaa: 0.0 ± 0.0
Tyr
TyrAla: 4.651 ± 2.341
TyrCys: 0.775 ± 0.728
TyrAsp: 2.326 ± 0.88
TyrGlu: 0.775 ± 0.716
TyrPhe: 5.426 ± 2.371
TyrGly: 3.876 ± 1.114
TyrHis: 0.775 ± 0.728
TyrIle: 1.55 ± 1.075
TyrLys: 5.426 ± 2.189
TyrLeu: 3.101 ± 1.618
TyrMet: 2.326 ± 0.894
TyrAsn: 5.426 ± 2.612
TyrPro: 0.775 ± 0.538
TyrGln: 3.876 ± 0.572
TyrArg: 2.326 ± 1.214
TyrSer: 5.426 ± 0.861
TyrThr: 0.775 ± 0.716
TyrVal: 2.326 ± 1.225
TyrTrp: 3.101 ± 1.685
TyrTyr: 0.775 ± 0.538
TyrXaa: 0.0 ± 0.0
Xaa
XaaAla: 0.0 ± 0.0
XaaCys: 0.0 ± 0.0
XaaAsp: 0.0 ± 0.0
XaaGlu: 0.0 ± 0.0
XaaPhe: 0.0 ± 0.0
XaaGly: 0.0 ± 0.0
XaaHis: 0.0 ± 0.0
XaaIle: 0.0 ± 0.0
XaaLys: 0.0 ± 0.0
XaaLeu: 0.0 ± 0.0
XaaMet: 0.0 ± 0.0
XaaAsn: 0.0 ± 0.0
XaaPro: 0.0 ± 0.0
XaaGln: 0.0 ± 0.0
XaaArg: 0.0 ± 0.0
XaaSer: 0.0 ± 0.0
XaaThr: 0.0 ± 0.0
XaaVal: 0.0 ± 0.0
XaaTrp: 0.0 ± 0.0
XaaTyr: 0.0 ± 0.0
XaaXaa: 0.0 ± 0.0
Statistics based on 4 proteins (1291 amino acids)

Note: The error was estimated by bootstrapping (100 resamples) at the protein level.
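A protein-level bootstrap of this kind could look roughly like the Python sketch below. It rests on several assumptions the page does not confirm: whole proteins are resampled with replacement, the reported error is taken to be the standard deviation of the resampled estimates, and the function names and toy sequences are hypothetical.

```python
import random
import statistics


def dipeptide_permille(sequences, dipeptide):
    """Frequency (per mille) of one dipeptide among all adjacent residue pairs."""
    count = 0
    total = 0
    for seq in sequences:
        total += max(len(seq) - 1, 0)
        count += sum(1 for i in range(len(seq) - 1) if seq[i:i + 2] == dipeptide)
    return 1000.0 * count / total if total else 0.0


def bootstrap_error(sequences, dipeptide, n_resamples=100, seed=0):
    """Spread of the frequency over protein-level bootstrap resamples.

    Assumption: the error is the (population) standard deviation of the
    resampled estimates; the source does not state which statistic is used.
    """
    rng = random.Random(seed)
    estimates = []
    for _ in range(n_resamples):
        # Draw as many proteins as in the original set, with replacement.
        resample = rng.choices(sequences, k=len(sequences))
        estimates.append(dipeptide_permille(resample, dipeptide))
    return statistics.pstdev(estimates)


# Toy one-letter sequences for illustration only.
proteins = ["ACAC", "GGGG", "AAC", "ACGA"]
print(dipeptide_permille(proteins, "AC"), bootstrap_error(proteins, "AC"))
```

With only four proteins, as in this proteome, each resample often omits one or more proteins entirely, which may explain why many errors in the table are of the same order as the frequencies themselves.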

The dipeptide statistics above (among other statistics for this proteome) can be downloaded from this CSV file
See this proteome in: uniprot_link
Proteome-pI is available under the Creative Commons Attribution-NoDerivs license; for more details, see here

Reference: Kozlowski LP. Proteome-pI 2.0: Proteome Isoelectric Point Database Update. Nucleic Acids Res. 2021, doi: 10.1093/nar/gkab944
Contact: Lukasz P. Kozlowski