Amino acid dipeptide frequency for Gokushovirus WZ-2015a

Apart from single amino acid frequencies, one can also calculate the so-called amino acid dipeptide frequency.

There are 400 possibilities (441 if we consider Xaa as an additional 21st amino acid representing all non-standard or unknown amino acids). Thus, if dipeptides occurred randomly in proteins, each of the 400 amino acid dipeptides would be present with a frequency of 0.25% (1/400, i.e. 2.5‰). As this is not the case in nature, for better visibility all overrepresented dipeptides are marked in red, and underrepresented ones in blue, in the table.

All values are given in per mille (‰), so they need to be multiplied by 10^-3 to obtain fractions.
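The definition above can be illustrated with a minimal Python sketch. It is not the Proteome-pI implementation: the function name `dipeptide_permille`, the use of one-letter codes, and the choice to count overlapping pairs within each protein only (never across protein boundaries) are assumptions made for illustration.

```python
from collections import Counter
from itertools import product

STANDARD = "ACDEFGHIKLMNPQRSTVWY"   # 20 standard residues (one-letter codes)
ALPHABET = STANDARD + "X"           # X plays the role of Xaa

# Uniform expectation over the 400 standard dipeptides, in per mille.
EXPECTED_UNIFORM = 1000 / 400       # 2.5 per mille, i.e. 0.25%


def dipeptide_permille(proteins):
    """Return {dipeptide: frequency in per mille} for a list of sequences."""
    counts = Counter()
    for seq in proteins:
        # Map every non-standard or unknown residue to X (assumption).
        seq = "".join(c if c in STANDARD else "X" for c in seq.upper())
        # Count overlapping pairs within one protein only (assumption).
        counts.update(seq[i:i + 2] for i in range(len(seq) - 1))
    total = sum(counts.values())
    if total == 0:
        raise ValueError("need at least one sequence with two or more residues")
    # Report all 441 pairs, including those never observed (frequency 0).
    return {a + b: 1000 * counts[a + b] / total
            for a, b in product(ALPHABET, repeat=2)}


# Toy input for illustration, not the Gokushovirus proteins.
freqs = dipeptide_permille(["ACAC", "AAX"])
overrepresented = sorted(d for d, f in freqs.items() if f > EXPECTED_UNIFORM)
print(overrepresented)
```

Comparing each value with `EXPECTED_UNIFORM` mirrors the red/blue marking described above, under the simplifying assumption that all residues are equally likely.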

For more information, see the sequence space article on Wikipedia.

Dipeptide frequencies in ‰ (value ± error). Rows give the first residue, columns the second residue.

1st \ 2nd | Ala | Cys | Asp | Glu | Phe | Gly | His | Ile | Lys | Leu | Met | Asn | Pro | Gln | Arg | Ser | Thr | Val | Trp | Tyr | Xaa
Ala | 4.378 ± 2.419 | 1.751 ± 0.578 | 3.503 ± 1.596 | 3.503 ± 2.157 | 0.876 ± 0.579 | 3.503 ± 3.797 | 0.0 ± 0.0 | 3.503 ± 2.044 | 0.0 ± 0.0 | 3.503 ± 1.156 | 0.876 ± 0.949 | 6.13 ± 1.841 | 2.627 ± 1.736 | 2.627 ± 1.655 | 3.503 ± 0.961 | 7.005 ± 2.737 | 2.627 ± 1.736 | 4.378 ± 1.083 | 0.876 ± 0.579 | 3.503 ± 1.596 | 0.0 ± 0.0
Cys | 0.876 ± 0.744 | 0.0 ± 0.0 | 1.751 ± 1.488 | 0.876 ± 0.579 | 0.876 ± 0.744 | 1.751 ± 1.488 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.876 ± 0.744 | 1.751 ± 0.578 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 1.751 ± 1.488 | 0.876 ± 0.579 | 0.876 ± 0.744 | 0.876 ± 0.744 | 0.0 ± 0.0
Asp | 0.0 ± 0.0 | 0.876 ± 0.744 | 1.751 ± 1.022 | 3.503 ± 0.961 | 1.751 ± 0.578 | 1.751 ± 1.157 | 1.751 ± 0.578 | 7.005 ± 1.469 | 7.005 ± 1.839 | 7.005 ± 0.746 | 1.751 ± 0.798 | 1.751 ± 0.578 | 0.0 ± 0.0 | 0.876 ± 0.949 | 3.503 ± 1.379 | 3.503 ± 1.386 | 2.627 ± 1.021 | 5.254 ± 0.43 | 0.0 ± 0.0 | 5.254 ± 1.771 | 0.0 ± 0.0
Glu | 1.751 ± 1.898 | 1.751 ± 1.488 | 5.254 ± 2.375 | 7.005 ± 1.638 | 5.254 ± 1.771 | 1.751 ± 0.578 | 2.627 ± 0.885 | 5.254 ± 0.951 | 5.254 ± 0.951 | 4.378 ± 0.468 | 1.751 ± 1.022 | 0.876 ± 0.744 | 0.876 ± 0.579 | 1.751 ± 1.157 | 5.254 ± 0.43 | 7.005 ± 0.555 | 6.13 ± 0.938 | 2.627 ± 1.2 | 0.876 ± 0.579 | 3.503 ± 1.912 | 0.0 ± 0.0
Phe | 4.378 ± 1.921 | 0.876 ± 0.744 | 3.503 ± 0.277 | 0.876 ± 0.579 | 0.876 ± 0.579 | 2.627 ± 1.736 | 0.876 ± 0.744 | 1.751 ± 0.578 | 2.627 ± 1.515 | 3.503 ± 0.961 | 1.751 ± 0.578 | 0.876 ± 0.949 | 3.503 ± 0.961 | 1.751 ± 0.578 | 2.627 ± 0.885 | 2.627 ± 1.021 | 0.876 ± 0.579 | 0.876 ± 0.579 | 1.751 ± 1.157 | 0.0 ± 0.0 | 0.0 ± 0.0
Gly | 2.627 ± 1.827 | 0.0 ± 0.0 | 1.751 ± 1.157 | 6.13 ± 2.285 | 4.378 ± 1.083 | 3.503 ± 1.596 | 0.876 ± 0.579 | 3.503 ± 2.314 | 4.378 ± 1.731 | 6.13 ± 2.505 | 3.503 ± 1.386 | 1.751 ± 1.157 | 0.0 ± 0.0 | 1.751 ± 1.022 | 2.627 ± 0.475 | 4.378 ± 0.773 | 2.627 ± 1.736 | 3.503 ± 1.596 | 1.751 ± 1.022 | 5.254 ± 1.212 | 0.0 ± 0.0
His | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 2.627 ± 0.885 | 0.876 ± 0.579 | 0.0 ± 0.0 | 0.876 ± 0.744 | 0.876 ± 0.744 | 1.751 ± 1.157 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.876 ± 0.744 | 0.876 ± 0.744 | 0.876 ± 0.579 | 0.876 ± 0.949 | 0.0 ± 0.0
Ile | 2.627 ± 1.021 | 0.876 ± 0.744 | 4.378 ± 1.739 | 3.503 ± 1.156 | 0.876 ± 0.579 | 0.876 ± 0.579 | 0.0 ± 0.0 | 0.876 ± 0.744 | 7.005 ± 1.716 | 1.751 ± 1.157 | 2.627 ± 1.034 | 5.254 ± 1.898 | 6.13 ± 1.92 | 4.378 ± 1.739 | 1.751 ± 1.022 | 6.13 ± 0.521 | 0.0 ± 0.0 | 1.751 ± 0.578 | 3.503 ± 0.277 | 0.876 ± 0.579 | 0.0 ± 0.0
Lys | 7.005 ± 2.7 | 1.751 ± 1.488 | 3.503 ± 0.961 | 7.005 ± 3.116 | 2.627 ± 0.885 | 3.503 ± 0.961 | 0.876 ± 0.949 | 5.254 ± 1.212 | 7.005 ± 3.845 | 7.005 ± 1.716 | 1.751 ± 1.157 | 6.13 ± 3.471 | 2.627 ± 1.2 | 2.627 ± 1.515 | 3.503 ± 0.961 | 3.503 ± 1.386 | 3.503 ± 2.578 | 3.503 ± 1.379 | 1.751 ± 0.798 | 6.13 ± 4.116 | 0.0 ± 0.0
Leu | 5.254 ± 2.042 | 0.876 ± 0.579 | 4.378 ± 1.379 | 6.13 ± 1.956 | 0.876 ± 0.949 | 8.757 ± 0.937 | 0.0 ± 0.0 | 2.627 ± 1.655 | 7.881 ± 2.315 | 7.005 ± 0.555 | 3.503 ± 1.227 | 7.005 ± 1.839 | 3.503 ± 2.314 | 1.751 ± 1.157 | 2.627 ± 0.885 | 5.254 ± 2.042 | 1.751 ± 0.578 | 3.503 ± 1.379 | 0.0 ± 0.0 | 3.503 ± 1.912 | 0.0 ± 0.0
Met | 1.751 ± 0.798 | 0.876 ± 0.744 | 0.876 ± 0.949 | 1.751 ± 1.157 | 0.876 ± 0.579 | 2.627 ± 1.655 | 0.0 ± 0.0 | 0.876 ± 0.579 | 4.378 ± 0.468 | 3.503 ± 0.961 | 0.0 ± 0.0 | 0.876 ± 0.579 | 1.751 ± 1.157 | 0.876 ± 0.949 | 1.751 ± 0.798 | 1.751 ± 0.798 | 0.876 ± 0.579 | 1.751 ± 0.578 | 0.0 ± 0.0 | 1.751 ± 1.488 | 0.0 ± 0.0
Asn | 5.254 ± 3.311 | 0.876 ± 0.744 | 3.503 ± 1.156 | 3.503 ± 1.386 | 2.627 ± 1.655 | 3.503 ± 1.156 | 0.0 ± 0.0 | 3.503 ± 1.455 | 5.254 ± 2.017 | 7.005 ± 0.981 | 1.751 ± 0.578 | 1.751 ± 0.798 | 1.751 ± 0.798 | 0.876 ± 0.744 | 3.503 ± 1.386 | 4.378 ± 0.773 | 3.503 ± 2.726 | 3.503 ± 1.596 | 1.751 ± 1.488 | 4.378 ± 1.379 | 0.0 ± 0.0
Pro | 0.0 ± 0.0 | 0.0 ± 0.0 | 1.751 ± 0.578 | 2.627 ± 0.885 | 0.876 ± 0.579 | 0.0 ± 0.0 | 0.876 ± 0.744 | 3.503 ± 2.314 | 2.627 ± 2.232 | 2.627 ± 0.475 | 2.627 ± 0.885 | 2.627 ± 0.885 | 2.627 ± 1.736 | 5.254 ± 2.505 | 1.751 ± 0.578 | 2.627 ± 1.021 | 1.751 ± 0.798 | 3.503 ± 2.314 | 0.0 ± 0.0 | 1.751 ± 0.578 | 0.0 ± 0.0
Gln | 5.254 ± 3.311 | 0.0 ± 0.0 | 1.751 ± 0.798 | 2.627 ± 1.655 | 0.876 ± 0.744 | 3.503 ± 1.379 | 0.0 ± 0.0 | 3.503 ± 0.277 | 2.627 ± 1.2 | 4.378 ± 1.485 | 1.751 ± 1.399 | 1.751 ± 1.488 | 2.627 ± 1.736 | 2.627 ± 1.021 | 3.503 ± 2.044 | 2.627 ± 1.736 | 0.876 ± 0.949 | 1.751 ± 1.157 | 0.0 ± 0.0 | 0.876 ± 0.744 | 0.0 ± 0.0
Arg | 4.378 ± 1.731 | 0.876 ± 0.744 | 5.254 ± 2.3 | 7.881 ± 1.308 | 1.751 ± 0.578 | 1.751 ± 0.798 | 0.0 ± 0.0 | 0.876 ± 0.579 | 5.254 ± 1.212 | 1.751 ± 1.022 | 0.0 ± 0.0 | 0.876 ± 0.949 | 0.876 ± 0.744 | 1.751 ± 1.022 | 2.627 ± 0.475 | 5.254 ± 1.337 | 0.876 ± 0.579 | 3.503 ± 0.277 | 0.876 ± 0.744 | 2.627 ± 1.021 | 0.0 ± 0.0
Ser | 7.881 ± 3.062 | 0.876 ± 0.579 | 3.503 ± 0.277 | 6.13 ± 3.059 | 1.751 ± 0.798 | 9.632 ± 4.075 | 0.876 ± 0.579 | 3.503 ± 2.314 | 2.627 ± 1.515 | 4.378 ± 2.893 | 1.751 ± 1.022 | 9.632 ± 2.771 | 3.503 ± 0.961 | 2.627 ± 2.232 | 1.751 ± 0.798 | 4.378 ± 1.083 | 5.254 ± 2.017 | 4.378 ± 1.083 | 0.876 ± 0.744 | 0.876 ± 0.744 | 0.0 ± 0.0
Thr | 1.751 ± 0.798 | 0.0 ± 0.0 | 4.378 ± 1.965 | 1.751 ± 1.022 | 0.876 ± 0.579 | 2.627 ± 0.885 | 0.876 ± 0.744 | 2.627 ± 1.021 | 4.378 ± 2.475 | 3.503 ± 1.379 | 0.0 ± 0.0 | 6.13 ± 1.841 | 3.503 ± 1.455 | 1.751 ± 1.898 | 0.876 ± 0.579 | 5.254 ± 0.43 | 4.378 ± 1.379 | 5.254 ± 1.734 | 0.0 ± 0.0 | 1.751 ± 0.578 | 0.0 ± 0.0
Val | 3.503 ± 1.386 | 0.0 ± 0.0 | 3.503 ± 2.314 | 2.627 ± 1.2 | 2.627 ± 1.2 | 3.503 ± 1.455 | 0.0 ± 0.0 | 1.751 ± 0.578 | 6.13 ± 1.956 | 4.378 ± 0.773 | 0.876 ± 0.579 | 3.503 ± 0.277 | 1.751 ± 1.157 | 2.627 ± 1.021 | 1.751 ± 0.578 | 7.005 ± 1.839 | 6.13 ± 2.245 | 4.378 ± 0.468 | 0.876 ± 0.579 | 3.503 ± 1.912 | 0.0 ± 0.0
Trp | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.876 ± 0.579 | 1.751 ± 0.798 | 1.751 ± 0.578 | 0.876 ± 0.579 | 0.876 ± 0.579 | 0.876 ± 0.744 | 0.876 ± 0.744 | 0.0 ± 0.0 | 0.876 ± 0.579 | 0.876 ± 0.579 | 0.0 ± 0.0 | 1.751 ± 1.022 | 1.751 ± 0.578 | 1.751 ± 0.798 | 0.876 ± 0.744 | 1.751 ± 0.578 | 0.876 ± 0.949 | 0.0 ± 0.0 | 0.0 ± 0.0
Tyr | 0.0 ± 0.0 | 0.876 ± 0.744 | 1.751 ± 1.022 | 1.751 ± 1.488 | 3.503 ± 1.156 | 3.503 ± 2.977 | 0.0 ± 0.0 | 3.503 ± 1.156 | 3.503 ± 1.912 | 0.876 ± 0.579 | 0.876 ± 0.949 | 4.378 ± 0.468 | 1.751 ± 0.578 | 6.13 ± 0.906 | 4.378 ± 0.468 | 1.751 ± 1.488 | 4.378 ± 1.379 | 3.503 ± 1.912 | 0.876 ± 0.579 | 2.627 ± 1.2 | 0.0 ± 0.0
Xaa | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0

Statistics based on 3 proteins (1143 amino acids)

Note: The error was estimated by bootstrapping (×100) at the protein level.
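One plausible reading of the note above is sketched below in Python: whole proteins are resampled with replacement 100 times, the dipeptide frequencies are recomputed for each resample, and the spread across resamples is reported as the error. The exact Proteome-pI procedure is not described here, so the helper names (`pair_permille`, `bootstrap_error`), the use of the population standard deviation, and the fixed seed are all assumptions.

```python
import random
import statistics
from collections import Counter


def pair_permille(proteins):
    """Per-mille frequency of each observed overlapping residue pair."""
    counts = Counter()
    for seq in proteins:
        counts.update(seq[i:i + 2] for i in range(len(seq) - 1))
    total = sum(counts.values())
    if total == 0:
        return {}
    return {pair: 1000 * n / total for pair, n in counts.items()}


def bootstrap_error(proteins, n_rounds=100, seed=0):
    """Estimate the per-dipeptide error by resampling whole proteins."""
    rng = random.Random(seed)  # fixed seed only to make the sketch repeatable
    rounds = []
    for _ in range(n_rounds):
        # Draw as many proteins as in the input, with replacement.
        resampled = rng.choices(proteins, k=len(proteins))
        rounds.append(pair_permille(resampled))
    pairs = set().union(*rounds)
    # A pair missing from a resample has frequency 0 in that round.
    return {pair: statistics.pstdev([r.get(pair, 0.0) for r in rounds])
            for pair in pairs}


# Toy input for illustration, not the Gokushovirus proteins.
errors = bootstrap_error(["ACAC", "AAX", "GGG"])
for pair in sorted(errors):
    print(pair, round(errors[pair], 3))
```

With only 3 proteins, as in this proteome, each resample can differ strongly from the original set, which is consistent with the large errors relative to the values in the table.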

The dipeptide statistics above (among other statistics for this proteome) can be downloaded from this CSV file.
See this proteome in: uniprot_link
Proteome-pI is available under the Creative Commons Attribution-NoDerivs license; for more details see here.

Reference: Kozlowski LP. Proteome-pI 2.0: Proteome Isoelectric Point Database Update. Nucleic Acids Res. 2021, doi: 10.1093/nar/gkab944
Contact: Lukasz P. Kozlowski