Amino acid dipeptide frequency for Marine gokushovirus

Apart from single amino acid frequencies, one can also calculate the so-called amino acid dipeptide frequency.

There are 400 possible dipeptides (441 if Xaa is counted as an additional, 21st amino acid representing all non-standard or unknown amino acids). Thus, if dipeptides occurred randomly in proteins, each amino acid dipeptide would be present at a frequency of 0.25% (2.5‰). As this is not the case in nature, for better visibility dipeptides that are more frequent than expected are marked in red in the table, and those that are underrepresented are marked in blue.
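As an illustration of the calculation described above, the following minimal sketch counts overlapping dipeptides in a set of protein sequences and reports their frequencies in per mille. It is not the Proteome-pI implementation: the function name `dipeptide_per_mille`, the one-letter toy sequences, and the choice not to count pairs that span two proteins are all assumptions made for this example.

```python
from collections import Counter


def dipeptide_per_mille(proteins):
    """Return {dipeptide: frequency in per mille} over all proteins.

    Dipeptides are counted as overlapping windows of length 2 within each
    protein. Pairs spanning two different proteins are not counted
    (an assumption of this sketch).
    """
    counts = Counter()
    for seq in proteins:
        for i in range(len(seq) - 1):
            counts[seq[i:i + 2]] += 1
    total = sum(counts.values())
    if total == 0:
        return {}
    return {pair: 1000.0 * n / total for pair, n in counts.items()}


# Hypothetical toy sequences (one-letter codes), not the real proteome.
toy = ["MAAGK", "AAK"]
freqs = dipeptide_per_mille(toy)

# Uniform expectation for 400 equally likely dipeptides: 2.5 per mille.
uniform_expectation = 1000.0 / 400
```

With these toy sequences there are six dipeptides in total, two of which are `AA`, so `freqs["AA"]` is one third of 1000‰, far above the uniform expectation of 2.5‰.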

All values are given in per mille (‰) and must therefore be multiplied by 10⁻³ to obtain fractions.
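The unit conversion can be sketched as follows, using a few values taken from the table. The helper names `per_mille_to_fraction` and `classify` are hypothetical, and the simple comparison against the uniform expectation of 2.5‰ is only an assumption about how over- and underrepresentation might be decided; the page does not state the exact rule behind its colouring.

```python
# Uniform expectation for 400 equally likely dipeptides, in per mille.
EXPECTED_PER_MILLE = 1000.0 / 400  # 2.5


def per_mille_to_fraction(value_per_mille):
    """Convert a per mille value to a plain fraction (multiply by 10^-3)."""
    return value_per_mille * 1e-3


def classify(value_per_mille, expected=EXPECTED_PER_MILLE):
    """Label a value relative to the uniform expectation (assumed rule)."""
    if value_per_mille > expected:
        return "over"
    if value_per_mille < expected:
        return "under"
    return "as expected"


# A few values copied from the table (per mille).
table_values = {"AlaAla": 11.276, "AlaCys": 0.705, "CysCys": 0.0}

fractions = {k: per_mille_to_fraction(v) for k, v in table_values.items()}
labels = {k: classify(v) for k, v in table_values.items()}
```

Here `AlaAla` at 11.276‰ corresponds to a fraction of about 0.0113 and falls above the uniform expectation, while `AlaCys` and `CysCys` fall below it.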

For more information, see the sequence space article on Wikipedia.

Ala Cys Asp Glu Phe Gly His Ile Lys Leu Met Asn Pro Gln Arg Ser Thr Val Trp Tyr Xaa
Ala
AlaAla: 11.276 ± 3.737
AlaCys: 0.705 ± 0.49
AlaAsp: 6.342 ± 1.132
AlaGlu: 6.342 ± 2.561
AlaPhe: 5.638 ± 1.587
AlaGly: 5.638 ± 1.722
AlaHis: 2.819 ± 0.801
AlaIle: 4.228 ± 1.526
AlaLys: 6.342 ± 3.383
AlaLeu: 4.933 ± 2.274
AlaMet: 2.114 ± 0.848
AlaAsn: 3.524 ± 2.09
AlaPro: 4.228 ± 1.483
AlaGln: 4.228 ± 1.507
AlaArg: 4.228 ± 0.773
AlaSer: 4.228 ± 1.547
AlaThr: 7.047 ± 1.869
AlaVal: 4.228 ± 1.601
AlaTrp: 0.705 ± 0.953
AlaTyr: 2.114 ± 1.327
AlaXaa: 0.0 ± 0.0
Cys
CysAla: 0.705 ± 0.574
CysCys: 0.0 ± 0.0
CysAsp: 1.409 ± 1.148
CysGlu: 0.0 ± 0.0
CysPhe: 0.705 ± 0.574
CysGly: 2.819 ± 1.519
CysHis: 0.705 ± 0.574
CysIle: 1.409 ± 0.949
CysLys: 0.0 ± 0.0
CysLeu: 0.705 ± 0.49
CysMet: 0.0 ± 0.0
CysAsn: 0.705 ± 0.953
CysPro: 0.705 ± 0.909
CysGln: 0.705 ± 0.49
CysArg: 0.705 ± 0.574
CysSer: 0.705 ± 0.49
CysThr: 1.409 ± 0.954
CysVal: 1.409 ± 0.61
CysTrp: 0.0 ± 0.0
CysTyr: 0.705 ± 0.574
CysXaa: 0.0 ± 0.0
Asp
AspAla: 6.342 ± 2.113
AspCys: 0.705 ± 0.574
AspAsp: 1.409 ± 0.801
AspGlu: 2.819 ± 1.22
AspPhe: 3.524 ± 1.185
AspGly: 2.114 ± 0.746
AspHis: 1.409 ± 1.056
AspIle: 0.705 ± 0.909
AspLys: 5.638 ± 2.164
AspLeu: 5.638 ± 1.091
AspMet: 0.705 ± 0.84
AspAsn: 4.228 ± 1.83
AspPro: 1.409 ± 0.98
AspGln: 1.409 ± 0.61
AspArg: 2.819 ± 0.947
AspSer: 3.524 ± 1.794
AspThr: 0.705 ± 0.574
AspVal: 4.228 ± 1.275
AspTrp: 0.705 ± 0.84
AspTyr: 4.228 ± 1.375
AspXaa: 0.0 ± 0.0
Glu
GluAla: 5.638 ± 3.362
GluCys: 0.705 ± 0.953
GluAsp: 0.705 ± 0.574
GluGlu: 4.933 ± 2.062
GluPhe: 2.819 ± 0.853
GluGly: 0.0 ± 0.0
GluHis: 1.409 ± 0.61
GluIle: 4.933 ± 1.516
GluLys: 1.409 ± 0.801
GluLeu: 4.228 ± 2.235
GluMet: 3.524 ± 1.589
GluAsn: 2.819 ± 1.265
GluPro: 2.114 ± 1.146
GluGln: 0.705 ± 0.49
GluArg: 4.228 ± 2.005
GluSer: 2.114 ± 0.949
GluThr: 2.819 ± 1.041
GluVal: 4.228 ± 1.423
GluTrp: 1.409 ± 0.61
GluTyr: 3.524 ± 1.117
GluXaa: 0.0 ± 0.0
Phe
PheAla: 5.638 ± 2.194
PheCys: 0.0 ± 0.0
PheAsp: 5.638 ± 1.984
PheGlu: 0.0 ± 0.0
PhePhe: 2.819 ± 1.378
PheGly: 4.228 ± 1.176
PheHis: 1.409 ± 1.148
PheIle: 2.114 ± 1.418
PheLys: 2.114 ± 1.184
PheLeu: 2.114 ± 1.173
PheMet: 1.409 ± 0.944
PheAsn: 4.933 ± 1.42
PhePro: 0.0 ± 0.0
PheGln: 2.819 ± 0.596
PheArg: 0.705 ± 0.49
PheSer: 4.933 ± 1.728
PheThr: 2.114 ± 1.654
PheVal: 3.524 ± 1.373
PheTrp: 0.0 ± 0.0
PheTyr: 0.705 ± 0.49
PheXaa: 0.0 ± 0.0
Gly
GlyAla: 4.933 ± 1.902
GlyCys: 0.705 ± 0.574
GlyAsp: 7.047 ± 1.751
GlyGlu: 6.342 ± 2.345
GlyPhe: 3.524 ± 1.039
GlyGly: 7.752 ± 4.023
GlyHis: 2.114 ± 0.746
GlyIle: 5.638 ± 2.631
GlyLys: 6.342 ± 1.791
GlyLeu: 8.457 ± 4.136
GlyMet: 2.819 ± 1.161
GlyAsn: 2.114 ± 0.801
GlyPro: 0.705 ± 0.49
GlyGln: 1.409 ± 0.949
GlyArg: 2.819 ± 1.178
GlySer: 4.228 ± 1.476
GlyThr: 3.524 ± 1.597
GlyVal: 7.047 ± 2.25
GlyTrp: 0.0 ± 0.0
GlyTyr: 5.638 ± 1.906
GlyXaa: 0.0 ± 0.0
His
HisAla: 1.409 ± 0.751
HisCys: 0.705 ± 0.574
HisAsp: 2.819 ± 1.617
HisGlu: 2.819 ± 1.041
HisPhe: 2.114 ± 1.109
HisGly: 3.524 ± 2.45
HisHis: 0.0 ± 0.0
HisIle: 0.705 ± 0.574
HisLys: 2.819 ± 1.31
HisLeu: 1.409 ± 0.98
HisMet: 0.705 ± 0.672
HisAsn: 0.0 ± 0.0
HisPro: 1.409 ± 0.751
HisGln: 0.705 ± 0.73
HisArg: 0.705 ± 0.49
HisSer: 0.0 ± 0.0
HisThr: 0.705 ± 0.574
HisVal: 0.705 ± 0.73
HisTrp: 0.0 ± 0.0
HisTyr: 1.409 ± 1.148
HisXaa: 0.0 ± 0.0
Ile
IleAla: 4.228 ± 1.738
IleCys: 0.705 ± 0.909
IleAsp: 1.409 ± 1.679
IleGlu: 2.114 ± 1.33
IlePhe: 0.705 ± 0.49
IleGly: 5.638 ± 1.629
IleHis: 0.0 ± 0.0
IleIle: 2.819 ± 0.596
IleLys: 5.638 ± 2.172
IleLeu: 2.114 ± 1.102
IleMet: 0.0 ± 0.0
IleAsn: 4.933 ± 1.571
IlePro: 2.114 ± 1.232
IleGln: 2.114 ± 1.146
IleArg: 3.524 ± 1.297
IleSer: 0.0 ± 0.0
IleThr: 3.524 ± 0.851
IleVal: 3.524 ± 2.659
IleTrp: 0.705 ± 0.49
IleTyr: 1.409 ± 0.98
IleXaa: 0.0 ± 0.0
Lys
LysAla: 6.342 ± 2.641
LysCys: 2.114 ± 1.308
LysAsp: 2.819 ± 0.822
LysGlu: 2.114 ± 0.949
LysPhe: 1.409 ± 0.589
LysGly: 7.047 ± 2.823
LysHis: 1.409 ± 0.61
LysIle: 2.114 ± 0.658
LysLys: 6.342 ± 3.228
LysLeu: 4.933 ± 2.023
LysMet: 2.819 ± 1.682
LysAsn: 1.409 ± 1.037
LysPro: 4.228 ± 1.734
LysGln: 2.819 ± 1.805
LysArg: 3.524 ± 1.655
LysSer: 3.524 ± 2.237
LysThr: 4.228 ± 1.614
LysVal: 4.933 ± 1.751
LysTrp: 1.409 ± 1.335
LysTyr: 2.819 ± 1.621
LysXaa: 0.0 ± 0.0
Leu
LeuAla: 4.933 ± 1.646
LeuCys: 0.705 ± 0.574
LeuAsp: 4.228 ± 1.068
LeuGlu: 2.819 ± 1.097
LeuPhe: 3.524 ± 1.492
LeuGly: 9.866 ± 4.339
LeuHis: 1.409 ± 0.801
LeuIle: 3.524 ± 1.534
LeuLys: 4.933 ± 1.575
LeuLeu: 2.114 ± 0.869
LeuMet: 0.705 ± 0.839
LeuAsn: 1.409 ± 0.865
LeuPro: 6.342 ± 2.098
LeuGln: 5.638 ± 2.171
LeuArg: 2.114 ± 0.73
LeuSer: 2.819 ± 1.191
LeuThr: 3.524 ± 1.318
LeuVal: 4.933 ± 1.682
LeuTrp: 2.114 ± 1.722
LeuTyr: 2.819 ± 1.96
LeuXaa: 0.0 ± 0.0
Met
MetAla: 3.524 ± 1.408
MetCys: 0.0 ± 0.0
MetAsp: 2.114 ± 0.831
MetGlu: 0.0 ± 0.0
MetPhe: 1.409 ± 0.808
MetGly: 2.819 ± 1.191
MetHis: 0.705 ± 0.49
MetIle: 0.705 ± 0.751
MetLys: 4.933 ± 2.007
MetLeu: 0.705 ± 0.73
MetMet: 0.705 ± 0.86
MetAsn: 0.705 ± 0.49
MetPro: 3.524 ± 1.492
MetGln: 0.705 ± 0.49
MetArg: 2.114 ± 1.274
MetSer: 2.114 ± 1.295
MetThr: 0.0 ± 0.0
MetVal: 0.705 ± 0.751
MetTrp: 0.705 ± 0.49
MetTyr: 1.409 ± 1.818
MetXaa: 0.0 ± 0.0
Asn
AsnAla: 4.228 ± 1.387
AsnCys: 0.705 ± 0.574
AsnAsp: 0.705 ± 0.49
AsnGlu: 4.933 ± 2.036
AsnPhe: 1.409 ± 1.177
AsnGly: 3.524 ± 1.607
AsnHis: 0.0 ± 0.0
AsnIle: 2.819 ± 1.14
AsnLys: 2.819 ± 1.676
AsnLeu: 7.047 ± 3.736
AsnMet: 0.0 ± 0.922
AsnAsn: 3.524 ± 1.489
AsnPro: 2.819 ± 1.24
AsnGln: 0.705 ± 0.49
AsnArg: 2.114 ± 1.47
AsnSer: 2.114 ± 0.899
AsnThr: 1.409 ± 0.904
AsnVal: 0.705 ± 0.751
AsnTrp: 0.705 ± 0.49
AsnTyr: 0.705 ± 0.574
AsnXaa: 0.0 ± 0.0
Pro
ProAla: 4.228 ± 2.355
ProCys: 1.409 ± 1.148
ProAsp: 2.819 ± 1.257
ProGlu: 2.819 ± 1.063
ProPhe: 1.409 ± 0.751
ProGly: 2.819 ± 1.039
ProHis: 1.409 ± 1.148
ProIle: 5.638 ± 1.205
ProLys: 2.114 ± 1.671
ProLeu: 2.114 ± 0.658
ProMet: 2.114 ± 1.455
ProAsn: 1.409 ± 0.98
ProPro: 2.819 ± 1.24
ProGln: 4.933 ± 2.577
ProArg: 2.114 ± 1.172
ProSer: 1.409 ± 0.801
ProThr: 4.228 ± 1.824
ProVal: 3.524 ± 1.56
ProTrp: 0.705 ± 0.49
ProTyr: 1.409 ± 0.904
ProXaa: 0.0 ± 0.0
Gln
GlnAla: 4.228 ± 2.465
GlnCys: 2.114 ± 0.895
GlnAsp: 3.524 ± 2.024
GlnGlu: 4.228 ± 1.291
GlnPhe: 0.705 ± 0.86
GlnGly: 2.819 ± 1.96
GlnHis: 1.409 ± 0.925
GlnIle: 2.114 ± 0.946
GlnLys: 2.819 ± 0.596
GlnLeu: 2.819 ± 1.508
GlnMet: 2.114 ± 2.074
GlnAsn: 2.114 ± 1.45
GlnPro: 0.705 ± 0.49
GlnGln: 3.524 ± 1.408
GlnArg: 3.524 ± 1.053
GlnSer: 0.705 ± 0.49
GlnThr: 3.524 ± 0.943
GlnVal: 1.409 ± 1.818
GlnTrp: 0.705 ± 0.574
GlnTyr: 1.409 ± 0.925
GlnXaa: 0.0 ± 0.0
Arg
ArgAla: 0.705 ± 0.672
ArgCys: 0.705 ± 0.574
ArgAsp: 2.114 ± 1.47
ArgGlu: 3.524 ± 2.058
ArgPhe: 2.114 ± 1.346
ArgGly: 4.228 ± 1.511
ArgHis: 1.409 ± 0.962
ArgIle: 0.0 ± 0.0
ArgLys: 5.638 ± 1.965
ArgLeu: 7.047 ± 2.233
ArgMet: 2.114 ± 0.658
ArgAsn: 1.409 ± 1.255
ArgPro: 2.819 ± 1.192
ArgGln: 2.819 ± 1.249
ArgArg: 1.409 ± 0.875
ArgSer: 3.524 ± 1.318
ArgThr: 0.0 ± 0.0
ArgVal: 2.819 ± 0.999
ArgTrp: 0.705 ± 0.49
ArgTyr: 4.228 ± 1.375
ArgXaa: 0.0 ± 0.0
Ser
SerAla: 7.752 ± 3.198
SerCys: 1.409 ± 0.98
SerAsp: 1.409 ± 1.021
SerGlu: 1.409 ± 0.925
SerPhe: 2.819 ± 1.039
SerGly: 2.114 ± 0.899
SerHis: 2.114 ± 0.946
SerIle: 2.114 ± 1.102
SerLys: 0.705 ± 0.49
SerLeu: 4.228 ± 1.353
SerMet: 1.409 ± 0.98
SerAsn: 2.114 ± 0.658
SerPro: 2.819 ± 1.026
SerGln: 3.524 ± 1.11
SerArg: 5.638 ± 1.394
SerSer: 4.228 ± 1.953
SerThr: 2.114 ± 1.47
SerVal: 5.638 ± 1.752
SerTrp: 0.0 ± 0.0
SerTyr: 1.409 ± 0.98
SerXaa: 0.0 ± 0.0
Thr
ThrAla: 4.228 ± 2.027
ThrCys: 0.0 ± 0.0
ThrAsp: 1.409 ± 0.98
ThrGlu: 2.819 ± 1.214
ThrPhe: 4.933 ± 1.196
ThrGly: 7.752 ± 2.266
ThrHis: 1.409 ± 0.751
ThrIle: 1.409 ± 0.61
ThrLys: 2.114 ± 1.917
ThrLeu: 3.524 ± 1.43
ThrMet: 0.705 ± 0.574
ThrAsn: 1.409 ± 0.925
ThrPro: 2.114 ± 1.096
ThrGln: 1.409 ± 0.801
ThrArg: 2.114 ± 0.869
ThrSer: 5.638 ± 3.203
ThrThr: 1.409 ± 0.589
ThrVal: 1.409 ± 0.98
ThrTrp: 0.0 ± 0.0
ThrTyr: 1.409 ± 0.865
ThrXaa: 0.0 ± 0.0
Val
ValAla: 6.342 ± 1.462
ValCys: 0.705 ± 0.953
ValAsp: 2.114 ± 0.946
ValGlu: 2.114 ± 0.946
ValPhe: 3.524 ± 0.539
ValGly: 3.524 ± 1.597
ValHis: 0.705 ± 0.574
ValIle: 2.819 ± 1.063
ValLys: 2.819 ± 1.328
ValLeu: 2.114 ± 0.912
ValMet: 3.524 ± 1.082
ValAsn: 3.524 ± 1.423
ValPro: 7.047 ± 2.604
ValGln: 3.524 ± 2.052
ValArg: 1.409 ± 0.808
ValSer: 4.228 ± 1.47
ValThr: 2.819 ± 1.141
ValVal: 5.638 ± 1.558
ValTrp: 2.114 ± 1.278
ValTyr: 3.524 ± 1.822
ValXaa: 0.0 ± 0.0
Trp
TrpAla: 1.409 ± 1.015
TrpCys: 0.0 ± 0.0
TrpAsp: 1.409 ± 0.98
TrpGlu: 0.705 ± 0.49
TrpPhe: 0.705 ± 0.49
TrpGly: 0.0 ± 0.0
TrpHis: 0.705 ± 0.49
TrpIle: 0.0 ± 0.0
TrpLys: 0.0 ± 0.0
TrpLeu: 0.705 ± 0.953
TrpMet: 0.705 ± 0.909
TrpAsn: 0.0 ± 0.0
TrpPro: 2.114 ± 0.946
TrpGln: 0.0 ± 0.0
TrpArg: 0.0 ± 0.0
TrpSer: 2.114 ± 0.658
TrpThr: 0.0 ± 0.0
TrpVal: 1.409 ± 1.015
TrpTrp: 0.0 ± 0.0
TrpTyr: 1.409 ± 0.954
TrpXaa: 0.0 ± 0.0
Tyr
TyrAla: 2.819 ± 1.026
TyrCys: 1.409 ± 1.148
TyrAsp: 2.819 ± 0.853
TyrGlu: 1.409 ± 0.61
TyrPhe: 1.409 ± 0.865
TyrGly: 4.933 ± 2.404
TyrHis: 2.114 ± 1.722
TyrIle: 1.409 ± 0.61
TyrLys: 2.819 ± 1.086
TyrLeu: 3.524 ± 1.319
TyrMet: 0.705 ± 0.49
TyrAsn: 1.409 ± 0.98
TyrPro: 1.409 ± 0.61
TyrGln: 2.819 ± 0.729
TyrArg: 3.524 ± 1.389
TyrSer: 2.819 ± 0.999
TyrThr: 2.114 ± 1.47
TyrVal: 2.114 ± 0.831
TyrTrp: 0.705 ± 0.49
TyrTyr: 2.114 ± 1.053
TyrXaa: 0.0 ± 0.0
Xaa
XaaAla: 0.0 ± 0.0
XaaCys: 0.0 ± 0.0
XaaAsp: 0.0 ± 0.0
XaaGlu: 0.0 ± 0.0
XaaPhe: 0.0 ± 0.0
XaaGly: 0.0 ± 0.0
XaaHis: 0.0 ± 0.0
XaaIle: 0.0 ± 0.0
XaaLys: 0.0 ± 0.0
XaaLeu: 0.0 ± 0.0
XaaMet: 0.0 ± 0.0
XaaAsn: 0.0 ± 0.0
XaaPro: 0.0 ± 0.0
XaaGln: 0.0 ± 0.0
XaaArg: 0.0 ± 0.0
XaaSer: 0.0 ± 0.0
XaaThr: 0.0 ± 0.0
XaaVal: 0.0 ± 0.0
XaaTrp: 0.0 ± 0.0
XaaTyr: 0.0 ± 0.0
XaaXaa: 0.0 ± 0.0
Statistics based on 9 proteins (1420 amino acids)

Note: The error was estimated by bootstrapping (×100) at the protein level.
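A protein-level bootstrap of this kind can be sketched as follows: whole proteins are resampled with replacement, the dipeptide frequency is recomputed for each resample, and the spread of the resampled values is taken as the error. The exact procedure used by Proteome-pI is not described on this page, so the function names, the use of the sample standard deviation as the error measure, the fixed seed, and the toy sequences are all assumptions of this example.

```python
import random
import statistics


def pair_per_mille(proteins, pair):
    """Frequency of one dipeptide (per mille) over a list of sequences."""
    hits = 0
    total = 0
    for seq in proteins:
        for i in range(len(seq) - 1):
            total += 1
            if seq[i:i + 2] == pair:
                hits += 1
    return 1000.0 * hits / total if total else 0.0


def bootstrap_error(proteins, pair, n_resamples=100, seed=0):
    """Standard deviation of the frequency over protein-level resamples."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    estimates = []
    for _ in range(n_resamples):
        # Resample whole proteins with replacement, keeping the same count.
        sample = rng.choices(proteins, k=len(proteins))
        estimates.append(pair_per_mille(sample, pair))
    return statistics.stdev(estimates)


# Hypothetical toy sequences (one-letter codes), not the real proteome.
toy = ["MAAGK", "AAK", "GKLV"]
err_aa = bootstrap_error(toy, "AA")
err_ww = bootstrap_error(toy, "WW")  # dipeptide absent from every protein
```

In this sketch, a dipeptide that occurs in none of the proteins has a frequency of zero in every resample, so its estimated error is also zero.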

The dipeptide statistics above (among other statistics for this proteome) can be downloaded from this CSV file
See this proteome in: uniprot_link
Proteome-pI is available under the Creative Commons Attribution-NoDerivs license; for more details see here

Reference: Kozlowski LP. Proteome-pI 2.0: Proteome Isoelectric Point Database Update. Nucleic Acids Res. 2021, doi: 10.1093/nar/gkab944
Contact: Lukasz P. Kozlowski