Amino acid dipeptide frequency for Sputnik virophage

Apart from single amino acid frequencies, one can also calculate the so-called amino acid dipeptide frequency, i.e. how often each ordered pair of adjacent residues occurs.

There are 400 possible dipeptides (441 if Xaa is counted as an additional 21st amino acid representing all non-standard or unknown amino acids). Thus, if dipeptides occurred randomly and with equal probability in proteins, each of the 400 dipeptides would be present at a frequency of 0.25% (2.5‰). As this is not the case in nature, for better visibility all dipeptides that occur more often than expected are marked in red in the table, and those that are underrepresented are marked in blue.
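As an illustration of how such frequencies can be obtained, the following minimal Python sketch counts overlapping residue pairs and reports them in per mille. It is not the code used by Proteome-pI: the function name `dipeptide_permille`, the toy sequences, and the choice to count pairs only within a protein (never across two proteins) are assumptions made for this example.

```python
from collections import Counter
from itertools import product

STANDARD = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard amino acids, one-letter codes


def dipeptide_permille(proteins):
    """Return {dipeptide: frequency in per mille} over overlapping residue pairs."""
    counts = Counter()
    for seq in proteins:
        # Map every letter outside the 20 standard ones to X (Xaa).
        seq = "".join(c if c in STANDARD else "X" for c in seq.upper())
        # Overlapping pairs inside one protein (assumption: no pair spans two proteins).
        counts.update(seq[i:i + 2] for i in range(len(seq) - 1))
    total = sum(counts.values())
    alphabet = STANDARD + "X"
    # Report all 441 ordered pairs, including those never observed.
    return {a + b: (1000.0 * counts[a + b] / total if total else 0.0)
            for a, b in product(alphabet, repeat=2)}


# Hypothetical toy input, not the Sputnik sequences.
freqs = dipeptide_permille(["MKKLA", "KKB"])
print(len(freqs))                 # number of possible dipeptides with Xaa included
print(freqs["KK"])                # KK makes up 2 of the 6 pairs in the toy input
print(1000 / 400, 1000 / 441)     # uniform expectation in per mille
```

With 20 amino acids the uniform expectation is 2.5‰ per dipeptide; with Xaa included it drops to about 2.27‰.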

All values are given in per mille (‰), so they need to be multiplied by 10^-3 to obtain fractions.
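The unit conversion and the comparison with the uniform expectation can be sketched as follows. The helper `describe` is hypothetical, and using 2.5‰ (1000/400) as the cut-off between over- and underrepresented dipeptides is an assumption; the page does not state the exact threshold behind the red and blue marking.

```python
EXPECTED_PERMILLE = 1000 / 400  # 2.5 per mille: uniform over 400 dipeptides (assumed cut-off)


def describe(permille, expected=EXPECTED_PERMILLE):
    """Convert a per mille value and compare it with the expected value."""
    fraction = permille * 1e-3      # per mille -> fraction
    percent = permille / 10         # per mille -> percent
    ratio = permille / expected     # fold difference from the expectation
    if permille > expected:
        label = "over"
    elif permille < expected:
        label = "under"
    else:
        label = "as expected"
    return fraction, percent, ratio, label


# Values copied from the table: LysLys and AlaHis.
for name, value in [("LysLys", 16.092), ("AlaHis", 0.206)]:
    fraction, percent, ratio, label = describe(value)
    print(f"{name}: {fraction:.6f} = {percent:.4f}% ({ratio:.2f}x expected, {label})")
```

For example, LysLys at 16.092‰ corresponds to a fraction of about 0.0161 (about 1.61%), roughly 6.4 times the uniform expectation.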

For more information, see the sequence space article on Wikipedia.

Ala Cys Asp Glu Phe Gly His Ile Lys Leu Met Asn Pro Gln Arg Ser Thr Val Trp Tyr Xaa
Ala
AlaAla: 1.444 ± 0.399
AlaCys: 0.825 ± 0.353
AlaAsp: 1.238 ± 0.398
AlaGlu: 2.063 ± 0.669
AlaPhe: 1.651 ± 0.51
AlaGly: 3.92 ± 0.954
AlaHis: 0.206 ± 0.22
AlaIle: 2.682 ± 0.376
AlaLys: 5.364 ± 1.451
AlaLeu: 3.095 ± 0.555
AlaMet: 0.825 ± 0.466
AlaAsn: 2.476 ± 0.711
AlaPro: 1.651 ± 0.984
AlaGln: 2.888 ± 0.668
AlaArg: 1.238 ± 0.598
AlaSer: 3.507 ± 1.218
AlaThr: 1.857 ± 0.595
AlaVal: 2.063 ± 0.442
AlaTrp: 0.0 ± 0.0
AlaTyr: 2.888 ± 0.764
AlaXaa: 0.0 ± 0.0
Cys
CysAla: 0.206 ± 0.159
CysCys: 0.619 ± 0.291
CysAsp: 0.0 ± 0.0
CysGlu: 1.032 ± 0.394
CysPhe: 0.619 ± 0.382
CysGly: 1.238 ± 0.52
CysHis: 0.0 ± 0.0
CysIle: 0.619 ± 0.417
CysLys: 1.651 ± 0.609
CysLeu: 0.825 ± 0.327
CysMet: 0.206 ± 0.159
CysAsn: 1.032 ± 0.363
CysPro: 0.825 ± 0.386
CysGln: 0.619 ± 0.359
CysArg: 0.206 ± 0.264
CysSer: 0.413 ± 0.315
CysThr: 0.206 ± 0.164
CysVal: 0.413 ± 0.429
CysTrp: 0.0 ± 0.0
CysTyr: 0.206 ± 0.159
CysXaa: 0.0 ± 0.0
Asp
AspAla: 1.651 ± 0.561
AspCys: 0.825 ± 0.382
AspAsp: 3.92 ± 1.081
AspGlu: 5.364 ± 1.18
AspPhe: 2.269 ± 0.481
AspGly: 3.714 ± 1.251
AspHis: 0.413 ± 0.278
AspIle: 4.745 ± 1.029
AspLys: 4.333 ± 1.115
AspLeu: 3.714 ± 0.793
AspMet: 2.476 ± 0.621
AspAsn: 3.714 ± 1.074
AspPro: 2.682 ± 0.779
AspGln: 1.238 ± 0.478
AspArg: 1.444 ± 0.711
AspSer: 3.507 ± 0.693
AspThr: 1.444 ± 0.459
AspVal: 2.888 ± 0.732
AspTrp: 0.0 ± 0.0
AspTyr: 1.857 ± 0.586
AspXaa: 0.0 ± 0.0
Glu
GluAla: 2.888 ± 0.603
GluCys: 1.032 ± 0.399
GluAsp: 2.476 ± 0.579
GluGlu: 6.189 ± 2.782
GluPhe: 3.301 ± 0.912
GluGly: 1.444 ± 0.479
GluHis: 1.651 ± 0.542
GluIle: 5.57 ± 1.551
GluLys: 6.808 ± 1.735
GluLeu: 8.871 ± 1.969
GluMet: 1.238 ± 0.394
GluAsn: 3.714 ± 1.088
GluPro: 1.032 ± 0.357
GluGln: 2.269 ± 0.641
GluArg: 1.857 ± 0.864
GluSer: 2.063 ± 0.411
GluThr: 2.888 ± 0.882
GluVal: 3.507 ± 1.048
GluTrp: 0.206 ± 0.159
GluTyr: 2.063 ± 0.596
GluXaa: 0.0 ± 0.0
Phe
PheAla: 1.651 ± 0.481
PheCys: 1.032 ± 0.485
PheAsp: 4.333 ± 1.105
PheGlu: 2.682 ± 0.749
PhePhe: 2.269 ± 0.644
PheGly: 1.651 ± 0.53
PheHis: 0.413 ± 0.317
PheIle: 3.301 ± 0.853
PheLys: 4.333 ± 0.963
PheLeu: 3.714 ± 0.952
PheMet: 0.413 ± 0.288
PheAsn: 3.714 ± 0.919
PhePro: 1.857 ± 0.588
PheGln: 1.238 ± 0.528
PheArg: 1.238 ± 0.345
PheSer: 3.095 ± 0.718
PheThr: 3.095 ± 0.769
PheVal: 3.92 ± 0.614
PheTrp: 0.0 ± 0.0
PheTyr: 1.444 ± 0.498
PheXaa: 0.0 ± 0.0
Gly
GlyAla: 3.301 ± 0.721
GlyCys: 0.619 ± 0.431
GlyAsp: 4.126 ± 1.473
GlyGlu: 2.269 ± 0.907
GlyPhe: 2.888 ± 0.723
GlyGly: 7.015 ± 2.192
GlyHis: 1.032 ± 0.553
GlyIle: 5.158 ± 1.274
GlyLys: 4.952 ± 1.305
GlyLeu: 4.333 ± 1.407
GlyMet: 1.032 ± 0.465
GlyAsn: 3.507 ± 1.057
GlyPro: 1.238 ± 0.607
GlyGln: 3.507 ± 1.329
GlyArg: 2.269 ± 0.67
GlySer: 4.745 ± 1.394
GlyThr: 5.158 ± 1.822
GlyVal: 3.92 ± 1.686
GlyTrp: 0.206 ± 0.193
GlyTyr: 2.063 ± 0.484
GlyXaa: 0.0 ± 0.0
His
HisAla: 0.825 ± 0.452
HisCys: 0.619 ± 0.311
HisAsp: 0.206 ± 0.159
HisGlu: 0.619 ± 0.333
HisPhe: 0.206 ± 0.159
HisGly: 0.413 ± 0.283
HisHis: 0.206 ± 0.209
HisIle: 0.619 ± 0.31
HisLys: 2.063 ± 0.801
HisLeu: 1.444 ± 0.66
HisMet: 0.206 ± 0.202
HisAsn: 0.413 ± 0.298
HisPro: 0.825 ± 0.264
HisGln: 0.0 ± 0.0
HisArg: 0.619 ± 0.359
HisSer: 1.444 ± 0.433
HisThr: 0.413 ± 0.289
HisVal: 0.619 ± 0.257
HisTrp: 0.206 ± 0.214
HisTyr: 0.825 ± 0.388
HisXaa: 0.0 ± 0.0
Ile
IleAla: 2.476 ± 0.747
IleCys: 0.619 ± 0.35
IleAsp: 4.333 ± 1.072
IleGlu: 2.476 ± 0.673
IlePhe: 4.126 ± 1.177
IleGly: 4.126 ± 0.878
IleHis: 1.032 ± 0.357
IleIle: 5.158 ± 1.235
IleLys: 6.396 ± 1.387
IleLeu: 5.158 ± 0.961
IleMet: 1.032 ± 0.528
IleAsn: 8.046 ± 1.316
IlePro: 4.952 ± 1.283
IleGln: 3.301 ± 0.742
IleArg: 1.651 ± 0.629
IleSer: 4.952 ± 1.029
IleThr: 5.364 ± 1.309
IleVal: 3.92 ± 0.613
IleTrp: 0.413 ± 0.241
IleTyr: 4.745 ± 0.934
IleXaa: 0.0 ± 0.0
Lys
LysAla: 4.333 ± 0.98
LysCys: 0.206 ± 0.221
LysAsp: 5.364 ± 0.963
LysGlu: 7.427 ± 1.645
LysPhe: 3.92 ± 1.233
LysGly: 7.84 ± 2.984
LysHis: 1.238 ± 0.489
LysIle: 7.427 ± 1.417
LysLys: 16.092 ± 3.616
LysLeu: 7.634 ± 1.372
LysMet: 2.888 ± 0.91
LysAsn: 6.602 ± 1.25
LysPro: 3.301 ± 0.739
LysGln: 3.095 ± 0.998
LysArg: 3.92 ± 1.153
LysSer: 5.983 ± 1.123
LysThr: 5.158 ± 1.048
LysVal: 4.333 ± 0.978
LysTrp: 0.206 ± 0.159
LysTyr: 5.57 ± 1.624
LysXaa: 0.0 ± 0.0
Leu
LeuAla: 3.095 ± 0.832
LeuCys: 0.825 ± 0.49
LeuAsp: 4.745 ± 0.979
LeuGlu: 6.189 ± 1.396
LeuPhe: 3.095 ± 0.975
LeuGly: 4.126 ± 0.762
LeuHis: 0.619 ± 0.33
LeuIle: 4.333 ± 0.861
LeuLys: 11.347 ± 1.551
LeuLeu: 7.427 ± 1.587
LeuMet: 2.063 ± 0.522
LeuAsn: 6.808 ± 1.423
LeuPro: 2.476 ± 0.453
LeuGln: 4.126 ± 0.86
LeuArg: 3.92 ± 0.604
LeuSer: 6.189 ± 0.74
LeuThr: 3.92 ± 1.117
LeuVal: 4.126 ± 0.881
LeuTrp: 1.032 ± 0.328
LeuTyr: 4.333 ± 0.918
LeuXaa: 0.0 ± 0.0
Met
MetAla: 1.238 ± 0.428
MetCys: 0.206 ± 0.214
MetAsp: 0.619 ± 0.352
MetGlu: 1.032 ± 0.484
MetPhe: 1.032 ± 0.396
MetGly: 0.0 ± 0.0
MetHis: 0.206 ± 0.209
MetIle: 1.444 ± 0.486
MetLys: 2.476 ± 0.606
MetLeu: 2.063 ± 0.639
MetMet: 0.619 ± 0.314
MetAsn: 2.269 ± 0.864
MetPro: 0.619 ± 0.379
MetGln: 0.619 ± 0.343
MetArg: 0.825 ± 0.526
MetSer: 2.888 ± 0.662
MetThr: 0.619 ± 0.317
MetVal: 1.444 ± 0.618
MetTrp: 0.206 ± 0.164
MetTyr: 1.651 ± 0.424
MetXaa: 0.0 ± 0.0
Asn
AsnAla: 3.301 ± 0.83
AsnCys: 0.825 ± 0.398
AsnAsp: 4.333 ± 0.928
AsnGlu: 5.364 ± 1.343
AsnPhe: 3.507 ± 1.027
AsnGly: 5.777 ± 0.987
AsnHis: 0.619 ± 0.34
AsnIle: 5.57 ± 0.949
AsnLys: 6.602 ± 1.408
AsnLeu: 5.777 ± 1.456
AsnMet: 1.444 ± 0.416
AsnAsn: 8.871 ± 2.251
AsnPro: 4.126 ± 1.282
AsnGln: 1.857 ± 0.835
AsnArg: 1.238 ± 0.729
AsnSer: 3.301 ± 0.766
AsnThr: 6.189 ± 1.361
AsnVal: 5.158 ± 1.233
AsnTrp: 0.413 ± 0.26
AsnTyr: 3.507 ± 0.729
AsnXaa: 0.0 ± 0.0
Pro
ProAla: 0.825 ± 0.483
ProCys: 0.206 ± 0.242
ProAsp: 1.238 ± 0.579
ProGlu: 2.682 ± 0.707
ProPhe: 3.095 ± 0.741
ProGly: 2.682 ± 0.77
ProHis: 1.238 ± 0.348
ProIle: 3.095 ± 0.843
ProLys: 4.333 ± 1.12
ProLeu: 2.476 ± 0.625
ProMet: 0.619 ± 0.29
ProAsn: 3.301 ± 1.133
ProPro: 2.476 ± 0.665
ProGln: 0.413 ± 0.292
ProArg: 1.857 ± 1.058
ProSer: 4.126 ± 1.242
ProThr: 2.888 ± 0.968
ProVal: 1.651 ± 0.449
ProTrp: 0.825 ± 0.415
ProTyr: 2.888 ± 0.733
ProXaa: 0.0 ± 0.0
Gln
GlnAla: 2.063 ± 0.888
GlnCys: 0.0 ± 0.0
GlnAsp: 2.063 ± 0.551
GlnGlu: 1.444 ± 0.607
GlnPhe: 1.651 ± 0.617
GlnGly: 1.032 ± 0.409
GlnHis: 0.206 ± 0.159
GlnIle: 3.507 ± 0.711
GlnLys: 2.888 ± 0.86
GlnLeu: 2.476 ± 0.654
GlnMet: 1.444 ± 0.45
GlnAsn: 2.476 ± 0.801
GlnPro: 2.063 ± 0.945
GlnGln: 2.476 ± 0.745
GlnArg: 1.238 ± 0.58
GlnSer: 2.063 ± 0.559
GlnThr: 3.301 ± 0.865
GlnVal: 2.063 ± 1.02
GlnTrp: 0.0 ± 0.0
GlnTyr: 1.857 ± 0.479
GlnXaa: 0.0 ± 0.0
Arg
ArgAla: 0.825 ± 0.312
ArgCys: 0.0 ± 0.0
ArgAsp: 1.857 ± 0.624
ArgGlu: 2.682 ± 0.929
ArgPhe: 1.032 ± 0.377
ArgGly: 1.444 ± 0.521
ArgHis: 0.413 ± 0.249
ArgIle: 3.92 ± 0.844
ArgLys: 3.92 ± 0.965
ArgLeu: 4.126 ± 0.887
ArgMet: 1.238 ± 0.65
ArgAsn: 0.825 ± 0.394
ArgPro: 0.619 ± 0.358
ArgGln: 0.619 ± 0.342
ArgArg: 0.619 ± 0.36
ArgSer: 1.651 ± 0.639
ArgThr: 1.444 ± 0.581
ArgVal: 1.444 ± 0.719
ArgTrp: 0.825 ± 0.339
ArgTyr: 2.269 ± 0.634
ArgXaa: 0.0 ± 0.0
Ser
SerAla: 2.269 ± 0.771
SerCys: 0.206 ± 0.159
SerAsp: 2.888 ± 0.685
SerGlu: 3.095 ± 0.726
SerPhe: 2.682 ± 0.659
SerGly: 5.983 ± 1.91
SerHis: 1.444 ± 0.532
SerIle: 5.158 ± 1.286
SerLys: 5.983 ± 1.158
SerLeu: 5.983 ± 1.113
SerMet: 1.651 ± 0.58
SerAsn: 7.84 ± 1.235
SerPro: 2.063 ± 0.832
SerGln: 1.651 ± 0.737
SerArg: 2.476 ± 0.628
SerSer: 4.333 ± 1.041
SerThr: 4.539 ± 1.133
SerVal: 3.301 ± 0.821
SerTrp: 0.206 ± 0.191
SerTyr: 3.714 ± 0.684
SerXaa: 0.0 ± 0.0
Thr
ThrAla: 3.095 ± 0.801
ThrCys: 0.413 ± 0.303
ThrAsp: 2.888 ± 0.977
ThrGlu: 2.269 ± 0.718
ThrPhe: 2.476 ± 0.634
ThrGly: 4.745 ± 1.875
ThrHis: 0.619 ± 0.328
ThrIle: 4.952 ± 0.928
ThrLys: 4.952 ± 1.724
ThrLeu: 6.189 ± 1.468
ThrMet: 0.825 ± 0.375
ThrAsn: 3.92 ± 1.169
ThrPro: 4.333 ± 0.75
ThrGln: 2.476 ± 0.503
ThrArg: 1.651 ± 0.491
ThrSer: 5.158 ± 1.102
ThrThr: 3.507 ± 1.272
ThrVal: 2.063 ± 0.654
ThrTrp: 0.619 ± 0.579
ThrTyr: 3.301 ± 0.914
ThrXaa: 0.0 ± 0.0
Val
ValAla: 3.92 ± 1.154
ValCys: 0.619 ± 0.351
ValAsp: 2.063 ± 0.649
ValGlu: 2.476 ± 0.753
ValPhe: 3.095 ± 0.745
ValGly: 3.301 ± 0.733
ValHis: 0.413 ± 0.361
ValIle: 3.301 ± 0.847
ValLys: 4.539 ± 0.823
ValLeu: 6.189 ± 1.221
ValMet: 0.619 ± 0.339
ValAsn: 3.095 ± 0.738
ValPro: 2.888 ± 0.568
ValGln: 2.063 ± 0.914
ValArg: 1.032 ± 0.429
ValSer: 3.714 ± 0.905
ValThr: 5.158 ± 1.106
ValVal: 4.539 ± 1.239
ValTrp: 0.0 ± 0.0
ValTyr: 2.063 ± 0.56
ValXaa: 0.0 ± 0.0
Trp
TrpAla: 0.619 ± 0.308
TrpCys: 0.413 ± 0.429
TrpAsp: 0.206 ± 0.193
TrpGlu: 0.0 ± 0.0
TrpPhe: 0.0 ± 0.0
TrpGly: 0.0 ± 0.0
TrpHis: 0.0 ± 0.0
TrpIle: 0.206 ± 0.22
TrpLys: 0.206 ± 0.159
TrpLeu: 0.0 ± 0.0
TrpMet: 0.413 ± 0.206
TrpAsn: 0.619 ± 0.318
TrpPro: 0.0 ± 0.0
TrpGln: 0.0 ± 0.0
TrpArg: 0.206 ± 0.209
TrpSer: 1.238 ± 0.394
TrpThr: 1.032 ± 0.369
TrpVal: 0.825 ± 0.415
TrpTrp: 0.0 ± 0.0
TrpTyr: 0.0 ± 0.0
TrpXaa: 0.0 ± 0.0
Tyr
TyrAla: 1.857 ± 0.451
TyrCys: 0.825 ± 0.416
TyrAsp: 3.301 ± 0.766
TyrGlu: 3.714 ± 1.07
TyrPhe: 2.476 ± 0.574
TyrGly: 2.888 ± 0.847
TyrHis: 0.825 ± 0.3
TyrIle: 3.507 ± 0.932
TyrLys: 3.301 ± 1.117
TyrLeu: 3.301 ± 0.886
TyrMet: 0.619 ± 0.416
TyrAsn: 4.539 ± 0.769
TyrPro: 2.888 ± 1.073
TyrGln: 1.857 ± 0.458
TyrArg: 2.269 ± 0.526
TyrSer: 3.095 ± 1.236
TyrThr: 2.269 ± 0.715
TyrVal: 2.888 ± 0.56
TyrTrp: 0.619 ± 0.329
TyrTyr: 3.92 ± 1.144
TyrXaa: 0.0 ± 0.0
Xaa
XaaAla: 0.0 ± 0.0
XaaCys: 0.0 ± 0.0
XaaAsp: 0.0 ± 0.0
XaaGlu: 0.0 ± 0.0
XaaPhe: 0.0 ± 0.0
XaaGly: 0.0 ± 0.0
XaaHis: 0.0 ± 0.0
XaaIle: 0.0 ± 0.0
XaaLys: 0.0 ± 0.0
XaaLeu: 0.0 ± 0.0
XaaMet: 0.0 ± 0.0
XaaAsn: 0.0 ± 0.0
XaaPro: 0.0 ± 0.0
XaaGln: 0.0 ± 0.0
XaaArg: 0.0 ± 0.0
XaaSer: 0.0 ± 0.0
XaaThr: 0.0 ± 0.0
XaaVal: 0.0 ± 0.0
XaaTrp: 0.0 ± 0.0
XaaTyr: 0.0 ± 0.0
XaaXaa: 0.0 ± 0.0
Statistics based on 21 proteins (4848 amino acids)

Note: The error was estimated by bootstrapping (x100) at the protein level.
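A protein-level bootstrap of this kind can be sketched in Python as follows. This is only an illustration under stated assumptions: whole proteins are resampled with replacement, the error is taken as the sample standard deviation over 100 resamples, and the names `pair_permille` and `bootstrap_error`, the seed, and the toy sequences are hypothetical. The exact procedure used by Proteome-pI may differ.

```python
import random
import statistics
from collections import Counter


def pair_permille(proteins, pair):
    """Frequency of one dipeptide in per mille over all overlapping pairs."""
    counts = Counter()
    for seq in proteins:
        counts.update(seq[i:i + 2] for i in range(len(seq) - 1))
    total = sum(counts.values())
    return 1000.0 * counts[pair] / total if total else 0.0


def bootstrap_error(proteins, pair, n_resamples=100, seed=0):
    """Standard deviation of the frequency over protein-level resamples."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    estimates = []
    for _ in range(n_resamples):
        # Resample whole proteins with replacement, keeping the proteome size.
        sample = rng.choices(proteins, k=len(proteins))
        estimates.append(pair_permille(sample, pair))
    return statistics.stdev(estimates)


# Hypothetical toy proteome, not the Sputnik sequences.
toy = ["MKKLAKK", "MAGGSLE", "MKKKKE", "MSTNNLL"]
print(pair_permille(toy, "KK"), "±", bootstrap_error(toy, "KK"))
```

Resampling proteins rather than individual residues keeps each sequence intact, so the error reflects how much the estimate depends on which proteins happen to be in the proteome.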

You can download the dipeptide statistics above (among other statistics for this proteome) from this CSV file
See this proteome in: uniprot_link
Proteome-pI is available under a Creative Commons Attribution-NoDerivs license; for more details see here

Reference: Kozlowski LP. Proteome-pI 2.0: Proteome Isoelectric Point Database Update. Nucleic Acids Res. 2021, doi: 10.1093/nar/gkab944
Contact: Lukasz P. Kozlowski