Amino acid dipeptide frequency for Streptococcus satellite phage Javan632

Apart from single amino acid frequencies, one can also calculate the so-called amino acid dipeptide frequency.

There are 400 possible dipeptides (441 if Xaa is counted as an additional 21st amino acid representing all non-standard or unknown amino acids). Thus, if dipeptides occurred randomly and all amino acids were equally frequent, each of the 400 dipeptides would be expected at a frequency of 0.25% (2.5‰). As this is not the case in nature, for better visibility all dipeptides that are more frequent than expected are marked in red, and those that are underrepresented are marked in blue in the table.
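As an illustration of how such frequencies can be computed, here is a minimal Python sketch. It is not the Proteome-pI implementation: the function name `dipeptide_per_mille`, the choice to count overlapping pairs only within each protein, and the toy sequences are all assumptions made for this example.

```python
from collections import Counter
from itertools import product

# The 20 standard amino acids in one-letter code; anything else is mapped to "X" (Xaa).
STANDARD = "ACDEFGHIKLMNPQRSTVWY"


def dipeptide_per_mille(proteins):
    """Return {dipeptide: frequency in per mille} over all given sequences.

    Assumption: overlapping pairs are counted within each protein only,
    never across protein boundaries.
    """
    counts = Counter()
    for seq in proteins:
        # Replace non-standard or unknown residues with X.
        cleaned = "".join(c if c in STANDARD else "X" for c in seq.upper())
        for i in range(len(cleaned) - 1):
            counts[cleaned[i:i + 2]] += 1
    total = sum(counts.values())
    alphabet = STANDARD + "X"
    # Report all 441 pairs, including those that never occur.
    return {
        a + b: (1000.0 * counts[a + b] / total if total else 0.0)
        for a, b in product(alphabet, repeat=2)
    }


# Toy input, not taken from the Javan632 proteome.
freqs = dipeptide_per_mille(["MKKLLE", "MKLE"])

# Uniform expectation for 400 equally likely dipeptides, in per mille.
expected_uniform = 1000.0 / 400

# Dipeptides that are more frequent than the uniform expectation.
over = sorted(d for d, f in freqs.items() if f > expected_uniform)
```

With only two short toy sequences, every observed dipeptide exceeds the 2.5‰ expectation; on a real proteome the split into over- and underrepresented pairs is more informative.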

All values are given in per mille (‰), so they need to be multiplied by 10⁻³ to obtain fractions.
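The unit conversion can be sketched in a few lines of Python. The helper names are hypothetical; the GluLeu value is taken from the table, and 0.25% is the uniform expectation for 400 dipeptides.

```python
def per_mille_to_fraction(value):
    """Convert a per mille value to a plain fraction (multiply by 10^-3)."""
    return value * 1e-3


def per_mille_to_percent(value):
    """Convert a per mille value to percent (divide by 10)."""
    return value / 10.0


glu_leu = 12.835  # GluLeu value from the table, in per mille

fraction = per_mille_to_fraction(glu_leu)  # about 0.0128
percent = per_mille_to_percent(glu_leu)    # about 1.28 %

# Ratio to the 0.25 % expected for 400 equally likely dipeptides (about 5.1).
fold_vs_uniform = percent / 0.25
```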

For more information, see the sequence space article on Wikipedia.

Ala Cys Asp Glu Phe Gly His Ile Lys Leu Met Asn Pro Gln Arg Ser Thr Val Trp Tyr Xaa
Ala
1.945AlaAla: 1.945 ± 0.804
0.778AlaCys: 0.778 ± 0.532
4.667AlaAsp: 4.667 ± 1.259
5.445AlaGlu: 5.445 ± 1.309
3.112AlaPhe: 3.112 ± 0.866
3.89AlaGly: 3.89 ± 0.652
0.0AlaHis: 0.0 ± 0.0
3.89AlaIle: 3.89 ± 1.387
5.834AlaLys: 5.834 ± 1.275
9.335AlaLeu: 9.335 ± 1.903
1.945AlaMet: 1.945 ± 0.839
4.667AlaAsn: 4.667 ± 1.367
1.167AlaPro: 1.167 ± 1.056
2.334AlaGln: 2.334 ± 0.773
1.556AlaArg: 1.556 ± 1.032
4.667AlaSer: 4.667 ± 1.714
3.501AlaThr: 3.501 ± 1.054
0.778AlaVal: 0.778 ± 0.517
0.0AlaTrp: 0.0 ± 0.0
2.334AlaTyr: 2.334 ± 0.73
0.0AlaXaa: 0.0 ± 0.0
Cys
0.389CysAla: 0.389 ± 0.352
0.0CysCys: 0.0 ± 0.0
0.778CysAsp: 0.778 ± 0.537
0.0CysGlu: 0.0 ± 0.0
0.0CysPhe: 0.0 ± 0.0
0.0CysGly: 0.0 ± 0.0
0.0CysHis: 0.0 ± 0.0
0.389CysIle: 0.389 ± 0.353
0.778CysLys: 0.778 ± 0.581
0.778CysLeu: 0.778 ± 0.542
0.0CysMet: 0.0 ± 0.0
0.389CysAsn: 0.389 ± 0.341
0.778CysPro: 0.778 ± 0.759
0.0CysGln: 0.0 ± 0.0
0.778CysArg: 0.778 ± 0.65
0.0CysSer: 0.0 ± 0.0
0.0CysThr: 0.0 ± 0.0
0.778CysVal: 0.778 ± 0.44
0.389CysTrp: 0.389 ± 0.471
0.0CysTyr: 0.0 ± 0.0
0.0CysXaa: 0.0 ± 0.0
Asp
0.389AspAla: 0.389 ± 0.455
1.167AspCys: 1.167 ± 1.056
3.89AspAsp: 3.89 ± 1.406
3.89AspGlu: 3.89 ± 1.368
1.945AspPhe: 1.945 ± 1.188
5.445AspGly: 5.445 ± 1.675
0.389AspHis: 0.389 ± 0.341
6.223AspIle: 6.223 ± 1.911
6.612AspLys: 6.612 ± 1.474
5.445AspLeu: 5.445 ± 1.54
2.334AspMet: 2.334 ± 0.843
3.501AspAsn: 3.501 ± 0.892
0.0AspPro: 0.0 ± 0.0
0.389AspGln: 0.389 ± 0.398
3.501AspArg: 3.501 ± 1.214
3.501AspSer: 3.501 ± 1.044
2.334AspThr: 2.334 ± 0.838
1.556AspVal: 1.556 ± 1.2
0.389AspTrp: 0.389 ± 0.428
4.278AspTyr: 4.278 ± 1.396
0.0AspXaa: 0.0 ± 0.0
Glu
7.39GluAla: 7.39 ± 1.358
0.778GluCys: 0.778 ± 0.611
4.667GluAsp: 4.667 ± 1.265
7.001GluGlu: 7.001 ± 1.897
3.112GluPhe: 3.112 ± 0.953
2.723GluGly: 2.723 ± 1.129
2.334GluHis: 2.334 ± 0.939
3.89GluIle: 3.89 ± 1.042
6.223GluLys: 6.223 ± 1.844
12.835GluLeu: 12.835 ± 2.853
1.556GluMet: 1.556 ± 0.872
5.056GluAsn: 5.056 ± 1.275
2.334GluPro: 2.334 ± 1.183
6.223GluGln: 6.223 ± 1.721
2.334GluArg: 2.334 ± 1.623
2.334GluSer: 2.334 ± 0.641
4.278GluThr: 4.278 ± 1.21
4.667GluVal: 4.667 ± 1.671
1.556GluTrp: 1.556 ± 0.574
2.334GluTyr: 2.334 ± 1.202
0.0GluXaa: 0.0 ± 0.0
Phe
1.556PheAla: 1.556 ± 0.88
0.389PheCys: 0.389 ± 0.379
3.89PheAsp: 3.89 ± 1.204
4.278PheGlu: 4.278 ± 1.146
2.334PhePhe: 2.334 ± 1.015
2.723PheGly: 2.723 ± 1.032
1.167PheHis: 1.167 ± 0.811
3.112PheIle: 3.112 ± 1.193
3.501PheLys: 3.501 ± 1.048
2.334PheLeu: 2.334 ± 0.755
0.389PheMet: 0.389 ± 0.402
2.723PheAsn: 2.723 ± 0.995
1.556PhePro: 1.556 ± 0.866
0.778PheGln: 0.778 ± 0.524
0.778PheArg: 0.778 ± 0.397
3.89PheSer: 3.89 ± 1.311
3.112PheThr: 3.112 ± 0.873
1.556PheVal: 1.556 ± 0.885
0.0PheTrp: 0.0 ± 0.0
1.167PheTyr: 1.167 ± 0.619
0.0PheXaa: 0.0 ± 0.0
Gly
1.945GlyAla: 1.945 ± 0.756
0.389GlyCys: 0.389 ± 0.325
2.723GlyAsp: 2.723 ± 1.045
2.723GlyGlu: 2.723 ± 0.738
2.334GlyPhe: 2.334 ± 1.048
0.778GlyGly: 0.778 ± 0.53
1.167GlyHis: 1.167 ± 0.58
4.278GlyIle: 4.278 ± 1.0
6.612GlyLys: 6.612 ± 1.536
5.834GlyLeu: 5.834 ± 1.387
0.389GlyMet: 0.389 ± 0.379
1.556GlyAsn: 1.556 ± 0.822
0.389GlyPro: 0.389 ± 0.453
1.945GlyGln: 1.945 ± 0.957
1.556GlyArg: 1.556 ± 0.691
2.334GlySer: 2.334 ± 0.987
4.278GlyThr: 4.278 ± 1.342
4.278GlyVal: 4.278 ± 1.155
0.389GlyTrp: 0.389 ± 0.402
3.501GlyTyr: 3.501 ± 0.951
0.0GlyXaa: 0.0 ± 0.0
His
1.945HisAla: 1.945 ± 0.862
0.0HisCys: 0.0 ± 0.0
0.389HisAsp: 0.389 ± 0.341
2.334HisGlu: 2.334 ± 1.137
1.167HisPhe: 1.167 ± 0.494
1.167HisGly: 1.167 ± 0.536
0.389HisHis: 0.389 ± 0.352
1.556HisIle: 1.556 ± 0.623
2.723HisLys: 2.723 ± 0.852
1.945HisLeu: 1.945 ± 1.004
0.389HisMet: 0.389 ± 0.394
0.778HisAsn: 0.778 ± 0.531
0.0HisPro: 0.0 ± 0.0
0.0HisGln: 0.0 ± 0.0
0.389HisArg: 0.389 ± 0.471
1.167HisSer: 1.167 ± 0.564
1.556HisThr: 1.556 ± 0.62
0.0HisVal: 0.0 ± 0.0
0.0HisTrp: 0.0 ± 0.0
0.778HisTyr: 0.778 ± 0.795
0.0HisXaa: 0.0 ± 0.0
Ile
5.056IleAla: 5.056 ± 1.884
0.389IleCys: 0.389 ± 0.371
4.278IleAsp: 4.278 ± 1.026
3.501IleGlu: 3.501 ± 1.036
2.334IlePhe: 2.334 ± 0.714
4.278IleGly: 4.278 ± 1.291
0.778IleHis: 0.778 ± 0.513
4.278IleIle: 4.278 ± 1.169
7.39IleLys: 7.39 ± 1.396
5.056IleLeu: 5.056 ± 1.262
1.556IleMet: 1.556 ± 0.779
3.501IleAsn: 3.501 ± 0.721
1.167IlePro: 1.167 ± 0.54
2.334IleGln: 2.334 ± 1.184
2.334IleArg: 2.334 ± 0.98
7.001IleSer: 7.001 ± 1.758
4.667IleThr: 4.667 ± 0.815
3.501IleVal: 3.501 ± 0.985
0.0IleTrp: 0.0 ± 0.0
3.112IleTyr: 3.112 ± 1.491
0.0IleXaa: 0.0 ± 0.0
Lys
7.39LysAla: 7.39 ± 2.121
0.0LysCys: 0.0 ± 0.0
3.89LysAsp: 3.89 ± 1.082
9.335LysGlu: 9.335 ± 2.01
2.334LysPhe: 2.334 ± 0.792
3.501LysGly: 3.501 ± 1.084
3.112LysHis: 3.112 ± 0.692
5.056LysIle: 5.056 ± 1.373
9.724LysLys: 9.724 ± 1.571
7.001LysLeu: 7.001 ± 1.27
1.945LysMet: 1.945 ± 0.969
5.056LysAsn: 5.056 ± 1.241
2.723LysPro: 2.723 ± 1.07
7.39LysGln: 7.39 ± 1.127
5.056LysArg: 5.056 ± 1.334
8.557LysSer: 8.557 ± 1.67
4.667LysThr: 4.667 ± 0.871
4.667LysVal: 4.667 ± 1.114
0.389LysTrp: 0.389 ± 0.325
2.334LysTyr: 2.334 ± 1.017
0.0LysXaa: 0.0 ± 0.0
Leu
8.946LeuAla: 8.946 ± 1.471
0.389LeuCys: 0.389 ± 0.379
8.557LeuAsp: 8.557 ± 2.025
11.28LeuGlu: 11.28 ± 2.803
2.723LeuPhe: 2.723 ± 1.035
6.612LeuGly: 6.612 ± 1.569
1.167LeuHis: 1.167 ± 0.538
3.501LeuIle: 3.501 ± 1.023
9.335LeuLys: 9.335 ± 1.516
10.502LeuLeu: 10.502 ± 2.116
1.945LeuMet: 1.945 ± 0.941
6.612LeuAsn: 6.612 ± 1.37
2.723LeuPro: 2.723 ± 0.89
3.501LeuGln: 3.501 ± 1.172
5.834LeuArg: 5.834 ± 0.905
7.779LeuSer: 7.779 ± 1.787
3.501LeuThr: 3.501 ± 0.921
3.112LeuVal: 3.112 ± 0.779
1.167LeuTrp: 1.167 ± 0.656
2.723LeuTyr: 2.723 ± 1.014
0.0LeuXaa: 0.0 ± 0.0
Met
3.501MetAla: 3.501 ± 1.236
0.0MetCys: 0.0 ± 0.0
0.389MetAsp: 0.389 ± 0.352
1.556MetGlu: 1.556 ± 0.721
0.389MetPhe: 0.389 ± 0.341
0.0MetGly: 0.0 ± 0.0
0.0MetHis: 0.0 ± 0.0
1.556MetIle: 1.556 ± 1.292
2.334MetLys: 2.334 ± 1.19
1.945MetLeu: 1.945 ± 0.606
0.0MetMet: 0.0 ± 0.0
1.556MetAsn: 1.556 ± 0.623
0.0MetPro: 0.0 ± 0.0
1.945MetGln: 1.945 ± 1.002
1.167MetArg: 1.167 ± 0.78
1.556MetSer: 1.556 ± 0.751
1.945MetThr: 1.945 ± 0.792
0.778MetVal: 0.778 ± 0.578
0.389MetTrp: 0.389 ± 0.398
0.389MetTyr: 0.389 ± 0.377
0.0MetXaa: 0.0 ± 0.0
Asn
2.334AsnAla: 2.334 ± 0.97
0.778AsnCys: 0.778 ± 0.592
2.334AsnAsp: 2.334 ± 1.001
3.89AsnGlu: 3.89 ± 1.294
2.723AsnPhe: 2.723 ± 0.838
4.278AsnGly: 4.278 ± 1.194
1.167AsnHis: 1.167 ± 0.756
3.112AsnIle: 3.112 ± 0.908
3.89AsnLys: 3.89 ± 1.072
5.834AsnLeu: 5.834 ± 1.781
1.167AsnMet: 1.167 ± 0.73
2.334AsnAsn: 2.334 ± 0.859
1.945AsnPro: 1.945 ± 0.832
1.945AsnGln: 1.945 ± 0.856
3.112AsnArg: 3.112 ± 1.2
5.056AsnSer: 5.056 ± 1.506
3.89AsnThr: 3.89 ± 1.058
1.945AsnVal: 1.945 ± 0.778
1.167AsnTrp: 1.167 ± 0.612
1.945AsnTyr: 1.945 ± 0.555
0.0AsnXaa: 0.0 ± 0.0
Pro
0.778ProAla: 0.778 ± 0.469
0.0ProCys: 0.0 ± 0.0
0.778ProAsp: 0.778 ± 0.515
1.167ProGlu: 1.167 ± 0.777
2.334ProPhe: 2.334 ± 0.791
0.389ProGly: 0.389 ± 0.453
0.389ProHis: 0.389 ± 0.398
2.334ProIle: 2.334 ± 0.873
3.112ProLys: 3.112 ± 0.999
1.945ProLeu: 1.945 ± 0.889
0.389ProMet: 0.389 ± 0.332
1.167ProAsn: 1.167 ± 0.555
0.389ProPro: 0.389 ± 0.341
0.0ProGln: 0.0 ± 0.0
1.945ProArg: 1.945 ± 0.717
1.945ProSer: 1.945 ± 0.879
1.556ProThr: 1.556 ± 0.62
1.945ProVal: 1.945 ± 0.827
0.389ProTrp: 0.389 ± 0.398
0.778ProTyr: 0.778 ± 0.482
0.0ProXaa: 0.0 ± 0.0
Gln
2.334GlnAla: 2.334 ± 1.37
0.0GlnCys: 0.0 ± 0.0
1.167GlnAsp: 1.167 ± 0.775
4.667GlnGlu: 4.667 ± 1.427
0.778GlnPhe: 0.778 ± 0.554
1.556GlnGly: 1.556 ± 1.177
1.167GlnHis: 1.167 ± 0.559
2.723GlnIle: 2.723 ± 1.305
3.501GlnLys: 3.501 ± 1.76
3.89GlnLeu: 3.89 ± 1.186
1.167GlnMet: 1.167 ± 0.648
3.112GlnAsn: 3.112 ± 1.4
0.0GlnPro: 0.0 ± 0.0
5.834GlnGln: 5.834 ± 1.944
3.89GlnArg: 3.89 ± 1.182
3.501GlnSer: 3.501 ± 1.254
1.167GlnThr: 1.167 ± 0.551
3.89GlnVal: 3.89 ± 1.026
0.778GlnTrp: 0.778 ± 0.563
2.723GlnTyr: 2.723 ± 1.249
0.0GlnXaa: 0.0 ± 0.0
Arg
1.945ArgAla: 1.945 ± 0.912
0.0ArgCys: 0.0 ± 0.0
2.334ArgAsp: 2.334 ± 1.029
4.667ArgGlu: 4.667 ± 1.307
1.556ArgPhe: 1.556 ± 1.065
2.723ArgGly: 2.723 ± 0.806
0.778ArgHis: 0.778 ± 0.532
2.334ArgIle: 2.334 ± 0.965
4.667ArgLys: 4.667 ± 1.25
6.612ArgLeu: 6.612 ± 1.361
1.556ArgMet: 1.556 ± 0.698
1.167ArgAsn: 1.167 ± 0.829
1.167ArgPro: 1.167 ± 0.568
2.334ArgGln: 2.334 ± 1.154
1.167ArgArg: 1.167 ± 0.444
2.334ArgSer: 2.334 ± 0.846
3.112ArgThr: 3.112 ± 1.116
3.112ArgVal: 3.112 ± 1.529
0.389ArgTrp: 0.389 ± 0.453
2.723ArgTyr: 2.723 ± 0.981
0.0ArgXaa: 0.0 ± 0.0
Ser
3.112SerAla: 3.112 ± 1.09
0.0SerCys: 0.0 ± 0.0
3.501SerAsp: 3.501 ± 1.003
5.834SerGlu: 5.834 ± 2.237
3.112SerPhe: 3.112 ± 1.056
3.501SerGly: 3.501 ± 0.891
1.945SerHis: 1.945 ± 0.778
5.445SerIle: 5.445 ± 1.403
5.834SerLys: 5.834 ± 1.198
6.223SerLeu: 6.223 ± 1.386
1.945SerMet: 1.945 ± 0.865
3.89SerAsn: 3.89 ± 1.329
1.556SerPro: 1.556 ± 0.644
3.501SerGln: 3.501 ± 1.545
3.112SerArg: 3.112 ± 1.255
3.89SerSer: 3.89 ± 1.274
2.723SerThr: 2.723 ± 1.201
4.278SerVal: 4.278 ± 1.078
0.389SerTrp: 0.389 ± 0.325
2.723SerTyr: 2.723 ± 0.878
0.0SerXaa: 0.0 ± 0.0
Thr
3.501ThrAla: 3.501 ± 0.905
0.389ThrCys: 0.389 ± 0.377
3.112ThrAsp: 3.112 ± 1.204
4.667ThrGlu: 4.667 ± 1.45
2.723ThrPhe: 2.723 ± 1.305
2.723ThrGly: 2.723 ± 1.331
0.389ThrHis: 0.389 ± 0.325
7.39ThrIle: 7.39 ± 1.567
2.723ThrLys: 2.723 ± 1.145
5.834ThrLeu: 5.834 ± 1.382
1.556ThrMet: 1.556 ± 0.786
2.334ThrAsn: 2.334 ± 0.705
3.501ThrPro: 3.501 ± 1.151
2.723ThrGln: 2.723 ± 1.031
1.945ThrArg: 1.945 ± 0.895
1.556ThrSer: 1.556 ± 0.878
4.667ThrThr: 4.667 ± 1.271
4.667ThrVal: 4.667 ± 0.876
0.778ThrTrp: 0.778 ± 0.502
1.556ThrTyr: 1.556 ± 0.628
0.0ThrXaa: 0.0 ± 0.0
Val
3.89ValAla: 3.89 ± 0.895
0.0ValCys: 0.0 ± 0.0
3.89ValAsp: 3.89 ± 1.207
3.89ValGlu: 3.89 ± 1.474
2.334ValPhe: 2.334 ± 0.866
1.167ValGly: 1.167 ± 0.609
1.556ValHis: 1.556 ± 0.862
4.278ValIle: 4.278 ± 1.289
4.667ValLys: 4.667 ± 1.494
4.667ValLeu: 4.667 ± 1.474
0.389ValMet: 0.389 ± 0.4
2.334ValAsn: 2.334 ± 0.846
0.778ValPro: 0.778 ± 0.523
1.556ValGln: 1.556 ± 1.021
3.112ValArg: 3.112 ± 0.862
2.334ValSer: 2.334 ± 0.75
3.89ValThr: 3.89 ± 1.072
1.167ValVal: 1.167 ± 0.828
0.0ValTrp: 0.0 ± 0.0
3.501ValTyr: 3.501 ± 1.232
0.0ValXaa: 0.0 ± 0.0
Trp
0.389TrpAla: 0.389 ± 0.325
0.0TrpCys: 0.0 ± 0.0
0.778TrpAsp: 0.778 ± 0.611
2.723TrpGlu: 2.723 ± 1.043
0.389TrpPhe: 0.389 ± 0.471
0.0TrpGly: 0.0 ± 0.0
0.0TrpHis: 0.0 ± 0.0
0.0TrpIle: 0.0 ± 0.0
0.389TrpLys: 0.389 ± 0.398
0.389TrpLeu: 0.389 ± 0.403
0.0TrpMet: 0.0 ± 0.0
0.389TrpAsn: 0.389 ± 0.341
0.389TrpPro: 0.389 ± 0.352
0.778TrpGln: 0.778 ± 0.469
0.0TrpArg: 0.0 ± 0.0
0.389TrpSer: 0.389 ± 0.325
0.389TrpThr: 0.389 ± 0.378
0.389TrpVal: 0.389 ± 0.378
0.389TrpTrp: 0.389 ± 0.325
0.778TrpTyr: 0.778 ± 0.637
0.0TrpXaa: 0.0 ± 0.0
Tyr
3.112TyrAla: 3.112 ± 1.592
0.778TyrCys: 0.778 ± 0.397
1.167TyrAsp: 1.167 ± 0.736
1.167TyrGlu: 1.167 ± 0.499
3.501TyrPhe: 3.501 ± 1.54
1.945TyrGly: 1.945 ± 0.783
0.778TyrHis: 0.778 ± 0.523
1.167TyrIle: 1.167 ± 0.537
4.278TyrLys: 4.278 ± 1.465
3.501TyrLeu: 3.501 ± 1.003
0.389TyrMet: 0.389 ± 0.352
2.723TyrAsn: 2.723 ± 0.903
1.167TyrPro: 1.167 ± 0.594
2.334TyrGln: 2.334 ± 0.791
3.112TyrArg: 3.112 ± 1.117
2.723TyrSer: 2.723 ± 0.81
3.501TyrThr: 3.501 ± 1.244
2.334TyrVal: 2.334 ± 1.039
0.0TyrTrp: 0.0 ± 0.0
1.556TyrTyr: 1.556 ± 0.653
0.0TyrXaa: 0.0 ± 0.0
Xaa
0.0XaaAla: 0.0 ± 0.0
0.0XaaCys: 0.0 ± 0.0
0.0XaaAsp: 0.0 ± 0.0
0.0XaaGlu: 0.0 ± 0.0
0.0XaaPhe: 0.0 ± 0.0
0.0XaaGly: 0.0 ± 0.0
0.0XaaHis: 0.0 ± 0.0
0.0XaaIle: 0.0 ± 0.0
0.0XaaLys: 0.0 ± 0.0
0.0XaaLeu: 0.0 ± 0.0
0.0XaaMet: 0.0 ± 0.0
0.0XaaAsn: 0.0 ± 0.0
0.0XaaPro: 0.0 ± 0.0
0.0XaaGln: 0.0 ± 0.0
0.0XaaArg: 0.0 ± 0.0
0.0XaaSer: 0.0 ± 0.0
0.0XaaThr: 0.0 ± 0.0
0.0XaaVal: 0.0 ± 0.0
0.0XaaTrp: 0.0 ± 0.0
0.0XaaTyr: 0.0 ± 0.0
0.0XaaXaa: 0.0 ± 0.0
Statistics based on 19 proteins (2572 amino acids)
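In the listing above, each cell appears as the displayed value followed directly by the pair label, the mean, and the error (for example `12.835GluLeu: 12.835 ± 2.853`), with the first amino acid of each row on its own line. A minimal Python sketch for reading such lines back into structured records follows. The regular expression and the function name `parse_cell` are assumptions based on the line layout seen here, not a format documented by Proteome-pI.

```python
import re

# One flattened cell: displayed value, pair label, then "mean ± error".
CELL = re.compile(
    r"^(?P<shown>\d+(?:\.\d+)?)"
    r"(?P<first>[A-Z][a-z]{2})(?P<second>[A-Z][a-z]{2}): "
    r"(?P<mean>\d+(?:\.\d+)?) ± (?P<error>\d+(?:\.\d+)?)$"
)


def parse_cell(line):
    """Return (first, second, mean, error), or None for non-cell lines."""
    match = CELL.match(line.strip())
    if match is None:
        return None
    return (
        match["first"],
        match["second"],
        float(match["mean"]),
        float(match["error"]),
    )


# Sample lines copied from the Glu row; the bare row label is skipped.
lines = ["Glu", "12.835GluLeu: 12.835 ± 2.853", "0.0GluXaa: 0.0 ± 0.0"]
cells = [c for c in map(parse_cell, lines) if c is not None]
```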

Note: The error was estimated by bootstrapping (×100) at the protein level.
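The page does not describe the bootstrap procedure in detail. The Python sketch below shows one common way to do it, under stated assumptions: whole proteins are resampled with replacement 100 times, the frequency is recomputed for each replicate, and the sample standard deviation of the replicates is reported as the error. The function names, the fixed seed, and the toy sequences are hypothetical.

```python
import random
import statistics


def dipeptide_frequency(proteins, dipeptide):
    """Per mille frequency of one dipeptide (one-letter codes) in a set of proteins."""
    hits = total = 0
    for seq in proteins:
        for i in range(len(seq) - 1):
            total += 1
            if seq[i:i + 2] == dipeptide:
                hits += 1
    return 1000.0 * hits / total if total else 0.0


def bootstrap_error(proteins, dipeptide, replicates=100, seed=0):
    """Standard deviation of the frequency over protein-level bootstrap replicates."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    estimates = []
    for _ in range(replicates):
        # Resample whole proteins with replacement, keeping the proteome size.
        sample = rng.choices(proteins, k=len(proteins))
        estimates.append(dipeptide_frequency(sample, dipeptide))
    return statistics.stdev(estimates)


# Toy input, not taken from the Javan632 proteome.
toy = ["MKKLLE", "MKLE", "AAKLG", "MEELLKK"]
error = bootstrap_error(toy, "KL")
```

Resampling proteins rather than individual residues keeps each sequence intact, so the error reflects how much the estimate depends on which proteins are present. This also explains why a dipeptide that never occurs gets an error of exactly 0.0.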

The dipeptide statistics above (along with other statistics for this proteome) can be downloaded from this CSV file
See this proteome in: uniprot_link
Proteome-pI is available under the Creative Commons Attribution-NoDerivs license; for more details, see here

Reference: Kozlowski LP. Proteome-pI 2.0: Proteome Isoelectric Point Database Update. Nucleic Acids Res. 2021, doi: 10.1093/nar/gkab944
Contact: Lukasz P. Kozlowski