Amino acid dipeptide frequency for Streptococcus satellite phage Javan99

Apart from single amino acid frequencies, one can also calculate the so-called amino acid dipeptide frequency.

There are 400 possible dipeptides (441 if we consider Xaa as an additional, 21st amino acid representing all non-standard or unknown amino acids). Thus, if dipeptides occurred at random in proteins, each of the 400 standard dipeptides would be present at a frequency of 0.25% (2.5‰). As this is not the case in nature, for better visibility dipeptides that are more frequent than expected are marked in red in the table, and those that are underrepresented are marked in blue.
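As an illustration of the calculation described above, here is a minimal sketch of counting dipeptides and comparing them with the uniform expectation. The function names (`dipeptide_permille`, `classify`), the normalization by the total number of dipeptides, and the toy sequences are assumptions made for this example; the exact procedure used by Proteome-pI is not stated on this page.

```python
from collections import Counter

STANDARD = "ACDEFGHIKLMNPQRSTVWY"  # 20 standard amino acids, one-letter codes


def dipeptide_permille(proteins):
    """Return {dipeptide: frequency in per mille} over all proteins.

    Overlapping pairs are counted within each protein only, so no pair
    spans two proteins. Non-standard letters are mapped to 'X' (Xaa).
    Assumption: frequencies are normalized by the total dipeptide count.
    """
    counts = Counter()
    for seq in proteins:
        seq = "".join(c if c in STANDARD else "X" for c in seq.upper())
        counts.update(seq[i:i + 2] for i in range(len(seq) - 1))
    total = sum(counts.values())
    if total == 0:
        return {}
    return {pair: 1000.0 * n / total for pair, n in counts.items()}


# Uniform expectation: 1/400 of all dipeptides = 0.25% = 2.5 per mille.
EXPECTED_PERMILLE = 1000.0 / len(STANDARD) ** 2


def classify(freq_permille, expected=EXPECTED_PERMILLE):
    """Label a frequency relative to the uniform expectation."""
    if freq_permille > expected:
        return "over"
    if freq_permille < expected:
        return "under"
    return "expected"


toy = ["MKLEELKK", "MAEELLK"]  # made-up sequences, not Javan99 proteins
freqs = dipeptide_permille(toy)
```

With the values from the table, `classify(15.022)` (GluLeu) falls in the overrepresented group and `classify(0.406)` (AlaPro) in the underrepresented one.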

All values are given in per mille (‰) and therefore need to be multiplied by 10^-3 to obtain fractions.
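The unit conversion can be sketched in a couple of lines. The helper name `permille_to_fraction` is hypothetical; the input is the GluLeu value from the table.

```python
def permille_to_fraction(value_permille):
    """Convert a per mille value to a plain fraction (multiply by 10^-3)."""
    return value_permille * 1e-3


glu_leu_fraction = permille_to_fraction(15.022)  # GluLeu value from the table
glu_leu_percent = glu_leu_fraction * 100.0       # the same value as a percentage
```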

For more information, see the sequence space article on Wikipedia.

Ala Cys Asp Glu Phe Gly His Ile Lys Leu Met Asn Pro Gln Arg Ser Thr Val Trp Tyr Xaa
Ala
AlaAla: 3.654 ± 1.397
AlaCys: 1.218 ± 0.626
AlaAsp: 2.842 ± 1.069
AlaGlu: 7.714 ± 2.181
AlaPhe: 2.03 ± 0.84
AlaGly: 2.842 ± 1.046
AlaHis: 0.0 ± 0.0
AlaIle: 4.466 ± 1.347
AlaLys: 5.684 ± 1.262
AlaLeu: 4.466 ± 1.367
AlaMet: 2.436 ± 1.249
AlaAsn: 4.06 ± 1.368
AlaPro: 0.406 ± 0.41
AlaGln: 2.03 ± 0.758
AlaArg: 3.654 ± 1.229
AlaSer: 3.248 ± 1.121
AlaThr: 2.842 ± 1.129
AlaVal: 5.278 ± 1.84
AlaTrp: 0.812 ± 0.451
AlaTyr: 3.248 ± 1.001
AlaXaa: 0.0 ± 0.0
Cys
CysAla: 0.406 ± 0.333
CysCys: 0.0 ± 0.0
CysAsp: 0.0 ± 0.0
CysGlu: 0.406 ± 0.427
CysPhe: 0.0 ± 0.0
CysGly: 0.0 ± 0.0
CysHis: 0.0 ± 0.0
CysIle: 0.0 ± 0.0
CysLys: 0.406 ± 0.372
CysLeu: 0.812 ± 0.675
CysMet: 0.0 ± 0.0
CysAsn: 0.0 ± 0.0
CysPro: 1.218 ± 0.788
CysGln: 0.812 ± 0.659
CysArg: 0.406 ± 0.372
CysSer: 0.0 ± 0.0
CysThr: 0.0 ± 0.0
CysVal: 0.0 ± 0.0
CysTrp: 0.0 ± 0.0
CysTyr: 0.0 ± 0.0
CysXaa: 0.0 ± 0.0
Asp
AspAla: 3.248 ± 0.982
AspCys: 0.0 ± 0.0
AspAsp: 2.842 ± 1.175
AspGlu: 2.842 ± 1.315
AspPhe: 4.06 ± 1.875
AspGly: 3.248 ± 1.281
AspHis: 0.0 ± 0.0
AspIle: 4.466 ± 1.188
AspLys: 6.09 ± 1.194
AspLeu: 5.278 ± 1.83
AspMet: 2.436 ± 0.927
AspAsn: 3.654 ± 1.29
AspPro: 0.0 ± 0.0
AspGln: 0.812 ± 0.743
AspArg: 1.624 ± 0.703
AspSer: 2.03 ± 0.867
AspThr: 2.03 ± 0.698
AspVal: 4.466 ± 1.187
AspTrp: 0.0 ± 0.0
AspTyr: 2.842 ± 1.003
AspXaa: 0.0 ± 0.0
Glu
GluAla: 5.278 ± 2.069
GluCys: 0.406 ± 0.333
GluAsp: 6.09 ± 1.768
GluGlu: 6.496 ± 2.03
GluPhe: 2.842 ± 1.458
GluGly: 2.436 ± 0.972
GluHis: 1.218 ± 0.676
GluIle: 7.308 ± 1.517
GluLys: 9.744 ± 1.514
GluLeu: 15.022 ± 3.278
GluMet: 2.436 ± 1.18
GluAsn: 5.278 ± 1.315
GluPro: 2.436 ± 0.845
GluGln: 5.278 ± 1.348
GluArg: 3.654 ± 1.524
GluSer: 2.03 ± 0.763
GluThr: 4.06 ± 0.972
GluVal: 3.654 ± 1.179
GluTrp: 2.03 ± 1.049
GluTyr: 2.03 ± 1.354
GluXaa: 0.0 ± 0.0
Phe
PheAla: 2.436 ± 0.787
PheCys: 0.812 ± 0.615
PheAsp: 3.654 ± 1.208
PheGlu: 2.436 ± 1.45
PhePhe: 0.812 ± 0.59
PheGly: 2.03 ± 0.76
PheHis: 0.812 ± 0.508
PheIle: 4.06 ± 1.902
PheLys: 3.248 ± 1.18
PheLeu: 2.436 ± 0.833
PheMet: 1.218 ± 0.651
PheAsn: 0.812 ± 0.523
PhePro: 0.406 ± 0.46
PheGln: 1.218 ± 0.693
PheArg: 1.218 ± 0.689
PheSer: 3.248 ± 0.898
PheThr: 0.812 ± 0.451
PheVal: 2.436 ± 1.096
PheTrp: 0.406 ± 0.41
PheTyr: 1.624 ± 1.305
PheXaa: 0.0 ± 0.0
Gly
GlyAla: 2.03 ± 0.879
GlyCys: 0.406 ± 0.372
GlyAsp: 1.218 ± 0.696
GlyGlu: 2.842 ± 0.851
GlyPhe: 2.03 ± 0.994
GlyGly: 1.218 ± 0.945
GlyHis: 0.812 ± 0.538
GlyIle: 3.654 ± 1.344
GlyLys: 5.684 ± 1.415
GlyLeu: 7.308 ± 2.433
GlyMet: 0.406 ± 0.333
GlyAsn: 2.842 ± 0.92
GlyPro: 0.0 ± 0.0
GlyGln: 2.436 ± 0.78
GlyArg: 1.218 ± 0.72
GlySer: 1.624 ± 0.895
GlyThr: 2.436 ± 1.27
GlyVal: 4.466 ± 1.985
GlyTrp: 0.0 ± 0.0
GlyTyr: 2.436 ± 0.824
GlyXaa: 0.0 ± 0.0
His
HisAla: 0.406 ± 0.46
HisCys: 0.0 ± 0.0
HisAsp: 0.812 ± 0.567
HisGlu: 2.436 ± 0.905
HisPhe: 1.218 ± 0.642
HisGly: 2.436 ± 0.903
HisHis: 0.406 ± 0.367
HisIle: 0.0 ± 0.0
HisLys: 0.812 ± 0.547
HisLeu: 2.03 ± 0.762
HisMet: 0.0 ± 0.0
HisAsn: 1.218 ± 0.593
HisPro: 0.0 ± 0.0
HisGln: 0.406 ± 0.382
HisArg: 1.218 ± 0.686
HisSer: 0.406 ± 0.372
HisThr: 1.218 ± 1.115
HisVal: 0.0 ± 0.0
HisTrp: 0.0 ± 0.0
HisTyr: 0.0 ± 0.0
HisXaa: 0.0 ± 0.0
Ile
IleAla: 4.466 ± 1.516
IleCys: 0.0 ± 0.0
IleAsp: 4.872 ± 1.228
IleGlu: 8.932 ± 2.259
IlePhe: 0.812 ± 0.529
IleGly: 2.842 ± 0.97
IleHis: 1.624 ± 0.702
IleIle: 4.872 ± 1.389
IleLys: 6.09 ± 1.689
IleLeu: 4.872 ± 1.115
IleMet: 1.624 ± 0.91
IleAsn: 4.466 ± 1.485
IlePro: 1.624 ± 0.815
IleGln: 0.812 ± 0.445
IleArg: 2.436 ± 0.904
IleSer: 4.466 ± 1.129
IleThr: 3.654 ± 1.049
IleVal: 3.248 ± 1.01
IleTrp: 0.406 ± 0.367
IleTyr: 3.248 ± 1.239
IleXaa: 0.0 ± 0.0
Lys
LysAla: 7.714 ± 2.65
LysCys: 0.812 ± 0.652
LysAsp: 4.06 ± 1.267
LysGlu: 9.338 ± 1.883
LysPhe: 1.624 ± 0.761
LysGly: 5.278 ± 1.124
LysHis: 2.03 ± 0.806
LysIle: 4.872 ± 1.562
LysLys: 10.15 ± 2.258
LysLeu: 10.556 ± 2.823
LysMet: 1.218 ± 0.679
LysAsn: 8.12 ± 1.582
LysPro: 2.842 ± 0.94
LysGln: 6.09 ± 1.405
LysArg: 5.278 ± 1.591
LysSer: 5.684 ± 1.927
LysThr: 6.902 ± 1.417
LysVal: 4.466 ± 1.154
LysTrp: 1.624 ± 0.898
LysTyr: 4.06 ± 1.297
LysXaa: 0.0 ± 0.0
Leu
LeuAla: 9.338 ± 2.189
LeuCys: 0.0 ± 0.0
LeuAsp: 7.308 ± 2.179
LeuGlu: 10.962 ± 2.61
LeuPhe: 4.466 ± 1.352
LeuGly: 4.872 ± 1.677
LeuHis: 1.624 ± 0.725
LeuIle: 3.248 ± 1.168
LeuLys: 13.804 ± 2.814
LeuLeu: 9.744 ± 1.691
LeuMet: 2.842 ± 1.259
LeuAsn: 4.872 ± 1.12
LeuPro: 1.624 ± 0.665
LeuGln: 4.06 ± 1.423
LeuArg: 4.466 ± 1.248
LeuSer: 6.496 ± 1.511
LeuThr: 8.12 ± 1.494
LeuVal: 5.278 ± 1.455
LeuTrp: 0.406 ± 0.452
LeuTyr: 5.278 ± 0.877
LeuXaa: 0.0 ± 0.0
Met
MetAla: 2.842 ± 1.206
MetCys: 0.0 ± 0.0
MetAsp: 2.03 ± 0.963
MetGlu: 3.248 ± 0.978
MetPhe: 0.406 ± 0.452
MetGly: 0.812 ± 0.605
MetHis: 0.406 ± 0.367
MetIle: 0.812 ± 0.533
MetLys: 2.03 ± 0.703
MetLeu: 4.06 ± 1.169
MetMet: 2.03 ± 1.116
MetAsn: 1.218 ± 0.51
MetPro: 0.812 ± 0.638
MetGln: 1.218 ± 0.637
MetArg: 2.842 ± 1.06
MetSer: 0.406 ± 0.406
MetThr: 2.842 ± 1.241
MetVal: 0.812 ± 0.587
MetTrp: 0.0 ± 0.0
MetTyr: 0.406 ± 0.499
MetXaa: 0.0 ± 0.0
Asn
AsnAla: 4.06 ± 1.125
AsnCys: 0.0 ± 0.0
AsnAsp: 4.466 ± 1.127
AsnGlu: 3.654 ± 1.153
AsnPhe: 2.03 ± 0.786
AsnGly: 4.466 ± 1.079
AsnHis: 0.812 ± 0.523
AsnIle: 2.03 ± 1.158
AsnLys: 4.872 ± 1.807
AsnLeu: 4.872 ± 1.179
AsnMet: 2.03 ± 0.786
AsnAsn: 3.248 ± 1.231
AsnPro: 1.624 ± 0.625
AsnGln: 4.466 ± 0.951
AsnArg: 2.03 ± 0.804
AsnSer: 2.03 ± 0.847
AsnThr: 3.248 ± 1.561
AsnVal: 2.03 ± 0.799
AsnTrp: 1.218 ± 0.676
AsnTyr: 2.842 ± 1.285
AsnXaa: 0.0 ± 0.0
Pro
ProAla: 0.812 ± 0.55
ProCys: 0.0 ± 0.0
ProAsp: 0.0 ± 0.0
ProGlu: 1.624 ± 1.098
ProPhe: 2.03 ± 0.731
ProGly: 0.406 ± 0.382
ProHis: 0.0 ± 0.0
ProIle: 2.842 ± 0.741
ProLys: 3.654 ± 0.801
ProLeu: 3.654 ± 1.04
ProMet: 0.406 ± 0.517
ProAsn: 0.812 ± 0.496
ProPro: 1.218 ± 0.914
ProGln: 0.406 ± 0.406
ProArg: 2.03 ± 0.796
ProSer: 0.406 ± 0.333
ProThr: 1.624 ± 0.716
ProVal: 1.624 ± 0.641
ProTrp: 0.0 ± 0.0
ProTyr: 0.812 ± 0.518
ProXaa: 0.0 ± 0.0
Gln
GlnAla: 2.436 ± 0.629
GlnCys: 0.406 ± 0.458
GlnAsp: 2.842 ± 1.31
GlnGlu: 4.872 ± 1.568
GlnPhe: 1.218 ± 0.693
GlnGly: 1.218 ± 0.662
GlnHis: 0.812 ± 0.575
GlnIle: 1.624 ± 0.978
GlnLys: 6.09 ± 1.479
GlnLeu: 4.872 ± 1.555
GlnMet: 0.406 ± 0.427
GlnAsn: 2.436 ± 1.154
GlnPro: 1.624 ± 0.963
GlnGln: 5.278 ± 1.655
GlnArg: 2.03 ± 0.68
GlnSer: 4.06 ± 1.113
GlnThr: 2.03 ± 0.852
GlnVal: 1.624 ± 1.165
GlnTrp: 0.0 ± 0.0
GlnTyr: 1.624 ± 0.824
GlnXaa: 0.0 ± 0.0
Arg
ArgAla: 2.436 ± 0.878
ArgCys: 0.0 ± 0.0
ArgAsp: 2.842 ± 0.653
ArgGlu: 6.09 ± 2.379
ArgPhe: 0.406 ± 0.372
ArgGly: 2.03 ± 0.876
ArgHis: 0.406 ± 0.372
ArgIle: 4.466 ± 1.405
ArgLys: 3.654 ± 1.052
ArgLeu: 3.248 ± 1.304
ArgMet: 3.248 ± 1.395
ArgAsn: 2.436 ± 0.819
ArgPro: 0.406 ± 0.367
ArgGln: 2.436 ± 1.237
ArgArg: 0.0 ± 0.0
ArgSer: 1.218 ± 0.63
ArgThr: 2.03 ± 0.991
ArgVal: 5.278 ± 1.108
ArgTrp: 0.406 ± 0.382
ArgTyr: 1.218 ± 0.646
ArgXaa: 0.0 ± 0.0
Ser
SerAla: 2.436 ± 0.738
SerCys: 0.0 ± 0.0
SerAsp: 2.842 ± 0.996
SerGlu: 4.06 ± 1.338
SerPhe: 2.03 ± 0.701
SerGly: 1.218 ± 0.669
SerHis: 0.406 ± 0.333
SerIle: 4.466 ± 1.576
SerLys: 5.684 ± 1.493
SerLeu: 5.684 ± 1.402
SerMet: 1.218 ± 0.769
SerAsn: 4.06 ± 1.299
SerPro: 3.654 ± 0.98
SerGln: 2.03 ± 1.046
SerArg: 2.842 ± 1.089
SerSer: 2.03 ± 0.751
SerThr: 0.812 ± 0.566
SerVal: 4.872 ± 1.399
SerTrp: 0.406 ± 0.372
SerTyr: 1.624 ± 0.641
SerXaa: 0.0 ± 0.0
Thr
ThrAla: 3.248 ± 1.223
ThrCys: 0.406 ± 0.333
ThrAsp: 0.812 ± 0.655
ThrGlu: 2.842 ± 1.014
ThrPhe: 0.812 ± 0.605
ThrGly: 4.872 ± 1.33
ThrHis: 1.624 ± 0.707
ThrIle: 4.872 ± 0.913
ThrLys: 2.436 ± 1.26
ThrLeu: 7.714 ± 1.706
ThrMet: 1.624 ± 0.484
ThrAsn: 2.03 ± 0.885
ThrPro: 2.436 ± 0.826
ThrGln: 3.248 ± 0.968
ThrArg: 2.03 ± 0.574
ThrSer: 1.624 ± 0.783
ThrThr: 5.684 ± 1.457
ThrVal: 4.466 ± 1.673
ThrTrp: 0.406 ± 0.447
ThrTyr: 1.624 ± 0.73
ThrXaa: 0.0 ± 0.0
Val
ValAla: 2.436 ± 0.859
ValCys: 0.0 ± 0.0
ValAsp: 1.218 ± 0.57
ValGlu: 4.872 ± 2.161
ValPhe: 2.436 ± 1.084
ValGly: 2.436 ± 1.015
ValHis: 0.812 ± 0.508
ValIle: 4.06 ± 1.1
ValLys: 6.09 ± 2.044
ValLeu: 6.09 ± 1.55
ValMet: 1.218 ± 0.745
ValAsn: 2.436 ± 0.874
ValPro: 2.03 ± 0.654
ValGln: 2.436 ± 0.932
ValArg: 3.654 ± 1.029
ValSer: 6.902 ± 1.523
ValThr: 3.248 ± 1.042
ValVal: 2.436 ± 0.786
ValTrp: 0.812 ± 0.572
ValTyr: 2.842 ± 0.991
ValXaa: 0.0 ± 0.0
Trp
TrpAla: 0.406 ± 0.372
TrpCys: 0.0 ± 0.0
TrpAsp: 0.406 ± 0.372
TrpGlu: 2.03 ± 0.897
TrpPhe: 0.406 ± 0.452
TrpGly: 0.0 ± 0.0
TrpHis: 0.406 ± 0.372
TrpIle: 0.406 ± 0.367
TrpLys: 1.218 ± 0.777
TrpLeu: 1.624 ± 0.821
TrpMet: 0.812 ± 0.605
TrpAsn: 0.406 ± 0.41
TrpPro: 0.0 ± 0.0
TrpGln: 0.406 ± 0.372
TrpArg: 0.406 ± 0.46
TrpSer: 0.812 ± 0.445
TrpThr: 0.0 ± 0.0
TrpVal: 0.0 ± 0.0
TrpTrp: 0.812 ± 0.445
TrpTyr: 0.406 ± 0.452
TrpXaa: 0.0 ± 0.0
Tyr
TyrAla: 2.842 ± 1.331
TyrCys: 0.0 ± 0.0
TyrAsp: 0.812 ± 0.575
TyrGlu: 2.03 ± 0.959
TyrPhe: 4.466 ± 1.419
TyrGly: 0.406 ± 0.372
TyrHis: 0.812 ± 0.59
TyrIle: 3.248 ± 1.248
TyrLys: 5.278 ± 1.678
TyrLeu: 4.466 ± 1.14
TyrMet: 1.218 ± 0.641
TyrAsn: 1.624 ± 0.546
TyrPro: 0.0 ± 0.0
TyrGln: 1.624 ± 0.765
TyrArg: 1.218 ± 0.756
TyrSer: 3.654 ± 1.295
TyrThr: 1.218 ± 0.5
TyrVal: 2.03 ± 1.007
TyrTrp: 1.218 ± 0.49
TyrTyr: 1.624 ± 0.975
TyrXaa: 0.0 ± 0.0
Xaa
XaaAla: 0.0 ± 0.0
XaaCys: 0.0 ± 0.0
XaaAsp: 0.0 ± 0.0
XaaGlu: 0.0 ± 0.0
XaaPhe: 0.0 ± 0.0
XaaGly: 0.0 ± 0.0
XaaHis: 0.0 ± 0.0
XaaIle: 0.0 ± 0.0
XaaLys: 0.0 ± 0.0
XaaLeu: 0.0 ± 0.0
XaaMet: 0.0 ± 0.0
XaaAsn: 0.0 ± 0.0
XaaPro: 0.0 ± 0.0
XaaGln: 0.0 ± 0.0
XaaArg: 0.0 ± 0.0
XaaSer: 0.0 ± 0.0
XaaThr: 0.0 ± 0.0
XaaVal: 0.0 ± 0.0
XaaTrp: 0.0 ± 0.0
XaaTyr: 0.0 ± 0.0
XaaXaa: 0.0 ± 0.0
Statistics based on 18 proteins (2464 amino acids)

Note: The error has been estimated by bootstrapping (100 rounds) at the protein level.
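A protein-level bootstrap of this kind could be sketched as follows. This is only an illustration under stated assumptions: the error is taken to be the standard deviation over 100 resamples of whole proteins drawn with replacement, the function names (`permille`, `bootstrap_error`) and the toy sequences are hypothetical, and the exact estimator used by Proteome-pI is not described on this page.

```python
import random
import statistics
from collections import Counter


def permille(proteins, pair):
    """Frequency of one dipeptide (per mille) over a list of sequences."""
    counts = Counter()
    for seq in proteins:
        counts.update(seq[i:i + 2] for i in range(len(seq) - 1))
    total = sum(counts.values())
    return 1000.0 * counts[pair] / total if total else 0.0


def bootstrap_error(proteins, pair, n_rounds=100, seed=0):
    """Standard deviation of the frequency across protein-level resamples.

    Assumption: the reported error is the sample standard deviation of
    the bootstrap replicates.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    replicates = []
    for _ in range(n_rounds):
        # Draw whole proteins with replacement, keeping the sample size fixed.
        sample = rng.choices(proteins, k=len(proteins))
        replicates.append(permille(sample, pair))
    return statistics.stdev(replicates)


toy = ["MKLEELKK", "MAEELLK", "MGLKELLE"]  # made-up sequences
err = bootstrap_error(toy, "EL")
```

Under this scheme a dipeptide that never occurs in any protein has a frequency of 0 in every resample, which matches the `0.0 ± 0.0` entries in the table.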

The dipeptide statistics above (along with other statistics for this proteome) can be downloaded from this CSV file
See this proteome in: uniprot_link
Proteome-pI is available under the Creative Commons Attribution-NoDerivs license; for more details see here

Reference: Kozlowski LP. Proteome-pI 2.0: Proteome Isoelectric Point Database Update. Nucleic Acids Res. 2021, doi: 10.1093/nar/gkab944
Contact: Lukasz P. Kozlowski