Amino acid dipeptide frequency for Clostridium phage PhiS63

Apart from single amino acid frequencies, one can also calculate the so-called amino acid dipeptide frequency, i.e. how often each ordered pair of adjacent residues occurs.
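The calculation can be sketched in a few lines of Python. This is only an illustrative sketch, not the code used by the database: the function name dipeptide_permille, the one-letter input alphabet, and the choice not to count pairs across protein boundaries are all assumptions.

```python
from collections import Counter

# The 20 standard residues in one-letter code; anything else is treated as Xaa ('X').
STANDARD = set("ACDEFGHIKLMNPQRSTVWY")

def dipeptide_permille(sequences):
    """Return {dipeptide: frequency in per mille} pooled over all sequences.

    Hypothetical helper: pairs are overlapping windows of length 2 and are
    counted within each protein only (an assumption of this sketch).
    """
    counts = Counter()
    for seq in sequences:
        cleaned = "".join(aa if aa in STANDARD else "X" for aa in seq.upper())
        for i in range(len(cleaned) - 1):
            counts[cleaned[i:i + 2]] += 1
    total = sum(counts.values())
    if total == 0:
        return {}
    return {pair: 1000.0 * n / total for pair, n in counts.items()}

# Toy input (not real phage proteins): 'U' is non-standard and becomes 'X'.
freqs = dipeptide_permille(["MKKLE", "KKU"])
for pair, value in sorted(freqs.items()):
    print(f"{pair}: {value:.3f}")
```

In the toy input there are six dipeptides in total, two of which are KK, so KK accounts for one third of all pairs.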

There are 400 possible dipeptides (441 if Xaa is counted as an additional, 21st amino acid representing all non-standard or unknown amino acids). Thus, if dipeptides occurred randomly and uniformly in proteins, each amino acid dipeptide would be present at a frequency of 0.25% (1/400). As this is not the case in nature, for better visibility all dipeptides that occur more often than expected are marked in red, and those that are underrepresented are marked in blue in the table.

All values are given in per mille (‰) and therefore need to be multiplied by 10⁻³ to obtain fractions.
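As a rough worked example, the uniform expectation and the per mille conversion can be written out in Python. The helper names below are hypothetical, and using the uniform expectation (2.5‰ for 400 dipeptides) as the threshold for over- and underrepresentation is an assumption of this sketch; the three example values are taken from the table.

```python
def expected_permille(n_residue_types=20):
    """Uniform expectation for a single dipeptide, in per mille."""
    return 1000.0 / (n_residue_types ** 2)

def permille_to_fraction(value_permille):
    """Convert a per mille value to a plain fraction (multiply by 10^-3)."""
    return value_permille * 1e-3

def classify(value_permille, expected):
    """Label a dipeptide relative to the expected value (assumed threshold)."""
    if value_permille > expected:
        return "over"
    if value_permille < expected:
        return "under"
    return "as expected"

expected = expected_permille(20)           # 1000 / 400 = 2.5 per mille = 0.25%
expected_with_xaa = expected_permille(21)  # 1000 / 441, slightly lower

# Values copied from the table (per mille).
examples = {"LysLys": 12.307, "AlaArg": 2.502, "TrpTrp": 0.1}
for name, value in examples.items():
    print(name, permille_to_fraction(value), classify(value, expected))
```

With these numbers LysLys and AlaArg fall above the uniform expectation and TrpTrp falls below it.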

For more information, see the sequence space article on Wikipedia.

Dipeptide frequencies (‰), listed as dipeptide: frequency ± error and grouped by first residue.
Residue order: Ala Cys Asp Glu Phe Gly His Ile Lys Leu Met Asn Pro Gln Arg Ser Thr Val Trp Tyr Xaa

Ala
AlaAla: 1.701 ± 0.576
AlaCys: 0.1 ± 0.106
AlaAsp: 2.602 ± 0.467
AlaGlu: 3.402 ± 0.577
AlaPhe: 1.501 ± 0.444
AlaGly: 2.401 ± 0.515
AlaHis: 0.4 ± 0.175
AlaIle: 4.303 ± 0.878
AlaLys: 5.904 ± 1.022
AlaLeu: 5.103 ± 0.711
AlaMet: 1.701 ± 0.413
AlaAsn: 3.202 ± 0.542
AlaPro: 1.201 ± 0.29
AlaGln: 1.301 ± 0.307
AlaArg: 2.502 ± 0.566
AlaSer: 2.301 ± 0.457
AlaThr: 3.502 ± 0.536
AlaVal: 2.502 ± 0.596
AlaTrp: 0.7 ± 0.252
AlaTyr: 1.601 ± 0.286
AlaXaa: 0.0 ± 0.0

Cys
CysAla: 0.1 ± 0.095
CysCys: 0.3 ± 0.176
CysAsp: 0.901 ± 0.269
CysGlu: 0.5 ± 0.22
CysPhe: 1.101 ± 0.41
CysGly: 1.301 ± 0.438
CysHis: 0.4 ± 0.161
CysIle: 1.201 ± 0.352
CysLys: 0.901 ± 0.401
CysLeu: 1.001 ± 0.474
CysMet: 0.1 ± 0.096
CysAsn: 0.7 ± 0.256
CysPro: 0.3 ± 0.195
CysGln: 0.3 ± 0.171
CysArg: 0.3 ± 0.205
CysSer: 0.901 ± 0.351
CysThr: 1.101 ± 0.301
CysVal: 0.7 ± 0.268
CysTrp: 0.3 ± 0.173
CysTyr: 0.4 ± 0.209
CysXaa: 0.0 ± 0.0

Asp
AspAla: 1.901 ± 0.431
AspCys: 1.101 ± 0.302
AspAsp: 4.102 ± 0.807
AspGlu: 5.103 ± 1.086
AspPhe: 3.602 ± 0.593
AspGly: 3.402 ± 0.715
AspHis: 0.8 ± 0.276
AspIle: 6.804 ± 0.806
AspLys: 5.803 ± 0.626
AspLeu: 5.904 ± 0.578
AspMet: 1.801 ± 0.259
AspAsn: 3.802 ± 0.817
AspPro: 1.401 ± 0.411
AspGln: 1.201 ± 0.352
AspArg: 2.401 ± 0.477
AspSer: 3.902 ± 0.524
AspThr: 3.402 ± 0.533
AspVal: 2.502 ± 0.471
AspTrp: 0.6 ± 0.214
AspTyr: 2.802 ± 0.688
AspXaa: 0.0 ± 0.0

Glu
GluAla: 4.503 ± 0.693
GluCys: 0.6 ± 0.391
GluAsp: 4.703 ± 0.83
GluGlu: 11.207 ± 1.99
GluPhe: 2.602 ± 0.428
GluGly: 6.004 ± 0.614
GluHis: 1.501 ± 0.439
GluIle: 6.004 ± 0.773
GluLys: 8.405 ± 1.506
GluLeu: 9.306 ± 1.088
GluMet: 2.802 ± 0.627
GluAsn: 5.503 ± 0.881
GluPro: 1.101 ± 0.406
GluGln: 1.901 ± 0.366
GluArg: 2.702 ± 0.577
GluSer: 3.202 ± 0.624
GluThr: 2.902 ± 0.542
GluVal: 4.803 ± 0.769
GluTrp: 1.401 ± 0.456
GluTyr: 3.202 ± 0.616
GluXaa: 0.0 ± 0.0

Phe
PheAla: 1.601 ± 0.352
PheCys: 0.5 ± 0.199
PheAsp: 3.002 ± 0.741
PheGlu: 3.502 ± 0.751
PhePhe: 2.502 ± 0.689
PheGly: 2.301 ± 0.487
PheHis: 0.1 ± 0.099
PheIle: 4.002 ± 0.625
PheLys: 5.503 ± 0.628
PheLeu: 2.702 ± 0.485
PheMet: 1.001 ± 0.301
PheAsn: 2.602 ± 0.536
PhePro: 1.201 ± 0.264
PheGln: 1.301 ± 0.419
PheArg: 1.601 ± 0.458
PheSer: 2.502 ± 0.58
PheThr: 1.801 ± 0.517
PheVal: 1.801 ± 0.386
PheTrp: 0.2 ± 0.142
PheTyr: 1.601 ± 0.461
PheXaa: 0.0 ± 0.0

Gly
GlyAla: 3.102 ± 0.9
GlyCys: 1.001 ± 0.328
GlyAsp: 4.002 ± 1.091
GlyGlu: 3.402 ± 0.485
GlyPhe: 2.602 ± 0.643
GlyGly: 3.502 ± 0.97
GlyHis: 0.5 ± 0.242
GlyIle: 6.004 ± 0.743
GlyLys: 7.605 ± 0.834
GlyLeu: 4.703 ± 0.64
GlyMet: 0.901 ± 0.279
GlyAsn: 4.102 ± 0.621
GlyPro: 0.2 ± 0.13
GlyGln: 1.701 ± 0.345
GlyArg: 2.401 ± 0.631
GlySer: 3.102 ± 0.545
GlyThr: 3.402 ± 0.966
GlyVal: 3.702 ± 0.793
GlyTrp: 1.101 ± 0.289
GlyTyr: 3.002 ± 0.56
GlyXaa: 0.0 ± 0.0

His
HisAla: 0.3 ± 0.181
HisCys: 0.3 ± 0.19
HisAsp: 0.6 ± 0.454
HisGlu: 1.101 ± 0.288
HisPhe: 0.5 ± 0.191
HisGly: 0.901 ± 0.247
HisHis: 0.7 ± 0.295
HisIle: 1.301 ± 0.325
HisLys: 0.6 ± 0.197
HisLeu: 1.201 ± 0.417
HisMet: 0.0 ± 0.0
HisAsn: 0.7 ± 0.267
HisPro: 0.3 ± 0.146
HisGln: 0.1 ± 0.076
HisArg: 0.901 ± 0.388
HisSer: 0.7 ± 0.223
HisThr: 0.4 ± 0.203
HisVal: 0.5 ± 0.216
HisTrp: 0.1 ± 0.09
HisTyr: 0.4 ± 0.211
HisXaa: 0.0 ± 0.0

Ile
IleAla: 5.103 ± 0.573
IleCys: 0.901 ± 0.304
IleAsp: 6.104 ± 0.718
IleGlu: 7.204 ± 0.813
IlePhe: 3.302 ± 0.62
IleGly: 4.903 ± 0.985
IleHis: 0.8 ± 0.246
IleIle: 6.104 ± 0.856
IleLys: 11.907 ± 0.873
IleLeu: 6.304 ± 0.928
IleMet: 1.001 ± 0.383
IleAsn: 7.605 ± 0.798
IlePro: 2.702 ± 0.492
IleGln: 2.301 ± 0.475
IleArg: 2.802 ± 0.599
IleSer: 4.203 ± 0.636
IleThr: 4.903 ± 0.505
IleVal: 5.103 ± 0.596
IleTrp: 1.001 ± 0.287
IleTyr: 3.902 ± 0.576
IleXaa: 0.0 ± 0.0

Lys
LysAla: 5.803 ± 0.642
LysCys: 2.101 ± 0.633
LysAsp: 6.204 ± 0.597
LysGlu: 12.207 ± 2.264
LysPhe: 3.402 ± 0.458
LysGly: 5.803 ± 0.857
LysHis: 1.001 ± 0.364
LysIle: 12.007 ± 1.353
LysLys: 12.307 ± 1.334
LysLeu: 11.007 ± 1.412
LysMet: 2.802 ± 0.593
LysAsn: 8.405 ± 0.614
LysPro: 2.101 ± 0.47
LysGln: 3.202 ± 0.833
LysArg: 4.002 ± 0.664
LysSer: 5.703 ± 0.9
LysThr: 5.003 ± 0.837
LysVal: 6.304 ± 0.588
LysTrp: 0.4 ± 0.179
LysTyr: 5.103 ± 0.956
LysXaa: 0.0 ± 0.0

Leu
LeuAla: 5.103 ± 0.952
LeuCys: 1.401 ± 0.355
LeuAsp: 6.604 ± 0.494
LeuGlu: 7.805 ± 1.119
LeuPhe: 3.202 ± 0.482
LeuGly: 4.503 ± 0.618
LeuHis: 0.5 ± 0.269
LeuIle: 6.404 ± 0.677
LeuLys: 11.707 ± 1.181
LeuLeu: 5.503 ± 0.865
LeuMet: 2.301 ± 0.46
LeuAsn: 6.904 ± 0.81
LeuPro: 0.901 ± 0.413
LeuGln: 1.601 ± 0.426
LeuArg: 3.502 ± 0.731
LeuSer: 4.303 ± 0.79
LeuThr: 5.203 ± 0.568
LeuVal: 4.303 ± 0.73
LeuTrp: 0.7 ± 0.318
LeuTyr: 3.502 ± 0.699
LeuXaa: 0.0 ± 0.0

Met
MetAla: 1.401 ± 0.377
MetCys: 0.5 ± 0.356
MetAsp: 1.401 ± 0.389
MetGlu: 2.301 ± 0.436
MetPhe: 0.7 ± 0.262
MetGly: 1.401 ± 0.321
MetHis: 0.0 ± 0.0
MetIle: 1.701 ± 0.381
MetLys: 3.002 ± 0.755
MetLeu: 1.701 ± 0.381
MetMet: 0.5 ± 0.275
MetAsn: 1.201 ± 0.337
MetPro: 0.2 ± 0.141
MetGln: 1.101 ± 0.285
MetArg: 0.901 ± 0.348
MetSer: 1.301 ± 0.378
MetThr: 0.7 ± 0.23
MetVal: 1.401 ± 0.348
MetTrp: 0.1 ± 0.099
MetTyr: 1.301 ± 0.345
MetXaa: 0.0 ± 0.0

Asn
AsnAla: 3.102 ± 0.545
AsnCys: 1.001 ± 0.317
AsnAsp: 3.802 ± 0.616
AsnGlu: 6.204 ± 0.771
AsnPhe: 3.302 ± 0.501
AsnGly: 5.003 ± 0.766
AsnHis: 0.6 ± 0.24
AsnIle: 6.204 ± 1.019
AsnLys: 7.505 ± 0.799
AsnLeu: 7.004 ± 0.928
AsnMet: 2.001 ± 0.468
AsnAsn: 7.404 ± 1.102
AsnPro: 1.601 ± 0.385
AsnGln: 1.901 ± 0.268
AsnArg: 2.101 ± 0.466
AsnSer: 4.403 ± 0.679
AsnThr: 4.102 ± 0.897
AsnVal: 2.201 ± 0.442
AsnTrp: 0.5 ± 0.178
AsnTyr: 2.902 ± 0.576
AsnXaa: 0.0 ± 0.0

Pro
ProAla: 1.101 ± 0.295
ProCys: 0.3 ± 0.236
ProAsp: 1.001 ± 0.291
ProGlu: 0.7 ± 0.288
ProPhe: 1.001 ± 0.352
ProGly: 1.401 ± 0.333
ProHis: 0.6 ± 0.217
ProIle: 2.101 ± 0.527
ProLys: 2.201 ± 0.524
ProLeu: 1.801 ± 0.507
ProMet: 0.0 ± 0.0
ProAsn: 1.201 ± 0.343
ProPro: 0.7 ± 0.289
ProGln: 0.1 ± 0.102
ProArg: 0.4 ± 0.169
ProSer: 1.901 ± 0.469
ProThr: 1.301 ± 0.305
ProVal: 2.001 ± 0.42
ProTrp: 0.2 ± 0.113
ProTyr: 0.901 ± 0.281
ProXaa: 0.0 ± 0.0

Gln
GlnAla: 2.001 ± 0.512
GlnCys: 0.1 ± 0.096
GlnAsp: 1.101 ± 0.346
GlnGlu: 1.801 ± 0.46
GlnPhe: 1.401 ± 0.372
GlnGly: 1.901 ± 0.287
GlnHis: 0.2 ± 0.144
GlnIle: 1.601 ± 0.438
GlnLys: 3.002 ± 0.876
GlnLeu: 2.101 ± 0.426
GlnMet: 0.8 ± 0.287
GlnAsn: 2.602 ± 0.585
GlnPro: 0.5 ± 0.196
GlnGln: 1.101 ± 0.338
GlnArg: 0.901 ± 0.284
GlnSer: 1.801 ± 0.472
GlnThr: 0.6 ± 0.288
GlnVal: 1.201 ± 0.299
GlnTrp: 0.2 ± 0.145
GlnTyr: 1.101 ± 0.272
GlnXaa: 0.0 ± 0.0

Arg
ArgAla: 1.701 ± 0.418
ArgCys: 0.3 ± 0.271
ArgAsp: 1.901 ± 0.445
ArgGlu: 3.902 ± 0.853
ArgPhe: 2.201 ± 0.434
ArgGly: 1.101 ± 0.339
ArgHis: 0.2 ± 0.125
ArgIle: 3.802 ± 0.639
ArgLys: 5.103 ± 0.967
ArgLeu: 2.802 ± 0.666
ArgMet: 1.201 ± 0.344
ArgAsn: 2.401 ± 0.528
ArgPro: 1.001 ± 0.337
ArgGln: 1.101 ± 0.478
ArgArg: 2.401 ± 0.526
ArgSer: 1.501 ± 0.342
ArgThr: 1.701 ± 0.349
ArgVal: 2.101 ± 0.501
ArgTrp: 0.7 ± 0.265
ArgTyr: 1.301 ± 0.396
ArgXaa: 0.0 ± 0.0

Ser
SerAla: 2.602 ± 0.548
SerCys: 0.7 ± 0.251
SerAsp: 3.202 ± 0.469
SerGlu: 3.502 ± 0.519
SerPhe: 1.901 ± 0.465
SerGly: 3.902 ± 0.951
SerHis: 1.101 ± 0.289
SerIle: 4.603 ± 0.767
SerLys: 6.204 ± 0.657
SerLeu: 5.703 ± 0.877
SerMet: 0.7 ± 0.229
SerAsn: 4.203 ± 0.769
SerPro: 1.001 ± 0.281
SerGln: 1.401 ± 0.439
SerArg: 2.301 ± 0.512
SerSer: 3.602 ± 0.65
SerThr: 2.602 ± 0.36
SerVal: 2.001 ± 0.407
SerTrp: 0.3 ± 0.156
SerTyr: 2.401 ± 0.331
SerXaa: 0.0 ± 0.0

Thr
ThrAla: 2.902 ± 0.658
ThrCys: 0.1 ± 0.102
ThrAsp: 2.902 ± 0.521
ThrGlu: 3.502 ± 0.658
ThrPhe: 2.301 ± 0.441
ThrGly: 3.102 ± 0.896
ThrHis: 0.6 ± 0.324
ThrIle: 4.903 ± 0.694
ThrLys: 4.703 ± 0.848
ThrLeu: 4.303 ± 0.551
ThrMet: 0.901 ± 0.244
ThrAsn: 3.502 ± 0.555
ThrPro: 2.201 ± 0.555
ThrGln: 1.801 ± 0.436
ThrArg: 1.701 ± 0.497
ThrSer: 2.201 ± 0.451
ThrThr: 4.803 ± 0.589
ThrVal: 3.002 ± 0.605
ThrTrp: 0.5 ± 0.236
ThrTyr: 2.401 ± 0.381
ThrXaa: 0.0 ± 0.0

Val
ValAla: 2.201 ± 0.511
ValCys: 0.5 ± 0.225
ValAsp: 3.802 ± 0.622
ValGlu: 3.802 ± 0.648
ValPhe: 1.601 ± 0.393
ValGly: 3.902 ± 0.889
ValHis: 0.7 ± 0.272
ValIle: 3.102 ± 0.538
ValLys: 7.004 ± 0.926
ValLeu: 3.302 ± 0.505
ValMet: 1.201 ± 0.266
ValAsn: 3.602 ± 0.573
ValPro: 1.301 ± 0.336
ValGln: 1.401 ± 0.506
ValArg: 2.201 ± 0.553
ValSer: 3.002 ± 0.554
ValThr: 3.302 ± 1.036
ValVal: 3.102 ± 0.715
ValTrp: 0.2 ± 0.143
ValTyr: 1.701 ± 0.378
ValXaa: 0.0 ± 0.0

Trp
TrpAla: 0.2 ± 0.146
TrpCys: 0.1 ± 0.101
TrpAsp: 1.001 ± 0.243
TrpGlu: 0.6 ± 0.275
TrpPhe: 0.4 ± 0.205
TrpGly: 1.201 ± 0.307
TrpHis: 0.1 ± 0.102
TrpIle: 1.501 ± 0.377
TrpLys: 0.7 ± 0.243
TrpLeu: 0.8 ± 0.279
TrpMet: 0.1 ± 0.102
TrpAsn: 0.5 ± 0.247
TrpPro: 0.0 ± 0.0
TrpGln: 0.2 ± 0.157
TrpArg: 0.4 ± 0.195
TrpSer: 0.5 ± 0.212
TrpThr: 0.2 ± 0.141
TrpVal: 0.5 ± 0.284
TrpTrp: 0.1 ± 0.11
TrpTyr: 0.5 ± 0.241
TrpXaa: 0.0 ± 0.0

Tyr
TyrAla: 1.501 ± 0.442
TyrCys: 0.901 ± 0.362
TyrAsp: 3.402 ± 0.636
TyrGlu: 2.602 ± 0.547
TyrPhe: 2.201 ± 0.487
TyrGly: 2.101 ± 0.443
TyrHis: 0.8 ± 0.283
TyrIle: 4.503 ± 0.67
TyrLys: 4.803 ± 0.633
TyrLeu: 3.802 ± 0.641
TyrMet: 0.901 ± 0.294
TyrAsn: 2.702 ± 0.504
TyrPro: 1.001 ± 0.32
TyrGln: 1.001 ± 0.335
TyrArg: 2.001 ± 0.398
TyrSer: 2.902 ± 0.708
TyrThr: 1.401 ± 0.51
TyrVal: 1.201 ± 0.35
TyrTrp: 0.3 ± 0.219
TyrTyr: 2.802 ± 0.775
TyrXaa: 0.0 ± 0.0

Xaa
XaaAla: 0.0 ± 0.0
XaaCys: 0.0 ± 0.0
XaaAsp: 0.0 ± 0.0
XaaGlu: 0.0 ± 0.0
XaaPhe: 0.0 ± 0.0
XaaGly: 0.0 ± 0.0
XaaHis: 0.0 ± 0.0
XaaIle: 0.0 ± 0.0
XaaLys: 0.0 ± 0.0
XaaLeu: 0.0 ± 0.0
XaaMet: 0.0 ± 0.0
XaaAsn: 0.0 ± 0.0
XaaPro: 0.0 ± 0.0
XaaGln: 0.0 ± 0.0
XaaArg: 0.0 ± 0.0
XaaSer: 0.0 ± 0.0
XaaThr: 0.0 ± 0.0
XaaVal: 0.0 ± 0.0
XaaTrp: 0.0 ± 0.0
XaaTyr: 0.0 ± 0.0
XaaXaa: 0.0 ± 0.0

Statistics based on 43 proteins (9995 amino acids)

Note: The error has been estimated by bootstrapping (×100) at the protein level.
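Protein-level bootstrapping can be sketched as follows: whole proteins are resampled with replacement, the dipeptide frequency is recomputed for each resample, and the spread of the resulting estimates is taken as the error. This is a minimal sketch under stated assumptions, not the database's own code: the function names, the fixed random seed, and the use of the standard deviation as the reported error are assumptions; only the 100 rounds and the protein-level resampling come from the note above.

```python
import random
import statistics
from collections import Counter

def pair_counts(seq):
    """Count overlapping dipeptides within one protein sequence."""
    return Counter(seq[i:i + 2] for i in range(len(seq) - 1))

def bootstrap_error(sequences, pair, n_rounds=100, seed=0):
    """Estimate the error (per mille) of one dipeptide frequency.

    Hypothetical helper: resamples proteins, not individual residues,
    and returns the standard deviation of the bootstrap estimates.
    """
    rng = random.Random(seed)
    per_protein = [pair_counts(s) for s in sequences]
    estimates = []
    for _ in range(n_rounds):
        # Draw as many proteins as in the original set, with replacement.
        sample = rng.choices(per_protein, k=len(per_protein))
        total = sum(sum(c.values()) for c in sample)
        hits = sum(c[pair] for c in sample)
        estimates.append(1000.0 * hits / total if total else 0.0)
    return statistics.pstdev(estimates)

# Toy sequences (not the real proteome), one-letter code.
proteins = ["MKKLEKK", "MAAAKL", "MKLKKE"]
err = bootstrap_error(proteins, "KK")
print(f"KK bootstrap error: {err:.3f} per mille")
```

Resampling at the protein level keeps each protein's internal composition intact, so the error reflects variation between proteins rather than between individual residue pairs.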

The dipeptide statistics above (among other statistics for this proteome) can be downloaded from this CSV file
See this proteome in: uniprot_link
Proteome-pI is available under a Creative Commons Attribution-NoDerivs license; for more details see here

Reference: Kozlowski LP. Proteome-pI 2.0: Proteome Isoelectric Point Database Update. Nucleic Acids Res. 2021, doi: 10.1093/nar/gkab944
Contact: Lukasz P. Kozlowski