Amino acid dipeptide frequency for Thermus phage phiOH16

Apart from single amino acid frequencies, one can also calculate the so-called amino acid dipeptide frequency.
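As a rough illustration of what a dipeptide frequency is, the sketch below counts every overlapping pair of adjacent residues within each protein and divides by the total number of pairs. The function name `dipeptide_frequencies`, the use of one-letter codes, and the toy sequences are assumptions made for this example; the page does not state how the database itself performs the count.

```python
from collections import Counter

def dipeptide_frequencies(proteins):
    """Return {dipeptide: fraction} over all overlapping residue pairs.

    Hypothetical helper: pairs are counted inside each protein only,
    never across the boundary between two proteins.
    """
    counts = Counter()
    for seq in proteins:
        for i in range(len(seq) - 1):
            counts[seq[i:i + 2]] += 1
    total = sum(counts.values())
    if total == 0:
        return {}
    return {pair: n / total for pair, n in counts.items()}

# Toy sequences in one-letter code, not taken from the phage proteome.
toy = ["MALLA", "ALV"]
freqs = dipeptide_frequencies(toy)
```

Multiplying each fraction by 1000 gives values in per mille, the unit used in the table.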

There are 400 possible dipeptides (441 if Xaa is counted as an additional, 21st amino acid representing all non-standard or unknown amino acids). Thus, if dipeptides occurred randomly with equal probability in proteins, each amino acid dipeptide would be present at a frequency of 0.25% (1/400, i.e. 2.5‰). As this is not the case in nature, for better visibility, dipeptides that occur more often than expected are marked in red in the table, and those that are underrepresented are marked in blue.

All values are given in per mille (‰), so they need to be multiplied by 10⁻³ to obtain fractions.
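The arithmetic behind the expected value and the unit conversion can be sketched as follows. The helper names are hypothetical, and the classification rule (a plain comparison against the uniform expectation of 2.5‰, ignoring the reported error) is an assumption; the page does not say what threshold it uses for the red and blue marking. The three example values are taken from the table.

```python
STANDARD_AMINO_ACIDS = 20

def expected_per_mille(alphabet_size=STANDARD_AMINO_ACIDS):
    """Uniform expectation for one dipeptide, in per mille."""
    return 1000.0 / alphabet_size ** 2

def per_mille_to_fraction(value):
    """Convert a per mille value to a plain fraction."""
    return value * 1e-3

def classify(observed_per_mille, expected=None):
    """Label a value relative to the uniform expectation (assumed rule)."""
    if expected is None:
        expected = expected_per_mille()
    if observed_per_mille > expected:
        return "over"
    if observed_per_mille < expected:
        return "under"
    return "expected"

# Values copied from the table, in per mille.
examples = {"LeuLeu": 15.625, "AlaMet": 0.521, "AlaLys": 2.604}
labels = {pair: classify(value) for pair, value in examples.items()}
```

With 21 symbols (Xaa included), `expected_per_mille(21)` gives 1000/441, roughly 2.27‰, instead of 2.5‰.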

For more information, see the sequence space article on Wikipedia.

| First \ Second | Ala | Cys | Asp | Glu | Phe | Gly | His | Ile | Lys | Leu | Met | Asn | Pro | Gln | Arg | Ser | Thr | Val | Trp | Tyr | Xaa |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Ala | 8.333 ± 1.56 | 0.0 ± 0.0 | 3.646 ± 1.369 | 3.646 ± 1.053 | 6.771 ± 1.192 | 7.292 ± 2.582 | 2.604 ± 0.921 | 5.729 ± 1.928 | 2.604 ± 1.107 | 12.5 ± 3.047 | 0.521 ± 0.489 | 1.562 ± 0.76 | 4.688 ± 1.211 | 3.125 ± 1.451 | 11.979 ± 3.978 | 4.167 ± 1.421 | 6.25 ± 0.626 | 12.5 ± 2.793 | 5.729 ± 1.706 | 6.25 ± 1.689 | 0.0 ± 0.0 |
| Cys | 0.521 ± 0.397 | 0.0 ± 0.0 | 0.521 ± 0.397 | 0.0 ± 0.0 | 0.521 ± 0.461 | 0.521 ± 0.461 | 0.0 ± 0.0 | 1.042 ± 0.793 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.521 ± 0.397 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.521 ± 0.397 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.521 ± 0.461 | 0.0 ± 0.0 |
| Asp | 7.812 ± 1.79 | 0.0 ± 0.0 | 0.0 ± 0.0 | 2.604 ± 1.255 | 1.562 ± 0.487 | 6.25 ± 2.12 | 0.0 ± 0.0 | 1.042 ± 0.793 | 1.562 ± 0.751 | 6.25 ± 1.627 | 0.0 ± 0.0 | 0.521 ± 0.397 | 6.25 ± 2.274 | 1.042 ± 0.647 | 5.208 ± 1.989 | 1.562 ± 0.918 | 3.125 ± 0.865 | 5.208 ± 0.623 | 2.083 ± 0.773 | 1.562 ± 0.404 | 0.0 ± 0.0 |
| Glu | 10.938 ± 2.318 | 0.0 ± 0.0 | 0.521 ± 0.44 | 4.167 ± 1.222 | 0.521 ± 0.757 | 5.208 ± 2.5 | 1.042 ± 0.408 | 1.562 ± 0.782 | 1.042 ± 0.408 | 3.646 ± 2.169 | 0.521 ± 0.44 | 1.562 ± 0.404 | 3.125 ± 0.814 | 1.042 ± 0.793 | 2.083 ± 0.85 | 1.042 ± 0.923 | 0.0 ± 0.0 | 7.292 ± 1.703 | 2.083 ± 0.579 | 2.083 ± 0.85 | 0.0 ± 0.0 |
| Phe | 4.688 ± 1.901 | 0.0 ± 0.0 | 2.604 ± 1.961 | 1.562 ± 0.968 | 0.0 ± 0.0 | 5.729 ± 2.102 | 0.521 ± 0.515 | 1.562 ± 0.76 | 2.083 ± 0.648 | 1.562 ± 1.111 | 1.042 ± 0.88 | 0.521 ± 0.592 | 2.083 ± 0.36 | 0.0 ± 0.0 | 1.562 ± 0.871 | 1.562 ± 0.985 | 3.125 ± 0.944 | 2.604 ± 1.012 | 4.167 ± 0.948 | 1.562 ± 0.636 | 0.0 ± 0.0 |
| Gly | 5.729 ± 1.882 | 0.0 ± 0.0 | 3.646 ± 1.364 | 2.083 ± 1.76 | 3.646 ± 0.565 | 5.729 ± 1.454 | 1.042 ± 0.683 | 3.125 ± 0.917 | 2.604 ± 1.047 | 11.979 ± 3.077 | 2.604 ± 0.979 | 1.042 ± 0.576 | 4.688 ± 1.774 | 0.0 ± 0.0 | 8.333 ± 1.091 | 6.25 ± 1.903 | 3.646 ± 1.799 | 9.896 ± 1.321 | 2.083 ± 1.289 | 0.521 ± 0.397 | 0.0 ± 0.0 |
| His | 2.083 ± 1.151 | 0.0 ± 0.0 | 1.042 ± 0.755 | 1.042 ± 0.872 | 1.042 ± 0.793 | 1.562 ± 1.32 | 0.0 ± 0.0 | 0.521 ± 0.44 | 0.0 ± 0.0 | 1.042 ± 0.688 | 0.521 ± 0.461 | 0.0 ± 0.0 | 1.042 ± 0.793 | 0.521 ± 0.515 | 2.083 ± 1.223 | 0.0 ± 0.0 | 0.0 ± 0.0 | 1.562 ± 0.999 | 0.521 ± 0.592 | 0.521 ± 0.44 | 0.0 ± 0.0 |
| Ile | 3.646 ± 1.316 | 0.521 ± 0.397 | 2.083 ± 1.266 | 0.521 ± 0.461 | 2.604 ± 1.449 | 3.125 ± 1.645 | 0.0 ± 0.0 | 2.083 ± 1.375 | 0.0 ± 0.0 | 3.125 ± 0.955 | 0.521 ± 0.461 | 0.521 ± 0.397 | 3.646 ± 1.149 | 1.042 ± 0.793 | 4.688 ± 1.536 | 2.604 ± 1.242 | 1.042 ± 1.184 | 3.646 ± 1.536 | 0.521 ± 0.44 | 0.521 ± 0.515 | 0.0 ± 0.0 |
| Lys | 3.125 ± 1.598 | 0.521 ± 0.397 | 2.083 ± 0.85 | 0.521 ± 0.502 | 0.521 ± 0.397 | 4.167 ± 1.313 | 0.0 ± 0.0 | 0.0 ± 0.0 | 1.042 ± 0.552 | 3.125 ± 0.467 | 0.521 ± 0.38 | 0.0 ± 0.0 | 1.042 ± 0.576 | 1.042 ± 0.408 | 2.604 ± 1.107 | 2.604 ± 1.062 | 0.521 ± 0.515 | 4.167 ± 1.65 | 0.521 ± 0.461 | 0.0 ± 0.0 | 0.0 ± 0.0 |
| Leu | 13.021 ± 2.98 | 0.521 ± 0.461 | 6.771 ± 1.405 | 4.167 ± 1.944 | 1.042 ± 0.552 | 7.292 ± 1.459 | 2.604 ± 1.584 | 3.646 ± 1.762 | 4.167 ± 1.427 | 15.625 ± 3.636 | 2.083 ± 1.179 | 2.604 ± 0.791 | 9.375 ± 1.175 | 4.688 ± 0.921 | 6.25 ± 2.799 | 5.208 ± 1.273 | 3.646 ± 1.4 | 10.417 ± 2.985 | 2.083 ± 1.409 | 3.125 ± 1.304 | 0.0 ± 0.0 |
| Met | 1.042 ± 0.683 | 0.0 ± 0.0 | 1.042 ± 0.612 | 0.521 ± 0.757 | 0.0 ± 0.0 | 1.042 ± 0.755 | 0.0 ± 0.0 | 1.562 ± 1.033 | 0.521 ± 0.502 | 0.0 ± 0.0 | 1.042 ± 0.552 | 0.521 ± 0.54 | 0.521 ± 0.44 | 1.042 ± 0.923 | 1.042 ± 0.88 | 1.042 ± 0.408 | 1.562 ± 0.631 | 0.521 ± 0.44 | 0.521 ± 0.461 | 0.521 ± 0.44 | 0.0 ± 0.0 |
| Asn | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.521 ± 0.397 | 0.521 ± 0.44 | 1.042 ± 0.552 | 1.562 ± 0.954 | 0.0 ± 0.0 | 0.521 ± 0.397 | 0.0 ± 0.0 | 2.083 ± 0.555 | 0.0 ± 0.0 | 0.521 ± 0.397 | 6.771 ± 2.537 | 1.042 ± 0.688 | 1.042 ± 0.596 | 0.521 ± 0.461 | 1.042 ± 0.647 | 2.083 ± 1.053 | 0.0 ± 0.0 | 2.083 ± 0.959 | 0.0 ± 0.0 |
| Pro | 6.25 ± 1.219 | 0.0 ± 0.0 | 7.292 ± 3.266 | 8.333 ± 2.349 | 2.083 ± 0.799 | 4.688 ± 0.825 | 1.562 ± 0.886 | 4.167 ± 1.283 | 0.521 ± 0.461 | 4.688 ± 1.204 | 0.521 ± 0.44 | 3.646 ± 1.382 | 7.292 ± 1.628 | 2.604 ± 1.439 | 1.562 ± 0.735 | 3.125 ± 0.808 | 2.083 ± 1.027 | 5.729 ± 1.208 | 3.125 ± 0.807 | 3.125 ± 1.457 | 0.0 ± 0.0 |
| Gln | 4.688 ± 0.672 | 0.521 ± 0.397 | 0.521 ± 0.44 | 1.562 ± 0.882 | 2.083 ± 0.69 | 2.604 ± 0.996 | 0.521 ± 0.461 | 0.521 ± 0.592 | 1.562 ± 0.782 | 3.646 ± 1.692 | 0.0 ± 0.0 | 1.562 ± 0.851 | 0.521 ± 0.397 | 1.562 ± 1.19 | 2.604 ± 1.271 | 2.604 ± 0.484 | 1.042 ± 0.596 | 3.125 ± 0.997 | 2.604 ± 1.615 | 1.562 ± 0.487 | 0.0 ± 0.0 |
| Arg | 7.292 ± 2.964 | 0.521 ± 0.397 | 3.646 ± 1.526 | 4.688 ± 2.476 | 5.208 ± 2.13 | 5.208 ± 0.718 | 0.521 ± 0.44 | 1.562 ± 0.487 | 3.646 ± 1.345 | 6.25 ± 2.022 | 1.042 ± 0.758 | 1.562 ± 0.782 | 5.208 ± 0.825 | 4.688 ± 1.326 | 5.208 ± 1.181 | 4.167 ± 1.944 | 2.083 ± 0.831 | 7.812 ± 2.229 | 1.042 ± 0.793 | 3.646 ± 1.249 | 0.0 ± 0.0 |
| Ser | 6.771 ± 1.491 | 1.042 ± 0.923 | 2.083 ± 0.676 | 1.562 ± 0.404 | 3.646 ± 0.673 | 4.167 ± 0.564 | 2.083 ± 1.132 | 0.521 ± 0.461 | 2.604 ± 1.196 | 7.292 ± 0.815 | 0.0 ± 0.43 | 1.042 ± 0.408 | 3.646 ± 1.542 | 1.042 ± 0.505 | 3.646 ± 1.34 | 1.562 ± 0.735 | 3.125 ± 1.58 | 3.646 ± 1.059 | 1.562 ± 1.32 | 2.604 ± 0.853 | 0.0 ± 0.0 |
| Thr | 2.083 ± 0.787 | 1.042 ± 0.793 | 4.167 ± 1.514 | 1.562 ± 0.594 | 3.125 ± 1.493 | 3.125 ± 1.228 | 0.521 ± 0.44 | 2.083 ± 0.885 | 0.521 ± 0.502 | 4.167 ± 1.112 | 0.521 ± 0.461 | 1.042 ± 0.793 | 2.083 ± 0.776 | 1.042 ± 0.505 | 2.604 ± 1.216 | 4.167 ± 1.515 | 0.521 ± 0.397 | 2.083 ± 1.075 | 0.521 ± 0.54 | 1.042 ± 0.505 | 0.0 ± 0.0 |
| Val | 9.896 ± 2.541 | 0.0 ± 0.0 | 8.333 ± 1.545 | 7.292 ± 1.672 | 1.042 ± 0.843 | 5.208 ± 1.3 | 1.562 ± 0.751 | 4.688 ± 2.03 | 3.646 ± 0.858 | 11.458 ± 2.954 | 0.521 ± 0.489 | 2.083 ± 0.811 | 5.208 ± 1.039 | 6.771 ± 1.931 | 7.292 ± 2.057 | 6.25 ± 2.074 | 1.562 ± 0.594 | 8.854 ± 2.957 | 1.042 ± 0.793 | 4.167 ± 1.596 | 0.0 ± 0.0 |
| Trp | 3.125 ± 1.289 | 0.0 ± 0.0 | 1.562 ± 0.404 | 1.042 ± 0.88 | 0.521 ± 0.54 | 2.083 ± 0.579 | 0.521 ± 0.44 | 0.0 ± 0.0 | 0.0 ± 0.0 | 6.771 ± 2.542 | 1.042 ± 0.772 | 0.521 ± 0.592 | 2.083 ± 1.216 | 2.083 ± 0.36 | 2.604 ± 1.03 | 2.604 ± 0.801 | 1.042 ± 0.793 | 3.125 ± 1.488 | 0.0 ± 0.0 | 1.562 ± 0.674 | 0.0 ± 0.0 |
| Tyr | 8.854 ± 1.986 | 0.0 ± 0.0 | 1.562 ± 0.621 | 2.083 ± 1.275 | 1.562 ± 0.674 | 2.083 ± 0.69 | 0.0 ± 0.0 | 0.521 ± 0.757 | 0.0 ± 0.0 | 3.125 ± 0.798 | 0.521 ± 0.44 | 0.521 ± 0.397 | 2.604 ± 0.996 | 1.042 ± 0.576 | 2.604 ± 0.938 | 2.604 ± 0.484 | 2.604 ± 0.938 | 2.604 ± 1.585 | 2.083 ± 0.765 | 1.042 ± 0.505 | 0.0 ± 0.0 |
| Xaa | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 |
Statistics based on 9 proteins (1921 amino acids)

Note: The error was estimated by bootstrapping (×100) at the protein level.
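A protein-level bootstrap can be sketched as below: whole proteins are resampled with replacement, the dipeptide frequency is recomputed for each resample, and the spread of the 100 estimates serves as the error. This is only a minimal illustration under assumptions; the function names, the fixed seed, and the choice of the sample standard deviation as the spread measure are not taken from the page, which does not describe the exact procedure.

```python
import random
import statistics

def pair_per_mille(proteins, pair):
    """Frequency of one dipeptide, in per mille, over overlapping pairs."""
    hits = 0
    total = 0
    for seq in proteins:
        for i in range(len(seq) - 1):
            total += 1
            if seq[i:i + 2] == pair:
                hits += 1
    return 1000.0 * hits / total if total else 0.0

def bootstrap_error(proteins, pair, rounds=100, seed=0):
    """Spread of the frequency over protein-level resamples (assumed: stdev)."""
    rng = random.Random(seed)
    estimates = []
    for _ in range(rounds):
        # Resample whole proteins with replacement, keeping the set size fixed.
        sample = rng.choices(proteins, k=len(proteins))
        estimates.append(pair_per_mille(sample, pair))
    return statistics.stdev(estimates)

# Toy sequences in one-letter code, not taken from the phage proteome.
toy = ["MALLA", "ALV", "GGLLK", "MKAL"]
error_al = bootstrap_error(toy, "AL")
```

Resampling proteins rather than individual residues keeps each sequence intact, so the estimate reflects variation between proteins in a small proteome such as this one.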

The dipeptide statistics above (among other statistics for this proteome) can be downloaded from this CSV file
See this proteome in: uniprot_link
Proteome-pI is available under the Creative Commons Attribution-NoDerivs license; for more details see here

Reference: Kozlowski LP. Proteome-pI 2.0: Proteome Isoelectric Point Database Update. Nucleic Acids Res. 2021, doi: 10.1093/nar/gkab944
Contact: Lukasz P. Kozlowski