Amino acid dipeptide frequency for Geobacillus phage GBK1

Apart from single amino acid frequencies, one can also calculate the so-called amino acid dipeptide frequency.

There are 400 possible dipeptides (441 if Xaa is counted as an additional 21st amino acid representing all non-standard or unknown amino acids). Thus, if dipeptides occurred randomly in proteins, each amino acid dipeptide would be present at a frequency of 0.25% (2.5‰). As this is not the case in nature, for better visibility all dipeptides that are more frequent than expected are marked in red in the table, and those that are underrepresented are marked in blue.
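
The counting described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names, the toy sequences, and the choice to normalise by the number of overlapping pairs counted within each protein are hypothetical and are not taken from Proteome-pI, which may normalise differently.

```python
from collections import Counter
from itertools import product

STANDARD = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard amino acids (one-letter codes)
ALPHABET = STANDARD + "X"          # X stands for Xaa (non-standard or unknown)

# Uniform expectation from the text: 1 of 400 dipeptides = 0.25% = 2.5 per mille.
EXPECTED_PERMILLE = 1000.0 / 400


def dipeptide_permille(proteins):
    """Return {dipeptide: frequency in per mille} for a list of sequences."""
    counts = Counter()
    for seq in proteins:
        # Map every non-standard symbol to X so that it is counted as Xaa.
        seq = "".join(c if c in STANDARD else "X" for c in seq.upper())
        # Overlapping pairs; a pair never spans two different proteins.
        counts.update(seq[i:i + 2] for i in range(len(seq) - 1))
    total = sum(counts.values())  # assumption: normalise by pairs counted
    return {
        a + b: (1000.0 * counts[a + b] / total) if total else 0.0
        for a, b in product(ALPHABET, repeat=2)
    }


def classify(value, expected=EXPECTED_PERMILLE):
    """Label a per-mille value relative to the uniform expectation."""
    if value > expected:
        return "over"
    if value < expected:
        return "under"
    return "expected"


toy = ["ACAC", "AC"]  # hypothetical sequences, not taken from GBK1
freqs = dipeptide_permille(toy)
print(len(freqs), freqs["AC"], freqs["CA"])
print(classify(10.373), classify(0.076))  # GluLeu and CysCys values from the table
```

With the 2.5‰ threshold, a value such as GluLeu (10.373‰) falls on the overrepresented side and CysCys (0.076‰) on the underrepresented side.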

All values are presented in per mille (‰); they therefore need to be multiplied by 10⁻³ to obtain fractions.
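
As a small worked example of the unit conversion, the following sketch turns a per-mille value into a fraction and into a percentage. The helper names are hypothetical; the input is the GluLeu value from the table.

```python
def permille_to_fraction(value):
    """Convert a per-mille value to a fraction (multiply by 10^-3)."""
    return value * 1e-3


def permille_to_percent(value):
    """Convert a per-mille value to a percentage (divide by 10)."""
    return value / 10.0


glu_leu = 10.373  # GluLeu frequency from the table, in per mille
print(f"{permille_to_fraction(glu_leu):.6f}")  # 0.010373
print(f"{permille_to_percent(glu_leu):.4f}")   # 1.0373
```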

For more information, see the sequence space article on Wikipedia.

Column order (second residue): Ala Cys Asp Glu Phe Gly His Ile Lys Leu Met Asn Pro Gln Arg Ser Thr Val Trp Tyr Xaa
Ala
AlaAla: 0.681 ± 0.284
AlaCys: 0.303 ± 0.172
AlaAsp: 3.407 ± 0.55
AlaGlu: 4.618 ± 0.642
AlaPhe: 2.801 ± 0.393
AlaGly: 5.527 ± 0.68
AlaHis: 0.984 ± 0.256
AlaIle: 5.678 ± 0.64
AlaLys: 5.148 ± 0.671
AlaLeu: 7.117 ± 0.784
AlaMet: 2.044 ± 0.454
AlaAsn: 3.407 ± 0.461
AlaPro: 3.104 ± 0.389
AlaGln: 2.65 ± 0.424
AlaArg: 4.013 ± 0.582
AlaSer: 3.558 ± 1.045
AlaThr: 3.483 ± 0.589
AlaVal: 4.997 ± 0.721
AlaTrp: 0.757 ± 0.307
AlaTyr: 2.726 ± 0.387
AlaXaa: 0.0 ± 0.0
Cys
CysAla: 0.681 ± 0.232
CysCys: 0.076 ± 0.062
CysAsp: 0.53 ± 0.189
CysGlu: 0.681 ± 0.234
CysPhe: 0.227 ± 0.109
CysGly: 0.909 ± 0.305
CysHis: 0.227 ± 0.125
CysIle: 0.303 ± 0.19
CysLys: 0.606 ± 0.233
CysLeu: 1.136 ± 0.289
CysMet: 0.0 ± 0.0
CysAsn: 0.303 ± 0.197
CysPro: 0.53 ± 0.237
CysGln: 0.379 ± 0.18
CysArg: 0.606 ± 0.228
CysSer: 0.303 ± 0.153
CysThr: 0.606 ± 0.241
CysVal: 0.454 ± 0.163
CysTrp: 0.227 ± 0.114
CysTyr: 0.53 ± 0.273
CysXaa: 0.0 ± 0.0
Asp
AspAla: 3.256 ± 0.47
AspCys: 0.681 ± 0.323
AspAsp: 3.104 ± 0.526
AspGlu: 4.846 ± 0.786
AspPhe: 2.953 ± 0.437
AspGly: 3.937 ± 0.736
AspHis: 0.757 ± 0.243
AspIle: 3.558 ± 0.48
AspLys: 4.694 ± 0.633
AspLeu: 5.073 ± 0.51
AspMet: 1.211 ± 0.281
AspAsn: 2.498 ± 0.357
AspPro: 2.801 ± 0.394
AspGln: 0.909 ± 0.226
AspArg: 2.726 ± 0.486
AspSer: 2.726 ± 0.489
AspThr: 3.028 ± 0.445
AspVal: 2.801 ± 0.385
AspTrp: 0.53 ± 0.184
AspTyr: 2.877 ± 0.416
AspXaa: 0.0 ± 0.0
Glu
GluAla: 5.376 ± 0.635
GluCys: 0.606 ± 0.229
GluAsp: 3.028 ± 0.408
GluGlu: 8.328 ± 0.924
GluPhe: 3.104 ± 0.484
GluGly: 4.694 ± 0.56
GluHis: 1.666 ± 0.322
GluIle: 5.376 ± 0.463
GluLys: 7.723 ± 1.152
GluLeu: 10.373 ± 1.159
GluMet: 1.893 ± 0.328
GluAsn: 3.18 ± 0.536
GluPro: 1.893 ± 0.34
GluGln: 4.846 ± 0.672
GluArg: 4.921 ± 0.508
GluSer: 2.726 ± 0.454
GluThr: 3.256 ± 0.522
GluVal: 8.025 ± 0.691
GluTrp: 1.287 ± 0.274
GluTyr: 3.483 ± 0.452
GluXaa: 0.0 ± 0.0
Phe
PheAla: 2.498 ± 0.355
PheCys: 0.227 ± 0.155
PheAsp: 3.104 ± 0.419
PheGlu: 2.877 ± 0.403
PhePhe: 2.044 ± 0.45
PheGly: 3.256 ± 0.365
PheHis: 1.06 ± 0.225
PheIle: 2.726 ± 0.489
PheLys: 3.558 ± 0.583
PheLeu: 3.331 ± 0.491
PheMet: 1.59 ± 0.317
PheAsn: 2.574 ± 0.363
PhePro: 1.514 ± 0.285
PheGln: 1.817 ± 0.365
PheArg: 2.271 ± 0.453
PheSer: 2.271 ± 0.39
PheThr: 2.877 ± 0.432
PheVal: 2.726 ± 0.605
PheTrp: 0.303 ± 0.145
PheTyr: 2.044 ± 0.439
PheXaa: 0.0 ± 0.0
Gly
GlyAla: 4.618 ± 0.845
GlyCys: 0.53 ± 0.211
GlyAsp: 3.71 ± 0.561
GlyGlu: 5.678 ± 0.493
GlyPhe: 3.18 ± 0.441
GlyGly: 4.013 ± 0.593
GlyHis: 1.287 ± 0.312
GlyIle: 4.391 ± 0.713
GlyLys: 5.527 ± 0.609
GlyLeu: 5.376 ± 0.609
GlyMet: 0.833 ± 0.218
GlyAsn: 2.423 ± 0.427
GlyPro: 0.076 ± 0.064
GlyGln: 1.741 ± 0.386
GlyArg: 3.407 ± 0.528
GlySer: 3.861 ± 0.475
GlyThr: 4.164 ± 0.772
GlyVal: 4.77 ± 0.542
GlyTrp: 1.363 ± 0.345
GlyTyr: 3.104 ± 0.37
GlyXaa: 0.0 ± 0.0
His
HisAla: 0.757 ± 0.196
HisCys: 0.303 ± 0.236
HisAsp: 0.454 ± 0.188
HisGlu: 2.044 ± 0.405
HisPhe: 1.211 ± 0.326
HisGly: 1.439 ± 0.559
HisHis: 0.379 ± 0.18
HisIle: 1.136 ± 0.293
HisLys: 1.06 ± 0.332
HisLeu: 1.136 ± 0.317
HisMet: 0.151 ± 0.1
HisAsn: 0.833 ± 0.277
HisPro: 0.681 ± 0.236
HisGln: 0.454 ± 0.206
HisArg: 1.287 ± 0.301
HisSer: 1.363 ± 0.284
HisThr: 1.136 ± 0.37
HisVal: 1.211 ± 0.311
HisTrp: 0.303 ± 0.137
HisTyr: 0.984 ± 0.24
HisXaa: 0.0 ± 0.0
Ile
IleAla: 4.24 ± 0.503
IleCys: 0.454 ± 0.198
IleAsp: 5.376 ± 0.578
IleGlu: 7.95 ± 0.725
IlePhe: 2.044 ± 0.404
IleGly: 3.558 ± 0.509
IleHis: 1.211 ± 0.239
IleIle: 2.801 ± 0.43
IleLys: 4.997 ± 0.63
IleLeu: 3.71 ± 0.539
IleMet: 1.211 ± 0.229
IleAsn: 3.331 ± 0.49
IlePro: 2.044 ± 0.329
IleGln: 3.028 ± 0.521
IleArg: 3.558 ± 0.605
IleSer: 3.558 ± 0.48
IleThr: 3.634 ± 0.474
IleVal: 4.846 ± 0.627
IleTrp: 0.454 ± 0.153
IleTyr: 1.363 ± 0.266
IleXaa: 0.0 ± 0.0
Lys
LysAla: 5.678 ± 0.61
LysCys: 0.909 ± 0.27
LysAsp: 3.71 ± 0.535
LysGlu: 7.42 ± 0.912
LysPhe: 3.407 ± 0.478
LysGly: 3.634 ± 0.689
LysHis: 2.12 ± 0.381
LysIle: 5.073 ± 0.719
LysLys: 5.451 ± 0.739
LysLeu: 7.42 ± 0.889
LysMet: 2.953 ± 0.579
LysAsn: 3.937 ± 0.566
LysPro: 2.347 ± 0.46
LysGln: 3.256 ± 0.481
LysArg: 4.997 ± 0.737
LysSer: 3.18 ± 0.548
LysThr: 3.786 ± 0.551
LysVal: 6.663 ± 0.834
LysTrp: 0.984 ± 0.244
LysTyr: 2.877 ± 0.479
LysXaa: 0.0 ± 0.0
Leu
LeuAla: 6.435 ± 0.632
LeuCys: 0.909 ± 0.254
LeuAsp: 4.543 ± 0.514
LeuGlu: 7.798 ± 0.912
LeuPhe: 3.71 ± 0.42
LeuGly: 4.921 ± 0.636
LeuHis: 1.439 ± 0.316
LeuIle: 5.3 ± 0.663
LeuLys: 9.161 ± 0.858
LeuLeu: 6.814 ± 0.696
LeuMet: 1.893 ± 0.312
LeuAsn: 3.331 ± 0.479
LeuPro: 3.786 ± 0.501
LeuGln: 3.634 ± 0.473
LeuArg: 5.603 ± 0.684
LeuSer: 4.77 ± 0.67
LeuThr: 4.694 ± 0.519
LeuVal: 4.013 ± 0.472
LeuTrp: 0.833 ± 0.227
LeuTyr: 3.558 ± 0.499
LeuXaa: 0.0 ± 0.0
Met
MetAla: 2.423 ± 0.46
MetCys: 0.0 ± 0.0
MetAsp: 0.606 ± 0.205
MetGlu: 2.271 ± 0.387
MetPhe: 0.984 ± 0.274
MetGly: 1.211 ± 0.255
MetHis: 0.227 ± 0.114
MetIle: 1.136 ± 0.297
MetLys: 2.271 ± 0.418
MetLeu: 2.423 ± 0.468
MetMet: 0.757 ± 0.239
MetAsn: 1.363 ± 0.318
MetPro: 1.136 ± 0.278
MetGln: 0.53 ± 0.201
MetArg: 0.757 ± 0.276
MetSer: 1.439 ± 0.343
MetThr: 1.893 ± 0.326
MetVal: 1.666 ± 0.339
MetTrp: 0.0 ± 0.0
MetTyr: 0.833 ± 0.246
MetXaa: 0.0 ± 0.0
Asn
AsnAla: 3.558 ± 0.727
AsnCys: 0.379 ± 0.173
AsnAsp: 3.028 ± 0.483
AsnGlu: 3.028 ± 0.501
AsnPhe: 2.12 ± 0.394
AsnGly: 4.164 ± 0.415
AsnHis: 0.984 ± 0.257
AsnIle: 2.726 ± 0.486
AsnLys: 4.316 ± 0.647
AsnLeu: 3.786 ± 0.527
AsnMet: 0.757 ± 0.255
AsnAsn: 2.65 ± 0.621
AsnPro: 2.574 ± 0.414
AsnGln: 1.136 ± 0.299
AsnArg: 1.59 ± 0.31
AsnSer: 2.574 ± 0.384
AsnThr: 1.893 ± 0.465
AsnVal: 2.65 ± 0.368
AsnTrp: 0.606 ± 0.202
AsnTyr: 2.196 ± 0.398
AsnXaa: 0.0 ± 0.0
Pro
ProAla: 2.498 ± 0.349
ProCys: 0.303 ± 0.135
ProAsp: 1.514 ± 0.285
ProGlu: 4.013 ± 0.546
ProPhe: 1.969 ± 0.332
ProGly: 0.076 ± 0.082
ProHis: 0.909 ± 0.21
ProIle: 2.196 ± 0.453
ProLys: 1.514 ± 0.38
ProLeu: 3.256 ± 0.471
ProMet: 0.454 ± 0.168
ProAsn: 1.439 ± 0.328
ProPro: 1.741 ± 0.356
ProGln: 1.514 ± 0.292
ProArg: 1.666 ± 0.308
ProSer: 2.271 ± 0.402
ProThr: 2.196 ± 0.543
ProVal: 2.196 ± 0.469
ProTrp: 0.606 ± 0.214
ProTyr: 1.741 ± 0.384
ProXaa: 0.0 ± 0.0
Gln
GlnAla: 3.634 ± 0.507
GlnCys: 0.379 ± 0.156
GlnAsp: 2.196 ± 0.352
GlnGlu: 2.953 ± 0.465
GlnPhe: 1.287 ± 0.298
GlnGly: 3.028 ± 0.527
GlnHis: 0.757 ± 0.218
GlnIle: 2.044 ± 0.312
GlnLys: 2.271 ± 0.49
GlnLeu: 3.861 ± 0.623
GlnMet: 1.211 ± 0.282
GlnAsn: 1.59 ± 0.382
GlnPro: 1.287 ± 0.339
GlnGln: 1.969 ± 0.336
GlnArg: 2.801 ± 0.546
GlnSer: 1.666 ± 0.348
GlnThr: 1.514 ± 0.351
GlnVal: 2.574 ± 0.426
GlnTrp: 0.454 ± 0.166
GlnTyr: 0.909 ± 0.266
GlnXaa: 0.0 ± 0.0
Arg
ArgAla: 4.618 ± 0.514
ArgCys: 0.757 ± 0.269
ArgAsp: 2.953 ± 0.47
ArgGlu: 6.208 ± 0.884
ArgPhe: 2.65 ± 0.406
ArgGly: 3.18 ± 0.436
ArgHis: 0.606 ± 0.226
ArgIle: 3.104 ± 0.347
ArgLys: 5.754 ± 0.756
ArgLeu: 3.861 ± 0.529
ArgMet: 1.439 ± 0.326
ArgAsn: 3.104 ± 0.459
ArgPro: 1.136 ± 0.277
ArgGln: 1.817 ± 0.368
ArgArg: 3.18 ± 0.486
ArgSer: 1.514 ± 0.326
ArgThr: 2.423 ± 0.477
ArgVal: 3.71 ± 0.554
ArgTrp: 0.681 ± 0.181
ArgTyr: 1.741 ± 0.341
ArgXaa: 0.0 ± 0.0
Ser
SerAla: 2.877 ± 0.623
SerCys: 0.227 ± 0.145
SerAsp: 2.196 ± 0.331
SerGlu: 3.71 ± 0.695
SerPhe: 3.256 ± 0.881
SerGly: 3.937 ± 0.676
SerHis: 0.833 ± 0.211
SerIle: 3.18 ± 0.423
SerLys: 3.937 ± 0.526
SerLeu: 4.543 ± 0.599
SerMet: 1.439 ± 0.342
SerAsn: 2.423 ± 0.434
SerPro: 1.439 ± 0.54
SerGln: 2.196 ± 0.445
SerArg: 2.726 ± 0.409
SerSer: 3.937 ± 0.876
SerThr: 3.028 ± 0.513
SerVal: 3.331 ± 0.391
SerTrp: 0.681 ± 0.28
SerTyr: 2.044 ± 0.447
SerXaa: 0.0 ± 0.0
Thr
ThrAla: 3.786 ± 0.626
ThrCys: 0.757 ± 0.228
ThrAsp: 3.786 ± 0.58
ThrGlu: 3.18 ± 0.496
ThrPhe: 2.423 ± 0.519
ThrGly: 4.618 ± 0.725
ThrHis: 0.833 ± 0.255
ThrIle: 3.786 ± 0.524
ThrLys: 3.104 ± 0.443
ThrLeu: 4.164 ± 0.614
ThrMet: 0.909 ± 0.206
ThrAsn: 2.498 ± 0.493
ThrPro: 3.028 ± 0.394
ThrGln: 1.666 ± 0.319
ThrArg: 2.423 ± 0.4
ThrSer: 2.801 ± 0.625
ThrThr: 2.801 ± 0.639
ThrVal: 4.088 ± 0.62
ThrTrp: 0.833 ± 0.321
ThrTyr: 2.196 ± 0.404
ThrXaa: 0.0 ± 0.0
Val
ValAla: 5.83 ± 0.685
ValCys: 0.909 ± 0.259
ValAsp: 4.316 ± 0.527
ValGlu: 4.316 ± 0.617
ValPhe: 3.104 ± 0.421
ValGly: 4.013 ± 0.526
ValHis: 0.984 ± 0.279
ValIle: 5.754 ± 0.659
ValLys: 5.376 ± 0.63
ValLeu: 4.997 ± 0.605
ValMet: 1.666 ± 0.403
ValAsn: 3.407 ± 0.492
ValPro: 2.044 ± 0.371
ValGln: 2.801 ± 0.38
ValArg: 3.256 ± 0.544
ValSer: 4.088 ± 0.889
ValThr: 4.164 ± 0.541
ValVal: 4.921 ± 0.564
ValTrp: 0.681 ± 0.179
ValTyr: 3.18 ± 0.557
ValXaa: 0.0 ± 0.0
Trp
TrpAla: 0.606 ± 0.273
TrpCys: 0.227 ± 0.119
TrpAsp: 0.833 ± 0.248
TrpGlu: 0.757 ± 0.233
TrpPhe: 0.151 ± 0.113
TrpGly: 0.757 ± 0.202
TrpHis: 0.303 ± 0.202
TrpIle: 0.984 ± 0.277
TrpLys: 0.909 ± 0.212
TrpLeu: 0.984 ± 0.329
TrpMet: 0.379 ± 0.172
TrpAsn: 0.606 ± 0.26
TrpPro: 0.0 ± 0.0
TrpGln: 0.303 ± 0.136
TrpArg: 0.681 ± 0.158
TrpSer: 0.984 ± 0.218
TrpThr: 0.606 ± 0.208
TrpVal: 0.984 ± 0.298
TrpTrp: 0.227 ± 0.122
TrpTyr: 0.909 ± 0.244
TrpXaa: 0.0 ± 0.0
Tyr
TyrAla: 2.953 ± 0.56
TyrCys: 0.379 ± 0.166
TyrAsp: 2.574 ± 0.455
TyrGlu: 3.104 ± 0.423
TyrPhe: 2.044 ± 0.313
TyrGly: 3.18 ± 0.437
TyrHis: 0.757 ± 0.225
TyrIle: 2.12 ± 0.474
TyrLys: 2.498 ± 0.522
TyrLeu: 3.786 ± 0.717
TyrMet: 1.06 ± 0.215
TyrAsn: 1.817 ± 0.393
TyrPro: 0.909 ± 0.239
TyrGln: 1.741 ± 0.432
TyrArg: 1.969 ± 0.302
TyrSer: 2.347 ± 0.436
TyrThr: 2.498 ± 0.592
TyrVal: 3.18 ± 0.458
TyrTrp: 0.379 ± 0.152
TyrTyr: 1.136 ± 0.249
TyrXaa: 0.0 ± 0.0
Xaa
XaaAla: 0.0 ± 0.0
XaaCys: 0.0 ± 0.0
XaaAsp: 0.0 ± 0.0
XaaGlu: 0.0 ± 0.0
XaaPhe: 0.0 ± 0.0
XaaGly: 0.0 ± 0.0
XaaHis: 0.0 ± 0.0
XaaIle: 0.0 ± 0.0
XaaLys: 0.0 ± 0.0
XaaLeu: 0.0 ± 0.0
XaaMet: 0.0 ± 0.0
XaaAsn: 0.0 ± 0.0
XaaPro: 0.0 ± 0.0
XaaGln: 0.0 ± 0.0
XaaArg: 0.0 ± 0.0
XaaSer: 0.0 ± 0.0
XaaThr: 0.0 ± 0.0
XaaVal: 0.0 ± 0.0
XaaTrp: 0.0 ± 0.0
XaaTyr: 0.0 ± 0.0
XaaXaa: 0.0 ± 0.0
Statistics based on 56 proteins (13209 amino acids)

Note: The error has been estimated by bootstrapping (×100) at the protein level.
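
A protein-level bootstrap of this kind could look roughly like the sketch below. The page only states that bootstrapping (×100) was done at the protein level, so everything else here is an assumption made for illustration: proteins are resampled with replacement, the frequency is recomputed for each resample, and the sample standard deviation of the 100 estimates is reported as the error. The function names and toy sequences are hypothetical.

```python
import random
from statistics import stdev


def pair_permille(proteins, pair):
    """Frequency of one dipeptide (one-letter codes, e.g. "AC") in per mille."""
    hits = 0
    total = 0
    for seq in proteins:
        n_pairs = max(len(seq) - 1, 0)
        total += n_pairs
        hits += sum(1 for i in range(n_pairs) if seq[i:i + 2] == pair)
    return 1000.0 * hits / total if total else 0.0


def bootstrap_error(proteins, pair, n_rounds=100, seed=0):
    """Spread of the frequency over n_rounds protein-level resamples."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    estimates = []
    for _ in range(n_rounds):
        # Resample whole proteins with replacement (assumed procedure).
        sample = [rng.choice(proteins) for _ in proteins]
        estimates.append(pair_permille(sample, pair))
    return stdev(estimates)


toy = ["ACACAC", "CCCC", "ACDA", "AAAA"]  # hypothetical sequences, not from GBK1
print(pair_permille(toy, "AC"), bootstrap_error(toy, "AC"))
```

Resampling whole proteins rather than individual residues keeps each sequence intact, so dipeptides are never formed across protein boundaries.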

The dipeptide statistics above (among other statistics for this proteome) can be downloaded from this CSV file
See this proteome in: uniprot_link
Proteome-pI is available under Creative Commons Attribution-NoDerivs license, for more details see here

Reference: Kozlowski LP. Proteome-pI 2.0: Proteome Isoelectric Point Database Update. Nucleic Acids Res. 2021, doi: 10.1093/nar/gkab944
Contact: Lukasz P. Kozlowski