Amino acid dipeptide frequency for Arthrobacter phage Beagle

Apart from single amino acid frequencies, one can also calculate so-called amino acid dipeptide frequencies, i.e. how often each ordered pair of adjacent amino acids occurs.
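As a rough illustration of what a dipeptide frequency is, the sketch below counts overlapping pairs of adjacent residues in a set of protein sequences and reports them in per mille. The function name, the toy sequences, the one-letter codes, and the choice to normalize by the total number of dipeptides are all assumptions made for this example; the database may compute its values differently.

```python
from collections import Counter


def dipeptide_frequencies(proteins):
    """Return dipeptide frequencies in per mille for a list of sequences.

    Assumption: frequencies are normalized by the total number of
    dipeptides found across all proteins.
    """
    counts = Counter()
    for seq in proteins:
        # Overlapping windows of length 2; pairs never span two proteins.
        for i in range(len(seq) - 1):
            counts[seq[i:i + 2]] += 1
    total = sum(counts.values())
    if total == 0:
        return {}
    return {pair: 1000.0 * n / total for pair, n in counts.items()}


# Hypothetical toy sequences in one-letter code (not taken from the proteome).
toy = ["MAAG", "AAL"]
freqs = dipeptide_frequencies(toy)
```

Here the toy input contains five dipeptides in total (MA, AA, AG, AA, AL), so the frequencies over all observed pairs sum to 1000 per mille.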

There are 400 possibilities (441 if we consider Xaa as an additional 21st amino acid representing all non-standard or unknown amino acids). Thus, if dipeptides occurred randomly in proteins, each of the 400 amino acid dipeptides should be present with a frequency of 0.25% (2.5‰). As this is not the case in nature, for better visibility all overrepresented dipeptides are marked in red, and those which are underrepresented are marked in blue in the table.
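The arithmetic behind the uniform expectation can be sketched as follows. The function names are hypothetical, and the plain greater-than or less-than comparison is an assumption: the page does not state the exact rule used for its red and blue marking. The sample values are copied from the table on this page.

```python
def uniform_expectation_per_mille(alphabet_size=20):
    """Expected frequency of each dipeptide, in per mille, if all
    alphabet_size**2 ordered pairs were equally likely."""
    return 1000.0 / alphabet_size ** 2


def classify(value_per_mille, expected_per_mille):
    """Label a dipeptide relative to the uniform expectation (assumed rule)."""
    if value_per_mille > expected_per_mille:
        return "over"
    if value_per_mille < expected_per_mille:
        return "under"
    return "as expected"


expected_20 = uniform_expectation_per_mille(20)  # 400 possible dipeptides
expected_21 = uniform_expectation_per_mille(21)  # 441 when Xaa is included

# A few values taken from the table (per mille).
samples = {"AlaAla": 14.89, "GlyGly": 6.534, "CysCys": 0.178}
labels = {name: classify(v, expected_20) for name, v in samples.items()}
```

With 20 amino acids the expectation is 1000 / 400 = 2.5 per mille, so AlaAla and GlyGly come out above it and CysCys below it.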

All values are presented in per mille (‰) and therefore need to be multiplied by 10^-3 to obtain fractions.
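The unit conversion is simple enough to state in two lines; the helper names below are made up for this example.

```python
def per_mille_to_fraction(value_per_mille):
    """Convert a per mille value to a plain fraction (multiply by 10^-3)."""
    return value_per_mille * 1e-3


def per_mille_to_percent(value_per_mille):
    """Convert a per mille value to percent (divide by 10)."""
    return value_per_mille / 10.0


# AlaAla is listed as 14.89 per mille in the table.
ala_ala_fraction = per_mille_to_fraction(14.89)
ala_ala_percent = per_mille_to_percent(14.89)
```

For instance, the AlaAla value of 14.89‰ corresponds to a fraction of about 0.01489, or roughly 1.489%.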

For more information, see the sequence space article on Wikipedia.

Ala Cys Asp Glu Phe Gly His Ile Lys Leu Met Asn Pro Gln Arg Ser Thr Val Trp Tyr Xaa
Ala
AlaAla: 14.89 ± 1.262
AlaCys: 0.844 ± 0.205
AlaAsp: 7.423 ± 0.582
AlaGlu: 8.311 ± 0.814
AlaPhe: 2.933 ± 0.365
AlaGly: 8.934 ± 0.582
AlaHis: 2.133 ± 0.279
AlaIle: 4.845 ± 0.539
AlaLys: 5.289 ± 0.531
AlaLeu: 10.889 ± 0.742
AlaMet: 3.467 ± 0.365
AlaAsn: 3.645 ± 0.496
AlaPro: 6.089 ± 0.56
AlaGln: 3.867 ± 0.508
AlaArg: 6.089 ± 0.557
AlaSer: 5.067 ± 0.354
AlaThr: 5.911 ± 0.566
AlaVal: 7.2 ± 0.672
AlaTrp: 2.133 ± 0.257
AlaTyr: 2.8 ± 0.356
AlaXaa: 0.0 ± 0.0
Cys
CysAla: 0.622 ± 0.164
CysCys: 0.178 ± 0.141
CysAsp: 0.756 ± 0.254
CysGlu: 0.8 ± 0.222
CysPhe: 0.089 ± 0.064
CysGly: 1.333 ± 0.239
CysHis: 0.533 ± 0.157
CysIle: 0.356 ± 0.133
CysLys: 0.356 ± 0.119
CysLeu: 0.578 ± 0.2
CysMet: 0.222 ± 0.097
CysAsn: 0.133 ± 0.073
CysPro: 0.533 ± 0.167
CysGln: 0.444 ± 0.137
CysArg: 1.022 ± 0.254
CysSer: 0.756 ± 0.184
CysThr: 0.533 ± 0.184
CysVal: 0.489 ± 0.141
CysTrp: 0.222 ± 0.092
CysTyr: 0.178 ± 0.081
CysXaa: 0.0 ± 0.0
Asp
AspAla: 8.223 ± 0.474
AspCys: 0.489 ± 0.213
AspAsp: 3.289 ± 0.4
AspGlu: 3.6 ± 0.311
AspPhe: 2.0 ± 0.409
AspGly: 7.334 ± 0.593
AspHis: 1.289 ± 0.257
AspIle: 2.8 ± 0.374
AspLys: 2.311 ± 0.3
AspLeu: 5.156 ± 0.469
AspMet: 1.511 ± 0.247
AspAsn: 2.222 ± 0.37
AspPro: 4.445 ± 0.514
AspGln: 1.422 ± 0.219
AspArg: 3.2 ± 0.378
AspSer: 2.933 ± 0.364
AspThr: 3.645 ± 0.48
AspVal: 3.689 ± 0.377
AspTrp: 1.244 ± 0.234
AspTyr: 1.645 ± 0.235
AspXaa: 0.0 ± 0.0
Glu
GluAla: 8.089 ± 0.85
GluCys: 0.711 ± 0.203
GluAsp: 3.733 ± 0.399
GluGlu: 4.711 ± 0.441
GluPhe: 2.533 ± 0.332
GluGly: 4.8 ± 0.573
GluHis: 1.956 ± 0.329
GluIle: 3.911 ± 0.5
GluLys: 2.622 ± 0.415
GluLeu: 4.267 ± 0.487
GluMet: 1.911 ± 0.255
GluAsn: 2.045 ± 0.319
GluPro: 3.111 ± 0.478
GluGln: 2.711 ± 0.316
GluArg: 5.022 ± 0.534
GluSer: 4.045 ± 0.401
GluThr: 3.645 ± 0.429
GluVal: 4.0 ± 0.396
GluTrp: 1.467 ± 0.268
GluTyr: 1.822 ± 0.26
GluXaa: 0.0 ± 0.0
Phe
PheAla: 3.333 ± 0.398
PheCys: 0.311 ± 0.102
PheAsp: 2.578 ± 0.333
PheGlu: 2.045 ± 0.25
PhePhe: 0.844 ± 0.196
PheGly: 2.8 ± 0.46
PheHis: 0.622 ± 0.157
PheIle: 1.645 ± 0.323
PheLys: 1.111 ± 0.348
PheLeu: 1.911 ± 0.266
PheMet: 1.022 ± 0.187
PheAsn: 1.156 ± 0.257
PhePro: 1.556 ± 0.258
PheGln: 0.933 ± 0.2
PheArg: 1.689 ± 0.224
PheSer: 1.556 ± 0.267
PheThr: 2.578 ± 0.453
PheVal: 1.911 ± 0.223
PheTrp: 0.756 ± 0.325
PheTyr: 1.067 ± 0.249
PheXaa: 0.0 ± 0.0
Gly
GlyAla: 6.445 ± 0.532
GlyCys: 0.711 ± 0.244
GlyAsp: 5.067 ± 0.485
GlyGlu: 5.778 ± 0.541
GlyPhe: 3.2 ± 0.505
GlyGly: 6.534 ± 0.702
GlyHis: 2.133 ± 0.291
GlyIle: 3.733 ± 0.391
GlyLys: 4.089 ± 0.415
GlyLeu: 6.134 ± 0.532
GlyMet: 1.867 ± 0.299
GlyAsn: 2.845 ± 0.333
GlyPro: 3.6 ± 0.391
GlyGln: 3.378 ± 0.587
GlyArg: 5.022 ± 0.595
GlySer: 4.134 ± 0.388
GlyThr: 6.756 ± 0.881
GlyVal: 5.289 ± 0.563
GlyTrp: 1.867 ± 0.294
GlyTyr: 2.889 ± 0.322
GlyXaa: 0.0 ± 0.0
His
HisAla: 2.533 ± 0.341
HisCys: 0.4 ± 0.133
HisAsp: 1.2 ± 0.202
HisGlu: 1.645 ± 0.316
HisPhe: 1.333 ± 0.231
HisGly: 1.689 ± 0.228
HisHis: 0.756 ± 0.18
HisIle: 0.844 ± 0.168
HisLys: 0.622 ± 0.194
HisLeu: 1.822 ± 0.301
HisMet: 0.444 ± 0.141
HisAsn: 0.667 ± 0.186
HisPro: 1.911 ± 0.365
HisGln: 0.533 ± 0.187
HisArg: 1.511 ± 0.332
HisSer: 1.2 ± 0.22
HisThr: 1.778 ± 0.295
HisVal: 1.333 ± 0.259
HisTrp: 0.533 ± 0.141
HisTyr: 0.8 ± 0.167
HisXaa: 0.0 ± 0.0
Ile
IleAla: 4.845 ± 0.524
IleCys: 0.4 ± 0.147
IleAsp: 3.511 ± 0.429
IleGlu: 3.467 ± 0.396
IlePhe: 1.2 ± 0.214
IleGly: 3.245 ± 0.531
IleHis: 1.511 ± 0.282
IleIle: 2.445 ± 0.371
IleLys: 1.778 ± 0.283
IleLeu: 3.333 ± 0.447
IleMet: 1.111 ± 0.25
IleAsn: 1.778 ± 0.349
IlePro: 2.533 ± 0.387
IleGln: 1.778 ± 0.263
IleArg: 3.422 ± 0.436
IleSer: 3.156 ± 0.347
IleThr: 3.822 ± 0.318
IleVal: 3.022 ± 0.315
IleTrp: 0.978 ± 0.174
IleTyr: 0.756 ± 0.193
IleXaa: 0.0 ± 0.0
Lys
LysAla: 6.0 ± 0.611
LysCys: 0.578 ± 0.178
LysAsp: 1.822 ± 0.279
LysGlu: 2.667 ± 0.375
LysPhe: 1.645 ± 0.28
LysGly: 2.578 ± 0.318
LysHis: 0.8 ± 0.205
LysIle: 1.556 ± 0.368
LysLys: 1.956 ± 0.333
LysLeu: 3.111 ± 0.332
LysMet: 1.022 ± 0.2
LysAsn: 1.289 ± 0.301
LysPro: 2.489 ± 0.369
LysGln: 1.689 ± 0.251
LysArg: 2.356 ± 0.357
LysSer: 2.178 ± 0.386
LysThr: 2.533 ± 0.348
LysVal: 2.756 ± 0.374
LysTrp: 0.667 ± 0.165
LysTyr: 0.8 ± 0.178
LysXaa: 0.0 ± 0.0
Leu
LeuAla: 9.423 ± 0.839
LeuCys: 0.933 ± 0.198
LeuAsp: 5.245 ± 0.488
LeuGlu: 5.156 ± 0.591
LeuPhe: 1.867 ± 0.3
LeuGly: 5.778 ± 0.643
LeuHis: 1.645 ± 0.232
LeuIle: 3.333 ± 0.337
LeuLys: 3.511 ± 0.467
LeuLeu: 6.134 ± 0.613
LeuMet: 1.6 ± 0.261
LeuAsn: 2.8 ± 0.341
LeuPro: 3.733 ± 0.36
LeuGln: 2.133 ± 0.275
LeuArg: 5.289 ± 0.727
LeuSer: 4.578 ± 0.452
LeuThr: 5.734 ± 0.404
LeuVal: 5.022 ± 0.512
LeuTrp: 1.2 ± 0.224
LeuTyr: 1.422 ± 0.27
LeuXaa: 0.0 ± 0.0
Met
MetAla: 2.578 ± 0.31
MetCys: 0.222 ± 0.106
MetAsp: 1.556 ± 0.205
MetGlu: 1.467 ± 0.252
MetPhe: 0.756 ± 0.15
MetGly: 2.4 ± 0.365
MetHis: 0.4 ± 0.161
MetIle: 0.933 ± 0.248
MetLys: 1.111 ± 0.223
MetLeu: 1.556 ± 0.253
MetMet: 0.444 ± 0.12
MetAsn: 0.844 ± 0.169
MetPro: 1.422 ± 0.228
MetGln: 0.4 ± 0.13
MetArg: 1.556 ± 0.309
MetSer: 2.267 ± 0.282
MetThr: 2.4 ± 0.39
MetVal: 1.333 ± 0.24
MetTrp: 0.267 ± 0.115
MetTyr: 0.444 ± 0.12
MetXaa: 0.0 ± 0.0
Asn
AsnAla: 3.422 ± 0.439
AsnCys: 0.222 ± 0.083
AsnAsp: 1.6 ± 0.235
AsnGlu: 2.267 ± 0.477
AsnPhe: 1.289 ± 0.267
AsnGly: 3.733 ± 0.468
AsnHis: 0.622 ± 0.18
AsnIle: 2.178 ± 0.457
AsnLys: 0.978 ± 0.188
AsnLeu: 2.445 ± 0.295
AsnMet: 0.444 ± 0.144
AsnAsn: 0.8 ± 0.191
AsnPro: 1.911 ± 0.316
AsnGln: 0.8 ± 0.191
AsnArg: 2.267 ± 0.363
AsnSer: 1.956 ± 0.347
AsnThr: 2.533 ± 0.431
AsnVal: 2.311 ± 0.338
AsnTrp: 0.8 ± 0.264
AsnTyr: 0.533 ± 0.151
AsnXaa: 0.0 ± 0.0
Pro
ProAla: 6.489 ± 0.583
ProCys: 0.578 ± 0.181
ProAsp: 3.422 ± 0.427
ProGlu: 4.267 ± 0.428
ProPhe: 1.556 ± 0.274
ProGly: 4.4 ± 0.473
ProHis: 1.333 ± 0.252
ProIle: 2.133 ± 0.307
ProLys: 2.311 ± 0.321
ProLeu: 3.156 ± 0.326
ProMet: 1.244 ± 0.266
ProAsn: 1.289 ± 0.223
ProPro: 3.2 ± 0.509
ProGln: 1.511 ± 0.229
ProArg: 2.0 ± 0.32
ProSer: 3.6 ± 0.411
ProThr: 4.311 ± 0.349
ProVal: 4.756 ± 0.563
ProTrp: 0.8 ± 0.216
ProTyr: 1.6 ± 0.303
ProXaa: 0.0 ± 0.0
Gln
GlnAla: 3.2 ± 0.322
GlnCys: 0.311 ± 0.14
GlnAsp: 1.2 ± 0.195
GlnGlu: 2.222 ± 0.31
GlnPhe: 1.244 ± 0.21
GlnGly: 2.489 ± 0.527
GlnHis: 0.667 ± 0.177
GlnIle: 1.689 ± 0.27
GlnLys: 1.645 ± 0.273
GlnLeu: 2.667 ± 0.349
GlnMet: 0.978 ± 0.184
GlnAsn: 0.844 ± 0.18
GlnPro: 1.378 ± 0.283
GlnGln: 1.822 ± 0.565
GlnArg: 2.845 ± 0.344
GlnSer: 1.111 ± 0.271
GlnThr: 1.689 ± 0.274
GlnVal: 1.911 ± 0.296
GlnTrp: 0.978 ± 0.17
GlnTyr: 1.022 ± 0.225
GlnXaa: 0.0 ± 0.0
Arg
ArgAla: 5.734 ± 0.541
ArgCys: 0.8 ± 0.182
ArgAsp: 4.134 ± 0.458
ArgGlu: 4.222 ± 0.394
ArgPhe: 1.778 ± 0.334
ArgGly: 4.578 ± 0.49
ArgHis: 1.867 ± 0.287
ArgIle: 2.8 ± 0.351
ArgLys: 2.4 ± 0.36
ArgLeu: 5.334 ± 0.547
ArgMet: 1.689 ± 0.276
ArgAsn: 2.222 ± 0.394
ArgPro: 2.978 ± 0.363
ArgGln: 2.089 ± 0.254
ArgArg: 4.978 ± 0.521
ArgSer: 3.645 ± 0.398
ArgThr: 4.089 ± 0.44
ArgVal: 4.934 ± 0.51
ArgTrp: 1.645 ± 0.274
ArgTyr: 1.467 ± 0.25
ArgXaa: 0.0 ± 0.0
Ser
SerAla: 6.845 ± 0.555
SerCys: 0.489 ± 0.144
SerAsp: 3.378 ± 0.367
SerGlu: 3.333 ± 0.282
SerPhe: 2.045 ± 0.254
SerGly: 4.534 ± 0.444
SerHis: 0.978 ± 0.169
SerIle: 2.8 ± 0.409
SerLys: 2.089 ± 0.411
SerLeu: 4.134 ± 0.538
SerMet: 1.778 ± 0.276
SerAsn: 1.867 ± 0.311
SerPro: 3.245 ± 0.402
SerGln: 1.244 ± 0.201
SerArg: 4.045 ± 0.442
SerSer: 3.067 ± 0.414
SerThr: 3.956 ± 0.51
SerVal: 3.511 ± 0.461
SerTrp: 0.8 ± 0.182
SerTyr: 1.556 ± 0.26
SerXaa: 0.0 ± 0.0
Thr
ThrAla: 8.578 ± 0.946
ThrCys: 0.8 ± 0.209
ThrAsp: 4.489 ± 0.404
ThrGlu: 4.222 ± 0.455
ThrPhe: 2.622 ± 0.366
ThrGly: 6.222 ± 0.58
ThrHis: 1.778 ± 0.271
ThrIle: 3.467 ± 0.35
ThrLys: 2.711 ± 0.4
ThrLeu: 6.045 ± 0.475
ThrMet: 1.422 ± 0.261
ThrAsn: 2.311 ± 0.323
ThrPro: 4.178 ± 0.485
ThrGln: 1.156 ± 0.194
ThrArg: 3.2 ± 0.378
ThrSer: 4.089 ± 0.42
ThrThr: 5.067 ± 0.538
ThrVal: 5.778 ± 0.513
ThrTrp: 1.111 ± 0.223
ThrTyr: 1.689 ± 0.244
ThrXaa: 0.0 ± 0.0
Val
ValAla: 7.2 ± 0.655
ValCys: 0.311 ± 0.129
ValAsp: 4.8 ± 0.562
ValGlu: 4.4 ± 0.501
ValPhe: 1.511 ± 0.259
ValGly: 4.845 ± 0.507
ValHis: 1.733 ± 0.311
ValIle: 4.089 ± 0.407
ValLys: 2.578 ± 0.347
ValLeu: 4.578 ± 0.399
ValMet: 1.378 ± 0.298
ValAsn: 2.667 ± 0.423
ValPro: 3.2 ± 0.46
ValGln: 2.267 ± 0.311
ValArg: 4.711 ± 0.553
ValSer: 3.6 ± 0.419
ValThr: 6.667 ± 0.603
ValVal: 4.578 ± 0.525
ValTrp: 1.378 ± 0.252
ValTyr: 1.2 ± 0.21
ValXaa: 0.0 ± 0.0
Trp
TrpAla: 1.467 ± 0.227
TrpCys: 0.4 ± 0.153
TrpAsp: 1.6 ± 0.226
TrpGlu: 1.111 ± 0.224
TrpPhe: 0.267 ± 0.112
TrpGly: 1.511 ± 0.212
TrpHis: 0.444 ± 0.139
TrpIle: 1.022 ± 0.277
TrpLys: 0.578 ± 0.176
TrpLeu: 1.689 ± 0.257
TrpMet: 0.533 ± 0.138
TrpAsn: 0.8 ± 0.284
TrpPro: 0.844 ± 0.22
TrpGln: 0.667 ± 0.183
TrpArg: 1.289 ± 0.179
TrpSer: 1.2 ± 0.236
TrpThr: 1.467 ± 0.28
TrpVal: 1.822 ± 0.306
TrpTrp: 0.489 ± 0.12
TrpTyr: 0.578 ± 0.168
TrpXaa: 0.0 ± 0.0
Tyr
TyrAla: 2.845 ± 0.447
TyrCys: 0.444 ± 0.141
TyrAsp: 1.645 ± 0.241
TyrGlu: 1.422 ± 0.28
TyrPhe: 0.711 ± 0.17
TyrGly: 1.733 ± 0.272
TyrHis: 0.356 ± 0.122
TyrIle: 1.511 ± 0.27
TyrLys: 0.444 ± 0.132
TyrLeu: 1.645 ± 0.351
TyrMet: 0.222 ± 0.095
TyrAsn: 0.889 ± 0.188
TyrPro: 1.6 ± 0.337
TyrGln: 1.067 ± 0.25
TyrArg: 1.867 ± 0.265
TyrSer: 1.556 ± 0.241
TyrThr: 1.822 ± 0.289
TyrVal: 2.045 ± 0.339
TyrTrp: 0.444 ± 0.144
TyrTyr: 0.889 ± 0.22
TyrXaa: 0.0 ± 0.0
Xaa
XaaAla: 0.0 ± 0.0
XaaCys: 0.0 ± 0.0
XaaAsp: 0.0 ± 0.0
XaaGlu: 0.0 ± 0.0
XaaPhe: 0.0 ± 0.0
XaaGly: 0.0 ± 0.0
XaaHis: 0.0 ± 0.0
XaaIle: 0.0 ± 0.0
XaaLys: 0.0 ± 0.0
XaaLeu: 0.0 ± 0.0
XaaMet: 0.0 ± 0.0
XaaAsn: 0.0 ± 0.0
XaaPro: 0.0 ± 0.0
XaaGln: 0.0 ± 0.0
XaaArg: 0.0 ± 0.0
XaaSer: 0.0 ± 0.0
XaaThr: 0.0 ± 0.0
XaaVal: 0.0 ± 0.0
XaaTrp: 0.0 ± 0.0
XaaTyr: 0.0 ± 0.0
XaaXaa: 0.0 ± 0.0
Statistics based on 131 proteins (22500 amino acids)

Note: The error has been estimated by bootstrapping (100 resamples) at the protein level.
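A protein-level bootstrap can be sketched as below: whole proteins are resampled with replacement, the frequency of one dipeptide is recomputed for each resample, and the spread of those estimates is taken as the error. This is only an illustration under stated assumptions. The function names, the toy sequences, the one-letter codes, the fixed seed, and the use of the sample standard deviation as the reported error are not taken from the source, which does not describe its exact procedure.

```python
import random
from statistics import stdev


def pair_frequency(proteins, pair):
    """Frequency of one dipeptide, in per mille, over a list of sequences."""
    hits = 0
    total = 0
    for seq in proteins:
        for i in range(len(seq) - 1):
            total += 1
            if seq[i:i + 2] == pair:
                hits += 1
    return 1000.0 * hits / total if total else 0.0


def bootstrap_error(proteins, pair, n_resamples=100, seed=0):
    """Estimate the error of a dipeptide frequency by resampling proteins.

    Assumption: the error is the standard deviation of the frequency
    across n_resamples bootstrap replicates.
    """
    rng = random.Random(seed)
    estimates = []
    for _ in range(n_resamples):
        # Draw as many proteins as in the original set, with replacement.
        resample = rng.choices(proteins, k=len(proteins))
        estimates.append(pair_frequency(resample, pair))
    return stdev(estimates)


# Hypothetical toy proteome (not taken from Arthrobacter phage Beagle).
toy = ["MAAG", "AAL", "GLK", "AAAA"]
error_aa = bootstrap_error(toy, "AA")
```

Resampling whole proteins rather than individual residues keeps each sequence intact, so dipeptides that cluster within particular proteins contribute to the estimated spread.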

The dipeptide statistics above (along with other statistics for this proteome) can be downloaded from this CSV file
See this proteome in: uniprot_link
Proteome-pI is available under the Creative Commons Attribution-NoDerivs license; for more details see here

Reference: Kozlowski LP. Proteome-pI 2.0: Proteome Isoelectric Point Database Update. Nucleic Acids Res. 2021, doi: 10.1093/nar/gkab944
Contact: Lukasz P. Kozlowski