Amino acid dipeptide frequency for Microbacterium phage WaterT

Apart from single amino acid frequencies, one can also calculate the so-called amino acid dipeptide frequency, i.e. how often each ordered pair of adjacent amino acids occurs.
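As a minimal sketch of how such a frequency could be computed: the function below counts overlapping pairs of adjacent residues within each protein and expresses them in per mille. The function name, the one-letter input sequences, and the choice to normalize by the total number of counted pairs are assumptions made for illustration; the exact procedure used for this table is not described on this page.

```python
from collections import Counter


def dipeptide_per_mille(proteins):
    """Return {dipeptide: frequency in per mille} for a list of protein sequences.

    Assumption: overlapping pairs are counted within each protein only
    (never across protein boundaries) and normalized by the total pair count.
    """
    counts = Counter()
    for seq in proteins:
        for i in range(len(seq) - 1):
            counts[seq[i:i + 2]] += 1
    total = sum(counts.values())
    if total == 0:
        return {}
    return {pair: 1000.0 * n / total for pair, n in counts.items()}


# Toy input (hypothetical sequences, one-letter codes).
freqs = dipeptide_per_mille(["MAAG", "AAL"])
```

In this toy input there are five pairs in total (MA, AA, AG, AA, AL), so AA accounts for two of five.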

There are 400 possible dipeptides (441 if Xaa is counted as an additional 21st amino acid representing all non-standard or unknown amino acids). Thus, if dipeptides occurred randomly in proteins, each would be present at a frequency of 0.25% (2.5‰). As this is not the case in nature, for better visibility all dipeptides that occur more often than expected are marked in red, and those that are underrepresented are marked in blue in the table.
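The expected value and the colour rule can be sketched as follows. The baseline of 1/400 and the red/blue meaning come from the text above; the function name and the simple greater-than/less-than comparison (with no tolerance band) are assumptions for illustration.

```python
N_STANDARD = 20  # standard amino acids, excluding Xaa

# Uniform expectation: 1 / (20 * 20) = 0.25 % = 2.5 per mille.
EXPECTED_PER_MILLE = 1000.0 / N_STANDARD ** 2


def colour(observed_per_mille, expected=EXPECTED_PER_MILLE):
    """Return 'red' if over-represented, 'blue' if under-represented."""
    if observed_per_mille > expected:
        return "red"
    if observed_per_mille < expected:
        return "blue"
    return "none"


# Two values taken from the table: AlaAla = 9.852, CysCys = 0.109 (per mille).
ala_ala = colour(9.852)
cys_cys = colour(0.109)
```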

All values are given in per mille (‰) and therefore need to be multiplied by 10^-3 to obtain fractions.
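As a small worked example of the unit conversion, using the AlaAla value from the table (the helper names are hypothetical):

```python
def per_mille_to_fraction(value):
    """Convert a per mille value to a plain fraction (multiply by 10^-3)."""
    return value * 1e-3


def per_mille_to_percent(value):
    """Convert a per mille value to percent (divide by 10)."""
    return value / 10.0


# AlaAla is listed as 9.852 per mille.
fraction = per_mille_to_fraction(9.852)
percent = per_mille_to_percent(9.852)
```

So 9.852‰ corresponds to a fraction of about 0.009852, or about 0.9852%.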

For more information, see the sequence space article on Wikipedia.

Ala Cys Asp Glu Phe Gly His Ile Lys Leu Met Asn Pro Gln Arg Ser Thr Val Trp Tyr Xaa
Ala
AlaAla: 9.852 ± 1.421
AlaCys: 0.49 ± 0.162
AlaAsp: 6.858 ± 0.612
AlaGlu: 6.314 ± 0.726
AlaPhe: 2.885 ± 0.412
AlaGly: 8.273 ± 0.985
AlaHis: 1.578 ± 0.31
AlaIle: 5.661 ± 0.584
AlaLys: 3.756 ± 0.439
AlaLeu: 8.273 ± 0.851
AlaMet: 3.32 ± 0.379
AlaAsn: 4.409 ± 0.602
AlaPro: 3.538 ± 0.525
AlaGln: 3.484 ± 0.572
AlaArg: 4.681 ± 0.566
AlaSer: 5.987 ± 0.86
AlaThr: 6.858 ± 0.767
AlaVal: 6.641 ± 0.645
AlaTrp: 1.796 ± 0.303
AlaTyr: 3.048 ± 0.452
AlaXaa: 0.0 ± 0.0
Cys
CysAla: 0.435 ± 0.18
CysCys: 0.109 ± 0.071
CysAsp: 0.272 ± 0.115
CysGlu: 0.435 ± 0.225
CysPhe: 0.218 ± 0.097
CysGly: 0.653 ± 0.24
CysHis: 0.109 ± 0.076
CysIle: 0.272 ± 0.119
CysLys: 0.272 ± 0.14
CysLeu: 0.218 ± 0.106
CysMet: 0.163 ± 0.103
CysAsn: 0.327 ± 0.131
CysPro: 0.109 ± 0.076
CysGln: 0.109 ± 0.073
CysArg: 0.327 ± 0.138
CysSer: 0.218 ± 0.108
CysThr: 0.272 ± 0.124
CysVal: 0.49 ± 0.147
CysTrp: 0.109 ± 0.076
CysTyr: 0.327 ± 0.122
CysXaa: 0.0 ± 0.0
Asp
AspAla: 6.586 ± 0.76
AspCys: 0.49 ± 0.184
AspAsp: 4.246 ± 0.51
AspGlu: 5.389 ± 0.577
AspPhe: 2.722 ± 0.396
AspGly: 5.77 ± 0.62
AspHis: 0.98 ± 0.232
AspIle: 2.504 ± 0.295
AspLys: 2.395 ± 0.416
AspLeu: 6.423 ± 0.566
AspMet: 1.633 ± 0.292
AspAsn: 2.776 ± 0.389
AspPro: 3.375 ± 0.504
AspGln: 1.578 ± 0.303
AspArg: 3.647 ± 0.508
AspSer: 4.79 ± 0.489
AspThr: 3.81 ± 0.431
AspVal: 3.973 ± 0.548
AspTrp: 1.415 ± 0.267
AspTyr: 3.048 ± 0.501
AspXaa: 0.0 ± 0.0
Glu
GluAla: 7.511 ± 0.869
GluCys: 0.49 ± 0.168
GluAsp: 4.627 ± 0.575
GluGlu: 5.497 ± 0.663
GluPhe: 3.103 ± 0.462
GluGly: 4.79 ± 0.523
GluHis: 0.925 ± 0.247
GluIle: 4.899 ± 0.635
GluLys: 3.756 ± 0.543
GluLeu: 4.627 ± 0.501
GluMet: 2.504 ± 0.343
GluAsn: 2.939 ± 0.366
GluPro: 3.103 ± 0.408
GluGln: 2.123 ± 0.335
GluArg: 3.266 ± 0.41
GluSer: 4.354 ± 0.549
GluThr: 3.048 ± 0.401
GluVal: 4.79 ± 0.497
GluTrp: 1.96 ± 0.405
GluTyr: 2.613 ± 0.356
GluXaa: 0.0 ± 0.0
Phe
PheAla: 3.266 ± 0.497
PheCys: 0.218 ± 0.099
PheAsp: 2.558 ± 0.384
PheGlu: 3.211 ± 0.419
PhePhe: 1.143 ± 0.23
PheGly: 3.429 ± 0.444
PheHis: 0.708 ± 0.187
PheIle: 1.578 ± 0.312
PheLys: 0.762 ± 0.211
PheLeu: 2.341 ± 0.437
PheMet: 0.381 ± 0.129
PheAsn: 1.633 ± 0.298
PhePro: 1.524 ± 0.273
PheGln: 0.98 ± 0.252
PheArg: 2.286 ± 0.373
PheSer: 2.613 ± 0.393
PheThr: 2.449 ± 0.321
PheVal: 2.232 ± 0.326
PheTrp: 0.925 ± 0.242
PheTyr: 1.524 ± 0.328
PheXaa: 0.0 ± 0.0
Gly
GlyAla: 6.695 ± 0.797
GlyCys: 0.435 ± 0.15
GlyAsp: 6.042 ± 0.69
GlyGlu: 6.042 ± 0.652
GlyPhe: 3.592 ± 0.389
GlyGly: 7.348 ± 0.775
GlyHis: 1.089 ± 0.274
GlyIle: 4.191 ± 0.429
GlyLys: 3.592 ± 0.523
GlyLeu: 6.314 ± 0.603
GlyMet: 1.47 ± 0.301
GlyAsn: 3.211 ± 0.385
GlyPro: 3.157 ± 0.739
GlyGln: 3.103 ± 0.537
GlyArg: 4.572 ± 0.386
GlySer: 4.681 ± 0.585
GlyThr: 6.532 ± 0.732
GlyVal: 7.239 ± 0.633
GlyTrp: 1.96 ± 0.287
GlyTyr: 3.701 ± 0.464
GlyXaa: 0.0 ± 0.0
His
HisAla: 1.197 ± 0.232
HisCys: 0.163 ± 0.096
HisAsp: 1.089 ± 0.246
HisGlu: 0.98 ± 0.308
HisPhe: 0.435 ± 0.145
HisGly: 1.034 ± 0.239
HisHis: 0.381 ± 0.14
HisIle: 0.762 ± 0.238
HisLys: 0.435 ± 0.131
HisLeu: 1.578 ± 0.344
HisMet: 0.218 ± 0.122
HisAsn: 0.435 ± 0.157
HisPro: 0.708 ± 0.219
HisGln: 0.435 ± 0.169
HisArg: 1.252 ± 0.33
HisSer: 0.871 ± 0.26
HisThr: 1.034 ± 0.231
HisVal: 1.089 ± 0.273
HisTrp: 0.544 ± 0.17
HisTyr: 0.381 ± 0.146
HisXaa: 0.0 ± 0.0
Ile
IleAla: 5.062 ± 0.66
IleCys: 0.435 ± 0.19
IleAsp: 3.919 ± 0.507
IleGlu: 4.518 ± 0.526
IlePhe: 1.578 ± 0.309
IleGly: 3.647 ± 0.451
IleHis: 0.925 ± 0.252
IleIle: 1.578 ± 0.27
IleLys: 1.851 ± 0.283
IleLeu: 3.266 ± 0.421
IleMet: 0.653 ± 0.174
IleAsn: 2.014 ± 0.367
IlePro: 2.885 ± 0.398
IleGln: 2.068 ± 0.356
IleArg: 3.592 ± 0.462
IleSer: 3.484 ± 0.348
IleThr: 2.994 ± 0.411
IleVal: 3.048 ± 0.401
IleTrp: 1.143 ± 0.207
IleTyr: 1.252 ± 0.304
IleXaa: 0.0 ± 0.0
Lys
LysAla: 4.137 ± 0.483
LysCys: 0.054 ± 0.048
LysAsp: 2.341 ± 0.386
LysGlu: 2.722 ± 0.452
LysPhe: 1.306 ± 0.262
LysGly: 3.375 ± 0.452
LysHis: 1.034 ± 0.271
LysIle: 2.286 ± 0.378
LysLys: 2.177 ± 0.339
LysLeu: 3.048 ± 0.44
LysMet: 1.197 ± 0.301
LysAsn: 0.925 ± 0.221
LysPro: 2.504 ± 0.464
LysGln: 1.524 ± 0.259
LysArg: 3.647 ± 0.451
LysSer: 2.613 ± 0.447
LysThr: 1.796 ± 0.259
LysVal: 3.157 ± 0.407
LysTrp: 0.49 ± 0.163
LysTyr: 1.524 ± 0.244
LysXaa: 0.0 ± 0.0
Leu
LeuAla: 7.784 ± 0.768
LeuCys: 0.163 ± 0.09
LeuAsp: 5.443 ± 0.483
LeuGlu: 4.463 ± 0.457
LeuPhe: 1.796 ± 0.271
LeuGly: 6.096 ± 0.626
LeuHis: 1.034 ± 0.256
LeuIle: 3.647 ± 0.35
LeuLys: 3.973 ± 0.531
LeuLeu: 5.389 ± 0.592
LeuMet: 1.687 ± 0.254
LeuAsn: 3.266 ± 0.454
LeuPro: 3.484 ± 0.451
LeuGln: 2.286 ± 0.391
LeuArg: 4.899 ± 0.573
LeuSer: 5.879 ± 0.571
LeuThr: 5.879 ± 0.638
LeuVal: 4.627 ± 0.517
LeuTrp: 1.578 ± 0.297
LeuTyr: 1.851 ± 0.334
LeuXaa: 0.0 ± 0.0
Met
MetAla: 3.647 ± 0.409
MetCys: 0.054 ± 0.057
MetAsp: 1.143 ± 0.281
MetGlu: 1.306 ± 0.287
MetPhe: 0.762 ± 0.19
MetGly: 1.252 ± 0.266
MetHis: 0.49 ± 0.154
MetIle: 0.925 ± 0.239
MetLys: 1.089 ± 0.258
MetLeu: 1.524 ± 0.305
MetMet: 0.544 ± 0.176
MetAsn: 0.816 ± 0.177
MetPro: 0.871 ± 0.178
MetGln: 0.871 ± 0.276
MetArg: 1.742 ± 0.332
MetSer: 2.341 ± 0.338
MetThr: 2.286 ± 0.404
MetVal: 1.197 ± 0.276
MetTrp: 0.544 ± 0.162
MetTyr: 0.272 ± 0.147
MetXaa: 0.0 ± 0.0
Asn
AsnAla: 4.028 ± 0.477
AsnCys: 0.327 ± 0.132
AsnAsp: 2.177 ± 0.324
AsnGlu: 2.286 ± 0.325
AsnPhe: 1.361 ± 0.244
AsnGly: 3.429 ± 0.429
AsnHis: 0.653 ± 0.2
AsnIle: 1.796 ± 0.301
AsnLys: 1.47 ± 0.33
AsnLeu: 2.939 ± 0.424
AsnMet: 1.415 ± 0.36
AsnAsn: 0.925 ± 0.254
AsnPro: 2.613 ± 0.393
AsnGln: 1.143 ± 0.251
AsnArg: 2.83 ± 0.388
AsnSer: 2.667 ± 0.464
AsnThr: 2.123 ± 0.387
AsnVal: 2.286 ± 0.321
AsnTrp: 0.762 ± 0.189
AsnTyr: 1.034 ± 0.195
AsnXaa: 0.0 ± 0.0
Pro
ProAla: 4.627 ± 0.449
ProCys: 0.109 ± 0.077
ProAsp: 4.3 ± 0.651
ProGlu: 3.211 ± 0.544
ProPhe: 1.796 ± 0.333
ProGly: 4.627 ± 0.661
ProHis: 0.49 ± 0.151
ProIle: 2.286 ± 0.431
ProLys: 1.796 ± 0.348
ProLeu: 2.613 ± 0.445
ProMet: 0.544 ± 0.159
ProAsn: 1.089 ± 0.199
ProPro: 1.034 ± 0.274
ProGln: 1.252 ± 0.35
ProArg: 2.014 ± 0.368
ProSer: 2.286 ± 0.348
ProThr: 3.266 ± 0.501
ProVal: 4.409 ± 0.548
ProTrp: 0.708 ± 0.212
ProTyr: 1.47 ± 0.237
ProXaa: 0.0 ± 0.0
Gln
GlnAla: 4.082 ± 0.56
GlnCys: 0.435 ± 0.155
GlnAsp: 1.96 ± 0.296
GlnGlu: 2.177 ± 0.325
GlnPhe: 1.524 ± 0.269
GlnGly: 2.994 ± 0.877
GlnHis: 0.544 ± 0.219
GlnIle: 1.47 ± 0.314
GlnLys: 1.197 ± 0.287
GlnLeu: 1.796 ± 0.398
GlnMet: 0.708 ± 0.202
GlnAsn: 1.089 ± 0.254
GlnPro: 1.034 ± 0.221
GlnGln: 1.143 ± 0.417
GlnArg: 1.633 ± 0.355
GlnSer: 1.905 ± 0.409
GlnThr: 1.796 ± 0.306
GlnVal: 2.341 ± 0.42
GlnTrp: 0.925 ± 0.224
GlnTyr: 1.47 ± 0.324
GlnXaa: 0.0 ± 0.0
Arg
ArgAla: 5.171 ± 0.531
ArgCys: 0.544 ± 0.225
ArgAsp: 3.647 ± 0.506
ArgGlu: 4.518 ± 0.654
ArgPhe: 2.232 ± 0.353
ArgGly: 4.627 ± 0.448
ArgHis: 0.599 ± 0.225
ArgIle: 3.266 ± 0.406
ArgLys: 2.994 ± 0.456
ArgLeu: 4.79 ± 0.482
ArgMet: 2.123 ± 0.372
ArgAsn: 1.796 ± 0.272
ArgPro: 1.851 ± 0.319
ArgGln: 1.905 ± 0.329
ArgArg: 2.885 ± 0.462
ArgSer: 2.994 ± 0.36
ArgThr: 4.082 ± 0.468
ArgVal: 4.681 ± 0.627
ArgTrp: 1.415 ± 0.27
ArgTyr: 2.014 ± 0.351
ArgXaa: 0.0 ± 0.0
Ser
SerAla: 5.497 ± 0.779
SerCys: 0.218 ± 0.11
SerAsp: 4.137 ± 0.457
SerGlu: 4.082 ± 0.549
SerPhe: 2.341 ± 0.364
SerGly: 6.804 ± 0.799
SerHis: 0.98 ± 0.227
SerIle: 3.157 ± 0.323
SerLys: 2.449 ± 0.394
SerLeu: 5.606 ± 0.556
SerMet: 1.578 ± 0.294
SerAsn: 2.558 ± 0.392
SerPro: 2.667 ± 0.389
SerGln: 1.578 ± 0.292
SerArg: 3.538 ± 0.493
SerSer: 4.191 ± 0.701
SerThr: 4.246 ± 0.567
SerVal: 4.627 ± 0.528
SerTrp: 1.089 ± 0.268
SerTyr: 2.177 ± 0.337
SerXaa: 0.0 ± 0.0
Thr
ThrAla: 6.477 ± 0.797
ThrCys: 0.054 ± 0.05
ThrAsp: 3.973 ± 0.546
ThrGlu: 4.246 ± 0.495
ThrPhe: 2.504 ± 0.414
ThrGly: 6.586 ± 0.643
ThrHis: 0.871 ± 0.193
ThrIle: 3.266 ± 0.439
ThrLys: 2.341 ± 0.3
ThrLeu: 5.552 ± 0.649
ThrMet: 1.034 ± 0.208
ThrAsn: 2.939 ± 0.398
ThrPro: 3.81 ± 0.548
ThrGln: 2.613 ± 0.399
ThrArg: 3.375 ± 0.498
ThrSer: 3.701 ± 0.5
ThrThr: 4.518 ± 0.636
ThrVal: 5.443 ± 0.577
ThrTrp: 1.687 ± 0.385
ThrTyr: 2.341 ± 0.361
ThrXaa: 0.0 ± 0.0
Val
ValAla: 6.586 ± 0.675
ValCys: 0.381 ± 0.138
ValAsp: 4.735 ± 0.701
ValGlu: 5.661 ± 0.598
ValPhe: 2.613 ± 0.407
ValGly: 5.552 ± 0.599
ValHis: 0.871 ± 0.202
ValIle: 3.157 ± 0.422
ValLys: 3.048 ± 0.478
ValLeu: 5.389 ± 0.546
ValMet: 1.306 ± 0.283
ValAsn: 2.504 ± 0.426
ValPro: 2.994 ± 0.611
ValGln: 2.177 ± 0.388
ValArg: 4.082 ± 0.442
ValSer: 4.463 ± 0.501
ValThr: 6.314 ± 0.675
ValVal: 5.552 ± 0.561
ValTrp: 1.361 ± 0.319
ValTyr: 3.157 ± 0.417
ValXaa: 0.0 ± 0.0
Trp
TrpAla: 1.851 ± 0.304
TrpCys: 0.054 ± 0.061
TrpAsp: 1.633 ± 0.36
TrpGlu: 1.415 ± 0.308
TrpPhe: 0.925 ± 0.238
TrpGly: 1.415 ± 0.291
TrpHis: 0.327 ± 0.142
TrpIle: 1.197 ± 0.271
TrpLys: 1.089 ± 0.22
TrpLeu: 1.252 ± 0.261
TrpMet: 0.653 ± 0.221
TrpAsn: 1.306 ± 0.256
TrpPro: 0.708 ± 0.188
TrpGln: 0.708 ± 0.175
TrpArg: 1.361 ± 0.319
TrpSer: 1.742 ± 0.286
TrpThr: 1.143 ± 0.231
TrpVal: 1.306 ± 0.323
TrpTrp: 0.327 ± 0.176
TrpTyr: 0.925 ± 0.259
TrpXaa: 0.0 ± 0.0
Tyr
TyrAla: 3.211 ± 0.389
TyrCys: 0.218 ± 0.111
TyrAsp: 2.449 ± 0.334
TyrGlu: 2.449 ± 0.316
TyrPhe: 0.925 ± 0.263
TyrGly: 3.32 ± 0.523
TyrHis: 0.435 ± 0.136
TyrIle: 2.014 ± 0.398
TyrLys: 1.415 ± 0.217
TyrLeu: 2.286 ± 0.344
TyrMet: 0.49 ± 0.146
TyrAsn: 1.306 ± 0.282
TyrPro: 1.96 ± 0.401
TyrGln: 1.143 ± 0.221
TyrArg: 2.504 ± 0.406
TyrSer: 1.742 ± 0.31
TyrThr: 2.885 ± 0.398
TyrVal: 2.776 ± 0.422
TyrTrp: 0.599 ± 0.186
TyrTyr: 1.197 ± 0.24
TyrXaa: 0.0 ± 0.0
Xaa
XaaAla: 0.0 ± 0.0
XaaCys: 0.0 ± 0.0
XaaAsp: 0.0 ± 0.0
XaaGlu: 0.0 ± 0.0
XaaPhe: 0.0 ± 0.0
XaaGly: 0.0 ± 0.0
XaaHis: 0.0 ± 0.0
XaaIle: 0.0 ± 0.0
XaaLys: 0.0 ± 0.0
XaaLeu: 0.0 ± 0.0
XaaMet: 0.0 ± 0.0
XaaAsn: 0.0 ± 0.0
XaaPro: 0.0 ± 0.0
XaaGln: 0.0 ± 0.0
XaaArg: 0.0 ± 0.0
XaaSer: 0.0 ± 0.0
XaaThr: 0.0 ± 0.0
XaaVal: 0.0 ± 0.0
XaaTrp: 0.0 ± 0.0
XaaTyr: 0.0 ± 0.0
XaaXaa: 0.0 ± 0.0
Statistics based on 121 proteins (18373 amino acids)

Note: The error was estimated by bootstrapping (100 replicates) at the protein level.
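A protein-level bootstrap of this kind could look roughly like the sketch below: whole proteins are resampled with replacement, the dipeptide frequency is recomputed for each replicate, and the spread of the replicates is reported. The function names, the fixed seed, and the use of the sample standard deviation as the reported error are assumptions; the page only states that 100 bootstrap replicates at the protein level were used.

```python
import random
import statistics
from collections import Counter


def pair_frequency(proteins, pair):
    """Per mille frequency of one dipeptide over a list of sequences."""
    counts = Counter()
    for seq in proteins:
        for i in range(len(seq) - 1):
            counts[seq[i:i + 2]] += 1
    total = sum(counts.values())
    return 1000.0 * counts[pair] / total if total else 0.0


def bootstrap_error(proteins, pair, n_boot=100, seed=0):
    """Standard deviation of the frequency across protein-level resamples."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    estimates = []
    for _ in range(n_boot):
        # Resample whole proteins with replacement, keeping the proteome size.
        sample = rng.choices(proteins, k=len(proteins))
        estimates.append(pair_frequency(sample, pair))
    return statistics.stdev(estimates)


# Toy proteome (hypothetical sequences, one-letter codes).
toy = ["MAAG", "AAL", "MKV", "GAAK"]
err = bootstrap_error(toy, "AA")
```

Resampling proteins rather than individual residues keeps each sequence intact, so the estimate reflects variation between proteins.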

The dipeptide statistics above (along with other statistics for this proteome) can be downloaded from this CSV file
See this proteome in: uniprot_link
Proteome-pI is available under the Creative Commons Attribution-NoDerivs license; for more details see here

Reference: Kozlowski LP. Proteome-pI 2.0: Proteome Isoelectric Point Database Update. Nucleic Acids Res. 2021, doi: 10.1093/nar/gkab944
Contact: Lukasz P. Kozlowski