Amino acid dipeptide frequency for Gordonia phage UmaThurman

Apart from single amino acid frequencies, one can also calculate the so-called amino acid dipeptide frequency.

There are 400 possible dipeptides (441 if Xaa is counted as an additional, 21st amino acid representing all non-standard or unknown amino acids). Thus, if all dipeptides occurred randomly and with equal probability in proteins, each one would be present at a frequency of 0.25% (2.5‰). As this is not the case in nature, for better visibility all dipeptides that are more frequent than expected are marked in red, and those that are underrepresented are marked in blue in the table.
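As an illustration of the calculation described above, the following is a minimal sketch of counting overlapping dipeptides and expressing them in per mille. The function name, the toy sequences, and the choice to ignore pairs containing non-standard letters are assumptions made for this example; the database's exact counting and normalization conventions are not stated on this page.

```python
from collections import Counter
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # 20 standard residues, one-letter codes


def dipeptide_frequencies(proteins):
    """Return per-mille frequencies of all 400 standard dipeptides.

    Overlapping pairs are counted within each protein; pairs never span
    two proteins. Pairs containing a non-standard letter are ignored
    (an assumption of this sketch, not necessarily the database's rule).
    """
    counts = Counter()
    for seq in proteins:
        for first, second in zip(seq, seq[1:]):
            counts[first + second] += 1
    keys = ["".join(pair) for pair in product(AMINO_ACIDS, repeat=2)]
    total = sum(counts[key] for key in keys)
    if total == 0:
        return {key: 0.0 for key in keys}
    return {key: 1000.0 * counts[key] / total for key in keys}


# Uniform expectation: 1/400 of all dipeptides, i.e. 0.25% or 2.5 per mille.
UNIFORM_EXPECTATION = 1000.0 / 400

toy_proteins = ["MAAAL", "MAGAA"]  # hypothetical sequences, not real data
freqs = dipeptide_frequencies(toy_proteins)
over_represented = sorted(k for k, v in freqs.items() if v > UNIFORM_EXPECTATION)
print(over_represented)
```

With real proteome sequences, values above `UNIFORM_EXPECTATION` would correspond to the red cells and values below it to the blue cells, assuming the colouring uses this uniform baseline.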

All values are given in per mille (‰) and therefore need to be multiplied by 10⁻³ to obtain fractions.
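The unit conversion can be checked with a short worked example. This sketch uses the AlaAla value from the table; the helper names are hypothetical, and the enrichment ratio assumes the uniform 2.5‰ baseline mentioned above.

```python
UNIFORM_PER_MILLE = 1000.0 / 400  # 2.5 per mille, the uniform expectation


def per_mille_to_fraction(value):
    """Convert a per-mille value to a plain fraction (multiply by 10^-3)."""
    return value * 1e-3


def per_mille_to_percent(value):
    """Convert a per-mille value to percent (divide by 10)."""
    return value / 10.0


ala_ala = 16.568  # AlaAla frequency from the table, in per mille
fraction = per_mille_to_fraction(ala_ala)   # about 0.016568
percent = per_mille_to_percent(ala_ala)     # about 1.6568 %
enrichment = ala_ala / UNIFORM_PER_MILLE    # about 6.6 times the uniform value
print(fraction, percent, enrichment)
```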

For more information, see the sequence space article on Wikipedia.

Ala Cys Asp Glu Phe Gly His Ile Lys Leu Met Asn Pro Gln Arg Ser Thr Val Trp Tyr Xaa
Ala
AlaAla: 16.568 ± 1.233
AlaCys: 0.862 ± 0.254
AlaAsp: 7.699 ± 0.683
AlaGlu: 6.344 ± 0.667
AlaPhe: 3.572 ± 0.565
AlaGly: 9.67 ± 0.983
AlaHis: 1.786 ± 0.34
AlaIle: 5.112 ± 0.509
AlaLys: 3.511 ± 0.74
AlaLeu: 9.608 ± 1.055
AlaMet: 2.402 ± 0.507
AlaAsn: 3.449 ± 0.53
AlaPro: 5.358 ± 0.727
AlaGln: 4.127 ± 0.583
AlaArg: 8.253 ± 0.824
AlaSer: 5.851 ± 0.704
AlaThr: 7.206 ± 0.933
AlaVal: 8.931 ± 0.926
AlaTrp: 2.217 ± 0.449
AlaTyr: 2.279 ± 0.389
AlaXaa: 0.0 ± 0.0
Cys
CysAla: 0.616 ± 0.253
CysCys: 0.185 ± 0.129
CysAsp: 1.047 ± 0.323
CysGlu: 0.554 ± 0.225
CysPhe: 0.062 ± 0.058
CysGly: 1.047 ± 0.334
CysHis: 0.308 ± 0.126
CysIle: 0.246 ± 0.14
CysLys: 0.185 ± 0.095
CysLeu: 0.431 ± 0.18
CysMet: 0.246 ± 0.141
CysAsn: 0.37 ± 0.139
CysPro: 0.801 ± 0.229
CysGln: 0.308 ± 0.135
CysArg: 0.616 ± 0.204
CysSer: 0.554 ± 0.205
CysThr: 0.554 ± 0.177
CysVal: 0.37 ± 0.148
CysTrp: 0.185 ± 0.137
CysTyr: 0.123 ± 0.09
CysXaa: 0.0 ± 0.0
Asp
AspAla: 7.021 ± 0.732
AspCys: 0.37 ± 0.192
AspAsp: 5.79 ± 0.647
AspGlu: 4.866 ± 0.58
AspPhe: 2.094 ± 0.344
AspGly: 7.021 ± 0.743
AspHis: 2.033 ± 0.406
AspIle: 2.956 ± 0.48
AspLys: 1.848 ± 0.431
AspLeu: 5.79 ± 0.603
AspMet: 1.478 ± 0.271
AspAsn: 2.094 ± 0.384
AspPro: 4.127 ± 0.518
AspGln: 2.34 ± 0.383
AspArg: 4.311 ± 0.653
AspSer: 3.634 ± 0.429
AspThr: 4.188 ± 0.482
AspVal: 5.79 ± 0.691
AspTrp: 0.862 ± 0.228
AspTyr: 1.663 ± 0.357
AspXaa: 0.0 ± 0.0
Glu
GluAla: 5.051 ± 0.536
GluCys: 0.37 ± 0.157
GluAsp: 3.018 ± 0.384
GluGlu: 1.848 ± 0.444
GluPhe: 1.909 ± 0.329
GluGly: 3.203 ± 0.438
GluHis: 1.293 ± 0.233
GluIle: 2.772 ± 0.389
GluLys: 2.033 ± 0.367
GluLeu: 4.989 ± 0.753
GluMet: 1.355 ± 0.289
GluAsn: 1.355 ± 0.287
GluPro: 3.572 ± 0.702
GluGln: 2.71 ± 0.509
GluArg: 4.743 ± 0.798
GluSer: 2.833 ± 0.365
GluThr: 2.648 ± 0.324
GluVal: 4.496 ± 0.516
GluTrp: 1.293 ± 0.298
GluTyr: 1.725 ± 0.349
GluXaa: 0.0 ± 0.0
Phe
PheAla: 2.895 ± 0.38
PheCys: 0.246 ± 0.104
PheAsp: 2.587 ± 0.351
PheGlu: 1.786 ± 0.312
PhePhe: 1.109 ± 0.223
PheGly: 2.402 ± 0.345
PheHis: 0.37 ± 0.172
PheIle: 0.554 ± 0.212
PheLys: 0.985 ± 0.331
PheLeu: 1.848 ± 0.337
PheMet: 0.616 ± 0.157
PheAsn: 0.616 ± 0.23
PhePro: 1.355 ± 0.232
PheGln: 0.678 ± 0.182
PheArg: 2.217 ± 0.378
PheSer: 1.601 ± 0.358
PheThr: 2.525 ± 0.478
PheVal: 2.956 ± 0.454
PheTrp: 0.246 ± 0.11
PheTyr: 0.739 ± 0.201
PheXaa: 0.0 ± 0.0
Gly
GlyAla: 8.561 ± 1.033
GlyCys: 0.554 ± 0.211
GlyAsp: 5.543 ± 0.494
GlyGlu: 4.619 ± 0.717
GlyPhe: 2.956 ± 0.45
GlyGly: 7.637 ± 0.865
GlyHis: 2.402 ± 0.533
GlyIle: 3.572 ± 0.53
GlyLys: 3.572 ± 0.528
GlyLeu: 7.884 ± 1.139
GlyMet: 1.909 ± 0.362
GlyAsn: 2.402 ± 0.381
GlyPro: 4.311 ± 0.609
GlyGln: 3.08 ± 0.381
GlyArg: 6.713 ± 0.856
GlySer: 4.435 ± 0.604
GlyThr: 5.112 ± 0.598
GlyVal: 5.913 ± 0.692
GlyTrp: 2.525 ± 0.379
GlyTyr: 2.464 ± 0.363
GlyXaa: 0.0 ± 0.0
His
HisAla: 2.094 ± 0.339
HisCys: 0.246 ± 0.12
HisAsp: 1.909 ± 0.337
HisGlu: 0.678 ± 0.207
HisPhe: 0.678 ± 0.221
HisGly: 1.909 ± 0.395
HisHis: 0.616 ± 0.224
HisIle: 0.985 ± 0.237
HisLys: 0.801 ± 0.218
HisLeu: 1.663 ± 0.335
HisMet: 0.308 ± 0.111
HisAsn: 0.185 ± 0.102
HisPro: 2.279 ± 0.407
HisGln: 0.862 ± 0.204
HisArg: 1.786 ± 0.329
HisSer: 0.862 ± 0.239
HisThr: 1.17 ± 0.303
HisVal: 1.663 ± 0.364
HisTrp: 0.493 ± 0.163
HisTyr: 0.37 ± 0.146
HisXaa: 0.0 ± 0.0
Ile
IleAla: 5.42 ± 0.625
IleCys: 0.246 ± 0.113
IleAsp: 3.757 ± 0.502
IleGlu: 2.895 ± 0.455
IlePhe: 1.047 ± 0.264
IleGly: 4.619 ± 0.704
IleHis: 0.678 ± 0.259
IleIle: 1.54 ± 0.36
IleLys: 1.601 ± 0.428
IleLeu: 2.217 ± 0.36
IleMet: 0.246 ± 0.122
IleAsn: 1.293 ± 0.313
IlePro: 2.895 ± 0.44
IleGln: 0.801 ± 0.247
IleArg: 3.757 ± 0.545
IleSer: 2.34 ± 0.392
IleThr: 3.449 ± 0.367
IleVal: 4.188 ± 0.532
IleTrp: 0.308 ± 0.13
IleTyr: 0.924 ± 0.257
IleXaa: 0.0 ± 0.0
Lys
LysAla: 3.388 ± 0.522
LysCys: 0.123 ± 0.092
LysAsp: 1.909 ± 0.468
LysGlu: 1.54 ± 0.343
LysPhe: 1.047 ± 0.273
LysGly: 2.525 ± 0.453
LysHis: 0.554 ± 0.217
LysIle: 1.725 ± 0.323
LysLys: 1.786 ± 0.342
LysLeu: 2.895 ± 0.529
LysMet: 0.739 ± 0.316
LysAsn: 0.985 ± 0.246
LysPro: 2.71 ± 0.376
LysGln: 0.862 ± 0.259
LysArg: 1.786 ± 0.353
LysSer: 1.848 ± 0.349
LysThr: 2.71 ± 0.441
LysVal: 2.895 ± 0.379
LysTrp: 0.493 ± 0.149
LysTyr: 1.047 ± 0.231
LysXaa: 0.0 ± 0.0
Leu
LeuAla: 11.086 ± 0.928
LeuCys: 0.554 ± 0.192
LeuAsp: 5.42 ± 0.668
LeuGlu: 2.895 ± 0.514
LeuPhe: 2.094 ± 0.278
LeuGly: 6.344 ± 0.722
LeuHis: 1.293 ± 0.289
LeuIle: 3.326 ± 0.455
LeuLys: 1.848 ± 0.321
LeuLeu: 4.065 ± 0.64
LeuMet: 2.279 ± 0.426
LeuAsn: 1.663 ± 0.346
LeuPro: 4.127 ± 0.46
LeuGln: 2.033 ± 0.397
LeuArg: 4.989 ± 0.602
LeuSer: 5.174 ± 0.614
LeuThr: 5.543 ± 0.709
LeuVal: 6.159 ± 0.689
LeuTrp: 2.156 ± 0.429
LeuTyr: 1.17 ± 0.237
LeuXaa: 0.0 ± 0.0
Met
MetAla: 3.695 ± 0.549
MetCys: 0.308 ± 0.128
MetAsp: 0.431 ± 0.176
MetGlu: 0.678 ± 0.192
MetPhe: 0.616 ± 0.159
MetGly: 1.848 ± 0.363
MetHis: 0.431 ± 0.182
MetIle: 0.924 ± 0.276
MetLys: 0.554 ± 0.183
MetLeu: 1.848 ± 0.3
MetMet: 0.308 ± 0.184
MetAsn: 0.431 ± 0.178
MetPro: 1.909 ± 0.353
MetGln: 0.739 ± 0.237
MetArg: 2.033 ± 0.547
MetSer: 1.725 ± 0.348
MetThr: 2.34 ± 0.373
MetVal: 0.678 ± 0.254
MetTrp: 0.616 ± 0.218
MetTyr: 0.431 ± 0.161
MetXaa: 0.0 ± 0.0
Asn
AsnAla: 2.772 ± 0.452
AsnCys: 0.246 ± 0.127
AsnAsp: 1.971 ± 0.301
AsnGlu: 0.924 ± 0.254
AsnPhe: 0.431 ± 0.154
AsnGly: 3.634 ± 0.561
AsnHis: 0.801 ± 0.208
AsnIle: 0.862 ± 0.245
AsnLys: 0.678 ± 0.191
AsnLeu: 2.156 ± 0.373
AsnMet: 0.493 ± 0.187
AsnAsn: 0.493 ± 0.188
AsnPro: 2.833 ± 0.418
AsnGln: 0.924 ± 0.183
AsnArg: 1.725 ± 0.389
AsnSer: 1.663 ± 0.301
AsnThr: 2.402 ± 0.395
AsnVal: 1.971 ± 0.373
AsnTrp: 0.431 ± 0.136
AsnTyr: 0.924 ± 0.221
AsnXaa: 0.0 ± 0.0
Pro
ProAla: 6.713 ± 0.835
ProCys: 1.047 ± 0.28
ProAsp: 4.804 ± 0.666
ProGlu: 3.757 ± 0.609
ProPhe: 1.601 ± 0.258
ProGly: 5.235 ± 0.719
ProHis: 1.047 ± 0.281
ProIle: 3.08 ± 0.396
ProLys: 2.833 ± 0.43
ProLeu: 3.326 ± 0.508
ProMet: 1.478 ± 0.416
ProAsn: 2.033 ± 0.378
ProPro: 3.819 ± 0.761
ProGln: 1.848 ± 0.283
ProArg: 3.634 ± 0.595
ProSer: 3.449 ± 0.489
ProThr: 4.25 ± 0.633
ProVal: 4.065 ± 0.429
ProTrp: 1.232 ± 0.269
ProTyr: 1.109 ± 0.26
ProXaa: 0.0 ± 0.0
Gln
GlnAla: 3.757 ± 0.526
GlnCys: 0.123 ± 0.082
GlnAsp: 1.232 ± 0.235
GlnGlu: 1.17 ± 0.267
GlnPhe: 1.17 ± 0.284
GlnGly: 2.34 ± 0.395
GlnHis: 1.355 ± 0.325
GlnIle: 1.663 ± 0.323
GlnLys: 0.862 ± 0.201
GlnLeu: 3.264 ± 0.446
GlnMet: 0.739 ± 0.228
GlnAsn: 0.924 ± 0.23
GlnPro: 2.217 ± 0.402
GlnGln: 2.094 ± 0.359
GlnArg: 2.648 ± 0.371
GlnSer: 1.601 ± 0.361
GlnThr: 2.033 ± 0.387
GlnVal: 2.587 ± 0.42
GlnTrp: 1.109 ± 0.284
GlnTyr: 1.109 ± 0.218
GlnXaa: 0.0 ± 0.0
Arg
ArgAla: 8.007 ± 0.841
ArgCys: 0.924 ± 0.253
ArgAsp: 5.851 ± 0.514
ArgGlu: 3.88 ± 0.5
ArgPhe: 2.094 ± 0.338
ArgGly: 6.529 ± 0.745
ArgHis: 1.786 ± 0.358
ArgIle: 3.695 ± 0.423
ArgLys: 1.848 ± 0.338
ArgLeu: 4.743 ± 0.502
ArgMet: 1.971 ± 0.352
ArgAsn: 2.464 ± 0.377
ArgPro: 3.757 ± 0.447
ArgGln: 2.956 ± 0.557
ArgArg: 6.96 ± 0.871
ArgSer: 4.311 ± 0.445
ArgThr: 4.127 ± 0.498
ArgVal: 4.866 ± 0.632
ArgTrp: 1.293 ± 0.287
ArgTyr: 1.848 ± 0.38
ArgXaa: 0.0 ± 0.0
Ser
SerAla: 6.098 ± 0.76
SerCys: 0.308 ± 0.146
SerAsp: 3.08 ± 0.386
SerGlu: 3.388 ± 0.526
SerPhe: 1.232 ± 0.317
SerGly: 5.851 ± 0.685
SerHis: 0.739 ± 0.178
SerIle: 3.08 ± 0.498
SerLys: 1.725 ± 0.295
SerLeu: 3.757 ± 0.478
SerMet: 1.663 ± 0.278
SerAsn: 1.971 ± 0.449
SerPro: 3.018 ± 0.476
SerGln: 1.17 ± 0.354
SerArg: 3.757 ± 0.606
SerSer: 2.648 ± 0.453
SerThr: 4.127 ± 0.502
SerVal: 4.496 ± 0.645
SerTrp: 1.848 ± 0.299
SerTyr: 0.739 ± 0.247
SerXaa: 0.0 ± 0.0
Thr
ThrAla: 7.637 ± 0.803
ThrCys: 0.739 ± 0.217
ThrAsp: 5.358 ± 0.704
ThrGlu: 3.695 ± 0.615
ThrPhe: 2.217 ± 0.459
ThrGly: 5.112 ± 0.727
ThrHis: 1.293 ± 0.274
ThrIle: 3.141 ± 0.53
ThrLys: 2.71 ± 0.398
ThrLeu: 5.482 ± 0.677
ThrMet: 1.293 ± 0.239
ThrAsn: 2.094 ± 0.371
ThrPro: 4.743 ± 0.517
ThrGln: 2.402 ± 0.35
ThrArg: 3.88 ± 0.482
ThrSer: 3.511 ± 0.51
ThrThr: 4.435 ± 0.585
ThrVal: 5.482 ± 0.727
ThrTrp: 1.355 ± 0.269
ThrTyr: 1.232 ± 0.299
ThrXaa: 0.0 ± 0.0
Val
ValAla: 9.608 ± 1.03
ValCys: 0.924 ± 0.284
ValAsp: 6.467 ± 0.776
ValGlu: 5.42 ± 0.707
ValPhe: 1.17 ± 0.274
ValGly: 5.851 ± 0.762
ValHis: 1.54 ± 0.288
ValIle: 3.141 ± 0.373
ValLys: 2.772 ± 0.361
ValLeu: 4.804 ± 0.406
ValMet: 1.971 ± 0.396
ValAsn: 1.663 ± 0.289
ValPro: 3.942 ± 0.521
ValGln: 2.34 ± 0.409
ValArg: 6.406 ± 0.76
ValSer: 3.88 ± 0.553
ValThr: 5.728 ± 0.717
ValVal: 7.391 ± 0.834
ValTrp: 1.663 ± 0.385
ValTyr: 1.848 ± 0.272
ValXaa: 0.0 ± 0.0
Trp
TrpAla: 1.848 ± 0.359
TrpCys: 0.185 ± 0.131
TrpAsp: 1.232 ± 0.358
TrpGlu: 0.739 ± 0.215
TrpPhe: 0.616 ± 0.241
TrpGly: 0.985 ± 0.241
TrpHis: 0.554 ± 0.21
TrpIle: 0.985 ± 0.231
TrpLys: 0.678 ± 0.187
TrpLeu: 2.094 ± 0.318
TrpMet: 0.554 ± 0.177
TrpAsn: 0.862 ± 0.384
TrpPro: 1.232 ± 0.269
TrpGln: 1.047 ± 0.261
TrpArg: 1.54 ± 0.285
TrpSer: 1.601 ± 0.336
TrpThr: 1.355 ± 0.265
TrpVal: 2.033 ± 0.3
TrpTrp: 0.678 ± 0.185
TrpTyr: 0.678 ± 0.188
TrpXaa: 0.0 ± 0.0
Tyr
TyrAla: 1.971 ± 0.273
TyrCys: 0.246 ± 0.104
TyrAsp: 1.17 ± 0.234
TyrGlu: 1.54 ± 0.371
TyrPhe: 0.616 ± 0.226
TyrGly: 2.402 ± 0.377
TyrHis: 0.678 ± 0.201
TyrIle: 0.862 ± 0.314
TyrLys: 0.739 ± 0.216
TyrLeu: 1.355 ± 0.31
TyrMet: 0.431 ± 0.154
TyrAsn: 0.985 ± 0.245
TyrPro: 1.417 ± 0.251
TyrGln: 0.678 ± 0.227
TyrArg: 2.156 ± 0.386
TyrSer: 1.17 ± 0.285
TyrThr: 1.909 ± 0.363
TyrVal: 1.601 ± 0.319
TyrTrp: 0.493 ± 0.182
TyrTyr: 0.739 ± 0.234
TyrXaa: 0.0 ± 0.0
Xaa
XaaAla: 0.0 ± 0.0
XaaCys: 0.0 ± 0.0
XaaAsp: 0.0 ± 0.0
XaaGlu: 0.0 ± 0.0
XaaPhe: 0.0 ± 0.0
XaaGly: 0.0 ± 0.0
XaaHis: 0.0 ± 0.0
XaaIle: 0.0 ± 0.0
XaaLys: 0.0 ± 0.0
XaaLeu: 0.0 ± 0.0
XaaMet: 0.0 ± 0.0
XaaAsn: 0.0 ± 0.0
XaaPro: 0.0 ± 0.0
XaaGln: 0.0 ± 0.0
XaaArg: 0.0 ± 0.0
XaaSer: 0.0 ± 0.0
XaaThr: 0.0 ± 0.0
XaaVal: 0.0 ± 0.0
XaaTrp: 0.0 ± 0.0
XaaTyr: 0.0 ± 0.0
XaaXaa: 0.0 ± 0.0
Statistics based on 83 proteins (16237 amino acids)

Note: The error was estimated by bootstrapping (×100) at the protein level.
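One way such a protein-level bootstrap could work is sketched below. It resamples whole proteins with replacement 100 times and reports the spread of the resulting estimates. The function names, the toy sequences, and the use of the sample standard deviation as the error are assumptions; the page does not state which spread statistic the database reports.

```python
import random
import statistics
from collections import Counter


def dipeptide_per_mille(proteins, dipeptide):
    """Per-mille frequency of one dipeptide among all overlapping pairs."""
    counts = Counter()
    for seq in proteins:
        for first, second in zip(seq, seq[1:]):
            counts[first + second] += 1
    total = sum(counts.values())
    return 1000.0 * counts[dipeptide] / total if total else 0.0


def bootstrap_error(proteins, dipeptide, n_resamples=100, seed=0):
    """Estimate the error by resampling proteins (not residues) with replacement.

    Returns the sample standard deviation of the resampled frequencies,
    which is an assumed choice of error measure for this sketch.
    """
    rng = random.Random(seed)  # fixed seed keeps the sketch reproducible
    estimates = []
    for _ in range(n_resamples):
        sample = rng.choices(proteins, k=len(proteins))
        estimates.append(dipeptide_per_mille(sample, dipeptide))
    return statistics.stdev(estimates)


toy_proteins = ["AAAA", "GGGG", "AAGG", "MKLV"]  # hypothetical sequences
print(dipeptide_per_mille(toy_proteins, "AA"), bootstrap_error(toy_proteins, "AA"))
```

Resampling at the protein level keeps each protein's internal composition intact, so the error reflects variation between proteins rather than between individual residue pairs.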

The dipeptide statistics above (among other statistics for this proteome) can be downloaded from this CSV file
See this proteome in: uniprot_link
Proteome-pI is available under the Creative Commons Attribution-NoDerivs license; for more details see here

Reference: Kozlowski LP. Proteome-pI 2.0: Proteome Isoelectric Point Database Update. Nucleic Acids Res. 2021, doi: 10.1093/nar/gkab944
Contact: Lukasz P. Kozlowski