Amino acid dipeptide frequency for Gordonia phage Gustav

Apart from single amino acid frequencies, one can also calculate the so-called amino acid dipeptide frequency, i.e. how often each ordered pair of adjacent residues occurs.
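As a rough illustration of what such a calculation can look like, here is a minimal Python sketch. It is not the code used to produce the table: the function name dipeptide_per_mille, the use of one-letter residue codes, the overlapping two-residue window and the normalization by the total number of counted dipeptides are all assumptions made for this example.

```python
from collections import Counter


def dipeptide_per_mille(proteins):
    """Return {dipeptide: frequency in per mille} for a list of sequences.

    Assumptions (not taken from the source): one-letter residue codes,
    overlapping windows of length 2, and normalization by the total
    number of dipeptides counted.
    """
    counts = Counter()
    for seq in proteins:
        # Slide a window of two residues along each protein separately,
        # so that no dipeptide spans the boundary between two proteins.
        for i in range(len(seq) - 1):
            counts[seq[i:i + 2]] += 1
    total = sum(counts.values())
    if total == 0:
        return {}
    return {pair: 1000.0 * n / total for pair, n in counts.items()}


# Toy input, not data from the Gustav proteome.
freqs = dipeptide_per_mille(["MAAG", "AGA"])
```

In the toy input, AG occurs twice among five dipeptides, so its frequency is 400 per mille; the order of the two residues matters, so AG and GA are counted separately.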

There are 400 possible dipeptides (441 if Xaa is counted as an additional 21st amino acid representing all non-standard or unknown amino acids). Thus, if dipeptides occurred randomly and with equal probability in proteins, each one would be present with a frequency of 0.25% (2.5‰). As this is not the case in nature, for better visibility all dipeptides that occur more often than expected are marked in red, and those that are under-represented are marked in blue in the table.

All values are given in per mille (‰) and therefore need to be multiplied by 10^-3 to obtain fractions.
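The unit conversion and the comparison against the uniform expectation can be sketched in a few lines of Python. The helper names below are hypothetical, and comparing each value directly against 1000/400 = 2.5‰ is an assumption about how "more than expected" is decided; the page does not state the exact rule behind the colouring.

```python
N_STANDARD = 20  # standard amino acids; use 21 to include Xaa


def expected_per_mille(n_residues=N_STANDARD):
    """Uniform expectation for one dipeptide, in per mille."""
    return 1000.0 / (n_residues ** 2)


def per_mille_to_fraction(value):
    """Convert a per mille value to a plain fraction (multiply by 10^-3)."""
    return value * 1e-3


def classify(value, expected=None):
    """Label a per mille value relative to the uniform expectation.

    Assumption: a plain comparison against the expectation, with no
    tolerance band and no use of the reported error.
    """
    if expected is None:
        expected = expected_per_mille()
    if value > expected:
        return "over"   # would be shown in red
    if value < expected:
        return "under"  # would be shown in blue
    return "expected"


ala_ala = classify(17.416)   # AlaAla value from the table
ala_cys = classify(0.351)    # AlaCys value from the table
```

With the 20 standard amino acids the expectation is 2.5‰, so AlaAla (17.416‰) falls above it and AlaCys (0.351‰) below it. With Xaa included the expectation drops to 1000/441, roughly 2.27‰.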

For more information, see the sequence space article on Wikipedia.

Ala Cys Asp Glu Phe Gly His Ile Lys Leu Met Asn Pro Gln Arg Ser Thr Val Trp Tyr Xaa
Ala
AlaAla: 17.416 ± 1.703
AlaCys: 0.351 ± 0.144
AlaAsp: 7.935 ± 0.707
AlaGlu: 8.567 ± 0.851
AlaPhe: 2.458 ± 0.572
AlaGly: 10.604 ± 1.132
AlaHis: 1.756 ± 0.415
AlaIle: 3.933 ± 0.641
AlaLys: 6.18 ± 0.71
AlaLeu: 9.831 ± 0.77
AlaMet: 3.301 ± 0.596
AlaAsn: 4.003 ± 0.651
AlaPro: 5.197 ± 0.877
AlaGln: 3.722 ± 0.531
AlaArg: 7.654 ± 0.717
AlaSer: 6.882 ± 0.868
AlaThr: 6.812 ± 0.65
AlaVal: 9.199 ± 0.741
AlaTrp: 2.949 ± 0.505
AlaTyr: 3.371 ± 0.418
AlaXaa: 0.0 ± 0.0
Cys
CysAla: 0.421 ± 0.212
CysCys: 0.0 ± 0.0
CysAsp: 0.281 ± 0.129
CysGlu: 0.421 ± 0.179
CysPhe: 0.211 ± 0.099
CysGly: 0.772 ± 0.273
CysHis: 0.211 ± 0.124
CysIle: 0.07 ± 0.074
CysLys: 0.07 ± 0.073
CysLeu: 0.211 ± 0.124
CysMet: 0.14 ± 0.096
CysAsn: 0.281 ± 0.14
CysPro: 0.281 ± 0.149
CysGln: 0.351 ± 0.152
CysArg: 0.562 ± 0.249
CysSer: 0.772 ± 0.289
CysThr: 0.281 ± 0.131
CysVal: 0.281 ± 0.154
CysTrp: 0.211 ± 0.132
CysTyr: 0.07 ± 0.061
CysXaa: 0.0 ± 0.0
Asp
AspAla: 7.584 ± 0.727
AspCys: 0.351 ± 0.148
AspAsp: 4.284 ± 0.991
AspGlu: 3.09 ± 0.597
AspPhe: 0.421 ± 0.212
AspGly: 5.688 ± 0.613
AspHis: 1.545 ± 0.388
AspIle: 2.528 ± 0.344
AspLys: 2.177 ± 0.444
AspLeu: 7.093 ± 0.798
AspMet: 1.615 ± 0.359
AspAsn: 2.458 ± 0.482
AspPro: 6.11 ± 0.838
AspGln: 2.458 ± 0.395
AspArg: 4.003 ± 0.598
AspSer: 2.107 ± 0.413
AspThr: 4.705 ± 0.544
AspVal: 3.301 ± 0.55
AspTrp: 1.615 ± 0.309
AspTyr: 1.966 ± 0.379
AspXaa: 0.0 ± 0.0
Glu
GluAla: 7.865 ± 0.762
GluCys: 0.211 ± 0.125
GluAsp: 4.284 ± 0.554
GluGlu: 4.143 ± 0.473
GluPhe: 2.598 ± 0.51
GluGly: 4.775 ± 0.591
GluHis: 1.475 ± 0.408
GluIle: 3.722 ± 0.459
GluLys: 0.281 ± 0.134
GluLeu: 4.916 ± 0.597
GluMet: 0.772 ± 0.206
GluAsn: 1.615 ± 0.418
GluPro: 3.862 ± 0.669
GluGln: 1.756 ± 0.47
GluArg: 4.354 ± 0.824
GluSer: 2.528 ± 0.4
GluThr: 3.16 ± 0.584
GluVal: 5.126 ± 0.847
GluTrp: 1.615 ± 0.365
GluTyr: 1.334 ± 0.254
GluXaa: 0.0 ± 0.0
Phe
PheAla: 2.739 ± 0.454
PheCys: 0.211 ± 0.107
PheAsp: 1.966 ± 0.349
PheGlu: 1.053 ± 0.265
PhePhe: 0.211 ± 0.107
PheGly: 1.896 ± 0.357
PheHis: 0.492 ± 0.217
PheIle: 1.053 ± 0.322
PheLys: 1.264 ± 0.298
PheLeu: 2.317 ± 0.394
PheMet: 0.562 ± 0.185
PheAsn: 0.492 ± 0.182
PhePro: 1.685 ± 0.343
PheGln: 1.124 ± 0.272
PheArg: 1.756 ± 0.344
PheSer: 1.545 ± 0.322
PheThr: 1.966 ± 0.291
PheVal: 1.194 ± 0.382
PheTrp: 0.421 ± 0.179
PheTyr: 0.772 ± 0.219
PheXaa: 0.0 ± 0.0
Gly
GlyAla: 8.497 ± 1.023
GlyCys: 0.702 ± 0.236
GlyAsp: 5.969 ± 0.564
GlyGlu: 5.056 ± 0.566
GlyPhe: 2.388 ± 0.441
GlyGly: 8.216 ± 1.148
GlyHis: 2.107 ± 0.376
GlyIle: 5.267 ± 0.835
GlyLys: 3.652 ± 0.447
GlyLeu: 6.671 ± 0.812
GlyMet: 1.966 ± 0.428
GlyAsn: 3.02 ± 0.336
GlyPro: 4.565 ± 0.583
GlyGln: 3.301 ± 0.471
GlyArg: 6.671 ± 0.734
GlySer: 5.758 ± 0.828
GlyThr: 5.829 ± 0.589
GlyVal: 6.32 ± 0.695
GlyTrp: 1.826 ± 0.378
GlyTyr: 2.388 ± 0.427
GlyXaa: 0.0 ± 0.0
His
HisAla: 1.475 ± 0.307
HisCys: 0.07 ± 0.069
HisAsp: 1.124 ± 0.313
HisGlu: 0.281 ± 0.122
HisPhe: 0.492 ± 0.199
HisGly: 2.388 ± 0.461
HisHis: 0.632 ± 0.316
HisIle: 0.913 ± 0.246
HisLys: 0.632 ± 0.203
HisLeu: 2.317 ± 0.489
HisMet: 0.492 ± 0.207
HisAsn: 0.492 ± 0.198
HisPro: 1.264 ± 0.413
HisGln: 1.194 ± 0.27
HisArg: 1.404 ± 0.365
HisSer: 0.632 ± 0.263
HisThr: 1.615 ± 0.3
HisVal: 1.404 ± 0.323
HisTrp: 0.281 ± 0.144
HisTyr: 0.421 ± 0.213
HisXaa: 0.0 ± 0.0
Ile
IleAla: 3.301 ± 0.928
IleCys: 0.211 ± 0.119
IleAsp: 2.107 ± 0.482
IleGlu: 0.913 ± 0.199
IlePhe: 0.843 ± 0.25
IleGly: 3.371 ± 0.5
IleHis: 1.756 ± 0.267
IleIle: 0.983 ± 0.302
IleLys: 1.756 ± 0.331
IleLeu: 4.494 ± 0.494
IleMet: 0.772 ± 0.283
IleAsn: 0.772 ± 0.346
IlePro: 3.581 ± 0.59
IleGln: 1.966 ± 0.34
IleArg: 3.441 ± 0.442
IleSer: 2.739 ± 0.385
IleThr: 3.511 ± 0.461
IleVal: 2.598 ± 0.344
IleTrp: 0.913 ± 0.23
IleTyr: 1.053 ± 0.267
IleXaa: 0.0 ± 0.0
Lys
LysAla: 5.407 ± 0.731
LysCys: 0.281 ± 0.126
LysAsp: 2.528 ± 0.304
LysGlu: 2.317 ± 0.371
LysPhe: 1.404 ± 0.424
LysGly: 2.247 ± 0.489
LysHis: 0.211 ± 0.1
LysIle: 1.615 ± 0.323
LysLys: 0.281 ± 0.137
LysLeu: 3.722 ± 0.413
LysMet: 0.632 ± 0.195
LysAsn: 0.702 ± 0.26
LysPro: 1.826 ± 0.326
LysGln: 0.492 ± 0.201
LysArg: 2.107 ± 0.351
LysSer: 2.879 ± 0.428
LysThr: 2.739 ± 0.577
LysVal: 3.792 ± 0.798
LysTrp: 0.632 ± 0.245
LysTyr: 0.983 ± 0.221
LysXaa: 0.0 ± 0.0
Leu
LeuAla: 9.902 ± 0.922
LeuCys: 0.632 ± 0.229
LeuAsp: 5.829 ± 0.816
LeuGlu: 5.899 ± 0.757
LeuPhe: 1.545 ± 0.263
LeuGly: 8.076 ± 0.793
LeuHis: 1.615 ± 0.338
LeuIle: 2.739 ± 0.596
LeuLys: 4.565 ± 0.592
LeuLeu: 5.337 ± 0.628
LeuMet: 1.896 ± 0.298
LeuAsn: 1.896 ± 0.429
LeuPro: 4.494 ± 0.572
LeuGln: 2.669 ± 0.41
LeuArg: 5.126 ± 0.689
LeuSer: 3.862 ± 0.695
LeuThr: 6.25 ± 0.581
LeuVal: 5.267 ± 0.559
LeuTrp: 1.264 ± 0.329
LeuTyr: 2.669 ± 0.476
LeuXaa: 0.0 ± 0.0
Met
MetAla: 3.792 ± 0.656
MetCys: 0.07 ± 0.075
MetAsp: 1.404 ± 0.28
MetGlu: 1.545 ± 0.391
MetPhe: 0.421 ± 0.177
MetGly: 1.826 ± 0.322
MetHis: 0.281 ± 0.125
MetIle: 0.843 ± 0.191
MetLys: 0.421 ± 0.164
MetLeu: 0.983 ± 0.229
MetMet: 0.14 ± 0.091
MetAsn: 0.772 ± 0.209
MetPro: 1.194 ± 0.318
MetGln: 0.211 ± 0.123
MetArg: 1.124 ± 0.276
MetSer: 1.966 ± 0.412
MetThr: 2.879 ± 0.517
MetVal: 1.685 ± 0.345
MetTrp: 0.07 ± 0.061
MetTyr: 0.281 ± 0.135
MetXaa: 0.0 ± 0.0
Asn
AsnAla: 3.581 ± 0.595
AsnCys: 0.07 ± 0.064
AsnAsp: 1.826 ± 0.46
AsnGlu: 0.772 ± 0.208
AsnPhe: 0.983 ± 0.378
AsnGly: 2.598 ± 0.513
AsnHis: 0.492 ± 0.173
AsnIle: 1.545 ± 0.342
AsnLys: 1.615 ± 0.333
AsnLeu: 2.037 ± 0.372
AsnMet: 0.702 ± 0.205
AsnAsn: 1.334 ± 0.339
AsnPro: 3.371 ± 0.372
AsnGln: 0.351 ± 0.167
AsnArg: 1.545 ± 0.3
AsnSer: 1.475 ± 0.343
AsnThr: 2.247 ± 0.405
AsnVal: 1.826 ± 0.3
AsnTrp: 0.562 ± 0.184
AsnTyr: 0.772 ± 0.189
AsnXaa: 0.0 ± 0.0
Pro
ProAla: 7.233 ± 0.721
ProCys: 0.562 ± 0.291
ProAsp: 5.337 ± 0.539
ProGlu: 5.478 ± 0.898
ProPhe: 2.037 ± 0.406
ProGly: 5.478 ± 0.71
ProHis: 0.843 ± 0.298
ProIle: 2.458 ± 0.343
ProLys: 1.334 ± 0.256
ProLeu: 3.581 ± 0.545
ProMet: 1.545 ± 0.377
ProAsn: 1.826 ± 0.335
ProPro: 3.581 ± 0.632
ProGln: 1.404 ± 0.367
ProArg: 2.598 ± 0.444
ProSer: 3.23 ± 0.549
ProThr: 4.986 ± 0.739
ProVal: 4.705 ± 0.596
ProTrp: 1.404 ± 0.302
ProTyr: 0.843 ± 0.222
ProXaa: 0.0 ± 0.0
Gln
GlnAla: 4.775 ± 0.638
GlnCys: 0.07 ± 0.093
GlnAsp: 1.756 ± 0.362
GlnGlu: 1.334 ± 0.339
GlnPhe: 1.194 ± 0.271
GlnGly: 2.879 ± 0.461
GlnHis: 0.702 ± 0.247
GlnIle: 1.826 ± 0.347
GlnLys: 0.492 ± 0.215
GlnLeu: 3.792 ± 0.643
GlnMet: 0.772 ± 0.254
GlnAsn: 0.983 ± 0.395
GlnPro: 1.124 ± 0.342
GlnGln: 1.124 ± 0.269
GlnArg: 2.317 ± 0.458
GlnSer: 1.194 ± 0.296
GlnThr: 1.896 ± 0.344
GlnVal: 2.739 ± 0.357
GlnTrp: 0.913 ± 0.242
GlnTyr: 0.772 ± 0.299
GlnXaa: 0.0 ± 0.0
Arg
ArgAla: 7.584 ± 0.8
ArgCys: 0.702 ± 0.263
ArgAsp: 4.635 ± 0.673
ArgGlu: 4.846 ± 0.697
ArgPhe: 1.475 ± 0.375
ArgGly: 6.812 ± 0.881
ArgHis: 1.194 ± 0.317
ArgIle: 1.896 ± 0.362
ArgLys: 2.739 ± 0.474
ArgLeu: 3.862 ± 0.445
ArgMet: 2.317 ± 0.395
ArgAsn: 1.966 ± 0.384
ArgPro: 3.862 ± 0.607
ArgGln: 2.528 ± 0.41
ArgArg: 5.969 ± 0.949
ArgSer: 2.669 ± 0.379
ArgThr: 3.441 ± 0.47
ArgVal: 5.056 ± 0.646
ArgTrp: 2.037 ± 0.386
ArgTyr: 1.615 ± 0.32
ArgXaa: 0.0 ± 0.0
Ser
SerAla: 7.935 ± 0.722
SerCys: 0.07 ± 0.074
SerAsp: 3.16 ± 0.523
SerGlu: 3.371 ± 0.463
SerPhe: 1.685 ± 0.313
SerGly: 5.267 ± 0.761
SerHis: 0.843 ± 0.298
SerIle: 1.966 ± 0.364
SerLys: 2.317 ± 0.381
SerLeu: 3.933 ± 0.578
SerMet: 1.194 ± 0.302
SerAsn: 1.545 ± 0.321
SerPro: 2.388 ± 0.435
SerGln: 2.037 ± 0.379
SerArg: 2.809 ± 0.495
SerSer: 2.317 ± 0.41
SerThr: 2.879 ± 0.556
SerVal: 2.879 ± 0.554
SerTrp: 1.053 ± 0.272
SerTyr: 1.264 ± 0.258
SerXaa: 0.0 ± 0.0
Thr
ThrAla: 9.48 ± 1.066
ThrCys: 0.632 ± 0.194
ThrAsp: 3.792 ± 0.398
ThrGlu: 4.846 ± 0.638
ThrPhe: 1.404 ± 0.337
ThrGly: 7.584 ± 0.973
ThrHis: 1.124 ± 0.355
ThrIle: 2.247 ± 0.383
ThrLys: 2.739 ± 0.453
ThrLeu: 5.829 ± 0.781
ThrMet: 1.053 ± 0.248
ThrAsn: 1.826 ± 0.328
ThrPro: 5.056 ± 0.607
ThrGln: 1.826 ± 0.266
ThrArg: 4.846 ± 0.53
ThrSer: 2.809 ± 0.442
ThrThr: 5.056 ± 0.787
ThrVal: 6.39 ± 0.713
ThrTrp: 1.124 ± 0.275
ThrTyr: 1.124 ± 0.234
ThrXaa: 0.0 ± 0.0
Val
ValAla: 9.059 ± 0.907
ValCys: 0.492 ± 0.222
ValAsp: 4.424 ± 0.509
ValGlu: 4.284 ± 0.51
ValPhe: 1.545 ± 0.374
ValGly: 6.18 ± 0.968
ValHis: 1.615 ± 0.295
ValIle: 3.511 ± 0.43
ValLys: 3.16 ± 0.543
ValLeu: 6.11 ± 0.919
ValMet: 1.615 ± 0.309
ValAsn: 1.826 ± 0.339
ValPro: 3.933 ± 0.683
ValGln: 2.669 ± 0.451
ValArg: 4.846 ± 0.736
ValSer: 3.23 ± 0.473
ValThr: 6.671 ± 0.531
ValVal: 6.531 ± 0.638
ValTrp: 0.983 ± 0.278
ValTyr: 1.685 ± 0.389
ValXaa: 0.0 ± 0.0
Trp
TrpAla: 1.475 ± 0.335
TrpCys: 0.211 ± 0.141
TrpAsp: 1.194 ± 0.323
TrpGlu: 1.334 ± 0.302
TrpPhe: 0.562 ± 0.298
TrpGly: 1.194 ± 0.273
TrpHis: 0.351 ± 0.139
TrpIle: 0.843 ± 0.264
TrpLys: 0.421 ± 0.197
TrpLeu: 2.177 ± 0.381
TrpMet: 0.211 ± 0.123
TrpAsn: 0.772 ± 0.196
TrpPro: 1.334 ± 0.33
TrpGln: 0.772 ± 0.186
TrpArg: 1.826 ± 0.307
TrpSer: 1.334 ± 0.284
TrpThr: 2.177 ± 0.341
TrpVal: 1.966 ± 0.497
TrpTrp: 0.562 ± 0.192
TrpTyr: 0.14 ± 0.109
TrpXaa: 0.0 ± 0.0
Tyr
TyrAla: 2.598 ± 0.409
TyrCys: 0.0 ± 0.0
TyrAsp: 1.264 ± 0.328
TyrGlu: 0.843 ± 0.246
TyrPhe: 0.843 ± 0.368
TyrGly: 2.317 ± 0.483
TyrHis: 0.281 ± 0.138
TyrIle: 1.124 ± 0.319
TyrLys: 0.702 ± 0.239
TyrLeu: 2.317 ± 0.428
TyrMet: 0.14 ± 0.111
TyrAsn: 1.124 ± 0.307
TyrPro: 1.756 ± 0.363
TyrGln: 0.702 ± 0.206
TyrArg: 2.317 ± 0.328
TyrSer: 0.983 ± 0.266
TyrThr: 1.685 ± 0.454
TyrVal: 2.107 ± 0.301
TyrTrp: 0.421 ± 0.175
TyrTyr: 0.281 ± 0.167
TyrXaa: 0.0 ± 0.0
Xaa
XaaAla: 0.0 ± 0.0
XaaCys: 0.0 ± 0.0
XaaAsp: 0.0 ± 0.0
XaaGlu: 0.0 ± 0.0
XaaPhe: 0.0 ± 0.0
XaaGly: 0.0 ± 0.0
XaaHis: 0.0 ± 0.0
XaaIle: 0.0 ± 0.0
XaaLys: 0.0 ± 0.0
XaaLeu: 0.0 ± 0.0
XaaMet: 0.0 ± 0.0
XaaAsn: 0.0 ± 0.0
XaaPro: 0.0 ± 0.0
XaaGln: 0.0 ± 0.0
XaaArg: 0.0 ± 0.0
XaaSer: 0.0 ± 0.0
XaaThr: 0.0 ± 0.0
XaaVal: 0.0 ± 0.0
XaaTrp: 0.0 ± 0.0
XaaTyr: 0.0 ± 0.0
XaaXaa: 0.0 ± 0.0
Statistics based on 69 proteins (14241 amino acids)

Note: The error was estimated by bootstrapping (100 replicates) at the protein level.
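To make the idea of a protein-level bootstrap concrete, here is a minimal Python sketch. It is only one plausible reading of the note, not the database's actual procedure: the function names, the fixed random seed, the one-letter residue codes and the choice of the standard deviation across replicates as the reported error are all assumptions.

```python
import random
import statistics


def count_pair(seq, pair):
    """Count overlapping occurrences of a two-residue string in one protein."""
    return sum(1 for i in range(len(seq) - 1) if seq[i:i + 2] == pair)


def bootstrap_error(proteins, pair, n_rounds=100, seed=0):
    """Return (mean, standard deviation) of a dipeptide frequency in per mille.

    Whole proteins are resampled with replacement, so every replicate has
    the same number of proteins as the original set. Taking the standard
    deviation over replicates as the error is an assumption of this sketch.
    """
    rng = random.Random(seed)
    estimates = []
    for _ in range(n_rounds):
        sample = rng.choices(proteins, k=len(proteins))
        total = sum(max(len(s) - 1, 0) for s in sample)
        hits = sum(count_pair(s, pair) for s in sample)
        estimates.append(1000.0 * hits / total if total else 0.0)
    return statistics.mean(estimates), statistics.pstdev(estimates)


# Toy input, not data from the Gustav proteome.
mean_aa, err_aa = bootstrap_error(["MAAG", "AGA", "AAAA", "GGG"], "AA")
```

Resampling whole proteins rather than individual residues keeps each sequence intact, so the spread reflects how much the estimate depends on which proteins happen to be in the set.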

The dipeptide statistics above (among other statistics for this proteome) can be downloaded from this CSV file
See this proteome in: uniprot_link
Proteome-pI is available under the Creative Commons Attribution-NoDerivs license; for more details see here

Reference: Kozlowski LP. Proteome-pI 2.0: Proteome Isoelectric Point Database Update. Nucleic Acids Res. 2021, doi: 10.1093/nar/gkab944
Contact: Lukasz P. Kozlowski