Amino acid dipeptide frequency for Gordonia phage Petra

Apart from single amino acid frequencies, one can also calculate the so-called amino acid dipeptide frequency.

There are 400 possible dipeptides (441 if Xaa is counted as an additional, 21st amino acid representing all non-standard or unknown amino acids). Thus, if dipeptides occurred randomly and with equal probability in proteins, each dipeptide would be present at a frequency of 0.25% (2.5‰). As this is not the case in nature, for better visibility dipeptides that are more frequent than expected are marked in red, and those that are underrepresented are marked in blue in the table.
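As a rough illustration of the calculation described above, the following Python sketch counts dipeptides in a set of protein sequences and reports them in per mille. It is not the code used to build this page: the function name `dipeptide_permille`, the toy sequences, the use of overlapping adjacent pairs within each protein, and the normalization by the total number of dipeptides are all assumptions made for the example.

```python
from collections import Counter

# The 20 standard amino acids (one-letter codes); anything else is mapped to X (Xaa).
STANDARD = "ACDEFGHIKLMNPQRSTVWY"


def dipeptide_permille(proteins):
    """Return dipeptide frequencies in per mille for a list of sequences.

    Assumption: dipeptides are overlapping adjacent pairs counted within
    each protein, and frequencies are normalized by the total pair count.
    """
    counts = Counter()
    for seq in proteins:
        # Replace non-standard or unknown residues with X.
        seq = "".join(aa if aa in STANDARD else "X" for aa in seq.upper())
        counts.update(seq[i:i + 2] for i in range(len(seq) - 1))
    total = sum(counts.values())
    if total == 0:
        return {}
    return {pair: 1000.0 * n / total for pair, n in counts.items()}


# Hypothetical toy input, not taken from the Gordonia phage Petra proteome.
toy = ["MAAAK", "MAXK"]
freqs = dipeptide_permille(toy)

# Expected value if all 400 standard dipeptides were equally likely, in per mille.
uniform_expectation = 1000.0 / len(STANDARD) ** 2

for pair, value in sorted(freqs.items()):
    label = "over" if value > uniform_expectation else "under"
    print(f"{pair}: {value:.3f} ({label}represented vs. {uniform_expectation} per mille)")
```

A dipeptide whose value exceeds `uniform_expectation` would fall in the "more than expected" group under this simple uniform model.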

All values are given in per mille (‰); multiply them by 10⁻³ to obtain fractions.

For more information, see the sequence space article on Wikipedia.

Ala Cys Asp Glu Phe Gly His Ile Lys Leu Met Asn Pro Gln Arg Ser Thr Val Trp Tyr Xaa
Ala
AlaAla: 16.695 ± 1.264
AlaCys: 0.941 ± 0.241
AlaAsp: 7.877 ± 0.639
AlaGlu: 6.878 ± 0.878
AlaPhe: 3.527 ± 0.583
AlaGly: 8.23 ± 0.943
AlaHis: 2.057 ± 0.324
AlaIle: 6.408 ± 0.572
AlaLys: 3.997 ± 0.53
AlaLeu: 9.406 ± 0.827
AlaMet: 2.293 ± 0.489
AlaAsn: 3.233 ± 0.526
AlaPro: 5.761 ± 0.634
AlaGln: 4.82 ± 0.5
AlaArg: 8.641 ± 0.807
AlaSer: 5.114 ± 0.566
AlaThr: 7.172 ± 0.827
AlaVal: 7.877 ± 0.865
AlaTrp: 2.234 ± 0.36
AlaTyr: 2.116 ± 0.323
AlaXaa: 0.0 ± 0.0
Cys
CysAla: 0.588 ± 0.193
CysCys: 0.294 ± 0.167
CysAsp: 0.999 ± 0.279
CysGlu: 0.353 ± 0.139
CysPhe: 0.0 ± 0.0
CysGly: 1.176 ± 0.282
CysHis: 0.47 ± 0.218
CysIle: 0.353 ± 0.155
CysLys: 0.235 ± 0.115
CysLeu: 0.47 ± 0.179
CysMet: 0.235 ± 0.115
CysAsn: 0.47 ± 0.163
CysPro: 0.941 ± 0.25
CysGln: 0.353 ± 0.136
CysArg: 0.764 ± 0.247
CysSer: 0.588 ± 0.181
CysThr: 0.47 ± 0.156
CysVal: 0.47 ± 0.162
CysTrp: 0.235 ± 0.147
CysTyr: 0.118 ± 0.092
CysXaa: 0.0 ± 0.0
Asp
AspAla: 7.818 ± 0.632
AspCys: 0.411 ± 0.133
AspAsp: 5.643 ± 0.756
AspGlu: 4.997 ± 0.658
AspPhe: 2.057 ± 0.327
AspGly: 7.113 ± 0.798
AspHis: 1.999 ± 0.406
AspIle: 2.939 ± 0.349
AspLys: 1.411 ± 0.266
AspLeu: 5.526 ± 0.572
AspMet: 1.47 ± 0.245
AspAsn: 1.764 ± 0.307
AspPro: 3.703 ± 0.472
AspGln: 2.587 ± 0.33
AspArg: 4.644 ± 0.685
AspSer: 3.527 ± 0.348
AspThr: 3.997 ± 0.502
AspVal: 4.585 ± 0.537
AspTrp: 0.882 ± 0.203
AspTyr: 1.764 ± 0.363
AspXaa: 0.0 ± 0.0
Glu
GluAla: 6.525 ± 0.793
GluCys: 0.411 ± 0.161
GluAsp: 2.41 ± 0.355
GluGlu: 2.41 ± 0.634
GluPhe: 1.764 ± 0.333
GluGly: 4.115 ± 0.494
GluHis: 1.705 ± 0.323
GluIle: 2.587 ± 0.305
GluLys: 2.057 ± 0.374
GluLeu: 5.056 ± 0.815
GluMet: 1.47 ± 0.265
GluAsn: 1.176 ± 0.262
GluPro: 3.233 ± 0.637
GluGln: 2.645 ± 0.404
GluArg: 4.879 ± 0.777
GluSer: 2.351 ± 0.29
GluThr: 3.41 ± 0.44
GluVal: 4.526 ± 0.561
GluTrp: 1.587 ± 0.322
GluTyr: 1.705 ± 0.353
GluXaa: 0.0 ± 0.0
Phe
PheAla: 3.292 ± 0.417
PheCys: 0.235 ± 0.108
PheAsp: 2.293 ± 0.301
PheGlu: 1.881 ± 0.266
PhePhe: 0.764 ± 0.212
PheGly: 2.587 ± 0.306
PheHis: 0.47 ± 0.163
PheIle: 0.823 ± 0.252
PheLys: 1.117 ± 0.291
PheLeu: 1.999 ± 0.442
PheMet: 0.588 ± 0.182
PheAsn: 0.941 ± 0.288
PhePro: 1.764 ± 0.276
PheGln: 0.764 ± 0.201
PheArg: 2.116 ± 0.319
PheSer: 1.411 ± 0.232
PheThr: 1.822 ± 0.313
PheVal: 2.528 ± 0.315
PheTrp: 0.176 ± 0.108
PheTyr: 0.647 ± 0.17
PheXaa: 0.0 ± 0.0
Gly
GlyAla: 8.054 ± 0.798
GlyCys: 0.705 ± 0.209
GlyAsp: 5.643 ± 0.553
GlyGlu: 4.879 ± 0.635
GlyPhe: 2.822 ± 0.423
GlyGly: 7.348 ± 0.842
GlyHis: 1.646 ± 0.296
GlyIle: 4.174 ± 0.619
GlyLys: 3.233 ± 0.442
GlyLeu: 7.76 ± 1.043
GlyMet: 1.764 ± 0.267
GlyAsn: 2.763 ± 0.436
GlyPro: 3.939 ± 0.698
GlyGln: 2.998 ± 0.379
GlyArg: 7.231 ± 0.693
GlySer: 5.056 ± 0.596
GlyThr: 4.644 ± 0.525
GlyVal: 5.232 ± 0.55
GlyTrp: 2.116 ± 0.336
GlyTyr: 2.704 ± 0.444
GlyXaa: 0.0 ± 0.0
His
HisAla: 2.88 ± 0.396
HisCys: 0.353 ± 0.151
HisAsp: 2.175 ± 0.308
HisGlu: 0.705 ± 0.241
HisPhe: 0.823 ± 0.207
HisGly: 2.234 ± 0.48
HisHis: 0.588 ± 0.183
HisIle: 1.234 ± 0.285
HisLys: 0.353 ± 0.13
HisLeu: 2.234 ± 0.423
HisMet: 0.118 ± 0.073
HisAsn: 0.47 ± 0.143
HisPro: 1.822 ± 0.359
HisGln: 0.823 ± 0.241
HisArg: 1.764 ± 0.359
HisSer: 0.882 ± 0.245
HisThr: 1.822 ± 0.386
HisVal: 1.646 ± 0.337
HisTrp: 0.411 ± 0.154
HisTyr: 0.411 ± 0.161
HisXaa: 0.0 ± 0.0
Ile
IleAla: 6.408 ± 0.679
IleCys: 0.588 ± 0.173
IleAsp: 4.35 ± 0.582
IleGlu: 2.528 ± 0.375
IlePhe: 1.058 ± 0.25
IleGly: 4.056 ± 0.614
IleHis: 0.882 ± 0.251
IleIle: 1.881 ± 0.339
IleLys: 1.528 ± 0.413
IleLeu: 2.469 ± 0.328
IleMet: 0.647 ± 0.179
IleAsn: 1.293 ± 0.253
IlePro: 3.174 ± 0.445
IleGln: 1.176 ± 0.258
IleArg: 3.468 ± 0.475
IleSer: 2.057 ± 0.333
IleThr: 3.88 ± 0.431
IleVal: 3.527 ± 0.441
IleTrp: 0.353 ± 0.129
IleTyr: 1.117 ± 0.217
IleXaa: 0.0 ± 0.0
Lys
LysAla: 3.527 ± 0.504
LysCys: 0.235 ± 0.097
LysAsp: 2.057 ± 0.411
LysGlu: 1.587 ± 0.284
LysPhe: 1.117 ± 0.262
LysGly: 2.351 ± 0.386
LysHis: 0.705 ± 0.189
LysIle: 1.528 ± 0.391
LysLys: 2.057 ± 0.338
LysLeu: 3.174 ± 0.451
LysMet: 1.058 ± 0.277
LysAsn: 0.941 ± 0.265
LysPro: 2.469 ± 0.362
LysGln: 0.705 ± 0.19
LysArg: 2.175 ± 0.369
LysSer: 1.587 ± 0.324
LysThr: 2.175 ± 0.347
LysVal: 2.234 ± 0.358
LysTrp: 0.705 ± 0.189
LysTyr: 0.823 ± 0.195
LysXaa: 0.0 ± 0.0
Leu
LeuAla: 10.111 ± 0.765
LeuCys: 0.941 ± 0.227
LeuAsp: 5.291 ± 0.632
LeuGlu: 3.292 ± 0.501
LeuPhe: 2.293 ± 0.377
LeuGly: 6.408 ± 0.7
LeuHis: 1.176 ± 0.279
LeuIle: 3.174 ± 0.509
LeuLys: 1.764 ± 0.346
LeuLeu: 4.585 ± 0.659
LeuMet: 1.881 ± 0.326
LeuAsn: 2.351 ± 0.372
LeuPro: 4.585 ± 0.488
LeuGln: 2.057 ± 0.302
LeuArg: 5.291 ± 0.45
LeuSer: 4.82 ± 0.544
LeuThr: 5.761 ± 0.622
LeuVal: 6.643 ± 0.651
LeuTrp: 1.94 ± 0.386
LeuTyr: 1.646 ± 0.334
LeuXaa: 0.0 ± 0.0
Met
MetAla: 2.704 ± 0.511
MetCys: 0.235 ± 0.116
MetAsp: 0.705 ± 0.232
MetGlu: 0.941 ± 0.218
MetPhe: 0.764 ± 0.189
MetGly: 1.587 ± 0.359
MetHis: 0.294 ± 0.131
MetIle: 0.999 ± 0.283
MetLys: 0.823 ± 0.213
MetLeu: 1.764 ± 0.349
MetMet: 0.353 ± 0.136
MetAsn: 0.647 ± 0.179
MetPro: 1.94 ± 0.302
MetGln: 0.588 ± 0.234
MetArg: 2.057 ± 0.487
MetSer: 1.822 ± 0.306
MetThr: 2.41 ± 0.333
MetVal: 0.588 ± 0.217
MetTrp: 0.588 ± 0.218
MetTyr: 0.353 ± 0.154
MetXaa: 0.0 ± 0.0
Asn
AsnAla: 2.763 ± 0.466
AsnCys: 0.235 ± 0.121
AsnAsp: 1.47 ± 0.237
AsnGlu: 0.882 ± 0.207
AsnPhe: 0.529 ± 0.163
AsnGly: 3.174 ± 0.476
AsnHis: 0.999 ± 0.212
AsnIle: 1.058 ± 0.267
AsnLys: 0.764 ± 0.229
AsnLeu: 2.293 ± 0.332
AsnMet: 0.588 ± 0.207
AsnAsn: 0.882 ± 0.266
AsnPro: 3.116 ± 0.543
AsnGln: 0.882 ± 0.197
AsnArg: 2.057 ± 0.415
AsnSer: 1.881 ± 0.302
AsnThr: 2.293 ± 0.435
AsnVal: 2.528 ± 0.374
AsnTrp: 0.411 ± 0.147
AsnTyr: 0.529 ± 0.176
AsnXaa: 0.0 ± 0.0
Pro
ProAla: 6.525 ± 0.641
ProCys: 0.588 ± 0.249
ProAsp: 4.938 ± 0.635
ProGlu: 3.586 ± 0.546
ProPhe: 1.705 ± 0.271
ProGly: 5.232 ± 0.621
ProHis: 1.705 ± 0.349
ProIle: 2.763 ± 0.342
ProLys: 2.41 ± 0.352
ProLeu: 3.88 ± 0.504
ProMet: 1.646 ± 0.404
ProAsn: 1.47 ± 0.265
ProPro: 3.292 ± 0.587
ProGln: 1.999 ± 0.322
ProArg: 3.703 ± 0.453
ProSer: 3.233 ± 0.378
ProThr: 3.88 ± 0.497
ProVal: 3.997 ± 0.608
ProTrp: 1.646 ± 0.334
ProTyr: 1.176 ± 0.272
ProXaa: 0.0 ± 0.0
Gln
GlnAla: 3.703 ± 0.376
GlnCys: 0.235 ± 0.115
GlnAsp: 1.293 ± 0.337
GlnGlu: 1.528 ± 0.297
GlnPhe: 1.352 ± 0.308
GlnGly: 2.293 ± 0.388
GlnHis: 1.293 ± 0.282
GlnIle: 1.411 ± 0.28
GlnLys: 0.941 ± 0.22
GlnLeu: 2.822 ± 0.347
GlnMet: 0.764 ± 0.205
GlnAsn: 0.941 ± 0.214
GlnPro: 2.645 ± 0.459
GlnGln: 2.234 ± 0.426
GlnArg: 3.233 ± 0.437
GlnSer: 1.47 ± 0.234
GlnThr: 1.587 ± 0.268
GlnVal: 2.469 ± 0.363
GlnTrp: 1.058 ± 0.24
GlnTyr: 1.176 ± 0.2
GlnXaa: 0.0 ± 0.0
Arg
ArgAla: 7.818 ± 0.775
ArgCys: 0.764 ± 0.24
ArgAsp: 5.408 ± 0.605
ArgGlu: 4.879 ± 0.648
ArgPhe: 1.705 ± 0.332
ArgGly: 6.172 ± 0.627
ArgHis: 2.528 ± 0.431
ArgIle: 3.645 ± 0.448
ArgLys: 2.293 ± 0.317
ArgLeu: 4.762 ± 0.511
ArgMet: 2.41 ± 0.346
ArgAsn: 3.057 ± 0.377
ArgPro: 3.939 ± 0.52
ArgGln: 2.88 ± 0.555
ArgArg: 7.642 ± 0.934
ArgSer: 3.703 ± 0.442
ArgThr: 4.526 ± 0.441
ArgVal: 5.291 ± 0.546
ArgTrp: 1.587 ± 0.374
ArgTyr: 1.822 ± 0.388
ArgXaa: 0.0 ± 0.0
Ser
SerAla: 5.761 ± 0.814
SerCys: 0.176 ± 0.107
SerAsp: 3.351 ± 0.46
SerGlu: 3.174 ± 0.459
SerPhe: 1.293 ± 0.285
SerGly: 5.291 ± 0.587
SerHis: 0.705 ± 0.207
SerIle: 3.468 ± 0.397
SerLys: 1.646 ± 0.291
SerLeu: 3.116 ± 0.516
SerMet: 1.234 ± 0.237
SerAsn: 1.587 ± 0.316
SerPro: 2.939 ± 0.44
SerGln: 0.882 ± 0.223
SerArg: 3.292 ± 0.478
SerSer: 2.41 ± 0.341
SerThr: 4.526 ± 0.49
SerVal: 3.997 ± 0.503
SerTrp: 1.058 ± 0.199
SerTyr: 0.823 ± 0.216
SerXaa: 0.0 ± 0.0
Thr
ThrAla: 7.76 ± 0.715
ThrCys: 0.529 ± 0.138
ThrAsp: 4.762 ± 0.802
ThrGlu: 3.645 ± 0.444
ThrPhe: 1.705 ± 0.477
ThrGly: 6.643 ± 0.7
ThrHis: 1.705 ± 0.336
ThrIle: 2.469 ± 0.349
ThrLys: 2.704 ± 0.494
ThrLeu: 5.526 ± 0.708
ThrMet: 0.941 ± 0.243
ThrAsn: 2.175 ± 0.36
ThrPro: 3.88 ± 0.455
ThrGln: 2.293 ± 0.399
ThrArg: 4.82 ± 0.524
ThrSer: 3.762 ± 0.477
ThrThr: 3.939 ± 0.558
ThrVal: 5.526 ± 0.548
ThrTrp: 0.823 ± 0.236
ThrTyr: 1.528 ± 0.42
ThrXaa: 0.0 ± 0.0
Val
ValAla: 8.583 ± 0.79
ValCys: 1.117 ± 0.269
ValAsp: 5.291 ± 0.538
ValGlu: 5.114 ± 0.714
ValPhe: 1.528 ± 0.324
ValGly: 5.114 ± 0.673
ValHis: 1.411 ± 0.293
ValIle: 3.762 ± 0.453
ValLys: 2.587 ± 0.439
ValLeu: 5.526 ± 0.667
ValMet: 1.646 ± 0.355
ValAsn: 1.646 ± 0.313
ValPro: 3.586 ± 0.47
ValGln: 2.528 ± 0.416
ValArg: 5.526 ± 0.672
ValSer: 2.88 ± 0.457
ValThr: 5.526 ± 0.758
ValVal: 6.702 ± 0.701
ValTrp: 1.646 ± 0.318
ValTyr: 1.999 ± 0.32
ValXaa: 0.0 ± 0.0
Trp
TrpAla: 1.528 ± 0.282
TrpCys: 0.353 ± 0.148
TrpAsp: 1.411 ± 0.331
TrpGlu: 0.705 ± 0.179
TrpPhe: 0.705 ± 0.211
TrpGly: 1.293 ± 0.266
TrpHis: 0.647 ± 0.212
TrpIle: 0.882 ± 0.216
TrpLys: 0.47 ± 0.159
TrpLeu: 1.881 ± 0.397
TrpMet: 0.529 ± 0.183
TrpAsn: 0.764 ± 0.341
TrpPro: 1.352 ± 0.286
TrpGln: 0.705 ± 0.213
TrpArg: 1.764 ± 0.318
TrpSer: 1.352 ± 0.292
TrpThr: 1.528 ± 0.246
TrpVal: 1.47 ± 0.268
TrpTrp: 0.705 ± 0.191
TrpTyr: 0.588 ± 0.204
TrpXaa: 0.0 ± 0.0
Tyr
TyrAla: 2.234 ± 0.299
TyrCys: 0.235 ± 0.139
TyrAsp: 1.47 ± 0.265
TyrGlu: 2.057 ± 0.397
TyrPhe: 0.705 ± 0.232
TyrGly: 2.293 ± 0.364
TyrHis: 0.764 ± 0.249
TyrIle: 0.882 ± 0.188
TyrLys: 0.823 ± 0.174
TyrLeu: 1.764 ± 0.371
TyrMet: 0.47 ± 0.185
TyrAsn: 0.764 ± 0.164
TyrPro: 1.293 ± 0.247
TyrGln: 0.529 ± 0.194
TyrArg: 1.822 ± 0.381
TyrSer: 0.823 ± 0.217
TyrThr: 1.764 ± 0.303
TyrVal: 1.881 ± 0.365
TyrTrp: 0.47 ± 0.18
TyrTyr: 0.529 ± 0.177
TyrXaa: 0.0 ± 0.0
Xaa
XaaAla: 0.0 ± 0.0
XaaCys: 0.0 ± 0.0
XaaAsp: 0.0 ± 0.0
XaaGlu: 0.0 ± 0.0
XaaPhe: 0.0 ± 0.0
XaaGly: 0.0 ± 0.0
XaaHis: 0.0 ± 0.0
XaaIle: 0.0 ± 0.0
XaaLys: 0.0 ± 0.0
XaaLeu: 0.0 ± 0.0
XaaMet: 0.0 ± 0.0
XaaAsn: 0.0 ± 0.0
XaaPro: 0.0 ± 0.0
XaaGln: 0.0 ± 0.0
XaaArg: 0.0 ± 0.0
XaaSer: 0.0 ± 0.0
XaaThr: 0.0 ± 0.0
XaaVal: 0.0 ± 0.0
XaaTrp: 0.0 ± 0.0
XaaTyr: 0.0 ± 0.0
XaaXaa: 0.0 ± 0.0
Statistics based on 89 proteins (17012 amino acids)

Note: The error has been estimated by bootstrapping (100 replicates) at the protein level.
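The protein-level bootstrap mentioned in the note can be sketched roughly as follows. This is only an illustration under stated assumptions, not the database's actual procedure: whole proteins are resampled with replacement, the dipeptide frequency is recomputed for each replicate, and the sample standard deviation of the replicates is taken as the error. The function names, the fixed seed, the toy sequences, and the choice of standard deviation as the error measure are all hypothetical.

```python
import random
import statistics
from collections import Counter


def pair_permille(proteins, pair):
    """Frequency of one dipeptide (e.g. "AA") in per mille over all proteins."""
    counts = Counter()
    for seq in proteins:
        counts.update(seq[i:i + 2] for i in range(len(seq) - 1))
    total = sum(counts.values())
    return 1000.0 * counts[pair] / total if total else 0.0


def bootstrap_error(proteins, pair, replicates=100, seed=0):
    """Estimate the error of a dipeptide frequency by resampling proteins.

    Assumption: the error is the sample standard deviation across replicates.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    estimates = []
    for _ in range(replicates):
        # Draw as many proteins as in the original set, with replacement.
        sample = rng.choices(proteins, k=len(proteins))
        estimates.append(pair_permille(sample, pair))
    return statistics.stdev(estimates)


# Hypothetical toy proteome, not the Gordonia phage Petra sequences.
toy = ["MAAAK", "MAAGK", "MKAAL", "MGGAA"]
point = pair_permille(toy, "AA")
error = bootstrap_error(toy, "AA")
print(f"AlaAla: {point:.3f} ± {error:.3f}")
```

Resampling whole proteins rather than individual residues keeps each sequence intact, so dipeptides are never formed across protein boundaries in a replicate.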

The dipeptide statistics above (along with other statistics for this proteome) can be downloaded from this CSV file
See this proteome in: uniprot_link
Proteome-pI is available under a Creative Commons Attribution-NoDerivs license; for more details see here

Reference: Kozlowski LP. Proteome-pI 2.0: Proteome Isoelectric Point Database Update. Nucleic Acids Res. 2021, doi: 10.1093/nar/gkab944
Contact: Lukasz P. Kozlowski