Amino acid dipeptide frequency for Arthrobacter phage Shiba

Apart from single amino acid frequencies, one can also calculate the so-called amino acid dipeptide frequency.

There are 400 possible dipeptides (441 if we count Xaa as an additional, 21st amino acid representing all non-standard or unknown amino acids). Thus, if dipeptides occurred uniformly at random in proteins, each of the 400 would be present at a frequency of 0.25% (1/400). As this is not the case in nature, for better visibility all overrepresented dipeptides are marked in red and all underrepresented ones in blue in the table.
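The comparison against the uniform 1/400 baseline can be sketched as follows. This is only an illustration, not the database's own pipeline: the function names, the one-letter sequence input, and the choice to normalize by the total number of dipeptides counted within each protein are all assumptions.

```python
from collections import Counter
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"   # 20 standard residues, one-letter codes
EXPECTED = 1 / len(AMINO_ACIDS) ** 2    # 1/400 = 0.0025, i.e. 0.25 % or 2.5 per mille


def dipeptide_frequencies(proteins):
    """Return {dipeptide: fraction} for the 400 standard dipeptides.

    Assumption: overlapping pairs are counted within each protein only
    (a pair never spans two proteins) and normalized by the total pair count.
    """
    counts = Counter()
    for seq in proteins:
        # zip pairs residue i with residue i + 1
        counts.update(a + b for a, b in zip(seq, seq[1:]))
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no dipeptides found (all sequences shorter than 2)")
    return {a + b: counts[a + b] / total for a, b in product(AMINO_ACIDS, repeat=2)}


def classify(freq, expected=EXPECTED):
    """Label a frequency relative to the uniform baseline."""
    if freq > expected:
        return "over"
    if freq < expected:
        return "under"
    return "as expected"


# Toy input (hypothetical sequences, not taken from the proteome)
freqs = dipeptide_frequencies(["MAAK", "AAL"])
print(freqs["AA"], classify(freqs["AA"]))   # AA occurs in 2 of the 5 pairs
print(freqs["CC"], classify(freqs["CC"]))   # CC never occurs
```

In this sketch, "over" and "under" play the role of the red and blue marks; any tolerance band around the baseline would be a further design choice not described in the text.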

All values are given in per mille (‰) and therefore need to be multiplied by 10^-3 to obtain fractions.
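As a small worked example of the per mille convention, the snippet below converts one table value to a fraction and a percentage and compares it with the uniform baseline of 2.5‰ (0.25%). The helper names are hypothetical; the AlaAla value is the one listed in the table.

```python
def per_mille_to_fraction(value):
    """Convert a per mille value to a plain fraction (multiply by 10^-3)."""
    return value * 1e-3


def per_mille_to_percent(value):
    """Convert a per mille value to percent (divide by 10)."""
    return value / 10


ala_ala = 7.585                          # AlaAla frequency from the table, in per mille
uniform_baseline = 1000 / 400            # 2.5 per mille, the same as 0.25 %
fold = ala_ala / uniform_baseline        # ratio to the uniform expectation

print(per_mille_to_fraction(ala_ala))    # fraction of all dipeptides
print(per_mille_to_percent(ala_ala))     # same value in percent
print(round(fold, 3))                    # about 3 times the uniform baseline
```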

For more information, see the sequence space article on Wikipedia.

Rows: first residue; columns: second residue, in the order Ala Cys Asp Glu Phe Gly His Ile Lys Leu Met Asn Pro Gln Arg Ser Thr Val Trp Tyr Xaa. Each entry reads dipeptide: frequency ± error (‰).
Ala
AlaAla: 7.585 ± 0.979
AlaCys: 0.529 ± 0.184
AlaAsp: 5.233 ± 0.536
AlaGlu: 5.527 ± 0.64
AlaPhe: 3.175 ± 0.432
AlaGly: 6.468 ± 1.237
AlaHis: 1.646 ± 0.306
AlaIle: 5.468 ± 0.569
AlaLys: 5.35 ± 0.61
AlaLeu: 7.702 ± 1.232
AlaMet: 3.469 ± 0.791
AlaAsn: 3.057 ± 0.488
AlaPro: 3.528 ± 0.554
AlaGln: 3.41 ± 0.483
AlaArg: 4.41 ± 0.524
AlaSer: 5.88 ± 0.65
AlaThr: 5.233 ± 0.564
AlaVal: 6.997 ± 0.702
AlaTrp: 1.47 ± 0.349
AlaTyr: 2.763 ± 0.415
AlaXaa: 0.0 ± 0.0
Cys
CysAla: 0.412 ± 0.148
CysCys: 0.0 ± 0.0
CysAsp: 0.412 ± 0.179
CysGlu: 0.412 ± 0.139
CysPhe: 0.235 ± 0.117
CysGly: 0.588 ± 0.243
CysHis: 0.412 ± 0.193
CysIle: 0.47 ± 0.177
CysLys: 0.235 ± 0.143
CysLeu: 0.588 ± 0.144
CysMet: 0.059 ± 0.06
CysAsn: 0.176 ± 0.095
CysPro: 0.47 ± 0.158
CysGln: 0.059 ± 0.058
CysArg: 0.294 ± 0.134
CysSer: 0.294 ± 0.108
CysThr: 0.529 ± 0.187
CysVal: 0.294 ± 0.145
CysTrp: 0.118 ± 0.088
CysTyr: 0.176 ± 0.098
CysXaa: 0.0 ± 0.0
Asp
AspAla: 5.88 ± 0.712
AspCys: 0.47 ± 0.218
AspAsp: 3.939 ± 0.713
AspGlu: 3.939 ± 0.534
AspPhe: 2.999 ± 0.441
AspGly: 4.468 ± 0.52
AspHis: 0.764 ± 0.214
AspIle: 3.822 ± 0.47
AspLys: 3.998 ± 0.607
AspLeu: 4.527 ± 0.549
AspMet: 1.47 ± 0.226
AspAsn: 2.881 ± 0.374
AspPro: 2.822 ± 0.647
AspGln: 2.587 ± 0.331
AspArg: 2.881 ± 0.588
AspSer: 2.881 ± 0.478
AspThr: 3.057 ± 0.473
AspVal: 4.233 ± 0.566
AspTrp: 1.176 ± 0.243
AspTyr: 1.94 ± 0.382
AspXaa: 0.0 ± 0.0
Glu
GluAla: 5.938 ± 0.583
GluCys: 0.47 ± 0.171
GluAsp: 3.763 ± 0.496
GluGlu: 5.468 ± 0.833
GluPhe: 2.411 ± 0.372
GluGly: 4.88 ± 0.548
GluHis: 1.058 ± 0.233
GluIle: 4.175 ± 0.444
GluLys: 3.939 ± 0.727
GluLeu: 5.35 ± 0.635
GluMet: 1.823 ± 0.312
GluAsn: 2.293 ± 0.361
GluPro: 2.411 ± 0.445
GluGln: 2.528 ± 0.404
GluArg: 3.41 ± 0.515
GluSer: 2.94 ± 0.421
GluThr: 3.998 ± 0.526
GluVal: 4.762 ± 0.701
GluTrp: 1.352 ± 0.317
GluTyr: 2.469 ± 0.352
GluXaa: 0.0 ± 0.0
Phe
PheAla: 2.352 ± 0.419
PheCys: 0.176 ± 0.099
PheAsp: 2.881 ± 0.359
PheGlu: 2.705 ± 0.397
PhePhe: 1.294 ± 0.291
PheGly: 3.057 ± 0.362
PheHis: 0.764 ± 0.18
PheIle: 2.234 ± 0.377
PheLys: 2.058 ± 0.321
PheLeu: 2.469 ± 0.351
PheMet: 1.411 ± 0.333
PheAsn: 2.234 ± 0.322
PhePro: 1.117 ± 0.268
PheGln: 0.941 ± 0.231
PheArg: 1.764 ± 0.427
PheSer: 2.117 ± 0.344
PheThr: 2.94 ± 0.502
PheVal: 2.587 ± 0.378
PheTrp: 0.823 ± 0.241
PheTyr: 1.058 ± 0.328
PheXaa: 0.0 ± 0.0
Gly
GlyAla: 7.82 ± 0.893
GlyCys: 0.294 ± 0.135
GlyAsp: 5.174 ± 0.635
GlyGlu: 3.763 ± 0.369
GlyPhe: 2.763 ± 0.319
GlyGly: 5.468 ± 0.925
GlyHis: 1.117 ± 0.28
GlyIle: 4.762 ± 0.86
GlyLys: 3.704 ± 0.475
GlyLeu: 6.703 ± 0.951
GlyMet: 2.469 ± 0.629
GlyAsn: 3.175 ± 0.565
GlyPro: 1.646 ± 0.297
GlyGln: 1.94 ± 0.309
GlyArg: 3.234 ± 0.347
GlySer: 5.35 ± 0.687
GlyThr: 4.645 ± 0.52
GlyVal: 6.585 ± 0.74
GlyTrp: 1.764 ± 0.281
GlyTyr: 2.587 ± 0.46
GlyXaa: 0.0 ± 0.0
His
HisAla: 1.411 ± 0.282
HisCys: 0.118 ± 0.076
HisAsp: 0.647 ± 0.197
HisGlu: 1.411 ± 0.314
HisPhe: 0.412 ± 0.149
HisGly: 1.764 ± 0.431
HisHis: 0.588 ± 0.213
HisIle: 0.823 ± 0.221
HisLys: 0.882 ± 0.241
HisLeu: 1.0 ± 0.26
HisMet: 0.706 ± 0.219
HisAsn: 1.058 ± 0.217
HisPro: 0.823 ± 0.234
HisGln: 0.882 ± 0.196
HisArg: 0.941 ± 0.219
HisSer: 1.058 ± 0.229
HisThr: 1.0 ± 0.235
HisVal: 1.294 ± 0.256
HisTrp: 0.294 ± 0.126
HisTyr: 1.235 ± 0.258
HisXaa: 0.0 ± 0.0
Ile
IleAla: 5.233 ± 0.86
IleCys: 0.353 ± 0.171
IleAsp: 4.939 ± 0.464
IleGlu: 3.704 ± 0.399
IlePhe: 1.881 ± 0.37
IleGly: 3.881 ± 0.852
IleHis: 1.411 ± 0.265
IleIle: 2.822 ± 0.358
IleLys: 3.234 ± 0.439
IleLeu: 4.88 ± 0.689
IleMet: 1.705 ± 0.356
IleAsn: 3.234 ± 0.491
IlePro: 2.175 ± 0.363
IleGln: 1.881 ± 0.299
IleArg: 3.175 ± 0.433
IleSer: 3.528 ± 0.421
IleThr: 3.881 ± 0.444
IleVal: 4.292 ± 0.484
IleTrp: 1.0 ± 0.308
IleTyr: 1.823 ± 0.341
IleXaa: 0.0 ± 0.0
Lys
LysAla: 6.174 ± 0.747
LysCys: 0.235 ± 0.15
LysAsp: 3.293 ± 0.529
LysGlu: 4.88 ± 0.744
LysPhe: 2.175 ± 0.391
LysGly: 4.704 ± 0.607
LysHis: 1.47 ± 0.413
LysIle: 3.763 ± 0.463
LysLys: 5.233 ± 0.628
LysLeu: 5.409 ± 0.646
LysMet: 2.293 ± 0.436
LysAsn: 2.058 ± 0.298
LysPro: 2.881 ± 0.469
LysGln: 2.411 ± 0.445
LysArg: 3.057 ± 0.483
LysSer: 3.939 ± 0.422
LysThr: 3.351 ± 0.521
LysVal: 3.645 ± 0.409
LysTrp: 1.294 ± 0.267
LysTyr: 1.823 ± 0.37
LysXaa: 0.0 ± 0.0
Leu
LeuAla: 8.643 ± 1.124
LeuCys: 0.588 ± 0.205
LeuAsp: 4.468 ± 0.466
LeuGlu: 5.115 ± 0.514
LeuPhe: 2.646 ± 0.354
LeuGly: 6.174 ± 1.147
LeuHis: 1.411 ± 0.278
LeuIle: 5.233 ± 0.612
LeuLys: 5.292 ± 0.644
LeuLeu: 5.233 ± 0.731
LeuMet: 2.175 ± 0.302
LeuAsn: 4.057 ± 0.53
LeuPro: 3.351 ± 0.336
LeuGln: 2.587 ± 0.343
LeuArg: 4.116 ± 0.533
LeuSer: 4.939 ± 0.777
LeuThr: 5.468 ± 0.554
LeuVal: 4.939 ± 0.542
LeuTrp: 1.058 ± 0.217
LeuTyr: 2.763 ± 0.514
LeuXaa: 0.0 ± 0.0
Met
MetAla: 3.293 ± 0.561
MetCys: 0.0 ± 0.0
MetAsp: 1.823 ± 0.335
MetGlu: 2.234 ± 0.426
MetPhe: 1.352 ± 0.25
MetGly: 1.646 ± 0.383
MetHis: 0.647 ± 0.213
MetIle: 1.294 ± 0.28
MetLys: 1.587 ± 0.407
MetLeu: 2.234 ± 0.342
MetMet: 0.823 ± 0.22
MetAsn: 0.882 ± 0.228
MetPro: 0.823 ± 0.202
MetGln: 1.058 ± 0.272
MetArg: 1.529 ± 0.284
MetSer: 2.528 ± 0.32
MetThr: 1.999 ± 0.389
MetVal: 1.235 ± 0.309
MetTrp: 0.412 ± 0.153
MetTyr: 0.764 ± 0.223
MetXaa: 0.0 ± 0.0
Asn
AsnAla: 3.234 ± 0.476
AsnCys: 0.353 ± 0.134
AsnAsp: 2.058 ± 0.357
AsnGlu: 2.705 ± 0.417
AsnPhe: 1.94 ± 0.412
AsnGly: 4.645 ± 0.598
AsnHis: 0.764 ± 0.218
AsnIle: 2.705 ± 0.355
AsnLys: 3.057 ± 0.432
AsnLeu: 3.469 ± 0.38
AsnMet: 1.294 ± 0.295
AsnAsn: 1.823 ± 0.378
AsnPro: 3.116 ± 0.442
AsnGln: 1.411 ± 0.223
AsnArg: 2.352 ± 0.496
AsnSer: 2.705 ± 0.414
AsnThr: 2.705 ± 0.333
AsnVal: 2.352 ± 0.417
AsnTrp: 0.647 ± 0.173
AsnTyr: 1.705 ± 0.343
AsnXaa: 0.0 ± 0.0
Pro
ProAla: 3.704 ± 0.409
ProCys: 0.353 ± 0.184
ProAsp: 3.057 ± 0.426
ProGlu: 3.528 ± 0.52
ProPhe: 1.235 ± 0.198
ProGly: 2.881 ± 0.391
ProHis: 1.0 ± 0.275
ProIle: 1.764 ± 0.272
ProLys: 2.352 ± 0.313
ProLeu: 2.293 ± 0.385
ProMet: 0.706 ± 0.221
ProAsn: 2.058 ± 0.366
ProPro: 1.881 ± 0.379
ProGln: 1.117 ± 0.239
ProArg: 2.058 ± 0.385
ProSer: 3.057 ± 0.48
ProThr: 2.881 ± 0.523
ProVal: 3.116 ± 0.478
ProTrp: 0.588 ± 0.154
ProTyr: 1.764 ± 0.315
ProXaa: 0.0 ± 0.0
Gln
GlnAla: 3.057 ± 0.471
GlnCys: 0.118 ± 0.091
GlnAsp: 1.352 ± 0.299
GlnGlu: 1.587 ± 0.361
GlnPhe: 1.235 ± 0.313
GlnGly: 2.293 ± 0.406
GlnHis: 0.823 ± 0.252
GlnIle: 2.469 ± 0.332
GlnLys: 2.293 ± 0.296
GlnLeu: 3.293 ± 0.513
GlnMet: 0.47 ± 0.2
GlnAsn: 1.705 ± 0.32
GlnPro: 1.352 ± 0.232
GlnGln: 1.176 ± 0.27
GlnArg: 2.058 ± 0.325
GlnSer: 1.529 ± 0.258
GlnThr: 2.058 ± 0.373
GlnVal: 2.117 ± 0.362
GlnTrp: 0.412 ± 0.147
GlnTyr: 1.411 ± 0.276
GlnXaa: 0.0 ± 0.0
Arg
ArgAla: 3.645 ± 0.522
ArgCys: 0.176 ± 0.104
ArgAsp: 2.94 ± 0.516
ArgGlu: 2.763 ± 0.369
ArgPhe: 1.823 ± 0.27
ArgGly: 2.822 ± 0.392
ArgHis: 1.0 ± 0.256
ArgIle: 3.175 ± 0.379
ArgLys: 4.116 ± 0.603
ArgLeu: 4.821 ± 0.449
ArgMet: 1.705 ± 0.372
ArgAsn: 2.763 ± 0.348
ArgPro: 2.234 ± 0.318
ArgGln: 1.881 ± 0.364
ArgArg: 3.939 ± 0.692
ArgSer: 2.469 ± 0.345
ArgThr: 3.41 ± 0.537
ArgVal: 2.999 ± 0.497
ArgTrp: 0.882 ± 0.251
ArgTyr: 2.117 ± 0.371
ArgXaa: 0.0 ± 0.0
Ser
SerAla: 5.35 ± 0.72
SerCys: 0.529 ± 0.194
SerAsp: 2.705 ± 0.448
SerGlu: 4.057 ± 0.415
SerPhe: 2.175 ± 0.288
SerGly: 4.586 ± 0.587
SerHis: 1.0 ± 0.264
SerIle: 3.587 ± 0.387
SerLys: 3.998 ± 0.583
SerLeu: 4.88 ± 0.597
SerMet: 1.823 ± 0.267
SerAsn: 2.469 ± 0.318
SerPro: 2.763 ± 0.379
SerGln: 1.352 ± 0.292
SerArg: 3.528 ± 0.462
SerSer: 4.116 ± 0.58
SerThr: 4.527 ± 0.677
SerVal: 3.234 ± 0.422
SerTrp: 1.235 ± 0.238
SerTyr: 2.058 ± 0.507
SerXaa: 0.0 ± 0.0
Thr
ThrAla: 4.998 ± 0.775
ThrCys: 0.412 ± 0.152
ThrAsp: 3.587 ± 0.564
ThrGlu: 3.998 ± 0.511
ThrPhe: 2.234 ± 0.326
ThrGly: 5.703 ± 0.666
ThrHis: 1.176 ± 0.275
ThrIle: 3.881 ± 0.468
ThrLys: 3.881 ± 0.492
ThrLeu: 5.821 ± 0.728
ThrMet: 1.529 ± 0.361
ThrAsn: 2.763 ± 0.42
ThrPro: 3.763 ± 0.413
ThrGln: 1.823 ± 0.26
ThrArg: 2.175 ± 0.328
ThrSer: 3.293 ± 0.541
ThrThr: 4.468 ± 0.643
ThrVal: 5.644 ± 0.695
ThrTrp: 0.823 ± 0.215
ThrTyr: 2.352 ± 0.49
ThrXaa: 0.0 ± 0.0
Val
ValAla: 6.056 ± 0.663
ValCys: 0.412 ± 0.172
ValAsp: 4.939 ± 0.512
ValGlu: 4.233 ± 0.645
ValPhe: 2.469 ± 0.416
ValGly: 4.586 ± 0.681
ValHis: 0.588 ± 0.173
ValIle: 4.175 ± 0.445
ValLys: 5.233 ± 0.505
ValLeu: 6.056 ± 0.675
ValMet: 1.411 ± 0.273
ValAsn: 3.645 ± 0.565
ValPro: 2.822 ± 0.393
ValGln: 1.999 ± 0.425
ValArg: 3.587 ± 0.418
ValSer: 3.998 ± 0.483
ValThr: 5.056 ± 0.706
ValVal: 5.409 ± 0.683
ValTrp: 0.941 ± 0.3
ValTyr: 1.999 ± 0.473
ValXaa: 0.0 ± 0.0
Trp
TrpAla: 0.941 ± 0.234
TrpCys: 0.176 ± 0.111
TrpAsp: 1.294 ± 0.318
TrpGlu: 1.294 ± 0.277
TrpPhe: 0.764 ± 0.166
TrpGly: 1.117 ± 0.245
TrpHis: 0.176 ± 0.102
TrpIle: 0.941 ± 0.266
TrpLys: 1.411 ± 0.324
TrpLeu: 1.352 ± 0.3
TrpMet: 0.294 ± 0.119
TrpAsn: 1.058 ± 0.288
TrpPro: 0.294 ± 0.136
TrpGln: 0.647 ± 0.286
TrpArg: 1.0 ± 0.34
TrpSer: 1.117 ± 0.288
TrpThr: 0.941 ± 0.199
TrpVal: 1.411 ± 0.271
TrpTrp: 0.412 ± 0.174
TrpTyr: 0.706 ± 0.217
TrpXaa: 0.0 ± 0.0
Tyr
TyrAla: 3.175 ± 0.45
TyrCys: 0.47 ± 0.158
TyrAsp: 2.234 ± 0.495
TyrGlu: 2.175 ± 0.4
TyrPhe: 1.764 ± 0.397
TyrGly: 3.234 ± 0.518
TyrHis: 0.529 ± 0.179
TyrIle: 1.47 ± 0.355
TyrLys: 2.234 ± 0.477
TyrLeu: 2.175 ± 0.418
TyrMet: 0.47 ± 0.185
TyrAsn: 1.646 ± 0.235
TyrPro: 1.117 ± 0.261
TyrGln: 1.117 ± 0.24
TyrArg: 2.175 ± 0.417
TyrSer: 2.234 ± 0.394
TyrThr: 2.175 ± 0.518
TyrVal: 2.469 ± 0.494
TyrTrp: 0.647 ± 0.203
TyrTyr: 1.176 ± 0.367
TyrXaa: 0.0 ± 0.0
Xaa
XaaAla: 0.0 ± 0.0
XaaCys: 0.0 ± 0.0
XaaAsp: 0.0 ± 0.0
XaaGlu: 0.0 ± 0.0
XaaPhe: 0.0 ± 0.0
XaaGly: 0.0 ± 0.0
XaaHis: 0.0 ± 0.0
XaaIle: 0.0 ± 0.0
XaaLys: 0.0 ± 0.0
XaaLeu: 0.0 ± 0.0
XaaMet: 0.0 ± 0.0
XaaAsn: 0.0 ± 0.0
XaaPro: 0.0 ± 0.0
XaaGln: 0.0 ± 0.0
XaaArg: 0.0 ± 0.0
XaaSer: 0.0 ± 0.0
XaaThr: 0.0 ± 0.0
XaaVal: 0.0 ± 0.0
XaaTrp: 0.0 ± 0.0
XaaTyr: 0.0 ± 0.0
XaaXaa: 0.0 ± 0.0
Statistics based on 86 proteins (17009 amino acids)

Note: The error was estimated by bootstrapping (100 replicates) at the protein level.
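A protein-level bootstrap of this kind could look roughly like the sketch below: whole proteins are resampled with replacement, the dipeptide frequency is recomputed for each replicate, and the spread across replicates is reported. The function names are hypothetical, and using the sample standard deviation as the reported error is an assumption; the source does not say which statistic was used.

```python
import random
import statistics
from collections import Counter


def dipeptide_per_mille(proteins, dipeptide):
    """Frequency of one dipeptide (one-letter codes, e.g. "AA") in per mille."""
    counts = Counter()
    for seq in proteins:
        # Overlapping pairs within a protein; pairs never span two proteins.
        counts.update(a + b for a, b in zip(seq, seq[1:]))
    total = sum(counts.values())
    return 1000 * counts[dipeptide] / total if total else 0.0


def bootstrap_error(proteins, dipeptide, n_replicates=100, seed=0):
    """Standard deviation of the frequency over protein-level bootstrap replicates."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    estimates = []
    for _ in range(n_replicates):
        # Resample whole proteins with replacement, keeping the proteome size.
        sample = rng.choices(proteins, k=len(proteins))
        estimates.append(dipeptide_per_mille(sample, dipeptide))
    return statistics.stdev(estimates)


# Toy input (hypothetical sequences, not taken from the proteome)
toy = ["MAAK", "AAL", "MKKLV", "GAAG"]
print(dipeptide_per_mille(toy, "AA"), "±", bootstrap_error(toy, "AA"))
```

Resampling proteins rather than individual residues keeps each sequence intact, so the estimate reflects variation between proteins rather than within them.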

The dipeptide statistics above (among other statistics for this proteome) can be downloaded from this CSV file
See this proteome in: uniprot_link
Proteome-pI is available under a Creative Commons Attribution-NoDerivs license; for more details see here

Reference: Kozlowski LP. Proteome-pI 2.0: Proteome Isoelectric Point Database Update. Nucleic Acids Res. 2021, doi: 10.1093/nar/gkab944
Contact: Lukasz P. Kozlowski