Amino acid dipeptide frequency for Streptomyces phage Zuko

Apart from single amino acid frequencies, one can also calculate the so-called amino acid dipeptide frequency.

There are 400 possible dipeptides (441 if Xaa is counted as an additional 21st amino acid representing all non-standard or unknown amino acids). Thus, if dipeptides occurred uniformly at random in proteins, each of the 400 dipeptides would be present at a frequency of 0.25% (2.5‰). As this is not the case in nature, for better visibility all dipeptides that are more frequent than expected are marked in red, and those that are underrepresented are marked in blue in the table.
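As a rough illustration of the idea above, the following Python sketch counts overlapping residue pairs within each protein and compares the result with the uniform expectation of 1/400. The one-letter alphabet, the function names, the toy sequences and the choice to normalize by the number of counted pairs are all assumptions made for this example; how Proteome-pI itself counts and normalizes is not stated here.

```python
from collections import Counter

# The 20 standard amino acids in one-letter code; anything else is mapped to "X" (Xaa).
STANDARD = "ACDEFGHIKLMNPQRSTVWY"

# Uniform expectation for 400 dipeptides, in per mille (0.25% = 2.5 per mille).
UNIFORM_PERMILLE = 1000.0 / 400


def dipeptide_permille(proteins):
    """Return {dipeptide: frequency in per mille}.

    Pairs are overlapping and never span two proteins (assumption).
    """
    counts = Counter()
    for seq in proteins:
        for a, b in zip(seq, seq[1:]):
            a = a if a in STANDARD else "X"
            b = b if b in STANDARD else "X"
            counts[a + b] += 1
    total = sum(counts.values())
    if total == 0:
        return {}
    return {dp: 1000.0 * n / total for dp, n in counts.items()}


def classify(freq_permille):
    """Label a frequency relative to the uniform expectation."""
    if freq_permille > UNIFORM_PERMILLE:
        return "over"
    if freq_permille < UNIFORM_PERMILLE:
        return "under"
    return "expected"


# Hypothetical toy input, not taken from the phage proteome.
toy = ["MAAGL", "MAGA"]
freqs = dipeptide_permille(toy)
for dp in sorted(freqs):
    print(dp, round(freqs[dp], 3), classify(freqs[dp]))
```

With real data the same comparison can be applied to table values directly, for example `classify(10.624)` for AlaAla and `classify(0.211)` for CysCys.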

All values are given in per mille (‰) and therefore need to be multiplied by 10⁻³ to obtain fractions.
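The unit conversion can be written out as a small Python sketch. The helper names are made up for this example; the AlaAla value is taken from the table.

```python
def permille_to_fraction(value_permille):
    """Convert per mille to a plain fraction (multiply by 10^-3)."""
    return value_permille * 1e-3


def permille_to_percent(value_permille):
    """Convert per mille to percent (1% = 10 per mille)."""
    return value_permille / 10.0


def percent_to_permille(value_percent):
    """Convert percent to per mille."""
    return value_percent * 10.0


ala_ala = 10.624  # AlaAla frequency from the table, in per mille
print(permille_to_fraction(ala_ala))  # fraction of all dipeptides
print(permille_to_percent(ala_ala))   # the same value in percent
print(percent_to_permille(0.25))      # uniform expectation expressed in per mille
```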

For more information, see the sequence space article on Wikipedia.

Rows give the first residue of the dipeptide and columns the second; each cell is the frequency ± error in ‰.

1st \ 2nd | Ala | Cys | Asp | Glu | Phe | Gly | His | Ile | Lys | Leu | Met | Asn | Pro | Gln | Arg | Ser | Thr | Val | Trp | Tyr | Xaa
Ala | 10.624 ± 0.996 | 1.012 ± 0.275 | 5.523 ± 0.531 | 6.956 ± 0.863 | 3.162 ± 0.416 | 7.968 ± 0.91 | 2.192 ± 0.335 | 4.384 ± 0.512 | 5.86 ± 0.585 | 8.01 ± 0.816 | 2.825 ± 0.325 | 2.825 ± 0.481 | 4.089 ± 0.561 | 4.174 ± 0.387 | 5.944 ± 0.624 | 5.481 ± 0.437 | 5.944 ± 0.513 | 5.481 ± 0.537 | 1.686 ± 0.27 | 2.572 ± 0.315 | 0.0 ± 0.0
Cys | 0.801 ± 0.198 | 0.211 ± 0.1 | 0.885 ± 0.174 | 0.675 ± 0.19 | 0.295 ± 0.129 | 1.054 ± 0.256 | 0.253 ± 0.111 | 0.337 ± 0.104 | 0.253 ± 0.118 | 0.801 ± 0.227 | 0.337 ± 0.109 | 0.422 ± 0.137 | 0.885 ± 0.248 | 0.379 ± 0.166 | 1.433 ± 0.315 | 0.843 ± 0.18 | 0.717 ± 0.206 | 0.59 ± 0.168 | 0.169 ± 0.093 | 0.379 ± 0.146 | 0.0 ± 0.0
Asp | 5.944 ± 0.646 | 0.632 ± 0.191 | 5.481 ± 0.775 | 6.029 ± 0.941 | 2.192 ± 0.355 | 4.553 ± 0.416 | 1.686 ± 0.291 | 2.572 ± 0.414 | 2.361 ± 0.371 | 5.396 ± 0.447 | 2.445 ± 0.373 | 2.487 ± 0.286 | 4.384 ± 0.511 | 2.108 ± 0.308 | 3.415 ± 0.454 | 2.656 ± 0.37 | 3.71 ± 0.559 | 3.246 ± 0.372 | 1.138 ± 0.202 | 2.108 ± 0.297 | 0.0 ± 0.0
Glu | 6.872 ± 0.654 | 0.927 ± 0.203 | 5.438 ± 0.874 | 4.89 ± 0.717 | 2.277 ± 0.369 | 4.089 ± 0.515 | 1.433 ± 0.274 | 3.035 ± 0.344 | 2.487 ± 0.39 | 5.017 ± 0.6 | 2.53 ± 0.351 | 2.319 ± 0.426 | 3.499 ± 0.512 | 1.728 ± 0.312 | 4.216 ± 0.593 | 3.668 ± 0.363 | 3.162 ± 0.44 | 4.848 ± 0.552 | 1.855 ± 0.317 | 2.614 ± 0.417 | 0.0 ± 0.0
Phe | 2.319 ± 0.384 | 0.464 ± 0.164 | 2.487 ± 0.298 | 1.728 ± 0.313 | 1.054 ± 0.203 | 2.15 ± 0.287 | 0.885 ± 0.187 | 1.307 ± 0.198 | 1.265 ± 0.291 | 2.951 ± 0.377 | 0.675 ± 0.171 | 1.391 ± 0.284 | 2.108 ± 0.346 | 1.265 ± 0.231 | 2.698 ± 0.381 | 1.939 ± 0.285 | 1.813 ± 0.308 | 2.234 ± 0.286 | 0.464 ± 0.153 | 0.506 ± 0.133 | 0.0 ± 0.0
Gly | 6.872 ± 0.801 | 0.801 ± 0.198 | 4.848 ± 0.502 | 4.427 ± 0.501 | 2.656 ± 0.383 | 6.788 ± 1.34 | 1.686 ± 0.242 | 3.331 ± 0.508 | 4.848 ± 0.542 | 5.185 ± 0.426 | 1.939 ± 0.3 | 2.234 ± 0.322 | 2.53 ± 0.373 | 2.993 ± 0.409 | 4.384 ± 0.368 | 4.384 ± 0.558 | 7.251 ± 0.791 | 5.776 ± 0.448 | 2.024 ± 0.311 | 2.909 ± 0.344 | 0.0 ± 0.0
His | 1.897 ± 0.329 | 0.295 ± 0.116 | 1.391 ± 0.297 | 1.728 ± 0.373 | 0.59 ± 0.161 | 1.265 ± 0.194 | 0.59 ± 0.256 | 0.675 ± 0.154 | 0.927 ± 0.256 | 1.433 ± 0.305 | 0.759 ± 0.211 | 0.885 ± 0.164 | 1.981 ± 0.309 | 0.632 ± 0.174 | 1.307 ± 0.233 | 1.307 ± 0.234 | 1.855 ± 0.259 | 1.771 ± 0.343 | 0.379 ± 0.154 | 0.422 ± 0.175 | 0.0 ± 0.0
Ile | 4.047 ± 0.483 | 0.295 ± 0.118 | 2.614 ± 0.349 | 2.445 ± 0.323 | 1.518 ± 0.223 | 3.12 ± 0.462 | 1.012 ± 0.239 | 2.319 ± 0.464 | 2.825 ± 0.435 | 2.445 ± 0.299 | 1.012 ± 0.175 | 1.433 ± 0.259 | 2.277 ± 0.299 | 2.108 ± 0.303 | 3.415 ± 0.423 | 2.572 ± 0.295 | 3.246 ± 0.391 | 2.867 ± 0.311 | 0.801 ± 0.164 | 1.054 ± 0.213 | 0.0 ± 0.0
Lys | 5.185 ± 0.702 | 0.379 ± 0.135 | 2.53 ± 0.331 | 3.035 ± 0.399 | 1.476 ± 0.267 | 3.288 ± 0.469 | 0.759 ± 0.203 | 2.572 ± 0.379 | 3.71 ± 0.572 | 3.752 ± 0.48 | 1.223 ± 0.219 | 1.813 ± 0.293 | 3.035 ± 0.452 | 0.759 ± 0.208 | 3.541 ± 0.437 | 3.078 ± 0.388 | 3.035 ± 0.455 | 3.415 ± 0.412 | 1.138 ± 0.252 | 1.223 ± 0.267 | 0.0 ± 0.0
Leu | 7.125 ± 0.61 | 0.717 ± 0.157 | 4.469 ± 0.412 | 5.438 ± 0.572 | 1.981 ± 0.31 | 5.734 ± 0.664 | 1.433 ± 0.24 | 3.035 ± 0.381 | 3.541 ± 0.486 | 5.607 ± 0.526 | 2.361 ± 0.276 | 2.53 ± 0.401 | 4.68 ± 0.499 | 2.487 ± 0.325 | 4.384 ± 0.36 | 4.384 ± 0.506 | 4.89 ± 0.54 | 5.649 ± 0.527 | 1.433 ± 0.25 | 1.939 ± 0.261 | 0.0 ± 0.0
Met | 3.288 ± 0.395 | 0.337 ± 0.122 | 1.897 ± 0.301 | 1.771 ± 0.302 | 0.632 ± 0.167 | 1.855 ± 0.295 | 0.548 ± 0.166 | 1.686 ± 0.259 | 1.602 ± 0.287 | 1.391 ± 0.257 | 0.885 ± 0.177 | 0.506 ± 0.145 | 1.56 ± 0.226 | 0.801 ± 0.209 | 1.897 ± 0.283 | 2.487 ± 0.307 | 2.53 ± 0.325 | 1.602 ± 0.349 | 0.337 ± 0.127 | 0.506 ± 0.155 | 0.0 ± 0.0
Asn | 3.583 ± 0.45 | 0.506 ± 0.173 | 1.771 ± 0.263 | 1.728 ± 0.233 | 1.223 ± 0.263 | 2.993 ± 0.389 | 0.717 ± 0.162 | 2.15 ± 0.333 | 1.602 ± 0.258 | 2.656 ± 0.331 | 0.885 ± 0.218 | 1.307 ± 0.3 | 2.656 ± 0.321 | 1.18 ± 0.229 | 1.728 ± 0.258 | 1.433 ± 0.325 | 2.909 ± 0.691 | 2.361 ± 0.359 | 0.843 ± 0.206 | 1.054 ± 0.202 | 0.0 ± 0.0
Pro | 5.607 ± 0.593 | 0.632 ± 0.203 | 3.626 ± 0.394 | 4.595 ± 0.569 | 1.644 ± 0.246 | 3.921 ± 0.474 | 1.138 ± 0.294 | 1.981 ± 0.289 | 2.74 ± 0.457 | 3.794 ± 0.469 | 1.391 ± 0.237 | 1.981 ± 0.318 | 3.162 ± 0.656 | 1.855 ± 0.362 | 2.192 ± 0.368 | 3.162 ± 0.413 | 3.879 ± 0.395 | 4.806 ± 0.501 | 1.307 ± 0.221 | 1.518 ± 0.227 | 0.0 ± 0.0
Gln | 3.752 ± 0.449 | 0.422 ± 0.138 | 2.234 ± 0.327 | 1.897 ± 0.318 | 1.223 ± 0.229 | 2.445 ± 0.31 | 0.97 ± 0.255 | 1.476 ± 0.207 | 1.897 ± 0.238 | 2.024 ± 0.392 | 0.885 ± 0.202 | 0.759 ± 0.194 | 1.56 ± 0.303 | 1.18 ± 0.264 | 2.487 ± 0.354 | 2.614 ± 0.376 | 2.066 ± 0.298 | 2.614 ± 0.269 | 0.675 ± 0.167 | 1.518 ± 0.232 | 0.0 ± 0.0
Arg | 6.071 ± 0.608 | 0.843 ± 0.244 | 3.668 ± 0.403 | 4.68 ± 0.519 | 2.066 ± 0.301 | 4.427 ± 0.434 | 1.265 ± 0.254 | 2.53 ± 0.335 | 2.993 ± 0.469 | 5.354 ± 0.517 | 1.433 ± 0.255 | 2.361 ± 0.319 | 2.361 ± 0.353 | 2.445 ± 0.348 | 4.258 ± 0.611 | 2.909 ± 0.355 | 3.794 ± 0.379 | 5.27 ± 0.499 | 1.265 ± 0.275 | 1.855 ± 0.287 | 0.0 ± 0.0
Ser | 5.481 ± 0.535 | 0.675 ± 0.175 | 3.541 ± 0.484 | 3.331 ± 0.418 | 1.518 ± 0.259 | 6.408 ± 0.64 | 1.391 ± 0.233 | 2.066 ± 0.326 | 1.939 ± 0.337 | 4.047 ± 0.446 | 1.686 ± 0.302 | 2.403 ± 0.534 | 3.246 ± 0.393 | 2.614 ± 0.342 | 3.499 ± 0.421 | 3.457 ± 0.69 | 4.132 ± 0.555 | 3.499 ± 0.364 | 1.138 ± 0.219 | 1.728 ± 0.258 | 0.0 ± 0.0
Thr | 6.83 ± 0.794 | 1.054 ± 0.26 | 4.427 ± 0.452 | 3.963 ± 0.433 | 2.53 ± 0.341 | 6.282 ± 0.739 | 1.56 ± 0.206 | 2.825 ± 0.349 | 2.867 ± 0.416 | 4.975 ± 0.519 | 1.686 ± 0.289 | 2.867 ± 0.485 | 4.3 ± 0.535 | 1.56 ± 0.304 | 2.656 ± 0.376 | 3.963 ± 0.628 | 5.481 ± 0.581 | 5.27 ± 0.58 | 1.18 ± 0.218 | 2.445 ± 0.328 | 0.0 ± 0.0
Val | 6.788 ± 0.554 | 0.843 ± 0.217 | 3.963 ± 0.41 | 4.722 ± 0.602 | 1.897 ± 0.338 | 4.848 ± 0.465 | 1.644 ± 0.242 | 3.457 ± 0.362 | 3.12 ± 0.334 | 4.637 ± 0.528 | 1.728 ± 0.236 | 2.782 ± 0.428 | 4.089 ± 0.422 | 2.108 ± 0.256 | 5.101 ± 0.562 | 4.469 ± 0.429 | 5.185 ± 0.566 | 5.438 ± 0.531 | 1.771 ± 0.245 | 1.855 ± 0.303 | 0.0 ± 0.0
Trp | 1.855 ± 0.305 | 0.169 ± 0.094 | 1.433 ± 0.235 | 1.307 ± 0.221 | 0.632 ± 0.156 | 1.855 ± 0.378 | 0.464 ± 0.141 | 0.801 ± 0.161 | 0.885 ± 0.211 | 1.56 ± 0.221 | 0.59 ± 0.164 | 1.096 ± 0.232 | 1.096 ± 0.219 | 0.675 ± 0.134 | 1.476 ± 0.228 | 1.138 ± 0.197 | 1.265 ± 0.222 | 1.307 ± 0.218 | 0.337 ± 0.144 | 0.717 ± 0.166 | 0.0 ± 0.0
Tyr | 1.771 ± 0.253 | 0.506 ± 0.144 | 2.361 ± 0.304 | 1.56 ± 0.212 | 1.138 ± 0.191 | 2.825 ± 0.294 | 0.464 ± 0.149 | 0.885 ± 0.149 | 1.223 ± 0.274 | 2.782 ± 0.362 | 0.801 ± 0.222 | 0.885 ± 0.242 | 1.433 ± 0.274 | 1.644 ± 0.229 | 1.813 ± 0.258 | 1.855 ± 0.323 | 1.855 ± 0.331 | 2.403 ± 0.318 | 0.717 ± 0.17 | 0.843 ± 0.189 | 0.0 ± 0.0
Xaa | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0

Statistics based on 115 proteins (23721 amino acids)

Note: The error has been estimated by bootstrapping (×100) at the protein level.
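One way to picture a protein-level bootstrap is sketched below in Python. It resamples whole proteins with replacement, recomputes the dipeptide frequency for each resample, and reports the standard deviation of those estimates. This is only an illustration under several assumptions: that "x100" means 100 resamples, that the reported error is a standard deviation, and that frequencies are normalized by the number of counted pairs. None of this is confirmed by the page, and the function names and toy sequences are hypothetical.

```python
import random
import statistics
from collections import Counter


def pair_counts(seq):
    """Count overlapping residue pairs within one protein sequence."""
    return Counter(a + b for a, b in zip(seq, seq[1:]))


def bootstrap_error(proteins, dipeptide, replicates=100, seed=0):
    """Return (mean, standard deviation) of the per mille frequency of
    `dipeptide` over `replicates` resamples of whole proteins."""
    rng = random.Random(seed)
    per_protein = [pair_counts(p) for p in proteins]
    estimates = []
    for _ in range(replicates):
        # Resample proteins (not residues) with replacement.
        sample = rng.choices(per_protein, k=len(per_protein))
        total = sum(sum(c.values()) for c in sample)
        hits = sum(c[dipeptide] for c in sample)
        estimates.append(1000.0 * hits / total if total else 0.0)
    return statistics.mean(estimates), statistics.stdev(estimates)


# Hypothetical toy proteome, not the real phage sequences.
toy = ["MAAGL", "MAGA", "AAAA", "MKLV"]
mean, err = bootstrap_error(toy, "AA")
print(f"AA: {mean:.3f} ± {err:.3f}")
```

Resampling at the protein level keeps residues from the same protein together, so the spread reflects variation between proteins rather than between individual positions.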

The dipeptide statistics above (along with other statistics for this proteome) can be downloaded from this CSV file
See this proteome in: uniprot_link
Proteome-pI is available under Creative Commons Attribution-NoDerivs license, for more details see here

Reference: Kozlowski LP. Proteome-pI 2.0: Proteome Isoelectric Point Database Update. Nucleic Acids Res. 2021, doi: 10.1093/nar/gkab944
Contact: Lukasz P. Kozlowski