Amino acid dipeptide frequency for Gordonia phage NHagos

Apart from single amino acid frequencies, one can also calculate the so-called amino acid dipeptide frequency, i.e. how often each ordered pair of adjacent residues occurs.
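As a minimal sketch of how such frequencies could be computed, the Python snippet below counts overlapping residue pairs within each protein and reports them in per mille. The function name dipeptide_frequencies, the use of one-letter sequence codes, and the normalization by the total number of counted pairs are assumptions made for illustration; the exact procedure used by the database is not described on this page.

```python
from collections import Counter

def dipeptide_frequencies(proteins):
    """Return {dipeptide: frequency in per mille} for a list of sequences.

    Pairs are counted within each protein only, never across protein boundaries.
    Normalizing by the total pair count is an assumption of this sketch.
    """
    counts = Counter()
    for seq in proteins:
        # Overlapping windows of length 2: "MKV" -> "MK", "KV"
        counts.update(seq[i:i + 2] for i in range(len(seq) - 1))
    total = sum(counts.values())
    if total == 0:
        return {}
    return {pair: 1000.0 * n / total for pair, n in counts.items()}

# Toy input (not real phage proteins)
freqs = dipeptide_frequencies(["MKVA", "AAV"])
```

With real data, the input list would hold one string per protein of the proteome.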

There are 400 possible dipeptides (441 if Xaa is counted as an additional, 21st amino acid representing all non-standard or unknown amino acids). Thus, if dipeptides occurred uniformly at random in proteins, each would be present at a frequency of 0.25% (2.5‰). As this is not the case in nature, for better visibility all dipeptides that occur more often than expected are marked in red in the table, and those that are underrepresented are marked in blue.
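The uniform baseline follows from simple arithmetic, which the short Python sketch below spells out and then uses to label two values from the table. The helper name classify and the plain greater-than or less-than comparison (ignoring the error margins) are assumptions for illustration; the page does not state the exact rule behind the colouring.

```python
# Uniform baseline: every dipeptide equally likely.
EXPECTED_20 = 1000.0 / (20 * 20)   # 2.5 per mille, i.e. 0.25 %
EXPECTED_21 = 1000.0 / (21 * 21)   # about 2.27 per mille when Xaa is included

def classify(observed_permille, expected_permille=EXPECTED_20):
    """Label a dipeptide relative to the uniform baseline (hypothetical helper)."""
    if observed_permille > expected_permille:
        return "over"    # would correspond to red
    if observed_permille < expected_permille:
        return "under"   # would correspond to blue
    return "expected"

# Values taken from the table: AlaAla = 16.731, CysCys = 0.158
labels = {"AlaAla": classify(16.731), "CysCys": classify(0.158)}
```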

All values are given in per mille (‰) and must therefore be multiplied by 10⁻³ to obtain fractions; for example, 16.731‰ corresponds to 0.016731.

For more information, see the sequence space article on Wikipedia.

Ala Cys Asp Glu Phe Gly His Ile Lys Leu Met Asn Pro Gln Arg Ser Thr Val Trp Tyr Xaa
Ala
AlaAla: 16.731 ± 1.293
AlaCys: 0.95 ± 0.237
AlaAsp: 9.975 ± 0.958
AlaGlu: 7.97 ± 0.738
AlaPhe: 3.536 ± 0.524
AlaGly: 9.342 ± 0.932
AlaHis: 2.639 ± 0.279
AlaIle: 4.275 ± 0.792
AlaLys: 3.642 ± 0.495
AlaLeu: 10.609 ± 0.689
AlaMet: 2.85 ± 0.55
AlaAsn: 2.164 ± 0.496
AlaPro: 6.281 ± 0.621
AlaGln: 4.222 ± 0.534
AlaArg: 7.811 ± 0.795
AlaSer: 6.281 ± 0.574
AlaThr: 6.861 ± 0.57
AlaVal: 11.4 ± 0.897
AlaTrp: 2.797 ± 0.403
AlaTyr: 1.953 ± 0.362
AlaXaa: 0.0 ± 0.0
Cys
CysAla: 1.056 ± 0.245
CysCys: 0.158 ± 0.102
CysAsp: 0.844 ± 0.264
CysGlu: 0.581 ± 0.276
CysPhe: 0.211 ± 0.097
CysGly: 1.267 ± 0.335
CysHis: 0.317 ± 0.129
CysIle: 0.053 ± 0.055
CysLys: 0.106 ± 0.078
CysLeu: 0.475 ± 0.189
CysMet: 0.053 ± 0.059
CysAsn: 0.211 ± 0.146
CysPro: 0.844 ± 0.244
CysGln: 0.106 ± 0.069
CysArg: 0.844 ± 0.233
CysSer: 0.739 ± 0.218
CysThr: 0.264 ± 0.123
CysVal: 1.056 ± 0.307
CysTrp: 0.158 ± 0.11
CysTyr: 0.158 ± 0.102
CysXaa: 0.0 ± 0.0
Asp
AspAla: 8.656 ± 0.64
AspCys: 0.528 ± 0.168
AspAsp: 5.542 ± 0.741
AspGlu: 6.228 ± 0.701
AspPhe: 1.742 ± 0.279
AspGly: 5.436 ± 0.569
AspHis: 1.478 ± 0.305
AspIle: 2.85 ± 0.405
AspLys: 2.164 ± 0.388
AspLeu: 6.703 ± 0.68
AspMet: 1.478 ± 0.217
AspAsn: 1.689 ± 0.342
AspPro: 4.275 ± 0.495
AspGln: 2.058 ± 0.3
AspArg: 5.542 ± 0.464
AspSer: 3.8 ± 0.387
AspThr: 5.331 ± 0.532
AspVal: 4.011 ± 0.439
AspTrp: 1.056 ± 0.254
AspTyr: 1.794 ± 0.406
AspXaa: 0.0 ± 0.0
Glu
GluAla: 8.761 ± 0.896
GluCys: 0.528 ± 0.208
GluAsp: 4.17 ± 0.613
GluGlu: 3.114 ± 0.424
GluPhe: 1.636 ± 0.304
GluGly: 4.803 ± 0.539
GluHis: 1.267 ± 0.229
GluIle: 3.589 ± 0.393
GluLys: 1.267 ± 0.265
GluLeu: 5.542 ± 0.506
GluMet: 1.108 ± 0.292
GluAsn: 1.372 ± 0.261
GluPro: 3.378 ± 0.465
GluGln: 2.269 ± 0.411
GluArg: 4.117 ± 0.481
GluSer: 3.22 ± 0.401
GluThr: 4.222 ± 0.54
GluVal: 4.75 ± 0.492
GluTrp: 1.161 ± 0.241
GluTyr: 1.531 ± 0.294
GluXaa: 0.0 ± 0.0
Phe
PheAla: 3.22 ± 0.386
PheCys: 0.211 ± 0.1
PheAsp: 2.797 ± 0.407
PheGlu: 1.689 ± 0.299
PhePhe: 0.581 ± 0.248
PheGly: 2.428 ± 0.318
PheHis: 0.369 ± 0.123
PheIle: 1.108 ± 0.236
PheLys: 0.528 ± 0.152
PheLeu: 2.058 ± 0.373
PheMet: 0.633 ± 0.172
PheAsn: 0.528 ± 0.174
PhePro: 0.844 ± 0.192
PheGln: 0.633 ± 0.171
PheArg: 1.425 ± 0.228
PheSer: 1.425 ± 0.272
PheThr: 2.006 ± 0.284
PheVal: 1.531 ± 0.306
PheTrp: 0.317 ± 0.117
PheTyr: 0.369 ± 0.171
PheXaa: 0.0 ± 0.0
Gly
GlyAla: 8.445 ± 0.937
GlyCys: 0.792 ± 0.27
GlyAsp: 5.436 ± 0.521
GlyGlu: 5.858 ± 0.591
GlyPhe: 1.953 ± 0.336
GlyGly: 8.55 ± 1.154
GlyHis: 1.847 ± 0.278
GlyIle: 3.642 ± 0.569
GlyLys: 2.375 ± 0.384
GlyLeu: 8.445 ± 0.771
GlyMet: 2.164 ± 0.297
GlyAsn: 2.533 ± 0.369
GlyPro: 4.117 ± 0.541
GlyGln: 2.797 ± 0.471
GlyArg: 6.703 ± 0.707
GlySer: 4.908 ± 0.516
GlyThr: 5.7 ± 0.463
GlyVal: 6.65 ± 0.547
GlyTrp: 2.269 ± 0.322
GlyTyr: 2.797 ± 0.421
GlyXaa: 0.0 ± 0.0
His
HisAla: 2.164 ± 0.384
HisCys: 0.106 ± 0.076
HisAsp: 1.583 ± 0.276
HisGlu: 1.003 ± 0.25
HisPhe: 0.739 ± 0.188
HisGly: 1.003 ± 0.218
HisHis: 0.475 ± 0.148
HisIle: 0.739 ± 0.233
HisLys: 0.317 ± 0.137
HisLeu: 1.583 ± 0.34
HisMet: 0.475 ± 0.15
HisAsn: 0.422 ± 0.131
HisPro: 1.531 ± 0.302
HisGln: 0.317 ± 0.105
HisArg: 1.372 ± 0.27
HisSer: 1.056 ± 0.228
HisThr: 1.636 ± 0.303
HisVal: 0.739 ± 0.204
HisTrp: 0.422 ± 0.12
HisTyr: 0.475 ± 0.156
HisXaa: 0.0 ± 0.0
Ile
IleAla: 6.703 ± 0.581
IleCys: 0.369 ± 0.154
IleAsp: 3.483 ± 0.364
IleGlu: 4.592 ± 0.617
IlePhe: 0.897 ± 0.18
IleGly: 4.433 ± 0.7
IleHis: 0.475 ± 0.156
IleIle: 2.217 ± 0.412
IleLys: 1.689 ± 0.516
IleLeu: 1.636 ± 0.259
IleMet: 1.056 ± 0.262
IleAsn: 1.794 ± 0.405
IlePro: 1.478 ± 0.37
IleGln: 0.897 ± 0.241
IleArg: 2.217 ± 0.346
IleSer: 3.114 ± 0.469
IleThr: 3.853 ± 0.552
IleVal: 3.325 ± 0.571
IleTrp: 0.581 ± 0.192
IleTyr: 0.844 ± 0.212
IleXaa: 0.0 ± 0.0
Lys
LysAla: 4.222 ± 0.617
LysCys: 0.158 ± 0.115
LysAsp: 1.319 ± 0.317
LysGlu: 0.844 ± 0.192
LysPhe: 0.475 ± 0.195
LysGly: 2.375 ± 0.35
LysHis: 0.211 ± 0.091
LysIle: 1.372 ± 0.312
LysLys: 0.686 ± 0.281
LysLeu: 2.533 ± 0.438
LysMet: 0.422 ± 0.206
LysAsn: 0.792 ± 0.2
LysPro: 1.531 ± 0.435
LysGln: 0.95 ± 0.223
LysArg: 2.006 ± 0.324
LysSer: 1.636 ± 0.324
LysThr: 0.897 ± 0.271
LysVal: 2.164 ± 0.297
LysTrp: 0.211 ± 0.117
LysTyr: 0.844 ± 0.212
LysXaa: 0.0 ± 0.0
Leu
LeuAla: 10.239 ± 0.789
LeuCys: 0.792 ± 0.212
LeuAsp: 7.653 ± 0.645
LeuGlu: 3.167 ± 0.378
LeuPhe: 2.006 ± 0.349
LeuGly: 7.02 ± 0.728
LeuHis: 1.636 ± 0.33
LeuIle: 4.433 ± 0.429
LeuLys: 1.742 ± 0.365
LeuLeu: 5.542 ± 0.6
LeuMet: 2.006 ± 0.356
LeuAsn: 2.797 ± 0.389
LeuPro: 4.381 ± 0.592
LeuGln: 2.428 ± 0.43
LeuArg: 5.278 ± 0.508
LeuSer: 4.961 ± 0.414
LeuThr: 6.439 ± 0.588
LeuVal: 5.172 ± 0.5
LeuTrp: 1.478 ± 0.247
LeuTyr: 2.217 ± 0.362
LeuXaa: 0.0 ± 0.0
Met
MetAla: 3.061 ± 0.401
MetCys: 0.211 ± 0.133
MetAsp: 1.425 ± 0.276
MetGlu: 1.319 ± 0.27
MetPhe: 0.475 ± 0.146
MetGly: 1.742 ± 0.29
MetHis: 0.369 ± 0.165
MetIle: 1.319 ± 0.256
MetLys: 0.475 ± 0.158
MetLeu: 1.319 ± 0.28
MetMet: 0.317 ± 0.115
MetAsn: 0.897 ± 0.215
MetPro: 1.372 ± 0.242
MetGln: 0.475 ± 0.17
MetArg: 1.214 ± 0.206
MetSer: 2.217 ± 0.334
MetThr: 1.9 ± 0.334
MetVal: 1.689 ± 0.262
MetTrp: 0.422 ± 0.176
MetTyr: 0.211 ± 0.115
MetXaa: 0.0 ± 0.0
Asn
AsnAla: 3.642 ± 0.487
AsnCys: 0.422 ± 0.152
AsnAsp: 1.636 ± 0.271
AsnGlu: 1.214 ± 0.228
AsnPhe: 0.475 ± 0.143
AsnGly: 2.85 ± 0.532
AsnHis: 0.422 ± 0.139
AsnIle: 0.897 ± 0.213
AsnLys: 0.475 ± 0.157
AsnLeu: 1.847 ± 0.325
AsnMet: 0.264 ± 0.106
AsnAsn: 1.056 ± 0.477
AsnPro: 1.636 ± 0.306
AsnGln: 0.739 ± 0.213
AsnArg: 1.531 ± 0.273
AsnSer: 1.847 ± 0.246
AsnThr: 2.375 ± 0.261
AsnVal: 2.586 ± 0.534
AsnTrp: 0.369 ± 0.121
AsnTyr: 0.792 ± 0.178
AsnXaa: 0.0 ± 0.0
Pro
ProAla: 6.439 ± 0.731
ProCys: 0.369 ± 0.152
ProAsp: 4.117 ± 0.416
ProGlu: 4.433 ± 0.502
ProPhe: 1.583 ± 0.307
ProGly: 5.331 ± 0.664
ProHis: 0.739 ± 0.162
ProIle: 2.533 ± 0.351
ProLys: 0.844 ± 0.186
ProLeu: 3.325 ± 0.373
ProMet: 1.583 ± 0.244
ProAsn: 1.161 ± 0.233
ProPro: 2.956 ± 0.446
ProGln: 1.742 ± 0.291
ProArg: 2.692 ± 0.28
ProSer: 3.536 ± 0.511
ProThr: 3.114 ± 0.492
ProVal: 5.225 ± 0.634
ProTrp: 1.056 ± 0.22
ProTyr: 1.267 ± 0.264
ProXaa: 0.0 ± 0.0
Gln
GlnAla: 4.011 ± 0.515
GlnCys: 0.211 ± 0.092
GlnAsp: 1.108 ± 0.252
GlnGlu: 1.214 ± 0.379
GlnPhe: 0.95 ± 0.251
GlnGly: 2.375 ± 0.367
GlnHis: 0.422 ± 0.156
GlnIle: 2.269 ± 0.359
GlnLys: 0.686 ± 0.215
GlnLeu: 3.431 ± 0.472
GlnMet: 0.633 ± 0.187
GlnAsn: 1.161 ± 0.385
GlnPro: 1.319 ± 0.25
GlnGln: 1.319 ± 0.332
GlnArg: 2.322 ± 0.372
GlnSer: 1.108 ± 0.277
GlnThr: 2.269 ± 0.267
GlnVal: 3.061 ± 0.343
GlnTrp: 0.528 ± 0.19
GlnTyr: 0.633 ± 0.191
GlnXaa: 0.0 ± 0.0
Arg
ArgAla: 8.656 ± 0.957
ArgCys: 0.95 ± 0.276
ArgAsp: 4.75 ± 0.595
ArgGlu: 4.222 ± 0.578
ArgPhe: 1.794 ± 0.318
ArgGly: 4.539 ± 0.46
ArgHis: 1.531 ± 0.27
ArgIle: 2.375 ± 0.332
ArgLys: 2.006 ± 0.312
ArgLeu: 5.964 ± 0.548
ArgMet: 1.267 ± 0.258
ArgAsn: 1.372 ± 0.284
ArgPro: 3.22 ± 0.541
ArgGln: 2.428 ± 0.508
ArgArg: 5.331 ± 0.672
ArgSer: 3.958 ± 0.518
ArgThr: 4.275 ± 0.53
ArgVal: 5.489 ± 0.482
ArgTrp: 1.478 ± 0.289
ArgTyr: 2.428 ± 0.417
ArgXaa: 0.0 ± 0.0
Ser
SerAla: 6.228 ± 0.594
SerCys: 0.739 ± 0.269
SerAsp: 3.853 ± 0.371
SerGlu: 2.692 ± 0.363
SerPhe: 1.214 ± 0.293
SerGly: 7.02 ± 0.663
SerHis: 1.267 ± 0.227
SerIle: 2.956 ± 0.567
SerLys: 1.372 ± 0.316
SerLeu: 4.961 ± 0.558
SerMet: 1.953 ± 0.328
SerAsn: 1.108 ± 0.212
SerPro: 3.22 ± 0.323
SerGln: 2.269 ± 0.336
SerArg: 3.536 ± 0.41
SerSer: 3.061 ± 0.541
SerThr: 3.906 ± 0.442
SerVal: 3.8 ± 0.501
SerTrp: 1.267 ± 0.241
SerTyr: 1.531 ± 0.344
SerXaa: 0.0 ± 0.0
Thr
ThrAla: 7.864 ± 0.604
ThrCys: 0.633 ± 0.246
ThrAsp: 4.908 ± 0.685
ThrGlu: 4.222 ± 0.502
ThrPhe: 1.372 ± 0.301
ThrGly: 6.703 ± 0.533
ThrHis: 0.792 ± 0.184
ThrIle: 3.272 ± 0.415
ThrLys: 1.847 ± 0.299
ThrLeu: 5.067 ± 0.595
ThrMet: 1.689 ± 0.272
ThrAsn: 2.058 ± 0.283
ThrPro: 5.067 ± 0.568
ThrGln: 1.531 ± 0.295
ThrArg: 4.592 ± 0.469
ThrSer: 3.906 ± 0.527
ThrThr: 4.803 ± 0.78
ThrVal: 5.383 ± 0.529
ThrTrp: 1.319 ± 0.229
ThrTyr: 1.372 ± 0.244
ThrXaa: 0.0 ± 0.0
Val
ValAla: 7.283 ± 0.633
ValCys: 0.897 ± 0.302
ValAsp: 4.803 ± 0.669
ValGlu: 5.225 ± 0.596
ValPhe: 1.742 ± 0.346
ValGly: 7.178 ± 0.748
ValHis: 1.319 ± 0.3
ValIle: 4.645 ± 0.781
ValLys: 2.692 ± 0.334
ValLeu: 6.439 ± 0.63
ValMet: 1.531 ± 0.291
ValAsn: 2.428 ± 0.358
ValPro: 3.853 ± 0.31
ValGln: 2.164 ± 0.304
ValArg: 6.228 ± 0.584
ValSer: 4.117 ± 0.537
ValThr: 5.647 ± 0.749
ValVal: 5.489 ± 0.67
ValTrp: 2.164 ± 0.499
ValTyr: 1.583 ± 0.299
ValXaa: 0.0 ± 0.0
Trp
TrpAla: 2.375 ± 0.364
TrpCys: 0.264 ± 0.11
TrpAsp: 1.372 ± 0.238
TrpGlu: 1.267 ± 0.225
TrpPhe: 0.792 ± 0.201
TrpGly: 1.056 ± 0.244
TrpHis: 0.158 ± 0.083
TrpIle: 0.369 ± 0.13
TrpLys: 0.475 ± 0.163
TrpLeu: 1.636 ± 0.23
TrpMet: 0.264 ± 0.092
TrpAsn: 0.897 ± 0.285
TrpPro: 1.003 ± 0.254
TrpGln: 0.897 ± 0.268
TrpArg: 1.267 ± 0.24
TrpSer: 1.689 ± 0.307
TrpThr: 1.478 ± 0.311
TrpVal: 1.794 ± 0.28
TrpTrp: 0.581 ± 0.203
TrpTyr: 0.581 ± 0.135
TrpXaa: 0.0 ± 0.0
Tyr
TyrAla: 2.586 ± 0.489
TyrCys: 0.211 ± 0.105
TyrAsp: 1.583 ± 0.306
TyrGlu: 0.95 ± 0.295
TyrPhe: 0.528 ± 0.161
TyrGly: 2.533 ± 0.356
TyrHis: 0.422 ± 0.136
TyrIle: 0.739 ± 0.175
TyrLys: 0.264 ± 0.11
TyrLeu: 2.269 ± 0.401
TyrMet: 0.633 ± 0.141
TyrAsn: 0.633 ± 0.168
TyrPro: 1.742 ± 0.265
TyrGln: 0.792 ± 0.195
TyrArg: 2.006 ± 0.406
TyrSer: 1.372 ± 0.303
TyrThr: 1.425 ± 0.257
TyrVal: 2.111 ± 0.352
TyrTrp: 0.581 ± 0.179
TyrTyr: 0.528 ± 0.175
TyrXaa: 0.0 ± 0.0
Xaa
XaaAla: 0.0 ± 0.0
XaaCys: 0.0 ± 0.0
XaaAsp: 0.0 ± 0.0
XaaGlu: 0.0 ± 0.0
XaaPhe: 0.0 ± 0.0
XaaGly: 0.0 ± 0.0
XaaHis: 0.0 ± 0.0
XaaIle: 0.0 ± 0.0
XaaLys: 0.0 ± 0.0
XaaLeu: 0.0 ± 0.0
XaaMet: 0.0 ± 0.0
XaaAsn: 0.0 ± 0.0
XaaPro: 0.0 ± 0.0
XaaGln: 0.0 ± 0.0
XaaArg: 0.0 ± 0.0
XaaSer: 0.0 ± 0.0
XaaThr: 0.0 ± 0.0
XaaVal: 0.0 ± 0.0
XaaTrp: 0.0 ± 0.0
XaaTyr: 0.0 ± 0.0
XaaXaa: 0.0 ± 0.0
Statistics based on 82 proteins (18948 amino acids)

Note: The error was estimated by bootstrapping (100 replicates) at the protein level.
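A protein-level bootstrap of this kind can be sketched in Python as follows: whole proteins are resampled with replacement, the frequency is recomputed for each resample, and the spread of the estimates is taken as the error. The function names, the fixed random seed, the toy sequences, and the use of the sample standard deviation as the reported error are assumptions for illustration; the page only states that 100 bootstrap replicates were drawn at the protein level.

```python
import random
import statistics
from collections import Counter

def pair_permille(proteins, pair):
    """Frequency of one dipeptide (one-letter codes) in per mille."""
    counts = Counter()
    for seq in proteins:
        counts.update(seq[i:i + 2] for i in range(len(seq) - 1))
    total = sum(counts.values())
    return 1000.0 * counts[pair] / total if total else 0.0

def bootstrap_error(proteins, pair, n_boot=100, seed=0):
    """Standard deviation of the frequency over protein-level resamples."""
    rng = random.Random(seed)
    estimates = []
    for _ in range(n_boot):
        # Resample whole proteins with replacement, keeping the proteome size.
        sample = rng.choices(proteins, k=len(proteins))
        estimates.append(pair_permille(sample, pair))
    return statistics.stdev(estimates)

# Toy proteome (not real phage proteins)
toy_proteome = ["MKVAAL", "AAVGG", "MLLKAA", "GAVAA"]
error = bootstrap_error(toy_proteome, "AA")
```

Resampling proteins rather than individual residues keeps each sequence intact, so pairs that cluster in a few proteins produce a correspondingly larger error estimate.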

The dipeptide statistics above (among other statistics for this proteome) can be downloaded from this CSV file
See this proteome in: uniprot_link
Proteome-pI is available under a Creative Commons Attribution-NoDerivs license; for more details see here

Reference: Kozlowski LP. Proteome-pI 2.0: Proteome Isoelectric Point Database Update. Nucleic Acids Res. 2021, doi: 10.1093/nar/gkab944
Contact: Lukasz P. Kozlowski