Amino acid dipeptide frequency for Mycobacterium phage Swish

Apart from single amino acid frequencies, one can also calculate the so-called amino acid dipeptide frequency.

There are 400 possible dipeptides (441 if Xaa is counted as an additional 21st amino acid representing all non-standard or unknown amino acids). Thus, if dipeptides occurred randomly and with equal probability in proteins, each of the 400 dipeptides would be present with a frequency of 0.25% (2.5‰). As this is not the case in nature, for better visibility all dipeptides that are more frequent than expected are marked in red, and those that are underrepresented are marked in blue in the table.
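The counting described above can be sketched in Python. This is a minimal illustration, not the database's own code: the function names `dipeptide_per_mille` and `representation` are hypothetical, the sequences are toy data, and the sketch assumes one-letter codes, overlapping pairs counted within each protein, and normalization by the total number of dipeptides (the exact denominator used for the table is not stated on this page).

```python
from collections import Counter
from itertools import product

# The 20 standard amino acids in one-letter code
# (the table uses the equivalent three-letter codes).
STANDARD = "ACDEFGHIKLMNPQRSTVWY"

def dipeptide_per_mille(sequences):
    """Return {dipeptide: frequency in per mille} for all 441 pairs."""
    counts = Counter()
    for seq in sequences:
        # Map anything outside the 20 standard residues to X (Xaa).
        seq = "".join(aa if aa in STANDARD else "X" for aa in seq.upper())
        # Overlapping pairs are counted within a protein only,
        # never across protein boundaries.
        counts.update(seq[i:i + 2] for i in range(len(seq) - 1))
    total = sum(counts.values())
    alphabet = STANDARD + "X"
    return {
        a + b: (1000.0 * counts[a + b] / total if total else 0.0)
        for a, b in product(alphabet, repeat=2)
    }

# Uniform expectation for 400 dipeptides: 1/400 = 0.25% = 2.5 per mille.
EXPECTED_PER_MILLE = 1000.0 / 400

def representation(freq_per_mille, expected=EXPECTED_PER_MILLE):
    """Classify a frequency relative to the uniform expectation."""
    if freq_per_mille > expected:
        return "over"
    if freq_per_mille < expected:
        return "under"
    return "as expected"

freqs = dipeptide_per_mille(["MAAGL", "AAK"])  # toy sequences, not phage data
print(len(freqs), round(freqs["AA"], 3), representation(freqs["AA"]))
```

With real data, `sequences` would hold the protein sequences of the proteome; the "over" and "under" labels correspond to the red and blue marking described above, assuming the uniform 2.5‰ expectation is the reference.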

All values are given in per mille (‰) and therefore need to be multiplied by 10⁻³ to obtain fractions.
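As a quick worked conversion, the AlaAla value from the table can be turned into a fraction and a percentage, and compared with the uniform expectation of 2.5‰. The helper name `per_mille_to_fraction` is made up for this sketch.

```python
def per_mille_to_fraction(value_per_mille):
    """Convert a per-mille value to a plain fraction (multiply by 10^-3)."""
    return value_per_mille * 1e-3

ala_ala = 18.058                      # AlaAla frequency in per mille, from the table
fraction = per_mille_to_fraction(ala_ala)
percent = fraction * 100              # same value expressed in percent
fold = ala_ala / 2.5                  # ratio to the uniform expectation (2.5 per mille)

print(f"{fraction:.6f} {percent:.4f}% {fold:.2f}x")
# prints: 0.018058 1.8058% 7.22x
```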

For more information, see the sequence space article on Wikipedia.

Ala Cys Asp Glu Phe Gly His Ile Lys Leu Met Asn Pro Gln Arg Ser Thr Val Trp Tyr Xaa
Ala
AlaAla: 18.058 ± 1.408
AlaCys: 1.247 ± 0.256
AlaAsp: 7.205 ± 0.64
AlaGlu: 7.62 ± 0.823
AlaPhe: 2.355 ± 0.388
AlaGly: 10.391 ± 1.13
AlaHis: 2.309 ± 0.366
AlaIle: 4.48 ± 0.652
AlaLys: 3.556 ± 0.427
AlaLeu: 10.576 ± 0.958
AlaMet: 3.233 ± 0.378
AlaAsn: 3.094 ± 0.392
AlaPro: 6.835 ± 0.455
AlaGln: 4.018 ± 0.435
AlaArg: 8.913 ± 0.797
AlaSer: 5.727 ± 0.531
AlaThr: 7.897 ± 0.61
AlaVal: 8.867 ± 0.829
AlaTrp: 1.94 ± 0.292
AlaTyr: 2.817 ± 0.267
AlaXaa: 0.0 ± 0.0
Cys
CysAla: 0.877 ± 0.218
CysCys: 0.046 ± 0.048
CysAsp: 0.97 ± 0.203
CysGlu: 0.831 ± 0.217
CysPhe: 0.369 ± 0.118
CysGly: 1.478 ± 0.297
CysHis: 0.369 ± 0.119
CysIle: 0.416 ± 0.147
CysLys: 0.277 ± 0.122
CysLeu: 0.693 ± 0.209
CysMet: 0.277 ± 0.105
CysAsn: 0.277 ± 0.094
CysPro: 0.924 ± 0.251
CysGln: 0.323 ± 0.115
CysArg: 1.016 ± 0.285
CysSer: 0.508 ± 0.144
CysThr: 0.6 ± 0.167
CysVal: 0.877 ± 0.201
CysTrp: 0.231 ± 0.117
CysTyr: 0.323 ± 0.131
CysXaa: 0.0 ± 0.0
Asp
AspAla: 6.881 ± 0.565
AspCys: 1.016 ± 0.255
AspAsp: 4.664 ± 0.635
AspGlu: 5.219 ± 0.685
AspPhe: 1.663 ± 0.243
AspGly: 6.235 ± 0.664
AspHis: 1.339 ± 0.287
AspIle: 3.048 ± 0.437
AspLys: 1.986 ± 0.341
AspLeu: 5.634 ± 0.429
AspMet: 1.062 ± 0.223
AspAsn: 2.078 ± 0.354
AspPro: 5.126 ± 0.514
AspGln: 1.524 ± 0.244
AspArg: 3.879 ± 0.486
AspSer: 2.725 ± 0.321
AspThr: 3.833 ± 0.452
AspVal: 3.787 ± 0.347
AspTrp: 1.155 ± 0.24
AspTyr: 1.847 ± 0.243
AspXaa: 0.0 ± 0.0
Glu
GluAla: 7.343 ± 0.913
GluCys: 0.647 ± 0.177
GluAsp: 3.695 ± 0.549
GluGlu: 2.171 ± 0.302
GluPhe: 2.078 ± 0.356
GluGly: 4.48 ± 0.704
GluHis: 1.293 ± 0.263
GluIle: 4.018 ± 0.563
GluLys: 1.663 ± 0.271
GluLeu: 6.281 ± 0.564
GluMet: 0.877 ± 0.216
GluAsn: 1.062 ± 0.179
GluPro: 3.325 ± 0.533
GluGln: 2.494 ± 0.361
GluArg: 4.48 ± 0.537
GluSer: 1.616 ± 0.327
GluThr: 3.51 ± 0.43
GluVal: 6.004 ± 0.625
GluTrp: 0.924 ± 0.23
GluTyr: 1.155 ± 0.213
GluXaa: 0.0 ± 0.0
Phe
PheAla: 3.048 ± 0.33
PheCys: 0.277 ± 0.117
PheAsp: 1.524 ± 0.229
PheGlu: 1.247 ± 0.213
PhePhe: 0.416 ± 0.122
PheGly: 2.91 ± 0.453
PheHis: 0.462 ± 0.119
PheIle: 0.647 ± 0.165
PheLys: 1.016 ± 0.194
PheLeu: 1.524 ± 0.293
PheMet: 0.277 ± 0.102
PheAsn: 0.877 ± 0.226
PhePro: 1.385 ± 0.221
PheGln: 0.554 ± 0.156
PheArg: 1.616 ± 0.279
PheSer: 1.155 ± 0.222
PheThr: 1.524 ± 0.269
PheVal: 1.385 ± 0.208
PheTrp: 0.369 ± 0.147
PheTyr: 0.693 ± 0.15
PheXaa: 0.0 ± 0.0
Gly
GlyAla: 9.421 ± 1.444
GlyCys: 1.201 ± 0.289
GlyAsp: 5.496 ± 0.608
GlyGlu: 5.311 ± 0.603
GlyPhe: 1.986 ± 0.335
GlyGly: 13.024 ± 2.735
GlyHis: 1.894 ± 0.364
GlyIle: 4.849 ± 0.416
GlyLys: 3.464 ± 0.476
GlyLeu: 7.759 ± 0.986
GlyMet: 2.032 ± 0.317
GlyAsn: 2.586 ± 0.496
GlyPro: 4.572 ± 0.587
GlyGln: 3.602 ± 0.548
GlyArg: 5.911 ± 0.63
GlySer: 5.403 ± 0.561
GlyThr: 6.743 ± 0.718
GlyVal: 6.881 ± 0.665
GlyTrp: 1.986 ± 0.327
GlyTyr: 2.725 ± 0.373
GlyXaa: 0.0 ± 0.0
His
HisAla: 1.94 ± 0.312
HisCys: 0.416 ± 0.164
HisAsp: 1.201 ± 0.244
HisGlu: 1.478 ± 0.269
HisPhe: 0.462 ± 0.125
HisGly: 1.432 ± 0.253
HisHis: 0.6 ± 0.184
HisIle: 1.432 ± 0.267
HisLys: 0.462 ± 0.15
HisLeu: 1.385 ± 0.294
HisMet: 0.508 ± 0.159
HisAsn: 0.416 ± 0.14
HisPro: 1.062 ± 0.266
HisGln: 0.231 ± 0.095
HisArg: 1.894 ± 0.353
HisSer: 0.877 ± 0.193
HisThr: 1.986 ± 0.343
HisVal: 1.108 ± 0.284
HisTrp: 0.185 ± 0.084
HisTyr: 0.647 ± 0.18
HisXaa: 0.0 ± 0.0
Ile
IleAla: 5.681 ± 0.422
IleCys: 0.554 ± 0.168
IleAsp: 3.741 ± 0.391
IleGlu: 4.757 ± 0.504
IlePhe: 0.924 ± 0.164
IleGly: 5.496 ± 0.551
IleHis: 0.647 ± 0.187
IleIle: 1.616 ± 0.306
IleLys: 1.293 ± 0.332
IleLeu: 2.817 ± 0.389
IleMet: 0.831 ± 0.186
IleAsn: 1.524 ± 0.246
IlePro: 2.771 ± 0.331
IleGln: 0.924 ± 0.228
IleArg: 2.402 ± 0.399
IleSer: 2.309 ± 0.344
IleThr: 3.51 ± 0.373
IleVal: 3.233 ± 0.408
IleTrp: 0.369 ± 0.114
IleTyr: 0.924 ± 0.202
IleXaa: 0.0 ± 0.0
Lys
LysAla: 4.064 ± 0.532
LysCys: 0.185 ± 0.095
LysAsp: 1.524 ± 0.24
LysGlu: 0.831 ± 0.25
LysPhe: 0.6 ± 0.192
LysGly: 2.679 ± 0.373
LysHis: 0.554 ± 0.181
LysIle: 1.709 ± 0.277
LysLys: 0.831 ± 0.239
LysLeu: 2.91 ± 0.369
LysMet: 0.369 ± 0.121
LysAsn: 0.785 ± 0.193
LysPro: 1.432 ± 0.288
LysGln: 1.293 ± 0.213
LysArg: 2.171 ± 0.362
LysSer: 1.339 ± 0.212
LysThr: 2.355 ± 0.345
LysVal: 2.586 ± 0.303
LysTrp: 0.369 ± 0.106
LysTyr: 0.693 ± 0.191
LysXaa: 0.0 ± 0.0
Leu
LeuAla: 11.361 ± 0.972
LeuCys: 1.108 ± 0.249
LeuAsp: 6.189 ± 0.579
LeuGlu: 3.233 ± 0.34
LeuPhe: 1.847 ± 0.369
LeuGly: 7.482 ± 1.152
LeuHis: 1.016 ± 0.215
LeuIle: 4.11 ± 0.409
LeuLys: 2.54 ± 0.43
LeuLeu: 5.311 ± 0.562
LeuMet: 1.524 ± 0.239
LeuAsn: 3.233 ± 0.481
LeuPro: 5.819 ± 0.508
LeuGln: 1.57 ± 0.29
LeuArg: 4.203 ± 0.461
LeuSer: 3.926 ± 0.38
LeuThr: 7.251 ± 0.537
LeuVal: 4.618 ± 0.442
LeuTrp: 1.247 ± 0.275
LeuTyr: 1.616 ± 0.295
LeuXaa: 0.0 ± 0.0
Met
MetAla: 2.586 ± 0.36
MetCys: 0.277 ± 0.108
MetAsp: 0.647 ± 0.158
MetGlu: 0.693 ± 0.131
MetPhe: 0.693 ± 0.185
MetGly: 1.293 ± 0.195
MetHis: 0.462 ± 0.188
MetIle: 0.877 ± 0.184
MetLys: 0.693 ± 0.18
MetLeu: 1.293 ± 0.217
MetMet: 0.462 ± 0.135
MetAsn: 0.462 ± 0.142
MetPro: 1.57 ± 0.249
MetGln: 0.369 ± 0.135
MetArg: 1.432 ± 0.255
MetSer: 2.032 ± 0.298
MetThr: 2.54 ± 0.309
MetVal: 1.339 ± 0.232
MetTrp: 0.462 ± 0.126
MetTyr: 0.508 ± 0.173
MetXaa: 0.0 ± 0.0
Asn
AsnAla: 3.464 ± 0.387
AsnCys: 0.277 ± 0.112
AsnAsp: 2.032 ± 0.352
AsnGlu: 1.201 ± 0.26
AsnPhe: 0.323 ± 0.124
AsnGly: 3.371 ± 0.461
AsnHis: 0.554 ± 0.177
AsnIle: 1.339 ± 0.247
AsnLys: 1.062 ± 0.224
AsnLeu: 1.801 ± 0.29
AsnMet: 0.508 ± 0.116
AsnAsn: 1.155 ± 0.324
AsnPro: 1.894 ± 0.288
AsnGln: 0.369 ± 0.118
AsnArg: 1.986 ± 0.371
AsnSer: 1.801 ± 0.315
AsnThr: 2.679 ± 0.376
AsnVal: 1.755 ± 0.239
AsnTrp: 0.369 ± 0.144
AsnTyr: 0.831 ± 0.191
AsnXaa: 0.0 ± 0.0
Pro
ProAla: 7.574 ± 0.581
ProCys: 0.462 ± 0.154
ProAsp: 4.203 ± 0.566
ProGlu: 5.357 ± 0.671
ProPhe: 1.339 ± 0.224
ProGly: 6.004 ± 0.747
ProHis: 1.293 ± 0.282
ProIle: 2.725 ± 0.392
ProLys: 1.247 ± 0.239
ProLeu: 3.371 ± 0.371
ProMet: 1.247 ± 0.197
ProAsn: 1.986 ± 0.294
ProPro: 4.618 ± 0.606
ProGln: 1.755 ± 0.218
ProArg: 3.972 ± 0.501
ProSer: 2.586 ± 0.358
ProThr: 3.972 ± 0.337
ProVal: 5.219 ± 0.439
ProTrp: 1.339 ± 0.248
ProTyr: 1.062 ± 0.25
ProXaa: 0.0 ± 0.0
Gln
GlnAla: 4.064 ± 0.709
GlnCys: 0.277 ± 0.137
GlnAsp: 1.062 ± 0.23
GlnGlu: 0.831 ± 0.2
GlnPhe: 1.062 ± 0.206
GlnGly: 2.171 ± 0.37
GlnHis: 0.97 ± 0.189
GlnIle: 2.078 ± 0.325
GlnLys: 0.831 ± 0.222
GlnLeu: 2.448 ± 0.328
GlnMet: 0.693 ± 0.155
GlnAsn: 0.739 ± 0.206
GlnPro: 1.663 ± 0.226
GlnGln: 1.663 ± 0.259
GlnArg: 3.002 ± 0.45
GlnSer: 1.755 ± 0.278
GlnThr: 2.309 ± 0.276
GlnVal: 1.847 ± 0.268
GlnTrp: 0.693 ± 0.18
GlnTyr: 0.647 ± 0.216
GlnXaa: 0.0 ± 0.0
Arg
ArgAla: 6.927 ± 0.589
ArgCys: 0.739 ± 0.202
ArgAsp: 4.341 ± 0.476
ArgGlu: 4.849 ± 0.638
ArgPhe: 1.663 ± 0.289
ArgGly: 4.387 ± 0.475
ArgHis: 1.616 ± 0.331
ArgIle: 2.263 ± 0.357
ArgLys: 1.801 ± 0.299
ArgLeu: 5.865 ± 0.477
ArgMet: 1.709 ± 0.366
ArgAsn: 2.448 ± 0.342
ArgPro: 4.064 ± 0.555
ArgGln: 2.679 ± 0.411
ArgArg: 6.558 ± 0.662
ArgSer: 2.956 ± 0.367
ArgThr: 4.156 ± 0.488
ArgVal: 5.588 ± 0.679
ArgTrp: 1.755 ± 0.257
ArgTyr: 2.217 ± 0.305
ArgXaa: 0.0 ± 0.0
Ser
SerAla: 5.219 ± 0.568
SerCys: 0.647 ± 0.18
SerAsp: 3.325 ± 0.312
SerGlu: 2.309 ± 0.369
SerPhe: 0.693 ± 0.188
SerGly: 6.004 ± 0.742
SerHis: 0.785 ± 0.18
SerIle: 1.801 ± 0.325
SerLys: 1.108 ± 0.212
SerLeu: 4.11 ± 0.654
SerMet: 1.293 ± 0.252
SerAsn: 1.062 ± 0.235
SerPro: 3.187 ± 0.393
SerGln: 2.078 ± 0.292
SerArg: 3.464 ± 0.323
SerSer: 2.448 ± 0.379
SerThr: 3.787 ± 0.483
SerVal: 3.879 ± 0.442
SerTrp: 1.016 ± 0.186
SerTyr: 1.57 ± 0.22
SerXaa: 0.0 ± 0.0
Thr
ThrAla: 8.775 ± 0.547
ThrCys: 0.554 ± 0.18
ThrAsp: 5.034 ± 0.481
ThrGlu: 3.602 ± 0.49
ThrPhe: 1.524 ± 0.266
ThrGly: 7.297 ± 0.523
ThrHis: 1.062 ± 0.244
ThrIle: 3.556 ± 0.442
ThrLys: 2.494 ± 0.321
ThrLeu: 5.496 ± 0.481
ThrMet: 1.709 ± 0.282
ThrAsn: 2.032 ± 0.382
ThrPro: 5.219 ± 0.674
ThrGln: 1.986 ± 0.313
ThrArg: 3.648 ± 0.446
ThrSer: 4.249 ± 0.367
ThrThr: 4.295 ± 0.583
ThrVal: 6.096 ± 0.552
ThrTrp: 1.201 ± 0.272
ThrTyr: 1.57 ± 0.251
ThrXaa: 0.0 ± 0.0
Val
ValAla: 8.267 ± 0.495
ValCys: 1.016 ± 0.231
ValAsp: 4.434 ± 0.484
ValGlu: 5.634 ± 0.627
ValPhe: 1.524 ± 0.217
ValGly: 7.435 ± 0.677
ValHis: 1.616 ± 0.27
ValIle: 3.464 ± 0.329
ValLys: 2.124 ± 0.326
ValLeu: 6.466 ± 0.529
ValMet: 1.293 ± 0.268
ValAsn: 1.986 ± 0.299
ValPro: 3.879 ± 0.316
ValGln: 1.847 ± 0.276
ValArg: 4.618 ± 0.492
ValSer: 4.064 ± 0.451
ValThr: 5.219 ± 0.548
ValVal: 5.958 ± 0.652
ValTrp: 1.847 ± 0.302
ValTyr: 1.986 ± 0.392
ValXaa: 0.0 ± 0.0
Trp
TrpAla: 2.217 ± 0.348
TrpCys: 0.416 ± 0.15
TrpAsp: 1.293 ± 0.293
TrpGlu: 0.462 ± 0.133
TrpPhe: 0.785 ± 0.26
TrpGly: 1.016 ± 0.192
TrpHis: 0.462 ± 0.159
TrpIle: 0.924 ± 0.21
TrpLys: 0.277 ± 0.107
TrpLeu: 1.524 ± 0.29
TrpMet: 0.462 ± 0.145
TrpAsn: 0.277 ± 0.117
TrpPro: 0.831 ± 0.215
TrpGln: 0.647 ± 0.196
TrpArg: 1.339 ± 0.222
TrpSer: 1.339 ± 0.235
TrpThr: 1.524 ± 0.282
TrpVal: 1.432 ± 0.345
TrpTrp: 0.416 ± 0.143
TrpTyr: 0.462 ± 0.147
TrpXaa: 0.0 ± 0.0
Tyr
TyrAla: 3.371 ± 0.356
TyrCys: 0.323 ± 0.119
TyrAsp: 2.124 ± 0.261
TyrGlu: 1.478 ± 0.254
TyrPhe: 0.6 ± 0.179
TyrGly: 2.263 ± 0.375
TyrHis: 0.416 ± 0.149
TyrIle: 0.97 ± 0.203
TyrLys: 0.554 ± 0.17
TyrLeu: 1.94 ± 0.382
TyrMet: 0.277 ± 0.128
TyrAsn: 0.693 ± 0.149
TyrPro: 1.155 ± 0.25
TyrGln: 0.877 ± 0.197
TyrArg: 2.078 ± 0.329
TyrSer: 0.97 ± 0.178
TyrThr: 1.709 ± 0.257
TyrVal: 2.263 ± 0.283
TyrTrp: 0.231 ± 0.095
TyrTyr: 0.647 ± 0.174
TyrXaa: 0.0 ± 0.0
Xaa
XaaAla: 0.0 ± 0.0
XaaCys: 0.0 ± 0.0
XaaAsp: 0.0 ± 0.0
XaaGlu: 0.0 ± 0.0
XaaPhe: 0.0 ± 0.0
XaaGly: 0.0 ± 0.0
XaaHis: 0.0 ± 0.0
XaaIle: 0.0 ± 0.0
XaaLys: 0.0 ± 0.0
XaaLeu: 0.0 ± 0.0
XaaMet: 0.0 ± 0.0
XaaAsn: 0.0 ± 0.0
XaaPro: 0.0 ± 0.0
XaaGln: 0.0 ± 0.0
XaaArg: 0.0 ± 0.0
XaaSer: 0.0 ± 0.0
XaaThr: 0.0 ± 0.0
XaaVal: 0.0 ± 0.0
XaaTrp: 0.0 ± 0.0
XaaTyr: 0.0 ± 0.0
XaaXaa: 0.0 ± 0.0
Statistics based on 103 proteins (21654 amino acids)

Note: The error was estimated by bootstrapping (×100) at the protein level.
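A protein-level bootstrap of this kind can be sketched as follows. This is only an illustration under stated assumptions: the function names `dipeptide_freq` and `bootstrap_error` are hypothetical, the sequences are toy data in one-letter code, and the error is taken to be the standard deviation over 100 resamples (the page does not state which spread measure the database reports).

```python
import random
import statistics

def dipeptide_freq(sequences, dipeptide):
    """Per-mille frequency of one dipeptide among all overlapping pairs."""
    hits = total = 0
    for seq in sequences:
        for i in range(len(seq) - 1):
            total += 1
            hits += seq[i:i + 2] == dipeptide
    return 1000.0 * hits / total if total else 0.0

def bootstrap_error(sequences, dipeptide, n_boot=100, seed=0):
    """Spread of the frequency over n_boot protein-level resamples."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    estimates = []
    for _ in range(n_boot):
        # Resample whole proteins with replacement; residues inside a
        # protein are never shuffled or resampled individually.
        sample = rng.choices(sequences, k=len(sequences))
        estimates.append(dipeptide_freq(sample, dipeptide))
    # Assumption: the reported error is the sample standard deviation.
    return statistics.stdev(estimates)

proteins = ["MAAGL", "AAK", "GLAA", "MKV"]  # toy sequences, not phage data
print(dipeptide_freq(proteins, "AA"), bootstrap_error(proteins, "AA"))
```

Resampling at the protein level keeps each protein intact, so the error reflects how much the estimate depends on which proteins happen to be in the proteome.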

The dipeptide statistics above (among other statistics for this proteome) can be downloaded from this CSV file
See this proteome in: uniprot_link
Proteome-pI is available under Creative Commons Attribution-NoDerivs license, for more details see here

Reference: Kozlowski LP. Proteome-pI 2.0: Proteome Isoelectric Point Database Update. Nucleic Acids Res. 2021, doi: 10.1093/nar/gkab944
Contact: Lukasz P. Kozlowski