Amino acid dipeptide frequency for Mycobacterium phage AlleyCat

Apart from single amino acid frequencies, one can also calculate the so-called amino acid dipeptide frequency, i.e. how often each ordered pair of adjacent amino acids occurs.
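As a rough illustration of how such frequencies can be obtained, the sketch below counts overlapping pairs of adjacent residues within each protein and expresses them in per mille. The function names and the toy sequences are hypothetical, one-letter codes are used instead of the three-letter codes in the table, and the assumption that pairs overlap and never span two proteins is ours; the page does not state how Proteome-pI counts them.

```python
from collections import Counter

def dipeptide_counts(proteins):
    """Count overlapping dipeptides within each protein sequence."""
    counts = Counter()
    for seq in proteins:
        # zip pairs residue i with residue i+1, so windows overlap by one
        # and never cross the boundary between two proteins.
        counts.update(a + b for a, b in zip(seq, seq[1:]))
    return counts

def per_mille(counts):
    """Convert raw counts into per mille of all observed dipeptides."""
    total = sum(counts.values())
    return {dp: 1000.0 * n / total for dp, n in counts.items()}

# Toy input (not real AlleyCat proteins).
toy_proteins = ["AAGA", "GAL"]
counts = dipeptide_counts(toy_proteins)
freqs = per_mille(counts)
```

With the toy input, five dipeptides are observed in total, two of which are "GA".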

There are 400 possible dipeptides (441 if Xaa is counted as an additional 21st amino acid representing all non-standard or unknown amino acids). Thus, if dipeptides occurred randomly in proteins, with all 20 amino acids equally frequent, each dipeptide would be present with a frequency of 0.25% (2.5‰). This is not the case in nature, so for better visibility dipeptides that are more frequent than expected are marked in red in the table, and those that are underrepresented are marked in blue.
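The expected value can be worked out directly. The sketch below computes the uniform expectation in per mille and labels a few values taken from the table. The `classify` helper is hypothetical, and comparing against 2.5‰ with no tolerance is an assumption; the page does not state the exact rule used for colouring.

```python
N_STANDARD = 20

# 20 * 20 = 400 ordered pairs, so each is expected at 1000/400 per mille.
expected_per_mille = 1000 / N_STANDARD**2

def classify(observed_per_mille, expected=expected_per_mille):
    """Label a dipeptide relative to the uniform expectation."""
    if observed_per_mille > expected:
        return "over"
    if observed_per_mille < expected:
        return "under"
    return "as expected"

# A few values copied from the table (per mille).
examples = {"AlaAla": 20.548, "AlaGly": 12.772, "CysCys": 0.051}
labels = {dp: classify(value) for dp, value in examples.items()}
```

Under this simple rule AlaAla and AlaGly come out as overrepresented and CysCys as underrepresented.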

All values are given in per mille (‰), so they need to be multiplied by 10^-3 to obtain fractions; for example, 20.548‰ corresponds to 0.020548.

For more information, see the sequence space article on Wikipedia.

Ala Cys Asp Glu Phe Gly His Ile Lys Leu Met Asn Pro Gln Arg Ser Thr Val Trp Tyr Xaa
Ala
AlaAla: 20.548 ± 1.598
AlaCys: 1.133 ± 0.262
AlaAsp: 8.343 ± 0.563
AlaGlu: 7.673 ± 0.743
AlaPhe: 3.09 ± 0.611
AlaGly: 12.772 ± 0.882
AlaHis: 2.523 ± 0.359
AlaIle: 4.635 ± 0.49
AlaLys: 3.811 ± 0.489
AlaLeu: 10.66 ± 0.773
AlaMet: 2.987 ± 0.472
AlaAsn: 2.781 ± 0.377
AlaPro: 6.077 ± 0.561
AlaGln: 4.377 ± 0.425
AlaArg: 8.394 ± 0.698
AlaSer: 5.304 ± 0.606
AlaThr: 6.386 ± 0.652
AlaVal: 10.403 ± 0.919
AlaTrp: 2.472 ± 0.371
AlaTyr: 2.678 ± 0.358
AlaXaa: 0.0 ± 0.0
Cys
CysAla: 1.184 ± 0.236
CysCys: 0.051 ± 0.051
CysAsp: 0.669 ± 0.18
CysGlu: 0.412 ± 0.166
CysPhe: 0.154 ± 0.099
CysGly: 1.339 ± 0.369
CysHis: 0.36 ± 0.158
CysIle: 0.412 ± 0.155
CysLys: 0.463 ± 0.129
CysLeu: 0.36 ± 0.122
CysMet: 0.206 ± 0.14
CysAsn: 0.309 ± 0.11
CysPro: 0.618 ± 0.189
CysGln: 0.566 ± 0.194
CysArg: 0.463 ± 0.154
CysSer: 0.824 ± 0.242
CysThr: 0.824 ± 0.246
CysVal: 0.669 ± 0.218
CysTrp: 0.412 ± 0.126
CysTyr: 0.257 ± 0.136
CysXaa: 0.0 ± 0.0
Asp
AspAla: 8.858 ± 0.788
AspCys: 1.03 ± 0.296
AspAsp: 5.819 ± 0.585
AspGlu: 5.459 ± 0.658
AspPhe: 1.184 ± 0.233
AspGly: 6.283 ± 0.608
AspHis: 1.133 ± 0.182
AspIle: 2.214 ± 0.297
AspLys: 1.802 ± 0.312
AspLeu: 6.283 ± 0.495
AspMet: 1.442 ± 0.384
AspAsn: 1.905 ± 0.278
AspPro: 3.656 ± 0.444
AspGln: 1.648 ± 0.275
AspArg: 5.356 ± 0.621
AspSer: 2.06 ± 0.338
AspThr: 3.914 ± 0.444
AspVal: 5.098 ± 0.379
AspTrp: 1.339 ± 0.224
AspTyr: 1.442 ± 0.281
AspXaa: 0.0 ± 0.0
Glu
GluAla: 6.798 ± 0.815
GluCys: 0.618 ± 0.16
GluAsp: 3.296 ± 0.393
GluGlu: 2.678 ± 0.434
GluPhe: 1.493 ± 0.279
GluGly: 3.399 ± 0.468
GluHis: 2.317 ± 0.374
GluIle: 1.596 ± 0.347
GluLys: 1.596 ± 0.261
GluLeu: 5.716 ± 0.596
GluMet: 1.493 ± 0.239
GluAsn: 0.927 ± 0.212
GluPro: 3.09 ± 0.48
GluGln: 2.472 ± 0.423
GluArg: 5.562 ± 0.745
GluSer: 2.884 ± 0.408
GluThr: 2.472 ± 0.341
GluVal: 4.171 ± 0.404
GluTrp: 0.978 ± 0.164
GluTyr: 1.493 ± 0.248
GluXaa: 0.0 ± 0.0
Phe
PheAla: 2.781 ± 0.441
PheCys: 0.154 ± 0.088
PheAsp: 2.729 ± 0.343
PheGlu: 1.596 ± 0.242
PhePhe: 0.566 ± 0.171
PheGly: 2.935 ± 0.4
PheHis: 0.618 ± 0.199
PheIle: 1.081 ± 0.18
PheLys: 0.978 ± 0.255
PheLeu: 1.699 ± 0.261
PheMet: 0.463 ± 0.152
PheAsn: 0.875 ± 0.212
PhePro: 1.184 ± 0.284
PheGln: 0.618 ± 0.177
PheArg: 1.802 ± 0.301
PheSer: 1.545 ± 0.231
PheThr: 2.111 ± 0.336
PheVal: 1.596 ± 0.279
PheTrp: 0.309 ± 0.129
PheTyr: 0.515 ± 0.138
PheXaa: 0.0 ± 0.0
Gly
GlyAla: 9.115 ± 0.879
GlyCys: 1.133 ± 0.256
GlyAsp: 6.437 ± 0.614
GlyGlu: 4.944 ± 0.509
GlyPhe: 2.317 ± 0.309
GlyGly: 8.652 ± 1.296
GlyHis: 1.648 ± 0.339
GlyIle: 4.017 ± 0.388
GlyLys: 3.296 ± 0.392
GlyLeu: 5.562 ± 0.788
GlyMet: 2.369 ± 0.431
GlyAsn: 3.296 ± 0.398
GlyPro: 4.017 ± 0.472
GlyGln: 2.626 ± 0.362
GlyArg: 6.901 ± 0.57
GlySer: 4.686 ± 0.645
GlyThr: 6.386 ± 0.5
GlyVal: 6.128 ± 0.519
GlyTrp: 1.905 ± 0.29
GlyTyr: 2.008 ± 0.373
GlyXaa: 0.0 ± 0.0
His
HisAla: 2.42 ± 0.429
HisCys: 0.463 ± 0.16
HisAsp: 1.596 ± 0.295
HisGlu: 1.287 ± 0.25
HisPhe: 0.566 ± 0.194
HisGly: 2.214 ± 0.394
HisHis: 0.772 ± 0.181
HisIle: 0.772 ± 0.172
HisLys: 0.824 ± 0.232
HisLeu: 2.214 ± 0.327
HisMet: 0.463 ± 0.154
HisAsn: 0.669 ± 0.182
HisPro: 1.545 ± 0.273
HisGln: 0.721 ± 0.248
HisArg: 1.854 ± 0.311
HisSer: 1.03 ± 0.207
HisThr: 1.133 ± 0.251
HisVal: 1.802 ± 0.295
HisTrp: 0.36 ± 0.161
HisTyr: 0.618 ± 0.186
HisXaa: 0.0 ± 0.0
Ile
IleAla: 5.201 ± 0.5
IleCys: 0.154 ± 0.097
IleAsp: 3.965 ± 0.411
IleGlu: 2.729 ± 0.511
IlePhe: 0.721 ± 0.193
IleGly: 4.017 ± 0.413
IleHis: 0.824 ± 0.21
IleIle: 1.287 ± 0.236
IleLys: 1.39 ± 0.286
IleLeu: 2.575 ± 0.342
IleMet: 0.412 ± 0.139
IleAsn: 1.648 ± 0.264
IlePro: 1.957 ± 0.379
IleGln: 0.875 ± 0.202
IleArg: 2.935 ± 0.413
IleSer: 2.317 ± 0.343
IleThr: 2.678 ± 0.334
IleVal: 3.347 ± 0.456
IleTrp: 0.566 ± 0.197
IleTyr: 0.721 ± 0.2
IleXaa: 0.0 ± 0.0
Lys
LysAla: 3.914 ± 0.499
LysCys: 0.463 ± 0.175
LysAsp: 1.081 ± 0.263
LysGlu: 0.875 ± 0.167
LysPhe: 1.133 ± 0.278
LysGly: 2.06 ± 0.3
LysHis: 1.236 ± 0.308
LysIle: 1.236 ± 0.309
LysLys: 0.824 ± 0.193
LysLeu: 4.12 ± 0.459
LysMet: 0.875 ± 0.202
LysAsn: 1.133 ± 0.286
LysPro: 2.626 ± 0.523
LysGln: 1.287 ± 0.317
LysArg: 2.781 ± 0.384
LysSer: 1.699 ± 0.292
LysThr: 2.266 ± 0.343
LysVal: 2.729 ± 0.456
LysTrp: 0.669 ± 0.202
LysTyr: 0.566 ± 0.153
LysXaa: 0.0 ± 0.0
Leu
LeuAla: 11.278 ± 0.602
LeuCys: 0.978 ± 0.203
LeuAsp: 5.716 ± 0.498
LeuGlu: 3.193 ± 0.371
LeuPhe: 2.317 ± 0.293
LeuGly: 6.18 ± 0.559
LeuHis: 1.648 ± 0.271
LeuIle: 2.884 ± 0.457
LeuLys: 2.935 ± 0.412
LeuLeu: 5.047 ± 0.487
LeuMet: 1.287 ± 0.236
LeuAsn: 2.111 ± 0.339
LeuPro: 4.635 ± 0.522
LeuGln: 3.347 ± 0.428
LeuArg: 6.798 ± 0.755
LeuSer: 4.326 ± 0.558
LeuThr: 5.562 ± 0.565
LeuVal: 5.356 ± 0.532
LeuTrp: 1.545 ± 0.296
LeuTyr: 1.699 ± 0.258
LeuXaa: 0.0 ± 0.0
Met
MetAla: 1.957 ± 0.247
MetCys: 0.206 ± 0.113
MetAsp: 0.772 ± 0.19
MetGlu: 0.618 ± 0.152
MetPhe: 0.824 ± 0.223
MetGly: 1.648 ± 0.299
MetHis: 0.412 ± 0.148
MetIle: 0.875 ± 0.194
MetLys: 0.463 ± 0.161
MetLeu: 2.369 ± 0.341
MetMet: 0.309 ± 0.116
MetAsn: 0.927 ± 0.22
MetPro: 1.39 ± 0.241
MetGln: 0.721 ± 0.244
MetArg: 1.339 ± 0.257
MetSer: 2.008 ± 0.322
MetThr: 2.214 ± 0.296
MetVal: 1.339 ± 0.279
MetTrp: 0.463 ± 0.161
MetTyr: 0.463 ± 0.182
MetXaa: 0.0 ± 0.0
Asn
AsnAla: 3.759 ± 0.548
AsnCys: 0.206 ± 0.092
AsnAsp: 1.802 ± 0.296
AsnGlu: 1.081 ± 0.191
AsnPhe: 0.463 ± 0.194
AsnGly: 3.708 ± 0.414
AsnHis: 0.669 ± 0.143
AsnIle: 0.875 ± 0.249
AsnLys: 0.978 ± 0.239
AsnLeu: 1.854 ± 0.401
AsnMet: 0.566 ± 0.17
AsnAsn: 0.875 ± 0.21
AsnPro: 2.832 ± 0.352
AsnGln: 0.978 ± 0.21
AsnArg: 1.648 ± 0.293
AsnSer: 1.133 ± 0.239
AsnThr: 2.06 ± 0.369
AsnVal: 2.317 ± 0.378
AsnTrp: 0.566 ± 0.154
AsnTyr: 0.566 ± 0.143
AsnXaa: 0.0 ± 0.0
Pro
ProAla: 8.703 ± 0.665
ProCys: 0.36 ± 0.151
ProAsp: 3.914 ± 0.543
ProGlu: 3.296 ± 0.346
ProPhe: 1.339 ± 0.28
ProGly: 4.532 ± 0.461
ProHis: 1.184 ± 0.262
ProIle: 2.626 ± 0.293
ProLys: 2.678 ± 0.512
ProLeu: 3.656 ± 0.438
ProMet: 1.236 ± 0.286
ProAsn: 1.493 ± 0.327
ProPro: 2.884 ± 0.412
ProGln: 2.06 ± 0.385
ProArg: 3.347 ± 0.518
ProSer: 2.42 ± 0.306
ProThr: 3.399 ± 0.414
ProVal: 4.017 ± 0.414
ProTrp: 0.875 ± 0.244
ProTyr: 1.236 ± 0.219
ProXaa: 0.0 ± 0.0
Gln
GlnAla: 4.995 ± 0.496
GlnCys: 0.206 ± 0.1
GlnAsp: 1.236 ± 0.264
GlnGlu: 1.802 ± 0.256
GlnPhe: 0.927 ± 0.184
GlnGly: 1.699 ± 0.272
GlnHis: 0.772 ± 0.219
GlnIle: 1.854 ± 0.269
GlnLys: 1.081 ± 0.237
GlnLeu: 3.244 ± 0.532
GlnMet: 1.184 ± 0.248
GlnAsn: 0.721 ± 0.244
GlnPro: 2.266 ± 0.306
GlnGln: 1.493 ± 0.316
GlnArg: 3.45 ± 0.329
GlnSer: 1.802 ± 0.275
GlnThr: 1.648 ± 0.257
GlnVal: 2.678 ± 0.451
GlnTrp: 0.927 ± 0.202
GlnTyr: 0.927 ± 0.217
GlnXaa: 0.0 ± 0.0
Arg
ArgAla: 8.085 ± 0.681
ArgCys: 0.927 ± 0.24
ArgAsp: 4.686 ± 0.545
ArgGlu: 4.326 ± 0.532
ArgPhe: 2.214 ± 0.292
ArgGly: 5.922 ± 0.536
ArgHis: 1.854 ± 0.338
ArgIle: 4.171 ± 0.353
ArgLys: 2.884 ± 0.386
ArgLeu: 6.025 ± 0.566
ArgMet: 1.699 ± 0.284
ArgAsn: 2.369 ± 0.316
ArgPro: 3.965 ± 0.541
ArgGln: 3.09 ± 0.349
ArgArg: 7.158 ± 0.749
ArgSer: 3.193 ± 0.458
ArgThr: 4.532 ± 0.507
ArgVal: 5.819 ± 0.664
ArgTrp: 1.751 ± 0.282
ArgTyr: 1.751 ± 0.32
ArgXaa: 0.0 ± 0.0
Ser
SerAla: 5.768 ± 0.613
SerCys: 0.566 ± 0.168
SerAsp: 2.987 ± 0.359
SerGlu: 3.193 ± 0.438
SerPhe: 1.081 ± 0.241
SerGly: 4.635 ± 0.621
SerHis: 0.927 ± 0.214
SerIle: 2.214 ± 0.355
SerLys: 1.339 ± 0.23
SerLeu: 2.987 ± 0.491
SerMet: 1.493 ± 0.285
SerAsn: 1.545 ± 0.309
SerPro: 2.523 ± 0.413
SerGln: 1.854 ± 0.296
SerArg: 3.09 ± 0.401
SerSer: 2.935 ± 0.434
SerThr: 4.068 ± 0.476
SerVal: 3.553 ± 0.489
SerTrp: 1.648 ± 0.298
SerTyr: 1.133 ± 0.21
SerXaa: 0.0 ± 0.0
Thr
ThrAla: 8.446 ± 0.662
ThrCys: 0.669 ± 0.23
ThrAsp: 3.399 ± 0.403
ThrGlu: 3.244 ± 0.409
ThrPhe: 2.008 ± 0.294
ThrGly: 5.922 ± 0.515
ThrHis: 1.596 ± 0.323
ThrIle: 2.987 ± 0.398
ThrLys: 2.369 ± 0.38
ThrLeu: 4.583 ± 0.53
ThrMet: 0.824 ± 0.216
ThrAsn: 1.802 ± 0.31
ThrPro: 3.502 ± 0.461
ThrGln: 1.648 ± 0.302
ThrArg: 4.274 ± 0.472
ThrSer: 2.987 ± 0.491
ThrThr: 4.377 ± 0.514
ThrVal: 6.952 ± 0.596
ThrTrp: 1.287 ± 0.248
ThrTyr: 1.287 ± 0.231
ThrXaa: 0.0 ± 0.0
Val
ValAla: 8.446 ± 0.763
ValCys: 0.772 ± 0.23
ValAsp: 7.055 ± 0.806
ValGlu: 4.48 ± 0.554
ValPhe: 2.472 ± 0.337
ValGly: 5.974 ± 0.612
ValHis: 2.266 ± 0.343
ValIle: 3.09 ± 0.399
ValLys: 2.987 ± 0.402
ValLeu: 5.613 ± 0.653
ValMet: 1.39 ± 0.212
ValAsn: 2.214 ± 0.332
ValPro: 4.223 ± 0.578
ValGln: 2.678 ± 0.349
ValArg: 5.562 ± 0.529
ValSer: 3.965 ± 0.442
ValThr: 4.944 ± 0.568
ValVal: 6.489 ± 0.638
ValTrp: 1.184 ± 0.215
ValTyr: 1.493 ± 0.254
ValXaa: 0.0 ± 0.0
Trp
TrpAla: 2.317 ± 0.353
TrpCys: 0.206 ± 0.091
TrpAsp: 1.184 ± 0.281
TrpGlu: 0.875 ± 0.181
TrpPhe: 0.669 ± 0.181
TrpGly: 1.236 ± 0.294
TrpHis: 0.36 ± 0.119
TrpIle: 0.669 ± 0.186
TrpLys: 0.412 ± 0.135
TrpLeu: 2.575 ± 0.386
TrpMet: 0.309 ± 0.12
TrpAsn: 0.721 ± 0.174
TrpPro: 1.081 ± 0.262
TrpGln: 0.927 ± 0.255
TrpArg: 1.905 ± 0.3
TrpSer: 1.081 ± 0.25
TrpThr: 1.442 ± 0.24
TrpVal: 1.236 ± 0.258
TrpTrp: 0.309 ± 0.13
TrpTyr: 0.463 ± 0.131
TrpXaa: 0.0 ± 0.0
Tyr
TyrAla: 2.472 ± 0.37
TyrCys: 0.257 ± 0.118
TyrAsp: 1.236 ± 0.309
TyrGlu: 1.287 ± 0.263
TyrPhe: 0.618 ± 0.155
TyrGly: 2.317 ± 0.357
TyrHis: 0.206 ± 0.103
TyrIle: 0.824 ± 0.22
TyrLys: 0.463 ± 0.166
TyrLeu: 1.699 ± 0.336
TyrMet: 0.257 ± 0.09
TyrAsn: 0.618 ± 0.168
TyrPro: 1.236 ± 0.209
TyrGln: 0.927 ± 0.243
TyrArg: 1.802 ± 0.266
TyrSer: 1.339 ± 0.282
TyrThr: 1.751 ± 0.33
TyrVal: 1.596 ± 0.294
TyrTrp: 0.412 ± 0.123
TyrTyr: 0.566 ± 0.217
TyrXaa: 0.0 ± 0.0
Xaa
XaaAla: 0.0 ± 0.0
XaaCys: 0.0 ± 0.0
XaaAsp: 0.0 ± 0.0
XaaGlu: 0.0 ± 0.0
XaaPhe: 0.0 ± 0.0
XaaGly: 0.0 ± 0.0
XaaHis: 0.0 ± 0.0
XaaIle: 0.0 ± 0.0
XaaLys: 0.0 ± 0.0
XaaLeu: 0.0 ± 0.0
XaaMet: 0.0 ± 0.0
XaaAsn: 0.0 ± 0.0
XaaPro: 0.0 ± 0.0
XaaGln: 0.0 ± 0.0
XaaArg: 0.0 ± 0.0
XaaSer: 0.0 ± 0.0
XaaThr: 0.0 ± 0.0
XaaVal: 0.0 ± 0.0
XaaTrp: 0.0 ± 0.0
XaaTyr: 0.0 ± 0.0
XaaXaa: 0.0 ± 0.0
Statistics based on 99 proteins (19419 amino acids)

Note: The error has been estimated by bootstrapping (100 replicates) at the protein level.
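A minimal sketch of protein-level bootstrapping is given below. It resamples whole proteins with replacement, recomputes the frequency of one dipeptide for each replicate, and reports the spread of the replicates. The function names, the toy sequences, the one-letter codes, and the choice of the standard deviation as the reported error are all assumptions; the page only states that 100 bootstrap replicates were drawn at the protein level.

```python
import random
from collections import Counter
from statistics import stdev

def dipeptide_per_mille(proteins, dipeptide):
    """Frequency of one dipeptide (per mille) over a list of sequences."""
    counts = Counter()
    for seq in proteins:
        counts.update(a + b for a, b in zip(seq, seq[1:]))
    total = sum(counts.values())
    return 1000.0 * counts[dipeptide] / total if total else 0.0

def bootstrap_error(proteins, dipeptide, n_replicates=100, seed=0):
    """Standard deviation of the frequency over protein-level resamples."""
    rng = random.Random(seed)  # fixed seed keeps the sketch reproducible
    estimates = []
    for _ in range(n_replicates):
        # Draw as many proteins as in the original set, with replacement.
        resample = rng.choices(proteins, k=len(proteins))
        estimates.append(dipeptide_per_mille(resample, dipeptide))
    return stdev(estimates)

# Toy input (not real AlleyCat proteins).
toy_proteins = ["AAGA", "GAL", "AAAA", "GGLA"]
error_aa = bootstrap_error(toy_proteins, "AA")
```

Resampling whole proteins rather than individual residues keeps the adjacency structure inside each sequence intact, which is presumably why the error is estimated at the protein level.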

The dipeptide statistics above (among other statistics for this proteome) can be downloaded as a CSV file.
See this proteome in UniProt.
Proteome-pI is available under the Creative Commons Attribution-NoDerivs license.

Reference: Kozlowski LP. Proteome-pI 2.0: Proteome Isoelectric Point Database Update. Nucleic Acids Res. 2021, doi: 10.1093/nar/gkab944
Contact: Lukasz P. Kozlowski