Amino acid dipeptide frequency for Mycobacterium phage Aminay

Apart from single amino acid frequencies, one can also calculate the so-called amino acid dipeptide frequency, that is, how often each ordered pair of adjacent amino acids occurs.
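As an illustration of the idea, the following minimal sketch counts overlapping adjacent pairs within each protein and reports them in per mille. The function name `dipeptide_frequencies`, the one-letter alphabet, and the choice to normalise by the total number of dipeptides are assumptions made for this example; the database may count or normalise differently.

```python
from collections import Counter
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # 20 standard residues, one-letter codes


def dipeptide_frequencies(proteins):
    """Return dipeptide frequencies in per mille over a list of sequences.

    Overlapping pairs are counted within each protein; a pair never spans
    two proteins. Any non-standard letter is mapped to 'X' (Xaa).
    Normalising by the total number of dipeptides is an assumption.
    """
    counts = Counter()
    for seq in proteins:
        seq = "".join(c if c in AMINO_ACIDS else "X" for c in seq.upper())
        counts.update(seq[i:i + 2] for i in range(len(seq) - 1))
    total = sum(counts.values())
    alphabet = AMINO_ACIDS + "X"
    return {
        a + b: (1000.0 * counts[a + b] / total if total else 0.0)
        for a, b in product(alphabet, repeat=2)
    }


# Toy input (hypothetical sequences, not taken from the proteome).
freqs = dipeptide_frequencies(["MAAGK", "AAL"])
```

The result always has 441 keys (21 x 21), so dipeptides that never occur are reported as 0.0 rather than being missing.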

There are 400 possible dipeptides (441 if Xaa is counted as an additional 21st amino acid representing all non-standard or unknown amino acids). Thus, if dipeptides occurred in proteins at random with equal probability, each dipeptide would be present at a frequency of 0.25% (2.5‰). Since this is not the case in nature, for better visibility dipeptides that are more frequent than expected are marked in red, and those that are underrepresented are marked in blue in the table.
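The uniform expectation and the over/under-representation call can be sketched as follows. The function names are hypothetical, and the sketch assumes the comparison is made against the plain uniform baseline (1/400, or 1/441 with Xaa) without taking the error estimate into account; the page does not state the exact rule used for the colouring.

```python
def expected_uniform_permille(n_residue_types=20):
    """Expected frequency (per mille) of each dipeptide if all were equally likely."""
    return 1000.0 / n_residue_types ** 2


def classify(observed_permille, n_residue_types=20):
    """Label a dipeptide relative to the uniform expectation (assumed rule)."""
    expected = expected_uniform_permille(n_residue_types)
    if observed_permille > expected:
        return "over"   # would be shown in red
    if observed_permille < expected:
        return "under"  # would be shown in blue
    return "expected"


baseline_20 = expected_uniform_permille()    # 1000 / 400
baseline_21 = expected_uniform_permille(21)  # 1000 / 441
ala_ala = classify(20.446)  # AlaAla value from the table
cys_cys = classify(0.158)   # CysCys value from the table
```

With the 20-residue baseline of 2.5‰, AlaAla (20.446‰) comes out as over-represented and CysCys (0.158‰) as under-represented.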

All values are given in per mille (‰) and must therefore be multiplied by 10⁻³ to obtain fractions.

For more information, see the sequence space article on Wikipedia.

| 1st / 2nd | Ala | Cys | Asp | Glu | Phe | Gly | His | Ile | Lys | Leu | Met | Asn | Pro | Gln | Arg | Ser | Thr | Val | Trp | Tyr | Xaa |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Ala | 20.446 ± 1.765 | 0.946 ± 0.261 | 7.884 ± 0.738 | 8.725 ± 0.832 | 3.416 ± 0.502 | 15.137 ± 1.381 | 1.997 ± 0.313 | 6.097 ± 0.594 | 3.469 ± 0.518 | 9.934 ± 0.738 | 3.101 ± 0.416 | 2.628 ± 0.31 | 6.57 ± 0.789 | 5.203 ± 0.591 | 9.986 ± 0.959 | 6.044 ± 0.555 | 7.674 ± 0.628 | 9.986 ± 0.89 | 2.681 ± 0.354 | 2.523 ± 0.355 | 0.0 ± 0.0 |
| Cys | 1.051 ± 0.251 | 0.158 ± 0.09 | 0.578 ± 0.174 | 0.368 ± 0.129 | 0.315 ± 0.143 | 1.367 ± 0.31 | 0.105 ± 0.074 | 0.368 ± 0.132 | 0.578 ± 0.237 | 0.578 ± 0.157 | 0.158 ± 0.093 | 0.42 ± 0.127 | 0.736 ± 0.193 | 0.315 ± 0.126 | 1.156 ± 0.391 | 0.736 ± 0.191 | 0.788 ± 0.221 | 0.368 ± 0.157 | 0.315 ± 0.128 | 0.105 ± 0.069 | 0.0 ± 0.0 |
| Asp | 7.779 ± 0.712 | 0.368 ± 0.123 | 5.519 ± 0.643 | 5.992 ± 0.695 | 1.577 ± 0.268 | 6.833 ± 0.579 | 0.999 ± 0.212 | 1.734 ± 0.268 | 1.945 ± 0.379 | 4.52 ± 0.553 | 1.367 ± 0.284 | 1.997 ± 0.366 | 3.679 ± 0.383 | 1.367 ± 0.234 | 4.625 ± 0.638 | 2.838 ± 0.407 | 3.627 ± 0.412 | 3.259 ± 0.365 | 0.946 ± 0.199 | 2.155 ± 0.426 | 0.0 ± 0.0 |
| Glu | 5.887 ± 0.631 | 0.999 ± 0.267 | 2.733 ± 0.361 | 1.209 ± 0.202 | 1.156 ± 0.242 | 2.733 ± 0.339 | 1.629 ± 0.355 | 3.732 ± 0.533 | 1.629 ± 0.275 | 6.99 ± 0.622 | 1.472 ± 0.272 | 1.577 ± 0.273 | 2.733 ± 0.45 | 2.47 ± 0.351 | 4.73 ± 0.624 | 2.155 ± 0.346 | 1.997 ± 0.359 | 4.205 ± 0.507 | 0.894 ± 0.225 | 1.156 ± 0.249 | 0.0 ± 0.0 |
| Phe | 2.733 ± 0.359 | 0.263 ± 0.104 | 2.313 ± 0.282 | 1.682 ± 0.278 | 0.631 ± 0.198 | 3.048 ± 0.446 | 0.473 ± 0.152 | 0.841 ± 0.179 | 0.894 ± 0.217 | 1.577 ± 0.309 | 0.631 ± 0.215 | 0.841 ± 0.179 | 0.999 ± 0.215 | 0.788 ± 0.237 | 1.524 ± 0.321 | 1.472 ± 0.29 | 1.892 ± 0.354 | 1.945 ± 0.336 | 0.736 ± 0.165 | 0.473 ± 0.136 | 0.0 ± 0.0 |
| Gly | 12.089 ± 1.291 | 0.841 ± 0.207 | 5.361 ± 0.495 | 4.362 ± 0.49 | 2.208 ± 0.324 | 10.67 ± 1.972 | 1.945 ± 0.314 | 2.838 ± 0.418 | 3.364 ± 0.415 | 6.675 ± 0.62 | 1.524 ± 0.314 | 3.048 ± 0.428 | 3.995 ± 0.62 | 3.311 ± 0.379 | 7.358 ± 0.745 | 5.046 ± 0.646 | 5.624 ± 0.632 | 6.202 ± 0.716 | 2.418 ± 0.388 | 2.365 ± 0.338 | 0.0 ± 0.0 |
| His | 2.102 ± 0.425 | 0.21 ± 0.099 | 1.104 ± 0.253 | 1.261 ± 0.228 | 0.42 ± 0.155 | 1.84 ± 0.304 | 0.368 ± 0.134 | 0.788 ± 0.227 | 0.578 ± 0.174 | 1.524 ± 0.256 | 0.42 ± 0.157 | 0.526 ± 0.191 | 1.104 ± 0.288 | 0.841 ± 0.201 | 2.05 ± 0.365 | 0.894 ± 0.185 | 0.999 ± 0.24 | 1.84 ± 0.37 | 0.526 ± 0.182 | 0.473 ± 0.148 | 0.0 ± 0.0 |
| Ile | 5.887 ± 0.557 | 0.473 ± 0.148 | 3.679 ± 0.451 | 3.732 ± 0.501 | 0.736 ± 0.189 | 3.889 ± 0.561 | 0.683 ± 0.208 | 1.524 ± 0.32 | 1.577 ± 0.325 | 2.733 ± 0.38 | 0.526 ± 0.174 | 1.682 ± 0.282 | 1.787 ± 0.364 | 0.841 ± 0.224 | 3.364 ± 0.394 | 1.945 ± 0.299 | 2.996 ± 0.404 | 3.627 ± 0.363 | 0.578 ± 0.176 | 0.578 ± 0.168 | 0.0 ± 0.0 |
| Lys | 5.046 ± 0.72 | 0.368 ± 0.138 | 1.524 ± 0.299 | 0.683 ± 0.18 | 0.841 ± 0.164 | 2.26 ± 0.415 | 0.841 ± 0.219 | 1.314 ± 0.232 | 0.946 ± 0.251 | 3.416 ± 0.489 | 0.894 ± 0.199 | 0.736 ± 0.197 | 2.05 ± 0.324 | 1.314 ± 0.291 | 2.418 ± 0.415 | 1.472 ± 0.28 | 1.892 ± 0.27 | 2.628 ± 0.491 | 0.526 ± 0.138 | 1.051 ± 0.22 | 0.0 ± 0.0 |
| Leu | 12.562 ± 0.931 | 0.683 ± 0.213 | 6.675 ± 0.526 | 1.892 ± 0.294 | 2.208 ± 0.298 | 7.463 ± 0.634 | 1.472 ± 0.277 | 2.891 ± 0.479 | 2.365 ± 0.394 | 6.938 ± 0.677 | 1.419 ± 0.275 | 1.472 ± 0.241 | 4.52 ± 0.539 | 2.891 ± 0.36 | 5.309 ± 0.724 | 5.414 ± 0.731 | 5.414 ± 0.624 | 4.888 ± 0.455 | 1.472 ± 0.298 | 1.945 ± 0.334 | 0.0 ± 0.0 |
| Met | 2.681 ± 0.283 | 0.263 ± 0.109 | 0.578 ± 0.223 | 0.526 ± 0.185 | 0.631 ± 0.192 | 1.314 ± 0.303 | 0.473 ± 0.134 | 1.209 ± 0.232 | 0.999 ± 0.187 | 2.05 ± 0.331 | 0.315 ± 0.132 | 1.104 ± 0.179 | 1.524 ± 0.229 | 0.473 ± 0.179 | 1.104 ± 0.239 | 1.682 ± 0.331 | 2.523 ± 0.419 | 1.209 ± 0.308 | 0.263 ± 0.103 | 0.263 ± 0.121 | 0.0 ± 0.0 |
| Asn | 3.995 ± 0.504 | 0.158 ± 0.08 | 1.051 ± 0.196 | 1.209 ± 0.222 | 0.473 ± 0.156 | 3.259 ± 0.396 | 0.683 ± 0.211 | 1.209 ± 0.275 | 1.104 ± 0.246 | 2.102 ± 0.278 | 0.841 ± 0.186 | 0.736 ± 0.18 | 2.155 ± 0.384 | 0.841 ± 0.199 | 1.997 ± 0.446 | 1.261 ± 0.325 | 1.577 ± 0.274 | 2.208 ± 0.34 | 0.526 ± 0.178 | 0.788 ± 0.214 | 0.0 ± 0.0 |
| Pro | 9.986 ± 0.875 | 0.42 ± 0.154 | 3.206 ± 0.391 | 4.362 ± 0.468 | 1.261 ± 0.237 | 4.415 ± 0.535 | 1.787 ± 0.291 | 2.365 ± 0.315 | 1.156 ± 0.232 | 3.364 ± 0.383 | 1.209 ± 0.237 | 0.736 ± 0.209 | 3.995 ± 0.447 | 1.84 ± 0.348 | 2.891 ± 0.432 | 3.259 ± 0.455 | 3.942 ± 0.721 | 3.942 ± 0.528 | 0.894 ± 0.183 | 1.524 ± 0.31 | 0.0 ± 0.0 |
| Gln | 5.624 ± 0.614 | 0.315 ± 0.138 | 1.472 ± 0.342 | 0.894 ± 0.255 | 1.104 ± 0.212 | 2.313 ± 0.324 | 0.841 ± 0.223 | 2.155 ± 0.34 | 1.156 ± 0.258 | 2.733 ± 0.338 | 0.736 ± 0.165 | 0.841 ± 0.199 | 2.102 ± 0.348 | 1.209 ± 0.361 | 2.523 ± 0.445 | 1.156 ± 0.273 | 2.102 ± 0.323 | 2.102 ± 0.317 | 1.051 ± 0.261 | 0.999 ± 0.175 | 0.0 ± 0.0 |
| Arg | 8.462 ± 0.868 | 1.419 ± 0.321 | 4.73 ± 0.556 | 3.416 ± 0.478 | 1.997 ± 0.455 | 5.466 ± 0.517 | 1.629 ± 0.299 | 3.259 ± 0.388 | 2.47 ± 0.453 | 6.412 ± 0.764 | 1.892 ± 0.349 | 2.26 ± 0.325 | 3.837 ± 0.507 | 3.048 ± 0.383 | 7.779 ± 0.932 | 3.574 ± 0.468 | 4.573 ± 0.438 | 5.256 ± 0.572 | 1.682 ± 0.319 | 1.787 ± 0.374 | 0.0 ± 0.0 |
| Ser | 5.571 ± 0.759 | 0.526 ± 0.181 | 2.838 ± 0.296 | 2.733 ± 0.389 | 1.787 ± 0.289 | 4.783 ± 0.644 | 0.736 ± 0.169 | 2.47 ± 0.422 | 1.682 ± 0.384 | 3.259 ± 0.432 | 1.892 ± 0.294 | 1.577 ± 0.292 | 4.047 ± 0.514 | 1.419 ± 0.275 | 3.627 ± 0.413 | 2.575 ± 0.419 | 3.469 ± 0.4 | 4.468 ± 0.767 | 1.261 ± 0.239 | 0.894 ± 0.203 | 0.0 ± 0.0 |
| Thr | 8.567 ± 0.979 | 0.473 ± 0.172 | 3.574 ± 0.442 | 2.681 ± 0.429 | 1.892 ± 0.323 | 5.466 ± 0.613 | 1.261 ± 0.274 | 3.784 ± 0.533 | 2.26 ± 0.345 | 4.73 ± 0.678 | 0.999 ± 0.24 | 2.155 ± 0.315 | 4.52 ± 0.611 | 1.734 ± 0.286 | 3.574 ± 0.418 | 3.889 ± 0.449 | 5.519 ± 0.606 | 5.414 ± 0.449 | 1.472 ± 0.319 | 1.997 ± 0.429 | 0.0 ± 0.0 |
| Val | 8.672 ± 0.714 | 1.104 ± 0.292 | 5.624 ± 0.561 | 4.678 ± 0.536 | 2.05 ± 0.327 | 5.046 ± 0.684 | 1.419 ± 0.346 | 2.733 ± 0.415 | 2.628 ± 0.38 | 6.255 ± 0.54 | 0.894 ± 0.277 | 2.365 ± 0.314 | 4.1 ± 0.463 | 1.419 ± 0.281 | 4.888 ± 0.506 | 3.679 ± 0.449 | 5.519 ± 0.688 | 3.995 ± 0.473 | 1.892 ± 0.427 | 1.84 ± 0.402 | 0.0 ± 0.0 |
| Trp | 2.365 ± 0.311 | 0.263 ± 0.125 | 1.209 ± 0.22 | 0.631 ± 0.181 | 0.736 ± 0.245 | 1.314 ± 0.26 | 0.315 ± 0.107 | 0.788 ± 0.162 | 0.631 ± 0.214 | 2.26 ± 0.369 | 0.315 ± 0.121 | 0.578 ± 0.164 | 0.894 ± 0.176 | 1.104 ± 0.236 | 2.102 ± 0.317 | 1.472 ± 0.305 | 1.892 ± 0.261 | 1.314 ± 0.241 | 0.631 ± 0.203 | 0.473 ± 0.152 | 0.0 ± 0.0 |
| Tyr | 2.943 ± 0.361 | 0.315 ± 0.157 | 1.472 ± 0.272 | 1.051 ± 0.304 | 0.315 ± 0.137 | 2.628 ± 0.476 | 0.21 ± 0.098 | 0.736 ± 0.21 | 0.788 ± 0.192 | 1.892 ± 0.366 | 0.526 ± 0.14 | 0.894 ± 0.175 | 1.209 ± 0.281 | 0.999 ± 0.233 | 1.945 ± 0.313 | 1.051 ± 0.279 | 1.945 ± 0.44 | 1.945 ± 0.343 | 0.526 ± 0.174 | 0.631 ± 0.143 | 0.0 ± 0.0 |
| Xaa | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 |

Statistics based on 105 proteins (19027 amino acids)

Note: The error has been estimated by bootstrapping (100 replicates) at the protein level.
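A protein-level bootstrap of this kind can be sketched as follows: whole proteins are resampled with replacement, the frequency is recomputed for each replicate, and the spread of the replicates is taken as the error. The function names, the fixed seed, and the use of the sample standard deviation as the reported error are assumptions for this example; the page states only that bootstrapping with 100 replicates at the protein level was used.

```python
import random
from statistics import stdev


def dipeptide_permille(proteins, dipeptide):
    """Frequency (per mille) of one dipeptide, counting overlapping pairs."""
    count = total = 0
    for seq in proteins:
        for i in range(len(seq) - 1):
            total += 1
            if seq[i:i + 2] == dipeptide:
                count += 1
    return 1000.0 * count / total if total else 0.0


def bootstrap_error(proteins, dipeptide, n_replicates=100, seed=0):
    """Spread of the frequency over protein-level bootstrap replicates.

    Each replicate draws len(proteins) whole proteins with replacement.
    Using the standard deviation as the error is an assumption.
    """
    rng = random.Random(seed)
    estimates = []
    for _ in range(n_replicates):
        sample = rng.choices(proteins, k=len(proteins))
        estimates.append(dipeptide_permille(sample, dipeptide))
    return stdev(estimates)


# Toy input (hypothetical sequences, not taken from the proteome).
toy_proteins = ["MAAGKAA", "AALLV", "MKRGG", "GAAAT"]
error_aa = bootstrap_error(toy_proteins, "AA")
```

Resampling whole proteins rather than individual residues keeps the dependence between neighbouring positions within a protein intact, which is presumably why the error is estimated at the protein level.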

The dipeptide statistics above (along with other statistics for this proteome) can be downloaded from this CSV file
See this proteome in: uniprot_link
Proteome-pI is available under Creative Commons Attribution-NoDerivs license, for more details see here

Reference: Kozlowski LP. Proteome-pI 2.0: Proteome Isoelectric Point Database Update. Nucleic Acids Res. 2021, doi: 10.1093/nar/gkab944
Contact: Lukasz P. Kozlowski