Amino acid dipeptide frequency for Mycobacterium phage Nibb

Apart from single amino acid frequencies, one can also calculate the so-called amino acid dipeptide frequency.

There are 400 possible dipeptides (441 if Xaa is counted as an additional, 21st amino acid representing all non-standard or unknown amino acids). Thus, if dipeptides occurred randomly in proteins and all amino acids were equally frequent, each of the 400 dipeptides would be expected at a frequency of 0.25% (2.5‰). As this is not the case in nature, for better visibility dipeptides that occur more often than expected are marked in red in the table, and those that are underrepresented are marked in blue.
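The counting described above can be sketched in Python. This is a minimal illustration, not the database's own code: the function names `dipeptide_permille` and `classify` are hypothetical, one-letter codes are used instead of the three-letter codes in the table, and normalizing by the number of counted dipeptides is an assumption (the database may normalize differently).

```python
from collections import Counter

# 20 standard residues in one-letter code; anything else is mapped to X (Xaa).
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

# Frequency expected if all 400 standard dipeptides were equally likely: 2.5 per mille.
EXPECTED_PERMILLE = 1000.0 / len(AMINO_ACIDS) ** 2


def dipeptide_permille(proteins):
    """Return {dipeptide: frequency in per mille} over overlapping residue pairs.

    Pairs are counted within each protein only, never across protein boundaries.
    Assumption: frequencies are normalized by the total number of pairs counted.
    """
    counts = Counter()
    for seq in proteins:
        for first, second in zip(seq, seq[1:]):
            first = first if first in AMINO_ACIDS else "X"
            second = second if second in AMINO_ACIDS else "X"
            counts[first + second] += 1
    total = sum(counts.values())
    if total == 0:
        return {}
    return {pair: 1000.0 * n / total for pair, n in counts.items()}


def classify(freq_permille):
    """Label a frequency relative to the uniform expectation of 2.5 per mille."""
    if freq_permille > EXPECTED_PERMILLE:
        return "over"
    if freq_permille < EXPECTED_PERMILLE:
        return "under"
    return "expected"


if __name__ == "__main__":
    # Two values taken from the table: AlaAla (22.073) and CysCys (0.105).
    print(classify(22.073))  # over
    print(classify(0.105))   # under
```

Under this simple uniform expectation, AlaAla would be marked as overrepresented and CysCys as underrepresented.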

All values are given in per mille (‰) and therefore need to be multiplied by 10⁻³ to obtain fractions; for example, 22.073‰ corresponds to 0.022073 (about 2.2%).

For more information see sequence space article on Wikipedia.

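Columns (second residue): Ala Cys Asp Glu Phe Gly His Ile Lys Leu Met Asn Pro Gln Arg Ser Thr Val Trp Tyr Xaa
Each block below is headed by the first residue; entries give frequency ± error in ‰.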
Ala
AlaAla: 22.073 ± 1.968
AlaCys: 1.153 ± 0.29
AlaAsp: 8.599 ± 0.612
AlaGlu: 8.599 ± 0.974
AlaPhe: 3.67 ± 0.496
AlaGly: 10.014 ± 0.982
AlaHis: 2.622 ± 0.367
AlaIle: 4.928 ± 0.438
AlaLys: 5.4 ± 0.623
AlaLeu: 12.426 ± 0.948
AlaMet: 3.303 ± 0.378
AlaAsn: 2.884 ± 0.397
AlaPro: 6.921 ± 0.694
AlaGln: 4.561 ± 0.648
AlaArg: 7.602 ± 0.687
AlaSer: 6.344 ± 0.943
AlaThr: 6.763 ± 0.792
AlaVal: 9.962 ± 0.743
AlaTrp: 2.726 ± 0.305
AlaTyr: 2.884 ± 0.46
AlaXaa: 0.0 ± 0.0
Cys
CysAla: 1.416 ± 0.284
CysCys: 0.105 ± 0.07
CysAsp: 0.786 ± 0.196
CysGlu: 0.786 ± 0.247
CysPhe: 0.21 ± 0.097
CysGly: 1.258 ± 0.353
CysHis: 0.262 ± 0.123
CysIle: 0.315 ± 0.114
CysLys: 0.524 ± 0.192
CysLeu: 0.839 ± 0.24
CysMet: 0.21 ± 0.101
CysAsn: 0.21 ± 0.098
CysPro: 0.891 ± 0.247
CysGln: 0.105 ± 0.083
CysArg: 0.786 ± 0.193
CysSer: 0.944 ± 0.254
CysThr: 0.315 ± 0.115
CysVal: 0.577 ± 0.18
CysTrp: 0.472 ± 0.151
CysTyr: 0.21 ± 0.097
CysXaa: 0.0 ± 0.0
Asp
AspAla: 7.865 ± 0.698
AspCys: 0.577 ± 0.203
AspAsp: 4.666 ± 0.544
AspGlu: 5.662 ± 0.615
AspPhe: 0.996 ± 0.22
AspGly: 6.763 ± 0.507
AspHis: 0.944 ± 0.228
AspIle: 1.573 ± 0.228
AspLys: 1.835 ± 0.316
AspLeu: 6.239 ± 0.607
AspMet: 2.097 ± 0.37
AspAsn: 1.783 ± 0.285
AspPro: 4.09 ± 0.45
AspGln: 1.992 ± 0.311
AspArg: 4.824 ± 0.613
AspSer: 2.254 ± 0.327
AspThr: 2.884 ± 0.417
AspVal: 4.928 ± 0.523
AspTrp: 0.786 ± 0.185
AspTyr: 1.573 ± 0.314
AspXaa: 0.0 ± 0.0
Glu
GluAla: 7.13 ± 0.852
GluCys: 0.786 ± 0.201
GluAsp: 3.093 ± 0.538
GluGlu: 1.363 ± 0.281
GluPhe: 2.254 ± 0.404
GluGly: 4.719 ± 0.517
GluHis: 1.625 ± 0.323
GluIle: 2.254 ± 0.367
GluLys: 1.416 ± 0.389
GluLeu: 5.4 ± 0.529
GluMet: 1.625 ± 0.321
GluAsn: 0.996 ± 0.261
GluPro: 3.723 ± 0.486
GluGln: 2.674 ± 0.389
GluArg: 4.876 ± 0.588
GluSer: 2.569 ± 0.393
GluThr: 2.202 ± 0.331
GluVal: 4.666 ± 0.68
GluTrp: 1.311 ± 0.253
GluTyr: 1.887 ± 0.348
GluXaa: 0.0 ± 0.0
Phe
PheAla: 3.723 ± 0.554
PheCys: 0.315 ± 0.139
PheAsp: 2.517 ± 0.439
PheGlu: 1.153 ± 0.229
PhePhe: 0.419 ± 0.161
PheGly: 3.041 ± 0.434
PheHis: 0.419 ± 0.164
PheIle: 0.839 ± 0.192
PheLys: 1.153 ± 0.385
PheLeu: 2.15 ± 0.336
PheMet: 0.524 ± 0.163
PheAsn: 0.891 ± 0.254
PhePro: 1.363 ± 0.261
PheGln: 0.629 ± 0.187
PheArg: 1.206 ± 0.275
PheSer: 1.101 ± 0.234
PheThr: 1.573 ± 0.351
PheVal: 2.674 ± 0.293
PheTrp: 0.315 ± 0.111
PheTyr: 0.472 ± 0.164
PheXaa: 0.0 ± 0.0
Gly
GlyAla: 10.538 ± 1.069
GlyCys: 1.206 ± 0.266
GlyAsp: 5.243 ± 0.474
GlyGlu: 5.243 ± 0.455
GlyPhe: 2.726 ± 0.425
GlyGly: 10.014 ± 1.539
GlyHis: 1.94 ± 0.299
GlyIle: 3.198 ± 0.605
GlyLys: 4.352 ± 0.522
GlyLeu: 6.816 ± 0.948
GlyMet: 2.097 ± 0.311
GlyAsn: 2.936 ± 0.336
GlyPro: 3.513 ± 0.465
GlyGln: 2.097 ± 0.395
GlyArg: 5.558 ± 0.591
GlySer: 5.243 ± 0.625
GlyThr: 5.872 ± 0.737
GlyVal: 7.498 ± 0.607
GlyTrp: 2.517 ± 0.308
GlyTyr: 2.464 ± 0.414
GlyXaa: 0.0 ± 0.0
His
HisAla: 2.202 ± 0.383
HisCys: 0.419 ± 0.133
HisAsp: 1.835 ± 0.344
HisGlu: 1.049 ± 0.219
HisPhe: 0.577 ± 0.192
HisGly: 2.307 ± 0.324
HisHis: 0.629 ± 0.203
HisIle: 0.577 ± 0.202
HisLys: 0.419 ± 0.144
HisLeu: 1.887 ± 0.274
HisMet: 0.472 ± 0.207
HisAsn: 0.682 ± 0.154
HisPro: 1.153 ± 0.236
HisGln: 0.419 ± 0.129
HisArg: 1.73 ± 0.295
HisSer: 0.891 ± 0.2
HisThr: 1.468 ± 0.279
HisVal: 2.097 ± 0.31
HisTrp: 0.524 ± 0.175
HisTyr: 0.577 ± 0.167
HisXaa: 0.0 ± 0.0
Ile
IleAla: 5.82 ± 0.539
IleCys: 0.21 ± 0.1
IleAsp: 2.831 ± 0.483
IleGlu: 3.198 ± 0.475
IlePhe: 0.419 ± 0.162
IleGly: 4.509 ± 0.763
IleHis: 0.524 ± 0.196
IleIle: 0.629 ± 0.198
IleLys: 1.311 ± 0.258
IleLeu: 1.992 ± 0.316
IleMet: 0.315 ± 0.11
IleAsn: 1.258 ± 0.238
IlePro: 1.887 ± 0.406
IleGln: 0.629 ± 0.186
IleArg: 2.307 ± 0.259
IleSer: 1.73 ± 0.252
IleThr: 2.517 ± 0.357
IleVal: 3.827 ± 0.428
IleTrp: 0.734 ± 0.203
IleTyr: 0.472 ± 0.159
IleXaa: 0.0 ± 0.0
Lys
LysAla: 4.247 ± 0.622
LysCys: 0.472 ± 0.139
LysAsp: 1.258 ± 0.233
LysGlu: 0.734 ± 0.154
LysPhe: 0.786 ± 0.187
LysGly: 3.408 ± 0.438
LysHis: 0.577 ± 0.15
LysIle: 1.363 ± 0.358
LysLys: 0.682 ± 0.183
LysLeu: 2.726 ± 0.313
LysMet: 1.153 ± 0.243
LysAsn: 0.682 ± 0.246
LysPro: 2.936 ± 0.44
LysGln: 0.577 ± 0.173
LysArg: 2.622 ± 0.411
LysSer: 1.416 ± 0.229
LysThr: 2.202 ± 0.331
LysVal: 2.569 ± 0.332
LysTrp: 0.682 ± 0.19
LysTyr: 0.996 ± 0.277
LysXaa: 0.0 ± 0.0
Leu
LeuAla: 13.003 ± 0.783
LeuCys: 0.839 ± 0.199
LeuAsp: 7.917 ± 0.592
LeuGlu: 2.517 ± 0.462
LeuPhe: 2.464 ± 0.419
LeuGly: 7.078 ± 0.849
LeuHis: 2.15 ± 0.339
LeuIle: 3.356 ± 0.44
LeuLys: 2.15 ± 0.352
LeuLeu: 6.292 ± 0.644
LeuMet: 1.73 ± 0.292
LeuAsn: 2.254 ± 0.38
LeuPro: 4.666 ± 0.64
LeuGln: 2.569 ± 0.405
LeuArg: 6.868 ± 0.524
LeuSer: 5.82 ± 0.579
LeuThr: 5.4 ± 0.49
LeuVal: 5.295 ± 0.579
LeuTrp: 1.573 ± 0.319
LeuTyr: 1.678 ± 0.377
LeuXaa: 0.0 ± 0.0
Met
MetAla: 3.251 ± 0.38
MetCys: 0.105 ± 0.083
MetAsp: 0.944 ± 0.174
MetGlu: 0.524 ± 0.172
MetPhe: 0.839 ± 0.219
MetGly: 1.363 ± 0.258
MetHis: 0.524 ± 0.152
MetIle: 1.416 ± 0.286
MetLys: 0.472 ± 0.161
MetLeu: 1.363 ± 0.264
MetMet: 0.315 ± 0.158
MetAsn: 0.629 ± 0.153
MetPro: 1.311 ± 0.283
MetGln: 0.682 ± 0.228
MetArg: 1.835 ± 0.296
MetSer: 2.464 ± 0.36
MetThr: 1.468 ± 0.244
MetVal: 1.678 ± 0.27
MetTrp: 0.315 ± 0.13
MetTyr: 0.734 ± 0.206
MetXaa: 0.0 ± 0.0
Asn
AsnAla: 3.565 ± 0.517
AsnCys: 0.367 ± 0.123
AsnAsp: 1.258 ± 0.352
AsnGlu: 0.891 ± 0.198
AsnPhe: 0.524 ± 0.171
AsnGly: 3.46 ± 0.468
AsnHis: 0.419 ± 0.132
AsnIle: 0.629 ± 0.204
AsnLys: 0.629 ± 0.169
AsnLeu: 2.045 ± 0.329
AsnMet: 0.315 ± 0.125
AsnAsn: 0.629 ± 0.189
AsnPro: 2.359 ± 0.362
AsnGln: 0.577 ± 0.15
AsnArg: 1.835 ± 0.26
AsnSer: 0.786 ± 0.23
AsnThr: 1.73 ± 0.282
AsnVal: 2.622 ± 0.372
AsnTrp: 0.315 ± 0.128
AsnTyr: 0.786 ± 0.191
AsnXaa: 0.0 ± 0.0
Pro
ProAla: 8.546 ± 0.744
ProCys: 0.472 ± 0.168
ProAsp: 3.618 ± 0.412
ProGlu: 4.509 ± 0.588
ProPhe: 1.416 ± 0.285
ProGly: 5.82 ± 0.561
ProHis: 0.944 ± 0.266
ProIle: 1.678 ± 0.281
ProLys: 2.045 ± 0.288
ProLeu: 4.666 ± 0.495
ProMet: 1.049 ± 0.23
ProAsn: 1.311 ± 0.455
ProPro: 2.989 ± 0.392
ProGln: 1.416 ± 0.308
ProArg: 3.251 ± 0.447
ProSer: 2.989 ± 0.385
ProThr: 3.46 ± 0.414
ProVal: 5.295 ± 0.646
ProTrp: 1.101 ± 0.228
ProTyr: 1.468 ± 0.27
ProXaa: 0.0 ± 0.0
Gln
GlnAla: 4.876 ± 0.509
GlnCys: 0.262 ± 0.118
GlnAsp: 0.944 ± 0.263
GlnGlu: 1.206 ± 0.203
GlnPhe: 0.891 ± 0.273
GlnGly: 2.569 ± 0.377
GlnHis: 0.996 ± 0.196
GlnIle: 1.625 ± 0.272
GlnLys: 0.419 ± 0.152
GlnLeu: 2.517 ± 0.41
GlnMet: 0.629 ± 0.184
GlnAsn: 0.629 ± 0.19
GlnPro: 1.678 ± 0.233
GlnGln: 1.468 ± 0.235
GlnArg: 2.202 ± 0.347
GlnSer: 1.73 ± 0.289
GlnThr: 1.468 ± 0.257
GlnVal: 2.884 ± 0.473
GlnTrp: 0.682 ± 0.216
GlnTyr: 0.944 ± 0.208
GlnXaa: 0.0 ± 0.0
Arg
ArgAla: 6.973 ± 0.685
ArgCys: 1.153 ± 0.297
ArgAsp: 4.352 ± 0.444
ArgGlu: 4.194 ± 0.544
ArgPhe: 1.835 ± 0.296
ArgGly: 4.142 ± 0.382
ArgHis: 1.783 ± 0.385
ArgIle: 2.936 ± 0.383
ArgLys: 2.936 ± 0.46
ArgLeu: 6.659 ± 0.583
ArgMet: 2.15 ± 0.305
ArgAsn: 1.992 ± 0.339
ArgPro: 4.247 ± 0.708
ArgGln: 2.359 ± 0.332
ArgArg: 6.449 ± 0.708
ArgSer: 3.513 ± 0.36
ArgThr: 3.513 ± 0.508
ArgVal: 4.142 ± 0.455
ArgTrp: 1.94 ± 0.337
ArgTyr: 1.573 ± 0.253
ArgXaa: 0.0 ± 0.0
Ser
SerAla: 6.606 ± 0.668
SerCys: 0.472 ± 0.184
SerAsp: 3.041 ± 0.378
SerGlu: 3.198 ± 0.368
SerPhe: 1.625 ± 0.335
SerGly: 4.614 ± 0.893
SerHis: 1.573 ± 0.321
SerIle: 2.464 ± 0.465
SerLys: 1.258 ± 0.237
SerLeu: 4.509 ± 0.388
SerMet: 1.311 ± 0.255
SerAsn: 1.363 ± 0.255
SerPro: 3.198 ± 0.367
SerGln: 1.625 ± 0.285
SerArg: 2.831 ± 0.353
SerSer: 3.251 ± 0.579
SerThr: 3.146 ± 0.361
SerVal: 3.723 ± 0.408
SerTrp: 0.996 ± 0.185
SerTyr: 1.73 ± 0.241
SerXaa: 0.0 ± 0.0
Thr
ThrAla: 6.868 ± 0.825
ThrCys: 0.577 ± 0.163
ThrAsp: 3.251 ± 0.365
ThrGlu: 3.356 ± 0.432
ThrPhe: 1.887 ± 0.27
ThrGly: 5.715 ± 0.559
ThrHis: 1.311 ± 0.253
ThrIle: 3.198 ± 0.486
ThrLys: 1.887 ± 0.363
ThrLeu: 4.771 ± 0.462
ThrMet: 0.577 ± 0.19
ThrAsn: 1.363 ± 0.245
ThrPro: 3.618 ± 0.47
ThrGln: 1.94 ± 0.314
ThrArg: 2.674 ± 0.398
ThrSer: 2.884 ± 0.589
ThrThr: 3.093 ± 0.445
ThrVal: 4.876 ± 0.416
ThrTrp: 1.153 ± 0.221
ThrTyr: 1.678 ± 0.309
ThrXaa: 0.0 ± 0.0
Val
ValAla: 9.595 ± 0.974
ValCys: 1.101 ± 0.249
ValAsp: 5.453 ± 0.58
ValGlu: 6.187 ± 0.695
ValPhe: 1.835 ± 0.3
ValGly: 6.816 ± 0.647
ValHis: 2.097 ± 0.463
ValIle: 2.989 ± 0.427
ValLys: 2.412 ± 0.359
ValLeu: 6.868 ± 0.664
ValMet: 1.678 ± 0.255
ValAsn: 1.573 ± 0.236
ValPro: 5.295 ± 0.603
ValGln: 2.254 ± 0.266
ValArg: 4.824 ± 0.586
ValSer: 4.142 ± 0.365
ValThr: 4.876 ± 0.476
ValVal: 6.763 ± 0.785
ValTrp: 1.573 ± 0.266
ValTyr: 1.73 ± 0.208
ValXaa: 0.0 ± 0.0
Trp
TrpAla: 2.517 ± 0.397
TrpCys: 0.367 ± 0.135
TrpAsp: 0.682 ± 0.204
TrpGlu: 0.734 ± 0.175
TrpPhe: 0.682 ± 0.149
TrpGly: 1.363 ± 0.294
TrpHis: 0.419 ± 0.157
TrpIle: 0.786 ± 0.186
TrpLys: 0.21 ± 0.098
TrpLeu: 2.779 ± 0.362
TrpMet: 0.262 ± 0.115
TrpAsn: 0.734 ± 0.208
TrpPro: 0.891 ± 0.229
TrpGln: 1.153 ± 0.209
TrpArg: 2.254 ± 0.352
TrpSer: 1.101 ± 0.25
TrpThr: 1.206 ± 0.231
TrpVal: 1.468 ± 0.296
TrpTrp: 0.367 ± 0.135
TrpTyr: 0.472 ± 0.129
TrpXaa: 0.0 ± 0.0
Tyr
TyrAla: 2.622 ± 0.32
TyrCys: 0.262 ± 0.117
TyrAsp: 2.045 ± 0.303
TyrGlu: 1.468 ± 0.285
TyrPhe: 0.419 ± 0.158
TyrGly: 2.097 ± 0.325
TyrHis: 0.157 ± 0.088
TyrIle: 0.367 ± 0.135
TyrLys: 0.577 ± 0.198
TyrLeu: 2.674 ± 0.362
TyrMet: 0.419 ± 0.155
TyrAsn: 0.944 ± 0.22
TyrPro: 1.311 ± 0.284
TyrGln: 0.891 ± 0.252
TyrArg: 2.307 ± 0.331
TyrSer: 1.363 ± 0.236
TyrThr: 1.52 ± 0.283
TyrVal: 2.517 ± 0.344
TyrTrp: 0.367 ± 0.136
TyrTyr: 0.629 ± 0.185
TyrXaa: 0.0 ± 0.0
Xaa
XaaAla: 0.0 ± 0.0
XaaCys: 0.0 ± 0.0
XaaAsp: 0.0 ± 0.0
XaaGlu: 0.0 ± 0.0
XaaPhe: 0.0 ± 0.0
XaaGly: 0.0 ± 0.0
XaaHis: 0.0 ± 0.0
XaaIle: 0.0 ± 0.0
XaaLys: 0.0 ± 0.0
XaaLeu: 0.0 ± 0.0
XaaMet: 0.0 ± 0.0
XaaAsn: 0.0 ± 0.0
XaaPro: 0.0 ± 0.0
XaaGln: 0.0 ± 0.0
XaaArg: 0.0 ± 0.0
XaaSer: 0.0 ± 0.0
XaaThr: 0.0 ± 0.0
XaaVal: 0.0 ± 0.0
XaaTrp: 0.0 ± 0.0
XaaTyr: 0.0 ± 0.0
XaaXaa: 0.0 ± 0.0
Statistics based on 102 proteins (19074 amino acids)

Note: The error was estimated by bootstrapping (100 replicates) at the protein level.
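Protein-level bootstrapping can be illustrated with a short Python sketch. It is an assumption-laden illustration rather than the database's actual procedure: the function names are hypothetical, whole proteins are resampled with replacement, the error is taken as the standard deviation across replicates, and frequencies are normalized by the number of dipeptides in each resample.

```python
import random
import statistics
from collections import Counter


def dipeptide_counts(seq):
    """Count overlapping residue pairs in one protein sequence."""
    return Counter(a + b for a, b in zip(seq, seq[1:]))


def bootstrap_dipeptide(proteins, dipeptide, n_boot=100, seed=0):
    """Estimate mean and spread (per mille) of one dipeptide frequency.

    Each replicate draws len(proteins) whole proteins with replacement,
    so the resampling unit is the protein, not the individual residue.
    """
    rng = random.Random(seed)
    per_protein = [dipeptide_counts(p) for p in proteins]
    estimates = []
    for _ in range(n_boot):
        sample = rng.choices(per_protein, k=len(per_protein))
        total = sum(sum(c.values()) for c in sample)
        hits = sum(c[dipeptide] for c in sample)
        estimates.append(1000.0 * hits / total if total else 0.0)
    # Assumption: the reported error is the standard deviation of the replicates.
    return statistics.mean(estimates), statistics.stdev(estimates)


if __name__ == "__main__":
    toy_proteome = ["MAAGL", "MKAAV", "MCCGA", "MLLAA"]  # made-up sequences
    mean, err = bootstrap_dipeptide(toy_proteome, "AA")
    print(f"AA: {mean:.3f} ± {err:.3f} per mille")
```

Resampling whole proteins keeps the dipeptides of one protein together, so the spread reflects variation between proteins rather than between individual positions.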

The dipeptide statistics above (among other statistics for this proteome) can be downloaded from this CSV file
See this proteome in: uniprot_link
Proteome-pI is available under Creative Commons Attribution-NoDerivs license, for more details see here

Reference: Kozlowski LP. Proteome-pI 2.0: Proteome Isoelectric Point Database Update. Nucleic Acids Res. 2021, doi: 10.1093/nar/gkab944
Contact: Lukasz P. Kozlowski