Amino acid dipeptide frequency for Mycobacterium phage Jeon

Apart from single amino acid frequencies, one can also calculate the so-called amino acid dipeptide frequency, i.e. how often each ordered pair of adjacent amino acids occurs.
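As a minimal sketch of how such frequencies could be computed, the Python snippet below counts overlapping dipeptides within each protein and reports them in per mille. The function name `dipeptide_per_mille`, the toy sequences, the mapping of any non-standard letter to X (Xaa), and the normalization by the total number of dipeptides are all assumptions made for illustration; the database may normalize differently.

```python
from collections import Counter

STANDARD = set("ACDEFGHIKLMNPQRSTVWY")  # one-letter codes of the 20 standard amino acids

def dipeptide_per_mille(proteins):
    """Return {dipeptide: frequency in per mille} over overlapping pairs.

    Pairs are counted within each protein only, never across protein
    boundaries. Letters outside the standard 20 are mapped to 'X' (Xaa).
    Normalizing by the total pair count is an assumption of this sketch.
    """
    counts = Counter()
    for seq in proteins:
        cleaned = "".join(c if c in STANDARD else "X" for c in seq.upper())
        counts.update(cleaned[i:i + 2] for i in range(len(cleaned) - 1))
    total = sum(counts.values())
    if total == 0:
        return {}
    return {pair: 1000.0 * n / total for pair, n in counts.items()}

# Hypothetical toy input: two short sequences, five dipeptides in total.
freqs = dipeptide_per_mille(["MAAL", "AAG"])
```

Here `freqs["AA"]` covers two of the five dipeptides, and all frequencies together sum to 1000 per mille.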

There are 400 possible dipeptides (441 if Xaa is counted as an additional, 21st amino acid representing all non-standard or unknown amino acids). Thus, if dipeptides occurred randomly and with equal probability in proteins, each of the 400 would be expected at a frequency of 0.25% (2.5‰). As this is not the case in nature, for better visibility dipeptides that are more frequent than expected are marked in red, and those that are underrepresented are marked in blue in the table.
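The arithmetic behind the expected value can be sketched in a few lines of Python. The four observed values are copied from the table on this page; the `classify` helper and the strict comparison against the uniform expectation are assumptions for illustration, since the exact rule used for the red and blue marking is not stated here.

```python
N_STANDARD = 20  # standard amino acids; using 21 (with Xaa) would give 441 pairs

# Uniform expectation: 1/400 of all dipeptides = 0.25 % = 2.5 per mille.
expected_per_mille = 1000.0 / (N_STANDARD ** 2)

# A few values taken from the table (per mille).
observed = {"AlaAla": 20.144, "CysCys": 0.157, "ValThr": 5.271, "TrpTrp": 0.261}

def classify(value, expected=expected_per_mille):
    """Label a dipeptide relative to the uniform expectation (assumed rule)."""
    if value > expected:
        return "over"
    if value < expected:
        return "under"
    return "as expected"

labels = {pair: classify(value) for pair, value in observed.items()}
```

With these inputs, AlaAla and ValThr lie above the 2.5‰ expectation, while CysCys and TrpTrp lie below it.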

All values are given in per mille (‰) and therefore need to be multiplied by 10⁻³ to obtain fractions.

For more information, see the sequence space article on Wikipedia.

Ala Cys Asp Glu Phe Gly His Ile Lys Leu Met Asn Pro Gln Arg Ser Thr Val Trp Tyr Xaa
Ala
AlaAla: 20.144 ± 2.073
AlaCys: 0.939 ± 0.281
AlaAsp: 7.306 ± 0.644
AlaGlu: 9.498 ± 0.834
AlaPhe: 2.505 ± 0.369
AlaGly: 8.82 ± 1.173
AlaHis: 1.305 ± 0.249
AlaIle: 4.801 ± 0.637
AlaLys: 4.123 ± 0.752
AlaLeu: 13.047 ± 1.411
AlaMet: 2.818 ± 0.302
AlaAsn: 3.444 ± 0.561
AlaPro: 5.688 ± 0.629
AlaGln: 4.123 ± 0.568
AlaArg: 8.245 ± 0.785
AlaSer: 5.793 ± 0.505
AlaThr: 6.054 ± 0.543
AlaVal: 9.602 ± 0.727
AlaTrp: 2.14 ± 0.39
AlaTyr: 2.192 ± 0.391
AlaXaa: 0.0 ± 0.0
Cys
CysAla: 1.044 ± 0.229
CysCys: 0.157 ± 0.09
CysAsp: 0.783 ± 0.198
CysGlu: 0.887 ± 0.272
CysPhe: 0.313 ± 0.123
CysGly: 1.513 ± 0.415
CysHis: 0.261 ± 0.154
CysIle: 0.47 ± 0.127
CysLys: 0.261 ± 0.106
CysLeu: 0.626 ± 0.218
CysMet: 0.261 ± 0.105
CysAsn: 0.365 ± 0.14
CysPro: 0.835 ± 0.209
CysGln: 0.313 ± 0.148
CysArg: 1.148 ± 0.268
CysSer: 0.887 ± 0.195
CysThr: 0.731 ± 0.224
CysVal: 0.887 ± 0.207
CysTrp: 0.313 ± 0.138
CysTyr: 0.104 ± 0.078
CysXaa: 0.0 ± 0.0
Asp
AspAla: 7.41 ± 0.543
AspCys: 0.626 ± 0.19
AspAsp: 4.54 ± 0.694
AspGlu: 6.576 ± 0.657
AspPhe: 1.357 ± 0.239
AspGly: 6.471 ± 0.616
AspHis: 1.096 ± 0.254
AspIle: 2.087 ± 0.33
AspLys: 1.931 ± 0.306
AspLeu: 5.166 ± 0.449
AspMet: 1.357 ± 0.261
AspAsn: 0.939 ± 0.182
AspPro: 4.54 ± 0.738
AspGln: 2.035 ± 0.391
AspArg: 4.279 ± 0.66
AspSer: 2.922 ± 0.418
AspThr: 2.87 ± 0.352
AspVal: 4.279 ± 0.566
AspTrp: 1.2 ± 0.237
AspTyr: 1.2 ± 0.228
AspXaa: 0.0 ± 0.0
Glu
GluAla: 8.298 ± 0.876
GluCys: 0.835 ± 0.244
GluAsp: 3.966 ± 0.561
GluGlu: 3.131 ± 0.405
GluPhe: 1.983 ± 0.314
GluGly: 4.645 ± 0.472
GluHis: 1.879 ± 0.313
GluIle: 3.183 ± 0.352
GluLys: 2.87 ± 0.476
GluLeu: 6.523 ± 0.626
GluMet: 1.879 ± 0.362
GluAsn: 1.774 ± 0.287
GluPro: 3.81 ± 0.61
GluGln: 2.662 ± 0.362
GluArg: 5.375 ± 0.544
GluSer: 4.436 ± 0.436
GluThr: 3.497 ± 0.434
GluVal: 5.48 ± 0.662
GluTrp: 1.931 ± 0.388
GluTyr: 1.774 ± 0.326
GluXaa: 0.0 ± 0.0
Phe
PheAla: 2.14 ± 0.291
PheCys: 0.104 ± 0.071
PheAsp: 2.609 ± 0.384
PheGlu: 1.67 ± 0.268
PhePhe: 0.209 ± 0.099
PheGly: 2.401 ± 0.497
PheHis: 0.209 ± 0.086
PheIle: 0.939 ± 0.211
PheLys: 0.678 ± 0.195
PheLeu: 1.879 ± 0.329
PheMet: 0.417 ± 0.139
PheAsn: 0.731 ± 0.148
PhePro: 1.67 ± 0.284
PheGln: 1.096 ± 0.224
PheArg: 2.035 ± 0.339
PheSer: 1.357 ± 0.197
PheThr: 1.774 ± 0.369
PheVal: 1.827 ± 0.321
PheTrp: 0.417 ± 0.166
PheTyr: 0.365 ± 0.121
PheXaa: 0.0 ± 0.0
Gly
GlyAla: 8.298 ± 1.24
GlyCys: 1.148 ± 0.305
GlyAsp: 5.375 ± 0.506
GlyGlu: 5.166 ± 0.534
GlyPhe: 2.087 ± 0.298
GlyGly: 9.237 ± 2.213
GlyHis: 1.513 ± 0.274
GlyIle: 2.922 ± 0.364
GlyLys: 3.079 ± 0.387
GlyLeu: 7.045 ± 0.979
GlyMet: 2.244 ± 0.3
GlyAsn: 2.609 ± 0.382
GlyPro: 3.392 ± 0.508
GlyGln: 3.183 ± 0.545
GlyArg: 5.323 ± 0.591
GlySer: 5.427 ± 0.571
GlyThr: 5.897 ± 0.511
GlyVal: 7.88 ± 0.928
GlyTrp: 2.035 ± 0.323
GlyTyr: 2.244 ± 0.369
GlyXaa: 0.0 ± 0.0
His
HisAla: 1.931 ± 0.354
HisCys: 0.209 ± 0.11
HisAsp: 1.096 ± 0.268
HisGlu: 0.835 ± 0.18
HisPhe: 0.417 ± 0.133
HisGly: 1.566 ± 0.257
HisHis: 0.47 ± 0.145
HisIle: 0.574 ± 0.176
HisLys: 0.365 ± 0.136
HisLeu: 1.722 ± 0.276
HisMet: 0.417 ± 0.132
HisAsn: 0.261 ± 0.109
HisPro: 1.148 ± 0.231
HisGln: 0.678 ± 0.192
HisArg: 1.879 ± 0.376
HisSer: 0.835 ± 0.23
HisThr: 0.835 ± 0.25
HisVal: 1.044 ± 0.221
HisTrp: 0.47 ± 0.134
HisTyr: 0.261 ± 0.115
HisXaa: 0.0 ± 0.0
Ile
IleAla: 5.949 ± 0.516
IleCys: 0.417 ± 0.13
IleAsp: 3.549 ± 0.412
IleGlu: 3.34 ± 0.45
IlePhe: 0.47 ± 0.149
IleGly: 3.601 ± 0.511
IleHis: 0.417 ± 0.151
IleIle: 1.461 ± 0.28
IleLys: 1.096 ± 0.213
IleLeu: 2.818 ± 0.432
IleMet: 0.835 ± 0.248
IleAsn: 0.992 ± 0.27
IlePro: 1.461 ± 0.259
IleGln: 1.409 ± 0.223
IleArg: 3.079 ± 0.412
IleSer: 1.618 ± 0.285
IleThr: 2.766 ± 0.381
IleVal: 3.079 ± 0.408
IleTrp: 0.626 ± 0.168
IleTyr: 1.148 ± 0.246
IleXaa: 0.0 ± 0.0
Lys
LysAla: 4.749 ± 0.76
LysCys: 0.626 ± 0.188
LysAsp: 1.513 ± 0.334
LysGlu: 1.879 ± 0.339
LysPhe: 0.835 ± 0.227
LysGly: 1.722 ± 0.352
LysHis: 0.313 ± 0.112
LysIle: 1.67 ± 0.378
LysLys: 1.461 ± 0.468
LysLeu: 2.296 ± 0.359
LysMet: 0.939 ± 0.195
LysAsn: 0.626 ± 0.185
LysPro: 2.296 ± 0.307
LysGln: 0.992 ± 0.222
LysArg: 2.087 ± 0.417
LysSer: 1.879 ± 0.276
LysThr: 1.879 ± 0.298
LysVal: 2.453 ± 0.461
LysTrp: 0.574 ± 0.179
LysTyr: 0.783 ± 0.192
LysXaa: 0.0 ± 0.0
Leu
LeuAla: 10.229 ± 0.742
LeuCys: 0.939 ± 0.188
LeuAsp: 5.062 ± 0.54
LeuGlu: 6.471 ± 0.532
LeuPhe: 1.879 ± 0.386
LeuGly: 7.41 ± 0.849
LeuHis: 1.409 ± 0.3
LeuIle: 3.34 ± 0.376
LeuLys: 2.14 ± 0.337
LeuLeu: 4.906 ± 0.511
LeuMet: 1.409 ± 0.337
LeuAsn: 2.766 ± 0.441
LeuPro: 5.48 ± 0.465
LeuGln: 3.027 ± 0.416
LeuArg: 5.584 ± 0.539
LeuSer: 5.636 ± 0.598
LeuThr: 6.21 ± 0.498
LeuVal: 6.106 ± 0.695
LeuTrp: 1.148 ± 0.292
LeuTyr: 1.827 ± 0.374
LeuXaa: 0.0 ± 0.0
Met
MetAla: 3.288 ± 0.375
MetCys: 0.104 ± 0.078
MetAsp: 0.939 ± 0.235
MetGlu: 0.939 ± 0.216
MetPhe: 0.678 ± 0.164
MetGly: 1.722 ± 0.309
MetHis: 0.365 ± 0.16
MetIle: 1.096 ± 0.237
MetLys: 0.887 ± 0.184
MetLeu: 1.879 ± 0.409
MetMet: 0.365 ± 0.132
MetAsn: 0.626 ± 0.17
MetPro: 1.513 ± 0.278
MetGln: 0.522 ± 0.228
MetArg: 1.305 ± 0.241
MetSer: 2.192 ± 0.301
MetThr: 2.296 ± 0.437
MetVal: 1.618 ± 0.25
MetTrp: 0.626 ± 0.212
MetTyr: 0.313 ± 0.147
MetXaa: 0.0 ± 0.0
Asn
AsnAla: 3.81 ± 0.455
AsnCys: 0.365 ± 0.153
AsnAsp: 1.409 ± 0.314
AsnGlu: 1.357 ± 0.23
AsnPhe: 0.47 ± 0.176
AsnGly: 2.922 ± 0.37
AsnHis: 0.261 ± 0.12
AsnIle: 0.783 ± 0.256
AsnLys: 0.574 ± 0.149
AsnLeu: 2.922 ± 0.438
AsnMet: 0.365 ± 0.15
AsnAsn: 0.783 ± 0.261
AsnPro: 1.879 ± 0.312
AsnGln: 0.835 ± 0.219
AsnArg: 2.14 ± 0.277
AsnSer: 1.774 ± 0.325
AsnThr: 1.2 ± 0.297
AsnVal: 1.774 ± 0.312
AsnTrp: 0.626 ± 0.212
AsnTyr: 1.096 ± 0.222
AsnXaa: 0.0 ± 0.0
Pro
ProAla: 6.315 ± 0.718
ProCys: 0.574 ± 0.199
ProAsp: 4.123 ± 0.577
ProGlu: 4.853 ± 0.526
ProPhe: 1.305 ± 0.198
ProGly: 5.427 ± 0.567
ProHis: 1.044 ± 0.206
ProIle: 2.14 ± 0.265
ProLys: 1.774 ± 0.304
ProLeu: 3.34 ± 0.378
ProMet: 1.096 ± 0.234
ProAsn: 1.827 ± 0.267
ProPro: 3.497 ± 0.512
ProGln: 1.722 ± 0.279
ProArg: 2.244 ± 0.345
ProSer: 3.027 ± 0.386
ProThr: 4.592 ± 0.442
ProVal: 5.114 ± 0.491
ProTrp: 1.044 ± 0.25
ProTyr: 0.626 ± 0.202
ProXaa: 0.0 ± 0.0
Gln
GlnAla: 4.749 ± 0.478
GlnCys: 0.313 ± 0.146
GlnAsp: 1.513 ± 0.288
GlnGlu: 2.087 ± 0.313
GlnPhe: 1.044 ± 0.29
GlnGly: 2.244 ± 0.347
GlnHis: 0.887 ± 0.206
GlnIle: 2.192 ± 0.457
GlnLys: 1.252 ± 0.326
GlnLeu: 3.705 ± 0.452
GlnMet: 1.148 ± 0.252
GlnAsn: 1.044 ± 0.2
GlnPro: 1.827 ± 0.257
GlnGln: 1.879 ± 0.329
GlnArg: 2.14 ± 0.341
GlnSer: 1.722 ± 0.35
GlnThr: 1.722 ± 0.343
GlnVal: 2.87 ± 0.498
GlnTrp: 0.731 ± 0.214
GlnTyr: 0.835 ± 0.201
GlnXaa: 0.0 ± 0.0
Arg
ArgAla: 7.515 ± 0.654
ArgCys: 1.2 ± 0.229
ArgAsp: 4.018 ± 0.493
ArgGlu: 4.384 ± 0.567
ArgPhe: 1.983 ± 0.313
ArgGly: 4.697 ± 0.444
ArgHis: 1.513 ± 0.263
ArgIle: 2.662 ± 0.315
ArgLys: 2.192 ± 0.425
ArgLeu: 6.21 ± 0.777
ArgMet: 1.983 ± 0.361
ArgAsn: 1.67 ± 0.325
ArgPro: 2.87 ± 0.3
ArgGln: 2.714 ± 0.355
ArgArg: 6.471 ± 0.834
ArgSer: 3.653 ± 0.456
ArgThr: 3.653 ± 0.332
ArgVal: 5.323 ± 0.417
ArgTrp: 1.827 ± 0.366
ArgTyr: 2.087 ± 0.292
ArgXaa: 0.0 ± 0.0
Ser
SerAla: 5.949 ± 0.707
SerCys: 0.626 ± 0.184
SerAsp: 3.392 ± 0.351
SerGlu: 3.549 ± 0.444
SerPhe: 1.566 ± 0.277
SerGly: 6.576 ± 0.702
SerHis: 0.887 ± 0.212
SerIle: 2.401 ± 0.4
SerLys: 1.722 ± 0.26
SerLeu: 4.645 ± 0.409
SerMet: 1.357 ± 0.268
SerAsn: 1.618 ± 0.363
SerPro: 3.236 ± 0.384
SerGln: 2.453 ± 0.335
SerArg: 3.444 ± 0.412
SerSer: 3.183 ± 0.445
SerThr: 3.601 ± 0.554
SerVal: 4.123 ± 0.461
SerTrp: 1.409 ± 0.281
SerTyr: 1.67 ± 0.281
SerXaa: 0.0 ± 0.0
Thr
ThrAla: 6.732 ± 0.89
ThrCys: 0.939 ± 0.277
ThrAsp: 3.497 ± 0.523
ThrGlu: 3.862 ± 0.496
ThrPhe: 1.983 ± 0.275
ThrGly: 5.845 ± 0.696
ThrHis: 0.731 ± 0.166
ThrIle: 3.914 ± 0.52
ThrLys: 1.67 ± 0.328
ThrLeu: 4.54 ± 0.557
ThrMet: 1.618 ± 0.286
ThrAsn: 1.722 ± 0.329
ThrPro: 4.436 ± 0.517
ThrGln: 1.827 ± 0.356
ThrArg: 3.027 ± 0.386
ThrSer: 3.653 ± 0.458
ThrThr: 3.392 ± 0.658
ThrVal: 5.062 ± 0.45
ThrTrp: 1.305 ± 0.286
ThrTyr: 0.992 ± 0.219
ThrXaa: 0.0 ± 0.0
Val
ValAla: 9.602 ± 0.728
ValCys: 1.096 ± 0.27
ValAsp: 5.219 ± 0.53
ValGlu: 6.523 ± 0.801
ValPhe: 1.983 ± 0.31
ValGly: 5.897 ± 0.681
ValHis: 1.618 ± 0.318
ValIle: 2.401 ± 0.318
ValLys: 2.453 ± 0.384
ValLeu: 6.21 ± 0.568
ValMet: 1.461 ± 0.276
ValAsn: 2.296 ± 0.33
ValPro: 4.488 ± 0.502
ValGln: 3.027 ± 0.347
ValArg: 4.853 ± 0.822
ValSer: 4.801 ± 0.56
ValThr: 5.271 ± 0.551
ValVal: 6.889 ± 0.578
ValTrp: 1.252 ± 0.24
ValTyr: 1.513 ± 0.286
ValXaa: 0.0 ± 0.0
Trp
TrpAla: 2.505 ± 0.408
TrpCys: 0.47 ± 0.152
TrpAsp: 1.409 ± 0.319
TrpGlu: 1.461 ± 0.3
TrpPhe: 0.835 ± 0.195
TrpGly: 1.148 ± 0.206
TrpHis: 0.574 ± 0.161
TrpIle: 0.835 ± 0.158
TrpLys: 0.678 ± 0.203
TrpLeu: 1.722 ± 0.323
TrpMet: 0.417 ± 0.147
TrpAsn: 0.678 ± 0.221
TrpPro: 0.626 ± 0.206
TrpGln: 0.678 ± 0.152
TrpArg: 1.722 ± 0.291
TrpSer: 1.409 ± 0.27
TrpThr: 1.096 ± 0.286
TrpVal: 1.67 ± 0.401
TrpTrp: 0.261 ± 0.128
TrpTyr: 0.313 ± 0.127
TrpXaa: 0.0 ± 0.0
Tyr
TyrAla: 1.931 ± 0.305
TyrCys: 0.47 ± 0.182
TyrAsp: 1.566 ± 0.253
TyrGlu: 1.461 ± 0.236
TyrPhe: 0.626 ± 0.162
TyrGly: 2.244 ± 0.47
TyrHis: 0.261 ± 0.138
TyrIle: 0.261 ± 0.116
TyrLys: 0.365 ± 0.12
TyrLeu: 1.774 ± 0.388
TyrMet: 0.835 ± 0.206
TyrAsn: 0.574 ± 0.218
TyrPro: 0.992 ± 0.226
TyrGln: 0.783 ± 0.214
TyrArg: 2.244 ± 0.407
TyrSer: 1.2 ± 0.264
TyrThr: 1.409 ± 0.288
TyrVal: 1.774 ± 0.407
TyrTrp: 0.574 ± 0.182
TyrTyr: 0.417 ± 0.139
TyrXaa: 0.0 ± 0.0
Xaa
XaaAla: 0.0 ± 0.0
XaaCys: 0.0 ± 0.0
XaaAsp: 0.0 ± 0.0
XaaGlu: 0.0 ± 0.0
XaaPhe: 0.0 ± 0.0
XaaGly: 0.0 ± 0.0
XaaHis: 0.0 ± 0.0
XaaIle: 0.0 ± 0.0
XaaLys: 0.0 ± 0.0
XaaLeu: 0.0 ± 0.0
XaaMet: 0.0 ± 0.0
XaaAsn: 0.0 ± 0.0
XaaPro: 0.0 ± 0.0
XaaGln: 0.0 ± 0.0
XaaArg: 0.0 ± 0.0
XaaSer: 0.0 ± 0.0
XaaThr: 0.0 ± 0.0
XaaVal: 0.0 ± 0.0
XaaTrp: 0.0 ± 0.0
XaaTyr: 0.0 ± 0.0
XaaXaa: 0.0 ± 0.0
Statistics based on 85 proteins (19163 amino acids)

Note: The error was estimated by bootstrapping (100 resamples) at the protein level.
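A protein-level bootstrap of this kind can be sketched as follows: whole proteins are resampled with replacement, the dipeptide frequency is recomputed for each resample, and the spread of those estimates is reported as the error. Taking the sample standard deviation as the error, the fixed seed, and the names `pair_counts` and `bootstrap_error` are assumptions for illustration and are not taken from the Proteome-pI code.

```python
import random
from collections import Counter

def pair_counts(seq):
    """Count overlapping dipeptides in a single protein sequence."""
    return Counter(seq[i:i + 2] for i in range(len(seq) - 1))

def bootstrap_error(proteins, pair, n_boot=100, seed=0):
    """Std. dev. of the per-mille frequency of `pair` over protein-level resamples.

    Requires n_boot >= 2. Each resample draws len(proteins) proteins
    with replacement, so whole proteins (not residues) are the units.
    """
    rng = random.Random(seed)
    per_protein = [pair_counts(p) for p in proteins]
    estimates = []
    for _ in range(n_boot):
        sample = rng.choices(per_protein, k=len(per_protein))
        total = sum(sum(c.values()) for c in sample)
        hits = sum(c[pair] for c in sample)  # Counter returns 0 for absent pairs
        estimates.append(1000.0 * hits / total if total else 0.0)
    mean = sum(estimates) / n_boot
    variance = sum((e - mean) ** 2 for e in estimates) / (n_boot - 1)
    return variance ** 0.5

# Hypothetical toy proteome of three short sequences.
error = bootstrap_error(["MAAL", "AAG", "MKV"], "AA")
```

Resampling at the protein level keeps each protein's internal composition intact, so the error reflects variation between proteins rather than between individual residues.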

The dipeptide statistics above (among other statistics for this proteome) can be downloaded from this CSV file
See this proteome in: uniprot_link
Proteome-pI is available under Creative Commons Attribution-NoDerivs license, for more details see here

Reference: Kozlowski LP. Proteome-pI 2.0: Proteome Isoelectric Point Database Update. Nucleic Acids Res. 2021, doi: 10.1093/nar/gkab944
Contact: Lukasz P. Kozlowski