Amino acid dipeptide frequency for Mycobacterium phage Tweety

Apart from single amino acid frequencies, one can also calculate the so-called amino acid dipeptide frequency, i.e. how often each ordered pair of adjacent amino acids occurs in the proteome.
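As an illustration of how such frequencies can be obtained, here is a minimal Python sketch. It is not the code used by Proteome-pI: the function name `dipeptide_per_mille`, the use of one-letter codes (with `X` standing for Xaa), the overlapping window and the normalisation by the total number of dipeptides are all assumptions made for this example.

```python
from collections import Counter
from itertools import product

# One-letter codes of the 20 standard amino acids; anything else becomes "X" (Xaa).
STANDARD = "ACDEFGHIKLMNPQRSTVWY"


def dipeptide_per_mille(proteins):
    """Return {dipeptide: frequency in per mille} over all given sequences.

    Dipeptides are counted with an overlapping window of width 2 and never
    span two different proteins (assumption of this sketch).
    """
    counts = Counter()
    for seq in proteins:
        # Map non-standard or unknown residues to X (Xaa).
        seq = "".join(aa if aa in STANDARD else "X" for aa in seq.upper())
        counts.update(seq[i:i + 2] for i in range(len(seq) - 1))
    total = sum(counts.values())
    alphabet = STANDARD + "X"
    # Report all 441 ordered pairs, including those never observed.
    return {
        a + b: (1000.0 * counts[a + b] / total if total else 0.0)
        for a, b in product(alphabet, repeat=2)
    }


# Toy input: "MAAG" gives MA, AA, AG and "AAC" gives AA, AC (5 dipeptides in total).
freqs = dipeptide_per_mille(["MAAG", "AAC"])
print(freqs["AA"])  # 2 of 5 dipeptides -> 400.0
```

The returned dictionary always has 441 entries, so unobserved pairs appear with a frequency of 0.0, in the same way as the Xaa row and column of the table.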

There are 400 possible dipeptides (441 if Xaa is counted as an additional 21st amino acid representing all non-standard or unknown amino acids). Thus, if dipeptides occurred randomly in proteins, each amino acid dipeptide would be present at a frequency of 0.25% (2.5‰). As this is not the case in nature, for better visibility the dipeptides that occur more often than expected are marked in red in the table, and those that are underrepresented are marked in blue.

All values are given in per mille (‰), so they need to be multiplied by 10⁻³ to obtain a fraction.
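The unit conversion and the comparison with the uniform expectation can be sketched as follows. The helper names `per_mille_to_fraction` and `classify` are hypothetical, and the simple "above or below 2.5‰" rule is an assumption; the exact threshold behind the page's red/blue colouring is not stated. The three sample values are taken from the table.

```python
# Uniform expectation for 400 equally likely dipeptides, in per mille.
EXPECTED_PER_MILLE = 1000.0 / 400  # 0.25% == 2.5 per mille


def per_mille_to_fraction(value):
    """Convert a per mille value to a plain fraction (multiply by 10^-3)."""
    return value * 1e-3


def classify(value, expected=EXPECTED_PER_MILLE):
    """Label a per mille frequency relative to the uniform expectation."""
    if value > expected:
        return "over"
    if value < expected:
        return "under"
    return "expected"


# A few values copied from the table (in per mille).
sample = {"AlaAla": 16.929, "GlyGly": 10.91, "CysCys": 0.054}
for name, value in sample.items():
    print(name, per_mille_to_fraction(value), classify(value))
```

With this rule AlaAla and GlyGly are classified as overrepresented and CysCys as underrepresented. The sketch ignores the ± error of each value.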

For more information, see the sequence space article on Wikipedia.

Ala Cys Asp Glu Phe Gly His Ile Lys Leu Met Asn Pro Gln Arg Ser Thr Val Trp Tyr Xaa
Ala
AlaAla: 16.929 ± 3.057
AlaCys: 0.914 ± 0.256
AlaAsp: 8.223 ± 0.627
AlaGlu: 7.255 ± 0.683
AlaPhe: 2.741 ± 0.383
AlaGly: 12.307 ± 2.195
AlaHis: 2.418 ± 0.383
AlaIle: 4.138 ± 0.498
AlaLys: 4.031 ± 0.507
AlaLeu: 7.793 ± 0.874
AlaMet: 2.311 ± 0.346
AlaAsn: 2.096 ± 0.353
AlaPro: 5.213 ± 0.535
AlaGln: 3.87 ± 0.529
AlaArg: 7.094 ± 0.751
AlaSer: 5.052 ± 0.514
AlaThr: 6.019 ± 0.545
AlaVal: 7.739 ± 0.722
AlaTrp: 3.063 ± 0.651
AlaTyr: 2.311 ± 0.338
AlaXaa: 0.0 ± 0.0
Cys
CysAla: 0.967 ± 0.261
CysCys: 0.054 ± 0.05
CysAsp: 0.914 ± 0.304
CysGlu: 1.129 ± 0.248
CysPhe: 0.161 ± 0.095
CysGly: 1.827 ± 0.38
CysHis: 0.107 ± 0.077
CysIle: 0.107 ± 0.07
CysLys: 0.484 ± 0.144
CysLeu: 0.699 ± 0.249
CysMet: 0.161 ± 0.101
CysAsn: 0.537 ± 0.164
CysPro: 1.021 ± 0.285
CysGln: 0.43 ± 0.154
CysArg: 0.752 ± 0.247
CysSer: 0.537 ± 0.16
CysThr: 0.806 ± 0.199
CysVal: 0.645 ± 0.223
CysTrp: 0.322 ± 0.123
CysTyr: 0.161 ± 0.1
CysXaa: 0.0 ± 0.0
Asp
AspAla: 6.933 ± 0.586
AspCys: 0.967 ± 0.239
AspAsp: 4.407 ± 0.626
AspGlu: 3.225 ± 0.362
AspPhe: 2.096 ± 0.272
AspGly: 6.234 ± 0.606
AspHis: 1.129 ± 0.253
AspIle: 2.741 ± 0.427
AspLys: 1.935 ± 0.312
AspLeu: 6.288 ± 0.566
AspMet: 1.236 ± 0.321
AspAsn: 1.612 ± 0.333
AspPro: 4.514 ± 0.617
AspGln: 2.418 ± 0.321
AspArg: 4.998 ± 0.573
AspSer: 3.332 ± 0.496
AspThr: 4.461 ± 0.557
AspVal: 4.622 ± 0.533
AspTrp: 1.612 ± 0.285
AspTyr: 2.257 ± 0.34
AspXaa: 0.0 ± 0.0
Glu
GluAla: 6.772 ± 0.693
GluCys: 0.752 ± 0.212
GluAsp: 3.171 ± 0.411
GluGlu: 2.795 ± 0.521
GluPhe: 1.72 ± 0.296
GluGly: 3.278 ± 0.418
GluHis: 1.559 ± 0.304
GluIle: 2.687 ± 0.432
GluLys: 1.935 ± 0.329
GluLeu: 5.858 ± 0.765
GluMet: 1.774 ± 0.333
GluAsn: 1.666 ± 0.271
GluPro: 2.848 ± 0.423
GluGln: 2.526 ± 0.423
GluArg: 4.461 ± 0.656
GluSer: 2.687 ± 0.36
GluThr: 4.353 ± 0.575
GluVal: 4.031 ± 0.649
GluTrp: 1.559 ± 0.283
GluTyr: 1.935 ± 0.334
GluXaa: 0.0 ± 0.0
Phe
PheAla: 2.741 ± 0.374
PheCys: 0.269 ± 0.103
PheAsp: 2.311 ± 0.375
PheGlu: 1.666 ± 0.314
PhePhe: 0.914 ± 0.266
PheGly: 2.902 ± 0.509
PheHis: 0.43 ± 0.136
PheIle: 1.182 ± 0.331
PheLys: 0.86 ± 0.231
PheLeu: 1.827 ± 0.279
PheMet: 0.591 ± 0.207
PheAsn: 1.129 ± 0.344
PhePro: 1.612 ± 0.292
PheGln: 0.967 ± 0.336
PheArg: 1.505 ± 0.28
PheSer: 1.451 ± 0.237
PheThr: 2.257 ± 0.414
PheVal: 2.042 ± 0.319
PheTrp: 0.752 ± 0.235
PheTyr: 0.806 ± 0.236
PheXaa: 0.0 ± 0.0
Gly
GlyAla: 9.405 ± 1.129
GlyCys: 0.86 ± 0.215
GlyAsp: 6.342 ± 0.637
GlyGlu: 3.87 ± 0.548
GlyPhe: 2.687 ± 0.341
GlyGly: 10.91 ± 2.706
GlyHis: 2.042 ± 0.285
GlyIle: 4.299 ± 0.561
GlyLys: 2.472 ± 0.362
GlyLeu: 6.234 ± 0.581
GlyMet: 2.58 ± 0.457
GlyAsn: 2.795 ± 0.397
GlyPro: 3.977 ± 0.546
GlyGln: 2.096 ± 0.527
GlyArg: 5.106 ± 0.638
GlySer: 7.739 ± 1.712
GlyThr: 6.073 ± 0.698
GlyVal: 5.428 ± 0.528
GlyTrp: 2.418 ± 0.382
GlyTyr: 2.418 ± 0.5
GlyXaa: 0.0 ± 0.0
His
HisAla: 1.72 ± 0.398
HisCys: 0.43 ± 0.162
HisAsp: 1.075 ± 0.253
HisGlu: 1.075 ± 0.27
HisPhe: 0.43 ± 0.139
HisGly: 2.042 ± 0.299
HisHis: 0.914 ± 0.224
HisIle: 1.182 ± 0.249
HisLys: 0.484 ± 0.144
HisLeu: 1.129 ± 0.215
HisMet: 0.43 ± 0.132
HisAsn: 0.914 ± 0.196
HisPro: 1.827 ± 0.296
HisGln: 0.806 ± 0.243
HisArg: 2.741 ± 0.432
HisSer: 0.645 ± 0.19
HisThr: 2.042 ± 0.383
HisVal: 1.29 ± 0.335
HisTrp: 0.537 ± 0.159
HisTyr: 0.699 ± 0.199
HisXaa: 0.0 ± 0.0
Ile
IleAla: 4.998 ± 0.557
IleCys: 0.43 ± 0.175
IleAsp: 3.816 ± 0.488
IleGlu: 3.601 ± 0.434
IlePhe: 0.699 ± 0.223
IleGly: 3.493 ± 0.535
IleHis: 1.559 ± 0.35
IleIle: 1.451 ± 0.34
IleLys: 0.914 ± 0.206
IleLeu: 2.526 ± 0.427
IleMet: 0.269 ± 0.11
IleAsn: 2.096 ± 0.331
IlePro: 2.58 ± 0.33
IleGln: 1.182 ± 0.271
IleArg: 2.58 ± 0.383
IleSer: 2.15 ± 0.413
IleThr: 3.601 ± 0.418
IleVal: 3.225 ± 0.37
IleTrp: 0.967 ± 0.249
IleTyr: 0.806 ± 0.203
IleXaa: 0.0 ± 0.0
Lys
LysAla: 3.655 ± 0.585
LysCys: 0.376 ± 0.156
LysAsp: 1.559 ± 0.233
LysGlu: 1.505 ± 0.274
LysPhe: 1.129 ± 0.195
LysGly: 2.633 ± 0.312
LysHis: 0.86 ± 0.227
LysIle: 1.075 ± 0.263
LysLys: 1.29 ± 0.38
LysLeu: 2.257 ± 0.432
LysMet: 0.591 ± 0.16
LysAsn: 0.806 ± 0.213
LysPro: 2.096 ± 0.361
LysGln: 1.344 ± 0.278
LysArg: 2.58 ± 0.449
LysSer: 1.666 ± 0.334
LysThr: 2.096 ± 0.33
LysVal: 2.633 ± 0.472
LysTrp: 0.86 ± 0.27
LysTyr: 1.075 ± 0.244
LysXaa: 0.0 ± 0.0
Leu
LeuAla: 7.578 ± 0.858
LeuCys: 0.86 ± 0.272
LeuAsp: 5.643 ± 0.55
LeuGlu: 4.084 ± 0.594
LeuPhe: 2.257 ± 0.302
LeuGly: 5.536 ± 0.553
LeuHis: 1.397 ± 0.32
LeuIle: 2.956 ± 0.484
LeuLys: 1.881 ± 0.305
LeuLeu: 4.891 ± 0.62
LeuMet: 1.72 ± 0.324
LeuAsn: 2.257 ± 0.319
LeuPro: 5.643 ± 0.68
LeuGln: 3.01 ± 0.515
LeuArg: 5.267 ± 0.624
LeuSer: 5.374 ± 0.426
LeuThr: 5.482 ± 0.529
LeuVal: 4.622 ± 0.476
LeuTrp: 1.344 ± 0.294
LeuTyr: 2.096 ± 0.44
LeuXaa: 0.0 ± 0.0
Met
MetAla: 2.096 ± 0.363
MetCys: 0.43 ± 0.162
MetAsp: 1.29 ± 0.288
MetGlu: 1.021 ± 0.21
MetPhe: 0.752 ± 0.26
MetGly: 1.612 ± 0.296
MetHis: 0.161 ± 0.094
MetIle: 0.752 ± 0.214
MetLys: 0.914 ± 0.2
MetLeu: 1.397 ± 0.248
MetMet: 0.591 ± 0.249
MetAsn: 1.021 ± 0.214
MetPro: 1.344 ± 0.27
MetGln: 0.43 ± 0.165
MetArg: 1.505 ± 0.235
MetSer: 3.01 ± 0.426
MetThr: 2.365 ± 0.401
MetVal: 1.182 ± 0.337
MetTrp: 0.322 ± 0.135
MetTyr: 0.269 ± 0.13
MetXaa: 0.0 ± 0.0
Asn
AsnAla: 3.01 ± 0.376
AsnCys: 0.215 ± 0.106
AsnAsp: 1.72 ± 0.294
AsnGlu: 1.827 ± 0.331
AsnPhe: 0.86 ± 0.278
AsnGly: 3.977 ± 0.476
AsnHis: 0.699 ± 0.143
AsnIle: 1.612 ± 0.424
AsnLys: 0.967 ± 0.222
AsnLeu: 2.472 ± 0.404
AsnMet: 0.484 ± 0.138
AsnAsn: 1.559 ± 0.363
AsnPro: 2.687 ± 0.356
AsnGln: 0.914 ± 0.279
AsnArg: 1.827 ± 0.325
AsnSer: 1.612 ± 0.327
AsnThr: 2.042 ± 0.339
AsnVal: 1.935 ± 0.332
AsnTrp: 0.591 ± 0.17
AsnTyr: 0.484 ± 0.152
AsnXaa: 0.0 ± 0.0
Pro
ProAla: 5.159 ± 0.667
ProCys: 1.075 ± 0.224
ProAsp: 4.192 ± 0.517
ProGlu: 4.407 ± 0.479
ProPhe: 1.827 ± 0.364
ProGly: 6.342 ± 0.763
ProHis: 1.827 ± 0.324
ProIle: 1.881 ± 0.333
ProLys: 2.472 ± 0.371
ProLeu: 3.977 ± 0.547
ProMet: 1.612 ± 0.399
ProAsn: 2.257 ± 0.395
ProPro: 3.332 ± 0.549
ProGln: 2.15 ± 0.387
ProArg: 3.386 ± 0.548
ProSer: 3.063 ± 0.356
ProThr: 3.278 ± 0.44
ProVal: 4.461 ± 0.577
ProTrp: 1.182 ± 0.266
ProTyr: 1.236 ± 0.223
ProXaa: 0.0 ± 0.0
Gln
GlnAla: 4.299 ± 0.687
GlnCys: 0.484 ± 0.225
GlnAsp: 1.559 ± 0.257
GlnGlu: 1.612 ± 0.348
GlnPhe: 0.914 ± 0.2
GlnGly: 2.257 ± 0.48
GlnHis: 0.699 ± 0.204
GlnIle: 1.72 ± 0.271
GlnLys: 1.29 ± 0.236
GlnLeu: 3.063 ± 0.493
GlnMet: 0.967 ± 0.205
GlnAsn: 0.752 ± 0.248
GlnPro: 2.633 ± 0.405
GlnGln: 1.397 ± 0.279
GlnArg: 2.311 ± 0.318
GlnSer: 2.365 ± 0.382
GlnThr: 1.935 ± 0.326
GlnVal: 2.257 ± 0.297
GlnTrp: 0.591 ± 0.166
GlnTyr: 0.914 ± 0.267
GlnXaa: 0.0 ± 0.0
Arg
ArgAla: 6.557 ± 0.602
ArgCys: 1.021 ± 0.357
ArgAsp: 4.138 ± 0.552
ArgGlu: 4.514 ± 0.562
ArgPhe: 1.881 ± 0.368
ArgGly: 4.353 ± 0.562
ArgHis: 1.612 ± 0.319
ArgIle: 4.084 ± 0.465
ArgLys: 2.365 ± 0.388
ArgLeu: 4.837 ± 0.67
ArgMet: 2.257 ± 0.378
ArgAsn: 2.472 ± 0.405
ArgPro: 3.762 ± 0.489
ArgGln: 2.365 ± 0.461
ArgArg: 5.589 ± 0.897
ArgSer: 3.762 ± 0.383
ArgThr: 3.816 ± 0.515
ArgVal: 4.783 ± 0.586
ArgTrp: 2.365 ± 0.425
ArgTyr: 1.666 ± 0.326
ArgXaa: 0.0 ± 0.0
Ser
SerAla: 7.632 ± 2.103
SerCys: 0.645 ± 0.222
SerAsp: 3.977 ± 0.447
SerGlu: 3.225 ± 0.495
SerPhe: 1.988 ± 0.376
SerGly: 6.664 ± 0.715
SerHis: 1.021 ± 0.211
SerIle: 2.848 ± 0.39
SerLys: 2.472 ± 0.432
SerLeu: 3.655 ± 0.415
SerMet: 1.236 ± 0.27
SerAsn: 2.042 ± 0.322
SerPro: 3.063 ± 0.381
SerGln: 1.612 ± 0.271
SerArg: 3.225 ± 0.399
SerSer: 3.923 ± 0.578
SerThr: 3.332 ± 0.392
SerVal: 4.353 ± 0.47
SerTrp: 1.075 ± 0.232
SerTyr: 1.505 ± 0.231
SerXaa: 0.0 ± 0.0
Thr
ThrAla: 7.417 ± 0.487
ThrCys: 0.591 ± 0.198
ThrAsp: 4.138 ± 0.611
ThrGlu: 3.708 ± 0.423
ThrPhe: 1.612 ± 0.334
ThrGly: 5.589 ± 0.573
ThrHis: 1.666 ± 0.291
ThrIle: 3.225 ± 0.393
ThrLys: 2.042 ± 0.335
ThrLeu: 5.106 ± 0.602
ThrMet: 1.182 ± 0.247
ThrAsn: 2.257 ± 0.403
ThrPro: 4.783 ± 0.568
ThrGln: 1.881 ± 0.277
ThrArg: 4.514 ± 0.594
ThrSer: 3.923 ± 0.444
ThrThr: 5.106 ± 0.747
ThrVal: 5.428 ± 0.631
ThrTrp: 1.182 ± 0.275
ThrTyr: 1.774 ± 0.333
ThrXaa: 0.0 ± 0.0
Val
ValAla: 8.223 ± 0.649
ValCys: 0.86 ± 0.196
ValAsp: 5.374 ± 0.513
ValGlu: 4.944 ± 0.719
ValPhe: 2.203 ± 0.343
ValGly: 5.159 ± 0.646
ValHis: 1.397 ± 0.304
ValIle: 2.741 ± 0.371
ValLys: 1.935 ± 0.287
ValLeu: 5.428 ± 0.651
ValMet: 1.236 ± 0.202
ValAsn: 1.988 ± 0.329
ValPro: 3.87 ± 0.427
ValGln: 2.741 ± 0.396
ValArg: 4.084 ± 0.577
ValSer: 4.461 ± 0.487
ValThr: 4.891 ± 0.572
ValVal: 5.912 ± 0.741
ValTrp: 2.15 ± 0.378
ValTyr: 1.182 ± 0.222
ValXaa: 0.0 ± 0.0
Trp
TrpAla: 2.311 ± 0.294
TrpCys: 0.215 ± 0.118
TrpAsp: 1.505 ± 0.274
TrpGlu: 1.182 ± 0.331
TrpPhe: 0.645 ± 0.176
TrpGly: 1.021 ± 0.229
TrpHis: 0.645 ± 0.214
TrpIle: 1.236 ± 0.256
TrpLys: 0.645 ± 0.155
TrpLeu: 2.203 ± 0.354
TrpMet: 0.967 ± 0.235
TrpAsn: 0.591 ± 0.216
TrpPro: 1.182 ± 0.295
TrpGln: 1.021 ± 0.259
TrpArg: 2.15 ± 0.393
TrpSer: 1.612 ± 0.399
TrpThr: 1.666 ± 0.314
TrpVal: 2.203 ± 0.421
TrpTrp: 0.967 ± 0.228
TrpTyr: 0.537 ± 0.244
TrpXaa: 0.0 ± 0.0
Tyr
TyrAla: 3.117 ± 0.622
TyrCys: 0.376 ± 0.149
TyrAsp: 1.666 ± 0.433
TyrGlu: 1.774 ± 0.317
TyrPhe: 0.699 ± 0.189
TyrGly: 1.612 ± 0.338
TyrHis: 0.269 ± 0.1
TyrIle: 1.021 ± 0.204
TyrLys: 0.699 ± 0.186
TyrLeu: 2.257 ± 0.393
TyrMet: 0.215 ± 0.102
TyrAsn: 0.645 ± 0.188
TyrPro: 1.129 ± 0.219
TyrGln: 0.86 ± 0.2
TyrArg: 2.526 ± 0.373
TyrSer: 1.129 ± 0.26
TyrThr: 1.505 ± 0.309
TyrVal: 1.988 ± 0.296
TyrTrp: 0.645 ± 0.186
TyrTyr: 0.645 ± 0.176
TyrXaa: 0.0 ± 0.0
Xaa
XaaAla: 0.0 ± 0.0
XaaCys: 0.0 ± 0.0
XaaAsp: 0.0 ± 0.0
XaaGlu: 0.0 ± 0.0
XaaPhe: 0.0 ± 0.0
XaaGly: 0.0 ± 0.0
XaaHis: 0.0 ± 0.0
XaaIle: 0.0 ± 0.0
XaaLys: 0.0 ± 0.0
XaaLeu: 0.0 ± 0.0
XaaMet: 0.0 ± 0.0
XaaAsn: 0.0 ± 0.0
XaaPro: 0.0 ± 0.0
XaaGln: 0.0 ± 0.0
XaaArg: 0.0 ± 0.0
XaaSer: 0.0 ± 0.0
XaaThr: 0.0 ± 0.0
XaaVal: 0.0 ± 0.0
XaaTrp: 0.0 ± 0.0
XaaTyr: 0.0 ± 0.0
XaaXaa: 0.0 ± 0.0
Statistics based on 109 proteins (18608 amino acids)

Note: The error was estimated by bootstrapping (×100) at the protein level.
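A protein-level bootstrap of this kind can be sketched in Python as follows. This is only an illustration under stated assumptions, not the Proteome-pI implementation: the function names `one_dipeptide_per_mille` and `bootstrap_error`, the use of the sample standard deviation as the reported error, the fixed random seed and the toy sequences are all hypothetical.

```python
import random
import statistics


def one_dipeptide_per_mille(proteins, dipeptide):
    """Frequency (per mille) of one dipeptide in one-letter sequences."""
    hits = total = 0
    for seq in proteins:
        for i in range(len(seq) - 1):
            total += 1
            hits += seq[i:i + 2] == dipeptide
    return 1000.0 * hits / total if total else 0.0


def bootstrap_error(proteins, dipeptide, n_resamples=100, seed=0):
    """Standard deviation of the frequency over protein-level resamples."""
    rng = random.Random(seed)
    estimates = []
    for _ in range(n_resamples):
        # Draw whole proteins with replacement, keeping the proteome size,
        # so that residues within a protein are never resampled separately.
        resample = rng.choices(proteins, k=len(proteins))
        estimates.append(one_dipeptide_per_mille(resample, dipeptide))
    return statistics.stdev(estimates)


# Hypothetical toy proteome, for illustration only.
toy_proteome = ["MAAGAAL", "MKKAA", "MGGSGG", "MAVLAA"]
print(bootstrap_error(toy_proteome, "AA"))
```

Resampling whole proteins rather than individual residues keeps each sequence intact, which is what "at the protein level" is taken to mean here.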

The dipeptide statistics above (along with other statistics for this proteome) can be downloaded from this CSV file
See this proteome in: uniprot_link
Proteome-pI is available under the Creative Commons Attribution-NoDerivs license; for more details see here

Reference: Kozlowski LP. Proteome-pI 2.0: Proteome Isoelectric Point Database Update. Nucleic Acids Res. 2021, doi: 10.1093/nar/gkab944
Contact: Lukasz P. Kozlowski