Amino acid dipeptide frequency for Mycobacterium phage Pippy

Apart from single amino acid frequencies, one can also calculate the so-called amino acid dipeptide frequency, i.e. how often each ordered pair of adjacent amino acids occurs.
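As an illustration of the idea, the following minimal Python sketch counts dipeptides in a list of protein sequences. It is not the Proteome-pI implementation: the function name `dipeptide_per_mille`, the use of one-letter codes (the table uses three-letter codes), the overlapping-window counting, and the mapping of every non-standard letter to X (Xaa) are all assumptions made for this example.

```python
from collections import Counter

# The 20 standard amino acids in one-letter code (assumption: input uses one-letter codes).
STANDARD = set("ACDEFGHIKLMNPQRSTVWY")


def dipeptide_per_mille(proteins):
    """Return {dipeptide: frequency in per mille} pooled over all proteins."""
    counts = Counter()
    for seq in proteins:
        # Assumption: anything outside the 20 standard letters is treated as X (Xaa).
        seq = "".join(aa if aa in STANDARD else "X" for aa in seq.upper())
        # Overlapping windows of length 2; pairs never span two proteins.
        counts.update(seq[i:i + 2] for i in range(len(seq) - 1))
    total = sum(counts.values())
    if total == 0:
        return {}
    return {pair: 1000.0 * n / total for pair, n in counts.items()}


# Toy input, not real phage proteins.
freqs = dipeptide_per_mille(["MAAG", "AAC"])
```

In this toy input there are five dipeptides in total (MA, AA, AG, AA, AC), so `freqs["AA"]` is two fifths of the total, expressed in per mille.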

There are 400 possible dipeptides (441 if Xaa is counted as an additional, 21st amino acid representing all non-standard or unknown amino acids). Thus, if dipeptides occurred uniformly at random in proteins, each of the 400 dipeptides would be present at a frequency of 0.25% (1/400). Since this is not the case in nature, for better visibility all dipeptides that are more frequent than expected are marked in red in the table, and those that are underrepresented are marked in blue.

All values are given in per mille (‰); multiply them by 10⁻³ to obtain fractions.
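The uniform expectation and the unit conversion described above can be checked with a few lines of Python. This is a sketch under stated assumptions: the helper names are hypothetical, and the simple "above or below the uniform expectation" rule is only one reading of the red/blue colouring; the exact threshold used on the page is not given here.

```python
def uniform_expectation_per_mille(alphabet_size=20):
    """Expected frequency (in per mille) of each dipeptide if all were equally likely."""
    return 1000.0 / alphabet_size ** 2


def per_mille_to_fraction(value_per_mille):
    """Convert a per mille value to a plain fraction (multiply by 10^-3)."""
    return value_per_mille * 1e-3


def classify(value_per_mille, expected_per_mille):
    """Assumed colouring rule: compare the observed value with the uniform expectation."""
    if value_per_mille > expected_per_mille:
        return "over"
    if value_per_mille < expected_per_mille:
        return "under"
    return "as expected"


expected_20 = uniform_expectation_per_mille(20)  # 400 dipeptides: 0.25%, i.e. 2.5 per mille
expected_21 = uniform_expectation_per_mille(21)  # 441 dipeptides when Xaa is included

# Two values taken from the table: AlaAla (14.578) and CysCys (0.055).
ala_ala = classify(14.578, expected_20)
cys_cys = classify(0.055, expected_20)
```

With the 20-letter alphabet the expectation is 2.5‰, so AlaAla falls in the overrepresented group and CysCys in the underrepresented one.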

For more information, see the sequence space article on Wikipedia.

Dipeptide frequencies (‰), listed as dipeptide: frequency ± error and grouped by the first residue. Residue order: Ala Cys Asp Glu Phe Gly His Ile Lys Leu Met Asn Pro Gln Arg Ser Thr Val Trp Tyr Xaa
Ala
AlaAla: 14.578 ± 1.774
AlaCys: 1.256 ± 0.291
AlaAsp: 7.207 ± 0.593
AlaGlu: 6.989 ± 0.637
AlaPhe: 2.73 ± 0.417
AlaGly: 8.681 ± 1.092
AlaHis: 2.239 ± 0.348
AlaIle: 4.368 ± 0.516
AlaLys: 4.423 ± 0.483
AlaLeu: 8.572 ± 0.864
AlaMet: 1.802 ± 0.296
AlaAsn: 3.549 ± 0.455
AlaPro: 5.351 ± 0.551
AlaGln: 3.658 ± 0.498
AlaArg: 8.081 ± 0.896
AlaSer: 5.132 ± 0.57
AlaThr: 6.607 ± 0.659
AlaVal: 6.497 ± 0.621
AlaTrp: 2.402 ± 0.413
AlaTyr: 2.402 ± 0.393
AlaXaa: 0.0 ± 0.0
Cys
CysAla: 0.983 ± 0.245
CysCys: 0.055 ± 0.055
CysAsp: 1.201 ± 0.274
CysGlu: 0.819 ± 0.197
CysPhe: 0.218 ± 0.114
CysGly: 1.638 ± 0.377
CysHis: 0.437 ± 0.144
CysIle: 0.164 ± 0.085
CysLys: 0.601 ± 0.187
CysLeu: 0.71 ± 0.224
CysMet: 0.055 ± 0.054
CysAsn: 0.491 ± 0.166
CysPro: 1.365 ± 0.312
CysGln: 0.218 ± 0.109
CysArg: 0.819 ± 0.23
CysSer: 0.983 ± 0.249
CysThr: 0.874 ± 0.237
CysVal: 0.874 ± 0.178
CysTrp: 0.328 ± 0.149
CysTyr: 0.273 ± 0.109
CysXaa: 0.0 ± 0.0
Asp
AspAla: 6.989 ± 0.628
AspCys: 0.928 ± 0.193
AspAsp: 4.696 ± 0.63
AspGlu: 4.15 ± 0.412
AspPhe: 1.638 ± 0.238
AspGly: 7.153 ± 0.639
AspHis: 1.201 ± 0.252
AspIle: 2.73 ± 0.377
AspLys: 1.693 ± 0.318
AspLeu: 6.006 ± 0.55
AspMet: 1.147 ± 0.274
AspAsn: 1.474 ± 0.361
AspPro: 4.805 ± 0.485
AspGln: 2.239 ± 0.315
AspArg: 4.696 ± 0.443
AspSer: 3.877 ± 0.593
AspThr: 4.313 ± 0.493
AspVal: 4.586 ± 0.519
AspTrp: 1.583 ± 0.274
AspTyr: 1.529 ± 0.287
AspXaa: 0.0 ± 0.0
Glu
GluAla: 6.607 ± 0.761
GluCys: 0.874 ± 0.26
GluAsp: 3.058 ± 0.369
GluGlu: 3.112 ± 0.558
GluPhe: 2.02 ± 0.354
GluGly: 2.948 ± 0.456
GluHis: 1.802 ± 0.429
GluIle: 2.512 ± 0.395
GluLys: 2.184 ± 0.363
GluLeu: 5.897 ± 0.645
GluMet: 1.747 ± 0.282
GluAsn: 1.529 ± 0.283
GluPro: 3.058 ± 0.45
GluGln: 3.003 ± 0.359
GluArg: 5.242 ± 0.655
GluSer: 3.276 ± 0.552
GluThr: 4.313 ± 0.62
GluVal: 3.713 ± 0.46
GluTrp: 1.802 ± 0.299
GluTyr: 1.802 ± 0.323
GluXaa: 0.0 ± 0.0
Phe
PheAla: 3.058 ± 0.439
PheCys: 0.273 ± 0.113
PheAsp: 2.894 ± 0.518
PheGlu: 1.583 ± 0.282
PhePhe: 0.655 ± 0.159
PheGly: 2.348 ± 0.484
PheHis: 0.437 ± 0.136
PheIle: 1.365 ± 0.354
PheLys: 1.037 ± 0.218
PheLeu: 2.129 ± 0.281
PheMet: 0.874 ± 0.232
PheAsn: 0.983 ± 0.248
PhePro: 1.31 ± 0.264
PheGln: 0.928 ± 0.289
PheArg: 1.966 ± 0.378
PheSer: 1.256 ± 0.262
PheThr: 2.293 ± 0.311
PheVal: 2.348 ± 0.328
PheTrp: 0.491 ± 0.153
PheTyr: 0.71 ± 0.229
PheXaa: 0.0 ± 0.0
Gly
GlyAla: 7.262 ± 1.005
GlyCys: 1.092 ± 0.245
GlyAsp: 5.788 ± 0.483
GlyGlu: 4.532 ± 0.574
GlyPhe: 2.621 ± 0.34
GlyGly: 8.081 ± 1.327
GlyHis: 1.911 ± 0.328
GlyIle: 3.658 ± 0.514
GlyLys: 2.402 ± 0.374
GlyLeu: 5.515 ± 0.537
GlyMet: 2.621 ± 0.451
GlyAsn: 2.566 ± 0.369
GlyPro: 3.713 ± 0.55
GlyGln: 2.457 ± 0.507
GlyArg: 6.115 ± 0.649
GlySer: 4.969 ± 0.621
GlyThr: 5.351 ± 0.571
GlyVal: 6.061 ± 0.562
GlyTrp: 1.966 ± 0.326
GlyTyr: 1.856 ± 0.291
GlyXaa: 0.0 ± 0.0
His
HisAla: 2.402 ± 0.374
HisCys: 0.382 ± 0.131
HisAsp: 1.147 ± 0.217
HisGlu: 1.256 ± 0.273
HisPhe: 0.437 ± 0.138
HisGly: 1.529 ± 0.28
HisHis: 0.874 ± 0.244
HisIle: 1.638 ± 0.279
HisLys: 0.874 ± 0.186
HisLeu: 1.529 ± 0.277
HisMet: 0.655 ± 0.169
HisAsn: 1.092 ± 0.213
HisPro: 1.42 ± 0.256
HisGln: 0.655 ± 0.156
HisArg: 2.293 ± 0.392
HisSer: 0.71 ± 0.178
HisThr: 1.529 ± 0.286
HisVal: 1.365 ± 0.353
HisTrp: 0.546 ± 0.152
HisTyr: 0.819 ± 0.219
HisXaa: 0.0 ± 0.0
Ile
IleAla: 4.969 ± 0.586
IleCys: 0.71 ± 0.249
IleAsp: 3.986 ± 0.529
IleGlu: 3.822 ± 0.389
IlePhe: 0.819 ± 0.217
IleGly: 3.385 ± 0.434
IleHis: 1.201 ± 0.261
IleIle: 1.256 ± 0.286
IleLys: 1.365 ± 0.291
IleLeu: 2.02 ± 0.34
IleMet: 0.382 ± 0.113
IleAsn: 1.802 ± 0.287
IlePro: 2.894 ± 0.366
IleGln: 1.529 ± 0.36
IleArg: 3.003 ± 0.499
IleSer: 2.184 ± 0.426
IleThr: 3.276 ± 0.497
IleVal: 3.331 ± 0.383
IleTrp: 1.092 ± 0.285
IleTyr: 0.601 ± 0.168
IleXaa: 0.0 ± 0.0
Lys
LysAla: 3.931 ± 0.48
LysCys: 0.437 ± 0.166
LysAsp: 2.239 ± 0.284
LysGlu: 1.201 ± 0.258
LysPhe: 1.201 ± 0.238
LysGly: 2.73 ± 0.316
LysHis: 0.928 ± 0.221
LysIle: 1.147 ± 0.257
LysLys: 1.583 ± 0.377
LysLeu: 2.73 ± 0.5
LysMet: 0.601 ± 0.162
LysAsn: 1.201 ± 0.233
LysPro: 2.621 ± 0.414
LysGln: 1.529 ± 0.279
LysArg: 2.129 ± 0.376
LysSer: 2.184 ± 0.338
LysThr: 2.512 ± 0.482
LysVal: 2.457 ± 0.41
LysTrp: 0.546 ± 0.169
LysTyr: 1.092 ± 0.303
LysXaa: 0.0 ± 0.0
Leu
LeuAla: 8.572 ± 0.881
LeuCys: 0.874 ± 0.205
LeuAsp: 5.187 ± 0.587
LeuGlu: 4.04 ± 0.529
LeuPhe: 1.856 ± 0.26
LeuGly: 5.46 ± 0.649
LeuHis: 1.365 ± 0.267
LeuIle: 3.003 ± 0.44
LeuLys: 2.402 ± 0.407
LeuLeu: 5.078 ± 0.619
LeuMet: 1.802 ± 0.291
LeuAsn: 2.839 ± 0.404
LeuPro: 4.859 ± 0.594
LeuGln: 2.948 ± 0.439
LeuArg: 5.187 ± 0.786
LeuSer: 5.405 ± 0.524
LeuThr: 5.624 ± 0.526
LeuVal: 5.515 ± 0.552
LeuTrp: 1.201 ± 0.277
LeuTyr: 2.184 ± 0.379
LeuXaa: 0.0 ± 0.0
Met
MetAla: 2.839 ± 0.46
MetCys: 0.218 ± 0.133
MetAsp: 1.42 ± 0.274
MetGlu: 1.037 ± 0.199
MetPhe: 0.71 ± 0.238
MetGly: 1.42 ± 0.264
MetHis: 0.218 ± 0.102
MetIle: 0.983 ± 0.262
MetLys: 0.601 ± 0.208
MetLeu: 1.966 ± 0.336
MetMet: 0.546 ± 0.211
MetAsn: 0.655 ± 0.166
MetPro: 1.092 ± 0.234
MetGln: 0.491 ± 0.146
MetArg: 1.583 ± 0.318
MetSer: 2.566 ± 0.408
MetThr: 2.566 ± 0.362
MetVal: 1.638 ± 0.319
MetTrp: 0.218 ± 0.111
MetTyr: 0.382 ± 0.147
MetXaa: 0.0 ± 0.0
Asn
AsnAla: 3.221 ± 0.355
AsnCys: 0.218 ± 0.108
AsnAsp: 1.474 ± 0.26
AsnGlu: 1.474 ± 0.34
AsnPhe: 0.71 ± 0.275
AsnGly: 3.385 ± 0.474
AsnHis: 0.928 ± 0.217
AsnIle: 1.147 ± 0.317
AsnLys: 0.983 ± 0.236
AsnLeu: 2.73 ± 0.419
AsnMet: 1.037 ± 0.21
AsnAsn: 1.638 ± 0.345
AsnPro: 2.566 ± 0.324
AsnGln: 1.201 ± 0.269
AsnArg: 2.075 ± 0.34
AsnSer: 1.529 ± 0.286
AsnThr: 1.802 ± 0.255
AsnVal: 2.075 ± 0.408
AsnTrp: 0.764 ± 0.181
AsnTyr: 0.874 ± 0.216
AsnXaa: 0.0 ± 0.0
Pro
ProAla: 5.242 ± 0.616
ProCys: 0.819 ± 0.209
ProAsp: 4.368 ± 0.527
ProGlu: 4.259 ± 0.52
ProPhe: 2.184 ± 0.37
ProGly: 5.897 ± 0.657
ProHis: 1.583 ± 0.305
ProIle: 2.129 ± 0.379
ProLys: 2.566 ± 0.44
ProLeu: 4.259 ± 0.501
ProMet: 1.693 ± 0.304
ProAsn: 2.129 ± 0.293
ProPro: 4.04 ± 0.546
ProGln: 2.512 ± 0.364
ProArg: 3.167 ± 0.473
ProSer: 3.058 ± 0.339
ProThr: 3.221 ± 0.517
ProVal: 4.423 ± 0.576
ProTrp: 1.42 ± 0.231
ProTyr: 1.256 ± 0.303
ProXaa: 0.0 ± 0.0
Gln
GlnAla: 4.641 ± 0.546
GlnCys: 0.218 ± 0.147
GlnAsp: 1.529 ± 0.295
GlnGlu: 1.747 ± 0.326
GlnPhe: 1.31 ± 0.249
GlnGly: 2.402 ± 0.378
GlnHis: 1.092 ± 0.238
GlnIle: 1.911 ± 0.328
GlnLys: 1.31 ± 0.237
GlnLeu: 3.221 ± 0.425
GlnMet: 0.382 ± 0.123
GlnAsn: 0.874 ± 0.27
GlnPro: 2.402 ± 0.39
GlnGln: 1.31 ± 0.239
GlnArg: 2.512 ± 0.371
GlnSer: 1.911 ± 0.303
GlnThr: 1.693 ± 0.299
GlnVal: 2.839 ± 0.422
GlnTrp: 0.819 ± 0.187
GlnTyr: 0.874 ± 0.253
GlnXaa: 0.0 ± 0.0
Arg
ArgAla: 7.426 ± 0.746
ArgCys: 1.365 ± 0.333
ArgAsp: 5.242 ± 0.616
ArgGlu: 4.805 ± 0.631
ArgPhe: 2.675 ± 0.396
ArgGly: 4.423 ± 0.601
ArgHis: 1.966 ± 0.436
ArgIle: 4.423 ± 0.514
ArgLys: 3.112 ± 0.544
ArgLeu: 5.296 ± 0.621
ArgMet: 2.348 ± 0.359
ArgAsn: 2.348 ± 0.383
ArgPro: 3.385 ± 0.431
ArgGln: 2.184 ± 0.426
ArgArg: 7.316 ± 0.865
ArgSer: 3.167 ± 0.379
ArgThr: 3.713 ± 0.508
ArgVal: 5.078 ± 0.706
ArgTrp: 2.129 ± 0.392
ArgTyr: 1.747 ± 0.353
ArgXaa: 0.0 ± 0.0
Ser
SerAla: 6.061 ± 0.698
SerCys: 0.655 ± 0.193
SerAsp: 3.658 ± 0.394
SerGlu: 3.167 ± 0.415
SerPhe: 1.802 ± 0.383
SerGly: 4.914 ± 0.465
SerHis: 0.874 ± 0.205
SerIle: 2.621 ± 0.339
SerLys: 2.075 ± 0.36
SerLeu: 3.658 ± 0.398
SerMet: 1.529 ± 0.271
SerAsn: 2.02 ± 0.326
SerPro: 3.549 ± 0.358
SerGln: 1.747 ± 0.262
SerArg: 3.658 ± 0.466
SerSer: 2.785 ± 0.495
SerThr: 3.494 ± 0.453
SerVal: 4.859 ± 0.572
SerTrp: 0.983 ± 0.235
SerTyr: 1.638 ± 0.238
SerXaa: 0.0 ± 0.0
Thr
ThrAla: 6.552 ± 0.724
ThrCys: 0.764 ± 0.262
ThrAsp: 4.805 ± 0.527
ThrGlu: 4.095 ± 0.416
ThrPhe: 1.911 ± 0.355
ThrGly: 5.842 ± 0.543
ThrHis: 1.856 ± 0.339
ThrIle: 3.331 ± 0.364
ThrLys: 2.02 ± 0.354
ThrLeu: 5.078 ± 0.632
ThrMet: 1.693 ± 0.261
ThrAsn: 1.638 ± 0.295
ThrPro: 4.859 ± 0.713
ThrGln: 2.02 ± 0.271
ThrArg: 4.313 ± 0.583
ThrSer: 3.549 ± 0.5
ThrThr: 4.75 ± 0.711
ThrVal: 5.132 ± 0.547
ThrTrp: 1.037 ± 0.265
ThrTyr: 1.583 ± 0.284
ThrXaa: 0.0 ± 0.0
Val
ValAla: 7.371 ± 0.686
ValCys: 1.147 ± 0.239
ValAsp: 4.75 ± 0.573
ValGlu: 5.296 ± 0.532
ValPhe: 1.856 ± 0.344
ValGly: 5.132 ± 0.512
ValHis: 1.529 ± 0.321
ValIle: 3.058 ± 0.487
ValLys: 2.457 ± 0.364
ValLeu: 5.242 ± 0.639
ValMet: 1.147 ± 0.239
ValAsn: 1.638 ± 0.269
ValPro: 4.368 ± 0.407
ValGln: 2.675 ± 0.319
ValArg: 5.242 ± 0.722
ValSer: 4.969 ± 0.59
ValThr: 5.187 ± 0.636
ValVal: 5.842 ± 0.593
ValTrp: 1.693 ± 0.266
ValTyr: 1.583 ± 0.399
ValXaa: 0.0 ± 0.0
Trp
TrpAla: 1.802 ± 0.345
TrpCys: 0.382 ± 0.149
TrpAsp: 1.256 ± 0.279
TrpGlu: 0.983 ± 0.333
TrpPhe: 0.764 ± 0.207
TrpGly: 1.201 ± 0.23
TrpHis: 0.437 ± 0.163
TrpIle: 1.147 ± 0.169
TrpLys: 0.71 ± 0.185
TrpLeu: 2.02 ± 0.385
TrpMet: 0.655 ± 0.191
TrpAsn: 0.655 ± 0.166
TrpPro: 0.819 ± 0.241
TrpGln: 0.874 ± 0.237
TrpArg: 2.566 ± 0.479
TrpSer: 1.201 ± 0.249
TrpThr: 2.02 ± 0.33
TrpVal: 1.474 ± 0.32
TrpTrp: 0.874 ± 0.21
TrpTyr: 0.655 ± 0.184
TrpXaa: 0.0 ± 0.0
Tyr
TyrAla: 2.075 ± 0.387
TyrCys: 0.437 ± 0.143
TyrAsp: 1.529 ± 0.385
TyrGlu: 2.129 ± 0.363
TyrPhe: 0.819 ± 0.188
TyrGly: 1.966 ± 0.306
TyrHis: 0.273 ± 0.123
TyrIle: 1.037 ± 0.227
TyrLys: 0.601 ± 0.216
TyrLeu: 1.747 ± 0.348
TyrMet: 0.382 ± 0.112
TyrAsn: 0.71 ± 0.156
TyrPro: 1.747 ± 0.299
TyrGln: 0.819 ± 0.198
TyrArg: 2.184 ± 0.364
TyrSer: 1.037 ± 0.271
TyrThr: 1.693 ± 0.258
TyrVal: 2.075 ± 0.324
TyrTrp: 0.601 ± 0.167
TyrTyr: 0.328 ± 0.099
TyrXaa: 0.0 ± 0.0
Xaa
XaaAla: 0.0 ± 0.0
XaaCys: 0.0 ± 0.0
XaaAsp: 0.0 ± 0.0
XaaGlu: 0.0 ± 0.0
XaaPhe: 0.0 ± 0.0
XaaGly: 0.0 ± 0.0
XaaHis: 0.0 ± 0.0
XaaIle: 0.0 ± 0.0
XaaLys: 0.0 ± 0.0
XaaLeu: 0.0 ± 0.0
XaaMet: 0.0 ± 0.0
XaaAsn: 0.0 ± 0.0
XaaPro: 0.0 ± 0.0
XaaGln: 0.0 ± 0.0
XaaArg: 0.0 ± 0.0
XaaSer: 0.0 ± 0.0
XaaThr: 0.0 ± 0.0
XaaVal: 0.0 ± 0.0
XaaTrp: 0.0 ± 0.0
XaaTyr: 0.0 ± 0.0
XaaXaa: 0.0 ± 0.0
Statistics based on 103 proteins (18316 amino acids)

Note: The error was estimated by bootstrapping (100 replicates) at the protein level.
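A protein-level bootstrap of this kind can be sketched as follows. This is an illustration only, not the Proteome-pI code: the function names, the fixed random seed, and the choice to report the standard deviation of the replicate frequencies as the error are assumptions. The idea is to resample whole proteins with replacement, recompute the dipeptide frequency for each resampled proteome, and measure the spread.

```python
import random
import statistics


def pair_frequency(proteins, pair):
    """Frequency of one dipeptide, in per mille, pooled over a list of sequences."""
    hits = 0
    total = 0
    for seq in proteins:
        # Overlapping windows of length 2 within each protein.
        for i in range(len(seq) - 1):
            total += 1
            if seq[i:i + 2] == pair:
                hits += 1
    return 1000.0 * hits / total if total else 0.0


def bootstrap_error(proteins, pair, n_boot=100, seed=0):
    """Assumed error estimate: standard deviation over n_boot protein-level resamples."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    replicates = []
    for _ in range(n_boot):
        # Resample whole proteins with replacement, keeping the proteome size fixed.
        sample = rng.choices(proteins, k=len(proteins))
        replicates.append(pair_frequency(sample, pair))
    return statistics.stdev(replicates)


# Toy sequences in one-letter code, not real phage proteins.
toy_proteome = ["MAAGKAA", "MKLAAV", "MGGSAAC", "MTTPLR"]
error_aa = bootstrap_error(toy_proteome, "AA")
```

Resampling proteins rather than individual dipeptides keeps the pairs from one protein together, so the estimate reflects variation between proteins.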

The dipeptide statistics above (along with other statistics for this proteome) can be downloaded from this CSV file
See this proteome in: uniprot_link
Proteome-pI is available under Creative Commons Attribution-NoDerivs license, for more details see here

Reference: Kozlowski LP. Proteome-pI 2.0: Proteome Isoelectric Point Database Update. Nucleic Acids Res. 2021, doi: 10.1093/nar/gkab944
Contact: Lukasz P. Kozlowski