Amino acid dipeptide frequency for Mycobacterium phage Bruin

Apart from single amino acid frequencies, one can also calculate the so-called amino acid dipeptide frequency.

There are 400 possible dipeptides (441 if Xaa is counted as an additional 21st amino acid representing all non-standard or unknown amino acids). If dipeptides occurred uniformly at random in proteins, each would therefore appear with a frequency of 0.25% (1/400). This is not the case in nature, so for better visibility all overrepresented dipeptides are marked in red and all underrepresented ones in blue in the table.
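The counting described above can be sketched as follows. This is a minimal illustration, not the database's actual code: the function name `dipeptide_frequencies`, the toy sequences, and the choice to normalize by the total number of overlapping dipeptides are all assumptions made for the example.

```python
from collections import Counter
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard one-letter codes


def dipeptide_frequencies(proteins):
    """Return the fraction of each overlapping dipeptide across all proteins."""
    counts = Counter()
    for seq in proteins:
        # Overlapping windows of length 2; pairs never span two proteins.
        for i in range(len(seq) - 1):
            counts[seq[i:i + 2]] += 1
    # Assumption: normalize by the total number of dipeptides counted.
    total = sum(counts.values())
    return {pair: n / total for pair, n in counts.items()}


# Hypothetical toy sequences, not taken from the Bruin proteome.
toy_proteins = ["MAAGLA", "MGGAA"]
freqs = dipeptide_frequencies(toy_proteins)

n_possible = len(list(product(AMINO_ACIDS, repeat=2)))  # 20 * 20 = 400
uniform = 1 / n_possible  # 0.0025, i.e. 0.25% per dipeptide

for pair, f in sorted(freqs.items()):
    label = "over" if f > uniform else "under"
    print(f"{pair}: {f:.3f} ({label}represented relative to uniform)")
```

With only two short sequences, every observed dipeptide is far above the uniform expectation; the comparison becomes meaningful only for a whole proteome.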

All values are given in per mille (‰), so they must be multiplied by 10^-3 to obtain fractions.
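As a small worked example of the unit conversion, the sketch below turns a per mille value into a fraction and compares it with the uniform expectation of 0.25% (2.5‰). The helper names `permille_to_fraction` and `fold_vs_uniform` are hypothetical; the AlaAla and CysCys values are the ones listed in the table.

```python
UNIFORM_PERMILLE = 1000 / 400  # 2.5 per mille, i.e. 0.25% per dipeptide


def permille_to_fraction(value_permille):
    """Convert a per mille value to a plain fraction (multiply by 10^-3)."""
    return value_permille * 1e-3


def fold_vs_uniform(value_permille):
    """Ratio of an observed per mille value to the uniform expectation."""
    return value_permille / UNIFORM_PERMILLE


ala_ala = 11.939  # AlaAla, in per mille
cys_cys = 0.215   # CysCys, in per mille

print(permille_to_fraction(ala_ala))  # fraction of all dipeptides
print(fold_vs_uniform(ala_ala))       # above 1: overrepresented
print(fold_vs_uniform(cys_cys))       # below 1: underrepresented
```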

For more information, see the sequence space article on Wikipedia.

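Dipeptide frequencies (‰ ± bootstrap error), grouped by first residue; second residues in order: Ala Cys Asp Glu Phe Gly His Ile Lys Leu Met Asn Pro Gln Arg Ser Thr Val Trp Tyr Xaa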
Ala
AlaAla: 11.939 ± 1.279
AlaCys: 1.202 ± 0.257
AlaAsp: 6.098 ± 0.533
AlaGlu: 8.847 ± 0.816
AlaPhe: 2.791 ± 0.369
AlaGly: 8.847 ± 0.816
AlaHis: 2.319 ± 0.411
AlaIle: 3.951 ± 0.454
AlaLys: 4.982 ± 0.486
AlaLeu: 7.945 ± 0.567
AlaMet: 2.92 ± 0.38
AlaAsn: 2.963 ± 0.593
AlaPro: 4.595 ± 0.441
AlaGln: 3.479 ± 0.384
AlaArg: 5.884 ± 0.728
AlaSer: 5.841 ± 0.625
AlaThr: 6.399 ± 0.54
AlaVal: 5.927 ± 0.494
AlaTrp: 1.761 ± 0.262
AlaTyr: 2.663 ± 0.382
AlaXaa: 0.0 ± 0.0
Cys
CysAla: 1.074 ± 0.225
CysCys: 0.215 ± 0.107
CysAsp: 0.644 ± 0.2
CysGlu: 0.687 ± 0.211
CysPhe: 0.344 ± 0.128
CysGly: 1.89 ± 0.35
CysHis: 0.258 ± 0.087
CysIle: 0.472 ± 0.133
CysLys: 0.515 ± 0.154
CysLeu: 1.374 ± 0.301
CysMet: 0.172 ± 0.086
CysAsn: 0.472 ± 0.144
CysPro: 0.687 ± 0.18
CysGln: 0.515 ± 0.179
CysArg: 0.644 ± 0.21
CysSer: 0.902 ± 0.223
CysThr: 0.429 ± 0.121
CysVal: 0.73 ± 0.185
CysTrp: 0.215 ± 0.103
CysTyr: 0.429 ± 0.147
CysXaa: 0.0 ± 0.0
Asp
AspAla: 5.884 ± 0.557
AspCys: 1.288 ± 0.29
AspAsp: 3.951 ± 0.469
AspGlu: 4.295 ± 0.581
AspPhe: 1.503 ± 0.242
AspGly: 6.828 ± 0.611
AspHis: 1.546 ± 0.31
AspIle: 3.35 ± 0.443
AspLys: 2.577 ± 0.347
AspLeu: 4.638 ± 0.364
AspMet: 1.503 ± 0.266
AspAsn: 1.761 ± 0.31
AspPro: 3.736 ± 0.325
AspGln: 1.761 ± 0.375
AspArg: 4.123 ± 0.419
AspSer: 2.92 ± 0.393
AspThr: 3.565 ± 0.377
AspVal: 4.423 ± 0.465
AspTrp: 1.546 ± 0.233
AspTyr: 2.663 ± 0.44
AspXaa: 0.0 ± 0.0
Glu
GluAla: 7.043 ± 0.736
GluCys: 1.074 ± 0.235
GluAsp: 4.896 ± 0.531
GluGlu: 4.896 ± 0.517
GluPhe: 2.577 ± 0.338
GluGly: 4.423 ± 0.482
GluHis: 1.331 ± 0.257
GluIle: 2.92 ± 0.346
GluLys: 2.663 ± 0.377
GluLeu: 7.344 ± 0.57
GluMet: 1.202 ± 0.22
GluAsn: 2.147 ± 0.266
GluPro: 3.092 ± 0.462
GluGln: 2.362 ± 0.308
GluArg: 4.037 ± 0.46
GluSer: 2.491 ± 0.321
GluThr: 3.565 ± 0.403
GluVal: 4.252 ± 0.454
GluTrp: 1.202 ± 0.241
GluTyr: 2.362 ± 0.322
GluXaa: 0.0 ± 0.0
Phe
PheAla: 2.92 ± 0.4
PheCys: 0.644 ± 0.165
PheAsp: 2.448 ± 0.318
PheGlu: 2.061 ± 0.279
PhePhe: 0.988 ± 0.234
PheGly: 2.62 ± 0.412
PheHis: 0.988 ± 0.224
PheIle: 1.417 ± 0.247
PheLys: 1.245 ± 0.231
PheLeu: 1.804 ± 0.306
PheMet: 0.601 ± 0.148
PheAsn: 1.031 ± 0.224
PhePro: 1.245 ± 0.223
PheGln: 0.816 ± 0.204
PheArg: 1.288 ± 0.217
PheSer: 2.018 ± 0.373
PheThr: 2.233 ± 0.348
PheVal: 1.933 ± 0.356
PheTrp: 0.301 ± 0.11
PheTyr: 0.988 ± 0.19
PheXaa: 0.0 ± 0.0
Gly
GlyAla: 7.859 ± 0.793
GlyCys: 0.773 ± 0.205
GlyAsp: 5.54 ± 0.514
GlyGlu: 5.239 ± 0.485
GlyPhe: 2.19 ± 0.302
GlyGly: 9.62 ± 1.715
GlyHis: 2.018 ± 0.29
GlyIle: 4.252 ± 0.507
GlyLys: 4.123 ± 0.346
GlyLeu: 6.442 ± 0.506
GlyMet: 2.233 ± 0.377
GlyAsn: 3.393 ± 0.356
GlyPro: 3.565 ± 0.496
GlyGln: 2.405 ± 0.431
GlyArg: 4.338 ± 0.412
GlySer: 5.282 ± 0.573
GlyThr: 6.657 ± 0.611
GlyVal: 6.485 ± 0.575
GlyTrp: 1.632 ± 0.264
GlyTyr: 3.522 ± 0.412
GlyXaa: 0.0 ± 0.0
His
HisAla: 2.104 ± 0.318
HisCys: 0.558 ± 0.166
HisAsp: 1.589 ± 0.296
HisGlu: 1.288 ± 0.244
HisPhe: 0.687 ± 0.136
HisGly: 2.362 ± 0.383
HisHis: 1.074 ± 0.23
HisIle: 1.031 ± 0.216
HisLys: 1.16 ± 0.225
HisLeu: 2.577 ± 0.333
HisMet: 0.515 ± 0.136
HisAsn: 1.031 ± 0.23
HisPro: 1.632 ± 0.297
HisGln: 0.902 ± 0.246
HisArg: 1.374 ± 0.234
HisSer: 1.245 ± 0.222
HisThr: 1.331 ± 0.255
HisVal: 1.245 ± 0.227
HisTrp: 0.387 ± 0.138
HisTyr: 0.945 ± 0.23
HisXaa: 0.0 ± 0.0
Ile
IleAla: 4.681 ± 0.439
IleCys: 0.387 ± 0.136
IleAsp: 3.35 ± 0.355
IleGlu: 3.65 ± 0.396
IlePhe: 0.945 ± 0.249
IleGly: 3.092 ± 0.399
IleHis: 1.288 ± 0.248
IleIle: 1.589 ± 0.282
IleLys: 1.89 ± 0.261
IleLeu: 3.264 ± 0.417
IleMet: 0.859 ± 0.182
IleAsn: 2.147 ± 0.332
IlePro: 2.405 ± 0.262
IleGln: 1.718 ± 0.306
IleArg: 2.706 ± 0.346
IleSer: 2.749 ± 0.42
IleThr: 2.834 ± 0.37
IleVal: 3.436 ± 0.375
IleTrp: 0.515 ± 0.145
IleTyr: 1.245 ± 0.253
IleXaa: 0.0 ± 0.0
Lys
LysAla: 4.81 ± 0.508
LysCys: 0.687 ± 0.153
LysAsp: 2.663 ± 0.366
LysGlu: 2.448 ± 0.31
LysPhe: 1.288 ± 0.221
LysGly: 3.436 ± 0.462
LysHis: 1.245 ± 0.265
LysIle: 1.718 ± 0.28
LysLys: 2.362 ± 0.318
LysLeu: 3.908 ± 0.373
LysMet: 1.245 ± 0.264
LysAsn: 1.589 ± 0.284
LysPro: 2.534 ± 0.394
LysGln: 1.675 ± 0.228
LysArg: 3.178 ± 0.455
LysSer: 2.963 ± 0.433
LysThr: 1.847 ± 0.281
LysVal: 3.436 ± 0.292
LysTrp: 1.16 ± 0.201
LysTyr: 1.117 ± 0.216
LysXaa: 0.0 ± 0.0
Leu
LeuAla: 9.062 ± 0.788
LeuCys: 0.687 ± 0.148
LeuAsp: 4.595 ± 0.45
LeuGlu: 4.638 ± 0.487
LeuPhe: 2.276 ± 0.327
LeuGly: 5.97 ± 0.543
LeuHis: 1.804 ± 0.314
LeuIle: 3.951 ± 0.368
LeuLys: 4.037 ± 0.41
LeuLeu: 6.914 ± 0.616
LeuMet: 2.448 ± 0.311
LeuAsn: 3.436 ± 0.361
LeuPro: 3.822 ± 0.414
LeuGln: 2.92 ± 0.333
LeuArg: 5.411 ± 0.405
LeuSer: 5.154 ± 0.436
LeuThr: 5.669 ± 0.531
LeuVal: 5.368 ± 0.426
LeuTrp: 1.288 ± 0.263
LeuTyr: 1.761 ± 0.292
LeuXaa: 0.0 ± 0.0
Met
MetAla: 2.491 ± 0.352
MetCys: 0.301 ± 0.123
MetAsp: 1.288 ± 0.257
MetGlu: 1.503 ± 0.284
MetPhe: 0.73 ± 0.186
MetGly: 1.546 ± 0.27
MetHis: 0.558 ± 0.209
MetIle: 0.902 ± 0.182
MetLys: 1.761 ± 0.312
MetLeu: 1.546 ± 0.216
MetMet: 0.429 ± 0.136
MetAsn: 0.945 ± 0.212
MetPro: 1.074 ± 0.189
MetGln: 0.687 ± 0.227
MetArg: 1.331 ± 0.232
MetSer: 2.491 ± 0.451
MetThr: 1.933 ± 0.28
MetVal: 1.417 ± 0.257
MetTrp: 0.558 ± 0.161
MetTyr: 0.301 ± 0.102
MetXaa: 0.0 ± 0.0
Asn
AsnAla: 2.834 ± 0.408
AsnCys: 0.258 ± 0.087
AsnAsp: 2.319 ± 0.313
AsnGlu: 1.933 ± 0.285
AsnPhe: 1.031 ± 0.265
AsnGly: 4.552 ± 0.59
AsnHis: 0.816 ± 0.235
AsnIle: 1.331 ± 0.342
AsnLys: 0.945 ± 0.158
AsnLeu: 3.479 ± 0.544
AsnMet: 0.387 ± 0.215
AsnAsn: 0.945 ± 0.215
AsnPro: 2.92 ± 0.366
AsnGln: 1.202 ± 0.196
AsnArg: 2.19 ± 0.265
AsnSer: 1.89 ± 0.27
AsnThr: 1.847 ± 0.342
AsnVal: 2.061 ± 0.299
AsnTrp: 0.558 ± 0.176
AsnTyr: 0.773 ± 0.159
AsnXaa: 0.0 ± 0.0
Pro
ProAla: 5.239 ± 0.483
ProCys: 0.601 ± 0.167
ProAsp: 3.693 ± 0.342
ProGlu: 3.307 ± 0.359
ProPhe: 1.89 ± 0.308
ProGly: 5.025 ± 0.672
ProHis: 1.546 ± 0.267
ProIle: 2.405 ± 0.288
ProLys: 2.534 ± 0.394
ProLeu: 3.736 ± 0.34
ProMet: 1.202 ± 0.198
ProAsn: 1.804 ± 0.225
ProPro: 3.006 ± 0.426
ProGln: 1.761 ± 0.286
ProArg: 2.663 ± 0.388
ProSer: 2.834 ± 0.303
ProThr: 3.393 ± 0.289
ProVal: 3.522 ± 0.439
ProTrp: 1.202 ± 0.213
ProTyr: 1.675 ± 0.243
ProXaa: 0.0 ± 0.0
Gln
GlnAla: 3.865 ± 0.36
GlnCys: 0.344 ± 0.179
GlnAsp: 1.718 ± 0.285
GlnGlu: 1.89 ± 0.274
GlnPhe: 0.773 ± 0.161
GlnGly: 2.362 ± 0.312
GlnHis: 0.472 ± 0.12
GlnIle: 1.503 ± 0.288
GlnLys: 1.46 ± 0.31
GlnLeu: 3.65 ± 0.576
GlnMet: 0.859 ± 0.203
GlnAsn: 0.902 ± 0.209
GlnPro: 1.589 ± 0.269
GlnGln: 1.031 ± 0.21
GlnArg: 2.362 ± 0.301
GlnSer: 2.061 ± 0.301
GlnThr: 1.675 ± 0.21
GlnVal: 2.534 ± 0.387
GlnTrp: 0.773 ± 0.221
GlnTyr: 0.988 ± 0.191
GlnXaa: 0.0 ± 0.0
Arg
ArgAla: 6.7 ± 0.665
ArgCys: 0.902 ± 0.242
ArgAsp: 4.209 ± 0.494
ArgGlu: 3.178 ± 0.36
ArgPhe: 1.89 ± 0.314
ArgGly: 3.736 ± 0.371
ArgHis: 2.061 ± 0.339
ArgIle: 2.663 ± 0.37
ArgLys: 3.092 ± 0.392
ArgLeu: 4.509 ± 0.534
ArgMet: 2.147 ± 0.311
ArgAsn: 1.589 ± 0.221
ArgPro: 3.092 ± 0.378
ArgGln: 1.847 ± 0.274
ArgArg: 3.908 ± 0.511
ArgSer: 2.834 ± 0.327
ArgThr: 2.749 ± 0.331
ArgVal: 4.466 ± 0.491
ArgTrp: 1.417 ± 0.262
ArgTyr: 2.276 ± 0.299
ArgXaa: 0.0 ± 0.0
Ser
SerAla: 5.068 ± 0.538
SerCys: 0.387 ± 0.123
SerAsp: 3.221 ± 0.385
SerGlu: 3.522 ± 0.426
SerPhe: 1.976 ± 0.308
SerGly: 5.97 ± 0.841
SerHis: 1.417 ± 0.247
SerIle: 2.061 ± 0.3
SerLys: 2.19 ± 0.372
SerLeu: 4.982 ± 0.524
SerMet: 1.202 ± 0.216
SerAsn: 2.018 ± 0.271
SerPro: 2.92 ± 0.286
SerGln: 1.503 ± 0.297
SerArg: 3.479 ± 0.447
SerSer: 3.65 ± 0.539
SerThr: 3.779 ± 0.5
SerVal: 3.951 ± 0.43
SerTrp: 1.503 ± 0.279
SerTyr: 1.546 ± 0.226
SerXaa: 0.0 ± 0.0
Thr
ThrAla: 6.399 ± 0.542
ThrCys: 0.601 ± 0.15
ThrAsp: 3.565 ± 0.397
ThrGlu: 4.123 ± 0.453
ThrPhe: 2.018 ± 0.302
ThrGly: 5.454 ± 0.6
ThrHis: 1.202 ± 0.183
ThrIle: 3.264 ± 0.397
ThrLys: 2.491 ± 0.4
ThrLeu: 4.896 ± 0.467
ThrMet: 1.46 ± 0.312
ThrAsn: 2.233 ± 0.306
ThrPro: 4.982 ± 0.578
ThrGln: 1.933 ± 0.317
ThrArg: 2.92 ± 0.375
ThrSer: 2.577 ± 0.347
ThrThr: 3.522 ± 0.491
ThrVal: 5.54 ± 0.418
ThrTrp: 1.417 ± 0.338
ThrTyr: 1.976 ± 0.248
ThrXaa: 0.0 ± 0.0
Val
ValAla: 7.344 ± 0.589
ValCys: 0.945 ± 0.213
ValAsp: 4.896 ± 0.495
ValGlu: 5.626 ± 0.672
ValPhe: 2.061 ± 0.345
ValGly: 5.025 ± 0.452
ValHis: 1.718 ± 0.29
ValIle: 3.607 ± 0.443
ValLys: 3.522 ± 0.456
ValLeu: 3.994 ± 0.48
ValMet: 1.546 ± 0.233
ValAsn: 2.491 ± 0.302
ValPro: 3.607 ± 0.425
ValGln: 2.233 ± 0.324
ValArg: 4.209 ± 0.484
ValSer: 3.951 ± 0.396
ValThr: 5.497 ± 0.633
ValVal: 5.927 ± 0.763
ValTrp: 0.773 ± 0.218
ValTyr: 1.976 ± 0.263
ValXaa: 0.0 ± 0.0
Trp
TrpAla: 1.46 ± 0.233
TrpCys: 0.387 ± 0.153
TrpAsp: 1.589 ± 0.213
TrpGlu: 0.816 ± 0.194
TrpPhe: 0.816 ± 0.2
TrpGly: 1.202 ± 0.238
TrpHis: 0.687 ± 0.187
TrpIle: 1.031 ± 0.193
TrpLys: 0.644 ± 0.16
TrpLeu: 1.632 ± 0.31
TrpMet: 0.301 ± 0.122
TrpAsn: 0.73 ± 0.15
TrpPro: 0.902 ± 0.201
TrpGln: 0.902 ± 0.234
TrpArg: 0.859 ± 0.196
TrpSer: 1.074 ± 0.183
TrpThr: 1.074 ± 0.251
TrpVal: 2.061 ± 0.352
TrpTrp: 0.429 ± 0.107
TrpTyr: 0.601 ± 0.184
TrpXaa: 0.0 ± 0.0
Tyr
TyrAla: 2.62 ± 0.301
TyrCys: 0.429 ± 0.188
TyrAsp: 1.589 ± 0.339
TyrGlu: 1.933 ± 0.321
TyrPhe: 0.816 ± 0.208
TyrGly: 3.607 ± 0.443
TyrHis: 0.687 ± 0.194
TyrIle: 1.331 ± 0.211
TyrLys: 1.117 ± 0.242
TyrLeu: 2.577 ± 0.268
TyrMet: 0.515 ± 0.133
TyrAsn: 0.644 ± 0.162
TyrPro: 1.632 ± 0.273
TyrGln: 1.16 ± 0.208
TyrArg: 2.362 ± 0.326
TyrSer: 1.417 ± 0.22
TyrThr: 2.577 ± 0.336
TyrVal: 2.233 ± 0.342
TyrTrp: 0.558 ± 0.177
TyrTyr: 0.73 ± 0.22
TyrXaa: 0.0 ± 0.0
Xaa
XaaAla: 0.0 ± 0.0
XaaCys: 0.0 ± 0.0
XaaAsp: 0.0 ± 0.0
XaaGlu: 0.0 ± 0.0
XaaPhe: 0.0 ± 0.0
XaaGly: 0.0 ± 0.0
XaaHis: 0.0 ± 0.0
XaaIle: 0.0 ± 0.0
XaaLys: 0.0 ± 0.0
XaaLeu: 0.0 ± 0.0
XaaMet: 0.0 ± 0.0
XaaAsn: 0.0 ± 0.0
XaaPro: 0.0 ± 0.0
XaaGln: 0.0 ± 0.0
XaaArg: 0.0 ± 0.0
XaaSer: 0.0 ± 0.0
XaaThr: 0.0 ± 0.0
XaaVal: 0.0 ± 0.0
XaaTrp: 0.0 ± 0.0
XaaTyr: 0.0 ± 0.0
XaaXaa: 0.0 ± 0.0
Statistics based on 141 proteins (23286 amino acids)

Note: The error was estimated by bootstrapping (x100) at the protein level.
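A protein-level bootstrap of this kind can be sketched as below. The exact procedure used by the database is not described here, so several choices are assumptions: whole proteins are resampled with replacement, the frequency is recomputed for each of 100 replicates, and the error is taken as the sample standard deviation of those replicates. The function names and toy sequences are hypothetical.

```python
import random
import statistics


def dipeptide_permille(proteins, pair):
    """Frequency of one dipeptide, in per mille of all overlapping dipeptides."""
    count = 0
    total = 0
    for seq in proteins:
        for i in range(len(seq) - 1):
            total += 1
            if seq[i:i + 2] == pair:
                count += 1
    return 1000 * count / total if total else 0.0


def bootstrap_error(proteins, pair, n_replicates=100, seed=0):
    """Standard deviation of the frequency over protein-level resamples."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    estimates = []
    for _ in range(n_replicates):
        # Resample whole proteins with replacement, keeping the sample size.
        sample = rng.choices(proteins, k=len(proteins))
        estimates.append(dipeptide_permille(sample, pair))
    return statistics.stdev(estimates)


# Hypothetical toy sequences, not taken from the Bruin proteome.
toy_proteins = ["MAAGLA", "MGGAA", "MKLAAV", "MSTAG"]
value = dipeptide_permille(toy_proteins, "AA")
error = bootstrap_error(toy_proteins, "AA")
print(f"AA: {value:.3f} ± {error:.3f}")
```

Resampling proteins rather than individual residues keeps each sequence intact, so the spread reflects how much the estimate depends on which proteins happen to be in the proteome.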

The dipeptide statistics above (among other statistics for this proteome) can be downloaded from this CSV file
See this proteome in: uniprot_link
Proteome-pI is available under a Creative Commons Attribution-NoDerivs license; for more details see here

Reference: Kozlowski LP. Proteome-pI 2.0: Proteome Isoelectric Point Database Update. Nucleic Acids Res. 2021, doi: 10.1093/nar/gkab944
Contact: Lukasz P. Kozlowski