Amino acid dipeptide frequency for Mycobacterium phage Bongo

Apart from single amino acid frequencies, one can also calculate the so-called amino acid dipeptide frequency, that is, how often each ordered pair of adjacent residues occurs.

There are 400 possible dipeptides (441 if Xaa is counted as an additional, 21st amino acid representing all non-standard or unknown amino acids). Thus, if dipeptides occurred in proteins at random and with equal probability, each of the 400 would be expected at a frequency of 0.25% (2.5‰). This is not the case in nature, so for better visibility dipeptides that occur more often than expected are marked in red in the table, and underrepresented ones are marked in blue.
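As a rough illustration of the calculation described above, the following minimal sketch counts overlapping dipeptides within each protein and compares them with the uniform expectation. The function name `dipeptide_permille`, the one-letter alphabet, the mapping of every non-standard symbol to `X`, and the normalization by the total number of counted dipeptides are all assumptions made for this example; the database may normalize differently.

```python
from collections import Counter

STANDARD = "ACDEFGHIKLMNPQRSTVWY"  # 20 standard amino acids, one-letter codes


def dipeptide_permille(proteins):
    """Return the frequency (in per mille) of each overlapping dipeptide.

    Hypothetical helper: pairs are counted inside each protein only,
    never across the boundary between two proteins.
    """
    counts = Counter()
    for seq in proteins:
        for first, second in zip(seq, seq[1:]):
            # Assumption: any non-standard symbol is treated as X (Xaa).
            first = first if first in STANDARD else "X"
            second = second if second in STANDARD else "X"
            counts[first + second] += 1
    total = sum(counts.values())
    if total == 0:
        return {}
    return {pair: 1000.0 * n / total for pair, n in counts.items()}


# Uniform expectation over the 20 x 20 standard dipeptides, in per mille.
expected = 1000.0 / (len(STANDARD) ** 2)

# Toy sequences for illustration only, not data from this proteome.
freqs = dipeptide_permille(["MAAL", "AAK"])
for pair, value in sorted(freqs.items()):
    label = "over" if value > expected else "under"
    print(f"{pair}: {value:.1f} per mille ({label}represented)")
```

With such short toy sequences every observed pair is far above 2.5‰; the comparison only becomes meaningful on a whole proteome.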

All values are given in per mille (‰), so they need to be multiplied by 10⁻³ to obtain fractions.

For more information, see the sequence space article on Wikipedia.

Ala Cys Asp Glu Phe Gly His Ile Lys Leu Met Asn Pro Gln Arg Ser Thr Val Trp Tyr Xaa
Ala
AlaAla: 11.491 ± 1.267
AlaCys: 0.786 ± 0.188
AlaAsp: 6.728 ± 0.543
AlaGlu: 8.039 ± 0.701
AlaPhe: 2.621 ± 0.365
AlaGly: 6.947 ± 0.688
AlaHis: 2.228 ± 0.369
AlaIle: 5.199 ± 0.433
AlaLys: 5.418 ± 0.514
AlaLeu: 10.005 ± 0.988
AlaMet: 2.578 ± 0.323
AlaAsn: 3.626 ± 0.65
AlaPro: 4.282 ± 0.554
AlaGln: 4.325 ± 0.516
AlaArg: 6.029 ± 0.497
AlaSer: 4.806 ± 0.469
AlaThr: 5.767 ± 0.577
AlaVal: 7.34 ± 0.577
AlaTrp: 2.272 ± 0.303
AlaTyr: 2.578 ± 0.325
AlaXaa: 0.0 ± 0.0
Cys
CysAla: 0.918 ± 0.24
CysCys: 0.044 ± 0.045
CysAsp: 0.568 ± 0.189
CysGlu: 0.393 ± 0.131
CysPhe: 0.175 ± 0.091
CysGly: 1.049 ± 0.289
CysHis: 0.175 ± 0.096
CysIle: 0.481 ± 0.161
CysLys: 0.175 ± 0.089
CysLeu: 0.481 ± 0.145
CysMet: 0.262 ± 0.118
CysAsn: 0.393 ± 0.138
CysPro: 0.699 ± 0.208
CysGln: 0.175 ± 0.076
CysArg: 0.481 ± 0.195
CysSer: 0.524 ± 0.181
CysThr: 0.481 ± 0.124
CysVal: 0.83 ± 0.186
CysTrp: 0.306 ± 0.108
CysTyr: 0.35 ± 0.13
CysXaa: 0.0 ± 0.0
Asp
AspAla: 6.554 ± 0.514
AspCys: 0.743 ± 0.216
AspAsp: 5.199 ± 0.508
AspGlu: 4.675 ± 0.503
AspPhe: 2.01 ± 0.283
AspGly: 6.728 ± 0.464
AspHis: 1.617 ± 0.249
AspIle: 2.84 ± 0.378
AspLys: 2.359 ± 0.324
AspLeu: 5.68 ± 0.532
AspMet: 1.966 ± 0.29
AspAsn: 1.966 ± 0.338
AspPro: 4.675 ± 0.388
AspGln: 1.922 ± 0.281
AspArg: 4.456 ± 0.401
AspSer: 2.927 ± 0.445
AspThr: 3.277 ± 0.356
AspVal: 3.932 ± 0.459
AspTrp: 1.485 ± 0.248
AspTyr: 2.228 ± 0.306
AspXaa: 0.0 ± 0.0
Glu
GluAla: 6.859 ± 0.67
GluCys: 0.524 ± 0.16
GluAsp: 4.02 ± 0.495
GluGlu: 3.539 ± 0.469
GluPhe: 2.316 ± 0.371
GluGly: 5.024 ± 0.487
GluHis: 1.442 ± 0.249
GluIle: 3.321 ± 0.462
GluLys: 2.84 ± 0.336
GluLeu: 5.636 ± 0.509
GluMet: 2.185 ± 0.344
GluAsn: 2.053 ± 0.321
GluPro: 2.141 ± 0.312
GluGln: 2.709 ± 0.351
GluArg: 4.675 ± 0.409
GluSer: 2.753 ± 0.326
GluThr: 3.102 ± 0.391
GluVal: 4.762 ± 0.525
GluTrp: 1.879 ± 0.305
GluTyr: 2.053 ± 0.328
GluXaa: 0.0 ± 0.0
Phe
PheAla: 2.665 ± 0.337
PheCys: 0.218 ± 0.112
PheAsp: 3.146 ± 0.362
PheGlu: 2.01 ± 0.24
PhePhe: 1.005 ± 0.213
PheGly: 2.359 ± 0.278
PheHis: 0.568 ± 0.194
PheIle: 1.573 ± 0.279
PheLys: 1.573 ± 0.29
PheLeu: 2.316 ± 0.336
PheMet: 0.612 ± 0.164
PheAsn: 1.092 ± 0.19
PhePro: 1.529 ± 0.265
PheGln: 1.354 ± 0.299
PheArg: 1.66 ± 0.227
PheSer: 1.66 ± 0.276
PheThr: 2.01 ± 0.274
PheVal: 2.49 ± 0.32
PheTrp: 0.437 ± 0.144
PheTyr: 0.743 ± 0.17
PheXaa: 0.0 ± 0.0
Gly
GlyAla: 7.427 ± 0.899
GlyCys: 0.699 ± 0.213
GlyAsp: 5.287 ± 0.561
GlyGlu: 4.85 ± 0.412
GlyPhe: 2.665 ± 0.286
GlyGly: 8.651 ± 1.14
GlyHis: 1.005 ± 0.275
GlyIle: 3.976 ± 0.573
GlyLys: 3.845 ± 0.366
GlyLeu: 6.117 ± 0.726
GlyMet: 1.748 ± 0.294
GlyAsn: 3.364 ± 0.403
GlyPro: 3.321 ± 0.637
GlyGln: 3.015 ± 0.45
GlyArg: 4.194 ± 0.433
GlySer: 5.592 ± 0.609
GlyThr: 5.811 ± 0.51
GlyVal: 6.204 ± 0.512
GlyTrp: 2.49 ± 0.343
GlyTyr: 2.971 ± 0.356
GlyXaa: 0.0 ± 0.0
His
HisAla: 1.442 ± 0.307
HisCys: 0.306 ± 0.117
HisAsp: 1.354 ± 0.281
HisGlu: 1.791 ± 0.31
HisPhe: 0.743 ± 0.17
HisGly: 1.835 ± 0.292
HisHis: 0.568 ± 0.174
HisIle: 1.267 ± 0.232
HisLys: 0.83 ± 0.175
HisLeu: 1.922 ± 0.343
HisMet: 0.393 ± 0.142
HisAsn: 0.83 ± 0.168
HisPro: 0.961 ± 0.284
HisGln: 0.393 ± 0.173
HisArg: 1.66 ± 0.361
HisSer: 1.005 ± 0.27
HisThr: 1.617 ± 0.287
HisVal: 1.311 ± 0.265
HisTrp: 0.393 ± 0.131
HisTyr: 0.568 ± 0.172
HisXaa: 0.0 ± 0.0
Ile
IleAla: 4.544 ± 0.414
IleCys: 0.437 ± 0.137
IleAsp: 4.151 ± 0.356
IleGlu: 3.714 ± 0.414
IlePhe: 1.049 ± 0.22
IleGly: 4.063 ± 0.428
IleHis: 0.961 ± 0.254
IleIle: 2.053 ± 0.31
IleLys: 2.359 ± 0.351
IleLeu: 3.845 ± 0.386
IleMet: 1.136 ± 0.224
IleAsn: 1.748 ± 0.24
IlePro: 3.364 ± 0.455
IleGln: 1.442 ± 0.257
IleArg: 3.67 ± 0.38
IleSer: 1.66 ± 0.237
IleThr: 3.408 ± 0.36
IleVal: 2.665 ± 0.343
IleTrp: 0.83 ± 0.168
IleTyr: 1.354 ± 0.304
IleXaa: 0.0 ± 0.0
Lys
LysAla: 5.199 ± 0.544
LysCys: 0.262 ± 0.111
LysAsp: 2.447 ± 0.339
LysGlu: 1.835 ± 0.276
LysPhe: 1.442 ± 0.265
LysGly: 3.495 ± 0.426
LysHis: 0.918 ± 0.189
LysIle: 2.053 ± 0.327
LysLys: 2.01 ± 0.331
LysLeu: 3.583 ± 0.384
LysMet: 1.136 ± 0.219
LysAsn: 1.704 ± 0.309
LysPro: 2.185 ± 0.35
LysGln: 1.791 ± 0.245
LysArg: 2.796 ± 0.318
LysSer: 2.49 ± 0.273
LysThr: 2.84 ± 0.336
LysVal: 3.277 ± 0.387
LysTrp: 1.573 ± 0.24
LysTyr: 1.66 ± 0.272
LysXaa: 0.0 ± 0.0
Leu
LeuAla: 9.437 ± 0.573
LeuCys: 0.612 ± 0.171
LeuAsp: 5.942 ± 0.515
LeuGlu: 5.33 ± 0.407
LeuPhe: 2.49 ± 0.342
LeuGly: 6.379 ± 0.623
LeuHis: 1.398 ± 0.306
LeuIle: 3.801 ± 0.392
LeuLys: 3.277 ± 0.39
LeuLeu: 5.374 ± 0.514
LeuMet: 1.879 ± 0.313
LeuAsn: 2.971 ± 0.432
LeuPro: 4.369 ± 0.513
LeuGln: 2.621 ± 0.335
LeuArg: 5.068 ± 0.498
LeuSer: 4.719 ± 0.439
LeuThr: 4.937 ± 0.55
LeuVal: 5.68 ± 0.63
LeuTrp: 1.442 ± 0.249
LeuTyr: 2.49 ± 0.376
LeuXaa: 0.0 ± 0.0
Met
MetAla: 3.583 ± 0.338
MetCys: 0.218 ± 0.132
MetAsp: 1.136 ± 0.232
MetGlu: 1.223 ± 0.206
MetPhe: 0.655 ± 0.194
MetGly: 1.573 ± 0.288
MetHis: 0.481 ± 0.161
MetIle: 0.918 ± 0.207
MetLys: 0.874 ± 0.165
MetLeu: 1.704 ± 0.281
MetMet: 0.306 ± 0.122
MetAsn: 0.699 ± 0.146
MetPro: 1.66 ± 0.259
MetGln: 1.005 ± 0.25
MetArg: 1.748 ± 0.288
MetSer: 3.189 ± 0.406
MetThr: 1.835 ± 0.288
MetVal: 1.18 ± 0.201
MetTrp: 0.524 ± 0.136
MetTyr: 0.699 ± 0.195
MetXaa: 0.0 ± 0.0
Asn
AsnAla: 3.67 ± 0.546
AsnCys: 0.393 ± 0.14
AsnAsp: 1.791 ± 0.339
AsnGlu: 2.272 ± 0.374
AsnPhe: 1.136 ± 0.228
AsnGly: 3.495 ± 0.409
AsnHis: 1.005 ± 0.203
AsnIle: 1.311 ± 0.214
AsnLys: 1.442 ± 0.202
AsnLeu: 2.49 ± 0.311
AsnMet: 0.961 ± 0.181
AsnAsn: 1.092 ± 0.186
AsnPro: 2.621 ± 0.318
AsnGln: 1.354 ± 0.242
AsnArg: 2.665 ± 0.324
AsnSer: 2.053 ± 0.241
AsnThr: 2.053 ± 0.315
AsnVal: 2.185 ± 0.289
AsnTrp: 0.655 ± 0.161
AsnTyr: 0.874 ± 0.188
AsnXaa: 0.0 ± 0.0
Pro
ProAla: 5.505 ± 0.732
ProCys: 0.393 ± 0.129
ProAsp: 3.714 ± 0.461
ProGlu: 3.495 ± 0.429
ProPhe: 1.748 ± 0.219
ProGly: 4.456 ± 0.76
ProHis: 1.049 ± 0.218
ProIle: 1.748 ± 0.311
ProLys: 2.097 ± 0.323
ProLeu: 3.932 ± 0.431
ProMet: 1.704 ± 0.281
ProAsn: 1.748 ± 0.303
ProPro: 2.185 ± 0.415
ProGln: 1.791 ± 0.314
ProArg: 2.578 ± 0.357
ProSer: 2.534 ± 0.322
ProThr: 2.884 ± 0.342
ProVal: 3.801 ± 0.388
ProTrp: 0.961 ± 0.222
ProTyr: 1.398 ± 0.229
ProXaa: 0.0 ± 0.0
Gln
GlnAla: 4.063 ± 0.52
GlnCys: 0.612 ± 0.194
GlnAsp: 1.835 ± 0.241
GlnGlu: 1.791 ± 0.258
GlnPhe: 0.874 ± 0.195
GlnGly: 2.796 ± 0.4
GlnHis: 0.83 ± 0.221
GlnIle: 2.796 ± 0.332
GlnLys: 1.442 ± 0.224
GlnLeu: 2.84 ± 0.388
GlnMet: 1.049 ± 0.229
GlnAsn: 1.005 ± 0.21
GlnPro: 1.529 ± 0.225
GlnGln: 1.922 ± 0.275
GlnArg: 2.665 ± 0.354
GlnSer: 1.573 ± 0.264
GlnThr: 1.922 ± 0.301
GlnVal: 2.665 ± 0.343
GlnTrp: 1.136 ± 0.204
GlnTyr: 0.918 ± 0.188
GlnXaa: 0.0 ± 0.0
Arg
ArgAla: 6.466 ± 0.462
ArgCys: 0.568 ± 0.157
ArgAsp: 3.495 ± 0.364
ArgGlu: 4.063 ± 0.404
ArgPhe: 2.403 ± 0.313
ArgGly: 4.719 ± 0.506
ArgHis: 1.18 ± 0.255
ArgIle: 3.801 ± 0.412
ArgLys: 3.757 ± 0.413
ArgLeu: 4.456 ± 0.401
ArgMet: 2.272 ± 0.35
ArgAsn: 2.665 ± 0.298
ArgPro: 2.884 ± 0.387
ArgGln: 2.971 ± 0.39
ArgArg: 3.801 ± 0.397
ArgSer: 2.534 ± 0.281
ArgThr: 3.583 ± 0.419
ArgVal: 4.413 ± 0.436
ArgTrp: 1.354 ± 0.242
ArgTyr: 1.879 ± 0.265
ArgXaa: 0.0 ± 0.0
Ser
SerAla: 5.199 ± 0.468
SerCys: 0.437 ± 0.143
SerAsp: 3.583 ± 0.38
SerGlu: 2.228 ± 0.366
SerPhe: 2.01 ± 0.326
SerGly: 4.893 ± 0.494
SerHis: 1.092 ± 0.245
SerIle: 2.403 ± 0.35
SerLys: 1.922 ± 0.246
SerLeu: 4.02 ± 0.327
SerMet: 0.961 ± 0.207
SerAsn: 2.534 ± 0.349
SerPro: 2.053 ± 0.287
SerGln: 1.18 ± 0.254
SerArg: 3.015 ± 0.358
SerSer: 3.67 ± 0.459
SerThr: 4.238 ± 0.459
SerVal: 4.194 ± 0.459
SerTrp: 1.529 ± 0.292
SerTyr: 2.01 ± 0.312
SerXaa: 0.0 ± 0.0
Thr
ThrAla: 6.117 ± 0.578
ThrCys: 0.35 ± 0.142
ThrAsp: 3.801 ± 0.462
ThrGlu: 3.889 ± 0.413
ThrPhe: 1.966 ± 0.277
ThrGly: 5.287 ± 0.514
ThrHis: 1.223 ± 0.23
ThrIle: 3.189 ± 0.473
ThrLys: 2.534 ± 0.331
ThrLeu: 5.243 ± 0.444
ThrMet: 1.092 ± 0.215
ThrAsn: 1.879 ± 0.337
ThrPro: 3.801 ± 0.464
ThrGln: 1.966 ± 0.306
ThrArg: 3.452 ± 0.401
ThrSer: 2.621 ± 0.377
ThrThr: 3.757 ± 0.417
ThrVal: 5.243 ± 0.477
ThrTrp: 1.267 ± 0.222
ThrTyr: 2.228 ± 0.325
ThrXaa: 0.0 ± 0.0
Val
ValAla: 6.379 ± 0.526
ValCys: 0.699 ± 0.209
ValAsp: 5.374 ± 0.574
ValGlu: 4.762 ± 0.505
ValPhe: 1.966 ± 0.265
ValGly: 5.374 ± 0.501
ValHis: 1.922 ± 0.327
ValIle: 3.67 ± 0.435
ValLys: 3.452 ± 0.362
ValLeu: 5.724 ± 0.513
ValMet: 1.573 ± 0.291
ValAsn: 2.578 ± 0.27
ValPro: 3.233 ± 0.399
ValGln: 2.621 ± 0.399
ValArg: 4.238 ± 0.463
ValSer: 4.5 ± 0.498
ValThr: 4.5 ± 0.582
ValVal: 4.85 ± 0.534
ValTrp: 1.311 ± 0.272
ValTyr: 1.704 ± 0.277
ValXaa: 0.0 ± 0.0
Trp
TrpAla: 2.49 ± 0.343
TrpCys: 0.306 ± 0.156
TrpAsp: 1.398 ± 0.248
TrpGlu: 2.053 ± 0.278
TrpPhe: 0.743 ± 0.199
TrpGly: 1.354 ± 0.221
TrpHis: 0.699 ± 0.219
TrpIle: 0.961 ± 0.184
TrpLys: 0.961 ± 0.178
TrpLeu: 2.534 ± 0.337
TrpMet: 0.612 ± 0.177
TrpAsn: 0.655 ± 0.161
TrpPro: 0.743 ± 0.159
TrpGln: 0.655 ± 0.157
TrpArg: 1.704 ± 0.27
TrpSer: 0.961 ± 0.255
TrpThr: 1.18 ± 0.255
TrpVal: 1.617 ± 0.234
TrpTrp: 0.786 ± 0.214
TrpTyr: 0.655 ± 0.19
TrpXaa: 0.0 ± 0.0
Tyr
TyrAla: 3.058 ± 0.379
TyrCys: 0.35 ± 0.105
TyrAsp: 2.272 ± 0.328
TyrGlu: 1.879 ± 0.32
TyrPhe: 1.005 ± 0.205
TyrGly: 2.753 ± 0.329
TyrHis: 0.83 ± 0.206
TyrIle: 1.092 ± 0.182
TyrLys: 1.529 ± 0.271
TyrLeu: 2.316 ± 0.287
TyrMet: 0.743 ± 0.188
TyrAsn: 0.918 ± 0.214
TyrPro: 1.485 ± 0.318
TyrGln: 1.049 ± 0.247
TyrArg: 2.709 ± 0.355
TyrSer: 1.442 ± 0.244
TyrThr: 1.704 ± 0.315
TyrVal: 1.791 ± 0.248
TyrTrp: 0.437 ± 0.119
TyrTyr: 1.18 ± 0.224
TyrXaa: 0.0 ± 0.0
Xaa
XaaAla: 0.0 ± 0.0
XaaCys: 0.0 ± 0.0
XaaAsp: 0.0 ± 0.0
XaaGlu: 0.0 ± 0.0
XaaPhe: 0.0 ± 0.0
XaaGly: 0.0 ± 0.0
XaaHis: 0.0 ± 0.0
XaaIle: 0.0 ± 0.0
XaaLys: 0.0 ± 0.0
XaaLeu: 0.0 ± 0.0
XaaMet: 0.0 ± 0.0
XaaAsn: 0.0 ± 0.0
XaaPro: 0.0 ± 0.0
XaaGln: 0.0 ± 0.0
XaaArg: 0.0 ± 0.0
XaaSer: 0.0 ± 0.0
XaaThr: 0.0 ± 0.0
XaaVal: 0.0 ± 0.0
XaaTrp: 0.0 ± 0.0
XaaTyr: 0.0 ± 0.0
XaaXaa: 0.0 ± 0.0
Statistics based on 132 proteins (22889 amino acids)
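
The table entries can also be read programmatically. The sketch below parses a line of the form "AlaAla: 11.491 ± 1.267" (any text before the dipeptide name is ignored) and labels the mean as over- or underrepresented. The helper names `parse_entry` and `classify` are hypothetical, and the 2.5‰ threshold assumes that "expected" means the uniform expectation over 400 dipeptides; the exact rule used for the colouring is not stated here.

```python
import re

# Assumption: "expected" is the uniform frequency over 400 dipeptides (2.5 per mille).
EXPECTED_PERMILLE = 2.5

# Two three-letter residue codes, a colon, then "mean ± error".
ENTRY = re.compile(
    r"([A-Z][a-z]{2})([A-Z][a-z]{2}): (\d+(?:\.\d+)?) ± (\d+(?:\.\d+)?)"
)


def parse_entry(line):
    """Return (dipeptide, mean, error) for a table line, or None if it is not an entry."""
    match = ENTRY.search(line)
    if match is None:
        return None
    first, second, mean, error = match.groups()
    return first + second, float(mean), float(error)


def classify(mean, expected=EXPECTED_PERMILLE):
    """Label a mean frequency relative to the assumed uniform expectation."""
    if mean > expected:
        return "over"
    if mean < expected:
        return "under"
    return "as expected"


# Three lines copied from the table above.
for line in ["AlaAla: 11.491 ± 1.267", "CysCys: 0.044 ± 0.045", "ValSer: 4.5 ± 0.498"]:
    name, mean, error = parse_entry(line)
    print(name, mean, error, classify(mean))
```

Row labels such as "Ala" and the column header line contain no "mean ± error" part, so `parse_entry` returns None for them.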

Note: The error was estimated by bootstrapping (100 replicates) at the protein level.
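
One way such a protein-level bootstrap could look is sketched below: whole proteins are resampled with replacement, the dipeptide frequency is recomputed for each replicate, and the spread of the replicates is reported. This is an illustration under stated assumptions, not the database's actual procedure: the function names, the use of the sample standard deviation as the error, the fixed seed, and the normalization are all choices made for this example.

```python
import random
import statistics


def pair_permille(proteins, pair):
    """Frequency (per mille) of one dipeptide among all overlapping pairs."""
    total = 0
    hits = 0
    for seq in proteins:
        for first, second in zip(seq, seq[1:]):
            total += 1
            if first + second == pair:
                hits += 1
    return 1000.0 * hits / total if total else 0.0


def bootstrap_error(proteins, pair, replicates=100, seed=0):
    """Standard deviation of the frequency over protein-level bootstrap replicates."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    estimates = []
    for _ in range(replicates):
        # Resample whole proteins (not residues) with replacement.
        sample = rng.choices(proteins, k=len(proteins))
        estimates.append(pair_permille(sample, pair))
    return statistics.stdev(estimates)


# Toy sequences for illustration only, not data from this proteome.
toy = ["MAAL", "AAK", "GGGA", "MKAA"]
print(pair_permille(toy, "AA"), bootstrap_error(toy, "AA"))
```

Resampling whole proteins rather than individual residues keeps each sequence intact, so the estimate reflects variation between proteins.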

The dipeptide statistics above (along with other statistics for this proteome) can be downloaded from this CSV file
See this proteome in: uniprot_link
Proteome-pI is available under the Creative Commons Attribution-NoDerivs license; for more details see here

Reference: Kozlowski LP. Proteome-pI 2.0: Proteome Isoelectric Point Database Update. Nucleic Acids Res. 2021, doi: 10.1093/nar/gkab944
Contact: Lukasz P. Kozlowski