Amino acid dipeptide frequency for Mycobacterium phage MilleniumForce

Apart from single amino acid frequencies, one can also calculate the so-called amino acid dipeptide frequency, that is, how often each ordered pair of adjacent amino acids occurs.

There are 400 possible dipeptides (441 if Xaa is counted as an additional 21st amino acid representing all non-standard or unknown amino acids). Thus, if all 400 dipeptides were equally likely in proteins, each would be present at a frequency of 0.25% (2.5‰). As this is not the case in nature, for better visibility dipeptides that are more frequent than expected are marked in red in the table, and those that are underrepresented are marked in blue.
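The comparison against the uniform expectation can be sketched in a few lines of Python. This is a minimal illustration, not the database's own code: the function names and the toy sequences are hypothetical, and normalising by the total number of dipeptides counted is an assumption, since the page does not state the exact denominator it uses.

```python
from collections import Counter

# Uniform expectation for 400 equally likely dipeptides: 0.25 % = 2.5 per mille.
UNIFORM_PERMILLE = 1000 / 400


def dipeptide_permille(proteins):
    """Return {dipeptide: frequency in per mille} for a list of sequences.

    Assumption: frequencies are normalised by the total number of
    dipeptides counted; the database may normalise differently.
    """
    counts = Counter()
    for seq in proteins:
        # Overlapping adjacent pairs; a pair never spans two proteins.
        counts.update(seq[i:i + 2] for i in range(len(seq) - 1))
    total = sum(counts.values())
    return {pair: 1000 * n / total for pair, n in counts.items()}


def classify(freq_permille, expected=UNIFORM_PERMILLE):
    """Label a frequency relative to the uniform expectation."""
    if freq_permille > expected:
        return "over"
    if freq_permille < expected:
        return "under"
    return "as expected"


# Toy one-letter sequences (hypothetical, not taken from the proteome).
toy_freqs = dipeptide_permille(["AAGA", "GAA"])

# Two values from the table: AlaAla (14.511) and AlaCys (0.863).
labels = {"AlaAla": classify(14.511), "AlaCys": classify(0.863)}
```

With this rule, AlaAla at 14.511‰ falls above the 2.5‰ expectation and AlaCys at 0.863‰ falls below it.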

All values are given in per mille (‰), so they need to be multiplied by 10^-3 to obtain fractions.
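As a small worked example of the unit conversion, the sketch below turns a per mille value from the table into a fraction and a percentage. The helper names are hypothetical and only illustrate the arithmetic.

```python
def permille_to_fraction(value):
    """Convert per mille to a plain fraction (multiply by 10^-3)."""
    return value * 1e-3


def permille_to_percent(value):
    """Convert per mille to percent (divide by 10)."""
    return value / 10


ala_ala = 14.511  # AlaAla frequency from the table, in per mille
ala_ala_fraction = permille_to_fraction(ala_ala)  # about 0.014511
ala_ala_percent = permille_to_percent(ala_ala)    # about 1.4511 %
```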

For more information, see the sequence space article on Wikipedia.

Ala Cys Asp Glu Phe Gly His Ile Lys Leu Met Asn Pro Gln Arg Ser Thr Val Trp Tyr Xaa
Ala
AlaAla: 14.511 ± 1.611
AlaCys: 0.863 ± 0.226
AlaAsp: 7.336 ± 0.709
AlaGlu: 7.876 ± 0.741
AlaPhe: 2.481 ± 0.457
AlaGly: 9.602 ± 1.127
AlaHis: 2.643 ± 0.35
AlaIle: 4.262 ± 0.593
AlaLys: 4.1 ± 0.463
AlaLeu: 7.822 ± 0.879
AlaMet: 2.643 ± 0.404
AlaAsn: 2.05 ± 0.291
AlaPro: 5.772 ± 0.654
AlaGln: 3.506 ± 0.459
AlaArg: 7.606 ± 0.786
AlaSer: 4.909 ± 0.614
AlaThr: 6.042 ± 0.505
AlaVal: 7.606 ± 0.714
AlaTrp: 2.751 ± 0.441
AlaTyr: 2.427 ± 0.301
AlaXaa: 0.0 ± 0.0
Cys
CysAla: 1.349 ± 0.319
CysCys: 0.108 ± 0.082
CysAsp: 1.187 ± 0.307
CysGlu: 0.755 ± 0.216
CysPhe: 0.108 ± 0.065
CysGly: 1.672 ± 0.36
CysHis: 0.162 ± 0.109
CysIle: 0.108 ± 0.075
CysLys: 0.539 ± 0.219
CysLeu: 0.755 ± 0.249
CysMet: 0.27 ± 0.117
CysAsn: 0.539 ± 0.186
CysPro: 1.025 ± 0.244
CysGln: 0.216 ± 0.097
CysArg: 1.349 ± 0.349
CysSer: 0.701 ± 0.187
CysThr: 0.917 ± 0.233
CysVal: 0.809 ± 0.213
CysTrp: 0.324 ± 0.119
CysTyr: 0.27 ± 0.118
CysXaa: 0.0 ± 0.0
Asp
AspAla: 6.905 ± 0.578
AspCys: 0.971 ± 0.201
AspAsp: 3.992 ± 0.521
AspGlu: 3.291 ± 0.388
AspPhe: 1.942 ± 0.238
AspGly: 6.203 ± 0.628
AspHis: 1.403 ± 0.258
AspIle: 2.266 ± 0.357
AspLys: 1.51 ± 0.279
AspLeu: 6.096 ± 0.512
AspMet: 1.025 ± 0.271
AspAsn: 1.403 ± 0.3
AspPro: 4.963 ± 0.551
AspGln: 2.212 ± 0.317
AspArg: 5.232 ± 0.616
AspSer: 3.452 ± 0.556
AspThr: 4.1 ± 0.472
AspVal: 4.423 ± 0.621
AspTrp: 1.564 ± 0.292
AspTyr: 2.104 ± 0.339
AspXaa: 0.0 ± 0.0
Glu
GluAla: 5.286 ± 0.569
GluCys: 0.917 ± 0.291
GluAsp: 2.913 ± 0.348
GluGlu: 2.751 ± 0.546
GluPhe: 2.212 ± 0.378
GluGly: 3.183 ± 0.354
GluHis: 1.349 ± 0.366
GluIle: 2.589 ± 0.378
GluLys: 2.05 ± 0.312
GluLeu: 5.718 ± 0.691
GluMet: 1.564 ± 0.379
GluAsn: 1.996 ± 0.274
GluPro: 2.805 ± 0.444
GluGln: 2.751 ± 0.416
GluArg: 5.826 ± 0.645
GluSer: 2.967 ± 0.47
GluThr: 4.208 ± 0.616
GluVal: 4.208 ± 0.542
GluTrp: 1.133 ± 0.238
GluTyr: 1.834 ± 0.322
GluXaa: 0.0 ± 0.0
Phe
PheAla: 3.183 ± 0.44
PheCys: 0.324 ± 0.125
PheAsp: 2.427 ± 0.36
PheGlu: 1.564 ± 0.325
PhePhe: 0.809 ± 0.254
PheGly: 2.967 ± 0.603
PheHis: 0.378 ± 0.13
PheIle: 1.403 ± 0.378
PheLys: 0.809 ± 0.219
PheLeu: 1.672 ± 0.251
PheMet: 0.755 ± 0.253
PheAsn: 1.079 ± 0.3
PhePro: 1.672 ± 0.331
PheGln: 0.863 ± 0.301
PheArg: 1.456 ± 0.265
PheSer: 1.403 ± 0.275
PheThr: 2.32 ± 0.406
PheVal: 1.996 ± 0.27
PheTrp: 0.647 ± 0.143
PheTyr: 0.971 ± 0.26
PheXaa: 0.0 ± 0.0
Gly
GlyAla: 9.332 ± 1.207
GlyCys: 1.187 ± 0.306
GlyAsp: 5.556 ± 0.538
GlyGlu: 3.938 ± 0.593
GlyPhe: 2.697 ± 0.464
GlyGly: 9.548 ± 1.581
GlyHis: 1.456 ± 0.291
GlyIle: 4.531 ± 0.534
GlyLys: 2.535 ± 0.413
GlyLeu: 5.934 ± 0.559
GlyMet: 2.32 ± 0.415
GlyAsn: 3.129 ± 0.429
GlyPro: 4.693 ± 0.572
GlyGln: 2.266 ± 0.568
GlyArg: 4.855 ± 0.578
GlySer: 5.988 ± 0.652
GlyThr: 6.203 ± 0.601
GlyVal: 5.772 ± 0.583
GlyTrp: 2.374 ± 0.365
GlyTyr: 2.212 ± 0.383
GlyXaa: 0.0 ± 0.0
His
HisAla: 2.158 ± 0.363
HisCys: 0.216 ± 0.145
HisAsp: 0.971 ± 0.224
HisGlu: 1.241 ± 0.314
HisPhe: 0.378 ± 0.117
HisGly: 1.888 ± 0.309
HisHis: 1.241 ± 0.333
HisIle: 1.349 ± 0.324
HisLys: 0.755 ± 0.237
HisLeu: 1.241 ± 0.278
HisMet: 0.593 ± 0.181
HisAsn: 0.863 ± 0.214
HisPro: 1.456 ± 0.298
HisGln: 0.863 ± 0.196
HisArg: 2.158 ± 0.347
HisSer: 0.863 ± 0.185
HisThr: 1.403 ± 0.313
HisVal: 1.456 ± 0.298
HisTrp: 0.539 ± 0.182
HisTyr: 0.701 ± 0.168
HisXaa: 0.0 ± 0.0
Ile
IleAla: 5.286 ± 0.558
IleCys: 0.809 ± 0.237
IleAsp: 3.722 ± 0.436
IleGlu: 3.722 ± 0.374
IlePhe: 0.863 ± 0.233
IleGly: 3.614 ± 0.433
IleHis: 1.51 ± 0.281
IleIle: 1.349 ± 0.248
IleLys: 1.187 ± 0.243
IleLeu: 1.942 ± 0.406
IleMet: 0.593 ± 0.16
IleAsn: 1.78 ± 0.299
IlePro: 2.859 ± 0.279
IleGln: 1.241 ± 0.243
IleArg: 2.913 ± 0.465
IleSer: 1.996 ± 0.365
IleThr: 3.56 ± 0.407
IleVal: 2.805 ± 0.375
IleTrp: 0.971 ± 0.245
IleTyr: 0.809 ± 0.246
IleXaa: 0.0 ± 0.0
Lys
LysAla: 3.614 ± 0.53
LysCys: 0.539 ± 0.193
LysAsp: 1.564 ± 0.302
LysGlu: 1.079 ± 0.248
LysPhe: 1.349 ± 0.242
LysGly: 2.374 ± 0.401
LysHis: 1.025 ± 0.227
LysIle: 0.971 ± 0.255
LysLys: 0.971 ± 0.282
LysLeu: 3.183 ± 0.582
LysMet: 0.755 ± 0.206
LysAsn: 1.133 ± 0.224
LysPro: 2.212 ± 0.292
LysGln: 1.618 ± 0.306
LysArg: 2.212 ± 0.271
LysSer: 1.726 ± 0.299
LysThr: 2.32 ± 0.303
LysVal: 2.158 ± 0.357
LysTrp: 0.485 ± 0.152
LysTyr: 1.187 ± 0.277
LysXaa: 0.0 ± 0.0
Leu
LeuAla: 7.984 ± 0.729
LeuCys: 0.917 ± 0.269
LeuAsp: 5.017 ± 0.592
LeuGlu: 3.722 ± 0.524
LeuPhe: 2.374 ± 0.315
LeuGly: 5.556 ± 0.561
LeuHis: 0.863 ± 0.208
LeuIle: 3.075 ± 0.413
LeuLys: 1.888 ± 0.324
LeuLeu: 4.423 ± 0.514
LeuMet: 1.403 ± 0.308
LeuAsn: 2.643 ± 0.368
LeuPro: 5.502 ± 0.697
LeuGln: 2.805 ± 0.51
LeuArg: 5.556 ± 0.612
LeuSer: 5.394 ± 0.569
LeuThr: 6.042 ± 0.614
LeuVal: 5.286 ± 0.541
LeuTrp: 1.349 ± 0.263
LeuTyr: 1.942 ± 0.355
LeuXaa: 0.0 ± 0.0
Met
MetAla: 2.212 ± 0.354
MetCys: 0.162 ± 0.099
MetAsp: 1.295 ± 0.263
MetGlu: 1.187 ± 0.232
MetPhe: 0.863 ± 0.21
MetGly: 1.834 ± 0.302
MetHis: 0.162 ± 0.092
MetIle: 0.863 ± 0.199
MetLys: 0.917 ± 0.228
MetLeu: 2.104 ± 0.256
MetMet: 0.701 ± 0.235
MetAsn: 0.917 ± 0.211
MetPro: 1.403 ± 0.293
MetGln: 0.432 ± 0.14
MetArg: 1.295 ± 0.296
MetSer: 2.643 ± 0.349
MetThr: 2.266 ± 0.323
MetVal: 1.403 ± 0.344
MetTrp: 0.324 ± 0.119
MetTyr: 0.324 ± 0.122
MetXaa: 0.0 ± 0.0
Asn
AsnAla: 3.129 ± 0.385
AsnCys: 0.216 ± 0.108
AsnAsp: 1.942 ± 0.279
AsnGlu: 1.672 ± 0.303
AsnPhe: 0.863 ± 0.285
AsnGly: 4.154 ± 0.516
AsnHis: 1.187 ± 0.252
AsnIle: 1.672 ± 0.432
AsnLys: 1.133 ± 0.271
AsnLeu: 2.481 ± 0.361
AsnMet: 0.378 ± 0.141
AsnAsn: 1.78 ± 0.318
AsnPro: 2.481 ± 0.328
AsnGln: 1.187 ± 0.304
AsnArg: 2.266 ± 0.392
AsnSer: 1.349 ± 0.254
AsnThr: 2.374 ± 0.378
AsnVal: 1.834 ± 0.347
AsnTrp: 0.755 ± 0.157
AsnTyr: 0.647 ± 0.164
AsnXaa: 0.0 ± 0.0
Pro
ProAla: 5.232 ± 0.53
ProCys: 0.809 ± 0.234
ProAsp: 4.747 ± 0.468
ProGlu: 4.423 ± 0.473
ProPhe: 1.51 ± 0.315
ProGly: 6.365 ± 0.68
ProHis: 1.618 ± 0.296
ProIle: 2.374 ± 0.299
ProLys: 2.05 ± 0.375
ProLeu: 4.747 ± 0.478
ProMet: 1.564 ± 0.352
ProAsn: 2.427 ± 0.312
ProPro: 3.722 ± 0.523
ProGln: 2.05 ± 0.37
ProArg: 3.668 ± 0.496
ProSer: 3.183 ± 0.457
ProThr: 3.452 ± 0.5
ProVal: 5.071 ± 0.549
ProTrp: 1.025 ± 0.223
ProTyr: 1.564 ± 0.287
ProXaa: 0.0 ± 0.0
Gln
GlnAla: 4.639 ± 0.519
GlnCys: 0.162 ± 0.089
GlnAsp: 1.618 ± 0.282
GlnGlu: 1.672 ± 0.292
GlnPhe: 1.079 ± 0.238
GlnGly: 2.427 ± 0.52
GlnHis: 0.863 ± 0.245
GlnIle: 1.834 ± 0.288
GlnLys: 1.025 ± 0.189
GlnLeu: 2.697 ± 0.393
GlnMet: 0.809 ± 0.206
GlnAsn: 1.133 ± 0.263
GlnPro: 2.481 ± 0.395
GlnGln: 1.079 ± 0.226
GlnArg: 2.104 ± 0.383
GlnSer: 2.535 ± 0.424
GlnThr: 1.456 ± 0.316
GlnVal: 2.697 ± 0.407
GlnTrp: 0.539 ± 0.136
GlnTyr: 0.863 ± 0.261
GlnXaa: 0.0 ± 0.0
Arg
ArgAla: 6.797 ± 0.537
ArgCys: 1.78 ± 0.416
ArgAsp: 4.262 ± 0.642
ArgGlu: 5.232 ± 0.631
ArgPhe: 2.104 ± 0.412
ArgGly: 4.1 ± 0.464
ArgHis: 1.564 ± 0.276
ArgIle: 3.776 ± 0.43
ArgLys: 2.266 ± 0.401
ArgLeu: 4.963 ± 0.578
ArgMet: 2.643 ± 0.471
ArgAsn: 2.589 ± 0.38
ArgPro: 3.776 ± 0.487
ArgGln: 2.32 ± 0.384
ArgArg: 5.988 ± 0.889
ArgSer: 3.884 ± 0.399
ArgThr: 3.884 ± 0.626
ArgVal: 5.394 ± 0.549
ArgTrp: 1.726 ± 0.32
ArgTyr: 2.05 ± 0.304
ArgXaa: 0.0 ± 0.0
Ser
SerAla: 5.664 ± 0.82
SerCys: 0.539 ± 0.179
SerAsp: 4.046 ± 0.467
SerGlu: 2.913 ± 0.438
SerPhe: 2.158 ± 0.502
SerGly: 6.257 ± 0.632
SerHis: 0.971 ± 0.204
SerIle: 3.021 ± 0.393
SerLys: 2.427 ± 0.452
SerLeu: 3.776 ± 0.416
SerMet: 1.403 ± 0.28
SerAsn: 1.942 ± 0.367
SerPro: 3.075 ± 0.378
SerGln: 1.564 ± 0.308
SerArg: 3.398 ± 0.468
SerSer: 3.722 ± 0.561
SerThr: 3.237 ± 0.421
SerVal: 4.208 ± 0.533
SerTrp: 1.618 ± 0.254
SerTyr: 1.618 ± 0.235
SerXaa: 0.0 ± 0.0
Thr
ThrAla: 7.067 ± 0.689
ThrCys: 0.917 ± 0.25
ThrAsp: 4.046 ± 0.563
ThrGlu: 3.398 ± 0.381
ThrPhe: 1.672 ± 0.313
ThrGly: 6.042 ± 0.572
ThrHis: 1.618 ± 0.336
ThrIle: 3.398 ± 0.423
ThrLys: 2.212 ± 0.323
ThrLeu: 4.909 ± 0.519
ThrMet: 1.51 ± 0.298
ThrAsn: 2.374 ± 0.332
ThrPro: 4.963 ± 0.592
ThrGln: 2.158 ± 0.387
ThrArg: 4.477 ± 0.412
ThrSer: 3.614 ± 0.43
ThrThr: 4.909 ± 0.625
ThrVal: 5.556 ± 0.549
ThrTrp: 1.079 ± 0.296
ThrTyr: 1.888 ± 0.257
ThrXaa: 0.0 ± 0.0
Val
ValAla: 7.606 ± 0.62
ValCys: 1.079 ± 0.264
ValAsp: 5.286 ± 0.569
ValGlu: 5.179 ± 0.649
ValPhe: 1.888 ± 0.418
ValGly: 5.664 ± 0.55
ValHis: 1.241 ± 0.263
ValIle: 2.589 ± 0.401
ValLys: 2.212 ± 0.338
ValLeu: 5.286 ± 0.544
ValMet: 1.295 ± 0.203
ValAsn: 2.427 ± 0.365
ValPro: 4.208 ± 0.423
ValGln: 2.913 ± 0.36
ValArg: 4.477 ± 0.501
ValSer: 4.747 ± 0.559
ValThr: 5.556 ± 0.566
ValVal: 6.042 ± 0.72
ValTrp: 1.78 ± 0.353
ValTyr: 1.295 ± 0.234
ValXaa: 0.0 ± 0.0
Trp
TrpAla: 2.158 ± 0.346
TrpCys: 0.216 ± 0.108
TrpAsp: 1.403 ± 0.272
TrpGlu: 1.187 ± 0.361
TrpPhe: 0.701 ± 0.197
TrpGly: 0.863 ± 0.212
TrpHis: 0.593 ± 0.187
TrpIle: 1.025 ± 0.216
TrpLys: 1.133 ± 0.21
TrpLeu: 1.726 ± 0.372
TrpMet: 0.755 ± 0.256
TrpAsn: 0.539 ± 0.203
TrpPro: 1.025 ± 0.281
TrpGln: 0.809 ± 0.268
TrpArg: 2.266 ± 0.417
TrpSer: 1.51 ± 0.399
TrpThr: 1.672 ± 0.277
TrpVal: 1.672 ± 0.464
TrpTrp: 0.863 ± 0.185
TrpTyr: 0.27 ± 0.132
TrpXaa: 0.0 ± 0.0
Tyr
TyrAla: 2.697 ± 0.409
TyrCys: 0.432 ± 0.145
TyrAsp: 1.726 ± 0.421
TyrGlu: 1.51 ± 0.294
TyrPhe: 0.647 ± 0.195
TyrGly: 2.212 ± 0.375
TyrHis: 0.485 ± 0.17
TyrIle: 1.079 ± 0.162
TyrLys: 0.755 ± 0.219
TyrLeu: 2.266 ± 0.315
TyrMet: 0.27 ± 0.127
TyrAsn: 0.755 ± 0.204
TyrPro: 1.51 ± 0.252
TyrGln: 0.863 ± 0.241
TyrArg: 1.996 ± 0.369
TyrSer: 0.971 ± 0.233
TyrThr: 1.888 ± 0.347
TyrVal: 2.374 ± 0.325
TyrTrp: 0.539 ± 0.189
TyrTyr: 0.701 ± 0.163
TyrXaa: 0.0 ± 0.0
Xaa
XaaAla: 0.0 ± 0.0
XaaCys: 0.0 ± 0.0
XaaAsp: 0.0 ± 0.0
XaaGlu: 0.0 ± 0.0
XaaPhe: 0.0 ± 0.0
XaaGly: 0.0 ± 0.0
XaaHis: 0.0 ± 0.0
XaaIle: 0.0 ± 0.0
XaaLys: 0.0 ± 0.0
XaaLeu: 0.0 ± 0.0
XaaMet: 0.0 ± 0.0
XaaAsn: 0.0 ± 0.0
XaaPro: 0.0 ± 0.0
XaaGln: 0.0 ± 0.0
XaaArg: 0.0 ± 0.0
XaaSer: 0.0 ± 0.0
XaaThr: 0.0 ± 0.0
XaaVal: 0.0 ± 0.0
XaaTrp: 0.0 ± 0.0
XaaTyr: 0.0 ± 0.0
XaaXaa: 0.0 ± 0.0
Statistics based on 104 proteins (18539 amino acids)

Note: The error was estimated by bootstrapping (100 replicates) at the protein level.
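A protein-level bootstrap of this kind can be sketched as follows. This is only an illustration under stated assumptions: the page does not describe the exact procedure, so resampling whole proteins with replacement and reporting the sample standard deviation of the replicates is an assumption, and the function names and toy sequences are hypothetical.

```python
import random
import statistics
from collections import Counter


def pair_permille(proteins, pair):
    """Frequency of one dipeptide, in per mille, over a list of sequences."""
    counts = Counter()
    for seq in proteins:
        # Overlapping adjacent pairs within each protein.
        counts.update(seq[i:i + 2] for i in range(len(seq) - 1))
    total = sum(counts.values())
    return 1000 * counts[pair] / total if total else 0.0


def bootstrap_error(proteins, pair, rounds=100, seed=0):
    """Estimate the error of a dipeptide frequency by resampling proteins.

    Assumption: each replicate draws len(proteins) whole proteins with
    replacement, and the error is the standard deviation across replicates.
    """
    rng = random.Random(seed)
    estimates = []
    for _ in range(rounds):
        sample = rng.choices(proteins, k=len(proteins))
        estimates.append(pair_permille(sample, pair))
    return statistics.stdev(estimates)


# Toy one-letter sequences (hypothetical, not taken from the proteome).
toy = ["AAGA", "GAA", "AGGA", "GGG"]
error_aa = bootstrap_error(toy, "AA")
```

Resampling whole proteins rather than individual residues keeps the dipeptides of one protein together, so the estimate reflects variation between proteins.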

The dipeptide statistics above (along with other statistics for this proteome) can be downloaded from this CSV file
See this proteome in: uniprot_link
Proteome-pI is available under the Creative Commons Attribution-NoDerivs license; for more details see here

Reference: Kozlowski LP. Proteome-pI 2.0: Proteome Isoelectric Point Database Update. Nucleic Acids Res. 2021, doi: 10.1093/nar/gkab944
Contact: Lukasz P. Kozlowski