Amino acid dipeptide frequency for Mycobacterium phage FlagStaff

Apart from single amino acid frequencies, one can also calculate the so-called amino acid dipeptide frequency, i.e. how often each ordered pair of adjacent residues occurs.
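As a minimal sketch of how such frequencies could be computed, the snippet below counts overlapping pairs of adjacent residues within each protein and normalizes by the total number of pairs. The function name `dipeptide_per_mille` and the toy sequences are hypothetical, sequences use one-letter codes (the table uses three-letter codes), and the exact normalization used by Proteome-pI is not stated on this page, so this is an assumption.

```python
from collections import Counter


def dipeptide_per_mille(proteins):
    """Return {dipeptide: frequency in per mille} over all proteins.

    Dipeptides are overlapping pairs of adjacent residues; no pair is
    formed across the boundary between two proteins.
    """
    counts = Counter()
    for seq in proteins:
        for i in range(len(seq) - 1):
            counts[seq[i:i + 2]] += 1
    total = sum(counts.values())
    if total == 0:
        return {}
    # Assumption: normalize by the total number of dipeptides observed.
    return {pair: 1000.0 * n / total for pair, n in counts.items()}


# Toy sequences (hypothetical, not taken from the FlagStaff proteome).
freqs = dipeptide_per_mille(["MAAGA", "AAK"])
```

Here "MAAGA" contributes the pairs MA, AA, AG and GA, and "AAK" contributes AA and AK, so AA accounts for two of the six pairs.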

There are 400 possible dipeptides (441 if Xaa is counted as an additional, 21st amino acid representing all non-standard or unknown amino acids). Thus, if dipeptides occurred randomly in proteins, with all 20 amino acids equally likely, each dipeptide would have a frequency of 0.25% (2.5‰). As this is not the case in nature, for better visibility dipeptides that occur more often than expected are marked in red, and those that are underrepresented are marked in blue in the table.
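The uniform expectation and the over/under-representation rule can be sketched as follows. The helper names are hypothetical, and which baseline (400 or 441 dipeptides) the page uses for its colouring is an assumption; the three example values are taken from the table.

```python
def expected_per_mille(alphabet_size=20):
    """Uniform expectation for one dipeptide, in per mille."""
    return 1000.0 / alphabet_size ** 2


def classify(observed, expected):
    """Label a dipeptide relative to the uniform expectation."""
    if observed > expected:
        return "over"    # would be marked red
    if observed < expected:
        return "under"   # would be marked blue
    return "as expected"


uniform_20 = expected_per_mille(20)  # 2.5 per mille, i.e. 0.25%
uniform_21 = expected_per_mille(21)  # about 2.268 per mille with Xaa

# Values from the table (per mille); the 20-letter baseline is an assumption.
examples = {"AlaAla": 20.297, "CysCys": 0.273, "GluGln": 2.665}
labels = {pair: classify(value, uniform_20) for pair, value in examples.items()}
```

With the 20-letter baseline, AlaAla and GluGln fall above the expectation and CysCys below it.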

All values are given in per mille (‰), so they must be multiplied by 10^-3 to obtain fractions.
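To illustrate the unit conversion, the sketch below reads one table entry and converts both the value and its error from per mille to fractions. The pattern and the function name `parse_entry` are hypothetical helpers, written only for entries shaped like those in the table on this page.

```python
import re

# Two three-letter residue codes, a per-mille value and its error.
ENTRY = re.compile(r"([A-Z][a-z]{2})([A-Z][a-z]{2}): ([\d.]+) ± ([\d.]+)")


def parse_entry(line):
    """Return (first, second, fraction, error) with values as fractions."""
    match = ENTRY.search(line)
    if match is None:
        raise ValueError(f"not a dipeptide entry: {line!r}")
    first, second, value, error = match.groups()
    # Per mille -> fraction: multiply by 10^-3.
    return first, second, float(value) * 1e-3, float(error) * 1e-3


first, second, fraction, error = parse_entry("AlaAla: 20.297 ± 2.175")
```

For the AlaAla entry this gives a fraction of roughly 0.0203, i.e. about 2% of all dipeptides.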

For more information, see the sequence space article on Wikipedia.

Ala Cys Asp Glu Phe Gly His Ile Lys Leu Met Asn Pro Gln Arg Ser Thr Val Trp Tyr Xaa
Ala
AlaAla: 20.297 ± 2.175
AlaCys: 0.683 ± 0.189
AlaAsp: 8.611 ± 0.885
AlaGlu: 7.722 ± 1.275
AlaPhe: 4.169 ± 0.61
AlaGly: 14.283 ± 1.878
AlaHis: 2.187 ± 0.376
AlaIle: 7.107 ± 0.847
AlaLys: 5.262 ± 0.603
AlaLeu: 9.909 ± 0.886
AlaMet: 3.622 ± 0.523
AlaAsn: 4.374 ± 0.506
AlaPro: 6.834 ± 0.722
AlaGln: 5.194 ± 0.729
AlaArg: 7.927 ± 1.076
AlaSer: 4.852 ± 0.741
AlaThr: 8.337 ± 0.772
AlaVal: 9.704 ± 0.899
AlaTrp: 2.46 ± 0.378
AlaTyr: 1.982 ± 0.327
AlaXaa: 0.0 ± 0.0
Cys
CysAla: 1.093 ± 0.289
CysCys: 0.273 ± 0.14
CysAsp: 0.478 ± 0.223
CysGlu: 0.547 ± 0.229
CysPhe: 0.205 ± 0.133
CysGly: 1.435 ± 0.372
CysHis: 0.205 ± 0.111
CysIle: 0.478 ± 0.196
CysLys: 0.205 ± 0.126
CysLeu: 0.41 ± 0.15
CysMet: 0.0 ± 0.0
CysAsn: 0.273 ± 0.162
CysPro: 0.957 ± 0.29
CysGln: 0.547 ± 0.211
CysArg: 1.093 ± 0.331
CysSer: 0.478 ± 0.213
CysThr: 0.342 ± 0.145
CysVal: 0.547 ± 0.223
CysTrp: 0.205 ± 0.12
CysTyr: 0.547 ± 0.236
CysXaa: 0.0 ± 0.0
Asp
AspAla: 10.524 ± 1.039
AspCys: 1.025 ± 0.413
AspAsp: 7.927 ± 0.98
AspGlu: 5.809 ± 0.818
AspPhe: 1.572 ± 0.426
AspGly: 6.834 ± 0.82
AspHis: 1.162 ± 0.277
AspIle: 0.957 ± 0.248
AspLys: 1.777 ± 0.348
AspLeu: 4.579 ± 0.456
AspMet: 1.298 ± 0.284
AspAsn: 2.187 ± 0.39
AspPro: 5.33 ± 0.787
AspGln: 3.759 ± 0.598
AspArg: 3.827 ± 0.578
AspSer: 2.392 ± 0.4
AspThr: 3.895 ± 0.439
AspVal: 6.15 ± 0.549
AspTrp: 2.05 ± 0.377
AspTyr: 2.187 ± 0.43
AspXaa: 0.0 ± 0.0
Glu
GluAla: 5.809 ± 0.782
GluCys: 0.615 ± 0.203
GluAsp: 4.51 ± 0.592
GluGlu: 2.802 ± 0.379
GluPhe: 2.392 ± 0.355
GluGly: 3.554 ± 0.494
GluHis: 2.187 ± 0.467
GluIle: 1.64 ± 0.421
GluLys: 2.187 ± 0.399
GluLeu: 3.827 ± 0.503
GluMet: 1.162 ± 0.258
GluAsn: 1.435 ± 0.372
GluPro: 4.169 ± 0.766
GluGln: 2.665 ± 0.407
GluArg: 4.51 ± 0.733
GluSer: 1.572 ± 0.298
GluThr: 2.05 ± 0.347
GluVal: 3.895 ± 0.489
GluTrp: 1.162 ± 0.25
GluTyr: 1.23 ± 0.284
GluXaa: 0.0 ± 0.0
Phe
PheAla: 3.417 ± 0.491
PheCys: 0.41 ± 0.193
PheAsp: 3.007 ± 0.546
PheGlu: 2.392 ± 0.393
PhePhe: 0.82 ± 0.242
PheGly: 3.075 ± 0.563
PheHis: 0.547 ± 0.217
PheIle: 0.888 ± 0.326
PheLys: 0.752 ± 0.236
PheLeu: 2.255 ± 0.443
PheMet: 0.137 ± 0.079
PheAsn: 1.025 ± 0.256
PhePro: 1.23 ± 0.296
PheGln: 1.025 ± 0.267
PheArg: 1.435 ± 0.306
PheSer: 1.23 ± 0.289
PheThr: 1.913 ± 0.376
PheVal: 2.118 ± 0.419
PheTrp: 0.547 ± 0.187
PheTyr: 0.888 ± 0.27
PheXaa: 0.0 ± 0.0
Gly
GlyAla: 11.003 ± 1.708
GlyCys: 0.752 ± 0.209
GlyAsp: 7.381 ± 0.696
GlyGlu: 4.442 ± 0.473
GlyPhe: 2.939 ± 0.464
GlyGly: 11.823 ± 3.315
GlyHis: 1.572 ± 0.384
GlyIle: 3.895 ± 0.59
GlyLys: 3.69 ± 0.467
GlyLeu: 6.355 ± 0.753
GlyMet: 1.435 ± 0.374
GlyAsn: 3.417 ± 0.508
GlyPro: 4.1 ± 0.621
GlyGln: 2.597 ± 0.471
GlyArg: 6.492 ± 0.765
GlySer: 5.057 ± 0.907
GlyThr: 5.604 ± 0.553
GlyVal: 5.945 ± 0.745
GlyTrp: 2.187 ± 0.401
GlyTyr: 2.05 ± 0.47
GlyXaa: 0.0 ± 0.0
His
HisAla: 2.118 ± 0.446
HisCys: 0.273 ± 0.114
HisAsp: 1.435 ± 0.39
HisGlu: 0.547 ± 0.22
HisPhe: 0.273 ± 0.138
HisGly: 1.777 ± 0.324
HisHis: 0.137 ± 0.099
HisIle: 0.752 ± 0.223
HisLys: 0.478 ± 0.137
HisLeu: 1.298 ± 0.397
HisMet: 0.752 ± 0.184
HisAsn: 0.205 ± 0.124
HisPro: 1.572 ± 0.409
HisGln: 1.298 ± 0.339
HisArg: 2.46 ± 0.538
HisSer: 0.752 ± 0.204
HisThr: 1.025 ± 0.279
HisVal: 1.913 ± 0.392
HisTrp: 0.342 ± 0.144
HisTyr: 0.205 ± 0.097
HisXaa: 0.0 ± 0.0
Ile
IleAla: 7.722 ± 0.88
IleCys: 0.342 ± 0.173
IleAsp: 3.144 ± 0.484
IleGlu: 1.708 ± 0.265
IlePhe: 0.888 ± 0.241
IleGly: 4.237 ± 0.689
IleHis: 1.093 ± 0.208
IleIle: 0.957 ± 0.257
IleLys: 1.435 ± 0.397
IleLeu: 2.802 ± 0.443
IleMet: 0.683 ± 0.208
IleAsn: 0.957 ± 0.311
IlePro: 2.802 ± 0.49
IleGln: 1.093 ± 0.314
IleArg: 3.28 ± 0.376
IleSer: 1.64 ± 0.424
IleThr: 2.597 ± 0.403
IleVal: 2.734 ± 0.401
IleTrp: 1.025 ± 0.298
IleTyr: 0.615 ± 0.177
IleXaa: 0.0 ± 0.0
Lys
LysAla: 5.33 ± 0.623
LysCys: 0.41 ± 0.175
LysAsp: 1.913 ± 0.48
LysGlu: 0.82 ± 0.226
LysPhe: 1.298 ± 0.316
LysGly: 1.777 ± 0.349
LysHis: 0.683 ± 0.223
LysIle: 2.05 ± 0.376
LysLys: 1.298 ± 0.343
LysLeu: 2.324 ± 0.366
LysMet: 0.888 ± 0.264
LysAsn: 1.503 ± 0.325
LysPro: 2.05 ± 0.354
LysGln: 0.752 ± 0.256
LysArg: 3.144 ± 0.525
LysSer: 2.324 ± 0.398
LysThr: 2.118 ± 0.388
LysVal: 2.392 ± 0.387
LysTrp: 0.478 ± 0.151
LysTyr: 0.478 ± 0.181
LysXaa: 0.0 ± 0.0
Leu
LeuAla: 9.089 ± 0.815
LeuCys: 0.342 ± 0.142
LeuAsp: 5.262 ± 0.592
LeuGlu: 4.1 ± 0.523
LeuPhe: 2.324 ± 0.406
LeuGly: 6.287 ± 0.653
LeuHis: 0.957 ± 0.248
LeuIle: 3.759 ± 0.501
LeuLys: 2.46 ± 0.46
LeuLeu: 5.604 ± 0.568
LeuMet: 1.367 ± 0.373
LeuAsn: 2.05 ± 0.388
LeuPro: 5.33 ± 0.671
LeuGln: 1.572 ± 0.306
LeuArg: 5.262 ± 0.594
LeuSer: 4.237 ± 0.523
LeuThr: 5.194 ± 0.587
LeuVal: 4.852 ± 0.68
LeuTrp: 1.777 ± 0.425
LeuTyr: 2.118 ± 0.474
LeuXaa: 0.0 ± 0.0
Met
MetAla: 3.417 ± 0.343
MetCys: 0.137 ± 0.096
MetAsp: 1.025 ± 0.259
MetGlu: 0.82 ± 0.216
MetPhe: 0.752 ± 0.188
MetGly: 1.572 ± 0.324
MetHis: 0.273 ± 0.128
MetIle: 1.162 ± 0.309
MetLys: 0.82 ± 0.229
MetLeu: 2.118 ± 0.33
MetMet: 0.205 ± 0.121
MetAsn: 0.342 ± 0.179
MetPro: 0.82 ± 0.248
MetGln: 0.683 ± 0.255
MetArg: 0.957 ± 0.217
MetSer: 1.503 ± 0.296
MetThr: 2.255 ± 0.351
MetVal: 1.162 ± 0.252
MetTrp: 0.41 ± 0.174
MetTyr: 0.137 ± 0.1
MetXaa: 0.0 ± 0.0
Asn
AsnAla: 4.442 ± 0.839
AsnCys: 0.41 ± 0.198
AsnAsp: 1.982 ± 0.39
AsnGlu: 0.957 ± 0.263
AsnPhe: 1.025 ± 0.227
AsnGly: 3.964 ± 0.467
AsnHis: 0.82 ± 0.285
AsnIle: 1.093 ± 0.304
AsnLys: 0.752 ± 0.183
AsnLeu: 2.05 ± 0.397
AsnMet: 0.547 ± 0.182
AsnAsn: 0.957 ± 0.212
AsnPro: 3.69 ± 0.558
AsnGln: 1.367 ± 0.268
AsnArg: 2.255 ± 0.449
AsnSer: 1.025 ± 0.274
AsnThr: 2.255 ± 0.361
AsnVal: 1.298 ± 0.256
AsnTrp: 0.888 ± 0.177
AsnTyr: 0.478 ± 0.178
AsnXaa: 0.0 ± 0.0
Pro
ProAla: 9.499 ± 0.915
ProCys: 0.478 ± 0.188
ProAsp: 7.176 ± 0.892
ProGlu: 3.349 ± 0.57
ProPhe: 1.572 ± 0.364
ProGly: 4.169 ± 0.625
ProHis: 1.093 ± 0.377
ProIle: 3.28 ± 0.388
ProLys: 2.665 ± 0.581
ProLeu: 4.647 ± 0.613
ProMet: 0.888 ± 0.296
ProAsn: 1.845 ± 0.337
ProPro: 4.374 ± 0.785
ProGln: 2.187 ± 0.353
ProArg: 3.212 ± 0.557
ProSer: 3.212 ± 0.498
ProThr: 3.554 ± 0.499
ProVal: 3.144 ± 0.418
ProTrp: 0.547 ± 0.175
ProTyr: 0.752 ± 0.266
ProXaa: 0.0 ± 0.0
Gln
GlnAla: 5.535 ± 0.706
GlnCys: 0.137 ± 0.105
GlnAsp: 2.187 ± 0.37
GlnGlu: 0.82 ± 0.22
GlnPhe: 1.162 ± 0.213
GlnGly: 3.075 ± 0.44
GlnHis: 0.957 ± 0.304
GlnIle: 1.572 ± 0.292
GlnLys: 0.82 ± 0.202
GlnLeu: 2.939 ± 0.408
GlnMet: 1.435 ± 0.281
GlnAsn: 1.777 ± 0.331
GlnPro: 1.845 ± 0.339
GlnGln: 1.777 ± 0.345
GlnArg: 2.529 ± 0.455
GlnSer: 1.777 ± 0.414
GlnThr: 2.255 ± 0.346
GlnVal: 2.597 ± 0.385
GlnTrp: 0.683 ± 0.168
GlnTyr: 0.342 ± 0.14
GlnXaa: 0.0 ± 0.0
Arg
ArgAla: 8.406 ± 1.136
ArgCys: 1.367 ± 0.396
ArgAsp: 4.169 ± 0.428
ArgGlu: 3.69 ± 0.482
ArgPhe: 1.23 ± 0.319
ArgGly: 4.852 ± 0.524
ArgHis: 1.435 ± 0.322
ArgIle: 3.144 ± 0.394
ArgLys: 2.529 ± 0.438
ArgLeu: 6.014 ± 0.613
ArgMet: 2.118 ± 0.385
ArgAsn: 2.324 ± 0.324
ArgPro: 3.895 ± 0.54
ArgGln: 2.802 ± 0.518
ArgArg: 6.082 ± 1.066
ArgSer: 2.597 ± 0.486
ArgThr: 4.374 ± 0.609
ArgVal: 3.349 ± 0.543
ArgTrp: 1.64 ± 0.407
ArgTyr: 1.64 ± 0.339
ArgXaa: 0.0 ± 0.0
Ser
SerAla: 5.057 ± 0.829
SerCys: 0.752 ± 0.281
SerAsp: 2.939 ± 0.54
SerGlu: 2.392 ± 0.385
SerPhe: 1.162 ± 0.292
SerGly: 5.33 ± 0.87
SerHis: 0.683 ± 0.213
SerIle: 2.187 ± 0.478
SerLys: 1.298 ± 0.265
SerLeu: 3.622 ± 0.516
SerMet: 0.41 ± 0.165
SerAsn: 1.64 ± 0.314
SerPro: 2.734 ± 0.464
SerGln: 1.298 ± 0.308
SerArg: 2.939 ± 0.551
SerSer: 2.46 ± 0.486
SerThr: 3.554 ± 0.559
SerVal: 2.734 ± 0.482
SerTrp: 0.957 ± 0.285
SerTyr: 1.093 ± 0.303
SerXaa: 0.0 ± 0.0
Thr
ThrAla: 8.679 ± 0.771
ThrCys: 0.615 ± 0.325
ThrAsp: 4.032 ± 0.621
ThrGlu: 3.349 ± 0.522
ThrPhe: 1.64 ± 0.311
ThrGly: 5.877 ± 0.779
ThrHis: 1.367 ± 0.294
ThrIle: 2.529 ± 0.365
ThrLys: 1.982 ± 0.365
ThrLeu: 4.647 ± 0.552
ThrMet: 1.503 ± 0.407
ThrAsn: 2.187 ± 0.386
ThrPro: 4.374 ± 0.689
ThrGln: 1.298 ± 0.406
ThrArg: 3.759 ± 0.515
ThrSer: 3.895 ± 0.549
ThrThr: 4.305 ± 0.507
ThrVal: 4.715 ± 0.452
ThrTrp: 1.093 ± 0.282
ThrTyr: 1.025 ± 0.237
ThrXaa: 0.0 ± 0.0
Val
ValAla: 9.431 ± 0.796
ValCys: 0.888 ± 0.279
ValAsp: 5.535 ± 0.761
ValGlu: 4.989 ± 0.496
ValPhe: 2.529 ± 0.394
ValGly: 5.262 ± 0.791
ValHis: 0.957 ± 0.223
ValIle: 2.734 ± 0.388
ValLys: 2.46 ± 0.456
ValLeu: 4.647 ± 0.457
ValMet: 1.503 ± 0.331
ValAsn: 2.118 ± 0.349
ValPro: 3.895 ± 0.487
ValGln: 2.734 ± 0.412
ValArg: 3.485 ± 0.45
ValSer: 2.187 ± 0.347
ValThr: 4.374 ± 0.778
ValVal: 4.579 ± 0.529
ValTrp: 1.435 ± 0.339
ValTyr: 1.435 ± 0.335
ValXaa: 0.0 ± 0.0
Trp
TrpAla: 2.392 ± 0.396
TrpCys: 0.137 ± 0.109
TrpAsp: 1.093 ± 0.281
TrpGlu: 1.162 ± 0.26
TrpPhe: 0.957 ± 0.233
TrpGly: 1.162 ± 0.383
TrpHis: 0.478 ± 0.18
TrpIle: 1.025 ± 0.209
TrpLys: 0.683 ± 0.255
TrpLeu: 2.05 ± 0.55
TrpMet: 0.342 ± 0.116
TrpAsn: 1.298 ± 0.318
TrpPro: 0.82 ± 0.206
TrpGln: 0.82 ± 0.189
TrpArg: 1.572 ± 0.355
TrpSer: 0.82 ± 0.258
TrpThr: 1.777 ± 0.344
TrpVal: 1.093 ± 0.263
TrpTrp: 0.41 ± 0.166
TrpTyr: 0.478 ± 0.187
TrpXaa: 0.0 ± 0.0
Tyr
TyrAla: 2.255 ± 0.452
TyrCys: 0.273 ± 0.136
TyrAsp: 1.435 ± 0.316
TyrGlu: 0.957 ± 0.257
TyrPhe: 0.205 ± 0.126
TyrGly: 2.392 ± 0.429
TyrHis: 0.82 ± 0.253
TyrIle: 0.547 ± 0.163
TyrLys: 0.478 ± 0.158
TyrLeu: 1.845 ± 0.294
TyrMet: 0.205 ± 0.114
TyrAsn: 0.41 ± 0.187
TyrPro: 0.82 ± 0.232
TyrGln: 0.752 ± 0.213
TyrArg: 1.435 ± 0.32
TyrSer: 1.23 ± 0.301
TyrThr: 1.093 ± 0.258
TyrVal: 2.255 ± 0.412
TyrTrp: 0.273 ± 0.097
TyrTyr: 0.478 ± 0.174
TyrXaa: 0.0 ± 0.0
Xaa
XaaAla: 0.0 ± 0.0
XaaCys: 0.0 ± 0.0
XaaAsp: 0.0 ± 0.0
XaaGlu: 0.0 ± 0.0
XaaPhe: 0.0 ± 0.0
XaaGly: 0.0 ± 0.0
XaaHis: 0.0 ± 0.0
XaaIle: 0.0 ± 0.0
XaaLys: 0.0 ± 0.0
XaaLeu: 0.0 ± 0.0
XaaMet: 0.0 ± 0.0
XaaAsn: 0.0 ± 0.0
XaaPro: 0.0 ± 0.0
XaaGln: 0.0 ± 0.0
XaaArg: 0.0 ± 0.0
XaaSer: 0.0 ± 0.0
XaaThr: 0.0 ± 0.0
XaaVal: 0.0 ± 0.0
XaaTrp: 0.0 ± 0.0
XaaTyr: 0.0 ± 0.0
XaaXaa: 0.0 ± 0.0
Statistics based on 65 proteins (14634 amino acids)

Note: The error has been estimated by bootstrapping (×100) at the protein level.
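A protein-level bootstrap of this kind could look roughly like the sketch below: whole proteins are resampled with replacement, the dipeptide frequency is recomputed for each replicate, and the spread of the replicates serves as the error. The function names, the toy sequences and the use of the standard deviation as the reported error are assumptions; the page only states that 100 replicates were drawn at the protein level.

```python
import random
import statistics


def pair_per_mille(proteins, pair):
    """Frequency of one dipeptide (one-letter codes) in per mille."""
    hits = total = 0
    for seq in proteins:
        for i in range(len(seq) - 1):
            total += 1
            hits += seq[i:i + 2] == pair
    return 1000.0 * hits / total if total else 0.0


def bootstrap_error(proteins, pair, replicates=100, seed=0):
    """Spread of the frequency over protein-level bootstrap replicates."""
    rng = random.Random(seed)
    estimates = []
    for _ in range(replicates):
        # Resample whole proteins with replacement, not single residues.
        sample = rng.choices(proteins, k=len(proteins))
        estimates.append(pair_per_mille(sample, pair))
    # Assumption: the reported error is the standard deviation.
    return statistics.stdev(estimates)


# Toy sequences (hypothetical, not taken from the FlagStaff proteome).
toy = ["AAAA", "KKKK", "AKAK"]
error = bootstrap_error(toy, "AA")
```

Resampling whole proteins rather than individual dipeptides keeps residues from the same protein together, so the error reflects variation between proteins.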

The dipeptide statistics above (among other statistics for this proteome) can be downloaded from this CSV file
See this proteome in: uniprot_link
Proteome-pI is available under Creative Commons Attribution-NoDerivs license, for more details see here

Reference: Kozlowski LP. Proteome-pI 2.0: Proteome Isoelectric Point Database Update. Nucleic Acids Res. 2021, doi: 10.1093/nar/gkab944
Contact: Lukasz P. Kozlowski