Amino acid dipeptide frequency for Mycobacterium phage Squirty

Apart from single amino acid frequencies, one can also calculate the so-called amino acid dipeptide frequency, that is, how often each ordered pair of adjacent amino acids occurs.
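The counting step can be sketched as follows. This is a minimal illustration, not the database's own code: the function name `dipeptide_frequencies`, the one-letter sequence input, and the choice to normalize by the total number of overlapping pairs are all assumptions, since the page does not state the exact procedure.

```python
from collections import Counter


def dipeptide_frequencies(proteins):
    """Return overlapping dipeptide frequencies in per mille.

    `proteins` is an iterable of one-letter amino acid sequences.
    Pairs are counted inside each protein only, never across two proteins.
    Normalizing by the total pair count is an assumption of this sketch.
    """
    counts = Counter()
    for seq in proteins:
        # Slide a window of width 2 along the sequence: "MKV" -> "MK", "KV".
        for i in range(len(seq) - 1):
            counts[seq[i:i + 2]] += 1
    total = sum(counts.values())
    if total == 0:
        return {}
    return {pair: 1000.0 * n / total for pair, n in counts.items()}


# Toy input (hypothetical sequences, not taken from the phage proteome).
freqs = dipeptide_frequencies(["MAAK", "AAL"])
print(freqs)
```

The table below uses three-letter codes (for example AlaAla), so a real pipeline would map one-letter pairs such as "AA" to those labels.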

There are 400 possible dipeptides (441 if Xaa is counted as an additional, 21st amino acid representing all non-standard or unknown amino acids). Thus, if dipeptides occurred at random in proteins, with all amino acids equally frequent, each dipeptide would be present at a frequency of 0.25% (2.5‰). As this is not the case in nature, for better visibility all dipeptides that are more frequent than expected are marked in red in the table, and those that are underrepresented are marked in blue.

All values are given in per mille (‰), so they need to be multiplied by 10⁻³ to obtain fractions.
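The arithmetic behind the uniform expectation and the per mille unit can be sketched as below. The helper names `per_mille_to_fraction` and `classify` are hypothetical, and the plain greater-than or less-than comparison is an assumption: the page does not say what threshold it uses for the red and blue marking, nor whether the error estimates are taken into account.

```python
N_STANDARD = 20
# Uniform expectation: 1 / (20 * 20) = 0.0025 = 0.25% = 2.5 per mille.
EXPECTED_PER_MILLE = 1000.0 / (N_STANDARD ** 2)


def per_mille_to_fraction(value):
    """Convert a per mille value to a plain fraction (multiply by 10^-3)."""
    return value * 1e-3


def classify(value_per_mille, expected=EXPECTED_PER_MILLE):
    """Label a value relative to the uniform expectation (assumed rule)."""
    if value_per_mille > expected:
        return "over"
    if value_per_mille < expected:
        return "under"
    return "as expected"


# Three mean values copied from the table (error estimates ignored here).
observed = {"AlaAla": 16.188, "LysPro": 2.507, "CysPhe": 0.052}
for name, value in observed.items():
    ratio = value / EXPECTED_PER_MILLE
    print(name, per_mille_to_fraction(value), round(ratio, 2), classify(value))
```

A value close to 2.5‰, such as LysPro, differs from the expectation by much less than its reported error, so a stricter rule would likely leave it unmarked.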

For more information, see the sequence space article on Wikipedia.

Ala Cys Asp Glu Phe Gly His Ile Lys Leu Met Asn Pro Gln Arg Ser Thr Val Trp Tyr Xaa
Ala
AlaAla: 16.188 ± 1.959
AlaCys: 0.836 ± 0.197
AlaAsp: 7.52 ± 0.672
AlaGlu: 7.99 ± 0.688
AlaPhe: 2.977 ± 0.427
AlaGly: 8.407 ± 1.126
AlaHis: 2.663 ± 0.552
AlaIle: 4.648 ± 0.499
AlaLys: 3.446 ± 0.398
AlaLeu: 10.131 ± 0.737
AlaMet: 2.402 ± 0.415
AlaAsn: 2.872 ± 0.417
AlaPro: 5.483 ± 0.658
AlaGln: 3.551 ± 0.44
AlaArg: 8.46 ± 1.265
AlaSer: 6.11 ± 0.658
AlaThr: 6.684 ± 0.723
AlaVal: 6.736 ± 0.579
AlaTrp: 2.924 ± 0.433
AlaTyr: 2.611 ± 0.394
AlaXaa: 0.0 ± 0.0
Cys
CysAla: 1.253 ± 0.272
CysCys: 0.209 ± 0.09
CysAsp: 1.514 ± 0.274
CysGlu: 0.731 ± 0.201
CysPhe: 0.052 ± 0.055
CysGly: 1.567 ± 0.367
CysHis: 0.522 ± 0.154
CysIle: 0.261 ± 0.106
CysLys: 0.418 ± 0.164
CysLeu: 0.679 ± 0.177
CysMet: 0.261 ± 0.108
CysAsn: 0.574 ± 0.179
CysPro: 0.992 ± 0.233
CysGln: 0.418 ± 0.197
CysArg: 1.149 ± 0.287
CysSer: 0.679 ± 0.201
CysThr: 0.574 ± 0.161
CysVal: 0.47 ± 0.172
CysTrp: 0.209 ± 0.096
CysTyr: 0.366 ± 0.146
CysXaa: 0.0 ± 0.0
Asp
AspAla: 7.154 ± 0.62
AspCys: 1.358 ± 0.27
AspAsp: 4.752 ± 0.598
AspGlu: 4.491 ± 0.511
AspPhe: 1.41 ± 0.253
AspGly: 6.423 ± 0.466
AspHis: 1.253 ± 0.266
AspIle: 2.768 ± 0.501
AspLys: 1.305 ± 0.196
AspLeu: 6.266 ± 0.519
AspMet: 1.149 ± 0.223
AspAsn: 1.828 ± 0.323
AspPro: 4.961 ± 0.54
AspGln: 3.185 ± 0.421
AspArg: 5.222 ± 0.499
AspSer: 3.916 ± 0.63
AspThr: 3.916 ± 0.456
AspVal: 4.125 ± 0.446
AspTrp: 1.619 ± 0.286
AspTyr: 2.298 ± 0.331
AspXaa: 0.0 ± 0.0
Glu
GluAla: 6.997 ± 0.802
GluCys: 0.836 ± 0.251
GluAsp: 3.29 ± 0.446
GluGlu: 2.924 ± 0.399
GluPhe: 1.932 ± 0.321
GluGly: 2.663 ± 0.423
GluHis: 1.41 ± 0.298
GluIle: 2.663 ± 0.408
GluLys: 1.828 ± 0.332
GluLeu: 6.319 ± 0.688
GluMet: 1.305 ± 0.265
GluAsn: 1.775 ± 0.263
GluPro: 2.35 ± 0.426
GluGln: 3.133 ± 0.384
GluArg: 4.595 ± 0.508
GluSer: 3.603 ± 0.476
GluThr: 3.76 ± 0.494
GluVal: 4.543 ± 0.59
GluTrp: 1.044 ± 0.236
GluTyr: 1.567 ± 0.336
GluXaa: 0.0 ± 0.0
Phe
PheAla: 3.551 ± 0.439
PheCys: 0.313 ± 0.133
PheAsp: 2.82 ± 0.569
PheGlu: 1.097 ± 0.227
PhePhe: 0.574 ± 0.215
PheGly: 3.029 ± 0.478
PheHis: 0.679 ± 0.243
PheIle: 1.044 ± 0.33
PheLys: 0.783 ± 0.212
PheLeu: 1.41 ± 0.199
PheMet: 0.679 ± 0.178
PheAsn: 0.627 ± 0.206
PhePro: 1.358 ± 0.28
PheGln: 0.679 ± 0.329
PheArg: 1.984 ± 0.288
PheSer: 1.462 ± 0.293
PheThr: 2.245 ± 0.293
PheVal: 1.723 ± 0.291
PheTrp: 0.679 ± 0.183
PheTyr: 0.522 ± 0.199
PheXaa: 0.0 ± 0.0
Gly
GlyAla: 8.198 ± 1.036
GlyCys: 1.305 ± 0.301
GlyAsp: 5.587 ± 0.578
GlyGlu: 4.334 ± 0.479
GlyPhe: 2.611 ± 0.356
GlyGly: 7.624 ± 1.292
GlyHis: 1.984 ± 0.372
GlyIle: 4.178 ± 0.722
GlyLys: 3.133 ± 0.303
GlyLeu: 5.013 ± 0.429
GlyMet: 1.88 ± 0.356
GlyAsn: 2.402 ± 0.466
GlyPro: 3.551 ± 0.623
GlyGln: 2.715 ± 0.423
GlyArg: 5.222 ± 0.582
GlySer: 5.483 ± 0.822
GlyThr: 5.483 ± 0.528
GlyVal: 6.423 ± 0.602
GlyTrp: 2.193 ± 0.301
GlyTyr: 2.037 ± 0.453
GlyXaa: 0.0 ± 0.0
His
HisAla: 2.141 ± 0.397
HisCys: 0.366 ± 0.148
HisAsp: 1.201 ± 0.234
HisGlu: 1.671 ± 0.34
HisPhe: 0.522 ± 0.147
HisGly: 1.567 ± 0.269
HisHis: 0.783 ± 0.244
HisIle: 1.044 ± 0.255
HisLys: 0.888 ± 0.294
HisLeu: 1.984 ± 0.342
HisMet: 0.366 ± 0.126
HisAsn: 0.783 ± 0.217
HisPro: 1.723 ± 0.28
HisGln: 0.679 ± 0.187
HisArg: 2.559 ± 0.443
HisSer: 0.627 ± 0.196
HisThr: 1.358 ± 0.254
HisVal: 1.671 ± 0.507
HisTrp: 0.627 ± 0.209
HisTyr: 0.731 ± 0.194
HisXaa: 0.0 ± 0.0
Ile
IleAla: 5.065 ± 0.552
IleCys: 0.522 ± 0.189
IleAsp: 3.812 ± 0.417
IleGlu: 3.603 ± 0.44
IlePhe: 0.836 ± 0.255
IleGly: 4.073 ± 0.576
IleHis: 1.514 ± 0.284
IleIle: 1.358 ± 0.251
IleLys: 0.888 ± 0.245
IleLeu: 1.984 ± 0.32
IleMet: 0.574 ± 0.185
IleAsn: 1.88 ± 0.298
IlePro: 2.402 ± 0.337
IleGln: 1.932 ± 0.279
IleArg: 3.29 ± 0.521
IleSer: 2.298 ± 0.378
IleThr: 3.133 ± 0.424
IleVal: 3.133 ± 0.396
IleTrp: 0.783 ± 0.233
IleTyr: 1.149 ± 0.203
IleXaa: 0.0 ± 0.0
Lys
LysAla: 3.446 ± 0.459
LysCys: 0.313 ± 0.12
LysAsp: 1.723 ± 0.251
LysGlu: 1.514 ± 0.237
LysPhe: 0.836 ± 0.182
LysGly: 2.715 ± 0.371
LysHis: 0.888 ± 0.247
LysIle: 1.253 ± 0.227
LysLys: 1.253 ± 0.281
LysLeu: 2.768 ± 0.377
LysMet: 0.418 ± 0.129
LysAsn: 0.992 ± 0.196
LysPro: 2.507 ± 0.336
LysGln: 1.044 ± 0.191
LysArg: 2.559 ± 0.385
LysSer: 1.88 ± 0.329
LysThr: 2.193 ± 0.359
LysVal: 2.559 ± 0.321
LysTrp: 0.627 ± 0.159
LysTyr: 0.94 ± 0.208
LysXaa: 0.0 ± 0.0
Leu
LeuAla: 8.773 ± 0.885
LeuCys: 0.992 ± 0.236
LeuAsp: 6.005 ± 0.542
LeuGlu: 3.916 ± 0.417
LeuPhe: 2.037 ± 0.388
LeuGly: 6.475 ± 0.731
LeuHis: 1.044 ± 0.259
LeuIle: 3.655 ± 0.51
LeuLys: 2.35 ± 0.362
LeuLeu: 5.483 ± 0.595
LeuMet: 1.253 ± 0.323
LeuAsn: 2.924 ± 0.389
LeuPro: 5.065 ± 0.675
LeuGln: 3.133 ± 0.441
LeuArg: 6.319 ± 0.787
LeuSer: 4.491 ± 0.53
LeuThr: 5.17 ± 0.545
LeuVal: 5.587 ± 0.659
LeuTrp: 1.305 ± 0.242
LeuTyr: 1.828 ± 0.322
LeuXaa: 0.0 ± 0.0
Met
MetAla: 1.88 ± 0.309
MetCys: 0.209 ± 0.106
MetAsp: 1.305 ± 0.233
MetGlu: 0.888 ± 0.214
MetPhe: 0.47 ± 0.17
MetGly: 1.462 ± 0.272
MetHis: 0.209 ± 0.086
MetIle: 1.149 ± 0.284
MetLys: 0.783 ± 0.242
MetLeu: 1.358 ± 0.243
MetMet: 0.418 ± 0.209
MetAsn: 0.679 ± 0.175
MetPro: 1.775 ± 0.284
MetGln: 0.366 ± 0.108
MetArg: 1.671 ± 0.24
MetSer: 2.559 ± 0.42
MetThr: 2.298 ± 0.333
MetVal: 1.305 ± 0.335
MetTrp: 0.366 ± 0.137
MetTyr: 0.366 ± 0.131
MetXaa: 0.0 ± 0.0
Asn
AsnAla: 3.238 ± 0.517
AsnCys: 0.157 ± 0.084
AsnAsp: 1.514 ± 0.218
AsnGlu: 2.037 ± 0.365
AsnPhe: 0.627 ± 0.223
AsnGly: 2.872 ± 0.397
AsnHis: 0.627 ± 0.159
AsnIle: 0.731 ± 0.239
AsnLys: 0.522 ± 0.203
AsnLeu: 2.715 ± 0.327
AsnMet: 0.679 ± 0.161
AsnAsn: 1.41 ± 0.27
AsnPro: 2.559 ± 0.312
AsnGln: 1.149 ± 0.285
AsnArg: 1.932 ± 0.272
AsnSer: 1.462 ± 0.283
AsnThr: 1.932 ± 0.31
AsnVal: 1.828 ± 0.296
AsnTrp: 0.679 ± 0.182
AsnTyr: 0.627 ± 0.176
AsnXaa: 0.0 ± 0.0
Pro
ProAla: 5.953 ± 0.645
ProCys: 1.044 ± 0.229
ProAsp: 5.483 ± 0.602
ProGlu: 3.916 ± 0.508
ProPhe: 2.037 ± 0.286
ProGly: 5.64 ± 0.506
ProHis: 1.41 ± 0.262
ProIle: 2.663 ± 0.353
ProLys: 2.141 ± 0.417
ProLeu: 4.282 ± 0.667
ProMet: 1.984 ± 0.382
ProAsn: 1.671 ± 0.267
ProPro: 3.551 ± 0.542
ProGln: 2.089 ± 0.297
ProArg: 3.499 ± 0.516
ProSer: 2.768 ± 0.398
ProThr: 2.82 ± 0.428
ProVal: 3.864 ± 0.417
ProTrp: 0.992 ± 0.217
ProTyr: 1.567 ± 0.344
ProXaa: 0.0 ± 0.0
Gln
GlnAla: 4.909 ± 0.451
GlnCys: 0.47 ± 0.123
GlnAsp: 1.305 ± 0.248
GlnGlu: 1.567 ± 0.315
GlnPhe: 0.836 ± 0.192
GlnGly: 2.245 ± 0.382
GlnHis: 0.94 ± 0.213
GlnIle: 1.305 ± 0.264
GlnLys: 1.671 ± 0.284
GlnLeu: 3.29 ± 0.473
GlnMet: 1.044 ± 0.26
GlnAsn: 0.522 ± 0.155
GlnPro: 1.775 ± 0.34
GlnGln: 1.88 ± 0.429
GlnArg: 3.29 ± 0.456
GlnSer: 2.35 ± 0.303
GlnThr: 1.775 ± 0.348
GlnVal: 2.663 ± 0.351
GlnTrp: 1.097 ± 0.193
GlnTyr: 0.94 ± 0.253
GlnXaa: 0.0 ± 0.0
Arg
ArgAla: 6.736 ± 0.603
ArgCys: 1.358 ± 0.371
ArgAsp: 5.17 ± 0.522
ArgGlu: 4.595 ± 0.504
ArgPhe: 2.507 ± 0.374
ArgGly: 5.17 ± 0.615
ArgHis: 1.514 ± 0.326
ArgIle: 4.7 ± 0.487
ArgLys: 2.715 ± 0.33
ArgLeu: 5.64 ± 0.581
ArgMet: 2.037 ± 0.371
ArgAsn: 1.984 ± 0.375
ArgPro: 3.969 ± 0.564
ArgGln: 2.663 ± 0.382
ArgArg: 5.901 ± 0.649
ArgSer: 4.073 ± 0.46
ArgThr: 3.551 ± 0.54
ArgVal: 6.162 ± 0.957
ArgTrp: 1.828 ± 0.408
ArgTyr: 1.828 ± 0.259
ArgXaa: 0.0 ± 0.0
Ser
SerAla: 7.206 ± 1.301
SerCys: 0.574 ± 0.258
SerAsp: 3.864 ± 0.492
SerGlu: 2.768 ± 0.449
SerPhe: 2.037 ± 0.396
SerGly: 5.222 ± 0.535
SerHis: 1.097 ± 0.216
SerIle: 2.507 ± 0.357
SerLys: 2.298 ± 0.34
SerLeu: 4.021 ± 0.539
SerMet: 1.723 ± 0.284
SerAsn: 1.41 ± 0.242
SerPro: 4.021 ± 0.419
SerGln: 1.41 ± 0.252
SerArg: 2.977 ± 0.429
SerSer: 3.238 ± 0.669
SerThr: 3.342 ± 0.447
SerVal: 4.543 ± 0.643
SerTrp: 1.201 ± 0.256
SerTyr: 1.619 ± 0.237
SerXaa: 0.0 ± 0.0
Thr
ThrAla: 6.266 ± 0.746
ThrCys: 0.627 ± 0.178
ThrAsp: 4.178 ± 0.549
ThrGlu: 3.185 ± 0.332
ThrPhe: 1.41 ± 0.251
ThrGly: 5.379 ± 1.008
ThrHis: 1.88 ± 0.367
ThrIle: 3.029 ± 0.448
ThrLys: 2.559 ± 0.328
ThrLeu: 5.222 ± 0.579
ThrMet: 0.731 ± 0.2
ThrAsn: 1.567 ± 0.277
ThrPro: 4.439 ± 0.499
ThrGln: 2.037 ± 0.367
ThrArg: 4.648 ± 0.52
ThrSer: 3.081 ± 0.358
ThrThr: 4.543 ± 0.632
ThrVal: 5.117 ± 0.572
ThrTrp: 1.097 ± 0.248
ThrTyr: 1.671 ± 0.296
ThrXaa: 0.0 ± 0.0
Val
ValAla: 8.825 ± 0.793
ValCys: 0.836 ± 0.191
ValAsp: 5.065 ± 0.639
ValGlu: 4.595 ± 0.662
ValPhe: 1.775 ± 0.362
ValGly: 5.483 ± 0.477
ValHis: 1.671 ± 0.298
ValIle: 3.342 ± 0.442
ValLys: 2.037 ± 0.36
ValLeu: 5.326 ± 0.553
ValMet: 1.514 ± 0.286
ValAsn: 2.35 ± 0.319
ValPro: 4.439 ± 0.388
ValGln: 2.089 ± 0.348
ValArg: 4.752 ± 0.619
ValSer: 4.439 ± 0.403
ValThr: 4.23 ± 0.512
ValVal: 5.744 ± 0.855
ValTrp: 1.828 ± 0.276
ValTyr: 1.932 ± 0.408
ValXaa: 0.0 ± 0.0
Trp
TrpAla: 1.984 ± 0.356
TrpCys: 0.366 ± 0.118
TrpAsp: 1.305 ± 0.368
TrpGlu: 0.888 ± 0.251
TrpPhe: 1.097 ± 0.171
TrpGly: 1.514 ± 0.295
TrpHis: 0.47 ± 0.171
TrpIle: 0.731 ± 0.186
TrpLys: 0.679 ± 0.195
TrpLeu: 2.141 ± 0.382
TrpMet: 0.679 ± 0.185
TrpAsn: 0.418 ± 0.111
TrpPro: 0.783 ± 0.215
TrpGln: 0.94 ± 0.24
TrpArg: 2.037 ± 0.313
TrpSer: 1.305 ± 0.238
TrpThr: 1.932 ± 0.325
TrpVal: 1.671 ± 0.254
TrpTrp: 0.731 ± 0.194
TrpTyr: 0.679 ± 0.191
TrpXaa: 0.0 ± 0.0
Tyr
TyrAla: 2.872 ± 0.402
TyrCys: 0.209 ± 0.108
TyrAsp: 1.984 ± 0.362
TyrGlu: 1.828 ± 0.282
TyrPhe: 0.679 ± 0.176
TyrGly: 1.828 ± 0.276
TyrHis: 0.731 ± 0.215
TyrIle: 1.201 ± 0.226
TyrLys: 0.731 ± 0.182
TyrLeu: 1.775 ± 0.413
TyrMet: 0.366 ± 0.118
TyrAsn: 0.627 ± 0.143
TyrPro: 1.671 ± 0.266
TyrGln: 0.783 ± 0.251
TyrArg: 2.037 ± 0.397
TyrSer: 1.253 ± 0.272
TyrThr: 1.775 ± 0.308
TyrVal: 2.402 ± 0.322
TyrTrp: 0.522 ± 0.166
TyrTyr: 0.366 ± 0.107
TyrXaa: 0.0 ± 0.0
Xaa
XaaAla: 0.0 ± 0.0
XaaCys: 0.0 ± 0.0
XaaAsp: 0.0 ± 0.0
XaaGlu: 0.0 ± 0.0
XaaPhe: 0.0 ± 0.0
XaaGly: 0.0 ± 0.0
XaaHis: 0.0 ± 0.0
XaaIle: 0.0 ± 0.0
XaaLys: 0.0 ± 0.0
XaaLeu: 0.0 ± 0.0
XaaMet: 0.0 ± 0.0
XaaAsn: 0.0 ± 0.0
XaaPro: 0.0 ± 0.0
XaaGln: 0.0 ± 0.0
XaaArg: 0.0 ± 0.0
XaaSer: 0.0 ± 0.0
XaaThr: 0.0 ± 0.0
XaaVal: 0.0 ± 0.0
XaaTrp: 0.0 ± 0.0
XaaTyr: 0.0 ± 0.0
XaaXaa: 0.0 ± 0.0
Statistics based on 102 proteins (19151 amino acids)

Note: The error was estimated by bootstrapping (100 replicates) at the protein level.
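One way to read "bootstrapping at the protein level" is sketched below: whole proteins are resampled with replacement, the frequency is recomputed for each replicate, and the spread across replicates is reported. The function names, the use of the standard deviation as the error, and the fixed seed are assumptions of this sketch; the page does not describe the exact estimator.

```python
import random
import statistics
from collections import Counter


def pair_frequency(proteins, pair):
    """Frequency of one dipeptide, in per mille, over a list of sequences."""
    counts = Counter()
    for seq in proteins:
        for i in range(len(seq) - 1):
            counts[seq[i:i + 2]] += 1
    total = sum(counts.values())
    return 1000.0 * counts[pair] / total if total else 0.0


def bootstrap_error(proteins, pair, n_replicates=100, seed=0):
    """Return (mean, standard deviation) of the frequency across replicates."""
    rng = random.Random(seed)  # fixed seed only to make the sketch repeatable
    estimates = []
    for _ in range(n_replicates):
        # Resample whole proteins with replacement, keeping the proteome size.
        sample = rng.choices(proteins, k=len(proteins))
        estimates.append(pair_frequency(sample, pair))
    return statistics.mean(estimates), statistics.stdev(estimates)


# Toy input (hypothetical sequences, not taken from the phage proteome).
proteins = ["MAAK", "AAL", "MKVL", "GAAG"]
mean, err = bootstrap_error(proteins, "AA")
print(f"AA: {mean:.3f} ± {err:.3f}")
```

Resampling proteins rather than individual residues keeps each sequence intact, so dipeptides that cluster in a few proteins produce a correspondingly larger error.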

The dipeptide statistics above (along with other statistics for this proteome) can be downloaded from this CSV file
See this proteome in: uniprot_link
Proteome-pI is available under a Creative Commons Attribution-NoDerivs license; for more details see here

Reference: Kozlowski LP. Proteome-pI 2.0: Proteome Isoelectric Point Database Update. Nucleic Acids Res. 2021, doi: 10.1093/nar/gkab944
Contact: Lukasz P. Kozlowski