Amino acid dipeptide frequency for Mycobacterium phage Kingsley

Apart from single amino acid frequencies, one can also calculate the so-called amino acid dipeptide frequency, that is, how often each ordered pair of adjacent residues occurs.

There are 400 possible dipeptides (441 if Xaa is counted as an additional 21st amino acid representing all non-standard or unknown amino acids). Thus, if all dipeptides were equally likely in proteins, each of the 400 would occur with a frequency of 0.25% (2.5‰). This is not the case in nature, so for better visibility dipeptides that are more frequent than expected are marked in red in the table, and those that are underrepresented are marked in blue.

All values are given in per mille (‰), so they need to be multiplied by 10⁻³ to obtain fractions.
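The calculation described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `dipeptide_permille`, the toy sequences, and the choice to normalise by the total number of dipeptides are hypothetical and not taken from Proteome-pI, whose exact normalisation may differ.

```python
from collections import Counter
from itertools import product

# The 20 standard amino acids in one-letter code (Xaa, i.e. "X", is left out here).
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def dipeptide_permille(proteins):
    """Return {dipeptide: frequency in per mille} for a list of protein sequences."""
    counts = Counter()
    for seq in proteins:
        # Overlapping windows of length 2; a pair never spans two proteins.
        for i in range(len(seq) - 1):
            counts[seq[i:i + 2]] += 1
    total = sum(counts.values())
    if total == 0:
        return {}
    # Assumption: normalise by the total number of dipeptides observed.
    return {a + b: 1000.0 * counts[a + b] / total
            for a, b in product(AMINO_ACIDS, repeat=2)}


# Uniform expectation: 1/400 = 0.25% = 2.5 per mille.
EXPECTED_PERMILLE = 1000.0 / len(AMINO_ACIDS) ** 2

toy_proteins = ["MAAGA", "MGAA"]  # made-up sequences, not phage data
freqs = dipeptide_permille(toy_proteins)
# Dipeptides that are more frequent than the uniform expectation.
over = sorted(dp for dp, f in freqs.items() if f > EXPECTED_PERMILLE)
```

Multiplying any returned value by 10⁻³ gives the corresponding fraction; comparing it with `EXPECTED_PERMILLE` mirrors the over- and underrepresentation marking described above.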

For more information, see the sequence space article on Wikipedia.

Dipeptide: frequency ± error (‰), grouped by the first residue. Within each group the second residue follows the order:
Ala Cys Asp Glu Phe Gly His Ile Lys Leu Met Asn Pro Gln Arg Ser Thr Val Trp Tyr Xaa

Ala
AlaAla: 13.561 ± 1.496
AlaCys: 1.027 ± 0.223
AlaAsp: 6.883 ± 0.519
AlaGlu: 7.14 ± 0.773
AlaPhe: 2.671 ± 0.342
AlaGly: 9.349 ± 1.284
AlaHis: 2.62 ± 0.398
AlaIle: 4.726 ± 0.556
AlaLys: 4.469 ± 0.446
AlaLeu: 8.065 ± 0.745
AlaMet: 2.568 ± 0.406
AlaAsn: 3.185 ± 0.516
AlaPro: 4.983 ± 0.503
AlaGln: 4.058 ± 0.474
AlaArg: 7.14 ± 0.684
AlaSer: 5.496 ± 0.559
AlaThr: 5.393 ± 0.63
AlaVal: 6.575 ± 0.584
AlaTrp: 2.414 ± 0.325
AlaTyr: 2.671 ± 0.314
AlaXaa: 0.0 ± 0.0

Cys
CysAla: 0.668 ± 0.225
CysCys: 0.154 ± 0.083
CysAsp: 1.336 ± 0.283
CysGlu: 0.873 ± 0.198
CysPhe: 0.257 ± 0.106
CysGly: 1.438 ± 0.368
CysHis: 0.411 ± 0.136
CysIle: 0.514 ± 0.166
CysLys: 0.514 ± 0.194
CysLeu: 0.822 ± 0.247
CysMet: 0.308 ± 0.119
CysAsn: 0.308 ± 0.132
CysPro: 1.233 ± 0.278
CysGln: 0.514 ± 0.163
CysArg: 1.336 ± 0.427
CysSer: 0.873 ± 0.275
CysThr: 0.719 ± 0.192
CysVal: 0.719 ± 0.187
CysTrp: 0.154 ± 0.077
CysTyr: 0.103 ± 0.073
CysXaa: 0.0 ± 0.0

Asp
AspAla: 7.191 ± 0.477
AspCys: 1.027 ± 0.281
AspAsp: 4.623 ± 0.566
AspGlu: 4.058 ± 0.505
AspPhe: 1.901 ± 0.25
AspGly: 6.421 ± 0.592
AspHis: 1.438 ± 0.324
AspIle: 2.928 ± 0.428
AspLys: 1.746 ± 0.306
AspLeu: 6.061 ± 0.538
AspMet: 1.027 ± 0.197
AspAsn: 1.336 ± 0.22
AspPro: 4.109 ± 0.435
AspGln: 2.363 ± 0.365
AspArg: 4.983 ± 0.588
AspSer: 3.596 ± 0.544
AspThr: 3.698 ± 0.355
AspVal: 4.828 ± 0.502
AspTrp: 1.387 ± 0.217
AspTyr: 1.849 ± 0.353
AspXaa: 0.0 ± 0.0

Glu
GluAla: 6.524 ± 0.603
GluCys: 1.027 ± 0.304
GluAsp: 3.133 ± 0.347
GluGlu: 3.236 ± 0.407
GluPhe: 2.106 ± 0.331
GluGly: 2.928 ± 0.382
GluHis: 0.976 ± 0.264
GluIle: 2.877 ± 0.416
GluLys: 2.106 ± 0.295
GluLeu: 4.777 ± 0.56
GluMet: 1.798 ± 0.29
GluAsn: 1.798 ± 0.253
GluPro: 3.082 ± 0.516
GluGln: 2.928 ± 0.368
GluArg: 4.572 ± 0.53
GluSer: 2.928 ± 0.432
GluThr: 4.263 ± 0.497
GluVal: 4.777 ± 0.503
GluTrp: 1.079 ± 0.21
GluTyr: 1.695 ± 0.294
GluXaa: 0.0 ± 0.0

Phe
PheAla: 2.774 ± 0.397
PheCys: 0.257 ± 0.114
PheAsp: 3.236 ± 0.45
PheGlu: 1.387 ± 0.254
PhePhe: 0.565 ± 0.172
PheGly: 3.442 ± 0.549
PheHis: 0.616 ± 0.202
PheIle: 1.233 ± 0.357
PheLys: 1.233 ± 0.292
PheLeu: 1.952 ± 0.309
PheMet: 1.13 ± 0.268
PheAsn: 1.027 ± 0.265
PhePro: 1.336 ± 0.288
PheGln: 0.616 ± 0.259
PheArg: 1.284 ± 0.212
PheSer: 1.644 ± 0.215
PheThr: 2.055 ± 0.307
PheVal: 1.798 ± 0.262
PheTrp: 0.514 ± 0.208
PheTyr: 0.822 ± 0.197
PheXaa: 0.0 ± 0.0

Gly
GlyAla: 9.246 ± 1.054
GlyCys: 1.027 ± 0.222
GlyAsp: 5.753 ± 0.499
GlyGlu: 4.418 ± 0.431
GlyPhe: 2.26 ± 0.336
GlyGly: 10.222 ± 2.018
GlyHis: 1.798 ± 0.262
GlyIle: 4.109 ± 0.514
GlyLys: 2.466 ± 0.354
GlyLeu: 5.445 ± 0.504
GlyMet: 1.592 ± 0.363
GlyAsn: 2.62 ± 0.407
GlyPro: 4.007 ± 0.481
GlyGln: 2.62 ± 0.442
GlyArg: 5.599 ± 0.606
GlySer: 5.856 ± 0.796
GlyThr: 6.729 ± 0.767
GlyVal: 6.01 ± 0.559
GlyTrp: 2.568 ± 0.39
GlyTyr: 2.414 ± 0.409
GlyXaa: 0.0 ± 0.0

His
HisAla: 1.901 ± 0.302
HisCys: 0.514 ± 0.191
HisAsp: 1.079 ± 0.233
HisGlu: 1.336 ± 0.274
HisPhe: 0.822 ± 0.221
HisGly: 1.644 ± 0.257
HisHis: 1.13 ± 0.247
HisIle: 1.027 ± 0.234
HisLys: 1.027 ± 0.221
HisLeu: 1.49 ± 0.253
HisMet: 0.308 ± 0.115
HisAsn: 0.565 ± 0.138
HisPro: 1.541 ± 0.276
HisGln: 0.668 ± 0.177
HisArg: 2.106 ± 0.37
HisSer: 0.668 ± 0.165
HisThr: 1.284 ± 0.27
HisVal: 1.438 ± 0.309
HisTrp: 0.411 ± 0.119
HisTyr: 1.13 ± 0.224
HisXaa: 0.0 ± 0.0

Ile
IleAla: 5.496 ± 0.576
IleCys: 0.873 ± 0.23
IleAsp: 3.698 ± 0.458
IleGlu: 3.904 ± 0.422
IlePhe: 1.079 ± 0.226
IleGly: 3.287 ± 0.48
IleHis: 1.438 ± 0.269
IleIle: 1.541 ± 0.235
IleLys: 1.13 ± 0.236
IleLeu: 2.568 ± 0.342
IleMet: 0.565 ± 0.167
IleAsn: 1.746 ± 0.297
IlePro: 3.339 ± 0.391
IleGln: 1.438 ± 0.21
IleArg: 2.26 ± 0.359
IleSer: 2.26 ± 0.422
IleThr: 3.544 ± 0.466
IleVal: 3.133 ± 0.376
IleTrp: 0.719 ± 0.218
IleTyr: 0.719 ± 0.197
IleXaa: 0.0 ± 0.0

Lys
LysAla: 3.698 ± 0.433
LysCys: 0.514 ± 0.147
LysAsp: 1.592 ± 0.288
LysGlu: 1.233 ± 0.247
LysPhe: 1.13 ± 0.183
LysGly: 2.825 ± 0.338
LysHis: 1.079 ± 0.239
LysIle: 1.027 ± 0.278
LysLys: 1.592 ± 0.353
LysLeu: 2.517 ± 0.431
LysMet: 0.822 ± 0.153
LysAsn: 0.976 ± 0.188
LysPro: 3.287 ± 0.415
LysGln: 1.233 ± 0.233
LysArg: 2.517 ± 0.386
LysSer: 2.003 ± 0.278
LysThr: 2.825 ± 0.332
LysVal: 2.26 ± 0.357
LysTrp: 0.925 ± 0.2
LysTyr: 1.027 ± 0.233
LysXaa: 0.0 ± 0.0

Leu
LeuAla: 7.345 ± 0.681
LeuCys: 0.925 ± 0.257
LeuAsp: 4.828 ± 0.536
LeuGlu: 3.852 ± 0.456
LeuPhe: 2.106 ± 0.335
LeuGly: 6.164 ± 0.604
LeuHis: 0.565 ± 0.186
LeuIle: 3.647 ± 0.458
LeuLys: 2.466 ± 0.43
LeuLeu: 5.085 ± 0.564
LeuMet: 1.438 ± 0.277
LeuAsn: 2.517 ± 0.401
LeuPro: 4.931 ± 0.582
LeuGln: 3.031 ± 0.455
LeuArg: 5.342 ± 0.555
LeuSer: 4.828 ± 0.527
LeuThr: 5.393 ± 0.538
LeuVal: 5.753 ± 0.523
LeuTrp: 1.027 ± 0.281
LeuTyr: 1.49 ± 0.313
LeuXaa: 0.0 ± 0.0

Met
MetAla: 1.695 ± 0.337
MetCys: 0.205 ± 0.095
MetAsp: 1.438 ± 0.247
MetGlu: 1.181 ± 0.218
MetPhe: 1.027 ± 0.271
MetGly: 1.644 ± 0.269
MetHis: 0.36 ± 0.136
MetIle: 0.976 ± 0.238
MetLys: 0.514 ± 0.21
MetLeu: 1.695 ± 0.216
MetMet: 0.616 ± 0.194
MetAsn: 1.079 ± 0.214
MetPro: 1.592 ± 0.253
MetGln: 0.462 ± 0.14
MetArg: 1.438 ± 0.26
MetSer: 2.825 ± 0.384
MetThr: 2.106 ± 0.288
MetVal: 1.079 ± 0.3
MetTrp: 0.411 ± 0.134
MetTyr: 0.514 ± 0.165
MetXaa: 0.0 ± 0.0

Asn
AsnAla: 3.39 ± 0.45
AsnCys: 0.205 ± 0.114
AsnAsp: 1.644 ± 0.311
AsnGlu: 1.901 ± 0.292
AsnPhe: 0.514 ± 0.177
AsnGly: 3.493 ± 0.395
AsnHis: 0.77 ± 0.156
AsnIle: 1.13 ± 0.364
AsnLys: 1.13 ± 0.262
AsnLeu: 2.979 ± 0.372
AsnMet: 0.668 ± 0.146
AsnAsn: 1.849 ± 0.338
AsnPro: 2.568 ± 0.387
AsnGln: 1.079 ± 0.317
AsnArg: 2.055 ± 0.338
AsnSer: 1.387 ± 0.247
AsnThr: 2.517 ± 0.346
AsnVal: 1.901 ± 0.317
AsnTrp: 0.616 ± 0.146
AsnTyr: 0.514 ± 0.152
AsnXaa: 0.0 ± 0.0

Pro
ProAla: 5.804 ± 0.602
ProCys: 0.873 ± 0.192
ProAsp: 4.109 ± 0.566
ProGlu: 4.161 ± 0.477
ProPhe: 1.798 ± 0.324
ProGly: 6.575 ± 0.681
ProHis: 1.746 ± 0.292
ProIle: 2.209 ± 0.263
ProLys: 2.568 ± 0.43
ProLeu: 3.955 ± 0.49
ProMet: 1.541 ± 0.339
ProAsn: 1.849 ± 0.298
ProPro: 3.75 ± 0.509
ProGln: 2.466 ± 0.374
ProArg: 3.75 ± 0.585
ProSer: 2.774 ± 0.313
ProThr: 3.698 ± 0.491
ProVal: 4.777 ± 0.537
ProTrp: 0.77 ± 0.171
ProTyr: 1.695 ± 0.298
ProXaa: 0.0 ± 0.0

Gln
GlnAla: 4.623 ± 0.617
GlnCys: 0.308 ± 0.118
GlnAsp: 1.387 ± 0.279
GlnGlu: 1.746 ± 0.325
GlnPhe: 1.181 ± 0.2
GlnGly: 2.311 ± 0.441
GlnHis: 0.925 ± 0.205
GlnIle: 1.746 ± 0.302
GlnLys: 1.336 ± 0.218
GlnLeu: 3.596 ± 0.465
GlnMet: 0.668 ± 0.204
GlnAsn: 1.13 ± 0.268
GlnPro: 1.952 ± 0.321
GlnGln: 1.438 ± 0.352
GlnArg: 2.568 ± 0.345
GlnSer: 2.517 ± 0.3
GlnThr: 1.387 ± 0.203
GlnVal: 2.363 ± 0.344
GlnTrp: 0.976 ± 0.194
GlnTyr: 1.027 ± 0.276
GlnXaa: 0.0 ± 0.0

Arg
ArgAla: 6.267 ± 0.596
ArgCys: 1.438 ± 0.369
ArgAsp: 5.034 ± 0.573
ArgGlu: 4.674 ± 0.59
ArgPhe: 2.466 ± 0.336
ArgGly: 4.931 ± 0.443
ArgHis: 1.438 ± 0.333
ArgIle: 3.904 ± 0.441
ArgLys: 2.311 ± 0.338
ArgLeu: 4.931 ± 0.522
ArgMet: 2.517 ± 0.421
ArgAsn: 1.952 ± 0.317
ArgPro: 3.647 ± 0.481
ArgGln: 2.26 ± 0.343
ArgArg: 5.753 ± 0.705
ArgSer: 4.52 ± 0.38
ArgThr: 3.801 ± 0.554
ArgVal: 4.931 ± 0.566
ArgTrp: 1.746 ± 0.303
ArgTyr: 1.901 ± 0.327
ArgXaa: 0.0 ± 0.0

Ser
SerAla: 5.958 ± 0.86
SerCys: 0.514 ± 0.185
SerAsp: 4.007 ± 0.415
SerGlu: 2.928 ± 0.397
SerPhe: 1.952 ± 0.276
SerGly: 5.856 ± 0.699
SerHis: 0.976 ± 0.188
SerIle: 2.825 ± 0.353
SerLys: 2.157 ± 0.309
SerLeu: 3.596 ± 0.452
SerMet: 1.49 ± 0.291
SerAsn: 2.003 ± 0.433
SerPro: 3.75 ± 0.415
SerGln: 1.695 ± 0.247
SerArg: 4.572 ± 0.528
SerSer: 3.955 ± 0.593
SerThr: 3.698 ± 0.412
SerVal: 4.161 ± 0.415
SerTrp: 1.438 ± 0.261
SerTyr: 1.387 ± 0.237
SerXaa: 0.0 ± 0.0

Thr
ThrAla: 6.78 ± 0.577
ThrCys: 0.616 ± 0.175
ThrAsp: 3.801 ± 0.534
ThrGlu: 3.493 ± 0.378
ThrPhe: 1.387 ± 0.246
ThrGly: 6.267 ± 0.65
ThrHis: 1.541 ± 0.261
ThrIle: 3.236 ± 0.444
ThrLys: 2.209 ± 0.295
ThrLeu: 4.263 ± 0.51
ThrMet: 1.079 ± 0.203
ThrAsn: 2.311 ± 0.355
ThrPro: 4.88 ± 0.569
ThrGln: 2.26 ± 0.38
ThrArg: 4.161 ± 0.429
ThrSer: 3.955 ± 0.438
ThrThr: 5.137 ± 0.774
ThrVal: 5.958 ± 0.617
ThrTrp: 1.181 ± 0.249
ThrTyr: 2.209 ± 0.293
ThrXaa: 0.0 ± 0.0

Val
ValAla: 7.756 ± 0.7
ValCys: 1.181 ± 0.28
ValAsp: 5.239 ± 0.578
ValGlu: 4.418 ± 0.471
ValPhe: 2.106 ± 0.328
ValGly: 4.983 ± 0.587
ValHis: 1.336 ± 0.238
ValIle: 3.082 ± 0.476
ValLys: 2.877 ± 0.37
ValLeu: 5.188 ± 0.536
ValMet: 1.438 ± 0.255
ValAsn: 2.568 ± 0.311
ValPro: 4.212 ± 0.334
ValGln: 2.517 ± 0.366
ValArg: 4.726 ± 0.572
ValSer: 4.469 ± 0.573
ValThr: 5.085 ± 0.44
ValVal: 5.958 ± 0.69
ValTrp: 1.592 ± 0.269
ValTyr: 1.387 ± 0.29
ValXaa: 0.0 ± 0.0

Trp
TrpAla: 2.106 ± 0.266
TrpCys: 0.154 ± 0.071
TrpAsp: 1.49 ± 0.276
TrpGlu: 0.822 ± 0.233
TrpPhe: 0.668 ± 0.152
TrpGly: 0.976 ± 0.209
TrpHis: 0.514 ± 0.155
TrpIle: 1.079 ± 0.192
TrpLys: 0.719 ± 0.176
TrpLeu: 1.798 ± 0.345
TrpMet: 0.822 ± 0.197
TrpAsn: 0.514 ± 0.147
TrpPro: 1.181 ± 0.261
TrpGln: 0.77 ± 0.212
TrpArg: 1.901 ± 0.276
TrpSer: 1.387 ± 0.258
TrpThr: 1.438 ± 0.306
TrpVal: 1.541 ± 0.271
TrpTrp: 0.462 ± 0.154
TrpTyr: 0.668 ± 0.181
TrpXaa: 0.0 ± 0.0

Tyr
TyrAla: 2.209 ± 0.309
TyrCys: 0.411 ± 0.16
TyrAsp: 2.26 ± 0.365
TyrGlu: 1.849 ± 0.276
TyrPhe: 0.925 ± 0.201
TyrGly: 2.003 ± 0.448
TyrHis: 0.411 ± 0.105
TyrIle: 0.925 ± 0.235
TyrLys: 0.514 ± 0.179
TyrLeu: 1.952 ± 0.363
TyrMet: 0.411 ± 0.14
TyrAsn: 0.976 ± 0.202
TyrPro: 1.49 ± 0.274
TyrGln: 0.77 ± 0.172
TyrArg: 2.414 ± 0.306
TyrSer: 0.976 ± 0.228
TyrThr: 1.952 ± 0.303
TyrVal: 2.157 ± 0.267
TyrTrp: 0.616 ± 0.17
TyrTyr: 0.616 ± 0.117
TyrXaa: 0.0 ± 0.0

Xaa
XaaAla: 0.0 ± 0.0
XaaCys: 0.0 ± 0.0
XaaAsp: 0.0 ± 0.0
XaaGlu: 0.0 ± 0.0
XaaPhe: 0.0 ± 0.0
XaaGly: 0.0 ± 0.0
XaaHis: 0.0 ± 0.0
XaaIle: 0.0 ± 0.0
XaaLys: 0.0 ± 0.0
XaaLeu: 0.0 ± 0.0
XaaMet: 0.0 ± 0.0
XaaAsn: 0.0 ± 0.0
XaaPro: 0.0 ± 0.0
XaaGln: 0.0 ± 0.0
XaaArg: 0.0 ± 0.0
XaaSer: 0.0 ± 0.0
XaaThr: 0.0 ± 0.0
XaaVal: 0.0 ± 0.0
XaaTrp: 0.0 ± 0.0
XaaTyr: 0.0 ± 0.0
XaaXaa: 0.0 ± 0.0

Statistics based on 112 proteins (19469 amino acids)

Note: The error was estimated by bootstrapping (×100) at the protein level.
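A protein-level bootstrap of this kind can be sketched as follows. The page states only that bootstrapping was done 100 times at the protein level; resampling whole proteins with replacement, keeping the proteome size fixed, and reporting the standard deviation of the resampled frequencies are assumptions made for illustration. The names `dipeptide_counts` and `bootstrap_error` are hypothetical.

```python
import random
import statistics
from collections import Counter


def dipeptide_counts(seq):
    """Count overlapping dipeptides in a single protein sequence."""
    return Counter(seq[i:i + 2] for i in range(len(seq) - 1))


def bootstrap_error(proteins, dipeptide, n_rounds=100, seed=0):
    """Mean and standard deviation (per mille) over protein-level resamples."""
    rng = random.Random(seed)
    per_protein = [dipeptide_counts(p) for p in proteins]
    estimates = []
    for _ in range(n_rounds):
        # Resample whole proteins with replacement, keeping the proteome size.
        sample = rng.choices(per_protein, k=len(per_protein))
        total = sum(sum(c.values()) for c in sample)
        hits = sum(c[dipeptide] for c in sample)
        estimates.append(1000.0 * hits / total if total else 0.0)
    # Assumption: the spread of the resampled frequencies serves as the error.
    return statistics.mean(estimates), statistics.stdev(estimates)


toy_proteins = ["MAAGA", "MGAA", "AAAA", "MKLV"]  # made-up sequences
mean_aa, err_aa = bootstrap_error(toy_proteins, "AA")
```

Resampling proteins rather than individual residues keeps each sequence intact, so the estimate reflects variation between proteins. A fixed `seed` makes the sketch reproducible.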

The dipeptide statistics above (among other statistics for this proteome) can be downloaded from this CSV file
See this proteome in: uniprot_link
Proteome-pI is available under Creative Commons Attribution-NoDerivs license, for more details see here

Reference: Kozlowski LP. Proteome-pI 2.0: Proteome Isoelectric Point Database Update. Nucleic Acids Res. 2021, doi: 10.1093/nar/gkab944
Contact: Lukasz P. Kozlowski