Amino acid dipeptide frequency for Mycobacterium phage Nerujay

Apart from single amino acid frequencies, one can also calculate the so-called amino acid dipeptide frequency, i.e. how often each ordered pair of adjacent amino acids occurs.
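As a rough illustration of the calculation, the sketch below counts overlapping pairs of adjacent residues in a list of protein sequences. It is a minimal sketch and not the pipeline used for this page: the function name dipeptide_permille, the one-letter codes, the toy sequences, and the normalization by the total number of counted pairs are all assumptions made for the example.

```python
from collections import Counter


def dipeptide_permille(proteins):
    """Return {dipeptide: frequency in per mille} for a list of sequences.

    Assumption: overlapping pairs are counted and the result is normalized
    by the total number of pairs; the source page may normalize differently.
    """
    counts = Counter()
    for seq in proteins:
        # Each sequence is scanned separately, so no pair spans two proteins.
        for i in range(len(seq) - 1):
            counts[seq[i:i + 2]] += 1
    total = sum(counts.values())
    if total == 0:
        return {}
    return {pair: 1000.0 * n / total for pair, n in counts.items()}


# Hypothetical toy input (one-letter codes), not taken from the proteome.
toy = ["MAAG", "AGA"]
freqs = dipeptide_permille(toy)
print(freqs["AG"])
```

The table on this page uses three-letter codes (for example AlaGly for AG), so a real script would need a mapping between the two notations.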

There are 400 possible dipeptides (441 if Xaa is counted as an additional 21st amino acid representing all non-standard or unknown amino acids). Thus, if all 400 dipeptides were equally likely in proteins, each would occur with a frequency of 0.25% (2.5‰). As this is not the case in nature, for better visibility all dipeptides that are more frequent than expected are marked in red, and those that are underrepresented are marked in blue in the table.
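The uniform baseline described above can be sketched as follows. The function names and the labels "over" and "under" are illustrative assumptions, and the strict comparison against 2.5‰ is only one plausible reading of how the red and blue marking is decided.

```python
def expected_permille(alphabet_size=20):
    """Uniform expectation in per mille: 1000 / (number of possible dipeptides)."""
    return 1000.0 / alphabet_size ** 2


def representation(observed_permille, alphabet_size=20):
    """Label a frequency relative to the uniform expectation (assumed rule)."""
    expected = expected_permille(alphabet_size)
    if observed_permille > expected:
        return "over"    # would be marked red
    if observed_permille < expected:
        return "under"   # would be marked blue
    return "expected"


print(expected_permille())     # 20 amino acids, 400 dipeptides
print(representation(12.796))  # AlaAla value from the table
print(representation(0.06))    # CysCys value from the table
```

With Xaa included, expected_permille(21) gives the baseline for 441 possible dipeptides instead of 400.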

All values are given in per mille (‰), so they need to be multiplied by 10⁻³ to obtain fractions.

For more information, see the sequence space article on Wikipedia.

Ala Cys Asp Glu Phe Gly His Ile Lys Leu Met Asn Pro Gln Arg Ser Thr Val Trp Tyr Xaa

Ala
AlaAla: 12.796 ± 1.334
AlaCys: 0.837 ± 0.226
AlaAsp: 7.414 ± 0.703
AlaGlu: 6.398 ± 0.732
AlaPhe: 2.93 ± 0.534
AlaGly: 8.072 ± 0.893
AlaHis: 1.136 ± 0.306
AlaIle: 4.305 ± 0.594
AlaLys: 4.305 ± 0.52
AlaLeu: 8.79 ± 0.892
AlaMet: 2.691 ± 0.479
AlaAsn: 2.332 ± 0.372
AlaPro: 4.544 ± 0.611
AlaGln: 2.87 ± 0.465
AlaArg: 6.159 ± 0.549
AlaSer: 4.664 ± 0.473
AlaThr: 5.979 ± 0.71
AlaVal: 8.072 ± 0.825
AlaTrp: 2.212 ± 0.576
AlaTyr: 2.452 ± 0.338
AlaXaa: 0.0 ± 0.0

Cys
CysAla: 0.837 ± 0.252
CysCys: 0.06 ± 0.055
CysAsp: 0.478 ± 0.165
CysGlu: 0.658 ± 0.202
CysPhe: 0.12 ± 0.083
CysGly: 0.777 ± 0.257
CysHis: 0.12 ± 0.092
CysIle: 0.299 ± 0.143
CysLys: 0.299 ± 0.151
CysLeu: 0.598 ± 0.205
CysMet: 0.179 ± 0.111
CysAsn: 0.179 ± 0.111
CysPro: 0.299 ± 0.137
CysGln: 0.179 ± 0.106
CysArg: 0.658 ± 0.18
CysSer: 0.478 ± 0.175
CysThr: 0.239 ± 0.112
CysVal: 0.359 ± 0.125
CysTrp: 0.179 ± 0.102
CysTyr: 0.12 ± 0.077
CysXaa: 0.0 ± 0.0

Asp
AspAla: 6.039 ± 0.577
AspCys: 0.658 ± 0.206
AspAsp: 5.023 ± 0.524
AspGlu: 3.767 ± 0.532
AspPhe: 2.212 ± 0.418
AspGly: 6.577 ± 0.609
AspHis: 1.136 ± 0.275
AspIle: 2.81 ± 0.413
AspLys: 2.751 ± 0.42
AspLeu: 6.876 ± 0.77
AspMet: 1.256 ± 0.261
AspAsn: 1.973 ± 0.374
AspPro: 4.963 ± 0.583
AspGln: 1.674 ± 0.361
AspArg: 3.528 ± 0.397
AspSer: 3.229 ± 0.49
AspThr: 3.946 ± 0.44
AspVal: 4.784 ± 0.592
AspTrp: 1.674 ± 0.32
AspTyr: 2.153 ± 0.361
AspXaa: 0.0 ± 0.0

Glu
GluAla: 5.74 ± 0.703
GluCys: 0.299 ± 0.127
GluAsp: 5.023 ± 0.514
GluGlu: 5.083 ± 0.617
GluPhe: 1.973 ± 0.347
GluGly: 3.707 ± 0.588
GluHis: 1.435 ± 0.36
GluIle: 3.468 ± 0.469
GluLys: 2.452 ± 0.436
GluLeu: 7.116 ± 0.588
GluMet: 1.794 ± 0.311
GluAsn: 1.734 ± 0.384
GluPro: 2.332 ± 0.384
GluGln: 2.631 ± 0.418
GluArg: 4.245 ± 0.532
GluSer: 3.588 ± 0.362
GluThr: 3.647 ± 0.442
GluVal: 5.381 ± 0.665
GluTrp: 1.435 ± 0.345
GluTyr: 2.272 ± 0.474
GluXaa: 0.0 ± 0.0

Phe
PheAla: 2.571 ± 0.402
PheCys: 0.299 ± 0.14
PheAsp: 2.571 ± 0.348
PheGlu: 2.033 ± 0.313
PhePhe: 0.598 ± 0.194
PheGly: 3.647 ± 0.487
PheHis: 0.718 ± 0.235
PheIle: 1.136 ± 0.252
PheLys: 1.375 ± 0.251
PheLeu: 2.392 ± 0.472
PheMet: 0.598 ± 0.208
PheAsn: 1.435 ± 0.332
PhePro: 1.674 ± 0.305
PheGln: 0.837 ± 0.231
PheArg: 1.973 ± 0.349
PheSer: 1.555 ± 0.325
PheThr: 2.093 ± 0.333
PheVal: 2.093 ± 0.375
PheTrp: 0.478 ± 0.169
PheTyr: 0.837 ± 0.235
PheXaa: 0.0 ± 0.0

Gly
GlyAla: 7.474 ± 1.23
GlyCys: 0.718 ± 0.223
GlyAsp: 5.74 ± 0.513
GlyGlu: 4.186 ± 0.455
GlyPhe: 3.169 ± 0.539
GlyGly: 10.643 ± 2.801
GlyHis: 2.093 ± 0.416
GlyIle: 4.784 ± 0.849
GlyLys: 3.767 ± 0.458
GlyLeu: 7.056 ± 0.716
GlyMet: 2.093 ± 0.36
GlyAsn: 3.289 ± 0.494
GlyPro: 3.946 ± 0.602
GlyGln: 2.392 ± 0.372
GlyArg: 4.843 ± 0.597
GlySer: 6.757 ± 0.836
GlyThr: 4.784 ± 0.627
GlyVal: 5.561 ± 0.504
GlyTrp: 3.05 ± 0.395
GlyTyr: 2.511 ± 0.365
GlyXaa: 0.0 ± 0.0

His
HisAla: 1.674 ± 0.337
HisCys: 0.179 ± 0.163
HisAsp: 1.196 ± 0.235
HisGlu: 1.495 ± 0.287
HisPhe: 0.658 ± 0.18
HisGly: 1.495 ± 0.389
HisHis: 0.718 ± 0.227
HisIle: 0.957 ± 0.197
HisLys: 0.957 ± 0.281
HisLeu: 1.136 ± 0.27
HisMet: 0.239 ± 0.133
HisAsn: 0.239 ± 0.108
HisPro: 1.375 ± 0.244
HisGln: 0.957 ± 0.256
HisArg: 1.555 ± 0.351
HisSer: 0.718 ± 0.202
HisThr: 0.897 ± 0.235
HisVal: 1.555 ± 0.412
HisTrp: 0.478 ± 0.167
HisTyr: 0.478 ± 0.184
HisXaa: 0.0 ± 0.0

Ile
IleAla: 6.518 ± 0.686
IleCys: 0.239 ± 0.114
IleAsp: 3.229 ± 0.373
IleGlu: 3.887 ± 0.382
IlePhe: 0.957 ± 0.249
IleGly: 4.066 ± 0.567
IleHis: 0.897 ± 0.236
IleIle: 1.674 ± 0.301
IleLys: 1.973 ± 0.317
IleLeu: 3.109 ± 0.327
IleMet: 0.837 ± 0.203
IleAsn: 1.794 ± 0.353
IlePro: 2.99 ± 0.364
IleGln: 1.256 ± 0.272
IleArg: 3.468 ± 0.492
IleSer: 3.109 ± 0.452
IleThr: 3.588 ± 0.513
IleVal: 2.751 ± 0.533
IleTrp: 0.718 ± 0.186
IleTyr: 1.495 ± 0.293
IleXaa: 0.0 ± 0.0

Lys
LysAla: 3.767 ± 0.501
LysCys: 0.239 ± 0.11
LysAsp: 2.452 ± 0.499
LysGlu: 2.093 ± 0.376
LysPhe: 1.555 ± 0.317
LysGly: 2.631 ± 0.388
LysHis: 1.136 ± 0.326
LysIle: 2.212 ± 0.347
LysLys: 1.794 ± 0.437
LysLeu: 2.99 ± 0.471
LysMet: 1.017 ± 0.207
LysAsn: 1.435 ± 0.287
LysPro: 2.99 ± 0.412
LysGln: 2.093 ± 0.413
LysArg: 2.93 ± 0.49
LysSer: 2.631 ± 0.348
LysThr: 2.511 ± 0.359
LysVal: 3.109 ± 0.456
LysTrp: 0.718 ± 0.206
LysTyr: 1.076 ± 0.296
LysXaa: 0.0 ± 0.0

Leu
LeuAla: 9.507 ± 0.859
LeuCys: 0.299 ± 0.127
LeuAsp: 6.577 ± 0.694
LeuGlu: 4.963 ± 0.584
LeuPhe: 2.153 ± 0.341
LeuGly: 7.116 ± 0.762
LeuHis: 1.375 ± 0.324
LeuIle: 4.485 ± 0.555
LeuLys: 4.006 ± 0.573
LeuLeu: 5.979 ± 0.701
LeuMet: 1.555 ± 0.275
LeuAsn: 2.87 ± 0.471
LeuPro: 5.68 ± 0.708
LeuGln: 2.81 ± 0.482
LeuArg: 5.74 ± 0.564
LeuSer: 5.501 ± 0.554
LeuThr: 5.561 ± 0.439
LeuVal: 4.843 ± 0.6
LeuTrp: 0.957 ± 0.283
LeuTyr: 2.212 ± 0.389
LeuXaa: 0.0 ± 0.0

Met
MetAla: 2.511 ± 0.34
MetCys: 0.0 ± 0.0
MetAsp: 1.196 ± 0.298
MetGlu: 1.375 ± 0.277
MetPhe: 0.658 ± 0.179
MetGly: 1.674 ± 0.298
MetHis: 0.239 ± 0.114
MetIle: 0.718 ± 0.24
MetLys: 1.076 ± 0.264
MetLeu: 1.375 ± 0.318
MetMet: 0.179 ± 0.103
MetAsn: 1.136 ± 0.214
MetPro: 1.017 ± 0.231
MetGln: 0.478 ± 0.152
MetArg: 1.315 ± 0.341
MetSer: 2.153 ± 0.382
MetThr: 2.153 ± 0.331
MetVal: 1.256 ± 0.255
MetTrp: 0.239 ± 0.104
MetTyr: 0.419 ± 0.146
MetXaa: 0.0 ± 0.0

Asn
AsnAla: 3.348 ± 0.495
AsnCys: 0.06 ± 0.057
AsnAsp: 1.973 ± 0.368
AsnGlu: 1.913 ± 0.355
AsnPhe: 0.837 ± 0.251
AsnGly: 3.647 ± 0.473
AsnHis: 0.777 ± 0.22
AsnIle: 1.555 ± 0.311
AsnLys: 0.598 ± 0.226
AsnLeu: 2.691 ± 0.354
AsnMet: 0.538 ± 0.172
AsnAsn: 0.718 ± 0.213
AsnPro: 2.81 ± 0.344
AsnGln: 0.897 ± 0.21
AsnArg: 1.315 ± 0.305
AsnSer: 2.093 ± 0.449
AsnThr: 1.734 ± 0.31
AsnVal: 2.87 ± 0.406
AsnTrp: 0.718 ± 0.167
AsnTyr: 1.196 ± 0.283
AsnXaa: 0.0 ± 0.0

Pro
ProAla: 4.604 ± 0.607
ProCys: 0.299 ± 0.132
ProAsp: 4.485 ± 0.615
ProGlu: 4.724 ± 0.554
ProPhe: 2.093 ± 0.327
ProGly: 5.142 ± 0.6
ProHis: 0.897 ± 0.255
ProIle: 2.332 ± 0.416
ProLys: 2.153 ± 0.275
ProLeu: 4.245 ± 0.548
ProMet: 1.017 ± 0.235
ProAsn: 1.435 ± 0.284
ProPro: 2.81 ± 0.471
ProGln: 1.435 ± 0.319
ProArg: 2.99 ± 0.471
ProSer: 4.126 ± 0.45
ProThr: 3.588 ± 0.505
ProVal: 4.006 ± 0.588
ProTrp: 0.718 ± 0.304
ProTyr: 1.614 ± 0.33
ProXaa: 0.0 ± 0.0

Gln
GlnAla: 3.229 ± 0.43
GlnCys: 0.179 ± 0.103
GlnAsp: 1.375 ± 0.38
GlnGlu: 1.614 ± 0.281
GlnPhe: 1.136 ± 0.259
GlnGly: 2.332 ± 0.32
GlnHis: 0.538 ± 0.164
GlnIle: 2.81 ± 0.483
GlnLys: 1.375 ± 0.302
GlnLeu: 3.588 ± 0.497
GlnMet: 0.957 ± 0.222
GlnAsn: 0.538 ± 0.173
GlnPro: 1.734 ± 0.293
GlnGln: 1.734 ± 0.335
GlnArg: 1.854 ± 0.362
GlnSer: 1.854 ± 0.316
GlnThr: 1.854 ± 0.3
GlnVal: 2.212 ± 0.362
GlnTrp: 0.837 ± 0.189
GlnTyr: 0.598 ± 0.174
GlnXaa: 0.0 ± 0.0

Arg
ArgAla: 4.843 ± 0.597
ArgCys: 1.017 ± 0.291
ArgAsp: 3.169 ± 0.444
ArgGlu: 4.724 ± 0.682
ArgPhe: 1.973 ± 0.356
ArgGly: 5.083 ± 0.723
ArgHis: 0.837 ± 0.225
ArgIle: 3.229 ± 0.572
ArgLys: 3.05 ± 0.565
ArgLeu: 5.561 ± 0.66
ArgMet: 1.854 ± 0.33
ArgAsn: 2.272 ± 0.424
ArgPro: 2.751 ± 0.424
ArgGln: 1.913 ± 0.339
ArgArg: 5.322 ± 0.735
ArgSer: 4.126 ± 0.636
ArgThr: 3.109 ± 0.517
ArgVal: 5.142 ± 0.557
ArgTrp: 1.495 ± 0.325
ArgTyr: 2.033 ± 0.299
ArgXaa: 0.0 ± 0.0

Ser
SerAla: 7.235 ± 1.323
SerCys: 0.598 ± 0.202
SerAsp: 3.109 ± 0.398
SerGlu: 4.186 ± 0.444
SerPhe: 2.093 ± 0.404
SerGly: 6.338 ± 0.717
SerHis: 1.435 ± 0.306
SerIle: 2.631 ± 0.441
SerLys: 2.153 ± 0.285
SerLeu: 4.903 ± 0.556
SerMet: 1.375 ± 0.255
SerAsn: 2.392 ± 0.415
SerPro: 3.169 ± 0.465
SerGln: 2.093 ± 0.256
SerArg: 3.528 ± 0.545
SerSer: 3.707 ± 0.777
SerThr: 3.468 ± 0.439
SerVal: 4.006 ± 0.542
SerTrp: 1.495 ± 0.334
SerTyr: 1.674 ± 0.3
SerXaa: 0.0 ± 0.0

Thr
ThrAla: 5.083 ± 0.599
ThrCys: 0.359 ± 0.143
ThrAsp: 4.186 ± 0.57
ThrGlu: 4.305 ± 0.605
ThrPhe: 2.093 ± 0.374
ThrGly: 6.697 ± 0.597
ThrHis: 0.957 ± 0.223
ThrIle: 2.452 ± 0.575
ThrLys: 2.631 ± 0.39
ThrLeu: 5.68 ± 0.565
ThrMet: 0.897 ± 0.2
ThrAsn: 1.973 ± 0.379
ThrPro: 3.827 ± 0.467
ThrGln: 1.435 ± 0.255
ThrArg: 3.289 ± 0.545
ThrSer: 3.707 ± 0.59
ThrThr: 3.946 ± 0.532
ThrVal: 5.501 ± 0.716
ThrTrp: 1.076 ± 0.28
ThrTyr: 1.854 ± 0.287
ThrXaa: 0.0 ± 0.0

Val
ValAla: 6.876 ± 0.589
ValCys: 0.299 ± 0.138
ValAsp: 5.262 ± 0.566
ValGlu: 4.664 ± 0.546
ValPhe: 2.332 ± 0.369
ValGly: 5.083 ± 0.694
ValHis: 1.495 ± 0.254
ValIle: 3.827 ± 0.535
ValLys: 2.99 ± 0.439
ValLeu: 5.441 ± 0.509
ValMet: 1.136 ± 0.298
ValAsn: 2.511 ± 0.309
ValPro: 3.647 ± 0.47
ValGln: 2.571 ± 0.398
ValArg: 5.322 ± 0.689
ValSer: 4.843 ± 0.44
ValThr: 5.501 ± 0.565
ValVal: 5.202 ± 0.777
ValTrp: 1.256 ± 0.252
ValTyr: 2.272 ± 0.422
ValXaa: 0.0 ± 0.0

Trp
TrpAla: 1.435 ± 0.286
TrpCys: 0.299 ± 0.12
TrpAsp: 1.375 ± 0.258
TrpGlu: 1.076 ± 0.212
TrpPhe: 0.897 ± 0.199
TrpGly: 1.794 ± 0.351
TrpHis: 0.419 ± 0.153
TrpIle: 1.076 ± 0.225
TrpLys: 0.299 ± 0.184
TrpLeu: 2.093 ± 0.326
TrpMet: 0.419 ± 0.178
TrpAsn: 0.658 ± 0.244
TrpPro: 0.897 ± 0.243
TrpGln: 1.076 ± 0.204
TrpArg: 1.256 ± 0.282
TrpSer: 1.315 ± 0.471
TrpThr: 1.495 ± 0.332
TrpVal: 2.033 ± 0.352
TrpTrp: 0.538 ± 0.208
TrpTyr: 0.239 ± 0.12
TrpXaa: 0.0 ± 0.0

Tyr
TyrAla: 2.87 ± 0.458
TyrCys: 0.179 ± 0.117
TyrAsp: 1.196 ± 0.285
TyrGlu: 2.212 ± 0.305
TyrPhe: 0.478 ± 0.148
TyrGly: 2.332 ± 0.367
TyrHis: 0.598 ± 0.21
TyrIle: 1.794 ± 0.353
TyrLys: 1.375 ± 0.323
TyrLeu: 2.571 ± 0.373
TyrMet: 0.478 ± 0.14
TyrAsn: 1.375 ± 0.32
TyrPro: 1.315 ± 0.275
TyrGln: 1.017 ± 0.294
TyrArg: 2.212 ± 0.37
TyrSer: 1.435 ± 0.274
TyrThr: 1.854 ± 0.378
TyrVal: 1.794 ± 0.341
TyrTrp: 0.419 ± 0.176
TyrTyr: 0.419 ± 0.153
TyrXaa: 0.0 ± 0.0

Xaa
XaaAla: 0.0 ± 0.0
XaaCys: 0.0 ± 0.0
XaaAsp: 0.0 ± 0.0
XaaGlu: 0.0 ± 0.0
XaaPhe: 0.0 ± 0.0
XaaGly: 0.0 ± 0.0
XaaHis: 0.0 ± 0.0
XaaIle: 0.0 ± 0.0
XaaLys: 0.0 ± 0.0
XaaLeu: 0.0 ± 0.0
XaaMet: 0.0 ± 0.0
XaaAsn: 0.0 ± 0.0
XaaPro: 0.0 ± 0.0
XaaGln: 0.0 ± 0.0
XaaArg: 0.0 ± 0.0
XaaSer: 0.0 ± 0.0
XaaThr: 0.0 ± 0.0
XaaVal: 0.0 ± 0.0
XaaTrp: 0.0 ± 0.0
XaaTyr: 0.0 ± 0.0
XaaXaa: 0.0 ± 0.0

Statistics based on 97 proteins (16725 amino acids)

Note: The error was estimated by bootstrapping (x100) at the protein level.
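Protein-level bootstrapping can be sketched as below: whole proteins are resampled with replacement, the frequency is recomputed on each resample, and the spread of the estimates is reported. This is a minimal sketch under stated assumptions, not the code behind the table. The helper names, the toy sequences, the fixed seed, and the use of the standard deviation as the reported error are all hypothetical choices; only the 100 rounds and the protein-level resampling come from the note above.

```python
import random
import statistics


def pair_permille(proteins, pair):
    """Frequency of one dipeptide in per mille, counting overlapping pairs."""
    hits = total = 0
    for seq in proteins:
        for i in range(len(seq) - 1):
            total += 1
            hits += seq[i:i + 2] == pair  # True counts as 1
    return 1000.0 * hits / total if total else 0.0


def bootstrap_error(proteins, pair, rounds=100, seed=0):
    """Standard deviation of the frequency over protein-level resamples."""
    rng = random.Random(seed)  # fixed seed only to make the sketch repeatable
    estimates = []
    for _ in range(rounds):
        # Resample whole proteins with replacement, not individual residues.
        sample = rng.choices(proteins, k=len(proteins))
        estimates.append(pair_permille(sample, pair))
    return statistics.stdev(estimates)


# Hypothetical toy proteome (one-letter codes).
toy = ["MAAG", "AGA", "GGAA", "MKAG"]
print(pair_permille(toy, "AG"), bootstrap_error(toy, "AG"))
```

Resampling whole proteins rather than single residues keeps each sequence intact, so adjacent pairs within a protein stay together in every replicate.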

The dipeptide statistics above (among other statistics for this proteome) can be downloaded from this CSV file
See this proteome in: uniprot_link
Proteome-pI is available under the Creative Commons Attribution-NoDerivs license; for more details see here

Reference: Kozlowski LP. Proteome-pI 2.0: Proteome Isoelectric Point Database Update. Nucleic Acids Res. 2021, doi: 10.1093/nar/gkab944
Contact: Lukasz P. Kozlowski