Amino acid dipeptide frequency for Mycobacterium virus Joedirt

Apart from single amino acid frequencies, one can also calculate the so-called amino acid dipeptide frequency, that is, how often each ordered pair of adjacent amino acids occurs.
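A minimal sketch of how such frequencies could be computed is given here. It assumes that dipeptides are counted as overlapping adjacent pairs within each protein, that pairs never span two proteins, and that counts are pooled over the whole proteome; the page does not state the exact procedure. The function name dipeptide_per_mille and the toy sequences (in one-letter code) are hypothetical.

```python
from collections import Counter

def dipeptide_per_mille(proteins):
    """Return {dipeptide: frequency in per mille} pooled over all proteins.

    Assumption: overlapping adjacent pairs are counted inside each
    protein, and no pair spans the boundary between two proteins.
    """
    counts = Counter()
    for seq in proteins:
        counts.update(seq[i:i + 2] for i in range(len(seq) - 1))
    total = sum(counts.values())
    if total == 0:
        return {}
    return {pair: 1000.0 * n / total for pair, n in counts.items()}

# Hypothetical toy proteome, not the real Joedirt sequences.
toy_proteome = ["MAAK", "MAL"]
freqs = dipeptide_per_mille(toy_proteome)
for pair, value in sorted(freqs.items()):
    print(f"{pair}: {value:.1f}")
```

The toy proteome contains five dipeptides in total (MA, AA, AK, MA, AL), so MA accounts for two fifths of them.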

There are 400 possible dipeptides (441 if Xaa is counted as an additional 21st amino acid representing all non-standard or unknown amino acids). Thus, if dipeptides occurred in proteins at random with equal probability, each of the 400 would be present at a frequency of 0.25% (2.5‰). As this is not the case in nature, for better visibility, dipeptides that occur more often than expected are marked in red and those that are underrepresented are marked in blue in the table.
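The arithmetic behind the expected value can be sketched as follows. The sketch assumes that the colouring is a plain comparison against the uniform expectation; whether the page uses a strict comparison or takes the error estimate into account is not stated. The function name classify and the selection of example values (taken from the table) are illustrative only.

```python
N_STANDARD = 20   # standard amino acids
N_WITH_XAA = 21   # standard amino acids plus Xaa

# Uniform expectation in per mille: 1000 / 400 = 2.5 (that is, 0.25%).
expected_per_mille = 1000.0 / N_STANDARD ** 2
# With Xaa included there are 441 pairs, so the expectation is lower.
expected_with_xaa = 1000.0 / N_WITH_XAA ** 2

def classify(observed_per_mille, expected=expected_per_mille):
    """Return the colour assumed for a cell: red above, blue below."""
    if observed_per_mille > expected:
        return "red"
    if observed_per_mille < expected:
        return "blue"
    return "neutral"

# A few values copied from the table (per mille).
examples = {"AlaAla": 12.129, "AspSer": 3.516, "CysTrp": 0.044}
labels = {pair: classify(value) for pair, value in examples.items()}
print(expected_per_mille, labels)
```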

All values are given in per mille (‰), so they need to be multiplied by 10^-3 to obtain fractions.

For more information, see the sequence space article on Wikipedia.

Ala Cys Asp Glu Phe Gly His Ile Lys Leu Met Asn Pro Gln Arg Ser Thr Val Trp Tyr Xaa
Ala
AlaAla: 12.129 ± 1.016
AlaCys: 1.055 ± 0.237
AlaAsp: 6.372 ± 0.604
AlaGlu: 7.075 ± 0.849
AlaPhe: 3.164 ± 0.358
AlaGly: 7.998 ± 0.75
AlaHis: 2.065 ± 0.328
AlaIle: 4.966 ± 0.495
AlaLys: 7.866 ± 0.535
AlaLeu: 9.097 ± 0.789
AlaMet: 2.637 ± 0.301
AlaAsn: 3.603 ± 0.474
AlaPro: 4.307 ± 0.469
AlaGln: 4.263 ± 0.575
AlaArg: 5.493 ± 0.518
AlaSer: 5.01 ± 0.588
AlaThr: 5.713 ± 0.483
AlaVal: 6.372 ± 0.628
AlaTrp: 1.538 ± 0.245
AlaTyr: 2.944 ± 0.423
AlaXaa: 0.0 ± 0.0
Cys
CysAla: 1.23 ± 0.32
CysCys: 0.176 ± 0.094
CysAsp: 0.791 ± 0.231
CysGlu: 0.659 ± 0.173
CysPhe: 0.264 ± 0.098
CysGly: 1.274 ± 0.296
CysHis: 0.264 ± 0.097
CysIle: 0.396 ± 0.15
CysLys: 0.308 ± 0.114
CysLeu: 0.967 ± 0.201
CysMet: 0.264 ± 0.118
CysAsn: 0.352 ± 0.125
CysPro: 0.659 ± 0.175
CysGln: 0.527 ± 0.16
CysArg: 0.835 ± 0.249
CysSer: 0.396 ± 0.137
CysThr: 0.527 ± 0.166
CysVal: 0.615 ± 0.191
CysTrp: 0.044 ± 0.052
CysTyr: 0.308 ± 0.104
CysXaa: 0.0 ± 0.0
Asp
AspAla: 5.713 ± 0.501
AspCys: 0.571 ± 0.174
AspAsp: 4.219 ± 0.49
AspGlu: 4.702 ± 0.487
AspPhe: 1.802 ± 0.25
AspGly: 5.493 ± 0.475
AspHis: 0.967 ± 0.176
AspIle: 2.593 ± 0.352
AspLys: 3.428 ± 0.427
AspLeu: 5.449 ± 0.588
AspMet: 1.626 ± 0.296
AspAsn: 1.802 ± 0.288
AspPro: 3.34 ± 0.31
AspGln: 2.109 ± 0.34
AspArg: 3.56 ± 0.446
AspSer: 3.516 ± 0.439
AspThr: 3.208 ± 0.382
AspVal: 4.614 ± 0.49
AspTrp: 1.45 ± 0.263
AspTyr: 2.329 ± 0.332
AspXaa: 0.0 ± 0.0
Glu
GluAla: 6.724 ± 0.63
GluCys: 0.879 ± 0.211
GluAsp: 3.779 ± 0.39
GluGlu: 3.032 ± 0.344
GluPhe: 2.461 ± 0.39
GluGly: 4.482 ± 0.519
GluHis: 1.978 ± 0.284
GluIle: 2.856 ± 0.354
GluLys: 2.812 ± 0.543
GluLeu: 8.042 ± 0.729
GluMet: 1.802 ± 0.365
GluAsn: 2.153 ± 0.27
GluPro: 2.9 ± 0.443
GluGln: 2.021 ± 0.344
GluArg: 5.098 ± 0.538
GluSer: 2.461 ± 0.367
GluThr: 3.647 ± 0.482
GluVal: 3.779 ± 0.396
GluTrp: 1.099 ± 0.205
GluTyr: 2.241 ± 0.298
GluXaa: 0.0 ± 0.0
Phe
PheAla: 3.12 ± 0.295
PheCys: 0.439 ± 0.131
PheAsp: 1.89 ± 0.302
PheGlu: 2.065 ± 0.294
PhePhe: 0.923 ± 0.201
PheGly: 2.505 ± 0.349
PheHis: 0.703 ± 0.189
PheIle: 1.802 ± 0.275
PheLys: 1.23 ± 0.3
PheLeu: 2.241 ± 0.312
PheMet: 0.835 ± 0.166
PheAsn: 1.406 ± 0.217
PhePro: 1.45 ± 0.245
PheGln: 0.967 ± 0.175
PheArg: 1.978 ± 0.309
PheSer: 1.67 ± 0.237
PheThr: 1.758 ± 0.229
PheVal: 2.285 ± 0.287
PheTrp: 0.615 ± 0.152
PheTyr: 1.45 ± 0.224
PheXaa: 0.0 ± 0.0
Gly
GlyAla: 6.811 ± 0.891
GlyCys: 0.747 ± 0.223
GlyAsp: 5.317 ± 0.467
GlyGlu: 5.493 ± 0.403
GlyPhe: 2.812 ± 0.357
GlyGly: 7.91 ± 1.28
GlyHis: 1.714 ± 0.302
GlyIle: 3.56 ± 0.403
GlyLys: 5.185 ± 0.488
GlyLeu: 6.24 ± 0.608
GlyMet: 1.934 ± 0.285
GlyAsn: 3.032 ± 0.463
GlyPro: 3.911 ± 0.488
GlyGln: 2.461 ± 0.374
GlyArg: 4.922 ± 0.516
GlySer: 4.263 ± 0.481
GlyThr: 5.142 ± 0.553
GlyVal: 5.449 ± 0.506
GlyTrp: 1.758 ± 0.275
GlyTyr: 3.34 ± 0.339
GlyXaa: 0.0 ± 0.0
His
HisAla: 1.362 ± 0.219
HisCys: 0.22 ± 0.093
HisAsp: 1.099 ± 0.267
HisGlu: 1.406 ± 0.243
HisPhe: 0.791 ± 0.227
HisGly: 1.582 ± 0.265
HisHis: 0.396 ± 0.139
HisIle: 0.835 ± 0.199
HisLys: 0.923 ± 0.241
HisLeu: 1.626 ± 0.283
HisMet: 0.703 ± 0.206
HisAsn: 0.615 ± 0.149
HisPro: 1.055 ± 0.235
HisGln: 0.132 ± 0.075
HisArg: 1.714 ± 0.318
HisSer: 1.099 ± 0.18
HisThr: 1.494 ± 0.277
HisVal: 1.714 ± 0.268
HisTrp: 0.483 ± 0.15
HisTyr: 0.923 ± 0.222
HisXaa: 0.0 ± 0.0
Ile
IleAla: 4.658 ± 0.517
IleCys: 0.439 ± 0.172
IleAsp: 3.076 ± 0.346
IleGlu: 3.472 ± 0.446
IlePhe: 1.011 ± 0.206
IleGly: 3.472 ± 0.391
IleHis: 0.747 ± 0.19
IleIle: 1.494 ± 0.321
IleLys: 1.582 ± 0.242
IleLeu: 3.34 ± 0.403
IleMet: 0.923 ± 0.167
IleAsn: 1.846 ± 0.315
IlePro: 2.461 ± 0.297
IleGln: 1.846 ± 0.215
IleArg: 3.56 ± 0.442
IleSer: 1.934 ± 0.375
IleThr: 2.241 ± 0.303
IleVal: 3.34 ± 0.439
IleTrp: 0.703 ± 0.173
IleTyr: 0.835 ± 0.194
IleXaa: 0.0 ± 0.0
Lys
LysAla: 6.548 ± 0.614
LysCys: 0.527 ± 0.142
LysAsp: 2.505 ± 0.445
LysGlu: 3.076 ± 0.4
LysPhe: 1.582 ± 0.229
LysGly: 3.911 ± 0.432
LysHis: 0.923 ± 0.186
LysIle: 1.846 ± 0.305
LysLys: 2.593 ± 0.302
LysLeu: 3.999 ± 0.437
LysMet: 1.099 ± 0.24
LysAsn: 2.109 ± 0.301
LysPro: 3.34 ± 0.412
LysGln: 2.021 ± 0.394
LysArg: 4.351 ± 0.508
LysSer: 2.241 ± 0.349
LysThr: 1.758 ± 0.341
LysVal: 3.252 ± 0.386
LysTrp: 1.406 ± 0.23
LysTyr: 1.846 ± 0.327
LysXaa: 0.0 ± 0.0
Leu
LeuAla: 9.053 ± 0.581
LeuCys: 1.318 ± 0.33
LeuAsp: 5.757 ± 0.437
LeuGlu: 4.351 ± 0.518
LeuPhe: 2.944 ± 0.348
LeuGly: 5.933 ± 0.827
LeuHis: 1.626 ± 0.297
LeuIle: 3.032 ± 0.429
LeuLys: 4.526 ± 0.503
LeuLeu: 6.68 ± 0.543
LeuMet: 2.461 ± 0.347
LeuAsn: 3.516 ± 0.464
LeuPro: 5.185 ± 0.511
LeuGln: 2.329 ± 0.287
LeuArg: 5.054 ± 0.529
LeuSer: 5.933 ± 0.692
LeuThr: 6.064 ± 0.474
LeuVal: 5.801 ± 0.577
LeuTrp: 1.714 ± 0.334
LeuTyr: 2.505 ± 0.407
LeuXaa: 0.0 ± 0.0
Met
MetAla: 3.164 ± 0.334
MetCys: 0.264 ± 0.112
MetAsp: 1.274 ± 0.297
MetGlu: 1.23 ± 0.229
MetPhe: 0.483 ± 0.168
MetGly: 1.582 ± 0.273
MetHis: 0.527 ± 0.164
MetIle: 0.615 ± 0.16
MetLys: 0.835 ± 0.188
MetLeu: 1.626 ± 0.31
MetMet: 0.571 ± 0.196
MetAsn: 0.747 ± 0.179
MetPro: 1.187 ± 0.252
MetGln: 0.659 ± 0.175
MetArg: 1.714 ± 0.275
MetSer: 2.153 ± 0.338
MetThr: 1.714 ± 0.28
MetVal: 1.67 ± 0.313
MetTrp: 0.703 ± 0.249
MetTyr: 0.835 ± 0.176
MetXaa: 0.0 ± 0.0
Asn
AsnAla: 3.691 ± 0.585
AsnCys: 0.132 ± 0.073
AsnAsp: 1.934 ± 0.321
AsnGlu: 2.109 ± 0.266
AsnPhe: 1.055 ± 0.209
AsnGly: 4.307 ± 0.381
AsnHis: 0.659 ± 0.174
AsnIle: 1.626 ± 0.318
AsnLys: 1.187 ± 0.209
AsnLeu: 3.56 ± 0.499
AsnMet: 1.011 ± 0.237
AsnAsn: 1.494 ± 0.309
AsnPro: 2.285 ± 0.332
AsnGln: 1.538 ± 0.276
AsnArg: 2.373 ± 0.297
AsnSer: 1.978 ± 0.344
AsnThr: 2.461 ± 0.334
AsnVal: 2.329 ± 0.379
AsnTrp: 0.703 ± 0.17
AsnTyr: 1.055 ± 0.227
AsnXaa: 0.0 ± 0.0
Pro
ProAla: 3.779 ± 0.445
ProCys: 0.571 ± 0.169
ProAsp: 3.999 ± 0.461
ProGlu: 4.087 ± 0.486
ProPhe: 1.582 ± 0.301
ProGly: 4.614 ± 0.596
ProHis: 0.747 ± 0.189
ProIle: 2.417 ± 0.247
ProLys: 2.856 ± 0.408
ProLeu: 3.384 ± 0.315
ProMet: 0.879 ± 0.182
ProAsn: 2.373 ± 0.377
ProPro: 2.593 ± 0.381
ProGln: 1.274 ± 0.223
ProArg: 3.076 ± 0.335
ProSer: 3.252 ± 0.336
ProThr: 3.208 ± 0.431
ProVal: 3.955 ± 0.372
ProTrp: 1.143 ± 0.24
ProTyr: 1.406 ± 0.231
ProXaa: 0.0 ± 0.0
Gln
GlnAla: 4.878 ± 0.495
GlnCys: 0.352 ± 0.117
GlnAsp: 1.187 ± 0.356
GlnGlu: 2.417 ± 0.315
GlnPhe: 1.274 ± 0.251
GlnGly: 2.461 ± 0.326
GlnHis: 0.659 ± 0.144
GlnIle: 1.802 ± 0.258
GlnLys: 1.626 ± 0.252
GlnLeu: 3.252 ± 0.476
GlnMet: 0.527 ± 0.155
GlnAsn: 0.879 ± 0.239
GlnPro: 1.45 ± 0.21
GlnGln: 1.494 ± 0.336
GlnArg: 2.461 ± 0.367
GlnSer: 1.45 ± 0.28
GlnThr: 1.143 ± 0.187
GlnVal: 2.549 ± 0.353
GlnTrp: 0.615 ± 0.148
GlnTyr: 1.187 ± 0.257
GlnXaa: 0.0 ± 0.0
Arg
ArgAla: 6.767 ± 0.58
ArgCys: 1.055 ± 0.29
ArgAsp: 4.438 ± 0.487
ArgGlu: 5.449 ± 0.657
ArgPhe: 1.978 ± 0.289
ArgGly: 5.142 ± 0.44
ArgHis: 1.67 ± 0.334
ArgIle: 2.856 ± 0.436
ArgLys: 3.603 ± 0.528
ArgLeu: 4.658 ± 0.49
ArgMet: 1.362 ± 0.22
ArgAsn: 2.9 ± 0.332
ArgPro: 2.637 ± 0.391
ArgGln: 1.89 ± 0.318
ArgArg: 4.438 ± 0.439
ArgSer: 3.384 ± 0.444
ArgThr: 2.944 ± 0.371
ArgVal: 4.482 ± 0.58
ArgTrp: 1.582 ± 0.325
ArgTyr: 2.197 ± 0.385
ArgXaa: 0.0 ± 0.0
Ser
SerAla: 5.317 ± 0.476
SerCys: 0.088 ± 0.065
SerAsp: 3.603 ± 0.379
SerGlu: 3.516 ± 0.404
SerPhe: 1.934 ± 0.281
SerGly: 5.405 ± 0.621
SerHis: 0.879 ± 0.186
SerIle: 2.505 ± 0.318
SerLys: 2.065 ± 0.357
SerLeu: 4.922 ± 0.375
SerMet: 1.055 ± 0.2
SerAsn: 1.802 ± 0.274
SerPro: 2.549 ± 0.425
SerGln: 1.714 ± 0.295
SerArg: 2.988 ± 0.36
SerSer: 2.812 ± 0.345
SerThr: 2.812 ± 0.339
SerVal: 4.614 ± 0.592
SerTrp: 1.626 ± 0.312
SerTyr: 1.274 ± 0.2
SerXaa: 0.0 ± 0.0
Thr
ThrAla: 6.24 ± 0.481
ThrCys: 0.615 ± 0.158
ThrAsp: 2.944 ± 0.31
ThrGlu: 2.812 ± 0.311
ThrPhe: 2.197 ± 0.39
ThrGly: 4.394 ± 0.479
ThrHis: 1.099 ± 0.242
ThrIle: 3.032 ± 0.419
ThrLys: 2.593 ± 0.348
ThrLeu: 5.845 ± 0.608
ThrMet: 0.967 ± 0.216
ThrAsn: 2.109 ± 0.354
ThrPro: 3.208 ± 0.255
ThrGln: 1.89 ± 0.287
ThrArg: 2.988 ± 0.386
ThrSer: 2.856 ± 0.475
ThrThr: 2.725 ± 0.363
ThrVal: 5.142 ± 0.676
ThrTrp: 1.187 ± 0.231
ThrTyr: 1.67 ± 0.305
ThrXaa: 0.0 ± 0.0
Val
ValAla: 7.471 ± 0.544
ValCys: 0.923 ± 0.234
ValAsp: 4.614 ± 0.446
ValGlu: 3.867 ± 0.488
ValPhe: 1.45 ± 0.262
ValGly: 5.449 ± 0.52
ValHis: 1.274 ± 0.231
ValIle: 3.296 ± 0.418
ValLys: 4.087 ± 0.407
ValLeu: 6.02 ± 0.533
ValMet: 1.802 ± 0.298
ValAsn: 2.769 ± 0.389
ValPro: 3.911 ± 0.42
ValGln: 2.329 ± 0.299
ValArg: 4.614 ± 0.496
ValSer: 4.307 ± 0.517
ValThr: 4.394 ± 0.547
ValVal: 5.625 ± 0.516
ValTrp: 1.406 ± 0.241
ValTyr: 1.934 ± 0.297
ValXaa: 0.0 ± 0.0
Trp
TrpAla: 1.934 ± 0.394
TrpCys: 0.132 ± 0.075
TrpAsp: 1.582 ± 0.295
TrpGlu: 1.274 ± 0.26
TrpPhe: 0.659 ± 0.157
TrpGly: 1.318 ± 0.178
TrpHis: 0.527 ± 0.162
TrpIle: 0.571 ± 0.148
TrpLys: 0.527 ± 0.129
TrpLeu: 2.065 ± 0.328
TrpMet: 0.659 ± 0.168
TrpAsn: 1.011 ± 0.208
TrpPro: 0.967 ± 0.212
TrpGln: 1.055 ± 0.229
TrpArg: 1.538 ± 0.262
TrpSer: 1.187 ± 0.253
TrpThr: 1.582 ± 0.286
TrpVal: 1.626 ± 0.241
TrpTrp: 0.615 ± 0.191
TrpTyr: 0.396 ± 0.153
TrpXaa: 0.0 ± 0.0
Tyr
TyrAla: 3.076 ± 0.43
TyrCys: 0.352 ± 0.148
TyrAsp: 2.109 ± 0.28
TyrGlu: 2.153 ± 0.364
TyrPhe: 0.967 ± 0.192
TyrGly: 2.769 ± 0.266
TyrHis: 0.747 ± 0.187
TyrIle: 0.879 ± 0.234
TyrLys: 1.099 ± 0.222
TyrLeu: 2.769 ± 0.378
TyrMet: 0.396 ± 0.152
TyrAsn: 0.967 ± 0.174
TyrPro: 1.846 ± 0.338
TyrGln: 1.143 ± 0.187
TyrArg: 2.681 ± 0.31
TyrSer: 1.582 ± 0.25
TyrThr: 1.978 ± 0.29
TyrVal: 2.329 ± 0.344
TyrTrp: 0.835 ± 0.196
TyrTyr: 0.791 ± 0.202
TyrXaa: 0.0 ± 0.0
Xaa
XaaAla: 0.0 ± 0.0
XaaCys: 0.0 ± 0.0
XaaAsp: 0.0 ± 0.0
XaaGlu: 0.0 ± 0.0
XaaPhe: 0.0 ± 0.0
XaaGly: 0.0 ± 0.0
XaaHis: 0.0 ± 0.0
XaaIle: 0.0 ± 0.0
XaaLys: 0.0 ± 0.0
XaaLeu: 0.0 ± 0.0
XaaMet: 0.0 ± 0.0
XaaAsn: 0.0 ± 0.0
XaaPro: 0.0 ± 0.0
XaaGln: 0.0 ± 0.0
XaaArg: 0.0 ± 0.0
XaaSer: 0.0 ± 0.0
XaaThr: 0.0 ± 0.0
XaaVal: 0.0 ± 0.0
XaaTrp: 0.0 ± 0.0
XaaTyr: 0.0 ± 0.0
XaaXaa: 0.0 ± 0.0
Statistics based on 126 proteins (22757 amino acids)

Note: The error was estimated by bootstrapping (100 replicates) at the protein level.
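A protein-level bootstrap of this kind could look roughly like the sketch below. It assumes that whole proteins are resampled with replacement, that the dipeptide frequency is recomputed for each of 100 replicates, and that the reported error is the standard deviation across replicates; the page does not spell out these details. The function names and the toy sequences (in one-letter code) are hypothetical.

```python
import random
import statistics
from collections import Counter

def pair_per_mille(proteins, pair):
    """Pooled per mille frequency of one dipeptide (overlapping pairs)."""
    counts = Counter()
    for seq in proteins:
        counts.update(seq[i:i + 2] for i in range(len(seq) - 1))
    total = sum(counts.values())
    return 1000.0 * counts[pair] / total if total else 0.0

def bootstrap_error(proteins, pair, n_replicates=100, seed=0):
    """Standard deviation of the frequency over resampled proteomes.

    Assumption: each replicate draws len(proteins) whole proteins
    with replacement, so resampling happens at the protein level.
    """
    rng = random.Random(seed)
    values = []
    for _ in range(n_replicates):
        sample = rng.choices(proteins, k=len(proteins))
        values.append(pair_per_mille(sample, pair))
    return statistics.stdev(values)

# Hypothetical toy proteome, not the real Joedirt sequences.
toy_proteome = ["MAAK", "MAL", "GAAV", "MKK"]
estimate = pair_per_mille(toy_proteome, "AA")
error = bootstrap_error(toy_proteome, "AA")
print(f"AA: {estimate:.3f} ± {error:.3f}")
```

Resampling whole proteins rather than individual residues keeps each protein's internal composition intact, so the spread reflects how much the estimate depends on which proteins happen to be in the proteome.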

The dipeptide statistics above (along with other statistics for this proteome) can be downloaded from this CSV file
See this proteome in: uniprot_link
Proteome-pI is available under the Creative Commons Attribution-NoDerivs license; for more details see here

Reference: Kozlowski LP. Proteome-pI 2.0: Proteome Isoelectric Point Database Update. Nucleic Acids Res. 2021, doi: 10.1093/nar/gkab944
Contact: Lukasz P. Kozlowski