Amino acid dipeptide frequency for Mycobacterium phage Scarlett

Apart from single amino acid frequencies, one can also calculate the so-called amino acid dipeptide frequency.

There are 400 possible dipeptides (441 if Xaa is counted as an additional, 21st amino acid representing all non-standard or unknown amino acids). Thus, if dipeptides occurred randomly and with equal probability in proteins, each amino acid dipeptide would be present at a frequency of 0.25% (2.5‰). As this is not the case in nature, dipeptides that are more frequent than expected are marked in red, and those that are underrepresented are marked in blue in the table, for better visibility.
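
As a rough illustration of how such frequencies might be computed, the Python sketch below counts overlapping residue pairs in a set of sequences, reports them in per mille, and labels each value relative to the uniform expectation of 1/400. The function names, the toy sequences, the mapping of non-standard letters to X (for Xaa), and the choice of normalising by the total number of counted pairs are assumptions made for this example; they are not taken from the Proteome-pI pipeline.

```python
from collections import Counter
from itertools import product

# 20 standard residues in one-letter code; "X" stands in for Xaa (assumption).
STANDARD = "ACDEFGHIKLMNPQRSTVWY"


def dipeptide_permille(sequences, alphabet=STANDARD + "X"):
    """Return {dipeptide: frequency in per mille} over overlapping pairs."""
    counts = Counter()
    for seq in sequences:
        # Map anything outside the 20 standard letters to X (Xaa).
        cleaned = "".join(c if c in STANDARD else "X" for c in seq.upper())
        # Overlapping pairs within one protein; pairs never span two proteins.
        counts.update(cleaned[i:i + 2] for i in range(len(cleaned) - 1))
    total = sum(counts.values())
    if total == 0:
        return {a + b: 0.0 for a, b in product(alphabet, repeat=2)}
    return {a + b: 1000.0 * counts[a + b] / total
            for a, b in product(alphabet, repeat=2)}


def classify(freq_permille, n_residues=20):
    """Label a frequency against the uniform expectation (2.5 per mille for 20 residues)."""
    expected = 1000.0 / n_residues ** 2
    if freq_permille > expected:
        return "over"
    if freq_permille < expected:
        return "under"
    return "expected"


toy = ["MAAGLA", "AAKB"]  # hypothetical sequences; "B" is non-standard
freqs = dipeptide_permille(toy)
print(freqs["AA"], classify(freqs["AA"]))  # 250.0 over
```

Applied to values from the table, `classify(23.441)` (AlaAla) returns "over" and `classify(0.052)` (CysCys) returns "under".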

All values are given in per mille (‰) and therefore need to be multiplied by 10^-3 to obtain fractions.

For more information, see the sequence space article on Wikipedia.
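
The unit conversion can be sketched in a few lines of Python. The helper name is hypothetical; the input is the AlaAla value listed in the table.

```python
def permille_to_fraction(value_permille):
    """Convert a per-mille value to a plain fraction (multiply by 10^-3)."""
    return value_permille * 1e-3


ala_ala = 23.441  # AlaAla value from the table, in per mille
fraction = permille_to_fraction(ala_ala)
percent = fraction * 100
print(f"{fraction:.6f} {percent:.4f}%")  # 0.023441 2.3441%
```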

| 1st \ 2nd | Ala | Cys | Asp | Glu | Phe | Gly | His | Ile | Lys | Leu | Met | Asn | Pro | Gln | Arg | Ser | Thr | Val | Trp | Tyr | Xaa |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Ala | 23.441 ± 1.71 | 1.049 ± 0.238 | 9.02 ± 0.676 | 9.282 ± 0.835 | 3.776 ± 0.553 | 9.335 ± 1.036 | 2.989 ± 0.35 | 4.772 ± 0.545 | 5.192 ± 0.64 | 12.009 ± 0.765 | 2.884 ± 0.402 | 2.832 ± 0.505 | 7.027 ± 0.788 | 4.667 ± 0.634 | 8.443 ± 0.668 | 6.45 ± 0.676 | 7.656 ± 0.79 | 9.911 ± 0.683 | 2.412 ± 0.321 | 3.042 ± 0.4 | 0.0 ± 0.0 |
| Cys | 1.311 ± 0.281 | 0.052 ± 0.045 | 0.839 ± 0.24 | 0.839 ± 0.228 | 0.21 ± 0.099 | 1.101 ± 0.286 | 0.262 ± 0.156 | 0.315 ± 0.113 | 0.524 ± 0.203 | 0.682 ± 0.221 | 0.367 ± 0.138 | 0.367 ± 0.141 | 0.629 ± 0.184 | 0.052 ± 0.045 | 0.629 ± 0.168 | 0.996 ± 0.263 | 0.839 ± 0.231 | 0.524 ± 0.162 | 0.577 ± 0.181 | 0.262 ± 0.12 | 0.0 ± 0.0 |
| Asp | 7.656 ± 0.68 | 0.577 ± 0.203 | 5.401 ± 0.671 | 5.978 ± 0.731 | 0.891 ± 0.196 | 7.08 ± 0.523 | 1.101 ± 0.263 | 1.363 ± 0.246 | 1.573 ± 0.321 | 6.345 ± 0.561 | 1.993 ± 0.307 | 1.468 ± 0.28 | 3.881 ± 0.419 | 1.888 ± 0.254 | 4.562 ± 0.623 | 2.674 ± 0.382 | 2.674 ± 0.395 | 4.667 ± 0.469 | 0.787 ± 0.207 | 1.573 ± 0.362 | 0.0 ± 0.0 |
| Glu | 7.394 ± 0.805 | 0.944 ± 0.226 | 2.465 ± 0.382 | 1.101 ± 0.241 | 1.783 ± 0.301 | 5.244 ± 0.578 | 1.626 ± 0.289 | 2.045 ± 0.381 | 1.468 ± 0.378 | 5.716 ± 0.571 | 1.311 ± 0.242 | 0.734 ± 0.209 | 3.251 ± 0.518 | 2.307 ± 0.306 | 5.192 ± 0.565 | 2.517 ± 0.336 | 2.307 ± 0.344 | 5.873 ± 0.748 | 1.363 ± 0.282 | 2.098 ± 0.389 | 0.0 ± 0.0 |
| Phe | 3.723 ± 0.58 | 0.21 ± 0.109 | 2.517 ± 0.386 | 1.363 ± 0.24 | 0.577 ± 0.161 | 3.251 ± 0.366 | 0.262 ± 0.117 | 0.891 ± 0.196 | 0.944 ± 0.332 | 1.94 ± 0.333 | 0.682 ± 0.204 | 1.049 ± 0.212 | 1.259 ± 0.239 | 0.629 ± 0.175 | 1.154 ± 0.249 | 1.049 ± 0.238 | 1.416 ± 0.313 | 2.412 ± 0.356 | 0.367 ± 0.128 | 0.42 ± 0.167 | 0.0 ± 0.0 |
| Gly | 10.646 ± 1.347 | 0.996 ± 0.23 | 5.087 ± 0.554 | 5.297 ± 0.414 | 2.674 ± 0.451 | 10.069 ± 1.47 | 1.835 ± 0.352 | 3.409 ± 0.45 | 3.514 ± 0.559 | 7.132 ± 0.701 | 1.783 ± 0.279 | 2.465 ± 0.295 | 3.146 ± 0.377 | 2.517 ± 0.405 | 5.978 ± 0.51 | 5.349 ± 0.684 | 6.555 ± 0.611 | 7.184 ± 0.596 | 2.203 ± 0.317 | 2.203 ± 0.352 | 0.0 ± 0.0 |
| His | 2.412 ± 0.474 | 0.524 ± 0.167 | 1.573 ± 0.34 | 1.363 ± 0.275 | 0.682 ± 0.202 | 2.15 ± 0.319 | 0.682 ± 0.181 | 0.524 ± 0.175 | 0.472 ± 0.146 | 2.098 ± 0.302 | 0.472 ± 0.135 | 0.577 ± 0.136 | 1.259 ± 0.262 | 0.472 ± 0.151 | 2.203 ± 0.409 | 0.734 ± 0.188 | 1.363 ± 0.229 | 1.993 ± 0.278 | 0.42 ± 0.132 | 0.629 ± 0.203 | 0.0 ± 0.0 |
| Ile | 6.083 ± 0.629 | 0.157 ± 0.09 | 3.146 ± 0.384 | 3.199 ± 0.457 | 0.367 ± 0.129 | 4.09 ± 0.723 | 0.367 ± 0.143 | 0.734 ± 0.209 | 1.678 ± 0.29 | 1.94 ± 0.372 | 0.367 ± 0.129 | 1.259 ± 0.294 | 1.94 ± 0.416 | 0.682 ± 0.164 | 2.045 ± 0.29 | 1.888 ± 0.274 | 2.57 ± 0.356 | 3.514 ± 0.421 | 0.524 ± 0.164 | 0.367 ± 0.142 | 0.0 ± 0.0 |
| Lys | 3.828 ± 0.6 | 0.472 ± 0.149 | 1.311 ± 0.219 | 0.472 ± 0.127 | 0.682 ± 0.179 | 2.937 ± 0.48 | 0.682 ± 0.157 | 1.521 ± 0.31 | 0.734 ± 0.204 | 3.199 ± 0.352 | 0.629 ± 0.188 | 0.577 ± 0.16 | 3.146 ± 0.456 | 0.629 ± 0.165 | 2.517 ± 0.387 | 1.101 ± 0.207 | 2.307 ± 0.414 | 2.727 ± 0.369 | 0.629 ± 0.198 | 0.944 ± 0.312 | 0.0 ± 0.0 |
| Leu | 12.953 ± 0.83 | 0.891 ± 0.193 | 8.653 ± 0.624 | 2.098 ± 0.327 | 2.255 ± 0.46 | 7.184 ± 0.6 | 2.098 ± 0.332 | 3.461 ± 0.342 | 2.307 ± 0.333 | 5.401 ± 0.588 | 1.731 ± 0.28 | 2.674 ± 0.383 | 3.933 ± 0.566 | 2.989 ± 0.456 | 6.555 ± 0.572 | 5.454 ± 0.499 | 4.72 ± 0.473 | 6.188 ± 0.427 | 1.573 ± 0.323 | 1.678 ± 0.335 | 0.0 ± 0.0 |
| Met | 2.937 ± 0.297 | 0.157 ± 0.095 | 0.734 ± 0.18 | 0.734 ± 0.23 | 0.787 ± 0.227 | 1.573 ± 0.262 | 0.629 ± 0.165 | 1.206 ± 0.239 | 0.262 ± 0.127 | 1.626 ± 0.258 | 0.262 ± 0.112 | 0.42 ± 0.136 | 1.416 ± 0.274 | 0.577 ± 0.182 | 1.678 ± 0.272 | 2.57 ± 0.328 | 1.363 ± 0.256 | 1.678 ± 0.309 | 0.472 ± 0.154 | 0.472 ± 0.181 | 0.0 ± 0.0 |
| Asn | 3.723 ± 0.541 | 0.262 ± 0.106 | 1.049 ± 0.224 | 0.839 ± 0.205 | 0.734 ± 0.214 | 2.989 ± 0.399 | 0.472 ± 0.16 | 0.629 ± 0.234 | 0.577 ± 0.181 | 1.731 ± 0.302 | 0.315 ± 0.124 | 0.629 ± 0.209 | 2.465 ± 0.356 | 0.682 ± 0.167 | 1.94 ± 0.26 | 0.787 ± 0.162 | 1.521 ± 0.228 | 2.57 ± 0.33 | 0.315 ± 0.116 | 0.996 ± 0.232 | 0.0 ± 0.0 |
| Pro | 8.443 ± 0.764 | 0.42 ± 0.149 | 3.723 ± 0.418 | 4.72 ± 0.562 | 1.573 ± 0.285 | 6.031 ± 0.593 | 1.154 ± 0.236 | 2.098 ± 0.316 | 1.678 ± 0.291 | 4.457 ± 0.522 | 0.996 ± 0.221 | 1.206 ± 0.381 | 3.251 ± 0.493 | 1.678 ± 0.364 | 3.409 ± 0.573 | 2.57 ± 0.44 | 2.989 ± 0.506 | 4.72 ± 0.513 | 1.101 ± 0.218 | 1.416 ± 0.261 | 0.0 ± 0.0 |
| Gln | 4.3 ± 0.566 | 0.262 ± 0.119 | 0.891 ± 0.211 | 0.787 ± 0.171 | 0.891 ± 0.307 | 2.36 ± 0.336 | 0.891 ± 0.188 | 1.678 ± 0.245 | 0.367 ± 0.124 | 2.15 ± 0.365 | 0.891 ± 0.218 | 0.42 ± 0.142 | 1.993 ± 0.371 | 1.468 ± 0.243 | 2.57 ± 0.377 | 1.835 ± 0.33 | 1.783 ± 0.283 | 3.199 ± 0.421 | 0.734 ± 0.206 | 1.049 ± 0.229 | 0.0 ± 0.0 |
| Arg | 7.499 ± 0.714 | 1.259 ± 0.322 | 4.09 ± 0.49 | 4.353 ± 0.574 | 1.468 ± 0.325 | 3.986 ± 0.489 | 1.888 ± 0.413 | 2.517 ± 0.367 | 3.251 ± 0.431 | 6.555 ± 0.601 | 2.36 ± 0.324 | 2.15 ± 0.315 | 3.776 ± 0.573 | 2.727 ± 0.336 | 6.45 ± 0.839 | 3.461 ± 0.429 | 3.933 ± 0.482 | 4.405 ± 0.486 | 1.888 ± 0.365 | 1.94 ± 0.286 | 0.0 ± 0.0 |
| Ser | 6.66 ± 0.528 | 0.472 ± 0.145 | 2.465 ± 0.263 | 2.832 ± 0.428 | 1.626 ± 0.289 | 4.3 ± 0.729 | 1.311 ± 0.264 | 2.57 ± 0.464 | 1.101 ± 0.227 | 4.562 ± 0.467 | 1.521 ± 0.287 | 1.678 ± 0.261 | 3.776 ± 0.45 | 1.573 ± 0.294 | 3.304 ± 0.376 | 2.779 ± 0.452 | 2.779 ± 0.273 | 3.304 ± 0.434 | 1.101 ± 0.227 | 1.521 ± 0.233 | 0.0 ± 0.0 |
| Thr | 6.87 ± 0.744 | 0.891 ± 0.203 | 3.146 ± 0.381 | 3.671 ± 0.388 | 2.15 ± 0.325 | 6.083 ± 0.551 | 1.311 ± 0.267 | 3.094 ± 0.395 | 1.783 ± 0.333 | 4.51 ± 0.487 | 0.734 ± 0.233 | 1.468 ± 0.247 | 3.933 ± 0.611 | 1.468 ± 0.242 | 2.779 ± 0.427 | 2.674 ± 0.347 | 3.618 ± 0.475 | 5.506 ± 0.49 | 0.891 ± 0.228 | 1.888 ± 0.34 | 0.0 ± 0.0 |
| Val | 10.803 ± 0.864 | 1.154 ± 0.204 | 5.139 ± 0.558 | 5.611 ± 0.641 | 1.835 ± 0.285 | 6.765 ± 0.585 | 2.15 ± 0.439 | 2.727 ± 0.422 | 2.832 ± 0.359 | 6.922 ± 0.624 | 1.468 ± 0.26 | 1.888 ± 0.251 | 5.297 ± 0.533 | 1.94 ± 0.233 | 4.562 ± 0.486 | 3.881 ± 0.365 | 5.297 ± 0.461 | 5.926 ± 0.682 | 1.94 ± 0.286 | 1.888 ± 0.342 | 0.0 ± 0.0 |
| Trp | 2.779 ± 0.484 | 0.315 ± 0.119 | 0.944 ± 0.226 | 0.682 ± 0.227 | 0.682 ± 0.163 | 1.101 ± 0.243 | 0.682 ± 0.167 | 0.996 ± 0.252 | 0.21 ± 0.092 | 2.884 ± 0.401 | 0.367 ± 0.142 | 0.577 ± 0.17 | 0.839 ± 0.195 | 1.101 ± 0.19 | 1.94 ± 0.321 | 0.944 ± 0.246 | 1.206 ± 0.201 | 1.101 ± 0.263 | 0.367 ± 0.11 | 0.367 ± 0.115 | 0.0 ± 0.0 |
| Tyr | 2.727 ± 0.312 | 0.315 ± 0.134 | 1.94 ± 0.327 | 1.521 ± 0.289 | 0.472 ± 0.116 | 2.412 ± 0.34 | 0.21 ± 0.11 | 0.367 ± 0.137 | 0.577 ± 0.208 | 2.937 ± 0.387 | 0.367 ± 0.145 | 0.787 ± 0.171 | 1.416 ± 0.252 | 0.472 ± 0.2 | 2.045 ± 0.323 | 1.521 ± 0.278 | 1.626 ± 0.284 | 2.622 ± 0.382 | 0.42 ± 0.129 | 0.577 ± 0.195 | 0.0 ± 0.0 |
| Xaa | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 |

Statistics based on 102 proteins (19070 amino acids)

Note: The error has been estimated by bootstrapping (×100) at the protein level.
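
One way to picture a protein-level bootstrap is sketched below in Python: whole proteins are resampled with replacement 100 times, the dipeptide frequency is recomputed for each resample, and the spread of those estimates is reported. The function names, the toy sequences, and the use of the sample standard deviation as the error are assumptions made for illustration; the exact procedure used by Proteome-pI may differ.

```python
import random
import statistics


def dipeptide_freq(proteins, dipeptide):
    """Frequency (per mille) of one dipeptide over overlapping pairs in the given proteins."""
    hits = total = 0
    for seq in proteins:
        for i in range(len(seq) - 1):
            total += 1
            hits += seq[i:i + 2] == dipeptide
    return 1000.0 * hits / total if total else 0.0


def bootstrap_error(proteins, dipeptide, n_rounds=100, seed=0):
    """Standard deviation of the frequency across protein-level resamples."""
    rng = random.Random(seed)
    estimates = []
    for _ in range(n_rounds):
        # Resample whole proteins with replacement, keeping the proteome size.
        sample = rng.choices(proteins, k=len(proteins))
        estimates.append(dipeptide_freq(sample, dipeptide))
    return statistics.stdev(estimates)


proteome = ["MAAGLA", "AAKLV", "MKTAYIAK", "GGAALP"]  # hypothetical toy sequences
print(dipeptide_freq(proteome, "AA"), bootstrap_error(proteome, "AA"))
```

Resampling proteins rather than individual residues keeps each sequence intact, so dipeptides that cluster in a few proteins contribute a larger error than evenly spread ones.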

The dipeptide statistics above (along with other statistics for this proteome) can be downloaded from this CSV file
See this proteome in: uniprot_link
Proteome-pI is available under the Creative Commons Attribution-NoDerivs license; for more details see here

Reference: Kozlowski LP. Proteome-pI 2.0: Proteome Isoelectric Point Database Update. Nucleic Acids Res. 2021, doi: 10.1093/nar/gkab944
Contact: Lukasz P. Kozlowski