Amino acid dipeptide frequency for Bacillus phage Shanette

Apart from single amino acid frequencies, one can also calculate the so-called amino acid dipeptide frequency.

There are 400 possible dipeptides (441 if Xaa is counted as an additional 21st amino acid representing all non-standard or unknown amino acids). Thus, if dipeptides occurred uniformly at random in proteins, each of the 400 dipeptides would have a frequency of 0.25% (2.5‰). This is not the case in nature, so for better visibility, dipeptides that occur more often than expected are marked in red and underrepresented ones are marked in blue in the table.

All values are given in per mille (‰), so they need to be multiplied by 10⁻³ to obtain fractions.
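As a rough illustration of how such frequencies could be computed, the sketch below counts overlapping residue pairs in a set of protein sequences and reports them in per mille. It is a minimal sketch, not the Proteome-pI implementation: the names `dipeptide_per_mille` and `classify`, the use of one-letter codes, the pooling of every non-standard letter into X (Xaa), and the uniform 2.5‰ threshold for calling a dipeptide over- or underrepresented are all assumptions made for this example.

```python
from collections import Counter
from itertools import product

# The 20 standard residues (one-letter codes). Any other letter is pooled
# into "X" (Xaa); this pooling rule is an assumption of the sketch.
STANDARD = "ACDEFGHIKLMNPQRSTVWY"
ALPHABET = STANDARD + "X"

# Uniform expectation over the 400 standard dipeptides: 0.25 % = 2.5 per mille.
UNIFORM_PER_MILLE = 1000.0 / 400


def dipeptide_per_mille(proteins):
    """Return {dipeptide: frequency in per mille} for all 441 pairs."""
    counts = Counter()
    for seq in proteins:
        seq = "".join(aa if aa in STANDARD else "X" for aa in seq.upper())
        # Overlapping windows of length 2; pairs never span two proteins.
        counts.update(seq[i:i + 2] for i in range(len(seq) - 1))
    total = sum(counts.values())
    if total == 0:
        return {}
    return {
        a + b: 1000.0 * counts[a + b] / total
        for a, b in product(ALPHABET, repeat=2)
    }


def classify(value_per_mille, expected=UNIFORM_PER_MILLE):
    """Label a frequency relative to the (assumed) uniform expectation."""
    if value_per_mille > expected:
        return "over"
    if value_per_mille < expected:
        return "under"
    return "expected"


freqs = dipeptide_per_mille(["MKKA", "AKK"])
print(freqs["KK"], classify(freqs["KK"]))
```

The toy input contains five overlapping pairs, two of which are KK, so KK comes out at 400‰; multiplying by 10⁻³ gives the fraction 0.4.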

For more information, see the sequence space article on Wikipedia.

1st\2nd | Ala | Cys | Asp | Glu | Phe | Gly | His | Ile | Lys | Leu | Met | Asn | Pro | Gln | Arg | Ser | Thr | Val | Trp | Tyr | Xaa
Ala | 2.139 ± 0.227 | 0.674 ± 0.141 | 4.069 ± 0.404 | 4.231 ± 0.299 | 2.069 ± 0.22 | 4.231 ± 0.42 | 1.116 ± 0.148 | 4.766 ± 0.341 | 5.487 ± 0.403 | 5.603 ± 0.35 | 1.813 ± 0.213 | 2.906 ± 0.303 | 2.441 ± 0.267 | 2.255 ± 0.316 | 2.627 ± 0.225 | 3.301 ± 0.29 | 3.859 ± 0.561 | 3.906 ± 0.315 | 0.907 ± 0.192 | 2.139 ± 0.217 | 0.0 ± 0.0
Cys | 0.349 ± 0.086 | 0.139 ± 0.058 | 0.604 ± 0.139 | 0.651 ± 0.115 | 0.186 ± 0.057 | 0.721 ± 0.153 | 0.163 ± 0.057 | 0.976 ± 0.17 | 0.767 ± 0.151 | 1.069 ± 0.192 | 0.325 ± 0.087 | 0.511 ± 0.105 | 0.488 ± 0.127 | 0.349 ± 0.092 | 0.488 ± 0.099 | 0.674 ± 0.112 | 0.721 ± 0.145 | 0.535 ± 0.118 | 0.186 ± 0.062 | 0.302 ± 0.079 | 0.0 ± 0.0
Asp | 3.859 ± 0.416 | 0.581 ± 0.142 | 3.604 ± 0.297 | 4.673 ± 0.386 | 2.557 ± 0.222 | 4.952 ± 0.392 | 1.511 ± 0.315 | 4.534 ± 0.296 | 5.626 ± 0.366 | 5.696 ± 0.332 | 2.371 ± 0.253 | 3.162 ± 0.246 | 2.348 ± 0.335 | 1.604 ± 0.219 | 2.604 ± 0.254 | 3.487 ± 0.276 | 3.464 ± 0.311 | 4.348 ± 0.366 | 1.209 ± 0.232 | 2.395 ± 0.24 | 0.0 ± 0.0
Glu | 4.371 ± 0.292 | 0.767 ± 0.151 | 5.487 ± 0.439 | 7.393 ± 0.663 | 2.836 ± 0.226 | 5.557 ± 0.38 | 2.348 ± 0.261 | 4.58 ± 0.324 | 5.068 ± 0.377 | 6.301 ± 0.481 | 2.767 ± 0.275 | 2.929 ± 0.272 | 2.162 ± 0.255 | 2.883 ± 0.269 | 3.418 ± 0.383 | 3.79 ± 0.268 | 3.255 ± 0.295 | 6.231 ± 0.545 | 1.395 ± 0.203 | 3.278 ± 0.272 | 0.0 ± 0.0
Phe | 1.767 ± 0.216 | 0.511 ± 0.126 | 2.255 ± 0.249 | 2.255 ± 0.228 | 1.186 ± 0.207 | 2.418 ± 0.24 | 0.814 ± 0.161 | 2.046 ± 0.219 | 2.836 ± 0.282 | 2.79 ± 0.272 | 1.046 ± 0.157 | 1.86 ± 0.199 | 1.372 ± 0.179 | 0.907 ± 0.158 | 1.767 ± 0.207 | 2.511 ± 0.229 | 2.348 ± 0.245 | 2.069 ± 0.212 | 0.442 ± 0.096 | 1.116 ± 0.147 | 0.0 ± 0.0
Gly | 4.022 ± 0.287 | 0.721 ± 0.131 | 4.092 ± 0.318 | 5.092 ± 0.36 | 2.674 ± 0.227 | 5.952 ± 0.757 | 1.395 ± 0.232 | 4.929 ± 0.311 | 5.347 ± 0.367 | 5.58 ± 0.386 | 2.278 ± 0.23 | 3.697 ± 0.518 | 0.256 ± 0.083 | 2.162 ± 0.245 | 3.255 ± 0.24 | 4.348 ± 0.38 | 5.44 ± 0.51 | 5.324 ± 0.434 | 1.255 ± 0.21 | 2.999 ± 0.304 | 0.0 ± 0.0
His | 1.046 ± 0.145 | 0.163 ± 0.068 | 1.325 ± 0.22 | 1.488 ± 0.233 | 1.093 ± 0.167 | 1.604 ± 0.254 | 0.558 ± 0.156 | 1.441 ± 0.189 | 1.674 ± 0.239 | 1.883 ± 0.254 | 0.976 ± 0.167 | 1.209 ± 0.168 | 1.255 ± 0.196 | 0.488 ± 0.116 | 0.953 ± 0.165 | 1.232 ± 0.196 | 1.395 ± 0.161 | 1.348 ± 0.175 | 0.302 ± 0.089 | 1.162 ± 0.215 | 0.0 ± 0.0
Ile | 3.371 ± 0.295 | 0.325 ± 0.084 | 4.696 ± 0.286 | 4.975 ± 0.372 | 1.627 ± 0.175 | 3.813 ± 0.268 | 1.581 ± 0.227 | 3.72 ± 0.306 | 5.092 ± 0.341 | 4.673 ± 0.399 | 2.069 ± 0.259 | 3.557 ± 0.313 | 2.836 ± 0.263 | 2.511 ± 0.211 | 3.394 ± 0.259 | 3.604 ± 0.347 | 4.441 ± 0.373 | 3.929 ± 0.333 | 0.558 ± 0.098 | 2.511 ± 0.238 | 0.0 ± 0.0
Lys | 6.58 ± 0.441 | 0.79 ± 0.166 | 5.464 ± 0.347 | 7.51 ± 0.566 | 2.464 ± 0.283 | 6.184 ± 0.373 | 1.976 ± 0.26 | 3.65 ± 0.326 | 6.812 ± 0.587 | 5.719 ± 0.373 | 2.674 ± 0.246 | 3.069 ± 0.245 | 3.232 ± 0.256 | 2.906 ± 0.283 | 3.534 ± 0.348 | 4.557 ± 0.451 | 4.069 ± 0.316 | 5.696 ± 0.323 | 1.209 ± 0.164 | 2.999 ± 0.253 | 0.0 ± 0.0
Leu | 5.115 ± 0.308 | 0.651 ± 0.135 | 5.975 ± 0.415 | 6.789 ± 0.434 | 2.418 ± 0.212 | 5.231 ± 0.346 | 1.627 ± 0.227 | 4.301 ± 0.291 | 5.998 ± 0.311 | 5.929 ± 0.424 | 2.627 ± 0.265 | 3.255 ± 0.301 | 3.348 ± 0.302 | 3.301 ± 0.287 | 3.348 ± 0.31 | 5.231 ± 0.351 | 4.603 ± 0.304 | 4.859 ± 0.425 | 1.023 ± 0.157 | 3.069 ± 0.276 | 0.0 ± 0.0
Met | 2.581 ± 0.281 | 0.395 ± 0.107 | 1.93 ± 0.203 | 2.209 ± 0.237 | 1.255 ± 0.176 | 2.278 ± 0.265 | 0.418 ± 0.118 | 1.627 ± 0.163 | 2.976 ± 0.372 | 2.395 ± 0.284 | 1.046 ± 0.198 | 1.999 ± 0.18 | 1.023 ± 0.146 | 1.0 ± 0.153 | 1.581 ± 0.214 | 2.278 ± 0.229 | 1.651 ± 0.184 | 2.232 ± 0.258 | 0.349 ± 0.085 | 1.255 ± 0.146 | 0.0 ± 0.0
Asn | 3.092 ± 0.44 | 0.604 ± 0.116 | 2.836 ± 0.234 | 2.836 ± 0.249 | 1.767 ± 0.23 | 3.813 ± 0.335 | 1.093 ± 0.13 | 3.348 ± 0.299 | 3.929 ± 0.358 | 3.278 ± 0.339 | 1.697 ± 0.234 | 2.488 ± 0.246 | 2.557 ± 0.264 | 1.418 ± 0.164 | 2.092 ± 0.238 | 2.999 ± 0.251 | 2.813 ± 0.421 | 2.72 ± 0.254 | 0.86 ± 0.127 | 2.185 ± 0.211 | 0.0 ± 0.0
Pro | 2.302 ± 0.24 | 0.465 ± 0.108 | 2.325 ± 0.273 | 3.673 ± 0.35 | 0.976 ± 0.172 | 1.697 ± 0.212 | 1.069 ± 0.165 | 2.348 ± 0.208 | 3.162 ± 0.323 | 2.581 ± 0.242 | 0.976 ± 0.138 | 1.813 ± 0.195 | 1.023 ± 0.187 | 1.139 ± 0.16 | 1.093 ± 0.161 | 2.209 ± 0.195 | 2.65 ± 0.246 | 2.511 ± 0.245 | 0.372 ± 0.084 | 1.325 ± 0.167 | 0.0 ± 0.0
Gln | 2.023 ± 0.232 | 0.349 ± 0.108 | 1.674 ± 0.213 | 2.836 ± 0.257 | 1.418 ± 0.183 | 2.604 ± 0.249 | 0.883 ± 0.174 | 2.069 ± 0.235 | 2.302 ± 0.242 | 3.046 ± 0.258 | 1.325 ± 0.208 | 1.372 ± 0.158 | 1.255 ± 0.225 | 1.255 ± 0.194 | 1.558 ± 0.157 | 1.744 ± 0.226 | 1.418 ± 0.187 | 2.232 ± 0.206 | 0.511 ± 0.105 | 1.767 ± 0.19 | 0.0 ± 0.0
Arg | 2.767 ± 0.199 | 0.395 ± 0.096 | 2.395 ± 0.259 | 3.418 ± 0.285 | 1.697 ± 0.189 | 2.86 ± 0.276 | 0.93 ± 0.145 | 3.371 ± 0.278 | 4.045 ± 0.406 | 3.441 ± 0.285 | 1.837 ± 0.272 | 2.069 ± 0.235 | 1.372 ± 0.168 | 1.255 ± 0.179 | 2.441 ± 0.253 | 2.092 ± 0.255 | 2.418 ± 0.199 | 3.836 ± 0.37 | 0.511 ± 0.145 | 1.79 ± 0.179 | 0.0 ± 0.0
Ser | 3.859 ± 0.408 | 0.535 ± 0.106 | 3.79 ± 0.364 | 3.441 ± 0.288 | 2.092 ± 0.238 | 4.255 ± 0.398 | 1.232 ± 0.167 | 4.045 ± 0.329 | 5.394 ± 0.316 | 4.278 ± 0.28 | 2.139 ± 0.235 | 2.976 ± 0.373 | 1.906 ± 0.179 | 2.092 ± 0.25 | 2.162 ± 0.248 | 3.743 ± 0.353 | 3.487 ± 0.282 | 4.022 ± 0.347 | 0.697 ± 0.153 | 2.255 ± 0.215 | 0.0 ± 0.0
Thr | 3.743 ± 0.502 | 0.558 ± 0.121 | 3.185 ± 0.316 | 4.394 ± 0.309 | 2.278 ± 0.233 | 4.813 ± 0.426 | 1.372 ± 0.188 | 3.859 ± 0.321 | 5.022 ± 0.329 | 4.766 ± 0.351 | 1.139 ± 0.159 | 3.069 ± 0.445 | 2.836 ± 0.288 | 1.79 ± 0.22 | 2.255 ± 0.241 | 3.464 ± 0.464 | 3.952 ± 0.522 | 4.72 ± 0.418 | 0.721 ± 0.108 | 2.255 ± 0.214 | 0.0 ± 0.0
Val | 4.138 ± 0.367 | 0.721 ± 0.143 | 5.44 ± 0.408 | 5.394 ± 0.46 | 1.79 ± 0.245 | 4.348 ± 0.441 | 1.488 ± 0.217 | 4.324 ± 0.325 | 5.371 ± 0.363 | 5.022 ± 0.339 | 1.906 ± 0.218 | 3.511 ± 0.35 | 2.79 ± 0.286 | 2.302 ± 0.229 | 3.557 ± 0.326 | 4.417 ± 0.38 | 4.627 ± 0.499 | 5.161 ± 0.399 | 0.79 ± 0.141 | 2.557 ± 0.192 | 0.0 ± 0.0
Trp | 0.907 ± 0.125 | 0.302 ± 0.105 | 0.93 ± 0.15 | 0.93 ± 0.157 | 0.488 ± 0.113 | 1.023 ± 0.156 | 0.186 ± 0.055 | 0.697 ± 0.108 | 1.069 ± 0.163 | 0.953 ± 0.147 | 0.511 ± 0.113 | 0.93 ± 0.179 | 0.0 ± 0.0 | 0.604 ± 0.12 | 0.628 ± 0.102 | 0.79 ± 0.127 | 0.976 ± 0.198 | 1.418 ± 0.215 | 0.325 ± 0.095 | 0.511 ± 0.105 | 0.0 ± 0.0
Tyr | 2.534 ± 0.294 | 0.558 ± 0.107 | 2.511 ± 0.233 | 2.627 ± 0.244 | 1.441 ± 0.169 | 2.464 ± 0.273 | 0.837 ± 0.161 | 2.65 ± 0.236 | 3.115 ± 0.272 | 3.697 ± 0.331 | 0.93 ± 0.135 | 2.046 ± 0.229 | 1.209 ± 0.152 | 1.418 ± 0.184 | 2.185 ± 0.2 | 1.93 ± 0.215 | 2.604 ± 0.269 | 2.557 ± 0.221 | 0.558 ± 0.135 | 1.534 ± 0.199 | 0.0 ± 0.0
Xaa | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0
Statistics based on 220 proteins (43013 amino acids)

Note: The error was estimated by bootstrapping (100 replicates) at the protein level.
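Protein-level bootstrapping can be sketched as follows: whole proteins are resampled with replacement, the dipeptide frequency is recomputed for each resample, and the spread of the 100 estimates serves as the error. This is an illustrative sketch under stated assumptions, not the database's own code: the helper names `pair_counts` and `bootstrap_error` are hypothetical, and using the sample standard deviation as the ± value is a guess, since the page does not say which statistic is reported.

```python
import random
import statistics
from collections import Counter


def pair_counts(seq):
    """Count overlapping dipeptides within a single protein sequence."""
    return Counter(seq[i:i + 2] for i in range(len(seq) - 1))


def bootstrap_error(proteins, dipeptide, n_rounds=100, seed=0):
    """Return (mean, standard deviation) of a dipeptide frequency in per mille.

    Each round draws len(proteins) proteins with replacement, so the unit
    of resampling is the whole protein rather than the individual residue.
    """
    if not proteins:
        raise ValueError("at least one protein sequence is required")
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    per_protein = [pair_counts(s) for s in proteins]
    estimates = []
    for _ in range(n_rounds):
        sample = rng.choices(per_protein, k=len(per_protein))
        total = sum(sum(c.values()) for c in sample)
        hits = sum(c[dipeptide] for c in sample)
        estimates.append(1000.0 * hits / total if total else 0.0)
    # Assumption: the spread is summarised by the sample standard deviation.
    return statistics.mean(estimates), statistics.stdev(estimates)


mean, err = bootstrap_error(["AKKA", "GGGG", "MKKL"], "KK")
print(f"KK: {mean:.3f} ± {err:.3f} per mille")
```

Resampling whole proteins keeps the dependence between neighbouring residues of one protein intact, which resampling individual dipeptides would not.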

The dipeptide statistics above (among other statistics for this proteome) can be downloaded from this CSV file
See this proteome in: uniprot_link
Proteome-pI is available under a Creative Commons Attribution-NoDerivs license; for more details see here

Reference: Kozlowski LP. Proteome-pI 2.0: Proteome Isoelectric Point Database Update. Nucleic Acids Res. 2021, doi: 10.1093/nar/gkab944
Contact: Lukasz P. Kozlowski