Amino acid dipeptide frequency for Lactobacillus phage Semele

Apart from single amino acid frequencies, one can also calculate the so-called amino acid dipeptide frequency, i.e. how often each ordered pair of adjacent amino acids occurs.

There are 400 possible dipeptides (441 if Xaa is counted as an additional 21st amino acid representing all non-standard or unknown amino acids). Thus, if dipeptides occurred at random in proteins, with every dipeptide equally likely, each of the 400 would be present at a frequency of 0.25% (2.5‰). This is not the case in nature, so for better visibility dipeptides that are more frequent than expected are marked in red and those that are underrepresented are marked in blue in the table.
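As a rough illustration of how such frequencies can be obtained, the sketch below counts overlapping adjacent residue pairs in one-letter protein sequences and reports them in per mille. It is not the Proteome-pI code: the function name, the toy sequences, and the choice to normalize by the total number of dipeptides counted are all assumptions made for this example, and the database may normalize differently.

```python
from collections import Counter
from itertools import product

STANDARD = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard amino acids (one-letter codes)

def dipeptide_per_mille(proteins):
    """Return {dipeptide: frequency in per mille} for a list of protein sequences.

    Hypothetical helper for illustration; normalizes by the number of
    dipeptides counted (an assumption, not necessarily the database's choice).
    """
    counts = Counter()
    for seq in proteins:
        # Map anything outside the 20 standard residues to X (Xaa).
        seq = "".join(aa if aa in STANDARD else "X" for aa in seq.upper())
        # Overlapping adjacent pairs; pairs never span two proteins.
        counts.update(seq[i:i + 2] for i in range(len(seq) - 1))
    total = sum(counts.values())
    alphabet = STANDARD + "X"
    return {a + b: (1000.0 * counts[a + b] / total if total else 0.0)
            for a, b in product(alphabet, repeat=2)}

# Uniform expectation for the 20 x 20 standard dipeptides: 1/400 = 0.25% = 2.5 per mille.
EXPECTED_PER_MILLE = 1000.0 / (len(STANDARD) ** 2)

toy = ["MKLLA", "MKAX"]  # made-up sequences, for illustration only
freqs = dipeptide_per_mille(toy)
print(len(freqs))               # 441 (21 x 21 ordered pairs)
print(round(freqs["MK"], 3))    # 285.714 (2 of the 7 dipeptides in the toy data)
```

With real proteome data, each value can be compared against EXPECTED_PER_MILLE to see which dipeptides are over- or underrepresented relative to the uniform expectation.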

All values are given in per mille (‰), so they need to be multiplied by 10^-3 to obtain fractions.
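The unit conversion can be checked with a small worked example. The sketch below takes the LysLeu value from the table (8.187‰) and converts it to a fraction and a percentage, then compares it with the uniform expectation of 2.5‰. The function and variable names are hypothetical, and using 2.5‰ as the reference for "expected" is an assumption based on the 0.25% figure given above.

```python
def per_mille_to_fraction(value):
    """Convert a per mille value to a plain fraction (multiply by 10^-3)."""
    return value * 1e-3

UNIFORM_PER_MILLE = 2.5  # 1/400 expressed in per mille (assumed reference)

lys_leu = 8.187  # LysLeu value from the table, in per mille
fraction = per_mille_to_fraction(lys_leu)   # about 0.008187
percent = fraction * 100                    # about 0.8187 %
enrichment = lys_leu / UNIFORM_PER_MILLE    # about 3.27 times the uniform expectation
```

Under that assumption, LysLeu is a little over three times as frequent as a uniformly distributed dipeptide would be.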

For more information, see the sequence space article on Wikipedia.

Dipeptide frequency ± error (‰); rows give the first amino acid of the dipeptide, columns the second.

| First \ Second | Ala | Cys | Asp | Glu | Phe | Gly | His | Ile | Lys | Leu | Met | Asn | Pro | Gln | Arg | Ser | Thr | Val | Trp | Tyr | Xaa |
| Ala | 2.466 ± 0.4 | 0.197 ± 0.062 | 3.822 ± 0.353 | 3.378 ± 0.302 | 1.948 ± 0.213 | 4.34 ± 0.647 | 0.592 ± 0.118 | 3.995 ± 0.295 | 6.337 ± 0.517 | 5.178 ± 0.384 | 1.603 ± 0.199 | 3.526 ± 0.316 | 1.849 ± 0.245 | 2.219 ± 0.251 | 2.318 ± 0.231 | 4.882 ± 0.447 | 4.34 ± 0.419 | 3.65 ± 0.267 | 0.666 ± 0.124 | 2.712 ± 0.252 | 0.0 ± 0.0 |
| Cys | 0.222 ± 0.088 | 0.123 ± 0.083 | 0.395 ± 0.096 | 0.247 ± 0.084 | 0.197 ± 0.069 | 0.74 ± 0.147 | 0.222 ± 0.07 | 0.493 ± 0.109 | 0.542 ± 0.114 | 0.518 ± 0.116 | 0.099 ± 0.045 | 0.395 ± 0.094 | 0.419 ± 0.127 | 0.173 ± 0.062 | 0.173 ± 0.074 | 0.395 ± 0.097 | 0.296 ± 0.102 | 0.271 ± 0.083 | 0.049 ± 0.032 | 0.222 ± 0.082 | 0.0 ± 0.0 |
| Asp | 4.34 ± 0.369 | 0.518 ± 0.106 | 5.918 ± 0.519 | 4.217 ± 0.356 | 2.934 ± 0.286 | 4.562 ± 0.309 | 0.469 ± 0.094 | 5.573 ± 0.356 | 6.559 ± 0.464 | 6.214 ± 0.452 | 1.874 ± 0.197 | 5.499 ± 0.345 | 1.775 ± 0.212 | 1.356 ± 0.171 | 2.096 ± 0.287 | 5.573 ± 0.401 | 4.71 ± 0.404 | 4.118 ± 0.391 | 0.962 ± 0.133 | 3.724 ± 0.317 | 0.0 ± 0.0 |
| Glu | 4.414 ± 0.44 | 0.296 ± 0.094 | 4.513 ± 0.428 | 4.019 ± 0.482 | 2.022 ± 0.261 | 2.786 ± 0.286 | 1.184 ± 0.163 | 3.502 ± 0.309 | 3.403 ± 0.321 | 6.781 ± 0.481 | 1.356 ± 0.179 | 3.008 ± 0.295 | 1.652 ± 0.232 | 2.293 ± 0.247 | 2.269 ± 0.217 | 3.699 ± 0.307 | 3.107 ± 0.304 | 3.502 ± 0.282 | 0.518 ± 0.131 | 3.23 ± 0.35 | 0.0 ± 0.0 |
| Phe | 1.554 ± 0.168 | 0.222 ± 0.061 | 2.565 ± 0.245 | 1.8 ± 0.206 | 0.641 ± 0.123 | 1.874 ± 0.198 | 0.518 ± 0.109 | 2.293 ± 0.283 | 3.354 ± 0.252 | 2.491 ± 0.286 | 1.06 ± 0.136 | 3.181 ± 0.294 | 1.06 ± 0.177 | 1.036 ± 0.141 | 1.085 ± 0.188 | 3.008 ± 0.267 | 2.589 ± 0.264 | 2.466 ± 0.266 | 0.321 ± 0.088 | 1.554 ± 0.199 | 0.0 ± 0.0 |
| Gly | 3.699 ± 0.508 | 0.321 ± 0.102 | 4.019 ± 0.3 | 3.378 ± 0.316 | 2.417 ± 0.263 | 3.896 ± 0.361 | 1.011 ± 0.18 | 4.167 ± 0.36 | 5.03 ± 0.473 | 4.956 ± 0.36 | 1.726 ± 0.255 | 4.192 ± 0.402 | 0.518 ± 0.104 | 1.8 ± 0.178 | 1.775 ± 0.213 | 4.685 ± 0.661 | 4.636 ± 0.543 | 4.389 ± 0.392 | 0.789 ± 0.142 | 3.403 ± 0.288 | 0.0 ± 0.0 |
| His | 0.888 ± 0.119 | 0.173 ± 0.083 | 1.455 ± 0.216 | 0.74 ± 0.136 | 0.567 ± 0.129 | 0.912 ± 0.167 | 0.444 ± 0.088 | 0.962 ± 0.158 | 1.627 ± 0.23 | 1.258 ± 0.152 | 0.469 ± 0.092 | 1.233 ± 0.151 | 0.518 ± 0.118 | 0.321 ± 0.087 | 0.592 ± 0.108 | 1.134 ± 0.197 | 0.912 ± 0.159 | 0.937 ± 0.153 | 0.173 ± 0.07 | 0.937 ± 0.158 | 0.0 ± 0.0 |
| Ile | 3.871 ± 0.282 | 0.419 ± 0.115 | 5.4 ± 0.332 | 3.748 ± 0.313 | 1.184 ± 0.168 | 3.896 ± 0.297 | 0.838 ± 0.143 | 4.118 ± 0.32 | 6.041 ± 0.418 | 4.044 ± 0.352 | 1.652 ± 0.227 | 5.351 ± 0.418 | 2.441 ± 0.254 | 2.071 ± 0.174 | 2.244 ± 0.262 | 5.696 ± 0.358 | 4.784 ± 0.321 | 4.488 ± 0.353 | 0.321 ± 0.077 | 3.008 ± 0.237 | 0.0 ± 0.0 |
| Lys | 5.154 ± 0.412 | 0.567 ± 0.116 | 5.598 ± 0.42 | 5.721 ± 0.534 | 3.033 ± 0.319 | 4.932 ± 0.496 | 1.8 ± 0.245 | 4.217 ± 0.386 | 4.956 ± 0.467 | 8.187 ± 0.478 | 2.269 ± 0.194 | 4.439 ± 0.384 | 2.663 ± 0.318 | 3.576 ± 0.33 | 3.181 ± 0.306 | 5.4 ± 0.595 | 3.773 ± 0.278 | 6.091 ± 0.416 | 0.863 ± 0.16 | 3.896 ± 0.335 | 0.0 ± 0.0 |
| Leu | 5.795 ± 0.426 | 0.37 ± 0.106 | 6.559 ± 0.458 | 5.252 ± 0.377 | 3.378 ± 0.273 | 4.882 ± 0.362 | 1.307 ± 0.157 | 5.302 ± 0.444 | 5.943 ± 0.421 | 6.189 ± 0.534 | 2.293 ± 0.246 | 5.943 ± 0.341 | 2.491 ± 0.277 | 2.811 ± 0.246 | 2.663 ± 0.231 | 7.792 ± 0.453 | 5.672 ± 0.379 | 5.548 ± 0.319 | 0.814 ± 0.145 | 3.699 ± 0.33 | 0.0 ± 0.0 |
| Met | 1.899 ± 0.209 | 0.099 ± 0.053 | 1.43 ± 0.17 | 1.43 ± 0.209 | 0.937 ± 0.151 | 1.085 ± 0.165 | 0.247 ± 0.097 | 1.603 ± 0.199 | 2.318 ± 0.237 | 2.392 ± 0.244 | 0.345 ± 0.088 | 1.775 ± 0.197 | 0.641 ± 0.118 | 0.962 ± 0.159 | 0.789 ± 0.125 | 1.923 ± 0.243 | 1.578 ± 0.214 | 1.578 ± 0.164 | 0.148 ± 0.052 | 0.962 ± 0.151 | 0.0 ± 0.0 |
| Asn | 3.921 ± 0.347 | 0.345 ± 0.105 | 4.685 ± 0.352 | 3.378 ± 0.344 | 2.047 ± 0.247 | 4.932 ± 0.595 | 1.627 ± 0.226 | 4.143 ± 0.288 | 5.894 ± 0.378 | 5.055 ± 0.336 | 1.627 ± 0.214 | 4.488 ± 0.414 | 2.712 ± 0.271 | 2.885 ± 0.262 | 2.589 ± 0.237 | 4.71 ± 0.449 | 4.562 ± 0.37 | 3.748 ± 0.323 | 0.616 ± 0.112 | 3.699 ± 0.326 | 0.0 ± 0.0 |
| Pro | 1.701 ± 0.238 | 0.247 ± 0.086 | 1.973 ± 0.208 | 2.54 ± 0.295 | 1.159 ± 0.168 | 0.814 ± 0.126 | 0.567 ± 0.124 | 1.948 ± 0.194 | 2.663 ± 0.316 | 2.885 ± 0.268 | 0.395 ± 0.079 | 1.997 ± 0.229 | 0.345 ± 0.096 | 0.641 ± 0.15 | 1.011 ± 0.137 | 2.269 ± 0.255 | 1.997 ± 0.273 | 2.269 ± 0.282 | 0.271 ± 0.072 | 1.849 ± 0.228 | 0.0 ± 0.0 |
| Gln | 3.156 ± 0.291 | 0.197 ± 0.079 | 2.047 ± 0.186 | 2.17 ± 0.221 | 1.233 ± 0.185 | 1.751 ± 0.237 | 0.715 ± 0.149 | 2.071 ± 0.232 | 1.923 ± 0.22 | 3.748 ± 0.28 | 0.789 ± 0.162 | 1.627 ± 0.211 | 1.208 ± 0.232 | 1.578 ± 0.231 | 1.356 ± 0.184 | 2.614 ± 0.326 | 1.973 ± 0.229 | 2.441 ± 0.253 | 0.37 ± 0.083 | 1.701 ± 0.216 | 0.0 ± 0.0 |
| Arg | 2.047 ± 0.227 | 0.296 ± 0.082 | 2.244 ± 0.247 | 1.849 ± 0.221 | 1.504 ± 0.159 | 2.269 ± 0.333 | 0.542 ± 0.13 | 2.219 ± 0.235 | 2.54 ± 0.279 | 3.23 ± 0.259 | 1.085 ± 0.164 | 1.899 ± 0.229 | 0.962 ± 0.159 | 1.134 ± 0.174 | 1.381 ± 0.206 | 1.923 ± 0.197 | 2.121 ± 0.233 | 2.984 ± 0.273 | 0.419 ± 0.126 | 1.8 ± 0.262 | 0.0 ± 0.0 |
| Ser | 4.069 ± 0.329 | 0.222 ± 0.078 | 6.239 ± 0.387 | 3.871 ± 0.284 | 2.417 ± 0.266 | 5.4 ± 0.61 | 1.307 ± 0.197 | 5.203 ± 0.415 | 6.732 ± 0.54 | 6.214 ± 0.416 | 1.627 ± 0.208 | 5.03 ± 0.431 | 1.973 ± 0.244 | 3.008 ± 0.237 | 2.688 ± 0.271 | 6.51 ± 0.741 | 4.587 ± 0.587 | 5.178 ± 0.454 | 0.814 ± 0.136 | 3.773 ± 0.296 | 0.0 ± 0.0 |
| Thr | 3.304 ± 0.359 | 0.321 ± 0.088 | 4.587 ± 0.363 | 2.959 ± 0.287 | 2.663 ± 0.25 | 4.291 ± 0.474 | 0.789 ± 0.155 | 4.587 ± 0.34 | 5.055 ± 0.329 | 5.228 ± 0.406 | 1.282 ± 0.186 | 4.932 ± 0.383 | 2.293 ± 0.239 | 2.096 ± 0.179 | 1.603 ± 0.192 | 4.882 ± 0.622 | 5.795 ± 1.056 | 5.252 ± 0.442 | 0.74 ± 0.157 | 3.526 ± 0.288 | 0.0 ± 0.0 |
| Val | 4.019 ± 0.31 | 0.542 ± 0.119 | 5.302 ± 0.348 | 3.724 ± 0.367 | 2.219 ± 0.205 | 3.477 ± 0.296 | 0.789 ± 0.146 | 5.178 ± 0.411 | 5.277 ± 0.371 | 4.685 ± 0.382 | 1.282 ± 0.184 | 5.006 ± 0.377 | 2.17 ± 0.218 | 2.318 ± 0.246 | 2.17 ± 0.242 | 5.499 ± 0.327 | 5.055 ± 0.381 | 3.526 ± 0.378 | 0.542 ± 0.12 | 3.724 ± 0.252 | 0.0 ± 0.0 |
| Trp | 0.469 ± 0.111 | 0.099 ± 0.055 | 0.616 ± 0.117 | 0.666 ± 0.118 | 0.493 ± 0.113 | 0.715 ± 0.122 | 0.247 ± 0.085 | 0.592 ± 0.141 | 0.641 ± 0.114 | 1.258 ± 0.205 | 0.222 ± 0.077 | 0.616 ± 0.125 | 0.099 ± 0.051 | 0.321 ± 0.09 | 0.321 ± 0.092 | 0.641 ± 0.137 | 0.469 ± 0.104 | 0.863 ± 0.138 | 0.099 ± 0.05 | 0.69 ± 0.121 | 0.0 ± 0.0 |
| Tyr | 3.033 ± 0.266 | 0.616 ± 0.143 | 3.625 ± 0.338 | 2.466 ± 0.306 | 1.677 ± 0.191 | 3.255 ± 0.276 | 1.036 ± 0.164 | 3.452 ± 0.284 | 3.477 ± 0.324 | 4.192 ± 0.356 | 1.06 ± 0.159 | 3.551 ± 0.336 | 1.751 ± 0.226 | 2.071 ± 0.209 | 2.145 ± 0.236 | 3.625 ± 0.47 | 3.23 ± 0.348 | 3.206 ± 0.259 | 0.641 ± 0.13 | 2.614 ± 0.278 | 0.0 ± 0.0 |
| Xaa | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 |

Statistics based on 177 proteins (40554 amino acids)

Note: The error was estimated by bootstrapping (x100) at the protein level.
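A protein-level bootstrap of this kind can be sketched as follows. This is only an illustration under stated assumptions, not the database's implementation: it reads "x100" as 100 resamples, resamples whole proteins with replacement, and reports the standard deviation across resamples as the error. The function names, the seed, and the toy sequences are hypothetical.

```python
import random
import statistics

def pair_per_mille(proteins, pair):
    """Frequency of one dipeptide (e.g. "KL") in per mille of all dipeptides."""
    hits = total = 0
    for seq in proteins:
        for i in range(len(seq) - 1):
            total += 1
            hits += seq[i:i + 2] == pair
    return 1000.0 * hits / total if total else 0.0

def bootstrap_error(proteins, pair, n_resamples=100, seed=0):
    """Standard deviation of the frequency over protein-level bootstrap resamples."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    replicates = []
    for _ in range(n_resamples):
        # Resample whole proteins with replacement, keeping the proteome size.
        sample = rng.choices(proteins, k=len(proteins))
        replicates.append(pair_per_mille(sample, pair))
    return statistics.stdev(replicates)

toy = ["MKLLA", "MKAKL", "AAKLW"]  # made-up sequences, for illustration only
err = bootstrap_error(toy, "KL")
```

Resampling proteins rather than individual residues keeps each sequence intact, so the estimate reflects variation between proteins in the proteome.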

The dipeptide statistics above (along with other statistics for this proteome) can be downloaded from this CSV file
See this proteome in: uniprot_link
Proteome-pI is available under the Creative Commons Attribution-NoDerivs license; for more details see here

Reference: Kozlowski LP. Proteome-pI 2.0: Proteome Isoelectric Point Database Update. Nucleic Acids Res. 2021, doi: 10.1093/nar/gkab944
Contact: Lukasz P. Kozlowski