Amino acid dipeptide frequency for Gordonia phage Sixama

Apart from single amino acid frequencies, one can also calculate the so-called amino acid dipeptide frequency.

There are 400 possible dipeptides (441 if Xaa is counted as an additional 21st amino acid representing all non-standard or unknown amino acids). Thus, if dipeptides occurred randomly and with equal probability in proteins, each of the 400 dipeptides would be present at a frequency of 0.25% (2.5‰). As this is not the case in nature, for better visibility the dipeptides that are more frequent than expected are marked in red in the table, and those that are underrepresented are marked in blue.
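The comparison described above can be sketched in a few lines of Python. This is only an illustration, not the code used by Proteome-pI: the function names, the choice to count overlapping residue pairs within each protein, the normalization by the total number of pairs, and the mapping of every non-standard letter to X (Xaa) are all assumptions made for the example.

```python
from collections import Counter

STANDARD = "ACDEFGHIKLMNPQRSTVWY"   # the 20 standard amino acids (one-letter codes)
EXPECTED_PER_MILLE = 1000.0 / 400   # uniform expectation: 0.25% = 2.5 per mille


def dipeptide_per_mille(proteins):
    """Return {dipeptide: frequency in per mille} for a list of protein sequences.

    Assumptions (not taken from the source page): overlapping pairs are
    counted inside each protein only, and any non-standard letter is
    treated as X (Xaa).
    """
    counts = Counter()
    for seq in proteins:
        seq = "".join(c if c in STANDARD else "X" for c in seq.upper())
        counts.update(seq[i:i + 2] for i in range(len(seq) - 1))
    total = sum(counts.values())
    if total == 0:
        return {}
    return {dp: 1000.0 * n / total for dp, n in counts.items()}


def classify(freq_per_mille):
    """Label a frequency relative to the uniform 2.5 per mille expectation."""
    if freq_per_mille > EXPECTED_PER_MILLE:
        return "over"    # would be marked red in the table
    if freq_per_mille < EXPECTED_PER_MILLE:
        return "under"   # would be marked blue in the table
    return "expected"
```

With the values from the table, `classify(8.571)` (AlaAla) gives "over" and `classify(0.571)` (AlaCys) gives "under".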

All values are given in per mille (‰), so they need to be multiplied by 10⁻³ to obtain fractions.
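As a small worked example of the unit conversion, the helpers below turn a per mille value from the table into a fraction or a percentage. The function names are hypothetical and exist only for this illustration.

```python
def per_mille_to_fraction(value):
    """Convert a per mille value to a plain fraction (multiply by 10^-3)."""
    return value * 1e-3


def per_mille_to_percent(value):
    """Convert a per mille value to a percentage (divide by 10)."""
    return value / 10.0


# AlaAla is listed as 8.571 per mille in the table.
ala_ala_fraction = per_mille_to_fraction(8.571)
ala_ala_percent = per_mille_to_percent(8.571)
```

The same conversion shows that the uniform expectation of 2.5‰ corresponds to 0.25%.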

For more information, see the sequence space article on Wikipedia.

Ala Cys Asp Glu Phe Gly His Ile Lys Leu Met Asn Pro Gln Arg Ser Thr Val Trp Tyr Xaa
Ala
AlaAla: 8.571 ± 0.656
AlaCys: 0.571 ± 0.121
AlaAsp: 4.743 ± 0.382
AlaGlu: 6.4 ± 0.521
AlaPhe: 2.943 ± 0.309
AlaGly: 6.943 ± 0.715
AlaHis: 1.371 ± 0.173
AlaIle: 5.771 ± 0.574
AlaLys: 5.457 ± 0.424
AlaLeu: 8.086 ± 0.563
AlaMet: 2.686 ± 0.3
AlaAsn: 3.486 ± 0.327
AlaPro: 4.543 ± 0.444
AlaGln: 2.971 ± 0.308
AlaArg: 4.771 ± 0.35
AlaSer: 4.971 ± 0.479
AlaThr: 6.057 ± 0.534
AlaVal: 6.086 ± 0.451
AlaTrp: 1.6 ± 0.21
AlaTyr: 2.429 ± 0.249
AlaXaa: 0.0 ± 0.0
Cys
CysAla: 0.886 ± 0.188
CysCys: 0.143 ± 0.069
CysAsp: 0.686 ± 0.154
CysGlu: 0.8 ± 0.189
CysPhe: 0.229 ± 0.088
CysGly: 1.4 ± 0.234
CysHis: 0.171 ± 0.074
CysIle: 0.4 ± 0.114
CysLys: 0.429 ± 0.123
CysLeu: 0.657 ± 0.152
CysMet: 0.171 ± 0.068
CysAsn: 0.371 ± 0.108
CysPro: 0.743 ± 0.17
CysGln: 0.2 ± 0.08
CysArg: 0.571 ± 0.129
CysSer: 0.657 ± 0.156
CysThr: 0.686 ± 0.142
CysVal: 0.686 ± 0.137
CysTrp: 0.057 ± 0.045
CysTyr: 0.314 ± 0.094
CysXaa: 0.0 ± 0.0
Asp
AspAla: 6.457 ± 0.584
AspCys: 0.657 ± 0.162
AspAsp: 5.0 ± 0.501
AspGlu: 4.343 ± 0.372
AspPhe: 2.171 ± 0.275
AspGly: 4.2 ± 0.482
AspHis: 1.371 ± 0.227
AspIle: 3.371 ± 0.314
AspLys: 3.343 ± 0.29
AspLeu: 5.171 ± 0.392
AspMet: 1.657 ± 0.25
AspAsn: 2.914 ± 0.272
AspPro: 3.257 ± 0.317
AspGln: 2.171 ± 0.277
AspArg: 3.457 ± 0.342
AspSer: 3.771 ± 0.325
AspThr: 3.429 ± 0.33
AspVal: 4.886 ± 0.409
AspTrp: 1.371 ± 0.235
AspTyr: 2.114 ± 0.219
AspXaa: 0.0 ± 0.0
Glu
GluAla: 5.857 ± 0.445
GluCys: 0.657 ± 0.146
GluAsp: 3.971 ± 0.318
GluGlu: 4.257 ± 0.363
GluPhe: 2.314 ± 0.221
GluGly: 3.571 ± 0.364
GluHis: 1.629 ± 0.227
GluIle: 3.8 ± 0.326
GluLys: 3.486 ± 0.358
GluLeu: 5.257 ± 0.395
GluMet: 1.8 ± 0.194
GluAsn: 2.257 ± 0.294
GluPro: 2.429 ± 0.262
GluGln: 2.629 ± 0.247
GluArg: 4.171 ± 0.422
GluSer: 3.6 ± 0.318
GluThr: 3.714 ± 0.391
GluVal: 4.314 ± 0.343
GluTrp: 1.343 ± 0.222
GluTyr: 2.429 ± 0.302
GluXaa: 0.0 ± 0.0
Phe
PheAla: 2.686 ± 0.292
PheCys: 0.171 ± 0.074
PheAsp: 2.629 ± 0.262
PheGlu: 1.943 ± 0.254
PhePhe: 0.886 ± 0.216
PheGly: 2.686 ± 0.272
PheHis: 0.371 ± 0.099
PheIle: 1.543 ± 0.214
PheLys: 1.371 ± 0.194
PheLeu: 2.143 ± 0.235
PheMet: 0.886 ± 0.15
PheAsn: 1.314 ± 0.181
PhePro: 1.571 ± 0.23
PheGln: 0.8 ± 0.169
PheArg: 1.571 ± 0.234
PheSer: 1.886 ± 0.225
PheThr: 2.114 ± 0.246
PheVal: 2.457 ± 0.236
PheTrp: 0.543 ± 0.131
PheTyr: 0.943 ± 0.162
PheXaa: 0.0 ± 0.0
Gly
GlyAla: 5.857 ± 0.58
GlyCys: 0.8 ± 0.161
GlyAsp: 3.886 ± 0.349
GlyGlu: 4.057 ± 0.292
GlyPhe: 2.4 ± 0.321
GlyGly: 5.029 ± 0.524
GlyHis: 1.657 ± 0.23
GlyIle: 4.514 ± 0.494
GlyLys: 4.286 ± 0.473
GlyLeu: 5.314 ± 0.422
GlyMet: 2.086 ± 0.229
GlyAsn: 2.743 ± 0.271
GlyPro: 3.057 ± 0.285
GlyGln: 2.171 ± 0.235
GlyArg: 4.143 ± 0.429
GlySer: 4.8 ± 0.518
GlyThr: 4.857 ± 0.427
GlyVal: 4.543 ± 0.394
GlyTrp: 1.514 ± 0.195
GlyTyr: 2.429 ± 0.24
GlyXaa: 0.0 ± 0.0
His
HisAla: 1.971 ± 0.261
HisCys: 0.171 ± 0.08
HisAsp: 1.4 ± 0.204
HisGlu: 1.257 ± 0.235
HisPhe: 0.771 ± 0.132
HisGly: 1.286 ± 0.19
HisHis: 0.571 ± 0.107
HisIle: 1.143 ± 0.211
HisLys: 1.314 ± 0.195
HisLeu: 1.686 ± 0.231
HisMet: 0.457 ± 0.119
HisAsn: 0.914 ± 0.161
HisPro: 1.629 ± 0.213
HisGln: 0.686 ± 0.143
HisArg: 1.657 ± 0.335
HisSer: 1.257 ± 0.216
HisThr: 1.314 ± 0.254
HisVal: 1.743 ± 0.254
HisTrp: 0.429 ± 0.119
HisTyr: 0.686 ± 0.129
HisXaa: 0.0 ± 0.0
Ile
IleAla: 5.629 ± 0.377
IleCys: 0.886 ± 0.183
IleAsp: 3.686 ± 0.28
IleGlu: 3.743 ± 0.319
IlePhe: 1.657 ± 0.2
IleGly: 3.371 ± 0.426
IleHis: 0.971 ± 0.159
IleIle: 2.4 ± 0.265
IleLys: 3.286 ± 0.376
IleLeu: 4.371 ± 0.453
IleMet: 0.971 ± 0.17
IleAsn: 2.143 ± 0.217
IlePro: 2.143 ± 0.219
IleGln: 2.057 ± 0.246
IleArg: 4.2 ± 0.33
IleSer: 3.714 ± 0.448
IleThr: 3.229 ± 0.323
IleVal: 3.829 ± 0.341
IleTrp: 1.143 ± 0.168
IleTyr: 1.429 ± 0.176
IleXaa: 0.0 ± 0.0
Lys
LysAla: 5.257 ± 0.37
LysCys: 0.571 ± 0.177
LysAsp: 3.229 ± 0.279
LysGlu: 3.257 ± 0.285
LysPhe: 1.6 ± 0.24
LysGly: 2.571 ± 0.279
LysHis: 1.686 ± 0.256
LysIle: 3.0 ± 0.399
LysLys: 4.057 ± 0.408
LysLeu: 4.657 ± 0.384
LysMet: 1.371 ± 0.195
LysAsn: 2.543 ± 0.328
LysPro: 3.686 ± 0.387
LysGln: 2.029 ± 0.273
LysArg: 3.8 ± 0.319
LysSer: 3.543 ± 0.359
LysThr: 3.029 ± 0.363
LysVal: 3.2 ± 0.32
LysTrp: 1.229 ± 0.159
LysTyr: 1.571 ± 0.229
LysXaa: 0.0 ± 0.0
Leu
LeuAla: 7.057 ± 0.384
LeuCys: 0.571 ± 0.152
LeuAsp: 5.829 ± 0.388
LeuGlu: 5.371 ± 0.422
LeuPhe: 1.943 ± 0.239
LeuGly: 5.571 ± 0.494
LeuHis: 1.6 ± 0.235
LeuIle: 4.057 ± 0.304
LeuLys: 4.571 ± 0.387
LeuLeu: 5.8 ± 0.481
LeuMet: 1.886 ± 0.235
LeuAsn: 3.086 ± 0.302
LeuPro: 3.8 ± 0.326
LeuGln: 2.2 ± 0.22
LeuArg: 5.114 ± 0.422
LeuSer: 4.857 ± 0.324
LeuThr: 5.543 ± 0.334
LeuVal: 5.657 ± 0.393
LeuTrp: 1.514 ± 0.192
LeuTyr: 2.343 ± 0.255
LeuXaa: 0.0 ± 0.0
Met
MetAla: 2.571 ± 0.257
MetCys: 0.343 ± 0.098
MetAsp: 1.171 ± 0.185
MetGlu: 1.057 ± 0.173
MetPhe: 0.543 ± 0.139
MetGly: 1.314 ± 0.213
MetHis: 0.629 ± 0.163
MetIle: 1.571 ± 0.198
MetLys: 1.514 ± 0.19
MetLeu: 1.886 ± 0.229
MetMet: 0.657 ± 0.141
MetAsn: 1.2 ± 0.187
MetPro: 1.229 ± 0.189
MetGln: 0.857 ± 0.21
MetArg: 1.286 ± 0.164
MetSer: 2.457 ± 0.283
MetThr: 2.6 ± 0.304
MetVal: 1.057 ± 0.162
MetTrp: 0.371 ± 0.091
MetTyr: 0.486 ± 0.112
MetXaa: 0.0 ± 0.0
Asn
AsnAla: 3.2 ± 0.354
AsnCys: 0.543 ± 0.115
AsnAsp: 2.886 ± 0.324
AsnGlu: 2.629 ± 0.252
AsnPhe: 1.2 ± 0.186
AsnGly: 3.657 ± 0.374
AsnHis: 0.914 ± 0.177
AsnIle: 1.714 ± 0.194
AsnLys: 2.314 ± 0.269
AsnLeu: 2.857 ± 0.27
AsnMet: 0.629 ± 0.153
AsnAsn: 2.114 ± 0.315
AsnPro: 2.2 ± 0.252
AsnGln: 1.514 ± 0.224
AsnArg: 2.457 ± 0.305
AsnSer: 2.571 ± 0.29
AsnThr: 2.314 ± 0.252
AsnVal: 2.743 ± 0.33
AsnTrp: 0.886 ± 0.155
AsnTyr: 1.229 ± 0.162
AsnXaa: 0.0 ± 0.0
Pro
ProAla: 4.971 ± 0.448
ProCys: 0.571 ± 0.158
ProAsp: 3.114 ± 0.393
ProGlu: 3.429 ± 0.276
ProPhe: 1.343 ± 0.187
ProGly: 4.257 ± 0.338
ProHis: 1.2 ± 0.207
ProIle: 2.486 ± 0.277
ProLys: 3.314 ± 0.372
ProLeu: 3.229 ± 0.295
ProMet: 1.057 ± 0.222
ProAsn: 2.057 ± 0.225
ProPro: 3.057 ± 0.39
ProGln: 1.6 ± 0.195
ProArg: 2.429 ± 0.3
ProSer: 2.914 ± 0.316
ProThr: 3.286 ± 0.37
ProVal: 3.743 ± 0.323
ProTrp: 0.714 ± 0.138
ProTyr: 1.571 ± 0.247
ProXaa: 0.0 ± 0.0
Gln
GlnAla: 3.314 ± 0.275
GlnCys: 0.2 ± 0.072
GlnAsp: 1.771 ± 0.199
GlnGlu: 1.829 ± 0.233
GlnPhe: 0.857 ± 0.151
GlnGly: 2.286 ± 0.265
GlnHis: 0.657 ± 0.155
GlnIle: 2.057 ± 0.283
GlnLys: 1.514 ± 0.165
GlnLeu: 3.143 ± 0.35
GlnMet: 1.257 ± 0.185
GlnAsn: 1.229 ± 0.205
GlnPro: 1.714 ± 0.24
GlnGln: 1.457 ± 0.206
GlnArg: 2.257 ± 0.295
GlnSer: 2.114 ± 0.258
GlnThr: 1.943 ± 0.208
GlnVal: 2.171 ± 0.247
GlnTrp: 0.543 ± 0.133
GlnTyr: 1.171 ± 0.184
GlnXaa: 0.0 ± 0.0
Arg
ArgAla: 4.8 ± 0.413
ArgCys: 0.686 ± 0.167
ArgAsp: 4.314 ± 0.404
ArgGlu: 4.229 ± 0.457
ArgPhe: 2.0 ± 0.232
ArgGly: 3.686 ± 0.354
ArgHis: 1.429 ± 0.224
ArgIle: 3.2 ± 0.345
ArgLys: 3.943 ± 0.376
ArgLeu: 4.886 ± 0.367
ArgMet: 1.457 ± 0.199
ArgAsn: 2.2 ± 0.326
ArgPro: 2.914 ± 0.363
ArgGln: 2.429 ± 0.309
ArgArg: 4.886 ± 0.43
ArgSer: 3.657 ± 0.342
ArgThr: 3.143 ± 0.406
ArgVal: 3.486 ± 0.265
ArgTrp: 1.286 ± 0.204
ArgTyr: 2.8 ± 0.372
ArgXaa: 0.0 ± 0.0
Ser
SerAla: 5.4 ± 0.456
SerCys: 0.514 ± 0.134
SerAsp: 4.314 ± 0.377
SerGlu: 3.771 ± 0.36
SerPhe: 1.686 ± 0.213
SerGly: 5.543 ± 0.487
SerHis: 1.486 ± 0.215
SerIle: 3.229 ± 0.32
SerLys: 2.943 ± 0.308
SerLeu: 4.857 ± 0.392
SerMet: 2.0 ± 0.204
SerAsn: 2.857 ± 0.334
SerPro: 2.771 ± 0.274
SerGln: 1.914 ± 0.237
SerArg: 3.229 ± 0.343
SerSer: 5.057 ± 0.568
SerThr: 4.457 ± 0.423
SerVal: 3.857 ± 0.393
SerTrp: 1.629 ± 0.23
SerTyr: 2.0 ± 0.211
SerXaa: 0.0 ± 0.0
Thr
ThrAla: 5.8 ± 0.436
ThrCys: 0.743 ± 0.159
ThrAsp: 4.057 ± 0.301
ThrGlu: 3.8 ± 0.368
ThrPhe: 2.4 ± 0.293
ThrGly: 5.286 ± 0.354
ThrHis: 1.257 ± 0.173
ThrIle: 3.457 ± 0.317
ThrLys: 2.971 ± 0.24
ThrLeu: 4.743 ± 0.343
ThrMet: 1.2 ± 0.179
ThrAsn: 2.314 ± 0.254
ThrPro: 4.371 ± 0.393
ThrGln: 1.857 ± 0.197
ThrArg: 4.057 ± 0.388
ThrSer: 3.857 ± 0.404
ThrThr: 4.171 ± 0.434
ThrVal: 4.229 ± 0.377
ThrTrp: 1.657 ± 0.21
ThrTyr: 2.0 ± 0.179
ThrXaa: 0.0 ± 0.0
Val
ValAla: 5.943 ± 0.33
ValCys: 0.714 ± 0.168
ValAsp: 4.829 ± 0.341
ValGlu: 4.4 ± 0.326
ValPhe: 2.314 ± 0.289
ValGly: 3.629 ± 0.306
ValHis: 1.829 ± 0.224
ValIle: 4.457 ± 0.432
ValLys: 2.829 ± 0.32
ValLeu: 5.571 ± 0.441
ValMet: 1.543 ± 0.229
ValAsn: 2.629 ± 0.251
ValPro: 3.0 ± 0.277
ValGln: 2.514 ± 0.28
ValArg: 3.743 ± 0.327
ValSer: 4.457 ± 0.433
ValThr: 4.571 ± 0.394
ValVal: 5.657 ± 0.451
ValTrp: 1.257 ± 0.256
ValTyr: 2.2 ± 0.23
ValXaa: 0.0 ± 0.0
Trp
TrpAla: 1.571 ± 0.219
TrpCys: 0.314 ± 0.104
TrpAsp: 1.257 ± 0.239
TrpGlu: 1.114 ± 0.199
TrpPhe: 0.514 ± 0.128
TrpGly: 1.343 ± 0.188
TrpHis: 0.8 ± 0.191
TrpIle: 1.171 ± 0.174
TrpLys: 1.2 ± 0.187
TrpLeu: 1.543 ± 0.221
TrpMet: 0.457 ± 0.131
TrpAsn: 1.057 ± 0.211
TrpPro: 0.771 ± 0.161
TrpGln: 0.543 ± 0.11
TrpArg: 1.486 ± 0.161
TrpSer: 1.4 ± 0.202
TrpThr: 1.257 ± 0.201
TrpVal: 1.314 ± 0.2
TrpTrp: 0.514 ± 0.128
TrpTyr: 0.686 ± 0.143
TrpXaa: 0.0 ± 0.0
Tyr
TyrAla: 2.657 ± 0.291
TyrCys: 0.429 ± 0.119
TyrAsp: 2.343 ± 0.29
TyrGlu: 1.8 ± 0.261
TyrPhe: 0.743 ± 0.166
TyrGly: 2.514 ± 0.256
TyrHis: 0.743 ± 0.139
TyrIle: 1.514 ± 0.223
TyrLys: 1.743 ± 0.208
TyrLeu: 2.514 ± 0.249
TyrMet: 0.543 ± 0.119
TyrAsn: 1.143 ± 0.187
TyrPro: 1.457 ± 0.212
TyrGln: 0.886 ± 0.165
TyrArg: 2.143 ± 0.276
TyrSer: 2.0 ± 0.211
TyrThr: 2.514 ± 0.333
TyrVal: 2.429 ± 0.264
TyrTrp: 0.714 ± 0.161
TyrTyr: 0.971 ± 0.156
TyrXaa: 0.0 ± 0.0
Xaa
XaaAla: 0.0 ± 0.0
XaaCys: 0.0 ± 0.0
XaaAsp: 0.0 ± 0.0
XaaGlu: 0.0 ± 0.0
XaaPhe: 0.0 ± 0.0
XaaGly: 0.0 ± 0.0
XaaHis: 0.0 ± 0.0
XaaIle: 0.0 ± 0.0
XaaLys: 0.0 ± 0.0
XaaLeu: 0.0 ± 0.0
XaaMet: 0.0 ± 0.0
XaaAsn: 0.0 ± 0.0
XaaPro: 0.0 ± 0.0
XaaGln: 0.0 ± 0.0
XaaArg: 0.0 ± 0.0
XaaSer: 0.0 ± 0.0
XaaThr: 0.0 ± 0.0
XaaVal: 0.0 ± 0.0
XaaTrp: 0.0 ± 0.0
XaaTyr: 0.0 ± 0.0
XaaXaa: 0.0 ± 0.0
Statistics based on 168 proteins (35001 amino acids)

Note: The errors were estimated by bootstrapping (100 resamples) at the protein level.
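A protein-level bootstrap of this kind might look like the sketch below. The page states only that bootstrapping (x100) was done at the protein level. The rest is assumed for illustration: whole proteins are resampled with replacement, the dipeptide frequency is recomputed for each resample, and the reported error is taken as the standard deviation across resamples. The function names and the fixed seed are likewise hypothetical.

```python
import random
import statistics
from collections import Counter


def dipeptide_counts(seq):
    """Count overlapping dipeptides in one protein sequence."""
    return Counter(seq[i:i + 2] for i in range(len(seq) - 1))


def bootstrap_per_mille(proteins, dipeptide, n_boot=100, seed=0):
    """Return (mean, standard deviation) of a dipeptide frequency in per mille.

    Assumption: proteins, not residues, are the resampling unit, and the
    spread of the resampled frequencies is used as the error estimate.
    """
    rng = random.Random(seed)
    per_protein = [dipeptide_counts(p) for p in proteins]
    estimates = []
    for _ in range(n_boot):
        # Draw as many proteins as in the original set, with replacement.
        sample = rng.choices(per_protein, k=len(per_protein))
        hits = sum(c[dipeptide] for c in sample)
        total = sum(sum(c.values()) for c in sample)
        estimates.append(1000.0 * hits / total if total else 0.0)
    return statistics.mean(estimates), statistics.stdev(estimates)
```

Resampling whole proteins rather than individual dipeptides keeps the dependence between neighbouring residues within a protein intact, which is presumably why the error is described as being estimated at the protein level.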

The dipeptide statistics above (along with other statistics for this proteome) can be downloaded from this CSV file
See this proteome in: uniprot_link
Proteome-pI is available under Creative Commons Attribution-NoDerivs license, for more details see here

Reference: Kozlowski LP. Proteome-pI 2.0: Proteome Isoelectric Point Database Update. Nucleic Acids Res. 2021, doi: 10.1093/nar/gkab944
Contact: Lukasz P. Kozlowski