Amino acid dipeptide frequency for Streptomyces phage Gilson

Apart from single amino acid frequencies, one can also calculate the so-called amino acid dipeptide frequency, i.e. how often each pair of adjacent amino acids occurs.
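As a rough illustration of how such frequencies can be obtained, the sketch below counts overlapping pairs of adjacent residues in a list of protein sequences. It is not the database's own code: the function name `dipeptide_frequencies`, the use of one-letter codes, and the normalization by the total number of dipeptides are assumptions made for this example.

```python
from collections import Counter


def dipeptide_frequencies(proteins):
    """Return {dipeptide: frequency in per mille} for a list of sequences.

    Hypothetical helper: normalizes by the total number of dipeptides,
    which may differ from the normalization used for the table.
    """
    counts = Counter()
    for seq in proteins:
        seq = seq.upper()
        # Overlapping windows of length 2; pairs never span two proteins.
        for i in range(len(seq) - 1):
            counts[seq[i:i + 2]] += 1
    total = sum(counts.values())
    if total == 0:
        return {}
    return {pair: 1000.0 * n / total for pair, n in counts.items()}


# Toy input (not real phage proteins): MK, KK, KA from the first
# sequence and AK, KK from the second, i.e. five dipeptides in total.
toy = ["MKKA", "AKK"]
freqs = dipeptide_frequencies(toy)
for pair in sorted(freqs):
    print(pair, round(freqs[pair], 3))
```

Counting per protein keeps the last residue of one sequence from being paired with the first residue of the next.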

There are 400 possible dipeptides (441 if Xaa is counted as an additional 21st amino acid representing all non-standard or unknown amino acids). Thus, if all dipeptides occurred randomly in proteins with equal probability, each would be present at a frequency of 0.25%. As this is not the case in nature, for better visibility all dipeptides that are more frequent than expected are marked in red, and those that are underrepresented are marked in blue in the table.

All values are given in per mille (‰) and therefore need to be multiplied by 10⁻³ to obtain fractions.
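The arithmetic above can be sketched in a few lines: 1/400 equals 0.25%, which is 2.5‰, and a table value is turned into a fraction by multiplying by 10⁻³. The names `UNIFORM_PER_MILLE`, `per_mille_to_fraction` and `representation` are hypothetical, and the simple greater-than or less-than comparison against the uniform expectation is an assumption; the exact rule used for the colouring is not stated here.

```python
# Uniform expectation for 20 x 20 dipeptides: 1/400 = 0.25% = 2.5 per mille.
UNIFORM_PER_MILLE = 1000.0 / 400


def per_mille_to_fraction(value):
    """Convert a per mille value from the table into a plain fraction."""
    return value * 1e-3


def representation(value_per_mille, expected=UNIFORM_PER_MILLE):
    """Classify a value against the uniform expectation (assumed rule)."""
    if value_per_mille > expected:
        return "over"
    if value_per_mille < expected:
        return "under"
    return "as expected"


# Two values taken from the table: AlaAla and CysCys.
examples = {"AlaAla": 7.403, "CysCys": 0.194}
for name, value in examples.items():
    print(name, per_mille_to_fraction(value), representation(value))
```

With this rule AlaAla (7.403‰) falls above the 2.5‰ baseline and CysCys (0.194‰) below it.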

For more information, see the sequence space article on Wikipedia.

Dipeptide frequencies (‰, value ± error); rows give the first residue, columns the second.

| | Ala | Cys | Asp | Glu | Phe | Gly | His | Ile | Lys | Leu | Met | Asn | Pro | Gln | Arg | Ser | Thr | Val | Trp | Tyr | Xaa |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Ala | 7.403 ± 0.992 | 0.693 ± 0.155 | 4.991 ± 0.322 | 6.044 ± 0.531 | 3.438 ± 0.293 | 6.266 ± 0.521 | 1.386 ± 0.175 | 4.103 ± 0.401 | 4.935 ± 0.429 | 6.294 ± 0.52 | 2.634 ± 0.337 | 3.743 ± 0.437 | 2.717 ± 0.31 | 2.828 ± 0.491 | 4.076 ± 0.44 | 3.965 ± 0.438 | 4.991 ± 0.588 | 5.656 ± 0.433 | 1.525 ± 0.2 | 2.662 ± 0.254 | 0.0 ± 0.0 |
| Cys | 0.638 ± 0.135 | 0.194 ± 0.085 | 0.776 ± 0.155 | 0.693 ± 0.191 | 0.36 ± 0.123 | 1.303 ± 0.227 | 0.222 ± 0.085 | 0.416 ± 0.109 | 0.749 ± 0.155 | 0.776 ± 0.137 | 0.36 ± 0.105 | 0.555 ± 0.153 | 0.499 ± 0.133 | 0.388 ± 0.104 | 0.61 ± 0.145 | 0.638 ± 0.177 | 0.277 ± 0.079 | 0.665 ± 0.164 | 0.139 ± 0.059 | 0.305 ± 0.097 | 0.0 ± 0.0 |
| Asp | 5.434 ± 0.452 | 0.887 ± 0.199 | 3.771 ± 0.363 | 5.545 ± 0.477 | 2.939 ± 0.328 | 5.989 ± 0.462 | 1.026 ± 0.175 | 3.078 ± 0.304 | 3.41 ± 0.276 | 4.464 ± 0.364 | 1.969 ± 0.264 | 3.383 ± 0.305 | 2.329 ± 0.284 | 1.469 ± 0.203 | 2.301 ± 0.288 | 3.438 ± 0.338 | 3.549 ± 0.303 | 5.102 ± 0.449 | 1.414 ± 0.17 | 2.662 ± 0.277 | 0.0 ± 0.0 |
| Glu | 5.24 ± 0.543 | 0.638 ± 0.15 | 4.353 ± 0.461 | 5.573 ± 0.652 | 3.05 ± 0.324 | 3.993 ± 0.301 | 1.331 ± 0.232 | 3.965 ± 0.382 | 5.24 ± 0.418 | 5.046 ± 0.385 | 2.218 ± 0.243 | 3.189 ± 0.287 | 1.774 ± 0.213 | 2.717 ± 0.294 | 4.298 ± 0.445 | 3.882 ± 0.376 | 3.577 ± 0.328 | 5.018 ± 0.467 | 1.303 ± 0.203 | 3.688 ± 0.413 | 0.0 ± 0.0 |
| Phe | 2.911 ± 0.366 | 0.36 ± 0.125 | 3.189 ± 0.309 | 2.967 ± 0.298 | 1.414 ± 0.224 | 2.856 ± 0.289 | 1.026 ± 0.166 | 1.941 ± 0.197 | 2.218 ± 0.23 | 3.133 ± 0.364 | 0.887 ± 0.161 | 1.913 ± 0.193 | 1.081 ± 0.166 | 1.081 ± 0.179 | 2.246 ± 0.234 | 2.967 ± 0.384 | 2.163 ± 0.276 | 2.884 ± 0.304 | 0.665 ± 0.145 | 1.553 ± 0.22 | 0.0 ± 0.0 |
| Gly | 4.991 ± 0.382 | 0.832 ± 0.166 | 4.353 ± 0.35 | 4.353 ± 0.365 | 3.577 ± 0.318 | 5.933 ± 0.474 | 1.691 ± 0.19 | 4.686 ± 0.356 | 4.686 ± 0.378 | 5.933 ± 0.403 | 2.828 ± 0.276 | 3.965 ± 0.319 | 2.634 ± 0.299 | 1.969 ± 0.278 | 4.713 ± 0.404 | 4.963 ± 0.493 | 4.686 ± 0.711 | 5.434 ± 0.419 | 2.079 ± 0.303 | 3.632 ± 0.331 | 0.0 ± 0.0 |
| His | 1.22 ± 0.197 | 0.166 ± 0.074 | 1.109 ± 0.223 | 1.22 ± 0.209 | 0.97 ± 0.172 | 1.664 ± 0.226 | 0.305 ± 0.089 | 1.081 ± 0.177 | 1.248 ± 0.215 | 1.275 ± 0.214 | 0.416 ± 0.106 | 1.026 ± 0.174 | 0.721 ± 0.125 | 0.471 ± 0.12 | 1.303 ± 0.215 | 1.081 ± 0.186 | 0.749 ± 0.146 | 1.691 ± 0.25 | 0.444 ± 0.104 | 0.749 ± 0.167 | 0.0 ± 0.0 |
| Ile | 4.741 ± 0.375 | 0.527 ± 0.115 | 4.298 ± 0.469 | 3.826 ± 0.312 | 1.469 ± 0.206 | 4.103 ± 0.304 | 0.943 ± 0.181 | 2.551 ± 0.258 | 3.909 ± 0.297 | 3.355 ± 0.342 | 1.359 ± 0.151 | 2.44 ± 0.303 | 2.246 ± 0.258 | 2.052 ± 0.261 | 3.105 ± 0.296 | 2.911 ± 0.286 | 2.939 ± 0.323 | 4.131 ± 0.374 | 0.721 ± 0.151 | 1.747 ± 0.214 | 0.0 ± 0.0 |
| Lys | 5.601 ± 0.465 | 0.665 ± 0.143 | 3.466 ± 0.326 | 3.383 ± 0.376 | 2.523 ± 0.228 | 3.882 ± 0.38 | 1.026 ± 0.201 | 3.078 ± 0.279 | 4.159 ± 0.338 | 3.66 ± 0.348 | 1.913 ± 0.228 | 3.355 ± 0.335 | 2.856 ± 0.249 | 1.885 ± 0.232 | 4.519 ± 0.429 | 3.604 ± 0.294 | 4.408 ± 0.263 | 4.991 ± 0.339 | 1.331 ± 0.183 | 2.579 ± 0.291 | 0.0 ± 0.0 |
| Leu | 6.017 ± 0.43 | 0.749 ± 0.156 | 5.434 ± 0.331 | 5.046 ± 0.386 | 2.301 ± 0.273 | 5.933 ± 0.492 | 1.303 ± 0.204 | 3.66 ± 0.306 | 4.603 ± 0.379 | 4.325 ± 0.407 | 1.497 ± 0.188 | 3.272 ± 0.281 | 2.551 ± 0.27 | 2.412 ± 0.301 | 3.549 ± 0.309 | 4.242 ± 0.378 | 3.854 ± 0.357 | 4.741 ± 0.37 | 1.664 ± 0.216 | 2.8 ± 0.237 | 0.0 ± 0.0 |
| Met | 2.773 ± 0.37 | 0.194 ± 0.069 | 1.553 ± 0.193 | 1.913 ± 0.238 | 1.109 ± 0.156 | 1.885 ± 0.218 | 0.499 ± 0.123 | 1.497 ± 0.199 | 1.58 ± 0.227 | 2.052 ± 0.309 | 0.86 ± 0.159 | 1.636 ± 0.224 | 1.303 ± 0.205 | 0.887 ± 0.162 | 1.553 ± 0.197 | 1.58 ± 0.194 | 2.218 ± 0.214 | 1.969 ± 0.22 | 0.388 ± 0.086 | 1.054 ± 0.163 | 0.0 ± 0.0 |
| Asn | 3.937 ± 0.437 | 0.527 ± 0.14 | 2.44 ± 0.256 | 3.189 ± 0.345 | 1.83 ± 0.278 | 4.492 ± 0.432 | 1.054 ± 0.166 | 2.523 ± 0.247 | 2.717 ± 0.3 | 3.355 ± 0.299 | 1.303 ± 0.172 | 2.107 ± 0.291 | 2.384 ± 0.296 | 1.192 ± 0.145 | 2.634 ± 0.28 | 2.828 ± 0.354 | 2.745 ± 0.479 | 3.632 ± 0.359 | 0.555 ± 0.135 | 1.969 ± 0.245 | 0.0 ± 0.0 |
| Pro | 2.689 ± 0.33 | 0.305 ± 0.086 | 2.773 ± 0.297 | 2.967 ± 0.271 | 1.414 ± 0.191 | 2.773 ± 0.266 | 0.665 ± 0.134 | 1.774 ± 0.224 | 2.19 ± 0.312 | 2.274 ± 0.217 | 0.555 ± 0.101 | 2.052 ± 0.297 | 1.331 ± 0.263 | 1.054 ± 0.182 | 2.135 ± 0.264 | 2.301 ± 0.34 | 2.44 ± 0.307 | 2.939 ± 0.287 | 0.499 ± 0.165 | 1.525 ± 0.199 | 0.0 ± 0.0 |
| Gln | 2.745 ± 0.349 | 0.36 ± 0.093 | 1.83 ± 0.167 | 2.384 ± 0.262 | 0.97 ± 0.155 | 1.83 ± 0.2 | 0.582 ± 0.137 | 1.969 ± 0.25 | 2.44 ± 0.289 | 2.412 ± 0.331 | 1.026 ± 0.169 | 1.275 ± 0.204 | 0.915 ± 0.142 | 1.054 ± 0.221 | 2.052 ± 0.307 | 2.357 ± 0.287 | 1.58 ± 0.191 | 1.553 ± 0.179 | 0.776 ± 0.166 | 1.386 ± 0.201 | 0.0 ± 0.0 |
| Arg | 4.575 ± 0.484 | 0.416 ± 0.111 | 3.05 ± 0.317 | 3.632 ± 0.37 | 2.329 ± 0.267 | 3.937 ± 0.333 | 1.164 ± 0.196 | 3.244 ± 0.31 | 4.464 ± 0.597 | 4.436 ± 0.377 | 1.691 ± 0.222 | 2.689 ± 0.275 | 2.44 ± 0.304 | 1.83 ± 0.229 | 3.826 ± 0.47 | 2.551 ± 0.261 | 2.329 ± 0.307 | 3.882 ± 0.41 | 1.081 ± 0.169 | 2.606 ± 0.295 | 0.0 ± 0.0 |
| Ser | 4.63 ± 0.497 | 0.665 ± 0.16 | 3.688 ± 0.326 | 3.466 ± 0.335 | 2.606 ± 0.299 | 5.739 ± 0.692 | 1.164 ± 0.185 | 3.133 ± 0.326 | 3.327 ± 0.324 | 4.325 ± 0.369 | 1.996 ± 0.283 | 2.163 ± 0.263 | 2.079 ± 0.268 | 2.357 ± 0.198 | 2.967 ± 0.288 | 3.826 ± 0.604 | 3.771 ± 0.422 | 3.882 ± 0.359 | 1.331 ± 0.263 | 2.218 ± 0.242 | 0.0 ± 0.0 |
| Thr | 4.741 ± 0.62 | 0.638 ± 0.124 | 3.965 ± 0.399 | 4.408 ± 0.437 | 2.579 ± 0.236 | 5.712 ± 0.877 | 0.887 ± 0.169 | 3.632 ± 0.378 | 2.856 ± 0.248 | 4.408 ± 0.348 | 1.275 ± 0.2 | 2.384 ± 0.399 | 2.246 ± 0.287 | 1.969 ± 0.33 | 2.495 ± 0.293 | 3.41 ± 0.545 | 3.937 ± 0.532 | 4.63 ± 0.406 | 1.026 ± 0.169 | 2.024 ± 0.259 | 0.0 ± 0.0 |
| Val | 5.462 ± 0.309 | 1.026 ± 0.187 | 5.434 ± 0.429 | 5.102 ± 0.534 | 2.44 ± 0.286 | 5.018 ± 0.401 | 1.359 ± 0.209 | 3.965 ± 0.374 | 4.436 ± 0.395 | 4.298 ± 0.375 | 2.052 ± 0.231 | 2.773 ± 0.291 | 2.551 ± 0.282 | 1.83 ± 0.276 | 4.741 ± 0.393 | 4.963 ± 0.362 | 4.769 ± 0.434 | 5.462 ± 0.542 | 1.469 ± 0.227 | 3.383 ± 0.274 | 0.0 ± 0.0 |
| Trp | 1.275 ± 0.204 | 0.194 ± 0.073 | 1.026 ± 0.143 | 1.386 ± 0.231 | 0.915 ± 0.15 | 1.497 ± 0.218 | 0.582 ± 0.126 | 1.026 ± 0.191 | 1.164 ± 0.204 | 1.414 ± 0.204 | 0.471 ± 0.114 | 1.303 ± 0.193 | 0.499 ± 0.138 | 0.61 ± 0.119 | 0.97 ± 0.173 | 1.054 ± 0.174 | 1.58 ± 0.308 | 1.164 ± 0.206 | 0.444 ± 0.12 | 1.054 ± 0.176 | 0.0 ± 0.0 |
| Tyr | 3.327 ± 0.248 | 0.527 ± 0.099 | 2.662 ± 0.325 | 2.994 ± 0.342 | 1.22 ± 0.21 | 3.133 ± 0.263 | 0.693 ± 0.125 | 2.052 ± 0.254 | 2.551 ± 0.277 | 2.745 ± 0.319 | 1.164 ± 0.149 | 2.19 ± 0.235 | 1.497 ± 0.239 | 1.442 ± 0.151 | 2.079 ± 0.224 | 2.773 ± 0.311 | 2.662 ± 0.283 | 3.105 ± 0.312 | 0.776 ± 0.147 | 1.774 ± 0.194 | 0.0 ± 0.0 |
| Xaa | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 |

Statistics based on 229 proteins (36068 amino acids)

Note: The error was estimated by bootstrapping (×100) at the protein level.
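A minimal sketch of what protein-level bootstrapping could look like: whole proteins are resampled with replacement, the frequency is recomputed for each resample, and the spread of the 100 estimates is taken as the error. The function names, the fixed seed, and the choice of the sample standard deviation as the error measure are assumptions; only the resampling unit (proteins) and the number of resamples (100) come from the note above.

```python
import random
import statistics


def pair_per_mille(proteins, pair):
    """Frequency of one dipeptide, in per mille, over a list of sequences."""
    hits = 0
    total = 0
    for seq in proteins:
        for i in range(len(seq) - 1):
            total += 1
            hits += seq[i:i + 2] == pair
    return 1000.0 * hits / total if total else 0.0


def bootstrap_error(proteins, pair, n_resamples=100, seed=0):
    """Spread of the frequency across protein-level bootstrap resamples."""
    rng = random.Random(seed)
    estimates = []
    for _ in range(n_resamples):
        # Resample whole proteins with replacement, not single residues.
        sample = rng.choices(proteins, k=len(proteins))
        estimates.append(pair_per_mille(sample, pair))
    # Assumed error measure: sample standard deviation of the estimates.
    return statistics.stdev(estimates)


# Toy sequences in one-letter code (not real phage proteins).
toy = ["MKKA", "AKK", "MAAK", "KKKK"]
print(pair_per_mille(toy, "KK"), bootstrap_error(toy, "KK"))
```

Resampling at the protein level keeps each protein's residues together, so the estimate reflects variation between proteins rather than between individual positions.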

The dipeptide statistics above (among other statistics for this proteome) can be downloaded from this CSV file
See this proteome in: uniprot_link
Proteome-pI is available under the Creative Commons Attribution-NoDerivs license; for more details see here

Reference: Kozlowski LP. Proteome-pI 2.0: Proteome Isoelectric Point Database Update. Nucleic Acids Res. 2021, doi: 10.1093/nar/gkab944
Contact: Lukasz P. Kozlowski