Amino acid dipeptide frequency for Streptomyces phage Gilgamesh

Apart from single amino acid frequencies, one can also calculate the so-called amino acid dipeptide frequency, i.e. how often each ordered pair of adjacent residues occurs.

There are 400 possible dipeptides (441 if Xaa is counted as an additional 21st amino acid representing all non-standard or unknown amino acids). Thus, if all 400 dipeptides occurred randomly with equal probability, each would be present at a frequency of 0.25% (2.5‰). As this is not the case in nature, for better visibility dipeptides that are more frequent than expected are marked in red in the table, and those that are underrepresented are marked in blue.
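The counting described above can be sketched in Python as follows. This is a minimal illustration, not the code used to build the table: the function names, the use of one-letter codes, the mapping of every non-standard residue to X (Xaa), and the normalization by the number of overlapping residue pairs are all assumptions made for this example.

```python
from collections import Counter

STANDARD = "ACDEFGHIKLMNPQRSTVWY"   # the 20 standard amino acids (one-letter codes)
UNIFORM_PERMILLE = 1000.0 / 400     # 2.5 per mille, i.e. 0.25 %


def dipeptide_permille(proteins):
    """Return {dipeptide: frequency in per mille} for a list of protein sequences.

    Pairs are counted with overlap and never span two proteins.
    Assumption: any non-standard letter is treated as X (Xaa).
    """
    counts = Counter()
    for seq in proteins:
        cleaned = "".join(aa if aa in STANDARD else "X" for aa in seq.upper())
        for i in range(len(cleaned) - 1):
            counts[cleaned[i:i + 2]] += 1
    total = sum(counts.values())
    if total == 0:
        return {}
    return {dp: 1000.0 * n / total for dp, n in counts.items()}


def classify(freq_permille):
    """Compare a frequency with the uniform expectation of 2.5 per mille."""
    if freq_permille > UNIFORM_PERMILLE:
        return "over"
    if freq_permille < UNIFORM_PERMILLE:
        return "under"
    return "expected"


if __name__ == "__main__":
    toy = ["MAAAL", "MAC"]          # toy sequences, not taken from the proteome
    for dp, freq in sorted(dipeptide_permille(toy).items()):
        print(dp, round(freq, 3), classify(freq))
```

With the values from the table, `classify(21.166)` (AlaAla) falls on the overrepresented side and `classify(0.103)` (CysCys) on the underrepresented side of the 2.5‰ uniform expectation.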

All values are given in per mille (‰) and therefore need to be multiplied by 10^-3 to obtain fractions.
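As a small worked example of this unit conversion, the helpers below turn a per mille value into a fraction or a percentage. The function names are hypothetical and only serve the illustration.

```python
def permille_to_fraction(value):
    """Convert a per mille value to a plain fraction (multiply by 10^-3)."""
    return value * 1e-3


def permille_to_percent(value):
    """Convert a per mille value to percent (divide by 10)."""
    return value / 10.0


if __name__ == "__main__":
    # AlaAla from the table: 21.166 per mille
    print(permille_to_fraction(21.166))
    print(permille_to_percent(21.166))
```

The uniform expectation of 0.25% corresponds to 2.5‰ on this scale, which makes it directly comparable with the table entries.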

For more information, see the sequence space article on Wikipedia.

Ala Cys Asp Glu Phe Gly His Ile Lys Leu Met Asn Pro Gln Arg Ser Thr Val Trp Tyr Xaa
Ala
AlaAla: 21.166 ± 1.113
AlaCys: 1.159 ± 0.267
AlaAsp: 8.6 ± 0.553
AlaGlu: 9.476 ± 0.857
AlaPhe: 2.42 ± 0.27
AlaGly: 9.193 ± 0.526
AlaHis: 2.678 ± 0.362
AlaIle: 3.631 ± 0.476
AlaLys: 3.064 ± 0.351
AlaLeu: 11.845 ± 0.696
AlaMet: 3.064 ± 0.245
AlaAsn: 2.292 ± 0.246
AlaPro: 9.965 ± 0.634
AlaGln: 4.429 ± 0.47
AlaArg: 11.536 ± 0.769
AlaSer: 4.841 ± 0.338
AlaThr: 6.901 ± 0.462
AlaVal: 8.523 ± 0.794
AlaTrp: 2.369 ± 0.283
AlaTyr: 2.498 ± 0.355
AlaXaa: 0.0 ± 0.0
Cys
CysAla: 0.875 ± 0.18
CysCys: 0.103 ± 0.055
CysAsp: 0.386 ± 0.128
CysGlu: 0.618 ± 0.147
CysPhe: 0.129 ± 0.059
CysGly: 0.927 ± 0.259
CysHis: 0.206 ± 0.092
CysIle: 0.206 ± 0.086
CysLys: 0.154 ± 0.071
CysLeu: 0.772 ± 0.191
CysMet: 0.129 ± 0.076
CysAsn: 0.257 ± 0.084
CysPro: 0.85 ± 0.219
CysGln: 0.129 ± 0.06
CysArg: 0.901 ± 0.204
CysSer: 0.669 ± 0.162
CysThr: 0.592 ± 0.162
CysVal: 0.566 ± 0.169
CysTrp: 0.18 ± 0.094
CysTyr: 0.18 ± 0.066
CysXaa: 0.0 ± 0.0
Asp
AspAla: 8.214 ± 0.713
AspCys: 0.592 ± 0.161
AspAsp: 6.746 ± 1.153
AspGlu: 5.227 ± 0.581
AspPhe: 1.751 ± 0.201
AspGly: 7.339 ± 1.118
AspHis: 1.107 ± 0.155
AspIle: 1.931 ± 0.243
AspLys: 1.262 ± 0.246
AspLeu: 6.025 ± 0.463
AspMet: 1.107 ± 0.165
AspAsn: 1.107 ± 0.224
AspPro: 5.382 ± 0.498
AspGln: 2.42 ± 0.452
AspArg: 4.97 ± 0.85
AspSer: 2.884 ± 0.335
AspThr: 4.455 ± 0.463
AspVal: 4.661 ± 0.375
AspTrp: 1.184 ± 0.19
AspTyr: 1.159 ± 0.18
AspXaa: 0.0 ± 0.0
Glu
GluAla: 8.343 ± 0.671
GluCys: 0.412 ± 0.094
GluAsp: 4.403 ± 0.467
GluGlu: 4.944 ± 0.516
GluPhe: 1.262 ± 0.196
GluGly: 4.815 ± 0.327
GluHis: 1.159 ± 0.157
GluIle: 2.42 ± 0.244
GluLys: 1.596 ± 0.234
GluLeu: 5.948 ± 0.36
GluMet: 1.545 ± 0.172
GluAsn: 1.519 ± 0.211
GluPro: 5.253 ± 0.478
GluGln: 3.425 ± 0.316
GluArg: 5.922 ± 0.484
GluSer: 2.935 ± 0.211
GluThr: 4.249 ± 0.328
GluVal: 4.3 ± 0.315
GluTrp: 1.159 ± 0.166
GluTyr: 1.365 ± 0.212
GluXaa: 0.0 ± 0.0
Phe
PheAla: 2.523 ± 0.257
PheCys: 0.257 ± 0.087
PheAsp: 1.468 ± 0.206
PheGlu: 1.519 ± 0.218
PhePhe: 0.412 ± 0.11
PheGly: 2.523 ± 0.248
PheHis: 0.36 ± 0.11
PheIle: 0.515 ± 0.127
PheLys: 0.644 ± 0.141
PheLeu: 1.931 ± 0.249
PheMet: 0.283 ± 0.077
PheAsn: 0.618 ± 0.128
PhePro: 1.262 ± 0.198
PheGln: 0.669 ± 0.134
PheArg: 1.905 ± 0.221
PheSer: 1.159 ± 0.169
PheThr: 1.751 ± 0.17
PheVal: 1.365 ± 0.201
PheTrp: 0.283 ± 0.087
PheTyr: 0.438 ± 0.1
PheXaa: 0.0 ± 0.0
Gly
GlyAla: 7.261 ± 0.537
GlyCys: 0.721 ± 0.176
GlyAsp: 5.691 ± 1.013
GlyGlu: 5.098 ± 0.419
GlyPhe: 1.674 ± 0.249
GlyGly: 7.467 ± 0.578
GlyHis: 1.983 ± 0.218
GlyIle: 2.807 ± 0.344
GlyLys: 3.193 ± 0.403
GlyLeu: 5.948 ± 0.595
GlyMet: 1.751 ± 0.174
GlyAsn: 2.034 ± 0.29
GlyPro: 4.867 ± 0.453
GlyGln: 2.91 ± 0.336
GlyArg: 7.39 ± 0.631
GlySer: 3.322 ± 0.304
GlyThr: 5.974 ± 0.542
GlyVal: 5.588 ± 0.587
GlyTrp: 2.163 ± 0.226
GlyTyr: 1.802 ± 0.257
GlyXaa: 0.0 ± 0.0
His
HisAla: 2.472 ± 0.28
HisCys: 0.309 ± 0.108
HisAsp: 1.442 ± 0.173
HisGlu: 1.184 ± 0.166
HisPhe: 0.669 ± 0.15
HisGly: 1.571 ± 0.236
HisHis: 0.85 ± 0.158
HisIle: 0.927 ± 0.176
HisLys: 0.592 ± 0.134
HisLeu: 2.266 ± 0.232
HisMet: 0.438 ± 0.13
HisAsn: 0.463 ± 0.105
HisPro: 1.699 ± 0.227
HisGln: 0.978 ± 0.152
HisArg: 1.931 ± 0.253
HisSer: 0.695 ± 0.11
HisThr: 0.978 ± 0.168
HisVal: 1.777 ± 0.255
HisTrp: 0.515 ± 0.089
HisTyr: 0.644 ± 0.148
HisXaa: 0.0 ± 0.0
Ile
IleAla: 5.073 ± 0.387
IleCys: 0.154 ± 0.073
IleAsp: 2.292 ± 0.213
IleGlu: 2.317 ± 0.223
IlePhe: 0.669 ± 0.12
IleGly: 2.292 ± 0.224
IleHis: 0.592 ± 0.131
IleIle: 0.927 ± 0.136
IleLys: 0.927 ± 0.192
IleLeu: 2.317 ± 0.341
IleMet: 0.309 ± 0.112
IleAsn: 0.875 ± 0.144
IlePro: 2.446 ± 0.332
IleGln: 0.953 ± 0.146
IleArg: 2.626 ± 0.236
IleSer: 1.468 ± 0.23
IleThr: 2.755 ± 0.239
IleVal: 2.472 ± 0.311
IleTrp: 0.283 ± 0.09
IleTyr: 0.386 ± 0.136
IleXaa: 0.0 ± 0.0
Lys
LysAla: 3.656 ± 0.447
LysCys: 0.18 ± 0.061
LysAsp: 1.493 ± 0.171
LysGlu: 1.339 ± 0.24
LysPhe: 0.85 ± 0.177
LysGly: 1.828 ± 0.231
LysHis: 0.772 ± 0.133
LysIle: 0.824 ± 0.161
LysLys: 1.39 ± 0.234
LysLeu: 1.571 ± 0.292
LysMet: 0.335 ± 0.102
LysAsn: 0.644 ± 0.135
LysPro: 2.369 ± 0.246
LysGln: 0.875 ± 0.155
LysArg: 2.549 ± 0.284
LysSer: 1.159 ± 0.243
LysThr: 1.751 ± 0.284
LysVal: 2.317 ± 0.325
LysTrp: 0.669 ± 0.129
LysTyr: 0.618 ± 0.134
LysXaa: 0.0 ± 0.0
Leu
LeuAla: 11.407 ± 0.694
LeuCys: 0.85 ± 0.212
LeuAsp: 5.768 ± 0.346
LeuGlu: 5.382 ± 0.473
LeuPhe: 1.777 ± 0.249
LeuGly: 6.077 ± 0.694
LeuHis: 2.214 ± 0.288
LeuIle: 2.446 ± 0.304
LeuLys: 1.983 ± 0.235
LeuLeu: 6.257 ± 0.666
LeuMet: 1.287 ± 0.226
LeuAsn: 1.519 ± 0.252
LeuPro: 6.412 ± 0.411
LeuGln: 2.729 ± 0.311
LeuArg: 7.493 ± 0.412
LeuSer: 4.635 ± 0.393
LeuThr: 4.892 ± 0.3
LeuVal: 6.0 ± 0.557
LeuTrp: 1.365 ± 0.209
LeuTyr: 1.674 ± 0.281
LeuXaa: 0.0 ± 0.0
Met
MetAla: 2.317 ± 0.224
MetCys: 0.18 ± 0.084
MetAsp: 1.159 ± 0.186
MetGlu: 0.747 ± 0.13
MetPhe: 0.438 ± 0.096
MetGly: 1.133 ± 0.206
MetHis: 0.309 ± 0.077
MetIle: 0.669 ± 0.15
MetLys: 0.515 ± 0.143
MetLeu: 1.339 ± 0.196
MetMet: 0.309 ± 0.111
MetAsn: 0.335 ± 0.117
MetPro: 1.854 ± 0.215
MetGln: 0.798 ± 0.134
MetArg: 1.519 ± 0.202
MetSer: 1.596 ± 0.253
MetThr: 2.111 ± 0.242
MetVal: 1.339 ± 0.201
MetTrp: 0.232 ± 0.07
MetTyr: 0.257 ± 0.08
MetXaa: 0.0 ± 0.0
Asn
AsnAla: 2.807 ± 0.402
AsnCys: 0.18 ± 0.082
AsnAsp: 1.468 ± 0.339
AsnGlu: 1.133 ± 0.187
AsnPhe: 0.489 ± 0.123
AsnGly: 2.729 ± 0.522
AsnHis: 0.489 ± 0.096
AsnIle: 0.824 ± 0.129
AsnLys: 0.541 ± 0.1
AsnLeu: 2.008 ± 0.206
AsnMet: 0.335 ± 0.093
AsnAsn: 0.489 ± 0.125
AsnPro: 1.596 ± 0.222
AsnGln: 0.721 ± 0.123
AsnArg: 1.622 ± 0.214
AsnSer: 0.978 ± 0.17
AsnThr: 1.365 ± 0.203
AsnVal: 1.442 ± 0.219
AsnTrp: 0.18 ± 0.062
AsnTyr: 0.283 ± 0.083
AsnXaa: 0.0 ± 0.0
Pro
ProAla: 10.892 ± 0.761
ProCys: 0.618 ± 0.186
ProAsp: 6.901 ± 1.157
ProGlu: 6.334 ± 1.209
ProPhe: 1.339 ± 0.189
ProGly: 6.515 ± 0.495
ProHis: 1.751 ± 0.282
ProIle: 1.931 ± 0.247
ProLys: 1.751 ± 0.247
ProLeu: 4.558 ± 0.567
ProMet: 1.004 ± 0.165
ProAsn: 1.519 ± 0.265
ProPro: 5.665 ± 0.679
ProGln: 2.678 ± 0.261
ProArg: 4.738 ± 0.399
ProSer: 3.734 ± 0.368
ProThr: 4.944 ± 0.437
ProVal: 5.073 ± 0.396
ProTrp: 1.21 ± 0.223
ProTyr: 1.622 ± 0.214
ProXaa: 0.0 ± 0.0
Gln
GlnAla: 5.15 ± 0.447
GlnCys: 0.283 ± 0.085
GlnAsp: 2.008 ± 0.299
GlnGlu: 2.008 ± 0.205
GlnPhe: 0.953 ± 0.151
GlnGly: 2.678 ± 0.274
GlnHis: 0.927 ± 0.148
GlnIle: 0.978 ± 0.151
GlnLys: 1.081 ± 0.16
GlnLeu: 2.858 ± 0.304
GlnMet: 0.927 ± 0.153
GlnAsn: 0.772 ± 0.2
GlnPro: 2.523 ± 0.352
GlnGln: 1.905 ± 0.346
GlnArg: 3.425 ± 0.477
GlnSer: 1.493 ± 0.191
GlnThr: 1.648 ± 0.198
GlnVal: 2.24 ± 0.244
GlnTrp: 0.669 ± 0.135
GlnTyr: 0.747 ± 0.14
GlnXaa: 0.0 ± 0.0
Arg
ArgAla: 9.862 ± 0.624
ArgCys: 0.85 ± 0.174
ArgAsp: 4.738 ± 0.904
ArgGlu: 5.485 ± 0.452
ArgPhe: 2.034 ± 0.288
ArgGly: 4.892 ± 0.329
ArgHis: 2.24 ± 0.257
ArgIle: 3.502 ± 0.358
ArgLys: 2.807 ± 0.25
ArgLeu: 7.957 ± 0.496
ArgMet: 2.214 ± 0.217
ArgAsn: 1.957 ± 0.318
ArgPro: 6.025 ± 0.443
ArgGln: 3.013 ± 0.356
ArgArg: 8.884 ± 0.89
ArgSer: 3.734 ± 0.481
ArgThr: 5.433 ± 0.378
ArgVal: 5.948 ± 0.416
ArgTrp: 1.442 ± 0.196
ArgTyr: 2.086 ± 0.239
ArgXaa: 0.0 ± 0.0
Ser
SerAla: 5.176 ± 0.369
SerCys: 0.283 ± 0.089
SerAsp: 3.45 ± 0.447
SerGlu: 3.167 ± 0.298
SerPhe: 1.107 ± 0.144
SerGly: 4.3 ± 0.391
SerHis: 1.081 ± 0.157
SerIle: 1.159 ± 0.177
SerLys: 1.133 ± 0.152
SerLeu: 4.326 ± 0.421
SerMet: 0.747 ± 0.122
SerAsn: 1.107 ± 0.187
SerPro: 3.347 ± 0.388
SerGln: 1.159 ± 0.179
SerArg: 3.399 ± 0.334
SerSer: 2.575 ± 0.281
SerThr: 3.862 ± 0.285
SerVal: 3.811 ± 0.348
SerTrp: 0.953 ± 0.167
SerTyr: 1.03 ± 0.171
SerXaa: 0.0 ± 0.0
Thr
ThrAla: 8.961 ± 0.598
ThrCys: 0.592 ± 0.132
ThrAsp: 3.862 ± 0.405
ThrGlu: 3.373 ± 0.294
ThrPhe: 1.545 ± 0.196
ThrGly: 6.206 ± 0.491
ThrHis: 1.107 ± 0.147
ThrIle: 2.781 ± 0.299
ThrLys: 1.339 ± 0.198
ThrLeu: 4.867 ± 0.386
ThrMet: 1.287 ± 0.19
ThrAsn: 1.519 ± 0.155
ThrPro: 6.128 ± 0.56
ThrGln: 1.751 ± 0.21
ThrArg: 4.377 ± 0.384
ThrSer: 3.605 ± 0.275
ThrThr: 4.197 ± 0.379
ThrVal: 5.768 ± 0.445
ThrTrp: 1.236 ± 0.158
ThrTyr: 1.493 ± 0.184
ThrXaa: 0.0 ± 0.0
Val
ValAla: 8.6 ± 0.617
ValCys: 0.644 ± 0.159
ValAsp: 4.815 ± 0.357
ValGlu: 4.738 ± 0.588
ValPhe: 1.442 ± 0.183
ValGly: 3.785 ± 0.311
ValHis: 1.545 ± 0.189
ValIle: 2.729 ± 0.329
ValLys: 2.086 ± 0.282
ValLeu: 6.051 ± 0.769
ValMet: 1.365 ± 0.208
ValAsn: 1.545 ± 0.197
ValPro: 5.124 ± 0.369
ValGln: 2.626 ± 0.281
ValArg: 6.618 ± 0.483
ValSer: 3.888 ± 0.399
ValThr: 5.691 ± 0.469
ValVal: 5.176 ± 0.513
ValTrp: 1.21 ± 0.156
ValTyr: 1.777 ± 0.173
ValXaa: 0.0 ± 0.0
Trp
TrpAla: 2.034 ± 0.214
TrpCys: 0.18 ± 0.073
TrpAsp: 1.004 ± 0.194
TrpGlu: 1.159 ± 0.171
TrpPhe: 0.566 ± 0.097
TrpGly: 1.442 ± 0.2
TrpHis: 0.644 ± 0.114
TrpIle: 0.515 ± 0.13
TrpLys: 0.566 ± 0.111
TrpLeu: 1.571 ± 0.219
TrpMet: 0.438 ± 0.108
TrpAsn: 0.592 ± 0.146
TrpPro: 0.978 ± 0.159
TrpGln: 0.463 ± 0.114
TrpArg: 1.802 ± 0.232
TrpSer: 0.927 ± 0.178
TrpThr: 1.236 ± 0.233
TrpVal: 1.21 ± 0.172
TrpTrp: 0.515 ± 0.119
TrpTyr: 0.386 ± 0.092
TrpXaa: 0.0 ± 0.0
Tyr
TyrAla: 2.884 ± 0.332
TyrCys: 0.283 ± 0.107
TyrAsp: 1.828 ± 0.262
TyrGlu: 1.725 ± 0.254
TyrPhe: 0.36 ± 0.097
TyrGly: 1.931 ± 0.319
TyrHis: 0.386 ± 0.095
TyrIle: 0.412 ± 0.09
TyrLys: 0.566 ± 0.128
TyrLeu: 1.751 ± 0.233
TyrMet: 0.257 ± 0.076
TyrAsn: 0.515 ± 0.116
TyrPro: 1.03 ± 0.145
TyrGln: 0.644 ± 0.12
TyrArg: 1.493 ± 0.183
TyrSer: 0.875 ± 0.149
TyrThr: 1.21 ± 0.187
TyrVal: 1.88 ± 0.222
TyrTrp: 0.412 ± 0.114
TyrTyr: 0.283 ± 0.111
TyrXaa: 0.0 ± 0.0
Xaa
XaaAla: 0.0 ± 0.0
XaaCys: 0.0 ± 0.0
XaaAsp: 0.0 ± 0.0
XaaGlu: 0.0 ± 0.0
XaaPhe: 0.0 ± 0.0
XaaGly: 0.0 ± 0.0
XaaHis: 0.0 ± 0.0
XaaIle: 0.0 ± 0.0
XaaLys: 0.0 ± 0.0
XaaLeu: 0.0 ± 0.0
XaaMet: 0.0 ± 0.0
XaaAsn: 0.0 ± 0.0
XaaPro: 0.0 ± 0.0
XaaGln: 0.0 ± 0.0
XaaArg: 0.0 ± 0.0
XaaSer: 0.0 ± 0.0
XaaThr: 0.0 ± 0.0
XaaVal: 0.0 ± 0.0
XaaTrp: 0.0 ± 0.0
XaaTyr: 0.0 ± 0.0
XaaXaa: 0.0 ± 0.0
Statistics based on 156 proteins (38836 amino acids)

Note: The error was estimated by bootstrapping (x100) at the protein level.
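A protein-level bootstrap of this kind can be sketched as follows. The page states only that bootstrapping was done 100 times at the protein level, so the details here are assumptions: whole proteins are resampled with replacement, the frequency is recomputed for each replicate, and the standard deviation across replicates is reported as the error. The function names and the fixed seed are hypothetical.

```python
import random
import statistics
from collections import Counter


def dipeptide_counts(seq):
    """Count overlapping residue pairs in one protein sequence."""
    return Counter(seq[i:i + 2] for i in range(len(seq) - 1))


def bootstrap_error(proteins, dipeptide, n_boot=100, seed=0):
    """Return (mean, standard deviation) of a dipeptide frequency in per mille.

    Each replicate draws len(proteins) whole proteins with replacement,
    so residues from the same protein always stay together.
    """
    rng = random.Random(seed)
    per_protein = [dipeptide_counts(p) for p in proteins]
    estimates = []
    for _ in range(n_boot):
        sample = rng.choices(per_protein, k=len(per_protein))
        total = sum(sum(c.values()) for c in sample)
        hits = sum(c[dipeptide] for c in sample)   # Counter returns 0 if absent
        estimates.append(1000.0 * hits / total if total else 0.0)
    return statistics.mean(estimates), statistics.stdev(estimates)


if __name__ == "__main__":
    toy = ["MAAAL", "MAC", "AAGAA", "MLLK"]   # toy sequences, not the real proteome
    mean, err = bootstrap_error(toy, "AA")
    print(f"AA: {mean:.3f} ± {err:.3f}")
```

Resampling whole proteins rather than individual dipeptides keeps the correlation between residues of the same protein, which is presumably why the error is described as being estimated at the protein level.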

The dipeptide statistics above (along with other statistics for this proteome) can be downloaded from this CSV file
See this proteome in: uniprot_link
Proteome-pI is available under Creative Commons Attribution-NoDerivs license, for more details see here

Reference: Kozlowski LP. Proteome-pI 2.0: Proteome Isoelectric Point Database Update. Nucleic Acids Res. 2021, doi: 10.1093/nar/gkab944
Contact: Lukasz P. Kozlowski