Amino acid dipeptide frequency for Caulobacter phage CcrColossus

Apart from single amino acid frequencies, one can also calculate the so-called amino acid dipeptide frequency.

There are 400 possible dipeptides (441 if Xaa is counted as an additional 21st amino acid representing all non-standard or unknown amino acids). Thus, if dipeptides occurred randomly in proteins, each of the 400 dipeptides would be present at a frequency of 0.25% (2.5‰). As this is not the case in nature, for better visibility dipeptides that occur more often than expected are marked in red in the table, and underrepresented ones are marked in blue.
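The counting described above can be sketched in a few lines of Python. This is only an illustration, not the code used to build this page: the function names, the toy sequences, and the choice to normalise by the total number of overlapping dipeptides are all assumptions.

```python
from collections import Counter

STANDARD = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard amino acids (one-letter codes)


def dipeptide_per_mille(proteins):
    """Return {dipeptide: frequency in per mille} over all overlapping pairs.

    Assumption: frequencies are normalised by the total number of dipeptides;
    the database may normalise differently.
    """
    counts = Counter()
    for seq in proteins:
        # Map every residue outside the 20 standard ones to X (Xaa).
        seq = "".join(c if c in STANDARD else "X" for c in seq.upper())
        counts.update(seq[i:i + 2] for i in range(len(seq) - 1))
    total = sum(counts.values())
    if total == 0:
        return {}
    return {pair: 1000.0 * n / total for pair, n in counts.items()}


def classify(freq_per_mille, expected=1000.0 / 400):
    """Compare a frequency with the uniform expectation of 2.5 per mille."""
    if freq_per_mille > expected:
        return "over"
    if freq_per_mille < expected:
        return "under"
    return "expected"


toy = ["MAAGL", "AAXK"]  # hypothetical toy sequences, not from this proteome
freqs = dipeptide_per_mille(toy)
print(classify(13.499), classify(0.865))  # AlaAla and AlaCys values from the table
```

With the table values, AlaAla (13.499‰) falls above the 2.5‰ uniform expectation and AlaCys (0.865‰) below it.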

All values are given in per mille (‰) and therefore need to be multiplied by 10⁻³ to obtain fractions.
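As a small worked example of the unit conversion, the snippet below turns a per mille value from the table into a fraction and a percentage, and compares it with the uniform expectation of 1/400. The helper names are hypothetical and only serve the illustration.

```python
UNIFORM_EXPECTATION = 1000.0 / 400  # 2.5 per mille for each of the 400 dipeptides


def per_mille_to_fraction(value):
    """Convert a per mille value to a plain fraction (multiply by 10^-3)."""
    return value * 1e-3


def per_mille_to_percent(value):
    """Convert a per mille value to percent (divide by 10)."""
    return value / 10.0


def ratio_to_uniform(value):
    """Ratio of an observed per mille value to the uniform expectation."""
    return value / UNIFORM_EXPECTATION


alaala = 13.499  # AlaAla value from the table, in per mille
print(per_mille_to_fraction(alaala))
print(per_mille_to_percent(alaala))
print(ratio_to_uniform(alaala))
```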

For more information, see the sequence space article on Wikipedia.

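Rows are grouped by the first residue; within each row, entries follow the second residue in the order:
Ala Cys Asp Glu Phe Gly His Ile Lys Leu Met Asn Pro Gln Arg Ser Thr Val Trp Tyr Xaa
Each entry gives the dipeptide frequency ± error in ‰.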
Ala
AlaAla: 13.499 ± 0.675
AlaCys: 0.865 ± 0.096
AlaAsp: 6.359 ± 0.272
AlaGlu: 7.152 ± 0.365
AlaPhe: 4.003 ± 0.241
AlaGly: 8.811 ± 0.598
AlaHis: 2.224 ± 0.158
AlaIle: 5.0 ± 0.254
AlaLys: 4.808 ± 0.322
AlaLeu: 10.325 ± 0.443
AlaMet: 3.269 ± 0.235
AlaAsn: 3.089 ± 0.262
AlaPro: 5.048 ± 0.318
AlaGln: 4.291 ± 0.293
AlaArg: 7.308 ± 0.413
AlaSer: 5.325 ± 0.314
AlaThr: 4.88 ± 0.355
AlaVal: 6.431 ± 0.335
AlaTrp: 1.803 ± 0.142
AlaTyr: 3.402 ± 0.215
AlaXaa: 0.0 ± 0.0
Cys
CysAla: 0.805 ± 0.098
CysCys: 0.204 ± 0.053
CysAsp: 0.709 ± 0.094
CysGlu: 0.697 ± 0.085
CysPhe: 0.301 ± 0.061
CysGly: 0.865 ± 0.117
CysHis: 0.288 ± 0.066
CysIle: 0.493 ± 0.074
CysLys: 0.349 ± 0.062
CysLeu: 0.601 ± 0.088
CysMet: 0.204 ± 0.048
CysAsn: 0.24 ± 0.062
CysPro: 0.625 ± 0.094
CysGln: 0.288 ± 0.071
CysArg: 0.721 ± 0.099
CysSer: 0.505 ± 0.08
CysThr: 0.493 ± 0.071
CysVal: 0.673 ± 0.096
CysTrp: 0.204 ± 0.051
CysTyr: 0.228 ± 0.056
CysXaa: 0.0 ± 0.0
Asp
AspAla: 7.212 ± 0.36
AspCys: 0.757 ± 0.097
AspAsp: 4.411 ± 0.282
AspGlu: 4.135 ± 0.299
AspPhe: 2.693 ± 0.182
AspGly: 5.926 ± 0.338
AspHis: 1.647 ± 0.161
AspIle: 3.125 ± 0.178
AspLys: 2.368 ± 0.201
AspLeu: 5.469 ± 0.302
AspMet: 1.466 ± 0.144
AspAsn: 1.719 ± 0.144
AspPro: 3.666 ± 0.246
AspGln: 2.116 ± 0.163
AspArg: 3.57 ± 0.26
AspSer: 2.885 ± 0.197
AspThr: 3.366 ± 0.195
AspVal: 4.207 ± 0.264
AspTrp: 1.19 ± 0.13
AspTyr: 2.404 ± 0.175
AspXaa: 0.0 ± 0.0
Glu
GluAla: 8.27 ± 0.46
GluCys: 0.421 ± 0.068
GluAsp: 4.015 ± 0.289
GluGlu: 4.123 ± 0.298
GluPhe: 2.356 ± 0.2
GluGly: 4.339 ± 0.227
GluHis: 1.611 ± 0.183
GluIle: 3.414 ± 0.198
GluLys: 2.957 ± 0.245
GluLeu: 4.195 ± 0.255
GluMet: 1.767 ± 0.16
GluAsn: 1.803 ± 0.14
GluPro: 2.885 ± 0.215
GluGln: 2.356 ± 0.205
GluArg: 5.012 ± 0.411
GluSer: 2.656 ± 0.183
GluThr: 3.366 ± 0.212
GluVal: 4.496 ± 0.23
GluTrp: 1.118 ± 0.115
GluTyr: 1.779 ± 0.159
GluXaa: 0.0 ± 0.0
Phe
PheAla: 3.546 ± 0.202
PheCys: 0.493 ± 0.089
PheAsp: 3.149 ± 0.218
PheGlu: 2.572 ± 0.239
PhePhe: 1.394 ± 0.148
PheGly: 3.173 ± 0.209
PheHis: 0.721 ± 0.101
PheIle: 1.887 ± 0.165
PheLys: 1.587 ± 0.181
PheLeu: 2.5 ± 0.181
PheMet: 1.07 ± 0.105
PheAsn: 1.442 ± 0.141
PhePro: 1.695 ± 0.145
PheGln: 1.286 ± 0.124
PheArg: 2.079 ± 0.127
PheSer: 2.404 ± 0.193
PheThr: 2.332 ± 0.218
PheVal: 2.428 ± 0.188
PheTrp: 0.685 ± 0.081
PheTyr: 1.166 ± 0.128
PheXaa: 0.0 ± 0.0
Gly
GlyAla: 7.152 ± 0.41
GlyCys: 0.974 ± 0.122
GlyAsp: 5.277 ± 0.298
GlyGlu: 4.508 ± 0.23
GlyPhe: 3.318 ± 0.198
GlyGly: 9.183 ± 2.16
GlyHis: 1.755 ± 0.181
GlyIle: 3.354 ± 0.193
GlyLys: 3.546 ± 0.258
GlyLeu: 6.287 ± 0.311
GlyMet: 2.067 ± 0.166
GlyAsn: 2.56 ± 0.345
GlyPro: 3.077 ± 0.183
GlyGln: 2.849 ± 0.296
GlyArg: 5.0 ± 0.241
GlySer: 5.289 ± 0.553
GlyThr: 4.796 ± 0.5
GlyVal: 5.758 ± 0.311
GlyTrp: 1.815 ± 0.144
GlyTyr: 3.197 ± 0.307
GlyXaa: 0.0 ± 0.0
His
HisAla: 2.296 ± 0.192
HisCys: 0.204 ± 0.05
HisAsp: 1.503 ± 0.174
HisGlu: 1.262 ± 0.129
HisPhe: 0.962 ± 0.119
HisGly: 2.019 ± 0.186
HisHis: 0.841 ± 0.107
HisIle: 1.13 ± 0.126
HisLys: 0.914 ± 0.129
HisLeu: 1.839 ± 0.186
HisMet: 0.637 ± 0.078
HisAsn: 0.589 ± 0.078
HisPro: 1.346 ± 0.146
HisGln: 0.541 ± 0.085
HisArg: 1.286 ± 0.14
HisSer: 1.118 ± 0.141
HisThr: 1.046 ± 0.126
HisVal: 1.815 ± 0.172
HisTrp: 0.349 ± 0.067
HisTyr: 0.938 ± 0.093
HisXaa: 0.0 ± 0.0
Ile
IleAla: 5.758 ± 0.342
IleCys: 0.481 ± 0.078
IleAsp: 3.642 ± 0.232
IleGlu: 4.159 ± 0.291
IlePhe: 1.43 ± 0.108
IleGly: 3.618 ± 0.265
IleHis: 0.974 ± 0.114
IleIle: 2.308 ± 0.184
IleLys: 2.548 ± 0.178
IleLeu: 3.342 ± 0.19
IleMet: 0.998 ± 0.097
IleAsn: 1.983 ± 0.171
IlePro: 2.476 ± 0.153
IleGln: 1.466 ± 0.117
IleArg: 2.945 ± 0.237
IleSer: 2.56 ± 0.157
IleThr: 3.57 ± 0.296
IleVal: 3.414 ± 0.234
IleTrp: 0.625 ± 0.09
IleTyr: 1.262 ± 0.111
IleXaa: 0.0 ± 0.0
Lys
LysAla: 5.565 ± 0.332
LysCys: 0.204 ± 0.053
LysAsp: 2.272 ± 0.182
LysGlu: 2.476 ± 0.21
LysPhe: 1.635 ± 0.166
LysGly: 3.137 ± 0.234
LysHis: 1.034 ± 0.118
LysIle: 2.296 ± 0.169
LysLys: 2.308 ± 0.204
LysLeu: 4.123 ± 0.247
LysMet: 1.046 ± 0.118
LysAsn: 1.094 ± 0.136
LysPro: 2.488 ± 0.186
LysGln: 1.599 ± 0.127
LysArg: 3.594 ± 0.252
LysSer: 2.128 ± 0.181
LysThr: 3.161 ± 0.207
LysVal: 3.113 ± 0.208
LysTrp: 0.649 ± 0.09
LysTyr: 0.95 ± 0.104
LysXaa: 0.0 ± 0.0
Leu
LeuAla: 8.859 ± 0.396
LeuCys: 0.685 ± 0.106
LeuAsp: 5.313 ± 0.311
LeuGlu: 5.205 ± 0.301
LeuPhe: 2.548 ± 0.209
LeuGly: 5.914 ± 0.249
LeuHis: 2.031 ± 0.201
LeuIle: 3.991 ± 0.212
LeuLys: 3.991 ± 0.266
LeuLeu: 6.01 ± 0.288
LeuMet: 1.923 ± 0.188
LeuAsn: 3.125 ± 0.194
LeuPro: 4.159 ± 0.284
LeuGln: 2.488 ± 0.159
LeuArg: 5.758 ± 0.315
LeuSer: 4.255 ± 0.261
LeuThr: 5.493 ± 0.249
LeuVal: 5.024 ± 0.214
LeuTrp: 0.962 ± 0.116
LeuTyr: 2.248 ± 0.18
LeuXaa: 0.0 ± 0.0
Met
MetAla: 2.344 ± 0.169
MetCys: 0.204 ± 0.05
MetAsp: 1.154 ± 0.122
MetGlu: 1.13 ± 0.119
MetPhe: 0.781 ± 0.105
MetGly: 1.791 ± 0.136
MetHis: 0.409 ± 0.077
MetIle: 1.442 ± 0.14
MetLys: 1.214 ± 0.142
MetLeu: 1.899 ± 0.171
MetMet: 0.589 ± 0.087
MetAsn: 0.938 ± 0.104
MetPro: 1.707 ± 0.137
MetGln: 0.829 ± 0.106
MetArg: 1.803 ± 0.152
MetSer: 2.26 ± 0.173
MetThr: 2.284 ± 0.193
MetVal: 1.346 ± 0.13
MetTrp: 0.301 ± 0.062
MetTyr: 0.409 ± 0.084
MetXaa: 0.0 ± 0.0
Asn
AsnAla: 3.51 ± 0.267
AsnCys: 0.301 ± 0.062
AsnAsp: 1.899 ± 0.157
AsnGlu: 1.695 ± 0.168
AsnPhe: 1.154 ± 0.112
AsnGly: 3.75 ± 0.369
AsnHis: 0.613 ± 0.103
AsnIle: 1.515 ± 0.156
AsnLys: 1.022 ± 0.1
AsnLeu: 2.62 ± 0.176
AsnMet: 0.817 ± 0.114
AsnAsn: 1.166 ± 0.142
AsnPro: 2.14 ± 0.156
AsnGln: 1.022 ± 0.125
AsnArg: 2.128 ± 0.173
AsnSer: 1.623 ± 0.186
AsnThr: 1.683 ± 0.224
AsnVal: 1.971 ± 0.154
AsnTrp: 0.433 ± 0.082
AsnTyr: 1.25 ± 0.122
AsnXaa: 0.0 ± 0.0
Pro
ProAla: 5.445 ± 0.355
ProCys: 0.385 ± 0.068
ProAsp: 3.474 ± 0.21
ProGlu: 3.858 ± 0.257
ProPhe: 1.899 ± 0.16
ProGly: 4.279 ± 0.243
ProHis: 1.25 ± 0.137
ProIle: 2.452 ± 0.182
ProLys: 2.368 ± 0.224
ProLeu: 3.606 ± 0.206
ProMet: 1.214 ± 0.14
ProAsn: 2.116 ± 0.165
ProPro: 2.897 ± 0.234
ProGln: 1.599 ± 0.146
ProArg: 3.065 ± 0.201
ProSer: 2.945 ± 0.196
ProThr: 3.342 ± 0.255
ProVal: 3.462 ± 0.216
ProTrp: 0.697 ± 0.08
ProTyr: 1.767 ± 0.164
ProXaa: 0.0 ± 0.0
Gln
GlnAla: 4.291 ± 0.283
GlnCys: 0.288 ± 0.059
GlnAsp: 1.671 ± 0.161
GlnGlu: 2.067 ± 0.164
GlnPhe: 1.418 ± 0.14
GlnGly: 2.428 ± 0.195
GlnHis: 0.817 ± 0.094
GlnIle: 2.043 ± 0.171
GlnLys: 1.454 ± 0.128
GlnLeu: 2.019 ± 0.176
GlnMet: 0.998 ± 0.117
GlnAsn: 1.118 ± 0.108
GlnPro: 1.719 ± 0.161
GlnGln: 1.358 ± 0.183
GlnArg: 2.765 ± 0.206
GlnSer: 1.575 ± 0.152
GlnThr: 2.14 ± 0.175
GlnVal: 2.62 ± 0.198
GlnTrp: 0.577 ± 0.093
GlnTyr: 0.962 ± 0.11
GlnXaa: 0.0 ± 0.0
Arg
ArgAla: 7.116 ± 0.394
ArgCys: 0.577 ± 0.102
ArgAsp: 4.748 ± 0.287
ArgGlu: 4.399 ± 0.27
ArgPhe: 2.933 ± 0.215
ArgGly: 4.339 ± 0.269
ArgHis: 1.37 ± 0.152
ArgIle: 3.306 ± 0.189
ArgLys: 3.209 ± 0.272
ArgLeu: 6.262 ± 0.324
ArgMet: 1.611 ± 0.147
ArgAsn: 1.887 ± 0.154
ArgPro: 3.233 ± 0.23
ArgGln: 2.416 ± 0.171
ArgArg: 5.289 ± 0.337
ArgSer: 3.113 ± 0.224
ArgThr: 3.029 ± 0.195
ArgVal: 4.459 ± 0.225
ArgTrp: 1.442 ± 0.149
ArgTyr: 2.164 ± 0.175
ArgXaa: 0.0 ± 0.0
Ser
SerAla: 5.109 ± 0.295
SerCys: 0.577 ± 0.087
SerAsp: 3.065 ± 0.186
SerGlu: 2.897 ± 0.196
SerPhe: 2.296 ± 0.173
SerGly: 5.145 ± 0.647
SerHis: 1.058 ± 0.106
SerIle: 2.873 ± 0.207
SerLys: 2.705 ± 0.203
SerLeu: 4.315 ± 0.265
SerMet: 1.298 ± 0.122
SerAsn: 1.731 ± 0.17
SerPro: 3.041 ± 0.221
SerGln: 1.671 ± 0.158
SerArg: 3.45 ± 0.164
SerSer: 3.053 ± 0.288
SerThr: 3.149 ± 0.287
SerVal: 3.354 ± 0.216
SerTrp: 0.817 ± 0.08
SerTyr: 2.091 ± 0.149
SerXaa: 0.0 ± 0.0
Thr
ThrAla: 5.842 ± 0.445
ThrCys: 0.649 ± 0.086
ThrAsp: 3.738 ± 0.235
ThrGlu: 2.909 ± 0.218
ThrPhe: 2.344 ± 0.171
ThrGly: 4.724 ± 0.514
ThrHis: 1.274 ± 0.139
ThrIle: 3.245 ± 0.184
ThrLys: 2.452 ± 0.184
ThrLeu: 5.169 ± 0.254
ThrMet: 1.238 ± 0.124
ThrAsn: 1.815 ± 0.18
ThrPro: 4.207 ± 0.319
ThrGln: 1.851 ± 0.171
ThrArg: 3.245 ± 0.176
ThrSer: 3.017 ± 0.303
ThrThr: 3.666 ± 0.44
ThrVal: 4.375 ± 0.364
ThrTrp: 1.034 ± 0.111
ThrTyr: 2.128 ± 0.188
ThrXaa: 0.0 ± 0.0
Val
ValAla: 6.539 ± 0.298
ValCys: 0.577 ± 0.088
ValAsp: 4.556 ± 0.209
ValGlu: 4.712 ± 0.244
ValPhe: 2.356 ± 0.171
ValGly: 4.508 ± 0.267
ValHis: 1.37 ± 0.109
ValIle: 3.462 ± 0.221
ValLys: 2.825 ± 0.16
ValLeu: 5.493 ± 0.27
ValMet: 1.575 ± 0.143
ValAsn: 2.212 ± 0.183
ValPro: 3.702 ± 0.235
ValGln: 2.717 ± 0.227
ValArg: 4.219 ± 0.232
ValSer: 3.895 ± 0.294
ValThr: 4.291 ± 0.284
ValVal: 4.423 ± 0.242
ValTrp: 1.358 ± 0.134
ValTyr: 2.116 ± 0.153
ValXaa: 0.0 ± 0.0
Trp
TrpAla: 1.719 ± 0.144
TrpCys: 0.156 ± 0.044
TrpAsp: 0.902 ± 0.1
TrpGlu: 0.853 ± 0.101
TrpPhe: 0.745 ± 0.098
TrpGly: 0.986 ± 0.123
TrpHis: 0.529 ± 0.092
TrpIle: 0.805 ± 0.115
TrpLys: 1.01 ± 0.116
TrpLeu: 1.37 ± 0.151
TrpMet: 0.505 ± 0.073
TrpAsn: 0.721 ± 0.089
TrpPro: 0.661 ± 0.099
TrpGln: 0.517 ± 0.084
TrpArg: 1.226 ± 0.133
TrpSer: 1.214 ± 0.117
TrpThr: 1.142 ± 0.105
TrpVal: 1.13 ± 0.12
TrpTrp: 0.325 ± 0.064
TrpTyr: 0.361 ± 0.063
TrpXaa: 0.0 ± 0.0
Tyr
TyrAla: 3.017 ± 0.214
TyrCys: 0.493 ± 0.079
TyrAsp: 2.476 ± 0.155
TyrGlu: 2.128 ± 0.158
TyrPhe: 1.046 ± 0.105
TyrGly: 2.825 ± 0.187
TyrHis: 0.817 ± 0.099
TyrIle: 1.418 ± 0.123
TyrLys: 1.238 ± 0.125
TyrLeu: 2.536 ± 0.238
TyrMet: 0.529 ± 0.069
TyrAsn: 0.962 ± 0.119
TyrPro: 1.346 ± 0.127
TyrGln: 1.082 ± 0.105
TyrArg: 2.44 ± 0.183
TyrSer: 1.875 ± 0.139
TyrThr: 1.695 ± 0.152
TyrVal: 2.404 ± 0.2
TyrTrp: 0.493 ± 0.081
TyrTyr: 1.214 ± 0.146
TyrXaa: 0.0 ± 0.0
Xaa
XaaAla: 0.0 ± 0.0
XaaCys: 0.0 ± 0.0
XaaAsp: 0.0 ± 0.0
XaaGlu: 0.0 ± 0.0
XaaPhe: 0.0 ± 0.0
XaaGly: 0.0 ± 0.0
XaaHis: 0.0 ± 0.0
XaaIle: 0.0 ± 0.0
XaaLys: 0.0 ± 0.0
XaaLeu: 0.0 ± 0.0
XaaMet: 0.0 ± 0.0
XaaAsn: 0.0 ± 0.0
XaaPro: 0.0 ± 0.0
XaaGln: 0.0 ± 0.0
XaaArg: 0.0 ± 0.0
XaaSer: 0.0 ± 0.0
XaaThr: 0.0 ± 0.0
XaaVal: 0.0 ± 0.0
XaaTrp: 0.0 ± 0.0
XaaTyr: 0.0 ± 0.0
XaaXaa: 0.0 ± 0.0
Statistics based on 448 proteins (83195 amino acids)

Note: The error was estimated by bootstrapping (×100) at the protein level.
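One way to read "bootstrapping at the protein level" is that whole proteins, not individual dipeptides, are resampled with replacement, and the spread of the recomputed frequencies is reported as the error. The sketch below follows that reading. The function names, the fixed seed, and the use of the sample standard deviation as the error measure are assumptions, not details confirmed by the source.

```python
import random
from collections import Counter
from statistics import stdev


def pair_counts(seq):
    """Count overlapping dipeptides in one protein sequence."""
    return Counter(seq[i:i + 2] for i in range(len(seq) - 1))


def bootstrap_error(proteins, pair, n_boot=100, seed=0):
    """Estimate the error (per mille) of one dipeptide frequency.

    Whole proteins are resampled with replacement n_boot times; the standard
    deviation of the resampled frequencies is returned (an assumed error measure).
    """
    rng = random.Random(seed)
    per_protein = [pair_counts(s) for s in proteins]
    estimates = []
    for _ in range(n_boot):
        sample = rng.choices(per_protein, k=len(per_protein))
        hits = sum(c[pair] for c in sample)
        total = sum(sum(c.values()) for c in sample)
        estimates.append(1000.0 * hits / total if total else 0.0)
    return stdev(estimates)


toy = ["MAAGLAA", "GGAAKL", "MKLAAG", "AGAGAG"]  # hypothetical toy proteome
print(bootstrap_error(toy, "AA"))
```

Resampling whole proteins keeps the dipeptides within each protein together, so the estimate reflects variation between proteins rather than treating every dipeptide as independent.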

The dipeptide statistics above (along with other statistics for this proteome) can be downloaded from this CSV file
See this proteome in: uniprot_link
Proteome-pI is available under the Creative Commons Attribution-NoDerivs license; for more details see here

Reference: Kozlowski LP. Proteome-pI 2.0: Proteome Isoelectric Point Database Update. Nucleic Acids Res. 2021, doi: 10.1093/nar/gkab944
Contact: Lukasz P. Kozlowski