Amino acid dipeptide frequency for Mycobacterium phage Phabba

Apart from single amino acid frequencies, one can also calculate the so-called amino acid dipeptide frequency.

There are 400 possible dipeptides (441 if Xaa is counted as an additional 21st amino acid representing all non-standard or unknown amino acids). Thus, if dipeptides occurred randomly in proteins, each dipeptide would be present at a frequency of 0.25% (2.5‰). As this is not the case in nature, for better visibility all dipeptides that occur more often than expected are marked in red in the table, and those that are underrepresented are marked in blue.
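As an illustration of the calculation described above, here is a minimal Python sketch that counts overlapping residue pairs within each protein and reports them in per mille. The function name `dipeptide_frequencies`, the one-letter alphabet, the toy sequences, and the choice to normalize by the total number of counted pairs are all assumptions made for this example; the database may compute and normalize its values differently.

```python
from collections import Counter
from itertools import product

# The 20 standard amino acids in one-letter code (the table uses three-letter codes).
STANDARD = "ACDEFGHIKLMNPQRSTVWY"


def dipeptide_frequencies(proteins):
    """Return {dipeptide: frequency in per mille} over all proteins.

    Overlapping pairs are counted within each protein only, and any residue
    outside STANDARD is mapped to 'X' (Xaa). Normalizing by the total number
    of counted pairs is an assumption of this sketch.
    """
    counts = Counter()
    for seq in proteins:
        seq = "".join(c if c in STANDARD else "X" for c in seq.upper())
        counts.update(seq[i:i + 2] for i in range(len(seq) - 1))
    total = sum(counts.values())
    alphabet = STANDARD + "X"
    return {
        a + b: (1000.0 * counts[a + b] / total if total else 0.0)
        for a, b in product(alphabet, repeat=2)
    }


# Hypothetical toy input, not the Phabba proteome.
toy = ["MAAAG", "MAGA"]
freqs = dipeptide_frequencies(toy)
print(len(freqs))             # number of possible pairs, including Xaa
print(round(freqs["AA"], 1))  # per-mille frequency of AlaAla in the toy input
```

With only the 20 standard residues, a uniform distribution would give each of the 400 pairs 1000/400 = 2.5‰, which is the baseline the text compares against.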

All values are given in per mille (‰) and therefore need to be multiplied by 10⁻³ to obtain fractions.
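The unit conversion and the comparison with the uniform expectation can be sketched as follows. The helper names `per_mille_to_fraction` and `classify` are hypothetical, and using 2.5‰ (1000/400) as the threshold between over- and underrepresented dipeptides is an assumption based on the description above, not a documented rule of the page's coloring. The three example values are taken from the table.

```python
# Uniform expectation for the 400 standard dipeptides, in per mille.
EXPECTED_PER_MILLE = 1000.0 / 400  # 2.5


def per_mille_to_fraction(value):
    """Convert a per-mille value to a plain fraction (multiply by 10^-3)."""
    return value * 1e-3


def classify(value, expected=EXPECTED_PER_MILLE):
    """Label a per-mille frequency relative to the assumed expectation."""
    if value > expected:
        return "over"
    if value < expected:
        return "under"
    return "expected"


# Values copied from the table (per mille).
examples = {"AlaAla": 10.203, "CysCys": 0.138, "LysThr": 2.58}
for name, value in examples.items():
    print(name, per_mille_to_fraction(value), classify(value))
```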

For more information, see the sequence space article on Wikipedia.

Ala Cys Asp Glu Phe Gly His Ile Lys Leu Met Asn Pro Gln Arg Ser Thr Val Trp Tyr Xaa
Ala
AlaAla: 10.203 ± 0.574
AlaCys: 0.689 ± 0.115
AlaAsp: 6.481 ± 0.389
AlaGlu: 6.815 ± 0.471
AlaPhe: 2.699 ± 0.255
AlaGly: 7.761 ± 0.531
AlaHis: 1.733 ± 0.22
AlaIle: 4.294 ± 0.318
AlaLys: 4.491 ± 0.284
AlaLeu: 7.091 ± 0.433
AlaMet: 2.876 ± 0.2
AlaAsn: 3.703 ± 0.338
AlaPro: 5.378 ± 0.42
AlaGln: 4.708 ± 0.322
AlaArg: 6.067 ± 0.319
AlaSer: 5.24 ± 0.276
AlaThr: 5.043 ± 0.444
AlaVal: 6.52 ± 0.429
AlaTrp: 1.438 ± 0.151
AlaTyr: 2.462 ± 0.258
AlaXaa: 0.0 ± 0.0
Cys
CysAla: 0.611 ± 0.117
CysCys: 0.138 ± 0.059
CysAsp: 0.709 ± 0.146
CysGlu: 0.355 ± 0.084
CysPhe: 0.276 ± 0.08
CysGly: 1.083 ± 0.155
CysHis: 0.295 ± 0.085
CysIle: 0.276 ± 0.082
CysLys: 0.394 ± 0.083
CysLeu: 0.63 ± 0.109
CysMet: 0.118 ± 0.046
CysAsn: 0.374 ± 0.104
CysPro: 0.512 ± 0.106
CysGln: 0.236 ± 0.058
CysArg: 0.552 ± 0.13
CysSer: 0.571 ± 0.108
CysThr: 0.295 ± 0.081
CysVal: 0.433 ± 0.096
CysTrp: 0.079 ± 0.041
CysTyr: 0.315 ± 0.086
CysXaa: 0.0 ± 0.0
Asp
AspAla: 6.087 ± 0.377
AspCys: 0.532 ± 0.105
AspAsp: 5.87 ± 0.527
AspGlu: 5.102 ± 0.314
AspPhe: 2.364 ± 0.256
AspGly: 5.535 ± 0.307
AspHis: 1.576 ± 0.179
AspIle: 2.856 ± 0.229
AspLys: 2.561 ± 0.256
AspLeu: 5.181 ± 0.377
AspMet: 1.793 ± 0.147
AspAsn: 2.088 ± 0.245
AspPro: 4.944 ± 0.361
AspGln: 2.955 ± 0.215
AspArg: 4.117 ± 0.249
AspSer: 3.723 ± 0.399
AspThr: 3.762 ± 0.287
AspVal: 4.57 ± 0.348
AspTrp: 1.379 ± 0.203
AspTyr: 2.758 ± 0.241
AspXaa: 0.0 ± 0.0
Glu
GluAla: 5.929 ± 0.408
GluCys: 0.473 ± 0.1
GluAsp: 4.373 ± 0.339
GluGlu: 4.215 ± 0.376
GluPhe: 2.403 ± 0.225
GluGly: 4.649 ± 0.345
GluHis: 1.694 ± 0.201
GluIle: 3.782 ± 0.235
GluLys: 3.014 ± 0.242
GluLeu: 5.181 ± 0.337
GluMet: 1.536 ± 0.216
GluAsn: 1.97 ± 0.169
GluPro: 3.29 ± 0.344
GluGln: 3.408 ± 0.242
GluArg: 4.59 ± 0.351
GluSer: 2.777 ± 0.255
GluThr: 3.171 ± 0.27
GluVal: 4.491 ± 0.346
GluTrp: 1.517 ± 0.202
GluTyr: 1.95 ± 0.218
GluXaa: 0.0 ± 0.0
Phe
PheAla: 2.935 ± 0.245
PheCys: 0.236 ± 0.066
PheAsp: 2.58 ± 0.187
PheGlu: 2.167 ± 0.192
PhePhe: 0.847 ± 0.118
PheGly: 2.876 ± 0.245
PheHis: 0.729 ± 0.11
PheIle: 1.339 ± 0.181
PheLys: 1.497 ± 0.18
PheLeu: 2.305 ± 0.244
PheMet: 1.083 ± 0.162
PheAsn: 1.261 ± 0.152
PhePro: 1.28 ± 0.143
PheGln: 1.339 ± 0.155
PheArg: 1.517 ± 0.166
PheSer: 1.852 ± 0.176
PheThr: 2.147 ± 0.217
PheVal: 1.694 ± 0.185
PheTrp: 0.571 ± 0.114
PheTyr: 1.064 ± 0.138
PheXaa: 0.0 ± 0.0
Gly
GlyAla: 6.382 ± 0.493
GlyCys: 0.808 ± 0.139
GlyAsp: 4.984 ± 0.297
GlyGlu: 5.378 ± 0.388
GlyPhe: 2.6 ± 0.24
GlyGly: 9.258 ± 1.163
GlyHis: 1.871 ± 0.233
GlyIle: 4.235 ± 0.299
GlyLys: 3.9 ± 0.298
GlyLeu: 5.85 ± 0.284
GlyMet: 2.009 ± 0.228
GlyAsn: 2.896 ± 0.283
GlyPro: 3.94 ± 0.391
GlyGln: 3.782 ± 0.255
GlyArg: 4.865 ± 0.282
GlySer: 5.121 ± 0.441
GlyThr: 6.421 ± 0.479
GlyVal: 5.456 ± 0.344
GlyTrp: 1.674 ± 0.196
GlyTyr: 3.053 ± 0.26
GlyXaa: 0.0 ± 0.0
His
HisAla: 2.088 ± 0.2
HisCys: 0.236 ± 0.073
HisAsp: 1.339 ± 0.183
HisGlu: 1.438 ± 0.169
HisPhe: 0.729 ± 0.123
HisGly: 1.694 ± 0.189
HisHis: 0.67 ± 0.141
HisIle: 1.024 ± 0.133
HisLys: 0.689 ± 0.124
HisLeu: 1.891 ± 0.173
HisMet: 0.414 ± 0.121
HisAsn: 0.473 ± 0.103
HisPro: 1.3 ± 0.166
HisGln: 1.064 ± 0.161
HisArg: 1.615 ± 0.196
HisSer: 1.044 ± 0.156
HisThr: 1.221 ± 0.176
HisVal: 1.635 ± 0.183
HisTrp: 0.453 ± 0.1
HisTyr: 0.729 ± 0.125
HisXaa: 0.0 ± 0.0
Ile
IleAla: 4.668 ± 0.294
IleCys: 0.453 ± 0.106
IleAsp: 3.861 ± 0.245
IleGlu: 2.896 ± 0.211
IlePhe: 1.359 ± 0.17
IleGly: 3.329 ± 0.257
IleHis: 0.749 ± 0.128
IleIle: 1.93 ± 0.151
IleLys: 2.009 ± 0.193
IleLeu: 3.014 ± 0.314
IleMet: 1.083 ± 0.15
IleAsn: 2.068 ± 0.261
IlePro: 2.502 ± 0.207
IleGln: 1.891 ± 0.195
IleArg: 3.29 ± 0.237
IleSer: 2.383 ± 0.22
IleThr: 3.112 ± 0.255
IleVal: 3.073 ± 0.277
IleTrp: 0.709 ± 0.155
IleTyr: 0.965 ± 0.138
IleXaa: 0.0 ± 0.0
Lys
LysAla: 4.964 ± 0.409
LysCys: 0.374 ± 0.113
LysAsp: 2.521 ± 0.209
LysGlu: 2.403 ± 0.222
LysPhe: 1.359 ± 0.212
LysGly: 3.171 ± 0.235
LysHis: 0.847 ± 0.158
LysIle: 2.108 ± 0.219
LysLys: 2.6 ± 0.301
LysLeu: 3.743 ± 0.273
LysMet: 1.123 ± 0.146
LysAsn: 1.576 ± 0.15
LysPro: 2.423 ± 0.219
LysGln: 1.674 ± 0.183
LysArg: 3.27 ± 0.243
LysSer: 1.911 ± 0.201
LysThr: 2.58 ± 0.23
LysVal: 3.014 ± 0.283
LysTrp: 0.808 ± 0.144
LysTyr: 1.202 ± 0.152
LysXaa: 0.0 ± 0.0
Leu
LeuAla: 7.722 ± 0.403
LeuCys: 0.729 ± 0.124
LeuAsp: 5.771 ± 0.389
LeuGlu: 4.215 ± 0.29
LeuPhe: 2.049 ± 0.193
LeuGly: 5.555 ± 0.301
LeuHis: 1.615 ± 0.202
LeuIle: 3.23 ± 0.224
LeuLys: 3.29 ± 0.263
LeuLeu: 4.649 ± 0.315
LeuMet: 1.832 ± 0.192
LeuAsn: 2.364 ± 0.214
LeuPro: 3.782 ± 0.256
LeuGln: 3.211 ± 0.308
LeuArg: 5.003 ± 0.296
LeuSer: 3.487 ± 0.298
LeuThr: 5.062 ± 0.415
LeuVal: 5.121 ± 0.328
LeuTrp: 1.261 ± 0.147
LeuTyr: 2.285 ± 0.195
LeuXaa: 0.0 ± 0.0
Met
MetAla: 2.856 ± 0.257
MetCys: 0.098 ± 0.041
MetAsp: 1.379 ± 0.181
MetGlu: 1.576 ± 0.222
MetPhe: 0.591 ± 0.106
MetGly: 1.97 ± 0.244
MetHis: 0.414 ± 0.109
MetIle: 1.024 ± 0.154
MetLys: 0.926 ± 0.137
MetLeu: 1.517 ± 0.156
MetMet: 0.65 ± 0.152
MetAsn: 1.005 ± 0.171
MetPro: 1.536 ± 0.173
MetGln: 1.024 ± 0.172
MetArg: 1.714 ± 0.187
MetSer: 2.521 ± 0.235
MetThr: 2.226 ± 0.234
MetVal: 1.477 ± 0.165
MetTrp: 0.315 ± 0.07
MetTyr: 0.552 ± 0.125
MetXaa: 0.0 ± 0.0
Asn
AsnAla: 3.664 ± 0.361
AsnCys: 0.276 ± 0.087
AsnAsp: 2.108 ± 0.189
AsnGlu: 2.029 ± 0.196
AsnPhe: 1.182 ± 0.143
AsnGly: 3.624 ± 0.243
AsnHis: 0.63 ± 0.107
AsnIle: 1.399 ± 0.189
AsnLys: 1.28 ± 0.166
AsnLeu: 2.777 ± 0.245
AsnMet: 0.808 ± 0.133
AsnAsn: 1.182 ± 0.161
AsnPro: 3.093 ± 0.259
AsnGln: 1.221 ± 0.174
AsnArg: 2.364 ± 0.212
AsnSer: 1.733 ± 0.193
AsnThr: 2.068 ± 0.23
AsnVal: 2.561 ± 0.242
AsnTrp: 0.847 ± 0.153
AsnTyr: 1.083 ± 0.137
AsnXaa: 0.0 ± 0.0
Pro
ProAla: 5.259 ± 0.364
ProCys: 0.394 ± 0.078
ProAsp: 4.432 ± 0.292
ProGlu: 4.353 ± 0.405
ProPhe: 1.674 ± 0.189
ProGly: 5.594 ± 0.347
ProHis: 1.221 ± 0.189
ProIle: 2.127 ± 0.206
ProLys: 1.793 ± 0.233
ProLeu: 2.994 ± 0.241
ProMet: 1.241 ± 0.158
ProAsn: 1.989 ± 0.184
ProPro: 3.132 ± 0.474
ProGln: 3.211 ± 0.282
ProArg: 2.521 ± 0.259
ProSer: 3.368 ± 0.255
ProThr: 3.506 ± 0.274
ProVal: 4.629 ± 0.345
ProTrp: 0.788 ± 0.12
ProTyr: 1.615 ± 0.174
ProXaa: 0.0 ± 0.0
Gln
GlnAla: 5.318 ± 0.364
GlnCys: 0.138 ± 0.048
GlnAsp: 2.659 ± 0.228
GlnGlu: 2.502 ± 0.229
GlnPhe: 1.793 ± 0.195
GlnGly: 3.25 ± 0.285
GlnHis: 0.945 ± 0.145
GlnIle: 2.186 ± 0.175
GlnLys: 1.714 ± 0.152
GlnLeu: 3.526 ± 0.305
GlnMet: 0.906 ± 0.123
GlnAsn: 1.694 ± 0.179
GlnPro: 2.324 ± 0.242
GlnGln: 2.186 ± 0.216
GlnArg: 3.191 ± 0.277
GlnSer: 2.186 ± 0.205
GlnThr: 2.127 ± 0.233
GlnVal: 2.994 ± 0.256
GlnTrp: 0.749 ± 0.118
GlnTyr: 1.694 ± 0.196
GlnXaa: 0.0 ± 0.0
Arg
ArgAla: 5.259 ± 0.354
ArgCys: 0.591 ± 0.132
ArgAsp: 3.92 ± 0.32
ArgGlu: 4.59 ± 0.29
ArgPhe: 2.049 ± 0.241
ArgGly: 4.274 ± 0.349
ArgHis: 1.241 ± 0.181
ArgIle: 3.191 ± 0.251
ArgLys: 3.171 ± 0.234
ArgLeu: 5.023 ± 0.34
ArgMet: 1.753 ± 0.186
ArgAsn: 2.502 ± 0.242
ArgPro: 2.777 ± 0.193
ArgGln: 2.974 ± 0.255
ArgArg: 4.865 ± 0.381
ArgSer: 3.447 ± 0.249
ArgThr: 3.703 ± 0.255
ArgVal: 5.023 ± 0.34
ArgTrp: 1.182 ± 0.145
ArgTyr: 2.206 ± 0.214
ArgXaa: 0.0 ± 0.0
Ser
SerAla: 5.161 ± 0.355
SerCys: 0.394 ± 0.095
SerAsp: 3.743 ± 0.383
SerGlu: 2.994 ± 0.241
SerPhe: 1.911 ± 0.204
SerGly: 5.574 ± 0.43
SerHis: 1.103 ± 0.162
SerIle: 2.817 ± 0.245
SerLys: 2.009 ± 0.239
SerLeu: 3.546 ± 0.257
SerMet: 1.477 ± 0.195
SerAsn: 2.088 ± 0.225
SerPro: 2.856 ± 0.275
SerGln: 2.029 ± 0.226
SerArg: 3.014 ± 0.265
SerSer: 3.762 ± 0.531
SerThr: 3.861 ± 0.308
SerVal: 3.743 ± 0.276
SerTrp: 0.886 ± 0.123
SerTyr: 2.009 ± 0.209
SerXaa: 0.0 ± 0.0
Thr
ThrAla: 5.949 ± 0.504
ThrCys: 0.63 ± 0.119
ThrAsp: 3.88 ± 0.36
ThrGlu: 3.506 ± 0.274
ThrPhe: 2.009 ± 0.247
ThrGly: 6.146 ± 0.367
ThrHis: 1.458 ± 0.17
ThrIle: 2.679 ± 0.24
ThrLys: 2.935 ± 0.234
ThrLeu: 4.708 ± 0.324
ThrMet: 1.793 ± 0.201
ThrAsn: 2.147 ± 0.223
ThrPro: 4.727 ± 0.278
ThrGln: 2.226 ± 0.237
ThrArg: 3.683 ± 0.286
ThrSer: 2.896 ± 0.258
ThrThr: 4.215 ± 0.38
ThrVal: 4.668 ± 0.313
ThrTrp: 1.123 ± 0.171
ThrTyr: 1.635 ± 0.172
ThrXaa: 0.0 ± 0.0
Val
ValAla: 6.835 ± 0.387
ValCys: 0.63 ± 0.121
ValAsp: 5.476 ± 0.335
ValGlu: 4.905 ± 0.296
ValPhe: 1.93 ± 0.178
ValGly: 5.2 ± 0.344
ValHis: 1.615 ± 0.142
ValIle: 2.876 ± 0.21
ValLys: 3.27 ± 0.268
ValLeu: 4.846 ± 0.353
ValMet: 1.615 ± 0.166
ValAsn: 2.679 ± 0.27
ValPro: 3.88 ± 0.319
ValGln: 2.679 ± 0.245
ValArg: 4.058 ± 0.293
ValSer: 4.077 ± 0.289
ValThr: 4.984 ± 0.314
ValVal: 5.496 ± 0.462
ValTrp: 1.379 ± 0.17
ValTyr: 2.167 ± 0.202
ValXaa: 0.0 ± 0.0
Trp
TrpAla: 1.3 ± 0.132
TrpCys: 0.177 ± 0.056
TrpAsp: 1.379 ± 0.156
TrpGlu: 1.241 ± 0.17
TrpPhe: 0.512 ± 0.103
TrpGly: 1.379 ± 0.171
TrpHis: 0.532 ± 0.1
TrpIle: 0.65 ± 0.102
TrpLys: 1.064 ± 0.155
TrpLeu: 1.477 ± 0.137
TrpMet: 0.433 ± 0.083
TrpAsn: 0.63 ± 0.124
TrpPro: 0.749 ± 0.132
TrpGln: 0.689 ± 0.137
TrpArg: 1.339 ± 0.186
TrpSer: 1.024 ± 0.14
TrpThr: 1.32 ± 0.172
TrpVal: 1.32 ± 0.165
TrpTrp: 0.414 ± 0.08
TrpTyr: 0.611 ± 0.123
TrpXaa: 0.0 ± 0.0
Tyr
TyrAla: 2.502 ± 0.178
TyrCys: 0.256 ± 0.07
TyrAsp: 2.285 ± 0.236
TyrGlu: 1.733 ± 0.189
TyrPhe: 0.985 ± 0.135
TyrGly: 2.561 ± 0.259
TyrHis: 0.808 ± 0.121
TyrIle: 1.32 ± 0.183
TyrLys: 1.064 ± 0.156
TyrLeu: 2.305 ± 0.243
TyrMet: 0.808 ± 0.129
TyrAsn: 1.241 ± 0.141
TyrPro: 1.418 ± 0.161
TyrGln: 1.576 ± 0.151
TyrArg: 2.108 ± 0.2
TyrSer: 1.852 ± 0.243
TyrThr: 2.186 ± 0.244
TyrVal: 2.679 ± 0.24
TyrTrp: 0.67 ± 0.109
TyrTyr: 0.945 ± 0.173
TyrXaa: 0.0 ± 0.0
Xaa
XaaAla: 0.0 ± 0.0
XaaCys: 0.0 ± 0.0
XaaAsp: 0.0 ± 0.0
XaaGlu: 0.0 ± 0.0
XaaPhe: 0.0 ± 0.0
XaaGly: 0.0 ± 0.0
XaaHis: 0.0 ± 0.0
XaaIle: 0.0 ± 0.0
XaaLys: 0.0 ± 0.0
XaaLeu: 0.0 ± 0.0
XaaMet: 0.0 ± 0.0
XaaAsn: 0.0 ± 0.0
XaaPro: 0.0 ± 0.0
XaaGln: 0.0 ± 0.0
XaaArg: 0.0 ± 0.0
XaaSer: 0.0 ± 0.0
XaaThr: 0.0 ± 0.0
XaaVal: 0.0 ± 0.0
XaaTrp: 0.0 ± 0.0
XaaTyr: 0.0 ± 0.0
XaaXaa: 0.0 ± 0.0
Statistics based on 244 proteins (50768 amino acids)

Note: The error was estimated by bootstrapping (100 replicates) at the protein level.
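A protein-level bootstrap of this kind might look like the sketch below: whole proteins are resampled with replacement, the dipeptide frequency is recomputed for each resample, and the spread of the resampled estimates is reported. The function names, the toy sequences, the fixed seed, and the use of the sample standard deviation as the reported error are assumptions for illustration; the note above does not specify which spread statistic the database uses.

```python
import random
import statistics


def dipeptide_per_mille(proteins, dipeptide):
    """Per-mille frequency of one dipeptide among all within-protein pairs."""
    hits = total = 0
    for seq in proteins:
        for i in range(len(seq) - 1):
            total += 1
            hits += seq[i:i + 2] == dipeptide
    return 1000.0 * hits / total if total else 0.0


def bootstrap_error(proteins, dipeptide, n_resamples=100, seed=0):
    """Standard deviation of the frequency over protein-level resamples."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    estimates = []
    for _ in range(n_resamples):
        # Resample whole proteins with replacement, keeping the proteome size.
        sample = rng.choices(proteins, k=len(proteins))
        estimates.append(dipeptide_per_mille(sample, dipeptide))
    return statistics.stdev(estimates)


# Hypothetical toy input, not the Phabba proteome.
toy = ["MAAAG", "MAGA", "MKKL"]
print(dipeptide_per_mille(toy, "AA"), bootstrap_error(toy, "AA"))
```

Resampling proteins rather than individual residues keeps each sequence intact, so the estimate reflects variation between proteins.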

The dipeptide statistics above (along with other statistics for this proteome) can be downloaded from this CSV file
See this proteome in: uniprot_link
Proteome-pI is available under Creative Commons Attribution-NoDerivs license, for more details see here

Reference: Kozlowski LP. Proteome-pI 2.0: Proteome Isoelectric Point Database Update. Nucleic Acids Res. 2021, doi: 10.1093/nar/gkab944
Contact: Lukasz P. Kozlowski