Amino acid dipeptide frequency for Gordonia phage Terapin

Apart from single amino acid frequencies, one can also calculate the so-called amino acid dipeptide frequency.

There are 400 possible dipeptides (441 if Xaa is counted as an additional 21st amino acid representing all non-standard or unknown amino acids). Thus, if dipeptides occurred randomly and with equal probability in proteins, each of the 400 would be present at a frequency of 0.25% (2.5‰). As this is not the case in nature, for better visibility all overrepresented dipeptides are marked in red and all underrepresented ones in blue in the table.
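
As a minimal sketch of how such frequencies could be computed, the Python snippet below counts overlapping residue pairs within each protein and reports them in per mille. The function names, the use of one-letter codes, and the choice of denominator (the total number of counted pairs) are assumptions made for illustration; the database may normalise differently.

```python
from collections import Counter


def dipeptide_per_mille(proteins):
    """Return {dipeptide: frequency in per mille}.

    Assumption: overlapping pairs are counted inside each protein only
    (no pair spans two proteins) and the denominator is the total number
    of pairs counted.
    """
    counts = Counter()
    for seq in proteins:
        for i in range(len(seq) - 1):
            counts[seq[i:i + 2]] += 1
    total = sum(counts.values())
    if total == 0:
        return {}
    return {pair: 1000.0 * n / total for pair, n in counts.items()}


def classify(freq_per_mille, n_possible=400):
    """Compare a frequency with the uniform expectation (1000 / 400 = 2.5 per mille)."""
    expected = 1000.0 / n_possible
    if freq_per_mille > expected:
        return "over"
    if freq_per_mille < expected:
        return "under"
    return "expected"


# Toy input with one-letter codes (hypothetical sequences, not from this proteome).
freqs = dipeptide_per_mille(["MAAK", "MAG"])
for pair, value in sorted(freqs.items()):
    print(pair, round(value, 3), classify(value))
```

With real data the input would be the 97 protein sequences of the proteome, and the "over"/"under" labels would correspond to the red and blue marking described above, under the assumption that the marking uses the uniform 2.5‰ expectation.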

All values are given in per mille (‰) and must therefore be multiplied by 10⁻³ to obtain fractions.
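
The unit conversion can be illustrated with a short Python sketch. The helper names are hypothetical, and the fold-change relative to the uniform expectation of 1/400 is an illustrative addition, not a quantity reported in the table.

```python
def per_mille_to_fraction(value):
    """Convert a per mille value to a plain fraction (multiply by 10^-3)."""
    return value * 1e-3


def per_mille_to_percent(value):
    """Convert a per mille value to percent (divide by 10)."""
    return value / 10.0


def fold_vs_uniform(value, n_possible=400):
    """Ratio of the observed frequency to the uniform expectation 1 / n_possible."""
    return per_mille_to_fraction(value) * n_possible


# AlaAla is listed at 9.615 per mille in the table.
ala_ala = 9.615
print(per_mille_to_fraction(ala_ala))
print(per_mille_to_percent(ala_ala))
print(round(fold_vs_uniform(ala_ala), 3))
```

For AlaAla this gives a fraction of about 0.0096 (about 0.96%), roughly 3.8 times the uniform expectation of 0.25%.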

For more information, see the sequence space article on Wikipedia.

Rows give the first residue of the dipeptide and columns the second; each cell is frequency ± error in ‰.

| | Ala | Cys | Asp | Glu | Phe | Gly | His | Ile | Lys | Leu | Met | Asn | Pro | Gln | Arg | Ser | Thr | Val | Trp | Tyr | Xaa |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Ala | 9.615 ± 1.281 | 0.568 ± 0.18 | 5.305 ± 0.519 | 7.294 ± 0.848 | 3.031 ± 0.597 | 7.01 ± 0.777 | 1.231 ± 0.199 | 5.21 ± 0.557 | 5.352 ± 0.489 | 7.389 ± 0.627 | 3.221 ± 0.442 | 3.789 ± 0.493 | 3.742 ± 0.521 | 4.31 ± 0.588 | 4.879 ± 0.581 | 5.778 ± 0.644 | 4.594 ± 0.4 | 5.921 ± 0.633 | 1.563 ± 0.29 | 2.131 ± 0.373 | 0.0 ± 0.0 |
| Cys | 0.426 ± 0.141 | 0.047 ± 0.048 | 0.758 ± 0.233 | 0.521 ± 0.172 | 0.189 ± 0.099 | 0.9 ± 0.236 | 0.142 ± 0.081 | 0.237 ± 0.093 | 0.426 ± 0.153 | 0.474 ± 0.16 | 0.142 ± 0.077 | 0.426 ± 0.135 | 0.426 ± 0.153 | 0.237 ± 0.114 | 0.237 ± 0.088 | 0.379 ± 0.119 | 0.805 ± 0.18 | 0.379 ± 0.13 | 0.237 ± 0.097 | 0.332 ± 0.112 | 0.0 ± 0.0 |
| Asp | 5.778 ± 0.477 | 0.332 ± 0.127 | 5.447 ± 1.189 | 6.584 ± 0.841 | 2.7 ± 0.453 | 4.026 ± 0.503 | 1.563 ± 0.263 | 2.558 ± 0.293 | 3.268 ± 0.473 | 5.589 ± 0.508 | 1.942 ± 0.274 | 2.226 ± 0.323 | 4.5 ± 0.497 | 2.368 ± 0.361 | 3.836 ± 0.551 | 3.031 ± 0.407 | 3.173 ± 0.527 | 4.736 ± 0.468 | 1.231 ± 0.277 | 1.847 ± 0.326 | 0.0 ± 0.0 |
| Glu | 8.241 ± 0.719 | 0.805 ± 0.222 | 6.015 ± 0.888 | 6.584 ± 0.884 | 2.7 ± 0.431 | 6.489 ± 0.585 | 1.61 ± 0.331 | 3.694 ± 0.425 | 4.31 ± 0.5 | 5.542 ± 0.45 | 2.605 ± 0.35 | 2.558 ± 0.343 | 2.321 ± 0.302 | 2.747 ± 0.358 | 4.263 ± 0.564 | 2.842 ± 0.344 | 2.463 ± 0.364 | 4.215 ± 0.488 | 1.895 ± 0.253 | 2.889 ± 0.393 | 0.0 ± 0.0 |
| Phe | 3.221 ± 0.436 | 0.332 ± 0.111 | 2.084 ± 0.279 | 2.084 ± 0.332 | 1.089 ± 0.231 | 3.315 ± 0.375 | 0.616 ± 0.201 | 1.468 ± 0.286 | 1.705 ± 0.253 | 2.7 ± 0.356 | 0.853 ± 0.205 | 1.137 ± 0.258 | 1.895 ± 0.339 | 1.705 ± 0.294 | 2.273 ± 0.361 | 1.989 ± 0.29 | 1.847 ± 0.303 | 2.51 ± 0.323 | 0.426 ± 0.112 | 0.995 ± 0.186 | 0.0 ± 0.0 |
| Gly | 6.205 ± 0.848 | 0.71 ± 0.183 | 6.015 ± 1.013 | 5.921 ± 0.697 | 3.173 ± 0.386 | 7.199 ± 0.821 | 1.847 ± 0.297 | 5.305 ± 0.782 | 5.305 ± 0.641 | 6.299 ± 0.721 | 2.652 ± 0.364 | 3.031 ± 0.32 | 3.694 ± 0.597 | 3.742 ± 0.522 | 4.5 ± 0.476 | 5.352 ± 0.536 | 5.257 ± 0.565 | 5.873 ± 0.546 | 1.752 ± 0.276 | 2.321 ± 0.444 | 0.0 ± 0.0 |
| His | 1.942 ± 0.371 | 0.095 ± 0.067 | 1.184 ± 0.22 | 1.326 ± 0.27 | 0.947 ± 0.212 | 1.61 ± 0.286 | 0.663 ± 0.18 | 0.853 ± 0.167 | 0.758 ± 0.19 | 2.037 ± 0.258 | 0.426 ± 0.128 | 0.426 ± 0.145 | 1.184 ± 0.267 | 0.521 ± 0.158 | 1.563 ± 0.327 | 1.137 ± 0.201 | 1.184 ± 0.234 | 1.231 ± 0.25 | 0.521 ± 0.156 | 0.9 ± 0.242 | 0.0 ± 0.0 |
| Ile | 5.873 ± 0.75 | 0.568 ± 0.177 | 3.505 ± 0.403 | 2.937 ± 0.36 | 1.658 ± 0.309 | 4.073 ± 0.583 | 1.089 ± 0.209 | 1.8 ± 0.276 | 2.463 ± 0.277 | 3.742 ± 0.405 | 1.089 ± 0.205 | 1.847 ± 0.267 | 2.51 ± 0.401 | 2.558 ± 0.515 | 3.41 ± 0.312 | 2.7 ± 0.342 | 2.226 ± 0.405 | 2.984 ± 0.444 | 0.616 ± 0.201 | 1.279 ± 0.242 | 0.0 ± 0.0 |
| Lys | 6.015 ± 0.621 | 0.426 ± 0.142 | 3.363 ± 0.541 | 3.931 ± 0.447 | 1.8 ± 0.332 | 4.215 ± 0.828 | 0.805 ± 0.225 | 3.458 ± 0.429 | 3.315 ± 0.525 | 4.31 ± 0.429 | 1.421 ± 0.259 | 2.652 ± 0.369 | 2.605 ± 0.368 | 2.131 ± 0.261 | 4.784 ± 0.675 | 2.368 ± 0.374 | 2.558 ± 0.307 | 3.079 ± 0.427 | 1.184 ± 0.216 | 1.468 ± 0.246 | 0.0 ± 0.0 |
| Leu | 6.726 ± 0.484 | 0.758 ± 0.216 | 4.594 ± 0.415 | 5.447 ± 0.516 | 2.084 ± 0.325 | 6.678 ± 0.543 | 1.421 ± 0.248 | 3.315 ± 0.426 | 4.358 ± 0.449 | 4.594 ± 0.423 | 1.989 ± 0.27 | 2.842 ± 0.329 | 3.6 ± 0.455 | 2.652 ± 0.395 | 5.826 ± 0.438 | 4.358 ± 0.381 | 3.694 ± 0.414 | 5.589 ± 0.46 | 1.137 ± 0.193 | 1.752 ± 0.259 | 0.0 ± 0.0 |
| Met | 2.7 ± 0.366 | 0.379 ± 0.132 | 1.042 ± 0.267 | 1.658 ± 0.292 | 0.947 ± 0.18 | 1.942 ± 0.305 | 0.521 ± 0.168 | 1.421 ± 0.298 | 1.847 ± 0.303 | 1.61 ± 0.255 | 0.426 ± 0.123 | 0.71 ± 0.181 | 0.805 ± 0.203 | 1.231 ± 0.246 | 1.468 ± 0.287 | 2.463 ± 0.369 | 1.942 ± 0.314 | 1.8 ± 0.297 | 0.568 ± 0.158 | 0.805 ± 0.168 | 0.0 ± 0.0 |
| Asn | 3.884 ± 0.58 | 0.189 ± 0.095 | 1.516 ± 0.228 | 2.416 ± 0.404 | 1.184 ± 0.254 | 3.647 ± 0.395 | 0.426 ± 0.145 | 1.421 ± 0.275 | 1.847 ± 0.307 | 2.747 ± 0.375 | 0.853 ± 0.213 | 1.231 ± 0.235 | 2.605 ± 0.397 | 1.468 ± 0.255 | 3.079 ± 0.425 | 2.179 ± 0.307 | 1.895 ± 0.338 | 2.51 ± 0.316 | 0.758 ± 0.162 | 0.805 ± 0.232 | 0.0 ± 0.0 |
| Pro | 4.358 ± 0.566 | 0.237 ± 0.108 | 3.505 ± 0.412 | 4.358 ± 0.566 | 1.326 ± 0.221 | 4.736 ± 0.642 | 1.374 ± 0.281 | 2.558 ± 0.369 | 2.747 ± 0.485 | 2.605 ± 0.27 | 0.947 ± 0.222 | 1.752 ± 0.329 | 2.558 ± 0.446 | 2.131 ± 0.376 | 1.895 ± 0.341 | 2.652 ± 0.438 | 3.173 ± 0.499 | 2.842 ± 0.392 | 0.9 ± 0.231 | 1.421 ± 0.332 | 0.0 ± 0.0 |
| Gln | 3.979 ± 0.513 | 0.095 ± 0.068 | 2.463 ± 0.37 | 3.031 ± 0.443 | 1.847 ± 0.325 | 5.115 ± 1.251 | 0.568 ± 0.188 | 1.989 ± 0.246 | 1.989 ± 0.331 | 3.505 ± 0.475 | 1.089 ± 0.206 | 1.468 ± 0.229 | 1.421 ± 0.285 | 2.037 ± 0.403 | 2.7 ± 0.492 | 1.516 ± 0.292 | 2.416 ± 0.336 | 2.652 ± 0.334 | 0.663 ± 0.218 | 2.037 ± 0.337 | 0.0 ± 0.0 |
| Arg | 4.594 ± 0.424 | 0.426 ± 0.17 | 4.405 ± 0.46 | 4.736 ± 0.517 | 2.273 ± 0.338 | 4.594 ± 0.482 | 1.326 ± 0.28 | 2.747 ± 0.321 | 5.021 ± 0.622 | 4.31 ± 0.389 | 1.752 ± 0.227 | 2.416 ± 0.315 | 2.273 ± 0.355 | 3.505 ± 0.444 | 4.263 ± 0.631 | 3.505 ± 0.511 | 3.173 ± 0.354 | 4.689 ± 0.491 | 1.089 ± 0.245 | 1.705 ± 0.345 | 0.0 ± 0.0 |
| Ser | 4.405 ± 0.577 | 0.616 ± 0.195 | 3.884 ± 0.461 | 3.884 ± 0.427 | 1.8 ± 0.275 | 5.873 ± 0.557 | 1.279 ± 0.275 | 2.7 ± 0.385 | 2.51 ± 0.377 | 3.931 ± 0.363 | 1.374 ± 0.249 | 1.942 ± 0.298 | 1.847 ± 0.266 | 2.037 ± 0.324 | 3.694 ± 0.396 | 3.363 ± 0.54 | 3.079 ± 0.379 | 4.026 ± 0.404 | 0.995 ± 0.196 | 1.847 ± 0.363 | 0.0 ± 0.0 |
| Thr | 4.358 ± 0.468 | 0.332 ± 0.126 | 2.794 ± 0.384 | 3.505 ± 0.526 | 2.179 ± 0.292 | 5.636 ± 0.542 | 1.042 ± 0.236 | 2.984 ± 0.387 | 2.889 ± 0.297 | 3.884 ± 0.36 | 0.9 ± 0.19 | 1.658 ± 0.285 | 3.694 ± 0.427 | 1.8 ± 0.266 | 2.7 ± 0.341 | 3.268 ± 0.402 | 3.079 ± 0.346 | 4.073 ± 0.397 | 0.9 ± 0.23 | 1.374 ± 0.269 | 0.0 ± 0.0 |
| Val | 5.968 ± 0.735 | 0.379 ± 0.144 | 4.263 ± 0.361 | 5.021 ± 0.386 | 1.989 ± 0.281 | 5.115 ± 0.464 | 1.516 ± 0.257 | 3.6 ± 0.382 | 3.931 ± 0.454 | 4.736 ± 0.421 | 1.752 ± 0.291 | 2.321 ± 0.368 | 3.694 ± 0.47 | 3.363 ± 0.399 | 4.168 ± 0.41 | 3.742 ± 0.435 | 3.884 ± 0.447 | 4.879 ± 0.479 | 1.089 ± 0.242 | 2.037 ± 0.309 | 0.0 ± 0.0 |
| Trp | 1.563 ± 0.311 | 0.095 ± 0.067 | 1.752 ± 0.258 | 1.279 ± 0.25 | 0.616 ± 0.143 | 1.089 ± 0.246 | 0.521 ± 0.15 | 0.758 ± 0.186 | 0.9 ± 0.195 | 1.231 ± 0.285 | 0.426 ± 0.139 | 1.042 ± 0.235 | 1.089 ± 0.275 | 0.71 ± 0.247 | 1.137 ± 0.263 | 1.042 ± 0.206 | 1.184 ± 0.19 | 1.184 ± 0.288 | 0.284 ± 0.107 | 0.568 ± 0.15 | 0.0 ± 0.0 |
| Tyr | 2.084 ± 0.406 | 0.332 ± 0.146 | 2.842 ± 0.373 | 2.273 ± 0.407 | 0.71 ± 0.157 | 2.937 ± 0.394 | 0.947 ± 0.221 | 0.853 ± 0.232 | 0.947 ± 0.204 | 2.179 ± 0.307 | 0.379 ± 0.138 | 1.184 ± 0.263 | 1.61 ± 0.333 | 1.326 ± 0.282 | 2.131 ± 0.354 | 1.421 ± 0.254 | 1.421 ± 0.256 | 2.273 ± 0.405 | 0.663 ± 0.206 | 0.71 ± 0.157 | 0.0 ± 0.0 |
| Xaa | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 |

Statistics based on 97 proteins (21114 amino acids)

Note: The error has been estimated by bootstrapping (100 replicates) at the protein level.
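
A protein-level bootstrap of this kind could look like the following Python sketch. It assumes that whole proteins are resampled with replacement, that the frequency is recomputed for each replicate, and that the reported error is the standard deviation across replicates. The exact procedure used by the database is not stated here, and all function names are hypothetical.

```python
import random
import statistics


def pair_per_mille(proteins, pair):
    """Frequency of one dipeptide in per mille, counting overlapping pairs per protein."""
    hits = 0
    total = 0
    for seq in proteins:
        for i in range(len(seq) - 1):
            total += 1
            if seq[i:i + 2] == pair:
                hits += 1
    return 1000.0 * hits / total if total else 0.0


def bootstrap_error(proteins, pair, n_replicates=100, seed=0):
    """Standard deviation of the frequency over protein-level bootstrap replicates."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    estimates = []
    for _ in range(n_replicates):
        # Resample whole proteins with replacement, keeping the proteome size.
        sample = rng.choices(proteins, k=len(proteins))
        estimates.append(pair_per_mille(sample, pair))
    return statistics.stdev(estimates)


# Toy proteome with one-letter codes (hypothetical sequences).
toy = ["MAAK", "MAG", "MKKA", "MAAAL"]
print(pair_per_mille(toy, "AA"), bootstrap_error(toy, "AA"))
```

Resampling proteins rather than individual residues keeps each sequence intact, so the spread reflects how much the estimate depends on which proteins happen to be in the proteome.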

The dipeptide statistics above (along with other statistics for this proteome) can be downloaded from this CSV file
See this proteome in: uniprot_link
Proteome-pI is available under Creative Commons Attribution-NoDerivs license, for more details see here

Reference: Kozlowski LP. Proteome-pI 2.0: Proteome Isoelectric Point Database Update. Nucleic Acids Res. 2021, doi: 10.1093/nar/gkab944
Contact: Lukasz P. Kozlowski