Amino acid dipeptide frequency for Gordonia phage CarolAnn

Apart from single amino acid frequencies, one can also calculate the so-called amino acid dipeptide frequency.
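As a rough illustration of what a dipeptide frequency is, the sketch below counts overlapping pairs of adjacent residues in a set of protein sequences and reports them in per mille. The function name, the one-letter residue codes, the toy sequences, and the choice to normalize by the total number of counted pairs are all assumptions made for this example; the database may count or normalize differently.

```python
from collections import Counter


def dipeptide_frequencies(proteins):
    """Return {dipeptide: frequency in per mille} for a list of protein strings.

    Hypothetical helper: pairs are overlapping and never span two proteins,
    and frequencies are normalized by the total number of pairs counted.
    """
    counts = Counter()
    for seq in proteins:
        # Slide a window of width 2 along each protein sequence.
        for i in range(len(seq) - 1):
            counts[seq[i:i + 2]] += 1
    total = sum(counts.values())
    if total == 0:
        return {}
    return {pair: 1000.0 * n / total for pair, n in counts.items()}


# Toy input for illustration only, not taken from the CarolAnn proteome.
toy = ["MAAG", "AAK"]
freqs = dipeptide_frequencies(toy)
print(freqs["AA"])  # share of the pair Ala-Ala among all counted pairs, in per mille
```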

There are 400 possible dipeptides (441 if Xaa is counted as an additional 21st amino acid representing all non-standard or unknown amino acids). Thus, if dipeptides occurred randomly in proteins, each amino acid dipeptide would be present at a frequency of 0.25% (2.5‰). As this is not the case in nature, for better visibility all dipeptides that are more frequent than expected are marked in red in the table, and those that are underrepresented are marked in blue.

All values are given in per mille (‰) and therefore need to be multiplied by 10^-3 to obtain fractions.
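The arithmetic behind the expected frequency and the per mille unit can be sketched as follows. The helper names are hypothetical, and the rule of comparing each value directly against the uniform expectation is an assumption about how over- and underrepresentation might be decided; the exact criterion used for the colouring is not stated here.

```python
def expected_per_mille(alphabet_size=20):
    """Uniform expectation for a single dipeptide, in per mille."""
    return 1000.0 / alphabet_size ** 2


def per_mille_to_fraction(value):
    """Convert a per mille value to a plain fraction (multiply by 10^-3)."""
    return value * 1e-3


def classify(value, expected):
    """Assumed rule: compare a per mille value with the uniform expectation."""
    if value > expected:
        return "over"
    if value < expected:
        return "under"
    return "as expected"


expected = expected_per_mille()          # 20 x 20 = 400 dipeptides
expected_with_xaa = expected_per_mille(21)  # 21 x 21 = 441 dipeptides

# Two values taken from the table: AlaAla and CysCys.
print(classify(16.76, expected))
print(classify(0.228, expected))
```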

For more information, see the sequence space article on Wikipedia.

Ala Cys Asp Glu Phe Gly His Ile Lys Leu Met Asn Pro Gln Arg Ser Thr Val Trp Tyr Xaa
Ala
AlaAla: 16.76 ± 1.415
AlaCys: 0.627 ± 0.203
AlaAsp: 7.981 ± 0.705
AlaGlu: 7.411 ± 0.712
AlaPhe: 3.42 ± 0.589
AlaGly: 9.634 ± 1.075
AlaHis: 2.052 ± 0.356
AlaIle: 5.245 ± 0.446
AlaLys: 3.42 ± 0.644
AlaLeu: 10.261 ± 0.832
AlaMet: 2.907 ± 0.52
AlaAsn: 2.793 ± 0.499
AlaPro: 5.359 ± 0.662
AlaGln: 5.131 ± 0.473
AlaArg: 8.323 ± 0.837
AlaSer: 5.929 ± 0.621
AlaThr: 6.841 ± 0.604
AlaVal: 8.152 ± 0.801
AlaTrp: 1.938 ± 0.303
AlaTyr: 1.71 ± 0.277
AlaXaa: 0.0 ± 0.0
Cys
CysAla: 0.684 ± 0.235
CysCys: 0.228 ± 0.144
CysAsp: 1.026 ± 0.309
CysGlu: 0.342 ± 0.143
CysPhe: 0.0 ± 0.0
CysGly: 1.026 ± 0.278
CysHis: 0.228 ± 0.122
CysIle: 0.057 ± 0.055
CysLys: 0.114 ± 0.082
CysLeu: 0.57 ± 0.183
CysMet: 0.114 ± 0.094
CysAsn: 0.285 ± 0.116
CysPro: 0.798 ± 0.239
CysGln: 0.342 ± 0.116
CysArg: 0.798 ± 0.223
CysSer: 0.513 ± 0.211
CysThr: 0.627 ± 0.2
CysVal: 0.228 ± 0.129
CysTrp: 0.285 ± 0.175
CysTyr: 0.114 ± 0.082
CysXaa: 0.0 ± 0.0
Asp
AspAla: 7.924 ± 0.709
AspCys: 0.342 ± 0.133
AspAsp: 5.359 ± 0.639
AspGlu: 4.389 ± 0.654
AspPhe: 1.824 ± 0.32
AspGly: 6.328 ± 0.746
AspHis: 1.653 ± 0.401
AspIle: 2.964 ± 0.436
AspLys: 1.14 ± 0.256
AspLeu: 5.929 ± 0.738
AspMet: 1.026 ± 0.246
AspAsn: 1.881 ± 0.359
AspPro: 4.218 ± 0.522
AspGln: 3.021 ± 0.527
AspArg: 5.131 ± 0.695
AspSer: 3.876 ± 0.425
AspThr: 4.218 ± 0.482
AspVal: 5.188 ± 0.574
AspTrp: 1.026 ± 0.238
AspTyr: 1.938 ± 0.353
AspXaa: 0.0 ± 0.0
Glu
GluAla: 6.727 ± 0.771
GluCys: 0.399 ± 0.171
GluAsp: 2.451 ± 0.321
GluGlu: 3.078 ± 0.597
GluPhe: 1.596 ± 0.306
GluGly: 3.99 ± 0.469
GluHis: 1.653 ± 0.301
GluIle: 2.508 ± 0.374
GluLys: 2.223 ± 0.377
GluLeu: 4.789 ± 0.787
GluMet: 1.14 ± 0.259
GluAsn: 1.197 ± 0.271
GluPro: 3.591 ± 0.708
GluGln: 2.394 ± 0.333
GluArg: 4.732 ± 0.633
GluSer: 2.964 ± 0.357
GluThr: 2.85 ± 0.425
GluVal: 5.473 ± 0.556
GluTrp: 1.14 ± 0.324
GluTyr: 1.596 ± 0.286
GluXaa: 0.0 ± 0.0
Phe
PheAla: 3.477 ± 0.433
PheCys: 0.399 ± 0.155
PheAsp: 2.052 ± 0.295
PheGlu: 1.653 ± 0.361
PhePhe: 1.197 ± 0.307
PheGly: 2.451 ± 0.341
PheHis: 0.57 ± 0.229
PheIle: 0.912 ± 0.269
PheLys: 0.798 ± 0.306
PheLeu: 1.653 ± 0.337
PheMet: 0.513 ± 0.168
PheAsn: 1.026 ± 0.238
PhePro: 1.539 ± 0.304
PheGln: 0.684 ± 0.17
PheArg: 2.109 ± 0.305
PheSer: 1.368 ± 0.306
PheThr: 2.394 ± 0.383
PheVal: 2.451 ± 0.362
PheTrp: 0.399 ± 0.151
PheTyr: 0.57 ± 0.189
PheXaa: 0.0 ± 0.0
Gly
GlyAla: 9.064 ± 0.988
GlyCys: 0.513 ± 0.19
GlyAsp: 5.815 ± 0.519
GlyGlu: 4.903 ± 0.569
GlyPhe: 2.679 ± 0.477
GlyGly: 6.841 ± 0.866
GlyHis: 1.767 ± 0.304
GlyIle: 3.42 ± 0.504
GlyLys: 3.078 ± 0.449
GlyLeu: 7.639 ± 1.051
GlyMet: 1.824 ± 0.354
GlyAsn: 2.508 ± 0.396
GlyPro: 3.99 ± 0.528
GlyGln: 3.249 ± 0.409
GlyArg: 6.67 ± 0.52
GlySer: 4.275 ± 0.414
GlyThr: 5.131 ± 0.705
GlyVal: 6.214 ± 0.634
GlyTrp: 1.995 ± 0.336
GlyTyr: 2.451 ± 0.298
GlyXaa: 0.0 ± 0.0
His
HisAla: 2.166 ± 0.315
HisCys: 0.114 ± 0.075
HisAsp: 1.539 ± 0.279
HisGlu: 0.969 ± 0.243
HisPhe: 0.684 ± 0.222
HisGly: 1.995 ± 0.364
HisHis: 0.513 ± 0.204
HisIle: 0.855 ± 0.194
HisLys: 0.456 ± 0.185
HisLeu: 1.938 ± 0.343
HisMet: 0.285 ± 0.113
HisAsn: 0.456 ± 0.151
HisPro: 1.938 ± 0.367
HisGln: 0.684 ± 0.179
HisArg: 1.881 ± 0.335
HisSer: 1.311 ± 0.267
HisThr: 1.653 ± 0.31
HisVal: 1.482 ± 0.295
HisTrp: 0.627 ± 0.177
HisTyr: 0.513 ± 0.157
HisXaa: 0.0 ± 0.0
Ile
IleAla: 5.929 ± 0.567
IleCys: 0.171 ± 0.096
IleAsp: 4.275 ± 0.495
IleGlu: 2.736 ± 0.437
IlePhe: 0.912 ± 0.237
IleGly: 5.245 ± 0.626
IleHis: 0.912 ± 0.218
IleIle: 1.596 ± 0.368
IleLys: 1.824 ± 0.513
IleLeu: 1.938 ± 0.332
IleMet: 0.513 ± 0.153
IleAsn: 1.254 ± 0.329
IlePro: 2.907 ± 0.357
IleGln: 1.026 ± 0.221
IleArg: 3.477 ± 0.538
IleSer: 1.71 ± 0.261
IleThr: 2.907 ± 0.446
IleVal: 3.876 ± 0.448
IleTrp: 0.342 ± 0.138
IleTyr: 0.969 ± 0.249
IleXaa: 0.0 ± 0.0
Lys
LysAla: 3.078 ± 0.403
LysCys: 0.114 ± 0.074
LysAsp: 1.938 ± 0.394
LysGlu: 1.083 ± 0.216
LysPhe: 0.969 ± 0.237
LysGly: 2.964 ± 0.5
LysHis: 0.627 ± 0.203
LysIle: 1.767 ± 0.35
LysLys: 1.71 ± 0.257
LysLeu: 2.793 ± 0.36
LysMet: 0.456 ± 0.156
LysAsn: 1.026 ± 0.272
LysPro: 2.223 ± 0.382
LysGln: 1.026 ± 0.322
LysArg: 2.052 ± 0.325
LysSer: 1.995 ± 0.343
LysThr: 2.736 ± 0.413
LysVal: 2.28 ± 0.367
LysTrp: 0.798 ± 0.209
LysTyr: 0.741 ± 0.241
LysXaa: 0.0 ± 0.0
Leu
LeuAla: 10.318 ± 0.895
LeuCys: 0.855 ± 0.247
LeuAsp: 5.473 ± 0.658
LeuGlu: 3.249 ± 0.378
LeuPhe: 2.052 ± 0.289
LeuGly: 6.67 ± 0.655
LeuHis: 1.425 ± 0.365
LeuIle: 3.42 ± 0.548
LeuLys: 2.052 ± 0.383
LeuLeu: 5.188 ± 0.596
LeuMet: 1.596 ± 0.301
LeuAsn: 1.938 ± 0.346
LeuPro: 4.903 ± 0.471
LeuGln: 1.71 ± 0.309
LeuArg: 5.302 ± 0.538
LeuSer: 4.674 ± 0.553
LeuThr: 6.328 ± 0.597
LeuVal: 6.442 ± 0.544
LeuTrp: 2.109 ± 0.316
LeuTyr: 1.425 ± 0.239
LeuXaa: 0.0 ± 0.0
Met
MetAla: 3.249 ± 0.59
MetCys: 0.228 ± 0.106
MetAsp: 0.399 ± 0.152
MetGlu: 0.912 ± 0.214
MetPhe: 0.513 ± 0.169
MetGly: 1.596 ± 0.352
MetHis: 0.342 ± 0.127
MetIle: 0.798 ± 0.196
MetLys: 0.627 ± 0.157
MetLeu: 1.539 ± 0.262
MetMet: 0.285 ± 0.122
MetAsn: 0.399 ± 0.154
MetPro: 1.653 ± 0.303
MetGln: 0.684 ± 0.243
MetArg: 2.451 ± 0.538
MetSer: 1.653 ± 0.29
MetThr: 2.394 ± 0.354
MetVal: 0.798 ± 0.269
MetTrp: 0.627 ± 0.2
MetTyr: 0.285 ± 0.122
MetXaa: 0.0 ± 0.0
Asn
AsnAla: 2.793 ± 0.366
AsnCys: 0.228 ± 0.117
AsnAsp: 1.482 ± 0.243
AsnGlu: 1.14 ± 0.243
AsnPhe: 0.456 ± 0.15
AsnGly: 2.679 ± 0.415
AsnHis: 0.912 ± 0.265
AsnIle: 1.14 ± 0.251
AsnLys: 0.798 ± 0.24
AsnLeu: 1.596 ± 0.385
AsnMet: 0.684 ± 0.193
AsnAsn: 0.912 ± 0.253
AsnPro: 3.021 ± 0.374
AsnGln: 0.855 ± 0.2
AsnArg: 1.881 ± 0.338
AsnSer: 2.451 ± 0.402
AsnThr: 1.938 ± 0.39
AsnVal: 1.881 ± 0.415
AsnTrp: 0.342 ± 0.13
AsnTyr: 0.798 ± 0.24
AsnXaa: 0.0 ± 0.0
Pro
ProAla: 5.986 ± 0.657
ProCys: 0.798 ± 0.261
ProAsp: 4.617 ± 0.572
ProGlu: 3.876 ± 0.548
ProPhe: 1.938 ± 0.347
ProGly: 5.017 ± 0.617
ProHis: 1.482 ± 0.315
ProIle: 2.793 ± 0.413
ProLys: 2.907 ± 0.381
ProLeu: 3.249 ± 0.446
ProMet: 1.539 ± 0.433
ProAsn: 1.653 ± 0.278
ProPro: 3.591 ± 0.569
ProGln: 1.938 ± 0.324
ProArg: 2.85 ± 0.463
ProSer: 3.135 ± 0.395
ProThr: 4.332 ± 0.585
ProVal: 4.218 ± 0.562
ProTrp: 1.368 ± 0.27
ProTyr: 1.026 ± 0.246
ProXaa: 0.0 ± 0.0
Gln
GlnAla: 3.819 ± 0.463
GlnCys: 0.114 ± 0.074
GlnAsp: 1.368 ± 0.287
GlnGlu: 1.368 ± 0.326
GlnPhe: 1.14 ± 0.277
GlnGly: 2.337 ± 0.432
GlnHis: 1.197 ± 0.315
GlnIle: 1.71 ± 0.346
GlnLys: 1.254 ± 0.235
GlnLeu: 3.021 ± 0.391
GlnMet: 0.969 ± 0.236
GlnAsn: 0.741 ± 0.187
GlnPro: 2.166 ± 0.442
GlnGln: 1.368 ± 0.257
GlnArg: 2.622 ± 0.345
GlnSer: 2.052 ± 0.321
GlnThr: 1.881 ± 0.34
GlnVal: 3.363 ± 0.403
GlnTrp: 0.855 ± 0.23
GlnTyr: 0.912 ± 0.227
GlnXaa: 0.0 ± 0.0
Arg
ArgAla: 7.924 ± 0.716
ArgCys: 0.912 ± 0.3
ArgAsp: 6.385 ± 0.6
ArgGlu: 4.446 ± 0.503
ArgPhe: 1.596 ± 0.316
ArgGly: 6.214 ± 0.63
ArgHis: 2.109 ± 0.33
ArgIle: 3.933 ± 0.409
ArgLys: 2.451 ± 0.364
ArgLeu: 5.872 ± 0.56
ArgMet: 2.508 ± 0.377
ArgAsn: 2.736 ± 0.338
ArgPro: 3.021 ± 0.496
ArgGln: 2.565 ± 0.363
ArgArg: 7.411 ± 0.899
ArgSer: 3.648 ± 0.335
ArgThr: 4.503 ± 0.529
ArgVal: 5.359 ± 0.523
ArgTrp: 1.539 ± 0.318
ArgTyr: 1.71 ± 0.265
ArgXaa: 0.0 ± 0.0
Ser
SerAla: 5.644 ± 0.686
SerCys: 0.228 ± 0.127
SerAsp: 3.306 ± 0.454
SerGlu: 3.306 ± 0.391
SerPhe: 2.052 ± 0.301
SerGly: 5.245 ± 0.602
SerHis: 1.026 ± 0.259
SerIle: 3.078 ± 0.41
SerLys: 1.482 ± 0.275
SerLeu: 3.42 ± 0.373
SerMet: 1.083 ± 0.218
SerAsn: 1.938 ± 0.379
SerPro: 2.964 ± 0.367
SerGln: 1.083 ± 0.305
SerArg: 3.933 ± 0.608
SerSer: 2.85 ± 0.45
SerThr: 4.104 ± 0.509
SerVal: 4.104 ± 0.499
SerTrp: 1.425 ± 0.251
SerTyr: 0.57 ± 0.173
SerXaa: 0.0 ± 0.0
Thr
ThrAla: 8.266 ± 0.806
ThrCys: 0.684 ± 0.204
ThrAsp: 4.903 ± 0.609
ThrGlu: 4.674 ± 0.572
ThrPhe: 1.938 ± 0.407
ThrGly: 5.815 ± 0.74
ThrHis: 1.197 ± 0.234
ThrIle: 3.192 ± 0.488
ThrLys: 2.337 ± 0.406
ThrLeu: 5.701 ± 0.513
ThrMet: 1.14 ± 0.272
ThrAsn: 1.653 ± 0.351
ThrPro: 4.047 ± 0.429
ThrGln: 2.622 ± 0.387
ThrArg: 4.56 ± 0.505
ThrSer: 3.021 ± 0.413
ThrThr: 4.674 ± 0.526
ThrVal: 6.499 ± 0.704
ThrTrp: 1.083 ± 0.289
ThrTyr: 1.254 ± 0.298
ThrXaa: 0.0 ± 0.0
Val
ValAla: 8.323 ± 0.692
ValCys: 1.026 ± 0.288
ValAsp: 6.442 ± 0.632
ValGlu: 4.903 ± 0.577
ValPhe: 1.938 ± 0.363
ValGly: 5.359 ± 0.582
ValHis: 1.197 ± 0.292
ValIle: 3.42 ± 0.434
ValLys: 2.451 ± 0.379
ValLeu: 6.442 ± 0.78
ValMet: 1.824 ± 0.39
ValAsn: 1.881 ± 0.386
ValPro: 3.99 ± 0.537
ValGln: 2.508 ± 0.424
ValArg: 6.727 ± 0.672
ValSer: 3.363 ± 0.452
ValThr: 6.67 ± 0.726
ValVal: 7.81 ± 0.673
ValTrp: 1.368 ± 0.37
ValTyr: 1.539 ± 0.282
ValXaa: 0.0 ± 0.0
Trp
TrpAla: 1.71 ± 0.344
TrpCys: 0.171 ± 0.105
TrpAsp: 1.539 ± 0.334
TrpGlu: 0.627 ± 0.191
TrpPhe: 0.684 ± 0.194
TrpGly: 0.969 ± 0.291
TrpHis: 0.399 ± 0.153
TrpIle: 0.684 ± 0.193
TrpLys: 0.513 ± 0.168
TrpLeu: 2.166 ± 0.265
TrpMet: 0.456 ± 0.161
TrpAsn: 0.798 ± 0.345
TrpPro: 1.482 ± 0.307
TrpGln: 0.798 ± 0.227
TrpArg: 2.109 ± 0.314
TrpSer: 1.026 ± 0.291
TrpThr: 1.311 ± 0.229
TrpVal: 1.767 ± 0.271
TrpTrp: 0.399 ± 0.126
TrpTyr: 0.57 ± 0.176
TrpXaa: 0.0 ± 0.0
Tyr
TyrAla: 2.052 ± 0.327
TyrCys: 0.228 ± 0.108
TyrAsp: 1.026 ± 0.208
TyrGlu: 1.425 ± 0.335
TyrPhe: 0.627 ± 0.23
TyrGly: 1.653 ± 0.296
TyrHis: 0.798 ± 0.222
TyrIle: 1.026 ± 0.263
TyrLys: 0.741 ± 0.215
TyrLeu: 1.425 ± 0.295
TyrMet: 0.513 ± 0.156
TyrAsn: 0.912 ± 0.196
TyrPro: 0.969 ± 0.255
TyrGln: 0.57 ± 0.227
TyrArg: 1.824 ± 0.346
TyrSer: 1.026 ± 0.238
TyrThr: 1.767 ± 0.336
TyrVal: 1.596 ± 0.339
TyrTrp: 0.513 ± 0.157
TyrTyr: 0.513 ± 0.192
TyrXaa: 0.0 ± 0.0
Xaa
XaaAla: 0.0 ± 0.0
XaaCys: 0.0 ± 0.0
XaaAsp: 0.0 ± 0.0
XaaGlu: 0.0 ± 0.0
XaaPhe: 0.0 ± 0.0
XaaGly: 0.0 ± 0.0
XaaHis: 0.0 ± 0.0
XaaIle: 0.0 ± 0.0
XaaLys: 0.0 ± 0.0
XaaLeu: 0.0 ± 0.0
XaaMet: 0.0 ± 0.0
XaaAsn: 0.0 ± 0.0
XaaPro: 0.0 ± 0.0
XaaGln: 0.0 ± 0.0
XaaArg: 0.0 ± 0.0
XaaSer: 0.0 ± 0.0
XaaThr: 0.0 ± 0.0
XaaVal: 0.0 ± 0.0
XaaTrp: 0.0 ± 0.0
XaaTyr: 0.0 ± 0.0
XaaXaa: 0.0 ± 0.0
Statistics based on 80 proteins (17543 amino acids)
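
Each table entry above follows the pattern "AlaAla: 16.76 ± 1.415", in some renderings preceded by the displayed value. For readers who want to work with the listing directly, a minimal parsing sketch is given below. The function name and the regular expression are assumptions for this example and are not part of Proteome-pI.

```python
import re

# Two three-letter residue codes, a colon, the value and its error (both in per mille).
CELL_PATTERN = re.compile(
    r"([A-Z][a-z]{2})([A-Z][a-z]{2}):\s*(\d+(?:\.\d+)?)\s*±\s*(\d+(?:\.\d+)?)"
)


def parse_cell(line):
    """Return (first, second, value, error) for a table entry, or None.

    Hypothetical helper: `search` is used so that any text before the
    dipeptide label, such as a repeated display value, is skipped.
    """
    match = CELL_PATTERN.search(line)
    if match is None:
        return None
    first, second, value, error = match.groups()
    return first, second, float(value), float(error)


print(parse_cell("AlaAla: 16.76 ± 1.415"))
print(parse_cell("Ala"))  # row labels carry no value, so this yields None
```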

Note: The error was estimated by bootstrapping (100 replicates) at the protein level.
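
One way to picture bootstrapping at the protein level is sketched below: whole proteins are resampled with replacement, the dipeptide frequency is recomputed for each resample, and the spread of those estimates is taken as the error. This is only an illustration under assumptions. The function names, the toy sequences, the use of the sample standard deviation, and the normalization by total pair count are not confirmed by the source, and the actual Proteome-pI procedure may differ in its details.

```python
import random
import statistics
from collections import Counter


def pair_per_mille(proteins, pair):
    """Frequency of one dipeptide (one-letter codes) in per mille."""
    counts = Counter()
    for seq in proteins:
        for i in range(len(seq) - 1):
            counts[seq[i:i + 2]] += 1
    total = sum(counts.values())
    return 1000.0 * counts[pair] / total if total else 0.0


def bootstrap_error(proteins, pair, replicates=100, seed=0):
    """Standard deviation of the frequency over protein-level resamples."""
    rng = random.Random(seed)  # fixed seed keeps the sketch reproducible
    estimates = []
    for _ in range(replicates):
        # Draw as many proteins as in the original set, with replacement.
        sample = rng.choices(proteins, k=len(proteins))
        estimates.append(pair_per_mille(sample, pair))
    return statistics.stdev(estimates)


# Toy sequences for illustration only, not taken from the CarolAnn proteome.
toy = ["MAAG", "AAK", "MKKA", "GAAA"]
print(bootstrap_error(toy, "AA"))
```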

The dipeptide statistics above (among other statistics for this proteome) can be downloaded from this CSV file
See this proteome in: uniprot_link
Proteome-pI is available under a Creative Commons Attribution-NoDerivs license; for more details see here

Reference: Kozlowski LP. Proteome-pI 2.0: Proteome Isoelectric Point Database Update. Nucleic Acids Res. 2021, doi: 10.1093/nar/gkab944
Contact: Lukasz P. Kozlowski