Amino acid dipeptide frequency for Streptomyces phage Nanodon

Apart from single amino acid frequencies, one can also calculate the so-called amino acid dipeptide frequency, that is, how often each ordered pair of adjacent amino acids occurs.

There are 400 possible dipeptides (441 if Xaa is counted as an additional, 21st amino acid representing all non-standard or unknown amino acids). Thus, if dipeptides occurred randomly and with equal probability in proteins, each of the 400 dipeptides would be present with a frequency of 0.25% (2.5‰). As this is not the case in nature, for better visibility all dipeptides that are more frequent than expected are marked in red, and those that are underrepresented are marked in blue in the table.
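The counting described above can be sketched in Python. This is a minimal illustration, not the code used by Proteome-pI: the function names `dipeptide_per_mille` and `classify`, the toy sequences, and the choice to normalise by the total number of overlapping residue pairs are all assumptions made for the example (the database may normalise differently).

```python
from collections import Counter
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # 20 standard residues, one-letter codes
ALPHABET = AMINO_ACIDS + "X"          # X stands for Xaa (non-standard/unknown)


def dipeptide_per_mille(proteins):
    """Return {dipeptide: frequency in per mille} for all 441 ordered pairs.

    Assumption: overlapping pairs are counted within each protein and the
    counts are divided by the total number of pairs in the proteome.
    """
    counts = Counter()
    for seq in proteins:
        # Map every symbol outside the 20 standard residues to X.
        seq = "".join(c if c in AMINO_ACIDS else "X" for c in seq.upper())
        counts.update(seq[i:i + 2] for i in range(len(seq) - 1))
    total = sum(counts.values())
    return {
        a + b: (1000.0 * counts[a + b] / total if total else 0.0)
        for a, b in product(ALPHABET, repeat=2)
    }


def classify(freq_per_mille, expected=1000.0 / 400):
    """Compare a frequency with the uniform expectation of 2.5 per mille."""
    if freq_per_mille > expected:
        return "over"   # would be marked red in the table
    if freq_per_mille < expected:
        return "under"  # would be marked blue in the table
    return "as expected"


if __name__ == "__main__":
    toy_proteome = ["AAAC", "ACA"]  # hypothetical sequences, not phage data
    freqs = dipeptide_per_mille(toy_proteome)
    print(freqs["AA"], classify(freqs["AA"]))
    # Values taken from the table: AlaAla and AlaCys.
    print(classify(11.665), classify(0.778))
```

With the table values, AlaAla (11.665‰) falls above the 2.5‰ uniform expectation and AlaCys (0.778‰) falls below it.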

All values are given in per mille (‰) and therefore need to be multiplied by 10⁻³ to obtain fractions.
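As a small worked example of the unit conversion, the sketch below turns a per mille value from the table into a fraction and a percentage, and compares it with the uniform expectation. The helper names are hypothetical and introduced only for illustration.

```python
def per_mille_to_fraction(value):
    """Convert a per mille value to a plain fraction (multiply by 10^-3)."""
    return value * 1e-3


def per_mille_to_percent(value):
    """Convert a per mille value to a percentage (divide by 10)."""
    return value / 10.0


UNIFORM_EXPECTATION = 1000.0 / 400  # 2.5 per mille, i.e. 0.25 %

if __name__ == "__main__":
    ala_ala = 11.665  # AlaAla value from the table, in per mille
    print(round(per_mille_to_fraction(ala_ala), 6))
    print(round(per_mille_to_percent(ala_ala), 4))
    # Ratio of the observed value to the uniform expectation.
    print(round(ala_ala / UNIFORM_EXPECTATION, 3))
```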

For more information, see the sequence space article on Wikipedia.

Columns (second amino acid): Ala Cys Asp Glu Phe Gly His Ile Lys Leu Met Asn Pro Gln Arg Ser Thr Val Trp Tyr Xaa
Ala
11.665AlaAla: 11.665 ± 1.026
0.778AlaCys: 0.778 ± 0.263
6.221AlaAsp: 6.221 ± 0.679
8.684AlaGlu: 8.684 ± 0.799
2.787AlaPhe: 2.787 ± 0.406
7.258AlaGly: 7.258 ± 0.751
1.815AlaHis: 1.815 ± 0.425
5.184AlaIle: 5.184 ± 0.466
4.99AlaLys: 4.99 ± 0.729
10.304AlaLeu: 10.304 ± 1.129
3.175AlaMet: 3.175 ± 0.439
2.981AlaAsn: 2.981 ± 0.46
4.536AlaPro: 4.536 ± 0.612
3.435AlaGln: 3.435 ± 0.448
6.61AlaArg: 6.61 ± 0.675
5.184AlaSer: 5.184 ± 0.697
5.897AlaThr: 5.897 ± 0.872
9.137AlaVal: 9.137 ± 0.612
2.203AlaTrp: 2.203 ± 0.367
2.722AlaTyr: 2.722 ± 0.424
0.0AlaXaa: 0.0 ± 0.0
Cys
0.713CysAla: 0.713 ± 0.226
0.13CysCys: 0.13 ± 0.09
0.259CysAsp: 0.259 ± 0.133
0.583CysGlu: 0.583 ± 0.203
0.13CysPhe: 0.13 ± 0.102
0.648CysGly: 0.648 ± 0.229
0.194CysHis: 0.194 ± 0.123
0.259CysIle: 0.259 ± 0.135
0.389CysLys: 0.389 ± 0.148
0.518CysLeu: 0.518 ± 0.213
0.13CysMet: 0.13 ± 0.083
0.259CysAsn: 0.259 ± 0.135
0.648CysPro: 0.648 ± 0.213
0.065CysGln: 0.065 ± 0.078
0.454CysArg: 0.454 ± 0.215
0.648CysSer: 0.648 ± 0.235
0.324CysThr: 0.324 ± 0.124
0.389CysVal: 0.389 ± 0.153
0.13CysTrp: 0.13 ± 0.097
0.518CysTyr: 0.518 ± 0.189
0.0CysXaa: 0.0 ± 0.0
Asp
6.74AspAla: 6.74 ± 0.683
0.324AspCys: 0.324 ± 0.158
3.564AspAsp: 3.564 ± 0.56
5.055AspGlu: 5.055 ± 0.695
2.139AspPhe: 2.139 ± 0.396
6.545AspGly: 6.545 ± 0.694
1.426AspHis: 1.426 ± 0.387
2.722AspIle: 2.722 ± 0.496
2.203AspLys: 2.203 ± 0.35
6.221AspLeu: 6.221 ± 0.828
1.555AspMet: 1.555 ± 0.281
1.555AspAsn: 1.555 ± 0.261
3.888AspPro: 3.888 ± 0.512
1.944AspGln: 1.944 ± 0.335
3.37AspArg: 3.37 ± 0.496
3.435AspSer: 3.435 ± 0.409
4.083AspThr: 4.083 ± 0.547
3.759AspVal: 3.759 ± 0.53
1.75AspTrp: 1.75 ± 0.333
1.555AspTyr: 1.555 ± 0.359
0.0AspXaa: 0.0 ± 0.0
Glu
8.425GluAla: 8.425 ± 0.797
0.842GluCys: 0.842 ± 0.236
4.536GluAsp: 4.536 ± 0.573
5.314GluGlu: 5.314 ± 0.973
1.685GluPhe: 1.685 ± 0.273
5.962GluGly: 5.962 ± 0.677
1.685GluHis: 1.685 ± 0.354
3.694GluIle: 3.694 ± 0.582
2.074GluLys: 2.074 ± 0.443
6.416GluLeu: 6.416 ± 0.722
1.102GluMet: 1.102 ± 0.24
2.268GluAsn: 2.268 ± 0.377
2.981GluPro: 2.981 ± 0.469
2.592GluGln: 2.592 ± 0.492
4.407GluArg: 4.407 ± 0.607
4.342GluSer: 4.342 ± 0.633
4.018GluThr: 4.018 ± 0.688
5.638GluVal: 5.638 ± 0.617
1.426GluTrp: 1.426 ± 0.284
2.657GluTyr: 2.657 ± 0.536
0.0GluXaa: 0.0 ± 0.0
Phe
2.527PheAla: 2.527 ± 0.348
0.194PheCys: 0.194 ± 0.109
2.074PheAsp: 2.074 ± 0.394
2.916PheGlu: 2.916 ± 0.483
1.361PhePhe: 1.361 ± 0.284
3.046PheGly: 3.046 ± 0.371
0.389PheHis: 0.389 ± 0.145
1.879PheIle: 1.879 ± 0.398
1.037PheLys: 1.037 ± 0.238
2.139PheLeu: 2.139 ± 0.491
0.778PheMet: 0.778 ± 0.313
1.102PheAsn: 1.102 ± 0.251
1.426PhePro: 1.426 ± 0.311
0.972PheGln: 0.972 ± 0.267
2.268PheArg: 2.268 ± 0.463
1.815PheSer: 1.815 ± 0.418
2.268PheThr: 2.268 ± 0.295
1.426PheVal: 1.426 ± 0.282
0.324PheTrp: 0.324 ± 0.113
1.102PheTyr: 1.102 ± 0.289
0.0PheXaa: 0.0 ± 0.0
Gly
8.165GlyAla: 8.165 ± 1.103
0.518GlyCys: 0.518 ± 0.181
6.286GlyAsp: 6.286 ± 1.106
4.536GlyGlu: 4.536 ± 0.548
2.787GlyPhe: 2.787 ± 0.526
7.193GlyGly: 7.193 ± 0.84
2.139GlyHis: 2.139 ± 0.412
3.888GlyIle: 3.888 ± 0.634
4.083GlyLys: 4.083 ± 0.631
6.416GlyLeu: 6.416 ± 0.877
1.555GlyMet: 1.555 ± 0.337
3.046GlyAsn: 3.046 ± 0.437
3.435GlyPro: 3.435 ± 0.686
2.981GlyGln: 2.981 ± 0.452
4.601GlyArg: 4.601 ± 0.712
5.832GlySer: 5.832 ± 0.807
6.416GlyThr: 6.416 ± 0.729
7.064GlyVal: 7.064 ± 0.669
2.398GlyTrp: 2.398 ± 0.37
2.787GlyTyr: 2.787 ± 0.452
0.0GlyXaa: 0.0 ± 0.0
His
1.944HisAla: 1.944 ± 0.361
0.194HisCys: 0.194 ± 0.119
1.361HisAsp: 1.361 ± 0.301
0.972HisGlu: 0.972 ± 0.282
0.713HisPhe: 0.713 ± 0.213
1.815HisGly: 1.815 ± 0.439
0.518HisHis: 0.518 ± 0.176
1.037HisIle: 1.037 ± 0.293
0.713HisLys: 0.713 ± 0.196
1.815HisLeu: 1.815 ± 0.3
0.194HisMet: 0.194 ± 0.117
0.648HisAsn: 0.648 ± 0.206
1.166HisPro: 1.166 ± 0.303
0.972HisGln: 0.972 ± 0.291
0.972HisArg: 0.972 ± 0.245
1.166HisSer: 1.166 ± 0.282
1.361HisThr: 1.361 ± 0.29
1.491HisVal: 1.491 ± 0.35
0.648HisTrp: 0.648 ± 0.226
0.842HisTyr: 0.842 ± 0.209
0.0HisXaa: 0.0 ± 0.0
Ile
4.99IleAla: 4.99 ± 0.542
0.065IleCys: 0.065 ± 0.066
3.37IleAsp: 3.37 ± 0.507
4.472IleGlu: 4.472 ± 0.64
1.426IlePhe: 1.426 ± 0.36
3.823IleGly: 3.823 ± 0.569
0.842IleHis: 0.842 ± 0.263
2.463IleIle: 2.463 ± 0.715
2.722IleLys: 2.722 ± 0.657
2.916IleLeu: 2.916 ± 0.537
0.713IleMet: 0.713 ± 0.31
1.491IleAsn: 1.491 ± 0.309
2.463IlePro: 2.463 ± 0.401
1.62IleGln: 1.62 ± 0.456
2.981IleArg: 2.981 ± 0.388
1.944IleSer: 1.944 ± 0.332
2.074IleThr: 2.074 ± 0.316
3.37IleVal: 3.37 ± 0.45
0.648IleTrp: 0.648 ± 0.175
1.361IleTyr: 1.361 ± 0.273
0.0IleXaa: 0.0 ± 0.0
Lys
5.055LysAla: 5.055 ± 0.656
0.065LysCys: 0.065 ± 0.061
2.787LysAsp: 2.787 ± 0.43
1.944LysGlu: 1.944 ± 0.36
1.037LysPhe: 1.037 ± 0.251
5.184LysGly: 5.184 ± 0.628
0.713LysHis: 0.713 ± 0.23
1.879LysIle: 1.879 ± 0.522
2.268LysLys: 2.268 ± 0.454
3.694LysLeu: 3.694 ± 0.525
0.778LysMet: 0.778 ± 0.216
1.166LysAsn: 1.166 ± 0.262
3.305LysPro: 3.305 ± 0.582
1.815LysGln: 1.815 ± 0.318
3.24LysArg: 3.24 ± 0.495
1.944LysSer: 1.944 ± 0.315
2.333LysThr: 2.333 ± 0.429
2.463LysVal: 2.463 ± 0.499
0.778LysTrp: 0.778 ± 0.222
1.296LysTyr: 1.296 ± 0.298
0.0LysXaa: 0.0 ± 0.0
Leu
10.434LeuAla: 10.434 ± 0.939
0.583LeuCys: 0.583 ± 0.211
6.156LeuAsp: 6.156 ± 0.654
5.638LeuGlu: 5.638 ± 0.716
1.491LeuPhe: 1.491 ± 0.306
6.221LeuGly: 6.221 ± 0.855
1.685LeuHis: 1.685 ± 0.332
3.694LeuIle: 3.694 ± 0.435
3.37LeuLys: 3.37 ± 0.427
6.156LeuLeu: 6.156 ± 0.649
1.815LeuMet: 1.815 ± 0.421
3.888LeuAsn: 3.888 ± 0.505
4.342LeuPro: 4.342 ± 0.557
2.074LeuGln: 2.074 ± 0.373
5.573LeuArg: 5.573 ± 0.625
5.444LeuSer: 5.444 ± 0.622
5.314LeuThr: 5.314 ± 0.641
6.027LeuVal: 6.027 ± 0.67
1.361LeuTrp: 1.361 ± 0.317
1.685LeuTyr: 1.685 ± 0.28
0.0LeuXaa: 0.0 ± 0.0
Met
2.657MetAla: 2.657 ± 0.346
0.324MetCys: 0.324 ± 0.172
0.518MetAsp: 0.518 ± 0.208
0.778MetGlu: 0.778 ± 0.239
0.648MetPhe: 0.648 ± 0.296
1.426MetGly: 1.426 ± 0.444
0.324MetHis: 0.324 ± 0.144
1.166MetIle: 1.166 ± 0.291
1.037MetLys: 1.037 ± 0.27
1.685MetLeu: 1.685 ± 0.292
0.324MetMet: 0.324 ± 0.214
0.583MetAsn: 0.583 ± 0.163
1.296MetPro: 1.296 ± 0.288
0.583MetGln: 0.583 ± 0.153
1.491MetArg: 1.491 ± 0.332
1.944MetSer: 1.944 ± 0.384
1.685MetThr: 1.685 ± 0.34
1.426MetVal: 1.426 ± 0.273
0.13MetTrp: 0.13 ± 0.089
0.454MetTyr: 0.454 ± 0.218
0.0MetXaa: 0.0 ± 0.0
Asn
3.435AsnAla: 3.435 ± 0.473
0.583AsnCys: 0.583 ± 0.266
1.62AsnAsp: 1.62 ± 0.347
1.879AsnGlu: 1.879 ± 0.381
0.842AsnPhe: 0.842 ± 0.185
3.564AsnGly: 3.564 ± 0.505
0.778AsnHis: 0.778 ± 0.227
1.231AsnIle: 1.231 ± 0.214
1.296AsnLys: 1.296 ± 0.32
2.851AsnLeu: 2.851 ± 0.44
0.454AsnMet: 0.454 ± 0.172
0.842AsnAsn: 0.842 ± 0.27
1.685AsnPro: 1.685 ± 0.292
1.037AsnGln: 1.037 ± 0.218
2.398AsnArg: 2.398 ± 0.332
1.491AsnSer: 1.491 ± 0.301
2.333AsnThr: 2.333 ± 0.435
2.009AsnVal: 2.009 ± 0.387
0.324AsnTrp: 0.324 ± 0.129
0.842AsnTyr: 0.842 ± 0.184
0.0AsnXaa: 0.0 ± 0.0
Pro
4.796ProAla: 4.796 ± 0.696
0.648ProCys: 0.648 ± 0.202
3.175ProAsp: 3.175 ± 0.415
3.888ProGlu: 3.888 ± 0.601
1.491ProPhe: 1.491 ± 0.334
4.147ProGly: 4.147 ± 0.545
0.972ProHis: 0.972 ± 0.241
2.203ProIle: 2.203 ± 0.512
2.722ProLys: 2.722 ± 0.572
3.564ProLeu: 3.564 ± 0.485
1.231ProMet: 1.231 ± 0.284
1.426ProAsn: 1.426 ± 0.357
1.815ProPro: 1.815 ± 0.412
1.75ProGln: 1.75 ± 0.315
2.657ProArg: 2.657 ± 0.45
3.24ProSer: 3.24 ± 0.636
3.305ProThr: 3.305 ± 0.556
3.694ProVal: 3.694 ± 0.375
0.648ProTrp: 0.648 ± 0.175
0.972ProTyr: 0.972 ± 0.289
0.0ProXaa: 0.0 ± 0.0
Gln
4.536GlnAla: 4.536 ± 0.543
0.194GlnCys: 0.194 ± 0.1
1.62GlnAsp: 1.62 ± 0.34
2.398GlnGlu: 2.398 ± 0.391
1.296GlnPhe: 1.296 ± 0.284
2.203GlnGly: 2.203 ± 0.366
0.454GlnHis: 0.454 ± 0.17
2.009GlnIle: 2.009 ± 0.427
1.815GlnLys: 1.815 ± 0.3
2.657GlnLeu: 2.657 ± 0.441
0.907GlnMet: 0.907 ± 0.253
1.037GlnAsn: 1.037 ± 0.257
1.166GlnPro: 1.166 ± 0.277
0.842GlnGln: 0.842 ± 0.23
2.203GlnArg: 2.203 ± 0.382
1.555GlnSer: 1.555 ± 0.303
2.203GlnThr: 2.203 ± 0.356
2.398GlnVal: 2.398 ± 0.335
0.454GlnTrp: 0.454 ± 0.196
0.713GlnTyr: 0.713 ± 0.238
0.0GlnXaa: 0.0 ± 0.0
Arg
4.666ArgAla: 4.666 ± 0.536
0.518ArgCys: 0.518 ± 0.216
4.407ArgAsp: 4.407 ± 0.667
5.055ArgGlu: 5.055 ± 0.719
3.046ArgPhe: 3.046 ± 0.423
4.472ArgGly: 4.472 ± 0.611
1.361ArgHis: 1.361 ± 0.394
2.333ArgIle: 2.333 ± 0.369
2.916ArgLys: 2.916 ± 0.47
5.314ArgLeu: 5.314 ± 0.637
1.879ArgMet: 1.879 ± 0.384
2.139ArgAsn: 2.139 ± 0.361
2.203ArgPro: 2.203 ± 0.373
2.527ArgGln: 2.527 ± 0.444
5.314ArgArg: 5.314 ± 0.822
3.694ArgSer: 3.694 ± 0.484
3.888ArgThr: 3.888 ± 0.415
4.731ArgVal: 4.731 ± 0.657
0.907ArgTrp: 0.907 ± 0.251
2.074ArgTyr: 2.074 ± 0.29
0.0ArgXaa: 0.0 ± 0.0
Ser
5.508SerAla: 5.508 ± 0.693
0.13SerCys: 0.13 ± 0.092
3.499SerAsp: 3.499 ± 0.433
3.629SerGlu: 3.629 ± 0.655
2.009SerPhe: 2.009 ± 0.407
5.508SerGly: 5.508 ± 0.75
1.037SerHis: 1.037 ± 0.275
2.722SerIle: 2.722 ± 0.647
2.139SerLys: 2.139 ± 0.323
6.221SerLeu: 6.221 ± 0.736
0.907SerMet: 0.907 ± 0.21
1.555SerAsn: 1.555 ± 0.427
2.851SerPro: 2.851 ± 0.416
1.555SerGln: 1.555 ± 0.274
4.277SerArg: 4.277 ± 0.521
4.018SerSer: 4.018 ± 0.514
3.823SerThr: 3.823 ± 0.774
4.666SerVal: 4.666 ± 0.527
1.166SerTrp: 1.166 ± 0.277
2.009SerTyr: 2.009 ± 0.317
0.0SerXaa: 0.0 ± 0.0
Thr
6.675ThrAla: 6.675 ± 0.743
0.259ThrCys: 0.259 ± 0.123
3.759ThrAsp: 3.759 ± 0.458
4.796ThrGlu: 4.796 ± 0.669
2.527ThrPhe: 2.527 ± 0.636
6.675ThrGly: 6.675 ± 1.077
1.815ThrHis: 1.815 ± 0.353
2.787ThrIle: 2.787 ± 0.427
2.009ThrLys: 2.009 ± 0.412
5.055ThrLeu: 5.055 ± 0.68
0.713ThrMet: 0.713 ± 0.23
2.009ThrAsn: 2.009 ± 0.415
4.342ThrPro: 4.342 ± 0.794
1.944ThrGln: 1.944 ± 0.338
2.851ThrArg: 2.851 ± 0.38
3.694ThrSer: 3.694 ± 0.633
4.407ThrThr: 4.407 ± 0.688
5.444ThrVal: 5.444 ± 0.644
1.231ThrTrp: 1.231 ± 0.237
2.333ThrTyr: 2.333 ± 0.344
0.0ThrXaa: 0.0 ± 0.0
Val
7.712ValAla: 7.712 ± 0.72
0.583ValCys: 0.583 ± 0.203
4.731ValAsp: 4.731 ± 0.503
5.314ValGlu: 5.314 ± 0.612
2.203ValPhe: 2.203 ± 0.368
5.832ValGly: 5.832 ± 0.512
1.75ValHis: 1.75 ± 0.345
3.111ValIle: 3.111 ± 0.477
4.277ValLys: 4.277 ± 0.574
5.832ValLeu: 5.832 ± 0.834
1.231ValMet: 1.231 ± 0.276
1.944ValAsn: 1.944 ± 0.328
2.916ValPro: 2.916 ± 0.354
2.398ValGln: 2.398 ± 0.327
4.407ValArg: 4.407 ± 0.622
4.277ValSer: 4.277 ± 0.361
6.545ValThr: 6.545 ± 0.702
5.184ValVal: 5.184 ± 0.673
1.426ValTrp: 1.426 ± 0.282
1.426ValTyr: 1.426 ± 0.339
0.0ValXaa: 0.0 ± 0.0
Trp
2.139TrpAla: 2.139 ± 0.35
0.259TrpCys: 0.259 ± 0.127
1.555TrpAsp: 1.555 ± 0.281
1.426TrpGlu: 1.426 ± 0.354
0.454TrpPhe: 0.454 ± 0.168
1.491TrpGly: 1.491 ± 0.301
0.194TrpHis: 0.194 ± 0.111
0.454TrpIle: 0.454 ± 0.14
1.037TrpLys: 1.037 ± 0.241
1.296TrpLeu: 1.296 ± 0.34
0.389TrpMet: 0.389 ± 0.206
0.648TrpAsn: 0.648 ± 0.212
0.324TrpPro: 0.324 ± 0.139
0.842TrpGln: 0.842 ± 0.219
1.361TrpArg: 1.361 ± 0.344
1.555TrpSer: 1.555 ± 0.303
1.491TrpThr: 1.491 ± 0.364
1.296TrpVal: 1.296 ± 0.29
0.13TrpTrp: 0.13 ± 0.134
0.259TrpTyr: 0.259 ± 0.17
0.0TrpXaa: 0.0 ± 0.0
Tyr
2.851TyrAla: 2.851 ± 0.415
0.194TyrCys: 0.194 ± 0.102
2.333TyrAsp: 2.333 ± 0.424
2.787TyrGlu: 2.787 ± 0.474
0.972TyrPhe: 0.972 ± 0.21
3.111TyrGly: 3.111 ± 0.488
0.454TyrHis: 0.454 ± 0.184
1.037TyrIle: 1.037 ± 0.274
0.648TyrLys: 0.648 ± 0.219
1.879TyrLeu: 1.879 ± 0.399
0.583TyrMet: 0.583 ± 0.179
0.907TyrAsn: 0.907 ± 0.248
1.555TyrPro: 1.555 ± 0.364
0.713TyrGln: 0.713 ± 0.224
2.009TyrArg: 2.009 ± 0.411
2.009TyrSer: 2.009 ± 0.379
1.555TyrThr: 1.555 ± 0.299
1.491TyrVal: 1.491 ± 0.338
0.518TyrTrp: 0.518 ± 0.216
0.713TyrTyr: 0.713 ± 0.177
0.0TyrXaa: 0.0 ± 0.0
Xaa
0.0XaaAla: 0.0 ± 0.0
0.0XaaCys: 0.0 ± 0.0
0.0XaaAsp: 0.0 ± 0.0
0.0XaaGlu: 0.0 ± 0.0
0.0XaaPhe: 0.0 ± 0.0
0.0XaaGly: 0.0 ± 0.0
0.0XaaHis: 0.0 ± 0.0
0.0XaaIle: 0.0 ± 0.0
0.0XaaLys: 0.0 ± 0.0
0.0XaaLeu: 0.0 ± 0.0
0.0XaaMet: 0.0 ± 0.0
0.0XaaAsn: 0.0 ± 0.0
0.0XaaPro: 0.0 ± 0.0
0.0XaaGln: 0.0 ± 0.0
0.0XaaArg: 0.0 ± 0.0
0.0XaaSer: 0.0 ± 0.0
0.0XaaThr: 0.0 ± 0.0
0.0XaaVal: 0.0 ± 0.0
0.0XaaTrp: 0.0 ± 0.0
0.0XaaTyr: 0.0 ± 0.0
0.0XaaXaa: 0.0 ± 0.0
Statistics based on 75 proteins (15432 amino acids)

Note: The error was estimated by bootstrapping (×100) at the protein level.
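A protein-level bootstrap of this kind can be sketched as follows. This is an illustration under stated assumptions, not the procedure used by Proteome-pI: the function names are hypothetical, whole proteins are resampled with replacement, and the error is taken as the standard deviation of the estimate across replicates, which the note above does not specify.

```python
import random
import statistics


def pair_per_mille(proteins, pair):
    """Frequency of one dipeptide, in per mille, over overlapping pairs."""
    count = 0
    total = 0
    for seq in proteins:
        for i in range(len(seq) - 1):
            total += 1
            if seq[i:i + 2] == pair:
                count += 1
    return 1000.0 * count / total if total else 0.0


def bootstrap_error(proteins, pair, n_replicates=100, seed=0):
    """Estimate the error of a dipeptide frequency by resampling proteins.

    Assumption: the reported error is the standard deviation over replicates.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    estimates = []
    for _ in range(n_replicates):
        # Draw as many proteins as in the original set, with replacement.
        sample = rng.choices(proteins, k=len(proteins))
        estimates.append(pair_per_mille(sample, pair))
    return statistics.stdev(estimates)


if __name__ == "__main__":
    toy_proteome = ["AAAC", "ACA", "CCAA", "ACAC"]  # hypothetical sequences
    value = pair_per_mille(toy_proteome, "AA")
    error = bootstrap_error(toy_proteome, "AA")
    print(f"AA: {value:.3f} ± {error:.3f}")
```

Resampling whole proteins rather than individual dipeptides keeps pairs from the same protein together, so the spread reflects variation between proteins.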

The dipeptide statistics above (along with other statistics for this proteome) can be downloaded as a CSV file.
See this proteome in: uniprot_link
Proteome-pI is available under Creative Commons Attribution-NoDerivs license, for more details see here

Reference: Kozlowski LP. Proteome-pI 2.0: Proteome Isoelectric Point Database Update. Nucleic Acids Res. 2021, doi: 10.1093/nar/gkab944
Contact: Lukasz P. Kozlowski