Amino acid dipeptide frequency for Streptomyces phage Geostin

Apart from single amino acid frequencies, one can also calculate the so-called amino acid dipeptide frequency.

There are 400 possible dipeptides (441 if we consider Xaa as an additional 21st amino acid representing all non-standard or unknown amino acids). Thus, if dipeptides occurred uniformly at random in proteins, each of the 400 dipeptides would be expected at a frequency of 0.25% (2.5‰). As this is not the case in nature, for better visibility all overrepresented dipeptides are marked in red, and underrepresented ones are marked in blue in the table.
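The counting described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the method used by Proteome-pI: the function name `dipeptide_per_mille`, the toy sequences, the use of overlapping pairs within each protein, and the normalization by the total number of pairs are all hypothetical choices made for this example.

```python
from collections import Counter
from itertools import product

STANDARD = "ACDEFGHIKLMNPQRSTVWY"  # one-letter codes of the 20 standard amino acids


def dipeptide_per_mille(sequences):
    """Return {dipeptide: frequency in per mille} for all 441 pairs.

    Assumption: overlapping pairs are counted inside each protein and
    never span two proteins; the database may count differently.
    """
    counts = Counter()
    for seq in sequences:
        # Map every non-standard or unknown residue to X (Xaa).
        seq = "".join(c if c in STANDARD else "X" for c in seq.upper())
        counts.update(seq[i:i + 2] for i in range(len(seq) - 1))
    total = sum(counts.values())
    alphabet = STANDARD + "X"
    return {
        a + b: (1000.0 * counts[a + b] / total if total else 0.0)
        for a, b in product(alphabet, repeat=2)
    }


# Toy input (hypothetical sequences, not taken from the Geostin proteome).
toy = ["MAAGK", "AAXK"]
freqs = dipeptide_per_mille(toy)

# Uniform expectation over the 400 standard dipeptides, in per mille.
uniform = 1000.0 / 400

over = sorted(p for p, f in freqs.items() if f > uniform)
print(over)
```

Comparing each value with `uniform` mirrors the red/blue marking described above, although the exact threshold used for the coloring in the table is not stated in the source.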

All values are given in per mille (‰); to convert them to fractions, multiply by 10^-3.
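As a small worked example of the unit conversion, the snippet below turns a per mille value into a fraction and into a percentage. The helper names are hypothetical and introduced only for illustration; the AlaAla value is the one listed in the table.

```python
def per_mille_to_fraction(value):
    """Convert a per mille value to a plain fraction (multiply by 10^-3)."""
    return value * 1e-3


def per_mille_to_percent(value):
    """Convert a per mille value to percent (1% = 10 per mille)."""
    return value / 10.0


ala_ala = 11.164  # AlaAla frequency from the table, in per mille
print(per_mille_to_fraction(ala_ala))  # fraction of all dipeptides
print(per_mille_to_percent(2.5))       # uniform expectation expressed in percent
```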

For more information, see the sequence space article on Wikipedia.

Ala Cys Asp Glu Phe Gly His Ile Lys Leu Met Asn Pro Gln Arg Ser Thr Val Trp Tyr Xaa
Ala
11.164AlaAla: 11.164 ± 1.237
0.824AlaCys: 0.824 ± 0.256
4.795AlaAsp: 4.795 ± 0.637
6.518AlaGlu: 6.518 ± 0.6
2.173AlaPhe: 2.173 ± 0.325
8.691AlaGly: 8.691 ± 0.846
1.349AlaHis: 1.349 ± 0.248
3.671AlaIle: 3.671 ± 0.531
6.668AlaLys: 6.668 ± 0.879
7.492AlaLeu: 7.492 ± 0.661
2.323AlaMet: 2.323 ± 0.393
4.196AlaAsn: 4.196 ± 0.634
3.596AlaPro: 3.596 ± 0.679
4.945AlaGln: 4.945 ± 0.605
5.095AlaArg: 5.095 ± 0.669
6.294AlaSer: 6.294 ± 0.748
7.043AlaThr: 7.043 ± 0.907
7.567AlaVal: 7.567 ± 0.674
1.948AlaTrp: 1.948 ± 0.331
3.521AlaTyr: 3.521 ± 0.752
0.0AlaXaa: 0.0 ± 0.0
Cys
0.3CysAla: 0.3 ± 0.152
0.0CysCys: 0.0 ± 0.0
0.524CysAsp: 0.524 ± 0.222
0.45CysGlu: 0.45 ± 0.183
0.075CysPhe: 0.075 ± 0.074
0.524CysGly: 0.524 ± 0.219
0.225CysHis: 0.225 ± 0.126
0.15CysIle: 0.15 ± 0.117
0.599CysLys: 0.599 ± 0.261
0.599CysLeu: 0.599 ± 0.202
0.15CysMet: 0.15 ± 0.116
0.0CysAsn: 0.0 ± 0.0
0.974CysPro: 0.974 ± 0.38
0.3CysGln: 0.3 ± 0.193
0.225CysArg: 0.225 ± 0.132
0.45CysSer: 0.45 ± 0.207
0.3CysThr: 0.3 ± 0.159
0.824CysVal: 0.824 ± 0.282
0.075CysTrp: 0.075 ± 0.08
0.0CysTyr: 0.0 ± 0.0
0.0CysXaa: 0.0 ± 0.0
Asp
6.443AspAla: 6.443 ± 0.685
0.45AspCys: 0.45 ± 0.178
4.346AspAsp: 4.346 ± 0.602
4.121AspGlu: 4.121 ± 0.548
2.023AspPhe: 2.023 ± 0.312
4.795AspGly: 4.795 ± 0.694
1.274AspHis: 1.274 ± 0.311
3.521AspIle: 3.521 ± 0.513
2.922AspLys: 2.922 ± 0.404
4.271AspLeu: 4.271 ± 0.571
2.622AspMet: 2.622 ± 0.617
2.248AspAsn: 2.248 ± 0.48
4.645AspPro: 4.645 ± 0.488
2.248AspGln: 2.248 ± 0.397
3.222AspArg: 3.222 ± 0.38
3.521AspSer: 3.521 ± 0.531
3.671AspThr: 3.671 ± 0.447
4.87AspVal: 4.87 ± 0.616
1.349AspTrp: 1.349 ± 0.339
2.847AspTyr: 2.847 ± 0.376
0.0AspXaa: 0.0 ± 0.0
Glu
5.619GluAla: 5.619 ± 0.688
0.45GluCys: 0.45 ± 0.236
3.222GluAsp: 3.222 ± 0.502
3.671GluGlu: 3.671 ± 0.471
1.873GluPhe: 1.873 ± 0.32
4.495GluGly: 4.495 ± 0.542
1.124GluHis: 1.124 ± 0.35
2.997GluIle: 2.997 ± 0.506
3.072GluLys: 3.072 ± 0.552
5.544GluLeu: 5.544 ± 0.499
0.974GluMet: 0.974 ± 0.356
2.398GluAsn: 2.398 ± 0.393
1.648GluPro: 1.648 ± 0.418
2.772GluGln: 2.772 ± 0.414
2.997GluArg: 2.997 ± 0.548
2.622GluSer: 2.622 ± 0.353
3.072GluThr: 3.072 ± 0.418
3.821GluVal: 3.821 ± 0.604
1.124GluTrp: 1.124 ± 0.259
2.547GluTyr: 2.547 ± 0.562
0.0GluXaa: 0.0 ± 0.0
Phe
2.398PheAla: 2.398 ± 0.404
0.3PheCys: 0.3 ± 0.157
2.547PheAsp: 2.547 ± 0.526
2.098PheGlu: 2.098 ± 0.437
0.824PhePhe: 0.824 ± 0.238
3.372PheGly: 3.372 ± 0.605
0.524PheHis: 0.524 ± 0.185
1.274PheIle: 1.274 ± 0.365
1.199PheLys: 1.199 ± 0.274
2.023PheLeu: 2.023 ± 0.401
0.899PheMet: 0.899 ± 0.241
1.498PheAsn: 1.498 ± 0.334
1.199PhePro: 1.199 ± 0.441
1.199PheGln: 1.199 ± 0.331
1.498PheArg: 1.498 ± 0.48
2.472PheSer: 2.472 ± 0.363
2.922PheThr: 2.922 ± 0.477
1.498PheVal: 1.498 ± 0.31
0.375PheTrp: 0.375 ± 0.184
1.199PheTyr: 1.199 ± 0.328
0.0PheXaa: 0.0 ± 0.0
Gly
8.242GlyAla: 8.242 ± 0.941
0.599GlyCys: 0.599 ± 0.207
3.971GlyAsp: 3.971 ± 0.58
5.245GlyGlu: 5.245 ± 0.728
2.847GlyPhe: 2.847 ± 0.48
8.691GlyGly: 8.691 ± 1.079
1.648GlyHis: 1.648 ± 0.411
3.746GlyIle: 3.746 ± 0.621
5.994GlyLys: 5.994 ± 0.651
6.743GlyLeu: 6.743 ± 0.621
2.098GlyMet: 2.098 ± 0.342
3.072GlyAsn: 3.072 ± 0.481
2.922GlyPro: 2.922 ± 0.402
3.971GlyGln: 3.971 ± 0.588
3.147GlyArg: 3.147 ± 0.5
5.095GlySer: 5.095 ± 0.507
5.994GlyThr: 5.994 ± 1.184
5.994GlyVal: 5.994 ± 0.54
1.498GlyTrp: 1.498 ± 0.265
3.072GlyTyr: 3.072 ± 0.584
0.0GlyXaa: 0.0 ± 0.0
His
1.199HisAla: 1.199 ± 0.236
0.15HisCys: 0.15 ± 0.111
1.498HisAsp: 1.498 ± 0.432
0.674HisGlu: 0.674 ± 0.242
0.599HisPhe: 0.599 ± 0.165
1.274HisGly: 1.274 ± 0.269
0.375HisHis: 0.375 ± 0.167
0.674HisIle: 0.674 ± 0.219
1.049HisLys: 1.049 ± 0.258
1.199HisLeu: 1.199 ± 0.292
0.375HisMet: 0.375 ± 0.159
0.524HisAsn: 0.524 ± 0.192
0.749HisPro: 0.749 ± 0.235
0.599HisGln: 0.599 ± 0.193
0.3HisArg: 0.3 ± 0.135
1.199HisSer: 1.199 ± 0.268
0.524HisThr: 0.524 ± 0.214
1.648HisVal: 1.648 ± 0.338
0.225HisTrp: 0.225 ± 0.181
0.899HisTyr: 0.899 ± 0.27
0.0HisXaa: 0.0 ± 0.0
Ile
4.645IleAla: 4.645 ± 0.595
0.3IleCys: 0.3 ± 0.202
3.596IleAsp: 3.596 ± 0.604
2.098IleGlu: 2.098 ± 0.431
1.124IlePhe: 1.124 ± 0.257
3.072IleGly: 3.072 ± 0.573
0.899IleHis: 0.899 ± 0.252
2.098IleIle: 2.098 ± 0.402
2.697IleLys: 2.697 ± 0.514
3.147IleLeu: 3.147 ± 0.406
0.974IleMet: 0.974 ± 0.266
1.349IleAsn: 1.349 ± 0.302
1.648IlePro: 1.648 ± 0.401
1.124IleGln: 1.124 ± 0.285
2.847IleArg: 2.847 ± 0.563
2.697IleSer: 2.697 ± 0.428
3.671IleThr: 3.671 ± 0.566
2.173IleVal: 2.173 ± 0.403
0.45IleTrp: 0.45 ± 0.165
1.349IleTyr: 1.349 ± 0.298
0.0IleXaa: 0.0 ± 0.0
Lys
6.968LysAla: 6.968 ± 0.935
0.375LysCys: 0.375 ± 0.185
3.896LysAsp: 3.896 ± 0.54
2.248LysGlu: 2.248 ± 0.371
2.697LysPhe: 2.697 ± 0.487
4.196LysGly: 4.196 ± 0.633
0.974LysHis: 0.974 ± 0.281
2.248LysIle: 2.248 ± 0.343
3.372LysLys: 3.372 ± 0.649
4.346LysLeu: 4.346 ± 0.716
1.723LysMet: 1.723 ± 0.396
2.098LysAsn: 2.098 ± 0.58
2.622LysPro: 2.622 ± 0.648
2.772LysGln: 2.772 ± 0.514
2.697LysArg: 2.697 ± 0.558
2.547LysSer: 2.547 ± 0.444
3.297LysThr: 3.297 ± 0.49
4.945LysVal: 4.945 ± 0.781
1.199LysTrp: 1.199 ± 0.329
2.248LysTyr: 2.248 ± 0.498
0.0LysXaa: 0.0 ± 0.0
Leu
6.368LeuAla: 6.368 ± 0.813
0.3LeuCys: 0.3 ± 0.163
5.17LeuAsp: 5.17 ± 0.821
3.446LeuGlu: 3.446 ± 0.458
3.147LeuPhe: 3.147 ± 0.536
5.469LeuGly: 5.469 ± 0.609
0.599LeuHis: 0.599 ± 0.203
2.922LeuIle: 2.922 ± 0.512
4.72LeuLys: 4.72 ± 0.675
5.095LeuLeu: 5.095 ± 0.524
1.798LeuMet: 1.798 ± 0.35
2.697LeuAsn: 2.697 ± 0.387
3.521LeuPro: 3.521 ± 0.445
3.596LeuGln: 3.596 ± 0.551
4.945LeuArg: 4.945 ± 0.697
4.945LeuSer: 4.945 ± 0.758
6.144LeuThr: 6.144 ± 0.591
5.769LeuVal: 5.769 ± 0.595
0.974LeuTrp: 0.974 ± 0.354
2.622LeuTyr: 2.622 ± 0.427
0.0LeuXaa: 0.0 ± 0.0
Met
4.645MetAla: 4.645 ± 0.608
0.075MetCys: 0.075 ± 0.075
2.023MetAsp: 2.023 ± 0.345
1.199MetGlu: 1.199 ± 0.281
0.524MetPhe: 0.524 ± 0.172
2.772MetGly: 2.772 ± 0.518
0.075MetHis: 0.075 ± 0.086
1.049MetIle: 1.049 ± 0.319
1.049MetLys: 1.049 ± 0.351
1.424MetLeu: 1.424 ± 0.359
0.749MetMet: 0.749 ± 0.215
1.199MetAsn: 1.199 ± 0.258
1.124MetPro: 1.124 ± 0.387
1.349MetGln: 1.349 ± 0.325
1.498MetArg: 1.498 ± 0.256
2.772MetSer: 2.772 ± 0.489
1.424MetThr: 1.424 ± 0.333
1.424MetVal: 1.424 ± 0.273
0.45MetTrp: 0.45 ± 0.178
0.375MetTyr: 0.375 ± 0.187
0.0MetXaa: 0.0 ± 0.0
Asn
3.821AsnAla: 3.821 ± 0.576
0.375AsnCys: 0.375 ± 0.161
2.398AsnAsp: 2.398 ± 0.458
2.697AsnGlu: 2.697 ± 0.456
1.349AsnPhe: 1.349 ± 0.281
4.046AsnGly: 4.046 ± 0.765
0.749AsnHis: 0.749 ± 0.207
1.424AsnIle: 1.424 ± 0.256
2.248AsnLys: 2.248 ± 0.422
3.896AsnLeu: 3.896 ± 0.643
0.45AsnMet: 0.45 ± 0.142
1.873AsnAsn: 1.873 ± 0.505
2.847AsnPro: 2.847 ± 0.447
1.648AsnGln: 1.648 ± 0.37
1.873AsnArg: 1.873 ± 0.478
2.472AsnSer: 2.472 ± 0.334
2.173AsnThr: 2.173 ± 0.314
2.098AsnVal: 2.098 ± 0.436
1.124AsnTrp: 1.124 ± 0.257
1.274AsnTyr: 1.274 ± 0.309
0.0AsnXaa: 0.0 ± 0.0
Pro
4.945ProAla: 4.945 ± 0.81
0.225ProCys: 0.225 ± 0.135
2.398ProAsp: 2.398 ± 0.43
2.622ProGlu: 2.622 ± 0.388
1.349ProPhe: 1.349 ± 0.289
4.271ProGly: 4.271 ± 0.616
0.749ProHis: 0.749 ± 0.215
0.974ProIle: 0.974 ± 0.243
2.098ProLys: 2.098 ± 0.447
3.072ProLeu: 3.072 ± 0.47
1.124ProMet: 1.124 ± 0.327
1.798ProAsn: 1.798 ± 0.302
2.023ProPro: 2.023 ± 0.471
1.798ProGln: 1.798 ± 0.398
2.547ProArg: 2.547 ± 0.434
2.997ProSer: 2.997 ± 0.499
3.746ProThr: 3.746 ± 0.594
3.222ProVal: 3.222 ± 0.409
0.899ProTrp: 0.899 ± 0.226
1.573ProTyr: 1.573 ± 0.358
0.0ProXaa: 0.0 ± 0.0
Gln
6.144GlnAla: 6.144 ± 0.78
0.15GlnCys: 0.15 ± 0.107
3.297GlnAsp: 3.297 ± 0.467
2.398GlnGlu: 2.398 ± 0.457
0.824GlnPhe: 0.824 ± 0.283
2.323GlnGly: 2.323 ± 0.491
0.225GlnHis: 0.225 ± 0.141
2.098GlnIle: 2.098 ± 0.47
2.472GlnLys: 2.472 ± 0.438
3.896GlnLeu: 3.896 ± 0.51
1.573GlnMet: 1.573 ± 0.288
1.124GlnAsn: 1.124 ± 0.227
1.498GlnPro: 1.498 ± 0.385
2.398GlnGln: 2.398 ± 0.446
2.847GlnArg: 2.847 ± 0.447
3.297GlnSer: 3.297 ± 0.469
3.072GlnThr: 3.072 ± 0.478
2.997GlnVal: 2.997 ± 0.456
0.3GlnTrp: 0.3 ± 0.155
1.648GlnTyr: 1.648 ± 0.328
0.0GlnXaa: 0.0 ± 0.0
Arg
4.42ArgAla: 4.42 ± 0.586
0.524ArgCys: 0.524 ± 0.21
3.896ArgAsp: 3.896 ± 0.518
2.922ArgGlu: 2.922 ± 0.546
2.023ArgPhe: 2.023 ± 0.43
3.746ArgGly: 3.746 ± 0.455
0.824ArgHis: 0.824 ± 0.222
2.173ArgIle: 2.173 ± 0.41
2.547ArgLys: 2.547 ± 0.488
2.997ArgLeu: 2.997 ± 0.509
1.424ArgMet: 1.424 ± 0.335
2.398ArgAsn: 2.398 ± 0.508
2.023ArgPro: 2.023 ± 0.386
1.873ArgGln: 1.873 ± 0.428
2.173ArgArg: 2.173 ± 0.621
2.847ArgSer: 2.847 ± 0.443
3.521ArgThr: 3.521 ± 0.463
4.121ArgVal: 4.121 ± 0.626
1.199ArgTrp: 1.199 ± 0.321
1.798ArgTyr: 1.798 ± 0.416
0.0ArgXaa: 0.0 ± 0.0
Ser
5.32SerAla: 5.32 ± 0.571
0.524SerCys: 0.524 ± 0.235
4.121SerAsp: 4.121 ± 0.525
2.922SerGlu: 2.922 ± 0.409
2.472SerPhe: 2.472 ± 0.292
7.867SerGly: 7.867 ± 0.976
0.899SerHis: 0.899 ± 0.223
2.997SerIle: 2.997 ± 0.38
1.948SerLys: 1.948 ± 0.324
4.196SerLeu: 4.196 ± 0.575
1.723SerMet: 1.723 ± 0.405
3.372SerAsn: 3.372 ± 0.474
2.922SerPro: 2.922 ± 0.437
2.847SerGln: 2.847 ± 0.388
2.622SerArg: 2.622 ± 0.452
5.17SerSer: 5.17 ± 0.927
4.42SerThr: 4.42 ± 0.703
4.346SerVal: 4.346 ± 0.567
1.199SerTrp: 1.199 ± 0.332
2.472SerTyr: 2.472 ± 0.359
0.0SerXaa: 0.0 ± 0.0
Thr
6.144ThrAla: 6.144 ± 0.98
0.375ThrCys: 0.375 ± 0.197
4.42ThrAsp: 4.42 ± 0.56
2.847ThrGlu: 2.847 ± 0.467
2.323ThrPhe: 2.323 ± 0.362
6.219ThrGly: 6.219 ± 0.787
1.274ThrHis: 1.274 ± 0.338
3.072ThrIle: 3.072 ± 0.572
4.046ThrLys: 4.046 ± 0.567
5.619ThrLeu: 5.619 ± 0.835
1.873ThrMet: 1.873 ± 0.395
3.147ThrAsn: 3.147 ± 0.614
4.046ThrPro: 4.046 ± 0.535
2.847ThrGln: 2.847 ± 0.464
2.622ThrArg: 2.622 ± 0.514
4.87ThrSer: 4.87 ± 0.762
5.694ThrThr: 5.694 ± 0.595
4.495ThrVal: 4.495 ± 0.756
1.049ThrTrp: 1.049 ± 0.246
3.147ThrTyr: 3.147 ± 0.4
0.0ThrXaa: 0.0 ± 0.0
Val
6.294ValAla: 6.294 ± 0.844
0.375ValCys: 0.375 ± 0.199
5.02ValAsp: 5.02 ± 0.606
4.271ValGlu: 4.271 ± 0.935
2.023ValPhe: 2.023 ± 0.413
4.42ValGly: 4.42 ± 0.478
1.498ValHis: 1.498 ± 0.35
2.922ValIle: 2.922 ± 0.365
4.945ValLys: 4.945 ± 0.577
4.87ValLeu: 4.87 ± 0.408
2.098ValMet: 2.098 ± 0.299
3.372ValAsn: 3.372 ± 0.612
2.472ValPro: 2.472 ± 0.501
3.297ValGln: 3.297 ± 0.518
3.896ValArg: 3.896 ± 0.613
4.795ValSer: 4.795 ± 0.547
5.844ValThr: 5.844 ± 0.831
3.596ValVal: 3.596 ± 0.688
1.049ValTrp: 1.049 ± 0.263
3.072ValTyr: 3.072 ± 0.53
0.0ValXaa: 0.0 ± 0.0
Trp
1.573TrpAla: 1.573 ± 0.318
0.075TrpCys: 0.075 ± 0.073
1.349TrpAsp: 1.349 ± 0.39
1.049TrpGlu: 1.049 ± 0.326
0.45TrpPhe: 0.45 ± 0.187
0.899TrpGly: 0.899 ± 0.247
0.15TrpHis: 0.15 ± 0.112
0.749TrpIle: 0.749 ± 0.204
1.049TrpLys: 1.049 ± 0.243
1.049TrpLeu: 1.049 ± 0.333
0.45TrpMet: 0.45 ± 0.173
0.749TrpAsn: 0.749 ± 0.229
0.674TrpPro: 0.674 ± 0.235
1.049TrpGln: 1.049 ± 0.3
0.824TrpArg: 0.824 ± 0.245
0.899TrpSer: 0.899 ± 0.21
1.424TrpThr: 1.424 ± 0.296
1.798TrpVal: 1.798 ± 0.268
0.45TrpTrp: 0.45 ± 0.171
0.674TrpTyr: 0.674 ± 0.254
0.0TrpXaa: 0.0 ± 0.0
Tyr
3.222TyrAla: 3.222 ± 0.595
0.375TyrCys: 0.375 ± 0.165
2.922TyrAsp: 2.922 ± 0.504
2.547TyrGlu: 2.547 ± 0.61
0.599TyrPhe: 0.599 ± 0.185
3.446TyrGly: 3.446 ± 0.531
0.3TyrHis: 0.3 ± 0.153
1.199TyrIle: 1.199 ± 0.291
2.772TyrLys: 2.772 ± 0.513
2.772TyrLeu: 2.772 ± 0.385
1.723TyrMet: 1.723 ± 0.369
1.723TyrAsn: 1.723 ± 0.437
1.274TyrPro: 1.274 ± 0.318
1.948TyrGln: 1.948 ± 0.304
1.573TyrArg: 1.573 ± 0.404
2.398TyrSer: 2.398 ± 0.409
2.173TyrThr: 2.173 ± 0.402
2.922TyrVal: 2.922 ± 0.481
0.45TyrTrp: 0.45 ± 0.202
1.798TyrTyr: 1.798 ± 0.37
0.0TyrXaa: 0.0 ± 0.0
Xaa
0.0XaaAla: 0.0 ± 0.0
0.0XaaCys: 0.0 ± 0.0
0.0XaaAsp: 0.0 ± 0.0
0.0XaaGlu: 0.0 ± 0.0
0.0XaaPhe: 0.0 ± 0.0
0.0XaaGly: 0.0 ± 0.0
0.0XaaHis: 0.0 ± 0.0
0.0XaaIle: 0.0 ± 0.0
0.0XaaLys: 0.0 ± 0.0
0.0XaaLeu: 0.0 ± 0.0
0.0XaaMet: 0.0 ± 0.0
0.0XaaAsn: 0.0 ± 0.0
0.0XaaPro: 0.0 ± 0.0
0.0XaaGln: 0.0 ± 0.0
0.0XaaArg: 0.0 ± 0.0
0.0XaaSer: 0.0 ± 0.0
0.0XaaThr: 0.0 ± 0.0
0.0XaaVal: 0.0 ± 0.0
0.0XaaTrp: 0.0 ± 0.0
0.0XaaTyr: 0.0 ± 0.0
0.0XaaXaa: 0.0 ± 0.0
Statistics based on 66 proteins (13348 amino acids)

Note: The error was estimated by bootstrapping (100 replicates) at the protein level.
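Protein-level bootstrapping can be sketched as follows: whole proteins are resampled with replacement, the dipeptide frequency is recomputed for each resample, and the spread of the resulting estimates is reported. This is a minimal sketch, not the Proteome-pI implementation. The function names, the toy sequences, and the use of the sample standard deviation as the error are assumptions made for this example.

```python
import random
import statistics
from collections import Counter


def pair_per_mille(sequences, pair):
    """Frequency of one dipeptide in per mille, counting overlapping pairs."""
    counts = Counter()
    for seq in sequences:
        counts.update(seq[i:i + 2] for i in range(len(seq) - 1))
    total = sum(counts.values())
    return 1000.0 * counts[pair] / total if total else 0.0


def bootstrap_error(sequences, pair, n_rounds=100, seed=0):
    """Spread of the frequency over resampled proteomes.

    Each round draws len(sequences) proteins with replacement, so the
    protein (not the residue) is the resampling unit. The error is taken
    as the sample standard deviation, which is an assumption.
    """
    rng = random.Random(seed)  # fixed seed keeps the sketch reproducible
    estimates = []
    for _ in range(n_rounds):
        sample = rng.choices(sequences, k=len(sequences))
        estimates.append(pair_per_mille(sample, pair))
    return statistics.stdev(estimates)


# Toy proteome (hypothetical sequences, one-letter codes).
proteins = ["MAAGK", "AAXK", "GGGAA", "KLMNP"]
err = bootstrap_error(proteins, "AA")
print(err)
```

Resampling whole proteins rather than individual residues keeps neighbouring residues together, so every resampled proteome still contains only dipeptides that occur in real sequences.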

The dipeptide statistics above (along with other statistics for this proteome) can be downloaded from this CSV file
See this proteome in: uniprot_link
Proteome-pI is available under a Creative Commons Attribution-NoDerivs license; for more details see here

Reference: Kozlowski LP. Proteome-pI 2.0: Proteome Isoelectric Point Database Update. Nucleic Acids Res. 2021, doi: 10.1093/nar/gkab944
Contact: Lukasz P. Kozlowski