Amino acid dipeptide frequency for Poophage MBI-2016a

Apart from single amino acid frequencies, one can also calculate the so-called amino acid dipeptide frequency, i.e. the frequency of each pair of adjacent residues.

There are 400 possible dipeptides (441 if Xaa is counted as an additional, 21st amino acid representing all non-standard or unknown amino acids). Thus, if dipeptides occurred randomly in proteins, each of the 400 dipeptides would be expected at a frequency of 0.25% (2.5‰). As this is not the case in nature, for better visibility, overrepresented dipeptides are marked in red in the table and underrepresented ones in blue.
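As a minimal sketch of how such frequencies might be computed (not the Proteome-pI implementation), the Python below counts overlapping adjacent residue pairs in a set of protein sequences and compares each frequency to the uniform expectation of 1/400. The function names and toy sequences are hypothetical. One-letter codes are used for brevity, any non-standard letter is mapped to X (Xaa), and pairs are counted within each protein only, which is an assumption; the page does not state how protein boundaries are handled.

```python
from collections import Counter

STANDARD = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard amino acids, one-letter codes

def dipeptide_counts(proteins):
    """Count overlapping adjacent residue pairs within each protein."""
    counts = Counter()
    for seq in proteins:
        # Map anything outside the standard alphabet to 'X' (Xaa).
        seq = "".join(aa if aa in STANDARD else "X" for aa in seq.upper())
        # Assumption: pairs never span two proteins; each sequence is scanned separately.
        counts.update(seq[i:i + 2] for i in range(len(seq) - 1))
    return counts

def dipeptide_frequencies(proteins):
    """Return dipeptide frequencies as fractions that sum to 1."""
    counts = dipeptide_counts(proteins)
    total = sum(counts.values())
    return {pair: n / total for pair, n in counts.items()}

def classify(freq, n_residue_types=20):
    """Label a frequency relative to the uniform expectation (1/400 = 0.25%)."""
    expected = 1 / n_residue_types ** 2
    if freq > expected:
        return "over"
    if freq < expected:
        return "under"
    return "expected"

proteins = ["MKVLAAGSS", "MSSLAXK"]  # toy, made-up sequences
freqs = dipeptide_frequencies(proteins)
for pair in ("SS", "LA", "AX"):
    print(pair, f"{freqs[pair] * 100:.2f}%", classify(freqs[pair]))
```

With sequences this short every observed pair is far above 0.25%, so the comparison only becomes meaningful for a whole proteome.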

All values are given in per mille (‰) and therefore need to be multiplied by 10⁻³ to obtain fractions.
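The unit conversion can be sketched as follows, using the AlaAla entry from the table (9.305‰) as input. The helper name is hypothetical, and the comparison value assumes the uniform expectation over 400 dipeptides described above.

```python
def per_mille_to_fraction(value_per_mille):
    """Convert a per mille value to a plain fraction."""
    return value_per_mille * 1e-3

ala_ala = per_mille_to_fraction(9.305)  # AlaAla entry from the table
expected = 1 / 400                      # uniform expectation over 400 dipeptides

print(f"AlaAla: {ala_ala:.6f} as a fraction, {ala_ala * 100:.4f}%")
print(f"uniform expectation: {expected * 1000:.1f} per mille")
print(f"ratio to expectation: {ala_ala / expected:.2f}")
```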

For more information see sequence space article on Wikipedia.

Ala Cys Asp Glu Phe Gly His Ile Lys Leu Met Asn Pro Gln Arg Ser Thr Val Trp Tyr Xaa
Ala
AlaAla: 9.305 ± 4.368
AlaCys: 0.0 ± 0.0
AlaAsp: 3.831 ± 1.775
AlaGlu: 2.189 ± 1.559
AlaPhe: 3.284 ± 0.866
AlaGly: 6.021 ± 4.536
AlaHis: 1.642 ± 0.608
AlaIle: 1.642 ± 1.647
AlaLys: 2.737 ± 0.883
AlaLeu: 8.21 ± 2.438
AlaMet: 0.0 ± 0.0
AlaAsn: 1.095 ± 0.443
AlaPro: 2.737 ± 1.279
AlaGln: 1.095 ± 1.098
AlaArg: 1.642 ± 0.926
AlaSer: 12.042 ± 2.787
AlaThr: 2.737 ± 1.243
AlaVal: 3.831 ± 1.129
AlaTrp: 0.547 ± 0.478
AlaTyr: 3.284 ± 0.772
AlaXaa: 0.0 ± 0.0
Cys
CysAla: 0.0 ± 0.0
CysCys: 0.0 ± 0.0
CysAsp: 0.547 ± 0.722
CysGlu: 0.547 ± 0.478
CysPhe: 0.547 ± 0.478
CysGly: 1.095 ± 0.956
CysHis: 0.0 ± 0.0
CysIle: 0.0 ± 0.0
CysLys: 0.0 ± 0.0
CysLeu: 1.642 ± 1.066
CysMet: 0.0 ± 0.0
CysAsn: 0.547 ± 0.722
CysPro: 0.547 ± 0.372
CysGln: 0.0 ± 0.0
CysArg: 0.0 ± 0.0
CysSer: 0.547 ± 0.478
CysThr: 0.0 ± 0.0
CysVal: 0.0 ± 0.0
CysTrp: 0.0 ± 0.0
CysTyr: 1.095 ± 0.651
CysXaa: 0.0 ± 0.0
Asp
AspAla: 2.737 ± 2.091
AspCys: 0.547 ± 0.478
AspAsp: 2.737 ± 1.49
AspGlu: 2.189 ± 0.462
AspPhe: 6.568 ± 2.219
AspGly: 1.642 ± 1.0
AspHis: 0.547 ± 0.372
AspIle: 2.189 ± 0.758
AspLys: 4.379 ± 1.785
AspLeu: 6.568 ± 1.801
AspMet: 4.379 ± 1.411
AspAsn: 1.642 ± 0.608
AspPro: 2.189 ± 1.556
AspGln: 0.547 ± 0.615
AspArg: 3.284 ± 1.777
AspSer: 2.737 ± 1.383
AspThr: 3.284 ± 1.792
AspVal: 6.568 ± 1.36
AspTrp: 2.189 ± 0.462
AspTyr: 4.926 ± 1.477
AspXaa: 0.0 ± 0.0
Glu
GluAla: 3.831 ± 0.59
GluCys: 0.547 ± 0.478
GluAsp: 4.379 ± 2.709
GluGlu: 2.189 ± 1.198
GluPhe: 2.737 ± 0.718
GluGly: 0.547 ± 0.549
GluHis: 0.547 ± 0.372
GluIle: 1.642 ± 0.584
GluLys: 2.189 ± 1.397
GluLeu: 3.831 ± 1.984
GluMet: 1.095 ± 0.443
GluAsn: 1.095 ± 0.443
GluPro: 2.189 ± 1.074
GluGln: 2.189 ± 0.928
GluArg: 6.021 ± 2.885
GluSer: 1.095 ± 0.828
GluThr: 1.095 ± 0.51
GluVal: 2.737 ± 0.432
GluTrp: 0.0 ± 0.0
GluTyr: 4.379 ± 1.197
GluXaa: 0.0 ± 0.0
Phe
PheAla: 4.926 ± 1.675
PheCys: 0.547 ± 0.478
PheAsp: 4.379 ± 1.533
PheGlu: 3.831 ± 1.207
PhePhe: 3.284 ± 1.251
PheGly: 4.926 ± 2.296
PheHis: 1.095 ± 0.51
PheIle: 3.284 ± 1.091
PheLys: 2.737 ± 1.844
PheLeu: 6.021 ± 2.132
PheMet: 0.547 ± 0.372
PheAsn: 3.831 ± 2.139
PhePro: 0.0 ± 0.0
PheGln: 2.189 ± 1.16
PheArg: 2.737 ± 0.811
PheSer: 6.021 ± 2.029
PheThr: 2.189 ± 1.074
PheVal: 4.379 ± 1.484
PheTrp: 2.737 ± 1.089
PheTyr: 2.189 ± 1.141
PheXaa: 0.0 ± 0.0
Gly
GlyAla: 4.926 ± 2.128
GlyCys: 0.547 ± 0.615
GlyAsp: 1.642 ± 0.916
GlyGlu: 2.737 ± 1.258
GlyPhe: 3.284 ± 1.155
GlyGly: 7.115 ± 1.455
GlyHis: 1.095 ± 1.098
GlyIle: 3.831 ± 1.374
GlyLys: 4.379 ± 1.157
GlyLeu: 4.926 ± 1.454
GlyMet: 1.642 ± 0.926
GlyAsn: 1.095 ± 0.443
GlyPro: 1.095 ± 0.443
GlyGln: 4.379 ± 1.244
GlyArg: 2.737 ± 0.86
GlySer: 5.473 ± 1.639
GlyThr: 1.095 ± 0.745
GlyVal: 7.663 ± 1.703
GlyTrp: 1.095 ± 0.443
GlyTyr: 4.926 ± 1.421
GlyXaa: 0.0 ± 0.0
His
HisAla: 0.547 ± 0.478
HisCys: 0.547 ± 0.722
HisAsp: 0.547 ± 0.549
HisGlu: 0.0 ± 0.0
HisPhe: 1.095 ± 0.727
HisGly: 1.095 ± 0.6
HisHis: 0.0 ± 0.0
HisIle: 0.0 ± 0.0
HisLys: 0.0 ± 0.0
HisLeu: 0.0 ± 0.0
HisMet: 0.0 ± 0.0
HisAsn: 0.0 ± 0.0
HisPro: 2.189 ± 1.074
HisGln: 1.095 ± 0.645
HisArg: 1.642 ± 0.754
HisSer: 0.547 ± 0.372
HisThr: 0.547 ± 0.549
HisVal: 1.642 ± 0.608
HisTrp: 0.0 ± 0.0
HisTyr: 0.547 ± 0.372
HisXaa: 0.0 ± 0.0
Ile
IleAla: 3.284 ± 0.866
IleCys: 0.0 ± 0.0
IleAsp: 3.831 ± 1.669
IleGlu: 1.095 ± 0.726
IlePhe: 2.737 ± 1.844
IleGly: 2.189 ± 1.074
IleHis: 0.0 ± 0.0
IleIle: 1.642 ± 0.584
IleLys: 2.189 ± 0.89
IleLeu: 2.189 ± 0.975
IleMet: 1.642 ± 0.942
IleAsn: 0.0 ± 0.0
IlePro: 2.189 ± 0.462
IleGln: 1.095 ± 0.443
IleArg: 3.831 ± 1.045
IleSer: 4.926 ± 2.501
IleThr: 3.831 ± 1.527
IleVal: 3.284 ± 1.085
IleTrp: 0.547 ± 0.478
IleTyr: 2.189 ± 0.758
IleXaa: 0.0 ± 0.0
Lys
LysAla: 2.737 ± 1.344
LysCys: 0.0 ± 0.0
LysAsp: 3.831 ± 1.197
LysGlu: 3.831 ± 1.288
LysPhe: 3.284 ± 1.394
LysGly: 2.737 ± 0.726
LysHis: 1.642 ± 0.721
LysIle: 2.189 ± 1.076
LysLys: 1.095 ± 1.006
LysLeu: 3.831 ± 1.374
LysMet: 1.095 ± 0.443
LysAsn: 1.095 ± 0.443
LysPro: 0.547 ± 0.693
LysGln: 1.642 ± 0.926
LysArg: 1.642 ± 1.206
LysSer: 6.568 ± 1.176
LysThr: 2.189 ± 0.747
LysVal: 3.831 ± 0.862
LysTrp: 0.0 ± 0.0
LysTyr: 3.831 ± 1.819
LysXaa: 0.0 ± 0.0
Leu
LeuAla: 5.473 ± 1.376
LeuCys: 0.547 ± 0.615
LeuAsp: 4.379 ± 1.849
LeuGlu: 5.473 ± 2.285
LeuPhe: 4.926 ± 1.291
LeuGly: 6.021 ± 1.381
LeuHis: 2.189 ± 1.02
LeuIle: 3.831 ± 1.85
LeuLys: 3.831 ± 2.173
LeuLeu: 6.021 ± 2.618
LeuMet: 3.831 ± 0.872
LeuAsn: 4.379 ± 1.199
LeuPro: 5.473 ± 2.948
LeuGln: 5.473 ± 0.691
LeuArg: 6.568 ± 2.248
LeuSer: 9.305 ± 1.806
LeuThr: 3.831 ± 0.761
LeuVal: 3.284 ± 1.136
LeuTrp: 2.189 ± 1.369
LeuTyr: 2.737 ± 1.246
LeuXaa: 0.0 ± 0.0
Met
MetAla: 1.095 ± 0.443
MetCys: 0.547 ± 0.478
MetAsp: 0.547 ± 0.478
MetGlu: 1.642 ± 0.965
MetPhe: 1.095 ± 0.443
MetGly: 1.642 ± 0.608
MetHis: 0.547 ± 0.478
MetIle: 0.547 ± 0.693
MetLys: 1.095 ± 1.098
MetLeu: 1.642 ± 0.942
MetMet: 0.0 ± 0.0
MetAsn: 2.189 ± 0.462
MetPro: 2.737 ± 1.243
MetGln: 0.547 ± 0.615
MetArg: 0.547 ± 0.372
MetSer: 2.737 ± 0.917
MetThr: 0.547 ± 0.549
MetVal: 1.642 ± 0.683
MetTrp: 0.0 ± 0.0
MetTyr: 1.095 ± 0.443
MetXaa: 0.0 ± 0.0
Asn
AsnAla: 3.831 ± 1.124
AsnCys: 0.0 ± 0.0
AsnAsp: 2.737 ± 0.679
AsnGlu: 2.189 ± 0.887
AsnPhe: 3.284 ± 1.222
AsnGly: 2.189 ± 1.057
AsnHis: 0.547 ± 0.372
AsnIle: 0.0 ± 0.0
AsnLys: 2.737 ± 1.42
AsnLeu: 4.926 ± 0.619
AsnMet: 0.547 ± 0.372
AsnAsn: 1.642 ± 0.864
AsnPro: 4.926 ± 1.588
AsnGln: 1.642 ± 0.385
AsnArg: 3.284 ± 1.088
AsnSer: 4.926 ± 1.94
AsnThr: 1.642 ± 0.926
AsnVal: 1.642 ± 0.864
AsnTrp: 1.642 ± 1.647
AsnTyr: 1.642 ± 0.864
AsnXaa: 0.0 ± 0.0
Pro
ProAla: 2.737 ± 1.408
ProCys: 1.095 ± 0.51
ProAsp: 4.379 ± 2.114
ProGlu: 2.737 ± 1.141
ProPhe: 3.284 ± 1.542
ProGly: 5.473 ± 2.09
ProHis: 0.0 ± 0.0
ProIle: 2.737 ± 0.718
ProLys: 2.189 ± 1.16
ProLeu: 4.379 ± 2.053
ProMet: 1.095 ± 0.719
ProAsn: 1.642 ± 0.77
ProPro: 1.642 ± 0.699
ProGln: 1.095 ± 0.745
ProArg: 2.189 ± 1.02
ProSer: 4.926 ± 1.024
ProThr: 1.095 ± 0.443
ProVal: 3.284 ± 0.9
ProTrp: 0.547 ± 0.478
ProTyr: 1.095 ± 0.745
ProXaa: 0.0 ± 0.0
Gln
GlnAla: 2.189 ± 0.754
GlnCys: 0.0 ± 0.0
GlnAsp: 2.737 ± 0.432
GlnGlu: 3.284 ± 0.542
GlnPhe: 1.095 ± 0.443
GlnGly: 2.737 ± 0.999
GlnHis: 0.0 ± 0.0
GlnIle: 0.547 ± 0.372
GlnLys: 1.095 ± 0.6
GlnLeu: 3.831 ± 2.378
GlnMet: 2.189 ± 0.928
GlnAsn: 1.642 ± 0.936
GlnPro: 1.095 ± 0.651
GlnGln: 3.284 ± 1.232
GlnArg: 4.926 ± 1.477
GlnSer: 2.737 ± 0.997
GlnThr: 1.095 ± 0.443
GlnVal: 1.642 ± 0.584
GlnTrp: 0.0 ± 0.0
GlnTyr: 1.095 ± 1.098
GlnXaa: 0.0 ± 0.0
Arg
ArgAla: 2.737 ± 0.912
ArgCys: 0.547 ± 0.478
ArgAsp: 2.189 ± 1.489
ArgGlu: 4.379 ± 1.821
ArgPhe: 4.926 ± 2.213
ArgGly: 1.642 ± 0.699
ArgHis: 0.0 ± 0.0
ArgIle: 4.379 ± 1.517
ArgLys: 3.831 ± 2.278
ArgLeu: 6.021 ± 2.487
ArgMet: 0.547 ± 0.464
ArgAsn: 3.284 ± 1.255
ArgPro: 3.831 ± 1.507
ArgGln: 2.189 ± 1.073
ArgArg: 3.284 ± 1.801
ArgSer: 7.663 ± 1.785
ArgThr: 2.737 ± 1.633
ArgVal: 4.379 ± 1.002
ArgTrp: 0.547 ± 0.549
ArgTyr: 3.831 ± 1.902
ArgXaa: 0.0 ± 0.0
Ser
SerAla: 7.663 ± 4.948
SerCys: 1.095 ± 0.852
SerAsp: 7.663 ± 2.049
SerGlu: 1.095 ± 1.23
SerPhe: 6.568 ± 1.615
SerGly: 9.305 ± 2.06
SerHis: 0.547 ± 0.372
SerIle: 6.568 ± 2.791
SerLys: 3.284 ± 0.958
SerLeu: 7.663 ± 1.221
SerMet: 0.0 ± 0.0
SerAsn: 6.021 ± 1.857
SerPro: 3.284 ± 1.508
SerGln: 2.737 ± 0.746
SerArg: 7.663 ± 1.431
SerSer: 13.684 ± 2.842
SerThr: 2.737 ± 0.679
SerVal: 7.663 ± 2.434
SerTrp: 1.095 ± 0.745
SerTyr: 6.021 ± 1.584
SerXaa: 0.0 ± 0.0
Thr
ThrAla: 4.379 ± 2.457
ThrCys: 0.0 ± 0.0
ThrAsp: 1.642 ± 1.435
ThrGlu: 1.642 ± 1.647
ThrPhe: 0.547 ± 0.372
ThrGly: 2.737 ± 0.679
ThrHis: 0.547 ± 0.579
ThrIle: 3.831 ± 1.184
ThrLys: 1.642 ± 0.385
ThrLeu: 3.284 ± 1.017
ThrMet: 0.547 ± 0.372
ThrAsn: 2.189 ± 0.462
ThrPro: 1.095 ± 0.745
ThrGln: 1.095 ± 0.6
ThrArg: 1.642 ± 0.608
ThrSer: 3.831 ± 1.081
ThrThr: 1.095 ± 0.443
ThrVal: 3.831 ± 1.022
ThrTrp: 0.547 ± 0.372
ThrTyr: 2.737 ± 1.42
ThrXaa: 0.0 ± 0.0
Val
ValAla: 2.737 ± 0.679
ValCys: 0.547 ± 0.372
ValAsp: 3.831 ± 1.065
ValGlu: 2.737 ± 0.835
ValPhe: 2.189 ± 1.02
ValGly: 4.379 ± 1.453
ValHis: 0.547 ± 0.722
ValIle: 2.189 ± 1.176
ValLys: 4.926 ± 2.022
ValLeu: 7.663 ± 1.143
ValMet: 1.095 ± 0.745
ValAsn: 7.663 ± 1.905
ValPro: 6.021 ± 1.768
ValGln: 2.189 ± 0.754
ValArg: 3.284 ± 1.203
ValSer: 7.115 ± 1.984
ValThr: 2.189 ± 0.747
ValVal: 4.926 ± 1.891
ValTrp: 0.547 ± 0.478
ValTyr: 3.831 ± 1.814
ValXaa: 0.0 ± 0.0
Trp
TrpAla: 0.547 ± 0.549
TrpCys: 0.0 ± 0.0
TrpAsp: 1.642 ± 0.916
TrpGlu: 0.0 ± 0.0
TrpPhe: 1.095 ± 0.6
TrpGly: 0.547 ± 0.372
TrpHis: 0.0 ± 0.0
TrpIle: 1.095 ± 0.956
TrpLys: 1.095 ± 0.443
TrpLeu: 1.095 ± 0.745
TrpMet: 0.0 ± 0.0
TrpAsn: 1.642 ± 0.608
TrpPro: 0.0 ± 0.0
TrpGln: 0.547 ± 0.372
TrpArg: 0.547 ± 0.372
TrpSer: 0.547 ± 0.478
TrpThr: 2.189 ± 0.696
TrpVal: 1.642 ± 0.754
TrpTrp: 0.0 ± 0.0
TrpTyr: 1.095 ± 0.6
TrpXaa: 0.0 ± 0.0
Tyr
TyrAla: 2.189 ± 1.16
TyrCys: 0.0 ± 0.0
TyrAsp: 4.379 ± 1.159
TyrGlu: 0.547 ± 0.615
TyrPhe: 6.021 ± 2.211
TyrGly: 1.642 ± 0.754
TyrHis: 0.547 ± 0.372
TyrIle: 0.547 ± 0.372
TyrLys: 2.189 ± 0.462
TyrLeu: 6.021 ± 2.352
TyrMet: 1.642 ± 0.831
TyrAsn: 3.284 ± 0.749
TyrPro: 3.831 ± 1.852
TyrGln: 2.189 ± 0.8
TyrArg: 6.021 ± 1.917
TyrSer: 4.926 ± 1.07
TyrThr: 2.737 ± 0.746
TyrVal: 2.737 ± 1.193
TyrTrp: 1.095 ± 0.745
TyrTyr: 3.284 ± 0.641
TyrXaa: 0.0 ± 0.0
Xaa
XaaAla: 0.0 ± 0.0
XaaCys: 0.0 ± 0.0
XaaAsp: 0.0 ± 0.0
XaaGlu: 0.0 ± 0.0
XaaPhe: 0.0 ± 0.0
XaaGly: 0.0 ± 0.0
XaaHis: 0.0 ± 0.0
XaaIle: 0.0 ± 0.0
XaaLys: 0.0 ± 0.0
XaaLeu: 0.0 ± 0.0
XaaMet: 0.0 ± 0.0
XaaAsn: 0.0 ± 0.0
XaaPro: 0.0 ± 0.0
XaaGln: 0.0 ± 0.0
XaaArg: 0.0 ± 0.0
XaaSer: 0.0 ± 0.0
XaaThr: 0.0 ± 0.0
XaaVal: 0.0 ± 0.0
XaaTrp: 0.0 ± 0.0
XaaTyr: 0.0 ± 0.0
XaaXaa: 0.0 ± 0.0
Statistics based on 7 proteins (1828 amino acids)

Note: The error was estimated by bootstrapping (×100) at the protein level.
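A protein-level bootstrap can be sketched roughly as below: whole proteins are resampled with replacement, the statistic is recomputed on each resample, and the spread across resamples serves as the error. This is an illustration under assumptions, not the database's code. The function names, the toy sequences, the fixed seed, and the choice of the standard deviation as the error measure are all hypothetical.

```python
import random
import statistics

def pair_frequency(proteins, pair):
    """Per mille frequency of one dipeptide (one-letter codes) in a set of proteins."""
    hits = total = 0
    for seq in proteins:
        for i in range(len(seq) - 1):
            total += 1
            hits += seq[i:i + 2] == pair  # True counts as 1
    return 1000 * hits / total if total else 0.0

def bootstrap_error(proteins, pair, n_resamples=100, seed=0):
    """Resample whole proteins with replacement.

    Returns (estimate on the original set, std. dev. of the resampled estimates).
    """
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    estimate = pair_frequency(proteins, pair)
    resampled = [
        pair_frequency(rng.choices(proteins, k=len(proteins)), pair)
        for _ in range(n_resamples)
    ]
    return estimate, statistics.pstdev(resampled)

proteins = ["MKVLAAGSS", "MSSLAXK", "MAAAK"]  # toy, made-up sequences
est, err = bootstrap_error(proteins, "AA")
print(f"AA: {est:.3f} ± {err:.3f} per mille")
```

Resampling proteins rather than individual residues keeps each sequence intact, so the error reflects variation between proteins. With only 7 proteins in this proteome, such estimates are necessarily coarse.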

The dipeptide statistics above (along with other statistics for this proteome) can be downloaded from this CSV file
See this proteome in: uniprot_link
Proteome-pI is available under Creative Commons Attribution-NoDerivs license, for more details see here

Reference: Kozlowski LP. Proteome-pI 2.0: Proteome Isoelectric Point Database Update. Nucleic Acids Res. 2021, doi: 10.1093/nar/gkab944
Contact: Lukasz P. Kozlowski