Amino acid dipeptide frequency for Branchiostoma belcheri (Amphioxus)

Apart from single amino acid frequencies, one can also calculate the so-called amino acid dipeptide frequency.

There are 400 possible dipeptides (441 if we consider Xaa as an additional 21st amino acid representing all non-standard or unknown amino acids). Thus, if dipeptides occurred randomly in proteins, each of the 400 standard dipeptides would be present at a frequency of 0.25% (1/400, i.e. 2.5‰). As this is not the case in nature, for better visibility all overrepresented dipeptides are marked in red and all underrepresented ones in blue in the table.
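As a rough illustration of how such frequencies can be computed and compared with the uniform 1/400 baseline, here is a minimal sketch. It assumes that dipeptides are counted as overlapping adjacent pairs within each protein (the page does not state the exact counting scheme), uses one-letter codes for brevity, and runs on made-up toy sequences rather than the actual proteome. The function name `dipeptide_per_mille` is hypothetical.

```python
from collections import Counter

def dipeptide_per_mille(sequences):
    """Return overlapping dipeptide frequencies in per mille.

    Assumption: pairs are counted within each protein only,
    never across the boundary between two proteins.
    """
    counts = Counter()
    for seq in sequences:
        for i in range(len(seq) - 1):
            counts[seq[i:i + 2]] += 1
    total = sum(counts.values())
    if total == 0:
        return {}
    return {pair: 1000.0 * n / total for pair, n in counts.items()}

# Uniform baseline from the text: 1/400 = 0.25% = 2.5 per mille.
UNIFORM_EXPECTATION = 1000.0 / 400

# Toy input (not real Branchiostoma belcheri sequences).
toy_proteins = ["MAAL", "AALK"]
freqs = dipeptide_per_mille(toy_proteins)

for pair, value in sorted(freqs.items()):
    label = "over" if value > UNIFORM_EXPECTATION else "under"
    print(pair, round(value, 1), label)
```

With only a handful of pairs every observed dipeptide is far above 2.5‰, so the comparison only becomes meaningful on a full proteome.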

All values are given in per mille (‰), so they need to be multiplied by 10^-3 to obtain fractions.
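The unit conversion can be sketched as follows, using the AlaAla value from the table and the totals from the statistics line of this page. The estimate of the number of dipeptide positions (amino acids minus proteins) is an assumption: it holds only if every adjacent pair inside a protein is counted once and no protein is empty.

```python
# Values copied from this page.
ALA_ALA_PER_MILLE = 6.587
N_PROTEINS = 31614
N_AMINO_ACIDS = 20550264

def per_mille_to_fraction(value):
    """Convert a per mille value to a plain fraction."""
    return value * 1e-3

fraction = per_mille_to_fraction(ALA_ALA_PER_MILLE)
percent = fraction * 100

# Assumption: a protein of length L contributes L - 1 overlapping pairs.
n_dipeptides = N_AMINO_ACIDS - N_PROTEINS
approx_count = round(fraction * n_dipeptides)

print(f"AlaAla: {fraction:.6f} as a fraction, {percent:.4f}%")
print(f"Approximate number of AlaAla occurrences: {approx_count}")
```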

For more information, see the sequence space article on Wikipedia.

Each row label is the first amino acid of the dipeptide; entries within a row follow the second amino acid in the order Ala Cys Asp Glu Phe Gly His Ile Lys Leu Met Asn Pro Gln Arg Ser Thr Val Trp Tyr Xaa. Values are frequency ± error in ‰.

Ala
AlaAla: 6.587 ± 0.039
AlaCys: 1.329 ± 0.012
AlaAsp: 3.887 ± 0.018
AlaGlu: 4.741 ± 0.029
AlaPhe: 2.204 ± 0.013
AlaGly: 4.866 ± 0.027
AlaHis: 1.372 ± 0.009
AlaIle: 2.964 ± 0.015
AlaLys: 3.585 ± 0.022
AlaLeu: 5.573 ± 0.024
AlaMet: 1.834 ± 0.012
AlaAsn: 2.489 ± 0.013
AlaPro: 3.518 ± 0.022
AlaGln: 2.961 ± 0.016
AlaArg: 3.541 ± 0.019
AlaSer: 5.186 ± 0.021
AlaThr: 4.545 ± 0.03
AlaVal: 5.123 ± 0.024
AlaTrp: 0.735 ± 0.008
AlaTyr: 1.548 ± 0.009
AlaXaa: 0.002 ± 0.0

Cys
CysAla: 1.376 ± 0.018
CysCys: 0.555 ± 0.009
CysAsp: 1.766 ± 0.038
CysGlu: 1.476 ± 0.024
CysPhe: 0.665 ± 0.007
CysGly: 1.577 ± 0.017
CysHis: 0.687 ± 0.015
CysIle: 0.871 ± 0.011
CysLys: 1.049 ± 0.01
CysLeu: 1.794 ± 0.016
CysMet: 0.451 ± 0.006
CysAsn: 1.069 ± 0.02
CysPro: 1.384 ± 0.031
CysGln: 1.144 ± 0.018
CysArg: 1.252 ± 0.012
CysSer: 1.789 ± 0.02
CysThr: 1.777 ± 0.04
CysVal: 1.555 ± 0.022
CysTrp: 0.255 ± 0.005
CysTyr: 0.581 ± 0.007
CysXaa: 0.001 ± 0.0

Asp
AspAla: 3.528 ± 0.018
AspCys: 1.297 ± 0.016
AspAsp: 4.299 ± 0.033
AspGlu: 4.244 ± 0.027
AspPhe: 2.152 ± 0.013
AspGly: 4.725 ± 0.043
AspHis: 1.305 ± 0.011
AspIle: 3.163 ± 0.02
AspLys: 3.022 ± 0.016
AspLeu: 4.935 ± 0.022
AspMet: 1.57 ± 0.01
AspAsn: 2.512 ± 0.02
AspPro: 2.979 ± 0.02
AspGln: 2.386 ± 0.014
AspArg: 3.144 ± 0.015
AspSer: 4.348 ± 0.021
AspThr: 3.323 ± 0.017
AspVal: 4.241 ± 0.019
AspTrp: 0.726 ± 0.007
AspTyr: 1.722 ± 0.013
AspXaa: 0.002 ± 0.0

Glu
GluAla: 4.64 ± 0.03
GluCys: 1.573 ± 0.03
GluAsp: 4.82 ± 0.026
GluGlu: 7.554 ± 0.062
GluPhe: 1.906 ± 0.012
GluGly: 4.47 ± 0.021
GluHis: 1.409 ± 0.01
GluIle: 2.781 ± 0.015
GluLys: 4.726 ± 0.035
GluLeu: 5.386 ± 0.028
GluMet: 1.697 ± 0.011
GluAsn: 2.86 ± 0.015
GluPro: 2.759 ± 0.025
GluGln: 3.033 ± 0.021
GluArg: 3.855 ± 0.023
GluSer: 4.069 ± 0.026
GluThr: 3.979 ± 0.028
GluVal: 4.371 ± 0.024
GluTrp: 0.676 ± 0.007
GluTyr: 1.644 ± 0.011
GluXaa: 0.003 ± 0.0

Phe
PheAla: 1.978 ± 0.012
PheCys: 0.798 ± 0.009
PheAsp: 1.881 ± 0.011
PheGlu: 1.8 ± 0.011
PhePhe: 1.341 ± 0.012
PheGly: 2.322 ± 0.015
PheHis: 0.903 ± 0.008
PheIle: 1.492 ± 0.01
PheLys: 1.588 ± 0.01
PheLeu: 3.172 ± 0.019
PheMet: 0.736 ± 0.007
PheAsn: 1.396 ± 0.01
PhePro: 1.681 ± 0.013
PheGln: 1.505 ± 0.011
PheArg: 1.806 ± 0.013
PheSer: 2.757 ± 0.014
PheThr: 2.231 ± 0.019
PheVal: 2.233 ± 0.015
PheTrp: 0.45 ± 0.006
PheTyr: 1.087 ± 0.009
PheXaa: 0.002 ± 0.0

Gly
GlyAla: 4.193 ± 0.024
GlyCys: 1.327 ± 0.014
GlyAsp: 4.233 ± 0.028
GlyGlu: 4.346 ± 0.025
GlyPhe: 2.407 ± 0.018
GlyGly: 5.508 ± 0.044
GlyHis: 1.641 ± 0.014
GlyIle: 2.654 ± 0.017
GlyLys: 3.749 ± 0.021
GlyLeu: 5.009 ± 0.022
GlyMet: 1.562 ± 0.012
GlyAsn: 2.923 ± 0.019
GlyPro: 3.408 ± 0.045
GlyGln: 3.13 ± 0.02
GlyArg: 4.009 ± 0.023
GlySer: 5.701 ± 0.032
GlyThr: 4.554 ± 0.03
GlyVal: 4.272 ± 0.026
GlyTrp: 0.905 ± 0.01
GlyTyr: 2.519 ± 0.03
GlyXaa: 0.004 ± 0.0

His
HisAla: 1.403 ± 0.01
HisCys: 0.582 ± 0.008
HisAsp: 1.27 ± 0.01
HisGlu: 1.241 ± 0.009
HisPhe: 0.855 ± 0.007
HisGly: 1.636 ± 0.012
HisHis: 0.874 ± 0.009
HisIle: 1.083 ± 0.009
HisLys: 1.184 ± 0.008
HisLeu: 2.282 ± 0.014
HisMet: 0.686 ± 0.011
HisAsn: 0.995 ± 0.011
HisPro: 1.477 ± 0.012
HisGln: 1.103 ± 0.009
HisArg: 1.495 ± 0.012
HisSer: 1.724 ± 0.01
HisThr: 1.543 ± 0.015
HisVal: 1.631 ± 0.011
HisTrp: 0.292 ± 0.004
HisTyr: 0.752 ± 0.008
HisXaa: 0.001 ± 0.0

Ile
IleAla: 2.763 ± 0.014
IleCys: 0.978 ± 0.009
IleAsp: 2.386 ± 0.018
IleGlu: 2.365 ± 0.014
IlePhe: 1.525 ± 0.012
IleGly: 2.462 ± 0.015
IleHis: 1.061 ± 0.009
IleIle: 2.027 ± 0.013
IleLys: 2.168 ± 0.014
IleLeu: 3.64 ± 0.017
IleMet: 0.946 ± 0.008
IleAsn: 1.729 ± 0.011
IlePro: 2.554 ± 0.014
IleGln: 2.023 ± 0.013
IleArg: 2.35 ± 0.012
IleSer: 3.212 ± 0.016
IleThr: 2.79 ± 0.018
IleVal: 2.74 ± 0.016
IleTrp: 0.423 ± 0.005
IleTyr: 1.184 ± 0.01
IleXaa: 0.001 ± 0.0

Lys
LysAla: 3.779 ± 0.025
LysCys: 1.046 ± 0.013
LysAsp: 3.359 ± 0.018
LysGlu: 4.538 ± 0.035
LysPhe: 1.57 ± 0.013
LysGly: 3.148 ± 0.022
LysHis: 1.294 ± 0.011
LysIle: 2.207 ± 0.015
LysLys: 4.355 ± 0.031
LysLeu: 4.577 ± 0.021
LysMet: 1.405 ± 0.01
LysAsn: 1.998 ± 0.012
LysPro: 2.993 ± 0.028
LysGln: 2.499 ± 0.017
LysArg: 3.27 ± 0.016
LysSer: 3.631 ± 0.022
LysThr: 3.416 ± 0.019
LysVal: 3.465 ± 0.02
LysTrp: 0.579 ± 0.006
LysTyr: 1.469 ± 0.012
LysXaa: 0.003 ± 0.0

Leu
LeuAla: 5.68 ± 0.024
LeuCys: 1.733 ± 0.013
LeuAsp: 4.702 ± 0.019
LeuGlu: 5.765 ± 0.033
LeuPhe: 2.735 ± 0.016
LeuGly: 4.944 ± 0.024
LeuHis: 2.193 ± 0.013
LeuIle: 2.923 ± 0.015
LeuLys: 4.925 ± 0.022
LeuLeu: 7.638 ± 0.034
LeuMet: 1.883 ± 0.012
LeuAsn: 3.176 ± 0.015
LeuPro: 4.839 ± 0.024
LeuGln: 4.963 ± 0.032
LeuArg: 4.986 ± 0.02
LeuSer: 6.488 ± 0.023
LeuThr: 5.125 ± 0.021
LeuVal: 5.194 ± 0.021
LeuTrp: 0.919 ± 0.008
LeuTyr: 2.358 ± 0.015
LeuXaa: 0.003 ± 0.0

Met
MetAla: 2.214 ± 0.013
MetCys: 0.485 ± 0.006
MetAsp: 1.436 ± 0.008
MetGlu: 1.845 ± 0.011
MetPhe: 0.86 ± 0.006
MetGly: 1.42 ± 0.01
MetHis: 0.443 ± 0.005
MetIle: 0.809 ± 0.007
MetLys: 1.384 ± 0.009
MetLeu: 1.874 ± 0.011
MetMet: 0.681 ± 0.007
MetAsn: 0.856 ± 0.008
MetPro: 1.253 ± 0.011
MetGln: 1.051 ± 0.009
MetArg: 1.248 ± 0.01
MetSer: 1.877 ± 0.01
MetThr: 1.519 ± 0.01
MetVal: 1.5 ± 0.01
MetTrp: 0.292 ± 0.004
MetTyr: 0.763 ± 0.007
MetXaa: 0.001 ± 0.0

Asn
AsnAla: 2.365 ± 0.015
AsnCys: 1.174 ± 0.028
AsnAsp: 1.985 ± 0.014
AsnGlu: 2.107 ± 0.014
AsnPhe: 1.417 ± 0.009
AsnGly: 3.184 ± 0.025
AsnHis: 0.907 ± 0.008
AsnIle: 2.169 ± 0.014
AsnLys: 2.112 ± 0.012
AsnLeu: 3.494 ± 0.02
AsnMet: 1.083 ± 0.008
AsnAsn: 2.107 ± 0.017
AsnPro: 2.339 ± 0.02
AsnGln: 1.665 ± 0.011
AsnArg: 2.099 ± 0.014
AsnSer: 2.922 ± 0.017
AsnThr: 2.662 ± 0.024
AsnVal: 2.565 ± 0.015
AsnTrp: 0.437 ± 0.006
AsnTyr: 1.163 ± 0.01
AsnXaa: 0.002 ± 0.0

Pro
ProAla: 4.342 ± 0.026
ProCys: 1.16 ± 0.022
ProAsp: 3.323 ± 0.019
ProGlu: 3.672 ± 0.026
ProPhe: 1.596 ± 0.01
ProGly: 4.597 ± 0.057
ProHis: 1.272 ± 0.01
ProIle: 1.884 ± 0.014
ProLys: 2.604 ± 0.019
ProLeu: 3.704 ± 0.02
ProMet: 1.112 ± 0.011
ProAsn: 2.11 ± 0.015
ProPro: 5.442 ± 0.043
ProGln: 2.476 ± 0.018
ProArg: 2.901 ± 0.021
ProSer: 4.785 ± 0.028
ProThr: 4.294 ± 0.04
ProVal: 3.82 ± 0.022
ProTrp: 0.604 ± 0.006
ProTyr: 1.624 ± 0.019
ProXaa: 0.006 ± 0.001

Gln
GlnAla: 3.504 ± 0.017
GlnCys: 1.017 ± 0.016
GlnAsp: 2.605 ± 0.015
GlnGlu: 3.513 ± 0.023
GlnPhe: 1.357 ± 0.012
GlnGly: 2.947 ± 0.021
GlnHis: 1.28 ± 0.011
GlnIle: 1.714 ± 0.011
GlnLys: 2.495 ± 0.02
GlnLeu: 4.191 ± 0.023
GlnMet: 1.102 ± 0.009
GlnAsn: 1.837 ± 0.013
GlnPro: 2.752 ± 0.022
GlnGln: 3.61 ± 0.036
GlnArg: 2.679 ± 0.015
GlnSer: 3.153 ± 0.019
GlnThr: 2.929 ± 0.019
GlnVal: 2.9 ± 0.015
GlnTrp: 0.531 ± 0.006
GlnTyr: 1.287 ± 0.012
GlnXaa: 0.002 ± 0.0

Arg
ArgAla: 3.457 ± 0.018
ArgCys: 1.28 ± 0.018
ArgAsp: 3.183 ± 0.017
ArgGlu: 3.87 ± 0.02
ArgPhe: 1.745 ± 0.011
ArgGly: 3.439 ± 0.022
ArgHis: 1.556 ± 0.012
ArgIle: 2.206 ± 0.012
ArgLys: 3.649 ± 0.021
ArgLeu: 4.844 ± 0.022
ArgMet: 1.304 ± 0.009
ArgAsn: 2.198 ± 0.014
ArgPro: 3.034 ± 0.019
ArgGln: 2.857 ± 0.018
ArgArg: 4.49 ± 0.031
ArgSer: 4.058 ± 0.026
ArgThr: 3.427 ± 0.021
ArgVal: 3.325 ± 0.017
ArgTrp: 0.745 ± 0.009
ArgTyr: 1.646 ± 0.011
ArgXaa: 0.003 ± 0.0

Ser
SerAla: 5.249 ± 0.022
SerCys: 1.766 ± 0.021
SerAsp: 4.446 ± 0.024
SerGlu: 4.346 ± 0.025
SerPhe: 2.598 ± 0.019
SerGly: 5.534 ± 0.028
SerHis: 1.823 ± 0.013
SerIle: 2.862 ± 0.015
SerLys: 3.545 ± 0.02
SerLeu: 6.487 ± 0.024
SerMet: 1.672 ± 0.011
SerAsn: 2.797 ± 0.016
SerPro: 5.276 ± 0.037
SerGln: 3.467 ± 0.02
SerArg: 4.234 ± 0.024
SerSer: 8.156 ± 0.044
SerThr: 5.525 ± 0.043
SerVal: 4.803 ± 0.019
SerTrp: 0.901 ± 0.009
SerTyr: 2.1 ± 0.013
SerXaa: 0.003 ± 0.0

Thr
ThrAla: 4.886 ± 0.023
ThrCys: 2.42 ± 0.066
ThrAsp: 3.969 ± 0.042
ThrGlu: 4.06 ± 0.025
ThrPhe: 2.266 ± 0.013
ThrGly: 4.727 ± 0.035
ThrHis: 1.474 ± 0.015
ThrIle: 2.82 ± 0.02
ThrLys: 2.934 ± 0.019
ThrLeu: 5.297 ± 0.03
ThrMet: 1.476 ± 0.011
ThrAsn: 2.386 ± 0.016
ThrPro: 4.14 ± 0.025
ThrGln: 2.565 ± 0.02
ThrArg: 2.912 ± 0.015
ThrSer: 5.606 ± 0.031
ThrThr: 6.275 ± 0.133
ThrVal: 4.766 ± 0.025
ThrTrp: 0.843 ± 0.009
ThrTyr: 1.709 ± 0.012
ThrXaa: 0.002 ± 0.0

Val
ValAla: 4.419 ± 0.019
ValCys: 1.611 ± 0.017
ValAsp: 3.755 ± 0.017
ValGlu: 4.304 ± 0.024
ValPhe: 2.441 ± 0.012
ValGly: 3.894 ± 0.019
ValHis: 1.522 ± 0.01
ValIle: 2.863 ± 0.016
ValLys: 3.388 ± 0.021
ValLeu: 5.768 ± 0.019
ValMet: 1.549 ± 0.01
ValAsn: 2.642 ± 0.019
ValPro: 3.681 ± 0.02
ValGln: 3.107 ± 0.016
ValArg: 3.419 ± 0.016
ValSer: 5.044 ± 0.022
ValThr: 4.887 ± 0.035
ValVal: 4.865 ± 0.03
ValTrp: 0.793 ± 0.007
ValTyr: 1.869 ± 0.011
ValXaa: 0.003 ± 0.0

Trp
TrpAla: 0.697 ± 0.007
TrpCys: 0.255 ± 0.004
TrpAsp: 0.696 ± 0.008
TrpGlu: 0.761 ± 0.007
TrpPhe: 0.435 ± 0.005
TrpGly: 0.686 ± 0.008
TrpHis: 0.264 ± 0.004
TrpIle: 0.504 ± 0.005
TrpLys: 0.697 ± 0.008
TrpLeu: 1.106 ± 0.011
TrpMet: 0.314 ± 0.005
TrpAsn: 0.52 ± 0.006
TrpPro: 0.445 ± 0.006
TrpGln: 0.526 ± 0.006
TrpArg: 0.79 ± 0.008
TrpSer: 0.934 ± 0.01
TrpThr: 0.821 ± 0.009
TrpVal: 0.684 ± 0.007
TrpTrp: 0.217 ± 0.004
TrpTyr: 0.397 ± 0.006
TrpXaa: 0.0 ± 0.0

Tyr
TyrAla: 1.528 ± 0.01
TyrCys: 0.685 ± 0.014
TyrAsp: 1.621 ± 0.011
TyrGlu: 1.657 ± 0.013
TyrPhe: 1.084 ± 0.009
TyrGly: 1.964 ± 0.014
TyrHis: 0.824 ± 0.007
TyrIle: 1.328 ± 0.01
TyrLys: 1.441 ± 0.013
TyrLeu: 2.399 ± 0.015
TyrMet: 0.75 ± 0.007
TyrAsn: 1.354 ± 0.01
TyrPro: 1.365 ± 0.013
TyrGln: 1.334 ± 0.013
TyrArg: 1.829 ± 0.013
TyrSer: 2.152 ± 0.013
TyrThr: 1.909 ± 0.017
TyrVal: 1.769 ± 0.011
TyrTrp: 0.43 ± 0.006
TyrTyr: 1.071 ± 0.012
TyrXaa: 0.001 ± 0.0

Xaa
XaaAla: 0.002 ± 0.0
XaaCys: 0.001 ± 0.0
XaaAsp: 0.002 ± 0.0
XaaGlu: 0.003 ± 0.0
XaaPhe: 0.002 ± 0.0
XaaGly: 0.005 ± 0.0
XaaHis: 0.001 ± 0.0
XaaIle: 0.002 ± 0.0
XaaLys: 0.003 ± 0.0
XaaLeu: 0.003 ± 0.0
XaaMet: 0.001 ± 0.0
XaaAsn: 0.002 ± 0.0
XaaPro: 0.005 ± 0.001
XaaGln: 0.002 ± 0.0
XaaArg: 0.002 ± 0.0
XaaSer: 0.003 ± 0.0
XaaThr: 0.002 ± 0.0
XaaVal: 0.003 ± 0.0
XaaTrp: 0.001 ± 0.0
XaaTyr: 0.001 ± 0.0
XaaXaa: 0.03 ± 0.006

Statistics based on 31614 proteins (20550264 amino acids)
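
For further analysis, the table entries can be loaded into a nested mapping. The sketch below is one possible way to do this, not the format of the downloadable CSV file (which is not shown on this page). It looks for the pattern `AlaAla: 6.587 ± 0.039` anywhere in a line and skips lines without it, such as row labels. The names `ENTRY` and `parse_entries` are hypothetical.

```python
import re

# Two three-letter residue codes, a colon, the value, "±", the error.
ENTRY = re.compile(
    r"([A-Z][a-z]{2})([A-Z][a-z]{2}):\s*([0-9.]+)\s*±\s*([0-9.]+)"
)

def parse_entries(lines):
    """Map first residue -> second residue -> (per mille value, error)."""
    table = {}
    for line in lines:
        match = ENTRY.search(line)
        if match is None:
            continue  # row labels such as "Ala" carry no value
        first, second, value, error = match.groups()
        table.setdefault(first, {})[second] = (float(value), float(error))
    return table

# A few lines copied from the table above.
sample = [
    "Ala",
    "AlaAla: 6.587 ± 0.039",
    "AlaCys: 1.329 ± 0.012",
    "Cys",
    "CysAla: 1.376 ± 0.018",
]
table = parse_entries(sample)
print(table["Ala"]["Cys"])  # (1.329, 0.012)
print(table["Cys"]["Ala"])  # (1.376, 0.018)
```

The two printed entries show that the table is not symmetric: AlaCys and CysAla are distinct dipeptides with their own frequencies.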

Note: The error was estimated by bootstrapping (×100) at the protein level.
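
A protein-level bootstrap of this kind can be sketched as follows. This is a minimal illustration under stated assumptions, not the database's actual pipeline: whole proteins are resampled with replacement, the dipeptide frequency is recomputed for each of 100 resamples, and the standard deviation across resamples is taken as the error (the page does not say which spread measure is reported). The function names and the toy sequences are hypothetical.

```python
import random
import statistics
from collections import Counter

def pair_counts(seq):
    """Count overlapping dipeptides within one protein."""
    return Counter(seq[i:i + 2] for i in range(len(seq) - 1))

def bootstrap_error(sequences, pair, n_rounds=100, seed=0):
    """Return (mean, standard deviation) of a dipeptide's per mille
    frequency over n_rounds protein-level bootstrap resamples."""
    rng = random.Random(seed)
    per_protein = [pair_counts(s) for s in sequences]
    estimates = []
    for _ in range(n_rounds):
        # Resample whole proteins with replacement, same sample size.
        sample = rng.choices(per_protein, k=len(per_protein))
        hits = sum(c[pair] for c in sample)
        total = sum(sum(c.values()) for c in sample)
        estimates.append(1000.0 * hits / total if total else 0.0)
    return statistics.mean(estimates), statistics.stdev(estimates)

# Toy sequences in one-letter code (not real proteome data).
toy_proteins = ["MAAL", "AALK", "GGAA", "MKLV"]
mean, err = bootstrap_error(toy_proteins, "AA")
print(f"AA: {mean:.1f} ± {err:.1f} per mille")
```

Resampling proteins rather than individual dipeptides keeps the pairs from one protein together, so the error reflects variation between proteins.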

The dipeptide statistics above (among other statistics for this proteome) can be downloaded from this CSV file
See this proteome in: uniprot_link
Proteome-pI is available under a Creative Commons Attribution-NoDerivs license; for more details see here

Reference: Kozlowski LP. Proteome-pI 2.0: Proteome Isoelectric Point Database Update. Nucleic Acids Res. 2021, doi: 10.1093/nar/gkab944

Contact: Lukasz P. Kozlowski