Amino acid dipeptide frequency for Komagataella phaffii (strain GS115 / ATCC 20864) (Yeast) (Pichia pastoris)

Apart from single amino acid frequencies, one can also calculate the so-called amino acid dipeptide frequency.

There are 400 possible dipeptides (441 if Xaa is counted as an additional, 21st amino acid representing all non-standard or unknown amino acids). Thus, if dipeptides occurred in proteins uniformly at random, each of the 400 would have a frequency of 0.25% (2.5‰). This is not the case in nature, so for better visibility dipeptides that occur more often than expected are marked in red in the table, and those that are underrepresented are marked in blue.
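The counting described above can be sketched in a few lines of Python. This is a minimal illustration, not the code used to build the table: the function names, the one-letter toy sequences, and the choice to count overlapping adjacent pairs within each protein are all assumptions made for the example.

```python
from collections import Counter

# Uniform expectation for 400 equally likely dipeptides: 0.25 % = 2.5 per mille.
UNIFORM_EXPECTED_PER_MILLE = 1000 / 400


def dipeptide_per_mille(sequences):
    """Return {dipeptide: frequency in per mille} for a list of protein sequences.

    Assumption: overlapping adjacent pairs are counted inside each protein,
    never across protein boundaries.
    """
    counts = Counter()
    for seq in sequences:
        for first, second in zip(seq, seq[1:]):
            counts[first + second] += 1
    total = sum(counts.values())
    if total == 0:
        return {}
    return {pair: 1000 * n / total for pair, n in counts.items()}


def classify(per_mille, expected=UNIFORM_EXPECTED_PER_MILLE):
    """Label a frequency as over- or underrepresented relative to `expected`."""
    if per_mille > expected:
        return "over"
    if per_mille < expected:
        return "under"
    return "as expected"


# Toy input (hypothetical sequences, one-letter amino acid codes).
toy_proteins = ["MLLK", "LLS"]
toy_frequencies = dipeptide_per_mille(toy_proteins)
```

With the values from the table, `classify(10.234)` (LeuLeu) gives "over" and `classify(0.163)` (TrpTrp) gives "under".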

All values are given in per mille (‰) and therefore need to be multiplied by 10⁻³ to obtain fractions.
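As a small worked example of the unit conversion, the helpers below turn a per mille value into a fraction or a percentage. The function names are hypothetical and exist only for this illustration.

```python
def per_mille_to_fraction(value):
    """Convert a per mille value to a plain fraction (multiply by 10^-3)."""
    return value / 1000


def per_mille_to_percent(value):
    """Convert a per mille value to a percentage (1 per mille = 0.1 %)."""
    return value / 10


# Example with the LeuLeu value from the table (10.234 per mille).
leu_leu_fraction = per_mille_to_fraction(10.234)
leu_leu_percent = per_mille_to_percent(10.234)
```

For instance, the uniform expectation of 2.5‰ corresponds to a fraction of 0.0025, that is 0.25%.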

For more information, see the sequence space article on Wikipedia.

Second residue (column order): Ala Cys Asp Glu Phe Gly His Ile Lys Leu Met Asn Pro Gln Arg Ser Thr Val Trp Tyr Xaa
Ala
AlaAla: 3.801 ± 0.061
AlaCys: 0.668 ± 0.018
AlaAsp: 2.838 ± 0.037
AlaGlu: 3.34 ± 0.049
AlaPhe: 2.31 ± 0.036
AlaGly: 2.926 ± 0.043
AlaHis: 1.101 ± 0.023
AlaIle: 3.65 ± 0.04
AlaLys: 3.792 ± 0.049
AlaLeu: 5.684 ± 0.054
AlaMet: 1.057 ± 0.02
AlaAsn: 2.739 ± 0.032
AlaPro: 2.448 ± 0.04
AlaGln: 2.264 ± 0.042
AlaArg: 2.538 ± 0.036
AlaSer: 4.957 ± 0.053
AlaThr: 3.452 ± 0.044
AlaVal: 3.377 ± 0.044
AlaTrp: 0.505 ± 0.015
AlaTyr: 1.69 ± 0.027
AlaXaa: 0.0 ± 0.0
Cys
CysAla: 0.6 ± 0.016
CysCys: 0.236 ± 0.011
CysAsp: 0.629 ± 0.017
CysGlu: 0.588 ± 0.017
CysPhe: 0.658 ± 0.018
CysGly: 0.762 ± 0.019
CysHis: 0.306 ± 0.014
CysIle: 0.838 ± 0.02
CysLys: 0.687 ± 0.018
CysLeu: 1.331 ± 0.029
CysMet: 0.218 ± 0.009
CysAsn: 0.553 ± 0.015
CysPro: 0.487 ± 0.017
CysGln: 0.424 ± 0.013
CysArg: 0.507 ± 0.014
CysSer: 1.014 ± 0.022
CysThr: 0.566 ± 0.016
CysVal: 0.689 ± 0.016
CysTrp: 0.15 ± 0.008
CysTyr: 0.461 ± 0.014
CysXaa: 0.0 ± 0.0
Asp
AspAla: 2.958 ± 0.037
AspCys: 0.625 ± 0.017
AspAsp: 4.479 ± 0.058
AspGlu: 4.995 ± 0.057
AspPhe: 2.692 ± 0.038
AspGly: 2.867 ± 0.039
AspHis: 1.2 ± 0.022
AspIle: 4.057 ± 0.045
AspLys: 3.441 ± 0.046
AspLeu: 6.006 ± 0.055
AspMet: 1.013 ± 0.019
AspAsn: 2.898 ± 0.037
AspPro: 2.687 ± 0.032
AspGln: 2.298 ± 0.031
AspArg: 2.295 ± 0.03
AspSer: 5.294 ± 0.054
AspThr: 3.001 ± 0.039
AspVal: 3.489 ± 0.041
AspTrp: 0.647 ± 0.019
AspTyr: 2.224 ± 0.029
AspXaa: 0.0 ± 0.0
Glu
GluAla: 3.749 ± 0.048
GluCys: 0.642 ± 0.019
GluAsp: 4.429 ± 0.057
GluGlu: 6.264 ± 0.092
GluPhe: 2.853 ± 0.032
GluGly: 2.935 ± 0.04
GluHis: 1.296 ± 0.024
GluIle: 4.218 ± 0.041
GluLys: 5.253 ± 0.065
GluLeu: 7.134 ± 0.061
GluMet: 1.284 ± 0.024
GluAsn: 3.794 ± 0.042
GluPro: 2.398 ± 0.053
GluGln: 2.79 ± 0.04
GluArg: 3.131 ± 0.044
GluSer: 5.346 ± 0.062
GluThr: 3.83 ± 0.045
GluVal: 3.663 ± 0.046
GluTrp: 0.656 ± 0.017
GluTyr: 2.199 ± 0.035
GluXaa: 0.0 ± 0.0
Phe
PheAla: 2.324 ± 0.035
PheCys: 0.538 ± 0.016
PheAsp: 2.793 ± 0.038
PheGlu: 2.85 ± 0.035
PhePhe: 2.107 ± 0.034
PheGly: 2.718 ± 0.044
PheHis: 1.076 ± 0.023
PheIle: 2.807 ± 0.044
PheLys: 3.047 ± 0.038
PheLeu: 4.298 ± 0.052
PheMet: 0.832 ± 0.019
PheAsn: 2.497 ± 0.032
PhePro: 1.82 ± 0.028
PheGln: 2.029 ± 0.03
PheArg: 1.801 ± 0.031
PheSer: 3.75 ± 0.039
PheThr: 2.472 ± 0.036
PheVal: 2.804 ± 0.042
PheTrp: 0.531 ± 0.016
PheTyr: 1.532 ± 0.027
PheXaa: 0.0 ± 0.0
Gly
GlyAla: 2.992 ± 0.05
GlyCys: 0.717 ± 0.02
GlyAsp: 2.812 ± 0.038
GlyGlu: 2.924 ± 0.035
GlyPhe: 2.555 ± 0.038
GlyGly: 3.239 ± 0.055
GlyHis: 1.117 ± 0.022
GlyIle: 3.465 ± 0.045
GlyLys: 3.543 ± 0.042
GlyLeu: 5.069 ± 0.052
GlyMet: 0.946 ± 0.023
GlyAsn: 2.526 ± 0.034
GlyPro: 1.773 ± 0.03
GlyGln: 1.694 ± 0.028
GlyArg: 2.235 ± 0.034
GlySer: 4.72 ± 0.054
GlyThr: 3.017 ± 0.043
GlyVal: 3.39 ± 0.043
GlyTrp: 0.62 ± 0.017
GlyTyr: 1.921 ± 0.038
GlyXaa: 0.0 ± 0.0
His
HisAla: 0.962 ± 0.023
HisCys: 0.313 ± 0.012
HisAsp: 1.146 ± 0.022
HisGlu: 1.249 ± 0.022
HisPhe: 0.998 ± 0.023
HisGly: 1.134 ± 0.024
HisHis: 0.656 ± 0.024
HisIle: 1.354 ± 0.023
HisLys: 1.305 ± 0.024
HisLeu: 2.241 ± 0.032
HisMet: 0.363 ± 0.014
HisAsn: 1.082 ± 0.021
HisPro: 1.154 ± 0.023
HisGln: 0.979 ± 0.023
HisArg: 1.106 ± 0.027
HisSer: 1.914 ± 0.033
HisThr: 1.063 ± 0.02
HisVal: 1.237 ± 0.023
HisTrp: 0.224 ± 0.009
HisTyr: 0.816 ± 0.023
HisXaa: 0.0 ± 0.0
Ile
IleAla: 3.643 ± 0.044
IleCys: 0.845 ± 0.022
IleAsp: 4.134 ± 0.044
IleGlu: 4.159 ± 0.05
IlePhe: 2.656 ± 0.037
IleGly: 3.308 ± 0.044
IleHis: 1.441 ± 0.025
IleIle: 3.805 ± 0.047
IleLys: 4.158 ± 0.043
IleLeu: 6.178 ± 0.058
IleMet: 1.151 ± 0.023
IleAsn: 3.409 ± 0.042
IlePro: 3.332 ± 0.038
IleGln: 2.585 ± 0.039
IleArg: 2.818 ± 0.035
IleSer: 5.661 ± 0.051
IleThr: 3.46 ± 0.039
IleVal: 3.941 ± 0.042
IleTrp: 0.667 ± 0.02
IleTyr: 2.052 ± 0.027
IleXaa: 0.0 ± 0.0
Lys
LysAla: 3.705 ± 0.049
LysCys: 0.71 ± 0.019
LysAsp: 4.117 ± 0.044
LysGlu: 5.328 ± 0.064
LysPhe: 2.872 ± 0.037
LysGly: 3.073 ± 0.039
LysHis: 1.446 ± 0.024
LysIle: 4.289 ± 0.047
LysLys: 5.485 ± 0.066
LysLeu: 7.327 ± 0.07
LysMet: 1.295 ± 0.023
LysAsn: 3.577 ± 0.037
LysPro: 2.987 ± 0.043
LysGln: 2.94 ± 0.042
LysArg: 3.699 ± 0.049
LysSer: 5.428 ± 0.057
LysThr: 3.762 ± 0.043
LysVal: 4.319 ± 0.045
LysTrp: 0.718 ± 0.018
LysTyr: 2.358 ± 0.035
LysXaa: 0.0 ± 0.0
Leu
LeuAla: 5.907 ± 0.058
LeuCys: 1.147 ± 0.025
LeuAsp: 5.919 ± 0.056
LeuGlu: 6.858 ± 0.059
LeuPhe: 4.307 ± 0.054
LeuGly: 4.95 ± 0.05
LeuHis: 2.068 ± 0.029
LeuIle: 6.101 ± 0.072
LeuLys: 7.587 ± 0.058
LeuLeu: 10.234 ± 0.102
LeuMet: 1.887 ± 0.03
LeuAsn: 5.756 ± 0.055
LeuPro: 4.742 ± 0.048
LeuGln: 4.469 ± 0.046
LeuArg: 4.752 ± 0.047
LeuSer: 8.866 ± 0.065
LeuThr: 5.597 ± 0.052
LeuVal: 6.093 ± 0.057
LeuTrp: 0.904 ± 0.019
LeuTyr: 3.16 ± 0.038
LeuXaa: 0.0 ± 0.0
Met
MetAla: 1.274 ± 0.022
MetCys: 0.213 ± 0.011
MetAsp: 1.135 ± 0.023
MetGlu: 1.136 ± 0.022
MetPhe: 0.861 ± 0.02
MetGly: 0.983 ± 0.023
MetHis: 0.287 ± 0.011
MetIle: 1.184 ± 0.02
MetLys: 1.287 ± 0.023
MetLeu: 1.78 ± 0.029
MetMet: 0.435 ± 0.015
MetAsn: 1.079 ± 0.021
MetPro: 0.685 ± 0.019
MetGln: 0.576 ± 0.017
MetArg: 0.784 ± 0.018
MetSer: 1.863 ± 0.028
MetThr: 1.065 ± 0.024
MetVal: 1.202 ± 0.022
MetTrp: 0.16 ± 0.008
MetTyr: 0.54 ± 0.016
MetXaa: 0.0 ± 0.0
Asn
AsnAla: 2.68 ± 0.037
AsnCys: 0.656 ± 0.017
AsnAsp: 3.305 ± 0.035
AsnGlu: 3.797 ± 0.039
AsnPhe: 2.329 ± 0.032
AsnGly: 2.981 ± 0.036
AsnHis: 1.16 ± 0.022
AsnIle: 3.382 ± 0.04
AsnLys: 3.392 ± 0.038
AsnLeu: 5.114 ± 0.051
AsnMet: 0.991 ± 0.024
AsnAsn: 3.094 ± 0.043
AsnPro: 2.427 ± 0.027
AsnGln: 2.158 ± 0.04
AsnArg: 2.225 ± 0.032
AsnSer: 4.994 ± 0.05
AsnThr: 2.923 ± 0.041
AsnVal: 3.227 ± 0.031
AsnTrp: 0.622 ± 0.015
AsnTyr: 1.95 ± 0.025
AsnXaa: 0.0 ± 0.0
Pro
ProAla: 2.358 ± 0.039
ProCys: 0.348 ± 0.014
ProAsp: 2.404 ± 0.033
ProGlu: 3.247 ± 0.043
ProPhe: 1.985 ± 0.032
ProGly: 2.122 ± 0.036
ProHis: 0.886 ± 0.02
ProIle: 2.819 ± 0.032
ProLys: 3.066 ± 0.035
ProLeu: 4.25 ± 0.046
ProMet: 0.74 ± 0.019
ProAsn: 2.324 ± 0.034
ProPro: 2.591 ± 0.064
ProGln: 2.014 ± 0.036
ProArg: 1.822 ± 0.03
ProSer: 4.283 ± 0.056
ProThr: 2.951 ± 0.059
ProVal: 2.889 ± 0.034
ProTrp: 0.425 ± 0.014
ProTyr: 1.367 ± 0.025
ProXaa: 0.0 ± 0.0
Gln
GlnAla: 2.201 ± 0.035
GlnCys: 0.439 ± 0.015
GlnAsp: 2.307 ± 0.033
GlnGlu: 2.927 ± 0.041
GlnPhe: 1.92 ± 0.03
GlnGly: 1.832 ± 0.031
GlnHis: 0.954 ± 0.023
GlnIle: 2.504 ± 0.032
GlnLys: 2.755 ± 0.034
GlnLeu: 4.794 ± 0.05
GlnMet: 0.832 ± 0.019
GlnAsn: 2.124 ± 0.029
GlnPro: 1.8 ± 0.038
GlnGln: 2.889 ± 0.118
GlnArg: 2.025 ± 0.037
GlnSer: 3.264 ± 0.044
GlnThr: 2.126 ± 0.035
GlnVal: 2.435 ± 0.029
GlnTrp: 0.422 ± 0.015
GlnTyr: 1.379 ± 0.025
GlnXaa: 0.0 ± 0.0
Arg
ArgAla: 2.393 ± 0.037
ArgCys: 0.561 ± 0.016
ArgAsp: 2.439 ± 0.032
ArgGlu: 2.828 ± 0.041
ArgPhe: 2.033 ± 0.029
ArgGly: 2.138 ± 0.036
ArgHis: 0.991 ± 0.021
ArgIle: 2.972 ± 0.036
ArgLys: 3.642 ± 0.046
ArgLeu: 4.7 ± 0.046
ArgMet: 0.924 ± 0.018
ArgAsn: 2.483 ± 0.031
ArgPro: 1.809 ± 0.029
ArgGln: 1.87 ± 0.031
ArgArg: 2.878 ± 0.044
ArgSer: 3.783 ± 0.046
ArgThr: 2.445 ± 0.033
ArgVal: 2.583 ± 0.034
ArgTrp: 0.502 ± 0.015
ArgTyr: 1.563 ± 0.026
ArgXaa: 0.0 ± 0.0
Ser
SerAla: 4.495 ± 0.055
SerCys: 0.928 ± 0.021
SerAsp: 4.709 ± 0.056
SerGlu: 5.08 ± 0.054
SerPhe: 4.155 ± 0.05
SerGly: 4.373 ± 0.047
SerHis: 1.838 ± 0.029
SerIle: 5.766 ± 0.057
SerLys: 6.359 ± 0.062
SerLeu: 9.064 ± 0.076
SerMet: 1.61 ± 0.024
SerAsn: 4.987 ± 0.059
SerPro: 3.982 ± 0.058
SerGln: 3.661 ± 0.042
SerArg: 3.898 ± 0.046
SerSer: 9.854 ± 0.112
SerThr: 5.699 ± 0.074
SerVal: 5.136 ± 0.05
SerTrp: 0.864 ± 0.021
SerTyr: 2.7 ± 0.039
SerXaa: 0.0 ± 0.0
Thr
ThrAla: 3.202 ± 0.045
ThrCys: 0.65 ± 0.017
ThrAsp: 2.983 ± 0.039
ThrGlu: 3.609 ± 0.056
ThrPhe: 2.438 ± 0.036
ThrGly: 3.261 ± 0.05
ThrHis: 1.152 ± 0.023
ThrIle: 3.765 ± 0.042
ThrLys: 3.79 ± 0.045
ThrLeu: 5.49 ± 0.052
ThrMet: 1.03 ± 0.02
ThrAsn: 3.007 ± 0.041
ThrPro: 3.066 ± 0.044
ThrGln: 2.134 ± 0.031
ThrArg: 2.411 ± 0.032
ThrSer: 5.242 ± 0.069
ThrThr: 3.797 ± 0.067
ThrVal: 3.655 ± 0.042
ThrTrp: 0.535 ± 0.016
ThrTyr: 1.68 ± 0.033
ThrXaa: 0.0 ± 0.0
Val
ValAla: 3.644 ± 0.042
ValCys: 0.767 ± 0.02
ValAsp: 3.866 ± 0.042
ValGlu: 3.983 ± 0.043
ValPhe: 2.797 ± 0.037
ValGly: 3.261 ± 0.046
ValHis: 1.249 ± 0.023
ValIle: 3.788 ± 0.043
ValLys: 3.962 ± 0.046
ValLeu: 6.0 ± 0.054
ValMet: 1.076 ± 0.023
ValAsn: 3.052 ± 0.039
ValPro: 2.963 ± 0.043
ValGln: 2.296 ± 0.031
ValArg: 2.53 ± 0.038
ValSer: 5.335 ± 0.053
ValThr: 3.425 ± 0.045
ValVal: 4.084 ± 0.048
ValTrp: 0.625 ± 0.022
ValTyr: 1.989 ± 0.034
ValXaa: 0.0 ± 0.0
Trp
TrpAla: 0.536 ± 0.017
TrpCys: 0.206 ± 0.009
TrpAsp: 0.657 ± 0.018
TrpGlu: 0.618 ± 0.016
TrpPhe: 0.507 ± 0.015
TrpGly: 0.53 ± 0.015
TrpHis: 0.205 ± 0.009
TrpIle: 0.683 ± 0.02
TrpLys: 0.848 ± 0.016
TrpLeu: 1.019 ± 0.024
TrpMet: 0.208 ± 0.01
TrpAsn: 0.657 ± 0.017
TrpPro: 0.301 ± 0.012
TrpGln: 0.329 ± 0.011
TrpArg: 0.518 ± 0.015
TrpSer: 0.852 ± 0.019
TrpThr: 0.573 ± 0.019
TrpVal: 0.584 ± 0.015
TrpTrp: 0.163 ± 0.009
TrpTyr: 0.361 ± 0.013
TrpXaa: 0.0 ± 0.0
Tyr
TyrAla: 1.713 ± 0.03
TyrCys: 0.493 ± 0.015
TyrAsp: 2.064 ± 0.03
TyrGlu: 2.086 ± 0.031
TyrPhe: 1.596 ± 0.03
TyrGly: 1.882 ± 0.033
TyrHis: 0.867 ± 0.019
TyrIle: 1.958 ± 0.029
TyrLys: 2.038 ± 0.029
TyrLeu: 3.591 ± 0.039
TyrMet: 0.634 ± 0.016
TyrAsn: 1.767 ± 0.029
TyrPro: 1.44 ± 0.028
TyrGln: 1.531 ± 0.029
TyrArg: 1.562 ± 0.026
TyrSer: 2.74 ± 0.036
TyrThr: 1.671 ± 0.029
TyrVal: 1.894 ± 0.029
TyrTrp: 0.416 ± 0.013
TyrTyr: 1.388 ± 0.027
TyrXaa: 0.0 ± 0.0
Xaa
XaaAla: 0.0 ± 0.0
XaaCys: 0.0 ± 0.0
XaaAsp: 0.0 ± 0.0
XaaGlu: 0.0 ± 0.0
XaaPhe: 0.0 ± 0.0
XaaGly: 0.0 ± 0.0
XaaHis: 0.0 ± 0.0
XaaIle: 0.0 ± 0.0
XaaLys: 0.0 ± 0.0
XaaLeu: 0.0 ± 0.0
XaaMet: 0.001 ± 0.001
XaaAsn: 0.0 ± 0.0
XaaPro: 0.001 ± 0.001
XaaGln: 0.0 ± 0.0
XaaArg: 0.0 ± 0.0
XaaSer: 0.0 ± 0.0
XaaThr: 0.001 ± 0.001
XaaVal: 0.0 ± 0.0
XaaTrp: 0.0 ± 0.0
XaaTyr: 0.0 ± 0.0
XaaXaa: 0.057 ± 0.058
Statistics based on 5073 proteins (2416083 amino acids)

Note: The error was estimated by bootstrapping (100 resamples) at the protein level.
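One common way to obtain such an error estimate is sketched below: whole proteins are resampled with replacement, the dipeptide frequency is recomputed for each resample, and the standard deviation of those estimates is reported. The exact procedure used for the table is not described on this page, so the function names, the fixed random seed, and the use of the sample standard deviation are assumptions of this example.

```python
import random
import statistics
from collections import Counter


def pair_counts(seq):
    """Count overlapping adjacent residue pairs within one protein sequence."""
    return Counter(a + b for a, b in zip(seq, seq[1:]))


def bootstrap_error(sequences, pair, n_resamples=100, seed=0):
    """Estimate mean and spread (per mille) of one dipeptide's frequency.

    Assumption: resampling is done over whole proteins (with replacement),
    and the spread is the sample standard deviation across resamples.
    """
    rng = random.Random(seed)
    per_protein = [pair_counts(s) for s in sequences]
    estimates = []
    for _ in range(n_resamples):
        sample = rng.choices(per_protein, k=len(per_protein))
        total = sum(sum(c.values()) for c in sample)
        hits = sum(c[pair] for c in sample)
        estimates.append(1000 * hits / total if total else 0.0)
    return statistics.mean(estimates), statistics.stdev(estimates)


# Toy input (hypothetical sequences, one-letter amino acid codes).
toy_mean, toy_error = bootstrap_error(["MLLK", "LLS", "AAAA"], "LL")
```

Resampling proteins rather than individual dipeptides keeps the pairs from one protein together, so the estimate reflects protein-to-protein variation.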

The dipeptide statistics above (among other statistics for this proteome) can be downloaded from this CSV file
See this proteome in: uniprot_link
Proteome-pI is available under Creative Commons Attribution-NoDerivs license, for more details see here

Reference: Kozlowski LP. Proteome-pI 2.0: Proteome Isoelectric Point Database Update. Nucleic Acids Res. 2021, doi: 10.1093/nar/gkab944
Contact: Lukasz P. Kozlowski