Amino acid dipeptide frequency for Geobacillus sp. (strain Y412MC10)

Apart from single amino acid frequencies, one can also calculate the so-called amino acid dipeptide frequency.
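As a minimal sketch of how such a frequency could be computed, the snippet below counts adjacent residue pairs in a list of protein sequences and reports them in per mille. It assumes that dipeptides are overlapping adjacent pairs, that pairs are never counted across protein boundaries, and that any non-standard letter maps to Xaa; the page does not state the exact counting rules, so the function name `dipeptide_per_mille` and the toy sequences are hypothetical.

```python
from collections import Counter

# Three-letter codes keyed by one-letter code; anything else maps to Xaa.
THREE_LETTER = {
    "A": "Ala", "C": "Cys", "D": "Asp", "E": "Glu", "F": "Phe",
    "G": "Gly", "H": "His", "I": "Ile", "K": "Lys", "L": "Leu",
    "M": "Met", "N": "Asn", "P": "Pro", "Q": "Gln", "R": "Arg",
    "S": "Ser", "T": "Thr", "V": "Val", "W": "Trp", "Y": "Tyr",
}


def dipeptide_per_mille(proteins):
    """Return {dipeptide: frequency in per mille} for a list of sequences.

    Assumption: dipeptides are overlapping adjacent residue pairs and are
    never counted across protein boundaries.
    """
    counts = Counter()
    for seq in proteins:
        residues = [THREE_LETTER.get(aa, "Xaa") for aa in seq.upper()]
        counts.update(a + b for a, b in zip(residues, residues[1:]))
    total = sum(counts.values())
    if total == 0:
        return {}
    return {pair: 1000.0 * n / total for pair, n in counts.items()}


# Toy input (hypothetical sequences, not taken from the proteome).
# The two sequences contain 7 adjacent pairs, 2 of which are LeuLeu.
freqs = dipeptide_per_mille(["MKLLA", "ALLK"])
print(freqs["LeuLeu"])
```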

There are 400 possible dipeptides (441 if we consider Xaa as an additional 21st amino acid representing all non-standard or unknown amino acids). Thus, if all dipeptides were equally likely to occur in proteins, each amino acid dipeptide would be present with a frequency of 1/400 = 0.25% (2.5‰). As this is not the case in nature, for better visibility dipeptides that are more frequent than expected are marked in red, and those that are underrepresented are marked in blue in the table.
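The comparison against the uniform expectation can be sketched as follows. This assumes that "expected" means exactly 1/400 and that the marking is a plain greater-than or less-than comparison with no significance test; the page does not say how the colours are actually assigned, and the `classify` helper is hypothetical.

```python
N_STANDARD = 20
# Uniform expectation: 1/400 = 0.25% = 2.5 per mille.
EXPECTED_PER_MILLE = 1000.0 / N_STANDARD**2


def classify(per_mille, expected=EXPECTED_PER_MILLE):
    """Label a dipeptide frequency relative to the uniform expectation."""
    if per_mille > expected:
        return "over"
    if per_mille < expected:
        return "under"
    return "expected"


# Three values copied from the table on this page.
for name, value in [("LeuLeu", 10.846), ("CysCys", 0.107), ("AspAsp", 2.52)]:
    ratio = round(value / EXPECTED_PER_MILLE, 2)
    print(name, value, classify(value), ratio)
```

The printed ratio shows how many times more (or less) frequent a dipeptide is than under the uniform assumption.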

All values are given in per mille (‰), so they must be multiplied by 10^-3 to obtain fractions.
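For illustration, the unit conversion amounts to the following small helpers (the function names are hypothetical):

```python
def per_mille_to_fraction(value):
    """Convert a per mille value to a plain fraction (multiply by 10^-3)."""
    return value * 1e-3


def per_mille_to_percent(value):
    """Convert a per mille value to percent (divide by 10)."""
    return value / 10.0


# AlaAla is listed as 8.741 per mille in the table on this page.
print(per_mille_to_fraction(8.741), per_mille_to_percent(8.741))
```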

For more information, see the sequence space article on Wikipedia.

Ala Cys Asp Glu Phe Gly His Ile Lys Leu Met Asn Pro Gln Arg Ser Thr Val Trp Tyr Xaa
Ala
AlaAla: 8.741 ± 0.099
AlaCys: 0.718 ± 0.021
AlaAsp: 4.416 ± 0.054
AlaGlu: 5.747 ± 0.061
AlaPhe: 3.407 ± 0.046
AlaGly: 6.794 ± 0.077
AlaHis: 1.394 ± 0.025
AlaIle: 4.978 ± 0.061
AlaLys: 4.067 ± 0.047
AlaLeu: 8.067 ± 0.067
AlaMet: 2.45 ± 0.035
AlaAsn: 2.561 ± 0.034
AlaPro: 2.735 ± 0.039
AlaGln: 2.556 ± 0.035
AlaArg: 3.449 ± 0.043
AlaSer: 5.007 ± 0.056
AlaThr: 3.47 ± 0.116
AlaVal: 6.414 ± 0.058
AlaTrp: 1.059 ± 0.028
AlaTyr: 2.729 ± 0.04
AlaXaa: 0.0 ± 0.0
Cys
CysAla: 0.486 ± 0.017
CysCys: 0.107 ± 0.007
CysAsp: 0.365 ± 0.014
CysGlu: 0.402 ± 0.014
CysPhe: 0.286 ± 0.013
CysGly: 0.733 ± 0.021
CysHis: 0.19 ± 0.009
CysIle: 0.454 ± 0.016
CysLys: 0.301 ± 0.013
CysLeu: 0.738 ± 0.02
CysMet: 0.212 ± 0.009
CysAsn: 0.242 ± 0.011
CysPro: 0.339 ± 0.014
CysGln: 0.209 ± 0.01
CysArg: 0.453 ± 0.018
CysSer: 0.547 ± 0.017
CysThr: 0.375 ± 0.012
CysVal: 0.429 ± 0.015
CysTrp: 0.091 ± 0.006
CysTyr: 0.267 ± 0.012
CysXaa: 0.0 ± 0.0
Asp
AspAla: 3.974 ± 0.043
AspCys: 0.353 ± 0.014
AspAsp: 2.52 ± 0.035
AspGlu: 3.871 ± 0.045
AspPhe: 2.14 ± 0.036
AspGly: 4.266 ± 0.059
AspHis: 1.297 ± 0.028
AspIle: 3.632 ± 0.038
AspLys: 2.513 ± 0.036
AspLeu: 4.822 ± 0.052
AspMet: 1.631 ± 0.027
AspAsn: 1.774 ± 0.033
AspPro: 2.6 ± 0.033
AspGln: 2.099 ± 0.032
AspArg: 2.953 ± 0.036
AspSer: 2.761 ± 0.041
AspThr: 2.487 ± 0.035
AspVal: 3.532 ± 0.048
AspTrp: 0.893 ± 0.025
AspTyr: 2.258 ± 0.037
AspXaa: 0.0 ± 0.0
Glu
GluAla: 6.171 ± 0.064
GluCys: 0.405 ± 0.016
GluAsp: 3.463 ± 0.043
GluGlu: 5.578 ± 0.073
GluPhe: 2.241 ± 0.034
GluGly: 4.97 ± 0.065
GluHis: 1.642 ± 0.025
GluIle: 4.258 ± 0.047
GluLys: 3.606 ± 0.052
GluLeu: 7.127 ± 0.064
GluMet: 2.103 ± 0.036
GluAsn: 2.373 ± 0.042
GluPro: 2.538 ± 0.042
GluGln: 3.508 ± 0.052
GluArg: 4.073 ± 0.05
GluSer: 3.674 ± 0.044
GluThr: 3.398 ± 0.047
GluVal: 4.428 ± 0.053
GluTrp: 1.04 ± 0.024
GluTyr: 2.149 ± 0.032
GluXaa: 0.0 ± 0.0
Phe
PheAla: 3.144 ± 0.039
PheCys: 0.331 ± 0.012
PheAsp: 2.251 ± 0.038
PheGlu: 2.396 ± 0.035
PhePhe: 1.854 ± 0.039
PheGly: 3.204 ± 0.04
PheHis: 0.969 ± 0.024
PheIle: 2.961 ± 0.043
PheLys: 1.97 ± 0.033
PheLeu: 3.881 ± 0.052
PheMet: 1.293 ± 0.026
PheAsn: 1.612 ± 0.03
PhePro: 1.654 ± 0.031
PheGln: 1.479 ± 0.025
PheArg: 2.099 ± 0.035
PheSer: 2.746 ± 0.04
PheThr: 2.462 ± 0.041
PheVal: 2.867 ± 0.042
PheTrp: 0.577 ± 0.018
PheTyr: 1.552 ± 0.027
PheXaa: 0.0 ± 0.0
Gly
GlyAla: 5.744 ± 0.132
GlyCys: 0.69 ± 0.02
GlyAsp: 3.574 ± 0.044
GlyGlu: 4.798 ± 0.056
GlyPhe: 3.29 ± 0.044
GlyGly: 5.682 ± 0.07
GlyHis: 1.556 ± 0.027
GlyIle: 5.799 ± 0.061
GlyLys: 4.342 ± 0.054
GlyLeu: 7.391 ± 0.072
GlyMet: 2.597 ± 0.032
GlyAsn: 2.802 ± 0.043
GlyPro: 2.124 ± 0.037
GlyGln: 2.756 ± 0.037
GlyArg: 3.731 ± 0.044
GlySer: 5.099 ± 0.084
GlyThr: 4.537 ± 0.067
GlyVal: 5.162 ± 0.069
GlyTrp: 1.175 ± 0.026
GlyTyr: 3.058 ± 0.042
GlyXaa: 0.0 ± 0.0
His
HisAla: 1.701 ± 0.028
HisCys: 0.199 ± 0.01
HisAsp: 1.159 ± 0.026
HisGlu: 1.442 ± 0.024
HisPhe: 1.044 ± 0.022
HisGly: 1.651 ± 0.035
HisHis: 0.651 ± 0.019
HisIle: 1.446 ± 0.028
HisLys: 0.861 ± 0.02
HisLeu: 2.135 ± 0.043
HisMet: 0.612 ± 0.019
HisAsn: 0.726 ± 0.019
HisPro: 1.312 ± 0.03
HisGln: 0.848 ± 0.023
HisArg: 1.216 ± 0.025
HisSer: 1.206 ± 0.024
HisThr: 1.041 ± 0.025
HisVal: 1.553 ± 0.03
HisTrp: 0.331 ± 0.013
HisTyr: 0.928 ± 0.021
HisXaa: 0.0 ± 0.0
Ile
IleAla: 5.658 ± 0.068
IleCys: 0.551 ± 0.016
IleAsp: 3.455 ± 0.045
IleGlu: 4.233 ± 0.052
IlePhe: 2.35 ± 0.041
IleGly: 5.36 ± 0.056
IleHis: 1.624 ± 0.029
IleIle: 4.079 ± 0.059
IleLys: 2.743 ± 0.041
IleLeu: 6.042 ± 0.07
IleMet: 1.828 ± 0.031
IleAsn: 2.206 ± 0.037
IlePro: 3.27 ± 0.035
IleGln: 2.672 ± 0.039
IleArg: 3.844 ± 0.046
IleSer: 4.448 ± 0.058
IleThr: 3.84 ± 0.051
IleVal: 4.933 ± 0.059
IleTrp: 0.768 ± 0.023
IleTyr: 2.083 ± 0.034
IleXaa: 0.0 ± 0.0
Lys
LysAla: 4.044 ± 0.045
LysCys: 0.204 ± 0.01
LysAsp: 2.843 ± 0.039
LysGlu: 4.21 ± 0.058
LysPhe: 1.487 ± 0.031
LysGly: 3.496 ± 0.042
LysHis: 1.117 ± 0.024
LysIle: 2.812 ± 0.041
LysLys: 2.973 ± 0.048
LysLeu: 5.174 ± 0.055
LysMet: 1.562 ± 0.03
LysAsn: 1.923 ± 0.033
LysPro: 2.368 ± 0.031
LysGln: 2.326 ± 0.035
LysArg: 2.894 ± 0.039
LysSer: 2.859 ± 0.043
LysThr: 2.613 ± 0.037
LysVal: 3.336 ± 0.044
LysTrp: 0.748 ± 0.019
LysTyr: 1.745 ± 0.032
LysXaa: 0.0 ± 0.0
Leu
LeuAla: 8.049 ± 0.074
LeuCys: 0.767 ± 0.02
LeuAsp: 5.211 ± 0.044
LeuGlu: 6.323 ± 0.064
LeuPhe: 4.526 ± 0.061
LeuGly: 6.837 ± 0.059
LeuHis: 2.201 ± 0.034
LeuIle: 6.522 ± 0.075
LeuLys: 5.218 ± 0.054
LeuLeu: 10.846 ± 0.109
LeuMet: 2.806 ± 0.042
LeuAsn: 3.94 ± 0.039
LeuPro: 4.631 ± 0.056
LeuGln: 4.042 ± 0.044
LeuArg: 4.947 ± 0.054
LeuSer: 7.197 ± 0.069
LeuThr: 5.553 ± 0.054
LeuVal: 6.233 ± 0.059
LeuTrp: 1.13 ± 0.028
LeuTyr: 3.377 ± 0.047
LeuXaa: 0.0 ± 0.0
Met
MetAla: 2.275 ± 0.032
MetCys: 0.164 ± 0.009
MetAsp: 1.653 ± 0.029
MetGlu: 2.077 ± 0.036
MetPhe: 1.155 ± 0.023
MetGly: 1.864 ± 0.033
MetHis: 0.493 ± 0.014
MetIle: 2.1 ± 0.038
MetLys: 2.164 ± 0.03
MetLeu: 3.185 ± 0.041
MetMet: 1.004 ± 0.024
MetAsn: 1.65 ± 0.03
MetPro: 1.233 ± 0.025
MetGln: 1.063 ± 0.023
MetArg: 1.332 ± 0.027
MetSer: 1.939 ± 0.031
MetThr: 1.814 ± 0.028
MetVal: 1.899 ± 0.03
MetTrp: 0.286 ± 0.013
MetTyr: 0.82 ± 0.02
MetXaa: 0.0 ± 0.0
Asn
AsnAla: 2.777 ± 0.04
AsnCys: 0.209 ± 0.011
AsnAsp: 1.918 ± 0.031
AsnGlu: 2.595 ± 0.037
AsnPhe: 1.29 ± 0.029
AsnGly: 3.272 ± 0.047
AsnHis: 0.882 ± 0.023
AsnIle: 2.343 ± 0.032
AsnLys: 1.827 ± 0.035
AsnLeu: 3.273 ± 0.043
AsnMet: 1.063 ± 0.025
AsnAsn: 1.487 ± 0.034
AsnPro: 2.098 ± 0.038
AsnGln: 1.521 ± 0.029
AsnArg: 2.212 ± 0.033
AsnSer: 2.068 ± 0.036
AsnThr: 1.946 ± 0.036
AsnVal: 2.513 ± 0.035
AsnTrp: 0.546 ± 0.019
AsnTyr: 1.363 ± 0.025
AsnXaa: 0.0 ± 0.0
Pro
ProAla: 3.451 ± 0.052
ProCys: 0.24 ± 0.011
ProAsp: 2.831 ± 0.044
ProGlu: 3.67 ± 0.046
ProPhe: 1.912 ± 0.031
ProGly: 3.166 ± 0.046
ProHis: 0.934 ± 0.025
ProIle: 2.495 ± 0.035
ProLys: 1.817 ± 0.029
ProLeu: 4.096 ± 0.047
ProMet: 1.087 ± 0.024
ProAsn: 1.571 ± 0.03
ProPro: 1.379 ± 0.032
ProGln: 1.414 ± 0.029
ProArg: 1.504 ± 0.032
ProSer: 2.754 ± 0.037
ProThr: 1.906 ± 0.033
ProVal: 3.422 ± 0.046
ProTrp: 0.596 ± 0.018
ProTyr: 1.649 ± 0.03
ProXaa: 0.0 ± 0.0
Gln
GlnAla: 3.368 ± 0.041
GlnCys: 0.211 ± 0.011
GlnAsp: 1.937 ± 0.033
GlnGlu: 2.855 ± 0.043
GlnPhe: 1.47 ± 0.027
GlnGly: 2.881 ± 0.038
GlnHis: 0.862 ± 0.021
GlnIle: 2.345 ± 0.037
GlnLys: 1.815 ± 0.033
GlnLeu: 4.063 ± 0.046
GlnMet: 1.144 ± 0.026
GlnAsn: 1.3 ± 0.025
GlnPro: 1.561 ± 0.026
GlnGln: 1.822 ± 0.029
GlnArg: 1.968 ± 0.037
GlnSer: 2.346 ± 0.039
GlnThr: 1.884 ± 0.031
GlnVal: 2.5 ± 0.036
GlnTrp: 0.577 ± 0.02
GlnTyr: 1.392 ± 0.022
GlnXaa: 0.0 ± 0.0
Arg
ArgAla: 3.205 ± 0.041
ArgCys: 0.374 ± 0.014
ArgAsp: 2.513 ± 0.04
ArgGlu: 3.896 ± 0.051
ArgPhe: 2.32 ± 0.034
ArgGly: 3.068 ± 0.038
ArgHis: 1.215 ± 0.026
ArgIle: 3.795 ± 0.046
ArgLys: 3.025 ± 0.042
ArgLeu: 5.488 ± 0.059
ArgMet: 1.757 ± 0.03
ArgAsn: 2.048 ± 0.031
ArgPro: 1.753 ± 0.031
ArgGln: 2.075 ± 0.033
ArgArg: 2.771 ± 0.042
ArgSer: 3.228 ± 0.038
ArgThr: 2.702 ± 0.039
ArgVal: 3.225 ± 0.044
ArgTrp: 0.751 ± 0.019
ArgTyr: 2.125 ± 0.036
ArgXaa: 0.0 ± 0.0
Ser
SerAla: 4.802 ± 0.051
SerCys: 0.436 ± 0.016
SerAsp: 3.134 ± 0.038
SerGlu: 3.825 ± 0.045
SerPhe: 2.958 ± 0.036
SerGly: 5.654 ± 0.072
SerHis: 1.298 ± 0.024
SerIle: 4.308 ± 0.053
SerLys: 3.115 ± 0.042
SerLeu: 6.518 ± 0.065
SerMet: 1.924 ± 0.032
SerAsn: 2.208 ± 0.036
SerPro: 2.608 ± 0.037
SerGln: 1.998 ± 0.03
SerArg: 3.292 ± 0.042
SerSer: 4.376 ± 0.057
SerThr: 3.085 ± 0.046
SerVal: 4.515 ± 0.049
SerTrp: 0.912 ± 0.02
SerTyr: 2.333 ± 0.035
SerXaa: 0.0 ± 0.0
Thr
ThrAla: 4.528 ± 0.06
ThrCys: 0.335 ± 0.013
ThrAsp: 2.815 ± 0.035
ThrGlu: 3.324 ± 0.037
ThrPhe: 2.279 ± 0.036
ThrGly: 4.841 ± 0.176
ThrHis: 1.045 ± 0.022
ThrIle: 3.556 ± 0.051
ThrLys: 2.286 ± 0.038
ThrLeu: 5.41 ± 0.06
ThrMet: 1.454 ± 0.03
ThrAsn: 1.759 ± 0.032
ThrPro: 2.49 ± 0.036
ThrGln: 1.519 ± 0.026
ThrArg: 2.21 ± 0.037
ThrSer: 3.192 ± 0.046
ThrThr: 2.698 ± 0.045
ThrVal: 4.433 ± 0.061
ThrTrp: 0.696 ± 0.018
ThrTyr: 1.897 ± 0.033
ThrXaa: 0.0 ± 0.0
Val
ValAla: 4.9 ± 0.051
ValCys: 0.556 ± 0.019
ValAsp: 3.584 ± 0.049
ValGlu: 4.258 ± 0.052
ValPhe: 3.0 ± 0.042
ValGly: 4.412 ± 0.049
ValHis: 1.544 ± 0.029
ValIle: 4.892 ± 0.054
ValLys: 3.704 ± 0.048
ValLeu: 7.167 ± 0.058
ValMet: 2.121 ± 0.031
ValAsn: 2.829 ± 0.041
ValPro: 3.113 ± 0.043
ValGln: 2.606 ± 0.035
ValArg: 3.45 ± 0.047
ValSer: 4.811 ± 0.05
ValThr: 4.365 ± 0.065
ValVal: 4.933 ± 0.06
ValTrp: 0.93 ± 0.022
ValTyr: 2.48 ± 0.034
ValXaa: 0.0 ± 0.0
Trp
TrpAla: 0.866 ± 0.02
TrpCys: 0.107 ± 0.007
TrpAsp: 0.761 ± 0.017
TrpGlu: 0.818 ± 0.023
TrpPhe: 0.623 ± 0.018
TrpGly: 0.959 ± 0.021
TrpHis: 0.294 ± 0.013
TrpIle: 1.026 ± 0.022
TrpLys: 0.803 ± 0.023
TrpLeu: 1.482 ± 0.027
TrpMet: 0.507 ± 0.015
TrpAsn: 0.817 ± 0.023
TrpPro: 0.413 ± 0.015
TrpGln: 0.503 ± 0.016
TrpArg: 0.669 ± 0.021
TrpSer: 0.928 ± 0.022
TrpThr: 0.729 ± 0.02
TrpVal: 0.868 ± 0.022
TrpTrp: 0.203 ± 0.011
TrpTyr: 0.447 ± 0.018
TrpXaa: 0.0 ± 0.0
Tyr
TyrAla: 2.873 ± 0.035
TyrCys: 0.27 ± 0.014
TyrAsp: 1.97 ± 0.035
TyrGlu: 2.424 ± 0.039
TyrPhe: 1.671 ± 0.029
TyrGly: 2.798 ± 0.039
TyrHis: 0.853 ± 0.022
TyrIle: 2.188 ± 0.033
TyrLys: 1.584 ± 0.029
TyrLeu: 3.45 ± 0.044
TyrMet: 1.014 ± 0.022
TyrAsn: 1.374 ± 0.027
TyrPro: 1.674 ± 0.032
TyrGln: 1.288 ± 0.026
TyrArg: 2.266 ± 0.035
TyrSer: 2.113 ± 0.031
TyrThr: 1.863 ± 0.036
TyrVal: 2.463 ± 0.036
TyrTrp: 0.515 ± 0.015
TyrTyr: 1.456 ± 0.031
TyrXaa: 0.0 ± 0.0
Xaa
XaaAla: 0.0 ± 0.0
XaaCys: 0.0 ± 0.0
XaaAsp: 0.0 ± 0.0
XaaGlu: 0.0 ± 0.0
XaaPhe: 0.0 ± 0.0
XaaGly: 0.0 ± 0.0
XaaHis: 0.0 ± 0.0
XaaIle: 0.0 ± 0.0
XaaLys: 0.0 ± 0.0
XaaLeu: 0.0 ± 0.0
XaaMet: 0.0 ± 0.0
XaaAsn: 0.0 ± 0.0
XaaPro: 0.0 ± 0.0
XaaGln: 0.0 ± 0.0
XaaArg: 0.0 ± 0.0
XaaSer: 0.0 ± 0.0
XaaThr: 0.0 ± 0.0
XaaVal: 0.0 ± 0.0
XaaTrp: 0.0 ± 0.0
XaaTyr: 0.0 ± 0.0
XaaXaa: 0.0 ± 0.0
Statistics based on 6237 proteins (2028501 amino acids)

Note: The error has been estimated by bootstrapping (x100) at the protein level.
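A protein-level bootstrap of this kind might look like the sketch below. It assumes that whole proteins are resampled with replacement, that the frequency is recomputed on each of the 100 resamples, and that the reported error is the standard deviation across resamples; the page states only "bootstrapping (x100) at the protein level", so these details, the function names, and the toy sequences are all hypothetical.

```python
import random
import statistics
from collections import Counter


def pair_frequency(proteins, pair):
    """Per mille frequency of one dipeptide (one-letter codes, e.g. "LL")."""
    counts = Counter()
    for seq in proteins:
        counts.update(a + b for a, b in zip(seq, seq[1:]))
    total = sum(counts.values())
    return 1000.0 * counts[pair] / total if total else 0.0


def bootstrap_error(proteins, pair, n_resamples=100, seed=0):
    """Standard deviation of the frequency over protein-level resamples."""
    rng = random.Random(seed)
    estimates = []
    for _ in range(n_resamples):
        # Draw whole proteins with replacement, keeping the sample size fixed.
        sample = rng.choices(proteins, k=len(proteins))
        estimates.append(pair_frequency(sample, pair))
    return statistics.stdev(estimates)


# Toy proteome (hypothetical sequences, not taken from this proteome).
toy = ["MKLLA", "ALLK", "MGSSV", "LLLLG", "MAAK"]
print(pair_frequency(toy, "LL"), "+/-", bootstrap_error(toy, "LL"))
```

Resampling proteins rather than individual dipeptides keeps residues from the same protein together, so a few proteins with unusual composition are reflected in the error estimate.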

The dipeptide statistics above (among other statistics for this proteome) can be downloaded from this CSV file
See this proteome in: uniprot_link
Proteome-pI is available under a Creative Commons Attribution-NoDerivs license; for more details see here

Reference: Kozlowski LP. Proteome-pI 2.0: Proteome Isoelectric Point Database Update. Nucleic Acids Res. 2021, doi: 10.1093/nar/gkab944
Contact: Lukasz P. Kozlowski