Amino acid dipeptide frequency for Geomicrobium sp. JCM 19038

Apart from single amino acid frequencies, one can also calculate the so-called amino acid dipeptide frequency, i.e. how often each ordered pair of adjacent amino acids occurs.
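The counting behind such a frequency can be sketched as follows. This is a minimal illustration, not the database's actual pipeline: it assumes one-letter residue codes, overlapping windows of length two that never span two proteins, and normalization by the total number of dipeptides. The function name and the toy sequences are hypothetical.

```python
from collections import Counter


def dipeptide_frequencies(proteins):
    """Return dipeptide frequencies in per mille for a list of sequences."""
    counts = Counter()
    for seq in proteins:
        # Overlapping pairs within one protein; pairs never span two proteins.
        for i in range(len(seq) - 1):
            counts[seq[i:i + 2]] += 1
    total = sum(counts.values())
    if total == 0:
        return {}
    # Scale to per mille so the numbers are comparable to the table values.
    return {pair: 1000.0 * n / total for pair, n in counts.items()}


# Hypothetical toy proteome (one-letter codes), not real data.
toy_proteome = ["MALLE", "MLLK"]
freqs = dipeptide_frequencies(toy_proteome)
```

In this toy input there are seven dipeptides in total, two of which are "LL", so `freqs["LL"]` is 2/7 expressed in per mille.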

There are 400 possible dipeptides (441 if Xaa is counted as an additional 21st amino acid representing all non-standard or unknown amino acids). Thus, if dipeptides occurred randomly and with equal probability in proteins, each amino acid dipeptide would be present at a frequency of 0.25% (2.5‰). As this is not the case in nature, for better visibility all dipeptides that are more frequent than expected are marked in red in the table, and those that are underrepresented are marked in blue.
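The uniform baseline and the over/under classification can be sketched as below. It assumes that the uniform 1/400 expectation described above is the threshold used for the colouring; the function names are hypothetical, and the two example values are taken from the table (LeuLeu and CysCys).

```python
def uniform_expectation_permille(alphabet_size=20):
    """Expected per mille frequency of each dipeptide under a uniform model."""
    return 1000.0 / alphabet_size ** 2


def classify(freq_permille, expected_permille):
    """Label a dipeptide relative to the uniform expectation."""
    if freq_permille > expected_permille:
        return "over"   # would be marked in red
    if freq_permille < expected_permille:
        return "under"  # would be marked in blue
    return "as expected"


expected_20 = uniform_expectation_permille(20)  # 400 possible dipeptides
expected_21 = uniform_expectation_permille(21)  # 441 when Xaa is included

leu_leu = classify(10.096, expected_20)  # LeuLeu value from the table
cys_cys = classify(0.082, expected_20)   # CysCys value from the table
```

With 20 amino acids the baseline is 2.5‰; with Xaa included it drops to roughly 2.27‰.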

All values are presented in per mille (‰) and therefore need to be multiplied by 10^-3 to obtain fractions.
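As a small worked conversion, the sketch below turns a per mille value into a fraction and into a percentage. The helper names are hypothetical; the example value is the AlaAla entry from the table.

```python
def permille_to_fraction(value_permille):
    """Convert a per mille value to a plain fraction (multiply by 10^-3)."""
    return value_permille * 1e-3


def permille_to_percent(value_permille):
    """Convert a per mille value to percent (divide by 10)."""
    return value_permille / 10.0


ala_ala_permille = 5.834  # AlaAla value from the table
ala_ala_fraction = permille_to_fraction(ala_ala_permille)
ala_ala_percent = permille_to_percent(ala_ala_permille)
```

Here 5.834‰ corresponds to a fraction of about 0.005834, or about 0.58%.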

For more information, see the sequence space article on Wikipedia.

Residue order: Ala Cys Asp Glu Phe Gly His Ile Lys Leu Met Asn Pro Gln Arg Ser Thr Val Trp Tyr Xaa
Entries are grouped by first residue and listed as dipeptide: frequency ± error (‰).
Ala
AlaAla: 5.834 ± 0.088
AlaCys: 0.588 ± 0.023
AlaAsp: 3.55 ± 0.062
AlaGlu: 4.661 ± 0.083
AlaPhe: 3.581 ± 0.061
AlaGly: 5.396 ± 0.081
AlaHis: 1.612 ± 0.043
AlaIle: 6.336 ± 0.088
AlaLys: 4.155 ± 0.065
AlaLeu: 7.873 ± 0.099
AlaMet: 2.372 ± 0.043
AlaAsn: 2.963 ± 0.052
AlaPro: 2.359 ± 0.048
AlaGln: 2.276 ± 0.047
AlaArg: 2.871 ± 0.055
AlaSer: 4.744 ± 0.066
AlaThr: 4.448 ± 0.071
AlaVal: 5.716 ± 0.077
AlaTrp: 0.625 ± 0.026
AlaTyr: 2.482 ± 0.05
AlaXaa: 0.0 ± 0.0
Cys
CysAla: 0.363 ± 0.018
CysCys: 0.082 ± 0.008
CysAsp: 0.369 ± 0.018
CysGlu: 0.433 ± 0.024
CysPhe: 0.259 ± 0.015
CysGly: 0.585 ± 0.024
CysHis: 0.185 ± 0.014
CysIle: 0.43 ± 0.02
CysLys: 0.299 ± 0.017
CysLeu: 0.611 ± 0.026
CysMet: 0.159 ± 0.012
CysAsn: 0.238 ± 0.017
CysPro: 0.277 ± 0.018
CysGln: 0.194 ± 0.014
CysArg: 0.266 ± 0.016
CysSer: 0.456 ± 0.024
CysThr: 0.387 ± 0.018
CysVal: 0.389 ± 0.019
CysTrp: 0.053 ± 0.007
CysTyr: 0.215 ± 0.014
CysXaa: 0.0 ± 0.0
Asp
AspAla: 4.008 ± 0.073
AspCys: 0.309 ± 0.018
AspAsp: 3.342 ± 0.069
AspGlu: 5.575 ± 0.082
AspPhe: 2.077 ± 0.044
AspGly: 3.8 ± 0.073
AspHis: 1.633 ± 0.046
AspIle: 3.351 ± 0.06
AspLys: 1.988 ± 0.044
AspLeu: 5.243 ± 0.076
AspMet: 1.336 ± 0.036
AspAsn: 1.678 ± 0.038
AspPro: 2.154 ± 0.044
AspGln: 2.633 ± 0.053
AspArg: 2.728 ± 0.056
AspSer: 2.506 ± 0.05
AspThr: 2.584 ± 0.053
AspVal: 4.983 ± 0.071
AspTrp: 0.661 ± 0.024
AspTyr: 2.257 ± 0.056
AspXaa: 0.0 ± 0.0
Glu
GluAla: 6.137 ± 0.08
GluCys: 0.35 ± 0.018
GluAsp: 4.593 ± 0.071
GluGlu: 7.906 ± 0.12
GluPhe: 2.257 ± 0.04
GluGly: 4.726 ± 0.063
GluHis: 2.068 ± 0.045
GluIle: 4.504 ± 0.068
GluLys: 4.351 ± 0.075
GluLeu: 7.079 ± 0.087
GluMet: 2.233 ± 0.04
GluAsn: 3.129 ± 0.051
GluPro: 2.386 ± 0.046
GluGln: 4.55 ± 0.082
GluArg: 4.863 ± 0.076
GluSer: 4.075 ± 0.07
GluThr: 4.325 ± 0.067
GluVal: 5.301 ± 0.078
GluTrp: 0.899 ± 0.031
GluTyr: 2.034 ± 0.042
GluXaa: 0.0 ± 0.0
Phe
PheAla: 3.198 ± 0.055
PheCys: 0.267 ± 0.015
PheAsp: 2.33 ± 0.049
PheGlu: 2.907 ± 0.053
PhePhe: 2.155 ± 0.053
PheGly: 3.301 ± 0.062
PheHis: 1.04 ± 0.029
PheIle: 3.519 ± 0.067
PheLys: 1.823 ± 0.038
PheLeu: 4.243 ± 0.084
PheMet: 1.224 ± 0.035
PheAsn: 1.649 ± 0.04
PhePro: 1.534 ± 0.036
PheGln: 1.697 ± 0.045
PheArg: 1.566 ± 0.035
PheSer: 2.822 ± 0.054
PheThr: 2.746 ± 0.055
PheVal: 3.361 ± 0.053
PheTrp: 0.436 ± 0.02
PheTyr: 1.645 ± 0.034
PheXaa: 0.0 ± 0.0
Gly
GlyAla: 5.26 ± 0.085
GlyCys: 0.502 ± 0.024
GlyAsp: 3.443 ± 0.062
GlyGlu: 4.748 ± 0.064
GlyPhe: 3.373 ± 0.062
GlyGly: 4.979 ± 0.084
GlyHis: 1.495 ± 0.037
GlyIle: 5.409 ± 0.076
GlyLys: 3.801 ± 0.064
GlyLeu: 6.43 ± 0.091
GlyMet: 2.302 ± 0.039
GlyAsn: 2.593 ± 0.052
GlyPro: 1.946 ± 0.054
GlyGln: 2.27 ± 0.045
GlyArg: 2.746 ± 0.045
GlySer: 4.267 ± 0.064
GlyThr: 4.247 ± 0.073
GlyVal: 5.517 ± 0.069
GlyTrp: 0.738 ± 0.028
GlyTyr: 2.673 ± 0.054
GlyXaa: 0.0 ± 0.0
His
HisAla: 1.628 ± 0.037
HisCys: 0.187 ± 0.012
HisAsp: 1.449 ± 0.04
HisGlu: 1.901 ± 0.043
HisPhe: 1.112 ± 0.034
HisGly: 1.654 ± 0.043
HisHis: 0.874 ± 0.034
HisIle: 1.466 ± 0.035
HisLys: 0.945 ± 0.03
HisLeu: 2.339 ± 0.048
HisMet: 0.613 ± 0.023
HisAsn: 0.866 ± 0.035
HisPro: 1.206 ± 0.035
HisGln: 1.053 ± 0.037
HisArg: 1.144 ± 0.034
HisSer: 1.518 ± 0.039
HisThr: 1.282 ± 0.034
HisVal: 2.033 ± 0.041
HisTrp: 0.29 ± 0.016
HisTyr: 1.046 ± 0.03
HisXaa: 0.0 ± 0.0
Ile
IleAla: 6.262 ± 0.082
IleCys: 0.465 ± 0.02
IleAsp: 4.376 ± 0.066
IleGlu: 5.638 ± 0.065
IlePhe: 2.646 ± 0.056
IleGly: 5.832 ± 0.087
IleHis: 1.749 ± 0.044
IleIle: 4.822 ± 0.082
IleLys: 2.878 ± 0.048
IleLeu: 5.806 ± 0.105
IleMet: 1.701 ± 0.04
IleAsn: 2.496 ± 0.054
IlePro: 3.098 ± 0.059
IleGln: 2.867 ± 0.059
IleArg: 3.04 ± 0.052
IleSer: 4.177 ± 0.066
IleThr: 4.37 ± 0.066
IleVal: 5.972 ± 0.079
IleTrp: 0.558 ± 0.021
IleTyr: 2.068 ± 0.042
IleXaa: 0.0 ± 0.0
Lys
LysAla: 4.059 ± 0.066
LysCys: 0.216 ± 0.013
LysAsp: 2.833 ± 0.05
LysGlu: 5.262 ± 0.075
LysPhe: 1.192 ± 0.03
LysGly: 3.243 ± 0.053
LysHis: 1.302 ± 0.037
LysIle: 2.705 ± 0.053
LysLys: 3.99 ± 0.071
LysLeu: 4.228 ± 0.065
LysMet: 1.492 ± 0.038
LysAsn: 2.021 ± 0.039
LysPro: 1.885 ± 0.046
LysGln: 2.818 ± 0.06
LysArg: 3.407 ± 0.064
LysSer: 2.84 ± 0.056
LysThr: 3.118 ± 0.056
LysVal: 3.224 ± 0.053
LysTrp: 0.519 ± 0.022
LysTyr: 1.293 ± 0.034
LysXaa: 0.0 ± 0.0
Leu
LeuAla: 7.302 ± 0.091
LeuCys: 0.593 ± 0.025
LeuAsp: 4.995 ± 0.078
LeuGlu: 6.723 ± 0.087
LeuPhe: 4.798 ± 0.1
LeuGly: 6.267 ± 0.085
LeuHis: 2.447 ± 0.051
LeuIle: 6.885 ± 0.112
LeuLys: 4.884 ± 0.073
LeuLeu: 10.096 ± 0.148
LeuMet: 2.589 ± 0.05
LeuAsn: 3.777 ± 0.06
LeuPro: 4.114 ± 0.063
LeuGln: 4.095 ± 0.069
LeuArg: 4.199 ± 0.06
LeuSer: 6.732 ± 0.083
LeuThr: 5.92 ± 0.072
LeuVal: 6.425 ± 0.085
LeuTrp: 0.825 ± 0.028
LeuTyr: 3.166 ± 0.057
LeuXaa: 0.0 ± 0.0
Met
MetAla: 2.081 ± 0.044
MetCys: 0.129 ± 0.011
MetAsp: 1.478 ± 0.038
MetGlu: 1.796 ± 0.037
MetPhe: 1.142 ± 0.034
MetGly: 1.628 ± 0.042
MetHis: 0.574 ± 0.025
MetIle: 2.447 ± 0.047
MetLys: 2.103 ± 0.045
MetLeu: 2.772 ± 0.056
MetMet: 1.043 ± 0.025
MetAsn: 1.714 ± 0.04
MetPro: 1.048 ± 0.028
MetGln: 1.136 ± 0.03
MetArg: 1.291 ± 0.035
MetSer: 1.966 ± 0.043
MetThr: 2.046 ± 0.041
MetVal: 1.695 ± 0.041
MetTrp: 0.201 ± 0.013
MetTyr: 0.855 ± 0.029
MetXaa: 0.0 ± 0.0
Asn
AsnAla: 2.841 ± 0.055
AsnCys: 0.239 ± 0.015
AsnAsp: 2.538 ± 0.052
AsnGlu: 3.585 ± 0.056
AsnPhe: 1.354 ± 0.037
AsnGly: 2.888 ± 0.057
AsnHis: 1.153 ± 0.03
AsnIle: 2.505 ± 0.044
AsnLys: 1.8 ± 0.044
AsnLeu: 3.184 ± 0.059
AsnMet: 1.039 ± 0.033
AsnAsn: 1.624 ± 0.046
AsnPro: 1.798 ± 0.046
AsnGln: 1.798 ± 0.042
AsnArg: 2.068 ± 0.041
AsnSer: 1.765 ± 0.04
AsnThr: 1.997 ± 0.043
AsnVal: 3.405 ± 0.065
AsnTrp: 0.452 ± 0.02
AsnTyr: 1.428 ± 0.031
AsnXaa: 0.0 ± 0.0
Pro
ProAla: 2.34 ± 0.046
ProCys: 0.163 ± 0.013
ProAsp: 2.048 ± 0.054
ProGlu: 2.977 ± 0.061
ProPhe: 2.026 ± 0.051
ProGly: 2.38 ± 0.052
ProHis: 0.882 ± 0.028
ProIle: 2.762 ± 0.055
ProLys: 1.934 ± 0.042
ProLeu: 3.732 ± 0.059
ProMet: 1.005 ± 0.033
ProAsn: 1.54 ± 0.032
ProPro: 1.072 ± 0.038
ProGln: 0.967 ± 0.026
ProArg: 1.196 ± 0.032
ProSer: 2.475 ± 0.052
ProThr: 2.209 ± 0.053
ProVal: 3.112 ± 0.124
ProTrp: 0.389 ± 0.019
ProTyr: 1.406 ± 0.036
ProXaa: 0.0 ± 0.0
Gln
GlnAla: 3.077 ± 0.051
GlnCys: 0.234 ± 0.014
GlnAsp: 1.93 ± 0.043
GlnGlu: 2.962 ± 0.054
GlnPhe: 1.754 ± 0.039
GlnGly: 2.272 ± 0.048
GlnHis: 1.007 ± 0.028
GlnIle: 2.326 ± 0.047
GlnLys: 2.258 ± 0.051
GlnLeu: 4.722 ± 0.076
GlnMet: 1.363 ± 0.034
GlnAsn: 1.453 ± 0.038
GlnPro: 1.442 ± 0.039
GlnGln: 2.185 ± 0.052
GlnArg: 1.866 ± 0.044
GlnSer: 2.73 ± 0.05
GlnThr: 2.367 ± 0.049
GlnVal: 2.705 ± 0.049
GlnTrp: 0.511 ± 0.023
GlnTyr: 1.271 ± 0.034
GlnXaa: 0.0 ± 0.0
Arg
ArgAla: 3.154 ± 0.051
ArgCys: 0.286 ± 0.015
ArgAsp: 2.36 ± 0.045
ArgGlu: 3.598 ± 0.059
ArgPhe: 2.228 ± 0.051
ArgGly: 2.663 ± 0.058
ArgHis: 1.02 ± 0.031
ArgIle: 3.217 ± 0.053
ArgLys: 2.752 ± 0.055
ArgLeu: 4.627 ± 0.078
ArgMet: 1.464 ± 0.032
ArgAsn: 1.96 ± 0.04
ArgPro: 1.462 ± 0.037
ArgGln: 1.797 ± 0.044
ArgArg: 2.163 ± 0.045
ArgSer: 2.825 ± 0.051
ArgThr: 2.518 ± 0.044
ArgVal: 3.017 ± 0.056
ArgTrp: 0.512 ± 0.023
ArgTyr: 1.769 ± 0.039
ArgXaa: 0.0 ± 0.0
Ser
SerAla: 3.999 ± 0.067
SerCys: 0.361 ± 0.02
SerAsp: 3.183 ± 0.057
SerGlu: 4.298 ± 0.062
SerPhe: 3.27 ± 0.059
SerGly: 4.551 ± 0.075
SerHis: 1.397 ± 0.033
SerIle: 4.739 ± 0.072
SerLys: 3.077 ± 0.049
SerLeu: 6.358 ± 0.081
SerMet: 2.065 ± 0.04
SerAsn: 2.319 ± 0.052
SerPro: 1.969 ± 0.048
SerGln: 2.022 ± 0.043
SerArg: 2.53 ± 0.052
SerSer: 4.111 ± 0.072
SerThr: 3.581 ± 0.059
SerVal: 4.382 ± 0.066
SerTrp: 0.623 ± 0.026
SerTyr: 2.296 ± 0.045
SerXaa: 0.0 ± 0.0
Thr
ThrAla: 4.258 ± 0.068
ThrCys: 0.373 ± 0.02
ThrAsp: 3.04 ± 0.051
ThrGlu: 3.824 ± 0.059
ThrPhe: 2.965 ± 0.044
ThrGly: 4.351 ± 0.095
ThrHis: 1.299 ± 0.034
ThrIle: 4.922 ± 0.073
ThrLys: 3.04 ± 0.053
ThrLeu: 5.951 ± 0.074
ThrMet: 1.711 ± 0.037
ThrAsn: 2.532 ± 0.049
ThrPro: 2.406 ± 0.049
ThrGln: 1.477 ± 0.035
ThrArg: 2.176 ± 0.05
ThrSer: 3.733 ± 0.058
ThrThr: 3.604 ± 0.063
ThrVal: 4.791 ± 0.071
ThrTrp: 0.605 ± 0.021
ThrTyr: 2.09 ± 0.041
ThrXaa: 0.0 ± 0.0
Val
ValAla: 5.792 ± 0.075
ValCys: 0.537 ± 0.024
ValAsp: 4.157 ± 0.075
ValGlu: 5.365 ± 0.086
ValPhe: 3.215 ± 0.053
ValGly: 5.069 ± 0.065
ValHis: 1.778 ± 0.039
ValIle: 5.814 ± 0.08
ValLys: 3.529 ± 0.062
ValLeu: 7.165 ± 0.095
ValMet: 2.192 ± 0.047
ValAsn: 3.151 ± 0.061
ValPro: 2.881 ± 0.047
ValGln: 2.855 ± 0.057
ValArg: 3.084 ± 0.053
ValSer: 4.815 ± 0.061
ValThr: 4.918 ± 0.079
ValVal: 5.819 ± 0.076
ValTrp: 0.634 ± 0.024
ValTyr: 2.371 ± 0.049
ValXaa: 0.0 ± 0.0
Trp
TrpAla: 0.596 ± 0.023
TrpCys: 0.077 ± 0.009
TrpAsp: 0.513 ± 0.022
TrpGlu: 0.585 ± 0.025
TrpPhe: 0.5 ± 0.024
TrpGly: 0.638 ± 0.023
TrpHis: 0.247 ± 0.014
TrpIle: 0.765 ± 0.029
TrpLys: 0.533 ± 0.022
TrpLeu: 1.243 ± 0.035
TrpMet: 0.386 ± 0.018
TrpAsn: 0.481 ± 0.02
TrpPro: 0.288 ± 0.018
TrpGln: 0.483 ± 0.022
TrpArg: 0.453 ± 0.02
TrpSer: 0.576 ± 0.022
TrpThr: 0.568 ± 0.023
TrpVal: 0.638 ± 0.026
TrpTrp: 0.142 ± 0.012
TrpTyr: 0.343 ± 0.018
TrpXaa: 0.0 ± 0.0
Tyr
TyrAla: 2.256 ± 0.053
TyrCys: 0.291 ± 0.015
TyrAsp: 2.317 ± 0.044
TyrGlu: 3.022 ± 0.055
TyrPhe: 1.557 ± 0.037
TyrGly: 2.517 ± 0.045
TyrHis: 0.843 ± 0.027
TyrIle: 1.896 ± 0.039
TyrLys: 1.511 ± 0.038
TyrLeu: 3.13 ± 0.053
TyrMet: 0.855 ± 0.027
TyrAsn: 1.282 ± 0.035
TyrPro: 1.289 ± 0.033
TyrGln: 1.265 ± 0.034
TyrArg: 1.734 ± 0.047
TyrSer: 1.997 ± 0.042
TyrThr: 1.913 ± 0.042
TyrVal: 2.656 ± 0.052
TyrTrp: 0.377 ± 0.02
TyrTyr: 1.362 ± 0.036
TyrXaa: 0.0 ± 0.0
Xaa
XaaAla: 0.0 ± 0.0
XaaCys: 0.0 ± 0.0
XaaAsp: 0.0 ± 0.0
XaaGlu: 0.0 ± 0.0
XaaPhe: 0.0 ± 0.0
XaaGly: 0.0 ± 0.0
XaaHis: 0.0 ± 0.0
XaaIle: 0.0 ± 0.0
XaaLys: 0.0 ± 0.0
XaaLeu: 0.0 ± 0.0
XaaMet: 0.0 ± 0.0
XaaAsn: 0.0 ± 0.0
XaaPro: 0.0 ± 0.0
XaaGln: 0.0 ± 0.0
XaaArg: 0.0 ± 0.0
XaaSer: 0.0 ± 0.0
XaaThr: 0.0 ± 0.0
XaaVal: 0.0 ± 0.0
XaaTrp: 0.0 ± 0.0
XaaTyr: 0.0 ± 0.0
XaaXaa: 0.0 ± 0.0
Statistics based on 4086 proteins (1114903 amino acids)

Note: The error has been estimated by bootstrapping (×100) at the protein level.
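A protein-level bootstrap of this kind can be sketched as follows. This is an illustration under stated assumptions rather than the database's actual procedure: whole proteins are resampled with replacement, the dipeptide frequency is recomputed for each of 100 resamples, and the standard deviation of those estimates is reported as the error. The function names, the fixed seed, and the toy sequences (one-letter codes) are hypothetical.

```python
import random
import statistics
from collections import Counter


def pair_counts(seq):
    """Count overlapping dipeptides within a single protein sequence."""
    return Counter(seq[i:i + 2] for i in range(len(seq) - 1))


def bootstrap_error(proteins, pair, n_rounds=100, seed=0):
    """Return (mean, standard deviation) of a dipeptide's per mille frequency."""
    rng = random.Random(seed)  # fixed seed only to make the sketch repeatable
    per_protein = [pair_counts(s) for s in proteins]
    estimates = []
    for _ in range(n_rounds):
        # Resample whole proteins with replacement, keeping the proteome size.
        sample = rng.choices(per_protein, k=len(per_protein))
        total = sum(sum(c.values()) for c in sample)
        hits = sum(c[pair] for c in sample)
        estimates.append(1000.0 * hits / total if total else 0.0)
    return statistics.mean(estimates), statistics.stdev(estimates)


# Hypothetical toy proteome, not real data.
toy_proteome = ["MALLE", "MLLK", "MKKLLA", "MAAA"]
mean_ll, err_ll = bootstrap_error(toy_proteome, "LL")
```

Resampling proteins rather than individual dipeptides keeps the pairs from one protein together, so the spread reflects variation between proteins.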

The dipeptide statistics above (among other statistics for this proteome) can be downloaded from this CSV file
See this proteome in: uniprot_link
Proteome-pI is available under the Creative Commons Attribution-NoDerivs license; for more details see here

Reference: Kozlowski LP. Proteome-pI 2.0: Proteome Isoelectric Point Database Update. Nucleic Acids Res. 2021, doi: 10.1093/nar/gkab944
Contact: Lukasz P. Kozlowski