Amino acid dipeptide frequency for Streptomyces sp. NEAU-C40

Apart from single amino acid frequencies, one can also calculate the so-called amino acid dipeptide frequency.

There are 400 possibilities (441 if we consider Xaa as an additional 21st amino acid representing all non-standard or unknown amino acids). Thus, if dipeptides occurred randomly in proteins, each of the 400 amino acid dipeptides should be present with a frequency of 0.25% (1/400, i.e. 2.5‰). As this is not the case in nature, for better visibility all dipeptides that occur more often than expected are marked in red, and those that are underrepresented are marked in blue in the table.
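As a minimal sketch of how such frequencies could be computed: the function name `dipeptide_frequencies` and the toy sequences below are illustrative assumptions, not the Proteome-pI implementation. The sketch counts overlapping pairs of adjacent residues within each protein and normalizes the counts to per mille. It uses one-letter codes for brevity, whereas the table uses three-letter codes.

```python
from collections import Counter


def dipeptide_frequencies(proteins):
    """Return dipeptide frequencies in per mille for a list of sequences.

    Assumption: overlapping adjacent pairs are counted within each protein
    and never across protein boundaries; the database may count differently.
    """
    counts = Counter()
    for seq in proteins:
        for i in range(len(seq) - 1):
            counts[seq[i:i + 2]] += 1
    total = sum(counts.values())
    if total == 0:
        return {}
    return {pair: 1000.0 * n / total for pair, n in counts.items()}


# Toy input for illustration only, not real proteome data.
toy = ["AAAG", "GA"]
freqs = dipeptide_frequencies(toy)

# Uniform expectation for 400 possible dipeptides, in per mille.
uniform_expectation = 1000.0 / 400
```

Here the toy input contains four dipeptides in total (AA twice, AG once, GA once), so the frequencies sum to 1000‰.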

All values are presented in per mille (‰) and therefore need to be multiplied by 10⁻³ to obtain fractions.
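The unit conversion and the comparison against the random expectation can be sketched as follows. The helper names `per_mille_to_fraction` and `classify` are hypothetical, and the sketch assumes that the threshold for over- or underrepresentation is the uniform expectation of 0.25% (2.5‰) mentioned above; the page itself may use a different criterion for its coloring. The two input values are taken from the table.

```python
# Uniform expectation for 400 dipeptides: 0.25%, i.e. 2.5 per mille.
UNIFORM_PER_MILLE = 1000.0 / 400


def per_mille_to_fraction(value):
    """Convert a per mille value to a plain fraction."""
    return value * 1e-3


def classify(value, expected=UNIFORM_PER_MILLE):
    """Label a per mille frequency relative to the assumed expectation."""
    if value > expected:
        return "over"
    if value < expected:
        return "under"
    return "as expected"


ala_ala = 21.589  # AlaAla, per mille (from the table)
cys_cys = 0.106   # CysCys, per mille (from the table)

ala_ala_fraction = per_mille_to_fraction(ala_ala)
ala_ala_label = classify(ala_ala)
cys_cys_label = classify(cys_cys)
```

Under this assumption AlaAla (about 2.16% of all dipeptides) falls well above the uniform expectation, while CysCys falls well below it.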

For more information, see the sequence space article on Wikipedia.

Ala Cys Asp Glu Phe Gly His Ile Lys Leu Met Asn Pro Gln Arg Ser Thr Val Trp Tyr Xaa
Ala
AlaAla: 21.589 ± 0.136
AlaCys: 1.122 ± 0.02
AlaAsp: 8.059 ± 0.064
AlaGlu: 8.113 ± 0.069
AlaPhe: 3.428 ± 0.037
AlaGly: 12.829 ± 0.08
AlaHis: 2.858 ± 0.033
AlaIle: 4.073 ± 0.046
AlaLys: 2.957 ± 0.038
AlaLeu: 14.091 ± 0.093
AlaMet: 2.55 ± 0.028
AlaAsn: 2.206 ± 0.03
AlaPro: 6.773 ± 0.058
AlaGln: 4.13 ± 0.044
AlaArg: 9.759 ± 0.07
AlaSer: 6.444 ± 0.061
AlaThr: 7.421 ± 0.064
AlaVal: 12.058 ± 0.073
AlaTrp: 1.855 ± 0.029
AlaTyr: 2.676 ± 0.026
AlaXaa: 0.0 ± 0.0
Cys
CysAla: 1.089 ± 0.02
CysCys: 0.106 ± 0.006
CysAsp: 0.495 ± 0.014
CysGlu: 0.436 ± 0.012
CysPhe: 0.228 ± 0.008
CysGly: 0.991 ± 0.019
CysHis: 0.234 ± 0.009
CysIle: 0.2 ± 0.008
CysLys: 0.122 ± 0.007
CysLeu: 0.759 ± 0.018
CysMet: 0.134 ± 0.007
CysAsn: 0.15 ± 0.008
CysPro: 0.559 ± 0.017
CysGln: 0.196 ± 0.008
CysArg: 0.694 ± 0.015
CysSer: 0.526 ± 0.014
CysThr: 0.566 ± 0.014
CysVal: 0.632 ± 0.016
CysTrp: 0.145 ± 0.007
CysTyr: 0.157 ± 0.008
CysXaa: 0.0 ± 0.0
Asp
AspAla: 7.28 ± 0.058
AspCys: 0.412 ± 0.013
AspAsp: 3.275 ± 0.037
AspGlu: 3.431 ± 0.041
AspPhe: 1.644 ± 0.026
AspGly: 5.769 ± 0.048
AspHis: 1.439 ± 0.022
AspIle: 2.064 ± 0.026
AspLys: 1.129 ± 0.022
AspLeu: 6.079 ± 0.052
AspMet: 0.781 ± 0.016
AspAsn: 1.055 ± 0.022
AspPro: 4.316 ± 0.043
AspGln: 1.724 ± 0.022
AspArg: 4.642 ± 0.048
AspSer: 2.707 ± 0.035
AspThr: 3.213 ± 0.034
AspVal: 4.362 ± 0.039
AspTrp: 0.98 ± 0.019
AspTyr: 1.163 ± 0.021
AspXaa: 0.0 ± 0.0
Glu
GluAla: 6.883 ± 0.059
GluCys: 0.363 ± 0.012
GluAsp: 2.559 ± 0.036
GluGlu: 3.064 ± 0.043
GluPhe: 1.356 ± 0.024
GluGly: 3.7 ± 0.043
GluHis: 1.53 ± 0.023
GluIle: 2.258 ± 0.028
GluLys: 1.214 ± 0.021
GluLeu: 6.381 ± 0.055
GluMet: 0.86 ± 0.018
GluAsn: 0.966 ± 0.018
GluPro: 3.139 ± 0.04
GluGln: 2.254 ± 0.03
GluArg: 4.937 ± 0.044
GluSer: 2.478 ± 0.032
GluThr: 2.692 ± 0.033
GluVal: 4.128 ± 0.044
GluTrp: 0.736 ± 0.015
GluTyr: 1.117 ± 0.023
GluXaa: 0.0 ± 0.0
Phe
PheAla: 3.6 ± 0.039
PheCys: 0.273 ± 0.009
PheAsp: 1.882 ± 0.024
PheGlu: 1.338 ± 0.021
PhePhe: 0.867 ± 0.019
PheGly: 2.97 ± 0.03
PheHis: 0.621 ± 0.014
PheIle: 0.888 ± 0.02
PheLys: 0.503 ± 0.014
PheLeu: 2.509 ± 0.035
PheMet: 0.388 ± 0.012
PheAsn: 0.636 ± 0.017
PhePro: 1.433 ± 0.022
PheGln: 0.721 ± 0.018
PheArg: 1.791 ± 0.022
PheSer: 1.558 ± 0.022
PheThr: 2.091 ± 0.029
PheVal: 2.058 ± 0.032
PheTrp: 0.405 ± 0.012
PheTyr: 0.603 ± 0.015
PheXaa: 0.0 ± 0.0
Gly
GlyAla: 10.386 ± 0.063
GlyCys: 0.886 ± 0.017
GlyAsp: 4.691 ± 0.038
GlyGlu: 4.572 ± 0.04
GlyPhe: 2.859 ± 0.03
GlyGly: 8.469 ± 0.082
GlyHis: 2.394 ± 0.032
GlyIle: 3.799 ± 0.042
GlyLys: 2.433 ± 0.033
GlyLeu: 9.186 ± 0.064
GlyMet: 1.921 ± 0.028
GlyAsn: 1.905 ± 0.031
GlyPro: 5.062 ± 0.046
GlyGln: 2.937 ± 0.034
GlyArg: 7.495 ± 0.06
GlySer: 5.661 ± 0.051
GlyThr: 6.547 ± 0.06
GlyVal: 7.073 ± 0.052
GlyTrp: 1.763 ± 0.023
GlyTyr: 2.358 ± 0.028
GlyXaa: 0.0 ± 0.0
His
HisAla: 2.762 ± 0.032
HisCys: 0.24 ± 0.009
HisAsp: 1.374 ± 0.024
HisGlu: 1.145 ± 0.022
HisPhe: 0.655 ± 0.016
HisGly: 2.429 ± 0.03
HisHis: 0.749 ± 0.017
HisIle: 0.816 ± 0.018
HisLys: 0.378 ± 0.012
HisLeu: 2.532 ± 0.034
HisMet: 0.345 ± 0.01
HisAsn: 0.447 ± 0.014
HisPro: 1.824 ± 0.024
HisGln: 0.781 ± 0.017
HisArg: 2.209 ± 0.031
HisSer: 1.15 ± 0.018
HisThr: 1.405 ± 0.019
HisVal: 1.665 ± 0.025
HisTrp: 0.386 ± 0.011
HisTyr: 0.514 ± 0.013
HisXaa: 0.0 ± 0.0
Ile
IleAla: 5.307 ± 0.052
IleCys: 0.335 ± 0.011
IleAsp: 2.457 ± 0.028
IleGlu: 2.072 ± 0.03
IlePhe: 0.799 ± 0.016
IleGly: 3.689 ± 0.043
IleHis: 0.743 ± 0.016
IleIle: 1.039 ± 0.02
IleLys: 0.777 ± 0.019
IleLeu: 2.688 ± 0.034
IleMet: 0.482 ± 0.013
IleAsn: 0.861 ± 0.017
IlePro: 2.017 ± 0.029
IleGln: 0.857 ± 0.019
IleArg: 2.498 ± 0.029
IleSer: 1.999 ± 0.027
IleThr: 2.594 ± 0.031
IleVal: 2.982 ± 0.034
IleTrp: 0.414 ± 0.013
IleTyr: 0.669 ± 0.014
IleXaa: 0.0 ± 0.0
Lys
LysAla: 2.918 ± 0.039
LysCys: 0.125 ± 0.007
LysAsp: 1.203 ± 0.025
LysGlu: 1.023 ± 0.021
LysPhe: 0.482 ± 0.011
LysGly: 1.713 ± 0.029
LysHis: 0.49 ± 0.016
LysIle: 0.924 ± 0.018
LysLys: 0.717 ± 0.022
LysLeu: 2.08 ± 0.027
LysMet: 0.383 ± 0.011
LysAsn: 0.565 ± 0.015
LysPro: 1.349 ± 0.023
LysGln: 0.766 ± 0.016
LysArg: 1.512 ± 0.023
LysSer: 1.248 ± 0.021
LysThr: 1.359 ± 0.024
LysVal: 1.898 ± 0.03
LysTrp: 0.294 ± 0.01
LysTyr: 0.496 ± 0.013
LysXaa: 0.0 ± 0.0
Leu
LeuAla: 15.002 ± 0.101
LeuCys: 0.863 ± 0.018
LeuAsp: 6.464 ± 0.053
LeuGlu: 4.556 ± 0.038
LeuPhe: 2.521 ± 0.034
LeuGly: 9.048 ± 0.066
LeuHis: 2.363 ± 0.028
LeuIle: 3.562 ± 0.039
LeuLys: 2.024 ± 0.033
LeuLeu: 10.861 ± 0.083
LeuMet: 1.601 ± 0.025
LeuAsn: 1.836 ± 0.025
LeuPro: 6.409 ± 0.05
LeuGln: 2.469 ± 0.027
LeuArg: 8.669 ± 0.063
LeuSer: 5.603 ± 0.044
LeuThr: 6.732 ± 0.056
LeuVal: 8.262 ± 0.064
LeuTrp: 1.301 ± 0.024
LeuTyr: 1.86 ± 0.03
LeuXaa: 0.0 ± 0.0
Met
MetAla: 2.278 ± 0.03
MetCys: 0.146 ± 0.006
MetAsp: 0.906 ± 0.018
MetGlu: 0.704 ± 0.017
MetPhe: 0.442 ± 0.011
MetGly: 1.293 ± 0.025
MetHis: 0.355 ± 0.011
MetIle: 0.654 ± 0.014
MetLys: 0.417 ± 0.011
MetLeu: 1.655 ± 0.026
MetMet: 0.31 ± 0.012
MetAsn: 0.46 ± 0.013
MetPro: 1.143 ± 0.016
MetGln: 0.475 ± 0.013
MetArg: 1.447 ± 0.023
MetSer: 1.352 ± 0.019
MetThr: 1.527 ± 0.023
MetVal: 1.278 ± 0.024
MetTrp: 0.222 ± 0.009
MetTyr: 0.331 ± 0.011
MetXaa: 0.0 ± 0.0
Asn
AsnAla: 2.487 ± 0.033
AsnCys: 0.178 ± 0.009
AsnAsp: 1.051 ± 0.017
AsnGlu: 0.872 ± 0.021
AsnPhe: 0.538 ± 0.016
AsnGly: 2.146 ± 0.036
AsnHis: 0.426 ± 0.011
AsnIle: 0.728 ± 0.017
AsnLys: 0.448 ± 0.014
AsnLeu: 1.89 ± 0.028
AsnMet: 0.321 ± 0.011
AsnAsn: 0.553 ± 0.017
AsnPro: 1.506 ± 0.026
AsnGln: 0.615 ± 0.017
AsnArg: 1.337 ± 0.022
AsnSer: 1.168 ± 0.028
AsnThr: 1.309 ± 0.027
AsnVal: 1.495 ± 0.025
AsnTrp: 0.332 ± 0.01
AsnTyr: 0.458 ± 0.014
AsnXaa: 0.0 ± 0.0
Pro
ProAla: 8.674 ± 0.064
ProCys: 0.389 ± 0.013
ProAsp: 4.309 ± 0.037
ProGlu: 3.879 ± 0.045
ProPhe: 1.515 ± 0.024
ProGly: 6.637 ± 0.052
ProHis: 1.396 ± 0.022
ProIle: 1.474 ± 0.025
ProLys: 1.208 ± 0.023
ProLeu: 5.265 ± 0.051
ProMet: 1.006 ± 0.019
ProAsn: 1.054 ± 0.022
ProPro: 3.551 ± 0.049
ProGln: 1.89 ± 0.025
ProArg: 3.902 ± 0.046
ProSer: 3.387 ± 0.036
ProThr: 3.355 ± 0.042
ProVal: 5.352 ± 0.039
ProTrp: 0.945 ± 0.019
ProTyr: 1.407 ± 0.022
ProXaa: 0.0 ± 0.0
Gln
GlnAla: 4.19 ± 0.042
GlnCys: 0.197 ± 0.007
GlnAsp: 1.476 ± 0.025
GlnGlu: 1.422 ± 0.024
GlnPhe: 0.778 ± 0.017
GlnGly: 2.347 ± 0.031
GlnHis: 0.797 ± 0.016
GlnIle: 1.248 ± 0.022
GlnLys: 0.628 ± 0.015
GlnLeu: 3.358 ± 0.034
GlnMet: 0.539 ± 0.014
GlnAsn: 0.601 ± 0.015
GlnPro: 1.914 ± 0.03
GlnGln: 1.471 ± 0.03
GlnArg: 2.592 ± 0.028
GlnSer: 1.507 ± 0.024
GlnThr: 1.566 ± 0.026
GlnVal: 2.587 ± 0.03
GlnTrp: 0.536 ± 0.016
GlnTyr: 0.677 ± 0.014
GlnXaa: 0.0 ± 0.0
Arg
ArgAla: 9.319 ± 0.069
ArgCys: 0.682 ± 0.017
ArgAsp: 4.089 ± 0.041
ArgGlu: 4.351 ± 0.054
ArgPhe: 2.229 ± 0.029
ArgGly: 5.561 ± 0.046
ArgHis: 2.244 ± 0.034
ArgIle: 3.378 ± 0.032
ArgLys: 1.645 ± 0.027
ArgLeu: 8.749 ± 0.073
ArgMet: 1.697 ± 0.024
ArgAsn: 1.463 ± 0.022
ArgPro: 4.975 ± 0.054
ArgGln: 2.472 ± 0.03
ArgArg: 7.783 ± 0.068
ArgSer: 4.23 ± 0.04
ArgThr: 5.22 ± 0.04
ArgVal: 5.517 ± 0.051
ArgTrp: 1.451 ± 0.025
ArgTyr: 1.804 ± 0.024
ArgXaa: 0.0 ± 0.0
Ser
SerAla: 7.266 ± 0.058
SerCys: 0.466 ± 0.015
SerAsp: 2.817 ± 0.031
SerGlu: 2.421 ± 0.031
SerPhe: 1.61 ± 0.027
SerGly: 6.406 ± 0.059
SerHis: 1.13 ± 0.021
SerIle: 1.715 ± 0.027
SerLys: 1.138 ± 0.021
SerLeu: 4.986 ± 0.044
SerMet: 1.11 ± 0.019
SerAsn: 1.123 ± 0.025
SerPro: 3.469 ± 0.041
SerGln: 1.477 ± 0.028
SerArg: 3.843 ± 0.041
SerSer: 3.655 ± 0.053
SerThr: 3.498 ± 0.042
SerVal: 4.467 ± 0.041
SerTrp: 1.005 ± 0.018
SerTyr: 1.369 ± 0.023
SerXaa: 0.0 ± 0.0
Thr
ThrAla: 9.335 ± 0.071
ThrCys: 0.498 ± 0.014
ThrAsp: 3.568 ± 0.043
ThrGlu: 2.999 ± 0.032
ThrPhe: 1.658 ± 0.027
ThrGly: 6.809 ± 0.054
ThrHis: 1.237 ± 0.02
ThrIle: 2.073 ± 0.027
ThrLys: 1.296 ± 0.024
ThrLeu: 5.712 ± 0.046
ThrMet: 0.997 ± 0.019
ThrAsn: 1.239 ± 0.028
ThrPro: 4.113 ± 0.047
ThrGln: 1.581 ± 0.025
ThrArg: 3.98 ± 0.039
ThrSer: 3.598 ± 0.044
ThrThr: 4.222 ± 0.053
ThrVal: 6.079 ± 0.057
ThrTrp: 0.965 ± 0.021
ThrTyr: 1.366 ± 0.024
ThrXaa: 0.0 ± 0.0
Val
ValAla: 10.18 ± 0.071
ValCys: 0.773 ± 0.017
ValAsp: 4.518 ± 0.036
ValGlu: 4.275 ± 0.037
ValPhe: 2.386 ± 0.027
ValGly: 6.279 ± 0.053
ValHis: 1.958 ± 0.026
ValIle: 3.203 ± 0.037
ValLys: 1.728 ± 0.027
ValLeu: 9.224 ± 0.063
ValMet: 1.409 ± 0.022
ValAsn: 1.792 ± 0.026
ValPro: 5.043 ± 0.049
ValGln: 2.202 ± 0.03
ValArg: 6.619 ± 0.053
ValSer: 4.561 ± 0.038
ValThr: 5.635 ± 0.049
ValVal: 7.292 ± 0.062
ValTrp: 1.132 ± 0.023
ValTyr: 1.604 ± 0.026
ValXaa: 0.0 ± 0.0
Trp
TrpAla: 1.692 ± 0.026
TrpCys: 0.156 ± 0.007
TrpAsp: 0.817 ± 0.017
TrpGlu: 0.731 ± 0.016
TrpPhe: 0.483 ± 0.013
TrpGly: 1.075 ± 0.018
TrpHis: 0.428 ± 0.012
TrpIle: 0.585 ± 0.014
TrpLys: 0.384 ± 0.012
TrpLeu: 1.794 ± 0.025
TrpMet: 0.306 ± 0.011
TrpAsn: 0.439 ± 0.013
TrpPro: 0.866 ± 0.018
TrpGln: 0.684 ± 0.015
TrpArg: 1.298 ± 0.022
TrpSer: 1.028 ± 0.019
TrpThr: 1.086 ± 0.018
TrpVal: 0.999 ± 0.017
TrpTrp: 0.347 ± 0.011
TrpTyr: 0.384 ± 0.01
TrpXaa: 0.0 ± 0.0
Tyr
TyrAla: 2.755 ± 0.033
TyrCys: 0.209 ± 0.009
TyrAsp: 1.454 ± 0.026
TyrGlu: 1.212 ± 0.022
TyrPhe: 0.656 ± 0.015
TyrGly: 2.237 ± 0.027
TyrHis: 0.416 ± 0.011
TyrIle: 0.598 ± 0.016
TyrLys: 0.4 ± 0.012
TyrLeu: 2.202 ± 0.026
TyrMet: 0.254 ± 0.01
TyrAsn: 0.506 ± 0.013
TyrPro: 1.135 ± 0.023
TyrGln: 0.729 ± 0.018
TyrArg: 1.852 ± 0.026
TyrSer: 1.109 ± 0.022
TyrThr: 1.289 ± 0.021
TyrVal: 1.628 ± 0.024
TyrTrp: 0.368 ± 0.011
TyrTyr: 0.488 ± 0.013
TyrXaa: 0.0 ± 0.0
Xaa
XaaAla: 0.0 ± 0.0
XaaCys: 0.0 ± 0.0
XaaAsp: 0.0 ± 0.0
XaaGlu: 0.0 ± 0.0
XaaPhe: 0.0 ± 0.0
XaaGly: 0.0 ± 0.0
XaaHis: 0.0 ± 0.0
XaaIle: 0.0 ± 0.0
XaaLys: 0.0 ± 0.0
XaaLeu: 0.0 ± 0.0
XaaMet: 0.0 ± 0.0
XaaAsn: 0.0 ± 0.0
XaaPro: 0.0 ± 0.0
XaaGln: 0.0 ± 0.0
XaaArg: 0.0 ± 0.0
XaaSer: 0.0 ± 0.0
XaaThr: 0.0 ± 0.0
XaaVal: 0.0 ± 0.0
XaaTrp: 0.0 ± 0.0
XaaTyr: 0.0 ± 0.0
XaaXaa: 0.001 ± 0.001
Statistics based on 9780 proteins (2970835 amino acids)

Note: The error has been estimated by bootstrapping (×100) at the protein level.
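A protein-level bootstrap of this kind can be sketched as follows. This is an illustration under stated assumptions rather than the Proteome-pI procedure: the function names `pair_counts` and `bootstrap_error`, the fixed seed, the one-letter codes, and the use of the sample standard deviation of the resampled estimates as the reported error are all hypothetical choices. Whole proteins are resampled with replacement, so the dipeptides of one protein always stay together.

```python
import random
from collections import Counter


def pair_counts(seq):
    """Count overlapping dipeptides within a single protein sequence."""
    return Counter(seq[i:i + 2] for i in range(len(seq) - 1))


def bootstrap_error(proteins, pair, n_resamples=100, seed=0):
    """Estimate the error of one dipeptide frequency (per mille).

    Whole proteins are drawn with replacement n_resamples times
    (n_resamples must be at least 2); the standard deviation of the
    resampled frequencies is returned as the error estimate.
    """
    rng = random.Random(seed)
    per_protein = [pair_counts(seq) for seq in proteins]
    estimates = []
    for _ in range(n_resamples):
        sample = rng.choices(per_protein, k=len(per_protein))
        total = sum(sum(c.values()) for c in sample)
        hits = sum(c[pair] for c in sample)
        estimates.append(1000.0 * hits / total if total else 0.0)
    mean = sum(estimates) / len(estimates)
    variance = sum((e - mean) ** 2 for e in estimates) / (len(estimates) - 1)
    return variance ** 0.5


# Toy input for illustration only, not real proteome data.
toy_error = bootstrap_error(["AAAG", "GA", "AGGA", "AAAA"], "AA")
```

Resampling at the protein level rather than at the level of individual dipeptides keeps the correlation between dipeptides of the same protein inside each resample.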

The dipeptide statistics above (among other statistics for this proteome) can be downloaded from this CSV file
See this proteome in: uniprot_link
Proteome-pI is available under the Creative Commons Attribution-NoDerivs license; for more details see here

Reference: Kozlowski LP. Proteome-pI 2.0: Proteome Isoelectric Point Database Update. Nucleic Acids Res. 2021, doi: 10.1093/nar/gkab944
Contact: Lukasz P. Kozlowski