Amino acid dipeptide frequency for Clostridium sp. CAG:678

Apart from single amino acid frequencies, one can also calculate the so-called amino acid dipeptide frequencies.
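
As an illustration of what a dipeptide frequency is, here is a minimal sketch of one way it could be computed from protein sequences. It is not the Proteome-pI implementation. It assumes one-letter amino acid codes, overlapping windows of two residues that never span two proteins, and that any non-standard letter is counted as X (Xaa). The function name dipeptide_per_mille and the toy sequences are hypothetical.

```python
from collections import Counter

# The 20 standard amino acids in one-letter code.
STANDARD = set("ACDEFGHIKLMNPQRSTVWY")

def dipeptide_per_mille(sequences):
    """Return {dipeptide: frequency in per mille} over all sequences."""
    counts = Counter()
    for seq in sequences:
        # Assumption: anything outside the 20 standard residues becomes X (Xaa).
        cleaned = "".join(c if c in STANDARD else "X" for c in seq.upper())
        # Overlapping windows of length 2, counted within each protein only.
        for i in range(len(cleaned) - 1):
            counts[cleaned[i:i + 2]] += 1
    total = sum(counts.values())
    if total == 0:
        return {}
    return {pair: 1000.0 * n / total for pair, n in counts.items()}

# Toy input (not real proteome data): "U" is non-standard and is counted as X.
freqs = dipeptide_per_mille(["MAAK", "AAU"])
```

With the toy input there are five dipeptide windows in total, two of which are AA, so the frequencies sum to 1000‰.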

There are 400 possible dipeptides (441 if Xaa is counted as an additional 21st amino acid representing all non-standard or unknown amino acids). Thus, if dipeptides occurred randomly in proteins, with all amino acids equally likely, each dipeptide would be present at a frequency of 0.25% (2.5‰). As this is not the case in nature, for better visibility all overrepresented dipeptides are marked in red and all underrepresented ones in blue in the table.
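
The expected value can be worked out directly. The sketch below assumes the uniform expectation over the 400 standard dipeptides and a simple above/below comparison. The exact threshold the page uses for its red and blue marking is not stated, so the classify helper is hypothetical. The four observed values are copied from the table.

```python
N_STANDARD = 20

# Uniform expectation: 1 / 400 of all dipeptides, expressed in per mille.
expected_per_mille = 1000.0 / (N_STANDARD ** 2)

# A few values copied from the table (per mille).
observed = {"AlaAla": 8.323, "LeuLeu": 7.426, "ProPro": 0.98, "TrpTrp": 0.109}

def classify(value, expected=expected_per_mille):
    """Hypothetical rule: compare a value against the uniform expectation."""
    if value > expected:
        return "over"
    if value < expected:
        return "under"
    return "as expected"

labels = {name: classify(value) for name, value in observed.items()}
```

Under this assumption AlaAla and LeuLeu come out as overrepresented, and ProPro and TrpTrp as underrepresented.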

All values are given in per mille (‰) and must therefore be multiplied by 10⁻³ to obtain fractions.
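
A short sketch of the unit conversion, using the AlaAla value from the table. The helper names are hypothetical.

```python
def per_mille_to_fraction(value):
    """Convert a per mille value to a plain fraction (multiply by 10^-3)."""
    return value * 1e-3

def per_mille_to_percent(value):
    """Convert a per mille value to percent (divide by 10)."""
    return value / 10.0

ala_ala = 8.323  # AlaAla, per mille, from the table
ala_ala_fraction = per_mille_to_fraction(ala_ala)
ala_ala_percent = per_mille_to_percent(ala_ala)
```

For AlaAla this gives a fraction of about 0.008323, or about 0.8323%.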

For more information, see the sequence space article on Wikipedia.

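Columns (second residue): Ala Cys Asp Glu Phe Gly His Ile Lys Leu Met Asn Pro Gln Arg Ser Thr Val Trp Tyr Xaa
Rows (first residue) follow, one dipeptide per line: value, then dipeptide name: value ± error (‰).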
Ala
8.323AlaAla: 8.323 ± 0.162
1.405AlaCys: 1.405 ± 0.047
5.343AlaAsp: 5.343 ± 0.099
5.566AlaGlu: 5.566 ± 0.102
3.814AlaPhe: 3.814 ± 0.088
5.786AlaGly: 5.786 ± 0.105
1.213AlaHis: 1.213 ± 0.041
4.451AlaIle: 4.451 ± 0.1
5.305AlaLys: 5.305 ± 0.108
7.251AlaLeu: 7.251 ± 0.13
1.997AlaMet: 1.997 ± 0.054
3.101AlaAsn: 3.101 ± 0.075
2.296AlaPro: 2.296 ± 0.056
3.088AlaGln: 3.088 ± 0.061
2.936AlaArg: 2.936 ± 0.067
3.809AlaSer: 3.809 ± 0.089
3.178AlaThr: 3.178 ± 0.093
7.595AlaVal: 7.595 ± 0.125
0.582AlaTrp: 0.582 ± 0.034
2.938AlaTyr: 2.938 ± 0.081
0.0AlaXaa: 0.0 ± 0.0
Cys
1.764CysAla: 1.764 ± 0.053
0.304CysCys: 0.304 ± 0.023
1.151CysAsp: 1.151 ± 0.045
1.169CysGlu: 1.169 ± 0.045
0.734CysPhe: 0.734 ± 0.031
1.669CysGly: 1.669 ± 0.062
0.293CysHis: 0.293 ± 0.023
1.09CysIle: 1.09 ± 0.038
0.989CysLys: 0.989 ± 0.041
1.243CysLeu: 1.243 ± 0.039
0.334CysMet: 0.334 ± 0.023
0.666CysAsn: 0.666 ± 0.034
0.68CysPro: 0.68 ± 0.033
0.337CysGln: 0.337 ± 0.022
0.823CysArg: 0.823 ± 0.037
1.117CysSer: 1.117 ± 0.04
0.797CysThr: 0.797 ± 0.036
1.294CysVal: 1.294 ± 0.043
0.112CysTrp: 0.112 ± 0.016
0.609CysTyr: 0.609 ± 0.031
0.0CysXaa: 0.0 ± 0.0
Asp
4.411AspAla: 4.411 ± 0.09
1.094AspCys: 1.094 ± 0.039
3.151AspAsp: 3.151 ± 0.084
4.653AspGlu: 4.653 ± 0.103
3.407AspPhe: 3.407 ± 0.069
4.525AspGly: 4.525 ± 0.108
0.956AspHis: 0.956 ± 0.043
4.84AspIle: 4.84 ± 0.083
3.628AspLys: 3.628 ± 0.085
5.33AspLeu: 5.33 ± 0.1
1.702AspMet: 1.702 ± 0.049
2.463AspAsn: 2.463 ± 0.069
2.047AspPro: 2.047 ± 0.062
1.325AspGln: 1.325 ± 0.046
2.216AspArg: 2.216 ± 0.061
3.57AspSer: 3.57 ± 0.076
3.34AspThr: 3.34 ± 0.068
3.699AspVal: 3.699 ± 0.097
0.526AspTrp: 0.526 ± 0.025
3.083AspTyr: 3.083 ± 0.068
0.0AspXaa: 0.0 ± 0.0
Glu
3.817GluAla: 3.817 ± 0.082
0.908GluCys: 0.908 ± 0.039
3.752GluAsp: 3.752 ± 0.072
5.103GluGlu: 5.103 ± 0.104
2.718GluPhe: 2.718 ± 0.059
3.121GluGly: 3.121 ± 0.065
1.074GluHis: 1.074 ± 0.044
5.921GluIle: 5.921 ± 0.111
6.471GluLys: 6.471 ± 0.118
6.046GluLeu: 6.046 ± 0.108
2.056GluMet: 2.056 ± 0.058
5.107GluAsn: 5.107 ± 0.095
1.82GluPro: 1.82 ± 0.052
2.299GluGln: 2.299 ± 0.066
2.801GluArg: 2.801 ± 0.076
3.752GluSer: 3.752 ± 0.085
4.008GluThr: 4.008 ± 0.08
3.539GluVal: 3.539 ± 0.093
0.628GluTrp: 0.628 ± 0.031
3.309GluTyr: 3.309 ± 0.076
0.0GluXaa: 0.0 ± 0.0
Phe
3.638PheAla: 3.638 ± 0.087
0.895PheCys: 0.895 ± 0.041
3.087PheAsp: 3.087 ± 0.07
3.14PheGlu: 3.14 ± 0.074
2.004PhePhe: 2.004 ± 0.065
3.324PheGly: 3.324 ± 0.07
0.682PheHis: 0.682 ± 0.031
3.137PheIle: 3.137 ± 0.083
2.865PheLys: 2.865 ± 0.077
3.647PheLeu: 3.647 ± 0.075
1.021PheMet: 1.021 ± 0.041
2.115PheAsn: 2.115 ± 0.053
1.352PhePro: 1.352 ± 0.048
1.038PheGln: 1.038 ± 0.038
1.645PheArg: 1.645 ± 0.053
3.637PheSer: 3.637 ± 0.081
2.735PheThr: 2.735 ± 0.073
2.86PheVal: 2.86 ± 0.064
0.372PheTrp: 0.372 ± 0.025
1.911PheTyr: 1.911 ± 0.06
0.0PheXaa: 0.0 ± 0.0
Gly
5.65GlyAla: 5.65 ± 0.126
1.26GlyCys: 1.26 ± 0.045
3.635GlyAsp: 3.635 ± 0.086
4.492GlyGlu: 4.492 ± 0.091
3.164GlyPhe: 3.164 ± 0.075
4.971GlyGly: 4.971 ± 0.101
1.05GlyHis: 1.05 ± 0.047
5.748GlyIle: 5.748 ± 0.089
5.229GlyLys: 5.229 ± 0.093
4.951GlyLeu: 4.951 ± 0.1
2.004GlyMet: 2.004 ± 0.062
3.358GlyAsn: 3.358 ± 0.089
1.145GlyPro: 1.145 ± 0.049
1.516GlyGln: 1.516 ± 0.049
2.975GlyArg: 2.975 ± 0.07
4.304GlySer: 4.304 ± 0.08
4.144GlyThr: 4.144 ± 0.095
5.116GlyVal: 5.116 ± 0.088
0.617GlyTrp: 0.617 ± 0.034
3.059GlyTyr: 3.059 ± 0.083
0.001GlyXaa: 0.001 ± 0.001
His
1.127HisAla: 1.127 ± 0.04
0.372HisCys: 0.372 ± 0.023
0.901HisAsp: 0.901 ± 0.036
0.841HisGlu: 0.841 ± 0.036
0.837HisPhe: 0.837 ± 0.035
1.222HisGly: 1.222 ± 0.044
0.386HisHis: 0.386 ± 0.031
1.429HisIle: 1.429 ± 0.045
0.992HisLys: 0.992 ± 0.041
1.243HisLeu: 1.243 ± 0.048
0.365HisMet: 0.365 ± 0.024
0.825HisAsn: 0.825 ± 0.035
0.801HisPro: 0.801 ± 0.035
0.412HisGln: 0.412 ± 0.023
0.72HisArg: 0.72 ± 0.035
1.108HisSer: 1.108 ± 0.046
0.935HisThr: 0.935 ± 0.038
0.805HisVal: 0.805 ± 0.031
0.178HisTrp: 0.178 ± 0.017
0.727HisTyr: 0.727 ± 0.031
0.0HisXaa: 0.0 ± 0.0
Ile
5.819IleAla: 5.819 ± 0.096
1.449IleCys: 1.449 ± 0.047
4.617IleAsp: 4.617 ± 0.087
4.782IleGlu: 4.782 ± 0.083
2.885IlePhe: 2.885 ± 0.072
4.795IleGly: 4.795 ± 0.097
1.165IleHis: 1.165 ± 0.046
5.128IleIle: 5.128 ± 0.107
5.299IleLys: 5.299 ± 0.096
6.214IleLeu: 6.214 ± 0.106
1.676IleMet: 1.676 ± 0.05
3.688IleAsn: 3.688 ± 0.074
2.846IlePro: 2.846 ± 0.072
1.847IleGln: 1.847 ± 0.058
3.319IleArg: 3.319 ± 0.081
5.346IleSer: 5.346 ± 0.097
4.59IleThr: 4.59 ± 0.099
4.685IleVal: 4.685 ± 0.084
0.553IleTrp: 0.553 ± 0.031
2.726IleTyr: 2.726 ± 0.056
0.0IleXaa: 0.0 ± 0.0
Lys
5.512LysAla: 5.512 ± 0.1
0.928LysCys: 0.928 ± 0.039
3.965LysAsp: 3.965 ± 0.09
5.768LysGlu: 5.768 ± 0.113
2.348LysPhe: 2.348 ± 0.066
4.039LysGly: 4.039 ± 0.089
1.122LysHis: 1.122 ± 0.038
5.597LysIle: 5.597 ± 0.089
6.616LysLys: 6.616 ± 0.137
5.384LysLeu: 5.384 ± 0.087
2.039LysMet: 2.039 ± 0.057
4.491LysAsn: 4.491 ± 0.09
2.111LysPro: 2.111 ± 0.059
2.233LysGln: 2.233 ± 0.061
3.239LysArg: 3.239 ± 0.086
3.968LysSer: 3.968 ± 0.083
4.515LysThr: 4.515 ± 0.07
3.762LysVal: 3.762 ± 0.069
0.582LysTrp: 0.582 ± 0.029
3.211LysTyr: 3.211 ± 0.077
0.0LysXaa: 0.0 ± 0.0
Leu
6.393LeuAla: 6.393 ± 0.112
1.689LeuCys: 1.689 ± 0.05
5.053LeuAsp: 5.053 ± 0.078
5.217LeuGlu: 5.217 ± 0.087
4.174LeuPhe: 4.174 ± 0.106
5.378LeuGly: 5.378 ± 0.094
1.379LeuHis: 1.379 ± 0.05
6.09LeuIle: 6.09 ± 0.112
6.281LeuLys: 6.281 ± 0.103
7.426LeuLeu: 7.426 ± 0.161
2.122LeuMet: 2.122 ± 0.062
4.235LeuAsn: 4.235 ± 0.075
3.37LeuPro: 3.37 ± 0.071
2.343LeuGln: 2.343 ± 0.055
3.408LeuArg: 3.408 ± 0.071
6.44LeuSer: 6.44 ± 0.114
4.582LeuThr: 4.582 ± 0.087
4.753LeuVal: 4.753 ± 0.093
0.696LeuTrp: 0.696 ± 0.031
3.293LeuTyr: 3.293 ± 0.078
0.0LeuXaa: 0.0 ± 0.0
Met
2.073MetAla: 2.073 ± 0.054
0.389MetCys: 0.389 ± 0.027
1.441MetAsp: 1.441 ± 0.046
1.506MetGlu: 1.506 ± 0.039
1.02MetPhe: 1.02 ± 0.041
1.638MetGly: 1.638 ± 0.057
0.415MetHis: 0.415 ± 0.022
1.851MetIle: 1.851 ± 0.06
2.191MetLys: 2.191 ± 0.06
2.381MetLeu: 2.381 ± 0.061
0.666MetMet: 0.666 ± 0.034
1.443MetAsn: 1.443 ± 0.051
1.156MetPro: 1.156 ± 0.046
0.898MetGln: 0.898 ± 0.029
1.183MetArg: 1.183 ± 0.042
1.418MetSer: 1.418 ± 0.05
1.35MetThr: 1.35 ± 0.046
1.402MetVal: 1.402 ± 0.048
0.213MetTrp: 0.213 ± 0.022
0.835MetTyr: 0.835 ± 0.041
0.0MetXaa: 0.0 ± 0.0
Asn
4.242AsnAla: 4.242 ± 0.086
0.918AsnCys: 0.918 ± 0.042
2.813AsnAsp: 2.813 ± 0.062
3.391AsnGlu: 3.391 ± 0.072
1.871AsnPhe: 1.871 ± 0.055
4.731AsnGly: 4.731 ± 0.108
0.841AsnHis: 0.841 ± 0.038
4.019AsnIle: 4.019 ± 0.094
3.1AsnLys: 3.1 ± 0.064
3.85AsnLeu: 3.85 ± 0.096
1.313AsnMet: 1.313 ± 0.047
2.405AsnAsn: 2.405 ± 0.072
2.006AsnPro: 2.006 ± 0.059
1.347AsnGln: 1.347 ± 0.046
2.159AsnArg: 2.159 ± 0.059
3.107AsnSer: 3.107 ± 0.09
2.864AsnThr: 2.864 ± 0.069
3.037AsnVal: 3.037 ± 0.075
0.467AsnTrp: 0.467 ± 0.025
2.152AsnTyr: 2.152 ± 0.058
0.0AsnXaa: 0.0 ± 0.0
Pro
2.809ProAla: 2.809 ± 0.065
0.544ProCys: 0.544 ± 0.029
2.317ProAsp: 2.317 ± 0.068
2.749ProGlu: 2.749 ± 0.065
1.631ProPhe: 1.631 ± 0.056
1.999ProGly: 1.999 ± 0.06
0.554ProHis: 0.554 ± 0.026
1.882ProIle: 1.882 ± 0.054
2.076ProLys: 2.076 ± 0.062
2.76ProLeu: 2.76 ± 0.065
0.72ProMet: 0.72 ± 0.03
1.551ProAsn: 1.551 ± 0.045
0.98ProPro: 0.98 ± 0.044
1.364ProGln: 1.364 ± 0.042
0.993ProArg: 0.993 ± 0.036
1.956ProSer: 1.956 ± 0.049
1.577ProThr: 1.577 ± 0.065
2.87ProVal: 2.87 ± 0.065
0.284ProTrp: 0.284 ± 0.022
1.483ProTyr: 1.483 ± 0.05
0.0ProXaa: 0.0 ± 0.0
Gln
2.168GlnAla: 2.168 ± 0.06
0.395GlnCys: 0.395 ± 0.026
1.48GlnAsp: 1.48 ± 0.051
1.953GlnGlu: 1.953 ± 0.057
1.237GlnPhe: 1.237 ± 0.045
1.655GlnGly: 1.655 ± 0.051
0.483GlnHis: 0.483 ± 0.025
2.445GlnIle: 2.445 ± 0.06
2.468GlnLys: 2.468 ± 0.056
2.304GlnLeu: 2.304 ± 0.054
0.859GlnMet: 0.859 ± 0.037
2.078GlnAsn: 2.078 ± 0.059
0.913GlnPro: 0.913 ± 0.035
1.199GlnGln: 1.199 ± 0.052
1.264GlnArg: 1.264 ± 0.048
1.823GlnSer: 1.823 ± 0.063
1.668GlnThr: 1.668 ± 0.052
1.504GlnVal: 1.504 ± 0.049
0.284GlnTrp: 0.284 ± 0.022
1.335GlnTyr: 1.335 ± 0.049
0.0GlnXaa: 0.0 ± 0.0
Arg
2.971ArgAla: 2.971 ± 0.069
0.605ArgCys: 0.605 ± 0.032
2.122ArgAsp: 2.122 ± 0.05
2.962ArgGlu: 2.962 ± 0.076
2.097ArgPhe: 2.097 ± 0.067
2.301ArgGly: 2.301 ± 0.064
0.771ArgHis: 0.771 ± 0.034
3.313ArgIle: 3.313 ± 0.069
3.184ArgLys: 3.184 ± 0.073
3.758ArgLeu: 3.758 ± 0.084
1.098ArgMet: 1.098 ± 0.037
2.074ArgAsn: 2.074 ± 0.061
1.259ArgPro: 1.259 ± 0.044
1.385ArgGln: 1.385 ± 0.049
2.053ArgArg: 2.053 ± 0.065
2.277ArgSer: 2.277 ± 0.057
1.986ArgThr: 1.986 ± 0.056
2.577ArgVal: 2.577 ± 0.069
0.34ArgTrp: 0.34 ± 0.022
1.837ArgTyr: 1.837 ± 0.055
0.0ArgXaa: 0.0 ± 0.0
Ser
5.678SerAla: 5.678 ± 0.113
0.877SerCys: 0.877 ± 0.033
3.999SerAsp: 3.999 ± 0.088
4.259SerGlu: 4.259 ± 0.094
2.909SerPhe: 2.909 ± 0.075
5.546SerGly: 5.546 ± 0.095
0.989SerHis: 0.989 ± 0.036
3.91SerIle: 3.91 ± 0.074
4.077SerLys: 4.077 ± 0.082
5.283SerLeu: 5.283 ± 0.107
1.466SerMet: 1.466 ± 0.046
2.752SerAsn: 2.752 ± 0.082
1.871SerPro: 1.871 ± 0.059
1.702SerGln: 1.702 ± 0.05
2.711SerArg: 2.711 ± 0.067
3.959SerSer: 3.959 ± 0.101
2.909SerThr: 2.909 ± 0.077
5.33SerVal: 5.33 ± 0.104
0.516SerTrp: 0.516 ± 0.03
2.419SerTyr: 2.419 ± 0.068
0.0SerXaa: 0.0 ± 0.0
Thr
5.363ThrAla: 5.363 ± 0.103
0.717ThrCys: 0.717 ± 0.039
3.864ThrAsp: 3.864 ± 0.079
3.613ThrGlu: 3.613 ± 0.075
2.333ThrPhe: 2.333 ± 0.056
4.462ThrGly: 4.462 ± 0.102
0.878ThrHis: 0.878 ± 0.034
3.517ThrIle: 3.517 ± 0.08
2.952ThrLys: 2.952 ± 0.07
4.57ThrLeu: 4.57 ± 0.089
1.128ThrMet: 1.128 ± 0.037
2.212ThrAsn: 2.212 ± 0.079
2.333ThrPro: 2.333 ± 0.065
1.76ThrGln: 1.76 ± 0.057
1.763ThrArg: 1.763 ± 0.049
2.999ThrSer: 2.999 ± 0.067
2.703ThrThr: 2.703 ± 0.089
5.121ThrVal: 5.121 ± 0.122
0.376ThrTrp: 0.376 ± 0.026
2.255ThrTyr: 2.255 ± 0.075
0.0ThrXaa: 0.0 ± 0.0
Val
4.654ValAla: 4.654 ± 0.092
1.479ValCys: 1.479 ± 0.049
3.847ValAsp: 3.847 ± 0.092
3.712ValGlu: 3.712 ± 0.078
3.473ValPhe: 3.473 ± 0.08
3.587ValGly: 3.587 ± 0.085
1.033ValHis: 1.033 ± 0.041
5.333ValIle: 5.333 ± 0.104
4.415ValLys: 4.415 ± 0.089
6.552ValLeu: 6.552 ± 0.114
1.726ValMet: 1.726 ± 0.053
3.267ValAsn: 3.267 ± 0.075
2.52ValPro: 2.52 ± 0.062
1.824ValGln: 1.824 ± 0.049
2.651ValArg: 2.651 ± 0.064
5.185ValSer: 5.185 ± 0.096
4.042ValThr: 4.042 ± 0.1
4.304ValVal: 4.304 ± 0.09
0.607ValTrp: 0.607 ± 0.032
2.858ValTyr: 2.858 ± 0.066
0.0ValXaa: 0.0 ± 0.0
Trp
0.56TrpAla: 0.56 ± 0.031
0.185TrpCys: 0.185 ± 0.016
0.563TrpAsp: 0.563 ± 0.033
0.543TrpGlu: 0.543 ± 0.03
0.365TrpPhe: 0.365 ± 0.022
0.624TrpGly: 0.624 ± 0.034
0.185TrpHis: 0.185 ± 0.017
0.611TrpIle: 0.611 ± 0.029
0.573TrpLys: 0.573 ± 0.032
0.771TrpLeu: 0.771 ± 0.036
0.199TrpMet: 0.199 ± 0.017
0.523TrpAsn: 0.523 ± 0.028
0.125TrpPro: 0.125 ± 0.013
0.374TrpGln: 0.374 ± 0.023
0.34TrpArg: 0.34 ± 0.024
0.521TrpSer: 0.521 ± 0.03
0.379TrpThr: 0.379 ± 0.025
0.439TrpVal: 0.439 ± 0.026
0.109TrpTrp: 0.109 ± 0.013
0.402TrpTyr: 0.402 ± 0.027
0.0TrpXaa: 0.0 ± 0.0
Tyr
3.007TyrAla: 3.007 ± 0.072
0.763TyrCys: 0.763 ± 0.031
2.865TyrAsp: 2.865 ± 0.076
2.831TyrGlu: 2.831 ± 0.068
2.084TyrPhe: 2.084 ± 0.052
3.016TyrGly: 3.016 ± 0.064
0.757TyrHis: 0.757 ± 0.033
3.115TyrIle: 3.115 ± 0.066
2.72TyrLys: 2.72 ± 0.076
3.356TyrLeu: 3.356 ± 0.069
0.975TyrMet: 0.975 ± 0.037
2.276TyrAsn: 2.276 ± 0.061
1.468TyrPro: 1.468 ± 0.046
1.227TyrGln: 1.227 ± 0.043
1.77TyrArg: 1.77 ± 0.053
2.867TyrSer: 2.867 ± 0.06
2.621TyrThr: 2.621 ± 0.069
2.377TyrVal: 2.377 ± 0.051
0.347TyrTrp: 0.347 ± 0.025
2.061TyrTyr: 2.061 ± 0.063
0.0TyrXaa: 0.0 ± 0.0
Xaa
0.0XaaAla: 0.0 ± 0.0
0.0XaaCys: 0.0 ± 0.0
0.0XaaAsp: 0.0 ± 0.0
0.0XaaGlu: 0.0 ± 0.0
0.0XaaPhe: 0.0 ± 0.0
0.0XaaGly: 0.0 ± 0.0
0.0XaaHis: 0.0 ± 0.0
0.0XaaIle: 0.0 ± 0.0
0.0XaaLys: 0.0 ± 0.0
0.001XaaLeu: 0.001 ± 0.001
0.0XaaMet: 0.0 ± 0.0
0.0XaaAsn: 0.0 ± 0.0
0.0XaaPro: 0.0 ± 0.0
0.0XaaGln: 0.0 ± 0.0
0.0XaaArg: 0.0 ± 0.0
0.0XaaSer: 0.0 ± 0.0
0.0XaaThr: 0.0 ± 0.0
0.0XaaVal: 0.0 ± 0.0
0.0XaaTrp: 0.0 ± 0.0
0.0XaaTyr: 0.0 ± 0.0
0.0XaaXaa: 0.0 ± 0.0
Statistics based on 2250 proteins (703918 amino acids)

Note: The error was estimated by bootstrapping (×100) at the protein level.
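
A minimal sketch of what bootstrapping at the protein level could look like. It is an assumption about the general technique, not the Proteome-pI code. Whole proteins are resampled with replacement 100 times, the dipeptide frequency is recomputed for each resample, and the standard deviation of those estimates is taken as the error. Whether the page reports a standard deviation is not stated. The function names and toy sequences are hypothetical.

```python
import random
import statistics

def dipeptide_per_mille(proteins, pair):
    """Frequency of one dipeptide (per mille) over overlapping windows."""
    hits = 0
    total = 0
    for seq in proteins:
        for i in range(len(seq) - 1):
            total += 1
            if seq[i:i + 2] == pair:
                hits += 1
    return 1000.0 * hits / total if total else 0.0

def bootstrap_error(proteins, pair, n_rounds=100, seed=0):
    """Standard deviation of the frequency over protein-level resamples."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    estimates = []
    for _ in range(n_rounds):
        # Resample whole proteins with replacement, keeping the sample size.
        sample = rng.choices(proteins, k=len(proteins))
        estimates.append(dipeptide_per_mille(sample, pair))
    return statistics.stdev(estimates)

# Toy proteins (not real proteome data).
toy = ["MAAK", "AAGL", "MKKL", "GAAA"]
err = bootstrap_error(toy, "AA")
```

Resampling proteins rather than individual residues keeps each sequence intact, so dipeptides are never formed across protein boundaries.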

The dipeptide statistics above (along with other statistics for this proteome) can be downloaded from this CSV file
See this proteome in: uniprot_link
Proteome-pI is available under a Creative Commons Attribution-NoDerivs license; for more details see here

Reference: Kozlowski LP. Proteome-pI 2.0: Proteome Isoelectric Point Database Update. Nucleic Acids Res. 2021, doi: 10.1093/nar/gkab944
Contact: Lukasz P. Kozlowski