Amino acid dipeptide frequency for Changjiang tombus-like virus 17

Apart from single amino acid frequencies, one can also calculate the so-called amino acid dipeptide frequency.

There are 400 possible dipeptides (441 if Xaa is counted as an additional 21st amino acid representing all non-standard or unknown amino acids). Thus, if dipeptides occurred randomly in proteins and all amino acids were equally frequent, each dipeptide would be present at a frequency of 0.25% (2.5‰). As this is not the case in nature, for better visibility all dipeptides that are more frequent than expected are marked in red, and those that are underrepresented are marked in blue in the table.
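The calculation described above can be sketched in Python as follows. This is a minimal illustration, not the database's actual pipeline: the function names, the use of one-letter residue codes, the counting of overlapping residue pairs within each protein, and the use of the uniform 2.5‰ value as the over/under threshold are all assumptions made for the example.

```python
from collections import Counter

# The 20 standard residues in one-letter code; anything else is mapped to X (Xaa).
STANDARD = set("ACDEFGHIKLMNPQRSTVWY")

# Uniform expectation over the 400 standard dipeptides: 1/400 = 0.25 % = 2.5 per mille.
UNIFORM_PER_MILLE = 1000.0 / 400


def dipeptide_per_mille(sequences):
    """Return {dipeptide: frequency in per mille} for a list of protein sequences."""
    counts = Counter()
    for seq in sequences:
        residues = [r if r in STANDARD else "X" for r in seq.upper()]
        # Overlapping windows of length 2; pairs never span two proteins (assumption).
        for first, second in zip(residues, residues[1:]):
            counts[first + second] += 1
    total = sum(counts.values())
    if total == 0:
        return {}
    return {pair: 1000.0 * n / total for pair, n in counts.items()}


def representation(freq_per_mille):
    """Classify a frequency relative to the uniform expectation (assumed threshold)."""
    if freq_per_mille > UNIFORM_PER_MILLE:
        return "over"
    if freq_per_mille < UNIFORM_PER_MILLE:
        return "under"
    return "as expected"


# Toy input, not sequences from this proteome.
freqs = dipeptide_per_mille(["MSLLSLA", "ASLB"])
```

With real data, `sequences` would hold the proteins of the proteome, and `representation(freqs[pair])` would give the kind of over/under label that the colouring conveys.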

All values are given in per mille (‰) and therefore need to be multiplied by 10^-3 to obtain fractions.
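As a small worked example of the unit conversion, the snippet below converts the SerLeu entry from the table (16.113‰) into a plain fraction and compares it with the uniform expectation of 1/400. The helper name is hypothetical and chosen only for this illustration.

```python
def per_mille_to_fraction(value_per_mille):
    """Convert a per mille value into a plain fraction (multiply by 10^-3)."""
    return value_per_mille * 1e-3


ser_leu = per_mille_to_fraction(16.113)  # SerLeu entry from the table
uniform = 1 / 400                        # uniform expectation, 0.25 %
ratio = ser_leu / uniform                # how many times the uniform expectation
```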

For more information, see the sequence space article on Wikipedia.

Ala Cys Asp Glu Phe Gly His Ile Lys Leu Met Asn Pro Gln Arg Ser Thr Val Trp Tyr Xaa
Ala
2.014AlaAla: 2.014 ± 1.046
1.007AlaCys: 1.007 ± 0.691
2.014AlaAsp: 2.014 ± 0.712
1.007AlaGlu: 1.007 ± 1.115
5.035AlaPhe: 5.035 ± 3.046
4.028AlaGly: 4.028 ± 1.538
0.0AlaHis: 0.0 ± 0.0
6.042AlaIle: 6.042 ± 1.809
7.049AlaLys: 7.049 ± 1.062
7.049AlaLeu: 7.049 ± 2.428
3.021AlaMet: 3.021 ± 1.988
2.014AlaAsn: 2.014 ± 0.712
4.028AlaPro: 4.028 ± 3.403
2.014AlaGln: 2.014 ± 0.712
7.049AlaArg: 7.049 ± 0.631
7.049AlaSer: 7.049 ± 1.062
4.028AlaThr: 4.028 ± 3.403
3.021AlaVal: 3.021 ± 2.552
1.007AlaTrp: 1.007 ± 0.691
0.0AlaTyr: 0.0 ± 0.0
0.0AlaXaa: 0.0 ± 0.0
Cys
0.0CysAla: 0.0 ± 0.0
0.0CysCys: 0.0 ± 0.0
0.0CysAsp: 0.0 ± 0.0
1.007CysGlu: 1.007 ± 0.691
2.014CysPhe: 2.014 ± 1.018
2.014CysGly: 2.014 ± 1.018
0.0CysHis: 0.0 ± 0.0
0.0CysIle: 0.0 ± 0.0
0.0CysLys: 0.0 ± 0.0
4.028CysLeu: 4.028 ± 1.868
0.0CysMet: 0.0 ± 0.0
0.0CysAsn: 0.0 ± 0.0
3.021CysPro: 3.021 ± 1.116
2.014CysGln: 2.014 ± 0.712
2.014CysArg: 2.014 ± 1.383
2.014CysSer: 2.014 ± 0.712
0.0CysThr: 0.0 ± 0.0
1.007CysVal: 1.007 ± 1.115
0.0CysTrp: 0.0 ± 0.0
0.0CysTyr: 0.0 ± 0.0
0.0CysXaa: 0.0 ± 0.0
Asp
7.049AspAla: 7.049 ± 2.076
1.007AspCys: 1.007 ± 1.115
3.021AspAsp: 3.021 ± 1.336
1.007AspGlu: 1.007 ± 0.851
6.042AspPhe: 6.042 ± 3.032
6.042AspGly: 6.042 ± 1.435
1.007AspHis: 1.007 ± 0.691
2.014AspIle: 2.014 ± 2.23
2.014AspLys: 2.014 ± 1.383
2.014AspLeu: 2.014 ± 1.046
2.014AspMet: 2.014 ± 1.018
1.007AspAsn: 1.007 ± 0.851
6.042AspPro: 6.042 ± 1.794
0.0AspGln: 0.0 ± 0.0
1.007AspArg: 1.007 ± 0.691
3.021AspSer: 3.021 ± 1.336
3.021AspThr: 3.021 ± 1.116
5.035AspVal: 5.035 ± 1.447
0.0AspTrp: 0.0 ± 0.0
1.007AspTyr: 1.007 ± 0.851
0.0AspXaa: 0.0 ± 0.0
Glu
1.007GluAla: 1.007 ± 0.691
1.007GluCys: 1.007 ± 0.691
0.0GluAsp: 0.0 ± 0.0
2.014GluGlu: 2.014 ± 1.383
2.014GluPhe: 2.014 ± 1.383
5.035GluGly: 5.035 ± 0.392
3.021GluHis: 3.021 ± 1.336
3.021GluIle: 3.021 ± 1.408
3.021GluLys: 3.021 ± 2.074
3.021GluLeu: 3.021 ± 1.408
2.014GluMet: 2.014 ± 0.712
3.021GluAsn: 3.021 ± 3.346
0.0GluPro: 0.0 ± 0.0
2.014GluGln: 2.014 ± 1.383
1.007GluArg: 1.007 ± 0.691
3.021GluSer: 3.021 ± 0.438
2.014GluThr: 2.014 ± 0.712
5.035GluVal: 5.035 ± 2.999
2.014GluTrp: 2.014 ± 1.383
3.021GluTyr: 3.021 ± 1.408
0.0GluXaa: 0.0 ± 0.0
Phe
1.007PheAla: 1.007 ± 0.851
2.014PheCys: 2.014 ± 1.018
8.056PheAsp: 8.056 ± 1.991
2.014PheGlu: 2.014 ± 1.046
4.028PhePhe: 4.028 ± 2.765
0.0PheGly: 0.0 ± 0.0
1.007PheHis: 1.007 ± 0.851
2.014PheIle: 2.014 ± 1.383
1.007PheLys: 1.007 ± 0.851
3.021PheLeu: 3.021 ± 1.336
0.0PheMet: 0.0 ± 0.589
2.014PheAsn: 2.014 ± 1.701
2.014PhePro: 2.014 ± 1.701
0.0PheGln: 0.0 ± 0.0
7.049PheArg: 7.049 ± 1.827
5.035PheSer: 5.035 ± 1.12
4.028PheThr: 4.028 ± 0.892
3.021PheVal: 3.021 ± 1.408
1.007PheTrp: 1.007 ± 0.691
2.014PheTyr: 2.014 ± 1.383
0.0PheXaa: 0.0 ± 0.0
Gly
4.028GlyAla: 4.028 ± 1.538
2.014GlyCys: 2.014 ± 1.383
3.021GlyAsp: 3.021 ± 2.074
4.028GlyGlu: 4.028 ± 1.714
5.035GlyPhe: 5.035 ± 0.392
3.021GlyGly: 3.021 ± 1.116
1.007GlyHis: 1.007 ± 0.691
4.028GlyIle: 4.028 ± 2.765
4.028GlyLys: 4.028 ± 0.496
6.042GlyLeu: 6.042 ± 5.239
1.007GlyMet: 1.007 ± 0.691
4.028GlyAsn: 4.028 ± 2.092
3.021GlyPro: 3.021 ± 1.116
1.007GlyGln: 1.007 ± 1.115
4.028GlyArg: 4.028 ± 3.101
6.042GlySer: 6.042 ± 1.228
6.042GlyThr: 6.042 ± 3.093
3.021GlyVal: 3.021 ± 1.116
1.007GlyTrp: 1.007 ± 1.115
1.007GlyTyr: 1.007 ± 0.851
0.0GlyXaa: 0.0 ± 0.0
His
0.0HisAla: 0.0 ± 0.0
0.0HisCys: 0.0 ± 0.0
0.0HisAsp: 0.0 ± 0.0
0.0HisGlu: 0.0 ± 0.0
3.021HisPhe: 3.021 ± 1.336
0.0HisGly: 0.0 ± 0.0
0.0HisHis: 0.0 ± 0.0
2.014HisIle: 2.014 ± 1.018
0.0HisLys: 0.0 ± 0.0
2.014HisLeu: 2.014 ± 1.018
0.0HisMet: 0.0 ± 0.0
0.0HisAsn: 0.0 ± 0.0
0.0HisPro: 0.0 ± 0.0
0.0HisGln: 0.0 ± 0.0
1.007HisArg: 1.007 ± 1.115
0.0HisSer: 0.0 ± 0.0
1.007HisThr: 1.007 ± 0.851
1.007HisVal: 1.007 ± 0.691
0.0HisTrp: 0.0 ± 0.0
1.007HisTyr: 1.007 ± 0.851
0.0HisXaa: 0.0 ± 0.0
Ile
8.056IleAla: 8.056 ± 3.076
2.014IleCys: 2.014 ± 1.383
4.028IleAsp: 4.028 ± 1.714
5.035IleGlu: 5.035 ± 1.12
1.007IlePhe: 1.007 ± 0.851
0.0IleGly: 0.0 ± 0.0
0.0IleHis: 0.0 ± 0.0
2.014IleIle: 2.014 ± 1.046
4.028IleLys: 4.028 ± 2.765
7.049IleLeu: 7.049 ± 2.313
1.007IleMet: 1.007 ± 0.851
3.021IleAsn: 3.021 ± 1.116
2.014IlePro: 2.014 ± 1.701
1.007IleGln: 1.007 ± 0.691
3.021IleArg: 3.021 ± 0.438
11.078IleSer: 11.078 ± 2.869
2.014IleThr: 2.014 ± 1.018
0.0IleVal: 0.0 ± 0.0
0.0IleTrp: 0.0 ± 0.0
0.0IleTyr: 0.0 ± 0.0
0.0IleXaa: 0.0 ± 0.0
Lys
3.021LysAla: 3.021 ± 1.336
1.007LysCys: 1.007 ± 1.115
2.014LysAsp: 2.014 ± 1.046
1.007LysGlu: 1.007 ± 0.691
3.021LysPhe: 3.021 ± 1.116
4.028LysGly: 4.028 ± 2.765
0.0LysHis: 0.0 ± 0.0
5.035LysIle: 5.035 ± 1.317
5.035LysLys: 5.035 ± 1.739
2.014LysLeu: 2.014 ± 0.712
5.035LysMet: 5.035 ± 1.12
2.014LysAsn: 2.014 ± 0.712
2.014LysPro: 2.014 ± 1.383
4.028LysGln: 4.028 ± 0.496
3.021LysArg: 3.021 ± 2.074
0.0LysSer: 0.0 ± 0.0
4.028LysThr: 4.028 ± 2.266
5.035LysVal: 5.035 ± 0.392
0.0LysTrp: 0.0 ± 0.0
2.014LysTyr: 2.014 ± 0.712
0.0LysXaa: 0.0 ± 0.0
Leu
5.035LeuAla: 5.035 ± 3.046
1.007LeuCys: 1.007 ± 1.115
4.028LeuAsp: 4.028 ± 1.423
6.042LeuGlu: 6.042 ± 1.435
3.021LeuPhe: 3.021 ± 0.438
12.085LeuGly: 12.085 ± 4.779
2.014LeuHis: 2.014 ± 2.23
3.021LeuIle: 3.021 ± 0.438
7.049LeuLys: 7.049 ± 2.313
13.092LeuLeu: 13.092 ± 3.212
3.021LeuMet: 3.021 ± 1.116
5.035LeuAsn: 5.035 ± 0.392
4.028LeuPro: 4.028 ± 1.714
3.021LeuGln: 3.021 ± 1.336
6.042LeuArg: 6.042 ± 2.425
11.078LeuSer: 11.078 ± 1.046
12.085LeuThr: 12.085 ± 4.882
6.042LeuVal: 6.042 ± 3.054
1.007LeuTrp: 1.007 ± 0.691
3.021LeuTyr: 3.021 ± 1.116
0.0LeuXaa: 0.0 ± 0.0
Met
2.014MetAla: 2.014 ± 2.23
0.0MetCys: 0.0 ± 0.0
1.007MetAsp: 1.007 ± 0.691
4.028MetGlu: 4.028 ± 1.714
1.007MetPhe: 1.007 ± 0.691
0.0MetGly: 0.0 ± 0.0
0.0MetHis: 0.0 ± 0.0
4.028MetIle: 4.028 ± 3.403
3.021MetLys: 3.021 ± 1.408
2.014MetLeu: 2.014 ± 1.383
0.0MetMet: 0.0 ± 0.0
0.0MetAsn: 0.0 ± 0.0
0.0MetPro: 0.0 ± 0.0
0.0MetGln: 0.0 ± 0.0
4.028MetArg: 4.028 ± 3.049
6.042MetSer: 6.042 ± 3.125
0.0MetThr: 0.0 ± 0.0
0.0MetVal: 0.0 ± 0.0
0.0MetTrp: 0.0 ± 0.0
2.014MetTyr: 2.014 ± 0.712
0.0MetXaa: 0.0 ± 0.0
Asn
4.028AsnAla: 4.028 ± 2.266
1.007AsnCys: 1.007 ± 0.691
1.007AsnAsp: 1.007 ± 0.691
1.007AsnGlu: 1.007 ± 0.691
0.0AsnPhe: 0.0 ± 0.0
2.014AsnGly: 2.014 ± 0.712
0.0AsnHis: 0.0 ± 0.0
2.014AsnIle: 2.014 ± 1.046
4.028AsnLys: 4.028 ± 1.538
6.042AsnLeu: 6.042 ± 3.054
3.021AsnMet: 3.021 ± 1.697
2.014AsnAsn: 2.014 ± 1.046
3.021AsnPro: 3.021 ± 1.408
0.0AsnGln: 0.0 ± 0.0
3.021AsnArg: 3.021 ± 1.988
4.028AsnSer: 4.028 ± 0.892
2.014AsnThr: 2.014 ± 1.701
3.021AsnVal: 3.021 ± 1.547
1.007AsnTrp: 1.007 ± 0.691
0.0AsnTyr: 0.0 ± 0.0
0.0AsnXaa: 0.0 ± 0.0
Pro
3.021ProAla: 3.021 ± 1.408
0.0ProCys: 0.0 ± 0.0
2.014ProAsp: 2.014 ± 1.018
3.021ProGlu: 3.021 ± 1.116
1.007ProPhe: 1.007 ± 0.691
4.028ProGly: 4.028 ± 1.423
0.0ProHis: 0.0 ± 0.0
5.035ProIle: 5.035 ± 3.456
0.0ProLys: 0.0 ± 0.0
3.021ProLeu: 3.021 ± 1.336
1.007ProMet: 1.007 ± 0.691
0.0ProAsn: 0.0 ± 0.0
4.028ProPro: 4.028 ± 3.403
1.007ProGln: 1.007 ± 0.851
2.014ProArg: 2.014 ± 0.712
7.049ProSer: 7.049 ± 3.36
4.028ProThr: 4.028 ± 3.403
5.035ProVal: 5.035 ± 1.687
2.014ProTrp: 2.014 ± 1.701
2.014ProTyr: 2.014 ± 1.383
0.0ProXaa: 0.0 ± 0.0
Gln
1.007GlnAla: 1.007 ± 0.691
0.0GlnCys: 0.0 ± 0.0
2.014GlnAsp: 2.014 ± 1.701
0.0GlnGlu: 0.0 ± 0.0
2.014GlnPhe: 2.014 ± 1.383
1.007GlnGly: 1.007 ± 0.691
1.007GlnHis: 1.007 ± 0.691
5.035GlnIle: 5.035 ± 1.447
0.0GlnLys: 0.0 ± 0.0
6.042GlnLeu: 6.042 ± 1.435
0.0GlnMet: 0.0 ± 0.0
1.007GlnAsn: 1.007 ± 0.691
1.007GlnPro: 1.007 ± 0.851
0.0GlnGln: 0.0 ± 0.0
1.007GlnArg: 1.007 ± 0.691
0.0GlnSer: 0.0 ± 0.0
0.0GlnThr: 0.0 ± 0.0
1.007GlnVal: 1.007 ± 0.691
0.0GlnTrp: 0.0 ± 0.0
0.0GlnTyr: 0.0 ± 0.0
0.0GlnXaa: 0.0 ± 0.0
Arg
7.049ArgAla: 7.049 ± 1.062
1.007ArgCys: 1.007 ± 0.691
5.035ArgAsp: 5.035 ± 2.999
3.021ArgGlu: 3.021 ± 2.074
4.028ArgPhe: 4.028 ± 0.892
5.035ArgGly: 5.035 ± 2.273
1.007ArgHis: 1.007 ± 1.115
2.014ArgIle: 2.014 ± 0.712
3.021ArgLys: 3.021 ± 1.336
10.07ArgLeu: 10.07 ± 2.653
1.007ArgMet: 1.007 ± 0.691
6.042ArgAsn: 6.042 ± 2.425
0.0ArgPro: 0.0 ± 0.0
1.007ArgGln: 1.007 ± 0.691
9.063ArgArg: 9.063 ± 5.004
6.042ArgSer: 6.042 ± 3.054
1.007ArgThr: 1.007 ± 0.691
2.014ArgVal: 2.014 ± 1.018
0.0ArgTrp: 0.0 ± 0.0
2.014ArgTyr: 2.014 ± 0.712
0.0ArgXaa: 0.0 ± 0.0
Ser
9.063SerAla: 9.063 ± 5.051
3.021SerCys: 3.021 ± 1.116
5.035SerAsp: 5.035 ± 2.651
1.007SerGlu: 1.007 ± 0.691
3.021SerPhe: 3.021 ± 2.021
9.063SerGly: 9.063 ± 2.778
1.007SerHis: 1.007 ± 0.851
3.021SerIle: 3.021 ± 2.074
3.021SerLys: 3.021 ± 1.408
16.113SerLeu: 16.113 ± 1.982
3.021SerMet: 3.021 ± 1.193
4.028SerAsn: 4.028 ± 1.423
3.021SerPro: 3.021 ± 1.408
2.014SerGln: 2.014 ± 1.018
5.035SerArg: 5.035 ± 1.739
7.049SerSer: 7.049 ± 1.827
7.049SerThr: 7.049 ± 3.613
4.028SerVal: 4.028 ± 2.036
1.007SerTrp: 1.007 ± 1.115
6.042SerTyr: 6.042 ± 0.684
0.0SerXaa: 0.0 ± 0.0
Thr
5.035ThrAla: 5.035 ± 3.046
0.0ThrCys: 0.0 ± 0.0
3.021ThrAsp: 3.021 ± 0.438
4.028ThrGlu: 4.028 ± 2.266
5.035ThrPhe: 5.035 ± 3.054
3.021ThrGly: 3.021 ± 2.552
0.0ThrHis: 0.0 ± 0.0
1.007ThrIle: 1.007 ± 0.691
1.007ThrLys: 1.007 ± 0.851
8.056ThrLeu: 8.056 ± 1.783
0.0ThrMet: 0.0 ± 0.0
4.028ThrAsn: 4.028 ± 3.049
7.049ThrPro: 7.049 ± 2.746
0.0ThrGln: 0.0 ± 0.0
4.028ThrArg: 4.028 ± 0.496
6.042ThrSer: 6.042 ± 2.518
5.035ThrThr: 5.035 ± 3.046
3.021ThrVal: 3.021 ± 1.547
1.007ThrTrp: 1.007 ± 0.851
2.014ThrTyr: 2.014 ± 1.701
0.0ThrXaa: 0.0 ± 0.0
Val
2.014ValAla: 2.014 ± 0.712
0.0ValCys: 0.0 ± 0.0
6.042ValAsp: 6.042 ± 0.877
3.021ValGlu: 3.021 ± 0.438
1.007ValPhe: 1.007 ± 0.691
3.021ValGly: 3.021 ± 1.547
0.0ValHis: 0.0 ± 0.0
3.021ValIle: 3.021 ± 1.408
4.028ValLys: 4.028 ± 1.868
6.042ValLeu: 6.042 ± 1.809
1.007ValMet: 1.007 ± 0.691
2.014ValAsn: 2.014 ± 1.018
3.021ValPro: 3.021 ± 3.346
1.007ValGln: 1.007 ± 0.851
2.014ValArg: 2.014 ± 1.018
8.056ValSer: 8.056 ± 1.799
3.021ValThr: 3.021 ± 0.438
2.014ValVal: 2.014 ± 1.046
0.0ValTrp: 0.0 ± 0.0
2.014ValTyr: 2.014 ± 2.23
0.0ValXaa: 0.0 ± 0.0
Trp
1.007TrpAla: 1.007 ± 0.851
0.0TrpCys: 0.0 ± 0.0
0.0TrpAsp: 0.0 ± 0.0
2.014TrpGlu: 2.014 ± 1.018
0.0TrpPhe: 0.0 ± 0.0
1.007TrpGly: 1.007 ± 0.691
0.0TrpHis: 0.0 ± 0.0
1.007TrpIle: 1.007 ± 1.115
2.014TrpLys: 2.014 ± 0.712
2.014TrpLeu: 2.014 ± 0.712
1.007TrpMet: 1.007 ± 0.691
0.0TrpAsn: 0.0 ± 0.0
0.0TrpPro: 0.0 ± 0.0
0.0TrpGln: 0.0 ± 0.0
0.0TrpArg: 0.0 ± 0.0
1.007TrpSer: 1.007 ± 0.691
0.0TrpThr: 0.0 ± 0.0
0.0TrpVal: 0.0 ± 0.0
1.007TrpTrp: 1.007 ± 0.691
0.0TrpTyr: 0.0 ± 0.0
0.0TrpXaa: 0.0 ± 0.0
Tyr
3.021TyrAla: 3.021 ± 1.408
3.021TyrCys: 3.021 ± 1.408
2.014TyrAsp: 2.014 ± 0.712
2.014TyrGlu: 2.014 ± 1.018
0.0TyrPhe: 0.0 ± 0.0
2.014TyrGly: 2.014 ± 1.701
0.0TyrHis: 0.0 ± 0.0
1.007TyrIle: 1.007 ± 0.691
0.0TyrLys: 0.0 ± 0.0
2.014TyrLeu: 2.014 ± 1.018
1.007TyrMet: 1.007 ± 0.851
1.007TyrAsn: 1.007 ± 0.851
2.014TyrPro: 2.014 ± 1.383
2.014TyrGln: 2.014 ± 0.712
4.028TyrArg: 4.028 ± 1.423
2.014TyrSer: 2.014 ± 1.018
2.014TyrThr: 2.014 ± 0.712
0.0TyrVal: 0.0 ± 0.0
0.0TyrTrp: 0.0 ± 0.0
0.0TyrTyr: 0.0 ± 0.0
0.0TyrXaa: 0.0 ± 0.0
Xaa
0.0XaaAla: 0.0 ± 0.0
0.0XaaCys: 0.0 ± 0.0
0.0XaaAsp: 0.0 ± 0.0
0.0XaaGlu: 0.0 ± 0.0
0.0XaaPhe: 0.0 ± 0.0
0.0XaaGly: 0.0 ± 0.0
0.0XaaHis: 0.0 ± 0.0
0.0XaaIle: 0.0 ± 0.0
0.0XaaLys: 0.0 ± 0.0
0.0XaaLeu: 0.0 ± 0.0
0.0XaaMet: 0.0 ± 0.0
0.0XaaAsn: 0.0 ± 0.0
0.0XaaPro: 0.0 ± 0.0
0.0XaaGln: 0.0 ± 0.0
0.0XaaArg: 0.0 ± 0.0
0.0XaaSer: 0.0 ± 0.0
0.0XaaThr: 0.0 ± 0.0
0.0XaaVal: 0.0 ± 0.0
0.0XaaTrp: 0.0 ± 0.0
0.0XaaTyr: 0.0 ± 0.0
0.0XaaXaa: 0.0 ± 0.0
Statistics based on 3 proteins (994 amino acids)

Note: The error has been estimated by bootstrapping (100 resamples) at the protein level.
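A protein-level bootstrap of this kind could look roughly like the sketch below: whole proteins are resampled with replacement, the dipeptide frequency is recomputed for each resample, and the spread of the replicates is reported. The function names, the fixed seed, and the choice of the sample standard deviation as the reported error are assumptions for illustration; the page does not state which statistic is used.

```python
import random
import statistics
from collections import Counter


def pair_per_mille(sequences, pair):
    """Frequency (per mille) of one dipeptide among all overlapping residue pairs."""
    counts = Counter(a + b for seq in sequences for a, b in zip(seq, seq[1:]))
    total = sum(counts.values())
    return 1000.0 * counts[pair] / total if total else 0.0


def bootstrap_error(sequences, pair, n_resamples=100, seed=0):
    """Estimate the error of a dipeptide frequency by resampling whole proteins."""
    rng = random.Random(seed)  # fixed seed only to make the sketch reproducible
    replicates = []
    for _ in range(n_resamples):
        # Draw as many proteins as in the original set, with replacement.
        sample = rng.choices(sequences, k=len(sequences))
        replicates.append(pair_per_mille(sample, pair))
    # Assumption: the error is the standard deviation across replicates.
    return statistics.stdev(replicates)
```

Under this scheme a dipeptide that occurs in none of the proteins has a frequency of 0 in every resample, so its estimated error is also 0.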

The dipeptide statistics above (among other statistics for this proteome) can be downloaded from this CSV file
See this proteome in: uniprot_link
Proteome-pI is available under a Creative Commons Attribution-NoDerivs license; for more details see here

Reference: Kozlowski LP. Proteome-pI 2.0: Proteome Isoelectric Point Database Update. Nucleic Acids Res. 2021, doi: 10.1093/nar/gkab944
Contact: Lukasz P. Kozlowski