Amino acid dipeptide frequency for Torque teno midi virus

Apart from single amino acid frequencies, one can also calculate the so-called amino acid dipeptide frequency, i.e. how often each ordered pair of adjacent amino acids occurs.

There are 400 possible dipeptides (441 if we consider Xaa as an additional 21st amino acid representing all non-standard or unknown amino acids). Thus, if dipeptides occurred randomly in proteins, each of the 400 would be present at a frequency of 0.25% (2.5‰). As this is not the case in nature, for better visibility all dipeptides that are more frequent than expected are marked in red, and those that are underrepresented are marked in blue in the table.
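
The calculation described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the Proteome-pI implementation: the function names and the toy sequences are hypothetical, any residue outside the 20 standard one-letter codes is mapped to Xaa, and pairs are counted within each protein only.

```python
from collections import Counter
from itertools import product

# Three-letter codes in the order used by the table; Xaa is the catch-all.
ONE_TO_THREE = {
    "A": "Ala", "C": "Cys", "D": "Asp", "E": "Glu", "F": "Phe", "G": "Gly",
    "H": "His", "I": "Ile", "K": "Lys", "L": "Leu", "M": "Met", "N": "Asn",
    "P": "Pro", "Q": "Gln", "R": "Arg", "S": "Ser", "T": "Thr", "V": "Val",
    "W": "Trp", "Y": "Tyr",
}


def three_letter(residue):
    """Map a one-letter code to its three-letter code, or Xaa if non-standard."""
    return ONE_TO_THREE.get(residue.upper(), "Xaa")


def dipeptide_frequencies(sequences):
    """Return {dipeptide: frequency in per mille} over overlapping residue pairs.

    Pairs are counted within each sequence only (an assumption of this sketch),
    so no dipeptide spans the boundary between two proteins.
    """
    counts = Counter()
    for seq in sequences:
        for first, second in zip(seq, seq[1:]):
            counts[three_letter(first) + three_letter(second)] += 1
    total = sum(counts.values())
    labels = list(ONE_TO_THREE.values()) + ["Xaa"]
    return {
        a + b: (1000.0 * counts[a + b] / total if total else 0.0)
        for a, b in product(labels, repeat=2)
    }


# Hypothetical toy sequences, not the real Torque teno midi virus proteins.
toy = ["MAAKRRA", "AAXK"]
freqs = dipeptide_frequencies(toy)
print(len(freqs))                 # 441 possible dipeptides, including Xaa
print(round(freqs["AlaAla"], 3))  # AlaAla frequency in per mille
print(1000.0 / 400)               # uniform expectation: 2.5 per mille (0.25%)
```

The uniform expectation of 0.25% assumes that all 20 standard amino acids are equally frequent; a stricter baseline would multiply the observed single amino acid frequencies instead.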

All values are given in per mille (‰), so they need to be multiplied by 10^-3 to obtain fractions.
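
As a small worked example of the unit conversion, the sketch below turns per mille values into fractions and compares them with the uniform expectation of 2.5‰. The helper names are hypothetical, and the assumption that the page colors cells strictly against this 2.5‰ baseline is ours; the page only says "more than expected" and "underrepresented".

```python
EXPECTED_PER_MILLE = 1000.0 / 400  # 2.5 per mille, i.e. 0.25%


def per_mille_to_fraction(value):
    """Convert a per mille value to a plain fraction (multiply by 10^-3)."""
    return value * 1e-3


def representation(value, expected=EXPECTED_PER_MILLE):
    """Classify a per mille frequency relative to the uniform expectation."""
    if value > expected:
        return "over"
    if value < expected:
        return "under"
    return "expected"


# Values taken from the table below.
for name, value in [("AlaAla", 16.692), ("AlaCys", 0.759), ("GluGlu", 18.209)]:
    print(name, per_mille_to_fraction(value), representation(value))
```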

For more information, see the sequence space article on Wikipedia.

Ala Cys Asp Glu Phe Gly His Ile Lys Leu Met Asn Pro Gln Arg Ser Thr Val Trp Tyr Xaa
Ala
AlaAla: 16.692 ± 10.18
AlaCys: 0.759 ± 0.465
AlaAsp: 4.552 ± 3.129
AlaGlu: 3.794 ± 0.744
AlaPhe: 1.517 ± 0.738
AlaGly: 0.0 ± 0.0
AlaHis: 4.552 ± 1.216
AlaIle: 1.517 ± 0.637
AlaLys: 6.07 ± 2.828
AlaLeu: 2.276 ± 1.394
AlaMet: 0.0 ± 0.0
AlaAsn: 0.0 ± 0.0
AlaPro: 3.035 ± 1.475
AlaGln: 0.759 ± 0.864
AlaArg: 2.276 ± 0.879
AlaSer: 8.346 ± 1.862
AlaThr: 5.311 ± 1.99
AlaVal: 0.0 ± 0.0
AlaTrp: 0.0 ± 0.0
AlaTyr: 0.759 ± 0.465
AlaXaa: 0.0 ± 0.0
Cys
CysAla: 3.035 ± 2.06
CysCys: 0.759 ± 0.465
CysAsp: 3.035 ± 1.133
CysGlu: 1.517 ± 0.93
CysPhe: 0.0 ± 0.0
CysGly: 0.759 ± 0.465
CysHis: 2.276 ± 1.564
CysIle: 0.759 ± 0.465
CysLys: 1.517 ± 0.93
CysLeu: 1.517 ± 0.93
CysMet: 0.0 ± 0.0
CysAsn: 3.035 ± 1.133
CysPro: 0.759 ± 0.465
CysGln: 1.517 ± 0.637
CysArg: 0.0 ± 0.0
CysSer: 0.759 ± 0.773
CysThr: 0.0 ± 0.0
CysVal: 0.0 ± 0.0
CysTrp: 3.035 ± 1.133
CysTyr: 0.0 ± 0.0
CysXaa: 0.0 ± 0.0
Asp
AspAla: 2.276 ± 1.564
AspCys: 0.0 ± 0.0
AspAsp: 1.517 ± 0.93
AspGlu: 0.0 ± 0.0
AspPhe: 6.07 ± 0.977
AspGly: 0.0 ± 0.0
AspHis: 0.0 ± 0.0
AspIle: 2.276 ± 1.394
AspLys: 2.276 ± 1.394
AspLeu: 3.035 ± 2.06
AspMet: 0.759 ± 0.685
AspAsn: 0.759 ± 0.465
AspPro: 4.552 ± 0.505
AspGln: 0.759 ± 0.465
AspArg: 3.035 ± 2.06
AspSer: 7.587 ± 4.255
AspThr: 6.07 ± 1.68
AspVal: 0.759 ± 0.465
AspTrp: 0.759 ± 0.465
AspTyr: 3.794 ± 0.744
AspXaa: 0.0 ± 0.0
Glu
GluAla: 0.759 ± 0.465
GluCys: 2.276 ± 0.622
GluAsp: 5.311 ± 2.692
GluGlu: 18.209 ± 9.639
GluPhe: 0.0 ± 0.0
GluGly: 6.829 ± 2.684
GluHis: 0.0 ± 0.0
GluIle: 7.587 ± 2.246
GluLys: 3.794 ± 2.252
GluLeu: 2.276 ± 0.622
GluMet: 0.759 ± 0.839
GluAsn: 0.759 ± 0.465
GluPro: 0.0 ± 0.0
GluGln: 0.759 ± 0.465
GluArg: 2.276 ± 1.564
GluSer: 3.035 ± 1.859
GluThr: 3.035 ± 1.197
GluVal: 2.276 ± 1.394
GluTrp: 0.0 ± 0.0
GluTyr: 3.035 ± 1.859
GluXaa: 0.0 ± 0.0
Phe
PheAla: 3.035 ± 2.06
PheCys: 0.0 ± 0.0
PheAsp: 2.276 ± 1.394
PheGlu: 1.517 ± 0.738
PhePhe: 4.552 ± 0.505
PheGly: 1.517 ± 0.93
PheHis: 0.0 ± 0.0
PheIle: 0.0 ± 0.0
PheLys: 3.794 ± 1.555
PheLeu: 4.552 ± 2.789
PheMet: 0.0 ± 0.0
PheAsn: 3.794 ± 1.631
PhePro: 2.276 ± 1.564
PheGln: 0.759 ± 0.465
PheArg: 2.276 ± 0.879
PheSer: 1.517 ± 0.637
PheThr: 4.552 ± 0.505
PheVal: 2.276 ± 1.394
PheTrp: 2.276 ± 1.394
PheTyr: 3.035 ± 1.197
PheXaa: 0.0 ± 0.0
Gly
GlyAla: 3.035 ± 1.133
GlyCys: 3.035 ± 1.859
GlyAsp: 0.759 ± 0.465
GlyGlu: 4.552 ± 0.505
GlyPhe: 3.794 ± 1.589
GlyGly: 6.829 ± 0.811
GlyHis: 5.311 ± 2.692
GlyIle: 0.0 ± 0.0
GlyLys: 0.759 ± 0.465
GlyLeu: 1.517 ± 0.637
GlyMet: 3.035 ± 2.06
GlyAsn: 1.517 ± 0.93
GlyPro: 1.517 ± 0.93
GlyGln: 1.517 ± 0.93
GlyArg: 2.276 ± 0.879
GlySer: 0.0 ± 0.0
GlyThr: 6.07 ± 3.118
GlyVal: 3.035 ± 1.149
GlyTrp: 0.759 ± 0.465
GlyTyr: 0.759 ± 0.465
GlyXaa: 0.0 ± 0.0
His
HisAla: 3.035 ± 1.133
HisCys: 0.0 ± 0.0
HisAsp: 2.276 ± 1.564
HisGlu: 0.0 ± 0.0
HisPhe: 0.759 ± 0.465
HisGly: 3.035 ± 1.133
HisHis: 0.0 ± 0.0
HisIle: 0.0 ± 0.0
HisLys: 2.276 ± 1.338
HisLeu: 1.517 ± 0.637
HisMet: 3.794 ± 1.617
HisAsn: 0.759 ± 0.465
HisPro: 4.552 ± 1.229
HisGln: 1.517 ± 1.545
HisArg: 0.0 ± 0.0
HisSer: 5.311 ± 3.559
HisThr: 3.035 ± 1.133
HisVal: 0.0 ± 0.0
HisTrp: 0.0 ± 0.0
HisTyr: 0.759 ± 0.465
HisXaa: 0.0 ± 0.0
Ile
IleAla: 3.035 ± 1.133
IleCys: 5.311 ± 2.692
IleAsp: 0.759 ± 0.465
IleGlu: 3.035 ± 2.06
IlePhe: 5.311 ± 0.89
IleGly: 1.517 ± 0.93
IleHis: 2.276 ± 1.338
IleIle: 3.035 ± 1.133
IleLys: 6.829 ± 1.397
IleLeu: 3.035 ± 1.133
IleMet: 0.0 ± 0.0
IleAsn: 2.276 ± 1.394
IlePro: 0.759 ± 0.465
IleGln: 3.794 ± 1.375
IleArg: 0.0 ± 0.0
IleSer: 3.794 ± 1.617
IleThr: 2.276 ± 1.394
IleVal: 1.517 ± 0.93
IleTrp: 0.0 ± 0.0
IleTyr: 0.0 ± 0.0
IleXaa: 0.0 ± 0.0
Lys
LysAla: 7.587 ± 1.36
LysCys: 1.517 ± 0.93
LysAsp: 5.311 ± 0.89
LysGlu: 6.07 ± 2.828
LysPhe: 1.517 ± 0.738
LysGly: 1.517 ± 0.93
LysHis: 3.035 ± 1.275
LysIle: 3.794 ± 2.324
LysLys: 10.622 ± 4.252
LysLeu: 8.346 ± 1.318
LysMet: 0.0 ± 0.0
LysAsn: 4.552 ± 1.216
LysPro: 6.829 ± 1.061
LysGln: 6.829 ± 1.679
LysArg: 9.863 ± 2.636
LysSer: 2.276 ± 1.338
LysThr: 3.794 ± 0.68
LysVal: 1.517 ± 0.738
LysTrp: 0.759 ± 0.465
LysTyr: 0.0 ± 0.0
LysXaa: 0.0 ± 0.0
Leu
LeuAla: 4.552 ± 3.129
LeuCys: 0.759 ± 0.465
LeuAsp: 0.0 ± 0.0
LeuGlu: 3.794 ± 1.589
LeuPhe: 0.759 ± 0.864
LeuGly: 1.517 ± 0.93
LeuHis: 1.517 ± 0.637
LeuIle: 1.517 ± 0.637
LeuLys: 6.07 ± 2.887
LeuLeu: 8.346 ± 1.318
LeuMet: 1.517 ± 1.545
LeuAsn: 4.552 ± 2.013
LeuPro: 6.07 ± 0.711
LeuGln: 8.346 ± 2.129
LeuArg: 1.517 ± 0.637
LeuSer: 6.829 ± 1.679
LeuThr: 2.276 ± 1.394
LeuVal: 2.276 ± 1.394
LeuTrp: 3.794 ± 0.744
LeuTyr: 2.276 ± 1.394
LeuXaa: 0.0 ± 0.0
Met
MetAla: 1.517 ± 0.93
MetCys: 0.0 ± 0.0
MetAsp: 0.0 ± 0.0
MetGlu: 2.276 ± 1.564
MetPhe: 1.517 ± 0.93
MetGly: 0.0 ± 0.0
MetHis: 0.0 ± 0.0
MetIle: 0.759 ± 0.773
MetLys: 0.0 ± 0.0
MetLeu: 2.276 ± 1.564
MetMet: 0.759 ± 0.773
MetAsn: 0.759 ± 0.465
MetPro: 0.759 ± 0.465
MetGln: 2.276 ± 1.564
MetArg: 0.0 ± 0.0
MetSer: 4.552 ± 1.229
MetThr: 1.517 ± 0.637
MetVal: 0.0 ± 0.0
MetTrp: 0.0 ± 0.0
MetTyr: 0.759 ± 0.773
MetXaa: 0.0 ± 0.0
Asn
AsnAla: 0.0 ± 0.0
AsnCys: 5.311 ± 2.692
AsnAsp: 0.0 ± 0.0
AsnGlu: 3.035 ± 1.859
AsnPhe: 3.794 ± 1.631
AsnGly: 0.759 ± 0.465
AsnHis: 3.035 ± 1.133
AsnIle: 3.794 ± 0.744
AsnLys: 1.517 ± 1.545
AsnLeu: 4.552 ± 1.072
AsnMet: 0.0 ± 0.0
AsnAsn: 1.517 ± 0.93
AsnPro: 4.552 ± 1.072
AsnGln: 6.07 ± 0.706
AsnArg: 0.0 ± 0.0
AsnSer: 1.517 ± 0.93
AsnThr: 1.517 ± 0.738
AsnVal: 1.517 ± 0.637
AsnTrp: 0.0 ± 0.0
AsnTyr: 4.552 ± 1.991
AsnXaa: 0.0 ± 0.0
Pro
ProAla: 1.517 ± 0.738
ProCys: 0.0 ± 0.0
ProAsp: 1.517 ± 0.93
ProGlu: 3.794 ± 0.744
ProPhe: 6.07 ± 0.981
ProGly: 4.552 ± 1.909
ProHis: 0.759 ± 0.465
ProIle: 3.035 ± 1.133
ProLys: 6.07 ± 1.437
ProLeu: 3.035 ± 1.475
ProMet: 0.0 ± 0.0
ProAsn: 2.276 ± 1.394
ProPro: 7.587 ± 2.414
ProGln: 6.829 ± 1.832
ProArg: 5.311 ± 3.918
ProSer: 1.517 ± 0.93
ProThr: 5.311 ± 1.445
ProVal: 2.276 ± 0.805
ProTrp: 1.517 ± 0.738
ProTyr: 3.794 ± 2.324
ProXaa: 0.0 ± 0.0
Gln
GlnAla: 3.035 ± 2.384
GlnCys: 0.0 ± 0.0
GlnAsp: 1.517 ± 0.93
GlnGlu: 3.794 ± 2.324
GlnPhe: 2.276 ± 1.394
GlnGly: 0.0 ± 0.0
GlnHis: 0.759 ± 0.773
GlnIle: 1.517 ± 0.637
GlnLys: 6.829 ± 1.51
GlnLeu: 5.311 ± 0.879
GlnMet: 0.0 ± 0.0
GlnAsn: 6.07 ± 1.576
GlnPro: 4.552 ± 1.758
GlnGln: 8.346 ± 4.258
GlnArg: 4.552 ± 1.216
GlnSer: 0.759 ± 0.864
GlnThr: 3.794 ± 0.68
GlnVal: 1.517 ± 0.93
GlnTrp: 1.517 ± 0.93
GlnTyr: 2.276 ± 0.805
GlnXaa: 0.0 ± 0.0
Arg
ArgAla: 2.276 ± 0.879
ArgCys: 2.276 ± 1.564
ArgAsp: 3.794 ± 2.744
ArgGlu: 2.276 ± 1.564
ArgPhe: 3.035 ± 1.859
ArgGly: 1.517 ± 0.738
ArgHis: 0.759 ± 0.465
ArgIle: 1.517 ± 0.637
ArgLys: 4.552 ± 1.912
ArgLeu: 1.517 ± 0.93
ArgMet: 1.517 ± 0.716
ArgAsn: 5.311 ± 2.483
ArgPro: 1.517 ± 0.738
ArgGln: 1.517 ± 0.93
ArgArg: 15.933 ± 8.875
ArgSer: 1.517 ± 0.999
ArgThr: 0.759 ± 0.864
ArgVal: 1.517 ± 1.729
ArgTrp: 0.759 ± 0.465
ArgTyr: 5.311 ± 1.505
ArgXaa: 0.0 ± 0.0
Ser
SerAla: 2.276 ± 1.338
SerCys: 0.759 ± 0.465
SerAsp: 4.552 ± 0.505
SerGlu: 2.276 ± 0.805
SerPhe: 1.517 ± 0.93
SerGly: 5.311 ± 3.559
SerHis: 4.552 ± 3.129
SerIle: 9.863 ± 2.99
SerLys: 6.829 ± 3.379
SerLeu: 3.794 ± 0.68
SerMet: 0.759 ± 0.633
SerAsn: 3.794 ± 1.375
SerPro: 2.276 ± 1.701
SerGln: 1.517 ± 0.637
SerArg: 1.517 ± 0.93
SerSer: 12.898 ± 11.09
SerThr: 6.07 ± 2.171
SerVal: 1.517 ± 0.93
SerTrp: 0.0 ± 0.0
SerTyr: 0.759 ± 0.465
SerXaa: 0.0 ± 0.0
Thr
ThrAla: 1.517 ± 0.93
ThrCys: 0.0 ± 0.0
ThrAsp: 6.829 ± 1.455
ThrGlu: 0.759 ± 0.864
ThrPhe: 0.0 ± 0.0
ThrGly: 8.346 ± 1.497
ThrHis: 0.759 ± 0.465
ThrIle: 4.552 ± 0.505
ThrLys: 6.07 ± 2.047
ThrLeu: 1.517 ± 0.738
ThrMet: 0.0 ± 0.0
ThrAsn: 3.035 ± 1.133
ThrPro: 7.587 ± 2.414
ThrGln: 2.276 ± 0.805
ThrArg: 3.794 ± 0.966
ThrSer: 6.829 ± 4.942
ThrThr: 4.552 ± 1.072
ThrVal: 2.276 ± 1.394
ThrTrp: 1.517 ± 0.93
ThrTyr: 3.794 ± 1.589
ThrXaa: 0.0 ± 0.0
Val
ValAla: 2.276 ± 0.805
ValCys: 0.759 ± 0.773
ValAsp: 0.759 ± 0.465
ValGlu: 0.759 ± 0.864
ValPhe: 0.0 ± 0.0
ValGly: 1.517 ± 0.93
ValHis: 0.759 ± 0.864
ValIle: 1.517 ± 0.738
ValLys: 2.276 ± 1.394
ValLeu: 3.794 ± 2.324
ValMet: 0.759 ± 0.465
ValAsn: 0.0 ± 0.0
ValPro: 2.276 ± 0.879
ValGln: 1.517 ± 0.93
ValArg: 1.517 ± 0.93
ValSer: 2.276 ± 0.805
ValThr: 2.276 ± 1.394
ValVal: 2.276 ± 1.394
ValTrp: 0.759 ± 0.465
ValTyr: 0.0 ± 0.0
ValXaa: 0.0 ± 0.0
Trp
TrpAla: 0.759 ± 0.465
TrpCys: 0.0 ± 0.0
TrpAsp: 0.759 ± 0.465
TrpGlu: 0.759 ± 0.465
TrpPhe: 1.517 ± 0.93
TrpGly: 2.276 ± 1.394
TrpHis: 2.276 ± 1.564
TrpIle: 0.0 ± 0.0
TrpLys: 0.0 ± 0.0
TrpLeu: 1.517 ± 0.93
TrpMet: 3.035 ± 1.133
TrpAsn: 0.759 ± 0.465
TrpPro: 0.0 ± 0.0
TrpGln: 0.759 ± 0.465
TrpArg: 0.0 ± 0.0
TrpSer: 0.0 ± 0.0
TrpThr: 0.759 ± 0.465
TrpVal: 0.759 ± 0.864
TrpTrp: 0.759 ± 0.465
TrpTyr: 1.517 ± 0.93
TrpXaa: 0.0 ± 0.0
Tyr
TyrAla: 0.0 ± 0.0
TyrCys: 0.759 ± 0.465
TyrAsp: 0.759 ± 0.465
TyrGlu: 1.517 ± 0.93
TyrPhe: 0.0 ± 0.0
TyrGly: 2.276 ± 1.394
TyrHis: 0.0 ± 0.0
TyrIle: 1.517 ± 0.93
TyrLys: 7.587 ± 0.09
TyrLeu: 4.552 ± 2.013
TyrMet: 2.276 ± 1.394
TyrAsn: 1.517 ± 0.637
TyrPro: 5.311 ± 1.505
TyrGln: 0.759 ± 0.465
TyrArg: 3.035 ± 1.149
TyrSer: 1.517 ± 0.93
TyrThr: 3.035 ± 1.859
TyrVal: 0.759 ± 0.465
TyrTrp: 0.0 ± 0.0
TyrTyr: 1.517 ± 0.93
TyrXaa: 0.0 ± 0.0
Xaa
XaaAla: 0.0 ± 0.0
XaaCys: 0.0 ± 0.0
XaaAsp: 0.0 ± 0.0
XaaGlu: 0.0 ± 0.0
XaaPhe: 0.0 ± 0.0
XaaGly: 0.0 ± 0.0
XaaHis: 0.0 ± 0.0
XaaIle: 0.0 ± 0.0
XaaLys: 0.0 ± 0.0
XaaLeu: 0.0 ± 0.0
XaaMet: 0.0 ± 0.0
XaaAsn: 0.0 ± 0.0
XaaPro: 0.0 ± 0.0
XaaGln: 0.0 ± 0.0
XaaArg: 0.0 ± 0.0
XaaSer: 0.0 ± 0.0
XaaThr: 0.0 ± 0.0
XaaVal: 0.0 ± 0.0
XaaTrp: 0.0 ± 0.0
XaaTyr: 0.0 ± 0.0
XaaXaa: 0.0 ± 0.0
Statistics based on 4 proteins (1319 amino acids)
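
As a rough cross-check, the per mille values can be turned back into approximate occurrence counts. The smallest non-zero value in the table, 0.759‰, matches one occurrence among 1318 positions (1319 - 1). This denominator is inferred from the numbers and is not stated by the source, so the sketch below treats it as an assumption; the function name is hypothetical.

```python
TOTAL_RESIDUES = 1319           # from the statistics line above
POSITIONS = TOTAL_RESIDUES - 1  # inferred denominator, an assumption


def approximate_count(per_mille, positions=POSITIONS):
    """Estimate how many times a dipeptide occurs from its per mille value."""
    return round(per_mille * 1e-3 * positions)


for name, value in [("AlaAla", 16.692), ("AlaCys", 0.759), ("LysLys", 10.622)]:
    print(name, approximate_count(value))
```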

Note: The error was estimated by bootstrapping (x100) at the protein level.
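
Protein-level bootstrapping can be sketched as follows: whole proteins are resampled with replacement, the frequency is recomputed for each resample, and the spread of the 100 estimates is reported. The exact Proteome-pI procedure is not described here, so using the sample standard deviation as the error is an assumption, and the function names and toy sequences are hypothetical.

```python
import random
import statistics
from collections import Counter


def pair_frequency(sequences, pair):
    """Per mille frequency of a two-letter pair over overlapping positions."""
    counts = Counter()
    for seq in sequences:
        counts.update(a + b for a, b in zip(seq, seq[1:]))
    total = sum(counts.values())
    return 1000.0 * counts[pair] / total if total else 0.0


def bootstrap_error(sequences, pair, rounds=100, seed=0):
    """Standard deviation of the frequency over protein-level resamples."""
    rng = random.Random(seed)
    estimates = []
    for _ in range(rounds):
        # Draw as many proteins as the proteome has, with replacement.
        sample = rng.choices(sequences, k=len(sequences))
        estimates.append(pair_frequency(sample, pair))
    return statistics.stdev(estimates)


# Hypothetical toy proteome in one-letter code, not the real viral proteins.
toy = ["MAAKRRA", "AAGK", "MKKLS", "SSAAT"]
print(pair_frequency(toy, "AA"), "±", bootstrap_error(toy, "AA"))
```

With only 4 proteins, each resample can differ a lot from the original proteome, which is one plausible reason why several errors in the table are of the same order as the values themselves.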

The dipeptide statistics above (among other statistics for this proteome) can be downloaded from this CSV file
See this proteome in: uniprot_link
Proteome-pI is available under the Creative Commons Attribution-NoDerivs license; for more details see here

Reference: Kozlowski LP. Proteome-pI 2.0: Proteome Isoelectric Point Database Update. Nucleic Acids Res. 2021, doi: 10.1093/nar/gkab944
Contact: Lukasz P. Kozlowski