Amino acid dipeptide frequency for Human plasma-associated gemycircularvirus

Apart from single amino acid frequencies, one can also calculate the so-called amino acid dipeptide frequency, i.e. how often each ordered pair of adjacent amino acids occurs in the proteins.
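As a rough illustration of how such frequencies can be obtained, the sketch below counts overlapping pairs of adjacent residues and reports them in per mille. It is not the Proteome-pI implementation: the function name `dipeptide_frequencies` and the toy sequences are hypothetical, one-letter codes are used instead of the three-letter codes of the table, and it assumes that pairs are counted within each protein only and normalized by the total number of pairs.

```python
from collections import Counter


def dipeptide_frequencies(proteins):
    """Return dipeptide frequencies in per mille for a list of sequences.

    Assumption: pairs are counted inside each protein (never across two
    proteins) and normalized by the total number of pairs counted.
    """
    counts = Counter()
    for seq in proteins:
        # Slide an overlapping window of length 2 along the sequence.
        for i in range(len(seq) - 1):
            counts[seq[i:i + 2]] += 1
    total = sum(counts.values())
    if total == 0:
        return {}
    return {pair: 1000.0 * n / total for pair, n in counts.items()}


# Hypothetical toy sequences, not taken from the proteome described here.
freqs = dipeptide_frequencies(["MARRT", "GGLT"])
for pair, value in sorted(freqs.items()):
    print(f"{pair}: {value:.3f}")
```

Each of the seven pairs in the toy input occurs once, so every reported frequency is the same; on a real proteome the values differ as in the table.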

There are 400 possible dipeptides (441 if Xaa is counted as an additional, 21st amino acid representing all non-standard or unknown amino acids). Thus, if dipeptides occurred uniformly at random in proteins, each of the 400 dipeptides would be present at a frequency of 0.25% (2.5‰). As this is not the case in nature, for better visibility all dipeptides that are more frequent than expected are marked in red, and those that are underrepresented are marked in blue in the table.
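The expected value and the over/under comparison can be sketched as follows. The helper `classify` is hypothetical, and comparing directly against the uniform expectation is an assumption; the page does not state the exact rule used for the colouring.

```python
STANDARD_AMINO_ACIDS = 20

# Uniform expectation: 1 / 400 of all dipeptides, expressed in per mille.
EXPECTED_PER_MILLE = 1000.0 / STANDARD_AMINO_ACIDS ** 2
# Same expectation when Xaa is counted as a 21st amino acid (1 / 441).
EXPECTED_WITH_XAA = 1000.0 / (STANDARD_AMINO_ACIDS + 1) ** 2


def classify(freq_per_mille, expected=EXPECTED_PER_MILLE):
    """Label a frequency relative to the uniform expectation (assumed rule)."""
    if freq_per_mille > expected:
        return "over"
    if freq_per_mille < expected:
        return "under"
    return "expected"


# Two values taken from the table: ArgArg and AlaCys.
print(classify(16.376))  # ArgArg
print(classify(1.092))   # AlaCys
```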

All values are given in per mille (‰); multiply them by 10⁻³ to obtain a fraction.
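A small sketch of the unit conversion, using the ArgArg value from the table as input (the function names are hypothetical):

```python
def per_mille_to_fraction(value):
    """Convert a per mille value to a plain fraction (multiply by 10^-3)."""
    return value * 1e-3


def per_mille_to_percent(value):
    """Convert a per mille value to percent (divide by 10)."""
    return value / 10.0


arg_arg = 16.376  # ArgArg frequency in per mille, from the table
print(f"{per_mille_to_fraction(arg_arg):.6f}")
print(f"{per_mille_to_percent(arg_arg):.4f}%")
```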

For more information, see the sequence space article on Wikipedia.

Rows give the first amino acid of the dipeptide, columns the second; each cell is the frequency ± error in ‰.

| First / Second | Ala | Cys | Asp | Glu | Phe | Gly | His | Ile | Lys | Leu | Met | Asn | Pro | Gln | Arg | Ser | Thr | Val | Trp | Tyr | Xaa |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Ala | 3.275 ± 2.45 | 1.092 ± 0.938 | 2.183 ± 1.704 | 2.183 ± 0.738 | 2.183 ± 1.704 | 2.183 ± 1.634 | 2.183 ± 1.006 | 5.459 ± 2.965 | 0.0 ± 0.0 | 4.367 ± 2.338 | 3.275 ± 2.56 | 5.459 ± 1.16 | 3.275 ± 1.446 | 2.183 ± 0.861 | 6.55 ± 1.465 | 3.275 ± 1.458 | 2.183 ± 1.704 | 0.0 ± 0.0 | 3.275 ± 1.505 | 3.275 ± 1.505 | 0.0 ± 0.0 |
| Cys | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 1.092 ± 0.938 | 2.183 ± 1.704 | 0.0 ± 0.0 | 0.0 ± 0.0 | 2.183 ± 1.704 | 1.092 ± 0.852 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 1.092 ± 0.852 | 1.092 ± 0.852 | 2.183 ± 0.738 | 0.0 ± 0.0 | 1.092 ± 0.938 | 0.0 ± 0.0 | 0.0 ± 0.0 |
| Asp | 3.275 ± 1.505 | 1.092 ± 0.852 | 2.183 ± 0.861 | 2.183 ± 1.704 | 3.275 ± 1.505 | 4.367 ± 2.056 | 0.0 ± 0.0 | 0.0 ± 0.0 | 4.367 ± 2.29 | 2.183 ± 0.738 | 0.0 ± 0.0 | 3.275 ± 1.766 | 5.459 ± 0.806 | 4.367 ± 0.915 | 3.275 ± 0.155 | 3.275 ± 1.458 | 4.367 ± 0.915 | 4.367 ± 2.185 | 4.367 ± 2.29 | 2.183 ± 1.704 | 0.0 ± 0.0 |
| Glu | 4.367 ± 1.034 | 1.092 ± 0.852 | 5.459 ± 2.112 | 1.092 ± 0.817 | 1.092 ± 0.852 | 1.092 ± 0.938 | 2.183 ± 1.006 | 2.183 ± 0.738 | 0.0 ± 0.0 | 4.367 ± 2.643 | 1.092 ± 0.938 | 2.183 ± 1.877 | 1.092 ± 0.852 | 0.0 ± 0.0 | 2.183 ± 1.006 | 5.459 ± 1.55 | 3.275 ± 1.574 | 3.275 ± 1.766 | 0.0 ± 0.0 | 5.459 ± 2.872 | 0.0 ± 0.0 |
| Phe | 2.183 ± 0.738 | 1.092 ± 0.852 | 3.275 ± 1.505 | 1.092 ± 0.817 | 3.275 ± 1.446 | 4.367 ± 2.29 | 1.092 ± 0.817 | 1.092 ± 0.817 | 2.183 ± 1.634 | 1.092 ± 0.938 | 1.092 ± 0.938 | 4.367 ± 1.722 | 6.55 ± 1.72 | 4.367 ± 2.643 | 3.275 ± 1.446 | 3.275 ± 2.556 | 4.367 ± 2.29 | 1.092 ± 0.852 | 1.092 ± 0.817 | 0.0 ± 0.0 | 0.0 ± 0.0 |
| Gly | 1.092 ± 0.817 | 1.092 ± 0.852 | 1.092 ± 0.817 | 1.092 ± 0.852 | 2.183 ± 1.006 | 8.734 ± 4.106 | 0.0 ± 0.0 | 3.275 ± 0.155 | 4.367 ± 2.056 | 7.642 ± 1.142 | 2.183 ± 0.738 | 5.459 ± 1.55 | 3.275 ± 1.458 | 5.459 ± 1.92 | 3.275 ± 1.505 | 5.459 ± 1.968 | 6.55 ± 1.383 | 7.642 ± 3.063 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 |
| His | 4.367 ± 2.338 | 0.0 ± 0.0 | 2.183 ± 0.738 | 3.275 ± 0.155 | 0.0 ± 0.0 | 1.092 ± 0.938 | 0.0 ± 0.0 | 0.0 ± 0.0 | 1.092 ± 0.852 | 2.183 ± 0.738 | 1.092 ± 0.938 | 1.092 ± 0.852 | 1.092 ± 0.852 | 1.092 ± 0.938 | 4.367 ± 0.915 | 2.183 ± 1.634 | 2.183 ± 1.877 | 0.0 ± 0.0 | 1.092 ± 0.817 | 1.092 ± 0.852 | 0.0 ± 0.0 |
| Ile | 2.183 ± 0.738 | 0.0 ± 0.0 | 2.183 ± 1.634 | 1.092 ± 0.852 | 2.183 ± 1.634 | 5.459 ± 0.686 | 1.092 ± 0.938 | 1.092 ± 0.938 | 2.183 ± 0.861 | 1.092 ± 0.852 | 3.275 ± 1.574 | 0.0 ± 0.0 | 1.092 ± 0.817 | 1.092 ± 0.817 | 2.183 ± 1.006 | 4.367 ± 0.915 | 3.275 ± 1.289 | 7.642 ± 3.129 | 0.0 ± 0.0 | 1.092 ± 0.852 | 0.0 ± 0.0 |
| Lys | 1.092 ± 0.852 | 0.0 ± 0.0 | 2.183 ± 1.704 | 2.183 ± 1.006 | 1.092 ± 0.817 | 1.092 ± 0.938 | 0.0 ± 0.0 | 1.092 ± 0.817 | 3.275 ± 2.45 | 0.0 ± 0.0 | 0.0 ± 0.0 | 1.092 ± 0.852 | 3.275 ± 1.289 | 1.092 ± 0.852 | 6.55 ± 2.892 | 0.0 ± 0.0 | 3.275 ± 1.458 | 0.0 ± 0.0 | 0.0 ± 0.0 | 1.092 ± 0.852 | 0.0 ± 0.0 |
| Leu | 7.642 ± 1.142 | 1.092 ± 0.938 | 7.642 ± 0.546 | 4.367 ± 2.338 | 2.183 ± 1.006 | 7.642 ± 1.417 | 1.092 ± 0.852 | 2.183 ± 0.738 | 1.092 ± 0.938 | 1.092 ± 0.852 | 4.367 ± 1.034 | 0.0 ± 0.0 | 0.0 ± 0.0 | 1.092 ± 0.817 | 6.55 ± 2.916 | 8.734 ± 4.676 | 10.917 ± 9.384 | 3.275 ± 2.815 | 1.092 ± 0.852 | 0.0 ± 0.0 | 0.0 ± 0.0 |
| Met | 2.183 ± 1.634 | 0.0 ± 0.0 | 1.092 ± 0.817 | 1.092 ± 0.938 | 2.183 ± 0.861 | 1.092 ± 0.938 | 3.275 ± 2.815 | 0.0 ± 0.0 | 0.0 ± 0.0 | 5.459 ± 2.112 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 1.092 ± 0.817 | 1.092 ± 0.938 | 2.183 ± 1.877 | 0.0 ± 0.0 | 0.0 ± 0.0 | 2.183 ± 1.006 | 0.0 ± 0.0 |
| Asn | 5.459 ± 1.55 | 0.0 ± 0.0 | 2.183 ± 0.861 | 3.275 ± 1.766 | 1.092 ± 0.817 | 3.275 ± 1.574 | 0.0 ± 0.0 | 2.183 ± 0.738 | 0.0 ± 0.0 | 2.183 ± 0.861 | 0.0 ± 0.0 | 0.0 ± 0.0 | 3.275 ± 1.766 | 1.092 ± 0.938 | 4.367 ± 0.915 | 2.183 ± 0.861 | 3.275 ± 0.155 | 4.367 ± 2.056 | 0.0 ± 0.0 | 1.092 ± 0.852 | 0.0 ± 0.0 |
| Pro | 2.183 ± 1.634 | 1.092 ± 0.938 | 3.275 ± 1.289 | 0.0 ± 0.0 | 4.367 ± 0.915 | 6.55 ± 1.151 | 0.0 ± 0.0 | 2.183 ± 0.738 | 0.0 ± 0.0 | 4.367 ± 3.754 | 1.092 ± 0.817 | 3.275 ± 1.458 | 2.183 ± 0.861 | 3.275 ± 1.574 | 5.459 ± 2.872 | 3.275 ± 1.289 | 6.55 ± 1.465 | 3.275 ± 1.505 | 0.0 ± 0.0 | 3.275 ± 2.45 | 0.0 ± 0.0 |
| Gln | 2.183 ± 1.634 | 0.0 ± 0.0 | 0.0 ± 0.0 | 3.275 ± 1.574 | 3.275 ± 0.155 | 4.367 ± 0.915 | 1.092 ± 0.852 | 3.275 ± 1.574 | 0.0 ± 0.0 | 2.183 ± 1.877 | 1.092 ± 0.938 | 1.092 ± 0.938 | 1.092 ± 0.938 | 2.183 ± 1.634 | 2.183 ± 1.704 | 3.275 ± 1.574 | 7.642 ± 2.808 | 1.092 ± 0.938 | 1.092 ± 0.852 | 2.183 ± 1.704 | 0.0 ± 0.0 |
| Arg | 1.092 ± 0.852 | 0.0 ± 0.0 | 3.275 ± 2.556 | 6.55 ± 1.488 | 6.55 ± 2.582 | 4.367 ± 2.056 | 4.367 ± 1.034 | 3.275 ± 1.446 | 3.275 ± 1.766 | 5.459 ± 1.16 | 0.0 ± 0.0 | 4.367 ± 2.185 | 7.642 ± 1.66 | 4.367 ± 2.056 | 16.376 ± 7.586 | 5.459 ± 0.806 | 9.825 ± 2.04 | 4.367 ± 2.185 | 3.275 ± 0.155 | 4.367 ± 2.185 | 0.0 ± 0.0 |
| Ser | 1.092 ± 0.817 | 0.0 ± 0.0 | 7.642 ± 1.027 | 4.367 ± 1.477 | 3.275 ± 1.458 | 4.367 ± 0.915 | 5.459 ± 2.112 | 3.275 ± 0.155 | 1.092 ± 0.817 | 4.367 ± 1.477 | 1.092 ± 0.734 | 1.092 ± 0.852 | 8.734 ± 3.666 | 4.367 ± 1.034 | 5.459 ± 2.872 | 5.459 ± 1.16 | 2.183 ± 1.634 | 4.367 ± 1.034 | 2.183 ± 0.738 | 3.275 ± 1.446 | 0.0 ± 0.0 |
| Thr | 2.183 ± 1.877 | 3.275 ± 2.556 | 3.275 ± 1.505 | 4.367 ± 2.338 | 2.183 ± 1.006 | 4.367 ± 2.29 | 3.275 ± 0.155 | 6.55 ± 2.582 | 2.183 ± 0.738 | 14.192 ± 9.472 | 1.092 ± 0.817 | 3.275 ± 0.155 | 5.459 ± 1.55 | 3.275 ± 2.815 | 10.917 ± 2.32 | 5.459 ± 2.716 | 3.275 ± 1.766 | 4.367 ± 0.915 | 1.092 ± 0.938 | 3.275 ± 0.155 | 0.0 ± 0.0 |
| Val | 5.459 ± 0.806 | 0.0 ± 0.0 | 5.459 ± 1.55 | 1.092 ± 0.817 | 4.367 ± 0.915 | 1.092 ± 0.852 | 2.183 ± 1.704 | 1.092 ± 0.817 | 1.092 ± 0.817 | 4.367 ± 1.034 | 0.0 ± 0.0 | 1.092 ± 0.817 | 1.092 ± 0.852 | 3.275 ± 1.446 | 5.459 ± 1.92 | 5.459 ± 1.55 | 5.459 ± 1.728 | 3.275 ± 1.505 | 1.092 ± 0.817 | 2.183 ± 1.006 | 0.0 ± 0.0 |
| Trp | 1.092 ± 0.852 | 0.0 ± 0.0 | 2.183 ± 0.861 | 0.0 ± 0.0 | 1.092 ± 0.852 | 1.092 ± 0.852 | 1.092 ± 0.817 | 1.092 ± 0.938 | 0.0 ± 0.0 | 3.275 ± 2.556 | 1.092 ± 0.938 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 2.183 ± 1.634 | 3.275 ± 1.574 | 3.275 ± 0.155 | 1.092 ± 0.852 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 |
| Tyr | 5.459 ± 2.3 | 1.092 ± 0.852 | 1.092 ± 0.817 | 3.275 ± 1.289 | 2.183 ± 1.704 | 3.275 ± 1.446 | 1.092 ± 0.852 | 1.092 ± 0.852 | 1.092 ± 0.852 | 2.183 ± 1.877 | 0.0 ± 0.0 | 2.183 ± 0.861 | 1.092 ± 0.852 | 0.0 ± 0.0 | 5.459 ± 2.965 | 2.183 ± 0.861 | 1.092 ± 0.938 | 1.092 ± 0.852 | 1.092 ± 0.817 | 1.092 ± 0.938 | 0.0 ± 0.0 |
| Xaa | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 |

Statistics based on 3 proteins (917 amino acids)

Note: The error has been estimated by bootstrapping (100 replicates) at the protein level.
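One way to read "bootstrapping at the protein level" is to resample whole proteins with replacement, recompute the frequency for each resampled set, and take the spread of the results. The sketch below follows that reading under stated assumptions: the function names, the toy sequences, the fixed seed, and the use of the sample standard deviation as the error are all illustrative choices, not details confirmed by the page.

```python
import random
import statistics
from collections import Counter


def dipeptide_per_mille(proteins, pair):
    """Frequency of one dipeptide (per mille) over a set of sequences."""
    counts = Counter()
    for seq in proteins:
        for i in range(len(seq) - 1):
            counts[seq[i:i + 2]] += 1
    total = sum(counts.values())
    return 1000.0 * counts[pair] / total if total else 0.0


def bootstrap_error(proteins, pair, replicates=100, seed=0):
    """Estimate the error by resampling whole proteins with replacement.

    Assumption: the error is the standard deviation across replicates.
    """
    rng = random.Random(seed)
    estimates = []
    for _ in range(replicates):
        # Draw as many proteins as in the original set, with replacement.
        sample = rng.choices(proteins, k=len(proteins))
        estimates.append(dipeptide_per_mille(sample, pair))
    return statistics.stdev(estimates)


# Hypothetical toy sequences in one-letter code.
toy = ["MARRTLS", "GGLTRRA", "MSPTLAR"]
err = bootstrap_error(toy, "RR")
print(f"RR: {dipeptide_per_mille(toy, 'RR'):.3f} ± {err:.3f}")
```

With only a handful of proteins, as here, each replicate can differ substantially from the original set, which is consistent with the large errors relative to the values in the table.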

The dipeptide statistics above (along with other statistics for this proteome) can be downloaded from this CSV file
See this proteome in: uniprot_link
Proteome-pI is available under Creative Commons Attribution-NoDerivs license, for more details see here

Reference: Kozlowski LP. Proteome-pI 2.0: Proteome Isoelectric Point Database Update. Nucleic Acids Res. 2021, doi: 10.1093/nar/gkab944
Contact: Lukasz P. Kozlowski