Amino acid dipeptide frequency for Merkel cell polyomavirus

Apart from single amino acid frequencies, one can also calculate the so-called amino acid dipeptide frequency, i.e. how often each ordered pair of adjacent amino acids occurs.
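As an illustration of what a dipeptide frequency is, here is a minimal Python sketch. It assumes one-letter amino acid codes, counts overlapping pairs within each protein (never across protein boundaries) and normalises by the total number of pairs. The function name `dipeptide_permille` and the toy sequences are hypothetical, and the exact counting convention used to produce the values on this page may differ.

```python
from collections import Counter

def dipeptide_permille(proteins):
    """Return {dipeptide: frequency in per mille} for a list of protein sequences.

    Assumption: pairs are counted with an overlapping window of length 2
    inside each protein and normalised by the total number of pairs.
    """
    counts = Counter()
    for seq in proteins:
        for i in range(len(seq) - 1):
            counts[seq[i:i + 2]] += 1
    total = sum(counts.values())
    if total == 0:
        return {}
    return {pair: 1000.0 * n / total for pair, n in counts.items()}

# Toy input (not real Merkel cell polyomavirus sequences).
freqs = dipeptide_permille(["MKLLA", "MKA"])
print(round(freqs["MK"], 3))  # "MK" occurs in 2 of the 6 pairs
```

By construction the frequencies of all observed dipeptides sum to 1000‰.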

There are 400 possible dipeptides (441 if Xaa is counted as an additional, 21st amino acid representing all non-standard or unknown amino acids). Thus, if dipeptides occurred randomly in proteins, each of the 400 dipeptides would be present at a frequency of 0.25% (2.5‰). As this is not the case in nature, for better visibility all dipeptides that are more frequent than expected are marked in red, and those that are underrepresented are marked in blue in the table.
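The expected value under the uniform assumption, and the over/under comparison, can be sketched as follows. The function names are hypothetical, and the simple "above or below the uniform expectation" rule is an assumption: the page does not state the exact cutoff used for the red and blue marking.

```python
def expected_permille(alphabet_size=20):
    """Uniform expectation for one dipeptide, in per mille."""
    return 1000.0 / alphabet_size ** 2

def classify(observed_permille, alphabet_size=20):
    """Compare an observed frequency with the uniform expectation.

    Assumption: anything above the expectation counts as "over" (red)
    and anything below it as "under" (blue).
    """
    expected = expected_permille(alphabet_size)
    if observed_permille > expected:
        return "over"
    if observed_permille < expected:
        return "under"
    return "expected"

print(expected_permille(20))            # 400 dipeptides
print(round(expected_permille(21), 3))  # 441 dipeptides, with Xaa
print(classify(4.202))                  # AlaAla value from the table
print(classify(1.2))                    # AlaAsp value from the table
```

With 20 amino acids the expectation is 2.5‰, so AlaAla (4.202‰) falls above it and AlaAsp (1.2‰) below it.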

All values are given in per mille (‰) and must therefore be multiplied by 10⁻³ to obtain fractions (for example, 4.202‰ = 0.004202 = 0.4202%).

For more information, see the sequence space article on Wikipedia.

Rows give the first amino acid of the dipeptide, columns the second; each cell is the frequency in ‰ ± the estimated error.

| 1st \ 2nd | Ala | Cys | Asp | Glu | Phe | Gly | His | Ile | Lys | Leu | Met | Asn | Pro | Gln | Arg | Ser | Thr | Val | Trp | Tyr | Xaa |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Ala | 4.202 ± 0.667 | 1.801 ± 0.871 | 1.2 ± 0.43 | 2.401 ± 1.107 | 1.801 ± 0.927 | 1.2 ± 0.43 | 1.801 ± 1.178 | 4.202 ± 0.902 | 3.001 ± 0.907 | 6.002 ± 1.836 | 0.0 ± 0.0 | 1.801 ± 0.761 | 4.202 ± 1.064 | 0.6 ± 0.585 | 1.801 ± 0.734 | 7.803 ± 1.72 | 1.2 ± 1.17 | 2.401 ± 0.855 | 0.6 ± 0.393 | 1.2 ± 0.43 | 0.0 ± 0.0 |
| Cys | 1.801 ± 0.734 | 1.2 ± 0.43 | 1.2 ± 0.43 | 1.801 ± 1.178 | 1.801 ± 1.639 | 0.0 ± 0.0 | 0.0 ± 0.0 | 2.401 ± 0.574 | 4.802 ± 1.149 | 4.802 ± 2.472 | 0.6 ± 0.892 | 0.6 ± 0.393 | 1.801 ± 0.982 | 1.801 ± 1.178 | 0.6 ± 0.892 | 1.2 ± 0.789 | 0.6 ± 0.393 | 0.0 ± 0.0 | 0.0 ± 0.0 | 1.801 ± 1.639 | 0.0 ± 0.0 |
| Asp | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.6 ± 0.393 | 3.601 ± 0.915 | 3.001 ± 1.398 | 3.001 ± 0.763 | 1.2 ± 0.785 | 3.001 ± 1.57 | 5.402 ± 0.939 | 6.603 ± 1.366 | 0.6 ± 0.612 | 1.2 ± 0.43 | 3.001 ± 1.57 | 1.2 ± 0.43 | 0.0 ± 0.0 | 3.001 ± 0.656 | 2.401 ± 0.855 | 1.2 ± 0.554 | 1.2 ± 1.106 | 3.601 ± 1.468 | 0.0 ± 0.0 |
| Glu | 7.803 ± 2.357 | 0.6 ± 0.892 | 3.001 ± 0.763 | 9.004 ± 3.681 | 1.801 ± 1.178 | 3.001 ± 0.763 | 1.2 ± 0.554 | 4.202 ± 1.404 | 3.601 ± 1.584 | 6.002 ± 0.492 | 2.401 ± 1.57 | 4.802 ± 1.056 | 3.001 ± 1.211 | 1.801 ± 1.274 | 1.801 ± 0.871 | 3.001 ± 0.656 | 3.601 ± 0.915 | 4.202 ± 2.185 | 0.6 ± 0.393 | 2.401 ± 1.096 | 0.0 ± 0.0 |
| Phe | 3.001 ± 1.211 | 3.601 ± 1.742 | 1.801 ± 1.639 | 2.401 ± 1.57 | 0.0 ± 0.0 | 2.401 ± 0.574 | 0.0 ± 0.0 | 3.601 ± 0.476 | 5.402 ± 1.978 | 1.801 ± 0.982 | 1.2 ± 0.641 | 1.2 ± 0.81 | 3.001 ± 1.615 | 3.001 ± 0.563 | 0.0 ± 0.0 | 7.803 ± 1.224 | 2.401 ± 1.077 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.6 ± 0.393 | 0.0 ± 0.0 |
| Gly | 1.801 ± 0.551 | 2.401 ± 0.574 | 1.2 ± 1.224 | 3.601 ± 2.244 | 3.001 ± 1.256 | 6.002 ± 0.715 | 0.0 ± 0.0 | 4.202 ± 2.281 | 4.202 ± 1.308 | 4.202 ± 2.89 | 0.0 ± 0.0 | 3.001 ± 1.396 | 4.802 ± 0.923 | 3.601 ± 0.69 | 1.2 ± 0.43 | 6.002 ± 1.526 | 5.402 ± 1.908 | 4.802 ± 1.997 | 0.0 ± 0.0 | 1.801 ± 0.927 | 0.0 ± 0.0 |
| His | 1.801 ± 0.761 | 0.6 ± 0.892 | 0.6 ± 0.612 | 1.2 ± 0.785 | 1.2 ± 0.785 | 1.2 ± 0.43 | 2.401 ± 1.096 | 0.6 ± 0.585 | 1.801 ± 0.871 | 3.001 ± 1.62 | 0.6 ± 0.612 | 0.0 ± 0.0 | 1.2 ± 0.789 | 0.6 ± 0.393 | 0.6 ± 0.393 | 1.801 ± 0.761 | 0.0 ± 0.0 | 0.6 ± 0.393 | 0.0 ± 0.0 | 1.2 ± 0.43 | 0.0 ± 0.0 |
| Ile | 2.401 ± 0.897 | 1.2 ± 0.785 | 1.801 ± 1.178 | 6.002 ± 1.405 | 1.2 ± 0.554 | 2.401 ± 1.81 | 1.801 ± 0.927 | 2.401 ± 1.107 | 1.801 ± 0.551 | 6.002 ± 2.242 | 1.2 ± 0.789 | 3.001 ± 0.907 | 4.802 ± 0.134 | 1.801 ± 0.527 | 0.6 ± 0.393 | 5.402 ± 1.837 | 3.601 ± 1.767 | 1.801 ± 0.982 | 1.2 ± 0.785 | 1.801 ± 0.551 | 0.0 ± 0.0 |
| Lys | 6.603 ± 1.289 | 1.2 ± 0.789 | 2.401 ± 1.097 | 4.202 ± 0.686 | 6.002 ± 1.973 | 6.002 ± 1.965 | 3.601 ± 1.736 | 1.2 ± 0.785 | 4.802 ± 2.154 | 6.603 ± 1.872 | 1.801 ± 0.982 | 4.202 ± 1.064 | 5.402 ± 2.059 | 1.801 ± 1.639 | 6.002 ± 1.188 | 1.801 ± 0.761 | 7.203 ± 1.838 | 1.2 ± 0.43 | 0.0 ± 0.0 | 0.6 ± 0.393 | 0.0 ± 0.0 |
| Leu | 3.001 ± 1.599 | 3.601 ± 0.915 | 8.403 ± 2.291 | 6.002 ± 2.057 | 4.802 ± 0.923 | 3.001 ± 0.891 | 3.001 ± 1.657 | 6.002 ± 1.337 | 4.802 ± 4.045 | 10.804 ± 2.071 | 4.202 ± 1.75 | 6.603 ± 0.985 | 4.802 ± 2.49 | 9.004 ± 2.225 | 4.802 ± 1.222 | 6.603 ± 2.391 | 3.001 ± 1.256 | 6.002 ± 2.67 | 1.801 ± 1.639 | 1.801 ± 0.551 | 0.0 ± 0.0 |
| Met | 1.801 ± 1.311 | 0.0 ± 0.0 | 1.2 ± 0.789 | 2.401 ± 0.574 | 1.801 ± 0.734 | 1.801 ± 0.527 | 0.0 ± 0.0 | 0.6 ± 0.393 | 1.801 ± 0.734 | 2.401 ± 0.574 | 1.2 ± 0.789 | 0.0 ± 0.0 | 1.2 ± 0.43 | 0.6 ± 0.892 | 0.0 ± 0.0 | 0.0 ± 0.0 | 1.801 ± 0.527 | 0.6 ± 0.393 | 0.6 ± 0.612 | 0.6 ± 0.393 | 0.0 ± 0.0 |
| Asn | 1.801 ± 0.871 | 2.401 ± 1.579 | 0.6 ± 0.393 | 1.801 ± 1.311 | 1.2 ± 0.554 | 1.801 ± 0.982 | 0.0 ± 0.0 | 5.402 ± 2.044 | 3.001 ± 1.963 | 6.002 ± 1.086 | 0.6 ± 0.393 | 1.2 ± 0.785 | 3.001 ± 0.806 | 3.001 ± 2.093 | 2.401 ± 1.096 | 6.603 ± 1.327 | 1.801 ± 0.734 | 2.401 ± 1.579 | 1.801 ± 0.551 | 1.801 ± 0.982 | 0.0 ± 0.0 |
| Pro | 1.2 ± 1.224 | 1.801 ± 1.178 | 5.402 ± 1.491 | 4.202 ± 1.823 | 1.801 ± 1.178 | 2.401 ± 1.097 | 1.801 ± 1.178 | 3.601 ± 1.29 | 9.004 ± 4.161 | 5.402 ± 1.082 | 1.2 ± 1.224 | 3.001 ± 0.656 | 6.603 ± 3.597 | 1.2 ± 1.224 | 2.401 ± 0.86 | 4.802 ± 1.71 | 3.601 ± 0.76 | 3.001 ± 0.806 | 0.6 ± 0.892 | 1.2 ± 0.43 | 0.0 ± 0.0 |
| Gln | 1.801 ± 0.551 | 1.2 ± 1.027 | 2.401 ± 0.454 | 3.001 ± 0.907 | 1.801 ± 0.527 | 3.001 ± 1.309 | 0.6 ± 0.892 | 2.401 ± 0.454 | 4.202 ± 1.258 | 2.401 ± 0.846 | 1.2 ± 0.522 | 3.001 ± 1.029 | 3.001 ± 1.387 | 3.601 ± 0.994 | 0.0 ± 0.0 | 4.202 ± 0.667 | 4.202 ± 2.89 | 1.801 ± 1.274 | 0.6 ± 0.585 | 0.6 ± 0.612 | 0.0 ± 0.0 |
| Arg | 0.0 ± 0.0 | 0.0 ± 0.0 | 3.001 ± 1.029 | 3.601 ± 1.273 | 1.801 ± 0.734 | 1.801 ± 0.871 | 1.2 ± 0.554 | 1.2 ± 0.43 | 5.402 ± 0.939 | 3.001 ± 1.271 | 0.6 ± 0.612 | 1.2 ± 0.785 | 0.0 ± 0.0 | 1.801 ± 0.734 | 1.2 ± 0.785 | 6.002 ± 2.795 | 1.2 ± 0.785 | 1.801 ± 0.982 | 1.2 ± 0.81 | 1.2 ± 1.224 | 0.0 ± 0.0 |
| Ser | 4.202 ± 2.18 | 3.601 ± 1.312 | 3.601 ± 1.736 | 1.801 ± 0.551 | 4.202 ± 2.093 | 9.604 ± 1.355 | 0.6 ± 0.393 | 1.801 ± 1.311 | 3.601 ± 1.312 | 9.604 ± 3.989 | 1.2 ± 0.789 | 6.002 ± 1.976 | 4.802 ± 0.908 | 6.002 ± 1.325 | 9.604 ± 3.035 | 17.407 ± 8.328 | 3.001 ± 0.656 | 4.202 ± 2.765 | 0.6 ± 0.585 | 1.2 ± 0.785 | 0.0 ± 0.0 |
| Thr | 1.801 ± 1.069 | 1.2 ± 1.224 | 2.401 ± 1.096 | 3.001 ± 0.666 | 2.401 ± 0.855 | 2.401 ± 1.621 | 0.0 ± 0.0 | 3.001 ± 0.666 | 1.801 ± 0.927 | 9.004 ± 2.754 | 0.0 ± 0.0 | 1.801 ± 0.551 | 5.402 ± 1.132 | 2.401 ± 1.579 | 0.6 ± 0.393 | 4.802 ± 1.09 | 6.603 ± 1.21 | 4.802 ± 1.997 | 1.801 ± 1.639 | 1.801 ± 1.311 | 0.0 ± 0.0 |
| Val | 4.202 ± 1.121 | 0.0 ± 0.0 | 1.2 ± 0.785 | 1.801 ± 0.982 | 1.2 ± 0.554 | 3.601 ± 2.431 | 1.2 ± 1.224 | 1.801 ± 0.927 | 3.001 ± 1.57 | 6.002 ± 2.506 | 0.0 ± 0.0 | 4.202 ± 1.474 | 1.801 ± 0.551 | 0.0 ± 0.0 | 1.801 ± 1.311 | 6.002 ± 1.489 | 3.001 ± 1.57 | 2.401 ± 1.579 | 0.6 ± 0.612 | 1.2 ± 0.43 | 0.0 ± 0.0 |
| Trp | 0.0 ± 0.0 | 1.2 ± 0.43 | 0.6 ± 0.612 | 3.001 ± 1.657 | 1.2 ± 0.789 | 0.6 ± 0.892 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.6 ± 0.393 | 0.6 ± 0.585 | 0.0 ± 0.0 | 0.6 ± 0.393 | 0.0 ± 0.0 | 0.6 ± 0.892 | 0.6 ± 0.393 | 1.2 ± 0.789 | 0.0 ± 0.0 | 1.801 ± 1.274 | 1.2 ± 0.789 | 1.2 ± 0.43 | 0.0 ± 0.0 |
| Tyr | 0.0 ± 0.0 | 1.2 ± 0.789 | 1.2 ± 0.43 | 3.001 ± 1.211 | 1.2 ± 0.43 | 6.002 ± 0.898 | 0.6 ± 0.612 | 0.6 ± 0.585 | 0.6 ± 0.393 | 1.801 ± 0.551 | 1.2 ± 0.789 | 1.2 ± 0.789 | 2.401 ± 2.447 | 1.2 ± 1.027 | 1.2 ± 0.785 | 1.2 ± 0.43 | 2.401 ± 0.86 | 0.0 ± 0.0 | 0.6 ± 0.612 | 1.2 ± 0.43 | 0.0 ± 0.0 |
| Xaa | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.0 ± 0.0 |
Statistics based on 4 proteins (1667 amino acids)

Note: The error has been estimated by bootstrapping (100 replicates) at the protein level.
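A protein-level bootstrap of this kind can be sketched as follows. This is only an illustration under stated assumptions: whole proteins are resampled with replacement, the frequency is recomputed for each replicate, and the standard deviation of the replicates is reported as the error. The function names, the toy sequences and the use of the population standard deviation are assumptions, not the database's documented procedure.

```python
import random
import statistics

def pair_permille(proteins, pair):
    """Frequency of one dipeptide (in per mille) over a list of sequences."""
    total = 0
    hits = 0
    for seq in proteins:
        for i in range(len(seq) - 1):
            total += 1
            if seq[i:i + 2] == pair:
                hits += 1
    return 1000.0 * hits / total if total else 0.0

def bootstrap_error(proteins, pair, n_rounds=100, seed=0):
    """Spread of the frequency when whole proteins are resampled."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    estimates = []
    for _ in range(n_rounds):
        # Draw as many proteins as in the original set, with replacement.
        sample = rng.choices(proteins, k=len(proteins))
        estimates.append(pair_permille(sample, pair))
    return statistics.pstdev(estimates)

# Toy input (not real Merkel cell polyomavirus sequences).
toy = ["MKLLA", "MKA", "AAAK"]
print(bootstrap_error(toy, "MK"))
print(bootstrap_error(toy, "WW"))  # a pair that never occurs has zero error
```

Under this scheme a dipeptide that is absent from every protein always yields 0.0 with an error of 0.0, which matches the "0.0 ± 0.0" entries in the table. With only 4 proteins the replicates differ mainly in which proteins are repeated, so the errors can be large relative to the values.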

The dipeptide statistics above (among other statistics for this proteome) can be downloaded from this CSV file
See this proteome in: uniprot_link
Proteome-pI is available under Creative Commons Attribution-NoDerivs license, for more details see here

Reference: Kozlowski LP. Proteome-pI 2.0: Proteome Isoelectric Point Database Update. Nucleic Acids Res. 2021, doi: 10.1093/nar/gkab944
Contact: Lukasz P. Kozlowski