Amino acid dipeptide frequency for Human endogenous retrovirus HCML-ARV

Apart from single amino acid frequencies, one can also calculate the so-called amino acid dipeptide frequency, that is, how often each ordered pair of adjacent amino acids occurs.
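As an illustration, dipeptide frequencies can be computed by sliding a window of length two along each protein sequence. The sketch below is a minimal example, not the database's own code: the function name, the toy sequences, and the choice to normalize by the total number of overlapping pairs (without letting pairs span two proteins) are assumptions made for this example.

```python
from collections import Counter

def dipeptide_per_mille(sequences):
    """Return {dipeptide: frequency in per mille} over overlapping adjacent pairs."""
    counts = Counter()
    for seq in sequences:
        # Overlapping window of length 2; pairs never span two proteins (assumption).
        for i in range(len(seq) - 1):
            counts[seq[i:i + 2]] += 1
    total = sum(counts.values())
    if total == 0:
        return {}
    return {pair: 1000.0 * n / total for pair, n in counts.items()}

# Hypothetical toy sequences in one-letter code, not the HCML-ARV proteins.
toy = ["MALLA", "ALK"]
freqs = dipeptide_per_mille(toy)
print(sorted(freqs.items()))
```

The order of the two residues matters, so "AL" and "LA" are counted separately, which is why the table has separate AlaLeu and LeuAla entries.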

There are 400 possible dipeptides (441 if Xaa is counted as an additional 21st amino acid representing all non-standard or unknown amino acids). Thus, if dipeptides occurred randomly in proteins, each would be expected at a frequency of 0.25% (2.5‰). As this is not the case in nature, for better visibility dipeptides that occur more often than expected are marked in red in the table, and those that are underrepresented are marked in blue.
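The arithmetic behind the uniform expectation, and a comparison of observed values against it, can be sketched as follows. The three observed values are taken from the table on this page. The `classify` helper and the plain greater-than or less-than rule are assumptions for illustration; the exact rule used for the red and blue colouring is not stated here.

```python
N_STANDARD = 20  # standard amino acids, Xaa excluded

# 1 / 400 = 0.25% = 2.5 per mille
expected_per_mille = 1000.0 / (N_STANDARD ** 2)

def classify(observed_per_mille, expected=expected_per_mille):
    """Label a dipeptide relative to the uniform expectation (hypothetical helper)."""
    if observed_per_mille > expected:
        return "over"
    if observed_per_mille < expected:
        return "under"
    return "as expected"

# Observed per mille values copied from the table on this page.
examples = {"LeuLeu": 15.92, "AlaCys": 0.419, "AlaGly": 2.514}
labels = {name: classify(value) for name, value in examples.items()}
print(expected_per_mille, labels)
```

The uniform baseline ignores that single amino acids are themselves not equally common, so a dipeptide of two abundant residues such as LeuLeu can exceed 2.5‰ for that reason alone.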

All values are given in per mille (‰) and must therefore be multiplied by 10⁻³ to obtain fractions.

For more information, see the sequence space article on Wikipedia.

Ala Cys Asp Glu Phe Gly His Ile Lys Leu Met Asn Pro Gln Arg Ser Thr Val Trp Tyr Xaa
Ala
6.284AlaAla: 6.284 ± 1.927
0.419AlaCys: 0.419 ± 0.553
1.676AlaAsp: 1.676 ± 0.89
4.189AlaGlu: 4.189 ± 1.437
2.933AlaPhe: 2.933 ± 1.413
2.514AlaGly: 2.514 ± 0.695
1.676AlaHis: 1.676 ± 1.552
1.676AlaIle: 1.676 ± 1.552
3.77AlaLys: 3.77 ± 1.448
6.284AlaLeu: 6.284 ± 1.37
0.838AlaMet: 0.838 ± 0.522
1.257AlaAsn: 1.257 ± 1.024
7.122AlaPro: 7.122 ± 1.881
2.514AlaGln: 2.514 ± 0.547
4.189AlaArg: 4.189 ± 0.926
3.351AlaSer: 3.351 ± 0.625
4.608AlaThr: 4.608 ± 1.004
7.96AlaVal: 7.96 ± 3.108
1.257AlaTrp: 1.257 ± 0.47
2.095AlaTyr: 2.095 ± 0.272
0.0AlaXaa: 0.0 ± 0.0
Cys
1.676CysAla: 1.676 ± 0.52
0.419CysCys: 0.419 ± 0.388
0.0CysAsp: 0.0 ± 0.0
1.676CysGlu: 1.676 ± 0.814
0.419CysPhe: 0.419 ± 0.261
2.095CysGly: 2.095 ± 1.01
0.838CysHis: 0.838 ± 0.542
0.838CysIle: 0.838 ± 0.776
0.419CysLys: 0.419 ± 0.553
3.351CysLeu: 3.351 ± 0.583
0.419CysMet: 0.419 ± 0.388
1.257CysAsn: 1.257 ± 0.289
2.514CysPro: 2.514 ± 1.063
1.257CysGln: 1.257 ± 0.476
2.095CysArg: 2.095 ± 0.914
0.838CysSer: 0.838 ± 0.776
1.676CysThr: 1.676 ± 1.045
3.351CysVal: 3.351 ± 0.583
0.0CysTrp: 0.0 ± 0.0
0.838CysTyr: 0.838 ± 0.776
0.0CysXaa: 0.0 ± 0.0
Asp
2.095AspAla: 2.095 ± 0.272
2.095AspCys: 2.095 ± 0.272
0.838AspAsp: 0.838 ± 0.343
0.838AspGlu: 0.838 ± 0.542
1.257AspPhe: 1.257 ± 0.783
2.933AspGly: 2.933 ± 0.347
0.0AspHis: 0.0 ± 0.0
2.095AspIle: 2.095 ± 0.272
1.257AspLys: 1.257 ± 1.66
5.027AspLeu: 5.027 ± 1.576
0.838AspMet: 0.838 ± 0.872
1.257AspAsn: 1.257 ± 0.47
5.446AspPro: 5.446 ± 2.394
1.257AspGln: 1.257 ± 0.289
2.095AspArg: 2.095 ± 0.828
1.676AspSer: 1.676 ± 0.678
1.676AspThr: 1.676 ± 0.103
0.838AspVal: 0.838 ± 0.343
1.676AspTrp: 1.676 ± 0.103
1.676AspTyr: 1.676 ± 0.814
0.0AspXaa: 0.0 ± 0.0
Glu
4.189GluAla: 4.189 ± 0.543
0.838GluCys: 0.838 ± 0.542
4.608GluAsp: 4.608 ± 1.417
4.189GluGlu: 4.189 ± 1.437
0.419GluPhe: 0.419 ± 0.388
5.446GluGly: 5.446 ± 1.603
1.676GluHis: 1.676 ± 0.625
3.77GluIle: 3.77 ± 0.317
4.189GluLys: 4.189 ± 0.14
5.865GluLeu: 5.865 ± 0.824
0.838GluMet: 0.838 ± 0.445
2.933GluAsn: 2.933 ± 1.369
2.095GluPro: 2.095 ± 0.272
1.257GluGln: 1.257 ± 0.97
5.027GluArg: 5.027 ± 0.974
3.77GluSer: 3.77 ± 0.909
3.351GluThr: 3.351 ± 0.492
4.608GluVal: 4.608 ± 1.837
1.257GluTrp: 1.257 ± 0.47
0.0GluTyr: 0.0 ± 0.0
0.0GluXaa: 0.0 ± 0.0
Phe
1.676PheAla: 1.676 ± 0.625
0.838PheCys: 0.838 ± 0.522
1.676PheAsp: 1.676 ± 1.083
1.676PheGlu: 1.676 ± 0.103
0.838PhePhe: 0.838 ± 0.343
1.676PheGly: 1.676 ± 0.678
0.0PheHis: 0.0 ± 0.0
1.676PheIle: 1.676 ± 0.685
1.676PheLys: 1.676 ± 0.103
5.027PheLeu: 5.027 ± 1.269
0.0PheMet: 0.0 ± 0.0
1.257PheAsn: 1.257 ± 1.024
0.838PhePro: 0.838 ± 0.542
1.257PheGln: 1.257 ± 0.684
1.257PheArg: 1.257 ± 0.47
2.095PheSer: 2.095 ± 0.78
2.514PheThr: 2.514 ± 1.063
1.257PheVal: 1.257 ± 0.97
1.257PheTrp: 1.257 ± 1.164
1.676PheTyr: 1.676 ± 0.52
0.0PheXaa: 0.0 ± 0.0
Gly
3.77GlyAla: 3.77 ± 1.448
1.257GlyCys: 1.257 ± 0.476
2.514GlyAsp: 2.514 ± 1.005
4.608GlyGlu: 4.608 ± 0.49
3.351GlyPhe: 3.351 ± 0.492
5.027GlyGly: 5.027 ± 0.771
3.351GlyHis: 3.351 ± 0.205
2.933GlyIle: 2.933 ± 1.279
6.703GlyLys: 6.703 ± 2.518
5.446GlyLeu: 5.446 ± 1.331
0.419GlyMet: 0.419 ± 0.261
2.933GlyAsn: 2.933 ± 1.109
2.933GlyPro: 2.933 ± 0.781
4.608GlyGln: 4.608 ± 0.49
1.257GlyArg: 1.257 ± 0.47
4.608GlySer: 4.608 ± 0.336
4.189GlyThr: 4.189 ± 1.179
4.189GlyVal: 4.189 ± 0.14
1.676GlyTrp: 1.676 ± 0.625
2.514GlyTyr: 2.514 ± 0.547
0.0GlyXaa: 0.0 ± 0.0
His
1.257HisAla: 1.257 ± 0.476
0.838HisCys: 0.838 ± 0.343
0.419HisAsp: 0.419 ± 0.261
0.838HisGlu: 0.838 ± 0.343
0.0HisPhe: 0.0 ± 0.0
0.419HisGly: 0.419 ± 0.261
0.838HisHis: 0.838 ± 0.522
0.419HisIle: 0.419 ± 0.553
2.933HisLys: 2.933 ± 0.395
1.676HisLeu: 1.676 ± 0.678
0.838HisMet: 0.838 ± 0.541
0.838HisAsn: 0.838 ± 0.445
1.676HisPro: 1.676 ± 0.678
2.514HisGln: 2.514 ± 0.695
0.419HisArg: 0.419 ± 0.261
1.676HisSer: 1.676 ± 0.103
1.257HisThr: 1.257 ± 1.164
0.419HisVal: 0.419 ± 0.388
0.838HisTrp: 0.838 ± 0.445
0.419HisTyr: 0.419 ± 0.388
0.0HisXaa: 0.0 ± 0.0
Ile
1.257IleAla: 1.257 ± 0.289
1.676IleCys: 1.676 ± 0.678
2.933IleAsp: 2.933 ± 1.089
2.095IleGlu: 2.095 ± 0.272
0.838IlePhe: 0.838 ± 0.343
3.77IleGly: 3.77 ± 1.858
1.257IleHis: 1.257 ± 0.47
3.77IleIle: 3.77 ± 2.393
3.351IleLys: 3.351 ± 1.371
2.933IleLeu: 2.933 ± 0.781
0.419IleMet: 0.419 ± 0.261
1.257IleAsn: 1.257 ± 0.47
2.514IlePro: 2.514 ± 0.579
3.77IleGln: 3.77 ± 0.251
4.608IleArg: 4.608 ± 0.336
1.257IleSer: 1.257 ± 0.47
2.933IleThr: 2.933 ± 1.739
2.933IleVal: 2.933 ± 2.209
0.419IleTrp: 0.419 ± 0.388
1.257IleTyr: 1.257 ± 0.684
0.0IleXaa: 0.0 ± 0.0
Lys
2.514LysAla: 2.514 ± 0.523
1.676LysCys: 1.676 ± 0.625
3.351LysAsp: 3.351 ± 1.357
4.608LysGlu: 4.608 ± 2.548
1.676LysPhe: 1.676 ± 1.058
5.865LysGly: 5.865 ± 0.694
0.838LysHis: 0.838 ± 0.445
1.257LysIle: 1.257 ± 0.47
3.77LysLys: 3.77 ± 0.909
7.96LysLeu: 7.96 ± 1.585
0.419LysMet: 0.419 ± 0.261
2.933LysAsn: 2.933 ± 1.089
2.933LysPro: 2.933 ± 0.781
4.189LysGln: 4.189 ± 1.1
3.77LysArg: 3.77 ± 1.443
2.095LysSer: 2.095 ± 0.828
5.865LysThr: 5.865 ± 2.325
4.608LysVal: 4.608 ± 1.159
2.095LysTrp: 2.095 ± 0.31
1.676LysTyr: 1.676 ± 0.678
0.0LysXaa: 0.0 ± 0.0
Leu
6.284LeuAla: 6.284 ± 0.397
1.676LeuCys: 1.676 ± 0.89
3.351LeuAsp: 3.351 ± 0.205
4.608LeuGlu: 4.608 ± 1.165
4.608LeuPhe: 4.608 ± 1.004
7.96LeuGly: 7.96 ± 3.416
1.257LeuHis: 1.257 ± 0.783
3.351LeuIle: 3.351 ± 0.583
5.027LeuLys: 5.027 ± 0.308
15.92LeuLeu: 15.92 ± 2.262
0.838LeuMet: 0.838 ± 0.445
5.446LeuAsn: 5.446 ± 1.897
10.054LeuPro: 10.054 ± 1.875
7.541LeuGln: 7.541 ± 1.47
6.284LeuArg: 6.284 ± 1.242
2.933LeuSer: 2.933 ± 0.781
5.865LeuThr: 5.865 ± 2.275
7.96LeuVal: 7.96 ± 0.86
0.419LeuTrp: 0.419 ± 0.261
3.351LeuTyr: 3.351 ± 1.241
0.0LeuXaa: 0.0 ± 0.0
Met
1.676MetAla: 1.676 ± 0.89
0.0MetCys: 0.0 ± 0.0
0.419MetAsp: 0.419 ± 0.261
0.419MetGlu: 0.419 ± 0.261
0.419MetPhe: 0.419 ± 0.553
0.838MetGly: 0.838 ± 0.343
0.419MetHis: 0.419 ± 0.261
0.838MetIle: 0.838 ± 0.343
0.0MetLys: 0.0 ± 0.0
0.838MetLeu: 0.838 ± 0.542
0.0MetMet: 0.0 ± 0.0
0.838MetAsn: 0.838 ± 0.343
1.257MetPro: 1.257 ± 0.476
0.419MetGln: 0.419 ± 0.553
0.838MetArg: 0.838 ± 0.776
0.419MetSer: 0.419 ± 0.261
1.676MetThr: 1.676 ± 0.103
1.676MetVal: 1.676 ± 0.89
0.0MetTrp: 0.0 ± 0.0
0.419MetTyr: 0.419 ± 0.553
0.0MetXaa: 0.0 ± 0.0
Asn
2.095AsnAla: 2.095 ± 0.31
1.257AsnCys: 1.257 ± 1.164
0.419AsnAsp: 0.419 ± 0.388
2.095AsnGlu: 2.095 ± 0.828
1.676AsnPhe: 1.676 ± 0.52
0.838AsnGly: 0.838 ± 0.445
0.838AsnHis: 0.838 ± 0.776
1.257AsnIle: 1.257 ± 0.47
2.933AsnLys: 2.933 ± 1.279
2.514AsnLeu: 2.514 ± 0.579
0.419AsnMet: 0.419 ± 0.553
0.419AsnAsn: 0.419 ± 0.553
5.027AsnPro: 5.027 ± 1.59
2.095AsnGln: 2.095 ± 1.361
2.095AsnArg: 2.095 ± 1.361
3.351AsnSer: 3.351 ± 0.583
2.933AsnThr: 2.933 ± 1.315
1.676AsnVal: 1.676 ± 0.52
1.676AsnTrp: 1.676 ± 0.52
0.419AsnTyr: 0.419 ± 0.388
0.0AsnXaa: 0.0 ± 0.0
Pro
6.703ProAla: 6.703 ± 1.069
2.095ProCys: 2.095 ± 0.78
2.933ProAsp: 2.933 ± 0.395
5.027ProGlu: 5.027 ± 0.482
3.77ProPhe: 3.77 ± 0.882
5.027ProGly: 5.027 ± 1.445
0.838ProHis: 0.838 ± 0.522
2.095ProIle: 2.095 ± 1.01
3.77ProLys: 3.77 ± 0.251
5.865ProLeu: 5.865 ± 0.581
0.419ProMet: 0.419 ± 0.261
2.095ProAsn: 2.095 ± 0.272
5.027ProPro: 5.027 ± 0.974
6.284ProGln: 6.284 ± 2.297
4.608ProArg: 4.608 ± 1.89
2.514ProSer: 2.514 ± 0.579
5.027ProThr: 5.027 ± 1.095
7.96ProVal: 7.96 ± 1.344
2.095ProTrp: 2.095 ± 0.78
1.676ProTyr: 1.676 ± 0.89
0.0ProXaa: 0.0 ± 0.0
Gln
6.703GlnAla: 6.703 ± 1.993
0.838GlnCys: 0.838 ± 1.107
2.933GlnAsp: 2.933 ± 1.089
4.189GlnGlu: 4.189 ± 1.437
1.257GlnPhe: 1.257 ± 0.97
2.933GlnGly: 2.933 ± 0.568
0.419GlnHis: 0.419 ± 0.261
2.514GlnIle: 2.514 ± 1.244
4.608GlnLys: 4.608 ± 0.49
4.608GlnLeu: 4.608 ± 1.837
1.676GlnMet: 1.676 ± 0.52
1.676GlnAsn: 1.676 ± 0.52
1.676GlnPro: 1.676 ± 0.89
2.095GlnGln: 2.095 ± 0.914
3.77GlnArg: 3.77 ± 2.157
2.514GlnSer: 2.514 ± 0.547
3.77GlnThr: 3.77 ± 0.816
5.027GlnVal: 5.027 ± 1.153
1.257GlnTrp: 1.257 ± 0.97
2.514GlnTyr: 2.514 ± 1.028
0.0GlnXaa: 0.0 ± 0.0
Arg
4.608ArgAla: 4.608 ± 1.072
1.676ArgCys: 1.676 ± 0.678
3.351ArgAsp: 3.351 ± 0.205
5.865ArgGlu: 5.865 ± 0.752
1.257ArgPhe: 1.257 ± 0.47
4.189ArgGly: 4.189 ± 1.37
0.419ArgHis: 0.419 ± 0.261
2.095ArgIle: 2.095 ± 0.31
2.933ArgLys: 2.933 ± 0.781
5.027ArgLeu: 5.027 ± 0.308
1.257ArgMet: 1.257 ± 0.97
0.838ArgAsn: 0.838 ± 0.542
4.189ArgPro: 4.189 ± 1.1
2.933ArgGln: 2.933 ± 1.067
1.257ArgArg: 1.257 ± 1.66
2.933ArgSer: 2.933 ± 0.747
2.933ArgThr: 2.933 ± 0.586
4.189ArgVal: 4.189 ± 0.65
1.257ArgTrp: 1.257 ± 0.47
1.676ArgTyr: 1.676 ± 0.625
0.0ArgXaa: 0.0 ± 0.0
Ser
3.77SerAla: 3.77 ± 0.882
1.676SerCys: 1.676 ± 1.058
0.838SerAsp: 0.838 ± 0.445
4.608SerGlu: 4.608 ± 0.422
1.257SerPhe: 1.257 ± 0.47
3.351SerGly: 3.351 ± 0.695
0.838SerHis: 0.838 ± 0.776
1.676SerIle: 1.676 ± 1.058
5.865SerLys: 5.865 ± 0.79
5.865SerLeu: 5.865 ± 0.048
0.419SerMet: 0.419 ± 0.225
0.838SerAsn: 0.838 ± 0.776
5.446SerPro: 5.446 ± 1.085
2.514SerGln: 2.514 ± 0.579
2.933SerArg: 2.933 ± 0.781
1.257SerSer: 1.257 ± 0.476
2.514SerThr: 2.514 ± 1.309
1.257SerVal: 1.257 ± 0.783
0.419SerTrp: 0.419 ± 0.388
1.676SerTyr: 1.676 ± 0.52
0.0SerXaa: 0.0 ± 0.0
Thr
3.351ThrAla: 3.351 ± 1.241
2.514ThrCys: 2.514 ± 0.939
2.514ThrAsp: 2.514 ± 0.579
2.514ThrGlu: 2.514 ± 0.241
1.676ThrPhe: 1.676 ± 0.52
5.446ThrGly: 5.446 ± 1.333
1.257ThrHis: 1.257 ± 0.97
4.608ThrIle: 4.608 ± 1.072
3.77ThrLys: 3.77 ± 0.688
8.379ThrLeu: 8.379 ± 1.925
0.419ThrMet: 0.419 ± 0.553
2.933ThrAsn: 2.933 ± 1.279
4.608ThrPro: 4.608 ± 1.165
3.77ThrGln: 3.77 ± 0.251
1.676ThrArg: 1.676 ± 0.52
4.608ThrSer: 4.608 ± 1.255
3.351ThrThr: 3.351 ± 1.241
6.703ThrVal: 6.703 ± 1.791
2.933ThrTrp: 2.933 ± 0.586
1.257ThrTyr: 1.257 ± 0.47
0.0ThrXaa: 0.0 ± 0.0
Val
4.189ValAla: 4.189 ± 0.14
3.351ValCys: 3.351 ± 2.116
2.095ValAsp: 2.095 ± 0.832
2.933ValGlu: 2.933 ± 0.781
1.676ValPhe: 1.676 ± 1.083
4.608ValGly: 4.608 ± 0.49
2.514ValHis: 2.514 ± 1.028
5.027ValIle: 5.027 ± 0.873
2.514ValLys: 2.514 ± 1.567
8.798ValLeu: 8.798 ± 1.025
0.838ValMet: 0.838 ± 0.522
2.933ValAsn: 2.933 ± 1.067
5.865ValPro: 5.865 ± 1.561
3.351ValGln: 3.351 ± 0.492
4.189ValArg: 4.189 ± 0.862
5.027ValSer: 5.027 ± 2.669
6.703ValThr: 6.703 ± 1.62
1.676ValVal: 1.676 ± 1.045
2.095ValTrp: 2.095 ± 0.272
1.257ValTyr: 1.257 ± 0.289
0.0ValXaa: 0.0 ± 0.0
Trp
2.095TrpAla: 2.095 ± 1.01
0.0TrpCys: 0.0 ± 0.0
0.419TrpAsp: 0.419 ± 0.553
1.676TrpGlu: 1.676 ± 0.685
0.419TrpPhe: 0.419 ± 0.388
2.095TrpGly: 2.095 ± 0.31
0.838TrpHis: 0.838 ± 0.542
2.095TrpIle: 2.095 ± 1.306
3.351TrpLys: 3.351 ± 0.492
1.676TrpLeu: 1.676 ± 0.89
0.0TrpMet: 0.0 ± 0.0
0.838TrpAsn: 0.838 ± 0.522
3.77TrpPro: 3.77 ± 0.251
0.0TrpGln: 0.0 ± 0.0
0.838TrpArg: 0.838 ± 0.522
0.0TrpSer: 0.0 ± 0.0
1.676TrpThr: 1.676 ± 0.685
1.676TrpVal: 1.676 ± 0.52
0.419TrpTrp: 0.419 ± 0.388
0.0TrpTyr: 0.0 ± 0.0
0.0TrpXaa: 0.0 ± 0.0
Tyr
0.419TyrAla: 0.419 ± 0.261
0.838TyrCys: 0.838 ± 0.542
0.0TyrAsp: 0.0 ± 0.0
1.676TyrGlu: 1.676 ± 0.685
0.419TyrPhe: 0.419 ± 0.261
1.257TyrGly: 1.257 ± 0.289
0.419TyrHis: 0.419 ± 0.261
1.257TyrIle: 1.257 ± 0.289
1.257TyrLys: 1.257 ± 0.47
1.676TyrLeu: 1.676 ± 0.678
1.676TyrMet: 1.676 ± 0.685
1.257TyrAsn: 1.257 ± 1.164
0.838TyrPro: 0.838 ± 0.522
3.351TyrGln: 3.351 ± 1.628
1.676TyrArg: 1.676 ± 0.52
2.095TyrSer: 2.095 ± 0.914
3.77TyrThr: 3.77 ± 0.868
1.676TyrVal: 1.676 ± 0.685
0.838TyrTrp: 0.838 ± 0.445
0.838TyrTyr: 0.838 ± 0.776
0.0TyrXaa: 0.0 ± 0.0
Xaa
0.0XaaAla: 0.0 ± 0.0
0.0XaaCys: 0.0 ± 0.0
0.0XaaAsp: 0.0 ± 0.0
0.0XaaGlu: 0.0 ± 0.0
0.0XaaPhe: 0.0 ± 0.0
0.0XaaGly: 0.0 ± 0.0
0.0XaaHis: 0.0 ± 0.0
0.0XaaIle: 0.0 ± 0.0
0.0XaaLys: 0.0 ± 0.0
0.0XaaLeu: 0.0 ± 0.0
0.0XaaMet: 0.0 ± 0.0
0.0XaaAsn: 0.0 ± 0.0
0.0XaaPro: 0.0 ± 0.0
0.0XaaGln: 0.0 ± 0.0
0.0XaaArg: 0.0 ± 0.0
0.0XaaSer: 0.0 ± 0.0
0.0XaaThr: 0.0 ± 0.0
0.0XaaVal: 0.0 ± 0.0
0.0XaaTrp: 0.0 ± 0.0
0.0XaaTyr: 0.0 ± 0.0
0.0XaaXaa: 0.0 ± 0.0
Statistics based on 3 proteins (2388 amino acids)
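Each table entry above consists of the cell value followed by the dipeptide name and the value with its error, for example "6.284AlaAla: 6.284 ± 1.927". A minimal sketch for parsing such lines is shown below; the regular expression and the `parse_entry` name are assumptions based only on the layout visible on this page, and the downloadable CSV file may be a more convenient source.

```python
import re

# cell value, first residue, second residue, ": ", value, " ± ", error
ENTRY = re.compile(
    r"^(?P<cell>\d+(?:\.\d+)?)"
    r"(?P<first>[A-Z][a-z]{2})(?P<second>[A-Z][a-z]{2}): "
    r"(?P<value>\d+(?:\.\d+)?) ± (?P<error>\d+(?:\.\d+)?)$"
)

def parse_entry(line):
    """Return (first, second, value, error), or None for non-entry lines."""
    m = ENTRY.match(line.strip())
    if m is None:
        return None  # e.g. a row label such as "Ala"
    return m["first"], m["second"], float(m["value"]), float(m["error"])

rows = [
    parse_entry(s)
    for s in ("6.284AlaAla: 6.284 ± 1.927", "0.0XaaXaa: 0.0 ± 0.0", "Ala")
]
print(rows)
```

Lines that hold only a row label do not match the pattern and are returned as None, so they can be filtered out or used to track the current row.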

Note: The error was estimated by bootstrapping (×100) at the protein level.
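Bootstrapping at the protein level means resampling whole proteins with replacement and recomputing the statistic for each resample. The sketch below illustrates the idea under stated assumptions: the function names, the toy sequences, the fixed seed, and the use of the standard deviation of the replicates as the reported error are all choices made for this example, since the exact procedure is not described here.

```python
import random
import statistics
from collections import Counter

def pair_per_mille(sequences, pair):
    """Per mille frequency of one dipeptide over overlapping adjacent pairs."""
    counts = Counter()
    for seq in sequences:
        for i in range(len(seq) - 1):
            counts[seq[i:i + 2]] += 1
    total = sum(counts.values())
    return 1000.0 * counts[pair] / total if total else 0.0

def bootstrap_error(sequences, pair, n_resamples=100, seed=0):
    """Standard deviation of the statistic over protein-level resamples."""
    rng = random.Random(seed)  # fixed seed so the example is reproducible
    replicates = []
    for _ in range(n_resamples):
        # Draw as many proteins as in the original set, with replacement.
        sample = [rng.choice(sequences) for _ in sequences]
        replicates.append(pair_per_mille(sample, pair))
    return statistics.pstdev(replicates)

# Hypothetical toy proteome in one-letter code, not the HCML-ARV proteins.
toy = ["MALLALK", "GALLP", "MKVLA"]
err_ll = bootstrap_error(toy, "LL")
err_ww = bootstrap_error(toy, "WW")
print(err_ll, err_ww)
```

A dipeptide that never occurs has a frequency of zero in every resample, so its error is zero as well, which is consistent with the "0.0 ± 0.0" entries in the table. With only 3 proteins, the resamples differ little in composition, so such error estimates should be read with caution.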

The dipeptide statistics above (among other statistics for this proteome) can be downloaded from this CSV file
See this proteome in: uniprot_link
Proteome-pI is available under the Creative Commons Attribution-NoDerivs license; for more details see here

Reference: Kozlowski LP. Proteome-pI 2.0: Proteome Isoelectric Point Database Update. Nucleic Acids Res. 2021, doi: 10.1093/nar/gkab944
Contact: Lukasz P. Kozlowski