Amino acid dipeptide frequency for Human immunodeficiency virus type 1 group M subtype C (isolate ETH2220) (HIV-1)

Apart from single amino acid frequencies, one can also calculate the so-called amino acid dipeptide frequency, i.e. how often each ordered pair of adjacent residues occurs.

There are 400 possible dipeptides (441 if Xaa is counted as an additional 21st amino acid representing all non-standard or unknown amino acids). Thus, if dipeptides occurred randomly and uniformly in proteins, each one would be present at a frequency of 0.25% (2.5‰). As this is not the case in nature, for better visibility all dipeptides that are more frequent than expected are marked in red, and those that are underrepresented are marked in blue in the table.
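As a rough illustration of how such frequencies can be obtained, the sketch below counts overlapping residue pairs inside each protein and reports them in per mille, next to the uniform expectation for 400 and 441 dipeptides. The function name `dipeptide_per_mille`, the one-letter toy sequences and the choice not to count pairs across protein boundaries are assumptions made for this example; the exact counting and normalisation used by the database are not described here.

```python
from collections import Counter

def dipeptide_per_mille(proteins):
    """Return {dipeptide: frequency in per mille} over overlapping pairs.

    Pairs are counted inside each protein only, never across protein
    boundaries (an assumption of this sketch).
    """
    counts = Counter()
    for seq in proteins:
        for i in range(len(seq) - 1):
            counts[seq[i:i + 2]] += 1
    total = sum(counts.values())
    if total == 0:
        return {}
    return {pair: 1000.0 * n / total for pair, n in counts.items()}

# Uniform expectation: every dipeptide equally likely.
EXPECTED_20 = 1000.0 / (20 * 20)  # 2.5 per mille, i.e. 0.25%
EXPECTED_21 = 1000.0 / (21 * 21)  # about 2.27 per mille when Xaa is included

# Hypothetical one-letter sequences, not real HIV-1 proteins.
toy = ["MGARASIL", "MAGR"]
freqs = dipeptide_per_mille(toy)
for pair in sorted(freqs):
    print(pair, round(freqs[pair], 3))
```

The table on this page uses three-letter codes (for example AlaArg rather than AR); mapping between the two notations is left out to keep the sketch short.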

All values are given in per mille (‰), so they need to be multiplied by 10⁻³ to obtain fractions.
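The unit conversion can be written out as a small helper; the function names below are illustrative only, and the sample value is the AlaLeu entry from the table.

```python
def per_mille_to_fraction(value):
    """Convert a per mille value to a plain fraction (multiply by 10^-3)."""
    return value / 1000.0

def per_mille_to_percent(value):
    """Convert a per mille value to percent (multiply by 10^-1)."""
    return value / 10.0

ala_leu = 7.948  # AlaLeu frequency from the table, in per mille
print(per_mille_to_fraction(ala_leu))
print(per_mille_to_percent(ala_leu))
```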

For more information, see the sequence space article on Wikipedia.

Residue order (first and second position): Ala Cys Asp Glu Phe Gly His Ile Lys Leu Met Asn Pro Gln Arg Ser Thr Val Trp Tyr Xaa
Each entry: dipeptide: frequency ± error (‰), grouped by first residue.
Ala
AlaAla: 4.825 ± 1.105
AlaCys: 1.987 ± 0.383
AlaAsp: 1.419 ± 0.473
AlaGlu: 5.109 ± 1.172
AlaPhe: 1.987 ± 0.458
AlaGly: 5.961 ± 1.518
AlaHis: 1.419 ± 0.363
AlaIle: 5.109 ± 1.173
AlaLys: 2.555 ± 0.655
AlaLeu: 7.948 ± 1.084
AlaMet: 1.987 ± 0.554
AlaAsn: 3.69 ± 0.894
AlaPro: 3.69 ± 0.952
AlaGln: 1.135 ± 0.344
AlaArg: 4.825 ± 1.332
AlaSer: 3.69 ± 0.627
AlaThr: 2.555 ± 1.053
AlaVal: 2.838 ± 0.847
AlaTrp: 2.271 ± 0.957
AlaTyr: 1.419 ± 0.49
AlaXaa: 0.0 ± 0.0
Cys
CysAla: 0.852 ± 0.508
CysCys: 0.284 ± 0.179
CysAsp: 0.568 ± 0.385
CysGlu: 0.284 ± 0.179
CysPhe: 1.419 ± 0.683
CysGly: 1.703 ± 0.509
CysHis: 0.284 ± 0.236
CysIle: 0.0 ± 0.0
CysLys: 1.703 ± 0.696
CysLeu: 0.852 ± 0.548
CysMet: 0.0 ± 0.276
CysAsn: 1.987 ± 1.455
CysPro: 0.284 ± 0.236
CysGln: 1.419 ± 0.41
CysArg: 1.419 ± 0.572
CysSer: 1.703 ± 0.782
CysThr: 1.703 ± 0.486
CysVal: 1.703 ± 0.486
CysTrp: 0.852 ± 0.376
CysTyr: 0.284 ± 0.375
CysXaa: 0.0 ± 0.0
Asp
AspAla: 1.419 ± 0.305
AspCys: 2.838 ± 0.929
AspAsp: 0.852 ± 0.297
AspGlu: 1.419 ± 0.542
AspPhe: 0.568 ± 0.359
AspGly: 1.419 ± 0.469
AspHis: 0.568 ± 0.484
AspIle: 4.542 ± 0.891
AspLys: 3.974 ± 0.652
AspLeu: 4.542 ± 1.014
AspMet: 1.135 ± 0.554
AspAsn: 1.135 ± 0.399
AspPro: 2.555 ± 0.483
AspGln: 1.419 ± 0.481
AspArg: 4.258 ± 0.926
AspSer: 3.406 ± 1.285
AspThr: 3.69 ± 1.163
AspVal: 1.419 ± 0.556
AspTrp: 0.568 ± 0.516
AspTyr: 1.987 ± 0.435
AspXaa: 0.0 ± 0.0
Glu
GluAla: 5.677 ± 1.33
GluCys: 0.0 ± 0.0
GluAsp: 1.987 ± 1.192
GluGlu: 8.799 ± 1.866
GluPhe: 1.703 ± 0.593
GluGly: 5.677 ± 0.914
GluHis: 0.852 ± 0.538
GluIle: 5.109 ± 1.12
GluLys: 4.825 ± 0.922
GluLeu: 7.38 ± 0.806
GluMet: 1.135 ± 0.248
GluAsn: 2.271 ± 0.837
GluPro: 3.974 ± 1.14
GluGln: 3.974 ± 1.01
GluArg: 3.406 ± 1.107
GluSer: 3.406 ± 0.652
GluThr: 3.406 ± 1.315
GluVal: 3.122 ± 0.769
GluTrp: 1.419 ± 0.561
GluTyr: 0.568 ± 0.497
GluXaa: 0.0 ± 0.0
Phe
PheAla: 1.987 ± 0.319
PheCys: 0.568 ± 0.473
PheAsp: 0.852 ± 0.728
PheGlu: 0.568 ± 0.338
PhePhe: 1.419 ± 0.305
PheGly: 0.852 ± 0.4
PheHis: 0.0 ± 0.0
PheIle: 1.703 ± 0.558
PheLys: 2.271 ± 0.594
PheLeu: 3.122 ± 0.639
PheMet: 0.0 ± 0.0
PheAsn: 3.122 ± 1.046
PhePro: 1.987 ± 0.707
PheGln: 1.135 ± 0.448
PheArg: 3.122 ± 0.962
PheSer: 1.419 ± 0.363
PheThr: 1.703 ± 0.516
PheVal: 0.284 ± 0.179
PheTrp: 0.852 ± 0.399
PheTyr: 1.703 ± 0.598
PheXaa: 0.0 ± 0.0
Gly
GlyAla: 5.961 ± 1.058
GlyCys: 1.703 ± 0.602
GlyAsp: 2.838 ± 0.972
GlyGlu: 3.406 ± 0.771
GlyPhe: 2.271 ± 0.676
GlyGly: 5.677 ± 0.897
GlyHis: 2.555 ± 1.224
GlyIle: 5.961 ± 1.775
GlyLys: 5.393 ± 2.171
GlyLeu: 3.974 ± 1.558
GlyMet: 0.852 ± 0.292
GlyAsn: 3.122 ± 1.201
GlyPro: 5.393 ± 1.24
GlyGln: 3.122 ± 0.962
GlyArg: 3.69 ± 0.914
GlySer: 3.974 ± 0.682
GlyThr: 3.974 ± 1.89
GlyVal: 3.974 ± 0.836
GlyTrp: 1.419 ± 0.665
GlyTyr: 1.987 ± 0.565
GlyXaa: 0.0 ± 0.0
His
HisAla: 0.568 ± 0.296
HisCys: 0.568 ± 0.437
HisAsp: 0.0 ± 0.0
HisGlu: 0.284 ± 0.179
HisPhe: 0.852 ± 0.868
HisGly: 1.419 ± 0.441
HisHis: 0.568 ± 0.834
HisIle: 0.568 ± 0.834
HisLys: 0.852 ± 0.376
HisLeu: 3.406 ± 0.815
HisMet: 1.419 ± 1.247
HisAsn: 1.419 ± 0.67
HisPro: 2.555 ± 0.716
HisGln: 2.838 ± 1.086
HisArg: 0.568 ± 0.199
HisSer: 1.135 ± 0.524
HisThr: 1.135 ± 0.725
HisVal: 0.284 ± 0.179
HisTrp: 0.0 ± 0.0
HisTyr: 1.419 ± 0.764
HisXaa: 0.0 ± 0.0
Ile
IleAla: 3.406 ± 0.877
IleCys: 1.135 ± 0.399
IleAsp: 0.852 ± 0.433
IleGlu: 4.825 ± 0.791
IlePhe: 1.419 ± 0.856
IleGly: 4.542 ± 1.584
IleHis: 1.987 ± 0.802
IleIle: 7.38 ± 1.216
IleLys: 7.38 ± 0.733
IleLeu: 5.961 ± 1.015
IleMet: 1.135 ± 0.368
IleAsn: 2.271 ± 0.667
IlePro: 3.406 ± 0.718
IleGln: 3.122 ± 1.4
IleArg: 3.69 ± 1.254
IleSer: 2.555 ± 0.434
IleThr: 2.555 ± 1.795
IleVal: 6.245 ± 1.495
IleTrp: 1.987 ± 0.777
IleTyr: 1.987 ± 0.49
IleXaa: 0.0 ± 0.0
Lys
LysAla: 5.677 ± 0.931
LysCys: 2.271 ± 0.614
LysAsp: 4.258 ± 1.275
LysGlu: 5.961 ± 1.66
LysPhe: 1.419 ± 0.55
LysGly: 3.406 ± 1.06
LysHis: 1.987 ± 0.443
LysIle: 6.245 ± 1.672
LysLys: 4.825 ± 1.089
LysLeu: 5.961 ± 0.887
LysMet: 0.284 ± 0.179
LysAsn: 1.987 ± 0.701
LysPro: 2.838 ± 1.56
LysGln: 3.974 ± 1.067
LysArg: 3.406 ± 0.523
LysSer: 2.555 ± 0.698
LysThr: 4.542 ± 0.67
LysVal: 4.258 ± 0.891
LysTrp: 2.271 ± 0.547
LysTyr: 1.419 ± 0.404
LysXaa: 0.0 ± 0.0
Leu
LeuAla: 4.542 ± 0.566
LeuCys: 0.852 ± 0.399
LeuAsp: 5.677 ± 0.953
LeuGlu: 6.529 ± 0.924
LeuPhe: 3.122 ± 1.101
LeuGly: 6.529 ± 2.01
LeuHis: 1.987 ± 1.016
LeuIle: 4.542 ± 2.242
LeuLys: 8.232 ± 1.102
LeuLeu: 7.664 ± 1.972
LeuMet: 0.852 ± 0.615
LeuAsn: 5.961 ± 0.933
LeuPro: 1.703 ± 0.829
LeuGln: 6.529 ± 1.093
LeuArg: 5.961 ± 1.234
LeuSer: 2.838 ± 1.136
LeuThr: 4.825 ± 0.765
LeuVal: 4.825 ± 0.937
LeuTrp: 2.838 ± 0.574
LeuTyr: 1.419 ± 0.439
LeuXaa: 0.0 ± 0.0
Met
MetAla: 1.419 ± 0.688
MetCys: 0.284 ± 0.236
MetAsp: 0.568 ± 0.359
MetGlu: 1.419 ± 0.825
MetPhe: 0.852 ± 0.498
MetGly: 2.271 ± 0.593
MetHis: 0.852 ± 0.88
MetIle: 0.852 ± 0.399
MetLys: 1.419 ± 0.487
MetLeu: 2.555 ± 0.698
MetMet: 1.135 ± 0.592
MetAsn: 0.568 ± 0.385
MetPro: 0.284 ± 0.179
MetGln: 1.987 ± 0.763
MetArg: 0.852 ± 0.236
MetSer: 0.852 ± 0.4
MetThr: 2.271 ± 0.775
MetVal: 1.419 ± 0.835
MetTrp: 0.568 ± 0.473
MetTyr: 0.852 ± 0.236
MetXaa: 0.0 ± 0.0
Asn
AsnAla: 1.987 ± 0.45
AsnCys: 3.122 ± 0.786
AsnAsp: 1.987 ± 0.432
AsnGlu: 3.406 ± 0.589
AsnPhe: 3.406 ± 0.978
AsnGly: 2.271 ± 1.253
AsnHis: 0.568 ± 0.608
AsnIle: 2.555 ± 1.196
AsnLys: 2.838 ± 0.726
AsnLeu: 3.974 ± 1.866
AsnMet: 1.703 ± 0.486
AsnAsn: 2.838 ± 0.942
AsnPro: 3.406 ± 1.274
AsnGln: 1.419 ± 0.472
AsnArg: 1.987 ± 0.58
AsnSer: 2.555 ± 1.01
AsnThr: 5.393 ± 0.669
AsnVal: 1.135 ± 0.625
AsnTrp: 1.987 ± 0.458
AsnTyr: 1.135 ± 0.32
AsnXaa: 0.0 ± 0.0
Pro
ProAla: 3.974 ± 1.255
ProCys: 0.568 ± 0.473
ProAsp: 2.838 ± 0.808
ProGlu: 3.974 ± 1.358
ProPhe: 1.419 ± 0.665
ProGly: 4.825 ± 1.214
ProHis: 0.852 ± 0.399
ProIle: 4.258 ± 1.255
ProLys: 3.974 ± 0.753
ProLeu: 3.974 ± 1.495
ProMet: 1.135 ± 0.619
ProAsn: 0.852 ± 0.37
ProPro: 3.406 ± 0.974
ProGln: 2.838 ± 0.598
ProArg: 2.555 ± 0.513
ProSer: 2.838 ± 0.437
ProThr: 2.555 ± 0.508
ProVal: 6.529 ± 1.566
ProTrp: 0.852 ± 0.64
ProTyr: 1.135 ± 0.708
ProXaa: 0.0 ± 0.0
Gln
GlnAla: 4.825 ± 0.751
GlnCys: 0.852 ± 0.594
GlnAsp: 2.838 ± 0.725
GlnGlu: 3.69 ± 0.772
GlnPhe: 0.284 ± 0.179
GlnGly: 5.393 ± 0.521
GlnHis: 0.852 ± 0.407
GlnIle: 3.406 ± 1.063
GlnLys: 4.542 ± 1.163
GlnLeu: 5.961 ± 1.307
GlnMet: 3.69 ± 1.266
GlnAsn: 3.406 ± 1.135
GlnPro: 1.987 ± 0.766
GlnGln: 3.974 ± 0.955
GlnArg: 2.271 ± 1.24
GlnSer: 1.703 ± 0.956
GlnThr: 2.555 ± 0.434
GlnVal: 2.838 ± 1.144
GlnTrp: 1.135 ± 0.399
GlnTyr: 1.987 ± 0.603
GlnXaa: 0.0 ± 0.0
Arg
ArgAla: 6.529 ± 0.993
ArgCys: 0.284 ± 0.417
ArgAsp: 4.542 ± 0.998
ArgGlu: 5.677 ± 1.139
ArgPhe: 1.419 ± 0.555
ArgGly: 4.258 ± 0.855
ArgHis: 1.419 ± 1.055
ArgIle: 3.974 ± 1.859
ArgLys: 2.555 ± 0.554
ArgLeu: 3.69 ± 1.488
ArgMet: 0.852 ± 0.408
ArgAsn: 1.703 ± 0.627
ArgPro: 3.406 ± 1.272
ArgGln: 4.825 ± 1.482
ArgArg: 3.974 ± 2.702
ArgSer: 2.271 ± 1.491
ArgThr: 2.271 ± 0.534
ArgVal: 2.555 ± 0.434
ArgTrp: 1.419 ± 0.685
ArgTyr: 0.568 ± 0.321
ArgXaa: 0.0 ± 0.0
Ser
SerAla: 2.271 ± 0.39
SerCys: 0.0 ± 0.0
SerAsp: 2.271 ± 0.609
SerGlu: 3.69 ± 1.016
SerPhe: 1.135 ± 0.694
SerGly: 3.122 ± 1.699
SerHis: 0.284 ± 0.293
SerIle: 3.406 ± 1.044
SerLys: 1.419 ± 0.696
SerLeu: 5.109 ± 1.24
SerMet: 0.852 ± 0.469
SerAsn: 4.542 ± 0.992
SerPro: 4.258 ± 0.553
SerGln: 3.974 ± 0.8
SerArg: 2.271 ± 1.179
SerSer: 2.271 ± 0.837
SerThr: 3.69 ± 1.581
SerVal: 1.135 ± 0.398
SerTrp: 0.852 ± 0.297
SerTyr: 0.852 ± 0.657
SerXaa: 0.0 ± 0.0
Thr
ThrAla: 4.542 ± 0.965
ThrCys: 0.284 ± 0.236
ThrAsp: 3.122 ± 0.76
ThrGlu: 3.69 ± 0.565
ThrPhe: 1.987 ± 0.567
ThrGly: 3.406 ± 0.491
ThrHis: 0.852 ± 0.399
ThrIle: 3.406 ± 0.732
ThrLys: 2.838 ± 0.623
ThrLeu: 6.529 ± 0.935
ThrMet: 1.703 ± 0.619
ThrAsn: 1.987 ± 0.777
ThrPro: 3.69 ± 0.973
ThrGln: 3.122 ± 0.503
ThrArg: 3.122 ± 0.931
ThrSer: 2.838 ± 0.693
ThrThr: 3.406 ± 0.502
ThrVal: 4.258 ± 0.917
ThrTrp: 1.419 ± 0.525
ThrTyr: 0.852 ± 0.713
ThrXaa: 0.0 ± 0.0
Val
ValAla: 3.122 ± 0.973
ValCys: 0.284 ± 0.375
ValAsp: 3.406 ± 1.442
ValGlu: 3.406 ± 0.872
ValPhe: 0.852 ± 0.538
ValGly: 5.109 ± 0.765
ValHis: 2.838 ± 0.728
ValIle: 2.555 ± 0.798
ValLys: 3.406 ± 0.694
ValLeu: 3.122 ± 0.576
ValMet: 0.284 ± 0.236
ValAsn: 3.122 ± 0.846
ValPro: 3.974 ± 0.64
ValGln: 3.69 ± 0.77
ValArg: 3.406 ± 0.928
ValSer: 3.406 ± 1.594
ValThr: 2.555 ± 0.856
ValVal: 3.406 ± 0.722
ValTrp: 1.987 ± 0.804
ValTyr: 1.419 ± 0.441
ValXaa: 0.0 ± 0.0
Trp
TrpAla: 1.419 ± 0.578
TrpCys: 0.284 ± 0.293
TrpAsp: 1.703 ± 0.794
TrpGlu: 1.987 ± 0.448
TrpPhe: 0.284 ± 0.236
TrpGly: 2.271 ± 1.076
TrpHis: 0.284 ± 0.417
TrpIle: 1.135 ± 0.248
TrpLys: 2.555 ± 0.558
TrpLeu: 0.852 ± 0.56
TrpMet: 1.987 ± 0.45
TrpAsn: 1.703 ± 1.225
TrpPro: 1.135 ± 0.446
TrpGln: 1.987 ± 0.776
TrpArg: 1.703 ± 0.732
TrpSer: 0.568 ± 0.475
TrpThr: 1.419 ± 0.688
TrpVal: 1.703 ± 0.363
TrpTrp: 0.852 ± 0.297
TrpTyr: 0.568 ± 0.199
TrpXaa: 0.0 ± 0.0
Tyr
TyrAla: 1.419 ± 0.856
TyrCys: 1.419 ± 0.492
TyrAsp: 1.135 ± 0.398
TyrGlu: 0.568 ± 0.411
TyrPhe: 0.852 ± 0.433
TyrGly: 1.135 ± 0.489
TyrHis: 0.852 ± 0.37
TyrIle: 1.135 ± 0.533
TyrLys: 1.987 ± 0.699
TyrLeu: 0.852 ± 0.409
TyrMet: 0.852 ± 0.376
TyrAsn: 1.987 ± 0.581
TyrPro: 1.419 ± 0.603
TyrGln: 1.703 ± 0.829
TyrArg: 1.703 ± 1.135
TyrSer: 1.419 ± 0.305
TyrThr: 0.852 ± 0.292
TyrVal: 1.419 ± 0.665
TyrTrp: 0.852 ± 0.409
TyrTyr: 0.852 ± 0.292
TyrXaa: 0.0 ± 0.0
Xaa
XaaAla: 0.0 ± 0.0
XaaCys: 0.0 ± 0.0
XaaAsp: 0.0 ± 0.0
XaaGlu: 0.0 ± 0.0
XaaPhe: 0.0 ± 0.0
XaaGly: 0.0 ± 0.0
XaaHis: 0.0 ± 0.0
XaaIle: 0.0 ± 0.0
XaaLys: 0.0 ± 0.0
XaaLeu: 0.0 ± 0.0
XaaMet: 0.0 ± 0.0
XaaAsn: 0.0 ± 0.0
XaaPro: 0.0 ± 0.0
XaaGln: 0.0 ± 0.0
XaaArg: 0.0 ± 0.0
XaaSer: 0.0 ± 0.0
XaaThr: 0.0 ± 0.0
XaaVal: 0.0 ± 0.0
XaaTrp: 0.0 ± 0.0
XaaTyr: 0.0 ± 0.0
XaaXaa: 0.0 ± 0.0
Statistics based on 9 proteins (3524 amino acids)
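To work with the listing programmatically, entries of the form "AlaLeu: 7.948 ± 1.084" can be parsed and compared with the uniform expectation of 2.5‰. The sketch below is one way to do this; the regular expression, the function names and the simple above/below threshold are assumptions for illustration and may not match how the page assigns its red and blue marking. The pattern also tolerates a number fused in front of the dipeptide name, in case the displayed cell value precedes the label.

```python
import re

# Optional leading number, two three-letter residue codes, mean and error.
LINE = re.compile(
    r"^(?:\d+(?:\.\d+)?)?([A-Z][a-z]{2})([A-Z][a-z]{2}): "
    r"(\d+(?:\.\d+)?) ± (\d+(?:\.\d+)?)$"
)

EXPECTED = 1000.0 / 400  # uniform expectation in per mille (20 residues)

def parse_line(line):
    """Return (first, second, mean, error) or None for non-data lines."""
    m = LINE.match(line.strip())
    if m is None:
        return None
    first, second, mean, err = m.groups()
    return first, second, float(mean), float(err)

def classify(mean, expected=EXPECTED):
    """Label a frequency relative to the uniform expectation."""
    if mean > expected:
        return "over"
    if mean < expected:
        return "under"
    return "equal"

sample = [
    "Ala",                         # row label, skipped by the parser
    "AlaLeu: 7.948 ± 1.084",
    "4.825AlaAla: 4.825 ± 1.105",  # variant with fused leading value
    "CysCys: 0.284 ± 0.179",
]
for line in sample:
    parsed = parse_line(line)
    if parsed is not None:
        first, second, mean, err = parsed
        print(first + second, mean, classify(mean))
```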

Note: The error has been estimated by bootstrapping (x100) at the protein level
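A protein-level bootstrap of this kind can be sketched as follows: whole proteins are resampled with replacement, the dipeptide frequency is recomputed for each resample, and the spread of the 100 estimates is taken as the error. The function names, the fixed seed, the toy sequences and the use of the sample standard deviation are assumptions for this example; the exact estimator used by the database is not stated on this page.

```python
import random
from collections import Counter
from statistics import stdev

def pair_per_mille(proteins, pair):
    """Frequency of one dipeptide (per mille) over all within-protein pairs."""
    counts = Counter()
    for seq in proteins:
        for i in range(len(seq) - 1):
            counts[seq[i:i + 2]] += 1
    total = sum(counts.values())
    return 1000.0 * counts[pair] / total if total else 0.0

def bootstrap_error(proteins, pair, n_rounds=100, seed=0):
    """Standard deviation of the frequency over protein-level resamples."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    estimates = []
    for _ in range(n_rounds):
        # Draw as many proteins as in the original set, with replacement.
        resample = [rng.choice(proteins) for _ in proteins]
        estimates.append(pair_per_mille(resample, pair))
    return stdev(estimates)

# Hypothetical one-letter sequences, not real HIV-1 proteins.
toy = ["MGARASIL", "MAGR", "GAGAG"]
print(round(bootstrap_error(toy, "GA"), 3))
```

Resampling whole proteins rather than individual residues keeps each sequence intact, so the estimate reflects how much the statistic depends on which proteins happen to be in the set.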

The dipeptide statistics above (among other statistics for this proteome) can be downloaded from this CSV file
See this proteome in: uniprot_link
Proteome-pI is available under a Creative Commons Attribution-NoDerivs license; for more details see here

Reference: Kozlowski LP. Proteome-pI 2.0: Proteome Isoelectric Point Database Update. Nucleic Acids Res. 2021, doi: 10.1093/nar/gkab944
Contact: Lukasz P. Kozlowski