Amino acid dipeptide frequency for Human immunodeficiency virus type 1 group M subtype B (isolate HXB2) (HIV-1)

Apart from single amino acid frequencies, one can also calculate the so-called amino acid dipeptide frequency.

There are 400 possible dipeptides (441 if Xaa is counted as an additional, 21st amino acid representing all non-standard or unknown amino acids). Thus, if dipeptides occurred uniformly at random in proteins, each of the 400 amino acid dipeptides would be present at a frequency of 0.25% (2.5‰). As this is not the case in nature, for better visibility all overrepresented dipeptides are marked in red and all underrepresented ones in blue in the table.
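The counting described above can be sketched in a few lines of Python. This is a minimal illustration, not the code used by the database: the function names are hypothetical, sequences are assumed to be given in one-letter codes (the table uses three-letter codes), pairs are assumed to be counted within each protein only, and any residue outside the 20 standard amino acids is mapped to X (Xaa).

```python
from collections import Counter

# The 20 standard amino acids in one-letter code (assumed input alphabet).
STANDARD = "ACDEFGHIKLMNPQRSTVWY"


def dipeptide_frequencies_permille(proteins):
    """Return {dipeptide: frequency in per mille} over all overlapping pairs.

    Assumption: pairs are counted inside each protein only, never across
    protein boundaries. Non-standard residues are mapped to 'X'.
    """
    counts = Counter()
    for seq in proteins:
        cleaned = "".join(aa if aa in STANDARD else "X" for aa in seq.upper())
        for i in range(len(cleaned) - 1):
            counts[cleaned[i:i + 2]] += 1
    total = sum(counts.values())
    if total == 0:
        return {}
    return {pair: 1000.0 * n / total for pair, n in counts.items()}


def expected_uniform_permille(alphabet_size=20):
    """Frequency each dipeptide would have if all pairs were equally likely."""
    return 1000.0 / (alphabet_size ** 2)


def classify(observed_permille, expected_permille):
    """Label a dipeptide as over- or underrepresented relative to expectation."""
    if observed_permille > expected_permille:
        return "over"
    if observed_permille < expected_permille:
        return "under"
    return "as expected"


if __name__ == "__main__":
    expected = expected_uniform_permille(20)  # 1/400 of all pairs
    toy = dipeptide_frequencies_permille(["ACAC", "AAC"])  # toy sequences
    for pair, freq in sorted(toy.items()):
        print(pair, round(freq, 3), classify(freq, expected))
```

With the uniform expectation of 2.5‰, a table value such as AlaAla (5.496‰) would be classified as overrepresented and AlaTyr (0.824‰) as underrepresented.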

All values are given in per mille (‰) and therefore need to be multiplied by 10^-3 to obtain fractions.
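As a small worked example of this unit conversion, the following sketch turns a per mille value from the table into a plain fraction and into a percentage. The helper names are hypothetical and only illustrate the arithmetic.

```python
def permille_to_fraction(value_permille):
    """Convert a per mille value to a plain fraction (multiply by 10^-3)."""
    return value_permille * 1e-3


def permille_to_percent(value_permille):
    """Convert a per mille value to percent (1% = 10 per mille)."""
    return value_permille / 10.0


if __name__ == "__main__":
    ala_ala = 5.496  # AlaAla value from the table, in per mille
    print("fraction:", permille_to_fraction(ala_ala))
    print("percent: ", permille_to_percent(ala_ala))
```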

For more information, see the sequence space article on Wikipedia.

Ala Cys Asp Glu Phe Gly His Ile Lys Leu Met Asn Pro Gln Arg Ser Thr Val Trp Tyr Xaa
Ala
AlaAla: 5.496 ± 1.871
AlaCys: 2.473 ± 0.8
AlaAsp: 1.649 ± 0.541
AlaGlu: 4.946 ± 1.109
AlaPhe: 1.924 ± 0.344
AlaGly: 4.672 ± 1.172
AlaHis: 1.374 ± 0.6
AlaIle: 4.397 ± 0.943
AlaLys: 1.924 ± 0.86
AlaLeu: 5.771 ± 1.18
AlaMet: 1.924 ± 0.624
AlaAsn: 1.924 ± 0.726
AlaPro: 2.748 ± 0.82
AlaGln: 1.649 ± 0.374
AlaArg: 4.122 ± 0.854
AlaSer: 5.496 ± 0.603
AlaThr: 4.397 ± 0.851
AlaVal: 4.122 ± 0.918
AlaTrp: 1.099 ± 0.45
AlaTyr: 0.824 ± 0.342
AlaXaa: 0.0 ± 0.0
Cys
CysAla: 0.824 ± 0.591
CysCys: 0.55 ± 0.757
CysAsp: 0.275 ± 0.192
CysGlu: 0.275 ± 0.369
CysPhe: 1.924 ± 1.62
CysGly: 1.924 ± 0.571
CysHis: 0.0 ± 0.0
CysIle: 0.0 ± 0.0
CysLys: 1.374 ± 0.726
CysLeu: 0.275 ± 0.243
CysMet: 0.275 ± 0.348
CysAsn: 1.374 ± 1.087
CysPro: 0.275 ± 0.243
CysGln: 1.374 ± 0.727
CysArg: 1.924 ± 0.51
CysSer: 1.374 ± 0.897
CysThr: 3.298 ± 0.802
CysVal: 1.649 ± 0.547
CysTrp: 0.824 ± 0.382
CysTyr: 0.824 ± 0.981
CysXaa: 0.0 ± 0.0
Asp
AspAla: 0.824 ± 0.358
AspCys: 2.748 ± 1.08
AspAsp: 1.649 ± 0.485
AspGlu: 1.099 ± 0.586
AspPhe: 1.099 ± 0.767
AspGly: 1.099 ± 0.506
AspHis: 0.0 ± 0.0
AspIle: 3.298 ± 0.839
AspLys: 2.748 ± 0.81
AspLeu: 3.572 ± 0.979
AspMet: 0.824 ± 0.435
AspAsn: 1.649 ± 0.678
AspPro: 2.473 ± 1.398
AspGln: 1.924 ± 0.525
AspArg: 4.672 ± 0.983
AspSer: 2.473 ± 0.83
AspThr: 3.298 ± 0.625
AspVal: 0.824 ± 0.401
AspTrp: 0.55 ± 0.562
AspTyr: 0.824 ± 0.382
AspXaa: 0.0 ± 0.0
Glu
GluAla: 5.221 ± 1.059
GluCys: 0.0 ± 0.0
GluAsp: 2.198 ± 0.658
GluGlu: 7.694 ± 1.672
GluPhe: 1.099 ± 0.506
GluGly: 5.221 ± 0.761
GluHis: 0.55 ± 0.384
GluIle: 4.397 ± 1.106
GluLys: 4.672 ± 0.875
GluLeu: 7.42 ± 0.98
GluMet: 1.924 ± 0.757
GluAsn: 1.374 ± 0.426
GluPro: 5.771 ± 1.863
GluGln: 4.122 ± 0.707
GluArg: 4.122 ± 1.595
GluSer: 2.748 ± 0.92
GluThr: 4.397 ± 1.903
GluVal: 4.122 ± 0.521
GluTrp: 1.924 ± 0.604
GluTyr: 1.374 ± 0.589
GluXaa: 0.0 ± 0.0
Phe
PheAla: 1.374 ± 0.36
PheCys: 0.275 ± 0.243
PheAsp: 0.55 ± 0.562
PheGlu: 0.275 ± 0.243
PhePhe: 0.55 ± 0.486
PheGly: 1.099 ± 0.385
PheHis: 0.824 ± 0.981
PheIle: 1.649 ± 0.773
PheLys: 1.374 ± 0.452
PheLeu: 2.748 ± 0.544
PheMet: 0.0 ± 0.0
PheAsn: 3.023 ± 1.41
PhePro: 1.374 ± 0.918
PheGln: 0.55 ± 0.226
PheArg: 3.023 ± 0.942
PheSer: 2.198 ± 0.552
PheThr: 0.824 ± 0.342
PheVal: 0.55 ± 0.226
PheTrp: 0.275 ± 0.192
PheTyr: 1.649 ± 0.349
PheXaa: 0.0 ± 0.0
Gly
GlyAla: 4.946 ± 0.883
GlyCys: 1.924 ± 0.532
GlyAsp: 2.198 ± 0.88
GlyGlu: 3.572 ± 0.372
GlyPhe: 1.099 ± 0.482
GlyGly: 6.32 ± 1.088
GlyHis: 4.397 ± 1.709
GlyIle: 5.496 ± 1.672
GlyLys: 5.771 ± 1.131
GlyLeu: 4.122 ± 0.641
GlyMet: 0.824 ± 0.357
GlyAsn: 2.473 ± 0.786
GlyPro: 4.946 ± 0.936
GlyGln: 4.397 ± 1.444
GlyArg: 3.572 ± 0.959
GlySer: 4.672 ± 0.948
GlyThr: 3.572 ± 2.199
GlyVal: 3.572 ± 1.05
GlyTrp: 2.198 ± 0.695
GlyTyr: 1.649 ± 0.594
GlyXaa: 0.0 ± 0.0
His
HisAla: 1.099 ± 0.334
HisCys: 0.824 ± 0.745
HisAsp: 0.0 ± 0.0
HisGlu: 0.55 ± 0.226
HisPhe: 0.824 ± 0.998
HisGly: 1.649 ± 0.848
HisHis: 1.099 ± 0.96
HisIle: 1.649 ± 0.818
HisLys: 1.099 ± 0.536
HisLeu: 2.198 ± 0.807
HisMet: 0.55 ± 0.673
HisAsn: 1.099 ± 0.507
HisPro: 2.473 ± 1.218
HisGln: 3.572 ± 1.695
HisArg: 0.824 ± 0.358
HisSer: 2.198 ± 0.659
HisThr: 2.198 ± 0.626
HisVal: 0.55 ± 0.376
HisTrp: 0.0 ± 0.0
HisTyr: 0.55 ± 0.416
HisXaa: 0.0 ± 0.0
Ile
IleAla: 2.473 ± 0.536
IleCys: 1.099 ± 0.452
IleAsp: 1.374 ± 0.567
IleGlu: 4.946 ± 0.753
IlePhe: 0.824 ± 0.429
IleGly: 5.221 ± 1.546
IleHis: 2.198 ± 0.775
IleIle: 4.672 ± 1.102
IleLys: 4.397 ± 1.115
IleLeu: 5.496 ± 0.923
IleMet: 1.099 ± 0.388
IleAsn: 1.649 ± 0.547
IlePro: 4.122 ± 0.934
IleGln: 2.748 ± 1.328
IleArg: 5.221 ± 1.362
IleSer: 3.572 ± 1.006
IleThr: 3.023 ± 1.204
IleVal: 7.969 ± 1.519
IleTrp: 1.924 ± 0.516
IleTyr: 2.198 ± 0.829
IleXaa: 0.0 ± 0.0
Lys
LysAla: 6.32 ± 1.061
LysCys: 2.473 ± 0.624
LysAsp: 2.198 ± 0.769
LysGlu: 7.42 ± 1.859
LysPhe: 0.275 ± 0.192
LysGly: 3.572 ± 0.912
LysHis: 1.924 ± 1.004
LysIle: 6.87 ± 1.815
LysLys: 6.595 ± 2.263
LysLeu: 6.32 ± 1.415
LysMet: 0.275 ± 0.192
LysAsn: 2.748 ± 0.961
LysPro: 1.374 ± 0.686
LysGln: 3.847 ± 0.759
LysArg: 2.748 ± 0.647
LysSer: 1.649 ± 0.365
LysThr: 4.122 ± 0.641
LysVal: 4.122 ± 1.25
LysTrp: 1.374 ± 0.435
LysTyr: 2.198 ± 0.623
LysXaa: 0.0 ± 0.0
Leu
LeuAla: 3.847 ± 0.9
LeuCys: 0.824 ± 0.429
LeuAsp: 4.122 ± 0.896
LeuGlu: 7.145 ± 1.866
LeuPhe: 2.198 ± 1.047
LeuGly: 6.595 ± 1.577
LeuHis: 1.649 ± 1.422
LeuIle: 4.122 ± 1.546
LeuLys: 6.87 ± 1.338
LeuLeu: 8.519 ± 2.812
LeuMet: 0.824 ± 0.527
LeuAsn: 3.847 ± 1.048
LeuPro: 2.473 ± 0.747
LeuGln: 4.946 ± 0.829
LeuArg: 4.946 ± 0.847
LeuSer: 2.748 ± 0.877
LeuThr: 4.397 ± 0.727
LeuVal: 5.496 ± 1.094
LeuTrp: 3.023 ± 0.993
LeuTyr: 2.473 ± 0.771
LeuXaa: 0.0 ± 0.0
Met
MetAla: 1.099 ± 0.586
MetCys: 0.0 ± 0.0
MetAsp: 0.824 ± 0.433
MetGlu: 1.924 ± 1.008
MetPhe: 0.55 ± 0.278
MetGly: 1.924 ± 0.617
MetHis: 0.55 ± 0.226
MetIle: 1.374 ± 0.683
MetLys: 0.55 ± 0.278
MetLeu: 1.649 ± 0.469
MetMet: 1.099 ± 0.556
MetAsn: 0.55 ± 0.37
MetPro: 0.0 ± 0.0
MetGln: 1.099 ± 0.556
MetArg: 2.198 ± 0.502
MetSer: 0.824 ± 0.382
MetThr: 2.473 ± 0.728
MetVal: 0.824 ± 0.234
MetTrp: 0.55 ± 0.486
MetTyr: 1.099 ± 0.325
MetXaa: 0.0 ± 0.0
Asn
AsnAla: 2.198 ± 0.584
AsnCys: 2.748 ± 0.877
AsnAsp: 1.649 ± 0.547
AsnGlu: 2.748 ± 0.709
AsnPhe: 3.298 ± 1.026
AsnGly: 1.649 ± 0.723
AsnHis: 0.275 ± 0.243
AsnIle: 2.198 ± 0.741
AsnLys: 3.298 ± 0.503
AsnLeu: 1.374 ± 0.547
AsnMet: 0.824 ± 0.73
AsnAsn: 3.572 ± 1.881
AsnPro: 3.572 ± 1.167
AsnGln: 1.924 ± 0.624
AsnArg: 1.649 ± 0.586
AsnSer: 3.572 ± 1.001
AsnThr: 4.397 ± 0.976
AsnVal: 1.099 ± 0.659
AsnTrp: 1.924 ± 0.554
AsnTyr: 1.099 ± 0.385
AsnXaa: 0.0 ± 0.0
Pro
ProAla: 2.473 ± 0.825
ProCys: 0.824 ± 0.73
ProAsp: 2.473 ± 0.775
ProGlu: 3.847 ± 0.997
ProPhe: 1.374 ± 0.709
ProGly: 5.496 ± 1.344
ProHis: 0.824 ± 0.549
ProIle: 4.946 ± 1.071
ProLys: 2.748 ± 1.1
ProLeu: 4.397 ± 1.018
ProMet: 0.824 ± 0.445
ProAsn: 0.824 ± 0.685
ProPro: 3.298 ± 1.458
ProGln: 3.023 ± 0.722
ProArg: 4.122 ± 1.471
ProSer: 2.198 ± 0.859
ProThr: 3.298 ± 1.118
ProVal: 4.946 ± 1.212
ProTrp: 1.099 ± 0.862
ProTyr: 0.55 ± 0.384
ProXaa: 0.0 ± 0.0
Gln
GlnAla: 6.32 ± 0.958
GlnCys: 0.275 ± 0.243
GlnAsp: 2.198 ± 0.977
GlnGlu: 3.572 ± 0.654
GlnPhe: 0.275 ± 0.243
GlnGly: 5.221 ± 0.677
GlnHis: 1.374 ± 0.412
GlnIle: 4.397 ± 1.053
GlnLys: 3.298 ± 1.149
GlnLeu: 5.496 ± 1.164
GlnMet: 3.298 ± 1.247
GlnAsn: 4.122 ± 0.977
GlnPro: 2.473 ± 1.67
GlnGln: 2.198 ± 1.051
GlnArg: 4.397 ± 1.436
GlnSer: 1.924 ± 0.473
GlnThr: 1.924 ± 0.594
GlnVal: 4.122 ± 1.461
GlnTrp: 0.55 ± 0.384
GlnTyr: 1.649 ± 0.661
GlnXaa: 0.0 ± 0.0
Arg
ArgAla: 5.221 ± 0.786
ArgCys: 0.55 ± 0.416
ArgAsp: 3.572 ± 0.72
ArgGlu: 4.946 ± 0.978
ArgPhe: 1.374 ± 0.618
ArgGly: 3.847 ± 0.649
ArgHis: 1.099 ± 0.913
ArgIle: 4.946 ± 2.186
ArgLys: 4.672 ± 1.237
ArgLeu: 3.298 ± 1.726
ArgMet: 1.649 ± 0.365
ArgAsn: 2.198 ± 0.781
ArgPro: 2.748 ± 0.873
ArgGln: 6.595 ± 1.254
ArgArg: 4.946 ± 3.493
ArgSer: 3.023 ± 1.429
ArgThr: 1.649 ± 0.629
ArgVal: 2.748 ± 0.649
ArgTrp: 3.298 ± 0.719
ArgTyr: 1.099 ± 0.465
ArgXaa: 0.0 ± 0.0
Ser
SerAla: 3.023 ± 0.521
SerCys: 0.55 ± 0.226
SerAsp: 2.473 ± 0.515
SerGlu: 4.122 ± 0.765
SerPhe: 1.649 ± 0.835
SerGly: 4.122 ± 1.603
SerHis: 0.55 ± 0.562
SerIle: 2.748 ± 0.634
SerLys: 2.198 ± 0.733
SerLeu: 6.595 ± 2.239
SerMet: 1.099 ± 0.322
SerAsn: 2.748 ± 0.715
SerPro: 4.397 ± 1.278
SerGln: 4.946 ± 1.881
SerArg: 2.748 ± 1.173
SerSer: 2.748 ± 0.885
SerThr: 3.298 ± 1.666
SerVal: 2.473 ± 0.399
SerTrp: 0.55 ± 0.226
SerTyr: 1.374 ± 0.851
SerXaa: 0.0 ± 0.0
Thr
ThrAla: 3.572 ± 0.792
ThrCys: 0.0 ± 0.0
ThrAsp: 2.198 ± 0.904
ThrGlu: 4.946 ± 1.284
ThrPhe: 0.824 ± 0.357
ThrGly: 3.298 ± 0.672
ThrHis: 1.924 ± 0.97
ThrIle: 3.572 ± 0.861
ThrLys: 4.122 ± 1.217
ThrLeu: 5.771 ± 1.28
ThrMet: 1.374 ± 0.423
ThrAsn: 3.847 ± 0.572
ThrPro: 3.847 ± 0.737
ThrGln: 2.198 ± 0.754
ThrArg: 2.198 ± 0.866
ThrSer: 4.672 ± 1.433
ThrThr: 3.572 ± 0.811
ThrVal: 5.221 ± 1.066
ThrTrp: 2.198 ± 0.605
ThrTyr: 1.374 ± 0.912
ThrXaa: 0.0 ± 0.0
Val
ValAla: 3.572 ± 0.91
ValCys: 0.55 ± 0.757
ValAsp: 3.572 ± 1.237
ValGlu: 3.847 ± 0.945
ValPhe: 0.824 ± 0.342
ValGly: 5.221 ± 0.507
ValHis: 3.298 ± 1.034
ValIle: 3.847 ± 0.7
ValLys: 4.122 ± 0.956
ValLeu: 3.847 ± 0.871
ValMet: 0.275 ± 0.369
ValAsn: 3.023 ± 0.938
ValPro: 2.748 ± 0.771
ValGln: 4.122 ± 1.286
ValArg: 3.023 ± 0.845
ValSer: 3.572 ± 1.13
ValThr: 4.397 ± 0.694
ValVal: 4.122 ± 1.244
ValTrp: 2.198 ± 0.599
ValTyr: 1.099 ± 0.506
ValXaa: 0.0 ± 0.0
Trp
TrpAla: 1.924 ± 0.361
TrpCys: 0.275 ± 0.411
TrpAsp: 1.649 ± 0.671
TrpGlu: 1.649 ± 0.589
TrpPhe: 0.824 ± 0.572
TrpGly: 2.198 ± 0.822
TrpHis: 0.275 ± 0.369
TrpIle: 1.099 ± 0.45
TrpLys: 3.023 ± 0.628
TrpLeu: 0.824 ± 0.656
TrpMet: 1.649 ± 0.51
TrpAsn: 1.649 ± 1.225
TrpPro: 1.099 ± 0.45
TrpGln: 1.924 ± 0.713
TrpArg: 2.198 ± 0.754
TrpSer: 1.374 ± 0.879
TrpThr: 1.374 ± 0.719
TrpVal: 1.099 ± 0.28
TrpTrp: 0.824 ± 0.342
TrpTyr: 0.55 ± 0.226
TrpXaa: 0.0 ± 0.0
Tyr
TyrAla: 1.099 ± 0.452
TyrCys: 1.649 ± 0.709
TyrAsp: 0.824 ± 0.342
TyrGlu: 0.824 ± 0.549
TyrPhe: 1.374 ± 0.807
TyrGly: 1.374 ± 0.845
TyrHis: 0.824 ± 0.357
TyrIle: 0.55 ± 0.226
TyrLys: 3.572 ± 1.079
TyrLeu: 1.374 ± 0.56
TyrMet: 0.275 ± 0.192
TyrAsn: 1.374 ± 0.709
TyrPro: 1.374 ± 0.618
TyrGln: 2.198 ± 0.791
TyrArg: 0.824 ± 0.467
TyrSer: 1.374 ± 0.305
TyrThr: 0.824 ± 0.357
TyrVal: 1.649 ± 0.684
TyrTrp: 1.099 ± 0.488
TyrTyr: 1.099 ± 0.397
TyrXaa: 0.0 ± 0.0
Xaa
XaaAla: 0.0 ± 0.0
XaaCys: 0.0 ± 0.0
XaaAsp: 0.0 ± 0.0
XaaGlu: 0.0 ± 0.0
XaaPhe: 0.0 ± 0.0
XaaGly: 0.0 ± 0.0
XaaHis: 0.0 ± 0.0
XaaIle: 0.0 ± 0.0
XaaLys: 0.0 ± 0.0
XaaLeu: 0.0 ± 0.0
XaaMet: 0.0 ± 0.0
XaaAsn: 0.0 ± 0.0
XaaPro: 0.0 ± 0.0
XaaGln: 0.0 ± 0.0
XaaArg: 0.0 ± 0.0
XaaSer: 0.0 ± 0.0
XaaThr: 0.0 ± 0.0
XaaVal: 0.0 ± 0.0
XaaTrp: 0.0 ± 0.0
XaaTyr: 0.0 ± 0.0
XaaXaa: 0.0 ± 0.0
Statistics based on 10 proteins (3640 amino acids)

Note: The error was estimated by bootstrapping (×100) at the protein level.
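One plausible reading of this note can be sketched as follows: whole proteins are resampled with replacement 100 times, the dipeptide frequency is recomputed for each resampled set, and the spread of those estimates is reported as the error. The exact procedure is not described here, so the function names, the use of the sample standard deviation, the within-protein pair counting and the fixed random seed are all assumptions made for illustration.

```python
import random
import statistics


def pair_frequency_permille(proteins, pair):
    """Frequency (per mille) of one dipeptide among all overlapping pairs."""
    hits = 0
    total = 0
    for seq in proteins:
        for i in range(len(seq) - 1):
            total += 1
            if seq[i:i + 2] == pair:
                hits += 1
    return 1000.0 * hits / total if total else 0.0


def bootstrap_error(proteins, pair, replicates=100, seed=0):
    """Estimate the error of a dipeptide frequency by protein-level bootstrap.

    Each replicate draws len(proteins) whole proteins with replacement and
    recomputes the frequency; the sample standard deviation of the replicate
    estimates is returned (an assumed choice of error measure).
    """
    rng = random.Random(seed)
    estimates = []
    for _ in range(replicates):
        sample = rng.choices(proteins, k=len(proteins))
        estimates.append(pair_frequency_permille(sample, pair))
    return statistics.stdev(estimates)


if __name__ == "__main__":
    toy_proteome = ["MAAKL", "MKLLA", "AALLK"]  # toy sequences, one-letter codes
    value = pair_frequency_permille(toy_proteome, "AA")
    error = bootstrap_error(toy_proteome, "AA")
    print(f"AA: {value:.3f} ± {error:.3f}")
```

Resampling at the protein level rather than at the residue level keeps each sequence intact, so the error reflects how much the estimate depends on which proteins happen to be in the set.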

The dipeptide statistics above (among other statistics for this proteome) can be downloaded from this CSV file
See this proteome in: uniprot_link
Proteome-pI is available under a Creative Commons Attribution-NoDerivs license; for more details see here

Reference: Kozlowski LP. Proteome-pI 2.0: Proteome Isoelectric Point Database Update. Nucleic Acids Res. 2021, doi: 10.1093/nar/gkab944
Contact: Lukasz P. Kozlowski