Amino acid dipeptide frequency for HIV-1 CRF03_AB

Apart from single amino acid frequencies, one can also calculate the so-called amino acid dipeptide frequency, i.e. how often each ordered pair of adjacent amino acids occurs.

There are 400 possible dipeptides (441 if Xaa is counted as an additional, 21st amino acid representing all non-standard or unknown amino acids). Thus, if dipeptides occurred randomly in proteins, each of the 400 dipeptides would be present at a frequency of 0.25% (2.5‰). As this is not the case in nature, for better visibility all dipeptides that occur more often than expected are marked in red, and those that are underrepresented are marked in blue in the table.
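The counting and the uniform expectation described above can be sketched as follows. This is a minimal illustration, not the code used to build the database: the function name `dipeptide_per_mille`, the toy sequences, and the choice to count overlapping pairs within each protein separately are all assumptions.

```python
from collections import Counter

STANDARD = "ACDEFGHIKLMNPQRSTVWY"  # 20 standard amino acids, one-letter codes


def dipeptide_per_mille(sequences):
    """Return overlapping dipeptide frequencies in per mille.

    Assumption: pairs are counted inside each protein only, so no
    dipeptide spans the boundary between two proteins.
    """
    counts = Counter()
    for seq in sequences:
        # Map anything outside the 20 standard letters to X (Xaa).
        seq = "".join(c if c in STANDARD else "X" for c in seq.upper())
        # Sliding window of length 2 over the sequence.
        counts.update(seq[i:i + 2] for i in range(len(seq) - 1))
    total = sum(counts.values())
    if total == 0:
        return {}
    return {dp: 1000.0 * n / total for dp, n in counts.items()}


# Uniform expectation: 20 * 20 ordered pairs, each equally likely.
n_pairs = len(STANDARD) ** 2      # 400
expected = 1000.0 / n_pairs       # 2.5 per mille, i.e. 0.25 %

# Toy input (hypothetical sequences, not taken from the proteome).
freqs = dipeptide_per_mille(["MKKLLA", "AKKA"])
over = sorted(dp for dp, f in freqs.items() if f > expected)
```

With real data, dipeptides whose value exceeds `expected` would correspond to the red cells, and those below it to the blue cells.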

All values are given in per mille (‰), so they need to be multiplied by 10^-3 to obtain fractions.
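As a small worked example of the unit conversion, the sketch below turns a per mille value into a fraction and a percentage. The helper names are hypothetical; the input value is the LysGlu entry from the table.

```python
def per_mille_to_fraction(value):
    """Convert a per mille value to a plain fraction (multiply by 10^-3)."""
    return value * 1e-3


def per_mille_to_percent(value):
    """Convert a per mille value to percent (1 per mille = 0.1 %)."""
    return value * 0.1


lys_glu = 9.04                                   # LysGlu, in per mille
fraction = per_mille_to_fraction(lys_glu)        # about 0.00904
percent = per_mille_to_percent(lys_glu)          # about 0.904 %
uniform_percent = per_mille_to_percent(2.5)      # about 0.25 %
```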

For more information, see the sequence space article on Wikipedia.

Ala Cys Asp Glu Phe Gly His Ile Lys Leu Met Asn Pro Gln Arg Ser Thr Val Trp Tyr Xaa

Ala
AlaAla: 3.825 ± 0.981
AlaCys: 1.391 ± 0.645
AlaAsp: 1.391 ± 0.768
AlaGlu: 5.216 ± 1.303
AlaPhe: 1.739 ± 0.337
AlaGly: 5.563 ± 1.147
AlaHis: 1.043 ± 0.473
AlaIle: 4.52 ± 1.729
AlaLys: 2.434 ± 0.843
AlaLeu: 5.216 ± 1.163
AlaMet: 1.391 ± 0.619
AlaAsn: 1.739 ± 0.875
AlaPro: 3.129 ± 1.497
AlaGln: 2.086 ± 0.443
AlaArg: 4.52 ± 1.06
AlaSer: 4.52 ± 0.876
AlaThr: 4.52 ± 1.261
AlaVal: 4.172 ± 1.264
AlaTrp: 1.391 ± 0.603
AlaTyr: 1.043 ± 0.473
AlaXaa: 0.0 ± 0.0

Cys
CysAla: 1.043 ± 0.705
CysCys: 0.348 ± 0.598
CysAsp: 0.348 ± 0.249
CysGlu: 0.348 ± 0.426
CysPhe: 1.043 ± 0.845
CysGly: 1.739 ± 0.585
CysHis: 0.0 ± 0.0
CysIle: 0.0 ± 0.0
CysLys: 1.391 ± 0.626
CysLeu: 0.348 ± 0.598
CysMet: 0.348 ± 0.429
CysAsn: 1.391 ± 1.196
CysPro: 0.348 ± 0.299
CysGln: 2.086 ± 1.153
CysArg: 0.695 ± 0.384
CysSer: 2.086 ± 1.096
CysThr: 3.129 ± 1.223
CysVal: 1.739 ± 0.772
CysTrp: 0.695 ± 0.31
CysTyr: 0.695 ± 0.815
CysXaa: 0.0 ± 0.0

Asp
AspAla: 1.739 ± 0.622
AspCys: 3.477 ± 1.275
AspAsp: 2.086 ± 0.594
AspGlu: 1.391 ± 0.615
AspPhe: 1.043 ± 0.782
AspGly: 2.086 ± 0.844
AspHis: 0.348 ± 0.426
AspIle: 4.52 ± 1.206
AspLys: 3.477 ± 1.07
AspLeu: 3.477 ± 1.134
AspMet: 0.695 ± 0.32
AspAsn: 2.434 ± 0.819
AspPro: 3.129 ± 1.145
AspGln: 3.129 ± 0.98
AspArg: 2.782 ± 1.501
AspSer: 3.477 ± 1.24
AspThr: 2.086 ± 0.83
AspVal: 1.391 ± 0.38
AspTrp: 0.695 ± 0.64
AspTyr: 0.695 ± 0.31
AspXaa: 0.0 ± 0.0

Glu
GluAla: 5.563 ± 1.082
GluCys: 0.0 ± 0.0
GluAsp: 2.434 ± 0.867
GluGlu: 6.954 ± 2.398
GluPhe: 1.391 ± 0.526
GluGly: 4.172 ± 0.897
GluHis: 1.043 ± 0.746
GluIle: 5.911 ± 1.386
GluLys: 4.172 ± 1.033
GluLeu: 5.563 ± 1.368
GluMet: 2.086 ± 0.794
GluAsn: 2.782 ± 0.425
GluPro: 5.216 ± 1.173
GluGln: 3.825 ± 1.35
GluArg: 3.477 ± 1.411
GluSer: 2.434 ± 1.039
GluThr: 4.172 ± 1.991
GluVal: 4.868 ± 1.155
GluTrp: 1.391 ± 0.718
GluTyr: 1.391 ± 0.653
GluXaa: 0.0 ± 0.0

Phe
PheAla: 1.043 ± 0.29
PheCys: 0.348 ± 0.299
PheAsp: 1.043 ± 1.01
PheGlu: 0.348 ± 0.299
PhePhe: 1.043 ± 0.29
PheGly: 1.739 ± 0.53
PheHis: 0.0 ± 0.0
PheIle: 1.391 ± 0.777
PheLys: 1.391 ± 0.674
PheLeu: 2.782 ± 0.712
PheMet: 0.348 ± 0.598
PheAsn: 3.129 ± 1.576
PhePro: 2.782 ± 1.307
PheGln: 0.695 ± 0.498
PheArg: 3.129 ± 0.962
PheSer: 1.739 ± 0.669
PheThr: 1.391 ± 0.691
PheVal: 0.695 ± 0.498
PheTrp: 0.348 ± 0.249
PheTyr: 1.739 ± 0.856
PheXaa: 0.0 ± 0.0

Gly
GlyAla: 5.563 ± 0.577
GlyCys: 1.739 ± 0.674
GlyAsp: 4.868 ± 1.376
GlyGlu: 2.434 ± 0.443
GlyPhe: 2.086 ± 0.696
GlyGly: 6.259 ± 1.096
GlyHis: 2.782 ± 1.921
GlyIle: 7.65 ± 2.225
GlyLys: 5.911 ± 1.475
GlyLeu: 2.782 ± 0.963
GlyMet: 1.391 ± 0.36
GlyAsn: 2.086 ± 0.605
GlyPro: 4.52 ± 1.003
GlyGln: 4.52 ± 1.494
GlyArg: 3.477 ± 0.988
GlySer: 5.216 ± 1.494
GlyThr: 3.825 ± 1.507
GlyVal: 2.782 ± 0.602
GlyTrp: 1.739 ± 0.904
GlyTyr: 1.739 ± 0.935
GlyXaa: 0.0 ± 0.0

His
HisAla: 0.0 ± 0.0
HisCys: 0.695 ± 0.616
HisAsp: 0.0 ± 0.0
HisGlu: 0.695 ± 0.308
HisPhe: 1.391 ± 1.208
HisGly: 1.739 ± 0.642
HisHis: 0.695 ± 0.853
HisIle: 1.391 ± 0.926
HisLys: 0.695 ± 0.31
HisLeu: 4.172 ± 0.768
HisMet: 0.0 ± 0.386
HisAsn: 1.043 ± 0.404
HisPro: 2.434 ± 1.059
HisGln: 1.739 ± 1.195
HisArg: 1.391 ± 0.914
HisSer: 1.391 ± 1.113
HisThr: 1.739 ± 0.897
HisVal: 0.348 ± 0.249
HisTrp: 0.0 ± 0.0
HisTyr: 0.695 ± 0.477
HisXaa: 0.0 ± 0.0

Ile
IleAla: 2.086 ± 0.879
IleCys: 1.739 ± 0.604
IleAsp: 1.739 ± 0.965
IleGlu: 4.172 ± 0.755
IlePhe: 0.695 ± 0.331
IleGly: 5.216 ± 2.108
IleHis: 2.782 ± 0.656
IleIle: 5.563 ± 1.749
IleLys: 6.606 ± 1.171
IleLeu: 5.216 ± 1.207
IleMet: 0.695 ± 0.39
IleAsn: 1.391 ± 0.501
IlePro: 3.825 ± 1.007
IleGln: 3.129 ± 1.594
IleArg: 5.563 ± 1.24
IleSer: 4.172 ± 0.91
IleThr: 2.434 ± 1.351
IleVal: 6.954 ± 1.624
IleTrp: 1.739 ± 0.573
IleTyr: 2.434 ± 0.781
IleXaa: 0.0 ± 0.0

Lys
LysAla: 4.172 ± 1.224
LysCys: 1.739 ± 0.738
LysAsp: 3.477 ± 0.942
LysGlu: 9.04 ± 0.845
LysPhe: 1.739 ± 0.626
LysGly: 4.172 ± 1.016
LysHis: 2.086 ± 0.955
LysIle: 5.911 ± 1.443
LysLys: 6.259 ± 1.54
LysLeu: 6.606 ± 1.74
LysMet: 0.695 ± 0.308
LysAsn: 1.739 ± 0.772
LysPro: 2.434 ± 0.949
LysGln: 5.563 ± 1.547
LysArg: 2.782 ± 0.66
LysSer: 3.129 ± 1.164
LysThr: 4.52 ± 0.951
LysVal: 3.825 ± 1.007
LysTrp: 1.739 ± 0.642
LysTyr: 2.086 ± 0.47
LysXaa: 0.0 ± 0.0

Leu
LeuAla: 4.868 ± 1.047
LeuCys: 0.695 ± 0.308
LeuAsp: 5.216 ± 1.306
LeuGlu: 5.563 ± 1.76
LeuPhe: 2.434 ± 1.03
LeuGly: 5.911 ± 1.406
LeuHis: 2.782 ± 1.554
LeuIle: 2.782 ± 0.879
LeuLys: 8.345 ± 1.296
LeuLeu: 6.606 ± 1.726
LeuMet: 1.391 ± 1.155
LeuAsn: 3.825 ± 1.31
LeuPro: 3.477 ± 1.157
LeuGln: 3.129 ± 0.578
LeuArg: 4.52 ± 0.579
LeuSer: 2.086 ± 1.058
LeuThr: 5.216 ± 1.37
LeuVal: 5.911 ± 1.661
LeuTrp: 3.129 ± 1.11
LeuTyr: 2.086 ± 0.66
LeuXaa: 0.0 ± 0.0

Met
MetAla: 1.391 ± 0.783
MetCys: 0.0 ± 0.0
MetAsp: 1.043 ± 0.535
MetGlu: 2.086 ± 1.135
MetPhe: 0.348 ± 0.32
MetGly: 2.434 ± 0.763
MetHis: 1.043 ± 0.554
MetIle: 1.043 ± 0.404
MetLys: 1.043 ± 0.694
MetLeu: 1.391 ± 0.527
MetMet: 1.391 ± 0.937
MetAsn: 0.348 ± 0.249
MetPro: 0.0 ± 0.0
MetGln: 1.043 ± 0.758
MetArg: 1.391 ± 0.69
MetSer: 1.043 ± 0.783
MetThr: 2.086 ± 1.171
MetVal: 0.348 ± 0.299
MetTrp: 0.695 ± 0.505
MetTyr: 0.348 ± 0.32
MetXaa: 0.0 ± 0.0

Asn
AsnAla: 2.086 ± 1.387
AsnCys: 2.434 ± 1.287
AsnAsp: 1.739 ± 0.759
AsnGlu: 1.739 ± 0.733
AsnPhe: 3.825 ± 1.118
AsnGly: 1.739 ± 0.839
AsnHis: 0.0 ± 0.0
AsnIle: 2.434 ± 0.762
AsnLys: 2.782 ± 0.808
AsnLeu: 2.434 ± 0.843
AsnMet: 1.739 ± 0.772
AsnAsn: 3.477 ± 1.712
AsnPro: 3.129 ± 1.008
AsnGln: 1.391 ± 0.645
AsnArg: 1.391 ± 0.622
AsnSer: 1.043 ± 0.655
AsnThr: 4.172 ± 2.111
AsnVal: 1.739 ± 1.125
AsnTrp: 1.391 ± 0.38
AsnTyr: 0.348 ± 0.32
AsnXaa: 0.0 ± 0.0

Pro
ProAla: 4.172 ± 1.177
ProCys: 1.043 ± 0.897
ProAsp: 1.739 ± 0.61
ProGlu: 4.172 ± 1.02
ProPhe: 1.739 ± 0.745
ProGly: 3.825 ± 1.018
ProHis: 0.348 ± 0.249
ProIle: 4.172 ± 0.949
ProLys: 3.477 ± 1.655
ProLeu: 5.563 ± 1.013
ProMet: 0.695 ± 0.501
ProAsn: 0.695 ± 0.477
ProPro: 4.172 ± 1.426
ProGln: 4.172 ± 1.23
ProArg: 2.782 ± 1.8
ProSer: 3.825 ± 1.533
ProThr: 2.434 ± 0.61
ProVal: 5.563 ± 1.135
ProTrp: 1.043 ± 0.873
ProTyr: 1.043 ± 0.652
ProXaa: 0.0 ± 0.0

Gln
GlnAla: 5.216 ± 1.17
GlnCys: 0.348 ± 0.299
GlnAsp: 3.129 ± 0.997
GlnGlu: 3.825 ± 0.659
GlnPhe: 0.348 ± 0.299
GlnGly: 4.172 ± 1.094
GlnHis: 1.739 ± 0.582
GlnIle: 4.868 ± 1.148
GlnLys: 3.825 ± 1.799
GlnLeu: 6.606 ± 1.922
GlnMet: 2.434 ± 1.345
GlnAsn: 2.782 ± 1.37
GlnPro: 2.434 ± 1.798
GlnGln: 2.782 ± 1.062
GlnArg: 3.129 ± 1.63
GlnSer: 3.129 ± 0.816
GlnThr: 1.391 ± 0.761
GlnVal: 3.825 ± 1.867
GlnTrp: 0.695 ± 0.498
GlnTyr: 2.086 ± 1.001
GlnXaa: 0.0 ± 0.0

Arg
ArgAla: 5.563 ± 1.579
ArgCys: 0.348 ± 0.299
ArgAsp: 3.477 ± 0.918
ArgGlu: 5.563 ± 1.092
ArgPhe: 1.043 ± 0.718
ArgGly: 3.129 ± 1.078
ArgHis: 0.695 ± 0.682
ArgIle: 4.868 ± 2.138
ArgLys: 5.216 ± 1.179
ArgLeu: 3.129 ± 1.374
ArgMet: 0.695 ± 0.543
ArgAsn: 2.086 ± 1.086
ArgPro: 2.782 ± 1.122
ArgGln: 4.52 ± 0.984
ArgArg: 4.52 ± 2.979
ArgSer: 1.391 ± 0.755
ArgThr: 2.434 ± 1.124
ArgVal: 2.434 ± 0.809
ArgTrp: 2.434 ± 0.754
ArgTyr: 1.391 ± 0.576
ArgXaa: 0.0 ± 0.0

Ser
SerAla: 2.086 ± 0.539
SerCys: 1.043 ± 0.404
SerAsp: 1.739 ± 0.424
SerGlu: 4.52 ± 1.281
SerPhe: 1.391 ± 0.843
SerGly: 4.52 ± 1.175
SerHis: 0.695 ± 0.64
SerIle: 3.477 ± 1.063
SerLys: 3.129 ± 0.924
SerLeu: 5.216 ± 1.979
SerMet: 1.391 ± 0.584
SerAsn: 2.782 ± 0.853
SerPro: 4.172 ± 1.253
SerGln: 3.477 ± 1.37
SerArg: 3.129 ± 1.228
SerSer: 4.868 ± 1.185
SerThr: 3.825 ± 1.088
SerVal: 2.086 ± 0.47
SerTrp: 0.695 ± 0.308
SerTyr: 0.695 ± 0.616
SerXaa: 0.0 ± 0.0

Thr
ThrAla: 3.129 ± 1.189
ThrCys: 0.0 ± 0.0
ThrAsp: 3.825 ± 1.31
ThrGlu: 5.911 ± 1.274
ThrPhe: 0.695 ± 0.421
ThrGly: 4.868 ± 0.992
ThrHis: 1.391 ± 0.36
ThrIle: 2.434 ± 0.558
ThrLys: 3.825 ± 0.777
ThrLeu: 6.259 ± 1.965
ThrMet: 0.695 ± 0.477
ThrAsn: 2.086 ± 0.679
ThrPro: 3.477 ± 1.103
ThrGln: 3.129 ± 0.582
ThrArg: 2.086 ± 1.096
ThrSer: 3.477 ± 0.915
ThrThr: 2.434 ± 1.385
ThrVal: 4.868 ± 1.286
ThrTrp: 2.086 ± 0.705
ThrTyr: 2.086 ± 0.857
ThrXaa: 0.0 ± 0.0

Val
ValAla: 3.129 ± 1.155
ValCys: 0.695 ± 0.711
ValAsp: 2.782 ± 1.019
ValGlu: 2.434 ± 0.933
ValPhe: 1.739 ± 0.728
ValGly: 6.259 ± 1.19
ValHis: 1.739 ± 0.392
ValIle: 3.477 ± 0.808
ValLys: 4.172 ± 0.946
ValLeu: 4.52 ± 1.364
ValMet: 0.695 ± 0.477
ValAsn: 1.739 ± 0.924
ValPro: 3.477 ± 1.102
ValGln: 4.52 ± 1.465
ValArg: 2.782 ± 1.016
ValSer: 3.477 ± 0.539
ValThr: 4.868 ± 1.771
ValVal: 4.868 ± 2.168
ValTrp: 2.086 ± 0.98
ValTyr: 2.086 ± 0.613
ValXaa: 0.0 ± 0.0

Trp
TrpAla: 1.739 ± 0.503
TrpCys: 0.348 ± 0.438
TrpAsp: 1.391 ± 0.492
TrpGlu: 1.391 ± 0.674
TrpPhe: 0.0 ± 0.0
TrpGly: 2.086 ± 0.79
TrpHis: 0.348 ± 0.426
TrpIle: 0.695 ± 0.31
TrpLys: 2.782 ± 0.672
TrpLeu: 0.695 ± 0.728
TrpMet: 1.043 ± 0.579
TrpAsn: 1.739 ± 1.127
TrpPro: 1.391 ± 0.456
TrpGln: 2.086 ± 0.817
TrpArg: 2.434 ± 0.648
TrpSer: 0.695 ± 0.505
TrpThr: 1.739 ± 1.023
TrpVal: 1.739 ± 0.495
TrpTrp: 0.695 ± 0.498
TrpTyr: 0.348 ± 0.249
TrpXaa: 0.0 ± 0.0

Tyr
TyrAla: 2.434 ± 1.144
TyrCys: 1.391 ± 0.798
TyrAsp: 0.695 ± 0.498
TyrGlu: 1.043 ± 0.68
TyrPhe: 1.391 ± 0.603
TyrGly: 1.739 ± 0.948
TyrHis: 0.695 ± 0.421
TyrIle: 1.043 ± 0.68
TyrLys: 2.782 ± 1.517
TyrLeu: 1.391 ± 0.5
TyrMet: 0.348 ± 0.249
TyrAsn: 1.739 ± 0.761
TyrPro: 0.348 ± 0.32
TyrGln: 2.086 ± 1.235
TyrArg: 1.739 ± 0.622
TyrSer: 1.739 ± 0.495
TyrThr: 0.348 ± 0.249
TyrVal: 1.391 ± 0.674
TyrTrp: 0.695 ± 0.417
TyrTyr: 1.391 ± 0.472
TyrXaa: 0.0 ± 0.0

Xaa
XaaAla: 0.0 ± 0.0
XaaCys: 0.0 ± 0.0
XaaAsp: 0.0 ± 0.0
XaaGlu: 0.0 ± 0.0
XaaPhe: 0.0 ± 0.0
XaaGly: 0.0 ± 0.0
XaaHis: 0.0 ± 0.0
XaaIle: 0.0 ± 0.0
XaaLys: 0.0 ± 0.0
XaaLeu: 0.0 ± 0.0
XaaMet: 0.0 ± 0.0
XaaAsn: 0.0 ± 0.0
XaaPro: 0.0 ± 0.0
XaaGln: 0.0 ± 0.0
XaaArg: 0.0 ± 0.0
XaaSer: 0.0 ± 0.0
XaaThr: 0.0 ± 0.0
XaaVal: 0.0 ± 0.0
XaaTrp: 0.0 ± 0.0
XaaTyr: 0.0 ± 0.0
XaaXaa: 0.0 ± 0.0

Statistics based on 9 proteins (2877 amino acids)

Note: The error has been estimated by bootstrapping (x100) at the protein level.
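A protein-level bootstrap of this kind could look roughly like the sketch below. It is only an illustration under stated assumptions: whole proteins are resampled with replacement, the frequency is recomputed for each replicate, and the reported error is taken to be the standard deviation across replicates. The function names, the fixed seed, and the toy sequences are hypothetical, and the database may compute its error differently.

```python
import random
import statistics
from collections import Counter


def dipeptide_freq(sequences, dipeptide):
    """Per mille frequency of one dipeptide over a list of sequences."""
    counts = Counter()
    for seq in sequences:
        counts.update(seq[i:i + 2] for i in range(len(seq) - 1))
    total = sum(counts.values())
    return 1000.0 * counts[dipeptide] / total if total else 0.0


def bootstrap_error(sequences, dipeptide, n_boot=100, seed=0):
    """Standard deviation of the frequency over n_boot protein resamples."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    estimates = []
    for _ in range(n_boot):
        # Resample whole proteins with replacement (protein-level bootstrap).
        sample = rng.choices(sequences, k=len(sequences))
        estimates.append(dipeptide_freq(sample, dipeptide))
    return statistics.stdev(estimates)


# Toy proteome (hypothetical sequences, one-letter codes).
proteins = ["MKKLLA", "AKKA", "MLLAEE", "KEKEKL"]
mean_ke = dipeptide_freq(proteins, "KE")
err_ke = bootstrap_error(proteins, "KE")
```

Resampling proteins rather than individual residues keeps each sequence intact, so dipeptides that cluster in a single protein produce a larger error than evenly spread ones.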

The dipeptide statistics above (along with other statistics for this proteome) can be downloaded from this CSV file
See this proteome in: uniprot_link
Proteome-pI is available under Creative Commons Attribution-NoDerivs license, for more details see here

Reference: Kozlowski LP. Proteome-pI 2.0: Proteome Isoelectric Point Database Update. Nucleic Acids Res. 2021, doi: 10.1093/nar/gkab944
Contact: Lukasz P. Kozlowski