Amino acid dipeptide frequency for Sumatran orang-utan polyomavirus

Apart from single amino acid frequencies, one can also calculate the so-called amino acid dipeptide frequency, i.e. how often each ordered pair of adjacent amino acids occurs.

There are 400 possibilities (441 if we consider Xaa an additional 21st amino acid representing all non-standard or unknown amino acids). Thus, if dipeptides occurred randomly in proteins, each amino acid dipeptide would be present with a frequency of 0.25% (2.5‰). As this is not the case in nature, for better visibility all dipeptides that occur more often than expected are marked in red, and those that are underrepresented are marked in blue in the table.
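
The uniform expectation can be checked with a few lines of Python. This is only a sketch: the function names are hypothetical, and it assumes the baseline is the 400-dipeptide case with no tolerance band, which the page does not state explicitly.

```python
def expected_per_mille(n_residue_types: int = 20) -> float:
    """Uniform expectation for one ordered dipeptide, in per mille."""
    return 1000.0 / (n_residue_types ** 2)


def classify(observed_per_mille: float, n_residue_types: int = 20) -> str:
    """Compare an observed value with the uniform expectation (assumed rule)."""
    expected = expected_per_mille(n_residue_types)
    if observed_per_mille > expected:
        return "over"
    if observed_per_mille < expected:
        return "under"
    return "as expected"


print(expected_per_mille(20))  # 20 x 20 = 400 possible dipeptides
print(expected_per_mille(21))  # 21 x 21 = 441 when Xaa is included
print(classify(9.852))         # LeuLeu value from the table
print(classify(0.547))         # AlaHis value from the table
```

With 20 residue types the expectation is 2.5‰, so a value such as LeuLeu (9.852‰) counts as overrepresented and AlaHis (0.547‰) as underrepresented under this assumed rule.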

All values are presented in per mille (‰) and therefore need to be multiplied by 10⁻³ to obtain fractions.
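
As an illustration, dipeptide frequencies in per mille could be computed from a set of protein sequences roughly as follows. This is a minimal sketch, not the database's own code: the function name is hypothetical, sequences are assumed to use one-letter codes (the table uses three-letter codes), any non-standard letter is mapped to X (Xaa), and the counts are normalised by the total number of dipeptides, which may differ from the normalisation used for the table.

```python
from collections import Counter

# The 20 standard amino acids in one-letter code.
STANDARD = set("ACDEFGHIKLMNPQRSTVWY")


def dipeptide_per_mille(sequences):
    """Return {dipeptide: frequency in per mille} over all given sequences."""
    counts = Counter()
    for seq in sequences:
        # Map anything outside the standard alphabet to X (Xaa).
        residues = [c if c in STANDARD else "X" for c in seq.upper()]
        # Count overlapping adjacent pairs within each protein only.
        for first, second in zip(residues, residues[1:]):
            counts[first + second] += 1
    total = sum(counts.values())
    if total == 0:
        return {}
    return {pair: 1000.0 * n / total for pair, n in counts.items()}


freqs = dipeptide_per_mille(["AAK", "AK"])
for pair, value in sorted(freqs.items()):
    # Multiply by 1e-3 to turn a per mille value into a fraction.
    print(pair, round(value, 3), round(value * 1e-3, 6))
```

Pairs are counted within each protein separately, so the last residue of one protein is never paired with the first residue of the next.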

For more information, see the sequence space article on Wikipedia.

Ala Cys Asp Glu Phe Gly His Ile Lys Leu Met Asn Pro Gln Arg Ser Thr Val Trp Tyr Xaa
Ala
AlaAla: 4.379 ± 0.96
AlaCys: 1.642 ± 0.723
AlaAsp: 3.284 ± 1.803
AlaGlu: 2.737 ± 1.502
AlaPhe: 3.284 ± 0.389
AlaGly: 2.737 ± 1.024
AlaHis: 0.547 ± 0.347
AlaIle: 4.379 ± 1.552
AlaLys: 3.831 ± 1.701
AlaLeu: 6.021 ± 1.743
AlaMet: 0.0 ± 0.0
AlaAsn: 2.189 ± 0.752
AlaPro: 0.547 ± 0.532
AlaGln: 4.379 ± 0.889
AlaArg: 5.473 ± 2.739
AlaSer: 5.473 ± 1.337
AlaThr: 3.284 ± 1.531
AlaVal: 3.831 ± 0.869
AlaTrp: 0.547 ± 0.347
AlaTyr: 0.0 ± 0.0
AlaXaa: 0.0 ± 0.0
Cys
CysAla: 3.831 ± 1.358
CysCys: 0.0 ± 0.0
CysAsp: 2.737 ± 1.299
CysGlu: 0.547 ± 0.347
CysPhe: 1.095 ± 1.138
CysGly: 1.095 ± 0.429
CysHis: 0.547 ± 0.569
CysIle: 1.095 ± 0.55
CysLys: 4.379 ± 1.644
CysLeu: 3.284 ± 1.622
CysMet: 0.0 ± 0.0
CysAsn: 1.095 ± 0.694
CysPro: 2.189 ± 1.417
CysGln: 0.547 ± 0.347
CysArg: 0.0 ± 0.0
CysSer: 1.095 ± 0.694
CysThr: 1.095 ± 0.429
CysVal: 0.547 ± 0.569
CysTrp: 0.0 ± 0.0
CysTyr: 1.642 ± 0.627
CysXaa: 0.0 ± 0.0
Asp
AspAla: 1.095 ± 1.063
AspCys: 0.0 ± 0.0
AspAsp: 2.189 ± 1.101
AspGlu: 3.831 ± 0.824
AspPhe: 2.737 ± 1.299
AspGly: 3.284 ± 0.924
AspHis: 0.547 ± 0.347
AspIle: 6.021 ± 1.463
AspLys: 6.568 ± 0.87
AspLeu: 2.737 ± 0.798
AspMet: 2.737 ± 1.507
AspAsn: 1.642 ± 0.723
AspPro: 3.831 ± 0.501
AspGln: 0.547 ± 0.347
AspArg: 1.642 ± 0.672
AspSer: 7.663 ± 1.51
AspThr: 1.095 ± 1.138
AspVal: 3.831 ± 0.381
AspTrp: 1.095 ± 0.762
AspTyr: 3.284 ± 0.857
AspXaa: 0.0 ± 0.0
Glu
GluAla: 8.758 ± 2.423
GluCys: 1.642 ± 0.723
GluAsp: 3.284 ± 0.579
GluGlu: 7.115 ± 2.025
GluPhe: 2.737 ± 0.959
GluGly: 3.284 ± 1.2
GluHis: 1.095 ± 1.138
GluIle: 2.737 ± 0.723
GluLys: 4.926 ± 2.628
GluLeu: 7.663 ± 1.241
GluMet: 1.095 ± 0.429
GluAsn: 4.926 ± 0.839
GluPro: 0.547 ± 0.532
GluGln: 3.284 ± 0.868
GluArg: 0.0 ± 0.0
GluSer: 6.021 ± 1.687
GluThr: 2.189 ± 0.523
GluVal: 4.379 ± 1.701
GluTrp: 1.095 ± 0.55
GluTyr: 0.547 ± 0.347
GluXaa: 0.0 ± 0.0
Phe
PheAla: 2.189 ± 0.843
PheCys: 2.189 ± 0.992
PheAsp: 2.189 ± 0.992
PheGlu: 2.189 ± 0.992
PhePhe: 1.095 ± 0.429
PheGly: 1.095 ± 1.063
PheHis: 2.189 ± 0.752
PheIle: 1.642 ± 0.627
PheLys: 2.737 ± 1.736
PheLeu: 4.926 ± 0.658
PheMet: 1.642 ± 0.723
PheAsn: 3.284 ± 1.254
PhePro: 2.189 ± 0.752
PheGln: 2.189 ± 1.039
PheArg: 0.547 ± 0.347
PheSer: 5.473 ± 1.202
PheThr: 2.737 ± 0.743
PheVal: 2.189 ± 0.628
PheTrp: 0.0 ± 0.0
PheTyr: 1.642 ± 0.672
PheXaa: 0.0 ± 0.0
Gly
GlyAla: 2.189 ± 1.039
GlyCys: 0.547 ± 0.347
GlyAsp: 3.284 ± 0.902
GlyGlu: 4.379 ± 0.96
GlyPhe: 2.737 ± 1.37
GlyGly: 4.926 ± 1.866
GlyHis: 1.095 ± 0.762
GlyIle: 2.737 ± 0.828
GlyLys: 2.189 ± 1.389
GlyLeu: 7.663 ± 1.617
GlyMet: 0.547 ± 0.525
GlyAsn: 2.737 ± 1.299
GlyPro: 4.926 ± 0.839
GlyGln: 4.926 ± 0.873
GlyArg: 2.737 ± 0.409
GlySer: 2.737 ± 0.813
GlyThr: 2.737 ± 0.632
GlyVal: 4.379 ± 1.916
GlyTrp: 0.0 ± 0.0
GlyTyr: 0.0 ± 0.0
GlyXaa: 0.0 ± 0.0
His
HisAla: 1.642 ± 0.672
HisCys: 0.0 ± 0.0
HisAsp: 0.0 ± 0.0
HisGlu: 1.642 ± 0.672
HisPhe: 0.0 ± 0.0
HisGly: 0.0 ± 0.0
HisHis: 0.0 ± 0.0
HisIle: 0.547 ± 0.569
HisLys: 2.737 ± 1.238
HisLeu: 1.095 ± 1.138
HisMet: 1.642 ± 0.572
HisAsn: 0.547 ± 0.347
HisPro: 2.189 ± 0.742
HisGln: 0.0 ± 0.0
HisArg: 0.547 ± 0.347
HisSer: 2.737 ± 0.409
HisThr: 0.0 ± 0.0
HisVal: 0.0 ± 0.0
HisTrp: 0.0 ± 0.0
HisTyr: 1.095 ± 0.694
HisXaa: 0.0 ± 0.0
Ile
IleAla: 1.642 ± 0.48
IleCys: 1.095 ± 0.429
IleAsp: 3.284 ± 0.389
IleGlu: 3.831 ± 0.631
IlePhe: 2.189 ± 1.101
IleGly: 3.284 ± 2.285
IleHis: 0.0 ± 0.0
IleIle: 2.737 ± 1.197
IleLys: 2.189 ± 0.628
IleLeu: 5.473 ± 1.928
IleMet: 1.642 ± 0.723
IleAsn: 1.642 ± 1.162
IlePro: 2.737 ± 1.394
IleGln: 0.547 ± 0.569
IleArg: 2.737 ± 0.798
IleSer: 2.737 ± 0.743
IleThr: 4.379 ± 1.616
IleVal: 5.473 ± 0.763
IleTrp: 1.642 ± 0.906
IleTyr: 2.737 ± 1.238
IleXaa: 0.0 ± 0.0
Lys
LysAla: 2.737 ± 1.156
LysCys: 3.284 ± 1.861
LysAsp: 1.095 ± 0.694
LysGlu: 5.473 ± 1.471
LysPhe: 1.095 ± 0.429
LysGly: 4.379 ± 1.274
LysHis: 2.189 ± 0.752
LysIle: 2.737 ± 1.054
LysLys: 7.115 ± 0.844
LysLeu: 4.379 ± 1.461
LysMet: 3.284 ± 1.1
LysAsn: 2.737 ± 0.949
LysPro: 3.284 ± 1.058
LysGln: 4.926 ± 1.473
LysArg: 5.473 ± 0.677
LysSer: 1.095 ± 0.694
LysThr: 3.831 ± 1.398
LysVal: 4.379 ± 1.676
LysTrp: 0.0 ± 0.0
LysTyr: 4.379 ± 1.027
LysXaa: 0.0 ± 0.0
Leu
LeuAla: 3.831 ± 1.916
LeuCys: 1.642 ± 0.572
LeuAsp: 8.21 ± 1.237
LeuGlu: 6.568 ± 1.223
LeuPhe: 6.568 ± 1.1
LeuGly: 5.473 ± 1.952
LeuHis: 0.0 ± 0.0
LeuIle: 9.305 ± 2.233
LeuLys: 2.189 ± 0.992
LeuLeu: 9.852 ± 3.175
LeuMet: 2.189 ± 0.462
LeuAsn: 5.473 ± 1.521
LeuPro: 6.021 ± 1.368
LeuGln: 4.926 ± 0.839
LeuArg: 2.189 ± 0.628
LeuSer: 4.379 ± 1.243
LeuThr: 4.926 ± 0.914
LeuVal: 2.189 ± 0.48
LeuTrp: 2.189 ± 1.101
LeuTyr: 3.284 ± 0.55
LeuXaa: 0.0 ± 0.0
Met
MetAla: 2.737 ± 0.409
MetCys: 1.095 ± 0.55
MetAsp: 2.737 ± 1.238
MetGlu: 2.189 ± 0.628
MetPhe: 1.095 ± 0.429
MetGly: 1.095 ± 0.686
MetHis: 0.0 ± 0.0
MetIle: 0.547 ± 0.347
MetLys: 1.642 ± 0.723
MetLeu: 2.737 ± 1.024
MetMet: 0.0 ± 0.0
MetAsn: 1.095 ± 0.429
MetPro: 2.737 ± 0.409
MetGln: 0.547 ± 0.532
MetArg: 2.737 ± 1.054
MetSer: 1.095 ± 0.429
MetThr: 0.547 ± 0.532
MetVal: 0.0 ± 0.0
MetTrp: 1.095 ± 1.063
MetTyr: 1.095 ± 0.762
MetXaa: 0.0 ± 0.0
Asn
AsnAla: 3.284 ± 1.055
AsnCys: 2.189 ± 1.389
AsnAsp: 0.547 ± 0.347
AsnGlu: 2.737 ± 1.308
AsnPhe: 2.189 ± 0.752
AsnGly: 1.095 ± 0.686
AsnHis: 0.547 ± 0.347
AsnIle: 1.642 ± 1.042
AsnLys: 2.189 ± 1.389
AsnLeu: 3.831 ± 1.533
AsnMet: 1.095 ± 0.504
AsnAsn: 0.547 ± 0.347
AsnPro: 2.737 ± 0.813
AsnGln: 3.284 ± 1.576
AsnArg: 2.189 ± 1.004
AsnSer: 2.189 ± 0.628
AsnThr: 3.831 ± 1.613
AsnVal: 3.831 ± 1.034
AsnTrp: 1.642 ± 0.672
AsnTyr: 3.284 ± 1.354
AsnXaa: 0.0 ± 0.0
Pro
ProAla: 2.189 ± 1.108
ProCys: 2.737 ± 0.949
ProAsp: 7.115 ± 1.48
ProGlu: 2.737 ± 1.736
ProPhe: 1.642 ± 0.723
ProGly: 5.473 ± 2.323
ProHis: 0.547 ± 0.347
ProIle: 2.189 ± 0.523
ProLys: 4.379 ± 1.507
ProLeu: 4.379 ± 0.875
ProMet: 1.095 ± 0.429
ProAsn: 0.0 ± 0.0
ProPro: 5.473 ± 1.804
ProGln: 1.642 ± 0.572
ProArg: 2.189 ± 1.108
ProSer: 2.189 ± 1.118
ProThr: 3.831 ± 1.292
ProVal: 3.284 ± 1.803
ProTrp: 1.095 ± 0.762
ProTyr: 0.547 ± 0.532
ProXaa: 0.0 ± 0.0
Gln
GlnAla: 4.926 ± 1.36
GlnCys: 0.547 ± 0.347
GlnAsp: 2.737 ± 1.139
GlnGlu: 0.0 ± 0.0
GlnPhe: 3.831 ± 1.819
GlnGly: 2.189 ± 1.417
GlnHis: 1.095 ± 1.138
GlnIle: 2.737 ± 0.959
GlnLys: 3.284 ± 1.234
GlnLeu: 1.095 ± 0.504
GlnMet: 3.284 ± 0.915
GlnAsn: 1.095 ± 1.063
GlnPro: 1.095 ± 0.429
GlnGln: 3.831 ± 1.774
GlnArg: 2.737 ± 0.723
GlnSer: 2.737 ± 0.409
GlnThr: 1.642 ± 0.48
GlnVal: 2.737 ± 1.233
GlnTrp: 0.0 ± 0.0
GlnTyr: 2.737 ± 0.41
GlnXaa: 0.0 ± 0.0
Arg
ArgAla: 0.547 ± 0.347
ArgCys: 1.095 ± 0.55
ArgAsp: 4.379 ± 0.616
ArgGlu: 2.737 ± 1.394
ArgPhe: 2.737 ± 0.798
ArgGly: 4.379 ± 1.046
ArgHis: 0.547 ± 0.347
ArgIle: 2.737 ± 1.456
ArgLys: 4.379 ± 1.505
ArgLeu: 3.284 ± 0.857
ArgMet: 1.642 ± 0.788
ArgAsn: 2.189 ± 0.878
ArgPro: 0.0 ± 0.0
ArgGln: 2.189 ± 1.307
ArgArg: 6.568 ± 1.722
ArgSer: 0.547 ± 0.532
ArgThr: 1.095 ± 0.429
ArgVal: 3.284 ± 0.389
ArgTrp: 1.095 ± 0.762
ArgTyr: 3.284 ± 1.576
ArgXaa: 0.0 ± 0.0
Ser
SerAla: 2.737 ± 0.687
SerCys: 3.284 ± 1.354
SerAsp: 2.737 ± 0.949
SerGlu: 4.926 ± 1.959
SerPhe: 6.568 ± 2.485
SerGly: 4.379 ± 1.308
SerHis: 0.547 ± 0.347
SerIle: 3.284 ± 0.902
SerLys: 3.284 ± 1.287
SerLeu: 9.305 ± 1.814
SerMet: 0.547 ± 0.347
SerAsn: 2.737 ± 1.024
SerPro: 2.737 ± 1.736
SerGln: 1.095 ± 0.429
SerArg: 1.095 ± 0.429
SerSer: 9.852 ± 1.99
SerThr: 3.831 ± 1.819
SerVal: 3.284 ± 1.722
SerTrp: 0.0 ± 0.0
SerTyr: 3.831 ± 1.599
SerXaa: 0.0 ± 0.0
Thr
ThrAla: 3.831 ± 2.43
ThrCys: 1.642 ± 0.572
ThrAsp: 2.189 ± 0.858
ThrGlu: 3.284 ± 1.254
ThrPhe: 0.547 ± 0.525
ThrGly: 2.189 ± 0.878
ThrHis: 0.547 ± 0.347
ThrIle: 0.547 ± 0.532
ThrLys: 2.189 ± 0.628
ThrLeu: 4.379 ± 1.297
ThrMet: 0.0 ± 0.0
ThrAsn: 3.831 ± 1.847
ThrPro: 6.568 ± 0.628
ThrGln: 1.642 ± 0.723
ThrArg: 1.642 ± 0.902
ThrSer: 6.021 ± 1.059
ThrThr: 5.473 ± 1.899
ThrVal: 6.021 ± 1.743
ThrTrp: 1.095 ± 0.762
ThrTyr: 1.642 ± 0.672
ThrXaa: 0.0 ± 0.0
Val
ValAla: 2.189 ± 0.992
ValCys: 1.642 ± 1.064
ValAsp: 0.547 ± 0.525
ValGlu: 7.115 ± 2.948
ValPhe: 1.095 ± 0.504
ValGly: 2.189 ± 1.599
ValHis: 2.737 ± 0.409
ValIle: 3.284 ± 1.815
ValLys: 4.379 ± 1.256
ValLeu: 7.115 ± 1.956
ValMet: 1.095 ± 0.429
ValAsn: 3.284 ± 0.977
ValPro: 2.189 ± 1.216
ValGln: 2.189 ± 1.523
ValArg: 3.284 ± 0.902
ValSer: 3.284 ± 0.886
ValThr: 6.568 ± 2.722
ValVal: 4.926 ± 0.973
ValTrp: 0.547 ± 0.569
ValTyr: 1.642 ± 0.788
ValXaa: 0.0 ± 0.0
Trp
TrpAla: 0.0 ± 0.0
TrpCys: 0.547 ± 0.532
TrpAsp: 2.737 ± 1.574
TrpGlu: 0.547 ± 0.532
TrpPhe: 0.0 ± 0.0
TrpGly: 0.547 ± 0.569
TrpHis: 0.0 ± 0.0
TrpIle: 0.0 ± 0.0
TrpLys: 1.095 ± 0.55
TrpLeu: 0.0 ± 0.0
TrpMet: 1.095 ± 0.762
TrpAsn: 0.547 ± 0.347
TrpPro: 1.095 ± 0.762
TrpGln: 1.095 ± 0.55
TrpArg: 1.095 ± 0.762
TrpSer: 0.0 ± 0.0
TrpThr: 0.0 ± 0.0
TrpVal: 1.095 ± 0.762
TrpTrp: 1.095 ± 0.55
TrpTyr: 2.189 ± 0.992
TrpXaa: 0.0 ± 0.0
Tyr
TyrAla: 3.831 ± 1.669
TyrCys: 0.0 ± 0.0
TyrAsp: 1.095 ± 0.762
TyrGlu: 2.737 ± 0.949
TyrPhe: 1.095 ± 1.063
TyrGly: 4.379 ± 0.797
TyrHis: 1.642 ± 0.627
TyrIle: 1.095 ± 0.762
TyrLys: 2.189 ± 0.742
TyrLeu: 2.189 ± 1.389
TyrMet: 1.642 ± 0.981
TyrAsn: 3.284 ± 1.234
TyrPro: 1.642 ± 0.902
TyrGln: 0.547 ± 0.347
TyrArg: 3.831 ± 2.186
TyrSer: 2.737 ± 0.798
TyrThr: 2.189 ± 0.48
TyrVal: 1.642 ± 0.906
TyrTrp: 0.547 ± 0.347
TyrTyr: 2.189 ± 1.523
TyrXaa: 0.0 ± 0.0
Xaa
XaaAla: 0.0 ± 0.0
XaaCys: 0.0 ± 0.0
XaaAsp: 0.0 ± 0.0
XaaGlu: 0.0 ± 0.0
XaaPhe: 0.0 ± 0.0
XaaGly: 0.0 ± 0.0
XaaHis: 0.0 ± 0.0
XaaIle: 0.0 ± 0.0
XaaLys: 0.0 ± 0.0
XaaLeu: 0.0 ± 0.0
XaaMet: 0.0 ± 0.0
XaaAsn: 0.0 ± 0.0
XaaPro: 0.0 ± 0.0
XaaGln: 0.0 ± 0.0
XaaArg: 0.0 ± 0.0
XaaSer: 0.0 ± 0.0
XaaThr: 0.0 ± 0.0
XaaVal: 0.0 ± 0.0
XaaTrp: 0.0 ± 0.0
XaaTyr: 0.0 ± 0.0
XaaXaa: 0.0 ± 0.0
Statistics based on 5 proteins (1828 amino acids)

Note: The error has been estimated by bootstrapping (×100) at the protein level.
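
A protein-level bootstrap of this kind could look roughly like the sketch below. It rests on several assumptions that the note does not confirm: whole proteins are resampled with replacement, 100 resamples are drawn, and the reported error is the standard deviation of the resampled estimates. All function names are hypothetical, and sequences are assumed to use one-letter codes.

```python
import random
import statistics


def pair_per_mille(proteins, pair):
    """Frequency of one dipeptide (e.g. "AK") in per mille over all proteins."""
    hits = total = 0
    for seq in proteins:
        for i in range(len(seq) - 1):
            total += 1
            if seq[i:i + 2] == pair:
                hits += 1
    return 1000.0 * hits / total if total else 0.0


def bootstrap_error(proteins, pair, n_resamples=100, seed=0):
    """Standard deviation of the frequency over protein-level resamples."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    estimates = []
    for _ in range(n_resamples):
        # Draw as many proteins as in the proteome, with replacement.
        sample = rng.choices(proteins, k=len(proteins))
        estimates.append(pair_per_mille(sample, pair))
    return statistics.pstdev(estimates)


proteome = ["MAKAKL", "MLLSSA", "MAKW"]
print(pair_per_mille(proteome, "AK"), bootstrap_error(proteome, "AK"))
```

Under these assumptions a dipeptide that never occurs gives the same estimate in every resample, so its error is zero, which is consistent with the 0.0 ± 0.0 entries in the table.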

The dipeptide statistics above (among other statistics for this proteome) can be downloaded from this CSV file
See this proteome in: uniprot_link
Proteome-pI is available under Creative Commons Attribution-NoDerivs license, for more details see here

Reference: Kozlowski LP. Proteome-pI 2.0: Proteome Isoelectric Point Database Update. Nucleic Acids Res. 2021, doi: 10.1093/nar/gkab944
Contact: Lukasz P. Kozlowski