Isoelectric point predictions for the UniProt database (219M protein sequences)

.
|-- README.txt
|-- uniprot_trembl_flat.00.IPC2.fasta
|-- ...
|-- uniprot_trembl_flat.13.IPC2.fasta

Due to the size of the whole database, the original file has been divided into
smaller files with 16.5M sequences each.

You can merge the files with 'cat', for instance:

    cat uniprot_trembl_flat.??.IPC2.fasta > uniprot_trembl.IPC2.fasta

Note: The complete file is ~120GB, so before downloading and merging make sure
that you have at least 0.3TB of free disk space.

===========================================================================================

Alternative mirror of the files:
http://isoelectricpointdb2.mimuw.edu.pl/extra_large_db/uniprot/

The following commands should also work:

    mkdir uniprot/; cd uniprot;

    # compressed with 7zip
    wget http://isoelectricpointdb2.mimuw.edu.pl/extra_large_db/uniprot/README.txt;
    wget http://isoelectricpointdb2.mimuw.edu.pl/extra_large_db/uniprot/uniprot_trembl_flat.00.IPC2.fasta.7z;
    wget http://isoelectricpointdb2.mimuw.edu.pl/extra_large_db/uniprot/uniprot_trembl_flat.01.IPC2.fasta.7z;
    wget http://isoelectricpointdb2.mimuw.edu.pl/extra_large_db/uniprot/uniprot_trembl_flat.02.IPC2.fasta.7z;
    wget http://isoelectricpointdb2.mimuw.edu.pl/extra_large_db/uniprot/uniprot_trembl_flat.03.IPC2.fasta.7z;
    wget http://isoelectricpointdb2.mimuw.edu.pl/extra_large_db/uniprot/uniprot_trembl_flat.04.IPC2.fasta.7z;
    wget http://isoelectricpointdb2.mimuw.edu.pl/extra_large_db/uniprot/uniprot_trembl_flat.05.IPC2.fasta.7z;
    wget http://isoelectricpointdb2.mimuw.edu.pl/extra_large_db/uniprot/uniprot_trembl_flat.06.IPC2.fasta.7z;
    wget http://isoelectricpointdb2.mimuw.edu.pl/extra_large_db/uniprot/uniprot_trembl_flat.07.IPC2.fasta.7z;
    wget http://isoelectricpointdb2.mimuw.edu.pl/extra_large_db/uniprot/uniprot_trembl_flat.08.IPC2.fasta.7z;
    wget http://isoelectricpointdb2.mimuw.edu.pl/extra_large_db/uniprot/uniprot_trembl_flat.09.IPC2.fasta.7z;
    wget http://isoelectricpointdb2.mimuw.edu.pl/extra_large_db/uniprot/uniprot_trembl_flat.10.IPC2.fasta.7z;
    wget http://isoelectricpointdb2.mimuw.edu.pl/extra_large_db/uniprot/uniprot_trembl_flat.11.IPC2.fasta.7z;
    wget http://isoelectricpointdb2.mimuw.edu.pl/extra_large_db/uniprot/uniprot_trembl_flat.12.IPC2.fasta.7z;
    wget http://isoelectricpointdb2.mimuw.edu.pl/extra_large_db/uniprot/uniprot_trembl_flat.13.IPC2.fasta.7z;
    # extract every downloaded part, then merge the parts in order
    for file in *fasta.7z; do 7z e "$file"; done;
    cat uniprot_trembl_flat.??.IPC2.fasta > uniprot_trembl_flat.IPC2.fasta;

    # or as single file
    wget http://isoelectricpointdb2.mimuw.edu.pl/extra_large_db/uniprot/uniprot_trembl_flat.IPC2.fasta.7z;
    7z e uniprot_trembl_flat.IPC2.fasta.7z;

    # alternatively, compressed with zip
    wget http://isoelectricpointdb2.mimuw.edu.pl/extra_large_db/uniprot/uniprot_trembl_flat.IPC2.fasta.zip;
    wget http://isoelectricpointdb2.mimuw.edu.pl/extra_large_db/uniprot/uniprot_trembl_flat.IPC2.fasta.z01;
    wget http://isoelectricpointdb2.mimuw.edu.pl/extra_large_db/uniprot/uniprot_trembl_flat.IPC2.fasta.z02;
    wget http://isoelectricpointdb2.mimuw.edu.pl/extra_large_db/uniprot/uniprot_trembl_flat.IPC2.fasta.z03;
    wget http://isoelectricpointdb2.mimuw.edu.pl/extra_large_db/uniprot/uniprot_trembl_flat.IPC2.fasta.z04;
    wget http://isoelectricpointdb2.mimuw.edu.pl/extra_large_db/uniprot/uniprot_trembl_flat.IPC2.fasta.z05;
    wget http://isoelectricpointdb2.mimuw.edu.pl/extra_large_db/uniprot/uniprot_trembl_flat.IPC2.fasta.z06;
    wget http://isoelectricpointdb2.mimuw.edu.pl/extra_large_db/uniprot/uniprot_trembl_flat.IPC2.fasta.z07;
    wget http://isoelectricpointdb2.mimuw.edu.pl/extra_large_db/uniprot/uniprot_trembl_flat.IPC2.fasta.z08;
    wget http://isoelectricpointdb2.mimuw.edu.pl/extra_large_db/uniprot/uniprot_trembl_flat.IPC2.fasta.z09;
    wget http://isoelectricpointdb2.mimuw.edu.pl/extra_large_db/uniprot/uniprot_trembl_flat.IPC2.fasta.z10;
    wget http://isoelectricpointdb2.mimuw.edu.pl/extra_large_db/uniprot/uniprot_trembl_flat.IPC2.fasta.z11;
    wget http://isoelectricpointdb2.mimuw.edu.pl/extra_large_db/uniprot/uniprot_trembl_flat.IPC2.fasta.z12;
    wget http://isoelectricpointdb2.mimuw.edu.pl/extra_large_db/uniprot/uniprot_trembl_flat.IPC2.fasta.z13;
    7z e uniprot_trembl_flat.IPC2.fasta.zip;

===========================================================================================

Contains predictions:
Bjellqvist,DTASelect,Dawson,EMBOSS,Grimsley,
IPC2_peptide,IPC2_protein,IPC_peptide,IPC_protein,Lehninger,Nozaki,
Patrickios,ProMoST,Rodwell,Sillero,Solomon,Thurlkill,Toseland,Wikipedia

===========================================================================================

References:

Kozlowski LP (2022) Proteome-pI 2.0: proteome isoelectric point database update.
Nucleic Acids Res. (Database Issue) 50 (D1): D1535-D1540. doi: 10.1093/nar/gkab944

Kozlowski LP (2021) IPC 2.0 - prediction of isoelectric point and pKa dissociation constants.
Nucleic Acids Res. 49 (W1): W285-W292. doi: 10.1093/nar/gkab295

__author__ = "Lukasz Pawel Kozlowski"
__email__ = "lukaszkozlowski.lpk@gmail.com"
__copyrights__ = "Lukasz Pawel Kozlowski"
__website__ = "http://isoelectricpointdb2.org"
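
===========================================================================================

Optional: the per-part 7z downloads listed in this README could also be
written as a loop. The sketch below is only an illustration and is not part of
the original instructions. It assumes the mirror URL given in this README and
the part numbering 00 to 13; the names BASE_URL, part_urls and DOWNLOAD are
made up for this example. Nothing is downloaded unless DOWNLOAD=1 is set.

```shell
#!/bin/sh
# Sketch only: BASE_URL is the mirror named in this README; part_urls and
# DOWNLOAD are illustrative names, not part of the original instructions.
BASE_URL="http://isoelectricpointdb2.mimuw.edu.pl/extra_large_db/uniprot"

# Print the URL of every 7z-compressed part (00 through 13), one per line.
part_urls() {
    i=0
    while [ "$i" -le 13 ]; do
        printf '%s/uniprot_trembl_flat.%02d.IPC2.fasta.7z\n' "$BASE_URL" "$i"
        i=$((i + 1))
    done
}

# Download only when explicitly requested, so a dry run is safe.
if [ "${DOWNLOAD:-0}" = "1" ]; then
    part_urls | while read -r url; do
        wget -c "$url"   # -c resumes a partially downloaded file
    done
fi
```

Saved under a name of your choice (for example fetch_parts.sh, a hypothetical
name), it could be run as 'DOWNLOAD=1 sh fetch_parts.sh' inside the uniprot/
directory. The extraction and 'cat' steps from this README then apply unchanged.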