Isoelectric point predictions for nr (NCBI) database (406M protein sequences)

.
|-- nr_flat.00.IPC2.fasta
|-- ...
|-- nr_flat.24.IPC2.fasta
|-- README.txt

Due to the size of the whole database, the original file has been divided into
smaller files with 16.5M sequences each. You can merge the files with 'cat',
for instance:

cat nr_flat.??.IPC2.fasta > nr.IPC2.fasta

Note: The complete file is 232GB, so before downloading and merging make sure
that you have at least 0.5TB of free disk space.

===========================================================================================
Alternative mirror of the files:
http://isoelectricpointdb2.mimuw.edu.pl/extra_large_db/nr/

Should also work:

mkdir nr/; cd nr;
wget http://isoelectricpointdb2.mimuw.edu.pl/extra_large_db/nr/README.txt;
wget http://isoelectricpointdb2.mimuw.edu.pl/extra_large_db/nr/nr_flat.00.IPC2.fasta.7z;
wget http://isoelectricpointdb2.mimuw.edu.pl/extra_large_db/nr/nr_flat.01.IPC2.fasta.7z;
wget http://isoelectricpointdb2.mimuw.edu.pl/extra_large_db/nr/nr_flat.02.IPC2.fasta.7z;
wget http://isoelectricpointdb2.mimuw.edu.pl/extra_large_db/nr/nr_flat.03.IPC2.fasta.7z;
wget http://isoelectricpointdb2.mimuw.edu.pl/extra_large_db/nr/nr_flat.04.IPC2.fasta.7z;
wget http://isoelectricpointdb2.mimuw.edu.pl/extra_large_db/nr/nr_flat.05.IPC2.fasta.7z;
wget http://isoelectricpointdb2.mimuw.edu.pl/extra_large_db/nr/nr_flat.06.IPC2.fasta.7z;
wget http://isoelectricpointdb2.mimuw.edu.pl/extra_large_db/nr/nr_flat.07.IPC2.fasta.7z;
wget http://isoelectricpointdb2.mimuw.edu.pl/extra_large_db/nr/nr_flat.08.IPC2.fasta.7z;
wget http://isoelectricpointdb2.mimuw.edu.pl/extra_large_db/nr/nr_flat.09.IPC2.fasta.7z;
wget http://isoelectricpointdb2.mimuw.edu.pl/extra_large_db/nr/nr_flat.10.IPC2.fasta.7z;
wget http://isoelectricpointdb2.mimuw.edu.pl/extra_large_db/nr/nr_flat.11.IPC2.fasta.7z;
wget http://isoelectricpointdb2.mimuw.edu.pl/extra_large_db/nr/nr_flat.12.IPC2.fasta.7z;
wget http://isoelectricpointdb2.mimuw.edu.pl/extra_large_db/nr/nr_flat.13.IPC2.fasta.7z;
wget http://isoelectricpointdb2.mimuw.edu.pl/extra_large_db/nr/nr_flat.14.IPC2.fasta.7z;
wget http://isoelectricpointdb2.mimuw.edu.pl/extra_large_db/nr/nr_flat.15.IPC2.fasta.7z;
wget http://isoelectricpointdb2.mimuw.edu.pl/extra_large_db/nr/nr_flat.16.IPC2.fasta.7z;
wget http://isoelectricpointdb2.mimuw.edu.pl/extra_large_db/nr/nr_flat.17.IPC2.fasta.7z;
wget http://isoelectricpointdb2.mimuw.edu.pl/extra_large_db/nr/nr_flat.18.IPC2.fasta.7z;
wget http://isoelectricpointdb2.mimuw.edu.pl/extra_large_db/nr/nr_flat.19.IPC2.fasta.7z;
wget http://isoelectricpointdb2.mimuw.edu.pl/extra_large_db/nr/nr_flat.20.IPC2.fasta.7z;
wget http://isoelectricpointdb2.mimuw.edu.pl/extra_large_db/nr/nr_flat.21.IPC2.fasta.7z;
wget http://isoelectricpointdb2.mimuw.edu.pl/extra_large_db/nr/nr_flat.22.IPC2.fasta.7z;
wget http://isoelectricpointdb2.mimuw.edu.pl/extra_large_db/nr/nr_flat.23.IPC2.fasta.7z;
wget http://isoelectricpointdb2.mimuw.edu.pl/extra_large_db/nr/nr_flat.24.IPC2.fasta.7z;
for file in *fasta.7z; do 7z e "$file"; done;
cat nr_flat.??.IPC2.fasta > nr.IPC2.fasta;

# or as single file
wget http://isoelectricpointdb2.mimuw.edu.pl/extra_large_db/uniprot/nr_flat.IPC2.fasta.7z;
7z e nr_flat.IPC2.fasta.7z;

# alternatively, compressed with zip
wget http://isoelectricpointdb2.mimuw.edu.pl/extra_large_db/nr/nr_flat.IPC2.fasta.zip;
wget http://isoelectricpointdb2.mimuw.edu.pl/extra_large_db/nr/nr_flat.IPC2.fasta.z01;
wget http://isoelectricpointdb2.mimuw.edu.pl/extra_large_db/nr/nr_flat.IPC2.fasta.z02;
wget http://isoelectricpointdb2.mimuw.edu.pl/extra_large_db/nr/nr_flat.IPC2.fasta.z03;
wget http://isoelectricpointdb2.mimuw.edu.pl/extra_large_db/nr/nr_flat.IPC2.fasta.z04;
wget http://isoelectricpointdb2.mimuw.edu.pl/extra_large_db/nr/nr_flat.IPC2.fasta.z05;
wget http://isoelectricpointdb2.mimuw.edu.pl/extra_large_db/nr/nr_flat.IPC2.fasta.z06;
wget http://isoelectricpointdb2.mimuw.edu.pl/extra_large_db/nr/nr_flat.IPC2.fasta.z07;
wget http://isoelectricpointdb2.mimuw.edu.pl/extra_large_db/nr/nr_flat.IPC2.fasta.z08;
wget http://isoelectricpointdb2.mimuw.edu.pl/extra_large_db/nr/nr_flat.IPC2.fasta.z09;
wget http://isoelectricpointdb2.mimuw.edu.pl/extra_large_db/nr/nr_flat.IPC2.fasta.z10;
wget http://isoelectricpointdb2.mimuw.edu.pl/extra_large_db/nr/nr_flat.IPC2.fasta.z11;
wget http://isoelectricpointdb2.mimuw.edu.pl/extra_large_db/nr/nr_flat.IPC2.fasta.z12;
wget http://isoelectricpointdb2.mimuw.edu.pl/extra_large_db/nr/nr_flat.IPC2.fasta.z13;
wget http://isoelectricpointdb2.mimuw.edu.pl/extra_large_db/nr/nr_flat.IPC2.fasta.z14;
wget http://isoelectricpointdb2.mimuw.edu.pl/extra_large_db/nr/nr_flat.IPC2.fasta.z15;
wget http://isoelectricpointdb2.mimuw.edu.pl/extra_large_db/nr/nr_flat.IPC2.fasta.z16;
wget http://isoelectricpointdb2.mimuw.edu.pl/extra_large_db/nr/nr_flat.IPC2.fasta.z17;
wget http://isoelectricpointdb2.mimuw.edu.pl/extra_large_db/nr/nr_flat.IPC2.fasta.z18;
wget http://isoelectricpointdb2.mimuw.edu.pl/extra_large_db/nr/nr_flat.IPC2.fasta.z19;
wget http://isoelectricpointdb2.mimuw.edu.pl/extra_large_db/nr/nr_flat.IPC2.fasta.z20;
wget http://isoelectricpointdb2.mimuw.edu.pl/extra_large_db/nr/nr_flat.IPC2.fasta.z21;
wget http://isoelectricpointdb2.mimuw.edu.pl/extra_large_db/nr/nr_flat.IPC2.fasta.z22;
wget http://isoelectricpointdb2.mimuw.edu.pl/extra_large_db/nr/nr_flat.IPC2.fasta.z23;
wget http://isoelectricpointdb2.mimuw.edu.pl/extra_large_db/nr/nr_flat.IPC2.fasta.z24;
wget http://isoelectricpointdb2.mimuw.edu.pl/extra_large_db/nr/nr_flat.IPC2.fasta.z25;
wget http://isoelectricpointdb2.mimuw.edu.pl/extra_large_db/nr/nr_flat.IPC2.fasta.z26;
wget http://isoelectricpointdb2.mimuw.edu.pl/extra_large_db/nr/nr_flat.IPC2.fasta.z27;
wget http://isoelectricpointdb2.mimuw.edu.pl/extra_large_db/nr/nr_flat.IPC2.fasta.z28;
7z e nr_flat.IPC2.fasta.zip;
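
The download and merge steps can also be scripted. The following is a minimal
sketch, not part of the original distribution: the function names (part_urls,
has_free_space, count_records) are hypothetical helpers, the 500 GB default is
taken from the 0.5TB advice in this README, and the only assumption about the
data files is that each FASTA record starts with a '>' header line.

```shell
#!/usr/bin/env bash
# Sketch only: hypothetical helpers for fetching and checking the split files.
BASE_URL="http://isoelectricpointdb2.mimuw.edu.pl/extra_large_db/nr"

# Print the URL of each of the 25 parts (00..24), one per line.
part_urls() {
    local i
    for i in $(seq 0 24); do
        printf '%s/nr_flat.%02d.IPC2.fasta.7z\n' "$BASE_URL" "$i"
    done
}

# Succeed if directory $1 has at least $2 GB available (default: 500 GB).
has_free_space() {
    local dir="${1:-.}" need_gb="${2:-500}" avail_kb
    # df -Pk prints POSIX-format output in 1 KB blocks; column 4 is "Available".
    avail_kb=$(df -Pk "$dir" | awk 'NR == 2 { print $4 }')
    [ "$avail_kb" -ge $((need_gb * 1024 * 1024)) ]
}

# Count FASTA records (lines starting with '>') across the given files.
count_records() {
    cat "$@" | grep -c '^>'
}

# Possible usage (not executed here):
#   has_free_space . 500 && part_urls | wget -c -i -
#   for file in *fasta.7z; do 7z e "$file"; done
#   count_records nr_flat.??.IPC2.fasta
```

'wget -c' resumes partially downloaded files and '-i -' reads the URL list from
standard input, which helps when a transfer of this size is interrupted. After
extraction, the record count should be on the order of the 406M sequences
stated at the top of this file; a much smaller number suggests a missing or
truncated part.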
===========================================================================================
Contains predictions:

Bjellqvist,DTASelect,Dawson,EMBOSS,Grimsley,
IPC2_peptide,IPC2_protein,IPC_peptide,IPC_protein,Lehninger,Nozaki,
Patrickios,ProMoST,Rodwell,Sillero,Solomon,Thurlkill,Toseland,Wikipedia

===========================================================================================
References:

Kozlowski LP (2022) Proteome-pI 2.0: proteome isoelectric point database update.
Nucleic Acids Res. (Database Issue) 50 (D1): D1535-D1540. doi: 10.1093/nar/gkab944

Kozlowski LP (2021) IPC 2.0 - prediction of isoelectric point and pKa dissociation constants.
Nucleic Acids Res. 49 (W1): W285-W292. doi: 10.1093/nar/gkab295

__author__ = "Lukasz Pawel Kozlowski"
__email__ = "lukaszkozlowski.lpk@gmail.com"
__copyrights__ = "Lukasz Pawel Kozlowski"
__website__ = "http://isoelectricpointdb2.org"