Hi Chris,

Can you pull the sequences from ftp://ftp.ncbi.nih.gov/refseq/ originally and then get updates daily from ftp://ftp.ncbi.nih.gov/refseq/daily/? I have downloaded releases without too much trouble. I have never dealt with the daily updates, but I would think it would be fairly easy to get them and then sort the sequences into the appropriate (mouse, human, whatever) bins.

I have never looked at dbSNP, but ftp://ftp.ncbi.nih.gov/snp/database/README.create_local_dbSNP.txt looks helpful.

Ethan

-----Original Message-----
From: biodevelopers-bounces+ethan.strauss=promega.com at bioinformatics.org [mailto:biodevelopers-bounces+ethan.strauss=promega.com at bioinformatics.org] On Behalf Of Christopher Dwan
Sent: Wednesday, July 05, 2006 1:25 PM
To: biodevelopers at bioinformatics.org
Subject: [Biodevelopers] Batch download of RefSeq or dbSNP?

I'm writing some scripts to download data. Specifically, I need FASTA versions of:

* All the "finished" mouse proteins in refseq
* All the "finished" human proteins in refseq
* All the sequences in dbSNP

Ideally, my script would produce updated versions of these datasets nightly or so. I would prefer to do this without spamming the NCBI servers (or my bandwidth providers) too much.

I've messed around with the bioperl Bio::DB routines enough to get really confused by ENTREZ queries. I've also looked at the FASTA source available through FTP from NCBI, and that confused me more.

How do smart people do this sort of thing these days?

-Chris Dwan
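The "sort the sequences into the appropriate bins" step suggested in the reply at the top of this message could look roughly like the Python sketch below. It is only an illustration under several assumptions that are not from the thread: that the organism name appears in square brackets at the end of each protein defline, that "finished" can be read as curated NP_ accessions (as opposed to predicted XP_ ones), and that deflines are either plain ("NP_... description") or in the older "gi|...|ref|NP_...|" style. The BINS table, the function names, and the sample accessions are all hypothetical.

```python
import re
from collections import defaultdict

# Hypothetical mapping from the organism name in the defline to a bin name.
BINS = {"Homo sapiens": "human", "Mus musculus": "mouse"}

# Assumption: the organism is the last [bracketed] field of the defline.
ORGANISM_RE = re.compile(r"\[([^\[\]]+)\]\s*$")


def read_fasta(lines):
    """Yield (header, sequence) pairs from an iterable of FASTA lines."""
    header, chunks = None, []
    for line in lines:
        line = line.rstrip("\n")
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None and line:
            chunks.append(line.strip())
    if header is not None:
        yield header, "".join(chunks)


def accession(header):
    """Extract the accession from a plain or 'gi|..|ref|..|' style defline."""
    first = header.split(None, 1)[0] if header.strip() else ""
    parts = first.split("|")
    if "ref" in parts and parts.index("ref") + 1 < len(parts):
        return parts[parts.index("ref") + 1]
    return first


def bin_records(lines, curated_only=True):
    """Group FASTA records by organism bin; unknown organisms go to 'other'."""
    bins = defaultdict(list)
    for header, seq in read_fasta(lines):
        # Assumption: "finished" proteins are the ones with NP_ accessions.
        if curated_only and not accession(header).startswith("NP_"):
            continue
        match = ORGANISM_RE.search(header)
        name = BINS.get(match.group(1)) if match else None
        bins[name or "other"].append((header, seq))
    return dict(bins)
```

Since bin_records accepts any iterable of lines, it can be fed an open file handle for a downloaded release file or for a daily update file once it has been decompressed; the download itself is left out here.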