On Thu, 31 Jul 2003, Patrick McConnell wrote:

> You are better off using SAX instead of DOM. What we do is filter Hsps and
> Hits using a streaming technology (such as SAX), and then we parse the rest
> with DOM. But, if you need all the Hsps and Hits, then you must use SAX or
> load balancing.

Yup, cheers, SAX is the way forward.

> Load balance based on file size. When your threads (or processes) ask for
> another document to parse, you must give them one based on the size of the
> documents the other threads are parsing. But I feel like the large
> documents are still going to dominate the CPU time, and thus you will only
> be left with a bunch of large documents in the end.

I thought about this too, but I hate anything complex ;)

I found a really neat way to do massive dumps to MySQL without incurring
either of the normal overheads: increasingly slow index updates, or (very)
large prepared files for LOAD DATA INFILE ...

Simply LOAD DATA INFILE from a named pipe...

All is perfect, and multiple processors (with a common file system) can
cooperate like a charm. I found this solution in a MySQL bug report.

Thanks again,
Dan.

> -Patrick
>
> Dan Bolser <dmb at mrc-dunn.cam.ac.uk>@bioinformatics.org on 07/31/2003
> 12:02:17 PM
>
> Please respond to biodevelopers at bioinformatics.org
>
> Sent by: biodevelopers-admin at bioinformatics.org
>
> To: biodevelopers at bioinformatics.org
> cc:
>
> Subject: [Biodevelopers] XML for huge DB?
>
> Hello,
>
> How can I use XML efficiently to parse multiple blast results
> files?
>
> I want to parse them on a multi processor environment, without
> hitting the system memory limit.
>
> This is likely to happen, as big files take the most time, so the
> processes tend to work on big files at the same time, leading
> to a system memory outage....
>
> Cheers,
> Dan.
> _______________________________________________
> Biodevelopers mailing list
> Biodevelopers at bioinformatics.org
> https://bioinformatics.org/mailman/listinfo/biodevelopers
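
For anyone reading this thread later, here is a minimal sketch of the SAX filtering idea discussed above. It assumes Python's standard xml.sax module and the NCBI BLAST XML element names Hit_id and Hsp_evalue. The HspFilter class, the filter_hsps function, the E-value cutoff and the sample data are all made up for illustration; this is not the code either poster actually used.

```python
import io
import xml.sax


class HspFilter(xml.sax.ContentHandler):
    """Keep (hit_id, evalue) pairs for HSPs at or below an E-value cutoff."""

    def __init__(self, max_evalue):
        super().__init__()
        self.max_evalue = max_evalue  # hypothetical cutoff, choose your own
        self.kept = []
        self._hit_id = None
        self._text = []

    def startElement(self, name, attrs):
        # Start a fresh text buffer for every element.
        self._text = []

    def characters(self, content):
        # SAX may deliver one text node in several chunks.
        self._text.append(content)

    def endElement(self, name):
        value = "".join(self._text).strip()
        self._text = []
        if name == "Hit_id":
            self._hit_id = value
        elif name == "Hsp_evalue" and float(value) <= self.max_evalue:
            self.kept.append((self._hit_id, float(value)))


def filter_hsps(source, max_evalue=1e-5):
    """Stream a BLAST XML file (path or binary stream), never building a tree."""
    handler = HspFilter(max_evalue)
    xml.sax.parse(source, handler)
    return handler.kept


# Tiny invented sample; a real BLAST report has many more elements.
SAMPLE = b"""<BlastOutput>
  <Hit>
    <Hit_id>gi|1</Hit_id>
    <Hit_hsps>
      <Hsp><Hsp_evalue>1e-30</Hsp_evalue></Hsp>
      <Hsp><Hsp_evalue>0.5</Hsp_evalue></Hsp>
    </Hit_hsps>
  </Hit>
  <Hit>
    <Hit_id>gi|2</Hit_id>
    <Hit_hsps>
      <Hsp><Hsp_evalue>2.0</Hsp_evalue></Hsp>
    </Hit_hsps>
  </Hit>
</BlastOutput>"""

kept = filter_hsps(io.BytesIO(SAMPLE), max_evalue=1e-5)
```

Because the handler only holds the current text buffer and the list of kept HSPs, memory use should track the number of HSPs that pass the filter rather than the size of the input file, which is the point of streaming over DOM here.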