> When I wrote to the BioXML mailing list about an XML database, Guy Hulbert gave
> this reply:
>
> > Because this fills your database with blobs, so why use a database at all ?
> > You'd be better off, performance-wise, storing the XML docs in the file system
> > and just use the database to manage the file store (I worked as a sysadmin for
> > a product that did just this for scanned images).

This confuses me a bit. Are you saying that you don't need a DB, or that you
don't want to store the contents of the DB in a DB-specific manner?
( I believe that you want to be able to send queries to the DB such as "give me
all interfaces/objects/applications that can perform this operation upon this
kind of data"? Isn't that a DB? )

> This causes me think about the operation of 'container loci'. Recall that they
> can contain other loci and act as a database. Well, I'm not a database expert,
> but I think we want a system that will manage loci as files on the filesystem.
> This is why: Loci will not be of any particular data format (as I've tried to
> stress recently). This will avoid any substantial 'import and translation'
> function that will require the Loci system to (1) spend time and space on large
> datasets and (2) lock Loci into a one-of-a-kind data format. But it will also
> give us a neat way of 'opening' loci: The 'container loci' can merely be set to
> read from/write to a certain directory on the filesystem, and the directories
> will serve to separate locus categories. So, for example, the user can put all
> GenBank docs for Dictyostelium under the directory

Hmm. There is a Gnome project called Gnome-DB ( you didn't expect that name,
did you? :)

http://www.chez.com/rmoya/gnome-db/index.html

It has a DB engine that uses raw files. I have not examined it yet ( beyond
reading their homepage ), so I don't know much about it, but it might be worth
a look.

<snip>

> And will serve as 'dead storage' for loci.
> But if we really want to solve the
> '2 terabyte document problem' (for genome analyses, as an example) that Jim
> Freeman brought up to me a few weeks ago, we can't duplicate everything that
> goes from dead storage to active use. Therefore, loci (treated as files) will
> have to either remain in place or be moved to another directory, and NOT
> duplicated.
>
> I'd like to get some feedback about this from you guys.

If you pass around CORBA objects that "represent" the actual file, you will
not need to duplicate the file. ( GNOME::Stream is a very good candidate! :)
In either case, passing around a file "reference", instead of reading directly
from the file, seems to be the obvious solution.

// Liss

ps. GNOME::Storage / GNOME::Stream live inside the bonobo package, in case you
didn't find them. There are a lot of goodies in that package that might be of
interest to loci.
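
To make the file "reference" idea a bit more concrete, here is a minimal
sketch in plain Python. It is not CORBA and it is not the bonobo
GNOME::Stream interface; the names LocusRef, ContainerLocus, activate, and the
directory and file names are all hypothetical, made up only for illustration.
It assumes a container locus is simply bound to a directory, that
subdirectories stand for locus categories, and that a locus going from dead
storage to active use is moved rather than copied.

```python
import shutil
import tempfile
from dataclasses import dataclass
from pathlib import Path
from typing import BinaryIO, Iterator, List, Tuple


@dataclass(frozen=True)
class LocusRef:
    """Hypothetical handle: it names a file but carries none of its data."""
    path: Path

    def open_stream(self) -> BinaryIO:
        # Bytes are read only on demand, straight from the file in place.
        return self.path.open("rb")


class ContainerLocus:
    """Hypothetical container locus bound to one directory."""

    def __init__(self, root: Path) -> None:
        self.root = root

    def categories(self) -> List[str]:
        # Each subdirectory is treated as one locus category.
        return sorted(p.name for p in self.root.iterdir() if p.is_dir())

    def loci(self, category: str) -> Iterator[LocusRef]:
        # Hand out references, never file contents.
        for p in sorted((self.root / category).iterdir()):
            if p.is_file():
                yield LocusRef(p)

    def activate(self, ref: LocusRef, active_dir: Path) -> LocusRef:
        # Move, do not duplicate. On the same filesystem shutil.move is a
        # rename; across filesystems it falls back to copy-then-delete.
        active_dir.mkdir(parents=True, exist_ok=True)
        target = active_dir / ref.path.name
        shutil.move(str(ref.path), str(target))
        return LocusRef(target)


def demo() -> Tuple[List[str], bool, bytes]:
    with tempfile.TemporaryDirectory() as tmp:
        store = Path(tmp) / "dead_storage"          # assumed layout
        (store / "genbank").mkdir(parents=True)     # assumed category name
        (store / "genbank" / "example.gb").write_bytes(b"LOCUS example\n")

        container = ContainerLocus(store)
        ref = next(container.loci("genbank"))
        moved = container.activate(ref, Path(tmp) / "active")
        with moved.open_stream() as stream:
            first_line = stream.readline()
        # The old location is gone, so only one copy of the data exists.
        return container.categories(), ref.path.exists(), first_line


if __name__ == "__main__":
    print(demo())
```

The point of the sketch is only that consumers hold a small handle and open a
stream when they actually need bytes, so a very large document is never copied
just because it changed hands. A real CORBA object reference would play the
role of LocusRef across process boundaries.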