On Tue, 2004-03-16 at 17:37, Dan Bolser wrote:
> > Basically the object<->relational mapping is on one-to-one and onto in
> > most cases, so you have to resort to "hacks" like serialization to make
>
> Sorry, do you mean 'is not 1 to 1' ?

s/on one-to-one/not one-to-one/ yes.

> > sure information and state are not lost (my apologies to those who do
> > not consider serialization to be a hack). Objects can be rich and
> > dynamic data structures which can be represented by an XML document to a
> > degree (apart from the code elements), and can better represent dynamic
> > data.
>
> I follow. I guess it is rare that people make large amounts of data
> available via XML (i.e. using XML as a database). The way you describe

Well... there are some folks who are using XML as the database. Some
systems are known to spit out several gigs of XML, which makes parsing
it in the traditional tree modality somewhat hard. This happens often
enough that people write methods to handle the symptom; see the
XML::Twig Perl module. Basically, parsing a tree is easy if it is all in
memory. It gets ... more complicated ... if portions of the tree have to
reside in a secondary storage mechanism.

I look at XML as more of a "portable" way to represent complex data.
RDBMSs are not portable in a binary sense (in most cases I am aware of)
across ABIs. Look at XML as akin to ASN.1. They are not the same, but
they generally serve similar functions. It is, however, somewhat hard to
read binary ASN.1 data and infer the structure from the file.

What is really nice about XML is that it is, for the most part,
programming language and platform independent. I am not sure if the tags
can be Unicode, so it might not be human language independent. The other
nice thing about XML is that the structure of the document maps well
onto the structure of the data it represents.

> sounds like a good use of XML - giving / transporting data about a
> programs internal state.
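
Going back to the multi-gigabyte case for a moment: the usual way around
the "whole tree in memory" problem is to process one record at a time
and throw it away afterwards, which is the general idea behind
XML::Twig. A minimal sketch of that idea in Python follows, using the
standard library's xml.etree.ElementTree.iterparse rather than XML::Twig
itself. The element names (entries, entry, id, name) are made up for
illustration and are not any real database format.

```python
import io
import xml.etree.ElementTree as ET


def iter_records(source, tag):
    """Yield one dict per <tag> element without keeping the whole tree."""
    context = ET.iterparse(source, events=("start", "end"))
    _, root = next(context)  # first event is the start of the root element
    for event, elem in context:
        if event == "end" and elem.tag == tag:
            # The record is complete here, so its children can be read.
            yield {child.tag: (child.text or "").strip() for child in elem}
            # Drop everything parsed so far so memory use stays flat.
            root.clear()


# Hypothetical sample data; a real file would be opened from disk instead.
SAMPLE = b"""<entries>
  <entry><id>P1</id><name>alpha</name></entry>
  <entry><id>P2</id><name>beta</name></entry>
</entries>"""

for record in iter_records(io.BytesIO(SAMPLE), "entry"):
    print(record["id"], record["name"])
```

The trade-off is the one described above: you only ever see one record
at a time, so anything that needs to look across the whole document has
to be accumulated by hand.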
>
> > They generally solve different problems, though there is overlap.
>
> I am still a bit confused. I can't help thinking of dia, which makes
> excellent use of XML to represent diagrams, and so has easy interchange with
> lots of tools - i.e. good use of XML, it would be crazy to run dia off an
> RDB.

To a degree this is correct. If the XML document represented a connected
set of tables, you could map that to an RDBMS. However, it would be hard
to generate the diagram itself from the RDBMS (e.g. it is easy to encode
data in an RDBMS, but hard to encode structure, though searching is
easy). The XML could instead represent a richer non-tabular system, in
which case the XML can take on the necessary structure to represent the
system (e.g. it is easy to encode structure in XML, as well as the data
which resides in the structure, though searching is hard).

> But what is the point in creating biological data in this form, when the
> 'data model' is basically our own concept about the data?

One of these days someone is going to extend Gödel's incompleteness
proof to biological systems.

> Wouldn't a SwissProt RDB be much more sensible than an XML document?

Only if SwissProt never changes format. The whole point of XML is the
"X": Extensible. If you want to integrate portions of SwissProt into
your own research DB, you can do this, but you would either have to deal
with the SwissProt normalization model, or datamart SwissProt and create
your own normalization.

Some of this comes from the bias of the developers as well. It is hard
to transport RDBMSs portably. There are whole companies devoted to EDI
that do nothing but this (for other industries). XML greatly simplifies
the EDI. It is not a silver bullet, but it is helpful for data exchange.

If you get your results back in tabular form (RDBMS) or structural form
(XML) from a query, does it matter what the underlying data storage
technology is?
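
For what it is worth, the "connected set of tables" case mentioned above
can be sketched in a few lines of Python. This is only an illustration
under stated assumptions: the protein/feature document is invented (it
is not the SwissProt format), and the table and column names are
hypothetical. It uses the standard library's sqlite3 and
xml.etree.ElementTree modules.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical document: nesting encodes the protein -> feature relationship.
DOC = """<proteins>
  <protein id="P1" name="alpha">
    <feature kind="helix" start="5" end="20"/>
    <feature kind="sheet" start="30" end="42"/>
  </protein>
  <protein id="P2" name="beta">
    <feature kind="helix" start="1" end="12"/>
  </protein>
</proteins>"""


def load(doc):
    """Flatten the nested document into two related tables."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE protein (id TEXT PRIMARY KEY, name TEXT)")
    db.execute(
        "CREATE TABLE feature (protein_id TEXT REFERENCES protein(id),"
        " kind TEXT, start_pos INTEGER, end_pos INTEGER)"
    )
    for p in ET.fromstring(doc).iter("protein"):
        db.execute(
            "INSERT INTO protein VALUES (?, ?)", (p.get("id"), p.get("name"))
        )
        for f in p.iter("feature"):
            # The nesting in the XML becomes an explicit foreign key here.
            db.execute(
                "INSERT INTO feature VALUES (?, ?, ?, ?)",
                (p.get("id"), f.get("kind"),
                 int(f.get("start")), int(f.get("end"))),
            )
    return db


db = load(DOC)

# Once the data is tabular, searching is a single declarative query.
helix_owners = [
    row[0]
    for row in db.execute(
        "SELECT p.name FROM protein p JOIN feature f ON f.protein_id = p.id"
        " WHERE f.kind = 'helix' ORDER BY p.name"
    )
]
print(helix_owners)  # ['alpha', 'beta']
```

Note what the mapping costs: the parent/child structure that the XML
carries for free has to be restated as a key column, and any change to
the document's shape means a schema change on the relational side.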