[Biodevelopers] RDBMS and Bioinformatics

Marc Dumontier mrdumont at blueprint.org
Tue Apr 13 12:22:37 EDT 2004


Dan Bolser wrote:

>>>How does the use of XML make the data model less scary? 
>>>
>>>I see how the XML is convenient for your read / write API, and how the
>>>hierarchical data model is more naturally encoded in XML. I see that it is
>>>because you use XML that you have access to the fast indexing technology.
>>>
>>>But how do you deal with issues of data integrity? I get the feeling I
>>>should learn XML schema... Does the BIND datamodel have an XML schema with
>>>constraints on the data?
>>>
>>>I can't help feeling that a big / complex data model is problematic for
>>>any system, no matter what the format.
>>>
>>>Thanks very much for the feedback,
>>>
>>>Cheers,
>>>Dan.
>>>
>>Using the XML document instead of a fully relational model is much less 
>>scary because you don't have to write complex SQL to select and update 
>>data properly. If you need 30 SQL statements to update many tables, 
>>you have a lot of points of failure. It's much easier to deal with a 
>>single document that contains all the data, and to have specific 
>>indexes on that data to query against.
>>
>
>I see your point. Deleting one 'object', for example, could require a set of
>deletes from many tables. What you describe sounds like you have the model
>encoded somewhere in the software (middleware?) and so don't have to
>worry about it too much.
>
>>Our software is easy to update when the underlying data specification 
>>changes. We just regenerate the XML Schema from the ASN.1 spec, invoke 
>>JAXB, and start working with the new classes. We don't have to change 
>>any SQL or anything else.
>>
>
>Great. That is a big problem for 'old' DB backend apps with multiple data
>access points.
>
>>BIND does have an XML schema which imposes constraints and defines the 
>>data types, and since JAXB works off this schema, our data structures 
>>are all properly typed. The generated XML document is always validated 
>>against the schema before being committed to the database.
>>
>
>Cool.
>
>OK, final question, how will you do complex queries across the data?
>
>Thanks again,
>Dan.

We use Lucene (http://jakarta.apache.org/lucene) to build a 
field-specific text index. It provides us with a robust query 
language; all we have to do is tell it how to index our data, and the 
rest is done for us.
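
To make "field-specific text index" concrete, here is a minimal Python
sketch of the idea. It is not Lucene and not BIND's code: the element
names (interaction, molecule, organism), the sample records, and the
helper functions are all invented for illustration, and the tiny
"field:term AND field:term" query form only loosely resembles a real
query language. Each XML document is indexed per element, so a query can
target one field instead of scanning the whole blob.

```python
import re
import xml.etree.ElementTree as ET
from collections import defaultdict

# Hypothetical records: id -> XML document. The element names are made up
# for this sketch and are not taken from the BIND specification.
RECORDS = {
    1: "<interaction><molecule>actin</molecule><organism>yeast</organism></interaction>",
    2: "<interaction><molecule>myosin</molecule><organism>mouse</organism></interaction>",
    3: "<interaction><molecule>actin</molecule><organism>mouse</organism></interaction>",
}


def tokenize(text):
    """Lowercase the text and split it into alphanumeric terms."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(records):
    """Map (field, term) -> set of record ids, one field per XML element."""
    index = defaultdict(set)
    for record_id, xml_text in records.items():
        root = ET.fromstring(xml_text)
        for element in root.iter():
            # Elements without text (such as the root here) add no terms.
            for term in tokenize(element.text or ""):
                index[(element.tag, term)].add(record_id)
    return index


def search(index, query):
    """Evaluate 'field:term' clauses joined by ' AND '; return sorted ids."""
    result = None
    for clause in query.split(" AND "):
        field, term = clause.strip().split(":", 1)
        hits = index.get((field, term.lower()), set())
        result = hits if result is None else result & hits
    return sorted(result or [])


index = build_index(RECORDS)
print(search(index, "molecule:actin"))                     # [1, 3]
print(search(index, "molecule:actin AND organism:mouse"))  # [3]
```

The documents themselves stay whole; only the index knows about
individual fields, which is why a change to the data specification
means changing how fields are indexed rather than rewriting SQL.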


Marc Dumontier

>>Marc Dumontier
>>
>>>>We should be releasing a beta of this software shortly. Please 
>>>>visit http://www.bind.ca periodically for more information.
>>>>
>>>>Marc Dumontier
>>>>BIND Software Developer
>>>>Blueprint Initiative
>>>>Mt. Sinai Hospital
>>>>Toronto, ON
>>>>
>>>>Dan Bolser wrote:
>>>>
>>>>>On Tue, 16 Mar 2004, Michel Dumontier wrote:
>>>>>
>>>>>>>Same goes for BIND; they plan to use an RDB, but not in a conventional
>>>>>>>way (as far as I understand).
>>>>>>>
>>>>>>BIND (http://bind.ca) stores BIND objects, defined by an ASN.1 specification
>>>>>>(ftp://ftp.blueprint.org/pub/BIND/spec/, also available as XML DTD and
>>>>>>Schema), as ASN.1/XML in BLOB fields in the database table.  BIND uses
>>>>>>field-specific indexing to search for any particular object or set of
>>>>>>objects that match the search criteria.  The relational aspect is
>>>>>>really more for curatorial work and tracking, afaik...
>>>>>>
>>>>>So it won't be like an XML query system? Sorry if I misunderstand, but it
>>>>>sounds like you just do a plain text index on an XML blob, but is it more
>>>>>than that?
>>>>>
>>>>>Generally, can anyone tell me what the point of XML schema is when
>>>>>relational schemas have existed for years with well-understood maths, query
>>>>>languages and theories of relational design? I understand XML as a
>>>>>transport medium, but why make it the basis for your object model over the
>>>>>RDB relational schema? Perhaps object-oriented data modeling can do things
>>>>>relational modeling can't, but at what cost? I hate sounding old, but what
>>>>>was wrong with the RDB that we have to invent XPath and the like?
>>>>>
>>>>>Anyone on the list remember when relational databases were 'the new
>>>>>thing'?
>>>>>
>>>>>Dan.
>>>>>
>>>>>>Michel Dumontier
>>>>>>PhD Candidate
>>>>>>Samuel Lunenfeld Research Institute, Mt. Sinai Hospital
>>>>>>Department of Biochemistry, University of Toronto
>>>>>>Toronto, ON M5G1X5
>>>>>>micheld at mshri.on.ca
>>>>>>http://blueprint.org
>>>>>>
>>>>>>
>>>>>>
>>>>>>_______________________________________________
>>>>>>Biodevelopers mailing list
>>>>>>Biodevelopers at bioinformatics.org
>>>>>>https://bioinformatics.org/mailman/listinfo/biodevelopers
>>>>>>



