[Pipet Devel] Re: SYNERGY
Humberto Ortiz Zuazaga
hortiz at neurobio.upr.clu.edu
Fri Oct 15 15:55:31 EDT 1999
> > 0. Loci is data independent, netgenetics is bioinformatics-centric
>
> Right. Speaking of data independence. It looks as though SYNERGY uses an
> internal format for their biodata, which we have decided against. An internal
> format requires 2 conversions between incompatible components:
>
> GENBANK     1    Internal    2    Analysis
> document  ---->  format    ---->  that doesn't
>                                   read GENBANK
>
> The 'converter locus' scheme that Loci uses would do only 1 conversion, via
> the converter.
Jeff, you've got this exactly backwards. We need an internal format; we
decided it would be XML based, perhaps extended BSML. Converters should be
written from any format to ours and from ours to any format; otherwise we get
to write a converter for every pair of formats we support.
For example, imagine we want to support 4 file formats:
genbank - internal
pdb - internal
fasta - internal
bsml - internal
vs. converters between the same 4 formats:
genbank - pdb
genbank - fasta
genbank - bsml
pdb - fasta
pdb - bsml
fasta - bsml
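This comparison gets worse as you add more file formats: with N formats, the
internal-format approach needs N converter pairings, while the pairwise
approach needs N(N-1)/2. This is why the netpbm tools all convert to pnm files.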
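
To make the counting concrete, here is a rough Python sketch. The function
names are made up for illustration and are not part of Loci. Each pairing is
counted once, the same way the two lists above are written; if each direction
needs its own code, both counts double and the comparison stays the same.

```python
from itertools import combinations


def hub_converters(formats):
    """One pairing per external format, always against the internal format."""
    return [(fmt, "internal") for fmt in formats]


def pairwise_converters(formats):
    """One pairing for every unordered pair of external formats."""
    return list(combinations(formats, 2))


formats = ["genbank", "pdb", "fasta", "bsml"]
print(len(hub_converters(formats)), "pairings via the internal format")
print(len(pairwise_converters(formats)), "pairings with direct converters")
```

With the 4 formats above this gives 4 pairings against 6. With 10 formats it
would be 10 against 45.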
What we had decided is that we can defer defining our file formats until we
actually have any loci that use them, and that we can have many small
languages instead of a big language that tries to capture all possible data
types.
So we'll have an internal format for nucleotide sequences, one for amino acid
sequences, one for multi sequence objects, one for sequence annotations, one
for bibliographic references, ...
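
As a rough illustration of one such small format, here is a Python sketch of a
nucleotide sequence type with a converter in each direction for FASTA. All of
the names here are hypothetical, the dataclass only stands in for the XML
based format we have not defined yet, and the reader handles a single record.

```python
from dataclasses import dataclass


@dataclass
class NucleotideSequence:
    """Hypothetical stand-in for the internal nucleotide sequence format."""
    identifier: str
    residues: str


def fasta_to_internal(text):
    """Read one FASTA record into the internal representation."""
    lines = [ln.strip() for ln in text.strip().splitlines() if ln.strip()]
    if not lines or not lines[0].startswith(">"):
        raise ValueError("expected a FASTA header line starting with '>'")
    header = lines[0][1:].split()
    identifier = header[0] if header else ""
    # Everything after the header line is sequence data, possibly wrapped.
    return NucleotideSequence(identifier, "".join(lines[1:]).upper())


def internal_to_fasta(seq, width=60):
    """Write the internal representation back out as one FASTA record."""
    body = [seq.residues[i:i + width]
            for i in range(0, len(seq.residues), width)]
    return "\n".join([">" + seq.identifier] + body) + "\n"
```

A GenBank reader would only need to produce the same NucleotideSequence, and
it would get FASTA output for free through internal_to_fasta.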
--
Humberto Ortiz Zuazaga
Bioinformatics Specialist
Institute of Neurobiology
hortiz at neurobio.upr.clu.edu