On Wed, 01 Dec 1999, J.W. Bizzaro wrote:
>
> Just for kicks, I'm reposting my June message about 'constructing the
> command-line' (well, and because I mentioned it to Brad). Note that I refer
> to 'our own' XML for bioinformatics + Loci internals: LocusML. The plans for
> a LocusML have changed a bit since then.
>
> Jeff

I am glad Jeff reposted this. I have been creating Perl CGI interfaces to
EMBOSS programs. I was writing to Jeff about this and how it would be great
to parse the *.acd files for each program (these define the input and output
data types, which of them are required, the data ranges, etc.) into a GUI.
This might be similar to GDE, but Glade seems very promising. Alternatively,
for a Loci interface, parsing the *.acd files might generate a series of
linked loci. One hassle with doing this is that the ACD interface will
change incrementally (see below).

As an aside on the internal data representation, you could either have one
or not, similar to what Brad just mentioned about using databases.
Personally, I think format conversions are too lossy wrt annotations. Also,
short of rewriting (almost) every application outside of Loci, you would
need to deal with format conversions at some point.

The EMBOSS list has an interesting thread going about protein sequences
with very high ATCG content: they must be forced to protein type, otherwise
the program thinks they are nucleic acids. The issue is adding a new flag
for this forcing, and what the flag's name will be. The diversity of opinion
on this issue is heartening. BLAST, for example, does this up front: you
have to tell the program what type you have. Other programs tag sequences at
the top with their type, but that would involve changing the databases to
create a new data format, like FBF.

-- 
.david
David Lapointe
"The meek will inherit the earth," noted tycoon J. Paul Getty.
"But not the mineral rights."
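
A rough illustration of the type-guessing problem described above, as a minimal Python sketch. It is not the actual EMBOSS or BLAST logic: the function name `guess_sequence_type`, the `forced_type` argument (standing in for the proposed forcing flag), and the 0.85 cutoff are all assumptions made up for this example.

```python
# Hypothetical sketch only: not how EMBOSS or BLAST actually decide.
# Letters that a naive guesser would treat as evidence of nucleic acid.
NUCLEOTIDE_LETTERS = set("ACGTUN")


def guess_sequence_type(sequence, forced_type=None, threshold=0.85):
    """Return "protein" or "nucleic" for a sequence string.

    forced_type plays the role of the forcing flag: if given, it
    overrides the composition-based guess entirely.
    threshold is an assumed cutoff chosen purely for illustration.
    """
    if forced_type is not None:
        if forced_type not in ("protein", "nucleic"):
            raise ValueError("forced_type must be 'protein' or 'nucleic'")
        return forced_type

    # Ignore gaps, digits and whitespace; look only at letters.
    letters = [c for c in sequence.upper() if c.isalpha()]
    if not letters:
        raise ValueError("sequence contains no letters")

    # Fraction of letters that could be nucleotide codes.
    fraction = sum(c in NUCLEOTIDE_LETTERS for c in letters) / len(letters)
    return "nucleic" if fraction >= threshold else "protein"


# A made-up peptide built only from Ala, Cys, Gly and Thr residues.
tricky = "GATTACAGGCATCAGTACCGAT"
print(guess_sequence_type(tricky))                         # guessed from composition
print(guess_sequence_type(tricky, forced_type="protein"))  # caller overrides the guess
```

The first call misclassifies the peptide because every residue letter is also a nucleotide code; the second shows why an explicit override, whatever the flag ends up being called, avoids touching the databases or the data format.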