On Wednesday 14 May 2003 03:21 pm, Nico Stuurman wrote:
[...]
> > 1) Add a "getAsArray()" method to the seq object, which returns an
> > array containing all of the 'set' attributes and their values
> > (key=attribute ["sequence", "id", etc.], value=value of that
> > attribute). This will also substitute as a "wrapper" for all of the
> > other interface methods at once (i.e. so the user doesn't have to do
> > "getId(); getSequence();" (etc...) if they want all of the seq
> > object's data.)
>
> Isn't this functionality supposed to be part of seq_factory()? Maybe I
> still don't get the concepts behind this structure.

seq_factory's ONLY purpose as designed is to generate seq objects,
rather than "disassembling" them. 'Course, there are two different
viewpoints here:

On one hand, adjusting seq_factory's design goals to make it an
"interconverter" or "translator" rather than a seq object creator
wouldn't be THAT big of a change.

On the other hand, getting "generic" data back out of the seq object
seems to fit naturally into the seq object itself - currently this is
done by direct access ("$sequence=$seqobj->sequence;") and hopefully in
the near future through the more "object-orientationally-correct" use
of interface methods in the object ("$sequence=$seqobj->getSequence();").
I'm not sure moving retrieval of the data from the seq object out into
another layer would necessarily be helpful in this case.

(Additionally, for what it's worth, the way "blahblahblah_factory"
objects seem to be used elsewhere [C++, Java, etc.] is where I got the
notion of a standalone object-generation-dedicated class. I don't know
if that's necessarily "correct" design, but it does seem to be common.)

I'm not strongly opposed to broadening seq_factory's purpose a bit if
we want to, though ('course, we'll want to rename it if we do).

> > 4) if given (to an "add()" method) an "array" of attributes, IOWrite
> > just shoves them on the stack.
> > If passed a seq object, IOWrite calls its "getAsArray()" method and
> > shoves the results of that on the stack. (The "stack" is necessary
> > when export is to interleaved file formats). We MIGHT include a
> > "write()" (or some similar name) method to allow bypassing the
> > "stack" and writing immediately for non-interleaved formats (returns
> > false if called while set to an interleaved format).
>
> How important are interleaved formats going to be? They complicate
> matters quite a bit, and if we can do without.... I would be all for a
> 'write' method. Also, how is an interleaved format going to be
> 'written'? By calling the 'write' method?

Well, for MY purposes, converting from clustal to phylip is one thing I
could see myself doing fairly often (both interleaved formats), as well
as at some point reading clustal data in to 'cull' badly-aligning
sequences from the list and writing back out. (Not something that needs
to be done often when you're selecting the sequences to align by hand,
but in an automated system that takes a not-human-reviewed list of
sequences and aligns them, a future module to evaluate the quality of
the individual alignments and cull bad ones for phylogenetic analysis
could be handy.)

And, yeah, I figured the "write" method would be analogous to PHP's
"flush()" - basically signalling the exporter to write whatever it's
got saved up in its stack out to the destination (be it a variable of
text, a file, or whatever). This MIGHT be done in some cases for
non-interleaved formats, too, for data being sent over the 'net (for
the small speed benefit of sending the data all at once rather than
send a bit, read a bit, send a bit, read a bit... also beneficial if
saving to media that "wears out", like compact flash cards, though I
don't imagine that will be a really frequent concern).

[...]
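To make the add()/write() idea above a bit more concrete, here is a rough sketch of how I picture it. Only the names IOWrite, add(), write() and getAsArray() come from this thread; DemoSeq, pending(), the method bodies and the tab-separated output are placeholders I made up for illustration, not existing code and not a real file format.

```php
<?php
// Hypothetical sketch only: the bodies below are guesses, and the
// "id<TAB>sequence" line layout is a placeholder, not a real format.

class DemoSeq
{
    public $id;
    public $sequence;

    public function __construct($id, $sequence)
    {
        $this->id = $id;
        $this->sequence = $sequence;
    }

    // Stand-in for the proposed getAsArray() interface method.
    public function getAsArray()
    {
        return array('id' => $this->id, 'sequence' => $this->sequence);
    }
}

class IOWrite
{
    private $stack = array();

    // Accepts an attribute array, or any object offering getAsArray().
    public function add($item)
    {
        if (is_object($item) && method_exists($item, 'getAsArray')) {
            $item = $item->getAsArray();
        }
        if (!is_array($item)) {
            return false; // nothing we know how to store
        }
        $this->stack[] = $item;
        return true;
    }

    // Number of records waiting to be written (helper assumed here).
    public function pending()
    {
        return count($this->stack);
    }

    // Flush-style write: emit everything saved so far, then empty the stack.
    public function write()
    {
        $out = '';
        foreach ($this->stack as $record) {
            $out .= ($record['id'] ?? '') . "\t" . ($record['sequence'] ?? '') . "\n";
        }
        $this->stack = array();
        return $out;
    }
}

$writer = new IOWrite();
$writer->add(array('id' => 'seq1', 'sequence' => 'ACGT')); // plain array
$writer->add(new DemoSeq('seq2', 'TTGA'));                 // seq-like object
$text = $writer->write();
echo $text;
```

Holding everything until write() is called is what would let an interleaved exporter see all of the sequences before it lays out the first block; a real exporter would also need to know its target format, which this sketch leaves out.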
> > If I DID make a separate "Translate" class to be used like this, it
> > might also include things like
> > "Translate::NCBIDeflineExtract($field)" which one could use to get,
> > e.g., just the accession number out of an NCBI Defline.
>
> I can't oversee the advantages/disadvantages completely here.

The main advantages as I see it are easier code re-use, and lower
resource usage for objects. In other words, currently every single seq
object would contain a full copy of the code for some method, whereas
if the "common" methods were moved to a separate class, they would only
contain a "wrapper" which points to a single copy. (In the case of
"complement()", this is already the case - just moving it into a
separate file/class makes it easier to find and re-use in other
modules, potentially.) Inside the seq object we could implement a
"getComplement()" method which simply does:

    return (sequence_Common::complement($this->seq));

Having said that - I have to admit I've never actually TRIED this
before, so I don't know how easy it is to use or how well it works, but
it SEEMS like it would be good for some things. My opinion at this
point is that it's something we ought to CONSIDER, and probably
something we'll eventually want to do, but that it's not something that
we have a genuine NEED for yet, and so we can pretty much drop it if
nobody else thinks the idea is useful at the moment.

> > It might also be worth the trouble to move a lot of the "common"
> > functions that are currently in the class files but not part of the
> > classes (e.g. the "complement()" function in seq.inc.php) where they
> > can be accessed by other objects (or have the file be utilized by
> > itself by other projects).
[...]
> Hmm. Doesn't it make more sense to make them part of the seq objects?
In the specific case of complement (and others) they do seem a natural
fit inside the seq object - in those cases really the only benefit the
concept gives is reduced resource usage for each seq object (and the
ability to get to a "complement()" function outside of the seq class
file - though that's not much of a benefit by itself, since someone can
just as easily call seq::complement($sequence) as
sequence_Common::complement($sequence)).

> Right, although it [error checking in interface methods] all adds
> overhead.

Definitely true, though I think the ability to have checking and
correction and so on is worth the trouble (not that every - or even
"most" - methods need to include all of that. Most of mine are
generally just wrappers for returning or setting the internal
attributes.) The main reason I bring it up is that it seems like anyone
who comes in from an OO background is going to be expecting to work
through interface methods rather than accessing variables directly, so
in addition to "possible" benefits that may or may not be used in an
individual method, it will also accommodate people used to OO design
(without harming anyone who still wants to deal with the attribute
variables in the object directly).

> > I was thinking about editing my old sequence class [...]
> Good plan.

I'll add that to my list, then... Once I get the NCBI-Blast to a
minimally-complete point I'll get to work on that - I don't think it'll
take long.
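P.S. For anyone who wants to see the interface-method and shared-helper ideas side by side, here is a rough, untested-against-the-project sketch. The names seq, sequence_Common, complement(), getComplement(), getSequence(), getId() and getAsArray() are the ones used in this thread; setId(), setSequence(), the particular checks, and the choice to complement without reversing are my own assumptions for illustration. The attributes stay public so that direct access keeps working.

```php
<?php
// Hypothetical sketch: method bodies and validation rules are assumptions.

class sequence_Common
{
    // Complement a DNA string base by base (no reversal), keeping case.
    public static function complement($sequence)
    {
        return strtr((string) $sequence, 'ACGTacgt', 'TGCAtgca');
    }
}

class seq
{
    // Left public so "$seqobj->sequence" style access still works.
    public $id = null;
    public $sequence = null;

    public function setId($id)
    {
        $this->id = (string) $id;
        return true;
    }

    // Setter with "checking and correction": strips whitespace,
    // upper-cases, and rejects anything but letters, '-' and '*'.
    public function setSequence($sequence)
    {
        $clean = strtoupper(preg_replace('/\s+/', '', (string) $sequence));
        if ($clean === '' || !preg_match('/^[A-Z*-]+$/', $clean)) {
            return false;
        }
        $this->sequence = $clean;
        return true;
    }

    public function getId()
    {
        return $this->id;
    }

    public function getSequence()
    {
        return $this->sequence;
    }

    // Thin wrapper: the real work lives in one shared static method.
    public function getComplement()
    {
        return sequence_Common::complement($this->sequence);
    }

    // Returns only the attributes that have actually been set.
    public function getAsArray()
    {
        $out = array();
        foreach (array('id' => $this->id, 'sequence' => $this->sequence) as $key => $value) {
            if ($value !== null) {
                $out[$key] = $value;
            }
        }
        return $out;
    }
}

$s = new seq();
$s->setId('demo1');
$s->setSequence("acg t\n");        // cleaned up to "ACGT"
$rejected = $s->setSequence('AC9T'); // digit is refused, old value kept
$comp = $s->getComplement();
$all = $s->getAsArray();
```

The per-call cost of the checks is the "overhead" mentioned above; a setter that needs none of it can simply assign and return.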