On Tue, 6 Nov 2001, Ewan Birney wrote:

> David -
>
> Sounds like a great project, and if you want to code, no one is stopping
> you, that's for sure!

The only thing to fear is fear itself. I'm not sure why that quote came to mind :)

> ;) I guess we are talking about Genquire for browsing, not for write back
> quite yet, but I am happy to get into write back.

Yeah, that will come. For now, I just want to see stuff.

> I suspect as I know the current Ensembl API well and the Bioperl
> interfaces perfectly I would be the man to help out here.

Hopefully I don't add too much to your plate.

> Big question for you
>
> how do you handle scrolling (or imagine handling scrolling) across v.
> large regions of DNA - ie, you have 250MB of chr1. Do you want to

We handled that in Arabidopsis with a contig-based chromosome viewer screen, kind of like DAS entry points. The easiest way into the genome was to pick your chromosome, then pick the contig of interest on that chromosome. 100 Kb is no big deal for features, and bigger than that you are flying so high above the data that individual features are irrelevant. This approach is not quite so relevant once we think in terms of whole chromosomes. I think our approach will have to be to use the GUI to encourage people to download smaller chunks at a time, and then to scroll chunk by chunk using the chromosome-level window.

> (i) pull out a sequence of the whole thing and then make calls like
>
>   get_SeqFeatures_range(10000000,20000000);
>
> with the returned features starting at 10000000

I don't like this, but it could be done.

> (ii) pull out a sequence between 1000000,2000000 and then make calls
> like
>
>   get_SeqFeatures();
>
> with the returned features starting at 1

This is the way I think about things, and the way we've done it in Arabidopsis.

> Remember we are talking millions of features across chr1, so pulling them
> all out into memory is not going to happen!
>

No, we want to lazy load as much as possible, and encourage the user to restrict the length of sequence downloaded to something reasonable.

> Efficiently caching and managing the memory for the scrolling seems to be
> where a lot of "magic" has to happen for these sorts of browsers.

Maybe we're punting by using a higher-level screen. Mark, what do you think?

> I/We/Ensembl can accommodate both ways. (ii) is currently easier with the
> current API; (i) will be easier with a future API and so I'd love to see
> how easy it is to adapt between the two things.

We should be able to accommodate (i), but (ii) is a more natural fit. The problem is the DAS entry points list. In Arabidopsis, the list of contigs available from TIGR was the natural set to work with. They are almost all around 100 Kb, so loading them up was trivial. Anything bigger than that will take some work.

> Secondly - How do you want the GO things attached to genes (DBxRefs?) and
> do you want to reuse all the lovely GeneStructureI stuff inside
> Bioperl? (I presume yes). Should GeneStructureI also have-a
> AnnotationCollectionI (talking bioperl) or should we hook it up someway
> else?

These are write-back questions, correct? Mark and I stored GO things somewhat crudely and directly inside our TagValue table, using a small hack. I'm not sure how we would want to handle this. We have had discussions about where annotations belong: on the gene or on the transcript. The Genquire annotation code is part of GQ::Server::GenericFeature, so it can hang off of Genes, Transcripts, or Features (Exons, etc.).

Does the GeneStructureI map to Ensembl adaptors okay? Genquire implements all of those interfaces as well, so our business objects look like GeneStructureI objects. They just have a 'context', which is their persistence hook (DbObj was too ugly), from which they receive an appropriate adaptor, which is called to do persistence-y type things.
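To make the context/adaptor arrangement described above concrete, here is a minimal sketch of the idea. It is written in Python rather than Perl for brevity, and every class and method name in it (Context, adaptor_for, InMemoryFeatureAdaptor, store, fetch) is hypothetical. It is not Genquire, Ensembl, or Bioperl code, and the GO identifier is only an illustrative value.

```python
# Hypothetical sketch: none of these class or method names come from
# Genquire, Ensembl, or Bioperl.

class InMemoryFeatureAdaptor:
    """Stand-in adaptor; a real one would talk to a database."""

    def __init__(self):
        self.rows = {}

    def store(self, feature):
        self.rows[feature.name] = (feature.start, feature.end, dict(feature.tags))

    def fetch(self, name, context):
        start, end, tags = self.rows[name]
        feature = Feature(name, start, end, context)
        feature.tags.update(tags)
        return feature


class Context:
    """Persistence hook: hands each business object the adaptor for its type."""

    def __init__(self):
        self._adaptors = {}

    def register(self, kind, adaptor):
        self._adaptors[kind] = adaptor

    def adaptor_for(self, kind):
        return self._adaptors[kind]


class Feature:
    """Business object that knows its context but not how it is stored."""

    kind = "feature"

    def __init__(self, name, start, end, context):
        self.name = name
        self.start = start
        self.end = end
        self.context = context
        self.tags = {}

    def save(self):
        # Persistence is delegated to whatever adaptor the context supplies.
        self.context.adaptor_for(self.kind).store(self)


context = Context()
context.register("feature", InMemoryFeatureAdaptor())

exon = Feature("exon1", 100, 250, context)
exon.tags["GO"] = "GO:0003677"  # illustrative tag value only
exon.save()

reloaded = context.adaptor_for("feature").fetch("exon1", context)
```

Under this assumption, swapping the in-memory adaptor for one backed by a different database would not require touching the Feature class, which is roughly the property a bridge between two systems would rely on.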
>
> I foresee a genquire-ensembl-bridge cvs repository existing somewhere out
> there....

It will probably be a fairly major sub-project within Genquire, so it can hang out at bioinformatics.org, or wherever. How well does mixing cvs roots work in a single installation? Namespace shouldn't be too hard, since Genquire is all in GQ::Server or GQ::Client. But can I keep a sub-directory (GQ::Server::Ensembl) stored in a different cvs root? I'm sure you deal with that, with all the different projects on the go at one time in your world :)

Thanks for responding.

BTW, it looks like I'm going to be spending some quality commuting time on a train here in California. I look forward to some Ewan-ish outbursts in my future!

Dave
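For reference, the difference between options (i) and (ii) in the thread above comes down to a coordinate offset, and the chunk-by-chunk lazy loading can sit on top of either. The following is a rough sketch only, in Python rather than Perl. All names (Feature, to_slice_coords, to_chromosome_coords, ChunkedFeatureStore, fetch_chunk) are made up for illustration and are not Ensembl, Bioperl, or Genquire API. It assumes 1-based inclusive coordinates and a 100 Kb chunk size, and the two gene features are invented test data.

```python
# Hypothetical sketch: Feature, ChunkedFeatureStore and fetch_chunk are
# illustrative names, not Ensembl, Bioperl, or Genquire API.
from collections import namedtuple

Feature = namedtuple("Feature", "name start end")  # 1-based, inclusive


def to_slice_coords(feature, slice_start):
    """Option (ii): renumber so the slice's first base is position 1."""
    offset = slice_start - 1
    return feature._replace(start=feature.start - offset, end=feature.end - offset)


def to_chromosome_coords(feature, slice_start):
    """Option (i): shift slice-relative positions back to chromosome positions."""
    offset = slice_start - 1
    return feature._replace(start=feature.start + offset, end=feature.end + offset)


class ChunkedFeatureStore:
    """Loads features one fixed-size chunk at a time and caches a few chunks."""

    def __init__(self, fetch_chunk, chunk_size=100_000, max_cached=4):
        self.fetch_chunk = fetch_chunk  # callable(start, end) -> list of Feature
        self.chunk_size = chunk_size
        self.max_cached = max_cached
        self._cache = {}                # chunk index -> features, oldest first

    def _chunk(self, index):
        if index in self._cache:
            self._cache[index] = self._cache.pop(index)   # mark as recently used
        else:
            start = index * self.chunk_size + 1
            self._cache[index] = self.fetch_chunk(start, start + self.chunk_size - 1)
            if len(self._cache) > self.max_cached:
                del self._cache[next(iter(self._cache))]  # evict least recently used
        return self._cache[index]

    def features_in_range(self, start, end):
        """Features overlapping [start, end], in chromosome coordinates."""
        first = (start - 1) // self.chunk_size
        last = (end - 1) // self.chunk_size
        found = []
        for index in range(first, last + 1):
            for f in self._chunk(index):
                # A feature spanning a chunk boundary is reported only once.
                if f.start <= end and f.end >= start and f not in found:
                    found.append(f)
        return found


fetched = []

def fetch_chunk(start, end):
    # Stand-in for a database query; records each call to show laziness.
    fetched.append((start, end))
    all_features = [Feature("geneA", 10_000_500, 10_002_000),
                    Feature("geneB", 10_150_000, 10_151_000)]
    return [f for f in all_features if f.start <= end and f.end >= start]

store = ChunkedFeatureStore(fetch_chunk)
hits = store.features_in_range(10_000_001, 10_100_000)       # option (i) view
local = [to_slice_coords(f, 10_000_001) for f in hits]       # option (ii) view
```

In this sketch only the one chunk covering the requested window is fetched, and asking for the same window again is served from the cache. Converting between the two conventions is a single subtraction or addition, so a viewer written against (ii) could in principle be pointed at an API that speaks (i).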