Hi,

I have a bioinformatics project that involves finding polymorphisms in mitochondrial DNA (mtDNA). The polymorphisms are typically denoted as "reference base/position/polymorphic base", as in A750G. I'd like to add a software tool to our company website where a visitor could paste in a set of mitochondrial genomes and a reference sequence, and get back a list of polymorphisms for each genome. Something like:

>Seq1 A458G, T4899A....
>SEQ2 T678C, G6789C....
etc.

We sequence mitochondrial DNA for customers interested in learning about their ancient ancestry. The tool will be freely available. It will be attached to our company site, www.argusbio.com, which is still in development at LunarPages. The author's name and an email link could be listed on the page.

A full-length genome is 16,569 bases long. Typically two people will have around 30 to 50 differences between their mtDNAs, and more (but fewer than 100) if they have very different ancestry (African vs. European, for example). These polymorphisms determine the person's mitochondrial haplogroup. It would be very helpful if the program could determine which haplogroup an mtDNA belongs to based on its list of polymorphisms. I have tables of the diagnostic polymorphisms used for classifying mt genomes.

It would also be very useful to have an option to generate a FASTA file consisting of just the polymorphic sites. So if someone put in 100 full-length genomes and a reference genome, the output would be FASTA sequences containing only the positions at which at least one test sequence differs from the reference. This output would be much easier to align with CLUSTALW than the full-length sequences, which are typically more than 99% invariant.

I am looking for ideas on how best to implement this web-based tool.

Thanks,

David B. Whyte, Ph.D.
Argus Biosciences, LLC
650-954-1055
dwhyte at argusbio.com
www.argusbio.com
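
To make the three requested outputs concrete, here is a minimal Python sketch of the core logic (polymorphism list, haplogroup lookup, polymorphic-sites FASTA). It is only one possible starting point, not a finished tool. It assumes every test sequence has already been aligned to the reference and has the same length, that the reference itself contains no gaps (so column numbers equal reference positions), and that only substitutions are reported; insertions and deletions would need real alignment handling. All function names are hypothetical, and the haplogroup table is assumed to be supplied by the caller as a mapping from haplogroup name to its set of diagnostic polymorphisms.

```python
from typing import Dict, Iterable, List, Set

SKIP = {"-", "N"}  # gap or unknown base: never reported as a polymorphism


def parse_fasta(text: str) -> Dict[str, str]:
    """Parse FASTA text into {name: sequence}, upper-casing the bases."""
    records: Dict[str, List[str]] = {}
    name = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            header = line[1:].split()
            name = header[0] if header else "unnamed"
            records[name] = []
        elif name is not None:
            records[name].append(line.upper())
    return {n: "".join(parts) for n, parts in records.items()}


def differs(ref_base: str, base: str) -> bool:
    """True when two aligned bases are a reportable substitution."""
    return ref_base != base and ref_base not in SKIP and base not in SKIP


def call_polymorphisms(reference: str, sample: str) -> List[str]:
    """Return substitutions such as 'A750G' (ref base, 1-based position, sample base)."""
    if len(reference) != len(sample):
        raise ValueError("sample must be pre-aligned to the reference")
    return [
        f"{r}{pos}{s}"
        for pos, (r, s) in enumerate(zip(reference, sample), start=1)
        if differs(r, s)
    ]


def assign_haplogroups(polys: Iterable[str], table: Dict[str, Set[str]]) -> List[str]:
    """Return haplogroups whose diagnostic polymorphisms are all present.

    Assumption: `table` maps haplogroup name -> set of diagnostic markers.
    """
    found = set(polys)
    return sorted(hg for hg, markers in table.items() if markers <= found)


def variable_sites_fasta(reference: str, samples: Dict[str, str]) -> str:
    """Build FASTA text holding only columns where some sample differs from the reference."""
    columns = [
        i for i, r in enumerate(reference)
        if any(differs(r, seq[i]) for seq in samples.values())
    ]
    lines = [">reference", "".join(reference[i] for i in columns)]
    for name, seq in samples.items():
        lines.append(f">{name}")
        lines.append("".join(seq[i] for i in columns))
    return "\n".join(lines) + "\n"
```

On a web page, the pasted text would go through `parse_fasta`, each sequence through `call_polymorphisms`, and the resulting lists through `assign_haplogroups` with the diagnostic tables. `variable_sites_fasta` yields the reduced file for CLUSTALW. At 16,569 bases per genome, a plain loop like this should be fast enough for a few hundred sequences per request.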