[BiO BB] command-line (scriptable) ORF finders?

Ann Loraine aloraine at gmail.com
Sun Sep 17 03:19:38 EDT 2006


Hello all,

I'm hoping someone on the list who is involved with EST or full-length
cDNA sequencing projects can help me with something (well, two
things):

(1) I am looking for a command-line, scriptable tool that can take as
input an EST, cDNA, or assembled EST contig ("unigene") sequence and
return the most likely or longest open reading frame. This is for a
plant EST project.  It should also pay attention to codon usage rules.

(2) I am also looking for a tool that can take as input a set of exon
annotations (or mRNA-to-genome alignments) and return the most likely
CDS start and end for the given gene structure. Tools that can adjust
the alignment/exon boundaries to optimize the ORF *and* which pay
attention to codon usage rules would be extra great. This is for
deducing novel gene structures from cross-species mRNA-to-genome
alignments. Maybe there is a gene-finder that does this?
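
To make item (1) concrete, here is a minimal Python sketch of the
behaviour being asked for, using only the standard library. It is an
illustration under stated assumptions, not an existing tool: the
function names are hypothetical, an ORF is taken to mean ATG through
the first in-frame stop codon, and codon usage is not scored at all.
Partial ORFs that run off the end of an EST are also ignored, which a
real tool would need to handle.

```python
# Minimal sketch: longest complete ORF (ATG..stop) over all six reading frames.
# All names are hypothetical; codon usage is NOT taken into account.

STOPS = {"TAA", "TAG", "TGA"}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def longest_orf_in_strand(seq):
    """Return (start, end) of the longest ATG..stop ORF on one strand.

    Coordinates are 0-based, end is exclusive and includes the stop codon.
    Returns None when no complete ORF is present.
    """
    best = None
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i  # first ATG since the previous stop in this frame
            elif start is not None and codon in STOPS:
                if best is None or (i + 3 - start) > (best[1] - best[0]):
                    best = (start, i + 3)
                start = None
    return best


def longest_orf(seq):
    """Return (strand, start, end, orf_sequence) or None.

    For strand "-", start and end refer to the reverse-complemented sequence.
    """
    seq = seq.upper()
    best = None
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        hit = longest_orf_in_strand(s)
        if hit is None:
            continue
        start, end = hit
        if best is None or (end - start) > len(best[3]):
            best = (strand, start, end, s[start:end])
    return best


if __name__ == "__main__":
    import sys
    # Usage: python longest_orf.py SEQUENCE [SEQUENCE ...]
    for arg in sys.argv[1:]:
        print(longest_orf(arg))
```

A wrapper that reads FASTA records and calls longest_orf() on each
would make this usable on thousands of sequences from a shell script.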

I've found a variety of Web sites that claim to do this, but, as you
know, Web sites don't really cut it when you are working with
thousands of sequences. I would also like to be able to see the code
in case I run into problems.

Any thoughts or suggestions (other than pointers to Web tools, please)
would be greatly appreciated!

Sincerely,

Ann Loraine

-- 
Ann Loraine
Assistant Professor
Section on Statistical Genetics
University of Alabama at Birmingham
http://www.ssg.uab.edu
http://www.transvar.org


