[BiO BB] command-line (scriptable) ORF finders?

Ann Loraine aloraine at gmail.com
Sun Sep 17 03:19:38 EDT 2006

Hello all,

I'm hoping someone on the list who is involved with EST or full-length
cDNA sequencing projects can help me with something (well, two things):

(1) I am looking for a command-line, scriptable tool that can take as
input an EST, cDNA, or assembled EST contig ("unigene") sequence and
return the most likely or longest open reading frame. This is for a
plant EST project.  It should also pay attention to codon usage rules.
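
To make request (1) concrete, here is a minimal Python sketch of the
"longest ORF" part only. It is an illustration, not an existing tool:
the function names are hypothetical, it assumes the standard genetic
code and a plain A/C/G/T sequence, it only reports ORFs that run from
an ATG to an in-frame stop (so partial ORFs at the ends of an EST are
missed), and it does not score codon usage at all.

```python
STOPS = {"TAA", "TAG", "TGA"}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def longest_orf_in_strand(seq):
    """Scan the three forward frames of seq.

    Returns (start, end) of the longest ATG..stop ORF, 0-based with an
    exclusive end that includes the stop codon, or None if there is none.
    """
    best = None
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i  # first ATG since the previous stop opens the ORF
            elif start is not None and codon in STOPS:
                if best is None or (i + 3 - start) > (best[1] - best[0]):
                    best = (start, i + 3)
                start = None  # resume the search after this stop
    return best


def longest_orf(seq):
    """Return (strand, start, end, orf_sequence) over all six frames.

    For strand "-", the coordinates refer to the reverse-complemented
    sequence, not to the input as given. Returns None if no ORF is found.
    """
    seq = seq.upper()
    candidates = []
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        hit = longest_orf_in_strand(s)
        if hit is not None:
            candidates.append((strand, hit[0], hit[1], s[hit[0]:hit[1]]))
    if not candidates:
        return None
    return max(candidates, key=lambda c: c[2] - c[1])
```

Something of this shape could be looped over the records of a FASTA
file in a script; ranking candidate ORFs by codon usage instead of by
length would need a species-specific codon table on top of it.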

(2) I am also looking for a tool that can take as input a set of exon
annotations (or mRNA-to-genome alignments) and return the most likely
CDS start and end for the given gene structure. Tools that can jigger
the alignment/exon boundaries to optimize the ORF *and* which pay
attention to codon usage rules would be extra great. This is for
deducing novel gene structures from cross-species mRNA-to-genome
alignments. Maybe there is a gene-finder that does this?
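
As an illustration of the simplest form of request (2), the sketch
below splices the exons out of a genomic sequence, finds the longest
ATG..stop ORF in the spliced transcript, and maps its start and end
back to genomic coordinates. All names are hypothetical. It assumes
forward-strand genes, exons given as sorted, 0-based, half-open
(start, end) pairs, and the standard stop codons. It does not adjust
exon boundaries and does not use codon usage.

```python
STOPS = {"TAA", "TAG", "TGA"}


def longest_forward_orf(seq):
    """Return (start, end) of the longest ATG..stop ORF in the three
    forward frames of seq (end exclusive, stop included), or None."""
    best = None
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i
            elif start is not None and codon in STOPS:
                if best is None or (i + 3 - start) > (best[1] - best[0]):
                    best = (start, i + 3)
                start = None
    return best


def spliced_to_genomic(exons, pos):
    """Map a 0-based position in the spliced transcript to the genome."""
    offset = 0  # transcript length covered by the exons seen so far
    for start, end in exons:
        length = end - start
        if pos < offset + length:
            return start + (pos - offset)
        offset += length
    raise ValueError("position lies beyond the spliced transcript")


def cds_from_exons(genome, exons):
    """Return the genomic (cds_start, cds_end) of the longest ORF in the
    transcript defined by exons, end exclusive, or None if no ORF."""
    mrna = "".join(genome[s:e] for s, e in exons).upper()
    hit = longest_forward_orf(mrna)
    if hit is None:
        return None
    orf_start, orf_end = hit
    # Map the last base of the stop codon, then add 1 for the exclusive end.
    return (spliced_to_genomic(exons, orf_start),
            spliced_to_genomic(exons, orf_end - 1) + 1)
```

For example, with genome = "GGATGAA" + "gtaagcag" + "ATTTTAGCC" and
exons = [(0, 7), (15, 24)], the spliced transcript is
GGATGAAATTTTAGCC and the CDS maps to genomic positions 2 to 22.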

I've found a variety of web sites that claim to do this, but, as you
know, Web sites don't really cut it when you are working with
thousands of sequences. And also, I would like to see the code in case
I run into problems.

Any thoughts or suggestions (other than pointers to Web tools, please)
would be greatly appreciated!


Ann Loraine
Assistant Professor
Section on Statistical Genetics
University of Alabama at Birmingham
