[BiO BB] fastacmd - sequence retrieval using "string" ?
MEC at Stowers-Institute.org
Wed Jan 31 17:44:14 EST 2007
fastacmd will not do this for you since formatdb does not index by
anything other than the sequence identifier(s).
To set up free text indexing on the deflines of the fasta database you
might look at Lucegene (http://www.gmod.org/?q=node/83), though the
overhead may be bigger than the advantage you get from it.
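If the overhead of a full-text index is not worth it, a plain one-pass scan over the deflines is the usual fallback. Below is a minimal Python sketch of that idea, assuming the database is available as an ordinary FASTA text file; the function name grep_fasta and the demo records are made up for illustration and are not part of fastacmd or formatdb. It is a linear scan with no index, so it will not be as quick as an identifier lookup with fastacmd -s.

```python
import io
from typing import Iterator, TextIO, Tuple


def grep_fasta(handle: TextIO, query: str) -> Iterator[Tuple[str, str]]:
    """Yield (defline, sequence) for records whose defline contains query.

    Hypothetical helper: case-insensitive substring match on the defline,
    reading the file once from top to bottom.
    """
    needle = query.lower()
    defline = None   # defline of the record being read, without the ">"
    chunks = []      # sequence lines collected for that record
    for line in handle:
        line = line.rstrip("\r\n")
        if line.startswith(">"):
            # A new record starts: emit the previous one if it matched.
            if defline is not None and needle in defline.lower():
                yield defline, "".join(chunks)
            defline, chunks = line[1:], []
        elif defline is not None:
            chunks.append(line.strip())
    # Emit the last record in the file if it matched.
    if defline is not None and needle in defline.lower():
        yield defline, "".join(chunks)


if __name__ == "__main__":
    # Made-up two-record example, not real nr entries.
    demo = io.StringIO(
        ">gi|1|ref|XX_000001.1| Homo sapiens kinase 1\nMKT\nAAL\n"
        ">gi|2|ref|XX_000002.1| Mus musculus kinase 1\nMKV\n"
    )
    for header, seq in grep_fasta(demo, "homo sapiens"):
        print(header, len(seq))
```

For a real file you would pass an open file handle instead of the io.StringIO demo. The identifiers parsed from the matching deflines could then be handed to fastacmd -s if you want the records pulled from the formatted database itself.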
> -----Original Message-----
> bio_bulletin_board-bounces+mec=stowers-institute.org at bioinformatics.org
> [mailto:bio_bulletin_board-bounces+mec=stowers-institute.org at bioinformatics.org]
> On Behalf Of Shameer Khadar
> Sent: Wednesday, January 31, 2007 11:46 AM
> To: General Forum at Bioinformatics.Org
> Subject: Fwd: [BiO BB] fastacmd - sequence retrieval using "string" ?
> Dear Malcom,
> Thanks for such a detailed reply!!!
> I am sorry for the 'bad description' of my problem.
> I am aware that fastacmd -s can search using the Accession ID (say a set
> of numbers). I am looking for an option to quickly search the nr database
> to retrieve sequences based on a "Query String".
> For example : If the following is a snippet of a sequence from nr :
> > gi|15674171|ref|NP_268346.1,gi|Homo Sapiens - Kinase 1
> I need to retrieve entries like the one above (and of course similar
> entries) based on a query string, say "Homo Sapiens". I know this can be
> done using a Perl script, and I have coded one for myself, but I need
> something quick like fastacmd -s.
> Hope you got my question this time.
> Thanks for all the time you spent on this for me!!!
> Happy Bioinformatics
> Across the miles... Shameer Khadar