[BiO BB] fastacmd - sequence retrieval using "string"?

Cook, Malcolm MEC at Stowers-Institute.org
Wed Jan 31 17:44:14 EST 2007


fastacmd will not do this for you since formatdb does not index by
anything other than the sequence identifier(s).

To set up free-text indexing on the deflines of the FASTA database you
might look at Lucegene (http://www.gmod.org/?q=node/83), though the
setup overhead may outweigh the benefit you get from it.
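
If a plain linear scan is acceptable, a small script over the deflines may
be all that is needed. The following is only a sketch, written in Python
rather than Perl, and it assumes the database is available as a plain FASTA
file. It matches the query case-insensitively against each defline; the
function name grep_fasta and the toy records are made up for illustration.
It reads every record once, so it will not be as quick as an indexed
fastacmd -s lookup.

```python
import io


def grep_fasta(handle, query):
    """Yield (defline, sequence) for records whose defline contains query.

    The match is a case-insensitive substring test on the defline only.
    """
    needle = query.lower()
    header, chunks, keep = None, [], False
    for line in handle:
        line = line.rstrip("\r\n")
        if line.startswith(">"):
            # A new record starts: emit the previous one if it matched.
            if keep:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
            keep = needle in header.lower()
        elif keep:
            # Only buffer sequence lines for records we intend to emit.
            chunks.append(line.strip())
    if keep:
        yield header, "".join(chunks)


# Toy records standing in for a FASTA file (hypothetical identifiers).
demo = io.StringIO(
    ">seq1 Homo Sapiens - Kinase 1\n"
    "MTHS\n"
    "TCC\n"
    ">seq2 Mus musculus - Kinase 1\n"
    "MKKLA\n"
)

hits = list(grep_fasta(demo, "homo sapiens"))
for header, seq in hits:
    print(">" + header)
    print(seq)
```

For a real file, pass an open file handle instead of the StringIO object,
for example open("nr.fasta") if that is where the sequences were saved.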

Cheers,

Malcolm Cook

  

> -----Original Message-----
> From: bio_bulletin_board-bounces+mec=stowers-institute.org at bioinformatics.org
> [mailto:bio_bulletin_board-bounces+mec=stowers-institute.org at bioinformatics.org]
> On Behalf Of Shameer Khadar
> Sent: Wednesday, January 31, 2007 11:46 AM
> To: General Forum at Bioinformatics.Org
> Subject: Fwd: [BiO BB] fastacmd - sequence retrieval using "string"?
> 
> Dear Malcolm,
> Thanks for such a detailed reply!
> I am sorry for the poor description of my problem.
> 
> I am aware that fastacmd -s can search by accession ID (say, a set of
> numbers). I am looking for an option to quickly search the nr database
> and retrieve sequences based on a "query string".
> 
> For example, if the following is a snippet of a sequence from nr:
> > gi|15674171|ref|NP_268346.1,gi|Homo Sapiens - Kinase 1
> MTHSTCC.....
> I need to retrieve the above entry (and, of course, similar entries)
> based on a query string, say "Homo Sapiens". I know this can be done
> with a Perl script, and I have coded one for myself, but I need
> something quick like fastacmd -s.
> 
> I hope my question is clearer this time.
> Thanks for all the time you have spent on this!
> -- 
> Happy Bioinformatics
> Across the miles... Shameer Khadar


