[BiO BB] fastacmd - sequence retrieval using "string" ?
skhadar at gmail.com
Wed Jan 31 12:43:19 EST 2007
Thanks for such a detailed reply !!!
I am sorry for the 'bad description' of my problem.
I am aware that fastacmd -s can search using the Accession ID (say a set of
numbers). I am looking for an option to quickly search the nr database to
retrieve sequences based on a "Query String".
For example : If the following is a snippet of a sequence from nr :
> gi|15674171|ref|NP_268346.1,gi|Homo Sapiens - Kinase 1
I need to retrieve the above entry (and of course any similar entries)
based on a query string, say "Homo Sapiens". I know this can be done using a
Perl script, and I have coded one for myself, but I need something quick
like fastacmd -s.
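
For reference, the kind of defline search described above can be sketched in a few lines. This is only an illustration in Python (not the Perl script mentioned), and it assumes the database is available as a plain FASTA file; the function name search_deflines and the toy records are made up for the example.

```python
import io


def search_deflines(handle, query):
    """Yield (defline, sequence) for FASTA records whose defline contains query.

    The match is a case-insensitive substring test on the header line only.
    """
    needle = query.lower()
    defline, chunks = None, []
    for line in handle:
        line = line.rstrip("\n")
        if line.startswith(">"):
            # A new record starts: emit the previous one if it matched.
            if defline is not None and needle in defline.lower():
                yield defline, "".join(chunks)
            defline, chunks = line[1:].strip(), []
        elif defline is not None:
            chunks.append(line.strip())
    # Emit the last record in the file if it matched.
    if defline is not None and needle in defline.lower():
        yield defline, "".join(chunks)


# Toy data invented for illustration; real nr deflines will differ.
example = io.StringIO(
    ">gi|1|ref|XX_000001.1| kinase 1 [Homo sapiens]\n"
    "MKTAYIAK\n"
    "QRQISFVK\n"
    ">gi|2|ref|XX_000002.1| hypothetical protein [Mus musculus]\n"
    "MSDNELLK\n"
)
hits = list(search_deflines(example, "homo sapiens"))
for header, sequence in hits:
    print(">" + header)
    print(sequence)
```

A linear scan like this reads every record, so on a database the size of nr it will be much slower than an indexed identifier lookup.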
Hope you got my question this time.
Thanks for all the time you spent for me !!!
Across the miles... Shameer Khadar
On 1/31/07, Cook, Malcolm <MEC at stowers-institute.org> wrote:
> It is unclear to me exactly what you want to do. What exactly do you
> mean by "string query"?
> Does knowing that the following three commands return the same result
> answer your question?:
> > fastacmd -s 15674171,66818355
> > fastacmd -s NP_268346.1,XP_642837.1
> > fastacmd -s 'gi|15674171|ref|NP_268346.1,gi|66818355|ref|XP_642837.1'
> (note: you must quote the query to prevent the shell from trying to
> interpret the '|' character as pipe operator).
> If this does not help you, then I'm really unsure what you're after...
> The options that appear relevant to your need, taken from running
> fastacmd with --help as only option, are
>   -s  Comma-delimited search string(s).
>       GIs, accessions, loci, or fullSeq-id strings may be used,
>       e.g. 555, AC147927, 'gnl|dbname|tag' [String] Optional
>   -i  Input file with GIs/accessions/loci for batch
>       retrieval [String] Optional
>   -L  Range of sequence to extract (Format: start,stop)
>       0 in 'start' refers to the beginning of the sequence
>       0 in 'stop' refers to the end of the sequence [String] Optional
>       default = 0,0
> If you want to extract subsequences (ranges) from a bunch of different
> sequences, you must make separate calls to fastacmd. The -L option will
> not help you with this. The -L option only allows you to specify a
> single range. If you use it in conjunction with multiple comma-delimited
> search strings, this single range is applied equally to all of
> the resulting sequences.
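
The "separate calls" approach can be sketched as follows. This is a hedged illustration in Python, not part of fastacmd: the helper name fastacmd_range_calls and the ranges are hypothetical, and it only builds the command lines (using the -d, -s and -L options) without running them.

```python
import shlex


def fastacmd_range_calls(ranges, database="nr"):
    """Return one fastacmd argument list per (identifier, start, stop) tuple.

    Each sequence gets its own call because -L accepts a single range.
    """
    calls = []
    for identifier, start, stop in ranges:
        calls.append(
            ["fastacmd", "-d", database, "-s", identifier, "-L", f"{start},{stop}"]
        )
    return calls


# Hypothetical ranges, chosen only for illustration; 0 as 'stop' means
# "to the end of the sequence", as described in the option text.
wanted = [("NP_268346.1", 1, 50), ("XP_642837.1", 10, 0)]
calls = fastacmd_range_calls(wanted)
for argv in calls:
    print(shlex.join(argv))
```

Each argument list could then be passed to subprocess.run(argv, capture_output=True, text=True) on a machine where fastacmd and the database are installed. Passing a list rather than a shell string also avoids the quoting problem with '|' noted earlier.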
> Malcolm Cook
> Database Applications Manager - Bioinformatics
> Stowers Institute for Medical Research - Kansas City, Missouri
> > -----Original Message-----
> > From: bio_bulletin_board-bounces+mec=stowers-institute.org at bioinformatics.org
> > [mailto:bio_bulletin_board-bounces+mec=stowers-institute.org at bioinformatics.org]
> > On Behalf Of Shameer Khadar
> > Sent: Tuesday, January 30, 2007 9:19 PM
> > To: General Forum at Bioinformatics.Org
> > Subject: [BiO BB] fastacmd - sequence retrieval using "string" ?
> > Dear All,
> > Is it possible to retrieve sequence(s) from a fastacmd nr database based on
> > string queries delimited by commas?
> > I know it is possible with the Accession IDs. Is there any way to do it for
> > a string query?
> > Thanks,
> > Shameer
> > _______________________________________________
> > General Forum at Bioinformatics.Org -
> > BiO_Bulletin_Board at bioinformatics.org
> > https://bioinformatics.org/mailman/listinfo/bio_bulletin_board