[BiO BB] Re: Quickly retrieving cross-referenced records from NCBI

Dale Richardson dalesan at gmail.com
Wed Dec 20 22:32:17 EST 2006


Hi Stan,

Thanks for this!  I will definitely hold on to it.  However, I
think there is yet another way to retrieve such cross-referenced
records.  For example, if I input a set of XP_ accessions and want to
get the XM_ accessions, I can simply select "Nucleotide links" from the
pulldown menu at NCBI (where GenPept, FASTA and other such options
are listed).
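
For anyone who would rather script this than click through the menu, here is a rough sketch of the same lookup through NCBI's E-utilities ELink service. It only builds the request URL and does not contact NCBI. The function name build_elink_url is made up for illustration, and I am assuming that the protein-to-nuccore link returns the same records as the "Nucleotide links" menu entry, which I have not verified.

```python
from urllib.parse import urlencode

# Base address of the ELink service in NCBI's E-utilities.
EUTILS_ELINK = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/elink.fcgi"


def build_elink_url(protein_accessions):
    """Build an ELink URL asking for nucleotide records linked to each protein.

    Hypothetical helper: it assumes the protein -> nuccore link corresponds
    to the "Nucleotide links" option on the NCBI web pages.
    """
    params = [("dbfrom", "protein"), ("db", "nuccore")]
    # One "id" parameter per accession, so that links are reported
    # separately for every input record instead of being merged.
    params += [("id", accession) for accession in protein_accessions]
    return EUTILS_ELINK + "?" + urlencode(params)


if __name__ == "__main__":
    print(build_elink_url(["XP_698519.1", "XP_697978.1"]))
```

Fetching that URL should return XML listing the linked nucleotide records for each protein, which would then still have to be parsed to pull out the XM_ entries.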

Thanks for your tips, though (and to everyone else as well)!

Hope everyone has a great holiday season.

-dale


On Dec 20, 2006, at 12:02 PM, Gaj Stan (BIGCAT) wrote:

> Dear Dale,
>
> I ran into the same question a few weeks ago, but in the other
> direction: going from NM to NP. For that I wrote a Perl script, which
> I have adjusted to fit your needs (so it now goes from NP to NM).
>
> If I'm correct, RefSeq splits its database into three parts: genomic,
> mRNA and protein. For this script to work, you need a) to download a
> species-specific RefSeq mRNA database (the file name ends with
> .rna.gbff) from the NCBI FTP site and b) to have your own file listing
> the IDs to convert.
> Note that this script will NOT detect version numbers: e.g. XP_12345.1
> needs to be converted to XP_12345 in your list before the script can
> do its job!
>
> Although the code is far from perfect, it answers your question
> perfectly (-;
>
> Best wishes,
>
>    Stan
>
>
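
The Perl script itself is not reproduced here, but the approach Stan describes can be sketched in a few lines of Python. This is a sketch under stated assumptions, not his code: it assumes each record in the .rna.gbff file has a VERSION line carrying the mRNA accession, a /protein_id qualifier on its CDS feature, and a closing "//" line. The function names and the sample record are invented for illustration.

```python
import re

# Made-up record in GenBank flat-file layout; the accessions are not real.
SAMPLE = '''\
LOCUS       XM_12345     300 bp    mRNA    linear
VERSION     XM_12345.1
FEATURES             Location/Qualifiers
     CDS             1..300
                     /protein_id="XP_12345.1"
//
'''


def strip_version(accession):
    """Drop the version suffix, e.g. XP_12345.1 -> XP_12345."""
    return accession.split(".", 1)[0]


def protein_to_mrna(gbff_lines):
    """Map version-less protein IDs to the mRNA accession of their record."""
    mapping = {}
    mrna = None
    for line in gbff_lines:
        if line.startswith("VERSION"):
            fields = line.split()
            mrna = fields[1] if len(fields) > 1 else None
        elif line.startswith("//"):
            mrna = None  # end of record; forget the current mRNA
        else:
            match = re.search(r'/protein_id="([^"]+)"', line)
            if match and mrna:
                mapping[strip_version(match.group(1))] = mrna
    return mapping


if __name__ == "__main__":
    print(protein_to_mrna(SAMPLE.splitlines()))
```

Because the keys are stored without version numbers, the IDs in your own list need the same treatment (strip_version) before lookup, matching the note above about XP_12345.1 versus XP_12345.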
> -----Original Message-----
> From: bio_bulletin_board-bounces+stan.gaj=bigcat.unimaas.nl at
> bioinformatics.org
> [mailto:bio_bulletin_board-bounces+stan.gaj=bigcat.unimaas.nl at
> bioinformatics.org] On Behalf Of Eugene Bolotin
> Sent: 13 December 2006 19:30
> To: General Forum at Bioinformatics.Org
> Subject: Re: [BiO BB] Re: Quickly retrieving cross-referenced records
> from NCBI
>
> The quickest way is batch retrieval with the UCSC Table Browser. Read up
> on that.
>
>
> On 12/12/06, Dale Richardson <dalesan at gmail.com> wrote:
>>
>> Hello All,
>>
>> Forgive me for posting, but this question is hard to condense into a
>> good Google search.  I am wondering if there is a quick way to batch
>> retrieve all coding sequences (mRNA sequences) linked to a particular
>> NCBI RefSeq Protein identifier.  For example, if I have a list of 10
>> sequences with the following protein refseq IDs:
>>
>> XP_698519.1
>> XP_697978.1
>>
>> and so on..
>>
>> How can I retrieve the cross-referenced XM_ identifiers for the
>> coding sequences based on such protein accessions?  Must one write
>> some kind of script to accomplish this or is there a quicker way?
>>
>> thanks,
>>
>> dale richardson
>> university of cologne
>>
>> _______________________________________________
>> General Forum at Bioinformatics.Org -
>> BiO_Bulletin_Board at bioinformatics.org
>> https://bioinformatics.org/mailman/listinfo/bio_bulletin_board
>>
>
>
>
> -- 
> Eugene Bolotin
> Ph.D. candidate
> Genetics Genomics and Bioinformatics
> University of California Riverside
> ybolo001 at student.ucr.edu
> Dr. Frances Sladek Lab
