[BiO BB] genbank2swissprot ?
Gaj Stan (BIGCAT)
Stan.Gaj at BIGCAT.unimaas.nl
Wed Oct 10 07:58:55 EDT 2007
Hi Christoph,
There are two other possibilities:
a) Use BioMart at www.ensembl.org to retrieve Ensembl gene IDs for your list of RefSeq IDs (you mentioned NM_ IDs, so I assume you mean RefSeq IDs) and export the list together with the UniProt cross-references. One problem you will surely run into with this approach is that more than one UniProt ID can be associated with a single Ensembl gene. The exported list contains this information, but on separate lines, so you will need to collapse or filter those lines yourself.
b) The RefSeq group recently announced that they have added cross-references to UniProt to their databases (the two groups collaborated closely on this). I can't find the archive of their Gene-Announce list, but here is the announcement:
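The filtering step could look roughly like the sketch below. It assumes the BioMart export is tab-separated with a header row; the column names `RefSeq mRNA ID` and `UniProtKB ID` and the function name are placeholders I made up, so adjust them to whatever your export actually contains.

```python
import csv
import io
from collections import defaultdict


def collapse_biomart_rows(tsv_text, refseq_col="RefSeq mRNA ID",
                          uniprot_col="UniProtKB ID"):
    """Group a BioMart-style export so each RefSeq ID appears only once.

    Column names are assumptions; change them to match your export header.
    """
    grouped = defaultdict(list)
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    for row in reader:
        refseq = (row.get(refseq_col) or "").strip()
        uniprot = (row.get(uniprot_col) or "").strip()
        if not refseq:
            continue  # rows without a RefSeq ID cannot be mapped
        ids = grouped[refseq]  # touching the key keeps IDs without any UniProt hit
        if uniprot and uniprot not in ids:
            ids.append(uniprot)
    return dict(grouped)


# Made-up example rows, only to show the "one UniProt ID per line" situation.
example = (
    "Ensembl Gene ID\tRefSeq mRNA ID\tUniProtKB ID\n"
    "GENE_A\tNM_000001\tP11111\n"
    "GENE_A\tNM_000001\tQ22222\n"
    "GENE_B\tNM_000002\t\n"
)
mapping = collapse_biomart_rows(example)
```

Each RefSeq ID then maps to a list, so you can decide afterwards whether to keep all UniProt IDs or only the cases with exactly one.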
=======
Announcing the availability of RefSeq-UniProtKB cross-link data
In collaboration with UniProtKB (http://www.pir.uniprot.org/), the RefSeq group is now reporting explicit cross-references to Swiss-Prot and TrEMBL proteins that correspond to a RefSeq protein. These correspondences are being calculated by the UniProtKB group, and will be updated every three weeks to correspond to UniProt's release cycle. The data are being made available from several sites within NCBI:
1. The full report from Entrez Gene, in the Reference Sequences section.
For an example, go to the Full Report page for the sevenless gene of Drosophila melanogaster (http://www.ncbi.nlm.nih.gov/sites/entrez?Db=gene&Cmd=DetailsSearch&Term=32039%5Buid%5D) and click on the Reference Sequences section in the table of contents on the right. You will see
mRNA and Protein(s)
NM_078559.2→NP_511114.2 sevenless CG18085-PA [Drosophila melanogaster]
UniProtKB/Swiss-Prot P13368 <--- new data
2. Links in NCBI's Protein database
Explicit links between corresponding RefSeq and Swiss-Prot proteins are now provided within the NCBI Protein database. These links are available in the ‘Links’ menu located at the upper right of the protein display page. The link names are:
Protein (RefSeq): provides a link from a Swiss-Prot record to the corresponding RefSeq record
Protein (UniProtKB): provides a link to the equivalent Swiss-Prot record
3. Filter choices in NCBI's Protein database
protein protein refseq2uniprot: find RefSeq protein records with a link to a UniProtKB protein in NCBI's protein database
protein protein uniprot2refseq: find UniProtKB protein records with a link to a RefSeq protein in NCBI's protein database
4. ftp sites
A new file was added to the gene and refseq ftp sites to report the relationship between NCBI Reference Sequence protein accessions and UniProtKB protein accessions. The new gene_refseq_uniprotkb_collab.gz file specifies the corresponding pairs of NCBI and UniProtKB protein accessions.
ftp://ftp.ncbi.nlm.nih.gov/gene/DATA/gene_refseq_uniprotkb_collab.gz
or
ftp://ftp.ncbi.nlm.nih.gov/refseq/uniprotkb/gene_refseq_uniprotkb_collab.gz
The README file on the gene and refseq ftp sites has been updated to document this addition. See:
ftp://ftp.ncbi.nlm.nih.gov/gene/README
ftp://ftp.ncbi.nlm.nih.gov/refseq/README
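A minimal sketch of how the downloaded file might be used. It assumes two tab-separated columns (RefSeq protein accession, then UniProtKB accession) and `#` comment or header lines; please check the README for the real layout. Because the original question starts from NM_ accessions, an NM_ to NP_ table is needed as well. Here that is just a hypothetical dict, which in practice would have to be built from one of the tables in gene/DATA. All function names are made up for illustration.

```python
import gzip
from collections import defaultdict


def parse_collab_lines(lines):
    """Map RefSeq protein accession -> list of UniProtKB accessions.

    Assumes two tab-separated columns and '#' comment lines (an assumption;
    see the README for the actual file layout).
    """
    mapping = defaultdict(list)
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        fields = line.split("\t")
        if len(fields) < 2:
            continue  # skip malformed lines
        refseq_protein, uniprot = fields[0], fields[1]
        # Drop a version suffix such as ".2" if one is present.
        mapping[refseq_protein.split(".")[0]].append(uniprot)
    return dict(mapping)


def load_collab_file(path):
    """Read the gzipped file from a local path (hypothetical helper)."""
    with gzip.open(path, "rt") as handle:
        return parse_collab_lines(handle)


def nm_to_uniprot(nm_ids, nm_to_np, np_to_uniprot):
    """Chain NM_ -> NP_ -> UniProtKB; unmapped IDs give an empty list."""
    result = {}
    for nm in nm_ids:
        np_acc = nm_to_np.get(nm)
        result[nm] = np_to_uniprot.get(np_acc, []) if np_acc else []
    return result


# Sample lines: the header text is invented, the pair is the sevenless example.
sample = ["#refseq_protein\tuniprotkb_protein\n", "NP_511114\tP13368\n"]
np_map = parse_collab_lines(sample)
hits = nm_to_uniprot(["NM_078559", "NM_999999"],
                     {"NM_078559": "NP_511114"}, np_map)
```

For the real file, `load_collab_file("gene_refseq_uniprotkb_collab.gz")` would take the place of the `sample` list.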
5. the ASN.1 in Entrez Gene
New implementation of a gene-commentary:
Each cross-reference will be reported in a gene-commentary of type other. Note: more than one cross-reference per RefSeq protein record is possible.
    type other,
    source {
      {
        src {
          db "UniProtKB/Swiss-Prot",
          tag str "P23760"
        },
        anchor "P23760"
      },
      {
        src {
          db "UniProtKB/TrEMBL",
          tag str "O23760"
        },
        anchor "O23760"
      }
    }
====
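If you go the ASN.1 route, the cross-references in a gene-commentary like the one in the announcement could be pulled out of the text form with a regular expression. This is only a rough sketch: it is a pattern match on printed ASN.1, not a real ASN.1 parser, and the function name is hypothetical.

```python
import re

# Matches a db/tag pair as it is printed in the text ASN.1 fragment.
PAIR_RE = re.compile(r'db\s+"(UniProtKB/[^"]+)"\s*,\s*tag\s+str\s+"([^"]+)"')


def extract_uniprot_xrefs(asn1_text):
    """Return (database, accession) tuples for UniProtKB cross-references."""
    return PAIR_RE.findall(asn1_text)


fragment = '''
type other,
source {
  {
    src {
      db "UniProtKB/Swiss-Prot",
      tag str "P23760"
    },
    anchor "P23760"
  },
  {
    src {
      db "UniProtKB/TrEMBL",
      tag str "O23760"
    },
    anchor "O23760"
  }
}
'''
xrefs = extract_uniprot_xrefs(fragment)
```

Keeping the database name next to each accession lets you separate Swiss-Prot from TrEMBL hits afterwards.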
I haven't tested this one out myself, but I think it might do the trick for you (:
Best wishes,
-- Stan
-----Original Message-----
From: bio_bulletin_board-bounces+stan.gaj=bigcat.unimaas.nl at bioinformatics.org [mailto:bio_bulletin_board-bounces+stan.gaj=bigcat.unimaas.nl at bioinformatics.org] On Behalf Of Boris Steipe
Sent: 09 October 2007 20:18
To: General Forum at Bioinformatics.Org
Subject: Re: [BiO BB] genbank2swissprot ?
Does the UniProt ID mapping service fit your requirements?
http://www.pir.uniprot.org/search/idmapping.shtml
Boris
On 9-Oct-07, at 12:09 PM, Dr. Christoph Gille wrote:
> Is there a mapping of identifiers from genbank nt sequences
> to identifiers of swissprot (protein) ?
> Using some tables in
> ftp://ftp.ncbi.nih.gov/refseq/
> ftp://ftp.ncbi.nih.gov/gene/DATA
> there seems to be a way indirectly over the proteinkb.
> But perhaps there is a more direct way?
> Many thanks
>
> Christoph
>
> _______________________________________________
> General Forum at Bioinformatics.Org -
> BiO_Bulletin_Board at bioinformatics.org
> https://bioinformatics.org/mailman/listinfo/bio_bulletin_board