[BiO BB] GenBank to SwissProt Mapping?

Dan Bolser dmb at mrc-dunn.cam.ac.uk
Sat May 15 06:38:54 EDT 2004


On Sat, 15 May 2004, Anand Kumar wrote:

>Dan,
>
>We, with the Swissprot team, have begun a pilot project to bridge the
>gap between diseases and gene products, as it exists in OMIM. We are
>trying to address various granularities from clinical to
>molecular levels. Are you doing the mapping for human proteins? That
>could help us since we already use the annotations present with GO to
>the respective gene products.

To clarify what I am doing: 

I take the BIND protein interaction data and, for each sequence available
in BIND (with a GI accession number), I BLAST it against SPTrEMBL. The
results will give me a GI <-> SP accession number mapping for the dataset
I am interested in.
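
For anyone wanting to try the same thing, here is a rough sketch of how
the GI <-> SP table could be pulled out of the BLAST results. It is not
my actual script. It assumes BLAST was run with the standard 12-column
tabular output (query, subject, % identity, ..., bit score), and the
function name, the 100% identity cutoff and the IDs in the example are
all made up for illustration.

```python
import csv
from io import StringIO


def best_hits(tabular_lines, min_identity=100.0):
    """Map each query ID to its top-scoring subject ID.

    Expects standard 12-column tabular BLAST output. Hits below
    min_identity (an assumed cutoff, tune as needed) are ignored.
    """
    best = {}  # query ID -> (subject ID, bit score)
    for row in csv.reader(tabular_lines, delimiter="\t"):
        if not row or row[0].startswith("#"):
            continue  # skip blank lines and comment lines
        query, subject = row[0], row[1]
        identity, bitscore = float(row[2]), float(row[11])
        if identity < min_identity:
            continue
        # Keep only the highest bit score seen so far for this query.
        if query not in best or bitscore > best[query][1]:
            best[query] = (subject, bitscore)
    return {query: subject for query, (subject, _) in best.items()}


# Made-up example rows: two hits for gi|111, one weaker hit for gi|222.
example = (
    "gi|111\tP12345\t100.00\t250\t0\t0\t1\t250\t1\t250\t1e-140\t500\n"
    "gi|111\tQ99999\t98.00\t250\t5\t0\t1\t250\t1\t250\t1e-130\t480\n"
    "gi|222\tQ11111\t87.50\t120\t15\t0\t1\t120\t1\t120\t1e-50\t200\n"
)
mapping = best_hits(StringIO(example))
print(mapping)
```

Lowering min_identity keeps the weaker hits as well, which is roughly
where the close-homologue data would come from. Note that identity alone
does not check that the hit covers the whole query sequence, so a real
run would probably want a length check too.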

It will also create (as a side effect) a whole load of data about close
homologues of the protein sequences available in BIND, which could be very
useful.

It will be easy to make a human subset of the data, for example

http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=retrieve&db=pubmed&list_uids=15117749&dopt=Abstract

Please let me know,
Dan.

>
>Kind Regards,
>Anand.
>
>On Sat 15.05.2004 11:35, Dan Bolser <dmb at mrc-dunn.cam.ac.uk> wrote:
>
>> 
>> Thanks Pamela for your suggestions, I have tried LocusLink (the 'loc2acc'
>> file) but I find many missing swissprot / trembl accession numbers. I
>> fear that the GO mapping could be incomplete for other reasons too. 
>> 
>> Thanks Svensson for the offer of assistance below, but I am not sure how
>> having PID will help me. (sorry for my ignorance).
>> 
>> I know sptrembl has a good PIR mapping, can we get to PIR from GI?
>> 
>> I am running blast jobs at the minute as the BIND sequence file isn't too
>> big, but getting accurate 1-1 mapping means that I have to blast against
>> the full datasets and not some non redundant version thereof.
>> 
>> 4 more days to go...
>> 
>> If anyone wants the data let me know.
>> 
>> Dan.
>> 
>> 
>> 
>> On 14 May 2004, Svensson, B.A.T. (HKG) wrote:
>> 
>> >
>> >
>> >
>> >On Fri, 2004-05-14 at 15:28, Dan Bolser wrote:
>> >> On 14 May 2004, Svensson, B.A.T. (HKG) wrote:
>> >> 
>> >> >What data do you have to start with?
>> >> 
>> >> BIND.... I am thinking that blast will get the job done if I use refseq...
>> >
>> >I don't know about BIND, but if you have refseq, then you will be able
>> >to fetch the protein identifier (PID) as well as the official
>> >gene symbol from NCBI's data set UniGene (pre-blasted, so to speak); this
>> >should be enough to link to your data. If you need help with retrieving
>> >PID's and gene symbols, let me know and I can make a query in my own
>> >database to retrieve this data for you - or anything else you need from
>> >UniGene.
>> >
>> >Precisely which EBI dataset would you like to link to?
>> >
>> >
>> >> >On Fri, 2004-05-14 at 15:10, Dan Bolser wrote:
>> >> >> Does anybody know how to map proteins from the NCBI to proteins from the EBI?
>> >> >> 
>> >> >> Cheers,
>> >> >> Dan.
>> >> >
>> >_______________________________________________
>> >BiO_Bulletin_Board maillist  -  BiO_Bulletin_Board at bioinformatics.org
>> >https://bioinformatics.org/mailman/listinfo/bio_bulletin_board
>> >
>> 
>> 
>> 
>
>Anand Kumar MBBS, PhD
>IFOMIS
>Faculty of Medicine
>University of Leipzig
>Härtelstraße 16-18
>04107 Leipzig
>Germany
>http://www.uni-leipzig.de/~akumar/
>
>
>
>
>
>



