[BiO BB] GeneBank to SwissProt Mapping?
dmb at mrc-dunn.cam.ac.uk
Sat May 15 06:38:54 EDT 2004
On Sat, 15 May 2004, Anand Kumar wrote:
>We, with the Swissprot team, have begun a pilot project to bridge the
>gap between diseases and gene products, as it exists in OMIM. We are
>trying to address various granularities, from clinical to molecular
>levels. Are you doing the mapping for human proteins? That could help
>us, since we already use the GO annotations attached to the respective
>gene products.
To clarify what I am doing:
I take BIND protein interaction data, and for each sequence available in
BIND (with a GI accession number) I run BLAST against SPTrEMBL. This will
give me a GI <-> SP accession number mapping for the dataset I am
It will also create (as a side effect) a whole load of data about close
homologues of the protein sequences available in BIND, which could be very
It will be easy to make a human subset of the data, for example
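
For anyone wanting to try the same kind of mapping, here is a minimal
Python sketch of the post-BLAST step. It is not the script used for the
BIND run, only an illustration. It assumes tabular BLAST output (12
tab-separated columns, with percent identity in column 3 and bit score in
column 12), query IDs of the form gi|NUMBER|... and subject IDs of the
form sp|ACCESSION|NAME or tr|ACCESSION|NAME. The function names and the
100% identity cutoff are hypothetical choices, not part of any tool.

```python
import csv


def extract_gi(query_id):
    """Pull the GI number out of an ID like 'gi|12345|ref|NP_1.1|' (assumed format)."""
    parts = query_id.split("|")
    if len(parts) > 1 and parts[0] == "gi":
        return parts[1]
    return query_id


def extract_accession(subject_id):
    """Pull the accession out of an ID like 'sp|P12345|NAME' (assumed format)."""
    parts = subject_id.split("|")
    if len(parts) > 1 and parts[0] in ("sp", "tr"):
        return parts[1]
    return subject_id


def gi_to_sp_mapping(blast_tab_lines, min_identity=100.0):
    """Return {GI: accession}, keeping the highest bit score hit per query.

    Hits below min_identity (percent) are ignored; the cutoff is an
    illustrative assumption and should be tuned for the dataset.
    """
    best = {}  # gi -> (bitscore, accession)
    for row in csv.reader(blast_tab_lines, delimiter="\t"):
        # Skip blank lines, comment lines and malformed rows.
        if len(row) < 12 or row[0].startswith("#"):
            continue
        identity = float(row[2])
        bitscore = float(row[11])
        if identity < min_identity:
            continue
        gi = extract_gi(row[0])
        accession = extract_accession(row[1])
        if gi not in best or bitscore > best[gi][0]:
            best[gi] = (bitscore, accession)
    return {gi: acc for gi, (_, acc) in best.items()}
```

The function takes any iterable of lines, so an open file handle on the
BLAST output can be passed in directly. Taking only the single best hit is
a simplification: ties and near-identical homologues would need extra
handling to get a true 1-1 mapping.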
Please let me know,
>Am Sa 15.05.2004 11:35, Dan Bolser <dmb at mrc-dunn.cam.ac.uk> schrieb:
>> Thanks Pamela for your suggestions, I have tried LocusLink (the 'loc2acc'
>> file) but I find many missing swissprot / trembl accession numbers. I
>> fear that the GO mapping could be incomplete for other reasons too.
>> Thanks Svensson for the offer of assistance below, but I am not sure how
>> having PID will help me. (sorry for my ignorance).
>> I know sptrembl has a good PIR mapping, can we get to PIR from GI?
>> I am running blast jobs at the minute as the bind sequence file isn't too
>> big, but getting accurate 1-1 mapping means that I have to blast against
>> the full datasets and not some non redundant version thereof.
>> 4 more days to go...
>> If anyone wants the data let me know.
>> On 14 May 2004, Svensson, B.A.T. (HKG) wrote:
>> >On Fri, 2004-05-14 at 15:28, Dan Bolser wrote:
>> >> On 14 May 2004, Svensson, B.A.T. (HKG) wrote:
>> >> >What data do you have to start with?
>> >> BIND.... I am thinking that blast will get the job done if I use refseq...
>> >I don't know about BIND, but if you have refseq, then you will be able
>> >to fetch the protein identifier (PID) as well as the official
>> >gene symbol from NCBI's data set UniGene (pre-blasted, so to say); this
>> >should be enough to link to your data. If you need help with retrieving
>> >PIDs and gene symbols, let me know and I can make a query in my own
>> >database to retrieve this data for you - or anything else you need from
>> >Precisely which EBI dataset would you like to link to?
>> >> >On Fri, 2004-05-14 at 15:10, Dan Bolser wrote:
>> >> >> Anybody know how to map proteins from the NCBI to proteins from the EBI?
>> >> >>
>> >> >> Cheers,
>> >> >> Dan.
>> >> >
>> >BiO_Bulletin_Board maillist - BiO_Bulletin_Board at bioinformatics.org
>Anand Kumar MBBS, PhD
>Faculty of Medicine
>University of Leipzig