<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN">
<HTML><HEAD>
<META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=gb2312">
<TITLE>which one is the "best" protein sequence when multiple entries are found?</TITLE>
<META content="MSHTML 6.00.2800.1170" name=GENERATOR></HEAD>
<BODY>
<DIV><SPAN class=531302921-22072003><FONT face=Arial color=#0000ff size=2>I
would suggest picking the longest version of the protein.</FONT></SPAN></DIV>
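<DIV><SPAN class=531302921-22072003><FONT face=Arial color=#0000ff size=2>A
minimal sketch of that rule of thumb, assuming the entries have already been
grouped under a common protein name. The tuple layout, the function name
pick_longest and the accession labels are hypothetical placeholders, and the
longest entry is only a heuristic, not a guarantee of accuracy.</FONT></SPAN></DIV>
<PRE>
```python
def pick_longest(entries):
    """Keep one (accession, sequence) pair per protein: the longest sequence.

    entries: iterable of (protein_name, accession, sequence) tuples.
    This layout is an assumption made for illustration only.
    """
    grouped = {}
    for name, accession, sequence in entries:
        grouped.setdefault(name, []).append((accession, sequence))
    # max() returns the first maximal item, so ties keep the earliest entry.
    return {
        name: max(candidates, key=lambda pair: len(pair[1]))
        for name, candidates in grouped.items()
    }


# Hypothetical example data; the accession labels are placeholders.
entries = [
    ("proteinX", "ACC_A", "MKTAYIAK"),
    ("proteinX", "ACC_B", "MKTAYIAKQRQISFVK"),
    ("proteinY", "ACC_C", "MSDNE"),
]
chosen = pick_longest(entries)
for name, (accession, sequence) in chosen.items():
    print(name, accession, len(sequence))
```
</PRE>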
<DIV><SPAN class=531302921-22072003><FONT face=Arial color=#0000ff
size=2></FONT></SPAN> </DIV>
<DIV><SPAN class=531302921-22072003><FONT face=Arial color=#0000ff size=2>-
Joel</FONT></SPAN></DIV>
<BLOCKQUOTE dir=ltr style="MARGIN-RIGHT: 0px">
<DIV class=OutlookMessageHeader dir=ltr align=left><FONT face=Tahoma
size=2>-----Original Message-----<BR><B>From:</B> Renxue Wang
[mailto:rwang@bccancer.bc.ca]<BR><B>Sent:</B> Tuesday, July 22, 2003 9:25
AM<BR><B>To:</B> 'bio_bulletin_board@bioinformatics.org'<BR><B>Subject:</B>
[BiO BB] which one is the "best" protein sequence when multiple entries are
found?<BR><BR></FONT></DIV>
<P><B><FONT face=Arial size=2>Hi, everyone. In protein databases such as NCBI
and others, the same protein often has many different entries (different
accession numbers deposited by different authors at different times). To avoid
this redundancy, I need to pick one entry for each protein to work with. What
are the criteria for picking the "best" one (the most accurate)? Is there any
rule of thumb for a programmer? Thanks a lot. Ren (rxwang1@aol.com)</FONT></B><FONT
face=arial></FONT> </P></BLOCKQUOTE></BODY></HTML>