<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN">

<HTML><HEAD>

<META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=gb2312">

<TITLE>which one is the "best" protein sequence when multiple entries are found?</TITLE>


<META content="MSHTML 6.00.2800.1170" name=GENERATOR></HEAD>

<BODY>

<DIV><SPAN class=531302921-22072003><FONT face=Arial color=#0000ff size=2>I 

would suggest picking the entry with the longest sequence for each protein.</FONT></SPAN></DIV>
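
<DIV><SPAN class=531302921-22072003><FONT face=Arial color=#0000ff size=2>A minimal sketch of that rule of thumb in Python, assuming the entries for one protein have already been grouped together. The function name pick_longest and the accessions and sequences below are made up for illustration. It applies only the length heuristic and does not check accuracy.</FONT></SPAN></DIV>

<PRE>
```python
# Hypothetical helper: keep the longest sequence among redundant entries.
def pick_longest(entries):
    """entries maps accession to sequence for ONE protein; returns (accession, sequence)."""
    # Compare by length first, then by accession, so ties resolve deterministically.
    return max(entries.items(), key=lambda item: (len(item[1]), item[0]))


# Made-up accessions and sequences, for illustration only.
redundant = {
    "ACC0001": "MKTAYIAKQR",
    "ACC0002": "MKTAYIAKQRQISFVKSHFSRQ",
    "ACC0003": "MKTAYIAKQRQISF",
}

best_accession, best_sequence = pick_longest(redundant)
print(best_accession, len(best_sequence))
```
</PRE>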

<DIV><SPAN class=531302921-22072003><FONT face=Arial color=#0000ff 

size=2></FONT></SPAN> </DIV>

<DIV><SPAN class=531302921-22072003><FONT face=Arial color=#0000ff size=2>- 

Joel</FONT></SPAN></DIV>

<BLOCKQUOTE dir=ltr style="MARGIN-RIGHT: 0px">

  <DIV class=OutlookMessageHeader dir=ltr align=left><FONT face=Tahoma 

  size=2>-----Original Message-----<BR><B>From:</B> Renxue Wang 

  [mailto:rwang@bccancer.bc.ca]<BR><B>Sent:</B> Tuesday, July 22, 2003 9:25 

  AM<BR><B>To:</B> 'bio_bulletin_board@bioinformatics.org'<BR><B>Subject:</B> 

  [BiO BB] which one is the "best" protein sequence when multiple entries are 

  found?<BR><BR></FONT></DIV>

  <P><B><FONT face=Arial size=2>Hi, everyone. In protein databases such as NCBI 

  and others, the same protein often has many different entries (different 

  accession numbers deposited by different authors at different times). To avoid 

  redundancy, I need to pick one entry for each protein to work with. What 

  are the criteria for picking the "best" one (the most accurate)? Is there any rule of thumb 

  for a programmer? Thanks a lot. Ren (rxwang1@aol.com)</FONT></B><FONT 

  face=arial></FONT> </P></BLOCKQUOTE></BODY></HTML>