Bioinformatics.org
  • News & Commentary - Message forums

    PLoS Currents: Mining the NCBI Influenza Sequence Database
    Submitted by Dr. Leonid Zaslavsky; posted on Tuesday, November 03, 2009

    Mining the NCBI Influenza Sequence Database: adaptive grouping of BLAST results using precalculated neighbor indexing - a knol by Leonid Zaslavsky and Tatiana Tatusova

    The Influenza Virus Resource and other Virus Variation Resources at NCBI provide web-based visualization tools for exploratory analysis of influenza sequence data. Despite these improvements in data analysis, the initial data retrieval remains unsophisticated: because the database contains a large number of identical and nearly identical sequences, a search frequently returns huge and imbalanced datasets.

    We propose a data mining algorithm that organizes the reported sequences into groups based on their relatedness to the query sequence and to each other. The algorithm uses BLAST to find database sequences related to the query. Neighbor lists, precalculated from pairwise BLAST alignments between database sequences, are then used to organize the results into groups of nearly identical and strongly related sequences. We propose a non-symmetric dissimilarity measure designed to handle sequences of different lengths (fragments).
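
    The grouping idea described above can be illustrated with a minimal sketch. This is not the authors' implementation: the dissimilarity formula (one minus identities divided by the length of the first sequence), the threshold, the greedy grouping order, and all function and variable names are assumptions made for illustration only. The post does not give the actual measure or grouping rule.

```python
from typing import Dict, List, Set, Tuple


def asymmetric_dissimilarity(identities: int, len_a: int) -> float:
    """Hypothetical measure d(a -> b): share of sequence a NOT matched identically by b."""
    if len_a <= 0:
        raise ValueError("sequence length must be positive")
    return 1.0 - identities / len_a


def build_neighbors(
    alignments: List[Tuple[str, str, int]],  # (id_a, id_b, identical positions)
    lengths: Dict[str, int],
    threshold: float,
) -> Dict[str, Set[str]]:
    """Precalculate neighbor lists; b is a neighbor of a when d(a -> b) <= threshold."""
    neighbors: Dict[str, Set[str]] = {}
    for a, b, identities in alignments:
        if asymmetric_dissimilarity(identities, lengths[a]) <= threshold:
            neighbors.setdefault(a, set()).add(b)
        if asymmetric_dissimilarity(identities, lengths[b]) <= threshold:
            neighbors.setdefault(b, set()).add(a)
    return neighbors


def group_hits(
    ranked_hits: List[str],  # hits ordered by relatedness to the query
    neighbors: Dict[str, Set[str]],
) -> Dict[str, List[str]]:
    """Greedy grouping (an assumption): each ungrouped hit absorbs its ungrouped neighbors."""
    assigned: Set[str] = set()
    groups: Dict[str, List[str]] = {}
    for hit in ranked_hits:
        if hit in assigned:
            continue
        members = [hit]
        assigned.add(hit)
        for other in ranked_hits:
            if other not in assigned and other in neighbors.get(hit, set()):
                members.append(other)
                assigned.add(other)
        groups[hit] = members
    return groups


# Toy data: a 300-residue fragment that matches a 1700-residue sequence exactly.
lengths = {"full": 1700, "frag": 300}
toy_neighbors = build_neighbors([("full", "frag", 300)], lengths, threshold=0.05)

# Toy grouping over four made-up hit identifiers.
toy_groups = group_hits(
    ["A", "B", "C", "D"],
    {"A": {"B"}, "C": {"D"}, "D": {"C"}},
)
```

    In this toy setup the fragment lists the full-length sequence as a neighbor, but not the reverse, which is the kind of asymmetry a length-aware measure would produce. Because the neighbor lists are computed ahead of time, grouping a set of BLAST hits reduces to set lookups rather than new alignments.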

    The balanced and representative dataset produced by this tool can be used for further analysis, e.g., multiple sequence alignment and phylogenetic tree construction. The algorithm is implemented for protein-coding sequences and is being integrated with the NCBI Influenza Virus Resource.

    ARTICLE

    knol.google.com/k/le[...]on=2#

    Published in PLoS Currents: Influenza (www.plos.org)

    ARTICLE

    Zaslavsky, Leonid; Tatusova, Tatiana. Mining the NCBI Influenza Sequence Database: adaptive grouping of BLAST results using precalculated neighbor indexing [Internet]. Version 136. PLoS Currents: Influenza. 2009 Oct 30:RRN1124.

    Copyright © 2021 Scilico, LLC