Table of Contents

Module: NCBIStandalone Bio/Blast/NCBIStandalone.py

NCBIStandalone.py

This module provides code to work with the standalone version of BLAST, either blastall or blastpgp, provided by the NCBI. http://www.ncbi.nlm.nih.gov/BLAST/

Classes:
    LowQualityBlastError     Exception that indicates low quality query sequences.
    BlastParser              Parses output from blast.
    BlastErrorParser         Parses output and tries to diagnose possible errors.
    PSIBlastParser           Parses output from psi-blast.
    Iterator                 Iterates over a file of blast results.

    _Scanner                 Scans output from standalone BLAST.
    _BlastConsumer           Consumes output from blast.
    _PSIBlastConsumer        Consumes output from psi-blast.
    _HeaderConsumer          Consumes header information.
    _DescriptionConsumer     Consumes description information.
    _AlignmentConsumer       Consumes alignment information.
    _HSPConsumer             Consumes hsp information.
    _DatabaseReportConsumer  Consumes database report information.
    _ParametersConsumer      Consumes parameters information.

Functions:
    blastall                 Execute blastall.
    blastpgp                 Execute blastpgp.

Imported modules   
from Bio import File
from Bio.Blast import Record
from Bio.ParserSupport import *
import os
import popen2
import re
import string
from types import *
Functions   
_get_cols
_re_search
_safe_float
_safe_int
blastall
blastpgp
  _get_cols 
_get_cols (
        line,
        cols_to_get,
        ncols=None,
        expected={},
        )

Exceptions   
SyntaxError, "I expected %d columns (got %d) in line\n%s" %( ncols, len( cols ), line )
SyntaxError, "I expected '%s' in column %d in line\n%s" %( expected [ k ], k, line )
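
The signature and the two error messages above suggest a helper that splits a line into columns, optionally validates them, and returns the requested ones. A minimal sketch of that behavior follows. It assumes whitespace-delimited columns and zero-based column indices; get_cols_sketch is a hypothetical name used for illustration, not the module's own code.

```python
def get_cols_sketch(line, cols_to_get, ncols=None, expected=None):
    """Hypothetical stand-in for _get_cols; not the module's actual code."""
    cols = line.split()  # assumption: columns are whitespace-delimited

    # Optionally check the total number of columns.
    if ncols is not None and len(cols) != ncols:
        raise SyntaxError("I expected %d columns (got %d) in line\n%s"
                          % (ncols, len(cols), line))

    # Optionally check that given columns hold the expected text.
    for k, value in (expected or {}).items():
        if cols[k] != value:
            raise SyntaxError("I expected '%s' in column %d in line\n%s"
                              % (value, k, line))

    # Return only the requested columns, in the requested order.
    return tuple(cols[c] for c in cols_to_get)


# Made-up input line, used only to exercise the sketch.
print(get_cols_sketch("Lambda K H", (0, 2), ncols=3, expected={0: "Lambda"}))
```

The sketch uses expected=None rather than the mutable default expected={} shown in the signature, which is the safer idiom in current Python.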
  _re_search 
_re_search (
        regex,
        line,
        error_msg,
        )

Exceptions   
SyntaxError, error_msg
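
Judging from its signature and its single exception, _re_search appears to wrap a regular expression search and fail with the caller's message when nothing matches. A minimal sketch under that assumption follows. The name re_search_sketch, the choice to return the captured groups, and the sample line are all illustrative and not taken from the module.

```python
import re


def re_search_sketch(regex, line, error_msg):
    """Hypothetical stand-in for _re_search; not the module's actual code."""
    match = re.search(regex, line)
    if match is None:
        # Fail with the message supplied by the caller.
        raise SyntaxError(error_msg)
    return match.groups()  # assumption: the captured groups are returned


# Made-up line in the style of a BLAST score line.
groups = re_search_sketch(r"Score =\s*([0-9.]+) bits \((\d+)\)",
                          " Score = 52.4 bits (124), Expect = 2e-08",
                          "I could not find the score in line")
print(groups)
```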
  _safe_float 
_safe_float ( str )

Thomas Rosleff Soerensen (rosleff@mpiz-koeln.mpg.de) noted that float('e-172') does not produce an error on his platform. Thus, we need to check the string for this condition.
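
One way to perform such a check is to normalize the string before converting it. The sketch below assumes that a bare exponent such as 'e-172' is meant as '1e-172'; that interpretation, and the name safe_float_sketch, are assumptions made for illustration and are not stated in the documentation above.

```python
def safe_float_sketch(text):
    """Hypothetical stand-in for _safe_float; not the module's actual code."""
    # Assumption: a leading 'e' or 'E' means the mantissa '1' was omitted.
    if text and text[0] in "eE":
        text = "1" + text
    return float(text)


print(safe_float_sketch("e-172"))
print(safe_float_sketch("2.5"))
```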

  _safe_int 
_safe_int ( str )

  blastall 
blastall (
        blastcmd,
        program,
        database,
        infile,
        **keywds,
        )

blastall(blastcmd, program, database, infile, **keywds) -> read, error Undohandles

Execute and retrieve data from blastall. blastcmd is the command used to launch the blastall executable. program is the blast program to use, e.g. blastp, blastn, etc. database is the path to the database to search against. infile is the path to the file containing the sequence to search with.

You may pass more parameters to **keywds to change the behavior of the search. Otherwise, optional values will be chosen by blastall.

Scoring
    matrix              Matrix to use.
    gap_open            Gap open penalty.
    gap_extend          Gap extension penalty.
    nuc_match           Nucleotide match reward. (BLASTN)
    nuc_mismatch        Nucleotide mismatch penalty. (BLASTN)
    query_genetic_code  Genetic code for Query.
    db_genetic_code     Genetic code for database. (TBLAST[NX])

Algorithm
    gapped              Whether to do a gapped alignment. T/F (not for TBLASTX)
    expectation         Expectation value cutoff.
    wordsize            Word size.
    strands             Query strands to search against database. ([T]BLAST[NX])
    keep_hits           Number of best hits from a region to keep.
    xdrop               Dropoff value (bits) for gapped alignments.
    hit_extend          Threshold for extending hits.
    region_length       Length of region used to judge hits.
    db_length           Effective database length.
    search_length       Effective length of search space.

Processing
    filter              Filter query sequence? T/F
    believe_query       Believe the query defline. T/F
    restrict_gi         Restrict search to these GI's.
    nprocessors         Number of processors to use.

Formatting
    html                Produce HTML output? T/F
    descriptions        Number of one-line descriptions.
    alignments          Number of alignments.
    align_view          Alignment view. Integer 0-6.
    show_gi             Show GI's in deflines? T/F
    seqalign_file       seqalign file to output.

Exceptions   
ValueError, "blastall does not exist at %s" % blastcmd
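
The description above implies two steps before the search runs: checking that blastcmd exists, and turning the keyword parameters into command-line options. The sketch below illustrates only those two steps and does not run anything. build_blastall_argv is a hypothetical helper, and the keyword-to-flag table is an assumed subset of the legacy blastall options; check it against the usage message of the installed blastall.

```python
import os

# Assumption: a small subset of legacy blastall options. Verify these
# against the installed blastall before relying on them.
FLAG_FOR_KEYWORD = {
    "matrix": "-M",
    "gap_open": "-G",
    "gap_extend": "-E",
    "expectation": "-e",
    "wordsize": "-W",
    "descriptions": "-v",
    "alignments": "-b",
}


def build_blastall_argv(blastcmd, program, database, infile, **keywds):
    """Hypothetical helper: assemble an argument list without running it."""
    if not os.path.exists(blastcmd):
        raise ValueError("blastall does not exist at %s" % blastcmd)
    argv = [blastcmd, "-p", program, "-d", database, "-i", infile]
    # Sort the keywords so the resulting order is deterministic.
    for name, value in sorted(keywds.items()):
        if name not in FLAG_FOR_KEYWORD:
            raise TypeError("unsupported keyword in this sketch: %s" % name)
        argv.extend([FLAG_FOR_KEYWORD[name], str(value)])
    return argv


# A path that does not exist triggers the documented ValueError.
try:
    build_blastall_argv("/no/such/blastall", "blastp", "nr", "query.fasta")
except ValueError as err:
    print(err)
```

Building a list of arguments rather than a single command string avoids shell quoting problems if the list is later handed to a process launcher.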
  blastpgp 
blastpgp (
        blastcmd,
        database,
        infile,
        **keywds,
        )

blastpgp(blastcmd, database, infile, **keywds) -> read, error Undohandles

Execute and retrieve data from blastpgp. blastcmd is the command used to launch the blastpgp executable. database is the path to the database to search against. infile is the path to the file containing the sequence to search with.

You may pass more parameters to **keywds to change the behavior of the search. Otherwise, optional values will be chosen by blastpgp.

Scoring
    matrix              Matrix to use.
    gap_open            Gap open penalty.
    gap_extend          Gap extension penalty.
    window_size         Multiple hits window size.
    npasses             Number of passes.
    passes              Hits/passes. Integer 0-2.

Algorithm
    gapped              Whether to do a gapped alignment. T/F
    expectation         Expectation value cutoff.
    wordsize            Word size.
    keep_hits           Number of best hits from a region to keep.
    xdrop               Dropoff value (bits) for gapped alignments.
    hit_extend          Threshold for extending hits.
    region_length       Length of region used to judge hits.
    db_length           Effective database length.
    search_length       Effective length of search space.
    nbits_gapping       Number of bits to trigger gapping.
    pseudocounts        Pseudocounts constants for multiple passes.
    xdrop_final         X dropoff for final gapped alignment.
    xdrop_extension     Dropoff for blast extensions.
    model_threshold     E-value threshold to include in multipass model.
    required_start      Start of required region in query.
    required_end        End of required region in query.

Processing
    XXX should document default values
    program             The blast program to use. (PHI-BLAST)
    filter              Filter query sequence with SEG? T/F
    believe_query       Believe the query defline? T/F
    nprocessors         Number of processors to use.

Formatting
    html                Produce HTML output? T/F
    descriptions        Number of one-line descriptions.
    alignments          Number of alignments.
    align_view          Alignment view. Integer 0-6.
    show_gi             Show GI's in deflines? T/F
    seqalign_file       seqalign file to output.
    align_outfile       Output file for alignment.
    checkpoint_outfile  Output file for PSI-BLAST checkpointing.
    restart_infile      Input file for PSI-BLAST restart.
    hit_infile          Hit file for PHI-BLAST.
    matrix_outfile      Output file for PSI-BLAST matrix in ASCII.
    align_infile        Input alignment file for PSI-BLAST restart.

Exceptions   
ValueError, "blastpgp does not exist at %s" % blastcmd
Classes   
BlastErrorParser

Attempt to catch and diagnose BLAST errors while parsing.

BlastParser

Parses BLAST data into a Record.Blast object.

Iterator

Iterates over a file of multiple BLAST results.

LowQualityBlastError

Error caused by running a low quality sequence through BLAST.

PSIBlastParser

Parses BLAST data into a Record.PSIBlast object.

_AlignmentConsumer

This is a little bit tricky. An alignment can either be a

_BlastConsumer
_DatabaseReportConsumer
_DescriptionConsumer
_HSPConsumer
_HeaderConsumer
_PSIBlastConsumer
_ParametersConsumer
_Scanner

Scan BLAST output from blastall or blastpgp.



This document was automatically generated on Mon Jul 1 12:02:46 2002 by HappyDoc version 2.0.1