Table of Contents

Module: genbank_format Bio/GenBank/genbank_format.py

Martel-based parser to read GenBank formatted files.

This is a huge regular expression for GenBank, built using the "regular expressions on steroids" capabilities of Martel.

Notes: Just so I remember -- the new end of line syntax is:

  New regexp syntax - \R
    \R means "\n|\r\n?"
    [\R] means "[\n\r]"

This helps us have endlines be consistent across platforms.
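
As a rough illustration of what the \R shorthand stands for, the sketch below spells out the same two patterns with Python's standard re module. Python's re does not understand \R itself, so the expansions from the notes are written by hand; the names EOL, EOL_CHARS and split_lines are made up for this example and are not part of Martel or of this module.

```python
import re

# Hand-written expansion of Martel's \R: one whole line ending,
# whether Unix (\n), Windows (\r\n) or old Mac (\r).
EOL = r"(?:\n|\r\n?)"

# Hand-written expansion of [\R]: a single end-of-line character.
EOL_CHARS = r"[\n\r]"


def split_lines(text):
    # Split on complete line endings, so "\r\n" counts as one break.
    return re.split(EOL, text)


mixed = "LOCUS\nDEFINITION\r\nACCESSION\rVERSION"
print(split_lines(mixed))

# The character class treats "\r\n" as two separate characters,
# which leaves an empty string between them when splitting.
print(re.split(EOL_CHARS, "a\r\nb"))
```

The alternation matches a full line ending in one step, which is why it is the form to use when a pattern must consume exactly one end of line regardless of platform.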

Documentation for GenBank format that I found:

  • GenBank/EMBL feature tables are described at: http://www.ebi.ac.uk/embl/Documentation/FT_definitions/feature_table.html

  • There are also descriptions of different GenBank lines at: http://www.ibc.wustl.edu/standards/gbrel.txt

Imported modules   
from Bio import Std
import Martel
from Martel import RecordReader
import string
Functions   
define_block
define_block (
        identifier,
        block_tag,
        block_data,
        std_block_tag=None,
        std_tag=None,
        )

Define a Martel grouping which can parse a block of text.

Many of the GenBank lines we'll want to process are grouped into a block like:

IDENTIFIER Blah blah blah

Here the "Blah blah blah" text can wrap across multiple lines. This function makes it easy to define these blocks consistently.

Arguments:
  o identifier - The identifier that begins the block (like DEFINITION).
  o block_tag - A callback tag for the entire block.
  o block_data - A callback tag for the data in the block (i.e. the stuff you are interested in).
  o std_block_tag - A Bio.Std Martel tag used to register the entire block as being a "standard" type of information.
  o std_tag - A Bio.Std Martel tag used to register just the information in the block as being "standard".
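
To make the block layout concrete, here is a minimal sketch of matching an identifier followed by data that wraps onto indented continuation lines. It uses only Python's standard re module and is not how define_block is implemented: define_block builds a Martel grouping with callback tags, whereas this hypothetical parse_block helper just returns the block's text. The function name, the assumption that continuation lines start with spaces, and the sample record are all invented for illustration.

```python
import re


def parse_block(identifier, text):
    """Return the data of the first block starting with identifier, or None.

    Assumes (for this sketch only) that continuation lines begin with
    one or more spaces and that a new block starts in column one.
    """
    pattern = re.compile(
        # The identifier must sit at the start of the text or of a line.
        r"(?:\A|(?<=[\n\r]))" + re.escape(identifier) + r"[ ]+"
        # Data on the identifier line itself.
        r"(?P<first>[^\n\r]*)"
        # Zero or more indented continuation lines.
        r"(?P<rest>(?:(?:\n|\r\n?)[ ]+[^\n\r]*)*)"
    )
    match = pattern.search(text)
    if match is None:
        return None
    # Collapse the line breaks and indentation into single spaces.
    return " ".join((match.group("first") + match.group("rest")).split())


# Made-up sample record, laid out like a GenBank header.
record = (
    "LOCUS       AB000001\n"
    "DEFINITION  Homo sapiens mRNA for example protein,\n"
    "            complete cds.\n"
    "ACCESSION   AB000001\n"
)

print(parse_block("DEFINITION", record))
print(parse_block("KEYWORDS", record))
```

In the terms used by the argument list, the whole match plays the role of the block and the returned string plays the role of the block data.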

  • useful functions



This document was automatically generated on Mon Jul 1 12:02:49 2002 by HappyDoc version 2.0.1