Table of Contents

Module: Parser Martel/Parser.py

Implements the Martel parsers.

The classes in this module are used by other Martel modules and not typically by external users.

There are two major parsers, Parser and RecordParser. The first is the standard one, which parses the file as one string in memory then generates the SAX events. The other reads a record at a time using a RecordReader and generates events after each read. The generated event callbacks are identical.
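The difference between the two strategies can be sketched without Martel itself. The following is a minimal illustration, not the module's implementation: `EventLog`, `emit_record`, `parse_in_memory` and `parse_by_record` are hypothetical names, and a "record" is assumed to be a single line. It only shows that reading everything up front and reading one record at a time can drive a SAX ContentHandler with the same sequence of callbacks.

```python
import io
from xml.sax.handler import ContentHandler


class EventLog(ContentHandler):
    """Collects the SAX events it receives so two runs can be compared."""

    def __init__(self):
        super().__init__()
        self.events = []

    def startDocument(self):
        self.events.append(("startDocument",))

    def endDocument(self):
        self.events.append(("endDocument",))

    def startElement(self, name, attrs):
        self.events.append(("startElement", name))

    def endElement(self, name):
        self.events.append(("endElement", name))

    def characters(self, content):
        self.events.append(("characters", content))


def emit_record(record, handler):
    # Hypothetical stand-in for the real per-record parse.
    handler.startElement("record", {})
    handler.characters(record)
    handler.endElement("record")


def parse_in_memory(infile, handler):
    """Read the whole input as one string, then generate the events."""
    text = infile.read()
    handler.startDocument()
    for record in text.splitlines(keepends=True):
        emit_record(record, handler)
    handler.endDocument()


def parse_by_record(infile, handler):
    """Read one record at a time and generate events after each read."""
    handler.startDocument()
    for record in infile:  # assumption: one line is one record
        emit_record(record, handler)
    handler.endDocument()


data = "ID 1\nID 2\n"
whole, streamed = EventLog(), EventLog()
parse_in_memory(io.StringIO(data), whole)
parse_by_record(io.StringIO(data), streamed)
print(whole.events == streamed.events)
```

The record-at-a-time variant never holds more than one record in memory, which is the practical reason to prefer it for large inputs.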

At some level, both parsers use "_do_callback" to convert mxTextTools tags into SAX events.

XXX finish this documentation

XXX need a better way to get closer to the likely error position when parsing.

XXX need to implement Locator

Imported modules   
import Dispatch
import pprint
import string
import sys
import traceback
import urllib
from xml.sax import xmlreader, _exceptions, handler, saxutils
Functions   
_do_callback
_do_dispatch_callback
_parse_elements
  _do_callback 
_do_callback (
        s,
        begin,
        end,
        taglist,
        cont_handler,
        attrlookup,
        )

internal function to convert the tagtable into ContentHandler events

s is the input text
begin is the current position in the text
end is 1 past the last position of the text allowed to be parsed
taglist is the tag list from mxTextTools.parse
cont_handler is the SAX ContentHandler
attrlookup is a dict mapping the encoded tag name to the element info

Exceptions   
AssertionError("Unknown special tag %s" % repr( tag ) )
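The conversion from a tag list to ContentHandler events can be pictured with a simplified sketch. It assumes the mxTextTools tag list layout of `(tag, left, right, subtags)` tuples and treats every tag as an ordinary element. The special tags (the source of the AssertionError above) and the attrlookup handling are omitted, and `walk_taglist` and `Recorder` are hypothetical names, so this is an illustration of the idea rather than the actual `_do_callback`.

```python
from xml.sax.handler import ContentHandler
from xml.sax.xmlreader import AttributesImpl


class Recorder(ContentHandler):
    """Records each event as a tuple so the output is easy to inspect."""

    def __init__(self):
        super().__init__()
        self.events = []

    def startElement(self, name, attrs):
        self.events.append(("start", name))

    def endElement(self, name):
        self.events.append(("end", name))

    def characters(self, content):
        self.events.append(("chars", content))


def walk_taglist(s, begin, end, taglist, cont_handler):
    """Emit SAX events for the text s[begin:end] described by taglist.

    Simplification: every tag becomes an element, and text not covered
    by a child tag is sent as characters.
    """
    pos = begin
    for name, left, right, subtags in taglist:
        if pos < left:
            # Untagged text before this tag.
            cont_handler.characters(s[pos:left])
        cont_handler.startElement(name, AttributesImpl({}))
        walk_taglist(s, left, right, subtags or [], cont_handler)
        cont_handler.endElement(name)
        pos = right
    if pos < end:
        # Untagged text after the last tag.
        cont_handler.characters(s[pos:end])


text = "ID 42\n"
# Hand-written tag list in the assumed (tag, left, right, subtags) layout.
taglist = [("record", 0, 6, [("id", 3, 5, None)])]
recorder = Recorder()
walk_taglist(text, 0, len(text), taglist, recorder)
for event in recorder.events:
    print(event)
```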
  _do_dispatch_callback 
_do_dispatch_callback (
        s,
        begin,
        end,
        taglist,
        start_table_get,
        cont_handler,
        save_stack,
        end_table_get,
        attrlookup,
        )

internal function to convert the tagtable into ContentHandler events

THIS IS A SPECIAL CASE FOR Dispatch.Dispatcher objects

s is the input text
begin is the current position in the text
end is 1 past the last position of the text allowed to be parsed
taglist is the tag list from mxTextTools.parse
start_table_get is the Dispatcher._start_table
cont_handler is the Dispatcher, which is used as the SAX ContentHandler
end_table_get is the Dispatcher._end_table
attrlookup is a dict mapping the encoded tag name to the element info

  _parse_elements 
_parse_elements (
        s,
        tagtable,
        cont_handler,
        debug_level,
        attrlookup,
        )

parse the string with the tagtable and send the ContentHandler events

Specifically, it sends the startElement, endElement and characters events but not startDocument and endDocument.
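Because the document-level events are not sent here, a caller has to supply them. The sketch below shows that division of labour under stated assumptions: `send_element_events` is a hypothetical stand-in for `_parse_elements` that wraps each line in a `line` element, and `parse_document` is a hypothetical caller. The standard `xml.sax.saxutils.XMLGenerator` is used as the ContentHandler so the resulting events can be seen as XML text.

```python
import io
from xml.sax.saxutils import XMLGenerator
from xml.sax.xmlreader import AttributesImpl


def send_element_events(text, cont_handler):
    """Hypothetical stand-in for _parse_elements: element-level events only."""
    for line in text.splitlines():
        cont_handler.startElement("line", AttributesImpl({}))
        cont_handler.characters(line)
        cont_handler.endElement("line")


def parse_document(text, cont_handler):
    """Hypothetical caller that brackets the element events with document events."""
    cont_handler.startDocument()
    send_element_events(text, cont_handler)
    cont_handler.endDocument()


out = io.StringIO()
parse_document("ID 42\nID 43", XMLGenerator(out))
print(out.getvalue())
```

Keeping the document events in the caller lets the same element-level routine be reused for a whole file or for one record among many.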

Classes   
HeaderFooterParser

Header followed by 0 or more records followed by a footer

MartelAttributeList

The SAX startElements take an AttributeList as the second argument.

Parser

Parse the input data all in memory

ParserException

used when a parse cannot be done

ParserIncompleteException
ParserPositionException
ParserRecordException

used by the RecordParser when it can't read a record

RecordParser

Parse the input data a record at a time
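
As the MartelAttributeList entry notes, a SAX startElement call receives the attributes as its second argument. The sketch below shows how a handler reads them, using the standard library's `xml.sax.xmlreader.AttributesImpl` in place of MartelAttributeList. The `AttrCollector` class, the `entry` element and the `format` attribute are made up for illustration.

```python
from xml.sax.handler import ContentHandler
from xml.sax.xmlreader import AttributesImpl


class AttrCollector(ContentHandler):
    """Stores (element, attribute, value) triples seen in startElement."""

    def __init__(self):
        super().__init__()
        self.seen = []

    def startElement(self, name, attrs):
        # attrs is the second argument; query it by attribute name.
        for key in attrs.getNames():
            self.seen.append((name, key, attrs.getValue(key)))


collector = AttrCollector()
# Hypothetical element and attribute, passed the way a parser would pass them.
collector.startElement("entry", AttributesImpl({"format": "swissprot"}))
print(collector.seen)
```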



This document was automatically generated on Mon Jul 1 12:03:20 2002 by HappyDoc version 2.0.1