Classes for nodes in the Expression tree
Expression
|--- Any - match (or don't match) a set of characters
|--- AnyEol - match any newline representation ("\n", "\r" or "\r\n")
|--- Assert - used for positive and negative lookahead assertions
|--- AtBeginning - match the beginning of a line
|--- AtEnd - match the end of a line
|--- Debug - print a debug message
|--- Dot - match any character except newline
|--- Group - give a group name to an expression
|--- GroupRef - match a previously identified expression
|--- Literal - match (or don't match) a single character
|--- MaxRepeat - greedy repeat of an expression, within min/max bounds
|--- NullOp - does nothing (useful as an initial seed)
|--- PassThrough - used when overriding 'make_parser'; match its subexp
| |--- FastFeature - keeps information about possibly optional tags
| |--- HeaderFooter - files with a header, records and a footer
| `--- ParseRecords - parse a record at a time
|--- Str - match a given string
`--- ExpressionList - expressions containing several subexpressions
  |--- Alt - subexp1 or subexp2 or subexp3 or ...
  `--- Seq - subexp1 followed by subexp2 followed by subexp3 ...
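As a rough illustration of how such a tree composes, the sketch below builds a tiny tree from three of the node types listed above and flattens it to a regular expression. It is not Martel's implementation: the constructor signatures and the pattern() method are assumptions made up for this example.

```python
import re


class Expression:
    """Base class for nodes in this illustrative tree."""

    def pattern(self):  # hypothetical method, not from the module
        raise NotImplementedError


class Str(Expression):
    """Match a given string."""

    def __init__(self, s):
        self.s = s

    def pattern(self):
        return re.escape(self.s)


class ExpressionList(Expression):
    """Shared storage for nodes holding several subexpressions."""

    def __init__(self, expressions):
        self.expressions = list(expressions)


class Seq(ExpressionList):
    """subexp1 followed by subexp2 followed by subexp3 ..."""

    def pattern(self):
        return "".join(e.pattern() for e in self.expressions)


class Alt(ExpressionList):
    """subexp1 or subexp2 or subexp3 or ..."""

    def pattern(self):
        return "(?:" + "|".join(e.pattern() for e in self.expressions) + ")"


# "ID " followed by either "cat" or "dog"
tree = Seq([Str("ID "), Alt([Str("cat"), Str("dog")])])
matches_dog = re.fullmatch(tree.pattern(), "ID dog") is not None
matches_cow = re.fullmatch(tree.pattern(), "ID cow") is not None
```

Here matches_dog is True and matches_cow is False, since "cow" is not one of the alternatives.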
Imported modules
import Parser
import msre_parse
import re
import string
from xml.sax import xmlreader
Functions
NoCase ( expr )
    expression -> expression whose text is matched case-insensitively
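One way to picture removing case dependencies is to rewrite every cased character as a two-character set. The sketch below does this for a plain string using only the standard re module; the function name no_case_pattern is hypothetical, and the sketch works on strings rather than on Expression nodes, so it only approximates what NoCase does.

```python
import re


def no_case_pattern(text):
    """Return a regex matching `text` regardless of letter case."""
    parts = []
    for c in text:
        lower, upper = c.lower(), c.upper()
        # Only expand characters with a simple one-to-one case pair.
        if lower != upper and len(lower) == 1 and len(upper) == 1:
            parts.append("[" + lower + upper + "]")
        else:
            parts.append(re.escape(c))
    return "".join(parts)


pattern = no_case_pattern("ab1")  # "[aA][bB]1"
```

The resulting pattern matches "ab1", "AB1" and "aB1" without needing the re.IGNORECASE flag.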
_make_fast_lookup ()
_make_group_pattern ( name, expression, attrs )
_make_no_case ( node )
    Modify an expression in place to remove case dependencies.
    May return a new top-level node.
_minimize_any_range ( s )
    s -> a string usable inside [] which matches all the characters in s
    For example, passing in "0123456789" returns "\d".
    This code isn't perfect.
_minimize_escape_char ( c )
    c -> an appropriately escaped pattern for the character
_minimize_escape_range ( c1, c2 )
    (c1, c2) -> the pattern for the range bounded by those two characters
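The three _minimize_* helpers above fit together naturally: escape single characters, collapse runs of consecutive characters into ranges, and join the pieces. The sketch below shows one possible way to do that. The function names, the set of escaped characters and the special case for digits are assumptions for illustration, not the module's actual code.

```python
def escape_char(c):
    """Escape a character so it is taken literally inside []."""
    if c in "\\]^-[":
        return "\\" + c
    return c


def escape_range(c1, c2):
    """Pattern for the inclusive range c1..c2 inside []."""
    if c1 == c2:
        return escape_char(c1)
    if ord(c2) == ord(c1) + 1:
        # A two-character run is no shorter as a range.
        return escape_char(c1) + escape_char(c2)
    return escape_char(c1) + "-" + escape_char(c2)


def minimize_any_range(s):
    """Return a string usable inside [] matching every character in s."""
    if not s:
        return ""
    chars = sorted(set(s))
    if chars == list("0123456789"):
        # Mirrors the documented example. In Python 3, \d also matches
        # non-ASCII digits unless re.ASCII is used.
        return r"\d"
    parts = []
    start = prev = chars[0]
    for c in chars[1:]:
        if ord(c) == ord(prev) + 1:
            prev = c  # extend the current run
        else:
            parts.append(escape_range(start, prev))
            start = prev = c
    parts.append(escape_range(start, prev))
    return "".join(parts)


digits = minimize_any_range("0123456789")  # "\d"
letters = minimize_any_range("zyxcba")     # "a-cx-z"
```

Sorting and de-duplicating first keeps the run detection simple; the trade-off is that the output order is by code point rather than by the order in the input.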
_quote ( s )
_verify_name ( s )
    Group names must be valid XML identifiers.
    Exceptions
        AssertionError, "Illegal character in group name %s" % repr( s )
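A check of this kind can be sketched with a single regular expression. The version below is a simplification: it accepts only the ASCII subset of XML names, and the names _NAME_RE and verify_name are hypothetical. It raises the same exception type and message shown above.

```python
import re

# ASCII-only approximation of an XML name: a letter, "_" or ":" first,
# then letters, digits, "_", ":", "." or "-".
_NAME_RE = re.compile(r"[A-Za-z_:][A-Za-z0-9_:.\-]*\Z")


def verify_name(s):
    """Raise AssertionError if s is not an acceptable group name."""
    if not _NAME_RE.match(s):
        raise AssertionError("Illegal character in group name %s" % repr(s))


verify_name("entry_name")  # passes silently

try:
    verify_name("bad name")  # the space is not allowed
    rejected = False
except AssertionError:
    rejected = True
```

After this runs, rejected is True, because the space in "bad name" is not a legal name character.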
escape ( pattern )
    Escape all non-alphanumeric characters in pattern.
    Taken from re.escape, except that the characters in " =" (space and
    equals sign) are also left unescaped.
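The behaviour described for escape can be approximated in a few lines. This is a minimal sketch under the assumption that "alphanumeric" means ASCII letters and digits; the name escape_sketch is hypothetical, and the real function may treat other characters (such as "_") differently.

```python
import string

# Characters passed through unchanged: alphanumerics plus " " and "=".
_UNESCAPED = string.ascii_letters + string.digits + " ="


def escape_sketch(pattern):
    """Backslash-escape every character not in _UNESCAPED."""
    return "".join(c if c in _UNESCAPED else "\\" + c for c in pattern)


escaped = escape_sketch("a=b c.d")  # 'a=b c\\.d'
```

Leaving the space and "=" alone keeps escaped patterns for text such as key = value lines readable.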
Classes
Alt - An Expression tree with a list of alternate matches.
Any - Any character in a given set: '[abc]'
AnyEol - Match a newline ("\n", "\r" or "\r\n")
Assert - Lookahead assertions: (?=...)
AtBeginning - Match the beginning of a line
AtEnd - Match the end of a line
Debug - Print a message when there is a match at this point.
Dot - Match any character except newline
Expression - Base class for nodes in the Expression tree
ExpressionList - Shared implementation for Expressions with subexpressions
FastFeature
Group
GroupRef - Group reference: '(?P<name>.)(?P=name)'
HeaderFooter
Literal - A single character: a
MaxRepeat - Greedy repeat: a*
NullOp - Does nothing
ParseRecords - Might be useful to allow a minimum record count (likely either 0 or 1)
PassThrough - Match the subexpression.
Seq - An Expression matching a set of subexpressions, in sequential order
Str - A sequence of characters: 'abcdef'
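The pattern fragments quoted in the class summaries use ordinary regular-expression notation. The sketch below runs a few of them through Python's standard re module to show what each one matches. It uses re directly rather than the classes listed here, and the sample patterns for Assert and AnyEol are assumptions chosen for illustration.

```python
import re

# Any: any character in a given set
any_match = re.fullmatch(r"[abc]", "b") is not None

# GroupRef: the second character must repeat the first
group_ref = re.match(r"(?P<name>.)(?P=name)", "aab").group(0)

# Assert: lookahead checks what follows without consuming it
lookahead = re.match(r"foo(?=bar)", "foobar").group(0)

# MaxRepeat: greedy repeat takes as many characters as possible
greedy = re.match(r"a*", "aaab").group(0)

# AnyEol: any of the three newline representations
eol = re.compile(r"\r\n|\n|\r")
eol_ok = all(eol.fullmatch(nl) is not None for nl in ("\n", "\r", "\r\n"))
```

Here group_ref is "aa", lookahead is "foo" (the "bar" is checked but not consumed), and greedy is "aaa".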