From 05a709aaeaf299979a19aa9c3cc0569b9bead8f1 Mon Sep 17 00:00:00 2001
From: David Beazley
+When a syntax error occurs, yacc.py performs the following steps:
+
+1. On the first occurrence of a syntax error, the user-defined p_error() function
+   is called with the offending token as an argument. However, if the syntax error
+   is due to reaching the end-of-file, p_error() is called with an argument of None.
+   Afterwards, the parser enters an "error-recovery" mode in which it will not make
+   future calls to p_error() until it has successfully shifted at least 3 tokens onto
+   the parsing stack.
+
+2. If no recovery action is taken in p_error(), the offending lookahead token is
+   replaced with a special error token.
+
+3. If the offending lookahead token is already set to error, the top item of the
+   parsing stack is deleted.
+
+4. If the entire parsing stack is unwound, the parser enters a restart state and
+   attempts to start parsing from its initial state.
+
+5. If a grammar rule accepts error as a token, it will be shifted onto the
+   parsing stack.
+
+6. If the top item of the parsing stack is error, lookahead tokens will be discarded
+   until the parser can successfully shift a new symbol or reduce a rule involving
+   error.
+
+This type of recovery is sometimes known as parser resynchronization.
+The error token acts as a wildcard for any bad input text and
+the token immediately following error acts as a
+synchronization token.
+
+
+It is important to note that the error token usually does not appear as the last token
+on the right in an error rule. For example:
+
+
+Panic mode recovery is implemented entirely in the p_error() function. For example, this
+function starts discarding tokens until it reaches a closing '}'. Then, it restarts the
+parser in its initial state.
+
+
+This function simply discards the bad token and tells the parser that the error was ok.
+
+
+More information on these methods is as follows:
+
+parser.errok(). This resets the parser state so it doesn't think it's in error-recovery
+mode. This prevents an error token from being generated and resets the internal error
+counters so that the next syntax error calls p_error() again.
+
+parser.token(). This returns the next token on the input stream.
+
+parser.restart(). This discards the entire parsing stack and resets the parser to its
+initial state.
+
+To supply the next lookahead token to the parser, p_error() can return a token. This might be
+useful if trying to synchronize on special characters. For example:
+
+
+Keep in mind that in the above error handling functions,
+parser is an instance of the parser created by
+yacc(). You'll need to save this instance someplace in your
+code so that you can refer to it during error handling.
+
+One important aspect of manually setting an error is that the p_error() function will NOT be
+called in this case. If you need to issue an error message, make sure you do it in the production that
+raises SyntaxError.
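For instance, a production that reports its own message before forcing error recovery might look like this (the rule and message are illustrative, not taken from PLY itself):

```python
def p_bad_assignment(p):
    'statement : ID EQUALS error'
    # p_error() will NOT be called for this error, so the message
    # must be issued here, in the production that raises SyntaxError
    print("Line %d: bad expression on right side of assignment" % p.lineno(1))
    raise SyntaxError
```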
+
+
+Note: This feature of PLY is meant to mimic the behavior of the YYERROR macro in yacc.
+
+
+In most cases, yacc will handle errors as soon as a bad input token is
+detected on the input. However, be aware that yacc may choose to
+delay error handling until after it has reduced one or more grammar
+rules first. This behavior might be unexpected, but it's related to
+special states in the underlying parsing table known as "defaulted
+states." A defaulted state is a parsing condition where the same
+grammar rule will be reduced regardless of what valid token
+comes next on the input. For such states, yacc chooses to go ahead
+and reduce the grammar rule without reading the next input
+token. If the next token is bad, yacc will eventually get around to reading it and
+report a syntax error. It's just a little unusual in that you might
+see some of your grammar rules firing immediately prior to the syntax
+error.
+
+Usually, the delayed error reporting with defaulted states is harmless
+(and there are other reasons for wanting PLY to behave in this way).
+However, if you need to turn this behavior off for some reason, you
+can clear the defaulted states table like this:
+
+Disabling defaulted states is not recommended if your grammar makes use
+of embedded actions as described in Section 6.11.
+Although it may be convenient for PLY to track position information on
+all grammar symbols, this is often unnecessary. For example, if you
+are merely using line number information in an error message, you can
+often just key off of a specific token in the grammar rule. For
+example:
+
+
+Similarly, you may get better parsing performance if you only
+selectively propagate line number information where it's needed using
+the p.set_lineno() method. For example:
+
+ A minimal way to construct a tree is to simply create and
+propagate a tuple or list in each grammar rule function. There
+are many possible ways to do this, but one example would be something
+like this:
+
+
+Another approach is to create a set of data structures for different
+kinds of abstract syntax tree nodes and assign nodes to p[0]
+in each rule. For example:
+
+
+To simplify tree traversal, it may make sense to pick a very generic
+tree structure for your parse tree nodes. For example:
+
+
+In this case, the supplied action code only executes after all of the
+symbols A, B, C, and D have been
+parsed. Sometimes, however, it is useful to execute small code
+fragments during intermediate stages of parsing. For example, suppose
+you wanted to perform some action immediately after A has
+been parsed. To do this, write an empty rule like this:
+
+
+In this example, the empty seen_A rule executes immediately
+after A is shifted onto the parsing stack. Within this
+rule, p[-1] refers to the symbol on the stack that appears
+immediately to the left of the seen_A symbol. In this case,
+it would be the value of A in the foo rule
+immediately above. Like other rules, a value can be returned from an
+embedded action by simply assigning it to p[0]
+
+
+The use of embedded actions can sometimes introduce extra shift/reduce conflicts. For example,
+this grammar has no conflicts:
+
+
+A common use of embedded rules is to control other aspects of parsing
+such as scoping of local variables. For example, if you were parsing C code, you might
+write code like this:
+
+
+
+
+Normally, the parsetab.py file is placed into the same directory as
+the module where the parser is defined. If you want it to go somewhere else, you can
+give an absolute package name for tabmodule instead. In that case, the
+tables will be written there.
+
+
+Note: Be aware that unless the directory specified is also on Python's path (sys.path), subsequent
+imports of the table file will fail. As a general rule, it's better to specify a destination using the
+tabmodule argument instead of directly specifying a directory using the outputdir argument.
+
+
+
+
+It should be noted that table generation is reasonably efficient, even for grammars that involve around 100 rules
+and several hundred states.
+
+
+
+The different states that appear in this file are a representation of
+every possible sequence of valid input tokens allowed by the grammar.
+When receiving input tokens, the parser is building up a stack and
+looking for matching rules. Each state keeps track of the grammar
+rules that might be in the process of being matched at that point. Within each
+rule, the "." character indicates the current location of the parse
+within that rule. In addition, the actions for each valid input token
+are listed. When a shift/reduce or reduce/reduce conflict arises,
+rules not selected are prefixed with an !. For example:
+
+
+Unused terminals:
+
+
+Grammar
+
+Rule 1 expression -> expression PLUS expression
+Rule 2 expression -> expression MINUS expression
+Rule 3 expression -> expression TIMES expression
+Rule 4 expression -> expression DIVIDE expression
+Rule 5 expression -> NUMBER
+Rule 6 expression -> LPAREN expression RPAREN
+
+Terminals, with rules where they appear
+
+TIMES : 3
+error :
+MINUS : 2
+RPAREN : 6
+LPAREN : 6
+DIVIDE : 4
+PLUS : 1
+NUMBER : 5
+
+Nonterminals, with rules where they appear
+
+expression : 1 1 2 2 3 3 4 4 6 0
+
+
+Parsing method: LALR
+
+
+state 0
+
+ S' -> . expression
+ expression -> . expression PLUS expression
+ expression -> . expression MINUS expression
+ expression -> . expression TIMES expression
+ expression -> . expression DIVIDE expression
+ expression -> . NUMBER
+ expression -> . LPAREN expression RPAREN
+
+ NUMBER shift and go to state 3
+ LPAREN shift and go to state 2
+
+
+state 1
+
+ S' -> expression .
+ expression -> expression . PLUS expression
+ expression -> expression . MINUS expression
+ expression -> expression . TIMES expression
+ expression -> expression . DIVIDE expression
+
+ PLUS shift and go to state 6
+ MINUS shift and go to state 5
+ TIMES shift and go to state 4
+ DIVIDE shift and go to state 7
+
+
+state 2
+
+ expression -> LPAREN . expression RPAREN
+ expression -> . expression PLUS expression
+ expression -> . expression MINUS expression
+ expression -> . expression TIMES expression
+ expression -> . expression DIVIDE expression
+ expression -> . NUMBER
+ expression -> . LPAREN expression RPAREN
+
+ NUMBER shift and go to state 3
+ LPAREN shift and go to state 2
+
+
+state 3
+
+ expression -> NUMBER .
+
+ $ reduce using rule 5
+ PLUS reduce using rule 5
+ MINUS reduce using rule 5
+ TIMES reduce using rule 5
+ DIVIDE reduce using rule 5
+ RPAREN reduce using rule 5
+
+
+state 4
+
+ expression -> expression TIMES . expression
+ expression -> . expression PLUS expression
+ expression -> . expression MINUS expression
+ expression -> . expression TIMES expression
+ expression -> . expression DIVIDE expression
+ expression -> . NUMBER
+ expression -> . LPAREN expression RPAREN
+
+ NUMBER shift and go to state 3
+ LPAREN shift and go to state 2
+
+
+state 5
+
+ expression -> expression MINUS . expression
+ expression -> . expression PLUS expression
+ expression -> . expression MINUS expression
+ expression -> . expression TIMES expression
+ expression -> . expression DIVIDE expression
+ expression -> . NUMBER
+ expression -> . LPAREN expression RPAREN
+
+ NUMBER shift and go to state 3
+ LPAREN shift and go to state 2
+
+
+state 6
+
+ expression -> expression PLUS . expression
+ expression -> . expression PLUS expression
+ expression -> . expression MINUS expression
+ expression -> . expression TIMES expression
+ expression -> . expression DIVIDE expression
+ expression -> . NUMBER
+ expression -> . LPAREN expression RPAREN
+
+ NUMBER shift and go to state 3
+ LPAREN shift and go to state 2
+
+
+state 7
+
+ expression -> expression DIVIDE . expression
+ expression -> . expression PLUS expression
+ expression -> . expression MINUS expression
+ expression -> . expression TIMES expression
+ expression -> . expression DIVIDE expression
+ expression -> . NUMBER
+ expression -> . LPAREN expression RPAREN
+
+ NUMBER shift and go to state 3
+ LPAREN shift and go to state 2
+
+
+state 8
+
+ expression -> LPAREN expression . RPAREN
+ expression -> expression . PLUS expression
+ expression -> expression . MINUS expression
+ expression -> expression . TIMES expression
+ expression -> expression . DIVIDE expression
+
+ RPAREN shift and go to state 13
+ PLUS shift and go to state 6
+ MINUS shift and go to state 5
+ TIMES shift and go to state 4
+ DIVIDE shift and go to state 7
+
+
+state 9
+
+ expression -> expression TIMES expression .
+ expression -> expression . PLUS expression
+ expression -> expression . MINUS expression
+ expression -> expression . TIMES expression
+ expression -> expression . DIVIDE expression
+
+ $ reduce using rule 3
+ PLUS reduce using rule 3
+ MINUS reduce using rule 3
+ TIMES reduce using rule 3
+ DIVIDE reduce using rule 3
+ RPAREN reduce using rule 3
+
+ ! PLUS [ shift and go to state 6 ]
+ ! MINUS [ shift and go to state 5 ]
+ ! TIMES [ shift and go to state 4 ]
+ ! DIVIDE [ shift and go to state 7 ]
+
+state 10
+
+ expression -> expression MINUS expression .
+ expression -> expression . PLUS expression
+ expression -> expression . MINUS expression
+ expression -> expression . TIMES expression
+ expression -> expression . DIVIDE expression
+
+ $ reduce using rule 2
+ PLUS reduce using rule 2
+ MINUS reduce using rule 2
+ RPAREN reduce using rule 2
+ TIMES shift and go to state 4
+ DIVIDE shift and go to state 7
+
+ ! TIMES [ reduce using rule 2 ]
+ ! DIVIDE [ reduce using rule 2 ]
+ ! PLUS [ shift and go to state 6 ]
+ ! MINUS [ shift and go to state 5 ]
+
+state 11
+
+ expression -> expression PLUS expression .
+ expression -> expression . PLUS expression
+ expression -> expression . MINUS expression
+ expression -> expression . TIMES expression
+ expression -> expression . DIVIDE expression
+
+ $ reduce using rule 1
+ PLUS reduce using rule 1
+ MINUS reduce using rule 1
+ RPAREN reduce using rule 1
+ TIMES shift and go to state 4
+ DIVIDE shift and go to state 7
+
+ ! TIMES [ reduce using rule 1 ]
+ ! DIVIDE [ reduce using rule 1 ]
+ ! PLUS [ shift and go to state 6 ]
+ ! MINUS [ shift and go to state 5 ]
+
+state 12
+
+ expression -> expression DIVIDE expression .
+ expression -> expression . PLUS expression
+ expression -> expression . MINUS expression
+ expression -> expression . TIMES expression
+ expression -> expression . DIVIDE expression
+
+ $ reduce using rule 4
+ PLUS reduce using rule 4
+ MINUS reduce using rule 4
+ TIMES reduce using rule 4
+ DIVIDE reduce using rule 4
+ RPAREN reduce using rule 4
+
+ ! PLUS [ shift and go to state 6 ]
+ ! MINUS [ shift and go to state 5 ]
+ ! TIMES [ shift and go to state 4 ]
+ ! DIVIDE [ shift and go to state 7 ]
+
+state 13
+
+ expression -> LPAREN expression RPAREN .
+
+ $ reduce using rule 6
+ PLUS reduce using rule 6
+ MINUS reduce using rule 6
+ TIMES reduce using rule 6
+ DIVIDE reduce using rule 6
+ RPAREN reduce using rule 6
+
+
+
+
+By looking at these rules (and with a little practice), you can usually track down the source
+of most parsing conflicts. It should also be stressed that not all shift-reduce conflicts are
+bad. However, the only way to be sure that they are resolved correctly is to look at parser.out.
+
+
+ ! TIMES [ reduce using rule 2 ]
+ ! DIVIDE [ reduce using rule 2 ]
+ ! PLUS [ shift and go to state 6 ]
+ ! MINUS [ shift and go to state 5 ]
+
+6.8 Syntax Error Handling
+
+
+If you are creating a parser for production use, the handling of
+syntax errors is important. As a general rule, you don't want a
+parser to simply throw up its hands and stop at the first sign of
+trouble. Instead, you want it to report the error, recover if possible, and
+continue parsing so that all of the errors in the input get reported
+to the user at once. This is the standard behavior found in compilers
+for languages such as C, C++, and Java.
+
+In PLY, when a syntax error occurs during parsing, the error is immediately
+detected (i.e., the parser does not read any more tokens beyond the
+source of the error). However, at this point, the parser enters a
+recovery mode that can be used to try and continue further parsing.
+As a general rule, error recovery in LR parsers is a delicate
+topic that involves ancient rituals and black-magic. The recovery mechanism
+provided by yacc.py is comparable to Unix yacc so you may want to
+consult a book like O'Reilly's "Lex and Yacc" for some of the finer details.
+
+
+
+
+6.8.1 Recovery and resynchronization with error rules
+
+
+The most well-behaved approach for handling syntax errors is to write grammar rules that include the error
+token. For example, suppose your language had a grammar rule for a print statement like this:
+
+
+
+
+To account for the possibility of a bad expression, you might write an additional grammar rule like this:
+
+
+def p_statement_print(p):
+ 'statement : PRINT expr SEMI'
+ ...
+
+
+
+
+In this case, the error token will match any sequence of
+tokens that might appear up to the first semicolon that is
+encountered. Once the semicolon is reached, the rule will be
+invoked and the error token will go away.
+
+
+def p_statement_print_error(p):
+ 'statement : PRINT error SEMI'
+ print("Syntax error in print statement. Bad expression")
+
+
+
+
+
+This is because the first bad token encountered will cause the rule to
+be reduced--which may make it difficult to recover if more bad tokens
+immediately follow.
+
+
+def p_statement_print_error(p):
+ 'statement : PRINT error'
+ print("Syntax error in print statement. Bad expression")
+
+6.8.2 Panic mode recovery
+
+
+An alternative error recovery scheme is to enter a panic mode recovery in which tokens are
+discarded to a point where the parser might be able to recover in some sensible manner.
+
+
+
+
+
+def p_error(p):
+ print("Whoa. You are seriously hosed.")
+ if not p:
+ print("End of File!")
+ return
+
+ # Read ahead looking for a closing '}'
+ while True:
+ tok = parser.token() # Get the next token
+ if not tok or tok.type == 'RBRACE':
+ break
+ parser.restart()
+
+
+
+
+
+def p_error(p):
+ if p:
+ print("Syntax error at token", p.type)
+ # Just discard the token and tell the parser it's okay.
+ parser.errok()
+ else:
+ print("Syntax error at EOF")
+
+
+
+
+
+
+
+
+def p_error(p):
+ # Read ahead looking for a terminating ";"
+ while True:
+ tok = parser.token() # Get the next token
+ if not tok or tok.type == 'SEMI': break
+ parser.errok()
+
+ # Return SEMI to the parser as the next lookahead token
+ return tok
+
+6.8.3 Signalling an error from a production
+
+
+If necessary, a production rule can manually force the parser to enter error recovery. This
+is done by raising the SyntaxError exception like this:
+
+
+
+
+The effect of raising SyntaxError is the same as if the last symbol shifted onto the
+parsing stack was actually a syntax error. Thus, when you do this, the last symbol shifted is popped off
+of the parsing stack and the current lookahead token is set to an error token. The parser
+then enters error-recovery mode where it tries to reduce rules that can accept error tokens.
+The steps that follow from this point are exactly the same as if a syntax error were detected and
+p_error() were called.
+
+
+def p_production(p):
+ 'production : some production ...'
+ raise SyntaxError
+
+6.8.4 When Do Syntax Errors Get Reported
+
+
+
+
+
+
+parser = yacc.yacc()
+parser.defaulted_states = {}
+
+6.8.5 General comments on error handling
+
+
+For normal types of languages, error recovery with error rules and resynchronization characters is probably the most reliable
+technique. This is because you can instrument the grammar to catch errors at selected places where it is relatively easy
+to recover and continue parsing. Panic mode recovery is really only useful in certain specialized applications where you might want
+to discard huge portions of the input text to find a valid restart point.
+
+6.9 Line Number and Position Tracking
+
+
+Position tracking is often a tricky problem when writing compilers.
+By default, PLY tracks the line number and position of all tokens.
+This information is available using the following functions:
+
+
+p.lineno(num). Return the line number for symbol num.
+
+p.lexpos(num). Return the lexing position for symbol num.
+
+For example:
+
+
+
+
+As an optional feature, yacc.py can automatically track line
+numbers and positions for all of the grammar symbols as well.
+However, this extra tracking requires extra processing and can
+significantly slow down parsing. Therefore, it must be enabled by
+passing the
+tracking=True option to yacc.parse(). For example:
+
+
+def p_expression(p):
+ 'expression : expression PLUS expression'
+ line = p.lineno(2) # line number of the PLUS token
+ index = p.lexpos(2) # Position of the PLUS token
+
+
+
+
+Once enabled, the lineno() and lexpos() methods work
+for all grammar symbols. In addition, two further methods can be
+used:
+
+p.linespan(num). Return a tuple (startline,endline) with the starting and ending
+line number for symbol num.
+
+p.lexspan(num). Return a tuple (start,end) with the starting and ending lexing
+positions for symbol num.
+
+yacc.parse(data,tracking=True)
+
+
+
+
+For example:
+
+
+
+
+Note: The lexspan() function only returns the range of values up to the start of the last grammar symbol.
+
+
+def p_expression(p):
+ 'expression : expression PLUS expression'
+ p.lineno(1) # Line number of the left expression
+ p.lineno(2) # line number of the PLUS operator
+ p.lineno(3) # line number of the right expression
+ ...
+ start,end = p.linespan(3) # Start,end lines of the right expression
+ starti,endi = p.lexspan(3) # Start,end positions of right expression
+
+
+
+
+
+
+def p_bad_func(p):
+ 'funccall : fname LPAREN error RPAREN'
+ # Line number reported from LPAREN token
+ print("Bad function call at line", p.lineno(2))
+
+
+
+
+PLY doesn't retain line number information from rules that have already been
+parsed. If you are building an abstract syntax tree and need to have line numbers,
+you should make sure that the line numbers appear in the tree itself.
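One hedged sketch of that idea: record the line number in the node when it is constructed (the BinOp class and its lineno field are assumptions for illustration):

```python
class BinOp:
    def __init__(self, left, op, right, lineno=None):
        self.left, self.op, self.right = left, op, right
        self.lineno = lineno    # line number captured at construction time

# Inside a PLY rule, this might be used as:
# def p_expression_binop(p):
#     'expression : expression PLUS expression'
#     p[0] = BinOp(p[1], p[2], p[3], lineno=p.lineno(2))
```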
+
+
+def p_fname(p):
+ 'fname : ID'
+ p[0] = p[1]
+ p.set_lineno(0,p.lineno(1))
+
+6.10 AST Construction
+
+
+yacc.py provides no special functions for constructing an
+abstract syntax tree. However, such construction is easy enough to do
+on your own.
+
+
+
+
+
+def p_expression_binop(p):
+ '''expression : expression PLUS expression
+ | expression MINUS expression
+ | expression TIMES expression
+ | expression DIVIDE expression'''
+
+ p[0] = ('binary-expression',p[2],p[1],p[3])
+
+def p_expression_group(p):
+ 'expression : LPAREN expression RPAREN'
+ p[0] = ('group-expression',p[2])
+
+def p_expression_number(p):
+ 'expression : NUMBER'
+ p[0] = ('number-expression',p[1])
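Tuple nodes like these can be consumed with ordinary recursion. As a sketch (eval_tree is a hypothetical helper, not part of PLY), an evaluator for the tuples built above might look like:

```python
import operator

OPS = {'+': operator.add, '-': operator.sub,
       '*': operator.mul, '/': operator.truediv}

def eval_tree(node):
    # Dispatch on the tag stored in the first tuple slot
    tag = node[0]
    if tag == 'number-expression':
        return node[1]
    if tag == 'group-expression':
        return eval_tree(node[1])
    if tag == 'binary-expression':
        # Tuple layout matches p[0] = ('binary-expression', p[2], p[1], p[3])
        op, left, right = node[1], node[2], node[3]
        return OPS[op](eval_tree(left), eval_tree(right))
    raise ValueError("unknown node: %r" % (tag,))
```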
+
+
+
+
+The advantage to this approach is that it may make it easier to attach more complicated
+semantics, type checking, code generation, and other features to the node classes.
+
+
+class Expr: pass
+
+class BinOp(Expr):
+ def __init__(self,left,op,right):
+ self.type = "binop"
+ self.left = left
+ self.right = right
+ self.op = op
+
+class Number(Expr):
+ def __init__(self,value):
+ self.type = "number"
+ self.value = value
+
+def p_expression_binop(p):
+ '''expression : expression PLUS expression
+ | expression MINUS expression
+ | expression TIMES expression
+ | expression DIVIDE expression'''
+
+ p[0] = BinOp(p[1],p[2],p[3])
+
+def p_expression_group(p):
+ 'expression : LPAREN expression RPAREN'
+ p[0] = p[2]
+
+def p_expression_number(p):
+ 'expression : NUMBER'
+ p[0] = Number(p[1])
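For example, evaluation can be attached directly to node classes like those above (the eval methods are illustrative additions, not part of PLY):

```python
import operator

class Expr: pass

class Number(Expr):
    def __init__(self, value):
        self.type = "number"
        self.value = value
    def eval(self):
        return self.value

class BinOp(Expr):
    OPS = {'+': operator.add, '-': operator.sub,
           '*': operator.mul, '/': operator.truediv}
    def __init__(self, left, op, right):
        self.type = "binop"
        self.left, self.op, self.right = left, op, right
    def eval(self):
        # Each node knows how to evaluate itself
        return self.OPS[self.op](self.left.eval(), self.right.eval())
```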
+
+
+
+
+
+class Node:
+ def __init__(self,type,children=None,leaf=None):
+ self.type = type
+ if children:
+ self.children = children
+ else:
+ self.children = [ ]
+ self.leaf = leaf
+
+def p_expression_binop(p):
+ '''expression : expression PLUS expression
+ | expression MINUS expression
+ | expression TIMES expression
+ | expression DIVIDE expression'''
+
+ p[0] = Node("binop", [p[1],p[3]], p[2])
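With a generic Node like the one above, a single traversal function can serve every node type. A minimal sketch (the preorder function and its indentation format are assumptions, not PLY features):

```python
class Node:
    def __init__(self, type, children=None, leaf=None):
        self.type = type
        self.children = children if children else []
        self.leaf = leaf

def preorder(node, depth=0):
    # Render this node, then recurse into its children
    label = node.type if node.leaf is None else "%s %s" % (node.type, node.leaf)
    lines = ["  " * depth + label]
    for child in node.children:
        lines.extend(preorder(child, depth + 1))
    return lines
```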
+
+6.11 Embedded Actions
+
+
+The parsing technique used by yacc only allows actions to be executed at the end of a rule. For example,
+suppose you have a rule like this:
+
+
+
+
+
+def p_foo(p):
+ "foo : A B C D"
+ print("Parsed a foo", p[1],p[2],p[3],p[4])
+
+
+
+
+
+def p_foo(p):
+ "foo : A seen_A B C D"
+ print("Parsed a foo", p[1],p[3],p[4],p[5])
+ print("seen_A returned", p[2])
+
+def p_seen_A(p):
+ "seen_A :"
+ print("Saw an A = ", p[-1]) # Access grammar symbol to left
+ p[0] = some_value # Assign value to seen_A
+
+
+
+
+
+However, if you insert an embedded action into one of the rules like this,
+
+
+def p_foo(p):
+ """foo : abcd
+ | abcx"""
+
+def p_abcd(p):
+ "abcd : A B C D"
+
+def p_abcx(p):
+ "abcx : A B C X"
+
+
+
+
+an extra shift-reduce conflict will be introduced. This conflict is
+caused by the fact that the same symbol C appears next in
+both the abcd and abcx rules. The parser can either
+shift the symbol (abcd rule) or reduce the empty
+rule seen_AB (abcx rule).
+
+
+def p_foo(p):
+ """foo : abcd
+ | abcx"""
+
+def p_abcd(p):
+ "abcd : A B C D"
+
+def p_abcx(p):
+ "abcx : A B seen_AB C X"
+
+def p_seen_AB(p):
+ "seen_AB :"
+
+
+
+
+In this case, the embedded action new_scope executes
+immediately after a LBRACE ({) symbol is parsed.
+This might adjust internal symbol tables and other aspects of the
+parser. Upon completion of the rule statements_block, code
+might undo the operations performed in the embedded action
+(e.g., pop_scope()).
+
+
+def p_statements_block(p):
+    "statements : LBRACE new_scope statements RBRACE"
+ # Action code
+ ...
+ pop_scope() # Return to previous scope
+
+def p_new_scope(p):
+ "new_scope :"
+ # Create a new scope for local variables
+ s = new_scope()
+ push_scope(s)
+ ...
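The helpers used in that example can be backed by a simple stack of dictionaries. A hedged sketch (new_scope, push_scope, and pop_scope are the assumed helper names from the example above; declare and lookup are extra illustrative helpers):

```python
_scopes = [{}]              # bottom entry acts as the global scope

def new_scope():
    return {}

def push_scope(s):
    _scopes.append(s)

def pop_scope():
    return _scopes.pop()

def declare(name, value):
    # Record a local variable in the innermost scope
    _scopes[-1][name] = value

def lookup(name):
    # Search from the innermost scope outward
    for scope in reversed(_scopes):
        if name in scope:
            return scope[name]
    raise NameError(name)
```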
+
+6.12 Miscellaneous Yacc Notes
+
+
+
+
+
+
+In this case, x must be a Lexer object that minimally has an x.token() method for retrieving the next
+token. If an input string is given to yacc.parse(), the lexer must also have an x.input() method.
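Any object with that interface will do. A minimal stand-in (the TokenStream and Tok classes are assumptions for illustration; real lexer tokens also carry lineno and lexpos attributes):

```python
class Tok:
    def __init__(self, type, value, lineno=1, lexpos=0):
        self.type = type
        self.value = value
        self.lineno = lineno
        self.lexpos = lexpos

class TokenStream:
    # Minimal lexer-like object: yacc only needs token(), plus
    # input() when parse() is given a string
    def __init__(self, toks):
        self._toks = list(toks)
    def input(self, data):
        pass                    # tokens are supplied up front in this sketch
    def token(self):
        return self._toks.pop(0) if self._toks else None
```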
+
+
+yacc.parse(lexer=x)
+
+
+
+
+
+parser = yacc.yacc(debug=False)
+
+
+
+
+
+parser = yacc.yacc(tabmodule="foo")
+
+
+
+
+
+parser = yacc.yacc(tabmodule="foo",outputdir="somedirectory")
+
+
+
+
+Note: If you disable table generation, yacc() will regenerate the parsing tables
+each time it runs (which may take a while depending on how large your grammar is).
+
+
+parser = yacc.yacc(write_tables=False)
+
+
+
+
+
+yacc.parse(debug=True)
+
+
+
+
+If you are using a decorator on your grammar rule functions, the wrapper must carry
+the original function's starting line number (yacc uses co_firstlineno when processing
+rules). For example:
+
+from functools import wraps
+from nodes import Collection
+
+
+def strict(*types):
+    def decorate(func):
+        @wraps(func)
+        def wrapper(p):
+            func(p)
+            if not isinstance(p[0], types):
+                raise TypeError
+
+        wrapper.co_firstlineno = func.__code__.co_firstlineno
+        return wrapper
+
+    return decorate
+
+@strict(Collection)
+def p_collection(p):
+    """
+    collection : sequence
+               | map
+    """
+    p[0] = p[1]
+As a general rule this isn't a problem. However, to make it work,
+you need to carefully make sure everything gets hooked up correctly.
+First, make sure you save the objects returned by lex() and
+yacc(). For example:
+
+lexer = lex.lex()       # Return lexer object
+parser = yacc.yacc()    # Return parser object
+
+Next, when parsing, make sure you give the parse() function a reference to the lexer it
+should be using. For example:
+
+parser.parse(text,lexer=lexer)
+
+If you forget to do this, the parser will use the last lexer
+created--which is not always what you want.
+
+Within lexer and parser rule functions, these objects are also
+available. In the lexer, the "lexer" attribute of a token refers to
+the lexer object that triggered the rule. For example:
+
+def t_NUMBER(t):
+    r'\d+'
+    ...
+    print(t.lexer)           # Show lexer object
+
+In the parser, the "lexer" and "parser" attributes refer to the lexer
+and parser objects respectively. For example:
+
+def p_expr_plus(p):
+    'expr : expr PLUS expr'
+    ...
+    print(p.parser)          # Show parser object
+    print(p.lexer)           # Show lexer object
+
+If necessary, arbitrary attributes can be attached to the lexer or parser object.
+For example, if you wanted to have different parsing modes, you could attach a mode
+attribute to the parser object and look at it later.
+
+To run PLY under Python's optimized mode, first enable optimized table generation
+like this:
+
+lex.lex(optimize=1)
+yacc.yacc(optimize=1)
+
+then PLY can later be used when Python runs in optimized mode. To make this work,
+make sure you first run Python in normal mode. Once the lexing and parsing tables
+have been generated the first time, run Python in optimized mode. PLY will use
+the tables without the need for doc strings.
+Beware: running PLY in optimized mode disables a lot of error
+checking. You should only do this when your project has stabilized
+and you don't need to do any debugging. One of the purposes of
+optimized mode is to substantially decrease the startup time of
+your compiler (by assuming that everything is already properly
+specified and works).
+Debugging a compiler is typically not an easy task. PLY provides some
+advanced diagnostic capabilities through the use of Python's
+logging module. The next two sections describe this:
+Both the lex() and yacc() commands have a debugging
+mode that can be enabled using the debug flag. For example:
+
+lex.lex(debug=True)
+yacc.yacc(debug=True)
+
+Normally, the output produced by debugging is routed to either
+standard error or, in the case of yacc(), to a file
+parser.out. This output can be more carefully controlled
+by supplying a logging object. Here is an example that adds
+information about where different debugging messages are coming from:
+
+# Set up a logging object
+import logging
+logging.basicConfig(
+    level = logging.DEBUG,
+    filename = "parselog.txt",
+    filemode = "w",
+    format = "%(filename)10s:%(lineno)4d:%(message)s"
+)
+log = logging.getLogger()
+
+lex.lex(debug=True,debuglog=log)
+yacc.yacc(debug=True,debuglog=log)
+
+If you supply a custom logger, the amount of debugging
+information produced can be controlled by setting the logging level.
+Typically, debugging messages are either issued at the DEBUG,
+INFO, or WARNING levels.
+PLY's error messages and warnings are also produced using the logging
+interface. This can be controlled by passing a logging object
+using the errorlog parameter.
+
+lex.lex(errorlog=log)
+yacc.yacc(errorlog=log)
+
+If you want to completely silence warnings, you can either pass in a
+logging object with an appropriate filter level or use the NullLogger
+object defined in either lex or yacc. For example:
+
+yacc.yacc(errorlog=yacc.NullLogger())
+To enable run-time debugging of a parser, use the debug option to parse. This
+option can either be an integer (which simply turns debugging on or off) or an instance
+of a logger object. For example:
+
+log = logging.getLogger()
+parser.parse(input,debug=log)
+
+If a logging object is passed, you can use its filtering level to control how much
+output gets generated. The INFO level is used to produce information
+about rule reductions. The DEBUG level will show information about the
+parsing stack, token shifts, and other details. The ERROR level shows information
+related to parsing errors.
+
+For very complicated problems, you should pass in a logging object that
+redirects to a file where you can more easily inspect the output after
+execution.