sly/CHANGES
2018-01-27 15:27:15 -06:00

Version 0.3
-----------
1/27/2018 Tokens no longer have to be specified as strings. For example, you
can now write:
    from sly import Lexer

    class TheLexer(Lexer):
        tokens = { ID, NUMBER, PLUS, MINUS }
        ID = r'[a-zA-Z_][a-zA-Z0-9_]*'
        NUMBER = r'\d+'
        PLUS = r'\+'
        MINUS = r'-'
This convention also carries over to the parser for things such
as precedence specifiers:
    from sly import Parser

    class TheParser(Parser):
        tokens = TheLexer.tokens
        precedence = (
            ('left', PLUS, MINUS),
            ('left', TIMES, DIVIDE),
            ('right', UMINUS),
        )
        ...
Never mind the fact that ID, NUMBER, PLUS, and MINUS appear to be
undefined identifiers. It all works.
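One way undefined names can be made to work inside a class body is a
metaclass whose __prepare__ hook returns a dictionary that supplies a value
for missing keys. The following is a minimal sketch of that general Python
technique, not a copy of SLY's implementation. The names AutoNameDict,
SketchMeta, and SketchLexer are hypothetical, and the rule that only
all-uppercase names are auto-defined is an assumption made for the sketch.

```python
class AutoNameDict(dict):
    """Class-body namespace that supplies a value for undefined names."""

    def __missing__(self, key):
        # Assumption for this sketch: only all-uppercase names are treated
        # as token names. Each one evaluates to its own name as a string.
        if key.isupper():
            return key
        # Anything else falls through to the usual global/builtin lookup.
        raise KeyError(key)


class SketchMeta(type):
    @classmethod
    def __prepare__(mcls, name, bases, **kwargs):
        # The class body executes with this mapping as its local namespace.
        return AutoNameDict()

    def __new__(mcls, name, bases, namespace, **kwargs):
        # Build the class from a plain dict copy of the namespace.
        return super().__new__(mcls, name, bases, dict(namespace))


class SketchLexer(metaclass=SketchMeta):
    tokens = { ID, NUMBER, PLUS, MINUS }   # no NameError here
    ID = r'[a-zA-Z_][a-zA-Z0-9_]*'
    NUMBER = r'\d+'
```

In this sketch, SketchLexer.tokens ends up as a set of the four names as
plain strings, and the later assignments to ID and NUMBER behave normally.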
1/27/2018 Tokens now allow special-case remapping. For example:
    from sly import Lexer

    class TheLexer(Lexer):
        tokens = { ID, IF, ELSE, WHILE, NUMBER, PLUS, MINUS }
        ID = r'[a-zA-Z_][a-zA-Z0-9_]*'
        ID['if'] = IF
        ID['else'] = ELSE
        ID['while'] = WHILE
        NUMBER = r'\d+'
        PLUS = r'\+'
        MINUS = r'-'
In this code, the ID rule matches any identifier. However,
special cases have been made for IF, ELSE, and WHILE tokens.
Previously, this had to be handled in a special action method
such as this:
    def ID(self, t):
        if t.value in { 'if', 'else', 'while' }:
            t.type = t.value.upper()
        return t
Never mind the fact that the syntax appears to suggest that strings
work as a kind of mutable mapping.
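Item assignment on what looks like a string can be supported by having the
class-body namespace wrap each rule string in a str subclass that defines
__setitem__. The sketch below illustrates that idea in plain Python under
stated assumptions. It is not SLY's code, and RuleStr, RuleDict, RemapMeta,
SketchLexer, and token_type are hypothetical names introduced only for this
example.

```python
class RuleStr(str):
    """A str that also records value -> token-type remappings."""

    def __new__(cls, pattern):
        self = super().__new__(cls, pattern)
        self.remap = {}
        return self

    def __setitem__(self, key, value):
        # Makes ID['if'] = IF legal inside the class body.
        self.remap[key] = value


class RuleDict(dict):
    """Class-body namespace used by this sketch."""

    def __setitem__(self, key, value):
        # Wrap plain strings bound to uppercase names so they accept
        # item assignment later in the class body.
        if key.isupper() and isinstance(value, str) and not isinstance(value, RuleStr):
            value = RuleStr(value)
        super().__setitem__(key, value)

    def __missing__(self, key):
        # Undefined uppercase names evaluate to their own name.
        if key.isupper():
            return key
        raise KeyError(key)


class RemapMeta(type):
    @classmethod
    def __prepare__(mcls, name, bases, **kwargs):
        return RuleDict()

    def __new__(mcls, name, bases, namespace, **kwargs):
        return super().__new__(mcls, name, bases, dict(namespace))


class SketchLexer(metaclass=RemapMeta):
    tokens = { ID, IF, ELSE, WHILE }
    ID = r'[a-zA-Z_][a-zA-Z0-9_]*'
    ID['if'] = IF
    ID['else'] = ELSE
    ID['while'] = WHILE


def token_type(rule_name, rule, text):
    """Return the remapped type for text, or the rule's own name."""
    return rule.remap.get(text, rule_name)
```

With these definitions, token_type('ID', SketchLexer.ID, 'while') gives
'WHILE', while any other identifier keeps the type 'ID'.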
1/16/2018 Usability improvement on Lexer class. Regular expression rules
specified as strings that don't match any name in tokens are
now reported as errors.
Version 0.2
-----------
12/24/2017 The error(self, t) method of lexer objects now receives a
token as input. The value attribute of this token contains
all remaining input text. If the passed token is returned
by error(), then it shows up in the token stream where
it can be processed by the parser.
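The contract described above can be illustrated with a toy scanner written
against the standard library only. This is a hypothetical stand-in, not SLY's
lexer: toy_tokenize, keep_bad_char, the 'ERROR' type name, and the choice to
skip exactly one character after an error are all assumptions of the sketch.

```python
import re
from types import SimpleNamespace


def toy_tokenize(text, error):
    """Toy scanner: digits become NUMBER tokens, spaces are skipped, and
    anything else is handed to the error callback."""
    number = re.compile(r'\d+')
    index = 0
    while index < len(text):
        if text[index] == ' ':
            index += 1
            continue
        m = number.match(text, index)
        if m:
            yield SimpleNamespace(type='NUMBER', value=m.group())
            index = m.end()
            continue
        # The token passed to error() carries all remaining input text.
        tok = SimpleNamespace(type='ERROR', value=text[index:])
        result = error(tok)
        index += 1  # assumption: skip one bad character and continue
        # A returned token is placed in the token stream.
        if result is not None:
            yield result


seen = []

def keep_bad_char(tok):
    seen.append(tok.value)      # all remaining input at the point of failure
    tok.value = tok.value[0]    # keep only the offending character
    return tok                  # returning the token emits it


stream = [(t.type, t.value) for t in toy_tokenize('12 ? 34', keep_bad_char)]
```

Here the handler sees '? 34' as the token value, trims it to '?', and the
returned token appears in the stream between the two NUMBER tokens.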