I have the following grammar and test case:
from pyparsing import Word, nums, Forward, Suppress, OneOrMore, Group
#A grammar for a simple class of regular expressions
number = Word(nums)('number')
lparen = Suppress('(')
rparen = Suppress(')')
expression = Forward()('expression')
concatenation = Group(expression + expression)
concatenation.setResultsName('concatenation')
disjunction = Group(lparen + OneOrMore(expression + Suppress('|')) + expression + rparen)
disjunction.setResultsName('disjunction')
kleene = Group(lparen + expression + rparen + '*')
kleene.setResultsName('kleene')
expression << (number | disjunction | kleene | concatenation)
#Test a simple input
tests = """
(8)*((3|2)|2)
""".splitlines()[1:]
for t in tests:
    print(t)
    print(expression.parseString(t))
    print()
The result should be
[['8', '*'],[['3', '2'], '2']]
But instead, I only get
[['8', '*']]
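How can I get pyparsing to parse the whole string?
Your concatenation expression is not doing what you want, and it comes close to being left-recursive (luckily, it is the last term in expression). Your grammar works if you write this instead: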
expression << OneOrMore(number | disjunction | kleene)
With this change, I get the following result:
[['8', '*'], [['3', '2'], '2']]
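For reference, the corrected grammar can be put together as a minimal, self-contained sketch. It assumes Python 3 and an installed pyparsing; it drops the results names from the question for brevity, and it passes parseAll=True (an addition that is not part of the original answer) so that pyparsing raises an exception instead of silently stopping when part of the input is left unparsed.

```python
from pyparsing import Word, nums, Forward, Suppress, OneOrMore, Group

number = Word(nums)
lparen = Suppress('(')
rparen = Suppress(')')

# Forward declaration, because disjunction and kleene refer back to expression.
expression = Forward()
disjunction = Group(lparen + OneOrMore(expression + Suppress('|')) + expression + rparen)
kleene = Group(lparen + expression + rparen + '*')

# Concatenation is expressed as repetition rather than as a recursive rule.
expression <<= OneOrMore(number | disjunction | kleene)

# parseAll=True (assumption: added here for illustration) forces the whole
# input to be consumed, so an incomplete parse fails loudly.
result = expression.parseString("(8)*((3|2)|2)", parseAll=True).asList()
print(result)  # [['8', '*'], [['3', '2'], '2']]
```

Without parseAll=True, parseString returns as soon as the grammar has matched a prefix of the input, which is why the original grammar reported only [['8', '*']] without raising an error.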
EDIT: You can also avoid the problem of << binding more tightly than | if you use the <<= operator:
expression <<= OneOrMore(number | disjunction | kleene)
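The precedence difference can be checked with Python's standard ast module, without pyparsing at all. This is only an illustrative sketch: the names expression, number, and disjunction inside the strings are placeholders that are parsed but never evaluated.

```python
import ast

# Without parentheses, << binds more tightly than |, so the statement is
# read as (expression << number) | disjunction: only number would be
# attached to the Forward.
plain = ast.parse("expression << number | disjunction", mode="eval").body
top_level_op = type(plain.op).__name__    # operator applied last
inner_op = type(plain.left.op).__name__   # operator applied first

# <<= is an augmented assignment, so the whole right-hand side is
# evaluated before it is attached to the Forward.
augmented = ast.parse("expression <<= number | disjunction").body[0]
statement_kind = type(augmented).__name__
rhs_op = type(augmented.value.op).__name__

print(top_level_op, inner_op)    # BitOr LShift
print(statement_kind, rhs_op)    # AugAssign BitOr
```

With plain <<, the alternatives therefore need to be wrapped in parentheses, as in the question's original expression << (...) line; with <<= the parentheses are unnecessary.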