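I'm trying to use jparsec to define and use my fairly simple grammar, but I'm completely confused about how to go about it. At this point I don't know whether the problem is my inadequate understanding of the problem space, jparsec's sparse and uninformative documentation, or both.
I have a grammar something like this: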
foo='abc' AND bar<>'def' OR (biz IN ['a', 'b', 'c'] AND NOT baz = 'foo')
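As you can see, it supports operators such as AND, OR, NOT, IN, = and <>. It also supports arbitrarily nested parentheses to dictate precedence.
I think I got pretty far with tokenizing. Here's what I have: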
public final class NewParser {
    // lexing
    private static final Terminals OPERATORS = Terminals.operators("=", "OR", "AND", "NOT", "(", ")", "IN", "[", "]", ",", "<>");
    private static final Parser<?> WHITESPACE = Scanners.WHITESPACES;
    private static final Parser<?> FIELD_NAME_TOKENIZER = Terminals.Identifier.TOKENIZER;
    private static final Parser<?> QUOTED_STRING_TOKENIZER = Terminals.StringLiteral.SINGLE_QUOTE_TOKENIZER.or(Terminals.StringLiteral.DOUBLE_QUOTE_TOKENIZER);
    private static final Parser<?> IGNORED = Parsers.or(Scanners.WHITESPACES).skipMany();
    private static final Parser<?> TOKENIZER = Parsers.or(OPERATORS.tokenizer(), WHITESPACE, FIELD_NAME_TOKENIZER, QUOTED_STRING_TOKENIZER).many();

    @Test
    public void test_tokenizer() {
        Object result = TOKENIZER.parse("foo='abc' AND bar<>'def' OR (biz IN ['a', 'b', 'c'] AND NOT baz = 'foo')");
        Assert.assertEquals("[foo, =, abc, null, AND, null, bar, <>, def, null, OR, null, (, biz, null, IN, null, [, a, ,, null, b, ,, null, c, ], null, AND, null, NOT, null, baz, null, =, null, foo, )]", result.toString());
    }
}
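test_tokenizer passes, so I think it's working OK.
Now, I already have a type hierarchy that represents the syntax. For example, I have classes called Node, BinaryNode, FieldNode, LogicalAndNode, ConstantNode and so forth. What I'm trying to do is create a Parser that takes my tokens and spits out a Node. This is where I keep getting stuck.
I thought I'd start with something really simple like this: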
private static Parser<FieldNode> fieldNodeParser =
    Parsers.sequence(FIELD_NAME_TOKENIZER)
        .map(new Map<Object, FieldNode>() {
            @Override
            public FieldNode map(Object from) {
                Fragment fragment = (Fragment)from;
                return new FieldNode(fragment.text());
            }
        });
And I thought I would be able to do this:
public static Parser<Node> parser = fieldNodeParser.from(TOKENIZER);
But that gives me a compile error:
The method from(Parser<? extends Collection<Token>>) in the type Parser<FieldNode> is not applicable for the arguments (Parser<capture#6-of ?>)
So it looks like my generics are messed up somewhere, but I have no idea where or how to fix this. I'm not even certain I'm going about this the right way. Can anyone enlighten me?
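You are mixing two different levels of "parsers": string-level parsers, a.k.a. scanners or lexers, and token-level parsers. This is how JParsec implements the traditional separation of lexical and syntactic analysis.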
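To make the two levels concrete, here is a minimal sketch in plain Java. It deliberately does not use jparsec: the names `TwoLevelSketch`, `Token`, `lex` and `parseFieldName` are hypothetical and only illustrate the idea. The first stage works on characters and drops whitespace, the second stage only ever sees tokens.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Illustration only: hypothetical names, not part of jparsec.
final class TwoLevelSketch {
    // A token produced by the lexer: a kind plus the matched text.
    static final class Token {
        final String kind;
        final String text;
        Token(String kind, String text) { this.kind = kind; this.text = text; }
    }

    private static final Set<String> KEYWORDS =
            new HashSet<>(Arrays.asList("AND", "OR", "NOT", "IN"));

    // Level 1: character-level scanning. Reads a String, emits tokens, skips whitespace.
    static List<Token> lex(String input) {
        List<Token> tokens = new ArrayList<>();
        int i = 0;
        while (i < input.length()) {
            char c = input.charAt(i);
            if (Character.isWhitespace(c)) {
                i++; // whitespace only separates tokens, it never becomes one
            } else if (Character.isLetter(c)) {
                int start = i;
                while (i < input.length() && Character.isLetterOrDigit(input.charAt(i))) i++;
                String word = input.substring(start, i);
                tokens.add(new Token(KEYWORDS.contains(word) ? "KEYWORD" : "IDENTIFIER", word));
            } else if (c == '\'') {
                int end = input.indexOf('\'', i + 1);
                if (end < 0) throw new IllegalArgumentException("unterminated string at " + i);
                tokens.add(new Token("STRING", input.substring(i + 1, end)));
                i = end + 1;
            } else if (input.startsWith("<>", i)) {
                tokens.add(new Token("OPERATOR", "<>"));
                i += 2;
            } else if ("=()[],".indexOf(c) >= 0) {
                tokens.add(new Token("OPERATOR", String.valueOf(c)));
                i++;
            } else {
                throw new IllegalArgumentException("unexpected character '" + c + "' at " + i);
            }
        }
        return tokens;
    }

    // Level 2: token-level parsing. It receives tokens, never raw characters.
    static String parseFieldName(List<Token> tokens) {
        if (tokens.size() != 1 || !tokens.get(0).kind.equals("IDENTIFIER")) {
            throw new IllegalArgumentException("expected exactly one identifier token");
        }
        return tokens.get(0).text;
    }

    public static void main(String[] args) {
        System.out.println(parseFieldName(lex("  foo ")));
    }
}
```

In this sketch, calling a character-level routine on a list of tokens is simply a type error. In jparsec both levels share the `Parser` type, which is why the mix-up is easy to make and tends to show up late.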
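To make your code compile cleanly, you can add a call to the .cast() method at the end of the parser's definition, but this won't fix your problem, as the next error you will get is something like `cannot run a character-level parser at token level`. This problem comes from the use of .from() to define your top-level parser, which implicitly sets the boundary between the two worlds.
Here is a working implementation (and unit tests) for your parser: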
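The following plain-Java sketch illustrates why a cast silences the compiler without fixing anything. `Box` is a hypothetical stand-in, not jparsec's `Parser`; its `sizeOf` method only mimics the shape of a parameter typed `<? extends Collection<...>>`, and its `cast()` is assumed to behave like an unchecked generic cast.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.List;

// Illustration only: Box is a hypothetical stand-in, not a jparsec type.
final class WildcardSketch {
    static final class Box<T> {
        final T value;
        Box(T value) { this.value = value; }

        // Accepts only boxes whose content is known to be a collection of strings.
        int sizeOf(Box<? extends Collection<String>> source) {
            return source.value.size();
        }

        // Unchecked cast: changes the static type, checks nothing at runtime.
        @SuppressWarnings("unchecked")
        <R> Box<R> cast() { return (Box<R>) this; }
    }

    // The content really is a list, so the cast is harmless here.
    static int sizeViaCast() {
        Box<?> untyped = new Box<List<String>>(Arrays.asList("foo", "="));
        Box<String> consumer = new Box<>("consumer");
        // consumer.sizeOf(untyped); // does not compile: Box<capture of ?> is not applicable
        Box<List<String>> typed = untyped.cast();
        return consumer.sizeOf(typed);
    }

    // The content is NOT a collection: the cast compiles, the failure moves to runtime.
    static boolean castHidesTypeError() {
        Box<?> wrong = new Box<>("not a collection");
        Box<List<String>> lied = wrong.cast();
        try {
            new Box<>("consumer").sizeOf(lied);
            return false;
        } catch (ClassCastException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(sizeViaCast());
        System.out.println(castHidesTypeError());
    }
}
```

The second method is the analogue of the situation described above: once the wildcard is cast away the code compiles, but the mismatch is still there and only surfaces when the code runs.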
public class SampleTest {

    // Token-level parser: matches an IDENTIFIER token and wraps its text in a FieldNode.
    private static Parser<FieldNode> fieldNodeParser = Parsers.sequence(Terminals.fragment(Tokens.Tag.IDENTIFIER).map(new Map<String, FieldNode>() {
        @Override
        public FieldNode map(String from) {
            String fragment = from;
            return new FieldNode(fragment);
        }
    })).cast();

    // .from() sets the boundary: TOKENIZER produces the tokens, WHITESPACES is the delimiter.
    public static Parser<FieldNode> parser = fieldNodeParser.from(NewParser.TOKENIZER, Scanners.WHITESPACES);

    @Test
    public void test_tokenizer() {
        Object result = Parsers.or(NewParser.TOKENIZER, Scanners.WHITESPACES.cast()).many().parse("foo='abc' AND bar<>'def' OR (biz IN ['a', 'b', 'c'] AND NOT baz = 'foo')");
        Assert.assertEquals("[foo, =, abc, null, AND, null, bar, <>, def, null, OR, null, (, biz, null, IN, null, [, a, ,, null, b, ,, null, c, ], null, AND, null, NOT, null, baz, null, =, null, foo, )]", result.toString());
    }

    @Test
    public void test_parser() throws Exception {
        FieldNode foo = parser.parse("foo");
        Assert.assertEquals("foo", foo.text);
    }

    public static final class NewParser {
        // lexing
        static final Terminals OPERATORS = Terminals.operators("=", "OR", "AND", "NOT", "(", ")", "IN", "[", "]", ",", "<>");
        static final Parser<String> FIELD_NAME_TOKENIZER = Terminals.Identifier.TOKENIZER.source();
        static final Parser<?> QUOTED_STRING_TOKENIZER = Terminals.StringLiteral.SINGLE_QUOTE_TOKENIZER.or(Terminals.StringLiteral.DOUBLE_QUOTE_TOKENIZER);
        static final Terminals TERMINALS = Terminals.caseSensitive(new String[] { "=", "(", ")", "[", "]", ",", "<>" }, new String[] { "OR", "AND", "NOT", "IN" });
        static final Parser<?> TOKENIZER = Parsers.or(TERMINALS.tokenizer(), QUOTED_STRING_TOKENIZER);
    }

    private static class FieldNode {
        final String text;

        public FieldNode(String text) {
            this.text = text;
        }
    }
}
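What I changed is:
- I use the Terminals.caseSensitive method to create a lexer only for terminals (keywords, operators and identifiers). The identifier lexer used is implicitly the one provided natively by jParsec (e.g. Terminals.IDENTIFIER).
- I use the .from() method with TOKENIZER and with WHITESPACES as the separator.
- fieldNodeParser uses Terminals.fragment(...) to parse tokens.
Hope that helps, Arnaud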