Getting certain nodes out of a parse tree

Question

Tex*_*Tex 3 java nlp jgrapht stanford-nlp

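I'm working on a project that involves anaphora resolution via Hobbs' algorithm. I have parsed my text with the Stanford parser, and now I'd like to manipulate the nodes in order to implement my algorithm.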

At the moment, I don't understand how to:

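Access a node based on its POS tag (e.g. I need to start with pronouns; how do I get all pronouns?).
Use a visitor. I'm a bit of a noob at Java, but in C++ I would need to implement a Visitor functor and then work with its hooks. I couldn't find much about the Stanford Parser's Tree structure. Is it jgrapht? If it is, could you provide some pointers in the form of code snippets?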

Answer 1

Chr*_*ing 10

@dhg's answer works fine, but here are two other options that may also be useful to know about:

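The Tree class implements Iterable. You can iterate over all the nodes of a Tree t in a pre-order traversal (or, strictly speaking, over the subtrees headed by each node) using: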
```
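// t is an already parsed edu.stanford.nlp.trees.Tree
List<Tree> pronouns = new ArrayList<>();
for (Tree subtree : t) {
    // Each subtree is headed by one node; keep those labelled PRP.
    if (subtree.label().value().equals("PRP")) {
        pronouns.add(subtree);
    }
}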
```
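
To make "pre-order traversal over subtrees" concrete, here is a minimal self-contained sketch that runs without the parser on the classpath. It is not the Stanford implementation: `PreorderSketch`, its nested `Node` class, `labels`, `countTag` and `sample` are hypothetical names invented for illustration, and the explicit-stack iterator is just one plausible way to get this behaviour.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

class PreorderSketch {
    // Hypothetical stand-in for a parse-tree node; NOT the Stanford Tree class.
    static class Node implements Iterable<Node> {
        final String label;
        final List<Node> children = new ArrayList<>();

        Node(String label, Node... kids) {
            this.label = label;
            for (Node k : kids) children.add(k);
        }

        // Pre-order: yield a node first, then its children left to right.
        @Override
        public Iterator<Node> iterator() {
            Deque<Node> stack = new ArrayDeque<>();
            stack.push(this);
            return new Iterator<Node>() {
                @Override public boolean hasNext() { return !stack.isEmpty(); }
                @Override public Node next() {
                    if (stack.isEmpty()) throw new NoSuchElementException();
                    Node n = stack.pop();
                    // Push children in reverse so the leftmost child is popped first.
                    for (int i = n.children.size() - 1; i >= 0; i--) {
                        stack.push(n.children.get(i));
                    }
                    return n;
                }
            };
        }
    }

    // Labels of all nodes, in the order the for-each loop visits them.
    static List<String> labels(Node root) {
        List<String> out = new ArrayList<>();
        for (Node n : root) out.add(n.label);
        return out;
    }

    // Count nodes carrying a given tag, mirroring the PRP filter above.
    static int countTag(Node root, String tag) {
        int count = 0;
        for (Node n : root) if (n.label.equals(tag)) count++;
        return count;
    }

    // Hand-built tree for (S (NP (PRP he)) (VP (VBZ barks))).
    static Node sample() {
        return new Node("S",
            new Node("NP", new Node("PRP", new Node("he"))),
            new Node("VP", new Node("VBZ", new Node("barks"))));
    }

    public static void main(String[] args) {
        System.out.println(labels(sample()));          // [S, NP, PRP, he, VP, VBZ, barks]
        System.out.println(countTag(sample(), "PRP")); // 1
    }
}
```

Note that the words themselves ("he", "barks") show up as leaf nodes in the traversal, which is why the filter compares against the tag label rather than the word.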
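You can also get just the nodes that satisfy some (potentially quite complex) pattern by using tregex, which behaves rather like java.util.regex in that it allows pattern matching, but over trees. You would have something like: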
```
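import edu.stanford.nlp.trees.Tree;
import edu.stanford.nlp.trees.tregex.TregexMatcher;
import edu.stanford.nlp.trees.tregex.TregexPattern;

// A bare tag matches every node with that label; t is the parsed Tree.
TregexPattern tgrepPattern = TregexPattern.compile("PRP");
TregexMatcher m = tgrepPattern.matcher(t);
while (m.find()) {
    Tree subtree = m.getMatch();
    pronouns.add(subtree);
}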
```

Answer 2

dhg*_*dhg 5

Here's a simple example that parses a sentence and finds all of the pronouns.

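```
import java.util.ArrayList;

import edu.stanford.nlp.parser.lexparser.LexicalizedParser;
import edu.stanford.nlp.trees.Tree;

// Class name chosen for illustration only.
public class FindPronouns {

    // Recursively collect every subtree whose label is the pronoun tag PRP.
    private static ArrayList<Tree> findPro(Tree t) {
        ArrayList<Tree> pronouns = new ArrayList<Tree>();
        if (t.label().value().equals("PRP"))
            pronouns.add(t);
        else
            for (Tree child : t.children())
                pronouns.addAll(findPro(child));
        return pronouns;
    }

    public static void main(String[] args) {
        // Load the default model shipped with the parser.
        LexicalizedParser parser = LexicalizedParser.loadModel();
        // parse(String) tokenizes and parses one sentence; older releases
        // accepted apply(String) here instead.
        Tree x = parser.parse("The dog walks and he barks .");
        System.out.println(x);
        ArrayList<Tree> pronouns = findPro(x);
        System.out.println("All Pronouns: " + pronouns);
    }
}
```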

This prints:

    (ROOT (S (S (NP (DT The) (NN dog)) (VP (VBZ walks))) (CC and) (S (NP (PRP he)) (VP (VBZ barks))) (. .)))
    All Pronouns: [(PRP he)]
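
On the visitor part of the question: in Java a visitor can be as small as a callback handed to a recursive walk, much like findPro above. The following is a minimal self-contained sketch under stated assumptions, not the Stanford API: `VisitorSketch`, its nested `Node` class, `walk`, `pronounWords` and `sample` are hypothetical names for illustration. It also matches the Penn Treebank tag `PRP$` (possessive pronouns) via a prefix test, which may or may not be what your Hobbs implementation wants.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

class VisitorSketch {
    // Hypothetical stand-in for a parse-tree node; NOT the Stanford Tree class.
    static class Node {
        final String tag;
        final List<Node> children = new ArrayList<>();

        Node(String tag, Node... kids) {
            this.tag = tag;
            for (Node k : kids) children.add(k);
        }

        boolean isLeaf() { return children.isEmpty(); }
    }

    // Depth-first, pre-order walk that hands every node to the visitor callback.
    static void walk(Node n, Consumer<Node> visitor) {
        visitor.accept(n);
        for (Node child : n.children) walk(child, visitor);
    }

    // Collect the words under preterminals whose tag starts with "PRP"
    // (so both PRP and PRP$ are included).
    static List<String> pronounWords(Node root) {
        List<String> words = new ArrayList<>();
        walk(root, n -> {
            boolean preterminal = n.children.size() == 1 && n.children.get(0).isLeaf();
            if (preterminal && n.tag.startsWith("PRP")) {
                words.add(n.children.get(0).tag);
            }
        });
        return words;
    }

    // Hand-built tree for (S (NP (PRP he)) (VP (VBZ walks) (NP (PRP$ his) (NN dog)))).
    static Node sample() {
        return new Node("S",
            new Node("NP", new Node("PRP", new Node("he"))),
            new Node("VP",
                new Node("VBZ", new Node("walks")),
                new Node("NP",
                    new Node("PRP$", new Node("his")),
                    new Node("NN", new Node("dog")))));
    }

    public static void main(String[] args) {
        System.out.println(pronounWords(sample())); // [he, his]
    }
}
```

Passing the callback in, rather than hard-coding the tag test inside the recursion, keeps the walk reusable for the other node types the algorithm needs to inspect.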

Archived:	13 years, 9 months ago
Views:	5,782
Last activity:	9 years, 3 months ago