一旦语法完成,走ANTLR v4树的最佳方法是什么?

Nuc*_*eon 32 coldfusion antlr antlr4

目标

我正在开发一个为Coldfusion CFscript创建Varscoper的项目.基本上,这意味着检查源代码文件以确保开发人员正确地使用var他们的变量.

在使用ANTLR V4几天后,我有一个语法,在GUI视图中生成一个非常好的解析树.现在,使用该树,我需要一种方法来以编程方式在节点上爬行和寻找变量声明,并确保如果它们在函数内部,则它们具有适当的范围.如果可能的话,我宁愿不在语法文件中这样做,因为这需要将语言的定义与此特定任务混合.

我试过的

我最近的尝试是使用ParserRuleContext并尝试通过它的children通过getPayload().在检查了类之后,getPayLoad()我会有一个ParserRuleContext对象或一个Token对象.不幸的是,使用它我永远无法找到获取特定节点的实际规则类型的方法,只有它包含文本.每个节点的规则类型都是必需的,因为该文本节点是否是被忽略的右手表达式,变量赋值或函数声明都很重要.

问题

  1. 我是ANTLR的新手,这是正确的方法,还是有更好的方法来遍历树?

这是我的示例java代码:

Cfscript.java

import org.antlr.v4.runtime.*;
import org.antlr.v4.runtime.tree.Trees;

public class Cfscript {
    public static void main(String[] args) throws Exception {
        ANTLRInputStream input = new ANTLRFileStream(args[0]);
        CfscriptLexer lexer = new CfscriptLexer(input);
        CommonTokenStream tokens = new CommonTokenStream(lexer);
        CfscriptParser parser = new CfscriptParser(tokens);
        parser.setBuildParseTree(true);
        ParserRuleContext tree = parser.component();
        tree.inspect(parser); // show in gui
        /*
            Recursively go though tree finding function declarations and ensuring all variableDeclarations are varred
            but how?
        */
    }
}
Run Code Online (Sandbox Code Playgroud)

Cfscript.g4

grammar Cfscript;

component
    : 'component' keyValue* '{' componentBody '}'
    ;

componentBody
    : (componentElement)*
    ;

componentElement
    : statement
    | functionDeclaration
    ;

functionDeclaration
    : Identifier? Identifier? 'function' Identifier argumentsDefinition '{' functionBody '}'
    ;

argumentsDefinition
    : '(' argumentDefinition (',' argumentDefinition)* ')'
    | '()'
    ;

argumentDefinition
    : Identifier? Identifier? argumentName ('=' expression)?
    ; 

argumentName
    : Identifier
    ;

functionBody
    : (statement)*
    ;

statement
    : variableStatement
    | nonVarVariableStatement
    | expressionStatement
    ;

variableStatement
    : 'var' variableName '=' expression ';'
    ;

nonVarVariableStatement
    : variableName '=' expression ';'
    ;

expressionStatement
    : expression ';'
    ;

expression
    : assignmentExpression
    | arrayLiteral
    | objectLiteral
    | StringLiteral
    | incrementExpression
    | decrementExpression
    | 'true' 
    | 'false'
    | Identifier
    ;

incrementExpression
    : variableName '++'
    ;

decrementExpression
    : variableName '--'
    ;

assignmentExpression
    : Identifier (assignmentExpressionSuffix)*
    | assignmentExpression (('+'|'-'|'/'|'*') assignmentExpression)+
    ;

assignmentExpressionSuffix
    : '.' assignmentExpression
    | ArrayIndex
    | ('()' | '(' expression (',' expression)* ')' )
    ;

methodCall
    : Identifier ('()' | '(' expression (',' expression)* ')' )
    ;

variableName
    : Identifier (variableSuffix)*
    ;

variableSuffix
    : ArrayIndex
    | '.' variableName
    ;

arrayLiteral
    : '[' expression (',' expression)* ']'
    ;

objectLiteral
    : '{' (Identifier '=' expression (',' Identifier '=' expression)*)? '}'
    ;

keyValue
    : Identifier '=' StringLiteral
    ;

StringLiteral
    :  '"' (~('\\'|'"'))* '"'
    ;

 ArrayIndex
    : '[' [1-9] [0-9]* ']'
    | '[' StringLiteral ']'
    ;

Identifier
    : [a-zA-Z0-9]+
    ;

WS
    : [ \t\r\n]+ -> skip 
    ;

COMMENT 
    : '/*' .*? '*/'  -> skip
    ;
Run Code Online (Sandbox Code Playgroud)

Test.cfc(测试代码文件)

component something = "foo" another = "more" persistent = "true" datasource = "#application.env.dsn#" {
    var method = something.foo.test1;
    testing = something.foo[10];
    testingagain = something.foo["this is a test"];
    nuts["testing"]++;
    blah.test().test3["test"]();

    var math = 1 + 2 - blah.test().test4["test"];

    var test = something;
    var testing = somethingelse;
    var testing = { 
        test = more, 
        mystuff = { 
            interior = test 
        },
        third = "third key"
    };
    other = "Idunno homie";
    methodCall(interiorMethod());

    public function bar() {
        var new = "somebody i used to know";
        something = [1, 2, 3];
    }

    function nuts(required string test1 = "first", string test = "second", test3 = "third") {

    }

    private boolean function baz() {
        var this = "something else";
    }
}
Run Code Online (Sandbox Code Playgroud)

Bar*_*ers 40

如果我是你,我不会手动走这个.在生成词法分析器和解析器之后,ANTLR还会生成一个文件CfscriptBaseListener ,该文件具有适用于所有解析器规则的空方法.您可以让ANTLR遍历您的树并附加一个自定义树监听器,您只能覆盖您感兴趣的那些方法/规则.

在您的情况下,您可能希望在创建新函数时通知(创建新范围),并且您可能对变量赋值(variableStatementnonVarVariableStatement)感兴趣.VarListener当ANTLR走过树时,你的监听器将跟踪所有范围.

我确实稍微改变了一条规则(我补充说objectLiteralEntry):

objectLiteral
    : '{' (objectLiteralEntry (',' objectLiteralEntry)*)? '}'
    ;

objectLiteralEntry
    : Identifier '=' expression
    ;
    

以下演示使生活更轻松:

VarListener.java

public class VarListener extends CfscriptBaseListener {

    private Stack<Scope> scopes;

    public VarListener() {
        scopes = new Stack<Scope>();
        scopes.push(new Scope(null));
    } 

    @Override
    public void enterVariableStatement(CfscriptParser.VariableStatementContext ctx) {
        String varName = ctx.variableName().getText();
        Scope scope = scopes.peek();
        scope.add(varName);
    }

    @Override
    public void enterNonVarVariableStatement(CfscriptParser.NonVarVariableStatementContext ctx) {
        String varName = ctx.variableName().getText();
        checkVarName(varName);
    }

    @Override
    public void enterObjectLiteralEntry(CfscriptParser.ObjectLiteralEntryContext ctx) {
        String varName = ctx.Identifier().getText();
        checkVarName(varName);
    }

    @Override
    public void enterFunctionDeclaration(CfscriptParser.FunctionDeclarationContext ctx) {
        scopes.push(new Scope(scopes.peek()));
    }

    @Override
    public void exitFunctionDeclaration(CfscriptParser.FunctionDeclarationContext ctx) {
        scopes.pop();        
    }

    private void checkVarName(String varName) {
        Scope scope = scopes.peek();
        if(scope.inScope(varName)) {
            System.out.println("OK   : " + varName);
        }
        else {
            System.out.println("Oops : " + varName);
        }
    }
}
Run Code Online (Sandbox Code Playgroud)

一个Scope对象可以是简单的:

Scope.java

class Scope extends HashSet<String> {

    final Scope parent;

    public Scope(Scope parent) {
        this.parent = parent;
    }

    boolean inScope(String varName) {
        if(super.contains(varName)) {
            return true;
        }
        return parent == null ? false : parent.inScope(varName);
    }
}
Run Code Online (Sandbox Code Playgroud)

现在,为了测试这一切,这里是一个小主类:

Main.java

import org.antlr.v4.runtime.*;
import org.antlr.v4.runtime.tree.*;

public class Main {

    public static void main(String[] args) throws Exception {

        CfscriptLexer lexer = new CfscriptLexer(new ANTLRFileStream("Test.cfc"));
        CfscriptParser parser = new CfscriptParser(new CommonTokenStream(lexer));
        ParseTree tree = parser.component();
        ParseTreeWalker.DEFAULT.walk(new VarListener(), tree);
    }
}
Run Code Online (Sandbox Code Playgroud)

如果您运行此类Main,将打印以下内容:

Oops : testing
Oops : testingagain
OK   : test
Oops : mystuff
Oops : interior
Oops : third
Oops : other
Oops : something

毫无疑问,这并不是你想要的,我可能会对Coldfusion的一些范围规则进行讨论.但我认为这将为您提供一些如何正确解决问题的见解.我认为代码是非常自我解释的,但如果不是这样,请不要犹豫要求澄清.

HTH

  • 提示:`ParseTreeWalker.DEFAULT.walk(...)` (2认同)