Lexing and parsing concurrently in F#

Question

When using fslex and fsyacc, is there a simple way to run lexing and parsing concurrently?

Answer 1

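First of all, in real-world cases both lexing and parsing are time-critical, especially if you need to process tokens before parsing, for example to filter and collect comments or to resolve context-dependent conflicts. In such cases the parser often ends up waiting for the lexer.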
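As an illustration of the kind of token processing mentioned above, here is a minimal sketch that separates comments from the tokens the parser should see. The `Token` type and the `separateComments` function are hypothetical names made up for this example; a real fslex/fsyacc project would work on the token union generated by fsyacc.

```fsharp
// Hypothetical token type; a real project would use the union generated by fsyacc.
type Token =
    | Number of int
    | Plus
    | Comment of string
    | Eof

// Splits a token list into the tokens passed on to the parser
// and the comment texts collected on the side (both in source order).
let separateComments (tokens: Token list) : Token list * string list =
    let folder (kept, comments) tok =
        match tok with
        | Comment text -> kept, text :: comments
        | other -> other :: kept, comments
    let kept, comments = List.fold folder ([], []) tokens
    List.rev kept, List.rev comments

let parserTokens, comments =
    separateComments [ Number 1; Comment "first operand"; Plus; Number 2; Eof ]
```

Work like this runs on every token, which is why it can make the lexing side slow enough to be worth moving off the parser's thread.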

Now, the answer to the question: you can run lexing and parsing concurrently with MailboxProcessor.

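The core of the idea: run the lexer inside a MailboxProcessor. The lexer produces new tokens, processes them, and posts them to the mailbox. The lexer is often faster than the parser, so sometimes it has to wait for the parser to catch up. The parser receives the next token whenever it needs one. The code is provided below. You can tune the timeout and traceStep to find the values that work best for your solution.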

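// Assumed to be defined elsewhere: `sr` (a System.IO.TextReader over the input)
// and `lexer_reply` (presumably the type of the values returned by Calc.Lexer.token).
[<Literal>]
let traceStep = 200000L

// Counters for tokens produced by the lexer and consumed by the parser.
let lastTokenNum = ref 0L
let i = ref 0L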

let tokenizerFun = 
    let lexbuf = Lexing.LexBuffer<_>.FromTextReader sr                        
    let timeOfIteration = ref System.DateTime.Now
    fun (chan:MailboxProcessor<lexer_reply>) ->
    let post = chan.Post 
    async {
        while not lexbuf.IsPastEndOfStream do
            lastTokenNum := 1L + !lastTokenNum
            if (!lastTokenNum % traceStep) = 0L then 
                let oldTime = !timeOfIteration
                timeOfIteration := System.DateTime.Now
                let mSeconds = int64 ((!timeOfIteration - oldTime).Duration().TotalMilliseconds)
                if int64 chan.CurrentQueueLength > 2L * traceStep then                                                                                  
                    int (int64 chan.CurrentQueueLength * mSeconds / traceStep)  |> System.Threading.Thread.Sleep      
            let tok = Calc.Lexer.token lexbuf
            // Process tokens here: filter comments, add context-dependent information, etc.
            post tok
    }   

use tokenizer =  new MailboxProcessor<_>(tokenizerFun)

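// Time of the last trace step on the parser side.
let parserTimeOfIteration = ref System.DateTime.Now

// The lexbuf argument is only a stub required by the parser's signature;
// tokens actually come from the tokenizer's mailbox.
let getNextToken (lexbuf:Lexing.LexBuffer<_>) =
    let res = tokenizer.Receive(150000) |> Async.RunSynchronously
    i := 1L + !i 

    if (!i % traceStep) = 0L then 
        let oldTime = !parserTimeOfIteration
        parserTimeOfIteration := System.DateTime.Now
        let seconds = (!parserTimeOfIteration - oldTime).TotalSeconds
        printfn "%d tokens parsed, last step took %f s" !i seconds
    res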

let res =         
    tokenizer.Start()            
    Calc.Parser.file getNextToken <| Lexing.LexBuffer<_>.FromString "*this is stub*"

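For readers who want to try the pattern without an fslex/fsyacc project, the following is a minimal self-contained sketch of the same idea, not the code from the answer. The `Token` type, the toy `lexAll` lexer, the `maxQueued` threshold, and the helper names are all assumptions made up for the illustration. The agent body plays the role of the lexer and posts tokens to its own mailbox, pausing when the queue grows too long, while the consumer plays the role of the parser and pulls tokens with a timeout.

```fsharp
// Hypothetical token type standing in for the fsyacc-generated one.
type Token =
    | Word of string
    | Eof

// Toy lexer: lazily yields one token per space-separated word, then Eof.
// A real lexer would call the fslex-generated token function instead.
let lexAll (input: string) : seq<Token> =
    seq {
        for w in input.Split([| ' ' |], System.StringSplitOptions.RemoveEmptyEntries) do
            yield Word w
        yield Eof
    }

// Assumed back-pressure threshold: the producer pauses above this queue length.
let maxQueued = 1000

// Starts an agent whose body produces tokens and posts them to its own mailbox.
let startTokenizer (input: string) =
    MailboxProcessor<Token>.Start(fun chan ->
        async {
            for tok in lexAll input do
                // Wait while the consumer is far behind.
                while chan.CurrentQueueLength > maxQueued do
                    do! Async.Sleep 1
                chan.Post tok
        })

// Consumer side: pulls tokens from the mailbox until Eof arrives.
// Receive raises a TimeoutException if no token shows up within 5 seconds.
let consume (tokenizer: MailboxProcessor<Token>) =
    let rec loop acc =
        match tokenizer.Receive(5000) |> Async.RunSynchronously with
        | Eof -> List.rev acc
        | Word w -> loop (w :: acc)
    loop []

let words = consume (startTokenizer "select a from t")
```

The sketch uses `Async.Sleep` for throttling so that a thread-pool thread is not blocked while the producer waits; the threshold and the sleep interval are arbitrary and would need tuning, much like the timeout and traceStep in the answer's code.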

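The complete solution is available here: https://github.com/YaccConstructor/ConcurrentLexPars. That solution only demonstrates a full implementation of the described idea. Its performance comparison is not realistic, because the semantic calculation is very simple and there is no token processing.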

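For performance comparison results, see the full report: https://docs.google.com/document/d/1K43g5jokNKFOEHQJVlHM1gVhZZ7vFK2g9CJHyAVtUtg/edit?usp=sharing. There we compare the performance of the sequential and concurrent solutions for a parser of a T-SQL subset. Sequential: 27 seconds, concurrent: 20 seconds, which is about a 26% reduction in run time.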

We also use this technique in a production T-SQL translator.

Archived:	13 years, 1 month ago
Viewed:	1244 times
Last activity:	11 years, 2 months ago