How do I parse a seq of words separated by double spaces using FParsec?

Sim*_*n P 5 f# parsing fparsec

Given the input:

alpha beta gamma  one two three

How can I parse it into the following?

[["alpha"; "beta"; "gamma"]; ["one"; "two"; "three"]]

When there is a more distinctive separator (for example __), I can write it like this:

sepBy (sepBy word (pchar ' ')) (pstring "__")

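This works, but when the group separator is a double space, the pchar in the inner sepBy consumes the first of the two spaces and then the parser fails.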

byt*_*ter 4

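According to the FParsec manual, in sepBy p sep, if sep succeeds and the subsequent p fails (without changing the state), the entire sepBy fails too. Hence, your goal is:

  1. to make the separator fail if it encounters more than a single space character;
  2. to backtrack, so that the "inner" sepBy loop closes cleanly and passes control to the "outer" sepBy loop.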
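
To see that rule in action, here is a minimal sketch of the naive nesting from the question, run against the double-space input. The names pWord, pNaive, and describe are illustrative assumptions, and the `#r` line assumes the code is run as an F# script with `dotnet fsi`.

```fsharp
#r "nuget: FParsec"   // assumption: run as a script; drop this line in a compiled project
open FParsec

// Illustrative word parser (an assumption): one or more ASCII letters.
let pWord : Parser<string, unit> = many1Satisfy isAsciiLetter

// The naive nesting: a single space inside a group, a double space between groups.
let pNaive = sepBy (sepBy pWord (pchar ' ')) (pstring "  ")

// Hypothetical helper that reduces a ParserResult to a short label.
let describe result =
    match result with
    | Success (_, _, _) -> "success"
    | Failure (_, _, _) -> "failure"

// After "gamma", the inner separator consumes the first of the two spaces,
// pWord then fails on the second space, and the inner sepBy fails having
// already consumed input, so the outer sepBy cannot recover.
printfn "%s" (describe (run pNaive "alpha beta gamma  one two three"))
// prints "failure"
```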

Here is how to achieve both:

open FParsec

// Word parser, kept as simple as possible; substitute your own definition.
let pWord = many1Satisfy isAsciiLetter

// Inner separator: exactly one space between words of the same group.
let pSepInner =
    pchar ' '
    .>> notFollowedBy (pchar ' ') // guard: fail if a second space follows
    |> attempt                    // on failure, backtrack so no input is consumed

// Outer separator: a run of one or more spaces between groups.
let pSepOuter =
    pchar ' '
    |> many1                      // loop over 1+ spaces

// Main parser; the result type is string list list.
let pMain =
    pWord
    |> sepBy <| pSepInner         // inner loop: words within a group
    |> sepBy <| pSepOuter         // outer loop: groups

Usage:

run pMain "alpha beta gamma  one two three"
Success: [["alpha"; "beta"; "gamma"]; ["one"; "two"; "three"]]
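
For use from other code, the same parsers can be wrapped in a function. The following is only a sketch under stated assumptions: parseGroups and pGroups are hypothetical names, the added eof (so that unparsed trailing input is rejected) is not part of the answer above, and the `#r` line assumes an F# script run with `dotnet fsi`.

```fsharp
#r "nuget: FParsec"   // assumption: run as a script; drop this line in a compiled project
open FParsec

let pWord : Parser<string, unit> = many1Satisfy isAsciiLetter

// A single space that is not followed by another space; backtracks on failure.
let pSepInner = attempt (pchar ' ' .>> notFollowedBy (pchar ' '))

// A run of spaces between groups.
let pSepOuter = many1 (pchar ' ')

// Assumption: eof is added here so that leftover input counts as an error.
let pGroups = sepBy (sepBy pWord pSepInner) pSepOuter .>> eof

// Hypothetical wrapper: Some groups on success, None on any parse error.
let parseGroups (input: string) : string list list option =
    match run pGroups input with
    | Success (groups, _, _) -> Some groups
    | Failure (_, _, _) -> None

printfn "%A" (parseGroups "alpha beta gamma  one two three")
printfn "%A" (parseGroups "a b   c")   // three spaces also separate groups
```

Because pSepOuter uses many1, any run of two or more spaces between words acts as a group separator, not only exactly two.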