Haskell lexer problem

Mic*_*cah 0 haskell types lexer

I'm writing a lexer in Haskell. Here is the code:

lexer :: String -> [Token]
lexer s
    | s =~ whitespace :: Bool =
        let token = s =~ whitespace :: String in
            lex (drop (length token) s)
    | s =~ number :: Bool =
        let token = s =~ number :: String in
            Val (read token) : lex (drop (length token) s)
    | s =~ operator :: Bool =
        let token = s =~ operator :: String in
            Oper token : lex (drop (length token) s)
    | otherwise = error "unrecognized character"
    where
        whitespace = "^[ \t\n]"
        number = "^[0-9]*(\.[0-9]+)?"
        operator = "^[+-*/()]"

data Token = Val Int | Oper String

I have two problems. First, the regular expression "^[0-9]*(\.[0-9]+)?" triggers this error:

lexical error in string/character literal at character '['

When I comment out the line containing it and the part of the function that uses it, I get this error instead:

Couldn't match expected type `Token'
           against inferred type `(String, String)'
      Expected type: [Token]
      Inferred type: [(String, String)]
    In the expression: lex (drop (length token) s)
    In the expression:
        let token = s =~ whitespace :: String
        in lex (drop (length token) s)

I don't know why I'm getting these errors. Can someone help me?

sth*_*sth 7

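The backslash is the escape character in string literals, as in "\n" for a string containing a newline. If you want a literal backslash, you need to escape it as "\\". That is the problem in the regular expression "^[0-9]*(\.[0-9]+)?": the Haskell parser tries to interpret "\." as a normal string escape and chokes on it (probably because no such escape exists). If you write the regular expression as "^[0-9]*(\\.[0-9]+)?", the error goes away.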
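A minimal sketch of the escaping rule, using only the base library. The name numberPattern is made up for illustration; the pattern is handled as a plain String and never passed to a regex engine here.

```haskell
-- Hypothetical name, for illustration only.
-- In the source text the backslash is doubled; the resulting String
-- contains a single backslash followed by a dot.
numberPattern :: String
numberPattern = "^[0-9]*(\\.[0-9]+)?"

main :: IO ()
main = do
  putStrLn numberPattern   -- prints ^[0-9]*(\.[0-9]+)?
  print (length "\\.")     -- prints 2: one backslash and one dot
```

The regex library therefore receives \. and can treat it as a literal dot, which is presumably what the question intended.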

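The reason for the type error is that lex (drop (length token) s) calls lex from the standard Prelude, which has the type String -> [(String, String)]. You probably meant to make a recursive call to your own function, lexer...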
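The difference between the two functions can be seen side by side in the sketch below. It is not the asker's code: to stay free of any regex package it swaps the regular expressions for span and isDigit from Data.Char, it handles integers only (matching Val Int), and isSpace accepts a few more whitespace characters than the original " \t\n" class. The deriving clause is also an addition, so that results can be printed.

```haskell
import Data.Char (isDigit, isSpace)

-- Show and Eq are derived here only so the result can be printed and compared.
data Token = Val Int | Oper String
  deriving (Show, Eq)

-- Every recursive call goes to lexer itself, not to Prelude's lex.
lexer :: String -> [Token]
lexer [] = []
lexer s@(c:rest)
  | isSpace c = lexer rest                       -- skip one whitespace character
  | isDigit c =
      let (digits, remaining) = span isDigit s   -- longest run of digits
      in Val (read digits) : lexer remaining
  | c `elem` "+-*/()" = Oper [c] : lexer rest    -- single-character operators
  | otherwise = error ("unrecognized character: " ++ [c])

main :: IO ()
main = do
  print (lex "12 + 3")     -- Prelude's lex: [("12"," + 3")]
  print (lexer "12 + 3")   -- [Val 12,Oper "+",Val 3]
```

Prelude's lex reads a single lexeme and returns it together with the unconsumed input, which is why the compiler inferred [(String, String)] where [Token] was expected.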