How do I parse a string into an (int * int) tuple in SML?

Sib*_*ibi 3 ml sml smlnj

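I have a string like "3,4\r\n" and I want to convert it to a tuple, i.e. (3,4).

How can we achieve this in SML?

The reason I have a string value is that I am reading a file, which returns strings like this.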

And*_*erg 7

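You need a simple parser for that. A suitable function for parsing integers, Int.scan, is already available in the library (along with its friends for other types), but you have to write the rest yourself. For example: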

(* scanLine : (char, 's) StringCvt.reader -> (int * int, 's) StringCvt.reader *)
fun scanLine getc stream =
    case Int.scan StringCvt.DEC getc stream
      of NONE => NONE
       | SOME (x1, stream') =>
    case getc stream'
      of NONE => NONE
       | SOME (c1, stream'') =>
    if c1 <> #"," then NONE else
    case Int.scan StringCvt.DEC getc stream''
      of NONE => NONE
       | SOME (x2, stream''') => 
    case getc stream'''
      of NONE => NONE
       | SOME (c2, stream'''') =>
    if c2 <> #"\n" then NONE else
    SOME ((x1, x2), stream'''')

Then, to parse all lines:

(* scanList : ((char, 's) StringCvt.reader -> ('a, 's) StringCvt.reader) -> (char, 's) StringCvt.reader -> ('a list, 's) StringCvt.reader *)
fun scanList scanElem getc stream =
    case scanElem getc stream
      of NONE => SOME ([], stream)
       | SOME (x, stream') =>
    case scanList scanElem getc stream'
      of NONE => NONE
       | SOME (xs, stream'') => SOME (x::xs, stream'')

To use it, for example:

val test = "4,5\n2,3\n"
val result = StringCvt.scanString (scanList scanLine) test
(* val result : (int * int) list option = SOME [(4, 5), (2, 3)] *)

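As you can see, the code is somewhat repetitive. To get rid of all the matching on option types, you can write a few basic parser combinators: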

(* scanCharExpect : char -> (char, 's) StringCvt.reader -> (char, 's) StringCvt.reader *)
fun scanCharExpect expect getc stream =
    case getc stream
      of NONE => NONE
       | SOME (c, stream') =>
         if c = expect then SOME (c, stream') else NONE

(* scanSeq : ((char, 's) StringCvt.reader -> ('a, 's) StringCvt.reader) * ((char, 's) StringCvt.reader -> ('b, 's) StringCvt.reader) -> (char, 's) StringCvt.reader -> ('a * 'b, 's) StringCvt.reader *)
fun scanSeq (scan1, scan2) getc stream =
    case scan1 getc stream
      of NONE => NONE
       | SOME (x1, stream') =>
    case scan2 getc stream'
      of NONE => NONE
       | SOME (x2, stream'') => SOME ((x1, x2), stream'')

fun scanSeqL (scan1, scan2) getc stream =
    Option.map (fn ((x, _), stream) => (x, stream)) (scanSeq (scan1, scan2) getc stream)
fun scanSeqR (scan1, scan2) getc stream =
    Option.map (fn ((_, x), stream) => (x, stream)) (scanSeq (scan1, scan2) getc stream)

(* scanLine : (char, 's) StringCvt.reader -> (int * int, 's) StringCvt.reader *)
fun scanLine getc stream =
    scanSeq (
        scanSeqL (Int.scan StringCvt.DEC, scanCharExpect #","),
        scanSeqL (Int.scan StringCvt.DEC, scanCharExpect #"\n")
    ) getc stream

You can build many more cool abstractions along these lines, especially if you define your own infix operators. But I'll leave it at that.
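As a rough illustration of the infix idea, here is a minimal sketch. The operator names &&& and <<< and the helper expectChar are made up for this example and are not part of the Basis Library; the fixity levels are just one choice that lets a line be written without extra parentheses.

```sml
(* Hypothetical infix combinators; these names are NOT part of the Basis Library. *)
infix 1 &&&   (* run both scanners, keep both results as a pair *)
infix 2 <<<   (* run both scanners, keep only the left result *)

(* Succeeds only if the next character equals `expect`. *)
fun expectChar expect getc stream =
    case getc stream
      of NONE => NONE
       | SOME (c, stream') =>
         if c = expect then SOME (c, stream') else NONE

fun op &&& (scan1, scan2) getc stream =
    case scan1 getc stream
      of NONE => NONE
       | SOME (x1, stream') =>
    case scan2 getc stream'
      of NONE => NONE
       | SOME (x2, stream'') => SOME ((x1, x2), stream'')

fun op <<< (scan1, scan2) getc stream =
    Option.map (fn ((x, _), stream') => (x, stream'))
               ((scan1 &&& scan2) getc stream)

(* <<< binds tighter than &&&, so this reads as (int <<< ",") &&& (int <<< "\n"). *)
fun scanLine getc stream =
    (Int.scan StringCvt.DEC <<< expectChar #","
     &&& Int.scan StringCvt.DEC <<< expectChar #"\n") getc stream

val example = StringCvt.scanString scanLine "3,4\n"
(* example = SOME (3, 4) *)
```

Note that StringCvt.scanString wraps the result in an option, so a failed parse shows up as NONE rather than as an exception.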

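You may also want to deal with whitespace between tokens. The StringCvt.skipWS reader is readily available in the library for that; just insert it in the right places.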
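For the "3,4\r\n" input from the question, one possible placement of StringCvt.skipWS is sketched below. The names expectChar and scanPair are hypothetical. The sketch relies on two facts: Int.scan already skips leading whitespace, and skipWS treats #"\r" and #"\n" as whitespace. The trade-off is that this version no longer insists on exactly one pair per line.

```sml
(* Hypothetical helper: succeeds (with unit) only if the next char equals `expect`. *)
fun expectChar expect getc stream =
    case getc stream
      of NONE => NONE
       | SOME (c, stream') =>
         if c = expect then SOME ((), stream') else NONE

(* scanPair : (char, 's) StringCvt.reader -> (int * int, 's) StringCvt.reader *)
fun scanPair getc stream =
    case Int.scan StringCvt.DEC getc stream        (* skips leading whitespace itself *)
      of NONE => NONE
       | SOME (x1, stream1) =>
    (* skip whitespace before the comma, so "3 , 4" is accepted too *)
    case expectChar #"," getc (StringCvt.skipWS getc stream1)
      of NONE => NONE
       | SOME ((), stream2) =>
    case Int.scan StringCvt.DEC getc stream2
      of NONE => NONE
       | SOME (x2, stream3) =>
    (* drop trailing whitespace, which covers "\n" as well as "\r\n" *)
    SOME ((x1, x2), StringCvt.skipWS getc stream3)

val windowsLine = StringCvt.scanString scanPair "3,4\r\n"
val spacedLine  = StringCvt.scanString scanPair "3 , 4\n"
(* both are SOME (3, 4) *)
```

If line boundaries matter, an alternative is to accept an optional #"\r" followed by a mandatory #"\n" explicitly instead of calling skipWS at the end.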