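A whitespace character is a space, a tab, or a line terminator (that is, a carriage return or a line feed).
I assumed that \s covers " ", \t, \n, \r, and \f.
But when I try to use \s, it does not split the string correctly: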
# let line1 = "We the People of the United States, in Order to form a more perfect";;
# let wsp_regex = Str.regexp "\\s+";;
# let words = Str.split wsp_regex line1;;
val words : string list =
["We the People of the United State"; ", in Order to form a more perfect"]
# let wsp_regex = Str.regexp "[ \\s]+";;
# let words = Str.split wsp_regex line1;;
val words : string list =
["We"; "the"; "People"; "of"; "the"; "United"; "State"; ","; "in"; "Order"; "to"; "form"; "a"; "more"; "perfect"]
# let wsp_regex = Str.regexp "[\\s]+";;
# let words = Str.split wsp_regex line1;;
val words : string list =
["We the People of the United State"; ", in Order to form a more perfect"]
# let wsp_regex = Str.regexp "[ \\s\\t\\n\\r]+";;
# let words = Str.split wsp_regex line1;;
val words : string list =
["We"; "he"; "People"; "of"; "he"; "U"; "i"; "ed"; "S"; "a"; "e"; ","; "i"; "O"; "de"; "o"; "fo"; "m"; "a"; "mo"; "e"; "pe"; "fec"]
# let wsp_regex = Str.regexp "[\s]+";;
Characters 29-31:
Warning 14: illegal backslash escape in string.
val wsp_regex : Str.regexp = <abstr>
# let words = Str.split wsp_regex line1;;
val words : string list =
["We the People of the United State"; ", in Order to form a more perfect"]
# let wsp_regex = Str.regexp "[ \s]+";;
Characters 30-32:
Warning 14: illegal backslash escape in string.
val wsp_regex : Str.regexp = <abstr>
# let words = Str.split wsp_regex line1;;
val words : string list =
["We"; "the"; "People"; "of"; "the"; "United"; "State"; ","; "in"; "Order"; "to"; "form"; "a"; "more"; "perfect"]
# let wsp_regex = Str.regexp "[ \t\n\r\f]+";;
Characters 36-38:
Warning 14: illegal backslash escape in string.
val wsp_regex : Str.regexp = <abstr>
# let words = Str.split wsp_regex line1;;
val words : string list =
["We"; "the"; "People"; "o"; "the"; "United"; "States,"; "in"; "Order"; "to"; "orm"; "a"; "more"; "per"; "ect"]
# let wsp_regex = Str.regexp "[\t\n\r\f]+";;
Characters 35-37:
Warning 14: illegal backslash escape in string.
val wsp_regex : Str.regexp = <abstr>
# let words = Str.split wsp_regex line1;;
val words : string list =
["We the People o"; " the United States, in Order to "; "orm a more per"; "ect"]
The only cases that seem to work are:
# let wsp_regex = Str.regexp "[ ]+";;
# let words = Str.split wsp_regex line1;;
val words : string list =
["We"; "the"; "People"; "of"; "the"; "United"; "States,"; "in"; "Order"; "to"; "form"; "a"; "more"; "perfect"]
# let wsp_regex = Str.regexp "[ \t\n\r]+";;
# let words = Str.split wsp_regex line1;;
val words : string list =
["We"; "the"; "People"; "of"; "the"; "United"; "States,"; "in"; "Order"; "to"; "form"; "a"; "more"; "perfect"]
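I am not sure why the second case works, since [ \s]+ does not (OCaml thinks I want to split on a space or an s).
What I want is to split on whitespace without using just " ", because I also want to catch \t, \n, \r, and \f.
But I cannot figure out how to write a regular expression in OCaml that splits on whitespace.
If someone could give me a working expression, it would be greatly appreciated!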
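If you look at the documentation of the Str module, you will find that \s is not supported. Your first expression therefore splits the words on sequences of the character s. That is exactly what you are seeing.
None of your other attempts with \s will work either, because \s is not supported.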
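A minimal sketch of that behaviour, assuming the Str library is linked (for instance with #load "str.cma";; in the toplevel). The sample string is made up for illustration. Since Str has no \s class, the backslash just quotes the next character, so "\\s+" acts like "s+":

```ocaml
(* Requires the Str library, e.g.
   ocamlfind ocamlopt -package str -linkpkg example.ml *)

(* In Str, "\\s+" is read as one or more literal 's' characters. *)
let s_regex = Str.regexp "\\s+"

(* The split happens on runs of 's', not on the space. *)
let on_s = Str.split s_regex "glass house"
(* on_s = ["gla"; " hou"; "e"] *)

(* A space does not match the pattern, a lowercase 's' does. *)
let matches_space = Str.string_match s_regex " " 0
let matches_s = Str.string_match s_regex "s" 0
```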
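Surprisingly, even \n (the two-character sequence) is not supported as a regular expression. So if you want to match a newline, you need an actual newline character in the regex pattern. In other words, you want the OCaml string "\n" rather than "\\n". The same is true for \r and \t.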
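The difference can be sketched as follows, again assuming Str is linked. The names real_newline, escaped_n, and text are only illustrative. In "\n" the OCaml lexer has already produced a line feed before Str sees the pattern, while "\\n" reaches Str as a backslash followed by n, which it treats as a plain n:

```ocaml
(* Requires the Str library. *)

(* "\n" is a one-character OCaml string holding a real line feed. *)
let real_newline = Str.regexp "\n"

(* "\\n" is two characters, a backslash and 'n'; Str reads it as a literal 'n'. *)
let escaped_n = Str.regexp "\\n"

let text = "one\ntwo"

(* Splits on the line feed. *)
let by_newline = Str.split real_newline text
(* by_newline = ["one"; "two"] *)

(* Splits on the letter 'n' and leaves the line feed in place. *)
let by_n = Str.split escaped_n text
(* by_n = ["o"; "e\ntwo"] *)
```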
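The \f notation is not accepted by OCaml's string syntax. If you want to match a form feed, you need to use its hexadecimal escape, \x0c.
Putting this all together, your pattern should be: "[ \n\r\x0c\t]+".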
# Str.split (Str.regexp "[ \n\r\x0c\t]+") line1;;
- : string list =
["We"; "the"; "People"; "of"; "the"; "United"; "States,"; "in";
"Order"; "to"; "form"; "a"; "more"; "perfect"]
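To check that the pattern really covers more than plain spaces, here is a small sketch that wraps it in a helper and feeds it a string mixing tabs, a CR/LF pair, and a form feed. The name split_whitespace and the test string are assumptions for illustration, not part of Str:

```ocaml
(* Requires the Str library. *)

(* Space, line feed, carriage return, form feed (\x0c), and tab.
   The OCaml lexer expands the escapes, so Str receives the raw characters. *)
let whitespace = Str.regexp "[ \n\r\x0c\t]+"

(* Hypothetical helper name; it simply applies Str.split with the pattern. *)
let split_whitespace s = Str.split whitespace s

let words = split_whitespace "We\tthe  People\r\nof\x0cthe United States,"
(* words = ["We"; "the"; "People"; "of"; "the"; "United"; "States,"] *)
```

Str.split ignores delimiters at the very start and end of the string, so leading or trailing whitespace does not produce empty strings in the result.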
There is also a Perl-compatible regular expression package that you may find more comfortable to use: https://opam.ocaml.org/packages/pcre/pcre.7.1.5/