I'm trying to tokenize a string. As long as there are no quoted strings, everything works fine:
string:tokens ("abc def ghi", " ").
["abc","def","ghi"]
But string:tokens/2 does not help much with quoted strings. It behaves as expected, splitting inside the quotes:
string:tokens ("abc \"def xyz\" ghi", " ").
["abc","\"def","xyz\"","ghi"]
What I need is a function that tokenizes a string given a separator and a quote character. Something like:
tokens ("abc \"def xyz\" ghi", " ", "\"").
["abc","def xyz","ghi"]
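Before I start reinventing the wheel, my question is:
Is there such a function, or something similar, in the standard library?
Edit:
Well, I wrote my own implementation, but I'm still very interested in an answer to the original question. Here is my code so far: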
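% Split on spaces, keeping double-quoted runs together as one token.
tokens (String) -> tokens (String, [], [] ).
% End of input: flush a non-empty buffer, then strip the surrounding quotes.
% (Flushing only when the buffer is non-empty avoids a trailing "" token.)
tokens ( [], Tokens, [] ) ->
    lists:map (fun (Token) -> string:strip (Token, both, $") end, Tokens);
tokens ( [], Tokens, Buffer) -> tokens ( [], Tokens ++ [Buffer], [] );
tokens ( [Character | String], Tokens, Buffer) ->
    case {Character, Buffer} of
        {$\s, [] } -> tokens (String, Tokens, Buffer);                            % skip separators between tokens
        {$\s, [$" | _] } -> tokens (String, Tokens, Buffer ++ [Character] );      % space inside quotes is kept
        {$\s, _} -> tokens (String, Tokens ++ [Buffer], [] );                     % space ends an unquoted token
        {$", [] } -> tokens (String, Tokens, "\"" );                              % opening quote
        {$", [$" | _] } -> tokens (String, Tokens ++ [Buffer ++ "\""], [] );      % closing quote
        {$", _} -> tokens (String, Tokens ++ [Buffer], "\"");                     % quote ends an unquoted token
        _ -> tokens (String, Tokens, Buffer ++ [Character] )                      % ordinary character
    end.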
If regular expressions are acceptable in the general case, you can use:
> re:split("abc \"def xyz\" ghi", " \"|\" ", [{return, list}]).
["abc","def xyz","ghi"]
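You can also use "\\s\"|\"\\s" if you want to split on any whitespace rather than only spaces. (The backslash has to be doubled: in an Erlang string literal, "\s" is just the escape for a space character.)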
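Note that the split patterns above only cut at a quote next to whitespace, so a space between two unquoted words is not a split point. If that matters, one possible variation is to match the tokens instead of the separators. The following is only a sketch under that assumption: the module name `qtok_re` and the helper `unquote/1` are made up for illustration and are not part of the standard library.

```erlang
-module(qtok_re).
-export([tokens/1]).

%% Match either a complete double-quoted run or a run of characters that
%% are neither whitespace nor quotes, then drop the surrounding quotes.
tokens(String) ->
    Pattern = "\"[^\"]*\"|[^\\s\"]+",
    case re:run(String, Pattern, [global, {capture, first, list}]) of
        {match, Matches} -> [unquote(Token) || [Token] <- Matches];
        nomatch -> []
    end.

%% A quoted match always starts and ends with a quote, so remove both.
unquote([$" | Rest]) -> lists:droplast(Rest);
unquote(Token) -> Token.
```

With this sketch, `qtok_re:tokens("abc def \"x y\"")` splits the two unquoted words as well. An unterminated quote is simply ignored and the words after it are returned individually.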
If you happen to be parsing this from an input file, you may want to use strip_split/2 from EString.
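For a version without regular expressions that takes the separator and quote characters as arguments, in the shape of the tokens/3 call sketched in the question, something along these lines might work. It is a hand-written sketch, not a standard library function; the module name `qtok` and the helpers `scan`, `quoted`, `start` and `flush` are hypothetical. It also differs from the implementation in the question in one respect: a quote in the middle of a word continues the current token instead of starting a new one.

```erlang
-module(qtok).
-export([tokens/3]).

%% tokens(String, Separators, Quotes): split String on any character in
%% Separators, except inside a run delimited by a character from Quotes.
tokens(String, Seps, Quotes) ->
    scan(String, Seps, Quotes, none, []).

%% Cur is 'none' between tokens, otherwise the current token reversed.
scan([], _Seps, _Quotes, Cur, Acc) ->
    lists:reverse(flush(Cur, Acc));
scan([C | Rest], Seps, Quotes, Cur, Acc) ->
    case {lists:member(C, Quotes), lists:member(C, Seps)} of
        {true, _} -> quoted(Rest, C, Seps, Quotes, start(Cur), Acc);
        {false, true} -> scan(Rest, Seps, Quotes, none, flush(Cur, Acc));
        {false, false} -> scan(Rest, Seps, Quotes, [C | start(Cur)], Acc)
    end.

%% Inside quotes: copy everything up to the matching quote character Q.
%% An unterminated quote simply runs to the end of the input.
quoted([], _Q, _Seps, _Quotes, Cur, Acc) ->
    lists:reverse(flush(Cur, Acc));
quoted([Q | Rest], Q, Seps, Quotes, Cur, Acc) ->
    scan(Rest, Seps, Quotes, Cur, Acc);
quoted([C | Rest], Q, Seps, Quotes, Cur, Acc) ->
    quoted(Rest, Q, Seps, Quotes, [C | Cur], Acc).

%% Begin a token if none is in progress.
start(none) -> [];
start(Cur) -> Cur.

%% Emit the token in progress, if any, onto the (reversed) result list.
flush(none, Acc) -> Acc;
flush(Cur, Acc) -> [lists:reverse(Cur) | Acc].
```

Tokens are accumulated in reverse and flipped once at the end, which avoids the repeated `++` appends; tracking `none` separately from an empty buffer lets an empty quoted string `""` survive as an empty token.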