Rub*_*ver 0 ruby regex parsing
The string to parse (ignoring the spaces):
"instrumentalist ( bass (upright , fretless , 5-string ) , guitar ( electric , acoustic ) , trumpet ), teacher , songwriter, producer"
I need to get this structure in Ruby:
["instrumentalist",[["bass",["upright","fretless","5-string"]],["guitar",["electric","acoustic"]],["trumpet"]],["teacher"],["songwriter"],["producer"]]
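Because of the nested ( and ) and the commas, String#partition can't help me. I really don't know whether there is a fancy regex that can extract this kind of string, or do I have to use a lexer?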
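Regular expressions on their own aren't really the right tool for this kind of problem, even though the basic process is simple: walk through the string looking for commas or parentheses. When you find a comma, add the previously read characters to the current nesting. When you find an opening parenthesis, your nesting level goes up by one; when you find a closing parenthesis, it goes down by one.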
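To make that walk concrete, here is a minimal sketch that uses nothing but a character loop and an array as the stack. The method name parse_nested is hypothetical, and the sketch assumes balanced parentheses and that spaces can simply be dropped. It returns leaf items as plain strings, so the shape is close to, but not identical to, the structure asked for in the question.

```ruby
# Hypothetical helper: walks the string one character at a time.
# Assumes the parentheses are balanced.
def parse_nested(input)
  root = []
  stack = [root]  # stack.last is the list currently being filled
  token = +''     # characters read since the last delimiter

  input.delete(' ').each_char do |ch|
    case ch
    when '('
      children = []
      stack.last << [token, children]  # the token names a new nesting
      stack << children                # nesting level goes up by one
      token = +''
    when ','
      stack.last << token unless token.empty?
      token = +''
    when ')'
      stack.last << token unless token.empty?
      token = +''
      stack.pop                        # nesting level goes down by one
    else
      token << ch
    end
  end

  stack.last << token unless token.empty?  # flush the final token
  root
end

p parse_nested('a(b,c),d')
```

The stack only ever holds references to the arrays being filled, so popping a level never loses data: the finished list is already stored inside its parent.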
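StringScanner is designed for this kind of thing, since it lets us walk through a string while maintaining some state; in this case, a stack that mirrors your opening and closing parentheses. Something like this works for me: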
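require 'strscan'

def parse(input)
  scanner = StringScanner.new(input)
  stack = [[]]  # stack.last is the list for the current nesting level
  # Read one name, then the single delimiter that follows it.
  while (string = scanner.scan(/[^(),]+/))
    case scanner.scan(/[(),]/)
    when '('
      # The name opens a new nesting: store [name, children] and descend.
      new_nesting = [string, []]
      stack.last << new_nesting
      stack << new_nesting[1]
    when ')'
      # The name is the last item of this nesting: skip a trailing comma,
      # store the name, then go back up one level.
      scanner.scan(/,/)
      stack.last << string
      stack.pop
    else
      # A comma or the end of the input: store the name at this level.
      stack.last << string
    end
  end
  stack.last
end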
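As a possible extension, the sketch below keeps the same stack idea but also tolerates whitespace around delimiters and back-to-back closing parentheses. The name parse_tolerant and the pending variable are hypothetical. It assumes balanced parentheses and item names that contain no whitespace. It uses only StringScanner#scan, #skip and #eos?.

```ruby
require 'strscan'

# Hypothetical variant: reads one token at a time (a name or a single
# delimiter) and applies the stack logic to each token.
def parse_tolerant(input)
  scanner = StringScanner.new(input)
  root = []
  stack = [root]
  pending = nil  # last name read that has not been stored yet

  until scanner.eos?
    scanner.skip(/\s+/)  # ignore whitespace between tokens
    if (name = scanner.scan(/[^(),\s]+/))
      pending = name
    elsif scanner.scan(/\(/)
      children = []
      stack.last << [pending, children]  # the pending name opens a nesting
      stack << children
      pending = nil
    elsif scanner.scan(/,/)
      stack.last << pending if pending
      pending = nil
    elsif scanner.scan(/\)/)
      stack.last << pending if pending
      pending = nil
      stack.pop  # back up one nesting level
    end
  end

  stack.last << pending if pending  # flush a trailing name
  root
end

input = 'instrumentalist ( bass (upright , fretless , 5-string ) , ' \
        'guitar ( electric , acoustic ) , trumpet ), teacher , songwriter, producer'
p parse_tolerant(input)
```

Leaf items come back as plain strings rather than one-element arrays, so a final mapping step would still be needed to match the exact structure in the question.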