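I am given strings in the following format: "a{1;4:6}" and "a{1;2}b{2:4}", where ; separates two distinct numbers and : denotes a sequence (range) of numbers. There can be any number of semicolon and colon combinations inside the braces.
I want to expand them. These are the results of expanding the two examples above:
"a{1;4:6}" = "a1a4a5a6"
"a{1;2}b{2:4}" = "a1b2b3b4a2b2b3b4"
I have never had to deal with something like this before, because I am usually given strings in some ready-made format that is easy to parse. In this case, I have to parse the string by hand.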
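My attempt is to split the string manually over and over until I hit a colon or a semicolon, and then start building the string from there. This is very inefficient, and I would appreciate any thoughts on this approach. This is basically what the code looks like (I have omitted a lot, just to get to the point more quickly):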
>>> s = "a{1;4:6}"
>>> splitted = s.split("}")
>>> splitted
['a{1;4:6', '']
>>> splitted2 = [s.split("{") for s in splitted]
>>> splitted2
[['a', '1;4:6'], ['']]
>>> splitted3 = [s.split(";") for s in splitted2[0]]
>>> splitted3
[['a'], ['1', '4:6']]
# ... etc, then build up the strings manually once the ranges are figured out.
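The idea behind splitting at the closing brace first was to guarantee that whatever follows it is a new identifier with its associated range. Where am I going wrong? My approach works for simple strings such as the first example, but it does not work for the second example. It is also inefficient. I would appreciate any input on this problem.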
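In the snippet as shown, splitting "a{1;2}b{2:4}" on "}" yields one piece per group plus an empty trailing string, and only `splitted2[0]` is processed afterwards, so the second group is never looked at. One possible way to tokenize the input in a single pass is sketched below. It assumes identifiers are lowercase letters and brace bodies contain only digits, `;` and `:`, with inclusive ranges as in the examples. The names `GROUP`, `expand_body` and `find_groups` are hypothetical, not part of the original code.

```python
import re

# Assumed grammar: a lowercase identifier followed by a brace body
# made up of digits, ';' and ':' only.
GROUP = re.compile(r"([a-z]+)\{([0-9;:]+)\}")

def expand_body(body):
    """Expand a brace body such as '1;4:6' into [1, 4, 5, 6]."""
    numbers = []
    for part in body.split(";"):
        if ":" in part:
            # 'x:y' is treated as an inclusive range, as in the examples
            start, stop = part.split(":")
            numbers.extend(range(int(start), int(stop) + 1))
        else:
            numbers.append(int(part))
    return numbers

def find_groups(s):
    """Return (identifier, numbers) pairs for every group in s."""
    return [(name, expand_body(body)) for name, body in GROUP.findall(s)]

print(find_groups("a{1;2}b{2:4}"))  # [('a', [1, 2]), ('b', [2, 3, 4])]
```

This only extracts the groups; combining them into the final string is a separate step.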
I tried pyparsing, which IMHO produces very readable code (I took pack_tokens from a previous answer).
from pyparsing import nums, Literal, Word, oneOf, Optional, OneOrMore, Group, delimitedList
from string import ascii_lowercase as letters
# transform a '123' to 123
number = Word(nums).setParseAction(lambda s, l, t: int(t[0]))
# parses 234:543 ranges
range_ = number + Literal(':').suppress() + number
# transforms the range x:y to a list [x, x+1, ..., y]
range_.setParseAction(lambda s, l, t: list(range(t[0], t[1]+1)))
# parse the comma delimited list of ranges or individual numbers
# (the question separates items with ';'; pass ';' as the delimiter to match it)
range_list = delimitedList(range_|number,",")
# and pack them in a tuple
range_list.setParseAction(lambda s, l, t: tuple(t))
# parses 'a{2,3,4:5}' group
group = Word(letters, max=1) + Literal('{').suppress() + range_list + Literal('}').suppress()
# transform the group parsed as ['a', [2, 4, 5]] to ['a2', 'a4' ...]
group.setParseAction(lambda s, l, t: tuple("%s%d" % (t[0],num) for num in t[1]))
# the full expression is just those group one after another
expression = OneOrMore(group)
def pack_tokens(s, l, tokens):
    current, *rest = tokens
    if not rest:
        return ''.join(current)  # base case
    return ''.join(token + pack_tokens(s, l, rest) for token in current)
expression.setParseAction(pack_tokens)
parsed = expression.parseString('a{1,2,3}')[0]
print(parsed)
parsed = expression.parseString('a{1,3:7}b{1:5}')[0]
print(parsed)
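To see what the recursion in pack_tokens does without pyparsing in the way, here is a minimal sketch that applies the same idea to plain tuples. The function name `pack` and the hand-written input are assumptions for illustration. The input mimics what the group parse action would produce for the question's second example.

```python
def pack(groups):
    """Nest each later group after every token of the current group."""
    current, *rest = groups
    if not rest:
        # last group: its tokens are simply concatenated
        return "".join(current)
    tail = pack(rest)  # expansion of all following groups
    return "".join(token + tail for token in current)

# groups as they would look for "a{1;2}b{2:4}"
print(pack([("a1", "a2"), ("b2", "b3", "b4")]))  # a1b2b3b4a2b2b3b4
```

Each token of the first group is followed by the full expansion of the remaining groups, which reproduces the "a1b2b3b4a2b2b3b4" result given in the question.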