MLS*_*LSC 21 python regex string
I have the following string, which I split:
>>> st = '%2g%k%3p'
>>> l = filter(None, st.split('%'))
>>> print l
['2g', 'k', '3p']
Now I want to print the letter g twice, the letter k once, and the letter p three times:
ggkppp
How can I do this?
Ant*_*pov 15
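You can use a generator expression with isdigit() to check whether the first character is a digit, and if so return the rest of the string repeated that many times. Then use join to build the output: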
''.join(i[1:]*int(i[0]) if i[0].isdigit() else i for i in l)
Demonstration:
In [70]: [i[1:]*int(i[0]) if i[0].isdigit() else i for i in l ]
Out[70]: ['gg', 'k', 'ppp']
In [71]: ''.join(i[1:]*int(i[0]) if i[0].isdigit() else i for i in l)
Out[71]: 'ggkppp'
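The transcripts above are Python 2 (print l prints a list because filter returns one there). As a minimal sketch of the same idea for Python 3, assuming single-digit counts as in the answer, the empty pieces can be dropped with a list comprehension; the names parts and result are only illustrative:

```python
st = '%2g%k%3p'

# Splitting on '%' leaves an empty first piece; the comprehension drops it.
parts = [p for p in st.split('%') if p]

# Repeat the text after a single leading digit; keep other pieces unchanged.
result = ''.join(p[1:] * int(p[0]) if p[0].isdigit() else p for p in parts)
print(result)  # ggkppp
```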
EDIT
Use the re module when the leading number has more than one digit:
''.join(re.search('(\d+)(\w+)', i).group(2)*int(re.search('(\d+)(\w+)', i).group(1)) if re.search('(\d+)(\w+)', i) else i for i in l)
Example:
In [144]: l = ['12g', '2kd', 'h', '3p']
In [145]: ''.join(re.search('(\d+)(\w+)', i).group(2)*int(re.search('(\d+)(\w+)', i).group(1)) if re.search('(\d+)(\w+)', i) else i for i in l)
Out[145]: 'ggggggggggggkdkdhppp'
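The one-liner above runs the same search up to three times per item. A minimal sketch that matches once per item might look like the following. The helper name expand_item and the constant COUNT_PREFIX are hypothetical, and re.match (anchored at the start of the item) is swapped in for re.search, so items with text before the number are treated differently:

```python
import re

# Compile once; the raw string keeps \d and \w from being read as string escapes.
COUNT_PREFIX = re.compile(r'(\d+)(\w+)')

def expand_item(item):
    """Repeat the text after a leading number; return other items unchanged."""
    m = COUNT_PREFIX.match(item)  # anchored at the start, unlike re.search
    if m is None:
        return item
    return m.group(2) * int(m.group(1))

items = ['12g', '2kd', 'h', '3p']
result = ''.join(expand_item(i) for i in items)
print(result)
```

For the sample list this should give the same string as Out[145].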
EDIT2
For input like the following:
st = '%2g_%3k%3p'
You can remove the _ from the repeated text and then append a single _ when the list item ends with _:
st = '%2g_%3k%3p'
l = list(filter(None, st.split('%')))
''.join((re.search('(\d+)(\w+)', i).group(2)*int(re.search('(\d+)(\w+)', i).group(1))).replace("_", "") + '_' * i.endswith('_') if re.search('(\d+)(\w+)', i) else i for i in l)
Output:
'gg_kkkppp'
EDIT3
A solution without the re module, using ordinary loops; it handles counts of up to 2 digits. You can define these functions:
def add_str(ind, st):
    if not st.endswith('_'):
        return st[ind:] * int(st[:ind])
    else:
        return st[ind:-1] * int(st[:ind]) + '_'

def collect(l):
    final_str = ''
    for i in l:
        if i[0].isdigit():
            if i[1].isdigit():
                final_str += add_str(2, i)
            else:
                final_str += add_str(1, i)
        else:
            final_str += i
    return final_str
Then use them like this:
l = ['12g_', '3k', '3p']
print(collect(l))
gggggggggggg_kkkppp
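The two-digit limit comes from checking i[0] and i[1] explicitly. One possible way to lift it, still without re, is to scan the leading digits in a loop. This is only a sketch: split_count and collect_any are hypothetical names, and it assumes the counts are plain ASCII digits:

```python
def split_count(item):
    """Split 'NNNtext' into (count, text); count is 1 when no digits lead."""
    n = 0
    # Assumes ASCII digits; advance past however many there are.
    while n < len(item) and item[n].isdigit():
        n += 1
    count = int(item[:n]) if n else 1
    return count, item[n:]

def collect_any(items):
    parts = []
    for item in items:
        count, text = split_count(item)
        if text.endswith('_'):
            # Repeat everything except the trailing underscore, then add it back once.
            parts.append(text[:-1] * count + '_')
        else:
            parts.append(text * count)
    return ''.join(parts)

print(collect_any(['12g_', '3k', '3p']))
print(collect_any(['101k', 'h']))
```

Collecting the pieces in a list and joining once avoids building the result with repeated string concatenation.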
Avi*_*Raj 13
A one-line regex approach:
>>> import re
>>> st = '%2g%k%3p'
>>> re.sub(r'%|(\d*)(\w+)', lambda m: int(m.group(1))*m.group(2) if m.group(1) else m.group(2), st)
'ggkppp'
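The regex %|(\d*)(\w+) matches every %, and it also captures zero or more digits that come before the word characters into one group and the word characters that follow into another group. On substitution, each match is replaced by the value the replacement function returns, so the % characters are dropped.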
Or:
>>> re.sub(r'%(\d*)(\w+)', lambda m: int(m.group(1))*m.group(2) if m.group(1) else m.group(2), st)
'ggkppp'
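To see what the two groups capture, a short sketch with re.findall can help. It uses the second pattern from this answer; the names pairs and repeat are only illustrative, and repeat is a named version of the lambda with a missing count treated as 1:

```python
import re

st = '%2g%k%3p'

# Each match yields (digits, text); digits is '' when no count is given.
pairs = re.findall(r'%(\d*)(\w+)', st)
print(pairs)  # [('2', 'g'), ('', 'k'), ('3', 'p')]

def repeat(m):
    # An empty digits group means "repeat once".
    count = int(m.group(1)) if m.group(1) else 1
    return m.group(2) * count

expanded = re.sub(r'%(\d*)(\w+)', repeat, st)
print(expanded)  # ggkppp
```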
Łuk*_*ski 11
This assumes that you always repeat a single letter, but that the number in front of it may have more than one decimal digit.
seq = ['2g', 'k', '3p']
result = ''.join(int(s[:-1] or 1) * s[-1] for s in seq)
assert result == "ggkppp"
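Another approach is to define a function that converts nC into CCCC...C (n times), pass it to map so that it is applied to every element of the list produced by split on %, and finally join the results, like this: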
>>> def f(s):
...     x = 0
...     if s:
...         if len(s) == 1:
...             out = s
...         else:
...             for i in s:
...                 if i.isdigit():
...                     x = x*10 + int(i)
...             out = x*s[-1]
...     else:
...         out = ''
...     return out
>>> st
'%4g%10k%p'
>>> ''.join(map(f, st.split('%')))
'ggggkkkkkkkkkkp'
>>> st = '%2g%k%3p'
>>> ''.join(map(f, st.split('%')))
'ggkppp'
Or, if you want to put all of this into a single function definition:
>>> def f(s):
...     out = ''
...     if s:
...         l = filter(None, s.split('%'))
...         for item in l:
...             x = 0
...             if len(item) == 1:
...                 repl = item
...             else:
...                 for c in item:
...                     if c.isdigit():
...                         x = x*10 + int(c)
...                 repl = x*item[-1]
...             out += repl
...     return out
>>> st
'%2g%k%3p'
>>> f(st)
'ggkppp'
>>>
>>> st = '%4g%10k%p'
>>>
>>> f(st)
'ggggkkkkkkkkkkp'
>>> st = '%4g%101k%2p'
>>> f(st)
'ggggkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkpp'
>>> len(f(st))
107
Edit:
If the string contains _ characters that the OP does not want repeated, then I think the best approach is re.sub, which makes this much easier:
>>> import re
>>> def f(s):
...     pat = re.compile(r'%(\d*)([a-zA-Z]+)')
...     out = pat.sub(lambda m: int(m.group(1))*m.group(2) if m.group(1) else m.group(2), s)
...     return out
>>> st = '%4g_%12k%p__%m'
>>> f(st)
'gggg_kkkkkkkkkkkkp__m'
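The reason [a-zA-Z]+ leaves the underscores alone is that \w also matches _, so a \w+ group would pull the underscore into the repeated text. A small sketch of the difference, where repeat, with_word_chars and with_letters are hypothetical names:

```python
import re

st = '%4g_%12k%p__%m'

def repeat(m):
    # A missing count means "repeat once".
    count = int(m.group(1)) if m.group(1) else 1
    return m.group(2) * count

# \w matches letters, digits and '_', so 'g_' is repeated as a unit.
with_word_chars = re.sub(r'%(\d*)(\w+)', repeat, st)

# [a-zA-Z] stops before '_', so underscores pass through untouched.
with_letters = re.sub(r'%(\d*)([a-zA-Z]+)', repeat, st)

print(with_word_chars)
print(with_letters)
```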