Rob*_*rto 0 python regex string
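I am trying to iterate over a list of strings and keep only those that match a naming template I have specified. I want to accept any list entry that matches the template exactly, apart from having an integer in the variable <SCENARIO> field.
The check needs to be general. Specifically, the string structure could change, so there is no guarantee that <SCENARIO> always appears at character X (which rules out, for example, slicing at a fixed position inside a list comprehension).
The code below shows an approach that works using split, but there must be a better way to do this string comparison. Could I use regular expressions here?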
template = 'name_is_here_<SCENARIO>_20131204.txt'
testList = ['name_is_here_100_20131204.txt',      # should accept
            'name_is_here_100_20131204.txt.NEW',  # should reject
            'other_name.txt']                     # should reject

acceptList = []
for name in testList:
    print(name)
    acceptFlag = True
    splitTemplate = template.split('_')
    splitName = name.split('_')
    # if lengths do not match, name cannot possibly match template
    if len(splitTemplate) == len(splitName):
        print(list(zip(splitTemplate, splitName)))
        # compare records in the split
        for t, n in zip(splitTemplate, splitName):
            if t != n and t != '<SCENARIO>':
                # reject if any of the "other" fields are not identical
                # (would also check that '<SCENARIO>' field is numeric - not shown here)
                print('reject: ' + name)
                acceptFlag = False
    else:
        acceptFlag = False
    # keep name if it passed checks
    if acceptFlag:
        acceptList.append(name)

print(acceptList)
# correctly prints --> ['name_is_here_100_20131204.txt']
Try using the re regular expression module in Python:
import re

# '\.' matches a literal dot; an unescaped '.' would match any character
template = re.compile(r'^name_is_here_(\d+)_20131204\.txt$')
testList = ['name_is_here_100_20131204.txt',       # accepted
            'name_is_here_100_20131204.txt.NEW',   # rejected!
            'name_is_here_aabs2352_20131204.txt',  # rejected!
            'other_name.txt']                      # rejected!
acceptList = [item for item in testList if template.match(item)]
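
The pattern above hard-codes the file name. Since the question asks for a check that keeps working when the template string changes, one possible variation is to build the regular expression from the template itself. The following is only a sketch: the helper name template_to_regex and its placeholder parameter are made up for illustration and are not part of the original answer. It assumes that <SCENARIO> must be one or more digits and that everything else in the template has to match literally.

```python
import re

def template_to_regex(template, placeholder='<SCENARIO>'):
    """Hypothetical helper: compile a regex from a filename template."""
    # Escape each literal piece so characters such as '.' match only themselves,
    # then join the pieces with a digits-only capture group for the placeholder.
    literal_parts = [re.escape(part) for part in template.split(placeholder)]
    # \Z anchors at the very end of the string; match() anchors at the start.
    return re.compile(r'(\d+)'.join(literal_parts) + r'\Z')

template = 'name_is_here_<SCENARIO>_20131204.txt'
pattern = template_to_regex(template)

testList = ['name_is_here_100_20131204.txt',       # should accept
            'name_is_here_100_20131204.txt.NEW',   # should reject
            'name_is_here_aabs2352_20131204.txt',  # should reject
            'name_is_here_100_20131204xtxt',       # should reject ('.' is literal)
            'other_name.txt']                      # should reject

acceptList = [name for name in testList if pattern.match(name)]
print(acceptList)
```

Because the placeholder is replaced by a capture group, pattern.match(name).group(1) returns the scenario number as a string for any accepted name, should it be needed later.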