Python: test whether a string matches a template value

Rob*_*rto 0 python regex string

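I am trying to iterate through a list of strings, keeping only those that match a naming template I have specified. I want to accept any list entry that matches the template exactly, apart from having an integer in the variable <SCENARIO> field.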

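The check needs to be general. Specifically, the string structure could change, so there is no guarantee that <SCENARIO> always appears at character X (which rules out, for example, a list comprehension that slices at a fixed position).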

The code below shows an approach that works using split, but there must be a better way to do this string comparison. Could I use regular expressions here?

template = 'name_is_here_<SCENARIO>_20131204.txt'

testList = ['name_is_here_100_20131204.txt',        # should accept
            'name_is_here_100_20131204.txt.NEW',    # should reject
            'other_name.txt']                       # should reject

acceptList = []

for name in testList:
    print(name)
    acceptFlag = True
    splitTemplate = template.split('_')
    splitName = name.split('_')
    # if lengths do not match, name cannot possibly match template
    if len(splitTemplate) == len(splitName):
        print(list(zip(splitTemplate, splitName)))
        # compare records in the split
        for t, n in zip(splitTemplate, splitName):
            if t != n and t != '<SCENARIO>':
                # reject if any of the "other" fields are not identical
                # (would also check that '<SCENARIO>' field is numeric - not shown here)
                print('reject: ' + name)
                acceptFlag = False
                break  # one mismatch is enough to reject
    else:
        acceptFlag = False

    # keep name if it passed checks
    if acceptFlag:
        acceptList.append(name)

print(acceptList)
# correctly prints --> ['name_is_here_100_20131204.txt']

Alv*_*tes 5

Try using re, Python's regular expression module:

import re

template = re.compile(r'^name_is_here_(\d+)_20131204\.txt$')  # '.' escaped so it matches only a literal dot

testList = ['name_is_here_100_20131204.txt', #accepted
            'name_is_here_100_20131204.txt.NEW', #rejected!
            'name_is_here_aabs2352_20131204.txt', #rejected!
            'other_name.txt'] #rejected!

acceptList = [item for item in testList if template.match(item)]
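Since the question asks for a general check, one option is to build the pattern from the template string instead of writing it by hand. The following is a minimal sketch, assuming Python 3.4+ (for fullmatch) and that the placeholder is the literal text <SCENARIO>. The helper name template_to_regex is hypothetical and not part of the re module.

```python
import re

def template_to_regex(template, placeholder='<SCENARIO>'):
    """Compile a pattern from a template that contains a placeholder.

    Hypothetical helper for illustration only.
    """
    # Escape each literal piece so characters such as '.' match themselves,
    # then join the pieces with a capture group that accepts ASCII digits only.
    literal_parts = [re.escape(part) for part in template.split(placeholder)]
    return re.compile(r'([0-9]+)'.join(literal_parts))

template = 'name_is_here_<SCENARIO>_20131204.txt'
pattern = template_to_regex(template)

testList = ['name_is_here_100_20131204.txt',       # should accept
            'name_is_here_100_20131204.txt.NEW',   # should reject
            'name_is_here_aabs2352_20131204.txt',  # should reject
            'other_name.txt']                      # should reject

# fullmatch requires the whole string to match, so no ^ or $ is needed
acceptList = [name for name in testList if pattern.fullmatch(name)]
print(acceptList)
# prints ['name_is_here_100_20131204.txt']
```

Using fullmatch rather than match with a trailing $ also rejects a name that ends in a newline, because $ is allowed to match just before a final newline.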