Pau*_*tas 41 python filtering os.walk
I'm looking for a way to include/exclude file patterns and to exclude directories from an os.walk() call.
Here is what I'm doing right now:
import fnmatch
import os

includes = ['*.doc', '*.odt']
excludes = ['/home/paulo-freitas/Documents']

def _filter(paths):
    matches = []
    for path in paths:
        append = None
        for include in includes:
            if os.path.isdir(path):
                append = True
                break
            if fnmatch.fnmatch(path, include):
                append = True
                break
        for exclude in excludes:
            if os.path.isdir(path) and path == exclude:
                append = False
                break
            if fnmatch.fnmatch(path, exclude):
                append = False
                break
        if append:
            matches.append(path)
    return matches

for root, dirs, files in os.walk('/home/paulo-freitas'):
    dirs[:] = _filter(map(lambda d: os.path.join(root, d), dirs))
    files[:] = _filter(map(lambda f: os.path.join(root, f), files))
    for filename in files:
        filename = os.path.join(root, filename)
        print(filename)
The question is: is there a better way to do this? How?
Obe*_*nne 50
This solution uses fnmatch.translate to convert glob patterns to regular expressions (it assumes that the includes are only used for files):
import fnmatch
import os
import os.path
import re

includes = ['*.doc', '*.odt'] # for files only
excludes = ['/home/paulo-freitas/Documents'] # for dirs and files

# transform glob patterns to regular expressions
includes = r'|'.join([fnmatch.translate(x) for x in includes])
excludes = r'|'.join([fnmatch.translate(x) for x in excludes]) or r'$.'

for root, dirs, files in os.walk('/home/paulo-freitas'):
    # exclude dirs: match on the full path, but keep bare names in dirs
    # so that os.walk also descends correctly from a relative top directory
    dirs[:] = [d for d in dirs if not re.match(excludes, os.path.join(root, d))]

    # exclude/include files
    files = [os.path.join(root, f) for f in files]
    files = [f for f in files if not re.match(excludes, f)]
    files = [f for f in files if re.match(includes, f)]

    for fname in files:
        print(fname)
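To make the glob-to-regex step easier to see in isolation, here is a minimal sketch. The helper names compile_globs and wanted are hypothetical (they are not part of the answer above), and the '*/Documents/*' pattern is only an illustration. It assumes Python 3.7 or later, where fnmatch.translate returns a pattern that can be safely joined with '|'.

```python
import fnmatch
import re

def compile_globs(patterns):
    """Combine glob patterns into one compiled regex (hypothetical helper).

    Returns None for an empty list, so no never-matching fallback regex is needed.
    """
    if not patterns:
        return None
    # Wrap each translated pattern in a non-capturing group before joining.
    return re.compile('|'.join('(?:%s)' % fnmatch.translate(p) for p in patterns))

def wanted(path, include_re, exclude_re):
    """Return True if path matches the includes and does not match the excludes."""
    if exclude_re is not None and exclude_re.match(path):
        return False
    return include_re is not None and include_re.match(path) is not None

include_re = compile_globs(['*.doc', '*.odt'])
exclude_re = compile_globs(['*/Documents/*'])  # illustrative pattern only

print(wanted('/home/user/a.doc', include_re, exclude_re))
print(wanted('/home/user/Documents/a.doc', include_re, exclude_re))
```

Note that a translated pattern is always case-sensitive, whereas fnmatch.fnmatch normalizes case on platforms such as Windows, so the two can disagree there.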
koj*_*iro 23
os.walk(top[, topdown=True[, onerror=None[, followlinks=False]]])
When topdown is True, the caller can modify the dirnames list in-place ... this can be used to prune the search ...
for root, dirs, files in os.walk('/home/paulo-freitas', topdown=True):
    # excludes can be done with fnmatch.filter and complementary set,
    # but it's more annoying to read.
    dirs[:] = [d for d in dirs if d not in excludes]
    for pat in includes:
        for f in fnmatch.filter(files, pat):
            print(os.path.join(root, f))
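I should point out that the code above assumes excludes is a pattern, not a full path. To match the OP's case, you would need to adjust the list comprehension so that it filters on os.path.join(root, d) not in excludes.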
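A minimal sketch of that adjustment, assuming the excludes are full directory paths as in the question. The function name walk_filtered and its parameters are hypothetical, and normalizing with os.path.normpath is my own addition so that trailing slashes do not defeat the comparison.

```python
import fnmatch
import os

def walk_filtered(top, includes, excluded_dirs):
    """Yield file paths under top that match any include glob,
    without descending into the excluded directories (full paths)."""
    excluded = {os.path.normpath(p) for p in excluded_dirs}
    for root, dirs, files in os.walk(top, topdown=True):
        # Assign to the slice so os.walk sees the pruned list;
        # rebinding the name dirs would have no effect on the walk.
        dirs[:] = [d for d in dirs
                   if os.path.normpath(os.path.join(root, d)) not in excluded]
        for name in files:
            if any(fnmatch.fnmatch(name, pat) for pat in includes):
                yield os.path.join(root, name)
```

Usage would look like `for path in walk_filtered('/home/paulo-freitas', ['*.doc', '*.odt'], ['/home/paulo-freitas/Documents']): print(path)`. Keeping bare names in dirs and joining only for the comparison means the walk behaves the same whether top is absolute or relative.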
Why fnmatch?
import os
excludes=....
for ROOT,DIR,FILES in os.walk("/path"):
    for file in FILES:
        if file.endswith(('doc','odt')):
            print(file)
    for directory in DIR:
        if directory not in excludes:
            print(directory)
Not exhaustively tested.
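One thing worth checking with the suffix approach: endswith(('doc','odt')) has no leading dot, so a file simply named "mydoc" would also pass. A small sketch of a stricter check follows. The helper name has_wanted_extension is hypothetical, and lowercasing the name to make the test case-insensitive is my assumption, not something the answer asks for.

```python
EXTENSIONS = ('.doc', '.odt')  # leading dot, so 'mydoc' is not treated as a .doc file

def has_wanted_extension(filename, extensions=EXTENSIONS):
    """Return True if filename ends with one of the extensions.

    The comparison is case-insensitive (an assumption made for this sketch).
    """
    return filename.lower().endswith(extensions)

for name in ('report.doc', 'NOTES.ODT', 'mydoc', 'readme.txt'):
    print(name, has_wanted_extension(name))
```

str.endswith accepts a tuple of suffixes, so no loop over the extensions is needed.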