A Python walker that can ignore directories

Joh*_*son 8 python ignore-files directory-walk

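I need a file system walker that I can tell to skip directories I want left untouched, including every subdirectory below that branch. os.walk and os.path.walk just don't do this.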

Ric*_*and 9

Actually, os.walk may be exactly what you want. Say I have a list (or perhaps a set) of directories to ignore, called ignore. Then this should work:

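import os

def my_walk(top_dir, ignore):
    for dirpath, dirnames, filenames in os.walk(top_dir):
        # Slice assignment edits the list in place, so os.walk
        # will not descend into the directories removed here.
        dirnames[:] = [
            dn for dn in dirnames
            if os.path.join(dirpath, dn) not in ignore]
        yield dirpath, dirnames, filenames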
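One detail worth spelling out: the entries in ignore are compared against os.path.join(dirpath, dn), so they need to be written the same way the walk builds its paths (same root prefix, same separators). Here is a minimal sketch of that, using a throwaway tree whose directory names (keep, skip, and so on) are made up for the illustration:

```python
import os
import tempfile

def my_walk(top_dir, ignore):
    """Walk top_dir, skipping any directory whose joined path is in ignore."""
    for dirpath, dirnames, filenames in os.walk(top_dir):
        dirnames[:] = [dn for dn in dirnames
                       if os.path.join(dirpath, dn) not in ignore]
        yield dirpath, dirnames, filenames

def demo():
    # Hypothetical layout, built in a temporary directory:
    #   keep/sub   should be visited
    #   skip/deep  should be pruned together with skip
    with tempfile.TemporaryDirectory() as root:
        os.makedirs(os.path.join(root, "keep", "sub"))
        os.makedirs(os.path.join(root, "skip", "deep"))
        # Build the ignore entry from the same root passed to my_walk,
        # so it matches what os.path.join(dirpath, dn) produces.
        ignore = {os.path.join(root, "skip")}
        return sorted(os.path.relpath(dirpath, root)
                      for dirpath, _, _ in my_walk(root, ignore))
```

Calling demo() returns the visited directories relative to the temporary root. If the ignore entries were written relative to some other starting point, they would simply never match and nothing would be pruned.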

  • This is the intended way; even the documentation for os.path.walk() says so. (2 upvotes)
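The "in place" part is what makes this work. As far as I can tell, rebinding the name dirnames to a new list has no effect on the traversal, because os.walk keeps using its own list object. A small sketch to show the difference (the function names and the skip directory are invented for the example):

```python
import os
import tempfile

def walk_rebinding(top, skip):
    for dirpath, dirnames, filenames in os.walk(top):
        # Rebinding the local name creates a new list; os.walk never sees it.
        dirnames = [d for d in dirnames if d not in skip]
        yield dirpath

def walk_in_place(top, skip):
    for dirpath, dirnames, filenames in os.walk(top):
        # Slice assignment mutates the very list os.walk recurses over.
        dirnames[:] = [d for d in dirnames if d not in skip]
        yield dirpath

def demo():
    with tempfile.TemporaryDirectory() as root:
        os.makedirs(os.path.join(root, "skip", "inner"))
        rebinding = sorted(os.path.relpath(p, root)
                           for p in walk_rebinding(root, {"skip"}))
        in_place = sorted(os.path.relpath(p, root)
                          for p in walk_in_place(root, {"skip"}))
        return rebinding, in_place
```

In demo(), the rebinding version still visits skip and skip/inner, while the in-place version only reports the root. Note that pruning only applies to the default top-down mode; with topdown=False the subdirectories have already been visited by the time you see dirnames.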

Tor*_*rek 7

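You can modify the second element of the tuples that os.walk yields in place:

[...] the caller can modify the dirnames list in-place (perhaps using del or slice assignment), and walk() will only recurse into the subdirectories whose names remain in dirnames; this can be used to prune the search [...]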

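import os

def fwalk(root, predicate):
    for dirpath, dirnames, filenames in os.walk(root):
        # Keep only the subdirectories the predicate accepts; the predicate
        # receives the parent directory path and the subdirectory name.
        dirnames[:] = [d for d in dirnames if predicate(dirpath, d)]
        yield dirpath, dirnames, filenames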

Now you only need to pass in a predicate for the subdirectories:

>>> ignore_list = [...]
>>> list(fwalk("some/root", lambda r, d: d not in ignore_list))
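For a version you can run end to end, here is a rough sketch of the predicate approach on a temporary tree. It assumes the predicate is called with the parent directory path and the subdirectory name; the directory names and the visited_dirs helper are hypothetical, chosen only for the demonstration:

```python
import os
import tempfile

def fwalk(root, predicate):
    for dirpath, dirnames, filenames in os.walk(root):
        # predicate(parent_path, name) decides whether to descend into name.
        dirnames[:] = [d for d in dirnames if predicate(dirpath, d)]
        yield dirpath, dirnames, filenames

def visited_dirs(root, ignore_names):
    """Hypothetical helper: list visited directories relative to root."""
    def keep(parent, name):
        return name not in ignore_names
    return sorted(os.path.relpath(dirpath, root)
                  for dirpath, _, _ in fwalk(root, keep))

def demo():
    with tempfile.TemporaryDirectory() as root:
        # Made-up layout: .git and build should be pruned wherever they occur.
        os.makedirs(os.path.join(root, "src", ".git"))
        os.makedirs(os.path.join(root, "build", "tmp"))
        os.makedirs(os.path.join(root, "docs"))
        return visited_dirs(root, {".git", "build"})
```

Because this predicate only looks at the bare name, it prunes a matching directory at any depth. If you need to ignore one specific path instead, have the predicate compare os.path.join(parent, name) against full paths.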


Joh*_*son 1

So I made this home-rolled walker function:

import os
from os.path import join, isdir, islink

def mywalk(top, topdown=True, onerror=None, ignore_list=('.ignore',)):
    try:
        names = os.listdir(top)
    except Exception as err:
        # Report the error through the optional callback, then give up
        # on this directory.
        if onerror is not None:
            onerror(err)
        return
    # Skip this directory and everything below it if it contains an
    # entry whose name is in ignore_list (for example a '.ignore' file).
    if any(name in ignore_list for name in names):
        return
    dirs, nondirs = [], []
    for name in names:
        if isdir(join(top, name)):
            dirs.append(name)
        else:
            nondirs.append(name)

    if topdown:
        yield top, dirs, nondirs
    for name in dirs:
        path = join(top, name)
        # Do not follow symbolic links to directories.
        if not islink(path):
            yield from mywalk(path, topdown, onerror, ignore_list)
    if not topdown:
        yield top, dirs, nondirs
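For comparison, the same marker-file idea can probably be expressed on top of os.walk using the in-place pruning described in the other answers. This is only a sketch for the top-down case, not a drop-in replacement for the function above; the name marker_walk and the demo layout are made up:

```python
import os
import tempfile

def marker_walk(top, markers=(".ignore",)):
    """Top-down walk that skips any directory containing a marker entry."""
    for dirpath, dirnames, filenames in os.walk(top):
        if any(name in markers for name in dirnames + filenames):
            dirnames[:] = []  # stop os.walk from descending any further
            continue          # and do not report this directory either
        yield dirpath, dirnames, filenames

def demo():
    with tempfile.TemporaryDirectory() as root:
        # Hypothetical layout: a/ holds a .ignore marker, b/ does not.
        os.makedirs(os.path.join(root, "a", "child"))
        os.makedirs(os.path.join(root, "b"))
        open(os.path.join(root, "a", ".ignore"), "w").close()
        return sorted(os.path.relpath(dirpath, root)
                      for dirpath, _, _ in marker_walk(root))
```

Here demo() should report only the root and b, since a and a/child are cut off by the marker. One difference from the hand-written version: the parent still lists a in its dirnames, because the marker is only discovered once the walk steps into a.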