cin*_*nny 84 python traversal directory-structure
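I know that we can use os.walk() to list all subdirectories or all files in a directory. However, I would like to list the full directory tree content.
How can I achieve this in Python?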
dho*_*bbs 121
Here is a function that does this with formatting:
import os

def list_files(startpath):
    for root, dirs, files in os.walk(startpath):
        level = root.replace(startpath, '').count(os.sep)
        indent = ' ' * 4 * (level)
        print('{}{}/'.format(indent, os.path.basename(root)))
        subindent = ' ' * 4 * (level + 1)
        for f in files:
            print('{}{}'.format(subindent, f))
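One caveat with the function above: the depth comes from `root.replace(startpath, '')`, so a `startpath` with a trailing separator (for example `'project/'`) prints the root with an empty name and puts first-level subdirectories at depth 0. A minimal sketch of a variant that computes the depth with `os.path.relpath` instead; the name `list_files_lines`, the `indent_width` parameter, and returning lines rather than printing them are assumptions of this sketch, not part of the answer:

```python
import os

def list_files_lines(startpath, indent_width=4):
    """Return the indented listing as a list of lines (hypothetical helper)."""
    startpath = os.path.normpath(startpath)  # drops any trailing separator
    lines = []
    for root, dirs, files in os.walk(startpath):
        rel = os.path.relpath(root, startpath)
        # relpath gives '.' for the start directory itself
        level = 0 if rel == os.curdir else rel.count(os.sep) + 1
        indent = ' ' * indent_width * level
        lines.append('{}{}/'.format(indent, os.path.basename(root)))
        subindent = ' ' * indent_width * (level + 1)
        for f in files:
            lines.append('{}{}'.format(subindent, f))
    return lines
```

Printing is then just `print('\n'.join(list_files_lines('project/')))`.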
Aar*_*all 36
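List directory tree structure in Python?
We usually prefer to just use GNU tree, but we don't always have tree on every system, and sometimes Python 3 is available. A good answer here can be easily copy-pasted and does not make GNU tree a requirement.
tree's output looks like this: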
$ tree
.
├── package
│   ├── __init__.py
│   ├── __main__.py
│   ├── subpackage
│   │   ├── __init__.py
│   │   ├── __main__.py
│   │   └── module.py
│   └── subpackage2
│       ├── __init__.py
│       ├── __main__.py
│       └── module2.py
└── package2
    └── __init__.py

4 directories, 9 files
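I created the above directory structure in my home directory, under a directory I call pyscratch.
I have also seen other answers here that come close to this sort of output, but I think we can do better, with simpler, more modern code and lazy evaluation.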
To begin, let's use an example that uses the Python 3 Path object and the yield and yield from expressions (which create a generator function):
from pathlib import Path
# prefix components:
space = '    '
branch = '│   '
# pointers:
tee = '├── '
last = '└── '

def tree(dir_path: Path, prefix: str=''):
    """A recursive generator, given a directory Path object
    will yield a visual tree structure line by line
    with each line prefixed by the same characters
    """
    contents = list(dir_path.iterdir())
    # contents each get pointers that are ├── with a final └── :
    pointers = [tee] * (len(contents) - 1) + [last]
    for pointer, path in zip(pointers, contents):
        yield prefix + pointer + path.name
        if path.is_dir():  # extend the prefix and recurse:
            extension = branch if pointer == tee else space
            # i.e. space because last, └── , above so no more |
            yield from tree(path, prefix=prefix+extension)
And now:
for line in tree(Path.home() / 'pyscratch'):
    print(line)
prints:
├── package
│   ├── __init__.py
│   ├── __main__.py
│   ├── subpackage
│   │   ├── __init__.py
│   │   ├── __main__.py
│   │   └── module.py
│   └── subpackage2
│       ├── __init__.py
│       ├── __main__.py
│       └── module2.py
└── package2
    └── __init__.py
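`Path.iterdir()` yields entries in arbitrary order, so the output of the generator above can differ between machines or file systems. A hedged sketch of the same generator with sorting added (directories first, then case-insensitive by name); the name `sorted_tree` and the chosen sort order are assumptions of this sketch, not part of the answer:

```python
from pathlib import Path

space = '    '
branch = '│   '
tee = '├── '
last = '└── '

def sorted_tree(dir_path: Path, prefix: str = ''):
    """Like the recursive generator above, but with a deterministic order."""
    # Directories sort before files (False < True), then by lowercased name.
    contents = sorted(dir_path.iterdir(),
                      key=lambda p: (not p.is_dir(), p.name.lower()))
    pointers = [tee] * (len(contents) - 1) + [last]
    for pointer, path in zip(pointers, contents):
        yield prefix + pointer + path.name
        if path.is_dir():
            extension = branch if pointer == tee else space
            yield from sorted_tree(path, prefix=prefix + extension)
```

It is used the same way: `for line in sorted_tree(Path.home() / 'pyscratch'): print(line)`.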
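We only need to materialize each directory into a list because we need to know how long it is, but afterwards we throw the list away. For deep and broad recursion this should be lazy enough.
The above code, with the comments, should be sufficient to fully understand what we are doing here, but feel free to step through it with a debugger to understand it better if you need to.
Now GNU tree gives us a couple of useful features that I would like this function to have:
prints the count: n directories, m files
option to limit recursion depth: -L level
option to limit output to directories only: -d
Also, when there is a huge tree, it is useful to limit the iteration (e.g. with islice) to avoid locking up the interpreter with text, as at some point the output becomes too verbose to be useful. We can make this limit arbitrarily high by default, say 1000.
So let's remove the previous comments and fill out this functionality: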
from pathlib import Path
from itertools import islice

space = '    '
branch = '│   '
tee = '├── '
last = '└── '

def tree(dir_path: Path, level: int=-1, limit_to_directories: bool=False,
         length_limit: int=1000):
    """Given a directory Path object print a visual tree structure"""
    dir_path = Path(dir_path)  # accept string coerceable to Path
    files = 0
    directories = 0
    def inner(dir_path: Path, prefix: str='', level=-1):
        nonlocal files, directories
        if not level:
            return  # 0, stop iterating
        if limit_to_directories:
            contents = [d for d in dir_path.iterdir() if d.is_dir()]
        else:
            contents = list(dir_path.iterdir())
        pointers = [tee] * (len(contents) - 1) + [last]
        for pointer, path in zip(pointers, contents):
            if path.is_dir():
                yield prefix + pointer + path.name
                directories += 1
                extension = branch if pointer == tee else space
                yield from inner(path, prefix=prefix+extension, level=level-1)
            elif not limit_to_directories:
                yield prefix + pointer + path.name
                files += 1
    print(dir_path.name)
    iterator = inner(dir_path, level=level)
    for line in islice(iterator, length_limit):
        print(line)
    if next(iterator, None):
        print(f'... length_limit, {length_limit}, reached, counted:')
    print(f'\n{directories} directories' + (f', {files} files' if files else ''))

And now we can get the same sort of output as tree:
tree(Path.home() / 'pyscratch')
prints:
pyscratch
├── package
│   ├── __init__.py
│   ├── __main__.py
│   ├── subpackage
│   │   ├── __init__.py
│   │   ├── __main__.py
│   │   └── module.py
│   └── subpackage2
│       ├── __init__.py
│       ├── __main__.py
│       └── module2.py
└── package2
    └── __init__.py

4 directories, 9 files
And we can restrict the number of levels:
tree(Path.home() / 'pyscratch', level=2)
prints:
pyscratch
├── package
│   ├── __init__.py
│   ├── __main__.py
│   ├── subpackage
│   └── subpackage2
└── package2
    └── __init__.py

4 directories, 3 files
And we can limit the output to directories:
tree(Path.home() / 'pyscratch', limit_to_directories=True)
prints:
pyscratch
├── package
│   ├── subpackage
│   └── subpackage2
└── package2

4 directories
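In retrospect, we could have used path.glob for matching. We could perhaps also use path.rglob for recursive globbing, but that would require a rewrite. We could also use itertools.tee instead of materializing a list of directory contents, but that could have negative tradeoffs and would probably make the code even more complex.
Comments are welcome!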
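To illustrate the `rglob` idea mentioned above, here is a rough sketch that derives each entry's depth from its path relative to the root. It produces plain indentation rather than the `├──` and `└──` pointers, because knowing which entry is last in its directory is the part that would need the rewrite. The function name `rglob_lines` and its parameters are assumptions of this sketch:

```python
from pathlib import Path

def rglob_lines(dir_path, pattern: str = '*', indent: str = '    '):
    """Yield an indented line for every path under dir_path matching pattern."""
    root = Path(dir_path)
    # Sorting by path parts groups each directory's contents under it.
    for path in sorted(root.rglob(pattern), key=lambda p: p.parts):
        depth = len(path.relative_to(root).parts) - 1
        yield indent * depth + path.name
```

With a narrower pattern such as `'*.py'`, directories that do not match are omitted, so files can appear indented without their parent line.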
Int*_*tra 19
A solution without your indentation:
import os

for path, dirs, files in os.walk(given_path):
    print(path)
    for f in files:
        print(f)
os.walk already does the top-down, depth-first walk you are looking for.
Ignoring the dirs list prevents the overlapping you mention.
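The `dirs` list is not only safe to ignore: with the default top-down walk, `os.walk` lets you modify it in place to control which subdirectories are visited next. A small sketch that uses this to fix the order of descent and to skip hidden directories; `walk_paths` and the hidden-directory rule are illustrative assumptions, not part of the answer:

```python
import os

def walk_paths(given_path, skip_hidden=True):
    """Yield every directory and file path under given_path, depth-first."""
    for path, dirs, files in os.walk(given_path):
        if skip_hidden:
            # Slice assignment edits the list os.walk itself will recurse into.
            dirs[:] = [d for d in dirs if not d.startswith('.')]
        dirs.sort()  # in-place sort fixes the order of descent
        yield path
        for f in sorted(files):
            yield os.path.join(path, f)
```

Rebinding the name (`dirs = [...]`) would have no effect on the walk; only in-place changes are seen by `os.walk`.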
abs*_*rus 13
Similar to the answers above, but for Python 3, arguably readable and arguably extensible:
from pathlib import Path

class DisplayablePath(object):
    display_filename_prefix_middle = '├──'
    display_filename_prefix_last = '└──'
    display_parent_prefix_middle = '    '
    display_parent_prefix_last = '│   '

    def __init__(self, path, parent_path, is_last):
        self.path = Path(str(path))
        self.parent = parent_path
        self.is_last = is_last
        if self.parent:
            self.depth = self.parent.depth + 1
        else:
            self.depth = 0

    @property
    def displayname(self):
        if self.path.is_dir():
            return self.path.name + '/'
        return self.path.name

    @classmethod
    def make_tree(cls, root, parent=None, is_last=False, criteria=None):
        root = Path(str(root))
        criteria = criteria or cls._default_criteria

        displayable_root = cls(root, parent, is_last)
        yield displayable_root

        children = sorted(list(path
                               for path in root.iterdir()
                               if criteria(path)),
                          key=lambda s: str(s).lower())
        count = 1
        for path in children:
            is_last = count == len(children)
            if path.is_dir():
                yield from cls.make_tree(path,
                                         parent=displayable_root,
                                         is_last=is_last,
                                         criteria=criteria)
            else:
                yield cls(path, displayable_root, is_last)
            count += 1

    @classmethod
    def _default_criteria(cls, path):
        return True

    def displayable(self):
        if self.parent is None:
            return self.displayname

        _filename_prefix = (self.display_filename_prefix_last
                            if self.is_last
                            else self.display_filename_prefix_middle)

        parts = ['{!s} {!s}'.format(_filename_prefix,
                                    self.displayname)]

        parent = self.parent
        while parent and parent.parent is not None:
            parts.append(self.display_parent_prefix_middle
                         if parent.is_last
                         else self.display_parent_prefix_last)
            parent = parent.parent

        return ''.join(reversed(parts))
Example usage:
paths = DisplayablePath.make_tree(Path('doc'))
for path in paths:
    print(path.displayable())
Example output:
doc/
├── _static/
│   ├── embedded/
│   │   ├── deep_file
│   │   └── very/
│   │       └── deep/
│   │           └── folder/
│   │               └── very_deep_file
│   └── less_deep_file
├── about.rst
├── conf.py
└── index.rst
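The `criteria` argument of `make_tree` accepts any callable that takes a `Path` and returns a truthy value for entries to keep; it is applied to the children of each directory. As a small illustration, a predicate that hides dot-files might look like this (the name `is_not_hidden` is only an example for this sketch):

```python
from pathlib import Path

def is_not_hidden(path: Path) -> bool:
    """Keep entries whose name does not start with a dot."""
    return not path.name.startswith('.')

# Intended use with the class above:
#     paths = DisplayablePath.make_tree(Path('doc'), criteria=is_not_hidden)
#     for path in paths:
#         print(path.displayable())
```

Because the filter runs before recursion, a rejected directory is skipped together with everything below it.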
Rub*_*era 12
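I came here looking for the same thing and used dhobbs' answer. As a way of thanking the community, I added some arguments to write the tree to a file, as akshay asked, and made showing files optional so the output is not so verbose. I also made the indentation an optional argument so you can change it, as some people like it to be 2 and others prefer 4.
I used separate loops so that the one that does not show files does not have to check on every iteration whether it should.
I hope it helps someone else as dhobbs' answer helped me. Thanks a lot.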
import os

def showFolderTree(path, show_files=False, indentation=2, file_output=False):
    """
    Shows the content of a folder in a tree structure.
    path -(string)- path of the root folder we want to show.
    show_files -(boolean)- Whether or not we want to see files listed.
                           Defaults to False.
    indentation -(int)- Indentation we want to use, defaults to 2.
    file_output -(string)- Path (including the name) of the file where we want
                           to save the tree.
    """
    tree = []
    if not show_files:
        for root, dirs, files in os.walk(path):
            level = root.replace(path, '').count(os.sep)
            indent = ' ' * indentation * (level)
            tree.append('{}{}/'.format(indent, os.path.basename(root)))
    if show_files:
        for root, dirs, files in os.walk(path):
            level = root.replace(path, '').count(os.sep)
            indent = ' ' * indentation * (level)
            tree.append('{}{}/'.format(indent, os.path.basename(root)))
            for f in files:
                subindent = ' ' * indentation * (level + 1)
                tree.append('{}{}'.format(subindent, f))
    if file_output:
        # Write the tree to the given file, one entry per line.
        with open(file_output, 'w') as output_file:
            for line in tree:
                output_file.write(line)
                output_file.write('\n')
    else:
        # Default behaviour: print on screen.
        for line in tree:
            print(line)
Based on this fantastic post
http://code.activestate.com/recipes/217212-treepy-graphically-displays-the-directory-structur/
here is a refinement that behaves exactly like
http://linux.die.net/man/1/tree
#!/usr/bin/env python2
# -*- coding: utf-8 -*-

# tree.py
#
# Written by Doug Dahms
#
# Prints the tree structure for the path specified on the command line

from os import listdir, sep
from os.path import abspath, basename, isdir
from sys import argv

def tree(dir, padding, print_files=False, isLast=False, isFirst=False):
    if isFirst:
        print padding.decode('utf8')[:-1].encode('utf8') + dir
    else:
        if isLast:
            print padding.decode('utf8')[:-1].encode('utf8') + '└── ' + basename(abspath(dir))
        else:
            print padding.decode('utf8')[:-1].encode('utf8') + '├── ' + basename(abspath(dir))
    files = []
    if print_files:
        files = listdir(dir)
    else:
        files = [x for x in listdir(dir) if isdir(dir + sep + x)]
    if not isFirst:
        padding = padding + '   '
    files = sorted(files, key=lambda s: s.lower())
    count = 0
    last = len(files) - 1
    for i, file in enumerate(files):
        count += 1
        path = dir + sep + file
        isLast = i == last
        if isdir(path):
            if count == len(files):
                if isFirst:
                    tree(path, padding, print_files, isLast, False)
                else:
                    tree(path, padding + ' ', print_files, isLast, False)
            else:
                tree(path, padding + '│', print_files, isLast, False)
        else:
            if isLast:
                print padding + '└── ' + file
            else:
                print padding + '├── ' + file

def usage():
    return '''Usage: %s [-f]
Print tree structure of path specified.
Options:
-f      Print files as well as directories
PATH    Path to process''' % basename(argv[0])

def main():
    if len(argv) == 1:
        print usage()
    elif len(argv) == 2:
        # print just directories
        path = argv[1]
        if isdir(path):
            tree(path, '', False, False, True)
        else:
            print 'ERROR: \'' + path + '\' is not a directory'
    elif len(argv) == 3 and argv[1] == '-f':
        # print directories and files
        path = argv[2]
        if isdir(path):
            tree(path, '', True, False, True)
        else:
            print 'ERROR: \'' + path + '\' is not a directory'
    else:
        print usage()

if __name__ == '__main__':
    main()
There is a package (which I created), seedir, for doing this and other things with folder tree diagrams:
>>> import seedir as sd
>>> sd.seedir('/path/to/some/path/or/package', style='emoji')
📁 package/
├─📄 __init__.py
├─📁 subpackage1/
│ ├─📄 __init__.py
│ ├─📄 moduleX.py
│ └─📄 moduleY.py
├─📁 subpackage2/
│ ├─📄 __init__.py
│ └─📄 moduleZ.py
└─📄 moduleA.py
Something similar to the style the OP used can be done with:
>>> sd.seedir('/path/to/folder', style='spaces', indent=4, anystart='- ')
- package/
    - __init__.py
    - subpackage1/
        - __init__.py
        - moduleX.py
        - moduleY.py
    - subpackage2/
        - __init__.py
        - moduleZ.py
    - moduleA.py
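In addition to dhobbs' answer above (/sf/answers/680993491/), here is an extra function that also stores the results in a file (I personally use it to copy and paste into FreeMind to get a good overview of the structure, which is why I used tabs instead of spaces for indentation):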
import os

def list_files(startpath):
    with open("folder_structure.txt", "w") as f_output:
        for root, dirs, files in os.walk(startpath):
            level = root.replace(startpath, '').count(os.sep)
            indent = '\t' * 1 * (level)
            output_string = '{}{}/'.format(indent, os.path.basename(root))
            print(output_string)
            f_output.write(output_string + '\n')
            subindent = '\t' * 1 * (level + 1)
            for f in files:
                output_string = '{}{}'.format(subindent, f)
                print(output_string)
                f_output.write(output_string + '\n')

list_files(".")
import os

def fs_tree_to_dict(path_):
    file_token = ''
    for root, dirs, files in os.walk(path_):
        tree = {d: fs_tree_to_dict(os.path.join(root, d)) for d in dirs}
        tree.update({f: file_token for f in files})
        return tree  # note we discontinue iteration through os.walk
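If anyone is interested: this recursive function returns a nested structure of dictionaries. Keys are file system names (of directories and files); values are either:
sub-dictionaries, for directories
strings, for files (see file_token)
The strings designating files are empty in this example. They could also hold, for example, the file contents, owner info, privileges, or any object other than a dict. As long as it is not a dictionary, it can easily be distinguished from a "directory type" in further operations.
Having such a tree in the file system: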
# bash:
$ tree /tmp/ex
/tmp/ex
├── d_a
│   ├── d_a_a
│   ├── d_a_b
│   │   └── f1.txt
│   ├── d_a_c
│   └── fa.txt
├── d_b
│   ├── fb1.txt
│   └── fb2.txt
└── d_c
The result will be:
# python 2 or 3:
>>> fs_tree_to_dict("/tmp/ex")
{
    'd_a': {
        'd_a_a': {},
        'd_a_b': {
            'f1.txt': ''
        },
        'd_a_c': {},
        'fa.txt': ''
    },
    'd_b': {
        'fb1.txt': '',
        'fb2.txt': ''
    },
    'd_c': {}
}
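Because directories are dicts and files are anything else, a nested result like the one above can be walked with a plain `isinstance` check. A minimal sketch that turns such a dictionary back into indented lines; `dict_to_lines` is a hypothetical helper for illustration, not part of the answer or of `fsforge`:

```python
def dict_to_lines(tree, indent='    ', depth=0):
    """Yield one indented line per entry of a nested {name: dict | token} tree."""
    for name in sorted(tree):
        value = tree[name]
        if isinstance(value, dict):
            # A dict value marks a directory: emit it and recurse.
            yield indent * depth + name + '/'
            yield from dict_to_lines(value, indent, depth + 1)
        else:
            # Anything else (here the empty file_token string) marks a file.
            yield indent * depth + name

example = {
    'd_a': {'d_a_a': {}, 'd_a_b': {'f1.txt': ''}, 'd_a_c': {}, 'fa.txt': ''},
    'd_b': {'fb1.txt': '', 'fb2.txt': ''},
    'd_c': {},
}
for line in dict_to_lines(example):
    print(line)
```

The same check would keep working if the file values were replaced by file contents or metadata objects, as long as none of them is a dict.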
If you like this, I have already created a package (Python 2 and 3) with this stuff (and a nice pyfakefs helper): https://pypi.org/project/fsforge/