How do I list all files of a directory?

Question

How can I list all files of a directory in Python and add them to a list?

Answer 1

os.listdir() will get you everything that is in a directory, both files and directories.

If you want just files, you can filter this down using os.path:

from os import listdir
from os.path import isfile, join
onlyfiles = [f for f in listdir(mypath) if isfile(join(mypath, f))]


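Or you could use os.walk(), which yields two lists for each directory it visits, splitting the entries into files and directories for you. If you only want the top directory, you can break the first time it yields: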

from os import walk

f = []
for (dirpath, dirnames, filenames) in walk(mypath):
    f.extend(filenames)
    break


Finally, as that example shows, to add one list to another you can use either .extend() or the + operator.

Personally, I prefer .extend().

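`f.extend(filenames)` is not actually equivalent to `f = f + filenames`. `extend` modifies `f` in place, whereas adding creates a new list in a new memory location. This means `extend` is generally more efficient than `+`, but it can sometimes lead to confusion if multiple objects hold references to the list. Lastly, it is worth noting that `f += filenames` is equivalent to `f.extend(filenames)`, _not_ `f = f + filenames`. (136 upvotes)
A bit simpler: `(_, _, filenames) = walk(mypath).next()` (if you are confident that the walk will return at least one value, which it should). (77 upvotes)
@misterbee, your solution is the best, just one small improvement: `_, _, filenames = next(walk(mypath), (None, None, []))` (30 upvotes)
In Python 3.x use `(_, _, filenames) = next(os.walk(mypath))` (25 upvotes)
To get the full paths of all files, including those in subfolders, recursively: `[os.path.join(dirpath, f) for (dirpath, dirnames, filenames) in os.walk(mypath) for f in filenames]` (17 upvotes)
A slight modification to store full paths: `for (dirpath, dirnames, filenames) in os.walk(mypath): checksum_files.extend(os.path.join(dirpath, filename) for filename in filenames)`, followed by `break` (8 upvotes)
@João Víctor Melo `os.walk()` recursively visits all subdirectories, their subdirectories, and so on. By breaking, we only visit the first directory, `mypath`. (3 upvotes)
Is there a way to make it include the full path of each file? (2 upvotes)
`f += filenames` is equivalent to extend and not the other way around??? Jeez. (2 upvotes)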
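To make the `extend` versus `+` point from the comments concrete, here is a small sketch (the variable names are made up for illustration). It shows that `extend` and `+=` change the existing list, so an alias sees the change, while `h = h + filenames` rebinds the name to a new list:

```python
filenames = ["a.txt", "b.txt"]

# extend() mutates the list in place, so the alias sees the new items.
f = []
alias = f
f.extend(filenames)
extend_shared = alias is f

# += on a list also mutates in place (it calls list.__iadd__).
g = []
alias_g = g
g += filenames
iadd_shared = alias_g is g

# h = h + filenames builds a brand-new list and rebinds the name h.
h = []
alias_h = h
h = h + filenames
plus_shared = alias_h is h  # the alias still points at the old, empty list

print(extend_shared, iadd_shared, plus_shared)  # True True False
```

This only matters when something else holds a reference to the list, which is exactly the confusing case the comment warns about.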

Answer 2

ada*_*amk 1507

I prefer using the glob module, as it does pattern matching and expansion.

import glob
print(glob.glob("/home/adam/*.txt"))

It will return a list with the queried files:

['/home/adam/file1.txt', '/home/adam/file2.txt', .... ]


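Just to clarify, this does *not* return the "full path"; it simply returns the expansion of the glob, whatever it may be. For example, given `/home/user/foo/bar/hello.txt`, then, if running in directory `foo`, `glob("bar/*.txt")` will return `bar/hello.txt`. There are cases when you do in fact want the full (i.e., absolute) path; for those cases, see http://stackoverflow.com/questions/51520/how-to-get-an-absolute-file-path-in-python (26 upvotes)
This is a shortcut for listdir + fnmatch: http://docs.python.org/library/fnmatch.html#fnmatch.fnmatch (14 upvotes)
With `from glob import glob as g`, `glob()` becomes `g()`. (8 upvotes)
This does not answer the question. `glob.glob("*")` would. (5 upvotes)
With `from glob import glob`, `glob.glob()` becomes `glob()`. (3 upvotes)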
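Picking up two caveats from the comments (glob returns whatever the pattern expands to, and a pattern can match directories as well as files), a minimal sketch might look like the following. The helper name `list_files` is hypothetical, and a temporary directory stands in for a real one:

```python
import glob
import os
import tempfile

def list_files(directory, pattern="*"):
    """Return absolute paths of regular files in `directory` matching `pattern`."""
    matches = glob.glob(os.path.join(directory, pattern))
    return sorted(os.path.abspath(p) for p in matches if os.path.isfile(p))

# Demo in a throwaway directory so the example does not depend on your disk.
with tempfile.TemporaryDirectory() as tmp:
    open(os.path.join(tmp, "notes.txt"), "w").close()
    open(os.path.join(tmp, "data.csv"), "w").close()
    os.mkdir(os.path.join(tmp, "folder.txt"))  # a directory that matches *.txt

    txt_files = [os.path.basename(p) for p in list_files(tmp, "*.txt")]
    all_files = [os.path.basename(p) for p in list_files(tmp)]

print(txt_files)  # ['notes.txt']
print(all_files)  # ['data.csv', 'notes.txt']
```

The `os.path.isfile` filter is what keeps `folder.txt` out of the result.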

Answer 3

sep*_*p2k 733

import os
os.listdir("somedirectory")

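will return a list of all files and directories in "somedirectory".

@Jixiang: `os.listdir()` always returns mere _filenames_ (not relative paths). What `glob.glob()` returns is driven by the path format of the input pattern. (18 upvotes)
This returns the relative path of the files, as compared with the full path returned by `glob.glob`. (9 upvotes)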
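Because `os.listdir()` returns bare names, any later check such as `os.path.isfile()` needs the directory joined back on, unless the directory happens to be the current one. A short sketch, using a temporary directory as a stand-in for "somedirectory":

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as somedirectory:
    open(os.path.join(somedirectory, "a.txt"), "w").close()
    os.mkdir(os.path.join(somedirectory, "sub"))

    names = sorted(os.listdir(somedirectory))  # bare names only
    # Without the join, isfile() would look in the current working directory.
    files = [n for n in names
             if os.path.isfile(os.path.join(somedirectory, n))]
    full_paths = [os.path.join(somedirectory, n) for n in files]

print(names)  # ['a.txt', 'sub']
print(files)  # ['a.txt']
```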

Answer 4

Gio*_* PY 673

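Getting a list of files with Python 2 and 3

I have also made a short video here: Python: how to get a list of files in a directory

os.listdir()

Or... how to get all the files (and directories) in the current directory (Python 3)

The simplest way to get the files in the current directory in Python 3 is this. It is really simple: use the os module and its listdir() function and you will have the files in that directory (and any folders in it, but not the files in its subdirectories; for those you can use walk, which I will cover later).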

 import os
 arr = os.listdir()
 print(arr)

 >>> ['$RECYCLE.BIN', 'work.txt', '3ebooks.txt', 'documents']

Using glob

I find glob easier for selecting files of the same type or with something in common. Look at the following example:

import glob

txtfiles = []
for file in glob.glob("*.txt"):
    txtfiles.append(file)

Using a list comprehension

import glob

mylist = [f for f in glob.glob("*.txt")]

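Using glob to get every entry in the current directory

As you will have noticed, the code above gives you only names, not the full path of each file. If you need absolute paths, you can pass the names you get from os.listdir() or glob to os.path.abspath, a function of the os.path module (as mexmex suggested, I replaced _getfullpathname with abspath); that and other ways of getting full paths are shown later. First, a small function that returns everything glob finds in the current directory: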

import glob

def filebrowser():
    return [f for f in glob.glob("*")]

x = filebrowser()
print(x)

>>> ['example.txt', 'fb.py', 'filebrowser.py', 'help']

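Finding files by a word or extension in their name with glob

I find this very useful when I am looking for a file whose name I cannot fully remember. The function below searches the names in the current directory: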

import glob

def filebrowser(word=""):
    """Returns a list with all files with the word/extension in it"""
    file = []
    for f in glob.glob("*"):
        if word in f:
            file.append(f)
    return file

flist = filebrowser("example")
print(flist)
flist = filebrowser(".py")
print(flist)

>>> ['example.txt']
>>> ['fb.py', 'filebrowser.py']

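Getting the full path name with os.path.abspath

os.path.abspath turns each name returned by os.listdir() into an absolute path. Note that in Python 2 os.listdir has no default argument, so to list the current directory there you have to pass "." or os.getcwd() to it.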

 import os
 files_path = [os.path.abspath(x) for x in os.listdir()]
 print(files_path)

 >>> ['F:\\documenti\\applications.txt', 'F:\\documenti\\collections.txt']

Going through the directory tree with os.walk

import os

# Getting the current work directory (cwd)
thisdir = os.getcwd()

# r=root, d=directories, f = files
for r, d, f in os.walk(thisdir):
    for file in f:
        if ".docx" in file:
            print(os.path.join(r, file))

os.listdir('.'): getting the files in the current directory (Python 2 and 3)

 import os
 arr = os.listdir('.')
 print(arr)

 >>> ['$RECYCLE.BIN', 'work.txt', '3ebooks.txt', 'documents']

Going up the directory tree with os.listdir()

# Method 1
x = os.listdir('..')

# Method 2
x= os.listdir('/')

Getting files: os.listdir() in a particular directory

 import os
 arr = os.listdir('F:\\python')
 print(arr)

 >>> ['$RECYCLE.BIN', 'work.txt', '3ebooks.txt', 'documents']

Getting the files of a particular subdirectory with os.listdir()

import os

x = os.listdir("./content")

next(os.walk('.')): the files in the current directory

 import os
 arr = next(os.walk('.'))[2]
 print(arr)

 >>> ['5bs_Turismo1.pdf', '5bs_Turismo1.pptx', 'esperienza.txt']

next(os.walk('F:\\')) and os.path.join('dir', 'file'): getting the full path

 import os
 arr = []
 # next() returns a single (dirpath, dirnames, filenames) tuple for the top directory
 r, d, f = next(os.walk("F:\\_python"))
 for file in f:
     arr.append(os.path.join(r, file))

 for f in arr:
     print(f)

>>> F:\_python\dict_class.py
>>> F:\_python\programmi.txt

next(os.walk('F:\\')) - getting the full path - list comprehension

 [os.path.join(r, file) for r, d, f in [next(os.walk("F:\\_python"))] for file in f]

 >>> ['F:\\_python\\dict_class.py', 'F:\\_python\\programmi.txt']

os.walk - getting the full path - all files in subdirectories

x = [os.path.join(r,file) for r,d,f in os.walk("F:\\_python") for file in f]
print(x)

>>> ['F:\\_python\\dict.py', 'F:\\_python\\progr.txt', 'F:\\_python\\readl.py']

os.listdir() - getting only txt files

 arr_txt = [x for x in os.listdir() if x.endswith(".txt")]
 print(arr_txt)

 >>> ['work.txt', '3ebooks.txt']

Using glob to get the full path of the files

If I need the absolute path of the files:

import os
from glob import glob
x = [os.path.abspath(f) for f in glob("F:\\*.txt")]
for f in x:
    print(f)

>>> F:\acquistionline.txt
>>> F:\acquisti_2018.txt
>>> F:\bootstrap_jquery_ecc.txt

Using os.path.isfile to avoid directories in the list

If I want only the files in the directory:

import os.path
listOfFiles = [f for f in os.listdir() if os.path.isfile(f)]
print(listOfFiles)

>>> ['a simple game.py', 'data.txt', 'decorator.py']

Using pathlib (Python 3.4 and later)

import pathlib

flist = []
for p in pathlib.Path('.').iterdir():
    if p.is_file():
        print(p)
        flist.append(p)

 >>> error.PNG
 >>> exemaker.bat
 >>> guiprova.mp3
 >>> setup.py
 >>> speak_gui2.py
 >>> thumb.PNG

If you want to use a list comprehension

flist = [p for p in pathlib.Path('.').iterdir() if p.is_file()]

Using the glob method of pathlib.Path()

import pathlib

py = pathlib.Path().glob("*.py")
for file in py:
    print(file)

>>> stack_overflow_list.py
>>> stack_overflow_list_tkinter.py

* You can also use pathlib.Path() instead of pathlib.Path(".")

Getting all and only the files with os.walk

import os
x = [i[2] for i in os.walk('.')]
y=[]
for t in x:
    for f in t:
        y.append(f)
print(y)

>>> ['append_to_list.py', 'data.txt', 'data1.txt', 'data2.txt', 'data_180617', 'os_walk.py', 'READ2.py', 'read_data.py', 'somma_defaltdic.py', 'substitute_words.py', 'sum_data.py', 'data.txt', 'data1.txt', 'data_180617']

Getting only the files with next and os.walk in a directory

 import os
 x = next(os.walk('F://python'))[2]
 print(x)

 >>> ['calculator.bat','calculator.py']

Getting only the directories with next and os.walk in a directory

 import os
 next(os.walk('F://python'))[1] # for the current dir use ('.')

 >>> ['python3','others']

Getting all the subdirectory names with os.walk

for r,d,f in os.walk("F:\\_python"):
    for dirs in d:
        print(dirs)

>>> .vscode
>>> pyexcel
>>> pyschool.py
>>> subtitles
>>> _metaprogramming
>>> .ipynb_checkpoints

os.scandir() from Python 3.5

import os
x = [f.name for f in os.scandir() if f.is_file()]
print(x)

>>> ['calculator.bat','calculator.py']

# Another example with scandir (a little variation from docs.python.org)
# This one is more efficient than os.listdir.
# In this case, it shows the files only in the current directory
# where the script is executed.

import os
with os.scandir() as i:
    for entry in i:
        if entry.is_file():
            print(entry.name)

>>> ebookmaker.py
>>> error.PNG
>>> exemaker.bat
>>> guiprova.mp3
>>> setup.py
>>> speakgui4.py
>>> speak_gui2.py
>>> speak_gui3.py
>>> thumb.PNG

Ex. 1: How many files are there in the subdirectories?

In this example, we count the files contained in a directory and all of its subdirectories.

import os

def count(dir, counter=0):
    "returns number of files in dir and subdirs"
    for pack in os.walk(dir):
        for f in pack[2]:
            counter += 1
    return dir + " : " + str(counter) + " files"

print(count("F:\\python"))

>>> F:\python : 12057 files
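The same count can be written more compactly by summing the length of each `filenames` list that `os.walk` yields. This is only a sketch: `count_files` is a hypothetical name, and the temporary tree is there just to make it runnable anywhere:

```python
import os
import tempfile

def count_files(top):
    """Count the entries that os.walk reports as files under `top`."""
    return sum(len(filenames) for _, _, filenames in os.walk(top))

with tempfile.TemporaryDirectory() as top:
    os.makedirs(os.path.join(top, "sub", "deeper"))
    for rel in ("a.txt",
                os.path.join("sub", "b.txt"),
                os.path.join("sub", "deeper", "c.txt")):
        open(os.path.join(top, rel), "w").close()
    total = count_files(top)

print(total)  # 3
```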

Ex. 2: How to copy all files from a directory to another?

A script that searches your computer for all files of a given type (default: pptx) and copies them into a new folder.

import os
import shutil

destination = "F:\\file_copied"
# os.makedirs(destination)

def copyfile(dir, filetype='pptx', counter=0):
    "Searches for pptx (or other - pptx is the default) files and copies them"
    for pack in os.walk(dir):
        for f in pack[2]:
            if f.endswith(filetype):
                fullpath = pack[0] + "\\" + f
                print(fullpath)
                shutil.copy(fullpath, destination)
                counter += 1
    if counter > 0:
        print('-' * 30)
        print("\t==> Found in: `" + dir + "` : " + str(counter) + " files\n")

for dir in os.listdir():
    "searches for folders that starts with `_`"
    if dir[0] == '_':
        # copyfile(dir, filetype='pdf')
        copyfile(dir, filetype='txt')


>>> _compiti18\Compito Contabilità 1\conti.txt
>>> _compiti18\Compito Contabilità 1\modula4.txt
>>> _compiti18\Compito Contabilità 1\moduloa4.txt
>>> ------------------------
>>> ==> Found in: `_compiti18` : 3 files

Ex. 3: How to get all the files in a txt file

In case you want to create a txt file with all the file names:

import os
mylist = ""
with open("filelist.txt", "w", encoding="utf-8") as file:
    for eachfile in os.listdir():
        mylist += eachfile + "\n"
    file.write(mylist)

Example: a txt with all the files of a hard drive

"""
We are going to save a txt file with all the files in your directory.
We will use the function walk()
"""

import os

# see all the methods of os
# print(*dir(os), sep=", ")
listafile = []
percorso = []
with open("lista_file.txt", "w", encoding='utf-8') as testo:
    for root, dirs, files in os.walk("D:\\"):
        for file in files:
            listafile.append(file)
            percorso.append(root + "\\" + file)
            testo.write(file + "\n")
listafile.sort()
print("N. of files", len(listafile))
with open("lista_file_ordinata.txt", "w", encoding="utf-8") as testo_ordinato:
    for file in listafile:
        testo_ordinato.write(file + "\n")

with open("percorso.txt", "w", encoding="utf-8") as file_percorso:
    for file in percorso:
        file_percorso.write(file + "\n")

os.system("lista_file.txt")
os.system("lista_file_ordinata.txt")
os.system("percorso.txt")

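All the files of C:\ in one text file

This is a shorter version of the previous code. Change the folder where the search starts if you need to start from another location. On my computer this code generated a 50 MB text file with a little under 500,000 lines, each holding a file's full path.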

import os

with open("file.txt", "w", encoding="utf-8") as filewrite:
    for r, d, f in os.walk("C:\\"):
        for file in f:
            filewrite.write(f"{os.path.join(r, file)}\n")

Function to search for a certain type of file

import os

def searchfiles(extension='.ttf', folder='H:\\'):
    "Create a txt file with all the file of a type"
    with open(extension[1:] + "file.txt", "w", encoding="utf-8") as filewrite:
        for r, d, f in os.walk(folder):
            for file in f:
                if file.endswith(extension):
                    filewrite.write(f"{os.path.join(r, file)}\n")

# looking for png files on the hard disk H:\
searchfiles('.png', 'H:\\')

>>> H:\4bs_18\Dolphins5.png
>>> H:\4bs_18\Dolphins6.png
>>> H:\4bs_18\Dolphins7.png
>>> H:\5_18\marketing html\assets\imageslogo2.png
>>> H:\7z001.png
>>> H:\7z002.png

Showing the search results in a tkinter listbox

The same search with a small GUI: the matching files are inserted into a listbox, and double-clicking an entry opens it with os.startfile (available on Windows only).

import tkinter as tk
import os

def searchfiles(extension='.txt', folder='H:\\'):
    "insert all files in the listbox"
    for r, d, f in os.walk(folder):
        for file in f:
            if file.endswith(extension):
                lb.insert(0, r + "\\" + file)

def open_file():
    os.startfile(lb.get(lb.curselection()[0]))

root = tk.Tk()
root.geometry("400x400")
bt = tk.Button(root, text="Search", command=lambda:searchfiles('.png', 'H:\\'))
bt.pack()
lb = tk.Listbox(root)
lb.pack(fill="both", expand=1)
lb.bind("<Double-Button>", lambda x: open_file())
root.mainloop()

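This is a mishmash of too many answers to questions not asked here. It may also be worth explaining what the caveats or recommended approaches are. I am no better off knowing one way versus 20 ways to do the same thing unless I also know which is more appropriate to use when. (65 upvotes)
Such a compilation can be helpful, but this answer in particular adds no value to the existing answers. Just to give an example, `[f for f in glob.glob("*.txt")]` is equivalent to `glob.glob("*.txt")` and warrants no extra section in this write-up. It is also very wordy and has a lot of spacing. It could be improved by adding explanations or pointing out differences instead of listing yet another variant. (3 upvotes)
OK, I will take a look at my answer as soon as possible and try to make it cleaner, with more useful information about the differences among the methods, etc. (2 upvotes)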

Answer 5

Rem*_*emi 152

A one-line solution to get only the list of files (no subdirectories):

filenames = next(os.walk(path))[2]

or absolute pathnames (absolute as long as path itself is absolute):

paths = [os.path.join(path, fn) for fn in next(os.walk(path))[2]]

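Only a one-liner if you have already done `import os`. Seems less concise than `glob()` to me. (7 upvotes)
The problem with glob is that a folder called 'something.something' would be returned by `glob('/home/adam/*.*')` (4 upvotes)
On OS X, there is something called a bundle. It is a directory which should generally be treated as a file (like a .tar). Would you want those treated as a file or a directory? Using `glob()` would treat it as a file. Your method would treat it as a directory. (4 upvotes)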
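One thing to watch with the one-liner: if `path` does not exist or cannot be read, `os.walk` yields nothing (errors are ignored unless `onerror` is given), so `next()` raises `StopIteration`. A hedged variant passes a default to `next()`; the helper name `top_level_files` is made up for this sketch:

```python
import os
import tempfile

def top_level_files(path):
    """File names directly inside `path`; an empty list if it cannot be walked."""
    _, _, filenames = next(os.walk(path), (path, [], []))
    return sorted(filenames)

with tempfile.TemporaryDirectory() as path:
    open(os.path.join(path, "one.txt"), "w").close()
    os.mkdir(os.path.join(path, "subdir"))
    found = top_level_files(path)
    missing = top_level_files(os.path.join(path, "does_not_exist"))

print(found)    # ['one.txt']
print(missing)  # []
```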

Answer 6

Joh*_*nny 126

Getting full file paths from a directory and all its subdirectories

import os

def get_filepaths(directory):
    """
    This function will generate the file names in a directory 
    tree by walking the tree either top-down or bottom-up. For each 
    directory in the tree rooted at directory top (including top itself), 
    it yields a 3-tuple (dirpath, dirnames, filenames).
    """
    file_paths = []  # List which will store all of the full filepaths.

    # Walk the tree.
    for root, directories, files in os.walk(directory):
        for filename in files:
            # Join the two strings in order to form the full filepath.
            filepath = os.path.join(root, filename)
            file_paths.append(filepath)  # Add it to the list.

    return file_paths  # Self-explanatory.

# Run the above function and store its results in a variable.   
full_file_paths = get_filepaths("/Users/johnny/Desktop/TEST")

The path I provided in the above function contained 3 files: two of them in the root directory, and another in a subfolder called "SUBFOLDER". You can now do things like:
print(full_file_paths), which will print the list:
- ['/Users/johnny/Desktop/TEST/file1.txt', '/Users/johnny/Desktop/TEST/file2.txt', '/Users/johnny/Desktop/TEST/SUBFOLDER/file3.dat']

If you'd like, you can open and read the contents, or focus only on files with the extension ".dat", as in the code below:

for f in full_file_paths:
  if f.endswith(".dat"):
    print(f)


/Users/johnny/Desktop/TEST/SUBFOLDER/file3.dat
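If the tree is large, you may not want to build the whole list first. As a sketch (not part of the original answer), the same walk can be written as a generator with an optional extension filter. `iter_filepaths` and its `suffix` parameter are names invented here, and the temporary tree mimics the TEST folder described above:

```python
import os
import tempfile

def iter_filepaths(directory, suffix=""):
    """Yield full paths under `directory` whose file names end with `suffix`."""
    for root, _directories, files in os.walk(directory):
        for filename in files:
            if filename.endswith(suffix):
                yield os.path.join(root, filename)

with tempfile.TemporaryDirectory() as test_dir:
    os.mkdir(os.path.join(test_dir, "SUBFOLDER"))
    for rel in ("file1.txt", "file2.txt", os.path.join("SUBFOLDER", "file3.dat")):
        open(os.path.join(test_dir, rel), "w").close()

    dat_files = [os.path.relpath(p, test_dir)
                 for p in iter_filepaths(test_dir, ".dat")]
    all_count = len(list(iter_filepaths(test_dir)))

print(dat_files, all_count)
```

With the default empty suffix every file matches, so the function doubles as a plain recursive listing.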

Answer 7

Szi*_*dam 75

Since version 3.4 there are built-in iterators for this, which are a lot more efficient than os.listdir():

pathlib: new in version 3.4.

>>> import pathlib
>>> [p for p in pathlib.Path('.').iterdir() if p.is_file()]

According to PEP 428, the aim of the pathlib library is to provide a simple hierarchy of classes to handle filesystem paths and the common operations users do over them.

os.scandir(): new in version 3.5.

>>> import os
>>> [entry for entry in os.scandir('.') if entry.is_file()]

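Note that os.walk() uses os.scandir() instead of os.listdir() from version 3.5 on, and according to PEP 471 its speed increased by 2-20 times.

Let me also recommend reading ShadowRanger's comment below.

Note: the `os.scandir` solution is going to be more efficient than `os.listdir` with an `os.path.isfile` check or the like, even if you need a `list` (so you don't benefit from lazy iteration), because `os.scandir` uses OS-provided APIs that give you the `is_file` information for free as it iterates, with no per-file round trip to the disk to `stat` them (on Windows, the `DirEntry`s get you complete `stat` info for free; on *NIX systems it needs to `stat` for info beyond `is_file`, `is_dir`, etc., but `DirEntry` caches on the first `stat` for convenience). (6 upvotes)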
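To illustrate the point in the comment, a minimal `os.scandir()` sketch that collects file names using the `is_file()` method of each entry rather than a separate `os.path.isfile()` call. The directory is a temporary one created for the demo:

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as directory:
    open(os.path.join(directory, "report.txt"), "w").close()
    os.mkdir(os.path.join(directory, "archive"))

    # Using scandir as a context manager (Python 3.6+) closes the iterator promptly.
    with os.scandir(directory) as entries:
        file_names = sorted(e.name for e in entries if e.is_file())

print(file_names)  # ['report.txt']
```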

Answer 8

Cri*_*ati 55

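Preliminary notes

Although there is a clear differentiation between the terms file and directory in the question text, some may argue that directories are actually special files
The statement "all files of a directory" can be interpreted in two ways:
1. All direct (or level 1) descendants only
2. All descendants in the whole directory tree (including the ones in subdirectories)
When the question was asked, I imagine that Python 2 was the LTS version; however, the code samples will be run by Python 3(.5) (I'll keep them as Python 2 compliant as possible; also, any code belonging to Python that I'm going to post is from v3.5.4, unless otherwise specified). That has consequences related to another keyword in the question: "add them into a list":
- In pre-Python 2.2 versions, sequences (iterables) were mostly represented by lists (tuples, sets, ...)
- In Python 2.2, the concept of generator ([Python.Wiki]: Generators), courtesy of [Python 3]: The yield statement, was introduced. As time passed, generator counterparts started to appear for functions that returned/worked with lists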
- In Python 3, generator is the default behavior
- Not sure if returning a list is still mandatory (or a generator would do as well), but passing a generator to the list constructor, will create a list out of it (and also consume it). The example below illustrates the differences on [Python 3]: map(function, iterable, ...)
```
>>> import sys
>>> sys.version
'2.7.10 (default, Mar  8 2016, 15:02:46) [MSC v.1600 64 bit (AMD64)]'
>>> m = map(lambda x: x, [1, 2, 3])  # Just a dummy lambda function
>>> m, type(m)
([1, 2, 3], <type 'list'>)
>>> len(m)
3
```
```
>>> import sys
>>> sys.version
'3.5.4 (v3.5.4:3f56838, Aug  8 2017, 02:17:05) [MSC v.1900 64 bit (AMD64)]'
>>> m = map(lambda x: x, [1, 2, 3])
>>> m, type(m)
(<map object at 0x000001B4257342B0>, <class 'map'>)
>>> len(m)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: object of type 'map' has no len()
>>> lm0 = list(m)  # Build a list from the generator
>>> lm0, type(lm0)
([1, 2, 3], <class 'list'>)
>>>
>>> lm1 = list(m)  # Build a list from the same generator
>>> lm1, type(lm1)  # Empty list now - generator already consumed
([], <class 'list'>)
```

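These examples will be based on a directory called root_dir, with the following structure (this example is for Win, but I'm using the same tree on Lnx as well):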

E:\Work\Dev\StackOverflow\q003207219>tree /f "root_dir"
Folder PATH listing for volume Work
Volume serial number is 00000029 3655:6FED
E:\WORK\DEV\STACKOVERFLOW\Q003207219\ROOT_DIR
¦   file0
¦   file1
¦
+---dir0
¦   +---dir00
¦   ¦   ¦   file000
¦   ¦   ¦
¦   ¦   +---dir000
¦   ¦           file0000
¦   ¦
¦   +---dir01
¦   ¦       file010
¦   ¦       file011
¦   ¦
¦   +---dir02
¦       +---dir020
¦           +---dir0200
+---dir1
¦       file10
¦       file11
¦       file12
¦
+---dir2
¦   ¦   file20
¦   ¦
¦   +---dir20
¦           file200
¦
+---dir3

Solutions

Programmatic approaches:

[Python 3]: os.listdir(path='.')

Return a list containing the names of the entries in the directory given by path. The list is in arbitrary order, and does not include the special entries '.' and '..' ...

>>> import os
>>> root_dir = "root_dir"  # Path relative to current dir (os.getcwd())
>>>
>>> os.listdir(root_dir)  # List all the items in root_dir
['dir0', 'dir1', 'dir2', 'dir3', 'file0', 'file1']
>>>
>>> [item for item in os.listdir(root_dir) if os.path.isfile(os.path.join(root_dir, item))]  # Filter items and only keep files (strip out directories)
['file0', 'file1']

A more elaborate example (code_os_listdir.py):

import os
from pprint import pformat


def _get_dir_content(path, include_folders, recursive):
    entries = os.listdir(path)
    for entry in entries:
        entry_with_path = os.path.join(path, entry)
        if os.path.isdir(entry_with_path):
            if include_folders:
                yield entry_with_path
            if recursive:
                for sub_entry in _get_dir_content(entry_with_path, include_folders, recursive):
                    yield sub_entry
        else:
            yield entry_with_path


def get_dir_content(path, include_folders=True, recursive=True, prepend_folder_name=True):
    path_len = len(path) + len(os.path.sep)
    for item in _get_dir_content(path, include_folders, recursive):
        yield item if prepend_folder_name else item[path_len:]


def _get_dir_content_old(path, include_folders, recursive):
    entries = os.listdir(path)
    ret = list()
    for entry in entries:
        entry_with_path = os.path.join(path, entry)
        if os.path.isdir(entry_with_path):
            if include_folders:
                ret.append(entry_with_path)
            if recursive:
                ret.extend(_get_dir_content_old(entry_with_path, include_folders, recursive))
        else:
            ret.append(entry_with_path)
    return ret


def get_dir_content_old(path, include_folders=True, recursive=True, prepend_folder_name=True):
    path_len = len(path) + len(os.path.sep)
    return [item if prepend_folder_name else item[path_len:] for item in _get_dir_content_old(path, include_folders, recursive)]


def main():
    root_dir = "root_dir"
    ret0 = get_dir_content(root_dir, include_folders=True, recursive=True, prepend_folder_name=True)
    lret0 = list(ret0)
    print(ret0, len(lret0), pformat(lret0))
    ret1 = get_dir_content_old(root_dir, include_folders=False, recursive=True, prepend_folder_name=False)
    print(len(ret1), pformat(ret1))


if __name__ == "__main__":
    main()

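Notes:

There are two implementations:
- One that uses generators (of course here it seems useless, since I immediately convert the result to a list)
- The classic one (function names ending in _old)
Recursion is used (to get into subdirectories)
For each implementation there are two functions:
- One that starts with an underscore (_): "private" (should not be called directly), which does all the work
- The public one (a wrapper over the previous): it just strips off the initial path (if required) from the returned entries. It's an ugly implementation, but it's the only idea that I could come up with at this point
In terms of performance, generators are generally a little bit faster (considering both creation and iteration times), but I didn't test them in recursive functions, and I am also iterating inside the function over inner generators; I don't know how performance-friendly that is
Play with the arguments to get different results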
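On the note about looping over inner generators: since Python 3.3 the inner `for ... yield` loop can be replaced by `yield from`. The following is a sketch of that idea, not the code from code_os_listdir.py; `walk_entries` is a hypothetical name and the temporary tree only mimics a small part of root_dir:

```python
import os
import tempfile

def walk_entries(path, include_folders=True, recursive=True):
    """Yield paths under `path`, delegating to sub-generators with `yield from`."""
    for entry in sorted(os.listdir(path)):
        entry_with_path = os.path.join(path, entry)
        if os.path.isdir(entry_with_path):
            if include_folders:
                yield entry_with_path
            if recursive:
                yield from walk_entries(entry_with_path, include_folders, recursive)
        else:
            yield entry_with_path

with tempfile.TemporaryDirectory() as root_dir:
    os.makedirs(os.path.join(root_dir, "dir0", "dir00"))
    for rel in ("file0", os.path.join("dir0", "dir00", "file000")):
        open(os.path.join(root_dir, rel), "w").close()

    everything = [os.path.relpath(p, root_dir) for p in walk_entries(root_dir)]
    files_only = [os.path.relpath(p, root_dir)
                  for p in walk_entries(root_dir, include_folders=False)]

print(everything)
print(files_only)
```

Note that `yield from` is Python 3 only, so this variant gives up the Python 2 compatibility the original code aims for.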

Output:

(py35x64_test) E:\Work\Dev\StackOverflow\q003207219>"e:\Work\Dev\VEnvs\py35x64_test\Scripts\python.exe" "code_os_listdir.py"
<generator object get_dir_content at 0x000001BDDBB3DF10> 22 ['root_dir\\dir0',
 'root_dir\\dir0\\dir00',
 'root_dir\\dir0\\dir00\\dir000',
 'root_dir\\dir0\\dir00\\dir000\\file0000',
 'root_dir\\dir0\\dir00\\file000',
 'root_dir\\dir0\\dir01',
 'root_dir\\dir0\\dir01\\file010',
 'root_dir\\dir0\\dir01\\file011',
 'root_dir\\dir0\\dir02',
 'root_dir\\dir0\\dir02\\dir020',
 'root_dir\\dir0\\dir02\\dir020\\dir0200',
 'root_dir\\dir1',
 'root_dir\\dir1\\file10',
 'root_dir\\dir1\\file11',
 'root_dir\\dir1\\file12',
 'root_dir\\dir2',
 'root_dir\\dir2\\dir20',
 'root_dir\\dir2\\dir20\\file200',
 'root_dir\\dir2\\file20',
 'root_dir\\dir3',
 'root_dir\\file0',
 'root_dir\\file1']
11 ['dir0\\dir00\\dir000\\file0000',
 'dir0\\dir00\\file000',
 'dir0\\dir01\\file010',
 'dir0\\dir01\\file011',
 'dir1\\file10',
 'dir1\\file11',
 'dir1\\file12',
 'dir2\\dir20\\file200',
 'dir2\\file20',
 'file0',
 'file1']

[Python 3]: os.scandir(path='.') (Python 3.5+, backport: [PyPI]: scandir)

Return an iterator of os.DirEntry objects corresponding to the entries in the directory given by path. The entries are yielded in arbitrary order, and the special entries '.' and '..' are not included.

Using scandir() instead of listdir() can significantly increase the performance of code that also needs file type or file attribute information, because os.DirEntry objects expose this information if the operating system provides it when scanning a directory. All os.DirEntry methods may perform a system call, but is_dir() and is_file() usually only require a system call for symbolic links; os.DirEntry.stat() always requires a system call on Unix but only requires one for symbolic links on Windows.
```
>>> import os
>>> root_dir = os.path.join(".", "root_dir")  # Explicitly prepending current directory
>>> root_dir
'.\\root_dir'
>>>
>>> scandir_iterator = os.scandir(root_dir)
>>> scandir_iterator
<nt.ScandirIterator object at 0x00000268CF4BC140>
>>> [item.path for item in scandir_iterator]
['.\\root_dir\\dir0', '.\\root_dir\\dir1', '.\\root_dir\\dir2', '.\\root_dir\\dir3', '.\\root_dir\\file0', '.\\root_dir\\file1']
>>>
>>> [item.path for item in scandir_iterator]  # Will yield an empty list as it was consumed by previous iteration (automatically performed by the list comprehension)
[]
>>>
>>> scandir_iterator = os.scandir(root_dir)  # Reinitialize the generator
>>> for item in scandir_iterator :
...     if os.path.isfile(item.path):
...             print(item.name)
...
file0
file1
```
Notes:
- It's similar to os.listdir
- But it's also more flexible (and offers more functionality), more Pythonic (and in some cases, faster)
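As a sketch of that extra flexibility, the `os.DirEntry` objects can drive a recursive walk directly, using `entry.is_dir()` and `entry.is_file()` instead of `os.path` calls. `scan_tree` is a made-up helper name, and the temporary tree is an assumption for the demo:

```python
import os
import tempfile

def scan_tree(path):
    """Recursively yield DirEntry objects for the files below `path`."""
    with os.scandir(path) as entries:
        for entry in entries:
            if entry.is_dir(follow_symlinks=False):
                yield from scan_tree(entry.path)
            elif entry.is_file():
                yield entry

with tempfile.TemporaryDirectory() as root_dir:
    os.makedirs(os.path.join(root_dir, "dir1"))
    for rel in ("file0", os.path.join("dir1", "file10")):
        open(os.path.join(root_dir, rel), "w").close()
    names = sorted(entry.name for entry in scan_tree(root_dir))

print(names)  # ['file0', 'file10']
```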

[Python 3]: os.walk(top, topdown=True, onerror=None, followlinks=False)

Generate the file names in a directory tree by walking the tree either top-down or bottom-up. For each directory in the tree rooted at directory top (including top itself), it yields a 3-tuple (dirpath, dirnames, filenames).

>>> import os
>>> root_dir = os.path.join(os.getcwd(), "root_dir")  # Specify the full path
>>> root_dir
'E:\\Work\\Dev\\StackOverflow\\q003207219\\root_dir'
>>>
>>> walk_generator = os.walk(root_dir)
>>> root_dir_entry = next(walk_generator)  # First entry corresponds to the root dir (passed as an argument)
>>> root_dir_entry
('E:\\Work\\Dev\\StackOverflow\\q003207219\\root_dir', ['dir0', 'dir1', 'dir2', 'dir3'], ['file0', 'file1'])
>>>
>>> root_dir_entry[1] + root_dir_entry[2]  # Display dirs and files (direct descendants) in a single list
['dir0', 'dir1', 'dir2', 'dir3', 'file0', 'file1']
>>>
>>> [os.path.join(root_dir_entry[0], item) for item in root_dir_entry[1] + root_dir_entry[2]]  # Display all the entries in the previous list by their full path
['E:\\Work\\Dev\\StackOverflow\\q003207219\\root_dir\\dir0', 'E:\\Work\\Dev\\StackOverflow\\q003207219\\root_dir\\dir1', 'E:\\Work\\Dev\\StackOverflow\\q003207219\\root_dir\\dir2', 'E:\\Work\\Dev\\StackOverflow\\q003207219\\root_dir\\dir3', 'E:\\Work\\Dev\\StackOverflow\\q003207219\\root_dir\\file0', 'E:\\Work\\Dev\\StackOverflow\\q003207219\\root_dir\\file1']
>>>
>>> for entry in walk_generator:  # Display the rest of the elements (corresponding to every subdir)
...     print(entry)
...
('E:\\Work\\Dev\\StackOverflow\\q003207219\\root_dir\\dir0', ['dir00', 'dir01', 'dir02'], [])
('E:\\Work\\Dev\\StackOverflow\\q003207219\\root_dir\\dir0\\dir00', ['dir000'], ['file000'])
('E:\\Work\\Dev\\StackOverflow\\q003207219\\root_dir\\dir0\\dir00\\dir000', [], ['file0000'])
('E:\\Work\\Dev\\StackOverflow\\q003207219\\root_dir\\dir0\\dir01', [], ['file010', 'file011'])
('E:\\Work\\Dev\\StackOverflow\\q003207219\\root_dir\\dir0\\dir02', ['dir020'], [])
('E:\\Work\\Dev\\StackOverflow\\q003207219\\root_dir\\dir0\\dir02\\dir020', ['dir0200'], [])
('E:\\Work\\Dev\\StackOverflow\\q003207219\\root_dir\\dir0\\dir02\\dir020\\dir0200', [], [])
('E:\\Work\\Dev\\StackOverflow\\q003207219\\root_dir\\dir1', [], ['file10', 'file11', 'file12'])
('E:\\Work\\Dev\\StackOverflow\\q003207219\\root_dir\\dir2', ['dir20'], ['file20'])
('E:\\Work\\Dev\\StackOverflow\\q003207219\\root_dir\\dir2\\dir20', [], ['file200'])
('E:\\Work\\Dev\\StackOverflow\\q003207219\\root_dir\\dir3', [], [])


Notes:

Behind the scenes, it uses os.scandir (os.listdir on older versions)
It does the heavy lifting by recursing into subfolders
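One documented consequence of walking top-down (the default): `dirnames` may be modified in place to prune the walk. A small sketch, assuming we want to skip a subdirectory named `dir0` (the tree below is a cut-down stand-in for root_dir):

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as root_dir:
    os.makedirs(os.path.join(root_dir, "dir0", "dir00"))
    os.makedirs(os.path.join(root_dir, "dir1"))
    for rel in ("file0",
                os.path.join("dir0", "dir00", "file000"),
                os.path.join("dir1", "file10")):
        open(os.path.join(root_dir, rel), "w").close()

    kept = []
    for dirpath, dirnames, filenames in os.walk(root_dir, topdown=True):
        # Slice assignment edits the list os.walk itself uses, so dir0 is never entered.
        dirnames[:] = [d for d in dirnames if d != "dir0"]
        kept.extend(filenames)
    kept.sort()

print(kept)  # ['file0', 'file10']
```

Rebinding the name (`dirnames = [...]`) would not work here; the list object has to be changed in place.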

[Python 3]: glob.glob(pathname,*, recursive=False) ([Python 3]: glob.iglob(pathname,*, recursive=False))

Return a possibly-empty list of path names that match pathname, which must be a string containing a path specification. pathname can be either absolute (like /usr/src/Python-1.5/Makefile) or relative (like ../../Tools/*/*.gif), and can contain shell-style wildcards. Broken symlinks are included in the results (as in the shell).
...
Changed in version 3.5: Support for recursive globs using "**".

>>> import glob, os
>>> wildcard_pattern = "*"
>>> root_dir = os.path.join("root_dir", wildcard_pattern)  # Match every file/dir name
>>> root_dir
'root_dir\\*'
>>>
>>> glob_list = glob.glob(root_dir)
>>> glob_list
['root_dir\\dir0', 'root_dir\\dir1', 'root_dir\\dir2', 'root_dir\\dir3', 'root_dir\\file0', 'root_dir\\file1']
>>>
>>> [item.replace("root_dir" + os.path.sep, "") for item in glob_list]  # Strip the dir name and the path separator from begining
['dir0', 'dir1', 'dir2', 'dir3', 'file0', 'file1']
>>>
>>> for entry in glob.iglob(root_dir + "*", recursive=True):
...     print(entry)
...
root_dir\
root_dir\dir0
root_dir\dir0\dir00
root_dir\dir0\dir00\dir000
root_dir\dir0\dir00\dir000\file0000
root_dir\dir0\dir00\file000
root_dir\dir0\dir01
root_dir\dir0\dir01\file010
root_dir\dir0\dir01\file011
root_dir\dir0\dir02
root_dir\dir0\dir02\dir020
root_dir\dir0\dir02\dir020\dir0200
root_dir\dir1
root_dir\dir1\file10
root_dir\dir1\file11
root_dir\dir1\file12
root_dir\dir2
root_dir\dir2\dir20
root_dir\dir2\dir20\file200
root_dir\dir2\file20
root_dir\dir3
root_dir\file0
root_dir\file1


Notes:

Uses os.listdir
For large trees (especially if recursive is on), iglob is preferred
Allows advanced filtering based on name (due to the wildcard)
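For reference, a recursive glob is usually spelled with `**` as its own path component together with `recursive=True`. The sketch below uses a temporary tree (an assumption, so that it runs anywhere) and keeps files only:

```python
import glob
import os
import tempfile

with tempfile.TemporaryDirectory() as root_dir:
    os.makedirs(os.path.join(root_dir, "dir0", "dir00"))
    for rel in ("file0", os.path.join("dir0", "dir00", "file000")):
        open(os.path.join(root_dir, rel), "w").close()

    pattern = os.path.join(root_dir, "**", "*")
    # iglob is lazy; "**" matches zero or more directory levels when recursive=True.
    matches = [p for p in glob.iglob(pattern, recursive=True) if os.path.isfile(p)]
    relative = sorted(os.path.relpath(p, root_dir) for p in matches)

print(relative)
```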

[Python 3]: class pathlib.Path(*pathsegments) (Python 3.4+, backport: [PyPI]: pathlib2)

>>> import os, pathlib
>>> root_dir = "root_dir"
>>> root_dir_instance = pathlib.Path(root_dir)
>>> root_dir_instance
WindowsPath('root_dir')
>>> root_dir_instance.name
'root_dir'
>>> root_dir_instance.is_dir()
True
>>>
>>> [item.name for item in root_dir_instance.glob("*")]  # Wildcard searching for all direct descendants
['dir0', 'dir1', 'dir2', 'dir3', 'file0', 'file1']
>>>
>>> [os.path.join(item.parent.name, item.name) for item in root_dir_instance.glob("*") if not item.is_dir()]  # Display paths (including parent) for files only
['root_dir\\file0', 'root_dir\\file1']


Notes:

This is one way of achieving our goal
It's the OOP style of handling paths
Offers a lot of functionality
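A recursive variant in the same OOP style, as a sketch: `Path.rglob("*")` walks the whole tree, and `is_file()` filters out the directories. The temporary tree is an assumption made so the snippet runs anywhere:

```python
import pathlib
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    root_dir_instance = pathlib.Path(tmp)
    (root_dir_instance / "dir0" / "dir00").mkdir(parents=True)
    (root_dir_instance / "file0").touch()
    (root_dir_instance / "dir0" / "dir00" / "file000").touch()

    files = sorted(p.relative_to(root_dir_instance).as_posix()
                   for p in root_dir_instance.rglob("*") if p.is_file())

print(files)  # ['dir0/dir00/file000', 'file0']
```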

[Python 2]: dircache.listdir(path) (Python 2 only)

But, according to [GitHub]: python/cpython - (2.7) cpython/Lib/dircache.py, it's just a (thin) wrapper over os.listdir with caching

def listdir(path):
    """List directory contents, using cache."""
    try:
        cached_mtime, list = cache[path]
        del cache[path]
    except KeyError:
        cached_mtime, list = -1, []
    mtime = os.stat(path).st_mtime
    if mtime != cached_mtime:
        list = os.listdir(path)
        list.sort()
    cache[path] = mtime, list
    return list


[man7]: OPENDIR(3)/[man7]: READDIR(3)/[man7]: CLOSEDIR(3) via [Python 3]: ctypes - A foreign function library for Python (POSIX specific)

ctypes is a foreign function library for Python. It provides C compatible data types, and allows calling functions in DLLs or shared libraries. It can be used to wrap these libraries in pure Python.

code_ctypes.py:

#!/usr/bin/env python3

import sys
from ctypes import Structure, \
    c_ulonglong, c_longlong, c_ushort, c_ubyte, c_char, c_char_p, c_int, c_void_p, \
    CDLL, POINTER, \
    get_errno, set_errno, cast


DT_DIR = 4
DT_REG = 8

char256 = c_char * 256


class LinuxDirent64(Structure):
    _fields_ = [
        ("d_ino", c_ulonglong),
        ("d_off", c_longlong),
        ("d_reclen", c_ushort),
        ("d_type", c_ubyte),
        ("d_name", char256),
    ]

LinuxDirent64Ptr = POINTER(LinuxDirent64)

libc_dll = this_process = CDLL(None, use_errno=True)
# ALWAYS set argtypes and restype for functions, otherwise it's UB!!!
# (the default int restype would truncate 64-bit pointers)
opendir = libc_dll.opendir
opendir.argtypes = (c_char_p,)
opendir.restype = c_void_p
readdir = libc_dll.readdir
readdir.argtypes = (c_void_p,)
readdir.restype = c_void_p
closedir = libc_dll.closedir
closedir.argtypes = (c_void_p,)
closedir.restype = c_int


def get_dir_content(path):
    ret = [path, list(), list()]
    dir_stream = opendir(path.encode())
    if not dir_stream:
        print("opendir returned NULL (errno: {:d})".format(get_errno()))
        return ret
    set_errno(0)
    dirent_addr = readdir(dir_stream)
    while dirent_addr:
        dirent_ptr = cast(dirent_addr, LinuxDirent64Ptr)
        dirent = dirent_ptr.contents
        name = dirent.d_name.decode()
        if dirent.d_type == DT_DIR:  # d_type is an enumeration, not a bit mask
            if name not in (".", ".."):
                ret[1].append(name)
        elif dirent.d_type == DT_REG:
            ret[2].append(name)
        dirent_addr = readdir(dir_stream)
    if get_errno():
        print("readdir returned NULL (errno: {:d})".format(get_errno()))
    closedir(dir_stream)
    return ret


def main():
    print("{:s} on {:s}\n".format(sys.version, sys.platform))
    root_dir = "root_dir"
    entries = get_dir_content(root_dir)
    print(entries)


if __name__ == "__main__":
    main()


Notes:

It loads the three functions from libc (loaded in the current process) and calls them (for more details check [SO]: How do I check whether a file exists without exceptions? (@CristiFati's answer) - last notes from item #4.). That would place this approach very close to the Python/C edge
LinuxDirent64 is the ctypes representation of struct dirent64 from [man7]: dirent.h(0P) (so are the DT_ constants) from my machine: Ubuntu 16 x64 (4.10.0-40-generic and libc6-dev:amd64). On other flavors/versions, the struct definition might differ, and if so, the ctypes definition should be updated, otherwise it will yield Undefined Behavior
It returns data in the os.walk's format. I didn't bother to make it recursive, but starting from the existing code, that would be a fairly trivial task
Everything is doable on Win as well; only the data (libraries, functions, structs, constants, ...) differ
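The recursion mentioned in the notes could be sketched as follows. To keep the example portable and runnable, get_dir_content here is a stand-in built on os.scandir (Python 3.6+ for the with statement) that returns the same [path, dirs, files] shape as the ctypes version; walk_dir_content is a hypothetical helper name.

```python
import os


def get_dir_content(path):
    """Stand-in for the ctypes version: returns [path, dir_names, file_names]."""
    ret = [path, [], []]
    with os.scandir(path) as entries:
        for entry in entries:
            if entry.is_dir(follow_symlinks=False):
                ret[1].append(entry.name)
            elif entry.is_file(follow_symlinks=False):
                ret[2].append(entry.name)
    return ret


def walk_dir_content(path):
    """Yield [path, dir_names, file_names] for path and every directory below it."""
    content = get_dir_content(path)
    yield content
    for dir_name in content[1]:
        # Recurse into each subdirectory, one level at a time
        yield from walk_dir_content(os.path.join(path, dir_name))
```

Swapping the stand-in for the ctypes get_dir_content should leave walk_dir_content unchanged, since both return the same structure.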

Output:

[cfati@cfati-ubtu16x64-0:~/Work/Dev/StackOverflow/q003207219]> ./code_ctypes.py
3.5.2 (default, Nov 12 2018, 13:43:14)
[GCC 5.4.0 20160609] on linux

['root_dir', ['dir2', 'dir1', 'dir3', 'dir0'], ['file1', 'file0']]


[ActiveState]: win32file.FindFilesW (Win specific)

Retrieves a list of matching filenames, using the Windows Unicode API. An interface to the API FindFirstFileW/FindNextFileW/FindClose functions.

>>> import os, win32file, win32con
>>> root_dir = "root_dir"
>>> wildcard = "*"
>>> root_dir_wildcard = os.path.join(root_dir, wildcard)
>>> entry_list = win32file.FindFilesW(root_dir_wildcard)
>>> len(entry_list)  # Don't display the whole content as it's too long
8
>>> [entry[-2] for entry in entry_list]  # Only display the entry names
['.', '..', 'dir0', 'dir1', 'dir2', 'dir3', 'file0', 'file1']
>>>
>>> [entry[-2] for entry in entry_list if entry[0] & win32con.FILE_ATTRIBUTE_DIRECTORY and entry[-2] not in (".", "..")]  # Filter entries and only display dir names (except self and parent)
['dir0', 'dir1', 'dir2', 'dir3']
>>>
>>> [os.path.join(root_dir, entry[-2]) for entry in entry_list if entry[0] & (win32con.FILE_ATTRIBUTE_NORMAL | win32con.FILE_ATTRIBUTE_ARCHIVE)]  # Only display file "full" names
['root_dir\\file0', 'root_dir\\file1']


Notes:

win32file.FindFilesW is part of [GitHub]: mhammond/pywin32 - Python for Windows (pywin32) Extensions, which is a Python wrapper over WINAPIs
The documentation link is from ActiveState, as I didn't find any pywin32 official documentation

Install some (other) third-party package that does the trick
- Most likely, will rely on one (or more) of the above (maybe with slight customizations)

Notes:

Code is meant to be portable (except places that target a specific area - which are marked) or cross:
- platform (Nix, Win)
- Python version (2, 3)
Multiple path styles (absolute, relatives) were used across the above variants, to illustrate the fact that the "tools" used are flexible in this direction
os.listdir and os.scandir use opendir/readdir/closedir on Nix, and [MS.Docs]: FindFirstFileW function/[MS.Docs]: FindNextFileW function/[MS.Docs]: FindClose function on Win (via [GitHub]: python/cpython - (master) cpython/Modules/posixmodule.c)
win32file.FindFilesW uses those (Win specific) functions as well (via [GitHub]: mhammond/pywin32 - (master) pywin32/win32/src/win32file.i)
_get_dir_content (from point #1.) can be implemented using any of these approaches (some will require more work and some less)
- Some advanced filtering (instead of just file vs. dir) could be done: e.g. the include_folders argument could be replaced by another one (e.g. filter_func) which would be a function that takes a path as an argument: filter_func=lambda x: True (this doesn't strip out anything) and inside _get_dir_content something like: if not filter_func(entry_with_path): continue (if the function fails for one entry, it will be skipped), but the more complex the code becomes, the longer it will take to execute
Nota bene! Since recursion is used, I must mention that I did some tests on my laptop (Win 10 x64), totally unrelated to this problem, and when the recursion level was reaching values somewhere in the (990 .. 1000) range (recursionlimit - 1000 (default)), I got StackOverflow :). If the directory tree exceeds that limit (I am not an FS expert, so I don't know if that is even possible), that could be a problem.
I must also mention that I didn't try to increase recursionlimit because I have no experience in the area (how much can I increase it before having to also increase the stack at OS level), but in theory there will always be the possibility for failure, if the dir depth is larger than the highest possible recursionlimit (on that machine)
The code samples are for demonstrative purposes only. That means that I didn't take into account error handling (I don't think there's any try/except/else/finally block), so the code is not robust (the reason is: to keep it as simple and short as possible). For production, error handling should be added as well
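The filter_func idea and the recursion-limit concern can be illustrated together. The sketch below is not the _get_dir_content from point #1; iter_dir_content is a hypothetical name, the filter is applied to non-directory entries only, and an explicit stack replaces recursion so that directory depth is not bounded by recursionlimit.

```python
import os


def iter_dir_content(path, filter_func=lambda entry_with_path: True):
    """Yield non-directory entries under path for which filter_func returns True."""
    pending = [path]  # explicit stack of directories still to visit
    while pending:
        current = pending.pop()
        for name in os.listdir(current):
            entry_with_path = os.path.join(current, name)
            # Descend into real directories; symlinked directories are not followed
            if os.path.isdir(entry_with_path) and not os.path.islink(entry_with_path):
                pending.append(entry_with_path)
                continue
            if not filter_func(entry_with_path):
                continue  # entry rejected by the filter
            yield entry_with_path
```

For instance, list(iter_dir_content("root_dir", filter_func=lambda p: p.endswith(".txt"))) would keep only the .txt entries. As with the other samples, error handling is left out.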

Other approaches:

Use Python only as a wrapper
- Everything is done using another technology
- That technology is invoked from Python
- The most famous flavor that I know is what I call the system administrator approach:
  - Use Python (or any programming language for that matter) in order to execute shell commands (and parse their outputs)
  - Some consider this a neat hack
  - I consider it more like a lame workaround (gainarie), as the action per se is performed from shell (cmd in this case), and thus doesn't have anything to do with Python.
  - Filtering (grep/findstr) or output formatting could be done on both sides, but I'm not going to insist on it. Also, I deliberately used os.system instead of subprocess.Popen.
```
(py35x64_test) E:\Work\Dev\StackOverflow\q003207219>"e:\Work\Dev\VEnvs\py35x64_test\Scripts\python.exe" -c "import os;os.system(\"dir /b root_dir\")"
dir0
dir1
dir2
dir3
file0
file1
```
In general this approach is to be avoided, since if some command output format slightly differs between OS versions/flavors, the parsing code should be adapted as well (not to mention differences between locales).
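For completeness, the same idea with the output captured and parsed from Python could be sketched like this. It assumes a POSIX system with an ls command on PATH (the run above used dir /b on Win), and list_via_shell is a hypothetical name. It also shows the fragility: a file name containing a newline would be split into two entries.

```python
import subprocess


def list_via_shell(path):
    """Return the entry names printed by `ls -1A -- path`, one per output line."""
    completed = subprocess.run(
        # -1: one entry per line, -A: include dotfiles except . and ..
        ["ls", "-1A", "--", path],
        stdout=subprocess.PIPE,
        universal_newlines=True,  # decode the output to str
        check=True,               # raise CalledProcessError on a non-zero exit code
    )
    return completed.stdout.splitlines()
```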

Answer 9

Art*_*are 48

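I really like adamk's answer, which suggests using glob() from the module of the same name. This lets you pattern match with *s.

But as others pointed out in the comments, glob() can get tripped up by inconsistent slash directions. To help with that, I suggest using the join() and expanduser() functions from the os.path module, and perhaps the getcwd() function from the os module as well.

For example: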

from glob import glob

# Return everything under C:\Users\admin that contains a folder called wlp.
glob(r'C:\Users\admin\*\wlp')  # raw string: '\U' is an invalid escape in Python 3


The above is terrible: the path is hardcoded, and between the drive name and the \s hardcoded into the path it will only ever work on Windows.

from glob    import glob
from os.path import join

# Return everything under Users, admin, that contains a folder called wlp.
glob(join('Users', 'admin', '*', 'wlp'))


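The above works better, but it relies on the folder name Users, which is common on Windows and not so common on other OSs. It also relies on the user having a specific name, admin.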

from glob    import glob
from os.path import expanduser, join

# Return everything under the user directory that contains a folder called wlp.
glob(join(expanduser('~'), '*', 'wlp'))


This works across all platforms.

Another example that also works across platforms and does something a bit different:

from glob    import glob
from os      import getcwd
from os.path import join

# Return everything under the current directory that contains a folder called wlp.
glob(join(getcwd(), '*', 'wlp'))


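Hopefully these examples help you see the power of a few of the functions you can find in the standard Python library modules.

Extra glob fun: starting with Python 3.5, `**` works as long as you set `recursive=True`. See the docs here: https://docs.python.org/3.5/library/glob.html#glob.glob (4 upvotes)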
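A small sketch of that recursive form, combined with join() as in the answer; find_named_dirs and the wlp default are purely illustrative.

```python
from glob import glob
from os.path import join


def find_named_dirs(root, name="wlp"):
    """Return every path called `name` at any depth below root (Python 3.5+)."""
    # '**' matches zero or more directory levels, but only when recursive=True
    return glob(join(root, "**", name), recursive=True)
```

Note that the glob module skips directories whose names start with a dot when expanding `**`.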

Answer 10

Apo*_*tus 35

import os

def list_files(path):
    # returns a list of names (with extension, without full path) of all files
    # in folder path
    files = []
    for name in os.listdir(path):
        if os.path.isfile(os.path.join(path, name)):
            files.append(name)
    return files


Answer 11

Yau*_*ich 23

If you are looking for a Python implementation of find, this is a recipe I use rather frequently:

from findtools.find_files import (find_files, Match)

# Recursively find all *.sh files in **/usr/bin**
sh_files_pattern = Match(filetype='f', name='*.sh')
found_files = find_files(path='/usr/bin', match=sh_files_pattern)

for found_file in found_files:
    print(found_file)


So I made a PyPI package out of it, and there is also a GitHub repository. I hope someone finds this code useful.
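If installing a package is not an option, roughly the same search can be sketched with the standard library alone. This is not how findtools is implemented; find_files_stdlib is a hypothetical name.

```python
import fnmatch
import os


def find_files_stdlib(path, name_pattern="*"):
    """Recursively yield files under path whose name matches name_pattern."""
    for dirpath, _dirnames, filenames in os.walk(path):
        for filename in fnmatch.filter(filenames, name_pattern):
            full_path = os.path.join(dirpath, filename)
            # Keep regular files (or symlinks to them); skip broken links and special files
            if os.path.isfile(full_path):
                yield full_path
```

Usage would mirror the recipe above: for found_file in find_files_stdlib('/usr/bin', '*.sh'): print(found_file).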

Answer 12

The*_*Son 12

Returns a list of absolute file paths, without recursing into subdirectories

import os
L = [os.path.join(os.getcwd(),f) for f in os.listdir('.') if os.path.isfile(os.path.join(os.getcwd(),f))]


Note: `os.path.abspath(f)` would be a somewhat cheaper substitute for `os.path.join(os.getcwd(),f)`. (2 upvotes)
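Following that comment, the one-liner could be sketched with os.path.abspath. This only works because the directory being listed is the current one ('.'): abspath resolves relative names against os.getcwd().

```python
import os

# Absolute paths of the files in the current working directory (no recursion)
L = [os.path.abspath(f) for f in os.listdir('.') if os.path.isfile(f)]
```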

Answer 13

ARG*_*Geo 12

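For better results, you can use the listdir() method of the os module together with a generator (a generator is a powerful iterator that keeps its state, remember?). The following code works with both versions: Python 2 and Python 3.

Here's the code: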

import os

def files(path):  
    for file in os.listdir(path):
        if os.path.isfile(os.path.join(path, file)):
            yield file

for file in files("."):  
    print (file)


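The listdir() method returns the list of entries for the given directory. The os.path.isfile() method returns True if the given entry is a file. The yield statement suspends the function while keeping its current state, and hands back only the names of entries detected as files. All of this lets us loop over the generator function.

Hope this helps.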
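Because the generator is lazy, a caller can stop early without running the isfile() check on every entry. A small sketch, repeating files() from the answer so the block is self-contained; first_files is a hypothetical helper.

```python
import os
from itertools import islice


def files(path):
    for file in os.listdir(path):
        if os.path.isfile(os.path.join(path, file)):
            yield file


def first_files(path, count):
    """Return at most count file names from path, stopping the generator early."""
    return list(islice(files(path), count))
```

Note that os.listdir() itself still reads the whole directory up front; only the per-entry checks are deferred.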

Answer 14

pah*_*h8J 10

import os
import os.path


def get_files(target_dir):
    item_list = os.listdir(target_dir)

    file_list = list()
    for item in item_list:
        item_dir = os.path.join(target_dir,item)
        if os.path.isdir(item_dir):
            file_list += get_files(item_dir)
        else:
            file_list.append(item_dir)
    return file_list


Here I use a recursive structure.

Answer 15

fra*_*lau 8

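A wise teacher once told me:

When there are several established ways to do something, none of them is good for all cases.

I will therefore add a solution for a subset of the problem: quite often, we only want to check whether a filename matches a start string and an end string, without going into subdirectories. So we would like a function that returns a list of filenames, like: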

filenames = dir_filter('foo/baz', radical='radical', extension='.txt')


If you care to first declare two functions, this can be done:

import os

def file_filter(filename, radical='', extension=''):
    "Check if a filename matches a radical and extension"
    if not filename:
        return False
    filename = filename.strip()
    return(filename.startswith(radical) and filename.endswith(extension))

def dir_filter(dirname='', radical='', extension=''):
    "Filter filenames in directory according to radical and extension"
    if not dirname:
        dirname = '.'
    return [filename for filename in os.listdir(dirname)
                if file_filter(filename, radical, extension)]


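This solution could easily be generalized with regular expressions (and you might want to add a pattern argument if you do not want your patterns to always stick to the start or end of the filename).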
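One way that regular-expression generalization might look is sketched below. dir_filter_re and its pattern parameter are hypothetical names, and re.search is used so the pattern is not tied to the start or end of the filename unless it is anchored with ^ or $.

```python
import os
import re


def dir_filter_re(dirname='', pattern=''):
    "Filter filenames in directory according to a regular expression"
    if not dirname:
        dirname = '.'
    regex = re.compile(pattern)
    return [filename for filename in os.listdir(dirname)
                if regex.search(filename)]
```

The earlier call would then become something like dir_filter_re('foo/baz', r'^radical.*\.txt$'). As in the original, directory names are not filtered out.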

Answer 16

sha*_*noo 6

Using generators

import os

def get_files(search_path):
    for (dirpath, _, filenames) in os.walk(search_path):
        for filename in filenames:
            yield os.path.join(dirpath, filename)

list_files = get_files('.')
for filename in list_files:
    print(filename)


Answer 17

fhc*_*chl 5

Another very readable variant for Python 3.4+ is using pathlib.Path.glob:

from pathlib import Path
folder = '/foo'
[f for f in Path(folder).glob('*') if f.is_file()]


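It is simple to make this more specific, e.g. to look only for Python source files that are not symbolic links, in all subdirectories as well: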

[f for f in Path(folder).glob('**/*.py') if not f.is_symlink()]

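Wrapped in functions, the two expressions from this answer might look like this. files_in and find_python_sources are hypothetical names, and rglob('*.py') is used as the equivalent of glob('**/*.py').

```python
from pathlib import Path


def files_in(folder):
    """Return the files directly inside folder as Path objects (no recursion)."""
    return [f for f in Path(folder).glob('*') if f.is_file()]


def find_python_sources(folder):
    """Return *.py entries at any depth below folder that are not symbolic links."""
    return [f for f in Path(folder).rglob('*.py') if not f.is_symlink()]
```

Both return Path objects; str(f) gives the plain path string where one is needed.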

Archived: 15 years, 11 months ago
Views: 3,745,069
Last recorded: 6 years, 7 months ago
