How do I recursively generate directory sizes in Python, like du . does?

rap*_*ura 0 python filesystems operating-system

Let's say my structure looks like this:

/-- am here
/one/some/dir
/two
/three/has/many/leaves
/hello/world

And say /one/some/dir contains one large 500mb file, and /three/has/many/leaves contains a 400mb file in each folder.

I want to generate the size of each directory, to get this output:

/ - in total for all
/one/some/dir 500mb
/two 0 
/three/has/many/leaves - 400mb
/three/has/many 800
/three/has/ 800+someotherbigfilehere

How can I do this?

mgi*_*son 8

Take a look at os.walk. Specifically, the documentation has an example that finds the size of a directory:

import os
from os.path import join, getsize

for root, dirs, files in os.walk('python/Lib/email'):
    # Sum the sizes of the non-directory files directly inside this directory
    print(root, "consumes", end=" ")
    print(sum(getsize(join(root, name)) for name in files), end=" ")
    print("bytes in", len(files), "non-directory files")
    if 'CVS' in dirs:
        dirs.remove('CVS')  # don't visit CVS directories

This should be easy enough to modify for your purposes.


Here is an untested version, in response to your comment:

import os
from os.path import join, getsize

dirs_dict = {}

# Walk the tree from the bottom up so that each directory can look up
# the sizes of its subdirectories, which have already been computed.
for root, dirs, files in os.walk('python/Lib/email', topdown=False):

    # Sum the sizes of every non-directory file in this directory
    size = sum(getsize(join(root, name)) for name in files)

    # Add up the sizes of the subdirectories recorded in `dirs_dict`.
    # Symlinked directories are listed in `dirs` but never walked, so default to 0.
    subdir_size = sum(dirs_dict.get(join(root, d), 0) for d in dirs)

    # Store the size of this directory (plus subdirectories) in the dict
    # so that its parent can access it later
    my_size = dirs_dict[root] = size + subdir_size

    print('%s: %d' % (root, my_size))
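
For what it's worth, the same bottom-up idea can be wrapped in a function that returns the totals instead of printing them. This is only a sketch under a few assumptions: the names `directory_sizes`, `human_readable` and `print_report` are made up for illustration, symlinks are counted as links rather than followed, and files that cannot be read are silently skipped. It adds up apparent file sizes, so the numbers may not match du exactly, since du counts allocated disk blocks.

```python
import os


def directory_sizes(top):
    """Return {directory_path: total_bytes} for `top` and every directory below it."""
    totals = {}
    # Bottom-up, so each child's total exists before its parent is processed.
    for root, dirs, files in os.walk(top, topdown=False):
        total = 0
        for name in files:
            path = os.path.join(root, name)
            try:
                # lstat does not follow symlinks, so a link is counted as the
                # link itself and a dangling link does not raise.
                total += os.lstat(path).st_size
            except OSError:
                pass  # file vanished or is unreadable; skip it
        for name in dirs:
            # Symlinked or unreadable directories are never walked, so default to 0.
            total += totals.get(os.path.join(root, name), 0)
        totals[root] = total
    return totals


def human_readable(num_bytes):
    """Format a byte count using binary units (B, KiB, MiB, GiB, TiB)."""
    size = float(num_bytes)
    for unit in ("B", "KiB", "MiB", "GiB"):
        if size < 1024:
            return f"{size:.1f} {unit}"
        size /= 1024
    return f"{size:.1f} TiB"


def print_report(top):
    """Print one line per directory under `top`, sorted by path."""
    for path, size in sorted(directory_sizes(top).items()):
        print(f"{human_readable(size):>12}  {path}")
```

Calling `print_report('.')` prints one line per directory, roughly in the shape the question asks for. Returning a dict rather than printing inside the loop makes it easy to sort by size or filter afterwards.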