Quickly get all files and directories in a specific path

Ton*_*Nam 67 c# performance fileinfo

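I am creating a backup application in which C# scans a directory. Previously I used something like this to get all the files and subfiles in a directory: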

DirectoryInfo di = new DirectoryInfo("A:\\");
var directories= di.GetFiles("*", SearchOption.AllDirectories);

foreach (FileInfo d in directories)
{
       // Add files to a list so that later they can be compared to see whether
       // each file needs to be copied or not
}

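The only problem is that sometimes a file cannot be accessed and I get several errors. An example of an error that I get is: [image: error]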

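As a result I created a recursive method that scans all the files in the current directory. If that directory contains directories, the method is called again with each of them. The nice thing about this method is that I can place the file access inside a try/catch block: if there is no error I add the files to a List, and if there is an error I add the directory to another list.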

try
{
    files = di.GetFiles(searchPattern, SearchOption.TopDirectoryOnly);               
}
catch
{
     // the contents of this folder could not be read
     lstFilesErrors.Add(sDir(di));
     return;
}

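So this method works well; the only problem is that it takes a long time when I scan a large directory. How can I speed up this process? My actual method is shown here in case you need it.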

private void startScan(DirectoryInfo di)
{
    //lstFilesErrors is a list of MyFile objects
    // I created that class because I wanted to store more specific information
    // about a file such as its comparePath name and other properties that I need 
    // in order to compare it with another list

    // lstFiles is a list of MyFile objects that store all the files
    // that are contained in path that I want to scan

    FileInfo[] files = null;
    DirectoryInfo[] directories = null;
    string searchPattern = "*.*";

    try
    {
        files = di.GetFiles(searchPattern, SearchOption.TopDirectoryOnly);               
    }
    catch
    {
        // the contents of this folder could not be read
        lstFilesErrors.Add(sDir(di));
        return;
    }

    // if there are files in the directory then add those files to the list
    if (files != null)
    {
        foreach (FileInfo f in files)
        {
            lstFiles.Add(sFile(f));
        }
    }


    try
    {
        directories = di.GetDirectories(searchPattern, SearchOption.TopDirectoryOnly);
    }
    catch
    {
        lstFilesErrors.Add(sDir(di));
        return;
    }

    // if that directory has more directories then add them to the list then 
    // execute this function
    if (directories != null)
        foreach (DirectoryInfo d in directories)
        {
            FileInfo[] subFiles = null;
            DirectoryInfo[] subDir = null;

            bool isThereAnError = false;

            try
            {
                subFiles = d.GetFiles();
                subDir = d.GetDirectories();

            }
            catch
            {
                isThereAnError = true;                                                
            }

            if (isThereAnError)
                lstFilesErrors.Add(sDir(d));
            else
            {
                lstFiles.Add(sDir(d));
                startScan(d);
            }


        }

}

The problem is that if I try to handle the exception with something like this:

DirectoryInfo di = new DirectoryInfo("A:\\");
FileInfo[] directories = null;
try
{
    directories = di.GetFiles("*", SearchOption.AllDirectories);
}
catch (UnauthorizedAccessException)
{
    Console.WriteLine("There was an error with UnauthorizedAccessException");
}
catch
{
    Console.WriteLine("There was another error");
}

then, if an exception occurs, I get no files at all.

Ton*_*Nam 44

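This method is much faster. You can only tell the difference when a directory holds a lot of files. My A:\ external hard drive contains almost 1 terabit, so it makes a big difference when dealing with that many files.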

static void Main(string[] args)
{
    DirectoryInfo di = new DirectoryInfo("A:\\");
    FullDirList(di, "*");
    Console.WriteLine("Done");
    Console.Read();
}

static List<FileInfo> files = new List<FileInfo>();  // List that will hold the files and subfiles in path
static List<DirectoryInfo> folders = new List<DirectoryInfo>(); // List that holds every subdirectory that was visited
static void FullDirList(DirectoryInfo dir, string searchPattern)
{
    // Console.WriteLine("Directory {0}", dir.FullName);
    // list the files
    try
    {
        foreach (FileInfo f in dir.GetFiles(searchPattern))
        {
            //Console.WriteLine("File {0}", f.FullName);
            files.Add(f);                    
        }
    }
    catch
    {
        Console.WriteLine("Directory {0}  \n could not be accessed!!!!", dir.FullName);                
        return;  // We already got an error trying to access dir, so don't try to access it again
    }

    // process each directory
    // If I have been able to see the files in the directory I should also be able
    // to look at its directories, so I don't think I should place this in a try/catch block
    foreach (DirectoryInfo d in dir.GetDirectories())
    {
        folders.Add(d);
        FullDirList(d, searchPattern);                    
    }

}

By the way, thanks for your comment, Jim Mischel.


Dar*_*rov 18

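In .NET 4.0 there is the Directory.EnumerateFiles method, which returns an IEnumerable<string> and does not load all the files into memory. Files are returned only once you start iterating over the returned collection, and exceptions can be handled at that point.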
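
As a rough illustration of that idea, the sketch below walks a tree one directory at a time with Directory.EnumerateFiles and Directory.EnumerateDirectories, so an inaccessible folder is recorded and skipped instead of aborting the whole scan. The names LazyScan, EnumerateFilesSafe and skipped are made up for this example, and treating UnauthorizedAccessException and IOException as "skip this folder" is an assumption about what a backup tool would want. It needs C# 6 or later for the exception filters.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

public static class LazyScan
{
    // Assumption: these two exception families mean "cannot read this folder".
    private static bool IsAccessError(Exception ex)
    {
        return ex is UnauthorizedAccessException || ex is IOException;
    }

    // Lazily yields every readable file under root. Folders that cannot be
    // read are added to 'skipped' (a hypothetical collection) and passed over.
    public static IEnumerable<string> EnumerateFilesSafe(string root, ICollection<string> skipped)
    {
        var pending = new Stack<string>();
        pending.Push(root);

        while (pending.Count > 0)
        {
            string dir = pending.Pop();

            // Queue the subdirectories first; give up on this folder if that fails.
            try
            {
                foreach (string sub in Directory.EnumerateDirectories(dir))
                    pending.Push(sub);
            }
            catch (Exception ex) when (IsAccessError(ex))
            {
                skipped.Add(dir);
                continue;
            }

            IEnumerator<string> files = null;
            try
            {
                files = Directory.EnumerateFiles(dir).GetEnumerator();
            }
            catch (Exception ex) when (IsAccessError(ex))
            {
                skipped.Add(dir);
            }
            if (files == null)
                continue;

            using (files)
            {
                while (true)
                {
                    // C# does not allow 'yield return' inside a try block that has a
                    // catch clause, so only MoveNext is guarded here.
                    try
                    {
                        if (!files.MoveNext())
                            break;
                    }
                    catch (Exception ex) when (IsAccessError(ex))
                    {
                        skipped.Add(dir);
                        break;
                    }
                    yield return files.Current;
                }
            }
        }
    }
}
```

A caller would iterate it with something like `foreach (string path in LazyScan.EnumerateFilesSafe("A:\\", skipped))`, which starts producing paths before the whole drive has been walked. Passing SearchOption.AllDirectories directly to Directory.EnumerateFiles is shorter, but a single inaccessible folder then ends the enumeration, which is why this sketch recurses manually.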


csh*_*net 12

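The .NET file enumeration methods have a long history of being slow. The problem is that there is no instantaneous way to enumerate large directory structures. Even the accepted answer here has issues with GC allocations.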

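The best I have been able to do is wrapped up in my library and exposed as the FindFile class in the CSharpTest.Net.IO namespace. This class can enumerate files and folders without unnecessary GC allocations and string marshalling.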

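Usage is simple enough, and setting the RaiseOnAccessDenied property to false skips the directories and files the user does not have access to: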

    private static long SizeOf(string directory)
    {
        var fcounter = new CSharpTest.Net.IO.FindFile(directory, "*", true, true, true);
        fcounter.RaiseOnAccessDenied = false;

        long size = 0, total = 0;
        fcounter.FileFound +=
            (o, e) =>
            {
                if (!e.IsDirectory)
                {
                    Interlocked.Increment(ref total);
                    size += e.Length;
                }
            };

        Stopwatch sw = Stopwatch.StartNew();
        fcounter.Find();
        Console.WriteLine("Enumerated {0:n0} files totaling {1:n0} bytes in {2:n3} seconds.",
                          total, size, sw.Elapsed.TotalSeconds);
        return size;
    }

For my local C:\ drive this outputs the following:

Enumerated 810,046 files totaling 307,707,792,662 bytes in 232.876 seconds.

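Your mileage may vary with drive speed, but this is the fastest method I have found for enumerating files in managed code. The event parameter is a mutating class of type FindFile.FileFoundEventArgs, so be sure you do not keep a reference to it, because its values change for each event raised.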
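
To make that last warning concrete, here is a small self-contained sketch of the pitfall. It does not use the library: ReusedArgs, FileSnapshot and ReuseDemo are hypothetical stand-ins that only imitate an enumerator handing the same mutated object to every callback. The point is that a handler should copy the values it needs rather than store the argument object itself.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for an event-args object that is reused for every file.
public sealed class ReusedArgs
{
    public string Name;
    public long Length;
}

// Immutable copy of the values we want to keep after the callback returns.
public sealed class FileSnapshot
{
    public FileSnapshot(string name, long length)
    {
        Name = name;
        Length = length;
    }

    public string Name { get; }
    public long Length { get; }
}

public static class ReuseDemo
{
    // Imitates an enumerator that mutates one shared args instance per entry.
    public static void Enumerate(IEnumerable<KeyValuePair<string, long>> entries,
                                 Action<ReusedArgs> onFound)
    {
        var shared = new ReusedArgs();
        foreach (var entry in entries)
        {
            shared.Name = entry.Key;
            shared.Length = entry.Value;
            onFound(shared);
        }
    }

    public static void Demo()
    {
        var entries = new[]
        {
            new KeyValuePair<string, long>("a.txt", 10),
            new KeyValuePair<string, long>("b.txt", 20),
        };

        var wrong = new List<ReusedArgs>();    // stores the shared reference
        var right = new List<FileSnapshot>();  // stores copied values

        Enumerate(entries, e =>
        {
            wrong.Add(e);
            right.Add(new FileSnapshot(e.Name, e.Length));
        });

        Console.WriteLine(wrong[0].Name);  // prints "b.txt": both entries point at the same object
        Console.WriteLine(right[0].Name);  // prints "a.txt": the copy kept its own values
    }
}
```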

  • You were also missing the call to the `Find()` method. I placed `fcounter.Find()` after the lambda and it works fine. (2 upvotes)