Writing a file line by line in C# with StreamReader/StreamWriter is very slow

web*_*orm 2 c# streamwriter streamreader

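I wrote a WinForms application that reads each line of a text file, performs a search and replace on the line using a RegEx, and then writes it out to a new file. I chose the "line by line" approach because some of the files are too large to load into memory.

I am using a BackgroundWorker object so that the UI can be updated with the job's progress. Below is the code (parts omitted for brevity) that reads the input file and writes out the lines.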

public void bgWorker_DoWork(object sender, DoWorkEventArgs e)
{
    // Details of obtaining file paths omitted for brevity

    int totalLineCount = File.ReadLines(inputFilePath).Count();

    using (StreamReader sr = new StreamReader(inputFilePath))
    {
      int currentLine = 0;
      String line;
      while ((line = sr.ReadLine()) != null)
      {
        currentLine++;

        // Match and replace contents of the line
        // omitted for brevity

        if (currentLine % 100 == 0)
        {
          int percentComplete = (currentLine * 100 / totalLineCount);
          bgWorker.ReportProgress(percentComplete);
        }

        using (FileStream fs = new FileStream(outputFilePath, FileMode.Append, FileAccess.Write))
        using (StreamWriter sw = new StreamWriter(fs))
        {
          sw.WriteLine(line);
        }
      }
    }
}

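Some of the files I am processing are very large (8 GB, 132 million lines). The process takes a very long time (a 2 GB file took about 9 hours to complete). It appears to be running at about 58 KB/sec. Is this expected, or should it be faster?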

Sco*_*ain 14

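Don't close and reopen the output file on every loop iteration; open the writer once, outside the loop. This should improve performance because the writer no longer has to seek to the end of the file on every iteration.

Also, File.ReadLines(inputFilePath).Count(); causes you to read the input file twice, which could be a big chunk of the time. Instead of calculating the percentage from the line count, calculate it from the stream position.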

public void bgWorker_DoWork(object sender, DoWorkEventArgs e) 
{ 
    // Details of obtaining file paths omitted for brevity

    // This constructor (append: true) can be used instead of the FileStream; it does the same operation.
    using (StreamWriter sw = new StreamWriter(outputFilePath, true))
    using (StreamReader sr = new StreamReader(inputFilePath))
    {
      int lastPercentage = 0;
      String line;
      while ((line = sr.ReadLine()) != null)
      {

        // Match and replace contents of the line
        // omitted for brevity

        // Position and Length are longs, not ints, so we need to cast at the end.
        // BaseStream.Position advances a buffer at a time, so the percentage is approximate.
        int currentPercentage = (int)(sr.BaseStream.Position * 100L / sr.BaseStream.Length);
        if (lastPercentage != currentPercentage)
        {
          bgWorker.ReportProgress(currentPercentage);
          lastPercentage = currentPercentage;
        }
        sw.WriteLine(line);
      }
    }
}

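Other than that, you need to show what "Match and replace contents of the line omitted for brevity" actually does, because my guess is that this is where your slowness comes from. Run a profiler on your code, see where it spends the most time, and focus your efforts there.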
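
If a full profiler is not at hand, a rough first check is to time only the replace step with a Stopwatch. The sketch below is an illustration under assumptions, not the asker's code: the class name ReplaceTiming, the pattern \d+ and the replacement "#" are hypothetical stand-ins, since the real regex was omitted from the question. It also builds the Regex once, outside the loop, which is worth checking in the real code.

```csharp
using System;
using System.Diagnostics;
using System.IO;
using System.Text.RegularExpressions;

public static class ReplaceTiming
{
    // Hypothetical pattern: the question does not show the real one.
    // The Regex is constructed once and reused for every line.
    private static readonly Regex Pattern = new Regex(@"\d+", RegexOptions.Compiled);

    // Copies reader to writer line by line, applying the replacement,
    // and returns the total time spent inside Regex.Replace alone.
    public static TimeSpan Process(TextReader reader, TextWriter writer)
    {
        var replaceTimer = new Stopwatch();
        string line;
        while ((line = reader.ReadLine()) != null)
        {
            replaceTimer.Start();               // resumes; Elapsed accumulates
            line = Pattern.Replace(line, "#");  // hypothetical replacement text
            replaceTimer.Stop();

            writer.WriteLine(line);
        }
        return replaceTimer.Elapsed;
    }
}
```

Calling Process with a StreamReader on the input file and a StreamWriter on the output file, and comparing the returned time with the total run time, gives a rough idea of whether the regex or the file I/O dominates. A real profiler will still give a more detailed picture.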