C# - Calculating total statistics across multiple files

Cam*_*lch 2 c# linq parsing file

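This is similar to a question I asked earlier. The difference is that I'm now processing multiple files and computing totals across all of them. I know I'm reading every file in the given directory, but for some reason the grouping isn't working correctly.

Here is my code: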

public void CalculateMonthlyStatistics(string monthlyFiles)
        {
            string monthlyFileName = monthlyFiles + ".log";

            var statistics = File.ReadLines(monthlyFileName)

            .GroupBy(items => items[0])
            .Select(g =>
            new
            {

                Division = g.Key,
                ZipFiles = g.Sum(i => Convert.ToInt32(i[1])),
                Conversions = g.Sum(i => Convert.ToInt32(i[2])),
                ReturnedFiles = g.Sum(i => Convert.ToInt32(i[3])),
                TotalEmails = g.Sum(i => Convert.ToInt32(i[4]))
            });

            statistics
               .ToList()
               .ForEach(d => Console.WriteLine("{0}\t{1}\t{2}\t{3}\t{4}", 
                        d.Division, 
                        d.ZipFiles, 
                        d.Conversions, 
                        d.ReturnedFiles,  
                        d.TotalEmails));
               Console.Read();
               //.ForEach(d => Log.Open(tempFileName.TrimEnd(charsToTrim), d.Division, d.ZipFiles, d.Conversions, d.ReturnedFiles, d.TotalEmails));
        }
    }
}

The log files I'm feeding in look like this:

 Division   Zip Files   Conversions Returned Files  Total E-Mails   
Corporate   0   5   0   5   
Energy  0   1   0   5   
Global Operations   0   3   0   3   
Oil & Gas   1   5   0   5   
Capital 5   18  0   12  
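
So what I want to do is group by "Corporate", "Energy", and so on, and then compute the totals across all of the files being read in order to create a monthly statistics file. I'm currently getting totals, but I think the problem has to do with the header row I'm passing in, and I don't know how to tell it to skip that line.

Thanks in advance.

Edit

Here is my processor, which does the initial reading of the directory, etc.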

public void ProcessMonthlyLogFiles()
    {
        DateTime currentTime = DateTime.Now;

        int month = currentTime.Month - 1;
        int year = currentTime.Year;

        string path = Path.GetDirectoryName(Settings.DailyPath + year + @"\" + month + @"\");

        foreach (string monthlyFileNames in Directory.GetFiles(path))
        {
            string monthlyFiles = path + @"\" + Path.GetFileNameWithoutExtension(monthlyFileNames);
            new MonthlyReader().CalculateMonthlyStatistics(monthlyFiles);
        }
    }

The processor finds the correct directory to pull the files from. It uses the current date and looks up the previous month.

yam*_*men 5

Skipping the header is straightforward:

File.ReadLines(monthlyFileName).Skip(1).<rest of your chain>

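However, it looks like you're reading only one file at a time, when what you want is to read all the files and then compute the statistics?

How about starting with this: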

public IEnumerable<String> ReadLinesInDirectory(string path)
{
    return Directory.EnumerateFiles(path)
                    .SelectMany(f => 
                        File.ReadLines(f)
                        .AsEnumerable()
                        .Skip(1));
}
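
And replace the File.ReadLines call with it (make sure you're pointing at the correct path, etc.).


OK, here is the full explanation, though I think you may need to study C# a bit more. First, define the ReadLinesInDirectory function I wrote above.

Then replace ProcessMonthlyLogFiles with: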

public void ProcessMonthlyLogFiles()
{
    DateTime currentTime = DateTime.Now;

    int month = currentTime.Month - 1;
    int year = currentTime.Year;

    string path = Path.GetDirectoryName(Settings.DailyPath + year + @"\" + month + @"\");

    CalculateMonthlyStatistics(path);
}
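A side note on the date arithmetic above: DateTime.Month ranges from 1 to 12, so `currentTime.Month - 1` evaluates to 0 in January and the year is not rolled back. A minimal sketch of one way around that, using DateTime.AddMonths, follows. The class and method names (MonthlyPaths, PreviousMonth, PreviousMonthDirectory) are hypothetical and not part of the original code, and the `<root>\<year>\<month>` layout is assumed from the path concatenation in the question.

```csharp
using System;
using System.IO;

// Hypothetical helper, not from the original post.
public static class MonthlyPaths
{
    // Returns the year and month of the calendar month preceding `now`.
    // AddMonths(-1) rolls January back to December of the previous year.
    public static (int Year, int Month) PreviousMonth(DateTime now)
    {
        DateTime previous = now.AddMonths(-1);
        return (previous.Year, previous.Month);
    }

    // Builds "<root>/<year>/<month>" for the previous month.
    // Assumes the directory layout implied by the question's path code.
    public static string PreviousMonthDirectory(string root, DateTime now)
    {
        var (year, month) = PreviousMonth(now);
        return Path.Combine(root, year.ToString(), month.ToString());
    }
}
```

With this in place, the path could be built as MonthlyPaths.PreviousMonthDirectory(Settings.DailyPath, DateTime.Now), assuming Settings.DailyPath is the root of the daily logs.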

And the first three lines of CalculateMonthlyStatistics (before the GroupBy) become:

    public void CalculateMonthlyStatistics(string path)
    {
        var statistics = ReadLinesInDirectory(path)
                         // .GroupBy etc...
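To tie the pieces together, here is a rough end-to-end sketch. One thing worth noting: File.ReadLines yields strings, so `items[0]` in the original GroupBy is the first character of the line rather than the first column, which is likely part of why the grouping looks wrong. The sketch splits each line into columns first. It assumes the columns are tab-separated (suggested by the sample and the `\t` in the output format, but not confirmed), that every data row has at least five columns, and that the files end in .log. DivisionTotals, MonthlyStatistics and Summarize are hypothetical names introduced for illustration, and the record syntax needs C# 9 or later.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;

// Hypothetical result type, not from the original post.
public sealed record DivisionTotals(
    string Division, int ZipFiles, int Conversions, int ReturnedFiles, int TotalEmails);

public static class MonthlyStatistics
{
    // Lazily reads every *.log file in the directory, skipping each file's header row.
    public static IEnumerable<string> ReadLinesInDirectory(string path) =>
        Directory.EnumerateFiles(path, "*.log")
                 .SelectMany(f => File.ReadLines(f).Skip(1));

    // Splits each line on tabs (assumed delimiter), groups rows by division name,
    // and sums the four numeric columns per division.
    public static List<DivisionTotals> Summarize(IEnumerable<string> lines) =>
        lines.Where(line => !string.IsNullOrWhiteSpace(line))
             .Select(line => line.Split('\t').Select(c => c.Trim()).ToArray())
             .Where(cols => cols.Length >= 5)   // ignore rows that are too short
             .GroupBy(cols => cols[0])
             .Select(g => new DivisionTotals(
                 g.Key,
                 g.Sum(c => int.Parse(c[1])),
                 g.Sum(c => int.Parse(c[2])),
                 g.Sum(c => int.Parse(c[3])),
                 g.Sum(c => int.Parse(c[4]))))
             .ToList();
}
```

Calling MonthlyStatistics.Summarize(MonthlyStatistics.ReadLinesInDirectory(path)) would then give one total per division across all files, which can be printed or logged as in the original ForEach. int.Parse throws on non-numeric input, so int.TryParse may be a better fit if the files can contain malformed rows.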