Reading CSV files in C#

die*_*aro 13 c# csv open-source

Does anyone know of an open-source library that lets you parse and read .csv files in C#?

Joe*_*orn 26

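Here you go, written by yours truly using generic collections and iterator blocks. It supports double-quote enclosed text fields (including ones that span multiple lines) using the double-escaped convention (so "" inside a quoted field reads as a single quote character). It does not support: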

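  • Single-quote enclosed text
  • \-escaped quoted text
  • Alternate delimiters (won't yet work on pipe- or tab-delimited fields)
  • Unquoted text fields that begin with a quote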

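But all of those would be easy enough to add if you need them. I haven't benchmarked it anywhere (I'd love to see some results), but performance should be very good, and better than anything .Split()-based in any case.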
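To illustrate why a .Split()-based approach falls short once quoted fields are involved, here is a minimal sketch. The sample line and the SplitPitfall class are made up for illustration and are not part of the answer's code.

```csharp
using System;

public static class SplitPitfall
{
    // Naive approach: break the line on every comma, ignoring quotes.
    public static string[] NaiveSplit(string line)
    {
        return line.Split(',');
    }

    public static void Main()
    {
        // Hypothetical sample record: the first field contains a comma inside quotes.
        string line = "\"Smith, John\",42";

        string[] parts = NaiveSplit(line);

        // The quoted field is cut in two, so this prints 3 rather than 2.
        Console.WriteLine(parts.Length);
        foreach (string part in parts)
            Console.WriteLine("[" + part + "]");
    }
}
```

A parser that tracks whether it is inside quotes, as the code in this answer does, would return the two intended fields instead.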

Now on GitHub

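Update: I felt like adding single-quote enclosed text support. It's a simple change, but I typed it straight into the reply window, so it's untested. Use the revision link at the bottom if you'd prefer the old (tested) code.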

using System.Collections.Generic;
using System.IO;
using System.Text;

public static class CSV
{
    public static IEnumerable<IList<string>> FromFile(string fileName)
    {
        foreach (IList<string> item in FromFile(fileName, ignoreFirstLineDefault)) yield return item;
    }

    public static IEnumerable<IList<string>> FromFile(string fileName, bool ignoreFirstLine)
    {
        using (StreamReader rdr = new StreamReader(fileName))
        {
            foreach(IList<string> item in FromReader(rdr, ignoreFirstLine)) yield return item;
        }
    }

    public static IEnumerable<IList<string>> FromStream(Stream csv)
    {
        foreach (IList<string> item in FromStream(csv, ignoreFirstLineDefault)) yield return item;
    }

    public static IEnumerable<IList<string>> FromStream(Stream csv, bool ignoreFirstLine)
    {
        using (var rdr = new StreamReader(csv))
        {
            foreach (IList<string> item in FromReader(rdr, ignoreFirstLine)) yield return item;
        }
    }

    public static IEnumerable<IList<string>> FromReader(TextReader csv)
    {
        //Probably should have used TextReader instead of StreamReader
        foreach (IList<string> item in FromReader(csv, ignoreFirstLineDefault)) yield return item;
    }

    public static IEnumerable<IList<string>> FromReader(TextReader csv, bool ignoreFirstLine)
    {
        if (ignoreFirstLine) csv.ReadLine();

        IList<string> result = new List<string>();
        StringBuilder curValue = new StringBuilder();

        // Keep the current character as an int so that -1 (end of stream) stays
        // distinguishable from real characters and the last character is never dropped.
        int c = csv.Read();
        while (c != -1)
        {
            switch (c)
            {
                case ',': //empty field
                    result.Add("");
                    c = csv.Read();
                    break;
                case '"': //qualified text
                case '\'':
                    int q = c; //remember which quote character opened the field
                    c = csv.Read();
                    while (c != -1)
                    {
                        if (c == q)
                        {
                            c = csv.Read();
                            if (c != q) break; //closing quote; a doubled quote is kept as a literal
                        }
                        curValue.Append((char)c);
                        c = csv.Read();
                    }
                    result.Add(curValue.ToString());
                    curValue = new StringBuilder();
                    if (c == ',') c = csv.Read(); // either ',', newline, or endofstream
                    break;
                case '\n': //end of the record
                case '\r':
                    //potential bug here depending on what your line breaks look like
                    if (result.Count > 0) // don't return empty records
                    {
                        yield return result;
                        result = new List<string>();
                    }
                    c = csv.Read();
                    break;
                default: //normal unqualified text
                    while (c != -1 && c != ',' && c != '\r' && c != '\n')
                    {
                        curValue.Append((char)c);
                        c = csv.Read();
                    }
                    result.Add(curValue.ToString());
                    curValue = new StringBuilder();
                    if (c == ',') c = csv.Read(); //either ',', newline, or endofstream
                    break;
            }
        }
        //potential bug: an empty column at the very end of a record (a trailing comma)
        //is still dropped, even if a caller really expects it to be there
        if (result.Count > 0)
            yield return result;
    }
    private static bool ignoreFirstLineDefault = false;
}


Gal*_*ian 23

Take a look at A Fast CSV Reader on CodeProject.


Dan*_*den 21

The last time this question was asked, here is the answer I gave:

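If you're just trying to read a CSV file with C#, the easiest thing is to use the Microsoft.VisualBasic.FileIO.TextFieldParser class. It's actually built into the .NET Framework, rather than being a third-party extension.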

Yes, it lives in Microsoft.VisualBasic.dll, but that doesn't mean you can't use it from C# (or any other CLR language).

Here's a usage example, taken from the MSDN documentation:

Using MyReader As New _
Microsoft.VisualBasic.FileIO.TextFieldParser("C:\testfile.txt")
   MyReader.TextFieldType = FileIO.FieldType.Delimited
   MyReader.SetDelimiters(",")
   Dim currentRow As String()
   While Not MyReader.EndOfData
      Try
         currentRow = MyReader.ReadFields()
         Dim currentField As String
         For Each currentField In currentRow
            MsgBox(currentField)
         Next
      Catch ex As Microsoft.VisualBasic.FileIO.MalformedLineException
         MsgBox("Line " & ex.Message & _
         "is not valid and will be skipped.")
      End Try
   End While
End Using

Again, this example is in VB.NET, but it would be trivial to translate it to C#.
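For reference, a C# translation might look roughly like the sketch below. It is not from the MSDN page. The TextFieldParserExample class, the ReadRecords helper, and the inline sample text (used in place of C:\testfile.txt) are assumptions made for illustration, and it writes to the console instead of calling MsgBox. On .NET Framework the project needs a reference to Microsoft.VisualBasic.dll.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using Microsoft.VisualBasic.FileIO;

public static class TextFieldParserExample
{
    // Hypothetical helper: returns every well-formed record; malformed lines are skipped.
    public static List<string[]> ReadRecords(TextReader input)
    {
        var records = new List<string[]>();
        using (var parser = new TextFieldParser(input))
        {
            parser.TextFieldType = FieldType.Delimited;
            parser.SetDelimiters(",");
            while (!parser.EndOfData)
            {
                try
                {
                    records.Add(parser.ReadFields());
                }
                catch (MalformedLineException ex)
                {
                    Console.Error.WriteLine(
                        "Line " + ex.LineNumber + " is not valid and will be skipped.");
                }
            }
        }
        return records;
    }

    public static void Main()
    {
        // Made-up sample data; pass a StreamReader to read a real file instead.
        string sample = "name,age\r\n\"Smith, John\",42\r\n";
        foreach (string[] fields in ReadRecords(new StringReader(sample)))
            Console.WriteLine(string.Join(" | ", fields));
    }
}
```

TextFieldParser also has a constructor that takes a file path directly, which is what the VB.NET example uses; the TextReader overload is used here only so the sketch runs without a file on disk.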


mar*_*c_s 8

I really like the FileHelpers library. It's fast, it's 100% C#, it's available for free, and it's very flexible and easy to use.