Using OpenXmlReader

Arg*_*ent 14 c# openxml openxml-sdk

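I hate resorting to StackOverflow for something this (seemingly) basic, but I have been fighting with Microsoft for the past few hours and seem to have hit a dead end. I am trying to read (large) Excel 2007+ spreadsheets, and Google has kindly informed me that the OpenXml SDK is a pretty popular choice. So I gave it a shot, read some tutorials, checked Microsoft's own library pages, and got very little out of any of them.

I am using a small test spreadsheet with just one column of numbers and one of strings; large-scale testing will come later. I have tried several implementations similar to the one I am about to post, and none of them read any data. The code below was mostly taken from another StackOverflow thread, where it seemed to work; not so for me. I figured I would have you check/debug/help with this version, because it is likely to be less broken than anything I have written today.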

static void ReadExcelFileSAX(string fileName)
    {
        using (SpreadsheetDocument spreadsheetDocument = SpreadsheetDocument.Open(fileName, true))
        {
            WorkbookPart workbookPart = spreadsheetDocument.WorkbookPart;
            WorksheetPart worksheetPart = workbookPart.WorksheetParts.First();

            OpenXmlPartReader reader = new OpenXmlPartReader(worksheetPart);
            string text;
            string rowNum;
            while (reader.Read())
            {
                if (reader.ElementType == typeof(Row))
                {
                    do
                    {
                        if (reader.HasAttributes)
                        {
                            rowNum = reader.Attributes.First(a => a.LocalName == "r").Value;
                            Console.Write("rowNum: " + rowNum); //we never even get here, I tested it with a breakpoint
                        }

                    } while (reader.ReadNextSibling()); // Skip to the next row
                    Console.ReadKey();
                    break; // We just looped through all the rows so no need to continue reading the worksheet
                }
                if (reader.ElementType == typeof(Cell))
                {

                }

                if (reader.ElementType != typeof(Worksheet)) // Don't want to skip the contents of the worksheet
                    reader.Skip(); // Skip contents of any node before finding the first row.
            }
            reader.Close();
            Console.WriteLine();
            Console.ReadKey();
        }
    }

And, on a side note, are there any good alternatives to the OpenXml SDK that I have somehow missed?

Han*_*ans 22

I think you are reading the wrong WorksheetPart.

This line

workbookPart.WorksheetParts.First();

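gets the first WorksheetPart in the collection, which is not necessarily the first worksheet you see in Microsoft Excel.

So iterate through all WorksheetParts, and you should see some output in the console window.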

static void ReadExcelFileSAX(string fileName)
{
  // Open read-only (second argument is false): nothing is modified here.
  using (SpreadsheetDocument spreadsheetDocument = 
                                   SpreadsheetDocument.Open(fileName, false))
  {
    WorkbookPart workbookPart = spreadsheetDocument.WorkbookPart;

    // Iterate through all WorksheetParts
    foreach (WorksheetPart worksheetPart in workbookPart.WorksheetParts)
    {          
      OpenXmlPartReader reader = new OpenXmlPartReader(worksheetPart);
      string rowNum;
      while (reader.Read())
      {
        if (reader.ElementType == typeof(Row))
        {
          do
          {
            if (reader.HasAttributes)
            {
              rowNum = reader.Attributes.First(a => a.LocalName == "r").Value;
              Console.Write("rowNum: " + rowNum);
            }

          } while (reader.ReadNextSibling()); // Skip to the next row

          break; // We just looped through all the rows so no 
                 // need to continue reading the worksheet
        }

        if (reader.ElementType != typeof(Worksheet))
          reader.Skip(); 
      }
      reader.Close();      
    }
  }  
}
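
If only the first sheet as displayed in Excel is needed, one option is to look it up through the workbook's sheet list instead of iterating over every part. The following is a minimal sketch, not part of the original answer; the class and method names (`SheetLookup`, `GetFirstSheetPart`) are hypothetical, and it assumes the workbook contains at least one sheet.

```csharp
using System.Linq;
using DocumentFormat.OpenXml.Packaging;
using DocumentFormat.OpenXml.Spreadsheet;

static class SheetLookup
{
    // Hypothetical helper: returns the WorksheetPart of the first sheet
    // listed in the workbook, which is the order of the tabs in Excel.
    public static WorksheetPart GetFirstSheetPart(WorkbookPart workbookPart)
    {
        // Throws InvalidOperationException if the workbook lists no sheets.
        Sheet firstSheet = workbookPart.Workbook.Descendants<Sheet>().First();

        // Sheet.Id holds the relationship id that links the sheet entry to its part.
        return (WorksheetPart)workbookPart.GetPartById(firstSheet.Id.Value);
    }
}
```

In the function above, `workbookPart.WorksheetParts.First()` could then be replaced with `SheetLookup.GetFirstSheetPart(workbookPart)`.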

To read all cell values, use the following function (all error handling details omitted):

static void ReadAllCellValues(string fileName)
{
  using (SpreadsheetDocument spreadsheetDocument = SpreadsheetDocument.Open(fileName, false))
  {
    WorkbookPart workbookPart = spreadsheetDocument.WorkbookPart;

    foreach(WorksheetPart worksheetPart in workbookPart.WorksheetParts)
    {
      OpenXmlReader reader = OpenXmlReader.Create(worksheetPart);

      while (reader.Read())
      {
        if (reader.ElementType == typeof(Row))
        {
          reader.ReadFirstChild();

          do
          {
            if (reader.ElementType == typeof(Cell))
            {
              Cell c = (Cell)reader.LoadCurrentElement();

              string cellValue;

              if (c.DataType != null && c.DataType == CellValues.SharedString)
              {
                SharedStringItem ssi = workbookPart.SharedStringTablePart.SharedStringTable.Elements<SharedStringItem>().ElementAt(int.Parse(c.CellValue.InnerText));

                cellValue = ssi.Text.Text;
              }
              else
              {
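                // CellValue is null for cells that carry no <v> element (e.g. styled but empty cells).
                cellValue = c.CellValue != null ? c.CellValue.InnerText : string.Empty;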
              }

              Console.Out.Write("{0}: {1} ", c.CellReference, cellValue);
            }
          } while (reader.ReadNextSibling());
          Console.Out.WriteLine();
        }            
      }
    }   
  }
}

In the code above you can see that cells with the data type SharedString must be resolved through the SharedStringTablePart.
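
For large sheets, calling `ElementAt` on the shared string table for every cell gets slow, because each call walks the table from the start. A possible variation, sketched below under the assumption that the shared string table fits in memory, is to load the strings once and index into the list. The names `CellText`, `LoadSharedStrings`, and `GetText` are hypothetical and not part of the SDK.

```csharp
using System.Collections.Generic;
using System.Linq;
using DocumentFormat.OpenXml.Packaging;
using DocumentFormat.OpenXml.Spreadsheet;

static class CellText
{
    // Loads every shared string once so later lookups by index are O(1).
    public static List<string> LoadSharedStrings(WorkbookPart workbookPart)
    {
        SharedStringTablePart tablePart = workbookPart.SharedStringTablePart;
        if (tablePart == null || tablePart.SharedStringTable == null)
            return new List<string>(); // the workbook contains no shared strings

        return tablePart.SharedStringTable
            .Elements<SharedStringItem>()
            // InnerText concatenates the text of all child elements, so rich-text
            // items work too (it also includes phonetic hint text, if any).
            .Select(item => item.InnerText)
            .ToList();
    }

    // Returns the display text of a cell, resolving shared string indexes.
    public static string GetText(Cell cell, IReadOnlyList<string> sharedStrings)
    {
        if (cell.CellValue == null)
            return string.Empty; // cell has no <v> element

        string raw = cell.CellValue.InnerText;
        if (cell.DataType != null && cell.DataType.Value == CellValues.SharedString)
            return sharedStrings[int.Parse(raw)];

        return raw;
    }
}
```

With this in place, the list would be built once before the `foreach` over the worksheet parts, and the body of the cell branch reduces to `string cellValue = CellText.GetText(c, sharedStrings);`.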

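  • @Argent: Updated my answer with a function that reads all cell values of the worksheets contained in an Excel file. (2 upvotes)
  • @Hans If there is an empty cell, it is not picked up, so the row ends up with fewer columns than the original. How can I read empty or null cells? (2 upvotes)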
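
Regarding the last comment: empty cells are usually not stored in the file at all, so the reader never sees them. One common workaround, sketched here and not taken from the thread, is to derive the column index from each cell's reference (for example "C5") and fill the gap between consecutive cells yourself. The helper name `ColumnIndex` is hypothetical, and the approach assumes every cell carries a `CellReference`, which Excel writes but the file format does not strictly require.

```csharp
using System;

static class CellReferences
{
    // Converts the column letters of a reference such as "C5"
    // to a zero-based column index (A -> 0, Z -> 25, AA -> 26).
    public static int ColumnIndex(string cellReference)
    {
        int index = 0;
        foreach (char ch in cellReference)
        {
            if (!char.IsLetter(ch))
                break; // the row number starts here

            // Column letters form a base-26 number with digits A=1 .. Z=26.
            index = index * 26 + (char.ToUpperInvariant(ch) - 'A' + 1);
        }
        return index - 1;
    }
}
```

While looping over the cells of a row, compare `CellReferences.ColumnIndex(c.CellReference)` with the index expected next, and emit an empty value for each column that was skipped.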