Open XML Excel: read a cell value

jwd*_*aan 47 c# openxml openxml-sdk

I am using the Open XML SDK to open an Excel xlsx file, and I am trying to read the value of the cell at position A1 in each sheet. I use the following code:

using (SpreadsheetDocument spreadsheetDocument = SpreadsheetDocument.Open(openFileDialog1.FileName, false))
{
    var sheets = spreadsheetDocument.WorkbookPart.Workbook.Descendants<Sheet>();

    foreach (Sheet sheet in sheets)
    {
        WorksheetPart worksheetPart = (WorksheetPart)spreadsheetDocument.WorkbookPart.GetPartById(sheet.Id);
        Worksheet worksheet = worksheetPart.Worksheet;

        Cell cell = GetCell(worksheet, "A", 1);

        Console.WriteLine(cell.CellValue.Text);
    }
}

private static Cell GetCell(Worksheet worksheet, string columnName, uint rowIndex)
{
     Row row = GetRow(worksheet, rowIndex);

     if (row == null)
         return null;

     return row.Elements<Cell>().Where(c => string.Compare
               (c.CellReference.Value, columnName +
               rowIndex, true) == 0).First();
}

// Given a worksheet and a row index, return the row.
private static Row GetRow(Worksheet worksheet, uint rowIndex)
{
    return worksheet.GetFirstChild<SheetData>().
          Elements<Row>().Where(r => r.RowIndex == rowIndex).FirstOrDefault();
} 

The text in A1 of the first sheet is simply 'test', but in my console I see the value '0' as cell.CellValue.Text.

Does anyone have an idea how to get the correct cell value?

amu*_*rra 64

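All strings in an Excel worksheet are stored in an array-like structure called the SharedStringTable. The goal of this table is to centralize all strings in an index-based array, so that a string used several times in the document is stored once and each cell only references its index in that array. That being said, the 0 you get when you read the text value of cell A1 is the index into the SharedStringTable. To get the real value you can use this helper function: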

public static SharedStringItem GetSharedStringItemById(WorkbookPart workbookPart, int id)
{
    return workbookPart.SharedStringTablePart.SharedStringTable.Elements<SharedStringItem>().ElementAt(id);
}

Then call it from your code like this to get the real value:

Cell cell = GetCell(worksheet, "A", 1);

string cellValue = string.Empty;

if (cell.DataType != null)
{
    if (cell.DataType == CellValues.SharedString)
    {
       int id = -1;

       if (Int32.TryParse(cell.InnerText, out id))
       {
           SharedStringItem item = GetSharedStringItemById(workbookPart, id);

           if (item.Text != null)
           {
               cellValue = item.Text.Text;
           }
           else if (item.InnerText != null)
           {
               cellValue = item.InnerText;
           }
           else if (item.InnerXml != null)
           {
               cellValue = item.InnerXml;
           }
       }
    }
}
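
To make the indirection concrete, here is a minimal sketch that does not use the SDK at all. `RawCell`, `SharedStringSketch` and `Resolve` are made-up names for illustration only; `Type` stands in for the cell's `t` attribute in the sheet XML (`"s"` means shared string, `"n"` means number) and `Raw` for the content of its `<v>` element.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for a worksheet cell (not an Open XML SDK type).
public sealed record RawCell(string Reference, string Type, string Raw);

public static class SharedStringSketch
{
    // Returns the display text of a cell: follows the index into the
    // shared string list only when the cell is typed as a shared string.
    public static string Resolve(RawCell cell, IReadOnlyList<string> sharedStrings)
    {
        if (cell.Type == "s"
            && int.TryParse(cell.Raw, out int index)
            && index >= 0 && index < sharedStrings.Count)
        {
            return sharedStrings[index];
        }

        // Numbers and other types carry their value directly.
        return cell.Raw;
    }
}
```

With a shared string list of `{ "test", "other" }`, a cell `("A1", "s", "0")` resolves to the first entry, which is the situation in the question, while a numeric cell `("B1", "n", "42")` is returned unchanged.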

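  • I'm adding this comment because the actual solution for determining whether the cell value represents an SST index was, for some reason, never posted (very annoying): if (cell.DataType != null && cell.DataType == CellValues.SharedString) (10 upvotes)
  • This is correct, but it doesn't address everything that's needed. Before looking up the cell value in the SST, you need to determine whether the cell value represents an SST index or is actually the value itself. (6 upvotes)
  • Inclined to agree with amurra on this one: the OP only asked for the basic value. Thanks to these comments he now knows he may need to consider other things, which makes the answer sufficient for the question asked. Other things, like formulas, can be asked about in another question. (4 upvotes)
  • @Samuel Neff: By default Excel puts all basic strings into the SST, and in this question he only cares about getting that basic string value. No need to overcomplicate the basic scenario. If he is dealing with formulas or other data, then obviously the code above needs to change to incorporate your comment. (2 upvotes)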
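
For what it's worth, the check discussed in these comments can be folded into one helper. This is only a sketch under a few assumptions: `CellTextSketch` and `GetCellText` are hypothetical names (not part of the SDK), and anything that is not a shared string (numbers, dates, Booleans, formula results) is simply returned as the raw stored text.

```csharp
using System.Linq;
using DocumentFormat.OpenXml.Packaging;
using DocumentFormat.OpenXml.Spreadsheet;

public static class CellTextSketch
{
    // Hypothetical helper: returns the shared string a cell points to,
    // or the raw stored text when the cell is not a shared string.
    public static string GetCellText(WorkbookPart workbookPart, Cell cell)
    {
        if (cell == null)
            return string.Empty;

        string raw = cell.CellValue?.Text ?? cell.InnerText;

        // DataType is null for plain numbers, so check it before comparing.
        if (cell.DataType == null || cell.DataType.Value != CellValues.SharedString)
            return raw;

        SharedStringTablePart sstPart = workbookPart.SharedStringTablePart;
        if (sstPart?.SharedStringTable == null || !int.TryParse(raw, out int index))
            return raw;

        SharedStringItem item = sstPart.SharedStringTable
            .Elements<SharedStringItem>()
            .ElementAtOrDefault(index);

        // Fall back to the raw index text if the table has no such entry.
        return item?.InnerText ?? raw;
    }
}
```

In the question's loop this would be called as `GetCellText(spreadsheetDocument.WorkbookPart, cell)` instead of reading `cell.CellValue.Text` directly.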

Bre*_*ent 15

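Amurra's answer seems to get ninety percent of the way there, but it may need some nuance.

1) The function "GetSharedStringItemById" returns a SharedStringItem, not a string, so the calling code example won't work as written. To get the actual value as a string, I believe you need to ask for the SharedStringItem's InnerText property, like this: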

public static string GetSharedStringItemById(WorkbookPart workbookPart, int id)
{
    return workbookPart.SharedStringTablePart.SharedStringTable.Elements<SharedStringItem>().ElementAt(id).InnerText;
}

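2) The function also (correctly) asks for an int as part of its signature, but the example calling code supplies a string, cell.CellValue.Text. Converting the string to an int is trivial, but it does need to be done, because the code as written may be confusing.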


小智 11

I found this very useful snippet a long time ago, so I can't credit the author.

private static string GetCellValue(string fileName, string sheetName, string addressName)
    {
        string value = null;

        using(SpreadsheetDocument document =  SpreadsheetDocument.Open(fileName, false))
        {
            WorkbookPart wbPart = document.WorkbookPart;

            // Find the sheet with the supplied name, and then use that Sheet
            // object to retrieve a reference to the appropriate worksheet.
            Sheet theSheet = wbPart.Workbook.Descendants<Sheet>().
              Where(s => s.Name == sheetName).FirstOrDefault();

            if(theSheet == null)
            {
                throw new ArgumentException("sheetName");
            }

            // Retrieve a reference to the worksheet part, and then use its 
            // Worksheet property to get a reference to the cell whose 
            // address matches the address you supplied:
            WorksheetPart wsPart = (WorksheetPart)(wbPart.GetPartById(theSheet.Id));
            Cell theCell = wsPart.Worksheet.Descendants<Cell>().
              Where(c => c.CellReference == addressName).FirstOrDefault();

            // If the cell does not exist, value stays null:
            if(theCell != null)
            {
                value = theCell.InnerText;

                // If the cell represents a numeric value, you are done. 
                // For dates, this code returns the serialized value that 
                // represents the date. The code handles strings and Booleans
                // individually. For shared strings, the code looks up the 
                // corresponding value in the shared string table. For Booleans, 
                // the code converts the value into the words TRUE or FALSE.
                if(theCell.DataType != null)
                {
                    switch(theCell.DataType.Value)
                    {
                        case CellValues.SharedString:
                            // For shared strings, look up the value in the shared 
                            // strings table.
                            var stringTable = wbPart.
                              GetPartsOfType<SharedStringTablePart>().FirstOrDefault();
                            // If the shared string table is missing, something is 
                            // wrong. Return the index that you found in the cell.
                            // Otherwise, look up the correct text in the table.
                            if(stringTable != null)
                            {
                                value = stringTable.SharedStringTable.
                                  ElementAt(int.Parse(value)).InnerText;
                            }
                            break;

                        case CellValues.Boolean:
                            switch(value)
                            {
                                case "0":
                                    value = "FALSE";
                                    break;
                                default:
                                    value = "TRUE";
                                    break;
                            }
                            break;
                    }
                }
            }
        }
        return value;
    }

  • It comes from MSDN: http://msdn.microsoft.com/en-us/library/office/ff921204(v=office.14).aspx (7 upvotes)
  • This code is so slow that it can't load a table bigger than 5x5. Adding a single row takes about 200 ms! (2 upvotes)
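
On the speed complaint: a likely cause (an assumption, nothing here was profiled) is that the snippet reopens the file and walks the shared string table again for every single cell. Below is a sketch of a one-pass alternative that opens the document once and copies the shared strings into an array first. `SheetReaderSketch` and `ReadAllCells` are hypothetical names, and only shared strings are resolved; every other cell type comes back as its raw stored text.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using DocumentFormat.OpenXml.Packaging;
using DocumentFormat.OpenXml.Spreadsheet;

public static class SheetReaderSketch
{
    // Hypothetical helper: reads every cell of one sheet in a single pass
    // and returns the values keyed by cell reference ("A1", "B2", ...).
    public static Dictionary<string, string> ReadAllCells(string fileName, string sheetName)
    {
        var result = new Dictionary<string, string>(StringComparer.OrdinalIgnoreCase);

        using (SpreadsheetDocument document = SpreadsheetDocument.Open(fileName, false))
        {
            WorkbookPart wbPart = document.WorkbookPart;
            Sheet sheet = wbPart.Workbook.Descendants<Sheet>()
                .FirstOrDefault(s => s.Name == sheetName);
            if (sheet == null)
                throw new ArgumentException("Sheet not found.", nameof(sheetName));

            // Materialize the shared strings once so each lookup is an array index.
            string[] shared = wbPart.SharedStringTablePart?.SharedStringTable?
                .Elements<SharedStringItem>()
                .Select(item => item.InnerText)
                .ToArray() ?? Array.Empty<string>();

            var wsPart = (WorksheetPart)wbPart.GetPartById(sheet.Id);
            foreach (Cell cell in wsPart.Worksheet.Descendants<Cell>())
            {
                if (cell.CellReference?.Value == null)
                    continue;

                string text = cell.CellValue?.Text ?? cell.InnerText;
                if (cell.DataType != null
                    && cell.DataType.Value == CellValues.SharedString
                    && int.TryParse(text, out int index)
                    && index >= 0 && index < shared.Length)
                {
                    text = shared[index];
                }

                result[cell.CellReference.Value] = text;
            }
        }

        return result;
    }
}
```

A caller would then do something like `ReadAllCells(path, "Sheet1")["A1"]`. For very large sheets the SDK's streaming reader may be a better fit than loading the whole worksheet DOM, but that is outside the scope of this sketch.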