openpyxl max_row and max_column wrongly report a larger figure

Question


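My question concerns a function that is part of a parsing script I am developing. I am trying to write a Python function that finds the column number corresponding to a matching value in an Excel sheet. The workbook is created on the fly with openpyxl, and its first row (starting from column 3) holds headers that are each merged across 4 columns. In a later function, I parse content that has to be added to the columns under the matching header. (Additional information: the content I am parsing is BLAST+ output. I am trying to create a summary spreadsheet with one hit name per header, each with sub-columns for hits, gaps, span and identity. The first two columns are the query contig and its length.)

I originally wrote a similar function for xlrd and it worked. But when I tried to rewrite it for openpyxl, I found that the max_row and max_column properties wrongly report more rows and columns than are actually present. For example, I have 20 rows in this trial input, but max_row reports 82. Note that I manually selected the empty rows and columns, then right-clicked and deleted them, as suggested elsewhere on this forum. This did not change the result.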

def find_column_number(x):
    # hrsh is the worksheet being searched (defined elsewhere in the script)
    col = 0
    print("maxrow = ", hrsh.max_row)
    print("maxcol = ", hrsh.max_column)
    # openpyxl rows and columns are 1-based, so start both ranges at 1
    for rowz in range(1, hrsh.max_row + 1):
        print("now the row is ", rowz)
        for colz in range(1, hrsh.max_column + 1):
            print("now the column is ", colz)
            name = hrsh.cell(row=rowz, column=colz).value
            if name == x:
                col = colz
    return col


The max_row and max_column problem has already been discussed at https://bitbucket.org/openpyxl/openpyxl/issues/514/cell-max_row-reports-higher-than-actual and I applied the suggestion given there. But max_row is still wrong.

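# hrsh.rows is a generator in recent openpyxl versions, so build a list before reversing
for row in reversed(list(hrsh.rows)):
    values = [cell.value for cell in row]
    if any(values):
        print("last row with data is {0}".format(row[0].row))
        maxrow = row[0].row
        break  # stop at the first non-empty row found from the bottom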


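Then I tried the suggestion at https://www.reddit.com/r/learnpython/comments/3prmun/openpyxl_loop_through_and_find_value_of_the/ to get the column values. Again the script takes the empty columns into account and reports more columns than actually exist.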

for currentRow in hrsh.rows:
    for currentCell in currentRow:
        print(currentCell.value)


Could you help me resolve this error, or suggest another way to achieve my goal?

Answer 1

Cha*_*ark 5

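As noted in the bug report you linked to, there is a difference between the dimensions a worksheet reports and whether those dimensions contain empty rows or columns. If max_row and max_column are not reporting what you want to see, you will need to write your own code to find the first completely empty row. The most efficient way would of course be to start from max_row and work backwards, but the following is probably sufficient: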

for max_row, row in enumerate(ws, 1):
    if all(c.value is None for c in row):
        break
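
The backwards scan mentioned above might look like the following minimal sketch. It assumes the rows are handed over as plain sequences of cell values, for example list(ws.values) on an openpyxl worksheet; the helper names last_data_row and last_data_column are made up for illustration and are not part of openpyxl.

```python
def last_data_row(rows):
    """Return the 1-based index of the last row holding any non-None value (0 if none)."""
    rows = list(rows)
    # Walk up from the bottom and stop at the first row that has data.
    for index in range(len(rows), 0, -1):
        if any(value is not None for value in rows[index - 1]):
            return index
    return 0


def last_data_column(rows):
    """Return the 1-based index of the rightmost column holding a non-None value (0 if none)."""
    last = 0
    for row in rows:
        # Only look to the right of the best column found so far.
        for index in range(len(row), last, -1):
            if row[index - 1] is not None:
                last = index
                break
    return last


# Hypothetical sample: two data rows followed by two empty rows.
sample = [
    ("query", "length", "hit"),
    ("contig1", 100, None),
    (None, None, None),
    (None, None, None),
]
print(last_data_row(sample), last_data_column(sample))
```

Testing against None rather than truthiness keeps cells that contain 0 or an empty string from being counted as empty.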

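
For the goal stated in the question, finding the column of a given header, it may be enough to scan the header row alone, which avoids depending on max_row altogether. This is only a sketch under assumptions: the header row is passed as a sequence of values, the merged header cells hold their value in the first cell of each merged range with None in the rest, and find_header_column is a hypothetical name.

```python
def find_header_column(header_row, target):
    """Return the 1-based column whose header equals target, or None if absent."""
    for index, value in enumerate(header_row, 1):
        if value == target:
            return index
    return None


# Hypothetical header row: two fixed columns, then hit names merged across 4 columns each.
# With openpyxl (assumed 2.6 or later) such a row could be read with:
#   header = next(ws.iter_rows(min_row=1, max_row=1, values_only=True))
header = ("query", "length", "hitA", None, None, None, "hitB", None, None, None)
print(find_header_column(header, "hitB"))
```

Unlike the function in the question, which returns 0 when nothing matches, this sketch returns None so that a missing header is easy to tell apart from a real column number.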

Archived:	8 years, 4 months ago
Views:	18720
Last recorded:	4 years, 7 months ago