How can I concatenate three Excel (.xlsx) files using Python?

Aur*_*Vat 8 python excel openpyxl

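Hello, I would like to concatenate three Excel .xlsx files using Python.

I have tried openpyxl, but I don't know which function would let me append the three worksheets into one.

Do you have any ideas on how to do this?

Thanks a lot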

DSM*_*DSM 19

Here's a pandas-based approach. (It uses openpyxl behind the scenes.)

import pandas as pd

# filenames
excel_names = ["xlsx1.xlsx", "xlsx2.xlsx", "xlsx3.xlsx"]

# read them in
excels = [pd.ExcelFile(name) for name in excel_names]

# turn them into dataframes
frames = [x.parse(x.sheet_names[0], header=None,index_col=None) for x in excels]

# delete the first row for all frames except the first
# i.e. remove the header row -- assumes it's the first
frames[1:] = [df[1:] for df in frames[1:]]

# concatenate them..
combined = pd.concat(frames)

# write it out
combined.to_excel("c.xlsx", header=False, index=False)
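A shorter variant of the same idea, as a sketch only: if the first row of every file is a header, `pd.read_excel` can use it for the column names, so no row has to be dropped by hand. The helper names `combine_frames` and `combine_excel` are made up for illustration, and this assumes all files share the same columns.

```python
import pandas as pd


def combine_frames(frames):
    """Stack DataFrames vertically and renumber the index from 0."""
    return pd.concat(frames, ignore_index=True)


def combine_excel(paths, out_path):
    """Read the first sheet of each workbook and write one combined file.

    Hypothetical helper: the first row of each sheet is assumed to be
    the header, which pandas uses as column names.
    """
    frames = [pd.read_excel(path) for path in paths]
    combine_frames(frames).to_excel(out_path, index=False)
```

Usage would look like `combine_excel(["xlsx1.xlsx", "xlsx2.xlsx", "xlsx3.xlsx"], "c.xlsx")`. Because `pd.concat` aligns on column names, a file with a differently named column would produce extra columns filled with NaN rather than an error.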

  • @AuréVat: Numpy needs 32-bit *Python*, not 32-bit Windows. Many people run 32-bit Python on 64-bit Windows. You can have multiple Python environments on one machine. (2 upvotes)

Hen*_*ter 6

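I'd use xlrd and xlwt. Assuming you really just need to append these files (rather than do any real work on them), I'd do the following: open a file for writing with xlwt, then loop over the data in each of the other three files and add each row to the output file. To get you started: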

import xlwt
import xlrd

wkbk = xlwt.Workbook()
outsheet = wkbk.add_sheet('Sheet1')

xlsfiles = [r'C:\foo.xlsx', r'C:\bar.xlsx', r'C:\baz.xlsx']

outrow_idx = 0
for f in xlsfiles:
    # This is all untested; essentially just pseudocode for concept!
    # Note: xlrd 2.0+ reads only .xls files, so opening .xlsx inputs
    # this way needs an older xlrd (1.x).
    insheet = xlrd.open_workbook(f).sheets()[0]
    for row_idx in range(insheet.nrows):
        for col_idx in range(insheet.ncols):
            outsheet.write(outrow_idx, col_idx,
                           insheet.cell_value(row_idx, col_idx))
        outrow_idx += 1
wkbk.save(r'C:\combined.xls')

If your files all have a header row, you probably don't want to repeat it, so you could modify the code above to look more like this:

firstfile = True # Is this the first sheet?
for f in xlsfiles:
    insheet = xlrd.open_workbook(f).sheets()[0]
    for row_idx in range(0 if firstfile else 1, insheet.nrows):
        pass # processing; etc
    firstfile = False # We're done with the first sheet.
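Since the question asks about openpyxl, the same row-by-row loop could be sketched with it instead. This is a minimal sketch rather than a drop-in solution: `append_workbooks` and its `skip_repeated_header` parameter are hypothetical names, only the first worksheet of each file is read, and only cell contents are copied (no formatting).

```python
from openpyxl import Workbook, load_workbook


def append_workbooks(paths, out_path, skip_repeated_header=True):
    """Append the first worksheet of each input file into one new workbook.

    Hypothetical helper. When skip_repeated_header is True, row 1 of
    every file after the first is treated as a header and left out.
    """
    out_wb = Workbook()
    out_ws = out_wb.active
    for file_idx, path in enumerate(paths):
        in_ws = load_workbook(path).worksheets[0]
        # openpyxl rows are 1-based: start at row 2 to skip a header
        first_row = 2 if (skip_repeated_header and file_idx > 0) else 1
        for row in in_ws.iter_rows(min_row=first_row, values_only=True):
            out_ws.append(row)
    out_wb.save(out_path)
```

For example, `append_workbooks(["foo.xlsx", "bar.xlsx", "baz.xlsx"], "combined.xlsx")` would write a single sheet holding one header row followed by the data rows of all three files, in the order given.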


小智 6

When I combine Excel files (mydata1.xlsx, mydata2.xlsx, mydata3.xlsx) for data analysis, this is how I do it:

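import pandas as pd
import glob

# Read each matching workbook into its own DataFrame, then stack them.
# (DataFrame.append was removed in pandas 2.0; pd.concat does the same job.)
frames = [pd.read_excel(f) for f in glob.glob('myfolder/mydata*.xlsx')]
all_data = pd.concat(frames, ignore_index=True)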
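One caveat worth knowing: `glob.glob` returns matches in arbitrary order, so the combined rows may not follow file-name order. A small sketch that sorts the matches first; `find_excel_files` is a hypothetical helper, and the default pattern is the one used above.

```python
from pathlib import Path


def find_excel_files(folder, pattern="mydata*.xlsx"):
    """Return the files in `folder` matching `pattern`, sorted by path."""
    return sorted(Path(folder).glob(pattern))
```

The result of `find_excel_files('myfolder')` can be passed to `pd.read_excel` one path at a time. The sort is lexicographic, so a file named mydata10.xlsx would come before mydata2.xlsx.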

Then, when I want to save it as a single file:

# The with-block saves and closes the file on exit
# (ExcelWriter.save() was removed in pandas 2.0).
with pd.ExcelWriter('mycollected_data.xlsx', engine='xlsxwriter') as writer:
    all_data.to_excel(writer, sheet_name='Sheet1')