dmv*_*nna 8 python excel pandas
I can open a password-protected Excel file with the following code:
import sys
import win32com.client
xlApp = win32com.client.Dispatch("Excel.Application")
print("Excel library version:", xlApp.Version)
filename, password = sys.argv[1:3]
xlwb = xlApp.Workbooks.Open(filename, Password=password)
# xlwb = xlApp.Workbooks.Open(filename)
xlws = xlwb.Sheets(1) # counts from 1, not from 0
print(xlws.Name)
print(xlws.Cells(1, 1)) # that's A1
I'm not sure how to get the data into a pandas DataFrame. Do I need to read the cells one by one, or is there a more convenient way?
Suh*_*ote 17
import io
import pandas as pd
import msoffcrypto
passwd = 'xyz'
# Decrypt the workbook into an in-memory buffer instead of a file on disk
decrypted_workbook = io.BytesIO()
with open(path_to_your_file, 'rb') as file:
    office_file = msoffcrypto.OfficeFile(file)
    office_file.load_key(password=passwd)
    office_file.decrypt(decrypted_workbook)
df = pd.read_excel(decrypted_workbook, sheet_name='abc')
pip install --user msoffcrypto-tool
import os  # needed for os.walk and os.path.join below
from glob import glob
PATH = "Active Cons data"
# Scan all the Excel files in PATH and its sub-directories
excel_files = [y for x in os.walk(PATH) for y in glob(os.path.join(x[0], '*.xlsx'))]
for excel_file in excel_files:
    print(excel_file)
    decrypted_workbook = io.BytesIO()
    with open(excel_file, 'rb') as file:
        office_file = msoffcrypto.OfficeFile(file)
        office_file.load_key(password=passwd)
        office_file.decrypt(decrypted_workbook)
    # sheet_name=None returns a dict mapping each sheet name to its DataFrame
    sheets = pd.read_excel(decrypted_workbook, sheet_name=None)
    print(list(sheets))  # list of sheet names
    for sheet, df in sheets.items():
        new_file = f"D:\\all_csv\\{sheet}.csv"
        df.to_csv(new_file, index=False)
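One thing to watch in the loop above: the CSV name is built from the sheet name only, so two workbooks that both contain a sheet with the same name will write to the same file, and the later one overwrites the earlier one. A minimal sketch of one way around this, assuming it is acceptable to prefix the workbook's file name (`csv_path_for` is a hypothetical helper, not part of pandas or msoffcrypto):

```python
import os

def csv_path_for(workbook_path, sheet_name, out_dir):
    """Return an output path that includes the workbook's base name.

    Hypothetical helper for illustration: "book1.xlsx" + "Sheet1"
    becomes "<out_dir>/book1__Sheet1.csv".
    """
    # Strip the directory and the extension from the workbook path
    stem = os.path.splitext(os.path.basename(workbook_path))[0]
    return os.path.join(out_dir, f"{stem}__{sheet_name}.csv")

# Example with made-up paths
example = csv_path_for("data/2020/book1.xlsx", "Sheet1", "all_csv")
```

Workbooks with identical file names in different sub-directories would still collide under this scheme, so a real script might also need to include part of the directory path.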
小智 6
Assuming the starting cell is given as (StartRow, StartCol) and the ending cell as (EndRow, EndCol), I found that the following works for me:
# Get the content in the rectangular selection region
# content is a tuple of tuples
content = xlws.Range(xlws.Cells(StartRow, StartCol), xlws.Cells(EndRow, EndCol)).Value
# Transfer content to pandas dataframe
dataframe = pandas.DataFrame(list(content))
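Note: Excel cell B5 is addressed as row 5, column 2 in win32com. Also, we need list(...) to convert the tuple of tuples into a list of tuples, because there is no pandas.DataFrame constructor for a tuple of tuples.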
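If the first row of the selected region holds column headers, it can be split off before building the DataFrame. The sketch below assumes the range value is a tuple of row tuples, as described above; the sample data and the `range_to_dataframe` helper are made up for illustration and stand in for a real `xlws.Range(...).Value` result.

```python
import pandas as pd

# Stand-in for the value of a multi-cell range: a tuple of row tuples.
# These values are invented for illustration.
content = (
    ("name", "qty"),
    ("apple", 3.0),
    ("pear", 5.0),
)

def range_to_dataframe(content, header=True):
    """Build a DataFrame from a tuple of row tuples (hypothetical helper).

    When header is True, the first row supplies the column names.
    """
    rows = list(content)  # tuple of tuples -> list of tuples
    if header and rows:
        return pd.DataFrame(rows[1:], columns=list(rows[0]))
    return pd.DataFrame(rows)

df = range_to_dataframe(content)
```

With `header=False` the result matches the plain `pandas.DataFrame(list(content))` call, with integer column labels.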
小智 5
From David Hamann's site (all credit goes to him): https://davidhamann.de/2018/02/21/read-password-protected-excel-files-into-pandas-dataframe/
With xlwings, opening the file first launches the Excel application so that you can enter the password.
import pandas as pd
import xlwings as xw
PATH = '/Users/me/Desktop/xlwings_sample.xlsx'
wb = xw.Book(PATH)
sheet = wb.sheets['sample']
df = sheet['A1:C4'].options(pd.DataFrame, index=False, header=True).value
df