小编J.K*_*.K.的帖子

使用pyODBC的fast_executemany加速pandas.DataFrame.to_sql

我想发送一个大型pandas.DataFrame到运行MS SQL的远程服务器.我现在的方法是将data_frame对象转换为元组列表,然后使用pyODBC的executemany()函数将其发送出去.它是这样的:

 import pyodbc as pdb

 list_of_tuples = convert_df(data_frame)

 connection = pdb.connect(cnxn_str)

 cursor = connection.cursor()
 cursor.fast_executemany = True
 cursor.executemany(sql_statement, list_of_tuples)
 connection.commit()

 cursor.close()
 connection.close()

Run Code Online (Sandbox Code Playgroud)

然后我开始怀疑使用data_frame.to_sql()方法是否可以加速(或至少更具可读性).我想出了以下解决方案:

 import sqlalchemy as sa

 engine = sa.create_engine("mssql+pyodbc:///?odbc_connect=%s" % cnxn_str)
 data_frame.to_sql(table_name, engine, index=False)

Run Code Online (Sandbox Code Playgroud)

现在代码更具可读性,但上传速度至少慢150倍 ......

有没有办法fast_executemany在使用SQLAlchemy时翻转？

我正在使用pandas-0.20.3,pyODBC-4.0.21和sqlalchemy-1.1.13.

python sqlalchemy pyodbc pandas-to-sql

J.K*_*.K.

2018 05-16

43
推荐指数

7
解决办法

3万
查看次数