Ale*_*dro 2 python csv netezza export-to-csv
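I'm trying to export a large file from Netezza (using the Netezza ODBC driver + pyodbc). This solution raises a MemoryError, and if I loop without the "list" it is very slow. Do you know of an intermediate solution that won't kill my server/Python process but runs faster?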
cursorNZ.execute(sql)
archi = open("c:\test.csv", "w")
lista = list(cursorNZ.fetchall())
for fila in lista:
    registro = ''
    for campo in fila:
        campo = str(campo)
        registro = registro+str(campo)+";"
    registro = registro[:-1]
    registro = registro.replace('None','NULL')
    registro = registro.replace("'NULL'","NULL")
    archi.write(registro+"\n")
--- EDIT ---
Thanks, I'm trying the following, where "sql" is the query and cursorNZ is:
connNZ = pyodbc.connect(DRIVER=.....)
cursorNZ = connNZ.cursor()
chunk = 10 ** 5 # tweak this
chunks = pandas.read_sql(sql, cursorNZ, chunksize=chunk)
with open('C:/test.csv', 'a') as output:
    for n, df in enumerate(chunks):
        write_header = n == 0
        df.to_csv(output, sep=';', header=write_header, na_rep='NULL')
I get this: AttributeError: 'pyodbc.Cursor' object has no attribute 'cursor'. Any idea?
Don't use cursorNZ.fetchall().
Instead, loop over the cursor directly:
with open("c:/test.csv", "w") as archi: # note the fixed '/'
    cursorNZ.execute(sql)
    for fila in cursorNZ:
        registro = ''
        for campo in fila:
            campo = str(campo)
            registro = registro+str(campo)+";"
        registro = registro[:-1]
        registro = registro.replace('None','NULL')
        registro = registro.replace("'NULL'","NULL")
        archi.write(registro+"\n")
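If row-by-row iteration is still too slow, one possible middle ground is to pull rows in batches with the DB-API `fetchmany` method and let the standard `csv` module build each line. The following is only a sketch under assumptions: the function name `export_cursor_to_csv`, the `batch_size` default, and the `null` parameter are hypothetical, not part of the original code. It also behaves slightly differently from the loop above: only real `None` values become `NULL` (a text field that happens to contain "None" is left alone), and fields containing the delimiter are quoted.

```python
import csv


def export_cursor_to_csv(cursor, path, batch_size=10_000, delimiter=";", null="NULL"):
    """Stream the rows of an already-executed DB-API cursor to a delimited file.

    Only `batch_size` rows are held in memory at a time. Returns the row count.
    """
    total = 0
    # newline="" lets the csv module control line endings itself.
    with open(path, "w", newline="", encoding="utf-8") as out:
        writer = csv.writer(out, delimiter=delimiter, lineterminator="\n")
        while True:
            rows = cursor.fetchmany(batch_size)
            if not rows:
                break  # result set exhausted
            # Replace database NULLs (None in Python) with the chosen marker.
            writer.writerows(
                [null if value is None else value for value in row] for row in rows
            )
            total += len(rows)
    return total
```

With the names from the question, usage would look like `cursorNZ.execute(sql)` followed by `export_cursor_to_csv(cursorNZ, "c:/test.csv")`. The best `batch_size` depends on row width and available memory, so it is worth experimenting with.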
Personally, I would just use pandas:
import pyodbc
import pandas

cnn = pyodbc.connect(DRIVER=.....)
chunksize = 10 ** 5 # tweak this
chunks = pandas.read_sql(sql, cnn, chunksize=chunksize)
with open('C:/test.csv', 'a') as output:
    for n, df in enumerate(chunks):
        write_header = n == 0
        df.to_csv(output, sep=';', header=write_header, na_rep='NULL')
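Note that `pandas.read_sql` takes the connection, not a cursor. Passing `cursorNZ` is the likely cause of the `AttributeError: 'pyodbc.Cursor' object has no attribute 'cursor'` in the question's edit, since pandas tries to create its own cursor from whatever object it receives. The chunked export can be wrapped in a small helper, sketched below. The function name `export_query_to_csv` is hypothetical, and `index=False` and write mode `"w"` are my own choices here (the snippet above appends and writes the DataFrame index as a first column). Recent pandas versions may also warn when given a raw DB-API connection other than sqlite3 and suggest SQLAlchemy instead.

```python
import pandas as pd


def export_query_to_csv(sql, connection, path, chunksize=100_000):
    """Run `sql` on `connection` and write the result to `path` chunk by chunk.

    `connection` must be a connection object (not a cursor). Returns the row count.
    """
    rows = 0
    with open(path, "w", newline="", encoding="utf-8") as output:
        chunks = pd.read_sql(sql, connection, chunksize=chunksize)
        for n, df in enumerate(chunks):
            # Write the header only once, with the first chunk.
            df.to_csv(output, sep=";", header=(n == 0), index=False, na_rep="NULL")
            rows += len(df)
    return rows
```

With the names from the snippet above, this would be called as `export_query_to_csv(sql, cnn, "C:/test.csv")`.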