对于python/pandas,我发现df.to_csv(fname)的工作速度约为每分钟1百万行.我有时可以将性能提高7倍,如下所示:
def df2csv(df,fname,myformats=[],sep=','):
"""
# function is faster than to_csv
# 7 times faster for numbers if formats are specified,
# 2 times faster for strings.
# Note - be careful. It doesn't add quotes and doesn't check
# for quotes or separators inside elements
# We've seen output time going down from 45 min to 6 min
# on a simple numeric 4-col dataframe with 45 million rows.
"""
if len(df.columns) <= 0:
return
Nd = len(df.columns)
Nd_1 = …Run Code Online (Sandbox Code Playgroud)