Cof*_*Liu 5 python dataframe dask
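I am trying to read a CSV file with dask, and it gives me the error below. The problem is that I want ARTICLE_ID to be object (string). Can anyone help me read the data successfully?
The traceback is as follows: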
ValueError: Mismatched dtypes found in `pd.read_csv`/`pd.read_table`.
+------------+--------+----------+
| Column | Found | Expected |
+------------+--------+----------+
| ARTICLE_ID | object | int64 |
+------------+--------+----------+
The following columns also raised exceptions on conversion:
ARTICLE_ID:
ValueError("invalid literal for int() with base 10: ' July 2007 and 31 March 2008. Diagnostic practices of the medical practitioners for establishing the diagnosis of different types of EPTB were studied. Results: For the diagnosi\\\\'",)
Usually this is due to dask's dtype inference failing, and
*may* be fixed by specifying dtypes manually by adding:
dtype={'ARTICLE_ID': 'object'}
to the call to `read_csv`/`read_table`.
The message suggests that you change your call from
df = dd.read_csv('mylocation.csv', ...)
to
df = dd.read_csv('mylocation.csv', ..., dtype={'ARTICLE_ID': 'object'})
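Here you should substitute your own file location and keep whatever other arguments you were passing before. If this still does not solve the problem, please update your question.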