Writing a pandas DataFrame containing Unicode to JSON

Swi*_*ier 11 python unicode json pandas

I'm trying to write a pandas DataFrame containing Unicode characters to JSON, but the built-in .to_json function escapes the characters. How do I fix this?

Example:

import pandas as pd
df = pd.DataFrame([['τ', 'a', 1], ['π', 'b', 2]])
df.to_json('df.json')

This gives:

{"0":{"0":"\u03c4","1":"\u03c0"},"1":{"0":"a","1":"b"},"2":{"0":1,"1":2}}

This differs from the desired result:

{"0":{"0":"τ","1":"π"},"1":{"0":"a","1":"b"},"2":{"0":1,"1":2}}

I tried adding the force_ascii=False argument:

import pandas as pd
df = pd.DataFrame([['τ', 'a', 1], ['π', 'b', 2]])
df.to_json('df.json', force_ascii=False)

But this raises the following error:

UnicodeEncodeError: 'charmap' codec can't encode character '\u03c4' in position 11: character maps to <undefined>

I'm using WinPython 3.4.4.2 64bit and pandas 0.18.0.

Swi*_*ier 17

Opening the file with its encoding set to utf-8 and then passing that file object to .to_json fixes the problem:

with open('df.json', 'w', encoding='utf-8') as file:
    df.to_json(file, force_ascii=False)

This gives the correct output:

{"0":{"0":"τ","1":"π"},"1":{"0":"a","1":"b"},"2":{"0":1,"1":2}}

Note: the force_ascii=False argument is still required.
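
A likely explanation for the original error is that, when given a path, the file is opened with the platform's default encoding (a legacy code page on Windows), which cannot represent these characters. As a variation on the same idea, here is a minimal sketch that lets .to_json return the JSON as a string (which it does when no path is passed) and writes it out with an explicit encoding, then reads it back to check the result. The temporary directory and the names text, path, raw and loaded are only for illustration and are not part of the original code.

```python
import json
import os
import tempfile

import pandas as pd

df = pd.DataFrame([['τ', 'a', 1], ['π', 'b', 2]])

# With no path argument, to_json returns the JSON text as a str.
text = df.to_json(force_ascii=False)

# Write the text ourselves so the encoding is explicit, not platform-dependent.
# (A temporary directory is used here only to keep the sketch self-contained.)
path = os.path.join(tempfile.mkdtemp(), 'df.json')
with open(path, 'w', encoding='utf-8') as fh:
    fh.write(text)

# Read the file back to confirm the characters were stored unescaped.
with open(path, encoding='utf-8') as fh:
    raw = fh.read()
loaded = json.loads(raw)
print(raw)
```

Either way, the point is the same: the encoding is chosen explicitly when the file is opened, and force_ascii=False keeps pandas from escaping the characters.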