kav*_*kav 6 python dataframe pandas
I have a list of dicts that I am converting to a DataFrame. When I try to pass the columns argument, the output values are all NaN.
# This code does not produce the desired output
import pandas as pd

l = [{'a': 1, 'b': 2}, {'a': 3, 'b': 4}]
pd.DataFrame(l, columns=['c', 'd'])
    c   d
0 NaN NaN
1 NaN NaN
# This code does produce the desired output
l = [{'a': 1, 'b': 2}, {'a': 3, 'b': 4}]
df = pd.DataFrame(l)
df.columns = ['c', 'd']
df
   c  d
0  1  2
1  3  4
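Why does this happen?
Because when a list of dictionaries is passed to the DataFrame constructor, the column names are created from the dictionary keys: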
l = [{'a': 1, 'b': 2}, {'a': 3, 'b': 4}]
print(pd.DataFrame(l))
   a  b
0  1  2
1  3  4
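If you also pass the columns parameter, it filters columns by the dictionary keys. Names that match a key select that column; names that do not exist among the keys create columns filled with missing values. The column order follows the order of the list of column names: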
# changed order works, because keys a and b exist in at least one dictionary
print(pd.DataFrame(l, columns=['b', 'a']))
   b  a
0  2  1
1  4  3
# a is selected, d is filled with missing values because key d is not in any dictionary
print(pd.DataFrame(l, columns=['a', 'd']))
   a   d
0  1 NaN
1  3 NaN
# b is selected, c is filled with missing values because key c is not in any dictionary
print(pd.DataFrame(l, columns=['c', 'b']))
    c  b
0 NaN  2
1 NaN  4
# a and b are selected, c and d are filled with missing values because those keys are not in any dictionary
print(pd.DataFrame(l, columns=['c', 'd', 'a', 'b']))
    c   d  a  b
0 NaN NaN  1  2
1 NaN NaN  3  4
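The outputs above suggest that columns= selects by label instead of renaming by position. A minimal sketch to check this programmatically follows. The comparison with reindex is my own analogy for illustration, not something the pandas documentation states about the constructor's internals.

```python
import pandas as pd

l = [{'a': 1, 'b': 2}, {'a': 3, 'b': 4}]

# columns= looks up each name among the dictionary keys.
selected = pd.DataFrame(l, columns=['c', 'b'])

# 'b' matches a key, so its values are kept.
print(selected['b'].tolist())

# 'c' matches no key, so every value in that column is missing.
print(selected['c'].isna().all())

# Assumed analogy: build the frame first, then select labels with reindex.
reindexed = pd.DataFrame(l).reindex(columns=['c', 'b'])
print(list(reindexed.columns))
```

In both cases the labels pick existing columns and unknown labels produce all-missing columns, which is why the question's first snippet returns only NaN.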
So if you want different column names, you need to rename them, or set new names as in your second code snippet.
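As a rough sketch of both options, assuming the same list l as in the question: rename maps old names to new names explicitly, while assigning to df.columns replaces all names by position.

```python
import pandas as pd

l = [{'a': 1, 'b': 2}, {'a': 3, 'b': 4}]

# Option 1: map each old name to a new name explicitly.
df1 = pd.DataFrame(l).rename(columns={'a': 'c', 'b': 'd'})

# Option 2: replace all names by position, as in the question's second snippet.
df2 = pd.DataFrame(l)
df2.columns = ['c', 'd']

print(df1)
print(df1.equals(df2))
```

The rename mapping is usually safer when the key order of the dictionaries is not guaranteed, because it does not depend on column position.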