Consider the DataFrame df:
   A  B  C
0  3  2  1
1  4  2  3
2  1  4  1
3  2  2  3
I want to add another column "D" such that D contains a different list depending on various conditions on "A", "B" and "C":
   A  B  C      D
0  3  2  1  [1,0]
1  4  2  3  [1,0]
2  1  4  1  [0,2]
3  2  2  3  [2,0]
My code snippet looks like this:
df['D'] = 0
df['D'] = df['D'].astype(object)
df.loc[(df['A'] > 1) & (df['B'] > 1), "D"] = [1,0]
df.loc[(df['A'] == 1), "D"] = [0,2]
df.loc[(df['A'] == 2) & (df['C'] != 0), "D"] = [2,0]
When I try to run this code, it raises the following error:
ValueError: Must have equal len keys and value when setting with an iterable
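I have converted the column to object dtype as suggested here, but I still get the error.
What I can infer is that pandas is trying to iterate over the elements of the list and assign each value to one cell, whereas I am trying to assign the whole list to every cell that meets the condition.
Is there any way to assign lists in the manner described above?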
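The inference above can be checked with a small sketch, assuming the same df as in the question. When the mask selects exactly as many rows as the list has elements, pandas accepts the assignment and spreads the elements over the rows; when the counts differ, it raises the ValueError. The names `two_rows`, `three_rows`, `spread` and `raised` are illustrative only.

```python
import pandas as pd

df = pd.DataFrame({"A": [3, 4, 1, 2], "B": [2, 2, 4, 2], "C": [1, 3, 1, 3]})
df["D"] = 0
df["D"] = df["D"].astype(object)

# Mask matching exactly two rows (index 0 and 1): the list length equals
# the number of selected cells, so pandas assigns one element per row.
two_rows = df["A"] > 2
df.loc[two_rows, "D"] = [1, 0]
spread = df["D"].tolist()

# Mask matching three rows (index 0, 1 and 3): the lengths differ,
# so pandas refuses the assignment.
three_rows = (df["A"] > 1) & (df["B"] > 1)
try:
    df.loc[three_rows, "D"] = [1, 0]
    raised = False
except ValueError as exc:
    raised = True
    print(exc)
```

So the object dtype alone does not help: the right-hand list is still treated as a sequence of per-row values rather than as a single cell value.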
Another solution is to create a Series filled with the list, repeated df.shape[0] times so that its length matches df:
df.loc[(df['A'] > 1) & (df['B'] > 1), "D"] = pd.Series([[1,0]]*df.shape[0])
df.loc[(df['A'] == 1), "D"] = pd.Series([[0,2]]*df.shape[0])
df.loc[(df['A'] == 2) & (df['C'] != 0), "D"] = pd.Series([[2,0]]*df.shape[0])
print(df)
   A  B  C       D
0  3  2  1  [1, 0]
1  4  2  3  [1, 0]
2  1  4  1  [0, 2]
3  2  2  3  [2, 0]
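The Series-based assignment works because .loc aligns the right-hand Series with the selected rows by index label. It therefore relies on df having the default 0..n-1 index, and `[[1,0]]*n` repeats one list object n times rather than creating n separate lists. Below is a minimal sketch of a variant, assuming the same data as in the question, that builds the Series on the matched labels only and gives each row its own list. `assign_list` is a hypothetical helper, not a pandas function, and the letter index is made up to exercise label alignment.

```python
import pandas as pd

def assign_list(df, mask, column, values):
    """Write a fresh copy of `values` into every row of `column` selected by `mask`."""
    labels = df.index[mask.to_numpy()]
    # One independent list per matched row, indexed by the matched labels,
    # so .loc can align on labels whatever the index looks like.
    per_row = pd.Series([list(values) for _ in labels], index=labels, dtype=object)
    df.loc[mask, column] = per_row

df = pd.DataFrame(
    {"A": [3, 4, 1, 2], "B": [2, 2, 4, 2], "C": [1, 3, 1, 3]},
    index=list("wxyz"),  # hypothetical non-default index
)
df["D"] = 0
df["D"] = df["D"].astype(object)

assign_list(df, (df["A"] > 1) & (df["B"] > 1), "D", [1, 0])
assign_list(df, df["A"] == 1, "D", [0, 2])
assign_list(df, (df["A"] == 2) & (df["C"] != 0), "D", [2, 0])

result = df["D"].tolist()
print(df)
```

As in the original snippet, later assignments overwrite earlier ones, so the last row first receives [1, 0] and then [2, 0].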