Duplicating a row based on a condition inside the pandas apply method

Question

Here is an example of my df:

import pandas as pd

df = pd.DataFrame([["1", "2"], ["1", "2"], ["3", "other_value"]],
                  columns=["a", "b"])
    a   b
0   1   2
1   1   2
2   3   other_value


I want to end up with this:

pd.DataFrame([["1", "2"], ["1", "2"], ["3", "other_value"], ["3", "row_duplicated_with_edits_in_this_column"]],
                     columns=["a", "b"])
    a   b
0   1   2
1   1   2
2   3   other_value
3   3   row_duplicated_with_edits_in_this_column


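The requirement is to use the apply method and run some checks (I have left them out to keep the example simple). Under certain conditions, for some rows inside the applied function, I need to duplicate the row, edit the copy, and have both rows end up in the df.

So something like this: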

def f(row):
    # `condition` stands in for the real checks, which are omitted here
    if condition:
        row["a"] = 3
    elif condition:
        row["a"] = 4
    elif condition:
        row_duplicated = row.copy()
        row_duplicated["a"] = 5  # I also need this row to be included in the df

    return row

df.apply(f, axis=1)


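I don't want to store the duplicated rows somewhere in a class and append them at the end. I want to do it on the fly.

I have seen "pandas: apply function to DataFrame that can return multiple rows", but I am not sure whether groupby can help me here.

Thanks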

Answer 1

cs9*_*s95 3

Here is one way, using df.iterrows inside a list comprehension. You need to collect the rows produced in the loop and then concatenate them.

def func(row):
    if row['a'] == "3":
        row2 = row.copy()
        # make edits to row2
        return pd.concat([row, row2], axis=1)
    return row

pd.concat([func(row) for _, row in df.iterrows()], ignore_index=True, axis=1).T

   a            b
0  1            2
1  1            2
2  3  other_value
3  3  other_value
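
The output above still shows other_value in the last row because the edit to row2 is left as a comment. Below is a minimal sketch of the same idea with the edit filled in. It is a variation, not the answer's exact code: the helper name `expand` is hypothetical, the `row["a"] == "3"` test stands in for the real checks, and the rows are collected in a flat list and passed to `pd.DataFrame` instead of being concatenated column-wise and transposed.

```python
import pandas as pd

df = pd.DataFrame([["1", "2"], ["1", "2"], ["3", "other_value"]],
                  columns=["a", "b"])


def expand(row):
    """Return the rows that should replace `row` (one or two of them)."""
    # Placeholder condition; the real checks from the question go here.
    if row["a"] == "3":
        row2 = row.copy()
        row2["b"] = "row_duplicated_with_edits_in_this_column"
        return [row, row2]
    return [row]


# Flatten the per-row lists into one list of Series, in original order.
rows = [new_row for _, row in df.iterrows() for new_row in expand(row)]

# Each Series becomes one row; reset_index renumbers the rows 0..n-1.
result = pd.DataFrame(rows).reset_index(drop=True)
print(result)
```

Like the answer's version, this loops over rows in Python, so it is probably best suited to small or medium frames.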


I found that in my case it works better without ignore_index=True, because I merge the 2 dfs later.
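
To illustrate the difference that comment refers to, here is a small sketch that runs the answer's approach both ways. The variable names `kept` and `renumbered` are made up for this example, and the edit to row2 is an assumed one taken from the question's desired output. Without ignore_index=True, the duplicated row keeps the label of the row it was copied from, which may be what a later merge or lookup relies on.

```python
import pandas as pd

df = pd.DataFrame([["1", "2"], ["1", "2"], ["3", "other_value"]],
                  columns=["a", "b"])


def func(row):
    if row["a"] == "3":
        row2 = row.copy()
        # Assumed edit, taken from the desired output in the question.
        row2["b"] = "row_duplicated_with_edits_in_this_column"
        return pd.concat([row, row2], axis=1)
    return row


pieces = [func(row) for _, row in df.iterrows()]

# ignore_index=True discards the original labels and renumbers the rows.
renumbered = pd.concat(pieces, ignore_index=True, axis=1).T

# Without it, each output row keeps the label of its source row,
# so the duplicate shares a label with the row it came from.
kept = pd.concat(pieces, axis=1).T

print(renumbered.index.tolist())
print(kept.index.tolist())

# Both rows derived from original row 2 can be selected by that label.
print(kept.loc[2])
```

Note that `kept` has a non-unique index, so label-based selection can return several rows for one label.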

Archived:	7 years, 1 month ago
Views:	3,979
Last activity:	7 years, 1 month ago