Most computationally efficient way to create new rows in a Pandas dataframe using function operations on each row's columns?

Question

Suppose I have a dataframe with two columns containing integers.
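For reference, a dataframe consistent with the output printed in the answer can be built as in the sketch below. The values (A = 3, 4, 6, 7 and B = 3, 6, 4, 4) are inferred from that output and are an assumption, not taken from the original post.

```python
import pandas as pd

# Values inferred from the answer's printed output (an assumption)
df = pd.DataFrame({"A": [3, 4, 6, 7], "B": [3, 6, 4, 4]})
print(df)
```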


I want to write a function that creates new rows from the existing columns:

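def new_rows(row):
    # Yield one output row per idx in range(row['A']), with C = idx * B
    for idx in range(row['A']):
        new_row = row.copy()  # copy so each yielded row keeps its own C
        new_row['C'] = idx * row['B']
        yield new_row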


So the resulting dataframe would be:

A  B   C
3  3   0
3  3   3
3  3   6
4  6   0
4  6   6
4  6  12
4  6  18
6  4   0
...


As far as I know, pandas map and apply can be used to create new columns, but not additional rows.

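The best solution I can think of is to use pandas iterrows to apply the operation during iteration, save all the values to a list of dictionaries, and then create a pandas dataframe from that list.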
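The iterrows approach described above could look roughly like the sketch below. The helper name expand_with_iterrows is hypothetical, and the sample data is an assumption inferred from the output printed in the answer.

```python
import pandas as pd

# Example data (assumed; inferred from the output shown in the answer)
df = pd.DataFrame({"A": [3, 4, 6, 7], "B": [3, 6, 4, 4]})

def expand_with_iterrows(data):
    """Build one output row per idx in range(A), with C = idx * B."""
    records = []
    for _, row in data.iterrows():
        for idx in range(int(row["A"])):
            # Collect each new row as a plain dict
            records.append({"A": row["A"], "B": row["B"], "C": idx * row["B"]})
    # Build the dataframe once, after the loop
    return pd.DataFrame(records)

result = expand_with_iterrows(df)
print(result.head(7))
```

This is easy to read, but the nested loop runs in Python for every output row, so it is likely to be slow on large dataframes.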

Answer 1

ank*_*_91 5

You can solve this in a vectorized way: use Index.repeat on df.A to repeat each row, then groupby.cumcount to generate the range, and multiply it by B:

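def myf(data):
    # Repeat each row A times by repeating its index label
    a = data.loc[data.index.repeat(data['A'])]
    # cumcount numbers the rows within each group 0, 1, 2, ...;
    # grouping by "A" assumes the values in A are unique
    a['C'] = a.groupby("A").cumcount() * a['B']
    return a.reset_index(drop=True)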


print(myf(df))


    A  B   C
0   3  3   0
1   3  3   3
2   3  3   6
3   4  6   0
4   4  6   6
5   4  6  12
6   4  6  18
7   6  4   0
8   6  4   4
9   6  4   8
10  6  4  12
11  6  4  16
12  6  4  20
13  7  4   0
14  7  4   4
15  7  4   8
16  7  4  12
17  7  4  16
18  7  4  20
19  7  4  24
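The groupby("A") step in the answer numbers rows within each distinct value of A, so it relies on those values being unique: two source rows with the same A would share one running count. A minimal sketch of a variant that avoids this by computing each row's position within its repeat using NumPy follows; expand_rows and the sample frame dup are hypothetical names, not part of the original answer.

```python
import numpy as np
import pandas as pd

def expand_rows(data):
    """Repeat each row A times and set C = (position within the repeat) * B."""
    counts = data["A"].to_numpy()
    # Repeat rows by position, so the original index does not matter
    out = data.iloc[np.repeat(np.arange(len(data)), counts)].reset_index(drop=True)
    # Offset of each source row's first output row, repeated per output row
    starts = np.repeat(np.cumsum(counts) - counts, counts)
    # Position within the repeat: 0, 1, ..., A-1 for every source row
    out["C"] = (np.arange(counts.sum()) - starts) * out["B"].to_numpy()
    return out

# Two rows share A == 3 here, which a groupby on "A" would merge into one group
dup = pd.DataFrame({"A": [3, 3, 2], "B": [3, 5, 4]})
print(expand_rows(dup))
```

Because the counter is derived from row positions rather than from the values in A, each source row restarts at 0 regardless of repeated values.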
