Split items into rows in pandas

Question

ML8*_*L85 5 python numpy dataframe pandas

I have data in a DataFrame as shown below. I want to split each item into the same number of rows as its value.

>>> df
idx  a  
0  3  
1  5  
2  4
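
For anyone who wants to reproduce this, here is a minimal sketch that builds the frame shown above. It assumes `idx` is the name of the index and `a` is an integer column, which is what the printout suggests but the question does not state outright.

```python
import pandas as pd

# Assumption: idx is the index name and a holds positive integers.
df = pd.DataFrame({"a": [3, 5, 4]}).rename_axis("idx")
print(df)
```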

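From the DataFrame above, I want the following: each value n in a expanded into the rows 1, 2, ..., n, with a fresh running index.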

I tried several approaches, but none of them worked.

Answer 1

ank*_*_91 5

Here is an approach using Series.repeat plus GroupBy.cumcount. It assumes idx is the index; if it is not, start from df.set_index('idx')['a'] and chain the rest of the code onto that.

(df['a'].repeat(df['a']).groupby(level=0).cumcount().add(1)
        .reset_index(drop=True).rename_axis('idx'))

idx
0     1
1     2
2     3
3     1
4     2
5     3
6     4
7     5
8     1
9     2
10    3
11    4
dtype: int64
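
The chain above can be read one step at a time. The sketch below splits it into named intermediates; the variable names `repeated`, `counter`, and `result` are only illustrative, and the frame is rebuilt under the assumption that `idx` is the index.

```python
import pandas as pd

# Assumption: idx is the index, as in the answer above.
df = pd.DataFrame({"a": [3, 5, 4]}).rename_axis("idx")

# Step 1: repeat each value n exactly n times; the index label is repeated with it,
# so the index becomes 0,0,0,1,1,1,1,1,2,2,2,2.
repeated = df["a"].repeat(df["a"])

# Step 2: number the rows within each original label (0-based), then shift to 1-based.
counter = repeated.groupby(level=0).cumcount().add(1)

# Step 3: swap the duplicated labels for a fresh 0..n-1 index named idx.
result = counter.reset_index(drop=True).rename_axis("idx")
print(result.tolist())  # [1, 2, 3, 1, 2, 3, 4, 5, 1, 2, 3, 4]
```

Grouping on `level=0` works here only because the repeated rows still carry their original index label, which is why the reset has to come last.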

Answer 2

WeN*_*Ben 5

A fun way:

df.a.map(range).explode()+1 # you could add reset_index(), but I think keeping the original index is good, since it helps us convert back.
Out[158]: 
idx
0    1
0    2
0    3
1    1
1    2
1    3
1    4
1    5
2    1
2    2
2    3
2    4
Name: a, dtype: object

You may want `reset_index`, since the OP needs a unique index @YOBEN_S (3 upvotes)
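
Following up on that comment, here is a rough sketch of what the extra step could look like. The `astype("int64")` cast is my own addition, since `explode` returns an object-dtype Series; the name `unique` is only illustrative.

```python
import pandas as pd

# Assumption: idx is the index, as in the question's printout.
df = pd.DataFrame({"a": [3, 5, 4]}).rename_axis("idx")

# The answer's expression: object dtype, repeated index labels 0,0,0,1,...
exploded = df["a"].map(range).explode() + 1

# Cast back to integers and replace the repeated labels with a unique 0..n-1 index.
unique = exploded.astype("int64").reset_index(drop=True).rename_axis("idx")
print(unique.index.is_unique)  # True
```

Note that a value of 0 in `a` would explode to NaN, so the integer cast would need extra handling in that case.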

Answer 3

yat*_*atu 5

Here is a NumPy-based one:

a = (np.arange(df.a.max())+1)
m = a <= df.a.values[:,None]
df = pd.DataFrame(m.cumsum(1)[m], columns=['a'])
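
Since the three lines are fairly dense, here is the same idea with the intermediates spelled out. This is a sketch with the sample data rebuilt inline; the result is stored in `out` (an illustrative name) instead of overwriting `df`.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"a": [3, 5, 4]})

# Candidate values 1..max(a), shared by every row.
a = np.arange(df["a"].max()) + 1            # [1 2 3 4 5]

# Broadcasting against a column vector: row i is True in its first df.a[i] columns.
m = a <= df["a"].to_numpy()[:, None]        # shape (3, 5)

# The running count along each row reads 1, 2, ..., n on the True cells;
# boolean indexing keeps those cells and flattens them in row-major order.
out = pd.DataFrame(m.cumsum(axis=1)[m], columns=["a"])
print(out["a"].tolist())  # [1, 2, 3, 1, 2, 3, 4, 5, 1, 2, 3, 4]
```

The mask has len(df) × max(a) cells, so this trades memory for vectorised speed; with one very large value in `a` the mask can get big.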

Answer 4

piR*_*red 5

List comprehension:

pd.DataFrame({'a': [x + 1 for y in df['a'] for x in range(y)]})

    a
0   1
1   2
2   3
3   1
4   2
5   3
6   4
7   5
8   1
9   2
10  3
11  4

Archived:	6 years, 4 months ago
Views:	165
Last activity:	6 years, 4 months ago