Das*_*ual 11 python dataset dataframe pandas
My pandas DataFrame looks like this:
   Person   ID  ZipCode  Gender
0   12345  882    38182  Female
1   32917  271    88172    Male
2   18273  552    90291  Female
I want to repeat each row 3 times, like this:
   Person   ID  ZipCode  Gender
0   12345  882    38182  Female
0   12345  882    38182  Female
0   12345  882    38182  Female
1   32917  271    88172    Male
1   32917  271    88172    Male
1   32917  271    88172    Male
2   18273  552    90291  Female
2   18273  552    90291  Female
2   18273  552    90291  Female
And, of course, with the index reset so that it reads:
0
1
2
I have tried solutions such as:
pd.concat([df[:5]]*3, ignore_index=True)
and:
df.reindex(np.repeat(df.index.values, df['ID']), method='ffill')
Neither worked for me. I would appreciate any help.
U10*_*ard 16
Try this:
newdf = pd.DataFrame(np.repeat(df.values,3,axis=0))
newdf.columns = df.columns
print(newdf)
Output:
   Person   ID  ZipCode  Gender
0   12345  882    38182  Female
1   12345  882    38182  Female
2   12345  882    38182  Female
3   32917  271    88172    Male
4   32917  271    88172    Male
5   32917  271    88172    Male
6   18273  552    90291  Female
7   18273  552    90291  Female
8   18273  552    90291  Female
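One caveat worth noting with this approach: `df.values` on a frame with mixed column types returns an object array, so the numeric columns of `newdf` come back as `object` rather than their original integer dtype. Below is a minimal sketch of one way to restore the dtypes, assuming the sample frame from the question (rebuilt by hand here so the snippet runs on its own; the name `restored` is only illustrative):

```python
import numpy as np
import pandas as pd

# Sample frame rebuilt from the question so the snippet is self-contained.
df = pd.DataFrame({
    "Person": [12345, 32917, 18273],
    "ID": [882, 271, 552],
    "ZipCode": [38182, 88172, 90291],
    "Gender": ["Female", "Male", "Female"],
})

# Repeat every row 3 times; the mixed-type values come back as an object array.
newdf = pd.DataFrame(np.repeat(df.values, 3, axis=0), columns=df.columns)

# Cast each column back to the dtype it had in the original frame.
restored = newdf.astype(df.dtypes.to_dict())

print(newdf.dtypes)
print(restored.dtypes)
```

If the dtypes do not matter for what you do next, the original three lines are fine as they are.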
Using concat:
pd.concat([df]*3).sort_index()
Out[129]:
   Person   ID  ZipCode  Gender
0   12345  882    38182  Female
0   12345  882    38182  Female
0   12345  882    38182  Female
1   32917  271    88172    Male
1   32917  271    88172    Male
1   32917  271    88172    Male
2   18273  552    90291  Female
2   18273  552    90291  Female
2   18273  552    90291  Female
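For context on why the `concat` attempt in the question did not give the desired order: `pd.concat([df] * 3, ignore_index=True)` stacks whole copies of the frame, so the rows come out in block order (row 0, 1, 2, then 0, 1, 2 again). Sorting on the original index first groups the copies of each row together, and the index can be reset afterwards. A small sketch, assuming the sample frame from the question and a unique, sortable index (the names `stacked` and `grouped` are only illustrative):

```python
import pandas as pd

# Sample frame rebuilt from the question so the snippet is self-contained.
df = pd.DataFrame({
    "Person": [12345, 32917, 18273],
    "ID": [882, 271, 552],
    "ZipCode": [38182, 88172, 90291],
    "Gender": ["Female", "Male", "Female"],
})

# Block order: the whole frame repeated three times, index renumbered 0..8.
stacked = pd.concat([df] * 3, ignore_index=True)

# Row-wise order: sort on the original labels first, then renumber 0..8.
grouped = pd.concat([df] * 3).sort_index().reset_index(drop=True)

print(stacked["Person"].tolist())
print(grouped["Person"].tolist())
```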
小智 7
I'm not sure why this was never suggested, but you can easily combine df.index.repeat with .loc:
new_df = df.loc[df.index.repeat(3)]
Output:
>>> new_df
   Person   ID  ZipCode  Gender
0   12345  882    38182  Female
0   12345  882    38182  Female
0   12345  882    38182  Female
1   32917  271    88172    Male
1   32917  271    88172    Male
1   32917  271    88172    Male
2   18273  552    90291  Female
2   18273  552    90291  Female
2   18273  552    90291  Female
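To also get the fresh 0..8 index that the question asks for, `reset_index(drop=True)` can be chained on. A minimal sketch, assuming the sample frame from the question; note that `.loc` selects by label, so this relies on the index labels being unique:

```python
import pandas as pd

# Sample frame rebuilt from the question so the snippet is self-contained.
df = pd.DataFrame({
    "Person": [12345, 32917, 18273],
    "ID": [882, 271, 552],
    "ZipCode": [38182, 88172, 90291],
    "Gender": ["Female", "Male", "Female"],
})

# Repeat each index label 3 times, select those rows, then renumber from 0.
new_df = df.loc[df.index.repeat(3)].reset_index(drop=True)

print(new_df)
```

Unlike the route through `df.values`, this selection keeps the original column dtypes.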
These will repeat the index labels and preserve the columns, as the OP demonstrated:
iloc version 1
df.iloc[np.arange(len(df)).repeat(3)]
iloc version 2
df.iloc[np.arange(len(df) * 3) // 3]
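Both versions build the same array of row positions, just in different ways. Here is a quick check, assuming the sample frame from the question but with a hypothetical string index (not part of the original data) to show that `iloc` works by position rather than by label:

```python
import numpy as np
import pandas as pd

# Sample frame from the question; the string index is a made-up addition
# used only to show that iloc ignores the labels.
df = pd.DataFrame(
    {
        "Person": [12345, 32917, 18273],
        "ID": [882, 271, 552],
        "ZipCode": [38182, 88172, 90291],
        "Gender": ["Female", "Male", "Female"],
    },
    index=["a", "b", "c"],
)

# Version 1: repeat each position 3 times.
pos_v1 = np.arange(len(df)).repeat(3)
# Version 2: count 0..8 and integer-divide by 3 to get the same positions.
pos_v2 = np.arange(len(df) * 3) // 3

out_v1 = df.iloc[pos_v1]
out_v2 = df.iloc[pos_v2]

print(pos_v1)
print(pos_v2)
print(out_v1.equals(out_v2))
```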