Fre*_*red 7 python transformation dataframe pandas
I have a dataframe in a yes/no format, for example:
      7   22
1   NaN    t
25    t  NaN
where "t" stands for yes. I need to convert it into an X-Y table, where the column name is the X coordinate and the index is the Y coordinate:
    X   Y
1  22   1
2   7  25
In pseudocode, something like:
if a cell = "t":
    newdf.X = df.column(t)
    newdf.Y = df.index(t)
Try this:
import numpy as np
import pandas as pd

# Use np.where to get the integer (row, column) positions of the 't's in the dataframe
r, c = np.where(df == 't')
# Map those positions back to labels: column labels become X, index labels become Y
df_out = pd.DataFrame({'X': df.columns[c], 'Y': df.index[r]})
df_out
Output:
    X   Y
0  22   1
1   7  25
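For anyone who wants to run this end to end, here is a minimal self-contained sketch. The way the sample frame is built is an assumption on my part: the question does not say whether the column labels are integers or strings, so integer labels 7 and 22 are used here.

```python
import numpy as np
import pandas as pd

# Sample frame modeled on the question (integer labels are an assumption)
df = pd.DataFrame(
    {7: [np.nan, "t"], 22: ["t", np.nan]},
    index=[1, 25],
)

# Positional row and column indices of every 't' cell, in row-major order
r, c = np.where(df == "t")

# Translate positions back into labels
df_out = pd.DataFrame({"X": df.columns[c], "Y": df.index[r]})
print(df_out)
```

Because np.where scans the frame row by row, the result is ordered by row first, which is why the pair from index 1 comes before the pair from index 25.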
Update to address @RajeshC's comment:
Given df,
      7   22
1   NaN    t
13  NaN  NaN
25    t  NaN
Then:
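r, c = np.where(df == 't')
# Index the result by row position so that rows without a 't' can be restored
df_out = pd.DataFrame({'X': df.columns[c], 'Y': df.index[r]}, index=r)
# Reindex over every row position; rows with no 't' are filled with NaN
df_out = df_out.reindex(range(df.shape[0]))
df_out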
Output:
     X     Y
0   22   1.0
1  NaN   NaN
2    7  25.0
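A runnable sketch of the update, again assuming integer column labels (the exact dtypes in the printed result depend on that choice). Note that this approach relies on each row containing at most one 't': if a row had two, the positions in r would repeat and reindex would raise on the duplicate labels.

```python
import numpy as np
import pandas as pd

# Three-row frame; the middle row (index 13) has no 't' at all
df = pd.DataFrame(
    {7: [np.nan, np.nan, "t"], 22: ["t", np.nan, np.nan]},
    index=[1, 13, 25],
)

r, c = np.where(df == "t")

# Use the row positions as the index of the result
df_out = pd.DataFrame({"X": df.columns[c], "Y": df.index[r]}, index=r)

# Reindex over all row positions 0..n-1; missing positions become NaN rows
# (assumes at most one 't' per row, otherwise r contains duplicates)
df_out = df_out.reindex(range(df.shape[0]))
print(df_out)
```

The NaN row forces the numeric columns to float, which is why Y shows up as 1.0 and 25.0 rather than 1 and 25.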
Another option, using stack:
# stack() moves the columns into the index; swaplevel() puts the column label (X) first
pd.DataFrame.from_records(
    df.stack().index.swaplevel(),
    columns=['X', 'Y'])
Output:
    X   Y
0  22   1
1   7  25
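The stack version works because every non-NaN cell is a 't' and stack() drops the NaN entries. As far as I know, newer pandas versions are changing that default so NaN is kept, and the frame might also hold values other than 't'. A hedged variant that does not depend on either assumption is to stack an explicit boolean mask instead (the sample frame and integer labels below are my own assumptions):

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(
    {7: [np.nan, "t"], 22: ["t", np.nan]},
    index=[1, 25],
)

# Boolean Series indexed by (row label, column label); contains no NaN
mask = df.eq("t").stack()

# Keep only the True entries, then swap levels so the column label (X) comes first
pairs = mask[mask].index.swaplevel()

df_out = pd.DataFrame.from_records(list(pairs), columns=["X", "Y"])
print(df_out)
```

This also handles rows with several 't' cells, since each matching (row, column) pair simply becomes its own output row.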