Pandas: split a DataFrame column into one column per character

War*_* S. 8 python dataframe pandas

I have several DataFrame columns that look like this:

                         Day1
0    DDDDDDDDDDBBBBBBAAAAAAAAAABBBBBBDDDDDDDDDDDDDDDD
1    DDDDDDDDDDBBBBBBAAAAAAAAAABBBBBBDDDDDDDDDDDDDDDD
2    DDDDDDDDDDBBBBBBAAAAAAAAAABBBBBBDDDDDDDDDDDDDDDD
3    DDDDDDDDDDBBBBBBAAAAAAAAAABBBBBBDDDDDDDDDDDDDDDD
4    DDDDDDDDDDBBBBBBAAAAAAAAAABBBBBBDDDDDDDDDDDDDDDD

What I want is for each character to be separated into its own column:

     012345678910111213....
0    DDDDDDDDDDBBBBBBAAAAAAAAAABBBBBBDDDDDDDDDDDDDDDD
1    DDDDDDDDDDBBBBBBAAAAAAAAAABBBBBBDDDDDDDDDDDDDDDD
2    DDDDDDDDDDBBBBBBAAAAAAAAAABBBBBBDDDDDDDDDDDDDDDD
3    DDDDDDDDDDBBBBBBAAAAAAAAAABBBBBBDDDDDDDDDDDDDDDD
4    DDDDDDDDDDBBBBBBAAAAAAAAAABBBBBBDDDDDDDDDDDDDDDD

So the "Day1" column is split into 48 columns, each holding a single value A/B/C/D.

I tried split, but that did not work.

EdC*_*ica 13

You can call apply and, for each row, call pd.Series on the list of the value's characters:

In [16]:

df['Day1'].apply(lambda x: pd.Series(list(x)))
Out[16]:
  0  1  2  3  4  5  6  7  8  9  ... 38 39 40 41 42 43 44 45 46 47
0  D  D  D  D  D  D  D  D  D  D ...  D  D  D  D  D  D  D  D  D  D
1  D  D  D  D  D  D  D  D  D  D ...  D  D  D  D  D  D  D  D  D  D
2  D  D  D  D  D  D  D  D  D  D ...  D  D  D  D  D  D  D  D  D  D
3  D  D  D  D  D  D  D  D  D  D ...  D  D  D  D  D  D  D  D  D  D
4  D  D  D  D  D  D  D  D  D  D ...  D  D  D  D  D  D  D  D  D  D

[5 rows x 48 columns]

It looks like you have trailing whitespace; remove it using str.rstrip:

df['Day1'] = df['Day1'].str.rstrip()

Then do the above.
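
As a minimal, self-contained sketch of those two steps together: the frame below is made-up sample data (shorter strings, with trailing spaces added to mimic the suspected whitespace), not the asker's actual data, and the name `chars` is only illustrative.

```python
import pandas as pd

# Hypothetical sample data; the trailing spaces mimic the suspected whitespace.
df = pd.DataFrame({"Day1": ["DDBBAA  ", "ABCDAB  "]})

# Strip trailing whitespace first so it does not turn into extra columns.
df["Day1"] = df["Day1"].str.rstrip()

# Turn each string into a list of characters, then expand that list into columns.
chars = df["Day1"].apply(lambda x: pd.Series(list(x)))

print(chars)
```

Each row's string becomes one Series whose index (0, 1, 2, ...) supplies the column labels. If the strings differ in length, the shorter rows would be padded with NaN.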


Max*_*axU 5

Use the Series.str.extractall() method (re must be imported for the flags argument):

In [19]: df.Day1.str.extractall('(.)', flags=re.U)[0].unstack().rename_axis(None, axis=1)
Out[19]:
  0  1  2  3  4  5  6  7  8  9  ... 38 39 40 41 42 43 44 45 46 47
0  D  D  D  D  D  D  D  D  D  D ...  D  D  D  D  D  D  D  D  D  D
1  D  D  D  D  D  D  D  D  D  D ...  D  D  D  D  D  D  D  D  D  D
2  D  D  D  D  D  D  D  D  D  D ...  D  D  D  D  D  D  D  D  D  D
3  D  D  D  D  D  D  D  D  D  D ...  D  D  D  D  D  D  D  D  D  D
4  D  D  D  D  D  D  D  D  D  D ...  D  D  D  D  D  D  D  D  D  D

[5 rows x 48 columns]
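
A step-by-step sketch of the same chain, on made-up sample data rather than the asker's frame. The names `matches` and `wide` are illustrative, and `flags=re.U` is left out on the assumption of Python 3, where str patterns are already Unicode-aware.

```python
import pandas as pd

# Hypothetical sample data with equal-length strings.
df = pd.DataFrame({"Day1": ["DDBBAA", "ABCDAB"]})

# extractall returns one row per match, indexed by (original row, match number).
matches = df["Day1"].str.extractall("(.)")

# Take the single capture group (column 0), pivot the match numbers into
# columns, and clear the "match" label left on the column axis.
wide = matches[0].unstack().rename_axis(None, axis=1)

print(wide)
```

The intermediate `matches` frame has a two-level index, which is why `unstack()` is needed to get back to one row per original string.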