Tho*_*aut 3 python dataframe pandas
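Here is my problem. I have a DataFrame with x columns and y rows. Some of the columns actually hold lists. I want to convert each of these columns into multiple columns that each contain a single value.
An example says it all:
My DataFrame: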
ans_length ans_unigram_numbers ... levenshtein_dist que_entropy
0 [19, 14] [12, 8] ... 9.00 3.189898
1 [19] [12] ... 4.00 3.189898
2 [0] [0] ... 170.00 4.299996
3 [0] [0] ... 170.00 4.303341
4 [0] [0] ... 170.00 4.304335
5 [0] [0] ... 170.00 4.311820
28 [56] [23] ... 24.00 4.110291
29 [0] [0] ... 56.00 4.181720
... ... ... ... ... ...
1976 [24] [11] ... 24.00 3.084963
1977 [24] [11] ... 24.00 3.084963
1992 [31, 24, 32, 28] [14, 15, 17, 11] ... 18.75 3.292770
1993 [31, 24, 32, 28] [14, 15, 17, 11] ... 18.75 3.292770
[1998 rows x 9 columns]
What I expect:
ans_length_0 ans_length_1 ans_length_2 ans_length_3 \
0 19 14
1 19
2 0
3 0
4 0
5 0
28 56
29 0
1976 24
1977 24
1992 31 24 32 28
1993 31 24 32 28
ans_unigram_numbers_0 ans_unigram_numbers_1 ans_unigram_numbers_2 ans_unigram_numbers_3 \
12 8
12
0
0
0
0
23
0
11
11
14 15 17 11
14 15 17 11
levenshtein_dist que_entropy
9 3.189898
4 3.189898
170 4.299996
170 4.303341
170 4.304335
170 4.31182
24 4.110291
56 4.18172
24 3.084963
24 3.084963
18.75 3.29277
18.75 3.29277
The newly generated columns should use the name of the old column, with the list position appended at the end.
I think you can use:
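import pandas as pd

cols = ['ans_length', 'ans_unigram_numbers']
# Expand each list column into <name>_0, <name>_1, ...; index=df.index keeps rows aligned with df
df1 = pd.concat([pd.DataFrame(df[x].tolist(), index=df.index).add_prefix(f'{x}_') for x in cols], axis=1)
df = pd.concat([df1, df.drop(cols, axis=1)], axis=1)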
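To make the approach above easier to try, here is a minimal self-contained sketch. The sample rows are copied from a few lines of the question, and the helper name `expand_list_columns` is hypothetical, introduced only for illustration. It assumes every cell in the selected columns is a list, and it passes the original index through because the DataFrame in the question has a non-contiguous index (for example 5 followed by 28).

```python
import pandas as pd

# Sample data mirroring a few rows of the question, with a non-contiguous index.
df = pd.DataFrame(
    {
        "ans_length": [[19, 14], [19], [56], [31, 24, 32, 28]],
        "ans_unigram_numbers": [[12, 8], [12], [23], [14, 15, 17, 11]],
        "levenshtein_dist": [9.0, 4.0, 24.0, 18.75],
        "que_entropy": [3.189898, 3.189898, 4.110291, 3.292770],
    },
    index=[0, 1, 28, 1992],
)


def expand_list_columns(frame, cols):
    """Replace each list-valued column in `cols` with one column per list position.

    Hypothetical helper: assumes every cell in `cols` is a list.
    """
    parts = []
    for col in cols:
        # Build one column per list position; shorter lists are padded with NaN.
        expanded = pd.DataFrame(frame[col].tolist(), index=frame.index)
        # Rename 0, 1, 2, ... to col_0, col_1, col_2, ...
        parts.append(expanded.add_prefix(f"{col}_"))
    # New columns first, then the untouched columns, as in the expected output.
    return pd.concat(parts + [frame.drop(columns=cols)], axis=1)


result = expand_list_columns(df, ["ans_length", "ans_unigram_numbers"])
print(result.columns.tolist())
```

Positions that a shorter list does not reach come out as NaN rather than blank, so the expanded columns are stored as floats. If integer display matters, converting those columns with `astype("Int64")` (the pandas nullable integer dtype) is one option worth trying.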