I have a Pandas DataFrame that looks like this:
text = ["abcd", "efgh", "ijkl", "mnop", "qrst", "uvwx", "yz"]
labels = ["label_1, label_2",
"label_1, label_3, label_2",
"label_2, label_4",
"label_1, label_2, label_5",
"label_2, label_3",
"label_3, label_5, label_1, label_2",
"label_1, label_3"]
df = pd.DataFrame(dict(text=text, labels=labels))
df
text labels
0 abcd label_1, label_2
1 efgh label_1, label_3, label_2
2 ijkl label_2, label_4
3 mnop label_1, label_2, label_5
4 qrst label_2, label_3
5 uvwx label_3, label_5, label_1, label_2
6 yz label_1, label_3
I want to reshape the DataFrame so that it looks like this:
text label_1 label_2 label_3 label_4 label_5
abcd 1.0 …

I have a DataFrame like this:
df = pd.DataFrame({'name': ['toto', 'tata', 'tati'], 'choices': 0})
df['choices'] = df['choices'].astype(object)
df.at[0, 'choices'] = [1, 2, 3]            # .at sets one cell and avoids chained assignment
df.at[1, 'choices'] = [5, 4, 3, 1]
df.at[2, 'choices'] = [6, 3, 2, 1, 5, 4]
print(df)
choices name
0 [1, 2, 3] toto
1 [5, 4, 3, 1] tata
2 [6, 3, 2, 1, 5, 4] tati
Based on this df, I want to build a DataFrame like this:
choice rank name
0 1 0 toto
1 2 1 toto
2 3 2 toto
3 5 0 tata
4 4 1 tata
5 3 2 tata
6 1 3 tata
7 6 0 tati
8 …

I have the following DataFrame:
import pandas as pd
df = pd.DataFrame({
'col1': ['a, b'],
'col2': [100]
}, index=['A'])
What I want to achieve is to create a MultiIndex by "exploding" col1, with the values of col1 as the second level, while retaining the col2 values from the original index, e.g.:
idx_1,idx_2,val
A,a,100
A,b,100
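I'm sure I need a col1.str.split(', ') somewhere, but I'm completely lost as to how to create the desired result. Maybe I need a pivot_table, but I can't see how to get the required index from it.
I've spent a good hour and a half going through the docs on reshaping, pivoting and so on. I'm sure it's straightforward; I just don't know the terminology needed to find the "right thing".

I have a df with a column called Description, whose values look like: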
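One possible approach to the col1 question, as a minimal sketch rather than a definitive answer: split the string into a list, turn each list element into its own row with DataFrame.explode (this assumes pandas 0.25 or newer), then append col1 to the existing index. The names idx_1, idx_2 and val are taken from the desired output above; the variable name result is only illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    'col1': ['a, b'],
    'col2': [100]
}, index=['A'])

result = (
    df.assign(col1=df['col1'].str.split(', '))  # each cell becomes a list, e.g. ['a', 'b']
      .explode('col1')                          # one row per list element, index 'A' is repeated
      .set_index('col1', append=True)           # col1 becomes the second index level
      .rename_axis(['idx_1', 'idx_2'])          # name the two index levels
      .rename(columns={'col2': 'val'})          # match the column name in the desired output
)
print(result)
```

The useful search term here is "explode". Before DataFrame.explode existed, the same reshaping was usually done with str.split(expand=True) followed by stack.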
ID Description
1 (a) this is good (b) bad (c) average
2 Ok
3 i am rahul works on (a) stack overflow (b) stack exchange
Expected DF:
ID Description
1 (a) this is good
1 (b) bad
1 (c) average
2 Ok
3 i am rahul works on (a) stack overflow
3 (b) stack exchange
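A hedged sketch of one way to get the expected DF above. It assumes that the markers are single lowercase letters in parentheses and that a new row should start only at (b) through (z), so that (a) stays attached to any text preceding it, as in ID 3. The names marker and result are illustrative, and DataFrame.explode requires pandas 0.25 or newer.

```python
import re
import pandas as pd

df = pd.DataFrame({
    'ID': [1, 2, 3],
    'Description': [
        '(a) this is good (b) bad (c) average',
        'Ok',
        'i am rahul works on (a) stack overflow (b) stack exchange',
    ],
})

# Assumption: split at whitespace that is followed by a marker (b)..(z).
# The lookahead keeps the marker itself at the start of the next piece.
marker = re.compile(r'\s+(?=\([b-z]\))')

result = (
    df.assign(Description=df['Description'].map(marker.split))  # string -> list of pieces
      .explode('Description')                                   # one row per piece, ID repeated
      .reset_index(drop=True)
)
print(result)
```

If the real data uses other marker styles, such as digits or uppercase letters, the character class in the pattern would need to be adjusted.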