yan*_*234 4 python group-by numpy pandas pandas-groupby
I have the following DataFrame:
      Question1 Question2 Question3          Question4
User1     Agree     Agree  Disagree  Strongly Disagree
User2  Disagree     Agree     Agree           Disagree
User3     Agree     Agree     Agree              Agree
Is there a way to transform the DataFrame above into the following?
           Agree  Disagree  Strongly Disagree
Question1      2         1                  0
Question2      3         0                  0
Question3      2         1                  0
Question4      1         1                  1
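This is similar to my previous question, "Creating a data frame from a grouping problem with three columns".
I looked at that question and tried stack/pivot, but couldn't figure it out. The actual DataFrame has more than 20 questions and a similar scale of Strongly Agree, Agree, Neutral, Disagree, Strongly Disagree.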
With pd.get_dummies
import pandas as pd
pd.get_dummies(df.stack()).groupby(level=1).sum()
           Agree  Disagree  Strongly Disagree
Question1      2         1                  0
Question2      3         0                  0
Question3      2         1                  0
Question4      1         1                  1
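To make the one-liner easier to follow, here is a minimal sketch that rebuilds the question's sample data and runs the same steps one at a time. The `scale` list and the `counts_full` name are my own assumptions (the question only names the five levels in passing), and `dtype=int` is passed so the indicator columns are integers regardless of pandas version.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame(
    {
        "Question1": ["Agree", "Disagree", "Agree"],
        "Question2": ["Agree", "Agree", "Agree"],
        "Question3": ["Disagree", "Agree", "Agree"],
        "Question4": ["Strongly Disagree", "Disagree", "Agree"],
    },
    index=["User1", "User2", "User3"],
)

# stack() turns the frame into one long Series indexed by (user, question).
stacked = df.stack()

# One indicator column per answer, then add the indicators up per question
# (level 1 of the index holds the question labels).
counts = pd.get_dummies(stacked, dtype=int).groupby(level=1).sum()

# Assumed full scale: reindex so answers nobody chose still get a column.
scale = ["Strongly Agree", "Agree", "Neutral", "Disagree", "Strongly Disagree"]
counts_full = counts.reindex(columns=scale, fill_value=0)
print(counts_full)
```

The `reindex` step is optional. It only matters when you want every level of the scale to show up, in a fixed order, even if its count is zero.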
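Taking it to another level
We can use numpy.bincount to speed things up, but we have to be careful with the dimensions: bincount only returns as many bins as the largest index it sees plus one, so the result may need padding before it can be reshaped.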
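import numpy as np

v = df.values
f, u = pd.factorize(v.ravel())  # f: integer code per cell, u: unique answers
n, m = u.size, v.shape[1]       # n unique answers, m questions
# Column index of every cell; tile once per row so r lines up with f.
r = np.tile(np.arange(m), v.shape[0])
b0 = np.bincount(r * n + f)
# bincount stops at the largest bin seen, so pad up to n * m bins.
pad = np.zeros(n * m - b0.size, dtype=int)
b = np.append(b0, pad)
pd.DataFrame(b.reshape(m, n), df.columns, u)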
           Agree  Disagree  Strongly Disagree
Question1      2         1                  0
Question2      3         0                  0
Question3      2         1                  0
Question4      1         1                  1
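As a variation on the bincount idea, the manual padding can probably be avoided with the `minlength` argument of `np.bincount`. The sketch below wraps the steps in a function; the name `count_by_bincount` and the variable names are hypothetical, not part of the original answer.

```python
import numpy as np
import pandas as pd


def count_by_bincount(df):
    """Count how often each answer occurs in each column of df."""
    v = df.to_numpy()
    codes, labels = pd.factorize(v.ravel())  # one integer code per cell, row-major
    n_rows, n_cols = v.shape
    k = labels.size  # number of distinct answers

    # Column index of every cell, in the same row-major order as v.ravel().
    col = np.tile(np.arange(n_cols), n_rows)

    # Each (column, answer) pair gets its own bin: col * k + code.
    # minlength guarantees n_cols * k bins even when the last ones are empty.
    flat = np.bincount(col * k + codes, minlength=n_cols * k)
    return pd.DataFrame(flat.reshape(n_cols, k), index=df.columns, columns=labels)


# Sample data copied from the question.
df = pd.DataFrame(
    {
        "Question1": ["Agree", "Disagree", "Agree"],
        "Question2": ["Agree", "Agree", "Agree"],
        "Question3": ["Disagree", "Agree", "Agree"],
        "Question4": ["Strongly Disagree", "Disagree", "Agree"],
    },
    index=["User1", "User2", "User3"],
)
counts = count_by_bincount(df)
print(counts)
```

Note that the column index is tiled once per row, so the function does not depend on the number of users matching the number of distinct answers, which happens to be the case in the sample data.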
Another numpy option
v = df.values
n, m = v.shape
f, u = pd.factorize(v.ravel())
pd.DataFrame(
    np.eye(u.size, dtype=int)[f].reshape(n, m, -1).sum(0),
    df.columns, u
)
           Agree  Disagree  Strongly Disagree
Question1      2         1                  0
Question2      3         0                  0
Question3      2         1                  0
Question4      1         1                  1
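In case the `np.eye(...)[f]` trick is not obvious: indexing an identity matrix with integer codes yields one indicator (one-hot) row per code, and summing over the row axis after reshaping gives the per-column counts. A small sketch on made-up codes (the `codes` array below is hypothetical, standing in for the output of `pd.factorize`):

```python
import numpy as np

# Hypothetical factorized answers for 3 rows x 2 columns, flattened row-major:
# rows are [0, 0], [1, 2], [1, 0].
codes = np.array([0, 0, 1, 2, 1, 0])
n_rows, n_cols, k = 3, 2, 3  # k distinct answers

# Row i of one_hot is the indicator vector for codes[i]; shape (6, 3).
one_hot = np.eye(k, dtype=int)[codes]

# Restore the (row, column, answer) layout, then sum over rows.
per_col = one_hot.reshape(n_rows, n_cols, k).sum(axis=0)
print(per_col)
# [[1 2 0]
#  [2 0 1]]
```

The intermediate array holds rows x columns x answers integers, so this approach trades memory for simplicity on large frames.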
Speed differences
%%timeit
v = df.values
f, u = pd.factorize(v.ravel())
n, m = u.size, v.shape[1]
r = np.tile(np.arange(m), v.shape[0])  # tile once per row, not once per unique answer
b0 = np.bincount(r * n + f)
pad = np.zeros(n * m - b0.size, dtype=int)
b = np.append(b0, pad)
pd.DataFrame(b.reshape(m, n), df.columns, u)
1000 loops, best of 3: 194 µs per loop
%%timeit
v = df.values
n, m = v.shape
f, u = pd.factorize(v.ravel())
pd.DataFrame(
    np.eye(u.size, dtype=int)[f].reshape(n, m, -1).sum(0),
    df.columns, u
)
1000 loops, best of 3: 195 µs per loop
%timeit pd.get_dummies(df.stack()).groupby(level=1).sum()
1000 loops, best of 3: 1.2 ms per loop
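The timings above were taken on the three-row sample, so they may not say much about a realistic survey. Here is a rough sketch for repeating the comparison outside IPython on a larger frame. The data is randomly generated, and the sizes, scale labels, and function names are all assumptions of mine. No timings are quoted because they depend on the machine and library versions.

```python
import timeit

import numpy as np
import pandas as pd

# Assumed setup: 1000 respondents, 20 questions, five-level scale.
rng = np.random.default_rng(0)
scale = ["Strongly Agree", "Agree", "Neutral", "Disagree", "Strongly Disagree"]
df = pd.DataFrame(
    rng.choice(scale, size=(1000, 20)),
    columns=[f"Question{i}" for i in range(1, 21)],
)


def with_get_dummies(df):
    return pd.get_dummies(df.stack(), dtype=int).groupby(level=1).sum()


def with_eye(df):
    v = df.to_numpy()
    n, m = v.shape
    f, u = pd.factorize(v.ravel())
    one_hot = np.eye(u.size, dtype=int)[f]
    return pd.DataFrame(one_hot.reshape(n, m, -1).sum(0), df.columns, u)


# Sanity check: both give the same counts once rows and columns are aligned.
a = with_get_dummies(df)
b = with_eye(df).loc[a.index, a.columns]
assert (a.to_numpy() == b.to_numpy()).all()

# Best-of-3 average time per call, in seconds.
for fn in (with_get_dummies, with_eye):
    best = min(timeit.repeat(lambda: fn(df), number=5, repeat=3)) / 5
    print(f"{fn.__name__}: {best:.6f} s per call")
```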