How do I convert a column to a non-nested list when the column's elements are lists?
For example, the column looks like
column
[1, 2, 3]
[1, 2]
I want to end up with the following:
[1,2,3,1,2]
But right now, with column.tolist(), I get
[[1,2,3],[1,2]]
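To make the setup concrete, here is a minimal sketch that reproduces the situation. The DataFrame name `df` and the column name `'column'` are assumptions borrowed from the examples in this thread, not the asker's real data.

```python
import pandas as pd

# Assumed sample data: each cell of 'column' holds a Python list.
df = pd.DataFrame({'column': [[1, 2, 3], [1, 2]]})

# tolist() converts the Series to a list of its cells, so the nesting stays.
nested = df['column'].tolist()
print(nested)  # [[1, 2, 3], [1, 2]]
```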
Edit: Thanks for your help. My goal is to find the simplest (most elegant) and most efficient way to do this. For now I am using @jezrael's method.
from itertools import chain
output = list(chain.from_iterable(df[column]))
The simplest approach is the one provided by @piRSquared, but it may be slower.
output = df[column].values.sum()
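The `sum()` approach works because adding two Python lists concatenates them, so summing a column of lists joins them one after another. Each `+` builds a brand-new list, though, so the total work grows roughly quadratically with the number of rows, which is the likely reason it is slower. A pure-Python sketch of the same idea (the variable names are illustrative only, and `rows` stands in for the column's values):

```python
from itertools import chain

rows = [[1, 2, 3], [1, 2]]  # stand-in for the values of the column

# Roughly what summing lists does: repeated list concatenation.
# Every step copies the accumulated list into a new one.
flat_by_sum = []
for sublist in rows:
    flat_by_sum = flat_by_sum + sublist

# chain.from_iterable walks each sublist once, with no intermediate copies.
flat_by_chain = list(chain.from_iterable(rows))

print(flat_by_sum)    # [1, 2, 3, 1, 2]
print(flat_by_chain)  # [1, 2, 3, 1, 2]
```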
You can use numpy.concatenate:
print (np.concatenate(df['column'].values).tolist())
[1, 2, 3, 1, 2]
Or:
from itertools import chain
print (list(chain.from_iterable(df['column'])))
[1, 2, 3, 1, 2]
Another solution, thanks to juanpa.arrivillaga:
print ([item for sublist in df['column'] for item in sublist])
[1, 2, 3, 1, 2]
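One difference between these approaches may matter if some rows hold empty lists. As far as I know, `np.concatenate` converts an empty list to a float array, so the combined result can come back as floats, while the pure-Python approaches leave the items untouched. A small sketch with assumed sample data:

```python
import numpy as np
import pandas as pd
from itertools import chain

# Assumed sample data with one empty list.
df = pd.DataFrame({'column': [[1, 2], []]})

via_numpy = np.concatenate(df['column'].values).tolist()
via_chain = list(chain.from_iterable(df['column']))

print(via_numpy)  # [1.0, 2.0]
print(via_chain)  # [1, 2]
```

If the item types have to be preserved exactly, chain.from_iterable or the list comprehension looks like the safer choice.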
Timings:
df = pd.DataFrame({'column':[[1,2,3], [1,2]]})
df = pd.concat([df]*10000).reset_index(drop=True)
print (df)
In [77]: %timeit (np.concatenate(df['column'].values).tolist())
10 loops, best of 3: 22.7 ms per loop
In [78]: %timeit (list(chain.from_iterable(df['column'])))
1000 loops, best of 3: 1.44 ms per loop
In [79]: %timeit ([item for sublist in df['column'] for item in sublist])
100 loops, best of 3: 2.31 ms per loop
In [80]: %timeit df.column.sum()
1 loop, best of 3: 1.34 s per loop
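The `%timeit` lines above need IPython. For a plain Python script, a rough equivalent can be sketched with the standard `timeit` module. The smaller frame size (2,000 rows) and the `candidates` dictionary are my own assumptions, chosen so the slow `sum()` variant finishes quickly; absolute numbers will differ from machine to machine.

```python
import timeit
from itertools import chain

import numpy as np
import pandas as pd

# Assumed benchmark data: same shape as above, but only 2,000 rows.
df = pd.DataFrame({'column': [[1, 2, 3], [1, 2]]})
df = pd.concat([df] * 1000).reset_index(drop=True)

candidates = {
    'np.concatenate': lambda: np.concatenate(df['column'].values).tolist(),
    'chain.from_iterable': lambda: list(chain.from_iterable(df['column'])),
    'list comprehension': lambda: [item for sublist in df['column'] for item in sublist],
    'values.sum': lambda: df['column'].values.sum(),
}

# Run each candidate once so the results can be compared for equality.
results = {name: func() for name, func in candidates.items()}

for name, func in candidates.items():
    # Best of 3 repeats, 5 calls each, reported per call.
    best = min(timeit.repeat(func, number=5, repeat=3)) / 5
    print(f'{name:>20}: {best * 1e3:.3f} ms per call')
```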