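In this article, Guido van Rossum says that function calls can be expensive, but I don't understand why, or how expensive they are.
How much delay does a simple function call add to your code, and why?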
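One rough way to get a feel for the cost is to time the same loop with and without a function call. The sketch below assumes CPython 3 and uses only the standard `timeit` module; the helper names (`add_one`, `sum_inline`, `sum_with_call`, `compare`) are made up for illustration. In CPython each call has to look up the function name, set up a new frame, bind the arguments and hand back the result, which is cheap in absolute terms but adds up in a tight loop. Actual numbers depend heavily on the interpreter version and the machine.

```python
import timeit


def add_one(x):
    # Trivial body, so the measured difference is dominated by call overhead.
    return x + 1


def sum_inline(n):
    # Does the work directly inside the loop.
    total = 0
    for i in range(n):
        total += i + 1
    return total


def sum_with_call(n):
    # Same arithmetic, but routed through a function call on every iteration.
    total = 0
    for i in range(n):
        total += add_one(i)
    return total


def compare(n=100_000, repeat=5):
    # Take the best of several runs to reduce noise from the rest of the system.
    inline = min(timeit.repeat(lambda: sum_inline(n), number=1, repeat=repeat))
    called = min(timeit.repeat(lambda: sum_with_call(n), number=1, repeat=repeat))
    return inline, called


inline, called = compare()
print(f"inline:    {inline:.4f} s")
print(f"with call: {called:.4f} s")
```

Dividing the difference between the two timings by the number of iterations gives an approximate per-call overhead for this particular setup.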
I have a pandas DataFrame with MultiIndex columns, and I want to apply a function to one of its columns and assign the result back to the same column.
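As a point of comparison, here is a minimal sketch of one common way to write such an update: address the column with a single tuple key instead of two chained `[]` lookups. It rebuilds a small frame with the same shape and labels as the one in the question (without the HDF round trip) and assumes a reasonably recent pandas. It is not necessarily the fix for the specific problem the question goes on to describe.

```python
import numpy as np
import pandas as pd

# Same 3x5 frame of single letters as in the question, built in memory.
values = np.array(list('ABCDEFGHIJKLMNO'), dtype='object').reshape(3, 5)
columns = pd.MultiIndex.from_arrays(
    [list('UUULL'), ['One', 'Two', 'Three', 'Four', 'Five']]
)
df = pd.DataFrame(values, index=list('ABC'), columns=columns)

# A single tuple key selects the column in one step, so the assignment
# targets the frame itself rather than an intermediate object.
df[('L', 'Five')] = df[('L', 'Five')].apply(lambda x: x.lower())

print(df)
```

Chained indexing such as `df['L']['Five'] = ...` first produces an intermediate object from `df['L']`, and assigning into that intermediate may not update the original frame.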
In [1]:
import numpy as np
import pandas as pd
cols = ['One', 'Two', 'Three', 'Four', 'Five']
df = pd.DataFrame(np.array(list('ABCDEFGHIJKLMNO'), dtype='object').reshape(3,5), index = list('ABC'), columns=cols)
df.to_hdf('/tmp/test.h5', 'df')
df = pd.read_hdf('/tmp/test.h5', 'df')
df
Out[1]:
  One Two Three Four Five
A   A   B     C    D    E
B   F   G     H    I    J
C   K   L     M    N    O

3 rows × 5 columns
In [2]:
df.columns = pd.MultiIndex.from_arrays([list('UUULL'), ['One', 'Two', 'Three', 'Four', 'Five']])
df['L']['Five'] = df['L']['Five'].apply(lambda x: x.lower())
df …

I have an HDF file like this:
>>> dataset.store
... <class 'pandas.io.pytables.HDFStore'>
... File path: ../data/data_experiments_01-02-03.h5
... /exp01/user01 frame_table (typ->appendable,nrows->221,ncols->124,indexers->[index])
... /exp01/user02 frame_table (typ->appendable,nrows->163,ncols->124,indexers->[index])
... /exp01/user03 frame_table (typ->appendable,nrows->145,ncols->124,indexers->[index])
... /exp02/user01 frame_table (typ->appendable,nrows->194,ncols->124,indexers->[index])
... /exp02/user02 frame_table (typ->appendable,nrows->145,ncols->124,indexers->[index])
... /exp03/user03 frame_table (typ->appendable,nrows->348,ncols->124,indexers->[index])
... /exp03/user01 frame_table (typ->appendable,nrows->240,ncols->124,indexers->[index])
I want to retrieve all users (userXY) from one of the experiments (exp0Z) and append them into one big DataFrame. I tried store.get('exp03') and got the following error:
>>> store.get('exp03')
...
... ---------------------------------------------------------------------------
... TypeError Traceback (most recent call last)
... <ipython-input-109-0a2e29e9e0a4> in <module>()
... ----> 1 dataset.store.get('/exp03')
...
... /Library/Python/2.7/site-packages/pandas/io/pytables.pyc in get(self, key)
... 613 if group is None:
... 614 raise KeyError('No object …

Note: following some people's suggestion, I have cross-posted this question to the Code Review site.
I want to split a list using another list that contains the length of each split.
For example:
>>> print list(split_by_lengths(list('abcdefg'), [2,1]))
... [['a', 'b'], ['c'], ['d', 'e', 'f', 'g']]
>>> print list(split_by_lengths(list('abcdefg'), [2,2]))
... [['a', 'b'], ['c', 'd'], ['e', 'f', 'g']]
>>> print list(split_by_lengths(list('abcdefg'), [2,2,6]))
... [['a', 'b'], ['c', 'd'], ['e', 'f', 'g']]
>>> print list(split_by_lengths(list('abcdefg'), [1,10]))
... [['a'], ['b', 'c', 'd', 'e', 'f', 'g']]
>>> print list(split_by_lengths(list('abcdefg'), [2,2,6,5]))
... [['a', 'b'], ['c', 'd'], ['e', 'f', 'g']]
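As you can see, if the list of lengths does not cover the whole list, I append the remaining elements as an additional sublist. I also want to avoid empty lists at the end when the list of lengths asks for more elements than the list being split contains.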
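The behaviour described above can be sketched with `itertools.islice`. This is only an illustrative version written in Python 3, not the author's implementation, and it assumes every requested length is a positive integer (a length of 0 would end the splitting early).

```python
from itertools import islice


def split_by_lengths(items, lengths):
    """Yield consecutive chunks of `items` whose sizes follow `lengths`.

    Leftover elements are yielded as one final chunk, and no empty
    chunk is produced when `lengths` asks for more than is available.
    """
    it = iter(items)
    for n in lengths:
        chunk = list(islice(it, n))
        if not chunk:
            # Input exhausted: stop instead of yielding empty lists.
            return
        yield chunk
    rest = list(it)
    if rest:
        # Lengths did not cover everything: emit the remainder as one chunk.
        yield rest


print(list(split_by_lengths(list('abcdefg'), [2, 1])))
print(list(split_by_lengths(list('abcdefg'), [2, 2, 6, 5])))
```

Because the function consumes a single iterator, it also accepts generators and other one-shot iterables, not only lists.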
I already have a function that works the way I want:
def take(n, iterable):
    "Return first n items of the iterable as a …