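I want to apply scaling (using StandardScaler() from sklearn.preprocessing) to a pandas DataFrame. The following code returns a numpy array, so I lose all the column names and indices. This is not what I want.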
from sklearn.preprocessing import StandardScaler
features = df[["col1", "col2", "col3", "col4"]]
autoscaler = StandardScaler()
features = autoscaler.fit_transform(features)
A "solution" I found online is:
features = features.apply(lambda x: autoscaler.fit_transform(x))
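It appears to work, but leads to a deprecation warning:
/usr/lib/python3.5/site-packages/sklearn/preprocessing/data.py:583: DeprecationWarning: Passing 1d arrays as data is deprecated in 0.17 and will raise ValueError in 0.19. Reshape your data either using X.reshape(-1, 1) if your data has a single feature or X.reshape(1, -1) if it contains a single sample.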
I therefore tried:
features = features.apply(lambda x: autoscaler.fit_transform(x.reshape(-1, 1)))
But this gives:
Traceback (most recent call last):
  File "./analyse.py", line 91, in <module>
    features = features.apply(lambda x: autoscaler.fit_transform(x.reshape(-1, 1)))
  File "/usr/lib/python3.5/site-packages/pandas/core/frame.py", line 3972, in apply
    return self._apply_standard(f, axis, reduce=reduce)
  File "/usr/lib/python3.5/site-packages/pandas/core/frame.py", line 4081, in _apply_standard
    result = self._constructor(data=results, index=index)
  File "/usr/lib/python3.5/site-packages/pandas/core/frame.py", line 226, in __init__
    mgr = self._init_dict(data, index, columns, dtype=dtype)
  File "/usr/lib/python3.5/site-packages/pandas/core/frame.py", line 363, in _init_dict
    dtype=dtype)
  File "/usr/lib/python3.5/site-packages/pandas/core/frame.py", line 5163, in _arrays_to_mgr
    arrays = _homogenize(arrays, index, dtype)
  File "/usr/lib/python3.5/site-packages/pandas/core/frame.py", line 5477, in _homogenize
    raise_cast_failure=False)
  File "/usr/lib/python3.5/site-packages/pandas/core/series.py", line 2885, in _sanitize_array
    raise Exception('Data must be 1-dimensional')
Exception: Data must be 1-dimensional
How do I apply scaling to the pandas DataFrame while leaving the DataFrame intact? Without copying the data, if possible.
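One possible approach, sketched here under assumptions, is to assign the scaled array back into the same columns, so that pandas keeps the existing index and column labels. The example data, the index labels, and the extra "other" column are hypothetical and only serve to make the snippet runnable. Note that fit_transform still allocates an intermediate array, so this is not strictly copy-free.

```python
import pandas as pd
from sklearn.preprocessing import StandardScaler

# Hypothetical example data; only the four column names come from the question.
df = pd.DataFrame(
    {
        "col1": [1.0, 2.0, 3.0],
        "col2": [10.0, 20.0, 30.0],
        "col3": [5.0, 5.5, 7.0],
        "col4": [0.0, 1.0, 4.0],
        "other": ["a", "b", "c"],  # a column that should stay untouched
    },
    index=["r1", "r2", "r3"],
)
cols = ["col1", "col2", "col3", "col4"]

autoscaler = StandardScaler()
# fit_transform returns a 2-D NumPy array; assigning it back to the same
# columns keeps the DataFrame's index, column names, and other columns.
df[cols] = autoscaler.fit_transform(df[cols])

print(df)
```

An alternative with the same effect is to build a new frame explicitly, e.g. `pd.DataFrame(scaled, index=df.index, columns=cols)`, if the original values need to be kept.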
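I have been using "ipython --script" to automatically save a .py file for each IPython notebook, so that I can use it to import classes into other notebooks. But this recently stopped working, and I get the following error message: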
`--script` is deprecated. You can trigger nbconvert via pre- or post-save hooks:
ContentsManager.pre_save_hook
FileContentsManager.post_save_hook
A post-save hook has been registered that calls:
ipython nbconvert --to script [notebook]
which behaves similarly to `--script`.
As I understand it, I need to set up a post-save hook, but I don't understand how to do that. Can someone explain?
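A minimal sketch of such a hook, assuming it is placed in the notebook server's Python config file (commonly `~/.jupyter/jupyter_notebook_config.py`; the exact location depends on the installation) and that `jupyter nbconvert` is available on the PATH. The function name `post_save` is arbitrary. The object `c` is provided by the config loader, so the guard at the end exists only to let the snippet run outside a config file.

```python
import os
from subprocess import check_call


def post_save(model, os_path, contents_manager):
    """Post-save hook: convert a saved notebook to a .py script."""
    if model["type"] != "notebook":
        return  # only act on notebooks, not on plain files or directories
    directory, fname = os.path.split(os_path)
    # Assumption: `jupyter nbconvert` is installed; older setups used
    # `ipython nbconvert` as shown in the deprecation message.
    check_call(["jupyter", "nbconvert", "--to", "script", fname], cwd=directory)


try:
    # `c` only exists when this code runs inside the Jupyter config file.
    c.FileContentsManager.post_save_hook = post_save
except NameError:
    pass
```

After restarting the notebook server, each save of a notebook should then write a matching .py file next to it.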
Can I store a dictionary using np.savez? The results are surprising (to me at least), and I cannot find a way to get my data back by key.
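For reference, a sketch of one way to read such a value back: np.savez wraps a non-array value like a nested dict in a 0-d object array, and calling `.item()` on that array returns the original Python object. The temporary file path is an assumption made for the example, and `allow_pickle=True` is needed on recent NumPy versions because object arrays are stored pickled (only load pickled files from a trusted source).

```python
import os
import tempfile

import numpy as np

a = {'0': {'A': np.array([1, 2, 3]), 'B': np.array([4, 5, 6])}}

# Hypothetical location; any writable path works.
path = os.path.join(tempfile.mkdtemp(), 'model.npz')
np.savez(path, **a)

# The nested dict was saved as a pickled 0-d object array.
with np.load(path, allow_pickle=True) as data:
    inner = data['0'].item()  # unwrap the 0-d array into the original dict

print(inner['B'])
```

If the nesting is not needed, saving the arrays under flat keys (for example `np.savez(path, **a['0'])`) avoids pickling altogether.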
In [1]: a = {'0': {'A': array([1,2,3]), 'B': array([4,5,6])}}
In [2]: a
Out[2]: {'0': {'A': array([1, 2, 3]), 'B': array([4, 5, 6])}}
In [3]: np.savez('model.npz', **a)
In [4]: a = np.load('model.npz')
In [5]: a
Out[5]: <numpy.lib.npyio.NpzFile at 0x7fc9f8acaad0>
In [6]: a['0']
Out[6]: array({'B': array([4, 5, 6]), 'A': array([1, 2, 3])}, dtype=object)
In [7]: a['0']['B']
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
<ipython-input-16-c916b98771c9> in <module>()
----> 1 a['0']['B']
ValueError: field named B not found
In [8]: dict(a['0'])
---------------------------------------------------------------------------
TypeError Traceback …
Is it possible to access decorator attributes in Python 3?
For example: is it possible to access self.misses after calling the decorated fibonacci function?
class Cache:
    def __init__(self, func):
        self.func = func
        self.cache = {}
        self.misses = 0
    def __call__(self, *args):
        if not (args in self.cache):
            self.misses += 1
            self.cache[args] = self.func(*args)
        return self.cache[args]
@Cache
def fibonacci(n):
    return n if n in (0, 1) else fibonacci(n - 1) + fibonacci(n - 2)
fibonacci(20)
### now we want to print the number of cache misses ###
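A short sketch of how this can work: because `@Cache` rebinds the name `fibonacci` to the `Cache` instance itself, the instance attributes are reachable directly through that name. The class below repeats the one from the question so the snippet runs on its own.

```python
class Cache:
    def __init__(self, func):
        self.func = func
        self.cache = {}
        self.misses = 0

    def __call__(self, *args):
        if args not in self.cache:
            self.misses += 1
            self.cache[args] = self.func(*args)
        return self.cache[args]


@Cache
def fibonacci(n):
    # The recursive calls also go through the Cache instance.
    return n if n in (0, 1) else fibonacci(n - 1) + fibonacci(n - 2)


result = fibonacci(20)
# `fibonacci` is now a Cache object, so its attributes are ordinary attributes.
print(result, fibonacci.misses)  # 6765 21
```

Each distinct argument from 0 to 20 misses the cache exactly once, which gives the 21 misses.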