JPM*_*JPM 5 python scale scikit-learn deep-learning keras
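I am working on a signal classification problem and would like to scale the dataset matrix first, but my data is in a 3D format (batch, length, channels).
I tried to use the Scikit-learn Standard Scaler: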
from sklearn.preprocessing import StandardScaler
sc = StandardScaler()
X_train = sc.fit_transform(X_train)
X_test = sc.transform(X_test)
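But I got the following error message:
Found array with dim 3. StandardScaler expected <= 2
I think one solution would be to split the matrix by channel into multiple 2D matrices, scale them separately, and then put them back into 3D format, but I wonder whether there is a better solution.
Thank you very much.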
Mar*_*ani 20
It only takes 3 lines of code...
scaler = StandardScaler()
X_train = scaler.fit_transform(X_train.reshape(-1, X_train.shape[-1])).reshape(X_train.shape)
X_test = scaler.transform(X_test.reshape(-1, X_test.shape[-1])).reshape(X_test.shape)
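To see what the two reshapes do, here is a minimal sketch on synthetic data (the array shapes and random values are assumptions for illustration, not taken from the question). Collapsing the batch and length axes leaves one column per channel, so each channel gets a single mean and standard deviation computed over all samples and time steps.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
# Hypothetical data in (batch, length, channel) layout
X_train = rng.normal(loc=5.0, scale=3.0, size=(8, 100, 4))
X_test = rng.normal(loc=5.0, scale=3.0, size=(2, 100, 4))

n_channels = X_train.shape[-1]
scaler = StandardScaler()
# (batch, length, channel) -> (batch * length, channel), scale, then restore the shape
X_train_scaled = scaler.fit_transform(X_train.reshape(-1, n_channels)).reshape(X_train.shape)
X_test_scaled = scaler.transform(X_test.reshape(-1, n_channels)).reshape(X_test.shape)

# Per-channel statistics of the scaled training set (close to 0 and 1)
print(X_train_scaled.mean(axis=(0, 1)))
print(X_train_scaled.std(axis=(0, 1)))
```

The test set is transformed with the statistics learned from the training set, which is why only transform is called on it.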
Ber*_*man 16
You have to fit and store a scaler for each channel:
from sklearn.preprocessing import StandardScaler

scalers = {}
# Fit one scaler per index along axis 1 and keep it for later use
for i in range(X_train.shape[1]):
    scalers[i] = StandardScaler()
    X_train[:, i, :] = scalers[i].fit_transform(X_train[:, i, :])

# Reuse the scalers fitted on the training set
for i in range(X_test.shape[1]):
    X_test[:, i, :] = scalers[i].transform(X_test[:, i, :])
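As a rough check of what this loop computes, the sketch below (synthetic data; the shapes are assumptions) compares it with plain NumPy. Each slice X[:, i, :] is a 2D array of shape (batch, X.shape[2]), so every scaler standardizes the columns of its slice across the batch. The overall result therefore equals standardizing each (i, j) position over the batch axis.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
X = rng.normal(size=(16, 5, 3))  # hypothetical 3D data

# Work on a copy so the original array stays untouched
X_scaled = X.copy()
scalers = {}
for i in range(X.shape[1]):
    scalers[i] = StandardScaler()
    X_scaled[:, i, :] = scalers[i].fit_transform(X[:, i, :])

# Same computation with NumPy: statistics taken over the batch axis only
X_numpy = (X - X.mean(axis=0)) / X.std(axis=0)
print(np.allclose(X_scaled, X_numpy))
```

If a single mean and standard deviation per channel is wanted instead, the data has to be reshaped so that all samples and time steps of a channel end up in one column.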
lie*_*dji 10
An elegant way of doing this is to use class inheritance, as follows:
from sklearn.preprocessing import MinMaxScaler
import numpy as np

class MinMaxScaler3D(MinMaxScaler):
    def fit_transform(self, X, y=None):
        # Collapse (batch, length) into one axis so the scaler sees a 2D array
        x = np.reshape(X, (X.shape[0] * X.shape[1], X.shape[2]))
        # Scale, then restore the original 3D shape
        return np.reshape(super().fit_transform(x, y=y), X.shape)
Usage:
scaler = MinMaxScaler3D()
X = scaler.fit_transform(X)
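Note that the class above only overrides fit_transform, so calling transform on a 3D test set would still raise the dimensionality error. One possible extension, sketched here under the assumption that the last axis holds the features, overrides fit and transform instead. The class name MinMaxScaler3DFull and the helper _to_2d are hypothetical and not part of scikit-learn.

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler


class MinMaxScaler3DFull(MinMaxScaler):
    """Hypothetical variant that accepts arrays of shape (..., n_features)."""

    @staticmethod
    def _to_2d(X):
        X = np.asarray(X)
        # Collapse all leading axes so that only the feature axis remains
        return X.reshape(-1, X.shape[-1])

    def fit(self, X, y=None):
        return super().fit(self._to_2d(X), y)

    def transform(self, X):
        X = np.asarray(X)
        return super().transform(self._to_2d(X)).reshape(X.shape)


rng = np.random.default_rng(0)
X_train = rng.uniform(-10, 10, size=(6, 20, 3))  # synthetic data
X_test = rng.uniform(-10, 10, size=(2, 20, 3))

scaler = MinMaxScaler3DFull()
# fit_transform is inherited and calls the fit and transform defined above
X_train_scaled = scaler.fit_transform(X_train)
X_test_scaled = scaler.transform(X_test)
print(X_train_scaled.min(axis=(0, 1)), X_train_scaled.max(axis=(0, 1)))
```

With this layout the test set can be scaled using the minimum and maximum learned from the training set.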
If you want to scale each feature separately, as StandardScaler does, you can use this:
import numpy as np
from sklearn.base import TransformerMixin
from sklearn.preprocessing import StandardScaler

class NDStandardScaler(TransformerMixin):
    def __init__(self, **kwargs):
        self._scaler = StandardScaler(copy=True, **kwargs)
        self._orig_shape = None

    def fit(self, X, **kwargs):
        X = np.array(X)
        # Save the original shape to reshape the flattened X later
        # back to its original shape
        if len(X.shape) > 1:
            self._orig_shape = X.shape[1:]
        X = self._flatten(X)
        self._scaler.fit(X, **kwargs)
        return self

    def transform(self, X, **kwargs):
        X = np.array(X)
        X = self._flatten(X)
        X = self._scaler.transform(X, **kwargs)
        X = self._reshape(X)
        return X

    def _flatten(self, X):
        # Reshape X to <= 2 dimensions
        if len(X.shape) > 2:
            n_dims = np.prod(self._orig_shape)
            X = X.reshape(-1, n_dims)
        return X

    def _reshape(self, X):
        # Reshape X back to its original shape
        if len(X.shape) >= 2:
            X = X.reshape(-1, *self._orig_shape)
        return X
It simply flattens the features of the input before passing it to sklearn's StandardScaler, then reshapes the result back. The usage is the same as for StandardScaler:
data = [[[0, 1], [2, 3]], [[1, 5], [2, 9]]]
scaler = NDStandardScaler()
print(scaler.fit_transform(data))
This prints:
[[[-1. -1.]
[ 0. -1.]]
[[ 1. 1.]
[ 0. 1.]]]
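The flattening step can also be reproduced directly with StandardScaler, which may help to see where these values come from. In this sketch every position inside a sample (here 2 x 2 = 4 values) is treated as its own feature. That differs from reshaping to (-1, n_channels), where all positions of a channel share one mean and standard deviation. The variable names are illustrative only.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

data = np.array([[[0, 1], [2, 3]], [[1, 5], [2, 9]]], dtype=float)

# Per-position scaling: flatten each sample to one row of 4 features
flat = data.reshape(data.shape[0], -1)
per_position = StandardScaler().fit_transform(flat).reshape(data.shape)
print(per_position)  # same values as the output shown above

# Per-channel scaling: every row is one (sample, position) pair with 2 channels
per_channel = StandardScaler().fit_transform(
    data.reshape(-1, data.shape[-1])
).reshape(data.shape)
print(per_channel)
```

Which of the two is appropriate depends on whether the positions along the length axis should be normalized independently or share statistics within a channel.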
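The arguments with_mean and with_std are passed directly to StandardScaler and thus work as expected. copy=False won't work, since the reshaping does not happen in place. For 2D inputs, NDStandardScaler works like StandardScaler: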
data = [[0, 0], [0, 0], [1, 1], [1, 1]]
scaler = NDStandardScaler()
scaler.fit(data)
print(scaler.transform(data))
print(scaler.transform([[2, 2]]))
This prints:
[[-1. -1.]
[-1. -1.]
[ 1. 1.]
[ 1. 1.]]
[[3. 3.]]
This is the same result as in sklearn's example for StandardScaler.