How to standard-scale a 3D matrix?

JPM*_*JPM 5 python scale scikit-learn deep-learning keras

I am working on a signal classification problem and would like to scale the dataset matrix first, but my data is in 3D format (batch, length, channels).
I tried to use the Scikit-learn StandardScaler:

from sklearn.preprocessing import StandardScaler
sc = StandardScaler()
X_train = sc.fit_transform(X_train)
X_test = sc.transform(X_test)

But I got the following error message:

Found array with dim 3. StandardScaler expected <= 2.

I think one solution would be to split the matrix into one 2D matrix per channel, scale each of them separately, and then put them back into 3D format (a sketch of that idea follows below), but I wonder if there is a better solution.
Thank you very much.
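A minimal sketch of that split-per-channel idea, assuming X_train and X_test are NumPy arrays of shape (batch, length, channels); the function name is just for illustration. Note that a StandardScaler fitted on a (batch, length) slice learns a separate mean/std for every time step within that channel:

import numpy as np
from sklearn.preprocessing import StandardScaler

def scale_per_channel(X_train, X_test):
    # Scale each channel's (batch, length) slice with its own StandardScaler
    train_out = np.empty_like(X_train, dtype=float)
    test_out = np.empty_like(X_test, dtype=float)
    for c in range(X_train.shape[2]):
        sc = StandardScaler()
        train_out[:, :, c] = sc.fit_transform(X_train[:, :, c])
        test_out[:, :, c] = sc.transform(X_test[:, :, c])
    return train_out, test_out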

Mar*_*ani 20

Just 3 lines of code...

scaler = StandardScaler()
X_train = scaler.fit_transform(X_train.reshape(-1, X_train.shape[-1])).reshape(X_train.shape)
X_test = scaler.transform(X_test.reshape(-1, X_test.shape[-1])).reshape(X_test.shape)
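Here reshape(-1, X.shape[-1]) collapses the batch and length axes, so the scaler sees one row per (sample, time step) pair and learns a single mean and standard deviation per channel. A quick check with dummy data (shapes chosen only for illustration):

import numpy as np
from sklearn.preprocessing import StandardScaler

X_train = np.random.randn(32, 100, 3)   # (batch, length, channels)
X_test = np.random.randn(8, 100, 3)

scaler = StandardScaler()
X_train = scaler.fit_transform(X_train.reshape(-1, X_train.shape[-1])).reshape(X_train.shape)
X_test = scaler.transform(X_test.reshape(-1, X_test.shape[-1])).reshape(X_test.shape)

print(X_train.shape, X_test.shape)   # (32, 100, 3) (8, 100, 3)
print(scaler.mean_.shape)            # (3,) -- one mean per channel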


Ber*_*man 16

You have to fit and store a scaler for each channel:

from sklearn.preprocessing import StandardScaler

scalers = {}
for i in range(X_train.shape[1]):
    scalers[i] = StandardScaler()
    X_train[:, i, :] = scalers[i].fit_transform(X_train[:, i, :]) 

for i in range(X_test.shape[1]):
    X_test[:, i, :] = scalers[i].transform(X_test[:, i, :]) 

  • It doesn't work. Shouldn't it be: `for i in range(X_train.shape[1]):` (2 upvotes)
  • No, I think it should be X_train[:, :, i] = scalers[i].fit_transform(X_train[:, :, i]). At least for me, when my data is structured as (batch, samples, rows, columns) (2 upvotes)
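For data laid out as (batch, length, channels) as in the question, a per-channel variant along the last axis (the adjustment suggested in the second comment) might look like the following sketch; this is not the answerer's original code. Unlike the flatten-and-reshape approach above, each scaler here learns a separate mean/std for every time step within its channel:

from sklearn.preprocessing import StandardScaler

scalers = {}
for i in range(X_train.shape[2]):     # one scaler per channel (last axis)
    scalers[i] = StandardScaler()
    X_train[:, :, i] = scalers[i].fit_transform(X_train[:, :, i])

for i in range(X_test.shape[2]):
    X_test[:, :, i] = scalers[i].transform(X_test[:, :, i])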

lie*_*dji 10

An elegant way of doing this is to use class inheritance, as follows:


from sklearn.preprocessing import MinMaxScaler
import numpy as np

class MinMaxScaler3D(MinMaxScaler):

    def fit_transform(self, X, y=None):
        x = np.reshape(X, newshape=(X.shape[0]*X.shape[1], X.shape[2]))
        return np.reshape(super().fit_transform(x, y=y), newshape=X.shape)


Usage:


scaler = MinMaxScaler3D()
X = scaler.fit_transform(X)

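Note that only fit_transform is overridden, so calling transform on a separate 3D test set would still raise the dimension error. If that is needed, a transform override in the same style could be added; a sketch along the same lines (not part of the original answer):

from sklearn.preprocessing import MinMaxScaler
import numpy as np

class MinMaxScaler3D(MinMaxScaler):

    def fit_transform(self, X, y=None):
        # Collapse (batch, length) into one axis, scale, then restore the 3D shape
        x = np.reshape(X, newshape=(X.shape[0] * X.shape[1], X.shape[2]))
        return np.reshape(super().fit_transform(x, y=y), newshape=X.shape)

    def transform(self, X):
        # Same reshape trick, using the statistics learned in fit_transform
        x = np.reshape(X, newshape=(X.shape[0] * X.shape[1], X.shape[2]))
        return np.reshape(super().transform(x), newshape=X.shape)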


Kil*_*ner 5

If you want to scale each feature separately, the way StandardScaler does, you can use the following:

import numpy as np
from sklearn.base import TransformerMixin
from sklearn.preprocessing import StandardScaler


class NDStandardScaler(TransformerMixin):
    def __init__(self, **kwargs):
        self._scaler = StandardScaler(copy=True, **kwargs)
        self._orig_shape = None

    def fit(self, X, **kwargs):
        X = np.array(X)
        # Save the original shape to reshape the flattened X later
        # back to its original shape
        if len(X.shape) > 1:
            self._orig_shape = X.shape[1:]
        X = self._flatten(X)
        self._scaler.fit(X, **kwargs)
        return self

    def transform(self, X, **kwargs):
        X = np.array(X)
        X = self._flatten(X)
        X = self._scaler.transform(X, **kwargs)
        X = self._reshape(X)
        return X

    def _flatten(self, X):
        # Reshape X to <= 2 dimensions
        if len(X.shape) > 2:
            n_dims = np.prod(self._orig_shape)
            X = X.reshape(-1, n_dims)
        return X

    def _reshape(self, X):
        # Reshape X back to its original shape
        if len(X.shape) >= 2:
            X = X.reshape(-1, *self._orig_shape)
        return X

It simply flattens the features of the input before feeding it to sklearn's StandardScaler, and then reshapes them back afterwards. Usage is the same as for StandardScaler:

data = [[[0, 1], [2, 3]], [[1, 5], [2, 9]]]
scaler = NDStandardScaler()
print(scaler.fit_transform(data))

which prints

[[[-1. -1.]
  [ 0. -1.]]

 [[ 1.  1.]
  [ 0.  1.]]]

The parameters with_mean and with_std are passed directly to StandardScaler and work as expected. copy=False won't work, because the reshaping does not happen in place. For 2D inputs, NDStandardScaler behaves just like StandardScaler:

data = [[0, 0], [0, 0], [1, 1], [1, 1]]
scaler = NDStandardScaler()
scaler.fit(data)
print(scaler.transform(data))
print(scaler.transform([[2, 2]]))

which prints

[[-1. -1.]
 [-1. -1.]
 [ 1.  1.]
 [ 1.  1.]]
[[3. 3.]]

just like in the example in sklearn's documentation for StandardScaler.