Sha*_*07. 1 numpy machine-learning data-analysis pandas scikit-learn
saleprice_scaled = \
    StandardScaler().fit_transform(df_train['SalePrice'][:,np.newaxis]);
Why is newaxis used here? I know what newaxis does, but I can't figure out its purpose in this particular case.
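df_train['SalePrice'] is a pandas Series (a vector, i.e. a 1D array) of shape (N,), where N is the number of elements.
Recent scikit-learn versions (0.17+) discourage 1D arrays (vectors) as input data; their methods expect 2D arrays.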
df_train['SalePrice'][:,np.newaxis]
converts the 1D array (shape: N elements) into a 2D array (shape: N rows, 1 column).
Demo:
In [21]: df = pd.DataFrame(np.random.randint(10, size=(5, 3)), columns=list('abc'))
In [22]: df
Out[22]:
   a  b  c
0  4  3  8
1  7  5  6
2  1  3  9
3  7  5  7
4  7  0  6
In [23]: from sklearn.preprocessing import StandardScaler
In [24]: df['a'].shape
Out[24]: (5,) # <--- 1D array
In [25]: df['a'][:, np.newaxis].shape
Out[25]: (5, 1) # <--- 2D array
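The same reshaping can be sketched on a plain NumPy array. This is a minimal illustration rather than part of the original answer: the array `a` simply reuses the values of column `a` from the demo, and the variable names are made up for the example. Note that newer pandas releases may no longer allow `[:, np.newaxis]` directly on a Series, in which case converting to a NumPy array first (for example with `.to_numpy()`) is the usual workaround.

```python
import numpy as np

# Same values as column 'a' in the demo above (assumed for illustration).
a = np.array([4, 7, 1, 7, 7])

# Three equivalent ways to turn a 1D array into a single-column 2D array.
col_newaxis = a[:, np.newaxis]   # add a new axis of length 1
col_none = a[:, None]            # np.newaxis is an alias for None
col_reshape = a.reshape(-1, 1)   # -1 lets NumPy infer the number of rows

print(a.shape)            # (5,)
print(col_newaxis.shape)  # (5, 1)
print(col_reshape.shape)  # (5, 1)
```

All three results hold the same data; which spelling to use is mostly a matter of taste, although `reshape(-1, 1)` is the one the scikit-learn warning message itself suggests.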
There is also a pandas way to do this:
In [26]: df[['a']].shape
Out[26]: (5, 1) # <--- 2D array
In [27]: StandardScaler().fit_transform(df[['a']])
Out[27]:
array([[-0.5 ],
       [ 0.75],
       [-1.75],
       [ 0.75],
       [ 0.75]])
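To see where the numbers in that output come from, the standardization can be reproduced by hand with NumPy. This is only a sketch: it assumes the usual definition (subtract the column mean, divide by the population standard deviation, i.e. `ddof=0`), and the variable names are illustrative.

```python
import numpy as np

# Column 'a' from the demo, as a single-column 2D array of floats.
X = np.array([4, 7, 1, 7, 7], dtype=float).reshape(-1, 1)

# Per-column mean and population standard deviation (ddof=0).
mean = X.mean(axis=0)
std = X.std(axis=0)

# Standardize: zero mean and unit variance for each column.
scaled = (X - mean) / std

print(scaled.shape)  # (5, 1)
print(scaled.ravel())  # approximately [-0.5, 0.75, -1.75, 0.75, 0.75]
```

Because the input has shape (5, 1), the mean and standard deviation are computed per column, which is why the 2D layout (rows are samples, columns are features) matters.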
What happens if we pass a 1D array:
In [28]: StandardScaler().fit_transform(df['a'])
C:\Users\Max\Anaconda4\lib\site-packages\sklearn\utils\validation.py:429: DataConversionWarning: Data with input dtype int32 was converted to float64 by StandardScaler.
  warnings.warn(msg, _DataConversionWarning)
C:\Users\Max\Anaconda4\lib\site-packages\sklearn\preprocessing\data.py:586: DeprecationWarning: Passing 1d arrays as data is deprecated in 0.17 and will raise ValueError in 0.19. Reshape your data either using X.reshape(-1, 1) if your data has a single feature or X.reshape(1, -1) if it contains a single sample.
  warnings.warn(DEPRECATION_MSG_1D, DeprecationWarning)
C:\Users\Max\Anaconda4\lib\site-packages\sklearn\preprocessing\data.py:649: DeprecationWarning: Passing 1d arrays as data is deprecated in 0.17 and will raise ValueError in 0.19. Reshape your data either using X.reshape(-1, 1) if your data has a single feature or X.reshape(1, -1) if it contains a single sample.
  warnings.warn(DEPRECATION_MSG_1D, DeprecationWarning)
Out[28]: array([-0.5 , 0.75, -1.75, 0.75, 0.75])
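Following the advice in the warning, a version-tolerant way to scale a single column might look like the sketch below. It assumes pandas 0.24+ (for `Series.to_numpy()`) and a scikit-learn release that has `StandardScaler`; the DataFrame and variable names are made up for the example. On releases that enforce the 2D requirement, the 1D call is expected to raise `ValueError` instead of warning, so the sketch records that outcome rather than relying on it.

```python
import numpy as np
import pandas as pd
from sklearn.preprocessing import StandardScaler

# Hypothetical frame reusing column 'a' from the demo.
df = pd.DataFrame({"a": [4, 7, 1, 7, 7]})

# Convert to NumPy first, then reshape to (n_samples, 1) as the warning suggests.
X = df["a"].to_numpy().reshape(-1, 1)
scaled_2d = StandardScaler().fit_transform(X)

# Flatten back to 1D if a plain vector is needed afterwards.
scaled_1d = scaled_2d.ravel()

# Passing the 1D Series directly: older releases warn, newer ones raise.
try:
    StandardScaler().fit_transform(df["a"])
    raised = False
except ValueError:
    raised = True

print(scaled_2d.shape)  # (5, 1)
print(raised)           # depends on the installed scikit-learn version
```

Selecting the column with double brackets, `df[['a']]`, achieves the same 2D shape without any reshaping and keeps the code independent of how a Series handles `np.newaxis`.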