pan*_*Box 3 python scipy dataframe pandas
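I am transforming a variable in a pandas DataFrame, and then I want to replace the column with the new values. The problem appears to be that after the transformation, the length of the array differs from the length of my DataFrame's index. I don't believe that is actually the case.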
>>> df['variable'] = stats.boxcox(df.variable)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "C:\Users\eMachine\WinPython-64bit-2.7.5.3\python-2.7.5.amd64\lib\site-packages\pandas\core\frame.py", line 2119, in __setitem__
self._set_item(key, value)
File "C:\Users\eMachine\WinPython-64bit-2.7.5.3\python-2.7.5.amd64\lib\site-packages\pandas\core\frame.py", line 2165, in _set_item
value = self._sanitize_column(key, value)
File "C:\Users\eMachine\WinPython-64bit-2.7.5.3\python-2.7.5.amd64\lib\site-packages\pandas\core\frame.py", line 2205, in _sanitize_column
raise AssertionError('Length of values does not match '
AssertionError: Length of values does not match length of index
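When I check the lengths, they appear inconsistent. len() of the result says it is 2, but when I call stats.boxcox the output says the length is 50000. What is going on here?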
>>> len(df)
50000
>>> len(stats.boxcox(df.variable))
2
>>> stats.boxcox(df.variable)
(0 -0.079496
1 -0.117982
2 -0.104637
...
49985 -0.041300
49986 0.651771
49987 -0.115660
49988 -0.118034
49998 -0.118014
49999 -0.034076
Name: feat9, Length: 50000, dtype: float64, 8.4721358117221772)
>>>
Bre*_*arn 11
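You can see in your example that the result of boxcox is a tuple. This is consistent with the documentation, which states that boxcox returns a tuple of the transformed data and the lambda value. Note the example on that page: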
xt, _ = stats.boxcox(x)
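...which again shows that boxcox returns a 2-tuple. That is why len() reports 2: it counts the elements of the tuple, not the 50000 transformed values inside its first element.
You should do df['variable'] = stats.boxcox(df.variable)[0].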
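As a minimal sketch of the fix, the snippet below unpacks the tuple before assigning. The DataFrame contents are made up for illustration (random strictly positive values, since Box-Cox requires positive input), and the names rng, result, transformed and fitted_lambda are assumptions that do not appear in the question.

```python
import numpy as np
import pandas as pd
from scipy import stats

# Hypothetical stand-in for the asker's data: strictly positive values,
# because the Box-Cox transform is only defined for positive input.
rng = np.random.default_rng(0)
df = pd.DataFrame({"variable": rng.lognormal(size=1000)})

# With lmbda left unspecified, boxcox returns a 2-tuple:
# (transformed data, fitted lambda). len() of that tuple is 2.
result = stats.boxcox(df["variable"].to_numpy())

# Unpack the tuple and assign only the transformed values to the column.
transformed, fitted_lambda = result
df["variable"] = transformed
```

Unpacking into two names does the same job as indexing with [0], and it keeps the fitted lambda around in case the transform needs to be inverted later.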