Tags: numpy, python-3.x, scikit-learn
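I am trying to split my data into training and test sets with "train_test_split". Why do I get the error "At least one array required as input"?
The inputs to "train_test_split" can be arrays or DataFrames, right?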
import pandas as pd
import numpy as np
from rpy2.robjects.packages import importr
import rpy2.robjects as ro
import pandas.rpy.common as rpy_common
from sklearn.model_selection import train_test_split
def la():
    ro.r('library(MASS)')
    pydf = rpy_common.load_data(name = 'Boston', package=None, convert=True)
    pddf = pd.DataFrame(pydf)
    targetIndex = pddf.columns.get_loc("medv")
    # make train and test data
    rowNum = pddf.shape[0]
    colNum = pddf.shape[1]
    print(type(pddf.as_matrix()))
    print(pddf.as_matrix().shape)
    m = np.asarray(pddf.as_matrix()).reshape(rowNum,colNum)
    print(type(m))
    x_train, x_test, y_train, y_test = train_test_split(x = m[:, 0:rowNum-2], \
                                                        y = m[:, -1],\
                                                        test_size = 0.5)
# error: raise ValueError("At least one array required as input")
ValueError: At least one array required as input
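According to the sklearn docs, "train_test_split" takes its arrays as unpacked positional arguments ("*args").
You are passing them as the keyword arguments "x=" and "y=", so the function tries to interpret "x" and "y" as names of its special keyword options and never receives any arrays.
Try: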
train_test_split(m[:, 0:rowNum-2], m[:, -1], test_size=0.5)
(That is, remove the keyword argument names from the arrays.)
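To make the difference concrete, here is a minimal sketch that compares the two call styles. It rests on several assumptions: the random array `m` is only a stand-in for the Boston data (506 rows, 14 columns, with the target "medv" as the last column), and the names `X` and `y` are illustrative. The feature slice is written as `m[:, :-1]` because `0:rowNum-2` bounds the columns by the row count, which would probably also select the target column. The exception raised by the keyword call seems to depend on the scikit-learn version (the `ValueError` from the question on older releases, likely a `TypeError` on newer ones), so the sketch catches both.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the Boston data (assumption): 506 rows, 14 columns,
# with the target column last.
rng = np.random.default_rng(0)
m = rng.normal(size=(506, 14))

X = m[:, :-1]   # every column except the last one (features)
y = m[:, -1]    # last column (target)

# Positional arrays: this is the form the answer recommends.
x_train, x_test, y_train, y_test = train_test_split(
    X, y, test_size=0.5, random_state=0
)
print(x_train.shape, x_test.shape, y_train.shape, y_test.shape)

# Keyword arrays: rejected, because x= and y= are not parameters of the function.
try:
    train_test_split(x=X, y=y, test_size=0.5)
except (TypeError, ValueError) as exc:
    print(type(exc).__name__, exc)
```

Passing `random_state` is optional; it is included here only so the split is reproducible.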