类型错误:train_test_split() 得到了一个意外的关键字参数“test_size”

use*_*200 0 python machine-learning scikit-learn

我正在尝试使用随机森林方法找到最佳特征集,我需要将数据集拆分为测试和训练。这是我的代码

from sklearn.model_selection import train_test_split

def train_test_split(x,y):
    # split data train 70 % and test 30 %
    x_train, x_test, y_train, y_test = train_test_split(x, y,train_size=0.3,random_state=42)
    #normalization
    x_train_N = (x_train-x_train.mean())/(x_train.max()-x_train.min())
    x_test_N = (x_test-x_test.mean())/(x_test.max()-x_test.min())

train_test_split(data,data_y)
Run Code Online (Sandbox Code Playgroud)

参数 data,data_y 解析正确。但我收到以下错误。我想不通这是为什么。

在此处输入图片说明

Gen*_*neX 5

您在代码中使用的函数名称与 sklearn.preprocessing 中的函数名称相同,更改函数名称即可完成这项工作。像这样的东西,

from sklearn.model_selection import train_test_split

def my_train_test_split(x,y):
    # split data train 70 % and test 30 %
    x_train, x_test, y_train, y_test = train_test_split(x,y,train_size=0.3,random_state=42)
    #normalization
    x_train_N = (x_train-x_train.mean())/(x_train.max()-x_train.min())
    x_test_N = (x_test-x_test.mean())/(x_test.max()-x_test.min())

my_train_test_split(data,data_y)
Run Code Online (Sandbox Code Playgroud)

说明:- 尽管在 python 中有方法重载(即根据参数类型选择同名函数),但在您的情况下,这两个函数都需要相同类型的参数,因此不同的命名是唯一可能的解决方案海事组织。