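I have a pandas DataFrame with 86k rows, 5 features, and 1 target column. I am trying to train a DecisionTreeClassifier using 70% of the DataFrame as training data, and I get a MemoryError from the fit method. I have tried changing some parameters, but I don't really know what is causing the error, so I don't know how to handle it. I am on Windows 10 with 8 GB of RAM.
Code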
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

train, test = train_test_split(data, test_size=0.3)
X_train = train.iloc[:, 1:-1] # first column is not a feature
y_train = train.iloc[:, -1]
X_test = test.iloc[:, 1:-1]
y_test = test.iloc[:, -1]
DT = DecisionTreeClassifier()
DT.fit(X_train, y_train)
dt_predictions = DT.predict(X_test)
Error
File (...), line 97, in <module>
DT.fit(X_train, y_train)
File "(...)\AppData\Local\Programs\Python\Python36-32\lib\site-packages\sklearn\tree\tree.py", line 790, in fit
X_idx_sorted=X_idx_sorted)
File "(...)\AppData\Local\Programs\Python\Python36-32\lib\site-packages\sklearn\tree\tree.py", line 362, in fit
builder.build(self.tree_, X, y, sample_weight, X_idx_sorted)
File "sklearn\tree\_tree.pyx", line 145, in sklearn.tree._tree.DepthFirstTreeBuilder.build
File "sklearn\tree\_tree.pyx", line 244, in …
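One detail in the traceback may be worth checking: the install path contains Python36-32, which suggests (but does not prove) that the interpreter is a 32-bit build. A 32-bit process on Windows gets roughly 2 GB of address space by default, no matter how much RAM the machine has. The sketch below uses only the standard library to report the interpreter's pointer width; the helper names `interpreter_bits` and `is_32bit_build` are made up for illustration and are not part of scikit-learn or pandas.

```python
import struct
import sys


def interpreter_bits() -> int:
    """Return the pointer width of the running interpreter, in bits."""
    # struct.calcsize("P") is the size of a C pointer in bytes.
    return struct.calcsize("P") * 8


def is_32bit_build() -> bool:
    """Return True if this interpreter is a 32-bit build."""
    # sys.maxsize is 2**31 - 1 on 32-bit builds and 2**63 - 1 on 64-bit builds.
    return sys.maxsize <= 2**32


bits = interpreter_bits()
print(f"Python {sys.version_info.major}.{sys.version_info.minor} is a {bits}-bit build")
if is_32bit_build():
    # Assumption: on 32-bit Windows processes the usable address space is
    # limited (about 2 GB by default), regardless of installed RAM.
    print("32-bit process: memory is capped well below the installed RAM")
```

If this reports a 32-bit build, the MemoryError could come from the process limit rather than from the 8 GB of physical RAM. Whether that is the actual cause here cannot be confirmed from the truncated traceback alone.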