Sklearn partial dependence plot raises ValueError: percentiles are too close to each other

Afs*_*ooy 5 python-3.x scikit-learn sklearn-pandas

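I want to plot the partial dependence of the target on some of my input variables. I trained a gradient boosting model with sklearn and then ran sklearn.inspection.plot_partial_dependence on the fitted model. However, I get the error "ValueError: percentiles are too close to each other, unable to build the grid. Please choose percentiles that are further apart". Any idea how to fix this?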

Here is my code:

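    import matplotlib.pyplot as plt
    # On scikit-learn < 1.0 the estimator is experimental and also needs:
    # from sklearn.experimental import enable_hist_gradient_boosting  # noqa
    from sklearn.ensemble import HistGradientBoostingRegressor
    from sklearn.inspection import plot_partial_dependence

    columns = ['zip', 't-s', 'r', 'f', 'm-t', 'ir', 'if', 'n-d-m', 't-n-d', 'a-d-f-l-d-t', 'a-d-f-l-a-t', 'a']

    print("Training GradientBoostingRegressor...")
    est = HistGradientBoostingRegressor()
    est.fit(inputsTrain, outputsTrain)
    print("Test R2 score: {:.2f}".format(est.score(inputsTest, outputsTest)))

    print('Computing partial dependence plots...')
    # One plot per column, plus a two-way plot for the ('zip', 'r') pair
    features = columns + [('zip', 'r')]
    plot_partial_dependence(est, inputsTrain, features,
                            n_jobs=3, grid_resolution=20)
    fig = plt.gcf()
    fig.subplots_adjust(wspace=0.4, hspace=0.3)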

I get the following error:

joblib.externals.loky.process_executor._RemoteTraceback: 
Traceback (most recent call last):
  File "/home/anac/anaconda3/lib/python3.6/site-packages/joblib/externals/loky/process_executor.py", line 418, in _process_worker
    r = call_item()
  File "/home/anac/anaconda3/lib/python3.6/site-packages/joblib/externals/loky/process_executor.py", line 272, in __call__
    return self.fn(*self.args, **self.kwargs)
  File "/home/anac/anaconda3/lib/python3.6/site-packages/joblib/_parallel_backends.py", line 608, in __call__
    return self.func(*args, **kwargs)
  File "/home/anac/anaconda3/lib/python3.6/site-packages/joblib/parallel.py", line 256, in __call__
    for func, args, kwargs in self.items]
  File "/home/anac/anaconda3/lib/python3.6/site-packages/joblib/parallel.py", line 256, in <listcomp>
    for func, args, kwargs in self.items]
  File "/home/anac/anaconda3/lib/python3.6/site-packages/sklearn/utils/validation.py", line 72, in inner_f
    return f(**kwargs)
  File "/home/anac/anaconda3/lib/python3.6/site-packages/sklearn/inspection/_partial_dependence.py", line 404, in partial_dependence
    grid_resolution
  File "/home/anac/anaconda3/lib/python3.6/site-packages/sklearn/inspection/_partial_dependence.py", line 94, in _grid_from_X
    'percentiles are too close to each other, '
ValueError: percentiles are too close to each other, unable to build the grid. Please choose percentiles that are further apart.
"""

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/home/anac/pycharm-2020.2/plugins/python/helpers/pydev/pydevd.py", line 2141, in <module>
    main()
  File "/home/anac/pycharm-2020.2/plugins/python/helpers/pydev/pydevd.py", line 2132, in main
    globals = debugger.run(setup['file'], None, None, is_module)
  File "/home/anac/pycharm-2020.2/plugins/python/helpers/pydev/pydevd.py", line 1441, in run
    return self._exec(is_module, entry_point_fn, module_name, file, globals, locals)
  File "/home/anac/pycharm-2020.2/plugins/python/helpers/pydev/pydevd.py", line 1448, in _exec
    pydev_imports.execfile(file, globals, locals)  # execute the script
  File "home/anac/pycharm-2020.2/plugins/python/helpers/pydev/_pydev_imps/_pydev_execfile.py", line 18, in execfile
    exec(compile(contents+"\n", file, 'exec'), glob, loc)
  File "/home/anac/project/train_env.py", line 514, in <module>
    gradient_boosting_pdp(inputsTrain, outputsTrain, inputsValid, outputsValid, inputsTest, outputsTest)
  File "home/anac/project/utilities.py", line 271, in gradient_boosting_pdp
    n_jobs=3, grid_resolution=20)
  File "/home/anac/anaconda3/lib/python3.6/site-packages/sklearn/utils/validation.py", line 72, in inner_f
    return f(**kwargs)
  File "/home/anac/anaconda3/lib/python3.6/site-packages/sklearn/inspection/_plot/partial_dependence.py", line 286, in plot_partial_dependence
    for fxs in features)
  File "/home/anac/anaconda3/lib/python3.6/site-packages/joblib/parallel.py", line 1017, in __call__
    self.retrieve()
  File "/home/anac/anaconda3/lib/python3.6/site-packages/joblib/parallel.py", line 909, in retrieve
    self._output.extend(job.get(timeout=self.timeout))
  File "/home/anac/anaconda3/lib/python3.6/site-packages/joblib/_parallel_backends.py", line 562, in wrap_future_result
    return future.result(timeout=timeout)
  File "/home/anac/anaconda3/lib/python3.6/concurrent/futures/_base.py", line 432, in result
    return self.__get_result()
  File "/home/anac/anaconda3/lib/python3.6/concurrent/futures/_base.py", line 384, in __get_result
    raise self._exception
ValueError: percentiles are too close to each other, unable to build the grid. Please choose percentiles that are further apart.

yak*_*ir0 4

This is late, but I'll answer anyway.

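You need to pass a percentiles argument to plot_partial_dependence (or partial_dependence). The default is (0.05, 0.95); if you get this exception, replace it with a pair (a, b) where a and b are further apart than 0.05 and 0.95. The furthest apart you can go is (0, 1).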

In your case, you could write:

    plot_partial_dependence(est, inputsTrain, features, n_jobs=3, grid_resolution=20, percentiles=(0, 1))
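If you want to find out which columns trigger the exception before widening the percentiles, something like the following may help. As far as I know, the grid for each feature is built between the two chosen percentiles of that column, so the error tends to show up when a column has most of its mass on a single value. This is only a minimal sketch on synthetic data, assuming NumPy: find_narrow_features is a hypothetical helper (not part of scikit-learn), and scikit-learn's own quantile estimate may differ slightly from np.quantile.

```python
import numpy as np


def find_narrow_features(X, feature_names, percentiles=(0.05, 0.95)):
    """Return the names of columns whose lower and upper percentiles coincide.

    Hypothetical helper for illustration only; it approximates the kind of
    check that makes grid construction fail, it is not scikit-learn code.
    """
    X = np.asarray(X, dtype=float)
    # np.quantile with two probabilities returns an array of shape (2, n_features)
    lower, upper = np.quantile(X, percentiles, axis=0)
    narrow = np.isclose(lower, upper)
    return [name for name, flag in zip(feature_names, narrow) if flag]


# Synthetic data (an assumption, not the asker's data):
# "spread" is roughly normal, "skewed" is zero for 97% of the rows.
rng = np.random.default_rng(0)
spread = rng.normal(size=1000)
skewed = np.zeros(1000)
skewed[:30] = rng.uniform(1.0, 100.0, size=30)
X_demo = np.column_stack([spread, skewed])

# With the default percentiles the skewed column collapses to a single value
print(find_narrow_features(X_demo, ["spread", "skewed"]))  # prints ['skewed']

# With (0, 1) the grid spans min..max, so no column is flagged
print(find_narrow_features(X_demo, ["spread", "skewed"], percentiles=(0, 1)))  # prints []
```

Keep in mind that percentiles=(0, 1) stretches the grid out to the extreme values, so the ends of the resulting curves are supported by very few samples and should be read with caution.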