dil*_*dar 5 python java pipeline machine-learning pmml
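I am trying to save a Pipeline object as PMML, but Python raises a RuntimeError.
My Python version is 3.6, my sklearn2pmml version is 0.44.0, and my JDK version is 1.8.0_201.
All of these meet the package's prerequisites.
Here is what I have done so far. (I am leaving out the data loading and cleaning parts.)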
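from sklearn.feature_extraction.text import CountVectorizer, TfidfTransformer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline
from sklearn2pmml.pipeline import PMMLPipeline
from sklearn2pmml import make_pmml_pipeline, sklearn2pmml
# Plain scikit-learn pipeline: vectorizer, TF-IDF weighting, classifier
logit_pipline = Pipeline([('vect', CountVectorizer(ngram_range=(1,2))), ('tfidf', TfidfTransformer(use_idf=True)), ('clf', LogisticRegression(C=11.3))])
# The whole pipeline above is wrapped as the single step of a PMMLPipeline
pmml_pipeline = PMMLPipeline([("logit", logit_pipline)])
pmml_pipeline.fit(X, Y)
sklearn2pmml(pmml_pipeline, 'logit.pmml', with_repr=True)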
Here is what happens after I run the last line shown above:
sklearn2pmml(pmml_pipeline, 'logit.pmml', with_repr=True)
Standard output is empty
Standard error:
Apr 30, 2019 11:59:04 AM org.jpmml.sklearn.Main run
INFO: Parsing PKL..
Apr 30, 2019 11:59:04 AM org.jpmml.sklearn.Main run
INFO: Parsed PKL in 230 ms.
Apr 30, 2019 11:59:04 AM org.jpmml.sklearn.Main run
INFO: Converting..
Apr 30, 2019 11:59:04 AM org.jpmml.sklearn.Main run
SEVERE: Failed to convert
java.lang.IllegalArgumentException: Expected an estimator object as the last step, got a transformer object (Python class sklearn.pipeline.Pipeline)
at sklearn2pmml.pipeline.PMMLPipeline.getEstimator(PMMLPipeline.java:541)
at sklearn2pmml.pipeline.PMMLPipeline.encodePMML(PMMLPipeline.java:93)
at org.jpmml.sklearn.Main.run(Main.java:145)
at org.jpmml.sklearn.Main.main(Main.java:94)
Exception in thread "main" java.lang.IllegalArgumentException: Expected an estimator object as the last step, got a transformer object (Python class sklearn.pipeline.Pipeline)
at sklearn2pmml.pipeline.PMMLPipeline.getEstimator(PMMLPipeline.java:541)
at sklearn2pmml.pipeline.PMMLPipeline.encodePMML(PMMLPipeline.java:93)
at org.jpmml.sklearn.Main.run(Main.java:145)
at org.jpmml.sklearn.Main.main(Main.java:94)
Traceback (most recent call last):
File "<ipython-input-129-f5c307b4aaba>", line 1, in <module>
sklearn2pmml(pmml_pipeline, 'logit.pmml', with_repr=True)
File "C:\ProgramData\Anaconda3\lib\site-packages\sklearn2pmml\__init__.py", line 252, in sklearn2pmml
raise RuntimeError("The JPMML-SkLearn conversion application has failed. The Java executable should have printed more information about the failure into its standard output and/or standard error streams")
RuntimeError: The JPMML-SkLearn conversion application has failed. The Java executable should have printed more information about the failure into its standard output and/or standard error streams
According to some people, this is a JDK compatibility issue: using JDK 1.9 or above, or 1.6 or below, is said to trigger this kind of problem. But since my JDK version is one that sklearn2pmml accepts, why does this error occur?
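As the underlying Java exception indicates, the class sklearn2pmml.pipeline.PMMLPipeline expects to be parameterized with a list of steps in which the last step holds an estimator object. In your case, you are parameterizing PMMLPipeline with a single-element list of steps; its last step holds a Pipeline object, which in this sense is not an estimator object.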
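The condition described above can be checked on the Python side before calling the converter. What follows is a minimal sketch, not the converter's own logic: `last_step_is_nested` is a hypothetical helper, and it assumes only that a pipeline exposes its steps as a `steps` list of `(name, object)` pairs, as scikit-learn's `Pipeline` does. `FakeEstimator` and `FakePipeline` are stand-ins so that the snippet runs without scikit-learn installed.

```python
def last_step_is_nested(pipeline):
    """Return True if the final step of `pipeline` is itself a pipeline.

    Duck-typed assumption: any object with a `steps` attribute is a pipeline.
    """
    _name, last = pipeline.steps[-1]
    return hasattr(last, "steps")


# Hypothetical stand-ins, used only so the sketch runs without scikit-learn.
class FakeEstimator:
    pass


class FakePipeline:
    def __init__(self, steps):
        self.steps = list(steps)


inner = FakePipeline([("vect", FakeEstimator()), ("clf", FakeEstimator())])
nested = FakePipeline([("logit", inner)])  # same shape as pmml_pipeline in the question
flat = FakePipeline(inner.steps)           # estimator sits directly in the last step

print(last_step_is_nested(nested))  # True
print(last_step_is_nested(flat))    # False
```

Applied to the question's object, the same test would flag `pmml_pipeline`, because its only step holds `logit_pipline`.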
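To fix this, simply drop the intermediate logit_pipline layer (what is the idea behind wrapping a pipeline inside a pipeline?).
For example, this would work: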
# Build the PMMLPipeline directly from the individual steps
logit_pipeline = PMMLPipeline([..])
logit_pipeline.fit(X, y)
sklearn2pmml(logit_pipeline, "logit.pmml")
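If the nested pipeline already exists and rebuilding it by hand is inconvenient, its steps can be lifted into a flat list first. The sketch below is one way to do that under stated assumptions: `flatten_steps` is a hypothetical helper, not part of sklearn2pmml or scikit-learn, and it treats any object with a `steps` attribute as a pipeline. The `Fake*` classes are stand-ins so the snippet runs on its own.

```python
def flatten_steps(steps):
    """Expand nested pipelines into one flat list of (name, object) pairs."""
    flat = []
    for name, obj in steps:
        if hasattr(obj, "steps"):
            # A nested pipeline: splice its own steps in place of the wrapper.
            flat.extend(flatten_steps(obj.steps))
        else:
            flat.append((name, obj))
    return flat


# Hypothetical stand-ins, used only so the sketch runs without scikit-learn.
class FakeEstimator:
    pass


class FakePipeline:
    def __init__(self, steps):
        self.steps = list(steps)


inner = FakePipeline([
    ("vect", FakeEstimator()),
    ("tfidf", FakeEstimator()),
    ("clf", FakeEstimator()),
])
nested_steps = [("logit", inner)]

flat_steps = flatten_steps(nested_steps)
print([name for name, _ in flat_steps])  # ['vect', 'tfidf', 'clf']
```

With the real objects, the flattened list would be passed to the PMMLPipeline constructor before fitting. Step names must stay unique after flattening, so nested pipelines that reuse a name would need renaming first.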
This problem has nothing whatsoever to do with the JDK, Python, or Scikit-Learn versions.