pyspark: specifying categorical variables in randomForests with "categoricalFeaturesInfo"

Kua*_* CK 2 decision-tree categories random-forest apache-spark pyspark

How do you specify categoricalFeaturesInfo in pyspark randomForests?

The documentation isn't very clear, and I've tried things like:

categoricalFeaturesInfo = {(12,4)}

categoricalFeaturesInfo = {(12-> 4)}

categoricalFeaturesInfo = {Map [int,int](12,4)}

...and so on, but none of them work. Any help is greatly appreciated.

dpe*_*ock 5

From the Python documentation:

categoricalFeaturesInfo: Map storing arity of categorical
             features.  E.g., an entry (n -> k) indicates that
             feature n is categorical with k categories indexed
             from 0: {0, 1, ..., k-1}.

Try using:

categoricalFeaturesInfo = {12:4}
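For context, the attempts in the question fail because {(12,4)} is a Python set containing a tuple, (12-> 4) is not valid Python syntax, and Map[int,int] is Scala notation; in Python the map is a plain dict of feature index to number of categories. Below is a minimal sketch of how such a dict might be built and passed along. The helper names build_categorical_features_info and train_forest are hypothetical, numClasses=2 and numTrees=10 are placeholder values, and the sketch assumes the categorical columns are already encoded as 0, 1, ..., k-1 as the docstring requires.

```python
def build_categorical_features_info(rows, categorical_indices):
    """Return {feature_index: arity} for the given categorical columns.

    Hypothetical helper: assumes each categorical column is already
    encoded as the integers 0..k-1, so the arity is max value + 1.
    """
    info = {}
    for idx in categorical_indices:
        values = [row[idx] for row in rows]
        # Categories must be non-negative whole numbers indexed from 0.
        if any(v < 0 or v != int(v) for v in values):
            raise ValueError("feature %d is not encoded as 0..k-1" % idx)
        info[idx] = int(max(values)) + 1
    return info


def train_forest(labeled_points_rdd, categorical_features_info):
    """Train an MLlib random forest classifier with the given dict.

    Assumes labeled_points_rdd is an RDD of LabeledPoint; numClasses and
    numTrees below are placeholder values, not recommendations.
    """
    # Imported lazily so the helper above works without Spark installed.
    from pyspark.mllib.tree import RandomForest

    return RandomForest.trainClassifier(
        labeled_points_rdd,
        numClasses=2,
        categoricalFeaturesInfo=categorical_features_info,
        numTrees=10,
    )
```

If a feature has many categories, maxBins (default 32) may need to be raised so that it is at least as large as the largest arity in the dict.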