Jim*_*phy 5 python json decision-tree apache-spark pyspark
One problem I've run into with Apache Spark is visualizing decision trees.
I can produce a tree with DecisionTree.trainClassifier, and I can get some basic output from it using:
print(model.toDebugString())
But ideally, the current output:
If (feature 0 <= -35.0)
If (feature 24 <= 176.0)
Predict: 2.1
If (feature 24 = 176.0)
Predict: 4.2
Else (feature 24 > 176.0)
Predict: 6.3
Else (feature 0 > -35.0)
If (feature 24 <= 11.0)
Predict: 4.5
Else (feature 24 > 11.0)
Predict: 10.2
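could instead be emitted as JSON, or some other parseable format, so that we can lay it out as a hierarchy with the D3 visualization library. Using the example above, something like: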
{
"node": [
{
"name":"node1",
"rule":"feature 0 <= -35.0",
"children":[
{
"name":"node2",
"rule":"feature 24 <= 176.0",
"children":[
{
"name":"node4",
"rule":"feature 20 < 116.0",
"predict": 2.1
},
{
"name":"node5",
"rule":"feature 20 = 116.0",
"predict": 4.2
},
{
"name":"node6",
"rule":"feature 20 > 116.0",
"predict": 6.3
}
]
},
{
"name":"node3",
"rule":"feature 0 > -35.0",
"children":[
{
"name":"node7",
"rule":"feature 3 <= 11.0",
"predict": 4.5
},
{
"name":"node8",
"rule":"feature 3 > 11.0",
"predict": 10.2
}
]
}
]
}
]
}
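For reference, here is a minimal sketch of the kind of conversion being asked about, in plain Python with no Spark dependency. It assumes the text is made up only of "If (...)", "Else (...)" and "Predict: ..." lines, that each group of sibling branches ends with an "Else", and that every prediction is numeric. The helper name debug_string_to_tree, the synthetic "root" wrapper, and the pre-order node numbering are illustrative choices, not part of any Spark API, and the resulting layout differs slightly from the hand-written JSON above.

```python
import json
import re

# Branch headers such as "If (feature 0 <= -35.0)" or "Else (feature 0 > -35.0)".
BRANCH = re.compile(r"^(If|Else)\s*\((.*)\)$")
# Leaf lines such as "Predict: 2.1" (assumes a numeric prediction).
PREDICT = re.compile(r"^Predict:\s*(\S+)$")


def debug_string_to_tree(debug_string):
    """Hypothetical helper: turn If/Else/Predict text into a nested dict."""
    lines = [ln.strip() for ln in debug_string.splitlines()]
    # Drop anything that is not a branch or a leaf (e.g. a summary header line).
    lines = [ln for ln in lines if BRANCH.match(ln) or PREDICT.match(ln)]
    pos = 0
    counter = 0

    def parse_branches():
        """Read sibling branches until the closing "Else" branch is consumed."""
        nonlocal pos, counter
        branches = []
        while pos < len(lines):
            match = BRANCH.match(lines[pos])
            if match is None:
                raise ValueError("expected If/Else, got: " + lines[pos])
            keyword, rule = match.groups()
            pos += 1
            counter += 1
            node = {"name": "node%d" % counter, "rule": rule}
            leaf = PREDICT.match(lines[pos]) if pos < len(lines) else None
            if leaf:
                node["predict"] = float(leaf.group(1))
                pos += 1
            else:
                # The next line opens a nested group of branches.
                node["children"] = parse_branches()
            branches.append(node)
            if keyword == "Else":
                break
        return branches

    # A tree that is a single leaf has no branches at all.
    if lines and PREDICT.match(lines[0]):
        return {"name": "root", "predict": float(PREDICT.match(lines[0]).group(1))}
    return {"name": "root", "children": parse_branches()}


# The example output from the question, pasted in as a string.
sample = """
If (feature 0 <= -35.0)
 If (feature 24 <= 176.0)
  Predict: 2.1
 If (feature 24 = 176.0)
  Predict: 4.2
 Else (feature 24 > 176.0)
  Predict: 6.3
Else (feature 0 > -35.0)
 If (feature 24 <= 11.0)
  Predict: 4.5
 Else (feature 24 > 11.0)
  Predict: 10.2
"""

tree = debug_string_to_tree(sample)
print(json.dumps(tree, indent=2))
```

With a trained model, the input would presumably be model.toDebugString() rather than a pasted string. Because the sketch follows the If/Else keywords instead of indentation, it would need adjusting if the real output uses a different line format.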