I'm trying to display a .png file
that I built with the following:
import pydot, StringIO
dot_data = StringIO.StringIO()
tree.export_graphviz( clf, out_file = dot_data,
feature_names =['age', 'sex', 'first_class', 'second_class', 'third_class'])
graph = pydot.graph_from_dot_data( dot_data.getvalue())
graph.write_png('titanic.png')
from IPython.core.display import Image
Image( filename ='titanic.png')
I've never done this before, so I'd really appreciate your help.
I tried the following, but I get no error and no .png
is displayed.
from PIL import Image
image = Image.open("titanic.png")
image.show()
The following code reads the cleaned Titanic data and prints all of the features and their scores.
import csv
import numpy as np
data = np.genfromtxt('titanic.csv',dtype=float, delimiter=',', names=True)
feature_names = np.array(data.dtype.names)
feature_names = feature_names[[ 0,1,2,3,4]]
data = np.genfromtxt('plants.csv',dtype=float, delimiter=',', skip_header=1)
_X = data[:, [0,1,2,3,4]]
#Return a flattened array required by scikit-learn fit for 2nd argument
_y = np.ravel(data[:,[5]])
from sklearn import feature_selection
fs = feature_selection.SelectPercentile(feature_selection.chi2, percentile=20)
X_train_fs = fs.fit_transform(_X, _y)
print feature_names, '\n', fs.scores_
Result:
['A' 'B' 'C' 'D' 'E']
[ 4.7324711 89.1428574 70.23474577 7.02447375 52.42447817]
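What I want to do is capture the top 20% of the features and store their names and scores in an array that I can then sort by score. This would help me reduce dimensionality on larger feature sets. Why do I get all 5 features, how do I fix this, and how do I store and print the names and scores of the top 20% of features?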
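One likely explanation is that `fs.scores_` always holds a score for every input feature, and `feature_names` was never filtered. The selection itself shows up in the transformed array and in the boolean mask returned by `fs.get_support()`. Below is a minimal sketch of filtering names and scores with that mask. The data is synthetic (an assumption standing in for the CSV, which is not available here), and it is written for Python 3 with a recent scikit-learn.

```python
import numpy as np
from sklearn.feature_selection import SelectPercentile, chi2

# Synthetic, non-negative stand-in data (chi2 requires non-negative features).
# These arrays are hypothetical; they replace the CSV used in the question.
rng = np.random.RandomState(0)
_y = rng.randint(0, 2, size=200)
_X = rng.randint(0, 10, size=(200, 5)).astype(float)
_X[:, 1] += 8 * _y  # make feature 'B' depend strongly on the label
feature_names = np.array(['A', 'B', 'C', 'D', 'E'])

fs = SelectPercentile(chi2, percentile=20)
X_fs = fs.fit_transform(_X, _y)

# scores_ covers ALL input features; get_support() marks the ones that were kept.
mask = fs.get_support()
top = sorted(zip(feature_names[mask], fs.scores_[mask]),
             key=lambda pair: pair[1], reverse=True)

for name, score in top:
    print(name, score)
```

With 5 features and `percentile=20`, only one column survives, so `X_fs` has a single column and `top` has a single (name, score) pair.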
I'm trying to print out the decision trees of a forest from the scikit-learn ensemble module. For example, for a DecisionTreeClassifier I would use:
from sklearn import tree
clf = tree.DecisionTreeClassifier( criterion ='entropy', max_depth = 3,
min_samples_leaf =
clf = clf.fit( X_train, y_train) #Input this to analyze the training set.
import pydot, StringIO
dot_data = StringIO.StringIO()
tree.export_graphviz( clf, out_file = dot_data,
feature_names =['age', 'sex', 'first_class', 'second_class', 'third_class'])
graph = pydot.graph_from_dot_data( dot_data.getvalue())
graph.write_png('visualtree.png')
from IPython.core.display import Image
Image( filename ='visualtree.png')
I tried a similar approach for a Random Forest Regressor (see below) and got an error:
# Fit regression model
from sklearn.ensemble import RandomForestRegressor
rfr_1 = RandomForestRegressor(n_estimators=10, max_depth=5)
rfr_1.fit(X, y)
from sklearn.ensemble import*
import pydot, …
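The error itself is not shown, so the following is only a guess at the cause: `export_graphviz` works on a single decision tree, not on a whole forest. A fitted forest keeps its individual trees in the `estimators_` attribute, and each of those can be exported on its own. The sketch below uses synthetic data (an assumption, since the question's `X` and `y` are not given) and assumes Python 3 with a scikit-learn version where `export_graphviz(..., out_file=None)` returns the DOT source as a string.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.tree import export_graphviz

# Hypothetical stand-in data; the question's X and y are not available.
rng = np.random.RandomState(0)
X = rng.rand(100, 5)
y = X[:, 0] + 2.0 * X[:, 1] + 0.1 * rng.randn(100)
feature_names = ['age', 'sex', 'first_class', 'second_class', 'third_class']

rfr_1 = RandomForestRegressor(n_estimators=10, max_depth=5, random_state=0)
rfr_1.fit(X, y)

# Export every tree of the forest separately as DOT source.
dot_sources = []
for estimator in rfr_1.estimators_:
    dot = export_graphviz(estimator, out_file=None,
                          feature_names=feature_names)
    dot_sources.append(dot)

print(len(dot_sources))
print(dot_sources[0][:40])
```

Each string in `dot_sources` can then go through the same pydot steps used for the single classifier above, writing one PNG per tree. Depending on the pydot version, `graph_from_dot_data` may return a list of graphs rather than a single graph.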
How do I compute left eigenvectors in Python?
>>> import numpy as np
>>> from scipy.linalg import eig
>>> np.set_printoptions(precision=4)
>>> T = np.mat("0.2 0.4 0.4;0.8 0.2 0.0;0.8 0.0 0.2")
>>> print "T\n", T
T
[[ 0.2 0.4 0.4]
[ 0.8 0.2 0. ]
[ 0.8 0. 0.2]]
>>> w, vl, vr = eig(T, left=True)
>>> vl
array([[ 0.8165, 0.8165, 0. ],
[ 0.4082, -0.4082, -0.7071],
[ 0.4082, -0.4082, 0.7071]])
This doesn't seem correct, and Google hasn't been much help on this!
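A left eigenvector v of T satisfies v·T = λ·v, which is the same as saying that v is a right eigenvector of the transpose of T. The sketch below checks that relation with NumPy alone, assuming Python 3 and using a plain array instead of `np.mat`. Since every row of this particular T sums to 1, the left eigenvector for eigenvalue 1 can also be rescaled to sum to 1, which makes it easier to read.

```python
import numpy as np

T = np.array([[0.2, 0.4, 0.4],
              [0.8, 0.2, 0.0],
              [0.8, 0.0, 0.2]])

# Left eigenvectors of T are the right eigenvectors of T transposed.
w, vl = np.linalg.eig(T.T)

# Check the defining relation v . T == lambda * v for every column of vl.
residuals = [np.abs(np.dot(vl[:, i], T) - w[i] * vl[:, i]).max()
             for i in range(len(w))]

# Pick the eigenvector belonging to eigenvalue 1 and rescale it to sum to 1.
k = np.argmin(np.abs(w - 1.0))
pi = np.real(vl[:, k])
pi = pi / pi.sum()

print(max(residuals))
print(pi)
```

The rescaled vector is proportional to the first column of `vl` printed in the question (0.8165, 0.4082, 0.4082), which suggests that output was correct and differed only in normalization: the columns are scaled to unit Euclidean length and their sign is arbitrary.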