Can't understand the plotting of decision boundaries in SVM and LR


For example, say we have f(x) = x. How do we plot it? We take some values of x, compute y for each, and then draw the graph point by point. Simple and clear.

But I cannot understand how a decision boundary gets plotted so clearly, when there is no y to plot, only x.
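For a plain function, the point-by-point approach is just a few lines (a minimal matplotlib sketch, with a made-up range):

import numpy as np
import matplotlib.pyplot as plt

x = np.linspace(-5, 5, 100)  # take some x values
y = x                        # compute f(x) = x for each of them
plt.plot(x, y)               # draw the graph point by point
plt.show()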

Python code for the SVM:

import numpy as np
import pylab as pl
from sklearn import svm

# X is assumed to be an (n_samples, 2) array of training features and
# y the class labels (the original sklearn example uses the first two
# features of the iris dataset)
h = .02  # step size in the mesh
Y = y
# we create an instance of SVM and fit our data. We do not scale our
# data since we want to plot the support vectors
C = 1.0  # SVM regularization parameter
svc = svm.SVC(kernel='linear', C=C).fit(X, Y)
rbf_svc = svm.SVC(kernel='rbf', gamma=0.7, C=C).fit(X, Y)
poly_svc = svm.SVC(kernel='poly', degree=3, C=C).fit(X, Y)
lin_svc = svm.LinearSVC(C=C).fit(X, Y)

# create a mesh to plot in
x_min, x_max = X[:, 0].min() - 1, X[:, 0].max() + 1
y_min, y_max = X[:, 1].min() - 1, X[:, 1].max() + 1
xx, yy = np.meshgrid(np.arange(x_min, x_max, h),
                     np.arange(y_min, y_max, h))


for i, clf in enumerate((svc, rbf_svc, poly_svc, lin_svc)):
    # Plot the decision boundary. For that, we will assign a color to each
    # point in the mesh [x_min, x_max] x [y_min, y_max].

Everything that draws the plot is here, as far as I understand:

    pl.subplot(2, 2, i + 1)
    Z = clf.predict(np.c_[xx.ravel(), yy.ravel()])

    # Put the result into a color plot
    Z = Z.reshape(xx.shape)
    pl.contourf(xx, yy, Z, cmap=pl.cm.Paired)
    pl.axis('off')

    # Plot also the training points
    pl.scatter(X[:, 0], X[:, 1], c=Y, cmap=pl.cm.Paired)

pl.show()

Can someone explain in words how this kind of plotting works?


Basically, you are plotting the function f : R^2 -> {0,1}, i.e. a function from the 2-dimensional space into the degenerate space of only two values, 0 and 1.

First, you generate the mesh on which you want to visualize your function. In the case of your example with f(x) = x, you would select some interval [x_min, x_max], take points spaced some small distance eps apart, and plot the corresponding values of f:

x_min, x_max = X[:, 0].min() - 1, X[:, 0].max() + 1
y_min, y_max = X[:, 1].min() - 1, X[:, 1].max() + 1
xx, yy = np.meshgrid(np.arange(x_min, x_max, h),
                     np.arange(y_min, y_max, h))
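To see what the mesh actually is, here is a tiny sketch (grid values chosen only for illustration) showing that np.meshgrid returns the x and y coordinates of every grid point:

import numpy as np

xx, yy = np.meshgrid(np.arange(0, 3), np.arange(0, 2))
print(xx)  # [[0 1 2]
           #  [0 1 2]]  x coordinate of each grid point
print(yy)  # [[0 0 0]
           #  [1 1 1]]  y coordinate of each grid point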

Next, we calculate the function values; in our case that is the SVM's predict function, which returns either 0 or 1:

Z = clf.predict(np.c_[xx.ravel(), yy.ravel()])

This is the same as computing f(x) for all the analyzed x in your example.
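Concretely, np.c_[xx.ravel(), yy.ravel()] flattens both coordinate grids and stacks them column-wise into one (n_points, 2) array, so each row is a single mesh point the classifier can predict on. Continuing the tiny grid from the sketch above:

import numpy as np

xx, yy = np.meshgrid(np.arange(0, 3), np.arange(0, 2))
points = np.c_[xx.ravel(), yy.ravel()]  # shape (6, 2), one mesh point per row
print(points)
# [[0 0]
#  [1 0]
#  [2 0]
#  [0 1]
#  [1 1]
#  [2 1]]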

Now, the "tricky" part which may lead to missunderstanding is

pl.contourf(xx, yy, Z, cmap=pl.cm.Paired)

This function plots the contours of your function f. To visualize a 3-dimensional function on a plane, one often creates a contour plot; it is like a height map of your function. A line is drawn between neighboring points whenever a large change in the value of f is detected between them.

A nice example from MathWorld shows such a plot:

[figure: sample contour plot]
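A minimal sketch of such a height-map plot for a smooth function (f(x, y) = x^2 + y^2, chosen only for illustration):

import numpy as np
import matplotlib.pyplot as plt

xx, yy = np.meshgrid(np.linspace(-2, 2, 200), np.linspace(-2, 2, 200))
zz = xx**2 + yy**2          # a smooth "height" function over the plane
plt.contourf(xx, yy, zz)    # filled contours: a map of the heights
plt.colorbar()
plt.show()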

In the case of an SVM we have only two possible values, 0 and 1, so the contour lines are located exactly in those parts of your 2D space where f(x) = 0 on one side and f(x) = 1 on the other. So even though it seems like a "2D plot", it is not: the shape you observe (the decision boundary) is a visualization of the biggest differences in the 3D function.
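Putting the whole pipeline together, here is a self-contained sketch on synthetic two-class data (make_blobs and all parameter values are made up for illustration; they are not from the question):

import numpy as np
import matplotlib.pyplot as plt
from sklearn import svm
from sklearn.datasets import make_blobs

# synthetic 2-feature, 2-class data
X, y = make_blobs(n_samples=100, centers=2, random_state=0)
clf = svm.SVC(kernel='linear', C=1.0).fit(X, y)

h = .02
x_min, x_max = X[:, 0].min() - 1, X[:, 0].max() + 1
y_min, y_max = X[:, 1].min() - 1, X[:, 1].max() + 1
xx, yy = np.meshgrid(np.arange(x_min, x_max, h),
                     np.arange(y_min, y_max, h))

# f(x) for every mesh point: 0 on one side of the boundary, 1 on the other
Z = clf.predict(np.c_[xx.ravel(), yy.ravel()]).reshape(xx.shape)

plt.contourf(xx, yy, Z, cmap=plt.cm.Paired)  # color jumps exactly at the boundary
plt.scatter(X[:, 0], X[:, 1], c=y, cmap=plt.cm.Paired)
plt.show()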

The sklearn documentation visualizes this for a multi-class example, where we have f : R^2 -> {0,1,2}. The idea is exactly the same, but a contour is plotted between adjacent points x1 and x2 such that f(x1) != f(x2).

[figure: SVM multi-class decision boundaries]