Training set
trainSample <- cbind(data[1:980,1], data[1:980,2])
cl <- factor(c(data[1:980,3]))
Test set
testSample <- cbind(data[981:1485,1], data[981:1485,2])
cl.test <- clknn
Prediction
k <- knn(trainSample, testSample, cl, k = 5)
Output
> k
[1] 2 2 1 1 1 1 2 1 2 1 1 2 2 2 2 2 1 1 2 2 2 2 2 2 2 2 2 2 2 1 2 2 1 1 2 2 1 1 2 2 2 2 1 2 2 2 2 2 2 1 2 2 2 2 2 2 2 2 2
[60] 2 2 2 2 1 2 2 2 2 1 2 2 1 2 2 2 1 1 2 1 2 2 1 1 1 2 1 2 2 2 1 2 2 2 2 2 1 2 1 2 2 2 2 2 2 2 2 1 2 2 2 2 1 2 2 2 2 2 2
[119] 2 2 2 1 2 2 2 2 2 2 2 1 2 2 2 2 2 2 2 2 2 1 2 2 2 2 1 2 1 1 1 1 2 2 2 2 2 2 2 2 1 2 1 2 2 2 2 2 2 1 2 2 1 2 1 2 2 2 2
[178] 2 2 2 2 1 1 2 2 2 2 2 2 2 2 2 1 1 1 1 2 2 2 2 2 2 2 2 2 2 1 2 2 2 2 2 2 2 1 2 1 2 2 2 2 2 2 2 2 2 2 2 2 2 1 2 2 2 2 1
[237] 2 2 2 2 2 1 2 2 1 2 2 1 2 2 2 2 2 1 2 2 2 2 2 2 2 1 2 2 2 2 2 2 1 2 2 1 2 2 2 2 1 2 1 2 2 2 2 1 1 2 1 2 2 2 2 1 2 2 2
[296] 2 2 2 1 2 1 2 1 1 1 2 1 2 2 1 1 2 2 1 2 1 2 2 1 2 2 2 1 2 2 2 2 2 1 2 2 2 1 2 2 2 1 2 2 2 2 2 2 2 1 2 1 1 2 2 2 1 1 2
[355] 1 2 1 2 1 2 1 2 2 2 2 2 2 1 1 1 2 1 2 1 2 2 2 2 2 2 2 2 2 2 2 2 2 2 1 2 2 1 2 2 2 2 2 1 2 1 2 2 2 2 2 2 2 2 2 2 2 2 2
[414] 2 2 1 2 2 2 2 2 2 2 2 2 1 1 2 2 2 1 2 2 2 2 2 2 2 2 2 2 1 2 2 2 2 2 2 2 1 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2
[473] 2 2 2 2 2 1 1 2 2 2 2 2 1 2 2 1 1 2 2 1 2 2 1 2 1 2 2 1 2 2 2 2 2
Levels: 1 2
I would like to get "c" and "not-c" (as in my original data.csv) instead of 1 and 2 (I am also not sure which number stands for which label).
Can anyone help?
Lyz*_*deR 24
Changing factor levels is easy, and there is no need to be confused about which is which:
Example data:
> a <- factor(rep(c(1,2,1),50))
> a
[1] 1 2 1 1 2 1 1 2 1 1 2 1 1 2 1 1 2 1 1 2 1 1 2 1 1 2 1 1 2 1 1 2 1 1 2 1 1 2 1 1 2 1 1 2 1 1 2 1 1 2 1 1 2 1 1 2 1 1 2 1 1 2 1 1 2 1 1 2 1 1 2 1 1 2
[75] 1 1 2 1 1 2 1 1 2 1 1 2 1 1 2 1 1 2 1 1 2 1 1 2 1 1 2 1 1 2 1 1 2 1 1 2 1 1 2 1 1 2 1 1 2 1 1 2 1 1 2 1 1 2 1 1 2 1 1 2 1 1 2 1 1 2 1 1 2 1 1 2 1 1
[149] 2 1
Levels: 1 2
#this will help later as a verification
#this counts the instances for 1 and 2
> table(a)
a
  1   2
100  50
So, as you can see above, the order of the levels is 1 first and 2 second. When you change the levels (below), the order stays the same:
#the assignment function levels can be used to change the levels
#the order will remain the same i.e. 'c' for '1' and 'not-c' for '2'
> levels(a) <- c('c', 'not-c')
> a
[1] c not-c c c not-c c c not-c c c not-c c c not-c c c not-c c c not-c c c not-c c
[25] c not-c c c not-c c c not-c c c not-c c c not-c c c not-c c c not-c c c not-c c
[49] c not-c c c not-c c c not-c c c not-c c c not-c c c not-c c c not-c c c not-c c
[73] c not-c c c not-c c c not-c c c not-c c c not-c c c not-c c c not-c c c not-c c
[97] c not-c c c not-c c c not-c c c not-c c c not-c c c not-c c c not-c c c not-c c
[121] c not-c c c not-c c c not-c c c not-c c c not-c c c not-c c c not-c c c not-c c
[145] c not-c c c not-c c
Levels: c not-c
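Applied to the question, one option is to avoid the numeric codes altogether. As far as I know, knn() from the class package returns a factor with the same levels as the cl argument, so a cl built from the text labels should give predictions labelled c / not-c directly. A minimal sketch with made-up data (train, test and cl here are hypothetical stand-ins, not the data from the question):

```r
library(class)  # provides knn(); a recommended package, normally installed with R

set.seed(1)
# Hypothetical training data: 20 points around (0, 0) and 20 around (5, 5)
train <- rbind(matrix(rnorm(40, mean = 0), ncol = 2),
               matrix(rnorm(40, mean = 5), ncol = 2))
# Class labels kept as text, so the factor levels are "c" and "not-c"
cl <- factor(rep(c("c", "not-c"), each = 20))
# Two hypothetical test points, one near each cluster
test <- rbind(c(0, 0), c(5, 5))

pred <- knn(train, test, cl, k = 5)
levels(pred)  # the levels of cl are carried over to the predictions
```

In the question, the extra c() inside factor(c(data[1:980,3])) may be what drops the labels: if that column was read in as a factor, c() returns the integer codes in R versions before 4.1.0. Under that assumption, factor(data[1:980,3]) would keep the labels.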
Here is the verification:
> table(a)
a
    c not-c
  100    50
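On the question of which number stands for which label: if the 1/2 values came from a factor, they are positions in levels(), and factor() sorts the levels by default, so c would be 1 and not-c would be 2. A small sketch of checking this and mapping codes back to labels, using a hypothetical labels vector as a stand-in for the third column of data.csv:

```r
# Hypothetical stand-in for the label column of data.csv
labels <- c("not-c", "c", "c", "not-c", "c")

f <- factor(labels)
levels(f)               # sorted by default: "c" first, "not-c" second
codes <- as.integer(f)  # integer codes are positions in levels(f)

# Map the codes back to labels. Giving levels = explicitly keeps the
# mapping correct even if one of the codes never occurs in the data.
restored <- factor(codes,
                   levels = seq_along(levels(f)),
                   labels = levels(f))
restored
```

Compared with levels(x) <- c('c', 'not-c'), this form does not rely on every code being present in the vector.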
Subset assignment also works. For example, here is a factor:
> a <- factor(sample(letters[1:5], 100, replace = TRUE))
> a
[1] a d d d d a d d a b a b e a c d a c a a b e e d a e d e e a a c a a a b a
[38] b b a a e b d b c a a a b e b c e d d b b c c a b a d c b c c d e b d e d
[75] a a a b e e c b c b c c d d e e d a e e e b c e b e
Levels: a b c d e
Now, let's give new names to a couple of the levels:
> levels(a)[c(2,4)] <- c('y','z')
> a
[1] a z z z z a z z a y a y e a c z a c a a y e e z a e z e e a a c a a a y a
[38] y y a a e y z y c a a a y e y c e z z y y c c a y a z c y c c z e y z e z
[75] a a a y e e c y c y c c z z e e z a e e e y c e y e
Levels: a y c z e
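If you would rather not depend on the position of a level, the same subset assignment can select levels by their current name. A short sketch on a small hypothetical factor (not the random one above):

```r
# Small hypothetical factor with levels a, b, c, d, e
a <- factor(c("a", "b", "c", "d", "e", "b", "d"))

# Rename by matching the current level name instead of its index
levels(a)[levels(a) == "b"] <- "y"
levels(a)[levels(a) == "d"] <- "z"

levels(a)
a
```

This gives the same result as levels(a)[c(2,4)] <- c('y','z') here, but it keeps working if the level order changes.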