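Sorry for the poor title. I wasn't sure how to phrase it.
I'm playing with the earth package, looking at regressing neural network signals on more or less standard indicators. The data file has 1000 rows and currently 187 columns (186 indicator results), with my target variable in the last column. The code I wrote is very simple and for now ignores any in-sample vs. out-of-sample issues, but at least it seems to work: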
library(earth)
MyData = read.csv("C:\\Users\\TSIT\\\\GS-Pass12.csv",header=TRUE)
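# Predictors: every column except the last. The parentheses matter: 1:n-1 parses as (1:n)-1.
x=data.frame(MyData[,1:(ncol(MyData)-1)])
# Response: the last column.
y=MyData[,ncol(MyData)]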
a = earth(x,y,nprune=5)
summary(a, digits = 2, style = "pmax")
The summary output looks reasonable:
summary(a, digits = 2, style = "pmax")
Call: earth(x=x, y=y, nprune=5)
y = 1.2
- 31 * pmax(0, Percent.Difference.from.Moving.Average..C..10. - 0.096)
+ 10 * pmax(0, 0.096 - Percent.Difference.from.Moving.Average..C..10.)
+ 25 * pmax(0, Percent.Difference.from.Moving.Average..C..15. - 0.14)
- 16 * pmax(0, 0.14 - Percent.Difference.from.Moving.Average..C..15.)
Selected 5 of 116 terms, and 2 of 185 predictors
Importance: Percent.Difference.from.Moving.Average..C..15., Value.Oscillator..C..8..26..1.-unused, ...
Number of terms at each degree of interaction: 1 4 (additive model)
GCV 0.083    RSS 239    GRSq 0.66    RSq 0.66
What I'm struggling with now is how to get the resulting model (y) out of a and into some kind of R variable so that I can use it. Can someone point me in the right direction?
Thanks in advance.
The format() function can be used:
R> library(earth)
R> example(earth)
[... stuff omitted ...]
R> cat(format(a), "\n")
27.2459
+ 6.17669 * h(Girth-14)
- 3.26623 * h(14-Girth)
+ 0.491207 * h(Height-72)
R>
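Since format() returns a character string, the model text can be assigned to a variable instead of only being printed. The following is a minimal sketch, assuming a model fitted to the built-in trees data (which is what the Girth and Height terms above suggest); the names model_string and preds are made up for illustration. If the goal is just to apply the model to data, predict() does that without going through the text form at all.

```r
library(earth)

# Assumption: a model like the one from example(earth), fitted to the
# built-in trees data set.
a <- earth(Volume ~ ., data = trees)

# format() returns a character string, so it can be stored in a variable.
model_string <- format(a, style = "pmax")
is.character(model_string)

# The fitted coefficients are also available as plain numbers.
coef(a)

# predict() applies the fitted model to new rows directly.
preds <- predict(a, newdata = trees[1:3, ])
preds
```

The exact knots and coefficients may differ from the output shown above depending on the installed version of earth.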
Other formats are available as well:
R> cat(format(a, style="pmax"), "\n")
27.2459
+ 6.17669 * pmax(0, Girth - 14)
- 3.26623 * pmax(0, 14 - Girth)
+ 0.491207 * pmax(0, Height - 72)
R> cat(format(a, style="bf"), "\n")
27.2459
+ 6.17669 * bf1
- 3.26623 * bf2
+ 0.491207 * bf3
bf1 h(Girth-14)
bf2 h(14-Girth)
bf3 h(Height-72)
R>
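The pmax style is convenient because it is already valid R, so the text can be turned into something callable. Below is a rough sketch in base R only, using the coefficients printed above; predict_from_text is a hypothetical helper name, not part of earth. One thing to watch: the string contains line breaks, and R would parse the first line as a complete expression on its own, so the sketch collapses them before parsing.

```r
# Model text copied from the format(a, style = "pmax") output shown above.
model_text <- "27.2459
  + 6.17669 * pmax(0, Girth - 14)
  - 3.26623 * pmax(0, 14 - Girth)
  + 0.491207 * pmax(0, Height - 72)"

# Collapse the line breaks first: R would otherwise parse "27.2459" as a
# complete expression and treat each following line as a separate one.
one_line <- gsub("\n", " ", model_text, fixed = TRUE)
model_expr <- parse(text = one_line)[[1]]

# Hypothetical helper: evaluate the expression with the columns of
# `newdata` visible as variables.
predict_from_text <- function(newdata) eval(model_expr, envir = newdata)

preds <- predict_from_text(trees)  # `trees` ships with R's datasets package
head(preds)
```

Because the printed coefficients are rounded, values computed this way can differ slightly from what predict() returns on the fitted object.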