假设我们有以下数据框:
> df
A B C
1 1 2 3
2 4 5 6
3 7 8 9
Run Code Online (Sandbox Code Playgroud)
我们可以从索引中选择列'B':
> df[,2]
[1] 2 5 8
Run Code Online (Sandbox Code Playgroud)
有没有办法从列标签('B')获取索引(2)?
说我有一个训练在数据帧中设置train的柱子ColA,ColB,ColC,等其中的一个列指定一个二进制类,说柱Class,用"是"或"否"的价值观.
我正在尝试一些二元分类器,例如:
library(klaR)
mynb <- NaiveBayes(Class ~ ColA + ColB + ColC, train)
Run Code Online (Sandbox Code Playgroud)
我想在循环中运行上面的代码,自动生成公式中所有可能的列组合,即:
mynb <- append(mynb, NaiveBayes(Class ~ ColA, train)
mynb <- append(mynb, NaiveBayes(Class ~ ColA + ColB, train)
mynb <- append(mynb, NaiveBayes(Class ~ ColA + ColB + ColC, train)
...
mynb <- append(mynb, NaiveBayes(Class ~ ColB + ColC + ColD, train)
...
Run Code Online (Sandbox Code Playgroud)
如何为涉及数据框列的每个可能的线性模型自动生成公式?
假设我们有两个数字向量x和y.之间的Pearson相关系数x和y由下式给出
cor(x,y)
如何才能自动仅考虑计算中的一部分(x和y90%)以最大化相关系数?
我需要自动检测2D图中的凹陷,如下图中标有红色圆圈的区域.我只对"主"倾角感兴趣,这意味着倾角必须跨越x轴的最小长度.逢低的数量是未知的,即不同的图将包含不同数量的下降.有任何想法吗?

更新:
根据要求,这里是样本数据,并尝试使用中值过滤来平滑它,如藤蔓所示.
看起来我现在需要一种可靠的方法来近似每个点的导数,这会忽略数据中保留的小亮点.有没有标准方法?
y <- c(0.9943,0.9917,0.9879,0.9831,0.9553,0.9316,0.9208,0.9119,0.8857,0.7951,0.7605,0.8074,0.7342,0.6374,0.6035,0.5331,0.4781,0.4825,0.4825,0.4879,0.5374,0.4600,0.3668,0.3456,0.4282,0.3578,0.3630,0.3399,0.3578,0.4116,0.3762,0.3668,0.4420,0.4749,0.4556,0.4458,0.5084,0.5043,0.5043,0.5331,0.4781,0.5623,0.6604,0.5900,0.5084,0.5802,0.5802,0.6174,0.6124,0.6374,0.6827,0.6906,0.7034,0.7418,0.7817,0.8311,0.8001,0.7912,0.7912,0.7540,0.7951,0.7817,0.7644,0.7912,0.8311,0.8311,0.7912,0.7688,0.7418,0.7232,0.7147,0.6906,0.6715,0.6681,0.6374,0.6516,0.6650,0.6604,0.6124,0.6334,0.6374,0.5514,0.5514,0.5412,0.5514,0.5374,0.5473,0.4825,0.5084,0.5126,0.5229,0.5126,0.5043,0.4379,0.4781,0.4600,0.4781,0.3806,0.4078,0.3096,0.3263,0.3399,0.3184,0.2820,0.2167,0.2122,0.2080,0.2558,0.2255,0.1921,0.1766,0.1732,0.1205,0.1732,0.0723,0.0701,0.0405,0.0643,0.0771,0.1018,0.0587,0.0884,0.0884,0.1240,0.1088,0.0554,0.0607,0.0441,0.0387,0.0490,0.0478,0.0231,0.0414,0.0297,0.0701,0.0502,0.0567,0.0405,0.0363,0.0464,0.0701,0.0832,0.0991,0.1322,0.1998,0.3146,0.3146,0.3184,0.3578,0.3311,0.3184,0.4203,0.3578,0.3578,0.3578,0.4282,0.5084,0.5802,0.5667,0.5473,0.5514,0.5331,0.4749,0.4037,0.4116,0.4203,0.3184,0.4037,0.4037,0.4282,0.4513,0.4749,0.4116,0.4825,0.4918,0.4879,0.4918,0.4825,0.4245,0.4333,0.4651,0.4879,0.5412,0.5802,0.5126,0.4458,0.5374,0.4600,0.4600,0.4600,0.4600,0.3992,0.4879,0.4282,0.4333,0.3668,0.3005,0.3096,0.3847,0.3939,0.3630,0.3359,0.2292,0.2292,0.2748,0.3399,0.2963,0.2963,0.2385,0.2531,0.1805,0.2531,0.2786,0.3456,0.3399,0.3491,0.4037,0.3885,0.3806,0.2748,0.2700,0.2657,0.2963,0.2865,0.2167,0.2080,0.1844,0.2041,0.1602,0.1416,0.2041,0.1958,0.1018,0.0744,0.0677,0.0909,0.0789,0.0723,0.0660,0.1322,0.1532,0.1060,0.1018,0.1060,0.1150,0.0789,0.1266,0.0965,0.1732,0.1766,0.1766,0.1805,0.2820,0.3096,0.2602,0.2080,0.2333,0.2385,0.2385,0.2432,0.1602,0.2122,0.2385,0.2333,0.2558,0.2432,0.2292,0.2209,0.2483,0.2531,0.2432,0.2432,0.2432,0.2432,0.3053,0.3630,0.3578,0.3630,0.3668,0.3263,0.3992,0.4037,0.4556,0.4703,0.5173,0.6219,0.6412,0.7275,0.6984,0.6756,0.7079,0.7192,0.7342,0.7458,0.7501,0.7540,0.7605,0.7605,0.7342,0.7912,0.7951,0.8036,0.8074,0.8074,0.8118,0.7951,0.8118,0.8242,0.8488,0.8650,0.8488,0.8311,0.8424,0.7912,0.7951,0.8001,0.8001,0.7458,0.7192,0.6984,0.6412,0.6516,0.5900,0.5802,0.5802,0.5762,0.5623,0.5374,0.4556,0.4556,0.4333,0.3762,0.3456,0.4037,0.3311,0.3263,0.3311,0.3717,0.3762,0.3717,0.3668,0.3491,0.4203,0.4037,0.4149,0.4037,0.3992,0.4078,0.4651,0.4967,0.5229,0.5802,0.5802,0.5846,0.6293,0.6412,0.6374,0.6604,0.7317,0.7034,0.7573,0.7573,0.7573,0.7772,0.7605,0.8036,0.7951,0.7817,0.7869,0.7724,0.7869,0.7869,0.7951,0.7644,0.7912,0.7275,0.7342,0.7275,0.6984,0.7342,0.7605,0.7418,0.7418,0.7275,0.7573,0.7724,0.8118,0.8521,0.8823,0.8984,0.9119,0.9316,0.9512)
yy <- runmed(y, 41)
plot(y, type="l", ylim=c(0,1), ylab="", xlab="", lwd=0.5)
points(yy, col="blue", type="l", lwd=2)
Run Code Online (Sandbox Code Playgroud)

我需要从包含NULL值的文件中读取数据帧.这是一个示例文件:
charCol floatCol intCol a 1.5 10 b NULL 3 c 3.9 NULL d -3.4 4
我把这个文件读入数据框:
> df <- read.table('example.dat', header=TRUE)
Run Code Online (Sandbox Code Playgroud)
但是"NULL"条目不会被R解释为NULL:
> is.null(df$floatCol[2])
[1] FALSE
Run Code Online (Sandbox Code Playgroud)
我应该如何格式化我的输入文件,以便R正确地将这些条目视为NULL?
我正在试图弄清楚如何实现以下身份验证流程:

要求:
我正在寻找有关如何实施此类工作流程的建议.我应该使用OAuth(或OAuth2)来访问API吗?OAuth提供商是否应该在App Engine上运行,而Web应用程序(Rails)是否代表用户向API请求令牌?如果是这样,只允许Web应用程序(Rails)请求OAuth令牌的最佳方法是什么?或者我应该考虑一个完全不同的策略?
任何建议都非常感谢.我也在寻找库的建议,以便在上面的上下文中实现OAuth.
我需要在C中编写单元测试,以便对文件句柄进行操作.我想将测试文件的内容直接包含在单元测试源中.因此我的问题是:是否可以定义一个字符串并将其作为文件句柄在C中访问?
在使用GNU Autotools构建的项目中,我有一个需要修改的脚本make来包含安装路径.这是一个小例子:
configure.ac:
AC_INIT(foobar, 1.0)
AC_PREREQ(2.66)
AC_CONFIG_HEADERS(config.h)
AM_INIT_AUTOMAKE(foreign)
AC_CONFIG_FILES([Makefile blah/Makefile])
AC_OUTPUT
Run Code Online (Sandbox Code Playgroud)
Makefile.am:
SUBDIRS = blah
Run Code Online (Sandbox Code Playgroud)
胡说/ Makefile.am:
all: myscript
myscript: myscript.in
sed -e 's,@datadir\@,$(pkgdatadir),g' myscript.in > myscript
chmod +x myscript
EXTRA_DIST = myscript.in
Run Code Online (Sandbox Code Playgroud)
./configure; make成功创造myscript.同上make dist; tar xvzf foobar-1.0.tar.gz; cd foobar-1.0; ./configure; make.但是,make distcheck因为文件myscript.in丢失而失败(但成功复制了make dist).
任何想法为什么文件myscript.in没有被复制make distcheck?
我有一个使用Autotools构建的C程序.在src/Makefile.am,我定义一个宏与安装数据文件的路径:
AM_CPPFLAGS = -DAM_INSTALLDIR='"$(pkgdatadir)"'
Run Code Online (Sandbox Code Playgroud)
问题是我需要先运行make install才能测试二进制文件(因为它需要能够找到数据文件).
我可以使用源树的路径定义另一个宏,以便可以定位数据文件而无需安装:
AM_CPPFLAGS = -DAM_INSTALLDIR='"$(pkgdatadir)"' -DAM_TOPDIR='"$(abs_top_srcdir)"'
Run Code Online (Sandbox Code Playgroud)
现在,我想要以下行为:
make install,请使用AM_INSTALLDIR来获取数据文件.这可能吗?有没有更好的方法来解决这个问题?