我在一些.Rnw文件中的文本中包含一些sweave表达式.以下段落包含两个sweave表达式.我可以使用什么正则表达式来查找每个表达式中的R代码.所以正则表达式应该能够找到mean(mtcars$mpg)和/或summary(lm(mpg ~ hp + drat, mtcars))
Lorem ipsum dolor sit amet,consectetur adipisicing elit,sed do eiusmod tempor incididunt ut labore et dolore magna aliqua.\ {Sexpr平均(mtcars $ MPG)}.Ut enim ad minim veniam,quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat.Duis aute irure dolor in repreptderit in voluptate velit esse cillum dolore eu fugiat nulla pariatur.Excepteur sint occaecat cupidatat\Sexpr {summary(lm(mpg~hp + drat,mtcars))} non non proident,sunt in culpa qui officia deserunt mollit anim id est laborum.
我有这个数据框:
set.seed(50)
data <- data.frame(age=c(rep("juv", 10), rep("ad", 10)),
sex=c(rep("m", 10), rep("f", 10)),
size=c(rep("large", 10), rep("small", 10)),
length=rnorm(20),
width=rnorm(20),
height=rnorm(20))
data$length[sample(1:20, size=8, replace=F)] <- NA
data$width[sample(1:20, size=8, replace=F)] <- NA
data$height[sample(1:20, size=8, replace=F)] <- NA
age sex size length width height
1 juv m large NA -0.34992735 0.10955641
2 juv m large -0.84160374 NA -0.41341885
3 juv m large 0.03299794 -1.58987765 NA
4 juv m large NA NA NA
5 juv m large -1.72760411 NA 0.09534935
6 juv m large -0.27786453 …Run Code Online (Sandbox Code Playgroud) 我通过对从www.gadm.org下载的shapefile进行子集化来制作下面的地图:
load(url('http://gadm.org/data/rda/GBR_adm0.RData'))
library(ggplot2)
ukMapFort <- fortify(gadm)
ukMapFortSub <- subset(ukMapFort, lat > 55.575 & lat < 55.739 & long > -1.929 & long < -1.7)
ggplot() + geom_polygon(data=data.frame(ukMapFortSub), aes(long, lat, group=id), fill=NA, color="black")
Run Code Online (Sandbox Code Playgroud)

如何删除两条红线?注意我已经使用Photoshop将线条着色为红色 - 这些线条是由R代码生成的,但R代码不会将其涂成红色.
我用以下代码创建了这个图:
library(ggplot2); library(reshape2); library(plyr)
likert <- data.frame(age = c(rep("young", 5), rep("middle", 5), rep("old", 5)),
score1 = c(rep("unlikely", 1), rep("likely", 1), rep("very likely", 13)),
score2 = c(rep("disagree", 6), rep("neutral", 4), rep("agree", 5)),
score3 = c(rep("no", 5), rep("maybe", 7), rep("yes", 3)))
meltedLikert <- melt(dlply(likert, .(age), function(x) llply(x, table)))
names(meltedLikert) <- c("score", "count", "variable", "age")
ggplot(meltedLikert[meltedLikert$variable != "age",], aes(variable, count, fill=score)) +
geom_bar(position="dodge", stat="identity") +
geom_text(data=data.frame(meltedLikert), aes(variable, count, group=score, label=meltedLikert$score), size=4) +
facet_grid(age ~ .)
Run Code Online (Sandbox Code Playgroud)

如何标记位置文本,以便每个标签score位于variable每个栏顶部的相应栏上?
我希望下面的代码输出一个包含三行的数据框,每行代表计算每组的平均值后mpg的累积平均值cyl:
library(dplyr)
mtcars %>%
arrange(cyl) %>%
group_by(cyl) %>%
summarise(running.mean.mpg = cummean(mpg))
Run Code Online (Sandbox Code Playgroud)
这就是我期望发生的事情:
mean_cyl_4 <- mtcars %>%
filter(cyl == 4) %>%
summarise(mean(mpg))
mean_cyl_4_6 <- mtcars %>%
filter(cyl == 4 | cyl == 6) %>%
summarise(mean(mpg))
mean_cyl_4_6_8 <- mtcars %>%
filter(cyl == 4 | cyl == 6 | cyl == 8) %>%
summarise(mean(mpg))
data.frame(cyl = c(4,6,8), running.mean.mpg = c(mean_cyl_4[1,1], mean_cyl_4_6[1,1], mean_cyl_4_6_8[1,1]))
cyl running.mean.mpg
1 4 26.66364
2 6 23.97222
3 8 20.09062
Run Code Online (Sandbox Code Playgroud)
为什么dplyr似乎忽略了group_by(cyl)?
此代码在ggplot2中绘制了具有交互的回归线:
library(ggplot2)
ggplot(mtcars, aes(hp, mpg, group = cyl)) + geom_point() + stat_smooth(method = "lm")
Run Code Online (Sandbox Code Playgroud)

没有相互作用的线可以用stat_smooth?
我试图捕获一些R代码的输出并用乳胶代码替换它.
如果您运行此代码:
library(stargazer)
x <- capture.output(stargazer(mtcars[1:5, 1:3], summary = FALSE, title="The main caption of the table."))
x
Run Code Online (Sandbox Code Playgroud)
这是输出:
[1] ""
[2] "% Table created by stargazer v.5.1 by Marek Hlavac, Harvard University. E-mail: hlavac at fas.harvard.edu"
[3] "% Date and time: Sat, Jun 27, 2015 - 11:36:07"
[4] "\\begin{table}[!htbp] \\centering "
[5] " \\caption{The main caption of the table.} "
[6] " \\label{} "
[7] "\\begin{tabular}{@{\\extracolsep{5pt}} cccc} "
[8] "\\\\[-1.8ex]\\hline "
[9] "\\hline \\\\[-1.8ex] "
[10] " & mpg & …Run Code Online (Sandbox Code Playgroud) 我期望以下代码返回95%置信区间的下限和上限:
confint95 = function(mean, se)
{
confint = abs(se*1.96)
lower = abs(mean-cint)
upper = abs(mean+cint)
return(lower,upper)
}
Run Code Online (Sandbox Code Playgroud)
但这给出了这样的信息:
Error in return(lower, upper) : multi-argument returns are not permitted
Run Code Online (Sandbox Code Playgroud)
如何设置函数以返回95%置信区间的下限和上限?
附带的屏幕截图显示了我刚从excel文件导入R的数据帧的一部分.在空白的单元格中,我需要插入'NA'.如何将NA插入任何空白的细胞中(同时留下已经填充的细胞)?

我做了这个选型表:
lm_mtcars <- lm(mpg ~ drat + hp + wt, mtcars)
library(MuMIn)
mod_sel_lm_mtcars <- (mod.sel(lm_mtcars))
mod_sel_lm_mtcars
Model selection table
(Intrc) drat hp wt df logLik AICc delta weight
lm_mtcars 29.39 1.615 -0.03223 -3.228 5 -73.366 159 0 1
Run Code Online (Sandbox Code Playgroud)
我可以将stargazer其转换为乳胶表:
library(stargazer)
stargazer(mod_sel_lm_mtcars)
Run Code Online (Sandbox Code Playgroud)
如何stargazer打印变量名称,如Coulombe Et Al 2011, p288, Table 2中所示。
所以: