Using loops with knitr to produce multiple pdf reports... need a little help to get me over the hump

Chr*_*ris 32 r knitr

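First of all, I have to admit that I am very new to knitr and the concept of reproducible analysis, but I can see its potential for improving my current workflow (which involves a lot of copy-pasting into Word documents).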

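I often have to produce multiple reports by group (hospital, in this example), and within each hospital there may be many different wards for which I report outcomes. Previously I ran all of my plots and analyses in R using loops, and then the copy/paste work began. However, after reading this post (Can Sweave produce many pdfs automatically?), I became hopeful that I could skip many of those steps and go straight from R to a report via Rnw/knitr.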

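After trying it, however, I found that something isn't quite working (the R environment inside the Rnw file doesn't seem to recognize the loop variable I am trying to pass to it?).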

##  make my data
Hospital <- c(rep("A", 20), rep("B", 20))
Ward <- rep(c(rep("ICU", 10), rep("Medicine", 10)), 2)
Month <- rep(seq(1:10), 4)
Outcomes <- rnorm(40, 20, 5)
df <- data.frame(Hospital, Ward, Month, Outcomes)


##  Here is my current work flow-- produce all plots, but export as png and cut/paste
for(hosp in unique(df$Hospital)){
  subgroup <- df[ df$Hospital == hosp,]
  for(ward in unique(subgroup$Ward)){
    subgroup2 <- subgroup[subgroup$Ward == ward,]
    savename <- paste(hosp, ward)
    plot(subgroup2$Month, subgroup2$Outcomes, type="o", main=paste("Trend plot for", savename))
  }
}
# followed by much copy/pasting


##  Here is what I'm trying to go for using knitr 
library(knitr)
for (hosp in unique(df$Hospital)){
  knit("C:file.path\\testing_loops.Rnw", output=paste('report_', Hospital, '.tex', sep=""))
}

## With the following *Rnw file
## start *.Rnw Code
\documentclass[10pt]{article}
\usepackage[margin=1.15 in]{geometry}
<<loaddata, echo=FALSE, message=FALSE>>=
Hospital <- c(rep("A", 20), rep("B", 20))
Ward <- rep(c(rep("ICU", 10), rep("Medicine", 10)), 2)
Month <- rep(seq(1:10), 4)
Outcomes <- rnorm(40, 20, 5)
df <- data.frame(Hospital, Ward, Month, Outcomes)
subgroup <- df[ df$Hospital == hosp,]
@

\begin{document}
<<setup, echo=FALSE >>=
  opts_chunk$set(fig.path = paste("test", hosp , sep=""))
@

Some informative text about hospital \Sexpr{hosp}

<<plots, echo=FALSE >>=
  for(ward in unique(subgroup$Ward)){
    subgroup2 <- subgroup[subgroup$Ward == ward,]
    #     subgroup2 <- subgroup2[ order(subgroup2$Month),]
    savename <- paste(hosp, ward)
    plot(subgroup2$Month, subgroup2$Outcomes, type="o", main=paste("Trend plot for", savename))
  }
@
\end{document}


##  To be then turned into pdf with this
tools::texi2pdf("C:file.path\\report_A.tex", clean = TRUE, quiet = TRUE)

After trying to run my knit() code chunk, I get this error:

Error in file(con, "w") : invalid 'description' argument

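When I look in the directory where the *.tex files should be created, I can see that the 2 pdf plots for hospital A were produced (none for B), and there is no hospital-specific *.tex file to compile into a pdf. Thanks in advance for any help you can offer!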

Bri*_*ggs 15

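You don't need to redefine the data in the .Rnw file, and I think the error comes from pasting the output name together with Hospital (the full vector of hospitals) rather than hosp (the loop index).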
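To make the difference concrete, here is a minimal base-R sketch (the names `bad` and `good` are only for illustration). Since paste() is vectorized, using the whole Hospital column gives one file name per row, whereas the output argument needs a single file name; that is most likely what file() is complaining about.

```r
Hospital <- c(rep("A", 20), rep("B", 20))

# paste() is vectorized: the full column yields one name per data row
bad <- paste("report_", Hospital, ".tex", sep = "")
length(bad)

# the loop index is a single string, so each hospital yields exactly one name
good <- vapply(unique(Hospital),
               function(hosp) paste0("report_", hosp, ".tex"),
               character(1))
unname(good)
```

So inside the loop, the output name should be built from hosp only.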

Following your example, testingloops.Rnw would be

\documentclass[10pt]{article}
\usepackage[margin=1.15 in]{geometry}
<<loaddata, echo=FALSE, message=FALSE>>=
subgroup <- df[ df$Hospital == hosp,]
@

\begin{document}
<<setup, echo=FALSE >>=
  opts_chunk$set(fig.path = paste("test", hosp , sep=""))
@

Some informative text about hospital \Sexpr{hosp}

<<plots, echo=FALSE >>=
  for(ward in unique(subgroup$Ward)){
    subgroup2 <- subgroup[subgroup$Ward == ward,]
    #     subgroup2 <- subgroup2[ order(subgroup2$Month),]
    savename <- paste(hosp, ward)
    plot(subgroup2$Month, subgroup2$Outcomes, type="o", main=paste("Trend plot for", savename))
  }
@
\end{document}

and the driver R file would be

##  make my data
Hospital <- c(rep("A", 20), rep("B", 20))
Ward <- rep(c(rep("ICU", 10), rep("Medicine", 10)), 2)
Month <- rep(seq(1:10), 4)
Outcomes <- rnorm(40, 20, 5)
df <- data.frame(Hospital, Ward, Month, Outcomes)

## knitr loop
library("knitr")
for (hosp in unique(df$Hospital)){
  knit2pdf("testingloops.Rnw", output=paste0('report_', hosp, '.tex'))
}


Ben*_*Ben 10

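Good question! This works for me with the other bits you provided in your question. Note that I've replaced your hosp with x, and I've called your Rnw file test.rnw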

# input data
Hospital <- c(rep("A", 20), rep("B", 20))
Ward <- rep(c(rep("ICU", 10), rep("Medicine", 10)), 2)
Month <- rep(seq(1:10), 4)
Outcomes <- rnorm(40, 20, 5)
df <- data.frame(Hospital, Ward, Month, Outcomes)

# generate the tex files, one for each hospital in df
library(knitr)
lapply(unique(df$Hospital), function(x) 
       knit("C:\\emacs\\test.rnw", 
            output=paste('report_', x, '.tex', sep="")))

# generate PDFs from the tex files, one for each hospital in df
lapply(unique(df$Hospital), function(x)
       tools::texi2pdf(paste0("C:\\emacs\\", paste0('report_', x, '.tex')), 
                       clean = TRUE, quiet = TRUE))

I've replaced your loops with lapply and anonymous functions, which are often considered more R-ish.
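In case it is unclear why the Rnw file can see x at all: knit() has an envir argument that defaults to parent.frame(), so the chunks are evaluated in the frame that called knit(), here the anonymous function. Below is a rough base-R sketch of that mechanism only. `render_stub` is a made-up stand-in, not part of knitr, and it evaluates a plain string of R code instead of a real Rnw file.

```r
# Hypothetical stand-in for knit(): evaluates code in the caller's frame,
# mirroring the envir = parent.frame() default (this is NOT knitr itself)
render_stub <- function(code, envir = parent.frame()) {
  force(envir)  # resolve the caller's frame before doing anything else
  eval(parse(text = code), envir = envir)
}

hospitals <- c("A", "B")

# x exists only inside the anonymous function, yet the "template" code sees it
titles <- lapply(hospitals, function(x) {
  render_stub('paste("Report for hospital", x)')
})
unlist(titles)
```

The same reasoning would explain why a plain for loop with hosp works, as long as the Rnw file refers to the same variable name as the loop.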

Here you can see where I've replaced hosp with x in the rnw file:

\documentclass[10pt]{article}
\usepackage[margin=1.15 in]{geometry}
<<loaddata, echo=FALSE, message=FALSE>>=
Hospital <- c(rep("A", 20), rep("B", 20))
Ward <- rep(c(rep("ICU", 10), rep("Medicine", 10)), 2)
Month <- rep(seq(1:10), 4)
Outcomes <- rnorm(40, 20, 5)
df <- data.frame(Hospital, Ward, Month, Outcomes)
subgroup <- df[ df$Hospital == x,]
@

\begin{document}
<<setup, echo=FALSE >>=
  opts_chunk$set(fig.path = paste("test", x , sep=""))
@

Some informative text about hospital \Sexpr{x}

<<plots, echo=FALSE >>=
  for(ward in unique(subgroup$Ward)){
    subgroup2 <- subgroup[subgroup$Ward == ward,]
    #     subgroup2 <- subgroup2[ order(subgroup2$Month),]
    savename <- paste(x, ward)
    plot(subgroup2$Month, subgroup2$Outcomes, type="o", main=paste("Trend plot for", savename))
  }
@
\end{document}

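The result is two tex files (report_A.tex, report_B.tex), four PDFs for the figures (A1, A2, B1, B2) and two PDFs for the reports (report_A.pdf, report_B.pdf), each with its figures in it. Is that what you were after?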

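  • Absolutely! I'm having trouble (for now) getting the second lapply chunk to behave when looping over tools::texi2pdf, but I can work that out myself. Just having the first lapply knit my *.tex files is fantastic! Thank you very much!! (2 upvotes)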
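The comment doesn't say what goes wrong with the texi2pdf step, so the following is only a possible way to narrow it down, not a fix. It is a small sketch that compiles each file inside tryCatch() and records the error message instead of stopping at the first failure. `compile_reports` and `fake_compiler` are hypothetical names; the fake compiler stands in for tools::texi2pdf so the example runs without LaTeX.

```r
# Hypothetical helper: try each .tex file, keep going on errors
compile_reports <- function(tex_files, compiler = tools::texi2pdf) {
  vapply(tex_files, function(f) {
    tryCatch({
      compiler(f, clean = TRUE, quiet = TRUE)
      "ok"
    }, error = function(e) conditionMessage(e))
  }, character(1))
}

# Stand-in compiler for demonstration only: pretends report_B.tex is broken
fake_compiler <- function(file, clean, quiet) {
  if (grepl("report_B", file, fixed = TRUE)) stop("boom")
  invisible(NULL)
}

status <- compile_reports(c("report_A.tex", "report_B.tex"),
                          compiler = fake_compiler)
status
```

With real reports you would drop the compiler argument and pass the actual paths to the *.tex files, then look at which entries of the result are not "ok".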