xtable, Sweave, R: counts and percentages in cross tabulations

Cha*_*ase 15 latex r sweave xtable

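Edit: Based on aL3xa's answer, I have modified his syntax below. It's not perfect, but it's getting closer. I still haven't found a way to make xtable accept \multicolumn{} arguments for columns or rows. Hmisc seems to handle these kinds of tasks behind the scenes, but working out what is going on in there looks like a bit of an undertaking. Does anyone have experience with the Hmisc latex function?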

ctab <- function(tab, dec = 2, margin = NULL) {
    tab <- as.table(tab)
    ptab <- paste(round(prop.table(tab, margin = margin) * 100, dec), "%", sep = "")
    res <- matrix(NA, nrow = nrow(tab) , ncol = ncol(tab) * 2, byrow = TRUE)
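    # Index over the columns of res (not tab) so an odd number of columns also works
    oddc <- seq_len(ncol(res)) %% 2 == 1   # count columns: 1, 3, 5, ...
    evenc <- !oddc                         # percentage columns: 2, 4, 6, ...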
    res[,oddc ] <- tab
    res[,evenc ] <- ptab
    res <- as.table(res)
    colnames(res) <- rep(colnames(tab), each = 2)
    rownames(res) <- rownames(tab)
    return(res)
}
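On the \multicolumn{} point: one possible route is the `add.to.row` argument of `print.xtable()`, which injects raw LaTeX at a given row position. The sketch below only illustrates that idea on a tiny hand-made matrix; the numbers and the names `tbl`, `groups` and `top_header` are hypothetical and not part of the code above.

```r
library(xtable)

# Toy table (hypothetical numbers): one Count/Percent pair per trip purpose
tbl <- matrix(
  c(10, 40, 15, 60,
     5, 50,  5, 50),
  nrow = 2, byrow = TRUE,
  dimnames = list(c("Mon", "Tues"), rep(c("Count", "Percent"), times = 2))
)
groups <- c("Business", "Commute")  # labels for the spanning header

# One \multicolumn{2}{c}{...} per group; the leading "&" skips the row-name column
top_header <- paste0(
  " & ",
  paste0("\\multicolumn{2}{c}{", groups, "}", collapse = " & "),
  " \\\\\n"
)

xt  <- xtable(tbl, digits = c(0, 0, 1, 0, 1))
# pos = -1 places the extra row above the regular column-name row
tex <- print(xt,
             add.to.row = list(pos = list(-1), command = top_header),
             print.results = FALSE)
cat(tex)
```

With `print.results = FALSE` the LaTeX is returned as a string instead of being printed, which makes it easy to inspect before dropping it into a Sweave chunk.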

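I'd like to create a table formatted for LaTeX output that contains the counts and percentages for each column or variable. I haven't found a ready-made solution to this problem, but I feel I must be reinventing the wheel to some extent.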

I've worked out a solution for straight tabulations, but I'm struggling to adapt it to cross tabulations.

First, some sample data:

#Generate sample data
dow <- sample(1:7, 100, replace=TRUE)
purp <- sample(1:4, 100, replace=TRUE)
dow <- factor(dow, 1:7, c("Mon", "Tues", "Wed", "Thurs", "Fri", "Sat", "Sun"))
purp <- factor(purp, 1:4, c("Business", "Commute", "Vacation", "Other"))

Now the working function for straight tabulations:

library(xtable)

customTable <- function(var, capt = NULL){
    counts <- table(var)
    percs <- 100 * prop.table(counts)       

    print(
        xtable(
            cbind(
                Count = counts
                , Percent = percs
            )
        , caption = capt
        , digits = c(0,0,2)
        )
    , caption.placement="top"
    )
}

#Usage
customTable(dow, capt="Day of Week")
customTable(purp, capt="Trip Purpose")

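Does anyone have suggestions for adapting this to cross tabulations (i.e. day of week by trip purpose)? Here is what I have written so far. It doesn't use the xtable library and ALMOST works, but it is not dynamic and is quite ugly to work with: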

#Create table and percentages
a <- table(dow, purp)
b <- round(prop.table(a, 1),2)

#Column bind all of the counts & percentages together, this SHOULD become dynamic in future
d <- cbind( cbind(Count = a[,1],Percent =  b[,1])
        , cbind(Count = a[,2], Percent = b[,2])
        , cbind(Count = a[,3], Percent = b[,3])
        , cbind(Count = a[,4], Percent = b[,4])
)

#Ugly function that needs help, or scrapped for something else
crossTab <- function(title){
    cat("\\begin{table}[ht]\n")
    cat("\\begin{center}\n")
    cat("\\caption{", title, "}\n", sep="") 

    cat("\\begin{tabular}{rllllllll}\n")
    cat("\\hline\n")

    cat("", cat("", paste("&\\multicolumn{2}{c}{",colnames(a), "}"), sep = ""), "\\\\\n", sep="")
    c("&", cat("", colnames(d), "\\\\\n", sep=" & "))
    cat("\\hline\n")
    c("&", write.table(d, sep = " & ", eol="\\\\\n", quote=FALSE, col.names=FALSE))

    cat("\\hline\n")
    cat("\\end{tabular}\n")
    cat("\\end{center}\n")
    cat("\\end{table}\n")   
}   

crossTab(title = "Day of week BY Trip Purpose")
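A minimal sketch of how the hard-coded `cbind()` chain above could be made dynamic, assuming the same `dow` and `purp` factors. The seed and the names `k`, `idx` and `colspec` are introduced here purely for illustration.

```r
set.seed(1)  # hypothetical seed, only so the sketch is reproducible
dow  <- factor(sample(1:7, 100, replace = TRUE), 1:7,
               c("Mon", "Tues", "Wed", "Thurs", "Fri", "Sat", "Sun"))
purp <- factor(sample(1:4, 100, replace = TRUE), 1:4,
               c("Business", "Commute", "Vacation", "Other"))

a <- table(dow, purp)
b <- round(prop.table(a, 1), 2)

# Column order 1, k+1, 2, k+2, ... interleaves each count with its proportion
k   <- ncol(a)
idx <- as.vector(rbind(seq_len(k), k + seq_len(k)))
d   <- cbind(unclass(a), unclass(b))[, idx]
colnames(d) <- rep(c("Count", "Percent"), times = k)

# The tabular column spec can be derived from the table width as well
colspec <- paste0("r", strrep("l", ncol(d)))

print(d)
print(colspec)
```

Because `d` and `colspec` are computed from `ncol(a)`, adding or removing a trip purpose should not require touching the rest of the code.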

Ras*_*sen 12

With the tables package it's a one-liner:

# data:
dow <- sample(1:7, 100, replace=TRUE)
purp <- sample(1:4, 100, replace=TRUE)
dow <- factor(dow, 1:7, c("Mon", "Tues", "Wed", "Thurs", "Fri", "Sat", "Sun"))
purp <- factor(purp, 1:4, c("Business", "Commute", "Vacation", "Other"))

dataframe <-  data.frame( dow, purp)

# The packages

library(tables)
library(Hmisc)

# The table
tabular(  (Weekday=dow) ~  (Purpose=purp)*(Percent("row")+ 1)    ,data=dataframe        )

# The latex table
latex(  tabular(  (Weekday=dow) ~  (Purpose=purp)*(Percent("col")+ 1)    ,data=dataframe        ))

With booktabs, you get this (it can be customized further):

[Image: the rendered LaTeX table]


aL3*_*3xa 7

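Great question, this one has been bugging me for a while (it's not that hard, I'm just lazy as usual...). However, although the question is great, I'm afraid your approach is not. There's an invaluable package, xtable, that you can (mis)use. Besides, this problem is so common that there's a good chance a ready-made solution already exists somewhere on the Internet.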

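One of these days I'm going to solve this once and for all (I'll post the code on GitHub). The main idea goes something like this: do you want the frequency and/or percentage values within a single cell (separated by \), or rows of absolute and relative frequencies (or %) in succession? I'd go with the second, so for now I'll post a "first aid" solution: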

ctab <- function(tab, dec = 2, ...) {
  tab <- as.table(tab)
  ptab <- paste(round(prop.table(tab) * 100, dec), "%", sep = "")
  res <- matrix(NA, nrow = nrow(tab) * 2, ncol = ncol(tab), byrow = TRUE)
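  # Index over the rows of res (not tab) so an odd number of rows also works
  oddr <- seq_len(nrow(res)) %% 2 == 1   # count rows: 1, 3, 5, ...
  evenr <- !oddr                         # percentage rows: 2, 4, 6, ...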
  res[oddr, ] <- tab
  res[evenr, ] <- ptab
  res <- as.table(res)
  colnames(res) <- colnames(tab)
  rownames(res) <- rep(rownames(tab), each = 2)
  return(res)
}

Now try the following:

data(HairEyeColor)           # load an appropriate dataset
tb <- HairEyeColor[, , 1]    # choose only male respondents
ctab(tb)
      Brown  Blue   Hazel Green
Black 32     11     10    3    
Black 11.47% 3.94%  3.58% 1.08%
Brown 53     50     25    15   
Brown 19%    17.92% 8.96% 5.38%
Red   10     10     7     7    
Red   3.58%  3.58%  2.51% 2.51%
Blond 3      30     5     8    
Blond 1.08%  10.75% 1.79% 2.87%

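Make sure you have loaded the xtable package, and use print (it is a generic function, so you must pass it an object of class xtable). It is important to suppress the row names. I'll optimize this tomorrow; it should be xtable-compatible. It's 3 AM in my time zone, so with these words I'll end my answer: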

print(xtable(ctab(tb)), include.rownames = FALSE)

Cheers!


Cha*_*ase 4

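I couldn't figure out how to generate multi-column headers with xtable, but I did realize that I could concatenate the counts and percentages into the same column for printing. Not ideal, but it seems to get the job done. Here is the function I wrote: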

library(xtable)  # only needed when tex = TRUE

ctab3 <- function(row, col, margin = 1, dec = 2, percs = FALSE, total = FALSE, tex = FALSE, caption = NULL){
    tab <- as.table(table(row,col))
    ptab <- signif(prop.table(tab, margin = margin), dec)

    if (percs){

        z <- matrix(NA, nrow = nrow(tab), ncol = ncol(tab), byrow = TRUE) 
        for (i in 1:ncol(tab)) z[,i] <- paste(tab[,i], ptab[,i], sep = " ")
        rownames(z) <- rownames(tab)
        colnames(z) <- colnames(tab)

        if (margin == 1 & total){
            rowTot <- paste(apply(tab, 1, sum), apply(ptab, 1, sum), sep = " ")
            z <- cbind(z, Total = rowTot)
        } else if (margin == 2 & total) {
            colTot <- paste(apply(tab, 2, sum), apply(ptab, 2, sum), sep = " ")
            z <- rbind(z,Total = colTot)
        }
    } else {
        z <- table(row, col)    
    }
    # Return an xtable object for LaTeX output, otherwise the plain table
    if (tex) xtable(z, caption = caption) else z
}

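It's probably not a finished product, but it does allow some flexibility in the parameters. At the most basic level it is just a wrapper around table(), but it can also generate LaTeX-formatted output. Here is what I ended up using in a Sweave document: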

<<echo = FALSE, results = tex>>=
for (i in 1:ncol(df)){
    print(ctab3(
        col = df[,1]
        , row = df[,i]
        , margin = 2
        , total = TRUE
        , tex = TRUE
        , caption = paste("Dow by", colnames(df[i]), sep = " ")
    ))
}
@
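The same "count and percentage in one cell" idea can also be sketched without a loop by formatting every cell with `sprintf()`. This is an illustrative variant, not the function above; the seed, the "n (p%)" cell format and the names `cells` and `z` are assumptions made for the example.

```r
set.seed(1)  # hypothetical seed, only so the sketch is reproducible
dow  <- factor(sample(1:7, 100, replace = TRUE), 1:7,
               c("Mon", "Tues", "Wed", "Thurs", "Fri", "Sat", "Sun"))
purp <- factor(sample(1:4, 100, replace = TRUE), 1:4,
               c("Business", "Commute", "Vacation", "Other"))

tab <- table(dow, purp)
pct <- 100 * prop.table(tab, margin = 1)   # row percentages

# Format every cell as "count (percent%)"; as.vector() drops the table class
cells <- sprintf("%d (%.1f%%)", as.vector(tab), as.vector(pct))

# Rebuild the matrix in the original (column-major) layout, keeping the labels
z <- matrix(cells, nrow = nrow(tab), dimnames = dimnames(tab))
print(z, quote = FALSE)
```

Since `z` is a character matrix, it can be passed to `xtable()` like any other matrix; by default `print.xtable()` escapes the `%` signs for LaTeX.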