Creating a large data frame

Question


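Suppose I want to generate a large data frame from scratch.

I usually create data frames with the data.frame function. However, building a df like the one below is very error-prone and inefficient.

So is there a more efficient way to create the following data frame?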

df <- data.frame(GOOGLE_CAMPAIGN=c(rep("Google - Medicare - US", 928), rep("MedicareBranded", 2983),
                                   rep("Medigap", 805), rep("Medigap Branded", 1914),
                                   rep("Medicare Typos", 1353), rep("Medigap Typos", 635),
                                   rep("Phone - MedicareGeneral", 585),
                                   rep("Phone - MedicareBranded", 2967),
                                   rep("Phone-Medigap", 812),
                                   rep("Auto Broad Match", 27),
                                   rep("Auto Exact Match", 80),
                                   rep("Auto Exact Match", 875)),                   
                 GOOGLE_AD_GROUP=c(rep("Medicare", 928), rep("MedicareBranded", 2983),
                                   rep("Medigap", 805), rep("Medigap Branded", 1914),
                                   rep("Medicare Typos", 1353), rep("Medigap Typos", 635),
                                   rep("Phone ads 1-Medicare Terms",585),
                                   rep("Ad Group #1", 2967), rep("Medigap-phone", 812),
                                   rep("Auto Insurance", 27),
                                   rep("Auto General", 80),
                                   rep("Auto Brand", 875)))


Yikes, that is some "bad" code. How can I generate this "large" data frame in a more efficient way?

Answer 1

jor*_*ran 7

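If your only source for this information is a piece of paper, then you probably won't do much better than this, but you can at least consolidate everything into a single rep call for each column: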

#I'm going to cheat and not type out all those strings by hand
x <- unique(df[,1])
y <- unique(df[,2])

#Number of times to repeat each unique value (955 = 80 + 875, the two "Auto Exact Match" runs)
x1 <- c(928,2983,805,1914,1353,635,585,2967,812,27,955)
y1 <- c(x1[-11],80,875)

dd <- data.frame(GOOGLE_CAMPAIGN = rep(x, times = x1), 
                 GOOGLE_AD_GROUP = rep(y, times = y1))

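The key to the snippet above is that rep accepts a vector for its times argument, one count per element. A minimal sketch with toy values (labels and counts are illustrative names, not part of the original data):

```r
# Toy values chosen only to illustrate the vectorised `times` argument
labels <- c("a", "b", "c")
counts <- c(2, 1, 3)

# Each element of `labels` is repeated by the matching entry of `counts`
expanded <- rep(labels, times = counts)
print(expanded)
# [1] "a" "a" "b" "c" "c" "c"
```

The length of the result always equals sum(counts), which is a cheap sanity check when the counts are typed in by hand.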

The result should be identical:

> all.equal(dd,df)
[1] TRUE


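But if this information already exists in R in some data structure and you only need to transform it, then this could be easier, but we would need to know what that structure is.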
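As one possible illustration of that last point, assume the counts were already held in a small lookup table with one row per campaign/ad-group run. The table name spec and its column n are hypothetical, and only four of the rows from the question are used here. Expanding it then takes a single rep over the row indices:

```r
# Hypothetical compact table: one row per (campaign, ad group) run.
# The strings and counts are a subset of those in the question.
spec <- data.frame(
  GOOGLE_CAMPAIGN = c("Medigap", "Auto Broad Match",
                      "Auto Exact Match", "Auto Exact Match"),
  GOOGLE_AD_GROUP = c("Medigap", "Auto Insurance",
                      "Auto General", "Auto Brand"),
  n = c(805, 27, 80, 875),
  stringsAsFactors = FALSE
)

# Repeat each row index n times, then use the result to index the table
idx <- rep(seq_len(nrow(spec)), times = spec$n)
expanded <- spec[idx, c("GOOGLE_CAMPAIGN", "GOOGLE_AD_GROUP")]
rownames(expanded) <- NULL  # reset the row names left over from indexing

# The expanded frame has one row per repetition
stopifnot(nrow(expanded) == sum(spec$n))
```

Keeping both columns and their count in the same row means the two columns cannot drift out of alignment, which is the main risk of typing two parallel sets of rep calls.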
