ddply: how to include a character vector in the result

Question

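Sorry for the cryptic title, I couldn't find a better summary of my problem. So here it is: I have a data frame and want to apply diff() within groups, which works fine: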

library(plyr)

df <- data.frame(name  = rep(c("a", "b", "c"), 4),
                 index = rep(c("c1", "c2"), each = 6),
                 year  = rep(2008:2010, 4),
                 value = rep(1:3, each = 4))

head(df)

  name index year value
1    a    c1 2008     1
2    b    c1 2009     1
3    c    c1 2010     1

ddply(df, .(name, year), summarize,  value=diff(value))

However, I would like to include index in my result, so I tried:

ddply(df, .(name, year), summarize,  value=diff(value), index=index)

However, this produces the error message:

length(rows) == 1 is not TRUE

I guess this happens because index has more rows than the result, since it is not processed by diff. Is there a quick solution to my problem?
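The guess about mismatched lengths can be checked outside of ddply. The following is a minimal base R sketch, using the rows of one group (name "a", year 2008) from the data frame above: diff() returns one element fewer than its input, while index keeps all of its elements. The name `aligned` is only illustrative.

```r
# Values and index of one group (name == "a", year == 2008) from df
value <- c(1, 1, 2, 3)
index <- c("c1", "c1", "c2", "c2")

d <- diff(value)      # differences between consecutive values
length(d)             # one element fewer than the input
length(index)         # unchanged

# Dropping the first index entry makes both vectors the same length,
# so they can be combined into one data frame
aligned <- index[-1]
data.frame(d.value = d, index = aligned)
```

So summarize is handed columns of different lengths for each group, which fits the error shown above.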

Thanks a lot!

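EDIT

Let me try to clarify what I want to add to the result:

Take the variable index above. It is a factor that should be carried along for interpretation. However, I cannot take diff() of it because that makes no sense, so I just want to pass it through without changing anything. I tried drop==FALSE, which produces the same error message.

Sorry for all this confusion! Here is a very simple example: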

name year  index  value
 a   2008    c1    10
 a   2009    c2    30
 a   2010    c1    40

After taking the diff across group 'a', this would look like:

name year index d.value 
 a   2009  c2     +20  #c2 stayed the same just the first row got intentionally dropped.
 a   2010  c1     +10

Think of the unfortunately named index as an attribute: it can change over the years, but it makes no sense to diff() it.
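For reference, the small example above can be reproduced in base R without any grouping. This is only a sketch under the assumption that the rows are already sorted by year; the object names `ex` and `res` are illustrative.

```r
# The simplified example: one group, three years
ex <- data.frame(name  = "a",
                 year  = 2008:2010,
                 index = c("c1", "c2", "c1"),
                 value = c(10, 30, 40),
                 stringsAsFactors = FALSE)

# Drop the first row and keep the descriptive columns as they are
res <- ex[-1, c("name", "year", "index")]

# Attach the differences; diff() yields exactly nrow(ex) - 1 values
res$d.value <- diff(ex$value)
res
```

The index column is passed through untouched, and only value is differenced.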

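I really hope this gives you a clue about what I want. If not, I will delete this question, since I have found a not very sensible workaround ;) Apologies for all the inconvenience!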

Answer 1

小智 2

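I'm not entirely sure what you want, but it sounds like you want to take the differences, keep the index variable, and drop the first row of each group. Does this get you what you want?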

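doSummary = function(df) {
  values = diff(df$value)
  # Drop the first row's index so it lines up with the n - 1 differences
  # (length(df) would count columns, not rows)
  indexes = df$index[-1]
  data.frame(d.value=values, index=indexes)
}
ddply(df, .(name, year), doSummary)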

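If the goal is the layout from the question's edit (differences across years within each name, with year and index carried along), a variant that groups by name only might look like the sketch below. The helper name `diff_keep` is hypothetical, and the sketch assumes the rows of each group are already in year order and that the plyr package is installed.

```r
library(plyr)

df <- data.frame(name  = rep(c("a", "b", "c"), 4),
                 index = rep(c("c1", "c2"), each = 6),
                 year  = rep(2008:2010, 4),
                 value = rep(1:3, each = 4),
                 stringsAsFactors = FALSE)

# Hypothetical helper: difference `value` within one group and pass
# `year` and `index` through, dropping the first row of the group
diff_keep <- function(g) {
  data.frame(year    = g$year[-1],
             index   = g$index[-1],
             d.value = diff(g$value))
}

# Group by name only, so year and index stay available as columns
res <- ddply(df, .(name), diff_keep)
res
```

Because every column is shortened by the same first row, all columns in each group have equal length and the length error should not occur.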
