Add a new column to each element of a list of tables or data frames

Question

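I have a list of files. I also have a list of "names", which I get with substr() from the actual file names of these files. I want to add a new column to each file in the list. This column should contain the corresponding element of "names", repeated once for every row in the file.

For example: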

df1 <- data.frame(x = 1:3, y=letters[1:3])
df2 <- data.frame(x = 4:6, y=letters[4:6])
filelist <- list(df1,df2)
ID <- c("1A","IB")

Pseudocode:

  for( i in length(filelist)){

       filelist[i]$SampleID <- rep(ID[i],nrow(filelist[i])

  }

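// Basically, create a new column in each data frame of filelist and fill it with the corresponding ID value, repeated for every row

My output should look like this:

filelist[1] should be:

   x y SampleID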
 1 1 a       1A
 2 2 b       1A
 3 3 c       1A

filelist[2] should be:

   x y SampleID
 1 4 d       IB
 2 5 e       IB
 3 6 f       IB

And so on...

Any ideas on how to do this?

Answer 1

Ric*_*rta 46

Another solution is to use cbind, taking advantage of the fact that R recycles the values of the shorter vector.

For example:

x <- df2  # from above
cbind(x, NewColumn="Singleton")
 #    x y NewColumn
 #  1 4 d Singleton
 #  2 5 e Singleton
 #  3 6 f Singleton

There is no need to use rep. R does it for you.

So you can put cbind(filelist[[i]], SampleID = ID[[i]]) in your for loop or, as @Sven pointed out, use the cleaner mapply:

filelist <- mapply(cbind, filelist, "SampleID"=ID, SIMPLIFY=FALSE)

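As a minimal sketch of the recycling behaviour described above, the following rebuilds the example data from the question and adds the column once with a for loop and once with mapply. The names `by_loop` and `by_mapply` are illustrative only and do not come from the original answer.

```r
# Rebuild the example data from the question
df1 <- data.frame(x = 1:3, y = letters[1:3])
df2 <- data.frame(x = 4:6, y = letters[4:6])
filelist <- list(df1, df2)
ID <- c("1A", "IB")

# Variant 1: for loop; cbind() recycles the single ID value over all rows
by_loop <- filelist
for (i in seq_along(by_loop)) {
  by_loop[[i]] <- cbind(by_loop[[i]], SampleID = ID[i])
}

# Variant 2: mapply() pairs each data frame with its own ID
by_mapply <- mapply(cbind, filelist, SampleID = ID, SIMPLIFY = FALSE)

# Inspect the second data frame
by_loop[[2]]
```

Both variants leave the original `filelist` untouched, so the results can be compared before overwriting anything.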

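Thank you all very much for your help and the excellent approaches. The for loop, mapply() and cbind all work like a charm. It is easy to learn a language this way, and every time I ask a question on this board I learn something new. Sorry I could not write sooner to express my thanks and appreciation. Thank you. (7 upvotes)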

Answer 2

Sve*_*ein 21

Here is a corrected version of your loop:

for( i in seq_along(filelist)){

  filelist[[i]]$SampleID <- rep(ID[i],nrow(filelist[[i]]))

}

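There were 3 problems:

The final ) was missing after the command in the loop body.
Elements of a list are accessed with [[, not with [. [ returns a list of length 1; [[ returns only the element.
length(filelist) is just a single value, so the loop runs only for the last element of the list. I replaced it with seq_along(filelist).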
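The second and third points can be checked directly in the console. A minimal sketch, using the example data from the question:

```r
# Rebuild the example data from the question
df1 <- data.frame(x = 1:3, y = letters[1:3])
df2 <- data.frame(x = 4:6, y = letters[4:6])
filelist <- list(df1, df2)

# `[` keeps the list wrapper; `[[` extracts the data frame itself
class(filelist[1])    # "list"
class(filelist[[1]])  # "data.frame"
nrow(filelist[1])     # NULL, because a plain list has no rows
nrow(filelist[[1]])   # 3

# length() gives one number, so `for (i in length(filelist))` visits only i = 2;
# seq_along() gives every index
length(filelist)      # 2
seq_along(filelist)   # 1 2
```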

A more efficient approach is to use mapply for the task:

mapply(function(x, y) "[<-"(x, "SampleID", value = y) ,
       filelist, ID, SIMPLIFY = FALSE)

You don't really need the anonymous function in `mapply`. ``mapply(`[<-`, filelist, 'SampleID', value = ID, SIMPLIFY = FALSE)`` will work. (14 upvotes)
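A small sketch that runs both forms side by side on the example data from the question. The names `with_wrapper` and `without_wrapper` are illustrative only.

```r
# Rebuild the example data from the question
df1 <- data.frame(x = 1:3, y = letters[1:3])
df2 <- data.frame(x = 4:6, y = letters[4:6])
filelist <- list(df1, df2)
ID <- c("1A", "IB")

# Form 1: anonymous function wrapping the replacement function `[<-`
with_wrapper <- mapply(function(x, y) "[<-"(x, "SampleID", value = y),
                       filelist, ID, SIMPLIFY = FALSE)

# Form 2: pass `[<-` directly; the column name is recycled for every element
without_wrapper <- mapply(`[<-`, filelist, "SampleID", value = ID,
                          SIMPLIFY = FALSE)

# Compare the new column in the two results
with_wrapper[[1]]$SampleID
without_wrapper[[1]]$SampleID
```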

Answer 3

Ron*_*hah 6

The purrr way, using map2:

library(dplyr)
library(purrr)

map2(filelist, ID, ~cbind(.x, SampleID = .y))

#[[1]]
#  x y SampleID
#1 1 a       1A
#2 2 b       1A
#3 3 c       1A

#[[2]]
#  x y SampleID
#1 4 d       IB
#2 5 e       IB
#3 6 f       IB

Alternatively, you can also use

map2(filelist, ID, ~.x %>% mutate(SampleID = .y))

If you name the list, we can use imap and add the new column based on its names.

names(filelist) <- c("1A","IB")
imap(filelist, ~cbind(.x, SampleID = .y))
#OR
#imap(filelist, ~.x %>% mutate(SampleID = .y))

This is similar to using Map:

Map(cbind, filelist, SampleID = names(filelist))

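Since the question derives the IDs from file names with substr(), one option is to name the list first and then let Map pick the names up. A minimal sketch in base R: the file names and the substr() positions below are hypothetical, chosen only for illustration, because the real file names are not given in the question.

```r
# Rebuild the example data from the question
df1 <- data.frame(x = 1:3, y = letters[1:3])
df2 <- data.frame(x = 4:6, y = letters[4:6])
filelist <- list(df1, df2)

# Hypothetical file names; the real ones are not given in the question
filenames <- c("sample_1A.txt", "sample_IB.txt")

# In these made-up names, characters 8 to 9 hold the ID
names(filelist) <- substr(filenames, 8, 9)

# Map() passes each data frame together with its own name to cbind()
filelist <- Map(cbind, filelist, SampleID = names(filelist))

# The result keeps the list names, so elements can be looked up by ID
names(filelist)
filelist[["IB"]]
```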

Answer 4

may*_*cca 5

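This worked for me:

Create a new column for each data frame in the list, and fill the new column with values based on an existing column (in your case, the ID).

Example: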

# Create dummy data
df1<-data.frame(a = c(1,2,3))
df2<-data.frame(a = c(5,6,7))

# Create a list
l<-list(df1, df2)

l
# [[1]]
#   a
# 1 1
# 2 2
# 3 3
#
# [[2]]
#   a
# 1 5
# 2 6
# 3 7

# add new column 'b'
# create 'b' values based on column 'a' 
l2<-lapply(l, function(x) 
  cbind(x, b = x$a*4))

The result is:

> l2
[[1]]
  a  b
1 1  4
2 2  8
3 3 12

[[2]]
  a  b
1 5 20
2 6 24
3 7 28

In your case, it would be something like this:

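# ID is not a column of the data frames, so index filelist and ID together
filelist<-lapply(seq_along(filelist), function(i)
  cbind(filelist[[i]], SampleID = ID[i]))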

Archived:	12 years, 11 months ago
Views:	38918
Last activity:	6 years, 8 months ago