My df is as follows:
set.seed(123)
df <- data.frame(x = sample(letters[1:3], 20, replace = TRUE),
                 y = sample(1:10, 20, replace = TRUE))
df <- df[order(df$x), ]
I want to replace the first value of each group with NA. For example:
x y
a NA
a 8
a 1
a 8
b NA
b 3
b 2
b 10
b 8
.
.
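I can get the first value of each group without trouble, but that by itself isn't useful.
library(dplyr)  # provides %>%, group_by() and do()
test <- df %>%
  group_by(x) %>%
  do(a = head(.$y, 1))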
Please help with the next step.
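One possible way to do the replacement, sketched in base R rather than with the dplyr pipeline above: `duplicated()` returns FALSE for the first occurrence of each value, so negating it marks the first row of every group. The names `first_in_group` and `result` are made up for this illustration, and the approach assumes "first" means first in the current row order.

```r
set.seed(123)
df <- data.frame(x = sample(letters[1:3], 20, replace = TRUE),
                 y = sample(1:10, 20, replace = TRUE))
df <- df[order(df$x), ]

# duplicated() is FALSE for the first occurrence of each x value,
# so !duplicated(df$x) is TRUE exactly on the first row of every group.
first_in_group <- !duplicated(df$x)

# Work on a copy so the original df stays available for comparison.
result <- df
result$y[first_in_group] <- NA

print(result)
```

Because this relies only on row order, it does not need the data to be grouped first. If the rows were not already sorted by x, it would still blank the first row seen for each group.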
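library(reshape2)  # assumption: dcast() here is reshape2's; the package is not named in the question
df <- data.frame(x = c(1, 1, 1, 2, 2, 3, 3, 3, 4, 5, 5),
                 y = c("A", "B", "C", "A", "B", "A", "B", "D", "B", "C", "D"),
                 z = c(3, 2, 1, 4, 2, 3, 2, 1, 2, 3, 4))
df_new <- dcast(df, x ~ y, value.var = "z")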
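With the sample data above, dcast() keeps the NA values. On my own dataset, however, it behaves differently: the function converts the NA values to zero. Why?
How can I keep the NA values?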
r <- read.csv("ratings.csv")
m <- read.csv("movies.csv")
# merge the two data frames that were actually read in (r and m)
rm <- merge(r, m, by = "movieId")
umr <- dcast(rm, userId ~ title, value.var = "rating", fun.aggregate = sum)
Thanks in advance.
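A likely explanation, assuming `dcast()` comes from reshape2: the difference between the two calls is `fun.aggregate = sum`. When an aggregation function is supplied and `fill` is left unset, empty cells are filled with the result of applying that function to a zero-length vector, and `sum(numeric(0))` is 0. The sketch below reproduces this on the sample data and shows one way to keep NA by passing `fill = NA_real_`. The names `wide_zero` and `wide_na` are made up for this illustration.

```r
library(reshape2)  # assumption: the question's dcast() is reshape2::dcast

df <- data.frame(x = c(1, 1, 1, 2, 2, 3, 3, 3, 4, 5, 5),
                 y = c("A", "B", "C", "A", "B", "A", "B", "D", "B", "C", "D"),
                 z = c(3, 2, 1, 4, 2, 3, 2, 1, 2, 3, 4))

# The sum of an empty vector is 0, which is what fills the empty cells.
sum(numeric(0))

# With fun.aggregate = sum and no fill, missing combinations become 0.
wide_zero <- dcast(df, x ~ y, value.var = "z", fun.aggregate = sum)

# Setting fill explicitly keeps missing combinations as NA.
# NA_real_ is used so the fill value has the same type as the sums.
wide_na <- dcast(df, x ~ y, value.var = "z",
                 fun.aggregate = sum, fill = NA_real_)

print(wide_zero)
print(wide_na)
```

If every userId/title pair occurs only once in the merged data, dropping `fun.aggregate` altogether, as in the sample call, should also leave the missing combinations as NA.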