Qui*_*tic 10 r multiple-columns
I have a data frame (df) that looks like this:
School Student Year
A 10 1999
A 10 2000
A 20 1999
A 20 2000
A 20 2001
B 10 1999
B 10 2000
I want to create a person ID column so that df looks like this:
ID School Student Year
1 A 10 1999
1 A 10 2000
2 A 20 1999
2 A 20 2000
2 A 20 2001
3 B 10 1999
3 B 10 2000
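In other words, the ID variable indicates which person in the dataset each row belongs to, taking both the student number and the school into account (here we have 3 students in total).
I tried df$ID <- df$Student and then attempted to add 1 to the value whenever c("School", "Student") is unique. It did not work. Any help is appreciated.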
akr*_*run 12
We can do this in base R without any grouping operation:
df$ID <- cumsum(!duplicated(df[1:2]))
df
# School Student Year ID
#1 A 10 1999 1
#2 A 10 2000 1
#3 A 20 1999 2
#4 A 20 2000 2
#5 A 20 2001 2
#6 B 10 1999 3
#7 B 10 2000 3
Note: this assumes that 'School' and 'Student' are ordered, i.e. that all rows of each School/Student pair are adjacent.
Or with the tidyverse:
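If the rows cannot be assumed to be ordered, one possible base R variant is to build a key per School/Student pair and look it up among the unique keys. This is only a sketch: the `key` variable and the `"\r"` separator are illustrative choices, and it assumes the separator never occurs in the data. The rows are shuffled here on purpose.

```r
# Example data from the question, with the rows deliberately shuffled
df <- data.frame(
  School  = c("B", "A", "A", "B", "A", "A", "A"),
  Student = c(10, 20, 10, 10, 20, 10, 20),
  Year    = c(1999, 1999, 1999, 2000, 2000, 2000, 2001)
)

# One key per School/Student pair; "\r" is an arbitrary separator
# that is assumed not to appear in either column
key <- paste(df$School, df$Student, sep = "\r")

# match() gives the position of each key among the unique keys,
# so IDs are numbered in order of first appearance
df$ID <- match(key, unique(key))
```

Unlike the `cumsum(!duplicated(...))` approach, this keeps one ID per pair even when rows of the same pair are not adjacent.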
library(dplyr)
df %>%
  mutate(ID = group_indices_(df, .dots=c("School", "Student")))
# School Student Year ID
#1 A 10 1999 1
#2 A 10 2000 1
#3 A 20 1999 2
#4 A 20 2000 2
#5 A 20 2001 2
#6 B 10 1999 3
#7 B 10 2000 3
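`group_indices_()` is deprecated in current dplyr releases. A sketch of an equivalent, assuming dplyr 1.0.0 or later is installed, uses `cur_group_id()` inside a grouped `mutate()`. The data frame is rebuilt here only to keep the example self-contained.

```r
library(dplyr)

# Example data from the question
df <- data.frame(
  School  = c("A", "A", "A", "A", "A", "B", "B"),
  Student = c(10, 10, 20, 20, 20, 10, 10),
  Year    = c(1999, 2000, 1999, 2000, 2001, 1999, 2000)
)

df <- df %>%
  group_by(School, Student) %>%   # one group per School/Student pair
  mutate(ID = cur_group_id()) %>% # integer index of the current group
  ungroup()                       # drop the grouping afterwards
```

Group indices follow the sorted order of the grouping keys, so this does not depend on the row order of df.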
Group by School and Student, then assign the group ID to the ID variable.
library('data.table')
df[, ID := .GRP, by = .(School, Student)]
# School Student Year ID
# 1: A 10 1999 1
# 2: A 10 2000 1
# 3: A 20 1999 2
# 4: A 20 2000 2
# 5: A 20 2001 2
# 6: B 10 1999 3
# 7: B 10 2000 3
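The `:=` syntax only works when df is a data.table. If the data starts out as a plain data.frame, a minimal sketch (assuming the data.table package is installed) converts it by reference with `setDT()` first:

```r
library(data.table)

# Example data from the question, as an ordinary data.frame
df <- data.frame(
  School  = c("A", "A", "A", "A", "A", "B", "B"),
  Student = c(10, 10, 20, 20, 20, 10, 10),
  Year    = c(1999, 2000, 1999, 2000, 2001, 1999, 2000)
)

# Convert to a data.table in place (no copy is made)
setDT(df)

# .GRP is the running group counter, numbered in order of first appearance
df[, ID := .GRP, by = .(School, Student)]
```

Because `by` groups on values rather than on adjacency, the rows of a pair do not need to be next to each other.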
Data:
df <- fread('School Student Year
A 10 1999
A 10 2000
A 20 1999
A 20 2000
A 20 2001
B 10 1999
B 10 2000')