San*_*afa 0 r surveymonkey dummy-variable
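In a survey, one question was "Which aspect of the course helped you most in learning the concepts? Select all that apply."
Here is what the list of responses looks like: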
Student_ID = c(1,2,3)
Responses = c("lectures,tutorials","tutorials,assignments,lectures", "assignments,presentations,tutorials")
Grades = c(1.1,1.2,1.3)
Data = data.frame(Student_ID,Responses,Grades);Data
Student_ID | Responses | Grades
1 | lectures,tutorials | 1.1
2 | tutorials,assignments,lectures | 1.2
3 | assignments,presentations,tutorials | 1.3
Now I want to create a data frame that looks like this:
Student_ID | Lectures | Tutorials | Assignments | Presentations | Grades
1 | 1 | 1 | 0 | 0 | 1.1
2 | 1 | 1 | 1 | 0 | 1.2
3 | 0 | 1 | 1 | 1 | 1.3
I managed to split the comma-separated responses into columns using the splitstackshape package, so my data currently looks like this:
Student ID | Response 1 | Response 2 | Response 3 | Response 4 | Grades
1 | lectures | tutorials | NA | NA | 1.1
2 | tutorials | assignments | lectures | NA | 1.2
3 | assignments | presentations | tutorials | NA | 1.3
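But as I said, I want my table to look like the dummy-coded one shown above, and I am stuck on how to proceed. Perhaps one idea is to go through each observation in the columns and append a 1 or 0 to a new data frame with lectures, tutorials, assignments, and presentations as headers?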
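First, convert the Responses column from factor to character (in R versions before 4.0, data.frame() creates factors from strings by default). Each element of the column is then split on commas. I don't know what all the possible responses are, so I used every response that occurs in the data. Next, each split response is tabulated against those levels. The resulting list of tables is bound into a matrix before being combined with the columns of the original data.frame.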
Data$Responses <- as.character(Data$Responses)
resp.split <- strsplit(Data$Responses, ",")
lev <- unique(unlist(resp.split))
resp.dummy <- lapply(resp.split, function(x) table(factor(x, levels=lev)))
Data2 <- with(Data, data.frame(Student_ID, do.call(rbind, resp.dummy), Grades))
Data2
# Student_ID lectures tutorials assignments presentations Grades
# 1 1 1 1 0 0 1.1
# 2 2 1 1 1 0 1.2
# 3 3 0 1 1 1 1.3
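The approach above counts occurrences with table(), so a response that listed the same option twice would yield a 2 rather than a 1. If strict 0/1 indicators are wanted, a minimal sketch along the lines of the asker's own idea (check each option against each student's answers) might look like the following. The names dummies and Data3 are illustrative only, and trimming whitespace with trimws() is an assumption about how the real survey export might be formatted.

```r
# Rebuild the example data from the question
Data <- data.frame(
  Student_ID = c(1, 2, 3),
  Responses = c("lectures,tutorials",
                "tutorials,assignments,lectures",
                "assignments,presentations,tutorials"),
  Grades = c(1.1, 1.2, 1.3),
  stringsAsFactors = FALSE
)

# Split each response string on commas and trim stray whitespace
# (assumption: real exports may contain spaces after the commas)
resp.split <- lapply(strsplit(Data$Responses, ","), trimws)

# Use every response that occurs in the data as the set of options
lev <- unique(unlist(resp.split))

# One row per student, one column per option:
# 1 if the student selected the option, 0 otherwise
dummies <- do.call(rbind, lapply(resp.split, function(x) as.integer(lev %in% x)))
colnames(dummies) <- lev

# Combine the indicators with the ID and grade columns
Data3 <- data.frame(Student_ID = Data$Student_ID, dummies, Grades = Data$Grades)
Data3
```

Because %in% only tests membership, repeated options within one response collapse to a single 1. For this example data the result should match Data2 above.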