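What is the most effective (i.e., efficient and appropriate) way to clean up a factor that contains multiple levels that need to be collapsed? That is, how do you combine two or more factor levels into one?
Here is an example where the two levels "Yes" and "Y" should be collapsed to "Yes", and "No" and "N" collapsed to "No":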
## Given:
x <- c("Y", "Y", "Yes", "N", "No", "H") # The 'H' should be treated as NA
## expectedOutput
[1] Yes Yes Yes No No <NA>
Levels: Yes No # <~~ NOTICE ONLY **TWO** LEVELS
One option, of course, is to clean the strings beforehand using sub and friends.
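For reference, a minimal sketch of that string-cleaning approach might look like the following. The variable names cleaned and x.clean are illustrative only, and the anchored patterns assume the short codes are exactly "Y" and "N":

```r
x <- c("Y", "Y", "Yes", "N", "No", "H")

# Expand the one-letter codes to full words; the ^...$ anchors
# ensure that "Yes" and "No" themselves are left untouched.
cleaned <- sub("^Y$", "Yes", x)
cleaned <- sub("^N$", "No", cleaned)

# Any value not listed in `levels` (here "H") becomes NA.
x.clean <- factor(cleaned, levels = c("Yes", "No"))
x.clean
```

This works, but it needs one sub call per spelling variant, which is part of why it feels like manual cleanup rather than a general solution.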
Another method is to allow duplicate labels and then drop them:
## Duplicate levels ==> "Warning: deprecated"
x.f <- factor(x, levels=c("Y", "Yes", "No", "N"), labels=c("Yes", "Yes", "No", "No"))
## the above line can be wrapped in either of the next two lines
factor(x.f)
droplevels(x.f)
However, is there a more effective way?
Although I know that levels …