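I have a data frame whose columns are initially labelled arbitrarily. Later, I want to change these levels to numeric values. The following script illustrates the problem.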
library(ggplot2)
library(reshape2)
m <- 10
n <- 6
nam <- list(c(),letters[1:n])
var <- as.data.frame(matrix(sort(rnorm(m*n)), m, n, FALSE, nam))
dtf <- data.frame(t=seq(m)*0.1, var)
mdf <- melt(dtf, id=c('t'))
xs <- c(0.25,0.5,1.0,2.0,4.0,8.0)
levels(mdf$variable) <- xs
g <- ggplot(mdf, aes(variable, value, group = variable, colour = t))
g +
  geom_point() +
  # scale_x_continuous() +
  theme()  # opts() has been removed from ggplot2; theme() is its replacement
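This produces a plot in which the 'variable' values are evenly spaced along the x-axis, even though numerically they are not. How can I get the spacing on the x-axis right?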
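One common way to get numerically spaced positions is to turn the factor back into numbers through its level labels before plotting. A minimal base-R sketch, where `variable` is a stand-in for `mdf$variable` (an assumption, made so the snippet runs without reshape2):

```r
# Stand-in for mdf$variable after melt(): a factor with levels a..f
variable <- factor(rep(letters[1:6], each = 2))
xs <- c(0.25, 0.5, 1.0, 2.0, 4.0, 8.0)
levels(variable) <- xs   # the levels are now the strings "0.25", "0.5", ...

# as.numeric(variable) alone would return the integer codes 1..6,
# so convert the level labels and index them by the factor instead
x_num <- as.numeric(levels(variable))[variable]
```

If the column is then overwritten with these numbers (e.g. `mdf$variable <- x_num`), ggplot2 should treat x as continuous and space the points by value.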
I have a factor in R that has an NA level.
set.seed(1)
x <- sample(c(1, 2, NA), 25, replace=TRUE)
x <- factor(x, exclude = NULL)
> x
[1] 1 2 2 <NA> 1 <NA> <NA> 2 2 1 1
[12] 1 <NA> 2 <NA> 2 <NA> <NA> 2 <NA> <NA> 1
[23] 2 1 1
Levels: 1 2 <NA>
How can I subset this factor by the <NA> level? Neither of the two approaches I tried works.
> x[is.na(x)]
factor(0)
Levels: 1 2 <NA>
> x[x=='<NA>']
factor(0)
Levels: 1 2 <NA>
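Both attempts seem to fail because NA here is a level rather than a missing code, so `is.na(x)` is FALSE everywhere, and `'<NA>'` is only how the level is printed. A sketch of one approach, which looks up each element's level label and tests that for NA (the short vector is made up for illustration):

```r
# Small factor with NA kept as an explicit level
x <- factor(c(1, 2, NA, 1, NA), exclude = NULL)

# Look up the label of every element; the NA level yields NA_character_
lab <- levels(x)[x]

# Keep only the elements whose label is NA
na_only <- x[is.na(lab)]
```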
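I have a variable called gender with binary categorical values "female"/"male". I want to change its type to integer 0/1 so that I can use it in a regression analysis, i.e. I want the values "female" and "male" mapped to 1 and 0.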
> str(gender)
gender : Factor w/ 2 levels "female","male": 1 1 1 0 0 0 0 1 1 0 ...
> gender[1]
[1] female
I want to convert the type of the gender variable so that I get the int value 1 when I query an element, i.e.
> gender[1]
[1] 1
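A minimal sketch of one way to get that mapping, comparing against the level name rather than relying on the internal integer codes (the sample values are made up):

```r
# Made-up stand-in for the gender factor
gender <- factor(c("female", "female", "male", "female", "male"))

# TRUE/FALSE from the comparison becomes 1/0, so "female" -> 1, "male" -> 0
gender_int <- as.integer(gender == "female")
gender_int[1]
```

Note that lm() and glm() accept factors directly and create the dummy variable themselves, so the conversion may not be needed for the regression itself.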
Imagine a data frame like df1 below:
df1 <- data.frame(v1 = as.factor(c("m0p1", "m5p30", "m11p20", "m59p60", "m59p60")))
How can I create a list of all the levels of the variable? Thanks.
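For reference, a short sketch using levels(), which returns the distinct levels as a character vector; as.list() is only needed if an actual list object is wanted:

```r
df1 <- data.frame(v1 = as.factor(c("m0p1", "m5p30", "m11p20", "m59p60", "m59p60")))

# Character vector of the distinct levels (the duplicate "m59p60" appears once)
lv <- levels(df1$v1)

# The same levels as a list, one element per level
lv_list <- as.list(lv)
```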
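Possible duplicate:
Drop factor levels in a subsetted data frame in R
I have subsetted the observations by a certain factor level. When checking whether it worked, summary() still lists the levels, only with no observations. Shouldn't they disappear in the subset?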
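Subsetting keeps the full set of levels on purpose; unused ones have to be dropped explicitly. A small sketch with made-up data (the column names are illustrative):

```r
df <- data.frame(region = factor(c("europe", "asia", "oceania", "asia")),
                 value  = 1:4)

# After subsetting, "oceania" has no rows left but is still a level
sub_raw <- df[df$region != "oceania", ]

# droplevels() removes unused levels from every factor column
sub_clean <- droplevels(sub_raw)
summary(sub_clean$region)
```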
There is an interesting option, drop = TRUE, in data.frame filtering; see this excerpt from help('[.data.frame'):
Usage
S3 method for class 'data.frame'
x[i, j, drop = ]
But when I try it on a data.frame, it doesn't work!
> df = data.frame(a = c("europe", "asia", "oceania"), b = c(1, 2, 3), stringsAsFactors = TRUE)
>
> df[1:2,, drop = TRUE]$a
[1] europe asia
Levels: asia europe oceania <--- oceania shouldn't be here!!
>
I know there are other ways, such as
df2 <- droplevels(df[1:2,])
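But the documentation promises a more elegant way to do this, so why doesn't it work? Is this a bug? Because I don't see how this could be a feature...
Edit: I was confused by drop = TRUE dropping the factor levels of a vector, as you can see here. It is not very intuitive that [i, drop = TRUE] drops factor levels while [i, j, drop = TRUE] does not!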
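As far as I can tell from the help pages, drop in `[.factor` is about unused levels, while drop in `[.data.frame` is about coercing the result to the lowest possible dimension. A sketch contrasting the two on the same data:

```r
f  <- factor(c("europe", "asia", "oceania"))
df <- data.frame(a = f, b = c(1, 2, 3))

# On a factor, drop = TRUE removes the unused level "oceania"
vec_drop <- f[1:2, drop = TRUE]

# On a data frame with two columns, drop = TRUE changes nothing here,
# and the factor column keeps all three levels
df_drop <- df[1:2, , drop = TRUE]$a

# Selecting a single column with drop = TRUE returns a plain factor
# instead of a one-column data frame, still with all three levels
col_drop <- df[1:2, "a", drop = TRUE]
```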
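I'm using data.table (1.8.9) and the := operator to update values in one table from another. The table to be updated (dt1) has many factor columns, and the table with the updates (dt2) has similar columns whose values may not exist in the other table. If the columns in dt2 are character, I get an error message, but when I convert them to factors I get incorrect values.
How can I update the table without first converting all the factors to character?
Here is a simplified example: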
library(data.table)
set.seed(3957)
## Create some sample data
## Note column y is a factor
dt1<-data.table(x=1:10,y=factor(sample(letters,10)))
dt1
## x y
## 1: 1 m
## 2: 2 z
## 3: 3 t
## 4: 4 b
## 5: 5 l
## 6: 6 a
## 7: 7 s
## 8: 8 y
## 9: 9 q
## 10: 10 i
setkey(dt1,x)
set.seed(9068)
## Create a second table that will be used to update the …
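Hi everyone, once and for all, how do you (emphasis on you, since I'm sure there is more than one way to do this) contrast code (treatment, sum, Helmert, etc.) and keep meaningful factor labels, so that you can meaningfully interpret the effects in the glm function?
I know I can use levels() to find out which factor level is the reference, but this becomes tedious once I get into factors with 5 or 10 levels and their interactions.
Here's a quick two-factor example of what I mean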
outcome <- c(1,0,0,1,1,0,0,0,1, 0, 0, 1)
firstvar <- c("A", "B", "C", "C", "B", "B", "A", "A", "C", "A", "C", "B")
secondvar <- c("D", "D", "E", "F", "F", "E", "D", "E", "F", "F", "D", "E")
df <- data.frame(outcome, firstvar, secondvar)  # keeps outcome numeric; cbind() would coerce every column to character
df$firstvar <- as.factor(df$firstvar)
df$secondvar <- as.factor(df$secondvar)
#not coded manually (and default appears to be dummy or treatment coding)
#gives meaningful factor labels in summary function
summary(glm(outcome ~ firstvar*secondvar, data=df, family="binomial"))
#effects coded
#does not give meaningful factor labels …
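One approach I've seen: build the contrast matrix yourself and give its columns level names, since the model matrix takes its column labels from the contrast matrix. A sketch with sum (effects) coding on a made-up factor `f`; naming each column after the level it codes as +1 is my own labelling choice, not something R does for you:

```r
# Made-up factor standing in for firstvar
f <- factor(c("A", "B", "C", "A", "B", "C"))

# Sum-coding matrix: 3 rows (levels) by 2 columns, no column names by default
cs <- contr.sum(levels(f))

# Name each column after the level it codes as +1 (all but the last level)
colnames(cs) <- levels(f)[-nlevels(f)]
contrasts(f) <- cs

# Model terms now read "fA", "fB" instead of "f1", "f2"
mm_names <- colnames(model.matrix(~ f))
```

glm() builds its coefficient names from the same model matrix, so the summary should show the level names as well.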
I'm new to R. I'm writing a separate manual on the syntax of the common functions/features I use in my work. My sample data frame is as follows:
x.sample <-
structure(list(Q9_A = structure(c(5L, 3L, 5L, 3L, 5L, 3L, 1L,
5L, 5L, 5L), .Label = c("Impt", "Neutral", "Not Impt at all",
"Somewhat Impt", "Very Impt"), class = "factor"), Q9_B = structure(c(5L,
5L, 5L, 3L, 5L, 5L, 3L, 5L, 3L, 3L), .Label = c("Impt", "Neutral",
"Not Impt at all", "Somewhat Impt", "Very Impt"), class = "factor"),
Q9_C = structure(c(3L, 5L, 5L, 3L, 5L, 5L, 3L, 5L, 5L, 3L
), .Label = c("Impt", "Neutral", "Not Impt at all", "Somewhat Impt", …
Here is my data
> a
[1] Male Male Female Male Male Male Female Female Male Male Female Male Male Male
[15] Female Female Female Male Female Male Female Male Male Female Male Male Female Male
[29] Male Male Female Male Male Male Female Female Male Male Male Male Male
Levels: Female Male
> b
[1] 0 1 0 1 0 0 0 0 1 1 1 1 0 1 0 0 0 1 0 0 1 0 0 0 1 1 1 …