Zhu*_*arb 4 r dataframe r-factor
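I have a data.frame similar to the one below. I preprocess it by removing rows I am not interested in. Most of my columns are factors, and their levels are not updated when I filter the data.frame.
I can see that what I am doing below is not ideal. How do I get the factor levels to update when I modify the data.frame? Below is a demonstration of the problem.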
# count() with a `vars` argument comes from the plyr package
library(plyr)
# generate data (stringsAsFactors = TRUE is needed on R >= 4.0 to get factor columns)
set.seed(2013)
df <- data.frame(site = sample(c("A", "B", "C"), 50, replace = TRUE),
                 currency = sample(c("USD", "EUR", "GBP", "CNY", "CHF"), 50, replace = TRUE, prob = c(10, 6, 5, 6, 0.5)),
                 value = ceiling(rnorm(50) * 10),
                 stringsAsFactors = TRUE)
# check counts to see there is one entry where currency = CHF
count(df, vars = "currency")
#   currency freq
# 1      CHF    1
# 2      CNY   13
# 3      EUR   16
# 4      GBP    6
# 5      USD   14
# filter out all entries where site = A, i.e. take a subset of df
df <- df[!(df$site == "A"), ]
# check counts again to see how this affected the currency frequencies
count(df, vars = "currency")
#   currency freq
# 1      CNY   10
# 2      EUR    8
# 3      GBP    4
# 4      USD   10
# But the filtered data.frame's levels have not been updated:
levels(df$currency)
# [1] "CHF" "CNY" "EUR" "GBP" "USD"
levels(df$site)
# [1] "A" "B" "C"
Desired output:
# levels(df$currency) = "CNY" "EUR" "GBP" "USD"
# levels(df$site) = "B" "C"
A5C*_*2T1 12
Use droplevels:
> df <- droplevels(df)
> levels(df$currency)
[1] "CNY" "EUR" "GBP" "USD"
> levels(df$site)
[1] "B" "C"
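If you only want to reset some of the columns, the following base R sketch may help. It uses a small made-up data frame (the values and the names `sub`, `all_dropped`, `one_col` and `keep_site` are illustrative, not from the question) and shows three variants: dropping unused levels everywhere, re-creating a single factor with `factor()`, and using the `except` argument of the data.frame method of `droplevels` to leave a column untouched.

```r
# Small illustrative data frame (hypothetical values, not the question's data)
df <- data.frame(
  site     = factor(c("A", "B", "C", "B")),
  currency = factor(c("CHF", "USD", "EUR", "USD")),
  value    = c(1, 2, 3, 4)
)

# Subsetting keeps the original level sets, even for values that no longer occur
sub <- df[df$site != "A", ]
levels(sub$site)      # still includes "A"
levels(sub$currency)  # still includes "CHF"

# 1. Drop unused levels in every factor column of the data frame
all_dropped <- droplevels(sub)

# 2. Reset a single column by re-creating the factor from its current values
one_col <- sub
one_col$currency <- factor(one_col$currency)

# 3. Drop unused levels everywhere except the listed column indices
keep_site <- droplevels(sub, except = which(names(sub) == "site"))

levels(all_dropped$site)      # unused "A" removed
levels(one_col$site)          # untouched, "A" still present
levels(keep_site$currency)    # unused "CHF" removed
```

Variant 3 can be useful when a downstream table or plot should still show the empty category for one column.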