vag*_*ond 3 for-loop r calculated-columns conditional-statements dataframe
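I have looked at related questions, but all of the examples test either numeric vectors or NA values in other columns before adding a new variable. Here is a short reproducible example: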
x <- c("dec 12", "jan 13", "feb 13", "march 13", "apr 13", "may 13",
"june 13", "july 13", "aug 13", "sep 13", "oct 13", "nov 13")
y <- c(234, 678, 534, 122, 179, 987, 872, 730, 295, 450, 590, 312)
df<-data.frame(x,y)
I want to add "winter" where df$x is dec, jan or feb, "spring" for march, apr or may, and likewise "summer" and "autumn" for the remaining months.
I tried:
df$season <- ifelse(df[1:3, ], "winter", ifelse(df[4:6, ], "spring",
ifelse(df[7:9, ], "summer", "autumn")))
I know this is a very inefficient way of doing things, but I am a newbie and a kludger. It returned the error:
Error in ifelse(df[1:3, ], "winter", ifelse(df[4:6, ], "spring",
ifelse(df[7:9, : (list) object cannot be coerced to type 'logical'
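If the same data frame had thousands of rows and I wanted to loop through it and create a new season variable based on the month of the year, how could I do that? I referred to "Looping through a data frame to add a column depending on variables in other columns", but that question loops and applies a mathematical operator to create the new variable. I also tried external resources: a thread on the R mailing list and a thread on the TalkStats forum. However, both are based on numeric variables and conditions.
If you have a really large data frame, then data.table will be very helpful for you. The following works: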
library(data.table)
x <- c("dec 12", "jan 13", "feb 13", "march 13", "apr 13", "may 13",
"june 13", "july 13", "aug 13", "sep 13", "oct 13", "nov 13")
y <- c(234, 678, 534, 122, 179, 987, 872, 730, 295, 450, 590, 312)
df <-data.frame(x,y)
DT <- data.table(df)
DT[, month := substr(tolower(x), 1, 3)]
DT[, season := ifelse(month %in% c("dec", "jan", "feb"), "winter",
ifelse(month %in% c("mar", "apr", "may"), "spring",
ifelse(month %in% c("jun", "jul", "aug"), "summer",
ifelse(month %in% c("sep", "oct", "nov"), "autumn", NA))))]
DT
           x   y month season
 1:   dec 12 234   dec winter
 2:   jan 13 678   jan winter
 3:   feb 13 534   feb winter
 4: march 13 122   mar spring
 5:   apr 13 179   apr spring
 6:   may 13 987   may spring
 7:  june 13 872   jun summer
 8:  july 13 730   jul summer
 9:   aug 13 295   aug summer
10:   sep 13 450   sep autumn
11:   oct 13 590   oct autumn
12:   nov 13 312   nov autumn
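For what it is worth, the attempt in the question fails because ifelse() expects a logical vector as its test, whereas df[1:3, ] is a data frame (a list) and cannot be coerced to logical. If you would rather not nest ifelse() calls, one possible alternative is a named lookup vector in base R. This is only a sketch, not part of the data.table approach above: the names season_lookup and month_key are made up for illustration, and it assumes every value in x starts with an English month name.

```r
# Same example data as in the question.
x <- c("dec 12", "jan 13", "feb 13", "march 13", "apr 13", "may 13",
       "june 13", "july 13", "aug 13", "sep 13", "oct 13", "nov 13")
y <- c(234, 678, 534, 122, 179, 987, 872, 730, 295, 450, 590, 312)
df <- data.frame(x, y, stringsAsFactors = FALSE)

# Hypothetical lookup table: names are three-letter month abbreviations,
# values are the corresponding seasons.
season_lookup <- c(
  dec = "winter", jan = "winter", feb = "winter",
  mar = "spring", apr = "spring", may = "spring",
  jun = "summer", jul = "summer", aug = "summer",
  sep = "autumn", oct = "autumn", nov = "autumn"
)

# Reduce "march 13" to "mar", "june 13" to "jun", and so on.
month_key <- substr(tolower(df$x), 1, 3)

# Index the lookup vector by name; keys with no match yield NA.
df$season <- unname(season_lookup[month_key])
```

The lookup is vectorised, so no explicit loop is needed even with thousands of rows, and changing the month-to-season mapping only means editing season_lookup.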