I have a very simple question that I am currently struggling with. Say I have an example data frame:
a <- c(1:5)
b <- c(1,3,5,9,11)
df1 <- data.frame(a,b)
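How can I create a new column ('c') and then populate it using if statements on column b? For example: 'cat' for values in b that are 1 or 2, 'dog' for values in b between 3 and 5, and 'rabbit' for values in b greater than 6.
So, using data frame df1, column 'c' would read: cat, dog, dog, rabbit, rabbit.
Many thanks in advance.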
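One possible approach is base R's `cut()`, which bins a numeric vector and labels each bin in one step. The sketch below rebuilds the data frame from the question. The break points are an assumption: the question does not say what should happen to a value of exactly 6 (or to values below 1), so here 6 falls under 'rabbit' and anything at or below 0 becomes `NA`.

```r
# Rebuild the example data frame from the question
a <- c(1:5)
b <- c(1, 3, 5, 9, 11)
df1 <- data.frame(a, b)

# cut() uses right-closed intervals by default: (0,2], (2,5], (5,Inf].
# The break points are an assumption; adjust them if 6 should be treated differently.
df1$c <- as.character(cut(df1$b,
                          breaks = c(0, 2, 5, Inf),
                          labels = c("cat", "dog", "rabbit")))

df1$c
```

Nested `ifelse()` calls would work too, but `cut()` tends to stay readable as the number of categories grows.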
Using the example data frame:
df <- structure(list(KY27SCH1 = c(4, 4, 4, 4, NA, 5, 2, 4, 4, NA, 4,
5, 3, 5, 5), KY27SCH2 = c(5, 4, 4, 4, NA, 4, 1, 4, 4, NA, 4,
5, 4, 5, 5), KY27SCH3 = c(4, 4, 5, 4, NA, 4, 4, 4, 5, NA, 5,
5, 3, 5, 5), KY27SCH4 = c(3, 5, 5, 4, NA, 5, 4, 5, 5, NA, 5,
5, 4, 5, 5)), .Names = c("KY27SCH1", "KY27SCH2", "KY27SCH3",
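"KY27SCH4"), row.names = 197:211, …
I am working on a large dataset that shows how people travel. I need to count the number of unique days on which people travelled. The table below shows the ID, which is unique to each person. Associated with each ID are the dates on which that person travelled; for some people this may be one trip per day, for others there may be several trips per day (e.g. person "1" made two trips on the 4th). What I need is the total number of unique days across everyone in the dataset (e.g. person 1 = 2, person 2 = 3, person 3 = 1, person 4 = 2, so the total for the mini dataset below should be 8).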
ID = c(1,1,1,2,2,2,2,3,4,4,4,4)
date = c("4th Nov","4th Nov","5th Nov","5th Nov","6th Nov","7th Nov","7th Nov","8th Nov","6th Nov","6th Nov","7th Nov","7th Nov")
data<-data.frame(ID,date)
Any advice on the R code would be much appreciated.
Many thanks.
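A minimal sketch of one way to do this in base R: drop the duplicated ID/date pairs with `unique()` and count what is left. The names `unique_days`, `total_days` and `per_person` are made up for illustration.

```r
# Rebuild the mini dataset from the question
ID <- c(1, 1, 1, 2, 2, 2, 2, 3, 4, 4, 4, 4)
date <- c("4th Nov", "4th Nov", "5th Nov", "5th Nov", "6th Nov", "7th Nov",
          "7th Nov", "8th Nov", "6th Nov", "6th Nov", "7th Nov", "7th Nov")
data <- data.frame(ID, date)

# Keep one row per distinct (ID, date) pair
unique_days <- unique(data[, c("ID", "date")])

# Total number of unique travel days across all people
total_days <- nrow(unique_days)

# Number of unique travel days per person
per_person <- table(unique_days$ID)

total_days
per_person
```

On the full dataset the same idea applies as long as the date column identifies a calendar day; if it holds date-times, it would first need to be reduced to a date.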
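I am working with a large data frame that has more than 40 columns. I would like to be able to move columns without having to specify all the column names. For example: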
a<-c(1:5)
b<-c(4,3,2,1,1)
Percent<-c(40,30,20,10,10)
Labels<-c("Cat","Dog","Rabbit","Rat","Mouse")
df1<-data.frame(a,b,Percent,Labels)
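How can I move the column 'Labels' in front of column 'a' without having to write out all the other column names (i.e. can I specify that one column goes before/after another)?
Thanks.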
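One way to sketch this in base R is to build the new column order from the name to move plus `setdiff()` of everything else, so no other column has to be typed out. This assumes the column should go to the very front, which is the same as "before 'a'" in the example.

```r
# Rebuild the example data frame from the question
a <- c(1:5)
b <- c(4, 3, 2, 1, 1)
Percent <- c(40, 30, 20, 10, 10)
Labels <- c("Cat", "Dog", "Rabbit", "Rat", "Mouse")
df1 <- data.frame(a, b, Percent, Labels)

# Put 'Labels' first, followed by all remaining columns in their current order
df1 <- df1[, c("Labels", setdiff(names(df1), "Labels"))]

names(df1)
```

If the dplyr package is available, `dplyr::relocate(df1, Labels, .before = a)` expresses the before/after placement more directly.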
I am having some trouble adding a vertical line to a ggplot2 plot.
My example data frame is shown below.
library(ggplot2)
set.seed(1234)
df <- data.frame(Date=seq(as.POSIXct("05:00", format="%H:%M"),
as.POSIXct("23:00", format="%H:%M"), by="hours"))
df$Counts <- sample(19)
df <- df[-c(4,7,17,18),]
# generate the groups automatically and plot
idx <- c(1, diff(df$Date))
i2 <- c(1,which(idx != 1), nrow(df)+1)
df$grp <- rep(1:length(diff(i2)), diff(i2))
g <- ggplot(df, aes(x=Date, y=Counts)) + geom_line(aes(group = grp)) +
geom_point()
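There seems to be a lot of discussion on Stack Overflow and elsewhere on the web about using vlines with time series. I have tried to correct my code accordingly, but so far I have not had much luck.
Say I want a vertical line at 2 pm on the 21st.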
g1 <- g + geom_vline(xintercept=as.numeric(as.Date("2013-02-21 14:00:00")))
Can anyone tell me how to get this working?
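A likely cause is that `as.Date()` drops the time of day and yields a number of days, whereas a POSIXct x axis is measured in seconds, so the line lands far outside the plotted range. A minimal sketch of the fix, assuming the intercept should be 14:00 on the same day as the data (the example times carry the current date, not the 21st), is to build the intercept with `as.POSIXct()` in the same way as the data. The names `times` and `vline_at` are made up for illustration, and the plot is only built if ggplot2 is installed.

```r
# Hourly time stamps like those in the question (the date part defaults to today)
times <- seq(as.POSIXct("05:00", format = "%H:%M"),
             as.POSIXct("23:00", format = "%H:%M"), by = "hours")
df <- data.frame(Date = times, Counts = seq_along(times))

# Use as.POSIXct, not as.Date, so the time of day is kept and the value
# is in seconds since the epoch, the same unit as the x axis
vline_at <- as.POSIXct("14:00", format = "%H:%M")

if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  g <- ggplot(df, aes(x = Date, y = Counts)) + geom_line() + geom_point()
  # as.numeric() converts the POSIXct value to the seconds scale used by the axis
  g1 <- g + geom_vline(xintercept = as.numeric(vline_at), linetype = "dashed")
}
```

For data that really do fall on a specific date, the intercept would be written with the full date-time, e.g. `as.POSIXct("2013-02-21 14:00:00")`, and the data would need to cover that date for the line to be visible.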
For the example data frame:
df <- structure(
list(
country = structure(
c(1L, 1L, 1L, 2L, 2L, 2L,
3L, 3L, 3L, 3L),
.Label = c("Austria", "France", "UK"),
class = "factor"
),
id = 1:10,
region.0 = structure(
c(1L, 1L, 1L, 2L, 2L, 2L,
3L, 3L, 3L, 3L),
.Label = c("AT", "FR", "UK"),
class = "factor"
),
region.1 = structure(
c(1L, 1L, 2L, 3L, 3L, 3L, 4L, 4L, 6L,
5L),
.Label = c("AT1", "AT2", "FR1", "UK1", "UK4", "UK6"),
class = "factor"
),
region.2 = structure( …
For the example data frame:
set.seed (1000)
a <- rnorm(1000)
b <- seq(1, 1000, by=1)
df <- data.frame(b, a)
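I want to exclude the top 1% and bottom 1% of the data (column a).
I have read about trimming and quantiles in R, but I can't seem to get them to work.
Could someone explain how I can:
a. set these extremes to NA
b. remove these extremes from my data frame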
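Both variants can be sketched with `quantile()`: compute the 1st and 99th percentiles of column a, flag the values outside them, and then either overwrite or drop the flagged rows. The names `limits`, `extreme`, `df_na` and `df_trim` are made up for illustration, and treating values exactly equal to a cutoff as "kept" is an assumption.

```r
# Rebuild the example data frame from the question
set.seed(1000)
a <- rnorm(1000)
b <- seq(1, 1000, by = 1)
df <- data.frame(b, a)

# 1st and 99th percentiles of column a
limits <- quantile(df$a, probs = c(0.01, 0.99))

# TRUE for values strictly below the lower or above the upper cutoff
extreme <- df$a < limits[1] | df$a > limits[2]

# (a) keep every row but set the extreme values to NA
df_na <- df
df_na$a[extreme] <- NA

# (b) remove the rows holding extreme values
df_trim <- df[!extreme, ]
```

Note that `mean(x, trim = 0.01)` only trims while computing a mean; it does not return the trimmed data, which may be why it did not seem to work here.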
For the example data frame:
df <- structure(list(id = 1:18, region = structure(c(1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L), .Label = c("a",
"b"), class = "factor"), age.cat = structure(c(1L, 1L, 2L, 2L,
2L, 3L, 3L, 4L, 1L, 1L, 1L, 1L, 2L, 2L, 3L, 4L, 4L, 4L), .Label = c("0-18",
"19-35", "36-50", "50+"), class = "factor")), .Names = c("id",
"region", "age.cat"), class = "data.frame", row.names = c(NA,
-18L))
I want to reshape the data as detailed below:
region 0-18 19-35 36-50 50+ …
For the example data.frame:
df <- structure(list(region = structure(c(1L, 1L, 1L, 1L, 1L, 2L, 2L,
2L, 2L, 2L, 2L), .Label = c("a", "b", "c", "d"), class = "factor"),
result = c(0L, 1L, 1L, 0L, 0L, 1L, 0L, 0L, 0L, 1L, 0L), weight = c(0.126,
0.5, 0.8, 1.5, 5.3, 2.2, 3.2, 1.1, 0.1, 1.3, 2.5)), .Names = c("region",
"result", "weight"), row.names = c(NA, 11L), class = "data.frame")
df$region <- factor(df$region)
result <- xtabs(weight ~ region + result, data=df)
result
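How would I change the order of the xtabs output (I don't want to switch the axes, which I have asked about before)? For example, ensuring …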
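The desired order is cut off above, so the following is only a sketch under an assumption: that the goal is to show the `result` column "1" before "0". `xtabs()` follows factor level order, so one option is to set the levels before tabulating; another is to reorder the finished table by column name. The names `result_f`, `tab1` and `tab2` are made up for illustration.

```r
# Same values as the example data frame, written out compactly
df <- data.frame(
  region = factor(c(rep("a", 5), rep("b", 6))),
  result = c(0L, 1L, 1L, 0L, 0L, 1L, 0L, 0L, 0L, 1L, 0L),
  weight = c(0.126, 0.5, 0.8, 1.5, 5.3, 2.2, 3.2, 1.1, 0.1, 1.3, 2.5)
)

# Option 1: fix the level order first; xtabs() lays out columns in level order
df$result_f <- factor(df$result, levels = c(1, 0))
tab1 <- xtabs(weight ~ region + result_f, data = df)

# Option 2: tabulate as before, then reorder the columns by name
tab2 <- xtabs(weight ~ region + result, data = df)
tab2 <- tab2[, c("1", "0")]

tab1
tab2
```

The same pattern applies to the rows: set the levels of `region`, or index the table with the row names in the order wanted.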
For the example data frame:
survey <- structure(list(id = 1:10, cntry = structure(c(2L, 3L, 1L, 2L,
2L, 3L, 1L, 1L, 3L, 2L), .Label = c("DE", "FR", "UK"), class = "factor"),
age.cat = structure(c(1L, 1L, 2L, 4L, 1L, 3L, 4L, 4L, 1L,
2L), .Label = c("Y_15.24", "Y_40.54", "Y_55.plus", "Y_less.15"
), class = "factor")), .Names = c("id", "cntry", "age.cat"
), class = "data.frame", row.names = c(NA, -10L))
I want to add an extra column called 'age.cat', populated from another data frame:
age.cat <- structure(list(cntry = structure(c(2L, 3L, 1L), .Label = c("DE",
"FR", "UK"), class = "factor"), Y_less.15 = c(0.2, …