I am trying to extract this kind of information from the following paragraph structure:
women_ran  men_ran  kids_ran  walked
        1        2         1       3
        2        4         3       1
        3        6         5       2
text = ["On Tuesday, one women ran on the street while 2 men ran and 1 child ran on the sidewalk. Also, there were 3 people walking.", "One person was walking yesterday, but there were 2 women running as well as 4 men and 3 kids running.", "The other day, there were three women running and also 6 men and 5 kids running …
In my data, observations exist for some IDs in some months but not in others, for example:
dat <- data.frame(c(1, 1, 1, 2, 3, 3, 3, 4, 4, 4), c(rep(30, 2), rep(25, 5), rep(20, 3)), c('2017-01-01', '2017-02-01', '2017-04-01', '2017-02-01', '2017-01-01', '2017-02-01', '2017-03-01', '2017-01-01',
'2017-02-01', '2017-04-01'))
colnames(dat) <- c('id', 'value', 'date')
For each id, I would like to insert a row for every month (date) that is missing for that id, with NA for value.
Is there a (somewhat) concise way to do this for the months in seq(min(as.Date(dat$date)), max(as.Date(dat$date)), by = 'months')? I regularly use tidyverse and data.table, but I am open to any approach.
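One possible way to fill in the missing months, sketched here in base R only: build every id/month combination with expand.grid and left-join the observed data onto it with merge. The names all_months, grid, and filled are illustrative and not from the question, and the date column is parsed as Date up front, which the original snippet does not do.

```r
# Same data as in the question, with the column names set directly
# and the date column parsed as Date (an assumption of this sketch).
dat <- data.frame(
  id    = c(1, 1, 1, 2, 3, 3, 3, 4, 4, 4),
  value = c(rep(30, 2), rep(25, 5), rep(20, 3)),
  date  = as.Date(c('2017-01-01', '2017-02-01', '2017-04-01', '2017-02-01',
                    '2017-01-01', '2017-02-01', '2017-03-01', '2017-01-01',
                    '2017-02-01', '2017-04-01'))
)

# Every month between the earliest and latest date in the data.
all_months <- seq(min(dat$date), max(dat$date), by = 'months')

# Every id paired with every month.
grid <- expand.grid(id = unique(dat$id), date = all_months)

# Left join: id/month pairs without an observation get NA for value.
filled <- merge(grid, dat, by = c('id', 'date'), all.x = TRUE)

# Restore the original column order and sort by id, then date.
filled <- filled[order(filled$id, filled$date), c('id', 'value', 'date')]
rownames(filled) <- NULL
```

With four ids and four months, filled has 16 rows, six of which carry NA in value. For tidyverse users, tidyr::complete is designed for the same task and may be more concise.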
I have two dates (date1 and date2) and an id variable in a data.frame:
dat <- data.frame(c('2014-02-11', '2014-05-04', '2014-05-22'), c('2014-04-12', '2014-09-22', '2014-07-04'), c('a', 'a', 'b'))
names(dat) <- c('date1', 'date2', 'id')
# Parse the strings as Date so they can be compared chronologically
dat$date1 <- as.Date(dat$date1, format = '%Y-%m-%d')
dat$date2 <- as.Date(dat$date2, format = '%Y-%m-%d')
> dat
date1 date2 id
1 2014-02-11 2014-04-12 a
2 2014-05-04 2014-09-22 a
3 2014-05-22 2014-07-04 b
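I would like to create a new variable var that indicates whether any date2 value comes before that row's date1 value (and not just the date2 value immediately preceding it):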
> dat
date1 date2 id var
1 2014-02-11 2014-04-12 a 0
2 2014-05-04 2014-09-22 a 1
3 2014-05-22 …
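A minimal sketch of one way to compute such a flag in base R. Some date2 value precedes a row's date1 exactly when the smallest date2 does, so a comparison against the minimum is enough. The excerpt is cut off before it shows whether the comparison should run across all rows or only within each id, so both variants are computed below under the hypothetical names any_before_all and any_before_by_id; neither is confirmed as the intended var.

```r
# Data from the question, with both date columns parsed as Date.
dat <- data.frame(
  date1 = as.Date(c('2014-02-11', '2014-05-04', '2014-05-22')),
  date2 = as.Date(c('2014-04-12', '2014-09-22', '2014-07-04')),
  id    = c('a', 'a', 'b')
)

# Variant 1 (assumption): compare against date2 values from all rows.
dat$any_before_all <- as.integer(dat$date1 > min(dat$date2))

# Variant 2 (assumption): compare only against date2 values with the same id.
# ave() returns the per-id minimum of date2, as a number, for every row.
min_date2_by_id <- ave(as.numeric(dat$date2), dat$id, FUN = min)
dat$any_before_by_id <- as.integer(as.numeric(dat$date1) > min_date2_by_id)
```

Both variants give 0 and 1 for the first two rows, matching the visible part of the expected output. They differ on the third row, which is the part of the output that is truncated in the excerpt.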