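I have a set of time series, and I want to scale each one relative to its own value at a particular time point. That way, every series equals 1.0 at that time point and changes proportionally from there.
I can't figure out how to do this with dplyr.
Here is a working example using a for loop: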
library(dplyr)
data = expand.grid(
  category = LETTERS[1:3],
  year = 2000:2005)
data$value = runif(nrow(data))
# the reference time point that every series is scaled against
baseYear = 2002
# for each category, divide all the values by the category's value in the base year
for(category in as.character(levels(factor(data$category)))) {
  data[data$category == category,]$value = data[data$category == category,]$value / data[data$category == category & data$year == baseYear,]$value[[1]]
}
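Edit: I modified the question so that the base time point cannot be selected by index. Sometimes the "time" column is actually a factor, and not necessarily an ordered one.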
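One way the loop above might be expressed in dplyr is a grouped `mutate()` that divides each value by the group's value at the base time point. This is only a sketch: `data` and `baseYear` follow the example in the question, while `scaled` and the `set.seed()` call are assumptions added for illustration. The base row is picked by equality (`year == baseYear`), not by position, so it should also work when the time column is an unordered factor.

```r
library(dplyr)

set.seed(1)  # assumption: fixed seed, only so the sketch is reproducible
data <- expand.grid(category = LETTERS[1:3], year = 2000:2005)
data$value <- runif(nrow(data))
baseYear <- 2002

# Within each category, divide every value by that category's value
# in the base year. [1] guards against duplicate base-year rows.
scaled <- data %>%
  group_by(category) %>%
  mutate(value = value / value[year == baseYear][1]) %>%
  ungroup()

# Every category should now be 1 in the base year
filter(scaled, year == baseYear)
```

If a category has no row for the base year, `value[year == baseYear][1]` is `NA`, so that whole series becomes `NA` rather than raising an error.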
I have the following df:
df <- data.frame(geo_num = c(11, 12, 22, 41, 42, 43, 77, 71),
                 cust_id = c("A", "A", "B", "C", "C", "C", "D", "D"),
                 sales = c(2, 3, 2, 1, 2, 4, 6, 3))
> df
  geo_num cust_id sales
1      11       A     2
2      12       A     3
3      22       B     2
4      41       C     1
5      42       C     2
6      43       C     4
7      77       D     6
8      71       D     3
I need to create a new column "geo_num_new" in which every row of a "cust_id" group holds the first "geo_num" value of that group, like this:
> df_new
  geo_num cust_id sales geo_num_new
1      11       A     2          11
2      12       A     3          11
3      22       B     2          22
4      41       C     1          41
5      42       C     2          41
6      43       C     4          41
7 …
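A possible dplyr approach, sketched under the assumption that "first" means the first row of each group in the data frame's current order: group by `cust_id` and fill the new column with `first(geo_num)`. The names `df`, `df_new`, and `geo_num_new` come from the question; nothing else is assumed about the data.

```r
library(dplyr)

df <- data.frame(geo_num = c(11, 12, 22, 41, 42, 43, 77, 71),
                 cust_id = c("A", "A", "B", "C", "C", "C", "D", "D"),
                 sales = c(2, 3, 2, 1, 2, 4, 6, 3))

# first() returns the first geo_num within each cust_id group, in row
# order; mutate() recycles that single value to every row of the group.
df_new <- df %>%
  group_by(cust_id) %>%
  mutate(geo_num_new = first(geo_num)) %>%
  ungroup()

df_new
```

Note that `first()` follows row order, not magnitude, so a group is not necessarily assigned its smallest `geo_num`. If the smallest value were wanted instead, `min(geo_num)` could replace `first(geo_num)`.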