wol*_*oor 2 r transform dataframe
I have a data frame that looks like this:
weekyear Location_Id priceA priceB
1 20101 6367 0.8712934 8
2 20101 6380 0.1712934 8
3 20102 6367 0.8712934 4
4 20102 6380 0.4712934 4
5 20103 6367 0.8712934 1
6 20103 6380 0.8712934 9
I want to demean priceA and priceB. Each is indexed by location and time. What I want is
priceAnew = priceA_{location,time} - mean(over time)(priceA_{location}) - mean(over location)(priceA_{time})
The notation is clearer here: https://stats.stackexchange.com/questions/126549/do-people-used-fixed-effects-in-lasso
Is there a painless way to do this?
I'm guessing you're looking for something like
transform(dd,
newA = priceA-ave(priceA, weekyear)-ave(priceA, Location_Id),
newB = priceB-ave(priceB, weekyear)-ave(priceB, Location_Id)
)
(where dd is the name of your data.frame). This returns
weekyear Location_Id priceA priceB newA newB
1 20101 6367 0.8712934 8 -0.5212934 -4.333333
2 20101 6380 0.1712934 8 -0.8546267 -7.000000
3 20102 6367 0.8712934 4 -0.6712934 -4.333333
4 20102 6380 0.4712934 4 -0.7046267 -7.000000
5 20103 6367 0.8712934 1 -0.8712934 -8.333333
6 20103 6380 0.8712934 9 -0.5046267 -3.000000
for your sample input. If you had to do this over many columns, I would probably prefer a loop.
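cols <- paste0("price", LETTERS[1:2])  # "priceA", "priceB"
for (col in cols) {
  # ave() returns the group mean (its default FUN) for each row;
  # the new columns are named "newpriceA" and "newpriceB"
  dd[[paste0("new", col)]] <- dd[[col]] -
    ave(dd[[col]], dd$weekyear) -
    ave(dd[[col]], dd$Location_Id)
}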
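For anyone who wants to try this end to end, here is a minimal self-contained sketch. It rebuilds the sample data from the question and applies the same subtraction with lapply() instead of a loop. The helper name demean_two_way and the newA/newB column names are hypothetical choices for illustration, not part of the original answer. The result should reproduce the newA and newB columns in the output shown above.

```r
# Sample data copied from the question
dd <- data.frame(
  weekyear    = c(20101, 20101, 20102, 20102, 20103, 20103),
  Location_Id = c(6367, 6380, 6367, 6380, 6367, 6380),
  priceA      = c(0.8712934, 0.1712934, 0.8712934,
                  0.4712934, 0.8712934, 0.8712934),
  priceB      = c(8, 8, 4, 4, 1, 9)
)

# Hypothetical helper (not in the original answer): subtract the
# per-time mean and the per-location mean from one column.
# ave() uses mean as its default FUN.
demean_two_way <- function(x, time, location) {
  x - ave(x, time) - ave(x, location)
}

cols     <- c("priceA", "priceB")
new_cols <- sub("^price", "new", cols)  # "newA", "newB"

# Apply the helper to each price column and attach the results
dd[new_cols] <- lapply(dd[cols], demean_two_way,
                       time = dd$weekyear, location = dd$Location_Id)

print(dd)
```

Wrapping the subtraction in a function keeps the grouping variables in one place, so adding more price columns only means extending cols.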