create table umd2
as select a.permno, a.date, a.realdate, exp(sum(log(1+b.ret))) - 1 as cum_return
from msex2 (obs=50 keep=permno date realdate) as a, msex2 (obs=50 keep=permno date ret) as b
where a.permno=b.permno and 0<=intck('month', b.date, a.date)<3
group by a.permno, a.date
having count(b.ret)=3;
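This query computes momentum (the cumulative return over the past 3 months). However, it gives me duplicate rows. I thought group by would not return duplicate rows? When I add the realdate column to my group by statement,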
create table umd2
as select a.permno, a.date, a.realdate, exp(sum(log(1+b.ret))) - 1 as cum_return
from msex2 (obs=50 keep=permno date realdate) as a, msex2 (obs=50 keep=permno date ret) as b
where a.permno=b.permno and 0<=intck('month', b.date, a.date)<3
group by a.permno, a.date, a.realdate
having count(b.ret)=3;
those duplicate rows disappear. Why is that?
This is how SAS behaves. SAS recognizes the following query:
select a.permno, a.date, a.realdate, count(*)
from <whatever>
group by a.permno, a.date, a.realdate;
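as an aggregation query. This means the rows are aggregated and reduced, with one result row per combination of the three columns. In particular, the non-aggregated columns in the select match (or are a subset of) the columns in the group by.
When you do this: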
select a.permno, a.date, a.realdate, count(*)
from <whatever>
group by a.permno, a.date;
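You are now using non-standard SQL. Most databases would raise an error. MySQL accepts this syntax (when ONLY_FULL_GROUP_BY is disabled) and assigns a.realdate an arbitrary value from the matching rows. SAS does something different. SAS says: "Well, you evidently don't intend this to be an aggregation query." So it does not collapse the rows; instead it computes the aggregate for each group and appends that value to every row in the group. That is why you see what look like duplicates: one output row per joined input row. In other databases, you would do this with window functions.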
Technically, SAS calls this remerging summary statistics, which is explained in the SAS documentation.
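To make the difference concrete, here is a minimal sketch of the two behaviours using SQLite through Python rather than SAS. The table name `joined` and its rows are made up for illustration; they stand in for the result of the self-join in the question. The first query groups by every non-aggregated column, and the second uses a window function to imitate what SAS does when it remerges.

```python
import sqlite3

# Hypothetical data: pretend this is the joined a x b result from the question.
# permno 1 has three matching monthly returns, permno 2 has only one.
rows = [
    (1, "2020-03", "2020-03-31", 0.01),
    (1, "2020-03", "2020-03-31", 0.02),
    (1, "2020-03", "2020-03-31", 0.03),
    (2, "2020-03", "2020-03-31", 0.05),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE joined (permno INTEGER, date TEXT, realdate TEXT, ret REAL)"
)
conn.executemany("INSERT INTO joined VALUES (?, ?, ?, ?)", rows)

# Standard aggregation: every non-aggregated column is in GROUP BY,
# so each group collapses to a single row.
aggregated = conn.execute(
    """
    SELECT permno, date, realdate, COUNT(ret) AS n
    FROM joined
    GROUP BY permno, date, realdate
    ORDER BY permno, date
    """
).fetchall()

# Window function: the count is computed per (permno, date) group
# but attached to every input row, similar to a SAS remerge.
remerged = conn.execute(
    """
    SELECT permno, date, realdate,
           COUNT(ret) OVER (PARTITION BY permno, date) AS n
    FROM joined
    ORDER BY permno, date
    """
).fetchall()

print(len(aggregated), "aggregated rows")
print(len(remerged), "remerged rows")
```

The aggregated result has one row per group, while the remerged result keeps one row per input row, which is the "duplicate" pattern described in the question. In SAS itself, adding realdate to the group by (as you did) is the direct fix.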