woo*_*gie 5 sql t-sql sql-server sql-server-2008
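I have a large database from which I have extracted a study population. For comparison, I want to select a control group with similar characteristics. The two criteria I want to match on are age and sex. The query that gives me the numbers I want to match is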
select sex, age/10 as decades,COUNT(*) as counts
from
(
select distinct m.patid
,m.sex,DATEPART(year,min(c.admitdate)) -m.yrdob as Age
from members as m
inner join claims as c on c.patid=m.PATID
group by m.PATID, m.sex,m.yrdob
)x group by sex, Age/10
The result set looks like

The decades column of this result set is given by the expression
(DATEPART(year,min(c.admitdate)) -m.yrdob)/10
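This uses integer division to bucket people into the age ranges 20-29, 30-39, and so on. For example, I want to select 507 women in their twenties from a larger data set. The query that finds the characteristics of the larger data set is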
select distinct m.patid
,m.sex
,(DATEPART(year,min(c.admitdate)) -m.yrdob)/10 as decades
from members as m
inner join claims as c on c.patid=m.PATID
group by m.PATID, m.sex,m.yrdob
Edit: results of the second query

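So I need the sum of the decades column in the second query to equal counts in the first query. What I tried is below; it returns zero results. What do I need to do to match on these ages?
The query that runs but returns no results: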
select x.PATID--,x.sex,x.decades,y.counts
from
(
select distinct m.patid
,m.sex
,(DATEPART(year,min(c.admitdate)) -m.yrdob)/10 as decades
from members as m
inner join claims as c on c.patid=m.PATID
group by m.PATID, m.sex,m.yrdob
) as x
inner join
(
select sex, age/10 as decades,COUNT(*) as counts
from
(
select distinct m.patid
,m.sex,DATEPART(year,min(c.admitdate)) -m.yrdob as Age
from members as m
inner join claims as c on c.patid=m.PATID
group by m.PATID, m.sex,m.yrdob
)x group by sex, Age/10
) as y on x.sex=y.sex and x.decades=y.decades
group by y.counts,x.PATID,x.sex,y.sex
having SUM(x.decades)=y.counts and x.sex=y.sex
select
T1.sex,
T1.decades,
T1.counts,
T2.patid
from (
select
sex,
age/10 as decades,
COUNT(*) as counts
from (
select m.patid,
m.sex,
DATEPART(year,min(c.admitdate)) -m.yrdob as Age
from members as m
inner join claims as c on c.patid=m.PATID
group by m.PATID, m.sex,m.yrdob
)x
group by sex, Age/10
) as T1
join (
--right here is where the random sampling occurs
SELECT TOP 50 --this is the total number of people in our dataset
patid
,sex
,decades
from (
select m.patid,
m.sex,
(DATEPART(year,min(c.admitdate)) -m.yrdob)/10 as decades
from members as m
inner join claims as c on c.patid=m.PATID
group by m.PATID, m.sex, m.yrdob
) T2
order by NEWID()
) as T2
on T2.sex = T1.sex
and T2.decades = T1.decades
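The query above draws a single random sample across all strata. If the goal is a fixed number of controls per sex and decade (such as the 507 women in their twenties mentioned in the question), one possible variation is to number the candidates randomly within each stratum and keep only as many as the study count requires. The following is a minimal sketch, not a tested solution: the names study_counts, candidates and rn are hypothetical, and it assumes the first CTE is restricted to the study population and the second to the larger pool, which the queries shown here do not do (both read from members and claims).

```sql
-- Sketch only: study_counts, candidates and rn are illustrative names.
-- Assumption: the first CTE should be limited to the study population and
-- the second to the larger pool; the source does not show how they differ.
with study_counts as (
    -- target number of controls per sex and decade of age
    select sex, age / 10 as decades, count(*) as counts
    from (
        select m.patid, m.sex,
               datepart(year, min(c.admitdate)) - m.yrdob as age
        from members as m
        inner join claims as c on c.patid = m.patid
        group by m.patid, m.sex, m.yrdob
    ) as s
    group by sex, age / 10
),
candidates as (
    -- every candidate, numbered in random order within its stratum
    select p.patid, p.sex, p.decades,
           row_number() over (partition by p.sex, p.decades
                              order by newid()) as rn
    from (
        select m.patid, m.sex,
               (datepart(year, min(c.admitdate)) - m.yrdob) / 10 as decades
        from members as m
        inner join claims as c on c.patid = m.patid
        group by m.patid, m.sex, m.yrdob
    ) as p
)
-- keep the first counts candidates of each stratum
select c.patid, c.sex, c.decades
from candidates as c
inner join study_counts as t
    on t.sex = c.sex
   and t.decades = c.decades
where c.rn <= t.counts;
```

Because the random numbering restarts in every (sex, decades) partition, the filter rn <= counts returns at most counts patients per stratum, or fewer if the pool does not contain enough candidates.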
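Edit: I posted another question similar to this one, in which I found that my results were not actually random; they were just the top N results. I had put order by newid() in the outermost query, and all that did was shuffle the exact same result set. From that now-closed question, I learned that I need to use the TOP keyword together with order by newid() on the commented line of the query above that joins T1 and T2.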
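The difference described in this edit can be illustrated with two small queries against the members table. This is a simplified sketch rather than the original, pre-edit query, and the sample size of 50 is simply carried over from the answer.

```sql
-- Not a random sample: TOP picks 50 rows first (whichever rows the engine
-- happens to return), and the outer ORDER BY only shuffles those same rows.
select t.patid
from (select top (50) patid from members) as t
order by newid();

-- Random sample: every row gets a random sort key first,
-- and TOP then keeps 50 of the shuffled rows.
select top (50) patid
from members
order by newid();
```

In the second form TOP and ORDER BY NEWID() sit in the same SELECT, so the sort is applied before the row limit.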