Correlated subquery to find totals greater than the state average

MSI*_*SIS 2 sql-server subquery sql-server-2012

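I have the tables Vendors (VendorName, VendorState,....) and Invoices (InvoiceID, InvoiceTotal,...). I want to get the invoices (as InvoiceID) whose InvoiceTotal is greater than the average InvoiceTotal for their state.

I know that I first find the average total for each state: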

SELECT  VendorState, Avg(InvoiceTotal) AS AvgStateInvoice
from Invoices I join Vendors V on V.VendorID= I.VendorID 
group by VendorState

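So now I have a list of the average InvoiceTotal by state. What I still need to figure out is:

How do I write the outer query that selects the invoices greater than their state's average? This is where I get lost, because I can't remember the syntax for the comparison. I imagine it would be something like: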

SELECT InvoiceId from Invoices where InvoiceTotal > .....?

Any ideas?

Jam*_*s Z 7

You can do it like this:

select * 
from 
    dbo.Invoices I1 
    join dbo.Vendors V1 on V1.VendorID = I1.VendorID
where 
    I1.InvoiceTotal > (
        SELECT 
            Avg(I2.InvoiceTotal)
        from 
            dbo.Invoices I2 
            join dbo.Vendors V2 on V2.VendorID = I2.VendorID
        where 
            V1.VendorState = V2.VendorState 
    );

Or use a window function, which might be faster because it doesn't need the extra join:

SELECT X.*
from (
    select
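        I.*, -- a bare * fails here: VendorID is in both tables, and a derived table needs unique column names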
        Avg(I.InvoiceTotal) over (partition by V.VendorState) as AvgInv
    from 
        dbo.Invoices I
        join dbo.Vendors V on V.VendorID = I.VendorID
) X
where
    X.InvoiceTotal > X.AvgInv;

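The window function option may not be faster. Although it saves a join in the query text, the execution plan will contain a common subexpression spool (with two extra joins). The table spool is needed to compute and hold the average for the current partition, and the spooled result is replayed once per partition.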
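Since it isn't obvious which form is cheaper, it may be worth measuring on your own data rather than guessing. Below is a minimal sketch that runs both queries from the answer with SET STATISTICS IO and SET STATISTICS TIME switched on, so the reads and elapsed time of each show up in the messages output. It assumes the dbo.Invoices and dbo.Vendors tables as used above. Variant C is not from this thread: it is an extra option that joins back to the per-state aggregate from the question, and the names StateAvg and S are made up for illustration.

```sql
-- Report logical reads and CPU/elapsed time for each statement.
SET STATISTICS IO ON;
SET STATISTICS TIME ON;

-- Variant A: correlated subquery (from the answer above).
SELECT I1.InvoiceID
FROM dbo.Invoices AS I1
JOIN dbo.Vendors AS V1 ON V1.VendorID = I1.VendorID
WHERE I1.InvoiceTotal > (
    SELECT AVG(I2.InvoiceTotal)
    FROM dbo.Invoices AS I2
    JOIN dbo.Vendors AS V2 ON V2.VendorID = I2.VendorID
    WHERE V2.VendorState = V1.VendorState
);

-- Variant B: window function (from the answer above).
SELECT X.InvoiceID
FROM (
    SELECT I.InvoiceID,
           I.InvoiceTotal,
           AVG(I.InvoiceTotal) OVER (PARTITION BY V.VendorState) AS AvgInv
    FROM dbo.Invoices AS I
    JOIN dbo.Vendors AS V ON V.VendorID = I.VendorID
) AS X
WHERE X.InvoiceTotal > X.AvgInv;

-- Variant C (assumption, not from the thread): reuse the per-state
-- aggregate from the question and join each invoice to its state's row.
WITH StateAvg AS (
    SELECT V.VendorState, AVG(I.InvoiceTotal) AS AvgStateInvoice
    FROM dbo.Invoices AS I
    JOIN dbo.Vendors AS V ON V.VendorID = I.VendorID
    GROUP BY V.VendorState
)
SELECT I.InvoiceID
FROM dbo.Invoices AS I
JOIN dbo.Vendors AS V ON V.VendorID = I.VendorID
JOIN StateAvg AS S ON S.VendorState = V.VendorState
WHERE I.InvoiceTotal > S.AvgStateInvoice;

SET STATISTICS IO OFF;
SET STATISTICS TIME OFF;
```

Comparing the actual execution plans side by side should also show whether the spool mentioned above appears for variant B. One difference to keep in mind: if VendorState can be NULL, variant B treats the NULL rows as a single partition, whereas variants A and C drop them because NULL = NULL does not evaluate to true.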