Calculated column in SQL Server

mal*_*iks 9 sql-server calculated-columns

My data in the table looks like this:

id  Author_ID   Research_Area       Category_ID  Paper_Count   Paper_Year   Rank  
---------------------------------------------------------------------------------
1   677         feature extraction  8            1             2005         1
2   677         image annotation    11           1             2005         2
3   677         probabilistic model 12           1             2005         3
4   677         semantic            19           1             2007         1
5   677         feature extraction  8            1             2009         1
6   677         image annotation    11           1             2011         1  
7   677         semantic            19           1             2012         1  
8   677         video sequence      5            2             2013         1  
9   1359        adversary model     1            2             2005         1
10  1359        ensemble method     14           2             2005         2
11  1359        image represent     11           2             2005         3
12  1359        adversary model     1            7             2006         1
13  1359        concurrency control 17           5             2006         2
14  1359        information system  12           2             2006         3  
15  ...         
16  ...  

And I want the query output to be:

id  Author_ID   Category_ID  Paper_Count   Category_Prob   Paper_Year   Rank  
---------------------------------------------------------------------------------
1   677         8            1             0.333           2005         1
2   677         11           1             0.333           2005         2
3   677         12           1             0.333           2005         3
4   677         19           1             1.0             2007         1
5   677         8            1             1.0             2009         1
6   677         11           1             1.0             2011         1  
7   677         19           1             1.0             2012         1  
8   677         5            2             1.0             2013         1  
9   1359        1            2             0.333           2005         1
10  1359        14           2             0.333           2005         2
11  1359        11           2             0.333           2005         3
12  1359        1            7             0.5             2006         1
13  1359        17           5             0.357           2006         2
14  1359        12           2             0.142           2006         3  
15  ...         
16  ...  

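The calculated column Category_Prob is computed in two steps:

Step one: we need the SUM of Paper_Count for each Paper_Year, e.g. for Paper_Year = 2005 and Author_ID = 677, SUM(Paper_Count) = 3.

Step two: for each Category_ID, we divide Paper_Count by the SUM(Paper_Count) of its Paper_Year, which gives 1/3, i.e. 0.333, and so on...

I have tried this query: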

SELECT 
    Author_ID, Abstract_Category, Paper_Count,
    [Category_Prob] = Paper_Count / SUM(Paper_Count),
    Paper_Year, Rank
FROM 
    Author_Areas
GROUP BY 
    Author_ID, Abstract_Category, Paper_Year, Paper_Count, Rank
ORDER BY 
    Author_ID, Paper_Year

But it returns 1 as Category_Prob for every row of the table.

Gio*_*sos 6

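The problem with your query is that you are not grouping by Paper_Year alone: you are also grouping by Author_ID, Abstract_Category, Paper_Count, Rank. With the sample data each group therefore contains a single row, so SUM(Paper_Count) is equal to Paper_Count for every group.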
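
One way to check this, as a rough diagnostic sketch, is to keep the GROUP BY list from the question and look at how many rows each group holds. The aliases Rows_In_Group and Sum_In_Group are made up for illustration.

```sql
-- Diagnostic sketch: same GROUP BY list as the query in the question.
-- When Rows_In_Group is 1, Sum_In_Group is just that row's Paper_Count,
-- so Paper_Count / SUM(Paper_Count) evaluates to 1.
SELECT   Author_ID, Abstract_Category, Paper_Year, Paper_Count, Rank,
         COUNT(*)         AS Rows_In_Group,   -- hypothetical alias
         SUM(Paper_Count) AS Sum_In_Group     -- hypothetical alias
FROM     Author_Areas
GROUP BY Author_ID, Abstract_Category, Paper_Year, Paper_Count, Rank
ORDER BY Author_ID, Paper_Year;
```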

You can use SUM() OVER to get the total per author and year instead:

SELECT      id, Author_ID, Abstract_Category [Category_ID],  
            Paper_Count, 
            Paper_Count * 1.0 / SUM(Paper_Count)  
            OVER (PARTITION BY Author_ID, Paper_Year) AS [Category_Prob],
            Paper_Year, Rank
FROM        Author_Areas
ORDER BY    Author_ID, Paper_Year
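
For comparison, the two steps described in the question can also be written out literally with a derived table and a join. This is only a sketch under the same assumptions as the query above (table Author_Areas, column Abstract_Category); the aliases a, t and Year_Total are introduced here for illustration. It should give the same Category_Prob values as the SUM() OVER version, which is usually the shorter way to write it.

```sql
-- Step 1: total Paper_Count per author and year (derived table "t").
-- Step 2: divide each row's Paper_Count by the matching total.
SELECT      a.id, a.Author_ID, a.Abstract_Category AS [Category_ID],
            a.Paper_Count,
            a.Paper_Count * 1.0 / t.Year_Total AS [Category_Prob],
            a.Paper_Year, a.Rank
FROM        Author_Areas AS a
INNER JOIN  (
                SELECT   Author_ID, Paper_Year,
                         SUM(Paper_Count) AS Year_Total  -- hypothetical alias
                FROM     Author_Areas
                GROUP BY Author_ID, Paper_Year
            ) AS t
        ON  t.Author_ID  = a.Author_ID
        AND t.Paper_Year = a.Paper_Year
ORDER BY    a.Author_ID, a.Paper_Year;
```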

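Note: you have to multiply by 1.0 to avoid integer division.

Note 2: if your actual requirement is to group per author, then Author_ID has to be in the PARTITION BY clause as well, as it is in the query above.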
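
To see why the 1.0 matters, here is a minimal sketch that runs without any table. The column aliases are made up, and the decimal(9, 3) precision is an arbitrary choice for illustration.

```sql
-- Both operands are int in the first expression, so SQL Server truncates
-- the result to 0. The other two expressions force decimal arithmetic
-- and return a fractional value.
SELECT 1 / 3                        AS Int_Division,
       1 * 1.0 / 3                  AS Decimal_Division,
       CAST(1 AS decimal(9, 3)) / 3 AS Cast_Division;
```

An explicit CAST does the same job as multiplying by 1.0 and makes the intent a little more visible.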