Aggregate function to detect a trend in PostgreSQL

mae*_*sto 3 sql database postgresql select aggregate-functions

I'm using a psql DB to store a data structure like this:

datapoint(userId, rank, timestamp)

where timestamp is a Unix Epoch millisecond timestamp.

In this structure I store each user's rank every day, so it looks like this:

UserId   Rank  Timestamp
1        1     1435366459
1        2     1435366458
1        3     1435366457
2        8     1435366456
2        6     1435366455
2        7     1435366454

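So, in the sample data above, user id 1 improves its rank with every measurement, which means it has a positive trend, while user id 2 is dropping in rank, which means it has a negative trend.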

What I need to do is detect all users that have a positive trend based on the last N measurements.

Mur*_*nik 6

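One approach is to perform a linear regression on each user's rank and check whether the slope is positive or negative. Luckily, PostgreSQL has a built-in function for that, regr_slope: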

SELECT   user_id, regr_slope (rank1, timestamp1) AS slope
FROM     my_table
GROUP BY user_id
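To make the argument order concrete, here is a minimal sketch that runs the same aggregate over the question's six sample rows. The inline VALUES list standing in for my_table is an assumption made purely for illustration; regr_slope(Y, X) takes the dependent variable (the rank) first and the independent variable (the timestamp) second.

```sql
-- Hypothetical inline data standing in for my_table (copied from the question's sample)
WITH my_table (user_id, rank1, timestamp1) AS (
    VALUES (1, 1, 1435366459),
           (1, 2, 1435366458),
           (1, 3, 1435366457),
           (2, 8, 1435366456),
           (2, 6, 1435366455),
           (2, 7, 1435366454)
)
SELECT   user_id,
         regr_slope(rank1, timestamp1) AS slope  -- Y = rank, X = time
FROM     my_table
GROUP BY user_id
ORDER BY user_id;
```

Working the least-squares formula by hand on these rows gives a slope of about -1 for user 1 and about 0.5 for user 2. Note that the slope describes the rank number itself: if a smaller rank number means a better position, as the question's description of user 1 suggests, then an improving user has a negative slope, and any comparison against 0 would need to be flipped accordingly.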

This query gives you the basic functionality. Now, if you like, you can dress it up a bit with a case expression:

SELECT user_id, 
       CASE WHEN slope > 0 THEN 'positive' 
            WHEN slope < 0 THEN 'negative' 
            ELSE 'steady' END AS trend
FROM   (SELECT   user_id, regr_slope (rank1, timestamp1) AS slope
        FROM     my_table
        GROUP BY user_id) t

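EDIT:
Unfortunately, regr_slope has no built-in way to handle "top N" type requirements, so this should be handled separately, for example with a subquery that uses row_number: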

-- Decoration outer query
SELECT user_id, 
       CASE WHEN slope > 0 THEN 'positive' 
            WHEN slope < 0 THEN 'negative' 
            ELSE 'steady' END AS trend
FROM   (-- Inner query to calculate the slope
        SELECT   user_id, regr_slope (rank1, timestamp1) AS slope
        FROM     (-- Inner query to get top N
                  -- timestamp1 must be selected here because the outer query uses it
                  SELECT user_id, rank1, timestamp1,
                         ROW_NUMBER() OVER (PARTITION BY user_id 
                                            ORDER BY timestamp1 DESC) AS rn
                  FROM   my_table) t
        WHERE    rn <= N -- Replace N with the number of rows you need
        GROUP BY user_id) t2
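As a rough, self-contained sketch of the "last N" filtering, the query below applies the row_number approach to the question's sample rows with N fixed at 2. The inline VALUES list, the CTE name latest, and the choice of N = 2 are all assumptions made for illustration only.

```sql
-- Hypothetical inline data standing in for my_table (copied from the question's sample)
WITH my_table (user_id, rank1, timestamp1) AS (
    VALUES (1, 1, 1435366459),
           (1, 2, 1435366458),
           (1, 3, 1435366457),
           (2, 8, 1435366456),
           (2, 6, 1435366455),
           (2, 7, 1435366454)
),
-- Number each user's rows from newest (rn = 1) to oldest
latest AS (
    SELECT user_id, rank1, timestamp1,
           ROW_NUMBER() OVER (PARTITION BY user_id
                              ORDER BY timestamp1 DESC) AS rn
    FROM   my_table
)
SELECT   user_id,
         regr_slope(rank1, timestamp1) AS slope
FROM     latest
WHERE    rn <= 2   -- N = 2, chosen only for this illustration
GROUP BY user_id
ORDER BY user_id;
```

With only the two most recent rows per user, a hand calculation gives a slope of about -1 for user 1 and about 2 for user 2. Writing the numbering step as a CTE rather than a nested subquery is just a readability choice; the logic is the same.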