mae*_*sto 3 sql database postgresql select aggregate-functions
I am using a PostgreSQL database to store a data structure like the following:
datapoint(userId, rank, timestamp)
where timestamp is a Unix epoch timestamp in milliseconds.
In this structure I store the rank of each user every day, so the data looks like this:
UserId Rank Timestamp
1 1 1435366459
1 2 1435366458
1 3 1435366457
2 8 1435366456
2 6 1435366455
2 7 1435366454
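So, in the sample data above, user id 1 improves its rank with every measurement, which means it has a positive trend, while user id 2 is dropping in rank, which means it has a negative trend.
What I need to do is detect all users that have a positive trend, based on the last N measurements.
One approach is to perform a linear regression on each user's rank and check whether the slope is positive or negative. Fortunately, PostgreSQL has a built-in function for this, regr_slope: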
SELECT user_id, regr_slope (rank1, timestamp1) AS slope
FROM my_table
GROUP BY user_id
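To see what the query returns, here is a minimal self-contained sketch that runs it against the six sample rows from the question. The inline my_table CTE is only an assumption standing in for the real table, and the column names follow the query above. Note that regr_slope takes the dependent variable (Y) first and the independent variable (X) second.

```sql
-- Hypothetical inline copy of the question's sample rows, using the
-- column names from the query above (user_id, rank1, timestamp1).
WITH my_table (user_id, rank1, timestamp1) AS (
    VALUES (1, 1, 1435366459),
           (1, 2, 1435366458),
           (1, 3, 1435366457),
           (2, 8, 1435366456),
           (2, 6, 1435366455),
           (2, 7, 1435366454)
)
SELECT user_id,
       -- regr_slope(Y, X): rank is the dependent variable, time the independent one
       regr_slope(rank1, timestamp1) AS slope
FROM my_table
GROUP BY user_id
ORDER BY user_id;
```

Worked by hand, the least-squares slope for these points is -1 for user 1 and 0.5 for user 2. User 1's rank number falls from 3 to 1 over time, so its slope is negative. If a smaller rank number means a better rank, as the question's description of user 1 suggests, then an improving user is one with a negative slope, and any comparison of the slope against zero would need to be flipped for that interpretation.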
This query gives you the basic functionality. If you like, you can dress it up a bit with a CASE expression:
SELECT user_id,
       CASE WHEN slope > 0 THEN 'positive'
            WHEN slope < 0 THEN 'negative'
            ELSE 'steady' END AS trend
FROM (SELECT user_id, regr_slope(rank1, timestamp1) AS slope
      FROM my_table
      GROUP BY user_id) t
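EDIT:
Unfortunately, regr_slope has no built-in way to handle a "top N" type of requirement, so this has to be handled separately, for example with a subquery that uses row_number: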
-- Decoration outer query
SELECT user_id,
       CASE WHEN slope > 0 THEN 'positive'
            WHEN slope < 0 THEN 'negative'
            ELSE 'steady' END AS trend
FROM (-- Inner query to calculate the slope
      SELECT user_id, regr_slope(rank1, timestamp1) AS slope
      FROM (-- Inner query to get the N most recent rows per user;
            -- timestamp1 must be selected here because regr_slope needs it
            SELECT user_id, rank1, timestamp1,
                   ROW_NUMBER() OVER (PARTITION BY user_id
                                      ORDER BY timestamp1 DESC) AS rn
            FROM my_table) t
      WHERE rn <= N -- Replace N with the number of rows you need
      GROUP BY user_id) t2
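Since the question asks only for the users with a positive trend, the same top-N approach can also be used as a filter rather than a label. The following is a sketch under assumptions, not a definitive implementation: the sample rows are inlined as a CTE named my_table, last_n is an illustrative name, and N is fixed at 3 purely as an example.

```sql
-- Hypothetical inline sample data; replace with the real table.
WITH my_table (user_id, rank1, timestamp1) AS (
    VALUES (1, 1, 1435366459),
           (1, 2, 1435366458),
           (1, 3, 1435366457),
           (2, 8, 1435366456),
           (2, 6, 1435366455),
           (2, 7, 1435366454)
),
-- Number each user's rows from newest (rn = 1) to oldest.
last_n AS (
    SELECT user_id, rank1, timestamp1,
           ROW_NUMBER() OVER (PARTITION BY user_id
                              ORDER BY timestamp1 DESC) AS rn
    FROM my_table
)
SELECT user_id,
       regr_slope(rank1, timestamp1) AS slope
FROM last_n
WHERE rn <= 3                                -- example value for N
GROUP BY user_id
HAVING regr_slope(rank1, timestamp1) > 0     -- keep only upward slopes
ORDER BY user_id;
```

The HAVING clause filters on the aggregate after grouping, so users whose slope is zero, negative, or NULL (for example, a user with a single measurement) are dropped. If a lower rank number counts as better, the condition would be slope < 0 instead.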