Jef*_*ood 124 sql sql-server date gaps-and-islands
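The following user history table contains one record for every day a given user has accessed a website (in a 24-hour UTC period). It has many thousands of records, but only one record per day per user. If the user has not accessed the website on a given day, no record is generated.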
Id      UserId  CreationDate
------  ------  ------------
750997  12      2009-07-07 18:42:20.723
750998  15      2009-07-07 18:42:20.927
751000  19      2009-07-07 18:42:22.283
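What I'm looking for is a SQL query on this table, with good performance, that tells me which users have accessed the website for (n) consecutive days without missing a day.
In other words, how many users have (n) records in this table with sequential (day-before or day-after) dates? If any day is missing from the sequence, the sequence is broken and should restart at 1; we're looking for users who have achieved a continuous run of days here with no gaps.
Any resemblance between this query and a particular Stack Overflow badge is purely coincidental, of course.. :)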
Rob*_*ley 147
How about (and please make sure the previous statement ended with a semicolon):
WITH numberedrows
     AS (SELECT ROW_NUMBER() OVER (PARTITION BY UserID
                                       ORDER BY CreationDate)
                - DATEDIFF(day,'19000101',CreationDate) AS TheOffset,
                CreationDate,
                UserID
         FROM   tablename)
SELECT MIN(CreationDate),
       MAX(CreationDate),
       COUNT(*) AS NumConsecutiveDays,
       UserID
FROM   numberedrows
GROUP  BY UserID,
          TheOffset
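The idea is that if we have the list of days (as numbers) and a row_number, then each missed day makes the offset between these two lists slightly bigger. So we're looking for ranges that have a constant offset.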
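To make the offset idea concrete, here is a minimal Python sketch of the same grouping, not the query itself. The function name `streaks` and the sample data are made up for illustration, and it assumes at most one visit per user per day, as the question states.

```python
from datetime import date

def streaks(visits):
    """Group each user's visit days into consecutive runs.

    visits: iterable of (user_id, day) pairs, at most one per user per day.
    Returns a list of (user_id, first_day, last_day, run_length) tuples.
    """
    by_user = {}
    for user_id, day in visits:
        by_user.setdefault(user_id, []).append(day)

    result = []
    for user_id, days in by_user.items():
        groups = {}
        for row_number, day in enumerate(sorted(days), start=1):
            # Row number minus day number stays constant inside a run
            # and shifts every time a day is skipped.
            offset = row_number - day.toordinal()
            groups.setdefault(offset, []).append(day)
        for run in groups.values():
            result.append((user_id, run[0], run[-1], len(run)))
    return result

# Hypothetical sample: user 12 visits on July 1-3, skips the 4th, returns on the 5th-6th.
visits = [
    (12, date(2009, 7, 1)), (12, date(2009, 7, 2)), (12, date(2009, 7, 3)),
    (12, date(2009, 7, 5)), (12, date(2009, 7, 6)),
    (15, date(2009, 7, 2)),
]
for row in streaks(visits):
    print(row)
```

Each distinct offset value per user corresponds to one GROUP BY bucket in the query above.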
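You could use "ORDER BY NumConsecutiveDays DESC" at the end of this, or say "HAVING count(*) > 14" for a threshold...
I haven't tested this - just writing it off the top of my head. Hopefully it works in SQL2005 and later.
...and it would be very much helped by an index on tablename(UserID, CreationDate)
Edited: Turns out Offset is a reserved word, so I used TheOffset instead.
Edited: The suggestion to use COUNT(*) is very valid - I should've done that in the first place but wasn't really thinking. Previously it used datediff(day, min(CreationDate), max(CreationDate)) instead.
Rob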
Spe*_*ort 69
The answer is obviously:
SELECT DISTINCT UserId
FROM UserHistory uh1
WHERE (
       SELECT COUNT(*)
       FROM UserHistory uh2
       WHERE uh2.CreationDate
       BETWEEN uh1.CreationDate AND DATEADD(d, @days, uh1.CreationDate)
      ) = @days OR UserId = 52551
EDIT:
Okay, here's my serious answer:
DECLARE @days int
DECLARE @seconds bigint
SET @days = 30
SET @seconds = (@days * 24 * 60 * 60) - 1
SELECT DISTINCT UserId
FROM (
    SELECT uh1.UserId, Count(uh1.Id) as Conseq
    FROM UserHistory uh1
    INNER JOIN UserHistory uh2 ON uh2.CreationDate
        BETWEEN uh1.CreationDate AND
            DATEADD(s, @seconds, DATEADD(dd, DATEDIFF(dd, 0, uh1.CreationDate), 0))
        AND uh1.UserId = uh2.UserId
    GROUP BY uh1.Id, uh1.UserId
    ) as Tbl
WHERE Conseq >= @days
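EDIT:
[Jeff Atwood] This is a great, fast solution and deserves to be accepted, but Rob Farley's solution is also excellent and arguably even faster (!). Please check it out too!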
Meh*_*ari 18
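If you can change the table schema, I'd suggest adding a column LongestStreak to the table, set to the number of consecutive days ending at CreationDate. It's easy to maintain at login time (similar to what you are doing already: if no row exists for the current day, check whether a row exists for the previous day. If it does, set LongestStreak in the new row to the previous row's value plus one; otherwise set it to 1.)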
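The login-time update described above can be sketched in Python as follows. This is only an illustration of the rule, not the schema change itself: the dictionary `history` stands in for the table, and the name `record_visit` is hypothetical.

```python
from datetime import date, timedelta

def record_visit(history, user_id, today):
    """Record a visit and return the streak length ending on `today`.

    history: dict mapping (user_id, day) -> streak length ending on that day;
    it plays the role of the table with the extra LongestStreak column.
    """
    key = (user_id, today)
    if key in history:
        # A row for today already exists, so nothing changes.
        return history[key]
    previous = history.get((user_id, today - timedelta(days=1)))
    # Extend yesterday's streak if there is one, otherwise start over at 1.
    history[key] = previous + 1 if previous is not None else 1
    return history[key]

history = {}
print(record_visit(history, 12, date(2009, 7, 7)))   # first visit
print(record_visit(history, 12, date(2009, 7, 8)))   # next day, streak grows
print(record_visit(history, 12, date(2009, 7, 10)))  # a day was skipped
```

The cost is one extra lookup per login, in exchange for a badge check that needs no scan over the history.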
The query becomes obvious after adding this column:
if exists(select * from table
where LongestStreak >= 30 and UserId = @UserId)
-- award the Woot badge.
小智 6
Some nicely expressive SQL along the lines of:
select
userId,
dbo.MaxConsecutiveDates(CreationDate) as blah
from
dbo.Logins
group by
userId
Assuming you have a user-defined aggregate function along the lines of (beware, this is buggy):
using System;
using System.Data.SqlTypes;
using Microsoft.SqlServer.Server;
using System.Runtime.InteropServices;
namespace SqlServerProject1
{
[StructLayout(LayoutKind.Sequential)]
[Serializable]
internal struct MaxConsecutiveState
{
public int CurrentSequentialDays;
public int MaxSequentialDays;
public SqlDateTime LastDate;
}
[Serializable]
[SqlUserDefinedAggregate(
Format.Native,
IsInvariantToNulls = true, //optimizer property
IsInvariantToDuplicates = false, //optimizer property
IsInvariantToOrder = false) //optimizer property
]
[StructLayout(LayoutKind.Sequential)]
public class MaxConsecutiveDates
{
/// <summary>
/// The variable that holds the running streak state
/// </summary>
private MaxConsecutiveState _intermediateResult;
/// <summary>
/// Initialize the internal data structures
/// </summary>
public void Init()
{
_intermediateResult = new MaxConsecutiveState { LastDate = SqlDateTime.MinValue, CurrentSequentialDays = 0, MaxSequentialDays = 0 };
}
/// <summary>
/// Accumulate the next value, skipping it if the value is null
/// </summary>
/// <param name="value"></param>
public void Accumulate(SqlDateTime value)
{
if (value.IsNull)
{
return;
}
int sequentialDays = _intermediateResult.CurrentSequentialDays;
int maxSequentialDays = _intermediateResult.MaxSequentialDays;
DateTime currentDate = value.Value.Date;
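// LastDate stores a date-only value; compare against its DateTime value
// (TimeTicks is only the time-of-day component, not a full timestamp).
if (currentDate.AddDays(-1).Equals(_intermediateResult.LastDate.Value))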
sequentialDays++;
else
{
maxSequentialDays = Math.Max(sequentialDays, maxSequentialDays);
sequentialDays = 1;
}
_intermediateResult = new MaxConsecutiveState
{
CurrentSequentialDays = sequentialDays,
LastDate = currentDate,
MaxSequentialDays = maxSequentialDays
};
}
/// <summary>
/// Merge the partially computed aggregate with this aggregate.
/// </summary>
/// <param name="other"></param>
public void Merge(MaxConsecutiveDates other)
{
// add stuff for two separate calculations
}
/// <summary>
/// Called at the end of aggregation, to return the results of the aggregation.
/// </summary>
/// <returns></returns>
public SqlInt32 Terminate()
{
// Compare the counters as ints; casting to sbyte would corrupt streaks longer than 127 days.
int max = Math.Max(_intermediateResult.CurrentSequentialDays, _intermediateResult.MaxSequentialDays);
return new SqlInt32(max);
}
}
}
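For comparison, the running-maximum logic that Accumulate and Terminate are aiming for can be sketched in a few lines of Python. The function name `max_consecutive_days` is made up for illustration. Unlike the aggregate, this sketch sorts its input first; as far as I know, SQL Server does not guarantee the order in which rows reach a user-defined aggregate, which is one reason the version above should be treated with caution.

```python
from datetime import date, timedelta

def max_consecutive_days(days):
    """Return the length of the longest run of consecutive calendar days."""
    current = longest = 0
    last = None
    # Sort and de-duplicate so the single pass below sees days in order.
    for day in sorted(set(days)):
        if last is not None and day - last == timedelta(days=1):
            current += 1   # the run continues
        else:
            current = 1    # a gap (or the first day) starts a new run
        longest = max(longest, current)
        last = day
    return longest

sample = [date(2009, 7, 5), date(2009, 7, 1), date(2009, 7, 2),
          date(2009, 7, 3), date(2009, 7, 6)]
print(max_consecutive_days(sample))
```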
It seems you could take advantage of the fact that a run of n consecutive days requires n rows.
So something like:
SELECT users.UserId, count(1) as cnt
FROM users
WHERE users.CreationDate > now() - INTERVAL 30 DAY
GROUP BY UserId
HAVING cnt = 30