Identifying periods with particular characteristics using SQL

Ton*_*lff 7 sql oracle time-series

I am looking for a SQL query that identifies the longest period each person has gone without eating meat. Ideally, the output would look like

person  periodstart  periodend 

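where the longest period without meat is identified for each person,

periodstart is the time of the first non-meat meal in that period, and

periodend is the time of the first meat meal after it.

The SQL below creates the table and data.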

CREATE TABLE MEALS 
(
  PERSON VARCHAR2(20 BYTE) 
, MEALTIME DATE 
, FOODTYPE VARCHAR2(20) 
);

Insert into MEALS (PERSON,MEALTIME,FOODTYPE) 
values ('Jane',to_date('04-JAN-15 06:09:09','DD-MON-RR HH24:MI:SS'),'fruit');
Insert into MEALS (PERSON,MEALTIME,FOODTYPE) 
values ('Jane',to_date('05-JAN-15 06:09:09','DD-MON-RR HH24:MI:SS'),'veg');
Insert into MEALS (PERSON,MEALTIME,FOODTYPE) 
values ('Jane',to_date('07-JAN-15 06:01:24','DD-MON-RR HH24:MI:SS'),'meat');
Insert into MEALS (PERSON,MEALTIME,FOODTYPE) 
values ('Jane',to_date('07-JAN-15 12:03:50','DD-MON-RR HH24:MI:SS'),'veg');
Insert into MEALS (PERSON,MEALTIME,FOODTYPE) 
values ('John',to_date('02-JAN-15 10:03:23','DD-MON-RR HH24:MI:SS'),'veg');
Insert into MEALS (PERSON,MEALTIME,FOODTYPE) 
values ('John',to_date('03-JAN-15 10:03:23','DD-MON-RR HH24:MI:SS'),'meat');
Insert into MEALS (PERSON,MEALTIME,FOODTYPE) 
values ('John',to_date('04-JAN-15 10:03:23','DD-MON-RR HH24:MI:SS'),'veg');
Insert into MEALS (PERSON,MEALTIME,FOODTYPE) 
values ('John',to_date('05-JAN-15 07:03:23','DD-MON-RR HH24:MI:SS'),'veg');
Insert into MEALS (PERSON,MEALTIME,FOODTYPE) 
values ('John',to_date('05-JAN-15 10:03:23','DD-MON-RR HH24:MI:SS'),'veg');
Insert into MEALS (PERSON,MEALTIME,FOODTYPE) 
values ('John',to_date('06-JAN-15 05:01:54','DD-MON-RR HH24:MI:SS'),'veg');
Insert into MEALS (PERSON,MEALTIME,FOODTYPE) 
values ('John',to_date('06-JAN-15 05:01:54','DD-MON-RR HH24:MI:SS'),'fruit');
Insert into MEALS (PERSON,MEALTIME,FOODTYPE) 
values ('John',to_date('06-JAN-15 10:03:23','DD-MON-RR HH24:MI:SS'),'meat');
Insert into MEALS (PERSON,MEALTIME,FOODTYPE) 
values ('Mary',to_date('02-JAN-15 05:01:54','DD-MON-RR HH24:MI:SS'),'veg');
Insert into MEALS (PERSON,MEALTIME,FOODTYPE) 
values ('Mary',to_date('03-JAN-15 06:04:25','DD-MON-RR HH24:MI:SS'),'meat');
Insert into MEALS (PERSON,MEALTIME,FOODTYPE) 
values ('Mary',to_date('05-JAN-15 04:04:25','DD-MON-RR HH24:MI:SS'),'veg');
Insert into MEALS (PERSON,MEALTIME,FOODTYPE) 
values ('Mary',to_date('05-JAN-15 06:04:25','DD-MON-RR HH24:MI:SS'),'meat');
Insert into MEALS (PERSON,MEALTIME,FOODTYPE) 
values ('Mary',to_date('05-JAN-15 06:04:25','DD-MON-RR HH24:MI:SS'),'meat');
Insert into MEALS (PERSON,MEALTIME,FOODTYPE) 
values ('Mary',to_date('06-JAN-15 05:01:54','DD-MON-RR HH24:MI:SS'),'veg');
Insert into MEALS (PERSON,MEALTIME,FOODTYPE) 
values ('Mary',to_date('07-JAN-15 06:04:25','DD-MON-RR HH24:MI:SS'),'veg');

commit;

Ale*_*ole 2

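This is a gaps-and-islands problem, and there are several ways to approach it. One is to use an analytic function trick to find the chains of consecutive meals of each type: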

select person, mealtime, foodtype,
  case when foodtype = 'meat' then 'Yes' else 'No' end as meat,
  dense_rank() over (partition by person,
      case when foodtype = 'meat' then 1 else 0 end order by mealtime)
    - dense_rank() over (partition by person order by mealtime) as chain
from meals
order by person, mealtime;

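The "chain" pseudocolumn is based on the case expression rather than on foodtype directly, because you want fruit and veg (or anything else that isn't meat) to be treated the same.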
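To see where the chain value comes from, it can help to display the two rankings side by side before subtracting them. The following is a minimal sketch against the MEALS table from the question, limited to John to keep the output short; the aliases rank_in_type and rank_overall are names made up here for illustration.

```sql
-- Sketch only: show the two dense_rank() values that the "chain" column subtracts.
-- rank_in_type and rank_overall are illustrative aliases, not part of the answer.
select person, mealtime, foodtype,
  -- position among this person's meals of the same kind (meat vs. non-meat)
  dense_rank() over (partition by person,
      case when foodtype = 'meat' then 1 else 0 end order by mealtime)
    as rank_in_type,
  -- position among all of this person's meals
  dense_rank() over (partition by person order by mealtime)
    as rank_overall
from meals
where person = 'John'
order by mealtime, foodtype;
```

While John keeps eating non-meat meals, both ranks advance together and their difference stays constant. A meat meal advances rank_overall but not the non-meat rank_in_type, so the next run of non-meat meals gets a different difference. Two meals with the same mealtime tie in both rankings, which is why they fall into the same chain.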

You can then use that as an inner query to find the start of each meat and non-meat period, taking the first meal in each chain:

select person, meat, min(mealtime) as first_meal
from (
  select person, mealtime, foodtype,
    case when foodtype = 'meat' then 'Yes' else 'No' end as meat,
    dense_rank() over (partition by person,
        case when foodtype = 'meat' then 1 else 0 end order by mealtime)
      - dense_rank() over (partition by person order by mealtime) as chain
  from meals
)
group by person, meat, chain
order by person, min(mealtime);

PERSON               MEAT FIRST_MEAL       
-------------------- ---- ------------------
Jane                 No   04-JAN-15 06:09:09 
Jane                 Yes  07-JAN-15 06:01:24 
Jane                 No   07-JAN-15 12:03:50 
John                 No   02-JAN-15 10:03:23 
...

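You want each period to run from the first non-meat meal to the next meat meal, so you can use that in turn as an inner query, with lead and lag to peek at the rows on either side. For a non-meat period you look ahead to the start of the next meat period; for a meat period you look back to the start of the previous non-meat period: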

select person, meat,
  case when meat = 'Yes' then lag(first_meal) over (partition by person
      order by first_meal) else first_meal end as period_start,
  case when meat = 'No' then lead(first_meal) over (partition by person
      order by first_meal) else first_meal end as period_end
from (
  select person, meat, min(mealtime) as first_meal
  from (
    select person, mealtime, foodtype,
      case when foodtype = 'meat' then 'Yes' else 'No' end as meat,
      dense_rank() over (partition by person,
          case when foodtype = 'meat' then 1 else 0 end order by mealtime)
        - dense_rank() over (partition by person order by mealtime) as chain
    from meals
  )
  group by person, meat, chain
)
order by person, period_start;

PERSON               MEAT PERIOD_START       PERIOD_END       
-------------------- ---- ------------------ ------------------
Jane                 No   04-JAN-15 06:09:09 07-JAN-15 06:01:24 
Jane                 Yes  04-JAN-15 06:09:09 07-JAN-15 06:01:24 
Jane                 No   07-JAN-15 12:03:50                    
John                 No   02-JAN-15 10:03:23 03-JAN-15 10:03:23 
...

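That actually gives you duplicates, because each meat row repeats the start and end of the non-meat period before it; I've left the "meat" flag in to make that clearer at this point. Assuming you want to ignore the latest open-ended period, you just need to skip those rows and eliminate the duplicates: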

select person, period_start, period_end
from (
  select person, meat,
    case when meat = 'Yes' then lag(first_meal) over (partition by person
        order by first_meal) else first_meal end as period_start,
    case when meat = 'No' then lead(first_meal) over (partition by person
        order by first_meal) else first_meal end as period_end
  from (
    select person, meat, min(mealtime) as first_meal
    from (
      select person, mealtime, foodtype,
        case when foodtype = 'meat' then 'Yes' else 'No' end as meat,
        dense_rank() over (partition by person,
            case when foodtype = 'meat' then 1 else 0 end order by mealtime)
          - dense_rank() over (partition by person order by mealtime) as chain
      from meals
    )
    group by person, meat, chain
  )
)
where meat = 'No'
and period_start is not null
and period_end is not null
order by person, period_start;

PERSON               PERIOD_START       PERIOD_END       
-------------------- ------------------ ------------------
Jane                 04-JAN-15 06:09:09 07-JAN-15 06:01:24 
John                 02-JAN-15 10:03:23 03-JAN-15 10:03:23 
John                 04-JAN-15 10:03:23 06-JAN-15 10:03:23 
Mary                 02-JAN-15 05:01:54 03-JAN-15 06:04:25 
Mary                 05-JAN-15 04:04:25 05-JAN-15 06:04:25 

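SQL Fiddle with the intermediate steps.

Realising belatedly that you only want the longest period for each person, you can get that with another layer: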

select person, period_start, period_end
from (
  select person, period_start, period_end,
    rank() over (partition by person order by period_end - period_start desc) as rnk
  from (
    ...
  )
  where meat = 'No'
  and period_start is not null
  and period_end is not null
)
where rnk = 1
order by person, period_start;

PERSON               PERIOD_START       PERIOD_END       
-------------------- ------------------ ------------------
Jane                 04-JAN-15 06:09:09 07-JAN-15 06:01:24 
John                 04-JAN-15 10:03:23 06-JAN-15 10:03:23 
Mary                 02-JAN-15 05:01:54 03-JAN-15 06:04:25 
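
The ranking above orders by period_end - period_start. In Oracle, subtracting one DATE from another gives a number of days, with the time of day as a fraction, so the longest period is simply the largest difference. As a small sketch, using Jane's period from the output as literal values:

```sql
-- Sketch only: DATE minus DATE in Oracle yields a number of days.
-- The two literals are Jane's period_start and period_end from the output above.
select to_date('07-JAN-15 06:01:24','DD-MON-RR HH24:MI:SS')
     - to_date('04-JAN-15 06:09:09','DD-MON-RR HH24:MI:SS') as days_without_meat
from dual;
```

This comes out at just under 3, since the meat meal was eaten a few minutes earlier in the day than the first non-meat meal three days before.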

Updated SQL Fiddle.
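
For reference, here is one way the layers above might be written out as a single statement, using subquery factoring (WITH) so each step has a name. This is a sketch rather than the exact query behind the Fiddle: the names chains, period_starts, periods and ranked are made up here, and because only the non-meat rows survive the final filter, it keeps just the lead() half of the lead/lag step.

```sql
-- Sketch only: the same steps as above, named with WITH clauses.
-- chains, period_starts, periods and ranked are illustrative names.
with chains as (
  -- step 1: tag each meal with a chain number per person and meat/non-meat type
  select person, mealtime,
    case when foodtype = 'meat' then 'Yes' else 'No' end as meat,
    dense_rank() over (partition by person,
        case when foodtype = 'meat' then 1 else 0 end order by mealtime)
      - dense_rank() over (partition by person order by mealtime) as chain
  from meals
),
period_starts as (
  -- step 2: the first meal in each chain marks the start of a period
  select person, meat, min(mealtime) as first_meal
  from chains
  group by person, meat, chain
),
periods as (
  -- step 3: a period ends where the person's next period begins
  select person, meat, first_meal as period_start,
    lead(first_meal) over (partition by person order by first_meal) as period_end
  from period_starts
),
ranked as (
  -- step 4: keep closed non-meat periods and rank them by length (in days)
  select person, period_start, period_end,
    rank() over (partition by person
        order by period_end - period_start desc) as rnk
  from periods
  where meat = 'No'
  and period_end is not null
)
select person, period_start, period_end
from ranked
where rnk = 1
order by person, period_start;
```

As with the nested version, rank() keeps ties, so a person with two equally long periods would get two rows; row_number() could be swapped in if exactly one row per person is needed.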