Calculations using two tables with different date frequencies

mou*_*r11 3 postgresql timestamp

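I want to align the dates in one table with the dates in another table that has a different frequency. The value from the low-frequency table should repeat until a new date appears, so that I can run calculations involving data from both tables.

To make this easier, I think building an indexed range scheme would be useful and faster.

An example should make this clearer.

Let's call this the daily table: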

CREATE TEMP TABLE daily AS
SELECT date::date, val FROM ( VALUES
  ('2017-01-01',1),
  ('2017-01-02',2),
  ('2017-01-03',1),
  ('2017-01-04',56),
  ('2017-01-05',7),
  ('2017-01-06',6),
  ('2017-01-07',8),
  ('2017-01-08',6),
  ('2017-01-09',4),
  ('2017-01-10',4),
  ('2017-01-11',6),
  ('2017-01-12',8)
) AS t(date,val);

This is the low_fq (low-frequency) table, created below as lowfq:

CREATE TEMP TABLE lowfq AS
SELECT date::date, val FROM ( VALUES
  ( '2017-01-02',700 ),
  ( '2017-01-06',100 ),
  ( '2017-01-08',200 ),
  ( '2017-01-12',500 )
) AS t(date,val);

The result should look like this:

+------------+-----+--+------------+------+--+--------------+
|   daily    |     |  |   low_fq   |      |  | low_fq/daily |
+------------+-----+--+------------+------+--+--------------+
| date       | val |  | date       | val  |  | calc         |
| 2017-01-01 | 1   |  | 2017-01-02 | null |  | null         |
| 2017-01-02 | 2   |  | 2017-01-02 | 700  |  | 350          |
| 2017-01-03 | 1   |  | 2017-01-06 | 700  |  | 700          |
| 2017-01-04 | 56  |  | 2017-01-06 | 700  |  | 12.5         |
| 2017-01-05 | 7   |  | 2017-01-06 | 700  |  | 100          |
| 2017-01-06 | 6   |  | 2017-01-06 | 100  |  | 16.66666667  |
| 2017-01-07 | 8   |  | 2017-01-08 | 100  |  | 12.5         |
| 2017-01-08 | 6   |  | 2017-01-08 | 200  |  | 33.33333333  |
| 2017-01-09 | 4   |  | 2017-01-12 | 200  |  | 50           |
| 2017-01-10 | 4   |  | 2017-01-12 | 200  |  | 50           |
| 2017-01-11 | 6   |  | 2017-01-12 | 200  |  | 33.33333333  |
| 2017-01-12 | 8   |  | 2017-01-12 | 500  |  | 62.5         |
+------------+-----+--+------------+------+--+--------------+

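Here low_fq/daily is simply the low_fq value divided by the daily value.

I do not need the calculation for 2017-01-01, so handling the null may simply mean filtering that row out beforehand.

Note how each low_fq value repeats until the date in the low_fq table changes.

Real world:

As mentioned, in the real world I am trying to do this by building a partition, as described by adalammar in this question. The difference is that I am building an FK, I have dates, and my values are not null, but hopefully you get the idea: FK integers assigned to date ranges.

I am happy to skip the FK assignment problem, but I think it would make the calculation easier and faster.

What is the best strategy here, and how can I implement it?

Here are my real-world tables:

Additional details

The low-frequency data, and the table I have to join to get the dates: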

CREATE TABLE fund_data
(
  id serial NOT NULL,
  fund_entries_id integer NOT NULL,
  fund_val numeric(25,6) NOT NULL,
  bbg_pulls_id integer NOT NULL,
  CONSTRAINT fund_data_pkey PRIMARY KEY (id),
  CONSTRAINT fund_data_bbg_pulls_id_fkey FOREIGN KEY (bbg_pulls_id)
      REFERENCES bbg_pulls (id) MATCH SIMPLE
      ON UPDATE NO ACTION ON DELETE NO ACTION,
  CONSTRAINT fund_data_fund_entries_id_fkey FOREIGN KEY (fund_entries_id)
      REFERENCES fund_entries (id) MATCH SIMPLE
      ON UPDATE NO ACTION ON DELETE NO ACTION,
  CONSTRAINT fund_data_fund_entries_id_bbg_pulls_id_key UNIQUE (fund_entries_id, bbg_pulls_id)
);

CREATE TABLE ern_dt
(
  company_id integer NOT NULL,
  ern_release_date date NOT NULL,
  fiscal_prd character varying(7) NOT NULL,
  id serial NOT NULL,
  ern_release_date_update timestamp without time zone,
  gen_qtr_end_dt_id integer,
  CONSTRAINT ern_dt_pkey PRIMARY KEY (id),
  CONSTRAINT ern_dt_company_id_fkey FOREIGN KEY (company_id)
      REFERENCES company (id) MATCH SIMPLE
      ON UPDATE NO ACTION ON DELETE NO ACTION,
  CONSTRAINT ern_dt_gen_qtr_end_dt_id_fkey11 FOREIGN KEY (gen_qtr_end_dt_id)
      REFERENCES gen_qtr_end_dt (id) MATCH SIMPLE
      ON UPDATE NO ACTION ON DELETE NO ACTION,
  CONSTRAINT set UNIQUE (company_id, ern_release_date, fiscal_prd)
);

The high-frequency data:

CREATE TABLE daily_data
(
  id serial NOT NULL,
  company_id integer NOT NULL,
  trade_date date NOT NULL,
  daily_val numeric(13,6) NOT NULL,
  bbg_pulls_id integer NOT NULL,
  gen_qtr_end_dt_id integer,
  CONSTRAINT daily_data_pkey PRIMARY KEY (id),
  CONSTRAINT daily_data_bbg_pulls_id_fkey FOREIGN KEY (bbg_pulls_id)
      REFERENCES bbg_pulls (id) MATCH SIMPLE
      ON UPDATE NO ACTION ON DELETE NO ACTION,
  CONSTRAINT daily_data_company_id_fkey FOREIGN KEY (company_id)
      REFERENCES company (id) MATCH SIMPLE
      ON UPDATE NO ACTION ON DELETE NO ACTION,
  CONSTRAINT daily_data_company_id_trade_date_bbg_pulls_id_key UNIQUE (company_id, trade_date, bbg_pulls_id)
);

I am using PostgreSQL 9.3.5.

UPDATE: [Per request, removed and self-answered this question]

McN*_*ets 5

The problem here was how to add the partition, which is done with:

sum((case when lf.d1 is null then 0 else 1 end)) over (order by hf.cia, hf.d1) + 1 + hf.cia

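Note that I used + 1 + hf.cia to handle the case where d2 is NULL but cia has changed. Without the hf.cia term, the leading rows of a new cia that have no match in lf would get the same partition number as the last group of the previous cia.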
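
The query below reads from two tables, hf (high frequency) and lf (low frequency), that are not defined in this post. A minimal setup for trying it out might look like the following. The layout (cia as a company id, d1 as the date, val as the value) is an assumption inferred from the query and the result table, and the rows simply reuse the question's sample data for two values of cia.

```sql
-- Hypothetical setup: table and column names are inferred from the query,
-- and the rows reuse the question's sample data for cia = 1 and cia = 2.
CREATE TEMP TABLE hf AS
SELECT c.cia, t.d1::date AS d1, t.val
FROM (VALUES (1), (2)) AS c(cia)
CROSS JOIN (VALUES
  ('2017-01-01', 1),
  ('2017-01-02', 2),
  ('2017-01-03', 1),
  ('2017-01-04', 56),
  ('2017-01-05', 7),
  ('2017-01-06', 6),
  ('2017-01-07', 8),
  ('2017-01-08', 6),
  ('2017-01-09', 4),
  ('2017-01-10', 4),
  ('2017-01-11', 6),
  ('2017-01-12', 8)
) AS t(d1, val);

CREATE TEMP TABLE lf AS
SELECT c.cia, t.d1::date AS d1, t.val
FROM (VALUES (1), (2)) AS c(cia)
CROSS JOIN (VALUES
  ('2017-01-02', 700),
  ('2017-01-06', 100),
  ('2017-01-08', 200),
  ('2017-01-12', 500)
) AS t(d1, val);
```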

with tbl as
(
    select hf.cia, hf.d1, (hf.val)::float, lf.d1 d2, (lf.val)::float val2
           ,sum((case when lf.d1 is null then 0 else 1 end)) over (order by hf.cia, hf.d1) + 1 + hf.cia as vpart
    from hf
         left join lf on lf.cia = hf.cia and lf.d1 = hf.d1
    order by hf.cia, hf.d1
)
select
     t.cia, t.d1, t.val, t2.d2, t2.val2 ,t2.val2 / val calc, t.vpart
from tbl t
     inner join 
                (select d2, val2::float, vpart
                 from tbl
                 where d2 is not null) t2
     on t2.vpart = t.vpart
order by vpart;
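
As a variation on the same idea, the running count can be restarted per company with PARTITION BY instead of adding hf.cia to the sum, and the low-frequency row can be spread over its group with a window aggregate rather than a self-join. This is only a sketch, assuming the same hf and lf tables as the query above and at most one lf row per (cia, d1); the names grp, marked and filled are made up for illustration.

```sql
SELECT cia, d1, val, d2, val2, val2 / val AS calc
FROM (
    SELECT cia, d1, val,
           -- each (cia, grp) group holds at most one non-null lf row,
           -- so max() just copies that row's values to the whole group
           max(d2)   OVER (PARTITION BY cia, grp) AS d2,
           max(val2) OVER (PARTITION BY cia, grp) AS val2
    FROM (
        SELECT hf.cia, hf.d1, hf.val::float AS val,
               lf.d1 AS d2, lf.val::float AS val2,
               -- count() skips NULLs, so grp increases by one at every
               -- date that also exists in lf, separately for each cia
               count(lf.d1) OVER (PARTITION BY hf.cia ORDER BY hf.d1) AS grp
        FROM hf
        LEFT JOIN lf ON lf.cia = hf.cia AND lf.d1 = hf.d1
    ) AS marked
) AS filled
WHERE d2 IS NOT NULL  -- drop rows before the first lf date of a company
ORDER BY cia, d1;
```

Rows that precede the first low-frequency date of a company have nothing to carry forward, so the final WHERE clause removes them, which mirrors what the inner join does in the query above.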

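I am grateful to Evan Carroll for his contribution on the use of a named WINDOW in the initial solution, and to @ypercube for pointing out that the out of memory problem may have been caused by pgAdmin rather than by the server.

This is the result: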

+-----+------------+-----+------------+------+---------+-------+
| cia | d1         | val | d2         | val2 |    calc | vpart |
+-----+------------+-----+------------+------+---------+-------+
|  1  | 2017.01.02 |  2  | 2017.01.02 |  700 |  350.00 |   3   |
|  1  | 2017.01.03 |  1  | 2017.01.02 |  700 |  700.00 |   3   |
|  1  | 2017.01.04 |  56 | 2017.01.02 |  700 |   12.50 |   3   |
|  1  | 2017.01.05 |  7  | 2017.01.02 |  700 |  100.00 |   3   |
+-----+------------+-----+------------+------+---------+-------+
|  1  | 2017.01.06 |  6  | 2017.01.06 |  100 |   16.67 |   4   |
|  1  | 2017.01.07 |  8  | 2017.01.06 |  100 |   12.50 |   4   |
+-----+------------+-----+------------+------+---------+-------+
|  1  | 2017.01.08 |  6  | 2017.01.08 |  200 |   33.33 |   5   |
|  1  | 2017.01.09 |  4  | 2017.01.08 |  200 |   50.00 |   5   |
|  1  | 2017.01.10 |  4  | 2017.01.08 |  200 |   50.00 |   5   |
|  1  | 2017.01.11 |  6  | 2017.01.08 |  200 |   33.33 |   5   |
+-----+------------+-----+------------+------+---------+-------+
|  1  | 2017.01.12 |  8  | 2017.01.12 |  500 |   62.50 |   6   |
+-----+------------+-----+------------+------+---------+-------+
|  2  | 2017.01.02 |  2  | 2017.01.02 |  700 |  350.00 |   8   |
|  2  | 2017.01.03 |  1  | 2017.01.02 |  700 |  700.00 |   8   |
|  2  | 2017.01.04 |  56 | 2017.01.02 |  700 |   12.50 |   8   |
|  2  | 2017.01.05 |  7  | 2017.01.02 |  700 |  100.00 |   8   |
+-----+------------+-----+------------+------+---------+-------+
|  2  | 2017.01.06 |  6  | 2017.01.06 |  100 |   16.67 |   9   |
|  2  | 2017.01.07 |  8  | 2017.01.06 |  100 |   12.50 |   9   |
+-----+------------+-----+------------+------+---------+-------+
|  2  | 2017.01.08 |  6  | 2017.01.08 |  200 |   33.33 |   10  |
|  2  | 2017.01.09 |  4  | 2017.01.08 |  200 |   50.00 |   10  |
|  2  | 2017.01.10 |  4  | 2017.01.08 |  200 |   50.00 |   10  |
|  2  | 2017.01.11 |  6  | 2017.01.08 |  200 |   33.33 |   10  |
+-----+------------+-----+------------+------+---------+-------+
|  2  | 2017.01.12 |  8  | 2017.01.12 |  500 |   62.50 |   11  |
+-----+------------+-----+------------+------+---------+-------+

Check it here: http://rextester.com/DRAW20062