Optimizing a query that uses group by only under certain conditions in q/kdb+

Uts*_*Uts 2 kdb

We have a table t as follows:

q)t:([] sym:10?`GOOG`AMZN`IBM; px:10?100.; size:10?1000; mkt:10?`ab`cd`ef)

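Our requirement is this: where the mkt column has the value `ef, we need to group table t by the sym column; for the remaining markets (`ab`cd) we need all rows, without grouping. For this use case I wrote the query below, which works as expected: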

q)(select px, size, sym, mkt from select by sym from t where mkt=`ef), select px, size, sym, mkt from t where mkt in `ab`cd

Please help me optimize the above query along these lines:

pseudo code -
if mkt=`ef: 
    then use group by on table
else if mkt in `ab`cd
    don't use group by on table

Rob*_*tch 5

I have found two alternatives to the query you provided.

You can use the following query to accomplish what you want in one select statement:

select from t where (mkt<>`ef)|(mkt=`ef)&i=(last;i)fby ([]sym;mkt)
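To see what the fby filter is doing, here is a minimal sketch on a small fixed table. The table t2, its values, and the names r and u are made up for illustration and are not from your data. (last;i) fby ([]sym;mkt) gives, for each row, the index of the last row in its (sym;mkt) group, so comparing it with i is true only for the last row of each group.

```q
/ hypothetical fixed table, so the result is reproducible
t2:([] sym:`GOOG`IBM`GOOG`IBM`GOOG; px:1 2 3 4 5f; mkt:`ef`ef`ef`ab`ab)

/ keep every non-ef row, plus only the last ef row for each sym
r:select from t2 where (mkt<>`ef)|(mkt=`ef)&i=(last;i)fby ([]sym;mkt)

/ two-part version for comparison: last ef row per sym, then all other rows
u:(0!select by sym from t2 where mkt=`ef),select from t2 where mkt<>`ef

show r
/ both versions hold the same rows; only the row order differs
show (`sym`mkt xasc r)~`sym`mkt xasc u
```

Row 0 (the earlier GOOG row on `ef) is dropped and the other four rows are kept. The single select keeps the original row order, while the joined version puts the `ef rows first, so sort both before comparing them.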

However, if you compare its speed:

q)\t:1000 select from t where (mkt<>`ef)|(mkt=`ef)&i=(last;i)fby ([]sym;mkt)
68

to your original query:

q)\t:1000 (select px, size, sym, mkt from select by sym from t where mkt=`ef), select px, size, sym, mkt from t where mkt in `ab`cd
40

You can see that your query is faster: 40 ms versus 68 ms in total for 1000 runs.

Additionally, you can try this version, which does not require explicitly listing every mkt in t that you do not want to group by sym:

(0!select by sym from t where mkt=`ef),select from t where mkt<>`ef
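As a rough illustration of why the 0! is there, the sketch below uses a made-up table t3 (the names t3, k, e and u are hypothetical). A select with a by clause and no aggregates keeps the last row of each group and returns a keyed table; 0! removes the key so the result has the same shape as an ordinary select from t and the two can be joined with a comma.

```q
/ hypothetical fixed table, so the results are reproducible
t3:([] sym:`GOOG`IBM`GOOG; px:10 20 30f; size:1 2 3; mkt:`ef`ef`ef)

/ by clause with no aggregates: last row of each group, keyed on sym
k:select by sym from t3

/ the same result spelled out with explicit aggregates
e:select last px, last size, last mkt by sym from t3

/ 0! drops the key, leaving a plain table with columns sym px size mkt
u:0!k

show k~e
show keys k
show u
```

Because the key column sym is already the first column of t3, the unkeyed result has the same column order as the original table, which is why the explicit column list from your query is not needed here.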

But again, this ends up being around the same speed as your original solution:

q)\t:1000 (0!select by sym from t where mkt=`ef),select from t where mkt<>`ef
42

So, in terms of optimization, it seems your query already works well for what you want it to accomplish.