Ste*_*ins 8 sql postgresql json
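In PostgreSQL 9.3, I am storing some fairly complex JSON objects, with arrays nested inside arrays. This snippet is not the real data, but it illustrates the same concept: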
{
    "customerId" : "12345",
    "orders" : [{
        "orderId" : "54321",
        "lineItems" : [{
            "productId" : "abc",
            "qty" : 3
        }, {
            "productId" : "def",
            "qty" : 1
        }]
    }]
}
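I want SQL queries to be able to operate on the lineItem objects, not just within this single JSON structure but across all JSON objects in that table column. For example, a SQL query that returns every distinct productId along with its total sold qty. To keep such a query from taking all day, I would probably want an index on lineItem or its child fields.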
Using this StackOverflow question, I figured out how to write a query that works:
SELECT
    line_item->>'productId' AS product_id,
    SUM(CAST(line_item->>'qty' AS INTEGER)) AS qty_sold
FROM
    my_table,
    -- "order" is a reserved word in PostgreSQL, so the alias is spelled order_elem here
    json_array_elements(my_table.my_json_column->'orders') AS order_elem,
    json_array_elements(order_elem->'lineItems') AS line_item
GROUP BY product_id;
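However, the original StackOverflow question dealt with data nested only one level deep rather than two. I extended the same concept (i.e. a "lateral join" in the FROM clause) by adding an extra lateral join to go one level deeper. I am not sure this is the best approach, though, so the first part of my question is: what is the best way to query JSON data that sits an arbitrary number of levels deep inside a JSON object?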
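For the second part, creating an index on such nested data, this StackOverflow question again deals with data nested only one level deep. I am completely lost here, and my head is swimming trying to work out how to apply it at deeper levels. Can anyone offer a clear approach for indexing data that is at least two levels deep, such as lineItems above?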
To handle arbitrarily deep nesting, you need a recursive CTE that operates on every individual JSON element in every table row:
WITH RECURSIVE
raw_json as (
    SELECT
        *
    FROM
        (VALUES
            (1,
            '{
                "customerId": "12345",
                "orders": [
                    {
                        "orderId": "54321",
                        "lineItems": [
                            {
                                "productId": "abc",
                                "qty": 3
                            },
                            {
                                "productId": "def",
                                "qty": 1
                            }
                        ]
                    }
                ]
            }'::json),
            (2,
            '{
                "customerId": "678910",
                "arbitraryLevel": {
                    "orders": [
                        {
                            "orderId": "55345",
                            "lineItems": [
                                {
                                    "productId": "abc",
                                    "qty": 3
                                },
                                {
                                    "productId": "ghi",
                                    "qty": 10
                                }
                            ]
                        }
                    ]
                }
            }'::json)
        ) a(id,sample_json)
),
json_recursive as (
    SELECT
        a.id,
        b.k,
        b.v,
        b.json_type,
        --track any arbitrary id while walking the json graph
        case when b.json_type = 'object' and b.v->>'customerId' is not null then b.v->>'customerId' else a.customer_id end customer_id,
        case when b.json_type = 'object' and b.v->>'orderId' is not null then b.v->>'orderId' else a.order_id end order_id,
        case when b.json_type = 'object' and b.v->>'productId' is not null then b.v->>'productId' else a.product_id end product_id
    FROM
        (
            SELECT
                id,
                sample_json v,
                --the choice of json accessor function depends on this value;
                --PostgreSQL 9.3 has no built-in for it (json_typeof() arrived in 9.4)
                case left(sample_json::text,1)
                    when '[' then 'array'
                    when '{' then 'object'
                    else 'scalar'
                end json_type,
                sample_json->>'customerId' customer_id,
                sample_json->>'orderId' order_id,
                sample_json->>'productId' product_id
            FROM
                raw_json
        ) a
    CROSS JOIN LATERAL (
        --for a plain object, get the key/value pairs of its members
        SELECT
            b.k,
            b.v,
            case left(b.v::text,1)
                when '[' then 'array'
                when '{' then 'object'
                else 'scalar'
            end json_type
        FROM
            json_each(case json_type when 'object' then a.v else null end) b(k,v)
        UNION ALL
        --for an array, just get the elements (a top-level array has no parent key, so k is null)
        SELECT
            null::text k,
            c.v,
            case left(c.v::text,1)
                when '[' then 'array'
                when '{' then 'object'
                else 'scalar'
            end json_type
        FROM
            json_array_elements(case json_type when 'array' then a.v else null end) c(v)
    ) b
    UNION ALL --recursive term
    SELECT
        a.id,
        b.k,
        b.v,
        b.json_type,
        case when b.json_type = 'object' and not (b.v->>'customerId') is null then b.v->>'customerId' else a.customer_id end customer_id,
        case when b.json_type = 'object' and not (b.v->>'orderId') is null then b.v->>'orderId' else a.order_id end order_id,
        case when b.json_type = 'object' and not (b.v->>'productId') is null then b.v->>'productId' else a.product_id end product_id
    FROM
        json_recursive a
    CROSS JOIN LATERAL (
        SELECT
            b.k,
            b.v,
            case left(b.v::text,1)
                when '[' then 'array'
                when '{' then 'object'
                else 'scalar'
            end json_type
        FROM
            json_each(case json_type when 'object' then a.v else null end) b(k,v)
        UNION ALL
        SELECT
            a.k,
            c.v,
            case left(c.v::text,1)
                when '[' then 'array'
                when '{' then 'object'
                else 'scalar'
            end json_type
        FROM
            json_array_elements(case json_type when 'array' then a.v else null end) c(v)
    ) b
)
You can then sum "qty" by any of the tracked ids...
SELECT
    customer_id,
    sum(v::text::integer)
FROM
    json_recursive
WHERE
    k = 'qty'
GROUP BY
    customer_id
Or you can fetch the "lineItem" objects and manipulate them however you need:
SELECT
    *
FROM
    json_recursive
WHERE
    k = 'lineItems' and json_type = 'object'
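For readers on a newer server, the same walk can probably be written more compactly. The following is only a sketch, assuming PostgreSQL 9.4 or later (where json_typeof() exists) rather than the 9.3 named in the question. The CTE name walk and the inlined sample rows are illustrative choices, not part of the original answer. It tracks only productId and produces the per-product totals the question asked for.

```sql
-- Sketch only: assumes PostgreSQL 9.4+ for json_typeof().
-- "walk" and the inlined sample rows are illustrative names/data.
WITH RECURSIVE
raw_json(id, sample_json) AS (
    VALUES
        (1, '{"customerId": "12345", "orders": [{"orderId": "54321", "lineItems": [{"productId": "abc", "qty": 3}, {"productId": "def", "qty": 1}]}]}'::json),
        (2, '{"customerId": "678910", "arbitraryLevel": {"orders": [{"orderId": "55345", "lineItems": [{"productId": "abc", "qty": 3}, {"productId": "ghi", "qty": 10}]}]}}'::json)
),
walk(id, k, v, product_id) AS (
    -- Anchor: start from the root document of each row.
    SELECT id, NULL::text, sample_json, NULL::text
    FROM raw_json
    UNION ALL
    -- Recursive step: objects expand via json_each, arrays via
    -- json_array_elements; array elements inherit the parent key.
    SELECT w.id,
           child.k,
           child.v,
           CASE WHEN json_typeof(child.v) = 'object'
                THEN coalesce(child.v->>'productId', w.product_id)
                ELSE w.product_id
           END
    FROM walk w
    CROSS JOIN LATERAL (
        SELECT e.k, e.v
        FROM json_each(CASE WHEN json_typeof(w.v) = 'object' THEN w.v END) AS e(k, v)
        UNION ALL
        SELECT w.k, a.v
        FROM json_array_elements(CASE WHEN json_typeof(w.v) = 'array' THEN w.v END) AS a(v)
    ) child
)
SELECT product_id, sum(v::text::integer) AS qty_sold
FROM walk
WHERE k = 'qty'
GROUP BY product_id
ORDER BY product_id;
```

With the two sample rows above, this should yield abc = 6, def = 1 and ghi = 10. The recursion visits every node of every document, so on a large table it is still a full scan; it answers the "arbitrary depth" part of the question, not the performance part.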
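As for indexing, you could adapt the recursive query into a function that returns the unique keys found in the JSON object of each row of the original table, and then create a functional index on the JSON column: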
SELECT
    array_agg(DISTINCT k)
FROM
    json_recursive
WHERE
    not k is null
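One way that suggestion might look in practice is sketched below. The function name json_all_keys and the index name are hypothetical, and my_table / my_json_column are taken from the question (the CREATE TABLE is only a stand-in so the snippet runs on its own). The function is declared IMMUTABLE because PostgreSQL requires that of any function used in an index expression; here the result depends only on the argument, so the declaration seems justified.

```sql
-- Stand-in for the table from the question; skip if it already exists.
CREATE TABLE IF NOT EXISTS my_table (
    id             serial PRIMARY KEY,
    my_json_column json
);

-- Hypothetical helper: returns every distinct key found at any depth.
-- Uses the left()-based type check so it also runs on PostgreSQL 9.3.
CREATE OR REPLACE FUNCTION json_all_keys(doc json)
RETURNS text[]
LANGUAGE sql
IMMUTABLE
AS $$
    WITH RECURSIVE walk(k, v, json_type) AS (
        SELECT NULL::text,
               doc,
               CASE left(ltrim(doc::text, E' \t\n\r'), 1)
                   WHEN '[' THEN 'array'
                   WHEN '{' THEN 'object'
                   ELSE 'scalar'
               END
        UNION ALL
        SELECT child.k,
               child.v,
               CASE left(ltrim(child.v::text, E' \t\n\r'), 1)
                   WHEN '[' THEN 'array'
                   WHEN '{' THEN 'object'
                   ELSE 'scalar'
               END
        FROM walk w
        CROSS JOIN LATERAL (
            SELECT e.k, e.v
            FROM json_each(CASE w.json_type WHEN 'object' THEN w.v END) AS e(k, v)
            UNION ALL
            SELECT w.k, a.v
            FROM json_array_elements(CASE w.json_type WHEN 'array' THEN w.v END) AS a(v)
        ) child
    )
    SELECT array_agg(DISTINCT k)
    FROM walk
    WHERE k IS NOT NULL
$$;

-- Functional GIN index over the key array (index name is illustrative).
CREATE INDEX my_table_json_keys_idx
    ON my_table
    USING gin (json_all_keys(my_json_column));

-- A query of this shape can use the index to find rows containing a key.
SELECT *
FROM my_table
WHERE json_all_keys(my_json_column) @> ARRAY['lineItems'];
```

An index like this only narrows the search to rows whose documents contain a given key at some depth; aggregating qty per productId still has to walk the JSON of the matching rows.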