pdo*_*naj 5 json amazon-s3 amazon-web-services amazon-athena
I have nested JSON files on S3 and am trying to query them with Athena.
However, I am having trouble querying the nested JSON values.
My JSON files look like this:
{
  "id": "17842007980192959",
  "acount_id": "17841401243773780",
  "stats": [
    {
      "name": "engagement",
      "period": "lifetime",
      "values": [
        {
          "value": 374
        }
      ],
      "title": "Engagement",
      "description": "Total number of likes and comments on the media object",
      "id": "17842007980192959/insights/engagement/lifetime"
    },
    {
      "name": "impressions",
      "period": "lifetime",
      "values": [
        {
          "value": 11125
        }
      ],
      "title": "Impressions",
      "description": "Total number of times the media object has been seen",
      "id": "17842007980192959/insights/impressions/lifetime"
    },
    {
      "name": "reach",
      "period": "lifetime",
      "values": [
        {
          "value": 8223
        }
      ],
      "title": "Reach",
      "description": "Total number of unique accounts that have seen the media object",
      "id": "17842007980192959/insights/reach/lifetime"
    },
    {
      "name": "saved",
      "period": "lifetime",
      "values": [
        {
          "value": 0
        }
      ],
      "title": "Saved",
      "description": "Total number of unique accounts that have saved the media object",
      "id": "17842007980192959/insights/saved/lifetime"
    }
  ],
  "import_date": "2017-12-04"
}
What I want to do is query the values in the "stats" field where name=impressions.
So ideally something like this:
SELECT id, account_id, stats.values.value WHERE stats.name='engagement'
AWS example: https://docs.aws.amazon.com/athena/latest/ug/searching-for-values.html
Any help would be appreciated.
You can query the JSON with the following table definition:
CREATE EXTERNAL TABLE test(
  id string,
  acount_id string,
  stats array<
    struct<
      name:string,
      period:string,
      values:array<
        struct<value:string>>,
      title:string
    >
  >
)
ROW FORMAT SERDE 'org.openx.data.jsonserde.JsonSerDe'
LOCATION 's3://bucket/';
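One caveat, stated as an assumption rather than something from the original answer: as far as I know, the OpenX JSON SerDe reads one JSON record per line, so a pretty-printed file like the one in the question may need to be flattened before the table returns rows. A minimal Python sketch of that flattening step follows; the helper name `to_single_line` is made up for illustration and the record is a trimmed copy of the one in the question.

```python
import json


def to_single_line(pretty_json: str) -> str:
    """Re-serialize one JSON document as a single compact line (hypothetical helper)."""
    return json.dumps(json.loads(pretty_json), separators=(",", ":"))


# Trimmed-down copy of the record from the question.
pretty = """
{
  "id": "17842007980192959",
  "acount_id": "17841401243773780",
  "stats": [
    {
      "name": "engagement",
      "period": "lifetime",
      "values": [
        {"value": 374}
      ]
    }
  ],
  "import_date": "2017-12-04"
}
"""

line = to_single_line(pretty)
print(line)  # one record on one line, ready to be written back to S3
```

Uploading the flattened file is left out here because it depends on how the files reach the bucket.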
Now the value column can be accessed by unnesting as follows:
select id, acount_id, stat.name,x.value
from test
cross join UNNEST(test.stats) as st(stat)
cross join UNNEST(stat."values") as valx(x)
WHERE stat.name='engagement';
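To make it easier to see what the two UNNEST steps do, here is a rough Python sketch that mimics the query on a shortened copy of the sample record. It is only an illustration of the row-expansion logic, not how Athena executes the query; the function name `unnest_stats` is hypothetical.

```python
import json

# Shortened copy of the record from the question (two of the four stats).
record = json.loads("""
{
  "id": "17842007980192959",
  "acount_id": "17841401243773780",
  "stats": [
    {"name": "engagement", "period": "lifetime", "values": [{"value": 374}]},
    {"name": "impressions", "period": "lifetime", "values": [{"value": 11125}]}
  ]
}
""")


def unnest_stats(rec, wanted_name):
    """Return (id, acount_id, name, value) rows, one per element of stats[].values[]."""
    rows = []
    for stat in rec["stats"]:        # cross join UNNEST(test.stats) as st(stat)
        for x in stat["values"]:     # cross join UNNEST(stat."values") as valx(x)
            if stat["name"] == wanted_name:  # WHERE stat.name = '...'
                rows.append((rec["id"], rec["acount_id"], stat["name"], x["value"]))
    return rows


rows = unnest_stats(record, "engagement")
print(rows)
```

Changing the filter to 'impressions' selects the other metric in the same way. Note that the table definition above declares value as string, so Athena would return it as text, whereas this sketch keeps the integer from the JSON.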