Bal*_*nki 2 json amazon-s3 amazon-redshift jq
I want to generate a manifest file for a Redshift COPY using aws s3api list-objects and jq, as follows:
aws s3api list-objects --bucket annalects3 --prefix "DFA/20160926/394007-OMD-Coles/dcm_account394007_impression" --output json --query '{"entries": Contents[].{"url":"Key"}}' | jq '.entries[].mandatory = true'
It produces output like this:
{ "entries": [
{
"mandatory": true,
"url": "DFA/20160926/394007-OMD-Coles/dcm_account394007_impression_2016092507_20160926_002328_292527438.csv.gz"
},
{
"mandatory": true,
"url": "DFA/20160926/394007-OMD-Coles/dcm_account394007_impression_2016092508_20160926_020131_292592736.csv.gz"
},
{
"mandatory": true,
"url": "DFA/20160926/394007-OMD-Coles/dcm_account394007_impression_2016092509_20160926_030312_292502379.csv.gz"
},
{
"mandatory": true,
"url": "DFA/20160926/394007-OMD-Coles/dcm_account394007_impression_2016092510_20160926_033656_292590227.csv.gz"
}
]
}
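However, the manifest file requires each url to be a full S3 object URL prefixed with the bucket name, and I have not been able to produce that. The output needs to look like this: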
{ "entries": [
{
"mandatory": true,
"url": "s3://mybucket/DFA/20160926/394007-OMD-Coles/dcm_account394007_impression_2016092507_20160926_002328_292527438.csv.gz"
},
{
"mandatory": true,
"url": "s3://mybucket/DFA/20160926/394007-OMD-Coles/dcm_account394007_impression_2016092508_20160926_020131_292592736.csv.gz"
},
{
"mandatory": true,
"url": "s3://mybucket/DFA/20160926/394007-OMD-Coles/dcm_account394007_impression_2016092509_20160926_030312_292502379.csv.gz"
},
{
"mandatory": true,
"url": "s3://mybucket/DFA/20160926/394007-OMD-Coles/dcm_account394007_impression_2016092510_20160926_033656_292590227.csv.gz"
}
]
}
The following will do what you want:
aws s3api list-objects \
--bucket <mybucket> \
--prefix "<myprefix>" \
--output json \
--query '{"entries": Contents[].{"url":"Key"}}' \
| jq '.entries[] |= (.url = "s3://<mybucket>/\(.url)" | .mandatory = true)'
I am using string interpolation to update each entries[].url value.
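To try the interpolation locally without calling AWS, here is a minimal sketch that runs a jq filter over a small hand-written sample. The bucket name `mybucket`, the sample keys, and the variable names `bucket`, `sample`, and `manifest` are assumptions for illustration only. It passes the bucket in with jq's `--arg` option rather than hard-coding it in the filter, and uses the update operator `|=` so the enclosing `entries` object is kept.

```shell
#!/bin/sh
# Hypothetical bucket name; passed to jq as $bucket via --arg.
bucket="mybucket"

# Hand-written sample in the shape produced by the --query expression above.
sample='{"entries":[{"url":"DFA/a.csv.gz"},{"url":"DFA/b.csv.gz"}]}'

# |= rewrites each entry in place, so the {"entries": [...]} wrapper survives.
# Inside the parentheses, "." is a single entry, so .url is that entry's key.
manifest=$(printf '%s' "$sample" | jq -c --arg bucket "$bucket" \
  '.entries[] |= (.url = "s3://\($bucket)/\(.url)" | .mandatory = true)')

printf '%s\n' "$manifest"
```

In a real run, the same filter and `--arg` option would go after the `aws s3api list-objects ... |` part of the pipeline in place of the sample input.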