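With AWS::Glue::Table, you can set up an Athena table as shown here. Athena supports partitioning data based on the folder structure in S3. I would like to partition my Athena table from my Glue template.
From the AWS Glue Table TableInput documentation, it looks like I can use PartitionKeys to partition my data, but when I try the following template, Athena fails and does not return any data.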
Resources:
  ...
  MyGlueTable:
    Type: AWS::Glue::Table
    Properties:
      DatabaseName: !Ref MyGlueDatabase
      CatalogId: !Ref AWS::AccountId
      TableInput:
        Name: my-glue-table
        Parameters: { "classification" : "json" }
        PartitionKeys:
          - {Name: dt, Type: string}
        StorageDescriptor:
          Location: "s3://elasticmapreduce/samples/hive-ads/tables/impressions/"
          InputFormat: "org.apache.hadoop.mapred.TextInputFormat"
          OutputFormat: "org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat"
          SerdeInfo:
            Parameters: { "separatorChar" : "," }
            SerializationLibrary: "org.apache.hive.hcatalog.data.JsonSerDe"
          StoredAsSubDirectories: false
          Columns:
            - {Name: requestBeginTime, Type: string}
            - {Name: adId, Type: string}
            - …
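One thing that may be relevant here: declaring PartitionKeys only defines the partition column. It does not register any partitions in the Glue Data Catalog, and Athena reads data only from partitions that are registered. The fragment below is a minimal sketch of registering a single partition with an AWS::Glue::Partition resource under the same Resources section. The logical name MyGlueTablePartition and the dt value "2009-04-12-13-00" are assumptions for illustration, not part of the original template, and the column list is left elided exactly as it is in the table definition.

```yaml
  # Hypothetical resource (not in the original template): registers one
  # partition of the table defined by MyGlueTable.
  MyGlueTablePartition:
    Type: AWS::Glue::Partition
    Properties:
      CatalogId: !Ref AWS::AccountId
      DatabaseName: !Ref MyGlueDatabase
      TableName: !Ref MyGlueTable
      PartitionInput:
        # One value per entry in PartitionKeys, in the same order.
        # The dt value here is an assumed example.
        Values:
          - "2009-04-12-13-00"
        StorageDescriptor:
          # Assumed folder for this partition beneath the table Location.
          Location: "s3://elasticmapreduce/samples/hive-ads/tables/impressions/dt=2009-04-12-13-00/"
          InputFormat: "org.apache.hadoop.mapred.TextInputFormat"
          OutputFormat: "org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat"
          SerdeInfo:
            SerializationLibrary: "org.apache.hive.hcatalog.data.JsonSerDe"
          # Mirrors the table's columns; the remainder is elided in the original.
          Columns:
            - {Name: requestBeginTime, Type: string}
            - {Name: adId, Type: string}
            - …
```

When the S3 folders follow the Hive naming convention (dt=.../), running MSCK REPAIR TABLE on the table in Athena is an alternative that discovers and registers the partitions without declaring each one in the template.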