Posts by adi*_*ari

Reading unzipped shapefiles stored in AWS S3 from an AWS EMR cluster using PySpark in a Jupyter Notebook

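I am completely new to AWS EMR and Apache Spark. I am trying to assign GeoIDs to residential properties using a shapefile, but I cannot read the shapefile from my S3 bucket. Please help me understand what is happening; I could not find any answer online that explains this exact problem.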

<!-- language: python 3.4 -->

import shapefile
import pandas as pd
from shapely.geometry import shape  # assumed source of the shape() call below

def read_shapefile(shp_path):
    """
    Read a shapefile into a pandas DataFrame with a 'coords' column holding
    the geometry information. This uses the pyshp package.
    """
    # Read the file, then parse out the records and shapes
    sf = shapefile.Reader(shp_path)
    fields = [x[0] for x in sf.fields][1:]
    records = sf.records()
    shps = [s.points for s in sf.shapes()]
    center = [shape(s).centroid.coords[0] …

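One possible direction, offered only as an untested sketch: as far as I know, pyshp's `Reader` treats a path string as a local file, so an `s3://` URI is unlikely to open directly. pyshp does accept file-like objects through its `shp=`, `shx=`, and `dbf=` keyword arguments, so the component files could be downloaded into memory first. The helper names (`split_s3_uri`, `sidecar_keys`, `open_shapefile_from_s3`) and the example bucket and key are hypothetical, and the sketch assumes `boto3` and `pyshp` are installed on the node running the notebook and that the cluster role can read the bucket.

```python
import io
from urllib.parse import urlparse


def split_s3_uri(uri):
    """Split 's3://bucket/key' into (bucket, key)."""
    parsed = urlparse(uri)
    if parsed.scheme != "s3" or not parsed.netloc:
        raise ValueError("not an s3:// URI: {!r}".format(uri))
    return parsed.netloc, parsed.path.lstrip("/")


def sidecar_keys(shp_key):
    """Map each required extension to its object key, given the .shp key."""
    if not shp_key.lower().endswith(".shp"):
        raise ValueError("expected a key ending in .shp: {!r}".format(shp_key))
    base = shp_key[:-4]
    return {ext: "{}.{}".format(base, ext) for ext in ("shp", "shx", "dbf")}


def open_shapefile_from_s3(uri):
    """Download the .shp/.shx/.dbf objects into memory and open them with pyshp.

    Assumption: all three objects sit next to each other under the same prefix.
    """
    # Imported lazily so the pure helpers above work without these packages.
    import boto3
    import shapefile

    bucket, key = split_s3_uri(uri)
    s3 = boto3.client("s3")
    parts = {
        ext: io.BytesIO(s3.get_object(Bucket=bucket, Key=obj_key)["Body"].read())
        for ext, obj_key in sidecar_keys(key).items()
    }
    return shapefile.Reader(shp=parts["shp"], shx=parts["shx"], dbf=parts["dbf"])
```

With something like this, `sf = open_shapefile_from_s3("s3://my-bucket/path/to/tracts.shp")` would stand in for `shapefile.Reader(shp_path)` in the function above. This loads each file fully into memory, which may not suit very large shapefiles.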

gis amazon-s3 shapefile python-3.x pyspark

adi*_*ari

2018-08-01

5
votes

1
answer

2330
views