Bal*_*r82 6 python json openstreetmap pandas overpass-api
I am trying to read a valid JSON string returned by the OpenStreetMap API.
I use the following code:
import pandas as pd
import requests
# Bottom-left corner of the bounding box
minLat = 50.9549
minLon = 13.55232
# Top-right corner of the bounding box
maxLat = 51.1390
maxLon = 13.89873
osmrequest = {'data': '[out:json][timeout:25];(node["highway"="bus_stop"](%s,%s,%s,%s););out body;>;out skel qt;' % (minLat, minLon, maxLat, maxLon)}
osmurl = 'http://overpass-api.de/api/interpreter'
osm = requests.get(osmurl, params=osmrequest)
osmdata = osm.json()
osmdataframe = pd.read_json(osmdata)
It throws the following error:
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
<ipython-input-66-304b7fbfb645> in <module>()
----> 1 osmdataframe = pd.read_json(osmdata)
/Users/paul/anaconda/lib/python2.7/site-packages/pandas/io/json.pyc in read_json(path_or_buf, orient, typ, dtype, convert_axes, convert_dates, keep_default_dates, numpy, precise_float, date_unit)
196 obj = FrameParser(json, orient, dtype, convert_axes, convert_dates,
197 keep_default_dates, numpy, precise_float,
--> 198 date_unit).parse()
199
200 if typ == 'series' or obj is None:
/Users/paul/anaconda/lib/python2.7/site-packages/pandas/io/json.pyc in parse(self)
264
265 else:
--> 266 self._parse_no_numpy()
267
268 if self.obj is None:
/Users/paul/anaconda/lib/python2.7/site-packages/pandas/io/json.pyc in _parse_no_numpy(self)
481 if orient == "columns":
482 self.obj = DataFrame(
--> 483 loads(json, precise_float=self.precise_float), dtype=None)
484 elif orient == "split":
485 decoded = dict((str(k), v)
TypeError: Expected String or Unicode
How can I change the request or the pandas read_json call to avoid this error? And, by the way, what is actually the problem?
unu*_*tbu 13
If you print the JSON string to a file,
# osm.content holds the raw bytes of the response body
content = osm.content
with open('/tmp/out', 'wb') as f:
    f.write(content)
you will see something like this:
{
  "version": 0.6,
  "generator": "Overpass API",
  "osm3s": {
    "timestamp_osm_base": "2014-07-20T07:52:02Z",
    "copyright": "The data included in this document is from www.openstreetmap.org. The data is made available under ODbL."
  },
  "elements": [
    {
      "type": "node",
      "id": 536694,
      "lat": 50.9849256,
      "lon": 13.6821776,
      "tags": {
        "highway": "bus_stop",
        "name": "Niederhäslich Bergmannsweg"
      }
    },
    ...]}
If the JSON string were converted to a Python object, it would be a dict whose elements key is a list of dicts. The vast majority of the data is inside this list of dicts.
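This is also where the TypeError in the question comes from: as far as I can tell, osm.json() in requests already returns the decoded Python object rather than a string, while read_json in that pandas version expected a string. A minimal sketch (Python 3) with the standard json module; the sample string is a shortened, hard-coded stand-in for the response above, not live API output:

```python
import json

# Shortened stand-in for the Overpass response shown above (illustrative only).
sample = '''
{
  "version": 0.6,
  "generator": "Overpass API",
  "elements": [
    {"type": "node", "id": 536694, "lat": 50.9849256, "lon": 13.6821776,
     "tags": {"highway": "bus_stop", "name": "Niederhäslich Bergmannsweg"}}
  ]
}
'''

# json.loads turns the JSON text into Python objects, which is
# the same kind of result that Response.json() hands back.
osmdata = json.loads(sample)

print(type(osmdata).__name__)              # the top level is a dict
print(sorted(osmdata.keys()))              # its top-level keys
print(type(osmdata['elements']).__name__)  # 'elements' is a list
print(osmdata['elements'][0]['tags']['name'])
```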
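This JSON string is not directly convertible to a pandas object. What would be the index, and what would be the columns? Surely you don't want [u'elements', u'version', u'osm3s', u'generator'] to be the columns, since almost all the information is in the elements list of dicts.
But if you want the DataFrame to consist only of the data in the elements list of dicts, then you have to specify that, since pandas can't make that assumption for you.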
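Specifying it explicitly could look like the sketch below, which hands only the elements list to the DataFrame constructor. The osmdata dict here is a trimmed stand-in for osm.json() (values borrowed from the examples in this answer, ids left out), so treat it as an assumption rather than real API output:

```python
import pandas as pd

# Trimmed stand-in for osm.json(); not a real API response.
osmdata = {
    'version': 0.6,
    'generator': 'Overpass API',
    'elements': [
        {'type': 'node', 'lat': 50.9849256, 'lon': 13.6821776,
         'tags': {'highway': 'bus_stop', 'name': 'Niederhäslich Bergmannsweg'}},
        {'type': 'node', 'lat': 51.123623, 'lon': 13.782789,
         'tags': {'highway': 'bus_stop', 'name': 'Sagarder Weg'}},
    ],
}

# Pick the list of dicts explicitly; each dict becomes one row.
elements_frame = pd.DataFrame(osmdata['elements'])

print(sorted(elements_frame.columns))
print(elements_frame[['lat', 'lon']])
```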
Further complicating things, each dict in elements is itself a nested dict. Consider the first dict in elements:
{
  "type": "node",
  "id": 536694,
  "lat": 50.9849256,
  "lon": 13.6821776,
  "tags": {
    "highway": "bus_stop",
    "name": "Niederhäslich Bergmannsweg"
  }
}
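Should ['lat', 'lon', 'type', 'id', 'tags'] be the columns? That seems plausible, except that the tags column would end up being a column of dicts, which is usually not very useful. It would perhaps be nicer if the keys in the tags dict were made into columns. We can do that, but again we have to code it ourselves, since pandas has no way of knowing that this is what we want.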
import pandas as pd
import requests

# Bottom-left corner of the bounding box
minLat = 50.9549
minLon = 13.55232
# Top-right corner of the bounding box
maxLat = 51.1390
maxLon = 13.89873
osmrequest = {'data': '[out:json][timeout:25];(node["highway"="bus_stop"](%s,%s,%s,%s););out body;>;out skel qt;' % (minLat, minLon, maxLat, maxLon)}
osmurl = 'http://overpass-api.de/api/interpreter'
osm = requests.get(osmurl, params=osmrequest)
osmdata = osm.json()

# Keep only the list of element dicts
osmdata = osmdata['elements']

# Promote every key of the nested 'tags' dict to a top-level key
for dct in osmdata:
    for key, val in dct['tags'].items():
        dct[key] = val
    del dct['tags']

osmdataframe = pd.DataFrame(osmdata)
print(osmdataframe[['lat', 'lon', 'name']].head())
yields
         lat        lon                        name
0  50.984926  13.682178  Niederhäslich Bergmannsweg
1  51.123623  13.782789                Sagarder Weg
2  51.065752  13.895734     Weißig, Einkaufszentrum
3  51.007140  13.698498          Stuttgarter Straße
4  51.010199  13.701411          Heilbronner Straße
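As an aside, newer pandas releases (1.0 and later, as far as I know) expose pd.json_normalize, which can flatten the nested tags dicts without an explicit loop; nested keys come out as dotted column names such as tags.name. A sketch on stand-in data, assuming such a pandas version is installed; the elements list is hand-written from the rows shown above and is not a real API response:

```python
import pandas as pd

# Hand-written stand-in for osmdata['elements'] (illustrative only).
elements = [
    {'type': 'node', 'lat': 50.9849256, 'lon': 13.6821776,
     'tags': {'highway': 'bus_stop', 'name': 'Niederhäslich Bergmannsweg'}},
    {'type': 'node', 'lat': 51.123623, 'lon': 13.782789,
     'tags': {'highway': 'bus_stop', 'name': 'Sagarder Weg'}},
]

# Nested dicts are expanded into columns named 'tags.highway', 'tags.name', ...
flat = pd.json_normalize(elements)

# Drop the 'tags.' prefix so the columns match the loop-based version.
flat = flat.rename(
    columns=lambda c: c[len('tags.'):] if c.startswith('tags.') else c
)

print(flat[['lat', 'lon', 'name']])
```

Note that stripping the prefix can produce duplicate column names if a tag key happens to be called lat, lon, type or id; the loop above would silently overwrite the top-level value in that case.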