Saq*_*Ali 3 python regex parsing google-api
见下文。给定一个众所周知的 Google URL,我正在尝试从该 URL 检索数据。该数据将为我提供另一个 Google URL,我可以从中检索 JWK 列表。
>>> import requests, json
>>> open_id_config_url = 'https://ggp.sandbox.google.com/.well-known/openid-configuration'
>>> response = requests.get(open_id_config_url)
>>> r.status_code
200
>>> response.text
u'{\n "issuer": "https://www.stadia.com",\n "jwks_uri": "https://www.googleapis.com/service_accounts/v1/jwk/stadia-jwt@system.gserviceaccount.com",\n "claims_supported": [\n "iss",\n "aud",\n "sub",\n "iat",\n "exp",\n "s_env",\n "s_app_id",\n "s_gamer_tag",\n "s_purchase_country",\n "s_current_country",\n "s_session_id",\n "s_instance_ip",\n "s_restrict_text_chat",\n "s_restrict_voice_chat",\n "s_restrict_multiplayer",\n "s_restrict_stream_connect",\n ],\n "id_token_signing_alg_values_supported": [\n "RS256"\n ],\n}'
Run Code Online (Sandbox Code Playgroud)
上面我已经成功地从第一个 URL 中检索到数据。我可以看到该条目jwks_uri包含我需要的第二个 URL。但是当我尝试将该文本块转换为 python 字典时,它失败了。
>>> response.json()
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/Users/saqib.ali/saqib-env-99/lib/python2.7/site-packages/requests/models.py", line 889, in json
self.content.decode(encoding), **kwargs
File "/usr/local/Cellar/python@2/2.7.16/Frameworks/Python.framework/Versions/2.7/lib/python2.7/json/__init__.py", line 339, in loads
return _default_decoder.decode(s)
File "/usr/local/Cellar/python@2/2.7.16/Frameworks/Python.framework/Versions/2.7/lib/python2.7/json/decoder.py", line 364, in decode
obj, end = self.raw_decode(s, idx=_w(s, 0).end())
File "/usr/local/Cellar/python@2/2.7.16/Frameworks/Python.framework/Versions/2.7/lib/python2.7/json/decoder.py", line 382, in raw_decode
raise ValueError("No JSON object could be decoded")
ValueError: No JSON object could be decoded
>>> json.loads(response.text)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/usr/local/Cellar/python@2/2.7.16/Frameworks/Python.framework/Versions/2.7/lib/python2.7/json/__init__.py", line 339, in loads
return _default_decoder.decode(s)
File "/usr/local/Cellar/python@2/2.7.16/Frameworks/Python.framework/Versions/2.7/lib/python2.7/json/decoder.py", line 364, in decode
obj, end = self.raw_decode(s, idx=_w(s, 0).end())
File "/usr/local/Cellar/python@2/2.7.16/Frameworks/Python.framework/Versions/2.7/lib/python2.7/json/decoder.py", line 382, in raw_decode
raise ValueError("No JSON object could be decoded")
ValueError: No JSON object could be decoded
Run Code Online (Sandbox Code Playgroud)
我能得到 JWKs URL 的唯一方法是做这个丑陋的正则表达式解析:
>>> re.compile('(?<="jwks_uri": ")[^"]+').findall(response.text)[0]
u'https://www.googleapis.com/service_accounts/v1/jwk/stadia-jwt@system.gserviceaccount.com'
Run Code Online (Sandbox Code Playgroud)
有没有更干净、更 Pythonic 的方法来提取这个字符串?我真的希望 Google 发回一个可以完全 JSON 化的字符串。
返回的 json 字符串不正确,因为字典的最后一项以 结尾,,json 无法解析。
": [\n "RS256"\n ],\n}'
^^^
Run Code Online (Sandbox Code Playgroud)
但是ast.literal_eval可以这样做(因为python解析接受以逗号结尾的列表/字典)。只要你没有布尔值或null值,就有可能和 pythonic
>>> ast.literal_eval(response.text)["jwks_uri"]
'https://www.googleapis.com/service_accounts/v1/jwk/stadia-jwt@system.gserviceaccount.com'
Run Code Online (Sandbox Code Playgroud)
| 归档时间: |
|
| 查看次数: |
386 次 |
| 最近记录: |