Tags: python, data-science-experience, ibm-cloud
I'm trying to access a csv file in my Watson Data Platform catalog. I used the code-generation feature in my DSX notebook: Insert to code > Insert StreamingBody object.
The generated code is:
import os
import types
import pandas as pd
import boto3
def __iter__(self): return 0
# @hidden_cell
# The following code accesses a file in your IBM Cloud Object Storage. It includes your credentials.
# You might want to remove those credentials before you share your notebook.
os.environ['AWS_ACCESS_KEY_ID'] = '******'
os.environ['AWS_SECRET_ACCESS_KEY'] = '******'
endpoint = 's3-api.us-geo.objectstorage.softlayer.net'
bucket = 'catalog-test'
cos_12345 = boto3.resource('s3', endpoint_url=endpoint)
body = cos_12345.Object(bucket,'my.csv').get()['Body']
# add missing __iter__ method so pandas accepts body as file-like object
if not hasattr(body, "__iter__"): body.__iter__ = types.MethodType(__iter__, body)
df_data_2 = pd.read_csv(body)
df_data_2.head()
When I try to run this code, I get:
/usr/local/src/conda3_runtime.v27/4.1.1/lib/python3.5/site-packages/botocore/endpoint.py in create_endpoint(self, service_model, region_name, endpoint_url, verify, response_parser_factory, timeout, max_pool_connections)
270 if not is_valid_endpoint_url(endpoint_url):
271
--> 272 raise ValueError("Invalid endpoint: %s" % endpoint_url)
273 return Endpoint(
274 endpoint_url,
ValueError: Invalid endpoint: s3-api.us-geo.objectstorage.service.networklayer.com
Strangely, if I generate the code for a SparkSession setup instead, the same endpoint is used, yet the Spark code runs fine.
How can I fix this problem?
I'm assuming the other Softlayer endpoints will hit the same problem, so I'm listing them here as well to ensure this question also covers the other Softlayer locations:
The solution is to prefix the endpoint with https://, changing it from this:
endpoint = 's3-api.us-geo.objectstorage.softlayer.net'
to this:
endpoint = 'https://s3-api.us-geo.objectstorage.softlayer.net'
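More generally, botocore rejects any `endpoint_url` that lacks a URL scheme, which is why the bare hostname raises `ValueError: Invalid endpoint`. A minimal sketch (the helper name `normalize_endpoint` is hypothetical, not part of boto3) of normalizing an endpoint before handing it to `boto3.resource`:

```python
from urllib.parse import urlparse

def normalize_endpoint(endpoint):
    """Prepend https:// when the endpoint has no URL scheme,
    since botocore's endpoint validation rejects bare hostnames."""
    if urlparse(endpoint).scheme not in ("http", "https"):
        return "https://" + endpoint
    return endpoint

print(normalize_endpoint("s3-api.us-geo.objectstorage.softlayer.net"))
# https://s3-api.us-geo.objectstorage.softlayer.net
```

The same normalized value can then be passed as `endpoint_url=normalize_endpoint(endpoint)`, so the code works whether or not the scheme was included.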