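I'm trying to upload a file of about 5 GB as shown below, but it throws the error `string longer than 2147483647 bytes`. It sounds like uploads are limited to 2 GB. Is there a way to upload the data in chunks? Can anyone offer guidance?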
logger.debug(attachment_path)
currdir = os.path.abspath(os.getcwd())
os.chdir(os.path.dirname(attachment_path))
headers = self._headers
headers['Content-Type'] = content_type
headers['X-Override-File'] = 'true'
if not os.path.exists(attachment_path):
    raise Exception("File path was invalid, no file found at the path %s" % attachment_path)
filesize = os.path.getsize(attachment_path)
fileToUpload = open(attachment_path, 'rb').read()
logger.info(filesize)
logger.debug(headers)
r = requests.put(self._baseurl + 'problems/' + problemID + "/" + attachment_type + "/" + urllib.quote(os.path.basename(attachment_path)),
                 headers=headers, data=fileToUpload, timeout=300)
Error:
string longer than 2147483647 bytes
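For reference, the number in the message is `2**31 - 1`, the largest signed 32-bit integer, which suggests the limit applies to sending the whole body as a single in-memory string. A quick check of the sizes involved, as a sketch only (treating "about 5 GB" as 5 GiB is an assumption, and `exceeds_limit` is a hypothetical helper):

```python
# 2147483647 is the largest value a signed 32-bit integer can hold.
INT32_MAX = 2**31 - 1

# Assumption: "about 5 GB" is taken to mean 5 GiB for this check.
file_size = 5 * 1024**3


def exceeds_limit(size_in_bytes, limit=INT32_MAX):
    """Return True if a body of this size is larger than the limit."""
    return size_in_bytes > limit
```

Any file read fully into memory with `.read()` and larger than this limit would be expected to hit the same error.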
Update:
def read_in_chunks(file_object, chunk_size=30720*30720):
    """Lazy function (generator) to read a file piece by piece.
    Default chunk size: 30720 * 30720 bytes (900 MiB)."""
    while True:
        data = file_object.read(chunk_size)
        if not data:
            break
        yield data

f = open(attachment_path)
for piece in read_in_chunks(f):
    r = requests.put(self._baseurl + 'problems/' + problemID + "/" + attachment_type + "/" + urllib.quote(os.path.basename(attachment_path)),
                     headers=headers, data=piece, timeout=300)
kun*_*phu 10
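Your question has already been asked on the requests bug tracker; their suggestion is to use a streaming upload. If that doesn't work, you could check whether a chunk-encoded request works.
[Edit]
An example based on the original code: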
# Using `with` here will handle closing the file implicitly
with open(attachment_path, 'rb') as file_to_upload:
    r = requests.put(
        "{base}problems/{pid}/{atype}/{path}".format(
            base=self._baseurl,
            # It's better to use consistent naming; search PEP-8 for standard Python conventions.
            pid=problem_id,
            atype=attachment_type,
            path=urllib.quote(os.path.basename(attachment_path)),
        ),
        headers=headers,
        # Note that you're passing the file object, NOT the contents of the file:
        data=file_to_upload,
        # Hard to say whether this is a good idea with a large file upload
        timeout=300,
    )
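I can't guarantee this will run as-is, since I can't actually test it, but it should be close. The bug tracker comment I linked also mentions that sending multiple headers may cause problems, so if the headers you specify are actually required, this may not work.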
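One portability note: the code here uses Python 2's `urllib.quote`. On Python 3 the same function lives at `urllib.parse.quote`. A minimal sketch of the URL construction under that assumption follows; `build_attachment_url` is a hypothetical helper name, and the host in the usage note is made up for illustration.

```python
import os
from urllib.parse import quote


def build_attachment_url(base_url, problem_id, attachment_type, attachment_path):
    """Build the upload URL, percent-encoding the file name.

    Hypothetical helper: it mirrors the string formatting used in the
    example, but it is not part of the original code.
    """
    filename = quote(os.path.basename(attachment_path))
    return "{base}problems/{pid}/{atype}/{name}".format(
        base=base_url,
        pid=problem_id,
        atype=attachment_type,
        name=filename,
    )
```

For instance, `build_attachment_url("https://example.invalid/api/", "42", "logs", "/tmp/big file.bin")` encodes the space in the file name as `%20`.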
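Regarding chunked encoding: this should be your second choice. Your code didn't specify 'rb' as the mode for open(...), so changing that may be enough to make the code above work. If not, you can try this.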
def read_in_chunks():
    # If you're going to chunk anyway, doesn't it seem like smaller ones than this would be a good idea?
    chunk_size = 30720 * 30720
    # I don't know how correct this is; if it doesn't work as expected, you'll need to debug
    with open(attachment_path, 'rb') as file_object:
        while True:
            data = file_object.read(chunk_size)
            if not data:
                break
            yield data

# Same request as above, just using the function to chunk explicitly; see the `data` param
r = requests.put(
    "{base}problems/{pid}/{atype}/{path}".format(
        base=self._baseurl,
        pid=problem_id,
        atype=attachment_type,
        path=urllib.quote(os.path.basename(attachment_path)),
    ),
    headers=headers,
    # Call the chunk function here and the request will be chunked as you specify
    data=read_in_chunks(),
    timeout=300,
)
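To check the chunking logic on its own, without touching the network, something like the following sketch may help. It assumes Python 3; the path-taking signature, the 1 MiB default chunk size, and the `demo` helper with its temporary file are illustrative choices, not part of the original code.

```python
import os
import tempfile


def read_in_chunks(path, chunk_size=1024 * 1024):
    """Yield successive binary chunks of the file at `path`.

    The 1 MiB default is an assumption chosen for illustration.
    """
    with open(path, 'rb') as file_object:
        while True:
            data = file_object.read(chunk_size)
            if not data:
                break
            yield data


def demo():
    """Write a small temporary file, then read it back in 1024-byte chunks."""
    payload = os.urandom(2500)
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        tmp.write(payload)
        tmp_path = tmp.name
    try:
        chunks = list(read_in_chunks(tmp_path, chunk_size=1024))
    finally:
        os.remove(tmp_path)
    return payload, chunks
```

When a generator like this is passed as `data=`, requests sends the body with chunked transfer encoding, so the server has to support that for the upload to succeed.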