PDF files downloaded with Python cannot be opened in Acrobat

Question

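I have a small Python script that downloads a large batch of PDF files for archiving. The files are saved under the correct names, but they have the wrong size and Acrobat cannot open them. It fails with "Out of memory", "Insufficient data for an image", or some other arbitrary Acrobat error. Viewed in a text editor, the content looks roughly like a PDF document: mostly unreadable, as you would expect, but with fragments of text and markup, including the PDF identifier.

The code that downloads the files looks like this: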

import urllib  # Python 2: urllib.urlopen does not exist in Python 3

def download_file(file_id):
    folder_path = ".\\pdf_files\\"
    file_download = "http://myserver/documentimages.asp?SERVICE_ID=RETRIEVE_IMAGE&documentKey="
    # proxies={} bypasses any configured proxy
    file_content = urllib.urlopen(file_download + file_id, proxies={})
    file_local = open(folder_path + file_id + '.pdf', 'w')
    file_local.write(file_content.read())
    file_content.close()
    file_local.close()

If I download the same file through a browser, it opens fine, and it is also larger on disk. My guess is that the problem has something to do with the encoding used when saving the file?

Answer 1

小智 5

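You need to open the output file in binary mode. In text mode on Windows, Python alters newline bytes as it writes, which corrupts binary data such as a PDF. Open the file like this: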

file_local = open( folder_path + file_id + '.pdf', 'wb' )
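For readers on Python 3, where urllib.urlopen no longer exists, the same fix can be sketched roughly as follows. This is only an illustration under some assumptions: the `opener` parameter and the `BASE_URL` and `NO_PROXY_OPENER` names are hypothetical additions not found in the question, and an empty `ProxyHandler` stands in for the original `proxies={}` argument. The essential part is still the "wb" mode on the output file.

```python
import os
import shutil
import urllib.request

# URL prefix taken from the question; the constant name is illustrative.
BASE_URL = "http://myserver/documentimages.asp?SERVICE_ID=RETRIEVE_IMAGE&documentKey="

# An empty ProxyHandler disables proxies, standing in for proxies={} in the question.
NO_PROXY_OPENER = urllib.request.build_opener(urllib.request.ProxyHandler({}))


def download_file(file_id, folder_path="pdf_files", opener=NO_PROXY_OPENER.open):
    """Download one PDF and save it byte-for-byte; returns the local path.

    `opener` is a hypothetical hook (any callable taking a URL and returning
    a file-like object) so the function can be exercised without a server.
    """
    os.makedirs(folder_path, exist_ok=True)
    target = os.path.join(folder_path, file_id + ".pdf")
    # "wb" is the fix: binary mode writes the bytes exactly as received.
    with opener(BASE_URL + file_id) as response, open(target, "wb") as file_local:
        shutil.copyfileobj(response, file_local)
    return target
```

Copying with shutil.copyfileobj streams the response in chunks, so a large PDF does not have to be held in memory all at once, and the with statement closes both handles even if the download fails partway.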
