TypeError: expected str, bytes or os.PathLike object, not FileStorage while reading a PDF file using Flask

dhi*_*nar 5 python flask python-3.x

I am trying to read a PDF file in a Flask application. I am using pdfminer to extract the text from the PDF.

@app.route('/getfile', methods=['POST'])
def getfile():
    request_data = request.files['file']
    rsrcmgr = PDFResourceManager()
    retstr = io.StringIO()
    codec = 'utf-8'
    laparams = LAParams()
    device = TextConverter(rsrcmgr, retstr, codec=codec, laparams=laparams)
    fp = open(request_data, 'rb')
    interpreter = PDFPageInterpreter(rsrcmgr, device)
    password = ""
    maxpages = 0
    caching = True
    pagenos = set()

    for page in PDFPage.get_pages(fp, pagenos, maxpages=maxpages,
                                  password=password,
                                  caching=caching,
                                  check_extractable=True):
        interpreter.process_page(page)

    text = retstr.getvalue()

    fp.close()
    device.close()
    retstr.close()
    return text

Unfortunately, it throws an error:

 * Running on http://0.0.0.0:5000/ (Press CTRL+C to quit)
127.0.0.1 - - [11/Apr/2018 16:07:53] "GET /hello HTTP/1.1" 200 -
[2018-04-11 16:07:55,720] ERROR in app: Exception on /getfile [POST]
Traceback (most recent call last):
  File "c:\users\rb287jd\appdata\local\programs\python\python36\lib\site-packages\flask\app.py", line 1982, in wsgi_app
    response = self.full_dispatch_request()
  File "c:\users\rb287jd\appdata\local\programs\python\python36\lib\site-packages\flask\app.py", line 1614, in full_dispatch_request
    rv = self.handle_user_exception(e)
  File "c:\users\rb287jd\appdata\local\programs\python\python36\lib\site-packages\flask\app.py", line 1517, in handle_user_exception
    reraise(exc_type, exc_value, tb)
  File "c:\users\rb287jd\appdata\local\programs\python\python36\lib\site-packages\flask\_compat.py", line 33, in reraise
    raise value
  File "c:\users\rb287jd\appdata\local\programs\python\python36\lib\site-packages\flask\app.py", line 1612, in full_dispatch_request
    rv = self.dispatch_request()
  File "c:\users\rb287jd\appdata\local\programs\python\python36\lib\site-packages\flask\app.py", line 1598, in dispatch_request
    return self.view_functions[rule.endpoint](**req.view_args)
  File "C:/Users/RB287JD/Documents/Programs/flask_1.py", line 27, in getfile
    fp = open(request_data, 'rb').decode("utf-8")
TypeError: expected str, bytes or os.PathLike object, not FileStorage
127.0.0.1 - - [11/Apr/2018 16:07:55] "POST /getfile HTTP/1.1" 500 -

How can I read the uploaded PDF file inside Flask? P.S. I don't want to provide a file location anywhere in the code. I want to do it on the fly.

Mar*_*oni 4

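request.files['file'] is an instance of the FileStorage class (see also http://flask.pocoo.org/docs/0.12/api/#flask.Request.files), so you cannot do fp = open(request_data, 'rb'): open() expects a path, not an object that is already a file. The FileStorage object has a stream attribute that should point to an open temporary file, and you can probably pass that directly to PDFPage.get_pages().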

So, something like this:

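import io

from flask import Flask, request
from pdfminer.converter import TextConverter
from pdfminer.layout import LAParams
from pdfminer.pdfinterp import PDFPageInterpreter, PDFResourceManager
from pdfminer.pdfpage import PDFPage

app = Flask(__name__)


@app.route('/getfile', methods=['POST'])
def getfile():
    # FileStorage object for the uploaded form field named 'file'
    file = request.files['file']
    rsrcmgr = PDFResourceManager()
    retstr = io.StringIO()  # collects the extracted text
    codec = 'utf-8'
    laparams = LAParams()
    device = TextConverter(rsrcmgr, retstr, codec=codec, laparams=laparams)
    interpreter = PDFPageInterpreter(rsrcmgr, device)
    password = ""
    maxpages = 0      # 0 means no page limit
    caching = True
    pagenos = set()   # empty set means all pages

    # file.stream is already an open file-like object, so no open() call is needed
    for page in PDFPage.get_pages(file.stream, pagenos, maxpages=maxpages,
                                  password=password,
                                  caching=caching,
                                  check_extractable=True):
        interpreter.process_page(page)

    text = retstr.getvalue()

    device.close()
    retstr.close()
    return text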
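
To illustrate why open() rejects the upload object while its stream can be used directly, here is a minimal standard-library sketch. FakeUpload and try_open are hypothetical names invented for this illustration; FakeUpload only imitates the one relevant property of FileStorage (a stream attribute holding an already-open binary file) and is not the real Werkzeug class.

```python
import io


class FakeUpload:
    """Hypothetical stand-in for FileStorage: wraps an already-open binary stream."""

    def __init__(self, data: bytes):
        self.stream = io.BytesIO(data)


def try_open(obj):
    """Return the TypeError message raised by open(obj, 'rb'), or None on success."""
    try:
        with open(obj, "rb"):
            return None
    except TypeError as exc:
        return str(exc)


upload = FakeUpload(b"%PDF-1.4 example bytes")

# open() needs a path (str, bytes or os.PathLike), so passing the wrapper fails.
error = try_open(upload)

# The wrapped stream is already open, so it can be read (or handed to a parser) as-is.
header = upload.stream.read(8)

print(error)
print(header)
```

The same reasoning applies to the real upload: anything that accepts an open binary file object can be given file.stream, with no path on disk involved.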