Suppose a script does something like this:
# module writer.py
import sys
def write():
    sys.stdout.write("foobar")
Now suppose I want to capture the output of the write function and store it in a variable for further processing. The naive solution was:
# module mymodule.py
from writer import write
out = write()
print out.upper()
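But this doesn't work: write() returns None, so out never holds the text. I came up with another solution and it works, but please tell me if there is a better way to solve the problem. Thanks.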
import sys
from cStringIO import StringIO
from writer import write
# setup the environment
backup = sys.stdout
# ####
sys.stdout = StringIO() # capture output
write()
out = sys.stdout.getvalue() # read the captured output
# ####
sys.stdout.close() # close the stream
sys.stdout = backup # restore original stdout
print out.upper() # post processing
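One possible way to tidy up the backup/restore steps above is to wrap them in a context manager, so that sys.stdout is restored even if write() raises. This is only a sketch under assumptions: it targets Python 3 (io.StringIO and print as a function) rather than the Python 2 used in the question, captured_stdout is a hypothetical helper name, and write is redefined locally as a stand-in for writer.write.

```python
import sys
from contextlib import contextmanager
from io import StringIO


def write():
    # Stand-in for writer.write from the question.
    sys.stdout.write("foobar")


@contextmanager
def captured_stdout():
    """Temporarily replace sys.stdout with an in-memory buffer (hypothetical helper)."""
    backup = sys.stdout
    buffer = StringIO()
    sys.stdout = buffer
    try:
        yield buffer
    finally:
        # Always restore the original stream, even if the body raised.
        sys.stdout = backup


with captured_stdout() as buf:
    write()

out = buf.getvalue()  # read what was captured
print(out.upper())    # post processing
```

On Python 3.4 and later, the standard library's contextlib.redirect_stdout covers the same replace-and-restore pattern.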
When using pdfminer (latest version from git), installed via pip install git+https://github.com/pdfminer/pdfminer.six.git, I get a UnicodeDecodeError:
Traceback (most recent call last):
  File "pdfminer_sample3.py", line 34, in <module>
    print(convert_pdf_to_txt("samples/numbers-test-document.pdf"))
  File "pdfminer_sample3.py", line 27, in convert_pdf_to_txt
    text = retstr.getvalue()
  File "/usr/lib/python2.7/StringIO.py", line 271, in getvalue
    self.buf += ''.join(self.buflist)
UnicodeDecodeError: 'ascii' codec can't decode byte 0xe2 in position 0: ordinal not in range(128)
How can I fix this?
#!/usr/bin/env python
from pdfminer.pdfinterp import PDFResourceManager, PDFPageInterpreter
from pdfminer.converter import TextConverter
from pdfminer.layout import LAParams
from pdfminer.pdfpage import PDFPage
from StringIO import StringIO
import codecs …
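The traceback fails inside StringIO.getvalue() while joining the buffered chunks, which in Python 2 typically happens when unicode strings and non-ASCII byte strings end up in the same StringIO.StringIO buffer. One approach that may help is to keep the buffer to a single type: write encoded bytes into an io.BytesIO and decode once at the end. The sketch below does not call pdfminer; emit_chunks is a hypothetical stand-in for a converter writing to the stream, and whether the real converter can be configured to write bytes is an assumption to check against the installed version.

```python
import io


def emit_chunks(stream, encoding="utf-8"):
    # Hypothetical stand-in for a converter that writes extracted text.
    # Every chunk is encoded before writing, so the buffer only holds bytes.
    for chunk in (u"price: ", u"\u20ac", u"10"):
        stream.write(chunk.encode(encoding))


buf = io.BytesIO()
emit_chunks(buf)

# Decode exactly once, with an explicit codec, instead of relying on
# the implicit ASCII decoding that raised the error in the traceback.
text = buf.getvalue().decode("utf-8")
buf.close()
```

The euro sign used here encodes to a UTF-8 sequence starting with byte 0xe2, the same leading byte reported in the traceback.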