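I am writing a script to recursively walk through the subfolders of a main folder and build a list of files of a certain type. I am having a problem with the script. It is currently set up as follows: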
import os

for root, subFolder, files in os.walk(PATH):
    for item in files:
        if item.endswith(".txt"):
            # subFolder is a list of directory names here, not a single folder
            fileNamePath = str(os.path.join(root, subFolder, item))
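The problem is that the subFolder variable is pulling in a list of subfolders rather than the folder that the ITEM file is located in. I was thinking of running a for loop over the subfolders beforehand and joining the first part of the path, but I figured I would double-check to see if anyone has any suggestions before doing that. Thanks for your help!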
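One possible way around this, sketched under the assumption that only the full path of each matching file is needed: in each iteration of os.walk, root is already the directory that contains the names in files, so joining root with the file name should be enough and the list of subfolders can be ignored. The helper name find_files and its parameters top and suffix are made up for this illustration and are not part of the original script.

```python
import os


def find_files(top, suffix=".txt"):
    """Return full paths of files under `top` whose names end with `suffix`.

    `find_files`, `top`, and `suffix` are hypothetical names for this sketch.
    """
    matches = []
    # os.walk yields (dirpath, dirnames, filenames); dirpath is the folder
    # that directly contains every name in filenames.
    for root, _dirs, files in os.walk(top):
        for name in files:
            if name.endswith(suffix):
                matches.append(os.path.join(root, name))
    return matches
```

Calling find_files(PATH) would then give a flat list of paths to the .txt files at any depth, without a separate loop over the subfolders.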
I am trying to extract text from a PDF file using Python. My main goal is to write a program that reads a bank statement and extracts its text so that I can update an Excel file and easily record my monthly spending. Right now I am focusing only on extracting the text from the PDF file, but I don't know how to do so.
What is currently the best and easiest way to extract text from a PDF file into a …