Loop through PDF files and convert them to doc using Word

Joh*_*Doe 8 pdf vba ms-word

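I am trying to use VBA, which I am very new to, to produce a series of .doc documents from PDFs (not scanned images). That is, I want to loop over a set of PDF files and save each one in MS Word format. In my experience, Word reads my PDF documents well: it preserves the original layout most of the time. I am not sure this is the right way to tackle the problem, so I am also open to other suggestions, using R if possible.

Anyway, here is the code I found: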

Sub convertToWord()

   Dim MyObj As Object, MySource As Object, file As Variant

   file = Dir("C:\Users\username\work_dir_example" & "*.pdf") 'pdf path

   Do While (file <> "")

   ChangeFileOpenDirectory "C:\Users\username\work_dir_example"

          Documents.Open Filename:=file, ConfirmConversions:=False, ReadOnly:= _
        False, AddToRecentFiles:=False, PasswordDocument:="", PasswordTemplate:= _
        "", Revert:=False, WritePasswordDocument:="", WritePasswordTemplate:="", _
        Format:=wdOpenFormatAuto, XMLTransform:=""

    ChangeFileOpenDirectory "C:\Users\username\work_dir_example"

    ActiveDocument.SaveAs2 Filename:=Replace(file, ".pdf", ".docx"), FileFormat:=wdFormatXMLDocument _
        , LockComments:=False, Password:="", AddToRecentFiles:=True, _
        WritePassword:="", ReadOnlyRecommended:=False, EmbedTrueTypeFonts:=False, _
         SaveNativePictureFormat:=False, SaveFormsData:=False, SaveAsAOCELetter:= _
        False, CompatibilityMode:=15

    ActiveDocument.Close

     file = Dir

   Loop

End Sub

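After pasting the code into the developer window, I saved it in a module, closed the developer window, clicked the "Macros" button, and ran the "convertToWord" macro. A pop-up box shows the following error: "Sub or Function not defined". How can I fix this? Also, earlier, for reasons I still do not understand, I got an error related to ChangeFileOpenDirectory, which also appeared to be undefined.

Update, 27 August 2017

I changed the code to the following: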

Sub convertToWord()

   Dim MyObj As Object, MySource As Object, file As Variant

   file = Dir("C:\Users\username\work_dir_example" & "*.pdf")

   ChDir "C:\Users\username\work_dir_example"

   Do While (file <> "")

        Documents.Open Filename:=file, ConfirmConversions:=False, ReadOnly:= _
        False, AddToRecentFiles:=False, PasswordDocument:="", PasswordTemplate:= _
        "", Revert:=False, WritePasswordDocument:="", WritePasswordTemplate:="", _
        Format:=wdOpenFormatAuto, XMLTransform:=""

        ActiveDocument.SaveAs2 Filename:=Replace(file, ".pdf", ".docx"), FileFormat:=wdFormatXMLDocument _
        , LockComments:=False, Password:="", AddToRecentFiles:=True, _
        WritePassword:="", ReadOnlyRecommended:=False, EmbedTrueTypeFonts:=False, _
         SaveNativePictureFormat:=False, SaveFormsData:=False, SaveAsAOCELetter:= _
        False, CompatibilityMode:=15

    ActiveDocument.Close

     file = Dir

   Loop

End Sub

Now I no longer get an error message in a pop-up box, but no output appears in my working directory. What could be going wrong now?

Aji*_*thy 5

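Any language that can read PDF files and write Word documents (XML) could do this, but the conversion you want (the one Word performs when it opens a PDF) requires using the API of the application itself. VBA is your easy option.

The snippet you posted (and my sample below) uses early binding and enumerated constants, which means we need a reference to the Word object library. That is already set up for any code you write in a Word document, so create a new Word document and add the code to a standard module. (See this Excel tutorial if you need more detail; the steps for our process are the same.)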
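For contrast, here is a minimal sketch of what the late-bound alternative might look like if the code had to run from a host that has no reference to the Word object library (Excel VBA, for example). The procedure name `ConvertOnePdfLateBound` and the `sample.pdf` / `sample.docx` file names are placeholders for illustration, not part of the original code. Without the reference, the enumerated constants are unknown, so their numeric values are declared by hand.

```vba
'Sketch only: late-bound conversion of a single PDF.
'Procedure name and file names are hypothetical placeholders.
Sub ConvertOnePdfLateBound()
    'Numeric values of the Word enumeration constants used below
    Const wdAlertsNone As Long = 0
    Const wdFormatDocumentDefault As Long = 16

    Dim wordApp As Object
    Dim doc As Object

    'Start Word through COM instead of relying on a library reference
    Set wordApp = CreateObject("Word.Application")
    wordApp.Visible = True  'Keep any prompt visible instead of hidden
    wordApp.DisplayAlerts = wdAlertsNone

    'Edit both paths before running
    Set doc = wordApp.Documents.Open( _
        "C:\Users\username\work_dir_example\sample.pdf", False)
    doc.SaveAs2 "C:\Users\username\work_dir_example\sample.docx", _
                wdFormatDocumentDefault
    doc.Close False

    wordApp.Quit
End Sub
```

Inside a Word document none of this is needed, which is why the early-bound version is the simpler choice here.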

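You can run the macro from the VB editor (using the Run button) or from the normal document window (click the Macros button on the View tab in Word 2010-2016). Save the document as a DOCM file if you want to reuse the macro without setting up the code again.

Now for the code!

As stated in the comments, your second snippet works if you just make sure the folder path ends with a backslash "\" character. It is still not great code after that fix, but it will get you up and running.

I'll assume you want to go a step further and have a well-written version that you can reuse or expand on later. For simplicity, we'll use two procedures: the main conversion, and a procedure that suppresses the PDF conversion warning dialog (which is controlled by the registry).

Main procedure: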
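To see why the trailing backslash matters, here is a small sketch; `BuildPdfPattern` and `ShowPdfPattern` are hypothetical names used only for illustration. Without the backslash, the folder name and the wildcard run together, so Dir searches `C:\Users\username` for files whose names start with `work_dir_example`. It finds none, the loop body never executes, and no error is raised.

```vba
'Hypothetical helper: returns a Dir() search pattern for the PDFs in a folder
Function BuildPdfPattern(ByVal folderPath As String) As String
    'Add the trailing backslash if the caller left it out
    If Right$(folderPath, 1) <> "\" Then folderPath = folderPath & "\"
    BuildPdfPattern = folderPath & "*.pdf"
End Function

Sub ShowPdfPattern()
    'Pattern as built in the question: folder name and wildcard are fused
    Debug.Print "C:\Users\username\work_dir_example" & "*.pdf"
    'Pattern from the helper: the wildcard applies inside the folder
    Debug.Print BuildPdfPattern("C:\Users\username\work_dir_example")
End Sub
```

Running `ShowPdfPattern` prints both patterns to the Immediate window (Ctrl+G in the VB editor), which makes the difference easy to check before looping over any files.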

Sub ConvertPDFsToWord2()
    Dim path As String
    'Manually edit path in the next line before running
    path = "C:\users\username\work_dir_example\"

    Dim file As String
    Dim doc As Word.Document
    Dim regValPDF As Integer
    Dim originalAlertLevel As WdAlertLevel

    'Generate string for getting all PDFs with Dir command
    'Check for terminal \
    If Right(path, 1) <> "\" Then path = path & "\"
    'Append file type with wildcard
    file = path & "*.pdf"

    'Get path for first PDF (blank string if no PDFs exist)
    file = Dir(file)

    originalAlertLevel = Application.DisplayAlerts
    Application.DisplayAlerts = wdAlertsNone

    If file <> "" Then regValPDF = TogglePDFWarning(1)

    Do While file <> ""
        'Open method will automatically convert PDF for editing
        Set doc = Documents.Open(path & file, False)

        'Save and close document
        '(text comparison so that an upper-case ".PDF" extension is replaced too)
        doc.SaveAs2 path & Replace(file, ".pdf", ".docx", Compare:=vbTextCompare), _
                    fileformat:=wdFormatDocumentDefault
        doc.Close False

        'Get path for next PDF (blank string if no PDFs remain)
        file = Dir
    Loop

CleanUp:
    On Error Resume Next 'Ignore errors during cleanup
    doc.Close False
    'Restore registry value, if necessary
    If regValPDF <> 1 Then TogglePDFWarning regValPDF
    Application.DisplayAlerts = originalAlertLevel

End Sub

Registry setting function:

Private Function TogglePDFWarning(newVal As Integer) As Integer
'This function reads and writes the registry value that controls
'the dialog displayed when Word opens (and converts) a PDF file
    Dim wShell As Object
    Dim regKey As String
    Dim regVal As Variant

    'setup shell object and string for key
    Set wShell = CreateObject("WScript.Shell")
    regKey = "HKCU\SOFTWARE\Microsoft\Office\" & _
             Application.Version & "\Word\Options\"

    'Get existing registry value, if any
    On Error Resume Next 'Ignore error if reg value does not exist
    regVal = wShell.RegRead(regKey & "DisableConvertPdfWarning")
    On Error GoTo 0      'Break on errors after this point

    wShell.regwrite regKey & "DisableConvertPdfWarning", newVal, "REG_DWORD"

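    'Return original setting / registry value (0 if omitted).
    'Err is reset by the On Error GoTo 0 above, so test regVal instead:
    'it stays Empty when the registry value did not exist.
    If IsEmpty(regVal) Or regVal = 0 Then
        TogglePDFWarning = 0
    Else
        TogglePDFWarning = 1
    End If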

End Function