Searching for a matching string in Word documents with PowerShell

Yog*_*esh 5 regex powershell search ms-word powershell-2.0

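I have a simple requirement. I need to search for a string in Word documents, and as a result I need to get the matching line, or a few of the surrounding words, from the document.

So far I can successfully search for the string across a folder of Word documents, but the search only returns True / False depending on whether the string was found.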

#ERROR REPORTING ALL
Set-StrictMode -Version latest
$path     = "c:\MORLAB"
$files    = Get-Childitem $path -Include *.docx,*.doc -Recurse | Where-Object { !($_.psiscontainer) }
$output   = "c:\wordfiletry.txt"
$application = New-Object -comobject word.application
$application.visible = $False
$findtext = "CRHPCD01"

Function getStringMatch
{
  # Loop through all *.doc and *.docx files under the $path directory
  Foreach ($file In $files)
  {
   $document = $application.documents.open($file.FullName,$false,$true)
   $range = $document.content
   $wordFound = $range.find.execute($findText)

   if($wordFound) 
    { 
     "$($file.FullName) has $wordfound" | Out-File $output -Append
    }

   # Close each document before opening the next one
   $document.close()
  }
$application.quit()
}

getStringMatch

Mat*_*att 5

#ERROR REPORTING ALL
Set-StrictMode -Version latest
$path     = "c:\Temp"
$files    = Get-Childitem $path -Include *.docx,*.doc -Recurse | Where-Object { !($_.psiscontainer) }
$output   = "c:\temp\wordfiletry.csv"
$application = New-Object -comobject word.application
$application.visible = $False
$findtext = "First"
$charactersAround = 30
# Array that collects one result object per matching document
$results = @()

Function getStringMatch
{
    # Loop through all *.doc and *.docx files under the $path directory
    Foreach ($file In $files)
    {
        $document = $application.documents.open($file.FullName,$false,$true)
        $range = $document.content

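        # {0,n} allows a match near the start or end of the text; Escape() treats $findtext literally
        If($range.Text -match ".{0,$($charactersAround)}$([regex]::Escape($findtext)).{0,$($charactersAround)}"){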
             $properties = @{
                File = $file.FullName
                Match = $findtext
                TextAround = $Matches[0] 
             }
             $results += New-Object -TypeName PsCustomObject -Property $properties
        }
        # Close each document before opening the next one
        $document.close()
    }

    If($results){
        $results | Export-Csv $output -NoTypeInformation
    }

    $application.quit()
}

getStringMatch

import-csv $output

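There are a couple of ways to get what you want. Since you already have the text of the document, a simple approach is to run a regex match against it and return the matched text. This addresses the part of your question about getting some of the words around the match in the document.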

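The variable $charactersAround sets how many characters of context to capture on each side of $findtext. I also figured the output was a better fit for a CSV file, so I use $results to collect the properties of each match (file, search text, and surrounding text), which are written to a CSV file at the end. Note that -match only reports the first occurrence in each document.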
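To try the context match without opening Word, the same idea can be sketched on a plain string. Get-TextAround is a hypothetical helper name, not part of the script above, and the sample text is taken from the sample output of this answer. The sketch assumes you want up to $CharactersAround characters on each side, so it still matches when the search text sits near the start or end of the text.

```powershell
# Hypothetical helper: returns the first occurrence of $FindText in $Text
# together with up to $CharactersAround characters on each side,
# or $null when there is no match.
function Get-TextAround {
    param(
        [string]$Text,
        [string]$FindText,
        [int]$CharactersAround = 30
    )
    # Escape the search text so characters such as '.' or '(' are matched literally
    $escaped = [regex]::Escape($FindText)
    # {0,n} rather than {n}: the context may be shorter than n characters
    $pattern = ".{0,$CharactersAround}" + $escaped + ".{0,$CharactersAround}"
    # -match is case-insensitive and fills $Matches on success
    if ($Text -match $pattern) {
        return $Matches[0]
    }
    return $null
}

$sample = "dley Air Services Limited dba First Air meets or exceeds all term"
Get-TextAround -Text $sample -FindText "First" -CharactersAround 10
```

In the full script, $range.Text would take the place of $sample.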

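Be sure to change the variables for your own testing. Now that a regex is used to locate the match, you can extend the pattern in many ways, for example to capture whole words or to tolerate variations in the search text.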
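One possible extension, shown here only as a sketch, is to report every occurrence in a text rather than just the first. Get-AllTextAround is a hypothetical helper name; it relies on the .NET [regex]::Matches method and assumes case-insensitive matching is wanted, as with -match.

```powershell
# Hypothetical helper: returns every occurrence of $FindText in $Text,
# each with up to $CharactersAround characters of context on each side.
function Get-AllTextAround {
    param(
        [string]$Text,
        [string]$FindText,
        [int]$CharactersAround = 30
    )
    $pattern = ".{0,$CharactersAround}" + [regex]::Escape($FindText) + ".{0,$CharactersAround}"
    $options = [System.Text.RegularExpressions.RegexOptions]::IgnoreCase
    # Matches() returns all non-overlapping matches, scanning left to right
    foreach ($m in [regex]::Matches($Text, $pattern, $options)) {
        $m.Value
    }
}

Get-AllTextAround -Text "one First two. three first four" -FindText "First" -CharactersAround 4
```

Because the matches cannot overlap, a second occurrence that falls inside the trailing context of the previous match is reported as part of that match, not as a separate result.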

Sample output

Match TextAround                                                        File                          
----- ----------                                                        ----                          
First dley Air Services Limited dba First Air meets or exceeds all term C:\Temp\20120315132117214.docx