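I have the following PowerShell script, which will parse some very large files for ETL purposes. For starters, my test file is about 30 MB. Larger files of around 200 MB are expected. So I have a few questions.
The script below works, but it takes a very long time to process even a 30 MB file.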
$path = "E:\Documents\Projects\ESPS\Dev\DataFiles\DimProductionOrderOperation"
$infile = "14SEP11_ProdOrderOperations.txt"
$outfile = "PROCESSED_14SEP11_ProdOrderOperations.txt"
$array = @()
$content = gc $path\$infile |
select -skip 4 |
where {$_ -match "[|].*[|].*"} |
foreach {$_ -replace "^[|]","" -replace "[|]$",""}
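$header = $content[0]
# Start the output with the header line, kept as an array so += appends elements
$array = @($header)
for ($i = 1; $i -lt $content.length; $i+=1) {
# Skip any later line that repeats the header
if ($content[$i] -ne $header) {$array += $content[$i]}
}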
$array | out-file $path\$outfile -encoding ASCII
---------------------------
|Data statistics|Number of|
|-------------------------|
|Records passed | 93,118|
---------------------------
…
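
One likely contributor to the slowness is the `$array +=` inside the loop: PowerShell arrays are fixed-size, so each `+=` allocates a new array and copies every existing element. Loading the whole file through `gc` and the pipeline before filtering adds to the cost. Below is a minimal sketch of a streaming alternative that reads and writes one line at a time with .NET `StreamReader` and `StreamWriter`. The function name `Convert-ReportFile` and its parameters are hypothetical, introduced only for illustration, and the sketch assumes the intent is to keep the first header line and drop later lines identical to it. I have not measured it against the files described above.

```powershell
# Hypothetical helper: name and parameters are illustrative, not part of the original script.
# Pass full paths, because .NET resolves relative paths against the process
# working directory, which can differ from the current PowerShell location.
function Convert-ReportFile {
    param(
        [string]$InPath,
        [string]$OutPath,
        [int]$SkipLines = 4   # same as "select -skip 4" in the original
    )

    $reader = New-Object System.IO.StreamReader -ArgumentList $InPath
    $writer = $null
    try {
        $writer = New-Object System.IO.StreamWriter -ArgumentList $OutPath, $false, ([System.Text.Encoding]::ASCII)
        $lineNo = 0
        $header = $null

        while ($null -ne ($line = $reader.ReadLine())) {
            $lineNo++
            # Skip the leading lines of the report
            if ($lineNo -le $SkipLines) { continue }
            # Keep only lines containing at least two pipe characters
            if ($line -notmatch '[|].*[|]') { continue }
            # Strip the leading and trailing pipe
            $line = $line -replace '^[|]', '' -replace '[|]$', ''

            if ($null -eq $header) {
                # The first surviving line is treated as the header and written once
                $header = $line
                $writer.WriteLine($line)
            }
            elseif ($line -ne $header) {
                # Write data lines; repeats of the header are dropped
                $writer.WriteLine($line)
            }
        }
    }
    finally {
        if ($writer) { $writer.Dispose() }
        $reader.Dispose()
    }
}
```

With the variables from the script above, a call might look like `Convert-ReportFile -InPath (Join-Path $path $infile) -OutPath (Join-Path $path $outfile)`. Because nothing is accumulated in memory, the work per line stays constant regardless of file size, which is the main design reason for streaming here.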