M. *_*ley 225 powershell encoding byte-order-mark utf-8
Out-File seems to force a BOM when using UTF-8:
$MyFile = Get-Content $MyPath
$MyFile | Out-File -Encoding "UTF8" $MyPath
How can I write a file in UTF-8 without a BOM using PowerShell?
M. *_*ley 208
Using .NET's UTF8Encoding class and passing $False to the constructor seems to work:
$MyFile = Get-Content $MyPath
$Utf8NoBomEncoding = New-Object System.Text.UTF8Encoding $False
[System.IO.File]::WriteAllLines($MyPath, $MyFile, $Utf8NoBomEncoding)
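One way to confirm that the result really has no BOM is to look at the first three bytes of the file: a UTF-8 BOM is the byte sequence EF BB BF. The following is only a minimal sketch, not part of the original answer; the helper name Test-Utf8Bom and the temp-file name are made up for illustration.

```powershell
# Hypothetical helper (not from the answer above): returns $true if the file
# starts with the UTF-8 BOM bytes EF BB BF.
function Test-Utf8Bom {
    param([Parameter(Mandatory)] [string] $LiteralPath)
    # ReadAllBytes loads the whole file; fine for a quick check on small files.
    $bytes = [System.IO.File]::ReadAllBytes($LiteralPath)
    return ($bytes.Length -ge 3 -and
            $bytes[0] -eq 0xEF -and $bytes[1] -eq 0xBB -and $bytes[2] -eq 0xBF)
}

# Demo with a throwaway file; a full path avoids any dependence on the
# current directory as seen by .NET.
$demoPath = Join-Path ([System.IO.Path]::GetTempPath()) 'utf8-nobom-demo.txt'
$sample = [string[]] @(('caf' + [char]0xE9), 'plain ascii')   # first line contains a non-ASCII character

$Utf8NoBomEncoding = New-Object System.Text.UTF8Encoding $False
[System.IO.File]::WriteAllLines($demoPath, $sample, $Utf8NoBomEncoding)

$hasBom = Test-Utf8Bom -LiteralPath $demoPath
```

If the encoding object was created with $False as above, $hasBom should come back as $false; creating it with $True instead should flip the result.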
For*_*VeR 71
The proper way as of now is to use the solution recommended by @Roman Kuzmin in the comments on @M. Dudley's answer:
[IO.File]::WriteAllLines($filename, $content)
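(I've also shortened it a bit by dropping the unnecessary System namespace qualification; PowerShell substitutes it automatically by default.)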
Len*_*nny 44
I figured this wouldn't be UTF, but I found a pretty simple solution that seems to work...
Get-Content path/to/file.ext | out-file -encoding ASCII targetFile.ext
For me this results in a UTF-8 file without a BOM, regardless of the source format.
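A caveat worth checking before relying on this: ASCII can only represent 7-bit characters, so the output is BOM-free but any non-ASCII character should end up replaced by "?". The sketch below is not from the answer; it uses .NET's ASCII encoder directly, on the assumption that Out-File -Encoding ASCII behaves the same way, and the sample string is illustrative.

```powershell
# Illustrative sample: "cafe" with an accented e, built from its code point so
# that this script itself stays pure ASCII.
$text = 'caf' + [char]0xE9

$asciiBytes = [System.Text.Encoding]::ASCII.GetBytes($text)
$utf8Bytes  = (New-Object System.Text.UTF8Encoding $False).GetBytes($text)

# ASCII cannot encode the accented character, so the encoder substitutes '?' (0x3F).
$roundTrip = [System.Text.Encoding]::ASCII.GetString($asciiBytes)   # 'caf?'

# UTF-8 encodes the same character as two bytes (C3 A9) and loses nothing.
$utf8Hex = ($utf8Bytes | ForEach-Object { $_.ToString('X2') }) -join ' '   # '63 61 66 C3 A9'
```

So the ASCII route is only equivalent to BOM-less UTF-8 when the content is known to be pure ASCII.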
mkl*_*nt0 28
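Note: This answer applies to Windows PowerShell; by contrast, in the cross-platform PowerShell Core edition, UTF-8 without a BOM is the default encoding.
To complement M. Dudley's own simple and pragmatic answer (and ForNeVeR's more concise reformulation):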
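For convenience, here is the advanced function Out-FileUtf8NoBom, a pipeline-based alternative that mimics Out-File, which means:
- you can use it just like Out-File in a pipeline.
- input objects that aren't strings are formatted via Out-String first, just like with Out-File.
Example: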
(Get-Content $MyPath) | Out-FileUtf8NoBom $MyPath
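Note how (Get-Content $MyPath) is enclosed in (...), which ensures that the entire file is opened, read in full, and closed before the result is sent through the pipeline. This is necessary in order to be able to write back to the same file (update it in place).
Generally, though, this technique is not advisable, for two reasons: (a) the whole file must fit into memory, and (b) if the command is interrupted, data will be lost.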
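The second risk can be reduced by writing to a temporary file first and replacing the original only after the write has succeeded. This is a minimal sketch under stated assumptions, not the answer's own method: it still reads the whole file into memory, it calls [System.IO.File]::WriteAllLines directly so that it is self-contained, and the file names are placeholders.

```powershell
# Placeholder target file for the demo; substitute your own full path.
$MyPath = Join-Path ([System.IO.Path]::GetTempPath()) 'inplace-demo.txt'
# Create sample input. In Windows PowerShell, -Encoding UTF8 writes a BOM here.
Set-Content -LiteralPath $MyPath -Value 'line 1', 'line 2' -Encoding UTF8

# Write the BOM-less copy next to the original first.
$tmpPath = "$MyPath.tmp"
$lines = [string[]] (Get-Content -LiteralPath $MyPath)
$utf8NoBom = New-Object System.Text.UTF8Encoding $False
[System.IO.File]::WriteAllLines($tmpPath, $lines, $utf8NoBom)

# The original is only replaced once the new file has been written completely.
Move-Item -LiteralPath $tmpPath -Destination $MyPath -Force
```

If the script is interrupted before the Move-Item call, the original file is still intact and only the .tmp file needs cleaning up.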
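A note on memory use: as the caveat in the function's help below states, all pipeline input is buffered before writing starts.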
Source code of Out-FileUtf8NoBom (also available as an MIT-licensed Gist):
<#
.SYNOPSIS
Outputs to a UTF-8-encoded file *without a BOM* (byte-order mark).
.DESCRIPTION
Mimics the most important aspects of Out-File:
* Input objects are sent to Out-String first.
* -Append allows you to append to an existing file, -NoClobber prevents
overwriting of an existing file.
* -Width allows you to specify the line width for the text representations
of input objects that aren't strings.
However, it is not a complete implementation of all Out-File parameters:
* Only a literal output path is supported, and only as a parameter.
* -Force is not supported.
Caveat: *All* pipeline input is buffered before writing output starts,
but the string representations are generated and written to the target
file one by one.
.NOTES
The raison d'être for this advanced function is that, as of PowerShell v5,
Out-File still lacks the ability to write UTF-8 files without a BOM:
using -Encoding UTF8 invariably prepends a BOM.
#>
function Out-FileUtf8NoBom {

    [CmdletBinding()]
    param(
        [Parameter(Mandatory, Position=0)] [string] $LiteralPath,
        [switch] $Append,
        [switch] $NoClobber,
        [AllowNull()] [int] $Width,
        [Parameter(ValueFromPipeline)] $InputObject
    )

    #requires -version 3

    # Make sure that the .NET framework sees the same working dir. as PS
    # and resolve the input path to a full path.
    # $PWD.ProviderPath is the native filesystem path, which also works when
    # the current location is on a custom PS drive.
    [System.IO.Directory]::SetCurrentDirectory($PWD.ProviderPath) # Caveat: .NET Core doesn't support [Environment]::CurrentDirectory
    $LiteralPath = [IO.Path]::GetFullPath($LiteralPath)

    # If -NoClobber was specified, throw an exception if the target file already
    # exists.
    if ($NoClobber -and (Test-Path $LiteralPath)) {
        Throw [IO.IOException] "The file '$LiteralPath' already exists."
    }

    # Create a StreamWriter object.
    # Note that we take advantage of the fact that the StreamWriter class by default:
    # - uses UTF-8 encoding
    # - without a BOM.
    $sw = New-Object IO.StreamWriter $LiteralPath, $Append

    $htOutStringArgs = @{}
    if ($Width) {
        $htOutStringArgs += @{ Width = $Width }
    }

    # Note: By not using begin / process / end blocks, we're effectively running
    #       in the end block, which means that all pipeline input has already
    #       been collected in automatic variable $Input.
    #       We must use this approach, because using | Out-String individually
    #       in each iteration of a process block would format each input object
    #       with an individual header.
    try {
        $Input | Out-String -Stream @htOutStringArgs | % { $sw.WriteLine($_) }
    } finally {
        $sw.Dispose()
    }

}
sc9*_*911 13
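Starting with version 6, PowerShell supports the UTF8NoBOM encoding for both Set-Content and Out-File, and even uses it as the default encoding.
So in the above example it should simply be like this: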
$MyFile | Out-File -Encoding UTF8NoBOM $MyPath
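When using Set-Content instead of Out-File, you can specify the encoding Byte, which can be used to write a byte array to a file. Combined with a custom UTF8 encoding that does not emit the BOM, this gives the desired result. (This applies to Windows PowerShell; PowerShell 6 and later replaced -Encoding Byte with the -AsByteStream switch.)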
# This variable can be reused
$utf8 = New-Object System.Text.UTF8Encoding $false
$MyFile = Get-Content $MyPath -Raw
Set-Content -Value $utf8.GetBytes($MyFile) -Encoding Byte -Path $MyPath
The difference from using [IO.File]::WriteAllLines() or similar is that it should work with any type of item and path, not only actual file paths.
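Important!: This only works if an extra space or newline at the start of the file is no problem for your use case
(e.g. if it is an SQL file, a Java file, or a human-readable text file)
One can combine creating a nearly empty file (non-UTF8, or ASCII, which is UTF8-compatible) with appending to it (replace $str with gc $src if the source is a file):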
" " | out-file -encoding ASCII -noNewline $dest
$str | out-file -encoding UTF8 -append $dest
As a one-liner; replace $dest and $str according to your use case:
$_ofdst = $dest ; " " | out-file -encoding ASCII -noNewline $_ofdst ; $str | out-file -encoding UTF8 -append $_ofdst
function Out-File-UTF8-noBOM { param( $str, $dest )
    " " | out-file -encoding ASCII -noNewline $dest
    $str | out-file -encoding UTF8 -append $dest
}
Using it with a source file:
Out-File-UTF8-noBOM (gc $src) $dest
Using it with a string:
Out-File-UTF8-noBOM $str $dest
Optionally, continue appending with Out-File:
"more foo bar" | Out-File -encoding UTF8 -append $dest
This script converts all .txt files in DIRECTORY1 to UTF-8 without a BOM and writes them to DIRECTORY2:
foreach ($i in ls -name DIRECTORY1\*.txt)
{
    $file_content = Get-Content "DIRECTORY1\$i";
    [System.IO.File]::WriteAllLines("DIRECTORY2\$i", $file_content);
}
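Old question, new answer:
While the "old" PowerShell writes a BOM, the new platform-agnostic variant behaves differently: the default is "no BOM", and this can be configured via a parameter:
-Encoding
Specifies the type of encoding for the target file. The default value is utf8NoBOM.
The acceptable values for this parameter are as follows:
- ascii: Uses the encoding for the ASCII (7-bit) character set.
- bigendianunicode: Encodes in UTF-16 format using the big-endian byte order.
- oem: Uses the default encoding for MS-DOS and console programs.
- unicode: Encodes in UTF-16 format using the little-endian byte order.
- utf7: Encodes in UTF-7 format.
- utf8: Encodes in UTF-8 format.
- utf8BOM: Encodes in UTF-8 format with Byte Order Mark (BOM)
- utf8NoBOM: Encodes in UTF-8 format without Byte Order Mark (BOM)
- utf32: Encodes in UTF-32 format.
Source: https://learn.microsoft.com/de-de/powershell/module/Microsoft.PowerShell.Utility/Out-File?view=powershell-7 (emphasis mine)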