Jen*_*off 2 delphi unicode delphi-xe2
我们有一个库函数,如下所示:
class function TFileUtils.ReadTextStream(const AStream: TStream): string;
var
StringStream: TStringStream;
begin
StringStream := TStringStream.Create('', TEncoding.Unicode);
try
// This is WRONG since CopyFrom might rewind the stream (see Remys comment)
StringStream.CopyFrom(AStream, AStream.Size - AStream.Position);
Result := StringStream.DataString;
finally
StringStream.Free;
end;
end;
Run Code Online (Sandbox Code Playgroud)
当我检查函数返回的字符串时,第一个Char是(小端)BOM.
为什么TStringStream不会忽略BOM?
有一个更好的方法吗?我不需要向后兼容旧的Delphi版本,XE2的工作解决方案也没问题.
BOM必须来自源TStream,因为TStringStream不写BOM.如果要忽略BOM中的BOM(如果它存在于源中),则必须在复制数据之前手动执行,例如:
class function TFileUtils.ReadTextStream(const AStream: TStream): string;
var
StreamPos, StreamSize: Int64;
Buf: TBytes;
NumBytes: Integer;
Encoding: TEncoding;
begin
Result := '';
StreamPos := AStream.Position;
StreamSize := AStream.Size - StreamPos;
// Anything available to read?
if StreamSize < 1 then Exit;
// Read the first few bytes from the stream...
SetLength(Buf, 4);
NumBytes := AStream.Read(Buf[0], Length(Buf));
if NumBytes < 1 then Exit;
Inc(StreamPos, NumBytes);
Dec(StreamSize, NumBytes);
// Detect the BOM. If you know for a fact what the TStream data is encoded as, you can assign the Encoding variable to the appropriate TEncoding object and GetBufferEncoding() will check for that encoding's BOM only...
SetLength(Buf, NumBytes);
Encoding := nil;
Dec(NumBytes, TEncoding.GetBufferEncoding(Buf, Encoding));
// If any non-BOM bytes were read than rewind the stream back to that position...
if NumBytes > 0 then
begin
AStream.Seek(-NumBytes, soCurrent);
Dec(StreamPos, NumBytes);
Inc(StreamSize, NumBytes);
end else
begin
// Anything left to read after the BOM?
if StreamSize < 1 then Exit;
end;
// Now read and decode whatever is left in the stream...
StringStream := TStringStream.Create('', Encoding);
try
StringStream.CopyFrom(AStream, StreamSize);
Result := StringStream.DataString;
finally
StringStream.Free;
end;
end;
Run Code Online (Sandbox Code Playgroud)