Nat*_*han 5 c# unicode utf-8 xml-serialization
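OK, I'm trying to work with UTF-8 text files. I keep fighting the BOM characters that the writer inserts for UTF-8, which break pretty much everything I use to read the file, including serializers and other text readers.
These are the first six bytes of the data I get: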
0xEF
0xBB
0xBF
0xEF
0xBB
0xBF
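(Now that I look at it, I realize there are two marks there. Is that the UTF-8 BOM? Am I double-encoding it?)
Note that the serializer encodes as UTF-8, then I get a string from the memory stream as UTF-8, and then I write that string to a file as UTF-8... that seems like a lot of redundancy. Thoughts?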
//I'm storing this xml result to a database field. (this one includes the BOM chars)
using (MemoryStream ms = new MemoryStream())
{
    Utility.SerializeXml(ms, root);
    xml = Encoding.UTF8.GetString(ms.ToArray());
}
//later on, I would take that xml and then write it out to a file like this:
File.WriteAllText(path, xml, Encoding.UTF8);

public static void SerializeXml(Stream output, object data)
{
    XmlSerializer xs = new XmlSerializer(data.GetType());
    XmlWriterSettings settings = new XmlWriterSettings();
    settings.Indent = true;
    settings.IndentChars = "\t";
    settings.Encoding = Encoding.UTF8;
    XmlWriter writer = XmlTextWriter.Create(output, settings);
    xs.Serialize(writer, data);
    writer.Flush();
    writer.Close();
}
bob*_*nce 10
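Yes, that's two BOMs. You're encoding as UTF-8 twice (once when the XmlWriter writes to the MemoryStream, and again in File.WriteAllText), and a faux-BOM is added each time, owing to the really unfortunate fact that: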
Encoding.UTF8
means "UTF-8 with a pointless, meaningless U+FEFF stuck on the front to mess up your applications". Try using this instead:
new UTF8Encoding(false)
That should give you a less awful version, one that writes no BOM.
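As a rough sketch of how that suggestion might be applied to the code in the question: the version below passes a single BOM-less encoding to all three steps (the XmlWriter, the decode from the MemoryStream, and the file write). The names Root, BomDemo, Utf8NoBom, RoundTrip and StartsWithBom are hypothetical, introduced only for illustration; Root stands in for whatever type the question's `root` object actually has.

```csharp
using System;
using System.IO;
using System.Text;
using System.Xml;
using System.Xml.Serialization;

// Hypothetical stand-in for the question's `root` object.
public class Root
{
    public string Name { get; set; } = "example";
}

public static class BomDemo
{
    // UTF-8 encoding whose preamble is empty, so writers emit no BOM.
    public static readonly Encoding Utf8NoBom = new UTF8Encoding(false);

    // Same shape as the question's SerializeXml, with the BOM-less encoding swapped in.
    public static void SerializeXml(Stream output, object data)
    {
        var xs = new XmlSerializer(data.GetType());
        var settings = new XmlWriterSettings
        {
            Indent = true,
            IndentChars = "\t",
            Encoding = Utf8NoBom
        };
        // Disposing the writer flushes it; the underlying stream stays open
        // because XmlWriterSettings.CloseOutput defaults to false.
        using (XmlWriter writer = XmlWriter.Create(output, settings))
        {
            xs.Serialize(writer, data);
        }
    }

    // Serializes to a string, writes the string to `path`, and returns the file's bytes.
    public static byte[] RoundTrip(string path, object data)
    {
        string xml;
        using (var ms = new MemoryStream())
        {
            SerializeXml(ms, data);
            xml = Utf8NoBom.GetString(ms.ToArray());
        }
        File.WriteAllText(path, xml, Utf8NoBom);
        return File.ReadAllBytes(path);
    }

    // True if the buffer begins with the UTF-8 BOM bytes EF BB BF.
    public static bool StartsWithBom(byte[] bytes)
    {
        return bytes.Length >= 3
            && bytes[0] == 0xEF && bytes[1] == 0xBB && bytes[2] == 0xBF;
    }
}
```

Calling BomDemo.RoundTrip(path, new Root()) and passing the result to StartsWithBom should report no BOM. For comparison, Encoding.UTF8.GetPreamble() returns the three bytes EF BB BF, while Utf8NoBom.GetPreamble() returns an empty array.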