Rob*_*bie 13 c# xml-serialization illegal-characters xmlserializer
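In C# (.NET 4.0 and 4.5 / VS2010 and VS12), when I use XmlSerializer to serialize an object containing a string with an illegal character, no error is thrown. However, when I deserialize that result, an "invalid character" error is thrown.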
// add to XML
Items items = new Items();
items.Item = "\v hello world"; // contains "illegal" character \v
// variables
System.Xml.Serialization.XmlSerializer serializer = new System.Xml.Serialization.XmlSerializer(typeof(Items));
string tmpFile = Path.GetTempFileName();
// serialize
using (FileStream tmpFileStream = new FileStream(tmpFile, FileMode.Open, FileAccess.ReadWrite))
{
    serializer.Serialize(tmpFileStream, items);
}
Console.WriteLine("Success! XML serialized in file " + tmpFile);
// deserialize
Items result = null;
using (FileStream plainTextFile = new FileStream(tmpFile, FileMode.Open, FileAccess.Read))
{
    result = (Items)serializer.Deserialize(plainTextFile); // FAILS here
}
Console.WriteLine(result.Item);
"Items" is just a small class auto-generated by xsd /c Items.xsd. Items.xsd is nothing more than a root element (Items) containing one child element (Item):
<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema" elementFormDefault="qualified" attributeFormDefault="unqualified">
  <xs:element name="Items">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="Item" type="xs:string" />
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>
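The error thrown during deserialization is:
Unhandled Exception: System.InvalidOperationException: There is an error in XML document (3, 12). ---> System.Xml.XmlException: '♂', hexadecimal value 0x0B, is an invalid character. Line 3, position 12.
Line 3 of the serialized XML file contains: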
<Item>&#xB; hello world</Item>
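I know that \v -> &#xB; is an illegal character, but why does XmlSerializer allow it to be serialized (without an error)? I find it inconsistent of .NET to let me serialize something without a problem, only to discover that I cannot deserialize it.
Is there a solution so that XmlSerializer automatically removes illegal characters before serializing, or can I instruct deserialization to ignore illegal characters?
For now I work around it by reading the file contents as a string, replacing the illegal characters "manually", and then deserializing the result... but I find that an ugly hack/workaround.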
L.B*_*L.B 25
To avoid writing illegal characters, you can set the CheckCharacters property of XmlWriterSettings. (The Serialize method will then throw an exception.)
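using (FileStream tmpFileStream = new FileStream(tmpFile, FileMode.OpenOrCreate, FileAccess.ReadWrite))
// Dispose the writer as well, so buffered output is flushed to the stream
using (XmlWriter writer = XmlWriter.Create(tmpFileStream, new XmlWriterSettings { CheckCharacters = true }))
{
    serializer.Serialize(writer, items); // throws if items contains an illegal character
}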
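To see the fail-fast behaviour in isolation, here is a minimal in-memory sketch. The `Items` class is a hand-written stand-in for the xsd-generated one, and `CheckedSerialization.TrySerialize` is a hypothetical helper name, not part of any library. The catch clauses are deliberately broad, on the assumption that XmlSerializer wraps the writer's exception in an InvalidOperationException.

```csharp
using System;
using System.IO;
using System.Xml;
using System.Xml.Serialization;

// Stand-in for the class generated by xsd /c Items.xsd (assumption).
public class Items
{
    public string Item { get; set; }
}

public static class CheckedSerialization
{
    // Hypothetical helper: returns true and the XML text on success,
    // false if the character-checking writer rejected the content.
    public static bool TrySerialize(Items items, out string xml)
    {
        var serializer = new XmlSerializer(typeof(Items));
        var settings = new XmlWriterSettings { CheckCharacters = true };
        var buffer = new StringWriter();
        try
        {
            using (XmlWriter writer = XmlWriter.Create(buffer, settings))
            {
                serializer.Serialize(writer, items);
            }
            xml = buffer.ToString();
            return true;
        }
        catch (InvalidOperationException)
        {
            // Serialize is assumed to wrap the writer's error in this type.
            xml = null;
            return false;
        }
        catch (ArgumentException)
        {
            // Thrown directly by the writer for an illegal character.
            xml = null;
            return false;
        }
    }
}
```

Calling `TrySerialize` with `Item = "\v hello world"` should report failure, while a plain `"hello world"` should succeed, so the bad value is caught at write time instead of surfacing later during deserialization.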
You can create your own XmlTextWriter to filter out unwanted characters while serializing.
using (FileStream tmpFileStream = new FileStream(tmpFile, FileMode.OpenOrCreate, FileAccess.ReadWrite))
{
    var writer = new MyXmlWriter(tmpFileStream);
    serializer.Serialize(writer, items);
}

// Requires: using System.Linq;
public class MyXmlWriter : XmlTextWriter
{
    public MyXmlWriter(Stream s) : base(s, Encoding.UTF8)
    {
    }

    public override void WriteString(string text)
    {
        // Drop every control character before writing.
        // Note: char.IsControl also matches tab, CR and LF, which are legal in XML.
        string newText = String.Join("", text.Where(c => !char.IsControl(c)));
        base.WriteString(newText);
    }
}
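The filter above relies on char.IsControl, which also removes tab, carriage return and line feed even though XML 1.0 allows them. If those should survive, one possible alternative is to test each character with XmlConvert.IsXmlChar (available from .NET 4.0). The sketch below is only an illustration; `XmlSanitizer` and `StripInvalidXmlChars` are hypothetical names.

```csharp
using System;
using System.Text;
using System.Xml;

public static class XmlSanitizer
{
    // Hypothetical helper: keeps only characters that XML 1.0 permits.
    public static string StripInvalidXmlChars(string text)
    {
        if (string.IsNullOrEmpty(text))
        {
            return text;
        }

        var sb = new StringBuilder(text.Length);
        for (int i = 0; i < text.Length; i++)
        {
            char c = text[i];
            if (XmlConvert.IsXmlChar(c))
            {
                // Legal single UTF-16 unit (includes tab, LF and CR).
                sb.Append(c);
            }
            else if (char.IsHighSurrogate(c)
                     && i + 1 < text.Length
                     && char.IsLowSurrogate(text[i + 1]))
            {
                // Keep well-formed surrogate pairs together.
                sb.Append(c).Append(text[i + 1]);
                i++;
            }
            // Anything else (for example \v, 0x0B) is dropped.
        }
        return sb.ToString();
    }
}
```

Inside a WriteString override this could replace the LINQ expression, for example `base.WriteString(XmlSanitizer.StripInvalidXmlChars(text));`. With the input `"\v hello\tworld"` the helper keeps the tab and removes only the vertical tab.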
By creating your own XmlTextReader, you can filter out unwanted characters while deserializing.
using (FileStream plainTextFile = new FileStream(tmpFile, FileMode.Open, FileAccess.Read))
{
    var reader = new MyXmlReader(plainTextFile);
    result = (Items)serializer.Deserialize(reader);
}

// Requires: using System.Linq;
public class MyXmlReader : XmlTextReader
{
    public MyXmlReader(Stream s) : base(s)
    {
    }

    public override string ReadString()
    {
        // Drop every control character from the text that was read.
        string text = base.ReadString();
        string newText = String.Join("", text.Where(c => !char.IsControl(c)));
        return newText;
    }
}
You can set the CheckCharacters property of XmlReaderSettings to false. Deserialization will then work without problems. (You will get the \v back.)
using (FileStream plainTextFile = new FileStream(tmpFile, FileMode.Open, FileAccess.Read))
using (XmlReader reader = XmlReader.Create(plainTextFile, new XmlReaderSettings { CheckCharacters = false }))
{
    result = (Items)serializer.Deserialize(reader);
}
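As a quick end-to-end check of the lenient reader, the following sketch does the round trip in memory instead of through a temp file. It assumes a hand-written `Items` class in place of the xsd-generated one; `LenientRoundTrip` is a hypothetical name. Serialization goes through the plain `Serialize(TextWriter, object)` overload, as in the question, and only the reader is relaxed.

```csharp
using System;
using System.IO;
using System.Xml;
using System.Xml.Serialization;

// Stand-in for the class generated by xsd /c Items.xsd (assumption).
public class Items
{
    public string Item { get; set; }
}

public static class LenientRoundTrip
{
    // Hypothetical helper: serialize to a string, then read it back
    // with character checking switched off.
    public static Items RoundTrip(Items items)
    {
        var serializer = new XmlSerializer(typeof(Items));

        // Serialize without any character checking, as in the question.
        var xml = new StringWriter();
        serializer.Serialize(xml, items);

        // Deserialize with a reader that tolerates illegal character entities.
        var settings = new XmlReaderSettings { CheckCharacters = false };
        using (var text = new StringReader(xml.ToString()))
        using (XmlReader reader = XmlReader.Create(text, settings))
        {
            return (Items)serializer.Deserialize(reader);
        }
    }
}
```

The value that comes back should still contain the \v. Keep in mind that the intermediate document is not well-formed XML 1.0, so other parsers may reject it; this option fits best when the same application both writes and reads the file.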