Extracting text from <1></1> (HTML/XML-like, but with numeric tags)

Dis*_*ogo 4 c#

So I have a long string containing angle brackets, and I want to extract the text parts from it.

string exampleString = "<1>text1</1><27>text27</27><3>text3</3>";

I would like to be able to get this:

1 = "text1"
27 = "text27"
3 = "text3"

How can I get this easily? I can't come up with a non-hacky way to do it.

Thanks.

Ian*_*Ian 6

Using a basic XmlReader, plus a few tricks to wrap the input so that it becomes XML-like data, I would do something like this:

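using System.Collections.Generic; // List, KeyValuePair
using System.IO;                  // StringReader
using System.Xml;                 // XmlReader, XmlNodeType

string xmlString = "<1>text1</1><27>text27</27><3>text3</3>";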
xmlString = "<Root>" + xmlString.Replace("<", "<o").Replace("<o/", "</o") + "</Root>";
string key = "";
List<KeyValuePair<string,string>> kvpList = new List<KeyValuePair<string,string>>(); //assuming the result is in the KVP format
using (XmlReader xmlReader = XmlReader.Create(new StringReader(xmlString))){
    bool firstElement = true;
    while (xmlReader.Read()) {
        if (firstElement) { //throwing away root
            firstElement = false;
            continue;
        }
        if (xmlReader.NodeType == XmlNodeType.Element) {
            key = xmlReader.Name.Substring(1); // cut off the leading "o"
        } else if (xmlReader.NodeType == XmlNodeType.Text) {
            kvpList.Add(new KeyValuePair<string,string>(key, xmlReader.Value));
        }
    }
}

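Edit:

The main trick is this line, which turns the input into well-formed XML (an XML element name cannot start with a digit, and a document needs a single root):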

xmlString = "<Root>" + xmlString.Replace("<", "<o").Replace("<o/", "</o") + "</Root>"; // wrap in a single root; the "o" prefix forces every tag name to start with a known letter (comment edit suggested by Mr. chwarr)

First, every opening pointy bracket is replaced with itself + a char, i.e.

<1>text1</1> -> <o1>text1<o/1> //first replacement, fix the number issue 

Then every sequence of opening pointy bracket + char + forward slash is reversed into opening pointy bracket + forward slash + char:

<o1>text1<o/1> -> <o1>text1</o1> //second replacement, fix the ending tag issue
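To see both replacements on a longer input, here is a minimal console sketch. The class name `TagRewriteDemo` and the helper `ToValidXml` are hypothetical names introduced only for this illustration; the replacement logic itself is the one from the line above.

```csharp
using System;

public static class TagRewriteDemo
{
    // Hypothetical helper: applies the two replacements, then wraps in a root.
    public static string ToValidXml(string raw)
    {
        // Step 1: prefix every tag name with "o" (this also hits closing tags: "<o/1>").
        string prefixed = raw.Replace("<", "<o");
        // Step 2: repair the closing tags by swapping "o/" back to "/o".
        string repaired = prefixed.Replace("<o/", "</o");
        // A well-formed XML document needs exactly one root element.
        return "<Root>" + repaired + "</Root>";
    }

    public static void Main()
    {
        string raw = "<1>text1</1><27>text27</27>";
        Console.WriteLine(raw.Replace("<", "<o")); // <o1>text1<o/1><o27>text27<o/27>
        Console.WriteLine(ToValidXml(raw));        // <Root><o1>text1</o1><o27>text27</o27></Root>
    }
}
```

Note that this only works as long as the text between the tags contains no `<` of its own; such input would be rewritten too and would most likely no longer parse.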

Using a simple WinForms app with a RichTextBox to print out the result:

for (int i = 0; i < kvpList.Count; ++i) {
    richTextBox1.AppendText(kvpList[i].Key + " = " + kvpList[i].Value + "\n");
}
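If no WinForms project is at hand, the same approach can be tried as a console program. This is only a sketch of the approach under a couple of assumptions: `NumericTagParser` and `Parse` are hypothetical names, and the root element is skipped by checking `XmlReader.Depth` instead of using a `firstElement` flag.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Xml;

public static class NumericTagParser
{
    // Hypothetical helper: returns (tag number, text) pairs in document order.
    public static List<KeyValuePair<string, string>> Parse(string raw)
    {
        // Same rewriting trick: "o" prefix on tag names, single root wrapper.
        string xml = "<Root>" + raw.Replace("<", "<o").Replace("<o/", "</o") + "</Root>";
        var result = new List<KeyValuePair<string, string>>();
        string key = "";

        using (XmlReader reader = XmlReader.Create(new StringReader(xml)))
        {
            while (reader.Read())
            {
                // The root sits at depth 0; the rewritten tags sit at depth 1.
                if (reader.NodeType == XmlNodeType.Element && reader.Depth == 1)
                {
                    key = reader.Name.Substring(1); // drop the "o" prefix
                }
                else if (reader.NodeType == XmlNodeType.Text)
                {
                    result.Add(new KeyValuePair<string, string>(key, reader.Value));
                }
            }
        }
        return result;
    }

    public static void Main()
    {
        foreach (var kvp in Parse("<1>text1</1><27>text27</27><3>text3</3>"))
        {
            Console.WriteLine(kvp.Key + " = " + kvp.Value);
        }
    }
}
```

For the example string from the question this prints `1 = text1`, `27 = text27` and `3 = text3`, one per line.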

This is the result I get:

[Image: screenshot of the RichTextBox output]