How to get the XPath from an XmlNode instance

Question

joe*_*joe 49 .net c# xml .net-2.0

Could someone supply some code that would get the XPath of a System.Xml.XmlNode instance?

Thanks!

Answer 1

Jon*_*eet 56

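Okay, I couldn't resist having a go at it. It only works for attributes and elements, but hey... what can you expect in 15 minutes :) There may well be a cleaner way of doing it, too.

Including the index on every element (particularly the root one!) is superfluous, but it's easier than trying to work out whether there would otherwise be any ambiguity.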

using System;
using System.Text;
using System.Xml;

class Test
{
    static void Main()
    {
        string xml = @"
<root>
  <foo />
  <foo>
     <bar attr='value'/>
     <bar other='va' />
  </foo>
  <foo><bar /></foo>
</root>";
        XmlDocument doc = new XmlDocument();
        doc.LoadXml(xml);
        XmlNode node = doc.SelectSingleNode("//@attr");
        Console.WriteLine(FindXPath(node));
        Console.WriteLine(doc.SelectSingleNode(FindXPath(node)) == node);
    }

    static string FindXPath(XmlNode node)
    {
        StringBuilder builder = new StringBuilder();
        while (node != null)
        {
            switch (node.NodeType)
            {
                case XmlNodeType.Attribute:
                    builder.Insert(0, "/@" + node.Name);
                    node = ((XmlAttribute) node).OwnerElement;
                    break;
                case XmlNodeType.Element:
                    int index = FindElementIndex((XmlElement) node);
                    builder.Insert(0, "/" + node.Name + "[" + index + "]");
                    node = node.ParentNode;
                    break;
                case XmlNodeType.Document:
                    return builder.ToString();
                default:
                    throw new ArgumentException("Only elements and attributes are supported");
            }
        }
        throw new ArgumentException("Node was not in a document");
    }

    static int FindElementIndex(XmlElement element)
    {
        XmlNode parentNode = element.ParentNode;
        if (parentNode is XmlDocument)
        {
            return 1;
        }
        XmlElement parent = (XmlElement) parentNode;
        int index = 1;
        foreach (XmlNode candidate in parent.ChildNodes)
        {
            if (candidate is XmlElement && candidate.Name == element.Name)
            {
                if (candidate == element)
                {
                    return index;
                }
                index++;
            }
        }
        throw new ArgumentException("Couldn't find element within parent");
    }
}

Jon, thanks, I used this recently. There was a bug in FindElementIndex when an element has a "nephew" of the same type preceding it. I'll make a small edit to fix that. (3 upvotes)
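To make the "nephew" problem concrete, here is a small sketch. The earlier revision of FindElementIndex isn't shown in this thread, so this assumes it counted matches via GetElementsByTagName (which Answer 2 mentions); the class and method names below are made up for illustration. GetElementsByTagName returns all matching descendants in document order, so a same-named child of a preceding sibling gets counted as well.

```csharp
using System;
using System.Xml;

static class NephewSketch
{
    // Hypothetical reconstruction of the buggy approach: counts every
    // descendant of the parent with the same name, not just its children.
    public static int IndexAmongDescendants(XmlElement element)
    {
        XmlElement parent = (XmlElement) element.ParentNode;
        int index = 1;
        foreach (XmlNode candidate in parent.GetElementsByTagName(element.Name))
        {
            if (candidate == element)
            {
                return index;
            }
            index++;
        }
        throw new ArgumentException("Couldn't find element within parent");
    }

    // The approach used in the code above: only direct children are counted.
    public static int IndexAmongChildren(XmlElement element)
    {
        int index = 1;
        foreach (XmlNode candidate in element.ParentNode.ChildNodes)
        {
            if (candidate is XmlElement && candidate.Name == element.Name)
            {
                if (candidate == element)
                {
                    return index;
                }
                index++;
            }
        }
        throw new ArgumentException("Couldn't find element within parent");
    }

    static void Main()
    {
        XmlDocument doc = new XmlDocument();
        // The second <foo> under <root> has a "nephew" <foo> before it.
        doc.LoadXml("<root><foo><foo/></foo><foo/></root>");
        XmlElement target = (XmlElement) doc.DocumentElement.LastChild;

        Console.WriteLine(IndexAmongDescendants(target)); // 3 (wrong for XPath)
        Console.WriteLine(IndexAmongChildren(target));    // 2 (matches /root/foo[2])
    }
}
```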

Answer 2

Rob*_*ney 24

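Jon's correct that there are any number of XPath expressions that will yield the same node in an instance document. The simplest way to build an expression that unambiguously yields a specific node is a chain of node tests that use the node's position in the predicate, e.g.: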

/node()[1]/node()[2]/node()[6]/node()[1]/node()[2]

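Obviously, this expression doesn't use element names, but if all you're trying to do is locate a node within a document, you don't need its name. It also can't be used to find attributes (because attributes aren't child nodes and have no position; you can only find them by name), but it will find all other node types.

To build this expression, you need to write a method that returns a node's position among its parent's child nodes, because XmlNode doesn't expose that as a property: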

static int GetNodePosition(XmlNode child)
{
   for (int i=0; i<child.ParentNode.ChildNodes.Count; i++)
   {
       if (child.ParentNode.ChildNodes[i] == child)
       {
          // tricksy XPath, not starting its positions at 0 like a normal language
          return i + 1;
       }
   }
   throw new InvalidOperationException("Child node somehow not found in its parent's ChildNodes property.");
}

(There's probably a more elegant way to do this with LINQ, since XmlNodeList implements IEnumerable, but I'm going with what I know here.)
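For what it's worth, a LINQ version might look roughly like the sketch below. It assumes System.Linq is available (so .NET 3.5 or later, not the .NET 2.0 the question is tagged with), and the class name is made up for illustration. Because XmlNodeList only implements the non-generic IEnumerable, it needs a Cast<XmlNode>() before any other LINQ operator can be used.

```csharp
using System;
using System.Linq;
using System.Xml;

static class NodePositionLinqSketch
{
    public static int GetNodePosition(XmlNode child)
    {
        if (child.ParentNode == null)
        {
            throw new InvalidOperationException("Node has no parent.");
        }
        // Count the siblings that come before the node, then add one
        // because XPath positions start at 1.
        int preceding = child.ParentNode.ChildNodes
            .Cast<XmlNode>()
            .TakeWhile(n => !ReferenceEquals(n, child))
            .Count();
        return preceding + 1;
    }

    static void Main()
    {
        XmlDocument doc = new XmlDocument();
        doc.LoadXml("<root><a/>some text<b/></root>");
        XmlNode b = doc.DocumentElement.LastChild;
        Console.WriteLine(GetNodePosition(b)); // 3
    }
}
```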

Then you can write a recursive method like this:

static string GetXPathToNode(XmlNode node)
{
    if (node.NodeType == XmlNodeType.Attribute)
    {
        // attributes have an OwnerElement, not a ParentNode; also they have
        // to be matched by name, not found by position
        return String.Format(
            "{0}/@{1}",
            GetXPathToNode(((XmlAttribute)node).OwnerElement),
            node.Name
            );            
    }
    if (node.ParentNode == null)
    {
        // the only node with no parent is the root node, which has no path
        return "";
    }
    // the path to a node is the path to its parent, plus "/node()[n]", where 
    // n is its position among its siblings.
    return String.Format(
        "{0}/node()[{1}]",
        GetXPathToNode(node.ParentNode),
        GetNodePosition(node)
        );
}

As you can see, I hacked in a way for it to find attributes too.
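A quick round-trip check, in the same spirit as Jon's Main method, might look like the sketch below. The two methods are repeated from above so the block compiles on its own; the class name and the sample XML are assumptions chosen to keep the example small (one text node, one attribute, no whitespace between elements).

```csharp
using System;
using System.Xml;

static class RoundTripSketch
{
    static int GetNodePosition(XmlNode child)
    {
        for (int i = 0; i < child.ParentNode.ChildNodes.Count; i++)
        {
            if (child.ParentNode.ChildNodes[i] == child)
            {
                return i + 1; // XPath positions are 1-based
            }
        }
        throw new InvalidOperationException("Child node not found in its parent's ChildNodes.");
    }

    public static string GetXPathToNode(XmlNode node)
    {
        if (node.NodeType == XmlNodeType.Attribute)
        {
            return String.Format("{0}/@{1}",
                GetXPathToNode(((XmlAttribute) node).OwnerElement), node.Name);
        }
        if (node.ParentNode == null)
        {
            return "";
        }
        return String.Format("{0}/node()[{1}]",
            GetXPathToNode(node.ParentNode), GetNodePosition(node));
    }

    static void Main()
    {
        XmlDocument doc = new XmlDocument();
        doc.LoadXml("<root><a/>some text<b attr='v'/></root>");

        // A text node: something a name-based path can't address by name.
        XmlNode text = doc.DocumentElement.ChildNodes[1];
        string textPath = GetXPathToNode(text);
        Console.WriteLine(textPath);                               // /node()[1]/node()[2]
        Console.WriteLine(doc.SelectSingleNode(textPath) == text); // True

        // An attribute: located by name on its owner element.
        XmlNode attr = doc.SelectSingleNode("//@attr");
        string attrPath = GetXPathToNode(attr);
        Console.WriteLine(attrPath);                               // /node()[1]/node()[3]/@attr
        Console.WriteLine(doc.SelectSingleNode(attrPath) == attr); // True
    }
}
```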

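Jon slipped in with his version while I was writing mine. There's something about his code that's going to make me rant a bit now, and I apologize in advance if it sounds like I'm ragging on Jon. (I'm not. I'm pretty sure that the list of things Jon has to learn from me is exceedingly short.) But I think the point I'm going to make is a pretty important one for anyone who works with XML to think about.

I suspect that Jon's solution emerged from something I see a lot of developers do: thinking of XML documents as trees of elements and attributes. I think this largely comes from developers whose primary use of XML is as a serialization format, because all the XML they're used to working with is structured that way. You can spot these developers because they use the terms "node" and "element" interchangeably. This leads them to come up with solutions that treat all other node types as special cases. (I was one of these guys myself for a very long time.)

This feels like a simplifying assumption while you're making it. But it's not. It makes problems harder and code more complex. It leads you to bypass the pieces of XML technology (like the node() test in XPath) that are specifically designed to treat all node types generically.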
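Here's a small sketch of the difference, using a made-up mixed-content document. The class and helper names are just for illustration. The node() test matches every child node of root, while the * test matches only the child elements.

```csharp
using System;
using System.Xml;

static class NodeTestSketch
{
    // Helper for illustration: how many nodes does an XPath select?
    public static int CountMatches(string xml, string xpath)
    {
        XmlDocument doc = new XmlDocument();
        doc.LoadXml(xml);
        return doc.SelectNodes(xpath).Count;
    }

    static void Main()
    {
        // One text node, one comment, one element, one processing instruction.
        string xml = "<root>hello<!-- note --><a/><?pi data?></root>";

        Console.WriteLine(CountMatches(xml, "/root/node()")); // 4
        Console.WriteLine(CountMatches(xml, "/root/*"));      // 1

        XmlDocument doc = new XmlDocument();
        doc.LoadXml(xml);
        foreach (XmlNode n in doc.SelectNodes("/root/node()"))
        {
            // Prints Text, Comment, Element, ProcessingInstruction
            Console.WriteLine(n.NodeType);
        }
    }
}
```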

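There's a red flag in Jon's code that would prompt me to query it in a code review even if I didn't know what the requirements were, and that's GetElementsByTagName. Whenever I see that method in use, the question that leaps to mind is always "why does it have to be an element?" And the answer is very often "oh, does this code need to handle text nodes too?"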

The better general answer. (4 upvotes)

Answer 3

Roe*_*mer 6

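I know, old post, but the version I liked the most (the one with names) was flawed: when a parent node has child nodes with different names, it stopped counting the index after it found the first non-matching node name.

Here is my fixed version of it: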

/// <summary>
/// Gets the X-Path to a given Node
/// </summary>
/// <param name="node">The Node to get the X-Path from</param>
/// <returns>The X-Path of the Node</returns>
public string GetXPathToNode(XmlNode node)
{
    if (node.NodeType == XmlNodeType.Attribute)
    {
        // attributes have an OwnerElement, not a ParentNode; also they have             
        // to be matched by name, not found by position             
        return String.Format("{0}/@{1}", GetXPathToNode(((XmlAttribute)node).OwnerElement), node.Name);
    }
    if (node.ParentNode == null)
    {
        // the only node with no parent is the root node, which has no path
        return "";
    }

    // Get the Index
    int indexInParent = 1;
    XmlNode siblingNode = node.PreviousSibling;
    // Loop thru all Siblings
    while (siblingNode != null)
    {
        // Increase the Index if the Sibling has the same Name
        if (siblingNode.Name == node.Name)
        {
            indexInParent++;
        }
        siblingNode = siblingNode.PreviousSibling;
    }

    // the path to a node is the path to its parent, plus "/name[n]", where n is its position among its same-named siblings.
    return String.Format("{0}/{1}[{2}]", GetXPathToNode(node.ParentNode), node.Name, indexInParent);
}

Archived:	17 years, 3 months ago
Views:	71,570
Last activity:	7 years ago