XQuery: look for node with descendants in a certain order

she*_*nia 4 xpath xquery basex

I have an XML file that represents the syntax trees of all the sentences in a book:

<book>
    <sentence>
        <w class="pronoun" role="subject">
            I
        </w>
        <wg type="verb phrase">
            <w class="verb" role="verb">
                like
            </w>
            <wg type="noun phrase" role="object">
                <w class="adj">
                    green
                </w>
                <w class="noun">
                    eggs
                </w>
            </wg>
        </wg>
    </sentence>
    <sentence>
        ...
    </sentence>
    ...
</book>

This example is fake, but the point is that the actual words (the <w> elements) are nested in unpredictable ways based on syntactic relationships.

What I'm trying to do is find <sentence> nodes with <w> descendants matching particular criteria in a certain order. For example, I may be looking for a sentence with a w[@class='pronoun'] descendant followed by a w[@class='verb'] descendant.

It's easy to find sentences that just contain both descendants, without caring about ordering:

//sentence[descendant::w[criteria1] and descendant::w[criteria2]]

I did manage to work out a query that does what I want. It looks for a <w> matching the first criteria that has a following <w> matching the second criteria, where both share the same closest <sentence> ancestor:

for $sentence in //sentence
where $sentence[descendant::w[criteria1 and 
    following::w[(ancestor::sentence[1] = $sentence) and criteria2]]]
return ...

...but unfortunately it's very slow, and I'm not sure why.

Is there a non-slow way to search for a node that contains descendants matching criteria in a certain order? I'm using XQuery 3.1 with BaseX. If I can't find a reasonable way to do this with XQuery, plan B is to do post-processing with Python.

Chr*_*rün 6

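The following axis is indeed expensive, because it spans all subsequent nodes of the document that are neither descendants nor ancestors of the context node.

The node comparison operators (<<, >>, is) can help you here. The following code example checks whether there is at least one verb that is followed by a noun: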

for $sentence in //sentence
let $words1 := $sentence//w[@class = 'verb']
let $words2 := $sentence//w[@class = 'noun']
where some $w1 in $words1 satisfies 
      some $w2 in $words2 satisfies $w1 << $w2
return $sentence
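For the pronoun-then-verb case from the question, the same idea can be sketched without the nested quantifiers: some pronoun precedes some verb exactly when the first pronoun in document order precedes the last verb. The following is a minimal, self-contained sketch under stated assumptions: the inline $book data and the id attributes are made up for illustration, and the second sentence is an invented counterexample.

```xquery
(: Hypothetical sample data modelled on the question's example.
   The id attributes are not in the original file; they only make the result readable. :)
let $book :=
  <book>
    <sentence id="s1">
      <w class="pronoun" role="subject">I</w>
      <wg type="verb phrase">
        <w class="verb" role="verb">like</w>
        <wg type="noun phrase" role="object">
          <w class="adj">green</w>
          <w class="noun">eggs</w>
        </wg>
      </wg>
    </sentence>
    <sentence id="s2">
      <w class="verb">run</w>
      <w class="pronoun">you</w>
    </sentence>
  </book>
for $sentence in $book/sentence
(: first pronoun and last verb of this sentence, both in document order :)
let $first := ($sentence//w[@class = 'pronoun'])[1]
let $last  := ($sentence//w[@class = 'verb'])[last()]
(: << yields the empty sequence, hence false, if either operand is missing :)
where $first << $last
return string($sentence/@id)
```

With this sample data, only s1 should be returned, since in s2 the verb comes before the pronoun. Against the real file, $book/sentence would be replaced by //sentence and the two predicates by your own criteria. This shortcut only works for two criteria; for longer chains the quantified form above is easier to extend.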