How can I get all the XML/DOM under a parent web element as text using Selenium/Python?

har*_*ant 4 python selenium python-3.x selenium-webdriver

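I have a scenario where I need to process a UI object that is displayed as a grid, but whose rows and columns are separate web elements in an XML/DOM hierarchy, made up of multiple XPaths that can be resolved with a common pattern. Each of these elements contains text corresponding to its column type. Fetching all of these texts one by one through WebElement references takes time. Is there a way to get all of this XML as text (or at least one row in a single shot) and parse it in-line, to save extraction time?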

For example, consider the XML shown below. How can I get the whole XML hierarchy beneath <div[@class='table']> as text to be parsed?

Here is a sample:

<div[@class='table']>
     <div[@class='rows']>
          <div[@class='row']>
               <div[@class='col']>
                   <div[@class='element']>some_text1</div[@class='element']>
                   <div[@class='element']>some_text2</div[@class='element']>
                   <div[@class='element']>some_text3</div[@class='element']>
                   ...
               </div[@class='col']>
          </div[@class='row']>
          <div[@class='row']>
               <div[@class='col']>
                   <div[@class='element']>some_text1</div[@class='element']>
                   <div[@class='element']>some_text2</div[@class='element']>
                   <div[@class='element']>some_text3</div[@class='element']>
                   ...
               </div[@class='col']>
          </div[@class='row']>
          <div[@class='row']>
               ...
          </div[@class='row']>
          <div[@class='row']>
               ...
          </div[@class='row']>
          <div[@class='row']>
               ...
          </div[@class='row']>
     </div[@class='rows']>
</div[@class='table']>

I need to get the XML/DOM/div hierarchy shown below:

    <div[@class='rows']>
          <div[@class='row']>
               <div[@class='col']>
                   <div[@class='element']>some_text1</div[@class='element']>
                   <div[@class='element']>some_text2</div[@class='element']>
                   <div[@class='element']>some_text3</div[@class='element']>
                   ...
               </div[@class='col']>
          </div[@class='row']>
          <div[@class='row']>
               <div[@class='col']>
                   <div[@class='element']>some_text1</div[@class='element']>
                   <div[@class='element']>some_text2</div[@class='element']>
                   <div[@class='element']>some_text3</div[@class='element']>
                   ...
               </div[@class='col']>
          </div[@class='row']>
          <div[@class='row']>
               ...
          </div[@class='row']>
          <div[@class='row']>
               ...
          </div[@class='row']>
          <div[@class='row']>
               ...
          </div[@class='row']>
     </div[@class='rows']>

in one go.

Tod*_*kov 5

from selenium.webdriver.common.by import By  # find_element_by_xpath was removed in Selenium 4
element = driver.find_element(By.XPATH, "//div[@class='table']").get_attribute('outerHTML')

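The outerHTML property returns the element itself plus all of its child nodes, as they are present in the DOM.
I would advise against something like innerHTML: if the target element had a text child node, you would receive it, but the result would not be proper XML.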
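Once the markup has been fetched as a single string, it can be parsed locally without any further round trips to the browser. Below is a minimal sketch of that second step, using only Python's standard-library `html.parser`. It assumes the real markup uses plain `class="row"` and `class="element"` attributes, as the question's pseudo-XML suggests. `GridParser`, `parse_grid` and `sample_html` are hypothetical names introduced here for illustration; in a live session the input would be the string returned by `get_attribute('outerHTML')`.

```python
from html.parser import HTMLParser


class GridParser(HTMLParser):
    """Collect the text of each div.element, grouped by its enclosing div.row."""

    def __init__(self):
        super().__init__()
        self.rows = []    # one list of cell texts per div.row
        self._stack = []  # class attribute of every currently open <div>

    def handle_starttag(self, tag, attrs):
        if tag != "div":
            return
        css_class = dict(attrs).get("class", "")
        if css_class == "row":
            self.rows.append([])
        elif css_class == "element" and "row" in self._stack:
            self.rows[-1].append("")  # start a new, empty cell
        self._stack.append(css_class)

    def handle_endtag(self, tag):
        if tag == "div" and self._stack:
            self._stack.pop()

    def handle_data(self, data):
        # Keep only text that sits directly inside a div.element within a row.
        if self._stack[-1:] == ["element"] and "row" in self._stack:
            self.rows[-1][-1] += data


def parse_grid(outer_html):
    """Return a list of rows, each a list of stripped cell texts."""
    parser = GridParser()
    parser.feed(outer_html)
    parser.close()
    return [[cell.strip() for cell in row] for row in parser.rows]


# Stand-in for the string returned by get_attribute('outerHTML').
sample_html = """
<div class="table">
  <div class="rows">
    <div class="row"><div class="col">
      <div class="element">some_text1</div>
      <div class="element">some_text2</div>
    </div></div>
    <div class="row"><div class="col">
      <div class="element">some_text3</div>
    </div></div>
  </div>
</div>
"""

grid = parse_grid(sample_html)
print(grid)
```

The browser is queried once for the whole table and everything else happens in-process, which is the point of fetching outerHTML instead of calling `.text` on every cell. If the real class attributes carry several class names, the equality checks would need to become membership tests on the split class string.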