bud*_*mat 7 python dom handle playwright playwright-python
In playwright-python I know I can get an elementHandle using querySelector().
Example (sync):
from playwright import sync_playwright

with sync_playwright() as p:
    for browser_type in [p.chromium, p.firefox, p.webkit]:
        browser = browser_type.launch()
        page = browser.newPage()
        page.goto('https://duckduckgo.com/')
        element = page.querySelector('input[id="search_form_input_homepage"]')
How can I get elements relative to this elementHandle, i.e. handles to its parent, grandparent, siblings, and children?
bud*_*mat 10
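Original answer:
Using querySelector()/querySelectorAll() with XPath (XML Path Language) lets you retrieve an elementHandle (or a collection of handles, respectively). In general, XPath can be used to navigate the elements and attributes of an XML document. Note that following-sibling::* only matches siblings that come after the element; siblings before it are matched by preceding-sibling::*.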
# camelCase API of early playwright-python releases
from playwright import sync_playwright

with sync_playwright() as p:
    for browser_type in [p.chromium, p.firefox, p.webkit]:
        browser = browser_type.launch(headless=False)
        page = browser.newPage()
        page.goto('https://duckduckgo.com/')
        element = page.querySelector('input[id="search_form_input_homepage"]')
        # XPath is evaluated relative to the element the method is called on
        parent = element.querySelector('xpath=..')
        grandparent = element.querySelector('xpath=../..')
        siblings = element.querySelectorAll('xpath=following-sibling::*')
        children = element.querySelectorAll('xpath=child::*')
        browser.close()
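To make the relative XPath expressions easier to reuse, here is a minimal sketch that collects them in one place. The names XPATH_AXES, relative_selector and ancestor_selector are hypothetical helpers of my own, not part of Playwright, and the preceding-sibling entry is an addition that the snippet above does not use. The helpers only build selector strings, so they run without a browser.

```python
# Illustrative lookup of XPath expressions relative to a context element.
# The dictionary and function names are hypothetical, not part of Playwright.
XPATH_AXES = {
    "parent": "..",
    "grandparent": "../..",
    "children": "child::*",
    "following_siblings": "following-sibling::*",
    # Addition for illustration: siblings located before the element.
    "preceding_siblings": "preceding-sibling::*",
}


def relative_selector(relation: str) -> str:
    """Return a selector string with the 'xpath=' prefix, e.g. 'xpath=..'."""
    return "xpath=" + XPATH_AXES[relation]


def ancestor_selector(levels: int) -> str:
    """Build 'xpath=..', 'xpath=../..', ... for the given number of levels up."""
    if levels < 1:
        raise ValueError("levels must be at least 1")
    return "xpath=" + "/".join([".."] * levels)
```

For example, relative_selector("parent") gives the same string that is passed to querySelector() for the parent above, and ancestor_selector(2) gives the one used for the grandparent.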
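Update (2022-07-22):
It seems browser.newPage() has been deprecated, so in newer versions of playwright the function is called browser.new_page() (note the different function name). The same snake_case renaming applies to query_selector()/query_selector_all(), and sync_playwright is imported from playwright.sync_api.
Optionally, first create a browser context (and close it afterwards) and call new_page() on that context.
The XPath selectors for accessing children/parents/grandparents/siblings remain the same.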
# snake_case API of newer playwright-python releases
from playwright.sync_api import sync_playwright

with sync_playwright() as p:
    for browser_type in [p.chromium, p.firefox, p.webkit]:
        browser = browser_type.launch(headless=False)
        context = browser.new_context()
        page = context.new_page()
        page.goto('https://duckduckgo.com/')
        element = page.query_selector('input[id="search_form_input_homepage"]')
        # XPath is evaluated relative to the element the method is called on
        parent = element.query_selector('xpath=..')
        grandparent = element.query_selector('xpath=../..')
        siblings = element.query_selector_all('xpath=following-sibling::*')
        children = element.query_selector_all('xpath=child::*')
        context.close()
        browser.close()
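If you need these lookups in several places, they could be wrapped in a small helper. This is only a sketch: collect_relatives is a hypothetical name, and it assumes the object passed in exposes query_selector() and query_selector_all() the way an ElementHandle does in snake_case releases. query_selector() returns None when nothing matches, so the parent or grandparent entry may be None.

```python
from typing import Any, Dict


def collect_relatives(element: Any) -> Dict[str, Any]:
    """Gather handles related to `element` using relative XPath selectors.

    Assumption: `element` provides query_selector() and query_selector_all(),
    as ElementHandle does in snake_case releases of playwright-python.
    """
    return {
        # Single handles (or None if there is no such ancestor)
        "parent": element.query_selector("xpath=.."),
        "grandparent": element.query_selector("xpath=../.."),
        # Lists of handles (empty if nothing matches)
        "siblings": element.query_selector_all("xpath=following-sibling::*"),
        "children": element.query_selector_all("xpath=child::*"),
    }
```

With the snippet above it could be called as relatives = collect_relatives(element), after which relatives["parent"] holds the parent handle.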