Mev*_*pek 1 javascript xpath google-chrome puppeteer
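According to this response, is there a way (as with casperjs/phantomjs) to add our own custom functions to the page.evaluate() context?
For example, by including a file with a helper function x that calls XPath functions: x('//a/@href')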
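You can register helper functions in a separate page.evaluate() call. page.exposeFunction() looks tempting, but the function it exposes runs in Node.js rather than in the browser context, so it cannot reach the document object that an XPath helper needs.
Here is an example that registers a helper function $x():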
const puppeteer = require('puppeteer');

// Runs inside the page: attaches $x() to window so later evaluate() calls can use it.
const helperFunctions = () => {
  window.$x = xPath => document
    .evaluate(
      xPath,
      document,
      null,
      XPathResult.FIRST_ORDERED_NODE_TYPE,
      null
    )
    .singleNodeValue;
};

(async () => {
  const browser = await puppeteer.launch();
  const page = await browser.newPage();
  await page.goto('https://en.wikipedia.org', { waitUntil: 'networkidle2' });

  // Register the helpers in the browser context.
  await page.evaluate(helperFunctions);

  const text = await page.evaluate(() => {
    // $x() is now available
    const featureArticle = $x('//*[@id="mp-tfa"]');
    return featureArticle.textContent;
  });
  console.log(text);
  await browser.close();
})();
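To see why a helper registered in one page.evaluate() call is still visible in the next one, it may help to simulate the mechanism without a browser. The sketch below does not use Puppeteer at all: it assumes Node's built-in vm module as a stand-in for the page, and the sandbox object with its stubbed document and XPathResult is purely hypothetical. It only illustrates the idea that the function's source is executed in a separate global scope, where anything attached to window persists.

```javascript
const vm = require('vm');

// Same registration function as in the answer above.
const helperFunctions = () => {
  window.$x = xPath => document
    .evaluate(xPath, document, null, XPathResult.FIRST_ORDERED_NODE_TYPE, null)
    .singleNodeValue;
};

// Hypothetical stand-ins for the browser globals; a real page supplies these.
const sandbox = {
  document: {
    // Stub: pretends every XPath matches one node and echoes the expression.
    evaluate: xPath => ({ singleNodeValue: { textContent: `matched ${xPath}` } }),
  },
  XPathResult: { FIRST_ORDERED_NODE_TYPE: 9 },
};
sandbox.window = sandbox; // in a browser, window is the global object
vm.createContext(sandbox);

// Roughly analogous to page.evaluate(helperFunctions):
// the function's source text is run inside the other context.
vm.runInContext(`(${helperFunctions.toString()})()`, sandbox);

// A later, separate evaluation still sees $x because it lives on the shared global.
const text = vm.runInContext(`$x('//*[@id="mp-tfa"]').textContent`, sandbox);
console.log(text);

// The Node.js side has no document of its own, so the helper must run in the page.
console.log(typeof document);
```

In a real page the same persistence only lasts until the next navigation, since a new document gets a fresh global object; the helpers would need to be registered again after each page.goto().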
(Edit: adding helpers from a file)
You can also keep the helpers in a separate file and inject it into the browser context with page.addScriptTag(). Here is an example:
helperFunctions.js
window.$x = xPath => document
  .evaluate(
    xPath,
    document,
    null,
    XPathResult.FIRST_ORDERED_NODE_TYPE,
    null
  )
  .singleNodeValue;
And use it like this:
const puppeteer = require('puppeteer');

(async () => {
  const browser = await puppeteer.launch();
  const page = await browser.newPage();
  await page.goto('https://en.wikipedia.org', { waitUntil: 'networkidle2' });

  // Inject the helper file into the page as a <script> tag.
  await page.addScriptTag({ path: './helperFunctions.js' });

  const text = await page.evaluate(() => {
    // $x() is now available
    const featureArticle = $x('//*[@id="mp-tfa"]');
    return featureArticle.textContent;
  });
  console.log(text);
  await browser.close();
})();
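The $x() helper above returns only the first matching node, while the question's example, x('//a/@href'), would typically match many attributes. As a rough sketch, a second helper could collect the text of every match. The name xAll, the doc parameter, and the stub document are all hypothetical and exist only so the function can be run outside a browser; the constant 7 is the value of XPathResult.ORDERED_NODE_SNAPSHOT_TYPE in the DOM specification.

```javascript
// Value of XPathResult.ORDERED_NODE_SNAPSHOT_TYPE, hard-coded so the
// sketch also runs where the XPathResult global does not exist.
const ORDERED_NODE_SNAPSHOT_TYPE = 7;

// Hypothetical helper: returns the text of every node matching xPath.
// `doc` defaults to the page's document; it is a parameter only so the
// function can be exercised with a stub.
const xAll = (xPath, doc = document) => {
  const snapshot = doc.evaluate(xPath, doc, null, ORDERED_NODE_SNAPSHOT_TYPE, null);
  const values = [];
  for (let i = 0; i < snapshot.snapshotLength; i++) {
    // For attribute nodes such as @href, textContent is the attribute value.
    values.push(snapshot.snapshotItem(i).textContent);
  }
  return values;
};

// Stub document standing in for a real page (assumption for illustration).
const stubDocument = {
  evaluate: () => {
    const nodes = [{ textContent: '/wiki/A' }, { textContent: '/wiki/B' }];
    return { snapshotLength: nodes.length, snapshotItem: i => nodes[i] };
  },
};

const hrefs = xAll('//a/@href', stubDocument);
console.log(hrefs);
```

Inside helperFunctions.js this could be attached as window.$xAll = xAll. Returning plain strings rather than nodes is deliberate: page.evaluate() can only hand serializable values back to Node.js, so DOM nodes themselves would not survive the trip.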