Screenshot of every DOM node

Мар*_*ров 8 javascript node.js web

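How can I create a screenshot of every DOM node on an arbitrary site?

I tried a headless browser (puppeteer), but it only works when I already know the XPath or selector of an element. How can I get an XPath or selector for all elements?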

// Assumes `page` is a Puppeteer Page that is already open in the enclosing scope.
async function screenshotDOMElement(opts = {}) {
    const padding = 'padding' in opts ? opts.padding : 0;
    const path = 'path' in opts ? opts.path : null;
    // Despite the name, this is evaluated as an XPath expression below.
    const selector = opts.selector;

    if (!selector)
        throw Error('Please provide a selector.');

    // Runs inside the page: locate the element and measure it.
    const rect = await page.evaluate(selector => {
        const element =
            document.evaluate(selector, document, null, XPathResult.FIRST_ORDERED_NODE_TYPE, null).singleNodeValue;
        if (!element)
            return null;
        const {x, y, width, height} = element.getBoundingClientRect();
        // Logs to the browser console, not to the Node.js terminal.
        console.log(x, y, width, height);
        return {left: x, top: y, width, height, id: element.id};
    }, selector);

    if (!rect)
        throw Error(`Could not find element that matches selector: ${selector}.`);

    // Clip the page screenshot to the element's rectangle plus padding.
    return await page.screenshot({
        path,
        clip: {
            x: rect.left - padding,
            y: rect.top - padding,
            width: rect.width + padding * 2,
            height: rect.height + padding * 2
        }
    });
}

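I also tried HtmlAgilityPack (C#) and enumerated every node in the HtmlDocument by XPath, but those XPaths do not work with puppeteer.

I need to use puppeteer because it is the best tool for taking screenshots by XPath or selector.

Can anyone help me?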

shk*_*per 10

With puppeteer you no longer need to crop a full-page screenshot, because it has elementHandle.screenshot([options]). Here is what you can do:

const puppeteer = require('puppeteer');

(async () => {
  const browser = await puppeteer.launch();
  const page = await browser.newPage();
  await page.goto('https://example.com');

  // Get handles for all elements, same as document.querySelectorAll('*')
  const elements = await page.$$('*');

  for (let i = 0; i < elements.length; i++) {
    try {
      // Take a screenshot of this element only
      await elements[i].screenshot({path: `${i}.png`});
    } catch (e) {
      // If the element is 'not visible', log the error and continue
      console.log(`Couldn't take screenshot of element with index: ${i}. Cause: `, e);
    }
  }
  await browser.close();
})();
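If you also want an XPath for each element, as the question asks, one option is to compute it from the element itself. The following is only a sketch: `buildXPath` and `screenshotWithXPaths` are hypothetical helper names, not part of puppeteer, and the generated paths assume plain HTML elements (SVG or other namespaced elements, and anything inside shadow DOM, would likely need extra handling).

```javascript
// Hypothetical helper (not a puppeteer API): builds an absolute XPath such as
// /html[1]/body[1]/div[2] by walking from an element up to the document root.
function buildXPath(element) {
  const ELEMENT_NODE = 1;
  const steps = [];
  for (let node = element; node && node.nodeType === ELEMENT_NODE; node = node.parentNode) {
    // Position among preceding siblings with the same tag name (XPath is 1-based).
    let index = 1;
    for (let sib = node.previousElementSibling; sib; sib = sib.previousElementSibling) {
      if (sib.localName === node.localName) index++;
    }
    steps.unshift(`${node.localName}[${index}]`);
  }
  return '/' + steps.join('/');
}

// Sketch of use with the `elements` array returned by page.$$('*').
async function screenshotWithXPaths(elements) {
  const results = [];
  for (let i = 0; i < elements.length; i++) {
    // elementHandle.evaluate runs buildXPath in the page with the element as argument.
    const xpath = await elements[i].evaluate(buildXPath);
    try {
      await elements[i].screenshot({ path: `${i}.png` });
      results.push({ file: `${i}.png`, xpath });
    } catch (e) {
      console.log(`Skipping ${xpath}:`, e.message);
    }
  }
  return results;
}
```

Because `buildXPath` refers to nothing outside its own body, it can be serialized and run in the page context. The resulting list maps each image file to the XPath of the element it shows.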

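Note that puppeteer cannot take screenshots of some elements, for example ones that are not visible or are covered by other elements. In those cases you need to catch the error and move on.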
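As a complement to catching the error, you could check each element before trying to capture it. The sketch below relies on elementHandle.boundingBox(), which resolves to null when the element is not visible. `hasVisibleBox` and `screenshotVisible` are hypothetical names used only for illustration, and the try/catch stays in place because a screenshot may still fail for other reasons.

```javascript
// Hypothetical helper: decides whether a box returned by
// elementHandle.boundingBox() describes something worth capturing.
function hasVisibleBox(box) {
  // boundingBox() resolves to null for elements that are not visible;
  // zero-width or zero-height boxes would produce an empty image.
  return box != null && box.width > 0 && box.height > 0;
}

// Sketch of use with the `elements` array returned by page.$$('*').
async function screenshotVisible(elements) {
  const saved = [];
  for (let i = 0; i < elements.length; i++) {
    const box = await elements[i].boundingBox();
    if (!hasVisibleBox(box)) continue; // skip up front instead of waiting for the error
    try {
      await elements[i].screenshot({ path: `${i}.png` });
      saved.push(i);
    } catch (e) {
      console.log(`Could not take screenshot of element ${i}:`, e.message);
    }
  }
  return saved; // indexes of the elements that were captured
}
```

Skipping such elements early avoids a failed screenshot attempt for each of them, which can add up on large pages.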