Puppeteer querySelector returns null

Ara*_*yan 0 jquery-selectors node.js web-scraping puppeteer

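I'm trying to scrape some data with Puppeteer, but on some websites querySelector returns null and I can't tell what is wrong. I found a few answers to this problem on Stack Overflow, but none of them worked. Below is the code, with a sample link for which it fails.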

const puppeteer = require('puppeteer');

(async () => {
    const browser = await puppeteer.launch();
    const page = await browser.newPage();

    await page.goto('https://www.macys.com/shop/product/the-north-face-mens-logo-half-dome-t-shirt?ID=2085687&CategoryID=30423&cm_kws=2085687');

    const textContent = await page.evaluate(() => {
        return document.querySelector('.price');
    });

    console.log(textContent);

    browser.close();
})();

Pjo*_*kov 10

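The elements are probably loaded asynchronously by JavaScript, so they are not yet in the DOM when you call .evaluate().

Try waiting for the selector with Puppeteer's .waitForSelector function: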

const puppeteer = require('puppeteer');

(async () => {
    const browser = await puppeteer.launch();
    const page = await browser.newPage();

    await page.goto('https://www.macys.com/shop/product/the-north-face-mens-logo-half-dome-t-shirt?ID=2085687&CategoryID=30423&cm_kws=2085687');

    // Block until the element exists in the DOM (rejects if the timeout expires).
    await page.waitForSelector('.price');

    const textContent = await page.evaluate(() => {
        // Return the text: a DOM element itself cannot be serialized back to Node.
        return document.querySelector('.price').textContent;
    });

    console.log(textContent);

    await browser.close();
})();
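If the selector never appears, waitForSelector rejects once its timeout expires, so it can help to handle that case explicitly. The following is only a minimal sketch: `getTextOrNull` is a hypothetical helper name (not part of Puppeteer), the 10-second default is an arbitrary assumption, and it relies on Puppeteer's timeout errors carrying the name `TimeoutError`.

```javascript
// Hypothetical helper, not a Puppeteer API: wait for a selector and return
// its trimmed text, or null if the element does not show up in time.
async function getTextOrNull(page, selector, timeoutMs = 10000) {
  try {
    // Rejects when the selector is still missing after timeoutMs.
    await page.waitForSelector(selector, { timeout: timeoutMs });
  } catch (err) {
    if (err && err.name === 'TimeoutError') {
      return null; // The element never appeared.
    }
    throw err; // Anything else (navigation failure, closed page) is unexpected.
  }
  // $eval runs the callback in the page against the first matching element.
  return page.$eval(selector, (el) => el.textContent.trim());
}
```

Inside the async function from the answer above, it could be called as `const price = await getTextOrNull(page, '.price');`. A null result then means the element never appeared, which is worth investigating separately (for example, the site may be serving a different page to the headless browser).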


Ara*_*yan 8

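After taking a snapshot of the page, it turned out that my request was being blocked by a bot-detection system. Here is the solution: we just need to send more data so that the request is not flagged as a bot. If that still doesn't work, you can take a look at this tutorial.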

const puppeteer = require('puppeteer');

// This is where we'll put the code to get around the tests.
const preparePageForTests = async (page) => {
  // Pass the User-Agent Test.
  const userAgent = 'Mozilla/5.0 (X11; Linux x86_64) ' +
    'AppleWebKit/537.36 (KHTML, like Gecko) Chrome/64.0.3282.39 Safari/537.36';
  await page.setUserAgent(userAgent);
};

(async () => {
  const browser = await puppeteer.launch();
  const page = await browser.newPage();
  await preparePageForTests(page);

  // await page.setRequestInterception(true);
  await page.goto('websiteURL');

  const textContent = await page.evaluate(() => {
    return document.querySelector('yourCSSselector').textContent;
  });
  console.log(textContent);

  await browser.close();
})();
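For anyone who wants to reproduce the "snapshot" step that revealed the block, here is a rough sketch of how the page could be captured for inspection. `dumpPageForDebugging` and the `debug` base name are hypothetical and purely illustrative; the Puppeteer calls used are `page.screenshot`, `page.content` and `page.title`.

```javascript
// Hypothetical debugging helper, not a Puppeteer API: capture what the
// headless browser actually received so a block page is easy to spot.
async function dumpPageForDebugging(page, basename = 'debug') {
  const screenshotPath = `${basename}.png`;
  // Save a full-page screenshot to disk.
  await page.screenshot({ path: screenshotPath, fullPage: true });
  // Grab the serialized HTML and the document title as the browser sees them.
  const html = await page.content();
  const title = await page.title();
  return { screenshotPath, title, html };
}
```

Calling `const info = await dumpPageForDebugging(page);` right after `page.goto(...)` and then looking at the saved image, or at `info.title` and `info.html`, should show whether the site returned the real product page or an "access denied" style page.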