I am working with a website whose HTML contains the following. I confirmed this by inspecting the element in Chrome DevTools.
<div class="hdp-photo-carousel" style="transform: translateX(0px);">
<div class="photo-tile photo-tile-large">
I can watch the page open and see that the element is there. Then, after 30 seconds, I get an error.
My Puppeteer code is:
const pptrFirefox = require('puppeteer-firefox');

(async () => {
  const browser = await pptrFirefox.launch({headless: false});
  const page = await browser.newPage();
  await page.goto('https://zillow.com');
  await page.type('.react-autosuggest__input', '8002 Blandwood Rd. Downey, CA 90240');
  await page.click('.zsg-search-button_primary');
  await page.waitForSelector('.photo-tile');
  console.log('did I get this far?');
})();
Can anyone tell me what I am doing wrong?
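One way to narrow this down is to record what the script's `page` object actually sees at the moment the wait gives up. The sketch below is only a diagnostic aid, not a fix: `waitOrReport` is a hypothetical helper name, and it assumes `page` exposes `waitForSelector()`, `url()` and `$$()` the way a Puppeteer `Page` does.

```javascript
// Diagnostic sketch. `waitOrReport` is a hypothetical helper, not part of Puppeteer.
// `page` is assumed to behave like a Puppeteer Page (waitForSelector, url, $$).
async function waitOrReport(page, selector, timeout = 30000) {
  try {
    await page.waitForSelector(selector, { timeout });
    return { found: true };
  } catch (err) {
    // The wait failed: capture where this page object ended up and how many
    // nodes match the selector right now, so the failure can be interpreted.
    const matches = await page.$$(selector);
    return {
      found: false,
      url: page.url(),
      matchCount: matches.length,
      message: err.message,
    };
  }
}
```

In the script above, `await page.waitForSelector('.photo-tile')` could be swapped for `console.log(await waitOrReport(page, '.photo-tile'))`. A `url` that differs from the address shown in the visible window, or a `matchCount` of 0, would suggest the script is looking at a different document than the one inspected in DevTools.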
I am running Puppeteer on Express / Node / Ubuntu as follows:
var puppeteer = require('puppeteer');
var express = require('express');
var router = express.Router();

/* GET home page. */
router.get('/', function(req, res, next) {
  (async () => {
    headless = true;
    const browser = await puppeteer.launch({headless: true, args:['--no-sandbox']});
    const page = await browser.newPage();
    url = req.query.url;
    await page.goto(url);
    let bodyHTML = await page.evaluate(() => document.body.innerHTML);
    res.send(bodyHTML)
    await browser.close();
  })();
});
Running this script many times leaves hundreds of zombie processes behind:
$ pgrep chrome | wc -l
133
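which clogs up the server.
How can I fix this?
Would running kill from the Express JS script solve it?
Is there a better way to get the same result, other than Puppeteer and headless Chrome?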
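One likely contributor in the route above is that `browser.close()` is only reached when every earlier `await` succeeds. If `page.goto(url)` rejects (a bad URL or a navigation timeout, for instance), the launched Chrome is never asked to exit. A minimal sketch of a guard against that, assuming a hypothetical helper named `withBrowser` that takes the launcher as a parameter:

```javascript
// `withBrowser` is a hypothetical helper, not part of Puppeteer.
// `launch` is any async function resolving to an object with close(),
// e.g. () => puppeteer.launch({headless: true, args: ['--no-sandbox']}).
async function withBrowser(launch, task) {
  const browser = await launch();
  try {
    // Run the caller's work and hand its result back.
    return await task(browser);
  } finally {
    // Runs on success and on failure, so the browser is always asked to exit.
    await browser.close();
  }
}

// Sketch of the route body from the question, rewritten around the helper
// (`launchChrome` is a hypothetical wrapper around puppeteer.launch):
//   const html = await withBrowser(launchChrome, async (browser) => {
//     const page = await browser.newPage();
//     await page.goto(req.query.url);
//     return page.evaluate(() => document.body.innerHTML);
//   });
//   res.send(html);
```

Compared with calling `kill` from the Express script, this only touches the browser that the handler itself started. Whether it removes every leftover process on a given server is not something the sketch can guarantee.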