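Tag: puppeteer

How do I download a file from a direct link with Puppeteer?

I need to download an image with Puppeteer. The problem is the buffer returned through the goto method. I think goto returns a sequence of responses as the image loads, so writeFile only gets the last buffer. Is there another promise-based method for handling a sequence of buffers?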

const puppeteer = require('puppeteer-core');
const fs = require('fs').promises;

(async () => {
  const options = {
   product: 'chrome',
   headless: true,
   pipe: true,
   executablePath: 'chrome.exe'
  };

  const browser = await puppeteer.launch(options);
  const page = await browser.newPage();
  const response = await page.goto('https://static.wikia.nocookie.net/naruto/images/d/dd/Naruto_Uzumaki%21%21.png/revision/latest?cb=20161013233552');
  
  // save buffer to file
  await fs.writeFile('file.jpg', await response.buffer());
  await browser.close();
})();
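
A minimal sketch of one way to save the navigation response, assuming `response` behaves like the object returned by `page.goto` in the question (exposing `ok()`, `headers()` and `buffer()`). The helper name `saveResponse` and the `EXTENSIONS` table are illustrative and not part of Puppeteer. `response.buffer()` resolves to one Buffer holding the whole body, so the sketch writes it once and picks the file extension from the `content-type` header instead of hard-coding `.jpg` for a PNG URL.

```javascript
const fs = require('fs').promises;

// Hypothetical lookup table; extend it for other content types.
const EXTENSIONS = {
  'image/png': '.png',
  'image/jpeg': '.jpg',
  'image/gif': '.gif',
  'image/webp': '.webp',
};

// Writes the response body to `<basePath><ext>` and returns the file name.
// `response` is assumed to expose ok(), headers() and buffer().
async function saveResponse(response, basePath) {
  if (!response || !response.ok()) {
    throw new Error('Navigation did not return a successful response');
  }
  // Header names are expected in lower case; drop any "; charset=..." suffix.
  const contentType = (response.headers()['content-type'] || '').split(';')[0].trim();
  const ext = EXTENSIONS[contentType] || '.bin';
  const target = basePath + ext;
  await fs.writeFile(target, await response.buffer());
  return target;
}
```

With the question's code this would be called as `await saveResponse(await page.goto(url), 'file')`.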


javascript node.js puppeteer

dav*_*hen

lucky-day

Score: 1 | Answers: 1 | Views: 7287

How do I make Puppeteer wait for the page redirect after Cloudflare's browser check?

I am scraping a website, and after submitting a form I am redirected to this:

Checking your browser before accessing <Website Name>.
This process is automatic. Your browser will redirect to your requested content shortly.

Please allow up to 5 seconds…

DDoS protection by Cloudflare
Ray ID: <Some ID>


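Normally, when I submit that form manually from a "real" web browser, I am redirected to the main content almost immediately after the browser check appears. With Puppeteer, that is not the case.

I tried using page.waitForNavigation(), but could not get it to work.
Is there any way to actually get through this check? Or is Puppeteer simply being blocked?

Thanks in advance!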
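
One thing that could be tried instead of page.waitForNavigation() is waiting until the interstitial text is gone. The sketch below assumes `page` is a Puppeteer Page and uses `page.waitForFunction`; the helper name, the default marker string and the timeout are illustrative. It does not bypass the check: if the site refuses automated browsers, the wait simply times out.

```javascript
// Waits until the interstitial text is no longer present in the page body.
// `marker` and `timeout` are illustrative defaults, not values from the question.
async function waitForInterstitialToClear(page, marker = 'Checking your browser', timeout = 30000) {
  await page.waitForFunction(
    // Runs in the browser: true once a body exists and no longer contains the marker.
    (text) => !!document.body && !document.body.innerText.includes(text),
    { timeout },
    marker
  );
}
```

It would be called right after the form submission, for example `await waitForInterstitialToClear(page);`.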

javascript browser ddos cloudflare puppeteer

Imt*_*bir

lucky-day

Score: 1 | Answers: 1 | Views: 7611

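How do I get the src attribute of an image with Puppeteer? I get a "Cannot read property 'getAttribute' of null" error

I am working with Puppeteer and trying to download an image. In the Chrome DevTools console, this returns what I want: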

document.querySelector('.photo img').getAttribute('src')


But the same code inside Puppeteer's evaluate function:

let imageSrc = await page.evaluate(() => {
  return document.querySelector('.photo img').getAttribute('src');
});


throws an error:

error:  Error: Evaluation failed: TypeError: Cannot read property 'getAttribute' of null


Any idea why this happens?
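
The error means `document.querySelector('.photo img')` returned null when the evaluate callback ran. A common cause, assumed here, is that the image is added to the DOM after the initial load. A minimal sketch that waits for the element first; `getImageSrc` and the timeout default are illustrative names, while `page.waitForSelector` and `page.$eval` are Puppeteer Page methods:

```javascript
// Waits for the element to exist, then reads its src attribute in the page.
async function getImageSrc(page, selector = '.photo img', timeout = 10000) {
  await page.waitForSelector(selector, { timeout });
  return page.$eval(selector, (img) => img.getAttribute('src'));
}
```

If the wait times out, the selector is probably not matching in the Puppeteer session at all (for instance a different page variant or an iframe), which is worth checking separately.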

javascript node.js puppeteer

And*_* D_

lucky-day

Score: 1 | Answers: 1 | Views: 1848

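How do I authenticate a proxy in Playwright?

In my Puppeteer browser options I have the argument '--proxy-server=endpoint:port', and I can authenticate with await page.authenticate({username, password});. I cannot find any way to do the same in Playwright. How do I do it?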
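
For comparison, Playwright accepts proxy credentials in the launch options through a `proxy` object with `server`, `username` and `password`. A small sketch; the helper name `proxyLaunchOptions` and the placeholder values are illustrative, and the `playwright` package is only referenced in the usage comment so the snippet runs on its own:

```javascript
// Builds launch options that carry the proxy endpoint and its credentials.
function proxyLaunchOptions(server, username, password) {
  return { proxy: { server, username, password } };
}

// Usage (requires the `playwright` package, which is not loaded here):
//   const { chromium } = require('playwright');
//   const browser = await chromium.launch(
//     proxyLaunchOptions('http://endpoint:port', 'username', 'password')
//   );
```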

node.js puppeteer playwright

A.B*_*izh

lucky-day

Score: 1 | Answers: 1 | Views: 7903

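Puppeteer - using only one browser instance

How can I use only one browser instance while running multiple Puppeteer tasks? The website I am scraping detects the creation of a browser instance, even after awaiting browser.close(). So if I keep the browser open the whole time, I can get around it.

Example scenario: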


(async() => {
    const browser = await puppeteer.launch({headless: true}); //

    // Have this only run once ^^^^

    // Command gets run, it should not make a new browser and instead go
    // to make a new page    
    // VVVVVVVV

    const page = await browser.newPage();
    
    await page.goto(args[1]) // Go to the url the user specified

    // do some stuff

    await page.close();

   //repeat from browser.newPage();

})();
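
A minimal sketch of the "launch once, open a page per command" idea from the scenario above. `createBrowserPool`, `getBrowser` and `withPage` are hypothetical names; `launch` is any function returning a promise for a browser, for example `() => puppeteer.launch({ headless: true })`.

```javascript
// Caches the launch promise so every caller shares one browser instance.
function createBrowserPool(launch) {
  let browserPromise = null;

  // Launches on first use and reuses the same promise afterwards.
  const getBrowser = () => {
    if (!browserPromise) browserPromise = launch();
    return browserPromise;
  };

  // Runs one task in its own page and always closes that page.
  const withPage = async (task) => {
    const browser = await getBrowser();
    const page = await browser.newPage();
    try {
      return await task(page);
    } finally {
      await page.close();
    }
  };

  return { getBrowser, withPage };
}
```

Each command would then call something like `await pool.withPage((page) => page.goto(args[1]))`. Caching the promise rather than the resolved browser keeps two commands that arrive at the same time from launching twice.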


Any ideas?

puppeteer

Jac*_*edo

lucky-day

Score: 1 | Answers: 1 | Views: 3867

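TypeError [ERR_INVALID_CALLBACK]: Callback must be a function. Received Promise { <pending> }

I get this error when trying to scrape prnt.sc, and I don't understand why.

I think setInterval() is what is causing the problem.

TypeError [ERR_INVALID_CALLBACK]: Callback must be a function. Received Promise { <pending> }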

const puppeteer = require("puppeteer");
const select = require('puppeteer-select');

async function llamar() {

  const browser = await puppeteer.launch({
    headless: true
  });

  var text = "";
  var possible = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789";

  for (var i = 0; i < 6; i++)
    text += "https://prnt.sc/" + possible.charAt(Math.floor(Math.random() * possible.length));
  console.log(text)


  const page = await browser.newPage();
  path = Math.random()
  await page.goto(text)
  const element = await select(page).getElement('button:contains(AGREE)');
  await element.click()
  await page.screenshot({
    path: path + '.jpg' …
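
The setInterval call itself is cut off in the excerpt above, so the following is a guess at the cause: writing `setInterval(llamar(), 1000)` calls the async function immediately and hands the timer a pending Promise, which Node rejects with a TypeError. A minimal sketch of passing a function instead; `startPolling` is a hypothetical helper and the `llamar` below is only a stand-in for the asker's function:

```javascript
// Stand-in for the asker's async llamar(); the real body drives Puppeteer.
async function llamar() {
  return 'done';
}

// setInterval(llamar(), 1000) would pass a Promise and throw a TypeError.
// Passing a function lets the timer invoke the task on every tick.
function startPolling(task, intervalMs) {
  return setInterval(() => {
    // Catch rejections so a failed run does not become an unhandled rejection.
    task().catch((err) => console.error('task failed:', err));
  }, intervalMs);
}
```

Usage would be `const timer = startPolling(llamar, 5000);` and later `clearInterval(timer);`. Note that the timer does not wait for a slow run to finish before starting the next one.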


javascript puppeteer

SaM*_*M85

2021 10-14

Score: 1 | Answers: 1 | Views: 1320

Unable to launch Puppeteer locally - Mac M1

I have installed Puppeteer using brew. I also have puppeteer as a dependency in my project.

But when my code reaches this line:

const browser = await puppeteer.launch({ headless: true });


it throws this error in the terminal:

I cannot find a solution anywhere.

node.js puppeteer

Sac*_*dav

lucky-day

Score: 1 | Answers: 1 | Views: 6164

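Puppeteer/Chromium crashes the server due to lack of RAM

I'm using nodejs/puppeteer to log my users in to a remote website... here is how it works:

The client connects to the nodejs server via socket.io and sends start_tunnel to the server to start Puppeteer, and node calls run(socket , data.token ); which runs Puppeteer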

    io.on('connection' , function(socket){

        socket.on('start_tunnel' , function (data) {
            fullfillCaptcha[socket.id] = null ;
            set_stat(socket.id , 1 );
            run(socket , data.token );
        })

        socket.on('get_captcha_from_client' , function (data) {

            fullfillCaptcha[socket.id](data);

        })

    });

    var fullfillCaptcha = {};
    var pay_stats = {} ;

    function captchaPromise(id){
        return  new Promise(resolve => fullfillCaptcha[id] = resolve);
    }


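Here is the run function that launches Puppeteer. I have commented the code so it is easy to read. Basically, it opens a web page containing a form with a captcha, takes a screenshot of the captcha image, sends it to the client, receives the captcha input from the client, puts it into the input field, and submits the form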

async function run(socket , token ) {

   /// OPENING THE WEB PAGE 
    const browser = await puppeteer.launch({headless: true …
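
The rest of run is cut off, so whether it closes the browser is unknown. Assuming each start_tunnel launches its own Chromium, one thing worth checking is that every launched browser is closed on every code path, including errors and abandoned captchas. A minimal sketch; `withBrowser` is a hypothetical helper and `launch` stands for something like `() => puppeteer.launch({ headless: true })`:

```javascript
// Launches a browser for one task and closes it even if the task throws.
async function withBrowser(launch, task) {
  const browser = await launch();
  try {
    return await task(browser);
  } finally {
    // Always release the Chromium process, whatever happened in the task.
    await browser.close();
  }
}
```

This only prevents leaked processes; it does not reduce the memory each concurrent browser needs while a user is still solving the captcha.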


javascript node.js puppeteer

hre*_*tic

2018 07-12

Score: 0 | Answers: 1 | Views: 879

Problem running Puppeteer in headful mode in Docker

I am new to Puppeteer and Docker. I am having trouble setting up Puppeteer in headful mode inside a Docker container.

Puppeteer version: 1.6.2
Platform / OS version: Docker node:8-slim
Node.js version: node 8


Dockerfile:

FROM node:8-slim
RUN apt-get update && apt-get install --no-install-recommends -y ca-certificates curl fontconfig fonts-liberation gconf-service git libappindicator1 libasound2 libatk1.0-0 libc6 libcairo2 libcups2 libdbus-1-3 libexpat1 libfontconfig1 libgcc1 libgconf-2-4 libgdk-pixbuf2.0-0 libglib2.0-0 libgtk-3-0 libnspr4 libnss3 libpango-1.0-0 libpangocairo-1.0-0 libstdc++6 libx11-6 libx11-xcb1 libxcb1 libxcomposite1 libxcursor1 libxdamage1 libxext6 libxfixes3 libxi6 libxrandr2 libxrender1 libxss1 libxtst6 locales lsb-release unzip wget xdg-utils

RUN apt-get update && apt-get install -y wget --no-install-recommends && wget -q …


docker puppeteer

jsa*_*yce

2018 08-03

Score: 0 | Answers: 1 | Views: 1688

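With Puppeteer, how do I open a page, get the data, then go back to the previous page to get the next page in the list?

Situation:

Here is what I want to do:

1) I load page 0. Page 0 contains clickable links to different pages. I want to load the content of all of those pages. So:

2) Click the first link. Load page 1. Get the data. Go back to the previous page (page 0).

3) Click the second link, which loads page 2. And so on, until all links have been clicked.

In my current code, page 0 loads, then the first link is clicked and page 1 loads, and then it crashes with the following error: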

(node:2629) UnhandledPromiseRejectionWarning: Error: Protocol error (Runtime.callFunctionOn): Execution context was destroyed.


Question:

What am I doing wrong? How can I make the script behave as intended?

Code:

const puppeteer = require('puppeteer');
const fs = require('fs');

let getData = async () => {
    const browser = await puppeteer.launch({headless: false});
    const page = await browser.newPage();

    await page.goto('url', { waitUntil: 'networkidle2' });
    await page.setViewport({width: ..., height:...});

    const result = await page.evaluate(async () => {
        let data = []; 
        let elements = document.querySelector('.items').querySelectorAll('.item'); 

        for (const element of elements) {

            element.click();
            await new …
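
Clicking a link inside page.evaluate navigates away from page 0, which destroys the execution context that the evaluate callback is still running in; that matches the error above. One alternative, sketched here under the assumption that each item exposes a plain link with an href, is to collect the URLs first and then visit them from Node. `scrapeLinkedPages` and its parameters are hypothetical names:

```javascript
// Collects link URLs on the current page, then visits each one in turn.
// `extract` runs in the browser on every visited page and returns its data.
async function scrapeLinkedPages(page, linkSelector, extract) {
  const urls = await page.$$eval(linkSelector, (links) => links.map((a) => a.href));
  const results = [];
  for (const url of urls) {
    await page.goto(url, { waitUntil: 'networkidle2' });
    results.push({ url, data: await page.evaluate(extract) });
  }
  return results;
}
```

A call could look like `await scrapeLinkedPages(page, '.items .item a', () => document.title)`, where the selector is a guess at the page's markup. Since the URLs are gathered up front, there is no need to go back to page 0 between visits.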


javascript node.js puppeteer

The*_*mer

lucky-day

Score: 0 | Answers: 2 | Views: 2780