How can I intercept a client-side generated blob download with Puppeteer?

Pi-*_* Up 7 node.js google-chrome-headless puppeteer

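I have a page at this link ( https://master.d3tei1upkyr9mb.amplifyapp.com/report ) with 3 export buttons. These buttons generate the XLSX, CSV, and PDF files on the frontend, so there are no URLs for the XLSX, CSV, or PDF files.

I need Puppeteer to be able to download, retrieve, or intercept the blob or buffer of these files in my Node backend.

I have tried several approaches to achieve this but still have not figured it out.

The code below does this with the Playwright library, but I need to be able to do it with Puppeteer.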

const {chromium} = require('playwright');
const fs = require('fs');

(async () => {
    const browser = await chromium.launch();
    const context = await browser.newContext({acceptDownloads: true});
    const page = await context.newPage();

    await page.goto('http://localhost:3000/');

    const [ download ] = await Promise.all([
        page.waitForEvent('download'), // <-- start waiting for the download
        page.click('button#expoXLSX') // <-- perform the action that directly or indirectly initiates it.
    ]);

    const path = await download.path();

    console.log(path);

    const newFile = fs.readFileSync(path); // readFileSync is synchronous, so no await is needed

    console.log(newFile);

    fs.writeFile("test.xlsx", newFile,  "binary",function(err) {
        if(err) {
            console.log(err);
        } else {
            console.log("The file was saved!");
        }
    });

    await browser.close()
})();


Is there any way to do this?

wil*_*end 2

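Is there any reason not to simulate the click on the frontend and let Puppeteer download the file to a location of your choice? You can download the file easily like this:

Edit: You can determine when the download has finished by listening for the Page.downloadProgress event and checking for the completed state. This method does not 100% guarantee the actual file name saved to disk, but you can get the so-called suggestedFilename from the Page.downloadWillBegin event, which in my testing so far (at least on the example page from the question) does match the file name saved on disk.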

const puppeteer = require('puppeteer');
const path = require('path');
const downloadPath = path.resolve('./download');


(async ()=> {
  let fileName;
  const browser = await puppeteer.launch({
      headless: false
  });
  
  const page = await browser.newPage();
  await page.goto(
      'https://master.d3tei1upkyr9mb.amplifyapp.com/report', 
      { waitUntil: 'networkidle2' }
  );
  
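  // page._client is a private API that has changed across Puppeteer releases,
  // so open a dedicated CDP session instead.
  const client = await page.target().createCDPSession();
  await client.send('Page.enable'); // make sure Page.* events reach this new session
  await client.send('Page.setDownloadBehavior', {
      behavior: 'allow',
      downloadPath: downloadPath 
  });

  // Remember the file name Chrome suggests when the download starts.
  client.on('Page.downloadWillBegin', ({ url, suggestedFilename }) => {
    console.log('download beginning,', url, suggestedFilename);
    fileName = suggestedFilename;
  });

  // Fires repeatedly; state is 'completed' once the download has finished.
  client.on('Page.downloadProgress', ({ state }) => {
    if (state === 'completed') {
      console.log('download completed. File location: ', path.join(downloadPath, fileName));
    }
  });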

  await page.click('button#expoPDF');
})();
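Since the question asks for the file's buffer in the Node backend, the two events can also be wrapped in a promise that resolves with the file contents once the download completes. This is only a minimal sketch: `waitForDownload` is a hypothetical helper name, not part of Puppeteer, and it assumes the file ends up on disk under the suggested file name, which, as noted above, is not strictly guaranteed. It also assumes the CDP client exposes `on()` and `off()`.

```javascript
const fs = require('fs');
const path = require('path');

// Hypothetical helper (not a Puppeteer API): resolves with the downloaded
// file's path and contents once the CDP session reports state 'completed'.
// `client` is assumed to expose on()/off() like a Puppeteer CDPSession.
function waitForDownload(client, downloadDir, timeoutMs = 30000) {
  return new Promise((resolve, reject) => {
    let fileName;

    const onBegin = ({ suggestedFilename }) => {
      fileName = suggestedFilename;
    };

    const onProgress = ({ state }) => {
      if (state === 'inProgress') return; // keep waiting
      cleanup();
      if (state !== 'completed') {
        reject(new Error('download ended with state: ' + state));
        return;
      }
      try {
        // Assumes the file was saved under the suggested name.
        const filePath = path.join(downloadDir, fileName);
        resolve({ filePath, buffer: fs.readFileSync(filePath) });
      } catch (err) {
        reject(err);
      }
    };

    const timer = setTimeout(() => {
      cleanup();
      reject(new Error('download timed out'));
    }, timeoutMs);

    function cleanup() {
      clearTimeout(timer);
      client.off('Page.downloadWillBegin', onBegin);
      client.off('Page.downloadProgress', onProgress);
    }

    client.on('Page.downloadWillBegin', onBegin);
    client.on('Page.downloadProgress', onProgress);
  });
}

// Possible usage inside the async function (start waiting before clicking):
//   const client = await page.target().createCDPSession();
//   await client.send('Page.enable');
//   await client.send('Page.setDownloadBehavior', { behavior: 'allow', downloadPath });
//   const pending = waitForDownload(client, downloadPath);
//   await page.click('button#expoPDF');
//   const { buffer } = await pending;
```

Creating the promise before the click mirrors the Promise.all pattern in the Playwright code from the question, so the download events cannot be missed.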