Is it possible to use puppeteer to pass a Javascript object to nodejs?

Sea*_*ean 6 google-chrome node.js puppeteer tensorflow.js

Background

I am using PoseNet (see the in-browser demo here) for keypoint detection. I have set it up to run on a WebRTC MediaStream, such that:

Client: Runs in a chrome tab on machine A. Initializes WebRTC connection and sends a MediaStream to Server. Receives back real time keypoint data from Server via WebRTC's DataChannel.

Server: Runs in a chrome tab on machine B, receives a WebRTC stream and passes the corresponding MediaStream to Posenet. Posenet does its thing and computes keypoints. This keypoint data is then sent back to the client via WebRTC's DataChannel (if you have a better idea, I'm all ears).

Problem: I would like to have the server receive multiple streams from various clients and run Posenet on each, sending real time keypoint data to all clients. Though I'm not thrilled about the server utilizing Chrome, I am fine with using puppeteer and Chrome's headless mode for now, mainly to abstract away WebRTC's complexity.

Approaches

I have tried two approaches, being heavily in favor of approach #2:

Approach #1

Run @tensorflow/tfjs inside the puppeteer context (i.e. inside a headless chrome tab). However, I cannot seem to get the PoseNet Browser Demo working in headless mode, due to some WebGL error (it does work in non-headless mode though). I tried the following (passing args to puppeteer.launch() to enable WebGL, though I haven't had any luck - see here and here for reference):

const puppeteer = require('puppeteer');

async function main() {
  const browser = await puppeteer.launch({
    headless: true,
    args: ['--enable-webgl-draft-extensions', '--enable-webgl-image-chromium', '--enable-webgl-swap-chain', '--enable-webgl2-compute-context']
  });
  const page = await browser.newPage();
  await page.goto('https://storage.googleapis.com/tfjs-models/demos/posenet/camera.html', {
    waitUntil: 'networkidle2'
  });
  // Make chromium console calls available to nodejs console
  page.on('console', msg => {
    for (let i = 0; i < msg.args().length; ++i)
      console.log(`${i}: ${msg.args()[i]}`);
  });
}

main();


In headless mode, I am receiving this error message.

0: JSHandle:Initialization of backend webgl failed
0: JSHandle:Error: WebGL is not supported on this device

This leaves me with question #1: How do I enable WebGL in puppeteer?

Approach #2

Preferably, I would like to run posenet using the @tensorflow/tfjs-node backend, to accelerate computation. Therefore, I would like to link puppeteer and @tensorflow/tfjs-node, s.t.:

  • The puppeteer-chrome-tab talks WebRTC with the client. It makes a MediaStream object available to node.
  • node takes this MediaStream and passes it to posenet (and thus @tensorflow/tfjs-node), where the machine learning magic happens. node then passes the detected keypoints back to puppeteer-chrome-tab, which uses its RTCDataChannel to communicate them back to the client. (A rough sketch of this wiring follows below.)
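A minimal sketch of how I imagine this wiring, assuming the frame data could somehow be handed to node (the exposed function name onFrame and the client page URL are hypothetical):

const puppeteer = require('puppeteer');
const posenet = require('@tensorflow-models/posenet');

async function run() {
  const browser = await puppeteer.launch({ headless: true });
  const page = await browser.newPage();
  const net = await posenet.load();

  // Hypothetical bridge: the browser context pushes per-frame data into node.
  // Arguments passed to an exposed function are JSON-serialized by puppeteer.
  await page.exposeFunction('onFrame', async (frameData) => {
    // frameData would need to be something posenet in node can consume
    // (e.g. a tensor built from raw pixels) -- this is exactly the missing piece.
  });

  await page.goto('https://example.com/webrtc-receiver.html'); // hypothetical page
}

run();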

Problem

The problem is that I cannot seem to get access to puppeteer's MediaStream object within node, to pass this object to posenet. I'm only getting access to JSHandles and ElementHandles. Is it possible to pass the JavaScript object associated with the handle to node?

Concretely, this error is thrown:

UnhandledPromiseRejectionWarning: Error: When running in node, pixels must be an HTMLCanvasElement like the one returned by the `canvas` npm package
    at NodeJSKernelBackend.fromPixels (/home/work/code/node_modules/@tensorflow/tfjs-node/dist/nodejs_kernel_backend.js:1464:19)
    at Engine.fromPixels (/home/work/code/node_modules/@tensorflow/tfjs-core/dist/engine.js:749:29)
    at fromPixels_ (/home/work/code/node_modules/@tensorflow/tfjs-core/dist/ops/browser.js:85:28)
    at Object.fromPixels (/home/work/code/node_modules/@tensorflow/tfjs-core/dist/ops/operation.js:46:29)
    at toInputTensor (/home/work/code/node_modules/@tensorflow-models/posenet/dist/util.js:164:60)
    at /home/work/code/node_modules/@tensorflow-models/posenet/dist/util.js:198:27
    at /home/work/code/node_modules/@tensorflow/tfjs-core/dist/engine.js:349:22
    at Engine.scopedRun (/home/work/code/node_modules/@tensorflow/tfjs-core/dist/engine.js:359:23)
    at Engine.tidy (/home/work/code/node_modules/@tensorflow/tfjs-core/dist/engine.js:348:21)
    at Object.tidy (/home/work/code/node_modules/@tensorflow/tfjs-core/dist/globals.js:164:28)

Logging the pixels argument passed to NodeJSKernelBackend.prototype.fromPixels = function (pixels, numChannels) {..} shows that it is an ElementHandle. I know that I can access serialized properties of a JavaScript object using puppeteer's page.evaluate. However, if I were to pass the CanvasRenderingContext2D's imageData (obtained via getImageData()) to node by calling page.evaluate(..), this would mean stringifying the entire raw image and then reconstructing it in node's context.
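For illustration, the round trip I am trying to avoid would look roughly like this (the canvas selector and frame size are hypothetical):

const tf = require('@tensorflow/tfjs-node');

// Everything returned from page.evaluate is JSON-serialized, so the whole
// pixel buffer gets stringified in the browser and re-parsed in node.
const pixels = await page.evaluate(() => {
  const canvas = document.querySelector('canvas'); // hypothetical selector
  const ctx = canvas.getContext('2d');
  const imageData = ctx.getImageData(0, 0, canvas.width, canvas.height);
  return Array.from(imageData.data); // huge plain array
});

// Rebuild a tensor in node from the transferred pixels (assumed 640x480 RGBA).
const input = tf.tensor3d(Uint8Array.from(pixels), [480, 640, 4], 'int32');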

This leaves me with question #2: Is there any way to get (read-only) access to an object in the puppeteer context directly from within node, without having to go through e.g. page.evaluate(..)?

Tho*_*orf 6

The alternative approach I would recommend is to drop the idea of using puppeteer on the server side and instead implement an actual WebRTC peer in Node.js, then run PoseNet via @tensorflow/tfjs-node.

Why not use puppeteer on the server side

Using puppeteer on the server side introduces a lot of complexity. In addition to active WebRTC connections to multiple clients, you now also have to manage one browser (or at least one tab) per connection. So not only do you have to think about what happens when the connection to a client fails, you also have to prepare for other scenarios: browser crashes, page crashes, WebGL support (per page), the document in the browser not loading, memory/CPU usage of the browser instances, ...

That said, let's go over your approaches.

Approach 1: Running Tensorflow.js inside puppeteer

You should be able to run this using only the cpu backend. You can set the backend like this before any other code runs:

tf.setBackend('cpu');

You might also be able to get WebGL running (you are not the only one having problems with WebGL and puppeteer), but even if you get it to work, you would be running a Node.js script that starts a Chrome browser, which in turn starts a WebRTC session and runs Tensorflow.js inside a website. Complexity-wise, this will be very hard to debug if any problems occur...
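If you do want to try to get WebGL working, launch flags along the following lines are commonly suggested for forcing software rendering in headless Chromium (no guarantee this works for the PoseNet demo):

// Sketch: commonly suggested flags for software WebGL in headless Chromium.
const browser = await puppeteer.launch({
  headless: true,
  args: ['--use-gl=swiftshader', '--ignore-gpu-blacklist']
});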

Approach 2: Transferring data between puppeteer and Node.js

This approach is close to impossible without major slowdowns (with regard to sending and receiving frames). puppeteer needs to serialize any data that is exchanged. There is no such thing as shared memory or shared data objects between the Node.js and the browser environment. This means you would have to serialize every frame (all of its pixels...) to transfer it from the browser environment to Node.js. Performance-wise, this might work okay for small images, but it gets worse the bigger your images are.
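To put rough numbers on it (illustrative resolution and frame rate): a single 640×480 RGBA frame is 640 × 480 × 4 ≈ 1.2 MB of raw pixel data, so at 30 fps you would be serializing and re-parsing roughly 37 MB of pixels per second just to move the frames across the puppeteer boundary.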


All in all, you are introducing a lot of complexity if you want to go with one of your two approaches. So let's look at an alternative.

Alternative approach: Send your video stream directly to your server

Instead of using puppeteer to establish the WebRTC connection, you can implement the WebRTC peer directly. I read from your question that you are afraid of the complexity, but it is probably worth the hassle.

To implement a WebRTC server, you can use the library node-webrtc, which allows you to implement WebRTC peers on the server side. There are multiple examples, one of which is very interesting for your use case: the video-compositing example. It establishes a connection between a client (browser) and a server (Node.js) to stream a video; the server then modifies the sent frames and puts a "watermark" on them.

Code sample

The following code shows the most relevant lines from the video-compositing example. It reads a frame from the input stream and creates a node-canvas object from it.

const lastFrameCanvas = createCanvas(lastFrame.width,  lastFrame.height);
const lastFrameContext = lastFrameCanvas.getContext('2d', { pixelFormat: 'RGBA24' });

const rgba = new Uint8ClampedArray(lastFrame.width *  lastFrame.height * 4);
const rgbaFrame = createImageData(rgba, lastFrame.width, lastFrame.height);
i420ToRgba(lastFrame, rgbaFrame);

lastFrameContext.putImageData(rgbaFrame, 0, 0);
context.drawImage(lastFrameCanvas, 0, 0);
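The i420ToRgba step is there because the frames delivered by node-webrtc's RTCVideoSink arrive in I420 (YUV) format, while node-canvas and tfjs work with RGBA pixel data.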

You now have a canvas object that you can feed to PoseNet like this:

const net = await posenet.load();

// ...
const input = tf.browser.fromPixels(lastFrameCanvas);
const pose = await net.estimateSinglePose(input, /* ... */);

The resulting data now needs to be transferred back to the client, which can be done via a data channel. There is also an example for that in the repository (ping-pong), which is much simpler than the video example.
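A minimal sketch of that direction, assuming dataChannel is an already-open RTCDataChannel on the same peer connection:

// Sketch: send the detected keypoints back to the client over the data channel.
const pose = await net.estimateSinglePose(input);
dataChannel.send(JSON.stringify(pose.keypoints));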

Although you might be worried about the complexity of using node-webrtc, I recommend giving this approach and node-webrtc-examples a try. You can check out the repository first; all of its examples are ready to try out.

  • For anyone wondering how this ended: I managed to build what I needed using node-webrtc and RTCVideoSink as proposed by @ThomasDondorf. I wrote a custom signaling server for WebRTC that handles connections and spawns Node.js worker threads. The only remaining problem is making node's canvas library context-aware - it currently does not support multiple worker threads. There is some discussion going on atm: https://github.com/Automattic/node-canvas/issues/1394
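For anyone curious, the worker-per-connection idea mentioned in the comment above could look roughly like this with Node's built-in worker_threads module (the worker file name and message shape are hypothetical):

// Sketch: spawn one worker thread per incoming peer connection (hypothetical worker file).
const { Worker } = require('worker_threads');

function handleNewPeer(peerId) {
  const worker = new Worker('./pose-worker.js', { workerData: { peerId } });
  worker.on('message', (keypoints) => {
    // forward keypoints to the RTCDataChannel belonging to peerId here
  });
  worker.on('error', (err) => console.error(`worker for ${peerId} failed`, err));
}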