Docker 上的 Node.js + Puppeteer,没有可用的沙箱

Tx_*_*ter 10 google-chrome node.js docker docker-compose puppeteer

我正在构建一个 node.js LTS 应用程序。我跟着 puppeteer 文档,所以我的 Dockerfile 有这个内容:

FROM node:12.18.0

WORKDIR /home/node/app
ADD package*.json ./

# Install latest chrome dev package and fonts to support major charsets (Chinese, Japanese, Arabic, Hebrew, Thai and a few others)
# Note: this installs the necessary libs to make the bundled version of Chromium that Puppeteer
# installs, work.
RUN apt-get update \
    && apt-get install -y wget gnupg \
    && wget -q -O - https://dl-ssl.google.com/linux/linux_signing_key.pub | apt-key add - \
    && sh -c 'echo "deb [arch=amd64] http://dl.google.com/linux/chrome/deb/ stable main" >> /etc/apt/sources.list.d/google.list' \
    && apt-get update \
    && apt-get install -y google-chrome-unstable fonts-ipafont-gothic fonts-wqy-zenhei fonts-thai-tlwg fonts-kacst fonts-freefont-ttf \
      --no-install-recommends \
    && rm -rf /var/lib/apt/lists/*

# Install node modules
RUN npm i

# Add user so we don't need --no-sandbox.
RUN groupadd -r -f audio \
    && groupadd -r -f video \
    && usermod -a -G audio,video node \
    && mkdir -p /home/node/Downloads \
    && chown -R node:node /home/node

USER node

CMD ["google-chrome-unstable"]
Run Code Online (Sandbox Code Playgroud)

应用程序构建并运行良好,但是一旦我尝试启动浏览器,await puppeteer.launch();我就会收到此错误:

pdf    | Error: Failed to launch the browser process!
pdf    | [0612/133635.958777:FATAL:zygote_host_impl_linux.cc(116)] No usable sandbox! Update your kernel or see https://chromium.googlesource.com/chromium/src/+/master/docs/linux/suid_sandbox_development.md for more information on developing with the SUID sandbox. If you want to live dangerously and need an immediate workaround, you can try using --no-sandbox.
pdf    | #0 0x5638d5faa399 base::debug::CollectStackTrace()
pdf    | #1 0x5638d5f0b2a3 base::debug::StackTrace::StackTrace()
pdf    | #2 0x5638d5f1cc95 logging::LogMessage::~LogMessage()
pdf    | #3 0x5638d77f940e service_manager::ZygoteHostImpl::Init()
pdf    | #4 0x5638d5ad5060 content::ContentMainRunnerImpl::Initialize()
pdf    | #5 0x5638d5b365e7 service_manager::Main()
pdf    | #6 0x5638d5ad3631 content::ContentMain()
pdf    | #7 0x5638d5b3580d headless::(anonymous namespace)::RunContentMain()
pdf    | #8 0x5638d5b3550c headless::HeadlessShellMain()
pdf    | #9 0x5638d35295a7 ChromeMain
pdf    | #10 0x7fc01f0492e1 __libc_start_main
pdf    | #11 0x5638d35293ea _start
pdf    | 
pdf    | Received signal 6
pdf    | #0 0x5638d5faa399 base::debug::CollectStackTrace()
pdf    | #1 0x5638d5f0b2a3 base::debug::StackTrace::StackTrace()
pdf    | #2 0x5638d5fa9f35 base::debug::(anonymous namespace)::StackDumpSignalHandler()
pdf    | #3 0x7fc0255f30e0 (/lib/x86_64-linux-gnu/libpthread-2.24.so+0x110df)
pdf    | #4 0x7fc01f05bfff gsignal
pdf    | #5 0x7fc01f05d42a abort
pdf    | #6 0x5638d5fa8e95 base::debug::BreakDebugger()
pdf    | #7 0x5638d5f1d132 logging::LogMessage::~LogMessage()
pdf    | #8 0x5638d77f940e service_manager::ZygoteHostImpl::Init()
pdf    | #9 0x5638d5ad5060 content::ContentMainRunnerImpl::Initialize()
pdf    | #10 0x5638d5b365e7 service_manager::Main()
pdf    | #11 0x5638d5ad3631 content::ContentMain()
pdf    | #12 0x5638d5b3580d headless::(anonymous namespace)::RunContentMain()
pdf    | #13 0x5638d5b3550c headless::HeadlessShellMain()
pdf    | #14 0x5638d35295a7 ChromeMain
pdf    | #15 0x7fc01f0492e1 __libc_start_main
pdf    | #16 0x5638d35293ea _start
pdf    |   r8: 0000000000000000  r9: 00007ffcd14664d0 r10: 0000000000000008 r11: 0000000000000246
pdf    |  r12: 00007ffcd1467788 r13: 00007ffcd1466760 r14: 00007ffcd1467790 r15: aaaaaaaaaaaaaaaa
pdf    |   di: 0000000000000002  si: 00007ffcd14664d0  bp: 00007ffcd1466710  bx: 0000000000000006
pdf    |   dx: 0000000000000000  ax: 0000000000000000  cx: 00007fc01f05bfff  sp: 00007ffcd1466548
pdf    |   ip: 00007fc01f05bfff efl: 0000000000000246 cgf: 002b000000000033 erf: 0000000000000000
pdf    |  trp: 0000000000000000 msk: 0000000000000000 cr2: 0000000000000000
pdf    | [end of stack trace]
pdf    | Calling _exit(1). Core file will not be generated.
pdf    | 
pdf    | 
pdf    | TROUBLESHOOTING: https://github.com/puppeteer/puppeteer/blob/master/docs/troubleshooting.md
pdf    | 
pdf    |     at onClose (/home/node/app/node_modules/puppeteer/lib/launcher/BrowserRunner.js:159:20)
pdf    |     at Interface.<anonymous> (/home/node/app/node_modules/puppeteer/lib/launcher/BrowserRunner.js:149:65)
pdf    |     at Interface.emit (events.js:327:22)
pdf    |     at Interface.close (readline.js:416:8)
pdf    |     at Socket.onend (readline.js:194:10)
pdf    |     at Socket.emit (events.js:327:22)
pdf    |     at endReadableNT (_stream_readable.js:1221:12)
pdf    |     at processTicksAndRejections (internal/process/task_queues.js:84:21)
Run Code Online (Sandbox Code Playgroud)

哦,是的,容器名称是 pdf

我尝试按照建议查看 puppeteer 故障排除页面,但没有找到任何解决方案。

有什么建议?

Ahm*_*lly 15

您应该在启动浏览器时传递 --no-sandbox, --disable-setuid-sandbox 参数。这是我的 docker 文件和小脚本。它运行成功。

您可以通过此参考资料了解更多关于 puppeteer with docker 的信息。

  1. https://github.com/buildkite/docker-puppeteer
  2. https://github.com/alekzonder/docker-puppeteer

文件


FROM node:12.18.0

RUN  apt-get update \
     && apt-get install -y wget gnupg ca-certificates \
     && wget -q -O - https://dl-ssl.google.com/linux/linux_signing_key.pub | apt-key add - \
     && sh -c 'echo "deb [arch=amd64] http://dl.google.com/linux/chrome/deb/ stable main" >> /etc/apt/sources.list.d/google.list' \
     && apt-get update \
     # We install Chrome to get all the OS level dependencies, but Chrome itself
     # is not actually used as it's packaged in the node puppeteer library.
     # Alternatively, we could could include the entire dep list ourselves
     # (https://github.com/puppeteer/puppeteer/blob/master/docs/troubleshooting.md#chrome-headless-doesnt-launch-on-unix)
     # but that seems too easy to get out of date.
     && apt-get install -y google-chrome-stable \
     && rm -rf /var/lib/apt/lists/* \
     && wget --quiet https://raw.githubusercontent.com/vishnubob/wait-for-it/master/wait-for-it.sh -O /usr/sbin/wait-for-it.sh \
     && chmod +x /usr/sbin/wait-for-it.sh

# Install Puppeteer under /node_modules so it's available system-wide
ADD package.json package-lock.json /
RUN npm install

CMD ["node", "index.js"]
Run Code Online (Sandbox Code Playgroud)

索引.js

const puppeteer = require('puppeteer');

(async() => {

    const browser = await puppeteer.launch({
        args: [
            '--no-sandbox',
            '--disable-setuid-sandbox'
        ]
    });

    const page = await browser.newPage();

    await page.goto('https://www.google.com/', {waitUntil: 'networkidle2'});

    browser.close();

})();
Run Code Online (Sandbox Code Playgroud)

  • 太感谢了!我尝试了大约 20 个“解决方案”,除了你的之外,没有任何效果! (2认同)

Tx_*_*ter 9

我找到了一种允许使用 chrome 沙箱的方法,这要归功于这里的usethe4ce 答案

最初我需要与 puppeteer 分开安装 chrome,我编辑了我的 Dockerfile 如下:

FROM node:12.18.0

WORKDIR /home/runner/app
ADD package*.json ./

# Install latest chrome dev package and fonts to support major charsets (Chinese, Japanese, Arabic, Hebrew, Thai and a few others)
# Note: this installs the necessary libs to make the bundled version of Chromium that Puppeteer
# installs, work.
RUN wget -q -O - https://dl-ssl.google.com/linux/linux_signing_key.pub | apt-key add - \
    && sh -c 'echo "deb [arch=amd64] http://dl.google.com/linux/chrome/deb/ stable main" >> /etc/apt/sources.list.d/google.list' \
    && apt-get update \
    && apt-get install -y google-chrome-unstable fonts-ipafont-gothic fonts-wqy-zenhei fonts-thai-tlwg fonts-kacst ttf-freefont \
        --no-install-recommends \
    && rm -rf /var/lib/apt/lists/*

# Uncomment to skip the chromium download when installing puppeteer. If you do,
# you'll need to launch puppeteer with:
#     browser.launch({executablePath: 'google-chrome-unstable'})
ENV PUPPETEER_SKIP_CHROMIUM_DOWNLOAD true


# Install node modules
RUN npm i\
    # Add user so we don't need --no-sandbox.
    # same layer as npm install to keep re-chowned files from using up several hundred MBs more space
    && groupadd -r runner && useradd -r -g runner -G audio,video runner \
    && mkdir -p /home/runner/Downloads \
    && chown -R runner:runner /home/runner \
    && chown -R runner:runner /home/runner/app/node_modules

USER runner

CMD ["google-chrome-unstable"]
Run Code Online (Sandbox Code Playgroud)

这样做,错误从更改No usable sandbox为:

Failed to move to new namespace: PID namespaces supported, Network namespace supported, but failed: errno = Operation not permitted
Run Code Online (Sandbox Code Playgroud)

然后我遵循了 usethe4ce 的回答建议。默认情况下,Docker 会阻止某些内核级操作的可访问性,Seccomp 选项允许“解锁”chrome 创建自己的沙箱所需的一些操作。因此,我将此chrome.json文件添加到我的存储库中,并按如下方式编辑了我的 docker-compose 文件:

version: "3.8"

services:
  <service name>:
    build:
      <build options>
    init: true
    security_opt: 
      - seccomp=<path to chrome.json file>
    [...]

Run Code Online (Sandbox Code Playgroud)

如果您不使用 docker-compose 文件,则可以使用--security-opt seccomp=path/to/chrome.json链接答案中建议的选项运行容器。

最后使用以下命令启动浏览器:

await puppeteer.launch({
  executablePath: 'google-chrome-unstable'
});
Run Code Online (Sandbox Code Playgroud)

编辑:

不适合使用自定义安装的 chrome,因为 puppeteer 无法完全支持其版本。唯一能保证与特定 puppeteer 版本一起使用的版本是捆绑版本。

所以我建议使用 security_opt 如上所述,只需忽略自定义安装部分。


Ham*_*bot 6

我终于找到了如何使用沙箱运行它,但仅在我的本地计算机上运行。我只需阅读并应用官方 github 存储库上的文档:

我缺少的部分是使用以下选项运行图像--cap-add=SYS_ADMIN

docker run --cap-add=SYS_ADMIN <YOUR_IMAGE_NAME>
Run Code Online (Sandbox Code Playgroud)

然而,这看起来像是一个安全流程,因为它似乎为您的容器提供了对主机的一些访问权限。如果您正在阅读本文,那么这不一定是您想要做的,因为您绝对想使用 Chrome 沙箱。

我的最后一个用例是在 Cloud Run 上运行我的容器,在我看来,他们绝不会允许这样的标志。如果我最终让它在 Cloud Run 上与沙箱一起工作,我将编辑我的答案......

编辑:没关系,它只是在 Cloud Run 上没有任何标志的情况下工作!所以,是的,我将继续--cap-add=SYS_ADMIN在我的开发机器上使用该标志,这对我来说似乎很好。


这是我现在适用的完整 Dockerfile:

FROM node:14-slim

WORKDIR /app

RUN apt-get update \
    && apt-get install -y wget gnupg \
    && wget -q -O - https://dl-ssl.google.com/linux/linux_signing_key.pub | apt-key add - \
    && sh -c 'echo "deb [arch=amd64] http://dl.google.com/linux/chrome/deb/ stable main" >> /etc/apt/sources.list.d/google.list' \
    && apt-get update \
    && apt-get install -y google-chrome-stable fonts-ipafont-gothic fonts-wqy-zenhei fonts-thai-tlwg fonts-kacst fonts-freefont-ttf libxss1 \
      --no-install-recommends \
    && rm -rf /var/lib/apt/lists/*

COPY . .

RUN yarn \
    && groupadd -r pptruser && useradd -r -g pptruser -G audio,video pptruser \
    && mkdir -p /home/pptruser/Downloads \
    && chown -R pptruser:pptruser /home/pptruser \
    && chown -R pptruser:pptruser /app

# Run everything after as non-privileged user.
USER pptruser

CMD node src/index.js
Run Code Online (Sandbox Code Playgroud)

还有我的src/index.js文件:

const puppeteer = require('puppeteer')

const main = async () => {
  console.log('Starting browser')
  const browser = await puppeteer.launch()
  console.log('Opening a new page')
  const page = await browser.newPage()
  console.log('Navigating to google')
  await page.goto('https://www.google.fr', {
    waitUntil: 'networkidle2'
  })
  console.log('closing browser')
  await browser.close()
}

main()
Run Code Online (Sandbox Code Playgroud)