NumPy from the Alpine package repository fails to import its C extensions

Jer*_*ark 2 numpy pandas docker alpine-linux

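I am building a Docker image that needs pandas and numpy, but installing them through pip takes about 20 minutes, which is too long for my use case. I then chose to install pandas and numpy from the Alpine package repository instead, but numpy does not seem to import correctly.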

Here is my Dockerfile:

# syntax=docker/dockerfile:experimental
FROM python:3.9.5-alpine as base

FROM base as builder
RUN apk add build-base gcc musl-dev

RUN --mount=type=cache,target=/root/.cache/pip \
    pip install --target="/install" django

FROM base
RUN apk add py3-pandas py3-numpy

COPY --from=builder /install /usr/local/lib/python3.9/site-packages

ENV PYTHONPATH "${PYTHONPATH}:/usr/lib/python3.9/site-packages"

CMD ["python"]

When I try to import pandas, which depends on numpy, I get this error:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib/python3.9/site-packages/pandas/__init__.py", line 16, in <module>
    raise ImportError(
ImportError: Unable to import required dependencies:
numpy: 

IMPORTANT: PLEASE READ THIS FOR ADVICE ON HOW TO SOLVE THIS ISSUE!

Importing the numpy C-extensions failed. This error can happen for
many reasons, often due to issues with your setup or how NumPy was
installed.

We have compiled some common reasons and troubleshooting tips at:

    https://numpy.org/devdocs/user/troubleshooting-importerror.html

Please note and check the following:

  * The Python version is: Python3.9 from "/usr/local/bin/python"
  * The NumPy version is: "1.20.3"

and make sure that they are the versions you expect.
Please carefully study the documentation linked above for further help.

Original error was: No module named 'numpy.core._multiarray_umath'

If I import numpy directly, I get this error:

Traceback (most recent call last):
  File "/usr/lib/python3.9/site-packages/numpy/core/__init__.py", line 22, in <module>
    from . import multiarray
  File "/usr/lib/python3.9/site-packages/numpy/core/multiarray.py", line 12, in <module>
    from . import overrides
  File "/usr/lib/python3.9/site-packages/numpy/core/overrides.py", line 7, in <module>
    from numpy.core._multiarray_umath import (
ModuleNotFoundError: No module named 'numpy.core._multiarray_umath'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib/python3.9/site-packages/numpy/__init__.py", line 145, in <module>
    from . import core
  File "/usr/lib/python3.9/site-packages/numpy/core/__init__.py", line 48, in <module>
    raise ImportError(msg)
ImportError: 

IMPORTANT: PLEASE READ THIS FOR ADVICE ON HOW TO SOLVE THIS ISSUE!

Importing the numpy C-extensions failed. This error can happen for
many reasons, often due to issues with your setup or how NumPy was
installed.

We have compiled some common reasons and troubleshooting tips at:

    https://numpy.org/devdocs/user/troubleshooting-importerror.html

Please note and check the following:

  * The Python version is: Python3.9 from "/usr/local/bin/python"
  * The NumPy version is: "1.20.3"

and make sure that they are the versions you expect.
Please carefully study the documentation linked above for further help.

Original error was: No module named 'numpy.core._multiarray_umath'

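I am at my wits' end trying to figure out what I am missing or doing wrong. I have already tried the troubleshooting tips at the URL given in the error trace, but none of them resolved the problem.

Any help is greatly appreciated.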

Mr.*_* 47 7

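I know it has been a while since this question was asked, and you may already have found a solution or moved from Alpine to another distribution. But I ran into the same problem, and this question was the first thing that came up in my search. So, after spending a few hours finding a solution, I thought it was worth documenting here.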

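The problem is (obviously) with the numpy and pandas packages. I used the prebuilt wheels from the community repository and ran into the same problem as you did. So, evidently, the build process itself introduces the issue. Specifically, if you look inside numpy/core under the install location (/usr/lib/python3.9/site-packages), you will find that all C extensions have .cpython-39-x86_64-linux-musl in their names. For example, the module you are having trouble with, numpy.core._multiarray_umath, is named _multiarray_umath.cpython-39-x86_64-linux-musl.so rather than just _multiarray_umath.so. Removing .cpython-39-x86_64-linux-musl from those file names resolved the issue (Edit: see the addendum for details).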

To fix it, add the following line to the Dockerfile after installing py3-pandas and py3-numpy:

RUN find /usr/lib/python3.9/site-packages -iname "*.so" -exec sh -c 'x="{}"; mv "$x" "${x/cpython-39-x86_64-linux-musl./}"' \;
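The same rename can be sketched in Python, for example to preview which files would change before baking the step into the image. This is only an illustration, not part of the fix itself: the function name strip_platform_tag, its parameters, and the dry_run switch are hypothetical, and the tag string is copied from the file names described above.

```python
from pathlib import Path

# Platform tag taken from the file names described above.
# Assumption: adjust it for other Python versions or architectures.
MUSL_TAG = "cpython-39-x86_64-linux-musl."


def strip_platform_tag(root, tag=MUSL_TAG, dry_run=True):
    """Return (old, new) path pairs for every *.so under root whose name contains tag.

    With dry_run=False the files are also renamed on disk.
    Note: rglob("*.so") is case-sensitive, unlike find's -iname.
    """
    renames = []
    # sorted() collects all matches first, so renaming does not disturb the walk.
    for path in sorted(Path(root).rglob("*.so")):
        if tag not in path.name:
            continue  # leave libraries without the tag untouched
        target = path.with_name(path.name.replace(tag, "", 1))
        renames.append((path, target))
        if not dry_run:
            path.rename(target)
    return renames


if __name__ == "__main__":
    # Preview only; nothing is renamed while dry_run is True.
    for old, new in strip_platform_tag("/usr/lib/python3.9/site-packages"):
        print(old.name, "->", new.name)
```

Running it with dry_run=True only lists the planned renames, which makes it easy to confirm that nothing except the tagged extension modules would be touched.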

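PS: After looking into this further, I found the culprit: for some reason, Python running under Alpine thinks its full platform extension suffix (available from importlib.machinery.EXTENSION_SUFFIXES) should be cpython-39-x86_64-linux-gnu.so instead of cpython-39-x86_64-linux-musl.so. I don't believe it was built against glibc, but who knows. So simply changing musl to gnu in the names of the shared objects above also works. I don't know why the extension suffix generated at build time differs from the one Python uses at runtime.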
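To see why the musl-tagged files are skipped, it helps to recall that the import system looks for a file named exactly module name plus one of the accepted suffixes. The following is a minimal sketch of that check, not the interpreter's actual code: the helper matches_module is hypothetical, and the example suffix list is an assumption that combines the gnu suffix reported above with the .abi3.so and .so entries CPython on Linux typically accepts.

```python
import importlib.machinery


def matches_module(filename, module_name, suffixes=None):
    """Return True if filename is module_name plus one of the accepted extension suffixes."""
    if suffixes is None:
        # Suffixes accepted by the interpreter that runs this script.
        suffixes = importlib.machinery.EXTENSION_SUFFIXES
    return any(filename == module_name + suffix for suffix in suffixes)


# Assumed example list: the gnu suffix is the one reported above,
# the other two are the usual extras on CPython for Linux.
REPORTED_SUFFIXES = [".cpython-39-x86_64-linux-gnu.so", ".abi3.so", ".so"]

if __name__ == "__main__":
    # What this interpreter accepts; run it inside the container to compare.
    print(importlib.machinery.EXTENSION_SUFFIXES)
    for name in (
        "_multiarray_umath.cpython-39-x86_64-linux-musl.so",
        "_multiarray_umath.cpython-39-x86_64-linux-gnu.so",
        "_multiarray_umath.so",
    ):
        print(name, matches_module(name, "_multiarray_umath", REPORTED_SUFFIXES))
```

Under that assumed list, only the musl-tagged name fails the check, which is consistent with both workarounds: dropping the tag entirely or swapping musl for gnu.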