
[Bug] Cannot install torch-npu==2.3.1, torch==2.3.1 and torchvision==0.18.1 because these package versions have conflicting dependencies. #2745

Open
jiabao-wang opened this issue Nov 13, 2024 · 3 comments
@jiabao-wang

Checklist

  • 1. I have searched related issues but cannot get the expected help.
  • 2. The bug has not been fixed in the latest version.
  • 3. Please note that if the bug-related issue you submitted lacks corresponding environment info and a minimal reproducible demo, it will be challenging for us to reproduce and resolve the issue, reducing the likelihood of receiving feedback.

Describe the bug

When I run:

DOCKER_BUILDKIT=1 docker build -t lmdeploy-aarch64-ascend:latest \
    -f docker/Dockerfile_aarch64_ascend .

the build fails with:

ERROR: Cannot install torch-npu==2.3.1, torch==2.3.1 and torchvision==0.18.1 because these package versions have conflicting dependencies.
341.8
341.8 The conflict is caused by:
341.8 The user requested torch==2.3.1
341.8 torchvision 0.18.1 depends on torch==2.3.1
341.8 torch-npu 2.3.1 depends on torch==2.3.1+cpu
341.8
341.8 To fix this you could try to:
341.8 1. loosen the range of package versions you've specified
341.8 2. remove package versions to allow pip to attempt to solve the dependency conflict
341.8
341.8 ERROR: ResolutionImpossible: for help visit https://pip.pypa.io/en/latest/topics/dependency-resolution/#dealing-with-dependency-conflicts

Dockerfile_aarch64_ascend:110

 109 |     # timm is required for internvl2 model
 110 | >>> RUN --mount=type=cache,target=/root/.cache/pip \
 111 | >>>     pip3 install torch==2.3.1 torchvision==0.18.1 torch-npu==2.3.1 && \
 112 | >>>     pip3 install transformers timm && \
 113 | >>>     pip3 install dlinfer-ascend
 114 |

ERROR: failed to solve: process "/bin/bash -c pip3 install torch==2.3.1 torchvision==0.18.1 torch-npu==2.3.1 && pip3 install transformers timm && pip3 install dlinfer-ascend" did not complete successfully: exit code: 1
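For reference, the conflict comes down to PEP 440 local version labels: a bare pin like torch==2.3.1 (what torchvision requires) is satisfied by either 2.3.1 or 2.3.1+cpu, but torch-npu's strict pin torch==2.3.1+cpu is not satisfied by the plain 2.3.1 wheel that the PyPI mirror serves, so the resolver has no torch candidate that satisfies both at once. A minimal sketch of this matching rule (pin_matches is a hypothetical helper for illustration, not pip's actual code):

```python
def pin_matches(required: str, candidate: str) -> bool:
    """Sketch of PEP 440 '==' matching with local version labels.

    If the pin carries a local label (e.g. '2.3.1+cpu'), the candidate
    must match it exactly; if the pin is bare (e.g. '2.3.1'), the
    candidate's local label is ignored.
    """
    if "+" in required:
        return candidate == required
    return candidate.split("+")[0] == required

# torchvision's bare pin accepts either build of torch 2.3.1 ...
print(pin_matches("2.3.1", "2.3.1+cpu"))   # True
# ... but torch-npu's pin rejects the plain wheel available on the mirror.
print(pin_matches("2.3.1+cpu", "2.3.1"))   # False
```

Since the +cpu builds are only published on the PyTorch CPU index, not on PyPI, pip correctly reports ResolutionImpossible here.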

Reproduction

DOCKER_BUILDKIT=1 docker build -t lmdeploy-aarch64-ascend:latest
-f docker/Dockerfile_aarch64_ascend .

Environment

Atlas-800-Model-3010
Ascend Docker Runtime has already been installed.

Error traceback

DOCKER_BUILDKIT=1 docker build -t lmdeploy-aarch64-ascend:latest     -f docker/Dockerfile_aarch64_ascend .
[+] Building 1038.2s (15/18)                                                                                                                                                                     docker:default
 => [internal] load build definition from Dockerfile_aarch64_ascend                                                                                                                                        0.0s
 => => transferring dockerfile: 5.15kB                                                                                                                                                                     0.0s
 => [internal] load .dockerignore                                                                                                                                                                          0.0s
 => => transferring context: 2B                                                                                                                                                                            0.0s
 => [internal] load metadata for docker.io/library/ubuntu:20.04                                                                                                                                            3.3s
 => [build_temp 1/2] FROM docker.io/library/ubuntu:20.04@sha256:8e5c4f0285ecbb4ead070431d29b576a530d3166df73ec44affc1cd27555141b                                                                          11.5s
 => => resolve docker.io/library/ubuntu:20.04@sha256:8e5c4f0285ecbb4ead070431d29b576a530d3166df73ec44affc1cd27555141b                                                                                      0.0s
 => => sha256:8e5c4f0285ecbb4ead070431d29b576a530d3166df73ec44affc1cd27555141b 6.69kB / 6.69kB                                                                                                             0.0s
 => => sha256:e5a6aeef391a8a9bdaee3de6b28f393837c479d8217324a2340b64e45a81e0ef 424B / 424B                                                                                                                 0.0s
 => => sha256:6013ae1a63c2ee58a8949f03c6366a3ef6a2f386a7db27d86de2de965e9f450b 2.30kB / 2.30kB                                                                                                             0.0s
 => => sha256:d9802f032d6798e2086607424bfe88cb8ec1d6f116e11cd99592dcaf261e9cd2 27.51MB / 27.51MB                                                                                                           9.8s
 => => extracting sha256:d9802f032d6798e2086607424bfe88cb8ec1d6f116e11cd99592dcaf261e9cd2                                                                                                                  1.4s
 => [internal] load build context                                                                                                                                                                         27.3s
 => => transferring context: 3.56GB                                                                                                                                                                       27.2s
 => [base_builder 2/6] WORKDIR /tmp                                                                                                                                                                        1.6s
 => [base_builder 3/6] RUN sed -i 's@http://.*.ubuntu.com@http://mirrors.tuna.tsinghua.edu.cn@g' /etc/apt/sources.list &&     apt update &&     apt install --no-install-recommends ca-certificates -y &  84.3s
 => [build_temp 2/2] COPY . /tmp                                                                                                                                                                          15.3s
 => [copy_temp 1/1] RUN rm -rf /tmp/*.run                                                                                                                                                                  0.3s
 => [base_builder 4/6] RUN umask 0022  &&     wget https://repo.huaweicloud.com/python/3.10.5/Python-3.10.5.tar.xz &&     tar -xf Python-3.10.5.tar.xz && cd Python-3.10.5 && ./configure --prefix=/usr/  99.0s
 => [base_builder 5/6] RUN --mount=type=cache,target=/root/.cache/pip pip3 config set global.index-url http://mirrors.aliyun.com/pypi/simple &&     pip3 config set global.trusted-host mirrors.aliyun.c  53.2s
 => [base_builder 6/6] RUN if [ ! -d "/lib64" ];     then         mkdir /lib64 && ln -sf /lib/ld-linux-aarch64.so.1 /lib64/ld-linux-aarch64.so.1;     fi                                                   0.5s
 => [cann_builder 1/3] RUN --mount=type=cache,target=/tmp,from=build_temp,source=/tmp     umask 0022 &&     mkdir -p /usr/local/Ascend/driver &&     if [ "all" != "all" ];     then         CHIPOPTION  441.9s
 => [cann_builder 2/3] RUN echo "source /usr/local/Ascend/ascend-toolkit/set_env.sh" >> ~/.bashrc &&     echo "source /usr/local/Ascend/nnal/atb/set_env.sh --cxx_abi=0" >> ~/.bashrc &&     . ~/.bashrc   0.4s
 => ERROR [cann_builder 3/3] RUN --mount=type=cache,target=/root/.cache/pip     pip3 install torch==2.3.1 torchvision==0.18.1 torch-npu==2.3.1 &&     pip3 install transformers timm &&     pip3 instal  342.3s
------
 > [cann_builder 3/3] RUN --mount=type=cache,target=/root/.cache/pip     pip3 install torch==2.3.1 torchvision==0.18.1 torch-npu==2.3.1 &&     pip3 install transformers timm &&     pip3 install dlinfer-ascend:
0.306 ERROR: ld.so: object '/lib/aarch64-linux-gnu/libGLdispatch.so.0' from LD_PRELOAD cannot be preloaded (cannot open shared object file): ignored.
0.309 ERROR: ld.so: object '/lib/aarch64-linux-gnu/libGLdispatch.so.0' from LD_PRELOAD cannot be preloaded (cannot open shared object file): ignored.
0.830 Looking in indexes: http://mirrors.aliyun.com/pypi/simple
1.137 Collecting torch==2.3.1
1.339   Downloading http://mirrors.aliyun.com/pypi/packages/cb/e2/1bd899d3eb60c6495cf5d0d2885edacac08bde7a1407eadeb2ab36eca3c7/torch-2.3.1-cp310-cp310-manylinux1_x86_64.whl (779.1 MB)
107.3      ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 779.1/779.1 MB 10.1 MB/s eta 0:00:00
110.0 Collecting torchvision==0.18.1
110.0   Downloading http://mirrors.aliyun.com/pypi/packages/08/04/17425bf3c0620465ee182cea5c674db4debab87ed0627145d38039cb2a9e/torchvision-0.18.1-cp310-cp310-manylinux1_x86_64.whl (7.0 MB)
110.7      ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 7.0/7.0 MB 10.3 MB/s eta 0:00:00
110.9 Collecting torch-npu==2.3.1
111.1   Downloading http://mirrors.aliyun.com/pypi/packages/a6/e1/60664898a464930397632eb718a4330dd9b394d543394fd07d7b837abef4/torch_npu-2.3.1-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (11.7 MB)
112.2      ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 11.7/11.7 MB 10.8 MB/s eta 0:00:00
112.4 Collecting filelock (from torch==2.3.1)
112.5   Downloading http://mirrors.aliyun.com/pypi/packages/b9/f8/feced7779d755758a52d1f6635d990b8d98dc0a29fa568bbe0625f18fdf3/filelock-3.16.1-py3-none-any.whl (16 kB)
112.5 Collecting typing-extensions>=4.8.0 (from torch==2.3.1)
112.6   Downloading http://mirrors.aliyun.com/pypi/packages/26/9f/ad63fc0248c5379346306f8668cda6e2e2e9c95e01216d2b8ffd9ff037d0/typing_extensions-4.12.2-py3-none-any.whl (37 kB)
112.6 Requirement already satisfied: sympy in /usr/local/python3.10.5/lib/python3.10/site-packages (from torch==2.3.1) (1.13.3)
112.7 Collecting networkx (from torch==2.3.1)
112.7   Downloading http://mirrors.aliyun.com/pypi/packages/b9/54/dd730b32ea14ea797530a4479b2ed46a6fb250f682a9cfb997e968bf0261/networkx-3.4.2-py3-none-any.whl (1.7 MB)
112.8      ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 1.7/1.7 MB 11.3 MB/s eta 0:00:00
112.9 Collecting jinja2 (from torch==2.3.1)
113.0   Downloading http://mirrors.aliyun.com/pypi/packages/31/80/3a54838c3fb461f6fec263ebf3a3a41771bd05190238de3486aae8540c36/jinja2-3.1.4-py3-none-any.whl (133 kB)
113.1 Collecting fsspec (from torch==2.3.1)
113.1   Downloading http://mirrors.aliyun.com/pypi/packages/c6/b2/454d6e7f0158951d8a78c2e1eb4f69ae81beb8dca5fee9809c6c99e9d0d0/fsspec-2024.10.0-py3-none-any.whl (179 kB)
113.3 Collecting nvidia-cuda-nvrtc-cu12==12.1.105 (from torch==2.3.1)
113.3   Downloading http://mirrors.aliyun.com/pypi/packages/b6/9f/c64c03f49d6fbc56196664d05dba14e3a561038a81a638eeb47f4d4cfd48/nvidia_cuda_nvrtc_cu12-12.1.105-py3-none-manylinux1_x86_64.whl (23.7 MB)
115.5      ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 23.7/23.7 MB 10.7 MB/s eta 0:00:00
115.7 Collecting nvidia-cuda-runtime-cu12==12.1.105 (from torch==2.3.1)
115.7   Downloading http://mirrors.aliyun.com/pypi/packages/eb/d5/c68b1d2cdfcc59e72e8a5949a37ddb22ae6cade80cd4a57a84d4c8b55472/nvidia_cuda_runtime_cu12-12.1.105-py3-none-manylinux1_x86_64.whl (823 kB)
115.8      ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 823.6/823.6 kB 11.9 MB/s eta 0:00:00
115.8 Collecting nvidia-cuda-cupti-cu12==12.1.105 (from torch==2.3.1)
115.8   Downloading http://mirrors.aliyun.com/pypi/packages/7e/00/6b218edd739ecfc60524e585ba8e6b00554dd908de2c9c66c1af3e44e18d/nvidia_cuda_cupti_cu12-12.1.105-py3-none-manylinux1_x86_64.whl (14.1 MB)
117.1      ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 14.1/14.1 MB 11.1 MB/s eta 0:00:00
117.2 Collecting nvidia-cudnn-cu12==8.9.2.26 (from torch==2.3.1)
117.6   Downloading http://mirrors.aliyun.com/pypi/packages/ff/74/a2e2be7fb83aaedec84f391f082cf765dfb635e7caa9b49065f73e4835d8/nvidia_cudnn_cu12-8.9.2.26-py3-none-manylinux1_x86_64.whl (731.7 MB)
193.2      ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 731.7/731.7 MB 6.6 MB/s eta 0:00:00
195.4 Collecting nvidia-cublas-cu12==12.1.3.1 (from torch==2.3.1)
195.5   Downloading http://mirrors.aliyun.com/pypi/packages/37/6d/121efd7382d5b0284239f4ab1fc1590d86d34ed4a4a2fdb13b30ca8e5740/nvidia_cublas_cu12-12.1.3.1-py3-none-manylinux1_x86_64.whl (410.6 MB)
240.8      ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 410.6/410.6 MB 8.2 MB/s eta 0:00:00
242.1 Collecting nvidia-cufft-cu12==11.0.2.54 (from torch==2.3.1)
242.1   Downloading http://mirrors.aliyun.com/pypi/packages/86/94/eb540db023ce1d162e7bea9f8f5aa781d57c65aed513c33ee9a5123ead4d/nvidia_cufft_cu12-11.0.2.54-py3-none-manylinux1_x86_64.whl (121.6 MB)
254.1      ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 121.6/121.6 MB 10.2 MB/s eta 0:00:00
254.5 Collecting nvidia-curand-cu12==10.3.2.106 (from torch==2.3.1)
254.6   Downloading http://mirrors.aliyun.com/pypi/packages/44/31/4890b1c9abc496303412947fc7dcea3d14861720642b49e8ceed89636705/nvidia_curand_cu12-10.3.2.106-py3-none-manylinux1_x86_64.whl (56.5 MB)
260.1      ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 56.5/56.5 MB 10.2 MB/s eta 0:00:00
260.3 Collecting nvidia-cusolver-cu12==11.4.5.107 (from torch==2.3.1)
260.4   Downloading http://mirrors.aliyun.com/pypi/packages/bc/1d/8de1e5c67099015c834315e333911273a8c6aaba78923dd1d1e25fc5f217/nvidia_cusolver_cu12-11.4.5.107-py3-none-manylinux1_x86_64.whl (124.2 MB)
272.5      ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 124.2/124.2 MB 10.2 MB/s eta 0:00:00
273.0 Collecting nvidia-cusparse-cu12==12.1.0.106 (from torch==2.3.1)
273.0   Downloading http://mirrors.aliyun.com/pypi/packages/65/5b/cfaeebf25cd9fdec14338ccb16f6b2c4c7fa9163aefcf057d86b9cc248bb/nvidia_cusparse_cu12-12.1.0.106-py3-none-manylinux1_x86_64.whl (196.0 MB)
290.5      ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 196.0/196.0 MB 11.2 MB/s eta 0:00:00
291.2 Collecting nvidia-nccl-cu12==2.20.5 (from torch==2.3.1)
291.2   Downloading http://mirrors.aliyun.com/pypi/packages/4b/2a/0a131f572aa09f741c30ccd45a8e56316e8be8dfc7bc19bf0ab7cfef7b19/nvidia_nccl_cu12-2.20.5-py3-none-manylinux2014_x86_64.whl (176.2 MB)
306.9      ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 176.2/176.2 MB 11.3 MB/s eta 0:00:00
307.5 Collecting nvidia-nvtx-cu12==12.1.105 (from torch==2.3.1)
307.5   Downloading http://mirrors.aliyun.com/pypi/packages/da/d3/8057f0587683ed2fcd4dbfbdfdfa807b9160b809976099d36b8f60d08f03/nvidia_nvtx_cu12-12.1.105-py3-none-manylinux1_x86_64.whl (99 kB)
307.6 Collecting triton==2.3.1 (from torch==2.3.1)
307.7   Downloading http://mirrors.aliyun.com/pypi/packages/d7/69/8a9fde07d2d27a90e16488cdfe9878e985a247b2496a4b5b1a2126042528/triton-2.3.1-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (168.1 MB)
339.7      ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 168.1/168.1 MB 4.8 MB/s eta 0:00:00
340.3 Requirement already satisfied: numpy in /usr/local/python3.10.5/lib/python3.10/site-packages (from torchvision==0.18.1) (1.24.0)
341.0 Collecting pillow!=8.3.*,>=5.3.0 (from torchvision==0.18.1)
341.1   Downloading http://mirrors.aliyun.com/pypi/packages/41/c3/94f33af0762ed76b5a237c5797e088aa57f2b7fa8ee7932d399087be66a8/pillow-11.0.0-cp310-cp310-manylinux_2_28_x86_64.whl (4.4 MB)
341.7      ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 4.4/4.4 MB 7.0 MB/s eta 0:00:00
341.8 INFO: pip is looking at multiple versions of torch-npu to determine which version is compatible with other requirements. This could take a while.
341.8 ERROR: Cannot install torch-npu==2.3.1, torch==2.3.1 and torchvision==0.18.1 because these package versions have conflicting dependencies.
341.8
341.8 The conflict is caused by:
341.8     The user requested torch==2.3.1
341.8     torchvision 0.18.1 depends on torch==2.3.1
341.8     torch-npu 2.3.1 depends on torch==2.3.1+cpu
341.8
341.8 To fix this you could try to:
341.8 1. loosen the range of package versions you've specified
341.8 2. remove package versions to allow pip to attempt to solve the dependency conflict
341.8
341.8 ERROR: ResolutionImpossible: for help visit https://pip.pypa.io/en/latest/topics/dependency-resolution/#dealing-with-dependency-conflicts
------
Dockerfile_aarch64_ascend:110
--------------------
 109 |     # timm is required for internvl2 model
 110 | >>> RUN --mount=type=cache,target=/root/.cache/pip \
 111 | >>>     pip3 install torch==2.3.1 torchvision==0.18.1 torch-npu==2.3.1 && \
 112 | >>>     pip3 install transformers timm && \
 113 | >>>     pip3 install dlinfer-ascend
 114 |
--------------------
ERROR: failed to solve: process "/bin/bash -c pip3 install torch==2.3.1 torchvision==0.18.1 torch-npu==2.3.1 &&     pip3 install transformers timm &&     pip3 install dlinfer-ascend" did not complete successfully: exit code: 1
@CyCle1024
Collaborator

CyCle1024 commented Nov 13, 2024

@jiabao-wang Hi, are you building the docker image on an x86_64 platform? Currently, the Dockerfile only supports aarch64. For x86_64, the dlinfer package has not been published to PyPI, which also explains the problem you mentioned above.
There is a workaround for this case, but it has not been released yet.

@CyCle1024 CyCle1024 self-assigned this Nov 13, 2024
@CyCle1024
Collaborator

@jiabao-wang Here is a new Dockerfile for the Ascend x86_64 platform. It has only been tested for building on an x86_64 machine; model inference has not been tested yet, since we don't have an x86_64 Ascend NPU machine.

FROM ubuntu:20.04 as base_builder

WORKDIR /tmp

ARG http_proxy
ARG https_proxy
ARG DEBIAN_FRONTEND=noninteractive

RUN sed -i 's@http://.*.ubuntu.com@http://mirrors.tuna.tsinghua.edu.cn@g' /etc/apt/sources.list && \
    apt update && \
    apt install --no-install-recommends ca-certificates -y && \
    apt install --no-install-recommends bc wget -y && \
    apt install --no-install-recommends git curl gcc make g++ pkg-config unzip -y && \
    apt install --no-install-recommends libsqlite3-dev libblas3 liblapack3 gfortran vim -y && \
    apt install --no-install-recommends liblapack-dev libblas-dev libhdf5-dev libffi-dev -y && \
    apt install --no-install-recommends libssl-dev zlib1g-dev xz-utils cython3 python3-h5py -y && \
    apt install --no-install-recommends libopenblas-dev libgmpxx4ldbl liblzma-dev -y && \
    apt install --no-install-recommends libicu66 libxml2 pciutils libgl1-mesa-glx libbz2-dev -y && \
    apt install --no-install-recommends libreadline-dev libncurses5 libncurses5-dev libncursesw5 -y && \
    sed -i 's@http://mirrors.tuna.tsinghua.edu.cn@https://mirrors.tuna.tsinghua.edu.cn@g' /etc/apt/sources.list && \
    apt clean && rm -rf /var/lib/apt/lists/*

ARG PYVERSION=3.10.5

ENV LD_LIBRARY_PATH=/usr/local/python${PYVERSION}/lib: \
    PATH=/usr/local/python${PYVERSION}/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin

RUN umask 0022  && \
    wget https://repo.huaweicloud.com/python/${PYVERSION}/Python-${PYVERSION}.tar.xz && \
    tar -xf Python-${PYVERSION}.tar.xz && cd Python-${PYVERSION} && ./configure --prefix=/usr/local/python${PYVERSION} --enable-shared && \
    make -j 16 && make install && \
    ln -sf /usr/local/python${PYVERSION}/bin/python3 /usr/bin/python3 && \
    ln -sf /usr/local/python${PYVERSION}/bin/python3 /usr/bin/python && \
    ln -sf /usr/local/python${PYVERSION}/bin/pip3 /usr/bin/pip3 && \
    ln -sf /usr/local/python${PYVERSION}/bin/pip3 /usr/bin/pip && \
    cd .. && \
    rm -rf Python*

RUN --mount=type=cache,target=/root/.cache/pip pip3 config set global.index-url http://mirrors.aliyun.com/pypi/simple && \
    pip3 config set global.trusted-host mirrors.aliyun.com && \
    pip3 install -U pip && \
    pip3 install wheel==0.43.0 scikit-build==0.18.0 numpy==1.24 setuptools==69.5.1 && \
    pip3 install decorator sympy cffi && \
    pip3 install cmake ninja pyyaml && \
    pip3 install pathlib2 protobuf attrs attr scipy && \
    pip3 install requests psutil absl-py

ENV LD_LIBRARY_PATH=/usr/lib/x86_64-linux-gnu/hdf5/serial:$LD_LIBRARY_PATH

FROM ubuntu:20.04 as build_temp
COPY . /tmp

FROM base_builder as cann_builder

ARG ASCEND_BASE=/usr/local/Ascend
ARG TOOLKIT_PATH=$ASCEND_BASE/ascend-toolkit/latest

ENV LD_LIBRARY_PATH=\
$ASCEND_BASE/driver/lib64:\
$ASCEND_BASE/driver/lib64/common:\
$ASCEND_BASE/driver/lib64/driver:\
$ASCEND_BASE/driver/tools/hccn_tool/:\
$TOOLKIT_PATH/opp/built-in/op_impl/ai_core/tbe/op_tiling/lib/linux/x86_64/:\
$LD_LIBRARY_PATH

# run files should be placed at the root dir of repo
ARG CHIP=all
ARG TOOLKIT_PKG=Ascend-cann-toolkit_*.run
ARG KERNELS_PKG=Ascend-cann-kernels-*.run
ARG NNAL_PKG=Ascend-cann-nnal_*.run

RUN --mount=type=cache,target=/tmp,from=build_temp,source=/tmp \
    umask 0022 && \
    mkdir -p $ASCEND_BASE/driver && \
    if [ "$CHIP" != "all" ]; \
    then \
        CHIPOPTION="--chip=$CHIP"; \
    else \
        CHIPOPTION=""; \
    fi && \
    chmod +x $TOOLKIT_PKG $KERNELS_PKG $NNAL_PKG && \
    ./$TOOLKIT_PKG --quiet --install --install-path=$ASCEND_BASE --install-for-all $CHIPOPTION && \
    ./$KERNELS_PKG --quiet --install --install-path=$ASCEND_BASE --install-for-all && \
    . /usr/local/Ascend/ascend-toolkit/set_env.sh && \
    ./$NNAL_PKG --quiet --install --install-path=$ASCEND_BASE && \
    rm -f $TOOLKIT_PKG $KERNELS_PKG $NNAL_PKG

ENV GLOG_v=2 \
    LD_LIBRARY_PATH=$TOOLKIT_PATH/lib64:$LD_LIBRARY_PATH \
    TBE_IMPL_PATH=$TOOLKIT_PATH/opp/op_impl/built-in/ai_core/tbe \
    PATH=$TOOLKIT_PATH/ccec_compiler/bin:$PATH \
    ASCEND_OPP_PATH=$TOOLKIT_PATH/opp \
    ASCEND_AICPU_PATH=$TOOLKIT_PATH

ENV PYTHONPATH=$TBE_IMPL_PATH:$PYTHONPATH

SHELL ["/bin/bash", "-c"]
RUN echo "source /usr/local/Ascend/ascend-toolkit/set_env.sh" >> ~/.bashrc && \
    echo "source /usr/local/Ascend/nnal/atb/set_env.sh --cxx_abi=0" >> ~/.bashrc && \
    . ~/.bashrc

# dlinfer
# timm is required for internvl2 model
WORKDIR /opt/
RUN --mount=type=cache,target=/root/.cache/pip \
    pip3 install torch==2.3.1+cpu torchvision==0.18.1+cpu --index-url=https://download.pytorch.org/whl/cpu && \
    pip3 install torch-npu==2.3.1 && \
    pip3 install transformers timm && \
    git clone https://github.com/DeepLink-org/dlinfer.git && \
    cd dlinfer && DEVICE=ascend python setup.py develop

# lmdeploy
FROM build_temp as copy_temp
RUN rm -rf /tmp/*.run

FROM cann_builder as final_builder
COPY --from=copy_temp /tmp /opt/lmdeploy
WORKDIR /opt/lmdeploy

RUN --mount=type=cache,target=/root/.cache/pip \
    sed -i '/triton/d' requirements/runtime.txt && \
    pip3 install -v --no-build-isolation -e .

@jiabao-wang
Author

jiabao-wang commented Nov 14, 2024

@CyCle1024
I have built the docker image following the Dockerfile for Ascend x86_64, but when I try to run:

docker run -e ASCEND_VISIBLE_DEVICES=0 --rm --name lmdeploy -t lmdeploy-aarch64-ascend:latest lmdeploy check_env

I get the following error:
(base) wjb@ubuntu-Atlas-800-Model-3010:~$ docker run -e ASCEND_VISIBLE_DEVICES=0 --rm --name lmdeploy -t lmdeploy-aarch64-ascend:latest lmdeploy check_env
Traceback (most recent call last):
  File "/usr/local/python3.10.5/lib/python3.10/site-packages/transformers/utils/import_utils.py", line 1778, in _get_module
    return importlib.import_module("." + module_name, self.__name__)
  File "/usr/local/python3.10.5/lib/python3.10/importlib/__init__.py", line 126, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
  File "<frozen importlib._bootstrap>", line 1050, in _gcd_import
  File "<frozen importlib._bootstrap>", line 1027, in _find_and_load
  File "<frozen importlib._bootstrap>", line 1006, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 688, in _load_unlocked
  File "<frozen importlib._bootstrap_external>", line 883, in exec_module
  File "<frozen importlib._bootstrap>", line 241, in _call_with_frames_removed
  File "/usr/local/python3.10.5/lib/python3.10/site-packages/transformers/generation/utils.py", line 115, in <module>
    from accelerate.hooks import AlignDevicesHook, add_hook_to_module
  File "/usr/local/python3.10.5/lib/python3.10/site-packages/accelerate/__init__.py", line 16, in <module>
    from .accelerator import Accelerator
  File "/usr/local/python3.10.5/lib/python3.10/site-packages/accelerate/accelerator.py", line 36, in <module>
    from .checkpointing import load_accelerator_state, load_custom_state, save_accelerator_state, save_custom_state
  File "/usr/local/python3.10.5/lib/python3.10/site-packages/accelerate/checkpointing.py", line 24, in <module>
    from .utils import (
  File "/usr/local/python3.10.5/lib/python3.10/site-packages/accelerate/utils/__init__.py", line 126, in <module>
    from .modeling import (
  File "/usr/local/python3.10.5/lib/python3.10/site-packages/accelerate/utils/modeling.py", line 31, in <module>
    from ..state import AcceleratorState
  File "/usr/local/python3.10.5/lib/python3.10/site-packages/accelerate/state.py", line 64, in <module>
    if is_npu_available(check_device=False):
  File "/usr/local/python3.10.5/lib/python3.10/site-packages/accelerate/utils/imports.py", line 362, in is_npu_available
    import torch_npu  # noqa: F401
  File "/usr/local/python3.10.5/lib/python3.10/site-packages/torch_npu/__init__.py", line 16, in <module>
    import torch_npu.npu
  File "/usr/local/python3.10.5/lib/python3.10/site-packages/torch_npu/npu/__init__.py", line 119, in <module>
    from torch_npu.utils.error_code import ErrCode, pta_error, prof_error
  File "/usr/local/python3.10.5/lib/python3.10/site-packages/torch_npu/utils/__init__.py", line 1, in <module>
    from ._module import _apply_module_patch
  File "/usr/local/python3.10.5/lib/python3.10/site-packages/torch_npu/utils/_module.py", line 26, in <module>
    from torch_npu.npu.amp.autocast_mode import autocast
  File "/usr/local/python3.10.5/lib/python3.10/site-packages/torch_npu/npu/amp/__init__.py", line 6, in <module>
    from .grad_scaler import GradScaler  # noqa: F401
  File "/usr/local/python3.10.5/lib/python3.10/site-packages/torch_npu/npu/amp/grad_scaler.py", line 8, in <module>
    from torch.amp.grad_scaler import _MultiDeviceReplicator, OptState, _refresh_per_optimizer_state
ModuleNotFoundError: No module named 'torch.amp.grad_scaler'

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/usr/local/python3.10.5/lib/python3.10/site-packages/transformers/utils/import_utils.py", line 1778, in _get_module
    return importlib.import_module("." + module_name, self.__name__)
  File "/usr/local/python3.10.5/lib/python3.10/importlib/__init__.py", line 126, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
  File "<frozen importlib._bootstrap>", line 1050, in _gcd_import
  File "<frozen importlib._bootstrap>", line 1027, in _find_and_load
  File "<frozen importlib._bootstrap>", line 1006, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 688, in _load_unlocked
  File "<frozen importlib._bootstrap_external>", line 883, in exec_module
  File "<frozen importlib._bootstrap>", line 241, in _call_with_frames_removed
  File "/usr/local/python3.10.5/lib/python3.10/site-packages/transformers/models/auto/modeling_auto.py", line 21, in <module>
    from .auto_factory import (
  File "/usr/local/python3.10.5/lib/python3.10/site-packages/transformers/models/auto/auto_factory.py", line 40, in <module>
    from ...generation import GenerationMixin
  File "<frozen importlib._bootstrap>", line 1075, in _handle_fromlist
  File "/usr/local/python3.10.5/lib/python3.10/site-packages/transformers/utils/import_utils.py", line 1766, in __getattr__
    module = self._get_module(self._class_to_module[name])
  File "/usr/local/python3.10.5/lib/python3.10/site-packages/transformers/utils/import_utils.py", line 1780, in _get_module
    raise RuntimeError(
RuntimeError: Failed to import transformers.generation.utils because of the following error (look up to see its traceback):
No module named 'torch.amp.grad_scaler'

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/usr/local/python3.10.5/bin/lmdeploy", line 33, in <module>
    sys.exit(load_entry_point('lmdeploy', 'console_scripts', 'lmdeploy')())
  File "/usr/local/python3.10.5/bin/lmdeploy", line 25, in importlib_load_entry_point
    return next(matches).load()
  File "/usr/local/python3.10.5/lib/python3.10/importlib/metadata/__init__.py", line 171, in load
    module = import_module(match.group('module'))
  File "/usr/local/python3.10.5/lib/python3.10/importlib/__init__.py", line 126, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
  File "<frozen importlib._bootstrap>", line 1050, in _gcd_import
  File "<frozen importlib._bootstrap>", line 1027, in _find_and_load
  File "<frozen importlib._bootstrap>", line 992, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 241, in _call_with_frames_removed
  File "<frozen importlib._bootstrap>", line 1050, in _gcd_import
  File "<frozen importlib._bootstrap>", line 1027, in _find_and_load
  File "<frozen importlib._bootstrap>", line 1006, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 688, in _load_unlocked
  File "<frozen importlib._bootstrap_external>", line 883, in exec_module
  File "<frozen importlib._bootstrap>", line 241, in _call_with_frames_removed
  File "/opt/lmdeploy/lmdeploy/__init__.py", line 3, in <module>
    from .api import client, pipeline, serve
  File "/opt/lmdeploy/lmdeploy/api.py", line 5, in <module>
    from .archs import autoget_backend_config, get_task
  File "/opt/lmdeploy/lmdeploy/archs.py", line 6, in <module>
    from lmdeploy.serve.vl_async_engine import VLAsyncEngine
  File "/opt/lmdeploy/lmdeploy/serve/vl_async_engine.py", line 8, in <module>
    from lmdeploy.vl.engine import ImageEncoder
  File "/opt/lmdeploy/lmdeploy/vl/engine.py", line 12, in <module>
    from lmdeploy.vl.model.builder import load_vl_model
  File "/opt/lmdeploy/lmdeploy/vl/model/builder.py", line 7, in <module>
    from .internvl import InternVLVisionModel
  File "/opt/lmdeploy/lmdeploy/vl/model/internvl.py", line 7, in <module>
    from transformers import AutoConfig, AutoModel, CLIPImageProcessor
  File "<frozen importlib._bootstrap>", line 1075, in _handle_fromlist
  File "/usr/local/python3.10.5/lib/python3.10/site-packages/transformers/utils/import_utils.py", line 1767, in __getattr__
    value = getattr(module, name)
  File "/usr/local/python3.10.5/lib/python3.10/site-packages/transformers/utils/import_utils.py", line 1766, in __getattr__
    module = self._get_module(self._class_to_module[name])
  File "/usr/local/python3.10.5/lib/python3.10/site-packages/transformers/utils/import_utils.py", line 1780, in _get_module
    raise RuntimeError(
RuntimeError: Failed to import transformers.models.auto.modeling_auto because of the following error (look up to see its traceback):
Failed to import transformers.generation.utils because of the following error (look up to see its traceback):
No module named 'torch.amp.grad_scaler'
