
[PoC] Use SYCL runtime wheel instead of PTDB and g++ or clang instead of dpc++ to compile the kernel launcher #1717

Closed
vlad-penkin opened this issue Jul 29, 2024 · 15 comments · Fixed by #1857

Comments

@vlad-penkin (Contributor) commented Jul 29, 2024

Expected results:

  • Working prototype in the feature branch
  • Additional requirements if any for the SYCL runtime wheel
  • Detailed requirements for the CI integration
@vlad-penkin vlad-penkin added this to the 7. CI milestone Jul 29, 2024
@vlad-penkin vlad-penkin changed the title from "[PoC] Use SYCL runtime wheel instead of PTDB" to "[PoC] Use SYCL runtime wheel instead of PTDB and g++ or clang instead of dpc++" Aug 4, 2024
@vlad-penkin vlad-penkin changed the title from "[PoC] Use SYCL runtime wheel instead of PTDB and g++ or clang instead of dpc++" to "[PoC] Use SYCL runtime wheel instead of PTDB and g++ or clang instead of dpc++ to compile the kernel launcher" Aug 4, 2024
@ZzEeKkAa (Contributor) commented Aug 5, 2024

In my experience, https://pypi.org/project/intel-sycl-rt/ is the same package as the conda one (https://anaconda.org/conda-forge/intel-sycl-rt), except that we need to set some environment variables so the libraries can be found inside the Python environment. In fact, I was able to create a PoC Dockerfile with support for https://github.com/IntelPython/dpctl and https://github.com/IntelPython/numba-dpex.
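As a sketch of what that environment adjustment looks like, assuming the variables used in the setup steps later in this thread (LIBRARY_PATH, LD_LIBRARY_PATH, CPATH) are the relevant ones:

```python
import os

def sycl_rt_env(venv: str) -> dict:
    """Environment additions so libraries and headers from a
    venv-installed intel-sycl-rt wheel can be found.

    The variable names follow the setup commands later in this thread;
    treat them as an assumption, not documented wheel behavior."""
    lib = os.path.join(venv, "lib")
    return {
        "LIBRARY_PATH": lib,                             # link-time library lookup
        "LD_LIBRARY_PATH": lib,                          # run-time library lookup
        "CPATH": os.path.join(venv, "include", "sycl"),  # SYCL headers
    }

print(sycl_rt_env("/home/user/.venv"))
```

In practice these values would be appended to any existing variable contents (as the later export commands in this thread do), rather than replacing them.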

@ZzEeKkAa (Contributor) commented Aug 5, 2024

The only remaining thing is that we need to ask the release team to publish the 2024.1.4 release of this package (the same one used in PTDB, I guess).

@ZzEeKkAa (Contributor) commented Aug 5, 2024

There are multiple runtime (rt) packages for different purposes, listed here: https://github.com/conda-forge/intel-compiler-repack-feedstock/blob/main/recipe/meta.yaml, and there should be a corresponding package on PyPI for each. We just need to ask the release team to keep PyPI in sync.
cc: @xaleryb

@ZzEeKkAa (Contributor) commented Aug 6, 2024

So, I was able to create a Triton runtime environment without PTDB or the oneAPI toolkit:

Summary

I used PTDB 0.5.2.18 to build upstream PyTorch (main branch) and Intel's Triton (llvm-target branch) with some patches. I also repacked intel-sycl-rt (2024.1.2) with the SYCL headers from PTDB 0.5.2.18 (compiler 2024.1.3).

Environment setup

python3.9 -m venv ./.venv
source ./.venv/bin/activate
pip install --upgrade pip
pip install ./intel_sycl_rt-2024.1.2-py2.py3-none-manylinux1_x86_64.whl ./torch-2.5.0a0+git7f58740-cp39-cp39-linux_x86_64.whl ./triton-3.0.0-cp39-cp39-linux_x86_64.whl
pip install dpcpp_cpp_rt==2024.1.2 numpy matplotlib pandas
# Fix up the hard-coded conda placeholder path in the OpenCL ICD files
sed -i "s/\/opt\/anaconda1anaconda2anaconda3/$(echo ${VIRTUAL_ENV} | sed 's/\//\\\//g')/g" $VIRTUAL_ENV/etc/OpenCL/vendors/*
rm -rf ~/.triton
export LIBRARY_PATH=$LIBRARY_PATH:$VIRTUAL_ENV/lib
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:$VIRTUAL_ENV/lib
export CPATH=$CPATH:$VIRTUAL_ENV/include/sycl
mkdir -p $VIRTUAL_ENV/lib/python3.9/site-packages/intel_extension_for_pytorch

Now you can run the tutorial (clang++ is used by default):

wget https://raw.githubusercontent.com/intel/intel-xpu-backend-for-triton/llvm-target/python/tutorials/01-vector-add.py
python ./01-vector-add.py

Or with g++:

CXX=g++ python ./01-vector-add.py

Building wheels

Pytorch

Version: upstream main branch
Patches:

Build options: static mkl linking

Triton

Version: Intel's upstream Triton (llvm-target branch)
Patches:

         if icpx is not None:
             cc_cmd += ["-fsycl"]
+        else:
+            cc_cmd += ["--std=gnu++17"]
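The patch above adjusts the launcher's compile command depending on which compiler was found: dpc++/icpx gets -fsycl, while a plain g++ or clang++ only needs the C++17 standard flag. A minimal sketch of that selection logic (the function name and the exact command shape are illustrative, not Triton's actual API):

```python
def build_cc_cmd(cc: str, src: str, out: str, icpx_found: bool) -> list:
    """Hypothetical sketch of the kernel-launcher compile command after
    the patch above. Only the flag selection mirrors the diff; the rest
    of the command line is an assumption for illustration."""
    cc_cmd = [cc, src, "-fPIC", "-shared", "-o", out]
    if icpx_found:
        cc_cmd += ["-fsycl"]          # dpc++/icpx: enable SYCL compilation
    else:
        cc_cmd += ["--std=gnu++17"]   # g++/clang++: just need C++17
    return cc_cmd

print(build_cc_cmd("g++", "launcher.cpp", "launcher.so", icpx_found=False))
```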

Patched SYCL runtime

Version: 2024.1.2
SYCL headers version: 2024.1.3 (PTDB release)

wget https://files.pythonhosted.org/packages/cc/1e/d74e608f0c040e4f72dbfcd3b183f39570f054d08de39cc431f153220d90/intel_sycl_rt-2024.1.2-py2.py3-none-manylinux1_x86_64.whl
wheel unpack intel_sycl_rt-2024.1.2-py2.py3-none-manylinux1_x86_64.whl
mkdir -p ./intel_sycl_rt-2024.1.2/intel_sycl_rt-2024.1.2.data/data/include
cp -r /opt/intel/oneapi/compiler/2024.1/include/sycl ./intel_sycl_rt-2024.*/intel_sycl_rt-2024.*.data/data/include/
wheel pack intel_sycl_rt-2024.1.2 --build headers_patch
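To confirm the repack actually placed the headers into the wheel (wheels are plain zip archives), a small check like this can be used; the path prefix below is the one created by the commands above:

```python
import zipfile

def wheel_contains(wheel_path: str, prefix: str) -> bool:
    """Return True if any entry in the wheel (a zip archive)
    starts with the given path prefix."""
    with zipfile.ZipFile(wheel_path) as wf:
        return any(name.startswith(prefix) for name in wf.namelist())

# Example (filenames from the commands above):
# wheel_contains(
#     "intel_sycl_rt-2024.1.2-headers_patch-py2.py3-none-manylinux1_x86_64.whl",
#     "intel_sycl_rt-2024.1.2.data/data/include/sycl/")
```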

@ZzEeKkAa (Contributor) commented Aug 12, 2024

UPD:

with #1857 you only need LD_LIBRARY_PATH=$LD_LIBRARY_PATH:$VIRTUAL_ENV/lib VERBOSE=1 python ./01-vector-add.py to run Triton. LIBRARY_PATH and CPATH are set directly in the PR. However, all the other preparation steps are still needed. LD_LIBRARY_PATH is needed for PyTorch to find the SYCL installation.

@vlad-penkin vlad-penkin linked a pull request Aug 12, 2024 that will close this issue
@leshikus (Contributor) commented Aug 15, 2024

I wonder where ./torch-2.5.0a0+git7f58740-cp39-cp39-linux_x86_64.whl ./triton-3.0.0-cp39-cp39-linux_x86_64.whl come from?

@ZzEeKkAa (Contributor) commented:

> I wonder where ./torch-2.5.0a0+git7f58740-cp39-cp39-linux_x86_64.whl ./triton-3.0.0-cp39-cp39-linux_x86_64.whl come from?

You need to build them yourself. I guess Intel Triton's nightly builds will work too.

@leshikus (Contributor) commented:

I see that test-triton.sh already works with venv. I wonder if you plan to integrate your scenario into the standard build script.

@leshikus (Contributor) commented:

pip tells me:

ERROR: intel_sycl_rt-2024.1.2-headers_patch-py2.py3-none-manylinux1_x86_64.whl is not a valid wheel filename.

How did you overcome this?

@leshikus (Contributor) commented Aug 22, 2024

I've just copied the new file over the original one. I wonder if it is possible to keep a name that indicates the headers are inside. In that case one needs pip's --force-reinstall option, otherwise the package will be skipped if the original was already installed.
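For reference, the "not a valid wheel filename" error above is most likely because PEP 427 requires the optional build tag in a wheel filename to start with a digit, so "headers_patch" is rejected while a tag like "1patch" would be accepted. A small check illustrating the rule:

```python
def is_valid_build_tag(tag: str) -> bool:
    # PEP 427: the optional build tag in a wheel filename
    # must begin with a digit.
    return bool(tag) and tag[0].isdigit()

print(is_valid_build_tag("headers_patch"))  # False: pip rejects this filename
print(is_valid_build_tag("1patch"))         # True
```

So an alternative to renaming the wheel back to the original filename would be repacking with a digit-leading build tag.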

@leshikus (Contributor) commented Aug 22, 2024

I have most tests passing locally with this approach (though I've modified it a bit). The next step is to make it work in CI.

#!/bin/sh

set -euvx

name=${1:-triton}

rm -rf intel-xpu-backend-for-triton/
git clone https://github.com/intel/intel-xpu-backend-for-triton -b lesh/remove-flag

set +uvx
. ~/.conda/etc/profile.d/conda.sh
#conda deactivate
conda env remove -n $name
conda env remove -n dpcpp
set -uvx
set -e

conda create -y -n $name python=3.9.*
conda env update -n $name -f intel-xpu-backend-for-triton/scripts/triton.yml
#conda env update -n $name -f intel-xpu-backend-for-triton/scripts/basekit.yml

python -m venv ./.venv
. ./.venv/bin/activate
export LD_LIBRARY_PATH=${LD_LIBRARY_PATH:-}:$VIRTUAL_ENV/lib
export CPATH=${CPATH:-}:$VIRTUAL_ENV/include:$VIRTUAL_ENV/include/sycl

rm -rf intel_sycl_rt-2024.1.2*
wget https://files.pythonhosted.org/packages/cc/1e/d74e608f0c040e4f72dbfcd3b183f39570f054d08de39cc431f153220d90/intel_sycl_rt-2024.1.2-py2.py3-none-manylinux1_x86_64.whl
wheel unpack intel_sycl_rt-2024.1.2-py2.py3-none-manylinux1_x86_64.whl
mkdir -p ./intel_sycl_rt-2024.1.2/intel_sycl_rt-2024.1.2.data/data/include
cp -r /opt/intel/oneapi/compiler/2024.1/include/sycl ./intel_sycl_rt-2024.*/intel_sycl_rt-2024.*.data/data/include/
wheel pack intel_sycl_rt-2024.1.2 --build headers_patch

mv intel_sycl_rt-2024.1.2-headers_patch-py2.py3-none-manylinux1_x86_64.whl intel_sycl_rt-2024.1.2-py2.py3-none-manylinux1_x86_64.whl
pip install --force-reinstall ./intel_sycl_rt-2024.1.2-py2.py3-none-manylinux1_x86_64.whl
pip install dpcpp_cpp_rt==2024.1.2 numpy matplotlib pandas

find /opt/intel/oneapi/mkl/2025.0/lib/ \( -name '*.so' -or -name '*.so.*' \) -exec cp -n {} $HOME/.conda/envs/$name/lib \;
find /opt/intel/oneapi/compiler/2024.1/lib/ \( -name '*.so' -or -name '*.so.*' \) -exec cp -n {} $HOME/.conda/envs/$name/lib \;


export LD_LIBRARY_PATH=$HOME/.conda/envs/$name/lib:${LD_LIBRARY_PATH:-}
ln -snf /usr/include/level_zero $HOME/.conda/envs/$name/bin/../x86_64-conda-linux-gnu/sysroot/usr/include/level_zero
find /usr -name libze_\* -exec ln -sf {} $HOME/.conda/envs/$name/lib/ \;

cd intel-xpu-backend-for-triton/
conda run --no-capture-output -n $name scripts/compile-triton.sh --triton 2>&1 | tee ../$name.log

conda run --no-capture-output -n $name bash -v -x scripts/test-triton.sh 2>&1 | tee -a ../$name.log

@ZzEeKkAa (Contributor) commented:

@leshikus thank you for confirming. As far as I can see, it is pretty much the same, with these differences:

  • conda with Python installed is used instead of a virtual environment
  • multiple conda packages were used instead of system-wide packages
  • MKL libraries were added to the environment instead of statically linking them at PyTorch build time

@leshikus (Contributor) commented Aug 24, 2024

yes, there are differences; both conda and venv are used; I'm testing the PR right now: #2000

  1. conda can be removed;
  2. I have no instruction how to compile mkl statically, thus I used the simple variant;
  3. another difference is that I need more compiler libraries;
  4. it is still a much smaller dependency set than the original basekit; thank you for your effort;
  5. more tests pass here than in the original conda-basekit workflow.

@leshikus (Contributor) commented Aug 24, 2024

@vlad-penkin what do you think about our strategic direction: should it be conda, venv, both, or neither?

@vlad-penkin vlad-penkin assigned vlad-penkin and anmyachev and unassigned ZzEeKkAa Aug 26, 2024
@anmyachev (Contributor) commented Aug 26, 2024

> I have no instruction how to compile mkl statically, thus I used the simple variant;

@leshikus these are probably just pip packages, according to the pytorch build script:

pip install mkl-static mkl-include

It might also be necessary to use: export USE_STATIC_MKL=1

anmyachev added a commit that referenced this issue Aug 27, 2024
Add support to https://pypi.org/project/intel-sycl-rt/ wheel package
that is described #1717

---------

Signed-off-by: Anatoly Myachev <[email protected]>
Co-authored-by: Anatoly Myachev <[email protected]>
@vlad-penkin vlad-penkin reopened this Sep 9, 2024
ZzEeKkAa added a commit to ZzEeKkAa/intel-xpu-backend-for-triton that referenced this issue Oct 29, 2024
Add support to https://pypi.org/project/intel-sycl-rt/ wheel package
that is described intel#1717

---------

Signed-off-by: Anatoly Myachev <[email protected]>
Co-authored-by: Anatoly Myachev <[email protected]>