ROCm support #5

Merged: 23 commits, Dec 29, 2023
44 changes: 44 additions & 0 deletions .ci/configure.py
@@ -0,0 +1,44 @@
import glob
import jinja2
import sys

def render_template(filepath, **options):
    if filepath.endswith(".jinja2"):
        # read input file
        with open(filepath, "r") as file:
            template = jinja2.Template(file.read())

        # render template
        rendered = template.render(**options)

        # write output file
        with open(filepath[:-7], "w") as file:
            file.write(rendered)

def main():
    # by default, use cuda
    cuda = True
    rocm = False

    # enable rocm if specified
    if len(sys.argv) == 2:
        if sys.argv[1] == "rocm":
            cuda = False
            rocm = True

    # list of rendered files
    rendered = []

    # render every file
    for filepath in glob.glob("**/*.jinja2", recursive=True) + [".gitignore.jinja2"]:
        # render file
        render_template(filepath, CUDA=cuda, ROCm=rocm, rendered=rendered)

        # add output file to rendered list
        rendered.append(filepath[:-7])

        # print status
        print(f"File '{filepath}' rendered successfully")

if __name__ == "__main__":
    main()
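For context, a minimal sketch of the kind of template this script consumes (the file name and package names here are illustrative, not taken from this PR): a hypothetical packages.x86_64.jinja2 could contain

{% if CUDA %}
cuda
{% endif %}
{% if ROCm %}
rocm-hip-sdk
{% endif %}

Running python3 .ci/configure.py rocm from the repository root, as the ROCm workflow below does, would then write packages.x86_64 with only the ROCm branch kept.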
7 changes: 6 additions & 1 deletion .github/workflows/build-iso-cuda.yml
@@ -35,7 +35,7 @@ jobs:
pacman --sync --noconfirm --sysupgrade

# Install required packages
pacman --sync --noconfirm --needed archiso patch
pacman --sync --noconfirm --needed archiso patch python python-jinja

# Apply patch to archiso
patch -p0 << 'EOF'
@@ -54,6 +54,11 @@ jobs:
# export build artifacts for netboot
EOF

# Configure to use CUDA
pushd /workspace
python3 .ci/configure.py cuda
popd

# Build image
mkarchiso -v -m iso -w /workspace/work -o /workspace/out /workspace

69 changes: 69 additions & 0 deletions .github/workflows/build-iso-rocm.yml
@@ -0,0 +1,69 @@
name: Build ISO (ROCm)

on:
  - push
  - pull_request

jobs:
  build:
    runs-on: ubuntu-latest

    steps:
      - name: Cleanup
        uses: rokibhasansagar/slimhub_actions@main
        with:
          retain: "docker_imgcache,docker_buildkit,docker_imgcache"

      - name: Checkout repository
        uses: actions/checkout@v4
        with:
          submodules: recursive

      - name: Build image
        uses: addnab/docker-run-action@v3
        with:
          image: archlinux:latest
          options: --privileged --volume ${{ github.workspace }}:/workspace
          run: |
            # Exit on error
            set -eu

            # Refresh package databases
            pacman --sync --noconfirm --refresh

            # Upgrade system
            pacman --sync --noconfirm --sysupgrade

            # Install required packages
            pacman --sync --noconfirm --needed archiso patch python python-jinja

            # Apply patch to archiso
            patch -p0 << 'EOF'
            --- /usr/bin/mkarchiso
            +++ /usr/bin/mkarchiso
            @@ -1227,6 +1227,10 @@
                 if [[ -v cert_list ]]; then
                     _cms_sign_artifact "${airootfs_image_filename}"
                 fi
            +
            +    _msg_info 'Removing the pacstrap directory...'
            +    rm -rf -- "${pacstrap_dir:?}/"
            +    _msg_info 'Done!'
             }

             # export build artifacts for netboot
            EOF

            # Configure to use ROCm
            pushd /workspace
            python3 .ci/configure.py rocm
            popd

            # Build image
            mkarchiso -v -m iso -w /workspace/work -o /workspace/out /workspace

      - name: Upload artifacts
        uses: actions/upload-artifact@v4
        with:
          name: archiso-output
          path: out/
11 changes: 11 additions & 0 deletions .gitignore
@@ -1,2 +1,13 @@
out/
work/

# rendered files
packages.x86_64
airootfs/root/customize_airootfs.sh
airootfs/root/customize_airootfs/scripts/1000-axolotl-dependencies.sh
airootfs/root/customize_airootfs/scripts/0100-koboldcpp-patches.sh
airootfs/root/customize_airootfs/scripts/1000-sillytavern-extras-dependencies.sh
airootfs/root/customize_airootfs/scripts/1000-vllm-dependencies.sh
airootfs/root/customize_airootfs/scripts/1000-text-generation-webui-dependencies.sh
airootfs/root/customize_airootfs/scripts/0100-automatic-patches.sh
airootfs/root/customize_airootfs/scripts/9999-cleanup.sh
7 changes: 7 additions & 0 deletions .gitignore.jinja2
@@ -0,0 +1,7 @@
out/
work/

# rendered files
{% for file in rendered %}
{{- file}}
{% endfor %}
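For reference, the - in {{- file}} is Jinja2 whitespace control: it strips the whitespace, including the newline, that precedes the expression on each loop pass, so the rendered .gitignore lists one path per line with no blank lines between entries, matching the committed .gitignore above.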
10 changes: 6 additions & 4 deletions README.md
@@ -15,18 +15,20 @@ If you would like to see another AI-related project included in ToriLinux, pleas

* Easy setup: just boot the ISO, and you will have a working setup for training and/or inferencing Large Language Models/Stable Diffusion/etc.
* Fully offline training and/or inference.
* Includes performance state switcher, which reduces GPU temperatures when inference is not running (only automatic & koboldcpp for now).
* Includes performance state switcher, which reduces GPU temperatures when inference is not running (only NVIDIA, only automatic & koboldcpp for now).

## Usage

To use ToriLinux:
1. Install [Ventoy](https://ventoy.net/en/doc_start.html) on a USB drive.
2. Download the latest ISO from [workflows](https://github.com/sasha0552/ToriLinux/actions?query=branch%3Amain) and copy it to the USB drive.
2. Download the latest ISO from workflows ([NVIDIA](https://github.com/sasha0552/ToriLinux/actions/workflows/build-iso-cuda.yml?query=branch%3Amain) / [AMD](https://github.com/sasha0552/ToriLinux/actions/workflows/build-iso-rocm.yml?query=branch%3Amain)) and copy it to the USB drive.
3. Boot from the USB drive (select it as the boot device in BIOS/UEFI).
4. Log in with the username `tori` and password `tori`. You can also use [SSH](https://en.wikipedia.org/wiki/Secure_Shell).

Please note that ToriLinux currently works only with NVIDIA GPUs. If you would like a ROCm (AMD GPUs) version, please open an issue, and I'll make one.

## Misc

Note that you need pre-downloaded models on a local hard drive or NFS server, or enough RAM and an internet connection to download models directly into RAM.

Note that the following projects are not available in the ROCm version:
* [axolotl](https://github.com/OpenAccess-AI-Collective/axolotl)
* [vllm](https://github.com/vllm-project/vllm)
2 changes: 1 addition & 1 deletion airootfs/home/tori/axolotl
2 changes: 1 addition & 1 deletion airootfs/home/tori/vllm
@@ -7,6 +7,14 @@ mv /usr/lib/os-release.new /usr/lib/os-release
# set user password
echo "tori:tori" | chpasswd

# remove any jinja2 files
find -type f -name "*.jinja2" -print -delete

{% if ROCm %}
# remove nvidia-persistenced if rocm
rm -f /etc/systemd/system/multi-user.target.wants/nvidia-persistenced.service
{% endif %}

# enter user directory
cd "/home/tori"

@@ -9,7 +9,7 @@

uint32_t seed = -1; // RNG seed
int32_t n_keep = 0; // number of tokens to keep from initial prompt
@@ -712,7 +712,7 @@ struct llama_server_context
@@ -711,7 +711,7 @@ struct llama_server_context
}

slot->params.stream = json_value(data, "stream", false);
@@ -18,12 +18,12 @@
slot->params.n_predict = json_value(data, "n_predict", default_params.n_predict);
slot->sparams.top_k = json_value(data, "top_k", default_sparams.top_k);
slot->sparams.top_p = json_value(data, "top_p", default_sparams.top_p);
@@ -2439,7 +2439,7 @@ json oaicompat_completion_params_parse(
// Map OpenAI parameters to llama.cpp parameters
@@ -2446,7 +2446,7 @@ json oaicompat_completion_params_parse(
llama_sampling_params default_sparams;
llama_params["model"] = json_value(body, "model", std::string("uknown"));
llama_params["prompt"] = format_chatml(body["messages"]); // OpenAI 'messages' to llama.cpp 'prompt'
- llama_params["cache_prompt"] = json_value(body, "cache_prompt", false);
+ llama_params["cache_prompt"] = json_value(body, "cache_prompt", true);
llama_params["temperature"] = json_value(body, "temperature", 0.8);
llama_params["top_k"] = json_value(body, "top_k", 40);
llama_params["top_p"] = json_value(body, "top_p", 0.95);
llama_params["temperature"] = json_value(body, "temperature", 0.0);
llama_params["top_k"] = json_value(body, "top_k", default_sparams.top_k);
llama_params["top_p"] = json_value(body, "top_p", 1.0);
@@ -14,6 +14,8 @@ pushd "automatic"
sed -i 's/lambda: {"choices": theme.list_themes()}, refresh=theme.refresh_themes/{"choices": ["black-teal"]}/g' modules/shared.py
sed -i 's/shared.opts.motd/False/g' modules/api/api.py

{% if CUDA %}
# drop pstate in idle
patch -p1 < "$CUSTOMIZE_AIROOTFS/patches/0000-automatic-drop-pstate-in-idle.patch"
{% endif %}
popd
@@ -3,6 +3,8 @@ set -eu

# koboldcpp patches
pushd "koboldcpp"
{% if CUDA %}
# drop pstate in idle
patch -p1 < "$CUSTOMIZE_AIROOTFS/patches/0000-koboldcpp-drop-pstate-in-idle.patch"
{% endif %}
popd
4 changes: 2 additions & 2 deletions airootfs/root/customize_airootfs/scripts/0100-vllm-patches.sh
@@ -3,6 +3,6 @@ set -eu

# vllm patches
pushd "vllm"
# build for pascal
patch -p1 < "$CUSTOMIZE_AIROOTFS/patches/0100-vllm-build-for-pascal.patch"
# enable other architectures
patch -p1 < "$CUSTOMIZE_AIROOTFS/patches/0100-vllm-enable-other-archs.patch"
popd
@@ -14,4 +14,10 @@ pushd "automatic"
# install dependencies
python3 launch.py --test
deactivate

# remove installation config
rm config.json

# remove installation log
rm sdnext.log
popd
@@ -3,6 +3,7 @@ set -eu

# axolotl dependencies
pushd "axolotl"
{% if CUDA %}
# disable package caching
export PIP_NO_CACHE_DIR=0

@@ -33,4 +34,5 @@ pushd "axolotl"
# downgrade flash-attn (https://github.com/OpenAccess-AI-Collective/axolotl/issues/911#issuecomment-1868546443)
pip3 install flash-attn==2.3.2
deactivate
{% endif %}
popd
@@ -11,8 +11,17 @@ pushd "SillyTavern-Extras"

# activate venv
source venv/bin/activate
# install dependencies
{% if CUDA %}
# install dependencies (cuda)
pip3 install -r requirements.txt
{% endif %}

{% if ROCm %}
# install dependencies (rocm)
pip3 install -r requirements-rocm.txt
{% endif %}

# install remaining dependencies
pip3 install -r requirements-coqui.txt
pip3 install -r requirements-rvc.txt
deactivate

This file was deleted.

@@ -0,0 +1,30 @@
#!/bin/bash
set -eu

# text-generation-webui dependencies
pushd "text-generation-webui"
# disable package caching
export PIP_NO_CACHE_DIR=0

# create venv
python3 -m venv venv

# activate venv
source venv/bin/activate
{% if CUDA %}
# install dependencies (cuda)
pip3 install -r requirements.txt
{% endif %}

{% if ROCm %}
# extract pytorch version
index_url=$(grep -o 'https://download.pytorch.org/whl/rocm[0-9.]*' one_click.py)

# install pytorch
pip3 install torch torchvision torchaudio --index-url "$index_url"

# install dependencies (rocm)
pip3 install -r requirements_amd.txt
{% endif %}
deactivate
popd
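As a side note on the ROCm branch above: the grep picks up whatever PyTorch ROCm index URL one_click.py pins, so if upstream pins, say, rocm5.6 (version purely illustrative), index_url would resolve to https://download.pytorch.org/whl/rocm5.6 and pip would pull the matching ROCm wheels for torch, torchvision, and torchaudio.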
@@ -3,6 +3,7 @@ set -eu

# vllm dependencies
pushd "vllm"
{% if CUDA %}
# disable package caching
export PIP_NO_CACHE_DIR=0

@@ -43,4 +44,5 @@ pushd "vllm"
# install dependencies
pip3 install -r requirements.txt
deactivate
{% endif %}
popd
11 changes: 0 additions & 11 deletions airootfs/root/customize_airootfs/scripts/2000-automatic-cleanup.sh

This file was deleted.

@@ -9,3 +9,11 @@ rm -fr /home/tori/.config/matplotlib

# keras
rm -fr /home/tori/.keras

{% if ROCm %}
# remove axolotl if rocm
rm -fr /home/tori/axolotl

# remove vllm if rocm
rm -fr /home/tori/vllm
{% endif %}