# Description

Found a bug while walking through the shortfin llm docs using the latest `nightly` sharktank.

The `gguf` package is currently incompatible with NumPy 2.0 and later, because it calls `ndarray.newbyteorder`, which NumPy 2.0 removed. This breaks `sharktank.examples.export_paged_llm_v1` on Linux.

The gguf issue is filed [here](ggerganov/llama.cpp#9021). It was closed due to inactivity, but it isn't actually solved, and a PR with the fix is still open.

## Repro Steps

On Linux:

### Before re-pinning

Create a virtual environment:

```bash
python -m venv --prompt sharktank .venv
source .venv/bin/activate
```

Install dependencies and sharktank:

```bash
pip install -r pytorch-cpu-requirements.txt
pip install -r requirements.txt -e sharktank/
```

Show numpy version (before re-pinning):

```bash
pip show numpy | grep Version
Version: 2.1.3
```

Try running `export_paged_llm_v1`:

```bash
python -m sharktank.examples.export_paged_llm_v1 \
  --gguf-file=$PATH_TO_GGUF \
  --output-mlir=./temp/model.mlir \
  --output-config=./temp/config.json \
  --bs=1,4
```

You'll see this error:

```text
Traceback (most recent call last):
  File "<frozen runpy>", line 198, in _run_module_as_main
  File "<frozen runpy>", line 88, in _run_code
  File "/home/stbaione/repos/SHARK-Platform/sharktank/sharktank/examples/export_paged_llm_v1.py", line 336, in <module>
    main()
  File "/home/stbaione/repos/SHARK-Platform/sharktank/sharktank/examples/export_paged_llm_v1.py", line 67, in main
    dataset = cli.get_input_dataset(args)
              ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/stbaione/repos/SHARK-Platform/sharktank/sharktank/utils/cli.py", line 104, in get_input_dataset
    return Dataset.load(data_files["gguf"], file_type="gguf")
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/stbaione/repos/SHARK-Platform/sharktank/sharktank/types/theta.py", line 347, in load
    ds = _dataset_load_helper(path, file_type=file_type, mmap=mmap)
         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/stbaione/repos/SHARK-Platform/sharktank/sharktank/types/theta.py", line 536, in _dataset_load_helper
    return gguf_interop.load_file(path)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/stbaione/repos/SHARK-Platform/sharktank/sharktank/types/gguf_interop/base.py", line 117, in load_file
    reader = GGUFReader(gguf_path)
             ^^^^^^^^^^^^^^^^^^^^^
  File "/home/stbaione/repos/SHARK-Platform/.venv_2/lib/python3.12/site-packages/gguf/gguf_reader.py", line 87, in __init__
    if self._get(offs, np.uint32, override_order = '<')[0] != GGUF_MAGIC:
       ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/stbaione/repos/SHARK-Platform/.venv_2/lib/python3.12/site-packages/gguf/gguf_reader.py", line 137, in _get
    .newbyteorder(override_order or self.byte_order)
     ^^^^^^^^^^^^
AttributeError: `newbyteorder` was removed from the ndarray class in NumPy 2.0. Use `arr.view(arr.dtype.newbyteorder(order))` instead.
```

### After re-pinning

Create a virtual environment:

```bash
python -m venv --prompt sharktank .venv
source .venv/bin/activate
```

Install dependencies and sharktank:

```bash
pip install -r pytorch-cpu-requirements.txt
pip install -r requirements.txt -e sharktank/
```

Show numpy version:

```bash
pip show numpy | grep Version
Version: 1.26.3
```

Run `export_paged_llm_v1`:

```bash
python -m sharktank.examples.export_paged_llm_v1 \
  --gguf-file=$PATH_TO_GGUF \
  --output-mlir=./temp/model.mlir \
  --output-config=./temp/config.json \
  --bs=1,4
```

With the re-pin, we get the desired output:

```text
Exporting decode_bs1
Exporting prefill_bs4
Exporting decode_bs4
GENERATED!
Exporting
Saving to './temp/model.mlir'
```

---------

Co-authored-by: Marius Brehler <[email protected]>
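
For reference, the replacement named in the `AttributeError` above can be sketched as follows. This is only an illustration of the NumPy call pattern, not the actual upstream gguf fix; the helper name `with_byte_order` and the sample bytes are hypothetical and do not come from gguf or sharktank.

```python
import numpy as np


def with_byte_order(arr: np.ndarray, order: str) -> np.ndarray:
    """Reinterpret the bytes of `arr` using the given byte order ('<', '>', '=' or 'S').

    Hypothetical helper: `ndarray.newbyteorder()` was removed in NumPy 2.0,
    but `dtype.newbyteorder()` is available in both NumPy 1.x and 2.x.
    """
    return arr.view(arr.dtype.newbyteorder(order))


# Four sample bytes spelling "GGUF", read as a single 32-bit unsigned integer.
raw = np.frombuffer(b"GGUF", dtype=np.uint32)

# Force a little-endian interpretation, mirroring override_order='<' in the traceback.
magic = with_byte_order(raw, "<")[0]
print(hex(int(magic)))
```

Because this form works on both NumPy 1.x and 2.x, a fix along these lines in gguf would make the pin unnecessary; until then, re-pinning keeps the export path working.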