Set upper Numpy Version (#540)
# Description

Found a bug when walking through the shortfin llm docs using the latest
`nightly` sharktank: gguf is currently incompatible with NumPy >= 2.0,
which breaks `sharktank.examples.export_paged_llm_v1` on Linux.

The gguf issue is filed
[here](ggerganov/llama.cpp#9021). It was
closed due to inactivity, but the underlying problem isn't actually
solved, and a PR with the fix is open.
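For context, the API change behind the breakage looks like this. This is a minimal sketch of the removed call and its NumPy 2.0 replacement (named in the traceback below), not gguf's actual code:

```python
import numpy as np

arr = np.array([1, 2, 3], dtype=np.uint32)

# NumPy 1.x: gguf's GGUFReader swaps byte order directly on the array:
#     arr.newbyteorder('<')   # raises AttributeError under NumPy >= 2.0
#
# NumPy 2.0 removed ndarray.newbyteorder. The replacement suggested by
# the error message keeps newbyteorder on the dtype and reinterprets
# the existing buffer through a view:
le = arr.view(arr.dtype.newbyteorder('<'))
```

The values are unchanged on a little-endian host; only the dtype's declared byte order becomes explicit.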

## Repro Steps
On linux:

### Before re-pinning
Create a virtual environment:
```bash
python -m venv --prompt sharktank .venv
source .venv/bin/activate
```

Install dependencies and sharktank:
```bash
pip install -r pytorch-cpu-requirements.txt
pip install -r requirements.txt -e sharktank/
```

Show numpy version (before re-pinning):
```bash
pip show numpy | grep Version
Version: 2.1.3
```

Try running `export_paged_llm_v1`:
```bash
python -m sharktank.examples.export_paged_llm_v1 --gguf-file=$PATH_TO_GGUF --output-mlir=./temp/model.mlir --output-config=./temp/config.json --bs=1,4
```

You'll see this error:
```text
Traceback (most recent call last):
  File "<frozen runpy>", line 198, in _run_module_as_main
  File "<frozen runpy>", line 88, in _run_code
  File "/home/stbaione/repos/SHARK-Platform/sharktank/sharktank/examples/export_paged_llm_v1.py", line 336, in <module>
    main()
  File "/home/stbaione/repos/SHARK-Platform/sharktank/sharktank/examples/export_paged_llm_v1.py", line 67, in main
    dataset = cli.get_input_dataset(args)
              ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/stbaione/repos/SHARK-Platform/sharktank/sharktank/utils/cli.py", line 104, in get_input_dataset
    return Dataset.load(data_files["gguf"], file_type="gguf")
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/stbaione/repos/SHARK-Platform/sharktank/sharktank/types/theta.py", line 347, in load
    ds = _dataset_load_helper(path, file_type=file_type, mmap=mmap)
         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/stbaione/repos/SHARK-Platform/sharktank/sharktank/types/theta.py", line 536, in _dataset_load_helper
    return gguf_interop.load_file(path)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/stbaione/repos/SHARK-Platform/sharktank/sharktank/types/gguf_interop/base.py", line 117, in load_file
    reader = GGUFReader(gguf_path)
             ^^^^^^^^^^^^^^^^^^^^^
  File "/home/stbaione/repos/SHARK-Platform/.venv_2/lib/python3.12/site-packages/gguf/gguf_reader.py", line 87, in __init__
    if self._get(offs, np.uint32, override_order = '<')[0] != GGUF_MAGIC:
       ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/stbaione/repos/SHARK-Platform/.venv_2/lib/python3.12/site-packages/gguf/gguf_reader.py", line 137, in _get
    .newbyteorder(override_order or self.byte_order)
     ^^^^^^^^^^^^
AttributeError: `newbyteorder` was removed from the ndarray class in NumPy 2.0. Use `arr.view(arr.dtype.newbyteorder(order))` instead.
```
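Until an upstream gguf fix ships, a pre-flight check can catch the mismatch before `Dataset.load` ever reaches `GGUFReader`. `numpy_is_gguf_compatible` here is an illustrative helper, not part of sharktank:

```python
def numpy_is_gguf_compatible(version: str) -> bool:
    """True if the given NumPy version still has ndarray.newbyteorder.

    gguf 0.6.0 calls ndarray.newbyteorder, which NumPy removed in 2.0,
    so any major version >= 2 will hit the AttributeError above.
    """
    return int(version.split(".")[0]) < 2
```

In practice you would pass `numpy.__version__` to this at startup and fail fast with a clear "pip install 'numpy<2.0'" message instead of the traceback above.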

### After re-pinning
Create a virtual environment:
```bash
python -m venv --prompt sharktank .venv
source .venv/bin/activate
```

Install dependencies and sharktank:
```bash
pip install -r pytorch-cpu-requirements.txt
pip install -r requirements.txt -e sharktank/
```

Show numpy version:
```bash
pip show numpy | grep Version
Version: 1.26.3
```

Run `export_paged_llm_v1`:
```bash
python -m sharktank.examples.export_paged_llm_v1 --gguf-file=$PATH_TO_GGUF --output-mlir=./temp/model.mlir --output-config=./temp/config.json --bs=1,4
```

With re-pinning, we get the desired output:
```text
Exporting decode_bs1
Exporting prefill_bs4
Exporting decode_bs4
GENERATED!
Exporting
Saving to './temp/model.mlir'
```

---------

Co-authored-by: Marius Brehler <[email protected]>
stbaione and marbre authored Nov 15, 2024
1 parent b574649 commit 5ccfc87
Showing 1 changed file with 1 addition and 2 deletions: `sharktank/requirements.txt`

```diff
@@ -2,8 +2,7 @@ iree-turbine

 # Runtime deps.
 gguf==0.6.0
-numpy==1.26.3; sys_platform == 'win32'
-numpy; sys_platform != 'win32'
+numpy<2.0

 # Needed for newer gguf versions (TODO: remove when gguf package includes this)
 # sentencepiece>=0.1.98,<=0.2.0
```
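The `numpy<2.0` specifier replaces the two per-platform pins with a single upper bound on every OS. A rough stand-in for how pip evaluates that bound (real resolution follows PEP 440 with pre-releases, epochs, and so on; this toy only handles plain `X.Y.Z` versions):

```python
def satisfies_upper_bound(version: str, bound: str = "2.0") -> bool:
    """Crude sketch of pip's '<' specifier: compare numeric components.

    Only handles plain dotted-integer versions like '1.26.3'; pip's
    real comparison implements the full PEP 440 rules.
    """
    def to_tuple(v: str) -> tuple:
        return tuple(int(part) for part in v.split("."))

    return to_tuple(version) < to_tuple(bound)
```

Under this bound, the resolver keeps 1.26.3 and rejects both 2.0.0 and 2.1.3, which is exactly the behavior the repro above needs.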
