[BUGFIX] fix FP8 failing issue on habana_main [PatchedVLLMKVCache fwd error] (#502)

Fix an argument-incompatibility issue for FP8:

```
ERROR 11-11 04:29:13 engine.py:143]   File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py", line 1556, in _wrapped_call_impl
ERROR 11-11 04:29:13 engine.py:143]     return self._call_impl(*args, **kwargs)
ERROR 11-11 04:29:13 engine.py:143]   File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py", line 1606, in _call_impl
ERROR 11-11 04:29:13 engine.py:143]     result = forward_call(*args, **kwargs)
ERROR 11-11 04:29:13 engine.py:143] TypeError: PatchedVLLMKVCache.forward() missing 2 required positional arguments: 'block_indices' and 'block_offset'
```

Fixes #453. See also the HPU Graphs troubleshooting notes:
https://github.com/HabanaAI/vllm-fork/blob/habana_main/README_GAUDI.md#troubleshooting-tweaking-hpu-graphs
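
For context, the crash is a plain Python signature mismatch: the installed neural-compressor defines `PatchedVLLMKVCache.forward()` with two required parameters, `block_indices` and `block_offset`, that the caller does not supply. Below is a minimal self-contained sketch of that mismatch; the class and parameter names come from the traceback, while the toy forward body and tensor shapes are assumptions for illustration, not the real neural-compressor implementation.

```
import torch


class PatchedVLLMKVCache(torch.nn.Module):
    """Stand-in with the newer signature shown in the traceback."""

    def forward(self, input, cache, block_indices, block_offset):
        # Illustrative body only: scatter the incoming KV entries into
        # their paged-cache rows (block_offset is unused in this toy).
        return cache.index_copy_(0, block_indices, input)


cache = torch.zeros(4, 8)
new_kv = torch.ones(2, 8)

# A caller built against the older two-argument signature fails exactly
# like the log above:
try:
    PatchedVLLMKVCache()(new_kv, cache)
except TypeError as err:
    print(err)  # ... missing 2 required positional arguments:
                # 'block_indices' and 'block_offset'

# When caller and callee agree on the signature, the call goes through:
PatchedVLLMKVCache()(new_kv, cache, torch.tensor([1, 3]), 0)
```

Pinning neural-compressor to a commit whose `forward()` signature matches what habana_main actually passes resolves the mismatch, which is what the one-line change below does.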
xuechendi authored Nov 18, 2024
1 parent 6ae5229 commit c79982d
Showing 1 changed file with 1 addition and 0 deletions.
requirements-hpu.txt

```
@@ -9,4 +9,5 @@
 tabulate
 setuptools>=61
 setuptools-scm>=8
 vllm-hpu-extension @ git+https://github.com/HabanaAI/vllm-hpu-extension.git@3a60b49
+neural-compressor @ git+https://github.com/intel/neural-compressor.git@b196432
```
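
To apply the pin in an existing environment, reinstall the HPU requirements (standard pip usage; depending on how neural-compressor was previously installed, `--force-reinstall` may be needed for the VCS pin to take effect):

```
pip install -r requirements-hpu.txt
```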
