diff --git a/docs/source/quantization/inc.rst b/docs/source/quantization/inc.rst
index 4d9020f3186c1..76d5c662409df 100644
--- a/docs/source/quantization/inc.rst
+++ b/docs/source/quantization/inc.rst
@@ -7,7 +7,7 @@ vLLM supports FP8 (8-bit floating point) weight and activation quantization usin
 Currently, quantization is supported only for Llama models.
 
 Intel Gaudi supports quantization of various modules and functions, including, but not limited to ``Linear``, ``KVCache``, ``Matmul`` and ``Softmax``. For more information, please refer to:
-`Supported Modules\Supported Functions\Custom Patched Modules `_.
+`Supported Modules\\Supported Functions\\Custom Patched Modules `_.
 
 .. note::
    Measurement files are required to run quantized models with vLLM on Gaudi accelerators. The FP8 model calibration procedure is described in the `vllm-hpu-extention `_ package.
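
For context on the workflow this doc page covers, below is a minimal sketch of running an INC-quantized model with vLLM on Gaudi. The model name and the quantization-config path are placeholders, and the ``quantization="inc"`` / ``kv_cache_dtype="fp8_inc"`` / ``QUANT_CONFIG`` settings reflect the usual Gaudi INC usage rather than anything introduced by this diff, so treat it as an illustration, not part of the change.

```python
# Hedged sketch: serving an FP8 (INC) quantized Llama model on Gaudi.
# Assumptions: the measurement/calibration step from the vllm-hpu-extension
# package has already produced a quantization config; its path below and the
# model checkpoint are placeholders.
import os

from vllm import LLM, SamplingParams

# Point INC at the quantization config generated during calibration
# (path is an assumption, not taken from this diff).
os.environ["QUANT_CONFIG"] = "/path/to/maxabs_quant.json"

llm = LLM(
    model="meta-llama/Llama-3.1-8B-Instruct",  # placeholder Llama checkpoint
    quantization="inc",        # enable INC-based FP8 weight/activation quantization
    kv_cache_dtype="fp8_inc",  # quantize the KV cache as well
)

outputs = llm.generate(
    ["What is FP8 quantization?"],
    SamplingParams(temperature=0.0, max_tokens=64),
)
print(outputs[0].outputs[0].text)
```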