diff --git a/README_GAUDI.md b/README_GAUDI.md
index 6ba3bb50d4a04..483b6e6cda741 100644
--- a/README_GAUDI.md
+++ b/README_GAUDI.md
@@ -81,6 +81,7 @@ Supported Features
 - Inference with [HPU Graphs](https://docs.habana.ai/en/latest/PyTorch/Inference_on_PyTorch/Inference_Using_HPU_Graphs.html) for accelerating low-batch latency and throughput
+- Attention with Linear Biases (ALiBi)
 - INC quantization
 Unsupported Features
@@ -88,7 +89,6 @@ Unsupported Features
 - Beam search
 - LoRA adapters
-- Attention with Linear Biases (ALiBi)
 - AWQ quantization
 - Prefill chunking (mixed-batch inferencing)
diff --git a/docs/source/getting_started/gaudi-installation.rst b/docs/source/getting_started/gaudi-installation.rst
index 5915de92802d9..c9df862197f0a 100644
--- a/docs/source/getting_started/gaudi-installation.rst
+++ b/docs/source/getting_started/gaudi-installation.rst
@@ -76,6 +76,7 @@ Supported Features
 - Tensor parallelism support for multi-card inference
 - Inference with `HPU Graphs <https://docs.habana.ai/en/latest/PyTorch/Inference_on_PyTorch/Inference_Using_HPU_Graphs.html>`__ for accelerating low-batch latency and throughput
+- Attention with Linear Biases (ALiBi)
 - INC quantization
 Unsupported Features
@@ -83,7 +84,6 @@ Unsupported Features
 - Beam search
 - LoRA adapters
-- Attention with Linear Biases (ALiBi)
 - AWQ quantization
 - Prefill chunking (mixed-batch inferencing)