Refactor reranking (#1113)
Signed-off-by: WenjiaoYue <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>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: ZePan110 <[email protected]>
3 people authored Jan 8, 2025
1 parent ca21633 commit 267cad1
Showing 27 changed files with 235 additions and 112 deletions.
2 changes: 1 addition & 1 deletion .github/CODEOWNERS
@@ -21,7 +21,7 @@
 /comps/lvms/ [email protected] [email protected]
 /comps/prompt_registry/ [email protected] [email protected]
 /comps/ragas/ [email protected] [email protected]
-/comps/reranks/ [email protected] [email protected]
+/comps/rerankings/ [email protected] [email protected]
 /comps/retrievers/ [email protected] [email protected]
 /comps/text2image/ [email protected] [email protected]
 /comps/text2sql/ [email protected] [email protected]
@@ -5,5 +5,5 @@
 services:
   reranking:
     build:
-      dockerfile: comps/reranks/src/Dockerfile
+      dockerfile: comps/rerankings/src/Dockerfile
     image: ${REGISTRY:-opea}/reranking:${TAG:-latest}
2 changes: 1 addition & 1 deletion .github/workflows/manual-comps-test.yml
@@ -7,7 +7,7 @@ on:
    inputs:
      services:
        default: "asr"
-       description: "List of services to test [agent,asr,chathistory,dataprep,embeddings,feedback_management,finetuning,guardrails,intent_detection,knowledgegraphs,llms,lvms,nginx,prompt_registry,ragas,reranks,retrievers,tts,vectorstores,web_retrievers]"
+       description: "List of services to test [agent,asr,chathistory,dataprep,embeddings,feedback_management,finetuning,guardrails,intent_detection,knowledgegraphs,llms,lvms,nginx,prompt_registry,ragas,rerankings,retrievers,tts,vectorstores,web_retrievers]"
        required: true
        type: string
      build:
2 changes: 1 addition & 1 deletion .github/workflows/manual-docker-publish.yml
@@ -7,7 +7,7 @@ on:
    inputs:
      services:
        default: ""
-       description: "List of services to test [agent,asr,chathistory,dataprep,embeddings,feedback_management,finetuning,guardrails,intent_detection,knowledgegraphs,llms,lvms,nginx,prompt_registry,ragas,reranks,retrievers,tts,vectorstores,web_retrievers]"
+       description: "List of services to test [agent,asr,chathistory,dataprep,embeddings,feedback_management,finetuning,guardrails,intent_detection,knowledgegraphs,llms,lvms,nginx,prompt_registry,ragas,rerankings,retrievers,tts,vectorstores,web_retrievers]"
        required: false
        type: string
      images:
2 changes: 1 addition & 1 deletion .github/workflows/manual-docker-scan.yml
@@ -7,7 +7,7 @@ on:
    inputs:
      services:
        default: "asr"
-       description: "List of services to test [agent_langchain,asr,chathistory_mongo,dataprep_milvus...]" #,embeddings,guardrails,llms,lvms,prompt_registry,ragas,reranks,retrievers,tts,vectorstores,web_retrievers]"
+       description: "List of services to test [agent_langchain,asr,chathistory_mongo,dataprep_milvus...]" #,embeddings,guardrails,llms,lvms,prompt_registry,ragas,rerankings,retrievers,tts,vectorstores,web_retrievers]"
        required: false
        type: string
      images:
4 changes: 2 additions & 2 deletions README.md
@@ -41,8 +41,8 @@ The initially supported `Microservices` are described in the below table. More `
 | [Embedding](./comps/embeddings/src/README.md) | [LangChain](https://www.langchain.com)/[LlamaIndex](https://www.llamaindex.ai) | [BAAI/bge-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) | [TEI-Gaudi](https://github.com/huggingface/tei-gaudi) | Gaudi2 | Embedding on Gaudi2 |
 | [Embedding](./comps/embeddings/src/README.md) | [LangChain](https://www.langchain.com)/[LlamaIndex](https://www.llamaindex.ai) | [BAAI/bge-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) | [TEI](https://github.com/huggingface/text-embeddings-inference) | Xeon | Embedding on Xeon CPU |
 | [Retriever](./comps/retrievers/src/README.md) | [LangChain](https://www.langchain.com)/[LlamaIndex](https://www.llamaindex.ai) | [BAAI/bge-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) | [TEI](https://github.com/huggingface/text-embeddings-inference) | Xeon | Retriever on Xeon CPU |
-| [Reranking](./comps/reranks/src/README.md) | [LangChain](https://www.langchain.com)/[LlamaIndex](https://www.llamaindex.ai) | [BAAI/bge-reranker-base](https://huggingface.co/BAAI/bge-reranker-base) | [TEI-Gaudi](https://github.com/huggingface/tei-gaudi) | Gaudi2 | Reranking on Gaudi2 |
-| [Reranking](./comps/reranks/src/README.md) | [LangChain](https://www.langchain.com)/[LlamaIndex](https://www.llamaindex.ai) | [BBAAI/bge-reranker-base](https://huggingface.co/BAAI/bge-reranker-base) | [TEI](https://github.com/huggingface/text-embeddings-inference) | Xeon | Reranking on Xeon CPU |
+| [Reranking](./comps/rerankings/src/README.md) | [LangChain](https://www.langchain.com)/[LlamaIndex](https://www.llamaindex.ai) | [BAAI/bge-reranker-base](https://huggingface.co/BAAI/bge-reranker-base) | [TEI-Gaudi](https://github.com/huggingface/tei-gaudi) | Gaudi2 | Reranking on Gaudi2 |
+| [Reranking](./comps/rerankings/src/README.md) | [LangChain](https://www.langchain.com)/[LlamaIndex](https://www.llamaindex.ai) | [BBAAI/bge-reranker-base](https://huggingface.co/BAAI/bge-reranker-base) | [TEI](https://github.com/huggingface/text-embeddings-inference) | Xeon | Reranking on Xeon CPU |
 | [ASR](./comps/asr/src/README.md) | NA | [openai/whisper-small](https://huggingface.co/openai/whisper-small) | NA | Gaudi2 | Audio-Speech-Recognition on Gaudi2 |
 | [ASR](./comps/asr/src/README.md) | NA | [openai/whisper-small](https://huggingface.co/openai/whisper-small) | NA | Xeon | Audio-Speech-RecognitionS on Xeon CPU |
 | [TTS](./comps/tts/src/README.md) | NA | [microsoft/speecht5_tts](https://huggingface.co/microsoft/speecht5_tts) | NA | Gaudi2 | Text-To-Speech on Gaudi2 |
2 changes: 1 addition & 1 deletion comps/finetuning/README.md
@@ -244,7 +244,7 @@ curl http://${your_ip}:8015/v1/finetune/list_checkpoints -X POST -H "Content-Typ

 ### 3.4 Leverage fine-tuned model

-After fine-tuning job is done, fine-tuned model can be chosen from listed checkpoints, then the fine-tuned model can be used in other microservices. For example, fine-tuned reranking model can be used in [reranks](../reranks/src/README.md) microservice by assign its path to the environment variable `RERANK_MODEL_ID`, fine-tuned embedding model can be used in [embeddings](../embeddings/src/README.md) microservice by assign its path to the environment variable `model`, LLMs after instruction tuning can be used in [llms](../llms/src/text-generation/README.md) microservice by assign its path to the environment variable `your_hf_llm_model`.
+After fine-tuning job is done, fine-tuned model can be chosen from listed checkpoints, then the fine-tuned model can be used in other microservices. For example, fine-tuned reranking model can be used in [rerankings](../rerankings/src/README.md) microservice by assign its path to the environment variable `RERANK_MODEL_ID`, fine-tuned embedding model can be used in [embeddings](../embeddings/src/README.md) microservice by assign its path to the environment variable `model`, LLMs after instruction tuning can be used in [llms](../llms/src/text-generation/README.md) microservice by assign its path to the environment variable `your_hf_llm_model`.

 ## 🚀4. Descriptions for Finetuning parameters
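The updated paragraph is a how-to: each fine-tuned artifact is wired into its microservice through an environment variable. A minimal sketch, with illustrative checkpoint paths:

```bash
# Illustrative paths only; substitute a checkpoint returned by list_checkpoints.
export RERANK_MODEL_ID=/home/user/checkpoints/ft-reranker         # rerankings service
export model=/home/user/checkpoints/ft-embedding                  # embeddings service
export your_hf_llm_model=/home/user/checkpoints/ft-llm            # llms (text-generation) service
```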
2 changes: 1 addition & 1 deletion comps/lvms/llama-vision/transformers_generation_utils.py
@@ -4238,7 +4238,7 @@ def _ranking_fast(
     alpha: float,
     beam_width: int,
 ) -> torch.FloatTensor:
-    """Reranks the top_k candidates based on a degeneration penalty (cosine similarity with previous tokens), as described
+    """Rerankings the top_k candidates based on a degeneration penalty (cosine similarity with previous tokens), as described
     in the paper "A Contrastive Framework for Neural Text Generation".

     Returns the index of the best candidate for each
@@ -29,6 +29,7 @@ services:
       http_proxy: ${http_proxy}
       https_proxy: ${https_proxy}
       TEI_RERANKING_ENDPOINT: ${TEI_RERANKING_ENDPOINT}
+      RERANK_COMPONENT_NAME: "OPEA_TEI_RERANKING"
       HF_TOKEN: ${HF_TOKEN}
     depends_on:
       tei_reranking_service:
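With the backend now selected by `RERANK_COMPONENT_NAME`, bringing up the TEI-backed reranker looks like the sketch below. The TEI compose file name is an assumption (only `rerank_videoqna.yaml` is shown in this diff):

```bash
# Assumed compose file name, by analogy with rerank_videoqna.yaml below.
export TEI_RERANKING_ENDPOINT="http://${host_ip}:8808"
export HF_TOKEN="hf_xxx"  # your HuggingFace token
docker compose -f comps/rerankings/deployment/docker_compose/rerank_tei.yaml up -d
```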
22 changes: 22 additions & 0 deletions comps/rerankings/deployment/docker_compose/rerank_videoqna.yaml
@@ -0,0 +1,22 @@
# Copyright (C) 2024 Intel Corporation
# SPDX-License-Identifier: Apache-2.0

services:
  reranking:
    image: opea/reranking:latest
    container_name: reranking-videoqna-server
    ports:
      - "8000:8000"
    ipc: host
    environment:
      no_proxy: ${no_proxy}
      http_proxy: ${http_proxy}
      https_proxy: ${https_proxy}
      CHUNK_DURATION: ${CHUNK_DURATION}
      FILE_SERVER_ENDPOINT: ${FILE_SERVER_ENDPOINT}
      RERANK_COMPONENT_NAME: "OPEA_VIDEO_RERANKING"
    restart: unless-stopped

networks:
  default:
    driver: bridge
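A minimal launch sketch for this new file; the two service-specific variables mirror the `environment` block above, and the values shown match the defaults hard-coded in `integrations/videoqna.py`:

```bash
export CHUNK_DURATION=10
export FILE_SERVER_ENDPOINT="http://${host_ip}:6005"
docker compose -f comps/rerankings/deployment/docker_compose/rerank_videoqna.yaml up -d
```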
46 changes: 46 additions & 0 deletions comps/rerankings/src/Dockerfile
@@ -0,0 +1,46 @@
# Copyright (C) 2024 Intel Corporation
# SPDX-License-Identifier: Apache-2.0

FROM python:3.11-slim

ENV LANG=C.UTF-8

ARG ARCH="cpu"
ARG SERVICE="all"

RUN apt-get update -y && apt-get install -y --no-install-recommends --fix-missing \
    git \
    libgl1-mesa-glx \
    libjemalloc-dev

RUN useradd -m -s /bin/bash user && \
    mkdir -p /home/user && \
    chown -R user /home/user/

USER user

COPY comps /home/user/comps

RUN if [ ${ARCH} = "cpu" ]; then \
      pip install --no-cache-dir torch --index-url https://download.pytorch.org/whl/cpu; \
    fi && \
    if [ ${SERVICE} = "videoqna" ]; then \
      pip install --no-cache-dir --upgrade pip setuptools && \
      pip install --no-cache-dir -r /home/user/comps/rerankings/src/requirements_videoqna.txt; \
    elif [ ${SERVICE} = "all" ]; then \
      git clone https://github.com/IntelLabs/fastRAG.git /home/user/fastRAG && \
      cd /home/user/fastRAG && \
      pip install --no-cache-dir --upgrade pip && \
      pip install --no-cache-dir . && \
      pip install --no-cache-dir .[intel] && \
      pip install --no-cache-dir -r /home/user/comps/rerankings/src/requirements_videoqna.txt; \
    fi && \
    pip install --no-cache-dir --upgrade pip setuptools && \
    pip install --no-cache-dir -r /home/user/comps/rerankings/src/requirements.txt;

ENV PYTHONPATH=$PYTHONPATH:/home/user

WORKDIR /home/user/comps/rerankings/src

ENTRYPOINT ["python", "opea_reranking_microservice.py"]
File renamed without changes.
File renamed without changes.
File renamed without changes.
@@ -18,7 +18,7 @@
     RerankingResponseData,
 )

-logger = CustomLogger("reranking_tei")
+logger = CustomLogger("tei_reranking")
 logflag = os.getenv("LOGFLAG", False)

 # Environment variables
@@ -27,8 +27,8 @@
 CLIENT_SECRET = os.getenv("CLIENT_SECRET")


-@OpeaComponentRegistry.register("OPEA_RERANK_TEI")
-class OPEATEIReranking(OpeaComponent):
+@OpeaComponentRegistry.register("OPEA_TEI_RERANKING")
+class OpeaTEIReranking(OpeaComponent):
     """A specialized reranking component derived from OpeaComponent for TEI reranking services.

     Attributes:
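Note the rename is twofold: the registry key moves from `OPEA_RERANK_TEI` to `OPEA_TEI_RERANKING` and the class from `OPEATEIReranking` to `OpeaTEIReranking`, so any deployment that pinned the old key must update. A run sketch mirroring the updated test script at the end of this diff:

```bash
# The old RERANK_TYPE/OPEA_RERANK_TEI settings no longer resolve; use the new key.
docker run -d --name reranking-tei-server -p 8000:8000 --ipc=host \
  -e TEI_RERANKING_ENDPOINT=$TEI_RERANKING_ENDPOINT \
  -e HF_TOKEN=$HF_TOKEN \
  -e RERANK_COMPONENT_NAME="OPEA_TEI_RERANKING" \
  opea/reranking:latest
```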
121 changes: 121 additions & 0 deletions comps/rerankings/src/integrations/videoqna.py
@@ -0,0 +1,121 @@
# Copyright (C) 2024 Intel Corporation
# SPDX-License-Identifier: Apache-2.0

import logging
import os
import re

from fastapi import HTTPException

from comps import CustomLogger, LVMVideoDoc, OpeaComponentRegistry, SearchedMultimodalDoc, ServiceType
from comps.cores.common.component import OpeaComponent

logger = CustomLogger("video_reranking")
logflag = os.getenv("LOGFLAG", False)

chunk_duration = os.getenv("CHUNK_DURATION", "10") or "10"
chunk_duration = float(chunk_duration) if chunk_duration.isdigit() else 10.0

file_server_endpoint = os.getenv("FILE_SERVER_ENDPOINT") or "http://0.0.0.0:6005"

logging.basicConfig(
    level=logging.INFO, format="%(levelname)s: [%(asctime)s] %(message)s", datefmt="%d/%m/%Y %I:%M:%S"
)


def get_top_doc(top_n, videos) -> list:
    hit_score = {}
    if videos is None:
        return None
    for video_name in videos:
        try:
            if video_name not in hit_score.keys():
                hit_score[video_name] = 0
            hit_score[video_name] += 1
        except KeyError as r:
            logging.info(f"no video name {r}")

    x = dict(sorted(hit_score.items(), key=lambda item: -item[1]))  # sorted dict of video name and score
    top_n_names = list(x.keys())[:top_n]
    logging.info(f"top docs = {x}")
    logging.info(f"top n docs names = {top_n_names}")

    return top_n_names


def find_timestamp_from_video(metadata_list, video):
    return next(
        (metadata["timestamp"] for metadata in metadata_list if metadata["video"] == video),
        None,
    )


def format_video_name(video_name):
    # Check for an existing file extension
    match = re.search(r"\.(\w+)$", video_name)

    if match:
        extension = match.group(1)
        # If the extension is not 'mp4', raise an error
        if extension != "mp4":
            raise ValueError(f"Invalid file extension: .{extension}. Only '.mp4' is allowed.")

    # Use regex to remove any suffix after the base name (e.g., '_interval_0', etc.)
    base_name = re.sub(r"(_interval_\d+)?(\.mp4)?$", "", video_name)

    # Add the '.mp4' extension
    formatted_name = f"{base_name}.mp4"

    return formatted_name


@OpeaComponentRegistry.register("OPEA_VIDEO_RERANKING")
class OpeaVideoReranking(OpeaComponent):
    """A specialized reranking component derived from OpeaComponent for OPEA Video native reranking services."""

    def __init__(self, name: str, description: str, config: dict = None):
        super().__init__(name, ServiceType.RERANK.name.lower(), description, config)

    async def invoke(self, input: SearchedMultimodalDoc) -> LVMVideoDoc:
        """Invokes the reranking service to generate reranking for the provided input.

        Args:
            input (SearchedMultimodalDoc): The input in OpenAI reranking format.

        Returns:
            LVMVideoDoc: The response in OpenAI reranking format.
        """
        try:
            # get top video name from metadata
            video_names = [meta["video"] for meta in input.metadata]
            top_video_names = get_top_doc(input.top_n, video_names)

            # only use the first top video
            timestamp = find_timestamp_from_video(input.metadata, top_video_names[0])
            formatted_video_name = format_video_name(top_video_names[0])
            video_url = f"{file_server_endpoint.rstrip('/')}/{formatted_video_name}"

            result = LVMVideoDoc(
                video_url=video_url,
                prompt=input.initial_query,
                chunk_start=timestamp,
                chunk_duration=float(chunk_duration),
                max_new_tokens=512,
            )
        except ValueError as e:
            raise HTTPException(status_code=400, detail=str(e))
        except Exception as e:
            logging.error(f"Unexpected error in reranking: {str(e)}")
            # Handle any other exceptions with a generic server error response
            raise HTTPException(status_code=500, detail="An unexpected error occurred.")

        return result

    def check_health(self) -> bool:
        """Checks the health of the reranking service.

        Returns:
            bool: True if the service is reachable and healthy, False otherwise.
        """

        return True
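End to end, `OpeaVideoReranking` counts how often each video name appears in the request metadata, keeps the most frequent one, strips any `_interval_N` suffix, and builds a playback URL from `FILE_SERVER_ENDPOINT`. A request sketch (the route path is an assumption, and only the fields this component reads are shown; the full `SearchedMultimodalDoc` schema may require more):

```bash
# Assumed route; the microservice's route decorator is not shown in this diff.
curl http://localhost:8000/v1/reranking -X POST -H "Content-Type: application/json" \
  -d '{
    "initial_query": "person crossing the street",
    "top_n": 1,
    "metadata": [
      {"video": "cam2_interval_1.mp4", "timestamp": 5},
      {"video": "cam1_interval_0.mp4", "timestamp": 20},
      {"video": "cam1_interval_0.mp4", "timestamp": 20}
    ]
  }'
# cam1_interval_0.mp4 appears most often, so the expected result points at
# ${FILE_SERVER_ENDPOINT}/cam1.mp4 with chunk_start 20 and chunk_duration 10.0.
```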
@@ -5,7 +5,8 @@
 import time
 from typing import Union

-from integrations.opea_tei import OPEATEIReranking
+from integrations.tei import OpeaTEIReranking
+from integrations.videoqna import OpeaVideoReranking

 from comps import (
     CustomLogger,
@@ -22,7 +23,7 @@
 logger = CustomLogger("opea_reranking_microservice")
 logflag = os.getenv("LOGFLAG", False)

-rerank_component_name = os.getenv("RERANK_COMPONENT_NAME", "OPEA_RERANK_TEI")
+rerank_component_name = os.getenv("RERANK_COMPONENT_NAME", "OPEA_TEI_RERANKING")
 # Initialize OpeaComponentLoader
 loader = OpeaComponentLoader(rerank_component_name, description=f"OPEA RERANK Component: {rerank_component_name}")
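Both integration modules must be imported here so their `@OpeaComponentRegistry.register` decorators run before the loader resolves `RERANK_COMPONENT_NAME` (which now defaults to the TEI backend). Switching backends is then a one-variable change:

```bash
# Run from comps/rerankings/src; unset, the variable defaults to OPEA_TEI_RERANKING.
export RERANK_COMPONENT_NAME=OPEA_VIDEO_RERANKING
python opea_reranking_microservice.py
```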
File renamed without changes.
7 changes: 7 additions & 0 deletions comps/rerankings/src/requirements_videoqna.txt
@@ -0,0 +1,7 @@
datasets
haystack-ai
langchain --extra-index-url https://download.pytorch.org/whl/cpu
langchain_community --extra-index-url https://download.pytorch.org/whl/cpu
openai
Pillow
pydub
30 changes: 0 additions & 30 deletions comps/reranks/src/Dockerfile

This file was deleted.

2 changes: 1 addition & 1 deletion comps/text2sql/src/Dockerfile
@@ -30,4 +30,4 @@ ENV PYTHONPATH=$PYTHONPATH:/home/user

 WORKDIR /home/user/comps/text2sql/src/

-ENTRYPOINT ["python", "opea_text2sql_microservice.py"]
\ No newline at end of file
+ENTRYPOINT ["python", "opea_text2sql_microservice.py"]
@@ -9,7 +9,12 @@ ip_address=$(hostname -I | awk '{print $1}')

 function build_docker_images() {
     cd $WORKPATH
-    docker build --no-cache -t opea/reranking:comps --build-arg https_proxy=$https_proxy --build-arg http_proxy=$http_proxy -f comps/reranks/src/Dockerfile .
+    docker build --no-cache \
+        -t opea/reranking:comps \
+        --build-arg https_proxy=$https_proxy \
+        --build-arg http_proxy=$http_proxy \
+        --build-arg SERVICE=tei \
+        -f comps/rerankings/src/Dockerfile .
     if [ $? -ne 0 ]; then
         echo "opea/reranking built fail"
         exit 1
@@ -30,7 +35,7 @@ function start_service() {
     export TEI_RERANKING_ENDPOINT="http://${ip_address}:${tei_endpoint}"
     tei_service_port=5007
     unset http_proxy
-    docker run -d --name="test-comps-reranking-server" -e LOGFLAG=True -p ${tei_service_port}:8000 --ipc=host -e http_proxy=$http_proxy -e https_proxy=$https_proxy -e TEI_RERANKING_ENDPOINT=$TEI_RERANKING_ENDPOINT -e HF_TOKEN=$HF_TOKEN -e RERANK_TYPE="tei" opea/reranking:comps
+    docker run -d --name="test-comps-reranking-server" -e LOGFLAG=True -p ${tei_service_port}:8000 --ipc=host -e http_proxy=$http_proxy -e https_proxy=$https_proxy -e TEI_RERANKING_ENDPOINT=$TEI_RERANKING_ENDPOINT -e HF_TOKEN=$HF_TOKEN -e RERANK_COMPONENT_NAME="OPEA_TEI_RERANKING" opea/reranking:comps
     sleep 15
 }

@@ -52,7 +57,7 @@ function validate_microservice() {
 }

 function stop_docker() {
-    cid=$(docker ps -aq --filter "name=test-comps-rerank*")
+    cid=$(docker ps -aq --filter "name=test-comps-reranking*")
     if [[ ! -z "$cid" ]]; then docker stop $cid && docker rm $cid && sleep 1s; fi
 }