Commit
Signed-off-by: WenjiaoYue <[redacted]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: ZePan110 <[email protected]>
1 parent ca21633, commit 267cad1
Showing 27 changed files with 235 additions and 112 deletions.
@@ -21,7 +21,7 @@
/comps/lvms/ [email protected] [email protected]
/comps/prompt_registry/ [email protected] [email protected]
/comps/ragas/ [email protected] [email protected]
/comps/reranks/ [email protected] [email protected]
/comps/rerankings/ [email protected] [email protected]
/comps/retrievers/ [email protected] [email protected]
/comps/text2image/ [email protected] [email protected]
/comps/text2sql/ [email protected] [email protected]
File renamed without changes.
22 changes: 22 additions & 0 deletions
comps/rerankings/deployment/docker_compose/rerank_videoqna.yaml
@@ -0,0 +1,22 @@
# Copyright (C) 2024 Intel Corporation
# SPDX-License-Identifier: Apache-2.0

services:
  reranking:
    image: opea/reranking:latest
    container_name: reranking-videoqna-server
    ports:
      - "8000:8000"
    ipc: host
    environment:
      no_proxy: ${no_proxy}
      http_proxy: ${http_proxy}
      https_proxy: ${https_proxy}
      CHUNK_DURATION: ${CHUNK_DURATION}
      FILE_SERVER_ENDPOINT: ${FILE_SERVER_ENDPOINT}
      RERANK_COMPONENT_NAME: "OPEA_VIDEO_RERANKING"
    restart: unless-stopped

networks:
  default:
    driver: bridge
File renamed without changes.
@@ -0,0 +1,46 @@
# Copyright (C) 2024 Intel Corporation
# SPDX-License-Identifier: Apache-2.0

FROM python:3.11-slim

ENV LANG=C.UTF-8

ARG ARCH="cpu"
ARG SERVICE="all"

RUN apt-get update -y && apt-get install -y --no-install-recommends --fix-missing \
    git \
    libgl1-mesa-glx \
    libjemalloc-dev

RUN useradd -m -s /bin/bash user && \
    mkdir -p /home/user && \
    chown -R user /home/user/

USER user

COPY comps /home/user/comps

RUN if [ ${ARCH} = "cpu" ]; then \
      pip install --no-cache-dir torch --index-url https://download.pytorch.org/whl/cpu; \
    fi && \
    if [ ${SERVICE} = "videoqna" ]; then \
      pip install --no-cache-dir --upgrade pip setuptools && \
      pip install --no-cache-dir -r /home/user/comps/rerankings/src/requirements_videoqna.txt; \
    elif [ ${SERVICE} = "all" ]; then \
      git clone https://github.com/IntelLabs/fastRAG.git /home/user/fastRAG && \
      cd /home/user/fastRAG && \
      pip install --no-cache-dir --upgrade pip && \
      pip install --no-cache-dir . && \
      pip install --no-cache-dir .[intel] && \
      pip install --no-cache-dir -r /home/user/comps/rerankings/src/requirements_videoqna.txt; \
    fi && \
    pip install --no-cache-dir --upgrade pip setuptools && \
    pip install --no-cache-dir -r /home/user/comps/rerankings/src/requirements.txt;


ENV PYTHONPATH=$PYTHONPATH:/home/user

WORKDIR /home/user/comps/rerankings/src

ENTRYPOINT ["python", "opea_reranking_microservice.py"]
File renamed without changes.
File renamed without changes.
File renamed without changes
File renamed without changes.
@@ -0,0 +1,121 @@
# Copyright (C) 2024 Intel Corporation
# SPDX-License-Identifier: Apache-2.0

import logging
import os
import re

from fastapi import HTTPException

from comps import CustomLogger, LVMVideoDoc, OpeaComponentRegistry, SearchedMultimodalDoc, ServiceType
from comps.cores.common.component import OpeaComponent

logger = CustomLogger("video_reranking")
logflag = os.getenv("LOGFLAG", False)

chunk_duration = os.getenv("CHUNK_DURATION", "10") or "10"
chunk_duration = float(chunk_duration) if chunk_duration.isdigit() else 10.0

file_server_endpoint = os.getenv("FILE_SERVER_ENDPOINT") or "http://0.0.0.0:6005"

logging.basicConfig(
    level=logging.INFO, format="%(levelname)s: [%(asctime)s] %(message)s", datefmt="%d/%m/%Y %I:%M:%S"
)


def get_top_doc(top_n, videos) -> list:
    hit_score = {}
    if videos is None:
        return None
    for video_name in videos:
        try:
            if video_name not in hit_score.keys():
                hit_score[video_name] = 0
            hit_score[video_name] += 1
        except KeyError as r:
            logging.info(f"no video name {r}")

    x = dict(sorted(hit_score.items(), key=lambda item: -item[1]))  # sorted dict of video name and score
    top_n_names = list(x.keys())[:top_n]
    logging.info(f"top docs = {x}")
    logging.info(f"top n docs names = {top_n_names}")

    return top_n_names


def find_timestamp_from_video(metadata_list, video):
    return next(
        (metadata["timestamp"] for metadata in metadata_list if metadata["video"] == video),
        None,
    )


def format_video_name(video_name):
    # Check for an existing file extension
    match = re.search(r"\.(\w+)$", video_name)

    if match:
        extension = match.group(1)
        # If the extension is not 'mp4', raise an error
        if extension != "mp4":
            raise ValueError(f"Invalid file extension: .{extension}. Only '.mp4' is allowed.")

    # Use regex to remove any suffix after the base name (e.g., '_interval_0', etc.)
    base_name = re.sub(r"(_interval_\d+)?(\.mp4)?$", "", video_name)

    # Add the '.mp4' extension
    formatted_name = f"{base_name}.mp4"

    return formatted_name


@OpeaComponentRegistry.register("OPEA_VIDEO_RERANKING")
class OpeaVideoReranking(OpeaComponent):
    """A specialized reranking component derived from OpeaComponent for OPEA Video native reranking services."""

    def __init__(self, name: str, description: str, config: dict = None):
        super().__init__(name, ServiceType.RERANK.name.lower(), description, config)

    async def invoke(self, input: SearchedMultimodalDoc) -> LVMVideoDoc:
        """Invokes the reranking service to generate reranking for the provided input.
        Args:
            input (SearchedMultimodalDoc): The input in OpenAI reranking format.
        Returns:
            LVMVideoDoc: The response in OpenAI reranking format.
        """
        try:
            # get top video name from metadata
            video_names = [meta["video"] for meta in input.metadata]
            top_video_names = get_top_doc(input.top_n, video_names)

            # only use the first top video
            timestamp = find_timestamp_from_video(input.metadata, top_video_names[0])
            formatted_video_name = format_video_name(top_video_names[0])
            video_url = f"{file_server_endpoint.rstrip('/')}/{formatted_video_name}"

            result = LVMVideoDoc(
                video_url=video_url,
                prompt=input.initial_query,
                chunk_start=timestamp,
                chunk_duration=float(chunk_duration),
                max_new_tokens=512,
            )
        except ValueError as e:
            raise HTTPException(status_code=400, detail=str(e))
        except Exception as e:
            logging.error(f"Unexpected error in reranking: {str(e)}")
            # Handle any other exceptions with a generic server error response
            raise HTTPException(status_code=500, detail="An unexpected error occurred.")

        return result

    def check_health(self) -> bool:
        """Checks the health of the reranking service.
        Returns:
            bool: True if the service is reachable and healthy, False otherwise.
        """

        return True
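The selection logic in `invoke` above can be tried outside the service with a small standalone sketch. It is not part of the commit: the helper names `top_videos`, `to_mp4_name`, and `build_video_url` are hypothetical, `collections.Counter` is swapped in for the hand-rolled dictionary count, and the sample hit list is made up. It assumes, as the diff suggests, that each retrieval hit carries a video name, that the most frequently hit name wins, and that ties keep first-seen order.

```python
import re
from collections import Counter


def top_videos(top_n, video_names):
    """Return up to top_n names ordered by hit count; ties keep first-seen order."""
    counts = Counter(video_names)
    return [name for name, _ in counts.most_common(top_n)]


def to_mp4_name(video_name):
    """Drop an optional '_interval_<n>' suffix and make sure the name ends in '.mp4'."""
    match = re.search(r"\.(\w+)$", video_name)
    if match and match.group(1) != "mp4":
        raise ValueError(f"Invalid file extension: .{match.group(1)}. Only '.mp4' is allowed.")
    base_name = re.sub(r"(_interval_\d+)?(\.mp4)?$", "", video_name)
    return f"{base_name}.mp4"


def build_video_url(endpoint, video_name):
    """Join the file server endpoint and the formatted name with exactly one slash."""
    return f"{endpoint.rstrip('/')}/{to_mp4_name(video_name)}"


# Hypothetical retrieval hits: one entry per matched chunk.
hits = ["b.mp4", "a_interval_0", "a_interval_0", "c.mp4"]
best = top_videos(1, hits)[0]
url = build_video_url("http://0.0.0.0:6005/", best)
print(best, url)
```

Only the first ranked name is turned into a URL here, mirroring the "only use the first top video" comment in the committed code.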
File renamed without changes.
@@ -0,0 +1,7 @@
datasets
haystack-ai
langchain --extra-index-url https://download.pytorch.org/whl/cpu
langchain_community --extra-index-url https://download.pytorch.org/whl/cpu
openai
Pillow
pydub
This file was deleted.