Add Session History Feature for the Video RAG Chat Interface #253

Open — bashirmoham wants to merge 25 commits into opea-project:main from ttrigui:session-history
Changes from all commits — 25 commits:
- 3ea8cb0 Added VideoRAG + time-based search use case (ttrigui)
- 5e8bfe0 Added placeholders for prompt processing and UI (ttrigui)
- a245751 Cleaned up Readme, requirements and VectorDB; Added env variables for… (avbodas)
- 0e7b4aa Merge pull request #1 from avbodas/abdev (ttrigui)
- 6ef4527 update README file (ttrigui)
- 73ca4b4 add get_history function to retrive messages from session state (bashirmoham)
- 041199e Add history parameter to the function (bashirmoham)
- 6dff959 fix instruction for the assistant (bashirmoham)
- 1148bc5 [pre-commit.ci] auto fixes from pre-commit.com hooks (pre-commit-ci[bot])
- 45976c4 Enable autoescaping for Jinja2 to prevent vulnerabilities (bashirmoham)
- 3ce37aa Merge branch 'session-history' of https://github.com/ttrigui/GenAIExa… (bashirmoham)
- fa509ba [pre-commit.ci] auto fixes from pre-commit.com hooks (pre-commit-ci[bot])
- 6db5f0c Update prompt_handler.py (bashirmoham)
- efbfe5f Merge branch 'session-history' of https://github.com/ttrigui/GenAIExa… (bashirmoham)
- c726b91 [pre-commit.ci] auto fixes from pre-commit.com hooks (pre-commit-ci[bot])
- d634b42 enable chat interface (bashirmoham)
- c9a89ec [pre-commit.ci] auto fixes from pre-commit.com hooks (pre-commit-ci[bot])
- 98aa223 Update README.md (bashirmoham)
- ec4ce04 Merge branch 'session-history' of https://github.com/ttrigui/GenAIExa… (bashirmoham)
- e8dee0d [pre-commit.ci] auto fixes from pre-commit.com hooks (pre-commit-ci[bot])
- 74c3260 Update video-rag-ui.py (bashirmoham)
- a9b12cf [pre-commit.ci] auto fixes from pre-commit.com hooks (pre-commit-ci[bot])
- 392967f Update config.yaml (bashirmoham)
- 22a5d3f Update config.yaml (bashirmoham)
- 51b594a Merge branch 'main' into session-history (lvliang-intel)
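The core of this PR is a session-history helper (commit 73ca4b4) that retrieves prior chat messages from the UI session state so they can be fed back into the prompt. The PR's actual implementation is in the diff below; the following is only an illustrative sketch, modeling Streamlit-style `session_state` as a plain dict, with all names hypothetical:

```python
# Hypothetical sketch of a session-history helper; session_state stands in
# for Streamlit's st.session_state and is modeled here as a plain dict.

def get_history(session_state, max_messages=None):
    """Return the chat transcript stored in the session state.

    session_state: dict-like object holding a "messages" list of
        {"role": ..., "content": ...} entries.
    max_messages: optionally keep only the most recent N messages.
    """
    messages = session_state.get("messages", [])
    if max_messages is not None:
        messages = messages[-max_messages:]
    # Flatten into a prompt-friendly transcript string
    return "\n".join(f"{m['role']}: {m['content']}" for m in messages)


session_state = {
    "messages": [
        {"role": "user", "content": "Find the clip with the forklift."},
        {"role": "assistant", "content": "Here is the closest match."},
    ]
}
history = get_history(session_state)
```

Capping the history (`max_messages`) is a common way to keep the assembled prompt within the LLM's context window.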
@@ -0,0 +1,104 @@
# Video RAG

## Introduction

Video RAG is a framework that retrieves videos based on a user prompt. It uses both video scene descriptions, generated by open-source vision models (e.g., video-llama, video-llava), as text embeddings and video frames as image embeddings to perform vector similarity search. The solution also supports retrieving additional similar videos without a new prompt (see the example video below).

![Example Video](docs/visual-rag-demo.gif)
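The retrieval idea above can be illustrated with a toy example. This is not the project's code — just a minimal sketch of vector similarity search, using made-up three-dimensional vectors where a real deployment would use CLIP or all-MiniLM-L12-v2 embeddings:

```python
# Toy illustration of vector similarity search: score each indexed video's
# embedding against a query embedding by cosine similarity and pick the best.
import math


def cosine_similarity(a, b):
    """Cosine of the angle between vectors a and b."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm


# Hypothetical embeddings for two of the example videos.
video_index = {
    "op_10_0320241830.mp4": [0.9, 0.1, 0.0],
    "op_5_0320241915.mp4": [0.2, 0.8, 0.1],
}

# Embedding of the user's prompt (or query image).
query = [0.85, 0.2, 0.05]
best = max(video_index, key=lambda v: cosine_similarity(query, video_index[v]))
# best → "op_10_0320241830.mp4", the video whose embedding is closest in angle
```

In the actual framework, a vector database (Chroma DB or VDMS) performs this nearest-neighbor search at scale instead of a linear scan.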
## Tools

- **UI**: Gradio **or** Streamlit
- **Vector Storage**: Chroma DB **or** Intel's VDMS
- **Image Embeddings**: CLIP
- **Text Embeddings**: all-MiniLM-L12-v2
- **RAG Retriever**: LangChain Ensemble Retrieval
## Prerequisites

There are 10 example videos in `video_ingest/videos`, along with their descriptions generated by an open-source vision model. If you want Video RAG to work on your own videos, make sure they match the format below.

## File Structure
```bash
video_ingest/
.
├── scene_description
│   ├── op_10_0320241830.mp4.txt
│   ├── op_1_0320241830.mp4.txt
│   ├── op_19_0320241830.mp4.txt
│   ├── op_21_0320241830.mp4.txt
│   ├── op_24_0320241830.mp4.txt
│   ├── op_31_0320241830.mp4.txt
│   ├── op_47_0320241830.mp4.txt
│   ├── op_5_0320241915.mp4.txt
│   ├── op_DSCF2862_Rendered_001.mp4.txt
│   └── op_DSCF2864_Rendered_006.mp4.txt
└── videos
    ├── op_10_0320241830.mp4
    ├── op_1_0320241830.mp4
    ├── op_19_0320241830.mp4
    ├── op_21_0320241830.mp4
    ├── op_24_0320241830.mp4
    ├── op_31_0320241830.mp4
    ├── op_47_0320241830.mp4
    ├── op_5_0320241915.mp4
    ├── op_DSCF2862_Rendered_001.mp4
    └── op_DSCF2864_Rendered_006.mp4
```
## Setup and Installation

Install the pip requirements:

```bash
cd VideoRAGQnA
pip3 install -r docs/requirements.txt
```
The current framework supports both Chroma DB and Intel's VDMS; use either of them.

Run Chroma DB as a Docker container:

```bash
docker run -d -p 8000:8000 chromadb/chroma
```

**or**

Run VDMS as a Docker container:

```bash
docker run -d -p 55555:55555 intellabs/vdms:latest
```

**Note:** If you are not using a file structure similar to the one described above, update the paths in `config.yaml`.

Update your choice of DB and its port in `config.yaml`, then set the following environment variables:
```bash
export VECTORDB_SERVICE_HOST_IP=<ip of host where vector db is running>
export HUGGINGFACEHUB_API_TOKEN='<your HF token>'
```

A HuggingFace Hub API token can be generated [here](https://huggingface.co/login?next=%2Fsettings%2Ftokens).
Generate the image embeddings and store them in the selected DB, specifying the config file location and the video input location:

```bash
python3 embedding/generate_store_embeddings.py docs/config.yaml video_ingest/videos/
```

**Web UI Video RAG - Streamlit**

```bash
streamlit run video-rag-ui.py --server.address 0.0.0.0 --server.port 50055
```

**Web UI Video RAG - Gradio**

```bash
python3 video-rag-ui.py docs/config.yaml True '0.0.0.0' 50055
```
@@ -0,0 +1,2 @@
# Copyright (C) 2024 Intel Corporation
# SPDX-License-Identifier: Apache-2.0
@@ -0,0 +1,26 @@ | ||
# Copyright (C) 2024 Intel Corporation | ||
# SPDX-License-Identifier: Apache-2.0 | ||
|
||
# Path to all videos | ||
videos: video_ingest/videos/ | ||
# Path to video description generated by open-source vision models (ex. video-llama, video-llava, etc.) | ||
description: video_ingest/scene_description/ | ||
# Do you want to extract frames of videos (True if not done already, else False) | ||
generate_frames: True | ||
# Do you want to generate image embeddings? | ||
embed_frames: True | ||
# Path to store extracted frames | ||
image_output_dir: video_ingest/frames/ | ||
# Path to store metadata files | ||
meta_output_dir: video_ingest/frame_metadata/ | ||
# Number of frames to extract per second, | ||
# if 24 fps, and this value is 2, then it will extract 12th and 24th frame | ||
number_of_frames_per_second: 2 | ||
|
||
vector_db: | ||
choice_of_db: 'vdms' #'chroma' # #Supported databases [vdms, chroma] | ||
host: 0.0.0.0 | ||
port: 55555 #8000 # | ||
|
||
# LLM path | ||
model_path: meta-llama/Llama-2-7b-chat-hf |
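For reference, a config file in this shape can be read with PyYAML's `safe_load`. This is only a sketch of how consuming code might load it (the project's actual loading code may differ); the inline YAML string below is a trimmed stand-in for `docs/config.yaml`:

```python
# Sketch: parse a config.yaml-style document and pull out the DB settings.
import yaml  # PyYAML

config_text = """
videos: video_ingest/videos/
generate_frames: True
number_of_frames_per_second: 2
vector_db:
  choice_of_db: 'vdms'
  host: 0.0.0.0
  port: 55555
"""

config = yaml.safe_load(config_text)
db_choice = config["vector_db"]["choice_of_db"]  # 'vdms'
port = config["vector_db"]["port"]               # 55555 (parsed as an int)
```

Note that YAML parses `True` as a boolean and `55555` as an integer, so downstream code can use these values without casting.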
@@ -0,0 +1,12 @@
accelerate
chromadb
dateparser
gradio
langchain-experimental
metafunctions
open-clip-torch
opencv-python-headless
sentence-transformers
streamlit
tzlocal
vdms
@@ -0,0 +1,2 @@
# Copyright (C) 2024 Intel Corporation
# SPDX-License-Identifier: Apache-2.0
@@ -0,0 +1,120 @@

```python
# Copyright (C) 2024 Intel Corporation
# SPDX-License-Identifier: Apache-2.0

import datetime
import json
import os

import cv2
from tzlocal import get_localzone


def process_all_videos(path, image_output_dir, meta_output_dir, N, selected_db):

    def extract_frames(video_path, image_output_dir, meta_output_dir, N, date_time, local_timezone):
        video = video_path.split("/")[-1]
        # Create directories to store frames and metadata
        os.makedirs(image_output_dir, exist_ok=True)
        os.makedirs(meta_output_dir, exist_ok=True)

        # Open the video file
        cap = cv2.VideoCapture(video_path)

        if int(cv2.__version__.split(".")[0]) < 3:
            fps = cap.get(cv2.cv.CV_CAP_PROP_FPS)
        else:
            fps = cap.get(cv2.CAP_PROP_FPS)

        total_frames = cap.get(cv2.CAP_PROP_FRAME_COUNT)

        # Keep every mod-th frame to get roughly N frames per second
        mod = int(fps // N)
        if mod == 0:
            mod = 1

        print(f"total frames {total_frames}, N {N}, mod {mod}")

        # Track the number of frames read so far
        frame_count = 0

        # Metadata dictionary to store timestamps and image paths
        metadata = {}

        while cap.isOpened():
            ret, frame = cap.read()

            if not ret:
                break

            frame_count += 1

            if frame_count % mod == 0:
                timestamp = cap.get(cv2.CAP_PROP_POS_MSEC) / 1000  # Convert milliseconds to seconds
                frame_path = os.path.join(image_output_dir, f"{video}_{frame_count}.jpg")
                time = date_time.strftime("%H:%M:%S")
                date = date_time.strftime("%Y-%m-%d")
                hours, minutes, seconds = map(float, time.split(":"))
                year, month, day = map(int, date.split("-"))

                cv2.imwrite(frame_path, frame)  # Save the frame as an image

                metadata[frame_count] = {
                    "timestamp": timestamp,
                    "frame_path": frame_path,
                    "date": date,
                    "year": year,
                    "month": month,
                    "day": day,
                    "time": time,
                    "hours": hours,
                    "minutes": minutes,
                    "seconds": seconds,
                }
                if selected_db == "vdms":
                    # Localize the current time to the local timezone of the machine
                    current_time_local = date_time.replace(tzinfo=datetime.timezone.utc).astimezone(local_timezone)

                    # Convert the localized time to ISO 8601 format with timezone offset
                    iso_date_time = current_time_local.isoformat()
                    metadata[frame_count]["date_time"] = {"_date": str(iso_date_time)}

        # Save metadata to a JSON file
        metadata_file = os.path.join(meta_output_dir, f"{video}_metadata.json")
        with open(metadata_file, "w") as f:
            json.dump(metadata, f, indent=4)

        # Release the video capture
        cap.release()
        print(f"{frame_count // mod} frames extracted and metadata saved successfully.")
        return fps, total_frames, metadata_file

    videos = [file for file in os.listdir(path) if file.endswith(".mp4")]

    metadata = {}

    for i, each_video in enumerate(videos):
        video_path = os.path.join(path, each_video)
        date_time = datetime.datetime.now()
        print("date_time : ", date_time)
        # Get the local timezone of the machine
        local_timezone = get_localzone()
        fps, total_frames, metadata_file = extract_frames(
            video_path, image_output_dir, meta_output_dir, N, date_time, local_timezone
        )
        metadata[each_video] = {
            "fps": fps,
            "total_frames": total_frames,
            "extracted_frame_metadata_file": metadata_file,
            "embedding_path": f"embeddings/{each_video}.pt",
            "video_path": f"{path}/{each_video}",
        }
        print(f"✅ {i+1}/{len(videos)}")

    metadata_file = os.path.join(meta_output_dir, "metadata.json")
    with open(metadata_file, "w") as f:
        json.dump(metadata, f, indent=4)
```
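The sampling rule in `extract_frames` (`mod = int(fps // N)`, keeping every `mod`-th frame) can be checked in isolation. The sketch below is a hypothetical standalone reimplementation of just that arithmetic, not part of the PR:

```python
# Standalone sketch of the frame-sampling rule used by extract_frames:
# with fps frames per second and a target of n_per_second kept frames,
# every mod-th frame is saved, where mod = int(fps // n_per_second),
# clamped to at least 1 so low-fps videos still yield frames.

def frames_to_extract(fps, total_frames, n_per_second):
    mod = int(fps // n_per_second) or 1
    return [i for i in range(1, int(total_frames) + 1) if i % mod == 0]


# At 24 fps with n_per_second=2, mod is 12, so the 12th and 24th frame of
# each second are kept — matching the comment in config.yaml.
picked = frames_to_extract(fps=24, total_frames=48, n_per_second=2)
# picked → [12, 24, 36, 48]
```

Note the clamp: when `fps < n_per_second`, integer division yields 0, and without forcing `mod = 1` the modulo test would raise `ZeroDivisionError`.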
All microservice-related code should be placed in the GenAIComps repo. Only the Docker Compose files, Kubernetes manifests, and UI code need to be stored in the GenAIExamples repo. Please reorganize your code accordingly. Thanks.