Add local Rerank microservice for VideoRAGQnA #496

Merged: 19 commits, Aug 29, 2024

Conversation

BaoHuiling
Collaborator

Description

The summary of the proposed changes as well as the relevant motivation and context.
Add local rerank support for VideoRAGQnA, which is a use case of MMRAG. The microservice reranks the retrieved videos with a local rerank algorithm. It also formats an LVMVideoDoc for the Video-Llama LVM microservice.

Issues

List the issue or RFC link this PR is working on. If there is no such link, please mark it as n/a.
The RFC is still under review. opea-project/docs#49

Type of change

List the type of change like below. Please delete options that are not relevant.

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds new functionality)
  • Breaking change (fix or feature that would break existing design and interface)
  • Others (enhancement, documentation, validation, etc.)

Dependencies

List the newly introduced 3rd party dependency if exists.
N/A

Tests

Describe the tests that you ran to verify your changes.
The test script builds, starts, validates, and cleans up the microservice: https://github.com/siddhivelankar23/GenAIComps/blob/huiling-rerank/tests/test_reranks_video-rag-qna.sh

BaoHuiling and others added 6 commits August 15, 2024 19:21
Signed-off-by: BaoHuiling <[email protected]>
Signed-off-by: BaoHuiling <[email protected]>
Signed-off-by: BaoHuiling <[email protected]>
Signed-off-by: BaoHuiling <[email protected]>

codecov bot commented Aug 15, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Files with missing lines Coverage Δ
comps/cores/proto/docarray.py 99.10% <100.00%> (+0.09%) ⬆️

... and 1 file with indirect coverage changes

@chensuyue chensuyue added this to the v0.9 milestone Aug 17, 2024
@BaoHuiling
Collaborator Author

BaoHuiling commented Aug 19, 2024

@lvliang-intel @XuhuiRen Please help review the code, thanks! The deadline is almost here, and we still need some time for possible changes if required. 🙂

@BaoHuiling
Collaborator Author

@XuhuiRen @lvliang-intel Hello, I know you are probably busy around this time, but could you help review the code? We are aiming to merge this in the current release. Please also mention anyone else who should review it, if needed.

@chensuyue chensuyue modified the milestones: v0.9, v1.0 Aug 27, 2024
Signed-off-by: BaoHuiling <[email protected]>
Signed-off-by: BaoHuiling <[email protected]>
@XuhuiRen
Collaborator

From a functionality perspective, I recommend holding this PR until the PR for video embedding and the PR for the video retriever are merged.

@BaoHuiling BaoHuiling mentioned this pull request Aug 29, 2024
3 tasks
@lvliang-intel lvliang-intel merged commit 5fb4a38 into opea-project:main Aug 29, 2024
12 checks passed
@BaoHuiling
Collaborator Author

@lvliang-intel Hi Liang, this PR conflicts with #575, and we had planned to hold it for now. Is it possible to undo this merge? Otherwise we could open another PR with a bug fix. Which would you recommend?

@lvliang-intel
Collaborator

@BaoHuiling,
Please raise another PR to fix it.

@BaoHuiling
Collaborator Author

@BaoHuiling, Please raise another PR to fix it.

Got it, thanks!

a32543254 pushed a commit to a32543254/GenAIComps that referenced this pull request Sep 3, 2024
* initial commit

Signed-off-by: BaoHuiling <[email protected]>

* save

Signed-off-by: BaoHuiling <[email protected]>

* add readme, test script, fix bug

Signed-off-by: BaoHuiling <[email protected]>

* update video URL

Signed-off-by: BaoHuiling <[email protected]>

* use default

Signed-off-by: BaoHuiling <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* update core dependency

Signed-off-by: BaoHuiling <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* use p 5000

Signed-off-by: BaoHuiling <[email protected]>

* use 5037

Signed-off-by: BaoHuiling <[email protected]>

* update ctnr name

Signed-off-by: BaoHuiling <[email protected]>

* remove langsmith

Signed-off-by: BaoHuiling <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* add rerank algo desc in readme

Signed-off-by: BaoHuiling <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

---------

Signed-off-by: BaoHuiling <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: chen, suyue <[email protected]>
Signed-off-by: Dong, Bo1 <[email protected]>
sharanshirodkar7 pushed a commit to predictionguard/pg-GenAIComps that referenced this pull request Sep 3, 2024
lvliang-intel added a commit that referenced this pull request Sep 10, 2024
* add rerank with neural speed

Signed-off-by: Dong, Bo1 <[email protected]>

* add the code

Signed-off-by: Dong, Bo1 <[email protected]>

* add the code

Signed-off-by: Dong, Bo1 <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

Signed-off-by: Dong, Bo1 <[email protected]>

* fix mismatched response format w/wo streaming guardrails (#568)

* fix mismatched response format w/wo streaming  guardrails

* fix & debug

* fix & rm debug

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

---------

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: Dong, Bo1 <[email protected]>

* Fix guardrails out handle logics for space linebreak and quote (#571)

* fix mismatched response format w/wo streaming  guardrails

* fix & debug

* fix & rm debug

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* debug

* debug

* debug

* fix pre-space and linebreak

* fix pre-space and linebreak

* fix single/double quote

* fix single/double quote

* remove debug

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

---------

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: Dong, Bo1 <[email protected]>

* BUG FIX: LVM security fix (#572)

* add url validator

Signed-off-by: BaoHuiling <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* add validation for video_url

Signed-off-by: BaoHuiling <[email protected]>

---------

Signed-off-by: BaoHuiling <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: Dong, Bo1 <[email protected]>

* Modify output messages. (#569)

* Reduced output.

Signed-off-by: zepan <[email protected]>

* Output the location where the modified Dockerfile file is referenced.

Signed-off-by: zepan <[email protected]>

* for test

Signed-off-by: zepan <[email protected]>

* Restore test file.

Signed-off-by: zepan <[email protected]>

---------

Signed-off-by: zepan <[email protected]>
Signed-off-by: Dong, Bo1 <[email protected]>

* refine logging code. (#559)

* add ut and refine logging code.

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* update microservice port.

---------

Co-authored-by: root <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: Dong, Bo1 <[email protected]>

* adding lancedb to langchain vectorstores (#291)

* adding lancedb to langchain vectorstores

Signed-off-by: sharanshirodkar7 <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

---------

Signed-off-by: sharanshirodkar7 <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: lvliang-intel <[email protected]>
Signed-off-by: Dong, Bo1 <[email protected]>

* Refine Dataprep Milvus MS (#570)

Signed-off-by: letonghan <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: Dong, Bo1 <[email protected]>

* final version

Signed-off-by: Dong, Bo1 <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

Signed-off-by: Dong, Bo1 <[email protected]>

* update the readme

Signed-off-by: Dong, Bo1 <[email protected]>

* add the sign

Signed-off-by: Dong, Bo1 <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

Signed-off-by: Dong, Bo1 <[email protected]>

* fix error for pre ci

Signed-off-by: Dong, Bo1 <[email protected]>

* add the ut

Signed-off-by: Dong, Bo1 <[email protected]>

* update docker file

Signed-off-by: Dong, Bo1 <[email protected]>

* update CI test log achieve (#577)

Signed-off-by: chensuyue <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: Dong, Bo1 <[email protected]>

* Multimodal dataprep (#575)

* multimodal embedding for MM RAG for videos

Signed-off-by: Tiep Le <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* develop data prep first commit

Signed-off-by: Tiep Le <[email protected]>

* develop dataprep microservice for multimodal data

Signed-off-by: Tiep Le <[email protected]>

* multimodal langchain for dataprep

Signed-off-by: Tiep Le <[email protected]>

* update README

Signed-off-by: Tiep Le <[email protected]>

* update README

Signed-off-by: Tiep Le <[email protected]>

* update README

Signed-off-by: Tiep Le <[email protected]>

* update README

Signed-off-by: Tiep Le <[email protected]>

* cosmetic

Signed-off-by: Tiep Le <[email protected]>

* test for multimodal dataprep

Signed-off-by: Tiep Le <[email protected]>

* update test

Signed-off-by: Tiep Le <[email protected]>

* update test

Signed-off-by: Tiep Le <[email protected]>

* update test

Signed-off-by: Tiep Le <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* cosmetic update

Signed-off-by: Tiep Le <[email protected]>

* remove langsmith

Signed-off-by: Tiep Le <[email protected]>

* update API to remove /dataprep from API names and remove langsmith

Signed-off-by: Tiep Le <[email protected]>

* update test

Signed-off-by: Tiep Le <[email protected]>

* update the error message per PR reviewer

Signed-off-by: Tiep Le <[email protected]>

---------

Signed-off-by: Tiep Le <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: Dong, Bo1 <[email protected]>

* add: Pathway vector store and retriever as LangChain component (#342)

* nb

Signed-off-by: Berke <[email protected]>

* init changes

Signed-off-by: Berke <[email protected]>

* docker

Signed-off-by: Berke <[email protected]>

* example data

Signed-off-by: Berke <[email protected]>

* docs(readme): update, add commands

Signed-off-by: Berke <[email protected]>

* fix: formatting, data sources

Signed-off-by: Berke <[email protected]>

* docs(readme): update instructions, add comments

Signed-off-by: Berke <[email protected]>

* fix: rm unused parts

Signed-off-by: Berke <[email protected]>

* fix: image name, compose env vars

Signed-off-by: Berke <[email protected]>

* fix: rm unused part

Signed-off-by: Berke <[email protected]>

* fix: logging name

Signed-off-by: Berke <[email protected]>

* fix: env var

Signed-off-by: Berke <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

Signed-off-by: Berke <[email protected]>

* fix: rename pw docker

Signed-off-by: Berke <[email protected]>

* docs(readme): update input sources

Signed-off-by: Berke <[email protected]>

* nb

Signed-off-by: Berke <[email protected]>

* init changes

Signed-off-by: Berke <[email protected]>

* fix: formatting, data sources

Signed-off-by: Berke <[email protected]>

* docs(readme): update instructions, add comments

Signed-off-by: Berke <[email protected]>

* fix: rm unused part

Signed-off-by: Berke <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

Signed-off-by: Berke <[email protected]>

* fix: rename pw docker

Signed-off-by: Berke <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

Signed-off-by: Berke <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* feat: mv vector store, naming, clarify instructions, improve ingestion components

Signed-off-by: Berke <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* tests: add pw retriever test
fix: update docker to include libmagic

Signed-off-by: Berke <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* implement suggestions from review, entrypoint, reqs, comments, https_proxy.

Signed-off-by: Berke <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fix: update docker tags in test and readme

Signed-off-by: Berke <[email protected]>

* tests: add separate pathway vectorstore test

Signed-off-by: Berke <[email protected]>

---------

Signed-off-by: Berke <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Sihan Chen <[email protected]>
Signed-off-by: Dong, Bo1 <[email protected]>

* Add local Rerank microservice for VideoRAGQnA (#496)

* initial commit

Signed-off-by: BaoHuiling <[email protected]>

* save

Signed-off-by: BaoHuiling <[email protected]>

* add readme, test script, fix bug

Signed-off-by: BaoHuiling <[email protected]>

* update video URL

Signed-off-by: BaoHuiling <[email protected]>

* use default

Signed-off-by: BaoHuiling <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* update core dependency

Signed-off-by: BaoHuiling <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* use p 5000

Signed-off-by: BaoHuiling <[email protected]>

* use 5037

Signed-off-by: BaoHuiling <[email protected]>

* update ctnr name

Signed-off-by: BaoHuiling <[email protected]>

* remove langsmith

Signed-off-by: BaoHuiling <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* add rerank algo desc in readme

Signed-off-by: BaoHuiling <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

---------

Signed-off-by: BaoHuiling <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: chen, suyue <[email protected]>
Signed-off-by: Dong, Bo1 <[email protected]>

* Add Scan Container. (#560)

Signed-off-by: zepan <[email protected]>
Signed-off-by: Dong, Bo1 <[email protected]>

* fix SearchedMultimodalDoc in docarray (#583)

Signed-off-by: BaoHuiling <[email protected]>
Signed-off-by: Dong, Bo1 <[email protected]>

* update image build yaml (#529)

Signed-off-by: chensuyue <[email protected]>
Signed-off-by: zepan <[email protected]>
Signed-off-by: Dong, Bo1 <[email protected]>

* add microservice for intent detection (#131)

* add microservice for intent detection

Signed-off-by: Liangyx2 <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* update license copyright

Signed-off-by: Liangyx2 <[email protected]>

* add ut

Signed-off-by: Liangyx2 <[email protected]>

* refine

Signed-off-by: Liangyx2 <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* update folder

Signed-off-by: Liangyx2 <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fix test

Signed-off-by: Liangyx2 <[email protected]>

---------

Signed-off-by: Liangyx2 <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: Dong, Bo1 <[email protected]>

* Make the scanning method optional. (#580)

Signed-off-by: zepan <[email protected]>
Signed-off-by: Dong, Bo1 <[email protected]>

* add code owners (#586)

Signed-off-by: Dong, Bo1 <[email protected]>

* remove revision for tei (#584)

Signed-off-by: letonghan <[email protected]>
Signed-off-by: Dong, Bo1 <[email protected]>

* Bug fix (#591)

* Check if the document exists.

Signed-off-by: zepan <[email protected]>

* Add flag output.

Signed-off-by: zepan <[email protected]>

* Modify nginx readme.

Signed-off-by: zepan <[email protected]>

* Modify document detection logic

Signed-off-by: zepan <[email protected]>

---------

Signed-off-by: zepan <[email protected]>
Signed-off-by: Dong, Bo1 <[email protected]>

* fix ut issue

Signed-off-by: Dong, Bo1 <[email protected]>

* merge the main

Signed-off-by: Dong, Bo1 <[email protected]>

* align with new pipeline

Signed-off-by: Dong, Bo1 <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* align with newest pipeline

Signed-off-by: Dong, Bo1 <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* upload code

Signed-off-by: Dong, Bo1 <[email protected]>

* update the ut

Signed-off-by: Dong, Bo1 <[email protected]>

* add docker path

Signed-off-by: Dong, Bo1 <[email protected]>

* add the docker path

Signed-off-by: Dong, Bo1 <[email protected]>

---------

Signed-off-by: Dong, Bo1 <[email protected]>
Signed-off-by: BaoHuiling <[email protected]>
Signed-off-by: zepan <[email protected]>
Signed-off-by: sharanshirodkar7 <[email protected]>
Signed-off-by: letonghan <[email protected]>
Signed-off-by: chensuyue <[email protected]>
Signed-off-by: Tiep Le <[email protected]>
Signed-off-by: Berke <[email protected]>
Signed-off-by: Liangyx2 <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Sihan Chen <[email protected]>
Co-authored-by: Huiling Bao <[email protected]>
Co-authored-by: ZePan110 <[email protected]>
Co-authored-by: lkk <[email protected]>
Co-authored-by: root <[email protected]>
Co-authored-by: Sharan Shirodkar <[email protected]>
Co-authored-by: lvliang-intel <[email protected]>
Co-authored-by: Letong Han <[email protected]>
Co-authored-by: chen, suyue <[email protected]>
Co-authored-by: Tiep Le <[email protected]>
Co-authored-by: berkecanrizai <[email protected]>
Co-authored-by: Liangyx2 <[email protected]>
Co-authored-by: kevinintel <[email protected]>
4 participants