Multimodal Embedding Microservice #555

tileintel · 2024-08-22T22:26:49Z

Description

This PR introduces a multimodal embedding microservice that uses the BridgeTower model as its embedding model. The microservice is required for the Multimodal RAG on Videos application.

We have added several dataclasses to comps/cores/proto/docarray.py that are required by the proposed microservices.
We have provided a custom implementation of BridgeTower, adapted from the one on Hugging Face, that can compute both the embedding of a text and the joint embedding of an image-text pair.
We have employed the BridgeTower model for the Multimodal Embedding Inference Endpoint (MMEI), which runs on both CPU and HPU.
We have implemented the multimodal embedding microservice with a local multimodal embedding model (local BridgeTower running on CPU).
We have implemented the multimodal embedding microservice with MMEI_EMBEDDING_ENDPOINT.
We have provided a README file and tests.
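
To make the two input kinds above concrete (text alone versus an image-text pair), here is a minimal sketch using standard-library dataclasses. It is only an illustration: the names `TextInput`, `ImageTextPair`, and `select_embedding_mode` are hypothetical and are not taken from comps/cores/proto/docarray.py, whose actual classes are built on docarray.

```python
from dataclasses import dataclass
from typing import Union


@dataclass
class TextInput:
    """Hypothetical text-only request (not the PR's actual class)."""

    text: str


@dataclass
class ImageTextPair:
    """Hypothetical request pairing a caption with a base64-encoded image."""

    text: str
    image_b64: str


# Either input kind may be sent to the embedding service.
EmbeddingInput = Union[TextInput, ImageTextPair]


def select_embedding_mode(doc: EmbeddingInput) -> str:
    """Return which embedding path a request would take."""
    if isinstance(doc, ImageTextPair):
        return "image_text_pair"
    if isinstance(doc, TextInput):
        return "text"
    raise TypeError(f"unsupported input type: {type(doc).__name__}")
```

A service built this way would dispatch a `TextInput` to the text embedding path and an `ImageTextPair` to the joint embedding path.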

Issues

RFC: https://github.com/opea-project/docs/pull/49/files
Issue: opea-project/GenAIExamples#358

Type of change

List the type of change like below. Please delete options that are not relevant.

[ ] Bug fix (non-breaking change which fixes an issue)
[x] New feature (non-breaking change which adds new functionality)
[ ] Breaking change (fix or feature that would break existing design and interface)
[ ] Others (enhancement, documentation, validation, etc.)

Dependencies

docarray[full]
fastapi
huggingface_hub
langchain
langsmith
opentelemetry-api
opentelemetry-exporter-otlp
opentelemetry-sdk
prometheus-fastapi-instrumentator
transformers
shortuuid
uvicorn
torch
torchvision
pydantic==2.8.2
BridgeTower

Tests

We have provided two tests for this microservice:

tests/test_multimodal_embeddings_langchain_cpu.sh: tests the microservice with MMEI running on CPU.
tests/test_multimodal_embeddings_langchain_hpu.sh: tests the microservice with MMEI running on HPU.

Signed-off-by: Tiep Le <[email protected]>

comps/embeddings/multimodal_embeddings/multimodal_langchain/local_mm_embedding.py

comps/embeddings/multimodal_embeddings/multimodal_langchain/requirements.txt

codecov · 2024-08-29T01:58:00Z

Codecov Report

All modified and coverable lines are covered by tests ✅

Files with missing lines	Coverage Δ
comps/cores/proto/docarray.py	`99.09% <100.00%> (+0.08%)`	⬆️

... and 1 file with indirect coverage changes

tileintel · 2024-08-29T03:03:33Z

Hi @lvliang-intel. Thank you for your feedback about langsmith. We have removed langsmith from all places that you pointed out above. Would you please re-review this?

comps/embeddings/multimodal_embeddings/bridgetower/bridgetower_embedding.py

XuhuiRen · 2024-08-29T06:30:02Z

comps/embeddings/multimodal_embeddings/multimodal_langchain/__init__.py

I recommend changing the name of this folder. The title is too broad for the multimodal functionality it covers, especially since the use case for this code is a pretty small domain in which a text must be paired with an image to obtain the embedding.

Thanks for your feedback. We would like to keep the name of this folder as it is, for the following reasons:

Currently, it supports not only image-text pairs but also text alone (see the methods embed_query and embed_documents, which are inherited from the Embeddings interface in langchain).

BridgeTowerEmbedding can also be extended in the future to embed images alone (and other modalities). Our current implementation does not provide such methods because they are not needed for our proposed application, Multimodal RAG on Videos.

Although BridgeTowerEmbedding currently implements embed_documents, embed_query, and embed_image_text_pairs, it can also be treated as a MultimodalEmbedding interface for other modalities in future development.
We would appreciate it if you could take these comments into account and reconsider. Please let us know if any of this does not make sense or if you still insist on changing the folder name.
Thanks a lot.
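
As a rough illustration of the interface described in this reply, the sketch below exposes the three method names mentioned above (embed_documents, embed_query, embed_image_text_pairs) on a stand-in class. Everything else is an assumption: the class name `ToyMultimodalEmbedding`, the argument types, and the hash-based vectors are hypothetical and only make the example runnable without model weights. This is not BridgeTower and not the PR's implementation.

```python
import hashlib
from typing import List, Sequence


class ToyMultimodalEmbedding:
    """Stand-in exposing the three method names from the discussion.

    The hashing below is NOT BridgeTower; it only yields deterministic
    vectors so the shape of the interface can be exercised.
    """

    def __init__(self, dim: int = 8) -> None:
        # A SHA-256 digest has 32 bytes, so dim is limited to 1..32.
        if not 1 <= dim <= 32:
            raise ValueError("dim must be between 1 and 32")
        self.dim = dim

    def _embed_bytes(self, data: bytes) -> List[float]:
        # Map the first `dim` digest bytes to floats in [0, 1].
        digest = hashlib.sha256(data).digest()
        return [b / 255.0 for b in digest[: self.dim]]

    def embed_documents(self, texts: Sequence[str]) -> List[List[float]]:
        """Embed a batch of texts (text-only path)."""
        return [self._embed_bytes(t.encode("utf-8")) for t in texts]

    def embed_query(self, text: str) -> List[float]:
        """Embed a single query text, reusing the batch path."""
        return self.embed_documents([text])[0]

    def embed_image_text_pairs(
        self, texts: Sequence[str], images: Sequence[bytes]
    ) -> List[List[float]]:
        """Embed (text, image) pairs jointly; images are raw bytes here."""
        if len(texts) != len(images):
            raise ValueError("texts and images must have the same length")
        return [
            self._embed_bytes(t.encode("utf-8") + b"\x00" + img)
            for t, img in zip(texts, images)
        ]
```

A method for another modality, for example one that embeds images alone, could be added alongside these without changing the existing callers, which is the extensibility argument made in the reply.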

As this PR has been merged into #575, please close this PR.

tileintel · 2024-08-29T07:21:40Z

PR #555 has been merged into PR #575. Closing this one.
Thanks @XuhuiRen and @lvliang-intel

Co-authored-by: chen, suyue <[email protected]>

tileintel and others added 3 commits August 22, 2024 22:18

multimodal embedding for MM RAG for videos

ad685ed

Signed-off-by: Tiep Le <[email protected]>

[pre-commit.ci] auto fixes from pre-commit.com hooks

5702683

for more information, see https://pre-commit.ci

Merge pull request #1 from tileintel/multimodal-embedding

3a413f4

Multimodal embedding

tileintel requested review from XuhuiRen and lvliang-intel as code owners August 22, 2024 22:26

This was referenced Aug 22, 2024

Multimodal Embedding Microservice #547

Closed

Multimodality opea-project/GenAIExamples#358

Closed

tileintel added 3 commits August 24, 2024 01:33

Merge branch 'main' into main

5bca14d

Merge branch 'main' into main

49ccb25

Merge branch 'main' into main

f0c0a43

tileintel mentioned this pull request Aug 29, 2024

Multimodal dataprep #575

Merged

3 tasks

hshen14 requested review from XuhuiRen and lkk12014402 and removed request for XuhuiRen August 29, 2024 01:45

lvliang-intel reviewed Aug 29, 2024

comps/embeddings/multimodal_embeddings/multimodal_langchain/local_mm_embedding.py (outdated)

lvliang-intel reviewed Aug 29, 2024

comps/embeddings/multimodal_embeddings/multimodal_langchain/local_mm_embedding.py (outdated)

lvliang-intel reviewed Aug 29, 2024

comps/embeddings/multimodal_embeddings/multimodal_langchain/requirements.txt (outdated)

kevinintel added this to the v1.0 milestone Aug 29, 2024

remove langsmith

893132b

Signed-off-by: Tiep Le <[email protected]>

tileintel requested a review from lvliang-intel August 29, 2024 03:02

XuhuiRen reviewed Aug 29, 2024

comps/embeddings/multimodal_embeddings/bridgetower/bridgetower_embedding.py (outdated)

update the error message per PR reviewer

ae47db6

Signed-off-by: Tiep Le <[email protected]>

XuhuiRen reviewed Aug 29, 2024

lvliang-intel approved these changes Aug 29, 2024

tileintel closed this Aug 29, 2024

lkk12014402 pushed a commit that referenced this pull request Sep 19, 2024

chore: add support for .md file in file upload (#555)

7a67298

Co-authored-by: chen, suyue <[email protected]>
