Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix: Correct parameter name in _encode method for BGEM3FlagModel #45

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

thititj
Copy link

@thititj thititj commented Nov 6, 2024

Fix: BGEM3EmbeddingFunction encode() parameter mismatch

Issue

When using the BGEM3EmbeddingFunction class, users encounter the following error:

M3Embedder.encode() missing 1 required positional argument: 'queries'

This occurs because the _encode() method is using sentences as the parameter name when calling self.model.encode(), but the underlying BGEM3FlagModel expects the parameter to be named queries.

Solution

Changed the parameter name in the _encode() method from sentences to queries to match the expected parameter name of the underlying model:

# Before
output = self.model.encode(sentences=texts, **self._encode_config)

# After
output = self.model.encode(queries=texts, **self._encode_config)

Testing

  • Tested the fix by creating an instance of BGEM3EmbeddingFunction and successfully encoding both documents and queries
  • Verified that all return types (dense, sparse, colbert_vecs) work as expected
  • Confirmed the error no longer occurs
  • Created a demonstration notebook showing the working implementation: Google Colab Notebook

Additional Context

This fix aligns with the FlagEmbedding implementation which uses queries as the parameter name in its encode method.

Related Issues

Resolves the error reported in various contexts where users attempt to use the BGE-M3 model with Milvus.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants