Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: VoyageAI encoder #255

Open
wants to merge 12 commits into
base: main
Choose a base branch
from
Open

feat: VoyageAI encoder #255

wants to merge 12 commits into from

Conversation

xRiddin
Copy link

@xRiddin xRiddin commented Apr 23, 2024

New Feature: Integration of super fast and accurate VoyageAI Embeddings into semantic-router

@xRiddin xRiddin marked this pull request as ready for review April 23, 2024 17:35
@theanupllm theanupllm self-assigned this Apr 23, 2024
@jamescalam jamescalam changed the title New Feature: VoyageAI feat: VoyageAI encoder Apr 25, 2024
@jamescalam
Copy link
Member

thanks @xRiddin — running again

@jamescalam
Copy link
Member

jamescalam commented Apr 25, 2024

@xRiddin you can run poetry lock --no-update to recreate the poetry.lock file, after committing that it should fix the test issue above

@xRiddin
Copy link
Author

xRiddin commented Apr 26, 2024

@jamescalam done 👍

@xRiddin
Copy link
Author

xRiddin commented Apr 26, 2024

@jamescalam also the the maximum length of the list is 128 for Voyage AI api. If the user feeds an insane amount of data inside the lists, it will likely throw an error. Need to find a workaround for that.

@jamescalam
Copy link
Member

Running:

poetry run pytest tests/unit/encoders/test_voyageai.py

Returns errors (same as being returned in the tests here), seem to only be mocking problems afaict — @xRiddin are you able to resolve? Let us know if you want us to jump in and help on the remaining mocking/tests

I also moved voyageai from a required install to an optional install, and modified the import in the encode file accordingly. Also renamed client to _client to align with other API-based encoders (for example the Mistral encoder)

@xRiddin
Copy link
Author

xRiddin commented Apr 29, 2024

@jamescalam I'm not able to resolve the mocking problems, any help on the remaining tests are very much appreciated.

try:
client = voyageai.Client(api_key=api_key)
except Exception as e:
raise ValueError(f"Unable to connect to VoyageAI {e.args}: {e}") from e
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just to help with the test problems, the test is looking to assert the error message outputs as "VOYAGE API client failed to initialize. Error: Initialization error" but because e.args returns a tuple the output from the failure is actually "Unable to connect to VoyageAI ('Initialization error',): Initialization error".

To pass the assertion you could update your test assertion to match the args output or you could update the actual error message to ValueError(f"VOYAGE API client failed to initialize. Error: {e}") which should pass


except Exception as e:
logger.error(f"VoyageAI API call failed. Error: {error_message}")
raise ValueError(f"VoyageAI API call failed. Error: {e}") from e
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The last test problems seem to be due to the handling of exceptions in the exponential backoff, currently it will stop on the first exception and not retry at all due to the exception being raised straight away. To get it to continue for the specified number of retries as the test requires you could just add a line to continue the retries before the raise which should pass those other tests

except Exception as e:
                logger.error(f"VoyageAI API call failed. Error: {error_message}")
                if j < 6:
                    sleep(2 ** j)
                    continue
                raise ValueError(f"VoyageAI API call failed. Error: {e}") from e

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants