Add PGVector DocumentStore #142

Closed
10 tasks done
Tracked by #6670
mathislucka opened this issue Dec 22, 2023 · 2 comments
Labels: new integration, P1

Comments

mathislucka (Member) commented Dec 22, 2023

Summary and motivation

PGVector is a popular request from our community. We should have it in 2.0.

Detailed design

TBD

Checklist

If the request is accepted, ensure the following checklist is complete before closing this issue.

mathislucka added the new integration and P2 labels Dec 22, 2023
mathislucka added this to the 2.0 DocumentStores milestone Dec 22, 2023

sahusiddharth (Contributor) commented Dec 30, 2023

@mathislucka I have started working on this. Can you clarify something for me?

For metadata filtering, the template suggests one approach, but the Chroma Document Store follows a quite different one. Which should I follow?

There is one more thing I wanted to ask.

Context: Different Document Stores expose different methods. They have 4 in common [count_documents, filter_documents, write_documents, delete_documents], plus some helpers that exist only to support those four, but beyond that some stores have extra methods.

Question: Do we have to implement those extra methods in every Document Store? Could you give me a list of all the methods that are required, based on our current requirements?

anakin87 (Member) commented Jan 3, 2024

Hey @sahusiddharth, sorry for the late reply.

I recently developed the Pinecone Document Store, so I'll try to share some suggestions.

  • the official metadata filtering is that of the template. You can find more information in the docs.
    I think filters in Chroma Document Store are the legacy filters.

  • as explained in Creating Custom Document Stores, each Document Store must respect the Document Store protocol, by implementing the 4 standard methods (respecting their signature). For functions that are specific to your Document Store, you are free to implement any other methods you need.

  • incremental development: I suggest you create one of the 4 standard methods at a time and test it. DocumentStoreBaseTests was recently refactored to allow this.
    An example

    1. you implement count_documents
    2. in test_document_store.py, you define the test class TestDocumentStore(CountDocumentsTest): please note that here we are only inheriting from CountDocumentsTest
    3. you make tests pass
    4. you implement write_documents
    5. in test_document_store.py, you define the testing class TestDocumentStore(CountDocumentsTest, WriteDocumentsTest): here we are also inheriting from WriteDocumentsTest
    6. ...
  • Elasticsearch Document Store can be considered a good reference, also for the implementation of filters
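
The incremental workflow above can be sketched roughly as follows. This is a minimal, self-contained illustration, not the Haystack code: `Document`, `SketchDocumentStore`, `make_store` and `run_tests` are hypothetical stand-ins, and `CountDocumentsTest` / `WriteDocumentsTest` are redefined here only to mirror the names mentioned above. The real protocol signatures carry more detail (for example, how duplicates are handled on write), so take them from the Document Store protocol rather than from this sketch.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Optional
import uuid


@dataclass
class Document:
    """Stand-in for Haystack's Document, reduced to what this sketch needs."""
    content: str
    meta: Dict[str, Any] = field(default_factory=dict)
    id: str = field(default_factory=lambda: uuid.uuid4().hex)


class SketchDocumentStore:
    """Hypothetical in-memory store exposing the four standard methods."""

    def __init__(self) -> None:
        self._docs: Dict[str, Document] = {}

    def count_documents(self) -> int:
        return len(self._docs)

    def write_documents(self, documents: List[Document]) -> int:
        # Simplification: a document with an existing id is overwritten.
        for doc in documents:
            self._docs[doc.id] = doc
        return len(documents)

    def filter_documents(self, filters: Optional[Dict[str, Any]] = None) -> List[Document]:
        # Filter handling is deliberately left out of this sketch.
        if filters:
            raise NotImplementedError("filters are not covered by this sketch")
        return list(self._docs.values())

    def delete_documents(self, document_ids: List[str]) -> None:
        for doc_id in document_ids:
            self._docs.pop(doc_id, None)


class CountDocumentsTest:
    """Stand-in test mixin: exercises only count_documents."""

    def test_count_empty(self):
        assert self.make_store().count_documents() == 0


class WriteDocumentsTest:
    """Stand-in test mixin: exercises only write_documents."""

    def test_write_then_count(self):
        store = self.make_store()
        assert store.write_documents([Document(content="hello")]) == 1
        assert store.count_documents() == 1


# Step 2 would inherit from CountDocumentsTest alone; this is the step 5 shape.
class TestDocumentStore(CountDocumentsTest, WriteDocumentsTest):
    def make_store(self) -> SketchDocumentStore:
        return SketchDocumentStore()


def run_tests(case: Any) -> List[str]:
    """Call every test_* method on the instance and return the names that ran."""
    names = sorted(name for name in dir(case) if name.startswith("test_"))
    for name in names:
        getattr(case, name)()
    return names
```

Adding another mixin to the class bases brings in the next group of tests, so each standard method can be implemented and verified before moving on.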

Feel free to ask for clarification, if needed!
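
For a PGVector store, the template-style filters mentioned above would eventually have to become SQL. The sketch below shows one possible way to turn a nested filter dictionary into a parameterized `WHERE` fragment. It is an assumption-laden illustration, not the integration's actual code: the table layout (plain `id` and `content` columns plus a `jsonb` column named `meta`), the function name `to_where`, and the reduced operator set are all hypothetical, and only `==`, `!=`, `AND` and `OR` are handled.

```python
from typing import Any, Dict, List, Tuple

# Assumed table layout: plain columns "id" and "content", plus a jsonb column "meta".
PLAIN_COLUMNS = {"id", "content"}
COMPARISONS = {"==": "=", "!=": "<>"}
LOGICAL = {"AND": " AND ", "OR": " OR "}


def to_where(filters: Dict[str, Any]) -> Tuple[str, List[Any]]:
    """Return a SQL fragment with %s placeholders and its parameter list."""
    operator = filters["operator"]

    if operator in LOGICAL:
        conditions = filters["conditions"]
        if not conditions:
            raise ValueError("a logical operator needs at least one condition")
        fragments: List[str] = []
        params: List[Any] = []
        for condition in conditions:
            fragment, fragment_params = to_where(condition)
            fragments.append(fragment)
            params.extend(fragment_params)
        return "(" + LOGICAL[operator].join(fragments) + ")", params

    if operator not in COMPARISONS:
        raise ValueError(f"operator not covered by this sketch: {operator}")

    name = filters["field"]
    if name.startswith("meta."):
        # ->> extracts the value as text, so the comparison here is textual;
        # numeric or date comparisons would need an explicit cast.
        key = name[len("meta."):]
        return f"meta->>%s {COMPARISONS[operator]} %s", [key, str(filters["value"])]

    # Column names cannot be passed as parameters, so only known names are allowed.
    if name not in PLAIN_COLUMNS:
        raise ValueError(f"unknown field: {name}")
    return f"{name} {COMPARISONS[operator]} %s", [filters["value"]]


example = {
    "operator": "AND",
    "conditions": [
        {"field": "meta.type", "operator": "==", "value": "article"},
        {"field": "content", "operator": "!=", "value": ""},
    ],
}
where_sql, where_params = to_where(example)
```

Keeping values in a separate parameter list, rather than formatting them into the SQL string, lets the database driver handle quoting.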
