Migrating database to new model #1062

@ManishMadan2882 (Collaborator) commented Aug 11, 2024

  • What kind of change does this PR introduce? (Bug fix, feature, docs update, ...)

API keys | Chatbots:

  1. An API key contains either a DBRef to a source or a retriever name (one of "classic", "brave_search", "duckduck_search").
  2. The retriever named in the key replaces the need to specify default sources.
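
As a rough illustration of the two key shapes described above, the sketch below classifies an API key document. The field names `source` and `retriever`, the `SourceRef` stand-in for a DBRef, and the function `classify_api_key` are assumptions made for this example; they are not taken from the PR's code.

```python
from dataclasses import dataclass
from typing import Union

# Retriever names listed in the PR description.
RETRIEVERS = {"classic", "brave_search", "duckduck_search"}


@dataclass(frozen=True)
class SourceRef:
    """Hypothetical stand-in for a DBRef: a collection name plus a document id."""
    collection: str
    id: str


def classify_api_key(doc: dict) -> Union[SourceRef, str]:
    """Return the source reference if the key has one, else its retriever name.

    The "source" and "retriever" field names are assumptions for this sketch.
    """
    source = doc.get("source")
    if isinstance(source, SourceRef):
        return source
    retriever = doc.get("retriever")
    if retriever in RETRIEVERS:
        return retriever
    raise ValueError("API key needs a source reference or a known retriever name")
```

A key that carries neither a source reference nor a known retriever name is rejected here; whether the real backend rejects such keys or falls back to "classic" is not stated in the description.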

Sources:

  1. The "location" of a local doc should ideally be present in the Mongo document.
  2. Vector indexes for sources move to a new storage pattern.
    • Currently, indexes are stored at /application/indexes/input/userId/documentName
    • Indexes should now be stored at /application/indexes/<_id>
      Figure: the previous and new conventions for storing vector indexes.
  3. Sources should be referenced as DBRef("vector", ObjectId("_id")).
  4. A source may specify a preferred retriever; if none is present, "classic" is assumed.
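
The path change and the retriever default can be sketched as follows. This is a minimal illustration, not the PR's migration script: the helper names and the `retriever` field name are hypothetical, and only the two path layouts and the "classic" default come from the description above.

```python
from pathlib import PurePosixPath

INDEX_ROOT = PurePosixPath("/application/indexes")


def legacy_index_path(user_id: str, document_name: str) -> PurePosixPath:
    """Old layout: /application/indexes/input/userId/documentName."""
    return INDEX_ROOT / "input" / user_id / document_name


def new_index_path(source_id: str) -> PurePosixPath:
    """New layout: /application/indexes/<_id>, keyed by the Mongo document id."""
    return INDEX_ROOT / source_id


def preferred_retriever(source_doc: dict) -> str:
    """Use the source's retriever if set; otherwise assume "classic".

    The "retriever" field name is an assumption for this sketch.
    """
    return source_doc.get("retriever") or "classic"
```

Keying the directory on the document's `_id` means the path no longer depends on the user id or the document name, so renaming a document would not require moving its index.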

Streaming:

  1. Stream via api_keys
    • When streaming through an API key, if a source is present, the get_retriever method returns that source's retriever if one is set (default: "classic").
  2. Stream via active_docs
    • "active_docs" is the ObjectId of the document in the vector collection.
    • "active_docs" is sent in the request body.
    • If a retriever is present, it is obtained from the get_retriever(active_doc) method.
  3. Stream via retriever
    • "retriever": one of "brave_search", "duckduck_search", "classic".
    • The sources are {}.
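
The three streaming modes above could be dispatched roughly as in the sketch below. It is written under several assumptions: the function `select_retriever`, the `find_source` lookup callback, the `source` and `retriever` field names, and the order in which the modes are checked are all hypothetical. Only the request key "active_docs", the retriever names, and the "classic" default come from the description.

```python
from typing import Callable, Optional, Tuple

KNOWN_RETRIEVERS = ("classic", "brave_search", "duckduck_search")
DEFAULT_RETRIEVER = "classic"


def select_retriever(
    body: dict,
    api_key_doc: Optional[dict],
    find_source: Callable[[str], Optional[dict]],
) -> Tuple[str, Optional[str]]:
    """Return (retriever_name, source_id or None) for a streaming request."""
    # Mode 1: stream via API key. The key's source, if any, picks the retriever.
    if api_key_doc is not None:
        source_id = api_key_doc.get("source")
        if source_id is not None:
            source = find_source(source_id) or {}
            return source.get("retriever") or DEFAULT_RETRIEVER, source_id
        return api_key_doc.get("retriever") or DEFAULT_RETRIEVER, None

    # Mode 2: stream via active_docs, the id of a document in the vector collection.
    if "active_docs" in body:
        source_id = body["active_docs"]
        source = find_source(source_id) or {}
        return source.get("retriever") or DEFAULT_RETRIEVER, source_id

    # Mode 3: stream via a named retriever with no sources attached.
    name = body.get("retriever", DEFAULT_RETRIEVER)
    if name not in KNOWN_RETRIEVERS:
        raise ValueError(f"unknown retriever: {name!r}")
    return name, None
```

For example, with an in-memory lookup such as `{"abc": {"retriever": "brave_search"}}.get` passed as `find_source`, a request body of `{"active_docs": "abc"}` resolves to the "brave_search" retriever for source "abc".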
  • Why was this change needed? (You can also link to an open issue here)
    * Closes 🚀 Migrating database to new model #1059
    * Refactors the legacy codebase to align with the new design patterns, focusing primarily on the database schema.

Frontend:

Changes made to adapt to the backend migration:
  • Added id as an optional field for Documents.
  • If id is present in selectedDoc, it is sent as the source in API calls to the retrieval endpoints (stream, answer); otherwise docsLink is added to the payload as the retriever.
  • Updated the createAPIKey method so that the retriever is added to the payload when id is absent, and made the required changes to the Dropdown component.


vercel bot commented Aug 11, 2024

@ManishMadan2882 is attempting to deploy a commit to the Arc53 Team on Vercel.

A member of the Team first needs to authorize it.

@github-actions github-actions bot added the application Application label Aug 11, 2024

vercel bot commented Aug 11, 2024

The latest updates on your projects. Learn more about Vercel for Git ↗︎

Name | Status | Preview | Comments | Updated (UTC)
docs-gpt | ✅ Ready (Inspect) | Visit Preview | 💬 Add feedback | Sep 9, 2024 8:00pm


codecov bot commented Sep 8, 2024

Codecov Report

Attention: Patch coverage is 9.45946% with 134 lines in your changes missing coverage. Please review.

Please upload report for BASE (1059-migrating-database-to-new-model@f9dbaa9). Learn more about missing BASE report.

Files with missing lines | Patch % | Lines
application/api/user/routes.py | 1.85% | 53 Missing ⚠️
application/api/answer/routes.py | 3.03% | 32 Missing ⚠️
scripts/migrate_to_v1_vectorstore.py | 0.00% | 28 Missing ⚠️
application/worker.py | 22.22% | 14 Missing ⚠️
application/api/internal/routes.py | 16.66% | 5 Missing ⚠️
application/retriever/classic_rag.py | 0.00% | 1 Missing ⚠️
application/vectorstore/faiss.py | 87.50% | 1 Missing ⚠️
Additional details and impacted files
@@                           Coverage Diff                           @@
##             1059-migrating-database-to-new-model    #1062   +/-   ##
=======================================================================
  Coverage                                        ?   21.48%           
=======================================================================
  Files                                           ?       81           
  Lines                                           ?     3774           
  Branches                                        ?        0           
=======================================================================
  Hits                                            ?      811           
  Misses                                          ?     2963           
  Partials                                        ?        0           
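
The figures in this report can be cross-checked with a few lines of arithmetic. The sketch below uses only numbers quoted in the report; the total of 148 changed lines is not stated anywhere and is inferred here from the 134 missing lines and the 9.45946% patch coverage.

```python
import math

# "Missing" line counts per file, as listed in the report.
missing_by_file = {
    "application/api/user/routes.py": 53,
    "application/api/answer/routes.py": 32,
    "scripts/migrate_to_v1_vectorstore.py": 28,
    "application/worker.py": 14,
    "application/api/internal/routes.py": 5,
    "application/retriever/classic_rag.py": 1,
    "application/vectorstore/faiss.py": 1,
}
total_missing = sum(missing_by_file.values())  # matches the 134 in the summary

# Inferred (not stated in the report): changed lines implied by the patch coverage.
patch_lines = round(total_missing / (1 - 9.45946 / 100))
patch_coverage = (patch_lines - total_missing) / patch_lines * 100

# Project-level totals from the coverage diff.
hits, misses, lines = 811, 2963, 3774
project_coverage = hits / lines * 100
# Truncated to two decimals, this agrees with the reported 21.48%.
project_coverage_truncated = math.floor(project_coverage * 100) / 100
```

The per-file missing counts add up to the 134 lines in the summary, and hits plus misses equal the 3774 total lines, so the report is internally consistent.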


@dartpain dartpain merged commit c686d95 into arc53:1059-migrating-database-to-new-model Sep 9, 2024
8 checks passed

2 participants