How to get blob container files URL to use as metadata for pinecone #7883
Unanswered
levalencia asked this question in Q&A
Replies: 1 comment 1 reply
Having checked the source of the document loader, I don't think this will be possible: https://github.com/hwchase17/langchain/blob/master/langchain/document_loaders/azure_blob_storage_container.py
In my scenario, I am using LangChain and Pinecone for document search based on similarity search.
I have a lot of files in blob storage and I want to let users search for documents.
However, documents have to be split into chunks and vectorized before they are stored in Pinecone.
The similarity search works pretty well, but it only returns the matching chunk, not the document it came from.
One way to overcome this would be to have the URL of the original file in the chunk metadata (vector metadata).
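To make that idea concrete, here is a minimal sketch (not the code from my indexing pipeline, and not LangChain's API). It assumes the standard Azure blob URL layout `https://<account>.blob.core.windows.net/<container>/<blob>`; the helper names `blob_url`, `split_text`, and `chunk_records`, the record layout, and the account and container names are all hypothetical, and the fixed-size splitter is only a stand-in for a real text splitter.

```python
from urllib.parse import quote


def blob_url(account: str, container: str, blob_name: str) -> str:
    """Build a blob URL from its parts (assumes the default Azure endpoint)."""
    # Keep "/" unescaped so virtual folders inside the container stay readable.
    return (
        f"https://{account}.blob.core.windows.net/"
        f"{container}/{quote(blob_name, safe='/')}"
    )


def split_text(text: str, chunk_size: int = 1000, overlap: int = 100) -> list[str]:
    """Naive fixed-size splitter with overlap (stand-in for a real splitter)."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    if not text:
        return []
    step = chunk_size - overlap
    # Stop once the remaining tail is already covered by the previous chunk.
    stop = max(len(text) - overlap, 1)
    return [text[i:i + chunk_size] for i in range(0, stop, step)]


def chunk_records(
    account: str,
    container: str,
    blob_name: str,
    text: str,
    chunk_size: int = 1000,
    overlap: int = 100,
) -> list[dict]:
    """Split one blob's text and stamp every chunk with the blob's URL."""
    url = blob_url(account, container, blob_name)
    return [
        {
            "id": f"{blob_name}#{i}",  # hypothetical id scheme: blob name + chunk index
            "text": chunk,
            "metadata": {"source": url, "blob_name": blob_name, "chunk": i},
        }
        for i, chunk in enumerate(split_text(text, chunk_size, overlap))
    ]


if __name__ == "__main__":
    # "myaccount" and "documents" are placeholder names.
    records = chunk_records(
        "myaccount", "documents", "reports/q1 2023.txt", "lorem ipsum " * 50,
        chunk_size=200, overlap=20,
    )
    for r in records:
        print(r["id"], r["metadata"]["source"])
```

The `text` and `metadata` fields could then be passed as the texts and metadatas when the vectors are upserted, so each search hit carries `metadata["source"]` and the results can be grouped by URL to get back to whole documents. If the container is private, the plain URL will not be enough on its own; the user would still need a SAS token or other authorization to open the file.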
How would you approach this?
My code for indexing looks like this. (Ignore the text analytics part)