Add source to pdf doc metadata #735

Open
beandrad opened this issue Sep 18, 2024 · 0 comments

beandrad commented Sep 18, 2024

The doc["metadata"]["source"] value is used to set the chunk filenames. We should set the source property in the metadata of PDF documents generated using "unstructured".
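
A minimal sketch of the idea, not the project's actual code: the helper names set_source and chunk_filename, the "content" key, the example path, and the "<stem>_<index>.json" naming scheme are all assumptions made for illustration. Only the doc["metadata"]["source"] key comes from this issue.

```python
from pathlib import Path


def set_source(doc: dict, pdf_path: str) -> dict:
    """Return a copy of doc whose metadata records the originating PDF path."""
    metadata = dict(doc.get("metadata") or {})
    # Keep an existing source if the loader already provided one.
    metadata.setdefault("source", pdf_path)
    return {**doc, "metadata": metadata}


def chunk_filename(doc: dict, index: int) -> str:
    """Hypothetical naming scheme: <source file stem>_<chunk index>.json."""
    stem = Path(doc["metadata"]["source"]).stem
    return f"{stem}_{index}.json"


# A document as it might look after PDF loading, with no source set yet.
doc = {"content": "text extracted from the PDF", "metadata": {}}
doc = set_source(doc, "data/report.pdf")
name = chunk_filename(doc, 0)
```

Without the set_source step, chunk_filename would raise a KeyError for this document, which is the kind of failure the missing property could cause.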

@beandrad beandrad self-assigned this Sep 18, 2024
julia-meshcheryakova pushed a commit that referenced this issue Sep 20, 2024
Since the index chunk filename is extracted from the metadata source
property.

Closes #735