allow DocumentMetadata to hold arbitrary data #158

Merged Oct 26, 2023 (5 commits)

tanmaykm
Contributor

This introduces a new `custom` field in `DocumentMetadata` that is set to `nothing` by default, but can be used by user code to store arbitrary metadata against the document for later use. Having a predetermined place to store such data would simplify processing in many cases.
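
As a rough illustration of the idea, here is a minimal self-contained sketch. `SketchMetadata` is a hypothetical stand-in, not the package's actual `DocumentMetadata`; every field other than `custom`, and all the values, are assumptions made for the example.

```julia
# Hypothetical stand-in for DocumentMetadata. Only the `custom` field and its
# `nothing` default come from this PR's description; the rest is assumed.
mutable struct SketchMetadata
    language::String
    title::String
    author::String
    timestamp::String
    custom::Any   # free-form slot, `nothing` unless user code sets it
end

# Convenience constructor: `custom` defaults to `nothing`.
SketchMetadata(language, title, author, timestamp) =
    SketchMetadata(language, title, author, timestamp, nothing)

meta = SketchMetadata("English", "Untitled Document", "Unknown Author", "Unknown Time")
@assert meta.custom === nothing

# User code can attach any value, for example a Dict of application-specific keys.
meta.custom = Dict("source" => "crawler", "score" => 0.87)
source = meta.custom["source"]
```

Typing the slot as `Any` keeps the struct layout fixed while leaving the shape of the stored data entirely up to the caller.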

@zgornel

zgornel commented May 23, 2019

A more flexible approach would be to allow documents to hold any metadata (arbitrarily complex) and provide a mechanism for converting custom metadata to the 'standardized' DocumentMetadata: https://zgornel.github.io/StringAnalysis.jl/dev/doc_extensions/

Just for the record, there was a pull request extending DocumentMetadata with a few fields a while ago that went stale for months on end.

@tanmaykm
Contributor Author

Yes, having an abstract metadata type seems like a better idea, though the API changes may be more intrusive. Stemming depends on the language stored in the metadata, so that would need to be abstracted out, and there are a number of APIs in metadata.jl. Is there anything else?
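
For reference, the abstract-type alternative discussed above could look roughly like the sketch below. All type and function names are hypothetical and are not the TextAnalysis or StringAnalysis API; the point is only that callers such as a stemmer would go through an accessor instead of reading a field, and that custom metadata converts to a standardized form.

```julia
# All names here are hypothetical, for illustration only.
abstract type AbstractMetadataSketch end

# The "standardized" metadata that package code would understand.
struct StandardMetadata <: AbstractMetadataSketch
    language::String
    title::String
end

# A user-defined metadata type with arbitrary extra fields.
struct PaperMetadata <: AbstractMetadataSketch
    lang::String
    title::String
    doi::String
    keywords::Vector{String}
end

# Accessor that code such as a stemmer would call instead of reading a field.
language(m::StandardMetadata) = m.language
language(m::PaperMetadata) = m.lang

# Converting the custom type to the standardized one drops the extra fields.
Base.convert(::Type{StandardMetadata}, m::PaperMetadata) =
    StandardMetadata(m.lang, m.title)

# Placeholder values, not real data.
paper = PaperMetadata("English", "On Stemming", "10.0000/example", ["nlp", "stemming"])
std = convert(StandardMetadata, paper)
```

The cost, as noted, is that every place that currently reads a metadata field directly would need to switch to accessors like `language`.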

@tanmaykm
Contributor Author

Rebased to resolve conflicts.

This change is probably sufficient for now, and we can continue discussing a more appropriate metadata representation for the future.

DocumentMetadata: generalized types of data fields
Simplifying `DocumentMetadata` constructor and providing compatibility
@rssdev10
Collaborator

Hi @tanmaykm and @zgornel, I've been trying to refresh TextAnalysis over the last month. I find this PR useful, but I made some changes to keep it API-compatible. If there are no objections, I'd like to merge it.

@rssdev10 rssdev10 merged commit d57d8c4 into JuliaText:master Oct 26, 2023
7 checks passed