guides for database.third_parties and OPTIMADE integration #66

Closed
jacksund opened this issue Jan 24, 2022 · 3 comments
Labels
documentation Improvements or additions to documentation enhancement New feature or request

Comments

@jacksund
Owner

This issue discusses the docs and contributing guides for simmate.database.third_parties.

Specifically, I hope to get some feedback on the quick intro to this module as well as the guide for providers. The same guide and source code can also be accessed on GitHub.

Here are some open questions about the module:

  • Are the guides clear for new providers? Where is clarification needed?
  • Is the issue of private/commercial databases properly addressed?
  • How can OPTIMADE APIs and/or archives facilitate this process? See the discussion on OPTIMADE archives.

I wrote these guides in #65 following discussion with @CasperWA, @ml-evs, and @JPBergsma.

@jacksund added the documentation and enhancement labels on Jan 24, 2022
@ml-evs

ml-evs commented Jan 25, 2022

Hi @jacksund, some initial thoughts:

  1. I would perhaps add a paragraph on why a provider might want their database to be added. Something simple would do, e.g. that this will allow simmate users to access and filter on a local copy of the database. You might also consider writing about what happens when data is added or changed in the underlying database, about versioning/timestamping the archive within your code (or as part of the archive metadata when it is downloaded), and about the ability for a database to provide a canonical reference that they want cited if the data is used.
  2. I think so, though I wouldn't expect any private databases to contribute to your code.
  3. As we discussed, OPTIMADE could be a potential meta-provider that wraps other database providers in a somewhat dynamic way, as the mapping classes would only need to be written once (and could be taken directly from optimade-python-tools, for going from JSON->pymatgen). I mentioned the archive PR in the files PR (Files endpoint Materials-Consortia/OPTIMADE#360) to try and kick off some discussion there.

Otherwise I think everything is clear!

@jacksund
Owner Author

add a paragraph as to why a provider might want their database to be added,

Definitely 👍


what happens when data is added or changes in the underlying database, versioning/timestamping the archive within your code (or as part of the archive metadata when it is downloaded)

I've thought about archive versioning but haven't implemented a solution yet. My current plan is to add timestamps to the archive files (e.g. ExampleProviderData_2022-01-24.zip). That way archives can act like fixed releases. Then load_remote_archive would grab the newest release by default -- but support loading an older version. It's good to hear you had the same train of thought.
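
As a rough sketch of that plan: the helper below picks the newest timestamped archive by default and an older one on request. The function name pick_archive and the filename parsing are assumptions made for illustration only; nothing here is Simmate's implemented behavior.

```python
import re
from datetime import date

# Assumed naming convention from the example above: <TableName>_<YYYY-MM-DD>.zip
ARCHIVE_PATTERN = re.compile(r"^(?P<name>.+)_(?P<date>\d{4}-\d{2}-\d{2})\.zip$")


def pick_archive(filenames, table_name, version=None):
    """Return the archive filename for `table_name`.

    Picks the newest dated archive unless `version` (a datetime.date) is given.
    Hypothetical helper, not part of Simmate.
    """
    dated = {}
    for filename in filenames:
        match = ARCHIVE_PATTERN.match(filename)
        # Skip files that belong to other tables or do not follow the convention
        if match and match.group("name") == table_name:
            dated[date.fromisoformat(match.group("date"))] = filename

    if not dated:
        raise FileNotFoundError(f"no archives found for {table_name}")
    if version is None:
        # Default: treat the most recent timestamp as the current release
        return dated[max(dated)]
    if version not in dated:
        raise ValueError(f"no {table_name} archive dated {version.isoformat()}")
    return dated[version]
```

Calling pick_archive(filenames, "ExampleProviderData") would return the latest release, while passing version=date(2021, 11, 2) would pin an older one, assuming an archive with that date exists.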


the ability for a database to provide a canonical reference that they want cited if the data is used.

Great idea. I can easily add a bibtex_reference attribute and print a message plus this reference whenever a user calls load_remote_archive or first accesses the data.
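
Something along these lines, as a minimal sketch: the class below is a hypothetical stand-in (not Simmate's actual table class), the BibTeX entry is a placeholder, and the download step is left out so only the citation reminder is shown.

```python
class ExampleProviderData:
    """Hypothetical provider table used only to illustrate the citation idea."""

    # Placeholder entry; a real provider would supply their preferred citation
    bibtex_reference = (
        "@misc{example_provider_placeholder,\n"
        "  title = {Placeholder title supplied by the provider},\n"
        "}"
    )

    # Tracks whether the reminder was already printed in this session
    _citation_shown = False

    @classmethod
    def citation_notice(cls):
        """Build the message shown to users before they work with the data."""
        return (
            f"Data loaded from {cls.__name__}. "
            "If you use it, please cite:\n" + cls.bibtex_reference
        )

    @classmethod
    def load_remote_archive(cls):
        # Downloading and loading the archive is omitted in this sketch.
        # The reminder is printed once, on the first load.
        if not cls._citation_shown:
            print(cls.citation_notice())
            cls._citation_shown = True
```

Printing only on the first call is one possible design choice; printing every time, or on first data access instead, would fit the idea equally well.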


I wouldn't expect any private databases to contribute to your code

lol yep 😢. I can't help dreaming big though


OPTIMADE could be a potential meta-provider that wraps other database providers in a somewhat dynamic way, as the mapping classes would only need to be written once (and could be taken directly from optimade-python-tools, for going from JSON->pymatgen)

I'm excited about this, and it'd be a nice utility in the for_providers module. It looks like there are many tools to build this functionality off of too -- like the pymatgen structure adapter in optimade-python-tools and also the optimade client in pymatgen.
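
To illustrate the "write the mapping once" idea, here is a dependency-free sketch that maps one OPTIMADE structure entry onto a plain container. The attribute names (lattice_vectors, species_at_sites, cartesian_site_positions) come from the OPTIMADE specification; SimpleStructure and structure_from_optimade are hypothetical names, and a real utility would more likely reuse the existing adapters mentioned above to build a pymatgen object.

```python
from dataclasses import dataclass


@dataclass
class SimpleStructure:
    """Minimal stand-in for a structure object (hypothetical, for illustration)."""

    source_id: str
    lattice: list      # three lattice vectors, in angstrom
    species: list      # one species name per site
    cart_coords: list  # one cartesian position per site, in angstrom


def structure_from_optimade(entry):
    """Map a single OPTIMADE 'structures' entry (a dict) to a SimpleStructure.

    Because every OPTIMADE provider returns the same attribute names, this one
    mapping would cover all of them. Disorder and the per-species details in
    the 'species' attribute are ignored in this sketch.
    """
    attributes = entry["attributes"]
    return SimpleStructure(
        source_id=entry["id"],
        lattice=attributes["lattice_vectors"],
        species=attributes["species_at_sites"],
        cart_coords=attributes["cartesian_site_positions"],
    )
```

The same function could then be applied to each item in the "data" list of a provider's response, regardless of which database served it.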


Thanks for going through this so quickly and for all these suggestions! Means a lot 😄

@jacksund changed the title from "Database third parties guides and OPTIMADE integration" to "guides for database.third_parties and OPTIMADE integration" on Jan 25, 2022
@jacksund
Owner Author

Closing this issue, as the contributing guides have been refactored several times. More revisions will follow via #592 as well, so I'm just consolidating issues.
