[Feature]: Replace docval with better strict type and shape validation system with type hints #1129

Open
rly opened this issue Jun 14, 2024 · 4 comments
Labels
category: proposal proposed enhancements or new features

@rly
Contributor

rly commented Jun 14, 2024

What would you like to see added to HDMF?

I realized we have discussed this many times in the past year but we do not have an issue for it yet. This came up again today during the NWB Data Conversion Workshop.

PyNWB/HDMF uses docval, which was developed before Python officially supported type hints and before Pydantic and other strict type checkers became popular. docval is now incompatible with tools that display type hints, such as hover tooltips in Jupyter and auto-complete in IDEs. It is also cumbersome to learn for new developers or anyone browsing the source code. To improve the usability and maintainability of the NWB and HDMF APIs, I suggest we replace docval with a more modern strict type-checking and documentation system. This will be tedious but worth it in the end.

docval is used for documentation, type checking, and shape checking. It is also used by code that inspects other classes, such as the class generator and neuroconv's code that gets a JSON schema from a class's constructor docval. The validator may also use docval args. We have hooks that allow you to create docval aliases like "array_data" that can be dynamically updated, e.g., in HDMF Zarr. We need to be careful when replacing docval to ensure we do not alter or lose significant functionality.
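To illustrate the introspection use case, here is a minimal standard-library sketch of deriving a JSON-schema-like dict from a constructor's type hints rather than from its docval. The class `TimeSeriesLike`, the function `constructor_schema`, and the `_JSON_TYPES` mapping are hypothetical names made up for this example; this is not how neuroconv or HDMF currently does it.

```python
import inspect
from typing import Optional, Union, get_args, get_origin, get_type_hints

# Hypothetical mapping from Python types to JSON Schema type names.
_JSON_TYPES = {str: "string", int: "integer", float: "number", bool: "boolean"}


class TimeSeriesLike:
    """Hypothetical stand-in for a container class; not an HDMF or PyNWB class."""

    def __init__(self, name: str, rate: float, description: Optional[str] = None):
        self.name = name
        self.rate = rate
        self.description = description


def constructor_schema(cls):
    """Build a small JSON-Schema-like dict from a constructor's type hints."""
    hints = get_type_hints(cls.__init__)
    properties, required = {}, []
    for pname, param in inspect.signature(cls.__init__).parameters.items():
        if pname == "self":
            continue
        hint = hints[pname]
        # Unwrap Optional[X] (that is, Union[X, None]) to X.
        if get_origin(hint) is Union:
            hint = next(a for a in get_args(hint) if a is not type(None))
        properties[pname] = {"type": _JSON_TYPES[hint]}
        # Parameters without a default are treated as required.
        if param.default is inspect.Parameter.empty:
            required.append(pname)
    return {"type": "object", "properties": properties, "required": required}


schema = constructor_schema(TimeSeriesLike)
```

A real replacement would also have to cover shapes, docstrings, and dynamically updated aliases such as "array_data", which this sketch does not attempt.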

What solution would you like?

Replace docval with another system like pydantic in strict mode, beartype, or numpydantic. We need to research the options. Pydantic is widely used and plays nicely with JSON schema, which will be useful in potential long-term integration with LinkML. beartype appears to be quite fast. I think neither plays nicely with numpy arrays, so we may need to use something like numpydantic.
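For a rough idea of what strict checking driven by ordinary type hints looks like, here is a standard-library-only sketch. It is a toy stand-in for what libraries such as pydantic or beartype provide, not their API; the decorator `strict_types` and the function `make_series` are hypothetical names, and the check handles only plain classes (no generics, unions, or array shapes).

```python
import functools
import inspect
from typing import get_type_hints


def strict_types(func):
    """Reject calls whose arguments do not match the function's type hints.

    Illustrative only: handles plain classes, not generics such as list[int].
    """
    hints = get_type_hints(func)
    sig = inspect.signature(func)

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        bound = sig.bind(*args, **kwargs)
        for pname, value in bound.arguments.items():
            expected = hints.get(pname)
            # Strict mode: no coercion, so "30000" is never accepted for a float.
            if expected is not None and not isinstance(value, expected):
                raise TypeError(
                    f"{pname}: expected {expected.__name__}, got {type(value).__name__}"
                )
        return func(*args, **kwargs)

    return wrapper


@strict_types
def make_series(name: str, rate: float) -> dict:
    """Hypothetical constructor-like function with ordinary type hints."""
    return {"name": name, "rate": rate}


ok = make_series("lfp", 30000.0)

try:
    make_series("lfp", "30000")
    error = None
except TypeError as exc:
    error = str(exc)
```

Because the annotations live in the function signature and `functools.wraps` preserves it, tools that read type hints can still see `name: str, rate: float` on the decorated function.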

Do you have any interest in helping implement the feature?

Yes.

@rly rly added the category: proposal proposed enhancements or new features label Jun 14, 2024
@rly rly added this to the Next Major Release - 4.0 milestone Jun 14, 2024
@h-mayorquin
Contributor

As related background information, this has been an ongoing concern in numpy for a while:
numpy/numpy#16544

@mavaylon1
Contributor

@rly is this something we can start in August?

@rly rly changed the title from "[Feature]: Replace docval with better strict type and shape validation system" to "[Feature]: Replace docval with better strict type and shape validation system with type hints" Jun 27, 2024
@rly
Contributor Author

rly commented Jun 29, 2024

Yes, as discussed in person, let's target August/September to start working on this together.

@sneakers-the-rat

related to:
#994
NeurodataWithoutBorders/pynwb#1408 (comment)

i'd be down to help with this <3


5 participants