More declarative approach to compliance checks #1081

Open
benjwadams opened this issue May 22, 2024 · 2 comments

Comments

@benjwadams
Contributor

Compliance Checker currently has a very imperative code style. For checks like those implemented for CF, the conformance docs (e.g. https://cfconventions.org/Data/cf-documents/requirements-recommendations/conformance-1.11.html) form a set of steps to check, which would lend themselves to a more declarative code style.

We have comments in the code indicating where part of the code implements a CF conformance test, such as in https://github.com/ioos/compliance-checker/blob/develop/compliance_checker/cf/cf_1_6.py#L1878-L1988, as well as in the unit tests.

However, this isn't enforced anywhere, and we don't directly check against the conformance spec. Any suggestions for how we can improve the composability of the codebase, as well as its testability against the points in the conformance docs? I've been experimenting recently with pytest-bdd and think something similar, where each step is checked, would be good. However, certain steps depend on others, which could be handled through BDD features with multiple possible scenarios.

@benjwadams
Contributor Author

I think we should move towards a DAG approach, declaring each section of the conformance as a separate step.
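To make the DAG idea concrete, here is a minimal sketch using only the standard library's `graphlib` (Python 3.9+). The step names, check functions, and the dict-shaped dataset are all hypothetical stand-ins; none of them come from compliance-checker or from the CF conformance document.

```python
from graphlib import TopologicalSorter

# Hypothetical checks operating on a plain dict that stands in for a dataset.
def check_variable_names(ds):
    return all(name.isidentifier() for name in ds["variables"])

def check_units_present(ds):
    return all("units" in attrs for attrs in ds["variables"].values())

def check_units_valid(ds):
    # Only safe to run once every variable is known to have a units attribute.
    known = {"K", "m", "s", "degrees_north", "degrees_east"}
    return all(attrs["units"] in known for attrs in ds["variables"].values())

# step name -> (check function, names of the steps it depends on)
STEPS = {
    "names": (check_variable_names, set()),
    "units_present": (check_units_present, {"names"}),
    "units_valid": (check_units_valid, {"units_present"}),
}

def run_steps(ds, steps=STEPS):
    """Run steps in dependency order; skip a step unless all prerequisites passed."""
    graph = {name: deps for name, (_, deps) in steps.items()}
    results = {}
    for name in TopologicalSorter(graph).static_order():
        func, deps = steps[name]
        if all(results.get(dep) == "pass" for dep in deps):
            results[name] = "pass" if func(ds) else "fail"
        else:
            results[name] = "skipped"
    return results
```

Calling `run_steps` on a dataset where one variable lacks `units` marks `units_present` as failed and `units_valid` as skipped, rather than letting the later check raise a `KeyError`.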

Primary libraries under consideration are Dask and Airflow.

Airflow seems more geared towards enterprise ETL/data science workflows.

I've used Dask in the past for some QARTOD runs, but I want to abstract away the explicit declaration of the graph.
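One possible way to avoid declaring the graph by hand, whichever library ends up executing it, is to infer the edges from function signatures, much as pytest resolves fixtures by parameter name. The sketch below is plain standard-library Python, not Dask, and every name in it (`step`, `evaluate`, the two checks, the dict-shaped dataset) is hypothetical.

```python
import inspect

CHECKS = {}  # step name -> check function

def step(func):
    """Register a check; parameters other than `ds` name the steps it needs."""
    CHECKS[func.__name__] = func
    return func

@step
def has_time_axis(ds):
    return "time" in ds["dimensions"]

@step
def time_is_monotonic(ds, has_time_axis):
    if not has_time_axis:
        return None  # prerequisite not met, so there is nothing to evaluate
    values = ds["time_values"]
    return all(a < b for a, b in zip(values, values[1:]))

def dependencies(name):
    """Read a step's dependencies off its signature instead of a declared graph."""
    params = inspect.signature(CHECKS[name]).parameters
    return [p for p in params if p != "ds"]

def evaluate(name, ds, cache=None):
    """Resolve `name` and, recursively, every step named in its signature."""
    cache = {} if cache is None else cache
    if name not in cache:
        kwargs = {dep: evaluate(dep, ds, cache) for dep in dependencies(name)}
        cache[name] = CHECKS[name](ds, **kwargs)
    return cache[name]
```

This sketch does no cycle detection, so a real version would need to guard against steps that depend on each other.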

@jcermauwedu

Arjan does a deep dive on Python decorators, which might be a more upfront way to do what I think you described for pytest fixtures. See: https://www.youtube.com/watch?v=QH5fw9kxDQA

He goes pretty quickly, showing how to nest classes and then functions that run in the order you want. As the tests proceed, I don't know if there is a bookkeeping-type way to keep blocks from retesting the same parts of a dataset with the same rules. Are we now talking about being able to compute the test coverage of a dataset against a specification, as opposed to the code coverage of a software package?
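As a rough illustration of that bookkeeping idea, a decorator could tie each check to a requirement ID, a ledger could record which (check, variable) pairs have already been evaluated, and "specification coverage" could be the share of requirement IDs with at least one registered check. The requirement IDs, check functions, and data layout below are all made up for the sketch.

```python
from collections import defaultdict

REQUIREMENTS = {"REQ-A", "REQ-B", "REQ-C"}  # hypothetical requirement IDs
_registry = defaultdict(list)  # requirement ID -> check functions

def implements(req_id):
    """Decorator tying a check function to one requirement ID."""
    def wrap(func):
        _registry[req_id].append(func)
        return func
    return wrap

@implements("REQ-A")
def units_is_string(name, attrs):
    return isinstance(attrs.get("units", ""), str)

@implements("REQ-B")
def standard_name_has_no_spaces(name, attrs):
    return " " not in attrs.get("standard_name", "")

def run(variables, seen=None):
    """Run each registered check once per variable, reusing results in `seen`.

    Returns the ledger and the number of checks actually executed this call.
    """
    seen = {} if seen is None else seen
    calls = 0
    for funcs in _registry.values():
        for func in funcs:
            for name, attrs in variables.items():
                key = (func.__name__, name)
                if key not in seen:
                    seen[key] = func(name, attrs)
                    calls += 1
    return seen, calls

def spec_coverage():
    """Fraction of declared requirements that have at least one registered check."""
    return len(REQUIREMENTS & set(_registry)) / len(REQUIREMENTS)
```

Passing the returned ledger back into a second `run` call on the same variables executes nothing new, and `spec_coverage()` reports two of the three made-up requirements as covered.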
