Python API 1.0 acceptance tests are in place and can be executed easily. #185

Closed
8 tasks
pablo-gar opened this issue Feb 16, 2023 · 3 comments · Fixed by #305
Assignees
Labels
P0 Priority 0 - Critical, fix ASAP! python api Related to the API

Comments

@pablo-gar
Contributor

pablo-gar commented Feb 16, 2023

It is left to the discretion of the engineer working on this to assess which of these should also be converted to unit tests. Please reach out to me if you'd like to discuss.

Initial candidate of tests:

  • For both organisms, full obs, var and X can be iterated, e.g. for chunk in ["obs"].read(): ...
  • For both organisms, full obs and var can be loaded into a pandas.DataFrame, e.g. ["obs"].read().concat().to_pandas()
  • Experiment query operations can be done over a large slice of data, e.g. with .ms["RNA"].axis_query(tiledbsoma.AxisQuery(value_filter = "tissue == 'brain'")), then run the main ExperimentQuery operations to completion:
    • for chunk in obs(): ...
    • for chunk in var(): ...
    • for chunk in X().tables(): ...
  • Large AnnDatas can be loaded; document the computing requirements:
    • largest possible in <20 min (@bkmartinjr has made it happen on an EC2 r6id.32xlarge instance): get_anndata(census = census, organism = "Homo sapiens")
    • most common cell type: get_anndata(census = census, organism = "Homo sapiens", obs_value_filter = "cell_type == 'neuron'")
    • most common tissue: get_anndata(census = census, organism = "Homo sapiens", obs_value_filter = "tissue == 'brain'")
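
Most of the candidates above amount to "read a chunked iterator to completion and note how much came back and how long it took". A minimal sketch of that shared step follows. The names `DrainResult`, `drain` and `fake_read` are hypothetical and not part of any library; `fake_read` is a stand-in that yields plain lists so the sketch runs without a census.

```python
import time
from dataclasses import dataclass
from typing import Any, Iterable, Iterator, List


@dataclass
class DrainResult:
    """Outcome of reading one chunked iterator to completion."""
    name: str
    chunks: int
    rows: int
    seconds: float


def drain(name: str, chunks: Iterable[Any]) -> DrainResult:
    """Consume every chunk, counting chunks and rows, and time the whole read."""
    start = time.perf_counter()
    n_chunks = 0
    n_rows = 0
    for chunk in chunks:
        n_chunks += 1
        # Arrow tables expose num_rows; fall back to len() for other chunk types.
        rows = getattr(chunk, "num_rows", None)
        n_rows += len(chunk) if rows is None else rows
    return DrainResult(name, n_chunks, n_rows, time.perf_counter() - start)


def fake_read(total_rows: int, chunk_size: int) -> Iterator[List[int]]:
    """Hypothetical stand-in for a chunked read; yields lists, not Arrow tables."""
    for start in range(0, total_rows, chunk_size):
        yield list(range(start, min(start + chunk_size, total_rows)))


result = drain("obs", fake_read(total_rows=10, chunk_size=4))
print(result.name, result.chunks, result.rows)
```

In a real run the second argument would be one of the iterators listed above (for example the result of a `.read()` call) in place of `fake_read`.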
@pablo-gar pablo-gar changed the title Python API is stress-tested, and bugs are filed Python API 1.0 is stress-tested, and bugs are filed Feb 16, 2023
@pablo-gar pablo-gar added the python api Related to the API label Feb 16, 2023
@pablo-gar pablo-gar changed the title Python API 1.0 is stress-tested, and bugs are filed Python API 1.0 RC is stress-tested, and bugs are filed Mar 12, 2023
@pablo-gar pablo-gar changed the title Python API 1.0 RC is stress-tested, and bugs are filed Python API 1.0 RC is stress-tested. Mar 12, 2023
@pablo-gar pablo-gar added P0 Priority 0 - Critical, fix ASAP! sprint-March27-April7 labels Mar 24, 2023
@atolopko-czi
Collaborator

These are all "acceptance" tests, in fact. Even if not run in an automated fashion, it would seem good to have them all scripted so they can be run on an arbitrarily-sized EC2 instance as a single step.
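
One possible shape for "a single step", sketched with the standard library only: each acceptance check is a plain function registered by a decorator, and one entry point runs them all and reports pass/fail. The names `acceptance_check`, `run_all` and `main` are hypothetical, and the check body is a placeholder rather than a real census read.

```python
import traceback
from typing import Callable, Dict, List, Tuple

# Registry of acceptance checks, keyed by function name.
CHECKS: Dict[str, Callable[[], None]] = {}


def acceptance_check(func: Callable[[], None]) -> Callable[[], None]:
    """Register a function so the runner picks it up."""
    CHECKS[func.__name__] = func
    return func


@acceptance_check
def obs_can_be_iterated() -> None:
    # Placeholder body: a real check would iterate the census obs data here.
    assert sum(1 for _ in range(3)) == 3


def run_all(checks: Dict[str, Callable[[], None]]) -> List[Tuple[str, bool, str]]:
    """Run every check; a failure is recorded and does not stop the others."""
    results = []
    for name, func in checks.items():
        try:
            func()
            results.append((name, True, ""))
        except Exception:
            results.append((name, False, traceback.format_exc()))
    return results


def main() -> int:
    """Print one line per check and return a process exit code."""
    results = run_all(CHECKS)
    for name, passed, _detail in results:
        print(f"{'PASS' if passed else 'FAIL'} {name}")
    return 0 if all(passed for _, passed, _ in results) else 1
```

Calling `sys.exit(main())` from a script would then give a single command to run on whatever instance size is being tested.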

@pablo-gar pablo-gar changed the title Python API 1.0 RC is stress-tested. Python API is stress-tested. Mar 24, 2023
@bkmartinjr
Contributor

Agree. These are not stress tests. For an example issue found by a stress test, see: single-cell-data/TileDB-SOMA#1169. Our census builder/validator is a pretty good stress test all by itself.

@atolopko-czi - if we do automate the above, do you have any recommendations about how we record performance/behavior so that regressions/changes can be tracked?

@atolopko-czi
Collaborator

> @atolopko-czi - if we do automate the above, do you have any recommendations about how we record performance/behavior so that regressions/changes can be tracked?

I'd be fine with committing a log file to the repo. It would contain the relevant context info (instance type, OS, s/w versions for Cell Census, TileDB-SOMA, Python, etc.) and the test results (pass/fail, applicable result size counts, timings).
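
A rough sketch of what one such log record could look like, using only the standard library. The function names, the record layout and the package distribution names (`tiledbsoma`, `cellxgene-census`) are assumptions for illustration. The instance type is passed in by the caller because it cannot be detected portably, and the result entry at the bottom is made-up placeholder data, not a measured value.

```python
import json
import platform
from importlib import metadata
from typing import Any, Dict, List

# Assumed distribution names; adjust to whatever is actually installed.
TRACKED_PACKAGES = ("tiledbsoma", "cellxgene-census")


def package_version(name: str) -> str:
    """Return the installed version of a distribution, or a marker if absent."""
    try:
        return metadata.version(name)
    except metadata.PackageNotFoundError:
        return "not installed"


def build_log_record(instance_type: str, results: List[Dict[str, Any]]) -> Dict[str, Any]:
    """Bundle run context and test results into one JSON-serializable dict."""
    return {
        "context": {
            "instance_type": instance_type,
            "os": platform.platform(),
            "python": platform.python_version(),
            "packages": {name: package_version(name) for name in TRACKED_PACKAGES},
        },
        "results": results,
    }


# Placeholder result values, for illustration only.
record = build_log_record(
    instance_type="r6id.32xlarge",
    results=[{"test": "obs_iteration", "passed": True, "rows": 10, "seconds": 0.5}],
)
print(json.dumps(record, indent=2, sort_keys=True))
```

Writing the record with sorted keys keeps diffs between committed log files small, which should make regressions easier to spot in review.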

We could potentially wrap this all up in a notebook that gets run at the same time as the pedagogical notebooks, but honestly I'm not sure that's a great fit.

@pablo-gar pablo-gar changed the title Python API is stress-tested. Python API ~~is stress-tested~~. Mar 27, 2023
@pablo-gar pablo-gar changed the title Python API ~~is stress-tested~~. Python API acceptance tests are in place and can be executed easily. Mar 27, 2023
@pablo-gar pablo-gar changed the title Python API acceptance tests are in place and can be executed easily. Python API 1.0 acceptance tests are in place and can be executed easily. Mar 27, 2023

3 participants