chore: update lifecycle tags #509

atolopko-czi · 2023-05-30T16:55:32Z

Changes existing Python API methods' lifecycle tags to maturing
Sets experimental package methods' lifecycle tags to experimental
Adds some documentation to experimental public classes and methods, with proper formatting for GitHub Pages output.
Exports public names for the experimental.ml package

Resolves #508

Partially addresses: #500

codecov · 2023-05-30T17:12:00Z

Codecov Report

Merging #509 (9f95b53) into main (e5b59d5) will increase coverage by 0.00%.
The diff coverage is 100.00%.

@@           Coverage Diff           @@
##             main     #509   +/-   ##
=======================================
  Coverage   88.77%   88.78%           
=======================================
  Files          52       53    +1     
  Lines        3198     3200    +2     
=======================================
+ Hits         2839     2841    +2     
  Misses        359      359

Flag	Coverage Δ
unittests	`88.78% <100.00%> (+<0.01%)`	⬆️

Impacted Files	Coverage Δ
...llxgene_census/src/cellxgene_census/_experiment.py	`90.90% <ø> (ø)`
...lxgene_census/src/cellxgene_census/_get_anndata.py	`100.00% <ø> (ø)`
...hon/cellxgene_census/src/cellxgene_census/_open.py	`100.00% <ø> (ø)`
...ne_census/src/cellxgene_census/_presence_matrix.py	`100.00% <ø> (ø)`
..._census/src/cellxgene_census/_release_directory.py	`100.00% <ø> (ø)`
...s/src/cellxgene_census/experimental/ml/__init__.py	`100.00% <100.00%> (ø)`
...us/src/cellxgene_census/experimental/ml/pytorch.py	`87.38% <100.00%> (ø)`

ebezzi

LGTM, suggest removing the empty annotations, although it's more of a nitpick since I don't believe we'll generate docpages for the experimental module yet.

api/python/cellxgene_census/src/cellxgene_census/experimental/ml/pytorch.py

bkmartinjr · 2023-05-30T17:56:50Z

api/python/cellxgene_census/src/cellxgene_census/_experiment.py

@@ -31,7 +31,7 @@ def _get_experiment(census: soma.Collection, organism: str) -> soma.Experiment:
        ValueError: if unable to find the specified organism.

    Lifecycle:
-        Experimental.
+        Maturing.


the style used in SOMA is lower case, no punctuation

E.g.,

Lifecycle: maturing

or

Lifecycle:
    maturing

I prefer the first, as it is much easier to search for. Not sure if it will mess up the online docsite

@ebezzi do you know how each will render?

I can't find any cases of same-line header+content in our docstrings, so conservatively went with the multi-line form for now.

We should use the multi-line. Regardless, this renders as a separate paragraph (screenshot not preserved), so I think the capitalization and dot make it look better.

To add a bit more context, this is the parser we're using for our Google-style docstrings - I believe that without the multiline, the lifecycle section will render as a normal paragraph and not with a section title+paragraph, which will IMHO make the generated docstring page less readable. See an example here.

Useful context! I'm going with a compromise: multi-line, but lowercase and no punc, to make everyone equally slightly unhappy. 😆 I'll let @pablo-gar be the arbiter of docstring fashion upon his return and we can change it in a future PR.
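For reference, the compromise described in this thread (multi-line section, lowercase tag, no punctuation) might look like the sketch below. The function `get_experiment_name` and the helper `lifecycle_tag` are hypothetical names made up for illustration; they are not part of cellxgene_census. The small regex helper only shows that the multi-line form can still be searched for programmatically, which was the concern raised about it.

```python
import inspect
import re
from typing import Optional


def get_experiment_name(organism: str) -> str:
    """Hypothetical example function; not part of the cellxgene_census API.

    Args:
        organism:
            The organism name, e.g. "Homo sapiens".

    Returns:
        A normalized experiment name.

    Lifecycle:
        maturing
    """
    return re.sub(r"\s+", "_", organism.strip()).lower()


# Matches a "Lifecycle:" section header followed by the tag on the next,
# indented line (the multi-line form discussed above).
_LIFECYCLE_RE = re.compile(r"^Lifecycle:\n[ \t]+(\w+)[ \t]*$", re.MULTILINE)


def lifecycle_tag(obj: object) -> Optional[str]:
    """Return the lifecycle tag found in an object's docstring, or None."""
    # inspect.getdoc() strips the common leading indentation, so the
    # section header starts at column 0.
    doc = inspect.getdoc(obj)
    if not doc:
        return None
    match = _LIFECYCLE_RE.search(doc)
    return match.group(1) if match else None
```

Calling `lifecycle_tag(get_experiment_name)` on the function above returns the string `maturing`; an object whose docstring has no Lifecycle section yields `None`.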

bkmartinjr

one style comment but it is a nit....

* Pin tiledb version as work-around (#454) * temporarily pin required tiledb version * revert release.json URL while infra changes are fluid * Incorporate comms changes (#456) * Final edits to gget tutorial (#451) * revert R API release.json URL (#458) the new URL is not yet ready and Python API has already been reverted * symlink the new notebooks * Fix docsite version and disable searchbar (#460) * update release boostrap URL to public permalink (#459) * Remove pre-release from README.md (#462) * Enable anonymous access to S3 bucket (#275) * Enable anonymous access to S3 bucket * R + unit tests * Pin tiledbsoma==1.2.3 * R styler * Update api/python/cellxgene_census/tests/test_open.py Co-authored-by: Andrew Tolopko <[email protected]> * remove R, upgrade pyproject * remove R * add newline * add negative test --------- Co-authored-by: Andrew Tolopko <[email protected]> * Bump static census version in R tests (#472) * [r] update `get_presence_matrix()` and vignette to use zero-based matrix view (#475) * wip * wip * update census_dataset_presence.Rmd * add acceptance test run (#485) * rerun notebooks for census build 2023-05-15 ("stable") Census release (#484) * Updated gget cellxgene tutorial to reflect workflow updates [formatting corrected] (#490) * Created using Colaboratory * Created using Colaboratory * Created using Colaboratory * correct formating --------- Co-authored-by: Laura Luebbert <[email protected]> * Add docsite version number from the library version (#481) * Add docsite version number from the library version * revert odd commit * pin tiledbsoma==1.2.4 (#493) * [docs] Add autosummary (#492) * autosummary * Add module desc + fix links * [R] close census objects (#486) * Using the new stateful open/close in R tiledbsoma -- mainly docs/vignettes/tests, but a few of the helper implementations too. * Refactored `open_soma()` to facilitate sharing/reuse of `SOMATileDBContext`, and do that hroughout the tests. 
* Updated vignettes to reflect recent changes to the Python notebooks. * However, the vignettes are now using too much memory to build in GHA. Still troubleshooting this, but for: disabled building them in GHA in order to un-break our CI. * Fix runs-on to use matrix strategy in py-unittests (#494) * Fix runs-on to use matrix strategy in py-unittests * try ARM64 runner * roll back ARM64 runner * add if * Revert bad commit * Enable anonymous access in R (#471) * Enable anonymous access to S3 bucket * R + unit tests * Pin tiledbsoma==1.2.3 * R styler * Update api/python/cellxgene_census/tests/test_open.py Co-authored-by: Andrew Tolopko <[email protected]> * fix * fix * reset credentials --------- Co-authored-by: Andrew Tolopko <[email protected]> * Add Databricks install instructions to FAQ (#488) * [docs] Fix the Census link in navbar (#491) * [r] use `stable` by default & add alias resolution message (#502) Completes #482 For parity with python #435 Also adapts to several recent breaking API changes in tiledbsoma * PyTorch DataLoader (#499) * Add PyTorch DataLoader support * Introduce this code under a new "experimental" sub-package, with new pytest "experimental" marker for unit tests. * Add initial PyTorch example code for LR model training. Not a notebook yet, but under notebooks dir for now. * chore: update lifecycle tags (#509) * Update lifecycle tags for non-experimental Python API to "maturing" * Update lifecycle tags for experimental Python API to "experimental" * export public names for experimental ml package * bump python api tiledbsoma version (#510) bump tiledbsoma to 1.2.5, which includes updated api doc lifecycle tags * minor clarifications for the pypi.org release process (#512) * cache most R dependencies to speed up r-check CI (#517) #309 -- Cache most R dependencies instead of always building the latest versions of all of them. 
(Then immediately afterwards, still install the latest tiledb & tiledbsoma from r-universe) * [r] Add comp_bio_census_info.Rmd (#407) Also: - update all vignettes to recent tiledbsoma API evolution - temporarily move vignettes into `vignettes_wip/` pending a plan for how to build them outside of GitHub Actions Co-authored-by: Emanuele Bezzi <[email protected]> * fix pytorch multiprocessing result (#516) The first partition of data was being returned from each worker, apparently caused by use of a PyArrow array for passing the set of joinids to each worker's result iterator, possibly due to a bug in TileDB-SOMA. * Update release_process.md (#520) * highly variable gene annotation (#511) * initial implementation of highly_variable_genes * add test marks * add prebuffered iterator * lint * lint * docstrings * reduce expensive tests * fix typo * actually fix typo * add test for get_highly_variable_genes * lint * reduce memory use in tests * add example to docstring * fix anon access in small memory context * PR feedback * loess jitter * increase max loess noise max to 1e-6 * add tests * fix: pytorch unit test hangs (#522) * force use of multiprocessing spawn start method for pytorch * run experimental tests in all envs except 3.7 * Add support for Python 3.11 (#528) * Add support for Python 3.11 * Add 3.11 to test workflow * update notebook guidelines (#534) * update notebook guidelines * Update docs/census_notebook_guidelines.md Co-authored-by: Emanuele Bezzi <[email protected]> * Update docs/census_notebook_guidelines.md * Apply suggestions from code review Co-authored-by: Andrew Tolopko <[email protected]> --------- Co-authored-by: Emanuele Bezzi <[email protected]> Co-authored-by: Andrew Tolopko <[email protected]> * Add docs for experimental modules (#537) * Add modules to python-api.rst * Add README.md * Update docs/python-api.rst Co-authored-by: pablo-gar <[email protected]> * Update docs/python-api.rst Co-authored-by: pablo-gar <[email protected]> --------- 
Co-authored-by: pablo-gar <[email protected]> * demo notebook for HVG experimental API (#536) * reorg files * add HVG notebook * clean up notebook docs symlinks * lint * lint,try 2 * fix Ruff error * fix lint in thread regex * work around upstream lint * update target python version to match target container version * fix typing lint * update and fix config for pre-commit * PR feedback / fixes * Add experimental notebooks to docsite + fix headers (#541) * fix OOM on pytorch unit test (#542) skip failing test on 3.9 only * Add pytorch notebook (#551) * add pytorch notebook; minor pytorch api docstring updates * CR feedback * Update api/python/notebooks/experimental/pytorch.ipynb * Update api/python/notebooks/experimental/pytorch.ipynb * run full notebook --------- Co-authored-by: pablo-gar <[email protected]> Co-authored-by: Pablo E Garcia-Nieto <[email protected]> * fix incorrect pytorch obs soma_joinids (#555) - Use torch.from_numpy() instead of Torch.Tensor() to construct tensors. The latter created with dtype=float32, which caused the bug due to truncation of precision when casting to int64-to-float32-to-int64. In addition to fixing the bug, a data copy is eliminated by using this new method. - Rework the test fixture generation methods to allow for ranges of obs and var soma_joinids that may start at any arbitrary value. Necessary for testing the case that produced this bug. - Add asserts to ease paranoia. * Fix MacOS failing tests (#557) * Fix MacOS failing tests For the time being, we'll try to determine if the issue is specific to a Python version by removing 3.7. 
* try macos13 * comma * try if * exclude * typo --------- Co-authored-by: Bruce Martin <[email protected]> Co-authored-by: pablo-gar <[email protected]> Co-authored-by: Laura Luebbert <[email protected]> Co-authored-by: Andrew Tolopko <[email protected]> Co-authored-by: Emanuele Bezzi <[email protected]> Co-authored-by: Martin Kim <[email protected]> Co-authored-by: Pablo E Garcia-Nieto <[email protected]>

atolopko-czi added 5 commits May 30, 2023 12:52

Update lifecycle tags for non-experimental Python API to "maturing"

72cc867

Update lifecycle tags for experimental Python API to "experimental"

e20d7cd

export public names for experimental ml package

65e70a2

fix copy/paste error

08ac5d4

docstring formatting

7b2865a

atolopko-czi requested review from bkmartinjr and ebezzi May 30, 2023 17:12

ebezzi approved these changes May 30, 2023

api/python/cellxgene_census/src/cellxgene_census/experimental/ml/pytorch.py (outdated)

bkmartinjr reviewed May 30, 2023

fix case/punctuation for lifecycle tags

7273168

bkmartinjr approved these changes May 30, 2023

rm empty docstring section headers

9f95b53

atolopko-czi merged commit 1c264c4 into main May 30, 2023

atolopko-czi deleted the atol/508-update-lifecycle-tags branch May 30, 2023 20:20
