Add pytorch notebook #551

atolopko-czi · 2023-06-16T17:02:58Z

Closes #476

add pytorch notebook
minor pytorch api docstring updates

api/python/notebooks/experimental/pytorch.ipynb

bkmartinjr

Looks completely sufficient to release IMHO. A couple of tiny nits noted.

codecov · 2023-06-16T18:17:47Z

Codecov Report

Merging #551 (b546e81) into main (aff8dbd) will not change coverage.
The diff coverage is n/a.

❗ Current head b546e81 differs from pull request most recent head c1667fd. Consider uploading reports for the commit c1667fd to get more accurate results

@@           Coverage Diff           @@
##             main     #551   +/-   ##
=======================================
  Coverage   87.67%   87.67%           
=======================================
  Files          59       59           
  Lines        3618     3618           
=======================================
  Hits         3172     3172           
  Misses        446      446
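
As a quick sanity check on the table above, the coverage percentage can be reproduced from the hit and miss counts. This is only an illustrative sketch; the variable names are chosen for this example and are not part of Codecov's tooling.

```python
# Counts copied from the Codecov table above.
hits, misses = 3172, 446

# Total tracked lines is hits plus misses.
lines = hits + misses

# Coverage is the share of tracked lines that were hit.
coverage = 100 * hits / lines

print(f"{lines} lines, {coverage:.2f}% covered")
```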

Flag	Coverage Δ
unittests	`87.67% <ø> (ø)`

Impacted Files	Coverage Δ
...us/src/cellxgene_census/experimental/ml/pytorch.py	`85.96% <ø> (ø)`

pablo-gar

Needs to be re-run; the last few cells were not executed.

api/python/notebooks/experimental/pytorch.ipynb

* Pin tiledb version as work-around (#454)
* temporarily pin required tiledb version
* revert release.json URL while infra changes are fluid
* Incorporate comms changes (#456)
* Final edits to gget tutorial (#451)
* revert R API release.json URL (#458): the new URL is not yet ready and Python API has already been reverted
* symlink the new notebooks
* Fix docsite version and disable searchbar (#460)
* update release bootstrap URL to public permalink (#459)
* Remove pre-release from README.md (#462)
* Enable anonymous access to S3 bucket (#275)
* Enable anonymous access to S3 bucket
* R + unit tests
* Pin tiledbsoma==1.2.3
* R styler
* Update api/python/cellxgene_census/tests/test_open.py
  Co-authored-by: Andrew Tolopko
* remove R, upgrade pyproject
* remove R
* add newline
* add negative test
  Co-authored-by: Andrew Tolopko
* Bump static census version in R tests (#472)
* [r] update `get_presence_matrix()` and vignette to use zero-based matrix view (#475)
* wip
* wip
* update census_dataset_presence.Rmd
* add acceptance test run (#485)
* rerun notebooks for census build 2023-05-15 ("stable") Census release (#484)
* Updated gget cellxgene tutorial to reflect workflow updates [formatting corrected] (#490)
* Created using Colaboratory
* Created using Colaboratory
* Created using Colaboratory
* correct formatting
  Co-authored-by: Laura Luebbert
* Add docsite version number from the library version (#481)
* Add docsite version number from the library version
* revert odd commit
* pin tiledbsoma==1.2.4 (#493)
* [docs] Add autosummary (#492)
* autosummary
* Add module desc + fix links
* [R] close census objects (#486)
* Using the new stateful open/close in R tiledbsoma -- mainly docs/vignettes/tests, but a few of the helper implementations too.
* Refactored `open_soma()` to facilitate sharing/reuse of `SOMATileDBContext`, and do that throughout the tests.
* Updated vignettes to reflect recent changes to the Python notebooks.
* However, the vignettes are now using too much memory to build in GHA. Still troubleshooting this, but for now, disabled building them in GHA in order to un-break our CI.
* Fix runs-on to use matrix strategy in py-unittests (#494)
* Fix runs-on to use matrix strategy in py-unittests
* try ARM64 runner
* roll back ARM64 runner
* add if
* Revert bad commit
* Enable anonymous access in R (#471)
* Enable anonymous access to S3 bucket
* R + unit tests
* Pin tiledbsoma==1.2.3
* R styler
* Update api/python/cellxgene_census/tests/test_open.py
  Co-authored-by: Andrew Tolopko
* fix
* fix
* reset credentials
  Co-authored-by: Andrew Tolopko
* Add Databricks install instructions to FAQ (#488)
* [docs] Fix the Census link in navbar (#491)
* [r] use `stable` by default & add alias resolution message (#502). Completes #482. For parity with Python #435. Also adapts to several recent breaking API changes in tiledbsoma.
* PyTorch DataLoader (#499)
* Add PyTorch DataLoader support
* Introduce this code under a new "experimental" sub-package, with new pytest "experimental" marker for unit tests.
* Add initial PyTorch example code for LR model training. Not a notebook yet, but under notebooks dir for now.
* chore: update lifecycle tags (#509)
* Update lifecycle tags for non-experimental Python API to "maturing"
* Update lifecycle tags for experimental Python API to "experimental"
* export public names for experimental ml package
* bump python api tiledbsoma version (#510): bump tiledbsoma to 1.2.5, which includes updated api doc lifecycle tags
* minor clarifications for the pypi.org release process (#512)
* cache most R dependencies to speed up r-check CI (#517). #309 -- Cache most R dependencies instead of always building the latest versions of all of them. (Then immediately afterwards, still install the latest tiledb & tiledbsoma from r-universe.)
* [r] Add comp_bio_census_info.Rmd (#407). Also:
  - update all vignettes to recent tiledbsoma API evolution
  - temporarily move vignettes into `vignettes_wip/` pending a plan for how to build them outside of GitHub Actions
  Co-authored-by: Emanuele Bezzi
* fix pytorch multiprocessing result (#516). The first partition of data was being returned from each worker, apparently caused by use of a PyArrow array for passing the set of joinids to each worker's result iterator, possibly due to a bug in TileDB-SOMA.
* Update release_process.md (#520)
* highly variable gene annotation (#511)
* initial implementation of highly_variable_genes
* add test marks
* add prebuffered iterator
* lint
* lint
* docstrings
* reduce expensive tests
* fix typo
* actually fix typo
* add test for get_highly_variable_genes
* lint
* reduce memory use in tests
* add example to docstring
* fix anon access in small memory context
* PR feedback
* loess jitter
* increase max loess noise max to 1e-6
* add tests
* fix: pytorch unit test hangs (#522)
* force use of multiprocessing spawn start method for pytorch
* run experimental tests in all envs except 3.7
* Add support for Python 3.11 (#528)
* Add support for Python 3.11
* Add 3.11 to test workflow
* update notebook guidelines (#534)
* update notebook guidelines
* Update docs/census_notebook_guidelines.md
  Co-authored-by: Emanuele Bezzi
* Update docs/census_notebook_guidelines.md
* Apply suggestions from code review
  Co-authored-by: Andrew Tolopko
  Co-authored-by: Emanuele Bezzi
  Co-authored-by: Andrew Tolopko
* Add docs for experimental modules (#537)
* Add modules to python-api.rst
* Add README.md
* Update docs/python-api.rst
  Co-authored-by: pablo-gar
* Update docs/python-api.rst
  Co-authored-by: pablo-gar
  Co-authored-by: pablo-gar
* demo notebook for HVG experimental API (#536)
* reorg files
* add HVG notebook
* clean up notebook docs symlinks
* lint
* lint,try 2
* fix Ruff error
* fix lint in thread regex
* work around upstream lint
* update target python version to match target container version
* fix typing lint
* update and fix config for pre-commit
* PR feedback / fixes
* Add experimental notebooks to docsite + fix headers (#541)
* fix OOM on pytorch unit test (#542): skip failing test on 3.9 only
* Add pytorch notebook (#551)
* add pytorch notebook; minor pytorch api docstring updates
* CR feedback
* Update api/python/notebooks/experimental/pytorch.ipynb
* Update api/python/notebooks/experimental/pytorch.ipynb
* run full notebook
  Co-authored-by: pablo-gar
  Co-authored-by: Pablo E Garcia-Nieto
* fix incorrect pytorch obs soma_joinids (#555)
  - Use torch.from_numpy() instead of torch.Tensor() to construct tensors. The latter created tensors with dtype=float32, which caused the bug due to truncation of precision when casting int64 to float32 to int64. In addition to fixing the bug, a data copy is eliminated by using this new method.
  - Rework the test fixture generation methods to allow for ranges of obs and var soma_joinids that may start at any arbitrary value. Necessary for testing the case that produced this bug.
  - Add asserts to ease paranoia.
* Fix MacOS failing tests (#557)
* Fix MacOS failing tests. For the time being, we'll try to determine if the issue is specific to a Python version by removing 3.7.
* try macos13
* comma
* try if
* exclude
* typo

Co-authored-by: Bruce Martin
Co-authored-by: pablo-gar
Co-authored-by: Laura Luebbert
Co-authored-by: Andrew Tolopko
Co-authored-by: Emanuele Bezzi
Co-authored-by: Martin Kim
Co-authored-by: Pablo E Garcia-Nieto
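
To illustrate the precision loss behind the #555 fix mentioned above, here is a minimal sketch. It does not use PyTorch; as an assumption made for this example, it emulates the int64-to-float32-to-int64 round trip with the standard library's `struct` module, and the helper name `roundtrip_via_float32` is hypothetical. A float32 carries a 24-bit significand, so not every integer above 2**24 survives the round trip.

```python
import struct


def roundtrip_via_float32(joinid: int) -> int:
    """Emulate casting an integer to float32 and back to an integer."""
    # Packing with format "f" rounds the value to the nearest float32.
    (as_float32,) = struct.unpack("f", struct.pack("f", float(joinid)))
    return int(as_float32)


small_joinid = 1_000        # well within float32's exact integer range
large_joinid = 2**24 + 1    # smallest positive integer float32 cannot represent

print(small_joinid, "->", roundtrip_via_float32(small_joinid))
print(large_joinid, "->", roundtrip_via_float32(large_joinid))
```

Small IDs come back unchanged, while the larger one comes back as a different integer. Per the commit message, `torch.from_numpy()` avoids this because it does not go through float32.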

add pytorch notebook; minor pytorch api docstring updates

9610d29

atolopko-czi self-assigned this Jun 16, 2023

atolopko-czi requested review from pablo-gar, bkmartinjr and ebezzi June 16, 2023 17:21

bkmartinjr reviewed Jun 16, 2023

api/python/notebooks/experimental/pytorch.ipynb (resolved)

bkmartinjr reviewed Jun 16, 2023

api/python/notebooks/experimental/pytorch.ipynb (outdated, resolved)

bkmartinjr approved these changes Jun 16, 2023

atolopko-czi added 2 commits June 16, 2023 13:52

Merge branch 'main' of github.com:chanzuckerberg/cellxgene-census into atol/476-pytorch-notebook

e0ea511

CR feedback

0a2816f

atolopko-czi changed the title ~~add pytorch notebook~~ Add pytorch notebook Jun 16, 2023

pablo-gar approved these changes Jun 16, 2023

api/python/notebooks/experimental/pytorch.ipynb (outdated, resolved)

api/python/notebooks/experimental/pytorch.ipynb (outdated, resolved)

pablo-gar and others added 3 commits June 16, 2023 15:36

Update api/python/notebooks/experimental/pytorch.ipynb

57053f7

Update api/python/notebooks/experimental/pytorch.ipynb

b546e81

run full notebook

c1667fd

bkmartinjr merged commit ea69b0c into main Jun 17, 2023

bkmartinjr deleted the atol/476-pytorch-notebook branch June 17, 2023 00:14

atolopko-czi mentioned this pull request Jun 20, 2023

remove pytorch_lr_classifier.py #556

Merged
