Cleanup normalize_total #1667

Merged
merged 6 commits into from
Mar 5, 2021

@ivirshup (Member) commented Feb 19, 2021

I was looking over normalize_total and saw some strange behaviour. Since it's such a common function, I think it's important that it follows standard scanpy behaviour. To this end, this PR cleans up its code.

Addition

layer argument. A specific layer can now be normalized by itself.
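To make the idea concrete, here is a minimal sketch of what "normalizing a specific layer by itself" means. It is not scanpy's implementation: a plain dict stands in for an AnnData object, and `normalize_rows` and `normalize_layer` are hypothetical names chosen for illustration. In scanpy itself the equivalent call would presumably pass the new `layer` keyword to `sc.pp.normalize_total`.

```python
def normalize_rows(matrix, target_sum):
    """Return a new matrix whose rows each sum to target_sum (all-zero rows stay zero)."""
    out = []
    for row in matrix:
        total = sum(row)
        scale = target_sum / total if total > 0 else 0.0
        out.append([value * scale for value in row])
    return out


def normalize_layer(data, target_sum, layer=None):
    """Normalize data["X"], or only data["layers"][layer] when a layer is named."""
    if layer is None:
        data["X"] = normalize_rows(data["X"], target_sum)
    else:
        data["layers"][layer] = normalize_rows(data["layers"][layer], target_sum)
    return data


# Toy stand-in for an AnnData object: a main matrix plus one named layer.
data = {
    "X": [[1.0, 3.0], [2.0, 2.0]],
    "layers": {"spliced": [[1.0, 1.0], [0.0, 4.0]]},
}

# Only the "spliced" layer is rescaled; the main matrix "X" is left untouched.
normalize_layer(data, target_sum=10.0, layer="spliced")
```

The point of the design is that one call touches exactly one matrix, so it is always clear which data were modified.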

Deprecations

I've deprecated the layers and layer_norm arguments. Normalizing multiple layers at once seems less useful than normalizing a specific layer. These seem like very specific use cases that are easy for users to implement themselves, and they are not common patterns in scanpy functions.
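As a rough illustration of how little code the "implement it yourself" route needs, the sketch below loops over layer names and normalizes each one with a single call. The helper `scale_rows` and the dict of layers are assumptions made for this example; they are not part of scanpy.

```python
def scale_rows(matrix, target_sum):
    """Return a copy of matrix with every nonzero row rescaled to sum to target_sum."""
    return [
        [v * target_sum / sum(row) for v in row] if sum(row) > 0 else list(row)
        for row in matrix
    ]


# Hypothetical layers, each a small cells-by-genes count matrix.
layers = {
    "spliced": [[1.0, 1.0], [0.0, 4.0]],
    "unspliced": [[2.0, 6.0], [5.0, 5.0]],
}

# One explicit call per layer replaces the "normalize several layers at once" option.
for name in ["spliced", "unspliced"]:
    layers[name] = scale_rows(layers[name], target_sum=8.0)
```

With the real function, the loop body would be one `normalize_total` call per layer name.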

TODO:

  • Tests for deprecations
  • Scheduling of deprecations (deprecate in 1.8, remove in 1.9)
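
A deprecation test of the kind listed above could look roughly like the following sketch. The `normalize` function, its warning text, and the choice of `FutureWarning` are all assumptions made for illustration, not the code merged in this PR; only the standard-library `warnings` module is used.

```python
import warnings


def normalize(target_sum, layer=None, layers=None):
    """Hypothetical signature: returns the names of the layers it would process."""
    if layers is not None:
        # Old keyword still works, but the caller is told to migrate.
        warnings.warn(
            "`layers` is deprecated; pass a single name via `layer` instead.",
            FutureWarning,
            stacklevel=2,
        )
        return list(layers)
    return [layer]


def check_layers_is_deprecated():
    """Passing the old `layers` keyword should emit exactly one FutureWarning."""
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")  # make sure the warning is not suppressed
        normalize(1e4, layers=["spliced"])
    assert len(caught) == 1
    assert issubclass(caught[0].category, FutureWarning)
    return True


def check_layer_is_silent():
    """The new `layer` keyword should not trigger any warning."""
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")
        normalize(1e4, layer="spliced")
    assert caught == []
    return True
```

In a pytest suite the same check is usually written with the `pytest.warns` context manager instead of `warnings.catch_warnings`.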

@ivirshup ivirshup added this to the 1.8.0 milestone Mar 3, 2021
@ivirshup ivirshup marked this pull request as ready for review March 3, 2021 04:11
@ivirshup ivirshup merged commit 976aa18 into scverse:master Mar 5, 2021
jlause added a commit to jlause/scanpy that referenced this pull request Mar 5, 2021
jlause added a commit to jlause/scanpy that referenced this pull request Mar 10, 2021
Zethson pushed a commit that referenced this pull request Mar 15, 2021
* Cleanup normalize_total

* Add modification tests and copy kwarg for normalize_total

* Test that 'layers' argument is deprecated

* Added more mutation checks for normalize_total

* release note

* Error message
ivirshup added a commit that referenced this pull request Mar 18, 2021
* add flake8 pre-commit

Signed-off-by: Zethson <[email protected]>

* fix pre-commit

Signed-off-by: Zethson <[email protected]>

* add E402 to flake8 ignore

Signed-off-by: Zethson <[email protected]>

* revert neighbors

Signed-off-by: Zethson <[email protected]>

* fix flake8

Signed-off-by: Zethson <[email protected]>

* address review

Signed-off-by: Zethson <[email protected]>

* fix comment character in .flake8

Signed-off-by: Zethson <[email protected]>

* fix test

Signed-off-by: Zethson <[email protected]>

* black

Signed-off-by: Zethson <[email protected]>

* review round 2

Signed-off-by: Zethson <[email protected]>

* review round 3

Signed-off-by: Zethson <[email protected]>

* readded double comments

Signed-off-by: Zethson <[email protected]>

* Ignoring E262 & reverted comment

Signed-off-by: Zethson <[email protected]>

* using self for obs_tidy

Signed-off-by: Zethson <[email protected]>

* Restore setup.py

* rm call of black test (#1690)

* Fix print_versions for python<3.8 (#1691)

* add codecov so we can have a badge to point to (#1693)

* Attempt server-side search (#1672)

* Fix paga_path (#1047)

Fix paga_path

Co-authored-by: Isaac Virshup <[email protected]>

* Switch to flit

This reverts commit d645790

* add setup.py while leaving it ignored

* Update install instructions

* Circumvent new pip check (see pypa/pip#9628)

* Go back to regular pip (#1702)

* Go back to regular flit

Co-authored-by: Isaac Virshup <[email protected]>

* codecov comment (#1704)

* Use joblib for parallelism in regress_out (#1695)

* Use joblib for parallelism in regress_out

* release note

* fix link in release notes

* Add todo for resource test

* Add sparsification step before sparse-dependent Scrublet calls (#1707)

* Add sparsification step before sparse-dependent Scrublet calls

* Apply sparsification suggestion

Co-authored-by: Isaac Virshup <[email protected]>

* Fix imports

Co-authored-by: Isaac Virshup <[email protected]>

* Fix version on Travis (#1713)

By default, Travis does `git clone --depth=50` which means the version can’t be detected from the git tag.

* `sc.metrics` module (add confusion matrix & Geary's C methods) (#915)

* Add `sc.metrics` with `gearys_c`

Add a module for computing useful metrics. Started off with Geary's C since I'm using it and finding it useful. I've also got a fairly fast way to calculate it worked out.

Unfortunately, my implementation runs into some issues with some global configs set by umap (see lmcinnes/umap#306), so I'm going to see if that can be resolved before changing it.

* Add sc.metrics.confusion_matrix

* Better tests and output for confusion_matrix

* Workaround umap<0.4 and increase numerical stability of gearys_c

* Work around lmcinnes/umap#306 by not
  calling out to kernel function. That code has been kept, but commented
  out.
* Increase numerical stability by casting data to system width. Tests
  were failing due to instability.

* Split up gearys_c tests

* Improved unexpected error message

* gearys_c working again. Sadly, a bit slower

* One option for doc strings

* Simplify implementation to use single dispatch

* release notes

* Fix clipped images in docs (#1717)

* Cleanup normalize_total (#1667)

* Cleanup normalize_total

* Add modification tests and copy kwarg for normalize_total

* Test that 'layers' argument is deprecated

* Added more mutation checks for normalize_total

* release note

* Error message

* deprecate scvi (#1703)

* deprecate scvi

* Update .azure-pipelines.yml

Co-authored-by: Isaac Virshup <[email protected]>

* remove :func: links to scvi in release notes

* remove tildes in front of scvi in release notes

* Update docs/release-notes/1.5.0.rst

Co-authored-by: Michael Jayasuriya <[email protected]>
Co-authored-by: Isaac Virshup <[email protected]>

* updated ecosystem.rst to add triku (#1722)

* Minor addition to contributing docs (#1726)

* Preserve category order when groupby is a list (#1735)

Preserve category order when groupby is a list

* Asymmetrical diverging colormaps and vcenter (#1551)

Add vcenter and norm arguments to plotting functions

* add flake8 pre-commit

Signed-off-by: Zethson <[email protected]>

* add E402 to flake8 ignore

Signed-off-by: Zethson <[email protected]>

* revert neighbors

Signed-off-by: Zethson <[email protected]>

* address review

Signed-off-by: Zethson <[email protected]>

* black

Signed-off-by: Zethson <[email protected]>

* using self for obs_tidy

Signed-off-by: Zethson <[email protected]>

* rebased

Signed-off-by: Zethson <[email protected]>

* rebasing

Signed-off-by: Zethson <[email protected]>

* rebasing

Signed-off-by: Zethson <[email protected]>

* rebasing

Signed-off-by: Zethson <[email protected]>

* add flake8 to dev docs

Signed-off-by: Zethson <[email protected]>

* add autopep8 to pre-commits

Signed-off-by: Zethson <[email protected]>

* add flake8 ignore docs

Signed-off-by: Zethson <[email protected]>

* add exception todos

Signed-off-by: Zethson <[email protected]>

* add ignore directories

Signed-off-by: Zethson <[email protected]>

* reinstated lambdas

Signed-off-by: Zethson <[email protected]>

* fix tests

Signed-off-by: Zethson <[email protected]>

* fix tests

Signed-off-by: Zethson <[email protected]>

* fix tests

Signed-off-by: Zethson <[email protected]>

* fix tests

Signed-off-by: Zethson <[email protected]>

* fix tests

Signed-off-by: Zethson <[email protected]>

* Add E741 to allowed flake8 violations.

Co-authored-by: Isaac Virshup <[email protected]>

* Add F811 flake8 ignore for tests

Co-authored-by: Isaac Virshup <[email protected]>

* Fix mask comparison

Co-authored-by: Isaac Virshup <[email protected]>

* Fix mask comparison

Co-authored-by: Isaac Virshup <[email protected]>

* fix flake8 config file

Signed-off-by: Zethson <[email protected]>

* readded autopep8

Signed-off-by: Zethson <[email protected]>

* import Literal

Signed-off-by: Zethson <[email protected]>

* revert literal import

Signed-off-by: Zethson <[email protected]>

* fix scatterplot pca import

Signed-off-by: Zethson <[email protected]>

* false comparison & unused vars

Signed-off-by: Zethson <[email protected]>

* Add cleaner level determination

Co-authored-by: Isaac Virshup <[email protected]>

* Fix comment formatting

Co-authored-by: Isaac Virshup <[email protected]>

* Add smoother dev documentation

Co-authored-by: Isaac Virshup <[email protected]>

* fix flake8

Signed-off-by: Zethson <[email protected]>

* Readd long comment

Co-authored-by: Isaac Virshup <[email protected]>

* Assuming X as array like

Co-authored-by: Isaac Virshup <[email protected]>

* fix flake8

Signed-off-by: Zethson <[email protected]>

* fix flake8 config

Signed-off-by: Zethson <[email protected]>

* reverted rank_genes

Signed-off-by: Zethson <[email protected]>

* fix disp_mean_bin formatting

Co-authored-by: Isaac Virshup <[email protected]>

* fix formatting

Signed-off-by: Zethson <[email protected]>

* add final todos

Signed-off-by: Zethson <[email protected]>

* boolean checks with is

Signed-off-by: Zethson <[email protected]>

* _dpt formatting

Signed-off-by: Zethson <[email protected]>

* literal fixes

Signed-off-by: Zethson <[email protected]>

* links to leafs

Signed-off-by: Zethson <[email protected]>

* revert paga variable naming

Co-authored-by: Philipp A <[email protected]>
Co-authored-by: Sergei Rybakov <[email protected]>
Co-authored-by: Isaac Virshup <[email protected]>
Co-authored-by: Jonathan Manning <[email protected]>
Co-authored-by: mjayasur <[email protected]>
Co-authored-by: Michael Jayasuriya <[email protected]>
Co-authored-by: Alex M. Ascensión <[email protected]>
Co-authored-by: Gökçen Eraslan <[email protected]>
ivirshup added a commit that referenced this pull request Mar 29, 2022
* adding core functions and documentation for pearson residual normalization and hvg selection

* adding Pearson residual+PCA bundles, minor bug fixes

* some style cleanup, minor fixes

* adapting _normalize_pearson_residuals() to cleaned-up _normalized_total() from #1667

* updating layer management as in #1667 for _highly_variable_pearson_residuals() as well

* slight performance improvement for sparse input

* style cleanup

* fixing import issue, fixing docstring style, adding check_values param and warning as in #1642

* fixed small NameError, simplified clip argument

* remove pd.categorical()

* adding check_values to docstrings and remaining pearson residual functions

* np.empty instead of np.nan

* add references to docstrings, add HVG details to docstring

* exposing pca keyword arguments to the user for the bundle/recipe functions

* removed unneeded reversal in hvg, fix kwargs_pca bug, consistent defaults across files

* fixing handling of `inplace` and `subset` arguments (see issue #1886), explicit typing of output, adding theta input check

* renaming output fields for consistency, fixing minor bug

* renaming output fields for consistency

* adding function that prepares testdata (used for pearson residual tests)

* adding tests for all pearson residual functions

* fix precommit high_var_genes

* try to get precommit to work

* try to get precommit to work

* fix recipes

* fix normalization

* remove relative imports

* fix docstrings

* retry to build docs

* fix highvar docstring

* more fixing docstrings

* docs build locally ? 🔨

* minor cleanup test normalization

* more minor cleanups

* final cleanup normalization

* fixes high var

* init experimental module

* fix column ordering for batch case

* moving to experimental, minor fix for experimental version of hvg selection

* linking tests to new experimental submodule, style cleanup

* adapt input arguments and docstring for experimental version of hvg selection function

* add recipes

* fix docs

* add correct module docs

* fix recipe docstrings

* try fix indentation

* fix indentation

* fix

* new indentation

* add space

* fixing typo in docstring

* renaming pca output fields

* adapting tests to new output fieldname

* fix docs 🔨

* update docs

* fix test 🔨

* ensure argument and docstring consistency

* update citation year

* cleaning imports in `preprocessing` functions

* making inputcheck tests specific to error/warning messages

* making inputcheck tests specific to error/warning messages

* resolve HVGs across batches more cleanly, fix dtype issue

* renaming pca input arguments

* renaming pca input arguments

* _pca bundle: more efficient copy handling, added input check. both _pca and _recipe: varm field for PCs, adapted tests and docs

* move repeated inputcheck code to helpers

* merging tests *_values and *_general

* condense code in pearson hvg selection test, smaller test data for speedup

* condensing code in normalization tests

* add asterisks for keyword

* updating refs to Genome Biology publication

* cleanup helpers.py

* cleanup main files as requested by @ivirshup

* revert unneeded settingWithCopy fix

* cache data

* use doc_params for doc

* fix doc_params var

* finalize docs

* fix param doc

* wrong var still

* add cached datasets module and test on high_var_genes tests

* use new cache dataset module for tests

* fix precommit

* fix docs

* fix reference and add notebook to tutorials

* add release note

* add release note

* fix release note

* typo

* remove duplicate reference

* fixing black flake etc requirements

* add _pca function to release note

* last edits to docs

* fix release and tutorial image

* try fix pre-commit

* minor docs

* Remove accidentally included files from merge

Co-authored-by: giovp <[email protected]>
Co-authored-by: Isaac Virshup <[email protected]>