Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Modify compare() docstring, error-check, pass var_name. #1616

Merged
merged 19 commits into from
Mar 30, 2021

Conversation

rpgoldman
Copy link
Contributor

@rpgoldman rpgoldman commented Mar 15, 2021

Description

Check to see if the log_likelihood groups have more than one data
variable, and if so, require the var_name parameter. This avoids
having a less understandable error message crop up later.

Introduce var_name parameter to compare(), and pass it through to subsidiary IC function. @OriolAbril reports it was simply an oversight that this wasn't done before.

Addresses issue #1614

Checklist

  • Follows official PR format
  • Includes new or updated tests to cover the new feature
  • Code style correct (follows pylint and black guidelines)
  • Changes are listed in changelog

@rpgoldman rpgoldman added the Enhancement Improvements to ArviZ label Mar 15, 2021
@rpgoldman rpgoldman linked an issue Mar 15, 2021 that may be closed by this pull request
arviz/stats/stats.py Outdated Show resolved Hide resolved
arviz/stats/stats.py Outdated Show resolved Hide resolved
@rpgoldman
Copy link
Contributor Author

Looking at the test failures, it seems that sometimes ArviZ computes the log_likelihood group for input InferenceData objects, instead of just extracting it. That was not what I expected. It suggests that this should be redone as I suggested above: catch the errors from the ic_func and annotate them, instead of doing the error-check in compare()

@OriolAbril
Copy link
Member

Looking at the test failures, it seems that sometimes ArviZ computes the log_likelihood group for input InferenceData objects, instead of just extracting it. That was not what I expected. It suggests that this should be redone as I suggested above: catch the errors from the ic_func and annotate them, instead of doing the error-check in compare()

Maybe the test helpers have not been updated and still have log_likelihood as a variable in sample_stats. The get_log_likelihood prints a warning if this happens but it still works for backward compatibility reasons.

@rpgoldman
Copy link
Contributor Author

I tried the error-catching and re-raising approach and it worked locally for me, so I am pushing it for inspection and testing.

I still need to add some new tests to verify that the errors are signaled as I expect.

arviz/stats/stats.py Outdated Show resolved Hide resolved
Check to see if the log_likelihood groups have more than one data variable, and if so, require the var_name parameter. This avoids having a less understandable error message crop up later.
Introduce `var_name` parameter, and pass it through to the IC function invoked by compare.
Previously, if we got a TypeError when trying to find a log_likelihood
in from_pymc3() that TypeError would be squashed completely. Now we
will echo it to the log before handling it.
Catch errors from IC functions invoked inside compare and annotate
them with information about the source `InferenceData` object.
@rpgoldman
Copy link
Contributor Author

@OriolAbril I think this now does things the way you wanted: I grab up the errors and signal a new error from the old one, with additional information identifying which model caused the problem. I'll look at your cross-reference now.

@codecov
Copy link

codecov bot commented Mar 27, 2021

Codecov Report

Merging #1616 (6d27fd5) into main (23e14fb) will decrease coverage by 0.02%.
The diff coverage is 88.09%.

❗ Current head 6d27fd5 differs from pull request most recent head 1a31db2. Consider uploading reports for the commit 1a31db2 to get more accurate results
Impacted file tree graph

@@            Coverage Diff             @@
##             main    #1616      +/-   ##
==========================================
- Coverage   90.91%   90.89%   -0.03%     
==========================================
  Files         108      108              
  Lines       11671    11700      +29     
==========================================
+ Hits        10611    10635      +24     
- Misses       1060     1065       +5     
Impacted Files Coverage Δ
arviz/data/io_pymc3.py 90.20% <75.00%> (-0.56%) ⬇️
arviz/stats/stats.py 96.28% <88.46%> (-0.40%) ⬇️
arviz/rcparams.py 94.16% <100.00%> (+0.20%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 23e14fb...1a31db2. Read the comment docs.

.pylintrc Outdated Show resolved Hide resolved
arviz/data/io_pymc3.py Show resolved Hide resolved
arviz/rcparams.py Show resolved Hide resolved
Make sure we check for multiple observed variables in compare() and that support for "var_name" works.
Previously, we checked for `sample_stats` in `get_log_likelihood()` *before* reading `log_likelihood`.  Add a check that `log_likelihood` must be missing before we check `sample_stats`.
@rpgoldman rpgoldman marked this pull request as ready for review March 29, 2021 17:26
Copy link
Member

@OriolAbril OriolAbril left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added a couple minor comments, looks good

arviz/stats/stats.py Outdated Show resolved Hide resolved
arviz/tests/base_tests/test_stats.py Outdated Show resolved Hide resolved
arviz/tests/base_tests/test_stats.py Outdated Show resolved Hide resolved
* Limit recomputation in tests by scoping a fixture.
* Test for expected IC in `compare` test.
* Refine type assertion.
@rpgoldman
Copy link
Contributor Author

@OriolAbril If my changes meet with your approval, please squash and merge.

@rpgoldman rpgoldman changed the title WIP: Modify compare() docstring, error-check, pass var_name. Modify compare() docstring, error-check, pass var_name. Mar 30, 2021
@OriolAbril OriolAbril merged commit d96880c into arviz-devs:main Mar 30, 2021
utkarsh-maheshwari pushed a commit to utkarsh-maheshwari/arviz that referenced this pull request May 27, 2021
)

* Modify compare() docstring, and var_name param and error-check.

Check to see if the log_likelihood groups have more than one data variable, and if so, require the var_name parameter. This avoids having a less understandable error message crop up later.
Introduce `var_name` parameter, and pass it through to the IC function invoked by compare.

* Fix error in argument test.

* More explicit error message.

Previously, if we got a TypeError when trying to find a log_likelihood
in from_pymc3() that TypeError would be squashed completely. Now we
will echo it to the log before handling it.

* Fix type signature of compare().

Took @OriolAbril correction.

* Annotated IC function errors from compare().

Catch errors from IC functions invoked inside compare and annotate
them with information about the source `InferenceData` object.

* Remove incorrect docstring.

* pylint

* mypy fixes.

* Make "e" acceptable variable name.

Many examples show this as a good variable name for exceptions,
particularly in "except <ExceptionClass> as e:"

* Changelog update.

* Backward-compatibility fix.

* Fix test.

Error now caught sooner.

* Fix mypy issues.

* Python 3.5 and 3.6 compatibility.

* Whitespace issue caught by Oriol.

* Test for error-trapping.

Make sure we check for multiple observed variables in compare() and that support for "var_name" works.

* Don't let sample_stats shadow log_likelihood.

Previously, we checked for `sample_stats` in `get_log_likelihood()` *before* reading `log_likelihood`.  Add a check that `log_likelihood` must be missing before we check `sample_stats`.

* Improvements suggested by OriolAbril.

* Limit recomputation in tests by scoping a fixture.
* Test for expected IC in `compare` test.
* Refine type assertion.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Enhancement Improvements to ArviZ
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Error comparing PyMC3 models with multiple observed variables
2 participants