Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Refitting model_final and nuisance averaging #360

Merged
merged 27 commits into from
Jan 10, 2021
Merged

Conversation

vsyrgkanis
Copy link
Collaborator

@vsyrgkanis vsyrgkanis commented Jan 2, 2021

  • Support refitting only final model in DML after changing estimator parameters

  • Add support for monte carlo nuisance estimation, with multiple k-fold draws.

  • added rlearner residuals_ property that returns fitted residuals for training data (fixes plotting the residuals T & Y from the model #350 )

  • fixed flaky cate interpreter test

  • added refit example in the dml notebook

@vsyrgkanis vsyrgkanis force-pushed the vasilis/refit branch 5 times, most recently from acd4d11 to 39634db Compare January 2, 2021 22:00
fixed small issue with _d_t storage

enabled refitting in drlearner

enabled refit in ortho_iv

enabled monte_carlo_iterations in ortho_iv

added cache_values param to ortho_iv

fixed docstrings to remove the model parameters and added dosctrings to the abstract methods. Removed doctest examples from ortholearner and rlearner as these classes can no longer be used as standalone and are abstract classes
@vsyrgkanis vsyrgkanis force-pushed the vasilis/refit branch 3 times, most recently from b315e2e to cc2ae24 Compare January 4, 2021 02:36
linting

fixed failing tests

fixed notebooks

fixed ortholearner docstring

reverted to allowing n_crossfit_splits and raising deprecation warning. fixed ortholearner doctest

fixed bugs for failing tests

fixed failing test bugs

added extra kwargs to _strata

old sklearn crashes with new scipy. we can revert once we enforce new sklearn

disallowing bootstrap inference when refitting

added many refit tests. fixed some leftover bugs based on new tests

typo in tests

changed monte_carlo_iterations to mc_iters and added mc_agg parameter to change the aggregation method {mean, median}

fixed nuisance aggregation when nuisances have different dimensions

removed _models_nuisance initilaization at init.

added rlearner residuals_ property

fixed flaky cate interpreter test
@vsyrgkanis vsyrgkanis marked this pull request as ready for review January 4, 2021 21:19
@vsyrgkanis vsyrgkanis added the enhancement New feature or request label Jan 4, 2021
@vsyrgkanis vsyrgkanis changed the title Vasilis/refit Refitting model_final and nuisance averaging Jan 6, 2021
Copy link
Collaborator

@kbattocchi kbattocchi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is great! Left a few minor suggestions.

econml/_ortho_learner.py Outdated Show resolved Hide resolved
econml/_ortho_learner.py Show resolved Hide resolved
econml/_ortho_learner.py Outdated Show resolved Hide resolved
econml/_ortho_learner.py Outdated Show resolved Hide resolved
econml/_ortho_learner.py Outdated Show resolved Hide resolved
econml/dml.py Outdated Show resolved Hide resolved
econml/dml.py Outdated Show resolved Hide resolved
econml/dml.py Outdated Show resolved Hide resolved
econml/drlearner.py Show resolved Hide resolved
setup.cfg Outdated Show resolved Hide resolved
Copy link
Contributor

@heimengqi heimengqi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good. I am a bit confused by the attribute name changes with heading and trailing underscore, but I checked the test code coverage, it's all being tested so should be fine.

econml/_ortho_learner.py Show resolved Hide resolved
econml/cate_estimator.py Outdated Show resolved Hide resolved
econml/drlearner.py Show resolved Hide resolved
econml/cate_estimator.py Outdated Show resolved Hide resolved
econml/dml.py Outdated

@property
def bias_part_of_coef(self):
return self.rlearner_model_final._fit_cate_intercept
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same here, why we need this attribute if it's same as fit_cate_intercept ? Another corner case I think we are missing now is from the parse_final_model_params, what if the user sets fit_cate_intercept=False but input the final model with fit_intercept=True, we should update our fit_cate_intercept depends on whether there is an intercept in the user defined final model?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fit_cate_intercept is different than the intercept of the final model. The fit_cate_intercept is aobut whether we augment X with an constant of 1's. The fit intercept of the final model is about fitting an extr aoffset in the regresstion not a cate offset.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also observe that bias_part_of_coef, returns the fit_cate_intercept based on the fit time. See the implementation in of bias_part_of_coef in dml. It doesn't read the fit_cate_intercept, but it reads the parameter that was stored when fit was called.
So there wont be an inconsitency here. At least that was a corner case that I tried to address.

Copy link
Collaborator

@kbattocchi kbattocchi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good, just one small question.

econml/dml/dml.py Outdated Show resolved Hide resolved
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

Successfully merging this pull request may close these issues.

plotting the residuals T & Y from the model
4 participants