
Add support for refitting and nuisance averaging #343

Closed · kbattocchi wants to merge 7 commits from kebatt/refit

Conversation
Conversation

kbattocchi (Collaborator) opened this pull request.

No description provided.

heimengqi (Contributor) left a comment:

Mostly looks good to me. But I agree with what you mentioned on our call: it's annoying to loop over all possible arguments and add a setter for each one, and it's hard to maintain. We might need to brainstorm and see whether there are any alternatives.

econml/_rlearner.py (outdated thread, resolved)

@property
def discrete_treatment(self):
    return self._discrete_treatment
Contributor:

Do we really need to expose the first-stage-related arguments and allow users to set them to something else? I think the value of refit is to save computation time when the nuisance models are the same and the data and results are cached. Even if users reset a value here, they can't refit only the final stage; they have to fit the entire pipeline again, which is the same as reinitializing a new estimator and calling fit in terms of computing time and lines of code. Also, it's weird to allow the user to change the discrete_treatment flag, since T is fixed and should be either continuous or categorical. If we only allowed the user to change final-stage-specific arguments, that could also simplify tons of properties and setters across multiple classes.

Same for _RLearner and the classes in the DML file.

kbattocchi (Collaborator, Author):

My idea was to make all things that you can set to some value in the initializer also available to update at any other time. It's true that many of these are not necessary for the refit functionality alone, and also that a workaround would be to create a new instance.

However, from the user perspective I think sklearn's behavior (where you can set any attribute at any time) would be much easier to understand than a policy where you can only set the attributes that don't affect the nuisances, since it is not always completely obvious which attributes affect the nuisances (e.g. the featurizer affects the Y residuals if we use the linear_first_stages flag, but doesn't affect any nuisances if it's False).

Eventually, I think we should enable setting attributes even for estimators that don't support refit (e.g. the metalearners), but for this PR I'm just starting with the _OrthoLearner class hierarchy.
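[Editor's sketch] Concretely, the usage pattern this enables would look something like the following, assuming the cache_values/refit API from this PR (Y, T, X, W stand for previously loaded data arrays):

from sklearn.ensemble import RandomForestRegressor
from sklearn.preprocessing import PolynomialFeatures
from econml.dml import LinearDML

est = LinearDML(model_y=RandomForestRegressor(),
                model_t=RandomForestRegressor(),
                linear_first_stages=False)
est.fit(Y, T, X=X, W=W, cache_values=True)

est.featurizer = PolynomialFeatures(degree=2)  # update an attribute after fit...
est.refit()                                    # ...and refit only the final stage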

def model_final(self, model):
    raise AttributeError("LinearDML final model can't be chnaged from "
                         "StatsModelsLinearRegression(fit_intercept=False)")

Contributor:

We can't change the final-stage model, but could we change some arguments of that model? E.g. cov_type, like what you did for SubsampledHonestForest.

Contributor:

Another typo I just randomly saw here: "chnaged" should be "changed".

kbattocchi (Collaborator, Author):

Yes, I think that would make sense. But currently cov_type is set based on the inference object's cov_type and is never set directly by the user, so I just left that behavior as is for now. Ultimately this ties back into our discussion of whether to change the DML hierarchy by unifying the LinearDML and SparseDML classes and cleaning up how we implement inference, but that's out of scope for now.

    return self._discrete_treatment

@discrete_treatment.setter
def discrete_treatment(self, discrete_treatment):
Collaborator:

Does this need to trigger a change in the discrete treatment transformer?

kbattocchi (Collaborator, Author):

As discussed offline, I don't think so because the transformation happens at fit time.

Collaborator:

I found a related bug; I'm not sure whether it's related to the transformer. The following works:

est = LinearDML(model_y=RandomForestRegressor(),
                model_t=RandomForestClassifier(min_samples_leaf=10),
                discrete_treatment=False,
                linear_first_stages=False,
                n_splits=6)
est.fit(Y, T, X=X, W=W)
te_pred = est.effect(X_test)
te_pred[0]

But the following raises an error, while under the current semantics it's supposed to work fine:

est = LinearDML(model_y=RandomForestRegressor(),
                model_t=RandomForestClassifier(min_samples_leaf=10),
                discrete_treatment=True,
                linear_first_stages=False,
                n_splits=6)
est.fit(Y, T, X=X, W=W)
est.discrete_treatment = False
est.fit(Y, T, X=X, W=W)
te_pred = est.effect(X_test)
te_pred[0]

This means some state of the object is maintained that should have been reset when discrete_treatment was reset.

kbattocchi (Collaborator, Author):

Hm. I wouldn't expect that to work, since it uses a classifier for model_t with a continuous treatment. I think changing discrete_treatment would require either that model_t is 'auto' (although I need to fix this logic) or that the model also changes; see the sketch below.
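[Editor's sketch] In other words, the supported way to switch would be to change the model along with the flag, e.g. (the full pipeline then needs refitting from scratch):

est.discrete_treatment = False
est.model_t = RandomForestRegressor(min_samples_leaf=10)  # a classifier is no longer valid
est.fit(Y, T, X=X, W=W)  # nuisances must be refit, not just the final stage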

Collaborator:

That's why I added the first example: when the flag is set a priori it works, and the classifier can handle the discrete treatment. So there seems to be a problem specifically with changing discrete_treatment.

kbattocchi (Collaborator, Author):

Turns out this was due to a typo in the property name for the DML setter. It will work after my next push.


@_OrthoLearner.discrete_treatment.setter
def discrete_treatement(self, discrete_treatement):
    # super().discrete_treatment = discrete_treatment
Collaborator:

In general: we should change the name of _RLearner.model_final to _RLearner.rlearner_model_final, and _OrthoLearner.model_final to _OrthoLearner.ortholearner_model_final, and then we should just be setting these variables. That would clean up the code even if the refit functionality weren't there.

kbattocchi (Collaborator, Author):

Agreed.

inference=inference)
cache_values=cache_values, monte_carlo_iterations=None, inference=inference)

@DML.model_final.setter
Collaborator:

I feel that for all these setters, the logic should live in a single __setattr__ rather than in a separate method for each one, just so we reduce lines of code. But this is secondary and I'm OK with doing it post-release. (A sketch of the idea follows.)
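[Editor's sketch] A minimal illustration of the __setattr__ idea; the mixin name and the _settable_params set are hypothetical, not the PR's implementation (only _cache_invalid_message appears in the diff):

class _CommonSetterMixin:
    # parameters whose change should invalidate any cached final-stage fit
    _settable_params = frozenset(['featurizer', 'fit_cate_intercept', 'model_final'])

    def __setattr__(self, name, value):
        if name in self._settable_params:
            # one shared invalidation path instead of a hand-written setter per parameter
            object.__setattr__(self, '_cache_invalid_message',
                               "parameter '%s' was changed after fit" % name)
        object.__setattr__(self, name, value)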

@@ -1065,3 +1064,59 @@ def test_deprecation(self):

d = pickle.dumps(LinearDMLCateEstimator())
e = pickle.loads(d)

def test_refit(self):
Collaborator:

I think we need several more tests here checking that the new params affect the final stage: things like changing fit_intercept and then testing that all our effect methods pick up the change after refitting.

Similarly, maybe some other change in SparseLinearDML (e.g. alpha), or n_estimators in ForestDML.

Also a test that the score_ attribute changes according to the new model. Maybe do a model-selection experiment: run refit with two or three hyperparameter settings, one of them the obvious best and the rest leading to random noise, read the score_ attribute after each refit, and make sure the correct final model is chosen.

Similarly, a test that we can call score(Y, T, X, W) out of sample and that the returned score is based on the new model. (A sketch of the first test is below.)
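[Editor's sketch] One such test, assuming the cache_values/refit API from this PR and using fit_cate_intercept as the example final-stage parameter:

import numpy as np
from sklearn.ensemble import RandomForestRegressor
from econml.dml import LinearDML

X = np.random.normal(size=(500, 3))
T = np.random.normal(size=500)
Y = T * X[:, 0] + np.random.normal(size=500)

est = LinearDML(model_y=RandomForestRegressor(),
                model_t=RandomForestRegressor(),
                fit_cate_intercept=False)
est.fit(Y, T, X=X, cache_values=True)
eff_before = est.effect(X[:10])

est.fit_cate_intercept = True  # change a final-stage parameter...
est.refit()                    # ...and refit only the final stage
eff_after = est.effect(X[:10])
assert not np.allclose(eff_before, eff_after)  # effect methods pick up the change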

Collaborator:

Also maybe a test: if I'm in SparseLinearDML and I do

SparseLinearDML().model_final = LinearRegression()

would that make the final model be LinearRegression()?

Collaborator:

Finally, I think we definitely need a check_is_fitted to be called in all our predict methods, raising an error if the estimator is not fitted. This is even more important now: when I change a param we need to invalidate the fit, so if someone then calls effect they shouldn't get the old fitted final model's results, but an error saying that the estimator is not fitted yet.

Currently our error messages are random internal errors from the estimators about some missing attribute, not an interpretable error of the form "estimator is not fitted yet!".

Collaborator:

I think we should use an approach similar to sklearn, where fit sets internal attributes that hold the fitted state and doesn't alter the input params (e.g. at fit time we create a model_final_ which is the fitted final model, we also set models_y_ which are the fitted models_y, etc.; these are all attributes created by fit, and the input params remain unaltered as static values).

Then sklearn has a generic check_is_fitted: https://scikit-learn.org/stable/modules/generated/sklearn.utils.validation.check_is_fitted.html
which just checks that the object has attributes ending in an underscore.

For now we could go with a simpler solution by just setting a flag, e.g. "fitted". (A sketch follows.)
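[Editor's sketch] The simpler flag-based solution could look like this (the _is_fitted name and the helper are hypothetical):

from sklearn.exceptions import NotFittedError

def check_is_fitted_simple(estimator):
    # fit would set estimator._is_fitted = True as its final step
    if not getattr(estimator, '_is_fitted', False):
        raise NotFittedError("This estimator is not fitted yet; "
                             "call 'fit' before 'effect' or other prediction methods.")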

vsyrgkanis (Collaborator) — Dec 20, 2020:

Post-release we also definitely need to showcase this functionality in the DML notebook, for instance with a cell that does something like the following, as it is superb functionality!

from econml.dml import DML
from econml.sklearn_extensions.linear_model import StatsModelsLinearRegression, StatsModelsRLM, DebiasedLasso
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import ElasticNetCV, Lasso
from sklearn.preprocessing import PolynomialFeatures

est = DML(model_y=RandomForestRegressor(min_samples_leaf=20, max_depth=3),
          model_t=RandomForestRegressor(min_samples_leaf=20, max_depth=3),
          model_final=StatsModelsLinearRegression(fit_intercept=False),
          random_state=123)
est.fit(Y_train, T_train, X=X_train, W=W_train, cache_values=True)

print(list(zip(est.coef_, *est.coef__interval())))
print(est.score_)

est.model_final = StatsModelsRLM(fit_intercept=True)
est.refit()
print(list(zip(est.coef_, *est.coef__interval())))
print(est.score_)

est.model_final = DebiasedLasso()
est.refit()
print(list(zip(est.coef_, *est.coef__interval())))
print(est.score_)

est.model_final = Lasso(alpha=0.1)
est.refit()
print(est.coef_)
print(est.score_)

est.model_final = ElasticNetCV(cv=3)
est.refit()
print(est.coef_)
print(est.score_)

est.featurizer = PolynomialFeatures(degree=3)
est.refit()
print(est.coef_)
print(est.score_)

I wish we could also seamlessly use the NonParamDML variants and the CausalForestDML variant, but those have a different rlearner_model_final, so it seems harder. But that would be very useful.

Collaborator:

Maybe we could just have an input flag, linear_model_final: if it is True, we don't use the weight trick and all the functionality in DML is good to go; if linear_model_final=False, we use the weight trick and raise an error when the user passes multiple treatments, since that is not allowed (I believe this will actually be handled automatically by the final model wrapper).

So I think it's just a matter of adding the linear_model_final flag, and then we could do:

est.linear_model_final = False
est.model_final = RandomForestRegressor()
est.refit()

This would allow us to do everything through refitting (except CausalForestDML, which uses a separate final model wrapper).

Collaborator:

For now we can do some of this by using NonParamDML, but this still requires that the model accept weights, while for linear models we don't need weights. That seems good enough for now, but maybe the linear_model_final flag is still a good idea.

from econml.dml import NonParamDML
from econml.sklearn_extensions.linear_model import StatsModelsLinearRegression, StatsModelsRLM, DebiasedLasso
from econml.sklearn_extensions.ensemble import SubsampledHonestForest
from sklearn.linear_model import Lasso
from sklearn.preprocessing import PolynomialFeatures
from sklearn.ensemble import RandomForestRegressor, GradientBoostingRegressor

est = NonParamDML(model_y=RandomForestRegressor(min_samples_leaf=20, max_depth=3),
                  model_t=RandomForestRegressor(min_samples_leaf=20, max_depth=3),
                  model_final=SubsampledHonestForest(),
                  random_state=123)
est.fit(Y_train, T_train, X=X_train, W=W_train, cache_values=True)

print(list(zip(est.effect(X_val[:1]), *est.effect_interval(X_val[:1]))))
print(est.score_)
est.model_final = DebiasedLasso()
est.refit()
print(list(zip(est.effect(X_val[:1]), *est.effect_interval(X_val[:1]))))
print(est.score_)

est.model_final = Lasso(alpha=0.01)
est.refit()
print(est.effect(X_val[:1]))
print(est.score_)

est.featurizer = PolynomialFeatures(degree=3)
est.refit()
print(est.effect(X_val[:1]))
print(est.score_)

est.featurizer = None
est.model_final = RandomForestRegressor(min_samples_leaf=20, max_depth=3)
est.refit()
print(est.effect(X_val[:1]))
print(est.score_)

est.model_final = GradientBoostingRegressor(min_samples_leaf=20, max_depth=3)
est.refit()
print(est.effect(X_val[:1]))
print(est.score_)

kbattocchi (Collaborator, Author) — Dec 21, 2020:

Regarding checking fittedness, I agree that we should make sure that fit has been called before calling effect (and we don't always do this currently). However, note that sklearn will not reset an estimator to "unfitted" if attributes are set in between calls to fit; for example:

  1. fit an estimator
  2. set an attribute
  3. call predict

will work just fine, and generate predictions using the previously fit model (using non-current settings). I'd expect our behavior to match.
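[Editor's sketch] For concreteness, a small self-contained illustration of the sklearn behavior described above:

import numpy as np
from sklearn.linear_model import Lasso

X = np.random.normal(size=(100, 3))
y = X[:, 0] + np.random.normal(size=100)

m = Lasso(alpha=0.1).fit(X, y)
m.alpha = 10.0           # attribute set between fit and predict
print(m.predict(X[:5]))  # still works, using the coefficients fit with alpha=0.1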

econml/_ortho_learner.py (outdated thread, resolved)
self._cache_invalid_message = None
return self

# TODO: this doesn't currently allow inference, because that is handled by the wrap_fit attribute
Collaborator:

So is inference the same object as the one called by fit?

I guess this is also related to the fact that the inference object does not have the new final model. I suspect these might need to be solved simultaneously (i.e. propagate the final model to the inference object and allow refit to take "inference=..." as input).

fitted_inds = None

for _ in range(monte_carlo_iterations or 1):
    nuisances, new_inds = self._fit_nuisances(Y, T, X, W, Z, sample_weight=sample_weight, groups=groups)
Collaborator:

If we fit multiple times, which version is models_nuisance storing? Is it storing the fitted model for each mc iteration and each fold of the k-fold? Or only the final version across mc iterations? The latter would be inconsistent.

Similarly, when we call score we use the fitted models_nuisance to predict. But which fitted models_nuisance? Just the last mc iteration, or all of the mc_iters?

kbattocchi (Collaborator, Author):

Right now it is fitting only a single model multiple times and retaining information from the latest run (because users might expect a single model per fold, not one per fold × mc_iter combination), but I can change that.

Perhaps best would be a model wrapper that wraps all of the individual ones but provides a single prediction, similar to bootstrap? (Sketch below.)
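[Editor's sketch] Such a wrapper could average over the per-iteration clones; the class name is hypothetical:

import numpy as np

class _AveragedModels:
    """Wraps the fitted clone from each mc iteration and averages their predictions."""
    def __init__(self, models):
        self.models = models
    def predict(self, X):
        return np.mean([m.predict(X) for m in self.models], axis=0)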

kbattocchi (Collaborator, Author):

Actually, I take that back: _crossfit clones the model, so we don't call fit repeatedly on the same instance.

vsyrgkanis (Collaborator):

@kbattocchi to address #350, we might want to implement public access methods for the cached nuisances, e.g. est.get_nuisances at the _OrthoLearner level and est.get_residuals at the _RLearner level.

Currently this can be done by:

Yres = est._cached_values.nuisances[0]
Tres = est._cached_values.nuisances[1]

which is cumbersome.
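[Editor's sketch] The proposed _RLearner-level accessor could look like this (the method name and error message are hypothetical; assumes the cached nuisances are ordered (Y_res, T_res)):

def get_residuals(self):
    """Return the cached first-stage residuals (Y_res, T_res) computed at fit time."""
    if self._cached_values is None:
        raise AttributeError("Residuals are only available when fit was called "
                             "with cache_values=True")
    Y_res, T_res = self._cached_values.nuisances
    return Y_res, T_res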


kbattocchi force-pushed the kebatt/refit branch 3 times, most recently from 66035e3 to c3a5652 (December 23, 2020, 16:04)
kbattocchi force-pushed the kebatt/refit branch 2 times, most recently from f09f754 to de96b56 (December 23, 2020, 19:15)
vsyrgkanis (Collaborator):

This code fails:

from econml.dml import DML
from econml.sklearn_extensions.linear_model import StatsModelsLinearRegression, StatsModelsRLM, DebiasedLasso
from sklearn.ensemble import RandomForestRegressor

est = DML(model_y=RandomForestRegressor(min_samples_leaf=20, max_depth=3),
          model_t=RandomForestRegressor(min_samples_leaf=20, max_depth=3),
          model_final=StatsModelsLinearRegression(fit_intercept=False),
          random_state=123)
est.fit(Y_train, T_train, X=X_train, W=W_train, cache_values=True)
print(list(zip(est.coef_, *est.coef__interval())))
est.model_final = StatsModelsRLM(fit_intercept=True)
est.refit()
print(list(zip(est.coef_, *est.coef__interval())))
[(6.45655731600806, 6.050960723017014, 6.8621539089991055)]
4.274931642986886
---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
<ipython-input-6-0a8e6a227739> in <module>
     13 est.model_final = StatsModelsRLM(fit_intercept=True)
     14 est.refit()
---> 15 print(list(zip(est.coef_, *est.coef__interval())))
     16 print(est.score_)
     17 

c:\users\vasy\documents\alicedev\econml\econml\cate_estimator.py in call(self, *args, **kwargs)
    205                 return getattr(self._inference, name)(*args, **kwargs)
    206             else:
--> 207                 raise AttributeError("Can't call '%s' because 'inference' is None" % name)
    208         return call
    209 

AttributeError: Can't call 'coef__interval' because 'inference' is None

econml/dml.py (4 resolved review threads)
# _BaseDML's final model setter reuses its old featurizer
# so we need to pass _BaseDML, not DML to the super() call
# super(_BaseDML).model_final = _FinalWrapper(...)
super(_BaseDML, _BaseDML).model_final.__set__(self, _FinalWrapper(
Collaborator:

This is a bug. There is no model_final here.

@@ -252,7 +270,7 @@ def model_cate(self):
An instance of the model_final object that was fitted after calling fit which corresponds
to the constant marginal CATE model.
"""
return super().model_final._model
return self._model_final
Collaborator:

This should simply be

self.model_final

without the underscore.

kbattocchi (Collaborator, Author):

Shelving in favor of #360.

kbattocchi closed this on Jan 6, 2021.