Added blb inference option to the OrthoForest #214

Merged
merged 3 commits into from
Feb 7, 2020
5 changes: 2 additions & 3 deletions doc/spec/estimation/forest.rst
@@ -363,8 +363,7 @@ and the `ForestLearners Jupyter notebook <https://github.com/microsoft/EconML/bl
>>> est.fit(Y, T, W, W)
<econml.ortho_forest.ContinuousTreatmentOrthoForest object at 0x...>
>>> print(est.effect(W[:2]))
-[[1. ]
- [1.2]]
+[1.00... 1.19...]
Collaborator

If I understand correctly, the shape changing here is good because it was wrong before.

Should I be concerned that the values themselves have changed as well?

Contributor Author

Yes, the shape change is intentional. The values have changed because the trees are partitioned into segments ("little bags") that share data, so the same random seed gives slightly different results now.

Collaborator

Hm, so the trees are partitioned even when fit's inference option is None rather than 'blb'?

Contributor Author

Yes, otherwise we'd have two ways of partitioning the trees and it would be hard to keep track of. It doesn't affect the estimate much since that's done using all of the 2*n_trees leaves. The only difference is which data samples a particular tree uses: each group of slice_len trees subsamples from the same n_samples/2 samples, but which n_samples/2 are used differs from group to group.
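
The sampling scheme described above can be sketched roughly as follows. This is only an illustration in plain Python, not the code from this PR: the function name, the `subsample_ratio` parameter, and the per-tree subsample size are assumptions made for the example.

```python
import random


def little_bag_subsamples(n_samples, n_trees, slice_len,
                          subsample_ratio=0.5, seed=0):
    """Illustrative only: assign sample indices to trees in little bags.

    `subsample_ratio` is a hypothetical knob for the per-tree subsample
    size; the real estimator may size its subsamples differently.
    """
    rng = random.Random(seed)
    bags, tree_samples = [], []
    for start in range(0, n_trees, slice_len):
        # Each group of trees (a "little bag") draws its own half-sample.
        bag = rng.sample(range(n_samples), n_samples // 2)
        bags.append(bag)
        size = max(1, int(subsample_ratio * len(bag)))
        for _ in range(min(slice_len, n_trees - start)):
            # Every tree in the group subsamples from the shared half-sample.
            tree_samples.append(rng.sample(bag, size))
    return bags, tree_samples


bags, tree_samples = little_bag_subsamples(n_samples=100, n_trees=6, slice_len=3)
```

Trees 0 to 2 draw only from `bags[0]` and trees 3 to 5 only from `bags[1]`, which is why the same seed no longer reproduces the old per-tree samples.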


Similarly, we can call :class:`.DiscreteTreatmentOrthoForest`:

@@ -377,7 +376,7 @@ Similarly, we can call :class:`.DiscreteTreatmentOrthoForest`:
>>> est.fit(Y, T, W, W)
<econml.ortho_forest.DiscreteTreatmentOrthoForest object at 0x...>
>>> print(est.effect(W[:2]))
-[1. 1.2]
+[1.01... 1.25...]

Let's now look at a more involved example with a high-dimensional set of confounders :math:`W`
and with more realistic noisy data. In this case we can just use the default parameters
21 changes: 21 additions & 0 deletions doc/spec/inference.rst
@@ -136,6 +136,27 @@ and the :class:`.ForestDRLearner`. You can enable such intervals by setting ``in
This inference is enabled by our implementation of the :class:`.SubsampledHonestForest` extension to the scikit-learn
:class:`~sklearn.ensemble.RandomForestRegressor`.


OrthoForest Bootstrap of Little Bags Inference
==============================================

For the Orthogonal Random Forest estimators (see :class:`.ContinuousTreatmentOrthoForest`, :class:`.DiscreteTreatmentOrthoForest`),
we provide confidence intervals built via the bootstrap-of-little-bags approach ([Athey2019]_). This technique is well suited for
estimating the uncertainty of the honest causal forests underlying the OrthoForest estimators. You can enable such intervals by setting
``inference='blb'``, e.g.:

.. testcode::

    from econml.ortho_forest import ContinuousTreatmentOrthoForest
    from econml.sklearn_extensions.linear_model import WeightedLasso
    est = ContinuousTreatmentOrthoForest(n_trees=10,
                                         min_leaf_size=3,
                                         model_T=WeightedLasso(alpha=0.01),
                                         model_Y=WeightedLasso(alpha=0.01))
    est.fit(y, t, X, W, inference='blb')
    point = est.const_marginal_effect(X)
    lb, ub = est.const_marginal_effect_interval(X, alpha=0.05)
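
To give a feel for how a bootstrap-of-little-bags interval can be formed, here is a deliberately simplified sketch in plain Python. It is not EconML's implementation: the function name and inputs are hypothetical, it assumes one scalar prediction per tree with trees ordered by group, and it leaves out any bias corrections a full implementation may apply.

```python
from statistics import NormalDist, mean


def blb_interval(tree_preds, slice_len, alpha=0.05):
    """Illustrative only: normal interval from between-group variance.

    `tree_preds` holds one prediction per tree at a single target point;
    consecutive runs of `slice_len` trees form one little bag.
    """
    groups = [tree_preds[i:i + slice_len]
              for i in range(0, len(tree_preds), slice_len)]
    group_means = [mean(g) for g in groups]
    point = mean(tree_preds)
    # Spread of the per-bag averages around the overall estimate.
    var_between = sum((m - point) ** 2 for m in group_means) / len(group_means)
    z = NormalDist().inv_cdf(1 - alpha / 2)
    half_width = z * var_between ** 0.5
    return point, point - half_width, point + half_width


point, lb, ub = blb_interval([1.0, 1.2, 0.8, 1.0], slice_len=2)
```

Because trees within a bag share the same half-sample, the variation between bag averages reflects sampling variability of the data rather than just the randomness of individual trees.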

.. todo::
    * Subsampling
    * Doubly Robust Gradient Inference