
Commit

typo fixes (#864)
Signed-off-by: Alex Malins <github@alexmalins.com>
alexmalins authored Mar 26, 2024
1 parent 6cc5b23 commit 6219695
Showing 9 changed files with 17 additions and 17 deletions.
2 changes: 1 addition & 1 deletion doc/spec/comparison.rst
@@ -4,7 +4,7 @@ Detailed estimator comparison


 +---------------------------------------------+--------------+--------------+------------------+-------------+-----------------+------------+--------------+--------------------+
-| Estimator | | Treatment | | Requires | | Delivers Conf. | | Linear | | Linear | | Mulitple | | Multiple | | High-Dimensional |
+| Estimator | | Treatment | | Requires | | Delivers Conf. | | Linear | | Linear | | Multiple | | Multiple | | High-Dimensional |
 | | | Type | | Instrument | | Intervals | | Treatment | | Heterogeneity | | Outcomes | | Treatments | | Features |
 +=============================================+==============+==============+==================+=============+=================+============+==============+====================+
 | :class:`.SieveTSLS` | Any | Yes | | Yes | Assumed | Yes | Yes | |
10 changes: 5 additions & 5 deletions doc/spec/estimation/dml.rst
@@ -72,7 +72,7 @@ Most of the methods provided make a parametric form assumption on the heterogene
linear on some pre-defined; potentially high-dimensional; featurization). These methods include:
:class:`.DML`, :class:`.LinearDML`,
:class:`.SparseLinearDML`, :class:`.KernelDML`.
-For fullly non-parametric heterogeneous treatment effect models, check out the :class:`.NonParamDML`
+For fully non-parametric heterogeneous treatment effect models, check out the :class:`.NonParamDML`
and the :class:`.CausalForestDML`.
For more options of non-parametric CATE estimators,
check out the :ref:`Forest Estimators User Guide <orthoforestuserguide>`
@@ -165,7 +165,7 @@ structure of the implemented CATE estimators is as follows.
Below we give a brief description of each of these classes:

* **DML.** The class :class:`.DML` assumes that the effect model for each outcome :math:`i` and treatment :math:`j` is linear, i.e. takes the form :math:`\theta_{ij}(X)=\langle \theta_{ij}, \phi(X)\rangle`, and allows for any arbitrary scikit-learn linear estimator to be defined as the final stage (e.g.
-:class:`~sklearn.linear_model.ElasticNet`, :class:`~sklearn.linear_model.Lasso`, :class:`~sklearn.linear_model.LinearRegression` and their multi-task variations in the case where we have mulitple outcomes, i.e. :math:`Y` is a vector). The final linear model will be fitted on features that are derived by the Kronecker-product
+:class:`~sklearn.linear_model.ElasticNet`, :class:`~sklearn.linear_model.Lasso`, :class:`~sklearn.linear_model.LinearRegression` and their multi-task variations in the case where we have multiple outcomes, i.e. :math:`Y` is a vector). The final linear model will be fitted on features that are derived by the Kronecker-product
of the vectors :math:`T` and :math:`\phi(X)`, i.e. :math:`\tilde{T}\otimes \phi(X) = \mathtt{vec}(\tilde{T}\cdot \phi(X)^T)`. This regression will estimate the coefficients :math:`\theta_{ijk}`
for each outcome :math:`i`, treatment :math:`j` and feature :math:`k`. The final model is minimizing a regularized empirical square loss of the form:

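(A minimal usage sketch of the DML final stage described in this hunk — not part of the diff. The nuisance models, featurizer, and synthetic data are illustrative assumptions; the final stage is a scikit-learn linear model fit on the Kronecker features of the treatment and phi(X).)

import numpy as np
from sklearn.ensemble import RandomForestClassifier, RandomForestRegressor
from sklearn.linear_model import Lasso
from sklearn.preprocessing import PolynomialFeatures
from econml.dml import DML

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 5))         # heterogeneity features
W = rng.normal(size=(1000, 3))         # confounders
T = rng.binomial(1, 0.5, size=1000)    # binary treatment
y = T * (1 + X[:, 0]) + W[:, 0] + rng.normal(size=1000)

est = DML(
    model_y=RandomForestRegressor(min_samples_leaf=20),   # illustrative nuisance model
    model_t=RandomForestClassifier(min_samples_leaf=20),  # illustrative nuisance model
    model_final=Lasso(alpha=0.1, fit_intercept=False),    # linear final stage theta_ij
    featurizer=PolynomialFeatures(degree=2),              # phi(X)
    discrete_treatment=True,
)
est.fit(y, T, X=X, W=W)
print(est.effect(X[:5]))  # CATE estimates at the first five rows of X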
@@ -239,7 +239,7 @@ Below we give a brief description of each of these classes:
[Nie2017]_. It approximates any function in the RKHS by creating random Fourier features. Then runs a ElasticNet
regularized final model. Thus it approximately implements the results of [Nie2017], via the random fourier feature
approximate representation of functions in the RKHS. Moreover, given that we use Random Fourier Features this class
-asssumes an RBF kernel.
+assumes an RBF kernel.

* **NonParamDML.** The class :class:`.NonParamDML` makes no assumption on the effect model for each outcome :math:`i`.
However, it applies only when the treatment is either binary or single-dimensional continuous. It uses the observation that for a single
@@ -350,7 +350,7 @@ Usage FAQs
it does so in a manner that is robust to the estimation mistakes that these ML algorithms
might be making.

-Moreover, one may typically want to estimate treatment effect hetergoeneity,
+Moreover, one may typically want to estimate treatment effect heterogeneity,
which the above OLS approach wouldn't provide. One potential way of providing such heterogeneity
is to include product features of the form :math:`X\cdot T` in the OLS model. However, then
one faces again the same problems as above:
@@ -564,7 +564,7 @@ Usage FAQs
- **How can I assess the performance of the CATE model?**

Each of the DML classes have an attribute `score_` after they are fitted. So one can access that
-attribute and compare the performance accross different modeling parameters (lower score is better):
+attribute and compare the performance across different modeling parameters (lower score is better):

.. testcode::

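(A hedged sketch of the `score_` comparison described in the FAQ hunk above — synthetic data; the featurizer alternatives are illustrative assumptions.)

import numpy as np
from sklearn.preprocessing import PolynomialFeatures
from econml.dml import LinearDML

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 4))
W = rng.normal(size=(1000, 2))
T = rng.binomial(1, 0.5, size=1000)
y = T * X[:, 0] + W[:, 0] + rng.normal(size=1000)

for featurizer in (None, PolynomialFeatures(degree=2)):
    est = LinearDML(featurizer=featurizer, discrete_treatment=True)
    est.fit(y, T, X=X, W=W)
    print(featurizer, est.score_)  # lower score_ is better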
2 changes: 1 addition & 1 deletion doc/spec/estimation/dr.rst
@@ -472,7 +472,7 @@ Usage FAQs
- **How can I assess the performance of the CATE model?**

Each of the DRLearner classes have an attribute `score_` after they are fitted. So one can access that
-attribute and compare the performance accross different modeling parameters (lower score is better):
+attribute and compare the performance across different modeling parameters (lower score is better):

.. testcode::

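(The same pattern, sketched for the DRLearner FAQ above — default nuisance models and synthetic data are assumptions.)

import numpy as np
from econml.dr import DRLearner

rng = np.random.default_rng(1)
X = rng.normal(size=(1000, 4))
T = rng.integers(0, 2, size=1000)   # DRLearner requires a discrete treatment
y = T * X[:, 0] + rng.normal(size=1000)

est = DRLearner()
est.fit(y, T, X=X)
print(est.score_)  # doubly robust final-stage loss; lower is better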
2 changes: 1 addition & 1 deletion doc/spec/estimation/forest.rst
@@ -257,7 +257,7 @@ Then the criterion implicit in the reduction is the weighted mean squared error,
where :math:`Var_n`, denotes the empirical variance. Essentially, this criterion tries to maximize heterogeneity
(as captured by maximizing the sum of squares of the two estimates), while penalizing splits that create nodes
with small variation in the treatment. On the contrary the criterion proposed in [Athey2019]_ ignores the within
-child variation of the treatment and solely maximizes the hetergoeneity, i.e.
+child variation of the treatment and solely maximizes the heterogeneity, i.e.

.. math::
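(For context: the two criteria contrasted in this hunk map — as an assumption about the econml API, not something shown in this diff — onto the `criterion` option of :class:`.CausalForestDML`.)

from econml.dml import CausalForestDML

# "mse": the weighted MSE criterion above, which penalizes splits that leave
#        little within-child variation in the treatment
# "het": the [Athey2019]_ criterion, which solely maximizes heterogeneity
est_mse = CausalForestDML(criterion="mse", discrete_treatment=True)
est_het = CausalForestDML(criterion="het", discrete_treatment=True)
# both are then fit as usual, e.g. est_het.fit(y, T, X=X, W=W)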
2 changes: 1 addition & 1 deletion doc/spec/faq.rst
@@ -57,7 +57,7 @@ How do I give feedback?
------------------------------------

This project welcomes contributions and suggestions. We use the `DCO bot <https://github.com/apps/dco>`_ to enforce a
-`Developer Certificate of Origin <https://developercertificate.org/>` which requires users to sign-off on their commits.
+`Developer Certificate of Origin <https://developercertificate.org/>`_ which requires users to sign-off on their commits.
This is a simple way to certify that you wrote or otherwise have the right to submit the code you are contributing to
the project. Git provides a :code:`-s` command line option to include this automatically when you commit via :code:`git commit`.

10 changes: 5 additions & 5 deletions doc/spec/interpretability.rst
@@ -73,10 +73,10 @@ models using the Shapley values methodology (see e.g. [Lundberg2017]_).
Similar to how black-box predictive machine learning models can be explained with SHAP, we can also explain black-box effect
heterogeneity models. This approach provides an explanation as to why a heterogeneous causal effect model produced larger or
smaller effect values for particular segments of the population. Which were the features that lead to such differentiation?
-This question is easy to address when the model is succinctly described, such as the case of linear heterogneity models,
+This question is easy to address when the model is succinctly described, such as the case of linear heterogeneity models,
where one can simply investigate the coefficients of the model. However, it becomes hard when one starts using more expressive
-models, such as Random Forests and Causal Forests to model effect hetergoeneity. SHAP values can be of immense help to
-understand the leading factors of effect hetergoeneity that the model picked up from the training data.
+models, such as Random Forests and Causal Forests to model effect heterogeneity. SHAP values can be of immense help to
+understand the leading factors of effect heterogeneity that the model picked up from the training data.

Our package offers seamless integration with the SHAP library. Every CATE estimator has a method `shap_values`, which returns the
SHAP value explanation of the estimators output for every treatment and outcome pair. These values can then be visualized with
@@ -92,8 +92,8 @@ For instance:
est = LinearDML()
est.fit(y, t, X=X, W=W)
shap_values = est.shap_values(X)
-# local view: explain hetergoeneity for a given observation
+# local view: explain heterogeneity for a given observation
ind=0
shap.plots.force(shap_values["Y0"]["T0"][ind], matplotlib=True)
-# global view: explain hetergoeneity for a sample of dataset
+# global view: explain heterogeneity for a sample of dataset
shap.summary_plot(shap_values['Y0']['T0'])
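(A self-contained, runnable variant of the snippet fixed above — synthetic data; assumes the shap package is installed.)

import numpy as np
import shap
from econml.dml import LinearDML

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 5))
W = rng.normal(size=(500, 3))
t = rng.binomial(1, 0.5, size=500)
y = t * (1 + X[:, 0]) + W[:, 0] + rng.normal(size=500)

est = LinearDML(discrete_treatment=True)
est.fit(y, t, X=X, W=W)
shap_values = est.shap_values(X)
# local view: explain heterogeneity for a given observation
shap.plots.force(shap_values["Y0"]["T0"][0], matplotlib=True)
# global view: explain heterogeneity for a sample of the dataset
shap.summary_plot(shap_values["Y0"]["T0"])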
2 changes: 1 addition & 1 deletion econml/solutions/causal_analysis/_causal_analysis.py
@@ -1546,7 +1546,7 @@ def plot_heterogeneity_tree(self, Xtest, feature_index, *,
include_model_uncertainty=False,
alpha=0.05):
"""
-Plot an effect hetergoeneity tree using matplotlib.
+Plot an effect heterogeneity tree using matplotlib.
Parameters
----------
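(A hedged usage sketch for the docstring fixed above; the CausalAnalysis constructor arguments are assumptions not shown in this diff.)

import numpy as np
from econml.solutions.causal_analysis import CausalAnalysis

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 3))
y = X[:, 0] * (X[:, 1] > 0) + rng.normal(size=500)

ca = CausalAnalysis(feature_inds=[0, 1, 2], categorical=[])  # assumed signature
ca.fit(X, y)
# plot an effect heterogeneity tree for feature 0, per the fixed docstring
ca.plot_heterogeneity_tree(X, 0, include_model_uncertainty=False, alpha=0.05)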
2 changes: 1 addition & 1 deletion notebooks/Double Machine Learning Examples.ipynb
@@ -927,7 +927,7 @@
"source": [
"### 2.4 Interpretability with SHAP Values\n",
"\n",
"Explain the hetergoeneity model for the constant marginal effect of the treatment using <a href=\"https://shap.readthedocs.io/en/latest/\">SHAP values</a>."
"Explain the heterogeneity model for the constant marginal effect of the treatment using <a href=\"https://shap.readthedocs.io/en/latest/\">SHAP values</a>."
]
},
{
2 changes: 1 addition & 1 deletion notebooks/Interpretability with SHAP.ipynb
@@ -23,7 +23,7 @@
"\n",
"[SHAP](https://shap.readthedocs.io/en/latest/) is a popular open source library for interpreting black-box machine learning models using the [Shapley values methodology](https://proceedings.neurips.cc/paper/2017/hash/8a20a8621978632d76c43dfd28b67767-Abstract.html).\n",
"\n",
"Similar to how black-box predictive machine learning models can be explained with SHAP, we can also explain black-box effect heterogeneity models. This approach provides an explanation as to why a heterogeneous causal effect model produced larger or smaller effect values for particular segments of the population. Which were the features that lead to such differentiation? This question is easy to address when the model is succinctly described, such as the case of linear heterogneity models, where one can simply investigate the coefficients of the model. However, it becomes hard when one starts using more expressive models, such as Random Forests and Causal Forests to model effect hetergoeneity. SHAP values can be of immense help to understand the leading factors of effect hetergoeneity that the model picked up from the training data.\n",
"Similar to how black-box predictive machine learning models can be explained with SHAP, we can also explain black-box effect heterogeneity models. This approach provides an explanation as to why a heterogeneous causal effect model produced larger or smaller effect values for particular segments of the population. Which were the features that lead to such differentiation? This question is easy to address when the model is succinctly described, such as the case of linear heterogeneity models, where one can simply investigate the coefficients of the model. However, it becomes hard when one starts using more expressive models, such as Random Forests and Causal Forests to model effect heterogeneity. SHAP values can be of immense help to understand the leading factors of effect heterogeneity that the model picked up from the training data.\n",
"\n",
"Our package offers seamless integration with the SHAP library. Every `CateEstimator` has a method `shap_values`, which returns the SHAP value explanation of the estimators output for every treatment and outcome pair. These values can then be visualized with the plethora of visualizations that the SHAP library offers. Moreover, whenever possible our library invokes fast specialized algorithms from the SHAP library, for each type of final model, which can greatly reduce computation times.\n",
"\n",
