Add Bayesian instrumental variable estimation #213
Conversation
Cool! 🚀
…ated MLE treatment effects Signed-off-by: Nathaniel <[email protected]>
Ok @drbenvincent I think this is starting to look like something concrete. I've added a new PyMC model type and a PyMC experiment type to run the Bayesian version of the IV regression discussed in @juanitorduz 's blog post. The docs build, and I've added an example and one integration test. In the example I've compared the Bayesian estimate with the two-stage least squares approach estimated with sklearn's LinearRegression class. I've also chosen to allow priors to be specified for the coefficients, defaulting to the MLE estimates where none are specified. Before I add more to the documentation and the write-up, plus more data validation tests and the like, I wanted to check that this approach makes sense and fits the broad patterns you already have here.
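For anyone following along, the two-stage least squares comparison with sklearn can be sketched roughly like this (the simulated data and coefficient values are illustrative only, not the notebook's actual example):

```python
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(42)
n = 5000

# Simulated IV setup (hypothetical): z is the instrument, u is an
# unobserved confounder, t is the endogenous treatment, y is the outcome.
z = rng.normal(size=n)
u = rng.normal(size=n)
t = 0.8 * z + 0.5 * u + rng.normal(scale=0.5, size=n)
true_effect = 2.0
y = true_effect * t + 0.5 * u + rng.normal(scale=0.5, size=n)

# Stage 1: regress the treatment on the instrument.
stage1 = LinearRegression().fit(z.reshape(-1, 1), t)
t_hat = stage1.predict(z.reshape(-1, 1))

# Stage 2: regress the outcome on the fitted treatment values.
stage2 = LinearRegression().fit(t_hat.reshape(-1, 1), y)
print(stage2.coef_[0])  # close to the true effect of 2.0
```

Naive OLS of y on t would be biased upwards here because of the confounder u; the two-stage procedure recovers the true effect through the instrument.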
causalpy/pymc_models.py
Outdated
sd_dist = pm.HalfCauchy.dist(beta=2, shape=2)
chol, corr, sigmas = pm.LKJCholeskyCov(
    name="chol_cov", eta=2, n=2, sd_dist=sd_dist
Do you want to make these prior parameters also customizable through the coords mapping?
Just want to clear up what you mean here: should I let the coord names and values be changeable, or the prior values for the LKJ variable, or both?
@pytest.mark.integration
def test_iv_reg():
Can we also check the expected effect estimation to test the implementation itself?
It would also be nice to parametrize this test with different priors.
Definitely intend to add more tests of this type. Just added one to ensure I could run the testing code.
FYI I've added a parameter recovery example to the accompanying notebook. I just wasn't sure we wanted any long-running tests. It can take a little time to sample the multivariate distribution...
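The parametrization being suggested might look something like the sketch below. The `fit_iv` helper is a hypothetical stand-in that returns a Wald/IV-ratio estimate from simulated data; the real test would fit the actual CausalPy model with the given priors.

```python
import numpy as np
import pytest


def fit_iv(priors, rng):
    # Placeholder for fitting the Bayesian IV model: simulate data and
    # return the Wald (IV ratio) estimate of the treatment effect.
    n = 2000
    z = rng.normal(size=n)
    u = rng.normal(size=n)
    t = 0.8 * z + 0.5 * u + rng.normal(scale=0.5, size=n)
    y = 2.0 * t + 0.5 * u + rng.normal(scale=0.5, size=n)
    beta_zt = np.cov(z, t)[0, 1] / np.var(z)
    beta_zy = np.cov(z, y)[0, 1] / np.var(z)
    return beta_zy / beta_zt


@pytest.mark.integration
@pytest.mark.parametrize("priors", [{"eta": 2}, {"eta": 4}])
def test_iv_reg(priors):
    # Check the expected effect estimate under each prior setting.
    estimate = fit_iv(priors, np.random.default_rng(0))
    assert abs(estimate - 2.0) < 0.25
```

If sampling time is the concern, a coarse tolerance plus a small number of draws usually keeps a parameter-recovery test fast enough for CI.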
docs/source/notebooks/iv_pymc.ipynb
Outdated
@@ -0,0 +1,4160 @@
{ |
Probably a good idea. I was experimenting with SVG at one point to try to keep file sizes down, but I have a vague memory that SVG had some rendering issues on either GitHub or readthedocs.
docs/source/notebooks/iv_pymc.ipynb
Outdated
@@ -0,0 +1,4160 @@
{ |
Line #6: `simple_ols_reg = sk_lin_reg().fit(X, y)`
Not relevant for this PR, but I think it would be nicer to test the Bayesian estimates against statsmodels so that we get the usual stats like p-values and confidence intervals. WDYT?
I have an ambition of adding statsmodels as a CausalPy backend, but I think you meant just comparing against a simple statsmodels output for this example?
Yeah, I wasn't sure of its status as a dependency and realised I could use sklearn to do 2SLS, so I didn't have to risk adding any more dependencies.
Let's create a discussion or issue to discuss this. As I mentioned initially, this is irrelevant to the PR 🙈 !
@NathanielF It is looking good! I left some minor comments :)
Codecov Report
@@ Coverage Diff @@
## main #213 +/- ##
==========================================
+ Coverage 71.26% 73.86% +2.60%
==========================================
Files 19 19
Lines 1044 1148 +104
==========================================
+ Hits 744 848 +104
Misses 300 300
Nice! Some quick thoughts:
I should have time to take a proper look at the code and example notebook in the next few days.
docs/source/notebooks/iv_pymc.ipynb
Outdated
@@ -0,0 +1,4154 @@
{ |
Worth running black on this. It should be covered by the regular pre-commit, I believe.
Yeah, this runs at each pre-commit check
This is all looking very promising. So far I've just done a quick review of the example notebook; will try to look at the code changes soon. You'll need to update from main, as there have been quite a few updates :)
Oh, in a previous comment I requested updates to the glossary. Bear in mind that we've switched to a 'proper' Sphinx glossary, so once you've updated from main there should be a glossary.rst file.
Remove the over-cautious text on the code being alpha.
Added some comments.
The example notebook is great. Depending on what we do about adding separate non-Bayesian classes for IVs, we could keep just this one notebook, as it already compares Bayesian and non-Bayesian approaches. So we could rename the notebook to be more general, removing the reference to PyMC.
I think I've addressed the above points, but let me know if anything is missing.
Can you add axis labels for the final plot in the notebook?
I think after this we'll merge :)
causalpy/pymc_experiments.py
Outdated
        as an outcome variable and in the data object to be used as a covariate.
        """
    )
check_binary = len(self.data[treatment.strip()].unique()) > 2
I think this check might need to be more robust because we're seeing the exception raised in the example notebook where presumably there is no problem.
It could be worth adding a test to check that we aren't getting false positives or false negatives with this check.
This check is actually working fine. It just so happens that the treatment variable risk in the model is a score between 0-10, so the treatment effect here should be thought of as a dosage treatment rather than a binary treatment. I've added a line to the write-up mentioning this.
Ah ok. So I misunderstood before - I didn't read that closely and assumed it had to be binary. If that's not the case, then I guess we can remove that exception? Sorry about that.
EDIT: For some reason it wasn't showing changes to me. I see that you changed the wording of the exception. Happy whichever way you want to do it.
Let's leave it in. I think many of the use-cases will be binary treatments, and it's no harm to be reminded to take care interpreting coefficients. We can revisit when I implement the sklearn/statsmodels IV class if you like.
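The behaviour being discussed can be demonstrated in isolation. Below is a hedged sketch of the check as a standalone function (the helper name is hypothetical; the real code lives inside the experiment class): a binary 0/1 treatment passes, while a 0-10 dosage-style score like `risk` correctly trips the warning condition.

```python
import pandas as pd

# Sketch of the binary-treatment check: a treatment column with more
# than two distinct values is treated as a dosage, not a binary flag,
# which changes how the coefficient should be interpreted.
def treatment_is_binary(data: pd.DataFrame, treatment: str) -> bool:
    return data[treatment.strip()].nunique() <= 2


binary = pd.DataFrame({"t": [0, 1, 1, 0]})
dosage = pd.DataFrame({"risk": [0, 3, 7, 10, 5]})

print(treatment_is_binary(binary, "t"))      # True
print(treatment_is_binary(dosage, "risk "))  # False: dosage-style score
```

A couple of small unit tests along these lines would guard against the false positives/negatives mentioned above.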
Depending on the previous comment about the exception, this looks ready. Ok for me to merge?
Happy for you to merge! Thanks!
Working through the repo to figure out how best to integrate the multivariate approach to Bayesian instrumental regression as seen in @juanitorduz 's blog. This is very early stages; I just want to test the basic idea (seems to work!) and get to know the repo.
In addition, I'm happy to take feedback and have a conversation about choices. Will be experimenting a bit this weekend.
Relates to Add functionality for Bayesian Instrumental Variables #212