CausalPy

A Python package for causal inference in quasi-experimental settings. Models can be fitted with Bayesian methods as well as with traditional ordinary least squares (OLS).

Installation

To get the latest release:

pip install CausalPy

Alternatively, if you want the very latest version of the package you can install from GitHub:

pip install git+https://github.com/pymc-labs/CausalPy.git

Quickstart

import causalpy as cp
import matplotlib.pyplot as plt

# Import and process data
df = (
    cp.load_data("drinking")
    .rename(columns={"agecell": "age"})
    .assign(treated=lambda df_: df_.age > 21)
    )

# Run the analysis
result = cp.RegressionDiscontinuity(
    df,
    formula="all ~ 1 + age + treated",
    running_variable_name="age",
    model=cp.pymc_models.LinearRegression(),
    treatment_threshold=21,
    )

# Visualize outputs
fig, ax = result.plot()

# Get a results summary
result.summary()

plt.show()
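To make the formula in the quickstart more concrete, here is a minimal sketch of what a model of the form `outcome ~ 1 + age + treated` estimates: an intercept, a slope on the running variable, and a step at the threshold. It is not CausalPy code. It uses plain NumPy with an ordinary least squares fit on synthetic data, and every name and number in it (`true_jump`, `outcome`, the noise level) is an assumption made up for illustration rather than taken from the `drinking` dataset.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic data (assumed for illustration, not the "drinking" dataset)
threshold = 21.0
true_jump = 8.0
age = np.linspace(19.0, 23.0, 200)
treated = (age > threshold).astype(float)
outcome = 90.0 - 1.5 * age + true_jump * treated + rng.normal(0.0, 1.0, age.size)

# Design matrix with one column per term in "outcome ~ 1 + age + treated"
X = np.column_stack([np.ones_like(age), age, treated])

# Ordinary least squares fit of the three coefficients
coef, *_ = np.linalg.lstsq(X, outcome, rcond=None)
intercept, slope, jump = coef

# The coefficient on `treated` is the estimated discontinuity at the threshold
print(f"estimated discontinuity at age {threshold:g}: {jump:.2f}")
```

The coefficient on `treated` plays the role of the causal effect estimate in a regression discontinuity design. A Bayesian fit, as in the quickstart, would give a posterior distribution for that quantity instead of a single point estimate.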

Roadmap

Plans for the repository can be seen in the Issues.

Videos

Click on the thumbnail below to watch a video about CausalPy on YouTube.

Features

CausalPy has a broad range of quasi-experimental methods for causal inference:

Method	Description
Synthetic control	Constructs a synthetic version of the treated unit from a weighted combination of control units. Used for causal inference in comparative case studies when a single unit is treated and there are multiple control units.
Geographical lift	Measures the impact of an intervention in a specific geographic area by comparing it to similar areas without the intervention. Commonly used in marketing to assess regional campaigns.
ANCOVA	Analysis of Covariance combines ANOVA and regression to control for the effects of one or more quantitative covariates. Used when comparing group means while controlling for other variables.
Difference in differences	Compares the changes in outcomes over time between a treatment group and a control group. Used in observational studies to estimate causal effects by accounting for time trends.
Regression discontinuity	Identifies causal effects by exploiting a cutoff or threshold in an assignment variable. Used when treatment is assigned based on a threshold value of an observed variable, allowing comparison just above and below the cutoff.
Regression kink designs	Focuses on changes in the slope (kinks) of the relationship between variables rather than jumps at cutoff points. Used to identify causal effects when treatment intensity changes at a threshold.
Interrupted time series	Analyzes the effect of an intervention by comparing time series data before and after the intervention. Used when data is collected over time and an intervention occurs at a known point, allowing assessment of changes in level or trend.
Instrumental variable regression	Addresses endogeneity by using an instrumental variable that is correlated with the endogenous explanatory variable but uncorrelated with the error term. Used when explanatory variables are correlated with the error term, providing consistent estimates of causal effects.
Inverse Propensity Score Weighting	Weights observations by the inverse of the probability of receiving the treatment. Used in causal inference to create a synthetic sample where the treatment assignment is independent of measured covariates, helping to adjust for confounding variables in observational studies.
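As an illustration of one row in the table, the difference in differences idea can be sketched in its simplest two-group, two-period form: the change in the treatment group minus the change in the control group. This is plain Python and not CausalPy's API, and the outcome values below are hypothetical numbers made up for the example.

```python
from statistics import mean

# Hypothetical outcomes, made up for illustration
control_pre = [10.0, 11.0, 9.0, 10.0]
control_post = [12.0, 13.0, 11.0, 12.0]
treated_pre = [14.0, 15.0, 13.0, 14.0]
treated_post = [19.0, 20.0, 18.0, 19.0]

# Change over time within each group
control_change = mean(control_post) - mean(control_pre)
treated_change = mean(treated_post) - mean(treated_pre)

# The control group's change stands in for the time trend the treated
# group would have followed without treatment (parallel trends assumption)
did_estimate = treated_change - control_change

print(f"control change: {control_change}")
print(f"treated change: {treated_change}")
print(f"difference in differences estimate: {did_estimate}")
```

The estimate is only interpretable as a causal effect if the two groups would have followed parallel trends in the absence of treatment.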

Learning resources

Here are some general resources about causal inference:

The official PyMC examples gallery has a set of examples specifically relating to causal inference.
Angrist, J. D., & Pischke, J. S. (2009). Mostly harmless econometrics: An empiricist's companion. Princeton University Press.
Angrist, J. D., & Pischke, J. S. (2014). Mastering 'metrics: The path from cause to effect. Princeton University Press.
Cunningham, S. (2021). Causal inference: The Mixtape. Yale University Press.
Huntington-Klein, N. (2021). The effect: An introduction to research design and causality. Chapman and Hall/CRC.
Reichardt, C. S. (2019). Quasi-experimentation: A guide to design and analysis. Guilford Publications.

License

Apache License 2.0

Support

This repository is supported by PyMC Labs.

If you are interested in seeing what PyMC Labs can do for you, then please email [email protected]. We work with companies at a variety of scales and with varying levels of existing modeling capacity. We also run corporate workshop training events and can provide sessions ranging from introduction to Bayes to more advanced topics.
