Question about Support for Survival/Time-to-Event Data #1285
Comments
Can you provide a motivating example or dataset on which you'd like to run DoWhy? Supporting new kinds of data is significant work, so we can approach this step by step: first, let's understand a popular, high-impact scenario where we can extend DoWhy, and then later we can support survival analysis fully.
Survival data typically comprises two key components: time (the duration from the start of the observation period to an event, the end of the study, loss of contact, or withdrawal) and status (indicating whether the event occurred or the observation was censored). I've found several popular datasets on Kaggle. Specifically:
Additionally, I've found a helpful introduction to survival analysis on the wiki, which provides a solid starting point for understanding this topic. Thank you for your attention to this matter.😊
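To make the time/status structure concrete, here is a minimal sketch in plain Python. The toy values and the `kaplan_meier` helper are illustrative assumptions, not part of DoWhy or of any dataset mentioned in this thread. It uses the standard Kaplan-Meier product-limit estimator, which is one common way to use censored observations without treating them as events.

```python
from typing import List, Tuple


def kaplan_meier(times: List[float], events: List[int]) -> List[Tuple[float, float]]:
    """Return (t, S(t)) at each distinct time where at least one event occurred.

    times  : observed duration for each subject
    events : 1 if the event occurred at that time, 0 if the subject was censored
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    surv = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Group all subjects that share the same observed time.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths > 0:
            # Product-limit update: multiply by the share surviving past t.
            surv *= 1.0 - deaths / n_at_risk
            curve.append((t, surv))
        # Events and censored subjects both leave the risk set after t.
        n_at_risk -= removed
    return curve


# Hypothetical toy data: 7 subjects followed for at most 30 days.
times = [5, 8, 12, 12, 20, 30, 30]
events = [1, 0, 1, 1, 0, 0, 0]  # 1 = event observed, 0 = censored

curve = kaplan_meier(times, events)
for t, s in curve:
    print(f"t={t:>2}  S(t)={s:.3f}")
```

Censored subjects still contribute to the risk set up to their censoring time, which is the information a plain binary or continuous outcome column would discard.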
Adding an additional request here for this functionality. I do understand this would be a significant amount of work, but I agree that it would be extremely useful for many applications (e.g., medical). For example, the outcome of interest is often 30-day mortality after treatment. Patients who died after 30 days or never died are "right-censored", and to understand the effect of treatment or covariates on 30-day mortality, the survival time of right-censored patients is imputed as 30 days. However, without a method that accounts for right-censoring, imputing survival time as 30 days would bias the treatment effect estimate.
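A small simulation can illustrate the concern about imputing 30 days for right-censored patients. This is only a sketch under stated assumptions: survival times are drawn from exponential distributions with made-up means (20 days for control, 40 days for treated), treatment is randomized, and the effect is measured as a difference in mean survival time. None of these numbers come from real data.

```python
import random

HORIZON = 30.0  # follow-up ends at 30 days (administrative right-censoring)


def simulate(n, mean_time, rng):
    """Return (observed_time, event, latent_time) tuples for n subjects."""
    rows = []
    for _ in range(n):
        latent = rng.expovariate(1.0 / mean_time)  # true time to event
        observed = min(latent, HORIZON)            # censored subjects get 30
        event = int(latent <= HORIZON)             # 0 = right-censored
        rows.append((observed, event, latent))
    return rows


def mean(values):
    return sum(values) / len(values)


rng = random.Random(0)
control = simulate(5000, 20.0, rng)  # hypothetical mean survival: 20 days
treated = simulate(5000, 40.0, rng)  # hypothetical mean survival: 40 days

# Effect on the latent (fully observed) survival times.
true_effect = mean([r[2] for r in treated]) - mean([r[2] for r in control])

# Naive effect: treat the imputed 30-day values as if they were exact times.
naive_effect = mean([r[0] for r in treated]) - mean([r[0] for r in control])

censored_share = 1.0 - mean([r[1] for r in treated])

print(f"true effect  : {true_effect:.1f} days")
print(f"naive effect : {naive_effect:.1f} days")
print(f"treated censored: {censored_share:.0%}")
```

Under these assumptions the naive difference is noticeably smaller than the true one, because the treated group has more subjects whose times are truncated at 30 days. An estimator that accounts for censoring is meant to avoid exactly this distortion.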
Hey folks, I wanted to add a link to this discussion of survival analysis in the Discord: https://discord.com/channels/818456847551168542/818456856137170996/1221611463823720588 Notably, Paidamoyo Chapfuwa has published a counterfactual survival analysis notebook that could be integrated into PyWhy and extended with its identification algorithms and/or CATE estimators, etc. She was looking for someone who might push the integration forward. Would make a "good first project" for a person interested in getting more involved. https://github.com/paidamoyo/counterfactual_survival_analysis
I am writing to express my appreciation for the excellent work on the package, which has greatly facilitated causal inference in Python. As a user of the package, I have been able to successfully apply it to various datasets and problems.
However, I was wondering if it would be possible to extend DoWhy's capabilities to support survival or time-to-event data? Currently, the package appears to focus on traditional outcomes such as binary, continuous, or count responses. Time-to-event data is a common outcome type in many fields (e.g., medicine, economics, sociology), and I believe that supporting this would greatly enhance the utility of DoWhy.
I understand that adding new features can be a significant undertaking, but I was hoping to get some insight into whether there are any plans to support survival analysis or if you could recommend alternative packages or methods for causal inference with time-to-event data. Any advice or resources you could share would be greatly appreciated.
Thank you again for your hard work on the package.