Update readme
daehwankim12 committed Oct 18, 2024
1 parent ccc159b commit bc89a39
Showing 6 changed files with 117 additions and 85 deletions.
31 changes: 29 additions & 2 deletions R/mediation_analysis.R
@@ -9,9 +9,36 @@
#' @param num_threads An integer specifying the number of threads to use for parallel processing. Default is the number of available cores detected by `parallel::detectCores()`.
#' @return None. The results are written to the specified `output_file` in CSV format.
#'
#' @details
#' ... (rest of the documentation remains the same)
#' @details This function estimates the following effects:
#' * **Indirect Effect (t1):** The effect of the exposure on the outcome transmitted through the mediator, evaluated at t=1 (typically the exposure's presence or a higher level).
#' * **Indirect Effect (t0):** The same mediated effect, evaluated at t=0 (typically the exposure's absence or a lower level).
#' * **Direct Effect (t1):** The effect of the exposure on the outcome that does not pass through the mediator, evaluated at t=1.
#' * **Direct Effect (t0):** The same non-mediated effect, evaluated at t=0.
#' * **Total Effect:** The overall effect of the exposure on the outcome, combining the direct and indirect pathways.
#'
#' The output CSV file reports the estimate, standard error, 95% confidence interval (LCB, UCB), and p-value of each effect for every combination of exposure, mediator, and outcome variables.
#'
#' @examples
#' \dontrun{
#' library(data.table)
#'
#' # Example data (replace with your own data)
#' my_data <- data.table(
#'   Exposure_A = rnorm(1000),
#'   Exposure_B = rnorm(1000),
#'   Mediator_X = rnorm(1000),
#'   Mediator_Y = rnorm(1000),
#'   Outcome_1 = rnorm(1000),
#'   Outcome_2 = rnorm(1000)
#' )
#'
#' # Perform mediation analysis; columns are identified by their prefixes
#' mediation_analysis(
#'   data = my_data,
#'   columns = list(exposure = "Exposure", mediator = "Mediator", outcome = "Outcome"),
#'   nrep = 500, # Reduced for example speed
#'   output_file = "mediation_results.csv"
#' )
#' }
#'
#' @import data.table Rcpp
#' @importFrom Rcpp evalCpp
#' @importFrom RcppParallel setThreadOptions
137 changes: 56 additions & 81 deletions README.md
@@ -8,44 +8,45 @@

## Features

- **Efficient Mediation Analysis:** Conduct mediation analysis with multiple exposure, mediator, and outcome variables.
- **Parallel Processing:** Utilize multiple CPU cores to accelerate computations.
- **Bootstrap Resampling:** Estimate direct, indirect, and total effects with confidence intervals and p-values.
- **Customizable Column Prefixes:** Easily specify prefixes to identify exposure, mediator, and outcome variables.
- **Scalable for Large Datasets:** Handle large-scale data efficiently by processing in chunks.
- **Real-time CSV Output:** Save results directly to CSV files during analysis.
- **Efficient Mediation Analysis:** Conduct mediation analysis with multiple exposure, mediator, and outcome variables.
- **Parallel Processing:** Utilize multiple CPU cores to accelerate computations.
- **Bootstrap Resampling:** Estimate direct, indirect, and total effects with confidence intervals and p-values.
- **Customizable Column Prefixes:** Easily specify prefixes to identify exposure, mediator, and outcome variables.
- **Scalable for Large Datasets:** Handle large-scale data efficiently by processing in chunks.
- **Real-time CSV Output:** Save results directly to CSV files during analysis.
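
To make the bootstrap feature concrete, the following is a minimal base-R sketch of a percentile bootstrap for a single indirect effect, using the product-of-coefficients approach on simulated data. It illustrates the general technique only and is not fastmed's internal implementation; the variable names, the simulated effect sizes, and the helper `indirect_effect()` are assumptions made for this example.

```r
set.seed(1)

# Simulated data (assumption): true indirect effect is 0.5 * 0.4 = 0.2
n <- 500
exposure <- rnorm(n)
mediator <- 0.5 * exposure + rnorm(n)
outcome <- 0.4 * mediator + 0.2 * exposure + rnorm(n)
dat <- data.frame(exposure, mediator, outcome)

# Hypothetical helper: product of the exposure -> mediator slope (a)
# and the mediator -> outcome slope adjusted for exposure (b)
indirect_effect <- function(d) {
  a <- coef(lm(mediator ~ exposure, data = d))[["exposure"]]
  b <- coef(lm(outcome ~ mediator + exposure, data = d))[["mediator"]]
  a * b
}

# Resample rows with replacement and re-estimate the effect each time
nrep <- 200
boot <- replicate(nrep, indirect_effect(dat[sample(nrow(dat), replace = TRUE), ]))

estimate <- indirect_effect(dat)
std_err <- sd(boot)
ci <- quantile(boot, c(0.025, 0.975)) # percentile 95% confidence interval

print(c(estimate = estimate, std_err = std_err, lcb = ci[[1]], ucb = ci[[2]]))
```

Increasing `nrep` narrows the Monte Carlo error of the interval at the cost of run time, which is the trade-off the `nrep` argument of `mediation_analysis` exposes.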

## Installation

You can install the development version of **fastmed** from GitHub using the `devtools` package:

``` r
```r
# Install devtools if not already installed
install.packages("devtools")
if (!requireNamespace("devtools", quietly = TRUE)) {
  install.packages("devtools")
}

# Install fastmed from GitHub
devtools::install_github("daehwankim12/fastmed")
```

## Usage

Below is an example of how to use the `mediation_analysis` function in fastmed.
Here's a detailed example of how to use the `mediation_analysis` function in fastmed:

### Example

``` r
```r
library(fastmed)
library(data.table)

# Generate example data
set.seed(123)
n <- 1000
my_data <- data.table(
  Exposure_A1 = rnorm(1000),
  Exposure_A2 = rnorm(1000),
  Mediator_X1 = rnorm(1000),
  Mediator_X2 = rnorm(1000),
  Outcome_Y1 = rnorm(1000),
  Outcome_Y2 = rnorm(1000)
  Exposure_A1 = rnorm(n),
  Exposure_A2 = rnorm(n),
  Mediator_X1 = rnorm(n),
  Mediator_X2 = rnorm(n),
  Outcome_Y1 = rnorm(n),
  Outcome_Y2 = rnorm(n)
)

# Define column prefixes
@@ -74,78 +75,52 @@ print(results)

### Parameters

- `data`: A data.table or data.frame containing the dataset.
- `columns`: A list with three named elements: exposure, mediator, and outcome. Each should be a character string specifying the prefix of the respective columns.
- `nrep`: (Optional) Number of bootstrap replicates. Default is 1000.
- `output_file`: Path to the output CSV file where results will be saved.
- `num_threads`: (Optional) Number of threads for parallel processing. Defaults to the number of available cores.
- `data`: A data.table or data.frame containing the dataset.
- `columns`: A list with three named elements: exposure, mediator, and outcome. Each should be a character string specifying the prefix of the respective columns.
- `nrep`: (Optional) Number of bootstrap replicates. Default is 1000.
- `output_file`: Path to the output CSV file where results will be saved.
- `num_threads`: (Optional) Number of threads for parallel processing. Defaults to the number of available cores.
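
As an illustration of how the prefixes in `columns` can select variables, the sketch below uses base R to find the columns that start with each prefix and to enumerate every exposure, mediator, and outcome combination. How fastmed matches prefixes internally is not shown here, so the `startsWith()` matching, the extra `Age` column, and the helper name `columns_with_prefix()` are assumptions for this example.

```r
# Column names as they might appear in the input data (Age is an unrelated extra)
col_names <- c(
  "Exposure_A1", "Exposure_A2",
  "Mediator_X1", "Mediator_X2",
  "Outcome_Y1", "Outcome_Y2",
  "Age"
)

columns <- list(exposure = "Exposure", mediator = "Mediator", outcome = "Outcome")

# Hypothetical helper: keep the names that begin with the given prefix
columns_with_prefix <- function(all_names, prefix) {
  all_names[startsWith(all_names, prefix)]
}

matched <- lapply(columns, columns_with_prefix, all_names = col_names)

# One row per exposure-mediator-outcome triple
combinations <- expand.grid(matched, stringsAsFactors = FALSE)
ids <- paste(combinations$exposure, combinations$mediator, combinations$outcome, sep = "_")

print(nrow(combinations))
print(head(ids, 3))
```

With two columns per role this gives 2 x 2 x 2 = 8 combinations; the count grows multiplicatively, which is why run time rises quickly as more columns share a prefix.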

### Output

The output CSV file (`output_csv` in the example) will contain the following columns for each combination of exposure, mediator, and outcome variables:

- `combination`: Identifier for the combination (e.g., Exposure_A1_Mediator_X1_Outcome_Y1)
- `indirect_t1_estimate`: Indirect effect estimate at t=1
- `indirect_t1_std_err`: Standard error of the indirect effect at t=1
- `indirect_t1_lcb`: Lower confidence bound for the indirect effect at t=1
- `indirect_t1_ucb`: Upper confidence bound for the indirect effect at t=1
- `indirect_t1_p_value`: P-value for the indirect effect at t=1
- `indirect_t0_estimate`: Indirect effect estimate at t=0
- `indirect_t0_std_err`: Standard error of the indirect effect at t=0
- `indirect_t0_lcb`: Lower confidence bound for the indirect effect at t=0
- `indirect_t0_ucb`: Upper confidence bound for the indirect effect at t=0
- `indirect_t0_p_value`: P-value for the indirect effect at t=0
- `direct_t1_estimate`: Direct effect estimate at t=1
- `direct_t1_std_err`: Standard error of the direct effect at t=1
- `direct_t1_lcb`: Lower confidence bound for the direct effect at t=1
- `direct_t1_ucb`: Upper confidence bound for the direct effect at t=1
- `direct_t1_p_value`: P-value for the direct effect at t=1
- `direct_t0_estimate`: Direct effect estimate at t=0
- `direct_t0_std_err`: Standard error of the direct effect at t=0
- `direct_t0_lcb`: Lower confidence bound for the direct effect at t=0
- `direct_t0_ucb`: Upper confidence bound for the direct effect at t=0
- `direct_t0_p_value`: P-value for the direct effect at t=0
- `total_effect_estimate`: Total effect estimate
- `total_effect_std_err`: Standard error of the total effect
- `total_effect_lcb`: Lower confidence bound for the total effect
- `total_effect_ucb`: Upper confidence bound for the total effect
- `total_effect_p_value`: P-value for the total effect

## Development

If you wish to contribute to fastmed, please follow these guidelines:

1. Fork the repository on GitHub.
2. Create a new branch for your feature or bugfix.
3. Commit your changes with clear messages.
4. Push your branch to your forked repository.
5. Submit a pull request detailing your changes.

## Testing

fastmed includes a suite of tests to ensure functionality. To run the tests, use the following commands:

``` r
library(devtools)
library(testthat)

# Navigate to the package directory
setwd("path/to/fastmed")

# Run tests
devtools::test()
```
The output CSV file will contain detailed results for each combination of exposure, mediator, and outcome variables, including estimates, standard errors, confidence intervals, and p-values for indirect, direct, and total effects.
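
Once the analysis has finished, the CSV can be read back for filtering. The sketch below assumes the output uses column names such as `combination`, `indirect_t1_estimate`, and `indirect_t1_p_value`; so that it runs on its own, it first writes a small mock file whose numbers are made up and stand in for real results.

```r
# Mock results with invented values (assumption: real output has these columns)
mock_results <- data.frame(
  combination = c(
    "Exposure_A1_Mediator_X1_Outcome_Y1",
    "Exposure_A2_Mediator_X1_Outcome_Y1"
  ),
  indirect_t1_estimate = c(0.21, 0.01),
  indirect_t1_p_value = c(0.002, 0.64)
)
output_file <- tempfile(fileext = ".csv")
write.csv(mock_results, output_file, row.names = FALSE)

# Read the results and keep combinations with a small indirect-effect p-value
results <- read.csv(output_file, stringsAsFactors = FALSE)
significant <- results[results$indirect_t1_p_value < 0.05, ]
significant <- significant[order(significant$indirect_t1_p_value), ]

print(significant[, c("combination", "indirect_t1_estimate", "indirect_t1_p_value")])
```

When many combinations are tested at once, a multiple-testing adjustment such as `p.adjust()` is usually worth applying before interpreting the filtered rows.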

## Performance Considerations

- The package is optimized for parallel processing. Increase `num_threads` to utilize more CPU cores.
- For very large datasets, consider splitting the analysis into smaller chunks and combining the results.
- Monitor memory usage, especially when increasing `nrep` for bootstrap resampling.
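
One way to act on these points is to leave a core free when choosing a thread count and, for chunked runs, to give each chunk its own output file and merge the files afterwards. The sketch below shows only those two steps in base R; the directory, file names, and mock contents are invented for illustration, and in practice each file would come from a separate `mediation_analysis()` call on a subset of columns.

```r
# Pick a thread count that leaves one core free; detectCores() may return NA
cores <- parallel::detectCores()
num_threads <- if (is.na(cores)) 1L else max(1L, cores - 1L)

# Write three mock per-chunk result files (stand-ins for real chunk outputs)
chunk_dir <- file.path(tempdir(), "fastmed_chunks")
dir.create(chunk_dir, showWarnings = FALSE)
for (i in 1:3) {
  chunk <- data.frame(
    combination = sprintf("Exposure_A%d_Mediator_X1_Outcome_Y1", i),
    total_effect_estimate = i / 10
  )
  write.csv(chunk, file.path(chunk_dir, sprintf("results_chunk_%d.csv", i)), row.names = FALSE)
}

# Merge step: read every chunk file and stack the rows
chunk_files <- list.files(chunk_dir, pattern = "^results_chunk_[0-9]+\\.csv$", full.names = TRUE)
combined <- do.call(rbind, lapply(chunk_files, read.csv, stringsAsFactors = FALSE))

print(num_threads)
print(combined)
```

Stacking with `rbind` relies on every chunk file having the same columns, which holds when all chunks are produced by the same function and version.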

## Troubleshooting

If you encounter issues:

1. Ensure you have the latest version of fastmed installed.
2. Check that all dependencies are up to date.
3. For performance issues, try adjusting `num_threads` or reducing `nrep`.
4. If you encounter a bug, please [open an issue](https://github.com/daehwankim12/fastmed/issues) with a reproducible example.

## License

This project is licensed under the MIT License. See the LICENSE file for details.
This project is licensed under the MIT License. See the [LICENSE](LICENSE) file for details.

## Acknowledgements

- Rcpp
- RcppParallel
- data.table
fastmed builds upon several powerful R packages:

- [Rcpp](https://www.rcpp.org/) for C++ integration
- [RcppParallel](https://rcppcore.github.io/RcppParallel/) for parallel processing
- [data.table](https://rdatatable.gitlab.io/data.table/) for efficient data manipulation

## Citation

If you use fastmed in your research, please cite it as follows:

```
Kim, D. (2024). fastmed: Fast Mediation Analysis in R. R package version 0.1.0.
https://github.com/daehwankim12/fastmed
```

## Contact

For any questions or suggestions, please open an issue on the GitHub repository.
For questions, suggestions, or collaborations, please [open an issue](https://github.com/daehwankim12/fastmed/issues) on the GitHub repository or contact the package maintainer at [email protected].
31 changes: 30 additions & 1 deletion man/mediation_analysis.Rd


Binary file modified src/fastmed.so
Binary file not shown.
3 changes: 2 additions & 1 deletion src/mediation_analysis.cpp
@@ -127,10 +127,11 @@ class MediationWorker : public Worker {
private:
const MatrixXd& data;
const std::vector<std::string>& column_names;
const int nrep;
const std::vector<std::string>& exposure_vec;
const std::vector<std::string>& mediator_vec;
const std::vector<std::string>& outcome_vec;
const int nrep;

std::unordered_map<std::string, int> column_index_map;
BufferedFileWriter& writer;

Binary file modified src/mediation_analysis.o
Binary file not shown.
