
MEDS-torch: Advanced Machine Learning for Electronic Health Records


🚀 Quick Start

Installation

pip install meds-torch

Set up environment variables

# Define data paths
PATHS_KWARGS="paths.data_dir=/CACHED/NESTED/RAGGED/TENSORS/DIR paths.meds_cohort_dir=/PATH/TO/MEDS/DATA/ paths.output_dir=/OUTPUT/RESULTS/DIRECTORY"

# Define task parameters (for supervised learning)
TASK_KWARGS="data.task_name=NAME_OF_TASK data.task_root_dir=/PATH/TO/TASK/LABELS/"
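For concreteness, a filled-in version might look like the following (every path and the task name below are hypothetical placeholders, not values shipped with the package):

# Hypothetical example values -- substitute your own directories and task name
PATHS_KWARGS="paths.data_dir=/data/mimiciv/tensors paths.meds_cohort_dir=/data/mimiciv/meds/ paths.output_dir=/results/mimiciv/"
TASK_KWARGS="data.task_name=mortality_24h data.task_root_dir=/data/mimiciv/tasks/"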

Basic Usage

  1. Train a supervised model (GPU)
     meds-torch-train trainer=gpu $PATHS_KWARGS $TASK_KWARGS
  2. Pretrain an autoregressive forecasting model (GPU)
     meds-torch-train trainer=gpu $PATHS_KWARGS model=eic_forecasting
  3. Train with a specific experiment configuration
     meds-torch-train experiment=experiment.yaml $PATHS_KWARGS $TASK_KWARGS hydra.searchpath=[pkg://meds_torch.configs,/PATH/TO/CUSTOM/CONFIGS]
  4. Override parameters
     meds-torch-train trainer.max_epochs=20 data.batch_size=64 $PATHS_KWARGS $TASK_KWARGS
  5. Hyperparameter search
     meds-torch-tune trainer=ray callbacks=tune_default hparams_search=ray_tune experiment=triplet_mtr $PATHS_KWARGS $TASK_KWARGS hydra.searchpath=[pkg://meds_torch.configs,/PATH/TO/CUSTOM/CONFIGS/WITH/experiment/triplet_mtr]

Advanced Examples

For detailed examples and tutorials:

  • Check MIMICIV_INDUCTIVE_EXPERIMENTS/README.md for a comprehensive guide to using MEDS-torch with MIMIC-IV data, including data preparation, task extraction, and running experiments with different tokenization and transfer learning methods.
  • See ZERO_SHOT_TUTORIAL/README.md for a work-in-progress walkthrough of zero-shot prediction (feedback on improving it is very welcome! 🙂)

Example Experiment Configuration

Here's a sample experiment.yaml:

# @package _global_

defaults:
  - override /data: pytorch_dataset
  - override /logger: wandb
  - override /model/backbone: triplet_transformer_encoder
  - override /model/input_encoder: triplet_encoder
  - override /model: supervised
  - override /trainer: gpu

tags: [mimiciv, triplet, transformer_encoder]

seed: 0

trainer:
  min_epochs: 1
  max_epochs: 10
  gradient_clip_val: 1.0

data:
  dataloader:
    batch_size: 64
    num_workers: 6
  max_seq_len: 128
  collate_type: triplet
  subsequence_sampling_strategy: to_end

model:
  token_dim: 128
  optimizer:
    lr: 0.001
  backbone:
    n_layers: 2
    nheads: 4
    dropout: 0

logger:
  wandb:
    tags: ${tags}
    group: mimiciv_tokenization

This configuration sets up a supervised learning experiment using a triplet transformer encoder on MIMIC-IV data. Modify this file to suit your specific needs.
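To launch a run with this file, save it as experiment/experiment.yaml inside a configs directory of your choosing and add that directory to Hydra's search path, mirroring the earlier usage example (the directory below is a hypothetical placeholder):

# Assumes the file was saved to /PATH/TO/CUSTOM/CONFIGS/experiment/experiment.yaml
meds-torch-train experiment=experiment.yaml $PATHS_KWARGS $TASK_KWARGS hydra.searchpath=[pkg://meds_torch.configs,/PATH/TO/CUSTOM/CONFIGS]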

🌟 Key Features

  • Flexible ML Pipeline: Utilizes Hydra for dynamic configuration and PyTorch Lightning for scalable training.
  • Advanced Tokenization: Supports multiple strategies for embedding EHR data (Triplet, Text Code, Everything In Code).
  • Supervised Learning: Train models on arbitrary prediction tasks defined over MEDS-format data.
  • Transfer Learning: Pretrain models using contrastive learning, forecasting, and other methods, then finetune for specific tasks.
  • Multiple Pretraining Methods: Supports EBCL, OCP, STraTS Value Forecasting, and Autoregressive Observation Forecasting.

🛠 Installation

PyPI

pip install meds-torch

From Source

git clone git@github.com:Oufattole/meds-torch.git
cd meds-torch
pip install -e .
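Either route installs the console entry points used throughout this README. As a quick sanity check, the training CLI should respond to the standard Hydra help flag (assuming a standard Hydra app setup):

meds-torch-train --help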

📚 Documentation

For detailed usage instructions, API reference, and examples, visit our documentation.

For a comprehensive demo of our pipeline, and for results from a suite of inductive experiments comparing different tokenization methods and learning approaches, see the MIMICIV_INDUCTIVE_EXPERIMENTS/README.md file, which provides detailed scripts and performance metrics.

🧪 Running Experiments

Supervised Learning

bash MIMICIV_INDUCTIVE_EXPERIMENTS/launch_supervised.sh $MIMICIV_ROOT_DIR meds-torch

Transfer Learning

# Pretraining
bash MIMICIV_INDUCTIVE_EXPERIMENTS/launch_multi_window_pretrain.sh $MIMICIV_ROOT_DIR meds-torch [METHOD]
bash MIMICIV_INDUCTIVE_EXPERIMENTS/launch_ar_pretrain.sh $MIMICIV_ROOT_DIR meds-torch [AR_METHOD]

# Finetuning
bash MIMICIV_INDUCTIVE_EXPERIMENTS/launch_finetune.sh $MIMICIV_ROOT_DIR meds-torch [METHOD]
bash MIMICIV_INDUCTIVE_EXPERIMENTS/launch_ar_finetune.sh $MIMICIV_ROOT_DIR meds-torch [AR_METHOD]

Replace [METHOD] with one of the following:

  • ocp (Observation Contrastive Pretraining)
  • ebcl (Event-Based Contrastive Learning)
  • value_forecasting (STraTS Value Forecasting)

Replace [AR_METHOD] with one of the following:

  • eic_forecasting (Everything In Code Forecasting)
  • triplet_forecasting (Triplet Forecasting)

These scripts allow you to run various experiments, including supervised learning, different pretraining methods, and finetuning for both standard and autoregressive models.
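For example, a complete OCP transfer-learning run substitutes ocp for [METHOD] in the pretraining and finetuning scripts above:

# Pretrain with OCP, then finetune the pretrained weights
bash MIMICIV_INDUCTIVE_EXPERIMENTS/launch_multi_window_pretrain.sh $MIMICIV_ROOT_DIR meds-torch ocp
bash MIMICIV_INDUCTIVE_EXPERIMENTS/launch_finetune.sh $MIMICIV_ROOT_DIR meds-torch ocp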

📞 Support

For questions, issues, or feature requests, please open an issue on our GitHub repository.


MEDS-torch: Advancing healthcare machine learning through flexible, robust, and scalable sequence modeling tools.