The Childhood Cancer Data Lab was established by Alex’s Lemonade Stand Foundation (ALSF) in 2017. The Data Lab is a team of data scientists, designers, engineers, and communicators. Our mission is to accelerate the pace of finding novel cures and treatments for childhood cancer by putting resources and knowledge in the hands of pediatric cancer experts.
We construct tools that make vast amounts of data widely available, easily mineable, and broadly reusable. We also train researchers to better understand their own data and to advance their work more quickly. The Data Lab team simultaneously contributes to childhood cancer research and to the open science and open source software communities.
refine.bio is a multi-organism collection of genome-wide transcriptome or gene expression data that has been obtained from publicly available repositories and uniformly processed and normalized.
- Read the documentation.
- Get started with example workflows for use with refine.bio data.
- Learn about building and running the refine.bio project source code.
ALSF created the Single-cell Pediatric Cancer Atlas (ScPCA) project to generate an unprecedented resource for the pediatric cancer research community. By funding investigators’ single-cell profiling of patient samples, ALSF established a growing atlas of over 600 samples from more than 50 cancer types. To maximize the reach of this resource, the Data Lab built the ScPCA Portal to make the uniformly processed data freely available.
- Explore and immediately download data from the ScPCA Portal.
- Processing workflows for ScPCA data.
- ScPCA tools and data files for testing them.
- User information about ScPCA processing.
Interested in submitting your data to the Portal? We accept submissions of 10x Genomics single-cell or single-nuclei profiling data from childhood and adolescent cancers (ages 0-19), broadly defined to include tumor data as well as relevant animal models, patient-derived xenografts, and cell lines.
- View the full contribution guidelines to learn more.
- Submit an interest form to be notified about our next call for contributions!
Email [email protected] with any questions about the Portal or submitting data.
OpenScPCA is an open, collaborative project to analyze data from the ScPCA Portal. This project aims to:
- Characterize the ScPCA data with analyses such as labeling cell types or identifying recurrent cell states in multiple tumor types
- Work on open and collaborative analyses
- Build consensus around usage, strengths, and pitfalls of methods and their application to pediatric cancer data
- Improve the utility of the ScPCA data for the research community
Join the conversation on GitHub Discussions and explore the OpenScPCA-analysis repository to see what the community is working on.
Contribute to OpenScPCA! Interested in helping build a resource that will benefit a broad community of pediatric cancer researchers? OpenScPCA collaborators will:
- Discover new datasets that can advance their research
- Learn how to use powerful tooling for reproducible research and software development
- Join a supportive community and meet potential collaborators
- Build their analysis portfolio, develop transferable skills in data analysis, and gain experience working collaboratively in a large code base!
Fill out the contributor interest form. You will receive an email response with more information and next steps.
Grant opportunities are available for eligible pediatric cancer researchers! We’re seeking collaborators with experience analyzing single-cell RNA-seq datasets to help annotate and assign cell types to existing ScPCA datasets.
The Open Pediatric Brain Tumor Atlas (OpenPBTA) project was a global open science initiative that analyzed a large collection of pediatric brain tumor data from over 1,000 tumors. This project operated on an open contribution model, crowdsourcing expertise from childhood brain cancer experts around the world.
Read the OpenPBTA paper in Cell Genomics to learn more!
- View the analysis repository (now archived).
- View the repository for the collaboratively written manuscript (now archived).
We offer training workshops to teach pediatric cancer researchers the data science skills they need to examine their own data.
Participants are introduced to the R programming language, reproducible research practices, and cutting-edge technologies used in single-cell and bulk RNA-sequencing data analysis.
All Data Lab training materials are openly licensed and freely available for others to use. Interested in using our materials to hold your own workshop? Learn how to get started and fill out the instructor interest form to submit an inquiry.
Email [email protected] with any questions about attending or holding a workshop.
Visit us at ccdatalab.org, follow us on X at @CancerDataLab, and connect with us on LinkedIn.
For inquiries, please contact us at [email protected].
Support our work by making a tax-deductible contribution to ALSF’s Childhood Cancer Data Lab. Donate here!