This project involves the analysis of synthetic patient data, focusing on head and neck cancer cases. The data has been generated to simulate real-world cases while maintaining the confidentiality of the original dataset.
The dataset is a collection of patient-level data, containing variables such as:
- Patient demographics: Age, gender, location, occupation, marital status, alcohol and tobacco use.
- Tumor characteristics: Histology, tumor site, staging (T, N, M stages), and grade.
- Treatment details: Surgery, chemotherapy regimen, radiotherapy (XRT) dates and doses, concurrent treatment modalities.
- Other medical data: HIV status, insurance coverage, date of histological diagnosis, ECG readings.
The analysis includes:
- Descriptive statistics for summarizing patient demographics and tumor characteristics.
- Visualization of the distribution of variables like tumor site, treatment modalities, and histology.
- Analysis of associations between demographic factors and tumor stages.
- Evaluation of the prevalence of different treatment modalities (surgery, chemo, radiotherapy) among patients.
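The first three steps above can be sketched in a few lines of R. This is a minimal illustration on a toy data frame, not the project's actual code: the object `patients` and the column names `age`, `gender`, `tumor_site`, and `t_stage` are assumptions made for the example, and the values are invented placeholders rather than records from the dataset.

```r
library(dplyr)
library(ggplot2)

# Hypothetical toy data. Column names and values are assumptions for
# illustration only; they are not taken from the project's dataset.
patients <- data.frame(
  age        = c(45, 62, 58, 71, 39, 66),
  gender     = c("M", "F", "M", "M", "F", "M"),
  tumor_site = c("Larynx", "Oral cavity", "Larynx",
                 "Oropharynx", "Oral cavity", "Larynx"),
  t_stage    = c("T2", "T3", "T4", "T3", "T1", "T4")
)

# Descriptive statistics: patient count and mean age per tumor site
site_summary <- patients %>%
  group_by(tumor_site) %>%
  summarise(n = n(), mean_age = mean(age), .groups = "drop")

# Visualization: bar chart of the tumor site distribution
site_plot <- ggplot(patients, aes(x = tumor_site)) +
  geom_bar() +
  labs(x = "Tumor site", y = "Number of patients")

# Association: cross-tabulation of gender against T stage
stage_by_gender <- table(patients$gender, patients$t_stage)
```

On real data, the cross-tabulation would typically be followed by a formal test of association, chosen according to the cell counts.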
This analysis requires the following R packages:
- tidyverse: A collection of packages for data manipulation and visualization.
- gtsummary: For creating publication-ready summary tables and regression tables.
- ggsci: Provides color palettes for ggplot2 based on scientific themes.
- cowplot: For combining multiple ggplot2 plots into a single figure.
- patchwork: An intuitive syntax for combining ggplot2 plots.
- sf: For handling spatial data and simple features.
- tmap: For creating thematic maps and visualizing spatial data.
install.packages(c("tidyverse", "ggsci", "gtsummary", "cowplot", "patchwork", "sf", "tmap"))
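Before running the analysis, it can help to check which of these packages are already installed. The snippet below is one possible sketch using base R only; the variable names `required_pkgs` and `missing_pkgs` are illustrative, and it only reports what is missing rather than installing anything.

```r
# Packages required by the analysis
required_pkgs <- c("tidyverse", "ggsci", "gtsummary", "cowplot",
                   "patchwork", "sf", "tmap")

# Compare against the packages available in the current library paths
missing_pkgs <- setdiff(required_pkgs, rownames(installed.packages()))

if (length(missing_pkgs) > 0) {
  message("Missing packages: ", paste(missing_pkgs, collapse = ", "))
} else {
  message("All required packages are installed.")
}
```

Any names it reports can then be passed to `install.packages()`.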