This project analyzes a dataset of trip information from Renfe, the Spanish railway company. The analysis covers the distribution of origins, vehicle types, vehicle classes, fares, and other features.
The main objectives of this project are:
- To explore and understand the distribution of various features in the dataset.
- To perform statistical tests to uncover significant differences between groups.
- To visualize the data to gain insights and make informed conclusions.
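As an illustration of the first two objectives, here is a minimal sketch of a per-group summary followed by a significance test. It is not the project's actual analysis: the toy data, the column names `vehicle_class` and `price`, the helper `compare_groups`, and the choice of a Kruskal-Wallis test are all assumptions made for the example.

```python
import pandas as pd
from scipy import stats

# Toy stand-in for the real dataset; column names and values are hypothetical.
trips = pd.DataFrame({
    "vehicle_class": ["Turista"] * 5 + ["Preferente"] * 5,
    "price": [30.0, 32.5, 28.0, 35.0, 31.0, 55.0, 60.0, 52.5, 58.0, 61.0],
})


def compare_groups(df, group_col, value_col):
    """Summarize value_col per group and run a Kruskal-Wallis test across groups."""
    # Descriptive statistics (count, mean, quartiles, ...) for each group.
    summary = df.groupby(group_col)[value_col].describe()
    # One array of non-missing values per group, as expected by stats.kruskal.
    samples = [g[value_col].dropna().to_numpy() for _, g in df.groupby(group_col)]
    statistic, p_value = stats.kruskal(*samples)
    return summary, statistic, p_value


summary, statistic, p_value = compare_groups(trips, "vehicle_class", "price")
print(summary)
print(f"H = {statistic:.3f}, p = {p_value:.4f}")
```

A small p-value suggests that at least one group's distribution differs from the others. Kruskal-Wallis is used here because it does not assume normally distributed prices; a different test may suit the real data better.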
The dataset is available on Kaggle: https://www.kaggle.com/datasets/thegurusteam/spanish-high-speed-rail-system-ticket-pricing/data
To run this project, you need the following Python libraries installed:
- pandas
- numpy
- matplotlib
- seaborn
- ydata_profiling
- scipy
- statsmodels
You can install these dependencies using the following command:
pip install pandas numpy matplotlib seaborn ydata_profiling scipy statsmodels
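After installing, a quick check like the following can confirm that each library is importable. This is a sketch only; the helper name `missing_packages` is hypothetical and not part of the project.

```python
from importlib.util import find_spec

# Import names of the libraries listed above.
REQUIRED = [
    "pandas", "numpy", "matplotlib", "seaborn",
    "ydata_profiling", "scipy", "statsmodels",
]


def missing_packages(names):
    """Return the names that cannot be found in the current environment."""
    return [name for name in names if find_spec(name) is None]


missing = missing_packages(REQUIRED)
if missing:
    print("Missing packages:", ", ".join(missing))
else:
    print("All required packages are importable.")
```

`find_spec` only locates each package without importing it, so the check stays fast even for heavy libraries.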