This repository contains code written for the Kaggle competition "ECML/PKDD 15: Taxi Trajectory Prediction". This implementation, based on Tensorflow and Keras, is inspired from the approach followed by the competition's winners. More information about this code is provided on this blog post.
For more information about the original competition winner's solution, please refer to:
- The summary on Kaggle's blog.
- The detailed published paper.
- The original implementation using Theano and Blocks.
The code is comprised of three main files inside the code
folder:
- data.py: Methods for loading, cleaning and pre-processing the original datasets.
- training.py: Methods for defining the neural network model and for running the training process.
- utils.py: Various mathematical and graphical utility functions.
This implementation is based on Tensforflow version 0.11.0. The training process for the included neural network model can be quite time-comsuming so it's recommended to use a GPU. A simple GPU-enabled Docker container setup can be found at: https://github.com/tensorflow/tensorflow/tree/master/tensorflow/tools/docker
Some extra python libraries, specified in requirements.txt
file, will also have to be installed.
Once your environment is set up, download the competition's CSV data files
in the datasets
folder.
To run the training process, run the following:
from code.training import full_train
full_train(n_epochs=100, batch_size=200, save_prefix='mymodel')
The above will run the full training process and save some files to disk inside the cache
folder:
mymodel-history.pickle
as- 100 files (one for each epoch) named
mymodel-XXX.hdf5
(withXXX
replaced with each epoch number).