This package implements a "Branching Attention" classifier head (described here).
It is meant as an improvement over the ULMFiT text classification algorithm: it increases data efficiency and adds some degree of explainability (see the demo and the screencast below).
The package also contains a complete environment necessary to test different configurations of the architecture (+baseline) on the IMDb movie review dataset.
It is based on the hyperspace_explorer framework.
To skip the full setup and just reproduce results or quickly run some code, try notebooks/example_interactive.ipynb.
To install "as an app" - to reproduce results, run experiments, etc:
Requirements:
- a CUDA-enabled GPU, recent enough to run PyTorch, min. 8GB memory
- NVIDIA drivers installed (installing CUDA separately is not necessary)
- Conda
Warning: by default, some data and a pre-trained encoder will be placed in ${HOME}/.fastai/. This can be changed, but then the FASTAI_HOME environment variable has to be set at all times.
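To illustrate why the variable has to stay set, the sketch below shows how a default location with an environment override is typically resolved. The helper name `resolve_fastai_home` is hypothetical and is not part of fastai or this package. It only assumes that `FASTAI_HOME`, when present, replaces `~/.fastai`.

```python
import os
from pathlib import Path


def resolve_fastai_home(env=None):
    """Hypothetical helper: return the directory expected to hold fastai data.

    Assumes FASTAI_HOME, when set, overrides the default ~/.fastai.
    """
    env = os.environ if env is None else env
    return Path(env.get("FASTAI_HOME", "~/.fastai")).expanduser()


if __name__ == "__main__":
    # With the variable unset, the default location is used.
    print(resolve_fastai_home({}))
    # With the variable set, the custom location is used instead.
    print(resolve_fastai_home({"FASTAI_HOME": "/data/fastai"}))
```

If the variable is exported in only one shell session, a later session would fall back to the default location and would not find the files downloaded earlier.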
git clone https://github.com/tpietruszka/ulmfit_attention.git
cd ulmfit_attention
conda env create -f environment.yml
source activate ulmfit_attention
pip install -e .
# setup data, pretrained encoder
python -c "from fastai.text import untar_data, URLs; untar_data(URLs.IMDB)"
IMDB=${HOME}/.fastai/data/imdb
wget -O ${IMDB}/itos.pkl https://static.purecode.pl/ulmfit_attention/imdb/itos.pkl
mkdir -p ${IMDB}/models
wget -O ${IMDB}/models/fwd_enc.pth https://static.purecode.pl/ulmfit_attention/imdb/fwd_enc.pth
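Before running the tests, it may help to confirm that the downloads landed where the commands above put them. The following is a minimal sketch that uses only the standard library. The helper name `missing_setup_files` is hypothetical, and the default path assumes that `FASTAI_HOME` has not been changed.

```python
import os
from pathlib import Path

# File names taken from the setup commands above, relative to the IMDb data directory.
EXPECTED_FILES = ("itos.pkl", "models/fwd_enc.pth")


def missing_setup_files(imdb_dir):
    """Hypothetical helper: list the expected files that are absent from imdb_dir."""
    imdb_dir = Path(imdb_dir)
    return [name for name in EXPECTED_FILES if not (imdb_dir / name).is_file()]


if __name__ == "__main__":
    # Default location used by the commands above; adjust if FASTAI_HOME is set.
    imdb = Path(os.path.expanduser("~/.fastai/data/imdb"))
    missing = missing_setup_files(imdb)
    print("all files present" if not missing else f"missing: {missing}")
```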
To test the install, including briefly training a model, run pytest --runslow. It should complete in a few minutes.
To use the PyTorch modules defined here, this package can also be installed using pip, directly from GitHub:
pip install git+https://github.com/tpietruszka/ulmfit_attention.git#egg=ulmfit_attention
More work is needed before it will be ready for PyPI.
The main usage mode is to start a worker, which processes queued runs from a MongoDB-based queue and stores the results there. To do that, go to the inner ulmfit_attention directory and run:
hyperspace_worker.py ../tasks/ ulmfit_attention
For instructions on submitting runs to the queue and other usage modes - including interactive experimentation in Jupyter - see hyperspace_explorer.