Code for extracting a representative image from a PDF file using CV.
This code needs to be run on GPU. We include a colab example.
bash setup.sh
First add a bunch of PDF files to a directory pdfs/
.
Next call,
python run.py pdfs/ pics/
The code will attempt to extract an image for each pdf into the pic directory.