slides2questions

A command-line tool that generates multiple-choice questions from slides or other study material in PDF form.

Warning

This project is still in development and may not work as expected. Please report any issues you encounter.

Getting Started

Prerequisites

Python 3.10 or higher
pip

Installation

Clone the repo
```
git clone
```
Install the required packages
```
pip install -r requirements.txt
```
Set up environment variables
```
cp .env.sample .env
```
Then replace the placeholder GOOGLE_API_KEY value in the .env file with your own key. You can get the key from here.
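
Before running the tool, it can help to confirm that the key was actually saved. The following is a minimal sketch, not part of this repository: the helper names `read_env_file` and `google_api_key_is_set` are hypothetical, and the parser assumes the .env file holds plain `KEY=VALUE` lines. It only checks that a non-empty value exists; it cannot tell a real key from the placeholder.

```python
import os
from pathlib import Path


def read_env_file(path: str = ".env") -> dict[str, str]:
    """Parse simple KEY=VALUE lines, skipping blank lines and # comments."""
    values: dict[str, str] = {}
    env_path = Path(path)
    if not env_path.is_file():
        return values
    for raw_line in env_path.read_text(encoding="utf-8").splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop surrounding whitespace and optional quotes around the value.
        values[key.strip()] = value.strip().strip("'\"")
    return values


def google_api_key_is_set(path: str = ".env") -> bool:
    """Return True if GOOGLE_API_KEY is non-empty in the file or the environment."""
    key = read_env_file(path).get("GOOGLE_API_KEY") or os.environ.get("GOOGLE_API_KEY", "")
    return bool(key.strip())


if __name__ == "__main__":
    print("GOOGLE_API_KEY found" if google_api_key_is_set() else "GOOGLE_API_KEY missing")
```

Run it from the directory that contains the .env file.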

Usage

Put the PDF files you want to generate questions from in a directory. Let's say the directory is pdfs/.

Run the following command:

python src/cli.py pdfs/

or, to also extract text from images embedded in the PDFs:

python src/cli.py pdfs/ --extract-text-from-images

By default, the questions are written to questions_and_answers.json in the current directory. Use the --output option to write them to a different file.
```
python src/cli.py pdfs/ --output my_questions.json
```
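
To take a quick look at the generated file from Python, something like the sketch below may be enough. The structure of the JSON is not documented here, so the hypothetical helper `summarize_output` deliberately assumes nothing about it and only reports the top-level shape.

```python
import json
from pathlib import Path


def summarize_output(path: str = "questions_and_answers.json") -> str:
    """Load the generated file and describe its top-level shape.

    No schema is assumed: the function only distinguishes a JSON array,
    a JSON object, and any other top-level value.
    """
    data = json.loads(Path(path).read_text(encoding="utf-8"))
    if isinstance(data, list):
        return f"list with {len(data)} entries"
    if isinstance(data, dict):
        return f"object with keys: {', '.join(sorted(data))}"
    return f"top-level {type(data).__name__}"


if __name__ == "__main__":
    print(summarize_output())
```

Pass a different path, for example `summarize_output("my_questions.json")`, if you used the --output option.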

Options

> python src/cli.py -h 

 usage: pdf2questions [-h] [--verbose] [--extract-text-from-images] [--number-of-topics NUMBER_OF_TOPICS] [--passes-over-corpus PASSES_OVER_CORPUS]
                     [--max-answers MAX_ANSWERS] [--min-answers MIN_ANSWERS] [--correct-answers CORRECT_ANSWERS]
                     pdf_directory

 Generate questions from PDF

 positional arguments:
 pdf_directory         Directory containing PDF files

 options:
 -h, --help            show this help message and exit
 --verbose, -v         Print more information (default: False)

 PDF options:
 --extract-text-from-images, -e
                         Extract text from images in the PDF (slower, requires `pip install rapidocr-onnxruntime`) (default: False)

 LDA options:
 --number-of-topics NUMBER_OF_TOPICS, -n NUMBER_OF_TOPICS
                         Number of topics to extract from the text (default: 10)
 --passes-over-corpus PASSES_OVER_CORPUS, -p PASSES_OVER_CORPUS
                         Number of passes over the corpus when training the LDA model (higher values may improve the quality of the topics but also
                         increase the training time) (default: 5)

 Multiple choice question options:
 --max-answers MAX_ANSWERS, -m MAX_ANSWERS
                         Maximum number of answers to generate for each question (default: 5)
 --min-answers MIN_ANSWERS, -i MIN_ANSWERS
                         Minimum number of answers to generate for each question (default: 4)
 --correct-answers CORRECT_ANSWERS, -c CORRECT_ANSWERS
                         Number of correct answers to generate for each question (default: 1)
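
For readers who want to see how these options fit together, here is a rough argparse sketch reconstructed only from the help text above. It is not the code in src/cli.py. The function names `build_parser` and `parse_args` and the two consistency checks at the end are assumptions added for illustration, and the --output option is left out because it does not appear in the help output.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Rebuild the documented options; a reconstruction, not the project's parser."""
    parser = argparse.ArgumentParser(
        prog="pdf2questions",
        description="Generate questions from PDF",
        formatter_class=argparse.ArgumentDefaultsHelpFormatter,
    )
    parser.add_argument("pdf_directory", help="Directory containing PDF files")
    parser.add_argument("--verbose", "-v", action="store_true",
                        help="Print more information")

    pdf = parser.add_argument_group("PDF options")
    pdf.add_argument("--extract-text-from-images", "-e", action="store_true",
                     help="Extract text from images in the PDF")

    lda = parser.add_argument_group("LDA options")
    lda.add_argument("--number-of-topics", "-n", type=int, default=10,
                     help="Number of topics to extract from the text")
    lda.add_argument("--passes-over-corpus", "-p", type=int, default=5,
                     help="Number of passes over the corpus when training the LDA model")

    mcq = parser.add_argument_group("Multiple choice question options")
    mcq.add_argument("--max-answers", "-m", type=int, default=5,
                     help="Maximum number of answers to generate for each question")
    mcq.add_argument("--min-answers", "-i", type=int, default=4,
                     help="Minimum number of answers to generate for each question")
    mcq.add_argument("--correct-answers", "-c", type=int, default=1,
                     help="Number of correct answers to generate for each question")
    return parser


def parse_args(argv=None) -> argparse.Namespace:
    """Parse arguments and apply two hypothetical consistency checks."""
    parser = build_parser()
    args = parser.parse_args(argv)
    # Assumed checks: the real tool may validate differently or not at all.
    if args.min_answers > args.max_answers:
        parser.error("--min-answers must not exceed --max-answers")
    if args.correct_answers > args.min_answers:
        parser.error("--correct-answers must not exceed --min-answers")
    return args


if __name__ == "__main__":
    print(parse_args())
```

With the defaults shown in the help text, each question gets between 4 and 5 answers, exactly 1 of which is correct.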
