
State of the art AI generative model for rap lyrics - NLP course project.


dmkwis/POLTORA-TALERZA


Artificial rapper

Transformer-based rap lyrics generator, based heavily on the Rapformer paper.

Data preprocessing

We use the dataset from Kaggle. To create the train and test datasets, perform the following steps:

  1. Download the files artists-data.csv and lyrics-data.csv from Kaggle and place them in the data/ folder in the project root.

  2. Run the following to generate files which will contain train and test examples:

python3 data.py

This creates four files in the data/ folder: pretrain_[x|y].txt and finetune_[x|y]. After this step, the datasets are available through LyricsDatasetProvider and LyricsDataset from dataset.py.
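Before running data.py, it can help to confirm that both Kaggle files are where step 1 puts them. The following is a minimal sketch, not part of the repository: the helper name `missing_inputs` is hypothetical, and only the two file names and the data/ folder come from the steps above.

```python
from pathlib import Path

# File names taken from step 1; the helper itself is a hypothetical addition.
REQUIRED_INPUTS = ("artists-data.csv", "lyrics-data.csv")


def missing_inputs(data_dir="data"):
    """Return the required Kaggle files that are absent from data_dir."""
    root = Path(data_dir)
    return [name for name in REQUIRED_INPUTS if not (root / name).is_file()]


if __name__ == "__main__":
    missing = missing_inputs()
    if missing:
        print("Missing from data/:", ", ".join(missing))
    else:
        print("Both input files found; data.py can be run.")
```

Run it from the project root so that the relative data/ path resolves correctly.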

After generating the files, run the following to reduce the dataset sizes and the number of distinct tokens:

python3 filter.py

This will overwrite the pretrain_[x|y] and finetune_[x|y] files.
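Because filter.py replaces the generated files in place, you may want to keep a copy of the unfiltered versions first. The sketch below is one possible way to do that and is not part of the repository: `backup_datasets` and the data/backup location are assumptions made for illustration. It matches files by the pretrain_ and finetune_ prefixes only.

```python
import shutil
from pathlib import Path


def backup_datasets(data_dir="data", backup_dir="data/backup"):
    """Copy every pretrain_* and finetune_* file from data_dir into backup_dir.

    Both the function and the default backup location are hypothetical.
    Returns the names of the copied files.
    """
    src = Path(data_dir)
    dst = Path(backup_dir)
    dst.mkdir(parents=True, exist_ok=True)
    copied = []
    for pattern in ("pretrain_*", "finetune_*"):
        # glob is non-recursive here, so the backup folder itself is skipped.
        for path in sorted(src.glob(pattern)):
            if path.is_file():
                shutil.copy2(path, dst / path.name)
                copied.append(path.name)
    return copied


if __name__ == "__main__":
    print("Backed up:", backup_datasets())
```

To restore the unfiltered data, copy the files from the backup folder back into data/, or simply rerun data.py.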

Training

Wandb

To use Wandb, install it and log in:

pip install wandb

wandb login

Paste your API key when prompted.

Training script

train.py is configured through command-line parameters. To list the available parameters, run:

python train.py --help

To train the model with your chosen parameters, run:

python train.py [params]

Alternatively, use the training script, which makes it easy to adjust previously used parameters:

./run_train.sh

Inference

Run infer.py with appropriate parameters. This generates a results.txt file.

Rhyme enhancement

Run rhyme_enhancement.py. This assumes that a file named results.txt containing the generated examples already exists. The script prints the rhyme-enhanced examples to the screen.
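Since this step depends on results.txt being present, a quick guard can catch a missing or empty file before the script is run. This is a minimal sketch under stated assumptions: `count_nonempty_lines` is a hypothetical helper, not part of the repository, and it counts non-empty lines only, without assuming anything about how examples are laid out in the file.

```python
from pathlib import Path


def count_nonempty_lines(path="results.txt"):
    """Return the number of non-empty lines in path, or 0 if the file is missing."""
    results = Path(path)
    if not results.is_file():
        return 0
    with results.open(encoding="utf-8") as handle:
        return sum(1 for line in handle if line.strip())


if __name__ == "__main__":
    if count_nonempty_lines() == 0:
        print("results.txt is missing or empty; run infer.py first.")
    else:
        print("results.txt found; rhyme_enhancement.py can be run.")
```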
