This repo contains the code implementation for the paper Pianist Identification Using Convolutional Neural Networks.
Please ensure that you have Anaconda installed on your machine, then create the environment:
conda env create -f environment.yaml
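Then activate the environment. The environment name is defined in environment.yaml; 'pianist_id' below is a placeholder, so substitute the actual name from the yaml file:
conda activate pianist_id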
The alignment data can be found here. Please download the alignment files to the folder ./data/ATEPP-alignment. For MIDI files, please check the ATEPP-dataset repo.
An example of preparing the data:
python data_preprocess.py --mode align --slice_len 1000 -s -o -S
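If you also want to quantize the MIDI data, the documented --quantize option can be added; for instance (an illustrative invocation using the score-based scheme):
python data_preprocess.py --mode align --slice_len 1000 -s -o -S -q score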
You can set the features in config.py. Features that can be used:
FEATURES_LIST = [
'pitch',
'onset_time',
'offset_time',
'velocity',
'duration',
'ioi',  # Inter-onset interval
'otd',  # Offset time duration
# <----- Deviation Features ----->
'onset_time_dev',
'offset_time_dev',
'velocity_dev',
'duration_dev',
'ioi_dev',
'otd_dev'
]
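As a rough illustration (not the repo's actual preprocessing code), the non-deviation features can be derived from raw note data along these lines; the notes array and the exact definition of 'otd' are assumptions here, and the *_dev features additionally require the score alignment:
import numpy as np

def derive_features(notes):
    # 'notes' is a hypothetical (N, 4) array with columns
    # [pitch, onset_time, offset_time, velocity], sorted by onset time.
    pitch, onset, offset, velocity = notes.T
    duration = offset - onset                 # 'duration': how long each note sounds
    ioi = np.diff(onset, prepend=onset[0])    # 'ioi': time between consecutive onsets
    otd = np.diff(offset, prepend=offset[0])  # 'otd': assumed analogous for offsets; see the paper
    return {'pitch': pitch, 'onset_time': onset, 'offset_time': offset,
            'velocity': velocity, 'duration': duration, 'ioi': ioi, 'otd': otd}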
Please refer to the paper for more details. A full list of options for data processing:
usage: data_preprocess.py [-h] [--path_to_dataset_csv PATH_TO_DATASET_CSV] [--path_to_save PATH_TO_SAVE] [--data_folder DATA_FOLDER] [--score_folder SCORE_FOLDER] [--align_result_column ALIGN_RESULT_COLUMN]
[--midi_file_column MIDI_FILE_COLUMN] [--random_state RANDOM_STATE] [--isSplits] [--isSlice] [--isFull] [--isOverlap] [--quantize {score,group,grid,None}] [--max_len MAX_LEN]
[--slice_len SLICE_LEN] [--mode {midi,align}]
Argument Parser
options:
-h, --help show this help message and exit
--path_to_dataset_csv PATH_TO_DATASET_CSV
Path to dataset CSV file
--path_to_save PATH_TO_SAVE
Path to save processed data
--data_folder DATA_FOLDER
Directory of the performances / alignment results
--score_folder SCORE_FOLDER
Directory of the scores
--align_result_column ALIGN_RESULT_COLUMN
Column storing the alignment result file paths
--midi_file_column MIDI_FILE_COLUMN
Column storing the MIDI performance file paths
--random_state RANDOM_STATE, -r RANDOM_STATE
Random state (default: 42)
--isSplits, -S To split the data into train, valid, test sets
--isSlice, -s To slice the performances into segments
--isFull, -f To use the full performances as input
--isOverlap, -o To insert overlap for segments
--quantize {score,group,grid,None}, -q {score,group,grid,None}
To quantize the MIDI files
--max_len MAX_LEN, -ml MAX_LEN
Maximum length for the input (even when using the full performances)
--slice_len SLICE_LEN, -sl SLICE_LEN
Segment length for slicing
--mode {midi,align} Whether to process midi files or alignment files
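For example, to process raw MIDI performances in full, truncated to at most 1000 notes, with a train/valid/test split (flag values here are illustrative):
python data_preprocess.py --mode midi -f --max_len 1000 -S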
Training was monitored with W&B. The current implementation is only compatible with wandb.
For training the models, please run the following commands:
python main.py --mode train
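Only --mode is required here; the other options documented below can also be supplied explicitly. A fuller invocation (paths and values in this sketch are placeholders) could look like:
python main.py --mode train --data_path data/YOUR_PROCESSED_DATA.npz --num_of_features 13 --num_of_performers YOUR_NUM_PERFORMERS --cuda_devices 0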
A full list of options for training (and evaluation):
usage: main.py [-h] [--mode {train,evaluate}] [--data_path DATA_PATH] [--num_of_features NUM_OF_FEATURES] [--num_of_performers NUM_OF_PERFORMERS] [--cuda_devices CUDA_DEVICES [CUDA_DEVICES ...]]
[--save_path SAVE_PATH] [--ckpt_path CKPT_PATH]
options:
-h, --help show this help message and exit
--mode {train,evaluate}
Choose to train or evaluate the model.
--data_path DATA_PATH
Path to the processed data file '*.npz'.
--num_of_features NUM_OF_FEATURES
Number of features used in the experiment
--num_of_performers NUM_OF_PERFORMERS
Number of performers considered in the experiment
--cuda_devices CUDA_DEVICES [CUDA_DEVICES ...]
CUDA device ids
--save_path SAVE_PATH
Directory to save the evaluation report figures. Defaults to './evaluation/'
--ckpt_path CKPT_PATH
Checkpoint path to continue training or evaluate the model.
An example of evaluating the model on performance segments of 1000 notes using 13 features:
python main.py --mode evaluate --ckpt_path checkpoints/model_best_1000.ckpt
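To redirect the evaluation report figures away from the default './evaluation/', the documented --save_path option can be added (the path below is illustrative):
python main.py --mode evaluate --ckpt_path checkpoints/model_best_1000.ckpt --save_path ./evaluation/model_1000/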
Pre-trained models with different input lengths and numbers of features are available here. Checkpoints are named model_best_{SEQUENCE_LEN}_{NUM_FEATURES}.ckpt, where {NUM_FEATURES} is omitted when it equals 13.
An example of using the model checkpoint "model_best_1000_7.ckpt" to infer the performer identity:
# !!! Remember to modify FEATURES_LIST in config.py to match the required features.
# Prepare the data; saved to "data/inference.npy"
python data_preprocess.py --mode midi --max_len 1000 --path_to_midi YOUR_PATH_TO_MIDI
# Inference, load data from "data/inference.npy"
python main.py --ckpt_path checkpoints/model_best_1000_7.ckpt --mode predict --inference_path data/inference.npy
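As a quick sanity check before inference (illustrative only; the exact array layout depends on your config.py settings), you can inspect the processed input with numpy:
import numpy as np

data = np.load('data/inference.npy', allow_pickle=True)
print(data.shape)  # expected along the lines of (num_segments, seq_len, num_features)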
In this study, we used piano MIDI performances from the ATEPP dataset. However, we have also experimented with this task on the following datasets:
The MAESTRO dataset does not provide performer information for each performance. We added performer names and nationalities to the metadata by crawling the website of the International E-Piano Competition, followed by manual verification. Results are provided here.
During our research, around a hundred audio recordings were found to be wrongly labelled according to the discography given in MazurkaBL. Using a cover-song detection algorithm and manual verification, we created a cleaned version of the discography, provided here.
We applied the piano transcription algorithm by Kong et al. to the above two datasets (cleaned versions). The transcribed MIDIs are available here.
If you find this work useful, please cite:
@ARTICLE{2023arXiv231000699T,
author = {{Tang}, Jingjing and {Wiggins}, Geraint and {Fazekas}, Gyorgy},
title = "{Pianist Identification Using Convolutional Neural Networks}",
journal = {arXiv e-prints},
keywords = {Computer Science - Sound, Electrical Engineering and Systems Science - Audio and Speech Processing},
year = 2023,
month = oct,
eid = {arXiv:2310.00699},
pages = {arXiv:2310.00699},
doi = {10.48550/arXiv.2310.00699},
archivePrefix = {arXiv},
eprint = {2310.00699},
primaryClass = {cs.SD},
adsurl = {https://ui.adsabs.harvard.edu/abs/2023arXiv231000699T},
adsnote = {Provided by the SAO/NASA Astrophysics Data System}
}
Jingjing Tang: [email protected]