GitHub - DynamicGenetics/Spotify-Rehydrator: A simple package for developing a full dataset of track features from self-requested Spotify data 🌱

Recreate a full dataset of audio features for the songs in a listening history downloaded through Spotify's "download my data" facility.

This requires the files named StreamingHistory{n}.json, where {n} is the file number, starting at 0 and going up to however many files were retrieved.
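As a rough illustration of the naming scheme above, the following sketch collects the StreamingHistory{n}.json files from a folder and orders them by file number. The helper name find_history_files is hypothetical and is not part of the package.

```python
import re
from pathlib import Path

# Naming scheme described above: StreamingHistory0.json, StreamingHistory1.json, ...
_HISTORY_FILE = re.compile(r"^StreamingHistory(\d+)\.json$")


def find_history_files(folder):
    """Return the StreamingHistory{n}.json files in `folder`, ordered by n.

    Hypothetical helper for illustration only; not the package's own code.
    """
    numbered = []
    for path in Path(folder).iterdir():
        match = _HISTORY_FILE.match(path.name)
        if match and path.is_file():
            numbered.append((int(match.group(1)), path))
    # Sort numerically so that file 10 comes after file 2.
    return [path for _, path in sorted(numbered)]
```

Sorting on the parsed number rather than the file name avoids StreamingHistory10.json being ordered before StreamingHistory2.json.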

Quick start

Extended documentation is available on ReadTheDocs. First, install the package using pip. The following example then rehydrates a folder of JSON files:

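# main.py
import os
import pathlib

from spotifyrehydrator import Rehydrator

if __name__ == "__main__":
    # Resolve the input and output folders relative to this script.
    base_dir = pathlib.Path(__file__).parent.absolute()
    Rehydrator(
        os.path.join(base_dir, "input"),   # folder holding the StreamingHistory{n}.json files
        os.path.join(base_dir, "output"),  # folder the rehydrated dataset is saved into
        # Spotify API credentials are read from environment variables.
        client_id=os.getenv("SPOTIFY_CLIENT_ID"),
        client_secret=os.getenv("SPOTIFY_CLIENT_SECRET"),
    ).run(return_all=True)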

run() takes boolean arguments for audio_features and artist info, or for return_all, which returns both. These determine how much information is retrieved to make up the full dataset that is saved into the output folder.

How it works

The files for each person are read from the specified input folder.
The name and artist provided are searched with the Spotify API. The first result is taken to be the track, and the track ID is recorded.
Additional information is requested from other endpoints if audio_features, artist info or return_all is set to True.
The matched track ID and audio features are saved as one tab delimited .tsv file per person into the specified output folder.
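The steps above can be sketched roughly as follows. This is a minimal illustration, not the package's implementation: the function name rehydrate, the output column names, and the injected search callable are all assumptions, and the record keys trackName, artistName and msPlayed are assumed to match the fields in the StreamingHistory files. Passing the search in as a callable keeps the sketch independent of any particular Spotify client.

```python
import csv
import json
from pathlib import Path


def rehydrate(history_file, output_file, search):
    """Match each listening record to a track ID and write a .tsv file.

    `search(track_name, artist_name)` is a hypothetical callable that returns
    a list of result dicts, each with an "id" key, best match first.
    """
    records = json.loads(Path(history_file).read_text(encoding="utf-8"))
    with open(output_file, "w", newline="", encoding="utf-8") as handle:
        writer = csv.writer(handle, delimiter="\t")
        # Column names are illustrative assumptions.
        writer.writerow(["trackName", "artistName", "msPlayed", "trackID"])
        for record in records:
            results = search(record["trackName"], record["artistName"])
            # Take the first result as the match; write NONE when nothing is found.
            track_id = results[0]["id"] if results else "NONE"
            writer.writerow(
                [record["trackName"], record["artistName"], record["msPlayed"], track_id]
            )
```

In a real run, the search callable would wrap a call to the Spotify search endpoint, and further columns would be filled from the other endpoints when audio features or artist information are requested.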

Good to know

Not all tracks can be retrieved from the API. In our experience, about 5% of tracks cannot be found. These will have a value of NONE in the output files.
The first item returned by a search is not guaranteed to be the track you want. Comparing msPlayed with the track length is a good way to test this, since msPlayed should not exceed the track length.
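The msPlayed check described above could look something like the sketch below. It assumes each row is a dict holding msPlayed alongside the matched track's length in milliseconds under a duration_ms key (the name Spotify track objects use for track length); whether the output files carry that column is an assumption, and flag_suspect_matches is a hypothetical helper.

```python
def flag_suspect_matches(rows):
    """Return the rows whose msPlayed exceeds the matched track's length.

    Each row is assumed to be a dict with "msPlayed" and "duration_ms" keys;
    rows without a duration (unmatched tracks) are skipped.
    """
    suspect = []
    for row in rows:
        duration = row.get("duration_ms")
        if duration is None:
            continue  # no matched track, so there is nothing to compare
        if row["msPlayed"] > duration:
            suspect.append(row)
    return suspect
```

A row flagged this way is only a candidate mismatch. Passing the check does not prove the match is right, since a wrong match that happens to be longer than the listening time would not be flagged.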

P.S. Thanks to Pixel perfect for the title icon. 🙂

