Google colab notebooks for the demos ? #5

Open
timtensor opened this issue Mar 1, 2023 · 8 comments

Comments

@timtensor

Hi, I am currently looking into extracting higher-level features from an audio signal, such as genre, mood, and danceability, in a Colab / Jupyter notebook. Is there an example that one can refer to and try?

@palonso
Contributor

palonso commented Mar 2, 2023

Hi @timtensor,
you can have a look at Essentia models. That page contains feature extraction example scripts for all our models.

@timtensor
Author

Thanks for pointing it out. I think there is a problem with the installation of essentia-tensorflow; I get the following error:

I installed it from PyPI with !pip install essentia-tensorflow, and the pip version is pip 22.0.4 from /usr/local/lib/python3.8/dist-packages/pip (python 3.8).

@palonso
Contributor

palonso commented Mar 2, 2023

I think you missed the error message.
Could you also mention your OS?

@pmahan00

pmahan00 commented Mar 3, 2023

Sorry for the incomplete information. The following is the error message. I am running it in Google Colab, so I guess it is Ubuntu-based.


<ipython-input-38-96cbcf823c6c> in <module>
----> 1 from essentia.standard import MonoLoader, TensorflowPredictMusiCNN

ImportError: cannot import name 'TensorflowPredictMusiCNN' from 'essentia.standard' (/usr/local/lib/python3.8/dist-packages/essentia/standard.py)

---------------------------------------------------------------------------
NOTE: If your import is failing due to a missing package, you can
manually install dependencies using either !pip or !apt.

To view examples of installing some common dependencies, click the
"Open Examples" button below.
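An ImportError like the one above can be checked for before running the rest of a notebook. The following is a minimal sketch, not part of Essentia: `has_algorithm` is a hypothetical helper that only tests whether a module imports and exposes a given name.

```python
import importlib


def has_algorithm(module_name, algorithm_name):
    """Return True if `module_name` imports and exposes `algorithm_name`.

    Hypothetical helper for illustration; it is not an Essentia function.
    """
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        # The package itself is not installed (or failed to import).
        return False
    return hasattr(module, algorithm_name)


if __name__ == "__main__":
    name = "TensorflowPredictMusiCNN"
    if has_algorithm("essentia.standard", name):
        print(name, "is available")
    else:
        print(name, "is missing; the installed essentia build may lack TensorFlow support")
```

If the check fails, the installed essentia package probably does not include the TensorFlow algorithms.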

@pmahan00

pmahan00 commented Mar 3, 2023

Just an update: it seems to work on Google Colab when I run the following:

!apt-get update
!apt-get install -y python3-dev libsndfile1-dev
!pip install essentia==2.1b6.dev374 librosa==0.8.1
!pip install essentia-tensorflow

I have two questions about the prediction model:
a) Is it not possible to load the pre-trained model from Google Drive? I mounted my drive and pointed the graph filename to a path such as /mnt/gdrive/xxxx, but it resulted in an error.
b) I am a bit confused about the output. From the embeddings I get a matrix of values, but is there a decoding step as well?

Sample code run on Google Colab:

!apt-get update
!apt-get install -y python3-dev libsndfile1-dev
!pip install essentia==2.1b6.dev374 librosa==0.8.1
!pip install essentia-tensorflow

from essentia.standard import MonoLoader, TensorflowPredictMusiCNN

# audioFile is the path to the input audio file
audio = MonoLoader(filename=audioFile, sampleRate=16000)()
# Extract MusiCNN embeddings from the "model/dense/BiasAdd" layer
model = TensorflowPredictMusiCNN(graphFilename="msd-musicnn-1.pb", output="model/dense/BiasAdd")
predictions = model(audio)
print(predictions)

Perhaps I am doing something wrong in the code?

@palonso
Contributor

palonso commented Mar 3, 2023

Glad to see that you could install and use the models!

Regarding a), it is not related to Essentia, so I'd recommend looking for help elsewhere. Alternatively, you could download the models directly in the Colab, e.g., by adding !curl -SLO https://essentia.upf.edu/models/autotagging/msd/msd-musicnn-1.pb to your script.
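The same download can be done from Python instead of a shell command. This is a rough sketch using only the standard library; `fetch_model` is a hypothetical helper name, and it skips the download when the file is already present.

```python
import os
import urllib.request

# URL taken from the curl command above.
MODEL_URL = "https://essentia.upf.edu/models/autotagging/msd/msd-musicnn-1.pb"


def fetch_model(url, dest_dir="."):
    """Download `url` into `dest_dir` unless the file already exists.

    Hypothetical helper for illustration. Returns the local file path,
    which can then be passed as `graphFilename`.
    """
    path = os.path.join(dest_dir, os.path.basename(url))
    if not os.path.exists(path):
        urllib.request.urlretrieve(url, path)
    return path
```

Calling `fetch_model(MODEL_URL)` once at the top of the notebook returns a path such as `./msd-musicnn-1.pb`.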

About b), you are right: the embeddings are not human-readable and need to be fed to a classification head to get the class probabilities.
Note that clicking on each model on the website gives you the example script to get the predictions, plus links to the model weights and the metadata file. For example, this is the script to do inference with the danceability-msd-musicnn model on top of the embeddings you already extracted:

from essentia.standard import MonoLoader, TensorflowPredictMusiCNN, TensorflowPredict2D

audio = MonoLoader(filename="audio.wav", sampleRate=16000)()
embedding_model = TensorflowPredictMusiCNN(graphFilename="msd-musicnn-1.pb", output="model/dense/BiasAdd")
embeddings = embedding_model(audio)

model = TensorflowPredict2D(graphFilename="danceability-msd-musicnn-1.pb", output="model/Softmax")
predictions = model(embeddings)

predictions will be a matrix of shape [time_stamp, n_classes] because this model makes a prediction every 1.5 seconds of audio. To get track-level predictions, you can average the matrix across the time axis.
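The averaging step can be sketched as follows. The activation values are made up for illustration, `track_level` is a hypothetical helper, and the class names and their order are an assumption; in practice they should be read from the model's metadata file.

```python
from statistics import fmean


def track_level(predictions):
    """Average a [time_stamp, n_classes] matrix over the time axis.

    Hypothetical helper; returns one score per class.
    """
    return [fmean(column) for column in zip(*predictions)]


# Made-up activations: three time stamps, two classes.
frame_predictions = [
    [0.9, 0.1],
    [0.7, 0.3],
    [0.8, 0.2],
]

# Assumed class order; check the model's metadata file for the real one.
labels = ["danceable", "not_danceable"]

scores = track_level(frame_predictions)
for label, score in zip(labels, scores):
    print(f"{label}: {score:.2f}")
```

With the NumPy array that the model returns, `predictions.mean(axis=0)` should give the same per-class averages.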

@pmahan00

pmahan00 commented Mar 3, 2023

Thanks for the curl tip; I had totally forgotten about it. I guess all the models are under
https://essentia.upf.edu/models/

I didn't quite understand the explanation about human-readable output and track-level predictions. For example, I was looking into track-level classification with the pre-trained SVM Gaia models to learn about it. Is there a Python example or code snippet I can experiment with to get a classification from an SVM model?
Model link: https://essentia.upf.edu/svm_models/

@dbogdanov
Member

Hi @pmahan00.

To get overall track predictions, you can simply average the resulting matrix of activations across time, similar to this example.

Note that SVM classifiers are outdated in terms of their accuracy and generalization, and we recommend using the new models instead.
