Extracting info from the H5 files #32
OK, perhaps I am onto something:
This gives:
My interpretation is that video zv0Jl4TIQDc has three intervals annotated with the relative weights of Ekman's basic emotions. Is that correct? If that is the case, what would be the mapping of the emotions? What is the highest possible value for a given emotion?
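To make the interpretation above concrete, here is a minimal sketch of that assumed layout using a toy H5 file built with h5py. The group path (`All Labels/data/<video_id>`), the dataset names (`features`, `intervals`), and all the numbers are illustrative assumptions, not the real CMU-MOSEI contents:

```python
import h5py
import numpy as np

# Build a toy file mimicking the interpretation above: one video with
# three annotated intervals, each label row holding a sentiment score
# followed by six emotion intensities. All values are invented.
with h5py.File("toy_labels.h5", "w") as f:
    grp = f.create_group("All Labels/data/zv0Jl4TIQDc")
    grp.create_dataset("features", data=np.array([
        [0.0, 0.6, 0.0, 0.6, 0.0, 0.0, 0.0],   # segment 0
        [1.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.3],   # segment 1
        [-2.0, 0.0, 1.3, 0.6, 0.3, 0.6, 0.0],  # segment 2
    ]))
    grp.create_dataset("intervals", data=np.array([
        [0.0, 4.2], [4.2, 9.7], [9.7, 15.1],   # start/end timestamps (s)
    ]))

# Read it back: one label row per annotated interval.
with h5py.File("toy_labels.h5", "r") as f:
    labels = f["All Labels/data/zv0Jl4TIQDc/features"][:]
    spans = f["All Labels/data/zv0Jl4TIQDc/intervals"][:]
    for (start, end), row in zip(spans, labels):
        print(f"{start:.1f}-{end:.1f}s sentiment={row[0]:+.1f} emotions={row[1:].tolist()}")
```

Under this reading, each video group yields an N x 2 `intervals` array and an N x 7 `features` array with matching rows.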
So column zero is the Likert score, and then the other columns would be, in this order, {happiness, sadness, anger, fear, disgust, surprise}?
The issue with this interpretation is that segment 0 above would have been labelled with happiness and anger in similar amounts...
Or is it (Anger, Disgust, Fear, Happy, Sad, Surprise) as in Table 3? Then it would be Anger and Fear, which is more consistent, but the sentiment would be slightly positive...
Checking the entries with the most negative and most positive sentiment, the order seems to be {happiness, sadness, anger, fear, disgust, surprise}.
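Under that conclusion, a small sketch of the column-to-emotion decoding (assuming column 0 is the sentiment/Likert score and columns 1-6 follow the {happiness, sadness, anger, fear, disgust, surprise} order inferred above; the function name and sample values are made up for illustration):

```python
# Column order inferred in the discussion above (an assumption,
# not confirmed by the dataset documentation).
EMOTIONS = ["happiness", "sadness", "anger", "fear", "disgust", "surprise"]

def decode_row(row):
    """Split one 7-value label row into (sentiment, {emotion: intensity})."""
    sentiment = row[0]
    intensities = dict(zip(EMOTIONS, row[1:]))
    return sentiment, intensities

# Hypothetical row: positive sentiment, mostly happiness plus some surprise.
sentiment, emo = decode_row([2.0, 1.3, 0.0, 0.0, 0.0, 0.0, 0.6])
print(sentiment, emo)
```

If the Table 3 order turned out to be correct instead, only the `EMOTIONS` list would need to change.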
I have forked MOSEI to build a unimodal SER dataset:
Hello,
I would be interested in training an audio-only model (or, perhaps, a bimodal audio-text one) using CMU-MOSEI data.
I would be recomputing the audio embeddings.
So I would need only the links to the videos plus the timestamps and the annotated emotions per timestamp range.
How would I go about extracting this information?
Thanks,
Ed