In this tutorial, we will learn how to incorporate a new model into the toolkit, replacing the model currently used by one of its modules. There may be different reasons for doing so, from needing a specific ASR model for our language to the fact that another model is more accurate or efficient. Along the way, you will also be able to infer how to include a new step in our pipeline, such as a new body landmarker module!
First of all, it is advisable to know a little more about the toolkit and how it is organized:

- The first thing we need is a configuration file that includes the settings to build each module composing the pipeline, as well as different pipeline hyper-parameters (see the sketch after this list).
- When running the main_scenes.py script, we create an instance of the `DetectCandidateScenesTask` class.
- This `DetectCandidateScenesTask` object, defined in detect_candidate_scenes_task.py, is in charge of building and setting up all the modules specified in the configuration file, such as the scene detector, face detector, and so on.
- Once all the modules have been independently built, we can compose our pipeline by instantiating the `Pipeline` class, whose definition you can find in annotheia_pipeline.py.
- Finally, the `Pipeline` object processes the video to detect the candidate scenes for the future database.
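To make the first point concrete, a configuration file follows roughly this shape. This is only an illustrative sketch: the face_alignment keys match the example at the end of this tutorial, while the remaining keys are assumptions:

```yaml
# -- module choices and their settings (the face_alignment keys reappear
# -- later in this tutorial; the rest of this sketch is illustrative)
face_alignment: fan            # which registered model implements the module
face_alignment_conf: {}        # settings forwarded to that model's constructor

# -- pipeline hyper-parameters (hypothetical example)
min_scene_duration: 1.0
```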
A question arises: how does the toolkit achieve the flexibility to change a model or include a new module? If you inspect this directory, you will see that there is a folder for each module and for each model that can implement it. If you look again, you will see that each module directory contains a script named abs_{name_of_the_module}.py. These scripts define the abstract classes that force any model to receive and output data as the pipeline expects. By following these guidelines, regardless of which model you include, the pipeline will continue to work 🪄!
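To make this contract concrete, here is a minimal sketch of what such an abstract class could look like. The method name, arguments, and expected output shape match the face aligner we implement later in this tutorial, but the actual ../abs_face_aligner.py in the repository may differ in its details:

```python
from abc import ABC, abstractmethod

import numpy as np


class AbsFaceAligner(ABC):
    """Abstract interface that every face aligner in the toolkit must satisfy."""

    @abstractmethod
    def detect_facial_landmarks(self, frame: np.ndarray, face_bbs: np.ndarray) -> np.ndarray:
        """Detect the facial landmarks of each face appearing on the frame.

        Any concrete aligner must accept a frame and the face bounding boxes,
        and return an array of shape (N, L, 2): N faces with L landmarks each.
        """
        raise NotImplementedError
```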
Imagine that we want to incorporate the face aligner from Google MediaPipe because, for example, we consider it a better option for our project. Let's go step by step:
In order to maintain the toolkit's structure, we will create the following directory, starting from the root directory of the repository:
cd ./modules/face_alignment/
mkdir ./mediapipe_face_aligner/
cd ./mediapipe_face_aligner/
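After these commands, the face-alignment module should look roughly like this (the fan/ directory name is an assumption, inferred from the FANAligner class we will see registered later):

```
modules/face_alignment/
├── __init__.py
├── abs_face_aligner.py
├── fan/                       # the aligner already shipped with the toolkit (name assumed)
└── mediapipe_face_aligner/    # the directory we just created for our new model
```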
According to MediaPipe's tutorial 📜:
- We will install MediaPipe:
pip install mediapipe
- We will download the off-the-shelf model bundle:
mkdir ./models/
wget -O ./models/face_landmarker_v2_with_blendshapes.task -q https://storage.googleapis.com/mediapipe-models/face_landmarker/face_landmarker/float16/1/face_landmarker.task
We will now create a class for our model in the mediapipe_face_aligner.py script. Note that our face aligner must inherit from the abstract class defined in ../abs_face_aligner.py:
import mediapipe as mp
import numpy as np
from termcolor import cprint
from mediapipe.tasks import python
from mediapipe.tasks.python import vision

from modules.face_alignment.abs_face_aligner import AbsFaceAligner


class MediaPipeFaceAligner(AbsFaceAligner):

    def __init__(self, model_asset_path="./models/face_landmarker_v2_with_blendshapes.task"):
        # -- creating model configuration
        base_options = python.BaseOptions(
            model_asset_path=model_asset_path,
        )
        options = vision.FaceLandmarkerOptions(
            base_options=base_options,
            output_face_blendshapes=False,
            output_facial_transformation_matrixes=False,
        )

        # -- building face aligner
        self.face_aligner = vision.FaceLandmarker.create_from_options(options)

        # -- printing useful information for the user
        cprint(
            f"\t(Face Alignment) MediaPipe Face Aligner loaded from ({model_asset_path})",
            "cyan", attrs=["bold", "reverse"],
        )

    def detect_facial_landmarks(self, frame, face_bbs):
        """Detect the facial landmarks of each face appearing on the frame.

        Args:
            frame (np.ndarray): a frame read from the scene clip (in RGB format).
            face_bbs (np.ndarray): face bounding boxes provided by the face detector.
        Returns:
            np.ndarray: array containing the detected facial landmarks (N,L,2),
                where N refers to the number of faces and L to the number of detected landmarks.
        """
        # -- in this case, we do not need the face bounding boxes

        # -- the MediaPipe Tasks API expects an mp.Image instead of a raw NumPy array
        mp_frame = mp.Image(image_format=mp.ImageFormat.SRGB, data=frame)
        result = self.face_aligner.detect(mp_frame)

        # -- converting the normalized landmarks into the (N,L,2) array the pipeline expects
        landmarks = np.array([
            [(landmark.x, landmark.y) for landmark in face]
            for face in result.face_landmarks
        ])

        return landmarks
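Before wiring the new aligner into the pipeline, we can run a quick standalone sanity check. The following is a minimal sketch, assuming the frame is read with OpenCV (which returns BGR frames, so we convert them to RGB) and that a file named ./sample_clip.mp4 exists; both the clip path and the check itself are illustrative, not part of the toolkit:

```python
import cv2

from modules.face_alignment.mediapipe_face_aligner.mediapipe_face_aligner import MediaPipeFaceAligner

aligner = MediaPipeFaceAligner()

# -- reading a single frame from a sample clip (path is illustrative)
capture = cv2.VideoCapture("./sample_clip.mp4")
success, frame_bgr = capture.read()
capture.release()

if success:
    # -- OpenCV reads BGR, while MediaPipe expects RGB
    frame_rgb = cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2RGB)
    landmarks = aligner.detect_facial_landmarks(frame_rgb, face_bbs=None)
    print(landmarks.shape)  # -- expected: (N, L, 2) if at least one face was detected
```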
Just to make it easier to import this model in the future, we will include it in the ../__init__.py:
from modules.face_alignment.mediapipe_face_aligner.mediapipe_face_aligner import MediaPipeFaceAligner
Once our model has been created, we can go back to the root directory of the repository. Then, we will include our new model in the setup performed by the ./tasks/detect_candidate_scenes_task.py script:
- First, include the `import` of the class:

from modules.face_alignment import MediaPipeFaceAligner
- Then, include it here to allow it to be defined by means of the configuration file:

face_alignment_choices = ClassChoices(
    name="face_alignment",
    classes=dict(
        fan=FANAligner,
        mediapipe=MediaPipeFaceAligner,
    ),
    type_check=AbsFaceAligner,
    default=None,
    optional=True,
)
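Under the hood, the ClassChoices object maps the string given in the configuration file to the registered class. Roughly, the task resolves and builds the module as sketched below; this assumes an ESPnet-style ClassChoices with a get_class method, so check the actual task code for the exact calls:

```python
# -- illustrative resolution of the configured face aligner (names assumed)
face_aligner_class = face_alignment_choices.get_class("mediapipe")
face_aligner = face_aligner_class(
    model_asset_path="./models/face_landmarker_v2_with_blendshapes.task",
)
```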
🌟 Tip: Inspect the code here to understand how the model is built and incorporated into the `AnnoTheiaPipeline` object. Of course, feel free to dive into our code 🤿! Did you find out how to include a new module step (e.g., a body pose landmarker) into our toolkit?
The last step is simply to update or create a configuration file whose face aligner is set up based on the class we just defined. For example, in the ./config/ directory, edit the annotheia_pipeline_spanish.yaml:

# -- face alignment
face_alignment: mediapipe
face_alignment_conf:
  model_asset_path: ./models/face_landmarker_v2_with_blendshapes.task
🌟 Check MediaPipe's tutorial 📜 to replace this line of code with its corresponding function to plot and display the detected facial landmarks. Otherwise, you can comment that line out.
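If you just want a quick way to visualize the output without MediaPipe's own drawing utilities, a minimal OpenCV sketch could look like this. The frame and landmarks variables are illustrative, and remember that MediaPipe landmarks are normalized to [0, 1], so they must be scaled to pixel coordinates:

```python
import cv2

height, width = frame.shape[:2]
for face in landmarks:
    for (x, y) in face:
        # -- scaling normalized landmarks to pixel coordinates
        cv2.circle(frame, (int(x * width), int(y * height)), 2, (0, 255, 0), -1)

cv2.imshow("Detected facial landmarks", frame)
cv2.waitKey(0)
```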