[Paper]
![](https://github.com/AlessioSam/LADiff/raw/main/images/teaser-1.png)
The target duration of a synthesized human motion is a critical attribute that requires modeling control over the motion dynamics and style. Speeding up an action performance is not merely fast-forwarding it. However, state-of-the-art techniques for human behavior synthesis have limited control over the target sequence length.
We introduce the problem of generating length-aware 3D human motion sequences from textual descriptors, and we propose a novel model to synthesize motions of variable target lengths, which we dub "Length-Aware Latent Diffusion" (LADiff). LADiff consists of two new modules: 1) a length-aware variational auto-encoder to learn motion representations with length-dependent latent codes; 2) a length-conforming latent diffusion model to generate motions with a richness of details that increases with the required target sequence length. LADiff significantly improves over the state-of-the-art across most of the existing motion synthesis metrics on the two established benchmarks of HumanML3D and KIT-ML.
Create and activate the conda environment:

```bash
conda create python=3.10 --name ladiff
conda activate ladiff
```

Install the packages in `requirements.txt` and install PyTorch 1.12.1:

```bash
cd src
pip install -r requirements.txt
```
Run the scripts to download dependencies:

```bash
bash prepare/download_smpl_model.sh
bash prepare/prepare_clip.sh
bash prepare/download_t2m_evaluators.sh
```
Put the datasets in the `datasets` folder; please refer to HumanML3D for setup instructions.
We tested our code on Python 3.10.9 and PyTorch 1.12.1.
Download the checkpoints trained on HumanML3D from Google Drive and place them in the `experiments/ladiff` folder.
For stage 1 (LA-VAE), first check the parameters in `configs/config_vae_humanml3d.yaml`, e.g. `NAME` and `DEBUG`.
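As a rough, hypothetical sketch of what those keys might look like (only the key names come from the instructions above; the layout and values are assumptions to adapt to your copy of the config):

```yaml
# Hypothetical excerpt of configs/config_vae_humanml3d.yaml
# (layout and values are assumptions; only the key names are from the instructions above).
NAME: my_lavae_run   # experiment name; pick any identifier for this run
DEBUG: False         # keep False for a full training run
```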
Then, run the following command:
```bash
python -m train --cfg configs/config_vae_humanml3d.yaml --cfg_assets configs/assets.yaml --batch_size 64 --nodebug
```
For stage 2 (LA-DDPM), update the parameters in `configs/config_ladiff_humanml3d.yaml`, e.g. `NAME`, `DEBUG`, and `PRETRAINED_VAE` (set it to the path of the latest checkpoint produced in the previous step).
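As above, a hypothetical sketch of the keys to edit (the nesting in the actual file may differ; the checkpoint path is a placeholder for your own stage-1 output):

```yaml
# Hypothetical excerpt of configs/config_ladiff_humanml3d.yaml
# (layout and values are assumptions; only the key names are from the instructions above).
NAME: my_ladiff_run   # experiment name for stage 2
DEBUG: False          # keep False for a full training run
PRETRAINED_VAE: experiments/ladiff/<your_stage1_run>/<latest>.ckpt  # LA-VAE checkpoint from stage 1
```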
Then, run the following command:
```bash
python -m train --cfg configs/config_ladiff_humanml3d.yaml --cfg_assets configs/assets.yaml --batch_size 128 --nodebug
```
First set `TEST.CHECKPOINT` in `configs/config_ladiff_humanml3d.yaml` to the path of the trained model checkpoint.
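For example, a hypothetical sketch (the dotted name `TEST.CHECKPOINT` suggests a nested key; the path is a placeholder for your own checkpoint):

```yaml
# Hypothetical excerpt of configs/config_ladiff_humanml3d.yaml; the path is a placeholder.
TEST:
  CHECKPOINT: experiments/ladiff/<your_stage2_run>/<best>.ckpt
```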
Then, run the following command:
```bash
python -m test --cfg configs/config_ladiff_humanml3d.yaml --cfg_assets configs/assets.yaml
```
If you find our code or paper helpful, please consider citing us:
```bibtex
@InProceedings{sampieri_eccv_24,
  author    = "Sampieri, Alessio and Palma, Alessio and Spinelli, Indro and Galasso, Fabio",
  editor    = "Leonardis, Ale{\v{s}} and Ricci, Elisa and Roth, Stefan and Russakovsky, Olga and Sattler, Torsten and Varol, G{\"u}l",
  title     = "Length-Aware Motion Synthesis via Latent Diffusion",
  booktitle = "Computer Vision -- ECCV 2024",
  year      = "2025",
  publisher = "Springer Nature Switzerland",
  address   = "Cham",
  pages     = "107--124",
  isbn      = "978-3-031-73668-1"
}
```
Our code borrows from MLD; thanks to its authors. Please visit their page for more instructions.