Jianqi Chen, Panwen Hu, Xiaojun Chang, Zhenwei Shi, Michael Christian Kampffmeyer, and Xiaodan Liang
This repository is the official implementation of Sitcom-Crafter. If you have any questions, please feel free to contact us: create an issue or email me at [email protected]. Ideas and discussion are also welcome.
[10/15/2024] Code public.
[10/05/2024] Code cleanup done. Waiting to be made public.
[09/29/2024] Code init.
- Abstract
- Requirements
- Pretrained Weights Release
- Data Preparation
- Body Regressor
- Human-Human Interaction Module
- Full System Generation
- Results
- Citation & Acknowledgments
Recent advancements in human motion synthesis have focused on specific types of motions, such as human-scene interaction, locomotion, or human-human interaction; however, there is a lack of a unified system capable of generating a diverse combination of motion types. In response, we introduce Sitcom-Crafter, a comprehensive and extendable system for human motion generation in 3D space, which can be guided by extensive plot contexts to enhance workflow efficiency for anime and game designers. The system comprises eight modules, three of which are dedicated to motion generation, while the remaining five are augmentation modules that ensure consistent fusion of motion sequences and system functionality. Central to the generation modules is our novel 3D scene-aware human-human interaction module, which addresses collision issues by synthesizing implicit 3D Signed Distance Function (SDF) points around motion spaces, thereby minimizing human-scene collisions without additional data collection costs. Complementing this, our locomotion and human-scene interaction modules leverage existing methods to enrich the system's motion generation capabilities. Augmentation modules encompass plot comprehension for command generation, motion synchronization for seamless integration of different motion types, hand pose retrieval to enhance motion realism, motion collision revision to prevent human collisions, and 3D retargeting to ensure visual fidelity. Experimental evaluations validate the system's ability to generate high-quality, diverse, and physically realistic motions, underscoring its potential for advancing creative workflows.
- Hardware Requirements
  - GPU: For training, more GPUs are preferred. For evaluation, a single GPU with about 12 GB of memory should be sufficient.
- Software Requirements
  - Python: 3.10 or above
  - CUDA: 11.8 or above
  - cuDNN: 8.4.1 or above
To install other requirements:
pip install -r requirements.txt
(You may encounter some issues when installing the pointnet2_ops and pytorch3d packages. If so, please compile them manually according to their official instructions.)
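If you are unsure whether the manual builds succeeded, a quick import check like the following can help. This is a minimal sketch, not part of the repository:

```python
# Minimal sanity check (not part of this repository): verify that the two
# commonly problematic packages were installed/compiled correctly.
import importlib

for pkg in ("pointnet2_ops", "pytorch3d"):
    try:
        importlib.import_module(pkg)
        print(f"{pkg}: OK")
    except ImportError as err:
        print(f"{pkg}: not usable ({err}); please build it from its official repository.")
```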
In this part, we provide the pretrained weights of our system. You can download them from the following links:
- Body Regressor (remember to modify get_smplh_body_regressor_checkpoint_path() and get_smplx_body_regressor_checkpoint_path() in global_path.py to the downloaded paths). Please refer to the Body Regressor Training section for training details if you want to train your own models. A minimal sketch of these global_path.py edits is shown after this list.
- Human-Human Interaction Module (remember to modify model.yaml (evaluation/generation) or train.yaml (finetuning) in configs to the downloaded paths). Here we only provide the weights trained on the 30-FPS InterHuman dataset. Please refer to the Human-Human Interaction Module Training section for training details if you want to train other kinds of models.
Note: [Before any training, evaluation, or visualization, please ensure you have modified the global parameters in global_path.py to your own paths first.]
First, download the InterHuman dataset from the official website and save it in the get_dataset_path() path, which is set in global_path.py. Then, run the following command to convert the raw InterHuman dataset to the marker-point InterHuman dataset that can be used in our system:
cd HHInter
python rearrange_dataset.py
You can modify the OUT_FPS parameter in the rearrange_dataset.py file to control the output frame rate of the dataset.
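Conceptually, lowering OUT_FPS keeps every N-th frame of the source sequence. The sketch below illustrates the idea on dummy data; the array shapes and the 60-FPS source rate are assumptions, not the script's actual code:

```python
# Conceptual illustration of frame-rate downsampling (not rearrange_dataset.py itself).
import numpy as np

RAW_FPS = 60                            # assumed source frame rate
OUT_FPS = 30                            # target frame rate, analogous to the script's OUT_FPS
motion = np.random.randn(240, 67, 3)    # dummy (frames, markers, xyz) sequence

step = RAW_FPS // OUT_FPS
downsampled = motion[::step]            # keep every `step`-th frame
print(motion.shape, "->", downsampled.shape)   # (240, 67, 3) -> (120, 67, 3)
```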
(Optional) The above command is used to process the raw motions in the original InterHuman dataset. If you want to directly convert the 30-FPS motions_processed data of the InterHuman dataset to the marker-point InterHuman dataset, you can run the following command:
cd HHInter
python intergen_processed_to_custom.py
First, download the Inter-X dataset from the official website and save it in the get_dataset_path() path, which is set in global_path.py. Then, run the following command to convert the raw Inter-X dataset to the marker-point Inter-X dataset that can be used in our system:
cd HHInter
python rearrange_dataset_interX.py
You can modify the OUT_FPS parameter in the rearrange_dataset_interX.py file to control the output frame rate of the dataset.
We provide a comprehensive visualization tool for different kinds of datasets (the raw InterHuman dataset, the converted marker-point InterHuman dataset, the raw Inter-X dataset, and the converted marker-point Inter-X dataset, etc.). To visualize the data, run this command:
cd HHInter
python custom_visualize.py
Please see the custom_visualize.py file for more details on its functions and usage.
To calculate the mean and std of the dataset, which are used for normalization during training, run this command:
cd HHInter
python cal_dataset_std_mean.py
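The statistics are used for standard z-normalization of the motion features. The sketch below shows the idea on dummy data; the feature dimensions are made up and this is not the script's actual code:

```python
# Conceptual illustration of how dataset mean/std are used (dummy data, not the real script).
import numpy as np

features = np.random.randn(1000, 262)            # dummy (num_samples, feature_dim) motion features
mean = features.mean(axis=0)
std = features.std(axis=0)
normalized = (features - mean) / (std + 1e-8)    # z-normalization applied during training
```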
To pre-extract the CLIP embeddings for the Hand Pose Retrieval module, run this command:
cd HHInter
python clip_embedding_extraction.py
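For intuition, the sketch below shows how text embeddings are typically obtained with OpenAI's clip package. The repository's clip_embedding_extraction.py may use a different model variant or interface, so treat this purely as an assumption-laden illustration:

```python
# Illustrative only: generic CLIP text-embedding extraction with the `clip` package.
# The repo's clip_embedding_extraction.py may differ in model choice and details.
import clip
import torch

device = "cpu"
model, _ = clip.load("ViT-B/32", device=device)
tokens = clip.tokenize(["two people shake hands"]).to(device)
with torch.no_grad():
    text_embedding = model.encode_text(tokens)
print(text_embedding.shape)   # e.g. torch.Size([1, 512])
```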
Download the 3D Replica Scene dataset from the official website and save it in the HSInter\data path. (Note that our system currently only supports one-layer scenes. If you want to use multi-layer scenes, please modify the code accordingly.)
Download the SMPL/SMPLH/SMPLX models from their official websites (official website-SMPL / official website-SMPLH / official website-SMPLX) and save them in the get_SMPL_SMPLH_SMPLX_body_model_path() path, which is set in global_path.py. For the SMPLX models, please also copy them to the path HSInter\data\models_smplx_v1_1\models\smplx.
Download the essentials for the Human-Human Penetration Loss from the official website provided by BUDDI and save them in the HHInter\data directory.
Download the pretrained weights and configurations of the human locomotion and human-scene interaction modules from the official website by DIMOS and save them in the HHInter\ path (overwrite existing files if there are conflicts).
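After completing the downloads above, a quick check like the following can confirm that everything landed where the system expects. This is a sketch based only on the paths named in this section; forward slashes are used for portability and resolve the same way on Windows:

```python
# Quick existence check for the data locations named above (sketch only).
from pathlib import Path

expected = [
    Path("HSInter/data"),                                  # Replica scenes
    Path("HSInter/data/models_smplx_v1_1/models/smplx"),   # copied SMPL-X models
    Path("HHInter/data"),                                  # BUDDI penetration-loss essentials
]
for p in expected:
    print(f"{p}: {'found' if p.exists() else 'MISSING'}")
```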
Note: [Before any training, evaluation, or visualization, please ensure you have modified the global parameters in global_path.py to your own paths first.]
To train the body regressors Marker2SMPLH or Marker2SMPLX, which convert marker points to SMPLH/SMPLX model parameters, run this command:
python marker_regressor/exp_GAMMAPrimitive/train_GAMMARegressor.py --cfg MoshRegressor_v3_neutral_new
Please modify the parameters marker_filepath, body_model_path, and dataset_path in the configuration file MoshRegressor_v3_neutral_new.yaml to your own paths. By setting is_train_smplx to True, you can train the Marker2SMPLX model; otherwise, the Marker2SMPLH model will be trained. The results will be saved in the directory marker_regressor/results.
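If you prefer to script the config edits, a hedged sketch is shown below. It assumes the three parameters are top-level keys in MoshRegressor_v3_neutral_new.yaml and that the file is reachable from your working directory, which may not match the actual repository layout:

```python
# Sketch only: programmatically edit the regressor config (key nesting and file
# location are assumptions; adjust to the actual repository layout).
import yaml

cfg_path = "MoshRegressor_v3_neutral_new.yaml"   # adjust to its real location
with open(cfg_path) as f:
    cfg = yaml.safe_load(f)

cfg["marker_filepath"] = "/path/to/marker_dataset"
cfg["body_model_path"] = "/path/to/smpl_body_models"
cfg["dataset_path"] = "/path/to/interhuman_dataset"
cfg["is_train_smplx"] = True   # True -> Marker2SMPLX, False -> Marker2SMPLH

with open(cfg_path, "w") as f:
    yaml.safe_dump(cfg, f)
```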
To evaluate / visualize the body regressors, run this command:
python marker_regressor/exp_GAMMAPrimitive/infer_GAMMARegressor.py --cfg MoshRegressor_v3_neutral_new --checkpoint_path <checkpoint path>
Please modify the parameters marker_filepath, body_model_path, and dataset_path in the configuration file MoshRegressor_v3_neutral_new.yaml to your own paths. The checkpoint_path should be set to the path of the trained model. Also ensure that is_train_smplx is consistent with the trained model type.
Note: [Before any training, evaluation, or visualization, please ensure you have modified the global parameters in global_path.py to your own paths first.]
To train the human-human interaction module, run this command:
cd HHInter
python train.py
Please modify the parameters DATA_ROOT and DATA_INTERX_ROOT under interhuman, interhuman_val, and interhuman_test in the dataset configuration file datasets.yaml to your own paths. If you want to additionally train on the Inter-X dataset, please set USE_INTERX to True.
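A hedged sketch of these edits is shown below. It assumes DATA_ROOT, DATA_INTERX_ROOT, and USE_INTERX are nested under each of the three dataset sections and that datasets.yaml lives under HHInter/configs, which may not match the real file exactly:

```python
# Sketch only: point the three dataset sections at your local data (nesting and
# file location are assumptions).
import yaml

cfg_path = "HHInter/configs/datasets.yaml"   # assumed location of datasets.yaml
with open(cfg_path) as f:
    cfg = yaml.safe_load(f)

for section in ("interhuman", "interhuman_val", "interhuman_test"):
    cfg[section]["DATA_ROOT"] = "/path/to/marker_point_interhuman"
    cfg[section]["DATA_INTERX_ROOT"] = "/path/to/marker_point_interx"
    cfg[section]["USE_INTERX"] = False   # set True to additionally train on Inter-X

with open(cfg_path, "w") as f:
    yaml.safe_dump(cfg, f)
```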
Please also modify the parameters in model.yaml according to your needs. You can use CHECKPOINT to load a pretrained model (note that this is only for evaluation; for finetuning, please use the RESUME parameter as mentioned below), otherwise just leave it empty. Use TRAIN_PHASE_TWO to train the model in the second phase. Use USE_VERTEX_PENETRATION to apply the human-human body penetration loss during training (note that this requires more memory, so you may also need to reduce the batch size below). The settings of the three training phases are as follows:
- Phase 1: TRAIN_PHASE_TWO is False, USE_VERTEX_PENETRATION is False
- Phase 2: TRAIN_PHASE_TWO is True, USE_VERTEX_PENETRATION is False
- Phase 3: TRAIN_PHASE_TWO is True, USE_VERTEX_PENETRATION is True
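The phase schedule above can also be applied programmatically. The sketch below assumes the two flags are top-level keys in model.yaml and that the config lives under HHInter/configs, both of which are assumptions:

```python
# Sketch only: switch model.yaml to a given training phase (file location and
# key nesting are assumptions).
import yaml

PHASES = {
    1: {"TRAIN_PHASE_TWO": False, "USE_VERTEX_PENETRATION": False},
    2: {"TRAIN_PHASE_TWO": True,  "USE_VERTEX_PENETRATION": False},
    3: {"TRAIN_PHASE_TWO": True,  "USE_VERTEX_PENETRATION": True},
}

def set_phase(cfg_path: str, phase: int) -> None:
    with open(cfg_path) as f:
        cfg = yaml.safe_load(f)
    cfg.update(PHASES[phase])
    with open(cfg_path, "w") as f:
        yaml.safe_dump(cfg, f)

set_phase("HHInter/configs/model.yaml", 2)   # e.g. move to Phase 2
```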
Please also modify the parameters in train.yaml according to your needs. You can use BATCH_SIZE to control the training batch size, along with other parameters such as LR (learning rate), EPOCH, SAVE_EPOCH (save every N epochs), and RESUME (set it to a checkpoint path to finetune a model, or just leave it empty). The results will be saved in the directory CHECKPOINT with the name EXP_NAME.
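For orientation, the train.yaml fields mentioned above could be filled in roughly as follows; every value here is a placeholder, not the repository's default:

```python
# Placeholder values only (not the repository's defaults) for the train.yaml
# fields discussed above.
train_settings = {
    "BATCH_SIZE": 32,                    # reduce if USE_VERTEX_PENETRATION is enabled
    "LR": 1e-4,                          # learning rate
    "EPOCH": 2000,                       # total training epochs
    "SAVE_EPOCH": 100,                   # save a checkpoint every N epochs
    "RESUME": "",                        # checkpoint path to finetune from, or empty
    "CHECKPOINT": "checkpoints",         # output directory
    "EXP_NAME": "hh_interaction_exp",    # experiment name used for the results folder
}
```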
To evaluate the human-human interaction module, run this command:
cd HHInter
python eval.py
Please modify the parameters DATA_ROOT and DATA_INTERX_ROOT under interhuman, interhuman_val, and interhuman_test in the dataset configuration file datasets.yaml to your own paths. Please also modify the parameters in model.yaml to set CHECKPOINT to load a pretrained model for evaluation. The result is a .log file in the local directory.
To generate and visualize motions with the human-human interaction module, run this command:
cd HHInter
python infer.py
Please modify the parameters DATA_ROOT and DATA_INTERX_ROOT under interhuman, interhuman_val, and interhuman_test in the dataset configuration file datasets.yaml to your own paths. Please also modify the parameters in model.yaml to set CHECKPOINT to load a pretrained model for generation. The results are saved in a directory named results-<CHECKPOINT-NAME> in the local directory.
We support video recording, on-screen / off-screen rendering, generation based on a prompt.txt file or the test dataset, etc. You can check the details in the infer.py file.
To run the full system for plot-driven motion generation in 3D scenes, run this command:
cd HSInter
python synthesize/demo_hsi_auto.py
Please ensure you have set up the pretrained human-human interaction model path in the global_path.py file. Modify save_path_name in the demo_hsi_auto.py file to your desired save path. If you want to input your own plot instead of using the LLM-generated one, please comment out the code within the "Automatically LLM plot & order generation" scope in the run() function and uncomment the code within the "Manually define the motion orders" scope.
To evaluate the plot-driven human motion generation, run this command:
cd HHInter
python eval-long-story.py
Please change the pickle_file_root in the file to the path of the generated motion sequences. The result is a .log file in the local directory.
To visualize the generated motion sequences (without 3D retargeting) in Blender, please refer to the import_primitive.py file and run it in the Blender Python Console with the necessary path modifications.
To visualize the generated motion sequences (with 3D retargeting) in Blender, please refer to the blendershow.py file and run it in the Blender Python Console with the necessary path modifications.
If you find this paper useful in your research, please consider citing:
@article{chen2024sitcomcrafterplotdrivenhumanmotion,
title={Sitcom-Crafter: A Plot-Driven Human Motion Generation System in 3D Scenes},
author={Chen, Jianqi and Hu, Panwen and Chang, Xiaojun and Shi, Zhenwei and Kampffmeyer, Michael and Liang, Xiaodan},
journal={arXiv preprint arXiv:2410.10790},
year={2024}
}
We also thank the authors of InterGen, DIMOS, GAMMA, and BUDDI for their open-source code, on which parts of our code are based.
This project is licensed under the Apache-2.0 license. See LICENSE for details.