New Direction: Multi-Modal Video Understanding

We now support two models for video recognition and retrieval based on open-domain text: ActionCLIP and CLIP4Clip. These models mark the first step of MMAction2's journey towards multi-modal video understanding. We also add support for a video retrieval dataset, MSR-VTT.

For more details, please refer to ActionCLIP, CLIP4Clip and MSR-VTT.
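
As a rough illustration of what text-based retrieval means here, the sketch below ranks videos by cosine similarity between a text embedding and video embeddings in a shared space. It is not the ActionCLIP or CLIP4Clip implementation: the function names and the hand-written toy vectors are assumptions made for illustration, and real embeddings would come from trained text and video encoders.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm


def rank_videos(text_emb, video_embs):
    """Return video ids sorted from most to least similar to the text."""
    scores = {vid: cosine(text_emb, emb) for vid, emb in video_embs.items()}
    return sorted(scores, key=scores.get, reverse=True)


# Toy embeddings (hypothetical values, not produced by any model).
video_embs = {
    "video_a": [0.9, 0.1, 0.0],
    "video_b": [0.1, 0.8, 0.2],
    "video_c": [0.0, 0.1, 0.9],
}
text_emb = [0.2, 0.9, 0.1]  # stands in for an encoded text query

ranking = rank_videos(text_emb, video_embs)
print(ranking)
```

The same scoring can serve recognition when each class label is written as a text prompt and the best-matching prompt is taken as the prediction.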

Supported by @Dai-Wenxun in #2470 and #2489.

New Config Type

MMEngine introduced a pure Python style configuration file, which offers the following:

Support navigating to base configuration file in IDE
Support navigating to base variable in IDE
Support navigating to source code of class in IDE
Support inheriting two configuration files containing the same field
Load the configuration file without other third-party requirements
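
A minimal sketch of what such a configuration file might look like is shown below. The file path, the base file name, and the optimizer settings are hypothetical and chosen only for illustration; `read_base` is imported from `mmengine.config`.

```python
# configs/my_recognizer.py (hypothetical path)
from mmengine.config import read_base

# Base configs are pulled in with ordinary import syntax inside read_base(),
# which is what lets an IDE jump to the base file and its variables.
with read_base():
    from .._base_.default_runtime import *  # hypothetical base file

# Classes are referenced as imported objects instead of strings,
# so an IDE can navigate to their source code.
from torch.optim import SGD

# Illustrative values only, not taken from any released config.
optim_wrapper = dict(
    optimizer=dict(type=SGD, lr=0.01, momentum=0.9, weight_decay=1e-4))
```

A file like this is a config fragment rather than a standalone script. It would typically be loaded with `Config.fromfile` from `mmengine.config`.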

Refer to the tutorial for more detailed usage.

New Datasets

We are glad to support 3 new datasets:

(ICCV2019) HACS
(ICCV2021) MultiSports
(Arxiv2022) Kinetics-710

(ICCV2019) HACS

HACS is a new large-scale dataset for recognition and temporal localization of human actions collected from Web videos.

For more details, please refer to HACS.

Supported by @hukkai in #2224

(ICCV2021) MultiSports

MultiSports is a multi-person video dataset of spatio-temporally localized sports actions.

For more details, please refer to MultiSports.

Supported by @cir7 in #2280

(Arxiv2022) Kinetics-710

For more details, please refer to Kinetics710.

Supported by @cir7 in #2534

Other New Features

Support rich projects: Gesture Recognition, Spatio-Temporal Action Detection Tutorial, and Knowledge Distillation
Support TCANet (CVPR'2021)
Support VideoMAE V2 (CVPR'2023) and VideoMAE (NeurIPS'2022) for action detection

What's Changed

[Doc] Fix document links in readme by @cir7 in #2358
[doc] fix installation doc by @cir7 in #2362
[Enhance] Support automatically assigning issues by @cir7 in #2368
[Doc] Fix model links in README by @cir7 in #2372
[Fix] Restore the wrongly modified config by @cir7 in #2375
[Doc] Fix readme links by @cir7 in #2376
[Fix] update skeleton demo by @WILLOSCAR in #2381
[Fix] Fix a bug in demo_skeleton.py by @Dai-Wenxun in #2380
[Update] Update version requirements by @Dai-Wenxun in #2383
[Doc] update readme by @cir7 in #2382
[Doc] Update Installation Related Doc by @Dai-Wenxun in #2379
[Fix] Fix colab tutorial by @cir7 in #2384
[Fix] update colab link in tutorial by @cir7 in #2391
[Doc] Refine Docs by @Dai-Wenxun in #2404
[CI] fix github ci (main) by @cir7 in #2421
[Fix] fix a bug in multi-label classification by @Dai-Wenxun in #2425
[Fix] Fix issue template by @cir7 in #2399
[Doc] Update repo list by @cir7 in #2429
[Fix] Fix a warning caused by torch.div by @Dai-Wenxun in #2449
[Fix] Fix readthedoc error raised by incompatible OpenSSL version by @cir7 in #2455
[Fix] Fix incompatibility of ImgAug and latest Numpy by @cir7 in #2451
[Fix] Update branch in dockerfile by @cir7 in #2397
[Doc] Update outdated config in readme by @cir7 in #2419
[Fix] Fix tutorial by @cir7 in #2475
[fix] Fix batch blending bug when use multi-label classification by @cir7 in #2466
[Fix] Fix UniFormer README and metafile by @cir7 in #2450
[Doc] update faq by @cir7 in #2476
[Fix] Fix a bug of MViT when set with_cls_token to False by @KeepLost in #2480
[Fix] Update outdated dependencies of mmcv for downloading fine-gym dataset by @yhZhai in #2495
[Doc] add finetune doc by @cir7 in #2453
[Doc] Update faq doc by @cir7 in #2482
[Doc] Fix document link by @cir7 in #2457
Merge dev-1.x to main by @cir7 in #2551

New Contributors

@WILLOSCAR made their first contribution in #2381
@KeepLost made their first contribution in #2480
@yhZhai made their first contribution in #2495

Full Changelog: v1.0.0...v1.1.0

MMAction2 V1.1.0 Release