Recent Advances of Multimodal Continual Learning: A Comprehensive Survey
The first comprehensive survey of Multimodal Continual Learning (MMCL) methods. [PDF] [机器之心]
Paper | Method | Venue | Code |
---|---|---|---|
Continual Instruction Tuning for Large Multimodal Models | TIR | arXiv 2023 | - |
Preventing Zero-Shot Transfer Degradation in Continual Learning of Vision-Language Models | ZSCL | ICCV 2023 | |
Continual Vision-Language Representation Learning with Off-Diagonal Information | Mod-X | ICML 2023 | - |
Multi-Domain Lifelong Visual Question Answering via Self-Critical Distillation | SCD | ACM Multimedia 2023 | - |
Revisiting Distillation for Continual Learning on Visual Question Localized-Answering in Robotic Surgery | CS-VQLA | MICCAI 2023 | |
CTP: Towards Vision-Language Continual Pretraining via Compatible Momentum Contrast and Topology Preservation | CTP | ICCV 2023 | |
Continual Multimodal Knowledge Graph Construction | MSPT | IJCAI 2024 | |

Paper | Method | Venue | Code |
---|---|---|---|
RATT: Recurrent Attention to Transient Tasks for Continual Image Captioning | RATT | NeurIPS 2020 | |
Boosting Continual Learning of Vision-Language Models via Mixture-of-Experts Adapters | MoE-Adapters4CL | CVPR 2024 | |
CLAP4CLIP: Continual Learning with Probabilistic Finetuning for Vision-Language Models | CLAP | NeurIPS 2024 | |
Hierarchical Visual-Textual Knowledge Distillation for Life-Long Correlation Learning | VLKD | Int. J. Comput. Vis. 2021 | - |
Continual Instruction Tuning for Large Multimodal Models | EProj | arXiv 2023 | - |
Real-world Cross-modal Retrieval via Sequential Learning | SCML | IEEE Trans. Multim. 2021 | - |
Multimodal Continual Graph Learning with Neural Architecture Search | MSCGL | WWW 2022 | - |
Multimodal Continual Learning Using Online Dictionary Updating | ODU | IEEE Trans. Cogn. Dev. Syst. 2021 | - |
Confusion Mixup Regularized Multimodal Fusion Network for Continual Egocentric Activity Recognition | CMR-MFN | ICCV (Workshops) 2023 | |

Paper | Method | Venue | Code |
---|---|---|---|
Multimodal Parameter-Efficient Few-Shot Class Incremental Learning | CPE-CLIP | ICCV (Workshops) 2023 | |
Decouple Before Interact: Multi-Modal Prompt Learning for Continual Visual Question Answering | TRIPLET | ICCV 2023 | - |
Beyond Anti-Forgetting: Multimodal Continual Instruction Tuning with Positive Forward Transfer | Fwd-Prompt | arXiv 2024 | - |
S-Prompts Learning with Pre-trained Transformers: An Occam's Razor for Domain Incremental Learning | S-liPrompts | NeurIPS 2022 | |

Paper | Name | Venue | Code |
---|---|---|---|
CLiMB: A Continual Learning Benchmark for Vision-and-Language Tasks | CLiMB | NeurIPS 2022 | |
Symbolic Replay: Scene Graph as Prompt for Continual Learning on VQA Task | CLOVE | AAAI 2023 | |
Continual Multimodal Knowledge Graph Construction | IMNER, IMRE | IJCAI 2024 | |
Preventing Zero-Shot Transfer Degradation in Continual Learning of Vision-Language Models | MTIL | ICCV 2023 | |
CTP: Towards Vision-Language Continual Pretraining via Compatible Momentum Contrast and Topology Preservation | VLCP | ICCV 2023 | |
Beyond Unimodal Learning: The Importance of Integrating Multiple Modalities for Lifelong Learning | MMCL | CoLLAs 2024 | |
Towards Continual Egocentric Activity Recognition: A Multi-Modal Egocentric Activity Dataset for Continual Learning | CEAR | IEEE Trans. Multim. 2024 | |