Add support for Janus model from DeepSeek AI #34249

ighoshsubho · 2024-10-18T18:30:55Z

Model description

Janus is an autoregressive framework that unifies multimodal understanding and generation. Unlike previous approaches that use a single visual encoder for both tasks, Janus decouples visual encoding into separate pathways, one per task, while still processing both with a single, unified transformer architecture. This decoupling alleviates the conflict between the visual encoder's two roles (understanding and generation) and makes the framework more flexible, since each pathway can use the encoding method that suits it best.

Key features:

Unified framework for multimodal understanding and generation
Decoupled visual encoding pathways
Single, unified transformer architecture for processing
Improved performance in multimodal understanding tasks
Flexibility to select optimal encoding methods for each component
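
To make the "decoupled encoders, shared backbone" idea concrete, here is a minimal toy sketch in plain Python. It is not the Janus implementation and does not use any real library API: `understanding_encoder`, `generation_encoder`, `UnifiedModel`, the 2-dimensional vectors, the 4-entry codebook, and the mean-pooling "backbone" are all hypothetical stand-ins chosen for illustration. The only point it shows is the routing: each task gets its own visual pathway, while both feed the same sequence model.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

Vector = List[float]
Encoder = Callable[[List[float]], List[Vector]]


def understanding_encoder(image: List[float]) -> List[Vector]:
    """Hypothetical understanding pathway: continuous features per pixel."""
    return [[p, 1.0 - p] for p in image]


def generation_encoder(image: List[float]) -> List[Vector]:
    """Hypothetical generation pathway: quantize each pixel to one of
    four discrete codes, then look up that code's embedding."""
    codebook = [[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0]]
    return [codebook[min(int(p * 4), 3)] for p in image]


@dataclass
class UnifiedModel:
    # One visual encoder per task (assumption: keyed by task name).
    encoders: Dict[str, Encoder]

    def backbone(self, sequence: List[Vector]) -> Vector:
        """Toy stand-in for the shared transformer: mean-pools the sequence.
        The same code runs regardless of which encoder produced the input."""
        dim = len(sequence[0])
        return [sum(v[i] for v in sequence) / len(sequence) for i in range(dim)]

    def forward(self, task: str, image: List[float], text: List[Vector]) -> Vector:
        visual = self.encoders[task](image)   # task-specific pathway
        return self.backbone(text + visual)   # shared processing


model = UnifiedModel(
    encoders={
        "understanding": understanding_encoder,
        "generation": generation_encoder,
    }
)

image = [0.1, 0.6, 0.9]   # toy "image": three pixel intensities in [0, 1]
text = [[0.5, 0.5]]       # toy text embedding sequence

u = model.forward("understanding", image, text)
g = model.forward("generation", image, text)
print(u, g)
```

Swapping one pathway (for example, replacing the toy quantizer) requires no change to `backbone`, which is the flexibility the feature list above refers to.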

Open source status

The model implementation is available
The model weights are available

Provide useful links for the implementation

The Janus model is developed by DeepSeek AI. Here are the relevant links for implementation:

Paper: Janus: Bridging the Gap Between Multimodal Understanding and Generation
GitHub repository: deepseek-ai/Janus

ighoshsubho added the New model label Oct 18, 2024
