Add support for Janus model from DeepSeek AI #34249

Open
2 tasks done
ighoshsubho opened this issue Oct 18, 2024 · 0 comments

Model description

Janus is an autoregressive framework that unifies multimodal understanding and generation. Unlike previous approaches that use a single visual encoder for both tasks, Janus decouples visual encoding into separate pathways, one per task, while still using a single, unified transformer architecture for processing. This decoupling alleviates the conflict between the visual encoder's roles in understanding and generation, and it makes the framework more flexible.

Key features:

  • Unified framework for multimodal understanding and generation
  • Decoupled visual encoding pathways
  • Single, unified transformer architecture for processing
  • Improved performance in multimodal understanding tasks
  • Flexibility to select optimal encoding methods for each component
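
To make the decoupling idea concrete, here is a minimal, dependency-free sketch of how the description above could be read: two separate visual encoding pathways feed one shared sequence model. This is not the Janus implementation or its API. Every name here (`ToyJanus`, `understanding_encoder`, `generation_encoder`, `shared_backbone`) is hypothetical, the encoders are toy stand-ins, and the order in which text and visual embeddings are concatenated is an assumption made for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

Image = List[List[float]]   # toy grayscale image: rows of pixel values in [0, 1]
Embedding = List[float]     # toy 2-dimensional embedding


def understanding_encoder(image: Image) -> List[Embedding]:
    """Hypothetical stand-in for a semantic encoder: one (mean, max) vector per row."""
    return [[sum(row) / len(row), max(row)] for row in image]


def generation_encoder(image: Image) -> List[Embedding]:
    """Hypothetical stand-in for a discrete tokenizer: one quantized code per pixel."""
    return [[float(int(px * 3)), 0.0] for row in image for px in row]


def text_embed(tokens: List[str]) -> List[Embedding]:
    """Toy text embedding: token length plus a constant marker."""
    return [[float(len(tok)), 1.0] for tok in tokens]


def shared_backbone(sequence: List[Embedding]) -> List[Embedding]:
    """Placeholder for the unified transformer: a causal running mean,
    so each position only sees itself and earlier positions."""
    outputs: List[Embedding] = []
    running = [0.0] * len(sequence[0])
    for i, vec in enumerate(sequence, start=1):
        running = [r + v for r, v in zip(running, vec)]
        outputs.append([r / i for r in running])
    return outputs


@dataclass
class ToyJanus:
    # One visual pathway per task; each can be swapped independently.
    encoders: Dict[str, Callable[[Image], List[Embedding]]]

    def forward(self, task: str, text_tokens: List[str], image: Image) -> List[Embedding]:
        visual = self.encoders[task](image)          # task-specific pathway
        sequence = text_embed(text_tokens) + visual  # assumed concatenation order
        return shared_backbone(sequence)             # same backbone for both tasks


model = ToyJanus({
    "understanding": understanding_encoder,
    "generation": generation_encoder,
})
image = [[0.0, 1.0], [0.5, 0.5]]
out_understanding = model.forward("understanding", ["describe", "this"], image)
out_generation = model.forward("generation", ["draw", "this"], image)
```

Because the encoders live in a dictionary keyed by task, either pathway can be replaced without touching the other or the shared backbone, which is the flexibility the last bullet refers to.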

Open source status

  • The model implementation is available
  • The model weights are available

Provide useful links for the implementation

The Janus model is developed by DeepSeek AI. Here are the relevant links for implementation:

Paper: Janus: Bridging the Gap Between Multimodal Understanding and Generation
GitHub repository: deepseek-ai/Janus
