[Feature]: Is there a Flash-Decoding algorithm implemented based on Composable kernel? #45

Open
zhangxiao-stack opened this issue Feb 28, 2024 · 3 comments

@zhangxiao-stack

Suggestion Description

Is there a Flash-Decoding algorithm implemented based on Composable kernel?

Operating System

No response

GPU

No response

ROCm Component

composable kernel

@fxmarty

fxmarty commented Feb 29, 2024

+1 cc @howiejayz @sabreshao, this would be useful for us in https://github.com/huggingface/text-generation-inference

@sabreshao
Collaborator

@fxmarty @zhangxiao-stack This effort is planned for March. Stay tuned.

@jcao-ai

jcao-ai commented Apr 22, 2024

@sabreshao Hi, is this feature ready?
