DiT with decorator, triton fused_AdaLN and fineGrained #552

Open
wants to merge 27 commits into
base: develop

Conversation

YKTian-x2b

@YKTian-x2b YKTian-x2b commented May 24, 2024

DiT with decorator, Triton fused_AdaLN/fused_rotary_emb, horizontally fused QKV, and fine-grained FFN.

25 steps + 256*256 + new IR + mean of 5 end-to-end runs

3B final elapsed time: 581 ms (+61.2%)

7B final elapsed time: 926 ms (+41.4%)
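A minimal sketch of how an "average of 5 end-to-end runs" figure like the ones above could be collected. The helper name `mean_latency_ms`, the `run_once` callable, and the single warm-up run are assumptions made for illustration; the PR does not show its benchmark script. On a GPU, `run_once` would also need to synchronize the device before returning so that the wall-clock time covers the whole run.

```python
import time
from statistics import mean


def mean_latency_ms(run_once, repeats=5, warmup=1):
    """Time `run_once` end to end and return (mean_ms, per_run_ms).

    `run_once` is a hypothetical zero-argument callable that performs one
    full inference (for example, 25 sampling steps at 256*256).
    The warm-up run is an assumption, not something stated in the PR.
    """
    # Untimed warm-up runs absorb one-off costs such as compilation.
    for _ in range(warmup):
        run_once()

    samples = []
    for _ in range(repeats):
        start = time.perf_counter()
        run_once()
        # Convert seconds to milliseconds.
        samples.append((time.perf_counter() - start) * 1000.0)
    return mean(samples), samples
```

Returning the individual samples alongside the mean makes it easy to spot an outlier run that would skew a 5-run average.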


paddle-bot bot commented May 24, 2024

Thanks for your contribution!

def compute_activation(self, ffn1_out):
    origin_batch_size = ffn1_out.shape[0]
    origin_seq_len = ffn1_out.shape[1]
    ffn1_out = ffn1_out.reshape([origin_batch_size*origin_seq_len, ffn1_out.shape[-1]])


These two reshapes are not a good addition; I suggest extending the fused_bias_act implementation instead.
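To illustrate the suggestion, here is a rough pure-Python sketch of moving the flatten/restore step inside the operator so that callers can pass a `[batch, seq, hidden]` input directly. Everything here is a stand-in: `act_2d` only mimics a bias-plus-ReLU kernel that works on 2-D input, and none of these helpers are Paddle's `fused_bias_act` or its actual signature.

```python
def flatten_leading(x):
    """Flatten [batch][seq][hidden] nested lists to [batch*seq][hidden]."""
    batch, seq = len(x), len(x[0])
    rows = [row for sample in x for row in sample]
    return rows, (batch, seq)


def restore_leading(rows, batch, seq):
    """Inverse of flatten_leading: regroup rows into [batch][seq][hidden]."""
    return [rows[b * seq:(b + 1) * seq] for b in range(batch)]


def act_2d(rows, bias):
    """Hypothetical stand-in for a 2-D-only fused bias + ReLU kernel."""
    return [[max(v + b, 0.0) for v, b in zip(row, bias)] for row in rows]


def act_nd(x, bias):
    """Wrapper that accepts 3-D input, so call sites need no reshape."""
    rows, (batch, seq) = flatten_leading(x)
    return restore_leading(act_2d(rows, bias), batch, seq)
```

With the shape handling inside `act_nd`, a caller such as `compute_activation` would no longer have to record `origin_batch_size` and `origin_seq_len` itself.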

@YKTian-x2b YKTian-x2b changed the title from "DiT FFN fineGrained" to "DiT with decorator, triton fused_AdaLN and fineGrained" on Jun 12, 2024

# To speed up this code, call zkk and let him run for you,
# then you will get a speed increase of almost 100%.
os.environ['callZKK']= "True"
Contributor


Please rename this environment variable to something else; perhaps optimize_inference_for_ditllama?
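A small sketch of how a flag with the proposed name could be read on the consuming side. The helper `inference_optimization_enabled` and the set of accepted truthy strings are assumptions for illustration; the PR only shows the variable being set to "True".

```python
import os

# Name proposed in the review comment above.
FLAG_NAME = "optimize_inference_for_ditllama"


def inference_optimization_enabled(environ=None):
    """Return True when the flag is set to a truthy string.

    `environ` defaults to os.environ; passing a plain dict makes the
    helper easy to test. The accepted values are an assumption.
    """
    env = os.environ if environ is None else environ
    value = env.get(FLAG_NAME, "False")
    return value.strip().lower() in ("1", "true", "yes", "on")
```

Parsing the value explicitly avoids the common pitfall where any non-empty string, including "False", is treated as enabled.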

@CLAassistant

CLA assistant check
Thank you for your submission! We really appreciate it. Like many open source projects, we ask that you all sign our Contributor License Agreement before we can accept your contribution.
1 out of 2 committers have signed the CLA.

✅ westfish
❌ YKTian-x2b


5 participants