Add fused_multi_transformer op to optimize transformer generation performance #41814

Merged: 11 commits merged into PaddlePaddle:develop on Apr 26, 2022

@wangxicoding wangxicoding commented Apr 14, 2022

PR types

Performance optimization

PR changes

OPs

Describe

Add the fused_multi_transformer op to speed up inference for ERNIE generation models.
API document: two Firefox screenshots (2022-04-26T02-07-32 and 2022-04-26T02-08-34), not reproduced here.

@paddle-bot-old

Your PR has been submitted. Thanks for your contribution to this open-source project!
Please wait for the CI results first. See the Paddle CI Manual for details.

@ZzSean ZzSean left a comment

LGTM

@Aurelius84 Aurelius84 left a comment

LGTM for data registration

@wangxicoding wangxicoding changed the title from "optimize transformer generation performance" to "Add fused_multi_transformer op to optimize transformer generation performance" Apr 26, 2022

@XieYunshen XieYunshen left a comment

LGTM for set_tests_properties(test_static_model_parallel_fused_multi_transformer PROPERTIES TIMEOUT 120)

@jzhang533 jzhang533 left a comment

The API is exposed in the paddle.incubate namespace.
LGTM

@XiaoguangHu01 XiaoguangHu01 left a comment

LGTM

@wangxicoding wangxicoding merged commit 9dadf7d into PaddlePaddle:develop Apr 26, 2022
@wangxicoding wangxicoding deleted the opt_gpt_generation branch April 27, 2022 03:34
wangxicoding added a commit to wangxicoding/Paddle that referenced this pull request Apr 27, 2022
fuyinno4 pushed a commit that referenced this pull request Apr 29, 2022
…mer generation performance (#42311)

* Add fused_multi_transformer op to optimize transformer generation performance (#41814)

* fix fused_multi_transformer compile failed in cuda arch < sm53 (#42315)

* fix ci timeout
6 participants