[AutoScheduler] Add Dynamic Gradient Descent Search Algorithm for Auto-Tuning #17126
Conversation
Thank you @Lurkrazy for this contribution! Adding a Cc to relevant folks here: @comaniac @jcf94 @merrymercy @FrozenGene @minminsun @jinhongyii
(x) Could some references be added to the benchmarks, how-tos, and docs?
(x) Also please make sure the CI issues (lint & build) are all in a green state.
Given that we are migrating toward meta-schedule and may phase out auto-scheduler, I would suggest bringing new changes to that path.
This PR introduces the Dynamic Gradient Descent (DGD) search algorithm for accelerating the auto-tuning of GPU kernels within the Ansor/AutoScheduler framework. The DGD algorithm is designed to explore the search space more efficiently than the existing genetic-algorithm-based approach (a conceptual sketch of the search loop follows the change list below). The following changes are included:
Dynamic Gradient Descent Search:
Record Processor:
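For orientation, here is a hedged conceptual sketch of the coordinate-descent-style loop that DGD performs, as described in the ICS'24 paper. The `measure` and `neighbors_of` functions are illustrative placeholders, not code from this PR:

```python
def dgd_search(init_config, measure, neighbors_of, budget):
    """Conceptual sketch of Dynamic Gradient Descent search.

    measure(config) returns a measured kernel cost (lower is better) and
    neighbors_of(config) yields configs differing in one tuning parameter.
    Both are illustrative placeholders, not code from this PR.
    """
    best, best_cost = init_config, measure(init_config)
    trials = 1
    while trials < budget:
        improved = False
        # Probe one-parameter perturbations and move toward the best
        # improvement, mimicking a discrete gradient-descent step.
        for cand in neighbors_of(best):
            cost = measure(cand)
            trials += 1
            if cost < best_cost:
                best, best_cost = cand, cost
                improved = True
            if trials >= budget:
                break
        if not improved:
            # Local minimum: a full tuner would restart from a new sample.
            break
    return best, best_cost
```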
This implementation is based on the algorithm described in the paper "Accelerated Auto-Tuning of GPU Kernels for Tensor Computations" presented at ICS'24.
Experimental evaluation on a number of matrix-matrix multiplication and convolution kernels shows that the DGD algorithm achieves an order-of-magnitude improvement in auto-tuning time while maintaining comparable code performance.
Usage:
To use the DGD search algorithm, instantiate the DynamicGradientSearchTuner class with the desired parameters and call the dynamic_gradient_search method.

Example:
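A minimal usage sketch is shown below. The workload definition follows the standard auto_scheduler tutorial pattern; the import path and constructor arguments for DynamicGradientSearchTuner are assumptions based on the description above, not the PR's confirmed API.

```python
import tvm
from tvm import te, auto_scheduler

# Standard auto_scheduler workload registration (follows the TVM tutorials).
@auto_scheduler.register_workload
def matmul(N, M, K, dtype):
    A = te.placeholder((N, K), name="A", dtype=dtype)
    B = te.placeholder((K, M), name="B", dtype=dtype)
    k = te.reduce_axis((0, K), name="k")
    C = te.compute((N, M), lambda i, j: te.sum(A[i, k] * B[k, j], axis=k), name="C")
    return [A, B, C]

target = tvm.target.Target("cuda")
task = auto_scheduler.SearchTask(
    func=matmul, args=(1024, 1024, 1024, "float32"), target=target
)
log_file = "matmul_dgd.json"

# The import path and constructor signature below are assumptions based on
# the description above, not a confirmed API of this PR.
from tvm.auto_scheduler.dynamic_gradient_search import DynamicGradientSearchTuner

tuner = DynamicGradientSearchTuner(task, log_file)
tuner.dynamic_gradient_search()
```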
Experiments setup:
The experiments ran the DGD search algorithm with a 1-hour time budget and compared it against the performance Ansor achieves over its full duration (the number of trials Ansor suggests). The models used for the evaluation were BERT, ResNet-50, and MobileNetV2, with configurations based on the Apache blog post Introducing TVM Auto-scheduler (a.k.a. Ansor).
Relative performance achieved by the DGD search algorithm in 1 hour versus the full duration used by Ansor
This table presents the relative performance of the DGD search algorithm with a 1-hour time budget compared to Ansor's full duration. The performance ratios indicate that the Dynamic Gradient Descent search achieves comparable performance within a significantly reduced time frame.