-
Notifications
You must be signed in to change notification settings - Fork 160
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[transform] add a default schedule #899
Conversation
return success(); | ||
} | ||
|
||
DISC_TRANSFORM_SCHEDULE(PatternKind::kGEMM, "", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
need to register with device type?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, I'll add it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think more about this. Maybe we should just use one tag instead of adding an explicit filed for device.
For example we may have different schedule for x86 and aarch64. And even within the same cpu family, there are many different hardware features. thus using tag maybe more robust.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As discussed offline, let leave a single tag
here, and refine it if necessary when doing scheduling selection.
|
||
LogicalResult ScheduleDispatcher::parseModuleFromFile(MLIRContext* ctx) { | ||
if (transformFileName_.empty() || !parsedModuleMap_.empty()) return success(); | ||
std::string expectedFormatStr = |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It may have some badcases .... a:aa;
is a valid file name on linux..
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think such corner case may not be very important. Many env vars (e.g. PATH) also use special separator?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In bash :
can be used but has to be escaped. I agree it's really a corner case which may not be the highest priority.
Value matmul = buildMatchOp(b, loc, variant, {"linalg.matmul"}); | ||
|
||
// transform.structured.tile_to_foreach_thread_op %matmul num_threads [1, 1] | ||
auto forEachThreadOp = buildTileToForEachThreadOp(b, loc, matmul, {1, 1}); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seems to rewrite the schedule's pdl version in c++ ? Which kind do we prefer in our codebase?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's MLIR file with transform IR instead of PDL actually. The C++ version is used to provide a default schedule while we may override the default strategy for common fusion pattern (e.g. GEMM) by providing customized transform IR. Another use case is to provide codegen implementation for customized pattern. In that case, PDL may be useful to define the pattern.
it's worth noting that the C++ version is JIT generated. Thus it can be fine-tuned to make it suitable for specific shape setting. For example, we always do two level tiling in the current schedule. However, some tiled loops may only has 1 iteration if the shape of the GEMM is partial known (e.g., n, k of weight) and very small. These single-iteration loop will be folded during transformation. We have to take such information into consideration in order to make sure the transform IR is valid (e.g. some transform primitive depends on the number loop level).
to #787