-
Notifications
You must be signed in to change notification settings - Fork 3
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add accfg design doc #167
base: main
Are you sure you want to change the base?
Add accfg design doc #167
Conversation
@JosseVanDelm I added a section on my proposed |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Very good document! Can we put this in a top-level docs
folder please?
I'm not quite sure I fully agree with all the wording. Also I'm still a bit on the fence when it comes to the reset operation?
|
||
To summarise, our semantics are: | ||
- Only one `accfg.state` variable per client can be alive at any given time. | ||
- Nothing but `accfg` dialect operations modify the accelerator state. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This seems to contradict with #166 ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm, yes. This phrasing should be improved.
1. Block level await overlap | ||
2. Loop level await overlap | ||
|
||
Block level await overlap tries to find a `launch() -> await() -> setup()` triplet inside the same block operating on |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't find it very clear that in the launch -> await -> setup
the launch corresponds to a previous setup in this wording.
Maybe also add the first setup op that sets the first launch op?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes. This is a part that needs improving.
loop operating on the initial loop counters, together with a replaced version at the end of the loop body that operates | ||
on the loop counter incremented by the loop step. This effectively moves only the setup operation "upwards" inside the | ||
loop. The resulting IR contains one more `setup` call than the original version (as we now also set up the accelerator | ||
to the `ub+1` state), but since this state is never launched, no adverse effects are expected. This pattern effectively |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ub+1
reads like undefined behaviour + 1. Also should this not be iv
instead?
- `accfg-trace-states`: This pass is allowed to do the following replacement: | ||
`s0 = setup() ... reset(s0) ... s1 = setup()` to `s0 = setup() ... s1 = setup(input=s0)` | ||
- for `scf.if`, the `reset` can be moved outside the conditional | ||
- for `scf.for`, a single `reset` at the end of the loop suffices |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why? This seems AMX specific?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is (afaik) how it's done for every accelerator that is not 100% owned by one process (so all systems that are able to run more than one process simultaneously)
|
||
*How this affects the other optimisations:** | ||
- `accfg-trace-states`: This pass is allowed to do the following replacement: | ||
`s0 = setup() ... reset(s0) ... s1 = setup()` to `s0 = setup() ... s1 = setup(input=s0)` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This undoes the effect of the reset op in the first place. No?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The point is that if we know that the state is not used in between, we can erase the reset op.
*How this affects the other optimisations:** | ||
- `accfg-trace-states`: This pass is allowed to do the following replacement: | ||
`s0 = setup() ... reset(s0) ... s1 = setup()` to `s0 = setup() ... s1 = setup(input=s0)` | ||
- for `scf.if`, the `reset` can be moved outside the conditional |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmmm, good question. I'll have to think about it some more. Maybe this is a wrong statement and we actually want to do the opposite...
Co-authored-by: Josse Van Delm <[email protected]>
Co-authored-by: Josse Van Delm <[email protected]>
Co-authored-by: Josse Van Delm <[email protected]>
Co-authored-by: Josse Van Delm <[email protected]>
Co-authored-by: Josse Van Delm <[email protected]>
@JosseVanDelm I will update this with the latest changes, want to get this done this week! |
No description provided.