Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add accfg design doc #167

Draft
wants to merge 8 commits into
base: main
Choose a base branch
from
Draft

Add accfg design doc #167

wants to merge 8 commits into from

Conversation

AntonLydike
Copy link
Collaborator

No description provided.

@AntonLydike
Copy link
Collaborator Author

Copy link
Contributor

@JosseVanDelm JosseVanDelm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Very good document! Can we put this in a top-level docs folder please?

I'm not quite sure I fully agree with all the wording. Also I'm still a bit on the fence when it comes to the reset operation?

compiler/dialects/accfg.md Outdated Show resolved Hide resolved
compiler/dialects/accfg.md Outdated Show resolved Hide resolved
compiler/dialects/accfg.md Outdated Show resolved Hide resolved

To summarise, our semantics are:
- Only one `accfg.state` variable per client can be alive at any given time.
- Nothing but `accfg` dialect operations modify the accelerator state.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems to contradict with #166 ?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, yes. This phrasing should be improved.

compiler/dialects/accfg.md Outdated Show resolved Hide resolved
1. Block level await overlap
2. Loop level await overlap

Block level await overlap tries to find a `launch() -> await() -> setup()` triplet inside the same block operating on
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't find it very clear that in the launch -> await -> setup the launch corresponds to a previous setup in this wording.
Maybe also add the first setup op that sets the first launch op?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes. This is a part that needs improving.

loop operating on the initial loop counters, together with a replaced version at the end of the loop body that operates
on the loop counter incremented by the loop step. This effectively moves only the setup operation "upwards" inside the
loop. The resulting IR contains one more `setup` call than the original version (as we now also set up the accelerator
to the `ub+1` state), but since this state is never launched, no adverse effects are expected. This pattern effectively
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ub+1 reads like undefined behaviour + 1. Also should this not be iv instead?

- `accfg-trace-states`: This pass is allowed to do the following replacement:
`s0 = setup() ... reset(s0) ... s1 = setup()` to `s0 = setup() ... s1 = setup(input=s0)`
- for `scf.if`, the `reset` can be moved outside the conditional
- for `scf.for`, a single `reset` at the end of the loop suffices
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why? This seems AMX specific?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is (afaik) how it's done for every accelerator that is not 100% owned by one process (so all systems that are able to run more than one process simultaneously)


*How this affects the other optimisations:**
- `accfg-trace-states`: This pass is allowed to do the following replacement:
`s0 = setup() ... reset(s0) ... s1 = setup()` to `s0 = setup() ... s1 = setup(input=s0)`
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This undoes the effect of the reset op in the first place. No?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The point is that if we know that the state is not used in between, we can erase the reset op.

*How this affects the other optimisations:**
- `accfg-trace-states`: This pass is allowed to do the following replacement:
`s0 = setup() ... reset(s0) ... s1 = setup()` to `s0 = setup() ... s1 = setup(input=s0)`
- for `scf.if`, the `reset` can be moved outside the conditional
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmmm, good question. I'll have to think about it some more. Maybe this is a wrong statement and we actually want to do the opposite...

@AntonLydike
Copy link
Collaborator Author

@JosseVanDelm I will update this with the latest changes, want to get this done this week!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants