Get rid of most generated variables #1679

Merged Oct 16, 2023 (21 commits)

@yannham commented Oct 12, 2023

Fixes #1647.

Context

This PR adds a new node to the AST, Closure, which contains a "pointer" (more abstractly, a cache index; in practice, a thunk) to an allocated closure. This removes the need to closurize terms into the environment everywhere.

The first motivation was performance. We used to allocate specific generated variables and put them in the environment to store those closures, which came with renaming issues, environment propagation, and the cost of environment insertion and lookup. Now, we basically just use direct pointers.
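To make the difference concrete, here is a minimal sketch of the two approaches. All types and function names (`Term`, `Thunk`, `Env`, `closurize_with_var`, `closurize_direct`) are simplified assumptions made up for illustration; they are not the actual Nickel definitions, and the real evaluator is considerably more involved.

```rust
use std::cell::RefCell;
use std::collections::HashMap;
use std::rc::Rc;

// Hypothetical, simplified environment: variable names mapped to thunks.
type Env = HashMap<String, Thunk>;

#[derive(Clone, Debug)]
enum Term {
    Num(i64),
    Var(String),
    Add(Box<Term>, Box<Term>),
    // The new node: a direct pointer to an allocated closure.
    Closure(Thunk),
}

// A shared, updatable cell holding a term together with its environment.
#[derive(Clone, Debug)]
struct Thunk(Rc<RefCell<(Term, Env)>>);

impl Thunk {
    fn new(body: Term, env: Env) -> Self {
        Thunk(Rc::new(RefCell::new((body, env))))
    }

    // Evaluate the stored term once, then overwrite it with the result.
    fn force(&self) -> i64 {
        let (body, env) = {
            let data = self.0.borrow();
            (data.0.clone(), data.1.clone())
        };
        let value = eval(&body, &env);
        *self.0.borrow_mut() = (Term::Num(value), Env::new());
        value
    }
}

fn eval(term: &Term, env: &Env) -> i64 {
    match term {
        Term::Num(n) => *n,
        // Going through a variable costs an environment lookup.
        Term::Var(name) => env.get(name).expect("unbound variable").force(),
        Term::Add(lhs, rhs) => eval(lhs, env) + eval(rhs, env),
        // A Closure node is followed directly, with no lookup.
        Term::Closure(thunk) => thunk.force(),
    }
}

// Old style (assumed shape): store the closure under a generated variable
// and insert it into a target environment that must be threaded around.
fn closurize_with_var(term: Term, local: &Env, target: &mut Env, fresh: &mut usize) -> Term {
    let name = format!("%{}", *fresh);
    *fresh += 1;
    target.insert(name.clone(), Thunk::new(term, local.clone()));
    Term::Var(name)
}

// New style (assumed shape): embed the pointer in the term itself.
fn closurize_direct(term: Term, local: &Env) -> Term {
    Term::Closure(Thunk::new(term, local.clone()))
}
```

In this sketch, the term returned by `closurize_direct` can be evaluated in any environment, including an empty one, whereas the term returned by `closurize_with_var` only makes sense together with the target environment that received the generated variable.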

The price to pay is that Closure now leaks into the AST, even at stages where it doesn't make sense. Ideally, having separate ASTs for parsing and for evaluation would make this cleaner, and that is the eventual plan. For now, it's an acceptable price to pay to fix some big performance issues. This change also makes a lot of code much simpler, because we no longer have to propagate and reason about nested environments for closurization. There's a lot less threading to do.

Content

  • Adds a Closure node to the AST
  • Gets rid of the share normal form transformation. This was the one creating a lot of let-bindings with generated variables. Instead, we rely on the closurized attribute of arrays and records to know whether something is a "fresh" array or record (in which case we allocate the closures for each element at first evaluation) or is already "transformed". In some sense, this PR folds the share normal form transformation directly into evaluation, removing the need to create intermediate let-bindings and allocate the corresponding slots in the environment.
  • Moves the trait Closurizable into its own module and renames it Closurize. We still use closurize because it's handy, but the signature is simpler.
  • Updates various parts of the codebase accordingly
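The role of the closurized attribute can be sketched as follows. This is a toy model under stated assumptions: `Term`, `ArrayAttrs`, and `closurize_array` are hypothetical names, environments are omitted entirely, and the `allocations` counter exists only to make the behavior observable.

```rust
use std::cell::RefCell;
use std::rc::Rc;

#[derive(Clone, Debug)]
enum Term {
    Num(i64),
    Add(Box<Term>, Box<Term>),
    // Shared cell standing in for an allocated closure (no environment here).
    Closure(Rc<RefCell<Term>>),
    Array(Vec<Term>, ArrayAttrs),
}

#[derive(Clone, Copy, Debug, Default)]
struct ArrayAttrs {
    // false: a "fresh" array straight from the source program.
    // true: every element already lives behind a closure.
    closurized: bool,
}

// Called when an array is first evaluated. A fresh array gets one closure
// allocated per element; an already closurized array is returned untouched.
fn closurize_array(term: Term, allocations: &mut usize) -> Term {
    match term {
        Term::Array(elems, attrs) if !attrs.closurized => {
            let elems = elems
                .into_iter()
                .map(|elem| match elem {
                    already @ Term::Closure(_) => already,
                    other => {
                        *allocations += 1;
                        Term::Closure(Rc::new(RefCell::new(other)))
                    }
                })
                .collect();
            Term::Array(elems, ArrayAttrs { closurized: true })
        }
        other => other,
    }
}
```

Running `closurize_array` a second time on its own result allocates nothing, and copies of the array share the same underlying cells, which is the property the let-bindings with generated variables used to provide.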

Effects

On the codebase provided for #1622, this change doesn't gain an order of magnitude, but it still seems to provide a boost of around 20-30% (it's hard to say more without a proper statistical method, because there is a lot of variance across runs; an effect of SipHash?). It's noticeable without being game-changing.

However, on other examples, the gain is drastic. On an example reported by another user in the online chat, this program:

let things = std.array.range 0 3000
    |> std.array.map (fun number => {
        a = 0,
        b = 1,
        c = 2,
    })
in

things
    |> std.array.flat_map std.record.values
    |> std.array.at 0

was basically taking forever (it ran for at least a solid minute and hadn't finished when it was interrupted). After this change, it takes between 2 and 3 seconds (still a bit high, but nowhere near the previous numbers).

Follow-up

One medium-term follow-up would be to split the AST into (1) a concrete AST with no mention of closures, and (2) a runtime AST with versions of Array and (Rec)Record which enforce that their elements are stored inside closures.
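As a rough illustration of what such a split could look like, here is a minimal sketch in which the type system enforces the invariant. The names `Ast`, `RuntimeTerm`, `Thunk`, and `lower` are assumptions made up for this example and do not describe a planned design.

```rust
use std::cell::RefCell;
use std::rc::Rc;

// Concrete AST, as produced by the parser: closures cannot be expressed.
#[derive(Debug)]
enum Ast {
    Num(i64),
    Array(Vec<Ast>),
}

// Shared, updatable cell holding a runtime term.
#[derive(Clone, Debug)]
struct Thunk(Rc<RefCell<RuntimeTerm>>);

// Runtime AST: array elements can only be thunks, so the "closurized"
// invariant holds by construction and needs no runtime flag.
#[derive(Clone, Debug)]
enum RuntimeTerm {
    Num(i64),
    Array(Vec<Thunk>),
}

// Translate the concrete AST into the runtime one, allocating a thunk
// for every array element along the way.
fn lower(ast: &Ast) -> RuntimeTerm {
    match ast {
        Ast::Num(n) => RuntimeTerm::Num(*n),
        Ast::Array(elems) => RuntimeTerm::Array(
            elems
                .iter()
                .map(|elem| Thunk(Rc::new(RefCell::new(lower(elem)))))
                .collect(),
        ),
    }
}
```

With this shape, a Closure node never appears at parsing time, and evaluation code never has to check whether an array has already been transformed.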

@yannham force-pushed the optimization/internal-closure branch 2 times, most recently from b052e49 to a575a75 (October 13, 2023 10:59)
@yannham force-pushed the optimization/internal-closure branch from a575a75 to 9529727 (October 13, 2023 16:10)
@yannham marked this pull request as ready for review (October 13, 2023 16:15)

@jneem left a comment

I'm definitely lacking some context here, but I did have a look at least...

Review threads:
core/src/closurize.rs (outdated, resolved)
core/src/eval/fixpoint.rs (resolved)
core/src/eval/merge.rs (outdated, resolved)
core/src/term/mod.rs (outdated, resolved)
core/src/term/mod.rs (outdated, resolved)
core/src/eval/fixpoint.rs (outdated, resolved)

@yannham force-pushed the optimization/internal-closure branch from ac447d6 to d2557aa (October 16, 2023 10:00)
@yannham added this pull request to the merge queue (Oct 16, 2023)
Merged via the queue into master with commit 2a727ba (Oct 16, 2023)
5 checks passed
@yannham deleted the optimization/internal-closure branch (October 16, 2023 12:02)