Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

_apply_in_world builtin #35844

Merged
merged 2 commits into from
Jul 3, 2020
Merged

_apply_in_world builtin #35844

merged 2 commits into from
Jul 3, 2020

Conversation

c42f
Copy link
Member

@c42f c42f commented May 12, 2020

This implements a new builtin Core._apply_in_world to allow Julia code to be run in a frozen world age (when combined with Base.get_world_counter()).

This is more general than Core._apply_latest, so we could remove _apply_latest (or replace with a simple shim which calls _apply_in_world with world=typemax(UInt)) if people think that's a good idea.

Motivation

The original motivation for this was to have a way to freeze the world age of Julia code implementing the Julia parser - see note at #35243 (comment).

Using a fixed world should be beneficial for infrastructure code which runs in a user's julia process, but which is otherwise not expected to be modified by the user. There's two benefits:

  • Users can't accidentally break the basic infrastructure of the language. For example, breaking the Julia parser breaks pretty much everything in the REPL. Likewise, breaking Revise pretty much breaks the user's session.
  • Method invalidation for infrastructure packages like Pkg will not slow down the user experience when the user loads $random_package into their session.

Note that world age is dynamically scoped, so fixed world would only apply when package code is entered through an _apply_in_world shim. Therefore devs of packages which choose to use this for deployment can still use a Revise-based workflow for developing their packages.

Possible usage scenarios

  • When replacing the flisp parser with CSTParser, I'd like to use this to ensure user mistakes don't take down the whole session.
  • Revise.jl is basic infrastructure but method invalidation slows down load time. @timholy has recently fixed this through heroic efforts, but using a fixed world may be a lot easier and more future proof. Same considerations likely apply to JuliaInterpreter/JuliaDebugger.
  • Pkg also suffers from method invalidation so this may be of interest for the pkg> REPL mode, run pkg> operations in subprocess? Pkg.jl#1816
  • Any in-process infrastructure for code editor support, etc, may also benefit from this

Questions / TODO

As public API, I've considered a callable ApplyInWorld function wrapper (and maybe an API Base.freeze_world(f) to create it) which would capture a function and Base.get_world_counter(). There might be other options though? I considered making inference understand the builtin so that the ApplyInWorld shim could be inferred, but in discussions with @Keno and @vtjnash, it seemed there were difficulties with this. For one, Jameson thought capturing the world age in the type parameters of ApplyInWorld would break subtyping in some way. For two, I was hoping this would help solve the problem that GeneralizedGenerated.jl solves, but Keno says that's not the case because inference can fundamentally only see older world ages. Actually I'm still a little confused by this because the newer world methods would be part of the global state so in principle accessible during compilation of an older world; but perhaps it just breaks fundamental inference invariants I don't understand yet.

A few todos:

  • Consider the public API.
  • Replace _apply_latest builtin with this Reverted to avoid an extra allocation
  • As I understand it, the world age isn't unique in precompilation. What complexities does this cause? Does it prevent ApplyInWorld wrappers from being saved during precompilation? (Edit: yes)
  • In general, is it safe to capture the world age UInt or does this create an implicit dependency on old methods which may be removed from internal caches? Do we need to lift this implicit dependency into an explicit one? (Edit: Should be ok?)

@vtjnash I'd greatly value your input on subtleties I might have missed, especially on the last two questions above.

@c42f
Copy link
Member Author

c42f commented May 12, 2020

In discussion with Keno and Jeff, it became clear that the way precompile works (by collapsing world ages) means it's clear that nobody should be capturing the internal world age UInt during precompile.

I suppose we might address this by introducing an opaque WorldAge token type which can be treated in a special way when deserializing precompiled packages. Without something like this it seems hard to create a reliable ApplyInWorld wrapper people can actually use.

@vtjnash
Copy link
Sponsor Member

vtjnash commented May 12, 2020

This'll be great to have for debuggers that want to simulate running generated functions.

I'm not sure we want to add yet another allocation to the _apply_latest code path, though I realize it's already somewhat slow.

The point of precompile is that most likely the same world does not even exist after deserializing, and so it's taking the work that happened with a different view of the world and transforming it to be applicable to the new world. It's not just the value that can't be captured, but the whole state behind it that has no analogue (in the current implementation).

@c42f
Copy link
Member Author

c42f commented May 12, 2020

yet another allocation to the _apply_latest code path

Do you mean for the boxed UInt world argument?

I'm happy just to revert that part and live with a small amount of code duplication if people prefer.

@c42f
Copy link
Member Author

c42f commented May 21, 2020

I added non-exported apply_in_world in analogy to applylatest as a more convenient way to call _apply_in_world and as a place to put the documentation.

I also reverted to using the existing _apply_latest implementation to avoid an extra allocation — this also fixed the tests.

So from my perspective this is in a state where it could be merged. Also it would be great if people could try it out to see whether it helps their use cases in practice and we can iterate a bit more, if desired.

@c42f
Copy link
Member Author

c42f commented Jul 2, 2020

@pfitzseb This is the PR I referred to which may help running a portion of text editor infrastructure in-process, without the user being able to change its functionality or cause latency spikes due to method invalidation.

@pfitzseb
Copy link
Member

pfitzseb commented Jul 2, 2020

Cool! I'm probably not qualified to comment on the implementation, but it does look correct to me ;)

New builtin Core._apply_in_world to allow frozen-world APIs to be
implemented in pure Julia code. Also add an internal, experimental API
Base.invoke_in_world() in analogy to Base.invokelatest() to make this
easier to use (especially for keyword args) and to give us a place to
put some documentation.
@c42f
Copy link
Member Author

c42f commented Jul 3, 2020

Ok I've rebased to fix some conflicts, and simplified as much as possible by removing the current_world builtin for now — while it's nice in principle, it seems to add complexity while also forcing some additional boxing. So I didn't find that very satisfying.

I plan to merge when tests pass so that people can try this in practice.

base/essentials.jl Outdated Show resolved Hide resolved
Users shouldn't construct world ages themselves, only use the return type from get_current_world which is currently a UInt.

Co-authored-by: Jameson Nash <[email protected]>
@c42f c42f merged commit f9496fb into master Jul 3, 2020
@c42f c42f deleted the cjf/apply-in-world branch July 3, 2020 09:14
@c42f
Copy link
Member Author

c42f commented Jul 3, 2020

Thanks for taking another quick look Jameson 👍

I hope this can be of some use to people!

@c42f
Copy link
Member Author

c42f commented Jul 8, 2020

Over at JuliaLang/Pkg.jl#1897, I was discussing how to use this with @timholy, but I thought I'd bring the answer here as it's relevant:

the frozen-world issue does seem like a disadvantage

It's a matter of perspective.

I think the litmus test is thus: "would you be happy if the pkg> repl mode ran in a separate process from user code?"

Easy answer: if I can Revise it, I'm happy---if not, not.

Yes, you can revise it in two different ways. Let me explain with the following code:

# Suppose foo() is some library function

foo() = "First implementation of foo"
world1 = Base.get_world_counter()

# Suppose `bar()` is a public entry point to foo(), which wraps it to run in the user's session
# but provides a degree of separation.
# (For example, this could be the pkg REPL mode):

_entry_point_world = Base.get_world_counter()
function bar()
    Base.invoke_in_world(_entry_point_world, foo)
end

@show foo()
@show bar()

# First way to work with the code interactively: call the implementation foo()
println("Now, revise foo:")
foo() = "Second implementation of foo"
@show foo()
@show bar()

# Second way to work with the code interactively: update the world age of the entry point
println("The public entry point world age can also be updated:")
_entry_point_world = Base.get_world_counter()
@show foo()
@show bar()

nothing

Output from this code:

foo() = "First implementation of foo"
bar() = "First implementation of foo"
Now, revise foo:
foo() = "Second implementation of foo"
bar() = "First implementation of foo"
The public entry point world age can also be updated:
foo() = "Second implementation of foo"
bar() = "Second implementation of foo"

So you see you've got quite a lot of flexibility here.

@c42f
Copy link
Member Author

c42f commented Jul 8, 2020

To state what's going on here in a different way: the world age is already dynamically scoped in nature. This means that you as the caller have the power to choose which world a computation runs in and invoke_in_world exposes this. Conversely, libraries should not in general use invoke_in_world, but leave this to the application layer. The REPL is the application layer for interactive julia sessions, so it's probably generally sensible for REPL mode tooling to be able to use invoke_in_world.

@tkf
Copy link
Member

tkf commented Jul 8, 2020

the world age is already dynamically scoped in nature

I'm just curious how this interacts with tasks so I played with it a bit:

foo() = "First implementation of foo"
function bar()
    Base.invoke_in_world(_entry_point_world, foo)
end

afoo() = fetch(@async foo())
function abar()
    Base.invoke_in_world(_entry_point_world, afoo)
end

# Note `@async` creates a closure so the counter has to be fetched after that.
_entry_point_world = Base.get_world_counter()

@show foo()
@show bar()
@show afoo()
@show abar()

println("Now, revise foo:")
foo() = "Second implementation of foo"

@show foo()
@show bar()
@show afoo()
@show abar()

This prints

foo() = "First implementation of foo"
bar() = "First implementation of foo"
afoo() = "First implementation of foo"
abar() = "First implementation of foo"
Now, revise foo:
foo() = "Second implementation of foo"
bar() = "First implementation of foo"
afoo() = "Second implementation of foo"
abar() = "Second implementation of foo"

So, I guess the world age is very close to dynamic scoping although not quite until we have #35690?

@c42f
Copy link
Member Author

c42f commented Jul 8, 2020

So, I guess the world age is very close to dynamic scoping although not quite until we have #35690?

Excellent point. Yet more evidence to favor the solution proposed there.

@tkf
Copy link
Member

tkf commented Jul 8, 2020

Okay. I just recorded this argument in #35690 (comment).

@timholy
Copy link
Sponsor Member

timholy commented Jul 8, 2020

Thanks for the explanation, @c42f. However, note my capitalization: I said I want to Revise it, not revise it.

As it stands, if you edit foo in your source code, all direct callers of foo will see the new one, but:

  • if you hadn't prepared the way by assigning the required world age to a non-const global, it wouldn't have been possible to change the version of foo called by bar by any means short of redefining bar itself
  • even if you do make such preparations, Revise will have a hard time knowing which global constant to modify in order to have the changes take effect.

Cool feature, but if it's widely used this makes me nervous about maintaining Revise going forward.

@c42f
Copy link
Member Author

c42f commented Jul 8, 2020

Cool feature, but if it's widely used this makes me nervous about maintaining Revise going forward.

I don't think this is a problem: for normal use with Revise.jl, the key is that libraries do not use invoke_in_world — that's what I was trying to emphasize in #35844 (comment). Instead invoke_in_world should be reserved as a choice for when libraries are put together into a complete application.

Pkg isn't the best example, but I'd say the normal APIs like Pkg.add should not use invoke_in_world, because those APIs can be used by applications other than the REPL. Conversely, the package REPL mode is a part of the Julia REPL application and can use invoke_in_world to wrap around the standard package API.

simeonschaub pushed a commit to simeonschaub/julia that referenced this pull request Aug 11, 2020
New builtin `Core._apply_in_world` to allow frozen-world APIs to be
implemented in pure Julia code. Also add an internal, experimental API
`Base.invoke_in_world()` in analogy to `Base.invokelatest()` to make this
easier to use (especially for keyword args) and to give us a place to
put some documentation.
@timholy
Copy link
Sponsor Member

timholy commented Jan 7, 2023

It seems like it would be nice to have something that would act like @__WORLD__ and extract the world age at the time of inference, except which gets serialized during precompilation as the macro rather than the integer value. That way when deserializing packages I think you could use the world age in the user's running system rather than the age in the precompiling process.

I think this is fairly readily implementable: perhaps we could create

struct FrozenWorld
    world::UInt
end

and:

  • allow invoke_in_world to accept FrozenWorld values (for which it just strips the wrapper)
  • add some new fixup rules in the pkgimage deserializer (src/staticdata.c) to update the value stored in FrozenWorld objects to the current world age.
  • @__WORLD__ just returns FrozenWorld(get_world_counter())

@vtjnash
Copy link
Sponsor Member

vtjnash commented Jan 7, 2023

Most of the worlds encountered during precompile do not exist after deserialization, due to reordering and compression. We could have a mutable struct that updates the value, but what value would it have and what would usage look like?

@timholy
Copy link
Sponsor Member

timholy commented Jan 8, 2023

My thought is that we validate edges when we load the package, so any precompiled FrozenWorld would adopt the world age at time of package loading. I guess the concern would be, what if some calls relied on compilations that had been invalidated in that world? Will it recompile those dependencies with that world age?

Example of usage: Revise could fix the world age in its internals to prevent invalidation by packages that the user loads. Thus acting kind of like Core.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants