Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Officially support requiring subparts of concurrent-ruby, e.g., require 'concurrent/map' #934

Closed
eregon opened this issue Jan 19, 2022 · 9 comments · Fixed by #980 or #982
Closed
Assignees
Labels
enhancement Adding features, adding tests, improving documentation.

Comments

@eregon
Copy link
Collaborator

eregon commented Jan 19, 2022

Currently, concurrent-ruby loads every file of the gem eagerly (like most gems actually, this is more reliable than manually setting up autoloads everywhere which is error-prone).
It would be nice footprint-wise if only the files needed for the usages are loaded.

The best way to do that seems to adopt https://github.com/fxn/zeitwerk.
That would mean dropping support for Ruby < 2.5, which should probably be discussed in a separate issue first.

From a discussion with Petr and @chrisseaton.

@eregon eregon added the enhancement Adding features, adding tests, improving documentation. label Jan 19, 2022
@sambostock
Copy link
Contributor

I notice adding a runtime dependency on Zeitwerk would require breaking

The design goals of this gem are:

  • ...
  • Remain free of external gem dependencies

@eregon
Copy link
Collaborator Author

eregon commented Jan 31, 2022

That's true, it might be worth adapting that goal though.

@byroot
Copy link
Contributor

byroot commented Mar 1, 2022

The problem with this idea is that you can easily end up in a situation where one of the concurrent-ruby classes is only used at runtime. Assuming we're talking about a web application, that means the code will be loaded during the first request cycle that uses it, defeating Copy on Write (for most MRI setups) and also busting the global constant cache (big deal for MRI right now, may change in the future).

@eregon
Copy link
Collaborator Author

eregon commented Mar 2, 2022

If the goal is to optimize COW fork / avoiding autoloading then one should eagerly load, Zeitwerk has a mode for that: https://github.com/fxn/zeitwerk#global-eager-load
Of course that means everything (e.g., all concurrent-ruby classes) will be loaded eagerly, but that seems generally the case for eager loading.
So, this should just work if Rails uses Zeitwerk::Loader.eager_load_all in production, isn't it?

@byroot
Copy link
Contributor

byroot commented Mar 2, 2022

My point is that right now, when you use one of the concurrent-ruby component, you explictly require it, which means you get both:

  • Only load what you use
  • Everything you use is eager loaded.

If you were to migrate to Zeitwerk, you'd have to choose between loading things that are not used, vs maybe not eagerloading things.

So I don't think autoloading (be it with zeitwerk or Kernel#autoload) is a good fit for libraries such as concurrent-ruby which brings lots of code but where most people only pick a couple classes in it.

@eregon
Copy link
Collaborator Author

eregon commented Mar 2, 2022

when you use one of the concurrent-ruby component, you explictly require it

That's what is explicitly not supported currently: https://github.com/ruby-concurrency/concurrent-ruby#usage
And the reason for that AFAIK is it's pretty hard to load the necessary dependencies and test it for all possibilities (i.e., test that each file can be loaded on its own in a separate process, and all methods defined in the loaded files work).

In my understanding Zeitwerk solves this issue of loading the necessary dependencies automatically.
I thought Zeitwerk would still allow e.g. require 'concurrent/map', but I guess that actually doesn't work with Zeitwerk (i.e., it won't take care of loading file dependencies).

I think the autoloading/constant invalidation/COW issue is far more general, it seems very likely an application will load extra classes on the first request, unless everything is eager loaded.

@byroot
Copy link
Contributor

byroot commented Mar 2, 2022

That's what is explicitly not supported currently

Well, somehow it works because I've never seen any code requiring it all.

In my understanding Zeitwerk solves this issue of loading the necessary dependencies automatically.

What Zeitwerk does it to automatically setup Kernel#autoload for all the files inside your lib/ directory, and does so recursively when files are loaded. So yes you can mix it with require.

@eregon
Copy link
Collaborator Author

eregon commented Jan 8, 2023

I guess a common way to support specific requires is to run each of the spec in isolation, i.e., in a separate process, like here, and make the specs require only the files they need. That should help to catch missing requires.
It's not as safe or convenient as zeitwerk but it does avoid the dependency.

@eregon eregon self-assigned this Jan 8, 2023
@eregon eregon changed the title Provide lazy-loading of files based on actual usages using Zeitwerk Officially support requiring subparts of concurrent-ruby, e.g., require 'concurrent/map' Jan 8, 2023
eregon added a commit to eregon/concurrent-ruby that referenced this issue Jan 8, 2023
@eregon
Copy link
Collaborator Author

eregon commented Jan 8, 2023

I started work on this: #980
It's not trivial, I already found a cycle:

Currently:

  • lib/concurrent-ruby/concurrent/synchronization.rb loads C exts
  • require 'concurrent/synchronization/*' seems not supported, too deep, and relies on require 'concurrent/synchronization' instead (seems OK).

A cycle:
AtomicMarkableReference -> Synchronization::Object

Synchronization::Object (synchronization/object.rb) -> AtomicReference (through generated code for #initialize)
AtomicReference -> MutexAtomicReference (but only if non-cruby-ext/non-jruby/non-truffleruby)
MutexAtomicReference -> Synchronization::LockableObject
Synchronization::LockableObject -> MutexLockableObject, JRubyLockableObject, MonitorLockableObject
MutexLockableObject -> AbstractLockableObject
AbstractLockableObject -> Synchronization::Object

eregon added a commit to eregon/concurrent-ruby that referenced this issue Jan 8, 2023
eregon added a commit to eregon/concurrent-ruby that referenced this issue Jan 8, 2023
eregon added a commit to eregon/concurrent-ruby that referenced this issue Jan 8, 2023
eregon added a commit to eregon/concurrent-ruby that referenced this issue Jan 8, 2023
eregon added a commit to eregon/concurrent-ruby that referenced this issue Jan 11, 2023
eregon added a commit to eregon/concurrent-ruby that referenced this issue Jan 11, 2023
eregon added a commit to eregon/concurrent-ruby that referenced this issue Jan 11, 2023
eregon added a commit to eregon/concurrent-ruby that referenced this issue Jan 11, 2023
eregon added a commit to eregon/concurrent-ruby that referenced this issue Jan 11, 2023
eregon added a commit to eregon/concurrent-ruby that referenced this issue Jan 11, 2023
eregon added a commit to eregon/concurrent-ruby that referenced this issue Jan 12, 2023
eregon added a commit to eregon/concurrent-ruby that referenced this issue Jan 12, 2023
eregon added a commit to eregon/concurrent-ruby that referenced this issue Jan 12, 2023
eregon added a commit to eregon/concurrent-ruby that referenced this issue Jan 12, 2023
eregon added a commit to eregon/concurrent-ruby that referenced this issue Jan 12, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement Adding features, adding tests, improving documentation.
Projects
None yet
3 participants