Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Remove unbuffered mode #414

Closed
bdice opened this issue Jan 2, 2021 · 6 comments
Closed

Remove unbuffered mode #414

bdice opened this issue Jan 2, 2021 · 6 comments
Labels
enhancement New feature or request refactor Code refactoring

Comments

@bdice
Copy link
Member

bdice commented Jan 2, 2021

Feature description

Currently we support an "unbuffered" mode in signac-flow that disables signac's buffering capabilities, by disabling the config option use_buffered_mode (it's on by default). I don't know if it still makes sense to support an unbuffered mode because we require signac >= 1.0. I believe buffering was initially an optional feature (rather than always-on) because signac didn't support buffering in some earlier versions. Our CI takes a long time to run and testing both options (on/off) seems unnecessary at this point.

Proposed solution

The option defaults to True, and I think this feature could be immediately removed with no deprecation cycle -- in my understanding, it's basically a flag that disables an important performance optimization. However, I don't want to immediately remove it from the config validator (that's more of a breaking change than just requiring buffered mode). I suggest the following.

Deprecation:

  • Change the behavior of FlowProject._potentially_buffered to warn if use_buffered_mode is specified and ignore the option (always buffer).
  • Remove the tests for unbuffered projects.

Removal (after 2 minor versions):

  • Remove use_buffered_mode option from the config validator (config files with the option specified are no longer valid).
  • The FlowProject._potentially_buffered context manager could be removed and replaced with: with signac.buffered(): or we could just change the name to FlowProject._buffered.

Before anyone begins work on this issue, I would like to get input from @glotzerlab/signac-committers.

@bdice bdice added enhancement New feature or request refactor Code refactoring labels Jan 2, 2021
@atravitz
Copy link
Collaborator

atravitz commented Jan 2, 2021

Doing a short deprecation cycle for this seems reasonable to me 👍

@vyasr
Copy link
Contributor

vyasr commented Jan 4, 2021

The main hesitation I would have with changing this immediately is relating to glotzerlab/signac#397. It should be relatively easy to modify the new code for glotzerlab/signac#249 to handle thread parallelism transparently (and I plan to do this), but I don't know of any way to make it intrinsically safe for process parallelism. If a signac-flow user sets the status_parallelization option to 'process', I think signac-flow will have to be responsible for doing the update in a parallel-safe manner. It's possible that future refactorings in flow will make this easier to achieve (or a moot point), but so long as that limitation exists the options are:

  1. Always disable buffering when using process-parallel status updates. This will always be safe, but may end up making things slow enough that process parallelism becomes useless. Since switching to process-parallel status updates indicates that the user has identified processing rather than I/O as the bottleneck, there's no way to know for sure without testing on a particular workflow.
  2. Always set a massive (or unlimited) buffer size when using process-parallel status updates. This would mostly work, but could silently fail once there's enough data, or in the unlimited case lead to deadlock at some point if there's a massive number of I/O requests at once (Lustre filesystems in particular can have this problem).
  3. Simply include user warnings to the effect that process-parallel status updates may not be safe for large documents. This solution is just asking for users to shoot themselves in the foot, though.
  4. Allow the user to disable buffering using the flag in this issue.

Option (4) isn't great, but none of the others currently are either so we may need to consider that possibility before deciding whether or not to deprecate this.

@bdice
Copy link
Member Author

bdice commented Jan 4, 2021

@vyasr Fetching scheduler status is done before the parallel section begins. That’s the only part that writes to the project document. The parallel portion of the code is read-only, evaluating conditions and determining eligibility.

@vyasr
Copy link
Contributor

vyasr commented Jan 4, 2021

Ah right. So in that case, am I correct that status_parallelization='thread' is useful only if users write I/O-bound label functions (which is admittedly not unlikely if people are using job documents etc)? Is the call to _fetch_scheduler_status not done in buffered mode? Did I change it to just update the cached_status?

@vyasr
Copy link
Contributor

vyasr commented Jan 4, 2021

I discussed my previous comment offline with @bdice to explain my point a little more clearly, and I took a look at the code. Tl;dr, I'm fine with deprecating this flag.

My point here was that since we default to thread-parallel execution of job/group label acquisition, these tasks are only accelerated to the extent that they are I/O-bound because Python is inherently single-threaded. If label functions are expensive because they're actually doing some significant amount of processing, then process parallelism is the way to go. Having said that, all of this parallel code happens after the serial writing of the scheduler status to the project document as Bradley mentioned above, so I think we're fine to move forward with this deprecation.

@bdice bdice mentioned this issue Jan 13, 2021
12 tasks
@bdice
Copy link
Member Author

bdice commented Jan 20, 2021

I'm going to mark this issue as resolved - the design decisions have all been made and implemented in #425, and the only remaining task is to remove the deprecation warning (and remove the default value from the flow config monkeypatch) in a later release.

@bdice bdice closed this as completed Jan 20, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request refactor Code refactoring
Projects
None yet
Development

No branches or pull requests

3 participants