Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Resolve ambiguous system order bugs in default systems #1466

Closed
alice-i-cecile opened this issue Feb 17, 2021 · 18 comments
Closed

Resolve ambiguous system order bugs in default systems #1466

alice-i-cecile opened this issue Feb 17, 2021 · 18 comments
Labels
A-Core Common functionality for all bevy apps C-Bug An unexpected or incorrect behavior P-High This is particularly urgent, and deserves immediate attention
Milestone

Comments

@alice-i-cecile
Copy link
Member

Implicit dependencies were replaced with explicit ones in #1144. While this is a great choice, it's resulted in numerous flaky behaviors in the engine due to ambiguous system ordering.

So, I think we can probably get away with 6 (PreUpdate, Update and PostUpdate + the startup stage equivalent) core stages total for 0.5 honestly.

You only need more stages when you need to process chains of systems that rely on command processing to work, and I would be surprised if we have very long chains of that anywhere yet.

So, as an overall strategy:
0. Label all of the systems.

  1. Place the systems that generate commands.
  2. Place the systems that consume entities / components that would be generated by these commands in the stage after they're made.
  3. Toss all of the other systems in Update, since this allows users to run systems in stages before / after them without needing to make new stages.
  4. Resolve all of the ambiguities, either with a dependency or a permitted ambiguity (this functionality would have to be built out on at least a pairwise basis).
  5. Ensure that change-detecting systems come after things that can generate changes for them if we haven't solved System-order-independent ECS change tracking #68.
@alice-i-cecile alice-i-cecile added C-Bug An unexpected or incorrect behavior A-Core Common functionality for all bevy apps P-High This is particularly urgent, and deserves immediate attention labels Feb 17, 2021
@alice-i-cecile alice-i-cecile added this to the Bevy 0.5 milestone Feb 17, 2021
@alice-i-cecile
Copy link
Member Author

This script could be handy to get shorter, prettier system names as part of the process: https://github.com/jakobhellermann/pretty-type-name/blob/main/src/lib.rs

@alice-i-cecile
Copy link
Member Author

In order to solve this we'll want very basic explicit ambiguities, in order to reduce the noise produced by the ambiguity checker. Pairwise ambiguities will be required as part of any solution, and are the most general tool, so I recommend we implement those.

This should be pretty simple, hook into the system descriptor code, add another field for a vec of ambiguous labels and then add a method that defines the system as ambiguous with another system with the chosen label. Then, read that field in the ambiguity checker and quash any matching warnings.

I'm generally not sure I love having these be defined on a per-system basis (which one do you put it on?), but it's consistent with the .before and .after API and should be quick to do.

@Ratysz
Copy link
Contributor

Ratysz commented Feb 18, 2021

Ideally, this should be done after #1423.

  1. Ensure that change-detecting systems come after things that can generate changes for them if we haven't solved System-order-independent ECS change tracking #68.

I think we want that even if we have resolved the issue. The fact that pre-#1144 Bevy worked at all means that we probably don't have an unbreakable cycle of systems relying on each other's changes being detected via this mechanism, so we should still be able to form a valid graph.

@TheRawMeatball
Copy link
Member

So, I think we can probably get away with 6 (PreUpdate, Update and PostUpdate + the startup stage equivalent) core stages total for 0.5 honestly.

I took a look at how many stages we really need, and I think a single update stage might be enough.

The only engine system that uses commands inside the main game loop is the hierarchy maintanence system, and there are only three exclusive systems that can't be replaced with NonSend systems, which are the render system, the tracker clearing system and a diagnostic system. All three of these could be placed as exclusive systems at the end of the update stage, and sorted using labels like normal systems.

@inodentry
Copy link
Contributor

With the current stages, the question of where user systems should go relative to engine systems is solved and straightforward: user systems go in UPDATE by default, which means they get sandwiched between engine systems in prior stages (PRE_UPDATE, etc.) and later stages (POST_UPDATE, etc.).

If we go "stageless" (have only one engine stage as you suggest), we will need to come up with some new solution to this problem. We need a mechanism to ensure that user systems are (by default) added in a way that makes sense, relative to all the engine systems.

@TheRawMeatball
Copy link
Member

There are major performance benefits to doing things with fewer stages, but yes, we'll need a way to create sane defaults for adding systems.

@Ratysz
Copy link
Contributor

Ratysz commented Feb 18, 2021

In order to solve this we'll want very basic explicit ambiguities, in order to reduce the noise produced by the ambiguity checker.

#1469 should help.

@alice-i-cecile
Copy link
Member Author

With the current stages, the question of where user systems should go relative to engine systems is solved and straightforward: user systems go in UPDATE by default, which means they get sandwiched between engine systems in prior stages (PRE_UPDATE, etc.) and later stages (POST_UPDATE, etc.).

If we go "stageless" (have only one engine stage as you suggest), we will need to come up with some new solution to this problem. We need a mechanism to ensure that user systems are (by default) added in a way that makes sense, relative to all the engine systems.

Automatic labeling plus thoughts about archetype invariants (from @aevyrie and @TheRawMeatball) should go a long way here. The engine could specify that everything with component A gets a particular label, which ensures that it runs in a sensible order relative to the default systems.

Not feasible for 0.5 though since it relies on the many-to-many labels.

@alice-i-cecile
Copy link
Member Author

There are major performance benefits to doing things with fewer stages, but yes, we'll need a way to create sane defaults for adding systems.

As a stop gap for this, I think we should have three main stages for 0.5. All of the stuff that user systems should usually run before should live in PostUpdate, and all of the stuff it should usually run after should run in PreUpdate. We lose a bit of parallelization, but preserve the prototyping and new user experience.

@cart
Copy link
Member

cart commented Feb 18, 2021

The engine could specify that everything with component A gets a particular label, which ensures that it runs in a sensible order relative to the default systems.

This is a power concept, but it would also make order extremely hard to reason about as a user.

As a stop gap for this, I think we should have three main stages for 0.5. All of the stuff that user systems should usually run before should live in PostUpdate, and all of the stuff it should usually run after should run in PreUpdate. We lose a bit of parallelization, but preserve the prototyping and new user experience.

Yeah I think this is a solid goal. Its worth attempting that and changing course if / when we hit a wall.

@alice-i-cecile
Copy link
Member Author

The engine could specify that everything with component A gets a particular label, which ensures that it runs in a sensible order relative to the default systems.

This is a power concept, but it would also make order extremely hard to reason about as a user.

Agreed. I'd be pretty firmly against adding this until we have some form of schedule visualization.

@Ratysz
Copy link
Contributor

Ratysz commented Feb 18, 2021

#1469 and #1473 (née #1423) have been merged, this should be able to proceed now.

@alice-i-cecile
Copy link
Member Author

alice-i-cecile commented Feb 23, 2021

Important use case to test for:

  1. We have an explicit dependency that connects two systems in different plugins.
  2. The end user only includes one of those two plugins.

This should not just break with "no system with that label is found". And ideally it shouldn't force the user to reconstruct the plugins from scratch to work around this.

The latter suggestion is also terrible, because you actually may not even be able to manually reproduce what a plugin is doing, since not all of its members will be pub.

We can avoid this by making the plugin-external labels that each plugin depends on publicly visible and configurable (i.e. Box<dyn SystemLabel>. This would allow end users to replace various critical components without forcing the end user to modify / reproduce / fork the plugin itself.

As a follow-up to that idea: we should extend before and after with optionally_before / after, which can take an Option<T: SystemLabel>, to allow for simply not specifying a before / after label if no replacement system exists.

@Ratysz
Copy link
Contributor

Ratysz commented Feb 24, 2021

Or we don't hard crash on missing labels and simply report that, while algorithms continue to evaluate the graph as if that dependency didn't exist. Maybe also a way to remove references to a particular label, as an escape hatch for silencing those errors.

Related, when removing systems from the schedule, the label of that system should be cleanly removed from everywhere; naturally, removal method should come with a matching replace/"override" that doesn't touch the labels. We don't have any way to remove systems yet, but I don't think there are any obstacles to that.

@Ratysz
Copy link
Contributor

Ratysz commented Mar 10, 2021

Or we don't hard crash on missing labels and simply report that, while algorithms continue to evaluate the graph as if that dependency didn't exist.

Done as part of #1576.

@Ratysz
Copy link
Contributor

Ratysz commented Mar 12, 2021

Partially resolved by #1606. Current solution for ambiguity checker noisiness is "let it scream" - something more satisfying will require better tools, most likely some form of mass ambiguity set assignment.

@cart
Copy link
Member

cart commented Mar 17, 2021

I think the "bugs" have been resolved at this point. Whether or not we resolve the "non-bug ambiguities" is an open question, but not something I think we need to block 0.5 on. Thoughts?

Edit: im calling this out because this issue is in the 0.5 milestone

@alice-i-cecile
Copy link
Member Author

I think the "bugs" have been resolved at this point. Whether or not we resolve the "non-bug ambiguities" is an open question, but not something I think we need to block 0.5 on. Thoughts?

This is correct IMO; this can be closed and we can make a new issue for broader interop questions as it comes up.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-Core Common functionality for all bevy apps C-Bug An unexpected or incorrect behavior P-High This is particularly urgent, and deserves immediate attention
Projects
None yet
Development

No branches or pull requests

5 participants