Use custom simulation time variants for Ogre #584
Conversation
Codecov Report
```diff
@@             Coverage Diff              @@
##      ign-rendering6     #584     +/- ##
==========================================
+ Coverage      54.33%   54.34%    +0.01%
==========================================
  Files            198      198
  Lines          20187    20220       +33
==========================================
+ Hits           10969    10989       +20
- Misses          9218     9231       +13
```

Continue to review full report at Codecov.
There is a …

For gui, the place to set time would be in this function: … You'll need to get sim time from …
That will likely cause the particle test to fail, because a 0 updateTime means the particle simulation won't move forward (unless it happens that setting the simTime in the sensors fixes it).
You mean ign gui runs in a separate thread? Fortunately I'm forcing serialization (or do you mean physics runs in one thread, Qt in another, and Ogre in another?). Otherwise, even if protected by a mutex, the sim time won't be deterministic if physics and particles are not in sync.
I'm thinking that we'll need to start calling …
The physics runs in one thread and Ogre in another (for both sensors and gui). As for actually getting the correct time for rendering on the gui side, I think the current setup should be ok. Let me know if you see any issue. First we extract data from physics / ECS in this renderUtil.UpdateFromECM call. Here let's say we store sim time into a variable …
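The hand-off described above (store sim time while extracting data from the ECS, then read it on the render thread before calling the scene's SetTime) can be sketched with a mutex-guarded buffer. The class and member names below are illustrative assumptions, not actual RenderUtil internals:

```cpp
#include <chrono>
#include <mutex>

// Minimal sketch: the ECM-update side stores the latest sim time, and the
// render thread reads it just before updating the scene. Hypothetical
// helper, not ignition API.
class SimTimeBuffer
{
public:
  // Called from the physics/ECS extraction step (e.g. UpdateFromECM).
  void Store(std::chrono::steady_clock::duration _simTime)
  {
    std::lock_guard<std::mutex> lock(this->mutex);
    this->simTime = _simTime;
  }

  // Called from the render thread before Scene::SetTime().
  std::chrono::steady_clock::duration Load()
  {
    std::lock_guard<std::mutex> lock(this->mutex);
    return this->simTime;
  }

private:
  std::mutex mutex;
  std::chrono::steady_clock::duration simTime{0};
};
```

The render thread would then do something like `scene->SetTime(buffer.Load());` at the start of its update, so both threads only ever touch the shared time under the lock.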
Force-pushed from b3961c2 to 8cfd0e1
OK so the only problem I see is this one, but I am not sure how ignition's loop works. This problem I'm going to mention doesn't seem to affect sensors though. It would appear that Gazebo works like this:

```cpp
// Thread 0: Physics
while( true )
{
    simulate( simTime );
    mutex.lock();
    sendDataToGraphics();
    mutex.unlock();
}

// Thread 1: Rendering
while( true )
{
    mutex.lock();
    receiveDataFromPhysics();
    mutex.unlock();
    ogreRender( simTime );
}
```

The only problem with this setup is the number of ticks rendering can perform. Let's say simTime = 16.66ms, i.e. after 60 ticks, 1 second has elapsed in the simulation. But let's say rendering takes 33.33ms to render in realtime, while physics takes exactly 16.66ms in realtime. The reverse can also be true.

VSync also enters into play, because VSync is used to lock the rendering framerate to that of the monitor (i.e. it can't iterate more than 60 times per second on most monitors), which keeps CPUs cool and is a desirable property while doing GUI work. But if rendering is also simulating, then VSync should be turned off (right now I don't remember if it's on or off; I can't see Gazebo explicitly un/setting it).

It doesn't appear that sensors have this problem because, if I got this correctly, sensors wait for rendering and then attempt to render serialized. However, since the initial conditions may not be synchronized (when a sensor starts updating), sensors would still be affected by the problem, but indirectly.

Perhaps I am mistaken on how the loops work though.
OK I managed to find a compromise. Because Gazebo was already sending absolute simulation time, I could do:

```cpp
stepSize = newTime - prevAbsTime;
prevAbsTime = newTime;
```

This prevents the particles in rendering from falling behind, because they can catch up. However, this is not deterministic: a single step with stepSize = 1 second is not the same as 4 steps with stepSize = 0.25 each, even if they produce very similar-looking results.

I suppose we could improve this further by ensuring graphics performs the N necessary iterations to catch up, but more synchronization would have to be in place. For the time being this is "good enough" considering that before this PR the particles were just raw garbage in terms of simulation accuracy & synchronization, but we're still a few leaps behind before we can consider the ticket closed.
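The "N necessary iterations to catch up" idea mentioned above is the classic fixed-timestep accumulator: bank the elapsed sim time and consume it in fixed-size ticks. A sketch under assumed names (nothing here is ignition API):

```cpp
// Fixed-timestep accumulator sketch: instead of one variable-size step,
// run however many fixed ticks fit into the elapsed sim time, carrying
// the remainder over to the next frame. Illustrative names only.
struct ParticleStepper
{
  double accumulator = 0.0;    // seconds of sim time not yet simulated
  double tickRate = 1.0 / 60;  // fixed step size in seconds
  int totalTicks = 0;

  // Consume elapsed sim time in fixed ticks; returns ticks run this frame.
  int Advance(double elapsedSimTime)
  {
    this->accumulator += elapsedSimTime;
    int ticks = 0;
    while (this->accumulator >= this->tickRate)
    {
      // updateParticles(this->tickRate) would be called here.
      this->accumulator -= this->tickRate;
      ++ticks;
    }
    this->totalTicks += ticks;
    return ticks;
  }
};
```

With this, a 1-second jump produces the same sequence of fixed ticks regardless of how many frames it was split across, which is exactly the determinism property the variable stepSize approach lacks.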
Force-pushed from a48b08c to c2dcd74
OK so CI caught that … The reason is simple: it tests that …

But this also created another question: it would seem … At least in gamedev lingo, "update time" means the tick rate (usually, but not always, 1 / 60 seconds), i.e. how much time elapsed between each update. It sounds like SetTime expects simulation time, not simulation update time.

As for the backwards compatibility behavior: the problem is that since (I assume) most users of ignition-rendering did not call SetTime at all (or if they did, it did nothing!), this PR will result in very different behavior.

If SetTime expects absolute simulation time, I don't see (without hacks) how not to break existing apps after upgrading ign-rendering6. Even though C++ ABI compatibility is not broken, an app that never called SetTime … While we could argue this was bad API usage, there is a saying that an API contract is sealed after its first observable behavior.
**One possible workaround**

Start SetTime at -1. When negative, ign-rendering6 will behave like before this PR. There's of course always the chance an app breaks if it calls …

**Second possible workaround**

… That way, hack-aware apps will call SetTime( -1 ) before starting to feed real simulation time. I'll wrap the hacks around …
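The -1 sentinel workaround could look roughly like the sketch below. The class and method names are hypothetical; only the idea (negative time means "legacy behavior", non-negative means "absolute sim time") comes from the discussion above:

```cpp
#include <chrono>

// Hypothetical sketch of the sentinel scheme, not the actual
// ign-rendering implementation.
class SceneTimeTracker
{
public:
  void SetTime(std::chrono::steady_clock::duration _time)
  {
    this->time = _time;
  }

  // Step size the render update should use this frame.
  std::chrono::steady_clock::duration StepSize(
      std::chrono::steady_clock::duration _legacyStep)
  {
    // Negative sentinel: the app never fed sim time, so keep the
    // pre-PR behavior of a fixed internal step.
    if (this->time < std::chrono::steady_clock::duration::zero())
      return _legacyStep;

    // Absolute sim time: derive the delta since the last update.
    auto step = this->time - this->prevTime;
    this->prevTime = this->time;
    return step;
  }

private:
  std::chrono::steady_clock::duration time{-1};  // sentinel: legacy mode
  std::chrono::steady_clock::duration prevTime{0};
};
```

An app that never calls SetTime stays in legacy mode forever; an app that starts feeding absolute sim time switches over on the first non-negative value.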
See gazebosim/gz-rendering#556 See gazebosim/gz-rendering#584 Signed-off-by: Matias N. Goldberg <[email protected]>
@iche033 I see that the latest https://build.osrfoundation.org/job/ignition_rendering-ci-pr_any-homebrew-amd64/2334/consoleFull#45593342ed30c675-ba23-4c35-b655-f5c948f97581 is failing at Scene_TEST.cc:687
I changed that test to account for the hack (if the magic value …). It appears the CI machine is failing even though the code should be explicitly ignoring that -1 and remaining at 0. The test passes on my machine. Do you know why it could be failing in the CI? It shouldn't. Possible things I can think of: …
There is a soft lock-stepping mechanism that we put in to make physics wait for sensors, using condition variables. Here we block the whole ECS / physics if a sensor needs an update but the previous rendering operation has not finished yet. Here we notify when the rendering thread is done with the update and able to continue.
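The soft lock-stepping described above can be sketched with a condition variable and a flag; this is an illustrative stand-in, not Gazebo's actual implementation:

```cpp
#include <condition_variable>
#include <mutex>

// Sketch: physics blocks until the rendering thread signals that the
// previous sensor rendering pass has finished. Hypothetical class.
class RenderBarrier
{
public:
  // Physics/ECS side: block until rendering is done, then consume the signal.
  void WaitForRender()
  {
    std::unique_lock<std::mutex> lock(this->mutex);
    // The predicate guards against spurious wakeups.
    this->cv.wait(lock, [this] { return this->renderDone; });
    this->renderDone = false;
  }

  // Rendering side: mark the update finished and wake the physics thread.
  void NotifyRenderDone()
  {
    {
      std::lock_guard<std::mutex> lock(this->mutex);
      this->renderDone = true;
    }
    this->cv.notify_one();
  }

private:
  std::condition_variable cv;
  std::mutex mutex;
  bool renderDone = false;
};
```

If the rendering thread already finished before physics arrives, the flag is set and `WaitForRender()` returns immediately, so physics only ever stalls while rendering is genuinely behind.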
I don't think we explicitly turn VSync on / off. We used to limit GUI rendering to 60 Hz and I don't think it needs to go any higher than that for simulation. As for sensors, I can see that users may want to simulate high-speed cameras going > 60 Hz, but we haven't heard any requests related to this yet.
yes like you discovered
yep that sounds good.
The homebrew CI machine is running ogre 1.x instead of ogre 2.x. Now that Mac support is becoming more complete, we probably should switch it back to ogre 2.x, at least for garden.
Ahhh!!! Good point!!! The BaseScene should account for the workaround too.
OK I pushed a new version. Hopefully this one will pass all checks.
I somehow sense you don't understand what I mean. But I think experiencing it yourself is the best way:
In other words: non-deterministic behavior. But at least simulation speed is maintained.
It would seem the linter doesn't like the ifdef around the if/else statements. It thinks it's …
Ok I tried running the steps and saw the "restart" behavior. I noticed on the GUI rendering side `if (!this->LegacyAutoGpuFlush())` evaluates to …
Ok various things:
That's what I'm trying to explain! IT DOES look different, because the particle FX system is trying to simulate 10 seconds in one tick. In pseudo code:

```cpp
// Ideal:
const float timeToSimulate = 10 seconds;
const int numTicks = floor( timeToSimulate / tickRate );
for( int i = 0; i < numTicks; ++i )
    updateParticles( tickRate );

// What is happening when you relaunch the GUI with a simulation of 10 seconds:
const float timeToSimulate = 10 seconds;
updateParticles( timeToSimulate ); // Just one tick with 10.0 as tick rate
```

To fix this I would have to analyze whether it's possible to detach the particle FX simulation from rendering and force N ticks before rendering. Ideally, particle FX simulation should happen as part of Gazebo's simulation, not in rendering.
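A toy, runnable illustration (not ignition code) of why one 10-second tick is not equivalent to many small ticks: explicit-Euler integration of a simple velocity drag diverges badly when the step is huge, while many small steps approximate the expected exponential decay.

```cpp
#include <cmath>

// Integrate dv = -drag * v * dt with explicit Euler using a fixed step.
// Purely illustrative; the particle system's real update is more complex.
double SimulateVelocity(double initialVel, double drag,
                        double totalTime, double stepSize)
{
  const int numTicks = static_cast<int>(std::lround(totalTime / stepSize));
  double vel = initialVel;
  for (int i = 0; i < numTicks; ++i)
    vel -= vel * drag * stepSize;  // one explicit Euler tick
  return vel;
}
```

With drag = 0.2 over 10 seconds, a single 10-second step computes `v -= v * 2.0` and overshoots straight past zero into negative velocity, whereas 1000 steps of 0.01 s stay close to the analytic `exp(-2)` decay. Same total sim time, very different results: the non-determinism discussed above.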
Thus basically this PR will fix some of the issues in #556 but it won't close it, because the result is still non-deterministic. We can focus on that in a new PR.
ah that explains it! I thought the particle system was handling the ideal case and so would be able to handle large time jumps. Ok, we can leave that for later
Yeah. From a high level perspective it should be an easy fix, but I'd have to look at how easy it is without breaking ABI. I'm slightly concerned about the GPU buffers; those may not like being updated 10 times in the same frame. It should be easy to work around though.
This fixes Particle FXs not respecting simulation time Fixes gazebosim#556 Signed-off-by: Matias N. Goldberg <[email protected]>
Signed-off-by: Matias N. Goldberg <[email protected]>
Signed-off-by: Matias N. Goldberg <[email protected]>
Update tests to advance simulation time Tests pass now Signed-off-by: Matias N. Goldberg <[email protected]>
Signed-off-by: Matias N. Goldberg <[email protected]>
Signed-off-by: Matias N. Goldberg <[email protected]>
Force-pushed from 87d2584 to f46cdd8
I fixed the linter false positive. All tests should succeed this time 🤞 It's ready for merge.
See gazebosim/gz-rendering#556 See gazebosim/gz-rendering#584 Signed-off-by: Matias N. Goldberg <[email protected]>
looks good to me!
🦟 Bug fix
Fixes #556
Summary
This fixes Particle FXs not respecting simulation time.
After researching Ogre behavior, it turns out we already had the functionality to use custom (deterministic) simulation time. We just had to use it. It turned out to be much easier than I expected (I actually started writing more code until I realized it... :( )
Note: This is a draft because I was unsure where to pull Gazebo's simulation time from. Therefore it is currently hardcoded with: … and obviously needs to be changed to use Gazebo's.

But other than that, once that's implemented (I'm asking for pointers here....!) it should just work.
Checklist

- `codecheck` passed (See contributing)

Note to maintainers: Remember to use Squash-Merge and edit the commit message to match the pull request summary while retaining Signed-off-by messages.