Use custom simulation time variants for Ogre #584

darksylinc · 2022-03-12T22:43:58Z

🦟 Bug fix

Fixes #556

Summary

This fixes Particle FXs not respecting simulation time.

After researching Ogre behavior, turns out we already had the functionality to use custom simulation (deterministic) time. We just had to use it. Turned out to be much easier than I expected (I actually started writing more code until I realized it... :( )

Note: This is a draft because I was unsure where to pull gazebo's simulation time from. Therefore it is currently hardcoded with:

static const Ogre::Real kSimulationTime = Ogre::Real(1.0 / 60.0);

and obviously needs to be changed to use gazebo's.

But other than that, once that's implemented (I'm asking for pointers here....!) it should just work.

Checklist

Signed all commits for DCO
Added tests
Updated documentation (as needed)
Updated migration guide (as needed)
codecheck passed (See contributing)
All tests passed (See test coverage)
While waiting for a review on your PR, please help review another open pull request to support the maintainers

Note to maintainers: Remember to use Squash-Merge and edit the commit message to match the pull request summary while retaining Signed-off-by messages.

codecov · 2022-03-12T23:17:01Z

Codecov Report

Merging #584 (d723713) into ign-rendering6 (195317e) will increase coverage by 0.01%.
The diff coverage is 68.42%.

❗ Current head d723713 differs from pull request most recent head f46cdd8. Consider uploading reports for the commit f46cdd8 to get more accurate results

@@                Coverage Diff                 @@
##           ign-rendering6     #584      +/-   ##
==================================================
+ Coverage           54.33%   54.34%   +0.01%     
==================================================
  Files                 198      198              
  Lines               20187    20220      +33     
==================================================
+ Hits                10969    10989      +20     
- Misses               9218     9231      +13

Impacted Files	Coverage Δ
src/base/BaseScene.cc	`73.37% <0.00%> (-0.55%)`	⬇️
ogre2/src/Ogre2Scene.cc	`76.56% <72.22%> (-0.25%)`	⬇️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 195317e...f46cdd8. Read the comment docs.

iche033 · 2022-03-16T21:36:43Z

There is a Scene::SetTime function that we can call to set the simulation time. Instead of kSimulationTime, you can just call this->Time(). However, right now it just returns 0 every time as we're not setting it in ign-gazebo. For sensors, you should be able to set the time somewhere in this function:
https://github.com/ignitionrobotics/ign-gazebo/blob/ign-gazebo6/src/systems/sensors/Sensors.cc#L225

this->dataPtr->updateTime should be the simulation time.

For gui, the place to set time would be in this function:
https://github.com/ignitionrobotics/ign-gazebo/blob/ign-gazebo6/src/gui/plugins/scene_manager/GzSceneManager.cc#L224

You'll need to get sim time from UpdateInfo arg (_info.simTime) in the GzSceneManager::Update function which runs in a separate thread so probably need a mutex to protect it.

darksylinc · 2022-03-22T19:08:39Z

However, right now it just returns 0 every time as we're not setting it in ign-gazebo

That will likely cause the particle test to fail. Because a 0 updateTime means the particle simulation won't move forward (unless it happens that setting the simTime in the sensors fixes it).

You'll need to get sim time from UpdateInfo arg (_info.simTime) in the GzSceneManager::Update function which runs in a separate thread so probably need a mutex to protect it.

You mean ign gui runs the separate thread? Fortunately I'm forcing serialization (or do you mean physics runs in one thread, Qt in another, and Ogre in another?).

Otherwise even if protected by a mutex, the sim time won't be deterministic if physics and particles are not in sync.

iche033 · 2022-03-23T23:59:17Z

That will likely cause the particle test to fail. Because a 0 updateTime means the particle simulation won't move forward (unless it happens that setting the simTime in the sensors fixes it).

I'm thinking that we'll need to start calling SetTime in ign-gazebo's Sensors and GzSceneManager and in your code you will then be able to to get the correct sim time instead of 0

You mean ign gui runs the separate thread? Fortunately I'm forcing serialization (or do you mean physics runs in one thread, Qt in another, and Ogre in another?).

The physics runs in one thread and ogre in another (for both sensors and gui). As for actually getting the correct time for rendering on the gui side, I think the current set up should be ok. Let me know if you see any issue. First we extract data from physics / ECS in this renderUtil.UpdateFromECM call. Here let's say we store sim time into a variable simTime protected by a mutex here. Then in the rendering thread, we call renderUtil.Update() to actually update the rendering scene tree. We can then call scene->SetTime(simTime) in the same function to let the rendering side know the sim time that the data corresponds to. That should ensure that the rendered image has the correct time.

darksylinc · 2022-03-26T22:00:30Z

The physics runs in one thread and ogre in another (for both sensors and gui). As for actually getting the correct time for rendering on the gui side, I think the current set up should be ok. Let me know if you see any issue.

OK so the only problem I see is this one, but I am not sure how ignition's loop works. This problem I'm going to mention doesn't seem to affect sensors though:

It would appear that Gazebo works like this:

Thread 0: Physics

while( true )
{
   simulate( simTime );
   mutex.lock();
   sendDataToGraphics();
   mutex.unlock();
}

Thread 1: Rendering

while( true )
{
   mutex.lock();
   receiveDataFromPhysics();
   mutex.unlock();
   ogreRender( simTime );
}

The only problem with this setup is the number of ticks rendering can perform.

Let's say simTime = 16.66ms. That is after 60 ticks, 1 second has elapsed in the simulation.

But let's say rendering takes 33.33ms to render in realtime, while physics takes exactly 16.66ms in realtime.
Thus a particular system cannot sustain 60 ticks per second, while physics does. The simulation (of mostly particles) will fall behind because physics doesn't wait for graphics nor vice-versa.

The reverse can also be true.

VSync also enters into play because VSync is used to lock the rendering framerate to that of the monitor (i.e. can't iterate more than 60 times per second on most monitors) which keeps CPUs cool and a desirable property while doing GUI.

But if rendering is also simulating, then vsync should be turned off (right now I don't remember if it's on or off; I can't see gazebo explicitly un/setting it)

It doesn't appear that sensors have this problem because, if I got this correctly, sensors wait for rendering and then attempt to render serialized.

However since the initial conditions may not be synchronized (when a sensor starts updating), sensors would still be affected by the problem but indirectly.

Perhaps I am mistaken on how the loops work though.

darksylinc · 2022-03-26T22:21:59Z

OK I managed to find a compromised.

Because Gazebo was already sending absolute simulation time, I could do:

stepSize = newTime - prevAbsTime;
prevAbsTime = stepSize;

This prevents the particles in rendering from falling behind because they can catch up. However, this is not deterministic.

A single step with a stepSize = 1 second is not the same 4 steps with stepSize = 0.25 each even if they result in very similar looking results.

I suppose we could improve this further by ensuring graphics performs the N necessary iterations to catch up. But more synchronization would have to be in place.

For the time being this is "good enough" considering before this PR, the particles were just raw garbage in terms of simulation accuracy & synchronization, but we're still a few leaps behind before we can consider the ticket closed.

darksylinc · 2022-03-26T22:46:37Z

OK so CI caught that UNIT_Scene_TESTis failing.

The reason is simple, it tests that scene->Time() starts at 0, however I changed it to 1 / 60 to maintain backwards compatibility.

But this also created another question:

It would seem Scene::SetTime expects absolute time, and not inter-frame update time? IMO the documentation could be improved there. The contradiction happens when it says "Set the last simulation update time".

At least in gamedev lingo, update time means the tick rate (usually but not always 1 / 60 seconds) i.e. how much time elapsed between each update.

It sounds like SetTime expects simulation time, not simulation update time.

As for the backwards compatibility behavior: The problem was that since (I assume) most users of ignition-rendering did not call SetTime at all (or if they did, it does nothing!), this PR will result in a very different behavior:

If SetTime is never called by user, a value of 0 means particles are never fired. So I defaulted to 1 / 60 to at least avoid breaking these apps.
If SetTime is called by user, are bets are off; since it was previously doing nothing and whether my PR breaks apps or not depends on what value they are feeding.

If SetTime expects absolute simulation time, I don't see (without hacks) how not to break existing apps after upgrading ign-rendering6. Even though C++ ABI compatibility is not broken; an app that never called Scene::SetTime() means rendering is always stuck at time = 0 hence no particle simulation will move forward.

While we could argue this was bad API usage, there is a saying an API contract is sealed after its first observable behavior.

darksylinc · 2022-03-26T23:01:29Z

One possible workaround

Start SetTime at -1.

When negative, ign-rendering6 will behave like before this PR.
When >= 0, ign-rendering6 will honour the simulation time like in this PR.

There's of course always the chance an app breaks if it calls scene->Time() for the first time and expects 0 instead of -1; but the impact should be low IMHO.

Second possible workaround

Start SetTime at 0. The value set to SetTime will be ignored.
BUT if user calls SetTime() with a magic value like -1; further calls to SetTime will start being honoured.

That way, hack-aware apps will call SetTime( -1 ) before starting to feed real simulation time.
This is probably the best workaround.

I'll wrap the hacks around #if IGNITION_RENDERING_MAJOR_VERSION <= 6

See gazebosim/gz-rendering#556 See gazebosim/gz-rendering#584 Signed-off-by: Matias N. Goldberg <[email protected]>

darksylinc · 2022-03-27T00:22:49Z

@iche033 I see that latest https://build.osrfoundation.org/job/ignition_rendering-ci-pr_any-homebrew-amd64/2334/consoleFull#45593342ed30c675-ba23-4c35-b655-f5c948f97581 is failing at Scene_TEST.cc:687

  duration
     Which is: 8-byte object <00-00 00-00 00-00 00-00>
  scene->Time()
     Which is: 8-byte object <FF-FF FF-FF FF-FF FF-FF>

I changed that test to account for the hack (if the magic value scene->SetTime(-1) is called, that value is ignored and we start honouring simulation time)

It appears the CI machine is failing, even though the code should be explicitly ignoring that -1 and remain at 0. The test passes on my machine.

Do you know why it could be failing in the CI? It shouldn't.

Possible things I can think of:

The CI machine grabbed an old version of libignition-rendering but ran the new test
IGNITION_RENDERING_MAJOR_VERSION < 6 when compiling libignition-rendering but it is IGNITION_RENDERING_MAJOR_VERSION = 6 when compiling the test (that would be weird)

iche033 · 2022-03-28T21:18:33Z

This problem I'm going to mention doesn't seem to affect sensors though:

There is soft lock-stepping mechanism that we put in to make physics wait for sensors using condition variables. Here, we block the whole ECS / physics if a sensor needs update but the previous rendering operation has not finished yet. Here we notify when rendering thread is done with the update and able to continue.

But if rendering is also simulating, then vsync should be turned off (right now I don't remember if it's on or off; I can't see gazebo explicitly un/setting it)

I don't think we explicitly turn VSync on / off. We used to limit GUI rendering to 60 hz and I don't think it needs to go any higher than that for simulation. As for sensors, I can see that users may want to simulate high speed cameras to go > 60Hz but we haven't heard any requests related to this yet.

It would seem Scene::SetTime expects absolute time, and not inter-frame update time? IMO the documentation could be improved there. The contradiction happens when it says "Set the last simulation update time".

yes like you discovered SetTime excepts absolute time. We use more physics terms to describe time elapsed between update, e.g. step size, but yeah I agree we can update documentation to to make it clear.

Second possible workaround

yep that sounds good.

Do you know why it could be failing in the CI? It shouldn't.

The homebrew CI machine is running ogre 1.x instead of ogre 2.x. Now that mac support is becoming more complete, we probably should switch it back to ogre 2.x, at least for garden.

darksylinc · 2022-03-28T21:41:01Z

The homebrew CI machine is running ogre 1.x instead of ogre 2.x. Now that mac support is becoming more complete, we probably should switch it back to ogre 2.x, at least for garden.

Ahhh!!! Good point!!! The BaseScene should account for the workaround too.

darksylinc · 2022-03-28T21:50:52Z

OK I pushed a new version. Hopefully this one will pass all checks.

I don't think we explicitly turn VSync on / off. We used to limit GUI rendering to 60 hz and I don't think it needs to go any higher than that for simulation. As for sensors, I can see that users may want to simulate high speed cameras to go > 60Hz but we haven't heard any requests related to this yet.

I somehow sense you don't understand what I mean. But I think experiencing it yourself is the best way:

Checkout my ign-rendering6 branch (matias-PFX_simulTime)
Checkout my ign-gazebo branch (matias-PFX_simulTime)
Launch particle_emitter2.sdf with server, i.e. ign gazebo -s particle_emitter2.sdf
Launch the GUI. i.e. ign gazebo -g
Press Play
Wait some time (e.g. 10 seconds)
Pause
Close the GUI
Launch the GUI again. i.e. ign gazebo -g. The server should still be up
- See that the particles "restart" from 0 instead of continuing where they left off
  - That's because particles are simulated in the GUI
  - Also because the GUI doesn't accurately simulate the time between the origin of the simulation and the time at which it was paused (e.g. 10 seconds in sim time)

In other words: non deterministic behavior. But at least simulation speed is maintained.

darksylinc · 2022-03-28T22:57:27Z

It would seem the linter doesn't like the ifdef around the if/else statements. It thinks it's if( cond )body; instead of if( cond ) { body; }

iche033 · 2022-03-30T21:13:48Z

Also because the GUI doesn't accurately simulate the time between the origin of the simulation and the time at which it was paused (e.g. 10 seconds in sim time)

Ok I tried running the steps and saw the "restart" behavior. I noticed on the GUI rendering side if (!this->LegacyAutoGpuFlush()) evaluates to false since we never set cameraPassCountPerGpuFlush (that was only done in Sensors). So evt.timeSinceLastFrame was never set. As a test, I forced it to be on by setting cameraPassCountPerGpuFlush to 6. After that I tried the steps again and noticed that in the last step Launch the GUI again, the correct sim time is passed to evt.timeSinceLastFrame when the GUI is launched the second time - evt.timeSinceLastFrame is set to ~10s (after pausing for ~10 seconds). However the particles appearance still look different. Maybe there is something else I'm not understanding correctly.

darksylinc · 2022-03-30T21:46:29Z

Ok I tried running the steps and saw the "restart" behavior. I noticed on the GUI rendering side if (!this->LegacyAutoGpuFlush()) evaluates to false since we never set cameraPassCountPerGpuFlush (that was only done in Sensors). So evt.timeSinceLastFrame was never set. As a test, I forced it to be on by setting cameraPassCountPerGpuFlush to 6.

Ok various things:

So if I understood correctly this->LegacyAutoGpuFlush() == true which means the branch if (!this->LegacyAutoGpuFlush()) is never taken right? the GUI should definitely not be in legacy mode IIRC
- But what surprises me is that in legacy mode it still should work, because Ogre2Scene::StartRendering should take care of it.

However the particles appearance still look different. Maybe there is something else I'm not understanding correctly.

That's what I'm trying to explain! IT DOES look different because the particle FX system is trying to simulate 10 seconds in one tick. In pseudo code:

// Ideal:
const float timeToSimulate = 10 seconds;
const int numTicks = floor( timeToSimulate / tickRate );
for( int i = 0; i < numTicks; ++i )
    updateParticles( tickRate );

// What is happening when you relaunch the GUI with a simulation of 10 seconds
const float timeToSimulate = 10 seconds;
updateParticles( timeToSimulate ); // Just one tick with 10.0 as tick rate

To fix this I would have to try to analyze if it's possible to detach the particle FX simulation from rendering and force N ticks before rendering.

Ideally, particle FX simulation should happen as part of gazebo's simulation, not in rendering.

darksylinc · 2022-03-30T21:49:20Z

Thus basically this PR will fix some of the issues in #556 but it won't close it, because it's still non-deterministic.

For that we can focus in a new PR.

iche033 · 2022-03-30T22:01:20Z

In pseudo code:

ah that explains it! I thought that particles system is taking the ideal case so would be able to handle large time jumps. Ok we can leave that for later

darksylinc · 2022-03-30T22:27:35Z

Yeah. From a high level perspective it should be an easy fix. But I'd have to look how easy it is without ABI breaking anything

I'm slightly concerned about the GPU buffers, those may not like being updated 10 times in the same frame. It should be easily workaroundeable though

This fixes Particle FXs not respecting simulation time Fixes gazebosim#556 Signed-off-by: Matias N. Goldberg <[email protected]>

Signed-off-by: Matias N. Goldberg <[email protected]>

Update tests to advance simulation time Tests pass now Signed-off-by: Matias N. Goldberg <[email protected]>

Signed-off-by: Matias N. Goldberg <[email protected]>

darksylinc · 2022-04-03T23:18:58Z

I fixed the linter false positive.

All tests should succeed this time 🤞

It's ready for merge.

See gazebosim/gz-rendering#556 See gazebosim/gz-rendering#584 Signed-off-by: Matias N. Goldberg <[email protected]>

iche033 · 2022-04-04T22:43:07Z

looks good to me!

github-actions bot added the 🏯 fortress Ignition Fortress label Mar 12, 2022

darksylinc force-pushed the matias-PFX_simulTime branch 2 times, most recently from b3961c2 to 8cfd0e1 Compare March 26, 2022 15:48

darksylinc marked this pull request as ready for review March 26, 2022 22:30

darksylinc requested a review from iche033 as a code owner March 26, 2022 22:30

darksylinc force-pushed the matias-PFX_simulTime branch from a48b08c to c2dcd74 Compare March 26, 2022 22:32

darksylinc added a commit to darksylinc/ign-gazebo that referenced this pull request Mar 27, 2022

Set simulation time to Rendering

c2de743

See gazebosim/gz-rendering#556 See gazebosim/gz-rendering#584 Signed-off-by: Matias N. Goldberg <[email protected]>

darksylinc mentioned this pull request Mar 27, 2022

Set simulation time to Rendering gazebosim/gz-sim#1415

Merged

7 tasks

darksylinc added 2 commits April 3, 2022 20:18

Use custom simulation time variants for Ogre

544222c

This fixes Particle FXs not respecting simulation time Fixes gazebosim#556 Signed-off-by: Matias N. Goldberg <[email protected]>

Use Scene::Time for Ogre time

0044f28

Signed-off-by: Matias N. Goldberg <[email protected]>

darksylinc added 4 commits April 3, 2022 20:18

Default to 60hz in Ogre2Scene

6f57356

Signed-off-by: Matias N. Goldberg <[email protected]>

Add hack to allow backwards compat behavior

dbbdcea

Update tests to advance simulation time Tests pass now Signed-off-by: Matias N. Goldberg <[email protected]>

Fix SetTime hack for other engines (i.e. not Ogre2)

58d3f96

Signed-off-by: Matias N. Goldberg <[email protected]>

Fix linter false positive

f46cdd8

Signed-off-by: Matias N. Goldberg <[email protected]>

darksylinc force-pushed the matias-PFX_simulTime branch from 87d2584 to f46cdd8 Compare April 3, 2022 23:18

darksylinc added a commit to darksylinc/ign-gazebo that referenced this pull request Apr 3, 2022

Set simulation time to Rendering

a735700

See gazebosim/gz-rendering#556 See gazebosim/gz-rendering#584 Signed-off-by: Matias N. Goldberg <[email protected]>

iche033 approved these changes Apr 4, 2022

View reviewed changes

iche033 merged commit 3b0426e into gazebosim:ign-rendering6 Apr 4, 2022

darksylinc mentioned this pull request Apr 4, 2022

Potential non-deterministic behavior in particle emitter (ogre2) #556

Open

iche033 mentioned this pull request Apr 4, 2022

6 -> 7 #602

Merged

darksylinc mentioned this pull request Jul 11, 2022

Port ign-rendering to Ogre 2.3 #553

Merged

7 tasks

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Use custom simulation time variants for Ogre #584

Use custom simulation time variants for Ogre #584

darksylinc commented Mar 12, 2022 •

edited

Loading

codecov bot commented Mar 12, 2022 •

edited

Loading

iche033 commented Mar 16, 2022

darksylinc commented Mar 22, 2022 •

edited

Loading

iche033 commented Mar 23, 2022

darksylinc commented Mar 26, 2022 •

edited

Loading

darksylinc commented Mar 26, 2022 •

edited

Loading

darksylinc commented Mar 26, 2022 •

edited

Loading

darksylinc commented Mar 26, 2022 •

edited

Loading

darksylinc commented Mar 27, 2022 •

edited

Loading

iche033 commented Mar 28, 2022

darksylinc commented Mar 28, 2022

darksylinc commented Mar 28, 2022 •

edited

Loading

darksylinc commented Mar 28, 2022

iche033 commented Mar 30, 2022

darksylinc commented Mar 30, 2022 •

edited

Loading

darksylinc commented Mar 30, 2022

iche033 commented Mar 30, 2022

darksylinc commented Mar 30, 2022

darksylinc commented Apr 3, 2022 •

edited

Loading

iche033 commented Apr 4, 2022

Use custom simulation time variants for Ogre #584

Use custom simulation time variants for Ogre #584

Conversation

darksylinc commented Mar 12, 2022 • edited Loading

🦟 Bug fix

Summary

Checklist

codecov bot commented Mar 12, 2022 • edited Loading

Codecov Report

iche033 commented Mar 16, 2022

darksylinc commented Mar 22, 2022 • edited Loading

iche033 commented Mar 23, 2022

darksylinc commented Mar 26, 2022 • edited Loading

darksylinc commented Mar 26, 2022 • edited Loading

darksylinc commented Mar 26, 2022 • edited Loading

darksylinc commented Mar 26, 2022 • edited Loading

One possible workaround

Second possible workaround

darksylinc commented Mar 27, 2022 • edited Loading

iche033 commented Mar 28, 2022

darksylinc commented Mar 28, 2022

darksylinc commented Mar 28, 2022 • edited Loading

darksylinc commented Mar 28, 2022

iche033 commented Mar 30, 2022

darksylinc commented Mar 30, 2022 • edited Loading

darksylinc commented Mar 30, 2022

iche033 commented Mar 30, 2022

darksylinc commented Mar 30, 2022

darksylinc commented Apr 3, 2022 • edited Loading

iche033 commented Apr 4, 2022

darksylinc commented Mar 12, 2022 •

edited

Loading

codecov bot commented Mar 12, 2022 •

edited

Loading

darksylinc commented Mar 22, 2022 •

edited

Loading

darksylinc commented Mar 26, 2022 •

edited

Loading

darksylinc commented Mar 26, 2022 •

edited

Loading

darksylinc commented Mar 26, 2022 •

edited

Loading

darksylinc commented Mar 26, 2022 •

edited

Loading

darksylinc commented Mar 27, 2022 •

edited

Loading

darksylinc commented Mar 28, 2022 •

edited

Loading

darksylinc commented Mar 30, 2022 •

edited

Loading

darksylinc commented Apr 3, 2022 •

edited

Loading