Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use custom simulation time variants for Ogre #584

Merged
merged 6 commits into from
Apr 4, 2022

Conversation

darksylinc
Copy link
Contributor

@darksylinc darksylinc commented Mar 12, 2022

🦟 Bug fix

Fixes #556

Summary

This fixes Particle FXs not respecting simulation time.

After researching Ogre behavior, turns out we already had the functionality to use custom simulation (deterministic) time. We just had to use it. Turned out to be much easier than I expected (I actually started writing more code until I realized it... :( )

Note: This is a draft because I was unsure where to pull gazebo's simulation time from. Therefore it is currently hardcoded with:

static const Ogre::Real kSimulationTime = Ogre::Real(1.0 / 60.0);

and obviously needs to be changed to use gazebo's.

But other than that, once that's implemented (I'm asking for pointers here....!) it should just work.

Checklist

  • Signed all commits for DCO
  • Added tests
  • Updated documentation (as needed)
  • Updated migration guide (as needed)
  • codecheck passed (See contributing)
  • All tests passed (See test coverage)
  • While waiting for a review on your PR, please help review another open pull request to support the maintainers

Note to maintainers: Remember to use Squash-Merge and edit the commit message to match the pull request summary while retaining Signed-off-by messages.

@github-actions github-actions bot added the 🏯 fortress Ignition Fortress label Mar 12, 2022
@codecov
Copy link

codecov bot commented Mar 12, 2022

Codecov Report

Merging #584 (d723713) into ign-rendering6 (195317e) will increase coverage by 0.01%.
The diff coverage is 68.42%.

❗ Current head d723713 differs from pull request most recent head f46cdd8. Consider uploading reports for the commit f46cdd8 to get more accurate results

@@                Coverage Diff                 @@
##           ign-rendering6     #584      +/-   ##
==================================================
+ Coverage           54.33%   54.34%   +0.01%     
==================================================
  Files                 198      198              
  Lines               20187    20220      +33     
==================================================
+ Hits                10969    10989      +20     
- Misses               9218     9231      +13     
Impacted Files Coverage Δ
src/base/BaseScene.cc 73.37% <0.00%> (-0.55%) ⬇️
ogre2/src/Ogre2Scene.cc 76.56% <72.22%> (-0.25%) ⬇️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 195317e...f46cdd8. Read the comment docs.

@iche033
Copy link
Contributor

iche033 commented Mar 16, 2022

There is a Scene::SetTime function that we can call to set the simulation time. Instead of kSimulationTime, you can just call this->Time(). However, right now it just returns 0 every time as we're not setting it in ign-gazebo. For sensors, you should be able to set the time somewhere in this function:
https://github.com/ignitionrobotics/ign-gazebo/blob/ign-gazebo6/src/systems/sensors/Sensors.cc#L225

this->dataPtr->updateTime should be the simulation time.

For gui, the place to set time would be in this function:
https://github.com/ignitionrobotics/ign-gazebo/blob/ign-gazebo6/src/gui/plugins/scene_manager/GzSceneManager.cc#L224

You'll need to get sim time from UpdateInfo arg (_info.simTime) in the GzSceneManager::Update function which runs in a separate thread so probably need a mutex to protect it.

@darksylinc
Copy link
Contributor Author

darksylinc commented Mar 22, 2022

However, right now it just returns 0 every time as we're not setting it in ign-gazebo

That will likely cause the particle test to fail. Because a 0 updateTime means the particle simulation won't move forward (unless it happens that setting the simTime in the sensors fixes it).

You'll need to get sim time from UpdateInfo arg (_info.simTime) in the GzSceneManager::Update function which runs in a separate thread so probably need a mutex to protect it.

You mean ign gui runs the separate thread? Fortunately I'm forcing serialization (or do you mean physics runs in one thread, Qt in another, and Ogre in another?).

Otherwise even if protected by a mutex, the sim time won't be deterministic if physics and particles are not in sync.

@iche033
Copy link
Contributor

iche033 commented Mar 23, 2022

That will likely cause the particle test to fail. Because a 0 updateTime means the particle simulation won't move forward (unless it happens that setting the simTime in the sensors fixes it).

I'm thinking that we'll need to start calling SetTime in ign-gazebo's Sensors and GzSceneManager and in your code you will then be able to to get the correct sim time instead of 0

You mean ign gui runs the separate thread? Fortunately I'm forcing serialization (or do you mean physics runs in one thread, Qt in another, and Ogre in another?).

The physics runs in one thread and ogre in another (for both sensors and gui). As for actually getting the correct time for rendering on the gui side, I think the current set up should be ok. Let me know if you see any issue. First we extract data from physics / ECS in this renderUtil.UpdateFromECM call. Here let's say we store sim time into a variable simTime protected by a mutex here. Then in the rendering thread, we call renderUtil.Update() to actually update the rendering scene tree. We can then call scene->SetTime(simTime) in the same function to let the rendering side know the sim time that the data corresponds to. That should ensure that the rendered image has the correct time.

@darksylinc darksylinc force-pushed the matias-PFX_simulTime branch 2 times, most recently from b3961c2 to 8cfd0e1 Compare March 26, 2022 15:48
@darksylinc
Copy link
Contributor Author

darksylinc commented Mar 26, 2022

The physics runs in one thread and ogre in another (for both sensors and gui). As for actually getting the correct time for rendering on the gui side, I think the current set up should be ok. Let me know if you see any issue.

OK so the only problem I see is this one, but I am not sure how ignition's loop works. This problem I'm going to mention doesn't seem to affect sensors though:

It would appear that Gazebo works like this:

Thread 0: Physics

while( true )
{
   simulate( simTime );
   mutex.lock();
   sendDataToGraphics();
   mutex.unlock();
}

Thread 1: Rendering

while( true )
{
   mutex.lock();
   receiveDataFromPhysics();
   mutex.unlock();
   ogreRender( simTime );
}

The only problem with this setup is the number of ticks rendering can perform.

Let's say simTime = 16.66ms. That is after 60 ticks, 1 second has elapsed in the simulation.

But let's say rendering takes 33.33ms to render in realtime, while physics takes exactly 16.66ms in realtime.
Thus a particular system cannot sustain 60 ticks per second, while physics does. The simulation (of mostly particles) will fall behind because physics doesn't wait for graphics nor vice-versa.

The reverse can also be true.

VSync also enters into play because VSync is used to lock the rendering framerate to that of the monitor (i.e. can't iterate more than 60 times per second on most monitors) which keeps CPUs cool and a desirable property while doing GUI.

But if rendering is also simulating, then vsync should be turned off (right now I don't remember if it's on or off; I can't see gazebo explicitly un/setting it)

It doesn't appear that sensors have this problem because, if I got this correctly, sensors wait for rendering and then attempt to render serialized.

However since the initial conditions may not be synchronized (when a sensor starts updating), sensors would still be affected by the problem but indirectly.

Perhaps I am mistaken on how the loops work though.

@darksylinc
Copy link
Contributor Author

darksylinc commented Mar 26, 2022

OK I managed to find a compromised.

Because Gazebo was already sending absolute simulation time, I could do:

stepSize = newTime - prevAbsTime;
prevAbsTime = stepSize;

This prevents the particles in rendering from falling behind because they can catch up. However, this is not deterministic.

A single step with a stepSize = 1 second is not the same 4 steps with stepSize = 0.25 each even if they result in very similar looking results.

I suppose we could improve this further by ensuring graphics performs the N necessary iterations to catch up. But more synchronization would have to be in place.

For the time being this is "good enough" considering before this PR, the particles were just raw garbage in terms of simulation accuracy & synchronization, but we're still a few leaps behind before we can consider the ticket closed.

@darksylinc darksylinc marked this pull request as ready for review March 26, 2022 22:30
@darksylinc
Copy link
Contributor Author

darksylinc commented Mar 26, 2022

OK so CI caught that UNIT_Scene_TESTis failing.

The reason is simple, it tests that scene->Time() starts at 0, however I changed it to 1 / 60 to maintain backwards compatibility.

But this also created another question:

It would seem Scene::SetTime expects absolute time, and not inter-frame update time? IMO the documentation could be improved there. The contradiction happens when it says "Set the last simulation update time".

At least in gamedev lingo, update time means the tick rate (usually but not always 1 / 60 seconds) i.e. how much time elapsed between each update.

It sounds like SetTime expects simulation time, not simulation update time.

As for the backwards compatibility behavior: The problem was that since (I assume) most users of ignition-rendering did not call SetTime at all (or if they did, it does nothing!), this PR will result in a very different behavior:

  • If SetTime is never called by user, a value of 0 means particles are never fired. So I defaulted to 1 / 60 to at least avoid breaking these apps.
  • If SetTime is called by user, are bets are off; since it was previously doing nothing and whether my PR breaks apps or not depends on what value they are feeding.

If SetTime expects absolute simulation time, I don't see (without hacks) how not to break existing apps after upgrading ign-rendering6. Even though C++ ABI compatibility is not broken; an app that never called Scene::SetTime() means rendering is always stuck at time = 0 hence no particle simulation will move forward.

While we could argue this was bad API usage, there is a saying an API contract is sealed after its first observable behavior.

@darksylinc
Copy link
Contributor Author

darksylinc commented Mar 26, 2022

One possible workaround

Start SetTime at -1.

When negative, ign-rendering6 will behave like before this PR.
When >= 0, ign-rendering6 will honour the simulation time like in this PR.

There's of course always the chance an app breaks if it calls scene->Time() for the first time and expects 0 instead of -1; but the impact should be low IMHO.

Second possible workaround

  • Start SetTime at 0. The value set to SetTime will be ignored.
  • BUT if user calls SetTime() with a magic value like -1; further calls to SetTime will start being honoured.

That way, hack-aware apps will call SetTime( -1 ) before starting to feed real simulation time.
This is probably the best workaround.

I'll wrap the hacks around #if IGNITION_RENDERING_MAJOR_VERSION <= 6

darksylinc added a commit to darksylinc/ign-gazebo that referenced this pull request Mar 27, 2022
@darksylinc
Copy link
Contributor Author

darksylinc commented Mar 27, 2022

@iche033 I see that latest https://build.osrfoundation.org/job/ignition_rendering-ci-pr_any-homebrew-amd64/2334/consoleFull#45593342ed30c675-ba23-4c35-b655-f5c948f97581 is failing at Scene_TEST.cc:687

  duration
     Which is: 8-byte object <00-00 00-00 00-00 00-00>
  scene->Time()
     Which is: 8-byte object <FF-FF FF-FF FF-FF FF-FF>

I changed that test to account for the hack (if the magic value scene->SetTime(-1) is called, that value is ignored and we start honouring simulation time)

It appears the CI machine is failing, even though the code should be explicitly ignoring that -1 and remain at 0. The test passes on my machine.

Do you know why it could be failing in the CI? It shouldn't.

Possible things I can think of:

  • The CI machine grabbed an old version of libignition-rendering but ran the new test
  • IGNITION_RENDERING_MAJOR_VERSION < 6 when compiling libignition-rendering but it is IGNITION_RENDERING_MAJOR_VERSION = 6 when compiling the test (that would be weird)

@iche033
Copy link
Contributor

iche033 commented Mar 28, 2022

This problem I'm going to mention doesn't seem to affect sensors though:

There is soft lock-stepping mechanism that we put in to make physics wait for sensors using condition variables. Here, we block the whole ECS / physics if a sensor needs update but the previous rendering operation has not finished yet. Here we notify when rendering thread is done with the update and able to continue.

But if rendering is also simulating, then vsync should be turned off (right now I don't remember if it's on or off; I can't see gazebo explicitly un/setting it)

I don't think we explicitly turn VSync on / off. We used to limit GUI rendering to 60 hz and I don't think it needs to go any higher than that for simulation. As for sensors, I can see that users may want to simulate high speed cameras to go > 60Hz but we haven't heard any requests related to this yet.

It would seem Scene::SetTime expects absolute time, and not inter-frame update time? IMO the documentation could be improved there. The contradiction happens when it says "Set the last simulation update time".

yes like you discovered SetTime excepts absolute time. We use more physics terms to describe time elapsed between update, e.g. step size, but yeah I agree we can update documentation to to make it clear.

Second possible workaround

yep that sounds good.

Do you know why it could be failing in the CI? It shouldn't.

The homebrew CI machine is running ogre 1.x instead of ogre 2.x. Now that mac support is becoming more complete, we probably should switch it back to ogre 2.x, at least for garden.

@darksylinc
Copy link
Contributor Author

The homebrew CI machine is running ogre 1.x instead of ogre 2.x. Now that mac support is becoming more complete, we probably should switch it back to ogre 2.x, at least for garden.

Ahhh!!! Good point!!! The BaseScene should account for the workaround too.

@darksylinc
Copy link
Contributor Author

darksylinc commented Mar 28, 2022

OK I pushed a new version. Hopefully this one will pass all checks.

I don't think we explicitly turn VSync on / off. We used to limit GUI rendering to 60 hz and I don't think it needs to go any higher than that for simulation. As for sensors, I can see that users may want to simulate high speed cameras to go > 60Hz but we haven't heard any requests related to this yet.

I somehow sense you don't understand what I mean. But I think experiencing it yourself is the best way:

  • Checkout my ign-rendering6 branch (matias-PFX_simulTime)
  • Checkout my ign-gazebo branch (matias-PFX_simulTime)
  • Launch particle_emitter2.sdf with server, i.e. ign gazebo -s particle_emitter2.sdf
  • Launch the GUI. i.e. ign gazebo -g
  • Press Play
  • Wait some time (e.g. 10 seconds)
  • Pause
  • Close the GUI
  • Launch the GUI again. i.e. ign gazebo -g. The server should still be up
    • See that the particles "restart" from 0 instead of continuing where they left off
      • That's because particles are simulated in the GUI
      • Also because the GUI doesn't accurately simulate the time between the origin of the simulation and the time at which it was paused (e.g. 10 seconds in sim time)

In other words: non deterministic behavior. But at least simulation speed is maintained.

@darksylinc
Copy link
Contributor Author

It would seem the linter doesn't like the ifdef around the if/else statements. It thinks it's if( cond )body; instead of if( cond ) { body; }

@iche033
Copy link
Contributor

iche033 commented Mar 30, 2022

Also because the GUI doesn't accurately simulate the time between the origin of the simulation and the time at which it was paused (e.g. 10 seconds in sim time)

Ok I tried running the steps and saw the "restart" behavior. I noticed on the GUI rendering side if (!this->LegacyAutoGpuFlush()) evaluates to false since we never set cameraPassCountPerGpuFlush (that was only done in Sensors). So evt.timeSinceLastFrame was never set. As a test, I forced it to be on by setting cameraPassCountPerGpuFlush to 6. After that I tried the steps again and noticed that in the last step Launch the GUI again, the correct sim time is passed to evt.timeSinceLastFrame when the GUI is launched the second time - evt.timeSinceLastFrame is set to ~10s (after pausing for ~10 seconds). However the particles appearance still look different. Maybe there is something else I'm not understanding correctly.

@darksylinc
Copy link
Contributor Author

darksylinc commented Mar 30, 2022

Ok I tried running the steps and saw the "restart" behavior. I noticed on the GUI rendering side if (!this->LegacyAutoGpuFlush()) evaluates to false since we never set cameraPassCountPerGpuFlush (that was only done in Sensors). So evt.timeSinceLastFrame was never set. As a test, I forced it to be on by setting cameraPassCountPerGpuFlush to 6.

Ok various things:

  • So if I understood correctly this->LegacyAutoGpuFlush() == true which means the branch if (!this->LegacyAutoGpuFlush()) is never taken right? the GUI should definitely not be in legacy mode IIRC
    • But what surprises me is that in legacy mode it still should work, because Ogre2Scene::StartRendering should take care of it.

However the particles appearance still look different. Maybe there is something else I'm not understanding correctly.

That's what I'm trying to explain! IT DOES look different because the particle FX system is trying to simulate 10 seconds in one tick. In pseudo code:

// Ideal:
const float timeToSimulate = 10 seconds;
const int numTicks = floor( timeToSimulate / tickRate );
for( int i = 0; i < numTicks; ++i )
    updateParticles( tickRate );

// What is happening when you relaunch the GUI with a simulation of 10 seconds
const float timeToSimulate = 10 seconds;
updateParticles( timeToSimulate ); // Just one tick with 10.0 as tick rate

To fix this I would have to try to analyze if it's possible to detach the particle FX simulation from rendering and force N ticks before rendering.

Ideally, particle FX simulation should happen as part of gazebo's simulation, not in rendering.

@darksylinc
Copy link
Contributor Author

Thus basically this PR will fix some of the issues in #556 but it won't close it, because it's still non-deterministic.

For that we can focus in a new PR.

@iche033
Copy link
Contributor

iche033 commented Mar 30, 2022

In pseudo code:

ah that explains it! I thought that particles system is taking the ideal case so would be able to handle large time jumps. Ok we can leave that for later

@darksylinc
Copy link
Contributor Author

Yeah. From a high level perspective it should be an easy fix. But I'd have to look how easy it is without ABI breaking anything

I'm slightly concerned about the GPU buffers, those may not like being updated 10 times in the same frame. It should be easily workaroundeable though

This fixes Particle FXs not respecting simulation time
Fixes gazebosim#556

Signed-off-by: Matias N. Goldberg <[email protected]>
Signed-off-by: Matias N. Goldberg <[email protected]>
Signed-off-by: Matias N. Goldberg <[email protected]>
Update tests to advance simulation time
Tests pass now

Signed-off-by: Matias N. Goldberg <[email protected]>
Signed-off-by: Matias N. Goldberg <[email protected]>
@darksylinc
Copy link
Contributor Author

darksylinc commented Apr 3, 2022

I fixed the linter false positive.

All tests should succeed this time 🤞

It's ready for merge.

darksylinc added a commit to darksylinc/ign-gazebo that referenced this pull request Apr 3, 2022
@iche033
Copy link
Contributor

iche033 commented Apr 4, 2022

looks good to me!

@iche033 iche033 merged commit 3b0426e into gazebosim:ign-rendering6 Apr 4, 2022
@iche033 iche033 mentioned this pull request Apr 4, 2022
@darksylinc darksylinc mentioned this pull request Jul 11, 2022
7 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
🏯 fortress Ignition Fortress
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Potential non-deterministic behavior in particle emitter (ogre2)
2 participants