Crash due to meshStates.empty() assertion #958

Closed
MirceaKitsune opened this issue Jan 8, 2021 · 11 comments · Fixed by #1465
Labels
assertion-failure bug Something isn't working Severity: High Important functionality is affected, no workaround exists

Comments

@MirceaKitsune

MirceaKitsune commented Jan 8, 2021

I run the latest AppImage downloaded from the official website on Linux openSUSE Tumbleweed x64 / KDE. Occasionally the interface crashes when visiting some areas. When run from a console I get this error message:

[01/08 17:40:27] [DEBUG] [hifi.interface.deadlock] DEADLOCK WATCHDOG WARNING: lastHeartbeatAge: 71102 elapsedMovingAverage: 3143 maxElapsed: 1810101 PREVIOUS maxElapsedAverage: 3137 NEW maxElapsedAverage: 3143 ** NEW MAX ELAPSED AVERAGE ** samples: 5035
Vircadia-x86_64_v2020.3.3-Demeter.AppImage: /home/motofckr9k/Vircadia/source/libraries/render-utils/src/Model.cpp:309: virtual bool Model::updateGeometry(): Assertion `_meshStates.empty()' failed.
Aborted (core dumped)
@daleglass daleglass added bug Something isn't working Severity: Medium Important functionality is affected, but a workaround exists labels Jan 9, 2021
@JulianGro
Contributor

I don't think I have ever encountered an issue like that on Linux. Could you run it in gdb to get a backtrace?

$ gdb Vircadia-x86_64_v2020.3.3-Demeter.AppImage
(gdb) run
and after it crashes (gdb will keep it from closing):
(gdb) bt

The output of bt is what we want.

@MirceaKitsune
Author

I'm not sure my setup currently makes it easy to trace this with GDB. I seem to get similar crashes too, but some of them don't print that message, so I'm unsure whether they're even related or how to isolate this exact one.

One detail I forgot to add: the crashes usually occur when going to another domain, a few seconds after it starts loading and before everything has finished loading. If I start the interface again and go back to that same domain it works, so it's not a broken object constantly crashing viewers but likely some random condition at loading time.

@stale

stale bot commented Jun 5, 2021

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

@stale stale bot added the stale Issue / PR has not had activity label Jun 5, 2021
@JulianGro
Contributor

JulianGro commented Jun 5, 2021

Maybe stalebot did good for once.
The issue seemingly got a lot worse, making most people crash semi regularly when joining domains.

Either that, or we have a different issue which causes the same problem to occur.

@stale stale bot removed the stale Issue / PR has not had activity label Jun 5, 2021
@daleglass
Contributor

I've been trying to catch this one for a while, but the darn thing seems to always strike right when I'm not using a debugger.

I'll make another attempt at catching it after the dev meeting. If anybody wants to join, please run under gdb or another debugger, and get a backtrace (the bt command in gdb).

@daleglass daleglass added assertion-failure Severity: High Important functionality is affected, no workaround exists and removed Severity: Medium Important functionality is affected, but a workaround exists labels Jun 5, 2021
@daleglass
Contributor

Okay, finally! Backtrace:

#0  0x00007fffe64a92a2 in raise () at /lib64/libc.so.6
#1  0x00007fffe64928a4 in abort () at /lib64/libc.so.6
#2  0x00007fffe6492789 in _nl_load_domain.cold () at /lib64/libc.so.6
#3  0x00007fffe64a1a16 in  () at /lib64/libc.so.6
#4  0x00007ffff5e63729 in Model::updateGeometry() (this=0x7ffef454d620)
    at /home/dale/git/vircadia/vircadia-master/libraries/render-utils/src/Model.cpp:298
#5  0x00007ffff5e6c082 in Model::simulate(float, bool) (this=0x7ffef454d620, deltaTime=0, fullUpdate=true)
    at /home/dale/git/vircadia/vircadia-master/libraries/render-utils/src/Model.cpp:1450
#6  0x00007ffff66cf327 in RenderableModelEntityItem::updateModelBounds() (this=0x7ffdb001f700)
    at /home/dale/git/vircadia/vircadia-master/libraries/entities-renderer/src/RenderableModelEntityItem.cpp:185
#7  0x00007ffff66d0e68 in RenderableModelEntityItem::computeShapeInfo(ShapeInfo&) (this=0x7ffdb001f700, shapeInfo=...)
    at /home/dale/git/vircadia/vircadia-master/libraries/entities-renderer/src/RenderableModelEntityItem.cpp:470
#8  0x00007ffff6c9d44a in PhysicalEntitySimulation::buildMotionStatesForEntitiesThatNeedThem() (this=0x151c780)
    at /home/dale/git/vircadia/vircadia-master/libraries/physics/src/PhysicalEntitySimulation.cpp:373
#9  0x00007ffff6c9d8e0 in PhysicalEntitySimulation::buildPhysicsTransaction(PhysicsEngine::Transaction&)
    (this=0x151c780, transaction=...)
    at /home/dale/git/vircadia/vircadia-master/libraries/physics/src/PhysicalEntitySimulation.cpp:414
#10 0x0000000000a13a07 in Application::update(float) (this=0x7fffffffa540, deltaTime=0.0196905397)
    at /home/dale/git/vircadia/vircadia-master/interface/src/Application.cpp:6581
#11 0x0000000000a08f65 in Application::idle() (this=0x7fffffffa540)
    at /home/dale/git/vircadia/vircadia-master/interface/src/Application.cpp:5346
#12 0x0000000000a04564 in Application::event(QEvent*) (this=0x7fffffffa540, event=0x7ffef441f3e0)
    at /home/dale/git/vircadia/vircadia-master/interface/src/Application.cpp:4281
#13 0x00007ffff427de73 in QApplicationPrivate::notify_helper(QObject*, QEvent*)
    (this=<optimized out>, receiver=0x7fffffffa540, e=0x7ffef441f3e0) at kernel/qapplication.cpp:3632
#14 0x0000000000a0435b in Application::notify(QObject*, QEvent*) (this=0x7fffffffa540, object=0x7fffffffa540, event=0x7ffef441f3e0)
    at /home/dale/git/vircadia/vircadia-master/interface/src/Application.cpp:4251
#15 0x00007fffe6c37f48 in QCoreApplication::notifyInternal2(QObject*, QEvent*) (receiver=0x7fffffffa540, event=0x7ffef441f3e0)
    at kernel/qcoreapplication.cpp:1063
#16 0x00007fffe6c3ac76 in QCoreApplicationPrivate::sendPostedEvents(QObject*, int, QThreadData*)
    (receiver=0x0, event_type=0, data=0x14b8b40) at kernel/qcoreapplication.cpp:1817
#17 0x00007fffe6c84c57 in postEventSourceDispatch(GSource*, GSourceFunc, gpointer) (s=0x1bf13f0)
    at kernel/qeventdispatcher_glib.cpp:277
#18 0x00007fffe1c834cf in g_main_dispatch (context=0x1ab1d70) at ../glib/gmain.c:3337
#19 g_main_context_dispatch (context=0x1ab1d70) at ../glib/gmain.c:4055
#20 0x00007fffe1cd74e8 in g_main_context_iterate.constprop.0
    (context=context@entry=0x1ab1d70, block=block@entry=1, dispatch=dispatch@entry=1, self=<optimized out>) at ../glib/gmain.c:4131
#21 0x00007fffe1c80c03 in g_main_context_iteration (context=0x1ab1d70, may_block=1) at ../glib/gmain.c:4196
#22 0x00007fffe6c846f8 in QEventDispatcherGlib::processEvents(QFlags<QEventLoop::ProcessEventsFlag>) (this=0x1beb300, flags=...)
    at kernel/qeventdispatcher_glib.cpp:423
#23 0x00007fffe6c369b2 in QEventLoop::exec(QFlags<QEventLoop::ProcessEventsFlag>) (this=this@entry=0x7fffffffa490, flags=..., 
    flags@entry=...) at ../../include/QtCore/../../src/corelib/global/qflags.h:69
#24 0x00007fffe6c3e544 in QCoreApplication::exec() () at ../../include/QtCore/../../src/corelib/global/qflags.h:121
#25 0x0000000000d1e45f in main(int, char const**) (argc=1, argv=0x7fffffffdb68)
    at /home/dale/git/vircadia/vircadia-master/interface/src/main.cpp:448

@daleglass
Contributor

Now let's see if we can figure out where this is going wrong.

https://github.com/vircadia/vircadia/blob/40f81e4866fdafeb8f1dc57bf439f1a923d349cb/libraries/render-utils/src/Model.cpp#L296-L311

So we want _meshStates (it's a vector) to be empty at this point, so that it can be filled with whatever needs to be there. However, it is not empty, and contains:

$6 = std::vector of length 1, capacity 1 = {{clusterDualQuaternions = std::vector of length 1, capacity 1 = {{_scale = {{{x = 1, 
              y = 1, z = 1, w = 0}, {r = 1, g = 1, b = 1, a = 0}, {s = 1, t = 1, p = 1, q = 0}, data = {data = {1, 1, 1, 0}}}}, _dq = {
          _real = {{{x = 0, y = 0, z = 0, w = 1}, data = {data = {0, 0, 0, 1}}}}, _dual = {{{x = 0, y = 0, z = 0, w = 0}, data = {
                data = {0, 0, 0, 0}}}}}, _cauterizedPosition = {{{x = 0, y = 0, z = 0, w = 1}, {r = 0, g = 0, b = 0, a = 1}, {s = 0, 
              t = 0, p = 0, q = 1}, data = {data = {0, 0, 0, 1}}}}}}, clusterMatrices = std::vector of length 1, capacity 1 = {{
        value = {{{{x = 1, y = 0, z = 0, w = 0}, {r = 1, g = 0, b = 0, a = 0}, {s = 1, t = 0, p = 0, q = 0}, data = {data = {1, 0, 0, 
                  0}}}}, {{{x = 0, y = 1, z = 0, w = 0}, {r = 0, g = 1, b = 0, a = 0}, {s = 0, t = 1, p = 0, q = 0}, data = {data = {
                  0, 1, 0, 0}}}}, {{{x = 0, y = 0, z = 1, w = 0}, {r = 0, g = 0, b = 1, a = 0}, {s = 0, t = 0, p = 1, q = 0}, data = {
                data = {0, 0, 1, 0}}}}, {{{x = 0, y = 0, z = 0, w = 1}, {r = 0, g = 0, b = 0, a = 1}, {s = 0, t = 0, p = 0, q = 1}, 
              data = {data = {0, 0, 0, 1}}}}}}}}}

A MeshState is a:

    class MeshState {
    public:
        std::vector<TransformDualQuaternion> clusterDualQuaternions;
        std::vector<glm::mat4> clusterMatrices;
    };

Hm. So far I don't quite know what is going on there. We iterate over the meshes in the model, simply create space for the MeshStates without actually computing anything, and I guess whatever hooks up to rigReady will want that.

I'm thinking this might be a threading issue because it seems to happen more or less at random.
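
The allocation step described above can be sketched roughly as follows. This is only an illustration, not the code at the linked lines: Mat4 and DualQuat are stand-ins for glm::mat4 and TransformDualQuaternion, and allocateMeshStates is a hypothetical helper name.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Stand-ins for glm::mat4 and TransformDualQuaternion so the sketch
// compiles without the engine headers. These are assumptions.
using Mat4 = std::array<float, 16>;
struct DualQuat {
    std::array<float, 4> real{};
    std::array<float, 4> dual{};
};

// Same two members as the MeshState quoted above, with stand-in element types.
struct MeshState {
    std::vector<DualQuat> clusterDualQuaternions;
    std::vector<Mat4> clusterMatrices;
};

// Hypothetical helper: create one MeshState per mesh, sized to that mesh's
// cluster count. Nothing is computed; the entries are just zero-initialized.
std::vector<MeshState> allocateMeshStates(const std::vector<std::size_t>& clustersPerMesh) {
    std::vector<MeshState> states;
    states.reserve(clustersPerMesh.size());
    for (std::size_t clusterCount : clustersPerMesh) {
        MeshState state;
        state.clusterDualQuaternions.resize(clusterCount);
        state.clusterMatrices.resize(clusterCount);
        states.push_back(std::move(state));
    }
    assert(states.size() == clustersPerMesh.size());
    return states;
}
```

For a model with a single mesh that has one cluster, this yields one entry holding two one-element vectors, which matches the lengths seen in the gdb dump.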

@daleglass
Contributor

daleglass commented Jul 1, 2021

And there's the entire Model where this happens:

https://paste.ubuntu.com/p/6rCGNR4qXn/

The location is Hayashi.

@daleglass
Contributor

Ok, trying to figure out more here. So, our assert fires because _meshStates is not empty, although it should be at that point. If all is well, we then fill it.

Now _meshStates is used in a bunch of places. CauterizedModel inherits from Model and reads data from it, but doesn't add anything. So that's out.

_meshStates is modified in:

  • Model::updateGeometry(), which is where the assertion is triggered.
  • Model::deleteGeometry(), which clears it and would make updateGeometry() happy.

And apparently, that's it.

So possible issues here:

  • updateGeometry is called twice when it shouldn't.
  • deleteGeometry isn't called when it should be.
  • threading issue.
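
The first two possibilities can be shown with a toy model of the fill/clear pairing. This is a sketch under assumptions, not the real Model class: ToyModel is hypothetical, int stands in for MeshState, and the method returns false where the real code would hit the assertion. The threading possibility is not modelled here.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy model of the fill/clear pairing. Names and bodies are illustrative.
class ToyModel {
public:
    // Returns false where the real code would fail assert(_meshStates.empty()).
    bool updateGeometry(std::size_t meshCount) {
        if (!_meshStates.empty()) {
            return false;  // an earlier fill was never cleared
        }
        _meshStates.resize(meshCount);
        return true;
    }

    // Clears the states, which makes the next updateGeometry() call valid again.
    void deleteGeometry() { _meshStates.clear(); }

private:
    std::vector<int> _meshStates;  // int stands in for MeshState
};
```

Calling updateGeometry(1) twice in a row fails on the second call, while inserting deleteGeometry() between the two calls makes both succeed.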

@daleglass
Contributor

Another option is this line:

https://github.com/vircadia/vircadia/blob/bfbbb2f528654cf203d4b8a343c371b9216bb98a/libraries/render-utils/src/Model.cpp#L296-L298

The assertion won't be reached if jointStatesEmpty() is false. initJointStates() should be ensuring that once it has run, but apparently it isn't.

https://github.com/vircadia/vircadia/blob/bfbbb2f528654cf203d4b8a343c371b9216bb98a/libraries/animation/src/Rig.cpp#L719-L721

So initJointStates() must somehow be leaving _internalPoseSet._relativePoses.empty() as true. It does this:

https://github.com/vircadia/vircadia/blob/bfbbb2f528654cf203d4b8a343c371b9216bb98a/libraries/animation/src/Rig.cpp#L630-L631

So getRelativeDefaultPoses() may be returning an empty list. But the list is initialized from a count of joints:

https://github.com/vircadia/vircadia/blob/bfbbb2f528654cf203d4b8a343c371b9216bb98a/libraries/animation/src/AnimSkeleton.cpp#L221-L224

So current conclusion: The issue here is something having no joints. I'm not yet sure whether this is something that happens normally and isn't accounted for, or it's a bug in the code.
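
A minimal sketch of that failure mode, assuming the guard depends only on jointStatesEmpty() as described above (the condition at the linked lines may contain more terms). ToyRig and GuardedModel are hypothetical stand-ins, and int replaces the real pose and mesh state types.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Stand-in for the rig: "joint states" are just a vector of poses.
struct ToyRig {
    std::vector<int> relativePoses;
    bool jointStatesEmpty() const { return relativePoses.empty(); }
    // With zero joints this leaves relativePoses empty.
    void initJointStates(std::size_t jointCount) { relativePoses.resize(jointCount); }
};

struct GuardedModel {
    ToyRig rig;
    std::vector<int> meshStates;
    std::size_t jointCount;
    std::size_t meshCount;

    // Returns false where the real code would fail the assertion.
    bool updateGeometry() {
        if (!rig.jointStatesEmpty()) {
            return true;  // already initialized, the block is skipped
        }
        rig.initJointStates(jointCount);
        if (!meshStates.empty()) {
            return false;  // block re-entered with stale mesh states
        }
        meshStates.resize(meshCount);
        return true;
    }
};
```

With three joints and one mesh, repeated calls are harmless because the guard skips the block after the first call. With zero joints and one mesh, the first call fills meshStates but leaves the rig empty, so the second call re-enters the block and finds meshStates non-empty.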

@daleglass
Contributor

So, quick and dirty fix here: just clear the mesh states, and log a message. That should be generally harmless. The only question is whether this will hide some other issue down the line.
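
One way such a workaround could look, as a hedged sketch rather than the change that was actually merged: clearStaleMeshStates is a hypothetical helper, and it logs to a std::ostream instead of whatever logging facility the project uses.

```cpp
#include <cassert>
#include <cstddef>
#include <iostream>
#include <vector>

// Hypothetical helper illustrating "clear and log". Instead of asserting
// that the vector is empty, it discards stale entries and reports them.
// Returns true if anything had to be discarded.
template <typename State>
bool clearStaleMeshStates(std::vector<State>& meshStates, std::ostream& log = std::cerr) {
    if (meshStates.empty()) {
        return false;  // the expected case, nothing to do
    }
    log << "updateGeometry: expected empty mesh states, found "
        << meshStates.size() << "; clearing\n";
    meshStates.clear();
    assert(meshStates.empty());  // the condition the original assert demanded
    return true;
}
```

Returning a flag keeps the event visible to the caller, so that the underlying cause can still be tracked down even though the crash is gone.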
