Optimized physics object spawn time, improved bullet physics server code. #42643

AndreaCatania · 2020-10-08T13:29:03Z

This is a combination of #39726 #40252 #41096 that got reverted here #42639

Optimized physics object spawn time, by integrating lazy body reload.
Optimized shape usage when the shape is not scaled.
Make sure to correctly fetch gravity when the integrate_forces function is used
Added override keyword
Using LocalVector

Test project: TestSpawnRB.zip you can just open an launch it.
The benchmark code is this:

extends Node

var count = 0

var shapes_per_frame := 2000
var shape := BoxShape3D.new()
var t1
var overall: int = 0

func _ready():
	t1 = OS.get_ticks_msec()
	_do_test()

func _physics_process(delta):
	if count < 3:
		var t2 = OS.get_ticks_msec()
		count += 1
		print("Frame ", count, ": ", t2 - t1, "ms")
		overall += t2 - t1
	else:
		print("Overall ", overall, "ms")
		get_tree().quit()

func _do_test():
	print("Test start")
	for i in range(0, shapes_per_frame):
		var rb = RigidBody3D.new()
		var c = CollisionShape3D.new()
		c.set_shape(shape)
		rb.add_child(c)
		add_child(rb)
		if i % 50 == 0:
			print("Added: ", i)

Test start
Added: 0
Added: 50
Added: 100
Added: 150
Added: 200
Added: 250
Added: 300
Added: 350
Added: 400
Added: 450
Added: 500
Added: 550
Added: 600
Added: 650
Added: 700
Added: 750
Added: 800
Added: 850
Added: 900
Added: 950
Added: 1000
Added: 1050
Added: 1100
Added: 1150
Added: 1200
Added: 1250
Added: 1300
Added: 1350
Added: 1400
Added: 1450
Added: 1500
Added: 1550
Added: 1600
Added: 1650
Added: 1700
Added: 1750
Added: 1800
Added: 1850
Added: 1900
Added: 1950
Frame 1: 33ms
Frame 2: 9146ms
Frame 3: 14503ms
Overall 23682ms

This is flame graph extracted from the perf of this PR:

This is flame graph extracted from the perf of master branch:

Without this PR, the set_ functions (like set_layer / set_mask) remove and add the body from the Physics World to recompute the entire hierarchy and cache.

Godot, call these functions one by one when a new RigidBody node is created, so the cache is internally reconstructed multiple times. With this PR, the body is created at the start of the frame and so all the code is executed only 1 time.

akien-mga · 2020-10-08T13:33:28Z

Do we have any demo projects available that have been ported to the current master branch? It would be nice to test physics changes on demos like the 3D platformer or TPS demo to check that everything work well (and ideally better when it's an optimizing like here, though I don't know if they would be impacted).

Maybe we could also have "pathological" physics demo that use lots of bodies and various APIs to stress test the implementation.

akien-mga · 2020-10-08T14:13:12Z

Note: It's not particularly important, but remember that commit logs should be formatted as one title on the first line, then a blank line, and then the body of the commit message. Otherwise all of it is concerned part of the commit title, which leads to this kind of weirdness (see header): https://patch-diff.githubusercontent.com/raw/godotengine/godot/pull/42643.patch

Calinou · 2020-10-08T14:53:14Z

Maybe we could also have "pathological" physics demo that use lots of bodies and various APIs to stress test the implementation.

We have one, but it needs to be ported to the master branch: https://github.com/godotengine/godot-demo-projects/tree/master/3d/physics_tests

AndreaCatania · 2020-10-09T11:34:21Z

Here the test project TestSpawnRB.zip you can just open and launch it. You will see in the console the overall time taken to process 3 frames; you can test the PR and the current master branch to see the difference.

AndreaCatania · 2020-10-09T11:40:02Z

Note: It's not particularly important, but remember that commit logs should be formatted as one title on the first line, then a blank line, and then the body of the commit message. Otherwise all of it is concerned part of the commit title, which leads to this kind of weirdness (see header): https://patch-diff.githubusercontent.com/raw/godotengine/godot/pull/42643.patch

Title added.

akien-mga · 2020-10-09T12:01:45Z

Here's a quick and dirty port of the 3D platformer demo to current master: platformer.zip

madmiraal · 2020-10-13T05:33:06Z

This PR is difficult to review, because it is both complex and trying to do multiple things. I think, to enable it to be reviewed and tested properly and effectively, it needs to be broken up into separate, self-contained commits and probably PRs; perhaps one for each of its deliverables:

Optimized physics object spawn time, by integrating lazy body reload.
Optimized shape usage when the shape is not scaled.
Make sure to correctly fetch gravity when the integrate_forces function is used
Added override keyword
Using LocalVector

pouleyKetchoupp · 2020-10-13T09:17:29Z

@madmiraal I'm not totally opposed to splitting, but I don't think it's really necessary. At this point, I would rather review this PR as it is, because asking @AndreaCatania to group his PRs when they are split, then to split them when they are grouped is not going to help with (hopefully) finalizing this optimization soon :)

These two are a bit off-topic so they could be possibly separated. They don't add too much to the complexity though, because the first one is just one block and the second one is only in headers.

Make sure to correctly fetch gravity when the integrate_forces function is used
Added override keyword

The others are all about optimizing physics spawn time and they affect different areas of code.

Optimized physics object spawn time, by integrating lazy body reload

This one is the only really complex change. For test purpose, different parts can be disabled by calling do_reload_shapes, do_reload_collision_filters or do_reload_body directly in some methods.

Optimized shape usage when the shape is not scaled

This change is localized in shape_bullet so it's easy to review. For test purpose, it can be disabled by commenting the first lines in ShapeBullet::create_bt_shape.

Using LocalVector

This one is only affecting vector accesses, and it's an easy, obvious optimization since it just gets rid of copy-on-write.

pouleyKetchoupp

The changes generally look ok, I've made a few comments for possible improvements.

I'll try to run my own tests about the performance gain later this week.

modules/bullet/collision_object_bullet.h

pouleyKetchoupp · 2020-10-13T09:36:20Z

modules/bullet/space_bullet.cpp

+	for (uint32_t i = 0; i < queue_pre_flush.size(); i += 1) {
+		queue_pre_flush[i]->dispatch_callbacks();
+		queue_pre_flush[i]->is_in_flush_queue = false;
+	}
+	for (uint32_t i = 0; i < queue_flush.size(); i += 1) {
+		queue_flush[i]->dispatch_callbacks();
+		queue_flush[i]->is_in_flush_queue = false;
 	}
+	queue_pre_flush.clear();
+	queue_flush.clear();


Some comments here to explain what the logic is behind having two different queues could make it clearer for future contributors.

yes please more comments for future contributors

Already done. You can find it above: https://github.com/godotengine/godot/pull/42643/files#diff-28f7bca21ebbe699cd16802c05d43f4e085331c7d0427c38da0ab3e6d14dc696R395-R399

👍 awesome thx!

pouleyKetchoupp · 2020-10-13T09:54:15Z

modules/bullet/shape_bullet.cpp


 btCollisionShape *ShapeBullet::create_bt_shape(const Vector3 &p_implicit_scale, real_t p_extra_edge) {
 	btVector3 s;
 	G_TO_B(p_implicit_scale, s);
 	return create_bt_shape(s, p_extra_edge);
 }

+btCollisionShape *ShapeBullet::create_bt_shape(const btVector3 &p_implicit_scale, real_t p_extra_edge) {
+	if (p_extra_edge == 0.0 && (p_implicit_scale - btVector3(1, 1, 1)).length2() <= CMP_EPSILON) {
+		return default_shape;


Why not creating the default shape here only when needed, to avoid always creating two shapes when there is a scale? Also, maybe with a ref count for the default shape to know when to destroy it, rather than storing the old default shape? That would simplify the logic around notifyShapeChanged.

That's a good catch! Integrated.

AndreaCatania · 2020-11-09T19:55:56Z

Re-based on 827e5b8 and implemented the @pouleyKetchoupp notes.
Re-tested the followings:

pouleyKetchoupp · 2020-11-23T17:07:32Z

I have problems to do proper performance tests at the moment, because I'm getting crashing asserts from Bullet when running TestSpawnRB project on latest master.

It's this assert in bvVector3::normalize():

btAssert(!fuzzyZero());

[Inline Frame] godot.windows.opt.tools.64.exe!btVector3::normalize() Line 305	C++
[Inline Frame] godot.windows.opt.tools.64.exe!btVector3::normalized() Line 953	C++
[Inline Frame] godot.windows.opt.tools.64.exe!btSingleSweepCallback::{ctor}(const btConvexShape *) Line 1040	C++
godot.windows.opt.tools.64.exe!btCollisionWorld::convexSweepTest(const btConvexShape * castShape, const btTransform & convexFromWorld, const btTransform & convexToWorld, btCollisionWorld::ConvexResultCallback & resultCallback, float allowedCcdPenetration) Line 1101	C++
godot.windows.opt.tools.64.exe!btDiscreteDynamicsWorld::integrateTransformsInternal(btRigidBody * * bodies, int numBodies, float timeStep) Line 1002	C++
godot.windows.opt.tools.64.exe!btDiscreteDynamicsWorld::integrateTransforms(float timeStep) Line 1056	C++
godot.windows.opt.tools.64.exe!btDiscreteDynamicsWorld::internalSingleStepSimulation(float timeStep) Line 489	C++
godot.windows.opt.tools.64.exe!btSoftRigidDynamicsWorld::internalSingleStepSimulation(float timeStep) Line 91	C++
godot.windows.opt.tools.64.exe!btDiscreteDynamicsWorld::stepSimulation(float timeStep, int maxSubSteps, float fixedTimeStep) Line 435	C++
godot.windows.opt.tools.64.exe!BulletPhysicsServer3D::step(float p_deltaTime) Line 1552	C++

It looks like it's due to something weird happening around ccd. Looking at the callstack, this code shouldn't be run on bodies with default settings.

From what I see, ccd seems to be activated on all bodies with this callstack:

>	godot.windows.tools.64.exe!RigidBodyBullet::set_continuous_collision_detection(bool p_enable) Line 734	C++
 	godot.windows.tools.64.exe!RigidBodyBullet::main_shape_changed() Line 309	C++
 	godot.windows.tools.64.exe!RigidCollisionObjectBullet::reload_shapes() Line 361	C++
 	godot.windows.tools.64.exe!RigidBodyBullet::reload_shapes() Line 811	C++
 	godot.windows.tools.64.exe!RigidCollisionObjectBullet::add_shape(ShapeBullet * p_shape, const Transform & p_transform, bool p_disabled) Line 231	C++
 	godot.windows.tools.64.exe!BulletPhysicsServer3D::body_add_shape(RID p_body, RID p_shape, const Transform & p_transform, bool p_disabled) Line 501	C++
 	godot.windows.tools.64.exe!CollisionObject3D::shape_owner_add_shape(unsigned int p_owner, const Ref<Shape3D> & p_shape) Line 253	C++

Because main_shape_changed is called two times for each added body. The first time it disables ccd, which sets the internal ccd threshold to be 10000. The second time, it enables ccd again because the internal threshold is > 0, due to this method:

bool RigidBodyBullet::is_continuous_collision_detection_enabled() const {
	return 0. < btBody->getCcdMotionThreshold();
}

I'll open a PR to propose a fix so performance changes from this PR can be properly compared on master.

…t server. - Optimized physics object spawn time, by integrating lazy body reload. - Optimized shape usage when the shape is not scaled. - Make sure to correctly fetch gravity when the integrate_forces function is used - Added override keyword - Using LocalVector

AndreaCatania · 2020-12-11T17:24:58Z

@pouleyKetchoupp I saw that you fixed the CCD issue. I've rebased and the crash seems gone. Though, I didn't had that, so I can't really say it's fixed. Please let me know if you still have the crash (in case you still have that, plese paste the stack trace).

AndreaCatania · 2020-12-11T17:45:09Z

Never mind, I'm able to reproduce the issue. Thought, it's not affecting this PR any more.

AndreaCatania · 2020-12-28T15:39:12Z

@pouleyKetchoupp did you try this after the CCD fix?

pouleyKetchoupp · 2020-12-28T17:45:11Z

@AndreaCatania I've made some quick tests but I still need to do more in-depth performance comparison. I've been busy with other things but I'll finish reviewing it as soon as I can find extra time.

AndreaCatania · 2020-12-29T07:28:52Z

@pouleyKetchoupp Ok thanks!

pouleyKetchoupp · 2021-01-06T00:17:52Z

I finally got to do proper performance comparisons, sorry for the delay!

Here's the project I'm using (it's part of the physics tests framework, ported to 4.0):
broadphase-test-4.0.zip

This PR is also needed on master to make it work properly:
#44104 - Fix error when calling coroutine with await in _ready

And I can confirm a clear improvement when adding objects (about 2x faster with 125000 objects on my configuration, when counting the extra time spent on the next physics step, since now a part of the process is postponed).

On the other hand, it seems removing objects is almost 3x slower than before. I haven't investigated why exactly. @AndreaCatania could you please check that?

Here are detailed timings I get with the attached test scene (timings are pretty consistent).
Physics ticks timings are deltas between two physics ticks, and the first one includes the time spent to create/add/move/remove objects as well.

Bullet / Current master

* Creating objects...
  Create Time: 1588.323 ms
  Physics Tick: 1595.326 ms (total = 1595.326 ms)
  Physics Tick: 1.999 ms (total = 1597.325 ms)
  Physics Tick: 1.231 ms (total = 1598.556 ms)
  Physics Tick: 1.506 ms (total = 1600.062 ms)
* Adding objects...
  Add Time: 9086.594 ms
  Physics Tick: 9390.825 ms (total = 9390.825 ms)
  Physics Tick: 564.139 ms (total = 9954.964 ms)
  Physics Tick: 246.837 ms (total = 10201.801 ms)
  Physics Tick: 245.333 ms (total = 10447.134 ms)
* Moving objects...
  Move Time: 112.258 ms
  Physics Tick: 251.868 ms (total = 251.868 ms)
  Physics Tick: 579.750 ms (total = 831.618 ms)
  Physics Tick: 245.037 ms (total = 1076.655 ms)
  Physics Tick: 245.109 ms (total = 1321.764 ms)
* Removing objects...
  Remove Time: 2254.943 ms
  Physics Tick: 2276.565 ms (total = 2276.565 ms)
  Physics Tick: 3.215 ms (total = 2279.780 ms)
  Physics Tick: 1.398 ms (total = 2281.178 ms)
  Physics Tick: 1.228 ms (total = 2282.406 ms)

Bullet / PR 42643

* Creating objects...
  Create Time: 1464.757 ms
  Physics Tick: 1481.533 ms (total = 1481.533 ms)
  Physics Tick: 2.624 ms (total = 1484.157 ms)
  Physics Tick: 2.054 ms (total = 1486.211 ms)
  Physics Tick: 1.397 ms (total = 1487.608 ms)
* Adding objects...
  Add Time: 884.057 ms
  Physics Tick: 1010.470 ms (total = 1010.470 ms)
  Physics Tick: 3700.365 ms (total = 4710.835 ms)
  Physics Tick: 246.892 ms (total = 4957.727 ms)
  Physics Tick: 247.128 ms (total = 5204.855 ms)
* Moving objects...
  Move Time: 109.260 ms
  Physics Tick: 241.588 ms (total = 241.588 ms)
  Physics Tick: 575.832 ms (total = 817.420 ms)
  Physics Tick: 248.515 ms (total = 1065.935 ms)
  Physics Tick: 247.323 ms (total = 1313.258 ms)
* Removing objects...
  Remove Time: 5944.551 ms
  Physics Tick: 5995.082 ms (total = 5995.082 ms)
  Physics Tick: 2.237 ms (total = 5997.319 ms)
  Physics Tick: 2.046 ms (total = 5999.365 ms)
  Physics Tick: 1.515 ms (total = 6000.880 ms)

Godot Physics 3D

* Creating objects...
  Create Time: 1417.275 ms
  Physics Tick: 1421.562 ms (total = 1421.562 ms)
  Physics Tick: 11.435 ms (total = 1432.997 ms)
  Physics Tick: 2.029 ms (total = 1435.026 ms)
  Physics Tick: 1.596 ms (total = 1436.622 ms)
* Adding objects...
  Add Time: 1280.441 ms
  Physics Tick: 1318.668 ms (total = 1318.668 ms)
  Physics Tick: 357.871 ms (total = 1676.539 ms)
  Physics Tick: 227.257 ms (total = 1903.796 ms)
  Physics Tick: 228.116 ms (total = 2131.912 ms)
* Moving objects...
  Move Time: 120.289 ms
  Physics Tick: 512.350 ms (total = 512.350 ms)
  Physics Tick: 224.549 ms (total = 736.899 ms)
  Physics Tick: 2.229 ms (total = 739.128 ms)
  Physics Tick: 18.212 ms (total = 757.340 ms)
* Removing objects...
  Remove Time: 465.413 ms
  Physics Tick: 484.925 ms (total = 484.925 ms)
  Physics Tick: 2.018 ms (total = 486.943 ms)
  Physics Tick: 2.646 ms (total = 489.589 ms)
  Physics Tick: 1.151 ms (total = 490.740 ms)

akien-mga · 2022-08-02T12:46:34Z

This would need to be redone for 3.x as Bullet was removed from the master branch, so it can be merged.

I also agree that the scope of this PR seems to be a bit big and it might be easier to assess and merge some of the changes independently (e.g. switch to LocalVector could likely be merged as is).

AndreaCatania added enhancement topic:physics labels Oct 8, 2020

AndreaCatania added this to the 4.0 milestone Oct 8, 2020

AndreaCatania requested review from pouleyKetchoupp, akien-mga and madmiraal October 8, 2020 13:29

AndreaCatania force-pushed the spawn_opt_2 branch from ad35e42 to 46ce9c3 Compare October 9, 2020 11:39

pouleyKetchoupp reviewed Oct 13, 2020

View reviewed changes

AndreaCatania force-pushed the spawn_opt_2 branch from 46ce9c3 to e179750 Compare November 9, 2020 19:52

This was referenced Nov 23, 2020

Fix ccd enabled by default on Bullet bodies #43801

Merged

[Bullet] [Needs Bullet Update] Crashing asserts in debug when using Bullet Physics with CCD enabled #43868

Closed

AndreaCatania force-pushed the spawn_opt_2 branch from e179750 to 9cc2a73 Compare December 11, 2020 17:22

Calinou mentioned this pull request Jan 13, 2021

Optimize setting large amounts of GridMap cells by adding a batching property godotengine/godot-proposals#1795

Open

Merge branch 'master' into spawn_opt_2

efc008d

pouleyKetchoupp mentioned this pull request Feb 1, 2021

[Bullet] Setting 3D body transform after add_child makes it very slow in procedural generation use cases #45598

Open

akien-mga closed this Aug 2, 2022

akien-mga added archived salvageable labels Aug 2, 2022

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Optimized physics object spawn time, improved bullet physics server code. #42643

Optimized physics object spawn time, improved bullet physics server code. #42643

AndreaCatania commented Oct 8, 2020 •

edited

Loading

akien-mga commented Oct 8, 2020 •

edited

Loading

akien-mga commented Oct 8, 2020

Calinou commented Oct 8, 2020

AndreaCatania commented Oct 9, 2020

AndreaCatania commented Oct 9, 2020

akien-mga commented Oct 9, 2020

madmiraal commented Oct 13, 2020

pouleyKetchoupp commented Oct 13, 2020

pouleyKetchoupp left a comment

pouleyKetchoupp Oct 13, 2020

AndreaCatania Nov 9, 2020

jordo Dec 6, 2020

AndreaCatania Dec 11, 2020

jordo Dec 11, 2020

pouleyKetchoupp Oct 13, 2020

AndreaCatania Nov 9, 2020

AndreaCatania commented Nov 9, 2020

pouleyKetchoupp commented Nov 23, 2020

AndreaCatania commented Dec 11, 2020

AndreaCatania commented Dec 11, 2020

AndreaCatania commented Dec 28, 2020

pouleyKetchoupp commented Dec 28, 2020

AndreaCatania commented Dec 29, 2020

pouleyKetchoupp commented Jan 6, 2021 •

edited

Loading

akien-mga commented Aug 2, 2022

Optimized physics object spawn time, improved bullet physics server code. #42643

Optimized physics object spawn time, improved bullet physics server code. #42643

Conversation

AndreaCatania commented Oct 8, 2020 • edited Loading

akien-mga commented Oct 8, 2020 • edited Loading

akien-mga commented Oct 8, 2020

Calinou commented Oct 8, 2020

AndreaCatania commented Oct 9, 2020

AndreaCatania commented Oct 9, 2020

akien-mga commented Oct 9, 2020

madmiraal commented Oct 13, 2020

pouleyKetchoupp commented Oct 13, 2020

pouleyKetchoupp left a comment

Choose a reason for hiding this comment

pouleyKetchoupp Oct 13, 2020

Choose a reason for hiding this comment

AndreaCatania Nov 9, 2020

Choose a reason for hiding this comment

jordo Dec 6, 2020

Choose a reason for hiding this comment

AndreaCatania Dec 11, 2020

Choose a reason for hiding this comment

jordo Dec 11, 2020

Choose a reason for hiding this comment

pouleyKetchoupp Oct 13, 2020

Choose a reason for hiding this comment

AndreaCatania Nov 9, 2020

Choose a reason for hiding this comment

AndreaCatania commented Nov 9, 2020

pouleyKetchoupp commented Nov 23, 2020

AndreaCatania commented Dec 11, 2020

AndreaCatania commented Dec 11, 2020

AndreaCatania commented Dec 28, 2020

pouleyKetchoupp commented Dec 28, 2020

AndreaCatania commented Dec 29, 2020

pouleyKetchoupp commented Jan 6, 2021 • edited Loading

akien-mga commented Aug 2, 2022

AndreaCatania commented Oct 8, 2020 •

edited

Loading

akien-mga commented Oct 8, 2020 •

edited

Loading

pouleyKetchoupp commented Jan 6, 2021 •

edited

Loading