GDScript vs C# performance #36060

Closed
dragmz opened this issue Feb 9, 2020 · 54 comments

Comments

@dragmz
Contributor

dragmz commented Feb 9, 2020

Godot export templates:
https://downloads.tuxfamily.org/godotengine/3.2/mono/Godot_v3.2-stable_mono_export_templates.tpz

Projects:

GDScript takes 10x longer than C# to execute. Results for rendering the first 1000 frames are (in milliseconds):

Intel Core i5 650 @ 3210 MHz, GTX 950

Run   GDScript (gdlife), ms   C# (gdlifenet), ms
1     17090                   1743
2     17216                   1711
3     17575                   1700
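
As a quick check of the "10x" figure, the ratio can be computed from the three runs in the table above. This is only arithmetic on the posted numbers, written in Python for convenience; the variable names are made up for illustration.

```python
# Timings copied from the table above: milliseconds for the first 1000 frames.
gdscript_ms = [17090, 17216, 17575]
csharp_ms = [1743, 1711, 1700]


def mean(values):
    """Arithmetic mean of a non-empty list of numbers."""
    return sum(values) / len(values)


# How many times longer the GDScript build took, on average.
ratio = mean(gdscript_ms) / mean(csharp_ms)
print(f"GDScript / C# = {ratio:.2f}x")
```

The mean ratio comes out at roughly 10x, which matches the claim for this machine and this benchmark only.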

How to benchmark:

  1. Clone the projects repositories
  2. Create exported builds of the projects
  3. Run the exported builds and wait for the run to complete - the result is printed to stdout

--

The purpose of this issue is to keep track of performance in case any optimizations are implemented in the future.

@nathanfranke
Contributor

I am kind of curious why GDScript would be that much slower. Isn't it converted into bytecode?

@Calinou
Member

Calinou commented Feb 10, 2020

@nathanfranke I don't remember if bytecode compilation is still present. If it is, keep in mind it only speeds up how fast the script loads, not how fast it runs (just like in Python).

@Zylann
Contributor

Zylann commented Feb 10, 2020

I've been doing GDScript microbenchmarks for a while: https://github.com/Zylann/gdscript_performance

@nathanfranke C# uses a JIT to convert its bytecode into CPU instructions and is able to use ptrcalls (calling engine functions directly). It has also had years of optimizations and features that allow it to use CPU power and memory more efficiently (structs, string interning, GC...).
GDScript, on the other hand, uses a giant switch/case for its instructions, checks the types of everything at runtime before operating, accesses functions using maps, uses Variant for everything and does not use ptrcalls. So right now it makes sense that it's much slower. It could theoretically be better due to its tight integration, but the rest outweighs those benefits.
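
To make the "giant switch plus runtime type checks" description concrete, here is a minimal sketch in Python. It is not GDScript's VM: the opcodes `PUSH`, `ADD`, `MUL`, `RET` and the `run` function are hypothetical and only illustrate the shape of the overhead, where every instruction goes through a dispatch branch and every operand is type-checked before use.

```python
# Hypothetical opcodes for illustration; these are not GDScript's real bytecode.
PUSH, ADD, MUL, RET = range(4)


def run(code):
    """Execute a list of (opcode, operand) pairs on a value stack."""
    stack = []
    pc = 0
    while True:
        op, arg = code[pc]
        pc += 1
        # One branch per instruction kind, standing in for the "giant switch".
        if op == PUSH:
            stack.append(arg)
        elif op in (ADD, MUL):
            b = stack.pop()
            a = stack.pop()
            # Operand types are checked at run time on every single execution.
            if not isinstance(a, (int, float)) or not isinstance(b, (int, float)):
                raise TypeError("unsupported operand types")
            stack.append(a + b if op == ADD else a * b)
        elif op == RET:
            return stack.pop()
        else:
            raise ValueError(f"unknown opcode {op}")


# (2 + 3) * 4
program = [(PUSH, 2), (PUSH, 3), (ADD, None), (PUSH, 4), (MUL, None), (RET, None)]
result = run(program)
```

A JIT-compiled language can turn the same expression into a couple of machine instructions, which is the gap being described.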

@Shatur
Contributor

Shatur commented Feb 10, 2020

Bad to hear. I really like GDScript.

@jabcross
Contributor

As I understand, GDScript is not intended for use in CPU-bound situations, it's just a convenient language for beginners that's tightly coupled with the engine specifics. Most games' CPU-bound use-cases are intended to be covered by specific nodes.

For CPU-bound situations you can use C#, GDNative or write a module.

@ShlomiRex
Contributor

> As I understand, GDScript is not intended for use in CPU-bound situations, it's just a convenient language for beginners that's tightly coupled with the engine specifics. Most games' CPU-bound use-cases are intended to be covered by specific nodes.
>
> For CPU-bound situations you can use C#, GDNative or write a module.

Agreed, though it is an interesting topic.
I wonder if Native script (C++) is much faster than C#?

@dragmz
Contributor Author

dragmz commented Feb 10, 2020

> I wonder if Native script (C++) is much faster than C#?

I've got a Godot module implementation of the algorithm available here: https://github.com/dragmz/gdlifemod (tested with MSVC 2019 only)

@jabcross
Contributor

> I wonder if Native script (C++) is much faster than C#?

C++ is theoretically as fast as it can possibly be. Unless you want to write assembly :D

Seriously though, C# is pretty fast as long as you're not memory-bound. Otherwise you get the same problem as Java: lag spikes on garbage collection.

@aaronfranke
Member

> Unless you want to write assembly :D

A bit off topic, but... due to the sheer level of sophistication of compilers, it is extremely likely that C++ will outperform handwritten assembly. For trivial cases, the compiler has those cases optimized. For non-trivial cases, the problems are likely too complex for a human to hand-optimize the assembly.

Back on topic, here are some more benchmarks: http://www.royaldonut.games/2019/03/29/cpu-voxel-benchmarks-of-most-popular-languages-in-godot/ Note that voxel generation is a use case which can be given many low-level optimizations, so C# and especially C++ benefit a lot here.

@girng

girng commented Feb 11, 2020

> Bad to hear. I really like GDScript.

Please don't let specific benchmarks like this change your decision on GDScript :)

GDScript is not just for beginners, and it's certainly not slow (relative to a smooth gaming experience). There are trade-offs with any language. The sheer power and rapid prototyping GDScript offers, while giving the developer the chance to build their dream game, should not be overlooked.

@Shatur
Contributor

Shatur commented Feb 11, 2020

> Please don't let specific benchmarks like this change your decision on GDScript :)

Yes, I will use it :) I think that if necessary, I will rewrite the bottlenecks in C++.
The documentation also says that static typing in GDScript may increase performance in future.

@bojidar-bg
Contributor

Note that GDScript uses real Godot Objects and the ObjectDB for managing them when creating sub-classes in a script. Also it will create ScriptInstances for each of them, and make every variable access on such a script go through a few levels of indirection (thankfully, it will not do string comparisons, but hashmap and red-black tree lookups are enough).
C# on the other hand, can get away with not boxing int-s and bool-s as Variant-s, not storing any excess information in the private classes, apart from what the garbage collector needs, and not doing any hashmap or tree lookups when reading variables from them. Also, it can place objects next to each other in memory, improving cache locality.
Thus, I suspect a lot of the slowdown in this benchmark comes from the fact that the GDScript version does a lot more computation than the C# version. Some of the difference may be eliminated by using a PoolByteArray or two to store the values of the tiles, instead of creating actual tile objects.
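
A minimal sketch of the "one flat byte array instead of tile objects" idea, in Python with `bytearray` standing in for PoolByteArray. It assumes the benchmark is a Game-of-Life-style grid, which the project name suggests but the thread does not confirm; the `step` function, the wrap-around edges and the grid size are all assumptions made for illustration.

```python
def step(cells, width, height):
    """Return the next generation of a wrap-around Life grid stored as one bytearray."""
    nxt = bytearray(width * height)
    for y in range(height):
        for x in range(width):
            neighbours = 0
            for dy in (-1, 0, 1):
                for dx in (-1, 0, 1):
                    if dx or dy:
                        nx = (x + dx) % width
                        ny = (y + dy) % height
                        # Cell state is a single byte at a computed index,
                        # not a field on a separate tile object.
                        neighbours += cells[ny * width + nx]
            alive = cells[y * width + x]
            if neighbours == 3 or (alive and neighbours == 2):
                nxt[y * width + x] = 1
    return nxt


width, height = 5, 5
grid = bytearray(width * height)
for x in (1, 2, 3):  # horizontal "blinker" in the middle row
    grid[2 * width + x] = 1

after_one = step(grid, width, height)
after_two = step(after_one, width, height)
```

The blinker flips to vertical after one step and back after two. The point of the layout is that all cell states sit next to each other in memory, with no per-tile object or script instance.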

@dragmz
Contributor Author

dragmz commented Feb 11, 2020

@bojidar-bg both implementations are intentionally naive to mimic what a beginner would do

@Janglee123
Contributor

I accept that C# is more efficient than GDScript. But game code is not necessarily what causes performance issues. The performance of a game is more likely to be decided by the parts of the game that the engine is already handling. The average use case does not include rendering 10k sprites or anything like that.
On the other hand, GDScript allows for rapid prototyping and makes it easy to get something ready quickly.
So don't think GDScript is a bad option just because it's not as optimized as something that has been optimized over years and years.

@pchasco

pchasco commented Feb 18, 2020

There are a number of performance improvements that can be made for GDScript, first and foremost by implementing some basic back-end optimization to the byte code. As of version 3.2 there is no optimization of the byte code beyond some basic constant folding that happens in the compiler front end (that I have observed, at least). Because GDScript VM executes code in a loop containing a giant switch, reducing the number of byte code instructions generated will go a long way to improving the performance.

I have a fork in the works that I've begun implementing the following optimizations of the GDScript byte code:

  • Dead code/store elimination
  • Jump threading
  • Constant propagation
  • Typed registers (involves adding a number of virtual registers to the function state and instructions to address them. Access to virtual registers can be done without type checking for non-object types (int, real, vector, etc.)). Not sure if it would make sense to extend this to further types.

Potential:

  • ptrcall registers - Same basic concept as typed registers for values, but with ptrcalls to avoid more expensive calls through script API.

Some of these optimizations would affect the semantics of the running application if GDScript is being run in multiple threads. Optimizations that store into typed temporaries in the virtual registers will not be visible externally until the values are stored, which may not be guaranteed until the exit of the function. C# has the volatile qualifier to tell the optimizer to always load the value before it's read and commit it to memory after it is modified. I don't think we'd want to add this additional level of complexity to GDScript, so the optimizations will need to be very conservative when they could introduce concurrency issues.
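
A minimal sketch of two of the passes listed above, constant propagation (with folding) and dead store elimination, written in Python over a made-up three-address instruction format. The tuple layout, register names and function names are hypothetical and do not reflect the fork's code; the sketch handles straight-line code only and assumes instructions have no side effects, which sidesteps the threading concerns just described.

```python
def fold_constants(code):
    """Constant propagation and folding over straight-line code."""
    known = {}  # register name -> constant value known at this point
    out = []
    for ins in code:
        op = ins[0]
        if op == "const":
            _, dst, value = ins
            known[dst] = value
            out.append(ins)
        elif op in ("add", "mul"):
            _, dst, a, b = ins
            if a in known and b in known:
                value = known[a] + known[b] if op == "add" else known[a] * known[b]
                known[dst] = value
                out.append(("const", dst, value))  # replace the arithmetic
            else:
                known.pop(dst, None)  # dst no longer holds a known constant
                out.append(ins)
        else:  # "ret"
            out.append(ins)
    return out


def eliminate_dead_stores(code):
    """Drop writes to registers that are never read afterwards."""
    live = set()
    kept = []
    for ins in reversed(code):  # walk backwards, tracking which registers are needed
        op = ins[0]
        if op == "ret":
            live.add(ins[1])
            kept.append(ins)
        elif ins[1] in live:
            live.discard(ins[1])
            if op != "const":
                live.update(ins[2:])  # the operands become live
            kept.append(ins)
    kept.reverse()
    return kept


program = [
    ("const", "a", 2),
    ("const", "b", 3),
    ("add", "c", "a", "b"),
    ("mul", "d", "c", "x"),  # x is a function argument, unknown at compile time
    ("const", "e", 99),      # never read: a dead store
    ("ret", "d"),
]
optimized = eliminate_dead_stores(fold_constants(program))
```

Six instructions shrink to three here, which matters in a VM where each instruction pays the dispatch cost.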

Even with all of these optimizations in place, there is very little chance that GDScript will perform as well as C#. Sure, you could build a JIT for GDScript, but I think it would be easier just to implement GDScript as a .NET language and use the mono tooling that is already in place.

@aaronfranke
Member

> but I think it would be easier just to implement GDScript as a .NET language

Out of curiosity, do you know how such a thing would be done? It sounds like a fun experiment.

@pchasco

pchasco commented Feb 18, 2020

> Out of curiosity, do you know how such a thing would be done?

It's not something I've ever done in .NET, but it has been done. You would need to write your own front end to parse the GDScript, then transform that code to CIL instructions to build an assembly. Theoretically, at this point the assembly would be loaded and run no differently than one built with C#. Obviously it is easier said than done, but I think this solution would be easier than rolling your own JIT. Another approach is to compile GDScript to LLVM IR. I'm not sure which approach would be easier, but I suspect building the GDScript language for .NET would be, as the Mono integration is already in place.

A good place to start would be to have a look at the Boo source code. Boo is a python-like .NET language. It used to be an option for scripting in Unity, but support has been dropped long ago. Since then people have mostly seemed to have lost interest, but you may be able to use it as a jumping-off point.

@jabcross
Contributor

jabcross commented Feb 18, 2020 via email

@pchasco

pchasco commented Feb 18, 2020

> If we're going to use a target for GDScript, it should probably be LLVM. Dotnet is a bit big to bundle when targeting mobile phones, I think.

Possibly... but the team has already decided that dotnet will be supported, and that support is implemented.

@aaronfranke
Member

@pchasco Mono support is optional and the engine is officially and primarily compiled without it, since it about doubles the size of the engine to have Mono support.

@pchasco

pchasco commented Feb 18, 2020

> @pchasco Mono support is optional and the engine is officially and primarily compiled without it, since it about doubles the size of the engine to have Mono support.

That is true. If I am not mistaken LLVM has been discussed in the past and was decided against because it adds a rather sizable build dependency.

Personally, I think the best approach is to make GDScript as fast as it can be without going to JIT. I have done a significant amount of work on a GDScript to C compiler, but I have largely abandoned that effort in favor of improvements to the VM. After some initially encouraging results, real-world performance wasn't that much improved over GDScript to make it worth my while. The compiler targeted GDNative, which was the primary performance bottleneck. I could have it generate an engine module to avoid GDNative, but then it would be much less convenient to build.

@bojidar-bg
Contributor

Note that GDScript performance improvements are planned, as discussed in a google doc.

Just a few cents on ideas listed previously:

> Dead code/store elimination
> Constant propagation

We were thinking of using SSA (static single assignment) in the compiler, which might alleviate this issue.

> Jump threading

Definitely sounds useful, and I can see how the compiler might sometimes generate bytecode which chains jumps needlessly.
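
For readers unfamiliar with the term, a minimal Python sketch of jump threading on a made-up bytecode: any jump whose target is itself an unconditional jump gets retargeted to the final destination. The instruction names (`jmp`, `jmp_if`, `work`, `ret`) and the `thread_jumps` function are hypothetical, not GDScript's compiler.

```python
def thread_jumps(code):
    """Retarget jumps that land on an unconditional jump to its final destination."""

    def final_target(target):
        seen = set()
        # Follow chains of unconditional jumps, stopping if a cycle is detected.
        while target < len(code) and code[target][0] == "jmp" and target not in seen:
            seen.add(target)
            target = code[target][1]
        return target

    out = []
    for ins in code:
        if ins[0] in ("jmp", "jmp_if"):
            out.append((ins[0], final_target(ins[1])))
        else:
            out.append(ins)
    return out


program = [
    ("jmp_if", 3),  # 0: lands on a jump that only forwards to 5
    ("work",),      # 1
    ("jmp", 4),     # 2: another needless hop through 4
    ("jmp", 5),     # 3
    ("jmp", 5),     # 4
    ("ret",),       # 5
]
threaded = thread_jumps(program)
```

After the pass, instructions 0 and 2 jump straight to 5, so each taken path executes one dispatch fewer. Removing the now unreachable forwarding jumps would be a separate dead code pass.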

> Typed registers

We were thinking of going for typed instructions instead. They will still use Variant, but they will not check its type, or unbox it, but will just take the bytes out.
With typed instructions, I feel there might not be much performance gain from this one, except for cache optimizations (due to less "padding" around the data from Variant).
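
A rough Python sketch of the typed-instruction idea, under stated assumptions: the `Variant` class here is a hypothetical tagged value, not Godot's actual Variant layout, and `select_add` stands in for a compiler choosing an opcode from declared types. The typed handler still takes and returns `Variant`, but skips the tag checks because the compiler has already proved the types.

```python
from dataclasses import dataclass


@dataclass
class Variant:
    """Tagged value, loosely modelled on the idea of a Variant (hypothetical layout)."""
    type: str     # e.g. "int", "float", "String"
    data: object


def add_generic(a, b):
    # Untyped path: inspect both tags at run time before operating.
    if a.type == "int" and b.type == "int":
        return Variant("int", a.data + b.data)
    if a.type in ("int", "float") and b.type in ("int", "float"):
        return Variant("float", float(a.data) + float(b.data))
    raise TypeError(f"cannot add {a.type} and {b.type}")


def add_int(a, b):
    # Typed path: both operands are known to be ints, so no tag check is needed.
    return Variant("int", a.data + b.data)


def select_add(static_type_a, static_type_b):
    """Pick the handler at compile time from declared types (None means untyped)."""
    if static_type_a == "int" and static_type_b == "int":
        return add_int
    return add_generic


typed_sum = select_add("int", "int")(Variant("int", 2), Variant("int", 3))
untyped_sum = select_add(None, "int")(Variant("int", 1), Variant("float", 2.5))
```

The saving is per instruction and per execution, so it only shows up in code that declares its types.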

> ptrcall registers

Not sure about this one either. Maybe there could be some instructions which do a ptrcall using a pointer stored within the bytecode directly?

> [..] I think it would be easier just to implement GDScript as a .NET

As already mentioned, non-mono builds of Godot should have working GDScript.
Not sure how well .NET will perform for untyped GDScript where everything might have to be dynamic, instead of having a type.

> I think the best approach is to make GDScript as fast as it can be without going to JIT.

I agree. Going to JIT can only eliminate the time spent dispatching instructions. As currently implemented by #11518, the dispatching code is pretty fast, and can even be branch predicted by the CPU. (It is not a giant switch in a loop, except on compilers which do not support jumptables.)

@pchasco

pchasco commented Feb 18, 2020

@bojidar-bg Thanks for the share! I've been trying to find a summary of the planned improvements on github but I haven't had much luck to this point.

> We were thinking of using SSA (static single assignment) in the compiler, which might alleviate this issue.

You could go that way... But I think it's overkill. Deconstructing SSA form correctly is not trivial as it can introduce hard-to-find bugs in the code when not done correctly. You can still implement dead code elimination, constant folding, common subexpression elimination, and most other beneficial optimizations without it.

> We were thinking of going for typed instructions instead. They will still use Variant, but they will not check its type, or unbox it, but will just take the bytes out.

I suppose this could work... In my fork, where I have been trying some of these things out, I went in the direction of registers, mostly because I did not want to suggest breaking encapsulation of the Variant structure to work on its data directly. Getting directly into the Variant data might be a bit faster because it would avoid the load/store operations for the registers.

> the dispatching code is pretty fast, and can even be branch predicted by the CPU.

Technically true but I wonder if this actually happens in practice.

Edit:
I should qualify the branch prediction statement... Branch predictor is not involved here as it is used when the decision is to branch or not to branch. The CPU uses the indirect branch predictor, if available, to keep the pipeline full in the case of a jump table. I don’t know whether the indirect predictor will do much good in the dispatch loop because the patterns will be difficult for the CPU to predict, and the amount of code being run between each iteration for many bytecode instructions may replace the history buffer before the next iteration.

Edit 2:
If you really wanted to try to get something out of the branch predictor, it would be best to schedule instructions with others of the same opcode (same branch taken). This is quite possible for sequences of arithmetic operations where the dependency order is known.

@pchasco

pchasco commented Feb 28, 2020

I have a partial implementation of some of these optimizations in my fork:

https://github.com/pchasco/godot

Notably missing is support for the iterate and yield instructions, and functions with default arguments. Also incomplete is typed instructions. This is published now as only a PoC. In the next week or so I will implement the missing instructions. There is no challenge there beyond finding the time; there is no significant difference to them versus the standard jump instructions.

Typed instructions will take somewhat longer.

I also plan to implement optimization for built-ins that are pure functions. Calls to pure functions are candidates for elimination via common sub expression elimination optimization.
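
A minimal Python sketch of what common subexpression elimination over pure built-in calls could look like. This is an illustration under assumptions, not the fork's implementation: the instruction tuples, the `PURE_BUILTINS` whitelist and the requirement that every register is assigned exactly once are all made up for the example.

```python
PURE_BUILTINS = {"sqrt", "abs", "sin"}  # hypothetical whitelist of pure functions


def eliminate_common_calls(code):
    """Reuse the result of an earlier identical call to a pure built-in.

    Instructions are ("call", dst, func, args_tuple) or ("ret", src), in
    straight-line code where every register is assigned exactly once.
    """
    seen = {}   # (func, args) -> register holding the earlier result
    alias = {}  # eliminated register -> register to use instead
    out = []
    for ins in code:
        if ins[0] == "call":
            _, dst, func, args = ins
            args = tuple(alias.get(a, a) for a in args)
            key = (func, args)
            if func in PURE_BUILTINS and key in seen:
                alias[dst] = seen[key]  # drop the call, remember the substitute
                continue
            if func in PURE_BUILTINS:
                seen[key] = dst
            out.append(("call", dst, func, args))
        else:
            out.append((ins[0], alias.get(ins[1], ins[1])))
    return out


program = [
    ("call", "t1", "sqrt", ("x",)),
    ("call", "t2", "sqrt", ("x",)),  # same pure call: can reuse t1
    ("call", "t3", "rand", ()),
    ("call", "t4", "rand", ()),      # not pure: both calls must stay
    ("call", "t5", "abs", ("t2",)),
    ("ret", "t5"),
]
reduced = eliminate_common_calls(program)
```

The second `sqrt` disappears and its user is rewritten to read `t1`, while the two `rand` calls are kept because their results may differ.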

@massanchik

>> Bad to hear. I really like GDScript.
>
> Please don't let specific benchmarks like this change your decision on GDScript :)
>
> GDScript is not just for beginners, and it's certainly not slow (relative to a smooth gaming experience). There are trade-offs with any language. The sheer power and rapid prototyping GDScript offers, while giving the developer the chance to build their dream game, should not be overlooked.

On the other hand, I know Python and I know C#, but I still need to learn GDScript almost from scratch. While it's not that hard, it messes with my brain when I come back to regular Python development at work. It's similar to Python in some cases, but it's not Python.
I guess I will switch to C# for now, to protect my Python habits 😅

@Two-Tone

> If we're going to use a target for GDScript, it should probably be LLVM. Dotnet is a bit big to bundle when targeting mobile phones, I think.

How small is LLVM? As far as I can tell it'd still more than double the current download size.

But honestly, going from 20MB to 50MB isn't exactly much. Sure, a 150% increase in download size sounds huge, but in the context of modern-day phones, computers, and available bandwidth it is absurdly tiny.

@jabcross
Contributor

jabcross commented May 13, 2020

>> If we're going to use a target for GDScript, it should probably be LLVM. Dotnet is a bit big to bundle when targeting mobile phones, I think.
>
> How small is LLVM? As far as I can tell it'd still more than double the current download size.
>
> But honestly, going from 20MB to 50MB isn't exactly much. Sure, a 150% increase in download size sounds huge, but in the context of modern-day phones, computers, and available bandwidth it is absurdly tiny.

I was talking about the export's size, not the editor's. Games that use C# need to ship the virtual machine too on targets that don't have dotnet.

LLVM outputs native binary so it becomes a non-issue.

For what it's worth, I'd also make it optional, like the Mono version is.

@jabcross
Contributor

@DriNeo I don't think that'd be quite the case. LLVM is notoriously difficult to compile from source, so any user who's using an OS without a recent prepackaged LLVM will likely not bother about it.

Most users don't compile Godot from source, though. If somebody figures out the whole LLVM compilation issue and adds the source code to the repo (like we do with basically every external lib), all the end user will have to do is download the proper version from the website.

It would also be possible to implement LLVM support through an addon that outputs GDNative binaries, so this wouldn't necessarily need to be baked into the engine.

@Calinou
Member

Calinou commented May 13, 2020

@jabcross The LLVM source code repository weighs dozens of gigabytes. It'll never be added to the Godot source repository at this point 😛

> It would also be possible to implement LLVM support through an addon that outputs GDNative binaries, so this wouldn't necessarily need to be baked into the engine.

This is arguably a better course of action. I think we should focus on improving GDNative integration instead (and possibly add WebAssembly support as a replacement in the long-term, but that's another discussion).

@Shatur
Contributor

Shatur commented May 13, 2020

> I think we should focus on improving GDNative integration instead (and possibly add WebAssembly support as a replacement in the long-term).

It would be awesome. Currently GDNative is very inconvenient.

@jabcross
Contributor

jabcross commented May 13, 2020

I'm not 100% updated on this, but the problem with GDNative currently is the interface bottleneck, right? Function calls still need string comparison and whatnot?

> The LLVM source code repository weighs dozens of gigabytes. It'll never be added to the Godot source repository at this point 😛

Yeah, but a considerable portion is non-essential, like tests and benchmarks. It could be trimmed down. MLIR is also in the LLVM repository, after all.

We should also consider MLIR in the future, it's a really cool project.

@pchasco

pchasco commented May 13, 2020 via email

@jabcross
Contributor

Still, lots of hoop jumping for something arguably simple. It happens with variable assignment too, right?

Are there any known established games that used GDNative so far?

@pchasco

pchasco commented May 13, 2020

> Still, lots of hoop jumping for something arguably simple. It happens with variable assignment too, right?

I don't think it's a lot of hoops. And I wouldn't recommend using ptrcall for every method; premature optimization adages still apply.

> Are there any known established games that used GDNative so far?

Karroffel mentioned one on my blog but he didn't give the name of it...

@CodingMadness

What is the current state of GDScript right now, and could you guys make some performance improvements, be it GDNative and GDScript?

@nathanfranke
Contributor

GDScript is being remade by vnen which will hopefully improve performance.

> be it GDNative and GDScript?

GDNative is a completely different language so I don't understand the second part

@pchasco

pchasco commented Jul 8, 2020

You can likely achieve performance increase by porting your GDScript to compiled GDNative scripts.

I am not confident that the currently proposed changes being made to GDScript will significantly improve performance, but time will tell.

@gjmcodes

Out of curiosity, according to the Godot documentation:

> .NET / C#
> As Microsoft's C# is a favorite amongst game developers, we have added official support for it. C# is a mature language with tons of code written for it, and support was added thanks to a generous donation from Microsoft.
>
> It has an excellent tradeoff between performance and ease of use, although one must be aware of its garbage collector.

I'm curious what manual care I'd have to have when coding with C#. Is there any recommendation to where and when I should manually call garbage collection operations in C# scripts?

@pchasco

pchasco commented Jul 16, 2020

> I'm curious what manual care I'd have to have when coding with C#. Is there any recommendation to where and when I should manually call garbage collection operations in C# scripts?

I don’t know why you would need to manually force a collection. I definitely would not do it during gameplay. The only time when it may be beneficial to do a manual collection would be after loading or saving a game, or after dismissing a menu when you can get away with some stutter. But you can generally just ignore the GC altogether if your game is performing well.

@Two-Tone

You'd be better off asking your question either on the subreddit or Godot Q&A.

@gjmcodes

Thanks! Well, truth be told: the way it was put in the documentation made it seem that there should be some sort of manual handling, or an expectation of memory leaks if things are left 'automatic'.

All in all, I was just curious whether instantiations and real time calculations could leave garbage behind without any proper manual handling. I'll query about those on Reddit. Thanks!

@milkowski
Contributor

There is a new kid on the block: MIR https://github.com/vnmakarov/mir, which aims to be a lightweight JIT for multiple intermediate representations https://developers.redhat.com/blog/2020/01/20/mir-a-lightweight-jit-compiler-project/ and is initially being implemented as a JIT for Ruby 3.0.
It seems it could also be an interesting target for typed GDScript.

@CodingMadness

@milkowski so would you then use GDScript or GDNative code in combination with this new M-JIT?

@emmggi

emmggi commented Nov 29, 2020

Sorry to barge in like this but I'd like to ask what performance difference is there between gdscript and lua?

@Miziziziz

Sharing my findings here:

Godot version: Godot Mono v3.2.3 and Godot v3.2.3

OS/device including version: Windows 10, GLES3, GeForce RTX 2060

I have this procedural hair generated using ImmediateGeometry:
https://user-images.githubusercontent.com/7292421/107883893-fb097d00-6eae-11eb-9c10-86a716994d3a.mov
Calculating all the vertex positions requires a lot of vector math (cross products and such), and the positions also have to be converted from local, to global, to local at one point. Doing this on my machine has a noticeable impact in the profiler:
[Image: gdscriptprofiler]
Recreating this in C# gives about a 10fps improvement in my main project.
I can't really see frame time in the profiler, but you can see the graph and compare:
[Image: csprofiler]

Since the math is done in C++, I figured the issue was probably related to how GDScript handles loops or arrays and might be worth reporting.

Minimal reproduction project:

Contains a C# scene and a GDScript scene:
ProcHairTest.zip

@pchasco

pchasco commented Feb 16, 2021

It's difficult to speculate exactly why any specific C# program is significantly faster than the GDScript version. First, mono jit is likely applying several optimizations to the structure of the compiled bytecode, such as dead code elimination, common subexpression elimination, loop unrolling, etc.

I also expect a significant difference comes from the core types being implemented in mono for the mono module. Having mono implementations of those core types enables visibility into the methods by the optimizer, and, as they are not virtual, the optimizer may be able to inline some of the calls. The GDScript runtime has to dynamically dispatch those calls on core types, which is significantly slower. Inlining calls also allows the optimizer to include the instructions of the inlined method in its optimization of the caller, providing more opportunities to eliminate common subexpressions or reorganize the code for better instruction scheduling.

@trudnai

trudnai commented May 30, 2021

> A bit off topic, but... due to the sheer level of sophistication of compilers, it is extremely likely that C++ will outperform handwritten assembly. For trivial cases, the compiler has those cases optimized. For non-trivial cases, the problems are likely too complex for a human to hand-optimize the assembly.

That is actually a general misconception made up by C++ programmers who have never seen assembly code. Assembly code can be as well optimized as the coder's knowledge and experience allow. So a good assembly programmer can produce much better code than a C++ compiler any time. A beginner, however, will not be able to optimize their code much, unlike a beginner C++ programmer, who can just pass a command-line argument to get fair output.

Now, as for the experienced assembly programmer: it still takes time to manually optimize code, and sometimes that makes the code quite unmaintainable in the end. I once wrote a DSP filter for a PIC controller, and it took me a good 3 months just to optimize it and fit it into the chip. As a result I was able to use a much smaller chip than the competition while filtering 2 channels at the same time instead of just 1, simply because I was using assembly and not C.

The moral is that writing something in assembly can give you the ultimate performance and the smallest code, but it might not be worth the effort. Again, it depends on the project: if you can save just $1 per circuit by using a smaller chip, you save a million dollars over a million units. But in game development the gain might not justify the amount of work you put into it.

@pchasco

pchasco commented May 30, 2021 via email

@realkotob
Contributor

@vnen I don't think this issue is relevant after the GDScript rewrite, the benchmarks need to be redone

@vnen
Member

vnen commented Jan 7, 2022

> @vnen I don't think this issue is relevant after the GDScript rewrite, the benchmarks need to be redone

The information is interesting to have for history's sake and it does need to be re-run from time to time, but I don't think an open issue is the best place to keep this information (given there's nothing actionable to one day close the issue). I don't know what would be the best place though.

@elvisish

How does performance compare after the rewrite?

@vnen
Member

vnen commented Mar 23, 2022

I gave some general numbers about improvement in GDScript itself a while ago: https://godotengine.org/article/gdscript-progress-report-typed-instructions

I haven't compared it with other languages and don't plan to for now. Focus on performance will happen after Godot 4.0 is released.

I'm closing the issue now as I don't think there's anything that could be done here. Sure, GDScript could have better performance, but it doesn't really need to be on par with C#.

vnen closed this as completed on Mar 23, 2022