-
Notifications
You must be signed in to change notification settings - Fork 3.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
C++ runtime: Make Parser::exitRule() non-virtual #2162
Conversation
Add missing override markers to the following functions of the C++ runtime: - TokensStartState::getStateType() - TagChunk::toString() - TextChunk::toString() The missing markers made builds against the API with -Wsuggest-override choke.
Parser::exitRule() in the C++ runtime is a virtual function which is not reimplemented anywhere. OTOH, it is invoked during the execution of every rule, which can cause a noticeable performance hit. This commit removes its virtual qualifier. It should make a difference particularly for large grammars, because the number of rules corresponds to the number of the Parser object's virtual functions, and, consequently, its vtable lookup time. Tested with a VHDL grammar of 436 rules, where it brings down parsing time from 75 to 44 seconds on unoptimized compilation, i.e. a 40% speed gain. Still a lot slower than an equivalent java parser, though, which takes 2.64 seconds for the same input.
This is a very interesting outcome. VMT lookup shouldn't have such an impact and indeed I cannot reproduce this result with my (also fairly large grammar, > 500 parser rules, > 800 lexer rules). I used deeply nested expression to produce heavy recursion. But as I said, I can't reproduce that. Could you attach your grammar and sample input here? I then can try with your input to see if that makes a difference. |
:-(. Did you compile with -O3 by chance? Which compiler do you use? I measured with GCC 7.2.1 from -O0 up to -O2, with -O3 the gap (nearly) closed, 44 seconds vs 40 or so, i.e. 10%.
I've attached it to an email to you. I'm iffy about publishing at this point, because it's still part of a commercial project. |
I'm using clang (with XCode) and get these numbers (using your vhdl grammar): Java: C++ without Release with 'virtual': Release That's a bit different than what I get with my MySQL grammar, so it seems to be related also to the grammar structure. Comparing the release build times gives us:
hence ~45% speed increase (but 13 times slower than the Java parser, 10 times without |
Here are the results for my MySQL grammar (using 3 simple but highly nested queries): Java C++ (release only) w/o |
@mike-lischke should I merge? |
This patch aims at improving the performance of ANTLR-generated C++ parsers for large grammars.
Parser::exitRule() in the C++ runtime is a virtual function which is not reimplemented anywhere. OTOH, it is invoked during the execution of every rule, which seems to have a noticeable influence on performance. This commit removes its virtual qualifier. It should make a difference particularly for large grammars, because the number of rules corresponds to the number of the Parser object's virtual functions, and, consequently, its vtable lookup time.
Tested with a VHDL grammar of 436 rules, where it brings down parsing time from 75 to 44 seconds on unoptimized compilation, i.e. a 40% speed gain. Still a lot slower than an equivalent java parser, which takes 2.64 seconds for the same input.