Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[GR-42531] Debug info memory footprint improvements #5436

Merged

Conversation

adinn
Copy link
Collaborator

@adinn adinn commented Nov 15, 2022

This PR implements significant improvements to debug info footprint. It only addresses changes local to the debug info generator. Although this is a great improvement the limited scope of the changes means that a large amount of the footprint that results from enabling of debug info is still present (debug info-related output by the compiler and ELF image modelling). An accompanying report (to be attached) identifies these overheads as well as the overheads addressed by this patch and suggests some areas where space savings might perhaps still be achieved in these other components.

@oracle-contributor-agreement oracle-contributor-agreement bot added the OCA Verified All contributors have signed the Oracle Contributor Agreement. label Nov 15, 2022
@adinn
Copy link
Collaborator Author

adinn commented Nov 15, 2022

Heap Usage Report (as promised):
Footprint.pdf

@adinn adinn self-assigned this Nov 15, 2022
@adinn adinn force-pushed the debug_info_memory_footprint_improvements branch from 5160df9 to ee513ce Compare November 15, 2022 21:58
Copy link
Collaborator

@zakkak zakkak left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, Thanks Andrew!

Did you by any chance measure wall time improvement as well in your expirements?

@adinn
Copy link
Collaborator Author

adinn commented Nov 16, 2022

@zakkak Thanks for the review. I did not compare wall clock times -- I was running with full logging so the results would not have been very useful. I'll do a pair of before and after runs to see if there is any noticeable difference (modulo the fact that this is only a HelloWorld example).

@adinn
Copy link
Collaborator Author

adinn commented Nov 16, 2022

@zakkak For basic debuginfo before and after runs were both much the same -- best of two successive runs was before=1:16 vs after=1:15. However, for full debug info there was an apparent improvement of a few seconds - best of two successive runs was before=1:36 vs after=1:31. Not exactly scientific though :-)

@adinn
Copy link
Collaborator Author

adinn commented Nov 16, 2022

@zakkak ^^^ times above corrected

@zakkak
Copy link
Collaborator

zakkak commented Nov 16, 2022

Thanks @adinn !

@fniephaus fniephaus changed the title Debug info memory footprint improvements [GR-42531] Debug info memory footprint improvements Nov 16, 2022
@lewurm lewurm self-assigned this Nov 16, 2022
Copy link
Member

@fniephaus fniephaus left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thanks a lot @adinn!

@adinn
Copy link
Collaborator Author

adinn commented Nov 16, 2022

@fniephaus Thanks for the review. This appears to improve the problem reported by @lewurm significantly (see #5294 (comment) and #5294 (comment)). I think it would be worth pushing this now as is and relegating any further improvements -- including to the compiler or ELF code -- to a follow-up PR.

@olpaw
Copy link
Member

olpaw commented Nov 16, 2022

@adinn did you identify any intermediate data structures that you are holding while generating dwarf debuginfo that could be deleted right after they got transformed into their final dwarf section data format? E.g. some data in a maps that could be deleted after their last use. In a similar effort to reduce memory pressure in EE this is how I was able to make lots of data available for GC so that the final debugObject.bake() step had much more memory budget to do it's job.

@lewurm
Copy link
Member

lewurm commented Nov 16, 2022

Did you by any chance measure wall time improvement as well in your expirements?

I've some numbers from our CI to share on this, before:

[libpythonvm:161612] [7/7] Creating image...                                                           (338.3s @ 8.76GB)
[libpythonvm:161612]  169.06MB (29.49%) for code area:   157,038 compilation units
[libpythonvm:161612]  216.69MB (37.80%) for image heap:2,439,202 objects and 4,349 resources
[libpythonvm:161612]  175.36MB (30.59%) for debug info generated in 210.7s
[libpythonvm:161612]   12.18MB ( 2.13%) for other data
[libpythonvm:161612]  573.29MB in total
[libpythonvm:161612] ------------------------------------------------------------------------------------------------------------------------
[libpythonvm:161612]                       127.4s (11.0% of total time) in 332 GCs | Peak RSS: 16.79GB | CPU load: 5.88
[libpythonvm:161612] ------------------------------------------------------------------------------------------------------------------------

After:

[libpythonvm:160450] [7/7] Creating image...                                                           (287.8s @ 7.69GB)
[libpythonvm:160450]  169.64MB (29.53%) for code area:   157,314 compilation units
[libpythonvm:160450]  217.00MB (37.78%) for image heap:2,435,818 objects and 4,395 resources
[libpythonvm:160450]  175.53MB (30.56%) for debug info generated in 162.4s
[libpythonvm:160450]   12.23MB ( 2.13%) for other data
[libpythonvm:160450]  574.41MB in total
[libpythonvm:160450] ------------------------------------------------------------------------------------------------------------------------
[libpythonvm:160450]                        79.4s (8.1% of total time) in 292 GCs | Peak RSS: 16.58GB | CPU load: 5.40
[libpythonvm:160450] ------------------------------------------------------------------------------------------------------------------------

Those numbers aren't captured in a "benchmark approved environment" (e.g. other jobs could run on the same machine at the same time), but you can clearly see a trend 🙂 Thanks again Andrew.

@adinn
Copy link
Collaborator Author

adinn commented Nov 16, 2022

@adinn did you identify any intermediate data structures that you are holding while generating dwarf debuginfo that could be deleted right after they got transformed into their final dwarf section data format? E.g. some data in a maps that could be deleted after their last use. In a similar effort to reduce memory pressure in EE this is how I was able to make lots of data available for GC so that the final debugObject.bake() step had much more memory budget to do it's job.

I have not yet tried that but I think it is very likely that this will provide further savings.

Also, as we discussed offline I think we can achieve achieve further savings as follows:

COMPILER

  1. Identify and merge equal JavaKind[] and JavaValue[] arrays created in DebugInfoBuilder

  2. Identify and merge equal ancestor ByteCodeFrame instances in call chains created by DebugInfoBuilder

  3. Replace HashMap uses with the cheaper EconomicMap

DWARF

  1. Identify repeated DWARF_OP expression sequences and use DWARF_OP_call to jump to replace them with a subroutine call to one original, generated expression.

@adinn
Copy link
Collaborator Author

adinn commented Nov 16, 2022

I'll raise follow-up PRs for the above changes if/when I can show that they help.

@graalvmbot graalvmbot merged commit 23b95a2 into oracle:master Nov 18, 2022
@adinn adinn deleted the debug_info_memory_footprint_improvements branch July 13, 2023 10:22
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement native-image-debuginfo OCA Verified All contributors have signed the Oracle Contributor Agreement. performance redhat-interest
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants